AI series: On AI ethics - influencing its use in the delivery of public good

While AI algorithms can pose a number of challenges in terms of their size and sophistication, ethical issues can be the hardest and most important to get right. Olivia Varley-Winter looks at why ethical considerations need to come first when working with AI, and what this means.


Olivia Varley-Winter


May 14, 2024

Criminal sentencing biased by race in the US, students systematically downgraded in UK public examinations with no process for appeal, and decisions to rescind food welfare in India riddled with errors and discrepancies are all instances where AI algorithms have hit the headlines. When Bill Gates wrote that the age of AI has begun and “will change the way people work, learn, travel, get health care, and communicate with each other,” those probably weren’t the changes he had in mind. Nor need they be an inevitable side effect of living with AI.

A number of points require consideration to work safely with AI, from the potential for bias in input and training data, and consent over data use, to the transparency and fairness of applying an algorithm – who has decided the problem, or set of problems, it is to solve? The steps that are taken to explain and involve an organisation’s stakeholders in the conclusions that AI reaches also require ethical consideration, as does ethical development of AI. Its use for social policies and services highlights an additional set of problems.

As AI becomes more active in society, AI ethics involves not only defining the objectives for data scientists, researchers and technologists to work on, but also governing bodies, regulators, policy makers, businesses and organisations, the media, and civil society, working to handle and communicate AI’s benefits and mitigate its harms. Organisations with international clout – such as the United Nations Educational, Scientific and Cultural Organization (UNESCO) and the Organisation for Economic Co-operation and Development (OECD) – have prominently set out ethical principles that can broadly apply. Nonetheless, a lot can go wrong.

Bias in, bias out

In 2016, when ProPublica launched an investigation into potential biases in a ‘risk assessment’ algorithm used by the US criminal justice system, it was the first independent investigation of its kind. This was despite the widespread use of the algorithm and its power to influence a judge’s sentence, in one instance doubling the duration of imprisonment while increasing its severity. On examining 7,000 risk assessment scores and the records detailing whether the subjects of those scores had reoffended in the subsequent two years, ProPublica found that “Only 20 percent of the people predicted to commit violent crimes actually went on to do so”. Even when the full range of crimes was taken into account, “the algorithm was somewhat more accurate than a coin flip” at 61%. Part of the enthusiasm for these algorithms had been the expectation that they might bypass the prejudices and unconscious biases of human judges, enabling fairer justice. However, while many might baulk at the thought of tossing a coin to determine someone’s prison sentence, it turns out this might be a fairer approach than the algorithm, which was found to “falsely flag black defendants as future criminals” at twice the rate of white defendants.
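The kind of disparity ProPublica reported can be made concrete with a minimal sketch (the data below is invented for illustration, and this is not ProPublica’s actual dataset or methodology): computing the false positive rate of a binary “high risk” flag separately for each group, where a false positive is someone flagged as high risk who did not go on to reoffend.

```python
def false_positive_rate(records):
    """records: list of (flagged_high_risk, reoffended) boolean pairs.
    Returns the share of non-reoffenders who were wrongly flagged."""
    false_pos = sum(1 for flagged, reoffended in records if flagged and not reoffended)
    negatives = sum(1 for _, reoffended in records if not reoffended)
    return false_pos / negatives if negatives else 0.0

# Toy data: each group has 10 people who did not reoffend,
# but group B is flagged as high risk twice as often as group A.
group_a = [(True, False)] * 2 + [(False, False)] * 8
group_b = [(True, False)] * 4 + [(False, False)] * 6

print(false_positive_rate(group_a))  # 0.2
print(false_positive_rate(group_b))  # 0.4
```

A model can be equally “accurate” overall while distributing its errors very unevenly between groups, which is why per-group error rates, not just overall accuracy, matter in audits like this one.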

Eleanor Roosevelt reads the Universal Declaration of Human Rights in 1949; FDR Presidential Library & Museum 64-165 CC-BY-2.0


Since ProPublica’s investigation there have been multiple reports highlighting problems with algorithms trained on historic data for use in the criminal justice system. The risk illustrated here, which can be generalised, is that such algorithms will tend to propagate social biases. In this case it means that those from ethnic minorities and lower socioeconomic backgrounds receive harsher sentences. Compounding the problem was the proprietary nature of the algorithms involved, which made it difficult to launch independent investigations. In the case of the algorithm ProPublica investigated, however, the input data, taken from questions put to the defendant and from their prison records, did provide clues as to the scope for unfair outcomes. Although race is not explicitly recorded, it likely correlates with other data used as input, meaning the outcomes would be biased with respect to race all the same. A lot more work is needed to mitigate the effects of historical social injustices in how the criminal justice system uses data. Innovators in this area need confidence in their evidence base and in what it will affect, as well as support from independent legal and ethical reviewers, and from regulators, to determine what will make a good innovation, and what will not.

Openness, explainability, and the scope to challenge AI decisions

A principle that many data science communities have been working towards is ensuring the transparency and explainability of AI (an OECD AI Principle). In OECD parlance, that is in part “to ensure that people understand when they are engaging with [artificial intelligence] and can challenge outcomes.” Acknowledging that some AI applications make this disclosure harder and less appealing, the OECD suggests that the fact that AI is in use should be disclosed “with proportion to the importance of the outcome … so that consumers, for example, can make more informed choices”. The OECD emphasises the importance of the “explainability” of the algorithms, which it defines as “enabling people affected by the outcome of an AI system to understand how it was arrived at. … notably – to the extent practicable – the factors and logic that led to an outcome.”

The tens of millions of digital ‘platform workers’ around the world are a case in point for where explainability is needed. They perform short-term, freelance, or temporary work through digital platforms or apps in the “gig economy”. There is little transparency about how algorithms and AI influence outcomes for gig workers, or whether platform algorithms are contributing systematically to unfair outcomes. Platform workers themselves have come together to share their data, to understand more about the outcomes of the algorithms, or AI, shaping their lives.

It follows that where the use of an AI system does not affect outcomes for people, there may be less of a demand to publicly justify how AI arrived at its outcomes. For example, where AI is used to simulate something, or to research a decision, rather than to make a decision, there could be less weight placed on explaining the model publicly.

Aerial view of tech cluster in Silicon Valley, taken on 29 March 2013, courtesy of Patrick Nouhaillier CC-BY-3.0


François Candelon, Theodoros Evgeniou, and David Martens, writing for the Harvard Business Review, have outlined their preference for accuracy as well as explainability. Often, to strike this balance, they prefer ‘white box’ models, which are transparent and interpretable. But not always. “In [complex] applications such as face-detection for cameras, vision systems in autonomous vehicles, facial recognition, image-based medical diagnostic devices, illegal/toxic content detection, and most recently, generative AI tools like ChatGPT and DALL-E, a black box approach may be advantageous or even the only feasible option.”
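The ‘white box’ idea can be illustrated with a toy example (the feature names and weights below are invented, not drawn from any real system): a linear scoring model whose prediction decomposes exactly into one contribution per input feature, so anyone affected can see which factors drove the outcome.

```python
# A toy interpretable ("white box") scoring model. Each feature contributes
# weight * value to the score, so the prediction can be fully explained.
WEIGHTS = {"prior_convictions": 1.5, "age": -0.05, "employment_years": -0.3}
INTERCEPT = 2.0

def score_with_explanation(features):
    """Return the score and the contribution each feature made to it."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    score = INTERCEPT + sum(contributions.values())
    return score, contributions

score, why = score_with_explanation(
    {"prior_convictions": 2, "age": 30, "employment_years": 1}
)
print(round(score, 2))  # 3.2
# List factors in order of influence, most important first:
for name, contribution in sorted(why.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.2f}")
```

A deep neural network offers no such decomposition, which is the trade-off Candelon and colleagues describe: for some tasks the opaque model is far more accurate, and the explanation must then be approximated after the fact.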

Even where the algorithm is too large and complicated to be interpretable, work like that conducted by the Alan Turing Institute in Project ExplAIn finds ways of extracting some kind of explanation, for instance by embedding layers in the coding. The case for opening up AI in this way has to be balanced against concerns for intellectual property, information security and privacy. There can be cybersecurity issues with making the different layers of an AI model more open to interrogation. Nonetheless, experiments with transparent and explainable models enable developers to advance their understanding of AI, as well as to consider whether its use for decision-making is ethically sound. The OECD principles make clear that it is important that AI doesn’t elude human insight, checks and balances. As Andrew Ng highlighted in the RSS fireside chat in 2021: “AI is increasing concentration of power like never before…governments and regulators need to look at that and think of what to do.”

Appropriate, human-centred governance

When school exams in England were cancelled during the Covid-19 pandemic, the government’s Department for Education decided that an algorithm should be used to allot grades to A-Level students, partly as a measure to counter grade inflation (a trend in which the grades awarded for the same standard of work tend to rise, year on year). Algorithms had been used in previous years to adjust the marks awarded for exams and coursework. Here, instead of exams and coursework, the input data was drawn from Ofqual’s historical records of how particular schools’ pupils had performed in previous years, and some was generated by teachers. Efforts had been made at transparency in terms of how the new algorithm would arrive at its decisions (it was a relatively simple, white box algorithm). But the model was acknowledged to produce ‘outliers’ even prior to deployment. Coupled with the widespread downgrading of teacher-estimated grades to fit a curve that would avoid grade inflation, there was not a clear process by which students and schools could appeal to change their grades. Dissatisfaction with the grades awarded in the absence of exams or coursework was rife, as young people regarded as academically talented by their schools fell short of the grades their teachers had predicted, and lost university places.
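The downgrading mechanism can be sketched in a few lines. This is a heavily simplified illustration, not Ofqual’s actual model (names and numbers invented): students are ranked by teacher estimates, then grades are reassigned to match a historical distribution, so an individual’s final grade depends partly on their school’s past results rather than only on their own work.

```python
def moderate(teacher_scores, historical_grades):
    """teacher_scores: {student: numeric estimate, higher is better}.
    historical_grades: grades to award, best first, one per student."""
    ranked = sorted(teacher_scores, key=teacher_scores.get, reverse=True)
    # Reassign grades by rank to fit the school's historical curve.
    return dict(zip(ranked, historical_grades))

scores = {"Asha": 90, "Ben": 85, "Cara": 70, "Dev": 65}
# Suppose the school's past cohorts of this size earned one A, two Bs and a C:
awarded = moderate(scores, ["A", "B", "B", "C"])
print(awarded)
```

Under this scheme a strong student at a historically weaker school is downgraded however well their teachers rate them, which is exactly the kind of outcome that a transparent model alone cannot make acceptable without a route of appeal.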

In the resulting furore, the Department for Education determined that its original policy was wrong and adopted the teacher-estimated grades, with an appeal process in place. The incident demonstrates that achieving the functional transparency of an algorithm is only one step in due process. A controversial policy may use an algorithm to apportion losses across a population (for example, to curb grade inflation) in ways that individuals find abhorrent.

Vested interests also surfaced during investigation of an algorithm brought into use to tackle fraud in India’s welfare system. From 2014 to 2019, the government of Telangana “cancelled more than 1.86 million existing food security cards and rejected 142,086 fresh applications without any notice”, Al Jazeera reported in January 2024. Despite the government’s initial claims that the cancelled food security cards were fraudulent, critical data scholarship in India and elsewhere has established discrepancies and errors in the algorithms used, such as confusing the records of a valid claimant with those of a car-owning citizen of the same name. (Under the government’s policies, SUV owners cannot receive food aid.) Further investigations revealed that at least 7.5 per cent of the food security cards were wrongly cancelled. The investigations highlight what can be a common problem: a focus on reducing the costs of welfare programmes tends to lead services to identify false positives (wrongful claimants) rather than false negatives. Efforts to correct sloppy data may therefore meet resistance if they lead to fewer “frauds” being identified, even when citizens bring evidence to challenge the decisions.
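The record-matching error described above can be shown with a toy sketch (all names and records invented): joining two datasets on name alone conflates distinct people, while matching on an additional field avoids the false positive.

```python
claimants = [
    {"id": 1, "name": "R. Kumar", "village": "A"},
    {"id": 2, "name": "S. Devi", "village": "B"},
]
# A different R. Kumar, living elsewhere, who happens to own a vehicle:
vehicle_owners = [{"name": "R. Kumar", "village": "C"}]

def flag_by_name(claimants, vehicle_owners):
    """Naive join on name only: conflates distinct people."""
    owner_names = {owner["name"] for owner in vehicle_owners}
    return [c for c in claimants if c["name"] in owner_names]

def flag_by_name_and_village(claimants, vehicle_owners):
    """Join on name plus village: the valid claimant is no longer flagged."""
    keys = {(o["name"], o["village"]) for o in vehicle_owners}
    return [c for c in claimants if (c["name"], c["village"]) in keys]

print(len(flag_by_name(claimants, vehicle_owners)))              # 1 false positive
print(len(flag_by_name_and_village(claimants, vehicle_owners)))  # 0
```

Even this small fix has a cost from the administrator’s perspective: the stricter match finds fewer “frauds”, which is precisely the incentive problem the investigations identified.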

There is a similar type of example in the UK’s Post Office scandal, in which many sub-postmasters were wrongfully prosecuted for false accounting after the Post Office adopted accounting software that contained significant bugs, which were covered up for many years. It shows how far organisations can go in pursuing wrongful judgements, and the life-changing consequences that can follow.

The EU’s new AI Act advocates a risk-based approach, balancing the desire to minimise the burden of compliance with the need to ensure the safety of people who may be affected by the implementation of AI algorithms. Systems assessed as high risk according to specific criteria are “subject to strict obligations before they can be put on the market”.

Governments across the industrialised world have raised their hopes for AI that will help to drive increases in productivity, and do so safely: fairly constructed, making use of legitimate data sources, and with fair outcomes for society. The work of data scientists is integral to the foundations by which AI can be used for social good, from establishing protocols for data management and sharing, to understanding the workings of complex algorithms and the use of large and unstructured data sources. Data scientists and researchers are getting closer to understanding what good looks like, not just in terms of the ethical values to uphold but the technicalities of the code and data involved. However, a great deal of work beyond data work needs to be maintained to uphold the ideal of ‘AI ethics’: support for well-established ethical and legal rights and principles, meaningful involvement of the people who will be affected by AI use in the relevant policies, and the development of data governance and infrastructure. It is always possible that, in working on AI ethics, we find fairer and more ethical approaches that should precede the use of AI.

“AI development raises a range of ethical questions for data practitioners, whether they are data scientists, econometricians, analysts, or statisticians,” Daniel Gibbons, Vice Chair of the Royal Statistical Society’s Data Ethics and Governance Section told Real World Data Science. Today, many data scientists would urge that ethical considerations precede the development of an AI algorithm and must inform its design and use, particularly for processes that significantly affect people, to ensure it does not propagate errors and injustices.


About the author
Olivia Varley-Winter is an experienced policy manager who has worked for the Royal Statistical Society, the Open Data Institute, the Open Data Charter, the Nuffield Foundation, and the Alan Turing Institute. She was part of the Ada Lovelace Institute’s founding team from 2018 to 2020 and has since supported the development of other policy-related programmes and partnerships relating to data, AI and ethics. She is presently working for Smart Data Research UK on matters pertaining to ethics and responsible data governance. She has an MSc in Nature, Society, and Environmental Policy from the University of Oxford.
Copyright and licence
© 2024 Royal Statistical Society

This article is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) International licence.

How to cite
Varley-Winter, O. 2024. “On AI ethics - influencing its use in the delivery of public good.” Real World Data Science, May 14, 2024. URL

