15. Reliable AI to Design a Better Future - Student Presentations, Discussion & Conclusion

June 10, 2022. Presented by the distinguished students of the course and written by Dr. Merve Ayyüce KIZRAK.
A man without ethics is a wild beast loosed upon this world. — Albert Camus
In the last lesson of the 2022 Spring semester, students were randomly divided into two groups, A and B.
  • The Subject of Group A: AI-empowered grading system in the U.K.
  • The Subject of Group B: AI-empowered arrest decisions
Both topics had been covered earlier in the course. Each group researched its topic for one hour and evaluated the real-life case against the checklist below.

Checklist: Requirements of Trustworthy AI

  • Human agency and oversight: Including fundamental rights, human agency and human oversight
  • Technical robustness and safety: Including resilience to attack and security, fallback plan and general safety, accuracy, reliability and reproducibility
  • Privacy and data governance: Including respect for privacy, quality and integrity of data, and access to data
  • Transparency: Including traceability, explainability and communication
  • Explainability: refers to the knowledge of the internal mechanics of a model: what a “node” in the model represents and its specific relevance to the outcomes (e.g. the “age” variable determines 15% of the outcome).
  • Diversity, non-discrimination and fairness: Including the avoidance of unfair bias, accessibility and universal design, and stakeholder participation
  • Societal and environmental wellbeing: Including sustainability and environmental friendliness, social impact, society and democracy
  • Accountability: Including auditability, minimization and reporting of negative impact, trade-offs and redress.

Group A: AI-empowered grading system in the U.K.

Case: The UK government’s A-level grading AI, developed by Ofqual, was used at the start of the pandemic.

Ethical Evaluation by Checklist

Students have the right to an education without prejudice and to shape their own future. The UK government ignored this with its grading system. To avoid such disasters in the future, authorities need to “be more inclusive and diverse in the process of creating such models and algorithms,” says Ed Finn, an associate professor at Arizona State University and the author of “What Algorithms Want.”
  • A robust AI should not pose unreasonable safety risks, in conditions of normal use or misuse. It should be functioning appropriately and secure throughout its entire lifecycle.
To do this, it is crucial that engineers keep track of traceability, processes, and decisions made.
Most students from less advantaged schools had their scores downgraded, while students from richer schools were more likely to have their scores raised. Professor Jo-Anne Baird, director of the Department of Education at the University of Oxford, explained that:
“Mathematical models never predict perfectly, especially in the field of human learning. But the Secretary of State’s remit to Ofqual was to produce a system that brought about broadly comparable results with those of the past. So it wasn’t possible just to use teachers’ grades as they’re not comparable. This meant some statistical moderation was needed to produce a model that worked best within the parameters set.”
From this, we can conclude that the system is starting to fall apart.
  • The right of every citizen to manage their own personal information and make decisions about it is known as privacy.
However, the data collection of this AI is suspicious. What was absent here was transparency and analysis of the algorithm’s intentions, and there was plenty of time to do so.
“There’s a lot more that can be done in terms of demonstrating proper risk and effect reduction.”
In the end, concerns about lack of transparency may diminish future faith in algorithmic systems that can benefit society. What this means is that there is no public debate or awareness of events until after they have occurred or until rumors have surfaced.
Through our research, we concluded that most algorithms offer almost no explainability, which has caused major issues in the past. With newer algorithms, however, we see more explainability.
One example is a program developed by the Stanford AI Lab that grades game designs. The program first learns the instructor’s game by playing it and then plays the students’ work too. However, its developers did not explain how it assigns grades.
The way your AI professionals build out your system will determine how your algorithms interact with people of all cultures, genders, sexualities, races, and so on. Using AI for grading is tricky: the algorithm was trained on the previous three years’ data, which may already be discriminatory, even though the AI itself stands at the same distance from every student. Grading is a particularly sensitive topic. A single mistake made by the AI because of bias can go as far as ruining a student’s future, so a diverse team of experts is essential.
Furthermore, if fewer than 15 students take a given subject at a given school, the algorithm gives the centre assessment grades (CAGs) greater weight. As a result, pupils at smaller schools were more likely than those at bigger schools to profit from grade inflation. According to one study, the “proportion of A* and As granted to independent (fee-paying) schools jumped by 4.7 percentage points, more than double the rate for state comprehensive schools.”
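To make the small-cohort effect concrete, here is a minimal sketch of how weighting teacher grades by cohort size could work. Only the threshold of 15 students comes from the text above. The function names `cag_weight` and `blend_grade`, the numeric grade scale, and the linear taper are illustrative assumptions, not Ofqual's published method.

```python
def cag_weight(cohort_size: int, threshold: int = 15) -> float:
    """Hypothetical weight given to the teacher's grade (CAG).

    Assumption: the weight falls linearly from 1.0 for a single student
    to 0.0 once the cohort reaches `threshold` students.
    """
    if cohort_size < 1:
        raise ValueError("cohort_size must be at least 1")
    if cohort_size >= threshold:
        return 0.0
    return (threshold - cohort_size) / (threshold - 1)


def blend_grade(cag: float, predicted: float, cohort_size: int) -> float:
    """Blend the teacher's grade with the statistical prediction."""
    w = cag_weight(cohort_size)
    return w * cag + (1 - w) * predicted


# Same teacher grade (6.0) and same statistical prediction (4.0),
# but different cohort sizes lead to different final grades.
small_cohort = blend_grade(6.0, 4.0, cohort_size=8)
large_cohort = blend_grade(6.0, 4.0, cohort_size=30)
```

Under these assumptions, a student in a cohort of 8 keeps half of the teacher's more generous grade, while a student in a cohort of 30 receives the statistical prediction alone. This is one way to see how small classes, which are more common in fee-paying schools, could benefit.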
The AI grading system and its bias created a massive public outcry. Students were shocked to receive grades that were proportionally lower than they had expected, and they showed their frustration in hundreds of protests. Gathered outside the Department for Education’s building, they chanted and voiced their anger at the system and its algorithm.
Parents of children whose papers were subjected to the system were also unhappy. They said they had not been notified in advance and that the test results were kept strictly confidential. In some classrooms, students were not even informed that their work had been assessed by AI rather than a human.
According to Jenny Brennan, a researcher on AI and technology’s impact on society at the Ada Lovelace Institute, the ordeal was mainly the result of poor political decisions and a simple misunderstanding of data:
“It’s more about the procedure and the issues of goals, risk mitigation, adequate scrutiny, and remedy.” According to her, the algorithm in itself wasn’t necessarily wrong, but it was the wrong one to use for this type of problem, especially given its massive scale and huge effect on the population.
Accountability deals with fairness, liability, auditability, and traceability.
  • Every summer thousands of students take the A-levels, an exam that affects their future.
  • Instead of taking this exam and determining their own future, an algorithm was used to do it for them.
  • Had it worked properly, the AI system could have provided a fair way of predicting the results, removing the effect of exam-day circumstances and of students’ futures depending on just a couple of hours.
  • However, the algorithm “predicted” scores far lower than students expected.
  • The algorithm was biased.
  • The socioeconomic situations of the students were clearly a criterion for the algorithm. (Scores of students in less-advantaged schools were downgraded, while scores of students in richer schools were more likely to be raised.)
  • This kind of algorithm should be more inclusive and diverse.

Group B: AI-empowered arrest decisions

Case: Courtrooms across the US have turned to automated tools in attempts to shuffle defendants through the legal system as efficiently and safely as possible.
Argument: AI is wrongfully sending people to jail
  1. Example: The Killing of George Floyd
  2. Example: How Wrongful Arrests Based on AI Derailed 3 Men’s Lives
  3. Example: How AI-Powered Tech Landed a Man in Jail with Scant Evidence
Williams was arrested in January 2020 for allegedly stealing five watches from a Shinola store in Detroit, after he was wrongfully identified by facial recognition software. He was among the first people known to be wrongfully accused because of the software, which is an increasingly common tool for police. Michael Oliver and Nijeer Parks were wrongly arrested in 2019 after also being misidentified by facial recognition technology.

Ethical Evaluation by Checklist

Ethical concerns also arise in authoritarian governments exploiting AI surveillance in the name of combating crime. One such country is China; through the “New Generation Artificial Intelligence Development Plan” (AIDP), the nation delineated an overarching goal to make China the world leader in AI. Anderson believes that China wants to use AI to build an all-seeing digital system of social control, which would push China to the cutting edge of surveillance. This possibility of an all-knowing system fueled by AI-based surveillance presents ethical concerns because it grants governments absolute control at the expense of civil liberties.
African-Americans are arrested at four times the rate of white Americans on drug-related charges. Even if engineers were to faithfully collect this data and train a machine learning model with it, the AI would still pick up the embedded bias as part of the model.
Crawford and Schultz argue:
“When challenged, many state governments have disclaimed any knowledge or ability to understand, explain, or remedy problems created by AI systems that they have procured from third parties. The general position has been ‘we cannot be responsible for something we don’t understand.’ This means that algorithmic systems are contributing to the process of government decision making without any mechanisms of accountability or liability.”
There are many private companies that market facial recognition products to state and local law enforcement.
One of the companies, ODIN Intelligence, partners with police departments and local government agencies to maintain a database of individuals experiencing homelessness, using facial recognition to identify them and search for sensitive personal information such as age, arrest history, temporary housing history, and known associates.
In 2020, federal law enforcement agencies purchased geolocation data from analytics companies without a warrant or binding court order. ICE and CBP used this data to enable potential deportations or arrests, which shows how geolocation can have singular consequences for immigrant communities, especially among populations of color.
AI systems should be secure and resilient in their operation in a way that minimizes potential harm, optimizes accuracy, and fosters confidence in their reliability.
Facial recognition systems have been used by police forces for more than two decades. Recent studies by MIT and the NIST have found that while the technology works relatively well on white men, the results are less accurate for other demographics, in part because of a lack of diversity in the images used to develop the underlying databases.
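One way to surface this kind of disparity is to report error rates separately for each demographic group instead of a single overall accuracy. The following is a minimal sketch of that idea. The record format, the group labels, and the sample data are invented for illustration and are not results from the MIT or NIST studies.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """Return the share of wrong predictions for each group.

    Each record is a (group, predicted_match, true_match) tuple.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical evaluation records, not real benchmark data.
records = [
    ("group_a", True, True), ("group_a", False, False),
    ("group_a", True, True), ("group_a", True, False),
    ("group_b", True, False), ("group_b", True, True),
    ("group_b", False, True), ("group_b", False, False),
]
rates = error_rate_by_group(records)
```

In this invented sample the overall error rate is 3 in 8, which hides the fact that `group_b` is misidentified twice as often as `group_a`. A per-group breakdown like this is the kind of check an audit would need before such a system informs arrest decisions.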
In this case, we need to hold the responsible people or the government accountable to prevent it from happening again.
If people are held accountable, awareness will grow, so the public will better understand these systems and be able to trust the process.
Societal and environmental wellbeing: Including sustainability and environmental friendliness, social impact, society, and democracy.
Getting wrongly convicted or sentenced can have a negative effect on society and the environment in general.
Conclusion: For these reasons, AI isn’t ready to be used in the justice system just yet. Many adjustments need to be made, because freedom is not something we can put in the hands of something we don’t yet understand.


The OECD Framework for the Classification of AI Systems characterizes AI systems and applications along the following dimensions:
  • People and Planet; Economic Context; Data and Input; AI Model; and Task and Output.
  • Each dimension has its own characteristics and attributes, or sub-dimensions, relevant to assessing the policy considerations of particular AI systems.
The phases of the AI system lifecycle can be associated with the dimensions of the OECD Framework for the Classification of AI Systems.
  • This mapping is useful to identify some of the key AI actors in each dimension, which has accountability and risk management implications.
Mapping the AI system’s lifecycle to the key dimensions of an AI system. Source: OECD, 2022a.
Note: The actors included in the visualization are illustrative, not exhaustive, and based on previous OECD work on the AI system lifecycle.

