How AI risks automating racism, prejudice, and human bias
Chris Middleton explains how problems in human society are being replicated – often accidentally – by artificial intelligence.
• This article has been quoted in London’s Evening Standard newspaper.
AI is the new must-have differentiator for technology vendors and their customers. Yet the need to understand AI’s social impact is overwhelming, not least because most AI systems rely on human beings to train them. As a result, existing flaws and biases within our society risk being replicated – not in the code itself, necessarily, but in the training data that is supplied to some systems, and in the problems that they’re being asked to solve.
Without complete data, AI systems can never be truly impartial: they can only reflect or reproduce the conditions in which they are created, and the belief systems of their creators. This report will explain how and why, and share some real-world examples. The need to examine these issues is becoming increasingly urgent. As AI, machine learning, deep learning, and computer vision rise, buyers and sellers are rushing to include AI in everything, from enterprise CRM to national surveillance programmes and policing systems.
Are people with tattoos criminals?
One example of AI in national surveillance is the FBI’s bizarre scheme to record and analyse citizens’ tattoos, in order to predict if people with ink on their skin will commit crimes. Take a ‘Big Bang’ view of this project (rewind the clock to infer what the moment of creation must have been), and it’s clear that a subjective, non-scientific viewpoint (‘people with tattoos are criminals’) was adopted as the core principle of a national security system, and software was designed to prove it.
The code itself is probably clean, but the problem that the system is being asked to solve, and the data it is tasked with analysing, are surely flawed. Arguably, they betray the prejudices of the system’s commissioners. Why else would it have been conceived?
In such a febrile atmosphere, the twin problems of confirmation bias in research, and human prejudice in society, may become automated pandemics: AIs that can only tell people what they want to hear, because of how the system has been trained. Automated politics, with a veneer of evidenced fact.
Often this part of the design process will be invisible to the user, who will regard whatever results the system produces as being impartial. A recent AI white paper published by UK-RAS, the UK’s research organisation for robotics and AI, makes exactly this point: “Researchers saw how machine learning technology reproduces human bias, for better or for worse. [AI systems] reflect the links that humans have made themselves.”
That’s the view of the UK’s leading AI and robotics researchers. So, is AI automating prejudice and other societal problems? Or are these issues simply hypothetical?
The racist facial recognition system
The unfortunate fact is that they are already becoming real-world problems, in a significant minority of cases. Take the facial recognition system developed at MIT recently that was unable to identify African American women, because it was created and tested within a closed group of white males. The libraries for the system were distributed worldwide before an African American student at MIT exposed the fact that it could only identify white faces.
We know this story is true, because it was shared by Joichi Ito, head of MIT’s Media Lab, at the World Economic Forum 2017. He described his own students as “oddballs”: introverted white males working in small teams with few external reference points.
The programmers weren’t consciously prejudiced, Ito explained, but it simply hadn’t occurred to them that their group lacked the diversity of the real world into which their system would be released.
As a result, a globally distributed AI was poorly trained and ended up discriminating against an entire ethnic group, which was invisible to the system. That the developers hadn’t anticipated this problem was their key mistake, but it was a massive one.
Male dominance and insularity are big problems for the tech industry: in the UK, just 17 per cent of people in science, technology, engineering, or maths (STEM) careers are women, while in the West the overwhelming majority of coders are young, white males.
The UK-RAS report shares a similar example of societal bias entering AI systems: “When an AI program became a juror in a beauty contest in September 2016, it eliminated most black candidates, as the data on which it had been trained to identify ‘beauty’ did not contain enough black-skinned people.” Again, the humans training the AI unconsciously weighted the data.
The lesson here is not that any given AI or line of code is inherently biased – although it might be – it’s that the data that populates AI systems may reflect local/social prejudices. At the same time, AI is seen as impartial, so any human bias risks being accepted as evidenced fact. Most AI is a so-called ‘black box’ solution (see below), making it hard for users to interrogate the system to see how or why a result was arrived at.
In short, many AI systems are inscrutable.
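One practical response is to check who is actually represented in a training set before a model is built on it. The Python sketch below is a minimal illustration, not the method used in any system described here: the group labels, the counts, and the 20 per cent threshold are all invented for the example.

```python
from collections import Counter


def representation_shares(group_labels):
    """Return each group's share of the training examples (0.0 to 1.0)."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def under_represented(group_labels, min_share=0.2):
    """List groups whose share falls below a chosen threshold.

    The default threshold is an arbitrary assumption for illustration.
    """
    shares = representation_shares(group_labels)
    return sorted(group for group, share in shares.items() if share < min_share)


# Invented example: 100 training images tagged with a made-up group attribute.
training_groups = ["group_a"] * 90 + ["group_b"] * 8 + ["group_c"] * 2

shares = representation_shares(training_groups)
flagged = under_represented(training_groups)

print(shares)
print("Under-represented:", flagged)
```

A check like this only exposes gaps in the data that someone has thought to label; it says nothing about biases hidden in how the examples were chosen or scored.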
Staring bias in the face
The possibility that AI could worsen discrimination in human society is being taken seriously by analysts and researchers. For example, The Age of Automation, a 2017 report by the RSA and YouGov, suggests that automation could lead to an “entrenchment of demographic biases”, while the use of AI in recruitment – such as algorithms that screen CVs – could amplify workplace biases and block people from employment based on their age, ethnicity, or gender.
Discrimination can take subtler forms, too, with some being much harder to identify than gender bias or racism in the workplace, adds the report. “Equipped with AI systems, organisations will have greater precision in predicting people’s behaviours and the risks they face. This could lead to certain groups being denied access to goods, services, and employment opportunities.
“Insurance companies, for example, may one day be able to use advanced algorithms to determine the likelihood of prospective customers acquiring a disease, making them uninsurable.”
These fears are shared by Lord Clement-Jones, who chairs the UK’s Parliamentary Select Committee on the economic, ethical, and social implications of AI. In September 2017, he said: “How do we know in the future, when a mortgage, or a grant of an insurance policy, is refused, that there is no bias in the system?
“There must be adequate assurance, not only about the collection and use of big data, but in particular about the use of AI and algorithms. It must be transparent and explainable, precisely because of the likelihood of autonomous behaviour. There must be standards of accountability, which are readily understood.”
Consent adds another dimension. It’s illegal to collect personally identifiable information (PII) or data (PID) without the subject’s consent – national security applications excepted. But if we do divulge personal details, we may not be made aware that an organisation is mining them to identify hidden traits in our personalities. In the future, the Ts & Cs we sign will demand detailed reading.
Consider this: the human face is the definitive example of PII, which is why our passports contain our photos, and why facial recognition systems can verify our identities. But when we have our pictures taken for ID purposes – or friends tag us on Facebook – are we consenting to an AI system analysing our features to predict our beliefs, politics, health, or sexual orientation? This isn’t a science fiction scenario: such programmes already exist.
Are they legal? Might they, too, be examples of confirmation bias? Would a citizen ever find out why they had been denied services, life insurance, employment, or admission? And what if the AI is wrong?
The legal dimension
Why are all of these risks so important to consider? Evidence is mounting that data problems may already have begun to automate bias within our legal systems: a real challenge as law enforcement becomes increasingly augmented by machine intelligence in different parts of the world.
COMPAS is an algorithm that’s already being used in the US to assess whether defendants or convicts are likely to commit future crimes. The risk scores it generates are used in sentencing, bail, and parole decisions – just as credit scores are in the world of financial services. A recent article published on FiveThirtyEight.com set out the alleged problem with COMPAS:
“An analysis by ProPublica found that, when you examine the types of mistakes the system made, black defendants were almost twice as likely to be mislabeled as likely to reoffend – and potentially treated more harshly by the criminal justice system as a result. On the other hand, white defendants who committed a new crime in the two years after their COMPAS assessment were twice as likely as black defendants to have been mislabeled as low-risk.
“An even stickier question is whether the data being fed into these systems might reflect and reinforce societal inequality. For example, critics suggest that at least some of the data used by systems like COMPAS is fundamentally tainted by racial inequalities in the criminal justice system.” Again, this is a problem of flawed data being fed into an application that is seen by its users as impartial.
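The kind of comparison described in that analysis can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not ProPublica’s analysis or the COMPAS algorithm: the group names and every count below are invented, and `error_rates` is a hypothetical helper.

```python
def error_rates(records):
    """Compute error rates for one group.

    records: list of (predicted_high_risk, reoffended) boolean pairs.
    Returns (false_positive_rate, false_negative_rate).
    """
    fp = sum(1 for pred, actual in records if pred and not actual)
    tn = sum(1 for pred, actual in records if not pred and not actual)
    fn = sum(1 for pred, actual in records if not pred and actual)
    tp = sum(1 for pred, actual in records if pred and actual)
    # Share of people who did NOT reoffend but were labelled high-risk.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # Share of people who DID reoffend but were labelled low-risk.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Invented counts for two hypothetical groups (not real COMPAS data).
group_x = ([(True, False)] * 40 + [(False, False)] * 60
           + [(False, True)] * 30 + [(True, True)] * 70)
group_y = ([(True, False)] * 20 + [(False, False)] * 80
           + [(False, True)] * 60 + [(True, True)] * 40)

fpr_x, fnr_x = error_rates(group_x)
fpr_y, fnr_y = error_rates(group_y)

print(f"group_x: false positives {fpr_x:.0%}, false negatives {fnr_x:.0%}")
print(f"group_y: false positives {fpr_y:.0%}, false negatives {fnr_y:.0%}")
```

In this made-up data, one group is wrongly labelled high-risk twice as often as the other, while the second group is wrongly labelled low-risk twice as often. A single overall accuracy figure would hide both patterns, which is why the errors have to be broken down by group.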
Tainted data in a networked system
The problem of tainted data runs deep in a networked society. In 2015, a journalist colleague shared a story with Facebook friends of how he searched for images of teenagers to accompany an article on youth IT skills.
When he searched for “white teenagers”, he said, most of the results were library shots of happy, photogenic young people, but when he searched for “black teenagers”, he was shocked to see Google return a disproportionately high number of criminal/suspect mug shots.
(Author’s note: I verified his findings at the time. The problem is still noticeable today, but far less overt, suggesting that Google has tweaked its image search algorithm.)
The underlying point is that, for decades, overall media coverage in the US, the UK, and elsewhere, has disproportionately focused on criminality within certain ethnic groups. This partial coverage populates the network, which in turn reinforces public perceptions: a vicious circle of confirmation bias feeding confirmation bias. This is why diversity programmes and positive messaging are important; it’s not about ‘political correctness’, as some allege; it’s about rebalancing a system before we replicate it in software.
This extraordinary article on Google search data reveals how prejudices run much deeper in human society than some of us like to believe, and are revealed by what we search for in private more than what we say in public. (Sample quote: “Overall, Americans searched for the phrase ‘kill Muslims’ with about the same frequency that they searched for ‘martini recipe’ and ‘migraine symptoms’.”)
Human bias can affect the data within AI systems at both linguistic and cultural levels, because – as we’ve seen – most AI still relies on being trained by human beings. To a computer looking at the world through camera eyes, a human is simply a collection of pixels. AI has no fundamental concept of what a person is, or what human society might be.
A computer has to be taught to recognise that a certain arrangement of pixels is a face, and that a different arrangement is also a face. And it has to be taught by human beings what ‘beauty’ and ‘criminality’ are by feeding it the relevant data. The case studies above demonstrate that both these concepts are subjective and prone to human error, while legal systems throughout the world have radically different views on crime (as we will see below).
Our systems replicate our beliefs and personal values – including misconceptions or omissions – while coders themselves often prefer the binary world of computers to the messy, emotional world of humans. Again, MIT’s Ito made this observation of his own students.
The proof of Tay
Microsoft’s Tay chatbot disaster in 2016 proved this point: a naïve robot, programmed by binary thinkers in a closed community. Tay was goaded by users into spouting offensive views within 24 hours of release, as the AI learned from the complex human world it found itself in. Humour and internet trolls weren’t part of its training. That’s an extraordinary omission for a chatbot let loose on a social network, and it speaks volumes about the mindset of its programmers.
However, the cultural dimension of AI was demonstrated by another story in 2016. In China, Microsoft’s Xiaoice chatbot faced none of the problems that its counterpart did in the West: Chinese users behaved differently, and there were few reported attempts to subvert the application. This is surely proof that AI is both modelled on, and shaped by, local human society. Its artificiality does not make it neutral.
These issues will become more and more relevant as law enforcement becomes increasingly automated. The cultural landscape and legal system surrounding a robot policeman in, say, Dubai is very different to that in Beijing or San Francisco.
The rise of robocop
In each of these three locations robots are already being trained and trialled by local police services: Pal Robotics’ Reem machines in Dubai (in public liaison/information roles); Knightscope K5s in the Bay Area (which patrol malls, recording suspicious activity); and Anbot riot-control bots in China.
There is no basis for assuming that future AI police officers or applications will implement a form of blank, globalised machine intelligence without bias or favour. It is more likely that they will reflect the cultures and legal systems of the countries in which they operate, just as human police do.
And the world’s legal systems are far from uniform. In Saudi Arabia, for example, to be an atheist is to be regarded as a terrorist, and women have far fewer rights than men. In Iran, homosexuality is punishable by death, as are offences such as the abandonment of religious belief (apostasy).
It’s comforting to believe that, in the real world, no one would design AIs or facial recognition algorithms to determine citizens’ private thoughts, political beliefs, or sexual orientation, and yet here’s an example of AI being deployed to predict if people are gay or straight. Note how quickly this system has been developed within the current AI boom. Take the Big Bang approach again, and ask: Why was this issue uppermost in the developer’s mind?
Now factor in robot police or AI applications enforcing laws in one culture that another might find abhorrent. The potential is clearly there for technology to be programmed to act against globally stated human rights.
In the US, the numbers of people shot by police are documented here by the Washington Post, while this report suggests that black Americans are three times more likely to be killed by officers than white Americans. Meanwhile, this article exposes the racial profiling that occurs in some sectors of US law enforcement, despite attempts to prevent it. In the UK, statistics reveal that force is more likely to be used against black Londoners by police than against any other racial group. This is the messy human world that robots are entering – robots programmed by human beings.
Throughout the world, politicians are increasingly targeting minority groups, or removing legal protections from them, even in societies that we don’t regard as oppressive. In the US alone, recent examples include the proposed bans on people travelling from certain Muslim-majority countries, and on transgender people serving in the military, along with the proposed removal of legal protections for LGBTQ people and the scrapping of the Obama-era DACA scheme. Russia is among several other countries to turn against LGBTQ citizens.
So might any future robocop perpetuate the apparent biases in the US legal system, for example? As we’ve seen, that will depend on what training data has been put into the system, by whom, to what end, and based on what assumptions. The COMPAS case study above suggests that core data can be tainted at source by previous flaws and inequalities in the legal system.
The limits of AI
But let’s get back to the technology itself. The UK-RAS white paper acknowledges that AI has severe limitations, at present, and that many users have “unrealistic expectations” of it. For example, the report says: “One limitation of AI is the lack of ‘common sense’; the ability to judge information beyond its acquired knowledge […] AI is also limited in terms of emotional intelligence.”
Then the researchers make a simple observation that everyone rushing to implement the technology should consider: “true and complete AI does not exist”, says the white paper, and there is “no evidence yet” that it will exist before 2050.
So it’s a sobering thought that AIs with no common sense and possible training bias, and which can’t understand human emotions, behaviour, or social contexts, are being tasked with trawling context-free data pulled from human society in order to expose criminals – as defined by politicians.
And yet that’s precisely what’s happening in US and UK national surveillance programmes.
Opening the ‘black box’
The UK-RAS white paper takes pains to set out both the opportunities and the risks of AI, which it describes as a transformative, trillion-dollar technology, the future of which extends into augmented intelligence and quantum computing.
On the one hand, the authors note: “[AI] applications can replace costly human labour and create new potential applications and work along with/for humans to achieve better service standards. […] It is certain that AI will play a major role in our future life. As the availability of information around us grows, humans will rely more and more on AI systems to live, to work, and to entertain. […] AI can achieve impressive results in recognising images or translating speech.”
But on the other, they add: “When the system has to deal with new situations when limited training data is available, the model often fails. […] Current AI systems are still missing [the human] level of abstraction and generalisability. […] Most current AI systems can be easily fooled, which is a problem that affects almost all machine learning techniques.
“Deep neural networks have millions of parameters, and so to understand why the network provides good or bad results becomes impossible. Trained models are often not interpretable. Consequently, most researchers use current AI approaches as a black box.”
That last quote is telling: researchers are saying that some AI systems are already so complex that even their designers can’t say how or why a decision has been made by the software.
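A rough calculation shows how quickly the parameter count grows. The Python sketch below assumes a plain, fully connected network with hypothetical layer sizes; it is not drawn from the white paper, and real deep networks are often far larger.

```python
def dense_parameter_count(layer_sizes):
    """Count the weights and biases in a fully connected network.

    Each pair of adjacent layers contributes n_in * n_out weights
    plus one bias per output unit.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical small image classifier: 784 inputs, two hidden layers, 10 outputs.
layers = [784, 1024, 1024, 10]
total = dense_parameter_count(layers)

print(f"{total:,} trainable parameters")  # 1,863,690 trainable parameters
```

Even this toy network has nearly two million adjustable numbers, none of which corresponds to a rule that a person could read and check.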
Organisations should be wary of the black box’s potential to mislead and to be misled, along with its capacity to tell people what they already believe – for better, or for worse.
Business and government should take these issues on board, and the systems they release into the wild must be transparent – as far back as the first principles that were adopted before the parameters were specified. Moreover, the data that is being put into these systems should be open to interrogation, to ensure that AI systems are not being gamed to produce weighted results.
Regulations may help. The EU’s GDPR, which comes into force in May 2018, provides citizens with a new right to see information about the logic involved in, and the “significance and envisaged consequences of”, any automated decision-making systems that affect them.
Users: question your data before you ask an AI to do it for you, and challenge your preconceptions.
• Further reading:
How Google search data reveals the truth of who we are (Guardian)
Face-reading AI will be able to detect your politics, claims professor
When AI and privacy meet (Constellation Research)
© Chris Middleton 2017