Anything that can be used against you will be used against you
February 15, 2022
Listen to the podcast
Abeba Birhane is a cognitive scientist at University College Dublin (UCD), whose research explores relational ethics and bias in machine learning.
What worries Abeba Birhane about her field is the prevalence of “blind trust” in artificial intelligence (AI). AI models are ubiquitous tools, increasingly used to aid decision making, including whether someone is hired or granted a loan or mortgage.
“When you hear, ‘AI can classify emotions’ or ‘AI can understand language’ or ‘AI can predict gender’ or anything like that, there is a culture that tends to just accept it without any critical scrutiny,” she observes. And it’s not just the general public putting too much faith in the capabilities of artificial intelligence.
“AI researchers themselves can contribute to the over-hype by making claims that are purely speculative and that can’t be supported by empirical evidence.”
She recommends a shift towards a more critical culture, so that “whenever we see a claim that seems too good to be true, people will say, ‘Does it? Where's the evidence? How so?’”
Abeba’s reluctance to take such claims at face value stems in part from her work auditing the datasets used to build models.
In 2020 she and her colleague Vinay Prabhu made international headlines when they discovered that MIT’s Tiny Images dataset of 80 million images contained “thousands of images labelled with racial slurs, with the ‘n’ word and other derogatory terms”.
MIT released a statement apologising and retracting the dataset, asking anyone with copies to delete them and not to use Tiny Images to train or validate any further machine learning models. The dataset had already been used in “hundreds” of academic papers but had never been audited.
“Part of me is happy that our work is impactful. It’s good that they listened and took action. But I wish they could have taken a different action and put more money, resources and time into improving the dataset, rather than just bringing it down.”
Abeba concedes that how to make datasets better is “the million dollar question”. Datasets sourced from the internet are “guaranteed to include problematic content, the worst of the internet. They're going to have offensive associations for marginalised groups and they're going to be harmful”. Even a simple Google image search of the word ‘professional’ shows mostly pictures of white men.
Taking the time and resources to curate, document and critically examine datasets before using them is key.
“The cardinal principle is that you can assume that your dataset is problematic, unrepresentative, stereotyping, and otherwise biased and that you need to examine it and mitigate these issues accordingly. You have to document what's in it. You have to delve in and take a critical look. You have to always know that you have to make it better.”
It could take “months, years” to manually look through these enormous datasets and there are “various mechanisms to make them less toxic”.
But many people “don't curate, they just use some kind of semi-automated mechanisms to put giant datasets together”.
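Auditing a dataset of this scale is typically semi-automated. As a toy illustration only (not Abeba's actual method, and with hypothetical labels and a hypothetical blocklist rather than real Tiny Images data), a first automated pass might simply check a dataset's class labels against a curated list of known problematic terms:

```python
def flag_labels(labels, blocklist):
    """Return the labels that match a blocked term, case-insensitively.

    A crude first-pass filter: it catches exact label matches only,
    so a human reviewer would still need to examine images, captions
    and label-image associations that no keyword list can detect.
    """
    blocked = {term.lower() for term in blocklist}
    return [label for label in labels if label.lower() in blocked]

# Hypothetical example data, purely for illustration.
labels = ["tree", "car", "offensive_term", "dog"]
blocklist = ["offensive_term"]

print(flag_labels(labels, blocklist))  # -> ['offensive_term']
```

A keyword scan like this is only a starting point: as the article notes, real curation means documenting what is in the dataset and critically examining it, which automated filters alone cannot do.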
This can lead to machine learning models that are biased against certain people based on their gender or race. Harmful repercussions can follow if such models are used, for example, to filter applications for jobs, insurance or mortgages.
Abeba is also concerned about the rise in affective computing, which is the study and development of systems and devices to recognise, interpret, process and simulate human emotions and cognitive phenomena.
“There are billions being invested in emotion recognition, gender prediction, and other problematic pseudoscience where inner behaviour is inferred based on reading of outer expression or characteristics. And this is problematic because these are not phenomena that you can just read off faces, bodies or expressions. Hopefully we will have more regulations to curb these harmful developments.”
In her PhD thesis, defended last December, Abeba put forward a relational ethics framework to mitigate algorithmic injustice.
“At the core of relational ethics is that we really have to zoom out and look at the structures of society and question the model that we are trying to build. Like, can there really be a model that predicts emotion?”
Facial recognition software has already proven notoriously flawed, being less accurate at recognising African-American and Asian faces than Caucasian ones. Do we even need this controversial technology?
“In most cases I would say we don't need it. The failures outweigh any benefits that facial recognition might bring,” says Abeba, who analyses benefits and harms “through the lens of marginalised communities who are often the most negatively impacted when technology fails”.
She is concerned about why Facebook patented its users’ socioeconomic status data, warning, “anything that can be used against you will be used against you”.
Abeba describes her field of cognitive science as “an interdisciplinary inquiry that looks at the questions: what’s cognition? What’s intelligence? What's emotion? What's development? What's language? What's problem solving? What's the self? What's being? What's consciousness?”
Her “really niche area” is embodied cognitive science, which posits that “we have to think of cognition in the person as something that exists in a web of relations with other people and with the environment” rather than something more abstract, philosophical or computational. The AI models we build of these phenomena then have to recognise and capture these messy connections and dynamics.
“Basically, you cannot treat cognition or the person as something that exists on its own island, but something that is always interactive, relational, changing and dynamic.”
Abeba describes herself as an “interdisciplinary researcher” and studied physics, psychology and philosophy before cognitive science.
“All of this jumping around disciplines comes from curiosity,” she says. “But eventually, in cognitive science, I feel like I have found my home.”
This article was brought to you by UCD Discovery - fuelling interdisciplinary collaboration.