Counterfactual explanations and overconfident artificial intelligence

Friday, 28 July, 2023

Listen to the podcast

Eoin Delaney is a fourth-year PhD student at University College Dublin and works in the Insight Centre for Data Analytics and the VistaMilk SFI research centre. He won the Best Application of AI in a Student Project Award at the recent AI Ireland awards for his research on Explainable AI (XAI), specifically on how black-box AI models can be explained to developers and end-users using counterfactual explanations.

“This morning I drove into the campus and I crashed my car,” says Eoin Delaney - and thankfully this is the start of a fictional anecdote to illustrate a counterfactual explanation, rather than a real-life event.

“I might say, ‘Well, if I had left a little bit later then I wouldn't have crashed’ or, ‘If I'd taken my normal route instead of a quicker route then I would have arrived safely.’ We love thinking in counterfactuals as humans and one of the great things about them is that they can also be used to explain machine learning models.”

Machine learning models, or algorithms, are AI systems trained using datasets to make predictions, recommendations or decisions that impact on people’s lives. These can be, as Delaney says, “high stakes [and often ethically dubious] decisions”, like predicting who qualifies - or not - for bank loans, mortgages, and insurance. Public administrations worldwide are also experimenting with algorithmic decision-making and predictive analytics to manage and improve key areas like transport and healthcare.

“The aim of Explainable AI (XAI) is trying to understand why predictions were made in a certain way, and how those predictions could be different. I think the take-home message is that these machine learning models don't normally operate how we think they do - and explanation is really important.”

For example, counterfactual explanations can offer more helpful reasons why, if we apply for a loan, the ‘computer says no’ - that catchphrase of the BBC noughties sketch show Little Britain.

“The model might say, ‘I'm sorry, but you didn't get the loan’. And I might be left really angry and wondering, ‘Why didn't I get this loan?’ The counterfactual will say, ‘Well, if you had €1000 more in your bank account, you would have got the loan.’” 

This information should improve the person’s chances of applying successfully next time. “And that’s a huge motivation for why we use counterfactuals: they can help us realise better actions or things we can do to make the outcome more favourable in the future.”
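As a rough sketch of how such an explanation can be generated (the model, features and figures below are invented for illustration, not taken from Delaney's work), one simple approach is to nudge a feature of the rejected application until the model's decision flips:

```python
# Minimal sketch of a counterfactual explanation for a toy loan model.
# The model, weights and thresholds are invented for illustration only.
import numpy as np

# Toy "black-box" model: approves a loan if a weighted score exceeds a threshold.
WEIGHTS = np.array([0.004, 0.03])        # weights for [balance_eur, years_employed]
THRESHOLD = 5.0

def approve(applicant: np.ndarray) -> bool:
    return float(WEIGHTS @ applicant) >= THRESHOLD

def counterfactual(applicant: np.ndarray, feature: int, step: float, max_steps: int = 10_000):
    """Greedily nudge one feature until the decision flips, and report the change."""
    candidate = applicant.astype(float).copy()
    for _ in range(max_steps):
        if approve(candidate):
            return candidate - applicant   # the change that would flip the outcome
        candidate[feature] += step
    return None                            # no counterfactual found within the budget

applicant = np.array([800.0, 2.0])         # €800 balance, 2 years employed
print("Approved?", approve(applicant))     # False for this toy applicant
delta = counterfactual(applicant, feature=0, step=50.0)
if delta is not None:
    print(f"If your balance were €{delta[0]:.0f} higher, the loan would be approved.")
```

The reported change is the counterfactual: the smallest tweak the search found that turns a refusal into an approval.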

Delaney also applies counterfactuals to his research in smart agriculture, helping farmers to understand what changes might improve yields. 

“We can programme up a system to make counterfactuals but I think the AI community could do a bit more engagement with users: Are these actions useful? Are they feasible? Could you make these changes on a farm?”

“You become more critical of AI when you work in Explainable AI,” he says, and he has found that counterfactual explanations highlight shortcomings and bias in machine learning models.

“There are a lot of times where models don't work well at all. I think it's really important that people don't over-promise how good models are at doing things. We need to be really, really careful about that.”

He gives the example of a model that looks at an image and tries to predict what it is. 

“One of the datasets that we sometimes work with is this dataset of different types of birds. So I could look at this image of a bird, plug it into a model and the model will come back saying, ‘This is a magpie’. You'll begin to notice pretty soon that if you just change a few random pixels around the image, sometimes the model will completely change its prediction to maybe a robin or some other bird. As humans, we think that decision is not really plausible. It doesn't really make sense.”
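A stress test of that kind can be sketched in a few lines. The classifier below is a stand-in with random weights rather than the bird model Delaney describes, but the probe itself - flip a few pixels and check whether the top prediction changes - is the same idea:

```python
# Sketch of a robustness probe: perturb a handful of pixels and see whether
# the classifier changes its mind. The "model" here is a stand-in (random
# linear weights), not the bird classifier from the interview.
import numpy as np

rng = np.random.default_rng(0)
CLASSES = ["magpie", "robin", "blackbird"]
W = rng.normal(size=(len(CLASSES), 32 * 32))   # stand-in for trained model weights

def predict(image: np.ndarray) -> str:
    """Return the class with the highest linear score for a 32x32 image."""
    scores = W @ image.ravel()
    return CLASSES[int(np.argmax(scores))]

def perturb_pixels(image: np.ndarray, n_pixels: int = 5) -> np.ndarray:
    """Copy the image and overwrite a few randomly chosen pixels with noise."""
    noisy = image.copy()
    idx = rng.choice(image.size, size=n_pixels, replace=False)
    noisy.ravel()[idx] = rng.uniform(0.0, 1.0, size=n_pixels)
    return noisy

image = rng.uniform(0.0, 1.0, size=(32, 32))
original = predict(image)
flips = sum(predict(perturb_pixels(image)) != original for _ in range(100))
print(f"Original prediction: {original}")
print(f"Prediction changed on {flips} of 100 random 5-pixel perturbations")
```

How often the prediction flips depends entirely on the model being probed; the point of the test is that a plausible classifier should not change its mind over a few stray pixels.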

He has found that even when making blatantly wrong decisions like this, the AI can be pretty sure of itself.

“One of the things about machine learning models and AI models is they're often very overconfident in their predictions. So they might say, ‘I'm 100% sure this is an image of a magpie’. Whereas we, as humans - or a bird expert - might not be so sure. We might be 75% sure. And models are actually pretty bad at doing that,” he adds, of bringing uncertainty into the equation.

He points out that humans, of course, can be overconfident too. But Delaney was “amazed” at the extent of the AI’s overconfident mistakes when working with the MNIST dataset of handwritten digits. 

“The model was really good at looking at a digit and saying, ‘This is a 5’. But if I plugged in an image of something that wasn't a digit at all, like an image of a T-shirt, it would say, ‘I'm 95% sure this is an image of a 3’. They're really bad at generalising into different scenarios.”
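The mechanics behind that kind of overconfidence are easy to sketch: a softmax classifier must spread 100% of its probability across the classes it knows, so even an input that looks nothing like a digit gets a confident-sounding verdict. The model below uses random stand-in weights rather than a trained MNIST network:

```python
# Sketch of why a softmax classifier sounds sure about nonsense input:
# softmax always distributes all of the probability over the known classes,
# so an out-of-distribution image still gets a confident-looking answer.
# Random stand-in weights are used here, not a trained MNIST model.
import numpy as np

rng = np.random.default_rng(1)
CLASSES = [str(d) for d in range(10)]   # digit classes 0-9
W = rng.normal(size=(10, 28 * 28))      # stand-in for trained weights

def predict_proba(image: np.ndarray) -> np.ndarray:
    """Return softmax probabilities over the ten digit classes."""
    logits = W @ image.ravel()
    logits -= logits.max()              # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum()

# A "T-shirt" stand-in: any 28x28 input that is nothing like a digit.
not_a_digit = rng.uniform(0.0, 1.0, size=(28, 28))
probs = predict_proba(not_a_digit)
top = int(np.argmax(probs))
print(f"Model says: {probs[top]:.0%} sure this is a '{CLASSES[top]}'")
```

There is no option for the model to say “this is not a digit at all”, which is exactly the failure to generalise that Delaney describes.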

He is now keen to explore this concerning area in more depth.

“There is a huge research effort in trying to study uncertainty from an AI point of view and trying to make these systems less overconfident.”
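One widely used technique from that line of work, offered here as a generic illustration rather than a description of Delaney's own research, is temperature scaling (Guo et al., 2017): a model's raw scores are divided by a constant tuned on held-out data, which softens its confidence while leaving its ranking of classes unchanged.

```python
# Temperature scaling: divide the logits by T > 1 before softmax so the
# predicted probabilities are less extreme. Generic illustration only.
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    scaled = logits / temperature
    scaled -= scaled.max()              # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = np.array([9.0, 4.0, 2.0])      # raw scores for three classes
print(softmax(logits).max())                    # ~0.99: "almost certain"
print(softmax(logits, temperature=4.0).max())   # ~0.69: same ranking, humbler confidence
```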

This effort underscores the need to keep humans in the loop. Is there enough human oversight of machine learning models?

“I'd say definitely not. I think humans are seriously lacking in terms of being in the loop and we also need to test our explanations on human users. Not necessarily just domain experts, but also just regular people who might not be too interested in AI, which is perfectly fine. But they will be in a world where AI makes decisions for them and it's hopefully a good world. As AI becomes more deployed, the consequences will become high stakes.”

Listen to the podcast

This article was originally published on 21 December 2022.