
Can you trust algorithms to be fair? New research says ‘it depends’

Algorithms and AI are keen to please us. That is not always a good thing.

In 2017 Netflix premiered an episode of its dystopian sci-fi series ‘Black Mirror’, in which an all-knowing algorithm had taken over the dating game.

The algorithm analysed, matched and told prospective couples – before their relationship even started – how long they would last.

The episode spoke to an underlying angst in our digital age: If computers know us better than we do ourselves, is there even such a thing as free will?

A fair recommendation

While an omniscient love robot has yet to be invented, the Netflix vision has in some sense come true.

On average, Danes spend 5-7 hours a day in front of a screen, according to the telecoms provider Telia. And a lot of that time is spent with some form of recommender system – algorithms whose sole purpose is to give you what you want. Sometimes before you even know you want it.

“The purpose of recommender systems is to match items such as products, services, or even people to their users,” says Theresia Rampisela, a PhD researcher at the University of Copenhagen. “The goal is that people who receive the recommendations will find them relevant, or that they will interact with, purchase, click, watch or rate them.”

These systems have come to dominate our experience online – recommending your next binge-worthy TV series, people you might know on Facebook or the right pants to go with the 12-pack of socks you ordered online. But why they suggest one thing over another is not always clear.

“What is fair in one domain may not be fair for another domain” — Theresia Rampisela

That is what Theresia Rampisela, a contributing researcher on the Algorithms, Data and Democracy (ADD) project, aims to shed light on by studying algorithmic fairness. “What do I mean by fairness?” Theresia Rampisela asks. “Unfortunately, there is no single definition for this, but it is about ensuring users are treated without discrimination.”

Take an online shopping platform, where the users are buyers and sellers: Should it give local sellers more exposure over far-away competitors? Or similarly boost women- or minority-owned businesses? Should you be able to pay to get better recommendations than others?

“What is fair in one domain may not be fair for another domain,” says Theresia Rampisela, who has recently published a paper comparing different ways of measuring algorithmic fairness. While she thinks there is no such thing as a perfectly fair algorithm, her work indicates that fairness and relevance are not always a zero-sum game:

“There used to be this belief that if we increase fairness, we sacrifice recommendation effectiveness. According to recent research from others, this is not always true. What we see in our experiments is that we can sacrifice relevance a little bit, but then we can make it more fair,” she says.
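For readers who want a concrete picture of that trade-off, here is a small, made-up illustration in Python. It is not Rampisela’s actual method or data, just a toy re-ranker that penalises items from a seller group that is already over-represented in the list, giving up a little predicted relevance in exchange for more balanced exposure.

```python
# Illustrative sketch only (invented data, not the researcher's method):
# greedily re-rank recommendations so exposure is shared more evenly between
# two seller groups, trading a little predicted relevance for fairness.

def rerank(items, k, fairness_weight):
    """Pick k items one by one, scoring each candidate by its relevance
    minus a penalty for how over-represented its group already is."""
    selected = []
    group_counts = {}
    for _ in range(k):
        best, best_score = None, float("-inf")
        for item in items:
            if item in selected:
                continue
            share = group_counts.get(item["group"], 0) / (len(selected) + 1)
            score = item["relevance"] - fairness_weight * share
            if score > best_score:
                best, best_score = item, score
        selected.append(best)
        group_counts[best["group"]] = group_counts.get(best["group"], 0) + 1
    return selected

# Toy candidate pool: "big" sellers happen to have slightly higher predicted relevance.
items = [{"name": f"big-{i}", "group": "big", "relevance": 0.9 - 0.01 * i} for i in range(10)]
items += [{"name": f"small-{i}", "group": "small", "relevance": 0.8 - 0.01 * i} for i in range(10)]

for w in (0.0, 0.3):
    top = rerank(items, k=10, fairness_weight=w)
    avg_rel = sum(it["relevance"] for it in top) / len(top)
    small_share = sum(it["group"] == "small" for it in top) / len(top)
    print(f"fairness_weight={w}: avg relevance={avg_rel:.3f}, small-seller share={small_share:.0%}")
```

In this toy example, turning on the penalty lowers the average relevance only slightly while moving the smaller sellers from no exposure to several of the ten recommendation slots – the kind of modest trade Rampisela describes.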

The dark side of recommendations

In a recent study, the Anti-Bullying Centre at Dublin City University concluded that “recommender algorithms used by social media platforms are rapidly amplifying toxic content” – for example by exposing teenage boys to videos by the controversial manosphere influencer Andrew Tate.

So is algorithmic fairness the right metric, or should we rather be talking about algorithmic ethics?

“In this field, everything is somehow related,” says Theresia Rampisela. “Fairness, accountability, transparency, ethics. They all have some kind of intersection between them. But in the case of harmful content, I believe this has more to do with safety than fairness.”

“There are different techniques to ensure safety of the recommendation platforms, but I think we need to care about not just safety but also other aspects of responsible AI because they’re all important and it’s a hard problem.”

Do you agree that recommender systems are a part of the reason why some people are getting radicalised online?

“I would not put the whole blame on recommender systems, because I think people also have a responsibility. We have a brain to think and act rationally and logically. What I think is important is to be aware that recommender systems can and do influence our beliefs, and sometimes they can be biased in serving recommendations,” says Theresia Rampisela.

Opening the black box

Theresia Rampisela shares an office at the Department of Computer Science with Sara Marjanovic – another PhD researcher and contributor to the ADD project, who explores the question of trust from a different angle: when can you trust what an AI tells you?

Sara Marjanovic researches explainable AI at the University of Copenhagen’s Department of Computer Science.

“In the past, we used to view these large language models as black boxes that we could never interpret. You put in a bunch of training data, some magic happens, and all of a sudden you have an answer which is often pretty accurate. But why we are getting this response, especially if the response is incorrect, was once thought to be something that people couldn’t fully understand,” she says.

“As users, we need to be a bit more cautious about the output that we get” — Sara Marjanovic

But with the right models, Sara Marjanovic’s work shows it is possible to begin to pry open the black box. In a recent paper, she explores what happens when large language models try to answer questions with a high degree of uncertainty or “noise”. 

“I wanted to see if a model could give meaningful answers even when it was uncertain,” Sara Marjanovic explains. To do that, she and her co-authors purposely made the task harder for the model by misspelling or omitting words:

“We found that sometimes a model will be very uncertain, but it will still find salient information, which means that it can still be used in human-AI collaboration. Other times we can see that even when we’ve made this task super difficult, and the model is performing poorly, the output uncertainty value doesn’t necessarily indicate this,” she adds.
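The setup is easier to picture with a rough sketch. The Python snippet below is purely illustrative and does not reproduce the study’s models or data: it corrupts a question by randomly dropping words or swapping letters, with a placeholder ask_model function standing in for whatever language model and uncertainty estimate are actually being evaluated.

```python
# Illustrative sketch of this kind of stress test (not the study's code):
# deliberately corrupt a question, then compare the model's reported
# uncertainty on the clean and noisy versions.
import random

random.seed(0)

def corrupt(question, drop_prob=0.2, typo_prob=0.2):
    """Randomly omit words or swap adjacent letters to simulate noisy input."""
    words = []
    for word in question.split():
        if random.random() < drop_prob:
            continue  # omit the word entirely
        if len(word) > 3 and random.random() < typo_prob:
            i = random.randrange(len(word) - 1)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]  # swap two adjacent letters
        words.append(word)
    return " ".join(words)

def ask_model(question):
    """Placeholder: a real experiment would call a language model here and return
    its answer together with some uncertainty estimate (e.g. derived from token
    probabilities)."""
    return {"answer": "(not implemented)", "uncertainty": None}

clean = "Which Danish physicist proposed a model of the atom in 1913?"
noisy = corrupt(clean)
print("clean:", clean)
print("noisy:", noisy)
# The interesting comparison is whether the model's reported uncertainty on the
# noisy question actually rises when its answer quality drops - the study
# suggests it often does not.
```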

That might not matter too much if you are asking an AI for directions, but it matters a whole lot if you are a doctor using an AI to help diagnose a patient. But because people like confident answers, the models can tend to downplay their own uncertainty.

“So as a user, we need to be a bit more cautious about the output that we get. And as people researching large language models, we also need to find ways to reduce this overconfidence or at least reflect confidence accurately,” Sara Marjanovic says.