A Twitter image-cropping algorithm that went viral last year when users discovered it preferred white people to Black people was also coded with implicit bias against a number of other groups, researchers have found.
Researchers examined how the algorithm, which automatically edited photos to focus on people’s faces, handled a wide range of people and saw evidence that Muslims, people with disabilities and older people faced similar discrimination. The same artificial intelligence had learned to ignore people who had white or gray hair, who wore headscarves for religious or other reasons, or who used wheelchairs, researchers said.
The findings were part of a first-of-its-kind contest hosted by Twitter over the weekend at the Def Con hacker conference in Las Vegas. The company invited researchers to unearth new ways to prove that an image-cropping algorithm was inadvertently coded to be biased against particular groups of people. Twitter gave out cash prizes, including $3,500 to the winner and smaller amounts to runners-up.
It’s a unique step in what has become an important field of research: finding ways that automated systems trained on existing data have become imbued with society’s existing biases.
The image-cropping algorithm, which has since largely been decommissioned by Twitter, went viral last September. Users noticed that when they tweeted photos that included both a white person and a Black person, the algorithm would almost always automatically highlight the white person. With photos that included both former President Barack Obama and Sen. Mitch McConnell, R-Ky., for instance, Twitter would invariably choose to crop the photo to only show McConnell. The company apologized at the time and took the algorithm offline.
But that unintended racism wasn’t unusual, said Parham Aarabi, a professor at the University of Toronto and director of its Applied AI Group, which studies and consults on biases in artificial intelligence. Programs that learn from users’ behavior almost invariably introduce some kind of unintended bias, he said.
“Almost every major AI system we’ve tested for major tech companies, we find significant biases,” he said. “Biased AI is one of those things that no one’s really fully tested, and when you do, just like Twitter’s finding, you’ll find major biases exist across the board.”
While the photo-cropping incident was embarrassing for Twitter, it highlighted a broad problem across the tech industry. When companies train artificial intelligence to learn from their users’ behavior, such as which kinds of photos most users are likely to click on, the AI systems can internalize prejudices that the companies would never intentionally write into a program.
The Aarabi team’s submission to Twitter’s contest, which won second place, found that the algorithm was severely biased against individuals with white or gray hair. The more a person’s hair was artificially lightened, the less likely the algorithm was to choose them.
“The most common effect it has is it crops out people who are older,” Aarabi said. “The older demographic is predominantly marginalized with the algorithm.”
Contestants found a host of ways in which the cropping algorithm could be biased. Aarabi’s team also found that in group photos in which one person sat lower in the frame because they used a wheelchair, Twitter was likely to crop the wheelchair user out of the photo.
Another submission found that Twitter was more likely to ignore people who wore head coverings, effectively cutting out people who wore coverings such as a Muslim hijab for religious reasons.
The winning entry found that the more a person’s face was photoshopped to look slimmer, younger or softer, the more likely Twitter’s algorithm was to highlight them.
Rumman Chowdhury, director of the Twitter team devoted to machine learning ethics, which ran the contest, said she hoped that more tech companies would enlist outside help to identify algorithmic bias.
“Identifying all the ways in which an algorithm might go wrong when released to the public is really a daunting task for one team, and frankly probably not feasible,” Chowdhury said.
“We want to set a precedent at Twitter and in the industry for proactive and collective identification of algorithmic harms,” she said.