Perhaps because Friday is the UN World Day for Cultural Diversity for Dialogue and Development, Twitter has revealed some news that surely shocked no one: one of its AI tools, which it has now discontinued, is biased against Black people.

More specifically, researchers have determined that its image-cropping algorithm tool was “fundamentally flawed” after finding that it tended to cut out Black people and men from pictures posted on the platform. Now Twitter has scrapped the tool just ahead of World Diversity Day, saying that “how to crop an image is a decision best made by people.”

Twitter introduced its so-called saliency algorithm in 2018. The algorithm was trained on human eye-tracking data to predict a saliency score for every region of an image. The region with the highest score became the centre of the crop when the picture appeared in people’s feeds.
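To make the mechanism concrete, here is a minimal sketch of how a saliency-guided crop could work in principle. Twitter has not published the internals of its model, so the saliency map is assumed to be given, and the function below is illustrative rather than a reconstruction of the actual system.

import numpy as np

def saliency_crop(image, saliency, crop_h, crop_w):
    """Crop `image` around the most salient point.

    `image` is an H x W x C array; `saliency` is an H x W array of
    per-region scores (in Twitter's case, predicted from eye-tracking data).
    """
    h, w = saliency.shape
    # The location with the highest predicted saliency becomes the crop centre.
    cy, cx = np.unravel_index(np.argmax(saliency), saliency.shape)
    # Clamp the crop window so it stays inside the image bounds.
    top = min(max(cy - crop_h // 2, 0), h - crop_h)
    left = min(max(cx - crop_w // 2, 0), w - crop_w)
    return image[top:top + crop_h, left:left + crop_w]

# Toy example: a 100x200 grey image whose saliency peaks on the right-hand
# side, so the crop is pulled towards that side.
img = np.full((100, 200, 3), 128, dtype=np.uint8)
sal = np.zeros((100, 200))
sal[40, 150] = 1.0
print(saliency_crop(img, sal, crop_h=80, crop_w=80).shape)  # (80, 80, 3)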

While the tool had been tested for bias before it was unleashed into the wild, netizens soon noticed that something was a bit off.

In September 2020, several Twitter users noticed that the algorithm repeatedly cropped out Black people when a white person appeared in the same picture. In one telling example, a user uploaded a picture containing both former President Barack Obama and Republican leader Mitch McConnell. The AI repeatedly chose to focus on McConnell.

Subsequent experiments with Twitter’s image-cropping tool included other Black and white individuals, as well as different-coloured animals.

When the same experiment was run with the famous Simpsons characters Carl, who is Black, and Lenny, who is white (or yellow), and Lenny was again picked over Carl, one user succinctly summarised the general mood by tweeting: “FFS.”

Following the very public backlash, Twitter apologised for the results and promised to investigate further. Fast-forward seven months, and Twitter has now fulfilled that pledge.

Three members of Twitter’s Machine Learning Ethics, Transparency and Accountability (META) team conducted a research experiment on the cropping tool. Their aim was to determine whether white faces were picked over Black ones, whether the AI exhibited a “male gaze” form of objectification by focusing on women’s chests or legs, and whether the algorithm prevented people from expressing themselves the way they wanted to.

The Twitter researchers found a 4% difference in favour of White individuals over Black people as the focus of the image. An 8% difference in favour of women over men was also noted.

When comparing specifically between White and Black women, the researchers noted a 7% difference in favour of White women.

Running the same test for men, the researchers noted a 2% difference in favour of White men over Black men.
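These figures can be read as gaps in how often each subgroup is chosen as the crop focus from otherwise comparable image pairs. A rough, hypothetical illustration of that kind of calculation follows; the counts below are placeholders, not figures from Twitter’s study.

def parity_difference(chosen_a, chosen_b):
    """Signed percentage-point gap in selection rate in favour of group A."""
    total = chosen_a + chosen_b
    rate_a = chosen_a / total
    rate_b = chosen_b / total
    return 100 * (rate_a - rate_b)

# Hypothetical example: group A chosen 52 times, group B 48 times out of
# 100 paired comparisons -> a 4-point gap in favour of group A.
print(f"{parity_difference(52, 48):+.0f}%")  # +4%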

The team also found that only about three in every 100 pictures were cropped at a location other than a head. In those cases the crop usually centred on a non-physical feature, such as the number on a sports jersey.

Small as these differences may seem, the researchers were left uncomfortable with the results.

“Even if the saliency algorithm were adjusted to reflect perfect equality across race and gender subgroups, we’re concerned by the representational harm of the automated algorithm when people aren’t allowed to represent themselves as they wish on the platform,” Rumman Chowdhury, director of software engineering at Twitter, wrote in a blog post.

“Saliency also holds other potential harms beyond the scope of this analysis, including insensitivities to cultural nuances.”

In the report, the researchers also noted that the AI removed netizens’ agency over how they wished to express themselves.

“Machine learning based cropping is fundamentally flawed because it removes user agency and restricts user’s expression of their own identity and values, instead imposing a normative gaze about which part of the image is considered the most interesting,” the researchers wrote.

“Twitter is an example of an environment where the risk of representational harm is high since Twitter is used to discuss social issues and sensitive subject matter. In addition, Tweets can be potentially viewed by millions of people, meaning mistakes can have far reaching impact.”

Twitter has recently rolled out a new feature in which pictures are displayed with a standard aspect ratio rather than being cropped by the algorithm. The update also lets Twitter users preview how a picture will look before tweeting it. Chowdhury said the company is working on further features to be launched in the future.

“One of our conclusions is that not everything on Twitter is a good candidate for an algorithm, and in this case, how to crop an image is a decision best made by people,” Chowdhury said.

Diversity Day: It’s been a long time coming

Twitter’s image-cropping tool is not the only instance of AI amplifying societal inequality.

A famous example was Microsoft’s AI-powered chatbot Tay, rolled out in 2016. Tay was trained on its interactions with other people on Twitter. Anyone familiar with the online cesspool of vitriol frequently unleashed on the platform might think this was not a great idea. They’d be right. It took less than 24 hours for Tay to start spewing racist and sexist remarks sprinkled with a shocking helping of Holocaust denial. Microsoft soon took the chatbot down.

Twitter’s image-cropping tool also cuts into a wider story about the failures of facial recognition software. When Apple unveiled the iPhone X in 2017, its Face ID feature was one of the key selling points. The tool was supposed to let users unlock the phone by having its front camera scan their faces.

However, several reports soon surfaced from China suggesting that the software failed to tell some Asian users apart. One report, for instance, claimed that a Chinese woman’s iPhone had been unlocked by two of her colleagues.

A 2019 study from the US National Institute of Standards and Technology found that facial recognition software from Idemia – which was used by law enforcement agencies in the US, France and Australia – was 10 times more likely to misidentify Black women than White women. Idemia claimed that the systems tested had not been released commercially.

Despite the likelihood of false positives, the Chinese government has reportedly been happy to use the technology in its crackdown on the Uighurs, a largely Muslim minority. The technology was integrated into Beijing’s vast network of surveillance cameras and trained to look exclusively for Uighurs based on their appearance. Once identified, the system kept a record of their movements. Millions of Uighurs have been put into detention camps.

“The practice makes China a pioneer in applying next-generation technology to watch its people, potentially ushering in a new era of automated racism,” Paul Mozur, technology correspondent at The New York Times, wrote.

In summary, it’s likely that this won’t be the last World Day for Cultural Diversity for Dialogue and Development where tech is found to have amplified racial inequality.