Google’s Hate Speech-Detecting AI is Biased Against Black People

Self-Defeating

Artificial intelligence algorithms meant to detect and moderate hate speech online, including the Perspective algorithm built by Google, have built-in biases against black people.

Scientists from the University of Washington found alarming anti-black bias in AI tools that are supposed to protect people from online abuse, according to New Scientist, demonstrating how a well-intentioned attempt to make the internet safer could end up discriminating against already-marginalized communities.

Built-In Bias

The scientists examined how humans had annotated a database of more than 100,000 tweets used to train anti-hate-speech algorithms, according to not-yet-published research. They found that the people responsible for labeling whether or not a tweet was toxic tended to flag tweets written in African-American Vernacular English (AAVE) as offensive, a bias that then propagated into the algorithms themselves.

The team confirmed that bias by training several AI systems on the database, finding that the algorithms associated AAVE with hate speech.

Downstream Effects

The team then tested algorithms, including Perspective, on a database of 5.4 million tweets whose authors had disclosed their race. The algorithms were one-and-a-half to two times as likely to flag posts by users who identified as African-American as toxic, New Scientist reports.
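For a sense of how such a system gets used in practice, here is a minimal Python sketch of scoring a piece of text with Perspective's public Comment Analyzer endpoint. The endpoint and request shape follow Google's published API documentation; the API key placeholder and the 0.8 flagging threshold are illustrative assumptions, not details from the research described here.

```python
import requests

# Placeholder: the real Perspective API requires a key provisioned
# through Google Cloud.
API_KEY = "YOUR_API_KEY"
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def toxicity_score(text: str) -> float:
    """Return Perspective's TOXICITY score (0.0 to 1.0) for a text."""
    payload = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(URL, json=payload, timeout=10)
    response.raise_for_status()
    data = response.json()
    return data["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# A moderation pipeline might flag anything above a fixed cutoff.
# The 0.8 threshold is illustrative, not the researchers' setting.
if toxicity_score("example tweet text") > 0.8:
    print("flagged as toxic")
```

A fixed threshold like this is exactly where the reported bias bites: if tweets written in AAVE systematically receive higher scores, posts in that dialect cross the cutoff, and get flagged, more often.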

That means automated content moderation tools would likely take down many benign posts based on the ethnicity of their posters, silencing and suppressing certain communities online.
