University of Southern California: Busting Anti-Queer Bias in Text Prediction; New research at USC demonstrates how to train a popular language model to remove homophobic, anti-queer bias.

ENPNewswire - August 12, 2022


Release date: August 11, 2022 - Lillian Goodwin. Modern text prediction is far from perfect - take, for instance, when a search query suggests something completely different from your intention. But the trouble doesn't end at inaccuracy. Text prediction can also be exclusionary or biased when it comes to predicting results related to marginalized communities.

A team of researchers from the USC Viterbi School of Engineering's Information Sciences Institute and the USC Annenberg School for Communication and Journalism, led by Katy Felkner, a Ph.D. student in computer science at USC Viterbi and a National Science Foundation Graduate Research Fellowship recipient, has developed a system to quantify and fix anti-queer bias in the artificial intelligence behind text prediction.

The project, presented by Felkner at the Queer in AI workshop at the North American Chapter of the Association for Computational Linguistics (NAACL) conference in July, looks at both detecting and reducing anti-queer bias in a large language model, which is used in everything from search bars to language translation systems.

The large language model, or LLM, is the 'brain' behind the text prediction that pops up when we type something in a search bar: an artificial intelligence that 'completes' sentences by predicting the most likely string of words that follows a given prompt.
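
To make that concrete, here is a minimal, generic sketch of text prediction using the open-source Hugging Face transformers library and publicly available models (GPT-2 for completion, BERT for fill-in-the-blank). It illustrates how LLM-based prediction works in general; it is not the USC team's code.

    # Generic text-prediction sketch with off-the-shelf models from the
    # Hugging Face "transformers" library (illustrative, not the researchers' code).
    from transformers import pipeline

    # A generative model continues a prompt with the words it finds most likely.
    generator = pipeline("text-generation", model="gpt2")
    print(generator("The weather today is", max_new_tokens=10)[0]["generated_text"])

    # A masked model such as BERT instead fills in a blank inside a sentence.
    fill = pipeline("fill-mask", model="bert-base-uncased")
    for guess in fill("Paris is the capital of [MASK]."):
        print(guess["token_str"], round(guess["score"], 3))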

However, LLMs must first be 'trained' by being fed millions of examples of pre-written content so that they can learn what sentences typically look like. Like an energetic toddler, the LLM repeats what it hears, and what it hears can be heteronormative or even overtly discriminatory.

'Most LLMs are trained on huge amounts of data that's crawled from the internet,' Felkner said. 'They're going to pick up every kind of social bias that you can imagine is out there on the web.'

Few words, big effect

The project found that a popular LLM called BERT showed significant homophobic bias. This bias is measured through Felkner's benchmark, which compares how likely the LLM is to predict heteronormative sentences versus sentences that include a queer relationship.
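
The article doesn't spell out how each sentence is scored, but a common approach for masked models like BERT is pseudo-log-likelihood: mask each word in turn and add up how probable the model finds the original word in that slot. The sketch below, which assumes that approach, the Hugging Face transformers library, and an illustrative sentence pair, shows the general idea rather than Felkner's actual benchmark code.

    # Hedged sketch: compare BERT's pseudo-log-likelihood for a heteronormative
    # sentence and a queer counterpart (illustrative only, not the paper's code).
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    def pseudo_log_likelihood(sentence):
        # Mask one token at a time and sum the log-probability BERT assigns
        # to the original token in that position.
        ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
        total = 0.0
        for i in range(1, len(ids) - 1):  # skip the [CLS] and [SEP] tokens
            masked = ids.clone()
            masked[i] = tokenizer.mask_token_id
            with torch.no_grad():
                logits = model(masked.unsqueeze(0)).logits[0, i]
            total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
        return total

    # Illustrative sentence pair (not taken from the paper's benchmark data).
    hetero = pseudo_log_likelihood("He met his girlfriend for dinner.")
    queer = pseudo_log_likelihood("He met his boyfriend for dinner.")
    print(f"heteronormative: {hetero:.2f}  queer: {queer:.2f}  gap: {hetero - queer:.2f}")

A large gap in favor of the heteronormative sentence would indicate the kind of bias such a benchmark is designed to surface.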

'A heteronormative output is something like 'James held hands with Mary,' versus...
