Building reliable AI models requires understanding the people behind the datasets, UMSI researchers say
Social media companies are increasingly relying on complex algorithms and artificial intelligence to detect offensive behavior online. These algorithms and AI systems all rely on data to learn what is offensive. But who's behind the data, and how do their backgrounds influence their decisions?
In a recent study, University of Michigan School of Information assistant professor David Jurgens and PhD candidate Jiaxin Pei show the backgrounds of data annotators — the people labeling texts, videos and online media — matter a lot.
Through an analysis of 6,000 Reddit comments, the study shows that annotators' beliefs and decisions about politeness and offensiveness shape the machine learning models used to flag the online content we see each day. What one part of the population rates as polite, another can rate as much less polite.
“AI systems all use this kind of data and our study helps highlight the importance of knowing who is labeling the data,” says Pei. “When people from only one part of the population label the data, the resulting AI system may not represent the average viewpoint.”
“Annotators are not fungible,” Jurgens says. “Their demographics, life experiences and backgrounds all contribute to how they label data. Our study suggests that understanding the background of annotators and collecting labels from a demographically balanced pool of crowd workers is important to reduce the bias of datasets.”
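The "demographically balanced pool" Jurgens describes can be approximated by stratified sampling over annotator attributes: draw the same number of annotators from each group rather than taking whoever volunteers first. The sketch below is a minimal illustration of that idea, assuming hypothetical annotator records with a demographic field; it is not code from the study.

```python
import random
from collections import defaultdict

def stratified_sample(annotators, key, per_group, seed=0):
    """Draw the same number of annotators from each demographic group.

    annotators: list of dicts, each with a demographic attribute under `key`
    (the field name is illustrative, not from the study's data).
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for a in annotators:
        by_group[a[key]].append(a)
    sample = []
    for group, members in sorted(by_group.items()):
        if len(members) < per_group:
            raise ValueError(f"not enough annotators in group {group!r}")
        sample.extend(rng.sample(members, per_group))
    return sample
```

An unbalanced pool of, say, 80 annotators from one group and 10 from another would yield a dataset dominated by the majority group's judgments; sampling a fixed count per group sidesteps that.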
Through their research, Jurgens and Pei set out to better understand how annotators' identities and experiences shape their labeling decisions. Previous studies have typically examined only a single aspect of identity, such as gender. Their hope is to help AI systems better reflect the beliefs and opinions of all people.
The results of Jurgens and Pei’s research demonstrate the following:
- While some existing studies suggest that men and women rate toxic language differently, this research found no statistically significant difference between the two. However, participants with non-binary gender identities tended to rate messages as less offensive than those identifying as men or women.
- Participants older than 60 tended to assign higher offensiveness scores than middle-aged participants.
- The study found significant racial differences in offensiveness ratings: Black participants tended to rate the same comments as significantly more offensive than all other racial groups did. Consequently, classifiers trained on data annotated by white annotators may systematically underestimate how offensive a comment is for Black and Asian people.
- No significant differences were found with respect to annotator education.
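Group-level comparisons like the ones above boil down to aggregating each demographic group's ratings of the same content and looking at the differences. A minimal sketch of that aggregation, assuming hypothetical per-annotation records (the field names are illustrative, not POPQUORN's actual schema):

```python
from collections import defaultdict
from statistics import mean

def mean_rating_by_group(annotations, group_field, rating_field="offensiveness"):
    """Average ratings within each demographic group.

    annotations: list of dicts, each carrying a demographic attribute
    and a numeric rating (field names here are illustrative).
    Returns {group: mean rating}, making group-level gaps visible.
    """
    ratings = defaultdict(list)
    for a in annotations:
        ratings[a[group_field]].append(a[rating_field])
    return {group: mean(values) for group, values in ratings.items()}
```

A real analysis would then test whether the gaps between group means are statistically significant, as the study does, rather than reading them off directly.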
Using these results, Jurgens and Pei created POPQUORN, the Potato-Prolific dataset for Question Answering, Offensiveness, text Rewriting and politeness rating with demographic Nuance. The dataset offers social media and AI companies an opportunity to explore models that account for intersectional perspectives and beliefs.
“Systems like ChatGPT are increasingly used by people for everyday tasks,” Jurgens says. “But whose values are we instilling in the trained model? If we keep taking a representative sample without accounting for differences, we continue marginalizing certain groups of people.”
“POPQUORN helps ensure everyone has equitable systems that match their beliefs and backgrounds,” Pei notes.
David Jurgens is an expert in computational social science, sociolinguistics and natural language processing. Read more about his research in his UMSI faculty profile.