
AI saving humans from the emotional toll of monitoring hate speech

Researchers use machine learning to identify hate speech with 88 per cent accuracy

Date:
May 30, 2024
Source:
University of Waterloo
Summary:
A team of researchers has developed a new machine-learning method that detects hate speech on social media platforms with 88 per cent accuracy, sparing employees hundreds of hours of emotionally damaging work.

A team of researchers at the University of Waterloo has developed a new machine-learning method that detects hate speech on social media platforms with 88 per cent accuracy, sparing employees hundreds of hours of emotionally damaging work.

The method, dubbed the Multi-Modal Discussion Transformer (mDT), can understand the relationship between text and images and situate comments within the broader context of a discussion, unlike previous hate speech detection methods. This is particularly helpful in reducing false positives: comments that are often incorrectly flagged as hate speech because they use culturally sensitive language.
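The published mDT architecture combines text, image and graph transformers; as a rough illustration only, the sketch below (assuming PyTorch, with made-up feature sizes, and not the authors' implementation) shows one simple way per-comment text and image embeddings could be fused and run through a shared transformer so every comment in a thread is classified in context:

    import torch
    import torch.nn as nn

    class DiscussionFusionSketch(nn.Module):
        """Illustrative sketch, not the authors' mDT code: fuse per-comment
        text and image embeddings, then let a transformer attend across all
        comments in a thread so each one is classified in context."""

        def __init__(self, dim=256):
            super().__init__()
            self.text_proj = nn.Linear(768, dim)   # e.g. BERT-sized text features
            self.image_proj = nn.Linear(512, dim)  # e.g. CLIP-sized image features
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            self.thread_encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.classifier = nn.Linear(dim, 2)    # hateful vs. not hateful

        def forward(self, text_feats, image_feats):
            # text_feats: (num_comments, 768), image_feats: (num_comments, 512)
            tokens = self.text_proj(text_feats) + self.image_proj(image_feats)
            # One sequence per discussion, so attention carries the context.
            contextual = self.thread_encoder(tokens.unsqueeze(0)).squeeze(0)
            return self.classifier(contextual)     # one logit pair per comment

    model = DiscussionFusionSketch()
    print(model(torch.randn(5, 768), torch.randn(5, 512)).shape)  # -> (5, 2)

The real mDT additionally reasons over the reply graph itself; this flat-sequence version only gestures at the fusion idea.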

"We really hope this technology can help reduce the emotional cost of having humans sift through hate speech manually," said Liam Hebert, a Waterloo computer science PhD student and the first author of the study. "We believe that by taking a community-centred approach in our applications of AI, we can help create safer online spaces for all."

Researchers have been building models to analyze the meaning of human conversations for many years, but these models have historically struggled to understand nuanced conversations or contextual statements. Previous models have identified hate speech with at most 74 per cent accuracy, well below what the Waterloo research was able to accomplish.

"Context is very important when understanding hate speech," Hebert said. "For example, the comment 'That's gross!' might be innocuous by itself, but its meaning changes dramatically if it's in response to a photo of pizza with pineapple versus a person from a marginalized group.

"Understanding that distinction is easy for humans, but training a model to understand the contextual connections in a discussion, including considering the images and other multimedia elements within them, is actually a very hard problem."

Unlike previous efforts, the Waterloo team built and trained their model on a dataset consisting not only of isolated hateful comments but also the context for those comments. The model was trained on 8,266 Reddit discussions with 18,359 labelled comments from 850 communities.
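One plausible way to represent such context-preserving data is a labelled reply tree per discussion. The record layout below is hypothetical (the paper's actual schema may differ) and just illustrates the shape of the training data described above:

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Comment:
        """Hypothetical record for one labelled comment in a discussion."""
        text: str
        is_hateful: bool                  # human-provided label
        image_url: Optional[str] = None   # comments may carry images
        replies: list = field(default_factory=list)

    def count_labelled(node):
        """Walk the reply tree, counting every labelled comment."""
        return 1 + sum(count_labelled(r) for r in node.replies)

    thread = Comment("Original post", False, replies=[
        Comment("A benign reply", False),
        Comment("A flagged reply", True),
    ])
    print(count_labelled(thread))  # -> 3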

"More than three billion people use social media every day," Hebert said. "The impact of these social media platforms has reached unprecedented levels. There's a huge need to detect hate speech on a large scale to build spaces where everyone is respected and safe."

The research, "Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media," was recently published in the Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence.


Story Source:

Materials provided by University of Waterloo. Note: Content may be edited for style and length.


Journal Reference:

  1. Liam Hebert, Gaurav Sahu, Yuxuan Guo, Nanda Kishore Sreenivas, Lukasz Golab, Robin Cohen. Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media. Proceedings of the AAAI Conference on Artificial Intelligence, 2024; 38 (20): 22096 DOI: 10.1609/aaai.v38i20.30213

