
The increasing sophistication of AI in content moderation

Posted October 26, 2022 - Updated January 12, 2024

Research from the Consumer Technology Association shows that user-generated content (UGC) now accounts for 39% of the time Americans spend consuming media every week, compared with 61% for traditional media. Whether they’re watching TikTok videos, following gamers on Twitch or leaving product reviews, consumers are spending more time than ever engaging with UGC.

According to cloud software company DOMO, in every minute of 2022, users shared 66,000 photos on Instagram, posted 1.7 million pieces of content to Facebook, uploaded 500 hours of video to YouTube and sent 2.43 million Snapchats. Statista estimated that the total amount of data created, captured, copied and consumed globally in 2022 reached 97 zettabytes. To put this into context, one zettabyte is equal to one trillion gigabytes, and that figure is projected to grow to over 180 zettabytes by 2025, with the amount of content posted to social media, community forums and other sites growing along with it. Adding to this complexity, the popularization of generative AI (GenAI) — a category of artificial intelligence that focuses on creating new and original content such as text, images, audio, video, code or synthetic data — is accelerating the pace of UGC even further.

What does all of that mean for brands? More UGC means there’s more data to monitor in order to ensure you aren’t inadvertently aligning yourself with inappropriate, violent or fake content. Additionally, the more UGC there is, the higher the chances are that bad actors will post something objectionable — making content moderation a colossal task that simply cannot be achieved efficiently without the support of artificial intelligence (AI).

The battle for control

The problem is that while AI is becoming more sophisticated, the individuals creating negative digital content are, too. “The challenge is that it’s an arms race,” Nigel Duffy, AI entrepreneur and global AI leader at Ernst & Young, explains. “The sophistication of the tools to generate content is competing against the capacity to moderate that content—and right now, the former is winning.”

Duffy predicts that we’re about to experience a “tsunami of content moderation.” As brands fight to keep up with those creating negative content online, AI will become even more essential to their success.

The benefits of AI in content moderation

While AI is helping to produce and scale content, it also plays a key role in moderating the growing amount of UGC online. In fact, Abhijnan Dasgupta, practice director, business process services at Everest Group, called out GenAI as the next frontier in advancing AI models for content moderation in The Evolved Trust and Safety Industry and What to Expect Next webinar.

Given the necessity and the sheer volume and variety of content, companies are already finding ways to capitalize on AI technology to protect their brands and maintain a positive customer experience. Speed and scalability are among the features that give AI so much appeal: brands can process large amounts of content in very little time, something that manual moderation by humans alone simply can't match. For example, Meta reported in its Q3 2023 Community Standards Enforcement Report that it relies on such tools – not user reports – to identify, on average, 96.4% of content the platform removes for violating its policies.

While images and text can carry material that's harmful to both consumers and moderators, tools like natural language processing (NLP), image processing algorithms, sentiment analysis and computer vision can all help brands defend against violence, harassment and more.
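
To make that concrete, here is a minimal sketch of how such text screening might be wired up. It uses an off-the-shelf sentiment model from the Hugging Face transformers library as a stand-in for the purpose-built toxicity and harassment classifiers a real trust and safety stack would use; the thresholds and the three-way triage are illustrative assumptions, not any particular vendor's pipeline.

```python
from transformers import pipeline

# Off-the-shelf sentiment model (defaults to a DistilBERT SST-2 checkpoint);
# a production system would swap in purpose-built toxicity/harassment models.
sentiment = pipeline("sentiment-analysis")

REVIEW_THRESHOLD = 0.80   # route borderline items to human moderators
BLOCK_THRESHOLD = 0.98    # auto-block only very high-confidence negatives


def triage_text(text: str) -> str:
    """Return 'block', 'review' or 'approve' for a piece of user-generated text."""
    result = sentiment(text)[0]  # e.g. {'label': 'NEGATIVE', 'score': 0.97}
    if result["label"] == "NEGATIVE":
        if result["score"] >= BLOCK_THRESHOLD:
            return "block"
        if result["score"] >= REVIEW_THRESHOLD:
            return "review"
    return "approve"


if __name__ == "__main__":
    for comment in ["Great tutorial, thanks!", "You are worthless and should quit."]:
        print(comment, "->", triage_text(comment))
```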

Multimodal content is particularly challenging to moderate because the different mediums, like text, image, or video, cannot be considered separately; often the meaning of a particular piece of content is conveyed through the combination of its parts. GenAI has a unique capability to simultaneously analyze content incorporating diverse modalities, providing a comprehensive approach to content evaluation. Addressing the challenge of multilingual content moderation, GenAI also excels in translating and detecting harmful information in multiple languages. Moreover, advanced GenAI models can specifically detect deepfake content, serving as a crucial tool for platforms to identify manipulated media and prevent its dissemination.
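
As a rough illustration of that idea, the sketch below assumes hypothetical per-modality scoring functions (stand-ins for real text, image and vision-language models) and shows why a post has to be judged both on its parts and on their combination; none of the function names or thresholds come from a specific product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    text: str
    image_path: Optional[str] = None


def text_risk(text: str) -> float:
    """Placeholder for an NLP/toxicity model returning a 0-1 risk score."""
    return 0.1


def image_risk(image_path: str) -> float:
    """Placeholder for a computer-vision model returning a 0-1 risk score."""
    return 0.1


def joint_risk(text: str, image_path: str) -> float:
    """Placeholder for a multimodal (vision-language) model that scores the
    text and image together, catching posts where neither part is harmful on
    its own but the combination is."""
    return 0.1


def moderate_post(post: Post, threshold: float = 0.7) -> str:
    """Flag a post if any single modality, or the combination, looks risky."""
    scores = [text_risk(post.text)]
    if post.image_path:
        scores.append(image_risk(post.image_path))
        scores.append(joint_risk(post.text, post.image_path))
    return "review" if max(scores) >= threshold else "approve"
```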

Introducing a human touch

AI-enabled content moderation does have limitations. Brands can train AI systems to spot harmful images, but human oversight helps determine the context of potentially harmful text, and human-led data annotation ensures metadata tags are applied to the original dataset, adding a layer of richer information to support machine learning.
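
One way to picture this human-in-the-loop handoff is the sketch below: low-confidence model decisions are routed to a reviewer, and the reviewer's label is stored as an annotation that can be fed back into training. The data structures and the confidence threshold are assumptions for illustration, not a description of any particular platform's tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ModelDecision:
    content_id: str
    label: str          # e.g. "violence", "harassment", "benign"
    confidence: float   # 0-1


@dataclass
class Annotation:
    content_id: str
    reviewer_id: str
    label: str
    notes: str
    reviewed_at: str


def needs_human_review(decision: ModelDecision, threshold: float = 0.85) -> bool:
    """Route anything the model is unsure about to a human moderator."""
    return decision.confidence < threshold


def record_annotation(decision: ModelDecision, reviewer_id: str,
                      label: str, notes: str = "") -> Annotation:
    """Attach human-verified metadata to the item so it can enrich the training set."""
    return Annotation(
        content_id=decision.content_id,
        reviewer_id=reviewer_id,
        label=label,
        notes=notes,
        reviewed_at=datetime.now(timezone.utc).isoformat(),
    )
```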

“It’s useful to have humans-in-the-loop,” Duffy says. “If you think about graphic violence in a video game or a screen capture versus in real life, it can be relatively hard for AI to distinguish sometimes.” The same, he notes, is true of what people define as appropriate or inappropriate levels of violence. “There are boundaries,” Duffy continues, “and AI and humans need to work in tandem to identify them.”

Because it’s critical for businesses to protect the mental health and overall wellness of their customer care teams, they’re leveraging automated content filtering to identify banned behavior and content. Better tagging weeds out the most flagrant instances of violence and labels the content so that moderators have an idea of what they are about to view. Moreover, research shows that interactive image blurring can reduce the emotional impact and overall strain on content moderators “without sacrificing accuracy or speed.” Viewing content in grayscale can also have a positive effect on the moderators, while still enabling them to flag violent and extreme material.
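
For instance, the blurring and grayscale techniques described above can be approximated in a few lines with the Pillow imaging library; the blur radius and the order of the operations here are illustrative choices rather than values from the research.

```python
from PIL import Image, ImageFilter


def prepare_for_review(path: str, blur_radius: int = 12) -> Image.Image:
    """Return a blurred, grayscale copy of a flagged image so moderators can
    triage it with less direct exposure; the original file stays untouched."""
    img = Image.open(path)
    softened = img.filter(ImageFilter.GaussianBlur(radius=blur_radius))
    return softened.convert("L")   # "L" = 8-bit grayscale


# A moderator tool could then progressively reduce blur_radius on demand
# ("interactive blurring") only when a closer look is genuinely needed.
```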

Reducing human exposure to harmful content helps to mitigate the psychological, emotional and physical impact on moderators. This holds even for content delivered in real time: AI can moderate livestreams and automatically remove harmful material before it reaches viewers. Humans still play a key role, but AI should be the first line of defense.

Avoiding AI bias

Organizations should also strive to hire and maintain diverse and inclusive teams throughout the data collection and labeling process. Assembling a diverse team of data annotators and validators helps brands make the right judgment calls and reduces bias in AI. Because this process starts with people, homogeneous groups can easily let subtle, unconscious biases slip into the algorithms unless second opinions and rigorous testing are there to catch them. Diverse teams help reduce bias because they bring perspectives, backgrounds and experiences that might otherwise never have been considered or fed into the AI systems.

For example, a study found that AI models trained to detect hate speech online were 1.5 times more likely to flag tweets written by African-American users as offensive or hateful. This matters because the feedback human moderators provide is incorporated back into the AI training loop, which in turn helps brands train their AI systems responsibly. If the team isn't diverse, neither is the AI in its thinking.
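
One simple way to surface that kind of skew before it is baked into a model is to compare flag rates across annotator or author groups. The sketch below does exactly that with made-up group names and labels; it is a diagnostic idea implied by the discussion above, not a method taken from the study.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def flag_rates(labels: Iterable[Tuple[str, str]]) -> dict:
    """labels: (group, label) pairs, e.g. ("group_a", "offensive").
    Returns the share of items labeled "offensive" per group; a large gap
    between groups is a signal to audit guidelines before retraining."""
    totals, flagged = defaultdict(int), defaultdict(int)
    for group, label in labels:
        totals[group] += 1
        if label == "offensive":
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}


sample = [("group_a", "offensive"), ("group_a", "benign"),
          ("group_b", "offensive"), ("group_b", "offensive")]
print(flag_rates(sample))  # e.g. {'group_a': 0.5, 'group_b': 1.0}
```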

AI is only getting smarter, with technological capabilities that range from pattern recognition to reading language context. Regardless, humans will likely always be better at identifying the emotion and intent behind digital content. The solution? Humans and AI must work together to monitor content more effectively and create better and more inclusive experiences for all.

This article is part of a four-part series on content moderation. Check out our other articles on the evolving nature of digital content, wellness strategies for content moderators and content moderation regulations.

