
What generative AI means for content moderation

Posted April 3, 2024

Content creation has been forever altered by the emergence of widely accessible generative AI (GenAI) technologies. Whether you're looking to create an image for your next LinkedIn post or write a bedtime story for your children, there's a long list of GenAI applications that can serve up what you need — move over, Cinderella.

On the one hand, this presents a positive leap forward. Companies and end-users alike can leverage generative AI to create convincing content across modes with relative ease. One clear use case is in marketing, and a Funnel survey found that 42% of marketers are already using AI for content creation. Further, research from Salesforce indicates that 71% of marketers expect GenAI to eliminate busy work, freeing them up to focus more on strategic initiatives.

But on the other hand, the ubiquity of generative AI means that bad actors have access to it too. With nefarious intent, GenAI tools can be used to generate content that is misleading, or that violates community guidelines and any number of local and international laws, including those governing intellectual property. And while this type of violative content predates the emergence of GenAI, the technology means it can be created more convincingly and with less effort. The threat exists anywhere users can post, reply, review or create a profile, from social media platforms to ecommerce sites and beyond.

The situation is further complicated by another problem: There isn't an agreed-upon, reliable way to automatically label content that has been AI-generated. Without the ability to make that distinction automatically and, for example, block all AI-generated content on a review site, brands face the difficult task of moderating a greater volume of content to maintain authenticity in online spaces.

In light of this new reality, brands have little choice but to adapt their content moderation strategies. In the age of generative AI, a holistic approach to content moderation that brings together the best in humans, processes and technology is imperative.


The future of content moderation: Strategies and tools for 2024 and beyond

Organizations have a responsibility to create and maintain digital experiences that are truthful, welcoming and safe. Learn how brands can keep pace with the growing levels of user-generated content (UGC) to build trust with their customers and protect their reputation.

Access the guide

New content moderation challenges brought on by generative AI

Generative AI can be used by bad actors to create more violative content at a higher fidelity.

More volume, in more places, in more forms. The outputs of AI-generated content could be text, images, audio and video, across countless languages and on any number of channels. Considering the relative ease with which this content can be generated, there's a complex volume challenge facing brands today.

The potentially convincing nature of this deluge of AI-generated content could also blur the lines between information, misinformation and disinformation. With the possibility for extremely convincing deepfakes, the threat of deception is considerable. Freedom House, a human rights advocacy group, reported that generative AI was used "in at least 16 countries to sow doubt, smear opponents or influence public debate." Outside of the political context, the convincing nature of AI-generated content can also be leveraged by bad actors to perpetuate fraud.

Difficult though it may be, adapting to this new reality is an absolute necessity. After all, there are strict content moderation regulations regarding the amount of time certain violative content can remain online — and the potential for hefty fines for non-compliance. With a greater volume of content to moderate, brands must ensure that they have systems and strategies in place to remain efficient at scale.

Take a holistic approach to moderating AI-generated content

To counter the threat of bad actors using generative AI, brands need to take and hone a holistic approach to content moderation. Such an approach leverages the latest technologies and best practices while embracing the critical role humans still play in an effective trust and safety operation.

Apply generative AI for content moderation

Machine learning models used to detect and address violative content have been around for some time. Brands have deployed algorithms to moderate large volumes of user-generated content (UGC), counting on their ability to handle clear-cut cases and thereby reserving human moderators for edge cases. While generative AI hasn't always been behind these defense mechanisms, its availability could make these defenses considerably more powerful.
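
As a rough illustration of that division of labor, the sketch below routes content based on a model's confidence score: clear-cut cases are handled automatically, while ambiguous ones are escalated to human moderators. The classifier, thresholds and labels here are placeholder assumptions, not any particular platform's implementation.

```python
# Minimal sketch of routing user-generated content between automated
# moderation and human review based on model confidence.
from dataclasses import dataclass

@dataclass
class ModerationResult:
    content_id: str
    decision: str        # "approve", "remove" or "escalate"
    violation_score: float

def classify(text: str) -> float:
    """Stand-in for a trained moderation model returning a violation
    probability between 0 and 1 (hypothetical keyword check only)."""
    flagged_terms = {"scam", "fake giveaway"}
    return 0.9 if any(term in text.lower() for term in flagged_terms) else 0.1

def moderate(content_id: str, text: str,
             remove_above: float = 0.85,
             approve_below: float = 0.15) -> ModerationResult:
    score = classify(text)
    if score >= remove_above:        # clear-cut violation: handle automatically
        decision = "remove"
    elif score <= approve_below:     # clearly benign: publish without review
        decision = "approve"
    else:                            # ambiguous: route to a human moderator
        decision = "escalate"
    return ModerationResult(content_id, decision, score)

print(moderate("post-001", "Win a free phone in this fake giveaway!"))
print(moderate("post-002", "Loved the quick delivery, would buy again."))
```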

For trust and safety teams, GenAI has immense potential in training the algorithms that are used in content moderation. To be effective, such a model must be trained on diverse, deep and broad datasets. With generative AI, teams can produce rich synthetic data to add to their existing training datasets and use the enhanced body of data to train and stress test their automated defenses. The availability of synthetic data directly addresses the scarcity of real-world examples for rare or emerging violation types and facilitates a truly continuous training process.

For example, if your team is noticing trends in the violative content that is evading algorithmic detection, such as images that depict offensive terms in other languages, there may be an opportunity to further train your model with synthetically generated images. And to take things a step further, if users on your site or platform are frequently flagging content for moderation, generative AI could even be used to identify trends from reporting data and communicate them to your team using natural language.
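
To make the synthetic data idea concrete, here is a simplified sketch of augmenting a moderation training set with generated examples. The generate_synthetic_examples function is a stub standing in for a call to a generative model, and the labels and prompt are purely illustrative.

```python
# Simplified sketch: augment a moderation training set with synthetic examples
# targeting a gap the team has observed, then mix them with real data before
# retraining or stress-testing the classifier.
import random

def generate_synthetic_examples(prompt: str, label: str, n: int) -> list[dict]:
    """Stub for a generative model call that would produce n new labeled
    examples matching the prompt."""
    return [{"text": f"{prompt} (synthetic sample {i})", "label": label}
            for i in range(n)]

# Existing, human-labeled training data (illustrative).
training_data = [
    {"text": "Great product, fast shipping", "label": "benign"},
    {"text": "Buy followers here, limited offer", "label": "spam"},
]

# Target a gap: spam phrased to evade the current filters.
synthetic = generate_synthetic_examples(
    prompt="Promotional spam phrased to evade keyword filters",
    label="spam",
    n=3,
)

augmented = training_data + synthetic
random.shuffle(augmented)
print(f"{len(training_data)} real + {len(synthetic)} synthetic = {len(augmented)} examples")
```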

Beyond the training of your core defense algorithms, generative AI can also help to improve content moderation policies. OpenAI, for example, demonstrated how "a content moderation system using GPT-4 results in much faster iteration on policy changes, reducing the cycle from months to hours." Essentially, teams can leverage the large language model to apply their content moderation policies, see where its judgment differs from that of human experts, analyze the disagreements and use this information to refine policies for clarity. When paired with human expertise, GenAI's potential for policy development is significant and timely, especially considering that the emergence of generative AI has shone a light on the need for new policies related to privacy, consent, intellectual property and responsible AI.
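
The disagreement-analysis loop described above can be sketched in a few lines. In this hypothetical example, llm_apply_policy is a stub standing in for prompting a large language model with the draft policy; the golden set, labels and policy text are invented for illustration.

```python
# Sketch of the policy-iteration loop: apply a draft policy with an LLM,
# compare its judgments to human expert labels, and surface disagreements
# that point to ambiguous policy wording.

def llm_apply_policy(policy: str, text: str) -> str:
    """Stub returning the model's verdict ("violates" or "allowed")."""
    return "violates" if "miracle cure" in text.lower() else "allowed"

policy = "Remove health claims that promise guaranteed results."

# Small golden set labeled by human policy experts (illustrative).
golden_set = [
    {"text": "This miracle cure ends back pain forever", "human": "violates"},
    {"text": "Guaranteed to cure your migraines in one week", "human": "violates"},
    {"text": "Our supplement may support joint health", "human": "allowed"},
]

disagreements = [
    item for item in golden_set
    if llm_apply_policy(policy, item["text"]) != item["human"]
]

# Each disagreement is a candidate spot where the policy wording is unclear
# and should be refined before the next iteration.
for item in disagreements:
    print("Review policy wording for:", item["text"])
```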

Building trust in generative AI

Brands are eager to reap the benefits of generative AI (GenAI) while limiting potential risks. Join Steve Nemzer, director of AI growth and innovation for TELUS Digital (formerly TELUS International), as he shares best practices for leveraging GenAI without compromising your organization’s goodwill.

Watch the video

Count on human moderators for contextual understanding and more

Human moderators — or as we refer to them, digital first responders — still have an important part to play in a proficient trust and safety operation.

Though AI content moderation is continuously improving in its capacity to interpret context-specific information, cultural references and subtle nuance, there is likely always going to be content that warrants moderation, but that nonetheless evades algorithmic detection. Digital first responders are an additional line of defense, capable of catching what slips past automated measures and picking up on any emerging trends or new slang and algospeak that might not be reflected in your training data. Further still, human moderators are capable of grasping nuances in international and local regulations, whether they are stipulated by online platforms or governments, and applying that knowledge to keep your brand and users or customers safe.

There's a need for humans beyond the act of moderation, too. In terms of training content moderation algorithms, it is humans who must perform critical model validation and tuning, for example through reinforcement learning from human feedback (RLHF). By reviewing the decisions an algorithm makes, human beings can essentially let the algorithm know when it has acted correctly or incorrectly. This critical training sets AI content moderation models up for success out in the real, digital world. "Computational linguists, data scrapers, and data decontaminators ensure AI systems are built on a strong foundation through tasks such as labeling data for sentiment analysis, named entity recognition, parts-of-speech tagging, Personally Identifiable Information (or PII) scrubbing, and identification of deepfake or otherwise Not Safe For Work content," explains Jeff Puritt, president and chief executive officer at TELUS Digital, in a Fast Company article.
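
One simplified way to picture that feedback loop: log each automated decision alongside the human moderator's verdict, then treat disagreements as training signal for the next tuning cycle. The record structure and field names below are assumptions made for illustration, not a description of any specific RLHF pipeline.

```python
# Sketch of capturing human moderator feedback on automated decisions so it
# can be fed back into model tuning (for example as labels for supervised
# fine-tuning or as reward signals in an RLHF-style setup).
from dataclasses import dataclass

@dataclass
class FeedbackRecord:
    content_id: str
    model_decision: str    # what the algorithm decided
    human_decision: str    # what the digital first responder decided
    agreed: bool

feedback_log: list[FeedbackRecord] = []

def record_review(content_id: str, model_decision: str, human_decision: str) -> None:
    feedback_log.append(FeedbackRecord(
        content_id=content_id,
        model_decision=model_decision,
        human_decision=human_decision,
        agreed=(model_decision == human_decision),
    ))

record_review("post-101", model_decision="remove", human_decision="remove")
record_review("post-102", model_decision="approve", human_decision="remove")

# Disagreements become training signal for the next tuning cycle.
corrections = [r for r in feedback_log if not r.agreed]
print(f"{len(corrections)} of {len(feedback_log)} decisions need relabeling")
```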

Plus, for brands looking to apply generative AI in their content moderation operations, human expertise and partnerships increase in significance. With a trusted and experienced outsourcing provider like TELUS Digital, brands can strategize human-to-human to get the very best out of the technology at their disposal.

Modernize your trust and safety operation

Content moderation has always been complex — rife with edge cases, a shifting regulatory landscape and high expectations for safety. Generative AI has complicated matters further, giving everyone the power to create convincing content with a few clicks and keystrokes.

To maintain welcoming online experiences for your existing and potential customers, the present and future of content moderation is about adaptation. Whether you're trying to scale your content moderation team, sharpen your algorithms, modernize your strategy, or all of that and more, there is great value to be gained from a partner with expertise that spans artificial intelligence and trust and safety. Get in touch with our experts to discuss your trust and safety needs.


