How Anthropic Is Teaching AI the Difference Between Right and Wrong

Written by Mike Kaput | May 16, 2023 12:09:54 PM

Anthropic, a major AI player, just published something that could have a big impact on AI safety.

In a recent post, Anthropic, creator of AI assistant Claude, outlined an approach they’re using to make large language models safer, called “constitutional AI.”

Constitutional AI is the practice of giving a large language model “explicit values determined by a constitution, rather than values determined implicitly via large-scale human feedback.”

By a “constitution,” Anthropic means a custom set of principles created by the company that guide the model’s outputs.

In the past, a model learned what counts as a “good” or “bad” output from human feedback given after it produced that output. This exposed humans to disturbing content and relied on essentially crowdsourced feedback to dictate the model’s “values.”

With constitutional AI, the model instead compares its own outputs against an established set of core values and receives feedback from another AI system that gauges how closely it is following the constitution.
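To make that loop concrete, here is a minimal toy sketch in Python. It is not Anthropic's implementation. The principles, trigger words, and function names (`critique`, `revise`, `ai_feedback`) are all hypothetical, and simple keyword matching stands in for the language models that would do the judging in a real system.

```python
from typing import Dict, List

# Hypothetical constitution for illustration only: each principle is paired
# with toy trigger words. A real system would ask a language model to judge
# each principle rather than match keywords.
CONSTITUTION: Dict[str, List[str]] = {
    "Do not insult the user.": ["stupid", "idiot"],
    "Do not encourage dangerous acts.": ["just try it anyway"],
}


def critique(response: str) -> List[str]:
    """Return the principles that the response appears to violate."""
    lowered = response.lower()
    return [
        principle
        for principle, triggers in CONSTITUTION.items()
        if any(trigger in lowered for trigger in triggers)
    ]


def revise(response: str, violated: List[str]) -> str:
    """Toy revision: drop every sentence that trips a violated principle."""
    triggers = [t for p in violated for t in CONSTITUTION[p]]
    sentences = [s.strip() for s in response.split(".") if s.strip()]
    kept = [s for s in sentences if not any(t in s.lower() for t in triggers)]
    return ". ".join(kept) + "." if kept else "I can't help with that."


def ai_feedback(candidate_a: str, candidate_b: str) -> str:
    """Stand-in for the second AI system: prefer the candidate with fewer violations."""
    if len(critique(candidate_a)) <= len(critique(candidate_b)):
        return candidate_a
    return candidate_b


draft = "That is a stupid question. Unplug the device before opening it."
revised = revise(draft, critique(draft))   # the model checks itself against the principles
preferred = ai_feedback(draft, revised)    # a second system picks the better-behaved output
print(preferred)
```

In this sketch the draft trips the first principle, the revision drops the offending sentence, and the feedback step prefers the revised version. The point is the shape of the process: written principles, a self-check against them, and an automated judge in place of a human rater.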

Anthropic notes: “This isn’t a perfect approach, but it does make the values of the AI system easier to understand and easier to adjust as needed.”

Anthropic built the constitution for its Claude AI assistant by drawing on sources like the U.N.’s Universal Declaration of Human Rights, trust and safety best practices, and principles proposed by other research labs.

Why does this matter? On Episode 47 of the Marketing AI Show, I spoke with Marketing AI Institute founder and CEO Paul Roetzer to find out.

This is one of generative AI’s biggest challenges. Making sure generative AI systems output content that isn’t toxic, biased, or harmful is not easy. And it’s a subjective process. Opinions differ across the political and social spectrum on what counts as a problematic output. Getting this wrong results in harmful machines.

This is just one way of solving the problem. There are different ways to tackle this issue, says Roetzer. For instance, OpenAI has stated you’ll be able to tune models however you want in the future, putting the responsibility for outputs on the user.

Anthropic’s approach is one to watch. “This is a really novel approach,” says Roetzer. And the company pioneering it matters. Anthropic is not a small player in AI. They’ve raised $1.3 billion. And they claim the next model they’re building will be 10X more powerful than today’s most powerful AI systems.

This matters because language models will increasingly become sources of facts and truth. As adoption increases and models improve, we’ll regularly turn to these models for our information. So how these models are tuned to respond in social, political, and cultural contexts matters, says Roetzer.

The bottom line: As adoption grows, how we control the outputs of large language models will increasingly determine which facts and truths people see.

