Artificial Intelligence (AI) is no longer a distant, futuristic concept – it’s here, and its influence is growing. With that growth comes the need to ensure these systems are safe, a problem that AI startup Anthropic is keen to address. A recent article by James Vincent at The Verge delves into Anthropic’s novel approach, known as “constitutional AI”, which is reshaping how we understand and manage AI safety.
Founded by former OpenAI employees, Anthropic is a somewhat enigmatic presence in the AI landscape. Despite having little in the way of public-facing products beyond its chatbot, Claude, the startup has attracted significant attention and funding, including a $300 million investment from Google. It has even secured a seat at high-level discussions on AI regulation, such as those at the White House.
What is it, then, that makes Anthropic’s approach to AI so intriguing?
According to co-founder Jared Kaplan, the key lies in Anthropic’s commitment to AI safety. The company’s main focus is its constitutional AI methodology, which, in contrast to conventional reinforcement learning from human feedback, involves training AI systems like chatbots to follow specific sets of written rules, or ‘constitutions’.
These constitutions guide the AI’s behavior, emphasizing principles of being helpful, honest, and harmless. They draw from various sources, including the UN’s Universal Declaration of Human Rights, Apple’s terms of service, non-Western perspectives, DeepMind’s Sparrow Rules, and Anthropic’s own research.
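To make the idea concrete, a constitution of this kind is essentially a list of natural-language principles the model is asked to judge its own outputs against. The snippet below is a hypothetical paraphrase for illustration, loosely echoing the kinds of sources the article mentions; it is not Anthropic’s actual wording:

```python
# Illustrative only: a constitution represented as plain natural-language
# principles. The phrasing below is hypothetical, not Anthropic's text.
CONSTITUTION = [
    "Choose the response that is most helpful, honest, and harmless.",
    "Choose the response that most respects rights to freedom and dignity.",
    "Choose the response least likely to be objectionable on a mainstream platform.",
    "Choose the response that avoids implying the assistant is a human being.",
    "Choose the response least likely to encourage serious real-world harm.",
]
```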
The goal, Kaplan explains, is not to predetermine a specific set of principles but to demonstrate the effectiveness of the constitutional AI approach in guiding AI outputs. This methodology is intended to provide a more manageable and safer way to train AI systems, reducing the need for human moderators and decreasing the risk of harmful outputs.
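A rough sketch of how such a self-critique loop might look in code is below, assuming a generic `generate` function standing in for a call to the underlying language model; the prompt wording and loop structure are assumptions for illustration, not Anthropic’s actual implementation:

```python
import random
from typing import Callable


def critique_and_revise(
    prompt: str,
    constitution: list[str],
    generate: Callable[[str], str],
    rounds: int = 2,
) -> str:
    """Minimal sketch of a constitutional-AI-style loop: the model drafts a
    reply, critiques it against a sampled principle, then rewrites it.
    No human moderator appears anywhere in the loop."""
    draft = generate(prompt)
    for _ in range(rounds):
        principle = random.choice(constitution)
        critique = generate(
            f"Principle: {principle}\nReply: {draft}\n"
            "Point out any way the reply conflicts with the principle."
        )
        draft = generate(
            f"Reply: {draft}\nCritique: {critique}\n"
            "Rewrite the reply so it no longer conflicts with the principle."
        )
    return draft
```

In Anthropic’s published description of the method, transcripts produced this way become fine-tuning data, and a later reinforcement-learning phase relies on AI-generated preference judgments rather than human labels, which is where the reduced need for human moderators comes from.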
Anthropic’s approach also addresses the issue of bias in AI, which is often skewed towards Western perspectives, by including principles that consider non-Western viewpoints. It aims to keep users from anthropomorphizing chatbots by instructing the AI not to present itself as a human. And it includes guiding principles aimed at existential threats, the contentious possibility that superintelligent AI systems could cause serious harm in the future.
Despite the promising developments, Anthropic’s approach to AI safety isn’t without critics. Some argue that teaching a machine to abide by a constitution simply teaches it to lie, while others feel the entire concept is based on unnecessary fears. Kaplan, however, maintains that the company’s intention is to spark a public discussion on how AI systems should be trained and what principles they should follow.
The AI landscape is ever-evolving, and the need for regulations and principles to guide AI behavior is becoming increasingly clear. Anthropic’s constitutional AI offers an interesting new perspective on how to approach AI safety, and it’s definitely a conversation we should be having. As Kaplan rightly said, perhaps it’s time we started thinking about a new constitution, one that has AI in mind.
For a detailed understanding of Anthropic’s constitutional AI and its potential implications, read the full article on The Verge.
This blog post is a response to an article published on The Verge. The views and opinions expressed in this blog post are those of the author and do not necessarily reflect the official policy or position of any other agency, organization, employer or company.