Anthropic’s Claude AI gets a new constitution embedding safety and ethics
Anthropic has completely overhauled the “Claude constitution”, a document that sets out the ethical parameters governing its AI model’s reasoning and behavior.
Launched at the World Economic Forum's Davos summit, the new constitution holds that Claude should be "broadly safe" (not undermining human oversight), "broadly ethical" (honest, avoiding inappropriate, dangerous, or harmful actions), and "genuinely helpful" (benefiting its users), as well as "compliant with Anthropic's guidelines".
According to Anthropic, the constitution is already being used in Claude's model training, making it fundamental to how the model reasons.
Claude’s first constitution appeared in May 2023, a modest 2,700-word document that borrowed heavily and openly from the UN Universal Declaration of Human Rights and Apple’s terms of service.
While not completely abandoning those sources, the 2026 Claude constitution moves away from the focus on “standalone principles” in favor of a more philosophical approach based on understanding not simply what is important, but why.
“We’ve come to believe that a different approach is necessary. If we want models to exercise good judgment across a wide range of novel situations, they need to be able to generalize — to apply broad principles rather than mechanically following specific rules,” explained Anthropic.
The constitution is meant to move Claude from following a limited checklist of approved behaviors to an approach based on deeper reasoning. So, for example, instead of keeping data private merely because a rule says so, the constitution should help it understand the ethical framework in which privacy is important.
The effect of this added complexity is length: the new version expands dramatically to 84 pages and 23,000 words. If this sounds long-winded, that is because the document is written to be ingested primarily by Claude itself. "It [the constitution] needs to work both as a statement of abstract ideals and a useful artifact for training," the announcement said.
It also noted that the document is currently written for mainline, general-access Claude models, and that specialized models may not fully conform to it, but said the company will "continue to evaluate" how to make them meet the constitution's core objectives. In addition, it promised to be open about missteps "in which model behavior comes apart from our vision."
Intriguingly, Anthropic has released Claude's constitution under a Creative Commons CC0 1.0 Universal dedication, which means it can be used freely by other developers in their own models.
Don’t be evil
The context for the update is rising skepticism about the reliability, ethics, and safety of large proprietary LLMs. From the start, Anthropic, which was founded in 2021 by former OpenAI employees worried about the latter's direction, has sought to set itself apart as taking a different approach.
More contentious is the constitution’s oblique reference to the debate over AI consciousness. “Claude’s moral status is deeply uncertain. We believe that the moral status of AI models is a serious question worth considering. This view is not unique to us: some of the most eminent philosophers on the theory of mind take this question very seriously,” it states on page 68.
In August, Anthropic introduced a feature in its most advanced Claude Opus 4 and 4.1 models that it said would end a conversation if a user repeatedly tried to push harmful or illegal content, as a form of self-protection. And in November, an Anthropic research paper suggested that the same Opus 4 and 4.1 models showed "some degree" of introspection, reasoning about past actions in an almost human-like way.
In fact, LLMs are statistical models, not conscious entities, countered Satyam Dhar, an AI engineer with technology startup Galileo.
“Framing them as moral actors risks distracting us from the real issue, which is human accountability. Ethics in AI should focus on who designs, deploys, validates, and relies on these systems,” he said.
“An AI ‘constitution’ can be useful as a design constraint, but it doesn’t resolve the underlying ethical risk,” he added. “No philosophical framework embedded in a model can replace human judgment, governance, and oversight. Ethics emerge from how systems are used, not from abstract principles encoded in weights.”
This article originally appeared on CIO.com.