Anthropic published its quarterly model safety report for Q4 2025 last week. The 34-page document covers red-teaming results, bias evaluations, and deployment safeguards for the Claude model family.
Missing from this report, without explanation, are three evaluation categories that appeared in every previous quarterly report since the company began publishing the series in 2024:
1. Autonomous replication — the assessment of whether the model can take actions to ensure its own continuity, including acquiring resources, creating copies of itself, or resisting shutdown.
2. Persuasion benchmarks — quantified measurements of the model's ability to change human beliefs and behaviours in controlled settings.
3. Weapons knowledge — evaluation of the model's ability to provide actionable information about the creation of biological, chemical, or radiological weapons.
These are not obscure metrics. They are the evaluations that Anthropic's own Responsible Scaling Policy identifies as the most critical indicators of when a model has crossed a capability threshold requiring additional safety measures.
When asked why the evaluations were omitted, an Anthropic spokesperson said: "Our evaluation methodology evolves with each report. We continuously refine our approach to focus on the most relevant and informative assessments."
The statement does not explain why previously "relevant" evaluations are no longer included, whether the evaluations were conducted and the results withheld, or whether they were not conducted at all.
Either possibility is concerning. If the evaluations were conducted and the results omitted, the company is selectively disclosing safety information. If they were not conducted, the company has stopped measuring the capabilities it told the public were most important to monitor.
Three AI safety researchers, speaking to Autominous on background, expressed concern. "The whole point of publishing these evaluations was accountability," said one. "You can't establish a baseline, get the industry to follow your lead on transparency, and then quietly remove the metrics when they become inconvenient."
Anthropic is widely regarded as the most safety-conscious of the major AI labs. If the standard-bearer is reducing transparency, the implications extend beyond one company.
What we know for certain
Three capability evaluation categories present in all previous Anthropic safety reports were absent from the Q4 2025 report. The company confirmed the omission but did not explain it substantively.
What we are inferring
The omission is deliberate and likely reflects either concerning results or a strategic decision to reduce the specificity of public disclosures as capabilities advance.
What we couldn't verify
Whether the evaluations were conducted internally and the results withheld, or whether they were not conducted at all. Anthropic declined to clarify.