Anthropic Restricts AI Model Distribution Over Cybersecurity Attack Concerns

Thousands of critical security flaws in leading operating systems and browsers, including some that have remained unpatched for decades, were discovered by Anthropic's Claude Mythos Preview.

Following the discovery of thousands of critical security vulnerabilities spanning operating systems, web browsers and various other software platforms, Anthropic has revealed plans to distribute its Claude Mythos Preview AI model exclusively to a carefully chosen group of companies.

According to Anthropic, the newly developed general-purpose model identified high-severity security vulnerabilities present in each of the major operating systems and web browsers currently in widespread use.

Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely.

Hackers have already begun leveraging AI technology to carry out cyberattacks. Statistics from AllAboutAI indicate a 72% year-over-year surge in cyberattacks powered by AI, with 87% of global organizations falling victim to AI-enabled cyberattacks throughout 2025.

The prospect of malicious actors gaining access to comparable AI capabilities has raised significant concerns at Anthropic.

In response to these threats, Anthropic unveiled Project Glasswing on Tuesday, a newly launched initiative that unites more than 40 companies, featuring participants such as Amazon Web Services, Apple, Cisco, Google, JPMorgan, the Linux Foundation, Microsoft and Nvidia.

Through Project Glasswing, the capabilities of Claude Mythos Preview will be employed defensively to identify bugs, distribute the findings among partner organizations and proactively address threats by remedying critical vulnerabilities before malicious actors have an opportunity to weaponize them.

Decades-old bugs are being discovered

When a software bug can be weaponized before anyone capable of fixing it becomes aware of its existence, it is classified as a zero-day vulnerability. Historically, identifying and remedying these vulnerabilities has demanded scarce and costly human expertise, though AI technology has the potential to transform both the magnitude and velocity of vulnerability detection.

According to Anthropic, the security flaws it has uncovered are "often subtle or difficult to detect."

A substantial number of these vulnerabilities are 10 or 20 years old, with the most ancient discovered thus far being a currently-patched 27-year-old bug located in OpenBSD — an operating system recognized predominantly for its security features, the company noted.

Additional discoveries include a 16-year-old bug present in the FFmpeg media processing library, a 17-year-old remote code execution vulnerability affecting the open-source FreeBSD operating system and multiple vulnerabilities embedded in the Linux kernel.

The company further noted that web applications "contain a myriad of vulnerabilities," spanning from cross-site scripting and SQL injection to domain-specific vulnerabilities like cross-site request forgery, which sees frequent deployment in phishing attacks.

Lifecycle of a zero-day exploit — The lifecycle of a zero-day exploit. Source: PhoenixNAP

Anthropic has asserted that 99% of the vulnerabilities discovered by its system remain unpatched at this time, "so it would be irresponsible for us to disclose details about them."

Software will emerge more secure, but not overnight

According to Anthropic, this development likely represents merely the initial phase of an emerging trend, and the "work of defending the world's cyber infrastructure might take years," though AI technology will contribute to strengthening software and systems against attacks.

In the long run, we expect that defense capabilities will dominate: that the world will emerge more secure, with software better hardened—in large part by code written by these models. But the transitional period will be fraught.

← Powrót do bloga