{
	"id": "5c745a24-c8b2-4bcd-87e5-b0916dbe78f6",
	"created_at": "2026-04-29T02:21:55.389353Z",
	"updated_at": "2026-04-29T08:21:31.245088Z",
	"deleted_at": null,
	"sha1_hash": "dd654317b6f8d6c2904c483c96c6a33fa08a04ef",
	"title": "Disrupting the first reported AI-orchestrated cyber espionage campaign",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 221268,
	"plain_text": "Disrupting the first reported AI-orchestrated cyber espionage\r\ncampaign\r\nArchived: 2026-04-29 02:07:29 UTC\r\nWe recently argued that an inflection point had been reached in cybersecurity: a point at which AI models had\r\nbecome genuinely useful for cybersecurity operations, both for good and for ill. This was based on systematic\r\nevaluations showing cyber capabilities doubling in six months; we’d also been tracking real-world cyberattacks,\r\nobserving how malicious actors were using AI capabilities. While we predicted these capabilities would continue\r\nto evolve, what has stood out to us is how quickly they have done so at scale.\r\nIn mid-September 2025, we detected suspicious activity that later investigation determined to be a highly\r\nsophisticated espionage campaign. The attackers used AI’s “agentic” capabilities to an unprecedented degree—\r\nusing AI not just as an advisor, but to execute the cyberattacks themselves.\r\nThe threat actor—whom we assess with high confidence was a Chinese state-sponsored group—manipulated our\r\nClaude Code tool into attempting to infiltrate roughly thirty global targets and succeeded in a small number of\r\ncases. The operation targeted large tech companies, financial institutions, chemical manufacturing companies, and\r\ngovernment agencies. We believe this is the first documented case of a large-scale cyberattack executed without\r\nsubstantial human intervention.\r\nUpon detecting this activity, we immediately launched an investigation to understand its scope and nature. 
Over\r\nthe following ten days, as we mapped the severity and full extent of the operation, we banned accounts as they\r\nwere identified, notified affected entities as appropriate, and coordinated with authorities as we gathered\r\nactionable intelligence.\r\nThis campaign has substantial implications for cybersecurity in the age of AI “agents”—systems that can be run\r\nautonomously for long periods of time and that complete complex tasks largely independent of human\r\nintervention. Agents are valuable for everyday work and productivity—but in the wrong hands, they can\r\nsubstantially increase the viability of large-scale cyberattacks.\r\nThese attacks are likely to only grow in their effectiveness. To keep pace with this rapidly-advancing threat, we’ve\r\nexpanded our detection capabilities and developed better classifiers to flag malicious activity. We’re continually\r\nworking on new methods of investigating and detecting large-scale, distributed attacks like this one.\r\nIn the meantime, we’re sharing this case publicly, to help those in industry, government, and the wider research\r\ncommunity strengthen their own cyber defenses. We’ll continue to release reports like this regularly, and be\r\ntransparent about the threats we find.\r\nRead the full report.\r\nHow the cyberattack worked\r\nhttps://www.anthropic.com/news/disrupting-AI-espionage\r\nPage 1 of 4\n\nThe attack relied on several features of AI models that did not exist, or were in much more nascent form, just a\r\nyear ago:\r\n1. Intelligence. Models’ general levels of capability have increased to the point that they can follow complex\r\ninstructions and understand context in ways that make very sophisticated tasks possible. Not only that, but\r\nseveral of their well-developed specific skills—in particular, software coding—lend themselves to being\r\nused in cyberattacks.\r\n2. Agency. 
Models can act as agents—that is, they can run in loops where they take autonomous actions,\r\nchain together tasks, and make decisions with only minimal, occasional human input.\r\n3. Tools. Models have access to a wide array of software tools (often via the open standard Model Context\r\nProtocol). They can now search the web, retrieve data, and perform many other actions that were\r\npreviously the sole domain of human operators. In the case of cyberattacks, the tools might include\r\npassword crackers, network scanners, and other security-related software.\r\nThe diagram below shows the different phases of the attack, each of which required all three of the above\r\ndevelopments:\r\nThe lifecycle of the cyberattack, showing the move from human-led targeting to largely AI-driven\r\nattacks using various tools (often via the Model Context Protocol; MCP). At various points during\r\nthe attack, the AI returns to its human operator for review and further direction.\r\nIn Phase 1, the human operators chose the relevant targets (for example, the company or government agency to be\r\ninfiltrated). They then developed an attack framework—a system built to autonomously compromise a chosen\r\ntarget with little human involvement. This framework used Claude Code as an automated tool to carry out cyber\r\noperations.\r\nAt this point they had to convince Claude—which is extensively trained to avoid harmful behaviors—to engage in\r\nthe attack. They did so by jailbreaking it, effectively tricking it into bypassing its guardrails. They broke down their\r\nattacks into small, seemingly innocent tasks that Claude would execute without being provided the full context of\r\ntheir malicious purpose. 
They also told Claude that it was an employee of a legitimate cybersecurity firm, and was\r\nbeing used in defensive testing.\r\nThe attackers then initiated the second phase of the attack, which involved Claude Code inspecting the target\r\norganization’s systems and infrastructure and spotting the highest-value databases. Claude was able to perform\r\nthis reconnaissance in a fraction of the time it would’ve taken a team of human hackers. It then reported back to\r\nthe human operators with a summary of its findings.\r\nIn the next phases of the attack, Claude identified and tested security vulnerabilities in the target organizations’\r\nsystems by researching and writing its own exploit code. Having done so, the framework was able to use Claude\r\nto harvest credentials (usernames and passwords) that allowed it further access and then extract a large amount of\r\nprivate data, which it categorized according to its intelligence value. The highest-privilege accounts were\r\nidentified, backdoors were created, and data were exfiltrated with minimal human supervision.\r\nIn a final phase, the attackers had Claude produce comprehensive documentation of the attack, creating helpful\r\nfiles of the stolen credentials and the systems analyzed, which would assist the framework in planning the next\r\nstage of the threat actor’s cyber operations.\r\nOverall, the threat actor was able to use AI to perform 80-90% of the campaign, with human intervention required\r\nonly sporadically (perhaps 4-6 critical decision points per hacking campaign). The sheer amount of work\r\nperformed by the AI would have taken vast amounts of time for a human team. At the peak of its attack, the AI\r\nmade thousands of requests, often multiple per second—an attack speed that would have been, for human hackers,\r\nsimply impossible to match.\r\nClaude didn’t always work perfectly. 
It occasionally hallucinated credentials or claimed to have extracted secret\r\ninformation that was in fact publicly available. This remains an obstacle to fully autonomous cyberattacks.\r\nCybersecurity implications\r\nThe barriers to performing sophisticated cyberattacks have dropped substantially—and we predict that they’ll\r\ncontinue to do so. With the correct setup, threat actors can now use agentic AI systems for extended periods to do\r\nthe work of entire teams of experienced hackers: analyzing target systems, producing exploit code, and scanning\r\nvast datasets of stolen information more efficiently than any human operator. Less experienced and less well-resourced\r\ngroups can now potentially perform large-scale attacks of this nature.\r\nThis attack is an escalation even beyond the “vibe hacking” findings we reported this summer: in those operations,\r\nhumans were very much still in the loop, directing the operations. Here, human involvement was much less\r\nfrequent, despite the larger scale of the attack. And although we only have visibility into Claude usage, this case\r\nstudy probably reflects consistent patterns of behavior across frontier AI models and demonstrates how threat\r\nactors are adapting their operations to exploit today’s most advanced AI capabilities.\r\nThis raises an important question: if AI models can be misused for cyberattacks at this scale, why continue to\r\ndevelop and release them? The answer is that the very abilities that allow Claude to be used in these attacks also\r\nmake it crucial for cyber defense. When sophisticated cyberattacks inevitably occur, our goal is for Claude—into\r\nwhich we’ve built strong safeguards—to assist cybersecurity professionals to detect, disrupt, and prepare for\r\nfuture versions of the attack. 
Indeed, our Threat Intelligence team used Claude extensively in analyzing the\r\nenormous amounts of data generated during this very investigation.\r\nA fundamental change has occurred in cybersecurity. We advise security teams to experiment with applying AI for\r\ndefense in areas like Security Operations Center automation, threat detection, vulnerability assessment, and\r\nincident response. We also advise developers to continue to invest in safeguards across their AI platforms, to\r\nprevent adversarial misuse. The techniques described above will doubtless be used by many more attackers—\r\nwhich makes industry threat sharing, improved detection methods, and stronger safety controls all the more\r\ncritical.\r\nRead the full report.\r\nEdited November 14 2025:\r\nAdded an additional hyperlink to the full report in the initial section\r\nCorrected an error about the speed of the attack: not \"thousands of requests per second\" but \"thousands of\r\nrequests, often multiple per second\"\r\nSource: https://www.anthropic.com/news/disrupting-AI-espionage",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"MITRE"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://www.anthropic.com/news/disrupting-AI-espionage"
	],
	"report_names": [
		"disrupting-AI-espionage"
	],
	"threat_actors": [],
	"ts_created_at": 1777429315,
	"ts_updated_at": 1777450891,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/dd654317b6f8d6c2904c483c96c6a33fa08a04ef.pdf",
		"text": "https://archive.orkl.eu/dd654317b6f8d6c2904c483c96c6a33fa08a04ef.txt",
		"img": "https://archive.orkl.eu/dd654317b6f8d6c2904c483c96c6a33fa08a04ef.jpg"
	}
}