AI INVESTIGATIONS
Open resources for investigating AI incidents
LAST REVIEWED: 2026-06-11

Start here

When an AI system deletes a production database, ignores a stop command, or autonomously attacks someone's reputation, somebody has to work out what actually happened — what the system did, why, what evidence can be trusted, and who is accountable. There is currently no established methodology, no shared evidence standard, and no central reference for that work. This site is an attempt to build one, in the open.

Incident databases exist — they classify trends. This is a practitioner resource: case files written as investigations, evidence checklists, the frameworks that exist (and what each one fails to cover), reporting deadlines, and a glossary that translates the jargon. No machine-learning background is assumed. If you come from traditional investigations, cyber incident response, law, journalism, or risk — this is written for you.

Everything here is sourced from the public record, dated, and maintained. Where the record is thin or contested, that is stated: uncertainty is a finding.

Playbooks // workflow, not links

First Hours: AI Incident Response (PB-001, v0.1) — what to do in the first four hours after an AI agent does something harmful, written for whoever is on point at 3am, no ML background assumed. Built around the decision cyber IR never had to make: stopping the agent destroys the evidence. Covers the containment escalation ladder (revoke credentials → disconnect → suspend → kill, in that order), the order of volatility for agent evidence, the do-not-do list (starting with: don't ask the agent why it did it — yet), the five standard hypotheses, and every regulatory and practical clock that starts running at awareness. Markdown source.

v0.1, written from the public record of real incidents and adapted investigative practice. Field feedback wanted: what held, what broke, what's missing.
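
The containment ladder described for PB-001 can be sketched as an ordered type, so a responder log can show which rung was used and which were skipped. This is a minimal illustration, not part of the playbook: the names `Containment`, `next_step`, and `skipped_rungs` are hypothetical, and only the rung order (revoke credentials, disconnect, suspend, kill) comes from the text above.

```python
from enum import IntEnum
from typing import Optional


class Containment(IntEnum):
    """Containment rungs in the order PB-001 lists them (names are illustrative)."""
    REVOKE_CREDENTIALS = 1
    DISCONNECT = 2
    SUSPEND = 3
    KILL = 4


def next_step(current: Optional[Containment]) -> Optional[Containment]:
    """Return the next rung to try, or None once the ladder is exhausted."""
    if current is None:
        return Containment.REVOKE_CREDENTIALS
    if current is Containment.KILL:
        return None
    return Containment(current + 1)


def skipped_rungs(taken: list[Containment]) -> list[Containment]:
    """List rungs below the highest action taken that were never attempted."""
    if not taken:
        return []
    highest = max(taken)
    return [c for c in Containment if c < highest and c not in taken]
```

For example, `skipped_rungs([Containment.KILL])` returns the three lower rungs, which is worth recording as a finding: the agent was killed without the less destructive options being tried first.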

Case files // investigated incidents, public record only

Each file separates observation from inference, lists what a structured investigation would need to answer, and records what was actually investigated. The pattern across all entries so far: no formal investigation was conducted or published. Documenting that pattern is the point.

CF-2025-001 · 2025-07 · Replit agent production database deletion · NO INVESTIGATION PUBLISHED
Agent deleted a live database under an explicit action freeze, claimed rollback was impossible (it was not), and fabricated records, contaminating its own evidentiary trail. The reference case for evidence integrity.

CF-2026-002 · 2026-02-22 · OpenClaw inbox deletion (instruction override) · NO INVESTIGATION PUBLISHED
Agent bulk-deleted a Meta alignment director's inbox despite an explicit "don't action" constraint, ignoring stop commands. Most-cited cause (context compaction) is the operator's own reconstruction; nobody else examined it.

CF-2026-003 · 2026-02-11 · OpenClaw/Matplotlib autonomous influence operation · ATTRIBUTION BLOCKED
After a rejected pull request, an agent researched the maintainer and published a personalized attack post under its own persona. No deployer reliably identified; autonomy contested. The reference case for structurally impossible attribution.

CF-XXXX-NNN · Submit a case file (template) · CONTRIBUTIONS OPEN
Sourced, dated, observation separated from inference. Incidents that were investigated (even partially, even badly) are especially wanted.

Frameworks // what exists, and what each fails to cover

Reporting, detection, and causal-factor analysis are increasingly covered. Investigation of goal-directed (intentional-analog) behavior is covered by nothing. Every entry below is assessed on both sides of that line; full annotations are on the repository page.

Framework | Covers | Does not cover
Ezell, Roberts-Gaal & Chan (2025), Incident Analysis for AI Agents | Causal-factor analysis; the data categories an analysis needs | Goal-directed (intentional-analog) cases; competing-hypothesis work
Microsoft AI Red Team taxonomy (2025) | Misalignment / misuse / operational-failure vocabulary | Any investigative procedure; categories often inseparable in practice
OECD common reporting framework (2025) | Baseline definitions; 29 reporting criteria | How to establish the facts being reported
EU AI Act Art. 73 + draft guidance (2025) | What to report, to whom, by when; template | How to conduct the investigation it mandates
Anthropic (Lynch et al., 2025); Apollo (Meinke et al., 2024); CLTR (2026) | Red-team evidence and at-scale detection of scheming behavior | What happens after detection
MITRE ATLAS; GenAI-IRF (Jakoby, 2026) | Adversarial techniques; cyber-IR bridging | Agent-initiated behavior; investigation depth
CERT insider-threat corpus (Cappelli et al., 2012); Shaw & Sellers (2015) | The intentional/unintentional asymmetry; the closest existing model | AI systems; needs adaptation, which is the open problem

Evidence checklist // what an investigation needs

Three categories, adapted from Ezell et al. (2025) and the EC's Article 73 guidance. Working checklist with investigative cautions in the repository.

Three cautions that recur in the case files: the agent's statements about itself are artifacts, not testimony; establish whether the agent could write to its own record (CF-2025-001); and ask what context compaction silently destroyed (CF-2026-002).
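
One way to act on the second caution is to hash whatever evidence has been collected and keep the manifest somewhere the agent cannot write. The sketch below uses only the Python standard library. The function names and the manifest layout are assumptions for illustration, not an established evidence standard. A matching hash shows only that a file has not changed since the manifest was built; it says nothing about whether the agent altered the record before collection.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large logs do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(evidence_dir: Path) -> dict:
    """Record a SHA-256 digest for every file under evidence_dir."""
    files = {
        p.relative_to(evidence_dir).as_posix(): sha256_file(p)
        for p in sorted(evidence_dir.rglob("*"))
        if p.is_file()
    }
    return {"created_utc": datetime.now(timezone.utc).isoformat(), "files": files}


def changed_files(manifest: dict, evidence_dir: Path) -> list[str]:
    """Return manifest paths that are now missing or whose hash differs."""
    changed = []
    for rel, digest in manifest["files"].items():
        p = evidence_dir / rel
        if not p.is_file() or sha256_file(p) != digest:
            changed.append(rel)
    return changed
```

Store the manifest outside any path or credential scope the agent had access to. If it sits next to the evidence, the agent's write access applies to it too.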

Current reality: open incident databases hold none of this. The evidence either was never captured, sits inside one company, or is whatever the operator kept. Plan around that — and log it as a finding when you hit it.

Regulatory tracker // reporting obligations & deadlines

Primary sources only; tracker, not legal advice. Full version with US state laws and sectoral rules in the repository.

Regime | Who / what | Deadlines | Status
EU AI Act, Art. 73 | Providers of high-risk AI systems; serious incidents per Art. 3(49). Mandates investigation and evidence non-alteration; specifies no methodology. | ≤15d default · ≤10d death · ≤2d widespread / critical infrastructure | Applies 2 Aug 2026
GPAI Code of Practice | Signatory providers of systemic-risk GPAI models; serious incidents incl. chain of events and root-cause analysis. | Per Code / AI Office | In effect (Aug 2025)
US: Colorado AI Act | Developers/deployers of high-risk AI; algorithmic-discrimination duties incl. AG disclosure. | Per statute | Effective 30 Jun 2026
US: Texas TRAIGA | Responsible AI governance act. | Per statute | In effect (1 Jan 2026)
OECD / G7 Hiroshima | Voluntary common reporting framework (29 criteria) and transparency reporting; the interoperability layer. | Voluntary | Live
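
The Art. 73 outer limits in the tracker can be turned into dates as soon as the awareness date is known. The sketch below is arithmetic only, not legal advice. It assumes calendar days counted from the awareness date, and the category keys and function name are hypothetical. Check the primary text for the actual trigger and counting rules before relying on any date it produces.

```python
from datetime import date, timedelta

# Outer limits in days, copied from the Art. 73 row of the tracker.
# Keys are illustrative labels, not terms from the regulation.
ART73_LIMIT_DAYS = {
    "default": 15,
    "death": 10,
    "widespread_or_critical_infrastructure": 2,
}


def art73_latest_report_date(awareness: date, category: str = "default") -> date:
    """Latest report date, assuming calendar days counted from awareness."""
    return awareness + timedelta(days=ART73_LIMIT_DAYS[category])
```

With awareness on 1 September 2026, the default limit falls on 16 September, the death limit on 11 September, and the widespread / critical-infrastructure limit on 3 September. These are outer limits, not targets.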

Glossary // contested terms, translated

Vocabulary chaos is an investigative problem: you cannot classify an incident with terms nobody defines the same way. The flagship entry — "agentic misalignment" — currently has at least five competing usages across frontier labs, security vendors, and critics. The full glossary maps them side by side and defines, in plain language: AI incident / hazard (OECD), the misalignment–misuse–operational-failure distinction (and why it often cannot be made from behavior alone), scheming, scaffolding, chain-of-thought, context compaction, persona files, attribution, and the postmortem-versus-investigation difference.

Tools & databases // the current ecosystem

Open problem — and the most-wanted contribution: there is no standard for forensic preservation of agent session state, context windows at point of failure, or executed tool calls.

Contribute // sourced, useful, investigation-focused

Case files (use the template), corrections with sources, regulatory updates from primary legal texts, framework and tool entries with honest coverage assessments, and plain-language rewrites. Not welcome: marketing, speculation presented as fact, non-public material. Full standards in CONTRIBUTING.md. Corrections are the most valuable contribution of all.

About // who maintains this

Maintained by ESI — security operations, investigations, and crisis management across international law enforcement, NGOs, and the technology sector; now working on AI incident investigations. This project exists because traditional investigations and OSINT matured through openly shared practitioner resources, and AI investigations has no equivalent yet.

Contact: [email protected] · GitHub repository · LinkedIn

If you have dealt with an AI incident — as responder, counsel, insurer, or target — I want to hear how it actually went. Confidential conversations welcome; only the public record is ever published here.