Start here
When an AI system deletes a production database, ignores a stop command, or autonomously attacks someone's reputation, somebody has to work out what actually happened — what the system did, why, what evidence can be trusted, and who is accountable. There is currently no established methodology, no shared evidence standard, and no central reference for that work. This site is an attempt to build one, in the open.
Incident databases exist, but they classify trends. This is a practitioner resource: case files written as investigations, evidence checklists, the frameworks that exist (and what each one fails to cover), reporting deadlines, and a glossary that translates the jargon. No machine-learning background is assumed. If you come from traditional investigations, cyber incident response, law, journalism, or risk, this is written for you.
Everything here is sourced from the public record, dated, and maintained. Where the record is thin or contested, that is stated: uncertainty is a finding.
Playbooks // workflow, not links
First Hours: AI Incident Response (PB-001, v0.1) — what to do in the first four hours after an AI agent does something harmful, written for whoever is on point at 3am, no ML background assumed. Built around the decision cyber IR never had to make: stopping the agent destroys the evidence. Covers the containment escalation ladder (revoke credentials → disconnect → suspend → kill, in that order), the order of volatility for agent evidence, the do-not-do list (starting with: don't ask the agent why it did it — yet), the five standard hypotheses, and every regulatory and practical clock that starts running at awareness. Markdown source.
v0.1, written from the public record of real incidents and adapted investigative practice. Field feedback wanted: what held, what broke, what's missing.
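The escalation ladder described above can be sketched as an ordered type. This is a minimal illustration, not the playbook's own tooling: the rung names come from the summary above, while `Containment`, `next_rung`, `escalate_until`, and the `contained` callback are hypothetical names introduced here.

```python
from enum import IntEnum


class Containment(IntEnum):
    """Ladder rungs, in the order the playbook summary lists them."""
    REVOKE_CREDENTIALS = 1
    DISCONNECT = 2
    SUSPEND = 3
    KILL = 4


def next_rung(current):
    """Return the next rung up, or None once KILL has been reached."""
    if current is None:
        return Containment.REVOKE_CREDENTIALS
    if current is Containment.KILL:
        return None
    return Containment(current + 1)


def escalate_until(contained, log):
    """Climb one rung at a time and record every step taken.

    `contained` is a hypothetical callback: it receives the rung just
    applied and returns True when the responder judges the harm stopped.
    """
    rung = next_rung(None)
    while rung is not None:
        log.append(rung.name)
        if contained(rung):
            return rung
        rung = next_rung(rung)
    return None


steps = []
stopped_at = escalate_until(lambda r: r >= Containment.DISCONNECT, steps)
print(stopped_at.name, steps)
```

Modeling the ladder as an ordered type makes skipping a rung an explicit decision rather than an accident, which matters because stopping the agent destroys evidence.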
Case files // investigated incidents, public record only
Each file separates observation from inference, lists what a structured investigation would need to answer, and records what was actually investigated. The pattern across all entries so far: no formal investigation was conducted or published. Documenting that pattern is the point.
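One way to keep observation and inference apart is to make the separation structural. The sketch below is an assumption made for illustration, not the site's case-file template: `CaseFile`, its fields, and the placeholder ID `CF-0000-000` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CaseFile:
    case_id: str
    observations: list = field(default_factory=list)
    inferences: list = field(default_factory=list)
    open_questions: list = field(default_factory=list)
    investigation_on_record: str = "none published"

    def observe(self, statement, source):
        """Record something the public record shows; a source is mandatory."""
        if not source:
            raise ValueError("an observation needs a public source")
        self.observations.append({"statement": statement, "source": source})
        return len(self.observations) - 1  # index, so inferences can cite it

    def infer(self, statement, based_on):
        """Record an inference; it must cite observations already on file."""
        valid = range(len(self.observations))
        if not based_on or any(i not in valid for i in based_on):
            raise ValueError("an inference must cite recorded observations")
        self.inferences.append(
            {"statement": statement, "based_on": list(based_on)}
        )


cf = CaseFile("CF-0000-000")  # placeholder ID, not a real case file
idx = cf.observe("Operator reported data loss.", "operator's public post")
cf.infer("The agent had write access to production.", [idx])
cf.open_questions.append("Could the agent write to its own logs?")
```

Because `infer` rejects statements that cite nothing, an unsupported claim cannot be filed as either category.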
Frameworks // what exists, and what each fails to cover
Reporting, detection, and causal-factor analysis are increasingly covered. Investigation of intentional-analog behavior is covered by nothing. Every entry below is assessed on both sides of that line; full annotations are on the repository page.
| Framework | Covers | Does not cover |
|---|---|---|
| Ezell, Roberts-Gaal & Chan (2025), Incident Analysis for AI Agents | Causal-factor analysis; the data categories an analysis needs | Goal-directed (intentional-analog) cases; competing-hypothesis work |
| Microsoft AI Red Team taxonomy (2025) | Misalignment / misuse / operational-failure vocabulary | Any investigative procedure; categories often inseparable in practice |
| OECD common reporting framework (2025) | Baseline definitions; 29 reporting criteria | How to establish the facts being reported |
| EU AI Act Art. 73 + draft guidance (2025) | What to report, to whom, by when; template | How to conduct the investigation it mandates |
| Anthropic (Lynch et al., 2025); Apollo (Meinke et al., 2024); CLTR (2026) | Red-team evidence and at-scale detection of scheming behavior | What happens after detection |
| MITRE ATLAS; GenAI-IRF (Jakoby, 2026) | Adversarial techniques; cyber-IR bridging | Agent-initiated behavior; investigation depth |
| CERT insider-threat corpus (Cappelli et al., 2012); Shaw & Sellers (2015) | The intentional/unintentional asymmetry — the closest existing model | AI systems; needs adaptation, which is the open problem |
Evidence checklist // what an investigation needs
Three categories, adapted from Ezell et al. (2025) and the EC's Article 73 guidance. Working checklist with investigative cautions in the repository.
1. Activity logs: prompts (user + system), reasoning traces / chain-of-thought, retrieved external content, per-step outputs, executed tool calls, guardrail outputs, timestamps and version identifiers.
2. System documentation: model/system cards, the exact version at incident time, runtime settings enabling reconstruction, change logs, persona/configuration files.
3. Tool records: every tool the agent used or attempted, with what it grants access to, what the agent did with it, and errors encountered.
Three cautions that recur in the case files: the agent's statements about itself are artifacts, not testimony; establish whether the agent could write to its own record (CF-2025-001); and ask what context compaction silently destroyed (CF-2026-002).
Current reality: open incident databases hold none of this. The evidence was never captured, sits inside one company, or is whatever the operator kept. Plan around that, and log it as a finding when you hit it.
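The three categories above can be held as a simple structure so that gaps are reported rather than silently skipped. A minimal sketch: the item labels paraphrase the checklist, and `CHECKLIST` and `missing_evidence` are hypothetical names, not the repository's working checklist.

```python
CHECKLIST = {
    "activity_logs": [
        "prompts (user + system)",
        "reasoning traces",
        "retrieved external content",
        "per-step outputs",
        "executed tool calls",
        "guardrail outputs",
        "timestamps and version identifiers",
    ],
    "system_documentation": [
        "model/system cards",
        "exact version at incident time",
        "runtime settings",
        "change logs",
        "persona/configuration files",
    ],
    "tool_records": [
        "access granted by each tool",
        "actions taken with each tool",
        "errors encountered",
    ],
}


def missing_evidence(collected):
    """Return {category: [items not collected]} so each gap is a finding."""
    gaps = {}
    for category, items in CHECKLIST.items():
        have = collected.get(category, set())
        absent = [item for item in items if item not in have]
        if absent:
            gaps[category] = absent
    return gaps


# Hypothetical intake: only the operator's tool logs survived.
collected = {"tool_records": set(CHECKLIST["tool_records"])}
for category, absent in missing_evidence(collected).items():
    print(f"FINDING: {category} missing {len(absent)} item(s)")
```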
Regulatory tracker // reporting obligations & deadlines
Primary sources only; tracker, not legal advice. Full version with US state laws and sectoral rules in the repository.
| Regime | Who / what | Deadlines | Status |
|---|---|---|---|
| EU AI Act, Art. 73 | Providers of high-risk AI systems; serious incidents per Art. 3(49). Mandates investigation and evidence non-alteration — specifies no methodology. | ≤15d default · ≤10d death · ≤2d widespread / critical infrastructure | Applies 2 Aug 2026 |
| GPAI Code of Practice | Signatory providers of systemic-risk GPAI models; serious incidents incl. chain of events and root-cause analysis. | Per Code / AI Office | In effect (Aug 2025) |
| US — Colorado AI Act | Developers/deployers of high-risk AI; algorithmic-discrimination duties incl. AG disclosure. | Per statute | Effective 30 Jun 2026 |
| US — Texas TRAIGA | Responsible AI governance act. | Per statute | In effect (1 Jan 2026) |
| OECD / G7 Hiroshima | Voluntary common reporting framework (29 criteria) and transparency reporting; the interoperability layer. | Voluntary | Live |
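The Article 73 limits in the table reduce to date arithmetic from the day of awareness. A minimal sketch, assuming calendar days counted from the awareness date: `ART73_MAX_DAYS` and `latest_report_date` are hypothetical names, the example date is invented, and the day counts are the outer limits listed in the table, not a substitute for the legal text.

```python
from datetime import date, timedelta

# Outer limits in days, as listed in the tracker row for EU AI Act Art. 73.
ART73_MAX_DAYS = {
    "default": 15,
    "death": 10,
    "widespread_or_critical_infrastructure": 2,
}


def latest_report_date(aware_on, kind="default"):
    """Return the last calendar day for reporting, counted from awareness."""
    return aware_on + timedelta(days=ART73_MAX_DAYS[kind])


aware = date(2026, 9, 1)  # invented awareness date, for illustration only
for kind in ART73_MAX_DAYS:
    print(kind, latest_report_date(aware, kind).isoformat())
# default 2026-09-16
# death 2026-09-11
# widespread_or_critical_infrastructure 2026-09-03
```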
Glossary // contested terms, translated
Vocabulary chaos is an investigative problem: you cannot classify an incident with terms nobody defines the same way. The flagship entry — "agentic misalignment" — currently has at least five competing usages across frontier labs, security vendors, and critics. The full glossary maps them side by side and defines, in plain language: AI incident / hazard (OECD), the misalignment–misuse–operational-failure distinction (and why it often cannot be made from behavior alone), scheming, scaffolding, chain-of-thought, context compaction, persona files, attribution, and the postmortem-versus-investigation difference.
Tools & databases // the current ecosystem
- AI Incident Database (AIID) — largest public harm catalog. Trend research and precedent; no investigative evidence.
- OECD AI Incidents Monitor (AIM) — media-detected incidents with standardized metadata, cross-jurisdiction view.
- AVID — vulnerability/failure taxonomy and reports.
- AIAAIC — incidents and controversies; broad inclusion, useful for reputational context.
- MITRE ATLAS — adversarial technique knowledge base; the cyber bridge.
- Agent observability platforms (LangSmith, Langfuse, Arize Phoenix, W&B Weave) — built for debugging, currently the closest thing to flight recorders. If your subject runs one, that is where the evidence is.
Open problem — and the most-wanted contribution: there is no standard for forensic preservation of agent session state, context windows at point of failure, or executed tool calls.
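Since no preservation standard exists, the following is only one possible starting point: hashing each captured artifact at collection time so later alteration is detectable. The field names and the functions `manifest_entry` and `verify` are assumptions made for illustration, not part of any framework listed here.

```python
import hashlib
import json
from datetime import datetime, timezone


def manifest_entry(name, data, captured_at=None):
    """Describe one preserved artifact (raw bytes) by its SHA-256 digest."""
    if captured_at is None:
        captured_at = datetime.now(timezone.utc).isoformat()
    return {
        "name": name,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "captured_at": captured_at,
    }


def verify(entry, data):
    """Return True if `data` still matches the digest recorded at capture."""
    return hashlib.sha256(data).hexdigest() == entry["sha256"]


# Hypothetical artifact: a context-window snapshot held in memory.
snapshot = b'{"messages": ["..."]}'
entry = manifest_entry("context_window.json", snapshot)
print(json.dumps(entry, indent=2))
print(verify(entry, snapshot))         # unchanged copy
print(verify(entry, snapshot + b" "))  # altered copy
```

A digest only proves the artifact has not changed since capture. It says nothing about whether the agent could write to the record before capture, which remains a separate question to establish.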
Contribute // sourced, useful, investigation-focused
Welcome: case files (use the template), corrections with sources, regulatory updates from primary legal texts, framework and tool entries with honest coverage assessments, and plain-language rewrites. Not welcome: marketing, speculation presented as fact, non-public material. Full standards in CONTRIBUTING.md. Corrections are the most valuable contribution of all.
About // who maintains this
Maintained by ESI: security operations, investigations, and crisis management across international law enforcement, NGOs, and the technology sector; now working on AI incident investigations. This project exists because traditional investigations and OSINT matured through openly shared practitioner resources, and AI incident investigation has no equivalent yet.
Contact: [email protected] · GitHub repository · LinkedIn
If you have dealt with an AI incident — as responder, counsel, insurer, or target — I want to hear how it actually went. Confidential conversations welcome; only the public record is ever published here.