Auditing the black box: who's accountable when an autonomous agent acts?
When an agent takes an action, auditors ask two things: can you reconstruct why, and who's accountable? Here's the audit trail SOC 2 and the EU AI Act expect.
When an autonomous agent takes an action in production — files a change, grants an entitlement, moves money, deploys code — an auditor asks two questions, and neither is about the model. First: can you reconstruct why it did that? Second: who is accountable for the decision? If the honest answer to either is a shrug at a black box, you fail the review — not because the agent was wrong, but because you can’t prove it was right.
This is engineering guidance, not legal advice — your compliance counsel has the final word on what any framework requires of your system. But the gaps below are the ones that fail an audit, and they are all infrastructure.
Every framework reduces to the same two questions
SOC 2, the EU AI Act, and an internal risk review use different language, but they all ask the same two things: can you reconstruct it, and who owns it? A SOC 2 examination under the Trust Services Criteria tests, among other things, that you monitor your system components and can produce evidence the controls actually operated (the CC7 “system operations” criteria). The EU AI Act, for systems it classifies as high-risk, turns that into a statutory requirement: Article 12 mandates automatic logging over the system’s lifetime, and Article 14 mandates effective human oversight (EU AI Act, Art. 12; Art. 14). Same two questions, escalating consequences.
An agent is uniquely bad at answering them by default. That’s the problem.
Why an agent is a black box by default
A traditional service does roughly what its code says. An agent decides, at runtime, which tools to call and in what order — and it does not leave behind a record of why unless you built one. As ISACA’s audit guidance notes, agentic AI often “does not offer human-readable reasoning unless explicitly programmed to log it,” leaving auditor and operator alike “navigating a black box, increasing the risk of unchecked behavior, legal exposure, and reputational damage” (ISACA, The Growing Challenge of Auditing Agentic AI, Sept 2025).
The same guidance lists what an agent can do without a formal approval step in the loop: chain together tools and APIs dynamically, generate and deploy code, create new identities to execute tasks, make access decisions in real time, and modify infrastructure in real time. Every one of those is an auditable event in any other system. Wired through an agent, they happen at machine speed with no paper trail — unless the trail is something you engineered.
What auditors ask, mapped to an agent
| What the audit checks | What it means for your agent | Common fail |
|---|---|---|
| Reconstruct a specific action | Replay the trigger, retrieved context, tool calls, and the action from logs alone | Only a chat transcript survives; the tool calls and their arguments weren’t recorded |
| Traceability over time | Automatic, durable logs across the system’s lifetime, not just this week’s traces | Traces expire in 7 days; no correlation ID ties a decision to its effects |
| Human oversight | A person can inspect, override, and halt the agent — and evidence shows they can | No stop function; “override” means redeploying |
| Access attribution | The agent’s action is attributable to an initiating user and a scoped identity | The agent acts as one shared service account; you can’t say on whose behalf |
| Named accountability | A specific human owns each class of agent decision | “The AI team owns it” (i.e., no one) |
| Change of behavior | A prompt or model change is logged, reviewed, and reversible | Prompts edited in a console with no version, no diff, no approver |
None of these grades the model. All of them are about the system and the operating discipline around it — the same Observability and Ownership dimensions of the Production-Readiness Bar, viewed through an auditor’s lens.
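To make the "Change of behavior" row concrete, here is a minimal sketch of a prompt registry in which every change carries a version, a content digest, an author, and a separate approver, and a rollback is itself a logged change. The names `PromptVersion` and `PromptRegistry` are hypothetical, the author-is-not-approver rule is an assumed policy, and an in-memory list stands in for durable storage.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVersion:
    version: int
    text: str
    author: str
    approver: str
    digest: str  # SHA-256 of the prompt text, so a diff can be verified later


class PromptRegistry:
    """Hypothetical in-memory registry; history is only ever appended to."""

    def __init__(self) -> None:
        self._history: list[PromptVersion] = []

    def publish(self, text: str, author: str, approver: str) -> PromptVersion:
        # Assumed policy: a change needs a reviewer who is not its author.
        if not approver or approver == author:
            raise ValueError("a change needs an approver other than its author")
        digest = hashlib.sha256(text.encode()).hexdigest()
        entry = PromptVersion(len(self._history) + 1, text, author, approver, digest)
        self._history.append(entry)
        return entry

    def current(self) -> PromptVersion:
        return self._history[-1]

    def rollback(self, to_version: int, author: str, approver: str) -> PromptVersion:
        if not 1 <= to_version <= len(self._history):
            raise ValueError("unknown version")
        # Reversal republishes the old text as a new version, so the
        # rollback itself shows up in the history with its own approver.
        return self.publish(self._history[to_version - 1].text, author, approver)
```

Because history is never rewritten, the registry can answer "what prompt was live when this action happened, and who approved it" for any point in time it covers.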
What a real agent audit trail records
A chat log is not an audit trail. An audit trail is a structured, append-only record of the decision, emitted by the agent as it acts, with enough fidelity to reconstruct a single action months later without re-running it. Concretely, per action:
- The trigger — what initiated it, and on whose behalf (the initiating user, not just the agent).
- The context — what was retrieved or read into the decision, so you can see what the agent actually “knew.”
- Each tool call — the tool, its arguments, and its result, in order. This is the part a plain transcript loses.
- The action and its effect — what changed, with a correlation ID that links the decision to the downstream state change.
- The approval — for any gated action, which human approved it, when, and on what evidence.
- Time and identity — timestamps and the scoped identity the agent used, so the whole chain is attributable.
This is exactly the shape of record the EU AI Act’s Article 12 is reaching for when it requires high-risk systems to enable “the automatic recording of events (logs) over the lifetime of the system” (Art. 12). You don’t get it by turning on a tracing SDK; you get it by treating the decision log as a first-class output of the agent. The pattern that makes it enforceable is the same one behind a machine-verified gate on agent-written changes: the agent proposes and records; a gate and a human decide; the record is the evidence.
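As a minimal sketch of the per-action record listed above, the Python below models one decision as a structured event and chains entries by hash so that later edits are detectable. The names (`DecisionRecord`, `ToolCall`, `AuditLog`) and the SHA-256 chaining are illustrative assumptions, not a format that SOC 2 or the EU AI Act prescribes, and the in-memory list stands in for durable, access-controlled storage.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ToolCall:
    tool: str
    arguments: dict
    result: str


@dataclass
class DecisionRecord:
    correlation_id: str            # links the decision to the downstream change
    initiating_user: str           # on whose behalf the agent acted
    agent_identity: str            # scoped identity the agent used
    trigger: str
    context_refs: list[str]        # what was retrieved into the decision
    tool_calls: list[ToolCall]     # in call order
    action: str
    effect: str
    approved_by: Optional[str] = None  # None for actions that are not gated
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class AuditLog:
    """Append-only log; each entry's hash covers the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, record: DecisionRecord) -> str:
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        payload = json.dumps(asdict(record), sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self._entries.append(
            {"prev": prev_hash, "payload": payload, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        # Recompute the chain from the start; any edited entry breaks it.
        prev_hash = self.GENESIS
        for entry in self._entries:
            expected = hashlib.sha256(
                (prev_hash + entry["payload"]).encode()
            ).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

Hash chaining makes tampering detectable, not impossible. In practice the chain head would also need to be anchored somewhere the agent's own credentials cannot write.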
Accountability is a control, not an org-chart footnote
Logging tells you what happened. It does not tell you who answers for it. That is a separate control, and it is the one teams most often skip because it isn’t code.
ISACA frames the open question exactly: “Who owns an AI decision? If an agent takes autonomous action, who is accountable?” (ISACA, Sept 2025). The EU AI Act answers part of it structurally for high-risk systems: Article 14 requires that a human be able to oversee the system, “intervene in the operation… or interrupt the system through a ‘stop’ button,” and “decide… not to use” or to “disregard, override or reverse the output” (Art. 14). A stop function nobody is named to press is not oversight. Accountability has to resolve to a specific person, decided before the incident — the Ownership bar: a named human who runs the agent, knows what to watch, and can halt it. Autonomy doesn’t dissolve accountability; it raises the cost of not having assigned it.
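One way to picture a stop function that resolves to a person is the sketch below: a switch that cannot be created without a named owner, records who halted the agent and why, and is checked before every action. `StopSwitch`, `AgentHalted`, and `run_gated` are hypothetical names, and a real deployment would keep this flag in shared storage that every agent worker reads, not in process memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class AgentHalted(RuntimeError):
    """Raised when an action is attempted after the agent has been halted."""


@dataclass
class StopSwitch:
    owner: str                         # the named human accountable for halting
    halted_by: Optional[str] = None
    halted_at: Optional[str] = None
    reason: Optional[str] = None

    def __post_init__(self) -> None:
        # Refuse to exist without a named owner.
        if not self.owner.strip():
            raise ValueError("a stop switch needs a named owner")

    def halt(self, by: str, reason: str) -> None:
        # Record who pressed stop, when, and why, as evidence for review.
        self.halted_by = by
        self.halted_at = datetime.now(timezone.utc).isoformat()
        self.reason = reason

    def check(self) -> None:
        if self.halted_by is not None:
            raise AgentHalted(f"halted by {self.halted_by}: {self.reason}")


def run_gated(switch: StopSwitch, action: Callable[[], T]) -> T:
    # Every agent action passes through the switch before it executes.
    switch.check()
    return action()
```

The point of the design is that the halt itself leaves evidence: the reviewer can see that the stop function exists, who owns it, and that it was exercised.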
The gap is measurable — and correlated with incidents
This isn’t a hypothetical. Documented AI incidents are climbing: the AI Incident Database recorded 362 in 2025, up from 233 in 2024, per Stanford’s 2026 AI Index — roughly a 55% year-over-year rise. And the control gaps track the harm. In IBM’s 2025 Cost of a Data Breach report, 13% of organizations reported a breach of an AI model or application — and of those, 97% reported not having AI access controls in place, while 63% either had no AI governance policy or were still developing one (IBM, July 2025).
Read those as correlation, not proof of cause — they describe organizations that were breached, not a controlled experiment. But the direction is not subtle: the shops getting hurt are disproportionately the ones that deployed AI without the access controls, logging, and governance that make an agent auditable in the first place. The audit trail isn’t just paperwork for the examiner; it’s the same instrumentation that lets you notice, attribute, and stop a bad trajectory before it becomes the incident.
The work
Making an agent auditable is not a policy document. It’s engineering: structured decision logs the agent emits as it acts, retention that outlives an incident, scoped identities so actions are attributable, a real stop function, versioned prompts and models, and a named owner accountable for each class of decision. That is the same ground a compliance review walks — the HIPAA version of this argument reaches the same place from a different regulation — and it’s the Observability and Ownership half of the production-readiness bar made concrete.
If you’re heading into a SOC 2 or regulatory review with an agent you can’t fully reconstruct, that’s the production-readiness audit: a fixed-scope assessment against the bar — including the auditability and ownership gaps above — with a prioritized risk register and a scoped path to close them.
Questions this raises
Straight answers.
- What do auditors actually ask about an autonomous AI agent?
  - Two things, and neither is about the model. First, can you reconstruct a specific action: what triggered it, what data the agent saw, which tools it called, and what it did? Second, who is the named human accountable for that decision? A SOC 2 examination tests whether you monitor the system and can produce the evidence; the EU AI Act, for high-risk systems, requires the logs and effective human oversight in law. If the agent is a black box, you can't answer either.
- Does an LLM agent log enough to pass an audit by default?
  - Usually not. As ISACA puts it, agentic AI often "does not offer human-readable reasoning unless explicitly programmed to log it." A chat transcript is not an audit trail. You have to instrument the agent to record the decision (the trigger, the retrieved context, each tool call and its arguments, the action taken, and the human who approved it) as structured, tamper-evident events. That instrumentation is engineering work you do on purpose, not a byproduct you get for free.
- What does the EU AI Act require for logging and oversight?
  - For systems classified high-risk, Article 12 requires that the system technically enable "the automatic recording of events (logs) over the lifetime of the system" so that operation is traceable. Article 14 requires the system be designed so it can be "effectively overseen by natural persons," including the ability to intervene or interrupt it through a stop function and to disregard, override, or reverse its output. Those obligations apply from 2 August 2026 for most high-risk systems. Whether your system is high-risk is a legal determination; this is engineering guidance, not legal advice.
- Who is accountable when an agent makes a bad call?
  - Accountability has to resolve to a named person before the incident, not be assigned during it. "The AI team owns it" means no one owns it. A production agent needs a specific human who runs it, knows what to watch, can stop it, and answers for its decisions. That is the Ownership dimension of the production-readiness bar. Autonomy doesn't remove accountability; it just makes the absence of a named owner more expensive.