We let AI write our production infrastructure. Here's the gate that stops it deploying.

Multi-agent AI writes the CDK that runs ringmod.ai. A machine-verified safety gate — not trust — decides what reaches the AWS account. Here's the gate.

A production-safety gate is the set of automated, machine-verified checks that stand between an AI agent proposing an infrastructure change and that change reaching your cloud account. The agent’s job is to write and verify. The gate’s job is to refuse anything that lacks evidence. A human approves the apply. Nothing about that is trust — it’s all checkable.

This site is built and continuously deployed by exactly that system: AWS CDK in TypeScript, written with multi-agent orchestration, shipped through a gate that blocks any apply without evidence. So instead of describing it abstractly, here’s the real one — every claim below is verifiable against the public repository.

Why “should we let the agent apply it?” is the wrong question

The interesting capability isn’t that an AI can write Terraform or CDK — it can, and well. The risk was never the authoring. It’s the apply. An ungoverned agent with credentials to your production account is one confidently-wrong change away from an outage or a data-exposure incident, and it will narrate the mistake just as fluently as the fix.

So the useful question isn’t “do we trust the agent?” It’s “what has to be true before any change — agent-written or not — is allowed to touch production?” Answer that, encode the answer in machine-verified gates, and the source of the change stops mattering. You get agent-scale velocity on infrastructure your platform and security leads will actually approve, because the controls don’t depend on who or what wrote the diff.

What actually touches our AWS account

Almost nothing with standing power. There are no long-lived AWS access keys anywhere in the system. CI authenticates to AWS through a GitHub OIDC provider and assumes a scoped deploy role for the duration of a single run — short-lived, auditable, revocable. The agent operates on code and proposes changes; the credential that can change cloud state is keyless, time-bound, and narrow.
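As a rough illustration of what "keyless, time-bound, and narrow" can look like, here is a minimal sketch of the trust policy such a deploy role might carry. The account ID, repository name, and branch are hypothetical placeholders, and the helper `githubOidcTrustPolicy` is invented for this example; it is not taken from the RingMod repository. The condition keys and the `sts:AssumeRoleWithWebIdentity` action are the standard ones for GitHub's OIDC provider.

```typescript
// Sketch only: account ID, repo, and branch below are hypothetical placeholders.
interface TrustStatement {
  Effect: "Allow";
  Principal: { Federated: string };
  Action: "sts:AssumeRoleWithWebIdentity";
  Condition: {
    StringEquals: Record<string, string>;
    StringLike: Record<string, string>;
  };
}

interface TrustPolicy {
  Version: "2012-10-17";
  Statement: TrustStatement[];
}

const GITHUB_OIDC_HOST = "token.actions.githubusercontent.com";

// Builds a trust policy that only a workflow run from one repo and branch can satisfy.
function githubOidcTrustPolicy(accountId: string, repo: string, branch: string): TrustPolicy {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: {
          Federated: `arn:aws:iam::${accountId}:oidc-provider/${GITHUB_OIDC_HOST}`,
        },
        Action: "sts:AssumeRoleWithWebIdentity",
        Condition: {
          // The token must be issued for AWS STS...
          StringEquals: { [`${GITHUB_OIDC_HOST}:aud`]: "sts.amazonaws.com" },
          // ...and must come from this repository and branch.
          StringLike: { [`${GITHUB_OIDC_HOST}:sub`]: `repo:${repo}:ref:refs/heads/${branch}` },
        },
      },
    ],
  };
}

const exampleTrustPolicy = githubOidcTrustPolicy("123456789012", "example-org/example-repo", "main");
console.log(JSON.stringify(exampleTrustPolicy, null, 2));
```

There is no secret in this document to leak: the only thing that can assume the role is a token GitHub issues to a matching workflow run, and the resulting session expires on its own.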

That one design choice removes the largest category of “AI touched prod and something bad happened” failure modes before any of the other gates run.

The seven gates every change passes

Every change — including the one that published this page — passes these, in code, on every run. Any failure stops the line.

| Gate | Tool | What it enforces |
| --- | --- | --- |
| Types | tsc --noEmit | Compile-time correctness; CI fails on any error |
| Lint | ESLint | Style and footgun checks across the infrastructure code |
| Template assertions | aws-cdk-lib/assertions | 23 tests asserting the synthesized CloudFormation is what we intend |
| Policy-as-code | cdk-nag (AwsSolutions) | Synth fails on any finding; current status: 0 non-compliant |
| Plan review | cdk diff on every PR | A human-reviewable change set before anything is applied |
| Least-privilege deploy | Scoped exec policy + permissions boundary | No path from “deploy this repo” to AWS administrator |
| Keyless CI | GitHub OIDC | No long-lived AWS access keys exist anywhere |
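The "any failure stops the line" rule is simple enough to model in a few lines. The sketch below is a hypothetical stand-in, not the project's workflow: the gate names mirror the table, but each `check` is a placeholder for a real CI step, and `runGates` and `mayApply` are names invented for this example.

```typescript
// Hypothetical model of the gate sequence; each check stands in for a real CI step.
interface Gate {
  name: string;
  check: () => boolean; // true means the gate produced passing evidence
}

interface GateReport {
  passed: string[];
  failedAt: string | null; // first failing gate, or null if every gate passed
}

function runGates(gates: Gate[]): GateReport {
  const passed: string[] = [];
  for (const gate of gates) {
    if (!gate.check()) {
      // Stop immediately: gates after a failure never run.
      return { passed, failedAt: gate.name };
    }
    passed.push(gate.name);
  }
  return { passed, failedAt: null };
}

// An apply needs both complete evidence and a human approval.
function mayApply(report: GateReport, humanApproved: boolean): boolean {
  return report.failedAt === null && humanApproved;
}

const gateReport = runGates([
  { name: "Types", check: () => true },
  { name: "Lint", check: () => true },
  { name: "Policy-as-code", check: () => false }, // simulated policy finding
  { name: "Plan review", check: () => true },
]);
console.log(gateReport, mayApply(gateReport, true));
```

Note that human approval alone is never sufficient in this model: with a failed gate, `mayApply` returns false no matter who signs off.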

Policy-as-code is the one people underestimate. cdk-nag runs on every cdk synth and fails the synthesis on any finding, so a non-compliant change can’t even be packaged for deploy, let alone applied. Every rule result is either compliant or suppressed with a written engineering justification, and the reports are regenerated into docs/compliance/ on each synth.
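To make "compliant or suppressed with a written justification" concrete, here is a small sketch of that rule as a pure function. The `Finding` shape and both helper names are hypothetical and do not reflect cdk-nag's own report format or API; the sample findings are invented for illustration.

```typescript
// Hypothetical shapes for illustration; this is not cdk-nag's report format.
interface Finding {
  ruleId: string;
  resource: string;
  compliant: boolean;
  suppressionReason?: string; // written justification, if the finding is suppressed
}

// A finding is unresolved when it is non-compliant and has no written justification.
function unresolvedFindings(findings: Finding[]): Finding[] {
  return findings.filter(
    (f) => !f.compliant && (f.suppressionReason ?? "").trim().length === 0,
  );
}

// Mirrors "synth fails on any finding": throw rather than warn.
function assertSynthMayProceed(findings: Finding[]): void {
  const unresolved = unresolvedFindings(findings);
  if (unresolved.length > 0) {
    const ids = unresolved.map((f) => `${f.ruleId} on ${f.resource}`).join(", ");
    throw new Error(`Synthesis blocked by unresolved findings: ${ids}`);
  }
}

// Invented sample data.
const sampleFindings: Finding[] = [
  { ruleId: "EXAMPLE-1", resource: "SiteBucket", compliant: true },
  { ruleId: "EXAMPLE-2", resource: "DeployRole", compliant: false, suppressionReason: "Wildcard is limited to this stack's own log groups." },
  { ruleId: "EXAMPLE-3", resource: "Distribution", compliant: false },
];
console.log(unresolvedFindings(sampleFindings).map((f) => f.ruleId));
```

An empty or whitespace-only reason counts as no justification here, which is the point: a suppression has to say why.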

Bootstrap doesn’t grant admin

This is the load-bearing control, and it’s the one most setups get wrong by default.

The default AWS CDK bootstrap attaches AdministratorAccess to the CloudFormation execution role — the role that creates every resource. We don’t accept that. The execution role is replaced with a policy scoped to this project’s exact service surface, and it can only create IAM roles that carry a permissions boundary. That boundary is applied to every role in the application at synth time, through a CDK Aspect (PermissionsBoundary.of(app).apply(...) in infra/bin/ringmod.ts).
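One common way to express "can only create IAM roles that carry a permissions boundary" is an `iam:PermissionsBoundary` condition on role creation. The sketch below shows that pattern as a plain object plus a toy evaluator. The ARNs are placeholders, the function names are invented, and this is not the contents of the project's cfn-exec-policy.json.

```typescript
// Sketch only: placeholder ARNs; not the project's actual execution policy.
interface BoundedCreateRoleStatement {
  Sid: string;
  Effect: "Allow";
  Action: string[];
  Resource: string;
  Condition: { StringEquals: { "iam:PermissionsBoundary": string } };
}

// Allows role creation only when the request attaches the given boundary policy.
function boundedRoleCreation(accountId: string, boundaryArn: string): BoundedCreateRoleStatement {
  return {
    Sid: "CreateRolesOnlyWithBoundary",
    Effect: "Allow",
    Action: ["iam:CreateRole", "iam:PutRolePermissionsBoundary"],
    Resource: `arn:aws:iam::${accountId}:role/*`,
    Condition: { StringEquals: { "iam:PermissionsBoundary": boundaryArn } },
  };
}

// Toy evaluator for this single condition; real IAM evaluation has many more rules.
function allowsRoleCreation(
  statement: BoundedCreateRoleStatement,
  requestedBoundaryArn?: string,
): boolean {
  return requestedBoundaryArn === statement.Condition.StringEquals["iam:PermissionsBoundary"];
}

const exampleBoundaryArn = "arn:aws:iam::123456789012:policy/example-permissions-boundary";
const createRoleStatement = boundedRoleCreation("123456789012", exampleBoundaryArn);

console.log(allowsRoleCreation(createRoleStatement, exampleBoundaryArn)); // true
console.log(allowsRoleCreation(createRoleStatement)); // false: no boundary attached
```

A complete policy would also need to prevent the boundary from being removed or edited after the role exists; this fragment covers only the creation path.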

The consequence: even a mis-authored stack policy — whether a human or an agent wrote it — cannot mint administrator access. There is no path from “deploy this repo” to admin. The gate isn’t a promise; it’s the absence of the capability.

The two times the guardrails caught something real

Both of these happened during this build. Both are exactly the failure modes a default AdministratorAccess bootstrap would have waved through silently.

  1. The permissions boundary was too strict — and said so. The first foundation deploy failed because the boundary denied iam:CreateOpenIDConnectProvider, which the GitHub OIDC provider legitimately needs. CloudFormation rolled the stack back cleanly. The fix was a deliberate boundary correction, re-verified and re-deployed before anything shipped. The guardrail did its job before production, not after.

  2. Least-privilege caught a case-sensitivity bug. Updating the CI deploy role’s trust policy failed with an explicit AccessDenied on iam:UpdateAssumeRolePolicy. The policy was scoped to role/RingMod*, but the role is named ringmod-github-deploy, and IAM ARN matching is case-sensitive, so the pattern never matched. The policy was genuinely too tight; the failure told us exactly how. We corrected the pattern to match the real role name and recovered with continue-update-rollback.

That second one is the whole thesis in miniature: least-privilege that actually constrains will occasionally tell you it’s too tight, with a precise error. Least-privilege as a slogan never does, because it was never really constraining anything.
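The second incident is easy to reproduce on paper. The sketch below approximates IAM-style wildcard matching (`*` for any run of characters, `?` for one character) with a case-sensitive regular expression. The function `iamPatternMatches` and the account ID are illustrative assumptions, and the real policy engine does more than this.

```typescript
// Approximation of IAM resource wildcard matching; illustrative, not AWS's implementation.
function iamPatternMatches(pattern: string, value: string): boolean {
  // Escape regex metacharacters, leaving the IAM wildcards * and ? untouched.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  // Translate the IAM wildcards into their regex equivalents.
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  // No "i" flag: the comparison is deliberately case-sensitive.
  return new RegExp(`^${source}$`).test(value);
}

const deployRoleArn = "arn:aws:iam::123456789012:role/ringmod-github-deploy";

console.log(iamPatternMatches("arn:aws:iam::123456789012:role/RingMod*", deployRoleArn)); // false
console.log(iamPatternMatches("arn:aws:iam::123456789012:role/ringmod-*", deployRoleArn)); // true
```

The first pattern looks right to a human eye and matches nothing, which is exactly why the explicit AccessDenied was useful.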

So how do you let an AI agent touch a production account safely?

Not by trusting the agent. By making the trust unnecessary:

  • Remove standing power. No long-lived keys. Short-lived, scoped, keyless credentials per run.
  • Scope the blast radius. The deploy identity gets a specific service surface, never AdministratorAccess. A permissions boundary on every role caps what any role in the stack can do, so escalation to administrator is structurally impossible.
  • Demand evidence, mechanically. Types, lint, template assertions, and policy-as-code that fails the build — not a checklist someone promises they ran.
  • Keep the plan human-reviewable. A cdk diff on every PR. The apply requires a person.
  • Make every claim verifiable. If you can’t point at the file that enforces a control, you don’t have the control.

The agent gets to be fast. The account stays safe. Those aren’t in tension once the gate is real.

Verify it yourself

Don’t take it on faith — that would defeat the point. The seven gates map to real CI steps in .github/workflows/deploy.yml. The scoped policy and boundary live in infra/bootstrap/ as cfn-exec-policy.json and permissions-boundary.json. The boundary is applied app-wide in infra/bin/ringmod.ts. The cdk-nag reports sit in docs/compliance/ with 0 non-compliant findings, regenerated each synth. The how-this-was-built page walks the same ground with links.

Questions this raises

Straight answers.

Can AI safely write production infrastructure code?
Writing it, yes: AI is good at producing Terraform or CDK. The risk isn’t the writing, it’s the applying. Safe means the generated change passes machine-verified gates (types, policy-as-code, a reviewable plan, least-privilege credentials) before a human approves the apply. The agent proposes; verification and policy decide; a person presses the button.

What stops an AI agent from making a destructive change to my cloud account?
Three things, in order: it has no long-lived credentials (CI authenticates via short-lived OIDC), the deploy role is scoped to a specific service surface rather than AdministratorAccess, and a permissions boundary is applied to every role at synth time so even a mis-authored policy cannot escalate to admin. The agent literally cannot mint the access required to do broad damage.

Do you let the agent run cdk deploy or terraform apply on its own?
No. Every change produces a human-reviewable plan (cdk diff) on a pull request, and the apply requires human approval. The agent’s leverage is in proposing and verifying changes fast; the apply stays gated. That single boundary is the difference between agent-scale velocity and an unattended apply your security lead would never sign off on.
