The Pull Request Is a Human Artifact

Core Questions

  • How does code review scale when agents write 275 million commits per week?
  • What replaces the pull request in an agentic workflow?
  • How do you build verifiable trust without human reviewers?

Pull requests are a coordination mechanism for humans. That model doesn't scale when the author is an agent.

Kyle Daigle
@kdaigle

There were 1 billion commits in 2025. Now it's 275 million per week, on pace for 14 billion this year if growth stays linear (spoiler: it won't).

GitHub Actions went from 500M minutes/week in 2023 → 1B in 2025 → 2.1B this week.

April 2026

That's not growth. That's a different kind of thing entirely.

PRs were designed for a world where a human writes some code, opens a PR, another human reads it, and someone approves or pushes back. That whole model assumes the author is a person, that there aren't too many PRs to read carefully, and that the bottleneck of human review is acceptable. None of that holds anymore.

The wrong answer is “review faster”

Most AI code review tools today bolt AI onto the existing PR workflow. The agent writes code, opens a PR, and now there's an AI reviewer in the comments alongside the humans. That's a fine transitional step. It's not the end state.

If agents are writing code at 100x human velocity, the goal shouldn't be to review 100x faster. The goal is to build a system where review happens automatically — where code can't land on main unless it's been verified, not approved.

Verification and approval are different things. Approval is a human saying “I read this and it seems fine.” Verification is a system asserting “these specific checks passed.” One scales, one doesn't.

What the system looks like

Instead of a human reviewing a PR, you have a set of specialized agents that each evaluate the change from one specific angle. Security. Architecture. Tenancy isolation. Dependency risk. Rollout safety. If all of them pass, the change merges — automatically, with no human in the loop.

The important thing here is that these aren't generic “AI reviewers.” They're narrow. A tenancy agent doesn't have opinions about code style — it's specifically checking whether every database query is scoped by org_id. An architecture agent isn't reading the whole diff — it's enforcing the patterns your team has defined. The specificity is the point.
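To make the narrowness concrete, here is a deliberately toy sketch of a tenancy check. The SQL keywords and the `org_id` convention are illustrative assumptions, not a real schema; a real agent would parse the code rather than grep it. The point is the narrowness of the contract, not the implementation.

```shell
# A deliberately narrow reviewer sketch: no opinions about style;
# it only asks "does every added query scope by org_id?"
tenancy_check() {
  # $1: path to a unified diff. Fails when an added line looks like
  # a query but never mentions org_id.
  violations=$(grep '^+' "$1" \
    | grep -iE 'SELECT|UPDATE|DELETE' \
    | grep -iv 'org_id' || true)
  if [ -n "$violations" ]; then
    echo "tenancy-agent: FAIL"
    echo "$violations"
    return 1
  fi
  echo "tenancy-agent: PASS"
}
```

Because the contract is this small, the check's verdict is easy to trust, easy to test, and easy to attach to a commit as a pass/fail signal.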

The other thing that's important: when something slips through, you don't just fix the bug. You encode the class of failure into a new check. The system gets harder to break over time. Human review doesn't work that way — a reviewer who missed a tenancy bug in March doesn't automatically catch the same pattern in October. A check does.
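One way to make "encode the class of failure" concrete is to keep the checks as data, so responding to an incident is a one-line append rather than a new tool. A sketch, where the rules file format and the example incidents are invented:

```shell
# Each past incident becomes one line in a rules file:
# "<regex><TAB><reason>". A bug class missed in March stays caught
# in October because it lives in the file, not in a reviewer's
# memory. File name, format, and rules are invented examples.
run_rules() {
  # $1: file of added lines to scan; $2: rules file
  failed=0
  while IFS="$(printf '\t')" read -r pattern reason; do
    if grep -qE "$pattern" "$1"; then
      echo "BLOCKED: $reason (matched: $pattern)"
      failed=1
    fi
  done < "$2"
  return "$failed"
}
```

The postmortem action item becomes "append a rule," and the rule runs on every subsequent change forever.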

The right model — and why it doesn't exist yet

Here's what this should look like in a world where the tooling catches up.

Git already supports commit signing — you can sign a commit with a GPG or SSH key to prove it came from a specific identity. The natural extension for agentic review is that each review agent signs off on the commit too. The security agent runs, finds no issues, and cryptographically attests that it approved this specific commit. Same for the architecture agent, the tenancy agent, and so on.

Then branch protection would say: this commit cannot merge unless it carries signatures from security-agent, tenancy-agent, and architecture-agent. You'd get a verifiable, cryptographic proof that every required check ran and approved — not just that CI passed, but that specific identities signed off.

That feature doesn't exist. Git supports one signature per commit. GitHub's branch protection has no concept of requiring signatures from a set of identities. Someone needs to build this — either as a Git extension, a GitHub feature, or a layer on top using something like Sigstore. The open questions are real: what does an agent attestation contain, who holds the keys, how do you revoke a compromised agent, how do you version agents as your threat model evolves.

In the meantime, you can get to most of this model with what exists today.

How to approximate this today

The key thing to understand: CI is not where the review happens. CI is where you verify that the review already happened.

The coding agent writes code on a branch. Before it pushes, each review agent runs against the diff and produces a signed attestation — cryptographic proof that it ran and what it found. The security agent attests “I reviewed this commit and found no vulnerabilities.” The tenancy agent attests “every query I found is correctly scoped.” Those attestations are stored and attached to the commit SHA.
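Mechanically, an attestation can be as small as a signed text file bound to the commit SHA. In this sketch, plain OpenSSH signatures (`ssh-keygen -Y sign`) stand in for Sigstore; the attestation fields, file names, and the `review` namespace are all invented:

```shell
# The security agent writes a tiny attestation document and produces
# a detached signature over it. The format is a sketch, and
# ssh-keygen -Y sign is a stand-in for Sigstore/cosign signing.
work=$(mktemp -d)
sha="0123456789abcdef0123456789abcdef01234567"  # stand-in commit SHA

ssh-keygen -q -t ed25519 -N "" -f "$work/security-agent" -C security-agent

cat > "$work/security-agent.att" <<EOF
commit: $sha
agent: security-agent
verdict: pass
EOF

# Detached signature in a dedicated namespace; this produces
# security-agent.att.sig alongside the attestation.
ssh-keygen -Y sign -f "$work/security-agent" -n review \
  "$work/security-agent.att" 2>/dev/null
ls "$work/security-agent.att.sig"
```

Because the commit SHA is inside the signed document, the attestation can't be replayed against a different change.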

Then the coding agent pushes. CI runs. But CI isn't doing the review — it's checking that the right attestations exist and are valid. If any are missing or invalid, the merge is blocked. The branch protection rule is: “before this can merge, I need to see valid attestations from these specific agent identities.”

In practice today you approximate this with required status checks. Turn off required human reviews. Use Sigstore / cosign to sign attestations and attach them to the commit. Wire CI jobs to verify those attestations exist. All required attestations present and valid? The bot merges. Any missing? The check fails with a clear explanation of which agent didn't sign off.
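Here is a sketch of that gate, with OpenSSH signatures standing in for Sigstore and a setup step playing the agents' part. Every name, path, and file format is invented:

```shell
# CI does not review; it checks that every required agent identity
# produced a valid signature over this commit's attestation.
# Setup fakes two agents so the gate has something to verify;
# ssh-keygen -Y is a stand-in for Sigstore/cosign.
work=$(mktemp -d)
sha="0000000000000000000000000000000000000001"  # stand-in commit SHA
required="security-agent tenancy-agent"

# --- setup: what the review agents would have done before the push ---
: > "$work/signers"
for agent in $required; do
  ssh-keygen -q -t ed25519 -N "" -f "$work/$agent" -C "$agent"
  printf '%s %s\n' "$agent" "$(awk '{print $1" "$2}' "$work/$agent.pub")" \
    >> "$work/signers"
  printf 'commit: %s\nagent: %s\nverdict: pass\n' "$sha" "$agent" \
    > "$work/$agent.att"
  ssh-keygen -Y sign -f "$work/$agent" -n review "$work/$agent.att" 2>/dev/null
done

# --- the gate: all required attestations present and valid? ---
gate() {
  for agent in $required; do
    att="$work/$agent.att"
    if [ ! -f "$att.sig" ] || ! ssh-keygen -Y verify -f "$work/signers" \
         -I "$agent" -n review -s "$att.sig" < "$att" > /dev/null 2>&1; then
      echo "BLOCKED: no valid attestation from $agent"
      return 1
    fi
  done
  echo "MERGE: all required agents signed off on $sha"
}
gate
```

Delete any one signature and the gate blocks, naming the agent that didn't sign off — which is exactly the failure message you want a merge bot to surface.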

It's not elegant. The multi-identity branch protection feature doesn't exist yet, so you're building it on top of the existing primitives. But the logic is sound and the security properties are real.

Where humans still fit

Early on, you're still reviewing — but you're reviewing the agents, not the code. Is the tenancy agent catching everything it should? Did the security agent miss a new attack pattern? You're debugging the system, not the diff.

Over time, as agents prove out, human review becomes exception-based. You step in when something falls outside any agent's coverage, or when an agent flags low confidence. The rest merges automatically.

The long-term version isn't “humans no longer review code.” It's “humans govern the review system.” You design the agents, set the policies, audit outcomes. You're not reading diffs. You're running the machine that reads diffs.

The accountability question

When an agent commits code that causes an incident, people ask: who's responsible? This is the question that will actually block teams from adopting this model, more than any technical limitation.

The honest answer: whoever deployed the agent and whoever decided it was a required check. The same way we think about automated tests — if a bug escaped because a test didn't cover it, we don't blame the test runner. We blame the team that didn't write the test. The agent is the test.

What helps here is that the pipeline produces an auditable trail: what ran, what passed, what merged, when. That's actually better than human review, which mostly produces a comment thread that nobody reads six months later.

The pull request isn't going away tomorrow. But it's already the wrong abstraction for how code is being written. The teams that figure out verification-first review — composable, enforceable, and continuously improving — are going to ship faster with more confidence than anyone still running humans through a 275M commit/week firehose.

You can start building that system today.
