Agentic PR Flood Is Breaking Multi-Repo Code Review

The number that changes the conversation

In late June 2026, Addy Osmani published "Agentic Code Review" on O'Reilly Radar — and buried inside it was a Faros AI dataset that deserves more attention than it has received. Faros instrumented 22,000 developers across 4,000 teams and tracked what happened as those teams moved from low to high AI adoption. The headline metrics on throughput were positive: developers merged more PRs and completed more work. Then came the rest of the report.

Median PR review duration is up 441.5%. PRs merged with zero review are up 31.3%. The per-developer defect rate climbed from 9% to 54%. Nobody on those teams chose to stop reviewing. Reviewers simply couldn't keep pace with the volume of incoming diffs, so code began merging unread — and that became the new normal.

This is not a story about bad engineers. It is a story about a workflow that was never designed to absorb agent-scale output.

Why the old review model is structurally broken

Code review used to be self-regulating. A senior engineer could read code faster than a junior could write it, which meant the review queue stayed manageable without anyone designing it to. Coding agents removed that speed advantage overnight. The diff arrives instantly; the context required to evaluate it does not.

Osmani's framing of the underlying problem is sharp: when an agent writes a PR, its reasoning (the thinking traces, the tradeoffs it weighed) is discarded the moment the diff is produced. The reviewer is left reconstructing intent from code alone, which is slower and harder than checking reasoning that sits in front of you. That asymmetry goes a long way toward explaining the 441.5% figure.

A companion 2026 research paper studying 33,707 agent-authored PRs found that around 28% merge almost instantly. The rest fare worse: agents tend to abandon the back-and-forth that review actually requires as soon as subjective feedback appears, and reviewer abandonment accounted for 38% of rejected agent PRs. Volume is high, and the quality of any individual PR is harder to judge at a glance than it used to be.

The multi-repo dimension nobody is talking about

Most of the discourse around agentic code review treats the problem as a single-repo queue management challenge. For the majority of engineering teams at growth-stage and enterprise companies, that framing misses the harder part of the problem.

In a modern microservice architecture, a single product feature almost always spans multiple repositories. An agent working on a checkout flow might open PRs against the payments service, the order management API, and the frontend gateway: three separate repos, possibly on two different providers. Each of those PRs looks reasonable in isolation. Together, they represent a coordinated change that can only be assessed by seeing all three at once.

When the review surface is three separate browser tabs pointing at three separate repository dashboards, the cognitive overhead of pulling that context together is enough to discourage thorough review entirely. This is not a discipline problem. It is a tooling problem. The zero-review merge rate in the Faros data is plausibly worse in organizations with higher repo counts, but that segmentation has not been published, in part because most observability tools are themselves scoped to single repositories.

What engineering leaders should watch

A few concrete things worth tracking as this trend continues:

Risk before volume. The most useful intervention from the 2026 research is the idea of a "circuit breaker" — a signal that predicts high-maintenance PRs from cheap heuristics like file types, patch size, and sensitive path modifications, applied before a human opens the diff. Triage by risk, not by arrival order.
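
As a rough illustration of what such a circuit breaker could look like, here is a minimal Python sketch that scores PRs from the three cheap signals mentioned above (file types, patch size, sensitive paths) and sorts the queue by risk. The PullRequest fields, the path patterns, the weights, and the 400-line threshold are all assumptions made up for this example, not values from the research; a real team would calibrate them against its own history.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical patterns and thresholds, chosen only for illustration.
SENSITIVE_PATTERNS = ("*/auth/*", "*/payments/*", "*.sql", "*.tf")
LOW_RISK_SUFFIXES = (".md", ".txt", ".rst")
LARGE_PATCH_LINES = 400


@dataclass
class PullRequest:
    title: str
    changed_files: list[str]
    lines_changed: int


def risk_score(pr: PullRequest) -> int:
    """Return a coarse risk score; higher means review sooner."""
    score = 0
    # Patch size: large diffs are harder to review thoroughly.
    if pr.lines_changed > LARGE_PATCH_LINES:
        score += 2
    # Sensitive paths: any touched file matching a sensitive pattern.
    if any(
        fnmatchcase(path, pattern)
        for path in pr.changed_files
        for pattern in SENSITIVE_PATTERNS
    ):
        score += 3
    # File types: a docs-only change is treated as lower risk.
    if pr.changed_files and all(
        path.endswith(LOW_RISK_SUFFIXES) for path in pr.changed_files
    ):
        score -= 1
    return score


def triage(prs: list[PullRequest]) -> list[PullRequest]:
    """Order the queue by risk instead of arrival order."""
    return sorted(prs, key=risk_score, reverse=True)
```

The point of keeping the heuristics this cheap is that they can run on PR metadata alone, before anyone (human or model) has read the diff.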

Cross-repo change correlation. When an agent (or a developer using an agent) opens PRs across multiple repos as part of a single logical change, those PRs need to be visible together. Review that happens PR-by-PR without that correlated view is structurally incomplete.
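
One simple way to approximate that correlated view, sketched below in Python, is to group open PRs by a shared change key. This sketch assumes a team convention that is not part of the article: every branch name or PR title carries an uppercase ticket ID such as SHOP-142. The OpenPR type and its fields are likewise hypothetical stand-ins for whatever a provider API returns.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Assumed convention: ticket IDs look like "SHOP-142" (uppercase key, dash, number).
TICKET_RE = re.compile(r"[A-Z][A-Z0-9]+-\d+")


@dataclass(frozen=True)
class OpenPR:
    provider: str  # e.g. "github" or "gitlab"
    repo: str
    branch: str
    title: str


def change_key(pr: OpenPR) -> Optional[str]:
    """Extract the ticket ID from the branch name, falling back to the title."""
    match = TICKET_RE.search(pr.branch) or TICKET_RE.search(pr.title)
    return match.group(0) if match else None


def correlate(prs: list[OpenPR]) -> dict[str, list[OpenPR]]:
    """Group PRs by ticket ID, keeping only changes that span more than one repo."""
    groups: dict[str, list[OpenPR]] = defaultdict(list)
    for pr in prs:
        key = change_key(pr)
        if key is not None:
            groups[key].append(pr)
    return {
        key: members
        for key, members in groups.items()
        if len({(p.provider, p.repo) for p in members}) > 1
    }
```

A ticket-ID convention is only one possible correlation key. PRs that lack one are silently left out of the grouping here, so a real tool would want a fallback such as author plus time window.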

Review as a first-class engineering metric. DORA metrics have always included lead time and deployment frequency. The Faros data suggests review duration and zero-review merge rate need to become standard health indicators alongside them — especially as agent adoption accelerates.
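
Both indicators are cheap to compute once merge and review timestamps are available. The Python sketch below shows one possible definition. It assumes review duration means time from PR open to merge, measured only over PRs that received at least one review; Faros may define it differently. The MergedPR record is a hypothetical shape, not any provider's API.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class MergedPR:
    opened_at: datetime
    merged_at: datetime
    first_review_at: Optional[datetime]  # None when the PR merged with no review


def zero_review_merge_rate(prs: list[MergedPR]) -> float:
    """Fraction of merged PRs that never received a review."""
    if not prs:
        return 0.0
    unreviewed = sum(1 for pr in prs if pr.first_review_at is None)
    return unreviewed / len(prs)


def median_review_hours(prs: list[MergedPR]) -> Optional[float]:
    """Median open-to-merge time in hours, over reviewed PRs only."""
    durations = [
        (pr.merged_at - pr.opened_at).total_seconds() / 3600
        for pr in prs
        if pr.first_review_at is not None
    ]
    return median(durations) if durations else None
```

Excluding unreviewed PRs from the duration metric matters: otherwise a rising zero-review rate would pull the median down and make the queue look healthier than it is.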

Context preservation. The agent's reasoning disappears at diff time. Teams are experimenting with structured context files (AGENTS.md, VISION.md) to anchor agent intent closer to the code. Whether or not those conventions standardize, the goal — preserving the "why" alongside the "what" — is the right one.

The closing take

The Faros numbers are a leading indicator, not an endpoint. Teams that treat agentic adoption as a throughput problem without addressing the review surface will continue to see defect rates climb and review quality decline. The agents are not going away; the economics are too good. The right response is to make the human review layer faster, smarter, and less dependent on willpower.

For multi-repo teams specifically, that starts with visibility. You cannot triage what you cannot see. When PRs are scattered across dozens of repositories and two providers, the review queue is effectively invisible until something goes wrong. Tools like Code Board address this directly — aggregating every open PR from GitHub and GitLab into a single Kanban-style board, with AI-powered risk scoring and cross-repo context so that reviewers can prioritize in seconds rather than minutes. As agentic PR volume keeps climbing, that kind of unified visibility is shifting from a nice-to-have to a structural requirement.