
Why Agile and Scrum Don't Work for AI-Assisted Development

Building with AI

Key Takeaways
  • The average Scrum team carries a backlog of 50 to 200 items, spends 4 to 6 hours per week on ceremony (standups, planning, grooming, retrospectives), and delivers at a velocity that stays roughly constant sprint-over-sprint once the team stabilizes (Digital.ai survey data). AI-assisted execution breaks the stable-velocity assumption that this whole model rests on.
  • The alternative is not chaos — it is a different kind of discipline.
  • If your team is using AI coding tools — GitHub Copilot, Cursor, Claude — within a Scrum framework, the data suggests you are paying the coordination tax of Scrum without receiving the coordination benefit.

The Setup

Agile won the methodology wars. Digital.ai's State of Agile Report shows that over 80% of software organizations use some form of Agile, with Scrum as the dominant framework. Two-week sprints, backlog grooming, daily standups, sprint retrospectives, and velocity-based planning have become the default operating system for software teams worldwide. The methodology was designed for a specific context: coordinating groups of human developers working on shared codebases under conditions of changing requirements.

The problem is that AI-assisted development violates nearly every assumption Scrum was built on. Scrum assumes a stable team with predictable velocity. AI-assisted execution shows velocity that accelerates exponentially — not linearly — as accumulated infrastructure compounds. Scrum assumes work can be estimated in story points and allocated to fixed-length sprints. AI-assisted execution deploys entire application foundations in minutes through scaffold patterns, making sprint-level time estimates meaningless. Scrum assumes a backlog of prioritized work items. AI-assisted execution shows that backlogs create cognitive overhead and inventory waste that actively degrades solo-operator performance.

The Standish Group's CHAOS Report has tracked software project outcomes for decades. The data is consistent: only about 31% of Agile projects succeed by the Report's criteria (on time, on budget, with satisfactory results), and project success rates decline as project size and planning complexity increase. McKinsey's Developer Velocity research found that top-quartile organizations deliver software 4-5x faster than bottom-quartile ones — but the differentiator was not better Agile execution. It was elimination of coordination overhead and reduction of handoff delays. The question is whether the ceremony and structure of Scrum help or hinder when the execution unit is a single operator working with AI rather than a team of human developers.

What the Data Shows

Digital.ai's survey data shows the average Scrum team carries a backlog of 50 to 200 items, conducts 4 to 6 hours per week of ceremony (standups, planning, grooming, retrospectives), and delivers at a velocity that remains roughly constant sprint-over-sprint once the team stabilizes. The Standish Group reports that larger projects with longer planning horizons fail at higher rates — a finding that has persisted across every edition of the CHAOS Report for over twenty years.

One technology infrastructure operation produced ten production systems totaling 596,903 lines of code between October 2025 and February 2026 with the following process characteristics (M25, M26, CEM_Timeline):

  • Formal backlog items maintained: zero
  • Technical debt tickets accumulated: zero
  • Roadmaps created: zero
  • Project plans created: zero
  • Gantt charts created: zero
  • Planning documents of any kind: zero
  • Sprint planning sessions: zero

The output across this zero-ceremony period: 2,561 commits across 10 repositories, 29 commits per active day against an industry median of 2 (Sieber & Partners benchmark of 3.5 million commits across 47,000 developers). Daily commit average accelerated from 6.8 in October to 31.1 in January — a 4.6x increase. The fastest MVP shipped in 5 active days at 100% solo execution. The second-fastest shipped in 9 days at 91.4% operator execution. Days-to-MVP compressed from 21 days for early projects to 5 days for late projects — a 76% reduction (CEM_Timeline).
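
For readers who want to trace the arithmetic, both headline figures reduce to simple ratios over numbers already quoted in this section; a minimal sketch in Python:

```python
# Reproduces the two headline ratios above from figures quoted in this
# article (CEM_Timeline); no new data is introduced.

october_commits_per_day = 6.8
january_commits_per_day = 31.1
early_days_to_mvp = 21
late_days_to_mvp = 5

acceleration = january_commits_per_day / october_commits_per_day
reduction = (early_days_to_mvp - late_days_to_mvp) / early_days_to_mvp

print(f"velocity acceleration: {acceleration:.1f}x")   # -> 4.6x
print(f"days-to-MVP reduction: {reduction:.0%}")       # -> 76%
```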

The backlog data is particularly striking. Across 596,903 lines of production code and 2,561 commits spanning ten production systems, zero formal backlog was maintained. Zero technical debt tickets accumulated. Zero "future consideration" lists persisted. Industry research consistently shows that 60-80% of backlog items are never completed — they are eventually pruned or silently abandoned. The backlog does not defer work; it defers the decision to reject the work. For a solo operator, every backlog item creates a cognitive loop that competes with execution for finite attention. Eliminating the backlog entirely redirected that cognitive capacity toward active building (M25).
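
To put rough numbers on that carrying cost, here is an illustrative sketch using midpoints of the ranges cited above; the midpoints and the seven-person team size are assumptions for illustration, not portfolio data:

```python
# Illustrative only: midpoints of the ranges cited in this article plus an
# assumed seven-person team. Substitute your own team's numbers.

backlog_items = 125            # midpoint of the 50-200 item range (Digital.ai)
ceremony_hours_per_week = 5    # midpoint of the 4-6 hour ceremony range
abandonment_rate = 0.70        # midpoint of the 60-80% "never completed" range
team_size = 7                  # assumption: a typical Scrum team

items_groomed_but_never_shipped = backlog_items * abandonment_rate
ceremony_person_hours_per_year = ceremony_hours_per_week * 52 * team_size

print(f"backlog items that will never ship: ~{items_groomed_but_never_shipped:.0f}")  # ~88
print(f"ceremony person-hours per year:      {ceremony_person_hours_per_year}")       # 1820
```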

The velocity acceleration is the clearest indicator that Scrum's assumptions break down. Scrum expects velocity to stabilize — that is the entire basis for sprint planning. If a team delivers 40 story points per sprint, the next sprint is planned for approximately 40 story points. But when accumulated infrastructure compounds, velocity does not stabilize. It accelerates. The portfolio data shows:

  • October: 6.8 commits/day average
  • November: 4.8 commits/day average
  • December: 10.0 commits/day average
  • January: 31.1 commits/day average

A sprint planned in October based on October's velocity would have allocated roughly 4.6x too much time for the same work in January. The planning framework assumes a stable system. AI-assisted execution with compounding infrastructure is not a stable system — it is an accelerating one (M26, CEM_Timeline).
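
The mismatch is easy to make concrete. A sketch using the monthly averages above: size a sprint from October's velocity, then see what fraction of the planned time the same scope actually needs in later months.

```python
# Drift of an October-sized sprint plan as velocity changes. The monthly
# averages are the figures quoted above (M26, CEM_Timeline).

monthly_velocity = {   # commits per active day
    "October": 6.8,
    "November": 4.8,
    "December": 10.0,
    "January": 31.1,
}

baseline = monthly_velocity["October"]   # velocity the plan was sized against

for month, actual in monthly_velocity.items():
    speedup = actual / baseline      # >1 means faster than the plan assumed
    time_needed = baseline / actual  # fraction of planned time actually used
    print(f"{month:>8}: {speedup:.1f}x baseline, "
          f"scope completed in {time_needed:.0%} of planned time")
```

Note that the plan is wrong in both directions: November dips below the October baseline before the acceleration takes hold, and by January the same scope needs roughly a fifth of the allocated time.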

Planning beyond fourteen days produced negative returns for similar reasons. The portfolio's days-to-MVP compressed from 14-21 days (early projects) to 4-5 days (late projects). A quarterly roadmap written in October would have been falsified by December — not by failure, but by success. The operator's capability grew faster than any plan could have modeled, because infrastructure accumulation compounds rather than growing linearly, and compounding capability cannot be planned at a quarterly horizon. When scaffold patterns deploy between 67,000 and 127,000 lines of code in single commits — entire application foundations instantiated from accumulated templates — a plan that allocates two weeks for "build application foundation" is invalidated in minutes (M26).
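
One way to see why a two-week horizon survives while a quarterly one does not: treat the October-to-January acceleration as if it were a smooth compound weekly growth rate. That smoothness is purely an illustrative assumption (the real monthly data is lumpier), but it shows how quickly any fixed plan decays under compounding.

```python
# Illustrative assumption: spread the ~4.6x October-to-January acceleration
# evenly over ~13 weeks as a compound weekly growth rate. The real data is
# lumpier; the point is the shape of the decay, not the exact numbers.

total_acceleration = 31.1 / 6.8          # ~4.6x
weeks_observed = 13                      # roughly October through January
weekly_growth = total_acceleration ** (1 / weeks_observed)   # ~1.12

for horizon_weeks in (1, 2, 4, 8, 12):
    # How far actual capacity has moved from the capacity the plan assumed.
    drift = weekly_growth ** horizon_weeks
    print(f"{horizon_weeks:>2}-week plan: capacity assumption off by {drift:.2f}x")
```

Under that assumption, a two-week plan is off by roughly a quarter, while a twelve-week plan is off by roughly 4x — the quarterly-roadmap failure described above.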

How It Works

The alternative is not chaos — it is a different kind of discipline. Instead of planning work into fixed time blocks and tracking progress against estimates, the approach uses three directional elements: a long-range vision that establishes where the portfolio is going, a locked current objective that defines what to build now, and an 80% completion threshold that defines when a product is ready to ship.

Every unit of work gets an immediate binary decision: advance it toward the current objective, or store it as a reusable asset for future projects. There is no "later" list. No deferred queue. No accumulating inventory of unkept promises. This is structurally different from backlog prioritization. Prioritization accepts the backlog as legitimate and tries to manage it. This approach eliminates the category entirely.
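
A minimal sketch of that triage rule, in Python. The names and data model are hypothetical, chosen only to make the structure concrete; the article describes the rule, not an implementation.

```python
# Hypothetical sketch of the advance-or-store triage rule described above.
# There is deliberately no third bucket: no backlog, no "later" list.

from dataclasses import dataclass, field
from typing import Literal

@dataclass
class WorkItem:
    description: str
    advances_current_objective: bool   # the single binary question per item

@dataclass
class Portfolio:
    active_work: list[WorkItem] = field(default_factory=list)   # work on the locked objective
    asset_library: list[str] = field(default_factory=list)      # reusable infrastructure

    def triage(self, item: WorkItem) -> Literal["advance", "store"]:
        if item.advances_current_objective:
            self.active_work.append(item)                 # build it now
            return "advance"
        self.asset_library.append(item.description)       # keep the capability, drop the promise
        return "store"

# Usage: every incoming idea gets exactly one of two outcomes, immediately.
p = Portfolio()
p.triage(WorkItem("wire payment flow into current MVP", True))
p.triage(WorkItem("generic retry/backoff helper", False))
print(p.asset_library)   # -> ['generic retry/backoff helper']
```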

Within a fourteen-day window, the operator maintains awareness of active projects, available infrastructure assets, and near-term execution priorities. Beyond fourteen days, the operator relies on directional vision, accumulated capability, and real-time decision rules rather than detailed plans. The planning horizon matches the system's predictability horizon — and in AI-assisted execution with compounding infrastructure, that horizon is approximately two weeks. Sixty percent of active days saw commits to multiple projects simultaneously, with a peak of 4 projects in a single day and 132 commits on October 21 across 4 repositories. That interleaving emerged from execution dynamics, not from a planning document that allocated days to projects (M25, M26, CEM_Timeline).

What This Means for Organizations Running Scrum on AI-Assisted Projects

If your team is using AI coding tools — GitHub Copilot, Cursor, Claude — within a Scrum framework, the data suggests you are paying the coordination tax of Scrum without receiving the coordination benefit. Scrum's ceremonies exist to synchronize human developers. When the execution unit shrinks from a team to an individual operator augmented by AI, the synchronization overhead becomes pure waste.

The numbers from this portfolio do not mean Scrum is universally broken. For large teams of human developers working on shared codebases, Scrum's coordination mechanisms may still provide value. But for AI-assisted execution — particularly for small teams or solo operators building on accumulated infrastructure — the data shows that zero-ceremony execution with clear directional guidance and real-time decision rules outperforms sprint-based planning on every measured dimension: output rate (29 commits/day vs. industry median of 2), quality (12.1% defect rate vs. 20-50% industry norm), cost ($67,895 total vs. $795,000-$2,900,000 market replacement value), and delivery speed (5 days to MVP vs. industry-typical months). The methodology that was built for coordinating human teams is not the methodology that maximizes AI-assisted output. The process framework needs to match the execution paradigm.


Related: Spoke #9 (AI Code Quality Metrics) | Spoke #11 (Measuring AI Development Productivity) | Spoke #7 (Cost to Build Software with AI)

References

  1. Digital.ai (2024). "State of Agile Report." Over 80% of software organizations use some form of Agile.
  2. Standish Group (2024). "CHAOS Report." Software project success rates (31% Agile project success rate).
  3. McKinsey & Company (2024). "Developer Velocity Benchmarks." Top-quartile organizations deliver software 4-5x faster than bottom-quartile.
  4. Sieber & Partners (2024). "Commit Velocity Benchmarks." Analysis of 3.5 million commits across 47,000 developers.