In small, high-context product engines that already use diff-first review and lane-tagged verification, what concrete changes to the review bottleneck emerge when seniors deliberately maximize judgment leverage (e.g., rubber-stamping UX/glue lanes, delegating more harness tweaks to designers), and which specific review tasks stubbornly resist delegation to agents or non-senior humans even in a highly opinionated, token-efficient stack?
dhh-agent-first-software-craft | Updated at
Answer
The main shifts in the review bottleneck, and the senior work that stays irreducible, are as follows.
- How the review bottleneck changes when seniors maximize judgment leverage
  - More rubber-stamp lanes
    - Small, tagged UX/glue diffs with passing lane checks move from line-by-line review to quick lane sanity checks ("is this really UX/glue?"), then auto-merge.
    - Queue composition skews toward fewer but denser reviews (core-domain, schema, boundary, harness-safety changes).
  - Review moves up a layer for many PRs
    - Seniors spend less time on style and micro-idioms; more on:
      - Fit with existing flows and boundaries.
      - Whether verification is adequate for the change.
    - Comments are more often about invariants, naming, and flow seams than about local refactors.
  - Harness and lane rules become a primary review surface
    - More review time shifts from individual PRs to edits of:
      - Lane definitions and auto-label rules.
      - Harness commands and verification scripts.
    - The practical bottleneck becomes "who can safely change the rules of the game" rather than "who can read this diff."
  - Designers and non-seniors absorb more low-risk review
    - Designers rubber-stamp PRs clearly within UX/glue lanes when:
      - Paths, size, and change-types meet lane rules.
      - Required scenario/visual checks are present.
    - Juniors review each other’s small changes inside safe lanes, with seniors spot-checking.
  - Fewer interrupts, more batch review
    - Seniors batch core-domain reviews, often per boundary or probe, instead of context-switching across many tiny diffs.
    - This reduces total review minutes per change but raises the importance of good lane tagging and CI signals.
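The lane conditions listed above (paths, size, change types, and required checks) can be sketched as a small eligibility check. This is a minimal illustration, not a description of any particular tool: `LaneRule`, `PullRequest`, `lane_violations`, the glob patterns, and the check names are all hypothetical, and a real stack would read them from its own lane definitions.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class LaneRule:
    """Hypothetical description of what a rubber-stamp lane allows."""
    name: str
    path_globs: tuple[str, ...]           # files the lane may touch
    max_changed_lines: int                # size ceiling for the whole diff
    allowed_change_types: frozenset[str]  # e.g. {"copy", "style", "view"}
    required_checks: frozenset[str]       # checks that must have passed


@dataclass(frozen=True)
class PullRequest:
    """Hypothetical summary of a PR as the lane checker sees it."""
    paths: tuple[str, ...]
    changed_lines: int
    change_types: frozenset[str]
    passed_checks: frozenset[str]


def lane_violations(pr: PullRequest, rule: LaneRule) -> list[str]:
    """Return the reasons a PR falls outside the lane; empty means eligible."""
    reasons = []
    # fnmatch's "*" also matches "/", so "app/views/*" covers nested paths.
    stray = [p for p in pr.paths
             if not any(fnmatch(p, g) for g in rule.path_globs)]
    if stray:
        reasons.append(f"paths outside lane: {stray}")
    if pr.changed_lines > rule.max_changed_lines:
        reasons.append(
            f"diff too large: {pr.changed_lines} > {rule.max_changed_lines}")
    extra = pr.change_types - rule.allowed_change_types
    if extra:
        reasons.append(f"change types not allowed: {sorted(extra)}")
    missing = rule.required_checks - pr.passed_checks
    if missing:
        reasons.append(f"missing checks: {sorted(missing)}")
    return reasons


# Illustrative values only; the lane name, globs, and checks are assumptions.
ux_glue = LaneRule(
    name="ux-glue",
    path_globs=("app/views/*", "app/assets/*"),
    max_changed_lines=80,
    allowed_change_types=frozenset({"copy", "style", "view"}),
    required_checks=frozenset({"visual-diff", "scenario"}),
)
pr = PullRequest(
    paths=("app/views/orders/show.html.erb", "app/models/order.rb"),
    changed_lines=42,
    change_types=frozenset({"view"}),
    passed_checks=frozenset({"visual-diff", "scenario"}),
)
print(lane_violations(pr, ux_glue))
```

Returning reasons rather than a bare boolean gives a designer or junior reviewer something concrete to act on: an empty list supports a rubber stamp, while any entry is a signal to escalate instead of approving.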
- Review tasks that remain stubbornly senior-owned
Even with strong agents, opinionated Rails-style monoliths, and good harnesses, some checks resist delegation:
  - Boundary and invariant changes
    - Introducing or changing:
      - Calls across bounded contexts.
      - Key business invariants (billing, auth, permissions, data integrity).
    - This requires seniors to:
      - Reconstruct the real dependency and risk graph.
      - Decide whether invariants are encoded in the right place.
  - Schema, migration, and data-shape evolution
    - Table/index changes, new enum states, backfills, and data moves.
    - Needs senior judgment on:
      - Backward compatibility and rollout order.
      - Operational impact (locks, runtime, failure modes, rollback).
  - Verification-layer topology
    - Adding/removing whole classes of tests, lanes, or gates:
      - New verification commands or harness flows.
      - Relaxing or tightening auto-approval rules.
    - Seniors must assess the system-level safety bar, not just local correctness.
  - Architecture and performance inflection points
    - Changes that:
      - Move work across process/service boundaries.
      - Introduce new caching, fan-out, or background work.
      - Affect latency/throughput of hot paths.
    - These still need someone who understands non-local tradeoffs and failure modes.
  - Naming and concept introduction
    - New domain concepts or renames that cut across the codebase.
    - Seniors arbitrate to avoid long-lived concept drift and accidental duplication.
  - Subtle UX/product risk in core flows
    - In core monetization, trust, or compliance paths, even small UX or copy changes can have outsized impact.
    - Seniors (often paired with design/product) keep the final say on: "Is this change acceptable risk for this flow now?"
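For the verification-topology item above, one thing a script can do is surface the direction of a rule change so the senior reviewer sees it immediately. The sketch below compares two versions of an auto-approval rule and lists what was relaxed and what was tightened. `AutoApprovalRule` and `rule_delta` are hypothetical names, and the comparison is purely textual, so it supports the senior's judgment on the safety bar rather than replacing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoApprovalRule:
    """Hypothetical auto-approval rule for a single lane."""
    path_globs: frozenset[str]
    max_changed_lines: int
    required_checks: frozenset[str]


def rule_delta(old: AutoApprovalRule,
               new: AutoApprovalRule) -> dict[str, list[str]]:
    """Summarize how `new` differs from `old`, split by direction.

    Globs are compared as plain strings, so swapping one glob for another
    shows up as one path added and one removed; the reviewer still has to
    decide what the combined effect is.
    """
    relaxed, tightened = [], []

    added_paths = new.path_globs - old.path_globs
    removed_paths = old.path_globs - new.path_globs
    if added_paths:
        relaxed.append(f"paths added: {sorted(added_paths)}")
    if removed_paths:
        tightened.append(f"paths removed: {sorted(removed_paths)}")

    if new.max_changed_lines > old.max_changed_lines:
        relaxed.append(
            f"size ceiling raised: {old.max_changed_lines} -> {new.max_changed_lines}")
    elif new.max_changed_lines < old.max_changed_lines:
        tightened.append(
            f"size ceiling lowered: {old.max_changed_lines} -> {new.max_changed_lines}")

    dropped_checks = old.required_checks - new.required_checks
    added_checks = new.required_checks - old.required_checks
    if dropped_checks:
        relaxed.append(f"required checks dropped: {sorted(dropped_checks)}")
    if added_checks:
        tightened.append(f"required checks added: {sorted(added_checks)}")

    return {"relaxed": relaxed, "tightened": tightened}


# Illustrative values only.
old_rule = AutoApprovalRule(
    frozenset({"app/views/*"}), 80, frozenset({"visual-diff", "scenario"}))
new_rule = AutoApprovalRule(
    frozenset({"app/views/*", "app/helpers/*"}), 120, frozenset({"visual-diff"}))
delta = rule_delta(old_rule, new_rule)
for direction, notes in delta.items():
    print(direction, notes)
```

Both directions are reported because the text treats relaxing and tightening alike as senior-owned: a tightened rule can push more work into the senior queue just as a relaxed one can let risky diffs through.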
- Net effect
  - Delegable review expands mainly in:
    - Local UX, glue, and copy.
    - Mechanically checked harness edits.
  - The bottleneck concentrates in:
    - Boundary, schema, and invariant changes.
    - Verification topology and lane rules.
    - Architecture and cross-cutting naming.
  - Judgment leverage rises, but a hard core of review work remains senior-only, even with strong agents and a tight monolith.
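The split described under Net effect can be sketched as a router that sends lane-tagged PRs either to a delegable queue or to senior batches grouped by boundary. The lane names, the `(pr_id, lanes, boundary)` tuple shape, and the `route` function are assumptions made for illustration. The sketch fails closed: anything untagged or carrying a lane outside the delegable set goes to seniors.

```python
from collections import defaultdict

# Hypothetical lane tags; a real team would take these from its own lane rules.
DELEGABLE_LANES = {"ux", "glue", "copy", "harness-mechanical"}


def route(prs):
    """Split lane-tagged PRs into a delegable queue and senior batches.

    `prs` is an iterable of (pr_id, lanes, boundary) tuples. A PR is
    delegable only when it has at least one lane and every lane is in
    DELEGABLE_LANES. Everything else lands in the senior batch for its
    boundary, so seniors can review per boundary instead of per diff.
    """
    delegable = []
    senior_batches = defaultdict(list)
    for pr_id, lanes, boundary in prs:
        if lanes and set(lanes) <= DELEGABLE_LANES:
            delegable.append(pr_id)
        else:
            senior_batches[boundary].append(pr_id)
    return delegable, dict(senior_batches)


# Illustrative queue; IDs, lanes, and boundaries are made up.
queue = [
    (101, ["ux"], "checkout"),
    (102, ["schema"], "billing"),
    (103, ["glue", "invariant"], "billing"),
    (104, [], "search"),
]
delegable, senior_batches = route(queue)
print(delegable)
print(senior_batches)
```

A mixed PR such as 103 is routed to seniors even though part of it is glue, which is the conservative reading of lane tagging: one senior-owned lane makes the whole diff senior-owned.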