Most current guidance assumes implementation abundance inside productive product engines. How does the picture change if the same agent-first workflows are dropped into cost-center software orgs, where review bottlenecks, craft bar, and ambition frontier are all optimized for predictability and budget control rather than expansion? And what new failure modes or leverage points appear when agents are primarily used to hit tickets and SLAs instead of to push small-team leverage and ambition?
dhh-agent-first-software-craft | Updated at
Answer
In cost-center orgs, agent-first workflows mostly amplify existing incentives: predictability, volume, and SLA adherence. That shifts where value and risk sit.
- How the picture changes
  - Review bottleneck
    - Moves from taste/architecture to compliance and SLA risk.
    - PRs are judged on “does it close the ticket and pass tests” more than “does it fit the system well.”
    - Agents increase ticket throughput faster than review capacity, so the default becomes shallow or sampling-based review.
  - Craft bar
    - Often narrows to correctness + policy + basic maintainability.
    - Aesthetic or long-horizon design gets de-emphasized; agent output that is merely adequate is accepted.
    - Style and architecture consistency are enforced only where they reduce risk or on-call pain.
  - Ambition frontier
    - Org rarely uses new capacity to expand scope; savings are taken as cost cuts or more tickets done.
    - Agents mostly compress time-to-ticket, not product surface.
    - A small exception: “previously too-expensive” cleanups that reduce incident risk may now become affordable.
  - Human bottlenecks that gain value
    - Queue shaping: deciding which tickets are worth human depth vs pure agent execution.
    - Risk triage: spotting changes that could violate SLAs, compliance, or high-blast-radius systems.
    - Context guardians: a few seniors who keep rough architectural maps so agent churn doesn’t destroy operability.
- New failure modes when agents are used to hit tickets/SLAs
  - Ticket-churn architecture
    - Agents ship many small fixes that satisfy local tickets but fragment flows and deepen workaround layers.
    - Over time, this increases incident complexity and MTTR even if SLA metrics look fine short-term.
  - Review theater
    - Humans rubber-stamp agent PRs to keep up with volume.
    - Verification layer becomes “tests green + linter + one glance,” so subtle regressions and broken invariants slip through.
  - Hidden cost inflation
    - Opex moves from dev hours to on-call, debugging, and vendor bills (e.g., more queries, more CPU) because agents tend to optimize for correctness over efficiency.
    - Budget control looks good at sprint level, worse at annual infra/ops level.
  - Policy and compliance drift
    - Agents adapt old patterns that technically work but miss new policy rules.
    - Without explicit machine-checkable policies in the harness, non-compliant code lands while still “passing review.”
  - Apprenticeship decay under KPI pressure
    - Juniors are pushed to close more tickets by driving agents, with little time to understand systems.
    - Seniority gap widens: seniors do triage and firefighting; juniors become low-context orchestrators.
  - Scope intoxication with no ownership
    - Cheap implementation tempts managers to add “nice-to-have” tickets, increasing system surface area without adding ownership or observability.
- Leverage points specific to cost-center settings
  - SLA- and risk-aware harness lanes
    - Classify tickets/PRs into lanes like low_risk_ticket, SLA_critical, policy_sensitive.
    - For low-risk lanes, allow a more automated path (agent + tests + spot review).
    - For high-risk lanes, enforce stricter human review and targeted checks (canary, rollback plans, perf guards).
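A minimal sketch of the lane idea, assuming a PR can be summarized by the paths it touches plus a set of labels. The `Change` type, the path prefixes, the gate names, and the priority order between lanes are all illustrative assumptions, not part of any existing harness.

```python
from dataclasses import dataclass
from enum import Enum


class Lane(Enum):
    LOW_RISK_TICKET = "low_risk_ticket"
    SLA_CRITICAL = "SLA_critical"
    POLICY_SENSITIVE = "policy_sensitive"


@dataclass(frozen=True)
class Change:
    """Hypothetical summary of a ticket/PR; field names are illustrative."""
    touched_paths: tuple[str, ...]
    labels: frozenset[str] = frozenset()


# Assumed path prefixes; a real org would maintain its own lists.
SLA_CRITICAL_PREFIXES = ("billing/", "auth/")
POLICY_SENSITIVE_PREFIXES = ("pii/", "audit/")


def classify(change: Change) -> Lane:
    """Return the first matching lane in an assumed priority order:
    policy_sensitive, then SLA_critical, then low_risk_ticket."""
    if "compliance" in change.labels or any(
        path.startswith(POLICY_SENSITIVE_PREFIXES) for path in change.touched_paths
    ):
        return Lane.POLICY_SENSITIVE
    if any(path.startswith(SLA_CRITICAL_PREFIXES) for path in change.touched_paths):
        return Lane.SLA_CRITICAL
    return Lane.LOW_RISK_TICKET


# Gates each lane must pass: lighter for low risk, stricter for high risk.
REQUIRED_GATES = {
    Lane.LOW_RISK_TICKET: ("tests", "spot_review"),
    Lane.SLA_CRITICAL: ("tests", "human_review", "canary", "rollback_plan", "perf_guard"),
    Lane.POLICY_SENSITIVE: ("tests", "human_review", "policy_check"),
}


def required_gates(change: Change) -> tuple[str, ...]:
    """Look up the gates for whatever lane the change falls into."""
    return REQUIRED_GATES[classify(change)]
```

For example, `required_gates(Change(("auth/login.py",)))` would include `"canary"` and `"rollback_plan"`, while a docs-only change would only need tests plus a spot review. Keeping the classifier this simple makes the lane rules easy to audit.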
  - Ops-centric verification layer
    - Bake SLOs and common incident patterns into harness checks: perf probes on hot paths, quota/rate-limit checks, migration safety scripts.
    - Favor checks that directly reduce on-call pain over aesthetic rules.
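One way a perf probe on a hot path could be wired into the harness is sketched below. It assumes the harness can collect latency samples (in milliseconds) for the baseline and for the candidate change; `perf_guard`, the 10% regression allowance, and the nearest-rank percentile are illustrative choices.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in (0, 100]."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def perf_guard(
    baseline_ms: list[float],
    candidate_ms: list[float],
    slo_p95_ms: float,
    max_regression: float = 0.10,  # assumed allowance: 10% over baseline p95
) -> list[str]:
    """Return failure messages; an empty list means the check passes."""
    failures = []
    base_p95 = percentile(baseline_ms, 95)
    cand_p95 = percentile(candidate_ms, 95)
    if cand_p95 > slo_p95_ms:
        failures.append(f"p95 {cand_p95:.1f} ms exceeds SLO {slo_p95_ms:.1f} ms")
    if cand_p95 > base_p95 * (1 + max_regression):
        failures.append(
            f"p95 regressed from {base_p95:.1f} ms to {cand_p95:.1f} ms "
            f"(allowed: {max_regression:.0%})"
        )
    return failures
```

The two conditions are deliberately separate: a change can stay inside the SLO and still be flagged for eating into the headroom that on-call relies on.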
  - “System health” budgets instead of only story points
    - Reserve a fixed percent of capacity for agent-assisted simplifications that reduce incident and support load (e.g., dead-code removal, duplication cleanup around top-incident modules).
    - Tie this to concrete metrics (incident count, MTTR) to make it legible to cost-focused leadership.
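As a rough illustration of how such a budget could be made legible, the sketch below computes incident count and MTTR per module and picks cleanup targets by total time spent in incidents. The record shape `(module, opened, resolved)`, the function names, and the 15% default are assumptions for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def incident_stats(incidents):
    """incidents: iterable of (module, opened, resolved) tuples.
    Returns {module: (incident_count, mean_time_to_resolve)}."""
    durations = defaultdict(list)
    for module, opened, resolved in incidents:
        durations[module].append(resolved - opened)
    return {
        module: (len(spans), sum(spans, timedelta()) / len(spans))
        for module, spans in durations.items()
    }


def cleanup_targets(incidents, top_n=3):
    """Rank modules by total time spent in incidents (count * MTTR)."""
    stats = incident_stats(incidents)
    ranked = sorted(stats, key=lambda m: stats[m][0] * stats[m][1], reverse=True)
    return ranked[:top_n]


def reserved_points(sprint_points: int, reserved_percent: int = 15) -> int:
    """Story points set aside for system-health work (integer arithmetic)."""
    return sprint_points * reserved_percent // 100
```

Reporting the same `incident_stats` before and after a cleanup pass gives leadership a direct comparison in the metrics they already track.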
  - Guardrails against review theater
    - Cap per-reviewer daily PR count or total review minutes; route overflow to batch sampling or stricter automation.
    - Use agents to summarize diffs and highlight risk, but keep humans in charge of final signoff for non-trivial lanes.
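A sketch of the cap-and-overflow routing, under stated assumptions: PRs arrive as `(pr_id, lane)` pairs, anything outside the `low_risk_ticket` lane must wait for a human rather than fall through to automation, and the route labels are invented for the example.

```python
import random


def route_reviews(prs, reviewers, daily_cap, sample_rate=0.2, seed=0):
    """Assign PRs to the least-loaded reviewer until each hits daily_cap.

    prs: list of (pr_id, lane) tuples.
    Overflow handling (assumed policy):
      - non-low-risk lanes are deferred until human capacity frees up
      - low-risk lanes go to batch sampling with probability sample_rate,
        otherwise to automation-only checks
    """
    rng = random.Random(seed)  # seeded so routing is reproducible
    load = {reviewer: 0 for reviewer in reviewers}
    routes = {}
    # Stable sort: higher-risk lanes first, so they get human capacity first.
    ordered = sorted(prs, key=lambda pr: pr[1] == "low_risk_ticket")
    for pr_id, lane in ordered:
        reviewer = min(load, key=load.get) if load else None
        if reviewer is not None and load[reviewer] < daily_cap:
            load[reviewer] += 1
            routes[pr_id] = f"human:{reviewer}"
        elif lane != "low_risk_ticket":
            routes[pr_id] = "deferred"
        elif rng.random() < sample_rate:
            routes[pr_id] = "batch_sample"
        else:
            routes[pr_id] = "automation_only"
    return routes
```

The point of the hard cap is that overflow becomes visible as `deferred` or `automation_only` routes instead of silently turning into rubber-stamped approvals.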
  - Minimal craft bar encoded as policy
    - Define a small, machine-enforced craft bar: boundary tags, error-handling patterns, logging/metrics requirements, and basic performance constraints.
    - This keeps “just good enough” from degrading into “nobody can safely change this” without demanding high-aesthetic code.
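To make one slice of such a craft bar concrete, the sketch below enforces a single error-handling rule on Python sources using the standard `ast` module: no bare `except:` and no handlers that only `pass`. The rule choice and the `craft_violations` name are assumptions; a real bar would cover the other items (logging, boundaries, performance) with their own checks.

```python
import ast


def craft_violations(source: str, filename: str = "<agent-diff>") -> list[str]:
    """Flag error-handling patterns that hide failures from on-call."""
    found = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.ExceptHandler):
            continue
        if node.type is None:
            found.append((node.lineno, "bare 'except:' catches everything"))
        if all(isinstance(stmt, ast.Pass) for stmt in node.body):
            found.append((node.lineno, "exception swallowed with 'pass'"))
    # ast.walk is not ordered by line, so sort for stable output.
    return [f"{filename}:{line}: {msg}" for line, msg in sorted(found)]
```

Because the check only parses the source and never executes it, it can run on agent output before tests do and fail the PR with a specific line reference.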
- When agents become real leverage even in cost centers
  - Targeted reductions in operational drag
    - Use agents to close whole classes of repetitive incidents (e.g., log hygiene, timeouts, retries) cheaply.
    - Let a small senior group define patterns; agents roll them out in bulk.
  - Cheap “what if we simplified?” passes
    - Periodic agent runs that propose merges of duplicate flows, contract consolidations, or retirement of low-use features.
    - Only a tiny subset are accepted, but the option value is new in cost-constrained orgs.
  - Safer vendor and legacy interfacing
    - Use agents to wrap messy vendor APIs or legacy subsystems behind stricter, monitored interfaces.
    - This shrinks the area where low-context agent ticket work can cause serious damage.
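A sketch of what a stricter, monitored interface over a vendor call might look like. Everything vendor-specific here is hypothetical: `fetch_raw` stands in for the vendor client, the `{"balance": "12.34"}` payload shape is assumed, and the in-memory `metrics` dict stands in for a real metrics backend.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Any, Callable


class VendorError(Exception):
    """The only error type callers see, whatever the vendor did wrong."""


@dataclass
class VendorGateway:
    fetch_raw: Callable[[str], Any]  # hypothetical vendor client call
    metrics: dict = field(default_factory=lambda: {"calls": 0, "failures": 0})

    def get_balance_cents(self, account_id: str) -> int:
        """Return the balance as integer cents, or raise VendorError."""
        self.metrics["calls"] += 1
        try:
            payload = self.fetch_raw(account_id)
            # Assumed payload shape: {"balance": "<decimal string>"}
            cents = int(Decimal(payload["balance"]) * 100)
        except Exception as exc:  # deliberate catch-all at the vendor boundary
            self.metrics["failures"] += 1
            raise VendorError(f"vendor lookup failed for {account_id}") from exc
        return cents
```

Ticket-level agent work then depends only on `get_balance_cents` and `VendorError`, and the failure counter gives on-call a single place to watch the vendor's behavior.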
Overall: in cost-center orgs, agent-first workflows tilt toward volume and SLA satisfaction, not ambition. The main design problem is preventing that volume from quietly eroding system health and human judgment. Leverage lives in ops-oriented harness design, explicit risk lanes, and small, protected budgets for system simplification rather than in feature frontier expansion.