Most current patterns assume that raising the craft bar and strengthening the verification layer are complementary in agent-first workflows; in practice, where do these goals conflict—for example, when stricter taste and architecture checks reduce small-team leverage or designer autonomy—and how might teams deliberately trade off between a higher craft bar, faster ambition expansion, and lower cognitive load instead of treating them as always-aligned objectives?
dhh-agent-first-software-craft | Updated at
Answer
They conflict where the marginal gain in elegance or safety costs more in attention, speed, or autonomy than it returns. The lever is to make those tradeoffs explicit per lane, per area, and per role instead of “always ratchet craft and checks up.”
- Where stronger craft + verification conflict
- Small-team leverage
- Heavy lanes on every PR (strict arch/taste review, bug-class checks) eat senior time and slow cheap hunch probes; fewer reversible bets fit in a week.
- Diff-first harnesses that gate on many warnings turn seniors into queue managers instead of framers.
- Designer autonomy
- If every harness or UX flow change requires senior arch+taste signoff, designer-implementers stop running their own agent probes; ambition shrinks to “what’s worth a full PR gauntlet.”
- Cognitive load
- Many checklists, drift metrics, and taste prompts across all lanes force juniors to hold too many rules; they offload judgment to agents or cargo-cult patterns.
- Rich verification outputs (multiple cards, lane reports) compete with actual diff reading; reviewers skim or rubber-stamp.
- Apprenticeship
- Very strict taste/arch gates on all work push juniors into tiny, low-risk changes where agents do most of the heavy lifting; deep design reps disappear.
- Practical tradeoff patterns
- By lane
explore/ hunch probes: low craft bar, minimal verification (smoke tests, sandbox), tight blast-radius limits; no strong taste review.integrate/ path-to-prod: medium craft, targeted verification lanes (invariants, migrations, hot paths), selective taste prompts.harden/ high-blast or shared boundaries: high craft bar, full verification layer, explicit arch review.
- By code area
- Core domain, money, auth: strong checks and taste; slow by design.
- Edges, tooling, internal CLIs: lighter taste enforcement; rely on rollback and observability.
- By artifact type
- Harness flows, prompts, verification scripts: treat as code; invest more review here, accept looser taste on individual generated diffs that stay within those rails.
- One-off feature code: if it uses blessed patterns and passes key checks, accept “good enough” style.
- How to tune the balance
- Make the bar explicit
- For each lane, define: required checks, optional taste review, max review minutes, and acceptable defect risk.
- Let authors choose lane (with auto-suggestions) so they can opt into more craft vs more speed.
- Limit taste and checks where they hurt leverage
- Cap taste comments per non-core PR; move larger taste feedback into periodic “code tasting” sessions instead of blocking merges.
- Keep bug-class verification focused to 1–2 checks per lane to avoid noise.
- Protect ambition and autonomy
- Reserve a fixed % of capacity for low-gate exploration lanes where designers and seniors can run reversible hunch probes fast.
- Allow designers to own harness changes within pre-defined safe tool tiers; require engineer review only when crossing boundaries or raising blast radius.
- Manage cognitive load
- Collapse many rules into a few simple per-lane checklists visible in the PR template.
- Make harness output one short review card per lane, not a stream of suggestions.
Net: raising craft and verification helps, but only if the team chooses where it really matters. Use lanes, code-area tiers, and explicit review budgets so you can sometimes trade craft down for ambition and attention, instead of pretending all three always rise together.