For teams that have adopted implementation abundance and a craft-maturity ladder, what concrete changes in staffing and role design (e.g., fewer generalist seniors, more “systems crafters” paired with multiple operators, or designer-owned harness flows with engineer custodians) most reliably expand the ambition frontier without increasing incident rates, and how could we instrument PRs, incidents, and experiment volume to tell whether a given configuration is actually working?

dhh-agent-first-software-craft

Answer

  1. Likely-effective staffing patterns
  • Pattern A: Systems crafters + operators

    • 1 senior "system crafter" (Tier 3) owns boundaries, harness rules, and high-risk changes.
    • 2–4 Tier 1–2 operators run agents, iterate on diffs, and handle low/med-risk work.
    • Use for monolithic or opinionated stacks and high feature volume.
  • Pattern B: Designer-owned harness, engineer custodians

    • Designer/product shaper owns CLI/MCP flows, prompts, and UX experiments.
    • Senior engineer custodian owns boundaries, safety gates, and performance.
    • Use where UX/workflow experiments drive value.
  • Pattern C: Contract owner + service locals (for fragmented stacks)

    • One senior owns cross-service contracts/flows; multiple team members handle local changes.
    • Agents constrained by those contracts.

Common shifts vs. traditional staffing

  • Fewer "full-stack generalist" seniors doing all steps themselves.
  • More seniors concentrated on:
    • boundary design and refactors
    • harness/contract rules
    • change plans for risky work
  • More juniors/mids operating agents within those lanes.
  2. Guardrails to keep incidents flat
  • Keep the review bottleneck for risky changes with high-craft roles:

    • All Class 2–3 changes (schema, money, data backfills, cross-system flows) require:
      • system crafter / custodian review
      • explicit test or contract updates
  • Use craft ladder for routing:

    • Tier 1: local features on well-tested paths; no contract or flag changes.
    • Tier 2: can touch boundaries with template change plans.
    • Tier 3: owns cross-boundary changes, harness flows, and irreversible tools.
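The review and routing rules above could be encoded as a small pre-review check. This is a minimal sketch under stated assumptions: the `ChangeRequest` fields, the `route` function, and the blocker messages are hypothetical names chosen for illustration, not part of any existing tool.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: Class 2-3 covers schema, money, data backfills, cross-system flows.
HIGH_RISK_CLASSES = {2, 3}


@dataclass
class ChangeRequest:
    author_tier: int                  # 1, 2, or 3 on the craft ladder
    risk_class: int                   # 0-3 from the change-management model
    touches_boundary: bool            # edits a contract, flag, or cross-boundary flow
    has_change_plan: bool             # a template change plan is attached
    updates_tests_or_contracts: bool  # explicit test or contract updates included


def route(cr: ChangeRequest) -> Tuple[bool, List[str]]:
    """Return (needs_crafter_review, blockers) for a proposed change."""
    blockers = []
    # Tier 1 stays on local, well-tested paths.
    if cr.touches_boundary and cr.author_tier == 1:
        blockers.append("reassign: Tier 1 may not touch boundaries")
    # Tier 2 may touch boundaries only with a template change plan.
    if cr.touches_boundary and cr.author_tier == 2 and not cr.has_change_plan:
        blockers.append("attach template change plan")
    # Class 2-3 changes always go to a system crafter or custodian.
    needs_crafter_review = cr.risk_class in HIGH_RISK_CLASSES
    if needs_crafter_review and not cr.updates_tests_or_contracts:
        blockers.append("add explicit test or contract updates")
    return needs_crafter_review, blockers
```

A check like this could run in CI against the PR tags, so that routing mistakes surface before a reviewer is assigned rather than after an incident.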
  3. Instrumentation to see if a configuration works

PR-level

  • Tag PRs by:
    • author craft tier
    • staffing pattern (e.g., "sys-crafter+ops", "designer-owned-harness")
    • change risk class (0–3, from existing change-management model)
  • Track per pattern and tier:
    • incident-linked PR rate
    • rework per PR (follow-up fixes within N days)
    • cross-boundary edits per PR
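As a rough illustration of the PR-level tracking above, the following sketch groups tagged PRs by pattern and author tier and computes the three rates. The `PullRequest` record and its field names are assumptions; in practice the values would come from PR labels or template fields.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PullRequest:
    pattern: str               # e.g. "sys-crafter+ops"
    author_tier: int           # craft tier of the author
    risk_class: int            # 0-3
    incident_linked: bool      # True if an incident was traced to this PR
    followup_fixes: int        # follow-up fix PRs within N days
    cross_boundary_edits: int  # boundaries touched by this PR


def pr_stats(prs: List[PullRequest]) -> Dict[Tuple[str, int], Dict[str, float]]:
    """Aggregate PR metrics per (pattern, author_tier)."""
    groups = defaultdict(list)
    for pr in prs:
        groups[(pr.pattern, pr.author_tier)].append(pr)

    stats = {}
    for key, group in groups.items():
        n = len(group)
        stats[key] = {
            "prs": n,
            "incident_linked_rate": sum(p.incident_linked for p in group) / n,
            "rework_per_pr": sum(p.followup_fixes for p in group) / n,
            "cross_boundary_per_pr": sum(p.cross_boundary_edits for p in group) / n,
        }
    return stats
```

Small groups produce noisy rates, so it is probably worth reporting the `prs` count next to every rate and ignoring cells with only a handful of PRs.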

Incident-level

  • For each incident, log:
    • which pattern/staffing owned the change
    • author tier and reviewer tier
    • whether harness, contract, or flow was changed
  • Compute, per pattern:
    • incidents / 100 PRs by risk class
    • mean time to recovery (MTTR) and blast radius
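The per-pattern incident figures could be computed along these lines. This is a sketch only: it assumes PRs and incidents are both exported as `(pattern, risk_class)` tags, and that recovery time is logged in minutes; neither format is prescribed above.

```python
from collections import Counter, defaultdict
from statistics import mean


def incidents_per_100_prs(pr_tags, incident_tags):
    """Both arguments are iterables of (pattern, risk_class) tuples.

    Returns incidents per 100 PRs for every (pattern, risk_class)
    that has at least one PR.
    """
    pr_counts = Counter(pr_tags)
    incident_counts = Counter(incident_tags)
    # Counter returns 0 for keys with no incidents.
    return {key: 100.0 * incident_counts[key] / n for key, n in pr_counts.items()}


def mttr_minutes(incidents):
    """incidents: iterable of (pattern, minutes_to_recover) tuples."""
    by_pattern = defaultdict(list)
    for pattern, minutes in incidents:
        by_pattern[pattern].append(minutes)
    return {pattern: mean(values) for pattern, values in by_pattern.items()}
```

Splitting the rate by risk class matters because a pattern that simply ships more Class 0 work would otherwise look safer than it is.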

Ambition / experiment volume

  • Define cheap, trackable proxies:
    • count of new flows/features per month
    • count of safe experiments (feature flags, A/Bs, non-destructive jobs)
    • share of work that is "new capability" vs. "maintenance" in tickets
  • Compare per pattern, normalized by headcount.
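A minimal sketch of the headcount-normalized proxies, assuming one record per team per month. The `MonthlyOutput` fields are hypothetical; the counts would come from wherever flows, experiments, and tickets are already tracked.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MonthlyOutput:
    pattern: str
    headcount: int
    new_flows: int               # new flows/features shipped this month
    safe_experiments: int        # feature flags, A/Bs, non-destructive jobs
    new_capability_tickets: int
    maintenance_tickets: int


def ambition_proxies(m: MonthlyOutput) -> Dict[str, float]:
    """Normalize output by headcount so patterns of different size compare fairly."""
    total_tickets = m.new_capability_tickets + m.maintenance_tickets
    return {
        "new_flows_per_head": m.new_flows / m.headcount,
        "experiments_per_head": m.safe_experiments / m.headcount,
        # Share of ticketed work that is new capability rather than maintenance.
        "new_capability_share": (
            m.new_capability_tickets / total_tickets if total_tickets else 0.0
        ),
    }
```

For example, a five-person team with 10 new flows, 20 experiments, and a 30/10 split of new-capability to maintenance tickets yields 2.0 flows and 4.0 experiments per head, with a new-capability share of 0.75.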
  4. Simple success checks per configuration
  • Pattern A (system crafter + operators) is working if:

    • incidents per 100 PRs in risky classes stay flat or drop vs. the pre-change baseline
    • experiment count and cross-boundary improvements rise
    • Tier 1–2 throughput rises, while Tier 3 PR count stays modest but those PRs touch many boundaries.
  • Pattern B (designer-owned harness) is working if:

    • more UX/flow experiments land per cycle
    • harness-change-linked incidents stay rare
    • engineers report less time writing glue and more time on boundaries/infra.
  • Pattern C (contract owner + locals) is working if:

    • cross-service incident rate drops
    • contract-change PRs are few but heavily reviewed
    • agents mostly touch leaf services, not raw cross-service calls.
  5. How to start empirically
  • Pick one pattern per team for a quarter.
  • Add minimal tags to PR template (tier, pattern, risk class).
  • Add one field to incident form (pattern + tier).
  • Review stats monthly and adjust staffing (e.g., rebalance system crafters, move harness ownership) based on:
    • ambition/experiment lift vs. baseline
    • incidents per risk class vs. baseline.
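The monthly review against baseline could be reduced to a small comparison like the one below. This is a sketch, not a definitive rule: the dictionary keys, the `tolerance` parameter, and the "working" criterion (positive experiment lift with no risk class regressing) are all assumptions that a team would tune.

```python
def review_configuration(baseline, current, tolerance=0.0):
    """Compare one month's stats with the baseline.

    Both arguments are dicts with:
      "experiments_per_head": float
      "incidents_per_100":    {risk_class: incidents per 100 PRs}
    """
    base_exp = baseline["experiments_per_head"]
    cur_exp = current["experiments_per_head"]
    if base_exp > 0:
        lift = cur_exp / base_exp - 1.0
    else:
        # No baseline experiments: any experiments at all count as unbounded lift.
        lift = float("inf") if cur_exp > 0 else 0.0

    # A risk class regresses if its rate exceeds baseline by more than tolerance.
    regressed = sorted(
        rc
        for rc, rate in current["incidents_per_100"].items()
        if rate > baseline["incidents_per_100"].get(rc, 0.0) + tolerance
    )
    return {
        "experiment_lift": lift,
        "regressed_risk_classes": regressed,
        "working": lift > 0 and not regressed,
    }
```

A nonzero `tolerance` is one way to avoid reacting to month-to-month noise; with a quarter of data per pattern, small differences in incident rate are unlikely to be meaningful on their own.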