If we treat agent accessibility of a system (CLI substrate, context bridge, verification layer) as a first-class design goal independent of stack shape, in which real-world cases would that push us to prefer architectures that are worse for traditional human-centric craft (e.g., over-segmented flows, stricter façades, heavier scripts), and how should teams decide when this tradeoff in taste and ergonomics is actually worth it for small-team leverage and ambition expansion?

dhh-agent-first-software-craft

Answer

Agent-accessible but less human-pleasant architectures are worth it only in specific, high-leverage situations and under clear guardrails.

  1. Where agent-accessible-but-uglier architectures make sense
  • Cross-system glue and ops

    • Case: many small, repetitive, error-prone integrations (billing reconciles, data syncs, backfills, migrations).
    • Pattern: stricter façades, many small scripts, CLI-first flows around stable contracts.
    • Why accept worse craft: humans rarely touch these paths directly; the main pain today is toil and context juggling, not local code elegance.
  • Designer/junior-driven UX flows

    • Case: designers or juniors run agent-first loops via CLI/MCP to build/iterate flows.
    • Pattern: over-segmented flow commands ("new_step", "tweak_copy", "add_edge_case"), verbose templates, rigid boundaries.
    • Tradeoff: less pleasant for a senior to hand-edit, but lets non-seniors safely ship with strong agent help and scripted verification.
  • High-churn, low-shared-context teams

    • Case: rotation, contractors, weak shared taste.
    • Pattern: heavier scripts, explicit pipelines, narrow façades that agents and newcomers can follow.
    • Reason: human-centric craft that relies on tacit context doesn’t survive; explicit, agentable structure does.
  • Safety-critical or policy-heavy paths

    • Case: money movement, access control, privacy.
    • Pattern: rigid boundary modules, policy scripts, approval CLIs, dense verification hooks.
    • Tradeoff: feels bureaucratic but gives agents unambiguous APIs and checks.
  • Org-wide utilities and infra surfaces

    • Case: internal tools used by many teams (logging, analytics, permissions, scaffolding).
    • Pattern: CLI/JSON contracts, small wrappers, scripted checks; less “beautiful code”, more predictable shapes.
    • Rationale: consistency and agent accessibility across teams beat local elegance.
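The CLI/JSON-contract pattern above can be made concrete with a minimal façade sketch. Everything here is illustrative, not a real tool: the `billing` program name, the `reconcile` subcommand, and the `--since` flag are hypothetical; the point is the stable machine-readable contract, not the internals behind it.

```python
#!/usr/bin/env python3
"""Hypothetical façade CLI: a stable, agent-friendly JSON contract
around an internal reconcile routine. All names are illustrative."""
import argparse
import json
import sys


def reconcile(since: str) -> dict:
    # Stand-in for the real internal call; the façade only promises
    # the JSON shape below, never the implementation behind it.
    return {"since": since, "matched": 0, "unmatched": 0}


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(prog="billing")
    sub = parser.add_subparsers(dest="command", required=True)
    rec = sub.add_parser("reconcile", help="reconcile ledger vs. charges")
    rec.add_argument("--since", required=True, help="ISO date, e.g. 2024-01-01")
    args = parser.parse_args(argv)

    # Always emit machine-readable JSON on stdout: agents parse it,
    # humans can still eyeball it.
    result = {"ok": True, "command": args.command, "data": reconcile(args.since)}
    json.dump(result, sys.stdout)
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

The "worse craft" shows up as ceremony a senior would not bother with for a one-off, but the fixed subcommand-plus-JSON shape is exactly what makes the path scriptable and verifiable by agents.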
  2. Where the tradeoff is usually not worth it
  • Core product domain where seniors work daily and tests are strong.
  • Small, stable, high-taste teams with low churn.
  • Code that is mainly read/modified by humans in deep debugging sessions.

Here, over-segmentation and heavy scripting can hurt flow, obscure intent, and slow expert work more than they help agents.

  3. How to decide if the tradeoff is worth it

Use a simple rubric per area/flow:

  • A) Who edits this most?

    • Mostly agents + juniors/designers + newcomers → bias toward agent-accessible structure.
    • Mostly seniors with deep context → keep human-centric craft; add thin agent affordances.
  • B) Error and coordination cost

    • High-risk or cross-team surface (money, auth, shared data) → accept heavier façades/scripts + verification to reduce coordination mistakes.
    • Local, low-risk feature work → prefer human ergonomics.
  • C) Churn and staff mix

    • High churn, few seniors → prioritize explicit CLI flows, strong context bridge, scripted checks.
    • Stable, senior-heavy → only agent-optimize the most repetitive or cross-system bits.
  • D) Agent contribution share

    • If >50% of diffs in an area are already agent-authored and review is the bottleneck, invest in:
      • Stricter façades and layouts.
      • Command-level flows ("implement_story", "harden_edges").
      • Verification scripts that align with those flows.
  • E) Verification difficulty

    • If humans struggle to see blast radius from diffs, move behavior behind clearer, smaller interfaces even if that adds files and scripts.
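One way to operationalize the rubric is a small per-area scoring helper. The signal names mirror points A–E above, but the weights and thresholds here are illustrative defaults for a team to tune, not prescriptive values.

```python
from dataclasses import dataclass


@dataclass
class Area:
    """Per-area signals from rubric points A-E. All fields illustrative."""
    name: str
    mostly_non_senior_editors: bool   # A: agents/juniors/newcomers edit most
    high_risk_or_cross_team: bool     # B: money, auth, shared data
    high_churn_few_seniors: bool      # C: staff mix
    agent_diff_share: float           # D: fraction of diffs agent-authored
    hard_to_see_blast_radius: bool    # E: verification difficulty


def agent_optimize_score(a: Area) -> int:
    """Count how many rubric signals favor agent-accessible structure."""
    return sum([
        a.mostly_non_senior_editors,
        a.high_risk_or_cross_team,
        a.high_churn_few_seniors,
        a.agent_diff_share > 0.5,     # the >50% threshold from point D
        a.hard_to_see_blast_radius,
    ])


def recommendation(a: Area) -> str:
    score = agent_optimize_score(a)
    if score >= 3:
        return "bias toward agent-accessible structure"
    if score >= 1:
        return "agent-optimize only the repetitive/cross-system bits"
    return "keep human-centric craft; add thin agent affordances"
```

Running this per area (rather than once for the whole codebase) matches the rubric's intent: the same team should get different answers for its billing glue and its core domain.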
  4. Guardrails so “ugly for humans” doesn’t rot the system
  • Constrain where you optimize for agents

    • Mark agent-optimized lanes (e.g., glue, ops, low-risk UX). Keep core domain and deep modeling human-shaped.
  • Keep human-verifiable anchors

    • For each agent-optimized area, ensure:
      • One obvious entry façade or flow file.
      • One short doc or script that shows how to run key verification.
  • Periodically sample for taste regressions

    • Have seniors review a small set of agent-heavy diffs per lane and adjust structure or harness when they see accumulating clutter.
  • Preserve apprenticeship surfaces

    • Make juniors explain agent diffs and scripts in reviews.
    • Keep at least some core flows in a style that teaches design and taste, not just harness operation.
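The "human-verifiable anchor" guardrail can be as small as one script per agent-optimized lane that lists and runs its key checks. This is a sketch: the check names and commands (`tests/billing`, `scripts/check_json_contract.py`) are placeholders for whatever the lane actually verifies.

```python
#!/usr/bin/env python3
"""Hypothetical per-lane verification anchor: one obvious place that
lists and runs the checks for an agent-optimized area."""
import subprocess
import sys

# The checks for this lane, in order. Commands are illustrative.
CHECKS = [
    ("unit tests", [sys.executable, "-m", "pytest", "tests/billing", "-q"]),
    ("contract lint", [sys.executable, "scripts/check_json_contract.py"]),
]


def run_checks(checks=CHECKS) -> bool:
    """Run each check, print a one-line verdict, return overall pass/fail."""
    ok = True
    for name, cmd in checks:
        result = subprocess.run(cmd)
        passed = result.returncode == 0
        print(f"{'PASS' if passed else 'FAIL'}: {name}")
        ok = ok and passed
    return ok


if __name__ == "__main__":
    sys.exit(0 if run_checks() else 1)
```

The value is less in the code than in the convention: a human (or an agent) dropped into an unfamiliar lane has exactly one file to read to learn what "verified" means there.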

Summary:

  • Prefer more agent-accessible, less human-ergonomic architectures in glue, cross-system, safety-critical, high-churn, and designer/junior-led areas where agents already do much of the work and errors are costly.
  • Preserve traditional craft in core domains and stable, senior-heavy teams.
  • Decide per area using who edits it, risk level, churn, agent share, and verification difficulty.
  • Use lanes, clear façades, and periodic sampling to keep the tradeoff from silently eroding overall taste and apprenticeship.