If we flip the usual assumption that monolith vs services vs CLI substrate is the core design choice, and instead treat apprenticeship decay as the primary risk to optimize against in agent-first teams, what contradictions appear in our current advice about agent accessibility, designer-owned harnesses, and token-efficient stacks? And in which concrete situations would a deliberately less agent-accessible, more frictionful stack produce better long-term judgment and taste than a highly optimized, agent-friendly architecture?

dhh-agent-first-software-craft | Updated at 2026-04-09 09:42

Answer

Key tension: most current advice pushes toward maximum agent accessibility and token-efficient stacks. If apprenticeship decay is the primary risk, some of that advice flips: you sometimes want friction, manual lanes, and less-accessible stacks so juniors are forced to see the system and exercise judgment.

Main contradictions

Agent accessibility vs learning depth

Prior line: more CLI/MCP surfaces and clearer façades are always good for agents (6f3d8f7e, ec182c91, c06fe0ad).
Contradiction when apprenticeship is primary:
- The better the façades and flows, the easier it is for juniors to stay at the “call this command” layer and never learn internals.
- A perfectly agent-shaped monolith can turn juniors into prompt operators over time.
Implication: some domains should stay partially opaque to agents so humans must read and design core flows.

Designer-owned harnesses vs engineer-led craft

Prior line: letting designers own harness flows increases ambition and UX fit (a0208d49, ec182c91).
Contradiction:
- If designers own most harness logic, juniors learn the product surface but not the underlying design constraints.
- Craft bar drifts from code-level patterns to prompt hacks that juniors copy but don’t understand.
Implication: harness ownership in core domains should be engineer-led or co-owned, with designers driving scenarios, not mechanics.

Token-efficient stacks vs exposure to complexity

Prior line: more token-efficient monoliths/CLIs help agents and humans reason locally (c06fe0ad, 6f3d8f7e, token-efficient stack).
Contradiction:
- Over-flattened façades and tiny local files can hide the global tradeoffs, concurrency issues, and failure modes that seniors once absorbed by working across larger spans of code.
- Juniors learn to patch flows, not to design boundaries.
Implication: some complexity should remain visible and occasionally painful so people practice whole-system thinking.

When less agent-accessible, more frictionful stacks help

Core domain and boundary design work

Context: Rails-style monolith; billing, auth, or permissions modules.
Pattern:
- No general-purpose “agent:mutate_billing” tools.
- Limited CLI/agent entrypoints; boundary changes require humans to touch 2–3 layers (model, service, integration tests).
- Reviews demand design notes, not just diffs.
Why better for judgment/taste:
- Forces juniors to understand invariants, data flows, and failure modes.
- Seniors see real reasoning, not only harness scripting.
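
One way to picture the "no general-purpose mutating tools" rule is a tool registry that refuses to register mutating tools for core domains. This is a minimal sketch under stated assumptions: `Tool`, `ToolRegistry`, `PROTECTED_DOMAINS`, and every tool name except `agent:mutate_billing` are hypothetical, and a real harness would likely enforce this differently.

```python
from dataclasses import dataclass, field

# Assumed list of core domains; the text names billing, auth, and permissions.
PROTECTED_DOMAINS = {"billing", "auth", "permissions"}


@dataclass(frozen=True)
class Tool:
    name: str      # illustrative tool name, e.g. "billing:explain_invoice_flow"
    domain: str    # the module the tool operates on
    mutates: bool  # True if the tool can change code or data


@dataclass
class ToolRegistry:
    tools: dict = field(default_factory=dict)

    def register(self, tool: Tool) -> None:
        # Refuse mutating tools in protected domains, so boundary changes
        # still require a human to touch the model, service, and tests.
        if tool.mutates and tool.domain in PROTECTED_DOMAINS:
            raise PermissionError(
                f"{tool.name}: mutating tools are not allowed in '{tool.domain}'"
            )
        self.tools[tool.name] = tool


registry = ToolRegistry()
registry.register(Tool("billing:explain_invoice_flow", "billing", mutates=False))
registry.register(Tool("reports:regenerate_csv", "reports", mutates=True))

try:
    registry.register(Tool("agent:mutate_billing", "billing", mutates=True))
except PermissionError as err:
    print(err)  # the mutating billing tool is rejected at registration time
```

Rejecting the tool at registration, rather than at call time, keeps the restricted surface visible to anyone reading the harness configuration.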

Apprenticeship lanes for juniors

Context: junior-heavy team using sidecar agent loops.
Pattern:
- Mark “learning lanes” where agents can explain and scaffold but cannot apply diffs or run mutating CLIs.
- Require juniors to hand-write or heavily rewrite key pieces (domain methods, boundaries, tricky tests).
Why better:
- Preserves struggle in a controlled slice of work.
- Builds taste around naming, control flow, and test design before full automation.
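
A learning lane could be expressed as a path-based gate that the harness consults before each agent action. The sketch below is illustrative only: the glob patterns, the action names (`explain`, `scaffold`, `apply_diff`), and the function names are assumptions, not part of any existing harness.

```python
from fnmatch import fnmatch

# Hypothetical glob patterns marking learning lanes; real paths would differ.
LEARNING_LANES = ["app/models/billing/*", "test/models/billing/*"]

# Inside a learning lane the agent may explain and scaffold, nothing more.
LANE_ALLOWED_ACTIONS = {"explain", "scaffold"}


def in_learning_lane(path: str) -> bool:
    """Return True if `path` matches any learning-lane pattern."""
    return any(fnmatch(path, pattern) for pattern in LEARNING_LANES)


def agent_may(action: str, path: str) -> bool:
    """Return True if the agent may perform `action` on `path`."""
    if in_learning_lane(path):
        return action in LANE_ALLOWED_ACTIONS
    # Outside learning lanes this sketch allows every action.
    return True


print(agent_may("explain", "app/models/billing/invoice.rb"))     # True
print(agent_may("apply_diff", "app/models/billing/invoice.rb"))  # False
print(agent_may("apply_diff", "app/helpers/date_helper.rb"))     # True
```

Keeping the lane list in one place lets seniors widen or narrow the lanes as juniors progress, without touching the rest of the harness.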

New architectural and pattern work

Context: introducing a new boundary, event pattern, or concurrency model.
Pattern:
- Deliberately keep harness support minimal (read-only explain tools, no scaffolding generators).
- Make early implementations manual and pair-based; add agent tooling only after patterns stabilize.
Why better:
- Early taste stays human; patterns are debated in prose and diagrams, not hard-coded into premature harness flows.

High-risk, high-teaching subsystems

Context: data migrations, security-sensitive flows, privacy logic.
Pattern:
- Agents may propose plans and checks but cannot execute without human-curated scripts.
- Stack interfaces remain slightly awkward (multi-step scripts, manual dry runs) so humans must think.
Why better:
- Friction slows work just enough for reflection and “what could go wrong?” judgment.
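
The "propose but not execute" rule can be sketched as a gate that runs a script only after a human has curated it and a dry run has been performed. Everything here is a hypothetical illustration: the digest allowlist, `curate`, and `MigrationRun` are invented names, and the dry run is a placeholder rather than a real execution.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical allowlist: SHA-256 digests of scripts a human has reviewed.
CURATED_SCRIPT_DIGESTS = set()


def digest(script_text: str) -> str:
    return hashlib.sha256(script_text.encode("utf-8")).hexdigest()


def curate(script_text: str) -> None:
    """Called by a human reviewer after reading the script."""
    CURATED_SCRIPT_DIGESTS.add(digest(script_text))


@dataclass
class MigrationRun:
    script_text: str
    dry_run_done: bool = False

    def dry_run(self) -> None:
        # Placeholder: a real dry run would execute against a copy of the
        # data and show the result to a human before anything is committed.
        self.dry_run_done = True

    def execute(self) -> str:
        if digest(self.script_text) not in CURATED_SCRIPT_DIGESTS:
            raise PermissionError("script has not been curated by a human")
        if not self.dry_run_done:
            raise RuntimeError("perform a manual dry run before executing")
        return "executed"


proposed = "UPDATE users SET email_verified = false WHERE email IS NULL;"
run = MigrationRun(proposed)
curate(proposed)      # human step: review, then allowlist the exact text
run.dry_run()         # human step: inspect the dry-run result
print(run.execute())  # prints "executed"
```

Because the allowlist stores a digest of the exact script text, any edit the agent makes after review invalidates the approval and sends the script back to a human.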

Early-stage teams with 1–2 seniors, many juniors

Context: greenfield monolith with few tests but ambitious learning goals.
Pattern:
- Keep agent accessibility narrow: docs, search, explanation, small refactors.
- Avoid broad “agent can touch any file” harnesses and cross-repo CLIs.
Why better:
- Seniors can intentionally choose where juniors get help vs where they need to wrestle with complexity.

Design directions to reconcile the tension

Capability-tiered agent surfaces
- Tier 0: read/explain only (broad coverage).
- Tier 1: low-risk glue/migrations behind strong tests (wider agent access).
- Tier 2: core domains where only seniors or designated apprentices can run mutating tools.

Harness modes: “teaching” vs “throughput”
- Teaching mode: slower, more verbose, requires human edits and explanations.
- Throughput mode: full diff application for trusted contributors and domains.

Intentional friction in style guides
- Explicitly forbid one-click flows for schema/boundary changes.
- Require design notes and manual test curation in certain directories.

Rotating manual weeks/areas
- Periodically run “agent-light” work in selected areas where juniors implement more by hand, using agents only for explanation and review.
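
The capability tiers and harness modes above can be combined into one small policy function. This is a sketch under assumptions: the role strings, the enum names, and the reading of Tier 0 as "never mutating" are interpretations of the bullets, not a specification.

```python
from enum import Enum, IntEnum


class Tier(IntEnum):
    READ_EXPLAIN = 0  # Tier 0: read/explain only
    LOW_RISK = 1      # Tier 1: glue and migrations behind strong tests
    CORE = 2          # Tier 2: core domains


class Mode(Enum):
    TEACHING = "teaching"
    THROUGHPUT = "throughput"


# Hypothetical role names; the text only distinguishes seniors and
# designated apprentices from everyone else.
CORE_ROLES = {"senior", "designated_apprentice"}


def may_run_mutating_tool(tier: Tier, role: str) -> bool:
    if tier == Tier.READ_EXPLAIN:
        return False  # assumed: Tier 0 surfaces never mutate
    if tier == Tier.CORE:
        return role in CORE_ROLES
    return True  # Tier 1: wider access


def may_apply_full_diff(tier: Tier, role: str, mode: Mode) -> bool:
    # Teaching mode never applies diffs wholesale; the human edits by hand.
    return mode is Mode.THROUGHPUT and may_run_mutating_tool(tier, role)


print(may_run_mutating_tool(Tier.CORE, "junior"))                    # False
print(may_apply_full_diff(Tier.LOW_RISK, "junior", Mode.TEACHING))   # False
print(may_apply_full_diff(Tier.LOW_RISK, "junior", Mode.THROUGHPUT)) # True
```

Separating the tier check from the mode check means a team can switch an area into teaching mode for a rotating manual week without redefining who is trusted in that area.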

Overall: once apprenticeship decay is primary, stack and harness choices must be optimized for where you want humans to struggle and think, not just where you can make agents fast. That often means fewer surfaces, more gated tools, and deliberately rough edges in the most important teaching areas.