If we invert the current framing and treat designers/product shapers as the primary owners of agent harnesses (CLI/MCP flows, prompts, verification scripts) while engineers act as safety net and systems custodians, which assumptions about who controls the craft bar, ambition frontier, and review bottleneck stop holding, and how might this role swap expose new failure modes or leverage in agent-first teams?
dhh-agent-first-software-craft | Updated at
Answer
Key shifts if designers own harnesses and engineers become custodians.
- Assumptions that break
- A1: “Engineers own the craft bar.”
- Breaks: harness prompts, flows, and verifiers now encode most of the craft bar; designers set many defaults (API shapes, CLI flows, test heuristics).
- A2: “Engineers alone set the ambition frontier.”
- Breaks: product shapers can cheaply script new flows and multi-system automations; ambition is gated more by designer taste and ops risk than engineer capacity.
- A3: “Review bottleneck sits in engineering code review.”
- Breaks: the bottleneck shifts to review of harness and flow design (prompt packs, verification scripts, tool wiring) rather than per-PR line review.
- A4: “Engineers define what is ‘safe enough’ to automate.”
- Breaks: designers can expose risky surfaces via harness tools unless custodians add hard policy gates.
- New leverage
- L1: Faster ambition tests
- Designers can spin up CLI/MCP flows to prototype features and cross-app workflows without waiting on engineers for each spike.
- L2: Closer fit between UX and automation
- The same people shaping product flows can shape agent flows (CLI verbs, arguments, checks), tightening UX–implementation feedback.
- L3: Richer verification scenarios
- Designers can author scenario- and behavior-focused scripts ("happy path", "edge persona") that agents run as part of verification.
- L4: Senior engineer leverage
- Custodians focus on boundaries, observability, performance, and hard failure modes instead of feature-by-feature diff cleanup.
- New failure modes
- F1: Craft bar drift
- Harness prompts encode inconsistent style and architecture as designers drift toward local convenience; engineers lose day-to-day visibility, so problems surface only later, as gradual erosion rather than as a single reviewable change.
- F2: Hidden coupling via harness
- Designer-owned flows call across boundaries and services in ways that bypass existing abstractions; architectural drift starts in the “automation layer” and works its way down into the code.
- F3: Review blind spots
- Teams review UI copy and flows but not harness diffs; prompts and scripts become an unreviewed codebase controlling large change volumes.
- F4: Safety gaps
- Designers expose powerful tools (e.g., billing, data deletions) without strong guardrails; custodians discover issues only through incidents.
- F5: Skill bifurcation
- Designers become strong “agent programmers” but weak in underlying code; engineers stay strong in code but lose influence on product-level taste.
- Mitigations and design patterns
- P1: Dual ownership of the craft bar
- Make craft bar a small shared spec (examples of good diffs, naming, boundaries) co-owned by a designer–engineer pair; harness prompts must reference it.
- P2: Harness PR lanes and review
- Treat harness assets (CLI specs, prompt packs, verification scripts) as first-class code:
- separate repo or directories,
- labeled PRs ("harness-flow", "harness-verify"),
- custodians review all cross-boundary or high-risk harness changes.
- P3: Guardrails in tools, not just norms
- Engineers encode hard limits in the harness:
- capability tiers per tool,
- no-go zones and risk flags (e.g., billing, auth),
- mandatory tests/approvals for certain flows.
- P4: Split reviews by question
- Designer reviewers: "Does this harness flow express the right product behavior and UX?"
- Engineer custodians: "Does this keep boundaries, performance, and safety intact?"
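The guardrails in P3 can be sketched as a policy check that runs before any flow executes. This is a minimal illustration, not a real harness API: the `Tier`, `Tool`, and `Flow` types, the `check_flow` function, and the tool names are all hypothetical, and the no-go list only reuses the billing and auth examples from the text.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical capability tiers, ordered from least to most risky."""
    READ_ONLY = 0
    WRITE = 1
    DESTRUCTIVE = 2


@dataclass(frozen=True)
class Tool:
    name: str
    tier: Tier
    domains: frozenset = frozenset()  # product areas the tool touches


@dataclass(frozen=True)
class Flow:
    name: str
    tools: tuple
    max_tier: Tier = Tier.READ_ONLY      # highest tier this flow may use
    approved_by_custodian: bool = False  # explicit engineer sign-off
    has_tests: bool = False              # verification scripts exist


# Assumed no-go zones; billing and auth are the examples named in the text.
NO_GO_DOMAINS = frozenset({"billing", "auth"})


def check_flow(flow: Flow) -> list:
    """Return policy violations; an empty list means the flow may run."""
    violations = []
    for tool in flow.tools:
        # Capability tiers: a tool may not exceed the flow's declared limit.
        if tool.tier > flow.max_tier:
            violations.append(
                f"{tool.name}: tier {tool.tier.name} exceeds {flow.max_tier.name}"
            )
        # No-go zones: risky domains need explicit custodian approval.
        risky = tool.domains & NO_GO_DOMAINS
        if risky and not flow.approved_by_custodian:
            violations.append(
                f"{tool.name}: touches {sorted(risky)} without custodian approval"
            )
    # Mandatory tests: any write-capable flow must ship verification.
    if any(t.tier >= Tier.WRITE for t in flow.tools) and not flow.has_tests:
        violations.append("write-capable flow has no verification tests")
    return violations


# Usage: a designer-authored flow that wires in a billing tool.
lookup = Tool("lookup_order", Tier.READ_ONLY)
refund = Tool("issue_refund", Tier.WRITE, frozenset({"billing"}))
flow = Flow("refund-assistant", (lookup, refund), max_tier=Tier.WRITE)

for violation in check_flow(flow):
    print(violation)
```

The point of the sketch is that the limits live in code the harness must pass through, so a designer cannot expose a billing tool by editing a prompt alone; the custodian's approval and the presence of tests are data the check reads, not norms it hopes people follow.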
- Net effect on craft bar, ambition frontier, review bottleneck
- Craft bar
- Moves from "how we write code" to "how we script flows"; must be co-policed via harness review and small reference examples.
- Ambition frontier
- Expands on UX and workflow dimensions (more experiments, more integrations); is now constrained by custodial risk gates and verification coverage.
- Review bottleneck
- Shifts from per-line code review to:
- harness change review,
- verification script design,
- periodic audits of agent behavior in production.
This swap can increase ambition and fit-to-product if harness work is treated as code with shared ownership and strong guardrails; without that, it mainly adds a new, poorly-governed layer where style, safety, and architecture can erode unnoticed.
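Treating harness work as code with split review (P2 and P4) could look like the routing sketch below. The directory layout, the `route_pr` function, and the reviewer role names are assumptions made for illustration; only the two labels, "harness-flow" and "harness-verify", come from the text. Detecting cross-boundary changes would need knowledge of the service graph and is left out here.

```python
from pathlib import PurePosixPath

# Hypothetical layout: the text only says harness assets live in a separate
# repo or in separate directories.
LANES = {
    "harness/flows": "harness-flow",
    "harness/prompts": "harness-flow",
    "harness/verify": "harness-verify",
}

# Assumed markers of high-risk surfaces, reusing the billing/auth examples.
HIGH_RISK_PARTS = {"billing", "auth"}


def route_pr(changed_paths):
    """Return (labels, reviewers) for a PR given its changed file paths."""
    labels, reviewers = set(), set()
    for path in changed_paths:
        label = next(
            (lane for prefix, lane in LANES.items() if path.startswith(prefix + "/")),
            None,
        )
        if label is None:
            continue  # not a harness asset; ordinary code review applies
        labels.add(label)
        # Designers answer: does this flow express the right product behavior?
        reviewers.add("designer")
        # Custodians answer: are boundaries and safety intact? Required only
        # when the harness change touches a high-risk area.
        if HIGH_RISK_PARTS & set(PurePosixPath(path).parts):
            reviewers.add("custodian")
    return sorted(labels), sorted(reviewers)


labels, reviewers = route_pr([
    "harness/flows/billing/refund.yaml",
    "harness/verify/refund_happy_path.py",
    "app/models/order.py",
])
print(labels, reviewers)
```

In this example the PR gets both harness labels and requires both reviewers, because one of the changed flow files sits under a `billing` directory. A prompt-only change outside the risky areas would be routed to a designer alone.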