As designer-implementers take on more ownership of agent harness flows and reversible hunch probes, what concrete apprenticeship patterns (shadowing, paired harness design, staged autonomy on change-types) best preserve deep engineering taste and reduce apprenticeship decay for juniors who mostly interact with agents instead of hand-writing large features, and how could you tell if the seniority gap is widening or shrinking under those patterns?

dhh-agent-first-software-craft | Updated at 2026-04-09 09:07

Answer

Bullet view:

Concrete apprenticeship patterns

Guided shadowing on flows
- Juniors sit with a senior designer-implementer for full agent-first loops (ticket → brief → harness flow → diff → review).
- After each run, junior writes a short “why this diff is good/bad” note.
- Senior comments directly on that note, not just the code.
Paired harness design sessions
- Weekly 60–90 min where junior and senior co-design or refactor 1–2 harness flows (CLI/MCP commands, prompts, checks).
- Senior narrates taste: boundaries, naming, risk lanes, what not to expose to agents.
- Junior operates the tools; senior only vetoes and explains.
Staged autonomy by change-type
- Tier 0: safe UX-only and cosmetic changes. Junior owns the harness flows and merges with light review.
- Tier 1: medium-risk app behavior. Junior proposes harness usage and verification; senior must OK plan and final diff.
- Tier 2: high-risk classes (money, auth, migrations, cross-system per 2aadec1c-*). Juniors only co-author; seniors decide.
- Promotion between tiers is gated on checklists: number of clean PRs, quality of briefs, and correctness of self-assessments.
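
The tier rules and promotion gate above could be encoded roughly as follows. This is a minimal sketch, not a prescribed implementation: the names `Tier`, `Policy`, `POLICY`, `PromotionRecord`, and `eligible_for_promotion` are hypothetical, and the thresholds (10 clean PRs, 80% rates) are illustrative assumptions, since the checklist criteria are named here but not quantified.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    COSMETIC = 0      # safe UX-only and cosmetic changes
    APP_BEHAVIOR = 1  # medium-risk app behavior
    HIGH_RISK = 2     # money, auth, migrations, cross-system


@dataclass(frozen=True)
class Policy:
    junior_role: str            # "owner", "proposer", or "co-author"
    senior_approves_plan: bool
    senior_approves_diff: bool


# Mirrors the three tiers described above.
POLICY = {
    Tier.COSMETIC: Policy("owner", False, False),      # light review only
    Tier.APP_BEHAVIOR: Policy("proposer", True, True),
    Tier.HIGH_RISK: Policy("co-author", True, True),   # seniors decide
}


@dataclass
class PromotionRecord:
    clean_prs: int                # PRs merged without a follow-up fix
    briefs_ok: int                # briefs accepted without a senior rewrite
    briefs_total: int
    self_assessments_ok: int      # self-assessments that matched review outcome
    self_assessments_total: int


def eligible_for_promotion(rec, min_clean_prs=10, min_rate=0.8):
    """Checklist gate; both thresholds are illustrative assumptions."""
    if rec.briefs_total == 0 or rec.self_assessments_total == 0:
        return False  # no evidence yet, so no promotion
    brief_rate = rec.briefs_ok / rec.briefs_total
    assess_rate = rec.self_assessments_ok / rec.self_assessments_total
    return (
        rec.clean_prs >= min_clean_prs
        and brief_rate >= min_rate
        and assess_rate >= min_rate
    )
```

Keeping the policy as data rather than scattered conditionals makes it easy for a senior to review the tier rules themselves as a diff.
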
Diff-reading reps without agents
- A fixed quota of “learning PRs” per week where the junior only reads and critiques diffs (their own or others’), including agent-generated ones.
- Focus prompts: boundaries, invariants, test sufficiency, naming.
Intent-to-diff reviews
- Before any non-trivial change, junior writes 5–10 line intent + harness plan (which tools, what checks, what they’re worried about).
- Senior reviews the plan quickly; junior then runs agents and implements.
- Post-merge, junior compares realized diff vs plan and notes gaps.
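
The post-merge comparison of plan against realized diff can be partly mechanized. The sketch below assumes a hypothetical `IntentPlan` record and `plan_gaps` helper, and assumes the plan lists expected file paths and named checks; neither the field names nor the example paths come from any real tool.

```python
from dataclasses import dataclass, field


@dataclass
class IntentPlan:
    summary: str
    planned_paths: set[str]      # files the junior expects to touch
    planned_checks: set[str]     # checks the junior intends to run
    worries: list[str] = field(default_factory=list)


def plan_gaps(plan, touched_paths, checks_run):
    """Compare the written plan with what actually happened."""
    return {
        # Files changed that the plan never mentioned (possible boundary leak).
        "unplanned_paths": sorted(touched_paths - plan.planned_paths),
        # Files the plan expected to change but the diff left alone.
        "untouched_planned_paths": sorted(plan.planned_paths - touched_paths),
        # Checks promised in the plan that were not run.
        "skipped_checks": sorted(plan.planned_checks - checks_run),
    }


plan = IntentPlan(
    summary="Rename the billing label",
    planned_paths={"app/views/billing.html"},
    planned_checks={"unit", "scenario:checkout"},
    worries=["label is reused in invoices"],
)
gaps = plan_gaps(
    plan,
    touched_paths={"app/views/billing.html", "app/models/invoice.py"},
    checks_run={"unit"},
)
```

The output is only a prompt for the junior's written note; explaining why an unplanned path was touched is where the learning happens.
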
Rotation through verification ownership
- Juniors own small verification scripts or scenario tests for one or two flows.
- They must keep these green across agent changes, learning where agents break invariants.

Signals the patterns are working (seniority gap shrinking)

Review and defect metrics
- For tier-0/1 work, junior PRs approach senior PRs on:
  - Review cycles per PR.
  - Share of comments about fundamentals (architecture, safety) vs nits.
  - Post-merge bug rate on junior-owned areas.
Harness and brief quality
- Juniors’ briefs and harness changes increasingly need no rewrite of:
  - Boundaries (no cross-service leaks without reason).
  - Risk tagging and lane choice.
  - Verification steps (fewer “add tests” comments).
Taste articulation
- Every 3–6 months, ask juniors to walk through:
  - One agent-generated diff they approved and why it fits.
  - One they rejected or heavily edited and why.
- Compare depth and consistency of reasoning over time.
Autonomy mix by change-type
- Rising share of medium-risk work that juniors can carry through with only one senior touch.
- Stable or falling incident rate in those areas.
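
The autonomy-mix signal can be computed from a simple list of medium-risk PRs. This is a rough sketch under assumptions: `TierOnePR`, `single_touch_share`, and `junior_incident_rate` are hypothetical names, and a "senior touch" is assumed to be counted per distinct senior intervention (review or edit).

```python
from dataclasses import dataclass


@dataclass
class TierOnePR:
    author_is_junior: bool
    senior_touches: int     # distinct senior interventions on this PR
    caused_incident: bool


def single_touch_share(prs):
    """Share of junior medium-risk PRs carried through with at most one senior touch."""
    junior = [p for p in prs if p.author_is_junior]
    if not junior:
        return 0.0
    return sum(p.senior_touches <= 1 for p in junior) / len(junior)


def junior_incident_rate(prs):
    """Share of junior medium-risk PRs that later caused an incident."""
    junior = [p for p in prs if p.author_is_junior]
    if not junior:
        return 0.0
    return sum(p.caused_incident for p in junior) / len(junior)


quarter = [
    TierOnePR(True, 1, False),
    TierOnePR(True, 1, False),
    TierOnePR(True, 3, True),
    TierOnePR(True, 0, False),
    TierOnePR(False, 0, False),  # senior-authored, ignored by both metrics
]
```

Reading the two numbers together matters: a rising single-touch share only counts as progress if the incident rate holds steady or falls.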

Signals the seniority gap is widening

Review pattern red flags
- Seniors rewriting large portions of junior-initiated harness flows or diffs.
- Many “looks fine, I guess” approvals on sizeable agent diffs, followed by cleanup PRs.
Agent over-reliance
- Juniors struggle to explain why a diff is safe or tasteful beyond “tests pass” and “agent suggested it.”
- Learning effort gravitates toward prompt tweaks instead of understanding core code paths.
Drift in high-risk areas
- Repeated boundary leakage or test gaps around auth, money, migrations, or cross-system flows.
- Seniors quietly taking those domains back, reducing junior exposure.
Skill-formation stall
- After 6–12 months, juniors are still uncomfortable starting work from a textual intent and rough plan; they feel safe only when starting from existing harness templates.

How to instrument this lightly

Tag PRs by:
- Author seniority.
- Change-type tier (0/1/2).
- Agent involvement level (none / assist / heavy).
Track over time:
- Review rounds and time-to-merge by tier and seniority.
- Bug/rollback rate by PR tag.
- Fraction of junior PRs where seniors mainly edit:
  - Plan/brief.
  - Harness flows.
  - Core code.
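
The tagging and tracking above needs little more than a grouped average. The sketch below assumes a hypothetical `PRTag` record filled in by hand or from PR labels; `summarize` and `review_round_gap` are illustrative helpers, not part of any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class PRTag:
    seniority: str        # "junior" or "senior"
    tier: int             # change-type tier: 0, 1, or 2
    agent_level: str      # "none", "assist", or "heavy"
    review_rounds: int
    hours_to_merge: float
    bug_or_rollback: bool


def summarize(prs):
    """Group PRs by (seniority, tier) and compute the tracked metrics."""
    groups = defaultdict(list)
    for pr in prs:
        groups[(pr.seniority, pr.tier)].append(pr)
    return {
        key: {
            "n": len(items),
            "mean_review_rounds": mean(p.review_rounds for p in items),
            "mean_hours_to_merge": mean(p.hours_to_merge for p in items),
            "bug_rate": sum(p.bug_or_rollback for p in items) / len(items),
        }
        for key, items in groups.items()
    }


def review_round_gap(summary, tier):
    """Junior minus senior mean review rounds for a tier; None if a group is missing."""
    junior = summary.get(("junior", tier))
    senior = summary.get(("senior", tier))
    if junior is None or senior is None:
        return None
    return junior["mean_review_rounds"] - senior["mean_review_rounds"]


prs = [
    PRTag("junior", 1, "heavy", 3, 20.0, True),
    PRTag("junior", 1, "assist", 1, 10.0, False),
    PRTag("senior", 1, "heavy", 1, 6.0, False),
    PRTag("senior", 1, "none", 1, 4.0, False),
]
summary = summarize(prs)
```

A gap trending toward zero quarter over quarter suggests the seniority gap is shrinking for that tier; with small samples, treat it as a conversation starter rather than a verdict.
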
Add quarterly qualitative checks:
- Short, structured interviews or written reviews where seniors rate each junior on:
  - Flow taste (does their harness use match team norms?).
  - Risk sense (do they assign changes to the right risk lane?).
  - Verification habit (do they propose reasonable checks unprompted?).
- Compare cohorts over time to see if new agent-era juniors converge toward pre-agent baselines faster or slower.
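
One way to make the cohort comparison concrete is to track the gap between a junior's quarterly ratings and a pre-agent baseline at the same tenure. This is a sketch under assumptions: a 1–5 rating scale, equal weighting of the three dimensions, and the helper names `quarter_score`, `gap_to_baseline`, and `is_converging` are all hypothetical.

```python
from statistics import mean

# The three dimensions seniors rate each quarter.
DIMENSIONS = ("flow_taste", "risk_sense", "verification_habit")


def quarter_score(ratings):
    """Average the three ratings for one quarter (assumed 1-5 scale)."""
    return mean(ratings[d] for d in DIMENSIONS)


def gap_to_baseline(cohort_quarters, baseline_scores):
    """Baseline score minus cohort score, per quarter since hire."""
    return [
        baseline - quarter_score(ratings)
        for ratings, baseline in zip(cohort_quarters, baseline_scores)
    ]


def is_converging(gaps):
    """True if the gap is smaller in the last observed quarter than in the first."""
    return len(gaps) >= 2 and abs(gaps[-1]) < abs(gaps[0])


agent_era_junior = [
    {"flow_taste": 3, "risk_sense": 3, "verification_habit": 3},
    {"flow_taste": 4, "risk_sense": 4, "verification_habit": 4},
]
pre_agent_baseline = [4.0, 4.5]  # illustrative numbers, not measured data

gaps = gap_to_baseline(agent_era_junior, pre_agent_baseline)
```

Comparing how quickly the gap closes across cohorts, rather than the raw scores, is what answers the faster-or-slower question.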