For designer-implementers who own key agent harness flows in a token-efficient stack (e.g., opinionated Rails monolith with CLI/MCP surfaces), which specific practices or artifacts—brief templates, lane presets, example diffs, verification scripts—most reliably teach taste and architectural judgment through everyday work, and how can teams detect and reverse apprenticeship decay when designers start shipping merge-worthy code via agents without actually absorbing the underlying design and implementation patterns?

dhh-agent-first-software-craft

Answer

  1. Practices and artifacts that teach taste and architectural judgment
  • Design briefs with taste hooks

    • Tiny template per change:
      • "intent" (user + domain),
      • "boundary" (which Rails context),
      • "happy path sketch" (controller → service → view),
      • "non-goals".
    • Force designer-implementer to name where the change lives and why.
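A minimal sketch of how a harness might enforce the brief template above. The section names mirror the four fields in the list; the brief contents and the `missing_sections` helper are illustrative, not a real harness API.

```ruby
# Hypothetical check: a design brief must name all four taste hooks
# before a harness run is allowed to start.
REQUIRED_SECTIONS = %w[intent boundary happy_path non_goals].freeze

# Returns the section names that are missing or blank.
def missing_sections(brief)
  REQUIRED_SECTIONS.reject { |key| brief[key].to_s.strip.length > 0 }
end

brief = {
  "intent"     => "Let billing admins void an invoice",
  "boundary"   => "billing",
  "happy_path" => "InvoicesController#void -> VoidInvoice service -> show view",
  "non_goals"  => "No partial voids; no refunds"
}

puts missing_sections(brief).empty? ? "brief ok" : "missing: #{missing_sections(brief).join(', ')}"
```

Failing fast here is the point: the designer-implementer has to name where the change lives before any code is generated.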
  • Lane presets for harness runs

    • explore: cheap spikes, low taste bar, sandbox only.
    • integrate: normal features, taste prompts on naming/boundary.
    • harden: high-risk; requires explicit boundary + rollback notes.
    • Each lane has a short, fixed checklist in the PR template.
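The lane presets above could live as plain configuration the harness and the PR template both read. This is a hedged sketch; the lane names follow the list, but the checklist items and hash shape are assumptions.

```ruby
# Hypothetical lane presets: one fixed checklist per lane, surfaced in
# the PR template. Items here are illustrative.
LANES = {
  explore:   { taste_bar: :low,    sandbox_only: true,
               checklist: ["spike only", "no production code paths"] },
  integrate: { taste_bar: :normal, sandbox_only: false,
               checklist: ["names match glossary", "boundary named in brief"] },
  harden:    { taste_bar: :high,   sandbox_only: false,
               checklist: ["explicit boundary note", "rollback notes attached"] }
}.freeze

# Render the fixed checklist for a PR template.
def pr_checklist(lane)
  LANES.fetch(lane)[:checklist].map { |item| "- [ ] #{item}" }.join("\n")
end

puts pr_checklist(:harden)
```

Keeping the checklist in one place means the harness prompt and the human-facing PR template cannot drift apart.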
  • Curated example diffs

    • 10–20 small Rails PRs tagged "good taste":
      • clean controller/service split,
      • consistent domain terms,
      • readable tests.
    • Linked from harness help: "show similar change" pulls 1–2 examples.
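One way "show similar change" could work is a small tagged index over the curated PRs. The PR numbers, tags, and titles below are invented for illustration; only the 1–2-example limit comes from the text.

```ruby
# Hypothetical in-memory index of "good taste" example PRs, tagged by area.
EXAMPLES = [
  { pr: 101, tags: %w[controller service], title: "Clean controller/service split" },
  { pr: 102, tags: %w[naming],             title: "Consistent domain terms in billing" },
  { pr: 103, tags: %w[tests],              title: "Tests as behavior stories" },
  { pr: 104, tags: %w[controller view],    title: "Readable happy-path view" }
].freeze

# "show similar change": return at most two examples sharing a tag.
def similar_changes(tag, limit: 2)
  EXAMPLES.select { |ex| ex[:tags].include?(tag) }.first(limit)
end

similar_changes("controller").each { |ex| puts "PR ##{ex[:pr]}: #{ex[:title]}" }
```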
  • Taste-focused review prompts (per PR)

    • 3 yes/no questions:
      • "Do new names match glossary and nearby files?"
      • "Is there a single obvious happy path file to read?"
      • "Do tests read as behavior stories, not wiring?"
    • Authors self-rate; reviewers spot-check.
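The self-rating step can be a trivially small artifact: record the three answers and surface any "no" for the reviewer's spot-check. The `flags` helper is an assumed shape, not a real tool.

```ruby
# Hypothetical self-rating: the three yes/no taste questions from the
# PR template, answered by the author.
TASTE_QUESTIONS = [
  "Do new names match glossary and nearby files?",
  "Is there a single obvious happy path file to read?",
  "Do tests read as behavior stories, not wiring?"
].freeze

# Returns the questions answered "no", for reviewers to spot-check.
def flags(answers)
  TASTE_QUESTIONS.zip(answers).reject { |_, yes| yes }.map(&:first)
end

puts flags([true, false, true])
```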
  • Minimal verification scripts

    • CLI tasks that encode local judgment:
      • smoke flows (end-to-end Rails requests),
      • domain invariants (e.g., billing totals),
      • log/metric sanity for new paths.
    • Designer-implementers own these alongside harness flows.
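A verification script in the "domain invariants" spirit could be as small as this: one check per invariant, run from a CLI task. The invoice shape and the billing rule (total equals the sum of line items, in cents) are illustrative assumptions.

```ruby
# Hypothetical domain-invariant check, e.g. "billing totals":
# an invoice total must equal the sum of its line items, in cents.
def billing_invariant_holds?(invoice)
  invoice[:total_cents] == invoice[:line_items].sum { |li| li[:amount_cents] }
end

invoice = {
  total_cents: 1500,
  line_items: [{ amount_cents: 1000 }, { amount_cents: 500 }]
}

abort("billing invariant violated") unless billing_invariant_holds?(invoice)
puts "billing invariant ok"
```

Because the designer-implementer writes the invariant, the script encodes their judgment about what "correct" means, not just what the agent happened to test.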
  • Boundary summaries in repo

    • Short README per boundary: its "job", main entrypoints, anti-goals.
    • Harness shows summary when agents touch that area.
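The "show summary when agents touch that area" hook reduces to a prefix lookup over touched paths. The directory prefixes and summary strings below are invented examples of what a boundary README's one-liner might say.

```ruby
# Hypothetical lookup: map touched file paths to the boundary summary
# the harness should surface. Prefixes and summaries are illustrative.
BOUNDARIES = {
  "app/billing/"   => "Billing: owns invoices and totals. Entry: InvoicesController. Anti-goal: no direct gateway calls.",
  "app/messaging/" => "Messaging: owns outbound email/SMS. Entry: Notifier. Anti-goal: no business rules."
}.freeze

def summaries_for(touched_paths)
  BOUNDARIES.select { |prefix, _| touched_paths.any? { |p| p.start_with?(prefix) } }
            .values
end

puts summaries_for(["app/billing/void_invoice.rb"])
```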
  2. Detecting apprenticeship decay

Signals over a few weeks:

  • PR review patterns

    • Many "OK but rename/extract" comments; same feedback repeated.
    • Correct code that ignores nearby patterns or helpers.
  • Test & incident shape

    • Passing tests that don’t catch obvious regressions in behavior.
    • Bugs from misused boundaries or duplicated flows, not syntax.
  • Designer explanations

    • Can’t explain why an agent-chosen pattern is acceptable.
    • Can’t sketch the flow they just shipped without reading the diff.
  • Harness usage

    • Heavy copy-paste of prompts/flows with no edits.
    • Always using highest-automation lane, rarely editing agent output.
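The harness-usage signals lend themselves to a cheap heuristic over recent run logs. The run shape, lane name, and thresholds here are illustrative assumptions, not calibrated values.

```ruby
# Hypothetical decay heuristic: flag a designer when nearly all runs use
# the highest-automation lane and agent output is rarely edited by hand.
def decay_signals(runs, max_lane:, lane_share: 0.9, edit_share: 0.1)
  auto   = runs.count { |r| r[:lane] == max_lane } / runs.size.to_f
  edited = runs.count { |r| r[:manually_edited] } / runs.size.to_f
  { always_max_lane: auto >= lane_share, rarely_edits: edited <= edit_share }
end

# Ten runs, all fully automated, none edited: both signals fire.
runs = Array.new(10) { { lane: :full_auto, manually_edited: false } }
p decay_signals(runs, max_lane: :full_auto)
```

Both signals firing for a few weeks would be the trigger to route the next work into a learning lane, as described below.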
  3. Reversing apprenticeship decay
  • Intentional "learning" lanes

    • Mark some tickets as learning: require
      • pre-brief written by designer,
      • first pass design without agent,
      • review comments focused on naming/boundaries.
  • Triad sessions (senior + designer + agent)

    • 30–60 minutes:
      • 10 min: sketch the flow on paper/whiteboard.
      • 30 min: let the agent implement; designer drives; senior narrates tradeoffs.
      • 10 min: quick retro: "what did the agent miss, what did we change, why?"
  • Review quotas by type

    • For each designer-implementer per week, sample N PRs for
      • "can you explain this architecture choice?"
      • "what else could we have done?"
    • If they can’t answer, route next work into learning lane.
  • Harness friction for deep work

    • In learning lane:
      • force smaller diffs,
      • require a short "what I learned" note in PR,
      • limit auto-fix tools; encourage manual edits.
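The learning-lane frictions above can be enforced mechanically in CI. This is a sketch under assumptions: the 150-line limit and the "What I learned" heading match are invented, and a real gate would read the diff and PR body from the forge's API.

```ruby
# Hypothetical learning-lane gate: reject oversized diffs and require a
# "What I learned" note in the PR body. The limit is illustrative.
MAX_CHANGED_LINES = 150

def learning_lane_ok?(changed_lines:, pr_body:)
  changed_lines <= MAX_CHANGED_LINES &&
    pr_body.match?(/^#+\s*what i learned/i)
end

puts learning_lane_ok?(changed_lines: 80,
                       pr_body: "## What I learned\nNamed the boundary first, then let the agent fill in.")
```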
  • Refresh examples and glossaries

    • Promote good PRs into the example set.
    • Keep domain glossary tight and visible in harness.

Evidence type: synthesis
Evidence strength: mixed