In agent-first Rails-style monoliths that already use probe lanes and lane-specific verification, what concrete signals in diffs, tests, and review comments can we promote into harness rules (e.g., auto-labels, checklists, gating scripts) so that more UX and glue-code changes become safely rubber-stampable without quietly normalizing lower taste or weakening apprenticeship in core domain areas?

dhh-agent-first-software-craft | Updated at 2026-04-09 09:26

Answer

Focus on simple, machine-detectable signals and keep taste/apprenticeship anchored in a few human-gated spots.

Diff-level signals → auto-labels & gates

Path + change-type
- Rule: if every changed file is under app/views, app/javascript, app/assets, config/locales, or app/services/*/glue/*, and nothing under app/models, app/boundaries, or db/migrate is touched, label the PR lane:ux_glue_lowrisk.
- Gate: allow auto-approve only when the label is present and tests pass.
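
A minimal sketch of the path rule and its gate, assuming the harness already has the list of changed paths (for example from `git diff --name-only`). The method names `lane_label` and `auto_approve_allowed?` are hypothetical, and the sketch reads the rule as "every changed file must sit in a low-risk directory".

```ruby
# Hypothetical labeler; the directory lists mirror the rule above.
LOW_RISK_PATTERNS = [
  %r{\Aapp/views/},
  %r{\Aapp/javascript/},
  %r{\Aapp/assets/},
  %r{\Aconfig/locales/},
  %r{\Aapp/services/[^/]+/glue/}
].freeze

PROTECTED_PREFIXES = %w[app/models/ app/boundaries/ db/migrate/].freeze

# Returns the lane label, or nil when the diff does not qualify.
def lane_label(changed_files)
  return nil if changed_files.empty?
  return nil if changed_files.any? { |f| f.start_with?(*PROTECTED_PREFIXES) }
  return nil unless changed_files.all? { |f| LOW_RISK_PATTERNS.any? { |re| re.match?(f) } }

  "lane:ux_glue_lowrisk"
end

# Gate: the label must be present and the test run must be green.
def auto_approve_allowed?(labels, tests_green)
  labels.include?("lane:ux_glue_lowrisk") && tests_green
end
```
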
No new public API or cross-boundary calls
- Script: fail checklist if diff adds/changes public methods in app/boundaries/** or adds new calls to app/boundaries/** from outside allowed dirs.
- Effect: core-domain moves stay human-gated; glue stays eligible for rubber-stamp.
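
One way such a script might look, as a rough sketch. It assumes the diff has been parsed into a hash of path to changed lines (each prefixed with `+` or `-`). Because a diff alone cannot distinguish public from private methods, the sketch conservatively flags any `def` added or removed under app/boundaries/. The `Boundaries::` namespace and the allowed caller directories are illustrative assumptions, not something the rule above specifies.

```ruby
# Conservative: any added or removed `def` under app/boundaries/ counts.
BOUNDARY_DEF_LINE = /\A[+-]\s*def\s+(?:self\.)?(\w+[?!=]?)/
BOUNDARY_REF = /\bBoundaries::\w+/                                      # hypothetical namespace
ALLOWED_CALLER_PREFIXES = %w[app/boundaries/ app/controllers/].freeze  # hypothetical allow-list

# diff_by_path: { "path" => ["+ added line", "- removed line", ...] }
def boundary_violations(diff_by_path)
  violations = []
  diff_by_path.each do |path, lines|
    lines.each do |line|
      next if line.start_with?("+++", "---") # skip diff file headers

      if path.start_with?("app/boundaries/")
        if (m = BOUNDARY_DEF_LINE.match(line))
          violations << "#{path}: method `#{m[1]}` added, removed, or changed"
        end
      elsif line.start_with?("+") && BOUNDARY_REF.match?(line) &&
            !path.start_with?(*ALLOWED_CALLER_PREFIXES)
        violations << "#{path}: new boundary reference from outside allowed dirs"
      end
    end
  end
  violations
end
```

An empty result keeps the PR eligible for rubber-stamping; any entry fails the checklist and sends the PR to human review.
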
Size and shape
- Auto-flag rubberstamp:candidate if LOC delta < N (e.g., 50), files ≤ 3, and no schema or boundary files touched.
- Hard block if any file matches db/migrate, config/initializers, or lib/core/**.
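
The size-and-shape rule can be sketched as a single classifier. This is an illustration under assumptions: "LOC delta" is taken to mean added plus removed lines, the `stats` hash shape is hypothetical, and anything under db/ or app/boundaries/ is treated as a schema or boundary file.

```ruby
HARD_BLOCK_PATTERNS = [%r{\Adb/migrate/}, %r{\Aconfig/initializers/}, %r{\Alib/core/}].freeze
SCHEMA_OR_BOUNDARY_PATTERNS = [%r{\Adb/}, %r{\Aapp/boundaries/}].freeze # assumed definition

# stats: { "path" => { added: Integer, removed: Integer } }
# Returns :blocked, :candidate, or :normal_review.
def rubberstamp_status(stats, max_loc: 50, max_files: 3)
  paths = stats.keys
  return :normal_review if paths.empty?
  return :blocked if paths.any? { |p| HARD_BLOCK_PATTERNS.any? { |re| re.match?(p) } }

  loc_delta = stats.values.sum { |s| s[:added] + s[:removed] }
  small = loc_delta < max_loc && paths.size <= max_files
  clean = paths.none? { |p| SCHEMA_OR_BOUNDARY_PATTERNS.any? { |re| re.match?(p) } }
  small && clean ? :candidate : :normal_review
end
```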

Test signals → lane-specific checklists

Scenario / story tests present
- Harness rule: for lane:ux_glue_lowrisk, require at least one of:
  - An updated system or feature spec tagged with the flow, or
  - A harnessed CLI story test (bin/story <flow> is green).
- If missing, drop auto-approve; require normal review.
Snapshot/screenshot diffs
- For UI: require updated snapshots or harness-generated before/after screenshots attached.
- Gate: no auto-approve if visual artifacts are missing or if more than K snapshots changed (a guard against broad CSS changes).
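
A small sketch of the visual gate, assuming the harness can list which snapshot or screenshot files a PR updated. The default of 5 for K and the return symbols are placeholders, not values from the text.

```ruby
# touches_ui:        whether the diff changes UI files
# changed_snapshots: paths of snapshots/screenshots updated or attached in the PR
# max_changed:       the K threshold (hypothetical default)
def visual_gate(touches_ui:, changed_snapshots:, max_changed: 5)
  return :not_applicable unless touches_ui
  return :needs_review_missing_artifacts if changed_snapshots.empty?
  return :needs_review_broad_change if changed_snapshots.size > max_changed

  :ok
end
```
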
Smoke coverage for glue
- For adapters/jobs/scripts dirs, require a minimal happy-path test (*_spec.rb with a single main example) that runs in CI.
- Harness can auto-generate/patch these; gate only on presence + green.

Review-comment signals → harness hints, not hard gates

Recurrent nits → codified style checks
- Mine past comments (e.g., “push this into a façade”, “avoid callbacks here”) and translate the five most frequent into static checks or RuboCop rules scoped to agent-heavy dirs.
- Example: if diff adds an ActiveRecord callback in app/models, require human review and label needs:senior_arch.
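
The callback example could be approximated with a plain diff scan before investing in a custom RuboCop cop. The sketch below assumes the same path-to-changed-lines hash as input and only looks at added lines that start with a standard Active Record callback macro; `callback_labels` is a hypothetical name.

```ruby
AR_CALLBACKS = %w[
  before_validation after_validation
  before_save around_save after_save
  before_create around_create after_create
  before_update around_update after_update
  before_destroy around_destroy after_destroy
  after_commit after_rollback after_initialize after_find after_touch
  after_create_commit after_update_commit after_destroy_commit after_save_commit
].freeze

# Matches an added line whose first token is a callback macro.
CALLBACK_LINE = /\A\+\s*(#{AR_CALLBACKS.join("|")})\b/

# diff_by_path: { "path" => ["+ added line", "- removed line", ...] }
def callback_labels(diff_by_path)
  hit = diff_by_path.any? do |path, lines|
    path.start_with?("app/models/") && lines.any? { |l| CALLBACK_LINE.match?(l) }
  end
  hit ? ["needs:senior_arch"] : []
end
```
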
Boundary warnings → checklist prompts
- When reviewers often ask “why is this crossing boundary X→Y?”, add a PR template question for any diff that touches both boundary dirs.
- Harness: if both app/boundaries/x and app/boundaries/y changed, block auto-approve and require the boundary note field to be filled.
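
As an illustration, the cross-boundary check might look like the following. It assumes each boundary lives in its own subdirectory of app/boundaries/ and that the PR template's boundary note arrives as a string; both method names are hypothetical.

```ruby
# Names of the boundary subdirectories touched by the diff.
def touched_boundaries(changed_files)
  changed_files.map { |f| f[%r{\Aapp/boundaries/([^/]+)/}, 1] }.compact.uniq
end

def boundary_note_gate(changed_files, boundary_note)
  crossed = touched_boundaries(changed_files)
  crossing = crossed.size >= 2
  {
    boundaries: crossed,
    auto_approve_blocked: crossing,                               # always human-reviewed
    note_missing: crossing && boundary_note.to_s.strip.empty?     # note field must be filled
  }
end
```
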
Apprenticeship hooks
- For lane:ux_glue_lowrisk PRs authored by juniors, require a short “reasoning” comment (1–2 bullets: what changed, why safe) before auto-approve is allowed.
- Harness enforces presence/length only; content stays human/taste-driven.
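
A presence-and-length check of this kind can stay very small. The sketch below assumes the reasoning comment uses `-` or `*` bullets and that the harness already knows, from elsewhere, whether the author counts as junior; the 20-character minimum is an arbitrary placeholder.

```ruby
# True when the comment has at least one bullet and every bullet has some substance.
# Content quality is deliberately not judged here.
def reasoning_ok?(comment_body, min_bullets: 1, min_chars: 20)
  bullets = comment_body.to_s.lines.map(&:strip).select { |l| l.match?(/\A[-*]\s+\S/) }
  return false if bullets.size < min_bullets

  bullets.all? { |b| b.sub(/\A[-*]\s+/, "").length >= min_chars }
end
```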

Guardrails against taste erosion

Taste-tiered directories
- Mark some dirs as tier:high_taste (core domain, key flows). Harness never auto-approves there, even for small diffs.
- Keep auto-approve confined to tier:utility and tier:ux_shell dirs.
Taste exemplars
- Link 2–3 example PRs per lane in the PR template. Harness posts them as hints, not checks.
- Agent prompts instruct agents to mimic these patterns; reviewers use them as quick taste baselines.

Guardrails against apprenticeship decay

Human-owned core moves
- Any diff that adds a new domain class, changes a boundary façade, or alters verification semantics gets a lane:core_judgment label and cannot be auto-approved.
Learning flags
- For junior-authored UX/glue PRs, encourage a learning tag; harness routes at least one in N to deeper human review even if it qualifies for rubber-stamp.
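
To guarantee "at least one in N" rather than "one in N on average", a counter works better than random sampling. The class below is a minimal in-memory sketch with a hypothetical name; a real harness would need to persist the counter, likely per author or per team.

```ruby
class DeepReviewSampler
  def initialize(every_n)
    raise ArgumentError, "every_n must be >= 1" if every_n < 1

    @every_n = every_n
    @since_last = 0
  end

  # Call once per rubber-stamp-eligible PR.
  # Returns true when this PR should go to deeper human review instead.
  def route_to_deep_review?
    @since_last += 1
    return false if @since_last < @every_n

    @since_last = 0
    true
  end
end
```

With `DeepReviewSampler.new(3)`, every third eligible PR is routed to a human, so no run of three consecutive rubber-stamps can occur.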

Net effect: use simple, inspectable signals (paths, size, tests, a few patterns mined from comments) to expand safe rubber-stamping on UX/glue, while keeping core-domain and verification changes firmly human-gated and preserving spaces where juniors must still think and explain.