In agent-first Rails-style monoliths that already use probe lanes and lane-specific verification, what concrete signals in diffs, tests, and review comments can we promote into harness rules (e.g., auto-labels, checklists, gating scripts) so that more UX and glue-code changes become safely rubber-stampable without quietly normalizing lower taste or weakening apprenticeship in core domain areas?
dhh-agent-first-software-craft | Updated at
Answer
Focus on simple, machine-detectable signals and keep taste/apprenticeship anchored in a few human-gated spots.
- Diff-level signals → auto-labels & gates
  - Path + change-type
    - Rule: if files ∈ `app/views`, `app/javascript`, `app/assets`, `config/locales`, or `app/services/*/glue/*` and no changes in `app/models`, `app/boundaries`, `db/migrate`, label `lane:ux_glue_lowrisk`.
    - Gate: only allow auto-approve when label present and tests pass.
  - No new public API or cross-boundary calls
    - Script: fail checklist if diff adds/changes `public` methods in `app/boundaries/**` or adds new calls to `app/boundaries/**` from outside allowed dirs.
    - Effect: core-domain moves stay human-gated; glue stays eligible for rubber-stamp.
  - Size and shape
    - Auto-flag `rubberstamp:candidate` if LOC delta < N (e.g., 50), files ≤ 3, and no schema or boundary files touched.
    - Hard block if any file matches `db/migrate`, `config/initializers`, or `lib/core/**`.
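
The diff-level rules above can be sketched as a small classifier. This is a minimal illustration, not a reference implementation: `DiffFile`, `classify`, and the regex lists are hypothetical names, the patterns treat each listed directory as a path prefix, and the thresholds simply reuse the example values from the text (N = 50, files ≤ 3).

```ruby
# Hypothetical sketch: derive labels from changed paths and diff size.
# DiffFile, classify and the pattern lists are illustrative names only.
DiffFile = Struct.new(:path, :added, :removed)

LOW_RISK = [
  %r{\Aapp/views/}, %r{\Aapp/javascript/}, %r{\Aapp/assets/},
  %r{\Aconfig/locales/}, %r{\Aapp/services/[^/]+/glue/}
].freeze
CORE               = [%r{\Aapp/models/}, %r{\Aapp/boundaries/}, %r{\Adb/migrate/}].freeze
SCHEMA_OR_BOUNDARY = [%r{\Adb/migrate/}, %r{\Aapp/boundaries/}].freeze
HARD_BLOCK         = [%r{\Adb/migrate/}, %r{\Aconfig/initializers/}, %r{\Alib/core/}].freeze

def hits?(path, patterns)
  patterns.any? { |re| re.match?(path) }
end

def classify(files, max_loc: 50, max_files: 3)
  paths  = files.map(&:path)
  labels = []
  return { labels: labels, hard_block: false } if paths.empty?

  # Lane label: every file is UX/glue and nothing core is touched.
  all_low_risk = paths.all? { |p| hits?(p, LOW_RISK) }
  touches_core = paths.any? { |p| hits?(p, CORE) }
  labels << "lane:ux_glue_lowrisk" if all_low_risk && !touches_core

  # Size and shape: small diff, few files, no schema or boundary files.
  loc   = files.sum { |f| f.added + f.removed }
  small = loc < max_loc && files.size <= max_files
  clean = paths.none? { |p| hits?(p, SCHEMA_OR_BOUNDARY) }
  labels << "rubberstamp:candidate" if small && clean

  { labels: labels, hard_block: paths.any? { |p| hits?(p, HARD_BLOCK) } }
end
```

A gating script would presumably allow auto-approve only when `lane:ux_glue_lowrisk` is present, `hard_block` is false, and CI is green; how the changed files are obtained (for example from `git diff --numstat`) is left out here.
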
- Test signals → lane-specific checklists
  - Scenario / story tests present
    - Harness rule: for `lane:ux_glue_lowrisk`, require at least one of:
      - Updated system/spec feature tagged with the flow, or
      - Harnessed CLI story test (`bin/story <flow>` green).
    - If missing, drop auto-approve; require normal review.
  - Snapshot/screenshot diffs
    - For UI: require updated snapshots or harness-generated before/after screenshots attached.
    - Gate: no auto-approve if visual artifacts missing or changed in more than K snapshots (guard against broad CSS changes).
  - Smoke coverage for glue
    - For adapters/jobs/scripts dirs, require a minimal happy-path test (`*_spec.rb` with a single main example) that runs in CI.
    - Harness can auto-generate/patch these; gate only on presence + green.
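
One way the test-signal checks above might be expressed is as a function that returns the reasons auto-approve should be dropped. Everything here is an assumption for illustration: the `TestSignals` fields, the default K of 5, the `app/adapters/` and `app/jobs/` locations, and the `app/...` to `spec/..._spec.rb` mapping are not taken from any real harness.

```ruby
# Hypothetical sketch: turn test signals into reasons to drop auto-approve.
TestSignals = Struct.new(
  :story_test_updated, :cli_story_green, :touches_ui,
  :visual_artifacts_attached, :snapshots_changed,
  keyword_init: true
)

def reasons_against_auto_approve(signals, max_snapshots: 5)
  reasons = []
  # At least one scenario/story signal is required for the low-risk lane.
  unless signals.story_test_updated || signals.cli_story_green
    reasons << "no scenario/story test for the flow"
  end
  if signals.touches_ui
    reasons << "visual artifacts missing" unless signals.visual_artifacts_attached
    # K guards against broad CSS changes that move many snapshots at once.
    reasons << "too many snapshots changed" if signals.snapshots_changed.to_i > max_snapshots
  end
  reasons
end

# Assumed glue locations and spec layout; adjust to the real repo.
GLUE_DIRS = %w[app/adapters/ app/jobs/].freeze

def expected_spec_path(path)
  path.sub(%r{\Aapp/}, "spec/").sub(/\.rb\z/, "_spec.rb")
end

# Returns the spec files that should exist for changed glue files but do not.
def missing_smoke_specs(changed_paths, existing_paths)
  changed_paths
    .select { |p| p.end_with?(".rb") && GLUE_DIRS.any? { |d| p.start_with?(d) } }
    .map { |p| expected_spec_path(p) }
    .reject { |spec| existing_paths.include?(spec) }
end
```

An empty result from both functions, plus a green CI run, would correspond to "gate only on presence + green"; neither function inspects test quality.
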
- Review-comment signals → harness hints, not hard gates
  - Recurrent nits → codified style checks
    - Mine past comments (e.g., “push this into a façade”, “avoid callbacks here”) and translate top 5 into static checks or RuboCop rules scoped to agent-heavy dirs.
    - Example: if diff adds an ActiveRecord callback in `app/models`, require human review and label `needs:senior_arch`.
  - Boundary warnings → checklist prompts
    - When reviewers often ask “why is this crossing boundary X→Y?”, add a PR template question for any diff that touches both boundary dirs.
    - Harness: if both `app/boundaries/x` and `app/boundaries/y` changed, block auto-approve and require the boundary note field to be filled.
  - Apprenticeship hooks
    - For `lane:ux_glue_lowrisk` PRs authored by juniors, require a short “reasoning” comment (1–2 bullets: what changed, why safe) before auto-approve is allowed.
    - Harness enforces presence/length only; content stays human/taste-driven.
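
The three review-comment rules above could be approximated with plain text checks over the diff. The sketch below is a rough heuristic under stated assumptions: the callback regex covers common Active Record callback macros but is not exhaustive, `block:boundary_note_missing` and `block:reasoning_missing` are invented label names, and the 40-character minimum is an arbitrary stand-in for "presence/length only".

```ruby
# Hypothetical sketch: hints derived from patterns reviewers keep flagging.
# Matches added diff lines that start with a callback macro (heuristic only).
CALLBACK_RE = /\A\+\s*(?:before|after|around)_(?:validation|save|create|update|destroy|commit|rollback|initialize|find|touch)(?:_commit)?\b/

def added_callbacks(path, patch)
  return [] unless path.start_with?("app/models/")
  patch.each_line
       .select { |line| CALLBACK_RE.match?(line) }
       .map { |line| line.delete_prefix("+").strip }
end

# Names of the boundary dirs (app/boundaries/<name>/...) touched by a diff.
def boundaries_touched(paths)
  paths.map { |p| p[%r{\Aapp/boundaries/([^/]+)/}, 1] }.compact.uniq
end

# Presence/length only; the content is left for a human to judge.
def reasoning_present?(comment, min_chars: 40)
  comment.to_s.strip.length >= min_chars
end

# patches: Hash of path => unified diff text for that file.
def review_hints(patches, boundary_note: "", junior: false, reasoning: "")
  hints = []
  if patches.any? { |path, patch| added_callbacks(path, patch).any? }
    hints << "needs:senior_arch"
  end
  if boundaries_touched(patches.keys).size >= 2 && boundary_note.to_s.strip.empty?
    hints << "block:boundary_note_missing"
  end
  hints << "block:reasoning_missing" if junior && !reasoning_present?(reasoning)
  hints
end
```

A regex over added lines will miss callbacks declared through metaprogramming and may flag commented-out code, so a real check would more likely live in a custom RuboCop cop; the text-level version only shows the shape of the rule.
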
- Guardrails against taste erosion
  - Taste-tiered directories
    - Mark some dirs as `tier:high_taste` (core domain, key flows). Harness never auto-approves there, even for small diffs.
    - Keep auto-approve confined to `tier:utility` and `tier:ux_shell` dirs.
  - Taste exemplars
    - Link 2–3 example PRs per lane in the PR template. Harness posts them as hints, not checks.
    - Agent prompts point to these exemplars as patterns to mimic; reviewers use them as quick taste baselines.
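
The taste-tier rule can be sketched as a prefix-to-tier lookup. The directory assignments below are hypothetical examples, not a recommended mapping; the one deliberate design choice is that any path without an explicit tier falls back to `tier:high_taste`, so new directories start human-gated.

```ruby
# Hypothetical sketch: map directories to taste tiers and gate on them.
# The prefix => tier assignments are examples only.
TIER_BY_PREFIX = {
  "app/models/"     => "tier:high_taste",
  "app/boundaries/" => "tier:high_taste",
  "app/views/"      => "tier:ux_shell",
  "app/javascript/" => "tier:ux_shell",
  "lib/tasks/"      => "tier:utility"
}.freeze

AUTO_APPROVABLE_TIERS = %w[tier:utility tier:ux_shell].freeze

def tier_for(path)
  _prefix, tier = TIER_BY_PREFIX.find { |prefix, _| path.start_with?(prefix) }
  tier || "tier:high_taste" # unknown paths default to the strictest tier
end

# True only when every changed path sits in an auto-approvable tier.
def tier_allows_auto_approve?(paths)
  !paths.empty? && paths.all? { |p| AUTO_APPROVABLE_TIERS.include?(tier_for(p)) }
end
```
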
- Guardrails against apprenticeship decay
  - Human-owned core moves
    - Any diff that adds a new domain class, changes a boundary façade, or alters verification semantics gets a `lane:core_judgment` label and cannot be auto-approved.
  - Learning flags
    - For junior-authored UX/glue PRs, encourage a `learning` tag; harness routes at least one in N to deeper human review even if it qualifies for rubber-stamp.
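
The two apprenticeship guardrails above might combine into a single routing step. In this sketch `LearningSampler` and `route` are hypothetical names, and the counter lives in memory, whereas a real harness would need to persist it between PRs. Counting `learning`-tagged PRs (rather than sampling randomly) is what makes "at least one in N" a guarantee.

```ruby
# Hypothetical sketch: route PRs so core moves stay human-owned and
# every Nth learning-tagged PR gets a deeper review.
class LearningSampler
  def initialize(every_n)
    raise ArgumentError, "every_n must be positive" unless every_n.positive?
    @every_n = every_n
    @since_last = 0
  end

  # True for every Nth PR carrying the `learning` tag.
  def deep_review?(labels)
    return false unless labels.include?("learning")
    @since_last += 1
    return false if @since_last < @every_n
    @since_last = 0
    true
  end
end

def route(labels, sampler, rubber_stamp_ok:)
  return :human_review if labels.include?("lane:core_judgment")
  return :deep_review  if sampler.deep_review?(labels)
  rubber_stamp_ok ? :auto_approve : :human_review
end
```
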
Net effect: use simple, inspectable signals (paths, size, tests, a few patterns mined from comments) to expand safe rubber-stamping on UX/glue, while keeping core-domain and verification changes firmly human-gated and preserving spaces where juniors must still think and explain.