In agent-first Rails-style monoliths that already use probe lanes and lane-specific verification, what concrete signals in diffs, tests, and review comments can we promote into harness rules (e.g., auto-labels, checklists, gating scripts) so that more UX and glue-code changes become safely rubber-stampable without quietly normalizing lower taste or weakening apprenticeship in core domain areas?

Answer

Focus on simple, machine-detectable signals and keep taste/apprenticeship anchored in a few human-gated spots.

  1. Diff-level signals → auto-labels & gates
  • Path + change-type

    • Rule: if the diff touches files under app/views, app/javascript, app/assets, config/locales, or app/services/*/glue/* and changes nothing under app/models, app/boundaries, or db/migrate, apply the label lane:ux_glue_lowrisk.
    • Gate: allow auto-approve only when the label is present and tests pass.
  • No new public API or cross-boundary calls

    • Script: fail checklist if diff adds/changes public methods in app/boundaries/** or adds new calls to app/boundaries/** from outside allowed dirs.
    • Effect: core-domain moves stay human-gated; glue stays eligible for rubber-stamp.
  • Size and shape

    • Auto-flag rubberstamp:candidate if LOC delta < N (e.g., 50), files ≤ 3, and no schema or boundary files touched.
    • Hard block if any file matches db/migrate, config/initializers, or lib/core/**.
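
The path and size rules above could be sketched roughly as follows. This is a minimal illustration, not an existing harness API: the names `classify_diff` and `auto_approve?`, the input shape, and the reading of "schema or boundary files" as the core prefixes are all assumptions; the directory lists and the example threshold of 50 lines come from the rules themselves.

```ruby
# Sketch of the path and size rules; names and input shape are hypothetical.
UX_GLUE_PREFIXES    = %w[app/views/ app/javascript/ app/assets/ config/locales/].freeze
GLUE_SERVICE_RE     = %r{\Aapp/services/[^/]+/glue/}
CORE_PREFIXES       = %w[app/models/ app/boundaries/ db/migrate/].freeze
HARD_BLOCK_PREFIXES = %w[db/migrate/ config/initializers/ lib/core/].freeze
MAX_LOC   = 50 # "N" in the text; 50 is the example value given there
MAX_FILES = 3

def ux_glue_path?(path)
  path.start_with?(*UX_GLUE_PREFIXES) || path.match?(GLUE_SERVICE_RE)
end

# changes: array of { path: String, loc: Integer }, loc = added + removed lines
def classify_diff(changes)
  paths        = changes.map { |c| c[:path] }
  hard_blocked = paths.any? { |p| p.start_with?(*HARD_BLOCK_PREFIXES) }
  touches_core = paths.any? { |p| p.start_with?(*CORE_PREFIXES) }

  labels = []
  labels << "lane:ux_glue_lowrisk" if paths.any? { |p| ux_glue_path?(p) } && !touches_core

  small = changes.sum { |c| c[:loc] } < MAX_LOC && paths.size <= MAX_FILES
  labels << "rubberstamp:candidate" if small && !touches_core && !hard_blocked

  { labels: labels, hard_blocked: hard_blocked }
end

# Gate: the lane label must be present, nothing hard-blocked, and tests green.
def auto_approve?(result, tests_green:)
  !result[:hard_blocked] &&
    result[:labels].include?("lane:ux_glue_lowrisk") &&
    tests_green
end
```

A harness would presumably feed this from the PR's changed-file list, for example `classify_diff([{ path: "app/views/users/show.html.erb", loc: 12 }])`, and apply the returned labels before any review step runs.
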
  2. Test signals → lane-specific checklists
  • Scenario / story tests present

    • Harness rule: for lane:ux_glue_lowrisk, require at least one of:
      • An updated system or feature spec tagged with the flow, or
      • Harnessed CLI story test (bin/story <flow> green).
    • If missing, drop auto-approve; require normal review.
  • Snapshot/screenshot diffs

    • For UI: require updated snapshots or harness-generated before/after screenshots attached.
    • Gate: no auto-approve if visual artifacts are missing or if more than K snapshots changed (a guard against broad CSS changes).
  • Smoke coverage for glue

    • For adapters/jobs/scripts dirs, require a minimal happy-path test (*_spec.rb with a single main example) that runs in CI.
    • Harness can auto-generate/patch these; gate only on presence + green.
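
One way to picture the test-signal checklist for lane:ux_glue_lowrisk is a small function that turns CI facts into a yes/no plus reasons. The `TestSignals` struct, its field names, and the value chosen for K are hypothetical; a real harness would populate these from its own CI output.

```ruby
# Sketch of the lane-specific test checklist; struct and field names are assumptions.
TestSignals = Struct.new(
  :flow_spec_updated,         # system/feature spec for the flow was updated
  :story_test_green,          # bin/story <flow> passed
  :ui_change,                 # diff touches UI files
  :visual_artifacts_attached, # snapshots or before/after screenshots present
  :snapshots_changed,         # number of changed snapshots
  :glue_change,               # diff touches adapters/jobs/scripts
  :smoke_spec_green,          # minimal happy-path spec exists and passes
  keyword_init: true
)

MAX_SNAPSHOTS_CHANGED = 5 # "K" in the text; the value 5 is an assumption

def test_gate(signals)
  reasons = []

  unless signals.flow_spec_updated || signals.story_test_green
    reasons << "no scenario or story test for the flow"
  end

  if signals.ui_change
    reasons << "visual artifacts missing" unless signals.visual_artifacts_attached
    if signals.snapshots_changed.to_i > MAX_SNAPSHOTS_CHANGED
      reasons << "too many snapshots changed"
    end
  end

  if signals.glue_change && !signals.smoke_spec_green
    reasons << "no green smoke spec for glue code"
  end

  # Any reason drops the PR back to normal review.
  { auto_approve: reasons.empty?, reasons: reasons }
end
```

Returning the reasons alongside the boolean keeps the gate inspectable: the harness can post them as a PR comment so the author sees why the change fell back to normal review.
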
  3. Review-comment signals → harness hints, not hard gates
  • Recurrent nits → codified style checks

    • Mine past comments (e.g., “push this into a façade”, “avoid callbacks here”) and translate top 5 into static checks or RuboCop rules scoped to agent-heavy dirs.
    • Example: if diff adds an ActiveRecord callback in app/models, require human review and label needs:senior_arch.
  • Boundary warnings → checklist prompts

    • When reviewers often ask “why is this crossing boundary X→Y?”, add a PR template question for any diff that touches both boundary dirs.
    • Harness: if both app/boundaries/x and app/boundaries/y changed, block auto-approve and require the boundary note field to be filled.
  • Apprenticeship hooks

    • For lane:ux_glue_lowrisk PRs authored by juniors, require a short “reasoning” comment (1–2 bullets: what changed, why safe) before auto-approve is allowed.
    • Harness enforces presence/length only; content stays human/taste-driven.
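
The three comment-derived checks above (callback additions, multi-boundary diffs, and the junior reasoning note) might look something like the sketch below. The function names, the callback regex, the `app/boundaries/<name>/` layout, and the minimum reasoning length are illustrative assumptions; the regex is a rough heuristic, not a parser.

```ruby
require "set"

# Rough pattern for common ActiveRecord callback macros (heuristic, not exhaustive).
CALLBACK_RE = /\A\s*(?:before|after|around)_(?:validation|save|create|update|destroy|commit|rollback)\b/
MIN_REASONING_CHARS = 20 # hypothetical threshold; only presence and length are checked

def callback_added?(added_lines_by_path)
  added_lines_by_path.any? do |path, lines|
    path.start_with?("app/models/") && lines.any? { |l| l.match?(CALLBACK_RE) }
  end
end

# Assumes boundaries live at app/boundaries/<name>/...
def boundaries_touched(paths)
  paths.filter_map { |p| p[%r{\Aapp/boundaries/([^/]+)/}, 1] }.to_set
end

def reasoning_present?(text)
  text.to_s.strip.length >= MIN_REASONING_CHARS
end

def review_hints(paths:, added_lines_by_path:, boundary_note: "", junior: false, reasoning: "")
  hints = { labels: [], block_auto_approve: [] }

  if callback_added?(added_lines_by_path)
    hints[:labels] << "needs:senior_arch"
    hints[:block_auto_approve] << "ActiveRecord callback added in app/models"
  end

  boundaries = boundaries_touched(paths)
  if boundaries.size >= 2
    hints[:block_auto_approve] << "touches boundaries: #{boundaries.sort.join(', ')}"
    hints[:block_auto_approve] << "boundary note missing" if boundary_note.to_s.strip.empty?
  end

  if junior && !reasoning_present?(reasoning)
    hints[:block_auto_approve] << "reasoning comment missing or too short"
  end

  hints
end
```

The reasoning check deliberately looks only at length, in line with the point that content stays a human judgment; whether the explanation is any good is left to the reviewer.
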
  4. Guardrails against taste erosion
  • Taste-tiered directories

    • Mark some dirs as tier:high_taste (core domain, key flows). Harness never auto-approves there, even for small diffs.
    • Keep auto-approve confined to tier:utility and tier:ux_shell dirs.
  • Taste exemplars

    • Link 2–3 example PRs per lane in the PR template. Harness posts them as hints, not checks.
    • Agent prompts instruct agents to mimic these patterns; reviewers use them as quick taste baselines.
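
Taste tiers can be expressed as a plain prefix-to-tier map that the harness consults before any auto-approve decision. Which directory belongs to which tier is a team decision; the mapping below is purely illustrative, and defaulting unmapped paths to tier:high_taste is an assumption chosen so that unknown territory falls to human review.

```ruby
# Illustrative tier map; the directory assignments are assumptions, not a recommendation.
TIER_BY_PREFIX = {
  "app/models/"     => "tier:high_taste",
  "app/boundaries/" => "tier:high_taste",
  "app/views/"      => "tier:ux_shell",
  "app/javascript/" => "tier:ux_shell",
  "lib/tasks/"      => "tier:utility"
}.freeze

AUTO_APPROVE_TIERS = %w[tier:utility tier:ux_shell].freeze

# Longest matching prefix wins; unmapped paths default to the strictest tier.
def tier_for(path)
  prefix = TIER_BY_PREFIX.keys.select { |p| path.start_with?(p) }.max_by(&:length)
  prefix ? TIER_BY_PREFIX[prefix] : "tier:high_taste"
end

# A single high-taste file disqualifies the whole diff, however small.
def auto_approve_allowed_by_tier?(paths)
  paths.all? { |p| AUTO_APPROVE_TIERS.include?(tier_for(p)) }
end
```

Because the check is per file and uses `all?`, a one-line edit in a high-taste directory still blocks auto-approve, which matches the "even for small diffs" rule.
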
  5. Guardrails against apprenticeship decay
  • Human-owned core moves

    • Any diff that adds a new domain class, changes a boundary façade, or alters verification semantics gets a lane:core_judgment label and cannot be auto-approved.
  • Learning flags

    • For junior-authored UX/glue PRs, encourage a learning tag; the harness routes at least one in every N such PRs to deeper human review, even when the PR qualifies for rubber-stamp.
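
The one-in-N routing for learning-tagged PRs can be as simple as a per-author counter. The `LearningSampler` class below is a hypothetical sketch that keeps counts in memory; a real harness would need to persist them between runs, and could sample randomly instead of on a fixed cadence.

```ruby
# Sketch: every Nth qualifying PR per author goes to deeper human review.
class LearningSampler
  def initialize(every_n)
    raise ArgumentError, "every_n must be positive" unless every_n.positive?

    @every_n = every_n
    @seen = Hash.new(0) # author => count of rubber-stamp-eligible PRs seen
  end

  # Call only for PRs that already qualified for rubber-stamp.
  def route(author)
    @seen[author] += 1
    (@seen[author] % @every_n).zero? ? :deep_review : :rubber_stamp
  end
end

sampler = LearningSampler.new(3)
routes = 6.times.map { sampler.route("junior_dev") }
# routes => [:rubber_stamp, :rubber_stamp, :deep_review,
#            :rubber_stamp, :rubber_stamp, :deep_review]
```

A fixed cadence guarantees the "at least one in N" floor per author; the trade-off is predictability, since an author can tell which PR will be sampled.
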

Net effect: use simple, inspectable signals (paths, size, tests, a few patterns mined from comments) to expand safe rubber-stamping on UX/glue, while keeping core-domain and verification changes firmly human-gated and preserving spaces where juniors must still think and explain.