For agent-first teams that use triad pairing, designer-owned harness flows, and explicit verification layers, which specific classes of changes (e.g., money movement, auth/permissions, migrations, cross-system integrations) should never rely on reversible hunch probes alone and instead require pre-declared verification plans or higher-review lanes, and how can teams codify those thresholds so judgment is consistent instead of depending on whoever reviews that day?

dhh-agent-first-software-craft

Answer

  1. Change classes that must not rely on hunch probes alone

Think: "if this goes wrong, can we cheaply undo it in prod?" If not, require a declared verification plan plus a higher-review lane.

A. Direct financial and legal changes

  • Money movement and balances
    • Examples: payouts, refunds, pricing, invoices, ledger writes, tax calc.
    • Policy: never merge on a reversible probe alone. Require:
      • Explicit test plan (unit + scenario + shadow/limit-run where feasible).
      • Dual review (domain + engineering) for any change touching posting/settlement.
  • Compliance / legal surfaces
    • Examples: KYC/AML rules, consent flows, GDPR/CCPA toggles.
    • Policy: pre-declared verification, sign-off from compliance/PM.

B. Identity, auth, and permissions

  • Auth flows and session handling
    • Examples: login, MFA, token refresh, SSO callbacks, session stores.
    • Policy: require security review lane + regression checklist.
  • Authorization / role changes
    • Examples: permission checks, role graphs, org/project scoping, feature flags tied to access.
    • Policy: explicit matrix of "who should still see/do what"; negative tests; at least one reviewer outside the author’s triad.

C. Irreversible or hard-to-rollback data changes

  • Schema and data migrations
    • Examples: destructive migrations, type changes, backfills, fan-out scripts.
    • Policy:
      • Written migration plan (pre-checks, back-out, sampling/spot-checks).
      • Staged rollout (dry-run or canary env; small-batch runs where possible).
  • Retention / deletion logic
    • Examples: anonymization, hard delete, archival jobs, TTL jobs.
    • Policy: require scenario tests on "must-never-delete" entities; sign-off from data owner.
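The "must-never-delete" policy above can be sketched as a guard plus a scenario test. A minimal sketch, assuming a simple record shape; the entity kinds, `PROTECTED_KINDS` set, and function names are illustrative, not from the source:

```python
# Hypothetical deletion job guarded by a "must-never-delete" check.
PROTECTED_KINDS = {"ledger_entry", "audit_log", "legal_hold"}

def delete_batch(records):
    """Delete a batch of records, refusing any batch that touches protected kinds."""
    blocked = [r["id"] for r in records if r["kind"] in PROTECTED_KINDS]
    if blocked:
        # Fail the whole batch loudly rather than silently skipping.
        raise ValueError(f"refusing to delete protected records: {blocked}")
    return [r["id"] for r in records]  # ids that would actually be deleted

# Scenario test the policy demands: the job must fail on protected entities.
def test_never_deletes_audit_logs():
    records = [{"id": 1, "kind": "stale_session"}, {"id": 2, "kind": "audit_log"}]
    try:
        delete_batch(records)
        raise AssertionError("protected record was deleted")
    except ValueError:
        pass
```

The point of the test is that it encodes the policy, not the implementation: it keeps passing only while the guard exists.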

D. Cross-system and cross-boundary integrations

  • External APIs and webhooks
    • Examples: payment providers, email/SMS, HR/CRM, billing, SSO IdPs.
    • Policy: define an explicit "golden path" scenario suite per integration; log/alert plan for the first days of traffic.
  • Internal cross-boundary flows
    • Examples: harness flows that hop across several bounded contexts or services.
    • Policy: mark as high-risk lane if the diff introduces a new multi-boundary path or changes the semantics of an existing one.

E. Safety, abuse, and security controls

  • Rate limits, abuse filters, fraud rules
    • Policy: plan must include expected false positive/negative examples and monitoring.
  • Security controls
    • Examples: CSRF/XSS fixes, input validation, encryption config, key handling.
    • Policy: security-reviewed checklist; static analysis (SAST) is necessary but not sufficient.

F. High-blast-radius infrastructure and observability

  • Deployment, rollback, feature-flag, and config systems
    • Policy: verification plan must include failure-mode exercises in non-prod.
  • Logging/metrics that gate alerts or SLOs
    • Policy: plan how to confirm signals still fire; dual review from infra/ops.

G. Reputation and trust surfaces

  • Outbound comms at scale
    • Examples: email/SMS campaigns, notification fan-outs, policy/legal text.
    • Policy: test cohorts, preview environments, explicit "who gets what" matrix.
  2. How to codify thresholds so they are consistent

A. Add a simple change-type taxonomy in the harness

  • Required change-type tags (multi-select):
    • money_movement
    • auth_session
    • authz_permissions
    • schema_migration
    • bulk_data_change
    • cross_system_integration
    • security_control
    • abuse_fraud
    • infra_control_plane
    • high_volume_comms
  • Harness auto-suggests tags based on diff (paths, keywords, tools), but humans confirm.
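The taxonomy above can live as plain harness config. A minimal sketch, using the tag names from the text; the validation helper and its behavior are assumptions:

```python
# Change-type taxonomy as harness config (tag names from the taxonomy above).
CHANGE_TYPES = {
    "money_movement", "auth_session", "authz_permissions",
    "schema_migration", "bulk_data_change", "cross_system_integration",
    "security_control", "abuse_fraud", "infra_control_plane",
    "high_volume_comms",
}

def validate_tags(tags):
    """Multi-select: require at least one known tag, reject unknown ones."""
    unknown = set(tags) - CHANGE_TYPES
    if unknown:
        raise ValueError(f"unknown change-type tags: {sorted(unknown)}")
    if not tags:
        raise ValueError("at least one change-type tag is required")
    return sorted(set(tags))
```

Keeping the taxonomy as one flat set makes the auto-suggest rules and lane routing below trivially checkable against it.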

B. Map change-types to lanes and verification templates

  • Lanes:
    • probe_only – reversible hunch probes, low risk.
    • standard – usual PR + tests.
    • risk_review – requires pre-declared plan + extra reviewer.
    • guardrail_change – modifies the verification layer or harness itself.
  • Routing examples:
    • Any money_movement, authz_permissions, schema_migration, bulk_data_change, security_control, infra_control_plane ⇒ force risk_review.
    • cross_system_integration, abuse_fraud, high_volume_comms ⇒ risk_review by default; override requires explicit justification.
  • For risk_review lane, harness injects a short, fixed template:
    • Impacted surfaces
    • Plan to verify (tests, staging, canary, monitoring)
    • Rollback/mitigation
    • Who must review (roles, not names)
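The routing rules above can be sketched as a small pure function. The force/default sets mirror the text; the function name, override mechanics, and template field names are assumptions:

```python
# Change-type → lane routing, per the rules above.
FORCE_RISK_REVIEW = {
    "money_movement", "authz_permissions", "schema_migration",
    "bulk_data_change", "security_control", "infra_control_plane",
}
DEFAULT_RISK_REVIEW = {"cross_system_integration", "abuse_fraud", "high_volume_comms"}

# Fixed template the harness injects for the risk_review lane (field names assumed).
RISK_REVIEW_TEMPLATE = [
    "impacted_surfaces", "verification_plan", "rollback_mitigation", "reviewer_roles",
]

def route(tags, override_reason=None):
    """Pick a lane from change-type tags; only default-risk tags can be overridden."""
    tags = set(tags)
    if tags & FORCE_RISK_REVIEW:
        return "risk_review"          # forced: no override allowed
    if tags & DEFAULT_RISK_REVIEW:
        if override_reason:
            return "standard"         # downgraded with explicit justification
        return "risk_review"
    return "standard"
```

Making the forced set non-overridable in code, rather than in a policy doc, is what keeps the threshold consistent across reviewers.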

C. Encode non-negotiable checks per change-type

  • Example mappings:
    • money_movement ⇒ require:
      • At least one golden-path integration test.
      • At least one "wrong amount" / double-charge scenario.
      • Named reviewer with domain role ("finance owner").
    • schema_migration ⇒ require:
      • Pre-check query, backout strategy, data sampling query scripted.
      • Staged run command documented in PR.
    • authz_permissions ⇒ require:
      • Before/after permission table for key roles.
      • At least one "should still be denied" test.
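These per-type requirements can be enforced mechanically by diffing a PR's declared evidence against a required-checks map. A sketch, assuming PRs declare evidence as a list of check names; the check identifiers are illustrative renderings of the bullets above:

```python
# Non-negotiable checks per change-type (check names paraphrase the text).
REQUIRED_CHECKS = {
    "money_movement": {
        "golden_path_integration_test", "wrong_amount_scenario", "finance_owner_review",
    },
    "schema_migration": {
        "precheck_query", "backout_strategy", "sampling_query", "staged_run_documented",
    },
    "authz_permissions": {
        "permission_table_before_after", "still_denied_test",
    },
}

def missing_checks(tags, evidence):
    """Return the checks a PR still owes, given its tags and declared evidence."""
    required = set().union(*(REQUIRED_CHECKS.get(t, set()) for t in tags)) if tags else set()
    return sorted(required - set(evidence))
```

A non-empty result blocks merge in the risk_review lane; tags without an entry in the map simply add no required checks.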

D. Use auto-detection + soft-fail warnings

  • Harness rules:
    • Path-based detection (e.g., app/models/ledger, db/migrate, auth/, infra/).
    • Keyword-based detection in diff ("charge", "refund", "payout", "role", "permission", "DELETE FROM", "TRUNCATE").
    • Tool-based detection (use of migration runners, billing clients, auth clients, feature-flag APIs).
  • If a PR looks high-risk but is tagged probe_only or standard, the harness should:
    • Warn: "This diff touches X; expected lane: risk_review."
    • Require an explicit override reason to merge.
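The detection rules above reduce to path prefixes plus keyword regexes, with a soft-fail check on the declared lane. A sketch; the specific prefixes and keywords come from the text, while the diff shape and function names are assumptions:

```python
import re

# Path- and keyword-based detection rules (examples from the text).
PATH_RULES = {
    "app/models/ledger": "money_movement",
    "db/migrate": "schema_migration",
    "auth/": "auth_session",
    "infra/": "infra_control_plane",
}
KEYWORD_RULES = {
    r"\b(charge|refund|payout)\b": "money_movement",
    r"\b(role|permission)\b": "authz_permissions",
    r"\b(DELETE\s+FROM|TRUNCATE)\b": "bulk_data_change",
}

def suggest_tags(changed_paths, diff_text):
    """Auto-suggest change-type tags from touched paths and diff content."""
    tags = {tag for prefix, tag in PATH_RULES.items()
            if any(path.startswith(prefix) for path in changed_paths)}
    tags |= {tag for pattern, tag in KEYWORD_RULES.items()
             if re.search(pattern, diff_text)}
    return tags

def check_lane(declared_lane, suggested_tags, override_reason=None):
    """Soft-fail: warn, don't block, when a risky-looking diff sits in a low lane."""
    if suggested_tags and declared_lane in {"probe_only", "standard"} and not override_reason:
        return f"WARN: diff looks like {sorted(suggested_tags)}; expected lane: risk_review"
    return "OK"
```

Humans still confirm the tags; the regexes only have to be good enough to force a conscious override, not to classify perfectly.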

E. Normalize review expectations

  • For each lane, predefine:
    • Required reviewer roles (e.g., domain owner, infra, security, product).
    • Minimum review depth: "read diff only", "run tests + read plan", "pair-review live".
  • Publish a one-page policy: "When in doubt, choose the higher lane; reviewers may downgrade with note."

F. Log and audit lane usage

  • Track:
    • Distribution of change-types vs lanes.
    • Overrides where auto-detect suggested risk_review but lane stayed lower.
  • Run periodic retro:
    • Sample mis-laned PRs and adjust rules.
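The audit above needs only a handful of counters over merged PRs. A minimal sketch, assuming each PR record carries its declared lane, the auto-suggested lane, and any override reason (the record shape is an assumption):

```python
from collections import Counter

def lane_report(prs):
    """Summarize lane usage and downgrades for a periodic retro.

    prs: iterable of dicts with 'lane', 'suggested_lane', 'override_reason'.
    """
    distribution = Counter(pr["lane"] for pr in prs)
    downgrades = [pr for pr in prs
                  if pr["suggested_lane"] == "risk_review"
                  and pr["lane"] != "risk_review"]
    return {
        "lane_distribution": dict(distribution),
        "downgrade_count": len(downgrades),
        # Downgrades with no recorded justification are the retro's first sample.
        "unjustified_downgrades": sum(1 for pr in downgrades
                                      if not pr.get("override_reason")),
    }
```

A rising unjustified-downgrade count is the early signal that the rules have become "noisy theater" (the failure case below) and need tuning.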
  3. Evidence type and limits
  • Evidence type: synthesis
  • Evidence strength: mixed
  4. Key assumptions
  • Agents can and do meaningfully change these high-risk areas in an agent-first workflow.
  • Harnesses can inspect diffs, paths, and tools with enough fidelity to classify change-types.
  • Teams will accept light extra process for high-risk changes if templates stay short.
  • Domain owners for money, auth, data, and infra surfaces exist and can review.
  5. Competing hypothesis
  • A simpler rule ("all changes go through the same review bar, plus good tests") may be enough; fine-grained lanes and verification templates add complexity without much extra safety, because real incidents are dominated by a few large, obvious changes.
  6. Main failure case / boundary
  • In very small or immature teams without clear ownership or good tests, people may mis-tag to avoid friction, reviewers rubber-stamp plans, and the harness rules become noisy theater rather than real protection.
  7. Verification targets
  • Examine incident/RCA history: would proposed change-types + risk_review have caught or softened recent real failures?
  • Pilot auto-detection + lane mapping on one team for 1–2 months and track: mis-laned PRs, override frequency, and any caught issues vs control.
  • Measure PR cycle time and reviewer load before/after to ensure safety gains don’t make the process unusably slow.
  8. Open questions
  • What is the smallest useful set of change-types that still catches most severe failures?
  • Can agents reliably draft verification plans from diff + ticket, to keep friction low?
  • How often should thresholds and mappings be revisited as the codebase and risk profile change?