In small, implementation-abundant monolith teams that already use drift metrics, change-intent lanes, and bug-class verification, what concrete signs in day-to-day work (e.g., PR rework patterns, incident postmortems, standup discussions) show that the review bottleneck has shifted from diff-level correctness to higher-order judgment—such as problem framing, boundary reshaping, or pattern blessing—and how could harnesses be tuned to surface exactly those higher-order decisions earlier in the loop instead of at final review time?

Answer

Signs and harness tweaks, in compact form.

Concrete day-to-day signs the bottleneck is now higher-order

Many "scope / framing" comments, few bug comments
- Review comments read "wrong problem", "collapse these flows", or "this belongs in X boundary" rather than "null check" or "off-by-one".
Late rewrites of otherwise-correct diffs
- Agents produce merge-worthy code, but seniors ask for major reshapes (move feature to another module, change API, delete and re-scope).
Repeated pattern or boundary debates on similar PRs
- Same arguments about where logic lives or which helper/pattern to use; correctness rarely contested.
Many reopened or follow-up PRs for design, not bugs
- Post-merge follow-ups mostly rename, extract, move, or unify patterns; few fix functional regressions.

Incidents from "wrong behavior" or coupling, not simple mistakes
- Root causes: wrong boundary, missing domain rule, confused ownership, or over-coupled flows, rather than syntax errors or trivial races.
Fix is conceptual, not local
- Postmortems describe a new abstraction, a boundary move, or a new policy in the harness, not merely "add test" or "fix index".
Same conceptual issue appears across areas
- Several incidents all trace back to an unclear contract, pattern, or boundary that review did not force a decision on earlier.

Standups dominated by "what are we really solving?" questions
- Time goes to reframing tickets, merging or splitting work, redefining acceptance; very little on "can we build it".
PRs called out for "needs taste/arch review" even when tests are green
- Teams route work based on judgment needs, not correctness risk.
Design docs lag behind PRs
- Review conversations keep re-doing design that never got pinned before coding.

Harness tuning to surface higher-order decisions earlier

Add or refine lanes:
- boundary_reshape, new_pattern, contract_change, scope_question.
Auto-suggest lanes from diffs:
- Many files across boundaries → suggest boundary_reshape.
- New core helper/service/DSL → suggest new_pattern.
- Changes to public APIs/events → suggest contract_change.
Require a tiny decision stub in these lanes:
- 3 bullets: "What changed", "Options considered", "Why this".
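
The lane heuristics above can be sketched in a few lines. This is a minimal illustration, not a real harness API: the path prefixes, the `ChangedFile` type, and the `suggest_lanes` function are all hypothetical, and scope_question is left to humans because no diff heuristic is given for it.

```python
from dataclasses import dataclass

# Hypothetical layout of the monolith; a real repo would define its own.
BOUNDARIES = ("billing/", "orders/", "accounts/")
PUBLIC_API_MARKERS = ("api/", "events/")
SHARED_PATTERN_DIRS = ("core/", "shared/")


@dataclass(frozen=True)
class ChangedFile:
    path: str
    is_new: bool = False


def boundary_of(path):
    """Return the boundary prefix a path belongs to, or None."""
    for prefix in BOUNDARIES:
        if path.startswith(prefix):
            return prefix
    return None


def suggest_lanes(files, boundary_threshold=2):
    """Suggest judgment lanes for a diff; a human still confirms them."""
    lanes = []
    touched = {boundary_of(f.path) for f in files} - {None}
    if len(touched) >= boundary_threshold:
        lanes.append("boundary_reshape")
    # A new file in a shared directory is treated as a candidate new pattern.
    if any(f.is_new and f.path.startswith(SHARED_PATTERN_DIRS) for f in files):
        lanes.append("new_pattern")
    if any(marker in f.path for f in files for marker in PUBLIC_API_MARKERS):
        lanes.append("contract_change")
    return lanes


diff = [
    ChangedFile("billing/api/invoices.py"),
    ChangedFile("orders/checkout.py"),
    ChangedFile("core/retry_helper.py", is_new=True),
]
print(suggest_lanes(diff))
# ['boundary_reshape', 'new_pattern', 'contract_change']
```

Path-based rules are deliberately crude here; the point is that a suggestion arrives before review starts, and the author can override it.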

For flagged lanes, have the harness/agent emit a short card before human review:
- Boundary card: "Here’s the current vs proposed call graph and ownership."
- Pattern card: "This new helper overlaps with A/B; options: bless, localize, reject."
- Contract card: "Downstream callers; potential breakages; migration sketch."
Show these cards in PR header or as a pre-review CLI step so seniors decide structure before nits.
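
As a rough sketch of what a card could look like in code, the pattern card might be a small record rendered to one line for the PR header. The `PatternCard` class and the helper names in the example are hypothetical; finding the overlapping helpers is assumed to happen elsewhere in the harness.

```python
from dataclasses import dataclass, field


@dataclass
class PatternCard:
    new_helper: str
    overlaps_with: list = field(default_factory=list)
    options: tuple = ("bless", "localize", "reject")

    def render(self):
        """Render the card as one line suitable for a PR header."""
        overlap = "/".join(self.overlaps_with) or "nothing known"
        return (
            f"[pattern card] {self.new_helper} overlaps with {overlap}; "
            f"options: {', '.join(self.options)}."
        )


card = PatternCard("retry_with_backoff", ["http_retry", "job_retry"])
print(card.render())
# [pattern card] retry_with_backoff overlaps with http_retry/job_retry; options: bless, localize, reject.
```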

For PRs / tickets in these lanes, auto-generate prompts:
- "What boundary owns this?", "Should this be a shared pattern?", "What’s the minimal contract change?"
Surface them on standup boards:
- Column for "Open boundary/pattern calls" with owner and due time.

Track review-tag ratios:
- Tag comments as correctness, style, boundary, pattern, or scope; a rising share of boundary, pattern, and scope comments is your quantitative signal.
Track rework reasons on PRs:
- Small template on merge: "What were the main changes after first review?" (bugs, naming, boundary, pattern, scope).
Feed these back into harness defaults:
- If many PRs in area X get boundary rework, auto-escalate new PRs there to boundary_reshape lane with required decision stub.
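
Both metrics are simple to compute once tags and rework reasons are recorded. The sketch below assumes tags arrive as plain strings and rework reasons as (area, reason) pairs; the function names, the `min_prs` floor, and the 0.5 threshold are illustrative assumptions, not recommended values.

```python
from collections import Counter

HIGHER_ORDER_TAGS = {"boundary", "pattern", "scope"}


def higher_order_share(comment_tags):
    """Fraction of review comments tagged boundary, pattern, or scope."""
    counts = Counter(comment_tags)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[tag] for tag in HIGHER_ORDER_TAGS) / total


def areas_to_escalate(rework_log, min_prs=3, threshold=0.5):
    """Areas whose merged PRs were mostly reworked for boundary reasons.

    rework_log holds one (area, rework_reason) pair per merged PR.
    """
    per_area = {}
    for area, reason in rework_log:
        per_area.setdefault(area, []).append(reason)
    return sorted(
        area
        for area, reasons in per_area.items()
        if len(reasons) >= min_prs
        and reasons.count("boundary") / len(reasons) >= threshold
    )


tags = ["correctness", "boundary", "scope", "style"]
print(higher_order_share(tags))  # 0.5

log = [
    ("billing", "boundary"),
    ("billing", "boundary"),
    ("billing", "bugs"),
    ("orders", "naming"),
]
print(areas_to_escalate(log))  # ['billing']
```

The `min_prs` floor keeps a single reworked PR from escalating a whole area.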

For judgment lanes, block merge unless:
- Required decision stub is filled.
- A reviewer explicitly toggles one of: boundary_approved, pattern_blessed, local_violation_ok.
Keep the implementation checks light in these lanes so attention goes to the decision, not more correctness noise.
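
A merge gate along these lines could look like the following. It is a sketch under stated assumptions: the `PullRequest` shape and `merge_blockers` function are hypothetical, and the stub field names simply mirror the three bullets of the decision stub.

```python
from dataclasses import dataclass, field

JUDGMENT_LANES = {"boundary_reshape", "new_pattern", "contract_change", "scope_question"}
STUB_FIELDS = ("what_changed", "options_considered", "why_this")
APPROVAL_TOGGLES = {"boundary_approved", "pattern_blessed", "local_violation_ok"}


@dataclass
class PullRequest:
    lanes: set
    decision_stub: dict = field(default_factory=dict)
    toggles: set = field(default_factory=set)


def merge_blockers(pr):
    """Return reasons to block merge; an empty list means the gate passes."""
    if not (pr.lanes & JUDGMENT_LANES):
        return []  # Not a judgment lane: this gate does not apply.
    blockers = []
    missing = [f for f in STUB_FIELDS if not pr.decision_stub.get(f, "").strip()]
    if missing:
        blockers.append("decision stub incomplete: " + ", ".join(missing))
    if not (pr.toggles & APPROVAL_TOGGLES):
        blockers.append("no reviewer toggle set")
    return blockers


pr = PullRequest(
    lanes={"new_pattern"},
    decision_stub={"what_changed": "added retry helper", "options_considered": "reuse http_retry"},
)
print(merge_blockers(pr))
# ['decision stub incomplete: why_this', 'no reviewer toggle set']

pr.decision_stub["why_this"] = "job retries need jitter"
pr.toggles.add("pattern_blessed")
print(merge_blockers(pr))  # []
```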

Net effect

When these signs appear, the bottleneck has already moved from diff correctness to judgment. Encoding that shift in lanes, cards, and simple gates turns it into a thinner stream of deliberate boundary/pattern/scope calls made earlier in the loop instead of at final review.