When ambiguity-resolution rules conflict with user-edited local exceptions (e.g., a local exception suggests aggressive edits but the ambiguity rule favors conservative interpretations of unclear instructions), does explicitly surfacing which layer ‘won’ in the chain of command reduce repeated override attempts and complaints about inconsistency compared to behavior that applies the same hierarchy but does not name the winning layer?
legible-model-behavior
Answer
Yes. When ambiguity-resolution rules and user-edited local exceptions conflict, explicitly surfacing which layer ‘won’ in the chain of command is likely to reduce both repeated override attempts and complaints about inconsistency, compared with silently applying the same hierarchy without naming the winning layer. The benefit depends on the winning-layer label being stable, simple, and reused in explanations.
Reasoning:
- Users already benefit when assistants (a) expose a visible chain of command, (b) distinguish interpretation rules from side-effect controls, and (c) show where local exceptions sit under higher hard rules (fd803422-558c-4202-b00e-0de99b104691; 6bf58b3d-01e8-407f-a7b0-0e7a0cf0d1e7). In those settings, override attempts become more accurately targeted and frustration shifts away from “the assistant is arbitrary” toward “this is the wrong layer to change.”
- When the winning layer is not named, users can misattribute the outcome of a conflict between ambiguity-resolution rules and local exceptions: they may think their local exception was ignored (inconsistency) or that the assistant changed interpretation style arbitrarily, and they will often keep trying stronger overrides on the wrong layer.
- Explicitly naming which layer prevailed (e.g., “Applied: ambiguity rule over local exception X for unclear instruction”) turns a hidden priority decision into a predictable part of the legible behavior policy. This parallels effects seen when conflict-resolution rules between users (a3fdb360-1a16-451f-964e-ca027a21900a) or between defaults and org rules (2a230eae-318b-4a3f-9f5b-483978022d52; a8327cae-18fb-4aab-aeee-fd83c50e67a6) are surfaced: users try fewer misdirected overrides and complain less about bias because they can attribute outcomes to a stable rule.
- Complaints about inconsistency should decrease when the same layer labels are used in both settings UIs and inline explanations, and when ambiguity rules are clearly framed as governing unclear instructions while local exceptions govern what to do when instructions are clear. This applies the interpretive-versus-impact distinction from 6bf58b3d-01e8-407f-a7b0-0e7a0cf0d1e7 to reduce misattribution.
Design implications:
- Show a compact, standardized tag whenever a conflict is resolved (e.g., “Conflict resolved: ambiguity rule > local exception” plus one sentence clarifying why the instruction counted as ambiguous).
- Reuse the same labels in the behavior-policy view so users can discover and intentionally modify the correct layer (e.g., by tightening the exception’s scope rather than trying to turn off ambiguity safeguards globally).
- Avoid verbose or highly technical descriptions; if the resolution surface becomes cognitively heavy, users may ignore it and revert to trial-and-error overrides, losing the benefit.
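The design implications above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any existing system: the names `AMBIGUITY_RULE`, `LOCAL_EXCEPTION`, `Outcome`, and `resolve` are hypothetical. It also assumes the simplest reading of the hierarchy discussed here, namely that the ambiguity rule prevails whenever an instruction is judged unclear and the local exception applies otherwise.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical layer labels. The point is that the same strings are reused
# verbatim in the settings view and in inline explanations.
AMBIGUITY_RULE = "ambiguity rule"
LOCAL_EXCEPTION = "local exception"


@dataclass(frozen=True)
class Outcome:
    applied_layer: str   # label of the layer that governed the behavior
    tag: Optional[str]   # compact conflict tag, or None if nothing conflicted


def resolve(instruction_is_ambiguous: bool, why_ambiguous: str = "") -> Outcome:
    """Apply the assumed hierarchy and name the winning layer.

    Assumption: an unclear instruction is governed by the ambiguity rule,
    which then overrides the local exception; a clear instruction is
    governed by the local exception and produces no conflict tag.
    """
    if instruction_is_ambiguous:
        # Standardized tag plus one sentence on why the instruction
        # counted as ambiguous.
        tag = (
            f"Conflict resolved: {AMBIGUITY_RULE} > {LOCAL_EXCEPTION}. "
            f"{why_ambiguous}"
        ).strip()
        return Outcome(AMBIGUITY_RULE, tag)
    return Outcome(LOCAL_EXCEPTION, None)


if __name__ == "__main__":
    outcome = resolve(True, "The request did not say which files to edit.")
    print(outcome.tag)
```

Keeping the labels as shared constants is one way to guarantee that the inline tag and the behavior-policy view never drift apart; the single-sentence reason keeps the surface compact rather than verbose.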
Net effect: Explicitly surfacing which layer wins in these conflicts should reduce repeated override attempts against the wrong layer and lower perceived inconsistency, so long as the resolution rule and labels are simple, stable, and consistently referenced when explaining outcomes.