When a legible behavior policy explicitly distinguishes between value overrides (changing defaults like initiative level) and authority overrides (attempts to bypass hard rules or chain-of-command layers), does surfacing this distinction in real time (“this is a value change I can accept” vs “this is an authority change I cannot apply”) improve users’ calibration about what is negotiable and reduce repeated override attempts against hard rules, relative to a behavior policy that uses a single undifferentiated “cannot comply” refusal style?

legible-model-behavior | Updated at 2026-04-07 11:38

Answer

Likely yes: a real-time distinction between value vs authority overrides should improve calibration about what is negotiable and reduce repeated attempts to bypass hard rules, as long as the categories are simple and reused consistently in UI and explanations.