In systems that distinguish hard rules from editable defaults, how does letting users simulate a requested action (showing what would happen and which rule layer would block or allow it) affect trust, willingness to accept refusals, and later override behavior compared to only learning about constraints after issuing real commands?
legible-model-behavior
Answer
Allowing users to simulate requested actions (previewing the predicted effects and the rule layer that would block or allow them) tends to increase trust, willingness to accept refusals, and the quality (though not always the quantity) of later override behavior, compared to discovering constraints only after issuing real commands. Two conditions matter: simulations must be clearly marked as non-executing, and they must be tightly integrated with the visible rule hierarchy.
Effects relative to post‑hoc constraint discovery:
Trust
- Simulations that preview both outcomes and the responsible rule layer make the legible behavior policy feel predictive rather than merely descriptive, echoing benefits seen with visible chains of command, local policies, and action budgets. Users see that constraints are stable rules, not ad-hoc reactions.
- Trust is highest when the simulation UI explicitly labels each gate with its rule type (e.g., hard org rule vs user default) and uses the same labels in later real refusals; this consistency parallels trust gains from org-labeled defaults and explicit override justifications.
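A minimal sketch of this pattern, with all class and rule names hypothetical: a layered evaluator whose simulated and real evaluations run the same rules in the same order, so a preview names exactly the rule and layer a later real refusal would cite.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Rule:
    label: str   # user-facing label, reused verbatim in real refusals
    layer: str   # "hard-org-rule" or "user-default"
    blocks: Callable[[str], bool]  # predicate: does this rule block the action?

@dataclass
class Verdict:
    allowed: bool
    decisive_rule: Optional[Rule]  # the gate that decided, if any
    simulated: bool                # True => nothing was executed

class LayeredPolicy:
    def __init__(self, rules: list[Rule]):
        # Hard org rules are always checked before editable defaults,
        # so a preview shows which layer is actually decisive.
        self.rules = sorted(rules, key=lambda r: r.layer != "hard-org-rule")

    def evaluate(self, action: str, simulate: bool) -> Verdict:
        for rule in self.rules:
            if rule.blocks(action):
                return Verdict(False, rule, simulate)
        return Verdict(True, None, simulate)

policy = LayeredPolicy([
    Rule("org export rule", "hard-org-rule", lambda a: a.startswith("export")),
    Rule("confirm bulk deletes", "user-default", lambda a: a.startswith("bulk-delete")),
])

# Preview without executing: the verdict carries the decisive layer's label.
preview = policy.evaluate("export customer-db", simulate=True)
print(preview.allowed, preview.decisive_rule.layer)  # False hard-org-rule
```

Because simulation and execution share one evaluation path, the consistency property described above holds by construction rather than by UI discipline.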
Willingness to accept refusals
- When users see in advance that a hard rule will block a simulated action—and that relaxing relevant defaults would still not bypass that layer—they are more willing to accept the eventual real refusal as enforcement of a known constraint rather than arbitrary assistant behavior.
- Acceptance is further improved if, at refusal time, the assistant can reference the prior simulation (“As previewed, this step is blocked by the org export rule”), similar to how prior exception and local-policy framing helps users attribute refusals to the chain of command instead of to the assistant.
Later override behavior
- Simulations tend to reduce misdirected or futile overrides (e.g., repeatedly toggling a default that cannot affect a hard rule), because users can see which layer is actually decisive for a given action before committing. This channels override attempts toward tunable defaults and time-bounded exceptions where they are more likely to succeed.
- However, by making the boundaries clearer, simulations can increase targeted override proposals at layers that appear negotiable (e.g., asking for temporary side-effect exceptions where the simulation shows a default-based block). When paired with soft-proposal handling and brief justifications for rejected overrides, this generally improves overall satisfaction with override handling even if the raw number of override attempts rises.
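The routing logic described above can be sketched as follows; the layer names and the action-to-layer table are illustrative assumptions, standing in for what a prior simulation would have reported. Proposals against hard layers get a brief justification instead of a futile toggle, while proposals against editable defaults become time-bounded exception requests.

```python
# Hypothetical sketch: route override proposals by the decisive rule layer.
HARD_LAYERS = {"hard-org-rule"}

# action -> the layer a prior simulation reported as decisive (illustrative data)
DECISIVE_LAYER = {
    "export customer-db": "hard-org-rule",
    "bulk-delete drafts": "user-default",
}

def handle_override(action: str) -> str:
    layer = DECISIVE_LAYER.get(action)
    if layer is None:
        # Nothing blocks this action, so there is nothing to override.
        return "allowed: no rule blocks this action"
    if layer in HARD_LAYERS:
        # Brief justification for the rejection, attributing it to the layer.
        return f"rejected: '{action}' is blocked at the {layer} layer, which cannot be overridden"
    # Tunable default: channel the attempt into a soft, time-bounded proposal.
    return f"proposed: time-bounded exception to the {layer} blocking '{action}'"

print(handle_override("export customer-db"))
print(handle_override("bulk-delete drafts"))
```

The design choice here is that futile overrides are filtered out before they reach a rule engine, which is what lets the raw number of override attempts rise while satisfaction with their handling improves.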
Failure modes
- If simulations imply that certain actions are allowed but real executions are later blocked by unmodeled or opaque rules, they create a strong “fake control” effect and can erode trust more than having no simulation at all.
- Overly detailed or cognitively heavy simulations risk overwhelming users, blurring the distinction between hard rules and defaults and undermining the intended clarity benefits.
Overall, preview-style simulations that (a) accurately reflect the chain of command, (b) clearly distinguish hard rules from editable defaults, and (c) reuse the same labels and rationales during real refusals tend to make constraints feel more legitimate and predictable, increasing trust, improving acceptance of refusals, and steering override behavior toward meaningful, policy-consistent adjustments rather than frustrated trial-and-error.