For autonomous assistants with side-effect controls, is it more effective for trust and override satisfaction to expose one unified, cross-application behavior policy (covering files, messaging, finance, etc.) or to expose separate, domain-specific policies with their own hard rules vs defaults, assuming identical underlying constraints in both designs?

legible-model-behavior | Updated at 2026-04-06 18:38

Answer

A hybrid that presents a unified top-level policy with clearly segmented, domain-specific sections is generally most effective for trust and override satisfaction, given identical underlying constraints.

Comparing the two pure designs:

Single unified, cross-application policy
- Trust:
  - Benefits: Users see a coherent chain of command and a single contract that appears to apply everywhere, which can strengthen baseline trust and make refusals feel less arbitrary (“these same global rules protect me across files, messages, and money”). This mirrors the value of a stable, legible behavior contract in c181–c183 and the compact-but-coherent policy views in c3fc.
  - Risks: If the unified policy stays high-level, users often cannot tell which specific rule applies in a given domain (e.g., whether a refusal in finance stems from general side-effect limits or from financial regulations). That opacity weakens targeted override attempts and can make constraints feel like generic bureaucracy rather than domain-appropriate protection.
- Override satisfaction:
  - Benefits: Fewer distinct surfaces to learn; some users prefer “one place to change defaults” and may feel more in control when they believe a single setting affects multiple domains.
  - Costs: Overrides become blunt: a user trying to relax notification behavior for messaging may inadvertently affect file or finance behavior, or be told “that’s a hard rule” without understanding that other, domain-specific levers could exist. This leads to the “fake control” and mis-aimed override problems seen when policy structure isn’t legible (cf. c0d0, c12aa, c9f07, c3fc).

Separate, domain-specific policies (files, messaging, finance, etc.)
- Trust:
  - Benefits: Users can see that different domains have different hard rules vs defaults and side-effect controls, which better matches their expectations of real-world risk differences (e.g., finance vs file renames). This domain legibility supports the same mechanisms that improve acceptance of hard-rule refusals in layered/task-scoped policies (c12aa, c d37b).
  - Costs: Without a clearly visible global frame, users may experience the assistant as fragmented (“every app has its own mysterious rules”) and may struggle to see that refusals across domains are governed by a common chain of command. That fragmentation can erode global trust even if local domain trust is good.
- Override satisfaction:
  - Benefits: Overrides can be tightly scoped (“for messaging, default to speed over reversibility”), making them more testable and satisfying, similar to project-scoped policies (c d37b) and time-bounded exceptions (c12aa). Users are more likely to perceive overrides as effective because changes are visible in the named domain only.
  - Costs: Users must discover the right domain policy UI; failed overrides often stem from aiming changes at the wrong place, especially for cross-domain actions (e.g., file attachments in messaging).
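
The cross-domain case above (a file attachment sent through messaging) can be illustrated with a minimal sketch. Everything here is hypothetical: the rule names, the `DOMAIN_RULES` table, and the `blocking_rules` helper are assumptions made for illustration, not part of any cited design. The sketch shows how an override aimed only at messaging leaves a file-domain hard rule in force, and how naming the blocking domain and rule makes that visible to the user.

```python
# Hypothetical per-domain rules; each maps a rule name to "hard" or "default".
DOMAIN_RULES = {
    "messaging": {"confirm_before_send": "default"},
    "files": {"no_external_sharing": "hard"},
}


def blocking_rules(action_domains, relaxed):
    """Return (domain, rule, kind) for every rule still blocking the action.

    `relaxed` is the set of (domain, rule) pairs the user has overridden;
    only rules labeled "default" can actually be relaxed.
    """
    blockers = []
    for domain in action_domains:
        for rule, kind in DOMAIN_RULES[domain].items():
            if kind == "default" and (domain, rule) in relaxed:
                continue  # the user's override removes this default
            blockers.append((domain, rule, kind))
    return blockers


# Sending a file attachment touches both messaging and files.
relaxed = {("messaging", "confirm_before_send")}
print(blocking_rules(["messaging", "files"], relaxed))
# -> [('files', 'no_external_sharing', 'hard')]
```

Reporting the domain alongside the rule is the point of the sketch: the user learns that the messaging override worked and that the remaining block comes from the files section.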

Given these tradeoffs, the best pattern is usually:

- A unified top-level policy that explains the common chain of command, global hard rules (e.g., cross-domain side-effect limits, organization-wide compliance constraints), and general override handling.
- Domain-specific sections inside that unified frame, each with:
  - Clear labels for domain-hard rules vs domain-defaults.
  - Local side-effect controls and budgets where relevant (files, messaging, finance), following the same explanation patterns and labels used globally (per c9f07 and c3fc).
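
One way to picture this nested structure is as a small data model. The following is a minimal sketch under stated assumptions: the class names (`UnifiedPolicy`, `DomainSection`), the rule and setting names, and the `override` method are all hypothetical and chosen only for illustration. It shows a single top-level object holding global hard rules, with each domain section labeling its own hard rules and overridable defaults, and with every override response naming the layer that decided.

```python
from dataclasses import dataclass, field


@dataclass
class DomainSection:
    """One domain-specific section nested under the unified policy."""
    name: str
    hard_rules: frozenset[str]  # not user-overridable
    defaults: dict[str, str] = field(default_factory=dict)  # user-overridable


@dataclass
class UnifiedPolicy:
    """Top-level frame: global hard rules plus labeled domain sections."""
    global_hard_rules: frozenset[str]
    domains: dict[str, DomainSection]

    def override(self, domain: str, setting: str, value: str) -> str:
        """Try a domain-scoped override and report which layer decided."""
        if setting in self.global_hard_rules:
            return f"refused: '{setting}' is a global hard rule"
        section = self.domains[domain]
        if setting in section.hard_rules:
            return f"refused: '{setting}' is a hard rule for {domain}"
        if setting not in section.defaults:
            return f"refused: '{setting}' is not a setting in {domain}"
        section.defaults[setting] = value
        return f"applied: {domain}.{setting} = {value}"


# Hypothetical example policy with two domains.
policy = UnifiedPolicy(
    global_hard_rules=frozenset({"audit_logging"}),
    domains={
        "messaging": DomainSection(
            name="messaging",
            hard_rules=frozenset({"no_impersonation"}),
            defaults={"send_mode": "confirm_first"},
        ),
        "finance": DomainSection(
            name="finance",
            hard_rules=frozenset({"transfer_confirmation"}),
            defaults={"spend_budget": "low"},
        ),
    },
)

print(policy.override("messaging", "send_mode", "send_immediately"))
print(policy.override("finance", "transfer_confirmation", "off"))
print(policy.override("finance", "audit_logging", "off"))
```

In this sketch the messaging override changes only the messaging section, while the two finance attempts are refused with different explanations (domain hard rule vs global hard rule), which is the distinction the labels are meant to make legible.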

This structure tends to:

- Preserve the contract-like, cross-context predictability that stabilizes trust (c181–c183, c25c0).
- Support more precise, satisfying overrides by giving users domain-scoped levers that behave like local policies or time-bounded exceptions (c12aa, c d37b, c0d0), but still visibly sit under the same global chain of command.

So, if forced to choose a single style, choose separate, domain-specific policies nested under an explicitly unified, cross-application header. This will usually produce higher trust and override satisfaction than either a single flat, undifferentiated unified policy or a set of completely disconnected per-domain policies.