In agentic assistants that already expose a legible chain of command, does adding a short, per-action “why this path?” summary (showing which user defaults, ambiguity policy, and hard rules jointly produced the chosen plan vs. nearby alternatives) reduce users’ belief that the system is ‘overcautious’ or ‘ignoring them’ when it constrains side-effectful actions?
legible-model-behavior
Answer
Adding a short, per‑action “why this path?” summary on top of an already legible chain of command is likely to moderately reduce users’ belief that the system is overcautious or ignoring them when it constrains side‑effectful actions, provided the summary is compact, explicitly names the contributing rule layers, and contrasts the chosen plan with at least one plausible, slightly more aggressive alternative that was ruled out.
Main expected effects:
- Belief that the system is “ignoring me”
- Likely to decrease: the summary makes visible how user defaults, ambiguity policy, and hard rules were actually consulted, similar to how chain‑of‑command UIs reduce feelings of being ignored by showing which rules are active (cf. claims c49–c52 about per‑action chain-of-command views and visible rule labels).
- When a side‑effectful action is constrained, pointing to the specific pieces—e.g., “Using your ‘ask on unclear consent’ profile and the org file‑scope rule, I picked: draft-only edit in this folder, not auto‑edit across the repo”—helps users attribute the behavior to explicit policies, not to the assistant disregarding their request (paralleling c0d07… on justified override rejections and c0bd3… on visible ambiguity resolution).
- Belief that the system is “overcautious”
- Tends to shift from global character judgment to targeted disagreement: users may still think a particular hard rule or ambiguity profile is too conservative, but they are less likely to infer that the assistant is generically paranoid. The “why this path?” view clarifies whether caution came from (a) their own selected defaults, (b) the ambiguity policy, or (c) non‑overridable hard rules (consistent with c368d… on distinguishing prompts due to user profiles vs hard rules, and c9f07… on budgets/logs improving fairness at fixed limits).
- Over time, this can modestly reduce perceived overcaution at the assistant level because repeated constraints are each accompanied by stable, layer‑specific rationales instead of opaque “no”s.
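The layer attribution above (user defaults vs. ambiguity policy vs. hard rules) can be made concrete in a small data model. This is a minimal sketch under assumed, hypothetical names (`Layer`, `Constraint`, `ActionRationale` are illustrative, not an existing API); the key design point is that each constraint carries an explicit layer tag, so the UI can tell the user which caution they could change themselves and which is non-overridable.

```python
# Hypothetical sketch: a per-action rationale record where every contributing
# constraint is tagged with the policy layer it came from.
from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    USER_DEFAULT = "your default"            # editable by the user
    AMBIGUITY_POLICY = "ambiguity profile"   # editable profile selection
    HARD_RULE = "hard rule"                  # non-overridable


@dataclass
class Constraint:
    layer: Layer
    name: str    # e.g. "ask on unclear consent"
    effect: str  # e.g. "draft-only edits in this folder"


@dataclass
class ActionRationale:
    chosen_plan: str
    rejected_alternative: str  # exactly one coarse nearby alternative
    constraints: list[Constraint] = field(default_factory=list)

    def editable_constraints(self) -> list[Constraint]:
        """Constraints the user could change (defaults and profiles),
        as opposed to hard rules they cannot."""
        return [c for c in self.constraints if c.layer is not Layer.HARD_RULE]
```

With a record like this, "overcautious" complaints can be routed to the right target: `editable_constraints()` yields the pieces the user can adjust, while the remainder are hard rules the summary should present as fixed.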
Design constraints for the effect to hold:
- The per‑action summary must be very short and layered (e.g., one compact sentence plus optional expansion), to avoid the verbosity and inconsistency pitfalls called out for ambiguity‑policy explanations and simulators (cf. c0bd3… and c483e…).
- It should reuse the existing legible behavior policy vocabulary—hard rules vs defaults, ambiguity profile names, side‑effect scopes—so users see continuity rather than yet another explanation surface (aligning with c5b47… and c483e… on separating editable defaults from non‑editable rules).
- It should highlight nearby alternatives only at a coarse level (“auto‑edit vs draft‑only”, “this folder vs whole drive”), not attempt an exhaustive comparison that would overwhelm users and risk noise in the mapping from explanations to behavior (cf. c483e… on simulator reliability).
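The three constraints above (one compact sentence, reused layer vocabulary, a single coarse alternative) can be sketched as one rendering function. All names here are hypothetical illustrations, not an existing interface; the point is that the default output is a single headline sentence, with the layer-labelled detail available only on explicit expansion.

```python
# Hedged sketch: render a "why this path?" summary as one compact sentence,
# expanding to layer-labelled detail only on request.
def render_why_this_path(chosen, alternative, constraints, expand=False):
    """constraints: list of (layer_label, name) tuples, where layer_label is
    one of the existing policy-vocabulary labels, e.g. "your default",
    "ambiguity profile", or "hard rule" (hypothetical labels)."""
    headline = f'Chose "{chosen}", not "{alternative}".'
    if not expand:
        return headline  # default: one sentence, no breakdown
    detail = "; ".join(f"{name} [{layer}]" for layer, name in constraints)
    return f"{headline} Because: {detail}."
```

Keeping the unexpanded form to a single sentence is what guards against the verbosity failure mode: the contrast with one nearby alternative is always visible, but the per-layer breakdown only appears when the user asks for it.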
Failure modes:
- If explanations are inconsistent with actual behavior, or if they implicitly suggest that tweaking defaults could bypass hard rules when that is not true, the feature can backfire—intensifying perceptions of arbitrariness or fake control, similar to the backfire risks noted for simulators and temporary exceptions (c483e…, c12aa…).
- If every constrained action triggers a verbose breakdown without clear user benefit, users may start to ignore the summaries, losing the fairness and trust gains and retaining only the friction.
Net: in systems that already have a legible chain of command, adding a compact, per‑action “why this path?” summary that explicitly ties side‑effect constraints to user defaults, ambiguity policy, and hard rules should moderately reduce the sense that the assistant is overcautious or ignoring the user, by making each conservative choice feel like the predictable result of a transparent procedure rather than a black‑box whim.