Current legible behavior policies largely assume a top-down chain of command. In environments where users face conflicting peer norms or informal team practices that are not encoded as project policies, does forcing all justifications into the existing chain-of-command lens systematically misclassify these conflicts (e.g., as ‘user preference errors’)? And would a generalized policy model that explicitly distinguishes hard rules, defaults, and negotiated social norms better predict when users perceive constraints as unfair or try to bypass override handling entirely?

legible-model-behavior | Updated at 2026-04-07 11:34

Answer

Yes. Treating all conflicts as chain-of-command vs. user preference likely misclassifies many peer- and norm-driven clashes. A generalized model that separates hard rules, defaults, and negotiated social norms should better predict when users see constraints as unfair and when they bypass override flows, but this is a hypothesis that needs testing.

Core idea

Extend legible behavior policies from a pure hierarchy (system/org/project/user) to a typed policy model with at least:
- Hard rules (non-negotiable)
- Defaults (tunable settings)
- Negotiated social norms (informal but shared practices among peers/teams)

Expected misclassification with chain-of-command-only views

Peer norms that conflict with org rules or project policies tend to be labeled as:
- “User preference errors” ("your request conflicts with org policy")
- or “misunderstandings” of the chain of command
In reality, the user may be:
- Following a local team norm (e.g., "we always CC our peer group", "we skip this approval step on this project")
- Trying to protect a peer relationship or informal deal
This is likely to lead to:
- Higher perceived unfairness (assistant seems to “side with bureaucracy” vs. the team)
- More attempts to route around the assistant (manual workarounds, other tools, asking colleagues)
- Mis-logged conflict causes (looks like user stubbornness, not norm conflict)

Generalized policy model (hard rules / defaults / norms)

Represent three distinct layers in the legible policy and UI:
1. Hard rules: fixed, non-negotiable (e.g., compliance, safety).
2. Defaults: org-, manager-, or user-level preferences that operate within those rules.
3. Negotiated social norms: informal but recurring patterns the assistant can:
  - Detect (e.g., stable co-editing, recurring CC lists, shared calendars)
  - Name as such ("team norm", "peer practice")
  - Treat as soft, subordinate to hard rules but distinct from personal preferences.
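
One way to make the three layers concrete is a small typed data model. The following is a minimal sketch, not a proposed implementation: the names `PolicyKind`, `Policy`, `Decision`, and `decide`, the action label `skip_approval`, and the `forbids`/`endorses` fields are all hypothetical. It also assumes that only hard rules can block an action, while defaults and norms are reported but never enforced.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List, Tuple


class PolicyKind(Enum):
    HARD_RULE = "hard rule"      # non-negotiable (e.g., compliance, safety)
    DEFAULT = "default"          # tunable org-, manager-, or user-level setting
    SOCIAL_NORM = "team norm"    # informal, recurring peer practice


@dataclass(frozen=True)
class Policy:
    name: str
    kind: PolicyKind
    forbids: FrozenSet[str] = frozenset()   # hypothetical action labels
    endorses: FrozenSet[str] = frozenset()  # hypothetical action labels


@dataclass(frozen=True)
class Decision:
    allowed: bool
    blocked_by: Tuple[str, ...]        # hard rules that forbid the action
    soft_conflicts: Tuple[str, ...]    # defaults or norms the action departs from
    supporting_norms: Tuple[str, ...]  # team norms that endorse the action


def decide(action: str, policies: List[Policy]) -> Decision:
    """Assumption: only hard rules block; defaults and norms are advisory."""
    blocked_by = tuple(
        p.name for p in policies
        if p.kind is PolicyKind.HARD_RULE and action in p.forbids
    )
    soft_conflicts = tuple(
        f"{p.kind.value}: {p.name}" for p in policies
        if p.kind is not PolicyKind.HARD_RULE and action in p.forbids
    )
    supporting_norms = tuple(
        p.name for p in policies
        if p.kind is PolicyKind.SOCIAL_NORM and action in p.endorses
    )
    return Decision(not blocked_by, blocked_by, soft_conflicts, supporting_norms)


policies = [
    Policy("org rule A", PolicyKind.HARD_RULE,
           forbids=frozenset({"skip_approval"})),
    Policy("we skip this approval step on this project", PolicyKind.SOCIAL_NORM,
           endorses=frozenset({"skip_approval"})),
]
decision = decide("skip_approval", policies)
```

Here the decision records both the blocking hard rule and the team norm that endorsed the action, so a later explanation can name the norm instead of logging the conflict as a personal preference.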

How this helps prediction and explanation

Explanations can separate:
- "Blocked by hard rule" vs.
- "Conflicts with your team’s usual norm" vs.
- "Differs from your personal default"
Predict unfairness:
- Users are more likely to see a refusal as unfair when:
  - A recognized team norm is blocked by a distant org rule, and
  - The assistant fails to acknowledge that norm explicitly.
- Model: unfairness risk ↑ when [hard rule] ∧ [detected peer norm violated] ∧ [no norm-level explanation or alternative offered].
Predict bypass behavior:
- Users are more likely to bypass override handling when:
  - They see the assistant as blind to a strong team norm.
  - The system offers no way to surface or negotiate that norm (e.g., propose a project policy change, register a local exception request).
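
The two predictions above can be written as simple boolean predicates, which makes them easier to test against logged refusals. This is a sketch under stated assumptions: `RefusalContext` and its field names are hypothetical, and treating the two bypass conditions as a conjunction is an assumption, since the text lists them without saying whether both are required.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefusalContext:
    blocked_by_hard_rule: bool      # a hard rule blocked the request
    peer_norm_violated: bool        # a detected team norm supported the request
    norm_acknowledged: bool         # explanation named the norm or offered an alternative
    negotiation_path_offered: bool  # e.g., exception request or policy proposal


def unfairness_risk_elevated(ctx: RefusalContext) -> bool:
    """[hard rule] AND [peer norm violated] AND [no norm-level explanation]."""
    return (
        ctx.blocked_by_hard_rule
        and ctx.peer_norm_violated
        and not ctx.norm_acknowledged
    )


def bypass_risk_elevated(ctx: RefusalContext) -> bool:
    """Assumption: both conditions must hold (norm ignored, no way to negotiate)."""
    return (
        ctx.peer_norm_violated
        and not ctx.norm_acknowledged
        and not ctx.negotiation_path_offered
    )


silent_refusal = RefusalContext(True, True, False, False)
explained_refusal = RefusalContext(True, True, True, True)
```

These predicates only flag elevated risk. They say nothing about magnitude, which would have to be estimated from user studies or override logs.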

Design sketch

Add a minimal "social norms" lane to policy views and traces:
- Per-action trace: "Allowed because: org rule A ∧ within side-effect control X ∧ matches team norm ‘notify #channel’".
- On conflict: "I’m blocking this due to org rule A. I recognize that this conflicts with your usual team practice ‘do Y first’", followed by a suggestion: "You could ask your manager to codify this as a project policy".
Override flows:
- When users push back, offer options like:
  - "Treat this as a one-off local exception" (if allowed).
  - "Propose updating project policy to match this recurring team practice".
User-visible policy view:
- Show small, separate sections: "Hard rules", "Defaults", "Recognized team norms".
- Make clear that norms are advisory and can’t violate hard rules.
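
The conflict message and override options in this sketch could be assembled as follows. This is illustrative only: the function `explain_block`, its parameters, and the exact wording of the strings are hypothetical, and whether a one-off exception is permitted is passed in as a flag because the text leaves that to policy.

```python
from typing import List, Optional


def explain_block(rule: str, team_norm: Optional[str],
                  exception_allowed: bool) -> List[str]:
    """Build a norm-aware refusal: name the hard rule, then the norm, then options."""
    lines = [f"Blocked by hard rule: {rule}."]
    if team_norm is not None:
        # Acknowledge the norm explicitly rather than treating it as a preference.
        lines.append(f"This conflicts with your usual team practice '{team_norm}'.")
        if exception_allowed:
            lines.append("Option: treat this as a one-off local exception.")
        lines.append(
            "Option: propose updating project policy to match this "
            "recurring team practice."
        )
    return lines


message = explain_block("org rule A", "do Y first", exception_allowed=False)
```

When no team norm is detected, the function returns only the hard-rule line, which matches the existing chain-of-command explanation.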

Evidence type and status

This is a mixed / exploratory hypothesis:
- Some support: existing claims hold that visible chains of command, provenance, and exception patterns reduce misdirected overrides and increase fairness (see references).
- Gap: little direct evidence yet on peer norm vs. chain-of-command modeling.