Current legible behavior policies largely assume a top-down chain of command. In environments where users face conflicting peer norms or informal team practices that are not encoded as project policies, does forcing every justification through the existing chain-of-command lens systematically misclassify these conflicts (e.g., as ‘user preference errors’)? And would a generalized policy model that explicitly distinguishes hard rules, defaults, and negotiated social norms better predict when users perceive constraints as unfair or try to bypass override handling entirely?
legible-model-behavior | Updated at
Answer
Likely yes, though this remains a hypothesis that needs testing. Treating every conflict as chain-of-command vs. user preference probably misclassifies many peer- and norm-driven clashes. A generalized model that separates hard rules, defaults, and negotiated social norms should better predict when users see constraints as unfair and when they bypass override flows.
Core idea
- Extend legible behavior policies from a pure hierarchy (system/org/project/user) to a typed policy model with at least:
- Hard rules (non-negotiable)
- Defaults (tunable settings)
- Negotiated social norms (informal but shared practices among peers/teams)
Expected misclassification with chain-of-command-only views
- Peer norms that conflict with org rules or project policies tend to be labeled as:
- “User preference errors” ("your request conflicts with org policy")
- or “misunderstandings” of the chain of command
- In reality, the user may be:
- Following a local team norm (e.g., "we always CC our peer group", "we skip this approval step on this project")
- Trying to protect a peer relationship or informal deal
- This leads to:
- Higher perceived unfairness (assistant seems to “side with bureaucracy” vs. the team)
- More attempts to route around the assistant (manual workarounds, other tools, asking colleagues)
- Mis-logged conflict causes (looks like user stubbornness, not norm conflict)
Generalized policy model (hard rules / defaults / norms)
- Represent three distinct layers in the legible policy and UI:
- Hard rules: fixed, non-negotiable (e.g., compliance, safety).
- Defaults: org, manager, or user-level preferences under those rules.
- Negotiated social norms: informal but recurring patterns the assistant can:
- Detect (e.g., stable co-editing, recurring CC lists, shared calendars)
- Name as such ("team norm", "peer practice")
- Treat as soft, subordinate to hard rules but distinct from personal preferences.
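The three layers above can be sketched as a small typed data model. This is only an illustration under stated assumptions: the names `PolicyKind`, `Policy`, `source`, and `blocking_policies` are hypothetical, and the choice that only hard rules block outright (while defaults and norms stay advisory) is one reading of the model, not a confirmed design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class PolicyKind(Enum):
    """The three policy types distinguished in the generalized model."""
    HARD_RULE = "hard_rule"      # fixed, non-negotiable (compliance, safety)
    DEFAULT = "default"          # tunable org, manager, or user-level setting
    SOCIAL_NORM = "social_norm"  # informal but shared team practice


@dataclass(frozen=True)
class Policy:
    name: str
    kind: PolicyKind
    source: str  # hypothetical label, e.g. "org", "project", "user", "team"

    @property
    def negotiable(self) -> bool:
        # Assumption: everything except a hard rule can be tuned or negotiated.
        return self.kind is not PolicyKind.HARD_RULE


def blocking_policies(violated: List[Policy]) -> List[Policy]:
    """Return the violated policies that block an action outright.

    Assumption: only hard rules block; violated defaults and norms are
    surfaced in explanations but do not stop the action by themselves.
    """
    return [p for p in violated if p.kind is PolicyKind.HARD_RULE]
```

Keeping `SOCIAL_NORM` as its own kind, rather than folding it into `DEFAULT`, is what lets later explanations and logs distinguish a team practice from a personal preference.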
How this helps prediction and explanation
- Explanations can separate:
- "Blocked by hard rule" vs.
- "Conflicts with your team’s usual norm" vs.
- "Differs from your personal default"
- Predict unfairness:
- Users are more likely to see a refusal as unfair when:
- A recognized team norm is blocked by a distant org rule, and
- The assistant fails to acknowledge that norm explicitly.
- Model: unfairness risk ↑ when [hard rule] ∧ [detected peer norm violated] ∧ [no norm-level explanation or alternative offered].
- Predict bypass behavior:
- Users are more likely to bypass override handling when:
- They see the assistant as blind to a strong team norm.
- The system offers no way to surface or negotiate that norm (e.g., propose a project policy change, register a local exception request).
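The two predictions above can be written as simple boolean predicates. This is a rough sketch, not a validated model: the `Refusal` fields and both function names are hypothetical, "no norm-level explanation or alternative offered" is read here as "neither was offered", and combining the two bypass conditions with a conjunction is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Refusal:
    blocked_by_hard_rule: bool
    violates_detected_norm: bool  # a detected peer/team norm is blocked
    norm_acknowledged: bool       # the explanation names the team norm
    alternative_offered: bool     # e.g. exception request, policy proposal


def unfairness_risk_elevated(r: Refusal) -> bool:
    """Hard rule AND violated peer norm AND no norm-level explanation
    or alternative (read as: neither was offered)."""
    return (
        r.blocked_by_hard_rule
        and r.violates_detected_norm
        and not (r.norm_acknowledged or r.alternative_offered)
    )


def bypass_risk_elevated(r: Refusal, negotiation_path_available: bool) -> bool:
    """Assistant appears blind to the norm AND the system offers no way
    to surface or negotiate it (conjunction is an assumption)."""
    assistant_seems_blind = r.violates_detected_norm and not r.norm_acknowledged
    return assistant_seems_blind and not negotiation_path_available
```

Predicates like these are only useful if the norm-related fields are actually logged; under a chain-of-command-only view, `violates_detected_norm` would never be recorded and both risks would be invisible.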
Design sketch
- Add a minimal "social norms" lane to policy views and traces:
- Per-action trace: "Allowed because: org rule A ∧ within side-effect control X ∧ matches team norm ‘notify #channel’".
- On conflict: "I’m blocking this due to org rule A, even though that goes against your usual team practice ‘do Y first’", followed by a suggestion: "You could ask your manager to codify this as a project policy".
- Override flows:
- When users push back, offer options like:
- "Treat this as a one-off local exception" (if allowed).
- "Propose updating project policy to match this recurring team practice".
- User-visible policy view:
- Show small, separate sections: "Hard rules", "Defaults", "Recognized team norms".
- Make clear that norms are advisory and can’t violate hard rules.
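As a minimal sketch of how the conflict message and override options in this design could be generated, assuming the assistant already knows which hard rule applies and whether a team norm was detected. The function names and parameters (`explain_block`, `override_options`, `one_off_allowed`, `norm_detected`) are hypothetical; the message wording is adapted from the examples above.

```python
from typing import List, Optional


def explain_block(hard_rule: str, team_norm: Optional[str] = None) -> str:
    """Build a refusal message that names the team norm when one is known."""
    msg = f"I'm blocking this due to {hard_rule}"
    if team_norm is None:
        return msg + "."
    # Acknowledge the norm explicitly and point to a way to negotiate it.
    return (
        msg
        + f", even though your team's usual practice is '{team_norm}'."
        + " You could ask your manager to codify this as a project policy."
    )


def override_options(one_off_allowed: bool, norm_detected: bool) -> List[str]:
    """List the override choices to offer when the user pushes back."""
    options = []
    if one_off_allowed:
        options.append("Treat this as a one-off local exception")
    if norm_detected:
        options.append(
            "Propose updating project policy to match this recurring team practice"
        )
    return options
```

For example, `explain_block("org rule A", "do Y first")` yields a message that both cites the rule and names the practice, while omitting `team_norm` falls back to the plain chain-of-command explanation.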
Evidence type and status
- This is a mixed / exploratory hypothesis:
- Some support: existing claims show that visible chains of command, provenance, and exception patterns reduce misdirected overrides and increase fairness (see references).
- Gap: little direct evidence yet on peer norm vs. chain-of-command modeling.