When teen users move between products that all share the same global teen safety matrix but expose different configuration knobs (strictness presets, refusal styles, repetition caps), which concrete mismatches in behavior most often lead to confusion or perceived unfairness, and what minimum cross-product consistency rules are needed to keep safeguards feeling predictable and age-appropriate?
Answer
Main mismatches and minimal consistency rules:
- Action mismatches (what is allowed vs blocked)
  - Common confusing cases
    - The same query (e.g., sex-ed homework, coping with self-harm urges, mild insults with friends) is fully answered in one product, partially answered in another, and hard-blocked in a third.
    - Non-negotiable topics (self-harm methods, sexual exploitation) are blocked in one product but partly answered in another because of looser prompts or thresholds.
  - Minimum rules
    - For each matrix cell, define a global "action band" (allow↔partial, partial↔block, fixed_block) and forbid products from going outside that band.
    - All products must treat non-negotiable cells as fixed_block, with no content-changing appeals.
    - Younger and older teen variants may differ only in how deep a partial answer goes, not in whether the cell can be blocked at all.
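The action-band rules above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical matrix keyed by `(topic, age_band)`; the topic names, age bands, and function names are illustrative and not part of any real policy schema.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    PARTIAL = "partial"
    BLOCK = "block"


# Hypothetical band names mapped to the actions a product may choose from.
ACTION_BANDS = {
    "allow_partial": {Action.ALLOW, Action.PARTIAL},
    "partial_block": {Action.PARTIAL, Action.BLOCK},
    "fixed_block": {Action.BLOCK},
}

# Hypothetical slice of the global matrix: (topic, age_band) -> band name.
GLOBAL_MATRIX = {
    ("sex_ed_homework", "13-15"): "allow_partial",
    ("sex_ed_homework", "16-17"): "allow_partial",
    ("self_harm_methods", "13-15"): "fixed_block",
    ("self_harm_methods", "16-17"): "fixed_block",
}


def validate_product(product_actions):
    """Return human-readable violations for one product's per-cell actions."""
    violations = []
    for cell, action in product_actions.items():
        band = GLOBAL_MATRIX.get(cell)
        if band is None:
            violations.append(f"{cell}: cell is not in the global matrix")
        elif action not in ACTION_BANDS[band]:
            violations.append(f"{cell}: {action.value} is outside band {band}")
    return violations


def age_variants_consistent(matrix):
    """Check that a topic is either blockable for every age band or for none."""
    blockable_by_topic = {}
    for (topic, _age_band), band in matrix.items():
        blockable = Action.BLOCK in ACTION_BANDS[band]
        blockable_by_topic.setdefault(topic, set()).add(blockable)
    return all(len(flags) == 1 for flags in blockable_by_topic.values())
```

A product that tries to answer a fixed_block cell partially is reported as a violation, while any choice inside the band passes. Running a check like this at configuration time, rather than at request time, is one way to keep drift between products from ever reaching teens.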
- Refusal style mismatches
  - Common confusing cases
    - One product uses goal-first, validating refusals; another uses terse, moralizing denials for the same request.
    - Products give different explanations for the same policy ("because you’re under 18" vs "because this topic is unsafe for anyone").
  - Minimum rules
    - Maintain a small shared catalog of refusal style keys (e.g., goal_first_partial, clarify_then_answer, non_negotiable_block, rephrase_hint) that all products reuse.
    - For each matrix cell (and age band), specify an allowed subset of style keys; products pick within the subset but cannot invent harsher patterns.
    - For non-negotiables, require a standard template: acknowledge the goal, state the fixed rule, and offer a safer alternative or resource.
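A shared style catalog with per-cell subsets might look like the sketch below. The style keys come from the rules above; the cell names, the fallback behavior (use the first allowed style), and the template wording are assumptions made for illustration.

```python
# Style keys named in the rules above.
STYLE_CATALOG = frozenset({
    "goal_first_partial",
    "clarify_then_answer",
    "non_negotiable_block",
    "rephrase_hint",
})

# Hypothetical per-cell subsets, listed in order of preference.
ALLOWED_STYLES = {
    ("sex_ed_homework", "13-15"): ("goal_first_partial", "clarify_then_answer"),
    ("self_harm_methods", "13-15"): ("non_negotiable_block",),
}


def pick_style(cell, product_preferences):
    """Return the first product preference the cell allows, else the cell default."""
    allowed = ALLOWED_STYLES[cell]
    unknown = set(allowed) - STYLE_CATALOG
    if unknown:
        raise ValueError(f"styles not in shared catalog: {sorted(unknown)}")
    for key in product_preferences:
        if key in allowed:
            return key
    # Assumption: fall back to the first allowed style rather than failing.
    return allowed[0]


def render_non_negotiable(goal, rule, alternative):
    """Standard template: acknowledge goal, state fixed rule, offer alternative."""
    return " ".join([
        f"It sounds like you are trying to {goal}.",
        rule,
        alternative,
    ])
```

Because a product's own style keys are silently ignored when they fall outside the allowed subset, a harsher custom pattern can never reach the teen; the worst case is the cell's default style.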
- Strictness and threshold mismatches
  - Common confusing cases
    - The same borderline query (PG-13 romance, non-graphic bullying example, venting with mild profanity) is usually allowed in one product and usually blocked in another.
    - Teens infer that “this app thinks I’m a kid / judging me” despite a shared teen policy.
  - Minimum rules
    - For each low- and medium-severity cell, define a narrow global range for classifier thresholds (or a small set of strictness presets: lenient/default/strict) and bind all products to those presets.
    - For low-severity teen cells, allow small, explicit underprotection bands to cut false positives; share the same numeric bands across products for the same age band.
    - Products may choose between at most two adjacent presets per cell; no ad-hoc, much-stricter modes for teens.
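The preset rules above can be sketched as follows. The preset names come from the text; the numeric thresholds, cell names, and the convention that a request is blocked when the classifier risk score meets or exceeds the threshold are all assumptions for illustration.

```python
# Presets ordered from least to most strict.
PRESETS = ("lenient", "default", "strict")

# Hypothetical block thresholds on a risk score in [0, 1]; lower means stricter.
PRESET_THRESHOLDS = {"lenient": 0.80, "default": 0.70, "strict": 0.60}

# Hypothetical per-cell choices offered to products.
CELL_PRESETS = {
    ("pg13_romance", "16-17"): ("lenient", "default"),
    ("mild_profanity_venting", "13-15"): ("default",),
}


def presets_are_valid(presets):
    """A cell may offer one preset, or two that are adjacent in PRESETS."""
    if len(presets) not in (1, 2) or len(set(presets)) != len(presets):
        return False
    if any(p not in PRESETS for p in presets):
        return False
    positions = [PRESETS.index(p) for p in presets]
    return max(positions) - min(positions) <= 1


def block_threshold(cell, product_preset):
    """Return the shared threshold, rejecting presets the cell does not offer."""
    if product_preset not in CELL_PRESETS[cell]:
        raise ValueError(f"{product_preset} is not offered for {cell}")
    return PRESET_THRESHOLDS[product_preset]


def should_block(cell, product_preset, risk_score):
    return risk_score >= block_threshold(cell, product_preset)
```

With these illustrative numbers, a borderline score of 0.75 on the PG-13 romance cell is blocked under the default preset and allowed under lenient, and no product can select strict for that cell. The gap between products is therefore bounded by the distance between two adjacent presets.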
- Repetition cap and cool-down mismatches
  - Common confusing cases
    - On non-negotiable topics, some products keep answering with fresh partial content; others snap to repetitive blocks after a single attempt.
    - Teens see repeated probing about self-harm or exploitation treated as "fine" in one context and "suspicious" in another.
  - Minimum rules
    - For each high-risk or non-negotiable slice, define a global range for topic_repeat_cap and cool-down length; products must stay inside it.
    - Require escalating refusal stages everywhere: the first one or two refusals are goal-first and supportive; later ones are shorter and fixed but still non-judgmental, and every stage includes human-support options.
    - Never let any product weaken non-negotiable actions after the cap (no "final exception" behavior).
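One way to express the bounded cap and the escalating stages is sketched below. Only `topic_repeat_cap` is named in the text; the numeric ranges, the stage labels, and the choice of two supportive refusals are assumptions.

```python
from dataclasses import dataclass

# Hypothetical global ranges every product must stay inside.
GLOBAL_CAP_RANGE = (2, 4)            # refusals before cool-down starts
GLOBAL_COOLDOWN_MINUTES = (10, 60)
SUPPORTIVE_REFUSALS = 2              # assumption: first two refusals are supportive


@dataclass(frozen=True)
class RepeatPolicy:
    topic_repeat_cap: int
    cooldown_minutes: int


def within_global_range(policy):
    cap_lo, cap_hi = GLOBAL_CAP_RANGE
    cool_lo, cool_hi = GLOBAL_COOLDOWN_MINUTES
    return (cap_lo <= policy.topic_repeat_cap <= cap_hi
            and cool_lo <= policy.cooldown_minutes <= cool_hi)


def refusal_for_attempt(attempt, policy):
    """Map the 1-based refusal count on a topic to a refusal stage."""
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    if attempt > policy.topic_repeat_cap:
        stage = "cooldown"
    elif attempt <= SUPPORTIVE_REFUSALS:
        stage = "supportive_goal_first"
    else:
        stage = "short_fixed"
    # The action never weakens, and human support is offered at every stage.
    return {"action": "block", "stage": stage, "offer_human_support": True}
```

Note that the returned action is a constant: only the tone and length of the refusal change with the attempt count, which is what rules out "final exception" behavior.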
- Appeal and override mismatches
  - Common confusing cases
    - One product lets teens appeal a block on sex-ed or mental-health questions; another has no appeal path; a third appears to let teens "argue" their way into non-negotiable details.
  - Minimum rules
    - Define a shared per-cell flag, {appealable | review_only | non_appealable}, that all products must obey.
    - Appeals may only adjust the inferred intent, and therefore the response, within the cell's allowed band; they never override non-negotiables.
    - Appeal UI must use structured reasons (e.g., school/health/learning) rather than free text, and must produce the same possible outcomes per cell across products.
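The appeal rules above could be enforced with a single shared resolver. This is a sketch under stated assumptions: the flag values and structured reasons come from the text, while the band representation, the status labels, and the choice that a granted appeal moves to the most permissive action in the band are hypothetical.

```python
from enum import Enum


class AppealFlag(Enum):
    APPEALABLE = "appealable"
    REVIEW_ONLY = "review_only"
    NON_APPEALABLE = "non_appealable"


# Structured reasons offered in the appeal UI; free text is rejected.
STRUCTURED_REASONS = frozenset({"school", "health", "learning"})

# Bands as tuples ordered from most to least permissive action.
ACTION_BANDS = {
    "allow_partial": ("allow", "partial"),
    "partial_block": ("partial", "block"),
    "fixed_block": ("block",),
}


def resolve_appeal(flag, band, current_action, reason):
    """Return (status, resulting_action) for one appeal."""
    if reason not in STRUCTURED_REASONS:
        raise ValueError("appeal reason must be one of the structured options")
    if current_action not in ACTION_BANDS[band]:
        raise ValueError("current action is outside the cell's band")
    if flag is AppealFlag.NON_APPEALABLE or band == "fixed_block":
        return ("denied", current_action)
    if flag is AppealFlag.REVIEW_ONLY:
        # Assumption: review_only queues the case and changes nothing now.
        return ("queued_for_review", current_action)
    # Assumption: a granted appeal moves to the band's most permissive action.
    return ("granted", ACTION_BANDS[band][0])
```

Because the result is always drawn from the cell's own band, an appeal on a partial↔block cell can at most yield a partial answer, and a fixed_block cell stays blocked even if a product mislabels it as appealable.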
- Explanation framing mismatches
  - Common confusing cases
    - The same block is labeled as “because of local law” in one product and “for your own good” in another.
    - Teens interpret this as arbitrary or biased rather than matrix-driven.
  - Minimum rules
    - For each matrix cell (and any regional overlay), define a short, shared explanation string family (safety-rule, age-band, or law-based) and require products to choose from that set.
    - Require a stable distinction between: global safety rule, age-appropriateness, and region/legal requirement.
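The shared explanation family, with a regional overlay, can be sketched as a simple lookup. The three kinds come from the rules above; the explanation wording, the topic names, and the `region_x` overlay are placeholders, not real policy text.

```python
# Hypothetical wording for the three explanation kinds named above.
EXPLANATION_FAMILY = {
    "safety_rule": "This is a safety rule that applies to everyone.",
    "age_band": "This answer is adjusted for your age group.",
    "law_based": "Rules in your region limit what can be shown here.",
}

# Hypothetical mapping: (topic, region) -> explanation kind.
# The "global" entry is the default; other regions act as overlays.
CELL_EXPLANATIONS = {
    ("self_harm_methods", "global"): "safety_rule",
    ("pg13_romance", "global"): "age_band",
    ("pg13_romance", "region_x"): "law_based",
}


def explanation_for(topic, region="global"):
    """Return (kind, text), preferring a regional overlay when one exists."""
    kind = CELL_EXPLANATIONS.get((topic, region))
    if kind is None:
        kind = CELL_EXPLANATIONS[(topic, "global")]
    return kind, EXPLANATION_FAMILY[kind]
```

Since every product reads from the same table, a teen who hits the same block in two products sees the same category of reason, and a law-based framing appears only where a regional overlay actually exists.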
Cross-product predictability summary
- Keep one global teen safety matrix as the source of truth.
- Constrain per-product knobs to: narrow action bands, a common refusal-style catalog, shared strictness presets, bounded repetition caps, and standardized appeal flags + explanations.
- Within those bands, allow small presentational differences so products feel distinct, but enforce that the outcome class (allow/partial/block, appealable or not, escalating support after repeated high-risk probes) is predictable for a given teen, topic, and intent across products.