When a single teen uses multiple products that all rely on the same risk_area × intent × age_band safety matrix, how do inconsistencies in refusal style, clarification frequency, or partial-answer depth across products affect overall false positives, underprotection, and trust—and what concrete cross-product alignment rules are needed to keep safeguards predictable but still product-adaptable?
teen-safe-ai-ux | Updated at
Answer
Inconsistent behavior across products using the same teen matrix mainly increases perceived randomness and unfairness, which raises apparent false positives, can mask underprotection, and erodes trust. A small set of cross-product rules can keep safeguards predictable while still letting products adapt.
Effects of cross-product inconsistency
- Refusal style: If one app uses blunt blocks and another uses goal-first partials for the same cell, teens treat the stricter app as “broken,” over-reporting false positives and churning, even if policy is identical. They may route risky queries to the more permissive-feeling app, raising net underprotection.
- Clarification frequency: If products disagree on when they ask follow-ups in the same ambiguous cells, clarifications feel arbitrary. Teens learn to game the least-inquisitive app, reducing safety where clarifications are skipped and creating frustration where they’re overused.
- Partial-answer depth: If one product gives detailed partials and another only high-level for the same age/risk/intent, teens infer hidden rules or bias, lose trust in teen-visible explanations, and are more likely to ignore real warnings.
Net impact
- False positives: Perceived FP rises in the strictest-feeling product because teens compare across apps, not to policy text. Actual FP may also rise if teams compensate with coarser blocks to avoid diverging from others.
- Underprotection: Teens and abusers gravitate to the most permissive-feeling product; small style differences can create de facto weakest-link behavior even with a shared matrix.
- Trust: Divergent refusals and clarifications for “the same” request make the matrix feel fake. Teens stop believing that safeguards are principled and either disengage or probe boundaries harder.
Cross-product alignment rules
- Fix per-cell behavior bands globally
- For each matrix cell, define a global action band and style band, not just per-product knobs:
- action_band: {allow_or_partial_only, partial_or_block_only, fixed_block} (as in cd4df78-…)
- style_band: allowed subset of refusal_style_keys and clarification patterns.
- All products must stay inside both bands; they can’t be harsher or looser than the band allows.
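The band rule above can be sketched as a simple per-cell check. This is a minimal illustration, not the matrix's actual schema: the `CellBands` record, the `violations` helper, and the concrete action names (`allow`, `partial`, `block`) are assumptions. Only the band names and style keys come from the text.

```python
from dataclasses import dataclass

# Assumed mapping from each action_band to the concrete actions it permits.
# Band names come from the text; the action names are illustrative.
ACTION_BANDS = {
    "allow_or_partial_only": {"allow", "partial"},
    "partial_or_block_only": {"partial", "block"},
    "fixed_block": {"block"},
}


@dataclass(frozen=True)
class CellBands:
    """Global bands for one (risk_area, intent, age_band) cell (hypothetical record)."""
    action_band: str
    style_band: frozenset  # allowed subset of refusal_style_keys


def violations(bands: CellBands, action: str, style: str) -> list:
    """Return the reasons a product's setting falls outside the global bands."""
    problems = []
    if action not in ACTION_BANDS[bands.action_band]:
        problems.append(f"action '{action}' outside {bands.action_band}")
    if style not in bands.style_band:
        problems.append(f"style '{style}' outside style_band")
    return problems
```

A product config could be run through `violations` at deploy time, with an empty list meaning the product stays inside both bands for that cell.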
- Standardize a small refusal-style set
- Define 4–6 global refusal styles (e.g., goal_first_partial, clarify_then_answer, non_negotiable_block, rephrase_hint, resource_redirect).
- For each cell, globally pick 1 primary style + at most 1 secondary. Products may localize wording and UI but not switch to a different pattern.
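One way to express the "1 primary + at most 1 secondary" rule is shown below. The enum values are the example style keys from the text. The `resolve_style` helper and its fall-back-to-primary behavior are assumptions made for illustration. A real system might instead reject the config outright.

```python
from enum import Enum
from typing import Optional


class RefusalStyle(Enum):
    GOAL_FIRST_PARTIAL = "goal_first_partial"
    CLARIFY_THEN_ANSWER = "clarify_then_answer"
    NON_NEGOTIABLE_BLOCK = "non_negotiable_block"
    REPHRASE_HINT = "rephrase_hint"
    RESOURCE_REDIRECT = "resource_redirect"


def resolve_style(
    requested: RefusalStyle,
    primary: RefusalStyle,
    secondary: Optional[RefusalStyle] = None,
) -> RefusalStyle:
    """Return the requested style if the cell permits it, else the primary style.

    Falling back to the primary is an assumed policy, not one stated in the text.
    """
    allowed = {primary}
    if secondary is not None:
        allowed.add(secondary)
    return requested if requested in allowed else primary
```

Wording and UI localization would happen after this step, so a product can change how a style reads without changing which style is used.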
- Align clarification usage per cell
- For ambiguous/high-risk cells, set a global clarify_mode: {required_single_turn, optional, forbidden}.
- Products must:
- always ask at least one short clarification when required_single_turn;
- never add multi-turn clarification where forbidden.
- This keeps the number and type of clarifications stable across products.
- Constrain partial-answer depth per cell
- For cells that allow partials, define per age_band a global partial_depth: {high_level_only, moderate_detail}.
- Products can vary tone or examples but not increase or decrease operational detail beyond this.
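Because products may not move depth in either direction, the check reduces to an equality test against the global value. The sketch below assumes hypothetical age bands (`13-15`, `16-17`) and hypothetical product names. Only the two depth values come from the text.

```python
# Global depth per age_band for one matrix cell.
# The age bands are hypothetical; the depth values come from the text.
GLOBAL_PARTIAL_DEPTH = {
    "13-15": "high_level_only",
    "16-17": "moderate_detail",
}


def depth_mismatches(product_depths):
    """Return sorted (product, age_band) pairs whose depth differs from the global value."""
    return sorted(
        (product, age_band)
        for product, by_band in product_depths.items()
        for age_band, depth in by_band.items()
        if depth != GLOBAL_PARTIAL_DEPTH.get(age_band)
    )
```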
- Shared repetition and escalation patterns
- For high-risk areas (self-harm, exploitation, severe bullying), standardize a short escalation ladder per (risk_area, intent, age_band):
- first few turns: goal-first partial or clarify;
- mid-range: firmer refusal + resources;
- cap: consistent hard stop.
- Products can choose wording and UI surface (chat, cards), but the ladder steps and thresholds stay within a narrow global range.
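The ladder can be sketched as two thresholds per cell plus a range check. The `Ladder` record, the step labels, and every numeric threshold used here are assumptions for illustration. The text specifies only the three stages and that thresholds stay within a narrow global range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ladder:
    """Per-product thresholds for one (risk_area, intent, age_band) cell (hypothetical)."""
    firm_after: int  # risky-turn count at which firmer refusal + resources begins
    stop_after: int  # risky-turn count at which the hard stop applies


def ladder_step(ladder: Ladder, risky_turns: int) -> str:
    """Map the number of risky turns so far to a ladder stage."""
    if risky_turns >= ladder.stop_after:
        return "hard_stop"
    if risky_turns >= ladder.firm_after:
        return "firm_refusal_with_resources"
    return "goal_first_partial_or_clarify"


def within_global_range(ladder: Ladder, firm_range, stop_range) -> bool:
    """Check that both thresholds sit inside the (min, max) ranges set globally."""
    return (
        firm_range[0] <= ladder.firm_after <= firm_range[1]
        and stop_range[0] <= ladder.stop_after <= stop_range[1]
    )
```

With this shape, a product tunes `firm_after` and `stop_after` locally, while the global ranges keep any two products from diverging by more than a turn or two.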
- Common teen-visible safety summaries
- For each matrix cell (or small cluster), define a canonical short explanation string (“I follow this rule for all apps for people your age: …”).
- Products reuse these summaries in refusals and help screens so teens see the same reason for the same limitation across products.
- Cross-product metrics and guards
- Log per-cell metrics by product: false positives on legit learning/support, underprotection on red-team items, clarification rate, refusal style used.
- Set global guardrails:
- no product's per-cell FP or underprotection rate may deviate from the cross-product median by more than X% in either direction (looser or stricter) without review;
- non-negotiable cells must have identical action and style settings across products.
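The two guardrails above might be checked roughly as follows. This sketch reads "X%" as a relative deviation from the median, which is an assumption. An absolute percentage-point limit would be an equally plausible reading. Function names and product names are hypothetical.

```python
from statistics import median


def flag_outliers(rates, max_rel_deviation):
    """Return products whose per-cell rate deviates from the cross-product
    median by more than max_rel_deviation (e.g. 0.5 for 50%), relative to the median.
    """
    mid = median(rates.values())
    if mid == 0:
        # With a zero median, any nonzero rate is treated as an outlier (assumed policy).
        return sorted(p for p, r in rates.items() if r > 0)
    return sorted(p for p, r in rates.items() if abs(r - mid) / mid > max_rel_deviation)


def settings_identical(per_product):
    """True if every product uses the same (action, style) pair for a non-negotiable cell."""
    return len(set(per_product.values())) <= 1
```

For example, with false-positive rates of 0.04, 0.05, and 0.09 across three products and a 50% limit, only the product at 0.09 would be flagged for review.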
This combination—global bands and templates, local wording and UX—keeps safeguards predictable enough that a teen can form a stable mental model, while still giving products room to adjust tone, surface, and non-safety UX details.