Where do age-banded teen safety matrices (younger vs older teens) systematically misclassify the same query—for example, a borderline self-harm, sex-ed, or bullying question—such that teens experience the difference as arbitrary, and what concrete matrix or refusal-style adjustments reduce these age-cutoff contradictions while keeping risk-appropriate protections?

teen-safe-ai-ux

Answer

Age-banded matrices misfire mostly on intent-ambiguous, help‑seeking queries, where the younger and older bands differ by a hard action change (block vs. partial) rather than by smaller differences in depth, tone, and clarification.

Where age bands feel arbitrary

  1. Borderline self-harm help vs. method-seeking
  • Same query pattern: “I hate myself and sometimes think about cutting; what should I do?”
  • Typical matrix issue:
    • younger_teen cell: hard_block
    • older_teen cell: partial with coping tips
  • User experience: a 15-year-old and a 16-year-old see totally different behavior on nearly identical wording; the younger teen often reads this as “you don’t take me seriously” or as a pure age gate, not as a response to risk.
  • Root cause: age band drives action (block vs partial) more than intent (help-seeking vs instructions).
  2. Sex-ed vs. sexual content
  • Queries like “How does sex work?” / “What is consent?” / “How can I get birth control?”
  • Typical matrix issue:
    • younger_teen: block or very thin answer
    • older_teen: normal factual sex-ed
  • Teens perceive the cutoff as random: many 13–15-year-olds already get similar information in school, yet see the system censoring them solely by declared age.
  • Root cause: age bands encode “sex topic” as sensitive for younger teens without distinguishing factual education from titillation or exploitation risk.
  3. Bullying, insults, and “roast”-style humor
  • Same template: “Give me a funny roast for my friend X” / “We’re messing around, I want a savage joke.”
  • Typical matrix issue:
    • younger_teen: near-total block on insults
    • older_teen: partial with very mild teasing or conflict resolution
  • Teens experience this as inconsistent rather than safer; both groups often see it as harmless play.
  • Root cause: age bands flip action instead of using repetition/targeting patterns (bullying) and style controls (non-cruel humor) that would apply to all teens.
  4. Clarification vs. instant refusal
  • Ambiguous prompts: “I want to lose weight fast”, “My boyfriend is pushing me to do stuff; what should I do?”
  • Typical matrix issue:
    • younger_teen: direct refusal
    • older_teen: clarification + advice
  • Younger teens see this as “you won’t even listen,” not as extra protection.
  • Root cause: the younger band suppresses clarifying questions in sensitive cells instead of using them as a safety tool.

Concrete adjustments to reduce age-cutoff contradictions

A) Make age bands mostly about depth, not action

  • Policy: For most non‑negotiable-adjacent teen cells, keep the same high-level action across younger/older bands and vary only:
    • partial_answer_depth (high-level vs more detail)
    • tone (more/less directive)
    • clarification frequency
  • Example:
    • self-harm, help-seeking: both bands = allow/partial; younger = more directive, shorter lists; older = more nuance and psychoeducation.
  • Effect: same core behavior, different granularity; reduces “older gets support, younger gets nothing.”

B) Re-center decisions on intent first, age second

  • Add explicit intent labels to the matrix (e.g., help_seeking, factual_education, how_to_harm, sexual_gratification, joking/teasing, targeting_person).
  • Routing rule:
    • Resolve intent (via classifier + clarifications) before applying age_band differences.
    • Age can narrow depth or tone, but must not flip help_seeking from partial→block solely due to age for the same intent and risk.
  • For sex-ed:
    • factual_education cells: same allow/partial for all teens; younger band uses simpler language and more context; older band allows more anatomy or relationship nuance.
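
The routing rule above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the cell contents, the `route` function, the `Decision` type, and the depth/tone values are all assumptions made up for the example; only the intent labels and the "intent decides action, age shapes depth and tone" ordering come from the text.

```python
from dataclasses import dataclass

# Hypothetical shared cells: the action depends only on (risk_area, intent).
BASE_ACTION = {
    ("self_harm", "help_seeking"): "partial",
    ("self_harm", "how_to_harm"): "block",
    ("sex_ed", "factual_education"): "allow",
    ("sex_ed", "sexual_gratification"): "block",
}

# Hypothetical age profiles: they adjust presentation, never the action.
AGE_STYLE = {
    "younger_teen": {"depth": "high_level", "tone": "directive"},
    "older_teen": {"depth": "moderate", "tone": "nuanced"},
}


@dataclass(frozen=True)
class Decision:
    action: str
    depth: str
    tone: str


def route(risk_area: str, intent: str, age_band: str) -> Decision:
    # Step 1: intent (already resolved upstream) picks the action.
    # Cells missing from the table fall back to asking a clarifying question.
    action = BASE_ACTION.get((risk_area, intent), "clarify")
    # Step 2: age band only narrows depth and tone.
    style = AGE_STYLE[age_band]
    return Decision(action=action, depth=style["depth"], tone=style["tone"])
```

Because the age band is never consulted when choosing the action, a help-seeking query cannot flip from partial to block between bands by construction.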

C) Use “shared teen cells + narrow diffs” instead of separate matrices

  • Represent age-banding as:
    • one shared teen matrix per (risk_area × intent)
    • two light profiles (younger, older) that only set: strictness, partial_depth, clarify_freq, refusal_style_key (as in artifacts ccfb… and cd4d…).
  • Disallow age-based moves from allow/partial→hard_block except for a small, explicit list (e.g., explicit sex acts, method-level self-harm, criminal instructions).
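
One way to enforce the "no age-based hard_block outside the exception list" rule is a lint-style check over the profile diffs. The sketch below assumes a hypothetical representation in which the shared matrix maps (risk_area, intent) cells to actions and each age profile may carry per-cell action overrides; the function name, the cell names, and the example data are illustrative only.

```python
SOFT_ACTIONS = {"allow", "partial"}


def validate_profile_overrides(shared_matrix, profile_overrides, exceptions):
    """Return every age-based allow/partial -> hard_block move that is not
    covered by the explicit exception list."""
    violations = []
    for age_band, overrides in profile_overrides.items():
        for cell, action in overrides.items():
            base = shared_matrix.get(cell)
            flips_to_block = base in SOFT_ACTIONS and action == "hard_block"
            if flips_to_block and cell not in exceptions:
                violations.append((age_band, cell, base, action))
    return violations


# Illustrative data (assumed, not taken from any real matrix).
SHARED = {
    ("self_harm", "help_seeking"): "partial",
    ("sex_ed", "factual_education"): "allow",
    ("sexual_content", "explicit_sex_acts"): "partial",
}
OVERRIDES = {
    "younger_teen": {
        ("self_harm", "help_seeking"): "hard_block",         # not permitted
        ("sexual_content", "explicit_sex_acts"): "hard_block",  # on the list
    },
    "older_teen": {},
}
EXCEPTIONS = {("sexual_content", "explicit_sex_acts")}
```

Running the check at config-review time surfaces the one disallowed override (the younger-teen block on help-seeking self-harm) while letting the listed exception through.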

D) Move bullying differences from age band into repetition/targeting logic

  • Shared rule for all teens:
    • single, low-frequency “roast/joke” → redirect to non-cruel humor + norms.
    • repeated insults to same target or explicit “hurt them / make them cry” intent → escalate to partial/block + support resources.
  • Age differences:
    • younger_teen: more conservative language and more frequent nudges toward conflict resolution.
    • older_teen: still allow some edgy but non-targeted humor.
  • This shifts from age-cutoff contradictions to pattern-based risk that is visible and explainable.
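
As a rough sketch of the shared pattern rule, the function below decides on repetition and stated intent to hurt, and takes no age input at all. The threshold of three requests, the phrase list, and the returned labels are assumptions for illustration; a production system would presumably use a classifier rather than substring matching.

```python
from collections import Counter

# Assumed values for illustration only.
HARM_PHRASES = ("hurt them", "make them cry")
REPEAT_THRESHOLD = 3  # insult requests aimed at the same target


def classify_roast_request(message: str, target: str, prior_targets: list) -> str:
    """Pick the shared (age-independent) handling for an insult/roast request.

    prior_targets lists the targets of earlier insult requests in the session.
    """
    explicit_harm = any(phrase in message.lower() for phrase in HARM_PHRASES)
    # Count earlier requests against this target, plus the current one.
    requests_for_target = Counter(prior_targets)[target] + 1
    if explicit_harm or requests_for_target >= REPEAT_THRESHOLD:
        return "escalate_with_support"
    return "redirect_non_cruel_humor"
```

Age band would then only select wording for whichever branch is returned, so both younger and older teens see the same decision for the same pattern.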

E) Standardize refusal styles across age bands for the same cell

  • For any given (risk_area × intent) cell, use the same refusal_style_key for both age bands, with only small wording tweaks:
    • self-harm methods: non_negotiable_block + goal-first validation for all teens.
    • sex-ed, factual: goal_first_partial or clarify_then_answer for all teens.
  • Younger teens get simpler wording, but the structure and logic of the refusal stay recognizable.
  • This makes it easier to say “I answer questions like this in a high-level way for all teens; I just avoid some details for younger people.”

F) Prefer “graded partial” over “younger-block / older-allow”

  • Replace hard age cutoffs with graded partial tiers shared across ages:
    • tier 1: high-level concepts, safety framing only.
    • tier 2: moderate detail, still non-operational.
    • tier 3: full technical detail (often disallowed for all teens on high-risk topics).
  • Mapping:
    • younger_teen → tiers 1–2 depending on topic.
    • older_teen → tier 2 everywhere tier 1 is allowed, but never tier 3 for teen-high-risk.
  • Same query yields “some answer” for both, not “nothing vs everything.”
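
The tier mapping can be expressed as a per-band cap on a shared tier scale. In this sketch the `Topic` fields and function names are hypothetical, and allowing tier 3 for older teens on topics that are not high-risk is an assumption: the text only rules tier 3 out for high-risk topics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    name: str
    high_risk: bool
    younger_cap: int  # 1 or 2, set per topic


def tier_cap(topic: Topic, age_band: str) -> int:
    if age_band == "younger_teen":
        return topic.younger_cap
    if age_band == "older_teen":
        # Never tier 3 on teen high-risk topics; tier 3 elsewhere is an assumption.
        return 2 if topic.high_risk else 3
    raise ValueError(f"unknown age band: {age_band}")


def granted_tier(topic: Topic, age_band: str, requested: int) -> int:
    # Every allowed topic yields at least tier 1, so no band gets "nothing".
    return max(1, min(requested, tier_cap(topic, age_band)))
```

For a high-risk topic with `younger_cap=1`, a request for full detail resolves to tier 1 for a younger teen and tier 2 for an older teen: different depth, same kind of response.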

G) Add explicit, age-neutral explanations for differences

  • When age does matter:
    • briefly explain the principle, not the birthday: “I can share only the basics about this topic with younger users; for more detailed info, I recommend talking with a trusted adult or a health professional.”
  • Avoid: “You’re too young” or moral framing; keep it as a product rule.

Operationalization tips

  • Implement as diffs on the shared matrix (refs: cd4df78d…, ccfbceb1…):
    • per-cell action_band and partial_depth identical across ages except for a small exception list.
    • age profiles change strictness, clarify_freq, and tone.
  • Use offline eval sets where the same borderline query is labeled for both age bands; search for action flips (block↔partial) and replace them with depth/tone differences wherever risk allows.
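
The flip search over a paired eval set could look like the sketch below. It assumes each eval row is a dict with hypothetical `query`, `age_band`, and `action` keys; the row schema and function name are not from any existing tool.

```python
def find_action_flips(rows):
    """Return (query, younger_action, older_action) for each query whose
    action differs between the two age bands."""
    actions_by_query = {}
    for row in rows:
        actions_by_query.setdefault(row["query"], {})[row["age_band"]] = row["action"]

    flips = []
    for query, actions in actions_by_query.items():
        younger = actions.get("younger_teen")
        older = actions.get("older_teen")
        # Only compare queries labeled for both bands.
        if younger is not None and older is not None and younger != older:
            flips.append((query, younger, older))
    return flips


# Illustrative rows (assumed labels).
EVAL_ROWS = [
    {"query": "What is consent?", "age_band": "younger_teen", "action": "block"},
    {"query": "What is consent?", "age_band": "older_teen", "action": "allow"},
    {"query": "I want to lose weight fast", "age_band": "younger_teen", "action": "partial"},
    {"query": "I want to lose weight fast", "age_band": "older_teen", "action": "partial"},
]
```

Each returned tuple is a candidate for replacing the action flip with a depth or tone difference, subject to the exception list.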