When teen users encounter repeated graceful refusals on the same high‑risk topic across days or sessions, which concrete state and logging schemes (for example, per-topic repetition counters, cooling‑off timers, or cross‑session risk flags) best maintain strict non‑negotiable safeguards while avoiding a ‘locked out’ experience that discourages legitimate future help‑seeking?

teen-safe-ai-ux | Updated at 2026-04-06 19:34

Answer

Use small, topic-scoped state plus coarse history flags, with caps that change how the assistant responds, never whether the teen is permanently blocked.

Per-topic repetition counters
- Maintain short-lived counters keyed by {user_id (or pseudonymous bucket), risk_area, intent_band}.
- Example caps (per 24h window):
  - self-harm methods: 3–5 blocked turns → “high-fixation” mode.
  - sexual exploitation how-to: 1–3 blocked turns → “strict supervision” mode.
- When the window expires, reset the counter and drop back to the normal refusal flows.
Cooling-off timers
- For each high-risk topic, set a cooldown after hitting a cap (e.g., 30–120 minutes session-level; 12–24h for extreme methods).
- During cooldown: keep blocking, but shift to shorter, less repetitive text and emphasize safer topics or human help.
- Show a clear, stable rule: “I have to give the same answer about this topic for a while, but I can still help with [X, Y, Z].”
Cross-session risk flags
- Keep 1–3 coarse flags per user (or user bucket): {recent_self_harm_fixation, recent_exploitation_fixation, recent_bullying_fixation} with TTL (e.g., 1–7 days).
- Flags do not hard-block the topic; they select stricter templates: more goal-first pivots, more outreach prompts, more checks for ambiguity.
- Never expose the flag name to the teen; surface only stable rules.
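The flag behavior above can be sketched as a small expiring-flag store. This is a minimal illustration, not a reference implementation: the class `RiskFlags`, the function `template_tier`, the tier names, and the specific TTL values are all assumptions, chosen from the 1–7 day range suggested above.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

DAY = 86400  # seconds

# Hypothetical TTLs, picked from the 1-7 day range in the text.
FLAG_TTL_SECONDS: Dict[str, int] = {
    "recent_self_harm_fixation": 3 * DAY,
    "recent_exploitation_fixation": 7 * DAY,
    "recent_bullying_fixation": 1 * DAY,
}


@dataclass
class RiskFlags:
    """Coarse per-user flags: flag name -> expiry timestamp (seconds)."""
    expires_at: Dict[str, float] = field(default_factory=dict)

    def raise_flag(self, name: str, now: float) -> None:
        # Raising a flag again simply extends its expiry.
        self.expires_at[name] = now + FLAG_TTL_SECONDS[name]

    def active(self, now: float) -> Set[str]:
        # Expired flags are dropped, so no flag becomes permanent.
        self.expires_at = {n: t for n, t in self.expires_at.items() if t > now}
        return set(self.expires_at)


def template_tier(flags: RiskFlags, flag_name: str, now: float) -> str:
    """A flag never blocks a topic; it only selects a stricter template tier."""
    return "strict" if flag_name in flags.active(now) else "standard"
```

The flag name stays internal: callers receive only a tier label, which matches the rule that the teen sees stable rules rather than status names.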

Non-negotiables stay fixed
- Methods and facilitation are always blocked (per b7da0951-9d78-4506-8009-5b3e0f7f31df, 16be7fee-dbb7-477b-96a5-76a2a874729e).
- State only changes tone, length, and which safe lanes (coping, psychoeducation, outreach) are prioritized, not whether methods are ever given.
Avoiding perceived lockout
- Even at caps, always offer at least one forward path: coping, feelings labeling, outreach planning, basic education, or meta-help (“how to talk to an adult”).
- Rotate refusal templates so repeated blocks feel consistent but not copy-pasted.
- Use brief, rule-based explanations (ede05567-daab-4823-b870-0df9ef93f8a4) that clarify: “this rule is stable; you are not in trouble.”
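One way to rotate templates while guaranteeing a forward path is sketched below. The wording of the variants and forward paths is hypothetical placeholder copy, and `build_refusal` is an assumed helper name; only the "this rule is stable; you are not in trouble" message comes from the text above.

```python
import random
from typing import Optional, Tuple

# Hypothetical wording; real templates would come from reviewed product copy.
REFUSAL_VARIANTS = [
    "I can't help with that part.",
    "That's something I'm not able to go into.",
    "I have to give the same answer on this topic.",
]
RULE_NOTE = "This rule is stable; you are not in trouble."
FORWARD_PATHS = [
    "I can help you find ways to cope right now.",
    "I can help you put a name to what you're feeling.",
    "I can help you plan how to talk to an adult you trust.",
]


def build_refusal(last_variant: Optional[int], rng: random.Random) -> Tuple[int, str]:
    """Pick a variant different from the previous one and append a forward path."""
    candidates = [i for i in range(len(REFUSAL_VARIANTS)) if i != last_variant]
    variant = rng.choice(candidates)
    path = rng.choice(FORWARD_PATHS)
    # Every reply carries the stable-rule note plus at least one way forward.
    return variant, f"{REFUSAL_VARIANTS[variant]} {RULE_NOTE} {path}"
```

Returning the chosen index lets the caller store it in session state and pass it back on the next blocked turn, so two consecutive refusals never use the same opening line.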

State keys
- risk_area × intent × age_band cell from the teen matrix.
- Example: {self_harm, how_to, 13–15}; {sex_exploitation, logistics, 16–17}.
Counters
- Increment counter[cell] on each blocked turn or high-risk partial response.
- If counter[cell] ≥ cap[cell]: set mode[cell] ∈ {high_fixation, strict_supervision} and start cooldown.
- Decrement or reset counters when the cooldown ends or the daily window expires.
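The state keys and counter rules above might look like the following sketch. `CellPolicy`, `TopicState`, and the numbers in `POLICIES` are illustrative assumptions (the caps and cooldowns are picked from the ranges mentioned in this answer), and the daily-window reset is omitted for brevity: here counters reset only when the cooldown ends.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Cell = Tuple[str, str, str]  # (risk_area, intent, age_band)


@dataclass(frozen=True)
class CellPolicy:
    cap: int           # blocked turns before the mode changes
    mode: str          # mode entered when the cap is hit
    cooldown_s: float  # cooldown length in seconds


# Hypothetical values chosen from the ranges in the text.
POLICIES: Dict[Cell, CellPolicy] = {
    ("self_harm", "how_to", "13-15"): CellPolicy(3, "high_fixation", 4 * 3600),
    ("sex_exploitation", "logistics", "16-17"): CellPolicy(1, "strict_supervision", 12 * 3600),
}


class TopicState:
    def __init__(self) -> None:
        self.counter: Dict[Cell, int] = {}
        self.cooldown_until: Dict[Cell, float] = {}

    def record_blocked_turn(self, cell: Cell, now: float) -> str:
        """Count one blocked turn and return the mode to use for this reply."""
        self._expire(cell, now)
        policy = POLICIES[cell]
        self.counter[cell] = self.counter.get(cell, 0) + 1
        if self.counter[cell] >= policy.cap and cell not in self.cooldown_until:
            self.cooldown_until[cell] = now + policy.cooldown_s
        return self.mode(cell, now)

    def mode(self, cell: Cell, now: float) -> str:
        self._expire(cell, now)
        return POLICIES[cell].mode if cell in self.cooldown_until else "normal"

    def _expire(self, cell: Cell, now: float) -> None:
        # Once the cooldown is over, reset the counter and return to normal.
        if cell in self.cooldown_until and now >= self.cooldown_until[cell]:
            del self.cooldown_until[cell]
            self.counter[cell] = 0
```

Because state is keyed by cell, hitting the cap for one cell leaves every other cell in `normal` mode, which is what keeps a fixation on one topic from degrading help on unrelated ones.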
Modes → template selection
- normal: standard graceful refusal flows (2928b3da-97db-4a5e-aeaf53146281, 28348b04-ac6e-445e-bff6-86640a5d23e1).
- high_fixation (e.g., self-harm): shorter text, more direct outreach, tighter rephrasing hints, no extra elaboration on the blocked topic.
- strict_supervision (e.g., exploitation): refuse any edge cases in that cell band; shift to safety, consent, and exploitation-awareness content only.
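A minimal sketch of the mode-to-template mapping follows, written so that the block decision depends on intent alone and the mode only shapes the refusal. The names `BLOCKED_INTENTS`, `ResponsePlan`, `plan_response`, the template labels, and the sentence limits are assumptions for illustration; the sketch also does not model the stricter edge-case handling of strict_supervision.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical intent labels standing in for "methods and facilitation".
BLOCKED_INTENTS = {"how_to", "logistics"}


@dataclass(frozen=True)
class ResponsePlan:
    blocked: bool
    template: str
    max_sentences: int
    lanes: Tuple[str, ...]  # safe lanes to offer, in priority order


# mode -> (template label, length limit, prioritized safe lanes)
STYLE_BY_MODE: Dict[str, Tuple[str, int, Tuple[str, ...]]] = {
    "normal": ("graceful_refusal", 6, ("coping", "psychoeducation", "outreach")),
    "high_fixation": ("short_refusal", 3, ("outreach", "coping")),
    "strict_supervision": ("strict_refusal", 3, ("safety", "consent", "exploitation_awareness")),
}


def plan_response(intent: str, mode: str) -> ResponsePlan:
    # The block decision reads the intent only; mode can never unblock or over-block.
    blocked = intent in BLOCKED_INTENTS
    if not blocked:
        return ResponsePlan(False, "supportive_answer", 12, ("full_support",))
    template, max_sentences, lanes = STYLE_BY_MODE[mode]
    return ResponsePlan(True, template, max_sentences, lanes)
```

Keeping `blocked` independent of `mode` makes the non-negotiable rule easy to test: for any mode, a methods request is blocked and a help-seeking request is not.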
Logging (minimal)
- Log aggregates only: distribution of counters, time spent in each mode, re-ask rates after cooldown.
- Use logs to tune caps and cooldown lengths; avoid storing raw teen content unless governed by strict privacy rules.
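Aggregate-only logging could be sketched as below. `AggregateLog` and its field names are hypothetical, and the sketch counts turns per mode as a rough proxy for time spent in each mode. Nothing here stores user identifiers or message text.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AggregateLog:
    """Holds counts only: no user ids and no message content."""
    turns_by_mode: Counter = field(default_factory=Counter)
    counter_histogram: Counter = field(default_factory=Counter)
    cooldowns_ended: int = 0
    reasks_after_cooldown: int = 0

    def record_turn(self, risk_area: str, mode: str, counter_value: int) -> None:
        # Turn counts per mode approximate time spent in each mode.
        self.turns_by_mode[(risk_area, mode)] += 1
        self.counter_histogram[(risk_area, counter_value)] += 1

    def record_cooldown_end(self, reasked: bool) -> None:
        self.cooldowns_ended += 1
        if reasked:
            self.reasks_after_cooldown += 1

    def reask_rate(self) -> float:
        """Share of finished cooldowns that were followed by a re-ask."""
        if self.cooldowns_ended == 0:
            return 0.0
        return self.reasks_after_cooldown / self.cooldowns_ended
```

A high re-ask rate after cooldown would suggest the cooldown length or the cap needs tuning; the histogram shows how often users actually approach each cap.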

Day 1: teen asks 3 times for self-harm methods.
- Turns 1–2: normal refusal with coping and outreach suggestions.
- Turn 3: cap hit → enter high_fixation + start 4h cooldown. Refusal becomes firmer, shorter, strongly oriented to human help.
- During cooldown: any self-harm-how-to phrasing gets the high_fixation template; other mental-health questions still get normal support.
Day 3: the cooldown has expired and the counters have decayed.
- Teen asks “how to talk to my parent about feeling this way.”
- No lockout: treated as help-seeking; full, supportive answer.

Per-topic counters + cooldowns: good trade-off between detecting fixation (16be7fee-dbb7-477b-96a5-76a2a874729e) and not freezing future help.
Coarse cross-session flags: enough to bias toward safer flows without creating a permanent “red status.”
No permanent bans for teens on mental-health or sex/exploitation topics; only permanent method/facilitation bans at the content level.

Evidence is limited; these designs are mainly mechanism- and experience-driven rather than trial-proven.