If we temporarily invert the dominant assumption that teen safety lives mostly in classifiers and refusal templates, and instead design around teen-visible safety summaries as the primary safeguard (for example, attaching a short, persistent rule explanation and self-check prompt to every high-risk answer), does teens’ behavior shift enough, through less risky probing and more self-moderation, that the underlying matrix can safely be simpler or looser without increasing actual harm? Or do teens mostly treat these summaries as ignorable decoration?
Answer
Likely partial shift: well-designed, persistent teen-visible safety summaries can nudge some teens toward self-moderation and reduce a slice of risky probing, but not enough to let you meaningfully loosen core matrix actions on high-risk cells. They should be treated as a complementary layer, not a replacement for tight policies and classifiers.
Practical stance:
- Keep the risk×intent×age matrix and non‑negotiables strict.
- Use summaries to (a) explain rules, (b) prompt self-checks, (c) reduce re-asks and anger.
- Expect mixed engagement: some teens read and adapt; many skim; a minority will probe regardless.
- If summaries prove effective, you may simplify response style and some borderline cells on that basis, but not core protections.
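The stance above can be sketched as a small lookup table in which non-negotiable cells are marked as locked, so that evidence from summaries can only ever loosen borderline cells. This is a minimal sketch, not an actual policy: the tier names, intents, actions, summary keys, and the `lookup` and `loosen` helpers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace

# Hypothetical cell of a risk x intent x age matrix.
@dataclass(frozen=True)
class Cell:
    action: str           # illustrative actions: "answer", "partial", "refuse"
    summary_id: str       # key of the teen-visible summary shown with the response
    locked: bool = False  # True marks a non-negotiable cell

# Illustrative entries only; a real matrix would be far larger.
MATRIX = {
    ("high", "method_seeking", "teen"): Cell("refuse", "self_harm_methods", locked=True),
    ("high", "support_seeking", "teen"): Cell("partial", "self_harm_methods", locked=True),
    ("borderline", "school", "teen"): Cell("partial", "sensitive_topic"),
}

# Unknown combinations fall back to the strictest behavior.
DEFAULT = Cell("refuse", "general_safety", locked=True)

def lookup(risk, intent, age_band):
    """Return the cell for a (risk, intent, age) combination."""
    return MATRIX.get((risk, intent, age_band), DEFAULT)

def loosen(matrix, key, new_action):
    """Return a copy of the matrix with one cell's action changed.

    Locked (non-negotiable) cells, including unknown keys that fall back
    to DEFAULT, can never be changed this way.
    """
    cell = matrix.get(key, DEFAULT)
    if cell.locked:
        raise ValueError(f"cell {key} is non-negotiable")
    updated = dict(matrix)
    updated[key] = replace(cell, action=new_action)
    return updated
```

Keeping `summary_id` on the cell means the same summary text accompanies both refusals and partial answers for a topic, while the `locked` flag makes the boundary between adjustable and core cells explicit in the data rather than in reviewer memory.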
Design implications:
- Show short, stable rule labels ("I can’t give methods for self-harm; I can talk about coping and support").
- Reuse the same summary across refusals and partial answers to build mental models.
- Add 1–2 one-click actions ("I’m asking for school", "This is about my feelings") that clarify the teen’s intent without ever lowering the assessed risk level.
- Track whether summaries reduce repeated high-risk probes before changing any matrix cells.
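Two of the points above lend themselves to a short sketch: one-click actions that change only the intent, and a metric for repeated high-risk probes. The code below is a rough illustration under stated assumptions: the label-to-intent mapping, the event format `(session_id, is_high_risk_probe)`, and the definition of "repeat probe rate" (the share of probing sessions that contain a second probe) are all hypothetical, not an established measure.

```python
from collections import defaultdict

# Hypothetical mapping from one-click labels to intent values.
INTENT_ACTIONS = {
    "I'm asking for school": "school",
    "This is about my feelings": "support_seeking",
}

def apply_intent_action(risk, intent, label):
    """Return (risk, intent) after a click.

    Risk is passed through unchanged, so a click can refine intent
    but can never lower the assessed risk level.
    """
    return risk, INTENT_ACTIONS.get(label, intent)

def repeat_probe_rate(events):
    """Share of sessions with a high-risk probe that contain at least two.

    events: iterable of (session_id, is_high_risk_probe) tuples.
    Returns 0.0 when no session contains a probe.
    """
    probes = defaultdict(int)
    for session_id, is_probe in events:
        if is_probe:
            probes[session_id] += 1
    if not probes:
        return 0.0
    repeated = sum(1 for n in probes.values() if n >= 2)
    return repeated / len(probes)

# Illustrative comparison between two made-up cohorts.
with_summary = [("a", True), ("a", False), ("b", True), ("b", True)]
without_summary = [("c", True), ("c", True), ("d", True), ("d", True)]
rate_with = repeat_probe_rate(with_summary)
rate_without = repeat_probe_rate(without_summary)
```

A lower rate in the cohort that saw summaries would be a signal worth investigating, not proof that a cell can be loosened; a comparison like this says nothing about harm outside the product or about teens who simply leave the session.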
Net: summaries are more than decoration, but they are not strong enough behavior levers to justify a broadly looser or simpler underlying matrix on their own.