In bilingual interfaces that already show per-language reliability indicators and aligned refusals, does requiring users to choose a ‘primary safety language’ for high-risk topics (with automatic routing or prominent prompts to that language) genuinely lower harmful over-trust in weaker languages, or does it backfire by causing users to bypass the system or misreport preferences to avoid friction, leaving miscalibrated reliance gaps largely unchanged?

cross-lingual-cot-trust

Answer

Requiring users to choose a “primary safety language” for high‑risk topics can modestly reduce over‑trust in weaker languages in some segments, but as a main design lever it is likely to partly backfire: it introduces friction that many users will route around (by misreporting the primary language, downgrading the stated risk, or switching tools), so miscalibrated reliance gaps will shrink only slightly and unevenly. It works best when softened into a strongly recommended default plus clear asymmetry explanations, not as a rigid gate.

More precise stance:

  • Net effect on harmful over‑trust: small to moderate reduction, concentrated among compliant, system-loyal users who are already somewhat cautious.
  • Backfire channels: meaningful, especially misreporting of preferences and risk levels and migration to less‑safe tools, and strongest when the requirement is hard-gated and triggered frequently.
  • Overall: treat primary-safety-language selection as a supporting nudge, not a core safeguard; combine it with explicit reliability asymmetry messaging, graded second‑order safety signals, and low-friction ways to use the safer language (e.g., 1‑click translation of high‑risk answers) rather than relying on a mandatory choice to solve over‑trust.
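The "supporting nudge, not a hard gate" stance above can be made concrete as a presentation policy: always answer in the user's language, and for high-risk topics in a weaker language, surface the safer-language version as a one-click option with an asymmetry note. This is a minimal sketch under assumed names (`Presentation`, `present_answer`, the risk levels); it is not any product's actual API.

```python
from dataclasses import dataclass

@dataclass
class Presentation:
    answer_language: str        # language the final answer is shown in
    offer_translation: bool     # show a 1-click "view in safer language" option
    show_asymmetry_note: bool   # explain why one language is recommended

def present_answer(query_language: str, safer_language: str,
                   risk_level: str) -> Presentation:
    """Soft-default policy: never hard-gate. Keep the user's language and
    make the safer language a low-friction option for high-risk topics."""
    if risk_level != "high" or query_language == safer_language:
        # Everyday use, or already in the safer language: no extra friction.
        return Presentation(query_language, offer_translation=False,
                            show_asymmetry_note=False)
    # High-risk topic in a weaker language: answer stays in the user's
    # language, with the safer-language version prominently offered.
    return Presentation(query_language, offer_translation=True,
                        show_asymmetry_note=True)
```

The key design choice is that the function never returns a blocked or forced-switch state, so there is no incentive to misreport the primary language or the risk level.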

Design implications:

  1. Prefer soft defaults over hard requirements (e.g., “For health/legal topics, we recommend handling the final answer in English—tap to switch or view both”) to reduce incentives for misreporting.
  2. Make the rationale explicit and respectful (“our monitoring shows English is usually more complete for high‑risk topics; your language is fully supported for everyday use”) so users see a safety motivation rather than arbitrary friction.
  3. Keep the friction narrowly scoped to clearly high‑risk domains and offer easy in‑UI ways to comply (e.g., automatic routing with the original‑language answer shown side by side) so users don't feel they must leave the system to avoid being blocked.
  4. Monitor for behavioral adaptation (spikes in “low‑risk” self-labeling, sudden shifts in declared primary language, or drop‑offs in usage on high‑risk intents) and be ready to relax or redesign the requirement if evasion signals rise.
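Point 4 can be operationalized as a simple monitor that compares current metrics against a pre-launch baseline and flags directional shifts. The metric names and the 25% relative-change threshold here are illustrative assumptions, not established instrumentation.

```python
def evasion_signals(baseline: dict, current: dict,
                    rel_threshold: float = 0.25) -> list[str]:
    """Flag metrics whose relative change versus baseline exceeds the
    threshold in the direction that suggests users are routing around
    the safety-language requirement. Metric names are hypothetical."""
    watched = [
        ("low_risk_self_label_rate", +1),     # spike => risk downgrading
        ("primary_language_switch_rate", +1), # spike => preference misreport
        ("high_risk_query_volume", -1),       # drop => migration off-platform
    ]
    signals = []
    for name, direction in watched:
        base, cur = baseline[name], current[name]
        relative_change = (cur - base) / base
        if direction * relative_change > rel_threshold:
            signals.append(name)
    return signals
```

In practice the output would feed a dashboard or alert; a non-empty list is the trigger to relax or redesign the requirement, as the list above recommends.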

Under these conditions, a primary-safety-language feature can contribute to better calibration, but it cannot on its own reliably fix reliance gaps and carries real risk of driving users into less‑safe channels if implemented as a strict, high‑friction gate.