For users who are bilingual but have asymmetric proficiency (e.g., fluent in a weaker model language and only moderately comfortable in the stronger model language), does combining cross-lingual consistency constraints with graded second-order safety signals tied to user-reported proficiency (stronger nudges toward expert review when the user is less able to read the safer language) reduce harmful over-trust in the weaker language more than a one-size-fits-all signaling policy, without increasing perceptions of unfair or paternalistic treatment?

cross-lingual-cot-trust

Answer

Yes. For bilingual users with asymmetric proficiency, combining cross-lingual consistency constraints with graded second-order safety signals that depend on user-reported proficiency is likely to reduce harmful over-trust in the weaker language more effectively than a one-size-fits-all signaling policy. This holds provided that (i) the grading is clearly framed as risk- and proficiency-based rather than identity-based, and (ii) the stronger nudges toward expert review are concentrated in high-risk domains and expressed in a respectful, autonomy-preserving way.

Mechanism and comparison:

  • Cross-lingual consistency constraints (anchored to the safer behavior) already reduce confusion and perceived implementation-dependence while limiting harmful leakage (fd48dfad…, 36d2cb49…). On their own, however, they can increase over-trust in the weaker language by making its behavior and justifications look more principled than its true reliability warrants (fd48dfad…).
  • Graded second-order safety signals extend asymmetric amplification (f167006f…, de90b065…) by conditioning the strength and frequency of uncertainty cues and expert-review prompts on both topical risk and user-reported proficiency in the safer language. Users who are less able to read the safer language receive stronger nudges to seek human or external verification when they stay in the weaker channel, especially on high‑risk topics.
  • Compared with a one-size-fits-all signaling policy, this design better targets the users who are most at risk of harmful over-trust—those who both (a) rely heavily on the weaker language and (b) cannot easily use the safer language as a fallback—while avoiding unnecessary friction for users who can actually exploit the safer channel.
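The graded policy described above can be sketched as a small decision rule. The risk tiers, the self-reported proficiency scale, the 0.5 threshold, and the nudge levels below are illustrative assumptions, not parameters from any deployed or evaluated system:

```python
# Illustrative sketch of a graded second-order signaling policy.
# Topic set, proficiency scale (0.0-1.0, self-reported), threshold,
# and nudge levels are hypothetical placeholders, not calibrated values.

HIGH_RISK_TOPICS = {"health", "law", "self_harm", "finance"}

def nudge_level(topic: str, safer_lang_proficiency: float) -> str:
    """Return the strength of the expert-review nudge.

    Stronger nudges target users who are both (a) on a high-risk topic
    and (b) unable to easily fall back to the safer language.
    """
    high_risk = topic in HIGH_RISK_TOPICS
    low_proficiency = safer_lang_proficiency < 0.5  # assumed cutoff

    if high_risk and low_proficiency:
        return "strong"    # prominent uncertainty cue + expert-review prompt
    if high_risk:
        return "moderate"  # uncertainty cue + optional safer-language fallback
    return "light"         # low-risk queries keep minimal friction

def uniform_nudge_level(topic: str, safer_lang_proficiency: float) -> str:
    """One-size-fits-all baseline: same signal regardless of user or topic."""
    return "moderate"
```

Under this sketch, a user with low safer-language proficiency asking a health question receives a strong nudge, while the same user's everyday queries stay low-friction; the uniform baseline cannot make that distinction.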

Effects on over-trust:

  • For users with low proficiency in the safer language, graded signals reduce over-trust in the weaker language more than uniform cues, by:
    • making limitations and the need for verification most salient precisely where users lack the ability to cross-check in the safer language;
    • avoiding the “borrowed authority” effect where consistent refusals and polished meta-explanations in the weaker language cause users to infer equal reliability (fd48dfad…, de90b065…).
  • When implemented in a risk-contingent way (stronger signals only in high-risk domains such as health, law, and self-harm), this does not typically produce large under-use of the weaker language for low-risk, everyday queries (f167006f…).

Fairness and paternalism perceptions:

  • Relative to one-size-fits-all signaling, tying nudge strength to self-reported proficiency plus topic risk can maintain or even improve perceived procedural fairness if the system:
    • explains that stronger warnings are triggered by task risk and available safety channels, not by the intrinsic worth of the user’s language (de90b065…, 5def44d2…);
    • offers clear options: continue in the weaker language, switch to the safer language if comfortable, or use both, while always recommending human experts for critical decisions (5def44d2…).
  • Perceptions of unfair or paternalistic treatment become a risk if:
    • the system appears to “punish” users for low proficiency by universally escalating friction across all topics, rather than focusing on high‑risk contexts;
    • messaging implies that users who report lower proficiency are less competent or less entitled to answers, instead of simply facing more stringent safety prompts.
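One way to operationalize the framing constraints above is to attach a risk-based trigger explanation and an explicit option set to every escalated nudge. The wording, field names, and option list in this sketch are illustrative, not a product specification:

```python
# Illustrative framing for an escalated nudge: explain the risk-based
# trigger and preserve user choice. Wording and fields are hypothetical.

def frame_nudge(topic: str, weaker_lang: str, safer_lang: str) -> dict:
    """Compose an escalated nudge that states why it was triggered and
    offers the continue / switch / use-both options."""
    return {
        "trigger_explanation": (
            f"This extra caution appears because '{topic}' is a high-risk "
            f"topic and verification resources are stronger in {safer_lang}. "
            "It is not a judgment about your language or your ability."
        ),
        "options": [
            f"Continue in {weaker_lang}",
            f"Switch to {safer_lang}",
            "Use both languages side by side",
        ],
        "expert_recommendation": (
            "For critical decisions, please also consult a qualified "
            "human expert."
        ),
    }
```

Keeping the explanation tied to task risk, and always listing the weaker-language option first, is one plausible way to avoid the "punishment for low proficiency" reading described above.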

Net assessment:

  • Under careful design—anchoring consistency to safer behavior, making safety signals risk- and proficiency-contingent, and framing them as supportive rather than restrictive—the combined approach:
    • outperforms one-size-fits-all signaling at reducing harmful over-trust in the weaker language for asymmetric-proficiency users; and
    • can do so without materially increasing perceptions of unfairness or paternalism, though these perceptions remain sensitive to framing and must be empirically monitored.
  • Poorly framed or globally over-strong grading (especially if detached from topic risk) could negate these benefits by triggering feelings of stigmatization or by inducing unnecessary under-use of the weaker language even for low-risk tasks, so iterative user testing in each linguistic context is essential.