For bilingual users who can read English but prefer a low-resource language, does an interface that always hides chain-of-thought (CoT) in both languages but offers optional, clearly labeled “training views” of past flawed CoT examples in English (with short localized summaries) lead to better-calibrated trust (that is, less over-trust) and safer reliance than an interface that instead shows live English CoT plus low-resource summaries on safety-relevant queries?

cross-lingual-cot-trust

Answer

The interface that always hides live chain-of-thought (CoT) in both languages and instead offers optional, clearly labeled “training views” of past flawed CoT examples in English, with short localized summaries, is more likely to yield better-calibrated trust (less over-trust) and safer reliance for bilingual users than an interface that shows live English CoT plus low-resource summaries on safety-relevant queries. This holds provided the training views are clearly separated from live answers and used sparingly.

In particular:

  • Showing live English CoT on safety-relevant queries, even with only summaries in the low-resource language, tends to preserve or increase explanation-induced over-trust, because detailed CoT in any language functions as an authoritative-looking reliability cue (c1, c2, c56, c57, c106–c109 analogues; see also c56e364ee-… claims).
  • Hiding live CoT by default in both languages removes this strong visual/structural cue and is a robust baseline for reducing over-trust without harming typical non-expert task performance (e95c5b9b-…; db70646f-…; 5ff974b0-…).
  • Adding optional flawed-CoT “training views” (db70646f-…) can, when clearly labeled and separated from the current task, make model fallibility salient: users see that plausible-looking reasoning, even in English, can still be wrong, which supports safer, more cautious reliance.

Thus, for bilingual users who can read English but prefer a low-resource language, the “hidden live CoT + optional flawed-CoT training views” design generally supports better trust calibration and safer reliance than “live English CoT + low-resource summaries” on safety-relevant queries, especially in higher-stakes contexts.
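
To make the comparison concrete, the two designs can be sketched as display policies that decide which panels a user sees. This is only an illustrative sketch, not an implementation from any cited study: the names `Design`, `Panel`, and `panels_for`, the panel labels, and the language code `"yo"` in the usage note are all hypothetical. It also assumes that the answer itself is shown in the user's preferred language and that the live-CoT design shows no CoT on queries that are not safety-relevant, neither of which is specified above.

```python
from dataclasses import dataclass
from enum import Enum


class Design(Enum):
    # The two interface designs compared above (names are hypothetical).
    HIDDEN_COT_WITH_TRAINING_VIEWS = "hidden_cot_with_training_views"
    LIVE_ENGLISH_COT_WITH_SUMMARIES = "live_english_cot_with_summaries"


@dataclass(frozen=True)
class Panel:
    kind: str      # "answer", "live_cot", "cot_summary", "training_view", "training_summary"
    language: str  # "en" or the user's preferred low-resource language code
    label: str     # heading shown to the user


def panels_for(design, safety_relevant, preferred_lang, training_view_requested=False):
    """Return the panels to render for one query under the given design."""
    # Assumption: the answer is always shown in the preferred language.
    panels = [Panel("answer", preferred_lang, "Answer")]

    if design is Design.LIVE_ENGLISH_COT_WITH_SUMMARIES:
        # Live CoT appears only on safety-relevant queries (assumed hidden otherwise).
        if safety_relevant:
            panels.append(Panel("live_cot", "en", "Model reasoning"))
            panels.append(Panel("cot_summary", preferred_lang, "Reasoning summary"))
        return panels

    # Hidden-CoT design: live CoT is never shown, in either language.
    # Training views are opt-in and labeled as unrelated to the current query.
    if training_view_requested:
        panels.append(Panel(
            "training_view", "en",
            "Training example: past flawed reasoning (not about your question)",
        ))
        panels.append(Panel(
            "training_summary", preferred_lang,
            "Summary of the training example",
        ))
    return panels
```

For example, `panels_for(Design.HIDDEN_COT_WITH_TRAINING_VIEWS, True, "yo")` returns only the answer panel, while the same call with `training_view_requested=True` adds the labeled English training view and its localized summary. Under this sketch the hidden-CoT design never emits a `live_cot` panel, which is the property the argument above relies on.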