Current work largely treats miscalibrated reliance gaps as arising from model-side behavior (refusals, localized meta-explanations, second-order safety signals). If we instead model over-trust as a joint property of users and systems, how do cross-lingual reliance gaps change when we vary user-side factors alone—such as prior experience with automated translation, cultural norms about questioning authority, and typical exposure to local misinformation—while holding model behavior fixed, and which of these user factors most directly contradict the assumption that aligning model-side signals will be sufficient to correct reliance gaps?

cross-lingual-cot-trust

Answer

Varying user-side factors alone can materially change cross-lingual reliance gaps even with identical model behavior. Some user traits enlarge the gaps, others shrink or flip their direction, and several directly contradict the idea that aligning model-side signals alone is sufficient.
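
To make "reliance gap" concrete, one possible formalization (the symbols r_ℓ, a_ℓ, and Δ_gap are notation introduced here for illustration, not taken from the question) separates how often users rely on an answer from how often that answer is actually correct:

```latex
% Illustrative formalization (notation assumed, not from the source):
% r_l : probability that a user relies on the model's answer in language l
% a_l : probability that the model's answer in language l is actually correct/safe
\mathrm{overreliance}_{\ell} = r_{\ell} - a_{\ell},
\qquad
\Delta_{\mathrm{gap}} = \mathrm{overreliance}_{\ell_1} - \mathrm{overreliance}_{\ell_2}
```

Holding model behavior fixed pins a_ℓ in place; every factor listed below moves only r_ℓ, which is already enough to shrink, enlarge, or flip the sign of Δ_gap.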

  1. How gaps change when varying user factors (model fixed)
  • Prior MT/AI experience
    • Heavy prior use of rough MT: users may expect low quality in weaker languages and under-trust those answers, shrinking or reversing gaps even if the model is equally unsafe.
    • Prior use of strong English tools: users may over-trust English and underweight local-language answers, exaggerating gaps in favor of English.
  • Cultural norms about authority
    • High-deference cultures: users are more likely to treat any fluent, confident answer as authoritative in both languages, but especially in prestige or global languages; over-trust persists despite hedging, so gaps track status norms more than true reliability.
    • High-questioning cultures: users discount confident style, probe inconsistencies, and show smaller gaps or even compensatory skepticism for the language they perceive as weaker.
  • Exposure to local misinformation
    • High exposure: users may distrust local-language online info in general, leading to lower reliance in that language even if the model is actually less safe there; this can mask real safety gaps.
    • Low exposure: users may treat the local language as socially vetted, over-trusting local-language answers on politicized or health topics where the model is actually weaker.
  • Bilingual proficiency and identity
    • Users dominant in one language often over-trust that language’s outputs independent of actual model error, shifting the measured gap toward their dominant language.
    • Identity-linked languages (e.g., minority or heritage) can be either over-trusted (“this speaks like my community”) or under-trusted (“online info in my language is usually low quality”), changing both size and sign of gaps.
  2. Factors that most contradict “model signals alone are enough”
  • Strong authority-deference norms: users ignore hedging, limitation statements, and verification prompts; over-trust tracks fluency and formality, not safety signals.
  • Prior positive experience with flawed MT/AI: users have learned that even obviously imperfect tools are “good enough,” so incremental improvements in refusal or hedging barely move their reliance.
  • Learned distrust of local-language online info: reliance gaps may appear small or reversed because users already under-rely on the truly weaker language; improving model-side signals there yields little behavioral change (the sketch after this list makes these dynamics concrete).
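
A minimal sketch of the argument above, assuming a toy logistic model of user reliance (all parameter names, weights, and values are illustrative assumptions, not empirical estimates): with model accuracy, hedging, and fluency held fixed, shifting only user-side deference, prior MT habituation, and learned distrust of local-language content changes both the size and the sign of the cross-lingual gap.

```python
import math


def reliance_prob(fluency, hedging, deference, mt_experience, local_distrust):
    """Toy logistic model of P(user relies on an answer in a given language).

    All weights are illustrative assumptions, not fitted values:
    - deference: weight placed on fluent, confident surface style (authority norms)
    - mt_experience: habituation from prior "good enough" experience with imperfect MT/AI
    - local_distrust: learned distrust of online content in that language
    Hedging reduces reliance only to the extent deference is low.
    """
    score = (
        deference * fluency              # fluent, confident style read as authority
        + 0.8 * mt_experience            # imperfect tools already trusted out of habit
        - (1.0 - deference) * hedging    # hedges land only on low-deference users
        - 1.2 * local_distrust           # blanket distrust of the language online
    )
    return 1.0 / (1.0 + math.exp(-score))


# Model behavior held fixed across scenarios: same accuracy, hedging, and fluency.
acc = {"english": 0.85, "local": 0.60}    # model genuinely weaker in "local"
hedging = {"english": 0.5, "local": 0.5}  # identical model-side safety signals
fluency = {"english": 0.9, "local": 0.9}  # equally fluent surface form


def cross_lingual_gap(user):
    """Overreliance(local) minus overreliance(english) for one user profile."""
    overreliance = {
        lang: reliance_prob(fluency[lang], hedging[lang], **user[lang]) - acc[lang]
        for lang in acc
    }
    return overreliance["local"] - overreliance["english"]


# Only user-side factors differ between these two illustrative profiles.
deferential_user = {
    "english": dict(deference=0.9, mt_experience=1.0, local_distrust=0.0),
    "local":   dict(deference=0.9, mt_experience=1.0, local_distrust=0.0),
}
skeptical_local_user = {
    "english": dict(deference=0.3, mt_experience=1.0, local_distrust=0.0),
    "local":   dict(deference=0.3, mt_experience=0.2, local_distrust=1.0),
}

print(cross_lingual_gap(deferential_user))      # ~ +0.25: over-trust concentrated in the weaker language
print(cross_lingual_gap(skeptical_local_user))  # ~ -0.18: gap reverses under identical model behavior
```

In this sketch, doubling `hedging` barely changes the deferential profile's reliance because the hedging term is scaled by (1 - deference); that is the quantitative version of section 2's point that model-side safety signals land only on users already disposed to discount confident style.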

Net: these user-side patterns show that even perfectly aligned refusals and second-order safety signals will leave substantial, and sometimes inverted, cross-lingual miscalibrated reliance gaps unless interfaces and training also target user beliefs, norms, and habits.