In hypothesis-generation stages of physics projects that already use the AI grad student pattern for brainstorming mechanisms and toy models, which concrete combinations of AI roles and epistemic safeguards—for example, requiring AI to output contrastive “null” and “boring” hypotheses alongside each exciting one, or enforcing that every AI-suggested mechanism is accompanied by at least one falsifiable scaling-law prediction and a cheap discriminating test—most reliably increase the fraction of AI-proposed ideas that survive first-pass empirical or numerical checks without inflating researchers’ subjective confidence in any single story?

Answer

The most useful combinations are small, repeatable patterns that (i) force structure onto ideas and (ii) keep confidence low until an idea has survived cheap tests.

  1. Trio hypotheses + pre-tagged confidence bands
  • AI role: for each mechanism, emit:
    • H_exciting: interesting mechanism.
    • H_boring: conservative / null-like variant.
    • H_pathology: “it’s just an artefact / miscalibration / discretization” story.
  • Safeguards:
    • Require the same card format for all three: key assumptions, 1–2 testable predictions, obvious ways to fail.
    • Force the AI to assign coarse priors (e.g. {low, med, high} plausibility) and to keep H_exciting’s plausibility ≤ H_boring’s by default.
  • Effect: increases the survival rate by making the “boring” and “pathology” options explicit and testing them alongside the exciting one; caps confidence inflation by never letting the flashy story be the default.
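
A minimal sketch of what such a trio card might look like in code, assuming a plain-Python workflow; the class and field names (TrioCard, Variant, Plausibility) are illustrative, not an existing tool:

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Plausibility(IntEnum):
    LOW = 1
    MED = 2
    HIGH = 3


@dataclass
class Variant:
    """One of the three required readings of the same observation."""
    label: str                        # "exciting", "boring", or "pathology"
    mechanism: str                    # short statement of the proposed mechanism
    key_assumptions: List[str]
    testable_predictions: List[str]   # 1-2 concrete predictions
    failure_modes: List[str]          # obvious ways this variant could fail
    plausibility: Plausibility = Plausibility.LOW


@dataclass
class TrioCard:
    topic: str
    exciting: Variant
    boring: Variant
    pathology: Variant

    def __post_init__(self):
        # Safeguard: the flashy story may not start above the boring baseline.
        if self.exciting.plausibility > self.boring.plausibility:
            raise ValueError("H_exciting must not exceed H_boring's plausibility by default")
        # Safeguard: all three variants must fill the same card format.
        for v in (self.exciting, self.boring, self.pathology):
            if not (v.key_assumptions and v.testable_predictions and v.failure_modes):
                raise ValueError(f"variant '{v.label}' is missing required card fields")
```

Rejecting incomplete or mis-ordered cards at construction time keeps the safeguard mechanical rather than a matter of reviewer discipline.
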
  2. Scaling-law prediction + cheap discriminator per mechanism
  • AI role: for each candidate mechanism, produce:
    • One simple scaling-law prediction (e.g. observable ∝ L^α, T^β, Re^γ) valid in a clearly stated regime.
    • One low-cost discriminating check (small sim, back-of-envelope, existing dataset slice).
  • Safeguards:
    • An idea is not logged as a “candidate mechanism” until both fields are filled.
    • Teams track a simple base rate: the fraction of mechanisms whose scaling prediction and cheap test survive.
  • Effect: cuts down on vague stories; increases the fraction of ideas that clear first checks; base-rate tracking discourages overconfidence in any single idea that happens to pass.
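
One way to make the “no scaling law, no entry” rule and the base-rate tracking concrete is a small in-memory log; this is a sketch under assumed names (MechanismEntry, MechanismLog), not a specific package:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MechanismEntry:
    name: str
    scaling_law: str            # e.g. "correlation length xi ~ L^0.5 for L >> a"
    valid_regime: str           # regime in which the scaling is claimed to hold
    cheap_test: str             # small sim / back-of-envelope / dataset slice
    survived_first_pass: Optional[bool] = None   # filled in after the cheap test runs


class MechanismLog:
    def __init__(self):
        self.entries: List[MechanismEntry] = []

    def add(self, entry: MechanismEntry) -> None:
        # Safeguard: refuse to log a "candidate mechanism" without the two required fields.
        if not entry.scaling_law.strip() or not entry.cheap_test.strip():
            raise ValueError("scaling-law prediction and cheap discriminating test are required")
        self.entries.append(entry)

    def survival_base_rate(self) -> Optional[float]:
        # Fraction of tested mechanisms whose scaling + cheap test survived.
        tested = [e for e in self.entries if e.survived_first_pass is not None]
        if not tested:
            return None
        return sum(e.survived_first_pass for e in tested) / len(tested)
```

The base rate is deliberately coarse: it only answers how often AI-shaped mechanisms survive the first cheap check, which is the number that keeps expectations calibrated.
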
  3. AI idea generator + AI constraint-enforcer split
  • AI-1 role (generator): propose mechanisms, toy models, and associated scaling predictions.
  • AI-2 role (enforcer): given only high-level goals and domain constraints, try to refute or trivialize AI-1’s ideas by:
    • Matching them to known boring baselines or known failure modes.
    • Checking unit consistency, limits, and obvious literature contradictions.
  • Safeguards:
    • The human only sees ideas that pass the minimal consistency checks and are still non-boring.
    • Logs must show both passes and refutations.
  • Effect: increases quality of surviving ideas; visible refutation history reminds humans that many plausible ideas failed, tempering confidence.
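
The generator/enforcer split can be wired as a simple two-pass loop. The sketch below assumes two opaque callables, generate_ideas and critique_idea, standing in for separate model calls; both names and the verdict fields are placeholders, not a specific API:

```python
from typing import Callable, Dict, List


def screen_ideas(
    goals: str,
    constraints: str,
    generate_ideas: Callable[[str], List[Dict]],
    critique_idea: Callable[[Dict, str], Dict],
    n_ideas: int = 10,
) -> Dict[str, List[Dict]]:
    """Run AI-1 (generator), then AI-2 (enforcer); keep the full refutation log."""
    passed, refuted = [], []
    for idea in generate_ideas(goals)[:n_ideas]:
        # AI-2 sees only the goals/constraints plus one idea, and tries to match it
        # to boring baselines, check units and limits, and find literature contradictions.
        verdict = critique_idea(idea, constraints)
        if verdict.get("consistent") and not verdict.get("reduces_to_boring_baseline"):
            passed.append({"idea": idea, "verdict": verdict})
        else:
            refuted.append({"idea": idea, "verdict": verdict})
    # Humans review `passed`, but the refutation history stays visible alongside it.
    return {"passed": passed, "refuted": refuted}
```
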
  4. Hypothesis cards tied to mandatory “disconfirmation budget”
  • AI role: maintain short hypothesis cards with:
    • Mechanism summary.
    • 1–3 discriminating observables / simulation signatures.
    • 2–3 cheap ways the mechanism could be wrong.
  • Safeguards:
    • For every new mechanism adopted for follow-up, allocate a small fixed budget of time/compute to targeted disconfirmation tests, planned with AI help.
    • No mechanism is promoted to “working story” until at least one disconfirming test has actually run.
  • Effect: raises fraction of ideas that survive by forcing early culling; keeps subjective confidence bounded by tying status upgrades to explicit failed attempts to kill the idea.
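
A possible encoding of the disconfirmation budget and the promotion rule, with illustrative field names and hours as an assumed budget unit:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DisconfirmationTest:
    description: str
    cost_hours: float
    ran: bool = False
    killed_mechanism: bool = False


@dataclass
class HypothesisCard:
    mechanism_summary: str
    discriminating_observables: List[str]
    planned_tests: List[DisconfirmationTest] = field(default_factory=list)
    budget_hours: float = 8.0          # small, fixed per-mechanism budget
    status: str = "candidate"          # "candidate" -> "working story" or "killed"

    def record_test(self, test: DisconfirmationTest) -> None:
        # Safeguard: targeted disconfirmation stays within the fixed budget.
        spent = sum(t.cost_hours for t in self.planned_tests if t.ran)
        if spent + test.cost_hours > self.budget_hours:
            raise ValueError("disconfirmation budget exceeded; re-scope the test")
        test.ran = True
        self.planned_tests.append(test)
        if test.killed_mechanism:
            self.status = "killed"

    def promote(self) -> None:
        # Safeguard: no promotion until at least one disconfirming test has actually run.
        ran_any = any(t.ran for t in self.planned_tests)
        if self.status == "killed" or not ran_any:
            raise ValueError("cannot promote: run a targeted disconfirmation test first")
        self.status = "working story"
```

Having promotion fail loudly when no kill attempt has run is the point: the status upgrade is tied to evidence of an attempted refutation, not to how compelling the story sounds.
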
  5. Literature-anchored idea screens
  • AI role: for each mechanism, auto-generate a mini-contrast set:
    • Closest standard mechanism in the literature.
    • 2–3 papers that most strongly support / contradict it.
  • Safeguards:
    • An idea is treated as “speculative” until at least one cited tension or support claim is manually checked (the AI must surface raw equations/plots, not summaries).
  • Effect: filters out ideas already disfavored in data; visible conflicts prevent overconfidence based only on coherent-sounding stories.
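
For the literature screen, the “speculative until manually checked” rule can also live on the card itself; the sketch below uses hypothetical field names and placeholder paper identifiers rather than real references:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CitedClaim:
    paper_id: str              # e.g. an arXiv identifier supplied by the AI
    relation: str              # "supports" or "contradicts"
    claim: str                 # the specific equation/plot/result being cited
    human_checked: bool = False


@dataclass
class ContrastSet:
    mechanism: str
    closest_standard_mechanism: str
    cited_claims: List[CitedClaim] = field(default_factory=list)

    @property
    def status(self) -> str:
        # The idea stays "speculative" until a human has verified at least one claim
        # against the raw source, not an AI summary.
        checked = any(c.human_checked for c in self.cited_claims)
        return "screened" if checked else "speculative"
```
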

Overall: these patterns are most useful when ideas are cheap to generate but simulations and experiments are moderately costly, and when teams track simple base rates for how often AI-shaped ideas survive early checks.