Reasoning interventions and their regime-dependence

Which technique helps your model — and when the same technique backfires

The thing most prompt guides miss: a reasoning technique is not simply "good" or "bad." The same technique helps or hurts depending on where your model sits on a capability curve (an inverse-U; the high end is the classic expertise-reversal effect). A frame that lifts a mid-tier model is noise to a frontier model and can mislead it. This page catalogs techniques by what they do and when they help.

Two organising findings from the friction-theory series (Papers 1–4C):

On this page

1. Instruction frames — the active levers

Purpose-anticipation
What: tell the model what the facts are for before you give them.
Helps: mid-capacity models on multi-step tasks — it pre-allocates the right routes for the work ahead.
Backfires / does nothing: frontier models (already organised; a wrong purpose can mislead them).
Example: "Soon you'll chain these facts to compute a derived value. [facts] Now: [question]"
Premise-checking
What: a short instruction to verify the question's premise before answering, and to flag a false premise rather than play along.
Helps: mid-tier models (Llama-8B and 70B) — both on false-premise rejection and on hard multiple-choice.
Backfires: frontier models (gpt-4o: −4pp) — they already discriminate premises, so the instruction is interference.
Example: "If the question rests on a false premise, say so. Otherwise answer. ANSWER: ... CONFIDENCE: ..."
This is the flagship case in Paper 4C: it works whether installed by prompting or fine-tuning at mid scale, and reverses at the frontier.

2. Reasoning routines — match to the question

Chain-of-thought
What: ask the model to reason step by step before answering.
Helps: distraction questions and close-calls — working through the steps suppresses the spurious path.
Hurts: genuinely ambiguous questions — the extra reasoning manufactures false certainty where the honest answer is "it depends."
Self-critique
What: have the model check its own answer for errors in a second pass.
Helps: insufficient-information questions — it catches hallucinations and prompts an honest "I can't know this."
Hurts: close-calls — second-guessing topples an already-uncertain but correct choice.
Pre-mortem
What: before committing, ask "assume this answer is wrong — why might that be?"
Helps: recovering answers the model knows but doesn't surface on the first pass.
Cost: a high false-positive rate on already-correct answers — it can talk the model out of a right answer. Use it to rescue, not to double-check everything.
Verify
What: ask the model to verify its answer against the question's constraints.
Helps: substrate-specific — strong on some architectures, weak on others.
Does nothing: frontier models, where verification is already built into the first pass.

3. What backfires — anti-patterns

Verbose multi-section response templates
What: fine-tuning the model to always answer in a rigid multi-section format (answer + facts + verification + conclusion + confidence).
Backfires: at scale this collapses general capability — a Llama-70B fine-tune fell from 86% to 3% on a broad benchmark as the dose rose. A template that fights the model's natural output style gets pasted onto everything. (The effect on small models is still under test; don't generalise yet.)
Mismatched framing
What: a stated purpose that contradicts the actual task ("for a children's book…" on a technical question).
Backfires: larger models are harmed — they register the conflict and it disrupts the answer (a reactance-like effect). Smaller models simply ignore it. The bigger the model, the more it costs.
Frame-stacking
What: combining several framing instructions expecting their benefits to add up.
Backfires: they don't add — you pay the full length-and-complexity cost of every frame without the synergy. One well-chosen frame beats three stacked.
Over-abstention training
What: fine-tuning on "I don't know" examples to make the model more cautious.
Backfires: a small dose of abstention examples on their own can make the model abstain on almost everything. Calibration is not installed by teaching refusal in isolation.
Reshuffling data without an instruction
What: chunking, reordering, or re-presenting the same self-describing facts and expecting better reasoning.
Does nothing: on self-describing tasks the accuracy barely moves — instruction-tuned models already parse the structure. Direction comes from an instruction about what to do with the data, not from rearranging it.

4. Match the routine to your question

The reasoning routines above are not universally good; each fits a kind of question. A rough map:

Question typeUseAvoid
Distraction (irrelevant detail)Chain-of-thought
Close-call (near-equal options)Chain-of-thoughtSelf-critique (topples it)
Ambiguous (no single right answer)State the ambiguityChain-of-thought (false certainty)
Insufficient informationSelf-critique, premise-checking

The bigger picture

These are the treatment side: the interventions. The other half, diagnosing where a given model sits on the curve so you can pick the right-typed intervention, is the ongoing research program behind the friction-theory series. The practical takeaway you can use today: don't ask "is this technique good?" Ask "where is my model, and does this technique help there?"

Related pages

Source: friction-theory series, Papers 1–4C (Lund 2026; Paper 4C in preparation). Catalog compiled from the series' empirical findings.