Findings & New Explanations

What's new in this research program

Practitioner highlight: Competing Routes (a free logprob signal) + calibrated strategy + abstention yields +12 to +21 pp on 5/5 tested cells. On SimpleQA, the pipeline lifts Qwen3-235B past GPT-4o and GPT-4.1. Calibration cost ~$1.50 per model-benchmark pair.

The papers contribute three kinds of material: empirical discoveries (things observed for the first time), new explanations for previously-known phenomena (framework-level mechanism instead of folk-psychological labels), and methodological innovations that make the empirical work tractable.

New empirical discoveries

Three orthogonal friction dimensions, recovered cross-architecturally
Principal component analysis (PCA) across 15 LLM architectures recovers three orthogonal axes of friction that explain the bulk of the variance: magnitude, distribution, and rhythm. PC1 is cross-architecturally invariant at Spearman ρ = 0.95. The same three-axis decomposition emerges across dense transformers, mixture-of-experts models, state space models, and base-versus-instruct pairs, suggesting that within the tested architectures the decomposition reflects race architecture rather than any specific implementation. Additional axes, if present, are below the noise threshold of the current panel.
Surprise-attention coupling in transformer LLMs
Per-token Spearman ρ = +0.17 (p < 0.0001) between token surprise and downstream attention saliency, measured on Qwen2.5-0.5B-Instruct over 866 generated tokens. Top-quartile-surprise tokens receive 1.34× more attention than bottom-quartile. This is the mechanistic homologue of hippocampal surprise-driven replay — now measurable in an artificial substrate. Function-word controls rule out trivial confounds.
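The two statistics quoted above (a rank correlation and a top-versus-bottom quartile ratio) can be sketched in a few lines. This is a minimal illustration, not the measurement code behind the reported numbers: it assumes surprise is a per-token scalar such as the negative logprob, that attention saliency is a per-token scalar already extracted from the model, and the toy values are invented.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def quartile_ratio(surprise, attention):
    """Mean attention on top-quartile-surprise tokens over bottom-quartile."""
    order = sorted(range(len(surprise)), key=lambda i: surprise[i])
    q = len(order) // 4
    bottom = mean(attention[i] for i in order[:q])
    top = mean(attention[i] for i in order[-q:])
    return top / bottom

# Invented toy data: one surprise and one saliency value per generated token.
surprise = [0.1, 0.2, 0.4, 0.5, 1.0, 1.5, 2.5, 3.0]
attention = [1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0]

rho = spearman(surprise, attention)
ratio = quartile_ratio(surprise, attention)
print(f"rho={rho:.3f} ratio={ratio:.2f}")
```

On real data the same two functions would be applied to the 866 per-token pairs; a significance test (the p-value quoted above) would need an additional permutation test or a statistics library.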
Encoding-through-loading: cloze saturates, application scales
Two task types on the same knowledge base differentiate by capacity: cloze retrieval saturates by 8B parameters, but application (chaining facts into a derivation) scales monotonically across roughly two orders of magnitude in parameter count (0.5B→70B, a factor of 140). Spearman ρ = +1.000 on the Qwen2.5 ladder. The bottleneck migrates: at 0.5B retrieval fails; at 14B retrieval is saturated and 36% of failures show "retrieval succeeds, derivation fails", the friction-ceiling signature at the encoding level.
Source: Paper 2
The friction ceiling
A principled boundary on any friction-based method: friction measures the cost of computation, not its correctness. The same friction profile is consistent with confidently-right and confidently-wrong outcomes (n=47 vs. n=52 on Cogito-671B; mean CR 2.249 vs. 2.255 — statistically indistinguishable, the 0.006 difference is well within noise). This places a structural cap on what any logprob-based correction signal can achieve.
Format-violation reactance
A substrate-level reactance demonstration: a model can comply with a system instruction 100% of the time at the surface level while paying a high per-token CR cost for internally rejecting the demonstrated format. The result is an accuracy collapse from 70% to 48% on Llama-3.3-70B (n = 50 per condition, three conditions, 150 responses in total). The model "obeys but resists": the resistance is visible in the friction signal even when behavioural compliance is total.
Source: Paper 4b (forthcoming); referenced in Paper 1 §2.4
RLHF as measurable friction-suppression
Reinforcement Learning from Human Feedback (RLHF) measurably reduces base-model friction signals. The instruct-model standard deviation (σ = 0.36) sits at the null floor (set Z baseline, σ = 0.37), which means effectively zero variation in a friction signal that base models exhibit at σ = 1.11 on the same questions. RLHF thus compresses roughly two-thirds of the dynamic range (σ falls from 1.11 to 0.36, a 68% reduction). This has direct implications for downstream uses of logprob signals.
1/e secretary-problem optimum recovered in base models
The classical optimal-stopping point (~36.8% of the evaluation window) is recovered in LLM base models without being designed in. Qwen2.5-32B base sits at 39.3% — just above 1/e. Larger base models converge on 1/e through information-theoretic optimisation. RLHF pushes instruct models past 1/e (+9.4 percentage points), suggesting human feedback selects for over-deliberation relative to the substrate-optimal point.
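For readers unfamiliar with the 1/e benchmark, the classical secretary-problem optimum can be computed exactly. The sketch below covers only the textbook result (observe and reject the first r of n candidates, then take the first one better than all seen so far); how the papers map an LLM's commit position onto an "evaluation window" is not specified here and is not modelled. The function names are illustrative.

```python
import math

def success_probability(n, r):
    """Exact probability of selecting the best of n candidates when the
    first r are rejected and the next record-setter is accepted."""
    if r == 0:
        return 1.0 / n  # no observation phase: the first candidate is taken
    return (r / n) * sum(1.0 / k for k in range(r, n))

def optimal_cutoff(n):
    """Cutoff r that maximises the success probability."""
    return max(range(n), key=lambda r: success_probability(n, r))

n = 1000
r_star = optimal_cutoff(n)
print(f"optimal cutoff fraction: {r_star / n:.3f}")
print(f"1/e:                     {1 / math.e:.3f}")
print(f"success probability:     {success_probability(n, r_star):.3f}")
```

Both the optimal cutoff fraction and the resulting success probability approach 1/e ≈ 0.368 as n grows, which is the reference point against which the commit positions above are compared.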
Friction-guided inference: +12 to +21 pp combined pipeline
A free signal from logprobs (CR, available via OpenAI-compatible APIs) calibrates correction strategies and abstention decisions. The strategy pipeline alone produces +7.7 to +20.8 pp on four of five tested cells; adding CR-guided 20% abstention, on the four cells where both were measured, brings the combined pipeline to +12 to +21 pp. Tested across two dense transformers, one mixture-of-experts model, and one Liquid Neural Network on MATH-500, SimpleQA, MMLU-Pro, and GPQA Diamond. On SimpleQA, the combined pipeline lifts Qwen3-235B past GPT-4o and GPT-4.1. Calibration costs ~$1.50 per cell. The approach is architecture-agnostic.
Source: Paper 3

New explanations for known phenomena

Loss aversion as mortality-effect
Why does Kahneman-Tversky's λ ≈ 2× loss aversion exist in biology? The framework predicts: it is a substrate-effect of asymmetric resource recovery under mortality. Biological substrates can die from committing too late; artificial substrates cannot. The prediction is testable: LLMs (no mortality) should commit later than 1/e, not earlier — the opposite direction from loss aversion. Confirmed: LLMs commit at 43-48%, biological systems with mortality commit before 36.8%. Loss aversion is therefore not a cognitive error but a substrate-rational response to mortality risk.
Hysteresis is the precondition for learning, not an error
Hysteresis — path-dependent state retention — has long been treated as an error or side-effect to be minimised. The framework reframes it: hysteresis is the structural precondition for learning in any bounded probabilistic substrate. In a substrate that bears no trace of its own history, learning does not occur. Path-dependent state is what makes learning structurally possible. This applies equally to neural networks, biological brains, and physical systems with memory.
Cognitive biases as thermodynamic necessities
Anchoring, confirmation bias, sunk cost, status quo bias, framing effects, insufficient adjustment — classical cognitive biases — are reframed as necessary consequences of race architecture under bounded resources, not as errors of reasoning. An "Econ" (perfectly rational agent without bias) is thermodynamically forbidden in any physically realisable substrate. Each bias maps to a specific structural feature: anchoring to first-route advantage; confirmation bias to exit-cost-of-uncommitting; sunk cost to irreversible token investment.
Source: Paper 1 §7b
Surprise and reactance are both present in LLMs
Both are measured directly in the substrate: prediction-error surprise (high-surprise tokens draw 1.34× more downstream attention, ρ = +0.17) and format-violation reactance (measurable per-token friction cost, with accuracy collapse 70→48% on Llama-3.3-70B). The still-open question is source attribution: whether the architecture has a gating layer that tags the same friction event as either surprise (high-trust source, update belief) or reactance (low-trust source, defend position). That source-attribution gating has not been observed in LLMs.
Catastrophic forgetting is signal-budget redistribution, not damage
Catastrophic forgetting in fine-tuned LLMs has been interpreted as substrate damage: the base model "loses" knowledge during adaptation. A reverse test (v13c, Paper 6 in preparation) falsifies this interpretation: removing the LoRA adapter restores base performance to 100% of baseline (a 179.5% recovery relative to the adapter-degraded state), with the base substrate intact. The mechanism is signal-budget redistribution: the adapter rebalances which routes win competition, but does not damage the underlying weights. This subsumes six previously-distinct phenomena under one mechanism: catastrophic forgetting, long-train mode collapse, dementia retrieval-failure, Bjork desirable difficulties, spaced repetition, and Bahrick's permastore retention plateau.
Source: Paper 1 §5.8.4 (theoretical foundation) · Paper 6 (forthcoming — v13c reverse-test mechanism)
A unifying vocabulary for bounded-commit dynamics
Seven apparently independent phenomena — qubit decoherence-window, Ohm's law / Drude electron transport, molecular kinetics, stochastic resonance, Margolus-Levitin saturation, encoding-friction in learning, Yerkes-Dodson arousal-performance — are conjectured to be organisable under one race-structural vocabulary, differing in substrate not in shape: a shared vocabulary, not a claim the substrates are identical. The recurring signature is a kernel-conditional inverted U on evaluation-to-commit rate: too low yields no information processing; too high yields noise-dominated commit; only the intermediate rate maximises information throughput. The Schwinger-Keldysh formalism admits a race-axiomatisation under three assumptions. This organises existing bounded-commit-dynamics literature as a lens, not as new physics. Falsification criterion: any system with R1+R2+R3 architecture lacking the inverted U.
Source: Paper 10
BFT's four fields as evolutionary derivatives of safety
Behavioural Friction Theory's four fields (Safety, Meaning, Ability, Effort) have been treated as a flat taxonomy. The substrate-universal framework provides a deeper account: the four fields emerge in biological substrates as the consequence of three additional constraints — mortality, mobility, metabolism — on top of the basal race architecture. Non-biological race substrates exhibit friction without fields. The fields are not arbitrary; they are what the safety field becomes under the three biological constraints.

Methodological innovations

Competing Routes (CR) as a free friction signal
CR is the count of high-probability alternative tokens at each generation step — available at zero cost from any OpenAI-compatible API via logprobs=True. It correlates negatively with answer accuracy across multiple models and decomposes into magnitude, distribution, and rhythm components. CR is the operational handle that makes the substrate-universal claim empirically testable on artificial substrates without expensive physical instrumentation.
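A minimal sketch of how such a count could be computed from per-step logprob alternatives. The probability threshold (`min_prob=0.05`), the exclusion of the chosen token from the count, and the data layout (a chosen token plus a dict of alternative tokens to logprobs per step) are assumptions made for illustration; the papers' exact operational definition may differ. With OpenAI-compatible chat APIs, per-step alternatives are typically requested with `logprobs=True` together with `top_logprobs=k`.

```python
import math

def competing_routes(step_top_logprobs, chosen_token, min_prob=0.05):
    """Count alternatives to the chosen token whose probability is at
    least min_prob (threshold is an illustrative assumption)."""
    return sum(
        1
        for token, logprob in step_top_logprobs.items()
        if token != chosen_token and math.exp(logprob) >= min_prob
    )

def mean_cr(steps, min_prob=0.05):
    """Average the per-step count over a whole generated response."""
    counts = [competing_routes(top, chosen, min_prob) for chosen, top in steps]
    return sum(counts) / len(counts)

# Invented toy response: (chosen token, {token: logprob}) for each step.
steps = [
    ("Paris", {"Paris": math.log(0.90), "Lyon": math.log(0.06), "Nice": math.log(0.01)}),
    ("is", {"is": math.log(0.50), "was": math.log(0.30), "lies": math.log(0.15)}),
]

print(mean_cr(steps))
```

In this toy response the first step has one competing route and the second has two, so the mean is 1.5. The magnitude, distribution, and rhythm components mentioned above would be further statistics over the per-step counts and are not reproduced here.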
Frontloaded in-context learning as fine-tuning substitute
For encoding-to-retrieval studies, fine-tuning is expensive (hours, dollars) and architecture-specific. Frontloaded ICL — presenting all training facts in the prompt followed by one question, with per-token logprobs captured on the answer — operationally substitutes for fine-tuning. It is fast (~5 seconds per inference vs. hours for FT), cheap (cents vs. dollars), unified across model families, and produces dense friction data (six statistics per inference). Validated on a single invented domain ("Zorbetik") that eliminates pretraining-prior confounds. Credit note: I came to this independently from frustration with fine-tuning turnaround; variations of frontloaded-context substitution had been used by others before me. Not claimed as original; documented here as one of the methods the empirical programme depends on.
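The prompt layout described above (all training facts first, then a single question) can be sketched as follows. The template wording, the function name, and the example facts are hypothetical: the source names the invented domain "Zorbetik" but does not give its facts or the exact prompt format, so everything inside the strings is a placeholder.

```python
def build_frontloaded_prompt(facts, question):
    """Place every training fact in the prompt, followed by one question.
    Per-token logprobs would then be captured on the model's answer."""
    fact_block = "\n".join(f"{i}. {fact}" for i, fact in enumerate(facts, start=1))
    return f"Facts:\n{fact_block}\n\nQuestion: {question}\nAnswer:"

# Hypothetical facts for an invented domain (placeholders, not from the papers).
facts = [
    "In Zorbetik, a 'trel' is worth three 'munds'.",
    "In Zorbetik, a 'mund' is worth four 'pavs'.",
]

# A cloze-style question retrieves one fact; an application-style question
# chains several facts into a derivation.
cloze_prompt = build_frontloaded_prompt(facts, "How many munds is a trel worth?")
application_prompt = build_frontloaded_prompt(facts, "How many pavs is a trel worth?")

print(application_prompt)
```

Because the facts live in the prompt rather than in the weights, the same builder works unchanged across model families, which is the property the method relies on.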
Source: Paper 2 §3
Calibrated abstention via friction signal
Confidence calibration via CR enables principled abstention: at population level, CR detects model uncertainty (AUC 0.53-0.68). When combined with strategy correction, CR-guided abstention adds +6.5 to +14.1 pp success-rate improvement at 20% abstention, at zero additional inference cost. The two mechanisms are complementary: strategy recovers commitment gaps; abstention prevents confident-wrong commits. They combine super-additively.
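A minimal sketch of CR-guided abstention at a fixed rate. It assumes each response has already been reduced to a single CR score and a correctness label, that higher CR signals higher uncertainty (consistent with CR correlating negatively with accuracy), and that "success rate" means accuracy on the questions actually answered. The records are invented toy data, not results from the papers.

```python
def abstain_by_cr(records, abstain_fraction=0.20):
    """Split (cr_score, is_correct) records into answered and abstained,
    abstaining on the highest-CR fraction."""
    ranked = sorted(records, key=lambda record: record[0])
    n_abstain = int(round(abstain_fraction * len(ranked)))
    n_keep = len(ranked) - n_abstain
    return ranked[:n_keep], ranked[n_keep:]

def accuracy(records):
    """Fraction of records labelled correct."""
    return sum(1 for _, is_correct in records if is_correct) / len(records)

# Invented toy data: (mean CR of the response, whether the answer was correct).
records = [
    (1.1, True), (1.3, True), (1.4, True), (1.6, False), (1.8, True),
    (2.0, True), (2.1, False), (2.3, True), (2.9, False), (3.4, False),
]

answered, abstained = abstain_by_cr(records, abstain_fraction=0.20)
print(f"baseline accuracy:       {accuracy(records):.2f}")
print(f"accuracy when answering: {accuracy(answered):.2f}")
print(f"abstained on:            {len(abstained)} of {len(records)}")
```

The gain from abstention depends entirely on how well CR separates right from wrong answers; at the AUC range quoted above the separation is modest, which is why the signal is paired with strategy correction rather than used alone.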
Source: Paper 3
Substrate-independent test design via base-vs-instruct pairing
RLHF training imposes a confound on every LLM observation: are we measuring substrate properties or training-induced behaviour? The methodology pairs base and instruct versions of the same architecture (Qwen2.5-32B; Cogito-671B base/instruct) to isolate substrate-level effects from RLHF-imposed regulation. Reactance, hysteresis, and three-friction-axes structure replicate across both base and instruct models — confirming substrate-level rather than training-level origin.