Matched Friction Under Hysteresis: A Programmatic Proposal for a Measurement Schema of Learning Optima
Paper 6 (core) · Pødenphant Lund, T. (2026i) · Preprint · Live on Zenodo
A programmatic proposal, not a completed theoretical contribution. Nine traditions in learning theory and belief-revision research have independently characterised inverted-U curves under different vocabularies. This paper sketches a candidate substrate-level vocabulary — matched friction under hysteresis — under which these traditions might be organised, and specifies the measurement programme such a vocabulary would require to become a falsifiable theory. The contribution is a research direction with stated commitments and known gaps, not a mature unification.
| DOI (concept) | 10.5281/zenodo.20059863 |
| Status | Preprint live as v1, 2026-05-26 |
| Cite-letter | 2026i |
| Author | Tomas Pødenphant Lund [ORCID] |
TL;DR
Learning research has independently discovered the same curve for more than a century. Yerkes–Dodson arousal, Vygotsky's zone of proximal development, Kalyuga's expertise-reversal effect, Bjork's desirable difficulties and spacing effect, Bengio's curriculum learning, the testing effect, Shannon–Berger rate-distortion, and Brehm–Festinger reactance/cognitive dissonance all describe inverted-U relationships between an input-intensity variable and a performance variable. Each tradition has its own specialists, its own textbooks, its own empirical paradigms; none is currently classified as a variant of any other.
This paper proposes a candidate substrate-level vocabulary for organising the nine. The schema:
L ∝ 𝟙[|d − m| > c] · coherence(d, m) · r
where d and m are data and model friction structures, c is substrate coercivity (the minimum-mismatch threshold for state-change), and r is per-event remanence (the fraction of a hysteresis crossing that persists). Three regimes follow from the formal structure: match (|d−m| < c) yields reception-maximum but zero learning; transitional (c < |d−m| < χ) yields the productive learning band; overload (|d−m| > χ) collapses coherence and degrades learning, where χ is the flow-breakdown threshold.
Two limitations are central to how this expression should be read. First, the coherence function is not yet specified independently of outcomes; the expression therefore functions as an organising vocabulary for substrate measurements, not as a fully-specified predictive equation. Second, the three-parameter orthogonality (coercivity × remanence × coherence as independently extractable) has not yet been demonstrated on any single substrate — this is the central live falsifier the schema commits to.
Three-grade stratification of the nine traditions. The paper distinguishes:
- Grade (a) — re-expression: the tradition's empirical curve is expressed in the schema's vocabulary, demonstrating parsimony without novel empirical content. Yerkes–Dodson, Vygotsky ZPD, Kalyuga, curriculum learning, testing effect (5 traditions).
- Grade (b) — candidate derivation with surplus content: the schema generates a quantitative prediction the native literature does not state. Bjork's spacing effect falls out of the recovery-time function τ(I) as a derived temporal shadow rather than an independent regularity (the strongest grade-(b) claim); desirable difficulties carries a derived prediction about the descending limb from the flow-breakdown threshold χ (2 traditions).
- Grade (c) — speculative extension: schema applied where the empirical inverted-U is not yet established on the relevant substrate. Rate-distortion (c → 0 limit is structurally consistent with but does not term-by-term recover R(D)); reactance/dissonance (biological inverted-U over encoding depth is framework prediction, not result) (2 traditions).
Empirical status. Empirical work to date is preliminary computational probing rather than empirical validation. The companion empirical paper's parabel-ensemble (Paper 4, Pødenphant Lund 2026g, in preparation) reports direction-consistent inverted-U patterns on four input-intensity axes of the Qwen2.5 family, with chunking-density additionally replicated on DeepSeek-V3 and Llama-3.1-70B. A fifth tested axis (violation magnitude) produced a null. The within-event / between-event consolidation-layer distinction was developed in response to that null and is treated as a post-hoc framework refinement consistent with the null, not as an independent prediction the null confirmed.
What this paper is: a programmatic statement of a candidate schema, a stratified accounting of its empirical readiness, an explicit list of the falsifiers it commits to, and a candidate operationalisation of its central term (coherence) on one substrate kind (silicon).
What this paper is not: a confirmed unifying theory, a validated three-parameter reduction, a demonstration that the schema's central predictions hold, or a replacement for the existing native-vocabulary treatments of any of the nine traditions. Readers should expect a research direction worth pursuing, not a result.
The three-way trade-off
The schema rests on three independently-attested principles from physical theory, recombined to express learning dynamics:
- Reception (matched-filter; Woodward 1953): signal-to-noise ratio is maximised when receiver response matches signal structure. In the framework: maximum information transfers when data friction structure d(t) matches model capacity structure m(t). But perfect match produces no residual, no state-change, no hysteresis crossing. Perfect reception is the wrong regime for learning.
- Learning (hysteresis; Stoner–Wohlfarth 1948): state-change in bounded substrates is hysteretic. Updates happen only when input exceeds a coercivity threshold c, below which the substrate returns to its prior state once input is withdrawn. The framework's claim: |d − m| > c is the necessary condition for any encoding to occur.
- Flow: when |d − m| far exceeds coercivity, no coherent resolution of the mismatch occurs. Gradient updates become diffuse, competing routes fail to commit, computational pathways break down. In the race architecture of bounded-rational computation, this is the regime where parallel evaluation does not produce accumulation.
The three principles compose multiplicatively. The optimum for total learning lies in the transitional band between sub-coercivity and flow-breakdown — once c, r, χ, and the coherence function are specified for a given substrate. The unspecified-coherence weakness is acknowledged explicitly as a live theoretical limitation, not minimised.
The nine traditions, mapped under the schema
- Yerkes–Dodson (arousal × performance): re-expression. The substrate-arousal variable maps to |d − m|; the inverted-U follows from the three-regime structure of §2.4.
- Vygotsky's ZPD (task difficulty × learnability): re-expression. The ZPD is the substrate-relative transitional band c < |d−m| < χ.
- Kalyuga's expertise reversal (scaffolding × learning rate): re-expression. Capable substrates have m already aligned with the task; additional scaffolding pushes |d−m| below c and into the no-learning match regime.
- Bjork's desirable difficulties (retrieval effort × retention strength): candidate derivation. The descending limb is predicted from the flow-breakdown threshold χ.
- Bjork's spacing effect (re-exposure interval × retention): candidate derivation — the schema's strongest grade-(b) claim. Spacing falls out of the recovery-time function τ(I) as a derived temporal shadow of within-event coercivity-crossing dynamics. Light material recovers in minutes; deep material requires overnight; overloaded material takes days. One capacity threshold; three behavioural consequences (within-session saturation, post-session recovery, optimal inter-session spacing) that classical learning theory has treated as separate laws.
- Bengio's curriculum learning (example-ordering × training efficiency): re-expression. Curriculum schedules m to track d through the transitional band as m updates.
- Roediger's testing effect (retrieval practice × long-term retention): re-expression. Retrieval is a within-event reactivation that drives a coercivity-crossing the passive re-reading does not.
- Shannon–Berger rate-distortion (bit-rate × encoding fidelity): speculative extension. The c → 0 limit is structurally consistent with but does not term-by-term recover the R(D) function.
- Brehm–Festinger reactance / cognitive dissonance (encoding depth × resistance to belief-revision): speculative extension. The biological inverted-U over encoding depth is a framework prediction, not a documented empirical regularity on biological substrates; the present paper demonstrates a silicon anchor only.
Predictions the unification generates
The schema generates three classes of empirical predictions unavailable to any single tradition:
- Substrate-coercivity threshold — a measurable LR-kneepoint cliff sharpness σ at the productive/no-learning boundary, per substrate architecture.
- Substrate-remanence per event — the post-event logprob margin between top-1 and top-2 candidate tokens at the answer position, as a per-event substrate constant.
- Substrate-rhythm bound — a Nyquist-derived bound on rhythm-encodable content per substrate, derived from the substrate's intrinsic relaxation time.
The most ambitious claim is the structural reduction in §4: combining the learning-cliff kinetics with the recovery-time function τ(I) yields a candidate derivation of Bjork's spacing effect as a derived temporal shadow of the same capacity constraint that produces the cliff. The candidate reduction is not yet experimentally confirmed; the schema commits to this as a falsifiable prediction.
Frozen-proxy protocol
The schema's commitment, stated once for the whole paper: for each substrate, the proxies for d, m, c, r, coherence, and χ must be fixed before the outcome data are inspected, and the schema's claim is held against those frozen proxies regardless of post-hoc reinterpretation.
On the silicon substrate this paper engages with, the proxies are frozen as: |d−m| → per-token CR over the comparison-token window; c → the LR-kneepoint cliff sharpness σ at the productive/no-learning boundary; r → post-event logprob margin between top-1 and top-2 candidate tokens at the answer position; coherence → the structured-vs-random difficulty contrast at matched substrate state; χ → the LR-kneepoint cliff at the productive/overload boundary.
A failure to recover the predicted topology under these proxies cannot be rescued by substituting a different proxy after the fact. This is the protocol that defeats the tautology risk identified by hostile readers (a schema with endogenous variables and per-substrate proxies can become unfalsifiable if proxies are chosen after outcomes are observed).
Position relative to allied frameworks
- Paper 10 (race architecture) derives the inverted-U topology directly from the R1+R2+R3 race-axioms across substrate scales without invoking the matched-friction formula. Paper 10 supplies the architecture-forced topology; this paper supplies the substrate-parameter formula and the nine-phenomenon mapping.
- Wallace's biocognition rate-distortion programme (Wallace, Leonova & Gochhait 2022; Wallace 2025, 2023) derives Yerkes–Dodson directly from rate-distortion theory on biological substrates. The two programmes are layered, not redundant: Wallace supplies the rate-distortion derivation in biology-cognition; this paper supplies the substrate-parameter expression with three independently-measurable substrate-parameters and the nine-phenomenon parameter-mapping with within-event vs consolidation-dependent layer-stratification.
- Paper 1 (FT) operationalises friction as competing routes (CR); 6 borrows the CR operationalisation as the silicon-substrate proxy for |d−m|.
- Paper 2 (Capacity scaling) empirically tests the cloze-vs-application asymmetry; 6's substrate-graded predictions extend the capacity-substrate story to the matched-friction regime.
- Paper 4B (Substrates encode experience) tests the same encoding-through-loading mechanism on the in-context substrate; 6 supplies the broader vocabulary within which 4B's findings are read as the inference-time analogue of the schema's within-event regime.
The Paper 6 family (companions in preparation)
The original Paper 6 was restructured in May 2026 (approved 2026-05-16) into a core paper plus three companions. The core is what you are reading. The three companions develop specific implications that the core treats only briefly:
- Paper 6B (in preparation): The Effort-Value Family: Race-Architecture as the Common Substrate of Value-Attribution. Treats effort-value attribution (IKEA effect, endowment effect, sunk-cost fallacy, generation effect, effort justification, effort heuristic) as a family on the race-architecture substrate. Cite-letter 2026r.
- Paper 6C (in preparation): Commit-Position as a Measurable Substrate Signature: Optimal Stopping, Loss Aversion, and the Substrate-Discipline of Race Topology. Treats commit-position dynamics including the secretary-problem 1/e anchor, salvaged from the original Paper 6. Cite-letter 2026s.
- Paper 6D (in preparation): Mirror-Friction: The Value of Received Information as Receiver-Side Friction-Resolution. The receiver-side complement to the effort-value family. Cite-letter 2026t.
Read the paper
The full paper is on Zenodo (concept DOI 10.5281/zenodo.20059863):