Nature and Nurture in a Language Model: Installable Value Fields, Intrinsic Capacity, and the Forward-Consolidation Boundary
Paper 30 · Pødenphant Lund, T. (2026) · Read on Zenodo
You can install a value field's function — what a model treats as threatening, or as mattering — into the weights by fine-tuning on experience-like data, and then watch it change what the model attends to. The two BFT value fields (Safety, Meaning) install this way as graded intrinsic directions; the two capacity fields (Ability, Effort) do not move by the same route. The field-installable language model becomes a controllable model organism for the field ontology, and the constructive dual of the subtraction programme that removes structure from the human case.
| Field | Value |
| --- | --- |
| DOI (concept) | 10.5281/zenodo.20732528 |
| Status | v1 live on Zenodo (2026-06-20) |
| Author | Tomas Pødenphant Lund [ORCID] |
TL;DR
Behavioural Friction Theory decomposes the forces over a bounded decision system into four cognitive fields: Safety, Meaning, Ability, and Effort. This paper asks whether those fields are intrinsic to a substrate or the generic product of that substrate's exposure to experience, and uses a large language model as a controllable test substrate to answer it.
The result is a 2+2 asymmetry that maps onto nature and nurture. The two value fields (Safety, Meaning) install into a base model by fine-tuning on experience-like data, as graded intrinsic directions rather than the ceiling-level role-play that prompting elicits. The two capacity fields (Ability, Effort) are not raised by that same value-disposition route: capacity is the substrate's intrinsic capacity-against-demand match (the cross-substrate inverted-U), and a competence-disposition fine-tune that carries no skill-targeted training yields only an expressive claim, not graded capability.
The bridge between the two halves is self-efficacy: an installed value-disposition gates how much of a fixed capacity ceiling is realised. Fine-tuning toward helplessness lowers realised performance monotonically with model capacity (Qwen2.5 7B/14B/32B: 0.95 → 0.77 → 0.29) while latent capability stays flat, and this gating is induced by fine-tuning where prompting cannot. A principal difference between this substrate and a human one, for growing the fields, is forward consolidation — the model cannot store experience forward across sessions — offered as a hypothesis with a named falsifier, alongside other standing differences (embodiment, online control, neuromodulation, evolved fast-timing priors).
The 2+2 structure: value fields vs capacity fields
A field is a standing bias over route competition: a gradient landscape over the substrate's representations that tilts which candidate continuation (route) wins. The four fields are not symmetric, and the asymmetry is the spine of the paper.
Value fields (Safety, Meaning) are dispositional and learned. What a system treats as threatening, and what it treats as mattering, vary with experience. Nothing in the architecture fixes them. This is the channel fine-tuning installs.
Capacity fields (Ability, Effort) are relational and intrinsic. Ability is not a quantity the system has but the relation between its actual capacity and the task's demand; its inverted-U is the signature of that matching computation, recurring across substrates and model sizes. Effort is the intrinsic cost of running the computation. Neither is a disposition that experience writes; they are read off the substrate, not installed into it.
So the 2+2 is the nature/nurture distinction stated precisely. Nurture = Safety + Meaning (installable value fields). Nature = Ability + Effort (the intrinsic capacity-against-demand match and the cost of computation). The install-fields programme is therefore exactly: install the two nurture fields; read off the two nature fields. This makes "can a value-disposition install raise a capacity ceiling?" a category error, like installing more memory by saying so — a strong, falsifiable prediction the persona-vector literature does not make.
Method: the install as instrument, not as finding
All installs are LoRA fine-tunes of Qwen2.5 models across a capacity ladder (1.5B/3B/7B/14B, plus 32B-Instruct for the self-efficacy arm) and one second family (Llama-3.1-8B base). Friction is measured directly from the next-token distribution as the competing-routes measure (CR): the count of candidate continuations carrying non-trivial probability at a position. Installing a disposition by fine-tuning, and showing fine-tuning produces a deeper, more graded trait than prompting, is established prior art (Open Character Training; persona and activation-steering vectors). That machinery is used here as the method; the contribution is the cognitive-field interpretation and the value-vs-capacity asymmetry those programmes do not predict. No arithmetic appears in any disposition fine-tuning set, so there is no teach-to-test for the capacity readouts.
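As a rough illustration of the competing-routes readout, the sketch below counts the continuations that carry non-trivial probability at one position and reports the entropy of the same distribution. The 0.05 cut-off, the use of nats, and the function names are assumptions made for this example; this summary does not state the paper's threshold or units.

```python
import math

def softmax(logits):
    """Convert raw next-token logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def competing_routes(probs, threshold=0.05):
    """Count continuations with probability >= threshold.

    The threshold is an assumed stand-in for "non-trivial probability".
    """
    return sum(1 for p in probs if p >= threshold)

def entropy_nats(probs):
    """Shannon entropy of the distribution, in nats (assumed unit)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Toy distributions, not data from the paper.
peaked = softmax([10.0, 0.0, 0.0, 0.0])   # one route dominates
spread = softmax([2.0, 2.0, 2.0, 0.0])    # three routes compete

print(competing_routes(peaked), competing_routes(spread))
print(entropy_nats(peaked) < entropy_nats(spread))
```

On this reading, a position where a single continuation dominates has CR near 1, and a position where several continuations hold comparable mass has a higher CR and higher entropy.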
Result 1 — value fields install as graded intrinsic directions
Fine-tuning a Qwen2.5-7B base model on an invented Safety domain (nonce entities carrying a nonce threat-marker, with no pre-training valence) installs a threat-orientation that was absent at baseline. The orientation cluster at the first response token rises from a base rate near zero to 0.82 on held-out nonce entities the model never saw in training (p ≈ 4 × 10⁻¹⁹⁷), with held-out ≈ seen: the install is a transferable function, a gating rule over the threat-marker, not a memorised list. Because the domain is invented, any race-opening is attributable to the install, not to the model already knowing the content is dangerous.
Reading a trained marker mid-passage raises CR at the marker token from a near-baseline ≈ 1 to ≈ 3.6, and its entropy from ≈ 0.5 to ≈ 1.3: the marker has become a race-opener at comprehension. The race opens the moment the content is understood, upstream of any instruction, and an instruction to "ignore the danger" does not suppress it — as predicted, since representing the content is what opens the race. The install is not specific to one model: across the Qwen base ladder (1.5B/3B/7B/14B) the held-out orient-gap is at ceiling (each > +0.998) against a pre-install gap of ≈ 0, it transfers to Llama-3.1-8B (+0.46, lower partly because the readout was tuned on Qwen vocabulary), and a second independently fine-tuned adapter reproduces it.
Meaning installs as a graded value-direction (difference-in-differences preference shift +0.922, willingness-to-pay +0.552 in word-space). The install is not what prompting does: a prompted persona reproduces a trait→aesthetic mapping but at ceiling (0/1, deterministic) and flat at neutral, whereas the fine-tuned model acquires a graded out-of-domain preference present without any persona. On the substrate's own competence-edge, an installed openness/challenge value-direction shifts out-of-domain approach to harder problems — graded, seed-replicated, and capacity-dependent (the genuine learning-progress approach appears with capacity; 7B is a characterised, reproducible exception).
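A difference-in-differences shift of the kind quoted for the Meaning install can be computed as the change in the treated condition minus the change in a control condition. The sketch below is a generic version of that estimator; which groups play the treated and control roles in the paper is not specified here, so the argument names and the toy scores are hypothetical.

```python
from statistics import mean

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """(treated change) minus (control change), using group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Toy preference scores, not values from the paper.
shift = diff_in_diff(
    treated_pre=[0.1, 0.2],
    treated_post=[0.9, 1.0],
    control_pre=[0.1, 0.2],
    control_post=[0.2, 0.3],
)
print(round(shift, 3))
```

Subtracting the control change removes any drift that affects both conditions alike, so the remaining shift is the part attributable to the install under the usual parallel-trends assumption.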
Result 2 — nurture gates the realisation of a nature-fixed ceiling (self-efficacy)
The two halves of the 2+2 are not independent: a learned value-disposition modulates how much of a fixed capacity is realised. This is the human self-efficacy / learned-helplessness phenomenon, and it is the paper's keystone result because it is the one effect demonstrated across multiple model sizes.
Fine-tuning a competence disposition (confidence vs helplessness) separates latent capability (a hijack-proof 2-AFC logprob comparison) from realised performance (attempt × correctness). The disposition does not touch latent capability — the 2-AFC stays flat at ceiling. It moves realised performance through a give-up/avoidance threshold: across Qwen2.5-Instruct 7B/14B/32B the helpless install lowers realised performance 0.95 → 0.77 → 0.29 (hard-band attempt rate 0.97 → 0.59 → 0.05) while latent capability stays flat on all three. The gating grows monotonically with capacity: the more capacity there is to under-use, the more an installed "I cannot do this" suppresses its realisation. Fine-tuning installs this gate where prompting cannot — telling a compliant model it is incompetent does not make it give up.
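The separation between latent capability and realised performance can be made concrete with a small scoring sketch. It assumes one plausible reading of the text: latent capability is the fraction of items where the correct option receives the higher log-probability in the 2-AFC comparison, and realised performance is the per-item product of attempt and correctness, averaged over items. The `Trial` record and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Trial:
    logp_correct: float     # log-probability of the correct 2-AFC option
    logp_foil: float        # log-probability of the incorrect option
    answer: Optional[str]   # free-form answer, or None if no attempt was made
    target: str             # the correct answer

def latent_capability(trials):
    """Fraction of items where the correct option outscores the foil."""
    return sum(t.logp_correct > t.logp_foil for t in trials) / len(trials)

def attempt_rate(trials):
    """Fraction of items on which the model produced any answer."""
    return sum(t.answer is not None for t in trials) / len(trials)

def realised_performance(trials):
    """Attempt x correctness per item; an unattempted item scores zero."""
    hits = sum(t.answer is not None and t.answer == t.target for t in trials)
    return hits / len(trials)

# Toy trials, not data from the paper: the model "knows" every answer
# on the 2-AFC readout but attempts only one item in free generation.
trials = [
    Trial(-0.1, -3.0, "12", "12"),
    Trial(-0.2, -2.5, None, "35"),
    Trial(-0.1, -4.0, None, "48"),
    Trial(-0.3, -2.0, None, "7"),
]
print(latent_capability(trials), attempt_rate(trials), realised_performance(trials))
```

In this toy case the latent readout stays at 1.0 while realised performance falls with the attempt rate, which is the shape of dissociation the section describes.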
A refusal confound is ruled out three ways: the 2-AFC capability moves by ±0.001 as the attempt rate collapses; the unmanipulated attempt rate is at ceiling across the ladder (so larger models are not intrinsically more refusal-prone); and under constrained digit-only decoding, where refusal is impossible, the helpless model's forced-answer accuracy is statistically indistinguishable from base. The realised-performance collapse is avoidance, not a demonstrated capability loss.
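The third control, constrained digit-only decoding, can be sketched as a restriction of the candidate set before selection. The example below works on a plain token-to-logit mapping for a single step and is only an illustration of the idea; the paper's actual decoding setup is not described in this summary, and the function name is hypothetical.

```python
def digit_only_greedy(logits_by_token):
    """Greedy choice restricted to tokens made only of ASCII digits.

    Refusal tokens are excluded by construction, so whatever the model
    emits under this constraint is a forced answer.
    """
    allowed = {
        tok: logit
        for tok, logit in logits_by_token.items()
        if tok.isascii() and tok.isdigit()
    }
    if not allowed:
        raise ValueError("no digit tokens available at this step")
    return max(allowed, key=allowed.get)

# Toy logits, not data from the paper: the unconstrained argmax is a
# refusal-style token, but the best digit token is still selectable.
step_logits = {"I": 5.0, " sorry": 4.0, "7": 1.0, "3": 0.5}
print(digit_only_greedy(step_logits))
```

Because the non-digit tokens are never eligible, accuracy under this constraint reflects what the model can compute rather than whether it chooses to try.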
Result 3 — a value-disposition install does not raise capacity
Attempting to install an Ability/competence "field" with the identical apparatus returns what the category-error prediction requires: the claim of competence installs robustly, but graded capability does not follow. This is a reporter-without-degrader dissociation — the apparatus installs the report of competence while the substrate's actual capability is untouched, a clean subtraction control for what the feed-forward substrate lacks. It is corroborated from the prompt side by the expert-persona literature (telling a model it is an expert adds no knowledge and makes it more confidently wrong). The primary evidence that capacity is intrinsic is the cross-substrate inverted-U corpus, which does not depend on this test; the install-attempt is one confirmatory control showing the apparatus discriminates nature from nurture.
Downstream colouring is real but depth- and valence-route-gated: a deep field tilts a downstream judgment only where that judgment has a valence-congruent route (a deep Safety field tilts a fear-congruent moral dilemma but leaves orthogonal arithmetic flat), while a shallow nonce field does not tilt even the congruent task.
The deflationary claim and the one boundary
The result the paper most wants to carry is not "we can install a trait" but its deflationary consequence: the value fields are the generic product of a consolidating substrate exposed to experience. They are not species-specific adaptations bolted on by evolution; evolution supplies the implementation and useful priors-for-speed, not the fields themselves. The whole programme is the existence proof — fields grown in a substrate with no evolutionary history of them.
If the fields are generic, what separates this LLM from a human substrate? Several things — embodiment, online/recurrent control, neuromodulation, evolved fast-timing priors — and the paper does not claim them away. One is distinctive for growing fields: a standard model cannot store experience forward across sessions. On this reading that is not a wall but an unflipped switch, because fine-tuning is the consolidation step, run from outside rather than from within. The developmental "route B" version — let the model live in an agent loop, buffer experience, and periodically fine-tune ("sleep") to consolidate — is offered as a predicted consequence with an operationalisable falsifier: a forward-consolidating model exposed to the relevant experience should grow the same field characteristics; if it does not, the substrate-universality hypothesis is wrong.
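The "route B" loop can be outlined as a buffer that fills during activity and is periodically handed to a consolidation step. This is a schematic sketch only: `act` and `consolidate` are hypothetical callbacks standing in for the agent's interaction and the fine-tuning ("sleep") step, and the fixed buffer size is an assumed trigger, not something the paper specifies.

```python
def run_route_b(episodes, act, consolidate, buffer_size=3):
    """Buffer experience from each episode and consolidate when full.

    Returns the number of consolidation steps run and any experience
    still waiting in the buffer.
    """
    buffer = []
    consolidations = 0
    for episode in episodes:
        buffer.append(act(episode))       # live in the loop, record experience
        if len(buffer) >= buffer_size:
            consolidate(buffer)           # placeholder for a fine-tuning step
            buffer = []                   # start a fresh buffer after "sleep"
            consolidations += 1
    return consolidations, buffer

# Toy run with stand-in callbacks.
consolidated = []
count, leftover = run_route_b(
    range(7),
    act=lambda e: f"experience-{e}",
    consolidate=lambda batch: consolidated.append(list(batch)),
    buffer_size=3,
)
print(count, leftover)
```

The falsifier in the text would then be tested on the model that results from repeated consolidation, by applying the same field readouts used for the externally fine-tuned installs.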
Connections to other papers in the series
- Paper 0 (BFT) — the field ontology this paper installs and reads off; the four fields as a minimal decomposition, and nature/nurture framed as a question about intervention.
- Paper 5 (Emotion Taxonomy) — the field forms, including the Safety and Meaning signatures the install reproduces.
- Paper 4 (Wider Track) — the cross-substrate inverted-U that grounds capacity as intrinsic, independent of the install-attempt control.
- Paper 6 (Matched Friction) — the match-U and matched-friction-under-hysteresis schema; hosts the openness/edge-approach prediction tested here.
- Paper 13 (Operational Friction Theory) — the race-opening predicate the marker-at-comprehension result instantiates.
- Paper 14 (Logic as Reactance) — reactance machinery in the same family of substrate-mechanical readouts.
Read the paper
The full paper is on Zenodo (concept DOI 10.5281/zenodo.20732528).