
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

arXiv cs.CL · March 11, 2026

Ideas & Deep Analysis · Models & Research

Key Points

  • Enabling reasoning in large language models (LLMs) significantly improves their ability to recall parametric knowledge for simple, single-hop factual questions, which typically do not require complex reasoning.
  • The study identifies two key mechanisms behind this effect: a computational buffer effect, where reasoning tokens facilitate latent computations, and factual priming, where generating related facts helps bridge semantic connections to the correct answer.
  • However, the generative self-retrieval process carries risks, as hallucinating intermediate facts during reasoning tends to increase hallucinations in the final answer.
  • Building on these mechanisms, the researchers improve model accuracy by prioritizing reasoning trajectories whose intermediate factual statements are hallucination-free (a minimal sketch of one such selection scheme follows this list).
  • This work provides deeper insight into how reasoning unlocks otherwise unreachable parametric knowledge within LLMs, with practical implications for enhancing model reliability and correctness.
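
The paper's selection method is only summarized above, so the following Python sketch is one plausible instantiation rather than the authors' implementation: a best-of-n decoder that keeps reasoning traces whose intermediate factual claims all pass an external check, then majority-votes over the survivors. `generate_trajectories`, `extract_claims`, and `is_supported` are hypothetical stubs, not components from the paper.

```python
from collections import Counter

# Hypothetical stubs: in practice these would wrap an LLM sampler, an
# atomic-claim extractor, and a fact-verification source, respectively.
def generate_trajectories(question: str, n: int) -> list[dict]:
    """Sample n reasoning traces; each dict holds 'reasoning' and 'answer'."""
    raise NotImplementedError

def extract_claims(reasoning: str) -> list[str]:
    """Split a reasoning trace into atomic factual statements."""
    raise NotImplementedError

def is_supported(claim: str) -> bool:
    """Return True if the claim verifies against a trusted source."""
    raise NotImplementedError

def answer_preferring_clean_traces(question: str, n: int = 16) -> str:
    """Best-of-n decoding that prioritizes hallucination-free reasoning.

    Traces whose intermediate factual claims all verify are kept; the
    final answer is a majority vote over the surviving traces, with a
    fallback to all traces if every one contains an unsupported claim.
    """
    traces = generate_trajectories(question, n)
    clean = [
        t for t in traces
        if all(is_supported(c) for c in extract_claims(t["reasoning"]))
    ]
    pool = clean or traces  # fall back so an answer is always produced
    return Counter(t["answer"] for t in pool).most_common(1)[0][0]
```

Under these assumptions, the fallback keeps the scheme no worse than plain self-consistency voting when every sampled trace contains an unverifiable claim.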

arXiv:2603.09906 (cs)
[Submitted on 10 Mar 2026]

Title: Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

Authors: Zorik Gekhman, Roee Aharoni, Eran Ofek, Mor Geva, Roi Reichart, Jonathan Herzig
Abstract: While reasoning in LLMs plays a natural role in math, code generation, and multi-hop factual questions, its effect on simple, single-hop factual questions remains unclear. Such questions do not require step-by-step logical decomposition, making the utility of reasoning highly counterintuitive. Nevertheless, we find that enabling reasoning substantially expands the capability boundary of the model's parametric knowledge recall, unlocking correct answers that are otherwise effectively unreachable. Why does reasoning aid parametric knowledge recall when there are no complex reasoning steps to be done? To answer this, we design a series of hypothesis-driven controlled experiments, and identify two key driving mechanisms: (1) a computational buffer effect, where the model uses the generated reasoning tokens to perform latent computation independent of their semantic content; and (2) factual priming, where generating topically related facts acts as a semantic bridge that facilitates correct answer retrieval. Importantly, this latter generative self-retrieval mechanism carries inherent risks: we demonstrate that hallucinating intermediate facts during reasoning increases the likelihood of hallucinations in the final answer. Finally, we show that our insights can be harnessed to directly improve model accuracy by prioritizing reasoning trajectories that contain hallucination-free factual statements.
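
The abstract describes the controlled experiments only at a high level. As a rough illustration of how the two mechanisms could be teased apart, the Python sketch below scores recall accuracy under three prompting conditions: the bare question, a semantically empty filler prefix (isolating the computational buffer effect), and a prefix of topically related facts (adding factual priming). The `generate` and `related_facts` callables, the filler string, and the exact-match scoring are all illustrative assumptions, not the paper's actual experimental setup.

```python
def exact_match(pred: str, gold: str) -> bool:
    """Case-insensitive exact-match scoring for short factual answers."""
    return pred.strip().lower() == gold.strip().lower()

def probe_mechanisms(generate, related_facts, dataset):
    """Score three prompting conditions on single-hop factual questions.

    generate(prompt) -> str        : hypothetical model sampler
    related_facts(question) -> str : hypothetical source of topically
                                     related, verified facts
    dataset: non-empty iterable of (question, gold_answer) pairs

    - 'direct': question alone (baseline recall)
    - 'filler': content-free tokens before the question; any gain over
      'direct' is attributable to extra computation rather than
      semantics (the computational buffer effect)
    - 'primed': related facts before the question; any further gain
      over 'filler' reflects factual priming
    """
    filler = "Let me think this through." + " ..." * 40  # empty buffer
    hits = {"direct": 0, "filler": 0, "primed": 0}
    n = 0
    for question, gold in dataset:
        n += 1
        preds = {
            "direct": generate(question),
            "filler": generate(f"{filler}\n{question}"),
            "primed": generate(f"{related_facts(question)}\n{question}"),
        }
        for condition, pred in preds.items():
            hits[condition] += exact_match(pred, gold)
    return {condition: count / n for condition, count in hits.items()}
```

Comparing the three accuracies on the same question set would, under these assumptions, attribute the filler-over-direct gap to the buffer effect and the primed-over-filler gap to priming.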
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2603.09906 [cs.CL]
  (or arXiv:2603.09906v1 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2603.09906

Submission history

From: Zorik Gekhman
[v1] Tue, 10 Mar 2026 16:59:20 UTC (1,734 KB)