Scientific Knowledge-driven Decoding Constraints Improving the Reliability of LLMs

arXiv cs.CL / 4/9/2026

Key Points

  • The paper introduces SciDC, a knowledge-driven LLM generation approach that injects subject-specific knowledge into generation via strong constraints to reduce hallucinations.
  • SciDC uses stronger LLMs to automatically transform flexible domain knowledge into standardized, multi-layer rules that can then constrain downstream domain task generation.
  • Experiments on scientific domains—including industrial formulation design, clinical tumor diagnosis, and retrosynthesis planning—show consistent improvements, with an average 12% accuracy gain over vanilla generation.
  • The authors position the framework as extensible and discuss how LLMs could automatically induce and summarize highly condensed knowledge, which could accelerate parts of scientific research.
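
The idea of constraining generation with rules, as described in the points above, can be illustrated with a minimal sketch. This is not the paper's implementation: the toy vocabulary, the scores, the `banned` set, and the function names `allowed_tokens` and `constrained_greedy_step` are all assumptions made for illustration. The sketch shows only the general technique of masking disallowed tokens before choosing the next one.

```python
import math

# Hypothetical toy vocabulary; not taken from the paper.
VOCAB = ["water", "ethanol", "benzene", "<eos>"]


def allowed_tokens(prefix, banned):
    """Return the tokens an illustrative rule set permits after `prefix`.

    Assumed rules: a token may not be in `banned` and may not repeat.
    """
    return {tok for tok in VOCAB if tok not in banned and tok not in prefix}


def constrained_greedy_step(logits, prefix, banned):
    """Pick the highest-scoring token among those the rules allow."""
    allowed = allowed_tokens(prefix, banned)
    # Disallowed tokens get a score of -inf so they can never win the argmax.
    masked = {
        tok: (score if tok in allowed else -math.inf)
        for tok, score in logits.items()
    }
    return max(masked, key=masked.get)


# Made-up scores standing in for a model's next-token logits.
logits = {"water": 1.2, "ethanol": 0.7, "benzene": 2.5, "<eos>": 0.1}

unconstrained = max(logits, key=logits.get)
constrained = constrained_greedy_step(logits, prefix=[], banned={"benzene"})
```

Without the mask, the top-scoring token is chosen even if a domain rule forbids it; with the mask, the choice falls to the best token that the rules still permit. A real system would apply the same masking to model logits at every decoding step.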

Abstract

Large language models (LLMs) have shown strong knowledge reserves and task-solving capabilities, but they still suffer from severe hallucination, which hinders their practical application. Although scientific theories and rules can efficiently guide human practitioners, LLMs still do not make sufficient use of this highly condensed knowledge through training or prompting. To address this issue, we propose SciDC, an LLM generation method that integrates subject-specific knowledge with strong constraints. By adopting strong LLMs to automatically convert flexible knowledge into multi-layered, standardized rules, we build an extensible framework that effectively constrains model generation on domain tasks. Experiments on scientific tasks, including industrial formulation design, clinical tumor diagnosis, and retrosynthesis planning, consistently demonstrate the effectiveness of our method, with an average accuracy improvement of 12% over vanilla generation. We further discuss the potential of LLMs to automatically induce and summarize highly condensed knowledge, looking ahead to practical solutions for accelerating the overall scientific research process. The code for this paper is available at https://github.com/Maotian-Ma/SciDC.
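
The abstract does not specify what the multi-layered, standardized rules look like, so the following is only a hedged sketch of one way layered rules could be represented and checked against a candidate output. The `Rule` class, the `layer` ordering, the three example rules, and the formulation dictionaries are hypothetical and are not drawn from the paper or its repository.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Formulation = Dict[str, float]  # component name -> mass fraction (assumed format)


@dataclass
class Rule:
    name: str
    layer: int  # assumption: lower layers are checked first
    check: Callable[[Formulation], bool]


# Illustrative rules only; real domain rules would come from expert knowledge.
RULES: List[Rule] = [
    Rule("fractions_sum_to_one", 0, lambda f: abs(sum(f.values()) - 1.0) < 1e-6),
    Rule("no_negative_fraction", 0, lambda f: all(v >= 0 for v in f.values())),
    Rule("solvent_at_most_half", 1, lambda f: f.get("solvent", 0.0) <= 0.5),
]


def first_violation(candidate: Formulation, rules: List[Rule]) -> Optional[str]:
    """Return the name of the first violated rule, checking layer by layer."""
    for rule in sorted(rules, key=lambda r: r.layer):
        if not rule.check(candidate):
            return rule.name
    return None


valid = first_violation({"solvent": 0.3, "resin": 0.7}, RULES)
too_much_solvent = first_violation({"solvent": 0.6, "resin": 0.4}, RULES)
bad_total = first_violation({"solvent": 0.3, "resin": 0.3}, RULES)
```

Ordering checks by layer lets general validity rules run before more specific domain rules, so a candidate that fails a basic check is rejected without evaluating the later layers.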