Context-Fidelity Boosting: Enhancing Faithful Generation through Watermark-Inspired Decoding

arXiv cs.CL / 4/27/2026


Key Points

  • The paper introduces Context-Fidelity Boosting (CFB), a decoding-time framework aimed at reducing “faithfulness hallucinations” where LLM outputs contradict or ignore the input context.
  • CFB applies watermark-inspired logit shaping: additive token-level adjustments proportional to how strongly the input context supports each candidate token.
  • It proposes three variants—static boosting, context-aware boosting, and token-aware boosting—ranging from fixed biases to adaptive, relevance-informed adjustments.
  • CFB is lightweight, requiring neither retraining nor architectural changes; experiments on summarization and QA show consistent faithfulness gains with minimal generation overhead.
  • An open-source implementation is provided, suggesting the method can be readily adopted across many existing open-source LLMs.
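To make the core mechanism concrete, here is a minimal sketch of the static variant: a fixed bias added to the logits of context-supported tokens, analogous to the green-list bias in soft watermarking. The function name, the `delta` value, and the assumption that supported token IDs are given are all illustrative, not taken from the paper's implementation.

```python
import numpy as np

def static_boost(logits: np.ndarray, supported_ids, delta: float = 2.0) -> np.ndarray:
    """Add a fixed bias to every context-supported token's logit.

    logits        : next-token logits over the vocabulary
    supported_ids : indices of tokens judged supported by the source context
    delta         : fixed additive bias (hypothetical value)
    """
    boosted = logits.copy()
    boosted[np.asarray(supported_ids)] += delta
    return boosted

# Example: boost tokens 0 and 2 of a toy 3-token vocabulary.
out = static_boost(np.array([0.0, 1.0, 2.0]), [0, 2], delta=1.5)
```

After softmax, the boosted tokens gain probability mass at the expense of unsupported ones, which is the intended faithfulness pressure.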

Abstract

Large language models (LLMs) often produce content that contradicts or overlooks information provided in the input context, a phenomenon known as faithfulness hallucination. In this paper, we propose Context-Fidelity Boosting (CFB), a lightweight and general decoding-time framework that reduces such hallucinations by increasing the generation probability of source-supported tokens. Motivated by logit-shaping principles from watermarking techniques, CFB applies additive token-level logit adjustments based on a token's degree of support from the input context. Specifically, we develop three boosting strategies: static boosting, which applies a fixed bias to source-supported tokens; context-aware boosting, which scales this bias using the divergence between next-token distributions with and without context; and token-aware boosting, which further redistributes the adaptive bias according to local relevance estimated from source-position attention and source-scoped semantic similarity. CFB requires no retraining or architectural changes, making it compatible with a wide range of LLMs. Experiments on summarization and question answering tasks across multiple open-source LLMs show that CFB consistently improves faithfulness metrics with minimal generation overhead. Our implementation is fully open-sourced.
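The adaptive variants described above can be sketched as follows. This is an illustrative reconstruction, not the paper's code: it scales a base bias by the KL divergence between the with-context and without-context next-token distributions (context-aware boosting), and optionally redistributes that bias across supported tokens according to per-token relevance weights (token-aware boosting). The bounding formula `kl / (1 + kl)` and all parameter names are assumptions for the sketch.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    z = np.exp(x - x.max())
    return z / z.sum()

def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    # KL(p || q), with a small epsilon for numerical safety.
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def adaptive_boost(logits_ctx, logits_noctx, supported_ids,
                   base_delta: float = 2.0, relevance=None) -> np.ndarray:
    """Context-aware (and optionally token-aware) logit boosting sketch.

    logits_ctx   : next-token logits conditioned on the input context
    logits_noctx : next-token logits without the context
    supported_ids: indices of context-supported tokens
    relevance    : optional per-supported-token relevance weights
                   (e.g. from source attention / semantic similarity)
    """
    kl = kl_divergence(softmax(logits_ctx), softmax(logits_noctx))
    # Larger divergence => context matters more => larger (bounded) bias.
    delta = base_delta * kl / (1.0 + kl)  # bounding scheme is an assumption
    ids = np.asarray(supported_ids)
    boosted = logits_ctx.copy()
    if relevance is None:
        boosted[ids] += delta                      # context-aware: uniform bias
    else:
        w = np.asarray(relevance, dtype=float)
        w = w / w.sum()                            # normalize relevance weights
        boosted[ids] += delta * w * len(ids)       # token-aware: redistribute bias
    return boosted
```

In a real decoding loop this would run once per generation step, with the supported-token set and relevance weights refreshed from the source context before sampling.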