Thought Graph Traversal for Test-time Scaling in Chest X-ray VLLMs

arXiv cs.CV / 5/4/2026


Key Points

  • The paper proposes a test-time scaling method for vision-language large models (VLLMs) that improves chest X-ray report generation without any additional training.
  • It introduces a lightweight Thought Graph Traversal (TGT) framework that steers reasoning through organ-specific findings in a medically coherent sequence using structured medical priors embedded in prompts.
  • The method further improves reasoning depth via a “reasoning budget forcing” strategy that dynamically extends the generation process at inference time.
  • On standard benchmarks, the approach outperforms baseline prompting, and its traceable reasoning paths make dataset biases visible. The authors open-source the code and prompts for reproducibility.
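
To make the idea of traversing organ-specific findings in a fixed, coherent order more concrete, here is a minimal Python sketch. It is not the authors' implementation: the graph contents, the dependency edges, and the names `THOUGHT_GRAPH`, `traversal_order`, and `build_prompt` are all hypothetical, chosen only to illustrate how a structured prior could be turned into an ordered prompt for a frozen model.

```python
from graphlib import TopologicalSorter

# Hypothetical organ-level graph (an assumption, not taken from the paper):
# each key maps to the regions that should be reviewed before it.
THOUGHT_GRAPH = {
    "lungs": [],
    "pleura": ["lungs"],
    "heart": ["lungs"],
    "mediastinum": ["heart"],
    "bones": ["pleura", "mediastinum"],
}


def traversal_order(graph):
    """Return the regions in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


def build_prompt(graph):
    """Turn the traversal order into numbered reasoning steps for the prompt."""
    steps = [
        f"{i}. Describe the findings for the {region}."
        for i, region in enumerate(traversal_order(graph), start=1)
    ]
    header = "Review the chest X-ray region by region, in this order:"
    footer = "Then write the final report based on the findings above."
    return "\n".join([header, *steps, footer])


if __name__ == "__main__":
    print(build_prompt(THOUGHT_GRAPH))
```

Because the order lives in the prompt rather than in the model weights, each step of the generated reasoning can be traced back to a node in the graph, which is one plausible way to obtain the traceable reasoning paths the summary mentions.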

Abstract

Test-time scaling offers a promising way to improve the reasoning performance of vision-language large models (VLLMs) without additional training. In this paper, we explore a simple but effective approach for applying test-time scaling to chest X-ray report generation. Specifically, we introduce a lightweight Thought Graph Traversal (TGT) framework that guides the model to reason through organ-specific findings in a medically coherent order. This framework integrates structured medical priors into the prompt, enabling deeper and more logical analysis with no changes to the underlying model. To further enhance reasoning depth, we apply a reasoning budget forcing strategy that adjusts the model's inference depth at test time by dynamically extending its generation process. This simple yet powerful combination allows a frozen radiology VLLM to self-correct and generate more accurate, consistent chest X-ray reports. Our method outperforms baseline prompting approaches on standard benchmarks, and also reveals dataset biases through traceable reasoning paths. Code and prompts are open-sourced for reproducibility at https://github.com/glerium/Thought-Graph-Traversal
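
The abstract describes reasoning budget forcing only as dynamically extending the generation process at test time. The sketch below shows one possible reading of that idea, under stated assumptions: when the reasoning is shorter than a minimum budget, a continuation cue is appended and the model is asked to keep going. The function `force_budget`, the cue string, the word-count budget, and the `stub_model` stand-in for a real VLLM call are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Callable, Tuple


def force_budget(
    generate: Callable[[str], str],
    prompt: str,
    min_words: int,
    cue: str = "Wait,",
    max_extensions: int = 3,
) -> Tuple[str, int]:
    """Extend the reasoning until it reaches min_words or the extension cap.

    `generate` is any callable that maps the text so far to a continuation.
    Counting whitespace-separated words is a simplification; a real system
    would more likely count tokens.
    """
    reasoning = generate(prompt)
    extensions = 0
    while len(reasoning.split()) < min_words and extensions < max_extensions:
        # Append the cue so the model continues instead of stopping early.
        reasoning = f"{reasoning} {cue}"
        reasoning = f"{reasoning} {generate(prompt + reasoning)}"
        extensions += 1
    return reasoning, extensions


def stub_model(context: str) -> str:
    """Hypothetical stand-in for a frozen VLLM; always returns four words."""
    return "The lungs appear clear."


if __name__ == "__main__":
    text, rounds = force_budget(stub_model, "Describe the X-ray. ", min_words=10)
    print(rounds, text)
```

The cap on extensions keeps the loop from running indefinitely when the model produces very short continuations, and the budget is a tunable knob for trading inference cost against reasoning depth.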
