ORACLE: Optimizing Reasoning Abilities of Large Language Models via Constraint-Led Synthetic Data Elicitation

arXiv cs.AI / 3/24/2026


Key Points

  • The ORACLE framework targets a key limitation of synthetic reasoning data training for LLMs: many methods validate only end-to-end correctness while missing errors in intermediate reasoning steps.
  • ORACLE generates step-wise reasoning contexts with an LLM and then uses a symbolic reasoning engine to verify the validity of each intermediate step, aiming for fine-grained “step-level” supervision.
  • The approach is designed to work better than code-execution or conventional symbolic validators in natural-language reasoning settings that may be ambiguous or lack complete context.
  • Experiments across six logical, factual, and commonsense benchmarks show ORACLE outperforming strong baselines across multiple LLMs, indicating the method can reliably improve multi-step reasoning quality.
  • The paper positions ORACLE as a structured synthetic data generation pipeline that combines generative prompting with symbolic checks to produce higher-quality training data for reasoning tasks.
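The contrast the key points draw, answer-only filtering versus step-level verification, can be sketched as follows. All names, data structures, and the toy syllogism checker below are illustrative assumptions for exposition, not the paper's actual implementation:

```python
# Hypothetical sketch of ORACLE-style step-level filtering (names are
# illustrative, not from the paper). An answer-only filter keeps any chain
# whose final answer is correct; a step-level filter additionally requires
# every intermediate step to pass a symbolic validity check.

def answer_only_filter(chains, is_correct):
    """Keep chains whose final answer is right, ignoring intermediate steps."""
    return [c for c in chains if is_correct(c["answer"])]

def step_level_filter(chains, is_correct, step_is_valid):
    """Keep chains whose answer is right AND every step passes the checker."""
    return [
        c for c in chains
        if is_correct(c["answer"]) and all(step_is_valid(s) for s in c["steps"])
    ]

# Toy stand-in for a symbolic reasoning engine: a step is "valid" if its
# conclusion follows from its premises under one syllogism rule
# (all A are B, all B are C  =>  all A are C).
def toy_step_is_valid(step):
    p1, p2, concl = step  # each element is a (subject, predicate) pair
    return p1[1] == p2[0] and concl == (p1[0], p2[1])
```

A chain with a correct final answer but a logically broken intermediate step survives the answer-only filter yet is rejected by the step-level filter, which is precisely the failure mode the key points say ORACLE targets.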

Abstract

Training large language models (LLMs) with synthetic reasoning data has become a popular approach to enhancing their reasoning capabilities, and a key factor in the effectiveness of this paradigm is the quality of the generated multi-step reasoning data. To generate high-quality reasoning data, many recent methods produce synthetic reasoning paths and filter them based on final-answer correctness, often overlooking flaws in intermediate reasoning steps. To verify intermediate reasoning steps, prior work primarily resorts to code execution or symbolic reasoning engines. However, code-based validation is restricted to coding or mathematical tasks, and reasoning engines require a well-structured and complete context. As a result, existing methods fail to function effectively in natural language reasoning tasks that involve ambiguous or incomplete contexts; in these tasks, synthetic data still lack reliable checks for each reasoning step. To address this challenge, we introduce ORACLE, a structured data generation framework inspired by syllogistic reasoning. ORACLE integrates the generative strengths of LLMs with symbolic supervision: the LLM produces step-wise reasoning contexts, while a symbolic reasoning engine verifies the validity of each intermediate step. By employing a unified prompting template to elicit modular reasoning chains, ORACLE enables fine-grained, step-level validation, facilitating the construction of high-quality multi-step reasoning data. Across six logical, factual, and commonsense reasoning benchmarks, ORACLE consistently outperforms strong baselines on multiple models.
