ORACLE: Optimizing Reasoning Abilities of Large Language Models via Constraint-Led Synthetic Data Elicitation
arXiv cs.AI / 2026/3/24
Key Points
- The ORACLE framework targets a key limitation of synthetic reasoning data training for LLMs: many methods validate only end-to-end correctness while missing errors in intermediate reasoning steps.
- ORACLE generates step-wise reasoning contexts with an LLM and then uses a symbolic reasoning engine to verify the validity of each intermediate step, aiming for fine-grained “step-level” supervision.
- The approach is designed to work better than code-execution or conventional symbolic validators in natural-language reasoning settings that may be ambiguous or lack complete context.
- Experiments across six logical, factual, and commonsense reasoning benchmarks show ORACLE outperforming strong baselines across multiple LLMs, indicating the method can reliably improve multi-step reasoning quality.
- The paper positions ORACLE as a structured synthetic data generation pipeline that combines generative prompting with symbolic checks to produce higher-quality training data for reasoning tasks.
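The step-level supervision idea above can be illustrated with a minimal sketch. This is an illustrative assumption, not the paper's implementation: the `Step` representation, the toy forward-chaining "symbolic engine," and the hard-coded candidate chain (which in ORACLE would come from an LLM) are all hypothetical stand-ins.

```python
# Hypothetical sketch of step-level validation in the spirit of ORACLE.
# An LLM would propose reasoning steps; here a candidate chain is hard-coded
# and each intermediate step is checked against a symbolic fact set, so that
# a chain with a correct final answer can still be flagged for a bad step.

from dataclasses import dataclass

@dataclass
class Step:
    premises: frozenset  # facts this step relies on
    conclusion: str      # new fact this step asserts

def validate_chain(known: set, steps: list) -> list:
    """Return per-step validity: a step is valid only if every premise is
    already established (a given fact or an earlier valid conclusion)."""
    results = []
    for step in steps:
        ok = step.premises <= known
        results.append(ok)
        if ok:
            # Only symbolically verified steps extend the reasoning context.
            known = known | {step.conclusion}
    return results

# Toy problem: "Socrates is a man; all men are mortal."
facts = {"man(socrates)", "mortal_if_man"}
chain = [
    Step(frozenset({"man(socrates)", "mortal_if_man"}), "mortal(socrates)"),
    Step(frozenset({"god(socrates)"}), "immortal(socrates)"),  # unsupported premise
]
print(validate_chain(set(facts), chain))  # → [True, False]
```

The point of the sketch is the contrast with end-to-end checking: an answer-only validator would score the whole chain once, while the per-step loop pinpoints exactly which intermediate inference lacks support, which is the kind of fine-grained signal the paper argues is needed for training data.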

