Beyond Symbolic Solving: Multi Chain-of-Thought Voting for Geometric Reasoning in Large Language Models
arXiv cs.AI / 4/2/2026
Key Points
- The paper argues that prior LLM-based geometric problem-solving methods neglect the “logical inference” component and often rely on a single chain-of-thought rather than multiple verified reasoning paths.
- It introduces MARS-GPS, which produces multiple parallel reasoning rollouts, uses Python code execution for numerical verification, and ranks candidate solutions using token-level entropy as a confidence signal.
- MARS-GPS then aggregates results via a multi-stage voting and self-verification pipeline to improve reliability of the final geometric reasoning answer.
- Experiments report 88.8% accuracy on Geometry3K with 8 parallel rollouts, nearly 11% above the prior state of the art; accuracy also improves as rollouts scale from 1 to 16 (+6.0% on an ablation subset).
- The authors release code and data in an anonymous repository to support replication and further development of the approach.
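
To make the idea of entropy-ranked voting concrete, here is a minimal sketch in Python. It is not the MARS-GPS implementation: the summary does not describe the paper's exact ranking or multi-stage voting rules, so the function names, the `1 / (1 + entropy)` confidence weighting, and the rounding used to group answers are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def mean_token_entropy(token_distributions):
    """Average Shannon entropy (in nats) over one rollout's per-token distributions."""
    entropies = [
        -sum(p * math.log(p) for p in dist if p > 0.0)
        for dist in token_distributions
    ]
    return sum(entropies) / len(entropies)


def entropy_weighted_vote(rollouts):
    """Return the answer with the largest total confidence across rollouts.

    rollouts: list of (numeric_answer, token_distributions) pairs.
    Assumption: confidence = 1 / (1 + mean entropy), so low-entropy
    (more certain) rollouts count for more. This weighting is hypothetical.
    """
    scores = defaultdict(float)
    for answer, dists in rollouts:
        key = round(answer, 3)  # group answers that agree to 3 decimals
        scores[key] += 1.0 / (1.0 + mean_token_entropy(dists))
    return max(scores, key=scores.get)


# Three hypothetical rollouts: two agree on 12.0, one says 15.0.
example_rollouts = [
    (12.0, [[0.9, 0.1], [0.8, 0.2]]),
    (12.0, [[0.7, 0.3]]),
    (15.0, [[0.5, 0.5]]),
]
print(entropy_weighted_vote(example_rollouts))  # prints 12.0
```

In this sketch a single very confident rollout can outweigh several uncertain ones, which is one way a confidence signal can differ from plain majority voting. The numerical verification and self-verification stages described above are left out.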