SeGPruner: Semantic-Geometric Visual Token Pruner for 3D Question Answering
arXiv cs.CV / 4/1/2026
Key Points
- The paper introduces SeGPruner, a semantic-aware and geometry-guided framework to prune redundant visual tokens in multi-view 3D question answering pipelines.
- It pairs a saliency-aware token selector, which keeps semantically important tokens, with a geometry-aware token diversifier, which adds spatially diverse tokens chosen by both semantic relevance and 3D geometric distance.
- The approach is designed to overcome limitations of prior pruning methods that are mainly 2D-focused or depend on indirect geometric cues, which can reduce both semantic coverage and spatial robustness.
- Experiments on ScanQA and OpenEQA show large efficiency gains, cutting the visual token budget by 91% and inference latency by 86% while preserving competitive 3D reasoning performance.
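
The two-stage idea in the key points (keep the most salient tokens, then add spatially diverse ones) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation: the function name `prune_tokens`, the parameters `n_salient`, `n_diverse`, and `alpha`, and the linear score mixing saliency with normalized 3D distance are all assumptions made here for illustration. It also assumes each token already has a saliency score in [0, 1] and a 3D position.

```python
import math


def prune_tokens(saliency, positions, n_salient, n_diverse, alpha=0.5):
    """Return sorted indices of the tokens to keep (hypothetical sketch).

    saliency  : per-token importance scores, assumed to lie in [0, 1]
    positions : per-token (x, y, z) coordinates
    n_salient : tokens kept purely by saliency (stage 1)
    n_diverse : tokens added for spatial coverage (stage 2)
    alpha     : assumed weight, 1.0 = saliency only, 0.0 = distance only
    """
    n = len(saliency)
    if len(positions) != n:
        raise ValueError("saliency and positions must have the same length")
    if not 1 <= n_salient <= n:
        raise ValueError("n_salient must be between 1 and the token count")

    # Stage 1: keep the top-n_salient tokens by saliency.
    order = sorted(range(n), key=lambda i: saliency[i], reverse=True)
    kept = order[:n_salient]
    remaining = set(order[n_salient:])

    # Distance from every remaining token to its nearest kept token.
    min_dist = {
        i: min(math.dist(positions[i], positions[j]) for j in kept)
        for i in remaining
    }
    # Normalize distances so they are comparable to saliency scores.
    scale = max(min_dist.values(), default=0.0) or 1.0

    # Stage 2: greedily add tokens that are both relevant and far from
    # everything kept so far (a farthest-point-style selection).
    for _ in range(min(n_diverse, len(remaining))):
        best = max(
            remaining,
            key=lambda i: (
                alpha * saliency[i] + (1 - alpha) * min_dist[i] / scale,
                -i,  # deterministic tie-break: prefer the lower index
            ),
        )
        kept.append(best)
        remaining.remove(best)
        for i in remaining:
            d = math.dist(positions[i], positions[best])
            min_dist[i] = min(min_dist[i], d)

    return sorted(kept)


# Toy example: token 3 is less salient than token 2 but lies far away,
# so the diversification stage prefers it.
saliency = [0.9, 0.8, 0.5, 0.4, 0.1]
positions = [(0, 0, 0), (0.1, 0, 0), (0.2, 0, 0), (5, 0, 0), (0, 0.1, 0)]
print(prune_tokens(saliency, positions, n_salient=2, n_diverse=1))  # [0, 1, 3]
```

In this sketch, a 91% cut in the token budget would correspond to choosing `n_salient + n_diverse` at roughly 9% of the original token count; how the paper splits that budget between the two stages is not stated in this summary.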