SplAttN: Bridging 2D and 3D with Gaussian Soft Splatting and Attention for Point Cloud Completion
arXiv cs.CV / 5/5/2026
Key Points
- The paper argues that standard hard projection in multi-modal point cloud completion can sever the connection between modalities, causing a failure mode the authors call Cross-Modal Entropy Collapse.
- SplAttN addresses this by replacing hard projection with Differentiable Gaussian Splatting to generate a dense, continuous image-plane representation that preserves cross-modal learnability and enables better gradient flow.
- In extensive experiments, SplAttN reportedly achieves state-of-the-art results on the PCN and ShapeNet-55/34 point cloud completion benchmarks.
- Using KITTI as a real-world stress test, the authors’ counterfactual evaluation, in which visual information is removed, suggests that competing baselines degrade into unimodal template retrievers, while SplAttN's completions remain dependent on visual cues.
- The authors provide the implementation code publicly on GitHub.
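
To make the contrast between hard projection and Gaussian soft splatting concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the pinhole camera, the 33x33 image size, the isotropic `sigma`, and all function names are assumptions chosen for illustration. It nudges three points slightly and compares how each image responds.

```python
import math

SIZE = 33  # hypothetical image resolution (assumption, not from the paper)

def project_to_pixels(point, size=SIZE, focal=1.0):
    """Pinhole-project a 3D point (z > 0) to continuous pixel coordinates."""
    x, y, z = point
    u, v = focal * x / z, focal * y / z
    # Map normalized coordinates in [-1, 1] to [0, size - 1].
    return (u + 1.0) * 0.5 * (size - 1), (v + 1.0) * 0.5 * (size - 1)

def hard_projection(points, size=SIZE):
    """Each point lights only its nearest pixel: sparse and piecewise constant."""
    img = [[0.0] * size for _ in range(size)]
    for p in points:
        px, py = project_to_pixels(p, size)
        col, row = round(px), round(py)
        if 0 <= row < size and 0 <= col < size:
            img[row][col] = 1.0
    return img

def soft_splat(points, size=SIZE, sigma=1.5):
    """Each point spreads an isotropic Gaussian footprint over every pixel."""
    centers = [project_to_pixels(p, size) for p in points]
    return [[sum(math.exp(-((c - px) ** 2 + (r - py) ** 2) / (2.0 * sigma ** 2))
                 for px, py in centers)
             for c in range(size)]
            for r in range(size)]

def max_abs_diff(a, b):
    """Largest per-pixel difference between two images."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

points = [(0.0, 0.0, 1.0), (0.5, -0.25, 2.0), (-0.6, 0.3, 1.5)]
nudged = [(x + 1e-3, y, z) for x, y, z in points]  # tiny shift along x

hard_change = max_abs_diff(hard_projection(points), hard_projection(nudged))
soft_change = max_abs_diff(soft_splat(points), soft_splat(nudged))
hard_lit = sum(v > 0 for row in hard_projection(points) for v in row)
soft_lit = sum(v > 0 for row in soft_splat(points) for v in row)

print("hard: lit pixels =", hard_lit, "| change after nudge =", hard_change)
print("soft: lit pixels =", soft_lit, "| change after nudge =", soft_change)
```

In this toy setup the hard image does not react to the nudge, because rounding discards sub-pixel motion, so no useful gradient could flow back to the point coordinates. The splatted image covers every pixel and varies smoothly with point position, which is the property the summary credits for preserving cross-modal learnability.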