ARGS: Auto-Regressive Gaussian Splatting via Parallel Progressive Next-Scale Prediction

arXiv cs.CV / 4/2/2026

Key Points

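The proposed method, ARGS, extends the auto-regressive idea of 2D next-scale prediction to 3D generation (the multi-view, multi-scale representation used in Gaussian splatting) and presents a framework that predicts the next scale in parallel at each level of detail (LOD).
It introduces a strategy that applies Gaussian simplification and then reverses it to guide generation from a coarse representation to the next scale. By using hierarchical trees, the authors report that generation needs only O(log n) steps, where n is the number of points.
It proposes a tree-based transformer that predicts the tree structure auto-regressively. Leaf nodes attend to their internal ancestors, a design intended to improve structural consistency.
Experiments indicate that the level of detail and visual quality of the generated multi-scale Gaussian representations can be controlled, while computation time stays within a practical range.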
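To make the O(log n) claim concrete, here is a minimal sketch of how the step count grows with the number of points. It assumes, purely for illustration, that generation starts from a single root and that each next-scale step multiplies the node count by a fixed branching factor; the function name `count_generation_steps` and the default branching factor of 2 are hypothetical and not taken from the paper.

```python
def count_generation_steps(n_points: int, branching: int = 2) -> int:
    """Count coarse-to-fine steps needed to reach at least n_points nodes.

    Assumption (not from the paper): every step expands all current nodes
    in parallel, so the node count is multiplied by `branching` each time.
    """
    if n_points < 1:
        raise ValueError("n_points must be at least 1")
    if branching < 2:
        raise ValueError("branching must be at least 2")

    steps = 0
    nodes = 1  # start from a single root node
    while nodes < n_points:
        nodes *= branching  # one parallel next-scale expansion
        steps += 1
    return steps


if __name__ == "__main__":
    for n in (1, 1024, 1_000_000):
        print(n, count_generation_steps(n))
```

Under these assumptions the step count is the ceiling of log n in the chosen base, which is why a tree with a million leaves would need on the order of twenty parallel steps rather than a million sequential ones.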

Abstract

Auto-regressive frameworks for next-scale prediction of 2D images have demonstrated strong potential for producing diverse and sophisticated content by progressively refining a coarse input. However, extending this paradigm to 3D object generation remains largely unexplored. In this paper, we introduce auto-regressive Gaussian splatting (ARGS), a framework for making next-scale predictions in parallel for generation according to levels of detail. We propose a Gaussian simplification strategy and reverse the simplification to guide next-scale generation. Benefiting from the use of hierarchical trees, the generation process requires only \(\mathcal{O}(\log n)\) steps, where \(n\) is the number of points. Furthermore, we propose a tree-based transformer to predict the tree structure auto-regressively, allowing leaf nodes to attend to their internal ancestors to enhance structural consistency. Extensive experiments demonstrate that our approach effectively generates multi-scale Gaussian representations with controllable levels of detail, visual fidelity, and a manageable time consumption budget.
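The idea of letting nodes attend to their ancestors can be illustrated with a boolean attention mask built from a parent array. This is only a sketch under stated assumptions: the function name `ancestor_attention_mask`, the parent-array encoding (with `None` marking the root), and the choice to let each node also attend to itself are all hypothetical and are not described in the abstract.

```python
from typing import List, Optional


def ancestor_attention_mask(parent: List[Optional[int]]) -> List[List[bool]]:
    """Build mask[i][j] == True when node i may attend to node j.

    Assumptions (not from the paper): parent[i] is the index of node i's
    parent, None marks the root, and each node also attends to itself.
    """
    n = len(parent)
    mask = [[False] * n for _ in range(n)]
    for node in range(n):
        mask[node][node] = True  # self-attention (an assumption)
        current = parent[node]
        while current is not None:  # walk up to the root
            mask[node][current] = True
            current = parent[current]
    return mask


if __name__ == "__main__":
    # Toy tree: node 0 is the root; 1 and 2 are its children;
    # 3 and 4 are leaves under node 1.
    toy_parent = [None, 0, 0, 1, 1]
    for row in ancestor_attention_mask(toy_parent):
        print(["x" if allowed else "." for allowed in row])
```

In the toy tree, leaf 3 can see nodes 1 and 0 but not its sibling 4 or the unrelated branch at node 2, so each leaf is conditioned only on the coarser nodes it refines.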

