TFusionOcc: T-Primitive Based Object-Centric Multi-Sensor Fusion Framework for 3D Occupancy Prediction

arXiv cs.RO / 4/22/2026


Key Points

  • The paper introduces TFusionOcc, an object-centric multi-sensor fusion framework for 3D semantic occupancy prediction aimed at improving autonomous-vehicle scene understanding.
  • It addresses the limitations of voxel-based and Gaussian-primitive approaches by proposing T-primitive geometry primitives based on the Student's t-distribution, including a plain T-primitive, a T-Superquadric, and a deformable T-Superquadric with inverse warping.
  • A unified probabilistic formulation is developed using the Student's t-distribution and a T-mixture model (TMM) to jointly model both occupancy and semantic labels.
  • The method uses a tightly coupled multi-stage fusion architecture to integrate camera and LiDAR cues more effectively.
  • Experiments on nuScenes achieve state-of-the-art results, and evaluations on nuScenes-C indicate strong robustness to most corruption types, with code to be released on GitHub.
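The paper's exact primitive parameterization is not reproduced here, but the core idea behind a T-mixture model can be sketched in a few lines: each object-centric primitive contributes a multivariate Student's t component, and the occupancy density at a query point is their weighted sum. The sketch below uses SciPy's `multivariate_t` with made-up toy parameters; the function name `tmm_density` and all component values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.stats import multivariate_t

def tmm_density(x, weights, locs, shapes, dfs):
    """Density of a Student's t mixture (TMM-style) at point(s) x.

    Each component is a multivariate Student's t with location `mu`,
    shape matrix `S`, and degrees of freedom `nu`; `weights` should
    sum to 1. All parameters here are hypothetical examples.
    """
    return sum(
        w * multivariate_t.pdf(x, loc=mu, shape=S, df=nu)
        for w, mu, S, nu in zip(weights, locs, shapes, dfs)
    )

# Two toy 3D components, standing in for two object-centric primitives.
weights = [0.6, 0.4]
locs = [np.zeros(3), np.array([2.0, 0.0, 0.0])]
shapes = [np.eye(3), 0.5 * np.eye(3)]
dfs = [3.0, 5.0]  # lower df gives heavier tails than a Gaussian

# Occupancy density at a query point near the first component.
p = tmm_density(np.array([0.5, 0.0, 0.0]), weights, locs, shapes, dfs)
print(p > 0.0)
```

The heavy tails of the Student's t (relative to a Gaussian) are what make such primitives more tolerant of outliers and better suited to elongated or asymmetric structures, which is the motivation the paper gives for moving beyond Gaussian primitives.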

Abstract

The prediction of 3D semantic occupancy enables autonomous vehicles (AVs) to perceive the fine-grained geometric and semantic scene structure for safe navigation and decision-making. Existing methods mainly rely on either voxel-based representations, which incur redundant computation over empty regions, or on object-centric Gaussian primitives, which are limited in modeling complex, non-convex, and asymmetric structures. In this paper, we present TFusionOcc, a T-primitive-based object-centric multi-sensor fusion framework for 3D semantic occupancy prediction. Specifically, we introduce a family of Student's t-distribution-based T-primitives, including the plain T-primitive, T-Superquadric, and deformable T-Superquadric with inverse warping, where the deformable T-Superquadric serves as the key geometry-enhancing primitive. We further develop a unified probabilistic formulation based on the Student's t-distribution and the T-mixture model (TMM) to jointly model occupancy and semantics, and design a tightly coupled multi-stage fusion architecture to effectively integrate camera and LiDAR cues. Extensive experiments on nuScenes show state-of-the-art performance, while additional evaluations on nuScenes-C demonstrate strong robustness under most corruption scenarios. The code will be available at: https://github.com/DanielMing123/TFusionOcc