Director: Instance-aware Gaussian Splatting for Dynamic Scene Modeling and Understanding
arXiv cs.CV / 4/3/2026
Key Points
- The paper introduces “Director,” a unified spatio-temporal Gaussian representation designed for dynamic scenes, aiming to combine high-fidelity 4D rendering with instance-level semantics for more robust understanding and tracking.
- To improve semantic consistency, it supervises learnable per-Gaussian features with temporally aligned instance masks and sentence embeddings from multimodal large language models, and uses two MLP decoders to keep instance identities consistent over time.
- To reduce temporal drift and improve stability, the method integrates 2D optical flow with the 4D Gaussians and fine-tunes their motion; the resulting alignment provides a more reliable initialization.
- Training further incorporates geometry-aware SDF constraints and regularization terms that enforce surface continuity, targeting better temporal coherence in dynamic foreground modeling.
- Experiments report that Director produces temporally coherent 4D reconstructions while enabling instance segmentation and open-vocabulary (language-conditioned) querying of the scene.
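
Two of the ideas above, language-conditioned querying over per-Gaussian features and alignment of Gaussian motion with 2D optical flow, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the cosine-similarity threshold, and the L1 form of the flow loss are assumptions made for illustration, and the features are assumed to be already decoded into the same space as the text embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def query_gaussians(features, text_embedding, threshold=0.5):
    """Return indices of Gaussians whose feature matches the text query.

    features:       list of per-Gaussian feature vectors (hypothetical layout).
    text_embedding: embedding of the query sentence, same dimensionality.
    threshold:      assumed similarity cut-off, not a value from the paper.
    """
    return [
        i
        for i, feature in enumerate(features)
        if cosine_similarity(feature, text_embedding) >= threshold
    ]


def flow_alignment_loss(centers_t, centers_t1, flow):
    """Mean L1 gap between projected Gaussian displacement and optical flow.

    centers_t, centers_t1: projected 2D Gaussian centers at frames t and t+1.
    flow:                  2D optical flow sampled at centers_t.
    The L1 form is an assumption; the summary does not state the loss used.
    """
    if not flow:
        return 0.0
    total = 0.0
    for (x0, y0), (x1, y1), (u, v) in zip(centers_t, centers_t1, flow):
        total += abs((x1 - x0) - u) + abs((y1 - y0) - v)
    return total / (2 * len(flow))


# Toy usage: three Gaussians with 2D features, queried with a 2D "text" embedding.
matches = query_gaussians([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 0.0])

# Toy usage: two Gaussians; the second moves one pixel further than the flow predicts.
loss = flow_alignment_loss(
    [(0.0, 0.0), (1.0, 1.0)],
    [(1.0, 0.0), (1.0, 3.0)],
    [(1.0, 0.0), (0.0, 1.0)],
)
```

In a sketch like this, a lower flow-alignment loss means the Gaussians' image-space motion agrees more closely with the observed optical flow, and the query returns the subset of Gaussians to render or segment for a given text prompt.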