StreetForward: Perceiving Dynamic Street with Feedforward Causal Attention
arXiv cs.CV / 3/23/2026
Key Points
- StreetForward introduces a pose-free, tracker-free feedforward framework for dynamic street reconstruction in autonomous driving, enabling rapid scene reconstruction without per-scene optimization.
- It augments the Visual Geometry Grounded Transformer (VGGT) with a temporal mask attention module that extracts motion information from image sequences and produces motion-aware latent representations.
- Static content and dynamic instances are both represented with 3D Gaussian Splatting and jointly optimized through cross-frame rendering with spatio-temporal consistency. This allows per-pixel velocity estimation and high-fidelity novel view synthesis at new poses and times.
- Trained on the Waymo Open Dataset, StreetForward demonstrates superior performance on novel view synthesis and depth estimation compared with existing methods and shows zero-shot generalization on CARLA and other datasets.
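To make the temporal mask attention idea concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: it assumes the mask is causal at the frame level (a token may attend to tokens in its own frame and in earlier frames, never later ones), which is one reading of the title's "causal attention". The function names `frame_causal_mask` and `masked_attention`, the token layout, and the single-head formulation are all hypothetical and chosen only for illustration.

```python
import math


def frame_causal_mask(num_frames, tokens_per_frame):
    """Build a boolean mask where mask[q][k] is True if query token q may attend to key token k.

    Assumption (not from the paper): tokens are laid out frame by frame, and a
    token may attend to every token in its own frame and in earlier frames.
    """
    n = num_frames * tokens_per_frame
    frame_of = [i // tokens_per_frame for i in range(n)]
    return [[frame_of[k] <= frame_of[q] for k in range(n)] for q in range(n)]


def masked_attention(queries, keys, values, mask):
    """Single-head scaled dot-product attention with a boolean mask."""
    dim = len(queries[0])
    out_dim = len(values[0])
    outputs = []
    for q_idx, q in enumerate(queries):
        # Masked-out keys get a score of -inf so their softmax weight is exactly 0.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            if mask[q_idx][k_idx]
            else float("-inf")
            for k_idx, k in enumerate(keys)
        ]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(out_dim)]
        )
    return outputs
```

Under this assumption, the output for a first-frame token depends only on first-frame values, so changing later frames leaves it unchanged, while later-frame tokens can aggregate information from the past and so can carry motion cues.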