Rectified Schrödinger Bridge Matching for Few-Step Visual Navigation

arXiv cs.RO / 4/8/2026


Key Points

The paper addresses a key bottleneck in embodied visual navigation: diffusion/Schrödinger-Bridge-based generative policies need many integration steps, making them hard to use for real-time robotic control.
It proposes Rectified Schrödinger Bridge Matching (RSBM), linking standard Schrödinger Bridges and deterministic optimal transport through a single entropic regularization parameter ε.
The authors prove that the conditional velocity-field functional form remains invariant across ε, so one network can handle multiple regularization strengths.
They show that the conditional velocity variance decreases linearly as ε is reduced, which makes coarse-step ODE integration more stable.
Experiments report that RSBM reaches a 92% success rate and over 94% cosine similarity in only 3 integration steps (standard bridges need ≥10 steps to converge), without distillation or multi-stage training.
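Why path straightness matters for few-step sampling can be illustrated with a toy fixed-step Euler integrator. This is a minimal sketch, not the paper's policy or code: `euler_integrate`, `straight_field`, and `curved_field` are hypothetical names, and the two hand-written velocity fields stand in for a learned network. It shows that a constant (straight-path) velocity field is integrated exactly in 3 steps, while a curved field loses accuracy at the same step count.

```python
import numpy as np


def euler_integrate(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = np.asarray(x0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)
    return x


# Toy straight-path field (assumption): constant velocity from start to target.
start = np.array([0.0, 0.0])
target = np.array([1.0, 2.0])


def straight_field(x, t):
    return target - start


# Toy curved field (assumption): a quarter-turn rotation over t in [0, 1].
def curved_field(x, t):
    omega = np.pi / 2
    return omega * np.array([-x[1], x[0]])


curved_start = np.array([1.0, 0.0])
curved_exact = np.array([0.0, 1.0])  # exact endpoint of the quarter turn

straight_3 = euler_integrate(straight_field, start, num_steps=3)
curved_err_3 = np.linalg.norm(
    euler_integrate(curved_field, curved_start, num_steps=3) - curved_exact
)
curved_err_10 = np.linalg.norm(
    euler_integrate(curved_field, curved_start, num_steps=10) - curved_exact
)

print("straight path, 3 steps:", straight_3)
print("curved path error, 3 steps:", curved_err_3)
print("curved path error, 10 steps:", curved_err_10)
```

In this toy setting the straight field needs no more than a few steps, whereas the curved field only approaches its endpoint as the step count grows. This is the general intuition behind favoring straighter, lower-variance transport for coarse-step integration.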

Abstract

Visual navigation is a core challenge in Embodied AI, requiring autonomous agents to translate high-dimensional sensory observations into continuous, long-horizon action trajectories. While generative policies based on diffusion models and Schrödinger Bridges (SB) effectively capture multimodal action distributions, they require dozens of integration steps due to high-variance stochastic transport, posing a critical barrier for real-time robotic control. We propose Rectified Schrödinger Bridge Matching (RSBM), a framework that exploits a shared velocity-field structure between standard Schrödinger Bridges ($\varepsilon = 1$, maximum-entropy transport) and deterministic Optimal Transport ($\varepsilon \to 0$, as in Conditional Flow Matching), controlled by a single entropic regularization parameter $\varepsilon$. We prove two key results: (1) the conditional velocity field's functional form is invariant across the entire $\varepsilon$-spectrum (Velocity Structure Invariance), enabling a single network to serve all regularization strengths; and (2) reducing $\varepsilon$ linearly decreases the conditional velocity variance, enabling more stable coarse-step ODE integration. Anchored to a learned conditional prior that shortens transport distance, RSBM operates at an intermediate $\varepsilon$ that balances multimodal coverage and path straightness. Empirically, while standard bridges require $\geq 10$ steps to converge, RSBM achieves over 94% cosine similarity and 92% success rate in merely 3 integration steps, without distillation or multi-stage training, substantially narrowing the gap between high-fidelity generative policies and the low-latency demands of Embodied AI.
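The two theoretical claims can be checked numerically on a one-dimensional toy problem. The sketch below rests on an assumption: it uses the Brownian-bridge conditional path common in the Schrödinger-bridge and flow-matching literature, x_t ~ N((1 − t)·x0 + t·x1, ε·t·(1 − t)), together with its probability-flow velocity. The abstract does not give the paper's exact parameterization, so this may differ from RSBM's. The function names `conditional_velocity` and `velocity_variance` are hypothetical.

```python
import numpy as np


def conditional_velocity(x, x0, x1, t):
    """Probability-flow velocity of the assumed Brownian-bridge conditional path.

    Note that epsilon does not appear here: under this assumed path, the
    functional form is the same for every regularization strength.
    """
    mean = (1.0 - t) * x0 + t * x1
    return (1.0 - 2.0 * t) / (2.0 * t * (1.0 - t)) * (x - mean) + (x1 - x0)


def velocity_variance(eps, t, x0, x1, num_samples=200_000, seed=0):
    """Monte Carlo estimate of the conditional velocity variance at time t."""
    rng = np.random.default_rng(seed)
    mean = (1.0 - t) * x0 + t * x1
    std = np.sqrt(eps * t * (1.0 - t))  # assumed bridge standard deviation
    x = mean + std * rng.standard_normal(num_samples)
    return conditional_velocity(x, x0, x1, t).var()


# Toy endpoints and time (assumptions chosen for illustration).
x0, x1, t = 0.0, 1.0, 0.25

var_full = velocity_variance(1.0, t, x0, x1)
var_half = velocity_variance(0.5, t, x0, x1)
var_zero = velocity_variance(0.0, t, x0, x1)

# Closed form under the same assumption: eps * (1 - 2t)^2 / (4 t (1 - t)).
closed_form_full = 1.0 * (1.0 - 2.0 * t) ** 2 / (4.0 * t * (1.0 - t))

print("variance at eps=1.0:", var_full, "closed form:", closed_form_full)
print("variance at eps=0.5:", var_half)
print("variance at eps=0.0:", var_zero)
```

Under this assumed path, halving ε halves the estimated variance, and at ε = 0 the velocity collapses to the deterministic displacement x1 − x0. That is consistent with the linear variance reduction and the Optimal Transport limit described in the abstract.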
