AeroBridge-TTA: Test-Time Adaptive Language-Conditioned Control for UAVs

arXiv cs.RO / 4/22/2026


Key Points

  • The paper addresses a key failure mode in language-guided UAVs: execution mismatch when real dynamics (e.g., mass, drag, actuator delay, wind) differ from training.
  • It proposes AeroBridge-TTA, a language-conditioned UAV control pipeline that uses a language encoder to generate subgoals, an adaptive latent-conditioned policy, and a test-time adaptation (TTA) module that updates a latent variable online.
  • Across five language-conditioned UAV tasks and 13 mismatch conditions (with the same domain randomization), AeroBridge-TTA matches a strong PPO-MLP baseline in-distribution while outperforming it in all 5 out-of-distribution (OOD) conditions.
  • The method achieves an average OOD improvement of +22.0 points (62.7% vs. 40.7%), and the overall +8.5 point gain is attributed entirely to the OOD regime.
  • An ablation keeping the model weights fixed but changing the latent update step size shows that the latent update mechanism itself accounts for a 4.6× lift in OOD performance.
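The headline numbers can be cross-checked with a little arithmetic. The sketch below assumes the overall score is an unweighted mean over the 13 conditions, that 5 of them are OOD (the count given in the abstract), and that the in-distribution tie means an in-distribution gain of exactly zero. None of these assumptions is stated in the summary, so treat this as a plausibility check rather than the paper's own accounting.

```python
# Figures quoted in the key points and abstract.
ood_ours, ood_baseline = 62.7, 40.7   # average OOD success rate, in percent
n_conditions, n_ood = 13, 5           # total conditions, OOD conditions

# Average OOD improvement in percentage points.
ood_gain = ood_ours - ood_baseline

# Assumption: equal weight per condition and zero in-distribution gain,
# so the overall gain is the OOD gain diluted by the OOD share.
overall_gain = ood_gain * n_ood / n_conditions

print(f"OOD gain: {ood_gain:.1f} pts")
print(f"Implied overall gain: {overall_gain:.1f} pts")
```

Under these assumptions the implied overall gain rounds to the reported +8.5 points, which is consistent with the claim that the overall gain comes entirely from the OOD regime.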

Abstract

Language-guided unmanned aerial vehicles (UAVs) often fail not from bad reasoning or perception, but from execution mismatch: the gap between a planned trajectory and the controller's ability to track it when the real dynamics differ from training (mass changes, drag shifts, actuator delay, wind). We propose AeroBridge-TTA, a language-conditioned control pipeline that targets this gap with test-time adaptation. It has three parts: a language encoder that maps the command into a subgoal, an adaptive policy conditioned on the subgoal and a learned latent, and a test-time adaptation (TTA) module that updates the latent online from observed transitions. On five language-conditioned UAV tasks under 13 mismatch conditions with the same domain randomization, AeroBridge-TTA ties a strong PPO-MLP baseline in-distribution and wins all 5 out-of-distribution (OOD) conditions, +22.0 pts on average (62.7% vs. 40.7%); the +8.5 pt overall gain comes entirely from the OOD regime. A same-weights ablation that only changes the step size α shows the latent update itself is responsible for a 4.6× OOD lift.
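The abstract does not give the latent update rule, so the following is only a toy illustration of the general idea: a latent updated online from observed transitions with step size α while all other parameters stay fixed. It assumes a hypothetical 1-D dynamics model in which the latent `z` plays the role of inverse mass, and it uses a plain gradient step on the squared one-step prediction error. The function names, the dynamics, and the loss are all assumptions made for illustration, not the paper's method.

```python
def predict_next_velocity(v, u, z, dt=0.1):
    """Toy 1-D dynamics (hypothetical): z stands in for inverse mass."""
    return v + z * u * dt


def update_latent(z, transition, alpha, dt=0.1):
    """One gradient step on the squared one-step prediction error w.r.t. z."""
    v, u, v_next = transition
    error = predict_next_velocity(v, u, z, dt) - v_next
    grad = 2.0 * error * u * dt  # d(error^2)/dz
    return z - alpha * grad


def run_episode(alpha, true_z=0.5, z_init=1.0, steps=200, dt=0.1):
    """Roll out the toy system and adapt z online; returns the final z."""
    z, v = z_init, 0.0
    for t in range(steps):
        # Alternate the control input every 10 steps so velocity stays bounded.
        u = 1.0 if (t // 10) % 2 == 0 else -1.0
        v_next = v + true_z * u * dt  # transition under the real dynamics
        z = update_latent(z, (v, u, v_next), alpha, dt)
        v = v_next
    return z


# Same model, same initial latent; only the step size differs.
z_frozen = run_episode(alpha=0.0)
z_adapted = run_episode(alpha=5.0)
print(f"alpha=0.0 -> z={z_frozen:.3f}, alpha=5.0 -> z={z_adapted:.3f}")
```

With α = 0 the latent never moves from its training-time value, while a positive α pulls it toward the value that explains the observed transitions. This mirrors the structure of the same-weights ablation, where only the step size changes; in a full pipeline the adapted latent would then be fed to the policy alongside the subgoal.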