VISION-SLS: Safe Perception-Based Control from Learned Visual Representations via System Level Synthesis

arXiv cs.LG / April 29, 2026


Key Points

  • VISION-SLS is a control method that uses high-resolution RGB images to compute nonlinear output-feedback control with robust constraint-satisfaction guarantees under calibrated uncertainty.
  • The approach combines a learned low-dimensional observation map built from pretrained visual features (with state-dependent error bounds) and a causal affine time-varying output-feedback policy optimized via System Level Synthesis (SLS).
  • The authors introduce a scalable solver for the resulting nonconvex optimization problem, combining sequential convex programming with efficient Riccati recursions.
  • Experiments on a simulated 4D car, a 10D quadrotor, and a 59D humanoid with partial observability show safe, information-gathering behavior and constraint satisfaction under empirically calibrated error bounds.
  • Hardware validation demonstrates safe ground-vehicle control from onboard images, with improved safety rate and solve time versus baselines, and the code is published on GitHub.
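To make the policy class in the second key point concrete, here is a minimal NumPy sketch of a causal affine time-varying output-feedback policy, u_t = v_t + sum over k ≤ t of K[t][k] @ y_k, rolled out on a toy double-integrator with noisy observations. The system matrices, gains, and noise level are hypothetical placeholders for illustration; they are not the policy synthesized by VISION-SLS.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # toy dynamics (double integrator)
B = np.array([[0.0], [0.1]])             # input map
C = np.array([[1.0, 0.0]])               # observe position only

# Hypothetical causal gains: K[t][k] is the gain from measurement y_k to u_t,
# defined only for k <= t (lower-triangular structure enforces causality).
K = [[-0.5 * np.eye(1) for _ in range(t + 1)] for t in range(T)]
v = [np.zeros(1) for _ in range(T)]      # affine feedforward terms

x = np.array([1.0, 0.0])
ys = []
for t in range(T):
    y = C @ x + 0.01 * rng.standard_normal(1)   # noisy measurement
    ys.append(y)
    # Causality: u_t depends only on y_0, ..., y_t
    u = v[t] + sum(K[t][k] @ ys[k] for k in range(t + 1))
    x = A @ x + B @ u
```

In an SLS formulation, these gains are not free variables chosen by hand as above; they are parameterized through closed-loop system responses, which is what makes constraint satisfaction tractable to certify.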

Abstract

We propose VISION-SLS, a method for nonlinear output-feedback control from high-resolution RGB images that provides robust constraint-satisfaction guarantees under calibrated uncertainty bounds despite partial observability, sensor noise, and nonlinear dynamics. To enable scalability while retaining guarantees, we propose: (i) a learned low-dimensional observation map built from pretrained visual features with state-dependent error bounds, and (ii) a causal affine time-varying output-feedback policy optimized via System Level Synthesis (SLS). We develop a novel, scalable solver for the resulting nonconvex program that couples sequential convex programming with efficient Riccati recursions. On two simulated visuomotor tasks (a 4D car and a 10D quadrotor) with image observations of at least 512 × 512 pixels, and on a 59D humanoid task with partial observability, our method enables safe, information-gathering behavior that reduces uncertainty while guaranteeing constraint satisfaction under empirically calibrated error bounds. We also validate our method on hardware, safely controlling a ground vehicle from onboard images and outperforming baselines in safety rate and solve time. Together, these results show that learned visual abstractions coupled with an efficient solver make SLS-based safe visuomotor output-feedback practical at scale. An implementation of our method is available at https://github.com/trustworthyrobotics/VISION-SLS.
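The Riccati recursions mentioned in the abstract refer to a standard building block of efficient trajectory-optimization solvers. As a hedged illustration (the paper's actual solver is not reproduced here), the sketch below runs the backward Riccati recursion for a finite-horizon discrete-time LQR problem, producing time-varying feedback gains in O(T) time rather than solving one large dense program; the specific A, B, Q, R matrices are arbitrary examples.

```python
import numpy as np

def riccati_gains(A, B, Q, R, Qf, T):
    """Backward Riccati recursion for finite-horizon LQR.

    Dynamics x_{t+1} = A x_t + B u_t, stage cost x'Qx + u'Ru,
    terminal cost x'Qf x. Returns gains K_t (u_t = -K_t x_t) and
    the cost-to-go matrix at t = 0.
    """
    P = Qf
    gains = []
    for _ in range(T):
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)   # optimal gain at this step
        P = Q + A.T @ P @ (A - B @ K)          # Riccati update
        gains.append(K)
    return gains[::-1], P

# Example system: double integrator with arbitrary illustrative weights.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.eye(1)
Qf = 10 * np.eye(2)
Ks, P0 = riccati_gains(A, B, Q, R, Qf, T=20)
```

Each sequential-convex-programming iteration of a solver like the one described can reuse a recursion of this shape on a convexified subproblem, which is what keeps per-iteration cost linear in the horizon length.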