Today, we release LFM2.5-VL-450M, our most capable vision-language model for edge deployment. It processes a 512×512 image in 240 ms, fast enough to reason about every frame of a 4 FPS video stream. It builds on LFM2-VL-450M with three new capabilities: bounding box prediction, multilingual visual understanding across 9 languages, and function calling.
Most production vision systems are still multi-stage: a detector, a classifier, and heuristic logic on top. This model does it in one pass.
It runs on Jetson Orin, Samsung S25 Ultra, and AMD 395+ Max. Open-weight and available now on Hugging Face, LEAP, and our Playground. HF model checkpoint: https://huggingface.co/LiquidAI/LFM2.5-VL-450M
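The 4 FPS claim follows directly from the quoted per-frame latency. A minimal sketch of the arithmetic (the 240 ms figure comes from the announcement; the helper name is ours, not part of any Liquid AI API):

```python
# Frame-budget check: how many frames per second fit when each
# 512x512 image takes ~240 ms to process (latency from the announcement)?

def max_fps(latency_ms: float) -> float:
    """Return the maximum sustainable frame rate for a given per-frame latency."""
    return 1000.0 / latency_ms

budget = max_fps(240)  # ~4.17 FPS, which comfortably covers a 4 FPS stream
```

Note this is a sequential-processing bound; real throughput also depends on capture, preprocessing, and any batching on the device.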
Liquid AI releases LFM2.5-VL-450M - structured visual understanding at 240ms
Reddit r/LocalLLaMA / 4/9/2026
Key Points
- Liquid AI has released LFM2.5-VL-450M, a vision-language model aimed at edge deployment that processes a 512×512 image in about 240 ms, fast enough for ~4 FPS video streams.
- The model improves on LFM2-VL-450M by adding bounding box prediction, multilingual visual understanding across 9 languages (reported MMMB score gains), and function-calling support.
- It aims to simplify production vision pipelines by performing object localization, contextual reasoning, and structured output generation in a single on-device pass rather than multi-stage detector/classifier plus heuristics.
- LFM2.5-VL-450M runs on devices including NVIDIA Jetson Orin, Samsung S25 Ultra, and AMD 395+ Max, and is released as open-weight via Hugging Face and Liquid AI’s distribution channels.
- The release provides immediate availability through an HF checkpoint and accompanying blog post, enabling developers to evaluate and integrate structured visual understanding on edge hardware.
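The "structured output generation" point above implies the model emits machine-parseable results rather than free text. The release does not document the schema in this digest, so the JSON shape, field names, and `Detection` class below are illustrative assumptions only, sketching how a single-pass detection payload might be consumed downstream:

```python
import json
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    box: tuple        # (x1, y1, x2, y2) in pixel coordinates -- assumed convention
    confidence: float

def parse_detections(raw: str) -> list:
    """Parse a hypothetical JSON detection payload into typed records."""
    payload = json.loads(raw)
    return [
        Detection(d["label"], tuple(d["box"]), d["confidence"])
        for d in payload["detections"]
    ]

# Example payload shaped like what a single-pass VLM might return (illustrative):
sample = '{"detections": [{"label": "forklift", "box": [40, 60, 200, 220], "confidence": 0.91}]}'
dets = parse_detections(sample)
```

The appeal of the single-pass design is that this parsing step replaces the detector/classifier/heuristics stack: one model call yields localization, labels, and confidences together. Consult the model card for the actual output format.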