AI Navigate

Real-time video captioning in the browser with LFM2-VL on WebGPU

Reddit r/LocalLLaMA / 3/14/2026

📰 News · Developer Stack & Infrastructure · Tools & Practical Usage · Models & Research

Key Points

  • The post showcases an LFM2-VL model running fully offline in the browser, using WebGPU and Transformers.js for real-time video captioning.
  • The author added a 120ms frame-capture delay because the model generated captions faster than users could read them, and plans UX improvements (less caption jumping) so the delay can be removed.
  • An online demo with source code is available on HuggingFace Spaces, enabling easy experimentation.
  • This demonstrates a browser-centric AI inference workflow with on-device processing, privacy advantages, and web-based deployment.

The model runs 100% locally in the browser with Transformers.js. Fun fact: I had to slow down frame capturing by 120ms because the model was too fast! Once I figure out a better UX so users can follow the generated captions more easily (less jumping), we can remove that delay. Suggestions welcome!
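The capture-throttling idea described above can be sketched as follows. This is a minimal illustration, not the demo's actual source: the model id and the exact pipeline task/output shape are assumptions, and the browser-only parts (WebGPU, `<video>`, canvas, `requestAnimationFrame`) are shown but not executed here. Only the small gate helper, which enforces the 120ms minimum interval between captioned frames, is plain logic.

```javascript
// In the browser one would load the model via Transformers.js, e.g.
// (model id illustrative, task/output shape may differ for LFM2-VL):
//
//   import { pipeline } from '@huggingface/transformers';
//   const captioner = await pipeline(
//     'image-to-text', 'onnx-community/LFM2-VL-ONNX',
//     { device: 'webgpu' });

// Gate helper: returns a function that accepts a timestamp (ms) and
// answers true only when at least `minIntervalMs` have elapsed since
// the last accepted frame -- this is the "slow down by 120ms" trick.
function makeFrameGate(minIntervalMs) {
  let last = -Infinity;
  return (nowMs) => {
    if (nowMs - last >= minIntervalMs) {
      last = nowMs;
      return true;
    }
    return false;
  };
}

// Capture loop sketch (browser only): draw the current <video> frame
// onto a canvas and caption it whenever the gate opens.
async function captionLoop(video, canvas, captioner, gate) {
  const ctx = canvas.getContext('2d');
  const tick = async (t) => {
    if (gate(t)) {
      ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
      const frame = canvas.toDataURL('image/jpeg');
      // Output shape is an assumption; check the pipeline docs.
      const [out] = await captioner(frame);
      console.log(out.generated_text);
    }
    requestAnimationFrame(tick);
  };
  requestAnimationFrame(tick);
}
```

With `makeFrameGate(120)`, frames arriving sooner than 120ms after the last captioned one are simply skipped, which keeps the on-screen caption stable without blocking the render loop.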

Online demo (+ source code): https://huggingface.co/spaces/LiquidAI/LFM2-VL-WebGPU

submitted by /u/xenovatech