Real-time video captioning in the browser with LFM2-VL on WebGPU
Reddit r/LocalLLaMA / 3/14/2026
📰 News · Developer Stack & Infrastructure · Tools & Practical Usage · Models & Research

The model runs 100% locally in the browser with Transformers.js. Fun fact: I had to slow down frame capture by 120 ms because the model was too fast! Once I figure out a better UX so users can follow the generated captions more easily (less jumping), we can remove that delay. Suggestions welcome! Online demo (+ source code): https://huggingface.co/spaces/LiquidAI/LFM2-VL-WebGPU
Key Points
- The post showcases an LFM2-VL model running fully locally in-browser, using WebGPU and Transformers.js for real-time video captioning.
- The author added a 120 ms frame-capture delay because the model generated captions faster than viewers could read them, and plans UX improvements to reduce caption jumping so the delay can be removed.
- An online demo with source code is available on Hugging Face Spaces, enabling easy experimentation.
- This demonstrates a browser-centric AI inference workflow with on-device processing, privacy advantages, and web-based deployment.
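The capture-then-throttle loop described above can be sketched as follows. This is a minimal illustration, not the demo's actual source: it assumes the Transformers.js `pipeline` API from `@huggingface/transformers` with an `image-to-text` task and WebGPU device selection; the model id and the `onCaption` callback are placeholders.

```javascript
// Extra delay between frames so captions stay readable (the post's 120 ms).
const CAPTURE_DELAY_MS = 120;

// Pure helper: earliest timestamp (ms) at which the next frame may be captured,
// given when the last capture started and how long inference took.
function nextCaptureTime(lastCaptureMs, inferenceMs, delayMs = CAPTURE_DELAY_MS) {
  return lastCaptureMs + inferenceMs + delayMs;
}

// Browser-only sketch: caption frames from a <video> element in a loop.
// Assumes a model with ONNX weights usable by Transformers.js; the id below
// is a placeholder, not necessarily what the demo loads.
async function runCaptionLoop(video, onCaption) {
  const { pipeline } = await import('@huggingface/transformers');
  const captioner = await pipeline('image-to-text', 'LiquidAI/LFM2-VL-450M', {
    device: 'webgpu', // falls back to WASM if WebGPU is unavailable
  });

  const canvas = document.createElement('canvas');
  const ctx = canvas.getContext('2d');

  for (;;) {
    // Snapshot the current video frame.
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    ctx.drawImage(video, 0, 0);

    // Run captioning on the frame (data URL is an accepted image input).
    const [{ generated_text }] = await captioner(canvas.toDataURL('image/jpeg'));
    onCaption(generated_text);

    // Throttle so the next caption doesn't replace this one too quickly.
    await new Promise((resolve) => setTimeout(resolve, CAPTURE_DELAY_MS));
  }
}
```

Because inference time varies per frame, a fixed post-inference delay (rather than a fixed frame rate) gives a simple lower bound on how long each caption stays on screen.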