
Alibaba has released Qwen3.5-Omni, an omnimodal AI model that processes text, images, audio, and video. It claims to beat Gemini 3.1 Pro on audio tasks and picked up an unexpected trick along the way: writing code from spoken instructions and video input.
The article Qwen3.5-Omni learned to write code from spoken instructions and video without anyone training it to appeared first on The Decoder.




