DeepSeek adds AI vision in major move: ‘the whale can now see’
SCMP Tech / 4/29/2026
Key Points
- DeepSeek has introduced an “image recognition mode” in its chat interface, expanding the product beyond text-only or chat-style interactions.
- The new vision capability is positioned as a third chat mode alongside existing “expert” and “flash” modes.
- The update signals an ongoing shift toward multimodal assistants, where users can input images and get recognition/understanding responses.
- This change may affect how developers and teams integrate DeepSeek into workflows that require visual interpretation (e.g., analysis, search, or moderation).
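For teams exploring that kind of integration, the request shape would likely follow the OpenAI-compatible convention DeepSeek already uses for text. This is a hypothetical sketch only: the article does not document an API, so the content-part schema, the `image_url` field, and the data-URL encoding are all assumptions borrowed from the common multimodal chat format, not confirmed DeepSeek behavior.

```python
import base64


def build_vision_message(prompt: str, image_bytes: bytes,
                         mime: str = "image/png") -> dict:
    """Build an OpenAI-style multimodal chat message.

    ASSUMPTION: DeepSeek's new vision mode is not publicly documented in
    the article; this mirrors the widely used "content parts" schema
    (text part + image_url part with a base64 data URL).
    """
    data_url = f"data:{mime};base64," + base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


# Example: pair a question with raw image bytes read from disk.
msg = build_vision_message("What is in this image?", b"\x89PNG...")
print(msg["content"][1]["type"])  # image_url
```

If DeepSeek publishes an official schema for the vision mode, the message builder above is the only piece that would need to change; the surrounding chat-completion plumbing stays the same.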
Chinese artificial intelligence start-up DeepSeek has added multimodal capabilities to its flagship chatbot for the first time – meaning that it can process images and video in addition to text – bringing it in line with rivals that already offer the function.
The limited release to select users comes just days after the Hangzhou-based company released its new flagship model V4, which was followed by extensive price cuts.
According to DeepSeek multimodal team leader Chen Xiaokang, who made the...
Continue reading this article on the original site.