DeepSeek adds AI vision in major move: ‘the whale can now see’

SCMP Tech / 4/29/2026


Key Points

  • DeepSeek has added multimodal AI capabilities to its flagship chatbot by introducing a new “image recognition mode” that can process images and video in addition to text.
  • The feature is being rolled out as a limited release to selected users, expanding the chat experience beyond existing modes like “expert” and “flash.”
  • The update comes just days after DeepSeek released its new flagship model V4 and followed it with extensive price cuts, signaling a rapid iteration cycle.
  • DeepSeek’s multimodal team leader, Chen Xiaokang, is quoted as framing the upgrade as a major step forward in the system’s ability to “see.”

Chinese artificial intelligence start-up DeepSeek has added multimodal capabilities to its flagship chatbot for the first time – meaning that it can process images and video in addition to text – bringing it in line with rivals that already offer the function. The limited release to select users comes just days after the Hangzhou-based company released its new flagship model V4, which was followed by extensive price cuts. According to DeepSeek multimodal team leader Chen Xiaokang, who made the...
