
Multimodal AI models like GPT-5, Gemini 3 Pro, and Claude Opus 4.5 generate detailed image descriptions and medical diagnoses even when no image is provided. A Stanford study shows that common benchmarks obscure the problem.
The article "AI models confidently describe images they never saw, and benchmarks fail to catch it" appeared first on The Decoder.


