So I have been seeing more of those pelican-on-a-bike SVG tests, and while they work, I feel like (and maybe you guys do too) they are getting kinda benchmaxxed. We should switch things up soon, and this is my idea.
Gemini 3.1 Pro gave me this, and DeepSeek Expert Mode this. Models tested:
- DeepSeek Expert (official website)
- GLM 5.1 (hosted on unofficial cloud)
- MiniMax 2.7 (hosted on unofficial cloud)
- Kimi K2.5 (don't have access to 2.6 / budget was limited, so I used it via the official website)
- Claude Sonnet 4.6 (official website, and yes, probably a quantized version)
- Claude Sonnet 4.6 (normal thinking / official website)
- Qwen 3.6 Plus (official website)
Guys we have to change the pelican test
Reddit r/LocalLLaMA / 4/15/2026
💬 Opinion · Signals & Early Trends · Ideas & Deep Analysis · Tools & Practical Usage
Key Points
- The post argues that the common “pelican on a bike” SVG test is becoming overused (“benchmaxxed”) and suggests switching to a different test.
- The author proposes using a more varied prompt—generating an SVG of “a horse sitting in an F1 race car”—as a new benchmark-style check.
- Multiple AI models are reported to have generated outputs for the proposed SVG prompt, including Gemini 3.1 Pro, DeepSeek (expert mode), GLM 5.1, MiniMax 2.7, Kimi K2.5, Claude Sonnet 4.6, and Qwen 3.6 Plus.
- The overall thrust is community-driven experimentation to stress-test image/SVG generation capabilities and reduce reliance on a single fixed prompt.
- The post implicitly highlights cross-model variability and the usefulness of prompt diversity when evaluating generative performance.
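One practical way to make these ad-hoc SVG prompts comparable across models is to run each reply through the same automated sanity check before eyeballing the drawings. Below is a minimal sketch of such a check: it verifies that a model's reply parses as XML with an `<svg>` root. The model call itself is out of scope here, and the function name and sample reply are illustrative, not from the post.

```python
import xml.etree.ElementTree as ET

# The proposed replacement prompt from the post.
PROMPT = "Generate an SVG of a horse sitting in an F1 race car"

def is_valid_svg(reply: str) -> bool:
    """Return True if a model's reply parses as XML with an <svg> root element."""
    try:
        root = ET.fromstring(reply.strip())
    except ET.ParseError:
        return False
    # ElementTree includes the namespace in the tag,
    # e.g. '{http://www.w3.org/2000/svg}svg', so compare the local name only.
    return root.tag.split("}")[-1] == "svg"

# A trivial hand-written reply that passes the check:
sample_reply = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">'
    '<circle cx="5" cy="5" r="4"/></svg>'
)
print(is_valid_svg(sample_reply))   # True
print(is_valid_svg("not an svg"))   # False
```

A harness like this only filters out malformed output; judging whether the horse actually sits in the race car still requires rendering the SVG and looking at it.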
Related Articles

Are gamers being used as free labeling labor? The rise of "Simulators" that look like AI training grounds [D]
Reddit r/MachineLearning

I built a trading intelligence MCP server in 2 days — here's how
Dev.to

Voice-Controlled AI Agent Using Whisper and Local LLM
Dev.to