RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation
arXiv cs.CL / 3/30/2026
Key Points
- The paper introduces RealChart2Code, a new large-scale benchmark (2,800+ instances) for evaluating chart-to-code generation by vision-language models (VLMs), built on authentic real-world datasets with analytical intent.
- It emphasizes two challenging settings that prior benchmarks often miss: generating charts from large-scale raw data and improving code through iterative multi-turn conversations.
- An evaluation of 14 leading VLMs shows substantial performance drops versus simpler benchmarks, indicating difficulty with complex plot structures and faithful replication from real data.
- The authors find a notable performance gap between proprietary models and open-weight models, and report that even state-of-the-art systems frequently fail on intricate multi-panel chart replication.
- The benchmark and associated code are released publicly to support follow-on research into chart generation, grounding, and multi-step code refinement.
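To make the multi-turn setting concrete, here is a minimal Python sketch of an iterative refinement loop: a model proposes code, the code is executed, and any error is fed back as the next turn. This is an illustration only and not the paper's harness. The names `run_candidate`, `refine`, and `stub_model` are hypothetical, the stub stands in for a real VLM call, and the loop checks only that the code runs, which says nothing about how the benchmark scores chart fidelity.

```python
import traceback
from typing import Callable, List, Optional, Tuple


def run_candidate(code: str) -> Optional[str]:
    """Execute candidate code; return None on success, else the error text.

    Assumption: a real harness would run this in a sandboxed subprocess.
    """
    try:
        exec(compile(code, "<candidate>", "exec"), {"__name__": "__candidate__"})
        return None
    except Exception:
        return traceback.format_exc(limit=1)


def refine(
    generate: Callable[[List[str]], str],
    task: str,
    max_turns: int = 3,
) -> Tuple[str, int, bool]:
    """Ask `generate` for code, feeding back errors until it runs or turns run out."""
    history: List[str] = [task]
    code = ""
    for turn in range(1, max_turns + 1):
        code = generate(history)
        error = run_candidate(code)
        if error is None:
            return code, turn, True
        # Append the failed attempt and its error so the next turn can repair it.
        history.append(code)
        history.append("The code failed with:\n" + error)
    return code, max_turns, False


def stub_model(history: List[str]) -> str:
    """Hypothetical stand-in for a VLM: buggy first attempt, fixed second attempt."""
    if len(history) == 1:
        return "totals = [1, 2, 3]\nmean = sum(totals) / len(total)\n"  # typo: total
    return "totals = [1, 2, 3]\nmean = sum(totals) / len(totals)\n"


final_code, turns, ok = refine(stub_model, "Compute the mean of totals for the chart")
```

With the stub above, the first attempt raises a `NameError`, the error text is added to the conversation, and the second attempt runs cleanly. In practice the success check would be replaced by whatever comparison against the reference chart the evaluation calls for.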