AeSlides: Incentivizing Aesthetic Layout in LLM-Based Slide Generation via Verifiable Rewards
arXiv cs.CV / 4/28/2026
Key Points
- AeSlides addresses the “modality gap” in LLM slide generation by adding explicit, aesthetic-layout supervision rather than relying on text-only training or costly visual reflection.
- The framework proposes a set of carefully designed, verifiable metrics that quantify slide layout quality (e.g., aspect ratio compliance, whitespace usage, element collisions, and visual balance) with low inference cost.
- It uses GRPO-based reinforcement learning to directly optimize slide-generation models for aesthetically coherent layouts using these verifiable rewards.
- Experiments show that with only 5K training prompts on GLM-4.7-Flash, AeSlides substantially improves layout outcomes (aspect ratio compliance 36%→85%) and reduces layout defects (whitespace −44%, collisions −43%, imbalance −28%).
- Human evaluation indicates a clear overall quality gain (3.31→3.56, +7.6%), and the approach outperforms other reward-based and agentic methods, even slightly surpassing Claude-Sonnet-4.5; the code is released on GitHub.
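To make the "verifiable metrics" idea concrete, here is a minimal sketch of how layout checks like the ones named above (element collisions, whitespace usage, visual balance) can be computed deterministically from bounding boxes and combined into a scalar reward. The box representation, target coverage ratio, and weighting are illustrative assumptions, not the paper's actual formulas.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Axis-aligned slide element: (x, y) top-left corner plus width/height.
    Hypothetical representation for illustration."""
    x: float
    y: float
    w: float
    h: float

def overlap_area(a: Box, b: Box) -> float:
    """Intersection area of two boxes; 0.0 if they do not collide."""
    dx = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    dy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return max(dx, 0.0) * max(dy, 0.0)

def layout_reward(boxes, slide_w=1280.0, slide_h=720.0):
    """Combine verifiable layout checks into a scalar reward in [0, 1]."""
    # 1. Collisions: fraction of element pairs that overlap.
    pairs = [(a, b) for i, a in enumerate(boxes) for b in boxes[i + 1:]]
    collisions = sum(1 for a, b in pairs if overlap_area(a, b) > 0)
    collision_score = 1.0 - collisions / max(len(pairs), 1)

    # 2. Whitespace: penalize covering too little or too much of the slide
    #    (the 50% coverage target is an assumption for this sketch).
    covered = sum(b.w * b.h for b in boxes) / (slide_w * slide_h)
    whitespace_score = max(1.0 - abs(covered - 0.5) * 2, 0.0)

    # 3. Balance: area-weighted horizontal center of mass should sit
    #    near the slide center.
    total = sum(b.w * b.h for b in boxes) or 1.0
    cx = sum((b.x + b.w / 2) * b.w * b.h for b in boxes) / total
    balance_score = 1.0 - abs(cx - slide_w / 2) / (slide_w / 2)

    return (collision_score + whitespace_score + balance_score) / 3
```

Because every term is a pure function of the rendered geometry, the reward is cheap to evaluate at scale, which is what lets it replace costly visual reflection during training.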
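On the optimization side, GRPO's defining step is computing advantages relative to a group of sampled completions for the same prompt, rather than from a learned value function. A minimal sketch of that group-relative normalization (details such as the epsilon and population-variance choice are assumptions, not taken from the paper):

```python
def grpo_advantages(rewards, eps=1e-6):
    """Z-score each sampled slide's reward against the other completions
    drawn for the same prompt (the GRPO group-relative baseline)."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]
```

Slides scoring above the group mean on the verifiable layout reward get positive advantages and are reinforced; below-average layouts are pushed down, steering the model toward aesthetically coherent output without any visual critic.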