PSPA-Bench: A Personalized Benchmark for Smartphone GUI Agent
arXiv cs.AI / 4/1/2026
Key Points
- The paper introduces PSPA-Bench, a new benchmark designed to evaluate how well smartphone GUI agents personalize their assistance to individual user workflows and preferences rather than providing generic solutions.
- PSPA-Bench comprises more than 12,855 personalized instructions covering 10 daily-use scenarios and 22 mobile apps, and it evaluates agents with a structure-aware process evaluation method for fine-grained measurement.
- Experiments benchmark 11 state-of-the-art GUI agents and find that existing approaches perform poorly in personalized settings, with even the best agent showing limited success.
- The analysis yields three findings that point to directions for improvement: reasoning-focused models tend to outperform general-purpose LLMs, perception is a simple yet critical capability, and reflection combined with long-term memory can improve adaptation.