ICPRL: Acquiring Physical Intuition from Interactive Control
arXiv cs.LG / 3/17/2026
Key Points
- The paper introduces In-Context Physical Reinforcement Learning (ICPRL), a framework that lets vision-language models acquire physical intuition by conditioning on past interactive experience, without requiring any weight updates.
- The method trains a vision-grounded policy via multi-turn Group Relative Policy Optimization (GRPO) over diverse multi-episode histories, and separately trains a world model to predict action outcomes (a sketch of GRPO's group-relative advantage follows this list).
- At inference time, the policy proposes candidate actions and the world model predicts their outcomes to guide a root-node PUCT search, which selects the most promising action (see the search sketch below).
- On the DeepPHY benchmark, ICPRL achieves significant improvements in both the policy-only and world-model-augmented setups, and demonstrates transfer to unseen physical environments.
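
The core of GRPO is that it scores each rollout against the other rollouts in its sampling group rather than against a learned value baseline. The paper's multi-turn, multi-episode batching is not reproduced here; this is a minimal sketch assuming one scalar reward per rollout, with `group_relative_advantages` being an illustrative name rather than anything from the paper.

```python
import statistics

def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each rollout's reward against its own group's mean and std.

    In GRPO, these normalized scores replace a critic's advantage estimates:
    rollouts that beat their group get positive advantage, the rest negative.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Example: four rollouts sampled for the same episode history.
print(group_relative_advantages([1.0, 0.0, 0.5, 1.0]))
# -> roughly [0.9, -1.5, -0.3, 0.9]
```

The root-node PUCT step can likewise be sketched in a few lines. The interfaces below are assumptions, not the paper's API: `policy.propose_actions(state)` is taken to return candidate actions with prior probabilities, and `world_model.predict_return(state, action)` to return a scalar value for the predicted outcome; the exploration constant `C_PUCT` and simulation budget are placeholder values.

```python
import math
from dataclasses import dataclass

C_PUCT = 1.0  # exploration constant (assumed value, not from the paper)

@dataclass
class RootStats:
    prior: float          # policy's prior probability for this action
    visits: int = 0
    value_sum: float = 0.0

    @property
    def q(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0

def puct_select(stats: dict) -> object:
    """Pick the action maximizing Q(a) + c * P(a) * sqrt(N) / (1 + N(a))."""
    total = sum(s.visits for s in stats.values())
    def score(item):
        _, s = item
        # +1 under the sqrt keeps the exploration bonus nonzero at the start.
        return s.q + C_PUCT * s.prior * math.sqrt(total + 1) / (1 + s.visits)
    return max(stats.items(), key=score)[0]

def root_search(state, policy, world_model, n_sims: int = 32):
    # Policy proposes candidate actions; only the root node is expanded.
    stats = {a: RootStats(prior=p) for a, p in policy.propose_actions(state)}
    for _ in range(n_sims):
        a = puct_select(stats)
        # World model scores the predicted outcome of taking `a` in `state`.
        stats[a].visits += 1
        stats[a].value_sum += world_model.predict_return(state, a)
    # Act with the most-visited (i.e., most promising) candidate.
    return max(stats.items(), key=lambda kv: kv[1].visits)[0]
```

Restricting the search to the root node keeps the world model's rollout horizon to a single step, which is why the policy's priors carry most of the weight early on and the model's outcome predictions refine the choice as visits accumulate.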