DextER: Language-driven Dexterous Grasp Generation with Embodied Reasoning
arXiv cs.RO / 4/28/2026
Tags: Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- DextER is a language-driven model for generating dexterous multi-finger grasps that explicitly reasons about physical contact and hand–object interactions rather than directly mapping visual inputs to grasp parameters.
- The method uses an intermediate, embodiment-aware representation by predicting contact relationships (which finger links contact which parts of the object surface) and then autoregressively generating contact tokens followed by grasp configuration tokens.
- Experiments on DexGYS show strong performance, reaching a 67.14% success rate and outperforming prior state of the art by 3.83 percentage points.
- DextER also markedly improves intention alignment (a reported 96.4% improvement) and supports steerable generation: users can partially specify contacts to guide grasp synthesis toward a desired hand–object interaction.
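The two-stage decoding described above can be sketched as a small decode loop. This is an illustrative sketch only: the function names, token counts (`N_CONTACT`, `N_CONFIG`), and the toy stand-in model are assumptions, not the paper's actual vocabulary or architecture. The key ideas it shows are (1) contact tokens are generated before grasp configuration tokens, and (2) steerability falls out naturally by pre-filling part of the contact-token prefix.

```python
# Hypothetical sketch of DextER-style two-stage autoregressive decoding:
# contact tokens (finger link / object part pairings) come first, then
# grasp configuration tokens. Token counts and names are illustrative.
from typing import Callable, List, Sequence

N_CONTACT = 5   # assumed number of contact tokens (e.g. one per finger link)
N_CONFIG = 3    # assumed number of grasp configuration tokens

def decode_grasp(
    next_token: Callable[[Sequence[int]], int],
    partial_contacts: Sequence[int] = (),
) -> List[int]:
    """Autoregressively emit contact tokens, then configuration tokens.

    `partial_contacts` pre-fills the leading contact tokens, which is how
    partial contact specification can steer the generated grasp.
    """
    seq: List[int] = list(partial_contacts)
    # Stage 1: complete the contact tokens, conditioned on any user prefix.
    while len(seq) < N_CONTACT:
        seq.append(next_token(seq))
    # Stage 2: generate grasp configuration tokens, conditioned on contacts.
    while len(seq) < N_CONTACT + N_CONFIG:
        seq.append(next_token(seq))
    return seq

# Toy stand-in for a trained model: next token = sum of prefix mod 10.
toy_model = lambda prefix: sum(prefix) % 10

full = decode_grasp(toy_model)                               # fully model-driven
steered = decode_grasp(toy_model, partial_contacts=[2, 7])   # two contacts fixed by the user
```

In a real model, `next_token` would sample from a transformer conditioned on the language instruction and the object observation; the point of the sketch is only the ordering of the two token stages and the prefix-based steering.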