How to Train your Tactile Model: Tactile Perception with Multi-fingered Robot Hands

arXiv cs.RO / 4/2/2026


Key Points

  • The paper addresses a scalability problem in tactile sensing for multi-fingered robot hands, where contact-property inference currently depends on CNNs trained on large, sensor-specific datasets.
  • It proposes TacViT, a Vision-Transformer-based tactile perception model that uses global self-attention to learn features that generalize across tactile sensors despite differences in lens characteristics, illumination, and wear.
  • The model is evaluated on tactile sensors for a five-fingered robot hand and is reported to outperform CNN-based approaches in out-of-distribution sensor generalization.
  • By reducing the need for data collection and retraining when new tactile sensors are deployed, TacViT aims to accelerate practical, real-world robotic manipulation workflows.

Abstract

Rapid deployment of new tactile sensors is essential for scalable robotic manipulation, especially in multi-fingered hands equipped with vision-based tactile sensors. However, current methods for inferring contact properties rely heavily on convolutional neural networks (CNNs), which, while effective on known sensors, require large, sensor-specific datasets. Furthermore, they require retraining for each new sensor due to differences in lens properties, illumination, and sensor wear. Here we introduce TacViT, a novel tactile perception model based on Vision Transformers, designed to generalize to data from new sensors. TacViT leverages global self-attention mechanisms to extract robust features from tactile images, enabling accurate contact property inference even on previously unseen sensors. This capability significantly reduces the need for data collection and retraining, accelerating the deployment of new sensors. We evaluate TacViT on sensors for a five-fingered robot hand and demonstrate its superior generalization performance compared to CNNs. Our results highlight TacViT's potential to make tactile sensing more scalable and practical for real-world robotic applications.
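To make the idea concrete, here is a minimal sketch of what a ViT-style tactile encoder of this kind could look like: a tactile image is split into patches, processed with global self-attention, and mapped to a contact-property vector. This is an illustrative toy model in PyTorch, not the paper's actual TacViT architecture; all layer sizes, the `TinyTacViT` name, and the 3-dimensional contact-force output are assumptions for illustration.

```python
# Illustrative sketch only -- NOT the paper's TacViT architecture.
# A ViT-style encoder: patchify a tactile image, apply global
# self-attention over all patches, predict a contact-property vector.
import torch
import torch.nn as nn

class TinyTacViT(nn.Module):
    def __init__(self, img_size=64, patch=8, dim=64, depth=2, heads=4, out_dim=3):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # Conv with stride == kernel size splits the image into patch embeddings.
        self.patchify = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))            # [CLS] token
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, dim))  # positions
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # Hypothetical head: e.g. a 3-D contact force (Fx, Fy, Fz).
        self.head = nn.Linear(dim, out_dim)

    def forward(self, x):
        # x: (B, 3, H, W) tactile image from a vision-based sensor
        tokens = self.patchify(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        cls = self.cls.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos
        tokens = self.encoder(tokens)   # global self-attention across patches
        return self.head(tokens[:, 0])  # predict from the [CLS] token

model = TinyTacViT()
out = model(torch.randn(2, 3, 64, 64))  # batch of 2 tactile images
print(out.shape)  # torch.Size([2, 3])
```

The global self-attention step is what distinguishes this from a CNN: every patch attends to every other patch, so features are not tied to local convolutional filters that may overfit sensor-specific artifacts such as lens distortion or illumination patterns.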