You're Pushing My Buttons: Instrumented Learning of Gentle Button Presses

arXiv cs.RO / 4/8/2026


Key Points

  • The paper tackles the challenge of learning contact-rich robotic manipulation from cameras and proprioception alone, noting that contact events are only partially observed in such modalities.
  • It proposes training-time instrumentation by “sensorising” the environment: a microphone fingertip records contact-relevant audio, and an instrumented button-state signal serves as privileged supervision for fine-tuning an audio encoder into a contact event detector.
  • The learned audio representation is integrated with imitation learning using three fusion strategies, while keeping deployment fully independent of the instrumentation so the policy uses only vision and audio at inference.
  • Across methods, button-press success rates are comparable, but the instrumentation-guided audio representations consistently lower contact force, indicating improved interaction quality rather than only task completion.
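
The distinction in the last point, task completion versus interaction quality, can be made concrete with a small sketch. Everything here is an assumption for illustration: the `Episode` record, the use of per-episode peak force as the force summary, and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Episode:
    """Hypothetical log of one button-press attempt."""
    pressed: bool          # whether the button registered a press
    forces_n: List[float]  # contact force samples in newtons


def success_rate(episodes: List[Episode]) -> float:
    """Fraction of episodes in which the button was pressed."""
    return mean(1.0 if e.pressed else 0.0 for e in episodes)


def mean_peak_force(episodes: List[Episode]) -> float:
    """Average of the per-episode maximum contact force (assumed metric)."""
    return mean(max(e.forces_n) for e in episodes if e.forces_n)
```

With made-up logs such as `[Episode(True, [1.0, 4.0]), Episode(True, [2.0, 6.0])]` for one policy and `[Episode(True, [1.0, 2.0]), Episode(True, [0.5, 3.0])]` for another, both policies score a success rate of 1.0 while the second has the lower mean peak force. That is the kind of difference a success-only evaluation would miss.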

Abstract

Learning contact-rich manipulation is difficult from cameras and proprioception alone because contact events are only partially observed. We test whether training-time instrumentation, i.e., object sensorisation, can improve policy performance without creating deployment-time dependencies. Specifically, we study button pressing as a testbed and use a microphone fingertip to capture contact-relevant audio. We use an instrumented button-state signal as privileged supervision to fine-tune an audio encoder into a contact event detector. We combine the resulting representation with imitation learning using three strategies, such that the policy only uses vision and audio during inference. Button press success rates are similar across methods, but instrumentation-guided audio representations consistently reduce contact force. These results support instrumentation as a practical training-time auxiliary objective for learning contact-rich manipulation policies.
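
As a rough illustration of privileged supervision, the sketch below trains a contact detector from audio features using button-state labels that exist only at training time. It is a toy stand-in, not the paper's method: the single "energy" feature, the synthetic data generator, the logistic-regression model, and all function names are assumptions made for this example, whereas the paper fine-tunes an audio encoder.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_toy_frames(n, rng):
    """Synthetic (audio_energy, button_state) pairs; a stand-in for real recordings."""
    frames = []
    for _ in range(n):
        pressed = rng.random() < 0.5
        # Assumption: frames recorded during contact carry more audio energy.
        energy = rng.gauss(2.0 if pressed else 0.0, 0.5)
        frames.append((energy, 1.0 if pressed else 0.0))
    return frames


def train_contact_detector(frames, lr=0.5, epochs=500):
    """Fit p(contact | energy) by gradient descent on binary cross-entropy.

    The button-state label is the privileged signal: it is used here,
    at training time, and is not needed by predict_contact afterwards.
    """
    w, b = 0.0, 0.0
    n = len(frames)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in frames:
            err = sigmoid(w * x + b) - y  # gradient of BCE w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict_contact(energy, w, b):
    """Inference uses audio only; no button-state signal is required."""
    return sigmoid(w * energy + b) >= 0.5


def accuracy(frames, w, b):
    hits = sum(predict_contact(x, w, b) == (y == 1.0) for x, y in frames)
    return hits / len(frames)
```

The design point this is meant to show is the asymmetry between training and deployment: `train_contact_detector` consumes the instrumented label, while `predict_contact` takes audio alone, so removing the instrumented button after training leaves inference unchanged.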