FEEL (Force-Enhanced Egocentric Learning): A Dataset for Physical Action Understanding

arXiv cs.CV / 3/18/2026

Key Points

  • FEEL (Force-Enhanced Egocentric Learning) is the first large-scale dataset pairing force measurements from custom piezoresistive gloves with egocentric video to enable force-informed physical action understanding.
  • It contains approximately 3 million force-synchronized frames of natural, unscripted kitchen manipulation, with 45% of frames involving hand-object contact; a sketch of how such force-to-frame synchronization might work follows this list.
  • FEEL supports two task families: (1) contact understanding via temporal contact segmentation and pixel-level segmentation of contacted objects, and (2) action representation learning with force prediction as a self-supervised pretraining objective for video backbones.
  • The work reports state-of-the-art results on temporal contact segmentation, competitive pixel-level segmentation, and transfer gains, obtained without manual labels, on action understanding tasks across EPIC-Kitchens, Something-Something V2, Ego-Exo4D, and MECCANO.
  • By treating force as a core primitive of physical interaction, FEEL enables scalable data collection and improved generalization for action understanding models.
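
The synchronization mentioned above can be thought of as nearest-timestamp alignment between the glove's force stream and the video's frame stream. The sketch below only illustrates that idea under assumed field names and rates (30 fps video, 200 Hz gloves, 16 sensors per glove); it is not the FEEL release's actual tooling.

```python
# Hypothetical sketch: aligning force-glove samples to egocentric video frames
# by nearest timestamp. Field names, sampling rates, and sensor counts are
# assumptions for illustration, not taken from the FEEL release.
import numpy as np

def align_force_to_frames(frame_times_s, force_times_s, force_values):
    """For each video frame timestamp, pick the force sample closest in time.

    frame_times_s: (N,) frame timestamps in seconds
    force_times_s: (M,) force-sample timestamps in seconds (sorted ascending)
    force_values:  (M, S) per-sensor force readings
    Returns an (N, S) array of force readings synchronized to the frames.
    """
    # Index of the first force sample at or after each frame time
    idx = np.searchsorted(force_times_s, frame_times_s)
    idx = np.clip(idx, 1, len(force_times_s) - 1)
    # Choose whichever neighboring sample (before/after) is closer in time
    left_closer = (frame_times_s - force_times_s[idx - 1]) < (force_times_s[idx] - frame_times_s)
    idx = np.where(left_closer, idx - 1, idx)
    return force_values[idx]

# Toy usage: 30 fps video, 200 Hz glove sampling, 16 sensors (all assumed)
frames = np.arange(0, 10, 1 / 30)
samples = np.arange(0, 10, 1 / 200)
forces = np.random.rand(len(samples), 16)
synced = align_force_to_frames(frames, samples, forces)
print(synced.shape)  # (300, 16): one force vector per video frame
```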

Abstract

We introduce FEEL (Force-Enhanced Egocentric Learning), the first large-scale dataset pairing force measurements gathered from custom piezoresistive gloves with egocentric video. Our gloves enable scalable data collection, and FEEL contains approximately 3 million force-synchronized frames of natural, unscripted manipulation in kitchen environments, with 45% of frames involving hand-object contact. Because force is the underlying cause that drives physical interaction, it is a critical primitive for physical action understanding. We demonstrate the utility of force for physical action understanding by applying FEEL to two families of tasks: (1) contact understanding, where we jointly perform temporal contact segmentation and pixel-level contacted-object segmentation; and (2) action representation learning, where force prediction serves as a self-supervised pretraining objective for video backbones. We achieve state-of-the-art temporal contact segmentation results and competitive pixel-level segmentation results without any manual contacted-object segmentation annotations. Furthermore, we demonstrate that action representation learning with FEEL improves transfer performance, without any manual labels, on action understanding tasks across EPIC-Kitchens, Something-Something V2, Ego-Exo4D, and MECCANO.
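
To make the second task family concrete, here is a minimal sketch of what force prediction as a self-supervised pretraining objective could look like: a video backbone produces per-frame features, a small head regresses the synchronized glove forces, and a simple MSE loss supervises the backbone. The module names, dimensions, loss choice, and the assumption that the backbone returns per-frame features are illustrative, not the paper's implementation.

```python
# Illustrative sketch of force prediction as a self-supervised pretraining
# objective. Module names, feature dimensions, sensor count, and the MSE loss
# are assumptions, not FEEL's actual implementation.
import torch
import torch.nn as nn

class ForcePredictionHead(nn.Module):
    """Regresses per-frame glove forces from backbone clip features."""
    def __init__(self, feat_dim=768, num_sensors=16):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_sensors),
        )

    def forward(self, frame_feats):       # (B, T, feat_dim)
        return self.proj(frame_feats)     # (B, T, num_sensors)

def pretrain_step(backbone, head, optimizer, clip, force_targets):
    """One pretraining step: predict synchronized glove forces from video.

    clip:          (B, T, C, H, W) video frames
    force_targets: (B, T, num_sensors) force measurements aligned to the frames
    """
    frame_feats = backbone(clip)          # assumed to return (B, T, feat_dim)
    pred = head(frame_feats)
    loss = nn.functional.mse_loss(pred, force_targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Only the synchronized video and glove readings are needed for this step, which is what lets the objective scale without manual labels; the pretrained backbone can then be fine-tuned on downstream action understanding benchmarks.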