LEXIS: LatEnt ProXimal Interaction Signatures for 3D HOI from an Image

arXiv cs.CV / 4/23/2026


Key Points

The paper proposes a new method (LEXIS) for reconstructing 3D human-object interaction (HOI) from a single RGB image by modeling the continuous physical coupling between bodies and objects.
It introduces “InterFields,” a dense, continuous proximity representation over body and object surfaces, and learns a structured discrete interaction-signature manifold via a VQ-VAE.
Building on these signatures, it develops LEXIS-Flow, a diffusion-based framework that estimates human/object meshes and their InterFields together.
The resulting InterFields enable physically plausible, proximity-aware reconstructions through guided refinement without needing costly post-hoc optimization.
Experiments on Open3DHOI and BEHAVE report significantly better performance than existing state-of-the-art baselines across reconstruction, contact, and proximity quality, with code/models planned to be public.
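The contrast between sparse binary contact cues and a dense, continuous proximity representation can be illustrated with a small sketch. This is not the paper's definition of InterFields, which is not spelled out in this summary. It assumes the field is simply the unsigned distance from each surface point to the nearest point on the other surface. The function names, the toy points, and the 2 cm threshold are all hypothetical.

```python
import numpy as np

def proximity_field(src_pts, dst_pts):
    """Distance from each point in src_pts to its nearest point in dst_pts.

    Assumption: a brute-force nearest-point distance stands in for a
    dense proximity field; the paper's formulation may differ.
    """
    diff = src_pts[:, None, :] - dst_pts[None, :, :]   # (N, M, 3) pairwise offsets
    dists = np.linalg.norm(diff, axis=-1)              # (N, M) pairwise distances
    return dists.min(axis=1)                           # (N,) nearest distance per point

def binary_contact(field, threshold):
    """Collapse a continuous field into a binary contact label per point."""
    return field < threshold

# Toy "body" and "object" point sets (hypothetical values, in metres).
body = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.05], [0.0, 0.0, 0.5]])
obj = np.array([[0.0, 0.0, 0.01], [1.0, 0.0, 0.0]])

body_field = proximity_field(body, obj)   # defined on every body point
obj_field = proximity_field(obj, body)    # defined on every object point
contact = binary_contact(body_field, threshold=0.02)

print(body_field)
print(contact)
```

In this toy case the second body point sits 4 cm from the object: the binary label marks it as "no contact", the same as a point half a metre away, while the continuous field keeps the difference between the two.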

Abstract

Reconstructing 3D Human-Object Interaction from an RGB image is essential for perceptive systems. Yet, this remains challenging as it requires capturing the subtle physical coupling between the body and objects. While current methods rely on sparse, binary contact cues, these fail to model the continuous proximity and dense spatial relationships that characterize natural interactions. We address this limitation via InterFields, a representation that encodes dense, continuous proximity across the entire body and object surfaces. However, inferring these fields from single images is inherently ill-posed. To tackle this, our intuition is that interaction patterns are characteristically structured by the action and object geometry. We capture this structure in LEXIS, a novel discrete manifold of interaction signatures learned via a VQ-VAE. We then develop LEXIS-Flow, a diffusion framework that leverages LEXIS signatures to estimate human and object meshes alongside their InterFields. Notably, these InterFields help in a guided refinement that ensures physically-plausible, proximity-aware reconstructions without requiring post-hoc optimization. Evaluation on Open3DHOI and BEHAVE shows that LEXIS-Flow significantly outperforms existing SotA baselines in reconstruction, contact, and proximity quality. Our approach not only improves generalization but also yields reconstructions perceived as more realistic, moving us closer to holistic 3D scene understanding. Code & models will be public at https://anticdimi.github.io/lexis.
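The abstract says the interaction signatures form a discrete manifold learned via a VQ-VAE. The sketch below shows only the generic codebook lookup step that VQ-VAEs share: each continuous latent is replaced by its nearest codebook entry. The encoder, decoder, training losses, and the way LEXIS structures its codebook are not described here, so the codebook values, latent dimension, and function name are hypothetical.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to the index and value of its nearest code.

    Assumption: plain Euclidean nearest-neighbour lookup, as in a
    standard VQ-VAE; this is not taken from the LEXIS implementation.
    """
    # Squared distances between every latent and every code: shape (N, K).
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d2.argmin(axis=1)        # (N,) discrete signature ids
    return indices, codebook[indices]  # ids and their quantized vectors

# Hypothetical 3-entry codebook and 3 encoder outputs in a 2-D latent space.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
latents = np.array([[0.9, 1.2], [-0.1, 0.2], [-0.8, 0.7]])

indices, quantized = quantize(latents, codebook)
print(indices)
print(quantized)
```

The discrete indices are what make the latent space a finite vocabulary: any downstream model only has to pick among the learned codes rather than regress an unconstrained continuous field.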


