Addressing Ambiguity in Imitation Learning through Product of Experts based Negative Feedback

arXiv cs.RO / 3/30/2026



Key Points

The paper studies how imitation learning for robots can work when demonstrations are ambiguous and come from multiple or suboptimal experts rather than a single highly competent one.
It proposes a Product of Experts–based negative-feedback system that uses the robot’s own failures to resolve ambiguity, contrasting it with standard positive-only imitation learning.
In experiments, the approach shows large gains in success rate for ambiguous tasks, reporting about a 90% improvement versus a baseline without negative feedback, and about a 50% improvement on a real robot.
The method is evaluated in both simulation and real-robot settings and is claimed to be more effective while also improving memory and time efficiency compared with a comparable negative-feedback alternative.
The work targets practical home and assistive robotics scenarios where user demonstrations may be noisy or incomplete, aiming to learn from corrective signals rather than assuming perfect demonstrations.

Abstract

Programming robots to perform complex tasks is often difficult and time-consuming, requiring expert knowledge and skills in robot software and sometimes hardware. Imitation learning is a method for training robots to perform tasks by leveraging human expertise through demonstrations. Typically, the assumption is that those demonstrations are performed by a single, highly competent expert. However, in many real-world applications that use user demonstrations for tasks or incorporate both user data and pretrained data, such as home robotics including assistive robots, this is unlikely to be the case. This paper presents research towards a system that can leverage suboptimal demonstrations to solve ambiguous tasks and, in particular, learn from its own failures. This negative-feedback system achieves a significant improvement over purely positive imitation learning for ambiguous tasks: a 90% improvement in success rate against a system that does not utilise negative feedback, and a 50% improvement in success rate when utilised on a real robot. It also demonstrates higher efficacy, memory efficiency and time efficiency than a comparable negative-feedback scheme. The novel scheme presented in this paper is validated through simulated and real-robot experiments.
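
To make the general idea concrete, here is a minimal Python sketch of how a product of experts could combine an ambiguous demonstration prior with "negative" experts built from observed failures. It is an illustration under stated assumptions, not the formulation used in the paper: the discrete action set, the Gaussian-shaped penalty, and the names `failure_expert`, `product_of_experts`, `width` and `strength` are all hypothetical.

```python
import math


def normalise(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("every action has been ruled out")
    return [w / total for w in weights]


def failure_expert(actions, failed_action, width=0.5, strength=0.95):
    """Hypothetical negative expert (an assumption, not the paper's model).

    Scores are close to 1 far from the failed action and drop to
    (1 - strength) at the failed action itself.
    """
    return [
        1.0 - strength * math.exp(-((a - failed_action) ** 2) / (2 * width ** 2))
        for a in actions
    ]


def product_of_experts(experts):
    """Multiply the experts' scores action by action, then renormalise."""
    combined = [1.0] * len(experts[0])
    for expert in experts:
        combined = [c * e for c, e in zip(combined, expert)]
    return normalise(combined)


# Toy ambiguous task: demonstrations favour "left" (-1.0) and "right" (+1.0) equally.
actions = [-1.0, 0.0, 1.0]
demo_prior = [0.45, 0.10, 0.45]

# The robot tried "left" and failed, so one negative expert is added.
failures = [-1.0]
experts = [demo_prior] + [failure_expert(actions, f) for f in failures]

posterior = product_of_experts(experts)
best_action = actions[posterior.index(max(posterior))]
print(posterior, best_action)
```

In this toy setting the demonstrations alone cannot choose between the two modes. Multiplying in a single failure expert moves most of the probability mass to the untried option, which is the kind of disambiguation the summary attributes to negative feedback.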