Discovering What You Can Control: Interventional Boundary Discovery for Reinforcement Learning
arXiv cs.LG / 3/20/2026
Key Points
- The paper defines the problem of identifying the agent's Causal Sphere of Influence to distinguish action-caused features from confounded distractors in reinforcement learning.
- It introduces Interventional Boundary Discovery (IBD), which uses Pearl's do-operator on the agent's actions and two-sample tests to produce a binary mask over observation dimensions without requiring learned models, usable as a preprocessing step for any RL algorithm.
- In experiments on 12 continuous control tasks with up to 100 distractors, observational feature selection both selects spurious distractors and discards true causal features, while IBD closely tracks oracle performance across distractor counts and transfers across SAC and TD3.
- A key finding is that full-state RL performance degrades when distractors outnumber relevant features by about 3:1, underscoring the value of causal feature discovery in RL pipelines.
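The core mechanism described above can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes a hypothetical `toy_step` environment (dimension 0 causally driven by the action, dimensions 1 and 2 pure noise distractors) and uses a per-dimension Kolmogorov-Smirnov two-sample test to compare observations under two fixed action interventions, do(a=a0) versus do(a=a1), yielding a binary mask over observation dimensions:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

def toy_step(action):
    # Hypothetical environment: dim 0 is causally driven by the action;
    # dims 1-2 are confounded/noise distractors unaffected by it.
    causal = action + rng.normal(0, 0.1)
    distractors = rng.normal(0, 1.0, size=2)
    return np.concatenate([[causal], distractors])

def interventional_mask(step_fn, a0, a1, n=500, alpha=0.01):
    """Collect observations under two action interventions and flag
    each observation dimension whose distribution shifts (two-sample test)."""
    obs0 = np.array([step_fn(a0) for _ in range(n)])
    obs1 = np.array([step_fn(a1) for _ in range(n)])
    pvals = np.array([ks_2samp(obs0[:, d], obs1[:, d]).pvalue
                      for d in range(obs0.shape[1])])
    return pvals < alpha  # binary mask: True = inside the sphere of influence

mask = interventional_mask(toy_step, a0=-1.0, a1=1.0)
print(mask)
```

With enough samples, only dimension 0 should be flagged; the mask could then be applied to observations as a preprocessing step before any RL algorithm, as the summary describes.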