MOCA: A Transformer-based Modular Causal Inference Framework with One-way Cross-attention and Cutting Feedback
arXiv stat.ML / April 28, 2026
Key Points
- MOCA (Modular One-way Causal Attention) is a transformer-based framework for estimating causal effects from observational data, designed to handle confounding more robustly when treatment and outcome mechanisms are complex, nonlinear, and high-dimensional.
- The method uses a modular design that separates treatment modeling from outcome modeling and applies one-way cross-attention to adjust for confounders while preserving causal directionality (a sketch of this attention pattern follows the list).
- A “cutting-feedback” strategy, implemented via gradient detachment, prevents the outcome loss from updating the treatment module, avoiding undesirable leakage of outcome information into treatment-side representations (a minimal training-step sketch also follows the list).
- Experiments on multiple simulated settings and two real-world benchmarks (Infant Health and Development Program and Dehejia–Wahba datasets) show competitive or improved performance versus established estimators and neural causal inference baselines like IPW/AIPW, X-learner, TARNet, and DragonNet.
- The authors argue that modular attention with one-way information flow is a promising, more interpretable direction for combining causal inference with modern deep learning.
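To make the one-way attention idea concrete, here is a minimal PyTorch sketch assuming the outcome branch issues queries against confounder representations, with no symmetric attention call in the reverse direction. The class and argument names (OneWayCrossAttention, outcome_tokens, confounder_tokens) are illustrative assumptions, not the authors' published code.

```python
import torch
import torch.nn as nn

class OneWayCrossAttention(nn.Module):
    """Outcome-side queries attend over confounder representations.
    There is no reverse attention path, so information flows in one
    direction only. (Illustrative sketch; names are assumptions.)"""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, outcome_tokens: torch.Tensor,
                confounder_tokens: torch.Tensor) -> torch.Tensor:
        # Queries come from the outcome module; keys and values come
        # from the confounder encoder.
        adjusted, _ = self.attn(
            query=outcome_tokens,
            key=confounder_tokens,
            value=confounder_tokens,
        )
        return adjusted

# Usage: a batch of 8 sequences, 4 outcome tokens attending over
# 16 confounder tokens, embedding dimension 32.
attn = OneWayCrossAttention(dim=32)
out = attn(torch.randn(8, 4, 32), torch.randn(8, 16, 32))
```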
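The cutting-feedback idea maps naturally onto gradient detachment. The sketch below, in which the module names, dimensions, and loss choices are all hypothetical, shows how `.detach()` lets the outcome model read the treatment representation while blocking outcome-loss gradients from flowing back into the treatment module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical minimal modules; architectures and sizes are assumptions.
treatment_net = nn.Sequential(nn.Linear(10, 32), nn.ReLU())
propensity_head = nn.Linear(32, 1)
outcome_net = nn.Sequential(nn.Linear(10 + 1 + 32, 32), nn.ReLU(),
                            nn.Linear(32, 1))

x = torch.randn(64, 10)                    # covariates
t = torch.randint(0, 2, (64, 1)).float()   # binary treatment
y = torch.randn(64, 1)                     # outcome

z_t = treatment_net(x)  # treatment-side representation
treatment_loss = F.binary_cross_entropy_with_logits(propensity_head(z_t), t)

# Cutting feedback via gradient detachment: the outcome model may read
# z_t, but .detach() excludes it from the outcome loss's backward graph,
# so outcome gradients cannot reshape treatment-side features.
y_hat = outcome_net(torch.cat([x, t, z_t.detach()], dim=-1))
outcome_loss = F.mse_loss(y_hat, y)

(treatment_loss + outcome_loss).backward()
```

Because the detached tensor is cut out of the autograd graph, the combined backward pass updates treatment_net only through the treatment loss, which is the one-way behavior the summary describes.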