Causal Matrix Completion under Multiple Treatments via Mixed Synthetic Nearest Neighbors

arXiv cs.LG / 3/13/2026



Key Points

Mixed Synthetic Nearest Neighbors (MSNN) extends Synthetic Nearest Neighbors (SNN) to handle multiple treatment levels by integrating information across treatments.
MSNN is an entry-wise causal identification estimator that enlarges the effective sample size available for estimation while retaining SNN-like guarantees.
The method preserves SNN's finite-sample error bounds and asymptotic normality, which supports inference on missing-not-at-random (MNAR) data even when some treatment levels are sparsely observed.
Empirical results on synthetic and real-world datasets illustrate MSNN's efficacy, especially when some treatment levels have limited data.
The work broadens the applicability of causal matrix completion to more complex, data-scarce treatment settings.

Abstract

Synthetic Nearest Neighbors (SNN) provides a principled solution to causal matrix completion under missing-not-at-random (MNAR) by exploiting local low-rank structure through fully observed anchor submatrices. However, its effectiveness critically relies on sufficient data availability within each treatment level, a condition that often fails in settings with multiple or complex treatments. In this work, we propose Mixed Synthetic Nearest Neighbors (MSNN), a new entry-wise causal identification estimator that integrates information across treatment levels. We show that MSNN retains the finite-sample error bounds and asymptotic normality guarantees of SNN, while enlarging the effective sample size available for estimation. Empirical results on synthetic and real-world datasets illustrate the efficacy of the proposed approach, especially under data-scarce treatment levels.
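
To make the anchor-submatrix idea concrete, the following is a minimal Python sketch of an SNN-style estimate for a single missing entry within one treatment level. It is an illustration under stated assumptions, not the authors' implementation: the function name `snn_estimate_entry`, its parameters, and the greedy anchor selection are hypothetical simplifications, and the rank is assumed to be supplied by the caller. The cross-treatment pooling that distinguishes MSNN is not shown, because the abstract does not specify how it is carried out.

```python
import numpy as np


def snn_estimate_entry(Y, observed, i, j, rank):
    """Estimate Y[i, j] from a fully observed anchor block (illustrative sketch).

    Y        : 2-D float array of outcomes (NaN where unobserved).
    observed : boolean array of the same shape, True where Y is observed.
    rank     : assumed rank of the local low-rank structure (caller-supplied).
    """
    # Candidate anchor columns: columns where row i is observed.
    cols = np.flatnonzero(observed[i, :])
    cols = cols[cols != j]
    # Candidate anchor rows: rows where column j is observed.
    rows = np.flatnonzero(observed[:, j])
    rows = rows[rows != i]
    # Simplifying assumption: keep only rows observed on every anchor column,
    # so that the anchor block contains no missing values.
    rows = rows[observed[np.ix_(rows, cols)].all(axis=1)]
    if rows.size == 0 or cols.size == 0:
        return np.nan  # no usable anchor block for this entry

    S = Y[np.ix_(rows, cols)]  # fully observed anchor submatrix
    q = Y[i, cols]             # target row restricted to anchor columns
    x = Y[rows, j]             # target column restricted to anchor rows

    # Principal component regression: express row i as a combination of the
    # anchor rows using only the top-`rank` singular directions of S.
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    k = min(rank, s.size)
    beta = U[:, :k] @ ((Vt[:k, :] @ q) / s[:k])

    # Apply the same combination to the anchor rows' values in column j.
    return float(x @ beta)
```

On a noise-free matrix of exact rank `rank`, with a sufficiently large anchor block, this sketch should reproduce the hidden entry up to numerical error. With a sparsely observed treatment level, the anchor block can shrink or vanish (the function then returns `NaN`), which is the data-availability limitation the abstract says MSNN addresses by drawing on other treatment levels.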