Identification of Bivariate Causal Directionality Based on Anticipated Asymmetric Geometries

arXiv cs.LG / 3/30/2026


Key Points

  • The paper proposes two methods for inferring causal direction in bivariate numerical data using conditional distributions: Anticipated Asymmetric Geometries (AAG) and a Monotonicity Index based on gradient behavior.
  • AAG compares observed conditional distributions against “anticipated” distributions (modeled as normal using dual response statistics) using multiple metrics such as correlation, cosine similarity, Jaccard index, KL divergence, KS distance, and mutual information.
  • The Monotonicity Index method quantifies directional cues by counting sign changes in monotonicity indexes derived from gradients of conditional distributions along two axes.
  • Experiments on 95 real-world example pairs show tuned AAG achieves up to 77.9% accuracy, outperforming additive noise models (ANMs; about 63% ± 10%), while both methods produce deterministic outputs for fixed hyperparameters.
  • Because accuracy depends on hyperparameters, the study applies full factorial design-of-experiments for tuning and further trains a decision tree to analyze how decisive the causal identification is for misclassified cases.
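The AAG idea in the first two bullets can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the quantile binning, the choice of KL divergence as the single comparison metric, and the smoothing constants are all assumptions made here for the sketch.

```python
import numpy as np
from scipy.stats import norm, entropy

def aag_score(x, y, n_bins=10, n_ybins=20):
    """Mean KL divergence between observed conditionals P(y | x-bin) and
    'anticipated' normals built from each bin's dual response statistics
    (mean, std). Lower = closer to the anticipated geometry."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    y_edges = np.linspace(y.min(), y.max(), n_ybins + 1)
    centers = 0.5 * (y_edges[:-1] + y_edges[1:])
    kls = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x >= lo) & (x <= hi)
        if mask.sum() < 5:          # skip nearly empty bins
            continue
        yb = y[mask]
        # observed conditional histogram, smoothed and normalized
        p, _ = np.histogram(yb, bins=y_edges)
        p = (p + 1e-9) / (p + 1e-9).sum()
        # anticipated normal from the bin's mean and standard deviation
        q = norm.pdf(centers, yb.mean(), yb.std() + 1e-9)
        q = (q + 1e-9) / (q + 1e-9).sum()
        kls.append(entropy(p, q))   # KL(observed || anticipated)
    return float(np.mean(kls))

def infer_direction(x, y):
    """Pick the direction whose conditionals of the putative effect sit
    closer to the anticipated unimodal (normal) shape."""
    return "x->y" if aag_score(x, y) < aag_score(y, x) else "y->x"
```

In the paper, several metrics (correlation, cosine similarity, Jaccard index, KL divergence, KS distance, mutual information) are evaluated and the hyperparameters (such as the bin counts above) are tuned; this sketch hard-codes one metric to keep the asymmetry argument visible.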

Abstract

Identification of causal directionality in bivariate numerical data is a fundamental research problem with important practical implications. This paper presents two alternative methods that identify the direction of causation by considering conditional distributions: (1) Anticipated Asymmetric Geometries (AAG) and (2) the Monotonicity Index. The AAG method compares the actual conditional distributions to anticipated ones along the two variables. Different comparison metrics, such as correlation, cosine similarity, Jaccard index, KL divergence, KS distance, and mutual information, have been evaluated. Anticipated distributions are projected as normal based on dual response statistics: mean and standard deviation. The Monotonicity Index approach compares the calculated monotonicity indexes of the gradients of conditional distributions along the two axes and counts gradient sign changes. Both methods assume stochastic properties of the bivariate data and exploit the anticipated unimodality of the conditional distributions of the effect. The tuned AAG method outperforms the Monotonicity Index, reaching a top accuracy of 77.9% versus the ANMs' accuracy of 63 ± 10% when classifying 95 pairs of real-world examples (Mooij et al., 2014). The described methods include a number of hyperparameters that affect identification accuracy. For a given set of hyperparameters, both the AAG and Monotonicity Index methods provide a unique, deterministic outcome. To address sensitivity to hyperparameters, tuning has been performed using a full factorial Design of Experiments. A decision tree has been fitted to distinguish misclassified cases using the input data's symmetrical bivariate statistics, addressing the question: how decisive is the identification method of causal directionality?
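The Monotonicity Index step described in the abstract (gradients of conditional distributions along each axis, counting sign changes) can be illustrated with a toy sketch. Here the conditional distribution is reduced to per-bin conditional means, and the quantile binning is an assumption of this sketch, not the paper's procedure.

```python
import numpy as np

def sign_changes(values):
    """Count sign changes in the finite-difference gradient of a sequence;
    0 means the sequence is monotone."""
    grad = np.diff(np.asarray(values, dtype=float))
    signs = np.sign(grad[grad != 0])
    return int(np.sum(signs[1:] != signs[:-1]))

def monotonicity_sign_changes(x, y, n_bins=10):
    """Bin x into quantile bins, take the conditional mean of y per bin,
    and count gradient sign changes along the x axis."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    means = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x >= lo) & (x <= hi)
        if mask.any():
            means.append(y[mask].mean())
    return sign_changes(means)

# Comparing monotonicity_sign_changes(x, y) with
# monotonicity_sign_changes(y, x): in this toy reading, the direction
# with the fewer sign changes (the smoother conditional response) is
# the one favored as causal.
```

The actual method works with monotonicity indexes of gradients of full conditional distributions; conditional means are used here only to keep the sign-change counting mechanism visible in a few lines.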