Same Graph, Different Likelihoods: Calibration of Autoregressive Graph Generators via Permutation-Equivalent Encodings

arXiv cs.LG / 4/8/2026

📰 NewsSignals & Early TrendsIdeas & Deep AnalysisModels & Research

共有:

Key Points

Autoregressive graph generators’ likelihoods are only trustworthy if they remain consistent across different linearizations of the same graph, motivating a calibration-focused evaluation.
The paper introduces Linearization Uncertainty (LU), measuring violations in negative log-likelihood consistency via the coefficient of variation across permutation-equivalent encodings.
Experiments training transformer models under four linearization strategies show that models biased toward a particular ordering can achieve lower NLL on their native linearization while exhibiting dramatically worse calibration under random permutations.
On QM9, generated-graph NLL weakly correlates with molecular stability, whereas LU correlates much more strongly, suggesting LU as a more reliable quality signal for molecular generation.
The work provides code at the linked repository to enable permutation-based evaluation and LU-based model assessment.

Abstract

Autoregressive graph generators define likelihoods via a sequential construction process, but these likelihoods are only meaningful if they are consistent across all linearizations of the same graph. Segmented Eulerian Neighborhood Trails (SENT), a recent linearization method, converts graphs into sequences that can be perfectly decoded and efficiently processed by language models, but admit multiple equivalent linearizations of the same graph. We quantify violations in assigned negative log-likelihood (NLL) using the coefficient of variation across equivalent linearizations, which we call Linearization Uncertainty (LU). Training transformers under four linearization strategies on two datasets, we show that biased orderings achieve lower NLL on their native order but exhibit expected calibration error (ECE) two orders of magnitude higher under random permutation, indicating that these models have learned their training linearization rather than the underlying graph. On the molecular graph benchmark QM9, NLL for generated graphs is negatively correlated with molecular stability (AUC

=0.43

), while LU achieves AUC

=0.85

, suggesting that permutation-based evaluation provides a more reliable quality check for generated molecules. Code is available at https://github.com/lauritsf/linearization-uncertainty

Black Hat Asia

AI Business

[N] Just found out that Milla Jovovich is a dev, invested in AI, and just open sourced a project

Reddit r/MachineLearning

ALTK‑Evolve: On‑the‑Job Learning for AI Agents

Hugging Face Blog

Context Windows Are Getting Absurd — And That's a Good Thing

Dev.to

Every AI Agent Registry in 2026, Compared

Dev.to

Same Graph, Different Likelihoods: Calibration of Autoregressive Graph Generators via Permutation-Equivalent Encodings

Key Points

Abstract

Related Articles

Black Hat Asia

[N] Just found out that Milla Jovovich is a dev, invested in AI, and just open sourced a project

ALTK‑Evolve: On‑the‑Job Learning for AI Agents

Context Windows Are Getting Absurd — And That's a Good Thing

Every AI Agent Registry in 2026, Compared

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer