Quantifying Membership Disclosure Risk for Tabular Synthetic Data Using Kernel Density Estimators
arXiv cs.LG / 3/12/2026
Key Points
- This paper proposes a KDE-based method to quantify membership disclosure risk in tabular synthetic data.
- It models the distribution of nearest-neighbor distances between synthetic data and training records to enable probabilistic membership inference and ROC-based evaluation.
- The paper introduces two attack models: a True Distribution Attack with privileged training data access and a Realistic Attack using only auxiliary data.
- Empirical evaluation across four real-world datasets and six generators shows the KDE approach achieves higher F1 scores and sharper risk characterization than prior baselines, without relying on expensive shadow models.
- A practical framework and metrics for post-generation risk assessment are provided, with datasets and code released for practitioners.
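
The core idea in the key points, scoring membership from nearest-neighbor distances with kernel density estimates, can be illustrated with a minimal sketch. This is not the paper's implementation: the toy 2-D data, the "memorizing" generator, the normal-reference bandwidth rule, the 50% prior, and every function name (`nn_distance`, `fit_kde`, `membership_probability`, `roc_auc`) are assumptions made for illustration. The setup is loosely in the spirit of an attacker who has labeled member and non-member records available for fitting the two densities.

```python
import math
import random

def nn_distance(record, synthetic):
    """Euclidean distance from a record to its nearest synthetic row."""
    return min(math.dist(record, s) for s in synthetic)

def fit_kde(samples):
    """Fit a 1-D Gaussian KDE; returns a density function.
    Bandwidth uses the normal-reference rule of thumb (an assumption)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
    h = 1.06 * std * n ** (-0.2)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return density

def membership_probability(d, kde_member, kde_nonmember, prior=0.5):
    """Bayes posterior that a record with NN distance d is a training member."""
    pm = prior * kde_member(d)
    pn = (1.0 - prior) * kde_nonmember(d)
    total = pm + pn
    return pm / total if total > 0.0 else prior  # fall back if both underflow

def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    wins = sum((p > q) + 0.5 * (p == q) for p in scores_pos for q in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# --- Toy experiment (all data here is synthetic and hypothetical) ---
random.seed(0)

def sample(n):
    return [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

train = sample(200)      # records the generator was trained on (members)
holdout = sample(200)    # records from the same population (non-members)

# Stand-in for an overfit generator: training rows plus a little noise.
synthetic = [(x + random.gauss(0, 0.01), y + random.gauss(0, 0.01))
             for x, y in train]

# Fit one distance density per class on the first half of each set.
kde_member = fit_kde([nn_distance(r, synthetic) for r in train[:100]])
kde_nonmember = fit_kde([nn_distance(r, synthetic) for r in holdout[:100]])

# Score the remaining records and summarize with ROC AUC.
member_scores = [
    membership_probability(nn_distance(r, synthetic), kde_member, kde_nonmember)
    for r in train[100:]
]
nonmember_scores = [
    membership_probability(nn_distance(r, synthetic), kde_member, kde_nonmember)
    for r in holdout[100:]
]
auc = roc_auc(member_scores, nonmember_scores)
print(f"ROC AUC of the toy attack: {auc:.3f}")
```

Because the stand-in generator nearly copies its training rows, member distances cluster near zero and the attack separates the two groups easily. A generator that generalizes better would push the two distance densities together and pull the AUC toward 0.5, which is the sense in which the score acts as a post-generation risk measure.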