Nested Atoms Model with Application to Clustering Big Population-Scale Single-Cell Data
arXiv stat.ML / 4/14/2026
Key Points
- The paper targets clustering of nested/hierarchical data where both group-level covariates and observation-level covariates must be modeled jointly.
- Using the OneK1K scRNA-seq dataset (982 individuals, 1.27M cells) as motivation, the authors aim to cluster both cells and individuals while incorporating individual-specific genotype information.
- They introduce the Nested Atoms Model (NAM), a Bayesian nonparametric framework designed to perform two-layer clustering that accounts for heterogeneity at the individual (group) and cell (observation) levels.
- To make NAM practical for high-dimensional genomics data, the authors develop a fast variational Bayesian algorithm that scales inference to large datasets.
- Experiments and simulations indicate that NAM outperforms approaches that ignore group-level variables, and application to OneK1K yields clusters of individuals with homogeneous cell-type profiles that align with known immune cell types.
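
To make the idea of two-layer clustering concrete, here is a minimal toy sketch in plain Python. It is not the Nested Atoms Model: it uses no Bayesian nonparametrics, no variational inference, and no genotype covariates. Instead it swaps in a simple two-stage k-means heuristic, clustering cells first and then clustering individuals by their cell-type proportions. All names (`simulate`, `kmeans`, `two_stage_cluster`) and every numeric setting (three cell types, two groups of individuals, one-dimensional "expression" values) are assumptions made up for illustration.

```python
import random

def simulate(n_individuals=6, cells_per_individual=200, seed=0):
    """Toy nested data: each individual belongs to one of two hypothetical
    groups, and each group has its own mix of three cell types."""
    rng = random.Random(seed)
    cell_type_means = [-5.0, 0.0, 5.0]                   # assumed 1-D expression per cell type
    group_profiles = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]  # assumed cell-type mix per group
    data = []
    for i in range(n_individuals):
        weights = group_profiles[i % 2]                  # even/odd individuals alternate groups
        cells = []
        for _ in range(cells_per_individual):
            k = rng.choices(range(3), weights=weights)[0]
            cells.append(rng.gauss(cell_type_means[k], 0.5))
        data.append(cells)
    return data

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, init_centers, n_iter=20):
    """Plain Lloyd's k-means on lists of float vectors, with fixed initial centers."""
    centers = [list(c) for c in init_centers]
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        labels = [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def two_stage_cluster(data):
    # Layer 1 (observation level): pool all cells and cluster them into 3 types.
    pooled = sorted(x for cells in data for x in cells)
    init = [[pooled[0]], [pooled[len(pooled) // 2]], [pooled[-1]]]
    flat = [[x] for cells in data for x in cells]
    cell_labels = kmeans(flat, init)

    # Summarize each individual by its proportion of cells in each cell cluster.
    profiles, start = [], 0
    for cells in data:
        labs = cell_labels[start:start + len(cells)]
        profiles.append([labs.count(k) / len(labs) for k in range(3)])
        start += len(cells)

    # Layer 2 (group level): cluster individuals by their cell-type profiles,
    # starting from the first individual and the one farthest from it.
    far = max(profiles, key=lambda p: sq_dist(p, profiles[0]))
    individual_labels = kmeans(profiles, [profiles[0], far])
    return individual_labels, profiles

if __name__ == "__main__":
    individual_labels, profiles = two_stage_cluster(simulate())
    for label, profile in zip(individual_labels, profiles):
        print(label, [round(v, 2) for v in profile])
```

Because the two stages run separately, errors in the cell clustering propagate to the individual clustering, and group-level information never informs the cell layer. A joint model such as the one the paper describes is meant to avoid that separation.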