Mining Electronic Health Records to Investigate Effectiveness of Ensemble Deep Clustering
arXiv cs.LG / 4/9/2026
Key Points
- The study evaluates how well traditional (e.g., K-means), hybrid, and deep learning clustering methods work on EHR-derived patient representations, using real heart failure data from the All of Us Research Program.
- It finds that traditional clustering performs more robustly than deep clustering methods designed for image-like tasks, highlighting a domain mismatch between image clustering and tabular EHR embeddings.
- To improve deep clustering, the authors propose an ensemble-based deep clustering method that aggregates cluster assignments across multiple embedding dimensions instead of relying on a single embedding space.
- Within a new ensemble framework that combines traditional and deep clustering, the proposed ensemble embedding delivers the best overall performance among the 14 clustering approaches compared, and it does so across multiple patient cohorts.
- The paper emphasizes that clustering specific to biological sex is important for EHR analysis, and it argues for combining traditional and deep clustering rather than using a single method in isolation.
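The idea of aggregating cluster assignments across several embedding dimensions can be illustrated with a small consensus step. The summary does not say how the authors combine assignments, so the sketch below swaps in a generic co-association (evidence accumulation) rule and should not be read as the paper's method. The function names `co_association` and `consensus_clusters`, the 0.5 threshold, and the toy labels for six hypothetical patients are all assumptions made for illustration.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of labelings in which each pair of patients shares a cluster."""
    n = len(labelings[0])
    runs = len(labelings)
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Link pairs co-clustered in more than `threshold` of the labelings,
    then return connected components as consensus labels (assumed rule)."""
    co = co_association(labelings)
    n = len(co)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical assignments for six patients, one list per embedding size.
# Cluster ids need not match between runs; only co-membership matters.
labelings = [
    [0, 0, 0, 1, 1, 1],  # e.g. clustering on a small embedding
    [1, 1, 0, 0, 0, 0],  # ids permuted, and patient 2 placed differently
    [0, 0, 0, 1, 1, 1],  # e.g. clustering on a larger embedding
]

print(consensus_clusters(labelings))  # [0, 0, 0, 1, 1, 1]
```

Because the rule looks only at whether two patients land together, it is unaffected by arbitrary cluster ids in each run, and a disagreement in a single embedding space (patient 2 above) is outvoted by the others.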