Self-Conditioned Denoising for Atomistic Representation Learning
arXiv cs.LG, March 19, 2026
Key Points
- The paper introduces Self-Conditioned Denoising (SCD), a backbone-agnostic pretraining objective that uses self-embeddings to enable conditional denoising of atomistic data.
- SCD applies across diverse domains, including small molecules, proteins, periodic materials, and non-equilibrium geometries, addressing a limitation of prior self-supervised learning (SSL) objectives that were confined to ground-state geometries or a single domain.
- With the backbone architecture and pretraining data held fixed, SCD significantly outperforms previous SSL methods and matches or exceeds supervised force-and-energy pretraining on downstream benchmarks.
- A small, fast graph neural network pretrained with SCD matches or outperforms larger models trained on substantially larger labeled or unlabeled datasets.
- Code for SCD is available at https://github.com/TyJPerez/SelfConditionedDenoisingAtoms
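To make the key points concrete, here is a minimal sketch of what a self-conditioned denoising training step could look like. Everything below is illustrative and assumed, not taken from the paper's code: the random-projection `embed` stands in for a learned encoder's self-embedding, and the trivial zero "denoiser" stands in for the real network that would predict the noise from the corrupted structure and the conditioning signal.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(positions, dim=8):
    # Hypothetical self-embedding: a random projection of the pairwise
    # distance matrix, standing in for a learned structure encoder.
    n = len(positions)
    dists = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1)
    proj = rng.standard_normal((n, dim))
    return dists @ proj  # (n, dim) per-atom embedding

def scd_step(positions, sigma=0.1):
    """One illustrative self-conditioned denoising step.

    1. Embed the clean structure to obtain a self-conditioning signal.
    2. Corrupt the atomic coordinates with Gaussian noise.
    3. A denoiser predicts the noise, conditioned on the self-embedding;
       pretraining would minimize the MSE between prediction and noise.
    """
    cond = embed(positions)                         # self-conditioning signal
    noise = sigma * rng.standard_normal(positions.shape)
    noisy = positions + noise                       # corrupted geometry
    # Stand-in denoiser: predicts zero noise. A real model would map
    # (noisy coordinates, cond) -> predicted noise.
    pred = np.zeros_like(noise) + 0.0 * cond[:, :3]
    return np.mean((pred - noise) ** 2)             # denoising loss
```

Because the objective only needs atomic coordinates (no energy or force labels), it can be applied to any unlabeled atomistic dataset, which is what makes the cross-domain claim in the key points plausible.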