BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning
arXiv cs.CL / 4/28/2026
Key Points
- BiMol-Diff is a unified diffusion framework that addresses both text-conditioned molecular generation and molecule captioning by bridging molecular structures with natural language.
- The method introduces a token-aware, position-dependent noise schedule that corrupts tokens unevenly based on how difficult they are to recover, aiming to preserve structurally informative substructures.
- BiMol-Diff shows improved molecule reconstruction on the ChEBI-20 and M3-20M benchmarks, including a 15.4% relative gain in Exact Match.
- For captioning, the approach outperforms the compared baselines, achieving the best BLEU and BERTScore.
- The paper concludes that token-aware noising can significantly improve fidelity for molecular structure–language modeling tasks.
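To make the token-aware noising idea concrete, here is a minimal sketch of one way such a schedule could look. It is not the paper's formulation: the summary above does not give the actual schedule, so the masking-style corruption, the per-token `difficulty` scores in [0, 1], the `strength` parameter, and the choice to noise harder-to-recover tokens more slowly are all assumptions made for illustration.

```python
import random


def keep_probability(t, num_steps, difficulty, strength=2.0):
    """Probability that a token is still uncorrupted at diffusion step t.

    Assumption (not from the paper): a linear base schedule 1 - t/T is
    raised to a per-token exponent, so tokens with higher difficulty
    keep a higher survival probability at intermediate steps.
    """
    if not 0 <= t <= num_steps:
        raise ValueError("t must lie in [0, num_steps]")
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must lie in [0, 1]")
    base = 1.0 - t / num_steps                      # shared linear schedule
    exponent = 1.0 / (1.0 + strength * difficulty)  # <= 1, smaller for hard tokens
    return base ** exponent


def corrupt(tokens, difficulties, t, num_steps, mask_token="[MASK]", rng=None):
    """Mask each token independently according to its own keep probability."""
    rng = rng or random.Random(0)
    corrupted = []
    for token, difficulty in zip(tokens, difficulties):
        p_keep = keep_probability(t, num_steps, difficulty)
        corrupted.append(token if rng.random() < p_keep else mask_token)
    return corrupted


# Hypothetical SMILES tokens for phenol, with made-up difficulty scores:
# ring-closure digits and the heteroatom are treated as harder to recover.
tokens = ["c", "1", "c", "c", "c", "c", "c", "1", "O"]
difficulties = [0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.8]

print(corrupt(tokens, difficulties, t=5, num_steps=10))
```

In this sketch every token is intact at step 0 and fully masked at the final step, and only the path in between differs per token. A real system would presumably learn or estimate the difficulty scores rather than set them by hand.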