Local Mechanisms of Compositional Generalization in Conditional Diffusion

Apple Machine Learning Journal / 4/28/2026


Key Points

  • The paper examines how conditional diffusion models can perform compositional generalization, but notes that the underlying mechanisms are still poorly understood.
  • It studies “length generalization,” where a model generates images containing more objects than it saw during training, as a concrete test of compositionality.
  • Using a controlled CLEVR setup, the authors find length generalization succeeds in some scenarios but fails in others, implying that models do not always learn the full compositional structure.
  • The work then investigates the model-side “local mechanisms” that may explain when and why compositional generalization emerges in conditional diffusion.
  • Overall, the findings suggest that compositional generalization in diffusion models is partial and contingent on the training setup and model structure, rather than a guaranteed behavior.

Conditional diffusion models appear capable of compositional generalization, i.e., generating convincing samples for out-of-distribution combinations of conditioners, but the mechanisms underlying this ability remain unclear. To make this concrete, we study length generalization, the ability to generate images with more objects than seen during training. In a controlled CLEVR setting (Johnson et al., 2017), we find that length generalization is achievable in some cases but not others, suggesting that models only sometimes learn the underlying compositional structure. We then investigate…
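To make the setup concrete, a length-generalization experiment conditions the diffusion model on an object count and samples with counts beyond the training range. The sketch below is a toy illustration under stated assumptions, not the paper's model: `toy_denoiser` is a hypothetical stand-in for a trained noise-prediction network, and the scalar count plays the role of the compositional conditioner.

```python
import numpy as np

def toy_denoiser(x, t, num_objects):
    # Hypothetical stand-in for a trained noise-prediction network
    # eps_theta(x_t, t, c). A real model would be a neural network
    # conditioned on the object count c; this is a deterministic toy.
    return 0.1 * x + 0.01 * num_objects

def conditional_ddpm_sample(num_objects, steps=50, dim=16, seed=0):
    """Minimal DDPM-style reverse process conditioned on an object count.

    Length generalization asks whether sampling with `num_objects` larger
    than any value seen in training still yields coherent samples.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, steps)   # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(dim)             # start from pure noise
    for t in reversed(range(steps)):
        eps = toy_denoiser(x, t, num_objects)      # predicted noise
        # Standard DDPM posterior mean for x_{t-1} given x_t
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                                  # no noise at the final step
            x = x + np.sqrt(betas[t]) * rng.standard_normal(dim)
    return x

# Sampling "in distribution" vs. beyond the training range uses the
# same loop; only the conditioner value changes.
in_range = conditional_ddpm_sample(num_objects=3)
out_of_range = conditional_ddpm_sample(num_objects=12)
```

Whether the out-of-range sample is actually coherent is exactly the empirical question the paper probes; the sampling machinery itself is identical in both cases.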
