Interpreting the Synchronization Gap: The Hidden Mechanism Inside Diffusion Transformers
arXiv cs.LG / 2026/3/24
Opinion · Ideas & Deep Analysis · Models & Research
Key points
- The paper links the “synchronization gap” in diffusion models to the interaction timescales of coupled Ornstein–Uhlenbeck-style processes and investigates how the gap appears in practice inside Diffusion Transformers (DiTs).
- It introduces an explicit architectural mechanism for replica coupling by embedding two generative trajectories into a shared token sequence and using a symmetric cross-attention gating parameter g.
- A linearized analysis shows how the interaction between replicas decomposes mechanistically inside attention layers, providing a theoretical bridge from continuous-time theory to discrete transformer architectures.
- Experiments on a pretrained DiT-XL/2 track commitment behavior and per-layer internal mode energies. They find that the synchronization gap is intrinsic to DiTs, collapses under strong coupling, and is localized to the final transformer layers.
- The results also show a frequency-driven commitment order: global low-frequency structure commits earlier than local high-frequency details, suggesting a depth-local “speciation” process near the output layers.
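
The replica-coupling idea in the second point can be illustrated with a small NumPy sketch. It is not the paper's implementation: the summary does not say how the gate g enters the attention computation, so the sketch assumes that g multiplies the unnormalized cross-replica attention weights before renormalization. The function names (`self_attention`, `gated_joint_attention`), the single-head setup, and the toy dimensions are all hypothetical.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Plain single-head softmax attention over one token sequence."""
    q, k, v = x @ wq, x @ wk, x @ wv
    logits = q @ k.T / np.sqrt(k.shape[-1])
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ v

def gated_joint_attention(xa, xb, wq, wk, wv, g):
    """Attention over two replicas embedded in one shared token sequence.

    ASSUMPTION (not from the paper): the symmetric gate g scales the
    unnormalized attention weights of the two cross-replica blocks.
    xa and xb must have the same number of tokens.
    """
    n = xa.shape[0]
    x = np.concatenate([xa, xb], axis=0)          # shared sequence, shape (2n, d)
    q, k, v = x @ wq, x @ wk, x @ wv
    logits = q @ k.T / np.sqrt(k.shape[-1])
    logits = logits - logits.max(axis=-1, keepdims=True)
    w = np.exp(logits)

    gate = np.ones((2 * n, 2 * n))
    gate[:n, n:] = g                              # replica A attends to replica B
    gate[n:, :n] = g                              # replica B attends to replica A
    w = w * gate
    w = w / w.sum(axis=-1, keepdims=True)

    out = w @ v
    return out[:n], out[n:]

# Toy usage with random weights (illustrative values only).
rng = np.random.default_rng(0)
d, n = 8, 4
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
xa, xb = rng.normal(size=(n, d)), rng.normal(size=(n, d))
for g in (0.0, 0.5, 1.0):
    out_a, out_b = gated_joint_attention(xa, xb, wq, wk, wv, g)
    print(g, float(np.linalg.norm(out_a - out_b)))
```

Under this assumed gating, g = 0 decouples the replicas so that each one sees ordinary self-attention over its own tokens, while g = 1 recovers full attention over the joint sequence. Because the gate is applied to both cross-replica blocks, swapping the two replicas simply swaps the outputs.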

