Multimodal Diffusion to Mutually Enhance Polarized Light and Low Resolution EBSD Data

arXiv cs.LG / 4/27/2026

Key Points

  • The study proposes using a multimodal diffusion model to integrate polarized light (PL) data with low-resolution electron back-scattered diffraction (EBSD) measurements to accelerate EBSD-related microscopy workflows.
  • By training an unconditional multimodal diffusion model to learn the complex dynamics between EBSD and PL, the approach targets inverse problems such as denoising, super-resolution, and grain-boundary prediction.
  • The model is trained only once on synthetic data, yet it generalizes well to real PL/EBSD inputs that may be low-resolution, noisy, corrupted, and misregistered.
  • Inference-time scaling improves performance on several objectives, including grain boundary prediction, super-resolution, and denoising.
  • The authors report little difference from full-resolution performance when using only 25% of the EBSD data (one quarter of the resolution) together with corrupted PL data, which suggests EBSD collection could be substantially reduced.
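
To make the "25% of the EBSD data" figure concrete, here is a minimal sketch of one way a low-resolution EBSD measurement could be simulated. It assumes the reduction comes from keeping every second pixel along each spatial axis; the summary does not say how the authors subsample, and the function names (`subsample_ebsd`, `kept_fraction`) and the three-channel toy array are illustrative only.

```python
import numpy as np


def subsample_ebsd(ebsd: np.ndarray, stride: int = 2) -> np.ndarray:
    """Keep every `stride`-th pixel along both spatial axes of an (H, W, C) map.

    Assumption: this strided pattern is a stand-in for low-resolution
    EBSD acquisition, not the sampling scheme used in the paper.
    """
    return ebsd[::stride, ::stride]


def kept_fraction(full_shape, low_shape) -> float:
    """Fraction of spatial measurements retained after subsampling."""
    return (low_shape[0] * low_shape[1]) / (full_shape[0] * full_shape[1])


rng = np.random.default_rng(0)
# Toy stand-in for an orientation map with 3 channels per pixel.
full = rng.random((64, 64, 3))
low = subsample_ebsd(full, stride=2)

print(low.shape)                             # (32, 32, 3)
print(kept_fraction(full.shape, low.shape))  # 0.25
```

Halving the sampling density along each axis keeps one pixel in four, which is one reading of "25% (1/4 the resolution)".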

Abstract

In spite of the utility of 3-D electron back-scattered diffraction (EBSD) microscopy, the data collection process can be time-consuming with serial-sectioning. Hence, it is natural to look at other modalities, such as polarized light (PL) data, to accelerate EBSD data collection, supplemented with shared information. Complementarily, features in chaotic PL data could even be enriched with a handful of EBSD measurements. To inherently learn the complex dynamics between EBSD and PL to solve these inverse problems, we use an unconditional multimodal diffusion model, motivated by progress in diffusion models for inverse problems. Although trained solely on synthetic data once, our model has strong generalizable capabilities on real data which can be low-resolution, noisy, corrupted, and misregistered. With inference-time scaling, we show gains in performance on a variety of objectives including grain boundary prediction, super-resolution, and denoising. With our model, we demonstrate that there is little difference from full resolution performance with only 25% (1/4 the resolution) of EBSD data and corrupted PL data.
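
The abstract does not say which inference-time scaling scheme is used. As a rough illustration of the general idea, the sketch below uses best-of-N selection: draw several candidates and keep the one most consistent with the low-resolution measurement. Everything here is an assumption for illustration. `toy_sampler` is a placeholder that returns random arrays where a real pipeline would run the diffusion sampler, and `measure` is a hypothetical strided measurement operator.

```python
import numpy as np


def measure(x: np.ndarray, stride: int = 2) -> np.ndarray:
    """Hypothetical measurement operator: strided spatial subsampling."""
    return x[::stride, ::stride]


def toy_sampler(rng: np.random.Generator, shape) -> np.ndarray:
    """Placeholder for one run of a generative sampler (here: uniform noise)."""
    return rng.random(shape)


def best_of_n(y: np.ndarray, n: int, shape, seed: int = 0, stride: int = 2):
    """Draw n candidates and return the one whose simulated measurement
    has the lowest mean squared error against the observation y."""
    rng = np.random.default_rng(seed)
    best, best_err = None, np.inf
    for _ in range(n):
        cand = toy_sampler(rng, shape)
        err = float(np.mean((measure(cand, stride) - y) ** 2))
        if err < best_err:
            best, best_err = cand, err
    return best, best_err


# Toy ground truth and its low-resolution observation.
truth = np.random.default_rng(1).random((16, 16))
y = measure(truth)

_, err_1 = best_of_n(y, n=1, shape=truth.shape)
_, err_8 = best_of_n(y, n=8, shape=truth.shape)
print(err_8 <= err_1)  # True: with the same seed, the first candidate is shared
```

Spending more samples at inference can only lower the selected candidate's measurement error in this setup; whether that translates into better reconstructions depends on the sampler and the selection criterion, which the paper itself would specify.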