Contrastive Image-Metadata Pre-Training for Materials Transmission Electron Microscopy

arXiv cs.LG / 4/29/2026


Key Points

  • The study addresses the problem that most transmission electron microscopy (TEM) images remain unpublished and are rarely reused, despite being accompanied by useful instrument metadata.
  • It introduces a dataset of 7,330 HAADF-STEM images (from a single instrument) paired with metadata, aimed at learning a shared embedding space linking image content/style with acquisition parameters.
  • Using these embeddings, the authors train a generative style-transfer network that can transform experimental images into the styles expected under different instrument settings.
  • The work evaluates the network’s effectiveness and investigates whether the approach can support physical denoising of TEM data.

Abstract

The vast majority of transmission electron microscopy (TEM) data never gets published and ends up on a backup drive until deleted to free up space. These leftover datasets are rich in detail and variation, and are often paired with automatically saved metadata describing instrument state and acquisition parameters. In this work, we introduce a dataset of 7,330 high-angle annular dark-field scanning-TEM (HAADF-STEM) images from a single instrument and use it to learn a joint embedding space between image metadata and HAADF images. These embeddings link image style with acquisition parameters, which allows us to train a generative style transfer network that can convert experimental images into the style they would have had if they had been recorded with different instrument parameters. We evaluate the performance of the network and explore the usefulness of the technique for physical denoising.
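The paper does not spell out its training objective here, but contrastive image–metadata pre-training of this kind typically follows a CLIP-style symmetric cross-entropy loss: matched (image, metadata) embedding pairs are pulled together while mismatched pairs in the batch are pushed apart. The sketch below is a minimal NumPy illustration of that loss under this assumption; the function name, temperature value, and embedding shapes are illustrative, not taken from the paper.

```python
import numpy as np

def contrastive_loss(img_emb, meta_emb, temperature=0.07):
    """CLIP-style symmetric contrastive loss (illustrative sketch).

    img_emb, meta_emb: (batch, dim) arrays where row i of each array
    comes from the same (image, metadata) pair.
    """
    # L2-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    meta = meta_emb / np.linalg.norm(meta_emb, axis=1, keepdims=True)

    # Pairwise similarity logits; matched pairs sit on the diagonal
    logits = img @ meta.T / temperature

    def cross_entropy_diag(l):
        # Softmax cross-entropy with the diagonal as the target class
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Symmetric: image-to-metadata and metadata-to-image directions
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))
```

With a loss like this, embeddings of images taken under similar acquisition settings cluster together, which is what makes the downstream style-transfer conditioning possible.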