AusSmoke meets MultiNatSmoke: a fully-labelled diverse smoke segmentation dataset

arXiv cs.CV / 4/28/2026


Key Points

  • The paper introduces AusSmoke, a new Australia-collected, fully-labelled dataset aimed at addressing data scarcity for wildfire smoke segmentation in that region.
  • It also proposes MultiNatSmoke, a geographically diverse, fully-labelled benchmark that combines publicly available international datasets with the newly collected Australian images and is described as an order of magnitude larger than previous collections.
  • Existing wildfire smoke segmentation datasets are described as limited in scale, geographically narrow, and sometimes dependent on synthetic imagery, which reduces training effectiveness and generalization.
  • The authors benchmark smoke segmentation models and report improved performance and better generalization across different geographic contexts.
  • The project is available on GitHub (https://github.com/henryzhao0615/MultiNatSmoke) to support model training and evaluation.

Abstract

Wildfires are an escalating global concern due to their devastating impacts on the environment, economy, and human health, with notable incidents such as the 2019-2020 Australian bushfires and the 2025 California wildfires underscoring the severity of these events. AI-enabled camera-based smoke detection has emerged as a promising approach for the rapid detection of wildfires. However, existing wildfire smoke segmentation datasets used for training detection and segmentation models are limited in scale, geographically constrained, and often rely on synthetic imagery, which hinders effective training and generalization. To overcome these limitations, we present AusSmoke, a new smoke segmentation dataset collected in Australia to address the data scarcity in this region. Furthermore, we introduce MultiNatSmoke, a multinational, geographically diverse, and substantially larger fully-labelled benchmark that consolidates publicly available international datasets with the newly collected Australian imagery, expanding the scale by an order of magnitude over previous collections. Finally, we benchmark smoke segmentation models, demonstrating improved performance and enhanced generalization across diverse geographical contexts. The project is available on GitHub: https://github.com/henryzhao0615/MultiNatSmoke
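
To make "generalization across diverse geographical contexts" concrete, the following is a minimal sketch of how segmentation quality could be scored per region. It assumes intersection-over-union (IoU) on binary smoke masks as the metric, which this summary does not confirm; the function names, the region labels, and the toy masks are all hypothetical and are not taken from the MultiNatSmoke code.

```python
from collections import defaultdict


def binary_iou(pred, target):
    """IoU between two binary masks given as nested lists of 0/1 values."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Both masks empty: count as perfect agreement.
    # This is a common convention, assumed here, not taken from the paper.
    return 1.0 if union == 0 else inter / union


def mean_iou_by_region(samples):
    """Average IoU per region.

    samples: iterable of (region, pred_mask, target_mask) tuples.
    """
    scores = defaultdict(list)
    for region, pred, target in samples:
        scores[region].append(binary_iou(pred, target))
    return {region: sum(vals) / len(vals) for region, vals in scores.items()}


# Toy 2x2 masks with hypothetical region labels.
samples = [
    ("australia", [[1, 1], [0, 0]], [[1, 0], [0, 0]]),
    ("australia", [[0, 0], [0, 0]], [[0, 0], [0, 0]]),
    ("international", [[1, 1], [1, 1]], [[1, 1], [1, 0]]),
]
per_region = mean_iou_by_region(samples)
print(per_region)
```

Reporting one score per region, rather than a single pooled average, is what lets a gap between regions show up; a model trained on a geographically narrow dataset might score well on its home region and noticeably worse elsewhere.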