OASIC: Occlusion-Agnostic and Severity-Informed Classification

arXiv cs.CV / 4/7/2026


Key Points

  • Heavy occlusion of objects is framed as difficult due to two root causes: the loss of visible information, and the distracting patterns introduced by the occluders.
  • The proposed OASIC masks occlusion patterns at test time to remove the distracting information, treating occlusion, regardless of its type, as a visual anomaly with respect to the object of interest and suppressing it.
  • During training, random parts of the object are masked at multiple severity levels to cope with the reduced visual information under occlusion.
  • At test time, the occlusion severity is estimated and the model optimized for that severity is selected, aiming to outperform any single model specialized for one occlusion condition.
  • Experiments report that combining gray masking with adaptive model selection improves AUC_occ by +18.5 over standard training on occluded images, and by +23.7 over finetuning on unoccluded images.
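The test-time procedure summarized in the bullets above can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the function names, the pixel-fraction severity estimate, and the per-severity model dictionary are all assumptions made for the example.

```python
import numpy as np

def estimate_severity(occ_mask: np.ndarray) -> float:
    """Estimate occlusion severity as the fraction of pixels flagged as occluded
    (assumed proxy; the paper's actual estimator may differ)."""
    return float(occ_mask.mean())

def gray_mask(image: np.ndarray, occ_mask: np.ndarray, gray: float = 0.5) -> np.ndarray:
    """Replace occluded pixels with a neutral gray value ('gray masking')."""
    out = image.copy()
    out[occ_mask.astype(bool)] = gray
    return out

def select_model(severity: float, models_by_level: dict):
    """Pick the classifier trained for the nearest severity level.
    models_by_level maps a severity (e.g. 0.1, 0.3, 0.5) to a trained model."""
    levels = np.array(sorted(models_by_level))
    nearest = levels[np.argmin(np.abs(levels - severity))]
    return models_by_level[float(nearest)]

def classify_occluded(image, occ_mask, models_by_level):
    """OASIC-style test-time pipeline (sketch):
    1) estimate severity, 2) gray-mask the occluder, 3) run the matching model."""
    severity = estimate_severity(occ_mask)
    masked = gray_mask(image, occ_mask)
    model = select_model(severity, models_by_level)
    return model(masked)
```

The occlusion mask itself would come from an anomaly-detection step (flagging pixels that deviate visually from the object class), which is what makes the masking occlusion-agnostic.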

Abstract

Severe occlusions of objects pose a major challenge for computer vision. We show that two root causes are (1) the loss of visible information and (2) the distracting patterns caused by the occluders. Our approach addresses both causes at the same time. First, the distracting patterns are removed at test time via masking of the occluding patterns. This masking is independent of the type of occlusion, since it handles the occlusion through the lens of visual anomalies w.r.t. the object of interest. Second, to deal with fewer visual details, we follow standard practice by masking random parts of the object during training, for various degrees of occlusion. We discover that (a) it is possible to estimate the degree of occlusion (i.e. severity) at test time, and (b) a model optimized for a specific degree of occlusion also performs best on a similar degree at test time. Combining these two insights brings us to a severity-informed classification model called OASIC: Occlusion-Agnostic and Severity-Informed Classification. We estimate the severity of occlusion for a test image, mask the occluder, and select the model that is optimized for that degree of occlusion. This strategy performs better than any single model optimized for any smaller or broader range of occlusion severities. Experiments show that combining gray masking with adaptive model selection improves AUC_occ by +18.5 over standard training on occluded images and +23.7 over finetuning on unoccluded images.
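The training-time side, masking random parts of the object at a target severity, could look like the sketch below. The patch size, gray value, and the specific severity levels are assumptions for illustration; the paper's exact augmentation recipe may differ.

```python
import numpy as np

def random_severity_mask(image: np.ndarray, severity: float, patch: int = 4,
                         gray: float = 0.5, rng=None) -> np.ndarray:
    """Gray out random square patches until roughly `severity` fraction of the
    pixels is covered (hypothetical training augmentation, not the paper's code)."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    out = image.copy()
    covered = np.zeros((h, w), dtype=bool)
    target = severity * h * w
    while covered.sum() < target:
        y = int(rng.integers(0, max(1, h - patch + 1)))
        x = int(rng.integers(0, max(1, w - patch + 1)))
        covered[y:y + patch, x:x + patch] = True
        out[y:y + patch, x:x + patch] = gray
    return out

# One model would then be trained per severity level, e.g.:
SEVERITY_LEVELS = [0.1, 0.3, 0.5, 0.7]  # assumed grid, not from the paper
```

At test time, the estimated severity selects among the models trained on these levels, matching the train-time and test-time degradation.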