FryNet: Dual-Stream Adversarial Fusion for Non-Destructive Frying Oil Oxidation Assessment

arXiv cs.CV / 4/24/2026


Key Points

  • The paper addresses a key limitation of current frying-oil oxidation monitoring, noting that wet-chemistry assays are destructive, lack spatial information, and cannot support real-time use.
  • It identifies a “camera-fingerprint shortcut” in thermal-image inspection models, where networks overfit to sensor-specific noise and thermal bias rather than learning oxidation chemistry, causing performance to collapse under video-disjoint evaluation (testing on videos unseen during training).
  • The proposed FryNet uses a dual-stream RGB–thermal architecture to segment the oil region, classify serviceability, and regress four oxidation-related indices (peroxide value (PV), p-anisidine value (p-AV), Totox, and temperature) in a single forward pass.
  • FryNet’s design combines a ThermalMiT-B2 backbone with attention, an RGB-MAE encoder trained with masked autoencoding and chemical alignment, and a dual-encoder DANN adversarial regularization (via Gradient Reversal Layers) plus FiLM fusion to connect thermal structure with RGB chemical context.
  • On 7,226 paired frames from 28 frying videos, FryNet reports strong results: 98.97% mIoU for segmentation, 100% classification accuracy, and 2.32 mean regression MAE, outperforming all seven baselines and remaining robust under video-disjoint evaluation.
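
The adversarial regularization described above hinges on a Gradient Reversal Layer (GRL): it acts as the identity in the forward pass, but negates and scales gradients on the way back, so the encoder is pushed to make video identity unpredictable to the domain classifier. A minimal PyTorch sketch of a GRL (class and variable names are illustrative, not taken from the paper):

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lam in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse and scale the gradient flowing to the encoder; no grad for lam.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# Toy check: features pass through unchanged, but the gradient is sign-flipped.
x = torch.ones(3, requires_grad=True)
y = grad_reverse(x, lam=0.5)
y.sum().backward()
print(y.detach().tolist())  # [1.0, 1.0, 1.0] (identity forward)
print(x.grad.tolist())      # [-0.5, -0.5, -0.5] (reversed, scaled gradient)
```

In a DANN-style setup, the domain (here, video-identity) classifier sits after this layer, so minimizing its loss simultaneously trains the encoder to remove the domain cue.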

Abstract

Monitoring frying oil degradation is critical for food safety, yet current practice relies on destructive wet-chemistry assays that provide no spatial information and are unsuitable for real-time use. We identify a fundamental obstacle in thermal-image-based inspection: the camera-fingerprint shortcut, whereby models memorize sensor-specific noise and thermal bias instead of learning oxidation chemistry, collapsing under video-disjoint evaluation. We propose FryNet, a dual-stream RGB-thermal framework that jointly performs oil-region segmentation, serviceability classification, and regression of four chemical oxidation indices (PV, p-AV, Totox, temperature) in a single forward pass. A ThermalMiT-B2 backbone with channel and spatial attention extracts thermal features, while an RGB-MAE Encoder learns chemically grounded representations via masked autoencoding and chemical alignment. Dual-Encoder DANN adversarially regularizes both streams against video identity via Gradient Reversal Layers, and FiLM fusion bridges thermal structure with RGB chemical context. On 7,226 paired frames across 28 frying videos, FryNet achieves 98.97% mIoU, 100% classification accuracy, and 2.32 mean regression MAE, outperforming all seven baselines.
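
The FiLM fusion mentioned in the abstract conditions one stream's feature maps on the other through a learned per-channel scale (gamma) and shift (beta). A minimal NumPy sketch of the FiLM operation as it might bridge RGB chemical context into the thermal stream (all names, shapes, and the linear conditioning heads are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def film(thermal, cond, W_gamma, W_beta):
    """FiLM: per-channel affine modulation of `thermal` (B, C, H, W),
    with gamma and beta predicted from the conditioning vector `cond` (B, D)."""
    gamma = cond @ W_gamma  # (B, C) per-channel scale
    beta = cond @ W_beta    # (B, C) per-channel shift
    # Broadcast the (B, C) modulation over the spatial dimensions.
    return gamma[:, :, None, None] * thermal + beta[:, :, None, None]

# Toy shapes: 2 frames, 4 thermal channels, 3x3 spatial, 5-dim RGB context.
rng = np.random.default_rng(0)
thermal = rng.standard_normal((2, 4, 3, 3))
cond = rng.standard_normal((2, 5))
W_gamma = rng.standard_normal((5, 4))
W_beta = rng.standard_normal((5, 4))
out = film(thermal, cond, W_gamma, W_beta)
print(out.shape)  # (2, 4, 3, 3)
```

Because gamma and beta are computed from the RGB stream, the thermal features are re-weighted channel by channel according to the chemical context, which is the intuition behind using FiLM rather than plain concatenation.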