QYOLO: Lightweight Object Detection via Quantum-Inspired Shared Channel Mixing

arXiv cs.AI / 4/30/2026


Key Points

  • The paper proposes QYOLO, a lightweight object detection approach that compresses a YOLO-style backbone by replacing the two deepest C2f modules with a quantum-inspired channel mixing block (QMixBlock).
  • QMixBlock uses sinusoidal global channel recalibration with shared learnable parameters across both backbone stages (P4/16 and P5/32), reducing parameters without needing stage-specific parameter sets.
  • The neck and detection head remain fully classical, so the method targets computational savings primarily in the backbone where channel width scaling drives most of the overhead.
  • Experiments on VisDrone2019 show that QYOLOv8n reduces parameters by 20.2% (3.01M→2.40M) and GFLOPs by 12.3% with only a 0.4 pp drop in mAP@50, and QYOLOv8s reduces parameters by 21.8% with just 0.1 pp degradation.
  • Adding knowledge distillation recovers full accuracy parity without sacrificing compression, while a larger backbone+neck variant achieves even higher compression (38–41%) at the cost of greater accuracy loss, motivating the final backbone-only design.
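The shared sinusoidal recalibration described above can be sketched in plain Python. The exact gating formula is an assumption for illustration (the paper only states that QMixBlock applies sinusoidal global channel mixing with learnable parameters shared across the P4/16 and P5/32 stages); the names `qmix_gates` and `qmix_recalibrate` are hypothetical.

```python
import math

def qmix_gates(num_channels, alpha, phi):
    """Sinusoidal per-channel gates in [0, 1] from two shared scalars.

    alpha and phi stand in for the shared learnable parameters; the
    specific sin-based form below is an illustrative assumption.
    """
    return [
        0.5 * (1.0 + math.sin(alpha * (c / num_channels) + phi))
        for c in range(num_channels)
    ]

def qmix_recalibrate(features, alpha, phi):
    """Scale each channel's activations by its gate.

    features: list of per-channel activation lists (a stand-in for a
    channels-first feature map).
    """
    gates = qmix_gates(len(features), alpha, phi)
    return [[g * x for x in channel] for g, channel in zip(gates, features)]

# The same (alpha, phi) pair is reused at both backbone stages, which is
# what removes the need for stage-specific parameter sets:
alpha, phi = 2.0 * math.pi, 0.0
p4_out = qmix_recalibrate([[1.0, 2.0]] * 512, alpha, phi)   # 512-ch P4/16
p5_out = qmix_recalibrate([[1.0]] * 1024, alpha, phi)       # 1024-ch P5/32
```

Because the gate vector is generated from a couple of shared scalars rather than stored per channel, the parameter cost of this block is essentially constant in channel width, unlike the C2f modules it replaces.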

Abstract

The rapid advancement of object detection architectures has positioned single-stage detectors as the dominant solution for real-time visual perception. A primary source of computational overhead in these models lies in the deep backbone stages, where C2f bottleneck modules at high stride levels accumulate a disproportionate share of parameters due to quadratic scaling with channel width. This work introduces QYOLO, a quantum-inspired channel mixing framework that achieves genuine architectural compression by replacing the two deepest backbone C2f modules at P4/16 (512 channels) and P5/32 (1024 channels) with a compact QMixBlock. The proposed block performs global channel recalibration through a sinusoidal mixing mechanism with shared learnable parameters across both backbone stages, enforcing consistent channel importance without requiring independent per-stage parameter sets. The neck and detection head remain fully classical and unchanged. Evaluation on the VisDrone2019 benchmark demonstrates that QYOLOv8n achieves a 20.2% reduction in parameter count (3.01M to 2.40M) and a 12.3% GFLOPs reduction with only 0.4 pp mAP@50 degradation. QYOLOv8s achieves a 21.8% reduction with 0.1 pp degradation. When combined with knowledge distillation, full accuracy parity is recovered at no cost to compression. An expanded backbone-plus-neck variant achieved a 38–41% reduction at the cost of greater accuracy degradation, motivating the backbone-only final design.
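The "quadratic scaling with channel width" claim in the abstract follows directly from the standard convolution parameter count (weights ≈ C_in × C_out × k²). A quick check with the channel widths the abstract cites shows why the deepest stages dominate:

```python
def conv_params(c_in, c_out, k=3):
    """Parameter count of a k×k convolution, bias omitted."""
    return c_in * c_out * k * k

# Channel widths from the abstract: 512 at P4/16, 1024 at P5/32.
p4 = conv_params(512, 512)      # 2,359,296 weights
p5 = conv_params(1024, 1024)    # 9,437,184 weights

# Doubling the channel width quadruples the cost, so the two deepest
# stages carry a disproportionate share of the backbone's parameters:
assert p5 == 4 * p4
```

This is why replacing only these two modules with a near-parameter-free recalibration block already yields the roughly 20% whole-model reductions reported for QYOLOv8n and QYOLOv8s.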