Adaptive Slicing-Assisted Hyper Inference for Enhanced Small Object Detection in High-Resolution Imagery
arXiv cs.CV / 4/22/2026
Key Points
- The paper introduces Adaptive Slicing-Assisted Hyper Inference (ASAHI) to improve small-object detection in high-resolution aerial/satellite imagery, where dense scenes and tiny targets make existing detectors struggle.
- Unlike fixed patch slicing, ASAHI adaptively chooses the number of overlapping slices (6 or 12) based on image resolution using a learned threshold, aiming to cut redundant computation.
- It includes slicing-assisted fine-tuning (SAF) by training on both full-resolution images and sliced patches to preserve detection quality while benefiting from larger effective receptive fields.
- For crowded scenes, ASAHI uses Cluster-DIoU-NMS (CDN) to merge detections efficiently and suppress duplicates using center-distance-aware DIoU logic.
- Experiments on VisDrone2019 and xView report state-of-the-art results (56.8% on VisDrone2019-DET-val and 22.7% on xView-test) alongside a 20–25% inference-time reduction versus the SAHI baseline.
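The adaptive slicing idea in the second point can be sketched as a resolution-gated grid: small images get 6 overlapping slices, large ones 12. The pixel threshold, the 2×3 / 3×4 grid shapes, and the overlap ratio below are illustrative assumptions, not the paper's learned values.

```python
import math

def adaptive_slice_grid(width, height, res_threshold=1_500_000, overlap=0.2):
    """Pick a slice grid from image resolution, in the spirit of ASAHI's
    adaptive slicing: 6 slices below the threshold, 12 above it.
    Threshold, grid shapes, and overlap ratio are illustrative only."""
    rows, cols = (2, 3) if width * height <= res_threshold else (3, 4)
    # Size slices so that, with the given overlap, the grid spans the image.
    slice_w = math.ceil(width / (cols - (cols - 1) * overlap))
    slice_h = math.ceil(height / (rows - (rows - 1) * overlap))
    step_x = int(slice_w * (1 - overlap))
    step_y = int(slice_h * (1 - overlap))
    boxes = []
    for r in range(rows):
        for c in range(cols):
            # Clamp the last row/column so slices never run off the image.
            x0 = max(0, min(c * step_x, width - slice_w))
            y0 = max(0, min(r * step_y, height - slice_h))
            boxes.append((x0, y0,
                          min(x0 + slice_w, width),
                          min(y0 + slice_h, height)))
    return boxes
```

Each slice would then be run through the detector at full input resolution, with per-slice detections mapped back to image coordinates before merging.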
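The DIoU logic behind the Cluster-DIoU-NMS point can be illustrated with a plain greedy NMS keyed on DIoU, which penalizes IoU by the normalized center distance so that nearby-but-distinct boxes in crowded scenes are less likely to be merged. This is a minimal sketch of the DIoU-NMS step only; ASAHI's CDN adds a clustering stage on top, which is omitted here, and the threshold is an assumed value.

```python
def diou(box_a, box_b):
    # Boxes are (x0, y0, x1, y1).
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    ix0, iy0 = max(ax0, bx0), max(ay0, by0)
    ix1, iy1 = min(ax1, bx1), min(ay1, by1)
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = ((ax1 - ax0) * (ay1 - ay0)
             + (bx1 - bx0) * (by1 - by0) - inter)
    iou = inter / union if union else 0.0
    # Squared distance between box centers.
    d2 = ((ax0 + ax1 - bx0 - bx1) ** 2 + (ay0 + ay1 - by0 - by1) ** 2) / 4
    # Squared diagonal of the smallest enclosing box.
    c2 = ((max(ax1, bx1) - min(ax0, bx0)) ** 2
          + (max(ay1, by1) - min(ay0, by0)) ** 2)
    return iou - d2 / c2 if c2 else iou

def diou_nms(dets, threshold=0.5):
    """Greedy NMS on (score, box) pairs: suppress a detection whose DIoU
    with any kept, higher-scoring box exceeds the threshold."""
    keep = []
    for score, box in sorted(dets, reverse=True):
        if all(diou(box, kept_box) <= threshold for _, kept_box in keep):
            keep.append((score, box))
    return keep
```

Because the center-distance term is subtracted from IoU, two boxes with identical overlap but far-apart centers score lower and are more likely to both survive, which is the behavior the paper relies on for dense scenes.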