PanoSAM2: Lightweight Distortion- and Memory-aware Adaptions of SAM2 for 360 Video Object Segmentation
arXiv cs.CV / 4/10/2026
Key Points
- The paper introduces PanoSAM2, a lightweight framework that adapts SAM2 to the 360 video object segmentation (360VOS) setting while keeping SAM2’s promptable VOS usability.
- It addresses two 360-specific challenges, projection distortion and left-right semantic inconsistency, using a Pano-Aware Decoder with seam-consistent receptive fields and iterative distortion refinement across the 0/360 boundary.
- It incorporates a Distortion-Guided Mask Loss that upweights regions and boundaries with larger distortion magnitudes to improve mask reliability under stretching artifacts.
- To mitigate sparse object information in SAM2’s memory for 360 videos, it adds a Long-Short Memory Module that maintains a compact long-term object pointer to better re-instantiate and align short-term memories, improving temporal coherence.
- Experiments report substantial performance gains over SAM2, including +5.6 on 360VOTS and +6.7 on PanoVOS, indicating the proposed distortion- and memory-aware adaptations are effective.
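The "seam-consistent receptive fields" idea can be illustrated with circular padding along the longitude axis of an equirectangular feature map, so that a convolution sees continuous context across the 0/360 seam instead of a hard image border. This is a minimal sketch of that general technique, not the paper's decoder; the function name and shapes are assumptions.

```python
import numpy as np

def wrap_pad_longitude(feat, pad):
    """Circularly pad a (H, W) equirectangular feature map along the
    longitude (width) axis: the left edge is continued by columns from
    the right edge and vice versa, so a subsequent convolution has a
    seam-consistent receptive field at the 0/360 boundary.
    Illustrative sketch only, not PanoSAM2's implementation."""
    left = feat[:, -pad:]   # columns just left of the seam
    right = feat[:, :pad]   # columns just right of the seam
    return np.concatenate([left, feat, right], axis=1)
```

Applying a convolution with `'valid'` boundary handling to the padded map then behaves as if the panorama wrapped around, which is the property a seam-consistent decoder needs.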
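The Distortion-Guided Mask Loss can be approximated by noting that horizontal stretching in an equirectangular projection grows roughly as 1/cos(latitude), so pixels nearer the poles can be given larger loss weights. The sketch below, with a clipped per-row weight map and a weighted binary cross-entropy, is a hedged illustration of that weighting idea under these assumptions, not the paper's exact loss.

```python
import numpy as np

def distortion_weight_map(height, width, clip=4.0):
    """Per-pixel weights for an equirectangular mask loss. Stretching
    scales as ~1/cos(latitude), so high-latitude rows get larger
    weights (clipped for stability, then normalized to mean 1).
    Assumed formulation for illustration only."""
    lat = (np.arange(height) + 0.5) / height * np.pi - np.pi / 2.0
    w = np.minimum(1.0 / np.cos(lat), clip)       # per-row stretch factor
    w /= w.mean()                                  # keep loss scale comparable
    return np.repeat(w[:, None], width, axis=1)    # broadcast to (H, W)

def weighted_bce(pred, target, weights, eps=1e-7):
    """Binary cross-entropy averaged with per-pixel distortion weights."""
    p = np.clip(pred, eps, 1 - eps)
    bce = -(target * np.log(p) + (1 - target) * np.log(1 - p))
    return float((weights * bce).mean())
```

Upweighting the stretched rows pushes the optimizer to spend capacity where equirectangular distortion makes masks least reliable, matching the stated goal of the loss.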
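One way to picture a "compact long-term object pointer" steering a short-term memory bank is an exponential moving average of per-frame object features that scores which short-term frames to keep. The class below is a toy sketch under that assumption; the names, EMA update, and cosine-similarity eviction rule are all illustrative, not the paper's Long-Short Memory Module.

```python
import numpy as np

class LongShortMemory:
    """Toy long-short memory: a single long-term object pointer
    (EMA of per-frame features) plus a bounded short-term bank that
    evicts the frame least aligned with the pointer. Hypothetical
    mechanics for illustration, not PanoSAM2's module."""

    def __init__(self, dim, capacity=4, momentum=0.9):
        self.pointer = np.zeros(dim)   # compact long-term object summary
        self.frames = []               # short-term memory bank
        self.capacity = capacity
        self.momentum = momentum

    def update(self, feat):
        # refresh the long-term pointer with an exponential moving average
        self.pointer = self.momentum * self.pointer + (1 - self.momentum) * feat
        self.frames.append(feat)
        if len(self.frames) > self.capacity:
            # evict the short-term frame least consistent with the pointer
            sims = [self._cos(f, self.pointer) for f in self.frames]
            self.frames.pop(int(np.argmin(sims)))

    @staticmethod
    def _cos(a, b, eps=1e-8):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))
```

The point of the sketch is the division of labor: the pointer stays small and stable over long horizons, while the short-term bank holds only frames that remain aligned with it, which is one plausible way to improve temporal coherence when per-frame object evidence is sparse.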