FractalMamba++: Scaling Vision Mamba Across Resolutions via Hilbert Fractal Geometry
arXiv cs.CV / 5/6/2026
Key Points
- The paper addresses a key limitation of Vision Mamba: performance can degrade when 2D patch grids are serialized into a 1D recurrence, especially at inference resolutions larger than the training grid.
- It introduces FractalMamba++, which uses Hilbert curve–based fractal serialization to better preserve spatial locality across resolutions, improving neighborhood consistency compared with raster/linear scans.
- The model adds a Fractal Hierarchy Skip Connection (FHSC) that injects long-range state using deterministic routes derived from Hilbert recursion, reducing long-sequence information fading without runtime search or custom CUDA kernels.
- It further incorporates Fractal-Aware 2D Rotary Position Encoding (FA-RoPE) to tie positional interactions to true 2D proximity and fractal hierarchy level rather than the serialized 1D distance.
- Experiments on ImageNet classification, COCO detection/segmentation, ADE20K semantic segmentation, and LEVIR-CD+ change detection show FractalMamba++ outperforming existing Mamba-based vision backbones, with the largest gains at high-resolution inputs.
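The core idea behind the Hilbert serialization is that the curve visits an n×n grid so that consecutive 1D indices are always adjacent 2D cells, unlike a raster scan that jumps a full row width at each line break. The summary above does not specify FractalMamba++'s exact serialization code, so the following is a minimal sketch of the standard Hilbert index-to-coordinate conversion (the classic `d2xy` routine); the function names `hilbert_d2xy` and `hilbert_order` are illustrative, not from the paper.

```python
def hilbert_d2xy(n, d):
    """Map a 1D Hilbert index d to (x, y) on an n x n grid (n a power of 2)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:            # rotate the quadrant when moving horizontally
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x        # swap axes
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_order(n):
    """Serialize an n x n patch grid along the Hilbert curve."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]
```

Every consecutive pair in `hilbert_order(n)` is a 2D neighbor (Manhattan distance 1), which is the locality property the paper exploits: a model trained at one grid size sees the same neighborhood structure when the curve is recursively extended to a larger grid at inference time.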