Before Forgetting, Learn to Remember: Revisiting Foundational Learning Failures in LVLM Unlearning Benchmarks
arXiv cs.CV / 5/6/2026
Key Points
- The paper argues that existing LVLM unlearning benchmarks can yield unreliable results because they assume the model has first memorized the target information, whereas many models fail at this initial memorization step.
- It identifies two key causes of this “stage 1 failure,” namely under-memorization and a “multi-hop curse,” which prevent accurate diagnosis of unlearning behavior.
- To address the problem, the authors introduce ReMem, a Reliable Multi-hop and Multi-image Memorization Benchmark designed to make foundational learning robust via principled data scaling, reasoning-aware question-answer pairs, and diverse visual contexts.
- The work also proposes an “Exposure” metric to measure how deeply information is erased in the model’s internal probability distribution.
- Experiments show that ReMem provides a more rigorous and trustworthy framework for evaluating both learning and unlearning in large vision-language models.
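To make the "Exposure" idea concrete, here is a minimal sketch of a rank-based exposure score over a model's output distribution, in the spirit of Carlini et al.'s canary-exposure metric. The paper's exact definition is not reproduced here; the function name, inputs, and formula below are assumptions for illustration only.

```python
import math

def exposure(target_logprob: float, candidate_logprobs: list[float]) -> float:
    """Hypothetical rank-based exposure score (assumed formulation).

    A high score means the target answer still ranks near the top of the
    model's probability distribution; a score near zero means the target
    is indistinguishable from distractor candidates, i.e. more deeply
    "erased" after unlearning.
    """
    # Rank of the target among all candidates (1 = most probable).
    rank = 1 + sum(lp > target_logprob for lp in candidate_logprobs)
    total = len(candidate_logprobs) + 1
    # exposure = log2(#candidates) - log2(rank), as in canary-style metrics.
    return math.log2(total) - math.log2(rank)

# Before unlearning: the target answer is the most probable candidate.
before = exposure(-1.0, [-5.0, -2.0, -8.0])  # rank 1 of 4 -> 2.0
# After unlearning: the target drops below every distractor.
after = exposure(-9.0, [-5.0, -2.0, -8.0])   # rank 4 of 4 -> 0.0
```

Under this sketch, comparing the score before and after the unlearning procedure indicates how deeply the information was removed from the distribution, rather than merely whether the model's top answer changed.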