Evolving Contextual Safety in Multi-Modal Large Language Models via Inference-Time Self-Reflective Memory
arXiv cs.CL / 3/18/2026
Key Points
- The authors introduce MM-SafetyBench++, a benchmark for evaluating contextual safety in multi-modal LLMs, built by creating safe counterexamples for unsafe image-text pairs while preserving the underlying context.
- They propose EchoSafe, a training-free framework that uses a self-reflective memory bank to accumulate and retrieve safety insights from prior interactions, guiding context-aware safety decisions during inference.
- Extensive experiments show EchoSafe improves contextual safety across multiple multi-modal safety benchmarks and establishes a strong baseline for safety evolution in MLLMs.
- The benchmark data and code are publicly available at the provided URL.
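The memory-bank mechanism described above (accumulate safety insights from prior interactions, then retrieve the most relevant ones to guide inference) can be illustrated with a minimal, training-free sketch. This is not the paper's implementation: the `MemoryBank` class, its token-overlap (Jaccard) retrieval, and all example strings are assumptions for illustration; the actual EchoSafe framework would use model-generated reflections and richer similarity scoring.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryBank:
    """Hypothetical self-reflective memory bank: stores (context, insight)
    pairs and retrieves insights whose context best matches a new query."""
    entries: list = field(default_factory=list)

    def add(self, context: str, insight: str) -> None:
        # Store the context as a token set alongside the distilled insight.
        self.entries.append((set(context.lower().split()), insight))

    def retrieve(self, query: str, k: int = 1) -> list:
        # Rank stored insights by Jaccard similarity between the query
        # tokens and each stored context's tokens (a stand-in for the
        # embedding-based retrieval a real system would use).
        q = set(query.lower().split())
        scored = sorted(
            self.entries,
            key=lambda e: len(q & e[0]) / (len(q | e[0]) or 1),
            reverse=True,
        )
        return [insight for _, insight in scored[:k]]

# Accumulate insights from two prior (hypothetical) interactions.
bank = MemoryBank()
bank.add("image shows knife near person", "flag potential weapon context")
bank.add("recipe photo with kitchen knife", "benign culinary context is safe")

# At inference time, retrieve the insight for a new, similar context.
top = bank.retrieve("photo of a kitchen knife in a recipe")
```

Retrieved insights would then be prepended to the model's prompt so that safety decisions are conditioned on context learned from earlier interactions, without any parameter updates.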
