CLASP: Class-Adaptive Layer Fusion and Dual-Stage Pruning for Multimodal Large Language Models
arXiv cs.CV / 4/15/2026
Key Points
- The paper introduces CLASP, a plug-and-play token reduction framework to cut the heavy compute cost of multimodal LLMs caused by redundant visual tokens.
- CLASP performs class-adaptive, multi-layer visual feature fusion to build category-specific representations that are conditioned on prompts/instructions.
- It uses dual-stage pruning by splitting the token budget into attention-salient pivot tokens (relevance) and redundancy-aware completion tokens (coverage).
- Experiments on multiple benchmarks show CLASP improves performance over existing pruning approaches across varying pruning ratios and MLLM architectures.
- The authors state that the code will be released publicly on GitHub, enabling adoption and evaluation by others.
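The dual-stage pruning described above can be sketched roughly as follows. This is a hypothetical illustration, not the paper's implementation: the function name `dual_stage_prune`, the `pivot_frac` split, and the greedy least-similar selection for completion tokens are all our assumptions about how "relevance plus coverage" might be realized.

```python
import torch

def dual_stage_prune(tokens, attn_scores, budget, pivot_frac=0.5):
    """Hypothetical sketch of CLASP-style dual-stage token pruning.

    tokens:      (N, D) visual token embeddings
    attn_scores: (N,) attention saliency of each token w.r.t. the prompt
    budget:      total number of visual tokens to keep
    pivot_frac:  fraction of the budget spent on attention-salient pivots
                 (this split parameter is an assumption, not from the paper)
    """
    n_pivot = min(int(budget * pivot_frac), tokens.shape[0])

    # Stage 1: keep the most attention-salient "pivot" tokens (relevance).
    pivot_idx = torch.topk(attn_scores, n_pivot).indices
    kept = set(pivot_idx.tolist())

    # Stage 2: fill the remaining budget with redundancy-aware "completion"
    # tokens: greedily add the token least similar (in cosine space) to any
    # already-kept token, so the selection covers diverse image content.
    feats = torch.nn.functional.normalize(tokens, dim=-1)
    while len(kept) < budget:
        kept_feats = feats[list(kept)]            # (K, D)
        sim = feats @ kept_feats.T                # (N, K) cosine similarities
        max_sim = sim.max(dim=1).values           # redundancy score per token
        max_sim[list(kept)] = float("inf")        # never re-pick a kept token
        kept.add(int(max_sim.argmin()))
    idx = torch.tensor(sorted(kept))
    return tokens[idx], idx
```

Under this sketch, the retained set interpolates between pure relevance (`pivot_frac=1.0`, plain top-k attention pruning) and pure coverage (`pivot_frac` near 0, farthest-point-style diversity selection).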