PCA-Seg: Revisiting Cost Aggregation for Open-Vocabulary Semantic and Part Segmentation
arXiv cs.CV / 3/19/2026
📰 News · Models & Research
Key Points
- PCA-Seg introduces parallel cost aggregation to alleviate knowledge interference between class-level semantics and spatial context in open-vocabulary semantic and part segmentation.
- It features an expert-driven perceptual learning (EPL) module with a multi-expert parser to fuse semantic and contextual features and a coefficient mapper that learns pixel-specific weights for adaptive feature integration.
- A feature orthogonalization decoupling (FOD) strategy reduces redundancy between semantic and contextual streams, enabling learning from orthogonalized, complementary knowledge.
- Across eight benchmarks, PCA-Seg achieves state-of-the-art open-vocabulary semantic and part segmentation (OSPS) performance, with each parallel block adding only about 0.35M parameters.
- The approach offers a lightweight, scalable path to improved vision-language alignment in open-vocabulary segmentation tasks.
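The coefficient-mapper fusion and the orthogonalization decoupling described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, shapes, and the simple per-pixel projection used for decoupling are all illustrative assumptions; the sigmoid-gated linear map stands in for the learned coefficient mapper, and a per-pixel vector rejection stands in for the FOD strategy.

```python
import numpy as np

def coefficient_mapper(sem, ctx, w_proj, b_proj):
    # Illustrative pixel-specific gating (assumption, not the paper's module):
    # a linear map over the concatenated features followed by a sigmoid
    # yields one fusion weight per pixel. sem/ctx have shape (H, W, D).
    x = np.concatenate([sem, ctx], axis=-1)      # (H, W, 2D)
    logits = x @ w_proj + b_proj                 # (H, W, 1)
    return 1.0 / (1.0 + np.exp(-logits))         # per-pixel weight in (0, 1)

def orthogonal_decouple(sem, ctx, eps=1e-8):
    # Illustrative decoupling: project the contextual stream onto the
    # semantic stream at each pixel and subtract, leaving a component
    # orthogonal to the semantics (a stand-in for FOD).
    dot = np.sum(ctx * sem, axis=-1, keepdims=True)
    norm = np.sum(sem * sem, axis=-1, keepdims=True) + eps
    return ctx - (dot / norm) * sem

def parallel_fuse(sem, ctx, w_proj, b_proj):
    # Fuse the semantic stream with the decoupled contextual stream
    # using the learned per-pixel coefficients.
    ctx_orth = orthogonal_decouple(sem, ctx)
    w = coefficient_mapper(sem, ctx_orth, w_proj, b_proj)
    return w * sem + (1.0 - w) * ctx_orth

# Toy usage on random features (H=4, W=4, D=8).
rng = np.random.default_rng(0)
H, W, D = 4, 4, 8
sem = rng.standard_normal((H, W, D))
ctx = rng.standard_normal((H, W, D))
w_proj = rng.standard_normal((2 * D, 1)) * 0.1
b_proj = np.zeros(1)
fused = parallel_fuse(sem, ctx, w_proj, b_proj)
```

After decoupling, the contextual component is orthogonal to the semantic vector at every pixel, so the gated sum mixes complementary rather than redundant information, which is the intuition the FOD bullet points at.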