IGLOSS: Image Generation for Lidar Open-vocabulary Semantic Segmentation

arXiv cs.CV / 4/3/2026


Key Points

  • IGLOSS introduces a new zero-shot open-vocabulary semantic segmentation method tailored to 3D automotive lidar point clouds.
  • Instead of relying on VLMs like CLIP that suffer from an image-text modality gap, the approach generates prototype images from text to bridge the modalities.
  • The system uses a 3D network distilled from a 2D vision foundation model, then assigns labels by matching 3D point features with 2D features extracted from the generated prototypes.
  • The paper reports state-of-the-art performance for OVSS on nuScenes and SemanticKITTI datasets.
  • The authors provide code, pre-trained models, and generated images publicly via a GitHub repository.
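The label-assignment step described above, matching each 3D point's distilled feature against 2D features of the generated prototype images, can be sketched as a nearest-prototype lookup under cosine similarity. This is a minimal illustration, not the authors' implementation: the function name, feature shapes, and the use of NumPy are assumptions for the example.

```python
import numpy as np

def assign_labels(point_feats, proto_feats, proto_labels):
    """Label each 3D point with the class of its most similar prototype.

    point_feats : (num_points, d) features distilled into the 3D network
    proto_feats : (num_prototypes, d) 2D features of generated prototype images
    proto_labels: (num_prototypes,) class index of each prototype

    Features are L2-normalized so the dot product equals cosine similarity.
    """
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    q = proto_feats / np.linalg.norm(proto_feats, axis=1, keepdims=True)
    sim = p @ q.T                    # (num_points, num_prototypes)
    best = sim.argmax(axis=1)        # nearest prototype per point
    return proto_labels[best]

# Toy example: 4 prototypes over 2 classes, 3 query points near prototypes 0, 2, 3.
rng = np.random.default_rng(0)
protos = rng.normal(size=(4, 8))
labels = np.array([0, 0, 1, 1])
points = protos[[0, 2, 3]] + 0.01 * rng.normal(size=(3, 8))
print(assign_labels(points, protos, labels))
```

Because matching happens in image-feature space rather than CLIP's text space, adding a new class at inference time only requires generating prototype images for its name and extracting their features.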

Abstract

This paper presents a new method for the zero-shot open-vocabulary semantic segmentation (OVSS) of 3D automotive lidar data. To circumvent the recognized image-text modality gap that is intrinsic to approaches based on Vision Language Models (VLMs) such as CLIP, our method relies instead on image generation from text, to create prototype images. Given a 3D network distilled from a 2D Vision Foundation Model (VFM), we then label a point cloud by matching 3D point features with 2D image features of these prototypes. Our method is state-of-the-art for OVSS on nuScenes and SemanticKITTI. Code, pre-trained models, and generated images are available at https://github.com/valeoai/IGLOSS.