Semantic Segmentation of Textured Non-manifold 3D Meshes using Transformers

arXiv cs.CV / 4/3/2026


Key Points

  • The paper proposes a texture-aware transformer for semantic segmentation on textured non-manifold 3D meshes, addressing the difficulty of irregular mesh structure while leveraging texture information from raw face-associated pixels.
  • It introduces a hierarchical multi-scale feature aggregation scheme that combines a texture branch (pixel aggregation into a learnable token) with geometric descriptors processed through Two-Stage Transformer Blocks to balance local and global context.
  • Experiments on the Semantic Urban Meshes (SUM) benchmark show strong results (81.9% mF1, 94.3% OA), with additional evaluation on a newly curated cultural-heritage roof-tile dataset (49.7% mF1, 72.8% OA).
  • The method significantly outperforms existing approaches, indicating that jointly modeling texture and geometry in transformer architectures can improve per-face semantic and damage-type predictions for complex meshes.

Abstract

Textured 3D meshes jointly represent geometry, topology, and appearance, yet their irregular structure poses significant challenges for deep-learning-based semantic segmentation. While a few recent methods operate directly on meshes without imposing geometric constraints, they typically overlook the rich textural information also provided by such meshes. We introduce a texture-aware transformer that learns directly from raw pixels associated with each mesh face, coupled with a new hierarchical learning scheme for multi-scale feature aggregation. A texture branch summarizes all face-level pixels into a learnable token, which is fused with geometrical descriptors and processed by a stack of Two-Stage Transformer Blocks (TSTB), which allow for both a local and a global information flow. We evaluate our model on the Semantic Urban Meshes (SUM) benchmark and a newly curated cultural-heritage dataset comprising textured roof tiles with triangle-level annotations for damage types. Our method achieves 81.9% mF1 and 94.3% OA on SUM and 49.7% mF1 and 72.8% OA on the new dataset, substantially outperforming existing approaches.
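The texture branch described above — pooling a variable number of face-associated pixels into a single learnable token, then fusing it with per-face geometric descriptors — can be sketched roughly as attention pooling with a learned query. This is a minimal NumPy illustration under assumed dimensions; the function names, the dot-product pooling, and all sizes are illustrative choices, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def face_texture_token(pixels, query, w_key):
    """Attention-pool a variable number of face pixels into one token.

    pixels: (n_pix, c) raw per-face pixel values (e.g. RGB)
    query:  (d,)       learnable texture token (random here for illustration)
    w_key:  (c, d)     projection from pixel space to token space
    """
    keys = pixels @ w_key                                # (n_pix, d)
    attn = softmax(keys @ query / np.sqrt(len(query)))   # (n_pix,) weights
    return attn @ keys                                   # (d,) pooled token

rng = np.random.default_rng(0)
c, d, g = 3, 8, 5                 # pixel channels, token dim, geometry dim
query = rng.normal(size=d)
w_key = rng.normal(size=(c, d))

# Faces can carry different pixel counts, so pooling must be length-agnostic.
faces_pixels = [rng.random((7, c)), rng.random((12, c))]
geom = rng.normal(size=(2, g))    # hypothetical per-face geometric descriptors

tokens = np.stack([face_texture_token(p, query, w_key) for p in faces_pixels])
fused = np.concatenate([tokens, geom], axis=1)  # (2, d + g), input to the
print(fused.shape)                              # transformer stack; -> (2, 13)
```

The point of the sketch is that pooling makes every face contribute a fixed-size token regardless of how many texture pixels it owns, so the fused per-face features can be batched into a transformer such as the TSTB stack.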