ShapeY: A Principled Framework for Measuring Shape Recognition Capacity via Nearest-Neighbor Matching

arXiv cs.CV / 4/29/2026

📰 News · Models & Research

Key Points

  • The paper introduces ShapeY, a benchmarking framework to measure how well object recognition systems use shape information rather than non-shape cues like texture or background.
  • ShapeY includes 68,200 grayscale images of 200 3D objects across multiple viewpoints, with optional appearance (non-shape) changes, and evaluates embeddings using a nearest-neighbor matching task.
  • The benchmark probes whether views cluster according to 3D shape similarity despite viewpoint variation and other appearance changes, producing multiple quantitative and qualitative readouts (e.g., error-rate graphs and matching-score histograms).
  • Experiments on 321 pre-trained networks show that even state-of-the-art models struggle with robust shape-based generalization and occasionally make egregiously incorrect matches between clearly different shapes.
  • Overall, ShapeY provides a principled way to drive artificial vision toward more human-like shape recognition by emphasizing disentangled and viewpoint/appearance-invariant representations.
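The nearest-neighbor matching idea behind these points can be illustrated with a minimal sketch: embed every view, find each view's nearest neighbor in embedding space (excluding itself), and count how often that neighbor belongs to a different object. This is a simplified stand-in, not ShapeY's actual protocol — the function name, cosine-similarity metric, and toy two-object data below are all assumptions for illustration.

```python
import numpy as np

def top1_shape_match_error(embeddings, object_ids):
    """Fraction of views whose nearest neighbor (cosine similarity,
    self excluded) belongs to a different object.

    A toy proxy for a shape-matching error rate; ShapeY's real
    benchmark additionally controls viewpoint and appearance nuisances.
    """
    # Normalize rows to unit length so dot products are cosine similarities.
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = X @ X.T
    np.fill_diagonal(sims, -np.inf)  # never match a view to itself
    nearest = np.argmax(sims, axis=1)
    return float(np.mean(object_ids[nearest] != object_ids))

# Toy check: two tight clusters standing in for two objects' view embeddings.
rng = np.random.default_rng(0)
views_a = rng.normal(loc=(5.0, 0.0), scale=0.1, size=(10, 2))
views_b = rng.normal(loc=(0.0, 5.0), scale=0.1, size=(10, 2))
emb = np.vstack([views_a, views_b])
ids = np.array([0] * 10 + [1] * 10)
error = top1_shape_match_error(emb, ids)
```

With well-separated clusters the error is 0; a network that relies on non-shape cues would intermingle views of different objects and drive this error up.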

Abstract

Object recognition (OR) in humans relies heavily on shape cues and the ability to recognize objects across varying 3D viewpoints. Unlike humans, deep networks often rely on non-shape cues such as texture and background, leading to vulnerabilities in generalization and robustness. To address this gap, we introduce ShapeY, a novel and principled benchmarking framework designed to evaluate shape-based recognition capability in OR systems. ShapeY comprises 68,200 grayscale images of 200 3D objects rendered from multiple viewpoints and optionally subjected to non-shape "appearance" changes. Using a nearest-neighbor matching task, ShapeY specifically probes the fine-grained structure of an OR system's embedding space by evaluating whether object views are clustered by 3D shape similarity across varying 3D viewpoints and other non-shape changes. ShapeY provides a suite of quantitative and qualitative performance readouts, including error rate graphs, viewpoint tuning curves, histograms of positive and negative matching scores, and grids showing ordered best matches, which together offer a comprehensive evaluation of an OR system's shape understanding capability. Testing of 321 pre-trained networks with diverse architectures reveals significant challenges in achieving robust shape-based recognition: even state-of-the-art models struggle to generalize consistently across 3D viewpoint and appearance changes, and are prone to infrequent but egregious matches of objects of obviously completely different shape. ShapeY establishes a principled framework for advancing artificial vision systems toward human-like shape recognition capabilities, emphasizing the importance of disentangled and invariant object encodings.
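One of the abstract's readouts, the histogram of positive and negative matching scores, can be sketched as follows: for each view, record its best similarity to another view of the same object (positive) and its best similarity to a view of any other object (negative); overlap between the two distributions signals potential shape confusions. This is an illustrative sketch under assumed conventions (cosine similarity, hypothetical function name and toy data), not ShapeY's exact computation.

```python
import numpy as np

def matching_score_split(embeddings, object_ids):
    """For each view, return (best same-object similarity,
    best different-object similarity).

    Plotting histograms of the two returned arrays gives a
    positive/negative matching-score readout: the more the
    distributions overlap, the more shape confusions to expect.
    """
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = X @ X.T
    np.fill_diagonal(sims, -np.inf)  # exclude self-matches
    same = object_ids[:, None] == object_ids[None, :]
    pos = np.where(same, sims, -np.inf).max(axis=1)   # best same-object match
    neg = np.where(~same, sims, -np.inf).max(axis=1)  # best other-object match
    return pos, neg

# Toy data: two clusters standing in for two objects' view embeddings.
rng = np.random.default_rng(1)
views_a = rng.normal(loc=(5.0, 0.0), scale=0.1, size=(10, 2))
views_b = rng.normal(loc=(0.0, 5.0), scale=0.1, size=(10, 2))
emb = np.vstack([views_a, views_b])
ids = np.array([0] * 10 + [1] * 10)
pos, neg = matching_score_split(emb, ids)
```

For an embedding that clusters views by shape, every positive score exceeds every negative score; the "infrequent but egregious" matches the abstract describes would appear as negative scores leaking above the positive distribution.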