Counting Without Numbers & Finding Without Words

arXiv cs.CL / March 26, 2026


Key Points

  • The paper argues that current shelter reunification systems fail because they rely largely on visual appearance, even though animals often recognize each other acoustically through identity sounds.
  • It proposes the first multimodal reunification system that combines visual matching with acoustic biometrics to better detect pairs across stress-related appearance changes.
  • The described model is species-adaptive, handling a wide acoustic range from low-frequency elephant rumbles (around 10 Hz) to higher-frequency puppy whines (up to 4 kHz).
  • The approach is framed as being grounded in decades of cognitive science about approximate quantity perception and identity communication via sound, using probabilistic matching for robustness.
  • The authors position the work as an example of biology-grounded AI that could improve outcomes for vulnerable populations that cannot communicate with human language.

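The probabilistic matching described in the key points is not specified in detail, but its core idea, fusing a visual similarity score with an acoustic one so that a stress-degraded appearance match can be rescued by a strong voice match, can be sketched roughly. The function name, weights, and logistic fusion below are illustrative assumptions, not the paper's actual method:

```python
import math

def match_probability(visual_sim, acoustic_sim, w_visual=0.5, w_acoustic=0.5):
    """Hypothetical fusion of visual and acoustic similarity scores
    (each in [0, 1]) into one match probability.

    Each score is converted to log-odds, weighted, summed, and mapped
    back through a sigmoid, so a weak visual match can still yield a
    confident overall match when the acoustic evidence is strong.
    """
    def logit(p, eps=1e-6):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        return math.log(p / (1 - p))

    z = w_visual * logit(visual_sim) + w_acoustic * logit(acoustic_sim)
    return 1 / (1 + math.exp(-z))
```

With equal weights, `match_probability(0.3, 0.95)` exceeds `match_probability(0.3, 0.3)`, which captures the claimed robustness to appearance changes: the acoustic channel compensates when the visual channel degrades.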
Abstract

Every year, 10 million pets enter shelters, separated from their families. Despite desperate searches by both guardians and lost animals, 70% never reunite, not because matches do not exist, but because current systems look only at appearance, while animals recognize each other through sound. We ask: why does computer vision treat vocalizing species as silent visual objects? Drawing on five decades of cognitive science showing that animals perceive quantity approximately and communicate identity acoustically, we present the first multimodal reunification system integrating visual and acoustic biometrics. Our species-adaptive architecture processes vocalizations from 10 Hz elephant rumbles to 4 kHz puppy whines, paired with probabilistic visual matching that tolerates stress-induced appearance changes. This work demonstrates that AI grounded in biological communication principles can serve vulnerable populations that lack human language.
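A species-adaptive front-end spanning 10 Hz rumbles to 4 kHz whines implies, at minimum, per-species analysis bands and sample rates. The band table and helper below are an illustrative sketch of that idea only; the band edges and the `nyquist_margin` parameter are assumptions, not values from the paper:

```python
# Hypothetical per-species analysis bands (Hz) for the acoustic front-end.
# Elephant rumbles sit near 10 Hz (infrasound); puppy whines reach ~4 kHz.
SPECIES_BANDS_HZ = {
    "elephant": (5, 250),
    "dog": (100, 4000),
}

def analysis_params(species, nyquist_margin=2.5):
    """Return (low_hz, high_hz, sample_rate) for a species.

    The sample rate is chosen comfortably above the Nyquist limit
    (2 x the band's upper edge) so the highest vocal frequencies
    are captured without aliasing.
    """
    low, high = SPECIES_BANDS_HZ[species]
    sample_rate = int(high * 2 * nyquist_margin)
    return low, high, sample_rate
```

The point of the sketch is the design choice it encodes: a single fixed spectrogram configuration cannot serve both ends of this range, so the front-end must select its band and sampling parameters per species before feature extraction.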