findsylls: A Language-Agnostic Toolkit for Syllable-Level Speech Tokenization and Embedding

arXiv cs.AI / 3/30/2026



Key Points

findsylls is introduced as a modular, language-agnostic toolkit that standardizes syllable segmentation by unifying classical syllable detectors with end-to-end syllabifiers under a common interface.
The framework supports syllable embedding extraction and multi-granular evaluation, enabling controlled comparisons of token rates, representations, and algorithms.
It implements and standardizes existing methods such as Sylber and VG-HuBERT while allowing components to be recombined for reproducible experimentation.
The paper demonstrates the toolkit on English and Spanish corpora and extends it to Kono, an underdocumented Central Mande language, using newly hand-annotated data.
By providing a single pipeline for both high-resource and under-resourced languages, findsylls aims to reduce fragmentation in syllabification research and improve cross-study comparability.
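
To make the "common interface" idea concrete, here is a minimal sketch of how two very different segmenters could sit behind one method signature so that their token rates can be compared on the same input. This is not the findsylls API: `Segment`, `Segmenter`, `FixedRateSegmenter`, `EnergyValleySegmenter`, and `token_rate` are hypothetical names, and the valley-picking rule is a deliberately simplified stand-in for a classical envelope-based syllable detector.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Segment:
    start: float  # seconds
    end: float    # seconds


class Segmenter(Protocol):
    """Hypothetical shared interface: amplitude envelope in, segments out."""

    def segment(self, envelope: Sequence[float], frame_rate: float) -> list[Segment]: ...


def _segments_from_frames(boundaries: list[int], frame_rate: float) -> list[Segment]:
    # Convert consecutive frame-index boundaries into time-stamped segments.
    return [
        Segment(a / frame_rate, b / frame_rate)
        for a, b in zip(boundaries, boundaries[1:])
    ]


class FixedRateSegmenter:
    """Baseline: cut the signal into equal-length chunks."""

    def __init__(self, seconds_per_unit: float = 0.2) -> None:
        self.seconds_per_unit = seconds_per_unit

    def segment(self, envelope: Sequence[float], frame_rate: float) -> list[Segment]:
        n = len(envelope)
        step = max(1, round(self.seconds_per_unit * frame_rate))
        boundaries = list(range(0, n, step)) + [n]
        return _segments_from_frames(boundaries, frame_rate)


class EnergyValleySegmenter:
    """Simplified classical detector: boundaries at low-energy local minima."""

    def __init__(self, threshold: float = 0.3) -> None:
        self.threshold = threshold

    def segment(self, envelope: Sequence[float], frame_rate: float) -> list[Segment]:
        n = len(envelope)
        valleys = [
            i
            for i in range(1, n - 1)
            if envelope[i] < self.threshold
            and envelope[i] <= envelope[i - 1]
            and envelope[i] < envelope[i + 1]
        ]
        return _segments_from_frames([0] + valleys + [n], frame_rate)


def token_rate(segments: list[Segment], duration: float) -> float:
    """Tokens (segments) per second of audio."""
    return len(segments) / duration


# Toy envelope sampled at 10 frames per second (illustrative values only).
envelope = [0.1, 0.8, 0.9, 0.2, 0.7, 1.0, 0.6, 0.1, 0.5, 0.9, 0.3]
frame_rate = 10.0
duration = len(envelope) / frame_rate

methods: dict[str, Segmenter] = {
    "fixed": FixedRateSegmenter(seconds_per_unit=0.5),
    "valley": EnergyValleySegmenter(threshold=0.3),
}
rates = {
    name: token_rate(m.segment(envelope, frame_rate), duration)
    for name, m in methods.items()
}
```

Because both classes satisfy the same protocol, swapping one for the other changes only the dictionary entry, which is the kind of controlled comparison the key points describe.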

Abstract

Syllable-level units offer compact and linguistically meaningful representations for spoken language modeling and unsupervised word discovery, but research on syllabification remains fragmented across disparate implementations, datasets, and evaluation protocols. We introduce findsylls, a modular, language-agnostic toolkit that unifies classical syllable detectors and end-to-end syllabifiers under a common interface for syllable segmentation, embedding extraction, and multi-granular evaluation. The toolkit implements and standardizes widely used methods (e.g., Sylber, VG-HuBERT) and allows their components to be recombined, enabling controlled comparisons of representations, algorithms, and token rates. We demonstrate findsylls on English and Spanish corpora and on new hand-annotated data from Kono, an underdocumented Central Mande language, illustrating how a single framework can support reproducible syllable-level experiments across both high-resource and under-resourced settings.
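
The abstract does not spell out the evaluation protocol, but a common ingredient of segmentation evaluation is boundary precision, recall, and F1 under a time tolerance. The sketch below shows that general technique under stated assumptions: the function name `boundary_scores`, the 50 ms default tolerance, and the greedy one-to-one matching are illustrative choices, not details confirmed for findsylls.

```python
def boundary_scores(
    reference: list[float],
    hypothesis: list[float],
    tolerance: float = 0.05,
) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for boundary times given in seconds.

    A hypothesized boundary counts as a hit if it lies within `tolerance`
    of a reference boundary; each reference boundary is matched at most once.
    """
    ref = sorted(reference)
    hyp = sorted(hypothesis)
    matched = 0
    i = j = 0
    # Two-pointer sweep over both sorted lists.
    while i < len(ref) and j < len(hyp):
        if abs(ref[i] - hyp[j]) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif hyp[j] < ref[i]:
            j += 1  # spurious hypothesis boundary
        else:
            i += 1  # missed reference boundary
    precision = matched / len(hyp) if hyp else 0.0
    recall = matched / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: three annotated boundaries, four predicted ones.
reference = [0.30, 0.70, 1.10]
hypothesis = [0.32, 0.50, 0.74, 1.30]
precision, recall, f1 = boundary_scores(reference, hypothesis)
```

In this toy case two of the four predicted boundaries fall within 50 ms of an annotated one, so precision is 2/4 and recall is 2/3. Running the same function over outputs from different segmenters, or with different tolerances, gives scores that can be compared directly.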
