Training Data Size Sensitivity in Unsupervised Rhyme Recognition

arXiv cs.CL / 4/10/2026


Key Points

  • The paper studies how sensitive unsupervised rhyme recognition performance is to the amount of training data, using RhymeTagger as a language-independent tool based on repeating rhyme patterns in poetry corpora.
  • It evaluates RhymeTagger across seven languages and analyzes how both training size and cross-language differences affect classification accuracy.
  • To establish a realistic benchmark, the authors measure inter-annotator agreement on a manually annotated poem subset and identify causes of expert disagreement, including phonetic similarity and the positional distance between rhyming words.
  • The study compares RhymeTagger against three large language models in a one-shot setup, finding that LLMs without strong phonetic representation struggle, while RhymeTagger can outperform human agreement once training data is sufficient.
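Inter-annotator agreement of the kind measured above is usually reported with a chance-corrected statistic. The summary does not say which coefficient the authors use, so as a hedged illustration, here is a minimal Cohen's kappa over hypothetical binary rhyme judgments (1 = the two annotators' pair rhymes, 0 = it does not):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' label sequences of equal length."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    # Observed agreement: fraction of items both annotators labeled identically
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement from each annotator's marginal label frequencies
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical judgments on eight word pairs (not data from the paper)
ann1 = [1, 1, 0, 1, 0, 0, 1, 0]
ann2 = [1, 0, 0, 1, 0, 1, 1, 0]
print(cohens_kappa(ann1, ann2))  # 0.5: moderate agreement beyond chance
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, which is why it sets a more honest human baseline than raw percent agreement.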

Abstract

Rhyme is deceptively intuitive: what is or is not a rhyme is constructed historically, scholars struggle with rhyme classification, and people disagree on whether two words rhyme. This complicates automated rhyme recognition and evaluation, especially in a multilingual context. This article investigates how much training data is needed for reliable unsupervised rhyme recognition using RhymeTagger, a language-independent tool that identifies rhymes based on repeating patterns in poetry corpora. We evaluate its performance across seven languages (Czech, German, English, French, Italian, Russian, and Slovene), examining how training size and language differences affect accuracy. To set a realistic performance benchmark, we assess inter-annotator agreement on a manually annotated subset of poems and analyze factors contributing to disagreement in expert annotations: phonetic similarity between rhyming words and their distance from each other in a poem. We also compare RhymeTagger to three large language models using a one-shot learning strategy. Our findings show that, once provided with sufficient training data, RhymeTagger consistently outperforms human agreement, while LLMs lacking phonetic representation struggle significantly with the task.
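To make the "repeating patterns" idea concrete: tools like RhymeTagger group line-final words that recur with similar endings across a corpus, bootstrapping rhyme classes without supervision. The sketch below is a deliberately crude toy, not RhymeTagger's actual algorithm: it assigns the same scheme letter to lines whose final words share an orthographic suffix, a rough stand-in for the learned phonetic similarity the paper relies on:

```python
def rhyme_scheme(stanza, suffix_len=2):
    """Toy rhyme-scheme labeler: lines whose final words end in the same
    suffix_len characters receive the same scheme letter (A, B, C, ...).
    Orthographic suffixes are a crude proxy for phonetic similarity."""
    endings = {}
    scheme = []
    for line in stanza:
        # Take the line-final word, dropping trailing punctuation
        word = line.strip().split()[-1].lower().strip(".,;:!?")
        key = word[-suffix_len:]
        if key not in endings:
            endings[key] = chr(ord("A") + len(endings))
        scheme.append(endings[key])
    return "".join(scheme)

# Hypothetical four-line stanza with an alternating rhyme
stanza = [
    "We wandered through the night",
    "And waited for the day",
    "Beneath a fading light",
    "Until it slipped away",
]
print(rhyme_scheme(stanza))  # ABAB
```

The toy also hints at why the task is hard: suffix identity misses eye rhymes, slant rhymes, and cross-language orthography, which is exactly where trained models and human annotators start to diverge.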