In the LLM era, Word Sense Induction remains unsolved
arXiv cs.CL / 3/13/2026
Key Points
- The paper argues that word sense induction (WSI) remains unsolved in the absence of sense-annotated data, and presents a SemCor-derived evaluation designed to respect each lemma's polysemy and sense-frequency distribution (see the sampling sketch after this list).
- The authors benchmark pre-trained embeddings combined with clustering algorithms across parts of speech, and propose an LLM-based WSI method for English, alongside data-augmentation and Wiktionary-based semi-supervised setups.
- A key finding is that no unsupervised method outperforms the simple "one cluster per lemma" heuristic (a baseline sketched below); results vary across parts of speech, and LLMs show limited effectiveness on the task.
- Despite these challenges, data augmentation (including Wiktionary-based semi-supervision) improves performance, and the proposed method surpasses the previous state of the art by about 3.3% on their test set, underscoring the need for stronger lexical-semantic modeling in LLMs.
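
To make the evaluation design concrete, here is a hedged sketch (using NLTK's SemCor reader) of how one might collect per-lemma sense inventories and their empirical sense-frequency distributions from SemCor. The filtering threshold and the printed output are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: per-lemma sense counts from SemCor, keeping polysemous lemmas.
# Requires: nltk, plus nltk.download('semcor') and nltk.download('wordnet').
from collections import Counter, defaultdict

from nltk.corpus import semcor
from nltk.corpus.reader.wordnet import Lemma
from nltk.tree import Tree

# Count sense occurrences per lemma across SemCor's sense-tagged chunks
# (slow: this streams the whole corpus).
sense_counts = defaultdict(Counter)
for chunk in semcor.tagged_chunks(tag="sem"):
    # Sense-tagged chunks are Trees labeled with a WordNet Lemma;
    # unresolvable tags come through as plain strings and are skipped.
    if isinstance(chunk, Tree) and isinstance(chunk.label(), Lemma):
        lemma = chunk.label()
        sense_counts[lemma.name()][lemma.synset().name()] += 1

# Keep genuinely polysemous lemmas (>= 2 attested senses) and report the
# empirical sense distribution a frequency-respecting test set would sample.
for word, counts in sorted(sense_counts.items()):
    if len(counts) >= 2:
        total = sum(counts.values())
        dist = {s: round(n / total, 2) for s, n in counts.most_common()}
        print(word, dist)
        break  # print one example lemma; drop this to list them all
```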
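Likewise, a minimal, self-contained sketch of the unsupervised pipeline being benchmarked: cluster contextual embeddings of a lemma's occurrences, then score both the induced clustering and the trivial one-cluster-per-lemma baseline with a SemEval-style paired F1. The random stand-in embeddings, the 24/4/2 sense skew, and the metric choice are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch: WSI as clustering of contextual embeddings vs. the
# "one cluster per lemma" baseline, scored with pairwise F1.
from itertools import combinations

import numpy as np
from sklearn.cluster import KMeans


def paired_f1(gold, pred):
    """Pairwise F1 over instance pairs: a pair counts as positive when
    both occurrences carry the same label (SemEval-style WSI scoring)."""
    idx_pairs = list(combinations(range(len(gold)), 2))
    gold_pos = {p for p in idx_pairs if gold[p[0]] == gold[p[1]]}
    pred_pos = {p for p in idx_pairs if pred[p[0]] == pred[p[1]]}
    if not gold_pos or not pred_pos:
        return 0.0
    tp = len(gold_pos & pred_pos)
    prec, rec = tp / len(pred_pos), tp / len(gold_pos)
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)


rng = np.random.default_rng(0)

# Hypothetical lemma with a Zipf-like sense skew: 24/4/2 occurrences.
gold = np.repeat([0, 1, 2], [24, 4, 2])

# Stand-in 768-d contextual embeddings, weakly separated by sense;
# a real run would encode each occurrence with a pre-trained model.
emb = rng.normal(size=(len(gold), 768)) + 1.5 * gold[:, None]

induced = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(emb)
baseline = np.zeros(len(gold), dtype=int)  # one cluster per lemma

print("induced clustering paired F1:", round(paired_f1(gold, induced), 3))
print("one-cluster baseline paired F1:", round(paired_f1(gold, baseline), 3))
```

With a skewed sense distribution, the one-cluster baseline gets pairwise recall of 1 and precision equal to the fraction of same-sense pairs, which is high when one sense dominates; that is one reason the heuristic is hard to beat on frequency-respecting test sets.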