Learning the Cue or Learning the Word? Analyzing Generalization in Metaphor Detection for Verbs

arXiv cs.CL / 4/16/2026

Key Points

  • The paper investigates whether state-of-the-art metaphor detection models generalize via transferable context patterns or rely on lexical memorization of verbs.
  • Using RoBERTa as a common backbone and the VU Amsterdam Metaphor Corpus, the authors run a lexical hold-out experiment that strictly excludes all instances of selected target verb lemmas from fine-tuning and compares performance on exposed vs. held-out verbs.
  • Results show the model scores highest on exposed lemmas but still performs robustly on held-out lemmas, indicating meaningful generalization beyond seen words.
  • Additional analysis finds that sentence context alone is sufficient to match full-model performance on held-out lemmas, whereas static verb-level embeddings are not.
  • The findings support a “learning the cue” view as the primary driver of generalization, with “learning the word” acting as an additive benefit when lexical exposure is present.
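
The lexical hold-out setup described in the key points can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `Instance` type, the `lexical_holdout_split` helper, and the toy sentences are all hypothetical, and the sketch assumes each annotated instance carries its target verb lemma.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Instance(NamedTuple):
    """Hypothetical record for one annotated verb occurrence."""
    sentence: str
    verb_lemma: str
    is_metaphor: bool


def lexical_holdout_split(
    train_pool: Iterable[Instance],
    test_pool: Iterable[Instance],
    held_out_lemmas: Iterable[str],
) -> Tuple[List[Instance], List[Instance], List[Instance]]:
    """Return (fine_tune, exposed_test, held_out_test).

    Every instance of a held-out lemma is removed from fine-tuning.
    Test instances are then grouped by whether their lemma was seen
    during fine-tuning (Exposed) or deliberately excluded (Held-out).
    Test lemmas that are neither held out nor seen are dropped here;
    that choice is an assumption of this sketch.
    """
    held = set(held_out_lemmas)
    fine_tune = [x for x in train_pool if x.verb_lemma not in held]
    seen = {x.verb_lemma for x in fine_tune}
    test_list = list(test_pool)
    exposed_test = [x for x in test_list if x.verb_lemma in seen]
    held_out_test = [x for x in test_list if x.verb_lemma in held]
    return fine_tune, exposed_test, held_out_test


# Toy data, invented purely for illustration.
train_pool = [
    Instance("He ran the company.", "run", True),
    Instance("She ran home.", "run", False),
    Instance("They grasped the idea.", "grasp", True),
    Instance("He devoured the book.", "devour", True),
]
test_pool = [
    Instance("She runs a tight ship.", "run", True),
    Instance("He grasped the rope.", "grasp", False),
    Instance("Prices soared.", "soar", True),
]

fine_tune, exposed_test, held_out_test = lexical_holdout_split(
    train_pool, test_pool, held_out_lemmas={"grasp"}
)
```

Filtering on the lemma rather than the surface form is what keeps inflected variants such as "grasped" and "grasping" out of fine-tuning together.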

Abstract

Metaphor detection models achieve strong benchmark performance, yet it remains unclear whether this reflects transferable generalization or lexical memorization. To address this, we analyze generalization in metaphor detection through RoBERTa, the shared backbone of many state-of-the-art systems, focusing on English verbs using the VU Amsterdam Metaphor Corpus. We introduce a controlled lexical hold-out setup where all instances of selected target lemmas are strictly excluded from fine-tuning, and compare predictions on these Held-out lemmas against Exposed lemmas (verbs seen during fine-tuning). While the model performs best on Exposed lemmas, it maintains robust performance on Held-out lemmas. Further analysis reveals that sentence context alone is sufficient to match full-model performance on Held-out lemmas, whereas static verb-level embeddings are not. Together, these results suggest that generalization is primarily driven by "learning the cue" (transferable contextual patterns), while "learning the word" (verb-specific memorization) provides an additive boost when lexical exposure is available.
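
The Exposed versus Held-out comparison amounts to scoring the same predictions separately for each lemma group. The sketch below assumes F1 on the metaphorical class as the metric, which the abstract does not specify; the `f1_binary` and `f1_by_group` helpers and the toy records are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# One record per test instance: (verb lemma, gold label, predicted label).
Record = Tuple[str, bool, bool]


def f1_binary(gold: List[bool], pred: List[bool]) -> float:
    """F1 for the positive (metaphorical) class; 0.0 if there are no true positives."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_by_group(records: Iterable[Record], held_out: Set[str]) -> Dict[str, float]:
    """Score predictions separately for Exposed and Held-out lemmas."""
    groups: Dict[str, Tuple[List[bool], List[bool]]] = {
        "exposed": ([], []),
        "held_out": ([], []),
    }
    for lemma, gold, pred in records:
        key = "held_out" if lemma in held_out else "exposed"
        groups[key][0].append(gold)
        groups[key][1].append(pred)
    return {name: f1_binary(g, p) for name, (g, p) in groups.items()}


# Toy predictions, invented purely for illustration.
records = [
    ("run", True, True),
    ("run", False, False),
    ("attack", True, True),
    ("attack", True, False),
    ("grasp", True, True),
    ("grasp", False, True),
]
scores = f1_by_group(records, held_out={"grasp"})
```

A small gap between the two scores would be consistent with the "learning the cue" reading, while a large gap would point toward reliance on verb-specific memorization.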