Hybrid topic modelling for computational close reading: Mapping narrative themes in Pushkin's Evgenij Onegin
arXiv cs.CL / 3/23/2026
Key Points
- The paper proposes a hybrid topic modelling framework that combines Latent Dirichlet Allocation (LDA) with sparse Partial Least Squares Discriminant Analysis (sPLS-DA) to map themes and their dynamics in narrative poetry.
- It applies the method to Pushkin's Evgenij Onegin using a lemmatized Italian translation, yielding five stable topics from 35 document segments.
- To address the instability of topic models on small corpora, the approach uses a multi-seed consensus protocol and employs sPLS-DA as a supervised probe to identify the lexical markers that refine each theme.
- It introduces narrative hubs (groups of contiguous stanzas) to extend the bag-of-words representation to the narrative level. The resulting thematic maps are interpretable, align with the poem's emotional and structural arc, and offer a reusable computational close reading template for other dense texts.
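
The narrative-hub idea in the last point can be illustrated with a minimal Python sketch. It assumes hubs are built by chunking stanzas into fixed-size contiguous groups, which may differ from how the paper actually delimits them. The function names `make_hubs` and `hub_bag_of_words`, the `hub_size` parameter, and the toy stanzas are all hypothetical and are not taken from the paper or the poem.

```python
from collections import Counter


def make_hubs(stanzas, hub_size):
    """Group contiguous stanzas into hubs of at most `hub_size` stanzas.

    Assumption: fixed-size chunking. The paper may define hub
    boundaries differently (e.g. by chapter or narrative episode).
    """
    if hub_size < 1:
        raise ValueError("hub_size must be >= 1")
    return [stanzas[i:i + hub_size] for i in range(0, len(stanzas), hub_size)]


def hub_bag_of_words(hub):
    """Count lemma frequencies across all stanzas in one hub."""
    counts = Counter()
    for stanza in hub:
        # Stanzas are assumed to be whitespace-separated lemmas already.
        counts.update(stanza.lower().split())
    return counts


# Toy, invented lemmatized stanzas for illustration only.
stanzas = [
    "amore lettera notte",
    "lettera sogno notte",
    "duello neve morte",
    "neve viaggio",
    "ballo amore",
]

hubs = make_hubs(stanzas, hub_size=2)
bags = [hub_bag_of_words(hub) for hub in hubs]
```

Each entry of `bags` is then one document in the bag-of-words sense, so a topic model fitted on these counts sees word co-occurrence at the level of a narrative stretch rather than a single stanza.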