Detecting Basic Values in A Noisy Russian Social Media Text Data: A Multi-Stage Classification Framework
arXiv cs.CL / 3/20/2026
Key Points
- The study presents a multi-stage pipeline for detecting Schwartz's basic human values in noisy Russian social media text, combining spam filtering, targeted post selection, LLM-based annotation, and transformer-based multi-label classification.
- It treats expert annotations as interpretative benchmarks with uncertainty and aggregates multiple LLM judgments into soft labels to reflect varying levels of agreement.
- The best model, XLM-RoBERTa large, achieves an F1 macro of 0.83 and an F1 of 0.71 on held-out test data, demonstrating effective value detection and handling of annotation subjectivity.
- The work adds to the understanding of cultural variation in value expression on social platforms and publicly releases all models for further research.
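
The soft-label idea in the second key point can be illustrated with a minimal sketch. The summary does not say how the LLM judgments are aggregated, so the rule used here (the fraction of judgments that assign each value) is an assumption, and the value names, the `soft_labels` helper, and the sample judgments are all hypothetical.

```python
from collections import Counter

# Hypothetical subset of value labels, for illustration only.
VALUES = ["security", "benevolence", "achievement"]


def soft_labels(judgments, values=VALUES):
    """Aggregate several multi-label judgments of one post into soft labels.

    Each judgment is the set of values one annotator (here, one LLM run)
    assigned to the post. The soft label for a value is the fraction of
    judgments that include it. This averaging rule is an assumption, not
    necessarily the aggregation used in the paper.
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    # Count each value at most once per judgment.
    counts = Counter(v for judgment in judgments for v in set(judgment))
    return {v: counts[v] / len(judgments) for v in values}


# Four made-up judgments for a single post.
judgments = [
    {"security", "benevolence"},
    {"security"},
    {"security", "achievement"},
    {"benevolence", "security"},
]

labels = soft_labels(judgments)
print(labels)
```

Under this assumed rule, a value that every judgment assigns gets a soft label of 1.0, while a value assigned by only some judgments gets a proportionally lower target, so a multi-label classifier trained on these targets sees the level of agreement rather than a forced 0 or 1.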