Semi-Supervised Learning with Balanced Deep Representation Distributions
arXiv cs.LG / 3/24/2026
Key Points
- The paper addresses semi-supervised text classification (SSTC), where self-training depends heavily on the accuracy of the pseudo-labels assigned to unlabeled data.
- It identifies a "margin bias" problem caused by mismatched representation (feature) distributions across labels in SSTC.
- To reduce this bias, it introduces an angular margin loss and applies Gaussian linear transformations so that the variance of label angles is balanced across classes (see the first sketch after this list).
- The proposed method, S2TC-BDD, constrains label angle variances using estimates computed over both labeled and pseudo-labeled texts at each self-training iteration (see the second sketch after this list).
- Experiments in multi-class and multi-label settings show S2TC-BDD outperforming state-of-the-art SSTC methods, with the largest gains when labeled data is scarce.
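To make the angle-balancing idea concrete, here is a minimal PyTorch sketch of an ArcFace-style angular margin loss in which each label's angle distribution N(μ_k, σ_k²) is linearly rescaled to a shared pooled variance before the margin is applied. The function name, the margin `m`, the scale `s`, and the per-label statistics it consumes are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def balanced_angular_margin_loss(features, weights, labels, mu, sigma,
                                 m=0.2, s=16.0):
    """Angular margin loss over variance-balanced label angles (sketch).

    features : (B, D) document representations
    weights  : (K, D) per-label weight vectors
    labels   : (B,)   gold labels or pseudo-labels
    mu, sigma: (K,)   per-label angle mean / std estimates
    """
    # Angle between each document representation and each label direction.
    cos = F.normalize(features, dim=1) @ F.normalize(weights, dim=1).T
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))  # (B, K)

    # Gaussian linear transformation: map each label's angle distribution
    # N(mu_k, sigma_k^2) onto N(mu_k, sigma_bar^2), so that no label gets
    # a systematically wider margin than the others.
    sigma_bar = sigma.mean()
    theta = (theta - mu) * (sigma_bar / sigma) + mu

    # Additive angular margin on the target label, then scaled logits.
    margin = F.one_hot(labels, num_classes=weights.size(0)).float() * m
    logits = s * torch.cos(theta + margin)
    return F.cross_entropy(logits, labels)
```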
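And a sketch of how the per-label angle statistics could be maintained during self-training, pooling labeled and pseudo-labeled batches. The exponential moving average and its momentum are assumptions made here for illustration; the paper only specifies that the statistics are estimated over both labeled and pseudo-labeled texts at each iteration.

```python
import torch

@torch.no_grad()
def update_angle_stats(theta, labels, mu, sigma, momentum=0.9):
    """EMA update of per-label angle mean/std (sketch; EMA is an assumption).

    theta    : (B, K) angles from a batch of labeled and pseudo-labeled texts
    labels   : (B,)   gold labels or pseudo-labels
    mu, sigma: (K,)   running per-label statistics, updated in place
    """
    # Angle of each document to its own (pseudo-)label's direction.
    target = theta.gather(1, labels.unsqueeze(1)).squeeze(1)
    for k in labels.unique():
        vals = target[labels == k]
        mu[k] = momentum * mu[k] + (1 - momentum) * vals.mean()
        sigma[k] = momentum * sigma[k] + (1 - momentum) * vals.std(unbiased=False)
```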