Few-Shot Contrastive Adaptation for Audio Abuse Detection in Low-Resource Indic Languages
arXiv cs.CL / 4/13/2026
Key Points
- The paper studies abusive speech detection for multilingual social media voice interactions, focusing on low-resource Indic languages where typical ASR→text pipelines can fail due to transcription errors and loss of prosody.
- It evaluates Contrastive Language-Audio Pre-training (CLAP) representations for detecting abuse directly from audio, using the ADIMA dataset.
- Experiments include few-shot supervised contrastive adaptation with cross-lingual learning and a leave-one-language-out setup, alongside zero-shot prompting for comparison.
- Results show that CLAP yields strong cross-lingual audio representations across ten Indic languages, and that lightweight projection-only adaptation (the CLAP encoder stays frozen) can in some cases match fully supervised models trained on all available data.
- Gains from few-shot adaptation vary by language and do not increase monotonically with more labeled examples, indicating that cross-lingual transfer is incomplete and language-specific.
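The projection-only adaptation described above can be sketched as training a small projection head on top of frozen audio embeddings with a supervised contrastive (SupCon-style) loss, which pulls same-label clips together and pushes different-label clips apart. The sketch below is illustrative, not the paper's implementation: the embedding dimensions, the `supcon_loss` helper, and the random stand-in features (in place of real CLAP audio embeddings) are all assumptions.

```python
import numpy as np

def supcon_loss(features, labels, temperature=0.07):
    """Supervised contrastive loss on a batch of projected embeddings.

    features: (N, D) array of projections (will be L2-normalized here).
    labels:   (N,) integer class labels (e.g. 0 = non-abusive, 1 = abusive).
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = feats @ feats.T / temperature
    # Subtract the row max for numerical stability before exponentiating.
    logits = sim - sim.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    np.fill_diagonal(exp, 0.0)  # exclude self-similarity from the denominator
    log_prob = logits - np.log(exp.sum(axis=1, keepdims=True))

    # Positives are other samples in the batch that share the anchor's label.
    pos_mask = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(pos_mask, 0.0)
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0  # skip anchors with no positives in the batch
    mean_log_prob_pos = (pos_mask * log_prob).sum(axis=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()

# Stand-in for frozen CLAP audio embeddings of a few-shot batch
# (4 abusive + 4 non-abusive clips); real embeddings would come
# from a pretrained CLAP audio encoder.
rng = np.random.default_rng(0)
clap_embeddings = rng.normal(size=(8, 32))      # (batch, embed_dim)
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Projection-only adaptation: only this matrix W would be trained;
# the CLAP encoder producing `clap_embeddings` stays frozen.
W = rng.normal(size=(32, 16)) * 0.1
projected = clap_embeddings @ W

loss = supcon_loss(projected, labels)
print(f"SupCon loss: {loss:.4f}")
```

In a real training loop, `W` (or a small MLP head) would be updated by gradient descent on this loss over few-shot batches drawn from the held-out language, which is what keeps the adaptation lightweight relative to fine-tuning the full encoder.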