I collected Reddit posts between Jan 29 and Mar 1, 2026, using 40 keyword-based search terms ("AI safety", "AI alignment", "EU AI Act", "AI replace jobs", "red teaming LLM", etc.) across all subreddits. After filtering, I ended up with 6,374 posts and ran them through a full NLP pipeline.
What I built:
Sentence embeddings (paraphrase-multilingual-MiniLM-L12-v2) -> 10D UMAP -> HDBSCAN clustering
Manual cluster review using structured cluster cards
Sentiment analysis per post (RoBERTa classifier)
Discourse framing layer - human-first labeling with blind LLM comparison and human adjudication
The result: 23 interpretable clusters grouped into 11 thematic families.
Three things I found interesting:
1. The discourse is fragmented, not unified.
No single cluster dominates - the largest is ~10% of posts. "AI safety discourse" on Reddit looks more like a field of related but distinct conversations: labour anxiety, regulation, lab trust, authenticity & synthetic content, technical safety, enterprise adoption, philosophical debates about personhood. They don't talk to each other that much.
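To make the "largest cluster is ~10%" claim concrete: cluster shares fall straight out of the HDBSCAN label vector (toy labels here, not my real data):

```python
import numpy as np

# Toy HDBSCAN label vector; -1 marks noise points left unclustered
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, -1])

clustered = labels[labels != -1]
ids, counts = np.unique(clustered, return_counts=True)
shares = counts / len(labels)   # share of *all* posts, noise included
print(dict(zip(ids.tolist(), shares.round(2).tolist())))
# → {0: 0.3, 1: 0.2, 2: 0.4}
```

Whether you divide by all posts or only clustered posts changes the headline percentage, so it's worth stating which denominator you use.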
2. The most negative clusters are about lived disruption, not abstract risk.
Job replacement, synthetic content spam, broken trust in specific AI labs, AI misuse in schools, creative displacement - these are the most negatively toned clusters. Enterprise adoption and national AI progress clusters are neutral-to-positive. X-risk and alignment clusters are... mostly neutral, which surprised me.
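The per-post sentiment step looks roughly like this (a sketch with library defaults; the Twitter-tuned RoBERTa checkpoint below is just a common choice for short social-media text, swap in whichever one you used):

```python
from transformers import pipeline

# Assumed checkpoint - any RoBERTa sentiment classifier fits the same API
clf = pipeline("sentiment-analysis",
               model="cardiffnlp/twitter-roberta-base-sentiment-latest")

posts = [
    "My whole team just got replaced by an AI tool.",
    "Our national AI lab hit a new benchmark today.",
]
for post, result in zip(posts, clf(posts)):
    print(result["label"], round(result["score"], 2), "-", post)
```

Cluster-level tone is then just the distribution of these per-post labels within each HDBSCAN cluster.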
3. Framing matters as much as topic.
Two clusters can both be "about AI and work" while one is macro labour anxiety and another is micro hiring friction - different problems, different policy implications. Topic labels alone don't capture this.
Visualizations, full report (PDF), sample data, and code: https://github.com/kelukes/reddit-ai-safety-discourse-2026
Feedback on the pipeline (or anything else) is very welcome - this was a capstone project and I'm still learning.
