CNSocialDepress: A Chinese Social Media Dataset for Depression Risk Detection and Structured Analysis

arXiv cs.CL / 3/27/2026


Key Points

  • The paper introduces CNSocialDepress, a Chinese-language social media benchmark dataset aimed at depression risk detection and analysis.
  • The dataset includes 44,178 posts from 233 users, with psychological experts annotating 10,306 depression-related segments.
  • Unlike binary-only resources, CNSocialDepress provides both binary risk labels and structured, multidimensional psychological attributes for more interpretable, fine-grained signal analysis.
  • Experiments show the dataset supports multiple NLP tasks, including structured psychological profiling and fine-tuning large language models for depression detection.
  • The authors position CNSocialDepress as a practical resource for mental-health applications tailored to Chinese-speaking populations, addressing a gap in publicly available resources.
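
To make the difference between binary-only labels and structured attributes concrete, here is a minimal Python sketch of how one annotated record might be represented. Everything in it is an assumption for illustration: the class names, the field names, the attribute keys, and the choice to attach the binary label at the user level are hypothetical and are not taken from the dataset's released schema.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AnnotatedSegment:
    """One expert-annotated, depression-related text span (hypothetical layout)."""
    post_id: str
    text: str
    # Structured psychological attributes, stored as name -> value.
    # The keys are placeholders; the real attribute inventory is not given here.
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass
class UserRecord:
    """A user with a binary risk label plus annotated segments (hypothetical layout)."""
    user_id: str
    at_risk: bool  # binary label; user-level placement is an assumption
    segments: list[AnnotatedSegment] = field(default_factory=list)


def attribute_profile(record: UserRecord) -> Counter:
    """Count how often each attribute name appears across a user's segments."""
    profile = Counter()
    for segment in record.segments:
        profile.update(segment.attributes.keys())
    return profile


# Placeholder data, not drawn from the dataset.
example = UserRecord(
    user_id="u001",
    at_risk=True,
    segments=[
        AnnotatedSegment("p1", "placeholder text", {"dimension_a": "x"}),
        AnnotatedSegment("p2", "placeholder text", {"dimension_a": "y", "dimension_b": "z"}),
    ],
)
profile = attribute_profile(example)
```

A binary-only resource would stop at `at_risk`. The per-segment attributes are what allow a profile such as `profile` to be built and inspected.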

Abstract

Depression is a pressing global public health issue, yet publicly available Chinese-language resources for depression risk detection remain scarce and largely focus on binary classification. To address this limitation, we release CNSocialDepress, a benchmark dataset for depression risk detection on Chinese social media. The dataset contains 44,178 posts from 233 users; psychological experts annotated 10,306 depression-related segments. CNSocialDepress provides binary risk labels along with structured, multidimensional psychological attributes, enabling interpretable and fine-grained analyses of depressive signals. Experimental results demonstrate the dataset's utility across a range of NLP tasks, including structured psychological profiling and fine-tuning large language models for depression detection. Comprehensive evaluations highlight the dataset's effectiveness and practical value for depression risk identification and psychological analysis, thereby providing insights for mental health applications tailored to Chinese-speaking populations.
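
As a rough sense of scale, the counts reported in the abstract can be turned into per-user averages. The short Python sketch below uses only the three figures stated above. The variable names are illustrative, and the averages say nothing about how posts or segments are actually distributed across users.

```python
# Counts as reported in the abstract.
TOTAL_POSTS = 44_178
TOTAL_USERS = 233
ANNOTATED_SEGMENTS = 10_306

# Simple means; the real per-user distribution may be highly skewed.
posts_per_user = TOTAL_POSTS / TOTAL_USERS
segments_per_user = ANNOTATED_SEGMENTS / TOTAL_USERS

print(f"Average posts per user: {posts_per_user:.1f}")        # 189.6
print(f"Average segments per user: {segments_per_user:.1f}")  # 44.2
```

The segment count is not divided by the post count here, because the text does not say whether a post can contain more than one annotated segment.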
