Avoiding Over-smoothing in Social Media Rumor Detection with Pre-trained Propagation Tree Transformer

arXiv cs.CL / 3/25/2026


Key Points

  • The paper argues that common rumor-detection approaches using graph neural networks (GNNs) degrade due to over-smoothing, an effect closely tied to the structural properties of rumor propagation trees, in which most nodes are 1-level nodes (direct replies to the source post).
  • It also finds that GNN-based models have difficulty capturing long-range dependencies along reply propagation trees.
  • To address both issues, the authors propose P2T3, a pure Transformer-based method that extracts conversation chains from propagation trees and uses token-wise embeddings plus tailored inductive bias to encode connection structure.
  • P2T3 is pre-trained on large-scale unlabeled data and then evaluated against prior state-of-the-art methods, showing improved performance across multiple benchmarks and in few-shot settings.
  • The work suggests the approach can serve as a foundation for future large-model or unified multi-modal social-media rumor research.
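The over-smoothing claim in the first key point can be illustrated with a toy experiment (this is an assumption-laden sketch, not the paper's analysis): on a star-shaped tree, where one source post has many 1-level replies, repeated GNN-style neighbor averaging quickly drives every node's features toward the same value.

```python
# Illustrative sketch (not the paper's code): repeated neighbor-mean
# aggregation on a star-shaped propagation tree (one source post, many
# 1-level replies) collapses all node features -- i.e. over-smoothing.
import statistics

def mean_aggregate(features, neighbors):
    """One simplified GNN layer: each node averages itself with its neighbors."""
    return [
        statistics.mean([features[i]] + [features[j] for j in neighbors[i]])
        for i in range(len(features))
    ]

# Star graph: node 0 is the source post, nodes 1..5 are direct replies.
neighbors = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
feats = [5.0, 0.0, 1.0, 2.0, 3.0, 4.0]  # one scalar feature per node

for _ in range(8):  # stack 8 aggregation layers
    feats = mean_aggregate(feats, neighbors)

spread = max(feats) - min(feats)  # started at 5.0; now nearly zero
```

On this star topology the gap between any two nodes shrinks geometrically with depth, so after a handful of layers the node representations are nearly indistinguishable, which is exactly the degradation the paper attributes to 1-level-heavy propagation trees.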

Abstract

Deep learning techniques for rumor detection typically utilize Graph Neural Networks (GNNs) to analyze post relations. These methods, however, falter due to over-smoothing when processing rumor propagation structures, leading to declining performance. Our investigation reveals that over-smoothing is intrinsically tied to the structural characteristics of rumor propagation trees, in which the majority of nodes are 1-level nodes. Furthermore, GNNs struggle to capture long-range dependencies within these trees. To circumvent these challenges, we propose a Pre-Trained Propagation Tree Transformer (P2T3) method based on a pure Transformer architecture. It extracts all conversation chains from a tree structure following the propagation direction of replies, utilizes token-wise embedding to infuse connection information and introduce the necessary inductive bias, and pre-trains on large-scale unlabeled datasets. Experiments indicate that P2T3 surpasses previous state-of-the-art methods on multiple benchmark datasets and performs well under few-shot conditions. P2T3 not only avoids the over-smoothing issue inherent in GNNs but also potentially offers a large-model or unified multi-modal scheme for future social media research.
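The chain-extraction step described in the abstract can be sketched as a tree traversal (a minimal illustration under my own assumptions about the data layout; the paper's actual implementation and data structures may differ): each conversation chain is a root-to-leaf path, read in the direction replies propagate.

```python
# Illustrative sketch (not the authors' implementation): extract every
# conversation chain (root-to-leaf reply path) from a propagation tree,
# following the propagation direction of replies. The `children` mapping
# from post id to its direct replies is an assumed input format.
def extract_chains(children, node, path=()):
    """Return all root-to-leaf paths in the reply tree rooted at `node`."""
    path = path + (node,)
    if not children.get(node):          # leaf post: a complete chain ends here
        return [list(path)]
    chains = []
    for reply in children[node]:        # follow each reply branch downward
        chains.extend(extract_chains(children, reply, path))
    return chains

# Toy tree: source post "p0" with two reply branches.
children = {"p0": ["p1", "p2"], "p1": ["p3"], "p2": [], "p3": []}
chains = extract_chains(children, "p0")
# chains == [["p0", "p1", "p3"], ["p0", "p2"]]
```

Each extracted chain is then a linear token sequence, which is what lets a pure Transformer consume the tree without graph-style message passing; the connection structure is reintroduced through the token-wise embeddings the abstract mentions.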