Text Summarization With Graph Attention Networks

arXiv cs.CL / 4/7/2026


Key Points

  • The paper explores using graph information—specifically Rhetorical Structure Theory (RST) and coreference graphs—to improve text summarization performance over baseline models.
  • A Graph Attention Network approach for incorporating the graph information did not improve results, leading the authors to pivot to a simpler Multi-layer Perceptron (MLP) architecture that did improve performance on CNN/DM.
  • The authors also annotated the XSum dataset with RST graph information, creating a new benchmark intended to support future research on graph-based summarization.
  • The XSum graph-annotated dataset introduced notable challenges that highlight both strengths and limitations of their models and graph-based methods in general.
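The graph-attention approach the key points describe can be illustrated with a minimal sketch. This is not the paper's implementation; it is a single standard graph-attention layer (in the style of Veličković et al.) written in plain NumPy, where the node features, parameter shapes, and the toy chain graph standing in for an RST/coreference graph are all assumptions:

```python
import numpy as np

def gat_layer(h, adj, W, a, alpha=0.2):
    """One graph-attention layer over sentence nodes.

    h:   (n, d_in) node features (e.g., sentence embeddings)
    adj: (n, n) adjacency matrix of the graph (RST or coreference edges,
         with self-loops so every node has at least one neighbour)
    W:   (d_in, d_out) projection; a: (2*d_out,) attention parameters
    """
    z = h @ W                                   # project node features
    n = z.shape[0]
    # pairwise logits e_ij = LeakyReLU(a^T [z_i || z_j])
    e = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            s = a @ np.concatenate([z[i], z[j]])
            e[i, j] = s if s > 0 else alpha * s  # LeakyReLU
    e = np.where(adj > 0, e, -1e9)              # mask non-edges
    att = np.exp(e - e.max(axis=1, keepdims=True))
    att /= att.sum(axis=1, keepdims=True)       # softmax over neighbours
    return att @ z                              # attention-weighted aggregation

# Toy example: 4 sentence nodes on a chain graph with self-loops.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))
adj = np.eye(4) + np.diag(np.ones(3), 1)
W = rng.normal(size=(8, 8))
a = rng.normal(size=16)
out = gat_layer(h, adj, W, a)
print(out.shape)  # (4, 8)
```

Each output row is a mixture of the (projected) features of that sentence's graph neighbours, weighted by learned attention, which is how edge structure would be injected into the sentence representations.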

Abstract

This study aimed to leverage graph information, particularly Rhetorical Structure Theory (RST) and Co-reference (Coref) graphs, to enhance the performance of our baseline summarization models. Specifically, we experimented with a Graph Attention Network architecture to incorporate graph information. However, this architecture did not enhance the performance. Subsequently, we used a simple Multi-layer Perceptron architecture, which improved the results of our proposed model on our primary dataset, CNN/DM. Additionally, we annotated the XSum dataset with RST graph information, establishing a benchmark for future graph-based summarization models. This secondary dataset posed multiple challenges, revealing both the merits and limitations of our models.
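The abstract's simpler, better-performing alternative can also be sketched: concatenate each sentence embedding with graph-derived features and pass the result through a small MLP. The shapes, feature choices, and parameters below are illustrative assumptions, not the authors' configuration:

```python
import numpy as np

def mlp_fuse(sent_emb, graph_feat, W1, b1, W2, b2):
    """Fuse sentence embeddings with graph-derived feature vectors
    (e.g., pooled RST/coreference neighbourhood statistics) through a
    two-layer MLP, yielding enriched sentence representations."""
    x = np.concatenate([sent_emb, graph_feat], axis=1)  # (n, d_s + d_g)
    hidden = np.maximum(x @ W1 + b1, 0.0)               # ReLU hidden layer
    return hidden @ W2 + b2                             # (n, d_out)

# Toy example: 5 sentences with 16-dim embeddings and 4 graph features each.
rng = np.random.default_rng(1)
sent_emb = rng.normal(size=(5, 16))
graph_feat = rng.normal(size=(5, 4))
W1 = rng.normal(size=(20, 32)); b1 = np.zeros(32)
W2 = rng.normal(size=(32, 16)); b2 = np.zeros(16)
fused = mlp_fuse(sent_emb, graph_feat, W1, b1, W2, b2)
print(fused.shape)  # (5, 16)
```

Compared with the attention layer, this treats the graph as a fixed per-sentence feature vector rather than a message-passing structure, which is consistent with the paper's finding that the simpler fusion worked better on CNN/DM.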