SCURank: Ranking Multiple Candidate Summaries with Summary Content Units for Enhanced Summarization

arXiv cs.CL · April 22, 2026


Key Points

  • The paper introduces SCURank, a summarization framework that ranks multiple candidate summaries using Summary Content Units (SCUs) rather than unstable LLM-based comparisons or surface-level overlap metrics like ROUGE.
  • SCURank evaluates summaries by the richness and semantic importance of their information content, aiming to produce more reliable and higher-quality rankings.
  • The authors test SCURank in the context of distilling summaries from multiple diverse LLMs into small language models (SLMs) such as BART, and show improved results over traditional metrics and existing LLM-based ranking approaches.
  • Results indicate that using SCURank to incorporate diverse LLM-generated summaries can improve abstractiveness and overall performance of the distilled models.
  • The project includes publicly available code on GitHub, enabling others to reproduce and build on the framework.

Abstract

Small language models (SLMs), such as BART, can achieve summarization performance comparable to large language models (LLMs) via distillation. However, existing LLM-based ranking strategies for summary candidates suffer from instability, while classical metrics (e.g., ROUGE) are insufficient to rank high-quality summaries. To address these issues, we introduce **SCURank**, a framework that enhances summarization by leveraging **Summary Content Units (SCUs)**. Instead of relying on unstable comparisons or surface-level overlap, SCURank evaluates summaries based on the richness and semantic importance of information content. We investigate the effectiveness of SCURank in distilling summaries from multiple diverse LLMs. Experimental results demonstrate that SCURank outperforms traditional metrics and LLM-based ranking methods across evaluation measures and datasets. Furthermore, our findings show that incorporating diverse LLM summaries enhances model abstractiveness and overall distilled model performance, validating the benefits of information-centric ranking in multi-LLM distillation. The code for SCURank is available at https://github.com/IKMLab/SCURank.
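To make the idea concrete, here is a minimal sketch (not the authors' implementation) of how SCU-based candidate ranking can work: each SCU is a short content unit with an importance weight, a candidate summary is scored by the total weight of the SCUs it covers, and candidates are ranked by that score. The function and variable names, and the naive substring matcher standing in for a real semantic/NLI coverage check, are all illustrative assumptions.

```python
# Illustrative sketch of SCU-based ranking; not the SCURank codebase.
# Each SCU is a (text, weight) pair; weight encodes semantic importance.

def scu_score(summary, weighted_scus, covers=None):
    """Sum the weights of the SCUs that the summary covers."""
    if covers is None:
        # Placeholder matcher: a real system would use an NLI or
        # embedding model to decide whether a summary entails an SCU.
        covers = lambda summ, scu: scu.lower() in summ.lower()
    return sum(w for scu, w in weighted_scus if covers(summary, scu))

def rank_candidates(candidates, weighted_scus, covers=None):
    """Return candidate summaries sorted by descending SCU score."""
    return sorted(candidates,
                  key=lambda s: scu_score(s, weighted_scus, covers),
                  reverse=True)

if __name__ == "__main__":
    scus = [("profits rose 12%", 2.0),
            ("the CEO resigned", 1.5),
            ("shares fell", 1.0)]
    candidates = [
        "Shares fell after a quiet quarter.",
        "Profits rose 12% and the CEO resigned amid the shakeup.",
    ]
    best = rank_candidates(candidates, scus)[0]
    print(best)  # the candidate covering the heaviest SCUs ranks first
```

In a multi-LLM distillation setup, the top-ranked candidate per document would then serve as the training target for the SLM student.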