Learning to Control Summaries with Score Ranking

arXiv cs.CL / 4/21/2026

📰 News · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper targets a gap in multi-criteria summarization: controlling generated summaries along specific quality dimensions, rather than only optimizing all dimensions jointly.
  • It introduces a loss function that matches model outputs to fine-grained, model-based evaluation scores (such as FineSurE), explicitly accounting for trade-offs like conciseness vs. completeness.
  • Experiments on three pretrained models (LLaMA, Qwen, and Mistral) show overall summary quality comparable to state-of-the-art methods.
  • The key differentiator is that the approach provides strong, dimension-specific controllability, allowing users to selectively prioritize one criterion over others.

Abstract

Recent advances in summarization research focus on improving summary quality across multiple criteria, such as completeness, conciseness, and faithfulness, by jointly optimizing these dimensions. However, these efforts largely overlook the challenge of controlling summary generation with respect to individual criteria, especially in the presence of their inherent trade-offs. For example, enhancing conciseness can compromise completeness, and vice versa. In this work, we address this gap by proposing a loss function that aligns model outputs with fine-grained, model-based evaluation scores (e.g., from FineSurE), enabling both improvement in summary quality and dimension-specific control. Our approach improves the overall quality of summaries while maintaining the ability to selectively prioritize one criterion over others. Experiments on three pretrained models (LLaMA, Qwen, and Mistral) demonstrate that our method achieves performance comparable to state-of-the-art summarizers, while uniquely offering strong controllability over individual quality dimensions.
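The paper does not spell out its loss in this summary, but the title's "score ranking" idea can be illustrated with a simple sketch: collapse per-dimension evaluator scores (FineSurE-style faithfulness, completeness, conciseness) into a weighted target, then apply a pairwise ranking loss that pushes the model to score candidate summaries in the same order as the evaluator. Everything below is an assumption for illustration, not the paper's actual formulation.

```python
# Illustrative sketch only -- the function names, the hinge-margin
# formulation, and the weighting scheme are assumptions, not the
# paper's method.

def weighted_quality(dim_scores, dim_weights):
    """Collapse per-dimension evaluator scores into one target score
    using user-chosen weights that express which criterion to
    prioritize (the dimension-specific control knob)."""
    return sum(dim_weights[d] * s for d, s in dim_scores.items())

def ranking_loss(model_scores, target_scores, margin=0.1):
    """Hinge-style pairwise loss: whenever the evaluator prefers
    candidate i over candidate j, the model's score for i should
    exceed j's score by at least `margin`."""
    loss, pairs = 0.0, 0
    n = len(model_scores)
    for i in range(n):
        for j in range(n):
            if target_scores[i] > target_scores[j]:
                loss += max(0.0, margin - (model_scores[i] - model_scores[j]))
                pairs += 1
    return loss / max(pairs, 1)

# Usage: three candidate summaries scored on two dimensions; the user
# prioritizes conciseness over completeness via the weights.
weights = {"completeness": 0.3, "conciseness": 0.7}
candidates = [
    {"completeness": 0.9, "conciseness": 0.4},
    {"completeness": 0.6, "conciseness": 0.8},
    {"completeness": 0.5, "conciseness": 0.5},
]
targets = [weighted_quality(c, weights) for c in candidates]
model_scores = [0.3, -0.2, -0.5]  # e.g., length-normalized log-probs
print(round(ranking_loss(model_scores, targets), 4))  # → 0.2
```

Changing the weights reorders the targets, so the same training signal can prioritize conciseness in one run and completeness in another, which is the kind of trade-off control the paper claims.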