Adaptive Planning for Multi-Attribute Controllable Summarization with Monte Carlo Tree Search

arXiv cs.CL / 4/13/2026

Opinion | Ideas & Deep Analysis | Models & Research


Key Points

The paper introduces PACO (Adaptive Planning for Multi-Attribute Controllable Summarization), a training-free method for generating summaries that satisfy multiple, potentially correlated, user-specified attributes.
PACO reframes controllable summarization as a planning problem where the system decides the sequence of attribute control steps, using a customized Monte Carlo Tree Search (MCTS) over summary states.
Each MCTS node represents a candidate summary and actions correspond to single-attribute adjustments, allowing the framework to iteratively refine only the attributes that still need improvement.
Experiments across domains and model families show PACO improves multi-attribute controllability compared with LLM-based self-planning approaches and fine-tuned baselines.
PACO is effective even with small models: with Llama-3.2-1B it rivals the controllability of the much larger Llama-3.3-70B baselines, and with larger models it outperforms all competitors.

Abstract

Controllable summarization moves beyond generic outputs toward human-aligned summaries guided by specified attributes. In practice, the interdependence among attributes makes it challenging for language models to satisfy correlated constraints consistently. Moreover, previous approaches often require per-attribute fine-tuning, limiting flexibility across diverse summary attributes. In this paper, we propose adaptive planning for multi-attribute controllable summarization (PACO), a training-free framework that reframes the task as planning the order of sequential attribute control with a customized Monte Carlo Tree Search (MCTS). In PACO, nodes represent summaries, and actions correspond to single-attribute adjustments, enabling progressive refinement of only the attributes requiring further control. This strategy adaptively discovers optimal control orders, ultimately producing summaries that effectively meet all constraints. Extensive experiments across diverse domains and models demonstrate that PACO achieves robust multi-attribute controllability, surpassing both LLM-based self-planning models and fine-tuned baselines. Remarkably, PACO with Llama-3.2-1B rivals the controllability of the much larger Llama-3.3-70B baselines. With larger models, PACO achieves superior control performance, outperforming all competitors.
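
The planning idea described above can be illustrated with a toy sketch. This is not the paper's implementation: the attribute names, integer targets, side-effect table, and the adjust function (a stand-in for an LLM rewrite that fixes one attribute but may disturb a correlated one) are all hypothetical, and the search is a plain UCB-based MCTS rather than PACO's customized variant. States play the role of summaries, and each action adjusts a single attribute that is still unmet.

```python
import math
import random

# Hypothetical targets and correlations (assumptions, not from the paper).
TARGETS = {"length": 3, "extractiveness": 2, "specificity": 4}
SIDE_EFFECTS = {"length": ("specificity", -1), "extractiveness": ("length", 1)}
ATTRS = tuple(TARGETS)

def unmet(state):
    """Attributes whose current value still misses its target."""
    return [a for a, v in zip(ATTRS, state) if v != TARGETS[a]]

def adjust(state, attr):
    """Stand-in for an LLM rewrite: fixes one attribute, may shift another."""
    values = dict(zip(ATTRS, state))
    values[attr] = TARGETS[attr]
    if attr in SIDE_EFFECTS:
        other, delta = SIDE_EFFECTS[attr]
        values[other] += delta
    return tuple(values[a] for a in ATTRS)

def reward(state):
    """Fraction of attributes currently satisfied."""
    return 1 - len(unmet(state)) / len(ATTRS)

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children, self.visits, self.value = [], 0, 0.0
        self.untried = unmet(state)  # only attributes still needing control

def ucb(child, parent_visits, c=1.4):
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return child.value / child.visits + explore

def plan(start, iterations=500, max_depth=3, seed=0):
    rng = random.Random(seed)
    root = Node(start)
    for _ in range(iterations):
        node, depth = root, 0
        # Selection: descend through fully expanded nodes by UCB score.
        while not node.untried and node.children and depth < max_depth:
            node = max(node.children, key=lambda ch: ucb(ch, node.visits))
            depth += 1
        # Expansion: apply one untried single-attribute adjustment.
        if node.untried and depth < max_depth:
            attr = node.untried.pop()
            child = Node(adjust(node.state, attr), node, attr)
            node.children.append(child)
            node, depth = child, depth + 1
        # Rollout: random adjustments of unmet attributes up to the depth limit.
        state = node.state
        for _ in range(max_depth - depth):
            options = unmet(state)
            if not options:
                break
            state = adjust(state, rng.choice(options))
        score = reward(state)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += score
            node = node.parent
    # Read off the control order by following the most-visited children.
    order, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        order.append(node.action)
    return order, node.state

order, final_state = plan((0, 0, 0))
print(order, final_state)
```

In this toy setup, the only three-step order that meets every target is extractiveness, then length, then specificity, because the other orders undo an earlier fix through a side effect. The search is meant to recover that order without being told the correlations in advance.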