Scaffold-Conditioned Preference Triplets for Controllable Molecular Optimization with Large Language Models

arXiv cs.LG / April 15, 2026

Key Points

  • The paper proposes Scaffold-Conditioned Preference Triplets (SCPT), a pipeline that builds scaffold-preserving preference triplets ⟨scaffold, better, worse⟩ using scaffold alignment plus chemistry-driven filters for validity, synthesizability, and meaningful property gains (see the construction sketch after this list).
  • It uses these preference signals to align a pretrained molecular LLM as a conditional editor, targeting controllable molecular optimization that improves properties while retaining scaffold similarity.
  • Experiments on single- and multi-objective benchmarks show higher optimization success and larger property gains than baselines, while maintaining higher scaffold similarity.
  • Models trained on one- or two-property supervision generalize effectively to three-property tasks, suggesting extrapolative capability under limited higher-order supervision.
  • SCPT also introduces controllable “data-construction knobs” that create a more predictable similarity–gain trade-off frontier for adapting to different optimization regimes.
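
A minimal, hypothetical sketch of how such scaffold-preserving triplets could be assembled with RDKit is shown below. The property oracle `score_property`, the synthesizability scorer `sa_score`, and all thresholds are illustrative placeholders, not the paper's exact filters.

```python
# Sketch of scaffold-conditioned triplet construction (not the authors' code).
# Assumes RDKit is installed; `score_property` and `sa_score` stand in for the
# paper's property oracle and synthesizability filter.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from rdkit.Chem.Scaffolds import MurckoScaffold


def scaffold_of(smiles):
    """Bemis-Murcko scaffold SMILES, or None for invalid input."""
    mol = Chem.MolFromSmiles(smiles)
    return MurckoScaffold.MurckoScaffoldSmiles(mol=mol) if mol else None


def similarity(smi_a, smi_b):
    """Morgan-fingerprint Tanimoto similarity between two SMILES."""
    fps = [AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s), 2, nBits=2048)
           for s in (smi_a, smi_b)]
    return DataStructs.TanimotoSimilarity(fps[0], fps[1])


def build_triplet(seed, candidates, score_property, sa_score,
                  sim_min=0.4, sa_max=4.0, gain_min=0.1):
    """Return a {scaffold, better, worse} triplet from edited candidates of `seed`,
    or None when the chemistry filters leave no usable pair."""
    scaffold = scaffold_of(seed)
    if scaffold is None:
        return None
    kept = []
    for smi in candidates:
        mol = Chem.MolFromSmiles(smi)
        if mol is None or scaffold_of(smi) != scaffold:
            continue                                  # validity + scaffold-alignment filter
        if similarity(seed, smi) < sim_min or sa_score(mol) > sa_max:
            continue                                  # similarity + synthesizability filter
        kept.append((score_property(smi), smi))
    if len(kept) < 2:
        return None
    kept.sort()                                       # ascending by property score
    (lo_score, worse), (hi_score, better) = kept[0], kept[-1]
    if hi_score - lo_score < gain_min:                # require a meaningful property gap
        return None
    return {"scaffold": scaffold, "better": better, "worse": worse}
```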

Abstract

Molecular property optimization is central to drug discovery, yet many deep learning methods rely on black-box scoring and offer limited control over scaffold preservation, often producing unstable or biologically implausible edits. While large language models (LLMs) are promising molecular generators, optimization remains constrained by the lack of chemistry-grounded preference supervision and principled data curation. We introduce Scaffold-Conditioned Preference Triplets (SCPT), a pipeline that constructs similarity-constrained triplets ⟨scaffold, better, worse⟩ via scaffold alignment and chemistry-driven filters for validity, synthesizability, and meaningful property gains. Using these preferences, we align a pretrained molecular LLM as a conditional editor, enabling property-improving edits that retain the scaffold. Across single- and multi-objective benchmarks, SCPT improves optimization success and property gains while maintaining higher scaffold similarity than competitive baselines. Compared with representative non-LLM molecular optimization methods, SCPT-trained LLMs are better suited to scaffold-constrained and multi-objective optimization. In addition, models trained on single-property and two-property supervision generalize effectively to three-property tasks, indicating promising extrapolative generalization under limited higher-order supervision. SCPT also provides controllable data-construction knobs that yield a predictable similarity–gain frontier, enabling systematic adaptation to diverse optimization regimes.
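
To make the alignment step concrete, the snippet below shows one plausible way to serialize an SCPT triplet into prompt/chosen/rejected records for standard preference-based fine-tuning (e.g., DPO-style training). The prompt wording, the objective string, and the `to_preference_record` helper are assumptions for illustration, not the paper's exact format.

```python
# Hypothetical serialization of an SCPT triplet into a preference record; the prompt
# template and objective are illustrative, not taken from the paper.
def to_preference_record(triplet, objective="increase QED"):
    prompt = (f"Objective: {objective}. "
              f"Edit the molecule while keeping its scaffold {triplet['scaffold']} intact. "
              f"Output the edited SMILES:")
    return {
        "prompt": prompt,               # shared conditioning context (scaffold + goal)
        "chosen": triplet["better"],    # preferred, property-improving edit
        "rejected": triplet["worse"],   # dispreferred edit with a smaller gain
    }


# Example usage with a toy triplet (scaffold and molecules chosen only for illustration):
record = to_preference_record(
    {"scaffold": "c1ccccc1", "better": "CCOc1ccccc1", "worse": "Cc1ccccc1"}
)
```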