SMART: A Spectral Transfer Approach to Multi-Task Learning

arXiv cs.LG / 4/23/2026


Key Points

  • The paper introduces SMART, a spectral transfer method for multi-task learning in linear regression that addresses performance drops when target sample sizes are small.
  • Unlike prior transfer approaches that assume bounded differences between source and target models, SMART instead assumes spectral similarity, with target singular subspaces contained in corresponding source subspaces and sparsely aligned to source singular bases.
  • SMART estimates the target coefficient matrix using structured regularization that leverages spectral information from a fitted source model, without requiring access to raw source data.
  • The authors develop a practical ADMM-based algorithm to handle the nonconvex optimization problem and provide non-asymptotic error bounds together with a minimax lower bound in the noiseless-source regime.
  • Simulations show improved estimation accuracy and robustness to negative transfer, an analysis of multi-modal single-cell data demonstrates better predictive performance, and the implementation is released on GitHub.

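To make the spectral-similarity assumption concrete, here is a minimal NumPy sketch (with made-up dimensions and matrices, not taken from the paper): the target coefficient matrix is built from the source's singular bases via a sparse alignment matrix, so its row and column spaces are contained in the corresponding source singular subspaces.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical source coefficient matrix (p features x q tasks) of rank r.
p, q, r = 8, 6, 3
U_s, _ = np.linalg.qr(rng.standard_normal((p, r)))  # source left singular basis
V_s, _ = np.linalg.qr(rng.standard_normal((q, r)))  # source right singular basis
B_source = U_s @ np.diag([3.0, 2.0, 1.0]) @ V_s.T

# Spectral similarity as described above: the target coefficient matrix is
# expressed in the source singular bases through a SPARSE alignment matrix A,
# i.e. the target reuses only a few source singular directions.
A = np.array([[1.5, 0.0, 0.0],
              [0.0, 0.0, 0.8],
              [0.0, 0.0, 0.0]])
B_target = U_s @ A @ V_s.T

# Containment of the target singular subspaces in the source ones means that
# projecting onto span(U_s) and span(V_s) leaves B_target unchanged.
P_U = U_s @ U_s.T
P_V = V_s @ V_s.T
assert np.allclose(P_U @ B_target @ P_V, B_target)
```

Note how this goes beyond bounded-difference assumptions: B_target above can be arbitrarily far from B_source in norm, yet the shared subspace structure still makes transfer possible.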
Abstract

Multi-task learning is effective for related applications, but its performance can deteriorate when the target sample size is small. Transfer learning can borrow strength from related studies; yet, many existing methods rely on restrictive bounded-difference assumptions between the source and target models. We propose SMART, a spectral transfer method for multi-task linear regression that instead assumes spectral similarity: the target left and right singular subspaces lie within the corresponding source subspaces and are sparsely aligned with the source singular bases. Such an assumption is natural when studies share latent structures and enables transfer beyond the bounded-difference settings. SMART estimates the target coefficient matrix through structured regularization that incorporates spectral information from a source study. Importantly, it requires only a fitted source model rather than the raw source data, making it useful when data sharing is limited. Although the optimization problem is nonconvex, we develop a practical ADMM-based algorithm. We establish general, non-asymptotic error bounds and a minimax lower bound in the noiseless-source regime. Under additional regularity conditions, these results yield near-minimax Frobenius error rates up to logarithmic factors. Simulations confirm improved estimation accuracy and robustness to negative transfer, and analysis of multi-modal single-cell data demonstrates better predictive performance. The Python implementation of SMART, along with the code to reproduce all experiments in this paper, is publicly available at https://github.com/boxinz17/smart.
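The abstract's key practical point, that SMART needs only a fitted source model, can be sketched as follows. This toy example uses a simple proximal-gradient (ISTA) loop with an l1 penalty as a stand-in for the paper's ADMM algorithm and structured regularizer; the dimensions, penalty level, and variable names are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q, r = 200, 8, 6, 3  # hypothetical sizes: samples, features, tasks, source rank

# All we keep from the source study is its fitted singular subspaces,
# never the raw source data.
U_s, _ = np.linalg.qr(rng.standard_normal((p, r)))
V_s, _ = np.linalg.qr(rng.standard_normal((q, r)))

# Ground-truth target: sparsely aligned with the source singular bases.
A_true = np.zeros((r, r))
A_true[0, 0], A_true[1, 2] = 2.0, 1.0
B_true = U_s @ A_true @ V_s.T
X = rng.standard_normal((n, p))
Y = X @ B_true + 0.1 * rng.standard_normal((n, q))

# ISTA stand-in for the ADMM solver:
#   min_A (1/2n) ||Y - X U_s A V_s^T||_F^2 + lam * ||A||_1
Xt = X @ U_s                                   # n x r design in the source subspace
lam = 0.05
step = n / np.linalg.norm(Xt, 2) ** 2          # 1 / Lipschitz constant of the loss
A = np.zeros((r, r))
for _ in range(500):
    G = Xt.T @ (Xt @ A @ V_s.T - Y) @ V_s / n  # gradient of the squared loss
    A = A - step * G
    A = np.sign(A) * np.maximum(np.abs(A) - step * lam, 0.0)  # soft-threshold

B_hat = U_s @ A @ V_s.T                        # estimated target coefficient matrix
```

The l1 penalty on A encodes the sparse-alignment assumption, while restricting the estimate to U_s A V_s^T encodes subspace containment; the paper's actual regularizer and ADMM updates differ, so treat this purely as a conceptual sketch.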