Near-optimal and Efficient First-Order Algorithm for Multi-Task Learning with Shared Linear Representation

arXiv cs.LG / 5/4/2026

Key Points

  • The paper addresses a gap in multi-task learning: likelihood-based algorithms that can be solved efficiently remain underdeveloped, even for the comparatively simple case of a shared linear representation.
  • It introduces a first-order optimization method that jointly learns a shared representation and task-specific parameters, targeting the non-convexity caused by matrix factorization.
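  • The algorithm is proven to converge in O~(1) iterations, meaning the iteration count is constant up to logarithmic factors. This is a theoretical efficiency guarantee rather than an empirical benchmark.
  • The method achieves a near-optimal estimation error of O~(dk/(TN)) and improves over existing likelihood-based approaches by a factor of k, where d is the input dimension, k the representation dimension, T the number of tasks, and N the number of samples per task.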
  • Overall, the work provides theoretical evidence that likelihood-based first-order methods can efficiently solve the multi-task learning problem under the studied setting.
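
To make the setting in the points above concrete, here is a minimal sketch of a first-order method on a shared linear representation. It is not the paper's algorithm: it assumes the standard model y = x^T B w_t + noise with Gaussian noise (so the negative log-likelihood reduces to squared loss) and runs plain gradient descent on the factors B and W at once. The function names, step size, initialization scale, and problem sizes are all illustrative choices, not values from the source.

```python
import numpy as np


def simulate(d=10, k=2, T=20, N=200, noise=0.1, seed=0):
    """Draw synthetic data from the assumed model y = x^T B w_t + noise."""
    rng = np.random.default_rng(seed)
    B_star, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthonormal d x k
    W_star = rng.standard_normal((k, T))                   # one column per task
    X = rng.standard_normal((T, N, d))                     # N inputs per task
    Y = np.einsum("tnd,dt->tn", X, B_star @ W_star)
    Y += noise * rng.standard_normal((T, N))
    return X, Y, B_star


def loss_and_grads(X, Y, B, W):
    """Squared loss (Gaussian negative log-likelihood up to constants),
    summed over tasks and averaged over samples, with gradients in B and W."""
    N = X.shape[1]
    resid = np.einsum("tnd,dt->tn", X, B @ W) - Y          # T x N residuals
    loss = 0.5 * np.sum(resid ** 2) / N
    G = np.einsum("tnd,tn->dt", X, resid) / N              # gradient w.r.t. B @ W
    return loss, G @ W.T, B.T @ G                          # chain rule on the factors


def fit(X, Y, k, steps=2000, lr=0.02, init_scale=0.1, seed=1):
    """Plain joint gradient descent on (B, W) from a small random start.
    Hyperparameters are illustrative, not taken from the paper."""
    T, _, d = X.shape
    rng = np.random.default_rng(seed)
    B = init_scale * rng.standard_normal((d, k))
    W = init_scale * rng.standard_normal((k, T))
    losses = []
    for _ in range(steps):
        loss, grad_B, grad_W = loss_and_grads(X, Y, B, W)
        losses.append(loss)
        B = B - lr * grad_B
        W = W - lr * grad_W
    return B, W, losses


def subspace_distance(B_hat, B_star):
    """Spectral norm of the part of B_star outside the column span of B_hat.
    Assumes B_star has orthonormal columns, as produced by simulate()."""
    Q, _ = np.linalg.qr(B_hat)
    return np.linalg.norm(B_star - Q @ (Q.T @ B_star), 2)


if __name__ == "__main__":
    X, Y, B_star = simulate()
    B_hat, W_hat, losses = fit(X, Y, k=2)
    print("initial loss:", losses[0], "final loss:", losses[-1])
    print("subspace distance:", subspace_distance(B_hat, B_star))
```

The objective is non-convex because B and W enter only through the product B @ W, which is the matrix-factorization structure the summary refers to. The subspace distance is used for evaluation because B is only identifiable up to an invertible k x k transform.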

Abstract

Multi-task learning (MTL) has emerged as a pivotal paradigm in machine learning by leveraging shared structures across multiple related tasks. Despite its empirical success, the development of likelihood-based efficiently solvable algorithms, even for shared linear representations, remains largely underdeveloped, primarily due to the non-convex structure intrinsic to matrix factorization. This paper introduces a first-order algorithm that jointly learns a shared representation and task-specific parameters, with guaranteed efficiency. Notably, it converges in O~(1) iterations and attains a near-optimal estimation error of O~(dk/(TN)), improving over existing likelihood-based methods by a factor of k, where d, k, T, N denote input dimension, representation dimension, task count, and samples per task, respectively. Our results justify that likelihood-based first-order methods can efficiently solve the MTL problem.
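
For readers who want the rates written out, the following display restates them. The model in the first line is an assumption: it is the usual formulation of a shared linear representation and is not quoted from the abstract. The second rate is only what the stated factor-of-k improvement implies for prior likelihood-based methods, not a figure given in the source.

```latex
% Assumed model (standard formulation, not quoted from the abstract):
% task t = 1, ..., T and sample i = 1, ..., N
\[
  y_{t,i} = x_{t,i}^{\top} B\, w_t + \varepsilon_{t,i},
  \qquad B \in \mathbb{R}^{d \times k},\quad w_t \in \mathbb{R}^{k}.
\]

% Rate stated in the abstract, and the rate implied for prior
% likelihood-based methods by the factor-of-k improvement
\[
  \underbrace{\widetilde{\mathcal{O}}\!\left(\frac{dk}{TN}\right)}_{\text{this paper}}
  \qquad \text{versus} \qquad
  \underbrace{\widetilde{\mathcal{O}}\!\left(\frac{dk^{2}}{TN}\right)}_{\text{implied prior rate}}
\]
```

One way to read the first rate, offered here as an interpretation rather than the authors' argument, is as parameter counting: the shared matrix B has dk entries, and it is estimated from the TN samples pooled across all tasks.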
