Budget Constraints as Riemannian Manifolds

arXiv cs.LG / 5/4/2026


Key Points

  • The paper addresses a recurring ML optimization problem—assigning one of K options to each of N groups under a strict total budget—highlighting that the true loss couples all groups and makes direct combinatorial optimization difficult.
  • It proposes a geometric reformulation: under softmax relaxation, the budget constraint becomes a smooth Riemannian manifold in logit space with closed-form normal vectors and monotonic expected-cost behavior along the cost vector.
  • Building on this manifold structure, the authors introduce Riemannian Constrained Optimization (RCO), which modifies Adam with tangent projection, binary-search retraction, and momentum transport to enforce budgets exactly without constraint-specific hyperparameters.
  • Combined with Gumbel straight-through for discrete choices and budget-constrained dynamic programming for discrete feasibility, RCO recovers optimal or near-optimal solutions on synthetic knapsack benchmarks and outperforms or matches evolutionary search on LLM compression tasks at 3x to 16x lower wall-clock cost.
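The manifold structure described in the second and third bullets can be sketched concretely. Under a softmax relaxation, the expected total cost is a smooth function of the logits; its gradient (the constraint normal) has a closed form, and shifting logits along the cost vector moves expected cost monotonically, so a 1-D binary search restores exact feasibility. The snippet below is a minimal illustration of those two operations, not the paper's implementation; all function names and the specific search bounds are assumptions for the sketch.

```python
import numpy as np

def softmax(x):
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def expected_cost(logits, costs):
    # E[cost] = sum over groups of softmax-weighted option costs
    return float((softmax(logits) * costs).sum())

def constraint_normal(logits, costs):
    # Closed-form gradient of expected cost w.r.t. logits:
    # n[g, k] = p[g, k] * (c[g, k] - sum_j p[g, j] * c[g, j])
    p = softmax(logits)
    per_group = (p * costs).sum(axis=-1, keepdims=True)
    return p * (costs - per_group)

def project_to_tangent(grad, logits, costs):
    # Remove the component of grad along the constraint normal, so a
    # small step changes expected cost only to second order.
    n = constraint_normal(logits, costs)
    return grad - (grad * n).sum() / (n * n).sum() * n

def retract_to_budget(logits, costs, budget, lo=-50.0, hi=50.0, iters=60):
    # Shifting logits by -t * costs decreases expected cost monotonically
    # in t, so binary search on t recovers exact budget feasibility.
    # Assumes the budget lies in the achievable range of expected costs.
    for _ in range(iters):
        t = 0.5 * (lo + hi)
        if expected_cost(logits - t * costs, costs) > budget:
            lo = t
        else:
            hi = t
    return logits - 0.5 * (lo + hi) * costs
```

A full RCO-style step would additionally transport the optimizer's momentum to the new tangent space, which per the paper reduces to a single inner product; that bookkeeping is omitted here.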

Abstract

Assigning one of K options to each of N groups under a total cost budget is a recurring problem in machine learning, appearing in mixed-precision quantization, non-uniform pruning, and expert selection. The objective (model loss) depends jointly on all assignments and does not decompose across groups, which prevents combinatorial solvers from optimizing the true objective directly and limits them to proxy objectives. Evolutionary search evaluates the actual loss but lacks gradient information, while penalty-based methods provide gradients but enforce the budget only approximately and require sensitive hyperparameter tuning. We observe that under softmax relaxation, the budget constraint defines a smooth Riemannian manifold in logit space with particularly simple geometry: the normal vector is available in closed form; shifting logits along the cost vector changes expected cost monotonically, enabling binary-search retraction; and vector transport reduces to a single inner product. Building on this structure, we propose Riemannian Constrained Optimization (RCO), which augments a standard Adam update with tangent projection, binary-search retraction, and momentum transport. Combined with Gumbel straight-through estimation and budget-constrained dynamic programming for discrete feasibility, RCO enables first-order optimization of the true objective under exact budget enforcement, without introducing constraint hyperparameters. On synthetic knapsack problems with known optima, the manifold-based constraint handling recovers optimal solutions, whereas penalty methods plateau at 83% of optimal. On LLM compression tasks, including mixed-precision quantization and MoE expert pruning, RCO matches or exceeds evolutionary search methods at 3x to 16x lower wall-clock cost on the evaluated configurations.
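The abstract's "budget-constrained dynamic programming for discrete feasibility" refers to solving the discrete side of the problem exactly once choices must be committed. A standard way to do this is a multiple-choice knapsack DP: pick exactly one option per group, maximizing a per-option score subject to an integer cost budget. The sketch below shows that classic DP with backpointers; it is an illustration of the general technique, not the paper's code, and assumes integer option costs.

```python
def mckp_dp(scores, costs, budget):
    """Multiple-choice knapsack DP: choose one option k per group n,
    maximizing sum of scores[n][k] subject to sum of costs[n][k] <= budget.
    Costs must be non-negative integers. Returns (best value, assignment)."""
    N, K = len(scores), len(scores[0])
    NEG = float("-inf")
    best = [NEG] * (budget + 1)   # best[b] = max score using exactly cost b
    best[0] = 0.0
    back = []                     # backpointers: (previous budget, chosen k)
    for n in range(N):
        new = [NEG] * (budget + 1)
        ptr = [None] * (budget + 1)
        for b in range(budget + 1):
            if best[b] == NEG:
                continue
            for k in range(K):
                nb = b + costs[n][k]
                if nb <= budget and best[b] + scores[n][k] > new[nb]:
                    new[nb] = best[b] + scores[n][k]
                    ptr[nb] = (b, k)
        best = new
        back.append(ptr)
    # Recover the best assignment by walking backpointers.
    b_star = max(range(budget + 1), key=lambda b: best[b])
    value = best[b_star]
    assign, b = [0] * N, b_star
    for n in reversed(range(N)):
        b, assign[n] = back[n][b]
    return value, assign
```

The table has O(N * budget * K) entries to fill, which is why this is practical for the moderate integer budgets that arise in bit-width or expert-count allocation; it optimizes a decomposable proxy score, which is exactly the limitation of combinatorial solvers that RCO's manifold approach is designed to avoid for the true coupled loss.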