Cost-Aware Learning

arXiv cs.LG / 5/1/2026


Key Points

  • The paper studies Cost-Aware Learning for finite-sum optimization, aiming to reach a target error while minimizing total cost when different components have different sampling costs.
  • It introduces the Cost-Aware Stochastic Gradient Descent (SGD) algorithm for convex objectives and derives cost-complexity bounds for reaching error ε.
  • The authors prove a lower bound for the problem and propose a subset selection method to further reduce training cost.
  • They apply the framework to reinforcement learning with language models, where policy-gradient cost depends on sequence length, presenting Cost-Aware GRPO to cut policy-optimization cost without hurting performance.
  • Experiments on 1.5B and 8B LLMs show up to ~30% fewer tokens used for policy optimization while achieving matching or better accuracy versus baselines.
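The core idea behind cost-aware SGD can be illustrated with a small sketch. The paper's exact sampling distribution and step-size schedule are not given in this summary, so the choices below (sampling components with probability inversely proportional to cost, then importance-weighting gradients to keep the estimate unbiased) are illustrative assumptions on a toy least-squares finite sum:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy finite-sum problem: f(x) = (1/n) * sum_i (a_i . x - b_i)^2
n, d = 100, 5
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.01 * rng.normal(size=n)

# Hypothetical per-component sampling costs (not taken from the paper).
costs = rng.uniform(1.0, 10.0, size=n)

# One plausible cost-aware choice: sample cheap components more often,
# with probability proportional to 1/cost.
p = (1.0 / costs) / np.sum(1.0 / costs)

def grad_i(x, i):
    # Gradient of the i-th component f_i(x) = (a_i . x - b_i)^2
    return 2.0 * (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
lr = 0.01
total_cost = 0.0
for t in range(5000):
    i = rng.choice(n, p=p)
    # Importance-weight so that E[g] = (1/n) * sum_i grad_i(x),
    # i.e. the gradient estimate stays unbiased under biased sampling.
    g = grad_i(x, i) / (n * p[i])
    x -= lr * g
    total_cost += costs[i]

final_loss = np.mean((A @ x - b) ** 2)
```

Tracking `total_cost` instead of the iteration count is what distinguishes the cost-aware setting: the relevant complexity measure is cumulative sampling cost to reach error ε, not the number of gradient queries.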

Abstract

We consider the problem of Cost-Aware Learning, where sampling different component functions of a finite-sum objective incurs different costs. The objective is to reach a target error while minimizing the total cost. First, we propose the Cost-Aware Stochastic Gradient Descent algorithm for convex functions, and derive its cost complexity to attain an error of ε. Furthermore, we establish a lower bound for this setting and provide a subset selection algorithm to further reduce the cost of training. We apply our theoretical insights to reinforcement learning with language models, where the computational cost of policy gradients varies with sequence length. To this end, we introduce Cost-Aware GRPO, an algorithm designed to reduce the cost of policy optimization while preserving performance. Empirical results on 1.5B and 8B LLMs demonstrate that our approach reduces the tokens used in policy optimization by up to about 30% while matching or exceeding baseline accuracy.
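In the RL-with-LLMs application, the "cost" of a rollout is its token count. The paper's Cost-Aware GRPO algorithm is not specified in this summary, so the following is only a hedged sketch of the general idea it gestures at: given a batch of candidate rollouts, select a subset whose combined token count fits a budget, favoring rollouts with high training signal per token (both the scoring rule and the greedy heuristic here are illustrative assumptions, not the paper's method):

```python
def select_rollouts(rollouts, token_budget):
    """Pick rollouts under a token budget.

    rollouts: list of (advantage, num_tokens) pairs (hypothetical inputs).
    Returns the indices of selected rollouts.
    """
    # Greedy knapsack-style heuristic: rank rollouts by absolute
    # advantage per token, then add them while the budget allows.
    order = sorted(
        range(len(rollouts)),
        key=lambda i: abs(rollouts[i][0]) / rollouts[i][1],
        reverse=True,
    )
    chosen, used = [], 0
    for i in order:
        tokens = rollouts[i][1]
        if used + tokens <= token_budget:
            chosen.append(i)
            used += tokens
    return chosen

# Example: three rollouts with (advantage, token count), budget of 60 tokens.
picked = select_rollouts([(1.0, 100), (0.5, 10), (0.2, 50)], token_budget=60)
```

A selection step like this is one way a ~30% token reduction could arise without discarding the most informative gradients, which is consistent with the reported matching-or-better accuracy.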