Prime Once, then Reprogram Locally: An Efficient Alternative to Black-Box Service Model Adaptation
arXiv cs.LG / 4/3/2026
Key Points
- The paper argues that adapting closed-box service APIs via Zeroth-Order Optimization (ZOO) is often inefficient, requiring many costly API calls and suffering from slow or unstable optimization, with extra difficulties appearing in modern APIs like GPT-4o.
- It introduces AReS (Alternative efficient Reprogramming for Service models), which replaces continuous black-box optimization with a single-pass “priming” step that trains only a lightweight adapter on a local encoder.
- After priming, AReS switches to a glass-box (white-box) reprogramming stage on the local proxy model, so adaptation and inference run locally and incur essentially no further API usage.
- Experiments report strong gains on GPT-4o, including +27.8% over the zero-shot baseline in a setting where ZOO-based methods largely fail, along with improvements across ten datasets (e.g., +2.5% for VLMs and +15.6% for standard VMs).
- The method reduces API calls by over 99.99% while matching or exceeding the performance of prior approaches, positioning it as a practical alternative for API-based model adaptation.
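The two-stage pipeline described above can be sketched in miniature. This is a hedged illustration, not the paper's implementation: the encoder, the service model, the ridge-regression adapter fit, and the additive-perturbation reprogramming objective below are all simplified stand-ins chosen to show why only one (batched) API call is needed, with all subsequent adaptation running locally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins (not the paper's actual components):
A = rng.normal(size=(8, 16)) / 4      # frozen local encoder weights
W_srv = rng.normal(size=(8, 3))       # hidden service-model weights (unknown locally)

api_calls = 0
def service_api(x):                   # black-box API: every call is "costly"
    global api_calls
    api_calls += 1
    return x @ W_srv                  # returns service logits for a batch

def encode(x):                        # frozen local encoder, free to run
    return np.tanh(x @ A)

# --- Stage 1: one-pass "priming" ------------------------------------
# Query the API once on a small batch, then fit a lightweight linear
# adapter on local encoder features in closed form (ridge regression).
X = rng.normal(size=(64, 8))
Y = service_api(X)                    # the single batched API call
F = encode(X)
W_adapt = np.linalg.solve(F.T @ F + 1e-2 * np.eye(16), F.T @ Y)

# --- Stage 2: glass-box reprogramming on the local proxy ------------
# Learn an additive input perturbation delta by gradient descent; all
# gradients flow through the local proxy, so no further API calls occur.
def mse_and_grad(delta):
    z = (X + delta) @ A
    f = np.tanh(z)
    err = f @ W_adapt - Y             # proxy logits vs. primed targets
    g_logits = 2 * err / err.size
    g_z = (g_logits @ W_adapt.T) * (1 - f ** 2)   # backprop through tanh
    return (err ** 2).mean(), (g_z @ A.T).sum(axis=0)

delta = np.zeros(8)
loss_before, _ = mse_and_grad(delta)
for _ in range(300):
    _, g = mse_and_grad(delta)
    delta -= 0.1 * g
loss_after, _ = mse_and_grad(delta)
print(api_calls)                      # adaptation consumed one API call in total
```

Here the 99.99% call reduction shows up structurally: ZOO-style methods would spend API queries on every optimization step, whereas this sketch spends its entire API budget on the single priming batch and then iterates for free on the local proxy.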