Cross-Domain Uncertainty Quantification for Selective Prediction: A Comprehensive Bound Ablation with Transfer-Informed Betting

arXiv cs.AI / 2026/3/11

Ideas & Deep Analysis | Models & Research

Key points

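  • The paper presents a comprehensive ablation study comparing nine finite-sample bound families for selective prediction with risk control, combining concentration inequalities, multiple-testing corrections, and betting-based confidence sequences.
  • The main theoretical contribution is Transfer-Informed Betting (TIB), which warm-starts the WSR (Waudby-Smith and Ramdas) betting wealth process using a source domain's risk profile, yielding tighter bounds in data-scarce target domains together with a formal dominance guarantee.
  • The author proves that the TIB wealth process remains a valid supermartingale under all source-target divergences, that TIB dominates standard WSR when the domains match, and that no data-independent warm-start can achieve better convergence.
  • Experiments on four benchmarks (MASSIVE, NyayaBench, CLINC-150, and Banking77) show substantial gains in guaranteed coverage, and TIB remains feasible in low-data calibration settings where Hoeffding-family bounds are infeasible.
  • The paper also compares these selective prediction methods with split-conformal prediction, noting that selective prediction provides risk guarantees for a single prediction whereas conformal methods produce prediction sets. It then applies the methods to agentic caching systems, formalizing a progressive trust model for serving cached responses autonomously.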
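
To make the role of the multiple-testing correction concrete, here is a minimal sketch comparing a union bound with fixed-sequence testing when certifying confidence thresholds. It is an illustration under stated assumptions, not the paper's code: the function names, the toy calibration data, and the plain Hoeffding p-value are all chosen here for illustration, and the sketch glosses over the fact that the number of accepted samples is itself random.

```python
import math

def hoeffding_p_value(losses, alpha):
    """Hoeffding p-value for H0: E[loss] > alpha, with losses in [0, 1]."""
    n = len(losses)
    if n == 0:
        return 1.0  # no accepted samples, so no evidence against H0
    gap = max(alpha - sum(losses) / n, 0.0)
    return math.exp(-2.0 * n * gap * gap)

def certified_thresholds(scores, losses, thresholds, alpha, delta, method):
    """Return the thresholds whose selective risk is certified <= alpha.

    `thresholds` must be ordered from most to least conservative
    (descending), which is the order fixed-sequence testing walks through.
    """
    p_values = []
    for lam in thresholds:
        accepted = [l for s, l in zip(scores, losses) if s >= lam]
        p_values.append(hoeffding_p_value(accepted, alpha))
    if method == "union":
        # Every hypothesis is tested at level delta / K.
        level = delta / len(thresholds)
        return [lam for lam, p in zip(thresholds, p_values) if p <= level]
    if method == "fixed_sequence":
        # Each hypothesis is tested at the full level delta, in order,
        # stopping at the first one that cannot be rejected.
        certified = []
        for lam, p in zip(thresholds, p_values):
            if p > delta:
                break
            certified.append(lam)
        return certified
    raise ValueError(f"unknown method: {method}")

def coverage(scores, lam):
    """Fraction of samples the selective predictor answers at threshold lam."""
    return sum(s >= lam for s in scores) / len(scores)

# Made-up calibration set: 200 confident, all-correct samples plus
# 100 less confident samples that contain 10 errors.
scores = [0.95] * 200 + [0.60] * 100
losses = [0.0] * 200 + [1.0] * 10 + [0.0] * 90
thresholds = [0.9, 0.5]

union_ok = certified_thresholds(scores, losses, thresholds, 0.10, 0.10, "union")
ltt_ok = certified_thresholds(scores, losses, thresholds, 0.10, 0.10, "fixed_sequence")
print("union bound:", union_ok, "coverage", coverage(scores, union_ok[-1]))
print("fixed sequence:", ltt_ok, "coverage", coverage(scores, ltt_ok[-1]))
```

On this toy set the p-value at threshold 0.5 is about 0.069, which passes the full level delta = 0.10 but not the union-bound level delta / K = 0.05. Fixed-sequence testing therefore certifies the lower threshold and reaches full coverage, while the union bound stops at threshold 0.9 with two-thirds coverage.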

arXiv:2603.08907 (cs)
[Submitted on 9 Mar 2026]

Title: Cross-Domain Uncertainty Quantification for Selective Prediction: A Comprehensive Bound Ablation with Transfer-Informed Betting

Abstract: We present a comprehensive ablation of nine finite-sample bound families for selective prediction with risk control, combining concentration inequalities (Hoeffding, Empirical Bernstein, Clopper-Pearson, Wasserstein DRO, CVaR) with multiple-testing corrections (union bound, Learn Then Test fixed-sequence) and betting-based confidence sequences (WSR). Our main theoretical contribution is Transfer-Informed Betting (TIB), which warm-starts the WSR wealth process using a source domain's risk profile, achieving tighter bounds in data-scarce settings with a formal dominance guarantee. We prove that the TIB wealth process remains a valid supermartingale under all source-target divergences, that TIB dominates standard WSR when domains match, and that no data-independent warm-start can achieve better convergence. The combination of betting-based confidence sequences, LTT monotone testing, and cross-domain transfer is, to our knowledge, a three-way novelty not present in the literature. We evaluate all nine bound families on four benchmarks, MASSIVE (n=1,102), NyayaBench (n=280), CLINC-150 (n=22.5K), and Banking77 (n=13K), across 18 (alpha, delta) configurations. On MASSIVE at alpha=0.10, LTT eliminates the ln(K) union-bound penalty, achieving 94.0% guaranteed coverage versus 73.8% for Hoeffding, a 27% relative improvement. On NyayaBench, where the small calibration set makes Hoeffding-family bounds infeasible below alpha=0.20, Transfer-Informed Betting achieves 18.5% coverage at alpha=0.10, a 5.4x improvement over LTT + Hoeffding. We additionally compare with split-conformal prediction, showing that conformal methods produce prediction sets (avg. 1.67 classes) whereas selective prediction provides single-prediction risk guarantees. We apply these methods to agentic caching systems, formalizing a progressive trust model where the guarantee determines when cached responses can be served autonomously.
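
The abstract does not spell out how TIB constructs its warm start, so the following is only a hedged sketch of the general idea: a betting wealth process that tests H0: E[loss] >= alpha, with bets driven by a running risk estimate seeded with source-domain pseudo-observations. The function name, the `prior_risk` and `prior_weight` parameters, the plug-in betting rule, and the bet cap are assumptions made for this illustration and are not taken from the paper. Because the seed is fixed before any target data is seen, each bet depends only on past target losses, so the wealth stays a nonnegative supermartingale under H0 even when the source risk is badly wrong.

```python
def betting_certifies(losses, alpha, delta, prior_risk=0.5, prior_weight=1.0):
    """Sequentially test H0: E[loss] >= alpha with a betting wealth process.

    Returns the number of samples consumed when the wealth first reaches
    1/delta (the risk is then certified below alpha with probability at
    least 1 - delta by Ville's inequality), or None if it never does.

    prior_risk / prior_weight are hypothetical warm-start knobs: they act
    as pseudo-observations from a source domain. prior_weight must be > 0.
    """
    # Capping the bet below 1 / (1 - alpha) keeps every factor positive.
    max_bet = 0.9 / (1.0 - alpha)
    wealth, loss_sum = 1.0, 0.0
    for t, loss in enumerate(losses):
        # Risk estimate from the prior plus the t target losses seen so far.
        mean = (prior_weight * prior_risk + loss_sum) / (prior_weight + t)
        gap = alpha - mean
        var = mean * (1.0 - mean)  # upper bound on the variance for [0, 1] losses
        # Plug-in bet (an illustrative choice): bet only when the estimate is below alpha.
        bet = gap / (var + gap * gap) if gap > 0 else 0.0
        bet = min(bet, max_bet)
        # Under H0 the expected factor is 1 + bet * (alpha - E[loss]) <= 1.
        wealth *= 1.0 + bet * (alpha - loss)
        if wealth >= 1.0 / delta:
            return t + 1
        loss_sum += loss
    return None

# Made-up target stream: 28 calibration samples with no errors.
target = [0.0] * 28
cold = betting_certifies(target, alpha=0.10, delta=0.10)
warm = betting_certifies(target, alpha=0.10, delta=0.10,
                         prior_risk=0.02, prior_weight=50.0)
# A badly mismatched target (every prediction wrong) is never certified.
mismatch = betting_certifies([1.0] * 200, alpha=0.10, delta=0.10,
                             prior_risk=0.02, prior_weight=50.0)
print("cold start:", cold, "| warm start:", warm, "| mismatch:", mismatch)
```

On this toy stream the warm-started run certifies after 25 samples, while the cold-started run spends its first samples learning that the risk is low and does not certify within 28. With every target loss equal to 1, each factor is at most 1, so the wealth never grows and nothing is certified however optimistic the source estimate is.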
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
MSC classes: 62F25, 68T05
ACM classes: I.2.6; G.3
Cite as: arXiv:2603.08907 [cs.LG]
  (or arXiv:2603.08907v1 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2603.08907
arXiv-issued DOI via DataCite

Submission history

From: Abhinaba Basu
[v1] Mon, 9 Mar 2026 20:25:18 UTC (93 KB)