AI Navigate

The Confidence Gate Theorem: When Should Ranked Decision Systems Abstain?

arXiv cs.AI / March 11, 2026

Ideas & Deep Analysis | Models & Research

Key Points

  • Ranked decision systems such as recommenders, ad auctions, and clinical triage queues must decide whether to intervene in their ranked outputs or to abstain, in order to improve decision quality.
  • Two key conditions under which confidence-based abstention monotonically improves decisions are identified: rank-alignment and the absence of inversion zones. The analysis distinguishes structural uncertainty from contextual uncertainty.
  • Empirical validation across three domains (collaborative filtering, e-commerce intent detection, and clinical triage) shows that structural uncertainty yields consistent abstention gains, while contextual uncertainty makes abstention unreliable.
  • Context-aware confidence signals partially mitigate the problems caused by contextual uncertainty but do not fully restore monotonic improvement, highlighting the complexity of real-world deployment.
  • As a practical guideline, the paper recommends validating these conditions on held-out data before deploying a confidence gate, and matching the confidence signal to the dominant uncertainty type.
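The core mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses an observation count as a structural confidence signal (the kind the paper says works under cold-start), and the threshold and item format are illustrative assumptions.

```python
# Minimal confidence-gate sketch (illustrative, not the paper's code).
# Abstain from acting on an item when a structural confidence signal
# (here, how many observations back its score) is too low.

def confidence_gate(items, min_observations=20):
    """Split ranked items into (act, abstain) by observation count.

    Each item is a tuple (item_id, score, n_observations); the
    threshold of 20 is an arbitrary illustrative choice.
    """
    act, abstain = [], []
    for item_id, score, n_obs in items:
        if n_obs >= min_observations:
            act.append((item_id, score, n_obs))      # enough data: trust the ranking
        else:
            abstain.append((item_id, score, n_obs))  # cold-start: defer or fall back
    return act, abstain

ranked = [("a", 0.9, 120), ("b", 0.8, 3), ("c", 0.7, 45)]
act, abstain = confidence_gate(ranked)
print([i[0] for i in act])      # → ['a', 'c']
print([i[0] for i in abstain])  # → ['b']  (cold-start item the gate defers on)
```

The paper's central caveat is that a structural signal like this count can fail badly under contextual drift, where a well-observed item's score is stale rather than uncertain.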

Computer Science > Artificial Intelligence

arXiv:2603.09947 (cs)
[Submitted on 10 Mar 2026]

Title: The Confidence Gate Theorem: When Should Ranked Decision Systems Abstain?

Authors: Ronald Doku
Abstract: Ranked decision systems -- recommenders, ad auctions, clinical triage queues -- must decide when to intervene in ranked outputs and when to abstain. We study when confidence-based abstention monotonically improves decision quality, and when it fails. The formal conditions are simple: rank-alignment and no inversion zones. The substantive contribution is identifying why these conditions hold or fail: the distinction between structural uncertainty (missing data, e.g., cold-start) and contextual uncertainty (missing context, e.g., temporal drift). Empirically, we validate this distinction across three domains: collaborative filtering (MovieLens, 3 distribution shifts), e-commerce intent detection (RetailRocket, Criteo, Yoochoose), and clinical pathway triage (MIMIC-IV). Structural uncertainty produces near-monotonic abstention gains in all domains; structurally grounded confidence signals (observation counts) fail under contextual drift, producing as many monotonicity violations as random abstention on our MovieLens temporal split. Context-aware alternatives -- ensemble disagreement and recency features -- substantially narrow the gap (reducing violations from 3 to 1--2) but do not fully restore monotonicity, suggesting that contextual uncertainty poses qualitatively different challenges. Exception labels defined from residuals degrade substantially under distribution shift (AUC drops from 0.71 to 0.61--0.62 across three splits), providing a clean negative result against the common practice of exception-based intervention. The results provide a practical deployment diagnostic: check C1 and C2 on held-out data before deploying a confidence gate, and match the confidence signal to the dominant uncertainty type.
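The held-out diagnostic the abstract closes with can be sketched concretely: sweep abstention thresholds, measure decision quality on the retained set, and count monotonicity violations (quality dropping as the gate tightens). The data, metric (selective accuracy), and threshold grid below are illustrative assumptions, not the paper's exact C1/C2 test.

```python
# Sketch of a held-out monotonicity check for a confidence gate
# (illustrative; the paper's C1/C2 conditions are formal, this is
# the empirical diagnostic they suggest running before deployment).

def monotonicity_violations(confidences, correct, thresholds):
    """Count threshold steps where selective accuracy decreases.

    confidences: per-decision confidence scores on held-out data.
    correct: 1 if the gated decision was right, else 0.
    thresholds: increasing abstention thresholds to sweep.
    """
    accs = []
    for t in thresholds:
        kept = [ok for c, ok in zip(confidences, correct) if c >= t]
        if kept:  # skip thresholds that abstain on everything
            accs.append(sum(kept) / len(kept))
    # A violation: tightening the gate *lowered* selective accuracy.
    return sum(1 for a, b in zip(accs, accs[1:]) if b < a - 1e-12)

# Toy held-out set where confidence tracks correctness (the
# rank-aligned, structural-uncertainty case): zero violations.
conf = [0.1, 0.2, 0.4, 0.5, 0.7, 0.9]
ok   = [0,   0,   1,   0,   1,   1]
print(monotonicity_violations(conf, ok, [0.0, 0.3, 0.6, 0.8]))  # → 0
```

Under contextual drift the paper finds this count becomes nonzero for structural signals like observation counts, which is exactly the failure mode the diagnostic is meant to surface before deployment.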
Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.09947 [cs.AI]
  (or arXiv:2603.09947v1 [cs.AI] for this version)
  https://doi.org/10.48550/arXiv.2603.09947

Submission history

From: Ronald Doku
[v1] Tue, 10 Mar 2026 17:44:10 UTC (34 KB)