AI Navigate

The Confidence Gate Theorem: When Should Ranked Decision Systems Abstain?

arXiv cs.AI / March 11, 2026

Ideas & Deep Analysis | Models & Research

Key Points

  • Ranked decision systems such as recommenders, ad auctions, and clinical triage queues must decide whether to intervene in their ranked outputs or to abstain, in order to improve decision quality.
  • Two key conditions under which confidence-based abstention monotonically improves decisions are identified: rank-alignment and the absence of inversion zones. The analysis distinguishes structural uncertainty from contextual uncertainty.
  • Empirical validation across three domains (collaborative filtering, e-commerce intent detection, and clinical triage) shows that structural uncertainty yields consistent abstention gains, while contextual uncertainty makes abstention unreliable.
  • Context-aware confidence signals partially mitigate the problems caused by contextual uncertainty but do not fully restore monotonic improvement, highlighting the complexity of real-world deployment.
  • As a practical guideline, the paper recommends validating these conditions on held-out data before deploying a confidence gate, and matching the confidence signal to the dominant uncertainty type.
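The core mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses an observation count as a structural confidence signal (the kind the paper says works under cold-start), and the threshold and item format are illustrative assumptions.

```python
# Minimal confidence-gate sketch (illustrative, not the paper's code).
# Abstain from acting on an item when a structural confidence signal
# (here, how many observations back its score) is too low.

def confidence_gate(items, min_observations=20):
    """Split ranked items into (act, abstain) by observation count.

    Each item is a tuple (item_id, score, n_observations); the
    threshold of 20 is an arbitrary illustrative choice.
    """
    act, abstain = [], []
    for item_id, score, n_obs in items:
        if n_obs >= min_observations:
            act.append((item_id, score, n_obs))      # enough data: trust the ranking
        else:
            abstain.append((item_id, score, n_obs))  # cold-start: defer or fall back
    return act, abstain

ranked = [("a", 0.9, 120), ("b", 0.8, 3), ("c", 0.7, 45)]
act, abstain = confidence_gate(ranked)
print([i[0] for i in act])      # → ['a', 'c']
print([i[0] for i in abstain])  # → ['b']  (cold-start item the gate defers on)
```

The paper's central caveat is that a structural signal like this count can fail badly under contextual drift, where a well-observed item's score is stale rather than uncertain.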

Computer Science > Artificial Intelligence

arXiv:2603.09947 (cs)
[Submitted on 10 Mar 2026]

Title: The Confidence Gate Theorem: When Should Ranked Decision Systems Abstain?

Authors: Ronald Doku
Abstract: Ranked decision systems -- recommenders, ad auctions, clinical triage queues -- must decide when to intervene in ranked outputs and when to abstain. We study when confidence-based abstention monotonically improves decision quality, and when it fails. The formal conditions are simple: rank-alignment and no inversion zones. The substantive contribution is identifying why these conditions hold or fail: the distinction between structural uncertainty (missing data, e.g., cold-start) and contextual uncertainty (missing context, e.g., temporal drift). Empirically, we validate this distinction across three domains: collaborative filtering (MovieLens, 3 distribution shifts), e-commerce intent detection (RetailRocket, Criteo, Yoochoose), and clinical pathway triage (MIMIC-IV). Structural uncertainty produces near-monotonic abstention gains in all domains; structurally grounded confidence signals (observation counts) fail under contextual drift, producing as many monotonicity violations as random abstention on our MovieLens temporal split. Context-aware alternatives -- ensemble disagreement and recency features -- substantially narrow the gap (reducing violations from 3 to 1--2) but do not fully restore monotonicity, suggesting that contextual uncertainty poses qualitatively different challenges. Exception labels defined from residuals degrade substantially under distribution shift (AUC drops from 0.71 to 0.61--0.62 across three splits), providing a clean negative result against the common practice of exception-based intervention. The results provide a practical deployment diagnostic: check C1 and C2 on held-out data before deploying a confidence gate, and match the confidence signal to the dominant uncertainty type.
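The held-out diagnostic the abstract closes with can be sketched concretely: sweep abstention thresholds, measure decision quality on the retained set, and count monotonicity violations (quality dropping as the gate tightens). The data, metric (selective accuracy), and threshold grid below are illustrative assumptions, not the paper's exact C1/C2 test.

```python
# Sketch of a held-out monotonicity check for a confidence gate
# (illustrative; the paper's C1/C2 conditions are formal, this is
# the empirical diagnostic they suggest running before deployment).

def monotonicity_violations(confidences, correct, thresholds):
    """Count threshold steps where selective accuracy decreases.

    confidences: per-decision confidence scores on held-out data.
    correct: 1 if the gated decision was right, else 0.
    thresholds: increasing abstention thresholds to sweep.
    """
    accs = []
    for t in thresholds:
        kept = [ok for c, ok in zip(confidences, correct) if c >= t]
        if kept:  # skip thresholds that abstain on everything
            accs.append(sum(kept) / len(kept))
    # A violation: tightening the gate *lowered* selective accuracy.
    return sum(1 for a, b in zip(accs, accs[1:]) if b < a - 1e-12)

# Toy held-out set where confidence tracks correctness (the
# rank-aligned, structural-uncertainty case): zero violations.
conf = [0.1, 0.2, 0.4, 0.5, 0.7, 0.9]
ok   = [0,   0,   1,   0,   1,   1]
print(monotonicity_violations(conf, ok, [0.0, 0.3, 0.6, 0.8]))  # → 0
```

Under contextual drift the paper finds this count becomes nonzero for structural signals like observation counts, which is exactly the failure mode the diagnostic is meant to surface before deployment.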
Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.09947 [cs.AI]
  (or arXiv:2603.09947v1 [cs.AI] for this version)
  https://doi.org/10.48550/arXiv.2603.09947

Submission history

From: Ronald Doku
[v1] Tue, 10 Mar 2026 17:44:10 UTC (34 KB)