AI Navigate

Quantifying Uncertainty in AI Visibility: A Statistical Framework for Generative Search Measurement

arXiv cs.AI / 3/11/2026

Ideas & Deep Analysis

Key Points

  • AI-powered answer engines are non-deterministic: identical queries submitted at different times can return different responses and cite different sources.
  • Current measurement methods for domain visibility in generative search rely on single-run point estimates that treat citation shares as fixed values, which this paper argues is misleading.
  • The study analyzes citation variability across three generative search platforms (Perplexity Search, OpenAI SearchGPT, and Google Gemini) and finds that citation distributions follow a power law and vary substantially across repeated samples, undermining the stability of domain rankings.
  • The paper demonstrates that many observed differences in citation visibility fall within the noise floor of the measurement process and advocates reporting citation visibility metrics with uncertainty estimates; a minimal bootstrap sketch follows this list.
  • Practical guidance is provided on the sample sizes required for interpretable confidence intervals, improving the reliability of domain visibility measurements in generative search.
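To make the "sample estimator" framing concrete, the sketch below computes a percentile-bootstrap confidence interval for one domain's citation share across repeated runs. This is an illustration, not the paper's code: the per-run shares are invented, and `bootstrap_ci` is a hypothetical helper whose details may differ from the paper's procedure.

```python
import random

def bootstrap_ci(samples, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of repeated-run measurements."""
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement n_boot times; record each resample's mean.
    means = sorted(
        sum(rng.choice(samples) for _ in range(n)) / n
        for _ in range(n_boot)
    )
    # Percentile interval: cut alpha/2 from each tail of the bootstrap means.
    return means[int(n_boot * alpha / 2)], means[int(n_boot * (1 - alpha / 2)) - 1]

# One domain's citation share in each of 15 repeated runs (invented values).
shares = [0.12, 0.08, 0.15, 0.10, 0.05, 0.11, 0.09,
          0.14, 0.07, 0.13, 0.10, 0.06, 0.12, 0.16, 0.09]
point = sum(shares) / len(shares)
lo, hi = bootstrap_ci(shares)
print(f"point estimate {point:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

When two domains' intervals overlap heavily, the apparent gap between their single-run shares sits inside the noise floor the key points describe.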


arXiv:2603.08924 (stat)
[Submitted on 9 Mar 2026]

Title: Quantifying Uncertainty in AI Visibility: A Statistical Framework for Generative Search Measurement

Authors: Ronald Sielinski
Abstract: AI-powered answer engines are inherently non-deterministic: identical queries submitted at different times can produce different responses and cite different sources. Despite this stochastic behavior, current approaches to measuring domain visibility in generative search typically rely on single-run point estimates of citation share and prevalence, implicitly treating them as fixed values. This paper argues that citation visibility metrics should be treated as sample estimators of an underlying response distribution rather than fixed values. We conduct an empirical study of citation variability across three generative search platforms (Perplexity Search, OpenAI SearchGPT, and Google Gemini) using repeated sampling across three consumer product topics. Two sampling regimes are employed: daily collections over nine days and high-frequency sampling at ten-minute intervals. We show that citation distributions follow a power-law form and exhibit substantial variability across repeated samples. Bootstrap confidence intervals reveal that many apparent differences between domains fall within the noise floor of the measurement process. Distribution-wide rank stability analysis further demonstrates that citation rankings are unstable across samples, not only among top-ranked domains but throughout the frequently cited domain set. These findings demonstrate that single-run visibility metrics provide a misleadingly precise picture of domain performance in generative search. We argue that citation visibility must be reported with uncertainty estimates and provide practical guidance for sample sizes required to achieve interpretable confidence intervals.
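The abstract's closing point about sample sizes can be illustrated with a standard normal-approximation calculation for a proportion. This is a generic sketch under assumed numbers, not the paper's actual guidance: it estimates how many repeated runs are needed before a 95% confidence interval on a citation share reaches a chosen half-width.

```python
import math

def runs_needed(p_hat: float, half_width: float, z: float = 1.96) -> int:
    """Solve z * sqrt(p*(1-p)/n) <= h for n, i.e. n >= z^2 * p*(1-p) / h^2."""
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / half_width ** 2)

# Illustrative citation shares; the paper's platform-specific values may differ.
for share in (0.05, 0.10, 0.25):
    print(f"share {share:.2f}: {runs_needed(share, half_width=0.02)} runs for a +/-0.02 CI")
```

This treats each run as a Bernoulli trial for whether the domain is cited, a simplification the paper's bootstrap intervals avoid, but the calculation shows why small citation shares demand many repeated runs before intervals become tight.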
Subjects: Applications (stat.AP); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
Cite as: arXiv:2603.08924 [stat.AP]
  (or arXiv:2603.08924v1 [stat.AP] for this version)
  https://doi.org/10.48550/arXiv.2603.08924

Submission history

From: Ronald Sielinski
[v1] Mon, 9 Mar 2026 20:47:22 UTC (3,233 KB)