Generative Augmented Inference

arXiv cs.LG / April 17, 2026

📰 News · Models & Research

Key Points

  • The paper introduces Generative Augmented Inference (GAI), a framework that uses LLM-generated outputs as features when estimating models of outcomes that would otherwise require expensive human labels.
  • Unlike standard proxy approaches that treat AI predictions as direct substitutes for true labels, GAI is designed to remain reliable even when the relationship between AI outputs and human labels is weak, complex, or misspecified.
  • Using an orthogonal moment construction, GAI enables consistent estimation and valid inference with flexible, nonparametric relationships between LLM signals and human labels.
  • The authors prove asymptotic normality and a “safe default” property: GAI cannot worsen performance versus human-data-only baselines and improves efficiency when auxiliary signals are predictive.
  • Experiments in areas like conjoint analysis, retail pricing, and health insurance choice show large reductions in human labeling needs (e.g., 50% error reduction and 75%+ label reduction in conjoint analysis, 90%+ in health insurance) while maintaining or improving decision accuracy and confidence interval coverage.
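To make the orthogonal-moment idea behind GAI concrete, here is a minimal, hedged sketch. It is not the paper's estimator; it illustrates the general recipe the summary describes, in the style of debiased/prediction-powered estimation: cross-fit a flexible model m(s) of the human label given the AI signal, then combine the cheap-sample mean of m(s) with a labeled-sample residual correction, so a weak or misspecified m costs efficiency but not validity. All variable names and the simulated data are invented for illustration.

```python
# Hedged sketch of an orthogonalized estimator using cheap AI signals.
# NOT the paper's GAI implementation; a simplified illustration only.
import numpy as np

rng = np.random.default_rng(0)

# Simulate: target is the population mean of y (a human-labeled outcome).
# s is an inexpensive AI-generated signal correlated with y.
N = 10_000            # units with an AI signal (cheap)
n = 500               # units with human labels (expensive)
s_all = rng.normal(size=N)
y_all = 2.0 + 1.5 * s_all + rng.normal(size=N)   # true mean of y is 2.0
labeled = rng.choice(N, size=n, replace=False)
s_lab, y_lab = s_all[labeled], y_all[labeled]

# Cross-fit an outcome model m(s) ~ E[y | s] on the labeled data, so each
# labeled unit is scored by a model trained on the *other* fold.
perm = rng.permutation(n)
half = n // 2
m_pred_lab = np.empty(n)
for train, test in [(perm[:half], perm[half:]), (perm[half:], perm[:half])]:
    coef = np.polyfit(s_lab[train], y_lab[train], deg=1)
    m_pred_lab[test] = np.polyval(coef, s_lab[test])

# Apply a model fit on all labels to the full cheap sample.
coef_full = np.polyfit(s_lab, y_lab, deg=1)
m_pred_all = np.polyval(coef_full, s_all)

# Orthogonalized estimate: cheap-sample mean of m(s) plus the
# labeled-sample residual correction. If m(s) is uninformative, the
# correction restores the human-data-only answer ("safe default").
theta_aug = m_pred_all.mean() + (y_lab - m_pred_lab).mean()
theta_human_only = y_lab.mean()

print(theta_aug, theta_human_only)
```

With a predictive signal, the residuals have much lower variance than the raw labels, so the augmented estimate concentrates faster around the truth than the human-only mean; with an uninformative signal it degrades gracefully to the baseline, mirroring the "safe default" property claimed above.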

Abstract

Data-driven operations management often relies on parameters estimated from costly human-generated labels. Recent advances in large language models (LLMs) and other AI systems offer inexpensive auxiliary data, but introduce a new challenge: AI outputs are not direct observations of the target outcomes, and may involve high-dimensional representations with complex and unknown relationships to human labels. Conventional methods use AI predictions as direct proxies for true labels, which can be inefficient or unreliable when this relationship is weak or misspecified. We propose Generative Augmented Inference (GAI), a general framework that incorporates AI-generated outputs as informative features for estimating models of human-labeled outcomes. GAI uses an orthogonal moment construction that enables consistent estimation and valid inference under flexible, nonparametric relationships between LLM-generated outputs and human labels. We establish asymptotic normality and show a "safe default" property: relative to human-data-only estimators, GAI weakly improves estimation efficiency under arbitrary auxiliary signals and yields strict gains whenever the auxiliary information is predictive. Empirically, GAI outperforms benchmarks across diverse settings. In conjoint analysis with weak auxiliary signals, GAI reduces estimation error by about 50% and lowers human labeling requirements by over 75%. In retail pricing, where all methods access the same auxiliary inputs, GAI consistently outperforms alternative estimators, highlighting the value of its construction rather than differences in information. In health insurance choice, it cuts labeling requirements by over 90% while maintaining decision accuracy. Across applications, GAI improves confidence interval coverage without inflating width. Overall, GAI provides a principled and scalable approach to integrating AI-generated information.