R&F-Inventory: A Large-Scale Dataset for Monotonic Inventory Estimation in Reach and Frequency Advertising

arXiv cs.LG / 4/21/2026


Key Points

  • The paper introduces and releases a large-scale Reach & Frequency (R&F) contract inventory estimation dataset focused on controllable delivery of UV (unique visitors) and PV (page views) under targeting, scheduling, and frequency constraints.
  • Unlike many existing datasets that use independent samples, the dataset provides multiple budget points within the same R&F context, enabling full “budget-performance curves” for UV and PV.
  • It explicitly incorporates time-window-based frequency control (e.g., “no more than 3 times within 5 days”) and naturally satisfies monotonicity and diminishing marginal returns along the budget and scheduling dimensions.
  • The authors derive an exposure ceiling as a theoretical consistency check to assess data quality and the feasibility of model predictions.
  • They define two standardized benchmark tasks (single-point prediction and budget-curve reconstruction) and provide reproducible baseline methods and evaluation protocols, with accompanying experiment code on GitHub.
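
The consistency properties listed above can be checked mechanically. The following is a minimal Python sketch, not the authors' code: the function names, the sample numbers, and in particular the per-user cap formula `cap * ceil(schedule_days / window_days)` are assumptions made for illustration. That formula is a simple upper bound for a rule such as "no more than 3 times within 5 days" (any schedule of D days splits into at most ceil(D / w) blocks of w consecutive days, each holding at most `cap` exposures); the ceiling derived in the paper may be tighter or defined differently.

```python
from math import ceil


def is_non_decreasing(values, tol=1e-9):
    """True if each value is at least as large as the one before it."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))


def has_diminishing_returns(budgets, values, tol=1e-9):
    """True if the gain per unit of budget never increases along the curve.

    Assumes budgets are strictly increasing (no repeated budget points).
    """
    points = list(zip(budgets, values))
    slopes = [(v1 - v0) / (b1 - b0) for (b0, v0), (b1, v1) in zip(points, points[1:])]
    return all(s1 <= s0 + tol for s0, s1 in zip(slopes, slopes[1:]))


def per_user_exposure_cap(schedule_days, cap, window_days):
    """Assumed upper bound on exposures one user can receive over the schedule."""
    return cap * ceil(schedule_days / window_days)


def within_exposure_ceiling(uv, pv, schedule_days, cap, window_days):
    """True if PV does not exceed UV times the assumed per-user cap."""
    return pv <= uv * per_user_exposure_cap(schedule_days, cap, window_days)


# Hypothetical curve for one context: 10-day schedule,
# frequency rule "no more than 3 times within 5 days".
budgets = [100, 200, 400, 800]
uv = [1000, 1800, 3000, 4200]
pv = [2500, 4800, 8500, 12600]

curve_ok = (
    is_non_decreasing(uv)
    and is_non_decreasing(pv)
    and has_diminishing_returns(budgets, uv)
    and has_diminishing_returns(budgets, pv)
    and all(within_exposure_ceiling(u, p, 10, 3, 5) for u, p in zip(uv, pv))
)
print(curve_ok)
```

The same three checks can be applied to model predictions as well as to raw observations, which is how a ceiling can serve as a feasibility test for predicted curves.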

Abstract

Reach and Frequency (R&F) contract advertising is a widely used form of brand advertising. Unlike performance advertising, R&F contracts emphasize controllable delivery of UV and PV under given targeting, scheduling, and frequency control constraints. In practical systems, advertisers creating an R&F contract typically need to see, in real time, how UV and PV change across budget levels. However, most publicly available advertising datasets are built from independent samples and do not capture the core structure of R&F contracts: the "budget-performance curve" for UV and PV. This paper proposes and releases a large-scale R&F contract inventory estimation dataset. The dataset takes the R&F contract context, made up of targeting, scheduling, and frequency control, as its basic unit and provides UV and PV observations at multiple budget points within the same context, thus forming a complete budget-performance curve. The dataset explicitly includes a time-window-based frequency control mechanism (e.g., "no more than 3 times within 5 days") and naturally satisfies monotonicity and diminishing marginal returns in the budget and scheduling dimensions. We further derive a theoretical maximum exposure ceiling and use it as a consistency check to evaluate data quality and the feasibility of model predictions. Using this dataset, this paper defines two standardized benchmark tasks, single-point performance prediction and budget-performance curve reconstruction, and provides a set of reproducible baseline methods and evaluation protocols. The dataset can support systematic research on problems such as structural constraint learning, monotonic regression, curve consistency modeling, and R&F contract planning. The code for our experiments can be found at https://github.com/pengyunshan/RF-Inventory.
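
To make the curve-reconstruction task and the mention of monotonic regression concrete, here is a minimal Python sketch of the pool-adjacent-violators algorithm, a standard method for least-squares non-decreasing fits. It is an illustration only and is not claimed to be one of the paper's baselines; the function names `pav_non_decreasing` and `fit_budget_curve` and the sample values are hypothetical.

```python
def pav_non_decreasing(values, weights=None):
    """Pool-adjacent-violators: least-squares non-decreasing fit to a sequence."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block is [mean, total_weight, point_count]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the newest block breaks the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def fit_budget_curve(budgets, values):
    """Sort observations by budget, then enforce a non-decreasing curve."""
    pairs = sorted(zip(budgets, values))
    sorted_budgets = [b for b, _ in pairs]
    fitted = pav_non_decreasing([v for _, v in pairs])
    return sorted_budgets, fitted


# Hypothetical noisy UV predictions for one R&F context, given in arbitrary order.
budgets = [400, 100, 200, 800]
noisy_uv = [1700, 1000, 1800, 3000]

curve_budgets, curve_uv = fit_budget_curve(budgets, noisy_uv)
print(list(zip(curve_budgets, curve_uv)))
```

This post-processing step only guarantees monotonicity in budget. It does not enforce diminishing marginal returns or an exposure ceiling, which would need additional constraints.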