Text-to-Distribution Prediction with Quantile Tokens and Neighbor Context
arXiv cs.CL / 4/23/2026
Key Points
- The paper proposes Quantile Token Regression for text-to-distribution prediction (a form of text regression) in which the model must predict an entire conditional distribution rather than a single value.
- It introduces dedicated quantile tokens inserted into the input sequence so self-attention creates direct input-to-quantile pathways for each predicted quantile.
- The method improves local grounding by retrieving semantically similar neighbor instances and using their empirical distributions as contextual evidence for more accurate estimates.
- It includes theoretical analysis that clarifies which loss functions correspond to which distributional objectives in quantile regression.
- Experiments on Inside Airbnb and StackSample using LLMs from 1.7B to 14B parameters show consistent improvements over baselines, including lower MAPE and substantially narrower, sharper prediction intervals.
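The summary does not spell out the training objective, but quantile regression of the kind described above is typically trained with the pinball (quantile) loss, one term per predicted quantile level. The sketch below is a minimal, hypothetical illustration of that standard loss; the function name, quantile levels, and prediction values are illustrative and not taken from the paper.

```python
def pinball_loss(y_true: float, y_pred: float, tau: float) -> float:
    """Pinball (quantile) loss at level tau.

    Under-prediction is penalized by tau and over-prediction by (1 - tau),
    so minimizing this loss yields the tau-th conditional quantile.
    """
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1.0) * diff)

# One loss term per quantile token: each token's prediction is scored
# at its own quantile level (levels and values here are illustrative).
levels = [0.1, 0.5, 0.9]
preds = [80.0, 100.0, 130.0]  # hypothetical per-token quantile predictions
y = 110.0
total = sum(pinball_loss(y, p, t) for t, p in zip(levels, preds))
```

Summing these per-level terms gives a single scalar objective, which is what would let each dedicated quantile token specialize in one point of the conditional distribution.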