Experimental evidence of progressive ChatGPT models self-convergence
arXiv cs.AI / 3/16/2026
Key Points
- The paper investigates how recursive training with synthetic data can cause "model self-convergence," lowering the diversity of outputs across newer ChatGPT releases.
- It quantifies output diversity with a text-similarity metric and compares multiple ChatGPT versions over time, finding a measurable decline even at a sampling temperature of 1.
- The authors attribute the diversity loss to increasing amounts of synthetic data in training sets, potentially due to LLM-generated content permeating the internet.
- They coin the term "model self-convergence" for this longitudinal effect: outputs grow increasingly similar across successive model versions.
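The paper's exact similarity metric is not specified in this summary; as a minimal sketch of the general idea, one could measure diversity as one minus the mean pairwise similarity across repeated samples from a model (here using token-set Jaccard similarity purely for illustration):

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def output_diversity(samples: list[str]) -> float:
    """Diversity = 1 - mean pairwise similarity over all sample pairs.

    Identical samples give 0.0; fully disjoint samples give 1.0.
    """
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 0.0
    mean_sim = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim
```

Tracking such a score for the same prompt set across successive model releases would show the decline the authors describe, though a real study would likely use an embedding- or n-gram-based similarity rather than this toy measure.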


