The basic idea is pretty simple. You give the tool a few seed prompts. It generates instruction-response pairs, an LLM scores each one, the good ones go into your training set and the bad ones become the seeds for the next round. Each cycle the model is essentially practicing on what it failed at before. You can run the judge completely locally with Ollama if you do not want to send data to any API. The fine-tuning at the end uses Unsloth on a free Colab GPU, so the whole thing is doable without spending money. It is more of a practical tool than a research project, but the idea of using failure cases as curriculum is something I find genuinely interesting. Would love to hear if anyone has done something similar. GitHub project link is in the comments below 👇
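The cycle described above can be sketched roughly as follows. This is only an illustration and not the project's actual code: `run_cycles`, the `threshold` value, and the toy `generate` and `judge` stand-ins are all hypothetical, and the sketch assumes that a "bad" pair is recycled by reusing its instruction as the next seed.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (instruction, response)


def run_cycles(
    seeds: List[str],
    generate: Callable[[str], Pair],
    judge: Callable[[Pair], float],
    threshold: float = 0.7,  # assumed cut-off between "good" and "bad"
    cycles: int = 3,
) -> List[Pair]:
    """Keep pairs scoring >= threshold; failed instructions seed the next cycle."""
    training_set: List[Pair] = []
    for _ in range(cycles):
        failures: List[str] = []
        for seed in seeds:
            pair = generate(seed)
            if judge(pair) >= threshold:
                training_set.append(pair)
            else:
                failures.append(pair[0])  # assumption: recycle the instruction
        if not failures:
            break  # nothing left to practice on
        seeds = failures
    return training_set


# Toy stand-ins so the sketch runs without any model.
attempts: Dict[str, int] = {}


def toy_generate(seed: str) -> Pair:
    attempts[seed] = attempts.get(seed, 0) + 1
    return seed, f"{seed} (attempt {attempts[seed]})"


def toy_judge(pair: Pair) -> float:
    instruction, response = pair
    # Pretend "hard" instructions only pass from the second attempt on.
    if instruction.startswith("hard") and "attempt 1" in response:
        return 0.2
    return 0.9


result = run_cycles(["easy: add 2+2", "hard: prove it"], toy_generate, toy_judge)
```

In a real run, `generate` and `judge` would wrap model calls, and `training_set` would be written out for the fine-tuning step.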
Made a tool that builds its own training data and improves each cycle by learning from what it got wrong
Reddit r/artificial / 5/5/2026
💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage · Models & Research
Key Points
- The post describes a workflow where a tool takes a few seed prompts, generates instruction–response pairs, and uses an LLM to judge the outputs; good examples go into the training set, while bad ones are fed back as seeds for the next iteration.
- It emphasizes an iterative “practice on what it failed at” approach, effectively creating a self-improving curriculum by focusing on failure cases.
- The author notes that the judging step can run fully locally using Ollama to avoid sending data to external APIs.
- The fine-tuning stage is implemented using Unsloth on a free Colab GPU, making the whole process accessible without paid resources.
- The project is framed as a practical tool rather than a research project, with an invitation for others to share similar work.
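For the local judging step, one possible shape is a small wrapper around Ollama's local HTTP endpoint (`/api/generate` on port 11434 by default). This is a sketch under assumptions, not the author's implementation: the model name `llama3`, the 1-to-10 scale, the prompt wording, and the helper names `build_judge_request`, `parse_score`, and `judge` are all hypothetical.

```python
import json
import re
import urllib.request
from typing import Optional

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_judge_request(
    instruction: str, response: str, model: str = "llama3"  # model name is an assumption
) -> urllib.request.Request:
    """Build a non-streaming generate request asking the model for a 1-10 score."""
    prompt = (
        "Rate the response to the instruction on a scale of 1 to 10. "
        "Reply with the number only.\n\n"
        f"Instruction: {instruction}\nResponse: {response}\nScore:"
    )
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_score(text: str) -> Optional[int]:
    """Return the first standalone integer from 1 to 10 in the reply, if any."""
    match = re.search(r"\b(10|[1-9])\b", text)
    return int(match.group(1)) if match else None


def judge(instruction: str, response: str, model: str = "llama3") -> Optional[int]:
    """Score one pair with a locally running Ollama server (not called here)."""
    req = build_judge_request(instruction, response, model)
    with urllib.request.urlopen(req, timeout=120) as resp:
        reply = json.loads(resp.read())["response"]
    return parse_score(reply)
```

Calling `judge(...)` requires an Ollama server running on the same machine with the chosen model already pulled; nothing leaves the machine, which is the point the post makes about avoiding external APIs.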