Made a tool that builds its own training data and improves each cycle by learning from what it got wrong

Reddit r/artificial / 5/5/2026

💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage · Models & Research

Key Points

  • The post describes a workflow where a tool takes a few seed prompts, generates instruction–response pairs, and has an LLM judge each one; the good examples go into the training set, while the bad ones are fed back as seeds for the next iteration.
  • It emphasizes an iterative “practice on what it failed at” approach, effectively creating a self-improving curriculum by focusing on failure cases.
  • The author notes that the judging step can run fully locally using Ollama to avoid sending data to external APIs.
  • The fine-tuning stage is implemented using Unsloth on a free Colab GPU, making the whole process accessible without paid resources.
  • The project is framed as a practical tool rather than a research paper, with an invitation for others to share similar work.

The basic idea is pretty simple. You give it a few seed prompts. It generates instruction-response pairs, an LLM scores each one, the good ones go into your training set, and the bad ones become the seeds for the next round. Each cycle the model is essentially practicing on what it failed at before.
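As a rough sketch of one cycle of that loop (the function names and the 1–10 scoring threshold are my assumptions, not from the tool itself):

```python
# One generate -> judge -> split cycle. generate() and judge() are stubs
# standing in for the real LLM calls (hypothetical names and behavior).
SCORE_THRESHOLD = 7  # assumed cutoff on a 1-10 judge scale

def generate(seed: str) -> dict:
    # Placeholder: a real implementation would prompt an LLM with the seed.
    return {"instruction": seed, "response": f"answer to: {seed}"}

def judge(pair: dict) -> int:
    # Placeholder: a real judge would score the pair with an LLM.
    return 9 if pair["response"] else 0

def run_cycle(seeds: list[str]) -> tuple[list[dict], list[str]]:
    """Return (accepted training examples, seeds for the next cycle)."""
    train, next_seeds = [], []
    for seed in seeds:
        pair = generate(seed)
        if judge(pair) >= SCORE_THRESHOLD:
            train.append(pair)        # good pairs join the training set
        else:
            next_seeds.append(seed)   # failures become next round's seeds
    return train, next_seeds

train, next_seeds = run_cycle(["explain recursion", "summarize a diff"])
```

Each call returns the accepted pairs plus the seeds to retry, so chaining `run_cycle` on `next_seeds` gives the "practice on what it failed at" behavior.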

You can run the judge completely locally with Ollama if you do not want to send data to any API.
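A local judge via the `ollama` Python client could look roughly like this; the model name and scoring prompt are illustrative, and it assumes an Ollama server is running locally with the model pulled:

```python
import re

def parse_score(text: str) -> int:
    """Pull the first integer out of the judge's reply; 0 if none found."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else 0

def judge_locally(instruction: str, response: str, model: str = "llama3") -> int:
    # Lazy import so the sketch loads without the client installed
    # (`pip install ollama`); nothing leaves the machine.
    import ollama
    reply = ollama.chat(model=model, messages=[{
        "role": "user",
        "content": (
            "Rate this instruction-response pair from 1 to 10. "
            "Reply with just the number.\n\n"
            f"Instruction: {instruction}\nResponse: {response}"
        ),
    }])
    return parse_score(reply["message"]["content"])
```

Parsing the score defensively matters in practice, since small local models don't always reply with a bare number.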

The fine-tuning at the end uses Unsloth on a free Colab GPU so the whole thing is doable without spending money.
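Before the fine-tuning step, the accepted pairs have to land in a format the trainer can load. One common option (an assumption on my part — the post doesn't specify the format) is Alpaca-style JSONL, which Unsloth's Colab notebooks can read via `datasets.load_dataset("json", data_files=...)`:

```python
import json
from pathlib import Path

def write_training_file(pairs: list[dict], path: str = "train.jsonl") -> int:
    """Dump accepted instruction-response pairs as JSONL, one record per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for pair in pairs:
            record = {
                "instruction": pair["instruction"],
                "input": "",  # no extra context in this workflow
                "output": pair["response"],
            }
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(pairs)

n = write_training_file(
    [{"instruction": "explain recursion", "response": "A function that calls itself..."}]
)
```

From there the free-Colab part is just the standard Unsloth recipe: load a 4-bit base model, attach LoRA adapters, and train on the file above.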

It is more of a practical tool than a research project but the idea of using failure cases as curriculum is something I find genuinely interesting.

Would love to hear if anyone has done something similar.

GitHub project link is in the comments below 👇

submitted by /u/gvij