Show HN: I built a tiny LLM to demystify how language models work

Hacker News / 4/6/2026

💬 Opinion · Tools & Practical Usage · Models & Research

Key Points

  • The author describes building a small (~9M parameter) transformer-based LLM from scratch using about 130 lines of PyTorch, trained on ~60K synthetic conversation examples.
  • They report the model can be trained in roughly 5 minutes on free Google Colab hardware (T4), making the experiment practical for learning.
  • The project is positioned as a way to demystify how language models work by letting users fork the code and modify the model’s “personality.”
  • Community discussion on Show HN focuses on implementation details and how to adapt the approach for other characters or learning goals.

Built a ~9M param LLM from scratch to understand how they actually work. Vanilla transformer, 60K synthetic conversations, ~130 lines of PyTorch. Trains in 5 min on a free Colab T4. The fish thinks the meaning of life is food.
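A minimal decoder-only transformer in that spirit might look like the sketch below. The hyperparameters and class name are illustrative guesses for a toy model, not the author's actual code or settings:

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """A minimal decoder-only transformer sketch (hyperparameters are
    illustrative, not taken from the Show HN project)."""
    def __init__(self, vocab_size=256, d_model=64, n_heads=4,
                 n_layers=2, max_len=128):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)   # token embeddings
        self.pos = nn.Embedding(max_len, d_model)      # learned positions
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)     # next-token logits

    def forward(self, idx):
        b, t = idx.shape
        x = self.tok(idx) + self.pos(torch.arange(t, device=idx.device))
        # Causal mask so each position only attends to earlier tokens
        mask = nn.Transformer.generate_square_subsequent_mask(t)
        x = self.blocks(x, mask=mask)
        return self.head(x)

model = TinyLM()
logits = model(torch.randint(0, 256, (2, 16)))
print(logits.shape)  # one logit vector per token position
```

Training such a model is the standard next-token cross-entropy loop; swapping the "personality" amounts to changing the synthetic conversations it is trained on.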

Fork it and swap the personality for your own character.


Comments URL: https://news.ycombinator.com/item?id=47655408

Points: 227

# Comments: 17
