Karpathy's MicroGPT running at 50,000 tps on an FPGA

Reddit r/LocalLLaMA / 5/3/2026

📰 News · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage · Models & Research

Key Points

  • Karpathy’s MicroGPT is reportedly running at around 50,000 tokens per second on an FPGA using a very small model with just 4,192 parameters.
  • The write-up emphasizes that much of the throughput comes from keeping model weights on the chip (onboard ROM) instead of fetching them from external memory.
  • The post notes a practical limitation: with current FPGAs and 16-bit weights, onboard ROM caps the model size at roughly 20–30 million parameters.
  • It suggests that future increases in onboard ROM capacity—or FPGAs specialized for small language models (SLMs)—could enable larger models to achieve similar high inference speeds.
  • Project details and the related repository are provided for readers to inspect and reproduce the approach.

Sure, it's only 4,192 parameters, but it's a start. Project write-up: https://v2.talos.wtf/ and GitHub repository: https://github.com/Luthiraa/TALOS-V2

Some of the speed comes from keeping the weights on-chip rather than fetching them from external memory. With onboard ROM and 16-bit weights, current FPGAs max out at roughly 20-30 million parameters, but maybe this project and Taalas (https://taalas.com/ - the similar names are unlikely to be a coincidence) will lead to more onboard ROM in FPGAs, or to FPGAs dedicated to SLMs.
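The 20-30 million figure follows directly from dividing on-chip memory by weight width. A back-of-envelope sketch below; the memory capacity is an assumption (roughly the combined BRAM+URAM of a large current FPGA, and a real design would also need memory for activations and logic):

```python
# Sanity check of the ~20-30M parameter cap mentioned in the post.
# ONCHIP_BITS is an assumed figure for a large current FPGA's on-chip
# memory (~455 Mbit); actual capacity varies by part, and the model
# design itself consumes some of this memory.

ONCHIP_BITS = 455 * 10**6   # assumed on-chip memory, in bits
BITS_PER_WEIGHT = 16        # 16-bit weights, as in the post

max_params = ONCHIP_BITS // BITS_PER_WEIGHT
print(f"~{max_params / 1e6:.0f}M parameters fit on-chip")  # ≈ 28M
```

This lands inside the 20-30M range the post cites, which is why larger models currently have to spill weights to external memory and give up the throughput advantage.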

submitted by /u/jawondo