I wanted to see how older GPUs hold up for AI tasks today. Seven months ago I posted about the AMD 9070 XT I had for gaming, which I also wanted to use for AI. Recently I added an old Titan X Pascal card to my server just to see what it could do; it was collecting dust anyway.
Even if it only ran a small LLM agent that reviews code while I sleep, I thought it would be a fun experiment.
After some tweaking with OpenCode and llama.cpp, I’m seeing around 500 tokens/sec for prompt processing and 25 tokens/sec for generation. Prompt processing is on par with the 9070 XT, though generation runs at half its speed. For comparison, the server on its own only managed 100 tokens/sec for prompt processing and 6 tokens/sec for generation.
Lesson learned: old hardware can still perform surprisingly well.
Note: I added a simple panel to show hardware metrics from llama.cpp. I don’t care much about tracking metrics; it’s mostly just for the visuals.
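For anyone curious how a panel like that can pull numbers out of llama.cpp: the server exposes a Prometheus-style text endpoint when started with `--metrics`, and a few lines of Python are enough to turn it into a dict. This is just a sketch; the metric names and values in the sample below are illustrative assumptions, not my actual captured output.

```python
def parse_metrics(text):
    """Parse Prometheus text-exposition format into {metric_name: float}."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comment lines
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            pass  # ignore lines that don't end in a plain number
    return metrics

# Illustrative sample of what llama.cpp's /metrics endpoint returns;
# in practice you'd fetch it from the running server instead.
sample = """\
# HELP llamacpp:prompt_tokens_seconds Average prompt throughput in tokens/s.
llamacpp:prompt_tokens_seconds 500
llamacpp:predicted_tokens_seconds 25
"""

parsed = parse_metrics(sample)
print(parsed["llamacpp:prompt_tokens_seconds"])  # 500.0
```

From there it’s just a matter of polling the endpoint on a timer and feeding the dict into whatever chart widget the panel uses.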