I wanted to see how older GPUs hold up for AI tasks today. Seven months ago I posted about the AMD 9070 XT I had for gaming, which I also wanted to use for AI. Recently I added an old Titan X Pascal card to my server just to see what it could do; it was collecting dust anyway.
Even if it only ran a small LLM agent that reviews code while I sleep, I thought it would be a fun experiment.
After some tweaking with OpenCode and llama.cpp, I’m seeing around 500 tokens/sec for prompt processing and 25 tokens/sec for generation. Prompt processing is on par with the 9070 XT, though generation runs at half its speed. For comparison, the server on its own only managed 100 tokens/sec for prompt processing and 6 tokens/sec for generation.
Lesson learned: old hardware can still perform surprisingly well.
Note: I added a simple panel to show hardware metrics from llama.cpp. I don’t care much about tracking metrics; it’s mostly just for the visuals.
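For anyone curious how a panel like that can pull numbers out of llama.cpp: the server exposes a Prometheus-style text endpoint when started with `--metrics`, and a few lines of Python are enough to turn it into a dict. This is just a sketch; the metric names and values in the sample below are illustrative assumptions, not my actual captured output.

```python
def parse_metrics(text):
    """Parse Prometheus text-exposition format into {metric_name: float}."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comment lines
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            pass  # ignore lines that don't end in a plain number
    return metrics

# Illustrative sample of what llama.cpp's /metrics endpoint returns;
# in practice you'd fetch it from the running server instead.
sample = """\
# HELP llamacpp:prompt_tokens_seconds Average prompt throughput in tokens/s.
llamacpp:prompt_tokens_seconds 500
llamacpp:predicted_tokens_seconds 25
"""

parsed = parse_metrics(sample)
print(parsed["llamacpp:prompt_tokens_seconds"])  # 500.0
```

From there it’s just a matter of polling the endpoint on a timer and feeding the dict into whatever chart widget the panel uses.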