PSA: If you haven’t updated llama.cpp in a couple of days and MTP isn’t performing well, update llama.cpp.

Reddit r/LocalLLaMA / 5/18/2026

Key Points

The post advises users who have not updated llama.cpp for a few days and are seeing poor MTP (multi-token prediction) performance to update.
One user reported that after updating, token generation performance improved by roughly 1.5–1.8x based on their benchmarking.
The update reportedly also fixed most of the “pp” (prompt processing) issue, and the poster now sees much higher prompt processing speed.
Overall, it emphasizes that recent llama.cpp changes can significantly affect runtime performance for local LLM setups.
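
For anyone wanting to check a similar claim on their own setup, here is a minimal sketch of how a before/after speedup figure like the 1.5–1.8x above could be computed. The function name and the tokens-per-second numbers are hypothetical placeholders, not measurements from the post; substitute figures from your own benchmark runs.

```python
from statistics import mean


def speedup(before_tps: list[float], after_tps: list[float]) -> float:
    """Return the ratio of mean tokens/sec after an update to before it."""
    if not before_tps or not after_tps:
        raise ValueError("need at least one measurement on each side")
    baseline = mean(before_tps)
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return mean(after_tps) / baseline


if __name__ == "__main__":
    # Hypothetical token-generation throughputs (tokens/sec) from repeated runs.
    old_build = [19.0, 20.0, 21.0]
    new_build = [31.0, 32.0, 33.0]
    print(f"speedup: {speedup(old_build, new_build):.2f}x")
```

Averaging several runs on each build helps separate a real change from run-to-run noise.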

I thought it had horrible performance and was a nothingburger and had spent like an hour benchmarking it. Updated it yesterday and received a like 1.5-1.8x token boost. They even mostly fixed the pp issue. Now my pp is really big ;)

submitted by /u/Borkato