As MTP prepares to land in llama.cpp: models that support MTP

Reddit r/LocalLLaMA / 5/5/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A Reddit post says that MTP (multi-token prediction) is expected to be integrated into llama.cpp soon, which will affect how locally run LLMs are set up.
  • Until official MTP weights are available, users are advised to download model weights from Hugging Face and convert them to GGUF format for use with llama.cpp (a conversion sketch follows this list).
  • The post lists several candidate models that purportedly support MTP, including DeepSeek v3 (and v3.2/4), Qwen3.5, GLM4.5+, MiniMax2.5+, Step3.5Flash, and Mimo v2+.
  • The author plans to test either Qwen3.5-122B or GLM4.5-air first, implying early experimentation priorities for MTP-enabled local deployments.
  • Practical guidance focuses on readiness for the MTP-to-llama.cpp transition, emphasizing conversion steps and selecting among compatible models.
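
For readers preparing ahead of the MTP landing, here is a minimal sketch of the Hugging Face-to-GGUF conversion flow the post describes, assuming a local llama.cpp checkout and the huggingface_hub Python package; the repo id and paths are illustrative placeholders, not specifics from the post.

```python
# Sketch: fetch Hugging Face weights and convert them to GGUF for llama.cpp.
# The repo id below is a placeholder -- substitute whichever MTP-capable
# model you actually want to try.
import subprocess
from pathlib import Path

from huggingface_hub import snapshot_download

REPO_ID = "Qwen/Qwen3.5-122B"                  # hypothetical repo id
LOCAL_DIR = Path("models/qwen3.5-122b")        # where the HF weights land
LLAMA_CPP = Path("~/llama.cpp").expanduser()   # path to your llama.cpp checkout

# 1. Download the original Hugging Face weights.
snapshot_download(repo_id=REPO_ID, local_dir=str(LOCAL_DIR))

# 2. Convert them to GGUF with llama.cpp's converter script.
subprocess.run(
    [
        "python",
        str(LLAMA_CPP / "convert_hf_to_gguf.py"),
        str(LOCAL_DIR),
        "--outfile", str(LOCAL_DIR / "model-f16.gguf"),
        "--outtype", "f16",
    ],
    check=True,
)
```

The resulting F16 GGUF can then be quantized with llama.cpp's llama-quantize tool if memory is tight.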

  • DeepSeekv3 OG
  • DeepSeekv3.2/4
  • Qwen3.5
  • GLM4.5+
  • MiniMax2.5+
  • Step3.5Flash
  • Mimo v2+

Until we get mtp weights, you need to download HF weights and convert to gguf. I think I'm going to try either qwen3.5-122b or glm4.5-air first.

submitted by /u/segmond