As MTP prepares to land in llama.cpp: models that support MTP

Reddit r/LocalLLaMA / 5/5/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A Reddit post says that MTP (multi-token prediction) is expected to be integrated into llama.cpp soon, which will affect how locally run LLMs are set up.
  • Until official MTP weights are available, users are advised to download model weights from Hugging Face and convert them to GGUF format for use with llama.cpp (a conversion sketch follows this list).
  • The post lists several candidate models that purportedly support MTP, including DeepSeek v3 (and v3.2/4), Qwen3.5, GLM4.5+, MiniMax2.5+, Step3.5Flash, and Mimo v2+.
  • The author plans to test either Qwen3.5-122B or GLM4.5-air first, implying early experimentation priorities for MTP-enabled local deployments.
  • Practical guidance focuses on readiness for the MTP-to-llama.cpp transition, emphasizing conversion steps and selecting among compatible models.
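
For readers preparing ahead of the MTP landing, here is a minimal sketch of the Hugging Face-to-GGUF conversion flow the post describes, assuming a local llama.cpp checkout and the huggingface_hub Python package; the repo id and paths are illustrative placeholders, not specifics from the post.

```python
# Sketch: fetch Hugging Face weights and convert them to GGUF for llama.cpp.
# The repo id below is a placeholder -- substitute whichever MTP-capable
# model you actually want to try.
import subprocess
from pathlib import Path

from huggingface_hub import snapshot_download

REPO_ID = "Qwen/Qwen3.5-122B"                  # hypothetical repo id
LOCAL_DIR = Path("models/qwen3.5-122b")        # where the HF weights land
LLAMA_CPP = Path("~/llama.cpp").expanduser()   # path to your llama.cpp checkout

# 1. Download the original Hugging Face weights.
snapshot_download(repo_id=REPO_ID, local_dir=str(LOCAL_DIR))

# 2. Convert them to GGUF with llama.cpp's converter script.
subprocess.run(
    [
        "python",
        str(LLAMA_CPP / "convert_hf_to_gguf.py"),
        str(LOCAL_DIR),
        "--outfile", str(LOCAL_DIR / "model-f16.gguf"),
        "--outtype", "f16",
    ],
    check=True,
)
```

The resulting F16 GGUF can then be quantized with llama.cpp's llama-quantize tool if memory is tight.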

  • DeepSeekv3 OG
  • DeepSeekv3.2/4
  • Qwen3.5
  • GLM4.5+
  • MiniMax2.5+
  • Step3.5Flash
  • Mimo v2+

Until we get mtp weights, you need to download HF weights and convert to gguf. I think I'm going to try either qwen3.5-122b or glm4.5-air first.

submitted by /u/segmond