server, webui: support continue generation on reasoning models by ServeurpersoCom · Pull Request #22727 · ggml-org/llama.cpp

Reddit r/LocalLLaMA / 5/13/2026

📰 News · Developer Stack & Infrastructure · Tools & Practical Usage

Key Points

  • The ggml-org/llama.cpp pull request #22727 adds support for continuing text generation on reasoning models.
  • The change specifically targets server and WebUI workflows, enabling users to resume generation rather than restarting.
  • This update improves the usability of local LLM applications that rely on reasoning-capable models.
  • The post indicates the feature is now available (“now you can CONTINUE”) through the referenced PR.
  • Developers integrating llama.cpp-based backends and interfaces can use this capability to let users resume a truncated reply instead of regenerating it from scratch (see the sketch after this list).
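The PR itself defines how the WebUI issues the continue request; as a rough illustration of the underlying idea, the sketch below resends a partial assistant turn to a local llama-server's OpenAI-compatible endpoint and appends whatever comes back. The server address, the max_tokens value, and the assumption that the server treats a trailing assistant message as a prefix to continue (rather than as a completed turn) are all illustrative, not taken from the PR.

```python
# Minimal sketch: resuming a cut-off reply by resending the partial
# assistant message to a local llama-server. Endpoint path and the
# "resend the partial turn" convention are assumptions for illustration;
# consult PR #22727 for the exact mechanism the server/WebUI uses.
import json
import urllib.request

SERVER = "http://localhost:8080"  # assumed local llama-server address


def continue_generation(messages, partial_reply):
    """Ask the server to keep generating from an unfinished assistant turn."""
    payload = {
        # The conversation so far, plus the truncated assistant message
        # we want the model to pick up from.
        "messages": messages + [{"role": "assistant", "content": partial_reply}],
        "max_tokens": 256,  # illustrative cap on the continuation
    }
    req = urllib.request.Request(
        f"{SERVER}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


history = [{"role": "user", "content": "Walk through the proof step by step."}]
truncated = "First, note that the sequence is bounded above because"
print(truncated + continue_generation(history, truncated))
```

Per the post, the notable part of the PR is that this resume path now also works for reasoning models, whose replies carry separate reasoning content in addition to the visible answer; how that reasoning state is preserved across the continuation is specified in the PR itself.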

now you can CONTINUE

submitted by /u/jacek2023