AI Navigate

Implementing reasoning-budget in Qwen3.5

Reddit r/LocalLLaMA / 3/20/2026



Key Points

The post asks how to implement reasoning-budget for Qwen3.5 using vLLM or SGLang in Python.
The author reports that the model consistently spends about 1500 tokens on reasoning, whatever settings they try.
The question was submitted by user /u/DingyAtoll on Reddit, with a link to the LocalLLaMA discussion thread.
The thread focuses on understanding and controlling the reasoning budget, which impacts latency, cost, and output behavior.

Can anyone please tell me how I am supposed to implement reasoning-budget for Qwen3.5 on either vLLM or SGLang on Python? No matter what I try it just thinks for 1500 tokens for no reason and it's driving me insane.

submitted by /u/DingyAtoll
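
One backend-agnostic way to approach the question is a two-pass "budget forcing" loop: cap the first generation at the thinking budget, close the reasoning block by hand if the cap was hit, then generate the answer. The sketch below is only an illustration under stated assumptions, not a confirmed Qwen3.5, vLLM, or SGLang feature. The `generate` callable, its `(text, hit_limit)` return value, the `EARLY_STOP` wording, and the assumption that Qwen3.5 closes its reasoning with `</think>` are all hypothetical and would need checking against the model's chat template and the serving engine in use.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical backend signature: (prompt, max_tokens, stop) -> (text, hit_limit).
# `hit_limit` is True when generation ended because max_tokens ran out.
# A real version would wrap a vLLM or SGLang completion call; that wiring
# is deliberately not shown here.
GenerateFn = Callable[[str, int, Optional[Sequence[str]]], Tuple[str, bool]]

# Assumed closing tag for the reasoning block.
THINK_CLOSE = "</think>"
# Assumed wording for the forced wrap-up sentence; any short cue could be used.
EARLY_STOP = "\n\nI have to stop thinking and answer now.\n" + THINK_CLOSE + "\n\n"


def generate_with_budget(
    generate: GenerateFn,
    prompt: str,
    thinking_budget: int,
    answer_max_tokens: int,
) -> Tuple[str, str]:
    """Return (thinking, answer) with thinking capped at `thinking_budget` tokens."""
    # Pass 1: let the model reason, stopping at the closing tag or at the budget.
    thinking, hit_limit = generate(prompt, thinking_budget, [THINK_CLOSE])

    # If the budget ran out, inject a wrap-up cue plus the closing tag.
    # Otherwise just restore the closing tag that the stop string swallowed.
    closing = EARLY_STOP if hit_limit else THINK_CLOSE + "\n\n"

    # Pass 2: continue from the closed reasoning block to get the final answer.
    answer, _ = generate(prompt + thinking + closing, answer_max_tokens, None)
    return thinking, answer.strip()
```

With this shape, `prompt` is expected to be the already-rendered chat template text, and the two calls would go to a raw completion endpoint rather than a chat endpoint so that the second call can continue the partially written assistant turn. The cost is one extra round trip per request.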
