Web OS result from Qwen3.6 35B is by far the best I tested in my laptop

Reddit r/LocalLLaMA / 4/17/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage · Models & Research

Key Points

  • A Reddit user reports that the Qwen3.6 35B model produced a notably strong “web OS” performance on their laptop, achieving ~98% usability versus ~70% with a previous Qwen3 coder model they tested.
  • The test involved generating web-OS-like code (about 2,100 lines) using a 38k context window with an OpenCode workflow and quantized weights (Q4_K_XL).
  • The user compared results against other state-of-the-art models they had tried, claiming Qwen3.6 35B was the best web OS outcome on their setup.
  • The run used local inference via llama-server with specific runtime parameters (e.g., temperature/top-p settings, parallelism, and quantization-related flags) on hardware consisting of 24GB DDR5 and an RTX 4050.
  • The post functions mainly as an early, hands-on signal about practical coding-agent/web-OS capability improvements for locally hosted LLMs rather than a formally released benchmark.

This is my first test with this model, and Qwen impressed me. I would rate it a 98% usable web OS, compared to my previous best of a 70% usable result from Qwen3 Next Coder at Q2.

Yes, I know they train the models on these common prompts, yet these are the best results I have seen, even compared to SOTA models.

~2,100 lines of code, using 38k of context with OpenCode

Hardware: 24GB DDR5 + RTX 4050

Quant: Q4_K_XL

Token generation (tg): ~25 tk/s

llama-server \
  --model /run/media/loq/New\ Volume/Models/unsloth/Qwen3.6-35B-A3B-GGUF/Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf \
  --port 1234 \
  --host "0.0.0.0" \
  --jinja \
  -cmoe \
  -t 8 -fa 1 -ctk q8_0 -ctv q8_0 \
  --parallel 1 --fit-target 200 \
  --temp 0.6 --top-p 0.95 --min-p 0.0 --top-k 20 --presence-penalty 0 --repeat-penalty 1.0
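As a usage note for reproducing a setup like this: llama-server exposes an OpenAI-compatible HTTP API, and sampling parameters equivalent to the command-line flags can also be supplied per request. The sketch below is a minimal, hypothetical helper (`build_request` is not from the post) that builds a request body mirroring the sampling flags above, assuming the server is listening on port 1234 as configured.

```python
import json

# Sampling parameters mirroring the llama-server flags above
# (--temp 0.6 --top-p 0.95 --min-p 0.0 --top-k 20 --repeat-penalty 1.0).
SAMPLING = {
    "temperature": 0.6,
    "top_p": 0.95,
    "min_p": 0.0,
    "top_k": 20,
    "repeat_penalty": 1.0,
}

def build_request(prompt: str, max_tokens: int = 4096) -> bytes:
    """Build a JSON body for llama-server's OpenAI-compatible
    /v1/chat/completions endpoint (served on port 1234 here)."""
    body = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        **SAMPLING,
    }
    return json.dumps(body).encode("utf-8")

# Example: POST this body to http://localhost:1234/v1/chat/completions
# (e.g. with urllib.request or curl) once the server is running.
```

OpenCode points its OpenAI-compatible provider at the same endpoint, so the agent run and a manual request share identical sampling behavior.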

submitted by /u/Idontknow3728