| Hi guys, back again. I have tested the Qwen 3.6 UD 2 K_XL Unsloth model on the same paper-to-web-app task. The model performs very well: it handled all tool calls properly and managed large context using llama.cpp on a laptop with 16GB of VRAM. I have attached all the details. You can test this model using the same skills I created earlier with the Qwen 35B model. |
Qwen 3.6 35 UD 2 K_XL is pulling beyond its weight and quantization (No one is GPU Poor now)
Reddit r/LocalLLaMA / 4/17/2026
💬 Opinion · Signals & Early Trends · Tools & Practical Usage
Key Points
- A Reddit user reports testing the Qwen 3.6 UD 2 K_XL (Qwen 35B) Unsloth model on a paper-to-web-app task and says it performs very well.
- They claim the model handled 58 tool calls with a 98.3% success rate and correctly managed large context using llama.cpp on a laptop with 16GB VRAM.
- The user states the model processed about 2.7 million tokens while building the app from the provided paper.
- They share a suggested workflow and commands for running the model via llama-server (e.g., with a 90,000-token context length) and link to a related “research-webapp-skill.”
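The llama-server workflow described in the post might look roughly like the sketch below. The GGUF filename, GPU-layer count, and port are assumptions for illustration; only the 90,000 context length comes from the post itself:

```shell
# Sketch: serve a quantized GGUF with llama.cpp's llama-server.
# The model filename is hypothetical -- substitute the actual
# Unsloth UD 2 K_XL GGUF you downloaded.
llama-server \
  -m Qwen3-35B-UD-Q2_K_XL.gguf \
  -c 90000 \
  -ngl 99 \
  --port 8080
# -c 90000  : 90,000-token context window, as suggested in the post
# -ngl 99   : offload as many layers as fit in the 16GB of VRAM
# --port    : exposes an OpenAI-compatible HTTP API on localhost
```

With the server running, any OpenAI-compatible client can point at `http://localhost:8080` to drive the paper-to-web-app workflow.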



