24GB VRAM users, have you tried Qwen3.5-9B-UD-Q8_K_XL?

Reddit r/LocalLLaMA / 3/21/2026

Key Points

The author reports that the 9B UD-Q8_K_XL variant delivers better quality and faster performance than the 27B Q4_K_XL and Q5_K_XL for non-coding tasks.
They were able to pair Qwen3-TTS with this model using a custom Scarlett Johansson voice, with notably fast responses after the initial prompt load.
In their tests, using the same context size for 27B and 9B, the 9B 8-bit quant appears to outperform the 27B's 4- or 5-bit quantization for general-purpose use.
They would consider adding a second GPU to run the 27B at 8-bit and asked others if they've seen similar results.

I am somewhat convinced by my own testing that, for non-coding tasks, the 9B UD-Q8_K_XL variant is better than the 27B Q4_K_XL and Q5_K_XL. To me, going to the highest quant really showed itself: the results were good quality and came back faster. Not only that, I am able to pair Qwen3-TTS with it and use a custom voice (I am using Scarlett Johansson's voice). Once the first prompt is loaded and the voice is called, it is really fast. I was testing with the same context size for the 27B and the 9B.

This is mostly about how the quality of the higher-end 9B 8-bit quant felt better for general-purpose stuff compared to the 4- or 5-bit quants of the 27B. It makes me want to get another GPU to add to my 3090 so that I can run the 27B at 8-bit.
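For anyone wondering why the 27B at 8-bit would need a second card, here is a rough back-of-the-envelope sketch. It assumes the parameter counts are exactly the nominal 9B and 27B from the model names and that each quant uses exactly its nominal bit width. Real GGUF files (the UD/XL variants included) mix precisions, so actual file sizes will differ. It also counts weights only and ignores the KV cache, context buffers, and the TTS model, all of which need VRAM too. The `weight_gib` helper is made up for illustration.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


VRAM_GIB = 24  # a single RTX 3090

# Nominal parameter counts and bit widths (assumptions, not measured file sizes).
candidates = [
    ("9B @ 8-bit", 9, 8),
    ("27B @ 4-bit", 27, 4),
    ("27B @ 5-bit", 27, 5),
    ("27B @ 8-bit", 27, 8),
]

for name, params, bits in candidates:
    size = weight_gib(params, bits)
    verdict = "fits" if size < VRAM_GIB else "does not fit"
    print(f"{name}: ~{size:.1f} GiB of weights, {verdict} in {VRAM_GIB} GiB")
```

Under those assumptions, only the 27B at 8-bit exceeds 24 GiB on weights alone, which lines up with wanting a second GPU for it.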

Has anyone seen anything similar?

submitted by /u/Prestigious-Use5483
