24GB VRAM users, have you tried Qwen3.5-9B-UD-Q8_K_XL?
Author: Prestigious-Use5483 • Source: r/LocalLLaMA
AI-Generated Claims
From my own testing, I am somewhat convinced that, for non-coding tasks, the 9B at UD-Q8_K_XL is better than the 27B at Q4_K_XL or Q5_K_XL.
To me, going with the highest quant clearly paid off: noticeably better output quality, and faster as well.
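The 9B-at-Q8 vs 27B-at-Q4/Q5 trade-off comes down to VRAM arithmetic: weight footprint is roughly parameter count times bits-per-weight. A minimal sketch of that back-of-envelope math, assuming approximate bits-per-weight figures for each quant (real GGUF files vary with the quant mix and per-layer overrides, and KV-cache headroom depends on context length):

```python
# Back-of-envelope VRAM estimate for GGUF quants.
# The bpw values below are rough assumptions, not measured file sizes.
def model_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB for params_b billion parameters."""
    return params_b * bits_per_weight / 8  # 1e9 weights * bpw bits -> GB

# (model size, quant) -> (billions of params, assumed bits per weight)
quants = {
    ("9B",  "UD-Q8_K_XL"): (9.0, 8.5),
    ("27B", "Q4_K_XL"):    (27.0, 4.8),
    ("27B", "Q5_K_XL"):    (27.0, 5.7),
}

HEADROOM_GB = 4.0  # rough allowance for KV cache, context, and runtime buffers

for (size, quant), (params, bpw) in quants.items():
    gb = model_size_gb(params, bpw)
    verdict = "fits" if gb + HEADROOM_GB <= 24 else "tight"
    print(f"{size} {quant}: ~{gb:.1f} GB weights ({verdict} in 24 GB)")
```

Under these assumptions all three land inside 24 GB, which is why the choice is about quality per gigabyte rather than whether the model loads at all.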
Not only that, I am able to pair Qwen3-TTS with it and use a custom voice (I am using Scarlett Johansson's voice).
Once the first prompt is loaded and the voice is called, it is really fast.
Related Events
Qwen3.5 27B and 35B with 2x AMD 7900 XTX vLLM bench serve results
Hardware • 3/21/2026
HELP - What settings do you use? Qwen3.5-35B-A3B
Uncategorized • 3/21/2026
RTX 5060 Ti 16GB vs Context Window Size
Uncategorized • 3/21/2026
Feedback on my 256gb VRAM local setup and cluster plans. Lawyer keeping it local.
Uncategorized • 3/21/2026
Qwen 3.5 397B is the best local coder I have used until now
Uncategorized • 3/21/2026