Quick thoughts on Qwen3.5-35B-A3B-UD-IQ4_XS from Unsloth
Posted by EuphoricPenguin22 on r/LocalLLaMA
Just some quick thoughts on [Qwen3.5-35B-A3B-UD-IQ4_XS](https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF/blob/main/Qwen3.5-35B-A3B-UD-IQ4_XS.gguf) after I finally got it working in the new version of [Ooba](https://github.com/oobabooga/text-generation-webui).
In short: on a 3090, this thing runs at around 100 t/s with almost no prompt-processing time, and it can fit something like a 250k context on the card with no cache quantization.
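As a rough sanity check on context-length claims like this, the KV-cache footprint of a standard GQA transformer can be estimated from its hyperparameters. This is a minimal sketch; the layer/head numbers below are illustrative assumptions, not this model's published config (its exact attention layout isn't stated in the post):

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_len: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV-cache size for a plain full-attention GQA transformer.

    Counts keys and values (the leading factor of 2) across all layers
    at the given context length; bytes_per_elem=2 corresponds to an
    unquantized fp16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Illustrative config (assumed, NOT this model's published numbers):
# 48 layers, 4 KV heads, head_dim 128, fp16 cache, 250k tokens.
gib = kv_cache_bytes(48, 4, 128, 250_000) / 2**30
print(f"{gib:.1f} GiB")  # ~22.9 GiB
```

A plain full-attention cache of that shape would nearly fill a 24 GB card by itself, so fitting ~250k alongside the quantized weights suggests the model's attention is considerably more KV-efficient than this worst case.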
Output quality is quite good, too.
My usual test is to have the model build a quick demo that I chuck on CodePen, and until now I'd been trying and failing to get a local model to produce a basic 3D snake game in ThreeJS.