Quick thoughts on Qwen3.5-35B-A3B-UD-IQ4_XS from Unsloth
Posted by EuphoricPenguin22 on r/LocalLLaMA
Just some quick thoughts on [Qwen3.5-35B-A3B-UD-IQ4_XS](https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF/blob/main/Qwen3.5-35B-A3B-UD-IQ4_XS.gguf) after I finally got it working in the new version of [Ooba](https://github.com/oobabooga/text-generation-webui).
In short: on a 3090, this thing runs at around 100 t/s with almost no prompt-processing time, and it can fit something like a 250k context on the card with no cache quantization.
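As a rough sanity check on context-length claims like this, the KV-cache footprint of a standard GQA transformer can be estimated from its hyperparameters. This is a minimal sketch; the layer/head numbers below are illustrative assumptions, not this model's published config (its exact attention layout isn't stated in the post):

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_len: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV-cache size for a plain full-attention GQA transformer.

    Counts keys and values (the leading factor of 2) across all layers
    at the given context length; bytes_per_elem=2 corresponds to an
    unquantized fp16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Illustrative config (assumed, NOT this model's published numbers):
# 48 layers, 4 KV heads, head_dim 128, fp16 cache, 250k tokens.
gib = kv_cache_bytes(48, 4, 128, 250_000) / 2**30
print(f"{gib:.1f} GiB")  # ~22.9 GiB
```

A plain full-attention cache of that shape would nearly fill a 24 GB card by itself, so fitting ~250k alongside the quantized weights suggests the model's attention is considerably more KV-efficient than this worst case.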
Output quality is quite good, too.
My usual test is to have the model build a quick demo that I chuck on CodePen, and until now I'd been trying and failing to get a local model to produce a basic 3D snake game in ThreeJS.