Follow-up: Qwen3 30B a3b at 7-8 t/s on a Raspberry Pi 5 8GB (source included)
Author: jslominski • Source: r/LocalLLaMA
**Disclaimer: everything here runs locally on the Pi 5 — no API calls, no eGPU, etc.; source/image available below.** This is the follow-up to my post from about a week ago.
The demo is running [byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF](https://huggingface.co/byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF), specifically the [Q3\_K\_S 2.66bpw quant](https://huggingface.co/byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF/blob/main/Qwen3-30B-A3B-Instruct-2507-Q3_K_S-2.66bpw.gguf).
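A quick back-of-envelope check shows why an SSD matters here: at the stated 2.66 bits per weight, the quantized model won't fit in the Pi's 8 GB of RAM. This sketch assumes roughly 30.5B parameters for Qwen3-30B-A3B (the parameter count is not stated in the post).

```python
# Estimate the on-disk/in-memory size of the Q3_K_S 2.66 bpw quant.
# Assumption (not from the post): Qwen3-30B-A3B has ~30.5B parameters.
params = 30.5e9
bits_per_weight = 2.66

size_bytes = params * bits_per_weight / 8   # bits -> bytes
size_gib = size_bytes / 2**30               # bytes -> GiB

print(f"~{size_gib:.1f} GiB")  # roughly 9-10 GiB, more than the Pi 5's 8 GB RAM
```

Since the weights exceed available RAM, they are presumably memory-mapped and streamed from the SSD; the MoE architecture (only ~3B active parameters per token) is what keeps generation speed usable despite that.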
On a **Pi 5 8GB with SSD**, I'm getting 7-8 t/s at **16,384 context length**.
With a 4-bit quant of the same model family, you can expect 4-5 t/s.
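The post doesn't name the inference runtime, but GGUF files are typically run with llama.cpp; a hypothetical invocation matching the quoted settings (the binary name, flags, and prompt are assumptions, not from the post) might look like:

```shell
# Hypothetical llama.cpp run on the Pi 5 (CPU-only).
# -c sets the 16,384-token context window quoted in the post;
# -t 4 matches the Pi 5's four Cortex-A76 cores.
./llama-cli \
  -m Qwen3-30B-A3B-Instruct-2507-Q3_K_S-2.66bpw.gguf \
  -c 16384 \
  -t 4 \
  -p "Explain mixture-of-experts models in one paragraph."
```

llama.cpp memory-maps the GGUF by default, which is what lets a ~10 GiB model run on an 8 GB board backed by a fast SSD.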