Follow-up: Qwen3 30B a3b at 7-8 t/s on a Raspberry Pi 5 8GB (source included)
Author: jslominski • Source: r/LocalLLaMA
**Disclaimer: everything here runs locally on the Pi 5 — no API calls, no eGPU, etc.; source/image available below.** This is the follow-up to my post from about a week ago.
The demo is running [byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF](https://huggingface.co/byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF), specifically the [Q3\_K\_S 2.66bpw quant](https://huggingface.co/byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF/blob/main/Qwen3-30B-A3B-Instruct-2507-Q3_K_S-2.66bpw.gguf).
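A quick back-of-envelope check shows why an SSD matters here: at the stated 2.66 bits per weight, the quantized model won't fit in the Pi's 8 GB of RAM. This sketch assumes roughly 30.5B parameters for Qwen3-30B-A3B (the parameter count is not stated in the post).

```python
# Estimate the on-disk/in-memory size of the Q3_K_S 2.66 bpw quant.
# Assumption (not from the post): Qwen3-30B-A3B has ~30.5B parameters.
params = 30.5e9
bits_per_weight = 2.66

size_bytes = params * bits_per_weight / 8   # bits -> bytes
size_gib = size_bytes / 2**30               # bytes -> GiB

print(f"~{size_gib:.1f} GiB")  # roughly 9-10 GiB, more than the Pi 5's 8 GB RAM
```

Since the weights exceed available RAM, they are presumably memory-mapped and streamed from the SSD; the MoE architecture (only ~3B active parameters per token) is what keeps generation speed usable despite that.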
On a **Pi 5 8GB with SSD**, I'm getting 7-8 t/s at **16,384 context length**.
With a 4-bit quant of the same model family, you can expect 4-5 t/s.
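The post doesn't name the inference runtime, but GGUF files are typically run with llama.cpp; a hypothetical invocation matching the quoted settings (the binary name, flags, and prompt are assumptions, not from the post) might look like:

```shell
# Hypothetical llama.cpp run on the Pi 5 (CPU-only).
# -c sets the 16,384-token context window quoted in the post;
# -t 4 matches the Pi 5's four Cortex-A76 cores.
./llama-cli \
  -m Qwen3-30B-A3B-Instruct-2507-Q3_K_S-2.66bpw.gguf \
  -c 16384 \
  -t 4 \
  -p "Explain mixture-of-experts models in one paragraph."
```

llama.cpp memory-maps the GGUF by default, which is what lets a ~10 GiB model run on an 8 GB board backed by a fast SSD.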