SIGNAL GRID v0.1

RTX 5060 Ti 16GB Local LLM Findings: 30B Still Wins, 35B UD Is Surprisingly Fast

1 source · 1 story · First seen 3/20/2026 · Score 26 · Mixed Progress
Single Source
Bigness: 26
Coverage: 13
Recency: 86
Engagement: 9
Velocity: 0
Confidence: 49
Clipability: 55
Polarization: 0
Claims: 4
Contradictions: 0
Breakthrough: 50

Sentiment Mix

Positive 100% · Neutral 0% · Negative 0%

Geography

North America

Expert Signals

Imaginary-Anywhere23 (author): 1 mention

r/LocalLLaMA (source): 1 mention

AI-Generated Claims

Generated from linked receipts; click sources for full context.

RTX 5060 Ti 16GB Local LLM Findings: 30B Still Wins, 35B UD Is Surprisingly Fast.

Supported by 1 story

Bought a 5060 Ti 16 GB and tried various models.

Supported by 1 story

This is the short version of how I decided what to run on this card with `llama.cpp`, not a giant benchmark dump.

Supported by 1 story

Machine:

* RTX 5060 Ti 16 GB
* DDR4 now at 32 GB
* llama-server `b8373` (`46dba9fce`)

Relevant launch settings:

* fast path: `fa=on`, `ngl=auto`, `threads=8`
* KV: `-ctk q8_0 -ctv q8_0`
* 30B coder path: `jinja`, `reasoning-budget 0`, `reasoning-format none`
* 35B UD path: `c=262144`, `n-cpu-moe=8`
* 35B `Q4_K_M` stable tune: `-ngl 26 -c 131072 --fit on --fit-ctx 131072 --fit-target 512M`

Short version:

* Best default coding model: `Unsloth Qwen3-Coder-30B UD-Q3_K_XL`
* Best higher-context coding option: the same `Unsloth 30B` model at `96k`
* Best fast 35B coding option: `Unsloth Qwen3.5-35B UD-Q2_K_XL`
* `Unsloth Qwen3.5-35B Q4_K_M` is interesting, but still not the right default on this card

What surprised me most is that the practical winners here were not just "smaller is...

Supported by 1 story
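The quoted settings for the 30B coder fast path can be assembled into a single `llama-server` invocation. A minimal sketch, assuming the long-form flag spellings of a recent llama.cpp build (exact support depends on the build, here `b8373`); the model filename and port are hypothetical placeholders, while the flag values themselves come from the post:

```shell
# Sketch of the 30B coder fast path described above.
# The .gguf filename and --port are assumptions, not from the post.
llama-server \
  --model ./Qwen3-Coder-30B-UD-Q3_K_XL.gguf \
  --port 8080 \
  --flash-attn on \                        # fa=on
  --n-gpu-layers auto \                    # ngl=auto
  --threads 8 \
  --cache-type-k q8_0 --cache-type-v q8_0 \  # KV cache quantized to q8_0
  --jinja \                                # use the model's chat template
  --reasoning-budget 0 \
  --reasoning-format none
```

The 35B variants would swap in the other quoted flags (e.g. `-c 262144 --n-cpu-moe 8` for the UD path); this is a launch-config fragment, not something runnable without the card and model files.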

Related Events

Timeline (1 story)

Receipts (1)

Bias Snapshot

Center
Left 0% · Center 100% · Right 0%
Social · i.redd.it · 3/20/2026