Model family · 6 sizes
Qwen2.5-Coder: which size runs locally?
Qwen2.5-Coder comes in 6 sizes, from 0.494B to 32B parameters. Bigger is generally more capable but needs more memory. Here is each size with its Q4_K_M weight size, the memory it needs, and the hardware that runs it.
- Sizes: 6
- Smallest: 0.494B
- Largest: 32B
- Runs from: 8GB
The Qwen2.5-Coder lineup
- Qwen2.5 Coder 0.5B (0.494B) · ~0.37 GB Q4_K_M · needs ~1 GB
- Qwen2.5 Coder 1.5B (1.54B) · ~0.92 GB Q4_K_M · needs ~2 GB
- Qwen2.5 Coder 3B (3.09B) · ~1.8 GB Q4_K_M · needs ~3 GB
- Qwen2.5 Coder 7B (7B) · ~4.36 GB Q4_K_M · needs ~6 GB
- Qwen2.5 Coder 14B (14B) · ~8.37 GB Q4_K_M · needs ~11 GB
- Qwen2.5 Coder 32B (32B) · ~18.49 GB Q4_K_M · needs ~21 GB
"Needs" is the sourced minimum memory for Q4_K_M with a small context. Larger context needs more.
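As a rough illustration of how the "needs" column can be used, here is a minimal Python sketch that picks the largest size whose minimum memory fits a given amount of usable memory. The `LINEUP` table is copied from the list above; the function name `largest_that_fits` and the `usable_gb` parameter are hypothetical, and "usable" is an assumption here: it means memory actually free for the model, which on a shared-memory machine is less than the installed total.

```python
from typing import Optional

# (name, approx. Q4_K_M weight size in GB, approx. minimum memory in GB),
# copied from the lineup above and ordered smallest to largest.
LINEUP = [
    ("Qwen2.5 Coder 0.5B", 0.37, 1),
    ("Qwen2.5 Coder 1.5B", 0.92, 2),
    ("Qwen2.5 Coder 3B", 1.8, 3),
    ("Qwen2.5 Coder 7B", 4.36, 6),
    ("Qwen2.5 Coder 14B", 8.37, 11),
    ("Qwen2.5 Coder 32B", 18.49, 21),
]


def largest_that_fits(usable_gb: float) -> Optional[str]:
    """Return the largest model whose minimum memory fits usable_gb.

    usable_gb is an assumption of this sketch: memory free for the model,
    not the total installed. Returns None if nothing fits.
    """
    best = None
    for name, _weights_gb, needs_gb in LINEUP:
        if needs_gb <= usable_gb:
            best = name  # list is sorted, so the last match is the largest
    return best


if __name__ == "__main__":
    for budget in (5, 16, 24):
        print(budget, "GB usable ->", largest_that_fits(budget))
```

The comparison uses the small-context minimum only, so a longer context would push the real requirement above these figures.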
Which Qwen2.5-Coder fits your memory
- 8GB: largest that fits is Qwen2.5 Coder 3B (3.09B), best case on Apple M1 (8GB).
- 16GB: largest that fits is Qwen2.5 Coder 14B (14B), best case on Nvidia GeForce RTX 4080 (16GB).
- 24GB: largest that fits is Qwen2.5 Coder 32B (32B), best case on Nvidia GeForce RTX 4090 (24GB). Comfortable up to Qwen2.5 Coder 14B (14B).
- 32GB: largest that fits is Qwen2.5 Coder 32B (32B), best case on Nvidia GeForce RTX 5090 (32GB).
Best case means the most capable device at that size (usually a discrete GPU). A Mac at the same size sits roughly one rung lower; see the per-size breakdown on each memory budget page.
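One way to see why a size can fit without being comfortable is to look at the memory left over after the minimum is met. The sketch below is only an illustration: the page does not define "comfortable", so the `headroom_gb` helper is hypothetical and simply subtracts the "needs" figure from the budget.

```python
# Approximate minimum memory in GB at Q4_K_M with a small context,
# copied from the lineup on this page.
NEEDS_GB = {
    "Qwen2.5 Coder 0.5B": 1,
    "Qwen2.5 Coder 1.5B": 2,
    "Qwen2.5 Coder 3B": 3,
    "Qwen2.5 Coder 7B": 6,
    "Qwen2.5 Coder 14B": 11,
    "Qwen2.5 Coder 32B": 21,
}


def headroom_gb(model: str, budget_gb: float) -> float:
    """Memory left for longer context and other overhead (hypothetical helper)."""
    return budget_gb - NEEDS_GB[model]


if __name__ == "__main__":
    for model in ("Qwen2.5 Coder 32B", "Qwen2.5 Coder 14B"):
        print(model, "on 24 GB leaves about", headroom_gb(model, 24), "GB")
```

On a 24 GB budget this leaves about 3 GB for the 32B model and about 13 GB for the 14B model, which is consistent with 32B fitting while 14B has more room for context.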
FAQ
Which Qwen2.5-Coder size should I run locally?
Pick the largest size your memory allows. In the best case, 8GB runs up to Qwen2.5 Coder 3B, 16GB up to Qwen2.5 Coder 14B, and 24GB or 32GB up to Qwen2.5 Coder 32B. Smaller sizes run faster and leave headroom for context.
What is the smallest Qwen2.5-Coder model?
Qwen2.5 Coder 0.5B, at 0.494B parameters, is about 0.37 GB on disk at Q4_K_M and needs roughly 1 GB of memory to run. It suits phones and other low-memory devices; an 8 GB machine can run up to Qwen2.5 Coder 3B in the best case.
What is the largest Qwen2.5-Coder model and what does it need?
Qwen2.5 Coder 32B, at 32B parameters, is about 18.49 GB at Q4_K_M and needs roughly 21 GB of memory. It fits a 24GB desktop GPU such as the Nvidia GeForce RTX 4090 in the best case, or a high-memory Mac.
Sources
Memory figures are estimates at Q4_K_M. See methodology.