Model family · 4 sizes
Mistral: which size runs locally?
Mistral comes in 4 sizes, from 7B to 46.7B parameters. Bigger is generally more capable but needs more memory. Each size is listed with its Q4_K_M file size, the memory it needs, and the hardware that runs it.
- Sizes: 4
- Smallest: 7B
- Largest: 46.7B
- Runs from: 16GB
The Mistral lineup
"Needs" is the sourced minimum memory for Q4_K_M with a small context. Larger context needs more.
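Context adds memory because the runtime stores keys and values for every token it has seen (the KV cache). The sketch below is a rough estimate only. The layer count, KV-head count, and head dimension are assumptions taken from the published Mistral 7B configuration, not from this page, and the function name is hypothetical. It assumes 16-bit cache values and ignores sliding-window attention and KV-cache quantization, both of which reduce the real figure.

```python
def kv_cache_bytes(context_tokens, n_layers=32, n_kv_heads=8,
                   head_dim=128, bytes_per_value=2):
    """Approximate KV-cache size in bytes for a given context length.

    Defaults are assumed Mistral 7B values (not stated on this page):
    32 layers, 8 KV heads, head dimension 128, 16-bit cache entries.
    """
    # One key vector and one value vector per layer, per token.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return context_tokens * per_token


if __name__ == "__main__":
    for ctx in (2048, 8192, 32768):
        gib = kv_cache_bytes(ctx) / 2**30
        print(f"{ctx:>6} tokens: {gib:.2f} GiB of KV cache")
```

The cost grows linearly with context length, so doubling the context doubles this part of the memory footprint while the weights stay the same size.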
Which Mistral fits your memory
8GB: No Mistral size fits; even Mistral 7B needs more.
16GB: Largest that fits is Mistral Nemo 12B (12.2B), best case on Nvidia GeForce RTX 4080 (16GB).
24GB: Largest that fits is Mistral Small 3 24B (24B), best case on Nvidia GeForce RTX 4090 (24GB).
32GB: Largest that fits is Mixtral 8x7B (46.7B), best case on Nvidia GeForce RTX 5090 (32GB). Comfortable up to Mistral Small 3 24B (24B).
Best case means the most capable device at that size (usually a discrete GPU). A Mac at the same size sits roughly one rung lower; see the per-size breakdown on each memory budget page.
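The best-case tiers above can be read as a simple lookup. This is a minimal sketch, assuming a machine between two listed budgets should be rounded down to the lower one; the page does not say how in-between sizes behave, so that rule and the names `FIT_TIERS` and `largest_mistral_for` are illustrative only.

```python
from bisect import bisect_right

# Best-case tiers as listed on this page: (memory budget in GB, largest model that fits).
FIT_TIERS = [
    (8, None),                      # no Mistral size fits 8GB
    (16, "Mistral Nemo 12B"),
    (24, "Mistral Small 3 24B"),
    (32, "Mixtral 8x7B"),
]


def largest_mistral_for(memory_gb):
    """Return the largest listed model for a budget, or None if nothing fits.

    Assumption: budgets between tiers round down to the nearest listed tier.
    """
    budgets = [budget for budget, _ in FIT_TIERS]
    index = bisect_right(budgets, memory_gb)
    if index == 0:
        return None  # below the smallest listed budget
    return FIT_TIERS[index - 1][1]


if __name__ == "__main__":
    for gb in (8, 16, 20, 24, 32):
        print(gb, "GB ->", largest_mistral_for(gb))
```

Because these are best-case figures for discrete GPUs, a Mac with the same amount of memory would typically land one tier lower than this lookup returns.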
FAQ
Which Mistral size should I run locally?
Pick the largest size your memory allows. In the best case, 16GB runs up to Mistral Nemo 12B, 24GB up to Mistral Small 3 24B, and 32GB up to Mixtral 8x7B. Smaller sizes run faster and leave headroom for context.
What is the smallest Mistral model?
Mistral 7B, at 7B parameters: about 4.37 GB on disk at Q4_K_M and roughly 8 GB of memory to run. That is too much for an 8GB budget in the fit list above, so the practical starting point is 16GB.
What is the largest Mistral model and what does it need?
Mixtral 8x7B, a mixture-of-experts model with 46.7B parameters: about 26.49 GB on disk at Q4_K_M and roughly 30 GB of memory to run. It fits a high-memory desktop GPU or Mac.
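The two FAQ figures above allow a quick sanity check: how many bits each weight takes on disk, and how much memory is budgeted beyond the weights themselves. This is a sketch using only the rounded numbers quoted on this page. It assumes the nominal parameter counts are exact and that file size and parameter count use the same decimal prefix, so the results are approximate. The function names are hypothetical.

```python
# Figures quoted in the FAQ: parameters (billions), Q4_K_M file size (GB), memory needed (GB).
MODELS = {
    "Mistral 7B": {"params_b": 7.0, "file_gb": 4.37, "needs_gb": 8.0},
    "Mixtral 8x7B": {"params_b": 46.7, "file_gb": 26.49, "needs_gb": 30.0},
}


def bits_per_weight(params_b, file_gb):
    """Average bits stored per parameter (the 10^9 factors cancel)."""
    return file_gb * 8 / params_b


def runtime_overhead_gb(file_gb, needs_gb):
    """Memory budgeted on top of the weights (context, buffers, headroom)."""
    return needs_gb - file_gb


if __name__ == "__main__":
    for name, m in MODELS.items():
        bpw = bits_per_weight(m["params_b"], m["file_gb"])
        extra = runtime_overhead_gb(m["file_gb"], m["needs_gb"])
        print(f"{name}: {bpw:.2f} bits/weight, {extra:.2f} GB beyond the weights")
```

On these figures both models leave a similar margin of a few GB above the weight file, which is the room available for a small context.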
Sources
Memory figures are estimates at Q4_K_M. See methodology.