localmodel.run

Model family · 4 sizes

Mistral: which size runs locally?

Mistral comes in 4 sizes, from 7B to 46.7B parameters. Bigger is generally more capable but needs more memory. Here is each size with its Q4_K_M file size, the memory it needs, and the hardware that runs it.

Sizes: 4
Smallest: 7B
Largest: 46.7B
Runs from: 16GB

The Mistral lineup

"Needs" is the sourced minimum memory for Q4_K_M with a small context. Larger context needs more.

Which Mistral fits your memory

8GB: No

No Mistral size fits 8GB; even Mistral 7B needs more.

16GB: Yes

Largest that fits: Mistral Nemo 12B (12.2B), best case on Nvidia GeForce RTX 4080 (16GB).

24GB: Yes

Largest that fits: Mistral Small 3 24B (24B), best case on Nvidia GeForce RTX 4090 (24GB).

32GB: Tight

Largest that fits: Mixtral 8x7B (46.7B), best case on Nvidia GeForce RTX 5090 (32GB). Comfortable up to Mistral Small 3 24B (24B).

Best case means the most capable device at that size (usually a discrete GPU). A Mac at the same size sits roughly one rung lower; see the per-size breakdown on each memory budget page.
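
The lookup behind these budgets can be sketched in a few lines of Python. This is only an illustration, not the site's method: the table records, for each model, the smallest budget listed on this page at which it fits in the best case, and the `mac` flag models "roughly one rung lower" as a literal step down one entry, which is a simplifying assumption. The names `LINEUP` and `largest_that_fits` are hypothetical.

```python
from typing import Optional

# (model, parameters in billions, smallest budget in GB listed on this page
# at which the model fits in the best case). Mistral 7B is placed at 16GB
# because the page states that no size fits 8GB.
LINEUP = [
    ("Mistral 7B", 7.0, 16),
    ("Mistral Nemo 12B", 12.2, 16),
    ("Mistral Small 3 24B", 24.0, 24),
    ("Mixtral 8x7B", 46.7, 32),
]


def largest_that_fits(budget_gb: int, mac: bool = False) -> Optional[str]:
    """Return the largest model that fits the budget, or None if none does."""
    fitting = [name for name, _, min_gb in LINEUP if min_gb <= budget_gb]
    if mac:
        # Assumption: "one rung lower" means dropping the largest fitting size.
        fitting = fitting[:-1]
    return fitting[-1] if fitting else None


for budget in (8, 16, 24, 32):
    print(budget, largest_that_fits(budget), largest_that_fits(budget, mac=True))
```

At 32GB the Mac result is Mistral Small 3 24B, which matches the "comfortable up to" figure given for that budget.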

FAQ

Which Mistral size should I run locally?

Pick the largest size your memory allows: on 16GB, up to Mistral Nemo 12B; on 24GB, up to Mistral Small 3 24B; on 32GB, up to Mixtral 8x7B (a tight fit). These are best-case figures, usually for a discrete GPU. Smaller sizes run faster and leave headroom for context.

What is the smallest Mistral model?

Mistral 7B, at 7B parameters: about 4.37 GB on disk at Q4_K_M and roughly 8 GB of memory to run. That leaves no headroom on an 8GB machine, so 16GB is the practical starting point.

What is the largest Mistral model and what does it need?

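Mixtral 8x7B, a mixture-of-experts model with 46.7B total parameters: about 26.49 GB on disk at Q4_K_M and roughly 30 GB of memory to run. It is a tight fit on a 32GB desktop GPU such as the Nvidia GeForce RTX 5090; a Mac likely needs more than 32GB for it.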
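
To see how the on-disk and in-memory figures in these answers relate, here is a small arithmetic sketch in Python. It uses only the numbers quoted above and assumes 1 GB means 10**9 bytes and that the nominal parameter counts (7B, 46.7B) are exact; the helper names are hypothetical.

```python
def bits_per_weight(file_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter, treating 1 GB as 10**9 bytes."""
    return file_gb * 8 / params_billion


def headroom_gb(memory_gb: float, file_gb: float) -> float:
    """Memory needed beyond the weight file itself."""
    return memory_gb - file_gb


# model: (parameters in billions, Q4_K_M file size in GB, memory needed in GB)
FIGURES = {
    "Mistral 7B": (7.0, 4.37, 8.0),
    "Mixtral 8x7B": (46.7, 26.49, 30.0),
}

for name, (params_b, file_gb, memory_gb) in FIGURES.items():
    print(
        f"{name}: {bits_per_weight(file_gb, params_b):.2f} bits/weight, "
        f"{headroom_gb(memory_gb, file_gb):.2f} GB beyond the weights"
    )
```

On these figures, both models come out between about 4.5 and 5 bits per weight, with roughly 3.5 GB between the file size and the stated memory need. This page does not break down what that gap covers.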

Sources

Memory figures are estimates at Q4_K_M. See methodology.