localmodel.run

Memory budget · 24 GB

Best local LLMs for 24GB

24GB is not a single ceiling. A 24GB Mac and a 24GB GPU each leave a different amount free for model weights, so the largest model you can run changes with the memory type, not just the number.

Usable range: 16–23 GB
Models that fit: 58
Memory types: 2
Top pick: 32B

What 24GB actually gives you

Usable figures are sourced per device (tap a card for the full profile). Verdicts below use Q4_K_M, the community-default quant.

Top pick for 24GB Q4_K_M

Granite 4.0 H Small (32B) runs comfortably on the most capable 24GB setup (Nvidia GeForce RTX 4090 (24GB), ~23 GB usable) at ~20.4 GB. Check it against your exact device on its model page.

Models ranked for 24GB


Each chip links to the full breakdown for that model on a real 24GB device. "Tight" means the model fits but with little headroom, so close other apps before loading it.
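As a rough illustration of how a "comfortable", "tight" or "too large" verdict could be derived, here is a minimal sketch. The function name `verdict` and the 10% headroom threshold are assumptions made for this example, not the site's actual rule; only the ~20.4 GB model size and the ~23 GB and ~16 GB usable figures come from this page.

```python
# Hypothetical threshold: a model counts as "comfortable" when it leaves
# at least 10% of the usable budget free. This cutoff is an assumption.
HEADROOM_FRACTION = 0.10


def verdict(model_gb: float, usable_gb: float) -> str:
    """Classify how a model of model_gb fits into usable_gb of memory."""
    if model_gb > usable_gb:
        return "too large"
    if model_gb <= usable_gb * (1 - HEADROOM_FRACTION):
        return "comfortable"
    return "tight"


# Figures from this page: ~20.4 GB top pick, ~23 GB usable on the RTX 4090,
# ~16 GB usable on the M4 Pro.
print(verdict(20.4, 23.0))
print(verdict(20.4, 16.0))
```

Under that assumed threshold the top pick lands on the comfortable side for the RTX 4090 and does not fit the M4 Pro, which matches the per-device ceilings described on this page.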

The ceiling, per memory type

Apple M4 Pro (24GB) (~16 GB usable)

Runs up to gpt-oss 20B (21B) comfortably at Q4_K_M. Larger models either sit tight or spill past the ~16 GB it can give a model.

Nvidia GeForce RTX 4090 (24GB) (~23 GB usable)

Runs up to Granite 4.0 H Small (32B) comfortably at Q4_K_M. Larger models either sit tight or spill past the ~23 GB it can give a model.

Too large for any 24GB device

FAQ

How much of 24GB can a model actually use?

It depends on the memory type. Apple unified memory: about 16 GB (Apple M4 Pro (24GB)); GPU VRAM: about 23 GB (Nvidia GeForce RTX 4090 (24GB)). The rest is reserved for the OS, display and runtime overhead.

What is the best local LLM for 24GB?

Granite 4.0 H Small (32B) is the strongest model that runs comfortably at Q4_K_M on the most capable 24GB setup (Nvidia GeForce RTX 4090 (24GB), ~23 GB usable). On a tighter 24GB device the ceiling is lower, shown per row above.

Why does a 24GB GPU fit a bigger model than a 24GB Mac?

A discrete GPU gives almost all of its VRAM to the model, minus about 1 GB for the driver. Apple Silicon shares one unified pool with macOS, so roughly 66% (about 16 GB of 24 GB) is available to the GPU for weights. Same 24GB sticker, different usable budget, so the model ceiling differs.
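The arithmetic behind those two budgets can be sketched in a few lines. This is a simplified estimate that uses only the figures quoted in the answer above (about 1 GB held back on a discrete GPU, roughly 66% available on Apple unified memory); the function name `usable_gb` and the memory-type labels are made up for illustration, and real devices will vary.

```python
def usable_gb(total_gb: float, memory_type: str) -> float:
    """Estimate how much memory a model can use, per the rules of thumb above."""
    if memory_type == "gpu_vram":
        # Discrete GPU: nearly everything, minus ~1 GB for the driver.
        return total_gb - 1.0
    if memory_type == "apple_unified":
        # Apple Silicon: roughly 66% of the shared pool goes to the GPU.
        return total_gb * 0.66
    raise ValueError(f"unknown memory type: {memory_type}")


for kind in ("gpu_vram", "apple_unified"):
    print(kind, round(usable_gb(24, kind), 2), "GB usable")
```

For 24 GB this gives 23 GB on the GPU and 15.84 GB on the Mac, which rounds to the ~16 GB quoted for the M4 Pro.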

Sources

Memory figures are estimates at Q4_K_M with a small context. See methodology.