Efficiency · Elo per GB
The most efficient local LLMs
Capability per gigabyte, not raw score. This page ranks models by how much quality they deliver per gigabyte of memory, the metric that matters when memory is the limiting factor.
Only 16 of 67 tracked models carry a public LMArena Elo, so only those can be scored. Many strong small and distilled models have no arena score and are not shown. Efficiency uses the sourced minimum memory at Q4_K_M.
The efficient frontier
Each step is the smallest model that beats every cheaper one on Elo. If you have this much memory to spare, this is the most capable scored model that fits it.
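- ~2 GB: Llama 3.2 1B (1B parameters), Elo 1110
- ~3 GB: SmolLM2 1.7B (1.7B parameters), Elo 1114
- ~4 GB: Gemma 3 4B (4B parameters), Elo 1303
- ~10 GB: Gemma 3 12B (12B parameters), Elo 1342
- ~16 GB: Mistral Small 3 24B (24B parameters), Elo 1357
- ~20 GB: Gemma 3 27B (27B parameters), Elo 1366
- ~21 GB: Qwen3 30B-A3B (30.5B parameters), Elo 1383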
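The frontier rule above can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual implementation: the `(min_gb, elo)` pairs are copied from the ranking table on this page, and the function name `efficient_frontier` is an assumption made for the example.

```python
# (min_gb, elo) pairs copied from the ranking table on this page.
SCORED = [
    (2, 1110), (3, 1114), (4, 1303), (4, 1166),
    (8, 1266), (8, 1211), (8, 1149), (10, 1342),
    (12, 1256), (16, 1357), (20, 1366), (21, 1383),
    (20, 1289), (22, 1347), (48, 1318), (50, 1303),
]


def efficient_frontier(models):
    """Keep each model whose Elo beats every model needing less (or equal) memory.

    Hypothetical helper: walk the models from cheapest to most expensive and
    keep one only when it sets a new best Elo.
    """
    frontier = []
    best_elo = float("-inf")
    # Sort by memory ascending; at equal memory, try the higher Elo first.
    for min_gb, elo in sorted(models, key=lambda m: (m[0], -m[1])):
        if elo > best_elo:
            frontier.append((min_gb, elo))
            best_elo = elo
    return frontier


for min_gb, elo in efficient_frontier(SCORED):
    print(f"~{min_gb} GB: Elo {elo}")
```

Run on the table's data, this yields the same seven steps as the list above: every other scored model is matched or beaten by something that needs no more memory.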
Ranked by Elo per GB
| # | Model | Min GB | Elo | Elo/GB |
|---|---|---|---|---|
| 1 | Llama 3.2 1B | 2 | 1110 | 555 |
| 2 | SmolLM2 1.7B | 3 | 1114 | 371 |
| 3 | Gemma 3 4B | 4 | 1303 | 326 |
| 4 | | 4 | 1166 | 292 |
| 5 | | 8 | 1266 | 158 |
| 6 | | 8 | 1211 | 151 |
| 7 | | 8 | 1149 | 144 |
| 8 | Gemma 3 12B | 10 | 1342 | 134 |
| 9 | | 12 | 1256 | 105 |
| 10 | Mistral Small 3 24B | 16 | 1357 | 85 |
| 11 | Gemma 3 27B | 20 | 1366 | 68 |
| 12 | Qwen3 30B-A3B | 21 | 1383 | 66 |
| 13 | | 20 | 1289 | 64 |
| 14 | | 22 | 1347 | 61 |
| 15 | | 48 | 1318 | 27 |
| 16 | | 50 | 1303 | 26 |
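The Elo/GB column is simply Elo divided by minimum memory in gigabytes, rounded to a whole number. A minimal sketch of that arithmetic, using the seven models named in the frontier list on this page (the function names are assumptions for illustration, and the page's exact rounding rule is not stated):

```python
# (name, min_gb, elo) for the models named in the efficient frontier.
MODELS = [
    ("Qwen3 30B-A3B", 21, 1383),
    ("Gemma 3 27B", 20, 1366),
    ("Mistral Small 3 24B", 16, 1357),
    ("Gemma 3 12B", 10, 1342),
    ("Gemma 3 4B", 4, 1303),
    ("SmolLM2 1.7B", 3, 1114),
    ("Llama 3.2 1B", 2, 1110),
]


def elo_per_gb(elo, min_gb):
    """Efficiency score: arena Elo divided by minimum memory in GB."""
    return elo / min_gb


def rank_by_efficiency(models):
    """Return (name, min_gb, elo, ratio) rows sorted by Elo/GB, best first."""
    rows = [(name, gb, elo, elo_per_gb(elo, gb)) for name, gb, elo in models]
    return sorted(rows, key=lambda row: row[3], reverse=True)


for rank, (name, gb, elo, ratio) in enumerate(rank_by_efficiency(MODELS), start=1):
    print(f"{rank}. {name}: {elo} / {gb} GB = {ratio:.0f}")
```

Because the divisor grows much faster than Elo does across this range, the ordering is almost the reverse of the raw Elo ordering.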
Tiny models top this by asking for almost no memory. For raw quality regardless of size, see the Elo leaderboard; to fit a specific machine, start from your memory budget.
FAQ
What is the most efficient local LLM?
By capability per gigabyte, Llama 3.2 1B (1B) leads: an LMArena Elo of 1110 on roughly 2 GB of memory at Q4_K_M. Tiny models win this metric because they ask for so little memory; the efficient frontier above shows the most capable scored model at each memory level.
Does this cover every model?
No. Only 16 of 67 tracked models carry a public LMArena Elo, so only those can be scored for efficiency. Many strong small and distilled models have no arena score and are not shown here; this is a ranking of what can be measured, not the whole catalog.
Why measure Elo per gigabyte?
Because on local hardware, memory is the binding constraint. Elo per GB answers a practical question: for the memory you can spare, what is the most capable model that fits? It rewards models that punch above their size.
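To make that question concrete, here is a hedged sketch of a budget lookup over the efficient frontier listed on this page. The `best_fit` helper is hypothetical, and the memory figures are the page's approximate Q4_K_M minimums, so treat the result as a starting point rather than a guarantee that a model will fit.

```python
from bisect import bisect_right

# (min_gb, name, elo) steps of the efficient frontier, sorted by memory.
FRONTIER = [
    (2, "Llama 3.2 1B", 1110),
    (3, "SmolLM2 1.7B", 1114),
    (4, "Gemma 3 4B", 1303),
    (10, "Gemma 3 12B", 1342),
    (16, "Mistral Small 3 24B", 1357),
    (20, "Gemma 3 27B", 1366),
    (21, "Qwen3 30B-A3B", 1383),
]


def best_fit(budget_gb):
    """Return the most capable frontier model whose minimum memory fits the budget.

    Returns None when the budget is below the smallest frontier step.
    """
    thresholds = [min_gb for min_gb, _, _ in FRONTIER]
    # Index of the first step that needs MORE memory than the budget.
    idx = bisect_right(thresholds, budget_gb)
    return FRONTIER[idx - 1] if idx else None


print(best_fit(12))  # a 12 GB budget lands on the ~10 GB step
print(best_fit(1))   # below the smallest step, nothing fits
```

Since Elo only increases along the frontier, the last step that fits is also the most capable one, which is why a single binary search is enough.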
Sources
Elo is a snapshot from LMArena around 2026-06-15 and drifts over time. Memory is the sourced Q4_K_M minimum; see methodology.