Efficiency · Elo per GB

The most efficient local LLMs

Capability per gigabyte, not raw score. This page ranks models by the quality they deliver for each gigabyte of memory they need, which is the metric that matters when memory is what limits you.

Coverage

Only 16 of 135 tracked models carry a public LMArena Elo, so only those can be scored. Many strong small and distilled models have no arena score and are not shown. Efficiency is Elo divided by the sourced minimum memory in GB at Q4_K_M quantization.

The efficient frontier

Each step is the smallest model whose Elo beats every model that needs less memory. Read it as a lookup: for the memory you can spare, the last step at or below that amount is the most capable scored model that fits.

Memory floor	Model	Params	Elo
~2 GB	Llama 3.2 1B	1B	1110
~3 GB	SmolLM2 1.7B	1.7B	1114
~4 GB	Gemma 3 4B	4B	1303
~10 GB	Gemma 3 12B	12B	1342
~16 GB	Mistral Small 3 24B	24B	1357
~20 GB	Gemma 3 27B	27B	1366
~21 GB	Qwen3 30B-A3B	30.5B	1383
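
As a rough illustration of how the frontier steps can be derived, here is a minimal Python sketch. It assumes the frontier is the set of models whose Elo exceeds that of every model needing the same or less memory. The `MODELS` list and the `efficient_frontier` function are names made up for this example; the rows are copied from the ranked table on this page.

```python
# (model, min_gb, elo) rows copied from the ranked table on this page.
MODELS = [
    ("Llama 3.2 1B", 2, 1110),
    ("SmolLM2 1.7B", 3, 1114),
    ("Gemma 3 4B", 4, 1303),
    ("Llama 3.2 3B", 4, 1166),
    ("Gemma 2 9B", 8, 1266),
    ("Llama 3.1 8B", 8, 1211),
    ("Mistral 7B", 8, 1149),
    ("Gemma 3 12B", 10, 1342),
    ("Phi-4 14B", 12, 1256),
    ("Mistral Small 3 24B", 16, 1357),
    ("Gemma 3 27B", 20, 1366),
    ("Qwen3 30B-A3B", 21, 1383),
    ("Gemma 2 27B", 20, 1289),
    ("Qwen3 32B", 22, 1347),
    ("Llama 3.3 70B", 48, 1318),
    ("Qwen2.5 72B", 50, 1303),
]


def efficient_frontier(models):
    """Keep each model whose Elo beats every model needing no more memory."""
    frontier = []
    best_elo = float("-inf")
    # Walk from least to most memory; at equal memory, try the higher Elo first.
    for name, gb, elo in sorted(models, key=lambda m: (m[1], -m[2])):
        if elo > best_elo:
            frontier.append((name, gb, elo))
            best_elo = elo
    return frontier


if __name__ == "__main__":
    for name, gb, elo in efficient_frontier(MODELS):
        print(f"~{gb} GB\t{name}\t{elo}")
```

Run on these rows, the sketch keeps the same seven models listed in the frontier table. Models such as Gemma 2 9B drop out because a model with a lower memory floor (Gemma 3 4B) already scores higher.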

Ranked by Elo per GB

#	Model	Min GB	Elo	Elo/GB
1	Llama 3.2 1B	2	1110	555
2	SmolLM2 1.7B	3	1114	371
3	Gemma 3 4B	4	1303	326
4	Llama 3.2 3B	4	1166	292
5	Gemma 2 9B	8	1266	158
6	Llama 3.1 8B	8	1211	151
7	Mistral 7B	8	1149	144
8	Gemma 3 12B	10	1342	134
9	Phi-4 14B	12	1256	105
10	Mistral Small 3 24B	16	1357	85
11	Gemma 3 27B	20	1366	68
12	Qwen3 30B-A3B	21	1383	66
13	Gemma 2 27B	20	1289	64
14	Qwen3 32B	22	1347	61
15	Llama 3.3 70B	48	1318	27
16	Qwen2.5 72B	50	1303	26

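Tiny models top this ranking because they ask for almost no memory, not because they score well: the top-ranked Llama 3.2 1B has the lowest Elo in the table. For raw quality regardless of size, see the Elo leaderboard; to fit a specific machine, start from your memory budget.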
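
The Elo/GB column can be reproduced with a few lines of Python. This is only a sketch: `ROWS` holds a handful of rows from the table rather than the full list, and `rank_by_elo_per_gb` is a hypothetical helper name, not something the site publishes.

```python
# A few (model, min_gb, elo) rows from the ranked table; not the full list.
ROWS = [
    ("Qwen3 30B-A3B", 21, 1383),
    ("Llama 3.2 1B", 2, 1110),
    ("Llama 3.3 70B", 48, 1318),
    ("Gemma 3 4B", 4, 1303),
    ("Gemma 2 9B", 8, 1266),
]


def rank_by_elo_per_gb(rows):
    """Attach Elo / min GB to each row and sort from most to least efficient."""
    scored = [(name, gb, elo, elo / gb) for name, gb, elo in rows]
    return sorted(scored, key=lambda r: r[3], reverse=True)


if __name__ == "__main__":
    ranked = rank_by_elo_per_gb(ROWS)
    for rank, (name, gb, elo, eff) in enumerate(ranked, start=1):
        print(f"{rank}\t{name}\t{gb}\t{elo}\t{eff:.0f}")
```

Because the memory floor is the denominator, halving the memory doubles the score at the same Elo, which is why the 2 GB model leads despite its low rating.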

FAQ

What is the most efficient local LLM?

By capability per gigabyte, Llama 3.2 1B (1B) leads: an LMArena Elo of 1110 on roughly 2 GB of memory at Q4_K_M. Tiny models win this metric because they ask for so little memory; the efficient frontier above shows the most capable scored model at each memory floor.

Does this cover every model?

No. Only 16 of 135 tracked models carry a public LMArena Elo, so only those can be scored for efficiency. Many strong small and distilled models have no arena score and are not shown here; this is a ranking of what can be measured, not the whole catalog.

Why measure Elo per gigabyte?

Because on local hardware, memory is the binding constraint. Elo per GB answers a practical question: for the memory you can spare, what is the most capable model that fits? It rewards models that punch above their size.
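
The practical question above amounts to a lookup against the frontier table. A minimal Python sketch follows, assuming the memory floors are treated as exact integers (the page lists them as approximate) and that no headroom is reserved for context or the operating system. `FRONTIER` and `best_model_for` are illustrative names only.

```python
from bisect import bisect_right

# (memory floor in GB, model, Elo) rows from the efficient frontier table,
# sorted by memory floor.
FRONTIER = [
    (2, "Llama 3.2 1B", 1110),
    (3, "SmolLM2 1.7B", 1114),
    (4, "Gemma 3 4B", 1303),
    (10, "Gemma 3 12B", 1342),
    (16, "Mistral Small 3 24B", 1357),
    (20, "Gemma 3 27B", 1366),
    (21, "Qwen3 30B-A3B", 1383),
]


def best_model_for(budget_gb):
    """Return the last frontier step at or below the budget, or None if nothing fits."""
    floors = [gb for gb, _, _ in FRONTIER]
    i = bisect_right(floors, budget_gb)  # number of steps with floor <= budget
    return FRONTIER[i - 1] if i else None


if __name__ == "__main__":
    for budget in (1, 8, 16, 64):
        print(budget, best_model_for(budget))
```

With an 8 GB budget, for example, the lookup lands on Gemma 3 4B, since the next step (Gemma 3 12B) needs about 10 GB. In practice you would likely subtract some headroom from the budget before the lookup.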

Sources

Elo is a snapshot from LMArena around 2026-08-03 and drifts over time. Memory is the sourced Q4_K_M minimum; see methodology.