localmodel.run


Best local video generation models

Local text-to-video and image-to-video models, ranked by peak VRAM. Video is the most memory-hungry modality, and most of these models need a discrete GPU.

Models: 11 · Lightest: ~6 GB · Heaviest: ~20 GB

Models

Sorted by peak VRAM, lightest first
  • Wan 2.1 T2V 1.3B
    832×480 (480p) · 81 frames
    ~6 GB
    Q4 GGUF
    Runs on: Mac, NVIDIA, AMD · offload floor ~5 GB

    Tools: ComfyUI, Diffusers

  • CogVideoX-2B
    720×480 · 49 frames
    ~8 GB
    fp16 + offload
    Runs on: NVIDIA, AMD · offload floor ~4 GB

    Tools: Diffusers, ComfyUI

  • Stable Video Diffusion (img2vid-XT)
    1024×576 · 25 frames
    ~8 GB
    fp16 + offload
    Runs on: Mac, NVIDIA, AMD

    Tools: ComfyUI, Diffusers

  • Wan 2.2 TI2V 5B
    1280×704 (720p) · 121 frames
    ~8 GB
    Q4 GGUF
    Runs on: Mac, NVIDIA, AMD · offload floor ~5 GB

    Tools: ComfyUI, Diffusers

  • LTX-Video 2B
    1216×704 · 121 frames
    ~10 GB
    fp8 + offload
    Runs on: Mac, NVIDIA, AMD · offload floor ~6 GB

    Tools: ComfyUI, Diffusers

  • Wan 2.1 T2V 14B
    1280×720 (720p) · 81 frames
    ~12 GB
    Q4 GGUF
    Runs on: Mac, NVIDIA, AMD · offload floor ~8 GB

    Tools: ComfyUI, Diffusers

  • CogVideoX-5B
    720×480 · 49 frames
    ~16 GB
    INT8 / fp8
    Runs on: NVIDIA, AMD · offload floor ~5 GB

    Tools: Diffusers, ComfyUI

  • HunyuanVideo
    544×960 · 129 frames
    ~16 GB
    Q4 GGUF
    Runs on: NVIDIA, AMD · offload floor ~8 GB

    Tools: ComfyUI

  • Wan 2.2 T2V A14B
    1280×720 (720p) · 81 frames
    ~16 GB
    Q4 GGUF
    Runs on: NVIDIA, AMD · offload floor ~8 GB

    Tools: ComfyUI, Diffusers

  • LTX-Video 13B
    1216×704 · 161 frames
    ~20 GB
    fp8
    Runs on: Mac, NVIDIA, AMD · offload floor ~12 GB

    Tools: ComfyUI, Diffusers

  • Mochi 1
    480×848 · 85 frames
    ~20 GB
    fp8 + offload
    Runs on: NVIDIA · offload floor ~18 GB

    Tools: ComfyUI, Diffusers

Peak VRAM is the memory a run consumes, the same basis the site uses everywhere; see the methodology. To check a model against your exact device, open its compatibility page.

FAQ

What is the most memory-efficient local video generation model?

Wan 2.1 T2V 1.3B uses the least: about 6 GB at Q4 GGUF. With CPU offload it can drop to ~5 GB, at the cost of slower generation.

How much GPU memory do I need for local video generation?

It ranges from about 6 GB to 20 GB of peak VRAM across the models here. The figure is the memory a run consumes, not the size of card you must buy, so match it to your usable VRAM with a gigabyte or two of margin.
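As a rough illustration of matching the figures to a card, the sketch below filters the models on this page by a usable-VRAM budget. The peak values are copied from the list above and are approximate; the function name `models_that_fit` and the default 2 GB margin are assumptions made for this example, not part of the site.

```python
# Approximate peak VRAM (GB) per model, copied from the list on this page.
PEAK_VRAM_GB = {
    "Wan 2.1 T2V 1.3B": 6,
    "CogVideoX-2B": 8,
    "Stable Video Diffusion (img2vid-XT)": 8,
    "Wan 2.2 TI2V 5B": 8,
    "LTX-Video 2B": 10,
    "Wan 2.1 T2V 14B": 12,
    "CogVideoX-5B": 16,
    "HunyuanVideo": 16,
    "Wan 2.2 T2V A14B": 16,
    "LTX-Video 13B": 20,
    "Mochi 1": 20,
}


def models_that_fit(usable_vram_gb, margin_gb=2.0):
    """Return the models whose peak VRAM fits with `margin_gb` to spare.

    Hypothetical helper: `usable_vram_gb` is whatever your device actually
    has free, which you need to measure yourself.
    """
    budget = usable_vram_gb - margin_gb
    return [name for name, peak in PEAK_VRAM_GB.items() if peak <= budget]


if __name__ == "__main__":
    # Example: a card with 12 GB usable leaves a 10 GB budget.
    for name in models_that_fit(12):
        print(name)
```

With 12 GB usable and the default margin, the budget is 10 GB, so the five lightest entries qualify. Treat the result as a first pass, since the page's figures are rounded.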

Is the memory figure the download size or the run size?

The run size: peak VRAM actually consumed during generation, which is the number that decides whether a model fits. Diffusion models can also offload parts to system RAM to run in less VRAM, at the cost of speed. Every figure here is sourced.
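The two figures listed per model, peak VRAM and offload floor, suggest a simple three-way check. The sketch below is one way to express it; the function name `placement` and its return strings are hypothetical, and it assumes the offload floor is the least VRAM a run can get by on when parts are moved to system RAM.

```python
def placement(usable_vram_gb, peak_gb, offload_floor_gb=None):
    """Classify a model against the VRAM you have available.

    `offload_floor_gb` is None for entries that list no offload floor
    (for example Stable Video Diffusion on this page).
    """
    if usable_vram_gb >= peak_gb:
        return "fits"
    if offload_floor_gb is not None and usable_vram_gb >= offload_floor_gb:
        return "fits with offload (slower)"
    return "does not fit"


if __name__ == "__main__":
    # CogVideoX-5B: ~16 GB peak, ~5 GB offload floor (figures from this page).
    print(placement(8, 16, 5))
```

For an 8 GB card, CogVideoX-5B falls in the middle case: it runs only with offload, so expect slower generation.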

Sources