GitHub - hao-ai-lab/FastVideo: FastVideo is a unified framework for accelerated video generation.

FastVideo is a unified framework for accelerated video generation.

It features a clean, consistent API that works across popular video models, making it easier for developers to author new models and incorporate system- or kernel-level optimizations. With FastVideo's optimizations, inference can run more than 3x faster than on other systems.

Key Features

FastVideo has the following features:

State-of-the-art performance optimizations for inference
Cutting-edge models
- Wan2.1 T2V, I2V
- HunyuanVideo
- FastHunyuan: consistency-distilled video diffusion models for an 8x inference speedup.
- StepVideo T2V
Distillation support
- Recipes for video DiT, based on PCM.
- Support for distilling, finetuning, and running inference on state-of-the-art open video DiTs: Mochi and Hunyuan.
Scalable training with FSDP, sequence parallelism, and selective activation checkpointing, with near-linear scaling to 64 GPUs.
Memory-efficient finetuning with LoRA, precomputed latents, and precomputed text embeddings.
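
To give a rough sense of why LoRA finetuning saves memory, the sketch below counts trainable weights for a single linear layer. It is not FastVideo's implementation: the function names, the layer size, and the rank are illustrative assumptions. LoRA freezes the original d_out x d_in weight matrix and trains two low-rank factors of shapes d_out x r and r x d_in instead.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_in, d_out, rank = 3072, 3072, 16
    full = full_finetune_params(d_in, d_out)
    lora = lora_params(d_in, d_out, rank)
    print(f"full finetuning: {full:,} trainable weights")
    print(f"LoRA (rank {rank}): {lora:,} trainable weights ({lora / full:.2%} of full)")
```

Optimizer state and gradients scale with the number of trainable weights, so the saving applies to those as well; the frozen base weights still have to fit in memory.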

Getting Started

We recommend using an environment manager such as Conda to create a clean environment:

Create and activate a new conda environment

conda create -n fastvideo python=3.12
conda activate fastvideo

Install FastVideo

pip install fastvideo

Please see our docs for more detailed installation instructions.
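
After installing, it can be useful to confirm that the package is visible from the active environment. The snippet below is a minimal sketch that uses only the Python standard library; the helper name check_environment is hypothetical and is not part of FastVideo.

```python
import importlib.metadata
import importlib.util
import sys


def check_environment(package: str = "fastvideo") -> dict:
    """Report the running Python version and whether `package` can be imported."""
    report = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "installed": importlib.util.find_spec(package) is not None,
        "version": None,
    }
    if report["installed"]:
        try:
            report["version"] = importlib.metadata.version(package)
        except importlib.metadata.PackageNotFoundError:
            # Importable but without distribution metadata,
            # for example a source checkout on sys.path.
            pass
    return report


if __name__ == "__main__":
    print(check_environment())
```

If "installed" is False, the most likely cause is that the conda environment created above is not the one currently activated.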

Inference

Generating Your First Video

Here's a minimal example to generate a video using the default settings. Create a file called example.py with the following code:

from fastvideo import VideoGenerator


def main():
    # Create a video generator with a pre-trained model
    generator = VideoGenerator.from_pretrained(
        "Wan-AI/Wan2.1-T2V-1.3B-Diffusers",
        num_gpus=1,  # Adjust based on your hardware
    )

    # Define a prompt for your video
    prompt = "A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest."

    # Generate the video
    video = generator.generate_video(
        prompt,
        return_frames=True,  # Also return frames from this call (defaults to False)
        output_path="my_videos/",  # Controls where videos are saved
        save_video=True,
    )


if __name__ == "__main__":
    main()

Run the script with:

python example.py

For a more detailed guide, please see our inference quick start.
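
To generate videos for several prompts, one option is to wrap the generate_video call from the example above in a small loop. The helper below is a sketch, not part of FastVideo: the name generate_for_prompts and the per-prompt subdirectory layout are assumptions made for illustration. It uses only the generate_video arguments shown above (the prompt, output_path, and save_video) and accepts any object that exposes that method.

```python
from pathlib import Path


def generate_for_prompts(generator, prompts, output_root="my_videos"):
    """Generate one video per prompt, each saved under its own subdirectory.

    `generator` is any object exposing
    generate_video(prompt, output_path=..., save_video=...),
    such as the VideoGenerator created in the example above.
    """
    results = []
    for index, prompt in enumerate(prompts):
        # Hypothetical layout: my_videos/prompt_000/, my_videos/prompt_001/, ...
        out_dir = Path(output_root) / f"prompt_{index:03d}"
        out_dir.mkdir(parents=True, exist_ok=True)
        result = generator.generate_video(
            prompt,
            output_path=f"{out_dir}/",
            save_video=True,
        )
        results.append((prompt, out_dir, result))
    return results
```

With the generator from example.py, calling generate_for_prompts(generator, [prompt_a, prompt_b]) reuses the loaded model across prompts instead of reloading it for each one.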

Other docs:

Distillation and Finetuning

📑 Development Plan

Support for more models
- Add StepVideo to V1
Optimization features
- TeaCache in V1
- SageAttention in V1
Code updates
- V1 Configuration API
- Support Training in V1

🤝 Contributing

We welcome all contributions. Please see our contributing guide.

Acknowledgement

We learned and reused code from the following projects:

We thank MBZUAI and Anyscale for their support throughout this project.

Citation

If you use FastVideo for your research, please cite our papers:

@misc{zhang2025fastvideogenerationsliding,
      title={Fast Video Generation with Sliding Tile Attention},
      author={Peiyuan Zhang and Yongqi Chen and Runlong Su and Hangliang Ding and Ion Stoica and Zhenghong Liu and Hao Zhang},
      year={2025},
      eprint={2502.04507},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.04507},
}

@misc{ding2025efficientvditefficientvideodiffusion,
      title={Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile},
      author={Hangliang Ding and Dacheng Li and Runlong Su and Peiyuan Zhang and Zhijie Deng and Ion Stoica and Hao Zhang},
      year={2025},
      eprint={2502.06155},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.06155},
}