GitHub - jianchang512/pyvideotrans: Translate videos from one language to another and embed dubbing & subtitles.

Sponsors: Recall.ai - Meeting Transcription API

If you’re looking for a transcription API for meetings, consider checking out Recall.ai, an API that works with Zoom, Google Meet, Microsoft Teams, and more.

A Powerful Open Source Video Translation / Audio Transcription / AI Dubbing / Subtitle Translation Tool

中文 | Documentation | Online Q&A


pyVideoTrans is dedicated to seamlessly converting videos from one language to another, offering a complete workflow that includes speech recognition, subtitle translation, multi-role dubbing, and audio-video synchronization. It supports both local offline deployment and a wide variety of mainstream online APIs.



✨ Core Features


🚀 Quick Start (Windows Users)

We provide a pre-packaged .exe version for Windows 10/11 users, requiring no Python environment configuration.

  1. Download: Click to download the latest pre-packaged version
  2. Unzip: Extract the compressed file to a path (e.g., D:\pyVideoTrans).
  3. Run: Double-click sp.exe inside the folder to launch.

Note:


🛠️ Source Deployment (macOS / Linux / Windows Developers)

We recommend using uv for package management for faster speed and better environment isolation.

1. Prerequisites

2. Install uv (If not installed)

macOS/Linux

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows (PowerShell)

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
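After installing, it is worth confirming that `uv` is actually reachable from your shell before continuing. A minimal sketch using only the Python standard library (the `uv` binary name comes from the commands above; everything else here is illustrative):

```python
import shutil
import subprocess

def uv_on_path():
    """Return the absolute path of the uv binary, or None if it is not installed."""
    return shutil.which("uv")

path = uv_on_path()
if path is None:
    print("uv not found; re-run the install script and restart your shell")
else:
    # Print the installed version string reported by `uv --version`
    result = subprocess.run(["uv", "--version"], capture_output=True, text=True)
    print(result.stdout.strip())
```

If `uv` is missing, the usual fix is to open a new terminal so the installer's changes to `PATH` take effect.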

3. Clone and Install

1. Clone the repository (Ensure path has no spaces/Chinese characters)

git clone https://github.com/jianchang512/pyvideotrans.git
cd pyvideotrans

2. Install dependencies (uv automatically syncs environment)

uv sync

If you need the local qwen-tts and qwen-asr channels, run uv sync --extra qwen-tts --extra qwen-asr instead.

4. Launch Software

Launch GUI:

Use CLI:

See the documentation for detailed CLI parameters

Video Translation Example

uv run cli.py --task vtv --name "./video.mp4" --source_language_code zh --target_language_code en

Audio to Subtitle Example

uv run cli.py --task stt --name "./audio.wav" --model_name large-v3
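For batches of files, the CLI can be driven from a small script. A hedged sketch: the `--task`, `--name`, and language-code flags mirror the video translation example above, but the `./videos` folder path and the looping logic are illustrative assumptions, not part of the project:

```python
import pathlib
import subprocess

def build_command(video):
    # Mirrors the video translation example above: Chinese source, English target.
    return [
        "uv", "run", "cli.py",
        "--task", "vtv",
        "--name", str(video),
        "--source_language_code", "zh",
        "--target_language_code", "en",
    ]

if __name__ == "__main__":
    # Process every .mp4 in a (hypothetical) ./videos folder, one at a time.
    for video in sorted(pathlib.Path("./videos").glob("*.mp4")):
        subprocess.run(build_command(video), check=True)
```

Running the jobs sequentially avoids contending for the GPU or the ASR model; `check=True` stops the batch on the first failure.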

5. (Optional) GPU Acceleration Configuration

If you have an NVIDIA graphics card, execute the following commands to install the CUDA-supported PyTorch version:

Uninstall CPU version

uv remove torch torchaudio

Install CUDA version (Example for CUDA 12.x)

uv add torch==2.7 torchaudio==2.7 --index-url https://download.pytorch.org/whl/cu128
uv add nvidia-cublas-cu12 nvidia-cudnn-cu12
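After switching to the CUDA build, you can verify that PyTorch actually sees the GPU. A minimal sketch, wrapped in try/except so it also reports a missing or CPU-only install rather than crashing:

```python
def cuda_status():
    """Describe whether the installed torch build can use an NVIDIA GPU."""
    try:
        import torch  # only present after the `uv add` step above
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        # Report the first visible GPU, e.g. an RTX card
        return "CUDA OK: " + torch.cuda.get_device_name(0)
    return "torch installed, but CUDA is unavailable (CPU build or missing driver)"

print(cuda_status())
```

If this reports a CPU build even after the commands above, check that your NVIDIA driver supports CUDA 12.x (`nvidia-smi` shows the supported version).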


🧩 Supported Channels & Models (Partial)

| Category | Channel/Model | Description |
| --- | --- | --- |
| ASR (Speech Recognition) | Faster-Whisper (Local) | Recommended: fast, high accuracy |
| | WhisperX / Parakeet | Supports timestamp alignment & speaker diarization |
| | Alibaba Qwen3-ASR / ByteDance Volcano | Online API, excellent for Chinese |
| Translation (LLM/MT) | DeepSeek / ChatGPT | Context-aware, more natural translation |
| | MiniMax AI | MiniMax M2.7 LLM, latest flagship model, OpenAI-compatible |
| | Google / Microsoft | Traditional machine translation, fast |
| | Ollama / M2M100 | Fully local offline translation |
| TTS (Speech Synthesis) | Edge-TTS | Microsoft free interface, natural-sounding |
| | F5-TTS / CosyVoice | Supports voice cloning; requires local deployment |
| | GPT-SoVITS / ChatTTS | High-quality open-source TTS |
| | 302.AI / OpenAI / Azure | High-quality commercial APIs |

📚 Documentation & Support

⚠️ Disclaimer

This software is an open-source, free, non-commercial project. Users are solely responsible for any legal consequences arising from the use of this software (including but not limited to calling third-party APIs or processing copyrighted video content). Please comply with local laws and regulations and the terms of use of relevant service providers.

🙏 Acknowledgements

This project mainly relies on the following open-source projects (partial):


Created by jianchang512