TorchX
TorchX is a universal job launcher for PyTorch applications. It is designed for fast iteration during training and research, and it supports end-to-end (E2E) production ML pipelines when you're ready.
TorchX currently supports:
- Kubernetes (EKS, GKE, AKS, etc.)
- Slurm
- AWS Batch
- Docker
- Local
Need a scheduler not listed? Let us know!
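As a rough illustration of how the backends listed above map onto a launch command, here is a minimal sketch that builds a `torchx run` command line without executing it. The scheduler identifiers in `SCHEDULER_IDS`, the `build_run_command` helper, and the `my_app.py` script name are assumptions made for this example and do not come from this README; check the identifiers against the scheduler documentation for your TorchX version.

```python
import shlex

# Assumed mapping from the backends listed above to TorchX scheduler
# identifiers; verify these against the scheduler docs for your version.
SCHEDULER_IDS = {
    "Kubernetes": "kubernetes",
    "Slurm": "slurm",
    "AWS Batch": "aws_batch",
    "Docker": "local_docker",
    "Local": "local_cwd",
}


def build_run_command(backend, script, *script_args):
    """Return the argv list for launching `script` on the given backend."""
    if backend not in SCHEDULER_IDS:
        raise ValueError(f"unsupported backend: {backend!r}")
    return [
        "torchx", "run",
        "--scheduler", SCHEDULER_IDS[backend],
        "utils.python",  # component that runs a Python script
        "--script", script,
        *script_args,
    ]


if __name__ == "__main__":
    # my_app.py is a hypothetical script name used only for illustration.
    print(shlex.join(build_run_command("Local", "my_app.py", "hello")))
```

Only the `--scheduler` value changes between backends in this sketch. Real deployments usually also need scheduler-specific options such as a namespace, queue, or image, which are omitted here.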
Quickstart
See the quickstart guide.
Documentation
Requirements
torchx:
Certain schedulers may have additional, scheduler-specific requirements. See installation for info.
Installation
Stable
# install torchx SDK and CLI -- minimum dependencies
pip install torchx
# install with all dependencies
pip install "torchx[dev]"
# install torchx Kubernetes / Volcano support
pip install "torchx[kubernetes]"
# install torchx GCP Batch support
pip install "torchx[gcp_batch]"
Nightly
# install torchx SDK and CLI
pip install "torchx-nightly[dev]"
Source (Development)
# clone the repo
git clone https://github.com/meta-pytorch/torchx.git
cd torchx
# install with uv (recommended for development)
uv sync --extra dev
# or with pip
pip install -e ".[dev]"
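After any of the install paths above, it can be useful to confirm which TorchX distribution actually ended up in the environment. The following is a minimal sketch using only the Python standard library. The `installed_version` helper is a hypothetical name introduced for this example, and it assumes the stable and nightly builds are registered under the distribution names `torchx` and `torchx-nightly`, as the `pip install` commands suggest.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: stable and nightly builds use these distribution names.
    for name in ("torchx", "torchx-nightly"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

If both names report a version, the two builds may be shadowing each other, and uninstalling one before continuing is probably the safer choice.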
Docker
TorchX provides a Docker container for use as part of a TorchX role.
See: https://github.com/meta-pytorch/torchx/pkgs/container/torchx
Contributing
We welcome PRs! See the CONTRIBUTING file.
License
TorchX is BSD-licensed, as found in the LICENSE file.