GitHub - skypilot-org/mock-train-workflow: Mock training workflow defined as SkyPilot YAMLs
Example training workflow on SkyPilot
This repo contains an example training workflow consisting of three simple SkyPilot tasks:
data_preprocessing: Ingests data and writes it to a bucket.
train: Trains a model on the data.
eval: Evaluates the model. Can optionally be run on the same cluster as train.
These tasks don't actually run any code, but they demonstrate how to structure a training workflow on SkyPilot.
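As a rough illustration of what such a mock task might look like, the sketch below writes a minimal SkyPilot task file from the shell. The file name example_mock_task.yaml and its contents are assumptions made for illustration; they are not copied from this repo's YAMLs.

```shell
# Hypothetical example: write a minimal mock task file.
# The file name and every field value here are illustrative assumptions.
cat > example_mock_task.yaml <<'EOF'
name: train

resources:
  cpus: 2+

run: |
  echo "Pretending to train a model..."
EOF

# Show the file that was written.
cat example_mock_task.yaml
```

The run section only echoes a message, which matches the idea of a task that defines structure without doing real work.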
Usage
When developing the workflow, you can run the tasks independently using sky launch:
$ sky launch -c data data_preprocessing.yaml ...
The train and eval steps can be run in a similar way. The eval step can also run on the same cluster as the train step by passing that cluster's name (the one given to -c, short for --cluster) to sky exec:
$ sky launch -c train train.yaml ...
# Run eval on the same cluster as train
$ sky exec train eval.yaml
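The three commands above can be chained into one script. This is a minimal sketch, not part of the repo: the run and run_workflow helpers and the DRY_RUN variable are hypothetical names, and the commands are repeated with no extra options. By default the script only prints each command; set DRY_RUN=0 to actually invoke sky.

```shell
#!/bin/sh
# Sketch: run the mock workflow steps in order, stopping at the first failure.
# DRY_RUN, run, and run_workflow are hypothetical names used for illustration.

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run_workflow() {
  # Each step runs only if the previous one succeeded.
  run sky launch -c data data_preprocessing.yaml &&
  run sky launch -c train train.yaml &&
  # Reuse the existing "train" cluster for evaluation.
  run sky exec train eval.yaml
}

run_workflow
```

Keeping the dry-run default makes it easy to check the command sequence before any cluster is provisioned.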