Integrations — Stable Baselines3 2.7.0a0 documentation

Weights & Biases

Weights & Biases provides a callback for experiment tracking that allows you to visualize and share results.

The full documentation is available here: https://docs.wandb.ai/guides/integrations/other/stable-baselines-3

import gymnasium as gym
import wandb
from wandb.integration.sb3 import WandbCallback

from stable_baselines3 import PPO

config = {
    "policy_type": "MlpPolicy",
    "total_timesteps": 25000,
    "env_id": "CartPole-v1",
}
run = wandb.init(
    project="sb3",
    config=config,
    sync_tensorboard=True,  # auto-upload sb3's tensorboard metrics
    # monitor_gym=True,  # auto-upload the videos of agents playing the game
    # save_code=True,  # optional
)

model = PPO(config["policy_type"], config["env_id"], verbose=1, tensorboard_log=f"runs/{run.id}")
model.learn(
    total_timesteps=config["total_timesteps"],
    callback=WandbCallback(
        model_save_path=f"models/{run.id}",
        verbose=2,
    ),
)
run.finish()
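The commented-out monitor_gym option only uploads videos if the environment actually records them. As a minimal sketch (not part of the official example; the folder names and trigger values below are illustrative, and the environment must be created with render_mode="rgb_array"), you could wrap the vectorized environment in VecVideoRecorder so that short clips are saved and picked up by Weights & Biases:

import gymnasium as gym
import wandb
from wandb.integration.sb3 import WandbCallback

from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecVideoRecorder

run = wandb.init(project="sb3", sync_tensorboard=True, monitor_gym=True)

# render_mode="rgb_array" lets frames be captured off-screen for the recorder
env = DummyVecEnv([lambda: gym.make("CartPole-v1", render_mode="rgb_array")])
# Record a 200-step clip every 2000 steps (trigger and length are arbitrary choices)
env = VecVideoRecorder(
    env,
    f"videos/{run.id}",
    record_video_trigger=lambda step: step % 2000 == 0,
    video_length=200,
)

model = PPO("MlpPolicy", env, verbose=1, tensorboard_log=f"runs/{run.id}")
model.learn(total_timesteps=10_000, callback=WandbCallback(verbose=2))
run.finish()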

Hugging Face 🤗

The Hugging Face Hub 🤗 is a central place where anyone can share and explore models. It allows you to host your saved models 💾.

You can see the list of stable-baselines3 saved models here: https://huggingface.co/models?library=stable-baselines3
Most of them are available via the RL Zoo.

Official pre-trained models are saved in the SB3 organization on the hub: https://huggingface.co/sb3

We wrote a tutorial on how to use the 🤗 Hub and Stable-Baselines3 here.

Installation

pip install huggingface_sb3

Note

If you use the RL Zoo, pushing and loading models from the hub is already integrated:

Download model and save it into the logs/ folder

Only use TRUST_REMOTE_CODE=True with HF models that can be trusted (here the SB3 organization)

TRUST_REMOTE_CODE=True python -m rl_zoo3.load_from_hub --algo a2c --env LunarLander-v3 -orga sb3 -f logs/

Test the agent

python -m rl_zoo3.enjoy --algo a2c --env LunarLander-v3 -f logs/

Push model, config and hyperparameters to the hub

python -m rl_zoo3.push_to_hub --algo a2c --env LunarLander-v3 -f logs/ -orga sb3 -m "Initial commit"

Download a model from the Hub

You need to copy the repo-id that contains your saved model. For instance sb3/demo-hf-CartPole-v1:

import os

import gymnasium as gym

from huggingface_sb3 import load_from_hub
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

# Allow the use of pickle.load() when downloading model from the hub
# Please make sure that the organization from which you download can be trusted
os.environ["TRUST_REMOTE_CODE"] = "True"

# Retrieve the model from the hub
# repo_id = id of the model repository from the Hugging Face Hub (repo_id = {organization}/{repo_name})
# filename = name of the model zip file from the repository
checkpoint = load_from_hub(
    repo_id="sb3/demo-hf-CartPole-v1",
    filename="ppo-CartPole-v1.zip",
)
model = PPO.load(checkpoint)

# Evaluate the agent and watch it
eval_env = gym.make("CartPole-v1")
mean_reward, std_reward = evaluate_policy(
    model, eval_env, render=True, n_eval_episodes=5, deterministic=True, warn=False
)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward}")

To download a model, you need to define two parameters:

  1. repo_id: the id of the model repository on the Hugging Face Hub ({organization}/{repo_name}), for example sb3/demo-hf-CartPole-v1.
  2. filename: the name of the model zip file inside that repository, for example ppo-CartPole-v1.zip.

Upload a model to the Hub

You can easily upload your models using two different functions:

  1. package_to_hub(): save the model, evaluate it, generate a model card and record a replay video of your agent before pushing the complete repo to the Hub.
  2. push_to_hub(): simply push a file to the Hub.

First, you need to be logged in to Hugging Face to upload a model:

from huggingface_hub import notebook_login

notebook_login()
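If you are not running inside a notebook, you can log in from a terminal instead (this is the standard Hugging Face CLI command, not something specific to SB3):

huggingface-cli login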

Then, in this example, we train a PPO agent to play CartPole-v1 and push it to a new repo sb3/demo-hf-CartPole-v1.

With package_to_hub()

from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

from huggingface_sb3 import package_to_hub

# Create the environment
env_id = "CartPole-v1"
env = make_vec_env(env_id, n_envs=1)

# Create the evaluation environment
eval_env = make_vec_env(env_id, n_envs=1)

# Instantiate the agent
model = PPO("MlpPolicy", env, verbose=1)

# Train the agent
model.learn(total_timesteps=int(5000))

# This method saves, evaluates, generates a model card and records a replay video
# of your agent before pushing the repo to the hub
package_to_hub(
    model=model,
    model_name="ppo-CartPole-v1",
    model_architecture="PPO",
    env_id=env_id,
    eval_env=eval_env,
    repo_id="sb3/demo-hf-CartPole-v1",
    commit_message="Test commit",
)

You need to define seven parameters:

  1. model: the trained model to upload.
  2. model_name: the name of the saved model (here ppo-CartPole-v1).
  3. model_architecture: the algorithm used (here PPO).
  4. env_id: the id of the environment.
  5. eval_env: the environment used to evaluate the agent.
  6. repo_id: the id of the model repository on the Hugging Face Hub ({organization}/{repo_name}).
  7. commit_message: the message for the commit.

With push_to_hub()

from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

from huggingface_sb3 import push_to_hub

# Create the environment
env_id = "CartPole-v1"
env = make_vec_env(env_id, n_envs=1)

# Instantiate the agent
model = PPO("MlpPolicy", env, verbose=1)

# Train the agent
model.learn(total_timesteps=int(5000))

# Save the model
model.save("ppo-CartPole-v1")

# Push this saved model .zip file to the hf repo
# If this repo does not exist it will be created
# repo_id = id of the model repository from the Hugging Face Hub (repo_id = {organization}/{repo_name})
# filename: the name of the file == "name" inside model.save("ppo-CartPole-v1")
push_to_hub(
    repo_id="sb3/demo-hf-CartPole-v1",
    filename="ppo-CartPole-v1.zip",
    commit_message="Added CartPole-v1 model trained with PPO",
)

You need to define three parameters:

  1. repo_id: the id of the model repository on the Hugging Face Hub ({organization}/{repo_name}).
  2. filename: the name of the saved model zip file (here ppo-CartPole-v1.zip).
  3. commit_message: the message for the commit.

MLflow

If you want to use MLflow to track your SB3 experiments, you can adapt the following code, which defines a custom logger output:

import sys
from typing import Any, Dict, Tuple, Union

import mlflow
import numpy as np

from stable_baselines3 import SAC
from stable_baselines3.common.logger import HumanOutputFormat, KVWriter, Logger


class MLflowOutputFormat(KVWriter):
    """
    Dumps key/value pairs into MLflow's numeric format.
    """

    def write(
        self,
        key_values: Dict[str, Any],
        key_excluded: Dict[str, Union[str, Tuple[str, ...]]],
        step: int = 0,
    ) -> None:
        for (key, value), (_, excluded) in zip(
            sorted(key_values.items()), sorted(key_excluded.items())
        ):
            if excluded is not None and "mlflow" in excluded:
                continue

            if isinstance(value, np.ScalarType):
                if not isinstance(value, str):
                    mlflow.log_metric(key, value, step)


loggers = Logger(
    folder=None,
    output_formats=[HumanOutputFormat(sys.stdout), MLflowOutputFormat()],
)

with mlflow.start_run():
    model = SAC("MlpPolicy", "Pendulum-v1", verbose=2)
    # Set custom logger
    model.set_logger(loggers)
    model.learn(total_timesteps=10000, log_interval=1)
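A possible extension (a sketch reusing the loggers object and imports from the snippet above; the run name and parameter keys below are only illustrative): log the hyperparameters of the run alongside the metrics, then inspect everything in the local MLflow UI.

with mlflow.start_run(run_name="sac_pendulum"):
    # Log a few hyperparameters next to the metrics written by MLflowOutputFormat
    mlflow.log_params({"algo": "SAC", "env_id": "Pendulum-v1", "total_timesteps": 10000})
    model = SAC("MlpPolicy", "Pendulum-v1", verbose=2)
    model.set_logger(loggers)
    model.learn(total_timesteps=10000, log_interval=1)

Start the tracking UI with the mlflow ui command and open the printed local address to browse the run.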