TRL - Transformer Reinforcement Learning

TRL is a full-stack library that provides a set of tools for training transformer language models with methods such as Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), Direct Preference Optimization (DPO), and Reward Modeling. The library is integrated with 🤗 transformers.
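
As a rough illustration of what training with one of these methods can look like, here is a minimal SFT sketch. It assumes the `trl` and `datasets` packages are installed; the model ID, dataset ID, output directory, and the `run_sft` helper are illustrative choices, not something this page prescribes.

```python
# Illustrative defaults (assumptions, not taken from this page); swap in your own.
MODEL_ID = "Qwen/Qwen2.5-0.5B"
DATASET_ID = "trl-lib/Capybara"


def run_sft(model_id=MODEL_ID, dataset_id=DATASET_ID, output_dir="sft-output"):
    """Hypothetical helper: fine-tune `model_id` on `dataset_id` with SFTTrainer."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    train_dataset = load_dataset(dataset_id, split="train")
    trainer = SFTTrainer(
        model=model_id,  # a Hub model ID string; the trainer loads the model
        args=SFTConfig(output_dir=output_dir),
        train_dataset=train_dataset,
    )
    trainer.train()
    return trainer
```

Calling `run_sft()` downloads the model and dataset and starts training, so it is best tried on a machine with a GPU. The other trainers follow a similar pattern, but their exact arguments should be checked against the API section.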

You can also explore TRL-related models, datasets, and demos in the TRL Hugging Face organization.

Learn

Learn post-training with TRL and other libraries in the 🤗 smol course.

The documentation is organized into the following sections:

Getting Started: installation and quickstart guide.
Conceptual Guides: dataset formats, training FAQ, and understanding logs.
How-to Guides: reducing memory usage, speeding up training, distributing training, etc.
Integrations: DeepSpeed, Liger Kernel, PEFT, etc.
Examples: example overview, community tutorials, etc.
API: trainers, utils, etc.

Blog posts
