TRL - Transformer Reinforcement Learning
TRL is a full-stack library that provides a set of tools to train transformer language models with methods such as Supervised Fine-Tuning (SFT), Group Relative Policy Optimization (GRPO), Direct Preference Optimization (DPO), Reward Modeling, and more. The library is integrated with 🤗 Transformers.
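As an illustration, the sketch below shows a minimal supervised fine-tuning run with SFTTrainer. It assumes a recent TRL release in which the trainer accepts a model name string and a Hub dataset (here the trl-lib/Capybara dataset is used as an example); consult the Getting Started section for the exact API of your installed version.

```python
# Minimal SFT sketch (assumes a recent TRL version and internet access to the Hub)
from datasets import load_dataset
from trl import SFTTrainer

# Load an example conversational dataset from the Hub
dataset = load_dataset("trl-lib/Capybara", split="train")

# The trainer can be given a model name; it loads the model and tokenizer for you
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",
    train_dataset=dataset,
)
trainer.train()
```

The other trainers (e.g. DPOTrainer, GRPOTrainer, RewardTrainer) follow the same pattern of pairing a model with a suitably formatted dataset; see the API section for details.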
You can also explore TRL-related models, datasets, and demos in the TRL Hugging Face organization.
Learn
Learn post-training with TRL and other libraries in 🤗 smol course.
Contents
The documentation is organized into the following sections:
- Getting Started: installation and quickstart guide.
- Conceptual Guides: dataset formats, training FAQ, and understanding logs.
- How-to Guides: reducing memory usage, speeding up training, distributing training, etc.
- Integrations: DeepSpeed, Liger Kernel, PEFT, etc.
- Examples: example overview, community tutorials, etc.
- API: trainers, utils, etc.