Benjamin-eecs - Overview

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
Python, 196 stars, 22 forks
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
5k stars, 543 forks
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Python, 4.1k stars, 593 forks
TorchOpt is an efficient library for differentiable optimization built upon PyTorch.
Python, 631 stars, 44 forks
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
C++, 1.5k stars, 141 forks
Natural Language Reinforcement Learning
Python, 101 stars, 7 forks