Joey Hong

Current Research

I'm currently interested in developing novel reinforcement learning algorithms that scale better to difficult problems, particularly those involving large language models and interaction with humans.

	Natural Language Actor-Critic: Scalable Off-Policy Learning in Language Space. Joey Hong, Kang Liu, Zhan Ling, Jiecao Chen, Sergey Levine. Under submission, 2025. (arXiv, code, website)
	Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL. Joey Hong, Anca Dragan, Sergey Levine. NeurIPS, 2025. (arXiv, website)
	Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning. Joey Hong, Anca Dragan, Sergey Levine. ICLR, 2025. (arXiv)
	Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations. Joey Hong, Sergey Levine, Anca Dragan. NeurIPS Foundation Models for Decision Making Workshop, 2024. (arXiv, slides)
	Offline RL with Observation Histories: Analyzing and Improving Sample Complexity. Joey Hong, Anca Dragan, Sergey Levine. ICLR, 2024. (arXiv)
	Learning to Influence Human Behavior with Offline Reinforcement Learning. Joey Hong, Sergey Levine, Anca Dragan. NeurIPS, 2023. (arXiv, website)
	Confidence-Conditioned Value Functions for Offline Reinforcement Learning. Joey Hong, Aviral Kumar, Sergey Levine. ICLR, 2022 (oral). (arXiv)
	On the Sensitivity of Reward Inference to Misspecified Human Models. Joey Hong, Kush Bhatia, Anca Dragan. ICLR, 2022 (oral). (arXiv)
	When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine. ICLR, 2021. (arXiv, blog)