My research group aims to build and improve language models. Methodologically, we study data-driven methods that combine deep-learning-based models with probabilistic controls. We are interested in applications to improved scaling, efficiency, model reasoning, and long-context generation.

I am also interested in open-source deep learning LLMs, and I develop projects to make systems safer, clearer, and easier to use. I work part-time at Hugging Face and like to release software projects that support NLP and DL research. I host a YouTube channel with technical talks on topics I am interested in.

Current Research Areas

Recognition

My group's work has been recognized with an NSF CAREER Award and a Sloan Fellowship. We have won paper awards at conferences for NLP, Hardware, and Visualization, as well as awards for best demonstrations for open-source software.

Selected Papers

A selection of papers that represent my research interests and style.

Contextual Document Embeddings. J. X. Morris, Alexander Rush. ICLR 2025
Simple and Effective Masked Diffusion Language Models. Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov. NeurIPS 2024
Zephyr: Direct Distillation of LM Alignment. Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib, Nathan Sarrazin, Omar Sanseviero, Alexander M. Rush, Thomas Wolf. COLM 2024
Pretraining Without Attention. Junxiong Wang, Jing Nathan Yan, Albert Gu, Alexander M. Rush. EMNLP 2023 Findings
Multitask prompted training enables zero-shot task generalization. Victor Sanh, et al. ICLR 2022
How many data points is a prompt worth? Teven Le Scao, Alexander M. Rush. NAACL Short 2021
Transformers: State-of-the-art Natural Language Processing. Thomas Wolf et al. EMNLP Demos 2020
Compound Probabilistic Context-Free Grammars for Grammar Induction. Yoon Kim, Chris Dyer, Alexander M. Rush. ACL 2019
Learning Neural Templates for Text Generation. Sam Wiseman, Stuart M. Shieber, Alexander Rush. EMNLP 2018
LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks. Hendrik Strobelt, Sebastian Gehrmann, Hanspeter Pfister, and Alexander M. Rush. InfoVis 2017
OpenNMT: Open-Source Toolkit for Neural Machine Translation. Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush. ACL Demo 2017
Sequence-Level Knowledge Distillation. Yoon Kim and Alexander M. Rush. EMNLP 2016
Character-Aware Neural Language Models. Yoon Kim, Yacine Jernite, David Sontag, and Alexander M. Rush. AAAI 2016
A Neural Attention Model for Abstractive Sentence Summarization. Alexander M. Rush, Sumit Chopra, and Jason Weston. EMNLP 2015

Contact