Allegro Lab
This page is under development.
The AI, Language, Learning, Generalization, and Robustness (ALLeGRo) Lab studies natural language processing and machine learning, with a focus on building reliable NLP systems for a wide range of scenarios. We aim for a deeper understanding of how NLP systems work, when they fail, and how they can be improved.
Here are the research questions we have been working on recently:
- How can we scientifically understand large language models? Our scientific understanding of LLMs lags far behind our ability to engineer them. To bridge this gap, our recent work has studied in-context learning from both data-centric and mechanistic perspectives; we have also investigated the predictability of different LLM capabilities.
- How should we benchmark modern NLP systems? I have long advocated for benchmarking robustness and uncertainty of NLP systems. Our recent work has benchmarked generalization to long-tail examples and calibration of LLMs. We have also shown that benchmarking under distribution shift can reveal advantages of neurosymbolic approaches.
- How can smaller open-source models compete with closed-source LLMs? Continued scientific progress relies on access to strong open-source models. Our recent work has improved smaller models by training them to generate reasoning chains.
- How can advances in NLP inform other disciplines? Developments in NLP promise to have broad impacts across disparate areas of study. We have collaborated with legal experts to operationalize underspecified requirements in the EU’s Digital Services Act in a manner that is both legally justified and technically feasible. I am also interested in collaborating with experts in other disciplines who want to use NLP for their own research; for example, I have built assisted curation tools for biomedical researchers.
news
Sep 03, 2024: Welcome to the new Allegro Lab website.
selected publications
- AIES 2024
  Operationalizing content moderation "accuracy" in the Digital Services Act
- ACL Findings 2024
  Proving membership in LLM pretraining data via data watermarks
  Johnny Tian-Zheng Wei*, Ryan Yixiang Wang*, and Robin Jia
- NAACL 2024
  Do Localization Methods Actually Localize Memorized Data in LLMs?
  Ting-Yun Chang, Jesse Thomason, and Robin Jia
- EMNLP
  Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering
  Wang Zhu, Jesse Thomason, and Robin Jia
- EMNLP Findings
  Estimating Large Language Model Capabilities without Labeled Test Data
  Harvey Yiyun Fu, Qinyuan Ye, Albert Xu, Xiang Ren, and Robin Jia
- EACL Findings
  Benchmarking Long-tail Generalization with Likelihood Splits
  Ameya Godbole and Robin Jia
- EMNLP Findings
  Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems
  Wang Zhu, Jesse Thomason, and Robin Jia
- ACL
  Selective Question Answering under Domain Shift
  Amita Kamath, Robin Jia, and Percy Liang
- NAACL
  Document-Level N-ary Relation Extraction with Multiscale Representation Learning
  Robin Jia, Cliff Wong, and Hoifung Poon
- EMNLP
  Adversarial Examples for Evaluating Reading Comprehension Systems
  Robin Jia and Percy Liang
- EMNLP 2024
  When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models
  Ting-Yun Chang, Jesse Thomason, and Robin Jia
- NeurIPS 2024
  Pre-trained Large Language Models Use Fourier Features to Compute Addition
  Tianyi Zhou, Deqing Fu, Vatsal Sharan, and Robin Jia
- arXiv 2024
  Language Models can Infer Action Semantics for Classical Planners from Environment Feedback
  Wang Zhu, Ishika Singh, Robin Jia, and Jesse Thomason
- NeurIPS 2024
  Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models
  Deqing Fu, Tian-Qi Chen, Robin Jia, and Vatsal Sharan