Gabor Angeli
About Me
This is my (now-old) academic homepage from my time as a Ph.D. student at Stanford (2010 -- 2016), where I was advised by Chris Manning in the natural language processing group. Prior to that, I graduated from UC Berkeley in 2010 with a B.S. in Electrical Engineering / Computer Science. I'm now the co-founder and CTO of Eloquent Labs.
My interests are in natural language understanding. Recently I've been working on open-domain natural language inference -- particularly common sense reasoning -- along with some work on relation extraction. In the past, I worked on interpreting temporal expressions using semantic parsing.
Otherwise, in my free time I enjoy the outdoors (hiking / camping / backpacking), board games, and movies.
My Publications
- 2019
- Mimic and Rephrase: Reflective listening in open-ended dialogue
Conference on Computational Natural Language Learning (CoNLL). 2019.
Reflective listening, demonstrating that you have heard your conversational partner, is key to effective communication. Expert human communicators often mimic and rephrase their conversational partner, e.g., when responding to sentimental stories or to questions they don't know the answer to. We introduce a new task and an associated dataset wherein dialogue agents similarly mimic and rephrase a user's request to communicate sympathy (I'm sorry to hear that) or lack of knowledge (I do not know that). We study what makes a rephrasal response good against a set of qualitative metrics. We then evaluate three models for generating responses: a syntax-aware rule-based system, a seq2seq LSTM neural model with attention (S2SA), and the same neural model augmented with a copy mechanism (S2SA+C). In a human evaluation, we find that S2SA+C and the rule-based system are comparable and approach human-generated response quality. In addition, experiences with a live deployment of S2SA+C in a customer support setting suggest that this generation task is a practical contribution to real-world conversational agents.
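For a flavor of the rule-based side of this task, here is a minimal sketch of a pronoun-flipping rephraser (illustrative only, not the paper's system; the word list and sympathy frame are invented):

# Minimal mimic-and-rephrase responder: mirror the user's utterance
# from the agent's point of view, then wrap it in a sympathy frame.
FLIPS = {"i": "you", "my": "your", "me": "you", "am": "are",
         "you": "I", "your": "my", "mine": "yours"}

def rephrase(utterance: str) -> str:
    tokens = utterance.lower().rstrip(".?!").split()
    mirrored = " ".join(FLIPS.get(t, t) for t in tokens)
    return "I'm sorry to hear that " + mirrored + "."

print(rephrase("I lost my job."))  # -> I'm sorry to hear that you lost your job.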
@inproceedings{dieter2019mimic,
author = {Dieter, Justin and Wang, Tian and Angeli, Gabor and Chang, Angel X. and Chaganty, Arun},
booktitle = {Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL 2019)},
title = {Mimic and Rephrase: Reflective listening in open-ended dialogue},
url = {https://www.aclweb.org/anthology/K19-1037.pdf},
year = {2019}
}
- 2017
- Position-aware Attention and Supervised Data Improve Slot Filling
Empirical Methods in Natural Language Processing (EMNLP). 2017.
Organized relational knowledge in the form of “knowledge graphs” is important for many applications. However, the ability to populate knowledge bases with facts automatically extracted from documents has improved frustratingly slowly. This paper simultaneously addresses two issues that have held back prior work. We first propose an effective new model, which combines an LSTM sequence model with a form of entity position-aware attention that is better suited to relation extraction. Then we build TACRED, a large (119,474 examples) supervised relation extraction dataset, obtained via crowdsourcing and targeted towards TAC KBP relations. The combination of better supervised data and a more appropriate high-capacity model enables much better relation extraction performance. When the model trained on this new dataset replaces the previous relation extraction component of the best TAC KBP 2015 slot filling system, its F1 score increases markedly from 22.2% to 26.7%.
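A rough numpy sketch of the position-aware attention idea (dimensions, weight names, and the exact mixing are illustrative rather than the paper's precise parameterization): each token's attention score combines its hidden state with embeddings of its position relative to the subject and object mentions.

import numpy as np

rng = np.random.default_rng(0)
T, d_h, d_p = 6, 8, 4                  # tokens, hidden size, position-emb size
H = rng.standard_normal((T, d_h))      # LSTM hidden states, one per token
q = H[-1]                              # summary vector (e.g., final state)

P = rng.standard_normal((2 * T + 1, d_p))      # position-embedding table
ps = P[(np.arange(T) - 1) + T]                 # offsets to subject (at index 1)
po = P[(np.arange(T) - 4) + T]                 # offsets to object (at index 4)

Wh, Wq = rng.standard_normal((d_h, d_h)), rng.standard_normal((d_h, d_h))
Ws, Wo = rng.standard_normal((d_p, d_h)), rng.standard_normal((d_p, d_h))
v = rng.standard_normal(d_h)

# Score each token by its content AND by where it sits relative to the entities.
scores = np.tanh(H @ Wh + q @ Wq + ps @ Ws + po @ Wo) @ v
attn = np.exp(scores - scores.max())
attn /= attn.sum()                     # softmax over tokens
sentence_rep = attn @ H                # weighted sum; feeds the relation classifier
print(attn.round(3))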
@inproceedings{zhang2017tacred,
author = {Zhang, Yuhao and Zhong, Victor and Chen, Danqi and Angeli, Gabor and Manning, Christopher D.},
booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)},
title = {Position-aware Attention and Supervised Data Improve Slot Filling},
url = {https://nlp.stanford.edu/pubs/zhang2017tacred.pdf},
year = {2017}
}
- 2016
- Learning Open Domain Knowledge From Text
Stanford University. 2016.
The increasing availability of large text corpora holds the promise of acquiring an unprecedented amount of knowledge from this text. However, current techniques are either specialized to particular domains or do not scale to large corpora. This dissertation develops a new technique for learning open-domain knowledge from unstructured web-scale text corpora. A first application aims to capture common sense facts: given a candidate statement about the world and a large corpus of known facts, is the statement likely to be true? We appeal to a probabilistic relaxation of natural logic, a logic which uses the syntax of natural language as its logical formalism, to define a search problem from the query statement to its appropriate support in the knowledge base over valid (or approximately valid) logical inference steps. We show a 4x improvement in recall over lemmatized lookup for querying common sense facts, while maintaining above 90% precision. This approach is extended to handle longer, more complex premises by segmenting these utterances into a set of atomic statements entailed through natural logic. We evaluate this system in isolation by using it as the main component in an Open Information Extraction system, and show that it achieves a 3% absolute improvement in F1 compared to prior work on a competitive knowledge base population task. A remaining challenge is elegantly handling cases where we could not find a supporting premise for our query. To address this, we create an analogue of an evaluation function in game-playing search: a shallow lexical classifier is folded into the search program to serve as a heuristic function to assess how likely we would have been to find a premise. Results on answering 4th-grade science questions show that this method improves over both the classifier in isolation and a strong IR baseline, and outperforms prior work on the task.
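The search problem at the core of the dissertation can be caricatured in a few lines. A toy sketch, with an invented fact store, mutation set, and costs (the real system derives mutations from resources like WordNet and learns their costs):

import heapq

KB = {("cats", "have", "tails"), ("felines", "have", "tails")}
MUTATIONS = {"kittens": [("cats", 1.0)], "cats": [("felines", 1.0)]}

def find_support(query, max_cost=3.0):
    # Best-first search from the query fact toward a supporting KB fact,
    # over (approximately) truth-preserving word-level mutations.
    frontier, seen = [(0.0, query)], set()
    while frontier:
        cost, fact = heapq.heappop(frontier)
        if fact in KB:
            return fact, cost
        if fact in seen or cost > max_cost:
            continue
        seen.add(fact)
        for i, word in enumerate(fact):
            for replacement, step in MUTATIONS.get(word, []):
                mutated = fact[:i] + (replacement,) + fact[i + 1:]
                heapq.heappush(frontier, (cost + step, mutated))
    return None, float("inf")

print(find_support(("kittens", "have", "tails")))  # -> (('cats', 'have', 'tails'), 1.0)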
@phdthesis{angeli-thesis,
author = {Gabor Angeli},
title = {Learning Open Domain Knowledge From Text},
school = {Stanford University},
year = 2016,
month = 6
}
- Combining Natural Logic and Shallow Reasoning for Question Answering
Association for Computational Linguistics (ACL). 2016.
Broad domain question answering is often difficult in the absence of structured knowledge bases, and can benefit from shallow lexical methods (broad coverage) and logical reasoning (high precision). We propose an approach for incorporating both of these signals in a unified framework based on natural logic. We extend the breadth of inferences afforded by natural logic to include relational entailment (e.g., buy → own) and meronymy (e.g., a person born in a city is born in the city's country). Furthermore, we train an evaluation function, akin to game playing, to evaluate the expected truth of candidate premises on the fly. We evaluate our approach on answering multiple choice science questions, achieving strong results on the dataset.
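As a toy rendering of the evaluation-function idea (the feature and fallback rule are invented; the paper trains this function), a shallow lexical score stands in when search finds no strict proof:

def lexical_overlap(premise, hypothesis):
    # Shallow stand-in score: fraction of hypothesis words found in the premise.
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(p & h) / max(len(h), 1)

def expected_truth(hypothesis, premises, strict_proof=None):
    if strict_proof is not None:     # prefer a strict natural-logic derivation
        return 1.0
    # ... otherwise fall back to the (here, untrained) evaluation function.
    return max(lexical_overlap(p, hypothesis) for p in premises)

premises = ["an animal eats food to survive"]
print(expected_truth("animals eat food", premises))   # partial credit in [0, 1]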
@inproceedings{2016angeli-naturalli,
author = {Gabor Angeli and Neha Nayak and Christopher D. Manning},
booktitle = {Association for Computational Linguistics (ACL)},
title = {Combining Natural Logic and Shallow Reasoning for Question Answering},
year = {2016}
}
- Evaluating Word Embeddings Using a Representative Suite of Practical Tasks
First Workshop on Evaluating Vector Space Representations for NLP (RepEval). 2016.
Word embeddings are now widely used in natural language understanding tasks requiring sophisticated semantic information. However, the quality of new embedding methods is usually evaluated based on relatively simple word similarity benchmarks. We propose evaluating word embeddings by using them as features in simple models for a suite of popular downstream tasks. This gives a realistic view of the utility of the embeddings in real-world settings. The selection of a diverse set of tasks, including both semantic and syntactic tasks, facilitates qualitative analysis of the strengths and weaknesses of the embeddings. The use of simple models allows us to format this evaluation as a standardized script that can be made available publicly, and which can be run in a few hours.
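The evaluation recipe, caricatured with synthetic data (the "task", vectors, and labels below are stand-ins for the real suite):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
emb = {w: rng.standard_normal(50) for w in ["good", "great", "bad", "awful"]}

def featurize(sentence):
    # Fixed, deliberately simple featurization: average the word vectors.
    return np.mean([emb[w] for w in sentence.split() if w in emb], axis=0)

data = [("good great", 1), ("great good", 1), ("bad awful", 0), ("awful bad", 0)] * 5
X = np.stack([featurize(s) for s, _ in data])
y = np.array([label for _, label in data])

# The embeddings are scored by a simple model's downstream accuracy.
print(cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=2).mean())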
@inproceedings{2016nayak-veceval,
author = {Neha Nayak and Gabor Angeli and Christopher D. Manning},
booktitle = {RepEval Workshop},
title = {Evaluating Word Embeddings Using a Representative Suite of Practical Tasks},
year = {2016}
}
- 2015
- A large annotated corpus for learning natural language inference
Empirical Methods in Natural Language Processing (EMNLP). 2015.
Understanding entailment and contradiction is fundamental to understanding natural language, and inference about entailment and contradiction is a valuable testing ground for the development of semantic representations. However, machine learning research in this area has been dramatically limited by the lack of large-scale resources. To address this, we introduce the Stanford Natural Language Inference corpus, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning. At 570K pairs, it is two orders of magnitude larger than all other resources of its type. This increase in scale allows lexicalized classifiers to outperform some sophisticated existing entailment models, and it allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
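A minimal sketch of the kind of lexicalized classifier such a corpus unlocks (toy pairs inline; the real data is at the project homepage linked below):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy premise/hypothesis pairs; SNLI itself provides 570K of these.
pairs = [("a man is sleeping", "a person is asleep", "entailment"),
         ("a man is sleeping", "a man is running", "contradiction"),
         ("a dog runs outside", "an animal is outside", "entailment"),
         ("a dog runs outside", "the dog sits inside", "contradiction")] * 5

X = [f"{p} ||| {h}" for p, h, _ in pairs]   # lexicalized: raw word features
y = [label for _, _, label in pairs]

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(X, y)
print(clf.predict(["a man is sleeping ||| a person is asleep"]))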
@inproceedings{2015bowman-snli,
title = {A large annotated corpus for learning natural language inference},
author = {Samuel R. Bowman and Gabor Angeli and Christopher Potts and Christopher D. Manning},
booktitle = {EMNLP},
year = {2015}
}
- Project homepage: http://nlp.stanford.edu/projects/snli/
- Natural Logic at Stanford
- Leveraging Linguistic Structure For Open Domain Information Extraction
Association for Computational Linguistics (ACL). 2015.
Relation triples produced by open domain information extraction (open IE) systems are useful for question answering, inference, and other IE tasks. Traditionally these are extracted using a large set of patterns; however, this approach is brittle on out-of-domain text and long-range dependencies, and gives no insight into the substructure of the arguments. We replace this large pattern set with a few patterns for canonically structured sentences, and shift the focus to a classifier which learns to extract self-contained clauses from longer sentences. We then run natural logic inference over these short clauses to determine the maximally specific arguments for each candidate triple. We show that our approach outperforms a state-of-the-art open IE system on the end-to-end TAC-KBP 2013 Slot Filling task.
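To make the clause-extraction step concrete, here is a toy over a hand-built dependency parse (the edge labels and splitting rule are simplified; the paper's classifier learns when to split):

# Toy dependency parse of: "Born in Hawaii, Obama visited France."
tokens = ["Born", "in", "Hawaii", ",", "Obama", "visited", "France", "."]
edges = [(5, 4, "nsubj"), (5, 6, "dobj"), (5, 0, "advcl"),
         (0, 2, "nmod"), (2, 1, "case")]          # (head, dependent, label)
CLAUSAL = {"advcl", "ccomp", "xcomp"}             # simplified trigger set

def clauses(edges, root=5):
    children = {}
    for h, d, lbl in edges:
        children.setdefault(h, []).append((d, lbl))
    def subtree(node, cut):
        out = {node}
        for d, lbl in children.get(node, []):
            if not (cut and lbl in CLAUSAL):
                out |= subtree(d, cut)
        return out
    yield sorted(subtree(root, cut=True))          # main clause
    for h, d, lbl in edges:
        if lbl in CLAUSAL:
            yield sorted(subtree(d, cut=False))    # split-off clause

for c in clauses(edges):
    print(" ".join(tokens[i] for i in c))
# -> "Obama visited France" and "Born in Hawaii"

The real system additionally restores the governing subject to the split-off clause (roughly, "Obama born in Hawaii") before natural logic shortens the arguments.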
@inproceedings{2015angeli-openie,
title = {Leveraging Linguistic Structure For Open Domain Information Extraction},
author = {Gabor Angeli and Melvin Johnson Premkumar and Christopher D. Manning},
booktitle = {ACL},
year = {2015}
}
- Project homepage: http://nlp.stanford.edu/software/openie.shtml
- Natural Logic at Stanford
- Robust Subgraph Generation Improves Abstract Meaning Representation Parsing
Association for Computational Linguistics (ACL). 2015.
The Abstract Meaning Representation (AMR) is a representation for open-domain rich semantics, with potential use in fields like event extraction and machine translation. Node generation, typically done using a simple dictionary lookup, is currently an important limiting factor in AMR parsing. We propose a small set of actions that derive AMR subgraphs by transformations on spans of text, which allows for more robust learning of this stage. Our set of construction actions generalize better than the previous approach, and can be learned with a simple classifier. We improve on the previous state-of-the-art result for AMR parsing, boosting end-to-end performance by 3 F1 on both the LDC2013E117 and LDC2014T12 datasets.
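A cartoon of the derivation-action idea (the action set and lexicon are invented for illustration): instead of a single dictionary lookup per node, each text span picks a small action that constructs its AMR subgraph, which generalizes better to unseen words.

AMR_DICT = {"sailor": "(p / person :arg0-of (s / sail-01))"}   # memorized subgraphs

def act_identity(word):                    # e.g., names: copy the word in
    return f'(n / name :op1 "{word}")'

def act_lemma(word):                       # eventive words: map to a frame
    return f"({word[0]} / {word}-01)"

def derive(word):
    # The paper trains a classifier to choose the action; here it's rules.
    if word in AMR_DICT:
        return AMR_DICT[word]
    if word[0].isupper():
        return act_identity(word)
    return act_lemma(word)

for w in ["sailor", "Hawaii", "visit"]:
    print(w, "->", derive(w))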
@inproceedings{2015werling-amr,
title = {Robust Subgraph Generation Improves Abstract Meaning Representation Parsing},
author = {Keenon Werling and Gabor Angeli and Christopher D. Manning},
booktitle = {ACL},
year = {2015}
}
- Bootstrapped Self Training for Knowledge Base Population
Text Analysis Conference Proceedings. 2016.
A central challenge in relation extraction is the lack of supervised training data. Pattern-based relation extractors suffer from low recall, whereas distant supervision yields noisy data which hurts precision. We propose bootstrapped self-training to capture the benefits of both systems: the precision of patterns and the generalizability of trained models. We show that training on the output of patterns drastically improves performance over the patterns. We propose self-training for further improvement: recall can be improved by incorporating the predictions from previous iterations; precision by filtering the assumed negatives based on previous predictions. We show that even our pattern-based model achieves good performance on the task, and the self-trained models rank among the top systems.
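The training loop, in caricature on synthetic data (the thresholds and the synthetic "patterns" are invented):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
w = rng.standard_normal(10)
y_true = (X @ w > 0).astype(int)
y = (X @ w > 1.5).astype(int)     # "patterns": precise but low-recall positives

clf = LogisticRegression(max_iter=1000)
for it in range(3):
    if it == 0:
        mask = np.ones(len(y), dtype=bool)
    else:
        # Precision: drop assumed negatives the last model thought positive.
        mask = (y == 1) | (clf.predict_proba(X)[:, 1] < 0.7)
    clf.fit(X[mask], y[mask])
    # Recall: promote confident predictions to positives for the next round.
    y = np.where(clf.predict_proba(X)[:, 1] > 0.9, 1, y)
    print(f"iter {it}: agreement with gold = {(clf.predict(X) == y_true).mean():.2f}")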
@inproceedings{2015angeli-kbp,
title = {Bootstrapped Self Training for Knowledge Base Population},
author = {Gabor Angeli and Victor Zhong and Danqi Chen and Arun Chaganty and Jason Bolton and Melvin Johnson Premkumar and Panupong Pasupat and Sonal Gupta and Christopher D. Manning},
booktitle = {TAC-KBP},
year = {2016}
}
- 2014
- NaturalLI: Natural Logic Inference for Common Sense Reasoning
Empirical Methods in Natural Language Processing (EMNLP). 2014.
Common-sense reasoning is important for AI applications, both in NLP and many vision and robotics tasks. We propose NaturalLI: a Natural Logic inference system for inferring common sense facts (for instance, that cats have tails or tomatoes are round) from a very large database of known facts. In addition to being able to provide strictly valid derivations, the system is also able to produce derivations which are only likely valid, accompanied by an associated confidence. We show both that our system is able to capture strict Natural Logic inferences on the FraCaS test suite, and that it can predict common sense facts with 49% recall and 91% precision.
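A toy of the monotonicity logic that licenses each inference step (the lexicon here is invented; NaturalLI reads these relations off resources like WordNet):

HYPERNYM = {"cat": "animal", "tabby": "cat"}

def step_valid(old, new, monotonicity):
    # Upward-monotone context (e.g., under "some"): generalizing preserves truth.
    # Downward-monotone context (e.g., under "no"): specializing preserves truth.
    goes_up = HYPERNYM.get(old) == new
    goes_down = HYPERNYM.get(new) == old
    return goes_up if monotonicity == "up" else goes_down

print(step_valid("cat", "animal", "up"))    # "some cats meow" |= "some animals meow"
print(step_valid("animal", "cat", "down"))  # "no animals fly" |= "no cats fly"
print(step_valid("animal", "cat", "up"))    # False: not a valid generalization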
@inproceedings{2014angeli-naturalli,
title = {NaturalLI: Natural Logic Inference for Common Sense Reasoning},
author = {Gabor Angeli and Christopher D. Manning},
booktitle = {EMNLP},
year = {2014}
}
- Combining Distant and Partial Supervision for Relation Extraction
Empirical Methods in Natural Language Processing (EMNLP). 2014.
Broad-coverage relation extraction either requires expensive supervised training data, or suffers from noise introduced by distantly supervised methods. We present an approach for providing partial supervision to a distantly supervised relation extractor using a small number of carefully selected examples. We compare against established active learning criteria and propose a novel criterion to sample examples which are both uncertain and representative. In this way, we combine the benefits of fine-grained supervision for difficult examples with the coverage of a large distantly supervised corpus. Our approach gives a substantial increase of 3.9% end-to-end F1 on the 2013 KBP Slot Filling evaluation, yielding a net F1 of 37.7%.
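The sample-selection criterion, sketched (the combination rule below is illustrative, not the paper's exact formula):

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))        # candidate examples (feature vectors)
p = rng.uniform(0.05, 0.95, size=200)     # model's P(relation holds) per example

# Uncertainty: binary entropy of the current model's prediction.
entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p))

# Representativeness: mean cosine similarity to the pool, rescaled to [0, 1],
# so examples from dense regions beat outliers.
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
density = (Xn @ Xn.T).mean(axis=1)
density = (density - density.min()) / (density.max() - density.min())

# Annotate examples that are BOTH uncertain and representative.
print(np.argsort(-(entropy * density))[:10])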
@inproceedings{2014angeli-active,
title = {Combining Distant and Partial Supervision for Relation Extraction},
author = {Gabor Angeli and Julie Tibshirani and Jean Y. Wu and Christopher D. Manning},
booktitle = {EMNLP},
year = {2014}
}
- Stanford's Distantly Supervised Slot Filling Systems for KBP 2014
Text Analysis Conference Proceedings. 2015.
We describe Stanford's entry in the TAC-KBP 2014 Slot Filling challenge. We submitted two broad approaches to Slot Filling, both strongly based on the ideas of distant supervision: one built on the DeepDive framework (Niu et al., 2012), and another based on the multi-instance multi-label relation extractor of Surdeanu et al. (2012). In addition, we evaluate the impact of learned and hard-coded patterns on performance for slot filling, and the impact of the partial annotations described in Angeli et al. (2014).
@inproceedings{2014angeli-kbp,
title = {Stanford's Distantly Supervised Slot Filling Systems for KBP 2014},
author = {Gabor Angeli and Sonal Gupta and Melvin Johnson Premkumar and Christopher D. Manning and Christopher R{\'e} and Julie Tibshirani and Jean Y. Wu and Sen Wu and Ce Zhang},
booktitle = {TAC-KBP},
year = {2015}
}
- A Dictionary of Nonsubsective Adjectives
Stanford CS Technical Report. 2014.
Computational approaches to inference and information extraction often assume that adjective-noun compounds maintain all the relevant properties of the unmodified noun. A significant portion of nonsubsective adjectives violate this assumption. We present preliminary work towards a classifier for these adjectives. We also compile a comprehensive list of 60 nonsubsective adjectives including those used for training and those found by the classifiers.
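The failure mode, as a two-line check (the word list is abbreviated and illustrative; the report's dictionary has 60 entries):

# "ADJ noun" usually entails the bare noun ("red car" -> "car"), but
# nonsubsective adjectives break the inference ("fake gun" -/-> "gun").
NONSUBSECTIVE = {"fake", "alleged", "former", "counterfeit", "fictional"}

def compound_entails_noun(adjective, noun):
    return adjective.lower() not in NONSUBSECTIVE

print(compound_entails_noun("red", "car"))    # True
print(compound_entails_noun("fake", "gun"))   # False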
@techreport{2014nayak-adjectives,
title = {A Dictionary of Nonsubsective Adjectives},
author = {Neha Nayak and Mark Kowarsky and Gabor Angeli and Christopher D. Manning},
number = {CSTR 2014-04},
institution = {Department of Computer Science, Stanford University},
month = {October},
year = {2014}
}
- 2013
- Philosophers are Mortal: Inferring the Truth of Unseen Facts
Computational Natural Language Learning (CoNLL). 2013.
Large databases of facts are prevalent in many applications. Such databases are accurate, but as they broaden their scope they become increasingly incomplete. In contrast to extending such a database, we present a system to query whether it contains an arbitrary fact. This work can be thought of as re-casting open domain information extraction: rather than growing a database of known facts, we smooth this data into a database in which any possible fact has membership with some confidence. We evaluate our system by predicting held-out facts, achieving 74.2% accuracy and outperforming multiple baselines. We also evaluate the system as a common-sense filter for the ReVerb Open IE system, and as a method for answer validation in a Question Answering task.
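A toy rendering of "smoothing" the database (naive word-overlap similarity; the paper uses richer similarity metrics with backoffs, per the code linked below):

# Query an arbitrary fact: instead of exact membership, return a confidence
# derived from the most similar known fact.
KB = [("socrates", "is", "mortal"), ("greeks", "are", "mortal"),
      ("birds", "can", "fly")]

def sim(a, b):
    wa, wb = set(a), set(b)
    return len(wa & wb) / len(wa | wb)      # Jaccard overlap of the words

def confidence(query):
    return max(sim(query, fact) for fact in KB)

print(confidence(("philosophers", "are", "mortal")))   # 0.5: plausibly true
print(confidence(("philosophers", "can", "swim")))     # 0.2: little support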
@inproceedings{2013angeli-truth,
title = {Philosophers are Mortal: Inferring the Truth of Unseen Facts},
author = {Gabor Angeli and Christopher Manning},
booktitle = {CoNLL},
year = {2013}
}
- Sim, the code for the various similarity metrics, with backoffs
- Language-Independent Discriminative Parsing of Temporal Expressions
Association for Computational Linguistics (ACL). 2013.
Temporal resolution systems are traditionally tuned to a particular language, requiring significant human effort to translate them to new languages. We present a language-independent semantic parser for learning the interpretation of temporal phrases given only a corpus of utterances and the times they reference. We make use of a latent parse that encodes a language-flexible representation of time, and extract rich features over both the parse and associated temporal semantics. The parameters of the model are learned using a weakly supervised bootstrapping approach, without the need for manually tuned parameters or any other language expertise. We achieve state-of-the-art accuracy on all languages in the TempEval-2 temporal normalization task, reporting a 4% improvement in both English and Spanish accuracy, and to our knowledge the first results for four other languages.
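The weak supervision signal, in sketch (a drastically simplified candidate set; the real system learns distributions over latent parses with rich features):

from datetime import date, timedelta

REF = date(2013, 8, 4)      # a Sunday: the utterance's reference time
gold = date(2013, 8, 2)     # annotation gives only the referenced time

# Two candidate latent parses of the phrase "last friday":
candidates = {
    "last(friday)": REF - timedelta(days=(REF.weekday() - 4) % 7 or 7),
    "next(friday)": REF + timedelta(days=(4 - REF.weekday()) % 7 or 7),
}

# A candidate parse is rewarded iff it evaluates to the gold time; the
# parse itself is never annotated, so learning bootstraps from this signal.
for parse, value in candidates.items():
    print(parse, value, "match" if value == gold else "no match")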
@inproceedings{2013angeli-temporal,
title = {Language-Independent Discriminative Parsing of Temporal Expressions},
author = {Gabor Angeli and Jakob Uszkoreit},
booktitle = {ACL},
year = {2013}
}
- Stanford's 2013 KBP System
Text Analysis Conference Proceedings. 2014.
We describe Stanford's entry in the TAC-KBP 2013 Slot Filling challenge. Our system makes use of a distantly supervised approach, implementing the multi-instance multi-label system of Surdeanu et al. (2012). In addition, Stanford's system significantly improved the information retrieval component of the system, as well as the consistency and inference procedure applied after candidate relations have been extracted. Stanford's 2013 KBP entry achieved an F1 of 31.36 on the 2013 evaluation data, performing above the median entry (15.32 F1).
@inproceedings{2013angeli-kbp,
title = {Stanford's 2013 {KBP} System},
author = {Gabor Angeli and Arun Chaganty and Angel Chang and Kevin Reschke and Julie Tibshirani and Jean Y. Wu and Osbert Bastani and Keith Siilats and Christopher D. Manning},
booktitle = {TAC-KBP},
year = {2014}
}
- 2012
- Parsing Time: Learning to Interpret Time Expressions
North American Chapter of the Association for Computational Linguistics (NAACL). 2012.
We present a probabilistic approach for learning to interpret temporal phrases given only a corpus of utterances and the times they reference. While most approaches to the task have used regular expressions and similar linear pattern interpretation rules, the possibility of phrasal embedding and modification in time expressions motivates our use of a compositional grammar of time expressions. This grammar is used to construct a latent parse which evaluates to the time the phrase would represent, as a logical parse might evaluate to a concrete entity. In this way, we can employ a loosely supervised EM-style bootstrapping approach to learn these latent parses while capturing both syntactic uncertainty and pragmatic ambiguity in a probabilistic framework. We achieve an accuracy of 72% on an adapted TempEval-2 task -- comparable to state-of-the-art systems.
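The compositional idea in miniature (a hand-written grammar fragment; the paper instead learns which latent parse of a phrase to prefer):

from datetime import date, timedelta

TODAY = date(2012, 6, 15)   # reference time (a Friday)

# Phrases denote functions over times, so parses compose like logical
# forms and evaluate to concrete dates.
def friday(t):
    return t + timedelta(days=(4 - t.weekday()) % 7)

def last(f):
    return lambda t: f(t - timedelta(days=7))

def next_(f):
    return lambda t: f(t + timedelta(days=7))

print(last(friday)(TODAY))    # "last friday" -> 2012-06-08
print(next_(friday)(TODAY))   # "next friday" -> 2012-06-22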
@inproceedings{2012angeli-temporal,
title = {Parsing Time: Learning to Interpret Time Expressions},
author = {Gabor Angeli and Christopher D. Manning and Daniel Jurafsky},
booktitle = {NAACL-HLT},
year = {2012}
}
- Parsing Time, the main codebase
- Some of the more reusable code (CKY, EM) lives here
- The code makes heavy use of Stanford CoreNLP
- The code is written in Scala 2.9.1
- 2010
- A Simple Domain-Independent Probabilistic Approach to Generation
Empirical Methods in Natural Language Processing (EMNLP). 2010.
We present a simple, robust generation system which performs content selection and surface realization in a unified, domain-independent framework. In our approach, we break up the end-to-end generation process into a sequence of local decisions, arranged hierarchically and each trained discriminatively. We deployed our system in three different domains (Robocup sportscasting, technical weather forecasts, and common weather forecasts), obtaining results comparable to state-of-the-art domain-specific systems both in terms of BLEU scores and human evaluation.
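The decomposition, as a toy (the record, the decision order, and the templates are invented; in the paper each choice is a trained discriminative classifier):

# Generation as a hierarchy of local decisions.
record = {"city": "Oakland", "temp_high": 71, "sky": "clear", "wind_mph": 4}

def select_content(rec):
    # Decision 1: which fields to mention, in what order.
    fields = ["sky", "temp_high"]
    if rec["wind_mph"] > 10:          # a local, record-conditioned choice
        fields.append("wind_mph")
    return fields

TEMPLATES = {"sky": "{sky} skies", "temp_high": "a high of {temp_high}",
             "wind_mph": "winds near {wind_mph} mph"}

def realize(rec, fields):
    # Decision 2, per field: pick and fill a surface template.
    phrases = [TEMPLATES[f].format(**rec) for f in fields]
    return rec["city"] + ": " + ", ".join(phrases) + "."

print(realize(record, select_content(record)))
# -> Oakland: clear skies, a high of 71.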
@inproceedings{2010angeli-generation,
title = {A Simple Domain-Independent Probabilistic Approach to Generation},
author = {Gabor Angeli and Percy Liang and Dan Klein},
booktitle = {Empirical Methods in Natural Language Processing (EMNLP)},
year = {2010}
}
- Code (use at your own risk!) and associated data. The data should be unzipped under the code directory.
- Sumtime Prodigy-METEO (Belz 2008) [homepage] [direct link]
- Robocup (Chen and Mooney 2008) [homepage] [direct link]
- WeatherGov (Liang et al. 2009) [homepage] [direct link]
I've experiments to run; there is research to be done.