Lori Moon | University of Illinois at Urbana-Champaign
Papers by Lori Moon
arXiv (Cornell University), Oct 20, 2020
In Natural Language (NL) applications, there is often a mismatch between what the NL interface is capable of interpreting and what a lay user knows how to express. This work describes a novel natural language interface that reduces this mismatch by refining natural language input through successive, automatically generated semi-structured templates. In this paper we describe how our approach, called SKATE, uses a neural semantic parser to parse NL input and suggest semi-structured templates, which are recursively filled to produce fully structured interpretations. We also show how SKATE integrates with a neural rule-generation model to interactively suggest and acquire commonsense knowledge. We provide a preliminary coverage analysis of SKATE for the task of story understanding, and then describe a current business use-case of the tool in a specific domain: COVID-19 policy design.
arXiv (Cornell University), Sep 16, 2020
Current affiliation Verneek, Inc.
International Conference on Computational Linguistics, Aug 1, 2018
This paper describes the augmentation of an existing corpus of child-directed speech. The resulting corpus is a gold-standard labeled corpus for supervised learning of semantic role labels in adult-child dialogues. Semantic role labeling (SRL) models assign semantic roles to sentence constituents, thus indicating who has done what to whom (and in what way). The current corpus is derived from the Adam files in the Brown corpus (Brown, 1973) of the CHILDES corpora, and augments the partial annotation described in Connor et al. (2010). It provides labels for both semantic arguments of verbs and semantic arguments of prepositions. The semantic role labels and senses of verbs follow Propbank guidelines (Kingsbury and Palmer, 2002; Gildea and Palmer, 2002; Palmer et al., 2005) and those for prepositions follow Srikumar and Roth (2011). The corpus was annotated by two annotators. Inter-annotator agreement is given separately for prepositions and verbs, and for adult speech and child speech. Overall, across child and adult samples, including verbs and prepositions, the κ score is 72.6 for sense, 77.4 for the number of semantic-role-bearing arguments, 91.1 for identical semantic role labels on a given argument, and 93.9 for the span of semantic role labels. The sense and number of arguments were often open to multiple interpretations in child speech, due to the rapidly changing discourse and omission of constituents in production. Annotators used a discourse context window of ten sentences before and ten sentences after the target utterance to determine the annotation labels. The derived corpus is available for use in CHAT (MacWhinney, 2000) and XML format.
Modal auxiliary verbs, such as could, might, must, would, and others, have different readings depending on the context in which they occur (Kratzer 1981[45]). The sentence 'Jess could fry the fish' can mean that, in a time previous to the utterance of the sentence, Jess had the ability to fry the fish, or it can mean that, at the time of the utterance, Jess frying the fish is a possible event. Modal auxiliary verbs often create intensional environments, leading the events described by the second verb to be understood to be non-actual events. When the readings are described as being determined by a context, it is often a broad notion of nonlinguistic and extra-sentential linguistic context that is the focus of the interpretation. For example, descriptive pragmatic constraints are used in Lewis 1973[54] and Kratzer 1981[45] to characterize types of accessibility relations and types of orderings of worlds. A large part of the meaning of modal auxiliary verbs, however, centers around how the events described by the second verb are situated relative to the time at which the sentence containing the modal auxiliary is used. Information about the temporal situation of an event is conveyed through the linguistic context in which a modal auxiliary verb occurs, including, but not limited to, lexical properties of the linguistic expressions describing the event in the scope of the modal auxiliary, lexical properties of the modal auxiliary itself, and temporal and aspectual marking on linguistic expressions in the verbal projections. In order to provide a framework for representing the interactions of tense, aspect, and modality, a fragment of English is given in a Multi-Modal Combinatory Categorial Grammar (Baldridge & Kruijff[6], Steedman 2012[93]).
Modal auxiliaries are given verb-like lexical entries in the grammar using lexical entries that combine features from Villavicencio 2002[101] and standard attribute value matrices of Head Driven Phrase
Intelligent Agents, Nov 3, 2011
Modal verbs occur in contexts which convey information about non-actual states of affairs as well as in contexts which convey information about the actual world of the discourse. Modeling the semantic interpretation of non-actual states of affairs is notoriously complicated, sometimes requiring modal logic, belief revision, non-monotonic reasoning, and multi-agent autoepistemic models. This work presents linguistic features which disambiguate those instances of the past tense modal verbs 'could' and 'would' which occur in contexts where the proposition in the scope of the modal is not true in the actual world of the discourse from those instances which presuppose or entail that an event in their scope occurred in the actual world of the discourse. It also illustrates the complexity of the role of modal verbs in semantic interpretation and, consequently, the limitations of state-of-the-art inference systems with respect to modal verbs. The features suggested for improving modal verb interpretation are based on the analysis of corpus data and insights from the linguistic literature.
Proceedings of the AAAI Conference on Artificial Intelligence
In Natural Language (NL) applications, there is often a mismatch between what the NL interface is capable of interpreting and what a lay user knows how to express. This work describes a novel natural language interface that reduces this mismatch by refining natural language input through successive, automatically generated semi-structured templates. In this paper we describe how our approach, called SKATE, uses a neural semantic parser to parse NL input and suggest semi-structured templates, which are recursively filled to produce fully structured interpretations. We also show how SKATE integrates with a neural rule-generation model to interactively suggest and acquire commonsense knowledge. We provide a preliminary coverage analysis of SKATE for the task of story understanding, and then describe a current business use-case of the technology in a restricted domain: COVID-19 policy design.
Linguistic Issues in Language Technology, volume 14, issue 6, Aug 1, 2016
Modal auxiliaries have different readings, depending on the context in which they occur (Kratzer, 1981). Several projects have attempted to classify uses of modal auxiliaries in corpora according to their reading using supervised machine learning techniques (e.g., Rubinstein et al., 2013; Ruppenhofer & Rehbein, 2012). In each study, traditional taxonomic labels, such as 'epistemic' and 'deontic', are used by human annotators to label instances of modal auxiliaries in a corpus. In order to achieve higher agreement among annotators, results in these previous studies are reported after collapsing some of the initial categories. The results show that human annotators have fairly good agreement on some of the categories, such as whether or not a use is epistemic, but poor agreement on others. They also show that annotators agree more on modals such as might than on modals such as could. In this study, we used traditional taxonomic categories on sentences containing modal auxiliary verbs that were randomly extracted from the English Gigaword 4th edition corpus (Parker et al., 2009). The lowest inter-annotator agreement using traditional taxonomic labels occurred with uses of could, with raw agreements of 42%–48% (κ = 0.196–0.259), compared to might, for instance, with raw agreement of 98%. In response to the low numbers, rather than collapsing traditional categories, we tried a new method of classifying uses of could with respect to where the reading situates the eventuality being described relative to the speech time. For example, the sentence 'Jess could swim.' is about a swimming eventuality in the past leading up to the time of speech, if it is read as being an ability. The sentence is about a swimming eventuality in the future, if it is read as being a statement about a possibility.
The classification labels we propose are crucial in separating uses of could that have actuality inferences (Bhatt, 1999; Hacquard, 2006) from uses that do not. For the temporal location of the event described by a use of could, using four category labels, we achieved 73%–90% raw agreement (κ = 0.614–0.744). Sequence of tense contexts (Abusch, 1997) present a major factor in the difficulty of determining the temporal properties present in uses of could. Among three annotators, we achieved raw agreement scores of 89%–96% (κ = 0.779–0.919) on identification of sequence of tense contexts. We discuss the role of our findings with respect to textual entailment.
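The abstract above reports both raw agreement and κ for the same annotations. The following is a minimal sketch of how the two measures relate, assuming pairwise Cohen's κ between two annotators (the abstract does not state which κ variant was used). The function names and the temporal labels in the example are hypothetical and are not taken from the paper.

```python
from collections import Counter


def raw_agreement(labels_a, labels_b):
    """Fraction of items on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(labels_a)
    p_observed = raw_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both annotators pick the same
    # label if each labels independently with their own label frequencies.
    p_expected = sum(
        counts_a[label] * counts_b[label]
        for label in set(counts_a) | set(counts_b)
    ) / (n * n)
    if p_expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels for four uses of "could" (illustration only).
annotator_1 = ["past", "past", "future", "future"]
annotator_2 = ["past", "future", "future", "future"]

print(raw_agreement(annotator_1, annotator_2))  # 0.75
print(cohen_kappa(annotator_1, annotator_2))    # 0.5
```

Because κ subtracts the agreement expected by chance, it is lower than raw agreement whenever some chance agreement is possible, which is why the abstract reports the two figures side by side.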
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Current affiliation Verneek, Inc.