Will Radford - Academia.edu
Papers by Will Radford
Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 2010
Financial surveillance technology alerts analysts to suspicious trading events. Our aim is to identify explainable false positives (e.g., caused by price-sensitive information in company news) and explainable true positives (e.g., caused by ramping in forums) by aligning these alerts with publicly available information. Our system aligns 99% of alerts, which will speed the analysts' task by helping them to eliminate false positives and gather evidence for true positives more rapidly.
Artificial Intelligence, 2013
We automatically create enormous, free and multilingual silver-standard training annotations for named entity recognition (NER) by exploiting the text and structure of Wikipedia. Most NER systems rely on statistical models of annotated data to identify and classify names of people, locations and organisations in text. This dependence on expensive annotation is the knowledge bottleneck our work overcomes. We first classify each Wikipedia article into named entity (NE) types, training and evaluating on 7,200 manually-labelled Wikipedia articles across nine languages. Our cross-lingual approach achieves up to 95% accuracy. We transform the links between articles into NE annotations by projecting the target article's classifications onto the anchor text. This approach yields reasonable annotations, but does not immediately compete with existing gold-standard data. By inferring additional links and heuristically tweaking the Wikipedia corpora, we better align our automatic annotations to gold standards. We annotate millions of words in nine languages, evaluating English, German, Spanish, Dutch and Russian Wikipedia-trained models against CoNLL shared task data and other gold-standard corpora. Our approach outperforms other approaches to automatic NE annotation (Richman and Schone, 2008 [61]; Mika et al., 2008 [46]); competes with gold-standard training when tested on an evaluation corpus from a different source; and performs 10% better than newswire-trained models on manually-annotated Wikipedia text.
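A minimal sketch of the link-projection step described above: given wikitext with links and a mapping from article titles to NE types, emit token-level BIO annotations. The `article_types` dictionary is a hypothetical stand-in for the article classifier, not the paper's actual pipeline.

```python
# Sketch: project Wikipedia article classifications onto link anchor text.
import re

article_types = {            # hypothetical output of the article classifier
    "Sydney": "LOC",
    "University_of_Sydney": "ORG",
}

LINK = re.compile(r"\[\[([^|\]]+)(?:\|([^\]]+))?\]\]")

def project_links(wikitext):
    """Turn [[Target|anchor]] links into (token, BIO-tag) pairs."""
    tagged = []
    pos = 0
    for m in LINK.finditer(wikitext):
        # Plain text before the link is untagged.
        tagged += [(tok, "O") for tok in wikitext[pos:m.start()].split()]
        target = m.group(1).replace(" ", "_")
        anchor = m.group(2) or m.group(1)
        ne_type = article_types.get(target)
        for i, tok in enumerate(anchor.split()):
            tag = f"{'B' if i == 0 else 'I'}-{ne_type}" if ne_type else "O"
            tagged.append((tok, tag))
        pos = m.end()
    tagged += [(tok, "O") for tok in wikitext[pos:].split()]
    return tagged

print(project_links("She studied at the [[University of Sydney]] in [[Sydney]]."))
```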
Proceedings of the Australasian Language Technology Workshop, 2007
This paper reports on the application of the Text Attribution Tool (TAT) to profiling the authors of Arabic emails. The TAT system has been developed for the purpose of language-independent author profiling and has now been trained on two email corpora, English and Arabic. We describe the overall TAT system and the Machine Learning experiments resulting in classifiers for the different author traits. Predictions for demographic and psychometric author traits show improvements over the baseline for some of the ...
Proceedings of the First Workshop on Gender Bias in Natural Language Processing, 2019
The 1st ACL workshop on Gender Bias in Natural Language Processing included a shared task on gendered ambiguous pronoun (GAP) resolution. This task was based on the coreference challenge defined in Webster et al. (2018), designed to benchmark the ability of systems to resolve pronouns in real-world contexts in a gender-fair way. 263 teams competed via a Kaggle competition, with the winning system achieving a log loss of 0.13667 and near gender parity. We review the approaches of eleven systems with accepted description papers, noting their effective use of BERT (Devlin et al., 2019), both via fine-tuning and for feature extraction, as well as ensembling.
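For reference, the headline metric quoted above is standard multi-class log loss over the three GAP outcomes (pronoun refers to name A, to name B, or to neither); a small sketch with made-up probabilities:

```python
# Sketch: multi-class log loss, as used to rank GAP submissions.
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """y_true: gold class indices; y_prob: per-example [p_A, p_B, p_neither]."""
    total = 0.0
    for gold, probs in zip(y_true, y_prob):
        p = min(max(probs[gold], eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)

print(log_loss([0, 1], [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1]]))
```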
Appositions are adjacent NPs used to add information to a discourse. We propose systems exploiting syntactic and semantic constraints to extract appositions from OntoNotes. Our joint log-linear model outperforms the state-of-the-art Favre and Hakkani-Tür (2009) model by ∼10% on Broadcast News, and achieves 54.3% F-score on multiple genres.
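The syntactic constraint at the heart of the task can be illustrated with a toy pattern over adjacent comma-separated NPs ("Barack Obama, the US president, ..."). A real system works over parse trees; this regex sketch is illustrative only.

```python
# Sketch: candidate appositions as two adjacent comma-separated NPs.
import re

APPOS = re.compile(
    r"\b([A-Z][\w.]*(?:\s+[A-Z][\w.]*)*)"  # head NP: capitalised sequence
    r",\s+((?:the|a|an)\s+[^,]+?),")       # appositive NP: determiner + text

def extract_appositions(sentence):
    return [(m.group(1), m.group(2)) for m in APPOS.finditer(sentence)]

print(extract_appositions("Barack Obama, the US president, spoke today."))
# [('Barack Obama', 'the US president')]
```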
We explore unsupervised and supervised whole-document approaches to English NEL with naïve and context clustering. Our best system uses unsupervised entity linking and naïve clustering, scoring 66.5% B^3+ F1. Our KB clustering score is competitive with the top systems at 65.6%.
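The "naïve clustering" mentioned here groups unlinkable (NIL) queries purely by mention string; a small sketch, with illustrative data structures rather than the actual TAC system:

```python
# Sketch: naive NIL clustering by normalised mention string.
from collections import defaultdict

def cluster_nils(queries):
    """queries: list of (query_id, mention, kb_id_or_None)."""
    clusters = defaultdict(list)
    assignments = {}
    for qid, mention, kb_id in queries:
        if kb_id is not None:
            assignments[qid] = kb_id          # linked queries cluster by KB entry
        else:
            clusters[mention.lower().strip()].append(qid)
    for n, qids in enumerate(clusters.values(), start=1):
        for qid in qids:
            assignments[qid] = f"NIL{n:04d}"  # one NIL cluster per surface form
    return assignments

print(cluster_nils([("q1", "Smith", "E42"), ("q2", "ACME", None), ("q3", "acme", None)]))
```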
We use a supervised whole-document approach to English Entity Linking with simple clustering approaches. The system extends our TAC 2012 system (Radford et al., 2012), introducing new features for modelling local entity description and type-specific matching, as well as type-specific supervised models and supervised NIL classification. Our rule-based clustering takes advantage of local description and topics to split NIL clusters. The best system uses supervised entity linking and local description type clustering, scoring 72.7% B^3+ F1. Our KB clustering score is competitive with the top system at 71.4%.
Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic
The CLPsych 2018 Shared Task B explores how childhood essays can predict psychological distress throughout the author's life. Our main aim was to build tools to help our psychologists understand the data, propose features and interpret predictions. We submitted two linear regression models: Model A uses simple demographic and word-count features, while Model B uses linguistic, entity, typographic, expert-gazetteer, and readability features. Our models perform best at younger prediction ages, with our best unofficial score of 0.426 disattenuated Pearson correlation at age 23. This task is challenging and, although predictive performance is limited, we propose that tight integration of expertise across computational linguistics and clinical psychology is a productive direction.
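Model A's shape, as described, is ordinary linear regression over simple demographic and word-count features; a minimal sketch with invented feature names and data, not the task's real inputs:

```python
# Sketch: a Model-A-style regressor over demographic + word-count features.
from sklearn.linear_model import LinearRegression

X = [  # [age_at_writing, essay_word_count, is_female] -- invented examples
    [11, 312, 1],
    [11, 128, 0],
    [11, 540, 1],
]
y = [0.2, 0.7, 0.1]  # made-up distress scores

model = LinearRegression().fit(X, y)
print(model.predict([[11, 400, 0]]))
```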
Tracking information flow (IFLOW) is crucial to understanding the evolution of news stories. We present analysis and experiments for IFLOW between company announcements and newswire. Error analysis shows that many false positives are annotation errors and many false negatives are due to coarse-grained document-level modelling. Experiments show that document meta-data features (e.g., category, length, timing) improve F-scores relative to the upper bound by 23%.
This report details our submission to the fourth Dialog State Tracking Challenge (DSTC4), the first time Xerox has participated. We have taken a segment-specific approach that attempts to identify ontology values as precisely as possible using a statistical model. Our model is inspired by work in Named Entity Linking: it extracts mentions, then searches for and reranks candidates. This is mainly motivated by the small amount of data available relative to the high complexity of the task. However, we believe this setting is realistic in an industrial environment, where little data is generally available for a given dialog context to automate. This relatively simple approach performs reasonably at 38.5% F1 using schedule 2 evaluation, and is the most precise at 59.4% on the DSTC4 test set.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
We investigate the generation of one-sentence Wikipedia biographies from facts derived from Wikidata slot-value pairs. We train a recurrent neural network sequence-to-sequence model with attention to select facts and generate textual summaries. Our model incorporates a novel secondary objective that helps ensure it generates sentences that contain the input facts. The model achieves a BLEU score of 41, improving significantly upon the vanilla sequence-to-sequence model and scoring roughly twice that of a simple template baseline. Human preference evaluation suggests the model is nearly as good as the Wikipedia reference. Manual analysis explores content selection, suggesting the model can trade the ability to infer knowledge against the risk of hallucinating incorrect information.
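One plausible reading of the input side is that the Wikidata slot-value facts are linearised into a single token sequence for the sequence-to-sequence model; the separator scheme below is an assumption for illustration, not the paper's exact format.

```python
# Sketch: linearise slot-value facts into one input sequence.
def linearise_facts(facts):
    """facts: list of (slot, value) pairs -> flat input token list."""
    tokens = []
    for slot, value in facts:
        tokens.append(f"<{slot}>")    # slot marker token (assumed scheme)
        tokens.extend(value.split())  # value tokens
    return tokens

facts = [("name", "Ada Lovelace"),
         ("occupation", "mathematician"),
         ("date_of_birth", "1815")]
print(" ".join(linearise_facts(facts)))
# <name> Ada Lovelace <occupation> mathematician <date_of_birth> 1815
```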
Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology, 2016
Proceedings of the 5th Workshop on Automated Knowledge Base Construction, 2016
Recognition and disambiguation of named entities in text is a knowledge-intensive task. Systems are typically bound by the resources and coverage of a single target knowledge base (KB). In place of a fixed knowledge base, we attempt to infer a set of endpoints which reliably disambiguate entity mentions on the web. We propose a method for discovering web KBs, and our preliminary results suggest that web KBs allow linking to entities that can be found on the web but may not merit a major KB entry.
This paper describes the CMCRC systems entered in the TAC 2011 entity linking challenge. We used our best-performing system from TAC 2010 to link queries, then clustered NIL links. We focused on naïve baselines that group by attributes of the top entity ...
Natural language is fraught with problems of ambiguity, including name reference. A name in text can refer to multiple entities just as an entity can be known by different names. This thesis examines how a mention in text can be linked to an external knowledge base (KB), in our case, Wikipedia. The named entity linking (NEL) task requires systems to identify the KB entry, or Wikipedia article, that a mention refers to; or, if the KB does not contain the correct entry, return NIL. Entity linking systems can be complex and we present a framework for analysing their different components. First, mentions must be extracted from the text. The KB is searched to build a list of candidate entries for a mention. Finally, a disambiguation component will identify the correct entry or propose a NIL link. This provides a lens through which to understand and compare systems, and a way to characterise how performance in one component affects another. We use this framework to comprehensively analyse three seminal systems: Bunescu and Paşca (2006), Cucerzan (2007) and Varma et al. (2009). These are evaluated on a common dataset ...
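The three-component framework reads naturally as a pipeline of pluggable functions, which is what makes it useful for comparison: swap one component and hold the others fixed. A sketch with trivial placeholder implementations, not any of the analysed systems:

```python
# Sketch: the extract -> search -> disambiguate entity linking pipeline.
def extract_mentions(text):
    # Placeholder: a real extractor would run NER over the text.
    return [tok for tok in text.split() if tok[:1].isupper()]

def search_candidates(mention, kb):
    # Placeholder: a real searcher might use aliases, acronyms, coreference.
    return [title for title in kb if mention.lower() in title.lower()]

def disambiguate(mention, candidates):
    # Placeholder: a real disambiguator ranks candidates; NIL if none.
    return candidates[0] if candidates else "NIL"

def link(text, kb):
    return {m: disambiguate(m, search_candidates(m, kb))
            for m in extract_mentions(text)}

print(link("Radford moved to Sydney", ["Sydney", "Sydney, Nova Scotia", "Will Radford"]))
```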
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2014
The AIDA-YAGO dataset is a popular target for whole-document entity recognition and disambiguation, despite lacking a shared evaluation tool. We review evaluation regimens in the literature while comparing the output of three approaches, and identify research opportunities. This utilises our open, accessible evaluation tool. We exemplify a new paradigm of distributed, shared evaluation, in which evaluation software and standardised, versioned system outputs are provided online.
Artificial Intelligence, 2013
Named Entity Linking (NEL) grounds entity mentions to their corresponding node in a Knowledge Base (KB). Recently, a number of systems have been proposed for linking entity mentions in text to Wikipedia pages. Such systems typically search for candidate entities and then disambiguate them, returning either the best candidate or NIL. However, comparison has focused on disambiguation accuracy, making it difficult to determine how search impacts performance. Furthermore, important approaches from the literature have not been systematically compared on standard data sets. We reimplement three seminal NEL systems and present a detailed evaluation of search strategies. Our experiments find that coreference and acronym handling lead to substantial improvement, and search strategies account for much of the variation between systems. This is an interesting finding, because these aspects of the problem have often been neglected in the literature, which has focused largely on complex candidate ranking algorithms.
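As one example of the kind of search strategy evaluated, acronym handling might expand an all-caps mention to the long form it abbreviates elsewhere in the document before searching the KB; the helper below is a hypothetical illustration, not the paper's method.

```python
# Sketch: expand an acronym-shaped mention from its document context.
import re

def expand_acronym(mention, document):
    if not (mention.isupper() and 2 <= len(mention) <= 6):
        return mention  # not acronym-shaped; search as-is
    # Look for a capitalised word sequence whose initials spell the mention,
    # e.g. "World Health Organization" for "WHO".
    pattern = r"\b" + r"\s+(?:of\s+|the\s+)?".join(
        re.escape(c) + r"[a-z]+" for c in mention) + r"\b"
    match = re.search(pattern, document)
    return match.group(0) if match else mention

doc = "The World Health Organization said ... WHO officials added ..."
print(expand_acronym("WHO", doc))  # -> "World Health Organization"
```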
Information is fundamental to finance, and understanding how it flows from official sources to news agencies is a central problem. Readers need to digest information rapidly from high-volume news feeds, which often contain duplicate and irrelevant stories, to gain a competitive advantage. We propose a text categorisation task over pairs of official announcements and news stories to identify whether the story repeats announcement information and/or adds value. Using features based on the intersection of the texts and relative timing, our system identifies information flow at 89.5% F-score and three types of journalistic contribution at 73.4% to 85.7% F-score. Evaluation against majority annotator decisions performs 13% better than a bag-of-words baseline.
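The two feature families described, lexical intersection and relative timing, can be sketched for an announcement/story pair as follows; feature names and data are illustrative only.

```python
# Sketch: overlap and timing features for an announcement/story pair.
from datetime import datetime

def iflow_features(announcement, story, ann_time, story_time):
    a_toks, s_toks = set(announcement.lower().split()), set(story.lower().split())
    overlap = a_toks & s_toks
    return {
        # How much of the story is covered by announcement vocabulary?
        "overlap_story_frac": len(overlap) / max(len(s_toks), 1),
        # And vice versa: how much of the announcement does the story reuse?
        "overlap_ann_frac": len(overlap) / max(len(a_toks), 1),
        # Stories published soon after an announcement are likelier reprints.
        "minutes_after": (story_time - ann_time).total_seconds() / 60,
    }

print(iflow_features(
    "ACME announces record quarterly profit of $10m",
    "ACME posts record profit, analysts react",
    datetime(2010, 5, 3, 9, 0), datetime(2010, 5, 3, 9, 25)))
```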
Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids, 2010
@Book{CLW:2010,
  editor = {Michael Piotrowski and Cerstin Mahlow and Robert Dale},
  title = {Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids},
  month = {June},
  year = {2010},
  address = {Los Angeles, CA, USA},
  publisher = {Association for Computational Linguistics},
  url = {http://www.aclweb.org/anthology/W10-04}
}

@InProceedings{rosener:2010:CLW,
  author = {R\"{o}sener, Christoph},
  title = {Computational Linguistics in the Translator's Workflow---Combining Authoring ...