Francis Bond - Academia.edu
Papers by Francis Bond
Achieving such a high quality of proceedings would have been impossible without the huge work that was undertaken by the International Program Committee members. We take this opportunity to thank them for their great support and cooperation.
CSLI Publications eBooks, 2016
Natural Language Engineering, May 3, 2013
This paper shows the necessity of distinguishing different referential uses of NPs in Machine Translation. We propose a three-way distinction between the generic, referential and ascriptive uses of noun phrases (NPs), and argue that this is the minimum necessary to generate articles and number correctly when translating from Japanese to English. A detailed algorithm is proposed for determining the referentiality of Japanese NPs, based on a defeasible hierarchy of pragmatic rules that are applied top-down, from the clause to the NP. We also sketch the process of generating determiners and number using rules based on the different NP referentialities for a Japanese-English MT system. Using the proposed heuristics has raised the percentage of NPs generated with correct use of articles and number in the Japanese-English MT system ALT-J/E from 65% to 85%.
Natural language communication between machines and humans is still constrained. The article addresses a gap in natural language understanding about actions, specifically that of understanding commands. We propose a new method for commonsense inference (grounding) of high-level natural language commands into specific action commands for further execution by a robotic system. The method makes it possible to build a knowledge base that consists of a large set of commonsense inferences. Preliminary results are presented.
Language Resources and Evaluation, May 1, 2002
IWSLT, 2008
Large amounts of training data are essential for training statistical machine translation systems. In this paper we show how training data can be expanded by paraphrasing one side. The new data is made by parsing then generating using a precise HPSG-based grammar, which gives sentences with the same meaning, but minor variations in lexical choice and word order. In experiments with Japanese and English, we showed consistent gains on the Tanaka Corpus with less consistent improvement on the IWSLT 2005 evaluation data.
This is the proceedings of the second ACL workshop on multiword expressions (MWEs). MWEs are increasingly being singled out as a problem for NLP, particularly for the many applications which require some degree of semantic interpretation and require tasks such as parsing and word sense disambiguation. In the call for papers we solicited papers that especially emphasized integrating analysis, acquisition and treatment of various kinds of multiword expressions in NLP. Examples include research that combines a linguistic analysis with a method of automatically acquiring the classes described, work that combines the computational treatment of a class of MWEs with a solid linguistic analysis, and research that extracts MWEs and either classifies them or uses them in some task. We received 23 submissions (3 from Asia, 11 from Europe and 9 from the Americas), and accepted 11 of them for presentation, with two reserves. Each submission was reviewed by three members of the program committee, who not only judged each submission but also gave detailed comments to the authors. The overall quality of submissions was high, making the final selection very difficult. The papers in these proceedings are those which were finally selected for presentation. Many of the papers deal with MWEs in general, rather than aiming at specific subtypes, with examples from a wide range of languages (Basque, English, Japanese, Portuguese, Russian and Turkish). There was also a variety of formalisms considered (dependency grammar, finite state machines, lexical conceptual structure, HPSG, ...) as well as more descriptive papers. The main applications targeted were machine translation and information retrieval.
This paper presents work in progress on the development of derivational links for the Japanese WordNet, with a focus on the retrieval, validation and elaboration of nouns and verbs linked by the agentive noun derivation. We generated 2,340 such links, of which we validated 833 pairs. We briefly discuss some challenges in determining valid link pairs as well as their morphosemantic natures. We also consider the possibilities and challenges of automating the discovery of morphosemantic links, by linking our results with current theoretical issues in agentive nominals. In addition, we are currently corroborating these Japanese agentive derivations with English counterparts from the Princeton WordNet and intend to perform a more rigorous cross-lingual comparison.
English, Apr 8, 2022
It is common to discover an epigraph in the opening pages of a novel that highlights one or more of the major themes and denotes the influence of another author on the composition of the text. Yet, the inclusion of an epigraph also bestows prestige on the citing text – helping the author to select his or her place in the wider literary tradition – and situates the text in a particular genre or historical period. In order to trace the development of what Gérard Genette dubbed the ‘epigraph effect’, we collected 16,963 epigraphs and recorded their provenance (author, work, date, and country of origin). This collection enables us to trace intertextual connections between authors throughout literary history and national traditions.
Asian-Pacific Journal of Second and Foreign Language Education, Nov 18, 2016
In this paper, Chinese curricula at the primary level in China and Singapore are compared and contrasted using both quantitative methods (Word Segmenter and Text Analyzer) and qualitative methods (in-depth thematic analysis). The research shows challenges for educational administrators, teachers and other professional staff in Chinese education in applying information and communication technology, implementing individualized and autonomous teaching and learning, balancing different teaching objectives, understanding students' cognitive and psychological varieties, readjusting teachers' roles as organizer, facilitator and instructor, and aligning formative assessment and summative assessment. Meanwhile, the research also demonstrates and rationalizes the prospects of Mobile Assisted Language Learning (MALL) in Chinese teaching and the glocalization of Chinese curricula in the future. Further, the research will help people know more about the diverse sociolinguistic contexts in China and Singapore, the different essence of first language (L1) and second language (L2) teaching, and the complexity and arduousness of language teaching in practice.
version of a work that has been peer reviewed and accepted for publication by Workshop on Language Resources in Asia, NLPRS-2001. It incorporates referee's comments, but changes resulting from the publishing process, such as copyediting and structural formatting, may not be reflected in this document.
... dellorletta-EtAl:2006:Frontiers, author = {Dell'Orletta, Felice and Lenci, Alessandro and Montemagni, Simonetta and Pirrelli, Vito}, title = {Probing ... Linguistics}, pages = {21--28}, url = {http://www.aclweb.org/anthology/W/W06/W06-0604} } @InProceedings{maxwell-hughes:2006 ...
arXiv (Cornell University), Jan 23, 1996
Wrocław University of Science and Technology, Department of Computational Intelligence, Jul 28, 2020