WORKING PAPER 73 ANOTHER APPROACH TO ENGLISH
1974
A new approach to building descriptions of English is outlined, and programs implementing the ideas for sentence-sized fragments are demonstrated.
Mind your grammar: a new approach to modelling text
1987
Beginning to create the New Oxford English Dictionary database has resulted in the realization that databases for reference texts are unlike those for conventional enterprises. While the traditional approaches to database design and development are sound, the particular techniques used for commercial databases have been repeatedly found to be inappropriate for text-dominated databases, such as the New OED. In the same way that the relational model was developed based on experiences gained from earlier database approaches, the grammar-based model presented here builds on the traditional foundations of computer science, and particularly database theory and practice. This new model uses grammars as schemas and "parsed strings" as instances. Operators on the parsed strings are defined, resulting in a "p-string algebra" that can be used for data manipulation and view definition. The model is representation-independent and the operators are non-navigational, so that efficient implementations may be developed for unknown future hardware and operating systems. Several approaches to storage structures and efficient processing algorithms for representative hardware configurations have been investigated. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
Using a parser as a heuristic tool for the description of New Englishes
2009
We propose a novel use of an automatic parser as a tool for descriptive linguistics. This use has two advantages. First, quantitative data on very large amounts of text are available instantly, whereas manual annotation would take years of work. Second, it allows variational linguists to use a partly corpus-driven approach, where results emerge from the data. The disadvantage of the parser-based approach is that the level of precision and recall is lower. We give evaluations of precision and recall of the parser we use. We then show the application of the parser-based approach to a selection of New Englishes, using several subparts of the International Corpus of English (ICE). We employ two methods to discover potential features of New Englishes: (a) by exploring quantitative differences in the use of established syntactic patterns, and (b) by evaluating the potential correlation between parsing breakdowns and regional syntactic innovations.
Sentence Fragments Regular Structures
1988
This paper describes an analysis of telegraphic fragments as regular structures (not errors) handled by minimal extensions to a system designed for processing the standard language. The modular approach which has been implemented in the Unisys natural language processing system PUNDIT is based on a division of labor in which syntax regulates the occurrence and distribution of elided elements, and semantics and pragmatics use the system's standard mechanisms to interpret them.
A System for Automatic English Text Expansion
IEEE Access
We present an automatic text expansion system to generate English sentences, which performs automatic Natural Language Generation (NLG) by combining linguistic rules with statistical approaches. Here, "automatic" means that the system can generate coherent and correct sentences from a minimum set of words. From its inception, the design is modular and adaptable to other languages. This adaptability is one of its greatest advantages. For English, we have created the highly precise aLexiE lexicon with wide coverage, which represents a contribution on its own. We have evaluated the resulting NLG library in an Augmentative and Alternative Communication (AAC) proof of concept, both directly (by regenerating corpus sentences) and manually (from annotations) using a popular corpus in the NLG field. We performed a second analysis by comparing the quality of text expansion in English to Spanish, using an ad-hoc Spanish-English parallel corpus. The system might also be applied to other domains such as report and news generation.
Description-directed Natural Language Generation
1985
We report here on a significant new set of capabilities that we have incorporated into our language generation system MUMBLE. Their impact will be to greatly simplify the work of any text planner that uses MUMBLE as its linguistics component, since MUMBLE can now take on many of the planner's text organization and decision-making problems with markedly less hand-tailoring of algorithms in either component.
Handbook of Natural Language Processing, 2000