Automatic summarization for text simplification: evaluating text understanding by poor readers

Automatic summarization for text simplification

Companion Proceedings of the XIV Brazilian Symposium on Multimedia and the Web - WebMedia '08, 2008

In this paper we present experiments on summarization and text simplification for poor readers, specifically functionally illiterate readers. We test several summarizers and use their summaries as the basis of simplification strategies. We show that each simplification approach has different effects on readers of varied levels of literacy, but that all of them improve text understanding at some level.

Readability assessment for text simplification

We describe a readability assessment approach to support the process of text simplification for poor literacy readers. Given an input text, the goal is to predict its readability level, which corresponds to the literacy level that is expected from the target reader: rudimentary, basic or advanced. We complement features traditionally used for readability assessment with a number of new features, and experiment with alternative ways to model this problem using machine learning methods, namely classification, regression and ranking. The best resulting model is embedded in an authoring tool for Text Simplification.
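The abstract above frames readability prediction as supervised learning over text features. As a minimal illustration, not the paper's actual model or feature set, the three-way rudimentary/basic/advanced decision can be sketched as nearest-centroid classification over two classic features, average sentence length and average word length; the centroid values here are invented for the example:

```python
# Hedged sketch: readability-level prediction as classification.
# Features and centroids are illustrative, not those of the paper.

def extract_features(text):
    """Two classic readability features: average sentence length
    (in words) and average word length (in characters)."""
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    words = text.split()
    avg_sent_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(len(w.strip(".,!?")) for w in words) / max(len(words), 1)
    return (avg_sent_len, avg_word_len)

# Toy centroids for the three literacy levels the paper targets
CENTROIDS = {
    "rudimentary": (8.0, 4.0),
    "basic": (15.0, 5.0),
    "advanced": (25.0, 6.5),
}

def predict_level(text):
    """Nearest-centroid classification in the toy feature space."""
    f = extract_features(text)
    return min(CENTROIDS,
               key=lambda lvl: sum((a - b) ** 2
                                   for a, b in zip(f, CENTROIDS[lvl])))
```

A real system would use many more features (as the paper describes) and a trained classifier, regressor or ranker rather than hand-picked centroids, but the decision structure is the same.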

Text simplification for reading assistance

Proceedings of the Second International Workshop on Paraphrasing, 2003

This paper describes our ongoing research project on text simplification for congenitally deaf people. The text simplification we are aiming at is the task of offering a deaf reader a syntactic and lexical paraphrase of a given text to assist her/him in understanding what it means. In this paper, we discuss the issues we should address to realize text simplification and report present results on three different aspects of this task: readability assessment, paraphrase representation and post-transfer error detection.

Text simplification and comprehensible input: A case for an intuitive approach

Language Teaching Research, 2011

Texts are routinely simplified to make them more comprehensible for second language learners. However, the effects of simplification upon the linguistic features of texts remain largely unexplored. Here we examine the effects of one type of text simplification: intuitive text simplification. We use the computational tool Coh-Metrix to examine linguistic differences across proficiency levels in a corpus of 300 news texts simplified to three levels (beginner, intermediate, advanced). The main analysis reveals significant differences between levels for a wide range of linguistic features, particularly between the beginner and advanced levels. The results show that lower-level texts are generally less lexically and syntactically sophisticated than higher-level texts, and that they contain more cohesive features. The analysis also provides strong evidence that these linguistic features can be used t...

Flesch-Kincaid is Not a Text Simplification Evaluation Metric

2021

Sentence-level text simplification is currently evaluated using both automated metrics and human evaluation. For automatic evaluation, a combination of metrics is usually employed to evaluate different aspects of the simplification. The Flesch-Kincaid Grade Level (FKGL) is one metric that has regularly been used to measure the readability of system output. In this paper, we argue that FKGL should not be used to evaluate text simplification systems. We provide experimental analyses of recent system output showing that the FKGL score can easily be manipulated, improving it dramatically with only minor impact on other automated metrics (BLEU and SARI). Instead of using FKGL, we suggest that its component statistics, along with others, be used for post-hoc analysis to understand system behavior.
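FKGL itself is a fixed linear formula over component statistics, FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59, which is exactly why it is easy to game: splitting output into many short sentences lowers the words-per-sentence term without changing content at all. A small sketch, using a crude vowel-group syllable heuristic standing in for a pronunciation dictionary:

```python
import re

def count_syllables(word):
    """Crude vowel-group heuristic; real implementations use a
    pronunciation dictionary, so treat this as an approximation."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(len(groups), 1)

def fkgl(sentences):
    """Flesch-Kincaid Grade Level over a list of sentences, each a
    list of words: 0.39*(W/S) + 11.8*(Syll/W) - 15.59."""
    n_sent = len(sentences)
    words = [w for s in sentences for w in s]
    n_words = len(words)
    n_syll = sum(count_syllables(w) for w in words)
    return 0.39 * n_words / n_sent + 11.8 * n_syll / n_words - 15.59

# Gaming the metric: the same words split into short sentences score
# "simpler" even though nothing about the content changed.
words = ("the national committee announced an important decision about "
         "educational policy yesterday afternoon during the meeting").split()
as_one_sentence = fkgl([words])
as_five_sentences = fkgl([words[i:i + 3] for i in range(0, len(words), 3)])
```

Here `as_five_sentences` is lower than `as_one_sentence` by exactly 0.39 times the drop in words-per-sentence, illustrating the paper's point that the score can be driven down with trivial restructuring.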

Approaches to Text Simplification: Can Computer Technologies Outdo a Human Mind?

GEMA Online Journal of Language Studies, 2021

Narrowly specialized information is addressed to a limited circle of professionals, although it also attracts interest among people without specialized education. This gives rise to a need for the popularization of scientific information, a process carried out through simplified texts, a kind of secondary text aimed directly at the addressee. Age, language proficiency and background knowledge are the main features usually taken into consideration by the author of the secondary text, who makes changes in the text's composition as well as in its pragmatics, semantics and syntax. This article analyses traditional approaches to text simplification, computer simplification and summarization. The authors compare human-authored simplification of literary texts with the newest trends in computer simplification in order to promote further development of machine simplification tools. It has been found that simplified scientific texts seem more natural than simplified literary texts, since technical background knowledge can be processed with machine tools. The authors conclude that literary and technical texts call for different approaches to adaptation and simplification. In addition, readers' personal experience plays a great part in uncovering the implications in literary texts. In this respect it might be reasonable to create separate engines for simplifying and adapting texts from different spheres of knowledge.

Automatic Text Simplification of News Articles in the Context of Public Broadcasting

arXiv (Cornell University), 2022

This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Université de Montréal in August 2022. The team tackled a problem submitted by CBC/Radio-Canada on the theme of Automatic Text Simplification (ATS). In order to make its written content more widely accessible, and to support its second-language teaching activities, CBC/RC has recently been exploring the potential of automatic methods to simplify texts. They have developed a modular lexical simplification system (LSS), which identifies complex words in French and English texts, and replaces them with simpler, more common equivalents. Recently however, the ATS research community has proposed a number of approaches that rely on deep learning methods to perform more elaborate transformations, not limited to just lexical substitutions, but covering syntactic restructuring and conceptual simplifications as well. The main goal of CBC/RC's participation in the workshop was to examine these new methods and to compare their performance to that of their own LSS. This report is structured as follows: In Section 2, we detail the context of the proposed problem and the requirements of the sponsor. We then give an overview of current ATS methods in Section 3. Section 4 provides information about the relevant datasets available, both for training and testing ATS methods. As is often the case in natural language processing applications, there is much less data available to support ATS in French than in English; therefore, we also discuss in that section the possibility of automatically translating English resources into French, as a means of supplementing the French data. The outcome of text simplification, whether automatic or not, is notoriously difficult to evaluate objectively; in Section 5, we discuss the various evaluation methods we have considered, both manual and automatic. 
Finally, we present the ATS methods we have tested and the outcome of their evaluation in Section 6; Section 7 concludes this document and presents research directions.
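The lexical simplification system (LSS) described above identifies complex words and replaces them with simpler, more common equivalents. A toy sketch of such a frequency-threshold pipeline follows; the frequency list, threshold and substitution table are invented stand-ins, not CBC/RC's actual resources:

```python
# Hedged sketch of a modular lexical simplification pipeline:
# flag words as "complex" when they fall below a corpus-frequency
# threshold, then substitute a simpler, more common equivalent.
# All resources below are toy examples.

WORD_FREQ = {
    "the": 1000, "we": 950, "use": 900, "help": 800,
    "new": 700, "tools": 400, "utilize": 12, "facilitate": 9,
}
SUBSTITUTIONS = {"utilize": "use", "facilitate": "help"}
COMPLEXITY_THRESHOLD = 50  # words rarer than this are flagged as complex

def simplify(tokens):
    """Replace each flagged complex token that has a known simpler
    equivalent; leave everything else untouched."""
    out = []
    for tok in tokens:
        low = tok.lower()
        if WORD_FREQ.get(low, 0) < COMPLEXITY_THRESHOLD and low in SUBSTITUTIONS:
            out.append(SUBSTITUTIONS[low])
        else:
            out.append(tok)
    return out
```

The deep-learning approaches the workshop compared against go beyond this token-by-token scheme, performing syntactic restructuring and conceptual simplification, which is precisely what a purely lexical pipeline like this cannot do.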

One Step Closer to Automatic Evaluation of Text Simplification Systems

Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), 2014

This study explores the possibility of replacing the costly and time-consuming human evaluation of the grammaticality and meaning preservation of the output of text simplification (TS) systems with some automatic measures. The focus is on six widely used machine translation (MT) evaluation metrics and their correlation with human judgements of grammaticality and meaning preservation in text snippets. As the results show a significant correlation between them, we go further and try to classify simplified sentences into: (1) those which are acceptable; (2) those which need minimal post-editing; and (3) those which should be discarded. The preliminary results, reported in this paper, are promising.
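The core of the study is measuring how well automatic metric scores track human judgements of grammaticality and meaning preservation over a set of text snippets. A minimal sketch of that check, using Pearson's r over invented toy scores, not the paper's data or its six MT metrics:

```python
# Hedged sketch: correlating an automatic metric with human ratings.
# The score lists are invented toy numbers for illustration only.

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

metric_scores = [0.91, 0.75, 0.40, 0.85, 0.30]  # automatic metric per snippet
human_scores = [4.8, 4.0, 2.1, 4.5, 1.8]        # mean human rating per snippet

r = pearson(metric_scores, human_scores)
```

A high r is what licenses the paper's next step: using metric scores to sort simplified sentences into acceptable, needs-minimal-post-editing, and discard.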

A corpus analysis of simple account texts and the proposal of simplification strategies: first steps towards text simplification systems

2008

In this paper we investigate the main linguistic phenomena that can make texts complex and how they could be simplified. We focus on a corpus analysis of simple account texts available on the web for Brazilian Portuguese (BP). This study illustrates the need for text simplification to facilitate accessibility to information by poor readers and by people with cognitive disabilities. It also highlights features of simplification for BP, which may differ from other languages. Moreover, we propose simplification strategies and a Simplification Annotation Editor. This study consists of the first step towards building BP text simplification systems. One of the scenarios in which these systems could be used is that of reading electronic texts produced, e.g., by the Brazilian government or by news agencies.