Computational text analysis of intermediate and high intermediate reading passages for ESL learners / Anealka Aziz Hussin

An Examination of Current Text Difficulty Indices With Early Reading Texts

TextProject Reading Research Report #10-01, 2010

This study considers the degree to which two quantitative indices—Lexiles and Coh-Metrix—discriminate across levels of difficulty and types of beginning reading texts. The database consisted of 444 texts, representing seven text types that are part of reading/language arts instruction. These text types were distributed across seven levels of text difficulty. Analyses showed that Lexiles predicted a clear progression in difficulty across the seven levels but that these differences were due almost entirely to Mean Sentence Length (MSL), not Mean Lexical Frequency (MLF). Findings were similar for the syntax and word abstractness variables of Coh-Metrix. Of three additional Coh-Metrix variables—non-narrativity, referential cohesion, and situation model cohesion—only referential cohesion showed a progression of easier to harder across the seven text levels. Of the seven text types, trade books had the highest Lexiles, while historical textbooks had the lowest. The results of the Coh-Metrix analyses showed that all text types fell within the easy range but trade books were predicted to be the hardest and the historical textbooks the easiest. These quantitative systems validate the general order of text levels and indicate some differences in the predicted ease or difficulty of text types. The usefulness of this information—general in nature—in matching beginning readers with appropriate texts is less certain. The report concludes with an identification of next steps for supporting optimal matches of texts and beginning readers’ knowledge.
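The two Lexile components discussed above, Mean Sentence Length (MSL) and Mean Lexical Frequency (MLF), can be approximated from raw text. The sketch below is a minimal illustration, not the proprietary Lexile computation; the tiny log-frequency table is hypothetical and stands in for a large reference corpus.

```python
import re

# Hypothetical log10 word frequencies; a real analysis would draw these
# from a large reference corpus or published frequency list.
LOG_FREQ = {"the": 4.8, "cat": 2.9, "sat": 2.5, "on": 4.3, "mat": 2.2,
            "a": 4.7, "dog": 3.0, "ran": 2.7}

def msl_and_mlf(text):
    """Return (mean sentence length in words, mean log lexical frequency)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    msl = len(words) / len(sentences)
    known = [LOG_FREQ[w] for w in words if w in LOG_FREQ]
    mlf = sum(known) / len(known)
    return msl, mlf

msl, mlf = msl_and_mlf("The cat sat on the mat. A dog ran.")
```

On this toy input, longer sentences raise MSL and rarer words lower MLF, which is the direction of effect the study measures.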

TEXT DIFFICULTY: A COMPARISON OF READABILITY FORMULAE AND EXPERTS’ JUDGMENT

Teachers of English, librarians, and researchers have long been interested in finding the right text for the right reader. In second language (L2) teaching, text writers often try to meet this demand by simplifying texts for readers. The resulting term "readability" can be defined as "the ease of reading words and sentences" (Hargis et al., 1998). The aim of this research was to compare three ways of finding the right text for the right reader: traditional readability formulae (Flesch Reading Ease, Flesch-Kincaid Grade Level); the Coh-Metrix Second Language (L2) Reading Index, a readability formula based on psycholinguistic and cognitive models of reading; and teachers' estimation of grade levels using leveled texts from a website. A selection of texts from a corpus of intuitively simplified texts was used (N = 30). Coh-Metrix L2 readability, Flesch Reading Ease, and Flesch-Kincaid Grade Level scores for the texts were calculated via the Coh-Metrix Web Tool. Three teachers of English were asked to judge the levels of the texts. When the relationships among the Coh-Metrix readability levels, the traditional formulae, and the website's text levels were analysed in SPSS, a weak negative correlation was found between Flesch-Kincaid Grade Level and the website's text levels (r = -.39). There was also a weak negative correlation between the website's text levels and Flesch Reading Ease scores (r = -.41). However, there was a moderate negative correlation between Coh-Metrix readability levels and the website's text levels (r = -.63), and Teacher 1's judgments and the Coh-Metrix readability levels showed a very strong positive correlation (r = .95). It was concluded that readability formulae can help L2 teachers when they select texts for their students for teaching and assessment purposes.
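The two traditional formulae compared above are simple functions of sentence length and syllable count. The sketch below uses the standard published coefficients; the syllable counter is a rough vowel-group heuristic, so its scores will only approximate those of the Coh-Metrix Web Tool or a word processor.

```python
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (min. 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_scores(text):
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # words per sentence
    spw = syllables / len(words)        # syllables per word
    ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade = 0.39 * wps + 11.8 * spw - 15.59
    return ease, grade

ease, grade = flesch_scores("The cat sat on the mat. It was happy there.")
```

Note the opposite polarities that produce the opposite-signed correlations reported above: Reading Ease falls as texts get harder, while Grade Level rises.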

Empirical verification of the Polish formula of text difficulty

The aim of the study was to verify the accuracy of the formula for assessing the difficulty of texts written in Polish (Pisarek 1969). The study involved 1,309 persons aged between 15 and 84. Fifteen texts were used, each approximately 300 words long, representing different subjects and varied difficulty levels. Text comprehension was checked with multiple-choice tests, the cloze procedure, and open-ended questions. Significant correlations between the difficulty of a text and its comprehension were found (r_mc(15) = -0.529, p = 0.043; r_open = -0.519, p = 0.047; r_cloze = -0.656, p = 0.008). The results confirmed the relative accuracy and usefulness of Pisarek's readability formula. The discussion includes proposed ways of improving the current form of the formula.

Readability of Texts: Human Evaluation Versus Computer Index

mcser.org

This paper reports a study which aimed at exploring whether there is any difference between the evaluation of EFL expert readers and computer-based evaluation of English text difficulty. Forty-three participants, including university EFL instructors and graduate students, read 10 different English passages and completed a Likert-type scale on their perception of the different components of text difficulty. The same 10 English texts were also processed with the Word program, and the Flesch Readability index of each text was calculated. Comparisons were then made to see whether the readers' evaluations of the texts matched the calculated scores. Results of the study revealed significant differences between participants' evaluations of text difficulty and the Flesch Readability index of the texts. Findings also indicated that there was no significant difference between EFL instructors' and graduate students' evaluations of text difficulty. The findings imply that while readability formulas are valuable measures for evaluating text difficulty, they should be used cautiously. Further research seems necessary to check the validity of the readability formulas and the findings of the present study.

FORMALISING TEXT DIFFICULTY WITHIN THE EFL CONTEXT (Talk) Readability indices for the assessment of textbooks: a feasibility study in the context of EFL (Paper)

Readability indexes have been widely used to measure textual difficulty. They can be very useful for the automatic classification of texts, especially within the language teaching discipline. Among other applications, they allow the difficulty level of a text to be determined in advance, without the need to read it through. The aim of this investigation is twofold: first, to examine the degree of accuracy of the six most commonly used readability indexes: Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog, Automated Readability Index, SMOG, and Coleman-Liau; and second, by means of the data obtained, to try to come up with a new optimised measure.
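Two of the six indexes named above, the Automated Readability Index and Coleman-Liau, avoid syllable counting entirely and depend only on character, word, and sentence counts, which is what makes them so easy to automate. A minimal sketch with the standard published coefficients:

```python
import re

def char_based_indices(text):
    """Return (ARI, Coleman-Liau) grade-level estimates for `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    letters = sum(len(w) for w in words)
    # Automated Readability Index: characters/word and words/sentence
    ari = (4.71 * (letters / len(words))
           + 0.5 * (len(words) / len(sentences)) - 21.43)
    # Coleman-Liau: L = letters per 100 words, S = sentences per 100 words
    L = 100 * letters / len(words)
    S = 100 * len(sentences) / len(words)
    cli = 0.0588 * L - 0.296 * S - 15.8
    return ari, cli

ari, cli = char_based_indices("The cat sat on the mat. It was happy there.")
```

The other four indexes require a syllable or complex-word count, which is where automated implementations typically diverge from one another.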

A Corpus-Based Readability Formula for Estimate of Arabic Texts Reading Difficulty

The present study is aimed at designing a formula for estimating the difficulty of reading Arabic texts. Flesch, Gunning Fog, and Dale-Chall are some of the formulae that have been used for measuring the difficulty of English texts. Some of them have been automated, making it easy for users to check the readability level of a particular text. A few scholars have attempted to come up with a readability formula for Arabic, but none has been automated. This study is thus conducted to produce a formula that would make it possible for users to measure the difficulty level of Arabic texts online. This will greatly help in materials selection for reading comprehension and testing. This paper presents the prototype of a corpus-based readability formula for estimating the difficulty of Arabic written documents.

Reconstructing Readability: Recent Developments and Recommendations in the Analysis of Text Difficulty

Educational Psychology Review, 2012

Largely due to technological advances, methods for analyzing readability have increased significantly in recent years. While past researchers designed hundreds of formulas to estimate the difficulty of texts for readers, controversy has surrounded their use for decades, with criticism stemming largely from their application in creating new texts as well as their utilization of surface-level indicators as proxies for complex cognitive processes that take place when reading a text. This review focuses on examining developments in the field of readability during the past two decades with the goal of informing both current and future research and providing recommendations for present use. The fields of education, linguistics, cognitive science, psychology, discourse processing, and computer science have all made recent strides in developing new methods for predicting the difficulty of texts for various populations. However, there is a need for further development of these methods if they are to become widely available.

Keywords: Readability, text difficulty, reading, text analysis

A century of reading research paralleled a century of research into what makes one text more or less difficult to read and comprehend than another. Some estimate that by the 1980s, over 200 readability formulas had already been developed (DuBay 2004), and since the 1980s, the area has exploded in fields like discourse processing and computer science. The question that remains in the minds of educators and researchers alike is "what is the best way of determining the difficulty of a particular text?" Several reviews and summaries are available of the older, more traditional methods of assessing readability (e.g., Bormuth 1966; DuBay 2004; Klare 1974), and most researchers in the field discuss these classic methods by way of introduction to their own research. Controversy, however, has surrounded these older formulas, and new methods are constantly being developed and tested. The purposes of this review were to (a) examine recent developments in the field of

An Appraisal of the Efficacy of Measures of Readability in Identifying Level of Text Difficulty among the Intermediate ESL Learners

The study is an exposition of "making a match" between a reader and a text, especially in pedagogical situations. The study posits that learners who are given reading materials that are too easy are not challenged, and their learning growth can be stunted. However, learners who are given reading materials that are too difficult can fail to make progress and consequently become frustrated. It is therefore imperative for the teacher to make a match so that the learner can smoothly and reliably climb the educational ladder through the developmental levels. The study finds that research on readability is anchored in two general factors influencing readability. For the text factors, word difficulty, word familiarity, and sentence difficulty, among others, are explored, while the reader factors include physical capabilities, reading abilities, motivation, prior knowledge, etc. The study therefore recommends that teachers should be aware of learners' individual reading needs, select appropriate reading materials, and motivate learners to read.