FrecELE: A lexical database for Spanish as a Second Language
Related papers
Lextale-Esp: A Test to Rapidly and Efficiently Assess the Spanish Vocabulary Size
The methods to measure vocabulary size vary across disciplines. This heterogeneity hinders direct comparisons between studies and slows down the understanding of research findings. A quick, free and efficient test of English language proficiency, LexTALE, was recently developed to remedy this problem. LexTALE has been validated and shown to be an effective tool for distinguishing between different levels of proficiency in English.
Accurate Measurement of Lexical Sophistication with Reference to ESL Learner Data
Proceedings of the 11th International Conference on Educational Data Mining (Boyer, K.E. and Yudelson, M. Eds.), 2018
One commonly used measure of lexical sophistication is the Advanced Guiraud (AG; [9]), whose formula requires frequency band counts (e.g., COCA; [13]). However, the accuracy of this measure is affected by the particular 2000-word frequency list selected as the basis for its calculations [27]. For example, possible issues arise when frequency lists that are based solely on native speaker corpora are used as a target for second language (L2) learners (e.g., [8]) because the exposure frequencies for L2 learners may vary from that of native speakers. Such L2 variation from comparable native speakers may be due to first language (L1) culture, home country teaching materials, or the text types which L2 learners commonly encounter. This paper addresses the aforementioned problem through an English as a Second Language (ESL) frequency list validation. Our validation is established on two sources: (1) the New General Service List (NGSL; [4]) which is based on the Cambridge English Corpus (CEC) and (2) written data from the 4.2 million-word Pitt English Language Institute Corpus (PELIC). Using open-source data science tools and natural language processing technologies, the paper demonstrates that more distinct measurable lexical sophistication differences across levels are discernible when learner-oriented frequency lists (as compared to general corpora frequency lists) are used as part of a lexical measure such as AG. The results from this research will be useful in teaching contexts where lexical proficiency is measured or assessed, and for materials and test developers who rely on such lists as being representative of known vocabulary at different levels of proficiency. This research applies data-driven exploration of learner corpora to vocabulary acquisition and pedagogy, thus closing a loop between educational data mining and classroom applications.
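As a rough illustration of the measure discussed in this abstract, Advanced Guiraud is usually computed as the number of advanced word types divided by the square root of the total number of tokens, where "advanced" means absent from a chosen basic frequency list. The sketch below assumes that definition; the function name `advanced_guiraud` and the toy word lists are hypothetical and are not taken from the paper, NGSL, or PELIC.

```python
import math


def advanced_guiraud(tokens, basic_words):
    """Advanced types / sqrt(total tokens).

    `basic_words` stands in for a frequency list (for example a 2000-word
    list); any type not in it is counted as advanced.
    """
    tokens = [t.lower() for t in tokens]
    if not tokens:
        return 0.0
    advanced_types = {t for t in tokens if t not in basic_words}
    return len(advanced_types) / math.sqrt(len(tokens))


# Toy data (hypothetical): two different "basic" lists for the same text.
text = "the cat sat on a dilapidated veranda".split()
general_list = {"the", "cat", "sat", "on", "a"}
learner_list = {"the", "cat", "sat", "on", "a", "veranda"}

ag_general = advanced_guiraud(text, general_list)
ag_learner = advanced_guiraud(text, learner_list)
```

The same text yields a different score depending on which list defines "basic" vocabulary, which is the sensitivity to list choice that the abstract describes.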
Behavior Research Methods, 2014
In this paper we introduce ESCOLEX, the first European Portuguese children's lexical database with grade-level-adjusted word frequency statistics. Computed from a 3.2 million word corpus, ESCOLEX provides 48,381 wordforms extracted from 171 Elementary and Middle School textbooks for 6- to 11-year-old children attending the first six grades in the Portuguese educational system. Similarly to other children's grade-level databases (e.g. ... frequency indices for each grade: overall word frequency (F), index of dispersion across the selected textbooks (D), estimated frequency per million words (U), and standard frequency index (SFI). It also provides the new measure of contextual diversity (CD). Additionally, the number of letters in the word, part-of-speech, number of syllables, syllable structure, and adult frequencies taken from P-PAL (a European Portuguese corpus-based lexical database - in press) are also provided. ESCOLEX is a useful tool for both researchers interested in language processing and development, and professionals in need of verbal material adjusted to children's developmental stages. ESCOLEX can be downloaded at
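Two of the indices named in this abstract can be sketched in a few lines, assuming that overall frequency (F) is a raw token count and that contextual diversity (CD) is the number of documents in which a word occurs. The `per_million` value below is a plain normalised count, not the dispersion-adjusted U index, and the function name and toy documents are hypothetical rather than ESCOLEX data.

```python
from collections import Counter


def frequency_and_cd(documents):
    """Return raw frequency, raw per-million rate, and document count per word."""
    freq = Counter()  # F: total occurrences across all documents
    cd = Counter()    # CD: number of documents containing the word
    for doc in documents:
        tokens = doc.lower().split()
        freq.update(tokens)
        cd.update(set(tokens))  # count each word at most once per document
    total = sum(freq.values())
    return {
        word: {
            "F": freq[word],
            "per_million": freq[word] * 1_000_000 / total,
            "CD": cd[word],
        }
        for word in freq
    }


# Toy "textbooks" (hypothetical): three short documents, eight tokens in total.
docs = ["o gato come", "o gato dorme", "a casa"]
stats = frequency_and_cd(docs)
```

Here `stats["gato"]` has F = 2 and CD = 2, while `stats["casa"]` has F = 1 and CD = 1.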
The lexical knowledge of students of Spanish is related to their global language knowledge in Spanish and in English (Chávez, 2017a forthcoming; Fairclough, 2009; Rodrigo, 2009) as well as to reading comprehension (Velásquez, 2015). Other researchers have also proposed that Spanish instructors should spend more class time teaching vocabulary based on the student's level of knowledge (Fairclough & Belpoliti, 2015; Waldvogel, 2016). In the present study, we analyze the lexical knowledge of students in college-level Spanish as a second language (L2) classes. The study is based on lexical threshold theory. The instruments are a multiple-choice lexical exam, a cloze test activity and a survey for the instructor. The methodology follows Chávez (2017a forthcoming), Fairclough and Ramírez (2009) and Rodrigo (2009). The study follows the recommendations of the doctoral dissertation in Chávez (2017a forthcoming) and is conducted as a pilot study. The results show a positive Pearson correlation of r = .762 between the lexical exam and the cloze test activity. They also show varied lexical knowledge among the participants, as found in previous studies (Velásquez, 2015; Fairclough, 2013; Fairclough & Ramírez, 2009), and a lower level of lexical knowledge than that of students in heritage language (HL) classes (Chávez, 2017a forthcoming).
3,000 words in Spanish L2 basic language courses: A reachable goal
Second Language Research & Practice, 2023
While studies on lexical development in English L2 abound, less is known about how learners develop their lexicons in other L2s and how their developmental paths relate to lexical frequency counts. To fill this gap, this longitudinal study tracks the receptive lexical knowledge of students who progress through three semesters of Spanish L2 in a US university. Using an online receptive vocabulary test taken at the end of each semester, this study explores what percentage of the 3,000 most frequent Spanish words (overall and by frequency band) these learners recognized. Factors influencing outcomes such as whether the students had Spanish courses before the university, or whether they spoke Spanish outside of class were also examined. Results are consistent with English L2 research. Moreover, as L2 learners' proficiency increased, less additional vocabulary was learned. Previous experiences and use of Spanish outside of class positively influenced scores. On average, learners could recognize around 65% of the most frequent 3,000 words by the end of the third semester. These findings have practical implications for designing the vocabulary component of language courses during and after the first three semesters.
Journal of Foreign Language Teaching and Applied Linguistics, 2019
In the present paper, we carry out a study of lexical availability in a sample of modern languages students whose mother tongue is Spanish and whose main language of instruction is English. The data were compiled and analyzed twice, in two different groups, in order to establish a real contrast between the two languages: English and Spanish. A quantitative analysis of the data is followed by a discussion of the findings. Our objective is to find out what the available lexicon is in groups of students at tertiary level and to see whether there are differences between the vocabulary those students use in their mother tongue and the lexicon they employ in their main language of instruction.
An Analysis of Spanish L2 Lexical Richness
2014
Reliable estimates of Spanish L2 learners’ vocabulary size and richness at different proficiency levels can bring to light lexical deficiencies and may help influence decisions regarding vocabulary instruction. The focus of this paper is to explore the concepts of lexical variation and sophistication—together known as lexical richness—and to analyze the relationship between oral proficiency and lexical richness among Spanish L2 learners. This pilot study offers some insights into the process of lexical development between different levels of Spanish L2 oral proficiency.