Pascual Cantos Gómez | Universidad de Murcia
Papers by Pascual Cantos Gómez
Actas del XXIV Congreso Internacional de AESLA [Recurso electrónico]: aprendizaje de lenguas, uso del lenguaje y modelación cognitiva : perspectivas aplicadas entre disciplinas, 2007, ISBN 978-84-611-6897-2, 2007
The motivation of the present paper is based on the intuition that the sole use of data on lexical density relative to text samples of different languages, authors, linguistic domains, etc. might be a potential indicator for automated text discrimination. In order to look for a reliable and valid lexical density index, we shall review and clarify the mathematical relationship between types (word forms) and tokens (words) by discussing and constructing adequate regression models that might help to differentiate text types from each other. Additionally, we shall use multivariate statistical models (cluster analysis and discriminant analysis) to complement the mathematical lexical density regression model (TYT-formula).
Deception in language has been studied from the perspective of several disciplines, the most recent being opinion mining. In this context, the present study explores the symptomatic features of deception in written Spanish, which have not yet been investigated. To this end, we developed a framework based on a support vector machine (SVM) classifier applied to an ad hoc corpus of opinions. We used the psycholinguistic categories defined in LIWC (Pennebaker, Francis and Booth, 2001), through its four fundamental dimensions, to train the algorithm. The results of the experiment show that it is possible to separate Spanish-language texts according to their truth condition, with the first two dimensions, linguistic and psychological processes, being the most relevant for achieving this goal.
Lingüística de corpus en español, 2022
Journal of Research Design and Statistics in Linguistics and Communication Science, 2017
Journal of Research Design and Statistics in Linguistics and Communication Science, 2018
Vigo International Journal of Applied Linguistics, 2019
Readability indices have been widely used to measure textual difficulty. They can be useful for the automatic classification of texts, especially in language teaching. Among other applications, they allow the difficulty level of texts to be determined in advance, without the need to read them through. The aim of this research is twofold: first, to examine the degree of accuracy of the six most commonly used readability indices, and second, to present a new optimized measure. The main problem is that these readability indices may yield disparate results, and this is precisely what has motivated our attempt to unite their potential. A discriminant analysis of all the variables under examination has enabled the creation of a much more precise model, improving the previous best results by 15%. Furthermore, errors and disparities in the difficulty level of the analyzed texts have been detected.
International Journal of English Studies, 2010
The concepts of explicit and implicit (knowledge) are at the core of SLA studies. We take explicit as conscious and declarative (knowledge); implicit as unconscious, automatic and procedural (knowledge) (DeKeyser, 2003; R. Ellis, 2005a, 2005b, 2009; Hulstijn, 2005; Robinson, 1996; Schmidt, 1990, 1994). The importance of those concepts and components, we believe, must also be acknowledged in language teaching, and consequently in language teaching materials. However, explicitness and implicitness are rather complex constructs; such complexity allows for multiple nuances and perspectives in their analysis, and this fact poses a real challenge for their identification in the learning and teaching process and materials. We focus here on ELT materials and aim at building a reliable construct which may help in the identification of their potential for promoting implicit and explicit components. We first determined the features to identify the construct for implicitness and explicit...
Iberica Revista De La Asociacion Europea De Lenguas Para Fines Especificos, 2014
The aim of this paper is to sketch a potential methodology for automatic text classification which allows text topic discrimination as a prior step to new case assignment to previously established text topics. Such case assignment will be performed by means of Discriminant Function Analysis based on a series of easily computable linguistic parameters, in order to reduce computational costs.
Equinox eBooks Publishing, 2013
Lfe Revista De Lenguas Para Fines Especificos, 2011
This research focuses on the Corpus of English Texts on Astronomy (CETA), the first sub-corpus of CC, presenting a diachronic approach to the astronomy-specific lexicon found in texts from 1710 to 1920. The goal of this research is to determine the evolution of the lexical astronomy-domain specificity in the CETA, that is, how many astronomy-like lexical features occur in the English astronomic texts gathered in the CETA. This might shed some light on: (i) the introduction rate of new astronomy-specific vocabulary over time, (ii) lexical richness in English astronomic texts, (iii) the rate of new astronomy-specific vocabulary over time, (iv) the potential lexical specific features of English astronomic texts, and (v) lexico-semantic text difficulty of English astronomic texts.
Homenaje a Francisco Gutierrez Diez, 2013, ISBN 978-84-15463-55-9, pp. 59-71
Equinox eBooks Publishing, 2013
Topics include: 5.1 Typology and usefulness; 5.2 Keywords; 5.3 Text annotation; 5.4 Comparing wordlists; 5.5 Frequency distributions
Actas del XXIV Congreso Internacional de AESLA [Recurso electrónico]: aprendizaje de lenguas, uso del lenguaje y modelación cognitiva : perspectivas aplicadas entre disciplinas, ISBN 978-84-611-6897-2, 2007
Estudios Ingleses de la Universidad Complutense, 2011
The similarities or differences we perceive when comparing two or more languages are always relative. The analysis of these linguistic contrasts can be formalized both from various linguistic angles and/or levels (phonetic-phonological, morphological, syntactic, semantic, pragmatic, diachronic, etc.) and by means of different methodologies (inductive, deductive, descriptive, etc.). The purpose of this comparative study of Spanish and English is to offer, for the first time, objective and reliable quantitative data on the structure and organization of the lexical inventory of both languages (lexical richness, lexical growth: forms, frequency bands, lexical and function forms, most frequent forms, form length, etc.), and to approach the different mathematical properties underlying the Spanish and English languages. All quantitative data were obtained from two equivalent linguistic corpora (in structure, composition and size: 20 million words): Cumbre (for Spanish) and Lacell (for English).
um.es
Abstract: The present paper is based on an extended piece of research aimed at carrying out a structural and lexical analysis of two contrasting plays, Shakespeare's Hamlet and Sumarokov's Gamlet, in a specific linguistic domain. In this contribution, we will attempt to ...
International Journal of English Studies (IJES), 2009