Stefania Spina | Università per Stranieri di Perugia

Books by Stefania Spina

Fiumi di parole. Discorso e grammatica delle conversazioni scritte in Twitter

Fiumi di parole is the first systematic Italian study of the linguistic and discursive features of interactions on Twitter. It is based on a large sample of real data extracted from Twitter, which were also analysed with statistical-quantitative methods.

Openpolitica. Il discorso dei politici italiani nell'era di Twitter

Fare i conti con le parole: introduzione alla linguistica dei corpora

Parole in rete: guida ai siti Internet sul linguaggio

Papers by Stefania Spina

Predicting readability of texts for Italian L2 students: a preliminary study

Selecting texts that L2 students can read and comprehend, and ensuring that those texts are comparable, are central concerns for both teaching and assessment. Compared to subjective selection, quantitative approaches provide more objective information, analysing texts at language and discourse level (Khalifa & Weir, 2009). Readability formulae such as the Flesch Reading Ease, the Flesch-Kincaid Grade Level and, for Italian, the GulpEase index (Lucisano and Piemontese, 1988) do not fully address the issue of text complexity. A new readability formula called Coh-Metrix was proposed (Crossley, Greenfield, & McNamara 2008), which takes into account a wider set of language and discourse features. A similar approach was proposed to assess the readability of Italian texts through a tool called READ-IT (Dell’Orletta, Montemagni, & Venturi 2011). While READ-IT was tested on randomly selected newspaper texts, this contribution focuses on the development of a similar computational tool applied to texts specifically selected in the context of assessing Italian as L2. Two text corpora have been collected from the CELI (Certificates of Italian Language) item bank at B2 and C2 level. Statistical differences in the occurrence of a set of linguistic and discursive features have been analysed according to four different categories: length features, lexical features, morpho-syntactic features, and discursive features.

Le tre fasi del discorso politico italiano in Twitter: una storia senza lieto fine?

Italienisch, 2022

The relationship between Twitter and Italian politicians started around 2010. Since then, it is possible to trace an evolution, and to identify three distinct phases that have characterised this relationship. This study describes some of the linguistic shifts that characterise the political discourse on Twitter, in its progressive move from the naive attitude to ‘being social’ of the beginnings, to the self-promotional monologue, up to the verbal excesses and forms of language aggression of the last three years. Through the analysis of corpora of tweets written by Luigi di Maio, Giorgia Meloni and Matteo Renzi, the focus is mainly on some of the lexical and discursive features that characterise this verbal aggression. The use of a rhetoric of simplification, through which politicians constantly tend to banalise and reduce reality to its extreme forms, strongly contributes to the diffusion of a highly polarised form of political discourse, as well as to our daily experience with social media.

Breve storia della classe di concorso A23 - Lingua italiana per discenti di lingua straniera

Over the last thirty years, the growing presence of newly arrived students, immigrant students and children of immigrant parents has strongly influenced Italian society and has committed the school system to fostering their linguistic and social integration. Although the number of non-native-speaker students has risen considerably in recent decades, the role of teacher of Italian for foreign-language learners (Lingua italiana per discenti di lingua straniera) was introduced only from the 2017/2018 school year. Indeed, the absence of expert, qualified teachers of Italian as an L2 long characterised the linguistic integration measures aimed at these students. For this reason, the establishment of the teaching qualification class (classe di concorso) A23 - Lingua italiana per discenti di lingua straniera is a fundamental moment for the institutional recognition of the teaching of Italian as an L2...

Second language learning for vulnerable adult migrants: The case of the Italian public school

Glottodidactica. An International Journal of Applied Linguistics, 2021

Over the last few years, thousands of asylum seekers and refugees have arrived in Italy, including women and a large number of teenagers without any adult caregivers. A significant proportion of them are placed in the Italian language courses for foreigners organised by the Provincial Centre for Adult Education, commonly called CPIA (Centro Provinciale per l’Istruzione degli Adulti). This paper explores the language testing and assessment practices of the courses organised by this institution. To evaluate these aspects, the paper concentrates on the CPIA, its teachers, and its students. Focusing on the CPIA’s language courses, we investigate the language testing and assessment carried out at the beginning and at the end of the courses. On the basis of these observations, the paper tries to identify some critical aspects and to understand their causes.

Smart Cities and Languages: The Language Network

This paper analyzes the potential of smart cities from a linguistic perspective, with particular attention to aspects such as second language acquisition (SLA), social inclusion and innovation, but also to positive influences on sectors such as tourism and commerce. After an introduction to the theoretical foundations, the possible development scenarios are considered and analyzed in more detail.

The Longitudinal Corpus of Chinese Learners of Italian (LoCCLI)

The LoCCLI is a longitudinal corpus: data was collected at two different points in time, from the same 175 learners. Each of the learners contributed two essays to the corpus: one essay was written at the beginning of a six-month course of Italian, and the other at the end of the course. In order to collect corpus data, they were asked to write on three similar essay topics: 1) My first impression of Italy and Italians, 2) My hobbies: what do I usually do in my free time, 3) My last holidays. They were instructed not to write on the same topic more than once.

Measuring Text Complexity for Italian as a Second Language Learning Purposes

Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

The selection of texts for second language learning purposes typically relies on teachers' and test developers' individual judgment of the observable qualitative properties of a text. Little or no consideration is generally given to the quantitative dimension within an evidence-based framework of reproducibility. This study aims to fill the gap by evaluating the effectiveness of an automatic tool trained to assess text complexity in the context of Italian as a second language learning. A dataset of texts labeled by expert test developers was used to evaluate the performance of three classifier models (decision tree, random forest, and support vector machine), which were trained using linguistic features measured quantitatively and extracted from the texts. The experimental analysis provided satisfactory results, also in relation to which kind of linguistic trait contributed the most to the final outcome.

Multi-Word Expressions in Second Language Writing: A Large-Scale Longitudinal Learner Corpus Study

Language Learning, 2019

In the present study, we sought to advance the field of learner corpus research by tracking the development of phrasal vocabulary in essays produced at two different points in time. To this aim, we employed a large pool of second language (L2) learners (N = 175) from three proficiency levels (beginner, elementary, and intermediate) and focused on an underrepresented L2 (Italian). Employing mixed-effects models, a flexible and powerful tool for corpus data analysis, we analyzed learner combinations in terms of five different measures: phrase frequency, mutual information, lexical gravity, delta P forward, and delta P backward. Our findings suggest a complex picture, in which higher proficiency and greater exposure to the L2 do not result in more idiomatic and targetlike output, and may, in fact, result in greater reliance on low frequency combinations whose constituent words are non-associated or mutually attracted.

Role of Emoticons as Structural Markers in Twitter Interactions

Discourse Processes, 2018

Emoticons play a key role in digital written interactions. Since the 1980s, research has highlighted their growing relevance, as they allow users to convey increasingly rich emotional, social, and pragmatic information. This article contributes to this area of research by providing an analysis of emoticons as structural markers in Twitter interactions. Based on a large corpus of Italian tweets, mixed-effect models were used to investigate to what extent and how emoticons are used in this role and what variables most influence their use and their relationship with punctuation marks. Results indicate that emoticons often have the function of clause and sentence boundary marking, either replacing or integrating punctuation marks. Variables affecting their use included user’s age, emoticon type, and position within the tweets. In this role their use reveals the pursuit of coherent strategies in discourse organization.

The role of learner corpus research in the study of L2 phraseology: main contributions and future directions

Journal of Applied Psycholinguistics, 2020

This study describes some of the main contributions of Learner Corpus Research to the study of L2 phraseology. Although recently established, this strand of research has already provided significant results in the analysis of phraseology in learner language. The corpus-based approach has strongly contributed to the establishment of the key role of the phraseological dimension in the acquisition of a second or foreign language. After a brief description of some of the main results achieved in the field of L2 phraseology research, the study describes the most common statistical association measures, each representing a specific phraseological dimension (repetition, diversification, type-token distribution, strength of association, exclusivity, and directionality). Through their systematic use, Learner Corpus Research is playing an important role in the identification of phraseological sequences, providing robust empirical methods to extract them from learner corpora, analyse their distribution, and measure their properties.

Investigation of Native Speaker and Second Language Learner Intuition of Collocation Frequency

Language Learning, 2015

Research into frequency intuition has focused primarily on native (L1) and, to a lesser degree, nonnative (L2) speaker intuitions about single word frequency. What remains a largely unexplored area is L1 and L2 intuitions about collocation (i.e., phrasal) frequency. To bridge this gap, the present study aimed to answer the following question: How do L2 learners and native speakers compare against each other and corpora in their subjective judgments of collocation frequency? Native speakers and learners of Italian were asked to judge 80 noun-adjective pairings as one of the following: high frequency, medium frequency, low frequency, very low frequency. Both L1 and L2 intuitions of high frequency collocations correlated strongly with corpus frequency. Neither of the two groups of participants exhibited accurate intuitions of medium and low frequency collocations. With regard to very low frequency pairings, L1 but not L2 intuitions were found to correlate with corpora for the majority of the items. Further, mixed-effects modeling revealed that L2 learners were comparable to native speakers in their judgments of the four frequency bands, although some differences did emerge. Taken together, the study provides new insights into the nature of L1 and L2 intuitions about phrasal frequency.

Automatic Classification of Text Complexity

Applied Sciences

This work introduces an automatic classification system for measuring the complexity level of a given Italian text from a linguistic point of view. The task of measuring the complexity of a text is cast as a supervised classification problem by exploiting a dataset of texts purposely produced by linguistic experts for second language teaching and assessment. The widely adopted Common European Framework of Reference for Languages (CEFR) levels were used as target classes, texts were processed by considering a large set of numeric linguistic features, and an experimental comparison among ten widely used machine learning models was conducted. The results show that the proposed approach obtains good prediction accuracy, and a further analysis was conducted to identify the categories of features that influenced the predictions.

Corpora for Linguists vs. Corpora for Learners

8 | 2 | 2019

This paper aims to shed light on how research findings stemming from Learner Corpus Research (LCR) can inform the development of Data-driven learning (DDL) pedagogical activities. By doing this, it seeks to show how the gap between corpora built to be used by linguists and those tailored for learners can be filled. It starts by defining what a corpus is and how second language learning studies can benefit from the research findings based on corpora, but also from the direct use of corpora in the classroom. Then, it provides an overview of the available native and learner corpora of Italian, and how corpora in general can be adapted for DDL purposes. Finally, it describes an example of how an LCR finding can be used to develop DDL activities. It concludes with some desiderata for the future.

Io non è che non me ne frega niente: tendenze recenti della negazione tramite frase scissa

Le tendenze dell’italiano contemporaneo rivisitate. Atti del LII Congresso Internazionale di Studi della Società di Linguistica Italiana (Berna, 6-8 settembre 2018), 2019

This contribution investigates recent trends in Italian negation by means of a cleft sentence, with regard to its structure, its functions and its spread, in particular in comparison with the unmarked sentence negation non. Combining the observation and comparison of data extracted from corpora with multifactorial statistical methods, the study shows that the construction has considerable versatility in its pragmatics, its syntax and its distribution of information, which makes it possible to foreground specific elements of the negative sentence and makes it a valuable resource, especially in informal speech. Keywords: negation, cleft sentence, dislocations, focalisation.

News as a Conversation. A Comparative Analysis of the Language of Online and Printed News in Italy

Recherches en Communication

Previous studies on newspaper language have already pointed out how the language of written news has undergone significant changes in the last two or three decades. The aim of this study is to move a step forward and to compare the use of selected linguistic features in online and in printed Italian news. The main hypothesis is that Internet and communication technologies have introduced important transformations in the way news is written, organized and delivered, and that the consequences of these transformations are observable at all levels of linguistic analysis.

Come cambia la lingua dei politici nell'era di Twitter

Research paper thumbnail of Fiumi di parole. Discorso e grammatica delle conversazioni scritte in Twitter

Fiumi di parole è il primo studio sistematico italiano sulle caratteristiche linguistiche e disco... more Fiumi di parole è il primo studio sistematico italiano sulle caratteristiche linguistiche e discorsive delle interazioni in Twitter, basato su un ampio campione di dati reali estratti da Twitter, analizzati anche attraverso metodologie statistico-quantitative.

Research paper thumbnail of Openpolitica. Il discorso dei politici italiani nell'era di Twitter

Research paper thumbnail of Fare i conti con le parole: introduzione alla linguistica dei corpora

Research paper thumbnail of Parole in rete: guida ai siti Internet sul linguaggio

Research paper thumbnail of Predicting readability of texts for Italian L2 students: a preliminary study

Text selection and comparability for L2 students to read and comprehend are central concerns both... more Text selection and comparability for L2 students to read and comprehend are central concerns both for teaching and assessment purposes.Compared to subjective selection. quantitative approaches provide more objective information, analysing texts at language and discourse level (Khalifa & Weir, 2009). Readability formulae such as the Flesch Reading Ease, the Flesch-Kincaid Grade Level and, for Italian, the GulpEase index (Lucisano and Piemontese, 1988), do not fully addressed the issue of text complexity. A new readability formula called Coh-Metrix was proposed (Crossley, Geenfield, & McNamara 2008), which takes into account a wider set of language and discourse features. A similar approach was proposed to assess readability of Italian texts through a tool called READ-IT (Dell’Orletta, Montemagni, & Venturi 2011). While READ-IT was tested on newspaper texts randomly selected, this contribution focuses on the development of a similar computational tool applied on texts specifically selected in the context of assessing Italian as L2. Two text corpora have been collected from the CELI (Certificates of Italian Language) item bank at B2 and C2 level. Statistical differences in the occurrence of a set of linguistic and discursive features have been analysed according to four different categories: length features, lexical features, morpho-syntactic features, and discursive features.

Research paper thumbnail of Le tre fasi del discorso politico italiano in Twitter: una storia senza lieto fine?

Italienisch, 2022

The relationship between Twitter and Italian politicians started around 2010. Since then, it is p... more The relationship between Twitter and Italian politicians started around 2010. Since then, it is possible to trace an evolution, and to identify three distinct phases that have characterised this relationship. This study describes some of the linguistic shifts that characterise the political discourse on Twitter, in its progressive move from the naive attitude to ʻbeing social’ of the beginnings, to the self-promotional monologue, up to the verbal excesses and forms of language aggression of the last three years. Through the analysis of corpora of tweets written by Luigi di Maio, Giorgia Meloni and Matteo Renzi, the focus is mainly on some of the lexical and discursive features that characterise this verbal aggression. The use of a rhetoric of simplification, through which politicians constantly tend to banalise and reduce reality to its extreme forms, strongly contributes to the diffusion of a highly polarised form of political discourse, as well as to our daily experience with social media.

Research paper thumbnail of Breve Storia Della Classe DI Concorso A23 - Lingua Italiana Per Discenti DI Lingua Straniera

Nel corso degli ultimi trent'anni la crescente presenza di studentesse e studenti neoarrivati... more Nel corso degli ultimi trent'anni la crescente presenza di studentesse e studenti neoarrivati, immigrati e figli di genitori immigrati ha fortemente influenzato la società italiana e ha impegnato il sistema scolastico al fine di favorire la loro integrazione linguistica e sociale. Nonostante nel corso degli ultimi decenni il numero delle studentesse e degli studenti alloglotti sia aumentato sensibilmente, solo a partire dall'anno scolastico 2017/2018 è stata introdotta la figura del docente di Lingua italiana per discenti di lingua straniera. L'assenza di docenti esperti e competenti nell'ambito della didattica dell'italiano L2, infatti, ha a lungo caratterizzato gli interventi di integrazione linguistica rivolti agli studenti alloglotti. Per questo motivo l'istituzionalizzazione della classe di concorso A23-Lingua italiana per discenti di lingua straniera è un momento fondamentale per il riconoscimento istituzionale dell'insegnamento dell'italiano L2...

Research paper thumbnail of Second language learning for vulnerable adult migrants: The case of the Italian public school

Glottodidactica. An International Journal of Applied Linguistics, 2021

Over the last years, thousands of asylum seekers and refugees have arrived in Italy, including a ... more Over the last years, thousands of asylum seekers and refugees have arrived in Italy, including a large number of teenagers without any adult caregivers and women. A significant part of them is placed in the Italian language courses for foreigners organised by the Provincial Centre for Adult Education, commonly called CPIA (Centro Provinciale per l’Istruzione degli Adulti). This paper addresses the exploration of the language testing and assessment of the courses organized by this institution. With the aim of evaluating these aspects, the paper concentrates on the CPIA, its teachers, and its students. Focusing on CPIA’s language courses, we investigate the language testing and assessment carried out at the beginning and at the end of the language courses. Thanks to these observations, the paper tries to identify some critical aspects and to understand their causes.

Research paper thumbnail of Smart Cities and Languages: The Language Network

This paper intends to analyze the potential of smart cities from a linguistic perspective, with p... more This paper intends to analyze the potential of smart cities from a linguistic perspective, with particular attention towards aspects such as second language acquisition (SLA), social inclusion and innovation, but also positive influences on sectors such as tourism and commerce. After an introduction of the theoretical foundations, the possible developing scenarios will be taken into consideration and analyzed more in detail.

Research paper thumbnail of The Longitudinal Corpus of Chinese Learners of Italian (LoCCLI)

Collection The LOCCLI is a longitudinal corpus: data was collected in two different points in tim... more Collection The LOCCLI is a longitudinal corpus: data was collected in two different points in time, from the same 175 learners. Each of the learners contributed two essays to the corpus: one essay was written at the beginning of a six-month course of Italian, and the other at the end of the course. In order to collect corpus data, they were asked to write on three similar essay topics: 1) My first impression of Italy and Italians, 2) My hobbies: what do I usually do in my free time, 3) My last holidays.). They were instructed not to write on the same topic more than once.

Research paper thumbnail of Measuring Text Complexity for Italian as a Second Language Learning Purposes

Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

The selection of texts for second language learning purposes typically relies on teachers' and te... more The selection of texts for second language learning purposes typically relies on teachers' and test developers' individual judgment of the observable qualitative properties of a text. Little or no consideration is generally given to the quantitative dimension within an evidence-based framework of reproducibility. This study aims to fill the gap by evaluating the effectiveness of an automatic tool trained to assess text complexity in the context of Italian as a second language learning. A dataset of texts labeled by expert test developers was used to evaluate the performance of three classifier models (decision tree, random forest, and support vector machine), which were trained using linguistic features measured quantitatively and extracted from the texts. The experimental analysis provided satisfactory results, also in relation to which kind of linguistic trait contributed the most to the final outcome.

Research paper thumbnail of Multi-Word Expressions in Second Language Writing: A Large-Scale Longitudinal Learner Corpus Study

Language learning, 2019

In the present study, we sought to advance the field of learner corpus research by tracking the d... more In the present study, we sought to advance the field of learner corpus research by tracking the development of phrasal vocabulary in essays produced at two different points in time. To this aim, we employed a large pool of second language (L2) learners (N = 175) from three proficiency levels—beginner, elementary, and intermediate—and focused on an underrepresented L2 (Italian). Employing mixed-effects models, a flexible and powerful tool for corpus data analysis, we analyzed learner combinations in terms of five different measures: phrase frequency, mutual information, lexical gravity, delta Pforward, and delta Pbackward . Our findings suggest a complex picture, in which higher proficiency and greater exposure to the L2 do not result in more idiomatic and targetlike output, and may, in fact, result in greater reliance on low frequency combinations whose constituent words are non-associated or mutually attracted.

Research paper thumbnail of Role of Emoticons as Structural Markers in Twitter Interactions

Discourse Processes, 2018

Emoticons play a key role in digital written interactions. Since the 1980s research has highlight... more Emoticons play a key role in digital written interactions. Since the 1980s research has highlighted their growing relevance, as they allow to convey increasingly rich emotional, social, and pragmatic information. This article contributes to this area of research by providing an analysis of emoticons as structural markers in Twitter interactions. Based on a large corpus of Italian tweets, mixed-effect models were used to investigate to what extent and how emoticons are used in this role and what variables most influence their use and their relationship with punctuation marks. Results indicate that emoticons often have the function of clause and sentence boundary mark- ing, either replacing or integrating punctuation marks. Variables affecting their use included user’s age, emoticon type, and position within the tweets. In this role their use reveals the pursuit of coherent strategies in discourse organization.

Research paper thumbnail of The role of learner corpus research in the study of L2 phraseology: main contributions and future directions

Journal of Applied Psycholinguistics, 2020

This study describes some of the main contributions of Learner Corpus Research to the study of L2... more This study describes some of the main contributions of Learner Corpus Research to the study of L2 phraseology. Although recently established, this strand of research has already provided significant results in the analysis of phraseology in learner language. The corpus-based approach has strongly contributed to the establishment of the key role of the phraseological dimension in the acqui- sition of a second or foreign language. After a brief description of some of the main results achieved in the field of L2 phraseology research, the study describes the most common statistical association measures, each representing a specific phraseological dimension (repetition, diversification, type-to- ken distribution, strength of association, exclusivity, and directionality). Through their systematic use, Learner Corpus Research is playing an important role in the identification of phraseological sequences, providing robust empirical methods to extract them from learner corpora, analyse their distribution, and measure their properties.

Research paper thumbnail of Investigation of Native Speaker and Second Language Learner Intuition of Collocation Frequency

Language learning, 2015

Research into frequency intuition has focused primarily on native (L1) and, to a lesser degree, n... more Research into frequency intuition has focused primarily on native (L1) and, to a lesser degree, nonnative (L2) speaker intuitions about single word frequency. What remains a largely unexplored area is L1 and L2 intuitions about collocation (i.e., phrasal) frequency. To bridge this gap, the present study aimed to answer the following question: How do L2 learners and native speakers compare against each other and corpora in their subjective judgments of collocation frequency? Native speakers and learners of Italian were asked to judge 80 noun-adjective pairings as one of the following: high frequency, medium frequency, low frequency, very low frequency. Both L1 and L2 intuitions of high frequency collocations correlated strongly with corpus frequency. Neither of the two groups of participants exhibited accurate intuitions of medium and low frequency collocations. With regard to very low frequency pairings, L1 but not L2 intuitions were found to correlate with corpora for the majority of the items. Further, mixed-effects modeling revealed that L2 learners were comparable to native speakers in their judgments of the four frequency bands, although some differences did emerge. Taken together, the study provides new insights into the nature of L1 and L2 intuitions about phrasal frequency.

Research paper thumbnail of Automatic Classification of Text Complexity

Applied Sciences

This work introduces an automatic classification system for measuring the complexity level of a given Italian text from a linguistic point of view. The task of measuring the complexity of a text is cast as a supervised classification problem by exploiting a dataset of texts purposely produced by linguistic experts for second language teaching and assessment. The widely adopted Common European Framework of Reference for Languages (CEFR) levels were used as target classes, texts were processed by considering a large set of numeric linguistic features, and an experimental comparison among ten widely used machine learning models was conducted. The results show that the proposed approach achieves good prediction accuracy, and a further analysis was conducted to identify the categories of features that influenced the predictions.

Corpora for Linguists vs. Corpora for Learners

8 | 2 | 2019

This paper aims to shed light on how research findings stemming from Learner Corpus Research (LCR) can inform the development of Data-driven learning (DDL) pedagogical activities. By doing this, it seeks to show how the gap between corpora built to be used by linguists and those tailored for learners can be filled. It starts by defining what a corpus is and how second language learning studies can benefit from the research findings based on corpora, but also from the direct use of corpora in the classroom. Then, it provides an overview of the available native and learner corpora of Italian, and how corpora in general can be adapted for DDL purposes. Finally, it describes an example of how an LCR finding can be used to develop DDL activities. It concludes with some desiderata for the future.

Io non è che non me ne frega niente: tendenze recenti della negazione tramite frase scissa

Le tendenze dell’italiano contemporaneo rivisitate. Atti del LII Congresso Internazionale di Studi della Società di Linguistica Italiana (Berna, 6-8 settembre 2018), 2019

This contribution investigates recent trends in Italian negation by means of a cleft sentence, with regard to its structure, its functions, and its spread, particularly in comparison with the unmarked sentence negation non. By combining the observation and comparison of corpus data with multifactorial statistical methods, the study shows that the construction is highly versatile in its pragmatics, its syntax, and its distribution of information. This versatility makes it possible to foreground specific elements of the negative sentence and makes the construction a valuable resource, especially in informal speech. Keywords: negation, cleft sentence, dislocations, focalization.

News as a Conversation. A Comparative Analysis of the Language of Online and Printed News in Italy

Recherches en Communication

Previous studies on newspaper language have already pointed out how the language of written news has undergone significant changes in the last two or three decades. The aim of this study is to move a step forward and to compare the use of selected linguistic features in online and in printed Italian news. The main hypothesis is that the Internet and communication technologies have introduced important transformations in the way news is written, organized and delivered, and that the consequences of these transformations are observable at all levels of linguistic analysis.

Come cambia la lingua dei politici nell'era di Twitter

La nascita della terminologia linguistica in Italia: il lessico tecnico di Giovanni Flechia in alcuni inediti

Learner Corpus Research and phraseology in Italian as a second language: the case of the DICI-A, a learner dictionary of Italian collocations

Collocations Cross-Linguistically. Corpora, Dictionaries and Language Teaching (Mémoires de la Société Néophilologique de Helsinki C), ed. Begoña Sanromán Vilas, pp. 219-244, 2016

The goal of this paper is twofold. Firstly, it aims to describe the state of the art in the research on Italian collocations in the field of second language acquisition. The last twenty years have seen a significant increase in corpus-based collocational studies: learner corpus research has greatly contributed to the analysis of the acquisition of collocations by learners of Italian, with a central role ascribed to frequency. Secondly, the paper describes an example of a lexicographical resource arising from phraseological and learner corpus research: the DICI-A, a learner dictionary of Italian collocations. Specific issues in its composition and organization are discussed, particularly with regard to the creation of an Italian collocation profile, based on the association of each collocation with the most appropriate learner proficiency level.

Methodological issues in a television news corpus: Discourse and annotation

Corpus Linguistics and Variation in English, 2000

Corpora di italiano L2: difficoltà di annotazione e trascrizione "allargata"

Quarant'anni di ricerca in Linguistica applicata (1974-2014) - Scritti di Anna Ciliberti

Quarant'anni di ricerca in Linguistica applicata (1974-2014) - Scritti di Anna Ciliberti, 2019

This volume gathers a reasoned selection of Anna Ciliberti's vast scholarly output. It is intended as a tribute and a sincere expression of thanks from the whole Università per Stranieri di Perugia for her important, competent, and authoritative contribution, which Anna made while always managing to combine vision, enthusiasm, and balance.
During her fifteen years in Perugia, Anna Ciliberti was a point of reference, both personal and professional, for the full range of academic activities that were getting under way in those years, and for all the people who gradually became involved in them.

ISBN: 978-88-99811-09-9

Dall'epica guerresca al tecnicismo geometrico: la cronaca giornalistica del calcio del nuovo millennio

Dinamiche sociolinguistiche e didattica delle lingue nei contesti sportivi Sociolinguistic Dynamics and Language Teaching in Sports, 2020

Starting from a view of football as a socio-cultural and sociolinguistic phenomenon, this study sets out to investigate written football reporting in the press and to outline some of its recent trends. These are linked in particular to three lexical, textual, and discursive phenomena: the use of a particular type of technical term, the expressive and emotive character of the discourse, and its brevity and immediacy. The period considered is the opening stretch of the new millennium, with particular reference to the three years 2015-2017.

Error annotation in learner corpora: tools and applications in English and Italian

Slides of the workshop on error annotation in learner corpora (Cambridge, TALC 2018, 18-21 July).
OLGA VINOGRADOVA, NIKITA LOGIN, IVAN TORUBAROV (RESEARCH UNIVERSITY HIGHER SCHOOL OF ECONOMICS, MOSCOW)
LUCIANA FORTI, STEFANIA SPINA (UNIVERSITY FOR FOREIGNERS PERUGIA, ITALY)

Tesi di laurea, articoli scientifici e manuali universitari: un confronto tra il discorso accademico di studenti e docenti

Presentation accepted at SILFI 2018 (Genoa, 28-30 May 2018).

Perugia Corpus (ora disponibile in rete)

The Perugia Corpus (PEC) is a reference corpus of contemporary written and spoken Italian, comprising 26 million words. It is divided into 10 sections (academic, administration, film, literature, spoken, non-fiction essays, school, press, television, web).
Since June 2015 it has been searchable online.