Giorgia Tolfo - Academia.edu
Phd Thesis by Giorgia Tolfo
Photography is used intermedially to narrate counter-memories and traumatic memories through many different modes and strategies of insertion and use. While intermediality cannot be reduced to a set of conventional practices and depends instead on the narrative context, it nonetheless has an organic coherence that aligns it functionally with the processes of, and inquiries into, the representability of trauma. Moreover, owing to the versatility of its multifaceted nature, intermedial narrative practice (in its most varied configurations) takes on epistemological and methodological value for studies of the externalisation and working-through of trauma. This study sets out to compare theoretical and narrative texts in order to highlight what each contributes to the other.
Papers by Giorgia Tolfo
Digitised Newspapers – A New Eldorado for Historians?
This chapter discusses the open access digitisation programme undertaken by Living with Machines, exploring the range of constraints that inform digitisation strategies and selection priorities. Because the landscape of digitised newspaper collections is so complex, and research and digitisation processes operate on different timelines, we have focused on opportunities to make digitisation choices both transparent and pragmatic. Working towards solutions that reflect collaborations between library staff and scholars, we introduce: a) Press Picker, our custom visualisation tool designed to support decision making about digitisation; and b) the Environmental Scan, a process of automatic metadata generation from the Newspaper Press Directories, a contemporaneous record of British newspapers.
Living with Machines is the largest digital humanities project ever funded in the UK. The project brought together a team of twenty-three researchers to leverage more than twenty years' worth of digitisation projects in order to deepen our understanding of the impact of mechanisation on nineteenth-century Britain. In contrast to many previous digital humanities projects which have sought to create resources, the project was concerned to work with what was already there, which, whilst straightforward in theory, is complex in practice. This Element describes the efforts to do so. It outlines the challenges of establishing and managing a truly multidisciplinary digital humanities project in the complex landscape of cultural data in the UK and shares what other projects seeking to undertake digital history projects can learn from the experience. This title is also available as Open Access on Cambridge Core.
CERN European Organization for Nuclear Research - Zenodo, May 24, 2022
Research in computational linguistics has made successful attempts at modelling word meaning at scale, but much remains to be done to put these computational models to the test of historical scholarship (see e.g. Beelen et al. 2021). More importantly, a lot of computational research looks at texts in a historical vacuum, 'synchronically', as linguists would say. Living with Machines is an interdisciplinary research project that rethinks the impact of technology on the lives of ordinary people during the Industrial Revolution (Ahnert et al. 2021). During this project, we decided to address a fundamental question: what did people mean by 'machine' and how has this meaning changed over time? This paper outlines how a simple research question like 'what was a machine?' can provide an opportunity to engage the public with our work while also generating data for analysis and new avenues of research in a radically collaborative way. Turning to a diachronic perspective, we wanted to capture how changes in the usage of this word in nineteenth-century texts can help us understand the role of machines in nineteenth-century imaginations. An earlier crowdsourcing task on the project defined machines as 'devices or equipment not powered by people or animals'. As a result of that task, we discovered that this definition did not reflect how 'machine' was used in contemporary newspaper articles. Accordingly, we designed the 'What's that machine?' citizen science tasks to find out what a 'machine' was in the 19th century as part of our linguistic and historical research. As engaging the public with our research is a key goal of the project, crowdsourcing, rather than internal annotation, was a natural fit. It also allowed us to tackle classification challenges at scale. We set up two related 'What's that machine?' tasks on the Zooniverse platform: Describe it! and Classify it! (Ridge, 2020). The former asked the public to transcribe excerpts from newspaper articles...
Michael Quick's book _Railway Passenger Stations in Great Britain: a Chronology_ offers a uniquely rich and detailed account of Britain's changing railway infrastructure. Its listing of over 12,000 stations allows us to reconstruct the coming of rail at both micro- and macro-scales. However, being published originally as a book (and subsequently online as a PDF created from an underlying MS Word document), this resource was not well suited for systematic linking to other data. We now present a new, automatically generated dataset that provides the rich detail of this exceptional resource in a structured format. Each station described in the _Chronology_ is given certain attributes, such as operating companies and opening and closing dates, and is georeferenced and linked, whenever possible, to its corresponding entry on Wikidata. We name this structured, linked, and georeferenced dataset 'StopsGB' (Structured Timeline of Passenger Stations in Great Britain), and we make it openly available. We believe this dataset (and the method used to create it) will be of widespread interest across the historical, digital library and semantic web communities, and that it will be a key resource for ongoing research into the impact of the railway in Great Britain.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021
The transformative impact of the railway on nineteenth-century British society has been widely recognized, but understanding that process at scale remains challenging because the Victorian rail network was both vast and in a state of constant flux. Michael Quick’s reference work Railway Passenger Stations in Great Britain: a Chronology offers a uniquely rich and detailed account of Britain’s changing railway infrastructure. Its listing of over 12,000 stations allows us to reconstruct the coming of rail at both micro- and macro-scales; however, being published originally as a book, this resource was not well suited for systematic linking to other geographical data. This paper shows how such a minimally-structured historical directory can be transformed into an openly available structured and linked dataset, named StopsGB (Structured Timeline of Passenger Stations in Great Britain), which will be of widespread interest across the historical, digital library and semantic web communities....
ArXiv, 2020
This paper proposes a new approach to animacy detection, the task of determining whether an entity is represented as animate in a text. In particular, this work is focused on atypical animacy and examines the scenario in which typically inanimate objects, specifically machines, are given animate attributes. To address it, we have created the first dataset for atypical animacy detection, based on nineteenth-century sentences in English, with machines represented as either animate or inanimate. Our method builds on recent innovations in language modeling, specifically BERT contextualized word embeddings, to better capture fine-grained contextual properties of words. We present a fully unsupervised pipeline, which can be easily adapted to different contexts, and report its performance on an established animacy dataset and our newly introduced resource. We show that our method provides a substantially more accurate characterization of atypical animacy, especially when applied to highly ...
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021
As languages evolve historically, making computational approaches sensitive to time can improve performance on specific tasks. In this work, we assess whether applying historical language models and time-aware methods helps with determining the correct sense of polysemous words. We outline the task of time-sensitive Targeted Sense Disambiguation (TSD), which aims to detect instances of a sense or set of related senses in historical and time-stamped texts, and address two main goals: 1) we scrutinize the effect of applying historical language models on the performance of several TSD methods and 2) we assess different disambiguation methods that take into account the year in which a text was produced. We train historical BERT models on a corpus of nineteenth-century English books and draw on the Oxford English Dictionary (and its Historical Thesaurus) to create historically evolving sense representations. Our results show that using historical language models consistently improves perf...
Talks by Giorgia Tolfo
DH Benelux 2022 - ReMIX: Creation and alteration in DH, 2022
Research in computational linguistics has made successful attempts at modelling word meaning at scale, but much remains to be done to put these computational models to the test of historical scholarship (see e.g. Beelen et al. 2021). More importantly, a lot of computational research looks at texts in a historical vacuum, 'synchronically', as linguists would say. Living with Machines is an interdisciplinary research project that rethinks the impact of technology on the lives of ordinary people during the Industrial Revolution (Ahnert et al. 2021). During this project, we decided to address a fundamental question: what did people mean by 'machine' and how has this meaning changed over time? This paper outlines how a simple research question like 'what was a machine?' can provide an opportunity to engage the public with our work while also generating data for analysis and new avenues of research in a radically collaborative way. Turning to a diachronic perspective, we wanted to capture how changes in the usage of this word in nineteenth-century texts can help us understand the role of machines in nineteenth-century imaginations. An earlier crowdsourcing task on the project defined machines as 'devices or equipment not powered by people or animals'. As a result of that task, we discovered that this definition did not reflect how 'machine' was used in contemporary newspaper articles. Accordingly, we designed the 'What's that machine?' citizen science tasks to find out what a 'machine' was in the 19th century as part of our linguistic and historical research. As engaging the public with our research is a key goal of the project, crowdsourcing, rather than internal annotation, was a natural fit. It also allowed us to tackle classification challenges at scale. We set up two related 'What's that machine?' tasks on the Zooniverse platform: Describe it! and Classify it! (Ridge, 2020). The former asked the public to transcribe excerpts from newspaper articles...
Books by Giorgia Tolfo
Cambridge University Press, 2023
Living with Machines is the largest digital humanities project ever funded in the UK. The project brought together a team of twenty-three researchers to leverage more than twenty years' worth of digitisation projects in order to deepen our understanding of the impact of mechanisation on nineteenth-century Britain. In contrast to many previous digital humanities projects which have sought to create resources, the project was concerned to work with what was already there, which, whilst straightforward in theory, is complex in practice. This Element describes the efforts to do so. It outlines the challenges of establishing and managing a truly multidisciplinary digital humanities project in the complex landscape of cultural data in the UK and shares what other projects seeking to undertake digital history projects can learn from the experience. This title is also available as Open Access on Cambridge Core.