Wrangling with Non-Standard Data

Tiny Data: Building a Community of Practice around Humanities Datasets

International Journal of Digital Curation

Quantitative data, the foundation of scientific research, have been in the foreground of discussions about data creation, curation, and publication pipelines. However, data for humanistic and social scientific inquiries take many forms, including physical and ephemeral primary resources (books, objects, performances, interactions); qualitative, free-form observations; as well as quantitative, structured data and metadata. At the Vanderbilt University Jean and Alexander Heard Library, we started the Tiny Data Working Group (TDWG) in 2016 to tackle some of the humanistic research data creation and curation issues in a constructive, collaborative, and interdisciplinary format. The present paper considers what it means to be FAIR with humanities data, as well as how to build a community of data-literate humanists, based on our experiences with the TDWG.

Digging into Data Management in Public-Funded, International Research in Digital Humanities

Journal of the Association for Information Science and Technology, 2019

Path-breaking in theory and practice alike, digital humanities (DH) not only secures a larger public audience for humanities and social sciences research, but also permits researchers to ask novel questions and to revisit familiar ones. Public-funded, international, and collaborative research in digital humanities furthers institutional research missions and enriches networked knowledge. The Digging into Data 3 challenge (DID3) (2014-2016), an international and interdisciplinary grant initiative embracing big data, included fourteen teams sponsored by ten funders from four nations. A qualitative case study that relies on purposive sampling and grounded analysis, this article centers on the information practices of DID3 participants. Semi-structured interviews were conducted with 53 participants on eleven of the fourteen DID3 projects. The study explores how Data Management Plan (DMP) requirements affect work practices in public-funded digital humanities, how scholars grapple with key data management challenges, and how they plan to reuse and share their data. It concludes with three recommendations and three directions for future research.

An experiment on accuracy, efficiency, productivity and researchers’ satisfaction in digital humanities data analysis: dataset appendix

2016

Data analysis represents the most important group of tasks carried out in research contexts. Given the current lack of empirical studies of data analysis performance in digital humanities research contexts, we conducted an empirical experiment comparing data analysis performance using traditional software with data analysis performance using software-assistance tools that incorporate cognitive processes in their design. The experiment measures accuracy, efficiency, productivity and user satisfaction during data analysis by digital humanities researchers. It revealed some clear benefits of including cognitive processes in software designed for research contexts, with statistically significant differences in accuracy, productivity and researchers’ satisfaction in favor of this explicit inclusion, although some efficiency weaknesses were detected. This dataset presents the raw data obtained during the first round of our experime...

Accuracy, efficiency, productivity and researchers’ satisfaction in digital humanities data analysis: Experiment design

2016

This documentary appendix presents the form-based materials used to carry out empirical studies about data analysis performance in digital humanities research contexts. The document includes example questionnaires for obtaining the personal and professional profiles of the subjects, as well as the data analysis tasks defined in two different case studies in digital humanities, which allowed us to measure accuracy, efficiency, productivity and researchers' satisfaction during data analysis performance experiments.

The Datafied Society: Challenges and Strategies in Big Data Research for Social Sciences and Humanities

The advent of big data marks a profound shift in our epistemological framework, introducing a new knowledge paradigm where the social landscape is shaped by data processing, perceived as both comprehensive and natural. This transformative shift challenges traditional notions of human agency in societal understanding, positioning empirical quantification at the forefront of inquiry. Beyond philosophical implications, pragmatic challenges abound in big data research, from issues of commensuration and the influence of action grammars to the dominance of correlational over causal relationships, the prevalence of everyday data over historical archives, and the pervasive impact of algorithms on data ecosystems. This manuscript undertakes a comprehensive exploration of these challenges, proposing strategies for navigating them within emerging disciplines such as Digital Humanities, Social Computing, and Cultural Analysis. Methodologically anchored in constructivist principles and critical discourse analysis (CDA), the study investigates how socio-cultural contexts shape data and knowledge production. Drawing on extensive literature and meta-analyses, it synthesizes diverse perspectives to underscore the necessity for methodological innovation and reflexivity in addressing the complexities of big data research, ensuring the integrity and depth of social inquiry amidst evolving data-driven methodologies.

Issues in Humanities Data Curation

2011

As both the materials and the analytical practices of humanities research become increasingly digital, the challenge of sustaining meaningful access to the outputs of humanities research becomes more urgent. Continuing technological change and new institutional pressures require a sustainable commitment to curate humanities data throughout its entire lifecycle from creation to re-use and long-term preservation. Success will depend on coordinated, collaborative efforts and arrangements amongst scholars, librarians, administrators, funders, and their organizations. This document is intended to provide background and provoke discussions about the skills, professional roles, training, and institutional support needed for curation of humanities research materials. Each section contains questions for further exploration and debate that we hope will provide participants with opportunities to share their own experiences and knowledge. The final white paper resulting from the Humanities Data Curation Summit sponsored by the Data Curation Education Program at the University of Illinois, Urbana-Champaign, and CenterNet is intended to advance the development of a curation agenda for the digital humanities as a vital piece of conceptual infrastructure [1] for the field. Based on a shared understanding of current problems, the final white paper will propose concrete, specific recommendations for scholars, librarians and archivists, professional societies, institutions, and funders.