Critical Data Studies Research Papers

What would data science look like if its key critics were engaged to help improve it, and how might critiques of data science improve with an approach that considers the day-to-day practices of data science? This article argues for scholars to bridge the conversations that seek to critique data science and those that seek to advance data science practice to identify and create the social and organizational arrangements necessary for a more ethical data science. We summarize four critiques that are commonly made in critical data studies: data are inherently interpretive, data are inextricable from context, data are mediated through the sociomaterial arrangements that produce them, and data serve as a medium for the negotiation and communication of values. We present qualitative research with academic data scientists, "data for good" projects, and specialized cross-disciplinary engineering teams to show evidence of these critiques in the day-to-day experience of data scienti...

Communication technologies increasingly mediate data exchanges rather than human communication. We propose the term data valences to describe the differences in expectations that people have for data across different social settings. Building on two years of interviews, observations, and participation in the communities of technology designers, clinicians, advocates, and users for emerging mobile data in formal health care and consumer wellness, we observed the tensions among these groups in their varying expectations for data. This article identifies six data valences (self-evidence, actionability, connection, transparency, “truthiness,” and discovery) and demonstrates how they are mediated and how they are distinct across different social domains. Data valences give researchers a tool for examining the discourses around, practices with, and challenges for data as they are mediated across social settings.

This article examines the mutual domestication of users and recommendation algorithms on Netflix. Based on 25 interviews with users and an inductive analysis of their practices and profiles on the platform, we discuss five dynamics through which this mutual domestication occurs: personalization, or the ways in which individualized relationships between users and the platform are built; how algorithmic recommendations are integrated into a matrix of cultural codes; the rituals through which they are incorporated into spatial and temporal processes in daily life; the resistance to various aspects of Netflix as a form to enact agency; and the conversion or transformation of the private consumption of the platform into a public issue. The conclusion elaborates on the theoretical and analytical implications of this approach, to rethink the relationship between algorithms and culture.

Chapter 7 in the edited volume Situating Open Data.

This chapter explores the past, assesses the present and delineates the future of a media practice approach to citizen media. The first section provides an extensive overview of the different currents in research on media practices, identifying the antecedents of the media practice approach in several theoretical traditions and highlighting possible points of convergence between them. Hence, we ground the roots of the practice approach in Latin American communication and media studies, we scrutinize Couldry's conceptualization in connection to theories of practices within the social sciences, and we examine audience research, media anthropology, social movement studies, citizen and alternative media, and Communication for Development and Social Change. The second section takes stock of the current 'state of the art' of practice-focused research on citizen and activist media and develops a critical assessment of how the concept of media practices has been used in recent literature, identifying key strengths and shortcomings. In this section, we also discuss the integration of media practices with other concepts, such as mediation, mediatization, media ecologies, media archeology, media imaginaries, and the public sphere. The third section delineates future directions for research on citizen media and practice, reflecting on some of the challenges facing this growing interdisciplinary field. Here, we illustrate how the media practice approach provides a powerful framework for researching the pressing challenges posed by mediatization and datafication. Further, we highlight the need for deeper theoretical engagement, underline the necessity of dialogue between different traditions, and point out some unresolved issues and limitations. The chapter concludes with an outline of the contributions to this edited collection.

Feminicide (or femicide) is a feminist category that names, and denounces, the violent gender-related deaths of women, adolescents, and girls, especially their murders. In Latin America, the category has provided a framework for feminist activists and, more recently, states to collect data on feminicide. This paper seeks to understand the implications of the ways in which feminicide data are structured, classified, and curated, and how these arrangements direct possible actions. Through three key ideas (feminicide as a frame, digital traces of murder, and data frames), the study proposes a theoretical and methodological approach for analyzing the organization and presentation of data on gender-related killings of women. Using the method of ontological reconstruction, the study examines two datasets of gender-related killings of women in Uruguay, one produced by an activist and one by the state. This paper shows how the descriptions proposed by the datasets enable certain actions (and not others) and concludes with recommendations for revising the design of feminicide datasets and for future lines of research.

Money is an ‘instrument of collective memory’ before it is a means of exchange, a unit of account or a store of value. Money's status as a memory technology is particularly significant in light of the role that information and communication technologies now play in economic transactions. Many of the new channels and infrastructures for payments, such as magnetic cards, mobile phones, the wired Internet, social media platforms, and RFID technologies, record detailed transactional data alongside a range of other identifying data. We now have extremely detailed records of the many ways that money circulates, is transferred and is spent. This paper concerns this previously latent transactional data and how it is currently recorded, monetised, and used to inform action. What has been recorded in and about money at different moments in time and how are these categories breaking down? Who has access to and ownership over this collectively produced record and how is it driving new data practices and business models based on the monetisation and application of monetary records? And how might re-engaging with money's mnemonic status help to foreground a politics and ethics of transactional data?

Critical Data Studies (CDS) explore the unique cultural, ethical, and critical challenges posed by Big Data. Rather than treat Big Data as only scientifically empirical and therefore largely neutral phenomena, CDS advocates the view that Big Data should be seen as always-already constituted within wider data assemblages. Assemblages is a concept that helps capture the multitude of ways that already-composed data structures inflect and interact with society, its organization and functioning, and the resulting impact on individuals' daily lives. CDS questions the many assumptions about Big Data that permeate contemporary literature on information and society by locating instances where Big Data may be naively taken to denote objective and transparent informational entities. In this introduction to the Big Data & Society CDS special theme, we briefly describe CDS work, its orientations, and principles.

We commonly think of society as made of and by humans, but with the proliferation of machine learning and AI technologies, this is clearly no longer the case. Billions of automated systems tacitly contribute to the social construction of reality by drawing algorithmic distinctions between the visible and the invisible, the relevant and the irrelevant, the likely and the unlikely – on and beyond platforms.
Drawing on the work of Pierre Bourdieu, this book develops an original sociology of algorithms as social agents, actively participating in social life. Through a wide range of examples, Massimo Airoldi shows how society shapes algorithmic code, and how this culture in the code guides the practical behaviour of the code in the culture, shaping society in turn. The ‘machine habitus’ is the generative mechanism at work throughout myriads of feedback loops linking humans with artificial social agents, in the context of digital infrastructures and pre-digital social structures.
Machine Habitus will be of great interest to students and scholars in sociology, media and cultural studies, science and technology studies and information technology, and to anyone interested in the growing role of algorithms and AI in our social and cultural life.

This article sketches key concerns surrounding the digital reproduction of enslaved and colonized subjects held in cultural heritage collections. It centralizes one photograph of a crying Afro-Caribbean child from St. Croix, housed in the Royal Danish Library, to demonstrate the unresolved ethical matters present in retrospective attempts to visualize colonialism. Working with affect and haunting as research material, the inquiry questions how museums and other cultural heritage institutions are caretaking historical violations, identifying themselves as hosting agents, and navigating issues of trust and accountability as they make their colonial collections available online. Speculating about what an ethics of care in representation could look like, the article draws on reparatory artistic engagements with such imagery and proposes how metadata could be rethought as a cataloging space with the potential to alter historical imbalances of power.

Popular accounts of datafied ways of knowing implied in the ascendance of Big Data posit that the increasingly massive volume of information collected immanently to digital technologies affords new means of understanding complex social processes. As a rejoinder to existing modes of talking about Big Data and what it means for social research, this chapter suggests an epistemological intervention from a critical, anti-oppressive stance that seeks to reinstate people within datafied social life. Rather than taking as its premise that Big Data can offer insights into social processes, this approach starts from the perspective of the people caught up in programs of social sorting, carried out by computational algorithms, particularly as they occupy marginalized positions within regimes of power-knowledge (to use Foucault’s term). As a specifically situated case study, we examine the ways data are mobilized in European border control and how this phenomenon can be studied, framed through the eurocentric legacies of population measurement in colonial disciplinary surveillance. The connection between power and knowledge here is meant to implore researchers to consider how their deployments of Big Data, even from critical perspectives, may serve to replicate structures of discrimination by denying less “data-ready” ways of knowing. To that end, the conclusion of the chapter suggests some alternative methodological avenues for reinstating people – specifying who the “we” permits – in light of Big Data supremacy.

The collection, processing, storage and circulation of data are fundamental elements of contemporary societies. While the positivistic literature on 'data revolution' finds it essential for improving development delivery, critical data studies stress the threats of datafication. In this article, we demonstrate that datafication has been happening continuously through history, driven by political and economic pressures. We use historical examples to show how resource and personal data were extracted, accumulated and commodified by colonial empires, national governments and trade organizations, and argue that similar extractive processes are a present-day threat in the Global South. We argue that the decoupling of earlier and current datafication processes obscures the underlying, complex power dynamics of datafication. Our historical perspective shows how, once aggregated, data may become imperishable and can be appropriated for problematic purposes in the long run by both public and private entities. Using historical case studies, we challenge the current regulatory approaches that view data as a commodity and frame it instead as a mobile, non-perishable, yet ideally inalienable right of people.

Currently discussed under the term “echo chambers,” online networks reproduce gender and racial biases, as well as reinforcing social and economic inequalities embedded within society. As online communities have become more homogeneous, they have also shifted from the nineties cyberutopian vision of fluid, anonymous online beings to an imperative of an authentic, consistent social media profile. Social media networks became a place for real data, real names, real places, and real neighbors. Neighborhood platforms like Nextdoor or Amazon’s Neighbors by Ring see the relevance of their product in focusing on connecting real people living next to each other rather than people in a global online network. But what happens when neighborhood districts and data neighborhoods get linked in a social media neighborhood network? Networks of hybrid offline and online information based on small-scale areas and groups of people connected to machine-readable user profiles owned by private tech companies create various political and ethical problems, like the potential for surveillance, privacy and security concerns, and issues of discrimination. Two important implications of anonymity arise in the context of network science and the neighborhoods it produces. One has to do with a representation of identity based on groups of actors with the same characteristics, and the other with techniques and regulations of (non)identification. Both concepts influence and shape each other; their history and their interdependence with social and technical conditions have to be understood. By tracing the history of neighborhood developments like the neighborhood unit that evolved from the industrial city, the block regime in Nazi Germany, the countercultural idea of alternative ways of communal living, the vision of a virtual global village, and social media neighborhood networks linked to nearest-neighbor analytics, this paper traces how historically developed relations of anonymity and neighborhoods affect subjectivity, fairness, and relations of equality and difference today.

This paper situates data practices in Japan in a diffractive genealogy of surveillance capitalism. It puts data conceptualized in three ways into focus: real data, data in information banks, and data of the super app LINE. While technology embodying these concepts of data is mainly used in Asia, this technology is entangled in important respects with discourses and legislation in Europe and with practices of U.S. American surveillance capitalism. This article empirically traces these entanglements and demonstrates how discourses around data sovereignty, geopolitical shifts, historical background, global political and economic trends, and international policies intermingle in contemporary accounts of data and digital sovereignty in the Japanese context. Decolonial theory is consulted in order to account for Japan's recent past as a non-Western territorial empire and the privileged position that Japanese experts on data have in the drafting of international data policies.

Digital personal data is increasingly framed as the basis of contemporary economies, representing an important new asset class. Control over these data assets seems to explain the emergence and dominance of so-called "Big Tech" firms, consisting of Apple, Microsoft, Amazon, Google/Alphabet, and Facebook. These US-based firms are some of the largest in the world by market capitalization, a position that they retain despite growing policy and public condemnation, or "techlash," of their market power based on their monopolistic control of personal data. We analyse the transformation of personal data into an asset in order to explore how personal data is accounted for, governed, and valued by Big Tech firms and other political-economic actors (e.g., investors). However, our findings show that Big Tech firms turn "users" and "user engagement" into assets through the performative measurement, governance, and valuation of user metrics (e.g., user numbers, user engagement), rather than extending ownership and control rights over personal data per se. We conceptualize this strategy as a form of "techcraft" to center attention on the means and mechanisms that Big Tech firms deploy to make users and user data measurable and legible as future revenue streams.

This short article explores the changing meanings of "media" in media studies in relation to emerging technologies and critical paradigms in the field. After brief discussion of various media studies approaches, including media ecologies and algorithmic cultures, the article discusses the institutional rise of data sciences and compares this field to media studies. The author argues that while the foundational assumptions, methodological frameworks, and mindsets of data sciences and media studies differ, curious co-existences can morph into lively collaborations rather than turf battles, and concludes by pointing to research that exemplifies a productive synthesis. Generally, I am not one for broad brushstrokes. In my own research I tend to become mired in media materialities, rather than commit to meta-level extrapolations. However, when given the opportunity recently to comment on some changes in media studies over the past thirty years, I decided to give it a try. To guide me, I borrowed a technique from Charlotte Brunsdon's (1998) chapter, "What is the 'Television' of Television Studies?" Reflecting on the development of a relatively new field of study, Brunsdon recognized that television scholars conceptualized and approached their

This is a preprint version of an essay I have written about self-tracking for a book on information keywords. It includes an overview of theoretical perspectives that can be used to understand self-tracking and personal data, with a particular focus on vital materialism as espoused in the scholarship of Donna Haraway and Jane Bennett. Reference is also made to some findings from empirical studies colleagues and I have conducted on self-tracking rationales and practices.

A spate of recent scandals concerning personal digital data illustrates the extent to which innovation and finance are thoroughly entangled with one another. The innovation-finance nexus is an example of an emerging dynamic in technoscientific capitalism in which innovation is increasingly driven by the pursuit of “economic rents”. Unlike innovation that delivers new products, services, and markets, innovation as rentiership is defined by the extraction and capture of value through different modes of ownership and control over resources and assets. This shift towards rentiership is evident in the transformation of personal digital data into a private asset. In light of this assetization, it is necessary to unpack how innovation itself might be a problem, rather than a solution to a range of global challenges. Our aim in this paper is to conceptualize this relationship between innovation, finance, and data rentiership, and examine the policy implications of this pursuit of economic rents as a deliberate research and innovation strategy in data-driven technology sectors.

Humans have become increasingly datafied with the use of digital technologies that generate information about them. The onto-epistemological dimensions of personal digital data assemblages and their relationship to bodies and selves have yet to be thoroughly theorised. In this essay, I adopt various sociomaterialist perspectives, particularly those espoused in feminist materialism, vital materialism and the anthropology of material culture, to examine the ways in which these assemblages operate as part of knowing, perceiving and sensing human bodies. My aim is to reflect on the enactments of the human-nonhuman assemblages that are personal digital data. In so doing, I position these assemblages as things which are made and used by humans, involving processes of creativity, articulation and improvisation, and draw attention to the vitality of data things (their thing-power). I argue that these theoretical perspectives work to highlight the material dimensions of human-data assemblages as they are made, grow, and are enacted, articulated and incorporated; emphasise the intertwined nature of known, knower and knowing; involve both reflexive (what shared tacit norms, assumptions and discourses underpin practices) and diffractive (what are the emergent entanglements of practices and agents, what is different or resistant, what are new or alternative possibilities); and identify how all these assume importance and significance in people’s lives.

Big data is increasingly seen as a way of providing a more 'scientific' approach to the understanding and management of cities. But most geographic analyses of geotagged social media data have failed to mobilize a sufficiently complex understanding of socio-spatial relations. By combining the conceptual approach of relational socio-spatial theory with the methods of critical GIScience, this paper explores the spatial imaginaries and processes of segregation and mobility at play in the notion of the '9th Street Divide' in Louisville, Kentucky. Through a more context-sensitive analysis of this data, this paper argues against this popular spatial imaginary and the notion that Louisville's West End is somehow separate and apart from the rest of the city. By analyzing the everyday activity spaces of different groups of Louisvillians through geotagged Twitter data, we instead argue for an understanding of these neighborhoods as fluid, porous and actively produced, rather than as rigid, static or fixed. Ultimately, this paper is meant to provide a conceptual and methodological framework for the analysis of social media data that is more attentive to the multiplicity of socio-spatial relations embodied in such data.

Recently, media and communication researchers have shown an increasing interest in critical data studies and ways to utilize data for social progress. In this commentary, I highlight several useful contributions in the International Panel on Social Progress' (IPSP) report toward identifying key data justice issues, before suggesting extra focus on algorithmic discrimination and implicit bias. Following my assessment of the IPSP's report, I emphasize the importance of two emerging media and communication areas – data ontology and semantic technology – that impact internet users daily yet receive limited attention from critical data researchers. I illustrate two examples to show how data ontologies and semantic technologies impact social processes by engaging in the hierarchization of social relations and entities, a practice that will become more common as the internet changes states towards a "smarter" version of itself.

As an imagination, the 'smart city' is rapidly becoming an integral part of our urban futures. Situated in the contemporary moment of 'data revolution' and India's techno-urban context, this paper is an attempt to reflect upon the socio-technical imaginaries of data-driven urbanism and the incumbent reconfigurations of how we know, experience and govern a city. The author provides ethnographic vignettes of five little traditions of data-driven urbanism in Delhi pertaining to: the new 'image of the city', the changing nature of expertise, civic data activism, data-driven consumer applications and political communication and analytics. Foregrounding the generative potentials of each of these socio-technical sites, the paper argues for a meta-analytics of data.

In what follows, some contemporary narratives about 'the information society' are interrogated from critical race theoretical and decolonial perspectives with a view to constructing a 'counter-narrative' purporting to demonstrate the embeddedness of coloniality—that is, the persistent operation of colonial logics—in such discourses.

This article introduces the tenets of a theory of datafication of and in the Souths. It calls for a de-Westernization of critical data studies, in view of promoting a reparation to the cognitive injustice that fails to recognize non-mainstream ways of knowing the world through data. It situates the "Big Data from the South" research agenda as an epistemological, ontological, and ethical program and outlines five conceptual operations to shape this agenda. First, it suggests moving past the "universalism" associated with our interpretations of datafication. Second, it advocates understanding the South as a composite and plural entity, beyond the geographical connotation (i.e., "global South"). Third, it postulates a critical engagement with the decolonial approach. Fourth, it argues for the need to bring agency to the core of our analyses. Finally, it suggests embracing the imaginaries of datafication emerging from the Souths, foregrounding empowering ways of thinking data from the margins.

Social life is rife with networks of any kind. Nowadays, sociological concerns for networks, relations, associations, processes, mobilities, and flows are intensive and emblematic. This reflection takes "networks" and their multiple products as starting points for a new sociological imagination. It hence outlines a set of current theoretical and methodological issues for approaching the wide and diverse field of the sociology of networks in a critical manner, beginning from the analytical distinction between the critical and the normative-functionalist sociology of networks. The paper concludes with reference to contemporary digital society and the critical use of network data (Big Data), that is, ever-larger quantities of information generated by human communicative interactions in social networking platforms and other web activities.

This theoretical article explores the bottom-up data practices enacted by individuals and groups in the context of organized collective action. Conversing with critical media theory, the sociology of social movements, and platform studies, it asks how activists largely reliant on social media for their activities can leverage datafication and mobilize social media data in their tactics and narratives. Using the notion of digital traces as a heuristic tool to understand the dynamics between platforms and their users, the article reflects on the concurrent materiality and discursiveness of digital traces and analyzes the evolution of political agency vis-à-vis the datafied self. It contributes to our understanding of "digital traces in context" by foregrounding human agency and the meaning-making activities of individuals and groups. Focusing on the possibilities opened up by digital traces, it considers how activists make sense of the ways in which social media structure their interactions. It shows how digital traces trigger a quest for visibility that is unprecedented in the social movement realm, and how they can function as particular "agency machines."