GARY KING
Recent Papers
Working Paper. “Experimental Evidence on the (Limited) Influence of Reputable Media Outlets.”
High quality news outlets are widely regarded as essential to responsive, uncorrupt democratic governments. However, experimental validation of the mechanisms of this claim, whereby outlets influence citizen knowledge and views, has proven elusive because reputable outlets try to publish the truth (and so valid control groups are hard to find), do not randomize news content, and have business models that generate massive endogeneity for researchers. We worked with a major media outlet to overcome these problems and meet journalistic and scientific standards. The results of four experiments covering crime, the economy, the environment, and gender equity indicate that editorial decisions have large effects on readers' factual knowledge, as implied by claims about the importance of the press, but they are only modestly larger than the effect of sponsored content on the same sites, which anyone can buy without editorial oversight. Moreover, at least in the short term, editorial decisions are no different from sponsored content purchases for other outcomes: Effects on political attitudes and policy preferences are statistically indistinguishable from each other, approximately zero, and the same across policy areas. Our results suggest that the traditional news media provides a clear but tenuous foundation for democratic citizen education.
María Ballesteros, Cynthia Dwork, Gary King, Conlan Olson, and Manish Raghavan. 3/25/2025. “Evaluating the Impacts of Swapping on the US Decennial Census.” Symposium on Computer Science and Law (CSLAW ’25). München, Germany.
To meet its dual burdens of providing useful statistics and ensuring privacy of individual respondents, the US Census Bureau has for decades introduced some form of “noise” into published statistics. Initially, they used a method known as “swapping” (1990–2010). In 2020, they switched to an algorithm called TopDown that ensures a form of Differential Privacy. While the TopDown algorithm has been made public, no implementation of swapping has been released and many details of the deployed swapping methodology have been kept secret. Further, the Bureau has not published (even a synthetic) “original” dataset and its swapped version. It is therefore difficult to evaluate the effects of swapping, and to compare these effects to those of other privacy technologies. To address these difficulties we describe and implement a parameterized swapping algorithm based on Census publications, court documents, and informal interviews with Census employees. With this implementation, we characterize the impacts of swapping on a range of statistical quantities of interest. We provide intuition for the types of shifts induced by swapping and compare against those introduced by TopDown. We find that even when swapping and TopDown introduce errors of similar magnitude, the direction in which statistics are biased need not be the same across the two techniques. More broadly, our implementation provides researchers with the tools to analyze and potentially correct for the impacts of disclosure avoidance systems on the quantities they study.
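To make the idea of a parameterized swap concrete, here is a toy sketch in Python. It is not the Census Bureau's procedure and not the algorithm implemented in the paper: the record layout (block, household size, group), the matching rule (same household size, different block), and the single `swap_rate` parameter are all assumptions chosen for illustration.

```python
import random

def swap_households(households, swap_rate, seed=0):
    """Return a copy in which some households trade block labels.

    households: list of (block, size, group) tuples (hypothetical fields).
    swap_rate:  probability that a household is selected for swapping.
    A selected household trades blocks with a not-yet-swapped household of
    the same size in a different block, if one exists.
    """
    rng = random.Random(seed)
    out = [list(h) for h in households]
    swapped = set()
    for i in range(len(out)):
        if i in swapped or rng.random() >= swap_rate:
            continue
        partners = [
            j for j in range(len(out))
            if j != i and j not in swapped
            and out[j][1] == out[i][1]   # same household size
            and out[j][0] != out[i][0]   # different block
        ]
        if partners:
            j = rng.choice(partners)
            out[i][0], out[j][0] = out[j][0], out[i][0]
            swapped.update((i, j))
    return [tuple(h) for h in out]

# Made-up records: (block, household size, group).
toy = [("A", 2, "x"), ("A", 3, "y"), ("B", 2, "y"), ("B", 3, "x"), ("C", 2, "x")]
print(swap_households(toy, swap_rate=0.5))
```

Because partners are matched on household size, each block keeps the same number of households of each size in this sketch, while the group composition within a block can change. That is the kind of shift in block-level statistics the abstract refers to.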
Zachary J. Ward, Rifat Atun, Gary King, Brenda Sequeira Dmello, and Sue J. Goldie. 11/11/2024. “Global maternal health country typologies: A framework to guide policy.” PLOS Global Public Health, 4, 11, Pp. 1-10.
Maternal mortality remains a large challenge in global health. Learning from the experience of similar countries can help to accelerate progress. In this analysis we develop a typology of country groupings for maternal health and provide guidance on how policy implications vary by country typology. We used estimates from the Global Maternal Health (GMatH) microsimulation model, which was empirically calibrated to a range of fertility, process, and mortality indicators and provides estimates for 200 countries and territories. We used the 2022 estimates of the maternal mortality ratio (MMR) and lifetime risk of maternal death (LTR) and used a k-means clustering algorithm to define groups of countries based on these indicators. We estimated the means of other maternal indicators for each group, as well as the mean impact of different policy interventions. We identified 7 groups (A-G) of country typologies with different salient features. High burden countries (A-B) generally have MMRs above 500 and LTRs above 2%, and account for nearly 25% of global maternal deaths. Countries in these groups are estimated to benefit most from improving access to family planning and increasing facility births. Middle burden countries (C-E) generally have MMRs between 100–500 and LTRs between 0.5%-3%. Countries in these groups account for 55% of global maternal deaths and would benefit most from increasing facility births and improving quality of care. Low burden countries (F-G) generally have MMRs below 100 and LTRs below 0.5%, account for 20% of global maternal deaths, and would benefit most from improving access to family planning and community-based interventions and linkages to care. Indicators vary widely across groups, but also within groups, highlighting the importance of considering multiple indicators when assessing progress in maternal health. Policy impacts also differ by country typology, providing policymakers with information to help prioritize interventions.
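The clustering step can be sketched with a plain k-means (Lloyd's algorithm) on two indicators. This is only an illustration under stated assumptions: the (MMR, LTR) pairs below are made up rather than GMatH estimates, and the standardization of each indicator and the deterministic farthest-first seeding are choices made here, not details reported in the abstract.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic seeding: repeatedly add the point farthest from all centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(mean(c) for c in zip(*members))
    return labels

# Made-up (MMR per 100,000 live births, LTR in percent) pairs.
toy = [(800, 3.5), (650, 2.8), (720, 3.1),
       (300, 1.2), (250, 0.9), (350, 1.5),
       (20, 0.03), (10, 0.01), (40, 0.06)]
print(kmeans(standardize(toy), k=3))
```

Standardizing first keeps the indicator with the larger numeric range (here MMR) from dominating the distance calculation.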
Musashi Hinck, Uma Ilavarasan, Gary King, Kentaro Nakamura, and Brandon M. Stewart. 7/18/2024. “Automated Cognitive Debriefing.” In Society for Political Methodology. Riverside, CA.
Cognitive debriefing: necessary for researchers & respondents to agree on question meaning but prohibitively expensive, so rarely used.
- Administer survey, then go back & discuss what respondent thinks each question means.
- Universally-recommended best practice.
Our goal: easily & drastically improve question wording through an automated cognitive debriefing tool (ACD tool).
Zachary J. Ward, Rifat Atun, Gary King, Brenda Sequeira Dmello, and Sue J. Goldie. 6/2024. “Global maternal mortality projections by urban/rural location and education level: a simulation-based analysis.” eClinicalMedicine, 72, Pp. 1-12.
Background
Maternal mortality remains a challenge in global health, with well-known disparities across countries. However, less is known about disparities in maternal health by subgroups within countries. The aim of this study is to estimate maternal health indicators for subgroups of women within each country.
Methods
In this simulation-based analysis, we used the empirically calibrated Global Maternal Health (GMatH) microsimulation model to estimate a range of maternal health indicators by subgroup (urban/rural location and level of education) for 200 countries/territories from 1990 to 2050. Education levels were defined as low (less than primary), middle (less than secondary), and high (completed secondary or higher). The model simulates the reproductive lifecycle of each woman, accounting for individual-level factors such as family planning preferences, biological factors (e.g., anemia), and history of maternal complications, and how these factors vary by subgroup. We also estimated the impact of scaling up women's education on projected maternal health outcomes compared to clinical and health system-focused interventions.
Findings
We find large subgroup differences in maternal health outcomes, with an estimated global maternal mortality ratio (MMR) in 2022 of 292 (95% UI 250–341) for rural women and 100 (95% UI 84–116) for urban women, and 536 (95% UI 450–594), 143 (95% UI 117–174), and 85 (95% UI 67–108) for low, middle, and high education levels, respectively. Ensuring all women complete secondary school is associated with a large impact on the projected global MMR in 2030 (97 [95% UI 76–120]) compared to current trends (167 [95% UI 142–188]), with especially large improvements in countries such as Afghanistan, Chad, Madagascar, Niger, and Yemen.
Interpretation
Substantial subgroup disparities present a challenge for global maternal health and health equity. Outcomes are especially poor for rural women with low education, highlighting the need to ensure that policy interventions adequately address barriers to care in rural areas, and the importance of investing in social determinants of health, such as women's education, in addition to health system interventions to improve maternal health for all women.
Natalie Ayers, Gary King, Zagreb Mukerjee, and Dominic Skinnion. Forthcoming. “Statistical Intuition Without Coding (or Teachers).” PS: Political Science and Politics.
Two features of quantitative political methodology make teaching and learning especially difficult: (1) Each new concept in probability, statistics, and inference builds on all previous (and sometimes all other relevant) concepts; and (2) motivating substantively oriented students, by teaching these abstract theories simultaneously with the practical details of a statistical programming language (such as R), makes learning each subject harder. We address both problems through a new type of automated teaching tool that helps students see the big theoretical picture and all its separate parts at the same time without having to simultaneously learn to program. This tool, which we make available via one click in a web browser, can be used in a traditional methods class, but is also designed to work without instructor supervision.
Danny Ebanks, Jonathan N. Katz, and Gary King. Working Paper. “How American Politics Ensures Electoral Accountability in Congress.”
An essential component of democracy is the ability to hold legislators accountable via the threat of electoral defeat, a concept that has rarely been quantified directly. Well known massive changes over time in indirect measures — such as incumbency advantage, electoral margins, partisan bias, partisan advantage, split-ticket voting, and others — all seem to imply wide swings in electoral accountability. In contrast, we show that the (precisely calibrated) probability of defeating incumbent US House members has been surprisingly constant and remarkably high for two-thirds of a century. We resolve this paradox with a generative statistical model of the full vote distribution to avoid biases induced by the common practice of studying only central tendencies, and validate it with extensive out-of-sample tests. We show that different states of the partisan battlefield lead in interestingly different ways to the same high probability of incumbent defeat. Many challenges to American democracy remain, but this core feature remains durable.
Presentations
Statistically Valid Inferences from Privacy Protected Data (Pew Research Center), at Pew Research Center, Washington DC, Thursday, April 24, 2025:
Venerable procedures for privacy protection and data sharing within academia, companies, and governments, and between sectors, are now known to be inadequate (e.g., respondents in de-identified surveys can usually be re-identified). At the same time, unprecedented quantities of data that could help social scientists understand and ameliorate the challenges of human society are presently locked away inside companies, governments, and other organizations, in part because of worries about privacy violations. Other organizations just go ahead and distribute deidentified data, thinking...
Statistically Valid Inferences from Privacy Protected Data (Stat188, Harvard University), at Harvard University, Wednesday, November 13, 2024:
Venerable procedures for privacy protection and data sharing within academia, companies, and governments, and between sectors, have been proven to be completely inadequate (e.g., respondents in de-identified surveys can usually be re-identified). At the same time, unprecedented quantities of data that could help social scientists understand and ameliorate the challenges of human society are presently locked away inside companies, governments, and other organizations, in part because of worries about privacy violations. We address these problems with a general-...
Books
Gary King, Robert O. Keohane, and Sidney Verba. 2021. Designing Social Inquiry: Scientific Inference in Qualitative Research, New Edition. 2nd ed. Princeton: Princeton University Press.
"The classic work on qualitative methods in political science"
Designing Social Inquiry presents a unified approach to qualitative and quantitative research in political science, showing how the same logic of inference underlies both. This stimulating book discusses issues related to framing research questions, measuring the accuracy of data and the uncertainty of empirical inferences, discovering causal effects, and getting the most out of qualitative research. It addresses topics such as interpretation and inference, comparative case studies, constructing causal theories, dependent and explanatory variables, the limits of random selection, selection bias, and errors in measurement. The book only uses mathematical notation to clarify concepts, and assumes no prior knowledge of mathematics or statistics.
Featuring a new preface by Robert O. Keohane and Gary King, this edition makes an influential work available to new generations of qualitative researchers in the social sciences.
Replication data at the Harvard Dataverse: https://doi.org/10.7910/DVN/YHZG5M.
Gary King, Kay Schlozman, and Norman Nie. 2009. The Future of Political Science: 100 Perspectives. New York: Routledge Press.
Federico Girosi and Gary King. 2008. Demographic Forecasting. Princeton: Princeton University Press.
We introduce a new framework for forecasting age-sex-country-cause-specific mortality rates that incorporates considerably more information, and thus has the potential to forecast much better, than any existing approach. Mortality forecasts are used in a wide variety of academic fields, and for global and national health policy making, medical and pharmaceutical research, and social security and retirement planning.
As it turns out, the tools we developed in pursuit of this goal also have broader statistical implications, in addition to their use for forecasting mortality or other variables with similar statistical properties. First, our methods make it possible to include different explanatory variables in a time series regression for each cross-section, while still borrowing strength from one regression to improve the estimation of all. Second, we show that many existing Bayesian (hierarchical and spatial) models with explanatory variables use prior densities that incorrectly formalize prior knowledge. Many demographers and public health researchers have fortuitously avoided this problem so prevalent in other fields by using prior knowledge only as an ex post check on empirical results, but this approach excludes considerable information from their models. We show how to incorporate this demographic knowledge into a model in a statistically appropriate way. Finally, we develop a set of tools useful for developing models with Bayesian priors in the presence of partial prior ignorance. This approach also provides many of the attractive features claimed by the empirical Bayes approach, but fully within the standard Bayesian theory of inference.
Replication data at the Harvard Dataverse: https://doi.org/10.7910/DVN/ZVN8XQ.
Gary King, Ori Rosen, and Martin A. Tanner. 2004. Ecological Inference: New Methodological Strategies. New York: Cambridge University Press.
Ecological Inference: New Methodological Strategies brings together a diverse group of scholars to survey the latest strategies for solving ecological inference problems in various fields. The last half decade has witnessed an explosion of research in ecological inference – the attempt to infer individual behavior from aggregate data. The uncertainties and the information lost in aggregation make ecological inference one of the most difficult areas of statistical inference, but such inferences are required in many academic fields, as well as by legislatures and the courts in redistricting, by businesses in marketing research, and by governments in policy analysis.
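A small worked example may help show why aggregation loses information. The sketch below computes the classic deterministic bounds (often attributed to Duncan and Davis) on a group's rate within one areal unit. It is not one of the strategies surveyed in the book, and the function and variable names are chosen here for illustration. If a fraction x of a unit belongs to a group and a fraction t of the whole unit shows some outcome, then t = x·b + (1 − x)·w, where b and w are the unknown rates inside and outside the group, so b can only be bounded.

```python
def duncan_davis_bounds(x, t):
    """Deterministic bounds on a group's rate from aggregate data.

    x: fraction of the unit's population in the group (0 < x <= 1).
    t: fraction of the unit's whole population with the outcome (0 <= t <= 1).
    Returns (lower, upper) for the unknown rate b within the group, using
    t = x*b + (1 - x)*w with the other group's rate w somewhere in [0, 1].
    """
    if not 0 < x <= 1 or not 0 <= t <= 1:
        raise ValueError("x must be in (0, 1] and t in [0, 1]")
    lower = max(0.0, (t - (1 - x)) / x)  # attained when w = 1
    upper = min(1.0, t / x)              # attained when w = 0
    return lower, upper

# A unit where the group is 80% of the population and overall turnout is 90%.
print(duncan_davis_bounds(0.8, 0.9))
# A unit split evenly with 50% turnout: the data alone say nothing about b.
print(duncan_davis_bounds(0.5, 0.5))
```

The bounds are tight when the group dominates the unit and uninformative when it does not, which is one reason model-based approaches are needed to say more.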