Counterfactuals and Hypothesis Testing in Political Science | World Politics | Cambridge Core

Abstract

Scholars in comparative politics and international relations routinely evaluate causal hypotheses by referring to counterfactual cases where a hypothesized causal factor is supposed to have been absent. The methodological status and the viability of this very common procedure are unclear and are worth examining. How does the strategy of counterfactual argument relate, if at all, to methods of hypothesis testing based on the comparison of actual cases, such as regression analysis or Mill's Method of Difference? Are counterfactual thought experiments a viable means of assessing hypotheses about national and international outcomes, or are they methodologically invalid in principle? The paper addresses the first question in some detail and begins discussion of the second. Examples from work on the causes of World War I, the nonoccurrence of World War III, social revolutions, the breakdown of democratic regimes in Latin America, and the origins of fascism and corporatism in Europe illustrate the use, problems and potential of counterfactual argument in small-N-oriented political science research.

References

1 Moore, Social Origins of Dictatorship and Democracy (Boston: Beacon Press, 1966), 414; Waltz, Theory of International Politics (New York: Random House, 1979), 180.

2 Data not generated by random assignment to control and treatment groups are referred to as quasi-experimental or nonexperimental.

3 The sense of “significant respects” is discussed below.

4 These summary statements of the two strategies are not complete. Qualifications and elaborations for each are discussed in the rest of the paper, with more attention paid to the counterfactual case strategy. The potential difficulties with the method of comparing actual cases, which is formally known as regression analysis though informally practiced in such works as Theda Skocpol, States and Social Revolutions (Cambridge: Cambridge University Press, 1979), are extensively discussed in the econometrics and statistics literatures.

5 “Structural position” here would entail the number of great powers and the basic geopolitical circumstances of the Soviet Union. Waltz (fn. 1); and idem, “Another Gap?” in Robert Osgood et al., Containment, Soviet Behavior, and Grand Strategy, Policy Papers in International Affairs No. 16 (Berkeley: Institute of International Studies, University of California, 1981). On structural versus domestic political or ideological explanations of Soviet foreign policy, see also Barry R. Posen, “Competing Images of the Soviet Union,” World Politics 39 (July 1987), 579–97.

6 Degrees of freedom are the number of cases minus the number of explanatory variables minus one.
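
The definition in this note can be written out as a one-line calculation. The following is a minimal sketch, not part of the article: the function name `degrees_of_freedom` and the example numbers are hypothetical and chosen only for illustration. It shows how a single case study that weighs several candidate causes ends up with negative degrees of freedom.

```python
def degrees_of_freedom(n_cases: int, n_explanatory: int) -> int:
    """Cases minus explanatory variables minus one, as defined in note 6."""
    return n_cases - n_explanatory - 1


# Hypothetical large-N design: 30 actual cases, 4 explanatory variables.
large_n = degrees_of_freedom(30, 4)

# Hypothetical single case study weighing 3 candidate causes.
single_case = degrees_of_freedom(1, 3)

print(large_n, single_case)  # prints "25 -3"
```

A negative result means the actual cases alone cannot separate the candidate causes, which is the situation in which the notes describe analysts turning to counterfactual cases.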

7 For example, Barrington Moore, The Social Bases of Obedience and Revolt (London: Macmillan, 1978); Stephen Van Evera, “The Cult of the Offensive and the Origins of the First World War,” International Security 9 (Summer 1984), 58–107. Of course, historians can be quite careful about their counterfactual arguments. For examples, see McGeorge Bundy, Danger and Survival (New York: Random House, 1988); and George Kennan, Russia and the West under Lenin and Stalin (Boston: Little Brown, 1960), 29–32.

8 On deterrence, see Paul Huth, Extended Deterrence and the Prevention of War (New Haven: Yale University Press, 1988); on fascism versus liberalism in Germany, see Moore (fn. 1); on fascism versus liberalism, corporatism, or traditional dictatorship, see Gregory M. Luebbert, “Social Foundations of Political Order in Interwar Europe,” World Politics 39 (July 1987), 449–78.

9 For example, according to A. J. P. Taylor, “a historian should never deal in speculations about what did not happen”; Taylor, The Struggle for Mastery in Europe, 1848–1918 (London: Oxford University Press, 1954), 513. Or, in M. M. Postan's words, “The might-have-beens of history are not a profitable subject of discussion”; quoted in J. D. Gould, “Hypothetical History,” Economic History Review, 2d ser., 22 (August 1969), 195–207. See also David Hackett Fischer, Historians' Fallacies (New York: Harper Colophon Books, 1970), 15–21; and examples given in Peter McClelland, Causal Explanation and Model-Building in History, Economics, and the New Economic History (Ithaca, N.Y.: Cornell University Press, 1975).

10 Weber, “Objective Possibility and Adequate Causation in Historical Explanation,” in The Methodology of the Social Sciences (New York: Free Press, 1949); Elster, Logic and Society: Contradictions and Possible Worlds (New York: Wiley, 1978); citations below (fn. 56); and Nelson Polsby, ed., What If?: Essays in Social Science Fiction (Lexington, Mass.: Lewis Publishing, 1982).

11 The notion of comparability plays a major role in the methodological and applied writings of specialists in comparative politics. My impression is that nonetheless the notion remains a deeply vague one. It seems to include, at various times, the idea that the other causes should be uncorrelated with the independent variables (E(X'e) = 0); that everything else should have as little influence as possible (E(e'e) should be close to zero); that measures will not be as valid or reliable across countries and cultures; and other meanings. (Throughout, E(.) is the expectations operator; X is an n × k matrix of n observations on k independent variables; e is an n × 1 vector of error terms.)

Posing the main risk for analysis across sets of actual cases in terms of the regression validity of the ceteris paribus assumption also bears qualification. For regression estimates to be unbiased, we do not need the other things to be literally equal, though it is true that the more equal they are, the greater the precision of our estimated effects. For unbiased estimates of causal effects we need only require that the other things not be systematically related to the prospective causes and the dependent variable that we are evaluating. This point appears not to have been fully clear to Mill (working before statistics was well developed), who sometimes writes in his System of Logic (London: John W. Parker, 1851) as though everything else has to be literally identical in order for the Method of Difference to work. The same confusion seems to carry over today in the work of some specialists in comparative politics who take Mill as a principal methodological guide (e.g., Theda Skocpol and Margaret Somers, “The Uses of Comparative History in Macrosocial Inquiry,” Comparative Studies in Society and History 22 [April 1980], 174–97). That said, I should also note that those who conduct large-N research often do refer to this assumption as the “ceteris paribus assumption” simply for convenience, and I will follow this usage here.
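
The distinction this note draws between unbiasedness and precision can be made explicit with standard least-squares algebra. The sketch below is not taken from the article. It assumes a linear model in which X is treated as fixed with full column rank and the errors have mean zero and common variance; the symbols y, β, β̂, and σ² are introduced here only for illustration, while X and e are as defined in the note.

```latex
\begin{align*}
y &= X\beta + e, \\
\hat{\beta} &= (X'X)^{-1}X'y = \beta + (X'X)^{-1}X'e, \\
E(\hat{\beta}) &= \beta + (X'X)^{-1}E(X'e) = \beta
  \quad \text{if } E(X'e) = 0, \\
\operatorname{Var}(\hat{\beta}) &= \sigma^{2}(X'X)^{-1}
  \quad \text{if } E(e) = 0 \text{ and } \operatorname{Var}(e) = \sigma^{2} I_{n}.
\end{align*}
```

Under these assumptions E(e'e) = nσ², so the bias of the estimate depends only on whether E(X'e) = 0, while the size of E(e'e) governs how precise the estimate is. That is the sense in which other things need not be literally equal for the estimate to be unbiased.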

12 This is true as well of actual experiments in which cases are assigned at random to treatment and control groups. See Leland Neuberg, Conceptual Anomalies in Economics and Statistics: Lessons from the Social Experiment (Cambridge: Cambridge University Press, 1988). Among other things, Neuberg shows that a counterfactual assumption is needed to justify estimates of sampling variance in actual experiments. I suggest below, however, that counterfactuals play a key role in quasi-experimental hypothesis testing that they do not play in actual experiments.

13 The latter example comes from Huth (fn. 8), 97, who often uses counterfactual argument about particular cases to make more plausible the results of his regression analysis.

14 I am relying here on what David Lewis calls “metalinguistic” theories of counterfactuals. These hold that “a counterfactual is true, or assertable, if and only if its antecedent, together with suitable further premises, implies its consequent”; Lewis, Counterfactuals (Cambridge: Cambridge University Press, 1973), 65. The “further premises” may include both facts and causal laws, or “lawlike generalizations.” For example, the counterfactual “If that match had been struck, it would have lit” is true given the existence of certain laws concerning sulfur, oxygen, friction, and heat, plus certain factual conditions, including a dry match, presence of oxygen, etc. A counterfactual is thus a “condensed or incomplete argument” (J. L. Mackie, “Counterfactuals and Causal Laws,” in R. J. Butler, ed., Analytical Philosophy [Oxford: Blackwell, 1962], 68). There are other accounts of what makes a counterfactual true (or assertable), based on notions of distance between “possible worlds”; see Lewis.

15 Van Evera (fn. 7); Jack Snyder, “Civil-Military Relations and the Cult of the Offensive, 1914 and 1984,” International Security 9 (Summer 1984), 108–46.

16 Of course, each step of this process—from identifying a sample to interpreting relative importance—is fraught with methodological peril. Both strategies, it should be emphasized, are risky.

17 I want to suggest that counterfactual reasoning must underlie efforts to infer or assess the relative weights of causes in case studies where the analyst's degrees of freedom in the actual world are negative. In practice, those who use case studies often resort as well to casual comparisons with other actual cases (e.g., “Whereas in many other African countries …, in Kenya …”) and testing multiple implications of a theory; see Donald Campbell, “‘Degrees of Freedom’ and the Case Study,” Comparative Political Studies 8 (July 1975), 178–93.

19 There is, however, more than one meaningful sense to the idea of causal importance in a regression model. See J. Merrill Shanks, “The Importance of Importance” (Working paper, Survey Research Center, University of California, Berkeley, 1982); Christopher Achen, Interpreting and Using Regression (Beverly Hills, Calif.: Sage Publications, 1982).

20 Some philosophers of history working on the problem of how historians can and should attribute causal weightings have proposed similar criteria. See Raymond Martin, “Causes, Conditions, and Causal Importance,” History and Theory 21 (1982), 53–74, and citations therein.

21 Cf. Jack Levy, “Domestic Politics and War,” Journal of Interdisciplinary History 18 (Spring 1988), 653–73.

22 Another tack on this puzzle is taken by Campbell (fn. 17).

23 E.g., Youssef Cohen, “Democracy from Above: The Political Origins of Military Dictatorship in Brazil,” World Politics 40 (October 1987), 30–54; Hyug Baeg Im, “The Rise of Bureaucratic Authoritarianism in South Korea,” World Politics 39 (January 1987), 231–57; Joanne Gowa, “Hegemons, IOs, and Markets: The Case of the Substitution Account,” International Organization 38 (Autumn 1984), 661–83.

24 For examples, see articles in Frederic Deyo, ed., The Political Economy of the New Asian Industrialism (Ithaca, N.Y.: Cornell University Press, 1987).

25 As the preceding discussion should suggest, an N=1 case study in which causal inferences are drawn is, strictly speaking, impossible, since other counterfactual cases must be invoked to support causal claims. I use N here to refer to the number of cases in the actual world. On the idea of actual versus possible worlds, see Michael Loux, ed., The Possible and Actual: Readings in the Metaphysics of Modality (Ithaca, N.Y.: Cornell University Press, 1985).

26 These include, but are not limited to, nationalism, imperialism, capitalism, social Darwinism, a fatalistic intellectual mood, the balance of power system, population growth, differential industrialization, a power transition, long cycles, tight alliances, multipolarity, misperceptions, psychological pathologies, leader personalities, essentially aggressive German intent, military doctrine (i.e., the cult of the offensive), military organization, diplomatic errors, the Russian mobilization, the archduke's assassination, and the outcomes of recent crises.

27 Jervis, “War and Misperception,” Journal of Interdisciplinary History 18 (Spring 1988), 684.

29 I should note that rationality principles are not the only ones that might be used to limn counterfactual scenarios. One might argue, for example, that had some independent variable been different, a key actor would have ignored it due to cognitive dissonance or wishful thinking.

Even so, the frequent use of rationality principles to sketch counterfactual scenarios should not be surprising. The counterfactual strategy is often used by analysts explaining an outcome as the result of human choices. This entails saying why other possible choices were not seen as desirable by the actors. In game-theoretic terms, analysts using the counterfactual strategy are often describing why some particular set of choices was an equilibrium (or, at least, rationalizable) strategy in the “game” faced by the actors. On Nash equilibrium versus rationalizability as game-theoretic solution concepts, see B. Douglas Bernheim, “Rationalizable Strategic Behavior,” Econometrica 52 (1984), 1007–28.

30 Van Evera (fn. 7), 105 (emphasis added).

31 Scott Sagan, “1914 Revisited,” International Security 2 (Fall 1986), 151–75, at 159 (emphasis added).

32 See also Snyder's response to Sagan's critique and Sagan's reply, International Security 9 (Winter 1986–87), 187–98. Their discussion is carried out largely in the realm of the counterfactual (e.g., what was the probability that the Schlieffen plan would work).

33 The Hollandization thesis is developed by John Mueller in Retreat from Doomsday (New York: Basic Books, 1989), where he argues that gradual changes in the government and societies of advanced industrial states have made them more peaceable in their external affairs. For a review of arguments on the causes of the long peace, see John Lewis Gaddis, The Long Peace (Oxford: Oxford University Press, 1987), chap. 8.

34 Depending on how one counts the “poles,” neither does bipolarity; see Waltz (fn. 1).

35 To assess the question of relative importance, we would also need to ask about what would have happened if nuclear weapons existed but Hollandization did not. Mueller does not explore this second counterfactual scenario explicitly. To hold that Hollandization has been the more important cause, he would need to argue that postwar states lacking the key Hollandization attributes might not have been deterred from fighting a major war, despite nuclear weapons.

36 Mueller, “The Essential Irrelevance of Nuclear Weapons: Stability in the Postwar World,” International Security 13 (Fall 1988), 55–79, at 56 (emphasis added).

38 The fortunate absence of actual cases of nuclear conflict has led a number of historians and political scientists to reflect on the role of counterfactuals in nuclear history. See John Lewis Gaddis, “Nuclear Weapons and International Systemic Stability,” American Academy of Arts and Sciences Occasional Paper No. 2 (Cambridge: AAAS, 1990). This paper was prepared for an AAAS workshop entitled “Nuclear History and the Use of Counterfactuals.” In a different vein, Richard Ned Lebow and Janice Gross Stein (“Beyond Deterrence,” Journal of Social Issues 43 [Winter 1987], 3–71) have briefly discussed the role of counterfactuals in defining a sample of cases of successful deterrence.

39 Juan Linz and Alfred Stepan, eds., The Breakdown of Democratic Regimes (Baltimore, Md.: Johns Hopkins University Press, 1978).

40 Stepan, “Political Leadership and Regime Breakdown: Brazil,” ibid. For other examples, see Richard Smoke, War: Controlling Escalation (Cambridge: Harvard University Press, 1977), and citations in fn. 23.

41 Stepan (fn. 40), 134, and see also 120.

42 The distinction is similar to that between underlying causes and specific or proximate causes—a framework often used by historians.

43 Stepan (fn. 40), 129 and 130.

44 In a current project, Stepan uses explicit counterfactual analysis to assess the impact of presidential as opposed to parliamentary systems on democratic regime breakdown in South America and Southern Europe.

45 Arend Lijphart, “Comparative Politics and the Comparative Method,” American Political Science Review 65 (September 1971), 682–93.

46 Stepan, The State and Society: Peru in Comparative Perspective (Princeton: Princeton University Press, 1978).

47 Moore (fn. 1), 430. See also Alexander L. George et al., The Limits of Coercive Diplomacy (Boston: Little Brown, 1971), 227.

48 Skocpol (fn. 4). Only four of these “negative cases” are treated explicitly and at length, though Skocpol is well aware that others mentioned are used in the same fashion.

50 Luebbert (fn. 8), 457–58 (emphasis added).

52 The rationality principle is: Parties desirous of electoral success will seek partners that can carry many votes with them. The implicit counterfactual argument is: If there had been many landless laborers in Norway, the socialists might have sought to form a coalition with them, and fascism might have resulted.

53 Luebbert (fn. 8), 466.

54 Weber (fn. 10), 164 (emphasis in original).

55 Robert Fogel, Railroads and American Economic Growth (Baltimore, Md.: Johns Hopkins University Press, 1964); McClelland (fn. 9); Gould (fn. 9); Fritz Redlich, “‘New’ and Traditional Approaches to Economic History and Their Interdependence,” Journal of Economic History 25 (1965), 480–95; and T. A. Climo and P. G. A. Howells, “Possible Worlds in Historical Explanation,” History and Theory 15 (1976), 1–20. Fischer (fn. 9) lists further references.

56 Elster (fn. 10). See also Elster, Explaining Technical Change (Cambridge: Cambridge University Press, 1983), chap. 1; idem, “Reply to Comments,” Inquiry 23 (June 1980), 213–32; Steven Lukes, “Elster on Counterfactuals,” Inquiry 23 (June 1980), 145–55; Brian Barry, “Supertax,” Political Studies 28 (1980), 139–43. Political scientists have broached issues raised by counterfactuals in a variety of places. See, for example, Alexander George and Timothy McKeown, “Case Studies and Theories of Organizational Decision Making,” in Robert Coulam and Richard Smith, eds., Advances in Information Processing in Organizations (Greenwich, Conn.: JAI Press, 1985), 2:33–34; Charles Ragin, The Comparative Method (Berkeley: University of California Press, 1987), 39; Donald Moon, “The Logic of Political Inquiry: A Synthesis of Opposed Perspectives,” in Nelson Polsby and Fred Greenstein, eds., Handbook of Political Science (Reading, Mass.: Addison-Wesley, 1975), 1:131–228.

57 Nelson Goodman, “The Problem of Counterfactual Conditionals,” Journal of Philosophy 44 (1947), 113–38, reprinted in his Fact, Fiction, and Forecast (Cambridge: Harvard University Press, 1983); Lewis (fn. 14); Ernest Sosa, ed., Causation and Conditionals (Oxford: Oxford University Press, 1975). Part of the philosophical interest in counterfactuals arises from their bearing on key issues in the philosophy of science. See Frederick Suppe, “The Search for Philosophic Understanding of Scientific Theories,” in Suppe, ed., The Structure of Scientific Theories (Urbana: University of Illinois Press, 1977), 3–232, at 36–45, and references cited there; Ernest Nagel, The Structure of Science (New York: Harcourt, Brace, and World, 1961). Nagel (chap. 15) also saw that counterfactuals play a key role in historical explanation.

59 Martin (fn. 20). See also Martin, “Beyond Positivism: A Research Program for Philosophy of History,” Philosophy of Science 48 (1981), 112–21; and idem, “Singular Causal Explanation,” Theory and Decision 2 (1972), 221–37.

60 Edward Hallett Carr, What Is History? (New York: Knopf, 1962); Gaddis (fn. 38).

61 On these, see Tom Beauchamp and Alexander Rosenberg, Hume and the Problem of Causation (Oxford: Oxford University Press, 1981).

62 For example, Luebbert (fn. 8) might have distinguished more carefully between the conditions prevailing in particular countries that allowed the causes of regime type—coalition membership—to operate as they did. On related philosophical distinctions between causes and conditions, see J. L. Mackie, “Causes and Conditions,” in Sosa (fn. 57), 15–38; and Martin (fn. 59, 1981, 1972).

63 Carr (fn. 60), citing Churchill.

64 A third suggestion for resolving this problem would be to add a condition of temporal or causal proximity to P2; that is, A is a cause of B if P2 is true and A precedes B by a relatively short time period, or if the causal chain is not too long. But this raises the question of how long is too long.

67 Elster (fn. 56, 1983), 38.

69 For related criticisms of Elster's notion of counterfactual legitimacy, see Barry (fn. 56); and Lukes (fn. 56).

70 Goodman (fn. 57, 1983), 15–17. See also references in fn. 14. Goodman points out that it is quite problematic to use a counterfactual to define general truth conditions for counterfactuals. See Mackie (fn. 14) for a possible way around this problem (which at any rate may be of greater interest to philosophers than to political scientists).

I should note that Elster (fn. 10) is well aware of the issue of cotenability, which he refers to as “compossibility” (p. 177) and also “compatibility” (p. 183). Indeed, his “branching worlds” theory for assessing the truth of counterfactuals can be seen as a suggestion for assessing cotenability.

71 Note the similarity of the cotenability condition to P1, the key assumption justifying a causal interpretation of regression coefficients derived from quasi-experimental data. The likeness underscores the point that quasi experiments and the counterfactual strategy share reliance on counterfactual suppositions.

72 This suggestion is influenced by examples provided by McGeorge Bundy (fn. 7), and by Gaddis's discussion of them (fn. 38).

73 The point that a variable is distinct from any particular realization of it should be obvious but is sometimes missed. The point that the variance explained might be defined across actual or counterfactual cases is rarely seen.

74 For example, “if that match had been struck, it probably would have lit” will not be controversial in most circumstances. Neither are the counterfactual arguments implied by Luebbert (fn. 8) on why the fascist coalition did not develop in Norway and Denmark.