Confirmation

1. Confirmation by instances

In a seminal essay on induction, Jean Nicod (1924) offered the following important remark:

Consider the formula or the law: \(F\) entails \(G\). How can a particular proposition, or more briefly, a fact affect its probability? If this fact consists of the presence of \(G\) in a case of \(F\), it is favourable to the law […]; on the contrary, if it consists of the absence of \(G\) in a case of \(F\), it is unfavourable to this law. (219, notation slightly adapted)

Nicod’s work was an influential source for Carl Gustav Hempel’s (1943, 1945) early studies in the logic of confirmation. In Hempel’s view, the key valid message of Nicod’s statement is that the observation report that an object \(a\) displays properties \(F\) and \(G\) (e.g., that \(a\) is a swan and is white) confirms the universal hypothesis that all \(F\)-objects are \(G\)-objects (namely, that all swans are white). Apparently, it is by means of this kind of confirmation by instances that one can obtain supporting evidence for statements such as “sodium salts burn yellow”, “wolves live in a pack”, or “planets move in elliptical orbits” (also see Russell 1912, Ch. 6). We will now see the essential features of Hempel’s analysis of confirmation.

1.1 Hempel’s theory

Hempel’s theory addresses the non-deductive relation of confirmation between evidence and hypothesis, but relies thoroughly on standard logic for its full technical formulation. As a consequence, it also goes beyond Nicod’s idea in terms of clarity and rigor.

Let \(\bL\) be the set of the closed sentences of a first-order logical language \(L\) (finite, for simplicity) and consider \(h, e \in \bL\). Also let \(e\), the evidence statement, be consistent and contain individual constants only (no quantifier), and let \(I(e)\) be the set of all constants occurring (non-vacuously) in \(e\). So, for example, if \(e = Qa \wedge Ra\), then \(I(e) = \{a\}\), and if \(e = Qa \wedge Qb\), then \(I(e) = \{a,b\}\). (The non-vacuity clause is meant to ensure that if sentence \(e\) happens to be, say, \(Qa \wedge Qb \wedge (Rc \vee \neg Rc)\), then \(I(e)\) still is \(\{a, b\}\), for \(e\) does not really state anything non-trivial about the individual denoted by \(c\). See Sprenger 2011a, 241–242.) Hempel’s theory relies on the technical construct of the _development_ of hypothesis \(h\) for evidence \(e\), or the \(e\)-development of \(h\), indicated by \(dev_{e}(h)\). Intuitively, \(dev_{e}(h)\) is all that (and only what) \(h\) says once restricted to the individuals mentioned (non-vacuously) in \(e\), i.e., exactly those denoted by the elements of \(I(e)\).

The notion of the \(e\)-development of hypothesis \(h\) can be given an entirely general and precise definition, but we’ll not need this level of detail here. Suffice it to say that the \(e\)-development of a universally quantified material conditional \(\forall x(Fx \rightarrow Gx)\) is just as expected, that is: \(Fa \rightarrow Ga\) in case \(I(e) = \{a\}\); \((Fa \rightarrow Ga) \wedge (Fb \rightarrow Gb)\) in case \(I(e) = \{a,b\}\), and so on. Following Hempel, we will take universally quantified material conditionals as canonical logical representations of relevant hypotheses. So, for instance, we will count a statement of the form \(\forall x(Fx \rightarrow Gx)\) as an adequate rendition of, say, “all pieces of copper conduct electricity”.
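As a rough illustration of this special case only (not of Hempel’s general definition), the \(e\)-development of a universally quantified conditional can be generated mechanically from the constants in \(I(e)\). The function name `dev_universal_conditional` and the plain-text connectives `->` and `&` are assumptions of this sketch.

```python
def dev_universal_conditional(antecedent, consequent, constants):
    """Build the e-development of 'forall x (Fx -> Gx)' as a string.

    `constants` plays the role of I(e): one conditional instance is
    produced per constant, and the instances are conjoined.
    """
    instances = [f"({antecedent}{c} -> {consequent}{c})" for c in sorted(constants)]
    return " & ".join(instances)


dev_a = dev_universal_conditional("F", "G", {"a"})
dev_ab = dev_universal_conditional("F", "G", {"a", "b"})
print(dev_a)   # (Fa -> Ga)
print(dev_ab)  # (Fa -> Ga) & (Fb -> Gb)
```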

In Hempel’s theory, evidence statement \(e\) is said to confirm hypothesis \(h\) just in case it entails, not \(h\) in its full extension, but suitable instantiations of \(h\). The technical notion of the \(e\)-development of \(h\) is devised to identify precisely those relevant instantiations, that is, the consequences of \(h\) as restricted to the individuals involved in \(e\). More precisely, Hempelian confirmation can be defined as follows:

Hempelian confirmation
For any \(h,e \in \bL\) such that \(e\) is consistent and contains individual constants only (no quantifier):

  1. evidence \(e\) directly Hempel-confirms hypothesis \(h\) if and only if \(e \vDash dev_{e}(h)\); \(e\) _Hempel-confirms_ \(h\) if and only if, for some \(s \in \bL\), \(e \vDash dev_{e}(s)\) and \(s \vDash h\);
  2. evidence \(e\) directly Hempel-disconfirms hypothesis \(h\) if and only if \(e \vDash dev_{e}(\neg h)\); \(e\) Hempel-disconfirms \(h\) if and only if, for some \(s \in \bL\), \(e \vDash dev_{e}(s)\) and \(s \vDash \neg h\);
  3. evidence \(e\) is Hempel-neutral for hypothesis \(h\) otherwise.

In each of clauses 1 and 2, Hempelian confirmation (disconfirmation, respectively) is a generalization of _direct_ Hempelian confirmation (disconfirmation). To retrieve the latter as a special case of the former, one only has to posit \(s = h\) (\(s = \neg h\), respectively, for disconfirmation).

By direct Hempelian confirmation, evidence statement \(e\) that, say, object \(a\) is a white swan, \(swan(a) \wedge white(a)\), confirms hypothesis \(h\) that all swans are white, \(\forall x(swan(x) \rightarrow white(x))\), because the former entails the \(e\)-development of the latter, that is, \(swan(a) \rightarrow white(a)\). This is a desired result, according to Hempel’s reading of Nicod. By (indirect) Hempelian confirmation, moreover, \(swan(a) \wedge white(a)\) also confirms that a particular further object \(b\) will be white, if it’s a swan, i.e., \(swan(b) \rightarrow white(b)\) (to see this, just set \(s = \forall x(swan(x) \rightarrow white(x))\)).

The second possibility considered by Nicod (“the _absence_ of \(G\) in a case of \(F\,\)”) can be accounted for by Hempelian disconfirmation. For the evidence statement \(e\) that \(a\) is a non-white swan, \(swan(a) \wedge \neg white(a)\), entails (in fact, is identical to) the \(e\)-development of the hypothesis that there exist non-white swans, \(\exists x(swan(x) \wedge \neg white(x))\), which in turn is just the negation of \(\forall x(swan(x) \rightarrow white(x))\). So the latter is disconfirmed by the evidence in this case. And finally, \(e = swan(a) \wedge \neg white(a)\) also Hempel-disconfirms that a particular further object \(b\) will be white, if it’s a swan, i.e., \(swan(b) \rightarrow white(b)\), because the negation of the latter, \(swan(b) \wedge \neg white(b)\), is entailed by \(s = \forall x(swan(x) \wedge \neg white(x))\) and \(e \vDash dev_{e}(s)\).
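The swan cases discussed so far can be checked by brute force. The following is a minimal sketch, assuming a two-element domain \(\{a, b\}\) over which the quantified sentences are grounded as propositional formulas; the helper names (`entails`, `swan_white`, `swan_nonwhite`) are hypothetical and the truth-table test stands in for first-order entailment only under that assumption.

```python
from itertools import product

ATOMS = ["swan_a", "white_a", "swan_b", "white_b"]


def entails(premise, conclusion, atoms=ATOMS):
    """True iff every valuation satisfying `premise` satisfies `conclusion`."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if premise(v) and not conclusion(v):
            return False
    return True


def swan_white(c):
    """Sentence swan(c) -> white(c)."""
    return lambda v: (not v[f"swan_{c}"]) or v[f"white_{c}"]


def swan_nonwhite(c):
    """Sentence swan(c) & ~white(c)."""
    return lambda v: v[f"swan_{c}"] and not v[f"white_{c}"]


def e_white(v):
    """Evidence: a is a white swan."""
    return v["swan_a"] and v["white_a"]


e_nonwhite = swan_nonwhite("a")   # evidence: a is a non-white swan

# Developments restricted to I(e) = {a}.
dev_h = swan_white("a")           # dev_e(forall x (swan -> white))
dev_not_h = swan_nonwhite("a")    # dev_e(exists x (swan & ~white))
dev_s = swan_nonwhite("a")        # dev_e(forall x (swan & ~white))


# Quantified sentences grounded over the assumed domain {a, b}.
def h_ab(v):
    return swan_white("a")(v) and swan_white("b")(v)


def s_ab(v):
    return swan_nonwhite("a")(v) and swan_nonwhite("b")(v)


def not_prediction_b(v):
    return not swan_white("b")(v)


results = {
    "direct confirmation of h": entails(e_white, dev_h),
    "confirmation of swan(b) -> white(b)":
        entails(e_white, dev_h) and entails(h_ab, swan_white("b")),
    "direct disconfirmation of h": entails(e_nonwhite, dev_not_h),
    "disconfirmation of swan(b) -> white(b)":
        entails(e_nonwhite, dev_s) and entails(s_ab, not_prediction_b),
}
for label, verdict in results.items():
    print(label, verdict)
```

Each verdict comes out true under these assumptions, whereas the white-swan report does not entail the development of the negated hypothesis.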

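So, to sum up, we have four illustrations of how Hempel’s theory articulates Nicod’s basic ideas, to wit:

  1. \(swan(a) \wedge white(a)\) directly Hempel-confirms \(\forall x(swan(x) \rightarrow white(x))\);
  2. \(swan(a) \wedge white(a)\) Hempel-confirms \(swan(b) \rightarrow white(b)\);
  3. \(swan(a) \wedge \neg white(a)\) directly Hempel-disconfirms \(\forall x(swan(x) \rightarrow white(x))\);
  4. \(swan(a) \wedge \neg white(a)\) Hempel-disconfirms \(swan(b) \rightarrow white(b)\).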

1.2 Two paradoxes and other difficulties

The ravens paradox (Hempel 1937, 1945). Consider the following statements:

(\(h\)) \(\forall x(raven(x) \rightarrow black(x))\), i.e., all ravens are black;

(\(e\)) \(raven(a) \wedge black(a)\), i.e., \(a\) is a black raven;

(\(e^*\)) \(\neg black(a^*) \wedge \neg raven(a^*)\), i.e., \(a^*\) is a non-black non-raven (say, a green apple).

Is hypothesis \(h\) confirmed by \(e\) and \(e^*\) alike? That is, is the claim that all ravens are black equally confirmed by the observation of a black raven and by the observation of a non-black non-raven (e.g., a green apple)? One would want to say no, but Hempel’s theory is unable to draw this distinction. Let’s see why.

As we know, \(e\) (directly) Hempel-confirms \(h\), according to Hempel’s reconstruction of Nicod. By the same token, \(e^*\) (directly) Hempel-confirms the hypothesis that all non-black objects are non-ravens, i.e., \(h^* = \forall x(\neg black(x) \rightarrow \neg raven(x))\). But \(h^* \vDash h\) (\(h\) and \(h^*\) are just logically equivalent). So, \(e^*\) (the observation report of a non-black non-raven), like \(e\) (black raven), does (indirectly) Hempel-confirm \(h\) (all ravens are black). Indeed, as \(\neg raven(a)\) entails \(raven(a) \rightarrow black(a)\), it can be shown that \(h\) is (directly) Hempel-confirmed by the observation of _any_ object that is not a raven (an apple, a cat, a shoe), apparently disclosing puzzling “prospects for indoor ornithology” (Goodman 1955, 71).
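The entailments behind the paradox are simple enough to verify with a truth table. In this sketch the two atoms `raven` and `black` describe the single individual mentioned in each evidence report, and `dev_h` is the development of "all ravens are black" for that individual; all names are illustrative assumptions.

```python
from itertools import product

ATOMS = ["raven", "black"]  # properties of the one individual in the evidence


def entails(premise, conclusion, atoms=ATOMS):
    """True iff every valuation satisfying `premise` satisfies `conclusion`."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if premise(v) and not conclusion(v):
            return False
    return True


def dev_h(v):
    """raven -> black: development of 'all ravens are black'."""
    return (not v["raven"]) or v["black"]


def dev_h_star(v):
    """~black -> ~raven: development of 'all non-black objects are non-ravens'."""
    return v["black"] or (not v["raven"])


evidence = {
    "black raven": lambda v: v["raven"] and v["black"],
    "non-black non-raven": lambda v: (not v["black"]) and (not v["raven"]),
    "any non-raven": lambda v: not v["raven"],
    "non-black raven": lambda v: v["raven"] and (not v["black"]),
}

# Which reports entail the development of h, i.e. directly Hempel-confirm h?
verdicts = {name: entails(e, dev_h) for name, e in evidence.items()}
developments_equivalent = entails(dev_h, dev_h_star) and entails(dev_h_star, dev_h)
print(verdicts, developments_equivalent)
```

Only the report of a non-black raven fails to entail the development, which is the formal core of the "indoor ornithology" worry.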

\(Blite\) (Goodman 1955). Consider the peculiar predicate “\(blite\)”, defined as follows: an object is blite just in case (i) it is black if examined at some moment \(t\) up to some future time \(T\) (say, the next expected appearance of Halley’s comet, in 2061) and (ii) it is white if examined afterwards. So we posit \(blite(x) \equiv (ex_{t\le T}(x) \rightarrow black(x)) \wedge (\neg ex_{t\le T}(x) \rightarrow white(x))\). Now consider the following statements:

(\(h\)) \(\forall x(raven(x) \rightarrow black(x))\), i.e., all ravens are black;

(\(h^*\)) \(\forall x(raven(x) \rightarrow blite(x))\), i.e., all ravens are blite;

(\(e\)) \(e = raven(a) \wedge ex_{t\le T}(a) \wedge black(a)\), i.e., \(a\) is a raven observed no later than \(T\) and it is black.

Does \(e\) confirm hypotheses \(h\) and \(h^*\) alike? That is, does the observation of a black raven before \(T\) confirm the claim that all ravens are black just as much as the claim that all ravens are blite? Here again, one would want to say no, but Hempel’s theory is unable to draw the distinction. For one can check that the \(e\)-developments of \(h\) and \(h^*\) are both entailed by \(e\). Thus, \(e\) (the report of a raven examined no later than \(T\) and found to be black) does Hempel-confirm \(h^*\) (all ravens are blite) just as it confirms \(h\) (all ravens are black). Moreover, \(e\) also Hempel-confirms the statement that a raven will be white if examined after \(T\), because this is a logical consequence of \(h^*\) (which is directly Hempel-confirmed by \(e\)). And finally, suppose that \(blurple(x) \equiv (ex_{t\le T}(x) \rightarrow black(x)) \wedge (\neg ex_{t\le T}(x) \rightarrow purple(x)).\) We then have that the very same evidence statement \(e\) Hempel-confirms the hypothesis that all ravens are blurple, and thus also its implication that a raven will be \(purple\) if examined after \(T\)!
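The claim that \(e\) entails the \(e\)-developments of all these hypotheses can be checked directly. The sketch below assumes five atoms describing the individual \(a\) (`ex` abbreviates \(ex_{t\le T}(a)\)); the helpers `time_indexed` and `dev_all_ravens` are hypothetical names, and the "all ravens are white" entry is added purely as a contrast case.

```python
from itertools import product

ATOMS = ["raven", "ex", "black", "white", "purple"]


def entails(premise, conclusion, atoms=ATOMS):
    """True iff every valuation satisfying `premise` satisfies `conclusion`."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if premise(v) and not conclusion(v):
            return False
    return True


def time_indexed(later_colour):
    """(ex -> black) & (~ex -> later_colour), as in the definition of blite."""
    return lambda v: ((not v["ex"]) or v["black"]) and (v["ex"] or v[later_colour])


def dev_all_ravens(predicate):
    """Development for a of 'all ravens are P': raven -> P."""
    return lambda v: (not v["raven"]) or predicate(v)


def e(v):
    """Evidence: a is a raven examined no later than T and found black."""
    return v["raven"] and v["ex"] and v["black"]


predicates = {
    "black": lambda v: v["black"],
    "blite": time_indexed("white"),
    "blurple": time_indexed("purple"),
    "white": lambda v: v["white"],   # contrast case, not from the text
}

confirmed = {name: entails(e, dev_all_ravens(p)) for name, p in predicates.items()}
print(confirmed)
```

Under these assumptions the black, blite and blurple hypotheses are all directly Hempel-confirmed by the same report, while the contrast hypothesis is not.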

A seemingly obvious idea, here, is that there must be something inherently wrong with predicates such as \(blite\) or \(blurple\) (and perhaps non-raven and non-black, too) and thus a principled way to rule them out as “unnatural”. Then one could restrict confirmation theory accordingly, i.e., to “natural kinds” only (see, e.g., Quine 1970). Yet this point turns out to be very difficult to pursue coherently and it has not borne much fruit in this discussion (Rinard 2014 is a recent exception). After all, for all we know, it is a perfectly “natural” feature of a token of the “natural kind” water that it is found in one physical state for temperatures below 0 degrees Celsius and in an entirely different state for temperatures above that threshold. So why should the time threshold \(T\) in \(blite\) or \(blurple\) be a reason to dismiss those predicates? (The water example comes from Howson 2000, 31–32. See Schwartz 2011, 399 ff., for a more general assessment of this issue.)

The above, widely known “paradoxes” then suggest that Hempel’s analysis of confirmation is too liberal: it sanctions the existence of confirmation relations that are intuitively very unsound (see Earman and Salmon 1992, 54, and Sprenger 2011a, 243, for more on this). Yet the Hempelian notion of confirmation turns out to be very restrictive, too, on other accounts. For suppose that hypothesis \(h\) and evidence \(e\) do not share any piece of non-logical vocabulary. \(h\) might be, say, Newton’s law of universal gravitation (connecting force, distances and masses), while \(e\) could be the description of certain spots on a telescopic image. Throughout modern physics, significant relations of confirmation and disconfirmation were taken to obtain between statements like these. Indeed, telescopic sightings have been crucial evidence for Newton’s law as applied to celestial bodies. However, as their non-logical vocabularies are disjoint, \(e\) and \(h\) must simply be logically independent, and so must be \(e\) and \(dev_{e}(h)\) (with very minor caveats, this follows from Craig’s so-called interpolation theorem, see Craig 1957). In such circumstances, there can be nothing but Hempel-neutrality between evidence and hypothesis. So Hempel’s original theory seems to lack the resources to capture a key feature of inductive inference in science as well as in several other domains, i.e., the kind of “vertical” relationships of confirmation (and disconfirmation) between the description of observed phenomena and hypotheses concerning underlying structures, causes, and processes.

To overcome the latter difficulty, Clark Glymour (1980a) embedded a refined version of Hempelian confirmation by instances in his analysis of scientific reasoning. In Glymour’s revision, hypothesis \(h\) is confirmed by some evidence \(e\) even if appropriate auxiliary hypotheses and assumptions must be involved for \(e\) to entail the relevant instances of \(h\). This important theoretical move turns confirmation into a _three_-place relation concerning the evidence, the target hypothesis, and (a conjunction of) auxiliaries. Originally, Glymour presented his sophisticated neo-Hempelian approach in stark contrast with the competing traditional view of so-called _hypothetico-deductivism_ (HD). Despite his explicit intentions, however, several commentators have pointed out that, partly because of the due recognition of the role of auxiliary assumptions, Glymour’s proposal and HD end up being plagued by similar difficulties (see, e.g., Horwich 1983, Woodward 1983, and Worrall 1982). In the next section, we will discuss the HD framework for confirmation and also compare it with Hempelian confirmation. It will thus be convenient to have a suitable extended definition of the latter, following the remarks above. Here is one that serves our purposes:

Hempelian confirmation (extended)
For any \(h, e,k \in \bL\) such that \(e\) contains individual constants only (no quantifier), \(k = dev_{e}(\alpha)\) for some \(\alpha \in \bL\) containing quantifiers only (no individual constant) and such that \(\alpha \not\vDash h\), and \(e\wedge k\) is consistent:

  1. \(e\) directly Hempel-confirms \(h\) relative to \(k\) if and only if \(e\wedge k \vDash dev_{e}(h)\); \(e\) Hempel-confirms \(h\) relative to \(k\) if and only if, for some \(s \in \bL\), \(e\wedge k \vDash dev_{e}(s)\) and \(s\wedge k \vDash h\);
  2. \(e\) directly Hempel-disconfirms \(h\) relative to \(k\) if and only if \(e\wedge k \vDash dev_{e}(\neg h)\); \(e\) Hempel-disconfirms \(h\) relative to \(k\) if and only if, for some \(s\in \bL\), \(e\wedge k \vDash dev_{e}(s)\) and \(s\wedge k \vDash \neg h\);
  3. \(e\) is Hempel-neutral for \(h\) _relative to_ \(k\) otherwise.

One can see that in the above definition the auxiliary assumptions in \(k\) are the \(e\)-development of further closed constant-free hypotheses (in fact, equations as applied to specific measured values, in typical examples from Glymour 1980a), where such hypotheses are meant to be conjoined in a single statement (\(\alpha\)) for convenience. This implies that the only terms occurring (non-vacuously) in \(k\) are individual constants already occurring (non-vacuously) in \(e\). For an empty \(\alpha\) (that is, tautologous: \(\alpha = \top\)), \(k\) must be empty too, and the original (restricted) definition of Hempelian confirmation applies. As for the proviso that \(\alpha \not\vDash h\), it rules out undesired cases of circularity—akin to so-called “macho” bootstrap confirmation, as discussed in Earman and Glymour 1988 (for more on Glymour’s theory and its developments, see Douven and Meijs 2006, and references therein).

2. Hypothetico-deductivism

The central idea of hypothetico-deductive (HD) confirmation can be roughly described as “deduction-in-reverse”: evidence is said to confirm a hypothesis in case the latter, while not entailed by the former, is able to entail it, with the help of suitable auxiliary hypotheses and assumptions. The basic version (sometimes labelled “naïve”) of the HD notion of confirmation can be spelled out thus:

HD-confirmation
For any \(h, e, k \in \bL\) such that \(h\wedge k\) is consistent:

  1. \(e\) HD-confirms \(h\) relative to \(k\) if and only if \(h\wedge k \vDash e\) and \(k \not\vDash e\);
  2. \(e\) HD-disconfirms \(h\) relative to \(k\) if and only if \(h\wedge k \vDash \neg e\), and \(k \not\vDash \neg e\);
  3. \(e\) is HD-neutral for hypothesis \(h\) relative to \(k\) otherwise.

Note that clause 2 above represents HD-disconfirmation as plain logical inconsistency of the target hypothesis with the data, given the auxiliaries (see Hempel 1945, 98).
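The three clauses of the HD definition translate almost literally into a small classifier. The following is a sketch under stated assumptions: formulas are Python functions over truth assignments, entailment is tested by truth table, and the law \(\forall x(Fx \rightarrow Gx)\) is represented only by its instance for a single constant \(a\). The function `hd_status` and the example sentences are illustrative, not part of the source.

```python
from itertools import product


def valuations(atoms):
    """Yield every truth assignment to the atomic sentences."""
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))


def entails(premise, conclusion, atoms):
    """Classical entailment, checked by brute force over all valuations."""
    return all(conclusion(v) for v in valuations(atoms) if premise(v))


def hd_status(h, e, k, atoms):
    """Classify e as confirming, disconfirming or neutral for h relative to k."""
    def h_and_k(v):
        return h(v) and k(v)

    def not_e(v):
        return not e(v)

    if not any(h_and_k(v) for v in valuations(atoms)):
        raise ValueError("h & k must be consistent")
    if entails(h_and_k, e, atoms) and not entails(k, e, atoms):
        return "confirms"
    if entails(h_and_k, not_e, atoms) and not entails(k, not_e, atoms):
        return "disconfirms"
    return "neutral"


ATOMS = ["Fa", "Ga"]


def law_instance(v):   # Fa -> Ga
    return (not v["Fa"]) or v["Ga"]


def is_F(v):           # auxiliary assumption: a is an F
    return v["Fa"]


def is_G(v):
    return v["Ga"]


def is_not_G(v):
    return not v["Ga"]


def top(v):            # empty (tautologous) background
    return True


status_G = hd_status(law_instance, is_G, is_F, ATOMS)
status_not_G = hd_status(law_instance, is_not_G, is_F, ATOMS)
status_no_aux = hd_status(law_instance, is_G, top, ATOMS)
print(status_G, status_not_G, status_no_aux)  # confirms disconfirms neutral
```

The consistency requirement on \(h\wedge k\) guarantees that the first two clauses can never both apply, so the order in which they are tested does not matter.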

2.1 HD vs. Hempelian confirmation

HD-confirmation and Hempelian confirmation convey different intuitions (see Huber 2008a for an original analysis). They are, in fact, distinct and strictly incompatible notions. This will be effectively illustrated by the consideration of the following conditions.

Entailment condition (EC)
For any \(h,e,k \in \bL\), if \(e\wedge k\) is consistent, \(e\wedge k \vDash h\) and \(k \not\vDash h\), then \(e\) confirms \(h\) relative to \(k\).

Confirmation complementarity (CC)
For any \(h, e, k \in \bL\), \(e\) confirms \(h\) relative to \(k\) if and only if \(e\) disconfirms \(\neg h\) relative to \(k\).

Special consequence condition (SCC)
For any \(h, e, k \in \bL\), if \(e\) confirms \(h\) relative to \(k\) and \(h\wedge k \vDash h^*\), then \(e\) confirms \(h^*\) relative to \(k\).

On the implicit proviso that \(k\) is empty (that is, tautologous: \(k = \top\)), Hempel (1943, 1945) himself had put forward (EC) and (SCC) as compelling adequacy conditions for any theory of confirmation, and devised his own proposal accordingly. As for (CC), he took it as a plain definitional truth (1943, 127). Moreover, Hempelian confirmation (extended) satisfies all conditions above (of course, for arguments \(h\), \(e\) and \(k\) for which it is defined). HD-confirmation, on the contrary, violates all of them. Let us briefly discuss each one in turn.

It is rather common for a theory of ampliative (non-deductive) reasoning to retain classical logical entailment as a special case (a feature sometimes called “super-classicality”; see Strasser and Antonelli 2019). That’s essentially what (EC) implies for confirmation. Now given appropriate \(e\), \(h\) and \(k\), if \(e\wedge k\) entails \(h\), we readily get that \(e\) Hempel-confirms \(h\) relative to \(k\) in two simple steps. First, given that \(e\) and \(k\) are both quantifier-free, \(dev_{e}(e\wedge k) = e\wedge k\) according to Hempel’s full definition of \(dev\) (see Hempel 1943, 131). Then it trivially follows that \(e\wedge k \vDash dev_{e}(e\wedge k)\), so \(e\wedge k\) is (directly) Hempel-confirmed and its logical consequence \(h\) is likewise confirmed (indirectly). Logical entailment is thus retained as an instance of Hempelian confirmation in a fairly straightforward way. HD-confirmation, on the contrary, does not fulfil (EC). Here is one odd example (see Sprenger 2011a, 234). With \(k = \top\), just let \(e\) be the observation report that object \(a\) is a black swan, \(swan(a) \wedge black(a)\), and \(h\) be the hypothesis that black swans exist, \(\exists x(swan(x) \wedge black(x))\). Evidence \(e\) verifies \(h\) conclusively, and yet it does not HD-confirm it, simply because \(h \not\vDash e\). So the observation of a black swan turns out to be HD-neutral for the hypothesis that black swans exist! The same example shows how HD-confirmation violates (CC), too. In fact, while HD-neutral for \(h\), \(e\) HD-disconfirms its negation \(\neg h\) that no swan is black, \(\forall x(swan(x) \rightarrow \neg black(x))\), because the latter is obviously inconsistent with (refuted by) \(e\).
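The black swan example can be replayed mechanically. This sketch assumes a two-element domain \(\{a, b\}\), so that "black swans exist" is grounded as a disjunction over the two individuals; the helper names are hypothetical, and the finite domain is a simplification of the first-order setting.

```python
from itertools import product

ATOMS = ["swan_a", "black_a", "swan_b", "black_b"]


def valuations():
    for values in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def entails(premise, conclusion):
    return all(conclusion(v) for v in valuations() if premise(v))


def hd_confirms(h, e, k):
    """h & k consistent, h & k entails e, and k alone does not entail e."""
    def h_and_k(v):
        return h(v) and k(v)
    return (any(h_and_k(v) for v in valuations())
            and entails(h_and_k, e) and not entails(k, e))


def hd_disconfirms(h, e, k):
    """Same test applied to the negation of e."""
    return hd_confirms(h, lambda v: not e(v), k)


def top(v):             # empty background k
    return True


def e(v):               # a is a black swan
    return v["swan_a"] and v["black_a"]


def h(v):               # black swans exist (grounded over {a, b})
    return (v["swan_a"] and v["black_a"]) or (v["swan_b"] and v["black_b"])


def not_h(v):           # no swan is black
    return not h(v)


e_verifies_h = entails(e, h)                       # EC would demand confirmation
e_hd_confirms_h = hd_confirms(h, e, top)           # but HD says no
e_hd_disconfirms_h = hd_disconfirms(h, e, top)     # so e is HD-neutral for h
e_hd_disconfirms_not_h = hd_disconfirms(not_h, e, top)
print(e_verifies_h, e_hd_confirms_h, e_hd_disconfirms_h, e_hd_disconfirms_not_h)
```

The last two values show the asymmetry that breaks (CC): the report is neutral for \(h\) yet disconfirms \(\neg h\).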

The violation of (EC) and (CC) by HD-confirmation is arguably a reason for concern, for those conditions seem highly plausible. The special consequence condition (SCC), on the other hand, deserves separate and careful consideration. As we will see later on, (SCC) is a strong constraint, and far from sacrosanct. For now, let us point out one major philosophical motivation in its favor. (SCC) has often been invoked as a means to ensure the fulfilment of the following condition (see, e.g., Hesse 1975, 88; Horwich 1983, 57):

Predictive inference condition (PIC)
For any \(e, k \in \bL\), if \(e\) confirms \(\forall x(Fx \rightarrow Gx)\) relative to \(k\), then \(e\) confirms \(F(a) \rightarrow G(a)\) relative to \(k\).

In fact, (PIC) readily follows from (SCC) and so it is satisfied by Hempel’s theory. It says that, if evidence \(e\) confirms “all \(F\)s are \(G\)s”, then it also confirms that a further object will be \(G\), if it is \(F\). Notably, this does not hold for HD-confirmation. Here is why. Given \(k = Fa\) (i.e., the assumption that \(a\) comes from the \(F\) population), we have that \(e = Ga\) HD-confirms \(h = \forall x(Fx \rightarrow Gx)\), because the latter entails the former (given \(k\)). (That’s the HD reconstruction of Nicod’s insight, see below.) We also have, of course, that \(h\) entails \(h^* = Fb \rightarrow Gb\). And yet, contrary to (PIC), since \(h^*\) does not entail \(e\) (given \(k\)), it is not HD-confirmed by it either. The troubling conclusion is that the observation that a swan is white (or that a million of them are, for that matter) does not HD-confirm the prediction that a further swan will be found to be white.
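The failure of (PIC) for HD-confirmation can be reproduced with the very sentences used in this paragraph. The sketch assumes that \(h\) is grounded over the two constants \(a\) and \(b\) and that entailment is tested by truth table; the function names are illustrative.

```python
from itertools import product

ATOMS = ["Fa", "Ga", "Fb", "Gb"]


def valuations():
    for values in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def entails(premise, conclusion):
    return all(conclusion(v) for v in valuations() if premise(v))


def hd_confirms(h, e, k):
    """h & k consistent, h & k entails e, and k alone does not entail e."""
    def h_and_k(v):
        return h(v) and k(v)
    return (any(h_and_k(v) for v in valuations())
            and entails(h_and_k, e) and not entails(k, e))


def h(v):        # forall x (Fx -> Gx), grounded over {a, b}
    return ((not v["Fa"]) or v["Ga"]) and ((not v["Fb"]) or v["Gb"])


def h_star(v):   # Fb -> Gb
    return (not v["Fb"]) or v["Gb"]


def k(v):        # Fa
    return v["Fa"]


def e(v):        # Ga
    return v["Ga"]


law_confirmed = hd_confirms(h, e, k)
law_entails_prediction = entails(h, h_star)
prediction_confirmed = hd_confirms(h_star, e, k)
print(law_confirmed, law_entails_prediction, prediction_confirmed)  # True True False
```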

2.2 Back to black (ravens)

One attractive feature of HD-confirmation is that it largely eludes the ravens paradox. As the hypothesis \(h\) that all ravens are black does not entail that some generally sampled object \(a\) will be a black raven, the HD view of confirmation is not committed to the eminently Hempelian implication that \(e = raven(a) \wedge black(a)\) confirms \(h\). Likewise, \(\neg black(a) \wedge \neg raven(a)\) does not HD-confirm that all non-black objects are non-ravens. The derivation of the paradox, as presented above, is thus blocked.

Indeed, HD-confirmation yields a substantially different reading of Nicod’s insight as compared to Hempel’s theory (Okasha 2011 has an important discussion of this distinction). Here is how it goes. If object \(a\) is assumed to have been taken _among ravens_ (so that, crucially, the auxiliary assumption \(k = raven(a)\) is made) and \(a\) is checked for color and found to be black, then, yes, the latter evidence, \(black(a)\), HD-confirms that all ravens are black (\(h\)) relative to \(k\). By the same token, \(\neg black(a)\) HD-disconfirms \(h\) relative to the same assumption \(k = raven(a)\). And, again, this is as it should be, in line with Nicod’s mention of “the absence of \(G\) [here, non-black as evidence] in a case of \(F\) [here, raven as an auxiliary assumption]”. It is also true that an object that is found _not_ to be a raven HD-confirms \(h\), but _only_ relative to \(k = \neg black(a)\), that is, if \(a\) is assumed to have been taken among non-black objects to begin with; and this seems acceptable too (after all, while sampling from non-black objects, one might have found the counterinstance of a raven, but didn’t). Unlike Hempel’s theory, moreover, HD-confirmation does not yield the debatable implication that, by itself (that is, given \(k = \top\)), the observation of a non-raven \(a\), \(\neg raven(a)\), must confirm \(h\).
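The HD verdicts on the ravens case depend entirely on which auxiliary assumption is in place, and this can be tabulated. The sketch below assumes a single sampled object \(a\), with \(h\) represented by its instance \(raven(a) \rightarrow black(a)\); `hd_status` is a hypothetical helper implementing the three HD clauses by truth table.

```python
from itertools import product

ATOMS = ["raven", "black"]   # properties of the sampled object a


def valuations():
    for values in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def entails(premise, conclusion):
    return all(conclusion(v) for v in valuations() if premise(v))


def hd_status(h, e, k):
    """Return 'confirms', 'disconfirms' or 'neutral' for e, h and background k."""
    def h_and_k(v):
        return h(v) and k(v)

    def not_e(v):
        return not e(v)

    if not any(h_and_k(v) for v in valuations()):
        raise ValueError("h & k must be consistent")
    if entails(h_and_k, e) and not entails(k, e):
        return "confirms"
    if entails(h_and_k, not_e) and not entails(k, not_e):
        return "disconfirms"
    return "neutral"


def h(v):                      # raven(a) -> black(a)
    return (not v["raven"]) or v["black"]


sentences = {
    "top": lambda v: True,
    "raven": lambda v: v["raven"],
    "black": lambda v: v["black"],
    "not black": lambda v: not v["black"],
    "not raven": lambda v: not v["raven"],
    "black raven": lambda v: v["raven"] and v["black"],
}

cases = [                      # (evidence, background assumption)
    ("black", "raven"),
    ("not black", "raven"),
    ("not raven", "not black"),
    ("not raven", "top"),
    ("black raven", "top"),
]
table = {(e, k): hd_status(h, sentences[e], sentences[k]) for e, k in cases}
for case, verdict in table.items():
    print(case, verdict)
```

With an empty background both the non-raven report and the black-raven report come out neutral, which is how the HD reading blocks the derivation of the paradox.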

Interestingly, the introduction of auxiliary hypotheses and assumptions shows that the issues surrounding Nicod’s remarks can become surprisingly subtle. Consider the following statements (Maher’s 2006 example):

(\(\alpha_1\)) \(\forall x(white(x) \rightarrow \neg black(x))\)

(\(\alpha_2\)) \(\exists x(swan(x)) \rightarrow \exists y(swan(y) \wedge black(y))\)

\(\alpha_1\) simply specifies that no object is both white and black, while \(\alpha_2\) says that, if there are swans at all, then there also is some black swan. Also posit, again, \(e = swan(a) \wedge white(a)\). Under \(\alpha_1\) and \(\alpha_2\), the observation of a white swan clearly _dis_confirms (indeed, refutes) the hypothesis \(h\) that all swans are white. Hempel’s theory (extended) faces difficulties here, because for \(k = dev_{e}(\alpha_1 \wedge \alpha_2)\) it turns out that \(e\wedge k\) is inconsistent. But HD-confirmation gets this case right, thus capturing appropriate boundary conditions for Nicod’s generally sensible claims. For, with \(k = \alpha_1 \wedge \alpha_2\), one has that \(h\wedge k\) is consistent and entails \(\neg e\) (for it entails that no swan exists), so that \(e\) HD-disconfirms (refutes) \(h\) relative to \(k\) (see Good 1967 for another famous counterexample to Nicod’s condition).
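Both halves of this diagnosis can be checked by enumeration. The sketch assumes a two-element domain \(\{a, b\}\) for grounding \(h\), \(\alpha_1\) and \(\alpha_2\) (a second individual is needed so that the auxiliaries alone do not rule out the evidence), and restricts the auxiliaries to \(\{a\}\) for the Hempelian development; all helper names are hypothetical.

```python
from itertools import product

DOMAIN = ["a", "b"]   # assumed finite domain
ATOMS = [f"{p}_{c}" for c in DOMAIN for p in ("swan", "white", "black")]


def valuations():
    for values in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def entails(premise, conclusion):
    return all(conclusion(v) for v in valuations() if premise(v))


def consistent(formula):
    return any(formula(v) for v in valuations())


def alpha1(v, domain=DOMAIN):
    """Nothing is both white and black."""
    return all(not (v[f"white_{c}"] and v[f"black_{c}"]) for c in domain)


def alpha2(v, domain=DOMAIN):
    """If there are swans at all, some swan is black."""
    some_swan = any(v[f"swan_{c}"] for c in domain)
    some_black_swan = any(v[f"swan_{c}"] and v[f"black_{c}"] for c in domain)
    return (not some_swan) or some_black_swan


def h(v):
    """All swans are white."""
    return all((not v[f"swan_{c}"]) or v[f"white_{c}"] for c in DOMAIN)


def e(v):
    return v["swan_a"] and v["white_a"]


def not_e(v):
    return not e(v)


def k(v):
    return alpha1(v) and alpha2(v)


def h_and_k(v):
    return h(v) and k(v)


# HD route: k is alpha1 & alpha2 itself.
hd_disconfirms = (consistent(h_and_k) and entails(h_and_k, not_e)
                  and not entails(k, not_e))


# Extended Hempelian route: k is the development of alpha1 & alpha2 over {a}.
def e_and_dev_k(v):
    return e(v) and alpha1(v, ["a"]) and alpha2(v, ["a"])


hempel_premise_consistent = consistent(e_and_dev_k)
print(hd_disconfirms, hempel_premise_consistent)  # True False
```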

HD-confirmation, however, is also known to suffer from distinctive “paradoxical” implications. One of the most frustrating is surely the following (see Osherson, Smith, and Shafir 1986, 206, for further specific problems).

The irrelevant conjunction paradox. Suppose that \(e\) confirms \(h\) relative to (possibly empty) \(k\). Let statement \(q\) be logically consistent with \(e\wedge h\wedge k\), but otherwise entirely irrelevant for all of those conjuncts (perhaps belonging to a completely separate domain of inquiry). Does \(e\) confirm \(h\wedge q\) (relative to \(k\)) as it does with \(h\)? One would want to say no, and this implication can be suitably reconstructed in Hempel’s theory. HD-confirmation, on the contrary, cannot draw this distinction: it is easy to show that, on the conditions specified, if the HD clause for confirmation is satisfied for \(e\) and \(h\) (given \(k\)), so it is for \(e\) and \(h\wedge q\) (given \(k\)). (This is simply because, if \(h\wedge k \vDash e\), then \(h\wedge q\wedge k \vDash e\), too, by the monotonicity of classical logical entailment.)

Kuipers (2000, 25) suggested that one can live with the irrelevant conjunction problem because, on the conditions specified, \(e\) would still not HD-confirm \(q\) alone (given \(k\)), so that HD-confirmation can be “localized”: \(h\) is the only bit of the conjunction \(h\wedge q\) that gets any confirmation on its own, as it were. Other authors have been reluctant to bite the bullet and have engaged in technical refinements of the “naïve” HD view. In these proposals, the spread of HD-confirmation upon frivolous conjunctions can be blocked at the expense of some additional logical machinery (see Gemes 1993, 1998; Schurz 1991, 1994).
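Both the irrelevant conjunction problem and Kuipers’ localization point show up in a three-atom example. In this sketch \(h\) is represented by the instance \(Fa \rightarrow Ga\), \(k = Fa\), \(e = Ga\), and `q` is an atom unrelated to the others; the encoding and helper names are assumptions made for illustration.

```python
from itertools import product

ATOMS = ["Fa", "Ga", "q"]   # q: a sentence from an unrelated domain


def valuations():
    for values in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def entails(premise, conclusion):
    return all(conclusion(v) for v in valuations() if premise(v))


def hd_confirms(h, e, k):
    """h & k consistent, h & k entails e, and k alone does not entail e."""
    def h_and_k(v):
        return h(v) and k(v)
    return (any(h_and_k(v) for v in valuations())
            and entails(h_and_k, e) and not entails(k, e))


def h(v):          # Fa -> Ga
    return (not v["Fa"]) or v["Ga"]


def q(v):
    return v["q"]


def h_and_q(v):
    return h(v) and q(v)


def k(v):          # Fa
    return v["Fa"]


def e(v):          # Ga
    return v["Ga"]


confirms_h = hd_confirms(h, e, k)
confirms_h_and_q = hd_confirms(h_and_q, e, k)   # the irrelevant conjunct tags along
confirms_q_alone = hd_confirms(q, e, k)         # but gets no support on its own
print(confirms_h, confirms_h_and_q, confirms_q_alone)  # True True False
```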

Finally, it should be noted that HD-confirmation offers no substantial relief from the blite paradox. On the one hand, \(e = raven(a) \wedge ex_{t\le T}(a) \wedge black(a)\) does not, as such, HD-confirm either \(h = \forall x(raven(x) \rightarrow black(x))\) or \(h^* = \forall x(raven(x) \rightarrow blite(x))\), that is, for empty \(k\). On the other hand, if object \(a\) is assumed to have been sampled from ravens before \(T\) (that is, given \(k = raven(a) \wedge ex_{t\le T}(a)\)), then \(black(a)\) is entailed by both “all ravens are black” and “all ravens are blite” and therefore HD-confirms each of them. So HD-confirmation, too, sanctions the existence of confirmation relations that seem intuitively unsound (indeed, indefinitely many of them: as we know, other variations of \(h^*\) can be conceived at will, like the “blurple” hypothesis). One could insist that HD does handle the blite paradox after all, because \(black(a)\) (given \(k\) as above) does not HD-confirm that a raven will be white if examined after \(T\) (Kuipers 2000, 29 ff.). Unfortunately (as pointed out by Schurz 2005, 148) \(black(a)\) does not HD-confirm that a raven will be black if examined after \(T\) either (again, given \(k\) as above). That’s because, as already pointed out, HD-confirmation fails the predictive inference condition (PIC) in general. So, all in all, HD-confirmation cannot tell black from blite any more than Hempel-confirmation can.

2.3 Underdetermination and the Duhemian challenge

The issues above look contrived and artificial to some people’s taste, even among philosophers. Many have suggested a closer look at real-world inferential practices in the sciences as a more appropriate benchmark for assessment. For one thing, the very idea of hypothetico-deductivism has often been said to stem from the origins of Western science. As reported by Simplicius of Cilicia (sixth century A.D.) in his commentary on Aristotle’s De Caelo, Plato had challenged his pupils to identify combinations of “ordered” motions by which one could account for (namely, deduce) the planets’ wandering trajectories across the heavens as observed from the Earth. As a matter of historical fact, mathematical astronomy has engaged in just this task for centuries: scholars have been trying to define geometrical models from which the apparent motion of celestial bodies would derive.

It is fair to say that, at its roots, the kind of challenges that the HD framework faces with scientific reasoning is not so different from the main puzzles that arise from philosophical considerations of a more formal kind. Still, the two areas turn out to be complementary in important ways. The following statement will serve as a useful starting point to extend the scope of our discussion.

Underdetermination Theorem (UT) for “naïve” HD-confirmation
For any contingent \(h, e \in \bL\), if \(h\) and \(e\) are logically consistent, there exists some \(k \in \bL\) such that \(e\) HD-confirms \(h\) relative to \(k\).

(UT) is an elementary logical fact that has been long recognized (see, e.g., Glymour 1980a, 36). In purely formal terms, just positing \(k = h \rightarrow e\) will do for a proof. To appreciate how (UT) can spark any philosophical interest, one has to combine it with some insightful remarks first put forward by Pierre Duhem (1906) and then famously revived by Quine (1951) in a more radical style. (Indeed, (UT) essentially amounts to the “entailment version” of “Quinean underdetermination” in Laudan 1990, 274.)
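The construction \(k = h \rightarrow e\) is easy to try out on a toy case. The sketch below takes \(h\) and \(e\) to be two logically independent atomic sentences (an assumption chosen for illustration) and checks only this one pair; it is not a general proof of (UT).

```python
from itertools import product

ATOMS = ["h", "e"]   # two unrelated atomic sentences, assumed for illustration


def valuations():
    for values in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def entails(premise, conclusion):
    return all(conclusion(v) for v in valuations() if premise(v))


def hd_confirms(h, e, k):
    """h & k consistent, h & k entails e, and k alone does not entail e."""
    def h_and_k(v):
        return h(v) and k(v)
    return (any(h_and_k(v) for v in valuations())
            and entails(h_and_k, e) and not entails(k, e))


def hyp(v):
    return v["h"]


def evid(v):
    return v["e"]


def top(v):                    # empty background
    return True


def bridge(v):                 # k = h -> e
    return (not v["h"]) or v["e"]


without_auxiliary = hd_confirms(hyp, evid, top)
with_bridge = hd_confirms(hyp, evid, bridge)
print(without_auxiliary, with_bridge)  # False True
```

Nothing connects the two sentences until the bridging conditional is supplied, at which point the naïve HD clause is satisfied.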

Duhem (himself a supporter of the HD view) pointed out that in mature sciences such as physics most hypotheses or theories of real interest cannot be contradicted by any statement describing observable states of affairs. Taken in isolation, they simply do not logically imply, nor rule out, any observable fact, essentially because (unlike “all ravens are black”) they involve the mention of unobservable entities and processes. So, in effect, Duhem emphasized that, typically, scientific hypotheses or theories _are_ logically consistent with any piece of checkable evidence. Unless, of course, the logical connection is underpinned by auxiliary hypotheses and assumptions suitably bridging the gap between the observational and non-observational vocabulary, as it were. But then, once auxiliaries are in play, logic alone guarantees that _some_ \(k\) exists such that \(h\wedge k\) is consistent, \(h\wedge k \vDash e\), and \(k \not\vDash e\), so that confirmation holds in naïve HD terms (that’s just the UT result above). Apparently, when Duhem’s point applies, the uncritical supporter of whatever hypothesis \(h\) can legitimately claim (naïve HD) confirmation from any \(e\) by simply shaping \(k\) conveniently. In this sense, hypothesis assessment would be radically “underdetermined” by any amount of evidence practically available.

Influential authors such as Thomas Kuhn (1962/1970) (but see Laudan 1990, 268, for a more extensive survey) relied on Duhemian insights to suggest that confirmation by empirical evidence is too weak a force to drive the evaluation of theories in science, often inviting conclusions of a relativistic flavor (see Worrall 1996 for an illuminating reconstruction along these lines). Let us briefly consider a classical case, which Duhem himself thoroughly analyzed: the wave vs. particle theories of light in modern optics. Across the decades, wave theorists were able to deduce an impressive list of important empirical facts from their main hypothesis along with appropriate auxiliaries, diffraction phenomena being only one major example. But many particle theorists’ reaction was to retain their hypothesis nonetheless and to reshape other parts of the “theoretical maze” (i.e., \(k\); the term is Popper’s, 1963, p. 330) to recover those observed facts as consequences of their own proposal. And as we’ve seen, if the bare logic of naïve HD were to be taken strictly, surely they could have claimed their overall hypothesis to be confirmed too, just as much as their opponents.

Importantly, they didn’t. In fact, it was quite clear that particle theorists, unlike their wave-theory opponents, were striving to remedy weaknesses rather than scoring successes (see Worrall 1990). But why, then? Because, as Duhem himself clearly realized, the logic of naïve HD “is not the only rule for our judgments” (1906, 217). The lesson of (UT) and the Duhemian insight is not quite, it seems, that naïve HD is the last word and scientific inference is unconstrained by stringent rational principles, but rather that the HD view has to be strengthened in order to capture the real nature of evidential support in rational scientific inference. At least, that’s the position of a good many philosophers of science working within the HD framework broadly construed. It has even been maintained that “no serious twentieth-century methodologist” has ever subscribed to the naïve HD view above “without crucial qualifications” (Laudan 1990, 278; also see Laudan and Leplin 1991, 466).

So the HD approach to confirmation has yielded a number of more articulated variants to meet the challenge of underdetermination. Following (loosely) Norton (2005), we will now survey an instructive sample of them.

2.4 The extended HD menu

Naïve HD can be enriched by a resolute form of _predictivism_. According to this approach, the naïve HD clause for confirmation is too weak because \(e\) must have been predicted in advance from \(h\wedge k\). Karl Popper’s (1934/1959) account of the “corroboration” of hypotheses famously embedded this view, but squarely predictivist stances can be traced back to early modern thinkers like Christiaan Huygens (1629–1695) and Gottfried Wilhelm Leibniz (1646–1716), and can be found in Duhem’s work itself. The predictivist sets a high bar for confirmation. Her favorite examples typically include stunning episodes in which the existence of previously unknown objects, phenomena, or whole classes of them is anticipated: the phases of Venus for Copernican astronomy or the discovery of Neptune for Newtonian physics, all the way up to the Higgs boson for the so-called standard model of subatomic particles.

The predictivist solution to the underdetermination problem is fairly radical: many of the relevant factual consequences of \(h\wedge k\) will be already known when this theory is articulated, and so unfit for confirmation. Critics have objected that predictivism is in fact far too restrictive. There seem to be many cases in which already known phenomena clearly do provide support to a new hypothesis or theory. Zahar (1973) first raised this problem of “old evidence”, then made famous by Glymour (1980a, 85 ff.) as a difficulty for Bayesianism (see Section 3 below). Examples of this kind abound in the history of science as elsewhere, but the textbook illustration has become the precession of Mercury’s perihelion, a lasting anomaly for Newtonian physics: Einstein’s general relativity calculations got this long-known fact right, thereby gaining a remarkable piece of initial support for the new theory. In addition to this problem with old evidence, HD predictivism also seems to lack a principled rationale. After all, the temporal order of the discovery of \(e\) and of the articulation of \(h\) and \(k\) may well be an entirely accidental historical contingency. Why should it bear on the confirmation relationship among them? (See Giere 1983 and Musgrave 1974 for classical discussions of these issues. Douglas and Magnus 2013 and Barnes 2018 offer more recent views and rich lists of further references.)

As a possible response to the difficulties above, naïve HD can be enriched by the use-novelty criterion (UN) instead. The UN reaction to the underdetermination problem is more conservative than the temporal predictivist strategy. According to this view, to improve on the weak naïve HD clause for confirmation one only has to rule out one particular class of cases, i.e., those in which the description of a known fact, \(e\), served as a constraint in the construction of \(h\wedge k\). The UN view thus comes equipped with a rationale. If \(h\wedge k\) was shaped on the basis of \(e\), UN advocates point out, then it was bound to get that state of affairs right; the theory never ran any risk of failure, thus did not achieve any particularly significant success either. Precisely in these cases, and just for this reason, the evidence \(e\) must not be double-counted: by using it for the construction of the theory, its confirmational power becomes “dried out”, so to speak.

The UN completion of naïve HD originated from Lakatos and some of his collaborators (see Lakatos and Zahar 1975 and Worrall 1978; also see Giere 1979, 161–162, and Gillies 1989 for similar views), although important hints in the same direction can be found at least in the work of William Whewell (1840/1847). Consider the touchstone example of Mercury again. According to Zahar (1973), Einstein did not need to rely on the Mercury data to define theory and auxiliaries so as to match observationally correct values for the perihelion precession (also see Norton 2011a; and Earman and Janssen 1993 for a very detailed, and more nuanced, account). Being already known, the fact was of course not predicted in a strictly temporal sense, and yet, on Zahar’s reading, it could have been: it was “use-novel” and thus fresh for use to confirm the theory. For a more mundane illustration, so-called _cross-validation_ techniques represent a routine application of the UN idea in statistical settings (as pointed out by Schurz 2014, 92; also see Forster 2007, 592 ff.). According to some commentators, however, the UN criterion needs further elaboration (see Hitchcock and Sober 2004 and Lipton 2005), while others have criticized it as essentially wrong-headed (see Howson 1990 and Mayo 1991, 2014; also see Votsis 2014).

Yet another way to enrich naïve HD is to combine it with _eliminativism_. According to this view, the naïve HD clause for confirmation is too weak because there must have been a low (enough) objective chance of getting the outcome \(e\) (favorable to \(h\)) if \(h\) was false, so that few possibilities exist that \(e\) may have occurred for some reason other than the truth of \(h\). Briefly put, the occurrence of \(e\) must be such that most alternatives to \(h\) can be safely ruled out. The founding figure of eliminativism is Francis Bacon (1561–1626). John Stuart Mill (1843/1872) is a major representative in later times, and Deborah Mayo’s “error-statistical” approach to hypothesis testing arguably develops this tradition (Mayo 1996 and Mayo and Spanos 2010; see Bird 2010, Kitcher 1993, 219 ff., and Meehl 1990 for other contemporary variations).

Eliminativism is most credible when experimentation is at issue (see, e.g., Guala 2012). Indeed, the appeal to Bacon’s idea of _crucial experiment_ (instantia crucis) and related notions (e.g., “severe testing”) is a fairly reliable mark of eliminativist inclinations. Experimentation is, to a large extent, precisely an array of techniques to keep undesired interfering factors at a minimum by active manipulation and deliberate control (think of the blinding procedure in medical trials, with \(h\) the hypothesized effectiveness of a novel treatment and \(e\) a relative improvement in clinical endpoints for a target subsample of patients thus treated). When this kind of control obtains, popular statistical tools are supposed to allow for the calculation of the probability of \(e\) in case \(h\) is false meant as a “relative frequency in a (real or hypothetical) series of test applications” (Mayo 1991, 529), and to secure a sufficiently low value to validate the positive outcome of the test. It is much less clear how firm a grip this approach can retain when inference takes place at higher levels of generality and theoretical commitment, where the hypothesis space is typically much too poorly ordered to fit routine error-statistical analyses. Indeed, Laudan (1997, 315; also see Musgrave 2010) spotted in this approach the risk of a “balkanization” of scientific reasoning, namely, a restricted focus on scattered pieces of experimental inference (but see Mayo 2010 for a defense).

Naïve HD can also be enriched by the notion of _simplicity_. According to this view, the naïve HD clause for confirmation is too weak because \(h\wedge k\) must be a simple (enough), unified way to account for evidence \(e\). A classic reference for the simplicity view is Newton’s first law of philosophizing in the Principia (“admit no more causes of natural things than such as are both true and sufficient to explain their appearances”), echoing very closely Ockham’s razor. This basic idea has never lost its appeal, even up to recent times (see, e.g., Quine and Ullian 1970, 69 ff.; Sober 1975; Zellner, Keuzenkamp, and McAleer 2002; Scorzato 2013).

Despite Thomas Kuhn’s (1957, 181) suggestions to the contrary, the success of Copernican astronomy over Ptolemy’s system has remained an influential case study fostering the simplicity view (Martens 2009). Moreover, in ordinary scientific problems such as _curve fitting_, formal criteria of model selection are applied where the paucity of parameters can be interpreted naturally as a key dimension of simplicity (Forster and Sober 1994). Traditionally, two main problems have proven pressing, and frustrating, for the simplicity approach. First, how to provide a sufficiently coherent and illuminating explication of this multifaceted and elusive notion (see Riesch 2010); and second, how to justify the role of simplicity as a properly epistemic (rather than merely pragmatic) virtue (see Kelly 2007, 2008).

Finally, naïve HD can be enriched by the appeal to _explanation_. Here, the naïve HD clause for confirmation is meant to be too weak because \(h\wedge k\) must be able (not only to entail, but) to explain \(e\). By this move, the HD approach embeds the slogan of the so-called _inference to the best explanation_ view: “observations support the hypothesis precisely because it would explain them” (Lipton 2000, 185; also see Lipton 2004). Historically, the main source for this connection between explanation and support is found in the work of Charles Sanders Peirce (1839–1914). Janssen (2003) offers a particularly neat contemporary exhibit, explicitly aimed at “curing cases of the Duhem-Quine disease” (484; also see Thagard 1978, and Douven 2017 for a relevant survey). Quite unlike eliminativist approaches, explanationist analyses tend to focus on large-scale theories and relatively high-level kinds of evidence. Dealing with Einstein’s general relativity, for instance, Janssen (2003) greatly emphasizes its explanation of the equivalence of inertial and gravitational mass (essentially a brute fact in Newtonian physics) over the resolution of the puzzle of Mercury’s perihelion. Explanationist accounts are also distinctively well-equipped to address inference patterns from non-experimental sciences (Cleland 2011).

The problems faced by these approaches are similar to those affecting the simplicity view. Agreement is still lacking on the nature of scientific explanation (see Woodward 2019) and it is not clear how far an explanationist variant of HD can go without a sound analysis of that notion. Moreover, some critics have wondered why the relationship of confirmation should be affected by an explanatory connection with the evidence per se (see Salmon 2001).

The above discussion does not display an exhaustive list (nor are the listed options mutually exclusive, for that matter: see, e.g., Baker 2003; also see Worrall 2010 for some overlapping implications in an applied setting of real practical value). And our sketched presentation hardly allows for any conclusive assessment. It does suggest, however, that reports of the death of hypothetico-deductivism (see Earman 1992, 64, and Glymour 1980b) might have been exaggerated. For all its difficulties, HD has proven fairly resilient at least as a basic framework to elucidate some key aspects of how hypotheses can be confirmed by the evidence (see Betz 2013, Gemes 2005, and Sprenger 2011b for consonant points of view).

3. Bayesian confirmation theories

Bayes’s theorem is a very central element of the probability calculus (see Joyce 2019). For historical reasons, Bayesian has become a standard label to allude to a range of approaches and positions sharing the common idea that probability (in its modern, mathematical sense) plays a crucial role in rational belief, inference, and behavior. According to Bayesian epistemologists and philosophers of science, (i) rational agents have credences differing in strength, which moreover (ii) satisfy the probability axioms, and can thus be represented in probabilistic form. (In non-Bayesian models (ii) is rejected, but (i) may well be retained: see Huber and Schmidt-Petri 2009, Levi 2008, and Spohn 2012.) Well-known arguments exist in favor of this position (see, e.g., Easwaran 2011a; Pettigrew 2016; Skyrms 1987; Vineberg 2016), although there is no lack of difficulties and criticism (see, e.g., Easwaran 2011b; Hájek 2008; Kelly and Glymour 2004; Norton 2011b).

Beyond the core ideas above, however, the theoretical landscape of Bayesianism is quite as hopelessly diverse as it is fertile. Surveys and state-of-the-art presentations are already numerous, and ostensibly growing (see, e.g., Good 1971; Joyce 2011; Oaksford and Chater 2007; Sprenger and Hartmann 2020; Weisberg 2015). For the present purposes, attention can be restricted to a classification that is still fairly coarse-grained, and based on just two dimensions or criteria.

First, there is a distinction between permissivism and _impermissivism_ (see Meacham 2014 and Kopec and Titelbaum 2016 for this terminology). For permissive Bayesians (often otherwise labelled “subjectivists”), accordance with the probability axioms is the only clear-cut constraint on the credences of a rational agent. In impermissive forms of Bayesianism (often otherwise called “objective”), further constraints are put forward that significantly restrict the range of rational credences, possibly up to one single “right” probability function in any given setting. Second, there are different attitudes towards the so-called principle of total evidence (TE) for the credences on which a reasoner relies. TE Bayesians maintain that the relevant credences should be represented by a probability function \(P\) which conveys the totality of what is known to the agent. For non-TE approaches, depending on the circumstances, \(P\) may (or should) be set up so that portions of the evidence available are in fact bracketed. (Unsurprisingly, further subtleties arise as soon as one delves a bit further into the precise meaning and scope of TE; see Fitelson 2008 and Williamson 2002, Chs. 9–10, for important discussions.)

Of course, many intermediate positions exist between extreme forms of permissivism and impermissivism so outlined, and more or less the same applies for the TE issue. The above distinctions are surely rough enough, but useful nonetheless. Impermissive TE Bayesianism has served as a received view in early Bayesian philosophy of science (e.g., Carnap 1950/1962). But impermissivism is easily found in combination with non-TE positions, too (see, e.g., Maher 1996). TE permissivism seems a good approximation of De Finetti’s (2008) stance, while non-TE permissivism is arguably close to a standard view nowadays (see, e.g., Howson and Urbach 2006). No more than this will be needed to begin our exploration of Bayesian confirmation theories.

3.1 Probabilistic confirmation as firmness

Let us posit a set \(\bP\) of probability functions representing possible states of belief about a domain that is described in a finite language \(L\) with \(\bL\) the set of its closed sentences. From now on, unless otherwise specified, whenever considering some \(h, e, k \in \bL\) and \(P \in \bP\), we will invariably rely on the following provisos:

  1. both \(e\wedge k\) and \(h\wedge k\) are consistent;
  2. \(P(e\wedge k), P(h\wedge k) \gt 0;\)
  3. \(P(k) \gt P(h\wedge k)\) (unless \(k \vDash h\));
  4. \(P(e\wedge k) \gt P(e\wedge h\wedge k)\) (unless \(e\wedge k \vDash h\)); and
  5. \(P(e\wedge h\wedge k) \gt 0\), as long as \(e\wedge h\wedge k\) is consistent.

(These assumptions are convenient and critical for technical reasons, but not entirely innocent. Festa 1999 and Kuipers 2000, 44 ff., discuss some limiting cases that are left aside here owing to these constraints.)

A probabilistic theory of confirmation can be spelled out through the definition of a function \(C_{P}(h, e\mid k): \{\bL^3 \times \bP\} \rightarrow \Re\) representing the degree of confirmation that hypothesis \(h\) receives from evidence \(e\) relative to \(k\) and probability function \(P\). \(C_{P}(h,e\mid k)\) will then have relevant probabilities as its building blocks, according to the following basic postulate of probabilistic confirmation:

(P0) Formality
There exists a function \(g\) such that, for any \(h, e, k \in \bL\) and any \(P \in \bP\), \(C_{P}(h,e\mid k) = g[P(h\wedge e\mid k),P(h\mid k),P(e\mid k)]\).

Note that the probability distribution over the algebra generated by \(h\) and \(e\), conditional on \(k\), is entirely determined by \(P(h\wedge e\mid k)\), \(P(h\mid k)\) and \(P(e\mid k)\). Hence, (P0) simply states that \(C_{P}(h, e\mid k)\) depends on that distribution, and nothing else. (The label for this assumption is taken from Tentori, Crupi, and Osherson 2007, 2010.)
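The remark that three numbers fix the whole distribution can be made concrete. The sketch below is a minimal illustration, with a hypothetical helper name and made-up input values: it recovers the probabilities of the four state descriptions \(h \wedge e\), \(h \wedge \neg e\), \(\neg h \wedge e\), and \(\neg h \wedge \neg e\) (all conditional on \(k\)) from \(P(h\wedge e\mid k)\), \(P(h\mid k)\) and \(P(e\mid k)\).

```python
def state_descriptions(p_he, p_h, p_e):
    """Recover the four cell probabilities (conditional on k) from
    P(h & e | k), P(h | k) and P(e | k)."""
    cells = {
        "h & e": p_he,
        "h & ~e": p_h - p_he,
        "~h & e": p_e - p_he,
        "~h & ~e": 1.0 - p_h - p_e + p_he,
    }
    # Reject inputs that no probability function could generate.
    if any(value < -1e-12 for value in cells.values()):
        raise ValueError("inputs do not describe a coherent distribution")
    return cells

# Hypothetical values, chosen only for illustration.
cells = state_descriptions(p_he=0.4, p_h=0.5, p_e=0.5)
print(cells)
```

Any quantity definable over the algebra generated by \(h\) and \(e\), such as \(P(h\mid e\wedge k)\) or \(P(e\mid \neg h\wedge k)\), is a ratio of sums of these four cells, which is why (P0) loses no information by listing just three arguments.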

Hempelian and HD confirmation, as discussed above, are _qualitative_ theories of confirmation. They only tell us _whether_ evidence \(e\) confirms (disconfirms) hypothesis \(h\) given \(k\). However, assessments of the amount of support that some evidence brings to a hypothesis are commonly involved in scientific reasoning, as well as in other domains, if only in the form of comparative judgments such as “hypothesis \(h\) is more strongly confirmed by \(e_{1}\) than by \(e_{2}\)” or “\(e\) confirms \(h_{1}\) to a greater extent than \(h_{2}\)”. Consider, for instance, the following principle, a veritable cornerstone of probabilistic confirmation in all of its variations (see Crupi, Chater, and Tentori 2013 for a list of references):

(P1) Final probability
For any \(h,e_{1},e_{2},k \in \bL\) and any \(P \in \bP\), \(C_{P}(h,e_{1}\mid k) \gtreqless C_{P}(h, e_{2}\mid k)\) if and only if \(P(h\mid e_{1} \wedge k) \gtreqless P(h\mid e_{2} \wedge k).\)

(P1) is itself a comparative, or ordinal, principle, stating that, for any fixed hypothesis \(h\), the final (or posterior) probability and confirmation always move in the same direction in the light of data, \(e\) (given \(k\)). Interestingly, (P0) and (P1) are already sufficient to single out one traditional class of measures of probabilistic confirmation, if conjoined with the following (see Crupi and Tentori 2016, 656, Schippers 2017, and also Törnebohm 1966, 81):

(P2) Local equivalence
For any \(h_{1},h_{2},e,k \in \bL\) and any \(P\in \bP\), if \(h_{1}\) and \(h_{2}\) are logically equivalent given \(e\) and \(k\), then \(C_{P}(h_{1},e\mid k) = C_{P}(h_{2}, e\mid k).\)

The following can then be shown:

Theorem 1
(P0), (P1) and (P2) hold if and only if there exists a strictly increasing function \(f\) such that, for any \(h, e, k \in \bL\) and any \(P \in \bP\), \(C_{P}(h, e\mid k) = f[P(h\mid e\wedge k)]\).

Theorem 1 provides a simple axiomatic characterization of the class of confirmation functions that are strictly increasing with the final probability of the hypothesis given the evidence (and \(k\)) (proven in Schippers 2017). All the functions in this class are ordinally equivalent, meaning that they imply the same rank order of \(C_{P}(h, e\mid k)\) and \(C_{P^*}(h^*,e^*\mid k^*)\) for any \(h, h^*,e, e^*,k, k^* \in \bL\) and any \(P, P^* \in \bP.\)

By (P0), (P1) and (P2), we thus have \(C_{P}(h, e\mid k) = f[P(h\mid e \wedge k)]\), implying that the more likely \(h\) is given the evidence the more it is confirmed. This approach explicates confirmation precisely as the overall credibility of a hypothesis (firmness is Carnap’s 1950/1962 telling term, xvi). In this view, “Bayesian confirmation theory is little more than the examination of [the] properties” of the posterior probability function (Howson 2000, 179).

As we will see, the ordinal level of analysis is a solid and convenient middle ground between a purely qualitative and a thoroughly quantitative (metric) notion of confirmation. To begin with, ordinal notions are in general sufficient to move “upwards” to the qualitative level as follows:

Qualitative confirmation from ordinal relations (QC)
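For any \(h, e, k \in \bL\) and any \(P \in \bP\):

  1. \(e\) confirms \(h\) relative to \(k\) if and only if \(C_{P}(h, e\mid k) \gt C_{P}(\neg h, e\mid k);\)
  2. \(e\) disconfirms \(h\) relative to \(k\) if and only if \(C_{P}(h, e\mid k) \lt C_{P}(\neg h, e\mid k);\)
  3. \(e\) is neutral for \(h\) relative to \(k\) if and only if \(C_{P}(h, e\mid k) = C_{P}(\neg h, e\mid k).\)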

Given Theorem 1, (P0), (P1) and (P2) can be combined with the definitions in (QC) to derive the following qualitative notion of probabilistic confirmation as firmness:

Confirmation as firmness (\(F\)-confirmation, qualitative)
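For any \(h, e, k \in \bL\) and any \(P \in \bP\):

  1. \(e\) \(F\)-confirms \(h\) relative to \(k\) if and only if \(P(h\mid e \wedge k) \gt \bfrac{1}{2};\)
  2. \(e\) \(F\)-disconfirms \(h\) relative to \(k\) if and only if \(P(h\mid e \wedge k) \lt \bfrac{1}{2};\)
  3. \(e\) is \(F\)-neutral for \(h\) relative to \(k\) if and only if \(P(h\mid e \wedge k) = \bfrac{1}{2}.\)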

The point of qualitative \(F\)-confirmation is thus straightforward: \(h\) is said to be (dis)confirmed by \(e\) (given \(k\)) if it is more likely than not to be true (false). (Sometimes a threshold higher than a probability \(\bfrac{1}{2}\) is identified, but this complication would add little for our present purposes.)

The ordinal notion of confirmation is of high theoretical significance because ordinal divergences, unlike purely quantitative differences, imply opposite comparative judgments for some evidence-hypothesis pairs. A refinement from the ordinal to a properly quantitative level is also of interest, however, and very useful for tractability and applications. For example, one can have 0 as a convenient neutrality threshold for confirmation as firmness, provided that the following functional representation is adopted (see Peirce 1878 for an early occurrence):

\begin{align} F(h,e\mid k) & = \log\left[\frac{P(h\mid e \wedge k)}{P(\neg h\mid e \wedge k)}\right] \\ & = \log Odds(h\mid e \wedge k) \end{align}

(The base of the logarithm can be chosen at convenience, as long as it is strictly greater than 1.)
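As a small numerical illustration, the log-odds measure can be computed directly from the final probability. The sketch below is not drawn from the literature cited: the function name and the sample posteriors are assumptions, and the limiting cases \(P(h\mid e \wedge k) = 0\) and \(P(h\mid e \wedge k) = 1\), where the log odds diverge, are simply excluded.

```python
import math

def firmness(posterior, base=2.0):
    """Log-odds measure F(h, e | k), where posterior = P(h | e & k).

    Restricted to 0 < posterior < 1; the base must be greater than 1.
    """
    if not 0.0 < posterior < 1.0:
        raise ValueError("posterior must lie strictly between 0 and 1")
    if base <= 1.0:
        raise ValueError("the base of the logarithm must exceed 1")
    return math.log(posterior / (1.0 - posterior), base)

# Hypothetical final probabilities, chosen only for illustration.
posteriors = [0.2, 0.5, 0.8]
scores = [firmness(p) for p in posteriors]

for p, score in zip(posteriors, scores):
    print(p, score)
```

A posterior of \(\bfrac{1}{2}\) is mapped to 0, posteriors above it to positive values and posteriors below it to negative values. Since the log odds are strictly increasing in \(P(h\mid e \wedge k)\), the measure ranks evidence-hypothesis pairs exactly as the posterior itself does, in line with Theorem 1.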

A quantitative requirement that is often put forward is the following stringent form of additivity:

Strict additivity (SA)
For any \(h, e_{1},e_{2},k \in \bL\) and any \(P \in \bP\),
\(\ \ \ C_{P}(h, e_{1} \wedge e_{2}\mid k) = C_{P}(h, e_{1}\mid k) + C_{P}(h, e_{2}\mid e_{1} \wedge k).\)

Although extraneous to \(F\)-confirmation, Strict Additivity will prove of use later on for the discussion of further variants of Bayesian confirmation theory.

3.2 Strengths and infirmities of firmness

Confirmation as firmness shares a number of structural properties with Hempelian confirmation. It satisfies the Special Consequence Condition, thus the Predictive Inference Condition too. It satisfies the Entailment Condition and, in virtue of (P1), extends it smoothly to the following ordinal counterpart:

Entailment condition (ordinal extension) (EC-Ord)
For any \(h, e_{1},e_{2},k\in \bL\) and any \(P \in \bP\) such that \(k \not\vDash h\):

  1. if \(e_{1}\wedge k \vDash h\) and \(e_{2}\wedge k \not\vDash h\), then \(h\) is more confirmed by \(e_{1}\) than by \(e_{2}\) relative to \(k\), that is, \(C_{P}(h, e_{1}\mid k) \gt C_{P}(h, e_{2}\mid k);\)
  2. if \(e_{1}\wedge k\vDash h\) and \(e_{2}\wedge k\vDash h,\) then \(h\) is equally confirmed by \(e_{1}\) and by \(e_{2}\) relative to \(k\), that is, \(C_{P}(h, e_{1}\mid k) = C_{P}(h, e_{2}\mid k).\)

According to (EC-Ord) not only is classical entailment retained as a case of confirmation, it also represents a limiting case: it is the strongest possible form of confirmation that a fixed hypothesis \(h\) can receive.

\(F\)-confirmation also satisfies Confirmation Complementarity and, moreover, extends it to its appealing ordinal counterpart (see Crupi, Festa, and Buttasi 2010, 85–86), that is:

Confirmation complementarity (ordinal extension) (CC-Ord)
\(C_{P}(\neg h, e\mid k)\) is a strictly decreasing function of \(C_{P}(h, e\mid k)\), that is, for any \(h, h^*,e, e^*,k \in \bL\) and any \(P\in \bP,\) \(C_{P}(h, e\mid k)\gtreqless C_{P}(h^*,e^*\mid k)\) if and only if \(C_{P}(\neg h, e\mid k) \lesseqgtr C_{P}(\neg h^*,e^*\mid k).\)

(CC-Ord) neatly reflects Keynes’ (1921, 80) remark that “an argument is always as near to proving or disproving a proposition, as it is to disproving or proving its contradictory”. Indeed, quantitatively, the measure \(F(h, e\mid k)\) instantiates Confirmation Complementarity in a simple and elegant way, that is, it satisfies \(C_{P}(h, e\mid k) = -C_{P}(\neg h, e\mid k).\)

\(F\)-confirmation also implies another attractive quantitative result, alleviating the ailments of the irrelevant conjunction paradox. In the statement below, indicating this result, the _irrelevance_ of \(q\) for hypothesis \(h\) and evidence \(e\) (relative to \(k\)) is meant to amount to the probabilistic independence of \(q\) from \(h, e\) and their conjunction (given \(k\)), that is, to \(P(h \wedge q\mid k) = P(h\mid k)P(q\mid k),\) \(P(e \wedge q\mid k) = P(e\mid k)P(q\mid k)\), and \(P(h \wedge e \wedge q\mid k) = P(h \wedge e\mid k)P(q\mid k)\), respectively.

Confirmation upon irrelevant conjunction (ordinal solution) (CIC)
For any \(h, e, q, k \in \bL\) and any \(P \in \bP,\) if \(e\) confirms \(h\) relative to \(k\) and \(q\) is irrelevant for \(h\) and \(e\) relative to \(k\), then
\(\ \ \ C_{P}(h, e\mid k) \gt C_{P}(h \wedge q, e\mid k).\)

So, even in case it is qualitatively preserved across the tacking of \(q\) onto \(h\), the positive confirmation afforded by \(e\) is at least bound to quantitatively decrease thereby.
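The effect of tacking can be checked on a toy distribution. The sketch below is only an illustration with made-up numbers: it builds a joint distribution over \(h\), \(e\) and \(q\) (all conditional on \(k\)) in which \(q\) is independent of \(h\), \(e\) and \(h \wedge e\), and then compares \(P(h\mid e \wedge k)\) with \(P(h \wedge q\mid e \wedge k)\). The helper names are hypothetical.

```python
from itertools import product

# Hypothetical joint distribution over (h, e), all conditional on k.
p_he = {(True, True): 0.4, (True, False): 0.1,
        (False, True): 0.1, (False, False): 0.4}
p_q = 0.9  # hypothetical probability of the irrelevant conjunct q

# Build a joint over (h, e, q) in which q is independent of h, e, and h & e.
joint = {
    (h, e, q): p_he[(h, e)] * (p_q if q else 1.0 - p_q)
    for h, e, q in product([True, False], repeat=3)
}

def prob(event):
    """Probability of the set of worlds (h, e, q) in which event holds."""
    return sum(p for world, p in joint.items() if event(*world))

def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(lambda h, e, q: event(h, e, q) and given(h, e, q)) / prob(given)

post_h = cond(lambda h, e, q: h, lambda h, e, q: e)          # P(h | e & k)
post_hq = cond(lambda h, e, q: h and q, lambda h, e, q: e)   # P(h & q | e & k)

print(post_h, post_hq)  # roughly 0.8 and 0.72
```

Under the stated independence assumptions, \(P(h \wedge q\mid e \wedge k) = P(h\mid e \wedge k)P(q\mid k)\), so the final probability of the conjunction is strictly lower whenever \(P(q\mid k) \lt 1\). With these numbers both \(h\) and \(h \wedge q\) remain above \(\bfrac{1}{2}\), so confirmation is qualitatively preserved while its degree drops.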

Partly because of appealing formal features such as those mentioned so far, there is a long list of distinguished scholars advocating the firmness view of confirmation, from Keynes (1921) and Hosiasson-Lindenbaum (1940) onwards, most often coupled with some form of impermissive Bayesianism (see Hawthorne 2011 and Williamson 2011 for contemporary variations). In fact, \(F\)-confirmation fits most neatly a classical form of TE impermissivism _à la_ Carnap, where one assumes that \(k = \top,\) that \(P\) is an “objective” initial probability based on essentially logical considerations, and that all the non-logical information available is collected in \(e\). The spirit of the Carnapian project never lost its appeal entirely (see, e.g., Festa 2003, Franklin 2001, Maher 2010, Paris 2011). However, the idea of a “logical” interpretation of \(P\) ran into difficulties that are often seen as insurmountable (e.g., Earman and Salmon 1992, 85–89; Gillies 2000, Ch. 3; Hájek 2019; Howson and Urbach 2006, 59–72; van Fraassen 1989, Ch. 12; Zabell 2011). And arguably, lacking some robust and effective impermissivist policy, the account of confirmation as firmness ends up losing much of its philosophical momentum. The issues surrounding the ravens and blite paradoxes provide a useful illustration.

Consider again \(h = \forall x(raven(x) \rightarrow black(x))\), and the main analyses of “the observation that \(a\) is a black raven” encountered so far, that is:

  1. \(k = \top\) and \(e = raven(a) \wedge black(a)\), and
  2. \(k = raven(a)\) and \(e = black(a).\)

In both cases, whether \(e\) \(F\)-confirms \(h\) or not (relative to \(k\)) critically depends on \(P\): if the prior \(P(h\mid k)\) is low enough, \(e\) won’t do no matter what under either (i) or (ii); and if it is high enough, \(h\) will be \(F\)-confirmed either way. As a consequence, the \(F\)-confirmation view, by itself, does not offer any definite hint as to when, how, and why Nicod’s remarks apply or not.
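The dependence on the prior can be illustrated for analysis (ii), where \(h \wedge k\) entails \(e\) and hence \(P(e\mid h \wedge k) = 1\). The sketch below is a rough illustration under assumptions: the function name is hypothetical, and the value chosen for \(P(e\mid \neg h \wedge k)\) is made up and held fixed while the prior varies.

```python
def posterior_when_h_entails_e(prior, p_e_given_not_h):
    """P(h | e & k) by Bayes's theorem when h & k entails e,
    so that P(e | h & k) = 1. Here prior = P(h | k)."""
    return prior / (prior + p_e_given_not_h * (1.0 - prior))

# Hypothetical chance that raven a is black if not all ravens are black.
p_black_if_h_false = 0.5

low = posterior_when_h_entails_e(0.01, p_black_if_h_false)   # sceptical prior
high = posterior_when_h_entails_e(0.60, p_black_if_h_false)  # favourable prior

print(low > 0.5, high > 0.5)
```

With the sceptical prior the final probability stays far below \(\bfrac{1}{2}\), so the black raven does not \(F\)-confirm \(h\); with the favourable prior it does. The same observation report thus receives opposite qualitative verdicts depending on \(P\) alone.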

For the purposes of our discussion, the following condition reveals another debatable aspect of the firmness explication of confirmation.

Consistency condition (Cons)
For any \(h, h^*,e, k \in \bL\) and any \(P \in \bP\), if \(k \vDash \neg(h\wedge h^*)\) then \(e\) confirms \(h\) given \(k\) if and only if \(e\) disconfirms \(h^*\) given \(k\).

(Cons) says that evidence \(e\) can never confirm incompatible hypotheses. But consider, by way of illustration, a clinical case of an infectious disease of unknown origin, and suppose that \(e\) is the failure of antibiotic treatment. Arguably, there is nothing wrong in saying that, by discrediting bacteria as possible causes, the evidence confirms (viz. provides some support for) any of a number of alternative viral diagnoses. This judgment clashes with (Cons), though, which then seems an overly strong constraint.

Notably, (Cons) was defended by Hempel (1945) and, in fact, one can show that it follows from the conjunction of (qualitative) Confirmation Complementarity and the Special Consequence Condition, and so from both Hempelian and \(F\)-confirmation. This is but one sign of how stringent the Special Consequence Condition is. Mainly because of the latter, both the Hempelian and the firmness views of confirmation must depart from the plausible HD idea that hypotheses are generally confirmed by their verified consequences (see Hempel 1945, 103–104). We will come back to this while discussing our next topic: a very different Bayesian explication of confirmation, based on the notion of probabilistic relevance.

3.3 Probabilistic relevance confirmation

We’ve seen that the firmness notion of probabilistic confirmation can be singled out through one ordinal constraint, (P2), in addition to the fundamental principles (P0)–(P1). The counterpart condition for the so-called relevance notion of probabilistic confirmation is the following:

(P3) Tautological evidence
For any \(h_{1},h_{2},k\in \bL\) and any \(P\in \bP\), \(C_{P}(h_{1},\top \mid k) = C_{P}(h_{2},\top \mid k).\)

(P3) implies that any hypothesis is equally “confirmed” by empty evidence. We will say that \(C_{P}(h, e\mid k)\) represents the probabilistic relevance notion of confirmation, or relevance-confirmation, if and only if it satisfies (P0), (P1) and (P3). These conditions are sufficient to derive the following, purely qualitative principle, according to the definitional method in (QC) above (see Crupi and Tentori 2014, 82, and Crupi 2015).

Probabilistic relevance confirmation (qualitative)
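For any \(h, e, k \in \bL\) and any \(P\in \bP:\)

  1. \(e\) relevance-confirms \(h\) relative to \(k\) if and only if \(P(h\mid e \wedge k) \gt P(h\mid k);\)
  2. \(e\) relevance-disconfirms \(h\) relative to \(k\) if and only if \(P(h\mid e \wedge k) \lt P(h\mid k);\)
  3. \(e\) is relevance-neutral for \(h\) relative to \(k\) if and only if \(P(h\mid e \wedge k) = P(h\mid k).\)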

The point of relevance confirmation is that the credibility of a hypothesis can be changed in either a positive (confirmation in a strict sense) or negative way (disconfirmation) by the evidence concerned (given \(k\)). Confirmation (in the strict sense) thus reflects an increase from initial to final probability, whereas disconfirmation reflects a decrease (see Achinstein 2005 for some diverging views on this very idea).

The qualitative notions of confirmation as firmness and as relevance are demonstrably distinct. Unlike firmness, relevance confirmation cannot be formalized by the final probability alone, or any increasing function thereof. To illustrate, the probability of an otherwise very rare disease \((h)\) can be quite low even after a relevant positive test result \((e)\); yet \(h\) is relevance-confirmed by \(e\) to the extent that its probability rises thereby. By the same token, the probability of the absence of the disease \((\neg h)\) can be quite high despite the positive test result \((e)\), yet \(\neg h\) is relevance-disconfirmed by \(e\) to the extent that its probability decreases thereby. Perhaps surprisingly, the distinction between firmness and relevance confirmation, “extremely fundamental” and yet “sometimes unnoticed” as Salmon (1969, 48–49) put it, had to be stressed time and again to achieve theoretical clarity in philosophy (e.g., Popper 1954; Peijnenburg 2012) as well as in other domains concerned, such as artificial intelligence and the psychology of reasoning (see Horvitz and Heckerman 1986; Crupi, Fitelson, and Tentori 2008; Shogenji 2012).
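The rare-disease case can be worked through with Bayes’s theorem. The sketch below is a minimal illustration in which the prevalence, the sensitivity and the false-positive rate are all made-up values and the function name is hypothetical.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(h | e & k) by Bayes's theorem, where e is a positive test result,
    sensitivity = P(e | h & k) and false_positive_rate = P(e | not-h & k)."""
    p_e = sensitivity * prior + false_positive_rate * (1.0 - prior)  # P(e | k)
    return sensitivity * prior / p_e

prior = 0.001  # hypothetical P(h | k): the disease is very rare
post = posterior(prior, sensitivity=0.99, false_positive_rate=0.05)

relevance_confirms = post > prior   # e raises the probability of h
firmness_confirms = post > 0.5      # h is more likely than not given e

print(post, relevance_confirms, firmness_confirms)
```

With these numbers the final probability of the disease is still below 2%, so \(h\) is not \(F\)-confirmed, and yet it is roughly twenty times the prior, so \(h\) is relevance-confirmed. Symmetrically, \(\neg h\) remains highly probable while being relevance-disconfirmed, since its probability drops from 0.999 to about 0.98.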

The qualitative notion of relevance confirmation already has some interesting consequences. It implies, for instance, the following remarkable fact:

Complementary Evidence (CompE)
For any \(h, e, k\in \bL\) and any \(P\in \bP,\) \(e\) confirms \(h\) relative to \(k\) if and only if \(\neg e\) disconfirms \(h\) relative to \(k.\)

The importance of (CompE) can be illustrated as follows. Consider the case of a father suspected of abusing his son. Suppose that the child does claim that he has been abused (label this evidence \(e\)). A forensic psychiatrist, when consulted, declares that this confirms guilt \((h)\). Alternatively, suppose that the child is asked and does _not_ report having been abused \((\neg e).\) As pointed out by Dawes (2001), it may well happen that a forensic psychiatrist will nonetheless interpret this as evidence confirming guilt (suggesting that violence has prompted the child’s denial). One might want to argue that, other things being equal, this kind of “heads I win, tails you lose” judgment would be inconsistent, and thus in principle untenable. Whoever concurs with this line of argument (as Dawes 2001 himself did) is likely to be relying on the relevance notion of confirmation. In fact, no other notion of confirmation considered so far provides a general foundation for this judgment. \(F\)-confirmation, in particular, would not do, for it does allow that both \(e\) and \(\neg e\) confirm \(h\) (relative to \(k\)). This is because, mathematically, it is perfectly possible for both \(P(h\mid e \wedge k)\) and \(P(h\mid \neg e \wedge k)\) to be arbitrarily high above \(\bfrac{1}{2}.\) Condition (CompE), on the contrary, ensures that at most one of the complementary statements \(e\) and \(\neg e\) can confirm hypothesis \(h\) (relative to \(k\)). (To be precise, HD-confirmation also satisfies condition CompE, yet it would fail the above example all the same, although for a different reason, that is, because the connection between \(h\) and \(e\) is plausibly one of probabilistic dependence but not of logical entailment.)
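
The reason why relevance confirmation satisfies (CompE) while firmness does not can be spelled out in one step. The following is a sketch under the added assumption that \(0 \lt P(e\mid k) \lt 1,\) so that both conditional probabilities are defined. By the law of total probability,

\[P(h\mid k) = P(h\mid e \wedge k)\,P(e\mid k) + P(h\mid \neg e \wedge k)\,P(\neg e\mid k).\]

So \(P(h\mid k)\) is a weighted average of \(P(h\mid e \wedge k)\) and \(P(h\mid \neg e \wedge k),\) with positive weights summing to 1. It follows that \(P(h\mid e \wedge k) \gt P(h\mid k)\) if and only if \(P(h\mid \neg e \wedge k) \lt P(h\mid k),\) which is (CompE). Nothing similar constrains the two final probabilities relative to a fixed threshold: with the purely illustrative values \(P(h\mid e \wedge k) = 0.9,\) \(P(h\mid \neg e \wedge k) = 0.7\) and \(P(e\mid k) = 0.5,\) one gets \(P(h\mid k) = 0.8,\) so that both final probabilities exceed \(\bfrac{1}{2}\) while only \(e\) raises the probability of \(h.\)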

Remarks such as the foregoing have induced some contemporary Bayesian theorists to dismiss the notion of confirmation as firmness altogether, concluding with I.J. Good (1968, 134) that “if you had \(P(h\mid e \wedge k)\) close to unity, but less than \(P(h\mid k)\), you ought not to say that \(h\) was confirmed by \(e\)” (also see Salmon 1975, 13). Let us follow this suggestion and proceed to consider the ordinal (and quantitative) notions of relevance confirmation.

3.4 Differences, ratios, and partial entailment

Just as with firmness, the ordinal analysis of relevance confirmation can be characterized axiomatically. With the relevance notion, however, a larger set of options arises. Consider the following principles.

(P4) Disjunction of alternative hypotheses
For any \(e, h_{1},h_{2},k\in \bL\) and any \(P\in \bP,\) if \(k\vDash \neg (h_{1} \wedge h_{2})\), then \(C_{P}(h_{1},e\mid k) \gtreqless C_{P}(h_{1} \vee h_{2},e\mid k)\) if and only if \(P(h_{2}\mid e \wedge k)\gtreqless P(h_{2}\mid k).\)

(P5) Law of likelihood
For any \(e, h_{1}, h_{2}, k\in \bL\) and any \(P\in \bP,\) \(C_{P}(h_{1}, e\mid k)\gtreqless C_{P}(h_{2}, e\mid k)\) if and only if \(P(e\mid h_{1} \wedge k)\gtreqless P(e\mid h_{2} \wedge k).\)

(P6) Modularity (for conditionally independent data)
For any \(e_{1},e_{2},h, k\in \bL\) and any \(P\in \bP,\) if \(P(e_{1}\mid \pm h \wedge e_{2} \wedge k)=P(e_{1}\mid \pm h \wedge k),\) then \(C_{P}(h, e_{1}\mid e_{2} \wedge k) = C_{P}(h, e_{1}\mid k).\)

All the above conditions occur more or less widely in the literature (see Crupi, Chater, and Tentori 2013 and Crupi and Tentori 2016 for references and discussion). Interestingly, they’re all pairwise incompatible on the background of the Formality and the Final Probability principles (P0 and P1 above). Indeed, they sort out the relevance notion of confirmation into three distinct, classical families of measures, as follows (Crupi, Chater, and Tentori 2013; Crupi and Tentori 2016; Heckerman 1988; Sprenger and Hartmann 2020, Ch. 1):

Theorem 2
Given (P0) and (P1):

  1. (P4) holds if and only if \(C_{P}(h, e\mid k)\) is a _probability difference measure_, that is, if there exists a strictly increasing function \(f\) such that, for any \(h, e, k\in \bL\) and any \(P\in \bP,\) \(C_{P}(h, e\mid k) = f[P(h\mid e \wedge k) - P(h\mid k)];\)
  2. (P5) holds if and only if \(C_{P}(h, e\mid k)\) is a _probability ratio measure_, that is, if there exists a strictly increasing function \(f\) such that, for any \(h, e, k\in \bL\) and any \(P\in \bP,\) \(C_{P}(h, e\mid k) =f[\frac{P(h\mid e \wedge k)}{P(h\mid k)}];\)
  3. (P6) holds if and only if \(C_{P}(h, e\mid k)\) is a _likelihood ratio measure_, that is, if there exists a strictly increasing function \(f\) such that, for any \(h, e, k\in \bL\) and any \(P\in \bP,\) \(C_{P}(h, e\mid k) =f[\frac{P(e\mid h \wedge k)}{P(e\mid \neg h \wedge k)}].\)

If a strictly additive behavior (SA above) is imposed, one functional form is singled out for the quantitative representation of confirmation corresponding to each of the clauses above:

  1. \(D_{P}(h, e\mid k) = P(h\mid e \wedge k) - P(h\mid k);\)
  2. \(R_{P}(h, e\mid k) = \log[\frac{P(h\mid e \wedge k)}{P(h\mid k)}];\)
  3. \(L_{P}(h, e\mid k) = \log[\frac{P(e\mid h \wedge k)}{P(e\mid \neg h \wedge k)}].\)

(The bases of the logarithms are assumed to be strictly greater than 1.)
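
To see how the three functional forms can come apart, here is a minimal sketch that computes \(D\), \(R\) and \(L\) from a prior and two likelihoods. The function name `measures`, the choice of natural logarithms, and the numerical inputs are assumptions made for illustration only; the background \(k\) is left implicit.

```python
import math

def measures(prior_h, p_e_given_h, p_e_given_not_h):
    """Return (D, R, L) for hypothesis h and evidence e.

    Assumes 0 < prior_h < 1 and strictly positive likelihoods.
    Natural logarithms are used (a base greater than 1).
    """
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    posterior_h = p_e_given_h * prior_h / p_e      # Bayes' theorem
    d = posterior_h - prior_h                      # probability difference
    r = math.log(posterior_h / prior_h)            # log probability ratio
    l = math.log(p_e_given_h / p_e_given_not_h)    # log likelihood ratio
    return d, r, l

# Two hypothetical cases: (prior, P(e|h), P(e|not-h)).
d_a, r_a, l_a = measures(0.5, 0.8, 0.2)
d_b, r_b, l_b = measures(0.01, 0.5, 0.1)

# All three measures are positive in both cases, but they order them differently.
print(d_a > d_b, r_a > r_b, l_a > l_b)
```

In this example all three measures agree qualitatively (both cases are confirmatory), yet \(D\) ranks the first case above the second while \(R\) and \(L\) rank the second above the first, so the choice of measure matters at the ordinal level.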

Before briefly discussing this set of alternative quantitative measures of relevance confirmation, we will address one further related issue. It is a long-standing idea, going back to Carnap at least, that confirmation theory should yield an inductive logic that is analogous to classical deductive logic in some suitable sense, thus providing a theory of partial entailment and partial refutation. Now, the deductive-logical notions of entailment and refutation (contradiction) exhibit the following well-known properties:

Contraposition of entailment
Entailment is contrapositive, but not commutative. That is, it holds that \(e\) entails \(h\) \((e\vDash h)\) if and only if \(\neg h\) entails \(\neg e\) \((\neg h\vDash \neg e),\) while it does not hold that \(e\) entails \(h\) if and only if \(h\) entails \(e\) \((h\vDash e).\)

Commutativity of refutation
Refutation, on the contrary, is commutative, but not contrapositive. That is, it holds that \(e\) refutes \(h\) \((e\vDash \neg h)\) if and only if \(h\) refutes \(e\) \((h\vDash \neg e)\), while it does not hold that \(e\) refutes \(h\) if and only if \(\neg h\) refutes \(\neg e\) \((\neg h \vDash \neg\neg e).\)

The confirmation-theoretic counterparts are fairly straightforward:

(P7) Contraposition of confirmation
For any \(e, h, k\in \bL\) and any \(P\in \bP,\) if \(e\) relevance-confirms \(h\) relative to \(k,\) then \(C_{P}(h, e\mid k) = C_{P}(\neg e,\neg h\mid k).\)

(P8) Commutativity of disconfirmation
For any \(e, h, k \in \bL\) and any \(P \in \bP,\) if \(e\) relevance-disconfirms \(h\) relative to \(k\), then \(C_{P}(h, e\mid k) = C_{P}(e, h\mid k).\)

The following can then be proven (Crupi and Tentori 2013):

Theorem 3
Given (P0) and (P1), (P7) and (P8) hold if and only if \(C_{P}(h, e\mid k)\) is a relative distance measure, that is, if there exists a strictly increasing function \(f\) such that, for any \(h, e, k\in \bL\) and any \(P\in \bP,\) \(C_{P}(h, e\mid k) = f[Z(h, e\mid k)],\) where:

\( Z(h,e\mid k)= \begin{cases} \dfrac{P(h\mid e \wedge k) - P(h\mid k)}{1-P(h\mid k)} & \mbox{if } P(h\mid e \wedge k) \ge P(h\mid k) \\ \\ \dfrac{P(h\mid e \wedge k) - P(h\mid k)}{P(h\mid k)} & \mbox{if } P(h\mid e \wedge k) \lt P(h\mid k) \end{cases} \)

So, despite some pessimistic suggestions (see, e.g., Hawthorne 2018, and the discussion in Crupi and Tentori 2013), a neat confirmation-theoretic generalization of logical entailment (and refutation) is possible after all. Interestingly, relative distance measures can be additive, but only for uniform pairs of arguments – both confirmatory or both disconfirmatory (see Milne 2014, p. 259). (Note: Crupi, Tentori, and Gonzalez 2007; Crupi, Festa, and Buttasi 2010; and Crupi and Tentori 2013, 2014, provide further discussions of the properties of relative distance measures and their intuitive motivations. Also see Mura 2008 for a related analysis.)
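
The behavior required by (P7) and (P8) can be checked numerically for \(Z\) itself. The sketch below uses a hypothetical joint distribution over \(h\) and \(e\) (background \(k\) implicit); the helper name `z` and all the numbers are assumptions introduced for illustration.

```python
import math

def z(prior, posterior):
    """Relative distance measure Z, from a prior and a posterior probability."""
    if posterior >= prior:
        return (posterior - prior) / (1 - prior)
    return (posterior - prior) / prior

# Hypothetical joint probabilities of the four combinations of h and e.
p_h_e, p_h_note, p_noth_e, p_noth_note = 0.3, 0.1, 0.2, 0.4
p_h, p_e = p_h_e + p_h_note, p_h_e + p_noth_e
p_noth, p_note = 1 - p_h, 1 - p_e

z_h_e = z(p_h, p_h_e / p_e)                    # Z(h, e): e confirms h
z_note_noth = z(p_note, p_noth_note / p_noth)  # Z(not-e, not-h)
z_e_h = z(p_e, p_h_e / p_h)                    # Z(e, h)
z_noth_e = z(p_noth, p_noth_e / p_e)           # Z(not-h, e): e disconfirms not-h
z_e_noth = z(p_e, p_noth_e / p_noth)           # Z(e, not-h)

print(math.isclose(z_h_e, z_note_noth))  # contraposition of confirmation (P7)
print(math.isclose(z_noth_e, z_e_noth))  # commutativity of disconfirmation (P8)
print(math.isclose(z_h_e, z_e_h))        # commutativity for a confirmatory pair
```

With these numbers \(Z(h,e) = Z(\neg e,\neg h) = 1/3\) and \(Z(\neg h,e) = Z(e,\neg h) = -1/3,\) whereas \(Z(e,h) = 1/2\): confirmation is contrapositive but not commutative, mirroring entailment.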

The plurality of alternative probabilistic measures of relevance confirmation has prompted some scholars to be skeptical or dismissive of the prospects for a quantitative theory of confirmation (see, e.g., Howson 2000, 184–185, and Kyburg and Teng 2001, 98 ff.). However, as we will see shortly, quantitative analyses of relevance confirmation have proved important for handling a number of puzzles and issues that plagued competing approaches. Moreover, various arguments in the philosophy of science and beyond have been shown to depend critically (and sometimes unwittingly) on the choice of one confirmation measure (or some of them) rather than others (see Festa and Cevolani 2017, Fitelson 1999, Brössel 2013, Glass 2013, Roche and Shogenji 2014, Rusconi et al. 2014, and van Enk 2014).

Recently, arguments have been offered by Huber (2008b) in favor of \(D\), by Park (2014), Pruss (2014), and Vassend (2015) in favor of \(L\) (also see Morey, Romeijn, and Rouder 2016 for an important connection with statistics), and by Crupi and Tentori (2010) in favor of \(Z\). Hájek and Joyce (2008, 123), on the other hand, have seen different measures as possibly capturing “distinct, complementary notions of evidential support” (also see Schlosshauer and Wheeler 2011, Sprenger and Hartmann 2020, Ch. 1, and Steel 2007 for tempered forms of pluralism). The case of measure \(R\) deserves some more specific comments, however. Following Fitelson (2007), one could see \(R\) as conveying key tenets of the so-called “likelihoodist” position about evidential reasoning (see Royall 1997 for a classical statement, and Chandler 2013 and Sober 1990 for consonant arguments and inclinations). There seems to be some consensus, however, that compelling objections can be raised against the adequacy of \(R\) as a proper measure of relevance confirmation (see, in particular, Crupi, Festa, and Buttasi 2010, 85–86; Eells and Fitelson 2002; Gillies 1986, 112; and compare Milne 1996 with Milne 2010, Other Internet Resources). In what follows, too, it will be convenient to restrict our discussion to \(D, L\) and \(Z\) as candidate measures. All the results to be presented below are invariant under any choice among these three options, and across ordinal equivalence with each of them (but those results do _not_ always extend to measures ordinally equivalent to \(R\)).

3.5 New evidence, old evidence, and total evidence

Let us go back to a classical HD case, where the (consistent) conjunction \(h \wedge k\) (but not \(k\) alone) entails \(e.\) The following can be proven:

Surprising prediction theorem (SP)
For any \(e, h, k \in \bL\) and any \(P\in \bP\) such that \(h \wedge k\vDash e\) and \(k\not\vDash e:\)

  1. if \(P(e\mid k)\lt 1,\) then \(e\) relevance-confirms \(h\) relative to \(k\) and \(C_{P}(h, e\mid k)\) is a decreasing function of \(P(e\mid k);\)
  2. if \(P(e\mid k) = 1,\) then \(e\) is relevance-neutral for \(h\) relative to \(k.\)

Formally, it is fairly simple to show that (SP) characterizes relevance confirmation (see, e.g., Crupi, Festa, and Buttasi 2010, 80; Hájek and Joyce 2008, 123), but the philosophical import of this result is nonetheless remarkable. For illustrative purposes, it is useful to assume the endorsement of the principle of total evidence (TE) as a default position for the Bayesian. This means that \(P\) is assumed to represent actual degrees of belief of a rational agent, that is, given all the background information available. Then, by clause (i) of (SP), we have that the occurrence of \(e\), a consequence of \(h \wedge k\) (but not of \(k\) alone), confirms \(h\) relative to \(k\) provided that \(e\) was initially uncertain to some degree (even given \(k\)). In other words: \(e\) must have been predicted on the basis of \(h \wedge k\). Moreover, again by (i), the confirmatory impact will be stronger the more surprising (unlikely) the evidence was unless \(h\) was conjoined to \(k\). So, under TE, relevance confirmation turns out to embed a squarely predictivist version of hypothetico-deductivism! As we know, this neutralizes the charge of underdetermination, yet it comes at the usual cost, namely, the old evidence problem. In fact, if TE is in force, then clause (ii) of (SP) implies that no statement that is known to be true (thus assigned probability 1) can ever have confirmatory import.
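
The core of (SP) can be reconstructed in a few lines. What follows is only a sketch for the measures \(D\), \(Z\) and \(L\), under the added assumption that \(0 \lt P(h\mid k) \lt 1.\) Since \(h \wedge k\vDash e,\) we have \(P(e\mid h \wedge k) = 1,\) and Bayes’s theorem yields

\[P(h\mid e \wedge k) = \frac{P(h\mid k)}{P(e\mid k)}.\]

Hence

\[D_{P}(h, e\mid k) = P(h\mid k)\left[\frac{1}{P(e\mid k)} - 1\right], \qquad Z(h, e\mid k) = \frac{D_{P}(h, e\mid k)}{1 - P(h\mid k)}.\]

For a fixed value of \(P(h\mid k),\) both quantities are positive and strictly decreasing in \(P(e\mid k)\) as long as \(P(e\mid k) \lt 1,\) and both equal 0 when \(P(e\mid k) = 1.\) Likewise, total probability gives \(P(e\mid \neg h \wedge k) = [P(e\mid k) - P(h\mid k)]/[1 - P(h\mid k)],\) so that

\[L_{P}(h, e\mid k) = \log\left[\frac{1 - P(h\mid k)}{P(e\mid k) - P(h\mid k)}\right],\]

which is finite when \(P(e\mid k) \gt P(h\mid k)\) and displays the same pattern.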

Interestingly, the Bayesian predictivist has an escape (neatly anticipated, and criticized, by Glymour 1980a, 91–92). Consider Einstein and Mercury once again. As effectively pointed out by Norton (2011a, 7), Einstein was extremely careful to emphasize that the precession phenomenon had been derived “without having to posit any special [_auxiliary_] _hypotheses at all_”. Why? Well, presumably because if one had allowed oneself to arbitrarily devise ad hoc auxiliaries (within \(k\), in our notation) then one could have been pretty much certain in advance to find a way to get Mercury’s data right (remember: that’s the lesson of the underdetermination theorem). But getting those data right with auxiliaries \(k\) that were not thus adjusted would have been a natural consequence _had_ the theory of general relativity been true, and it would have been surprising otherwise. Arguably, this line of argument exploits much of the use-novelty idea within a predictivist framework. The crucial points are (i) that the evidence implied is not a verified empirical statement \(e\) but the logical fact that \(h \wedge k\) entails \(e\), and (ii) that the existence of this connection of entailment was not to be obviously anticipated at all, precisely because \(h \wedge k\) and \(e\) are such that the latter did not serve as a constraint to specify the former. On these conditions, it seems that \(h\) can be confirmed by this kind of “second-order” (logical) evidence in line with (SP) while TE is concurrently preserved.

At least two main problems arise, however. The first one is more technical in nature. Modelling rational uncertainty concerning logical facts (such as \(h \wedge k \vDash e\)) by probabilistic means is no trivial task. Garber (1983) put forward an influential proposal, but doubts have been raised as to whether it is well-behaved (e.g., van Fraassen 1988; a careful survey with further references can be found in Eva and Hartmann forthcoming). Second, and more substantially, this solution of the old evidence problem can be charged with being an elusive change of the subject: for it was Mercury’s data, not anything else, that had to be recovered as having confirmed (and still confirming, some would add) Einstein’s theory. That’s the kind of judgment that confirmation theory must capture, and which remains unattainable for the predictivist Bayesian. (Earman 1992, 131 voiced this complaint forcefully. Hints for a possible rejoinder appear in Eells’s 1990 thorough discussion; see also Skyrms 1983.)

Bayesians who are unconvinced by the predictivist position are naturally led to dismiss TE and allow for the assignment of initial probabilities lower than 1 even to statements that were known all along. Of course, this brings the underdetermination problem back, for now \(k\) can still be concocted ad hoc to have known evidence \(e\) following from \(h \wedge k\) _and moreover_ \(P(e\mid k)\lt 1\) is not prevented by TE anymore, thus potentially licensing arbitrary confirmation relations. Two moves can be combined to handle this problem. First, unlike HD, the Bayesian framework has the formal resources to characterize the auxiliaries themselves as more or less likely and thus their adoption as relatively safe or suspicious (the standard Bayesian treatment of auxiliary hypotheses is developed along these lines in Dorling 1979 and Howson and Urbach 2006, 92–102, and it is critically discussed in Rowbottom 2010, Strevens 2001, and Worrall 1993; also see Christensen 1997 for an important analysis of related issues). Second, one has to provide indications as to how TE should be relaxed. Non-TE Bayesians of the impermissivist strand often suggest that objective likelihood values concerning the outcome \(e\), namely \(P(e\mid h \wedge k)\), can be specified for the competing hypotheses at issue quite apart from the fact that \(e\) may have already occurred. Such values would typically be diverse for different hypotheses (thus mathematically implying \(P(e\mid k)\lt 1\)) and serve as a basis to capture formally the confirmatory impact of \(e\) (see Hawthorne 2005 for an argument along these lines). Permissivists, on the other hand, cannot coherently rely on these considerations to articulate a non-TE position. They must invoke counterfactual degrees of belief instead, suggesting that \(P\) should be reconstructed as representing the beliefs that the agent would have, had she not known that \(e\) was true (see Howson 1991 for a statement and discussion, and Sprenger 2015 for an original recent variant; also see Jeffrey 1995 and Wagner 2001 for relevant technical results, and Steele and Werndl 2013 for an intriguing case-study from climate science).

3.6 Paradoxes probabilified and other elucidations

The theory of Bayesian confirmation as relevance indicates when and why the HD idea works: if \(h \wedge k\) (but not \(k\)) entails \(e\), then \(h\) is relevance-confirmed by \(e\) (relative to \(k\)) because the latter increases the probability of the former, provided that \(P(e\mid k) \lt 1\). Admittedly, the meaning of the latter proviso partly depends on how one handles the problem of old evidence. Yet it seems legitimate to say that Bayesian relevance confirmation (unlike the firmness view) retains a key point of ordinary scientific practice which is embedded in HD and yields further elements of clarification. Consider the following illustration.

\((e_{1})\) tigers carry the ND1 gene

\((e_{2})\) elephants carry the ND1 gene

\((e_{2}^*)\) lions carry the ND1 gene

\((h)\) all mammals carry the ND1 gene

Qualitative confirmation theories comply with the idea that \(h\) is confirmed both by \(e_{1} \wedge e_{2}\) and by \(e_{1} \wedge e_{2}^*.\) In the HD case, it is clear that \(h\) entails both conjunctions, given of course \(k\) stating that tigers, lions, and elephants are all mammals (a Hempelian account could also be given easily). Bayesian relevance confirmation unequivocally yields the same qualitative verdict. There is more, however. Presumably, one might also want to say that \(h\) is more strongly confirmed by \(e_{1} \wedge e_{2}\) than by \(e_{1} \wedge e_{2}^*,\) because the former offers a more varied and diverse body of positive evidence (interestingly, on experimental investigation, this pattern prevails in most people’s judgment, including children; see Lo et al. 2002). Indeed, the variety of evidence is a fairly central issue in the analysis of confirmation (see, e.g., Bovens and Hartmann 2002, Schlosshauer and Wheeler 2011, and Viale and Osherson 2000). In the illustrative case above, higher variety is readily captured by lower probability: it just seems a priori less likely that species as diverse as tigers and elephants share some unspecified genetic trait as compared to tigers and lions, that is, \(P(e_{1} \wedge e_{2}\mid k)\lt P(e_{1} \wedge e_{2}^*\mid k).\) By (SP) above, then, one immediately gets from the relevance confirmation view the sound implication that \(C_{P}(h, e_{1} \wedge e_{2}\mid k)\gt C_{P}(h, e_{1} \wedge e_{2}^*\mid k).\)

Principle (SP) is also of much use in the ravens problem. Posit \(h = \forall x(raven(x)\rightarrow black(x))\) once again. Just as HD, Bayesian relevance confirmation directly implies that \(e = black(a)\) confirms \(h\) given \(k = raven(a)\) and \(e^* =\neg raven(a)\) confirms \(h\) given \(k^* =\neg black(a)\) (provided, as we know, that \(P(e\mid k)\lt 1\) and \(P(e^*\mid k^*)\lt 1).\) That’s because \(h \wedge k\vDash e\) and \(h \wedge k^*\vDash e^*.\) But of course, to have \(h\) confirmed, sampling ravens and finding a black one is intuitively more significant than failing to find a raven while sampling the enormous set of the non-black objects. That is, it seems, because the latter is very likely to obtain anyway, whether or not \(h\) is true, so that \(P(e^*\mid k^*)\) is actually quite close to unity. Accordingly, (SP) implies that \(h\) is indeed more strongly confirmed by \(black(a)\) given \(raven(a)\) than it is by \(\neg raven(a)\) given \(\neg black(a)\)—that is, \(C_{P}(h, e\mid k)\gt C_{P}(h, e^*\mid k^*)\)—as long as the assumption \(P(e\mid k)\lt P(e^*\mid k^*)\) applies.

What then if the sampling is not constrained \((k = \top)\) and the evidence now amounts to the finding of a black raven, \(e = raven(a) \wedge black(a)\), versus a non-black non-raven, \(e^* =\neg black(a) \wedge \neg raven(a)\)? We’ve already seen that, for either Hempelian or HD-confirmation, \(e\) and \(e^*\) are on a par: both Hempel-confirm \(h\), neither HD-confirms it. In the former case, the original Hempelian version of the ravens paradox immediately arises; in the latter, it is avoided, but at a cost: \(e\) is declared flatly irrelevant for \(h\), a bit of a radical move. Can the Bayesian do any better? Quite so. Consider the following conditions:

  1. \(P[raven(a)\mid h] = P[raven(a)] \gt 0\)
  2. \(P[\neg raven(a) \wedge black(a)\mid h] = P[\neg raven(a) \wedge black(a)]\)

Roughly, (i) says that the size of the ravens population does not depend on their color (in fact, on \(h\)), and (ii) that the size of the population of black _non_-raven objects also does not depend on the color of ravens. Note that both (i) and (ii) seem fairly sound as far as our best understanding of our actual world is concerned. It is easy to show that, in relevance-confirmation terms, (i) and (ii) are sufficient to imply that \(e = raven(a) \wedge black(a)\), but not \(e^* = \neg raven(a) \wedge \neg black(a)\), confirms \(h\), that is \(C_{P}(h,e) \gt C_{P}(h,e^*) = 0\) (this observation is due to Mat Coakley). So the Bayesian relevance approach to confirmation can make a principled difference between \(e\) and \(e^*\) in both ordinal _and qualitative_ terms. (A much broader analysis is provided by Fitelson and Hawthorne 2010, Hawthorne and Fitelson 2010 [Other Internet Resources]. Notably, their results include the full specification of the sufficient and necessary conditions for the main inequality \(C_{P}(h, e) \gt C_{P}(h, e^*)\).)
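
A toy model shows how conditions (i) and (ii) deliver this result. In the sketch below, the parameters `r`, `b` and `x`, the prior of 0.5 for \(h\), and the use of the difference measure \(D\) are all assumptions made for illustration; the only constraints taken from the text are (i), (ii), and the fact that \(h\) rules out non-black ravens.

```python
def joint(r, b, x):
    """Probabilities of the four kinds of object a, under h and under not-h.

    r: P(raven(a)), the same under h and not-h by condition (i)
    b: P(not-raven(a) & black(a)), the same under both by condition (ii)
    x: P(raven(a) & not-black(a) | not-h); this cell is 0 under h
    """
    under_h = {"RB": r, "RnB": 0.0, "nRB": b, "nRnB": 1 - r - b}
    under_not_h = {"RB": r - x, "RnB": x, "nRB": b, "nRnB": 1 - r - b}
    return under_h, under_not_h

def difference(prior_h, cell, under_h, under_not_h):
    """D(h, e) = P(h | e) - P(h), where e says that a falls in the given cell."""
    p_e = prior_h * under_h[cell] + (1 - prior_h) * under_not_h[cell]
    return prior_h * under_h[cell] / p_e - prior_h

# Hypothetical values: ravens are rare, black non-ravens fairly common.
under_h, under_not_h = joint(r=0.01, b=0.2, x=0.005)
d_black_raven = difference(0.5, "RB", under_h, under_not_h)
d_nonblack_nonraven = difference(0.5, "nRnB", under_h, under_not_h)

print(d_black_raven, d_nonblack_nonraven)
```

Because (i) and (ii) fix the probability of a non-black non-raven at \(1 - r - b\) whether or not \(h\) holds, \(e^*\) leaves the probability of \(h\) unchanged, while the black raven raises it for any \(x \gt 0.\)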

In general, Bayesian (relevance) confirmation theory implies that the evidential import of an instance of some generalization will often depend on the credence structure, and relies on its formal representation, \(P\), as a tool for more systematic analyses. Consider another instructive example. Assume that \(a\) denotes some company from some (otherwise unspecified) sector of the economy, and label the latter predicate \(S\). So, \(k = Sa\). You are informed that \(a\) increased revenues in 2019, represented as \(e = Ra\). Does this confirm \(h = \forall x(Sx \rightarrow Rx)\)? It does, at least to some degree, one would say. For an expansion of the whole sector (recall that you have no clue what this is) surely would account for the data. That’s a straightforward HD kind of reasoning (and a suitable Hempelian counterpart reconstruction would concur). But does \(e\) also confirm \(h^* = Sb \rightarrow Rb\) for some further company \(b\)? Well, another obvious account of the data \(e\) would be that company \(a\) has gained market share at the expense of some competitor, so that \(e\) might well seem to support \(\neg h^*,\) if anything (the revenues example is inspired by a remark in Blok, Medin, and Osherson 2007, 1362).

It can be shown that the Bayesian notion of relevance confirmation allows for this pattern of judgments, because (given \(k\)) evidence \(e\) above increases the probability of \(h\) but may well have the opposite effect on \(h^*\) (see Sober 1994 for important remarks along similar lines). Notably, \(h\) entails \(h^*\) by plain instantiation, and so contradicts \(\neg h^*\). As a consequence, the implication that \(C_{P}(h,e\mid k)\) is positive while \(C_{P}(h^*,e\mid k)\) is not clashes with each of the following, and proves them unduly restrictive: the Special Consequence Condition (SCC), the Predictive Inference Condition (PIC), and the Consistency Condition (Cons). Note that these principles were all evaded by HD-confirmation, but all implied by confirmation as firmness (see above).
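
The pattern of judgments in the revenues example can be reproduced in a deliberately simplified model. The sketch assumes, beyond what the text states, that the sector contains just the two companies \(a\) and \(b\), that both \(Sa\) and \(Sb\) are part of the background, and that the credences over the four possible outcomes are as listed; the difference measure \(D\) is used for definiteness.

```python
# Toy credence over whether a and b increased revenues (Ra, Rb).
# All numbers are assumed for illustration.
credence = {
    (True, True): 0.1,    # the whole sector expands
    (True, False): 0.4,   # a gains at b's expense
    (False, True): 0.4,   # b gains at a's expense
    (False, False): 0.1,  # neither increases revenues
}

def prob(event, given=lambda w: True):
    """Conditional probability P(event | given) over the toy outcomes."""
    num = sum(p for w, p in credence.items() if given(w) and event(w))
    den = sum(p for w, p in credence.items() if given(w))
    return num / den

e = lambda w: w[0]              # Ra
h = lambda w: w[0] and w[1]     # every company in the two-member sector: Ra and Rb
h_star = lambda w: w[1]         # Rb, equivalent to Sb -> Rb once Sb is given

d_h = prob(h, e) - prob(h)                  # D(h, e | k)
d_h_star = prob(h_star, e) - prob(h_star)   # D(h*, e | k)
print(d_h, d_h_star)
```

Here \(e\) raises the probability of \(h\) (from 0.1 to 0.2) while lowering that of \(h^*\) (from 0.5 to 0.2), even though \(h\) entails \(h^*\).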

At the same time, the most compelling features of \(F\)-confirmation, which the HD model was unable to capture, are retained by confirmation as relevance. In fact, all our measures of relevance confirmation (\(D, L\), and \(Z\)) entail the ordinal extension of the Entailment Condition (EC) as well as \(C_{P}(h, e\mid k) = -C_{P}(\neg h, e\mid k)\) and thereby Confirmation Complementarity in all of its forms (qualitative, ordinal, and quantitative). Moreover, the Bayesian confirmation theorist of either the firmness or the relevance strand can avail herself of the same quantitative strategy of “damage control” for the main specific paradox of HD-confirmation, i.e., the irrelevant conjunction problem. (See statement (CIC) above, and Crupi and Tentori 2010, Fitelson 2002. Also see Chandler 2007 for criticism, and Moretti 2006 for a related debate.)

We’re left with one last issue to conclude our discussion, to wit, the blite paradox. Recall that \(blite\) is so defined:

\[blite(x) \equiv (ex_{t\le T}(x)\rightarrow black(x)) \wedge (\neg ex_{t\le T}(x)\rightarrow white(x)).\]

As always heretofore, we posit \(h = \forall x(raven(x)\rightarrow black(x)),\) \(h^* = \forall x(raven(x)\rightarrow blite(x)).\) We then consider the setup where \(k = raven(a) \wedge ex_{t\le T}(a),\) \(e= black(a),\) and \(P(e\mid k)\lt 1.\) Various authors have noted that, with Bayesian relevance confirmation, one has that \(P(h\mid k)\gt P(h^*\mid k)\) is sufficient to imply that \(C_{P}(h, e\mid k)\gt C_{P}(h^*,e\mid k)\) (see Gaifman 1979, 127–128; Sober 1994, 229–230; and Fitelson 2008, 131). So, as long as the black hypothesis is perceived as initially more credible than its blite counterpart, the former will be more strongly confirmed than the latter. Of course, \(P(h\mid k)\gt P(h^*\mid k)\) is an entirely commonsensical assumption, yet these same authors have generally, and quite understandably, failed to see this result as philosophically illuminating. Lacking some interesting, non-question-begging story as to why that inequality should obtain, no solution of the paradox seems to emerge. More modestly, one could point out that a measure of relevance confirmation \(C_{P}(h, e\mid k)\) implies (i) and (ii) below.

  1. Necessarily (that is, for any \(P\in \bP\)), \(e\) confirms \(h\) relative to \(k\).
  2. Possibly (that is, for some \(P\in \bP\)), each one of the following obtains:
    • \(e\) confirms that a raven will be black if examined after \(T\), that is, \((raven(b)\wedge \neg ex_{t\le T}(b)) \rightarrow black(b),\) relative to \(k\); and
    • \(e\) does not confirm that a raven will be white if examined after \(T\), that is, \((raven(b)\wedge \neg ex_{t\le T}(b)) \rightarrow white(b),\) relative to \(k\).

Without a doubt, (i) and (ii) fall far short of a satisfactory solution of the blite paradox. Yet it seems at least a legitimate minimal requirement for a compelling solution (if any exists) that it implies both. It is then of interest to note that confirmation as firmness is inconsistent with (i), while Hempelian and HD-confirmation are inconsistent with (ii).