**Epistemic Irrationality in the Bayesian Brain**
Daniel Williams
**Abstract.** A large body of research in cognitive psychology and neuroscience draws on Bayesian statistics to model information processing within the brain. Many theorists have noted that this research seems to be in tension with a large body of experimental results purportedly documenting systematic deviations from Bayesian updating in human belief formation. In response, proponents of the Bayesian brain hypothesis contend that Bayesian models can accommodate such results by making suitable assumptions about model parameters (for example, priors, likelihoods, and utility functions). To make progress in this debate, I argue that it is fruitful to focus not on specific experimental results but rather on what I call the ‘sources of epistemic irrationality’ in human cognition. I identify four such sources and I explore whether and, if so, how Bayesian models can be reconciled with them: (1) processing costs; (2) evolutionary suboptimality; (3) motivated cognition; and (4) error management.
*1. Introduction*
*2. The Bayesian Brain*
*3. The Problem of Epistemic Irrationality*
*3.1. Bayesian inference and rationality*
*3.2. Intuitive Bayesian inference*
*4. Sources of Epistemic Irrationality*
*4.1. Processing costs*
*4.2. Evolutionary suboptimality*
*4.3. Motivational influences*
*4.4. Error management*
*5. Conclusion*
‘While to its advocates the rationality of Bayesian inference is one of its main attractions, to sceptics the hypothesis of rationality inherent in the Bayesian framework seems at best empirically implausible and at worst naïve’ (Feldman [2015], p.1022).
**1 Introduction**
Recent years have seen an explosion of research in cognitive psychology and neuroscience that draws on Bayesian statistics to model inferential processes within the brain (Chater et al. [2010]; Lake et al. [2016]; Knill and Pouget [2004]; Tenenbaum et al. [2011]). The scope and influence of this Bayesian paradigm have generated both excitement and controversy, with some describing this Bayesian ‘revolution’ (Hahn [2014]) as a ‘paradigm shift’ (Friston et al. [2017], p.1) in our understanding of the mind and others criticizing it on both methodological and empirical grounds (Bowers and Davis [2012]; Colombo et al. [2018]; Jones and Love [2011]; Marcus and Davis [2013]). A fundamental source of controversy concerns the optimality assumptions apparently central to Bayesian modelling. Expounding the Bayesian research programme, for example, Chater et al. ([2006], p.289) argue that ‘it seems increasingly plausible that human cognition may be explicable in rational probabilistic terms and that, in core domains, human cognition approaches an optimal level of performance’. As several authors have pointed out, such claims seem to be in tension with a large body of experimental results purportedly documenting systematic bias and suboptimality in human judgement and decision-making (Mandelbaum [2019]; Marcus and Davis [2013]; Williams [2018]). Indeed, in much of the research on human irrationality, Bayes’ theorem is explicitly used as the benchmark against which to identify systematic bias (Eil and Rao [2011]; Gilovich et al. [2002]; Tversky and Kahneman [1974]).
This situation has provoked many responses. Some have concluded that Bayesian cognitive science is at best restricted to modelling perception and motor control, where assumptions of optimality are widely seen to be less controversial (Mandelbaum [2019]; Marcus and Davis [2013]). Others have argued that Bayesian models have the theoretical resources to accommodate many putative examples of bias (Gershman [2019]; Sanborn and Chater [2016]; Tappin and Gadsby [2019]). An enduring critique of this response, however, is that such attempts at reconciliation constitute implausible ‘just-so stories’: even if one can fit a Bayesian model to a judgement or decision by making certain assumptions about model parameters (for example, priors, likelihoods, and utility functions), many theorists worry that such models are typically post hoc and unmotivated, and thus more a reflection of the ingenuity of modellers than the workings of the mind (Bowers and Davis [2012]; Williams [2018]).
My aim in this paper is to make theoretical progress in this controversy by changing the focus of attention. Specifically, rather than attempting to directly reconcile the Bayesian brain hypothesis with experimental results putatively describing non-Bayesian belief updating, my aim is to advance this debate by connecting the Bayesian brain hypothesis to the most significant sources of epistemic irrationality in human cognition—those factors that have been put forward to explain why human cognition exhibits systematic bias in the first place.
Changing the focus of the debate in this way brings several theoretical benefits. First, it helps to clarify the controversy, providing a useful and more principled taxonomy of various sources of biased belief and their different relationships to Bayesian models. Second, it avoids many of the methodological worries surrounding just-so stories: by focusing on the sources of epistemic irrationality rather than putative examples of such irrationality, the ability of Bayesian models to accommodate specific data becomes less relevant. Finally, focusing on distinct sources of epistemic irrationality helps to more clearly identify some of the deepest challenges for the Bayesian brain hypothesis. My aim in this paper is not to argue for or against this hypothesis. Nevertheless, it will become clear in what follows that some sources of epistemic irrationality are much more straightforward to accommodate within Bayesian models than others. Reframing the debate in this way thus helps to clarify the potential theoretical, methodological, and empirical challenges that it confronts.
To make the task of this paper manageable, my focus in what follows is on human belief formation. There are comparable issues concerning questions of optimality in the perceptual and sensorimotor domain (Rahnev and Denison [2018]), but here I will restrict my focus to the domain of cognition, broadly construed, and thus the processes by which agents update beliefs in response to new evidence.
I structure the paper as follows. In Section 2 I briefly introduce the Bayesian brain hypothesis. In Section 3 I introduce what I call the ‘problem of epistemic irrationality’ that it confronts, and I make two brief clarifications: first, even though Bayes’ theorem is defined over subjective probabilities, it is still appropriate to understand it as a norm of epistemic rationality; second, the fact that humans are often bad when it comes to the conscious application of Bayes’ theorem does not itself challenge the Bayesian brain hypothesis. In Section 4 I then identify four core sources of epistemic irrationality in human cognition: processing costs (S4.1), evolutionary suboptimality (S4.2), motivational influences on belief formation (S4.3), and error management (S4.4). In each case I show where a Bayesian reconciliation is straightforward and where it is difficult. I conclude in Section 5 by briefly summarizing the foregoing conclusions and identifying important lessons for the epistemic status of the Bayesian brain hypothesis.
**2 The Bayesian Brain**
Bayes’ theorem follows from the axioms of probability theory and states the following:
p(h|e) = p(e|h)p(h) / p(e)
This theorem can be used to calculate the probability of the truth of possible hypotheses upon receipt of new evidence. If e is a piece of evidence and h is a possible hypothesis for explaining this evidence, Bayes’ theorem states the following: the probability of the hypothesis given the evidence p(h|e) is proportional to how well the hypothesis predicts the evidence—that is, the likelihood p(e|h)—weighted by the probability of the hypothesis considered prior to receiving the evidence—that is, the prior p(h). For example, suppose that you witness somebody coughing (e) and you consider three possible explanations: that they have a cold (h1), that they have lung disease (h2), or that they have heartburn (h3) (see Tenenbaum et al. [2011], p.1280). Only h1 seems reasonable. Why? Because heartburn is a non-starter—it does not produce coughing—and although both colds and lung disease do produce coughing, colds are far more frequent; that is, they have a much higher prior probability. Now suppose that the person coughs up blood. Hypothesis h2 now seems much more likely: even though colds are far more frequent than lung disease, coughing up blood is much more likely to come from lung disease than from a cold. Bayes’ theorem thus provides a procedure ‘for navigating a course between excessive tendencies toward “observational adequacy” (whereby new data is over-accommodated) and “doxastic conservatism” (whereby existing beliefs are over-weighted)’ (McKay and Dennett [2009], p.497).
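To make the arithmetic of this example explicit, here is a minimal sketch (in Python) with purely hypothetical priors and likelihoods; it applies Bayes’ theorem to the cough and then, treating the resulting posterior as the new prior, to the coughed-up blood.

```python
# Hypothetical priors and likelihoods for the coughing example; the numbers
# are invented purely for illustration.
priors = {"cold": 0.90, "lung_disease": 0.001, "heartburn": 0.099}
likelihood_cough = {"cold": 0.5, "lung_disease": 0.9, "heartburn": 0.0}
likelihood_blood = {"cold": 0.0001, "lung_disease": 0.7, "heartburn": 0.0}

def posterior(priors, likelihood):
    """Apply Bayes' theorem: p(h|e) is proportional to p(e|h) * p(h)."""
    unnormalised = {h: likelihood[h] * priors[h] for h in priors}
    z = sum(unnormalised.values())  # p(e), the normalising constant
    return {h: v / z for h, v in unnormalised.items()}

after_cough = posterior(priors, likelihood_cough)
after_blood = posterior(after_cough, likelihood_blood)  # sequential updating
print(after_cough)
print(after_blood)
```

With these made-up numbers, the cold dominates after the first update and lung disease dominates after the second, matching the qualitative reasoning above.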
The posterior probability distribution computed via Bayes’ theorem is neutral on which action an agent should perform. Bayesian decision theory states that rational agents should perform the action which maximizes expected utility (Peterson [2017]). In addition to a probability distribution defined over states of a domain, rational action therefore also requires a gain or loss function that defines the utility of performing possible actions if those states of affairs obtain.^{[1]} According to Bayesian decision theory, selection of action thus depends both on the posterior distribution computed via Bayes’ theorem and the subjective costs and benefits associated with different outcomes.
It is widely held that Bayes’ theorem describes the optimal procedure for inference under uncertainty. There have been many attempts to justify this claim. For example, some point out that Bayes’ theorem is derived from the axioms of probability theory, which can themselves be justified by classic ‘Dutch book’ arguments, which demonstrate ‘that, under fairly general conditions, any gambler whose subjective probabilities deviate from the laws of probability, however slightly, can be mercilessly exploited’ (Chater et al. [2011], p.556). However, classic Dutch book arguments at best support the purely synchronic claim that an agent’s credences at any given time should conform to the probability calculus. The diachronic prescription that agents should conditionalize by replacing p(h) with p(h|e) in conformity with Bayes’ theorem requires an additional justification. There are many attempts to justify this additional prescription, including diachronic Dutch book theorems and arguments that attempt to show that Bayesians will always out-predict non-Bayesians (Joyce [2004]; Icard [2018]). I will not delve into this complex literature here. Instead, I will simply assume the widespread view that Bayes’ theorem describes the optimal procedure for inference under uncertainty.^{[2]} To foreshadow somewhat, however, it is important to note that almost all extant justifications of this view assume that the relevant agent has unlimited time and computational resources (see Icard [2018]).
Bayes’ theorem is applicable whenever an agent or system is required to update hypotheses upon receipt of new evidence under conditions of uncertainty. This situation plausibly characterises much of the information processing in the brain. Perception, for example, involves estimating the state of the distal world based on proximal sensory inputs that are both noisy and ambiguous, and belief fixation involves updating beliefs in response to such perceptual estimates and other noisy and ambiguous informational inputs. Recognition of this fact has led to an enormous and growing body of research attempting to illuminate the brain’s solutions to such inferential problems with Bayesian statistics. As Chater et al. ([2010], p.811) put it,
There has been a recent explosion in research applying Bayesian models to cognitive phenomena. This development has resulted from the realization that across a wide variety of tasks the fundamental problem the cognitive system confronts is coping with uncertainty.
As I will understand it, the Bayesian brain hypothesis is defined by its commitment to the view that inferential processes in the brain conform (approximately, at least) to Bayesian norms (Knill and Pouget [2004]). Thus, the hypothesis itself is silent on how brains implement Bayesian inference (see Chater et al. [2010]; Friston [2012], p.1231). Indeed, much research within Bayesian cognitive science explicitly abstracts away from algorithmic and implementational details, although it does of course place significant constraints on research at those levels (Lake et al. [2016]; Tenenbaum et al. [2011]; see Section 4.1. below for further clarification on these points).^{[3]}
In addition to its ability to accommodate a variety of specific experimental results (see Knill and Pouget [2004]), theorists have proposed many general arguments for the Bayesian brain hypothesis. Feldman ([2015], p.1022), for example, argues that
Bayesian inference is, in a well-defined sense, the best way to solve whatever decision problem the brain is faced with. Natural selection pushes organisms to adopt the most effective solutions available, so evolution should tend to favor Bayes-optimal solutions whenever possible. For this reason, any phenomenon that can be understood as part of a Bayesian model automatically inherits an evolutionary rationale.
Similarly, Mathys et al. ([2011], p.1) write,
Since a Bayesian learner processes information optimally, it should have an evolutionary advantage over other types of agents, and one might therefore expect the human brain to have evolved such that it implements an ideal Bayesian learner.
Other theorists sidestep evolutionary considerations altogether and motivate the Bayesian framework directly from the putative fact that people exhibit extraordinary success in tasks involving inference and decision-making under uncertainty:
If we want to explain how it is that people […] are able to cope so successfully with their highly uncertain world, the norms of probability provide the beginnings of an answer—to the extent that the mind reasons probabilistically, the normative justifications that imply that this is the ‘right’ way to reason about uncertainty […] go some way to explaining how it is that the cognitive system deals with uncertainty with a reasonable degree of success (Chater et al. [2011], p.556).
As we will see, both of these arguments are vulnerable to persuasive criticisms. The most obvious such criticism in the present context, however, is the one with which I began this paper: a highly influential body of research in psychology and behavioural economics suggests that we frequently do not handle uncertainty with much success.
**3 The Problem of Epistemic Irrationality**
The Bayesian brain hypothesis confronts what I will call the ‘problem of epistemic irrationality’. The problem is this: an enormous body of experimental results purportedly identifies systematic biases in human cognition. Indeed, in much of the research identifying examples of biased judgement and decision-making, Bayes’ theorem is explicitly used as the benchmark against which to identify such bias (Gilovich et al. [2002]; Tversky and Kahneman [1974]).
As I noted in Section 1, theorists have developed numerous responses to this problem. Some conclude that it is a deep challenge for proponents of the Bayesian brain hypothesis, whereas others argue that many if not all putative examples of biased belief can in fact be accommodated by Bayesian models. One of the problems with this debate, however, is that Bayesian models are highly flexible, which makes it difficult to adjudicate this controversy by focusing exclusively on experimental results. Of course, if one could extract an individual’s priors and likelihoods prior to their receipt of new evidence and then compare their posteriors against a Bayesian benchmark, testing the Bayesian brain hypothesis would be straightforward. However, this is at best applicable in cases in which individuals have straightforward introspective access to the contents of their beliefs and can verbally articulate them. Even granting that merely asking people about the contents of their beliefs is a reliable way of gauging them—itself a contentious claim—the Bayesian brain hypothesis is supposed to apply to unconscious information processing, so this option is often not available. Thus, in many cases theorists have great latitude in their attribution of priors, likelihoods, and utility functions (Bowers and Davis [2012]).
It is not my intention in this paper to argue that the Bayesian brain hypothesis cannot be tested against experimental data. Indeed, I will draw on some explicit tests below. Nevertheless, given the proven ability of Bayesian models to accommodate such a wide variety of experimental results, and the enduring controversies concerning this theoretical malleability, my aim in this paper is to make progress in this debate by stepping back from putative examples of biased belief to explore the underlying sources of epistemic irrationality in human cognition. First, however, it is important to address two potential points of confusion.
**3.1 Bayesian inference and rationality**
Several theorists have sought to dispel the implications of optimality in discussions of Bayesian models altogether. In his discussion of the Bayesian brain hypothesis, for example, Clark ([2016], pp.40-41) writes that
what researchers find in general is not that we humans are—rather astoundingly—‘Bayes’ optimal’ in some absolute sense […] but rather, that we are often optimal, or near optimal, at taking into account the uncertainties that characterize the information that we actually command […]
Similarly, in their classic review of the Bayesian brain hypothesis, Knill and Pouget ([2004], p.713) argue that ‘humans are clearly not optimal in the sense that they achieve the level of performance afforded by the uncertainty in the physical stimulus’, and then go on to note that
the real test of the Bayesian coding hypothesis is in whether the neural computations that result in perceptual judgments or motor behavior take into account the uncertainty in the information available at each stage of processing.
One way of understanding such claims is as follows: Bayesian inference is defined over subjective probabilities. These subjective probabilities—that is, degrees of belief—need not correspond to actual frequencies or regularities in the environment. Therefore, there is no reason to think that Bayesian processing is optimal. Stated in this way, however, the argument is confused. Of course, Bayes’ theorem does not guarantee veridicality, precisely because it is defined over subjective probabilities. For example, if one’s likelihoods are completely unresponsive to regularities in the world, there is no reason to think that Bayesian updating will lead to truth. For this reason, it is widely accepted that Bayesian updating is not sufficient for epistemic rationality (Joyce [2004]). Nevertheless, it describes a procedure for belief updating, which—under plausible assumptions—is necessary for epistemic rationality.
Despite this, these comments do illustrate an important point: precisely because Bayesian processing is defined over subjective probabilities, inferences or decisions that can look to be irrational or suboptimal might nevertheless arise from Bayes-optimal processing defined over strange or otherwise implausible priors or likelihoods (Gershman [2019]; Tappin and Gadsby [2019]). Indeed, this is one of the things that makes adjudicating controversies surrounding the optimality assumptions of the Bayesian brain hypothesis so difficult, and which motivates my aim in Section 4 of turning to the sources of epistemic irrationality rather than putative instances of it. Thus, when I discuss epistemic irrationality and its sources in what follows, I am talking about the processes underlying belief fixation, not the accuracy of beliefs themselves, although of course epistemically rational belief-fixing procedures are precisely those that lead to accurate beliefs, and any theory that combines epistemically rational belief-forming procedures such as Bayesian updating with highly inaccurate beliefs owes an explanation of how the latter arise.
**3.2 Intuitive Bayesian inference**
A second important point of clarification is this: although my focus is on belief formation, it is not on the conscious application of Bayes’ theorem in deliberate reasoning. The Bayesian brain hypothesis is supposed to apply to neural information processing in general, including information processing within the brains of other animals. It is not supposed to apply to cases in which individuals consciously apply Bayes’ theorem, which is ‘a recent cultural invention that few people become fluent with, and only then after sophisticated training’ (Tenenbaum et al. [2011], p.1280). This is important because some results apparently documenting non-Bayesian belief updating are really results showing that people are bad at consciously applying Bayes’ theorem when presented with explicitly articulated mathematical puzzles (see Oaksford and Chater [2007], pp.13-14). This does not itself undermine the Bayesian brain hypothesis, however. As Oaksford and Chater ([2007], p.14) put it, ‘[E]ven if the mind were a probabilistic… calculating engine, it is by no means necessary that it will be possible to engage that engine’s reasoning abilities with verbally stated probabilistic puzzles’.
More generally, much of contemporary psychology assumes a distinction between ‘System One’ and ‘System Two’ styles of cognition, often characterised as a distinction between ‘intuition’ and ‘reasoning’ (Kahneman [2003]). Although there are various ways of drawing this distinction, and some scepticism about drawing any hard and fast distinction at all, it is widely accepted that intuitive inference is fast, unconscious, and largely effortless, with subjects having no introspective access to the inferential process by which judgements are generated, whereas reasoning is slow, effortful, and guided by conscious attention to the process of inference itself, which the agent voluntarily controls in a way that can be sensitive to explicitly represented norms of reasoning (see Kahneman [2003]; Stanovich [2011]). Almost all extant work within Bayesian cognitive science on human cognition targets the former—intuitive or ‘System One’—style of inference, such that agents are not thought to have any introspective access to the Bayesian algorithms alleged to underpin cognitive processes. Thus, although it is a fascinating question how to understand the process of conscious and effortful deliberative reasoning within a Bayesian brain, it is not a question that I address here.
**4 Sources of Epistemic Irrationality**
My aim in this paper is to step back from specific examples of putative bias in human cognition and focus on the underlying sources of epistemic irrationality. As I am using the expression, ‘sources of epistemic irrationality’ refers to factors that are supposed to explain why human cognition exhibits systematic bias. The motivation for offering such explanations, rather than merely cataloguing putative examples of systematic bias, is that it has seemed plausible to many psychologists and philosophers that it would be selectively advantageous to represent the world accurately (see McKay and Dennett [2009] for a review). As Quine ([1969], p.126) famously put it, ‘Creatures inveterately wrong in their inductions have a pathetic but praiseworthy tendency to die out before reproducing their kind’. If that is right, there must have been strong selective pressure to evolve cognitive mechanisms conducive to generating accurate representations. Because accurate representations cannot be generated by magic, one would thus expect evolution to have designed cognitive systems conducive to generating ‘grounded’ representations—that is, representations formed in a way that is appropriately grounded in the evidence at the organism’s disposal (McKay and Dennett [2009], p.494). Indeed, we have seen that the dual assumptions that evolution selected for optimal inference and that Bayes’ theorem is inferentially optimal provide an important motivation for the Bayesian brain hypothesis (Feldman [2015]; Friston [2012]; Geisler and Diehl [2002]; Mathys et al. [2011]).
From this perspective, any putative examples of systematic epistemic irrationality can seem mysterious. Explanations that identify the sources of epistemic irrationality are thus supposed to dispel this mystery. Rather than identifying examples of systematic bias, they strive to explain why human cognition exhibits systematic bias in the first place.
In this section I focus on four such explanations and I relate them to the Bayesian brain hypothesis. These explanations are not supposed to be exhaustive, although they do plausibly constitute the most influential explanations that have been advanced in the psychological and philosophical literature. They point to the processing costs of cognition (S4.1), evolutionary suboptimality (S4.2), motivational influences on belief formation (S4.3), and error management (S4.4).
**4.1 Processing costs**
By far the most influential explanation of epistemic irrationality in the psychological literature points to the processing costs of cognition (Kahneman [2003]; Simon [1956]). We are limited creatures with finite time, energy, and resources. We must therefore ‘satisfice’ rather than optimise (Simon [1956]). This perspective underlies the research programme on ‘bounded rationality’, which studies the ‘fast and frugal’ heuristics alleged to underlie much of cognition and decision-making—heuristics that are often practically superior to algorithms that employ formal norms of reasoning precisely because they are more cost-effective (Gigerenzer and Selten [2002]; Kahneman [2003]). To take a well-known example, human beings appear to rely on an ‘availability heuristic’ that involves judging the frequency of events based on how easily they come to mind (Kahneman [2003]). Although this heuristic leads to systematic mistakes when such cognitive availability is not a reliable proxy for real-world frequencies, many cognitive scientists argue that it is nevertheless more adaptive than a less mistake-prone system that would require an individual to exhaustively search a mental database of all relevant information whenever forming judgements of frequency or probability (see Mercier and Sperber [2017], p.208).
Superficially, at least, the Bayesian brain hypothesis can seem to be in tension with any recognition of the processing costs of cognition. Bayes’ theorem, after all, does not factor in practical constraints on time, energy, or resources. Indeed, given the presence of large hypothesis spaces and mutually dependent variables, exact Bayesian inference is often extremely slow and sometimes computationally intractable, such that in many real-world situations ‘a fully rational Bayesian mind cannot exist’ (Gigerenzer [2008], p.82). For this reason, many proponents of the research programme of bounded rationality explicitly position themselves in opposition to theories of human cognition that appeal to formal norms of reasoning such as logic and probability theory (Gigerenzer and Selten [2002]; Tversky and Kahneman [1974]). This applies equally to those who regard evidence for the existence of heuristics as an illustration of human irrationality (Kahneman [2003]) and those, such as Gigerenzer ([2008]), who argue that the use of such heuristics *is* rational—or at least ‘ecologically rational’—given the kinds of practical constraints under which we operate and the environments that we inhabit.
It would be a mistake to conclude that there is an irresolvable tension between bounded rationality and the Bayesian brain hypothesis, however. Proponents of Bayesian cognitive science are explicit that their proposal is not that the brain performs *exact* Bayesian inference (Friston [2010]; Penny [2012]; Sanborn and Chater [2016]). As Tenenbaum et al. ([2011], p.1284) put it, ‘The complexity of exact inference in large-scale models implies that… [the brain] can at best approximate Bayesian computations’. There is a large body of research in both artificial intelligence and statistical computing focused on developing algorithms for such approximate Bayesian computations. The two most influential classes of approximations are variational and sampling algorithms (for a review, see Sanborn [2017]; Penny [2012]).
Variational approximations work by defining a simpler set of probability distributions. Because mutual dependence among variables quickly generates a combinatorial explosion, for example, it is common to assume independence between certain variables to reduce computational complexity. Further, probability distributions are often constrained to take a simpler parametric form, such as Gaussian distributions that can be encoded in terms of their means and variances. Given a chosen distance measure such as the Kullback-Leibler divergence, variational algorithms can then be ‘used to find the family member that most closely approximates the complex distribution’ (Sanborn [2017], p.100). Such variational approximations underlie predictive coding, an influential theory of neural information processing in which higher levels of cortical hierarchies generate predictions of activity at lower levels and exploit errors in those predictions for the purpose of learning and inference (Friston [2005]; see Kiefer and Hohwy [2018] for a review).
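To give a concrete sense of the variational idea, the following sketch (in Python, using NumPy, with an invented bimodal target distribution and a crude grid search rather than a proper optimiser) finds the single Gaussian that minimises the Kullback-Leibler divergence to a more complex distribution. Nothing about the particular distributions or the search procedure is drawn from the literature cited above; the point is only to show how a complex distribution gets replaced by the closest member of a simpler parametric family.

```python
import numpy as np

# Grid over which the distributions are evaluated (illustrative).
x = np.linspace(-10, 10, 2001)
dx = x[1] - x[0]

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Target: an awkward bimodal 'posterior' that the simple family cannot match.
p = 0.6 * normal_pdf(x, -2.0, 1.0) + 0.4 * normal_pdf(x, 3.0, 1.5)

def kl_q_p(mu, sigma):
    """Numerically estimate KL(q || p) for a candidate Gaussian q."""
    q = normal_pdf(x, mu, sigma)
    mask = q > 1e-12  # avoid 0 * log(0)
    return np.sum(q[mask] * np.log(q[mask] / p[mask]) * dx)

# Crude grid search over the simpler family (single Gaussians).
candidates = [(mu, sigma) for mu in np.linspace(-5, 5, 101)
              for sigma in np.linspace(0.5, 5, 46)]
best = min(candidates, key=lambda ms: kl_q_p(*ms))
print("best Gaussian approximation: mean %.2f, sd %.2f" % best)
```

Because KL(q || p) heavily penalises placing probability where the target has little, the best single Gaussian tends to lock onto one mode of the bimodal target, which illustrates how much structure a simple approximating family can discard.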
Sampling approximations function differently. Unlike variational algorithms, they are stochastic, and they work by randomly drawing samples from a probability distribution such that the collection of samples can stand in for the much more complex distribution in calculations (Sanborn [2017]). There are formal proofs showing that such approximate calculations become asymptotically correct as one approaches an infinite number of samples, but with finite samples a Bayesian sampler generates systematic and predictable errors (see Sanborn and Chater [2016]). One of the most widely used classes of sampling algorithms in statistics and in cognitive science is Markov chain Monte Carlo, which makes stochastic transitions from an initial state (i.e. a particular set of values for each of the relevant random variables), generating a sequence of states that function as samples from the target distribution (Tenenbaum et al. [2011], p.1284).
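The following is a minimal illustrative sketch of a Markov chain Monte Carlo sampler (the Metropolis-Hastings algorithm, in Python); the toy unnormalised posterior is invented for illustration. It shows how stochastic transitions from an initial state yield a collection of samples that can stand in for the target distribution in later calculations.

```python
import random
import math

def unnormalised_posterior(theta):
    # Toy stand-in for likelihood * prior, known only up to a constant.
    return math.exp(-0.5 * (theta - 1.0) ** 2) * math.exp(-abs(theta) / 2.0)

def metropolis_hastings(n_samples, proposal_sd=1.0, start=0.0):
    samples, current = [], start
    for _ in range(n_samples):
        # Propose a stochastic transition from the current state.
        proposal = current + random.gauss(0.0, proposal_sd)
        accept = min(1.0, unnormalised_posterior(proposal) /
                          unnormalised_posterior(current))
        if random.random() < accept:
            current = proposal
        samples.append(current)  # the collected states serve as samples
    return samples

samples = metropolis_hastings(10000)
# With many samples, summaries of the sample set approximate the posterior.
print("estimated posterior mean:", sum(samples) / len(samples))
```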
Variational and sampling approximations illustrate that very different algorithms can nevertheless qualify as approximately Bayesian. Indeed, I noted above that the Bayesian brain hypothesis itself is best understood as placing constraints on the algorithms used by the brain, not as an algorithm-level thesis itself. This raises a difficult question, however: what are the necessary and sufficient conditions that an algorithm must satisfy to qualify as approximately implementing Bayesian inference? Many proponents of Bayesian cognitive science seem to hold that approximating the output of Bayesian inference in a principled way is sufficient (Sanborn and Griffiths [2010]), whereas others worry that patently non-Bayesian systems such as look-up tables can satisfy this condition in limited domains, and thus advance additional conditions, such as the transferability of priors and likelihoods across tasks and contexts (Maloney and Mamassian [2009]). I will not pretend to settle this extremely complex issue here. Instead, I will sidestep it by restricting my focus to approximation algorithms that are actually used within Bayesian cognitive science and artificial intelligence, and where there is unanimous agreement that they qualify as approximately Bayesian. As noted, this means focusing on variational and sampling approximations (Sanborn [2017]).
Focusing on variational and sampling approximations solves two problems for Bayesian cognitive science. First, it explains how a finite and highly constrained brain might be able to approximate Bayesian inference. Second, and most interestingly from our perspective here, approximation algorithms suggest an attractive explanation of systematic deviations from exact Bayesian inference in human cognition. As Sanborn ([2017], p.98) puts it,
The advantage of this approach is that when these algorithms are used in situations for which they are well-adapted, they make probabilistic cognition achievable, but when they are applied to situations for which they are poorly adapted, they can explain biases in behavior that cannot be explained by probabilistic models alone.
Much of the existing work pursuing this suggestion has focused on sampling algorithms and their ability to account for examples of systematic bias in human cognition (Dasgupta et al. [2017]; Sanborn and Chater [2016]). To take only one obvious example, a notorious characteristic of human judgement and decision-making is its stochasticity, with people frequently giving different answers to the same question, even within short time periods (Gilovich et al. [2002]). This variability is puzzling from the perspective of optimal models of human cognition, but it falls out straightforwardly from Bayesian sampling due to the variability of samples (Sanborn and Chater [2016]). The psychological literature contains several other examples, however, including attempts to illuminate the conjunction fallacy, base-rate neglect, and the unpacking effect as natural consequences of sampling approximations (see Sanborn and Chater [2016] and Sanborn [2017] for an overview). There has been less work directly connecting the nature of variational algorithms to specific biases (although see Hohwy [2017] for some suggestions in this area).
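A toy simulation illustrates the point about stochasticity. Assuming, purely for illustration, that a judgement is read off a handful of samples from a fixed posterior, the same question receives different answers on different occasions even though nothing about the underlying distribution changes:

```python
import random

def sampled_judgement(n_samples):
    """Estimate p(x > 1) from a few samples drawn from a fixed posterior."""
    samples = [random.gauss(0.8, 1.0) for _ in range(n_samples)]
    return sum(s > 1.0 for s in samples) / n_samples

# The posterior never changes, yet small-sample judgements vary run to run.
for trial in range(5):
    print("trial %d: estimated p(x > 1) = %.2f" % (trial, sampled_judgement(5)))
```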
The appeal to approximate Bayesian inference demonstrates at least in principle that ‘the view of cognition as satisficing need not be viewed as opposed to the Bayesian approach. Rather, sampling provides a mechanism for satisficing in real-world environments’ (Sanborn and Chater [2016], p.869). Nevertheless, the project of appealing to approximate Bayesian inference to explain systematic biases in human cognition at best constitutes a live research programme, and it remains unclear how many examples of biased belief can be accounted for in terms of variational or sampling approximations. For this reason, it would be premature to evaluate the success or failure of this research programme as a way of accommodating bounded rationality at this point in time.
Before concluding, however, it is worth identifying a deeper theoretical issue with this research programme that has received insufficient attention from proponents of the Bayesian brain hypothesis. The issue is this: most formal arguments attempting to demonstrate the optimality of Bayesian inference assume exact Bayesian inference (see Joyce [2004]). Even if one accepts such arguments, it therefore need not follow that approximate Bayesian inference will be superior to a variety of possible heuristics once one factors in practical constraints on human cognition. Indeed, in general it seems likely that what is optimal in real-world settings will be highly dependent on a set of contingent facts—the exact nature of the practical limitations, the task, the structure of the environment, and so on—such that the optimal strategy under practical constraints might be highly context-variable (Gigerenzer [2008]). If that is right, however, then it partially undermines one of the deepest motivations for the Bayesian brain hypothesis: namely, that because Bayes’ theorem is optimal it provides a useful starting point for modelling cognition (see Chater et al. [2010]; Feldman [2015]; Friston [2012]; Geisler and Diehl [2002]). That is, even if one thinks that rationality assumptions do provide a useful framework for modelling cognition, once one factors in practical constraints and thus focuses on ‘ecological rationality’ (Gigerenzer [2008]) or ‘resource rationality’ (Lieder and Griffiths [2019]), it is no longer clear why Bayesian algorithms—even approximate Bayesian algorithms—should take centre stage in the cognitive modeller’s toolkit. As Icard ([2018], p.85) puts it, ‘[W]hy should we expect an approximation to Bayesian inference to be more rational than any number of alternative models that do not in any straightforward sense approximate Bayesian calculations?’
The upshot of this worry is that even though Bayesian models can in principle accommodate the processing costs of cognition through various approximations to Bayesian inference, such processing costs nevertheless partially undermine any *general* motivation for the Bayesian brain hypothesis in terms of considerations of rationality and optimality. Of course, one way to avoid this problem is to demonstrate that approximate Bayesian inference *is* superior to various non-Bayesian heuristics in specific contexts, and there has been some work attempting to demonstrate just that (see, for example, Icard [2018]). This work must be undertaken in a piecemeal fashion, however, because the fact that approximate Bayesian inference is optimal under some practical constraints does not imply that it is optimal under all possible practical constraints.
*Contra* many in the psychological and philosophical literature, then, the processing costs of cognition entail that there is no simple argument from the optimality of Bayesian inference to the plausibility of the Bayesian brain hypothesis. Crucially, however, this hypothesis confronts an additional problem: even if approximate Bayesian inference is optimal in certain circumstances, it is still not clear why we should expect evolution to have converged on optimal solutions anyway. This brings us to a second important source of epistemic irrationality.
**4.2 Evolutionary suboptimality**
In Section 2 I briefly described the claim that the Bayesian brain hypothesis has an automatic ‘evolutionary rationale’ in virtue of the alleged facts (i) that Bayesian inference is optimal and (ii) that we should expect evolution to have selected for optimal solutions to the problems that organisms confront. The first source of epistemic irrationality just outlined challenges the first of these alleged facts: once one factors in the processing costs of cognition, it is not clear that Bayesian inference—even approximate Bayesian inference—is optimal. A second important source of epistemic irrationality threatens the second of these putative facts: even if one can demonstrate that in certain contexts approximate Bayesian inference is optimal, evolution rarely gives rise to optimal systems even once one factors in practical constraints on individual organisms (see Jacob [1977]; Marcus [2009]).
There are various reasons for evolutionary suboptimality (for an overview, see Dawkins [1986]; Marcus [2009]). First, natural selection is constrained to operate on the products of previous selection, such that solutions to problems must always repurpose the solutions to previous problems. Second, natural selection is dependent on the emergence of appropriate mutations: even when there is selective pressure towards some solution, appropriate genetic variants must arise to be selected for, and there is no guarantee that this will happen. Finally, evolution is a hill-climbing optimization process that can easily get stuck in what computer scientists call ‘local maxima’: even if there is a superior solution to a problem, there is no way for evolution to converge on that solution if this requires that the relevant organisms traverse a ‘fitness valley’ in which initial movements towards that solution result in fitness decreases that are selected against (Dennett [1995]).
Such evolutionary constraints imply that systematic deviations from epistemic rationality in human cognition can emerge as a general consequence of the broader fact that evolution does not invariably give rise to optimal systems of any kind. Just as the human spinal column is not the optimal solution to the problem of walking upright but rather a notoriously faulty ‘kluge’ haphazardly pieced together from the anatomy of our quadrupedal ancestors, many aspects of human cognition might be similar (Marcus [2009]). For example, Marcus ([2009]) argues that a variety of cognitive biases, including confirmation bias and framing effects, emerge from the fact that humans are forced to rely upon a ‘context-dependent’ memory system shared with other vertebrates because there was no way for a superior location-addressable memory to have evolved in our lineage.
Once more, such considerations present a challenge to the Bayesian brain hypothesis. Indeed, evolutionary suboptimality presents a fundamental challenge to the more general methodology of ‘rational analysis’ in cognitive science (Anderson [1990]), which draws on assumptions of optimality and rationality to constrain psychological theorising, and which has played an important role in motivating Bayesian models of cognition, either explicitly (Oaksford and Chater [2007]) or implicitly (see Icard [2018]). And—as we have seen—this applies equally to contemporary forms of rational analysis that are sensitive to the processing costs of cognition (Gigerenzer [2008]; Lieder and Griffiths [2019]). Evolution does not invariably give rise to optimal satisficers.
The obvious response to this challenge is to argue that Bayesian models are in principle independent of any assumptions about the evolutionary process. On this view, the truth of the Bayesian brain hypothesis is exclusively dependent on whether it can illuminate the workings of the mind/brain. Evolutionary considerations are irrelevant.
This response likely underestimates the significance of evolutionary suboptimality, however. Of course, the ultimate success of the Bayesian brain hypothesis must be evaluated in light of its ability to illuminate neural information processing. Nevertheless, we have already seen that Bayesian models are highly flexible. Indeed, there are formal arguments demonstrating that any judgement can be modelled in terms of Bayesian inference by making suitable assumptions about priors and likelihoods (Wald [1947]). It is this fact which invites the worry of Bayesian just-so stories. Thus, mere consistency with data is insufficient. One important independent rationale for Bayesian models in light of this fact is precisely their optimality, which explains why considerations of optimality are pervasive in the literature within Bayesian cognitive science (Friston [2010]; [2012]; Oaksford and Chater [2007]). Once one abandons the assumption that considerations of optimality or rationality should constrain cognitive theorizing, it becomes much less clear what the initial motivation for the Bayesian brain hypothesis is, or why Bayesian models enjoy the level of influence that they do within contemporary cognitive science.
Once again, these considerations do not directly threaten the truth of the Bayesian brain hypothesis. However, they do threaten the view widespread among proponents of the Bayesian brain hypothesis that assumptions about optimality should motivate the search for Bayesian models (see Chater et al. [2010]; Feldman [2015]; Friston [2012]; Oaksford and Chater [2007]). That is, contra Feldman’s quote above, the Bayesian brain hypothesis does not carry an automatic ‘evolutionary rationale’, even if approximate Bayesian inference is the optimal solution to the inferential problems that we solve.
**4.3 Motivational influences**
A third important source of epistemic irrationality in human cognition comes from motivational influences on belief formation: the way in which our motives—our desires, goals, emotions, and so on—causally influence how we seek out and process information. Such motivational influences are reflected in many expressions of commonsense psychology—'wishful thinking’, ‘burying your head in the sand’, ‘denial’, ‘drinking your own kool aid’, and so on—and there is a large body of influential research in psychology and the social sciences detailing their importance as an underlying source of epistemic irrationality (Kahan [2017]; Kunda [1990]; Sharot and Garrett [2016]).
Motivational influences on belief formation can seem to be in sharp tension with the Bayesian brain hypothesis (Williams [2018]). Consider the ‘good news/bad news effect’, for example. In an influential study, Eil and Rao ([2011]) first identified subjects’ prior opinions about their relative IQ and physical attractiveness and then exposed them to novel evidence that bore directly on such beliefs. Eil and Rao ([2011], p.116) describe the results of this experiment as follows:
[S]ubjects incorporated favourable news into their existing beliefs in a fundamentally different manner than unfavourable news. In response to favourable news, subjects tended to […] adhere quite closely to the Bayesian benchmark […] In contrast, subjects discounted or ignored signal strength in processing unfavourable news, which led to noisy posterior beliefs that were nearly uncorrelated with Bayesian inference.
How should a proponent of the Bayesian brain hypothesis respond to such phenomena? One response is of course simply to deny the existence of motivational influences on cognition altogether. Several proponents of the Bayesian brain hypothesis have pointed out that many putative cases of motivated cognition can in fact be accommodated within Bayesian models given suitable assumptions about priors, likelihoods, and so on (Gershman [2019]; Tappin and Gadsby [2019]). For example, Gershman ([2019], p.19) argues that the good news/bad news effect can be accommodated even if ‘people are being fully Bayesian’, if one assumes that subjects also subscribe to an ‘auxiliary hypothesis’ that negative feedback comes from ‘invalid evidence sources’. Nevertheless, this response is quite extreme: both commonsense and everyday observation attest to the biasing influence of an individual’s motives on thought, and there is an enormous empirical literature documenting examples of motivated cognition (see Kunda [1990] for a review). Of course, perhaps all such evidence is erroneous, or better explained without positing motivational influences, but the probability that this is true in all cases seems low.
Another response is to concede that an individual’s motivations can interfere with otherwise Bayesian inferential processes. On this view, although Bayesian models describe a fundamental form of information processing in the brain, motivational influences can disrupt such Bayesian processing, generating systematic deviations from Bayesian inference. For example, after criticizing the Bayesian brain hypothesis by drawing on results within the field of cognitive dissonance theory—results that can be understood in terms of motivated cognition (Kunda [1990])—Mandelbaum ([2019], p.154) allows for the possibility that for beliefs that one does not identify with (beliefs ‘that are very disconnected from the self’) ‘one can update in the way Bayesians predict’. On this view, then, the Bayesian brain hypothesis describes inferential processing in cases in which the individual’s motives do not play an interfering role.
A final response—perhaps consistent with this second one—attempts to incorporate motivational influences into Bayesian models. Within contemporary economics, for example, motivational influences are typically understood in terms of ‘belief-based utility’, which names the putative fact that our beliefs do not merely inform our efforts to satisfy our preferences but are sometimes targets of our preferences (see Bénabou and Tirole [2016]). In the vocabulary of rational choice theory, beliefs sometimes enter directly into an individual’s utility function. According to this work, it is such belief-based preferences that underpin the motivational influences on cognition in the first place, leading agents to conform information processing to the goal of satisfying such belief-based preferences (Bénabou and Tirole [2016]). Framing things in this way thus suggests an interesting possibility: perhaps one can use Bayesian decision theory to model motivated cognition as a consequence of expected utility maximization, which is itself underpinned by Bayesian updating (Mobius et al. [unpublished]). The only difference from ordinary decision-making is that the relevant decisions concern how to process information, the relevant preferences concern beliefs, and the relevant expectations concern the consequences of forming certain beliefs. Sharot and Garrett ([2016], p.28) seem to have this idea in mind when they claim that ‘learning asymmetries can be explained with a Bayesian model, if that model accounts for the fact that agents derive utility from beliefs *per se*’.
The only example I am aware of that develops this view in any depth comes from Mobius et al. ([unpublished], p.2), who introduce the concept of a ‘biased Bayesian’:
We suppose that the agent is a “biased Bayesian” updater who uses Bayes’ rule to process information but decides at an initial stage how to interpret the informativeness of signals and how to value information, taking into account the competing demands of belief utility and decision-making. When the weight placed on belief utility is zero, the model reproduces “perfect” (unbiased) Bayesian updating.
According to this view, then, agents (or subpersonal cognitive systems) make decisions about how to seek out and process information, and these decisions are sensitive to the utility associated with certain beliefs. When beliefs are evaluatively neutral, agents thus decide to update beliefs as ‘unbiased’ Bayesians. When beliefs are sources of utility, however, they process information in a way that is sensitive to such belief-based preferences. For example, the reliability of evidence that challenges motivated beliefs might be automatically downgraded, enabling agents to uphold cherished beliefs in the face of apparently disconfirming evidence. Indeed, there is some evidence that this is exactly what people do (Kahan [2017]). Consequently, Bayesian ‘agents who derive utility directly from their beliefs… will exhibit a range of distinctive and measurable biases in both the way they acquire and the way they process information’ (Mobius et al. [unpublished], p.3).
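The following sketch is not Mobius et al.’s actual model but a toy illustration of the general idea, with an invented discounting rule and parameters: the agent updates by Bayes’ rule, but the perceived informativeness of bad news is discounted in proportion to a weight on belief-based utility.

```python
def biased_update(prior, informativeness, signal_is_good, w):
    """One 'biased Bayesian' update on a binary signal about a flattering belief.

    w is the weight on belief utility: w = 0 reproduces unbiased updating,
    while higher w discounts the informativeness of unfavourable signals.
    The discounting rule is hypothetical, chosen only for illustration.
    """
    if signal_is_good:
        q = informativeness                      # good news taken at face value
    else:
        q = 0.5 + (1 - w) * (informativeness - 0.5)  # bad news discounted toward noise
    likelihood_true = q if signal_is_good else 1 - q
    likelihood_false = (1 - q) if signal_is_good else q
    numerator = likelihood_true * prior
    return numerator / (numerator + likelihood_false * (1 - prior))

prior = 0.5   # prior belief, e.g. "my relative IQ is above average"
q = 0.8       # true informativeness of the feedback signal
print(biased_update(prior, q, signal_is_good=True,  w=0.0))  # unbiased: 0.8
print(biased_update(prior, q, signal_is_good=False, w=0.0))  # unbiased: 0.2
print(biased_update(prior, q, signal_is_good=False, w=0.7))  # bad news moves belief less
```

On these invented numbers, favourable and unfavourable signals move an unbiased agent’s belief symmetrically, whereas the agent who places weight on belief utility updates far less in response to bad news, reproducing the qualitative shape of the good news/bad news effect.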
Setting aside some formidable technical issues concerning how to formally model this proposal, is the proposal itself plausible? Once more, it is probably best understood as an active research programme at this point and it is an open question whether all forms of motivated cognition can be captured in this way. Nevertheless, I think that there are three general worries that one might raise concerning the project.
First, by drawing on Bayesian decision theory to model the formation of beliefs that would not have been arrived at through Bayesian inference alone, one might worry that this project preserves a Bayesian account of belief formation in name but not in spirit. It is not clear how pressing this worry is, however. After all, proponents of the Bayesian brain hypothesis are already committed to the existence of utility functions, and this account neatly illuminates how they can accommodate an atypical form of belief formation without abandoning the defining commitments of the hypothesis.
Second, the mere fact that one can model motivational influences on belief formation in this way is of course no reason to think that one should, and one might worry that the only motivation for doing so is to save Bayesian models of cognition. Again, however, this worry is not very compelling. If one thinks that there is independent reason to endorse a general model of neural information processing in terms of Bayesian inference, it is not unreasonable to see whether this model can also accommodate motivational influences. Further, there is evidence that many examples of motivated cognition are practically rational. For example, several theorists point out that motivated cognition in the political domain is highly rational: because one’s individual vote and actions have a negligible impact on political decision-making, there is little practical incentive to hold true beliefs; on the other hand, there are often powerful practical incentives to hold ungrounded beliefs when they are personally rewarding or socially incentivised (see Kahan [2017]). An influential view in social science is that political agents respond to such incentives in the way in which they seek out and process information—that they exhibit what Caplan ([2001]) calls ‘rational irrationality’. Thus, it is not absurd to think that agents do in fact make rational (albeit unconscious) decisions about the expected value of motivated cognition in different contexts (see Caplan [2001]; Kahan [2017]), and Bayesian decision theory can illuminate this phenomenon.
A more serious objection is that this framework depicts a form of information processing that is too computationally demanding to be plausible. Specifically, it suggests that in addition to first-order processes of belief formation, agents also engage in a higher-order form of expected utility maximization in which they flexibly update first-order processes in response to expectations about the relative costs and benefits of different beliefs in different contexts. Even if this is possible, one might worry that it is not very psychologically plausible. Nevertheless, it is once again difficult to evaluate this worry, not least because proponents of the Bayesian brain hypothesis might argue that the computations underlying motivated reasoning are themselves implemented by approximate inferential processes. Ultimately whether this project can succeed is thus a complex empirical question, and it would be premature to attempt to answer this question at this stage. What is clear, however, is that the relationship between Bayesian brains and motivated cognition is much more complicated than many in the psychological and philosophical literature give it credit for.
**4.4 Error management**
A fourth influential explanation of epistemic irrationality points to a simple fact: different representational errors are associated with different costs. In some contexts, for example, false positives generate only moderate costs whereas false negatives generate extremely high costs. Consider predator detection: mistakenly indicating the presence of a predator and failing to indicate the presence of a predator are both mistakes, but the latter mistake is much costlier. Indeed, it will likely cost an organism its life. Given this asymmetry, many theorists have argued that evolution can sometimes select for biased systems over unbiased systems if the former are better at avoiding costlier errors (see Haselton and Buss [2000]; Johnson et al. [2013]; Stich [1990]). As Stich ([1990], p.62) puts it,
A very cautious, risk-aversive inferential strategy – one that leaps to the conclusion that danger is present on very slight evidence – will typically lead to false beliefs more often, and true ones less often, than a less hair-trigger one that waits for more evidence before rendering a judgment. Nonetheless, the unreliable, error-prone, risk-aversive strategy may well be favored by natural selection […] [F]rom the point of view of reproductive success, it is often better to be safe (and wrong) than sorry.
This insight forms the basis of Error Management Theory, which focuses on the way in which asymmetries in the costs of different representational errors have selected for biased cognitive mechanisms (Haselton and Buss [2000]). Error Management Theory has been an extremely productive research programme, offering purported explanations of everything from the fact that men typically overperceive sexual interest in women to the fact that human beings routinely overattribute agency to natural phenomena, a feature of human psychology widely held to be central to the emergence of religious beliefs (see Johnson et al. [2013]). As the quote from Stich illustrates, many have concluded that the evolutionary demands of effective error management have selected for systems of belief formation that are systematically biased away from truth (Stich [1990]). Thus, one might worry that the results of Error Management Theory are in tension with the Bayesian brain hypothesis. Overestimating the presence of predators may not be Bayes-optimal, but it may nevertheless be adaptive (see Haselton and Buss [2000]).
This conclusion is mistaken, however. In fact, the basic insight of Error Management Theory is straightforward to integrate into Bayesian models of cognition (see McKay and Efferson [2010]). To see this, note that one interpretation of this insight is a simple consequence of classic decision theory: decision-making is not merely sensitive to the relative probability of different outcomes but to their expected utility—the relevant costs and benefits of potential actions as weighted against their probability of occurring (Peterson [2017]). As we saw in Section 2, Bayesian decision theory goes beyond the inference problem of computing posterior probability distributions. To act, an agent also requires a utility function identifying the subjective costs and benefits of performing certain actions if certain states of the world obtain. To take a simplified example, suppose that in the case of predator detection there are two potential worldly states, S1 (predator) and S2 (no predator), and two potential actions, A1 (run) and A2 (don’t run). Bayes’ theorem enables the organism to update the probability that it assigns to S1 and S2 based on new evidence. This probability assignment is itself neutral on which action the organism should take, however. Thus, even if the organism assigns an extremely low probability to S1 (predator), A1 (run) might still be rational if the expected cost of A2 (don’t run) is high enough, which it often will be given the high costs of A2 *if* S1 obtains.
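A minimal numerical sketch makes the point explicit. The probabilities and utilities below are hypothetical numbers chosen only to show how asymmetric costs can make fleeing the expected-utility-maximising action even when the posterior probability of a predator is low.

```python
# Posterior probability of S1 (predator) after Bayesian updating (hypothetical).
p_predator = 0.05

# utility[action][state]: payoff of each action in each state (illustrative).
utility = {
    "run":      {"predator": -10,   "no_predator": -10},  # cost of a wasted flight
    "dont_run": {"predator": -1000, "no_predator": 0},    # being eaten is far costlier
}

def expected_utility(action, p):
    u = utility[action]
    return p * u["predator"] + (1 - p) * u["no_predator"]

for action in utility:
    print(action, expected_utility(action, p_predator))
# 'run' yields -10 and 'dont_run' yields -50, so running maximises expected
# utility despite the low posterior probability of a predator.
```

The bias here is located entirely in the decision criterion for action; the posterior itself remains unbiased, which is exactly the point made by McKay and Dennett below.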
The upshot is that the consequences of asymmetries in the costs of different errors can be managed at the level of action rather than at the level of belief. As McKay and Dennett ([2009], p.501) put it, '[T]endencies to "overestimate" the likelihood that food is contaminated… may reflect judicious decision criteria for action rather than misbeliefs'. Thus, Bayesian decision theory can in principle accommodate the lessons of Error Management Theory without revision. Importantly, however, this fact says nothing about whether organisms do in fact cope with asymmetries in the costs of different errors in this way. And this suggests another way of understanding the claims made by thinkers such as Stich: even if it is possible to manage asymmetries in the costs of different representational errors at the level of action, it is implausible that this is the solution that evolution favoured. One reason for thinking this, again, is that the process of minimizing expected costs is computationally demanding: even if it would be optimal given unlimited time and resources, evolution likely selected for simpler strategies. Johnson et al. ([2013], pp.2-3) advance just this objection:
If one could accurately estimate the costs and probabilities of events, one could manage errors through ‘cautious-action policies’ (one form of which is Bayesian updating) […] To understand why we need a cognitive bias […] we need to pay attention to cognitive constraints (the limitations of brains as decision-making machines) and evolutionary constraints (natural selection as an imperfect designer).
This challenge therefore returns us to the processing costs of cognition and evolutionary constraints discussed above. As before, then, whether or not proponents of the Bayesian brain hypothesis can answer this challenge is an empirical question. I am not aware of any direct research attempting to link approximate Bayesian inference to the lessons of Error Management Theory, but connecting the two research programmes is a crucial task for those working within Bayesian cognitive science.
Before concluding this section, it is worth highlighting another area in which differences in the costs of representational errors have been proposed as an explanation of certain forms of biased cognition. This proposal focuses on the following fact: there are important contextual differences in the degree to which an agent is punished for holding false beliefs. In most everyday contexts, false beliefs are likely to undermine practical success, leading to misguided inferences and actions that are not responsive to the state of the world. In some contexts, however, there is little personal risk associated with being wrong, either because one is unlikely ever to act on the relevant belief or because the belief concerns phenomena that one has little ability to influence. As I noted above (Section 4.3), it has long been argued that political beliefs provide an important example of this: because individuals have a negligible impact on political decision-making, they have little practical incentive to hold true beliefs (Caplan [2001]). Philosophers, psychologists, and social scientists have long claimed that individuals are more likely to lapse into error in such contexts. Descartes ([[1637] 1999], p.6), for example, observed that he 'could find much more truth in the reasonings that each person makes concerning matters that are important to him, and whose outcome ought to cost him dearly later on if he judged badly'.
What resources does a proponent of the Bayesian brain hypothesis have to accommodate this putative phenomenon? This is a complex question that cannot be fully addressed here. Nevertheless, the considerations adduced so far suggest two important strategies.
First, we have already seen that the processing costs of cognition imply a trade-off between accuracy and efficiency. When there are high personal costs associated with being wrong, individuals might expend more energy in seeking out novel information and sampling novel hypotheses. Such differential sampling is not inconsistent with the Bayesian brain hypothesis. Thus, the fact that Bayesian brains would be expected to expend less effort in contexts where false beliefs carry few personal risks might help to explain the predominance of false and ungrounded beliefs in domains such as religion and politics (Downs [1957]).
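For illustration only, here is a minimal sketch of the idea in Python (the function names, the sampling scheme, and the stakes-to-samples rule are my own assumptions, not drawn from any particular model in the literature): an agent that approximates its posterior by sampling can trade accuracy against effort simply by adjusting how many samples it draws, and a stakes-sensitive policy would draw more samples when the personal cost of error is high.

```python
import random

def posterior_mean_by_sampling(prior_sampler, likelihood, n_samples):
    """Self-normalized importance sampling with the prior as proposal.

    Draws n_samples hypotheses from the prior and weights each by its
    likelihood given the observed data; the weighted average approximates
    the posterior mean. More samples buy a better approximation at a
    higher computational cost.
    """
    hypotheses = [prior_sampler() for _ in range(n_samples)]
    weights = [likelihood(h) for h in hypotheses]
    total = sum(weights)
    if total == 0:
        return None  # none of the sampled hypotheses fit the data at all
    return sum(h * w for h, w in zip(hypotheses, weights)) / total

def sample_budget(cost_of_error, base=10, per_unit_cost=2):
    """Hypothetical stakes-sensitive policy: sampling effort grows with
    the personal cost of being wrong."""
    return base + int(per_unit_cost * cost_of_error)

if __name__ == "__main__":
    # Toy problem: infer a coin's bias from 8 heads in 10 flips.
    random.seed(0)
    heads, flips = 8, 10
    prior = lambda: random.random()                       # uniform prior over bias
    lik = lambda p: (p ** heads) * ((1 - p) ** (flips - heads))
    for stakes in (0, 50):                                # low- vs high-stakes context
        n = sample_budget(stakes)
        print(stakes, n, posterior_mean_by_sampling(prior, lik, n))
```

Run on this toy problem, the low-stakes estimate rests on far fewer samples and is correspondingly less reliable than the high-stakes one; nothing in the sketch departs from Bayesian principles, which is the point.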
Second, and relatedly, we saw above that Bayesian models can plausibly accommodate motivational influences under the assumption that beliefs themselves are positive sources of utility. Under conditions in which there are strong motivations for holding certain beliefs, because they are comforting or socially rewarded, for example, and little personal risk associated with false belief, a rational utility-maximizing agent might therefore calibrate the way in which it processes information to the goal of satisfying such belief-based preferences rather than to the goal of arriving at the truth. Indeed, this assumption is central to work within economics on belief-based utility. As Bénabou and Tirole ([2016], p.150) put it, 'Beliefs for which the individual cost of being wrong is small are more likely to be distorted by emotions, desires, and goals' (see Williams [forthcoming]). In this framework, then, managing the different costs of different errors might itself be a consequence of rational decision-making, with Bayesian brains flexibly calibrating their information processing to both the costs and benefits of holding evidentially unsupported beliefs in different contexts.
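The logic can be stated schematically (the notation here is mine and purely illustrative; it is not a reconstruction of Bénabou and Tirole's formal model). Suppose the agent's overall utility from holding belief $b$ when the truth is $b^{*}$ is

$$
U(b) \;=\; V(b) \;-\; c \cdot L(b, b^{*}),
$$

where $V(b)$ captures the psychological or social payoff of holding $b$ (comfort, group approval), $L(b, b^{*})$ the material loss from acting on a belief that diverges from the truth, and $c$ the personal stakes attached to acting on it. When $c$ is large, the loss term dominates and the utility-maximizing belief tracks $b^{*}$; when $c$ is close to zero, as in many political and religious domains, $V(b)$ dominates and evidentially unsupported beliefs can be the rational choice in precisely the sense at issue here.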
Is this proposal plausible? Again, I do not want to take a stand on this issue here. Nevertheless, I hope that these schematic suggestions illustrate the ability—at least in principle—of Bayesian models to accommodate a variety of superficially irrational phenomena.
**5 Conclusion**
Recent decades have witnessed a Bayesian revolution in psychology. As with other major psychological research programmes such as connectionism, evolutionary psychology, and embodied cognition, Bayesian cognitive science has invited not just empirical scrutiny but also substantial methodological and philosophical controversy over its foundational assumptions. The concepts of rationality and optimality have in many respects been at the epicentre of such controversies. In this paper I have sought to clarify and advance debates in this area by drawing attention away from specific examples of human irrationality towards the underlying sources of epistemic irrationality. This shift of focus affords a deeper and more principled analysis of the relationship between rationality and the Bayesian brain, largely sidesteps issues about the theoretical malleability of Bayesian models in the face of experimental data, and brings into clearer focus some of the real challenges that Bayesian cognitive science confronts when it comes to the problem of epistemic irrationality.
There are two general lessons that can be extracted from the foregoing analysis. First, the appearance of a simple tension between Bayesian brains and epistemic irrationality is illusory. Approximate Bayesian inference offers a potential way of accommodating biases that emerge from the processing costs of cognition, and reorienting attention towards the utility assigned to beliefs and the different costs associated with different errors in different contexts reveals that many apparent forms of irrationality can be accommodated with the resources of Bayesian decision theory. Whether it is possible to reconcile Bayesian brains with all examples of biased belief in such ways is currently an unanswered empirical question. I hope that the present paper helps to clarify what this question involves, however, and provides some additional motivation to pursue it.
Second, and just as importantly, the widespread view that the optimality of Bayesian inference in and of itself provides a motivation for the Bayesian brain hypothesis is also illusory. As we have seen, the severe processing costs of cognition entail that cognitive systems must rely on approximate Bayesian inference rather than exact Bayesian inference, and there is little reason to think that the former will always be optimal. Worse, even when approximate Bayesian inference does provide the optimal procedure for satisficing in certain contexts, there is little reason to think that evolution would have converged on such optimal solutions anyway.
Of course, in some sense these conclusions only serve to make the epistemic status of the Bayesian brain hypothesis more complicated. But that is the point. The connections between human rationality, Bayesian inference, practical constraints, and evolution *are* complicated, and discussions in this area should reflect this fact.
**Acknowledgements**
I would like to thank Stephen Gadsby, Ben Tappin, Phil Corlett, Bence Nanay, Jakob Hohwy, and Marcella Montagnese for helpful discussion surrounding the topic of this article, as well as two anonymous reviewers for helpful comments and suggestions. This work was generously supported by the Fonds voor Wetenschappelijk Onderzoek (FWO) Odysseus grant [G.0020.12N] and the Fonds voor Wetenschappelijk Onderzoek (FWO) research grant [G0C7416N].
*Daniel Williams*
*Centre for Philosophical Psychology*
*University of Antwerp*
*Antwerp, Belgium*
**References**
Anderson, J. R. [1990]: *The Adaptive Character of Thought*, Hillsdale, NJ: Erlbaum Associates.
Bénabou, R. and Tirole, J. [2016]: 'Mindful Economics: The Production, Consumption, and Value of Beliefs', *Journal of Economic Perspectives*, **30(3)**, pp.141-164.
Bowers, J. S. and Davis, C. J. [2012]: 'Bayesian Just-so Stories in Psychology and Neuroscience', *Psychological Bulletin*, **138(3)**, pp.389-414.
Caplan, B. [2001]: 'Rational Ignorance versus Rational Irrationality', *Kyklos*, **54(1)**, pp.3-26.
Chater, N., Oaksford, M., Hahn, U., and Heit, E. [2010]: ‘Bayesian Models of Cognition’, *Wiley Interdisciplinary Reviews: Cognitive Science*, **1(6)**, pp.811-823.
Chater, N., Oaksford, M., Hahn, U., and Heit, E. [2011]: 'Inductive Logic and Empirical Psychology', in *Handbook of the History of Logic, Vol. 10*, London: North-Holland.
Chater, N., Tenenbaum, J. and Yuille, A. [2006]: 'Probabilistic Models of Cognition: Conceptual Foundations', *Trends in Cognitive Sciences*, **10(7)**, pp.287-291.
Colombo, M. and Hartmann, S. [2015]: ‘Bayesian Cognitive Science, Unification, and Explanation’, *The British Journal for the Philosophy of Science*, **68(2)**, pp.451-484.
Colombo, M., Elkin, E. and Hartmann, S. [2018]: ‘Being Realist about Bayes, and the Predictive Processing Theory of Mind’, *The British Journal for the Philosophy of Science*, axy059, https://doi.org/10.1093/bjps/axy059.
Colombo, M. and Seriès, P. [2012]: 'Bayes in the Brain—on Bayesian Modelling in Neuroscience', *The British Journal for the Philosophy of Science*, **63(3)**, pp.697-723.
Dasgupta, I., Schulz, E. and Gershman, S. J. [2017]: ‘Where Do Hypotheses Come From?’, *Cognitive Psychology*, **96**, pp.1–25.
Dawkins, R. [1986]: *The Blind Watchmaker*. London: W. W. Norton.
Dennett, D. C. [1995a]: *Darwin’s Dangerous Idea: Evolution and the Meanings of Life*. London: Simon & Schuster/Penguin.
Descartes, R. [[1637] 1999]: *Discourse on Method*, Indianapolis: Hackett.
Downs, A. [1957]: *An Economic Theory of Democracy*. New York: Harper & Row.
Eil, D. and Rao, J. M. [2011]: ‘The Good News-Bad News Effect: Asymmetric Processing of Objective Information about Yourself’, *American Economic Journal: Microeconomics*, **3(2)**, pp.114-38.
Feldman, J. [2015]: 'Bayesian Models of Perceptual Organization', in J. Wagemans (*ed.*), *The Oxford Handbook of Perceptual Organization*, Oxford: Oxford University Press, pp.1008-1027.
Friston, K. [2005]: 'A Theory of Cortical Responses', *Philosophical Transactions of the Royal Society B: Biological Sciences*, **360(1456)**, pp.815-836.
Friston, K. [2010]: ‘The Free-energy Principle: A Unified Brain Theory?’, *Nature Reviews Neuroscience*, **11(2)**, pp.127-138.
Friston, K. [2012]: ‘The History of the Future of the Bayesian brain’, *NeuroImage*, **62(2)**, pp.1230-1233.
Friston, K. J., Daunizeau, J., Kilner, J. and Kiebel, S. J. [2010]: ‘Action and Behavior: A Free-energy Formulation’, *Biological Cybernetics*, **102(3)**, pp.227-260.
Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P. and Pezzulo, G. [2017]: ‘Active Inference: A Process Theory’, *Neural Computation*, **29(1)**, pp.1-49.
Friston, K., Samothrakis, S. and Montague, R. [2012]: 'Active Inference and Agency: Optimal Control Without Cost Functions', *Biological Cybernetics*, **106**, pp.523-541.
Geisler, W. S. and Diehl, R. L. [2002]: ‘Bayesian Natural Selection and the Evolution of Perceptual Systems’, *Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences*, **357(1420)**, pp.419-448.
Gershman, S. J. [2019]: ‘How to Never be Wrong’, *Psychonomic Bulletin & Review*, **26(1)**, pp.13-28.
Gigerenzer, G. [2008]: *Rationality for Mortals: How People Cope with Uncertainty*, Oxford: Oxford University Press.
Gigerenzer, G. and Selten, R. [2002]: *Bounded Rationality*, Cambridge, Mass.: MIT Press.
Gilovich, T., Griffin, D. and Kahneman, D. (*eds.*) [2002]: *Heuristics and Biases: The Psychology of Intuitive Judgment*, Cambridge: Cambridge University Press.
Hahn, U. [2014]: 'The Bayesian Boom: Good Thing or Bad?', *Frontiers in Psychology*, **5**, p.765.
Haselton, M. G. and Buss, D. M. [2000]: ‘Error Management Theory: A New Perspective on Biases in Cross-sex Mind Reading’, *Journal of Personality and Social Psychology*, **78(1)**, pp.81-91.
Hohwy, J. [2013]: *The Predictive Mind*, Oxford: Oxford University Press.
Hohwy, J. [2017]: 'Priors in Perception: Top-down Modulation, Bayesian Perceptual Learning Rate, and Prediction Error Minimization', *Consciousness and Cognition*, **47**, pp.75-85.
Icard, T. F. [2018]: ‘Bayes, Bounds, and Rational Analysis’, *Philosophy of Science*, **85(1)**, pp.79-101.
Jacob, F. [1977]: ‘Evolution and Tinkering’, *Science*, **196**, pp.1161–1166.
Johnson, D. D., Blumstein, D. T., Fowler, J. H. and Haselton, M. G. [2013]: ‘The Evolution of Error: Error Management, Cognitive Constraints, and Adaptive Decision-making Biases’, *Trends in Ecology & Evolution*, **28(8)**, pp.474-481.
Jones, M. and Love, B. C. [2011]: ‘Bayesian Fundamentalism or Enlightenment? On the Explanatory Status and Theoretical Contributions of Bayesian Models of Cognition’, *Behavioral and Brain Sciences*, **34(4)**, pp.169-188.
Joyce, J.M. [2004]: ‘Practical Aspects of Theoretical Reasoning’, in A. R. Mele and P. Rawling (*eds.*), *The Oxford Handbook of Rationality*, New York: Oxford University Press, pp.132-154
Kahan, D. [2017]: 'The Expressive Rationality of Inaccurate Perceptions', *Behavioral and Brain Sciences*, **40**, pp.26-28.
Kahneman, D. [2003]: 'Maps of Bounded Rationality: Psychology for Behavioral Economics', *American Economic Review*, **93(5)**, pp.1449-1475.
Kiefer, A. and Hohwy, J. [2018]: ‘Content and Misrepresentation in Hierarchical Generative Models’, *Synthese*, **195(6)**, pp.2387-2415.
Knill, D. C. and Pouget, A. [2004]: ‘The Bayesian Brain: The Role of Uncertainty in Neural Coding and Computation’, *Trends in Neurosciences*, **27(12)**, pp.712-719.
Kunda, Z. [1990]: ‘The Case for Motivated Reasoning’, *Psychological Bulletin*, **108(3)**, pp.480–498.
Lake, B., Ullman, T., Tenenbaum, J. and Gershman, S. [2016]: 'Building Machines that Learn and Think Like People', *Behavioral and Brain Sciences*, **40**, pp.1-72.
Lieder, F. and Griffiths, T. L. [2019]: 'Resource-rational Analysis: Understanding Human Cognition as the Optimal Use of Limited Computational Resources', *Behavioral and Brain Sciences*, pp.1-85.
Maloney, L. T. and Mamassian, P. [2009]: ‘Bayesian Decision Theory as a Model of Human Visual Perception: Testing Bayesian Transfer’, *Visual Neuroscience*, **26(1)**, pp.147-155.
Mandelbaum, E. [2019]: ‘Troubles with Bayesianism: An Introduction to the Psychological Immune System’, *Mind & Language*, **34(2)**, pp.141-157.
Marcus, G. [2009]: 'How Does the Mind Work? Insights from Biology', *Topics in Cognitive Science*, **1(1)**, pp.145-172.
Marcus, G. F., and Davis, E. [2013]: ‘How Robust are Probabilistic Models of Higher-level Cognition?’, *Psychological Science*, **24(12)**, pp.2351-2360.
Mathys, C., Daunizeau, J., Friston, K. J. and Stephan, K. E. [2011]: ‘A Bayesian Foundation for Individual Learning Under Uncertainty’, *Frontiers in Human Neuroscience*, **5**, p.39.
McKay, R. T. and Dennett, D. C. [2009]: ‘The Evolution of Misbelief’, *Behavioral and Brain Sciences*, **32(6)**, pp.493-510.
McKay, R. and Efferson, C. [2010]: ‘The Subtleties of Error Management’, *Evolution and Human Behavior*, **31(5)**, pp.309-319.
Mercier, H. and Sperber, D. [2017]: *The Enigma of Reason*, Cambridge, Massachusetts: Harvard University Press.
Mobius, M. M., Niederle, M., Niehaus, P. and Rosenblat, T. S. [unpublished]: ‘Managing Self-confidence: Theory and Experimental Evidence’, available at https://www.nber.org/papers/w17014.
Oaksford, M. and Chater, N. [2007]: *Bayesian Rationality: The Probabilistic Approach to Human Reasoning*, Oxford: Oxford University Press.
Penny, W. [2012]: ‘Bayesian Models of Brain and Behaviour’, *ISRN Biomathematics*, pp.1-19.
Peterson, M. [2017]: *An Introduction to Decision Theory*, Cambridge: Cambridge University Press.
Quine, W. [1969]: *Ontological Relativity and Other Essays*, New York: Columbia University Press.
Rahnev, D. and Denison, R. N. [2018]: 'Suboptimality in Perceptual Decision Making', *Behavioral and Brain Sciences*, **41**, pp.1-107.
Sanborn, A. N. [2017]: ‘Types of Approximation for Probabilistic Cognition: Sampling and Variational’, *Brain and Cognition*, **112**, pp.98-101.
Sanborn, A. N., Griffiths, T. L. and Navarro, D. J. [2010]: 'Rational Approximations to Rational Models: Alternative Algorithms for Category Learning', *Psychological Review*, **117(4)**, p.1144.
Sanborn, A. N. and Chater, N. [2016]: 'Bayesian Brains Without Probabilities', *Trends in Cognitive Sciences*, **20(12)**, pp.883-893.
Sharot, T. and Garrett, N. [2016]: 'Forming Beliefs: Why Valence Matters', *Trends in Cognitive Sciences*, **20(1)**, pp.25-33.
Simon, H. [1956]: ‘Rational Choice and the Structure of the Environment’, *Psychological Review*, **63(2)**, pp.129-138.
Stanovich, K. [2011]: *Rationality and the Reflective Mind*, Oxford: Oxford University Press.
Stich, S. [1990]: *The Fragmentation of Reason*, Cambridge, MA: The MIT Press.
Tappin, B. M. and Gadsby, S. [2019]: 'Biased Belief in the Bayesian Brain: A Deeper Look at the Evidence', *Consciousness and Cognition*, **68**, pp.107-114.
Tenenbaum, J., Kemp, C., Griffiths, T. and Goodman, N. [2011]: ‘How to Grow a Mind: Statistics, Structure, and Abstraction’, *Science*, **331(6022)**, pp.1279-1285.
Tversky, A. and Kahneman, D. [1974]: ‘Judgment Under Uncertainty: Heuristics and Biases’, *Science*, **185(4157)**, pp.1124-1131.
Wald, A. [1947]: 'An Essentially Complete Class of Admissible Decision Functions', *The Annals of Mathematical Statistics*, pp.549-555.
Williams, D. [2018]: 'Hierarchical Bayesian Models of Delusion', *Consciousness and Cognition*, **61**, pp.129-147.
Williams, D. [forthcoming]: ‘Socially Adaptive Belief’, *Mind and Language*.
- One complication that I ignore in this paper is that one influential strand of research within Bayesian cognitive science dispenses with utility functions altogether (see Clark [2016]; Hohwy [2013]). Given formal demonstrations that utility functions can always be replaced by the specification of appropriate priors (see Friston et al. [2012]), however, everything that I say in this paper drawing on Bayesian decision theory could in principle be rewritten in terms that do not mention utility functions.
- Colombo et al. ([2018]) reject this assumption, thus providing an additional critique of Bayesian models that I do not address in this paper.
- Some argue that such considerations undermine the realist pretensions of Bayesian models (Colombo and Seriès [2012]; Colombo and Hartmann [2015]). I ignore such worries in this paper.