Approximate Bayesian Computation for astronomical model analysis: a case study in galaxy demographics and morphological transformation at high redshift (original) (raw)

Journal Article

,

1School of Mathematical Sciences (Statistical Science), Queensland University of Technology (QUT), GPO Box 2434, Brisbane 4001, QLD, Australia

Search for other works by this author on:

1School of Mathematical Sciences (Statistical Science), Queensland University of Technology (QUT), GPO Box 2434, Brisbane 4001, QLD, Australia

Search for other works by this author on:

Indeed the authors can find no astronomical reference to either the terms ‘ABC’ or ‘likelihood-free’ (inference) on the NASA ADS data base, and Google Scholar indicates no astronomical citations yet to any of the biological/mathematical ABC literature mentioned herein. A more pedagogical treatise on the potential for ABC in astronomy presented by Chad Schafer and Peter Freeman at the Statistical Challenges in Modern Astronomy V conference in 2011 (available at http://www.springer.com/statistics/book/978-1-4614-3519-8/), which details two interesting uses for ABC in extragalactic data analysis, represents to our knowledge the only prior application in this field.

For readers familiar with the work of Bower et al. (), we note that the ‘discrepancy parameter’ introduced for their emulation of the galform SAM could not be employed as such in ABC as it is not (designed as) a gauge of model–data similarity; indeed, it serves an entirely different purpose in their analysis, acting as an error term for cosmic variance and structural uncertainty in their code.

As a caveat to the above referencing we note that (i) though the analyses of Henriques et al. () and Lu et al. () are both conducted broadly in the style of the ‘approximate likelihood’ approach formalized by Wood (), there are also a number of significant implementational differences unique to each, and (ii) though the work of Kampakoglou, Trotta & Silk () has in previous papers been cited as an example of MCMC-based SAM constraint, in fact, their study concerns a purely analytic model for which there exists no intrinsic stochasticity (thus, only approximate observational errors enter their likelihood computation). Finally, we refer the interested reader to Hartig et al. () for a concise overview of the similarities and differences between the ‘approximate likelihood’ and ABC approaches to inference from statistical simulation, and to Nott, Fan & Sisson () for an advanced treatment of the link between a particular version of ABC and the Bayes Linear technique (cf. Goldstein & Wooff ) underlying the model emulator approach.

PEGASE: Projet d’Etude des GAlaxies par Synthese Evolutive (Fioc & Rocca-Volmerange ).

GOODS-N/-S: Great Observatories Origins Deep-North/-South.

COSMOS: COSMic evOlution Survey.

UDS: UKIDSS [United Kingdom Infrared Telescope (UKIRT) Infrared Deep Sky Survey] UltraDeep Survey.

MOIRCS: Multi-Object Infrared Camera and Spectrograph.

NEWFIRM: NOAO (National Optical Astronomical Observatory) Extremely Wide-Field Infrared iMager.

Though redshift may be the more familiar baseline for many observational astronomers we have instead adopted here a scale of time (since z = 6, in Gyr) for the horizontal axis of this plot. This is because time, rather than redshift, forms the natural evolutionary variable of the stochastic processes described in our morphological transformation model. The top right-hand panels of Figs 4, 9 and 10 are marked with both, however, as a convenient reference for the appropriate conversion under our assumed cosmology.

SINS: Spectroscopic Imaging survey in the Near-infrared with SINFONI (Spectrograph for INtegral Field Observations in the Near Infrared).

Analysis of the (much larger) COSMOS photometric redshift catalogue (Ilbert et al. ) confirms that underdensities of this magnitude indeed arise frequently amongst the massive/most luminous galaxy population at these redshifts. In particular, for the optically luminous (i.e. rest-frame Z < −23.6 mag) members of that catalogue at 1.5 < _z_ < 3, which appear at comparable number density to the _M_gal > 1011 M⊙ population from the EGS, a maximum LOS separation of 150 Mpc or more occurs (roughly) once for every three random placements of the CANDELS/EGS observational footprint within the COSMOS field.

http://galaxy-catalogue.dur.ac.uk:8080/Millennium/

Recall that for the purposes of the present analysis we treat the time of ‘birth’ in our model as the epoch at which a galaxy first reaches the top end of the high-redshift stellar mass function, whether via star formation or merging. To identify this point in the De Lucia & Blaizot () mocks one must follow back the linked progenitor tree accordingly via sql query.

When using a non-symmetric proposal distribution as in the present case the ratio of sampling densities joins, of course, the likelihood ratio and the prior density ratio in computing the full MCMC acceptance probability.

Application of ABC to the problem of Bayesian model choice (cf. Grelaud et al. ; Toni & Stumpf ) is far from straightforward as an unfortunate summary statistic selection can lead to disastrously incorrect Bayes factors, even asymptotically (Robert et al. ). Recently though, Marin et al. () have made substantial progress in this field by establishing necessary and sufficient conditions on the validity of candidate summary statistics for this purpose.

With the choice of summary statistic typically (re-)cast in terms of the choice of reference data set in these astronomical studies – namely, whether to constrain, for instance, against the luminosity function (Bower et al. ; Lu et al. ; Cirasuolo et al. ), the Tully-Fisher relation (van den Bosch ; Tonini et al. ), the mass-metallicity relation (Pipino et al. ), and/or the black hole–bulge mass relation (Henriques et al. ).

MRSSE: Mean square Root Sum of Standard Errors.

One may note an intriguing similarity between the use of fourth (or fifth) nearest neighbour-based estimators in both statistical studies of distributional entropy and in astronomical studies of large-scale environment (cf. Baldry et al. ) – though it is unlikely there exists an underlying significance to this beyond the desirable error properties of the n ∼ 4–5 choice (Singh et al. ).

In fact, as noted by Fearnhead & Prangle (), since in ABC analysis we are only interested in the difference, graphic, the vector, graphic, may well be omitted from this above definition.

Another (well-motivated) alternative choice of scaling here would be the sample covariance matrix of the posterior particle population from our earlier run of SMC ABC under S :_k_=3.

Following Conselice (), Bluck et al. (, ) advocate correction of this raw merger fraction, which they denote _f_m, to a ‘galaxy merger fraction’, denoted _f_gm, via the relation, graphic. The motivation for this correction is to identify the ‘number of galaxies merging as opposed to the number of mergers’. However, we believe this to be a false move in context of their analysis of the most massive galaxies in the GNS, in which they identify impending mergers as those _M_gal > 1011 M⊙ systems host to a close companion (<30 kpc) of similar brightness (i.e. within ±1.5 mag in the observed H band; corresponding to a lower bound on the mass ratio of ∼1:4). Owing to the steepness of the galaxy stellar mass function at z ≳ 1.5 the vast majority of such companions will almost certainly be sub-1011 M⊙ – indeed in our sample we find just one close pair in which both are truly high-mass galaxies, and MA12 find just two. Hence, by ‘correcting’ to _f_gm as above one is in fact estimating the merger rate experienced by both massive galaxies and the (ill-defined) population of less massive galaxies that will ultimately merge with them, which does not seem a particularly useful exercise.

As a first-order estimate of the downwards revision in these merger rates that would result from switching to a 1:3 mass ratio selection one can suppose (perhaps naïvely) that the masses of close pair galaxies correspond to independent random draws from the z ∼ 1.5 luminosity function; in which case, Δlog λm ∼ −0.15 to −0.35, bringing the BL09 and MA12 determinations broadly into agreement with our own.

In fact, we downweight the count of Type II galaxies in this calculation by (1 − _P_Sph + D remnant) to acknowledge the (minor) role of secular evolution in early bulge formation, as discussed further in Section .

We note for reference that Glade, Ballet & Bastien () provide a brief review of the fundamentals of Poisson processes in their recent paper on the Drake equation.

Author Notes

Published:

01 September 2012

Cite

E. Cameron, A. N. Pettitt, Approximate Bayesian Computation for astronomical model analysis: a case study in galaxy demographics and morphological transformation at high redshift, Monthly Notices of the Royal Astronomical Society, Volume 425, Issue 1, September 2012, Pages 44–65, https://doi.org/10.1111/j.1365-2966.2012.21371.x
Close

Navbar Search Filter Mobile Enter search term Search

Abstract

‘Approximate Bayesian Computation’ (ABC) represents a powerful methodology for the analysis of complex stochastic systems for which the likelihood of the observed data under an arbitrary set of input parameters may be entirely intractable – the latter condition rendering useless the standard machinery of tractable likelihood-based, Bayesian statistical inference [e.g. conventional Markov chain Monte Carlo (MCMC) simulation]. In this paper, we demonstrate the potential of ABC for astronomical model analysis by application to a case study in the morphological transformation of high-redshift galaxies. To this end, we develop, first, a stochastic model for the competing processes of merging and secular evolution in the early Universe, and secondly, through an ABC-based comparison against the observed demographics of massive (_M_gal > 1011 M⊙) galaxies (at 1.5 < z < 3) in the Cosmic Assembly Near-IR Deep Extragalatic Legacy Survey (CANDELS)/Extended Groth Strip (EGS) data set we derive posterior probability densities for the key parameters of this model. The ‘Sequential Monte Carlo’ implementation of ABC exhibited herein, featuring both a self-generating target sequence and self-refining MCMC kernel, is amongst the most efficient of contemporary approaches to this important statistical algorithm. We highlight as well through our chosen case study the value of careful summary statistic selection, and demonstrate two modern strategies for assessment and optimization in this regard. Ultimately, our ABC analysis of the high-redshift morphological mix returns tight constraints on the evolving merger rate in the early Universe and favours major merging (with disc survival or rapid reformation) over secular evolution as the mechanism most responsible for building up the first generation of bulges in early-type discs.

1 Introduction

With origins in population genetics and evolutionary biology (e.g. Tavaré et al. ; Pritchard et al. ; Beaumont, Zhang & Balding ; and see Csilléry et al. for a recent review), Approximate Bayesian Computation (ABC) offers a powerful technique for recovering posterior probability densities from complex stochastic models for which the likelihood may be entirely intractable, that is, the probability of the observed data under a given set of input parameters cannot be solved analytically or computed directly (within a practical time frame). Examples include the estimation of time to the most recent common ancestor under the coalescent model with recombination, given a full suite of modern gene sequencing (Marjoram & Tavaré ), or the derivation of transition probabilities in continuous time Markov models of macroparasite population evolution from simple demographics (Drovandi & Pettitt ). However, although there exist a variety of important astrophysical models with inherently intractable likelihoods (a number of which we will discuss herein), applications to date of ABC in this field remain surprisingly rare. The only indispensable ingredients required for ABC are as follows: (i) a stochastic model for the observed data, replicating the behaviour of all random processes driving the system at hand, as well as any relevant observational errors; and (ii) a discrepancy measure, based typically on a set of low-order summary statistics, to quantitatively gauge similarity between the output from this model and the empirical benchmark.

One potentially valuable role for ABC in an astronomical context may thus be in the constraint of semi-analytic models (SAMs) of galaxy formation (cf. Cole et al. ; Benson et al. ; Baugh ; Bower et al. ; De Lucia et al. ; Neistein & Weinmann ) – in which the output at run-time necessarily exhibits complex stochasticity owing to the effects of cosmic variance (induced computationally via sampling from within large-scale dark matter simulations, Springel et al. ; Knebe et al. ; or via Monte Carlo construction of halo merger trees, Lacey & Cole ; Parkinson, Cole & Helly ). For ABC analysis of such codes an appropriate discrepancy measure might then be the metric distance between simulated and observed luminosity functions under a sensible binning scheme. With conditions (i) and (ii) above thus satisfied ABC offers an easily implemented, theoretically well-established (Nunes & Balding ; Marin et al. ; Fearnhead & Prangle ) alternative to the computationally intensive ‘approximate likelihood’ approach (requiring very large-scale simulation/resimulation, e.g. Henriques et al. ; Wood ; Lu et al. ; and note Benson et al. regarding the required diversity of merger trees sampled for genuine convergence of SAMs), or the user-intensive application of model emulators (requiring a non-trivial degree of run-time supervision and operator expertise; cf. Bower et al. and references therein).

Another astronomical problem readily amenable to ABC is that of inferring the age and mass of an unresolved star cluster based on its broad-band spectral energy distribution (SED). Here it is the sheer diversity/complexity of evolutionary tracks open to a cluster of given mass under a stochastically sampled initial mass function (IMF) that unrenders infeasible (i.e. intractable) any explicit formulation of the observational likelihood function (cf. Asa’D & Hanson ; Bonatto, Lima & Bica ; Hernandez ; Koda et al. ) – though with brute-force resimulation at fixed input using a cluster formation code such as slug (Fumagalli, da Silva & Krumholz ; Da Silva, Fumagalli & Krumholz ) or massclean (Popescu & Hanson ) one can in principle generate a fair approximation to it by recording the frequency of output in each region of the observational hyperspace. Indeed with huge libraries of such simulations Popescu & Hanson () and Fouesneau & Lançon () are already employing this approximate likelihood approach for ‘first-order’ cluster mass and age estimation. An appreciation of the established ABC method may offer practitioners in this field valuable insight into the challenges they face, which are, in abstraction, already addressed routinely in the related statistical literature. For instance, the merits of alternative filter combinations may be readily assessed through the lens of summary statistic selection, and a realistic distribution of cluster metallicities and dust reddening vectors robustly accounted for via the Bayesian technique of marginalizing over nuisance parameters.

Another two intriguing examples of astronomical model analysis problems amenable to ABC appear in recent work by Hekker et al. () and Leigh et al. () in the disparate fields of asteroseismology and IMF profiling, respectively. In the former, it is the non-linear propagation of realization noise in the solar oscillation spectrum that renders intractable the observational likelihood function. Simulated data sets though may be readily generated for this system, and Hekker et al. () have identified a corresponding set of summary statistics optimal for inference of the key model parameters. Specification of an appropriate discrepancy distance thus remains the final (and relatively trivial) hurdle to ABC implementation here. In the Leigh et al. () study, it is the intrinsic complexity of two-body relaxation within many-body stellar systems that necessitates a simulation-based approach to likelihood approximation. The cluster metallicity and the global binary fraction act as nuisance parameters of their model, while binary star confusion and the (inherent) projection of a 3D system on to the 2D observational plane contribute complex sources of measurement ‘error’ best treated by forward simulation.

In this paper, we illustrate heuristically the power of ABC for astronomical model analysis through application to yet another branch of this rich subject, namely the morphological transformation of massive galaxies at high redshift. In particular, we demonstrate a contemporary Sequential Monte Carlo (SMC) formulation of the ABC algorithm (cf. Del Moral, Doucet & Jasra ; Sisson, Fan & Tanaka ; Drovandi & Pettitt ), as well as a regression-based procedure for constructing an optimal summary statistic–discrepancy measure pairing for the purpose of parameter estimation (Fearnhead & Prangle ). Importantly, the stochastic model we explore herein features both an ‘independent evolution’ case for which the likelihood is in fact tractable and a ‘co-evolution’ case for which it is not – the former allowing the strengths and limitations of our ABC-based solution to be established against conventional Markov chain Monte Carlo (MCMC) simulation and the latter a demonstration of the unique possibilities of ABC analysis.

Installation of the new Wide Field Camera 3 (WFC3) on the Hubble Space Telescope (HST) in 2009 – and the subsequent allocation of vast amounts of observing time to deep, near-infrared (NIR) surveys with this instrument, including the Early Release Science (ERS) program (Windhorst et al. ) and the Cosmic Assembly Near-IR Deep Extragalatic Legacy Survey (CANDELS; Koekemoer et al. ; Grogin et al. ) – has at last made accessible (at high resolution) the rest-frame optical morphologies of distant galaxies at the epoch of peak cosmic star formation and active galactic nucleus (AGN) activity (z ∼ 2; Warren, Hewett & Osmer ; Lilly et al. ; Madau et al. ; Oesch et al. ). Early studies exploiting these new data sets have documented the emergence of the first Hubble sequence analogues (Cameron et al. ; Conselice et al. ; Szomoru et al. ), demonstrated the compactness of the first massive spheroids (Szomoru et al. ; Newman et al. ; Szomoru, Franx & van Dokkum ), explored the unique characteristics of galaxies ultraluminous at IR (Kartaltepe et al. ) and X-ray wavelengths (Rosario et al. ; Schawinski et al. ; Kocevski et al. ), and probed structural transformation in extreme cluster environments (Lotz et al. ; Papovich et al. ). Thus far, however, there have been remarkably few studies to exploit the full potential of demographic analysis for constraining pathways of galaxy evolution – one early exemplar being Bell et al.’s () search for correlations between the global observables of key galaxy subpopulations in the CANDELS data set divided coarsely by rest-frame optical morphological type (via the usual proxy of global Sérsic index; cf. Driver et al. ; Cameron & Driver ; Kelvin et al. ). Hence, we have chosen here specifically for our exposition of the ABC technique a case study in the demographic analysis of WFC3 data in the hope of motivating further research in this direction.

The structure of this paper is as follows. In Section we review the publicly available source catalogues and images comprising our high-redshift demographic benchmark, then in Section we present the core of our case study in ABC for astronomical model analysis. First, we describe our model for galaxy evolution and our procedure for stochastic simulation from this model (Section ). Secondly, we explain the ABC algorithm and the SMC approach to its implementation (Section ). Thirdly, we examine in depth the important process of constructing an optimal summary statistic–discrepancy parameter pairing (Section ). Finally, we confirm the general similarity between our ABC and MCMC posteriors in the tractable ‘independent evolution’ case, and present our final ABC-only posteriors for the more realistic, but likelihood intractable, ‘co-evolution’ case (Section ). In Section , we conclude this paper with a discussion of the implications of the model constraints so derived for astrophysical theories of morphological transformation in the early Universe.

We have thus attempted to organize our exposition of ABC in such a manner as to allow astronomers interested in this important statistical algorithm but not working directly in the area of galaxy evolution to optionally skip over the technical details and justification of our model (Section ) without disadvantage (instead reading only Sections , and in depth). All magnitudes are quoted in the AB system and a standard graphic cosmological model is adopted throughout.

2 Data

Featuring a vast ensemble of multiwavelength imaging compiled from both ground-based and space-based observatories the Extended Groth Strip (EGS) region of the Northern Sky (centred on RA 14h 17m, Dec. +52°30′) numbers amongst the premier legacy survey fields of the modern era. The All-wavelength Extended Groth Strip International Survey (AEGIS) team (Davis et al. ) has been responsible for the bulk of this data collection through extensive observational campaigns with HST and Spitzer. Such a comprehensive set of photometric measurements greatly facilitates the estimation of redshifts and stellar masses via SED template fitting, and there exist a number of published studies characterizing the high-redshift galaxy population in the EGS to this effect.

In this study we employ the publicly available and up-to-date ultraviolet (UV) to far-IR (FIR) based catalogue of Barro et al. (, hereafter B11) to identify a complete sample of high-mass (M_gal > 1011 M⊙), early Universe (1.5 < _z_ < 3) systems. The B11 photometric redshifts, based on up to 19 band flux measurements in the survey core, feature an overall accuracy (measured against a spectroscopic subsample from AGEIS with median _z_ ∼ 1.3) of Δ_z_/(1 + _z_) = 0.034 at a sub-2 per cent catastrophic failure rate. At the highest redshifts (_z_ > 2.5) comparison against a spectroscopic sample of 91 Lyman break galaxies (LBGs) confirms only a slight degradation to Δ_z/(1 + z) = 0.069. The corresponding B11 stellar masses for these systems were derived using the PEGASE SED library (with Salpeter IMF and Calzetti extinction) – the choice of which (from amongst the wide range of alternative SED libraries) represents the dominant source of systematic uncertainty here (of the order of 0.1–0.3 dex; Barro et al. ). For the purposes of this paper, in which our principle aim is to demonstrate as straightforwardly as possible the technicalities of the ABC approach, we hereafter neglect further quantitative consideration of these uncertainties (except when required for fitting the build-up in number density over cosmic time, which contributes two nuisance parameters to our model, in Section ).

The CANDELS team (Koekemoer et al. ; Grogin et al. ) is currently engaged in the acquisition of high-resolution, NIR (and UV) imaging targeting distant galaxies in selected subregions of five key legacy fields (GOODS-N, GOODS-S, the EGS, COSMOS and the UDS) totalling ∼800 arcmin2 under an allocation of 902 orbits of HST/WFC3 exposure time. Drizzled to a pixel scale of 0.06 arcsec, the presently available epoch (egs01) of imaging within (an ∼90 arcmin2 subregion of) the EGS field features a point spread function (PSF) full width at half-maximum (FWHM) of ∼0.18 arcsec and a 5σ detection limit of 26.8 mag in the F_160_W (_H_-band) filter (with comparable coverage in the F_125_W filter). As such CANDELS already represents the highest quality data set published to date for the study of rest-frame optical morphologies at z ∼ 2–3 in the EGS. Accordingly for the present analysis we derive our high-redshift demographic benchmark from visual classification of all 126 members of the B11 catalogue at _M_gal > 1011 M⊙ and 1.5 < z < 3 imaged thus far.

To this end one of us (EC) inspected each source carefully in the CANDELS (HST WFC3/IR) _H_-band mosaic with ds9 and assigned it one of the following four types: (i) spheroid (compact elliptical; cf. Szomoru et al. ), (ii) spheroid-plus-disc (early-type disc with a prominent central bulge; cf. Cameron et al. ), (iii) pure disc [late-type, bulgeless (perhaps clumpy) disc; cf. Elmegreen et al. ] or (iv) ongoing merger (evident violent relaxation event in progress, as revealed by the presence of distinctive tidal features and/or multiple massive nuclei; cf. Elmegreen et al. ). In total we count eight Type I spheroids, nine Type II spheroid-plus-discs, 90 Type III pure discs and 19 Type IV ongoing mergers in our sample. Example _H_-band postage stamp images characterizing these archetypal high-redshift morphologies are presented in the left-hand panel of Fig. 1, as is an illustration of the demographic evolution across our sample in the right-hand panel. Once again in accordance with the expository aims of this paper regarding ABC we do not explore the (complex) possible impacts of classification subjectivity on our results – although we note that both ABC and the Bayesian framework in general offer a powerful statistical basis for marginalizing over such sources of uncertainty (cf. Gelman et al. ; Taylor et al. ; and see our treatment of various nuisance parameters in Sections and ), particularly where the experimental evaluation of the classification system has been appropriately designed and implemented (Hand ). Reassuringly though, the relative proportions of early- and late-type systems recovered from our classification process are at least broadly consistent with those reported by Buitrago et al. () in their analysis of the (lower resolution) GOODS NICMOS Survey (GNS; Conselice et al. ; also Mortlock et al. ).

Left: CANDELS (HST WFC3/IR) H-band postage stamp images characterizing the four archetypal morphologies present amongst massive (Mgal >1011 M⊙), high-redshift (1.5 < z <3) galaxies in the B11 data set. Right: an illustration of demographic evolution (i.e. the evolving morphological mix) amongst our B11 (CANDELS/EGS) sample. The symbol key is given in the left-hand panel, and the ‘evolved’ morphological types (pure spheroids and spheroid-plus-discs) are circled in red to highlight their late build-up.

Figure 1.

Left: CANDELS (HST WFC3/IR) _H_-band postage stamp images characterizing the four archetypal morphologies present amongst massive (_M_gal >1011 M⊙), high-redshift (1.5 < z <3) galaxies in the B11 data set. Right: an illustration of demographic evolution (i.e. the evolving morphological mix) amongst our B11 (CANDELS/EGS) sample. The symbol key is given in the left-hand panel, and the ‘evolved’ morphological types (pure spheroids and spheroid-plus-discs) are circled in red to highlight their late build-up.

3 STATISTICAL METHODOLOGY AND RESULTS

Here we begin by introducing our stochastic model for the morphological transformation of high-redshift galaxies, detailing both the tractable ‘independent evolution’ case and the intractable ‘co-evolution’ case, in Section . We then proceed to outline the SMC approach to ABC in Section , and to demonstrate linear regression-based construction of an optimal summary statistic–discrepancy parameter pairing for our model in Section . Finally, in Section we compare the performance of SMC ABC against ‘tractable likelihood’-based MCMC in the ‘independent evolution’ case and present our ABC-only solution for the more realistic ‘co-evolution’ case.

3.1 Morphological transformation as a stochastic process

With the current generation of SAMs yet to offer detailed or reliable predictions for the morphologies of simulated galaxies (Almeida, Baugh & Lacey ; González et al. ) we develop here instead a basic stochastic model for describing the competing processes of morphological transformation in the early Universe. In this endeavour we are motivated both by contemporary observational results and hydrodynamical simulations. The purpose of simulation in this study is thus not to work forwards through parametrized approximations for the physical laws of halo accretion, gas cooling and star formation (amongst others) in order to constrain their ‘fundamental’ scaling coefficients (as in SAMs), but rather to explore in a rigorous statistical sense the extent to which the rates of incidence of the key events thought to shape morphological evolution are jointly constrained by the observed demographics. Nevertheless, working backwards from these constraints (our posterior probability densities) one may hopefully achieve insight into the underlying physical mechanisms, as we discuss in Section .

As a starting point for our model we suppose that the arrival of galaxies at the top end of the high-redshift stellar mass function may be faithfully represented as a non-homogeneous Poisson birth process with an underlying rate, graphic, increasing as 10_K_ _t_γ from a zero baseline at z = 6. Thus, on average (i.e. over an infinite volume), graphic (modulo the impact of merging amongst _M_gal > 1011 M⊙ systems, which is intrinsically rare at these redshifts, i.e. negligible in this context; cf. Man et al. and our discussion in Section ). In the left-hand panel of Fig. 2 we illustrate the build-up in number density at _M_gal > 1011 M⊙ over the interval z ∼ 1.5–6 synthesized from observations in the GNS (Mortlock et al. , hereafter M11), the MOIRCS Deep Survey (Kajisawa et al. , hereafter K09), the NEWFIRM Medium-Band Survey (Marchesini et al. , hereafter M10; Brammer et al. , hereafter BR11), the UDS (Caputi et al. , hereafter C11) and the EGS (the present study, hereafter C12). Interestingly, where their redshift baselines overlap a number of these rival graphic determinations exhibit surprisingly large discrepancies with regard to their respective cosmic variance uncertainties (marked as 1σ error bars in Fig. 2 following the recipe of Moster et al. ). As highlighted by BR11 such discrepancies may well arise from the systematic errors inherent in SED-based stellar mass computation (owing to degeneracies between the various template libraries), and we suspect this to be the case here.

Left: the build-up in galaxy number density at stellar masses Mgal > 1011 M⊙ from z = 6 to 1.5 (in terms of cosmological time since z = 6) synthesized from recent observational determinations of Φ>1011 M⊙(z) from the literature. The K09, M10, C11 and M11 data points shown here are derived from integration of their published Schechter mass function fits, while the BR11 data points are sourced directly from that paper. The error bar on each indicates the 1σ contribution of cosmic variance for the respective survey and bin width (following the recipe of Moster et al. 2011). The dark, medium and light grey bands plotted underneath represent our (pointwise) median, 1σ and 3σ credible intervals, respectively, on the evolving mean number density. These are derived from the joint posterior densities of K and γ under our non-homogeneous Poisson birth process model, λb(t) = 10Ktγ ; the median curve shown here corresponds roughly to γ = 0.65. Right: illustration of the birth rate by type in our model (cf. Section 3.1). The rate at which sub-1011 M_ galaxies are promoted above this mass threshold by merging is taken as W times our Mgal > 1011 M⊙ merger rate, Wλlm(t ); leaving the rate of promotion (by star formation) of Type III discs, λd(t ), as the remainder with respect the total birth rate, λb(t ).

Figure 2.

Left: the build-up in galaxy number density at stellar masses M_gal > 1011 M⊙ from z = 6 to 1.5 (in terms of cosmological time since z = 6) synthesized from recent observational determinations of Φ>1011 M⊙(z) from the literature. The K09, M10, C11 and M11 data points shown here are derived from integration of their published Schechter mass function fits, while the BR11 data points are sourced directly from that paper. The error bar on each indicates the 1_σ contribution of cosmic variance for the respective survey and bin width (following the recipe of Moster et al. 2011). The dark, medium and light grey bands plotted underneath represent our (pointwise) median, 1_σ_ and 3_σ_ credible intervals, respectively, on the evolving mean number density. These are derived from the joint posterior densities of K and γ under our non-homogeneous Poisson birth process model, λ_b(t) = 10_K_t_γ ; the median curve shown here corresponds roughly to γ = 0.65. Right: illustration of the birth rate by type in our model (cf. Section 3.1). The rate at which sub-1011 M_ galaxies are promoted above this mass threshold by merging is taken as W times our _M_gal > 1011 M⊙ merger rate, _Wλl_m(t ); leaving the rate of promotion (by star formation) of Type III discs, λd(t ), as the remainder with respect the total birth rate, λb(t ).

To estimate our birth rate parameters, K and γ, we thus perform standard MCMC exploration of the relevant posterior probability density space under a likelihood model in which the data points from each of the above-listed studies are assumed subject to a common systematic bias in addition to cosmic variance. The prior magnitude of this systematic bias component (in dex) is treated as normally distributed with graphic for each survey. Our respective priors on K and γ are both Uniform, with the former non-informative and the latter standard (i.e. bound between zero and one). The joint posterior density for {K, γ} thus recovered is roughly bivariate Normal with graphic. For reference we plot the corresponding (pointwise) median, 1σ and 3σ credible intervals for graphic against the various empirical determinations shown in Fig. 2. Due to the relatively small cosmic volume probed by the CANDELS/EGS data set we do not attempt to further constrain K and γ during our ABC analysis; instead, we treat these two variables as nuisance parameters of our stochastic model and integrate them out at run-time (see Section ).

It is important to note at this point that the marginal posterior density on the systematic bias in our EGS data points favours a (median) of +0.11 dex, suggesting that the B11 stellar masses are systematically overestimated by a corresponding ∼0.10 dex (adopting the z ∼ 1.5 mass function slope of M11). Hence, it is perhaps more appropriate to describe our B11 CANDELS/EGS data set as an _M_gal ≳ 1011 M⊙ sample, stressing the inherent (systematic) uncertainty in SED-based stellar mass selection [arising primarily from the (uncertain) choice of stellar population synthesis model/code used to construct the underlying SED template library; cf. Muzzin et al. ].

We next suppose that each galaxy arrives at the top end of the high-redshift stellar mass function as either a (star-forming) late-type disc (Type III) or an ongoing major merger (Type IV) – a simplifying assumption which serves to reduce markedly the required dimensionality of our model, yet which is also consistent with the present state of knowledge on this topic. In a recent empirical census of rest-frame optical morphology amongst the sub-1011 M⊙ population at 1.5 < z < 3.5, Cameron et al. () were unable to identify a single unambiguous spheroid beyond z ≈ 2.2 in their sample from the ERS (and see also Conselice et al. for a similar result). Amongst the small fraction (∼20 per cent) of sub-1011 M⊙ spheroids discovered in their sample at later epochs only one was found to be actively star-forming – leaving dry merging as perhaps the only feasible (but also unlikely; cf. Lin et a. ; Chou, Bridge & Abraham ) mechanism for sub-1011 M⊙ spheroids to thus move above this threshold mass without transitioning through a standard Type IV phase. Meanwhile, contemporary hydrodynamical simulations have demonstrated the theoretical potential for high-redshift discs at 1010.5–1011 M⊙ to sustain immense rates of star formation fuelled by cold flow gas accretion (Brooks et al. ; Dekel, Sari & Ceverino ; Genel et al. ) while avoiding secular bulge assembly through the wind-driven disruption of clump instabilities (Hopkins et al. ; Genel et al. ) – ensuring their rapid transition to the high-mass regime intact as Type III systems.

The probability of birth as a Type IV ongoing merger is estimated in our model as W times the ratio of the instantaneous merger rate amongst our _M_gal > 1011 M⊙ population, graphic, to the corresponding instantaneous birth rate, graphic, as shown in the right-hand panel of Fig. 2. This factor, W, represents another nuisance parameter of our model corresponding to the ratio by number density of galaxies in such a mass range that a single major merger could promote them above 1011 M⊙ to those already beyond this threshold. (There is an implicit assumption here that the merger rate does not evolve significantly with mass over this small baseline.) According to the shape of the stellar mass function at these redshifts (e.g. BR11; M11) we estimate W ≈ 0.5 ± 0.2. Important to note is the fact that the merger rate so defined, graphic, is strictly that of galaxies already at the top end of the stellar mass function – i.e. we are effectively eliminating the contribution to the observed merger fraction from galaxies that were sub-1011 M⊙ prior to their most recent encounter. With graphic the total birth rate and graphic the birth rate of Type IV mergers, the corresponding birth rate of Type III discs, graphic, is simply the arithmetic difference, graphic (as indicated in the right-hand panel of Fig. 2).

As for the total birth rate described earlier, graphic, merging in our model is characterized as a non-homogeneous Poisson process, with a unique rate by volume of

formula

Here αmerge represents the baseline merger rate (in units of Mpc−3 Gyr−1) at our lowest redshift, z = 1.5, with αmergeβmerge the peak cosmological merger rate for massive galaxies at _t_br. This latter model parameter, _t_br, thus dictates a point of phase transition (or ‘break’) beyond which the merger rate by volume must ultimately decrease (with increasing redshift) back to zero at z = 6 (the time origin of our model) at least as fast as the total number density of galaxies itself, lest the specific merger rate (per galaxy), graphic, becomes asymptotically infinite. Here we have chosen for simplicity a fixed, marginally sufficient decay rate for graphic above this transition redshift of _t_2 > γ(≈0.65) + 1. One possible extension of our model, which may well be worthwhile in the future if/when a larger demographic data set for z ∼ 1.5 CANDELS sources becomes available, would be to treat this pre-_t_br decay rate as a free parameter of the fit.

Previous empirical studies have argued alternately that the massive galaxy merger rate is either very near constant (de Ravel et al. ; Lotz et al. ; Williams, Quadri & Franx ; Man et al. ) or markedly increasing with redshift (Conselice et al. ; Bluck et al. , ; Conselice, Yang & Bluck ; de Ravel et al. ; López-Sanjuan et al. ) from the local volume out to z ∼ 1.5; and even less consensus exists regarding its behaviour to z ∼ 3 and beyond (Williams et al. ; Bluck et al. ; Law et al. ; Man et al. ). The intrinsic clumpiness of galaxy-scale star formation at the observed optical (i.e. rest-frame UV) wavelengths of many of these studies (most non-WFC3) has proved a persistent source of uncertainty, introducing substantial ambiguity into the interpretation of those morphological signatures otherwise indicative of recent merging locally (cf. Conselice et al. ; López-Sanjuan et al. ). Uncertainties concerning the fraction of apparent close pairs to ultimately merge (Lotz et al. ) and the visibility time-scales of the resulting post-merger tidal features (Lotz et al. ) have only compounded these difficulties. (Important to note is that the full demographic analysis performed in the present study permits a simultaneous, non-degenerate constraint of the latter unknown, which is a significant advantage of this particular mode of analysis.)

Our only inflexible constraints on the tuneable parameters of graphic here are thus that αmerge is, of course, strictly positive, βmerge is greater than or equal to one and _t_6( = 0) ≤ _t_br ≤ _t_1.5. Hence we adopt only weak priors specified as (i) a T distribution in log10 space for αmerge (with μ = −4, Σ = 0.5, and 10 degrees of freedom, truncated to a key region of interest at a lower bound of −5.5 and an upper bound of −2.5); (ii) a beta distribution in log10 space for 2βmerge (with shape coefficients, 1 and 4, favouring a smaller peak-to-baseline ratio over a higher one); and (iii) a beta distribution for _t_br (as a fraction of _t_1.5) (with shape coefficients, 2 and 1, favouring a break closer to z ∼ 1.5 than z ∼ 6). The grey-shaded tiles and histograms in Figs 4, 9 and 10 offer graphical representations of these prior densities.

Schematic illustration of the five characteristic pathways of high-redshift morphological transformation permitted under the stochastic model described in this paper (shown over an arbitrary timeline of 1.5 Gyr from ‘birth’ to ‘observation’ with an assumed Type III birth class). Solid lines are used to mark the fundamental pathways necessary to reach a given evolutionary state, whereas dashed lines allow for a range of possible degenerate evolutionary histories prior to the most recent merger (the details of which are inconsequential to the final state achieved). The secular evolution pathway to Type II status and the null evolution pathway to Type III status are the only branches for which one could not substitute Type IV as the birth class here.

Figure 3.

Schematic illustration of the five characteristic pathways of high-redshift morphological transformation permitted under the stochastic model described in this paper (shown over an arbitrary timeline of 1.5 Gyr from ‘birth’ to ‘observation’ with an assumed Type III birth class). Solid lines are used to mark the fundamental pathways necessary to reach a given evolutionary state, whereas dashed lines allow for a range of possible degenerate evolutionary histories prior to the most recent merger (the details of which are inconsequential to the final state achieved). The secular evolution pathway to Type II status and the null evolution pathway to Type III status are the only branches for which one could not substitute Type IV as the birth class here.

Benchmark posterior probability densities for the key parameters of our stochastic model of morphological transformation at high redshift (in the ‘independent evolution’ case) recovered from ‘tractable likelihood’-based MCMC simulation. In each of the main diagonal panels we compare the marginal posterior density of a single parameter (in red) against its prior (in grey), while in each of the off-diagonal panels below we extend this comparison to the joint density formed by pairing that parameter against one of its peers. For the latter visualization we employ a lattice of variable-sized points to trace the MCMC posterior on a scale of 1, 2.5, 7.5 and 15 times some appropriate baseline probability density, while grey-shaded tiles map the corresponding prior on an identical scale. In the upper right-hand panel we plot the (pointwise) 1s and 3s credible intervals and median curve (in dark grey, light grey and red, respectively) for the Mgal > 1011 M⊙ merger rate, »m(t), deriving from our (joint, marginal) posterior densities on αmerge, βmerge and tbr.

Figure 4.

Benchmark posterior probability densities for the key parameters of our stochastic model of morphological transformation at high redshift (in the ‘independent evolution’ case) recovered from ‘tractable likelihood’-based MCMC simulation. In each of the main diagonal panels we compare the marginal posterior density of a single parameter (in red) against its prior (in grey), while in each of the off-diagonal panels below we extend this comparison to the joint density formed by pairing that parameter against one of its peers. For the latter visualization we employ a lattice of variable-sized points to trace the MCMC posterior on a scale of 1, 2.5, 7.5 and 15 times some appropriate baseline probability density, while grey-shaded tiles map the corresponding prior on an identical scale. In the upper right-hand panel we plot the (pointwise) 1s and 3s credible intervals and median curve (in dark grey, light grey and red, respectively) for the _M_gal > 1011 M⊙ merger rate, »m(t), deriving from our (joint, marginal) posterior densities on _α_merge, _β_merge and _t_br.

The spatial distribution of massive (Mgal > 1011 M⊙) galaxies in our B11 (CANDELS/EGS) sample at 1.5 < z < 3, projected in terms of comoving distance on to the LOS-RA and LOS-Dec. planes. The red circles overplotted highlight our 11 pairs and one threesome of galaxies neighbouring within 2.5Mpc. The colour/symbol code for galaxy types employed here is identical to that of Fig. 1. For reference we also mark the LOS axis on a scale of time since z = 6 (the origin point of our model). (See the text for a comment on the apparent void at ∼400 Mpc.)

Figure 5.

The spatial distribution of massive (_M_gal > 1011 M⊙) galaxies in our B11 (CANDELS/EGS) sample at 1.5 < z < 3, projected in terms of comoving distance on to the LOS-RA and LOS-Dec. planes. The red circles overplotted highlight our 11 pairs and one threesome of galaxies neighbouring within 2.5Mpc. The colour/symbol code for galaxy types employed here is identical to that of Fig. 1. For reference we also mark the LOS axis on a scale of time since z = 6 (the origin point of our model). (See the text for a comment on the apparent void at ∼400 Mpc.)

Evaluation of candidate summary statistics for model-data comparison across a range of binning schemes (k = 1, 3, 6 and 12) for both our ‘naïve‘ S and optimized S* (the latter explained later in this section) via the twin diagnostics of Left: distributional entropy and right: (M)RSSE (cf. Nunes & Balding 2010). Recall here that a lower posterior entropy typically indicates a higher posterior information content, and a lower (M)RSSE scores a more accurate recovery of the posterior mean. In each instance the marked data point reveals the median, and the error bars a corresponding 95 per cent confidence interval, recovered from six rounds of rejection ABC (i.e. selection of the 100 least discrepant particles out of an initial 5000 drawn from the prior density). Note that as the posterior means of our model parameters under the ‘independent evolution’ case studied here have already been well approximated via our earlier (‘tractable likelihood’-based) MCMC simulation we have employed these directly to estimate a pseudo-RSSE, rather than forming an MRSSE from repeated ABC runs against simulated data sets ‘close’ to the real one as in the canonical Nunes & Balding (2010) procedure.

Figure 6.

Evaluation of candidate summary statistics for model-data comparison across a range of binning schemes (k = 1, 3, 6 and 12) for both our ‘naïve‘ S and optimized S* (the latter explained later in this section) via the twin diagnostics of Left: distributional entropy and right: (M)RSSE (cf. Nunes & Balding 2010). Recall here that a lower posterior entropy typically indicates a higher posterior information content, and a lower (M)RSSE scores a more accurate recovery of the posterior mean. In each instance the marked data point reveals the median, and the error bars a corresponding 95 per cent confidence interval, recovered from six rounds of rejection ABC (i.e. selection of the 100 least discrepant particles out of an initial 5000 drawn from the prior density). Note that as the posterior means of our model parameters under the ‘independent evolution’ case studied here have already been well approximated via our earlier (‘tractable likelihood’-based) MCMC simulation we have employed these directly to estimate a pseudo-RSSE, rather than forming an MRSSE from repeated ABC runs against simulated data sets ‘close’ to the real one as in the canonical Nunes & Balding (2010) procedure.

Comparison of the marginal posterior density for αmerge in the ‘independent evolution’ case of our model recovered from SMC ABC under two of our ‘naïve’ summary statistics, S : k = 3 and S : k = 12). In each case the dotted line represents the posterior after one rejection-resampling iteration (m = 1) of the SMC ABC algorithm, the solid data points the same after four iterations (m = 4) and the dashed red line the benchmark posterior from ‘tractable likelihood’-based MCMC. At the bottom of each panel we indicate also the position of the posterior mean for this parameter under each of our SMC ABC runs as well as from our MCMC benchmark (the latter highlighted with a red diamond).

Figure 7.

Comparison of the marginal posterior density for _α_merge in the ‘independent evolution’ case of our model recovered from SMC ABC under two of our ‘naïve’ summary statistics, S : k = 3 and S : k = 12). In each case the dotted line represents the posterior after one rejection-resampling iteration (m = 1) of the SMC ABC algorithm, the solid data points the same after four iterations (m = 4) and the dashed red line the benchmark posterior from ‘tractable likelihood’-based MCMC. At the bottom of each panel we indicate also the position of the posterior mean for this parameter under each of our SMC ABC runs as well as from our MCMC benchmark (the latter highlighted with a red diamond).

Comparison of the marginal posterior density for αmerge in the ‘independent evolution’ case of our model recovered from SMC ABC under the alternative summary statistics, ‘naïve’ S : k = 3 and optimized S* : k = 3. In each case the dotted line represents the posterior after one rejection-resampling iteration (m = 1) of the SMC ABC algorithm, the solid data points the same after four iterations (m = 4), and the dashed red line the benchmark posterior from ‘tractable likelihood’-based MCMC. At the bottom of each panel we indicate also the position of the posterior mean for this parameter under each of our SMC ABC runs as well as from our MCMC benchmark (the latter highlighted with a red diamond).

Figure 8.

Comparison of the marginal posterior density for _α_merge in the ‘independent evolution’ case of our model recovered from SMC ABC under the alternative summary statistics, ‘naïve’ S : k = 3 and optimized S* : k = 3. In each case the dotted line represents the posterior after one rejection-resampling iteration (m = 1) of the SMC ABC algorithm, the solid data points the same after four iterations (m = 4), and the dashed red line the benchmark posterior from ‘tractable likelihood’-based MCMC. At the bottom of each panel we indicate also the position of the posterior mean for this parameter under each of our SMC ABC runs as well as from our MCMC benchmark (the latter highlighted with a red diamond).

SMC ABC posterior probability densities for the key parameters of our stochastic model of morphological transformation at high redshift (in the ‘independent evolution’ case). In each of the main diagonal panels we compare the marginal posterior density of a single parameter (in red) against its prior (in grey), while in each of the off-diagonal panels below we extend this comparison to the joint density formed by pairing that parameter against one of its peers. For the latter visualization we employ a lattice of variable-sized points to trace the SMC ABC posterior on a scale of 1, 2.5, 7.5 and 15 times some appropriate baseline probability density, while grey-shaded tiles map the corresponding prior on an identical scale. In the upper right-hand panel we plot the (pointwise) 1σ and 3σ credible intervals and median curve (in dark grey, light grey and red, respectively) for the Mgal > 1011 M⊙ merger rate, λm(t), deriving from our (joint, marginal) posterior densities on αmerge, βmerge and tbr. Both here and in the main diagonal panels the MCMC (‘tractable likelihood’-based) benchmark solution is illustrated for comparison via the corresponding dashed (and dotted) lines.

Figure 9.

SMC ABC posterior probability densities for the key parameters of our stochastic model of morphological transformation at high redshift (in the ‘independent evolution’ case). In each of the main diagonal panels we compare the marginal posterior density of a single parameter (in red) against its prior (in grey), while in each of the off-diagonal panels below we extend this comparison to the joint density formed by pairing that parameter against one of its peers. For the latter visualization we employ a lattice of variable-sized points to trace the SMC ABC posterior on a scale of 1, 2.5, 7.5 and 15 times some appropriate baseline probability density, while grey-shaded tiles map the corresponding prior on an identical scale. In the upper right-hand panel we plot the (pointwise) 1_σ_ and 3_σ_ credible intervals and median curve (in dark grey, light grey and red, respectively) for the _M_gal > 1011 M⊙ merger rate, λm(t), deriving from our (joint, marginal) posterior densities on _α_merge, _β_merge and _t_br. Both here and in the main diagonal panels the MCMC (‘tractable likelihood’-based) benchmark solution is illustrated for comparison via the corresponding dashed (and dotted) lines.

SMC ABC posterior probability densities for the key parameters of our stochastic model of morphological transformation at high redshift (in the ‘co-evolution’ case). In each of the main diagonal panels we compare the marginal posterior density of a single parameter (in blue) against its prior (in grey), while in each of the off-diagonal panels below we extend this comparison to the joint density formed by pairing that parameter against one of its peers. For the latter visualization we employ a lattice of variable-sized points to trace the SMC ABC posterior on a scale of 1, 2.5, 7.5 and 15 times some appropriate baseline probability density, while grey-shaded tiles map the corresponding prior on an identical scale. In the upper right-hand panel we plot the (pointwise) 1σ and 3σ credible intervals and median curve (in dark grey, light grey and blue, respectively) for the Mgal > 1011 M⊙ merger rate, λm(t), deriving from our (joint, marginal) posterior densities on αmerge, βmerge and tbr.

Figure 10.

SMC ABC posterior probability densities for the key parameters of our stochastic model of morphological transformation at high redshift (in the ‘co-evolution’ case). In each of the main diagonal panels we compare the marginal posterior density of a single parameter (in blue) against its prior (in grey), while in each of the off-diagonal panels below we extend this comparison to the joint density formed by pairing that parameter against one of its peers. For the latter visualization we employ a lattice of variable-sized points to trace the SMC ABC posterior on a scale of 1, 2.5, 7.5 and 15 times some appropriate baseline probability density, while grey-shaded tiles map the corresponding prior on an identical scale. In the upper right-hand panel we plot the (pointwise) 1_σ_ and 3_σ_ credible intervals and median curve (in dark grey, light grey and blue, respectively) for the _M_gal > 1011 M⊙ merger rate, λm(t), deriving from our (joint, marginal) posterior densities on _α_merge, _β_merge and _t_br.

Since, as mentioned earlier, dry merging appears to be remarkably uncommon in the early Universe – cf. the rapidly declining fraction of red–red pairs with increasing redshift (Lin et a. ; Chou et al. ; Kampczyk et al. ) and the overall paucity of passive galaxies, in general, above z ∼ 2 (BR11; Whitaker et al. ; Wuyts et al. ) – we assume that all mergers to occur under our model are gas-rich and therefore generate a distinctive post-merger morphology with irregular tidal features (Elmegreen et al. ; Lotz et al. ). We model the (observed _H_-band) visibility time-scale of these features according to a gamma distribution with scale, 1 + 100τIrr morph, and shape coefficient, 100, for τIrr morph in Gyr; thereby allowing an ∼0.1 Gyr interquartile spread to account for some intrinsic variation in the cold gas fraction (and thus the merger-to-stable-remnant transition time; Lotz et al. ) across the galaxy population sampled. Inspired by contemporary hydrodynamical simulations of gas-rich mergers (Lotz et al. ) we choose our prior density on τIrr morph to favour time-scales of the order of 0.2–0.7 Gyr (though permitting, at much lower prior density, the possibility of even >1 Gyr time-scales; Conselice ) by adopting a beta distribution with shape coefficients, 3 and 5, on this parameter divided by 1.5. Upon fading of these post-merger tidal features we suppose that the final remnant may assume either a Type I (pure spheroid) or Type II (spheroid-plus-disc) morphology with the probability of the latter outcome a tuneable parameter, _P_Sph + D remnant. Given the ongoing debate within the hydrodynamical modelling community regarding the relative frequency of mergers conducive to disc reformation at these epochs (e.g. Robertson et al. ; Bournaud et al. ), we adopt a beta distribution prior on _P_Sph + D remnant with shape coefficients, 1 and 3, favouring Type II production in less than one in every four mergers.

The second pathway of morphological transformation permitted under our model is that of secular evolution of Type III discs to Type II, the theoretical mechanism proposed to drive this process being the inwards migration of massive star-forming clumps as encountered in certain hydrodynamical simulations (Bournaud, Elmegreen & Elmegreen ; Elmegreen, Bournaud & Elmegreen ; Dekel et al. ). We again model this process stochastically via a gamma distribution with scale, 1 + 50τsec ev, and shape coefficient, 50, for τsec ev in Gyr; thereby inducing an intrinsic spread in this variable at run-time intended to mimic the impact of natural diversity in the structure and kinematics of high-redshift discs. Though inwards migration has been well publicized as the favoured hypothesis of the SINS team to explain the characteristic morphologies and SEDs of the clump population hosted amongst members of their pioneering z ∼ 2 survey (Förster-Schreiber et al. ; Genzel et al. ), as noted earlier the most recent hydrodynamical simulations incorporating the effects of wind-driven mass loss, at least in the 1010.5–1011 M⊙ regime, indicate that a large fraction of these clumps may be too short lived to migrate successfully into a central bulge (Hopkins et al. ; Genel et al. ). We therefore adopt such a prior on the time-scale for secular bulge formation as to allow a full range of scenarios from rapid growth on sub-Gyr time-scales (implying that many of our Type III discs should transition to Type II before z ∼ 1.5) to incredibly slow growth on up to 10-Gyr time-scales (implying none should). Mathematically we represent our prior density on this parameter via a Uniform distribution in log10 space bounded between −1 and 1.

Fig. 3 illustrates schematically the five distinct pathways of morphological transformation permitted under the above-specified model. We note for reference that Figs 4, 9 and 10 offer graphical representations of the prior densities on all our model parameters.

3.1.1. Simulation from our stochastic model

Having outlined above the principle details of our stochastic model for high-redshift morphological transformation we now describe our corresponding procedure for simulating from this model under two distinct paradigms – ‘independent evolution’and ‘co-evolution’ – the likelihood of the observed data under a given set of model parameters being tractable in the former and intractable in the latter. (Our derivation of the ‘independent evolution’ likelihood function is given in the Appendix to this paper.)

3.1.1.1 The -independent evolution’ case

As the name suggests in the ‘independent evolution’ case we suppose that neither the birth nor morphological transformation history of any galaxy are ever coupled to those of another. Simulation from our stochastic model under this assumption for a given set of input parameters is then simply a matter of applying the above probabilistic transition rules to generate one by one a mock morphology at the observed redshift for each object in our benchmark sample as follows.

First, the birth time of the galaxy at hand (i.e. the epoch at which its stellar mass finally exceeds 1011 M⊙) is drawn from the interval _t_6( = 0) ≤ _t_birth ≤ _t_obs according to the waiting time distribution dictated by its assumed (increasing rate, non-homogeneous) Poissonian form (as derived in the Appendix to this paper). The birth class is then assigned as either Type III or Type IV, with the probability of the latter given by graphic. To compute this ratio we must also sample a value for each of the nuisance parameters, K, γ and W, according to graphic and graphic, respectively (where graphic represents the truncated Normal distribution). The number of mergers, _n_merge, experienced between birth and observation is then drawn from the Poisson distribution specified by rate, graphic. If _n_merge ≠ 0 the corresponding epoch of last major merger is identified by sampling from the relevant waiting time distribution (also derived in the Appendix). The manifest duration of the resulting post-merger (Type IV) irregular state is then drawn directly from the gamma distribution with scale, 1 + 100τIrr morph, and shape coefficient, 100; and if encompassed within the remaining time until observation the galaxy is assigned either Type I or II morphology, with the probability of the latter set by _P_Sph + D remnant (otherwise it finishes the simulation as a Type IV). Finally, galaxies born as Type III discs and experiencing no major mergers may yet evolve to Type II via secular evolution, determined likewise by comparing an evolutionary period drawn from the gamma distribution with scale, 1 + 50τsec ev, and shape coefficient, 50, against the time available between birth and observation.

Simulation from our model is thus inherently stochastic – i.e. the internal assignment of birth times, most recent merger times, and so on (and thereby the output assignment of final morphologies) will vary from run to run at fixed input. In the SMC approach to ABC (Chopin ; Del Moral et al. ; Sisson et al. ; Drovandi & Pettitt ) the effects of this stochasticity are accounted for in an efficient, consistent manner through the iterative application of the key rejection and resampling/refreshment steps described in Section .

As noted earlier a characteristic feature of our model in the ‘independent evolution’ case is that the likelihood, graphic, of the observed data under a given set of input parameters is, in fact, tractable (whereas in the ‘co-evolution’ case it is not). For expository purposes this allows us to reconstruct via standard (‘tractable likelihood’-based) Bayesian computational methods (namely, MCMC) the ‘true’ posterior probability density of our model parameters (modulo the inherent variance of MCMC simulation) as a benchmark for comparison against our ABC results. The derivation of this likelihood function is rather involved so we present details separately in the Appendix (along with a description of the MCMC scheme employed). The resulting ‘true’ posteriors are, however, presented here in Fig. 4 for reference during our discussion of summary statistics in Section and the accuracy of our ABC posteriors in Section .

3.1.1.2 The -co-evolution’ case

Though it ensures the likelihood tractability required for our demonstration of the robustness of SMC ABC with respect to MCMC in Section our initial assumption that galaxy evolution proceeds independently across our entire sample may not be physically realistic. A number of recent studies probing high-redshift clusters and protoclusters out to z ∼ 1.5–3 (Doherty et al. ; Hatch et al. ; Tanaka et al. ; Papovich et al. ; Spitler et al. ) and beyond (Capak et al. ; Carilli et al. ) have presented evidence to suggest that the evolutionary histories of intermediate-mass galaxies within these early overdensities are in fact highly correlated such that they consistently achieve peak star formation earlier than their counterparts of similar mass in the field. Indeed such early biasing by environment of star formation and mass accretion constitutes a fundamental prediction of hierarchical formation theory under Λ cold dark matter (ΛCDM; Springel et al. ; Overzier et al. ) and should already be manifest in the spatial distribution of the first generation of (re-)ionizing sources (Kramer, Haiman & Peng ). As usual though, empirical results for galaxies at the top end of the stellar mass function remain limited due to the intrinsic rarity of these systems.

The beauty of ABC, of course, is that it permits the study of arbitrarily complex stochastic models irrespective of likelihood tractability, allowing one to relax such simplifying assumptions as that of ‘independent evolution’ in the present example. In this section we thus outline a ‘co-evolution’ case of our model in which a physically plausible coupling is introduced into the formation times of galaxies in close pairs and small groups, and in Section we explore the impact of this coupling on our ABC posteriors.

In Fig. 5 we illustrate the nature of spatial clustering amongst the massive (_M_gal > 1011 M⊙) galaxies of our B11 (CANDELS/EGS) sample at 1.5 < z < 3, representing their 3D distribution in observed right ascension (RA), declination (Dec.), and redshift via 2D projections in comoving distance along the line of sight (LOS) and the axes of RA and Dec., alternately. Neighbouring systems separated by no more than 2.5 Mpc – a conservative linking scale for protogroup-sized overdensities in the early Universe (cf. Capak et al. and references therein) – are marked accordingly and highlight the diversity of high-redshift ‘environments’ in the survey volume, with 11 simple pairings and one threesome identified at the adopted linking scale and a majority of relatively isolated systems. Perhaps the most striking feature on first inspection of this plot, however, is the apparent void at ∼400 Mpc distance from the lower bound of our sample at z ∼ 1.5. Examination of the B11 photometric redshifts for the full EGS, however, reveals no indication of this underdensity extending across the wider field, reassuring us that it is most likely an imprint of cosmic variance (cf. Trenti & Stiavelli ; Driver & Robotham ) within the small volume probed by the present data set – and not the result of a systematic bias in the adopted SED fitting algorithm, for instance.

To explore the impact of small-scale clustering on the ‘birth’ time distribution as defined in our stochastic model for morphological transformation we refer to the publicly available mocks from the De Lucia & Blaizot () SAM embedded in a small volume of the Millennium Simulation of comparable size to that probed by the B11 data set. In line with our suspicion from Section of an ∼0.1 dex bias in the B11 stellar masses our first discovery here is that the number density of simulated galaxies in these mocks at a cut-off of _M_gal > 1011 M⊙ is far below that of our CANDELS/EGS sample, but may be brought into reasonable agreement if we revise our selection down to (at least) _M_gal > 1010.9 M⊙. Following this adjustment we identify 12 pairs, two threesomes, three foursomes and even a five member configuration in this ∼7 × 105 Mpc3 ‘snapshot’ of the De Lucia & Blaizot () SAM at z ∼ 1.5. Interestingly, whilst we do find a strong correlation between the birth times of galaxies in close associations – with a median absolute difference of only ∼0.3 Gyr for neighbours within 2.5 Mpc compared against ∼0.65 Gyr for randomly assigned pairs – despite some theoretical expectation there exists negligible evidence in these mocks for a systematic bias in this specific aspect of galaxy formation. Hence, we do not attempt to induce one arbitrarily into our stochastic simulations; instead we focus here simply on reproducing the observed correlation using the following modified sampling scheme.

Rather than drawing independent birth times one by one for each galaxy in our sample as in the ‘independent evolution’ case described above, in the ‘co-evolution’ case of our model we instead generate a complete set of birth times at the start of the simulation and distribute these thereafter with an environmental dependence. We achieve this via the admittedly somewhat ad hoc scheme described below, which we have specifically tailored (through ‘trail and error’ experimentation) to render median absolute birth time differences for both neighbouring galaxies and random pairings of similar magnitude to those of the De Lucia & Blaizot () model. That is, we do not propose that our scheme follows in meaningful way the unknown sequence of random physical processes by which nearby neighbours come to experience similar evolutionary histories; we simply assert that it mimics faithfully the imprint of these processes on the coupling of galaxy birth times ‘observed’ in this particular reference SAM.

First we draw from the standard Uniform distribution a primary set of 126 points, one for each galaxy in the B11 CANDELS/EGS data set, plus a secondary set of 12 points, one for each close pair or threesome earlier identified (see Fig. 5) – the latter serving as ‘latent variables’ for the coupling of birth times in these associations. To each galaxy in a close pair or threesome we then allocate a single point from the primary set with selection probability proportional to the inverse square of distance between that point and the corresponding latent variable from the secondary set. The remaining points in the primary set are then randomly allocated with equal selection probability to the many isolated galaxies of our data set. Each point on the interval [0,1] thus assigned is then transformed to a birth time for its matching galaxy through multiplication by the relevant graphic (see, for reference, the Appendix to this paper) followed by inversion to recover _t_birth. Note that by allocating from an initial uniform sample in this manner we naturally preserve the mean build-up rate of graphic specified by our fit of K and γ in graphic against the available observational data from K09, M10, BR11, C11, M11 and C12 (see the left-hand panel of Fig. 2).

3.2 SMC ABC: Sequential Monte Carlo Approximate Bayesian Computation

As mentioned in Section , ABC (cf. Tavaré et al. ; Pritchard et al. ; Beaumont et al. ; Sisson et al. ; Wilkinson ; Csilléry et al. ; Drovandi & Pettitt , ) offers a rigorous statistical framework for estimating the posterior probability densities of key scientific parameters under complex models for which the likelihood of the observed data may be entirely intractable (thus prohibiting application of the standard MCMC approach, for example). To conduct ABC one requires only a stochastic model from which the observed data, y, are believed to be a random draw given some unknown set of intrinsic (true) input parameters, and a discrepancy measure for the comparison of simulated data, **y**s, against observed, graphic, typically based on a set of low-order summary statistics, S(·). Following Drovandi & Pettitt () the aim of ABC may be stated formally as the recovery of unbiased samples from the distribution described by the approximate joint posterior density,

formula

(1)

where graphic represents a vector of unknown model parameters, graphic the prior density on those parameters and εT some target tolerance for the specified discrepancy measure between simulated and observed data. The indicator function graphic assumes value unity for simulated–observed data set pairs with metric distance below this tolerance, and zero otherwise.

The archetypal scheme for random sampling from the distribution defined by equation is that of rejection ABC (cf. Pritchard et al. ) in which one draws a large sample of N trial parameter vectors, graphic (i = 1, … , N), from the prior, graphic, simulates a corresponding data set for each **y**s, and then rejects all graphic for which the discrepancy between simulated and observed data exceeds some target tolerance, i.e. graphic. That is, one adopts as an approximation to the posterior the complementary set of input parameter vectors (drawn from the prior) for which the corresponding simulated (or mock) data set appears ‘close’ to the observed. In principle, those regions of parameter space with the greatest probability of having generated the observed data set should be the most frequently represented amongst this approximate posterior sample – modulo two (possibly large) sources of error. Namely, (i) Monte Carlo error due to the limited number of simulated data sets it will be feasible to generate, and the even-more-limited number of these that will likely appear ‘close’ to the observed data set; and (ii) the inherent error of the likelihood approximation in ABC arising from the gap between ‘close’ (as judged by the summary statistic-based discrepancy distance) and ‘equal to’, which cannot (in general) be made arbitrarily small if at least some simulated data sets are to be deemed acceptable. Except in (unrealistically) fortuitous (or trivial) circumstances in which the prior is already very close to the posterior, when exploring high dimensional parameter spaces (_N_par ≳ 3) under rejection ABC these intertwined sources of error may well force one into an undesirable trade-off between an impractically low acceptance rate or an uncomfortably large tolerance. Thus, much recent work in the ABC field has been concerned with the development of more efficient alternatives to rejection ABC, involving sophisticated algorithms to focus the sampling of input parameters for the (computationally expensive) data simulation phase towards regions of increasingly higher acceptance probability, though in such a manner (or with the appropriate book-keeping) as to avoid biasing the output posterior approximation.

Perhaps the most promising of these is the SMC (cf. Liu ; Chopin ) approach to ABC (Del Moral et al. ; Sisson et al. ; Drovandi & Pettitt ) which proposes to simulate from equation stepwise by evolving a dynamic population of ‘particles’ (with each particle representing a single vector of input model parameters) through a sequence of intermediate distributions characterized by graphic for t = 1, … , T, indexing a series of non-increasing targets, εt. The two key stages of the SMC algorithm are thus (i) rejection of the most discrepant particles under each target; followed by (ii) resampling from amongst the least discrepant particles with some refreshment mechanism applied to maintain particle diversity. SMC may therefore also be referred to as ‘particle filtering’ or ‘population Monte Carlo’ [PMC; for examples of the recent adoption of PMC (SMC) in a cosmological context, see Wraith et al. ; Kilbinger et al. , and references therein].

In this study we employ for rejection the self-generating target strategy of Drovandi & Pettitt () in which the sequence of incremental targets, εt, is chosen on run-time such that a fixed fraction, α, of all particles is dropped at each iteration (herein α = 0.75). We then restore the particle population to its full operating size, N, by resampling with replacement from amongst the remaining (1 − α)N particles. Population refreshment is achieved by application of an MCMC kernel to these replicates. As in Drovandi & Pettitt () we favour the use of a self-refining, Metropolis–Hastings proposal distribution based on the current particle sample mean vector, graphic, and covariance matrix, graphic. For the specific model at hand we adopt a truncated, multivariate T distribution of degree 10 with the truncation bounds set by the support of our prior densities (as detailed in Section ). At each iteration of the SMC algorithm this MCMC kernel is run a fixed number of times, R, to (hopefully) produce genuine refreshment in a fraction, c, of the resampled particles (with some particles, of course, likely to be moved multiple times). The requisite R is here estimated according to the empirical efficiency, _p_acc, of the previous MCMC kernel as graphic. Note that the ‘likelihood ratio’ in the corresponding MCMC acceptance computation is simply graphic, i.e. all trial particles for which the simulation produces a mock data set with discrepancy distance falling below the current target are assured a non-zero probability of acceptance. Our final target, εT, is defined pragmatically (with respect to the limitations of our computational resources) as that εt for which a further SMC rejection–resampling step would incur more than _R_max ∼ 100 applications of this MCMC kernel.

3.2.1 Treatment of nuisance parameters

A particular feature of the model adopted in this study is the appearance of three nuisance parameters – essential inputs of little interest to our final science goal though known only to a limited accuracy from previous studies. Namely, graphic where graphic and graphic. Upon each simulation from our model for a single particle, graphic, we draw a random value for each of K, γ and W from their respective distributions for use in that one instance. Heuristically this amounts to approximating the integral, graphic, by a single Monte Carlo sample, yet produces through the power of the SMC ABC algorithm (i.e. via inference over a particle population en masse) a reasonable approximation (converging asymptotically) to sampling from equation .

3.3 Refinement of summary statistics

As explored in a number of recent papers careful selection of the summary statistic(s) used to evaluate the discrepancy between simulated and observed data in ABC is of paramount importance to achieving accuracy and efficiency with this algorithm, whether employed for the purpose of parameter estimation (e.g. Joyce & Marjoram ; Nunes & Balding ; Fearnhead & Prangle ) or, more challengingly, Bayesian model choice (Barnes et al. ; Marin et al. ; Robert et al. ). Indeed the same is true for all approaches to inference from complex models for which sufficient statistics are unavailable, including those of ‘approximate likelihood’ (Wood ) and model emulation (cf. Bower et al. and references therein), and should thus not be thought of as a unique concern for ABC-based analyses.

A summary statistic may be defined as any mathematic representation of the original data set that reduces its effective dimension. For example, a single column mean, a list of multiple column means, or even a list of multiple column means, variances and higher order moments; though it will rarely be profitable to carry such a high dimensional statistic as the latter into a full ABC analysis owing to the direct relationship between Monte Carlo error and summary dimension (cf. Beaumont et al. ; Fearnhead & Prangle ). For some stochastic models (including that of the present case study in morphological transformation; cf. Section ) the nature of the output data may well be such that only a few basic modes of summary are of any likely value, while for others (e.g. the Ricker map case study of Wood ; and most SAM-based studies of galaxy formation) there may in fact be many in a rich hierarchy of complexities. Given that ABC is specifically designed for use with complex stochastic models with intractable likelihoods it cannot be expected that the inferential value of any of these candidate summary statistics will be known a priori to the analyst who must ultimately choose between them – though some may well be strongly biased or uninformative. Hence in this section we complete our exposition of the ABC technique with a demonstration of two contemporary approaches to this particular selection problem: first, we apply Nunes & Balding's () two-stage procedure of distributional entropy and MRSSE minimization to identify the optimal choice from amongst a candidate set of four ‘naïve’ summary statistics for our morphological data set; and secondly, we employ the so-called ‘semi-automatic’ scheme of Fearnhead & Prangle () to build an alternative set of summary statistics optimized with respect to the recovery of posterior means, validating their performance in comparison against the former.

3.3.1 Minimum entropy/MRSSE-based selection

The most natural mode of summary for population demographic data in the context of extragalactic astronomy is, of course, by way of type counts or type fractions in similar-sized bins of redshift (see Oesch et al. and Buitrago et al. for recent examples). Important considerations when compiling such summary data for the purpose of ABC analysis are then the number and placement of bins to use and the weights one should assign to the type counts/fractions observed therein. As a ‘naïve’ first attempt at constraining the parameter space of our model in the ‘independent evolution’ case we thus trial four alternative summary statistics based on progressively finer subdivisions of the B11 CANDELS/EGS data set by redshift. Mathematically, we define this class of ‘naïve’ summary statistics, S, via the generic column vector

formula

with graphic the fraction of Type I galaxies in the first of k redshift bins subdividing equally the interval 1.5 < z < 3, graphic the fraction of Type II galaxies in the aforementioned bin, and so on. Adopting equal significance weights across all bins, we thus establish a complete discrepancy distance,

formula

(2)

Following the two-stage procedure of Nunes & Balding () we begin the evaluation of our four ‘naïve’ candidates, graphic, by computing the fourth nearest neighbour entropy of the posterior distribution resulting from simple rejection ABC under each – the goal here being to exploit the (approximate) inverse relationship between entropy and information in order to identify the most ‘informative’ summary statistic with regard to inference of the model parameters at hand. In the present round of rejection ABC experiments we accept only the 100 least discrepant particles of an initial sample of 5000 drawn from the prior, and we compute the associated entropy statistic for each according to the formula:

formula

[Singh et al. ; with _N_par(=6) representing the dimension of our model parameter space, n(=100) the number of accepted particles, Γ and ψ the ‘gamma’ and ‘digamma’ functions, respectively, and R i, 4 the fourth nearest neighbour distance]. There exists, of course, a certain subjectivity in the choice of scaling for each parameter in the computation of R i, 4; one option would be to first standardize all parameters on the interval [0, 1]; however, in this case we prefer instead to standardize against the diagonal matrix of our prior variances, graphic, such that graphic. By repeating the rejection ABC process six times for each k one may estimate both the median entropy and matching 95 per cent confidence interval (from the range) under that particular binning scheme. The results of this analysis are presented in the left-hand panel of Fig. 6.

Interestingly, although one might, at face value, expect a monotonic relationship of decreasing posterior distributional entropy with increasing k on the basis that finer binning should break any false degeneracies in the posteriors recovered from ABC runs with fewer bins (i.e. in some sense increase the information return), this is not necessarily true in practice owing to the simultaneous increase in Monte Carlo error, as the present example demonstrates. Although we do observe a slight decrease in distributional entropy upon moving from one to three redshift bins the opposite is true for our six and 12 bin trials under this particular mode of summary (see the left-hand panel of Fig. 6).

Following identification of the entropy-minimizing S (here k = 3) Nunes & Balding () recommend a second (rather computationally expensive) round of evaluation against the formal optimality criterion of Mean square Root Sum of Squared Errors (MRSSE) to ensure that the summary statistic favoured by the minimum entropy analysis is not likely biased with respect to recovery of the posterior mean. In the full Nunes & Balding () scheme the MRSSE score is to be estimated via a new series of rejection ABC analyses against simulated data sets constructed under parameter vectors revealed by the original ABC runs for the entropy-minimizing S as (likely to be) ‘close’ to those responsible for the observed data set. Since the (marginal) posterior mean of each model parameter under the ‘independent evolution’ case studied here has already been well approximated via our earlier (‘tractable likelihood’-based) MCMC simulation (see Section ), we may take a shortcut to the truth here (and vastly reduce our computational burden) by employing these directly to estimate alternative pseudo-RSSE scores for each of our previous rejection ABC runs. The results of this analysis are presented in the right-hand panel of Fig. 6. The relationship between pseudo-RSSE score and binning k observed here mirrors closely that exposed by our original entropy evaluation, validating S :_k_=3 as the optimal choice from amongst our candidate set of ‘naïve’ summary statistics.

The value of the above optimization procedure may easily be appreciated from inspection of Fig. 7 in which we compare the marginal posterior for one of our key model parameters, αmerge, recovered from full SMC ABC analysis under S with k = 3 against that for k = 12. As in all SMC ABC runs reported herein we use a population of 10 000 particles, iterated through an adaptive threshold defined by a rejection rate of α = 0.75, followed by MCMC kernel-based resampling with a goal refreshment rate of no less than c = 0.90. A total of four iterations were achieved under this scheme before the limits of our computational resources were reached [at R > _R_max(=100) required applications of the MCMC kernel for such a level of refreshment]. Although both (marginal) posteriors appear rather similar after only one iteration it is evident by the fourth (and final) iteration that the S :_k_=3 statistic has produced a particle population tracing far more faithfully the density of our benchmark (‘tractable likelihood’-based) MCMC simulation than its S :_k_=12 counterpart.

3.3.2 The -semi-automatic’ scheme

In an important contribution to the ABC literature Fearnhead & Prangle () have recently demonstrated that the optimal summary statistic for estimation of model parameters under quadratic loss (i.e. optimality with respect to the recovery of posterior means) is simply the conditional expectation function, graphic. As a direct consequence the authors were thereby able to propose and justify a regression-based algorithm for the direct construction of well-behaved summary statistics, allowing one (in principle) to bypass the above process of searching through a ‘naïve’ set of often unsatisfactory candidates. To implement their so-called ‘semi-automatic’ scheme one must first generate a ‘reasonably large’ sample of model parameter–simulated data set pairs spanning a ‘relevant volume’ of parameter space. In lower dimensional analyses one may simply draw this sample directly from the prior, though the posterior density from a trial ABC run with some ‘naïve’ summary statistic will generally serve as a superior starting point. Least-squares-based fitting to this reference data set of the relation, graphic, for each model parameter, graphic, yields the optimal summary statistic, graphic. Here e i denotes a symmetric error term of zero mean and f(y) some vector-valued function of the data, which for graphic will be typically of the form f(y)=y, graphic, or similar – the lack of a universally appropriate algorithm for defining this regression function being the chief cause for classification of the above scheme as ‘semi-’ rather ‘fully’ automatic.

Since the raw output, y, from our stochastic model for high-redshift morphological transformation is, in fact, multinomial (rather than real-valued) we adopt here (for the purposes of computational efficiency) a modified regression function of form, f(y)= S(y), with S(·) denoting as above the compilation of type fractions in fixed bins of redshift. Under this adaptation of the Fearnhead & Prangle () approach the magnitude of each component in each fitted graphic, to establish the full discrepancy measure under this new summary statistic,

formula

Fitting of graphic and graphic was achieved here by application of the glm and step routines in r to the 100 least discrepant particles from each of the six rejection ABC runs conducted earlier under our ‘naïve’ summary statistic for k = 3 (resulting in a full calibration sample of 600 model parameter–simulated data set pairs). Notably, the step routine in r makes use of the AIC (Akaike Information Criterion) statistic to restrict each fit to only those elements of S contributing significantly to the prediction of θ_i_.

Although k = 3 proved to be the optimal binning scheme for the set of ‘naïve’ summary statistics examined above one cannot simply assume this to hold for the new graphic, so we again examine the merits of each alternative, k = {1, 3, 6, 12}, following Nunes & Balding (). The results of this analysis are overlaid against our measurements for the ‘naïve’ S in Fig. 6. Reassuringly, with the exception of the most limited (k = 1) binning scheme these new summary statistics significantly outperform the old in the pseudo-RSSE criterion for which they are designed. Despite our caution k = 3 does again appear to constitute the best choice of binning, though k = 6 is not far behind in accuracy and may also offer slightly lower entropy. In Fig. 8 we compare the marginal posterior density recovered for the model parameter, αmerge, following a full run of SMC ABC under graphic against that obtained earlier under S : _k_=3. Interestingly, though our ‘naïve’ summary provides a visually ‘closer’ fit to the shape (especially the width, i.e. standard deviation) of the benchmark MCMC density for this parameter, the optimized statistic does outperform it with regard to the recovery of the posterior mean (which it manages within a small tolerance after only a single iteration of the SMC ABC algorithm). Hence, given both its ease of implementation and its demonstrated effectiveness in the present analysis we can confidently recommend the ‘semi-automatic’ scheme of Fearnhead & Prangle () for summary statistic refinement.

3.3.3 A note on one alternative

As mentioned in the introduction to this section the subject of summary statistic selection for ABC analysis remains an active area of research in the statistical literature, hence it is worth reviewing here briefly another popular alternative we have neglected to demonstrate above for the sake of brevity. Namely, the ‘approximate sufficiency’ algorithm of Joyce & Marjoram () for iteratively building a master summary statistic from the union of randomly trialled candidates, with each new addition accepted only if it offers an improvement in parameter inference exceeding some threshold. This algorithm may be of particular interest for SAM-based studies of galaxy formation given the wide variety of available observational benchmarks from which summary statistics may be composed (see footnote 17), though it has been criticized for a dependence on the (random) order in which the candidate statistics are tested at each application (i.e. the stated search procedure is far from exhaustive). Note also that although our above demonstration of the Nunes & Balding () procedure is presented in terms of selecting a unique summary statistic from four evidently degenerate choices (i.e. k = {1, 3, 6, 12}) an optimal union of summary statistics may also be identified via the minimum entropy/MRSSE criterion, though possibly at great computational expense if the original set of basis candidates is large and there are many permutations of interest.

3.4 SMC ABC posteriors for our stochastic model of morphological transformation

In Fig. 9 we present posterior probability densities for the key parameters of our stochastic model of morphological transformation at high redshift in the ‘independent evolution’ case, as recovered from SMC ABC using our optimized summary statistic (cf. Section ), graphic. The approximate solution shown here represents the state of a 10 000 particle population progressed through four rejection–resampling iterations with an α = 0.75 rejection rate and a c = 0.90 target refreshment rate (cf. Section ). Comparison against our ‘tractable likelihood’-based MCMC benchmark (for this tractable case of our model) presented earlier in Fig. 4 (and overplotted for illustrative purposes here in key panels) highlights the value of the ABC approach. That is, without reference to the explicit likelihood function of the system at hand this simple procedure of strategic simulation and discrepancy thresholding has nevertheless produced a most satisfactory approximation to the true posterior, capturing the key features of each marginal and bivariate joint density under investigation. As is expected though (cf. Csilléry et al. ) the ABC posterior does not reproduce exactly the true (‘tractable likelihood’-based) solution here, owing to the inherent gap between ‘close’ and ‘equal to’ in its likelihood approximation – both in the non-zero tolerance for the simulated–observed data set discrepancy required to achieve a workable acceptance rate and in the fundamental degeneracy of summary statistic matching over full data set matching (cf. Fearnhead & Prangle , for instance). Moreover, we note that the ABC credible intervals so derived (see, in particular, those for the _M_gal > 1011 M⊙ merger rate, graphic, shown in the top right-hand panel) are for the same reason noticeably broader than those of the benchmark solution. Whilst one would not usually even consider an ABC approach if the likelihood were tractable it is reassuring for those occasions of interest in which it is not to verify through the above comparison that this approximate likelihood scheme can at least perform similarly.

In Fig. 10 we present the SMC ABC posteriors recovered from the ‘co-evolution’ case of our model for which indeed the likelihood function is (by construction for this example) thoroughly intractable – and thus the standard toolbox of (‘tractable likelihood’-based) MCMC simulation entirely inaccessible. As described in Section this intractability is induced simply by coupling the birth times of galaxies in close associations in a manner consistent with that observed in the De Lucia & Blaizot () SAM, leaving all other details of the model unchanged. Hence it is perhaps unsurprising that the posteriors for this example differ only slightly from those presented above, with a modest decrease in confidence regarding the true value of the merger rate (i.e. the joint, marginal density of αmerge, βmerge and _t_br) and a modest increase in confidence regarding the merger visibility time-scale (from graphic to graphic). Nevertheless, this demonstrated ability of ABC to handle models with intractable likelihoods, and thus to permit the derivation of robust Bayesian constraints from arbitrarily ‘realistic’ (i.e. complex) simulations, offers a wealth of possibilities for astronomical studies far beyond the present example which cannot be overstated.

4 ASTROPHYSICAL RESULTS AND DISCUSSION

Having completed our exposition of the ABC algorithm in Section we take the opportunity here to explore a number of interesting astrophysical results arising from our chosen case study in morphological transformation at high redshift. In Section we compare our SMC ABC-based constraints on the evolving merger rate in the early Universe against recent estimates from the literature based on simple close pair and asymmetric galaxy counts, highlighting the superior informative power of the former over the latter. Then in Section we discuss our (posterior) preference for merging over secular evolution as the dominant pathway to early bulge formation in the context of contemporary hydrodynamical and ‘semi-empirical’ simulations.

4.1 The evolving merger rate at the highest redshifts

As mentioned in Section to this paper the recent installation of WFC3 on HST has at last made accessible at high resolution the rest-frame optical morphologies of massive galaxies at the epoch of peak cosmic star formation and AGN activity (z ∼ 2; Warren et al. ; Lilly et al. ; Madau et al. ; Oesch et al. ), opening a unique window into the structural assembly of this first generation of Hubble sequence analogues (Cameron et al. ; Conselice et al. ; Szomoru et al. ). A particular motivation for our present case study in morphological transformation was to highlight the potential of model-based demographic analysis as a means to exploit this wealth of new data. In this section we thus explicitly demonstrate the advantages of such an approach (as implemented here via SMC ABC) over the standard ‘one type at a time’ mode of study.

Given the (expected) importance of merging as a driver of both star formation and morphological transformation under the canonical hierarchical clustering paradigm (White & Rees ), it is perhaps no great surprise that a total of four separate teams have recently attempted constraint of the z ≳ 1.5 (major) merger rate based on close pair and/or asymmetric galaxy counts in deep NIR imaging. Namely, Bluck et al. (, ) in the GNS field, Man et al. () in the COSMOS field (in the subregion of HST/NICMOS coverage), Law et al. () in a dedicated HST/WFC3 study from Cycle 17 sampling multiple fields and Williams et al. () in the UDS field; the first two exploring the same _M_gal > 1011 M⊙ regime as in our paper and the latter two probing much further down the mass function.

Having identified their target population of impending mergers (in the case of close pair selection) or recent merger remnants (in the case of asymmetric galaxy selection) each of these teams has then proceeded to estimation of the early Universe merger rate in the following manner. First, they compute the merger fraction as the number of mergers detected divided by the total number of galaxies within the target redshift interval and mass range minus the expected proportion of false detections (arising, for instance, from chance alignments along the LOS). Secondly, they adopt a third-party estimate of the merger visibility time-scale based on _N_-body/hydrodynamical simulations of galaxy–galaxy collisions; e.g. Man et al. () adopt τclose pair ∼ 0.4 ± 0.2 Gyr from Lotz et al. (). And thirdly, they either estimate the comoving volume density of their target galaxy population directly from the total count in the survey volume (as in Man et al. ), or adopt again a (hopefully more robust, i.e. less prone to cosmic variance error) third-party estimate from the literature (as in Bluck et al. who take graphic from Drory et al. ). The merger rate by volume is then computed simply as graphic.

In the left-hand panel of Fig. 11 we compare the resulting estimates of graphic from the close pair studies of Bluck et al. (, hereafter BL09) and Man et al. (, hereafter MA12) against the credible intervals derived from our SMC ABC analysis of the B11 CANDELS/EGS morphological mix (in the ‘co-evolution’ case of our model). We present the aforementioned pair count estimates – which are all well outside our model-based 1σ credible interval – as upper bounds on the major merger rate, since we believe that their use of a 1:4 mass ratio threshold may admit a significant number of ‘weak’ accretion events unlikely to generate a substantial change in morphology; e.g. Hopkins et al. favour instead a 1:3 mass ratio for the major/minor distinction. Interestingly, the MA12 results are at least qualitatively consistent with our posterior inference for the location of the break epoch at _t_br ∼ 2.55 Gyr (since z = 6; i.e. z ∼ 1.75) – though we note in any case that the plotted data points do carry rather large uncertainties (of ∼88 per cent at the 1σ level; MA12), owing in particular to the contribution from cosmic variance error (and necessarily stellar mass estimation error; cf. BR11 and Section ) in the determination of graphic.

The evolving merger rate over the first ∼3Gyr of massive galaxy formation (z ∼ 6 to z ∼ 1.5). Here λm(t) represents the rate by volume of major (viz. morphology-changing) mergers experienced by those systems already more massive than 1011 M⊙ prior to this accretion event. As in Fig. 10 we plot the (pointwise) 1σ and 3σ credible intervals and median curve deriving from our SMC ABC analysis of the B11 (CANDELS/EGS) morphological mix in dark grey, light grey and blue, respectively. The close pair count-based estimates of BL09 and MA12 (with the former scaled down by a factor of 2 to remove their ‘correction’ from fm to fgm; cf. footnote 22) are overlaid in the left-hand panel; marked here as loose upper bounds since their threshold of a 1:4 mass-ratio for designation as a ‘major’ merger may well be overly generous (e.g. Hopkins et al. 2009a,b favour 1:3 alternatively). As a more faithful comparison in the right-hand panel we present asymmetric galaxy count-based estimates derived from BL11 (following application of our in-house calibrations for Φ>1011 M⊙(z) and W; see Section 3.1 and Fig. 2). Also overlaid here are raw Type IV count-based estimates from our (C12) visual classifications, along with an estimate based on the observed Types I and II count in our lowest redshift bin; the arrow on the latter noting the fact that the marked position of this data point corresponds to the limiting case of all these mergers having occurred with only just enough time to allow fading of the characteristic post-merger irregular features prior to observation.

Figure 11.

The evolving merger rate over the first ∼3Gyr of massive galaxy formation (z ∼ 6 to z ∼ 1.5). Here λ_m_(t) represents the rate by volume of major (viz. morphology-changing) mergers experienced by those systems already more massive than 1011 M⊙ prior to this accretion event. As in Fig. 10 we plot the (pointwise) 1_σ_ and 3_σ_ credible intervals and median curve deriving from our SMC ABC analysis of the B11 (CANDELS/EGS) morphological mix in dark grey, light grey and blue, respectively. The close pair count-based estimates of BL09 and MA12 (with the former scaled down by a factor of 2 to remove their ‘correction’ from _f_m to _f_gm; cf. footnote 22) are overlaid in the left-hand panel; marked here as loose upper bounds since their threshold of a 1:4 mass-ratio for designation as a ‘major’ merger may well be overly generous (e.g. Hopkins et al. 2009a,b favour 1:3 alternatively). As a more faithful comparison in the right-hand panel we present asymmetric galaxy count-based estimates derived from BL11 (following application of our in-house calibrations for Φ>1011 M⊙(z) and W; see Section 3.1 and Fig. 2). Also overlaid here are raw Type IV count-based estimates from our (C12) visual classifications, along with an estimate based on the observed Types I and II count in our lowest redshift bin; the arrow on the latter noting the fact that the marked position of this data point corresponds to the limiting case of all these mergers having occurred with only just enough time to allow fading of the characteristic post-merger irregular features prior to observation.

As a more faithful comparison – that is, a comparison against rival estimates based also on post-merger (not pre-merger) observational signatures – we present in the right-hand panel of Fig. 11 the merger rate inferred from application of the graphic formula to both the asymmetric galaxy counts of Bluck et al. () (denoted here as BL11) and the raw Type IV counts of our B11 CANDELS/EGS data set (denoted C12). For the former we adopt τ_A_ ≈ 0.6 ± 0.3 Gyr from Conselice et al. () (see also Lotz et al. ) and for the latter our prior of τIrr morph ≈ 0.55 ± 0.25 Gyr; and in both cases we employ our in-house estimate of graphic, which has been calibrated against a large compilation of recent data points from the literature in a manner accounting for the presence of systematic biases in the underlying SED-fit-based stellar masses of each contributing survey (see Section and Fig. 2). Given our definition of graphic as the rate of mergers experienced by those systems already in excess of 1011 M⊙ at the stated epoch we must allow for the presence of galaxies only promoted above this mass threshold by the aforesaid accretion event by scaling back our raw count-based estimates according to the factor, graphic, with W the nuisance parameter introduced in Section . The error bars accompanying each data point in this right-hand panel of Fig. 11 denote the 1σ credible intervals accounting for the relevant uncertainties in these estimates (including, of course, those on the estimation of the population proportion from binomial count data, often mishandled in astronomical studies; cf. Cameron ).

Immediately evident from inspection of this right-hand panel of Fig. 11 is the reasonable agreement (well inside the 1σ errors) between our Type IV count-based estimates and those based on the asymmetric galaxy counts from Bluck et al. , confirming a fair degree of equivalence between the use of visual classification and non-parametric, quantitative indicators for the selection of high-redshift mergers. Perhaps the most striking impression made by this comparison, however, is the marked offset between the median curve of our SMC ABC constraint on graphic from full demographic analysis of the B11 (CANDELS/EGS) sample and those estimates based only upon the Type IV count at t ≲ 2.5 Gyr (since z = 6; i.e. at z ≳ 2.25). In particular, our median curve here favours a higher merger rate, effectively splitting the difference against the higher asymmetric galaxy count-based data point at t ∼ 1.5 Gyr (since z = 6). The explanation for this offset lies, of course, in the fact that we fit our model not just against the merger fraction but rather the full morphological mix, meaning that the fitted merger rate must not only account for the number of ongoing mergers at a given epoch but also the observed population of evolved systems (i.e. ellipticals and bulge-dominated discs) which must have (or in the latter case, will very likely have; see Section ) undergone a merger in their recent past. As an indication of the contribution of Types I and II counts to our fit of graphic we have also marked in Fig. 11 a pseudo-λm data point estimated by treating these evolved galaxy types as ongoing mergers observed at _t_obs − τIrr morph. The pseudo-λm data point thus computed confirms that the past rate of merging was very likely to have been higher than that indicated by our raw Type IV counts. Our counts of Types I and II galaxies also contribute a valuable source of data for constraining τIrr morph, which through our ABC analysis we verify is indeed probably close to our prior expectation (i.e. our prior mean of τIrr morph ≈ 0.55 ± 0.25 falls well within the 1σ credible interval of our posterior, graphic). The associated increase in confidence regarding the true value of τIrr morph contributes significantly to the reduced width of our SMC ABC credible intervals on graphic (based on the full demographics) relative to those based only on our Type IV counts.

4.2 The dominant role of merging over secular evolution for early bulge formation

Another interesting feature of the ABC-based model constraints derived in this paper concerns the relative dominance of merging over secular evolution as the favoured mechanism responsible for building up the first generation of massive bulges in early-type discs. In particular, the posterior probability density of our _P_Sph + D remnant parameter favours production of a Type II system in ≈33 ± 17 per cent [1σ] of mergers at these high redshifts – where the gas-rich nature of the progenitors has previously been argued as conducive to disc survival and/or rapid reformation around a central spheroid on the basis of hydrodynamical simulations (e.g. Robertson et al. ; Hopkins et al. ; but see Bournaud et al. regarding the difficulties of reproducing such merger outcomes in models with a realistically cold, turbulent interstellar medium). Interestingly, Hopkins et al. () have also argued for Type I suppression in gas-rich mergers as a solution to the inconsistency between conventional SAMs and the observed bulge and disc demographics in the local Universe. The posterior density on the secular evolution time-scale for our model, graphic [1σ] Gyr, on the other hand, renders this rival pathway to bulge formation rather unlikely for z ≳ 1.5 discs. Indeed given the above posterior means one can confirm via repeated simulation from our model that on average only ∼7 per cent of Type II systems detected at these redshifts will have been formed via secular evolution. This result is consistent with the recent observations of Genel et al. () and Hopkins et al. () from hydrodynamic simulations in which wind-driven feedback appears to destroy all but the most massive clumps in less time than required for their inwards migration under dynamical friction.

5 Conclusions

In this paper we have demonstrated the potential of ‘ABC’ for astronomical model analysis through a detailed case study in the morphological transformation of high-redshift galaxies. In the process we have derived tight constraints on the evolving merger rate in the early Universe, and revealed the relative dominance of merging over secular evolution for bulge formation at these epochs, through an ABC-based examination of the full population demographics of an _M_gal > 1011 M⊙, 1.5 < z < 3 sample from the B11 CANDELS/EGS data set. More importantly though our exposition of the contemporary ‘SMC’ implementation of ABC as well as two modern approaches to summary statistic selection will hopefully guide and inspire further astronomical applications of this powerful statistical technique.

Acknowledgments

This work is based on observations made under the CANDELS Multi-Cycle Treasury Program with NASA/ESA HST, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS5-26555.

This study makes use of data from AEGIS, a multiwavelength sky survey conducted with the Chandra, GALEX, Hubble, Keck, CFHT, MMT, Subaru, Palomar, Spitzer, VLA and other telescopes and supported in part by NSF, NASA and STFC.

EC is grateful for financial support from the Australian Research Council.

References

,

2012

,

MNRAS, 419

,

3

,

2116

,

2011

, preprint (arXiv:1106.6281)

,

2006

,

Rep. Prog. Phys.

,

69

,

3101

,

2002

,

Genetics

,

162

,

2025

,

2009

,

MNRAS

,

394

,

51

(BL09)

,

2011

,

ApJ

,

739

,

24

(BR11)

,

2011

, preprint (arXiv:1111.6993)

Cameron

E., Carollo C. M., Oesch P. A., Bouwens R. J., Illingworth G. D., Trenti M., Labbé I., Magee D.

,

2011

,

ApJ

,

743

,

146

,

2011

,

MNRAS

,

413

,

162

(C11)

,

2002

,

Biometrika

,

89

,

539

,

2010

,

Trends Ecol. Evol.

,

25

,

410

,

2011

, preprint (arXiv:1104.5470)

,

2006

,

J. R. Stat. Soc. Ser. B Stat. Methodol.

,

68

,

411

,

2010

,

Biometrics

,

67

,

225

,

2011

,

Stat. Commun. Infect. Dis.

,

3

,

2

,

2012

,

J. R. Stat. Soc. B

,

74

,

1

,

2003

,

Bayesian Data Analysis, 2nd edn

.

Chapman & Hall/CRC

,

,

2012

,

Int. J. Astrobiol.

,

11

,

103

,

2007

,

Bayes Linear Statistics: Theory and Methods

.

Wiley & Sons

,

,

2009

,

Bayesian Anal.

,

3

,

427

,

1997

,

Construction and Assessment of Classification Rules

.

Wiley

,

,

2011

,

Ecol. Lett.

,

14

,

816

,

2011

, preprint (arXiv:1111.6591)

,

2008

,

Stat. Appl. Genet. Mol. Biol.

,

7

,

26

,

2009

,

ApJ

,

702

,

1393

(K09)

,

2011

, preprint (arXiv:1112.4842)

,

2011

, preprint (arXiv:1110.4057)

,

2001

,

Monte Carlo Strategies in Scientific Computing

.

Blackwell Science Ltd

,

,

2011

, Galaxy Formation: An International Conference, Online at http://astro.dur.ac.uk/Gal2011, id.48

,

2012

,

ApJ

,

744

,

85

(MA12)

,

2010

,

ApJ

,

725

,

1277

(M10)

,

2011

, preprint (arXiv:1110.4700)

,

2011

,

MNRAS

,

413

,

2845

(M11)

,

2011

, preprint (arXiv:1112.4755)

,

2010

,

Stat. Appl. Genet. Mol. Biol.

,

9

,

34

,

2012

, preprint (arXiv:1201.0755)

,

1999

,

Mol. Biol. Evol.

,

16

,

1791

,

2011

, preprint (arXiv:1102.4432)

,

2011

, preprint (arXiv:1110.3816)

,

2003

,

Am. J. Math. Manage. Sci.

,

23

,

301

,

2007

,

Proc. Natl. Acad. Sci. USA

,

106

,

1760

,

1997

,

Genetics

,

145

,

505

,

2010

,

Bioinformatics

,

26

,

104

,

2008

,

Biometrika

,

20

,

10

,

2009

,

Phys. Rev. D

,

80

,

2

A Appendix

2018 LIKELIHOOD COMPUTATION FOR OUR STOCHASTIC MODEL OF MORPHOLOGICAL TRANSFORMATION IN THE -INDEPENDENT EVOLUTION’ CASE

In this appendix, we derive the likelihood function, graphic, of the observed data – i.e. the (HST WFC3/IR) _H_-band demographics for our sample of 126 galaxies at 1.5 < _z_ < 3 and _M_gal > 1011 M⊙ selected from the B11 CANDELS/EGS data set – given a particular set of input parameters, graphic, under the ‘independent evolution’ case of our stochastic model for high-redshift morphological transformation. To this end, we must first compute generic expressions for the probability densities of the time of birth and the time of last major merger, given an arbitrary redshift of observation. Integration of the conditional transition probabilities of the permitted evolutionary pathways to a given morphology as a function of the above returns the likelihood of observing that type at a particular redshift, and via the independence assumption allows a simple formulation of the complete likelihood function.

Probability density for the time of birth. The ‘birth’ of galaxies in our model (i.e. the promotion, via star formation or merging, of new systems to the top end of the 1.5 < z < 6 stellar mass function) is characterized as a non-homogeneous Poisson process of rate,

formula

in units of Mpc−3Gyr−1, with the origin of the time-variable set to z = 6. Both K and γ are treated here as nuisance parameters with graphic. The waiting time distribution for galaxy births under the above-specified Poisson process is, of course, exponential in Λ_b_ space, where

formula

Given that each galaxy experiences (by definition) only a single birth, the Λ_b_ epoch of this event represents a unique draw from the uniform distribution on [0, Λ_b_(_t_obs)] with _t_obs the cosmological time since z = 6 at the observed redshift, z i. Transformation of variables back to the time-domain specifies a probability density for the time of birth, _t_birth, on [0, _t_obs] of

formula

Probability density for the time of the last major merger. As for the case of the birth function examined above, merging is treated under our stochastic model as a non-homogeneous Poisson process, with a variable rate by volume (in Mpc−3 Gyr−1) set by the input parameters, αmerge, βmerge and _t_br, of

formula

Here _t_1.5 is used to denote the cosmological time between z = 6 and 1.5 (the lower bound of our sample). The number of mergers experienced by an individual galaxy prior to observation for a given birth time is thus Poisson distributed with the rate

formula

where Γ_m_(t) =

formula

That is, the probability of a galaxy experiencing k mergers between its epoch of birth and its epoch of observation is

formula

with the two cases of particular interest being graphic and graphic. Moreover, for the latter, the (non-zero) N m = k mergers will be distributed uniformly in Γ_m_ space on [Γ_m_(t_birth), Γ_m(t_obs)]. The corresponding probability density of the Γ_m epoch of the most recent merger is then simply that of the _k_th-order statistic,

formula

[remembering that graphic]. Summation over all possible (non-zero) merger counts, k = 1, … , ∞, weighted by their respective probabilities, P(N m = k), returns the overall probability density of the most recent merger Γ_m_ epoch as

formula

formula

formula

Once again transformation of variables delivers the form of this density back in the time-domain, namely

formula

Likelihood of a Type III late-type disc. As described in Section galaxies in our model may be born as either Type III discs or Type IV ongoing mergers, with the probability of the latter set by

formula

We note explicitly here the conditional dependence on our suite of nuisance parameters, graphic, where this final element, which derives from the shape of the z ∼ 1.5–3 stellar mass function, is taken as W ≈ 0.5 ± 0.2, that is, graphic. Since no transformation process permits a transition (back) to Type III from outside this state (i.e. Type III represents a transient class of the morphological-type Markov chain; see Fig. 3), the corresponding likelihood is the easiest to derive – it is merely the probability that a galaxy is born as a Type III disc and experiences neither merging nor secular evolution prior to observation.

For a given _t_birth (with 0 < _t_birth < _t_obs), the probability of classification C i = III under our stochastic model for morphological transformation is thus

formula

formula

formula

formula

formula

Here graphic represents the null secular evolution probability, which operates independently of the null merger probability and the probability of birth as a Type III system. According to the gamma-distributed form of the secular evolution time-scale under our model,

formula

and, as noted above,

formula

Integration of this expression by _t_birth over the corresponding density, f b(_t_birth), gives the general likelihood of Type III morphology for the galaxy at hand,

formula

Analytic solutions for this integral exist in a number of special cases, such as γ = 0 and 1 (the latter in terms of the error function), but not to our knowledge for an arbitrary, non-integer γ ≈ 0.65 as required to fit the observed build-up in number density above 1011 M⊙ (see Fig. 2). To evaluate this expression within our MCMC code, we thus employ the efficient numerical technique of Monte Carlo integration (Liu ) at run-time.

Likelihood of a Type I spheroid. In contrast to the simple case of the Type III disc presented above, there exist an infinite variety of evolutionary pathways potentially leading to the production of a Type I spheroid under our model. However, these pathways may all be considered degenerate with respect to the instance of one final merger (possibly that of the galaxy's birth) followed by fading of all post-merger irregular features and settling of the merger remnant into a spheroid morphology (see Fig. 3). The probability of transition through this penultimate step is, of course, dependent upon the time of the final merger, t m. Following the derivation above, we write down the likelihood of Type I formation for the _i_th galaxy with _t_obs corresponding to z i as an integral over _t_birth (and now t m also) as

formula

with

formula

formula

formula

formula

Here graphic represents the probability of the post-merger irregular features fading to reveal the final remnant morphology prior to observation, and graphic represents the probability that the final remnant emerges as a Type I pure spheroid rather than a Type II spheroid-plus-disc. The first half of this equation corresponds to the case of Type I classification at _t_obs for a galaxy that has experienced at least one merger since birth, while the second corresponds to the case of a sole, primal merger. Expansion of this expression in full returns another integral with no simple closed form, so once again Monte Carlo integration is ultimately required for its evaluation.

Likelihood of a Type IV ongoing merger. The likelihood of observing a Type IV ongoing merger is easily derived from the case of the Type I spheroid above with the trivial changes required being a complementation of the probability of the post-merger irregular features fading and removal of the binomial merger remnant type probability.

Likelihood of a Type II spheroid-plus-disc. As illustrated in Fig. 3 there are in fact two non-degenerate pathways to the formation of a Type II spheroid-plus-disc system under our model for high-redshift morphological transformation, corresponding to merging or secular evolution alternately. So, in principle, writing down the likelihood of this type, graphic, should be the most difficult of all. However, having derived likelihoods for each of the other morphological types, one may simply evaluate this case as

formula

formula

Such an approach of course requires accurate evaluation of the likelihoods for all three alternative morphologies. Testament to the robustness of our Monte Carlo integration approach is the fact that upon setting _P_Sph + D remnant = 0.5 and τsec ev = 100 Gyr, for which theoretically, graphic, the computational agreement of these likelihoods (for a single galaxy) evaluated in R typically extends to at least the fourth significant figure (with just _n_mc ∼ 1000).

Likelihood of the full data set. Recovering the likelihood of the full observational data set, graphic with graphic (i = 1, … , _N_gal), in the ‘independent evolution’ case is then simply a matter of taking the product of likelihoods for each individual galaxy (as even neighbouring galaxies evolve uncoupled in this scenario). Thus,

formula

Of course, when the assumption of independence is relaxed (in order to build a more physically realistic model) in the manner of the ‘co-evolution’ case considered in Section , this full data set likelihood function is no longer so readily tractable, and ABC methods thus become essential for reconstructing the posterior probability densities of the key input parameters.

Here we employ the symmetric multivariate normal distribution, graphic, for graphic with graphicchosen (by trial-and-error) to produce, on average, an ∼40 per cent acceptance rate for graphic. Running two separate threads of R on a 2.7-GHz dual core (4 GB RAM), 13-inch Macbook Pro laptop, we were able to verify satisfactory convergence of this chain after completing a target of 100 000 MCMC steps (with a 1000-step burn-in period on each thread) in a little less than 48 h.

Author notes

Indeed the authors can find no astronomical reference to either the terms ‘ABC’ or ‘likelihood-free’ (inference) on the NASA ADS data base, and Google Scholar indicates no astronomical citations yet to any of the biological/mathematical ABC literature mentioned herein. A more pedagogical treatise on the potential for ABC in astronomy presented by Chad Schafer and Peter Freeman at the Statistical Challenges in Modern Astronomy V conference in 2011 (available at http://www.springer.com/statistics/book/978-1-4614-3519-8/), which details two interesting uses for ABC in extragalactic data analysis, represents to our knowledge the only prior application in this field.

For readers familiar with the work of Bower et al. (), we note that the ‘discrepancy parameter’ introduced for their emulation of the galform SAM could not be employed as such in ABC as it is not (designed as) a gauge of model–data similarity; indeed, it serves an entirely different purpose in their analysis, acting as an error term for cosmic variance and structural uncertainty in their code.

As a caveat to the above referencing we note that (i) though the analyses of Henriques et al. () and Lu et al. () are both conducted broadly in the style of the ‘approximate likelihood’ approach formalized by Wood (), there are also a number of significant implementational differences unique to each, and (ii) though the work of Kampakoglou, Trotta & Silk () has in previous papers been cited as an example of MCMC-based SAM constraint, in fact, their study concerns a purely analytic model for which there exists no intrinsic stochasticity (thus, only approximate observational errors enter their likelihood computation). Finally, we refer the interested reader to Hartig et al. () for a concise overview of the similarities and differences between the ‘approximate likelihood’ and ABC approaches to inference from statistical simulation, and to Nott, Fan & Sisson () for an advanced treatment of the link between a particular version of ABC and the Bayes Linear technique (cf. Goldstein & Wooff ) underlying the model emulator approach.

PEGASE: Projet d’Etude des GAlaxies par Synthese Evolutive (Fioc & Rocca-Volmerange ).

GOODS-N/-S: Great Observatories Origins Deep-North/-South.

COSMOS: COSMic evOlution Survey.

UDS: UKIDSS [United Kingdom Infrared Telescope (UKIRT) Infrared Deep Sky Survey] UltraDeep Survey.

MOIRCS: Multi-Object Infrared Camera and Spectrograph.

NEWFIRM: NOAO (National Optical Astronomical Observatory) Extremely Wide-Field Infrared iMager.

Though redshift may be the more familiar baseline for many observational astronomers we have instead adopted here a scale of time (since z = 6, in Gyr) for the horizontal axis of this plot. This is because time, rather than redshift, forms the natural evolutionary variable of the stochastic processes described in our morphological transformation model. The top right-hand panels of Figs 4, 9 and 10 are marked with both, however, as a convenient reference for the appropriate conversion under our assumed cosmology.

SINS: Spectroscopic Imaging survey in the Near-infrared with SINFONI (Spectrograph for INtegral Field Observations in the Near Infrared).

Analysis of the (much larger) COSMOS photometric redshift catalogue (Ilbert et al. ) confirms that underdensities of this magnitude indeed arise frequently amongst the massive/most luminous galaxy population at these redshifts. In particular, for the optically luminous (i.e. rest-frame Z < −23.6 mag) members of that catalogue at 1.5 < _z_ < 3, which appear at comparable number density to the _M_gal > 1011 M⊙ population from the EGS, a maximum LOS separation of 150 Mpc or more occurs (roughly) once for every three random placements of the CANDELS/EGS observational footprint within the COSMOS field.

http://galaxy-catalogue.dur.ac.uk:8080/Millennium/

Recall that for the purposes of the present analysis we treat the time of ‘birth’ in our model as the epoch at which a galaxy first reaches the top end of the high-redshift stellar mass function, whether via star formation or merging. To identify this point in the De Lucia & Blaizot () mocks one must follow back the linked progenitor tree accordingly via sql query.

When using a non-symmetric proposal distribution as in the present case the ratio of sampling densities joins, of course, the likelihood ratio and the prior density ratio in computing the full MCMC acceptance probability.

Application of ABC to the problem of Bayesian model choice (cf. Grelaud et al. ; Toni & Stumpf ) is far from straightforward as an unfortunate summary statistic selection can lead to disastrously incorrect Bayes factors, even asymptotically (Robert et al. ). Recently though, Marin et al. () have made substantial progress in this field by establishing necessary and sufficient conditions on the validity of candidate summary statistics for this purpose.

With the choice of summary statistic typically (re-)cast in terms of the choice of reference data set in these astronomical studies – namely, whether to constrain, for instance, against the luminosity function (Bower et al. ; Lu et al. ; Cirasuolo et al. ), the Tully-Fisher relation (van den Bosch ; Tonini et al. ), the mass-metallicity relation (Pipino et al. ), and/or the black hole–bulge mass relation (Henriques et al. ).

MRSSE: Mean square Root Sum of Standard Errors.

One may note an intriguing similarity between the use of fourth (or fifth) nearest neighbour-based estimators in both statistical studies of distributional entropy and in astronomical studies of large-scale environment (cf. Baldry et al. ) – though it is unlikely there exists an underlying significance to this beyond the desirable error properties of the n ∼ 4–5 choice (Singh et al. ).

In fact, as noted by Fearnhead & Prangle (), since in ABC analysis we are only interested in the difference, graphic, the vector, graphic, may well be omitted from this above definition.

Another (well-motivated) alternative choice of scaling here would be the sample covariance matrix of the posterior particle population from our earlier run of SMC ABC under S :_k_=3.

Following Conselice (), Bluck et al. (, ) advocate correction of this raw merger fraction, which they denote _f_m, to a ‘galaxy merger fraction’, denoted _f_gm, via the relation, graphic. The motivation for this correction is to identify the ‘number of galaxies merging as opposed to the number of mergers’. However, we believe this to be a false move in context of their analysis of the most massive galaxies in the GNS, in which they identify impending mergers as those _M_gal > 1011 M⊙ systems host to a close companion (<30 kpc) of similar brightness (i.e. within ±1.5 mag in the observed H band; corresponding to a lower bound on the mass ratio of ∼1:4). Owing to the steepness of the galaxy stellar mass function at z ≳ 1.5 the vast majority of such companions will almost certainly be sub-1011 M⊙ – indeed in our sample we find just one close pair in which both are truly high-mass galaxies, and MA12 find just two. Hence, by ‘correcting’ to _f_gm as above one is in fact estimating the merger rate experienced by both massive galaxies and the (ill-defined) population of less massive galaxies that will ultimately merge with them, which does not seem a particularly useful exercise.

As a first-order estimate of the downwards revision in these merger rates that would result from switching to a 1:3 mass ratio selection one can suppose (perhaps naïvely) that the masses of close pair galaxies correspond to independent random draws from the z ∼ 1.5 luminosity function; in which case, Δlog λm ∼ −0.15 to −0.35, bringing the BL09 and MA12 determinations broadly into agreement with our own.

In fact, we downweight the count of Type II galaxies in this calculation by (1 − _P_Sph + D remnant) to acknowledge the (minor) role of secular evolution in early bulge formation, as discussed further in Section .

We note for reference that Glade, Ballet & Bastien () provide a brief review of the fundamentals of Poisson processes in their recent paper on the Drake equation.

© 2012 The Authors Monthly Notices of the Royal Astronomical Society © 2012 RAS

Citations

Views

Altmetric

Metrics

Total Views 1,481

1,077 Pageviews

404 PDF Downloads

Since 11/1/2016

Month: Total Views:
November 2016 1
January 2017 4
February 2017 3
March 2017 3
April 2017 4
June 2017 23
July 2017 3
August 2017 2
September 2017 6
October 2017 5
November 2017 7
December 2017 10
January 2018 16
February 2018 27
March 2018 12
April 2018 26
May 2018 27
June 2018 26
July 2018 22
August 2018 12
September 2018 14
October 2018 61
November 2018 9
December 2018 11
January 2019 11
February 2019 21
March 2019 11
April 2019 28
May 2019 17
June 2019 10
July 2019 17
August 2019 17
September 2019 14
October 2019 15
November 2019 19
December 2019 11
January 2020 27
February 2020 20
March 2020 6
April 2020 16
May 2020 17
June 2020 9
July 2020 21
August 2020 20
September 2020 8
October 2020 9
November 2020 17
December 2020 9
January 2021 8
February 2021 19
March 2021 27
April 2021 14
May 2021 13
June 2021 18
July 2021 19
August 2021 12
September 2021 11
October 2021 18
November 2021 13
December 2021 14
January 2022 23
February 2022 18
March 2022 26
April 2022 34
May 2022 21
June 2022 39
July 2022 28
August 2022 17
September 2022 28
October 2022 9
November 2022 15
December 2022 25
January 2023 18
February 2023 15
March 2023 13
April 2023 16
May 2023 12
June 2023 11
July 2023 7
August 2023 7
September 2023 15
October 2023 16
November 2023 16
December 2023 32
January 2024 13
February 2024 12
March 2024 16
April 2024 11
May 2024 16
June 2024 19
July 2024 16
August 2024 8
September 2024 15
October 2024 4

Citations

75 Web of Science

×

Email alerts

Citing articles via

More from Oxford Academic