Gaia Early Data Release 3 - Catalogue validation

A&A 649, A5 (2021)

Catalogue validation

C. Fabricius1, X. Luri1, F. Arenou2, C. Babusiaux3,2, A. Helmi4, T. Muraveva5, C. Reylé6, F. Spoto7, A. Vallenari8, T. Antoja1, E. Balbinot4, C. Barache9, N. Bauchet2, A. Bragaglia5, D. Busonero10, T. Cantat-Gaudin1, J. M. Carrasco1, S. Diakité6, M. Fabrizio11,12, F. Figueras1, A. Garcia-Gutierrez1, A. Garofalo5, C. Jordi1, P. Kervella13, S. Khanna4, N. Leclerc2, E. Licata10, S. Lambert9, P. M. Marrese11,12, A. Masip1, P. Ramos1, N. Robichon2, A. C. Robin6, M. Romero-Gómez1, S. Rubele8 and M. Weiler1

1Departament de FQA, Institut de Ciències del Cosmos, Universitat de Barcelona (IEEC-UB), Martí i Franquès 1, 08028 Barcelona, Spain
e-mail: claus@fqa.ub.edu
2GEPI, Observatoire de Paris, Université PSL, CNRS, 5 place Jules Janssen, 92190 Meudon, France
3Univ. Grenoble Alpes, CNRS, IPAG, 38000 Grenoble, France
4Kapteyn Astronomical Institute, University of Groningen, Landleven 12, 9747 AD Groningen, The Netherlands
5 INAF – Osservatorio di Astrofisica e Scienza dello Spazio di Bologna, Via Piero Gobetti 93/3, 40129 Bologna, Italy
6Institut UTINAM, CNRS, OSU THETA Franche-Comté Bourgogne, Univ. Bourgogne Franche-Comté, 25000 Besançon, France
7 Harvard-Smithsonian Center for Astrophysics, 60 Garden St., MS 15, Cambridge, MA 02138, USA
8INAF, Osservatorio Astronomico di Padova, Vicolo Osservatorio, Padova 35131, Italy
9SYRTE, Observatoire de Paris, Université PSL, CNRS, Sorbonne Université, LNE, 61 avenue de l’Observatoire, 75014 Paris, France
10 INAF – Osservatorio Astrofisico di Torino, Via Osservatorio 20, 10025 Pino Torinese, Torino, Italy
11 INAF – Osservatorio Astronomico di Roma, Via di Frascati 33, 00078 Monte Porzio Catone (Roma), Italy
12 ASI Science Data Center, Via del Politecnico, Roma
13LESIA, Observatoire de Paris, Université PSL, CNRS, Sorbonne Université, Université de Paris, 5 place Jules Janssen, 92195 Meudon, France

Received: 3 November 2020
Accepted: 27 November 2020

Abstract

Context. The third Gaia data release is published in two stages. The early part, Gaia EDR3, gives very precise astrometric and photometric properties for nearly two billion sources together with seven million radial velocities from Gaia DR2. The full release, Gaia DR3, will add radial velocities, spectra, light curves, and astrophysical parameters for a large subset of the sources, as well as orbits for solar system objects.

Aims. Before the publication of the catalogue, many different data items have undergone dedicated validation processes. The goal of this paper is to describe the validation results in terms of completeness, accuracy, and precision for the Gaia EDR3 data and to provide recommendations for the use of the catalogue data.

Methods. The validation processes include a systematic analysis of the catalogue contents to detect anomalies, either individual errors or statistical properties, using statistical analysis and comparisons to the previous release as well as to external data and to models.

Results. Gaia EDR3 represents a major step forward, compared to Gaia DR2, in terms of precision, accuracy, and completeness for both astrometry and photometry. We provide recommendations for dealing with issues related to the parallax zero point, negative parallaxes, photometry for faint sources, and the quality indicators.

Key words: catalogs / astrometry / techniques: photometric

© ESO 2021

1 Introduction

The third data release from the European Space Agency mission Gaia (Gaia Collaboration 2016, 2021) covers observations made between July 2014 and May 2017. It takes place in two stages, where the first (early) stage, Gaia EDR3, provides the updated astrometry and photometry. For convenience it also includes (nearly all) radial velocities from the second data release, Gaia DR2 (Gaia Collaboration 2018). The second stage, the full Gaia DR3, will include the same sources as Gaia EDR3 and add new radial velocities, spectra, light curves, astrophysical parameters, and orbits for solar system objects, as well as a detailed analysis of quasars and extended objects, for example. This paper describes the validation of Gaia EDR3 with the aim of facilitating the optimal use of the catalogue, comprehending its contents, and especially exposing the known issues. The approach followed is a transverse analysis of the properties of the various contents from an internal as well as external point of view. We also use the previous release, Gaia DR2, as a reference for comparisons in order to quantify the changes and improvements from one release to the next.

The general properties of the catalogue are described in Sect. 2. This includes the completeness in terms of limiting magnitude and angular resolution and also in terms of high proper motion stars. Likewise, we discuss how sources, and their identifiers, have changed since Gaia DR2.

The new astrometric solution (Lindegren et al. 2021b; Klioner et al., in prep.) determines two parameters (position), “2p”, five parameters (position, parallax, proper motion), “5p”, or six parameters, “6p”, for a source. In the latter case, the sixth parameter is the source colour (effective wavelength), listed in the Gaia archive as the pseudocolour. The use of three levels of solutions introduces tricky issues in the validation. Another important topic is the presence of spurious solutions and the means available to identify them. The validation of the astrometry is discussed below in Sect. 3.

Cycle 3 photometry is described by Riello et al. (2021) both in terms of the various calibration steps and in terms of data quality. Important changes have been made to the background modelling, leading to improvements for G_BP and G_RP photometry, which, however, suffers from other issues at the faint end. Changes to the overall response modelling have had undesired effects in a few cases, leading to the elimination of photometry for 5.4 million sources over a wide range of brightness. The validation of the photometry is discussed in Sect. 4.

Finally, Sect. 5 presents a statistical approach to comparing Gaia EDR3 and Gaia DR2. Additionally, we describe the overall results and recommendations in Sect. 6.

Updated radial velocities will only appear in Gaia DR3 and, as mentioned, Gaia EDR3 therefore contains values copied from Gaia DR2. The process for the identification of sources and the validation of the velocities is described in detail in Seabroke et al. (2021) and is therefore not discussed here. As a result of this process, 14 800 radial velocities (0.2%) were discarded.

In addition to the papers mentioned above, the Gaia archive provides online documentation1 with additional details about the data processing and the description of the data items in the published catalogue and its various accompanying tables. Gaia jargon is difficult to avoid and we therefore include a short dictionary of Gaia and Gaia EDR3 related terms in Appendix B.

2 General tests and completeness

Our general tests cover a wide range of issues from simple, yet indispensable, checks that the catalogue has been correctly populated to more sophisticated statistical tests on completeness.

2.1 Gaia DR2 sources in Gaia EDR3

An important question is how to find Gaia DR2 sources in Gaia EDR3 and determine whether they are still present and if they maintain their source identifiers. In general this is the case, but there are also many exceptions. In Gaia EDR3, we still have 96.2% of the Gaia DR2 sources at the same position to within 10 mas when taking Gaia EDR3 proper motions into account. If we, in addition, require the same identifier, we are down to 93.6%, so close to three percent of the sources now have a different identifier. This typically happens when two processes are in conflict. On-board, the transit of a single star may trigger more than one detection or a close pair may not be resolved. Later, on ground, the cross match algorithm has to decide if it is dealing with one or two sources. Since the previous data release, more information has become available and algorithms have been adapted to better handle the difficult cases. Figure 1, which is based on a representative subset, shows that sources in the range of 10 < G < 11.5 mag are strongly affected by this identifier change. This is a magnitude range where the on-board detection often detects two sources, rather than just one, especially in the upper rows of the focal plane, where images tend to be wider, cf. Rowell et al. (2021, Fig. 12). Measures are being taken in the on-ground processing to essentially eliminate this issue in future data releases. Also for the brightest sources, the swarm of spurious detections they trigger on-board Gaia creates problems for the cross match process, and a large fraction therefore have new identifiers. These points are discussed in detail in Torra et al. (2021).

More remarkable than a change of identifier is perhaps that many Gaia DR2 sources have no close counterpart in Gaia EDR3. If we use a closeness limit of 10 mas, as many as 3.8% of the sources have vanished. This limit may be a little too strict, for example, for faint sources, and if we relax it to 50 mas, only 0.61% of the sources are missing. Going all the way to 2′′, 0.18% are still missing. As shown in Fig. 1, where a 50 mas limit is used, 1–2% of the brightest sources have changed. At the faint end, the fraction of missing sources is very small at 18 mag, but it increases slowly until 20.7 mag, after which it rises sharply, reaching 20% at 21.1 mag. There can be many reasons for these changes, but binaries, crowding, and spurious sources are among the likely explanations.

On the other hand, counterparts may be offset for good reasons. For double stars we may, for example, have only the photocentre in Gaia DR2, but a resolved pair in Gaia EDR3. It can also be that the proper motion is unknown or erroneous, and this can be important even when propagating positions by the mere 0.5 yr, which is the difference in epoch between the two catalogues. Counterparts may also be completely missing, for example, if the detections upon which the Gaia DR2 source was based are now considered spurious.
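The counterpart test described above (propagate the Gaia EDR3 position back by the 0.5 yr epoch difference, then compare with the Gaia DR2 position) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for the validation: it uses a linear, small-angle propagation, it assumes that `pmra` already contains the cos(dec) factor as in the Gaia archive, and the function names are hypothetical.

```python
import math

MAS_PER_DEG = 3.6e6          # milliarcseconds per degree
DT_EDR3_TO_DR2_YR = -0.5     # epoch difference quoted in the text (EDR3 -> DR2)


def propagate_position(ra_deg, dec_deg, pmra_masyr, pmdec_masyr, dt_yr):
    """Linearly propagate a position by dt_yr years.

    Assumption: pmra_masyr is mu_alpha* = mu_alpha * cos(dec).
    Not valid very close to the celestial poles.
    """
    dec_new = dec_deg + pmdec_masyr * dt_yr / MAS_PER_DEG
    cos_dec = math.cos(math.radians(dec_deg))
    ra_new = ra_deg + pmra_masyr * dt_yr / MAS_PER_DEG / cos_dec
    return ra_new % 360.0, dec_new


def separation_mas(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Small-angle separation in mas (tangent-plane approximation)."""
    mean_dec = math.radians(0.5 * (dec1_deg + dec2_deg))
    dra = ((ra1_deg - ra2_deg + 180.0) % 360.0 - 180.0) * math.cos(mean_dec)
    ddec = dec1_deg - dec2_deg
    return math.hypot(dra, ddec) * MAS_PER_DEG


def has_counterpart(sep_mas, limit_mas=50.0):
    """True if the separation is within the chosen closeness limit."""
    return sep_mas <= limit_mas
```

For a source moving at 1000 mas/yr, neglecting the proper motion shifts the predicted position by 500 mas over half a year, which is far outside both the 10 mas and the 50 mas limits used in the text.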

A table, gaiaedr3.dr2_neighbourhood, is provided in the Gaia archive. We recommend using this table for looking up Gaia DR2 sources in Gaia EDR3.

Fig. 1. Fraction of Gaia DR2 sources maintaining the same source identifier in Gaia EDR3 (red curve), and the fraction – irrespective of the identifier – having a counterpart in Gaia EDR3 at the same position within 50 mas (blue curve).

2.2 Large-scale completeness of Gaia EDR3

On-board Gaia, sources are selected for observation based on two criteria: they must be roughly pointlike and they must be brighter than G = 20.7 mag. The instrument has, however, a limited capacity for the number of simultaneous observations, cf. de Bruijne et al. (2015), and when scanning close to the Galactic plane some observations – in particular the fainter ones – are never sent to ground because of limited mass-memory and telemetry capacity. As a simple measure of the actual magnitude limit, Fig. 2 shows the 99th percentile of the G magnitude across the sky using the healpix spatial index (Górski et al. 2005). The area with the brightest limit is Baade’s window, unsurprisingly, followed by low Galactic latitudes in general. Here, the finite on-board resources clearly dominate. The limit is fainter at higher latitudes, especially along the caustics of the scanning law, where more transits are available.
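A per-pixel magnitude limit of this kind can be sketched in a few lines. The example below is an illustration, not the validation code: it assumes the documented Gaia convention that the nested healpix level-12 index is obtained by integer-dividing `source_id` by 2^35, and the function names are hypothetical. It also reproduces the pixel counts and areas quoted for healpix levels 4 and 5.

```python
import numpy as np

FULL_SKY_SQ_DEG = 4.0 * 180.0**2 / np.pi   # about 41253 square degrees


def healpix_npix(level):
    """Number of healpix pixels at a given level (order)."""
    return 12 * 4**level


def healpix_pixel_area_sq_deg(level):
    """Area of one (equal-area) healpix pixel in square degrees."""
    return FULL_SKY_SQ_DEG / healpix_npix(level)


def healpix_from_source_id(source_id, level):
    """Nested healpix index at `level` (<= 12) from a Gaia source_id.

    Assumption: source_id // 2**35 is the nested level-12 index.
    """
    source_id = np.asarray(source_id, dtype=np.int64)
    return source_id // (2**35 * 4**(12 - level))


def magnitude_limit_per_pixel(source_id, g_mag, level=5, q=99.0):
    """q-th percentile of G in each populated pixel, as {pixel: limit}."""
    pix = healpix_from_source_id(source_id, level)
    g_mag = np.asarray(g_mag, dtype=float)
    return {int(p): float(np.percentile(g_mag[pix == p], q))
            for p in np.unique(pix)}
```

Level 5 gives 12 288 pixels of about 3.36 square degrees each, and level 4 gives 3072 pixels, matching the values quoted in the text and figure captions.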

Another way to estimate the completeness is to look at how the actual number of transits obtained for each source depends on the magnitude as illustrated in Fig. 3. It shows the first, third, and fifth percentiles of the number of transits and the number of visibility periods2 used in the astrometric solution. For a magnitude range where the catalogue is nearly complete, we expect these percentiles for the number of transits to lie well above the required minimum of five transits for a source. For the catalogue in general (top panel), this holds for sources brighter than G ~ 19 mag; however, when we reach G ~ 20 mag, the incompleteness is noticeable. For a field in Baade’s window (lower panel), sources are deprived of transits at a much earlier point and the incompleteness is severe at G ~ 19 mag.

The number of visibility periods used for astrometry is also of interest because a minimum of nine periods is required in order to publish the parallax and proper motion, cf. Lindegren et al. (2021b). Here, an insufficient number of periods is noticeable in the catalogue beyond G ~ 18.5 mag and even more severe after G ~ 19.5 mag; whereas for Baade’s window, insufficiency sets in about 1.5 mag earlier. Thinking ahead to Gaia DR4, the mission segment covered will be twice as long as for Gaia EDR3, the number of transits and visibility periods will be double, and a significant improvement in completeness can be expected.

Comparisons with models have been performed to check the data for the star density as a function of the position on the sky and of G magnitude. The reference model is GOG20. It is described in detail in the online documentation3 and is also released with the Gaia EDR3 set of catalogues4. In order to have good statistics in each pixel of the sky maps, the comparisons are done using healpix of order 4, corresponding to 3072 pixels per sky map (Fig. 4). The comparison with the GOG20 simulation shows that the overall sky densities in the data and in the model agree very well, although incompleteness may remain towards the inner Galaxy for faint stars. The Gaia EDR3 completeness has also improved with respect to Gaia DR2 for stars fainter than 18 mag (Fig. 5, upper panel). The predicted numbers are still higher than the observed ones, but this is mainly due to the counts in the Galactic plane, where the extinction is underestimated in GOG20 (Fig. 4, right panel, and Fig. 5, lower panel).
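The relative difference mapped in the right panel of Fig. 4 is defined in its caption as (EDR3 − GOG20)/EDR3. A minimal sketch of that per-pixel quantity is given below; the function name is hypothetical, and marking empty data pixels as NaN is an assumption made here for illustration.

```python
import numpy as np


def relative_difference(counts_data, counts_model):
    """Per-pixel (data - model) / data; NaN where the data count is zero."""
    data = np.asarray(counts_data, dtype=float)
    model = np.asarray(counts_model, dtype=float)
    out = np.full(data.shape, np.nan)
    ok = data > 0
    out[ok] = (data[ok] - model[ok]) / data[ok]
    return out
```

With this sign convention, a value of −1 means that the model predicts twice as many stars as are observed in that pixel, and a value of +1 means that the model predicts none.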

Figure 6 shows the improvement in the completeness of crowded regions between Gaia DR2 and Gaia EDR3. Here, we use the OGLE data from Udalski et al. (2008) which provides only an upper limit to the Gaia completeness due to the poorer OGLE spatial resolution.

For bright sources, detection efficiency starts to drop at G ~ 3 mag due to saturation that is too strong (Gaia Collaboration 2016). As a consequence, 20% of the stars brighter than magnitude 3 do not have an entry in Gaia EDR3. A few bright stars which were present in Gaia DR2 with rather dubious solutions, such as Polaris, are also missing in Gaia EDR3.

Fig. 2. Map in Galactic coordinates of the 99th percentile of the G magnitude at healpix level 5, i.e. in 3.36 sq deg pixels.

Fig. 3. Third percentile of the number of transits and number of visibility periods per source used in the astrometric solution. The shaded areas show the range of the first and fifth percentiles. The limits of five transits for inclusion in the catalogue and nine visibility periods for a full astrometric solution are also indicated. Top: catalogue in general. Bottom: field in Baade’s window.

2.3 Small-scale completeness of Gaia EDR3

The completeness at the smallest angular separations can be tested using a histogram of source-pair distances in a small dense field near the Galactic plane. Such a field will be completely dominated by distant field stars and there will be very few resolved binaries. Figure 7 shows (top panel) a histogram of source separations for such a field, where the black line indicates the expected relation for a random source distribution. For separations above 1.5–2′′, the actual distribution closely follows this line, indicating that we have a high completeness. However, below 1.′′5, and especially below 0.′′7, the completeness falls rapidly. This is expected, taking the current processing strategy into account, and it is caused by conflicts between neighbour observations both wanting to use the same pixels. Between 0.′′18 and 0.′′4, only a few pairs were resolved because of the particulars of their magnitude difference and their orientation with respect to the dominating scan directions. The bottom panel of the figure shows the same distribution but normalised with respect to the expected relation. This shows an apparently higher completeness for the lowest separations, which raises the question of whether spuriously resolved single sources are at play. We notice that below 0.′′4, as many as 74% of the pairs are composed of 2p solutions, making it difficult to judge if they are genuine. For separations between 0.′′4 and 0.′′5, the pairs of 2p solutions constitute only 40%.
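The normalisation used in the bottom panel of Fig. 7 can be illustrated with a simple model of the expected relation. The sketch below assumes a uniform random (Poisson) source distribution and neglects edge effects, so that the expected number of pairs in a separation bin is the number of unordered pairs times the ratio of the annulus area to the field area. The text does not spell out how the expected relation was computed, so this form and the function names are assumptions.

```python
import numpy as np


def expected_pair_counts(n_sources, field_area_arcsec2, bin_edges_arcsec):
    """Expected pair counts per separation bin for a uniform random field.

    Assumption: edge effects are neglected (separations << field size).
    """
    edges = np.asarray(bin_edges_arcsec, dtype=float)
    n_pairs = 0.5 * n_sources * (n_sources - 1)           # unordered pairs
    annulus_area = np.pi * (edges[1:]**2 - edges[:-1]**2)
    return n_pairs * annulus_area / field_area_arcsec2


def normalised_pair_histogram(separations_arcsec, n_sources,
                              field_area_arcsec2, bin_edges_arcsec):
    """Observed pair counts divided by the random-field expectation."""
    observed, _ = np.histogram(separations_arcsec, bins=bin_edges_arcsec)
    expected = expected_pair_counts(n_sources, field_area_arcsec2,
                                    bin_edges_arcsec)
    return observed / expected
```

For equal-width bins the expectation grows linearly with separation, so a ratio near one indicates high completeness and a ratio well below one indicates unresolved or missing pairs.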

Figure 8 shows the improvement in the spatial resolution of Gaia EDR3 using the Washington Double Star Catalog (WDS; Mason et al. 2001). It confirms again that incompleteness is severe below 0.′′7, but it has improved substantially when compared to Gaia DR2.

Fig. 4. Star counts in Gaia EDR3 (left), GOG20 (middle), and relative difference (right) in the magnitude range of 17 < G < 18. A relative difference, (EDR3-GOG20)/EDR3, of −1 (resp. +1) corresponds to an excess (resp. a deficit) of 100% in the GOG20 model with regard to Gaia EDR3 data. The colour scale is logarithmic in the left and middle panels, and linear in the right panel. The healpix level is 4.

Fig. 5. Top: star counts averaged among the healpix bins over the whole sky as a function of magnitude, in Gaia EDR3 (orange crosses) and Gaia DR2 (blue triangles), compared to GOG20 (black circles). Bottom: difference in counts between GOG20 and Gaia EDR3 over the whole sky (circles), excluding the Galactic plane (triangles), and excluding the Galactic plane and the Magellanic Clouds (crosses). The deficit in GOG20 at faint magnitudes is mainly due to the Magellanic Clouds as they are not included in the model.

Fig. 6. Improvement of the Gaia completeness at G = 20 mag versus some OGLE fields of different stellar densities from Gaia DR2 (red) to Gaia EDR3 (blue).

2.4 Completeness in crowded regions: globular clusters

As already mentioned, the Gaia instrument has a limited capacity for observing very dense areas, and sources in these fields will get fewer observations and the limiting magnitude will be brighter. We derive the completeness in a few globular clusters which have various levels of crowding. The procedure and the sample are the same as described in Arenou et al. (2018). Gaia EDR3 data are compared with the catalogue of HST photometry by Sarajedini et al. (2007). We recall that the completeness of HST data is derived using crowding experiments and is higher than 90% in the whole Gaia range.

Table A.1 presents the results for the inner and outer regions of the cluster sample and shows the combined completeness of the astrometry and the photometry. Since, by construction, the Gaia photometry is only published for sources with an astrometric solution, the photometric completeness cannot be higher than the astrometric completeness. It can, however, be lower, in particular in high density regions for the G_BP and G_RP photometry. Since this photometry is derived from dispersed images, crowding affects these measurements much more than astrometric measurements and G band photometry. We found that in globular clusters, about 20–30% of the stars having five or six parameter solutions do not have G_BP and G_RP magnitudes. One of the worst cases is NGC 6809, where the percentage of stars without G_BP and G_RP is 37%. Because of the lower level of crowding, open clusters are more favourable cases. In general, the percentage of stars without G_BP and G_RP is of the order of 1–3%.

In globulars, the completeness is still at the 60% level for G ~ 19 mag when the density is of the order of 10^5 stars/sq deg. The completeness is higher than 20% at G ~ 17 mag when the density is lower than a few 10^7 stars/sq deg. At the faint end, in favourable cases, the completeness is still at the 80% level at G ~ 20 mag (see Fig. 9). Inner and outer regions of the globulars can have very different levels of completeness, but this is not always the case. For instance, in NGC 5053, the completeness in the inner and outer regions is very similar and quite high (60% at G ~ 20 mag), whereas in NGC 2298 the two regions differ strongly: at G ~ 20 mag the completeness is about 10% in the inner region and 60% in the outer region. In the very crowded NGC 5286, the completeness in the inner region is 20% at G ~ 17 mag. However, the completeness level is still variable for similar densities and magnitudes, depending, among other factors, on the number of observations (see Fig. 9).
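A completeness curve of the kind discussed here is, at its simplest, the fraction of reference stars recovered in each magnitude bin. The sketch below illustrates only that ratio, under stated assumptions: the reference magnitudes are assumed to be already expressed on the G scale, the cross-match flags are assumed to be given, and no correction for the incompleteness of the reference catalogue is applied. The actual procedure is the one described in Arenou et al. (2018), and the function name is hypothetical.

```python
import numpy as np


def completeness_per_bin(ref_mag, has_gaia_match, bin_edges):
    """Fraction of reference stars with a Gaia counterpart, per magnitude bin.

    Bins without any reference star are returned as NaN.
    """
    ref_mag = np.asarray(ref_mag, dtype=float)
    matched = np.asarray(has_gaia_match, dtype=bool)
    n_ref, _ = np.histogram(ref_mag, bins=bin_edges)
    n_rec, _ = np.histogram(ref_mag[matched], bins=bin_edges)
    out = np.full(n_ref.shape, np.nan)
    ok = n_ref > 0
    out[ok] = n_rec[ok] / n_ref[ok]
    return out
```

Applying the function separately to stars in the inner and outer regions of a cluster would give the two curves that are compared in the text.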

Fig. 7. Top: histogram of source pair distances in a circular field of radius 0.5° centred at (l, b) = (330°, −4°) with a line showing the expected relation for a random distribution of the sources. Bottom: normalised histogram using the expected relation.

Fig. 8. Improvement of the completeness (in percent) of visual double stars from the WDS catalogue as a function of the WDS separation between components from Gaia DR2 (red) to Gaia EDR3 (blue).

Fig. 9. Global completeness as a function of G for the whole sample of globulars (grey line). Pink, blue, and orange lines indicate the completeness in different density ranges D. The shaded areas indicate the uncertainties.

2.5 High proper motion stars

Gaia DR2 contained 951 sources with a proper motion above 1000 mas yr−1 and 3726 with motions above 600 mas yr−1. The corresponding numbers for Gaia EDR3 are 633 and 2729, respectively, which is about 30% fewer. At first glance, this may look disturbing, but the fall is largely explained by spurious solutions in Gaia DR2 having been weeded out. This point is illustrated in Fig. 10. The top panel shows the 3 280 360 Gaia DR2 proper motions above 50 mas yr−1 against the parallax. There is a large population of negative parallaxes and also parallaxes approaching two arcseconds in absolute value. Conditions that trigger a spurious solution with a negative parallax, such as an unresolved duplicity, may equally well produce a spurious solution with a positive parallax as demonstrated by the overall symmetrical appearance. We return to this point in Sect. 3.2. The lower panel shows the corresponding plot for Gaia EDR3 with 3 273 397 sources, that is to say 6963 fewer. This plot has a better appearance with far fewer negative parallaxes. In particular, there are no more negative parallaxes for sources with proper motions above 300 mas yr−1. Sources with a proper motion higher than 1000 mas yr−1 must – in the great majority – be relatively nearby and will only very rarely have a parallax below 10 mas lest their tangential velocity become unrealistically large. We can therefore safely assume that solutions giving negative parallaxes are mostly spurious for these large proper motion cases. In Gaia DR2, 175 of the 951 sources with large proper motions had a negative parallax and if we assume that a similar number had a spurious positive parallax, we are down to about 600 good solutions. This is a number that compares favourably with the 633 such sources in Gaia EDR3. While it is possible to estimate the number of spurious solutions in Gaia DR2, it is not easy to identify which ones they are. We note that this improvement holds for high proper motion sources because they benefit more from the longer mission duration. We examine sources of lower proper motion in Sect. 3.2.
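The argument about the 10 mas parallax limit follows from the standard relation between tangential velocity, proper motion, and parallax, v_t = 4.74047 μ/ϖ km s−1 for μ in mas yr−1 and ϖ in mas. A short sketch, with hypothetical function names, makes the numbers explicit:

```python
KAPPA = 4.74047  # km/s for 1 au/yr, i.e. per (mas/yr) / mas


def tangential_velocity_kms(pm_masyr, parallax_mas):
    """Tangential velocity in km/s from total proper motion and parallax."""
    if parallax_mas <= 0:
        raise ValueError("a positive parallax is required")
    return KAPPA * pm_masyr / parallax_mas


def parallax_on_velocity_locus(pm_masyr, v_kms=500.0):
    """Parallax (mas) at which a given proper motion implies v_kms."""
    return KAPPA * pm_masyr / v_kms
```

A proper motion of 1000 mas yr−1 at a parallax of 10 mas corresponds to about 474 km s−1, close to the 500 km s−1 locus drawn in Fig. 10; any smaller parallax implies a still larger velocity.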

Comparing high proper motion stars with SIMBAD, we find that 8% of the SIMBAD stars with a proper motion higher than 600 mas yr−1 are missing in Gaia EDR3. Those are mostly stars with only a 2p solution in Gaia EDR3 and stars outside Gaia’s magnitude range, and a few show duplicated entries in Gaia EDR3.

Fig. 10. Proper motion versus parallax for large proper motions. Top: in Gaia DR2. Bottom: in Gaia EDR3. The grey line shows the locus of tangential velocity 500 km s−1.

2.6 Sources without parallax and proper motion

As many as 344 million sources have only a position published from the astrometric solution and neither parallax nor proper motion. The requirements for maintaining a full astrometric solution are detailed in Lindegren et al. (2021b). A source must, as mentioned, have at least nine visibility periods, the formal uncertainties must be sufficiently small, and the source must be brighter than G = 21 mag according to the photometry available at the time of the astrometric processing, that is in Gaia DR2. The majority of the 2p sources have simply too few observations and will obtain a full solution in later releases.

If we look specifically at sources brighter than 19 mag, where the completeness is high, we have 8.8 million 2p solutions out of 575.9 million sources, that is 1.5%. This is a clear improvement over Gaia DR2, which contained 13.8 million 2p solutions among 568.1 million sources, that is 2.4%. For fewer than half of these brighter sources, the problem is simply a lack of observations. The rest have, for a large part, a problem with a close neighbour. This follows from the various indicators in the catalogue, such as ipd_frac_odd_win, which indicates observation window conflicts for wider pairs, ipd_frac_multi_peak, indicating resolved, closer pairs, and ipd_gof_harmonic_amplitude, indicating asymmetric images for the closest pairs. The processing approach in Gaia EDR3 did not intend to resolve these pairs.

3 Astrometric quality of Gaia EDR3

The astrometric solution for Gaia EDR3, cf. Lindegren et al. (2021b), is based on 33 months of observations as compared to the 21 months for Gaia DR2. Therefore, there are reasons to not only expect a much improved precision, but also a much better ability to disentangle proper motions and parallaxes even for sources where (some) transits are biased by a close neighbour.

As already mentioned, there are three flavours of astrometric solutions in Gaia EDR3: 2p, with only a position; 5p, which adds parallax and proper motion; and 6p, where a pseudocolour is also derived. The last category encompasses the faintest sources and sources without a known colour in Gaia DR2 or with a clearly biased colour. Therefore, the more accurate astrometric solutions are those with five parameters (5p). This is partly because of the correlations introduced by deriving a colour from the astrometric measurements, and partly because the photometric colour is missing precisely for the sources observed in the more difficult conditions, for example, in crowded areas or with a brighter source in the vicinity. Here, we concentrate on the 5p and 6p solutions, which are the ones of more interest.

We test the astrometry with an emphasis on the parallaxes. Here, large negative values reveal the presence of spurious solutions; distant objects such as quasars (QSOs) are suited for testing the parallax zero point; and star clusters and binaries, with sources located at nearly the same distance, are ideal for finding magnitude and colour dependent parallax errors. For the parallax precision, we use the negative wing of the parallax distribution. For proper motions, star clusters and binaries are again ideal when looking for magnitude dependent errors.

Lindegren et al. (2021a) calculated a parallax zero point correction depending on the magnitude, colour, and ecliptic latitude, using different tracers (QSOs, red clump stars, stars from the Large Magellanic Cloud (LMC), binaries). The paper provides recipes for correcting the zero point error, at least in a statistical sense. We therefore also look into the effect of applying these corrections.

3.1 Imprints of the scanning law

The scanning law for Gaia (Gaia Collaboration 2016, Sect. 5.2) specifies the direction of the spin axis of the spacecraft – and thereby also the great circle being observed – as a function of time. The axis precesses around the Sun at an angle of 45° with a period of 63 days, thereby creating a characteristic pattern in the number of transits across the sky with high values at ±45° ecliptic latitude and in loops in between.

Figure 11 shows the average separation between sources in Gaia EDR3 and their counterparts in Gaia DR2, taking Gaia EDR3 proper motions into account. Only separations below 10 mas and sources with a known proper motion have been used (1458.7 million sources). The map shows that the discrepancy is normally at the sub-mas level, but there are also specific areas of the sky where it approaches 2 mas. On the one degree scale, there is a clear scan-law pattern. The plot is dominated by very faint sources and does not reflect the performance at the bright end. Although it cannot be deduced from the map itself, we believe that it largely shows positional errors of Gaia DR2.

Figure 12 shows the remaining chequered-pattern systematics in the parallaxes due to the Gaia scanning law in the direction of the LMC and of the Galactic centre, where the parallaxes are small and homogeneous enough so that variations in the median values merely reflect the parallax errors. The systematics for 6p are larger than for 5p, but they have otherwise decreased together with the chequered-pattern since DR2 (cf. Arenou et al. 2018, Fig. 13).

Fig. 11. Map in Galactic coordinates of the mean positional difference between Gaia EDR3 and Gaia DR2, having propagated the Gaia EDR3 positions to the epoch of Gaia DR2.

Fig. 12. Maps in Galactic coordinates of median parallaxes in the direction of the LMC (top) and Galactic centre (bottom) for 5p (left) and 6p (right) solutions. To increase the contrast, the represented parallax range is [−0.05, 0.05] mas, although the median parallax in most of the field is above 0.05 mas.

3.2 Spurious astrometric solutions

An astrometric solution for a source is derived from a cluster of transits covering nearly three years and associated with the same source identifier. It is a fundamental assumption that the image parameters for each of these transits all refer to the same astrophysical source. Ideally, this is an isolated point source, but it could also be the photocentre of an unresolved pair. For close source pairs, this assumption breaks down. Depending mainly on the scan angle, different transits may then give image parameters for one or the other source, and sometimes for the photocentre. In these cases, there is a risk that the astrometric solution will produce meaningless proper motions and parallaxes. Here, we call these spurious astrometric solutions. Several different quality indicators for the solution help to identify such cases, cf. Lindegren et al. (2021b).

A common way of selecting reliable astrometric solutions – in particular parallaxes – is to use only parallaxes larger than some apparently safe value or only parallaxes much larger than their estimated uncertainties. At first glance, it does appear safe to use only parallaxes with relatively small errors, for example with parallax_over_error > 5. There are 192.21 million sources in Gaia EDR3 with such good parallaxes. We use the limit of five as an illustrative example and not as a recommendation. The point is simply that the number of negative parallaxes fulfilling the parallax_over_error < −5 condition is expected to be extremely small for a Gaussian error distribution.

Formal uncertainties can, however, be misleading. They are based on the assumption that the source is undisturbed and can be properly described using a five-parameter model. This is normally true, but far from always. One way to find spurious solutions is to count the fraction of very negative parallaxes, for example for the present example smaller than minus five times the formal uncertainty. There are 3.04 million sources with parallax_over_error < −5. These solutions are clearly spurious.

We can reasonably assume that a disturbance giving rise to a negative (spurious) parallax, for example image parameters affected by duplicity or crowding, could just as well have produced a spurious solution with a positive parallax and with roughly the same probability. We therefore get a conservative estimate of the number of spurious, positive parallaxes by counting the negative ones. Needless to say, disturbances can also be so small that they merely produce slightly wrong positive parallaxes, but these cases are harder to find.

We can therefore say that among the 192.21 million significant, positive parallaxes, of the order of 3.04 million are spurious, that is to say 1.6% of this “good” sample. As illustrated in Fig. 13 (upper panel), the spurious fraction, determined in this way, strongly depends on magnitude and is much higher for 6p solutions than for 5p ones. We recall here that 6p solutions are used for sources where some circumstances prevented good _G_BP and _G_RP photometry from being determined in the processing for Gaia DR2. It is reasonable to assume that it is these very circumstances that have also led to the spurious astrometry rather than the inclusion of a sixth parameter. The lower panel of Fig. 13 shows that areas such as the LMC and the Galactic centre have a particularly high fraction of spurious solutions. This is very likely caused by crowding. When evaluating parallaxes for a particular sample of sources, where only positive parallaxes are selected, we therefore recommend also selecting a similar sample, but with negative parallaxes, in order to evaluate the likely fraction of spurious results.
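The recommended check can be sketched as follows: for a user sample, count the sources beyond the positive limit and those beyond the mirrored negative limit, and take their ratio as a rough estimate of the spurious fraction. The function name and the plain-list input are illustrative choices, not part of any Gaia tool.

```python
from typing import Iterable


def spurious_fraction(parallax_over_error: Iterable[float],
                      threshold: float = 5.0) -> float:
    """Estimate the spurious fraction of a 'significant parallax' sample.

    The number of sources with parallax_over_error < -threshold is used as a
    proxy for the number of spurious sources with parallax_over_error > threshold.
    """
    n_positive = 0
    n_negative = 0
    for snr in parallax_over_error:
        if snr > threshold:
            n_positive += 1
        elif snr < -threshold:
            n_negative += 1
    if n_positive == 0:
        raise ValueError("no source exceeds the positive threshold")
    return n_negative / n_positive


# Catalogue-wide counts quoted in the text, in millions of sources.
print(f"{3.04 / 192.21:.1%}")  # 1.6%
```

The same function can be applied per magnitude bin or per sky region to reproduce the kind of dependence discussed for Fig. 13, provided each bin holds enough sources.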

Thanks to the better angular resolution in Gaia EDR3, the number of spurious solutions has decreased substantially since Gaia DR2. This can be illustrated with a proper motion diagram near the Galactic centre (Fig. 14). In this region of the ecliptic, with a small number of visibility periods, there are mostly two perpendicular scanning directions which are now barely visible, but which clearly appeared with spurious proper motions in the corresponding Gaia DR2 figure (Arenou et al. 2018, Fig. 11b). With many half-resolved doubles in this very dense region, distorted image parameters can explain a large number of spurious solutions, that is to say solutions, which have large proper motion errors in Gaia DR2.

Compared to Gaia DR2, the dispersion of the proper motions in Fig. 14 is a factor > 3 smaller, so that one could wonder whether spurious solutions are still present. Here the ipd_gof_harmonic_amplitude (footnote 5) parameter can be of help: values above 0.1 for sources with ruwe (footnote 6) larger than 1.4 characterise resolved doubles, which have not been correctly handled yet. Using this parameter as an explanatory variable on Fig. 14 (upper panel), we conclude that the corona of relatively large proper motions can be spurious, since the ipd_gof_harmonic_phase (footnote 7) in Fig. 14 (lower panel) suggests that these sources were partly resolved along the two principal scanning directions.

Fig. 13 Estimated fraction of spurious solutions for sources where the formal uncertainty is at least five times smaller than the parallax. Top: fraction by G for the whole catalogue (dashed line) as well as separately for 5p (lower, red line) and 6p (upper, blue line) solutions. Bottom: skymap of the fraction for the whole catalogue in Galactic coordinates.

Fig. 14 Proper motion diagram of sources near the Galactic centre within a 0.5° radius. Top: colour-coded by ipd_gof_harmonic_amplitude. Bottom: coded by ipd_gof_harmonic_phase. Reddish points in the top panel reveal potentially spurious solutions.

3.3 Large-scale systematics

The quasars are distant enough that the DR3 measured parallax directly gives the astrometric error; QSOs can thus be used to estimate the large-scale variation of the parallax systematics. The QSO sample used is mostly a subset from outside of the Galactic plane of the sources listed in the table agn_cross_id published as part of the Gaia Archive for EDR3. The sample was filtered from potential 5_σ_ outliers in parallax or proper motion and from potential non-single objects using: ruwe < 1.4 and ipd_frac_multi_peak ≤ 2 and ipd_gof_harmonic_amplitude < 0.1.
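The selection above can be written as a simple predicate. This is a minimal sketch: the three non-single-object cuts are taken from the text, whereas the 5σ outlier test is implemented here under the assumption that outliers are measured relative to zero parallax and zero proper motion, which may differ from the procedure actually used. Sources are represented as plain dictionaries keyed by archive column names.

```python
def is_clean_qso(src: dict, n_sigma: float = 5.0) -> bool:
    """Return True if a QSO candidate passes the quality cuts."""
    # Cuts against potential non-single objects, as quoted in the text.
    looks_single = (
        src["ruwe"] < 1.4
        and src["ipd_frac_multi_peak"] <= 2
        and src["ipd_gof_harmonic_amplitude"] < 0.1
    )
    # Assumption: a QSO has zero true parallax and proper motion, so an
    # outlier is any parameter deviating from zero by more than n_sigma.
    not_outlier = all(
        abs(src[name]) <= n_sigma * src[name + "_error"]
        for name in ("parallax", "pmra", "pmdec")
    )
    return looks_single and not_outlier


example = {
    "ruwe": 1.02, "ipd_frac_multi_peak": 0, "ipd_gof_harmonic_amplitude": 0.03,
    "parallax": -0.05, "parallax_error": 0.20,
    "pmra": 0.10, "pmra_error": 0.25,
    "pmdec": -0.30, "pmdec_error": 0.22,
}
print(is_clean_qso(example))
```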

Median parallaxes were computed in overlapping regions of radius 5° having at least 20 QSOs and are shown in Fig. 15. Compared to the similar plot done for Gaia DR2 (Arenou et al. 2018, Fig. 15), the improvement in the top panel of Fig. 15 is very clear for the 5p solutions. A slight north-south asymmetry appears with, for example, parallaxes at ecliptic latitudes _β_ < −30° being about 8 _μ_as more negative than those at _β_ > 30°. Applying the zero point correction from Lindegren et al. (2021a) removes this asymmetry (see bottom panel of Fig. 15); some east-west asymmetry along the ecliptic of a few _μ_as may, however, remain. It is more difficult to draw conclusions about the 6p solutions: they represent only 20% of the QSO sample and have larger uncertainties, so the amplitude of the variations may be more related to random errors than to systematics.
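The regional medians can be sketched as below for a single field centre. The input layout, a list of (longitude, latitude, parallax) tuples in degrees and mas, and the function names are assumptions made for this illustration; a full map would repeat the call over a grid of overlapping centres.

```python
import math
from statistics import median


def angular_separation_deg(lon1, lat1, lon2, lat2):
    """Great-circle separation in degrees (haversine formula)."""
    l1, b1, l2, b2 = map(math.radians, (lon1, lat1, lon2, lat2))
    h = (math.sin((b2 - b1) / 2.0) ** 2
         + math.cos(b1) * math.cos(b2) * math.sin((l2 - l1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def regional_median_parallax(centre, qsos, radius_deg=5.0, min_count=20):
    """Median parallax of the QSOs within radius_deg of centre.

    Returns None when fewer than min_count QSOs fall inside the field.
    """
    lon0, lat0 = centre
    inside = [plx for lon, lat, plx in qsos
              if angular_separation_deg(lon0, lat0, lon, lat) <= radius_deg]
    return median(inside) if len(inside) >= min_count else None


# Synthetic, hypothetical QSOs along the ecliptic equator.
qsos = [(10.0 + 0.1 * i, 0.0, 0.001 * i) for i in range(25)]
print(regional_median_parallax((10.0, 0.0), qsos))
```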

Fig. 15 Maps in ecliptic coordinates of the variations of QSO parallaxes (mas) in 5° radius fields. Top: 5p solution. Bottom: 5p with zero point correction.

3.4 Comparison to external data

We compared the Gaia EDR3 parallaxes with external catalogues described in detail in the Gaia EDR3 online documentation (footnote 8). Those are the same as used in Arenou et al. (2018), except that we updated the APOGEE catalogue to the DR16 version (Ahumada et al. 2020), the ICRF catalogue to its third realisation (Charlot et al. 2020), and added dwarf spheroidal galaxy (dSph) members. We show the summary of the results in Table 1 without and with the parallax zero point correction of Lindegren et al. (2021a) applied. The correction significantly improves the parallax differences, the exceptions being the LMC and SMC (Small Magellanic Cloud) stars sub-selected by their Gaia DR2 radial velocities, which are bright, and the two largest dSph of our sample, Sculptor and Fornax. The parallax difference with HIPPARCOS is within the expected HIPPARCOS parallax zero point uncertainty (up to 0.1 mas, Arenou et al. 1995), but a correlation of the parallax difference with the magnitude is seen for HIPPARCOS stars brighter than G = 6 mag. The jump in the parallax zero point at G ~ 13 mag (Lindegren et al. 2021a) is seen in the APOGEE comparison and removed by the application of the parallax zero point correction. The variation in the QSO parallax with the magnitude for 5p solutions was also removed by the parallax zero point correction. A correlation of the parallax zero point with the pseudo-colour is seen in the dSph, in particular in Fornax, which was reduced but not fully removed by the parallax zero point correction of Lindegren et al. (2021a).

Concerning the proper motions, we looked in particular at the difference between the Gaia proper motion and the proper motion derived from the positions of Gaia and HIPPARCOS. By construction (Sect. 4.5 of Lindegren et al. 2021b), the global rotation between those proper motions, seen in Gaia DR2 (Lindegren et al. 2018b; Brandt 2018), is not present anymore. However, a variation of this rotation with magnitude and colour is still present but smaller than for Gaia DR2 (the maximum variation reaching 0.1 mas yr−1 for bright or red sources). We note that between Gaia and HIPPARCOS proper motions, a global rotation is still present with w = (− 0.120, 0.173, 0.090) ± 0.005 mas yr−1. This is a deviation well within the estimated accuracy of the HIPPARCOS spin.

Fig. 16 Parallaxes averaged among healpix bins over the whole sky as a function of magnitude for Gaia EDR3 (orange crosses), GOG20 (black circles), and Gaia DR2 (blue triangles).

3.5 Comparison to a Milky Way model

We compared the astrometric data to that of the GOG20 simulation in order to investigate potential systematic errors. This was done by computing the median of the parallaxes and the median of the proper motions in each healpix bin of the sky map for all of the data and the model. The comparison for the median parallaxes is shown in Fig. 16 as a function of magnitude for Gaia EDR3, Gaia DR2, and the GOG20 simulation.

The median parallaxes are generally in very good agreement between Gaia EDR3 and GOG20, especially at magnitudes larger than ten. However, there is a systematic difference which, in absolute value, depends on magnitude; it is quite high on the bright side, where it exceeds 1 mas. This systematic difference between the data and model simulation is a bit reduced in Gaia EDR3 compared to Gaia DR2. At G > 10 mag, the difference goes below 0.1 mas. Regarding the proper motions, the model and data present approximately similar patterns in all magnitude ranges. However, there are systematics, as was already noted in the validation of Gaia DR2. Overall, Gaia EDR3 data are as expected from our knowledge of the Galactic kinematics up to very faint magnitudes and it is probably the model which suffers from systematics, or it does not account for asymmetries. Indeed, we note that the change of kinematics prescriptions from GOG18 to GOG20 generally allows for a better agreement with the data.

3.6 Astrometric zero point and precision of the parallaxes from cluster analysis

The zero point of the parallaxes was verified using three external reference catalogues of open clusters. We made use of Dias et al. (2014; hereafter DAML), Kharchenko et al. (2013; hereafter MWSC), and finally Cantat-Gaudin et al. (2020) based on Gaia DR2 parallaxes. We selected the most suitable clusters for this aim: a selection of about 200 clusters with well known parameters (hereafter best sample) for a total of about 70 000 stars; and a wider sample of 2043 clusters including 250 000 stars (hereafter whole sample). The best sample is the same sample that was already used to validate Gaia DR2 in Arenou et al. (2018). The whole sample is based on Cantat-Gaudin et al. (2020) cluster identification and parameters. Cluster members for this analysis were obtained using Gaia EDR3 proper motions selected within one _σ_ from the mean value. Clusters closer than 1000 pc show an intrinsic internal dispersion in the parallaxes and are not suitable for estimating the zero point. When we used clusters located farther away, we derived an average zero point difference (ϖ_Gaia − ϖ_reference) = − 0.059 mas for MWSC and − 0.091 mas for DAML, but with a large σ ~ 0.2 mas. Looking at the trends of the zero point with colour and magnitude, we find a complex pattern. In Fig. 17, we plotted the differential parallax to the cluster median, Δ_ϖ_, for the whole sample of clusters located farther than 1000 pc, both normalised and not normalised to the nominal parallax uncertainties. We note that Δ_ϖ_ gives an indication about zero point changes. When plotted versus G, we detected discontinuities in the zero point at G ~ 11, 12, and 13 mag. Strong variations are evident for stars bluer and redder than _G_BP − _G_RP ~ 0.9 mag. Figure 17 (right panel) presents the variation of Δ_ϖ_ in the colour-magnitude diagram, showing a number of discontinuities and a complex pattern.
At faint magnitudes, red stars have a higher dispersion; however, the effect can be due to the less reliable membership, while at red colours, the large variations can reflect poor statistics. When divided by the nominal uncertainty, these patterns are still present with a reduced amplitude, implying that nominal uncertainties on the parallax do not account for the zero point variation, that is to say nominal uncertainties are underestimated.
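The differential parallax used in this test can be sketched for a single cluster as follows. The function name and list-based inputs are illustrative; the membership selection and the LOWESS smoothing shown in the figures are not reproduced here.

```python
from statistics import median


def differential_parallaxes(parallaxes, parallax_errors):
    """Return (delta, scaled) for the members of one cluster.

    delta  : parallax minus the cluster median parallax (mas)
    scaled : delta divided by the nominal parallax uncertainty
    """
    reference = median(parallaxes)
    delta = [plx - reference for plx in parallaxes]
    scaled = [d / err for d, err in zip(delta, parallax_errors)]
    return delta, scaled


# Hypothetical members of a distant cluster (values in mas).
members = [0.50, 0.52, 0.48, 0.51, 0.49]
errors = [0.02, 0.02, 0.02, 0.02, 0.02]
delta, scaled = differential_parallaxes(members, errors)
print(delta, scaled)
```

Since all members of a distant cluster should share the same parallax, any trend of these differences with magnitude or colour points to a zero point variation rather than to a real distance spread.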

Clusters are very good targets to test the quality of the parallax correction from Lindegren et al. (2021a) since all the stars are expected to have the same parallax. In addition, clusters can be found close to the Galactic plane, where no calibration tracers are located. This allows for an independent verification. We applied this correction to the cluster data, using the Matlab code (footnote 9) provided in the Lindegren et al. (2021a) paper.

The results are shown in Fig. 18, where we plotted Δ_ϖ_ and the scaled Δ_ϖ_ (that is to say Δ_ϖ_ divided by the nominal parallax uncertainty) as a function of the G magnitude, and finally the residuals to the median in the colour-magnitude diagram. This correction reduces the dispersion inside the clusters at bright magnitudes and bluer colours, while at faint magnitudes (G > 18 mag) or redder colours, the dispersion is still high. The median values scaled to the nominal uncertainties are always < 1, which indicates that nominal uncertainties account for the residual variations. Clearly this positive result should be taken with caution. It refers to a specific range of colours and positions in the sky.

Finally, we compare the parallaxes of single stars in Gaia DR2 and Gaia EDR3 for the whole cluster sample. The median difference is (ϖ Gaia DR3−ϖ Gaia DR2) = 0.017 mas (with 16th percentile = −0.047 mas; 84th percentile = 0.082 mas) with a dependence on the magnitude.

Table 1

Summary of the comparison between the Gaia parallaxes and the external catalogues.

Fig. 17 Left: Δ_ϖ_ (top) and scaled to the nominal uncertainties Δ_ϖ_ (bottom) versus G for the whole sample of clusters. The solid lines show the LOWESS (locally weighted scatterplot smoothing) of the stars bluer (redder) than _G_BP − _G_RP = 0.9 (blue and red lines), while the black line is for the whole sample. Right: CMD of the whole sample where the colour shows the differential parallax to the median.

Fig. 18 Left: Δ_ϖ_ (top) and the scaled Δ_ϖ_ (bottom) versus magnitude for the whole sample after the correction to the parallax zero point is applied. The solid lines show the LOWESS of the stars bluer (redder) than _G_BP − _G_RP = 0.9 mag (blue and red lines). The black line is for the whole sample. Right: CMD of the whole sample where the colour shows the differential parallax to the median, and the analogous scaled to the nominal parallax uncertainties after parallax zero point correction (see text for details).

3.7 Uncertainty of the astrometric parameters

We evaluate the actual precision of the astrometric parameters partly from parallaxes and proper motions of QSOs and of stars in the LMC and partly from deconvolution of the negative parallax tail. As discussed by Lindegren et al. (2018a), it is useful to describe the true external parallax uncertainty, _σ_ext, as the quadratic sum of the formal catalogue uncertainty (parallax_error) times a multiplicative factor (k) plus a systematic error (_σ_s): $\begin{equation*} \sigma_{\textrm{ext}}^2 = k^2\sigma_i^2+\sigma_{\textrm{s}}^2.\end{equation*}$ (1)

In addition to this, the catalogue uncertainties incorporate part of the excess noise of the solution when present. Consequently, large uncertainties typically correspond to both fainter sources and/or non-single stars.
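As a worked illustration of Eq. (1), the sketch below inflates a formal uncertainty by a multiplicative factor and adds a systematic term in quadrature. Here sigma_i stands for the formal catalogue uncertainty (parallax_error). The factor k = 1.05 is the asymptotic value quoted for 5p solutions in Sect. 3.7.2, while the systematic term of 0.01 mas is a purely hypothetical number chosen for the example.

```python
import math


def external_uncertainty(sigma_i: float, k: float, sigma_s: float) -> float:
    """External uncertainty from Eq. (1): sqrt(k^2 * sigma_i^2 + sigma_s^2)."""
    return math.sqrt((k * sigma_i) ** 2 + sigma_s ** 2)


sigma_i = 0.020   # formal parallax_error in mas (example value)
k = 1.05          # multiplicative factor for 5p solutions (Sect. 3.7.2)
sigma_s = 0.010   # hypothetical systematic term in mas
print(f"{external_uncertainty(sigma_i, k, sigma_s):.4f} mas")
```

For small formal uncertainties the systematic term dominates, whereas for large ones the result tends to k times the formal value, which is why the two contributions show up on opposite sides of a plot of the unit-weight uncertainty against parallax_error.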

3.7.1 Uncertainty of parallax and proper motion from distant objects

Similarly to what was found for Gaia DR2 by Arenou et al. (2018), QSOs show that the uncertainties are slightly underestimated and that this underestimation increases with magnitude. The underestimation is lower than for Gaia DR2 for the 5p solution, but larger for the 6p solution. This is seen for the parallax using the unit-weight uncertainty (uwu) in Fig. 19 and for the proper motion using a _χ_2 test in Fig. 20. The trend with magnitude is also seen with LMC stars, although the underestimation of uncertainties is higher, as presented in Fig. 21, which is most probably due to the crowding. The increase in the underestimation is also seen in the uwu presented in Table 1. The uwu reported is the k term of Eq. (1), assuming a negligible systematic error term (_σ_s) except for the HIPPARCOS comparison which is the only catalogue for which both terms could be clearly separated. The uwu and the residual R χ were computed after applying the parallax zero point correction of Lindegren et al. (2021a) and removing stars with ruwe > 1.4.

3.7.2 Parallax uncertainties by deconvolution

The “true” dispersion of the parallaxes was estimated using a deconvolution method, which was applied on the negative tail of the parallaxes (see Arenou et al. 2017, Sect. 6.2.1 for details), and the uwu ratio of the external over the internal uncertainty was computed. Figure 21 shows the uwu as a function of the catalogue uncertainties for several illustrative subsets and it can give insights into the underestimation factor, the systematics, and the contamination by non-single sources.

On the right side of the figure, the asymptotic uwu is mostly flat and it gives the multiplicative factor: it is about 1.05 for 5p solutions (improved from DR2), slightly more for very faint stars, 1.22 for 6p solutions, and larger for sources with non-zero excess noise or those in the LMC. While the uwu is in general slowly increasing with uncertainty due to the contamination by non-single sources, it increases sharply for sources brighter than 17 mag, most probably due to half-resolved doubles (as indicated by the average ipd_frac_multi_peak or ipd_gof_harmonic_amplitude), which suggests that the uncertainties of these non-single bright stars are substantially underestimated. Then, the left part of the curves shows the influence of the systematics. As confirmed by other tests, the systematics have decreased compared to DR2, except for 6p solutions, and they are largest for sources with non-zero excess noise, which is due to either calibration errors or to perturbation of non-single stars.

Fig. 19 Unit-weight uncertainty (uwu) that needs to be applied to the Gaia parallax uncertainties to be consistent with the residual distribution (after zero point correction and removing stars with ruwe > 1.4) versus LQRF QSOs, LMC, and dSph stars for the 5p (top) and 6p (bottom) solutions, as a function of magnitude.

Fig. 20 _χ_2 test of the LQRF QSOs proper motions as a function of G magnitude for the 5p (black circles) and 6p (purple squares) solution. The residual R χ should follow a _χ_2 with 2 degrees of freedom. The correlation observed here is due to the underestimation of the uncertainties as a function of magnitude.

Fig. 21 Uwu of the parallaxes estimated by deconvolution versus parallax_error (mas) for several subsets: DR2 and EDR3 for 5p, 6p, G < 17 mag, _G_ > 20 mag, LMC, and non-zero excess noise.

3.8 Magnitude dependence from binary stars

One way to check the magnitude variations of the parallax zero point is to use resolved binary stars. When the period of the binary system is long enough, the proper motion of the two components is similar, or at least the differences are smoothed out when a large sample is used. Potential common proper motion pairs have been selected over the whole sky; this has been restricted to primaries up to G < 15 mag and secondaries up to G < 18 mag only: selecting fainter secondaries would increase the fraction of optical doubles in dense fields too much, thus biasing the parallax differences. We computed the differences in parallax and proper motion between the two components, then determined the norm of this difference accounting for its covariance; a pair was considered as a true binary if this _χ_2 had a p-value above 0.01 and the linear separation was below $10^4$ au.
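A simplified version of this pair test is sketched below. It departs from the description above in one labelled respect: the covariance of the difference is taken as diagonal, that is, correlations between parallax and proper motion are ignored, so the statistic is a _χ_2 with three degrees of freedom built from independent terms. The separation limit is taken here as 10^4 au, the projected separation is computed from the angular separation and the primary parallax, and all function names are illustrative.

```python
import math


def chi2_sf_3dof(x: float) -> float:
    """p-value (survival function) of a chi-square with 3 degrees of freedom."""
    if x <= 0.0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


def projected_separation_au(sep_arcsec: float, parallax_mas: float) -> float:
    """Projected separation in au: angle (arcsec) times distance (pc)."""
    return sep_arcsec * 1000.0 / parallax_mas


def is_binary_candidate(primary, secondary, sep_arcsec,
                        p_min=0.01, max_sep_au=1.0e4):
    """Accept a pair whose astrometric differences are statistically consistent."""
    chi2 = 0.0
    for name in ("parallax", "pmra", "pmdec"):
        diff = primary[name] - secondary[name]
        # Assumption: independent errors, so the variances simply add.
        var = primary[name + "_error"] ** 2 + secondary[name + "_error"] ** 2
        chi2 += diff * diff / var
    sep_au = projected_separation_au(sep_arcsec, primary["parallax"])
    return chi2_sf_3dof(chi2) > p_min and sep_au < max_sep_au


# Hypothetical pair (mas and mas/yr).
a = {"parallax": 10.00, "parallax_error": 0.03, "pmra": 5.00,
     "pmra_error": 0.03, "pmdec": -3.00, "pmdec_error": 0.03}
b = {"parallax": 10.04, "parallax_error": 0.05, "pmra": 5.05,
     "pmra_error": 0.05, "pmdec": -3.02, "pmdec_error": 0.05}
print(is_binary_candidate(a, b, sep_arcsec=20.0))
```

Orbital motion makes the proper motions of a real binary differ slightly, so a strict consistency test of this kind favours long-period systems, in line with the reasoning of the paragraph above.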

The differences of the parallax and proper motion between primary and secondary components are represented in the top panel of Fig. 22 versus the G magnitude of the primary. For 12 < G < 13 mag, a 20 _μ_as increase for the parallax zero point differences appears clearly, which was not present with DR2 parallaxes. A significant improvement is seen after the zero point correction from Lindegren et al. (2021a). Although within the uncertainty (below 0.007 mas, approximately constant), a small residual effect near G ≈ 11.5 mag or G ≈ 13 mag may perhaps still be present. It should not come as a surprise that these variations occur in the magnitude interval where there is a change in the gating scheme or in the window size (Lindegren et al. 2021a, Sect. 2.1). For primaries brighter than 7 mag (middle panel of Fig. 22), the zero point may be more negative, as can also be seen using known WDS binaries.

We tested the consistency between the parallaxes of the two components of wide physical binaries, which were selected from the WDS (separation larger than 0.5′′ and flag “V”). We again found that the magnitude dependence of the parallax zero point present in Gaia EDR3 is nicely removed by applying the parallax zero point correction of Lindegren et al. (2021a), thus confirming the results shown in Fig. 22 (middle panel). Similarly, following Kervella et al. (2019), we used their catalogue of Cepheids and RR Lyrae resolved common proper motion pairs to check the compatibility of their parallaxes. With the parallax zero point correction of Lindegren et al. (2021a), and contrary to the Gaia DR2 results for Cepheids, neither a systematic offset nor a strong outlier (outside 5_σ_) was found.

While no variation appears for μ δ in the bottom panel of Fig. 22, there is an increase of about 0.01 mas yr−1 at G = 13 mag for μ α*. Although this may not look statistically significant (uncertainty ≈ 0.007 mas yr−1), this effect is probably real, as an empirical 0.1 mas rotation correction was applied to the proper motion system for G < 13 mag (Lindegren et al. 2021b, Sect. 4.5).

Fig. 22 Differential astrometry for common proper motion pairs shown in the sense primary minus secondary component. Top: differences between parallaxes (mas, running medians over 3000 sources) before (blue) and after (magenta) zero point correction, as a function of the primary magnitude, 10 < G < 15 mag. Middle: primary minus secondary parallaxes before and after zero point correction for stars brighter than G = 10 mag (running medians over 100 sources). Bottom: differential proper motion (mas yr−1) in right ascension (blue) and declination (orange) versus magnitude, running medians over 3000 sources.

3.9 Pseudo-colour dependence

The 6p solutions derive both the astrometric parameters and the pseudo-colour. Although the correlation between the parallax and pseudo-colour is, in general, not large (top panel of Fig. 23), this nevertheless implies that (random or systematic) errors in the pseudo-colour translate into systematics on the parallax, which should not, in general, be interpreted as a colour effect. On average, the correlation is, for example, positive near the Galactic centre and negative in the LMC, giving a very large peak-to-peak amplitude of 0.2 mas (Fig. 23, middle panel), although this is half the amplitude found for DR2 (Arenou et al. 2018, Fig. 18). After the Lindegren et al. (2021a) correction of the zero point, the variations were considerably reduced (Fig. 23, bottom panel).

Fig. 23 Top: correlation between parallax and pseudo-colour. Middle: running median with uncertainty for LMC parallaxes where the negative correlation translates into systematics for the bluest and reddest sources. Bottom: LMC parallaxes after zero point correction.

3.10 Proper motions from star cluster analysis

For each star, we derived the differences of the proper motions μ α*, μ δ to the cluster median. This provides information about the zero point variations and about the uncertainties. Figures 24 and 25 present the results, both scaled and not scaled to the nominal errors, as a function of G, _G_BP − _G_RP, and in the colour-magnitude diagram. As already found for the parallaxes, a complex dependence of the difference with magnitude and colour is clearly visible. This reflects a variation in the zero point. In particular, a shift is present at G ~ 13 mag in μ α*. When this difference is normalised to the nominal uncertainties, we find that these patterns are still present for brighter stars. This means that nominal uncertainties are underestimated in this magnitude and colour range. This effect is more evident in μ α*. Residuals are also present for very red stars (_G_BP − _G_RP ~1.5−2 mag), where the results are less significant, since the statistics are relatively poor. This region can be contaminated by field stars.

We calculated the proper motion differences (median, μ α*EDR3−μ α*DR2) for the whole cluster sample in Gaia DR2 and Gaia EDR3. The zero point median difference is − 0.006 mas yr−1 with a third quartile of 0.06 mas yr−1 in right ascension and in declination.

Fig. 24 Left: difference between proper motions in declination and median value (top) and scaled (bottom) to the nominal uncertainties as a function of G for the whole cluster sample. The solid lines indicate the LOWESS of the sample; blue corresponds to stars bluer than _G_BP − _G_RP = 0.9 mag and red is for objects redder than this limit. Right: colour-magnitude diagram of the whole cluster sample where the colour indicates the difference between the star μ δ and the cluster median. The rightmost panel is analogous to the left-hand panel, where the difference is scaled to the nominal uncertainties.

3.11 Proper motion precision in crowded areas

We compared the proper motions in the centre of M4 with external HST data (Nascimbeni et al. 2014), where high-quality relative proper motions are available. The precision of HST proper motions is of the order of 0.33 mas yr−1. The observed field is affected by crowding. We used the flux excess parameter C = (phot_bp_mean_flux + phot_rp_mean_flux) / phot_g_mean_flux (footnote 10) as a tracer of crowding contamination. In Fig. 26, we present the Gaia proper motions distribution, the differences between Gaia and HST proper motions, and the scaled dispersion for the stars having Gaia nominal proper motion uncertainties < 0.2 mas yr−1 and a moderate flux excess, C < 2. The scaled dispersion is very close to one for both _μ_ _α_* and _μ_ _δ_, indicating that the nominal uncertainties were estimated correctly. Stars in this sample are not affected by crowding. Stars with a high level of contamination (C > 5) are, on average, at 1.6–1.8 error bars from the expected value, that is to say Gaia nominal uncertainties on proper motions are underestimated for faint and/or contaminated stars. In any case, this represents a substantial improvement over Gaia DR2, where the nominal uncertainties were underestimated by a factor from two to three.
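The two quantities used in this comparison can be sketched as follows. The flux excess follows the definition given in the text. The scaled dispersion is computed here as the standard deviation of the proper motion differences divided by the combined uncertainty of the two measurements; whether the reference uncertainty entered the scaling in the actual analysis is not stated, so that part is an assumption, as are the function names.

```python
import math
from statistics import stdev


def flux_excess(bp_flux: float, rp_flux: float, g_flux: float) -> float:
    """Flux excess C = (BP flux + RP flux) / G flux."""
    return (bp_flux + rp_flux) / g_flux


def scaled_dispersion(gaia_pm, ref_pm, gaia_err, ref_err):
    """Standard deviation of the differences in units of their uncertainty."""
    normalised = [
        (pm_g - pm_r) / math.hypot(err_g, err_r)  # errors added in quadrature
        for pm_g, pm_r, err_g, err_r in zip(gaia_pm, ref_pm, gaia_err, ref_err)
    ]
    return stdev(normalised)


# Hypothetical fluxes (e-/s) of a mildly contaminated star.
print(flux_excess(60.0, 70.0, 100.0))

# Hypothetical proper motions (mas/yr); a value near one would indicate
# realistic uncertainties.
gaia = [1.0, -1.0, 1.0, -1.0]
ref = [0.0, 0.0, 0.0, 0.0]
print(scaled_dispersion(gaia, ref, [0.6] * 4, [0.8] * 4))
```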

3.12 Goodness of fit for very bright stars

Although it is known that calibration errors or stellar duplicity may frequently make the quality of the astrometric solutions decrease, it may appear surprising to complain about a solution quality that is too good. Figure 27 (right panel) shows that the ruwe of sources brighter than 17 mag is near the expected value of one on average. However, it is smaller than one near the Galactic centre, and twice as small for stars brighter than G = 11 mag (Fig. 27, left panel). The interpretation is the following: in this region, because most of the sources suffer from crowding, most sources should have some excess noise and not just a small fraction of them, as in other regions. Consequently, the attitude excess noise may have absorbed this source excess noise, leading to source solutions that appear much better than they truly are. The caveat is therefore that the ruwe of bright sources in large crowded areas, and thus their astrometric_gof_al (footnote 11) too, may be much smaller than they should be.

4 Photometric quality of Gaia EDR3

The photometry in Gaia EDR3 consists of three broad bands: a G magnitude for (almost) all sources (1.8 billion) and a _G_BP and _G_RP for the large majority (1.5 billion). The photometry and its main validation are described in Riello et al. (2021), and here we present some additional validation. These tests include internal comparisons, comparisons with external catalogues, and some simulations.

Among the issues described by Riello et al. (2021), there are two issues for the user to be aware of. The first concerns sources where a reliable colour was not known at the beginning of the Gaia EDR3 processing and which therefore have not benefited from an optimal processing. These are all the 6p sources and many 2p sources, but none of the 5p sources. Riello et al. (2021) propose a correction to the catalogue G magnitude of the 6p sources that is typically of the order of a hundredth of a magnitude. For 2p sources, it is unfortunately not possible to know with certainty if a correction is needed. The second issue concerns the faintest _G_BP sources, where a source with practically no signal, for example the _G_BP flux for a faint, red source, will still have a significant flux assigned. This issue, which also applies to a small population of _G_RP fluxes, is discussed below in Sect. 4.2.

Fig. 26 Left: Gaia proper motion distribution in M4. Only stars with a proper motion error < 0.2 mas yr−1 and flux excess parameter C < 2 are plotted. Centre: differences between proper motions in Gaia and in HST data. Right: scaled dispersion for the stars having Gaia proper motions errors < 0.2 mas yr−1 and flux excess parameter C < 2.

Fig. 27 Re-normalised unit-weight error for the astrometric solution, ruwe, for sources brighter than G = 11 (left) and brighter than G = 17 (right).

4.1 Spurious photometric solutions

There were many unrealistically faint sources, down to G ~ 25.6 mag, in a pre-release version of the catalogue. Such faint magnitudes can only arise from processing problems. After investigations in the photometric pipeline, PhotPipe, the root cause was traced to the way poor input spectral shape coefficients (SSCs) disturb the application of the photometric calibration (see Riello et al. 2021, Sect. 4.4, Eqs. (1)–(3)). From the BP spectrum, four coefficients were derived: bp_ssc0,..., bp_ssc3, and four similar ones from RP. Quotients, such as bp_ssc0/(bp_ssc1+bp_ssc2), are used in the calibration model without taking into account that the denominator in certain cases may become extremely small and the quotient therefore extremely large. This is illustrated in Fig. 28, where we can see that in the majority of cases (in fact 82% for this sample of very faint sources) the principal blue and red quotients have reasonable values (lower left), and that these values are well separated from cases of very large values. This has allowed us to establish well defined thresholds, and the photometry of the sources with quotients larger than those thresholds is not published in the main catalogue because it is considered unreliable. The 5.4 million sources affected were reprocessed by PhotPipe using default SSCs, and the G flux derived this way is published in a separate file accessible in the Gaia EDR3 known issues (footnote 12) page.
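The failure mode described above is easy to reproduce numerically. The sketch below computes the blue quotient quoted in the text and flags values beyond a threshold. The threshold value used here is hypothetical, since the actual thresholds are only shown graphically in Fig. 28, and the function names are illustrative.

```python
import math


def ssc_quotient(ssc0: float, ssc1: float, ssc2: float) -> float:
    """Quotient ssc0 / (ssc1 + ssc2); infinite if the denominator vanishes."""
    denominator = ssc1 + ssc2
    if denominator == 0.0:
        return math.inf
    return ssc0 / denominator


def quotient_is_unreliable(quotient: float, threshold: float) -> bool:
    """Flag a quotient whose magnitude exceeds the adopted threshold."""
    return abs(quotient) > threshold


HYPOTHETICAL_THRESHOLD = 10.0  # illustrative value only, not from the paper

well_behaved = ssc_quotient(1.0, 0.25, 0.25)     # ordinary denominator
pathological = ssc_quotient(1.0, 0.25, -0.2499)  # nearly cancelling terms
for q in (well_behaved, pathological):
    print(q, quotient_is_unreliable(q, HYPOTHETICAL_THRESHOLD))
```

Because the pathological values are separated from the bulk by orders of magnitude, the exact placement of such a threshold matters little in practice, which is consistent with the clean separation described for Fig. 28.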

Colours in Gaia DR2 were shown to be too blue for faint red sources and this also happens in Gaia EDR3. The phot_bp_rp_excess_factor parameter gives a measure of the coherence among G, _G_BP, and _G_RP and can be used to identify the problematic cases. For those cases, the use of the colour G − _G_RP instead of _G_BP − _G_RP may be more useful. Riello et al. (2021) identify the root cause of the problem to be a flux threshold applied to the individual transits when computing the mean photometry. We explore this question below.

Fig. 28 Quotients of blue and red spectral shape coefficients for the set of sources fainter than G ~ 21.7 mag. The photometry for sources with quotients larger than the thresholds (blue lines) was filtered for the publication (see text).

Fig. 29 Simulated mean fluxes as a function of the true mean flux in the presence of the 1 e− s−1 threshold (dark filled symbols) and without (light open symbols). The mean and the 1_σ_ confidence intervals are shown as lines and shaded regions. The dotted line indicates the lower bound of the mean in the presence of the threshold.

4.2 Transit-level flux threshold

The _G_BP and _G_RP mean fluxes were derived from the epoch fluxes at the several transits of a source across the focal plane during the mission. As discussed by Riello et al. (2021; Sect. 8.1), in the computation of the mean fluxes, a threshold of 1 e− s−1 was introduced for the individual transits. Transits with fluxes below that threshold were excluded from the computation of the mean fluxes. This introduces a bias in the mean fluxes for the faintest sources, as the distribution of noise-affected epoch fluxes becomes truncated when the flux of a source becomes so low that the distribution of epoch fluxes reaches the threshold. To illustrate the effect, we simulated mean fluxes with and without the threshold, under the simplifying assumption of the same normally distributed background noise of 50 e− s−1 and 30 transits. As background noise, we subsume here all noise contributions other than the source’s photon noise, that is to say contributions from detector and electronics effects, the sky background, and stray light. In reality, this background noise can vary significantly between different transits. Simulations of measured mean fluxes with and without the threshold as a function of true source flux are shown in Fig. 29, together with the means and the 1_σ_ confidence intervals. As the source fluxes become low, the measured mean fluxes systematically deviate from the true fluxes, and the mean of the distributions of mean fluxes meets a lower bound. For a threshold much smaller than the background noise level, this lower bound on the mean is approximately 0.8 times the standard deviation of the background noise.
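The bias can be reproduced with a few lines of Monte Carlo. This is a minimal sketch, not the simulation behind Fig. 29: it uses the noise level, number of transits, and threshold quoted in the text, but it ignores the photon noise of the source itself, and the number of simulated sources and the random seed are arbitrary choices. For a source with zero true flux and a threshold near zero, the surviving epoch fluxes follow approximately a half-normal distribution, whose mean is σ√(2/π) ≈ 0.80σ, in agreement with the lower bound stated above.

```python
import math
import random
from statistics import mean


def simulated_mean_flux(true_flux, rng, noise_sigma=50.0, n_transits=30,
                        threshold=1.0):
    """Mean flux of one source from its transits (e-/s).

    Transits below threshold are discarded; pass threshold=None to keep all.
    Returns None if no transit survives the cut.
    """
    epochs = [rng.gauss(true_flux, noise_sigma) for _ in range(n_transits)]
    if threshold is not None:
        epochs = [flux for flux in epochs if flux >= threshold]
    return mean(epochs) if epochs else None


def average_measured_flux(true_flux, n_sources=2000, seed=1, **kwargs):
    """Average of the simulated mean fluxes over many identical sources."""
    rng = random.Random(seed)
    means = [simulated_mean_flux(true_flux, rng, **kwargs)
             for _ in range(n_sources)]
    return mean(m for m in means if m is not None)


for flux in (0.0, 50.0, 200.0, 1000.0):
    with_cut = average_measured_flux(flux)
    without_cut = average_measured_flux(flux, threshold=None)
    print(f"true {flux:7.1f}  with threshold {with_cut:7.1f}  "
          f"without {without_cut:7.1f}")

print(f"half-normal expectation: {50.0 * math.sqrt(2.0 / math.pi):.1f}")
```

Bright sources are unaffected because essentially no transit falls below the threshold, while the faintest ones saturate near the half-normal mean regardless of their true flux.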

Sources are detected based on an estimate of their G band magnitude, with a detection limit well above the 1 e− s−1 threshold. Also since astronomical sources do not become arbitrarily blue, the flux of detected sources measured in the _G_RP passband cannot become arbitrarily smaller than the flux in the G band. But as astronomical sources can become arbitrarily red, the flux in the _G_BP passband can become smaller than the G band flux without bound. Sufficiently red sources can therefore have _G_BP fluxes that even fall below the detection limit while having significant fluxes in the G and _G_RP passbands. As a consequence, the bias on the mean flux for faint sources resulting from the threshold mostly affects the _G_BP fluxes, and it only has a much smaller effect for the _G_RP fluxes.

As a consequence of the _G_BP fluxes being far more biased towards larger values than the G and _G_RP fluxes for faint sources, a “turn” in the colour-magnitude diagram involving _G_BP magnitudes results for faint sources, as shown by Riello et al. (2021). To illustrate the effect further, we simulated BaSeL spectra (Lejeune et al. 1997) with different effective temperatures (for solar metallicity and a surface gravity, log g, of four) in the G versus _G_BP − G and G − _G_RP colour-magnitude diagrams. The distribution of the simulated mean magnitudes, together with the mean of the distributions as a function of colour, is shown in Fig. 30. For the G versus _G_BP − G colour-magnitude diagram, the strong turn in the distributions is visible, which starts at brighter G magnitudes as the source becomes cooler, and thus redder. If the fluxes approach the noise level, the distributions become independent of the spectral energy distributions of the sources, and the means of the distributions for all sources converge at the same location in the colour-magnitude diagram. For thresholds much smaller than the noise level, the position of this convergence point only depends on the background noise and, when converting fluxes into magnitudes, on the zero points of the passbands.

For the G versus G − _G_RP colour-magnitude diagram, the effect of convergence on the same mean position in the diagram for faint sources is also present. However, the turning of the distributions occurs at fainter magnitudes, and the effect for blue sources is much smaller than for the red sources in the G versus _G_BP − G case.

To minimise the effects introduced by the threshold on the interpretation of photometric data, it is thus advisable to avoid the use of _G_BP magnitudes fainter than about 20.5, corresponding to _G_BP fluxes below approximately 86 e− s−1, if possible. For _G_RP magnitudes, significant bias effects occur at values fainter than about 20.0, corresponding to _G_RP fluxes below approximately 79 e− s−1. Since the bias effects are strongest in _G_BP, the G − _G_RP colour is more reliable than colours involving _G_BP.
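The magnitude limits and the fluxes quoted above are linked by m = ZP − 2.5 log10(flux). As a hedged illustration, the sketch below infers approximate zero points from the two (magnitude, flux) pairs given in the text rather than taking them from the photometric calibration, and flags magnitudes beyond the recommended limits. The helper names and the use of `None` for a missing magnitude are assumptions made for this example.

```python
import math


def zero_point(mag, flux):
    """Zero point implied by one (magnitude, flux in e-/s) pair."""
    return mag + 2.5 * math.log10(flux)


# Approximate zero points inferred from the pairs quoted in the text.
ZP_BP = zero_point(20.5, 86.0)
ZP_RP = zero_point(20.0, 79.0)


def flux_to_mag(flux, zp):
    """Convert a flux in e-/s to a magnitude for the given zero point."""
    return zp - 2.5 * math.log10(flux)


def mag_to_flux(mag, zp):
    """Convert a magnitude to a flux in e-/s for the given zero point."""
    return 10.0 ** (-0.4 * (mag - zp))


def threshold_bias_flags(bp_mag, rp_mag, bp_limit=20.5, rp_limit=20.0):
    """Return (bp_suspect, rp_suspect); a missing magnitude (None) is not flagged."""
    bp_suspect = bp_mag is not None and bp_mag > bp_limit
    rp_suspect = rp_mag is not None and rp_mag > rp_limit
    return bp_suspect, rp_suspect
```

With these pairs the inferred zero points are about 25.34 mag for _G_BP and 24.74 mag for _G_RP; for actual work the calibrated zero points published with the release should be used instead.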

Fig. 30 Distribution of 5 million simulated mean magnitudes for BaSeL spectra with effective temperatures from 2000 to 35 000 K, in the presence of a 1 e− s−1 threshold, in the _G_BP − G versus G (upper panel) and the G − _G_RP versus G (lower panel) colour-magnitude diagrams. The solid lines indicate the mean.

Fig. 31 Colour–colour relation of the APOGEE DR16 red clump sample. Stars with 5p astrometric solutions are shown with black dots, and the ones with 6p solutions are shown with red dots. Top: full sample. Bottom: after removing stars with _G_BP > 20.5 mag and correcting the G photometry of the 6p solutions using the recipe of Riello et al. (2021).

4.3 Photometry of the 6p solutions

We tested the correction to the G magnitude proposed by Riello et al. (2021) for stars with 6p solutions, that is, with a default colour in the image parameter determination (IPD), on the red clump sample of APOGEE DR16 (Bovy et al. 2014). The variation in the colour-colour relation for those red clump stars is due to the variation in the extinction, and the relation is curved because the extinction coefficient depends non-linearly on the extinction. Two effects can be seen in the top panel of Fig. 31. The first one is that of the transit-level flux threshold, discussed in the previous section, seen as a plume of stars becoming bluer than the global relation. The second effect is the difference between the colour-colour curves of the 5p and 6p solutions (black and red, respectively). If we apply the G magnitude correction proposed by Riello et al. (2021, Sect. 8.3) and remove the sources with _G_BP > 20.5 mag as proposed in the previous section, we recover the expected colour-colour relation.

Fig. 32 Residuals from a global G − _G_RP = f(_G_BP − _G_RP) relation for a sample of luminous, low extinction stars (_A_0 < 0.05 mag, _M_G < 4 mag). Top: as a function of G, colour-coded by the number of stars normalised by the total number of stars per magnitude bin. Bottom: as a function of _G_BP − _G_RP for a sample of bluer stars, colour-coded by the magnitude.

4.4 Photometric accuracy

The systematic errors of the photometry can be studied in both internal and external comparisons as well as from galaxy models of the Milky Way. In addition, we look at the changes with respect to Gaia DR2.

4.4.1 Internal comparisons

We studied the internal trends as a function of magnitude. Figure 32 (top panel) shows the residuals from the median G − _G_RP = f(_G_BP − _G_RP) relation as a function of G on a sample of stars in the upper part of the H-R diagram with low extinction, that is _A_0 < 0.05 mag according to the 3D extinction map of Lallement et al. (2019) and _M_G < 4 mag, taking into account the parallax error at 2 sigma. We further selected only stars with relative photometric errors better than 2% in G and 5% in _G_BP and _G_RP. These strict criteria lead to a well behaved colour-colour relation but to fewer than 800 000 stars, which are mostly close by and therefore relatively bright. There is a small trend with magnitude which is much smaller than in Gaia DR2 (Arenou et al. 2018). Discontinuities at G ~ 10.8 and 13 mag of only about 0.5 and 1 mmag are also much smaller than in Gaia DR2. Figure 32 (bottom panel) shows that for blue stars, the trend with magnitude is still stronger (see also Sect. 8.4 of Riello et al. 2021), but the amplitude is about three times smaller than in Gaia DR2 (Arenou et al. 2018, Fig. 35).
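The selection described above can be written as a boolean mask. This is only a sketch under stated assumptions: the inputs are plain arrays supplied by the user (the extinction _A_0 must come from an external map), and "taking into account the parallax error at 2 sigma" is read here as requiring the absolute magnitude to stay brighter than 4 mag even when the parallax is increased by twice its uncertainty. That reading, and the function name, are assumptions and not taken from the text.

```python
import numpy as np


def select_luminous_low_extinction(g_mag, parallax, parallax_error, a0,
                                   g_snr, bp_snr, rp_snr):
    """Boolean mask for a luminous, low extinction, well measured sample.

    parallax and parallax_error are in mas; *_snr are flux / flux error.
    """
    g_mag = np.asarray(g_mag, dtype=float)
    plx = np.asarray(parallax, dtype=float)
    plx_err = np.asarray(parallax_error, dtype=float)

    # Assumed reading of "at 2 sigma": use parallax + 2 sigma, which gives the
    # faintest absolute magnitude compatible with the measurement.
    plx_hi = plx + 2.0 * plx_err
    with np.errstate(invalid="ignore", divide="ignore"):
        # M_G = G + 5 log10(parallax / mas) - 10
        abs_mag_faintest = g_mag + 5.0 * np.log10(plx_hi) - 10.0
        luminous = (plx_hi > 0) & (abs_mag_faintest < 4.0)

    return (
        luminous
        & (np.asarray(a0, dtype=float) < 0.05)
        & (np.asarray(g_snr, dtype=float) > 1.0 / 0.02)    # better than 2% in G
        & (np.asarray(bp_snr, dtype=float) > 1.0 / 0.05)   # better than 5% in G_BP
        & (np.asarray(rp_snr, dtype=float) > 1.0 / 0.05)   # better than 5% in G_RP
    )
```

The flux-over-error limits of 50 and 20 are simply the inverses of the 2% and 5% relative errors.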

Figure 33 shows the median _G_BP − _G_RP colours in a dense field in the case of Gaia DR2 and Gaia EDR3. The artefacts from the scan patterns have almost disappeared.

Fig. 33 Median colours _G_BP − _G_RP in a dense field (Galactic coordinates) showing artefacts from the scan pattern in Gaia DR2 (top), which have almost disappeared in Gaia EDR3 (bottom).

4.4.2 Comparisons with external catalogues

We compared Gaia EDR3 photometry to HIPPARCOS, _Tycho_-2, the Landolt standards (Landolt 1983, 1992, 2007, 2009, 2013; Clem & Landolt 2013, 2016), the SDSS tertiary standard stars of Betoule et al. (2013), and Pan-STARRS1 (PS1, Chambers et al. 2016) photometry. For HIPPARCOS and _Tycho_-2, we selected low extinction stars (_A_0 < 0.05 mag) using the 3D extinction map of Lallement et al. (2019) and taking into account the parallax uncertainty. For SDSS and PS1, we selected stars at high Galactic latitude. For the Landolt catalogue, we selected stars with |_b_| > 30° as well as low latitude stars with _A_0 < 0.05 mag. We selected only stars with flux_over_error > 20, corresponding to photometric errors of <0.05 mag for the external catalogues, to ensure we were working with roughly Gaussian errors in magnitude space. An empirical robust spline regression was derived which models the global colour-colour relation. The residuals from those models are plotted as a function of magnitude in Fig. 34. We also added in Fig. 34 the comparison with the magnitude computed on the CALSPEC (Bohlin et al. 2014, 2020 April Update) spectra combined with the Gaia EDR3 instrument response. The global zero point offset observed is smaller than 1%, which is in agreement with the CALSPEC expected accuracy (Bohlin et al. 2014) and our observation of the variations between different CALSPEC releases.
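The correspondence between a cut in flux_over_error and a magnitude error follows from first-order error propagation, σ_m ≈ (2.5 / ln 10) σ_F / F. The short sketch below evaluates this relation; the function name is illustrative, and the relation is only a good approximation when the relative flux error is small.

```python
import math


def mag_error_from_snr(flux_over_error):
    """First-order magnitude uncertainty for a given flux / flux error."""
    if flux_over_error <= 0:
        raise ValueError("flux_over_error must be positive")
    return 2.5 / math.log(10.0) / flux_over_error


if __name__ == "__main__":
    for snr in (20.0, 50.0, 100.0):
        print(f"flux_over_error={snr:6.1f}  sigma_mag={mag_error_from_snr(snr):.4f}")
```

A flux-over-error of 20 gives an uncertainty of about 0.054 mag, close to the 0.05 mag level mentioned above.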

Figure 34 shows that the strong saturation effect present in Gaia DR2 as well as the variation of the G zero point with magnitude have been removed in Gaia EDR3. Variations are now smaller than 0.04 mag. The trends as a function of magnitude for bright stars are consistent between the HIPPARCOS, _Tycho_-2, and CALSPEC results. At the faint end, a global increase in the residuals is observed in _G_RP consistently between Landolt, PS1, and SDSS, while for _G_BP those three surveys do not give a consistent amplitude of the variation. We recall that applying our procedure to colour-colour relations within the external catalogues’ photometric bands leads to global variations of the order of 2 mmag/mag for PS1 and larger for SDSS (Arenou et al. 2018).

Fig. 34 From top to bottom, G, _G_BP, and _G_RP photometry (hereafter referred to as X) versus external photometry: CALSPEC (black dots) corresponding to X − X(calspec); HIPPARCOS (orange) residuals of the global X − Hp = f(V − I) spline; _Tycho_-2 (cyan) residuals of the global X − _V_T = f(_B_T − _V_T) spline; Landolt (red) residuals of the global X − V = f(V − I) spline; and PS1 (blue) and SDSS (green) residuals of the global X − r = f(g − i) spline. The zero point of those different residuals is arbitrary with the exception of the CALSPEC results.

Fig. 35 Map in Galactic coordinates of the mean difference of the G magnitudes between Gaia EDR3 and Gaia DR2.

Fig. 36 Comparison between mean G magnitudes provided in the Gaia DR2 and Gaia EDR3 catalogues for 140 635 RR Lyrae variables. Black and red points show sources with a difference in magnitudes of less and more than 0.5 mag, respectively.

4.4.3 Global comparison of G to Gaia DR2

Figure 35 shows the full-sky mean difference of the G magnitudes between Gaia EDR3 and Gaia DR2. Although the comparison between both releases is not straightforward because their photometric systems are slightly different, as discussed in Riello et al. (2021), the map does illustrate specific areas where the differences are large, though they are generally at the level of a couple of hundredths of a magnitude. As in the case of astrometry, we expect that the map mainly shows the errors of Gaia DR2 and hence the improvements reached in Gaia EDR3.

4.4.4 Comparisons of _G_BP − _G_RP colours to a Milky Way model

The median of the _G_BP − _G_RP colour is computed in each healpix bin of the sky map, for Gaia EDR3 data and GOG20 simulations. At intermediate and faint magnitudes, the differences can be large in the Galactic plane and are clearly linked to the extinction. At higher latitudes, the model is in agreement with the data at the level of 0.1 mag. However, at bright magnitudes (G < 9 mag at intermediate latitudes and upwards, G < 12 mag in the plane), the data deviate from the values predicted by the model, from −0.2 mag to more than −1 mag at G = 5. This discrepancy is slightly reduced compared to Gaia DR2. It could point to a problem in the colour determination for those bright stars. However, since we are comparing the median value in healpix bins, an underestimate of the number of bright blue stars in the model could instead explain this remaining difference, in particular in the brightest bins, which are affected by large Poisson noise.

4.5 Photometric precision

A comparison between mean G magnitudes provided in the Gaia DR2 and Gaia EDR3 catalogues for a sample of 140 635 sources classified as RR Lyrae variables in Gaia DR2, for which both estimates of the mean G magnitude are available, shows that there is a difference in magnitudes of more than 0.5 mag for 2044 sources, with the Gaia EDR3 magnitudes being fainter. The Gaia EDR3 G magnitudes plotted versus the Gaia DR2 G magnitudes for these 140 635 RR Lyrae stars are shown in Fig. 36, where sources which have a difference in magnitude of more and less than 0.5 mag are marked with red and black points, respectively. The figure shows that there is good agreement for a majority of stars, while for the 2044 sources shown with red points there is a systematic shift in magnitudes, with the Gaia EDR3 magnitudes being fainter. Among these 2044 sources, 908 are listed by Clementini et al. (2019) in their Table C.1 as known galaxies misclassified as RR Lyrae variables in Gaia DR2. Furthermore, a more detailed analysis of the 2044 sources performed with a dedicated pipeline shows that the vast majority of them are extended objects. The fainter magnitudes for these 2044 sources are, therefore, most likely due to an improved background determination in the Gaia EDR3 processing.

Fig. 37 Top: excess flux for 36 million sources towards the direction l ∈ (10°, 20°), b ∈ (−5°, 5°). The black line is the identity and the other ones are the median for different ranges of G. Bottom: excess flux for 14 million sources towards the anticentre direction. The colour code gives the ratio between log10(phot_bp_rp_excess_factor) and the relation 0.05 + 0.039(_G_BP − _G_RP), which is the approximate locus of well behaved stars.

4.6 Photometric quality indicators

Gaia EDR3 includes the flux excess factor, phot_bp_rp_excess_factor, as an indicator of the coherence among G, _G_BP, and _G_RP fluxes. It is sensitive to contamination by close-by sources in dense fields, binarity, background subtraction problems, as well as extended objects. The improvement of the full pipeline calibrations in Gaia EDR3 yields a decrease in the excess factor with respect to Gaia DR2, as can be seen in Fig. 37. The fainter the magnitude is, the larger the improvement, which is more noticeable for point-like sources with G > 19 mag. Sources with phot_bp_rp_excess_factor > 5 were published in Gaia DR2 without _G_BP and _G_RP fluxes. In Gaia EDR3, this filter has not been applied and it is up to the users to decide on the use of the photometry in case of a large excess of flux. Small amplitude artefacts due to the scanning pattern still remain (bottom panel in Fig. 37). Criteria for filtering the incoherent fluxes are discussed in Sect. 9.4 of Riello et al. (2021).
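The quantity used for the colour code in the bottom panel of Fig. 37 can be sketched as follows, using the approximate locus 0.05 + 0.039(_G_BP − _G_RP) given in the figure caption. The function names are illustrative, and the sketch is not a substitute for the filtering criteria of Riello et al. (2021, Sect. 9.4).

```python
import numpy as np


def excess_locus(bp_rp):
    """Approximate locus of log10(excess factor) for well behaved stars."""
    return 0.05 + 0.039 * np.asarray(bp_rp, dtype=float)


def excess_ratio(excess_factor, bp_rp):
    """Ratio of log10(phot_bp_rp_excess_factor) to the locus at the same colour."""
    log_excess = np.log10(np.asarray(excess_factor, dtype=float))
    return log_excess / excess_locus(bp_rp)
```

A ratio close to one indicates a source lying on the locus; values well above one indicate excess flux in _G_BP and _G_RP relative to G. The ratio is not meaningful where the locus approaches zero, that is for colours near −1.3 mag.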

4.7 Photometry in crowded areas

Photometry in highly crowded areas is of lower quality than in non-crowded regions. This is the case for all globular clusters, where the photometry in the inner regions is shifted in colour and magnitude as an effect of crowding, with a large dispersion. This effect was visible in Gaia DR2 and it is still present in Gaia EDR3, in spite of the improvements in the number of observations and in the photometry. See for instance Fig. 38, showing the quality of the photometry in the inner and outer regions of NGC 5986.

We compare G, _G_BP, and _G_RP photometry with the high quality HST ACS/F606W magnitudes in M4 using the data set from Bedin et al. (2013). The precision of the HST photometry is at the level of a few mmag. We subtracted the median difference between Gaia and HST photometry, and we calculated the variation of the residuals for stars having 1.4 < _G_BP − _G_RP < 1.5. We selected a fixed colour range to avoid colour effects. The residuals around the median value in G are of the order of 0.025–0.03 mag. These values are in agreement with the comparison with external catalogues presented in Sect. 4.4.2. However, a clear trend with the magnitude and with the flux excess is visible (see Fig. 39). We recall here that the flux excess can be considered as a proxy of the level of contamination due to neighbouring stars. The detected trend is an effect of the crowding, showing that faint stars located in regions with high contamination present larger residuals. No trend is present in _G_BP and _G_RP, albeit with a large dispersion due to the high level of contamination.
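The residual computation described above can be sketched in a few lines: restrict to the fixed colour range, subtract the median Gaia − HST difference, and measure the scatter of what remains. The text does not state which dispersion estimator was used; the scaled median absolute deviation below is an assumption chosen for robustness against outliers, and the function name is illustrative.

```python
import numpy as np


def residual_scatter(gaia_mag, hst_mag, bp_rp, colour_range=(1.4, 1.5)):
    """Residuals of Gaia - HST around their median, and a robust scatter."""
    gaia_mag = np.asarray(gaia_mag, dtype=float)
    hst_mag = np.asarray(hst_mag, dtype=float)
    bp_rp = np.asarray(bp_rp, dtype=float)

    # Keep only stars inside the fixed colour range.
    lo, hi = colour_range
    in_range = (bp_rp > lo) & (bp_rp < hi)

    diff = gaia_mag[in_range] - hst_mag[in_range]
    residuals = diff - np.median(diff)

    # 1.4826 * MAD matches the standard deviation for Gaussian residuals.
    scatter = 1.4826 * np.median(np.abs(residuals))
    return residuals, scatter
```

Binning the returned residuals in G or in the flux excess would then expose trends such as the one shown in Fig. 39.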

5 Global validation of Gaia EDR3

Here, we perform a global comparison of the statistical properties of Gaia EDR3 to Gaia DR2. This complements the detailed analyses presented in the previous sections on the astrometry and the photometry. For the statistical comparison, we use the Kullback–Leibler divergence (KLD) to establish the degree of correlations and clustering between observables or, more generally, entries in the catalogue. The KLD is defined as

$$\textrm{KLD} = - \int {\textrm{d}}^{n} x \, p({\bm x}) \, \log[p({\bm x})/q({\bm x})] , \qquad (2)$$

where x is the _n_-dimensional vector of the observables considered, p(x) is the distribution of these observables in the dataset, and q(x) = Π i p i(x i), that is to say the product of marginalised 1D distributions of each of the observables. Large values of the KLD indicate highly structured data, while low values of the KLD (below ~ 0.5) correspond to little information content.

We compared the performance of Gaia EDR3 to Gaia DR2 in two ways: (1) we considered all 2D subspaces (n = 2), that is all possible combinations of pairs of observables in the catalogues (e.g. (G, ϖ), (σ ϖ, μ δ), etc); and (2) we computed the KLD for 3D subspaces (n = 3) for small regions on the sky, with two of the three observables being α and δ. In both cases, we excluded outliers by considering 99% of the data for each observable (e.g. for the _G_-magnitude, the range used for Gaia EDR3 is 11.69–21.62).
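A histogram-based estimate of the KLD over all pairs of observables can be sketched as follows. The binning, the natural logarithm, and the reading of "99% of the data" as the central interval between the 0.5th and 99.5th percentiles are assumptions, as are all function names. The sketch uses the non-negative convention, the sum of p log(p/q), so that large values correspond to highly structured data as described in the text; this differs by the leading sign from Eq. (2) as printed.

```python
from itertools import combinations

import numpy as np


def central_mask(values, fraction=0.99):
    """Mask keeping the central `fraction` of the values (assumed reading)."""
    tail = 50.0 * (1.0 - fraction)
    lo, hi = np.percentile(values, [tail, 100.0 - tail])
    return (values >= lo) & (values <= hi)


def kld_2d(x, y, bins=50):
    """Histogram estimate of the KLD between p(x, y) and p(x) p(y), in nats."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    p = counts / counts.sum()
    # q is the product of the two marginalised 1D distributions.
    q = p.sum(axis=1, keepdims=True) * p.sum(axis=0, keepdims=True)
    filled = p > 0  # empty cells contribute nothing to the sum
    return float(np.sum(p[filled] * np.log(p[filled] / q[filled])))


def kld_all_pairs(table, bins=50):
    """KLD for every pair of columns in `table`, a dict of name -> 1D array."""
    masks = {name: central_mask(col) for name, col in table.items()}
    result = {}
    for a, b in combinations(table, 2):
        keep = masks[a] & masks[b]
        result[(a, b)] = kld_2d(table[a][keep], table[b][keep], bins=bins)
    return result
```

Independent observables give values near zero, up to a small positive bias from the finite number of sources per bin, while strongly correlated ones give large values.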

Figure 40 shows the distribution of KLD values for the 2D comparison between Gaia EDR3 and Gaia DR2. While a few 2D subspaces lie very close to the 1:1 line, indicating a similar behaviour in Gaia EDR3 and Gaia DR2, there is a large set of subspaces for which the KLD has decreased in Gaia EDR3. Furthermore, these sets follow a parallel track to the 1:1 line, with an offset of about 0.17. These are mostly subspaces involving astrometric uncertainties. This decrease can be fully attributed to the smaller uncertainties since the KLD is the relative entropy log [p(x)∕q(x)] weighted by the distribution of p(x) values, which have become smaller in the case of the uncertainties. The subspaces with the largest decrease in the KLD for Gaia EDR3 are those including photometric uncertainties, as may be expected. On the other hand, a few subspaces depict an increase in the KLD, and these are mostly subspaces combining the number of observations or visibility periods. In this case, the range of the parameter has increased significantly from Gaia DR2 to Gaia EDR3.

We also computed the KLD in 3D in four circular patches on the sky, each with a radius of 5 deg, and centred on (l, b) = (−90°, −45°), (−90°, 45°), (90°, −45°), and (90°, 45°), respectively. As for Gaia DR2, patches that are symmetric with respect to the Galactic plane exhibit a similar behaviour. However, there is also a strong dependence on their location with respect to the scanning law. When compared with Gaia DR2, we find Gaia EDR3 to be systematically less clustered in subspaces containing astrometric or photometric information (see e.g. Fig. 33). We interpret this as an improvement on the systematics introduced by the scanning law. Nonetheless, we still find some amount of residual clustering in the astrometric parameters, which appear to be sensitive to the orientation of the visits, and not just the number of observations (as seen, for example, in Fig. 12).
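Selecting the sources that fall in each of the four circular patches only requires a great-circle separation. The sketch below uses the haversine formula on Galactic longitude and latitude in degrees; the function names and the list `PATCH_CENTRES` are illustrative, with the centres and the 5 deg radius taken from the text.

```python
import numpy as np

# Patch centres (l, b) in degrees, as listed in the text.
PATCH_CENTRES = [(-90.0, -45.0), (-90.0, 45.0), (90.0, -45.0), (90.0, 45.0)]


def angular_separation_deg(lon1, lat1, lon2, lat2):
    """Great-circle separation in degrees; all inputs are in degrees."""
    lon1, lat1, lon2, lat2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lon1, lat1, lon2, lat2)
    )
    # Haversine formula, numerically stable for small separations.
    sin_dlat = np.sin(0.5 * (lat2 - lat1))
    sin_dlon = np.sin(0.5 * (lon2 - lon1))
    h = sin_dlat ** 2 + np.cos(lat1) * np.cos(lat2) * sin_dlon ** 2
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0))))


def patch_masks(lon, lat, radius_deg=5.0):
    """One boolean mask per patch centre, True for sources inside the patch."""
    return [
        angular_separation_deg(lon, lat, lon_c, lat_c) <= radius_deg
        for lon_c, lat_c in PATCH_CENTRES
    ]
```

Longitudes of 270° and −90° are treated as identical, so catalogue longitudes in the range 0° to 360° can be passed directly.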

The global analyses performed here thus confirm that the quality of Gaia EDR3 has improved significantly compared to that of Gaia DR2. The largest improvement is found for the photometry, since some “systematics” associated with the imprint of the scanning law pattern are still present in the astrometric parameters and their uncertainties.

Fig. 38 Colour-magnitude diagram in the inner (blue) and outer (black) regions of NGC 5986 in three different areas: inner 50% and outer 50% (left); inner 25% and outer 25% (centre); and inner 10% and outer 10% (right).

Fig. 39 Residuals of the difference between G and F606W HST magnitudes as a function of G in M4 in the colour range (_G_BP − _G_RP) = 1.4–1.5 mag (see text for details).

Fig. 40 Comparison of 2D KLD between Gaia EDR3 and Gaia DR2. The 1:1 line is shown in red, while subspaces for which the KLD deviates by at least 10% with respect to Gaia DR2 are shown as “*”. The blue dashed line is a guide to the low KLD sequence at an offset of about 0.17 from the 1:1 line.

Table 2

Principal recommendations for using Gaia EDR3.

6 Conclusions and recommendations

Gaia EDR3 provides updated astrometry and photometry for 1 811 709 771 sources. Of these, 19.0% have a 2p astrometric solution, that is to say only a position, 32.3% have a 5p solution, that is also including the parallax and proper motion, and 48.7% have a 6p solution where also a colour parameter, the pseudo-colour, is determined. For the photometry, the catalogue gives G magnitudes for 99.7% of the sources, _G_BP for 85.1%, _G_RP for 85.8%, and a _G_BP − _G_RP colour for 85.0%.

In this paper, we have presented a series of tests aimed at illustrating the quality of the catalogue, with an emphasis on known issues, as is natural for a validation paper. The idea has been that these examples can serve as a guide for actual use cases. In many tests, we have used rather strict selection criteria in order to better answer certain questions, but selection criteria should always be chosen to fit the case at hand. For convenience, we summarise the principal recommendations from this validation exercise in Table 2. Additional advice and recommendations can be found in the astrometric and photometric processing papers (Lindegren et al. 2021b; Riello et al. 2021).

The catalogue is the third in a series, and it is natural to check where it stands compared to the previous release, Gaia DR2. In particular, we note the following.

It contains 7% more sources.
It presents parallaxes and proper motions for 10% more sources.
It has a high completeness until G ~ 19 mag, cf. Sect. 2.2.
The completeness has improved in dense areas, cf. Sect. 2.4.
The angular resolution has improved but is still dropping fast below 0.′′7, cf. Sect. 2.3.
Source identifiers from Gaia DR2 have, in general, been maintained for 97% of the sources, cf. Sect. 2.1, but it is not advisable to rely on the identifier alone. If the counterpart of a Gaia DR2 source is sought, we recommend using the table gaiaedr3.dr2_neighbourhood instead, which is available in the Gaia archive.
A little more than half a percent of Gaia DR2 sources are not found in Gaia EDR3 within 50 mas, cf. Sect. 2.1. They are mostly faint and some are likely to be spurious sources.
The catalogue contains several fields that help to identify sources with a close neighbour. In addition to the parameters detailing the internal consistencies of the astrometric and photometric solutions already in Gaia DR2, the catalogue now also provides statistics on the images themselves, such as ipd_frac_multi_peak, ipd_harmonic_gof_amplitude, etc., cf. Table B.1.
For a guide to the full list of parameters, we recommend the datamodel description in the online documentation.

For the astrometry, the precision has significantly benefited from the additional year of observations. In addition, we notice the following.

There are fewer sources than in Gaia DR2 with incomplete astrometry (2p), for example 1.5% as compared to 2.4% for G < 19 mag, which is mainly due to the increase in number of observations, cf. Sect. 2.6.
A significant number of 2p solutions, more than half for G < 19 mag, are caused by the insufficiency of a 5p or 6p source model, cf. Sect. 2.6.
In specific sky areas, there can be mean differences in position with respect to Gaia DR2 at the 1 mas level.
The systematic errors in parallax – as shown by QSOs – are significantly smaller than in Gaia DR2, cf. Sect. 3.3. They can be further diminished by applying the corrections detailed in Lindegren et al. (2021a), and we recommend following the guidelines from that paper.
High proper motion sources now have a much improved reliability with no negative parallaxes for sources with motions above 300 mas yr−1, cf. Sect. 2.5.
We have 1.6% spurious solutions among sources with ϖ_∕_σ ϖ > 5, cf. Sect. 3.2, but this fraction is much less for brighter sources and for 5p solutions. We recommend testing any specific sample, designed to contain only positive parallaxes, by also selecting the corresponding sample with negative parallaxes.
Parallax uncertainties are underestimated, but less than for Gaia DR2, cf. Sect. 3.7.
Parallaxes have up to a +0.02 mas level offset in zero point for sources brighter than G = 13 mag as shown in binaries and in clusters. It is mostly removed when applying the Lindegren et al. (2021a) parallax correction.
Parallaxes for 6p solutions show a clear correlation with the pseudo-colour in the LMC, which is largely removed with the Lindegren et al. (2021a) correction.
Proper motions in right ascension have a small offset of 0.01 mas yr−1 for sources brighter than G = 13 mag, cf. Figs. 22 and 25. This is seen in proper motion pairs as well as in clusters.
The quality indicators ruwe and astrometric_gof_al are strongly underestimated for bright sources in crowded areas, cf. Sect. 3.12.
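The recommendation above to examine the mirrored negative-parallax sample can be sketched as follows. The example assumes, as a simplification, that spurious solutions are equally likely to yield positive and negative parallaxes, so that the number of sources with ϖ/σϖ < −5 estimates the number of spurious ones with ϖ/σϖ > 5. The function name and the returned tuple are illustrative.

```python
import numpy as np


def spurious_fraction(parallax, parallax_error, snr_cut=5.0):
    """Return (n_positive, n_negative, estimated spurious fraction).

    The negative-parallax sample mirrors the positive selection; its size
    relative to the positive sample estimates the contamination (assumption).
    """
    snr = np.asarray(parallax, dtype=float) / np.asarray(parallax_error, dtype=float)
    n_pos = int(np.count_nonzero(snr > snr_cut))
    n_neg = int(np.count_nonzero(snr < -snr_cut))
    fraction = n_neg / n_pos if n_pos > 0 else float("nan")
    return n_pos, n_neg, fraction
```

Any additional quality cuts applied to the science sample should be applied identically to the mirrored sample before comparing the two counts.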

For the photometry, nearly all issues found with Gaia DR2 (Arenou et al. 2018) are either solved or have improved significantly. Still, the blue and red _G_BP and _G_RP photometry has a series of issues of its own. This photometry is based on prism spectra and has therefore, by design, a limited angular resolution. Each observation contains a substantial flux from the sky background, limiting the performance for faint sources. To help judge the reliability of the photometry, the excess factor, phot_bp_rp_excess_factor, gives a simple measure of the consistency between the three fluxes. For _G_BP and _G_RP, which collect the flux in a relatively wide window, the number of transits with other sources within (phot_bp_n_blended_transits) or close to (phot_bp_n_contaminated_transits) the window can be useful, but it is important to only count, of course, well known sources.

For the photometry we notice the following.

The trend in G as a function of G, which was pronounced in Gaia DR2, has been significantly reduced in Gaia EDR3.
The indications of a discontinuity in the G magnitude around G = 10.87 and 13 mag are much weaker in Gaia EDR3 than in Gaia DR2, cf. Sect. 4.4.1.
Sources for which a reliable colour was not known at the beginning of the Gaia EDR3 processing were processed using a default colour. They constitute a significant fraction of the catalogue and Riello et al. (2021) recommend a correction to the G photometry for such sources, in particular those with astrometric 6p solutions, see Sect. 4.3 for example, where it is demonstrated to work. The correction is typically a hundredth of a magnitude and can of course only be applied if a reliable colour is now known, which is the case for 86% of the 6p solutions.
The _G_BP and _G_RP photometry is much less affected by systematic errors in the background subtraction than it was in Gaia DR2.
The completeness of _G_BP and _G_RP is reduced in dense fields.
Photometry in _G_BP that is fainter than about 20.5 mag and in _G_RP that is fainter than about 20 mag is heavily biased towards brighter values as illustrated in the simulations in Figs. 29 and 30. The cause is discussed in Riello et al. (2021) and well understood, cf. Sect. 4.2. We therefore recommend the use of the colour G − _G_RP instead of _G_BP − _G_RP for samples including faint, red sources (_G_BP fainter than about 20.5 mag).

A global statistical analysis, cf. Sect. 5, also confirms that the systematics of Gaia EDR3 have improved significantly compared to those of Gaia DR2. The most notable improvement seen in this way is found for the photometry, since some “systematics” associated with the imprint of the scanning law pattern are still present in the astrometric parameters and their uncertainties.

Acknowledgements

This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. This work was supported by the Spanish Ministry of Science, Innovation and University (MICIU/FEDER, UE) through grants RTI2018-095076-B-C21, ESP2016-80079-C2-1-R, and the Institute of Cosmos Sciences University of Barcelona (ICCUB, Unidad de Excelencia ‘María de Maeztu’) through grants MDM-2014-0369 and CEX2019-000918-M. T.M., D.B., A.G. and E.L. acknowledge financial support from the Agenzia Spaziale Italiana (ASI) provided through contracts 2014-025-R.0, 2014-025-R.1.2015 and 2018-24-HH.0 to the Italian Istituto Nazionale di Astrofisica. P.K. acknowledges support from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under grant agreement No 695099 (project CepBin), and the French Agence Nationale de la Recherche (ANR), under grant ANR-15-CE31-0012-01 (UnlockCepheids). This research has made an extensive use of Aladin and the SIMBAD, VizieR databases operated at the Centre de Données Astronomiques (Strasbourg) in France and of the software TOPCAT (Taylor 2005). We finally wish to thank Alessandro Spagna for his comments to an earlier version of this paper.

Appendix A Completeness in globular clusters

Appendix B _Gaia_-specific terms

Below, in Table B.1, we give short definitions of Gaia-related terms appearing in this paper. Several are fields in the Gaia catalogue and detailed explanations are available in the Gaia EDR3 datamodel description.

Table B.1

Short definitions for Gaia-related terms.

References

Ahumada, R., Allende Prieto, C., Almeida, A., et al. 2020, ApJS, 249, 3
Arenou, F., Lindegren, L., Froeschle, M., et al. 1995, A&A, 304, 52
Arenou, F., Luri, X., Babusiaux, C., et al. 2017, A&A, 599, A50
Arenou, F., Luri, X., Babusiaux, C., et al. 2018, A&A, 616, A17
Bedin, L. R., Anderson, J., Heggie, D. C., et al. 2013, Astron. Nachr., 334, 1062
Betoule, M., Marriner, J., Regnault, N., et al. 2013, A&A, 552, A124
Bohlin, R. C., Gordon, K. D., & Tremblay, P. E. 2014, PASP, 126, 711
Bovy, J., Nidever, D. L., Rix, H.-W., et al. 2014, ApJ, 790, 127
Brandt, T. D. 2018, ApJS, 239, 31
Cantat-Gaudin, T., Anders, F., Castro-Ginard, A., et al. 2020, A&A, 640, A1
Chambers, K. C., Magnier, E. A., Metcalfe, N., et al. 2016, ArXiv e-prints [arXiv:1612.05560]
Charlot, P., Jacobs, C. S., Gordon, D., et al. 2020, A&A, in press
Clem, J. L., & Landolt, A. U. 2013, AJ, 146, 88
Clem, J. L., & Landolt, A. U. 2016, AJ, 152, 91
Clementini, G., Ripepi, V., Molinaro, R., et al. 2019, A&A, 622, A60
de Bruijne, J. H. J., Allen, M., Azaz, S., et al. 2015, A&A, 576, A74
Dias, W. S., Monteiro, H., Caetano, T. C., et al. 2014, A&A, 564, A79 (DAML Catalogue)
Gaia Collaboration (Prusti, T., et al.) 2016, A&A, 595, A1
Gaia Collaboration (Brown, A. G. A., et al.) 2018, A&A, 616, A1
Gaia Collaboration (Brown, A., et al.) 2021, A&A, 649, A1 (Gaia EDR3 SI)
Górski, K. M., Hivon, E., Banday, A. J., et al. 2005, ApJ, 622, 759
Kervella, P., Gallenne, A., Evans, N. R., et al. 2019, A&A, 623, A117
Kharchenko, N. V., Piskunov, A. E., Schilbach, E., Röser, S., & Scholz, R.-D. 2013, A&A, 558, A53 (MWSC Catalogue)
Lallement, R., Babusiaux, C., Vergely, J. L., et al. 2019, A&A, 625, A135
Landolt, A. U. 1983, AJ, 88, 439
Landolt, A. U. 1992, AJ, 104, 340
Landolt, A. U. 2007, AJ, 133, 2502
Landolt, A. U. 2009, AJ, 137, 4186
Landolt, A. U. 2013, AJ, 146, 131
Lejeune, T., Cuisinier, F., & Buser, R. 1997, A&AS, 125, 229
Lindegren, L., Hernández, J., Bombrun, A., et al. 2018a, Gaia DR2 astrometry, IAU 30 GA – Division A: Fundamental Astronomy, Vienna, 2018, August 2
Lindegren, L., Hernández, J., Bombrun, A., et al. 2018b, A&A, 616, A2
Lindegren, L., Bastian, U., Biermann, M., et al. 2021a, A&A, 649, A4 (Gaia EDR3 SI)
Lindegren, L., Klioner, S., Hernández, J., et al. 2021b, A&A, 649, A2 (Gaia EDR3 SI)
Mason, B. D., Wycoff, G. L., Hartkopf, W. I., Douglass, G. G., & Worley, C. E. 2001, AJ, 122, 3466
Nascimbeni, V., Bedin, L. R., Heggie, D. C., et al. 2014, MNRAS, 442, 2381
Riello, M., De Angeli, F., Evans, D. W., et al. 2021, A&A, 649, A3 (Gaia EDR3 SI)
Rowell, N., Davidson, M., Lindegren, L., et al. 2021, A&A, 649, A11 (Gaia EDR3 SI)
Sarajedini, A., Bedin, L. R., Chaboyer, B., et al. 2007, AJ, 133, 1658
Seabroke, G., Fabricius, C., Teyssier, D., et al. 2021, A&A, submitted (Gaia EDR3 SI)[Google Scholar]
Taylor, M. B. 2005, ASP Conf. Ser., 347, 29[Google Scholar]
Torra, F., Castañeda, J., Fabricius, C., et al. 2021, A&A, 649, A10 (Gaia EDR3 SI)[Google Scholar]
Udalski, A., Szymanski, M. K., Soszynski, I., & Poleski, R. 2008, Acta Astron., 58, 69 [NASA ADS] [Google Scholar]

A visibility period is a time range during which a source is observed without a gap of more than four days; the number of such periods used is given in the Gaia archive as visibility_periods_used. From one period to the next, the scan direction has changed and a couple of months may have passed.
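
As an illustration of this definition, the following minimal sketch counts visibility periods from a list of observation times. The function name, the input format (times in days), and the sample transit times are hypothetical; this is not the Gaia pipeline code, only a direct reading of the four-day gap rule stated above.

```python
def count_visibility_periods(times_days, max_gap_days=4.0):
    """Count groups of observations separated by gaps longer than max_gap_days."""
    times = sorted(times_days)
    if not times:
        return 0
    periods = 1
    for previous, current in zip(times, times[1:]):
        # A gap of more than four days starts a new visibility period.
        if current - previous > max_gap_days:
            periods += 1
    return periods


# Hypothetical transit times in days (illustrative values only).
transits = [0.0, 0.25, 0.3, 45.0, 45.1, 110.0]
n_periods = count_visibility_periods(transits)
```

With these sample times the transits fall into three groups, so the count is three.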

GOG20 is published in the Gaia archive in the table gaiaedr3.gaia_source_simulation.

ipd_gof_harmonic_amplitude indicates the level of asymmetry in the image, cf. Table B.1.

ruwe is the renormalised unit weight error (for astrometry) given in the Gaia archive.

ipd_gof_harmonic_phase indicates the orientation of an asymmetric image.

In the Gaia EDR3 archive, this is phot_bp_rp_excess_factor.

astrometric_gof_al is the goodness of fit of the astrometric solution.

parallax + 2 × parallax_error < exp((4 − phot_g_mean_mag + 10) × 0.46).
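
The criterion above can be evaluated as in the following sketch. The function and argument names are hypothetical. The factor 0.46 is close to ln(10)/5, so the condition appears to correspond to an absolute magnitude brighter than 4 mag when the parallax plus twice its uncertainty is used as the distance indicator, with parallaxes in mas; that interpretation is an assumption based on the form of the expression.

```python
import math


def passes_luminosity_cut(parallax_mas, parallax_error_mas, g_mag, abs_mag_limit=4.0):
    """Evaluate parallax + 2*parallax_error < exp((abs_mag_limit - g_mag + 10) * 0.46)."""
    threshold = math.exp((abs_mag_limit - g_mag + 10.0) * 0.46)
    return parallax_mas + 2.0 * parallax_error_mas < threshold


# Hypothetical sources at G = 10 mag: the threshold is exp(1.84), about 6.3 mas.
distant_star = passes_luminosity_cut(5.0, 0.1, 10.0)   # 5.2 mas is below the threshold
nearby_star = passes_luminosity_cut(7.0, 0.1, 10.0)    # 7.2 mas is above the threshold
```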

Riello et al. (2021) define a corrected factor C* that takes the dependence of phot_bp_rp_excess_factor on the G_BP − G_RP colour into account.

All Tables

Table 1

Summary of the comparison between the Gaia parallaxes and the external catalogues.

Table 2

Principal recommendations for using Gaia EDR3.

Table B.1

Short definitions for Gaia-related terms.

All Figures

	Fig. 1. Fraction of Gaia DR2 sources maintaining the same source identifier in Gaia EDR3 (red curve), and the fraction – irrespective of the identifier – having a counterpart in Gaia EDR3 at the same position within 50 mas (blue curve).
In the text

	Fig. 2. Map in Galactic coordinates of the 99th percentile of the G magnitude at healpix level 5, i.e. in 3.36 deg² pixels.
In the text
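
The pixel size quoted in the Fig. 2 caption follows from the healpix tessellation, which divides the sphere into 12 × 4^level equal-area pixels. A minimal sketch of that arithmetic, with a hypothetical function name and using only the standard library:

```python
import math


def healpix_pixel_area_deg2(level):
    """Area in square degrees of one pixel at the given healpix level."""
    nside = 2 ** level
    n_pixels = 12 * nside ** 2
    full_sky_deg2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41253 deg²
    return full_sky_deg2 / n_pixels


area_level5 = healpix_pixel_area_deg2(5)  # 12288 pixels over the whole sky
```

Rounded to two decimals, the level 5 value reproduces the 3.36 deg² stated in the caption.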

	Fig. 3. Third percentile of the number of transits and number of visibility periods per source used in the astrometric solution. The shaded areas show the range of the first and fifth percentiles. The limits of five transits for inclusion in the catalogue and nine visibility periods for a full astrometric solution are also indicated. Top: catalogue in general. Bottom: field in Baade’s window.
In the text

	Fig. 6. Improvement of the Gaia completeness at G = 20 mag versus some OGLE fields of different stellar densities from Gaia DR2 (red) to Gaia EDR3 (blue).
In the text

	Fig. 7. Top: histogram of source pair distances in a circular field of radius 0.5° centred at (l, b) = (330°, −4°) with a line showing the expected relation for a random distribution of the sources. Bottom: normalised histogram using the expected relation.
In the text

	Fig. 8. Improvement of the completeness (in percent) of visual double stars from the WDS catalogue as a function of the WDS separation between components from Gaia DR2 (red) to Gaia EDR3 (blue).
In the text

	Fig. 9. Global completeness as a function of G for the whole sample of globulars (grey line). Pink, blue, and orange lines indicate the completeness in different density ranges D. The shaded areas indicate the uncertainties.
In the text

	Fig. 10. Proper motion versus parallax for large proper motions. Top: in Gaia DR2. Bottom: in Gaia EDR3. The grey line shows the locus of tangential velocity 500 km s−1.
In the text
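
The 500 km s−1 locus in Fig. 10 can be traced with the standard relation between tangential velocity, proper motion, and parallax, v_t = 4.74047 μ / ϖ, with μ in mas yr−1 and ϖ in mas. The sketch below uses hypothetical function names and is only meant to show how such a line is obtained, not how the figure was produced.

```python
KAPPA = 4.74047  # km/s for a proper motion of 1 mas/yr at a parallax of 1 mas


def tangential_velocity_kms(proper_motion_mas_yr, parallax_mas):
    """Tangential velocity in km/s from proper motion (mas/yr) and parallax (mas)."""
    return KAPPA * proper_motion_mas_yr / parallax_mas


def proper_motion_on_locus(parallax_mas, velocity_kms=500.0):
    """Proper motion (mas/yr) giving the chosen tangential velocity at this parallax."""
    return velocity_kms * parallax_mas / KAPPA


# At a parallax of 1 mas, the 500 km/s line lies near 105.5 mas/yr.
pm_at_1_mas = proper_motion_on_locus(1.0)
```

Sources lying far above this line for their parallax would have implausibly large tangential velocities, which is why the locus is a useful visual guide.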

	Fig. 11. Map in Galactic coordinates of the mean positional difference between Gaia EDR3 and Gaia DR2, having propagated the Gaia EDR3 positions to the epoch of Gaia DR2.
In the text

	Fig. 12. Maps in Galactic coordinates of median parallaxes in the direction of the LMC (top) and Galactic centre (bottom) for 5p (left) and 6p (right) solutions. To increase the contrast, the represented parallax range is [−0.05, 0.05] mas, although the median parallax in most of the field is above 0.05 mas.
In the text

	Fig. 13. Estimated fraction of spurious solutions for sources where the formal uncertainty is at least five times smaller than the parallax. Top: fraction by G for the whole catalogue (dashed line) as well as separately for 5p (lower, red line) and 6p (upper, blue line) solutions. Bottom: skymap of the fraction for the whole catalogue in Galactic coordinates.
In the text

	Fig. 14. Proper motion diagram of sources near the Galactic centre within a 0.5° radius. Top: colour-coded by ipd_gof_harmonic_amplitude. Bottom: colour-coded by ipd_gof_harmonic_phase. Reddish points in the top panel reveal potentially spurious solutions.
In the text

	Fig. 15. Maps in ecliptic coordinates of the variations of QSO parallaxes (mas) in 5° radius fields. Top: 5p solution. Bottom: 5p with zero point correction.
In the text

	Fig. 16. Parallaxes averaged among healpix bins over the whole sky as a function of magnitude for Gaia EDR3 (orange crosses), GOG20 (black circles), and Gaia DR2 (blue triangles).
In the text

	Fig. 19. Unit-weight uncertainty (uwu) that needs to be applied to the Gaia parallax uncertainties to be consistent with the residual distribution (after zero point correction and removing stars with ruwe > 1.4) versus LQRF QSOs, LMC, and dSph stars for the 5p (top) and 6p (bottom) solutions, as a function of magnitude.
In the text

	Fig. 20. χ² test of the LQRF QSOs proper motions as a function of G magnitude for the 5p (black circles) and 6p (purple squares) solutions. The residual R_χ should follow a χ² distribution with 2 degrees of freedom. The correlation observed here is due to the underestimation of the uncertainties as a function of magnitude.
In the text

	Fig. 21. Uwu of the parallaxes estimated by deconvolution versus parallax_error (mas) for several subsets: DR2 and EDR3 for 5p, 6p, G < 17 mag, G > 20 mag, LMC, and non-zero excess noise.
In the text

	Fig. 23. Top: correlation between parallax and pseudo-colour. Middle: running median with uncertainty for LMC parallaxes where the negative correlation translates into systematics for the bluest and reddest sources. Bottom: LMC parallaxes after zero point correction.
In the text

	Fig. 26. Left: Gaia proper motion distribution in M4. Only stars with a proper motion error < 0.2 mas yr−1 and flux excess parameter C < 2 are plotted. Centre: differences between proper motions in Gaia and in HST data. Right: scaled dispersion for the stars having Gaia proper motion errors < 0.2 mas yr−1 and flux excess parameter C < 2.
In the text

	Fig. 27. Re-normalised unit-weight error for the astrometric solution, ruwe, for sources brighter than G = 11 (left) and brighter than G = 17 (right).
In the text

	Fig. 28. Quotients of blue and red spectral shape coefficients for the set of sources fainter than G ~ 21.7 mag. The photometry for sources with quotients larger than the thresholds (blue lines) was filtered out for the publication (see text).
In the text

	Fig. 29. Simulated mean fluxes as a function of the true mean flux in the presence of the 1 e− s−1 threshold (dark filled symbols) and without (light open symbols). The mean and the 1σ confidence intervals are shown as lines and shaded regions. The dotted line indicates the lower bound of the mean in the presence of the threshold.
In the text

	Fig. 30. Distribution of 5 million simulated mean magnitudes for BaSeL spectra with effective temperatures from 2000 to 35 000 K, in the presence of a 1 e− s−1 threshold, in the G_BP − G versus G (upper panel) and the G − G_RP versus G (lower panel) colour-magnitude diagrams. The solid lines indicate the mean.
In the text

	Fig. 31. Colour–colour relation of the APOGEE DR16 red clump sample. Stars with 5p astrometric solutions are shown with black dots, and the ones with 6p solutions are shown with red dots. Top: full sample. Bottom: after removing stars with G_BP > 20.5 mag and correcting the G photometry of the 6p solutions using the recipe of Riello et al. (2021).
In the text

	Fig. 32. Residuals from a global G − G_RP = f(G_BP − G_RP) relation for a sample of luminous, low extinction stars (A_0 < 0.05 mag, M_G < 4 mag). Top: as a function of G, colour-coded by the number of stars normalised by the total number of stars per magnitude bin. Bottom: as a function of G_BP − G_RP for a sample of bluer stars, colour-coded by the magnitude.
In the text

	Fig. 33. Median colours G_BP − G_RP in a dense field (Galactic coordinates) showing artefacts from the scan pattern in Gaia DR2 (top), which have almost disappeared in Gaia EDR3 (bottom).
In the text

	Fig. 35. Map in Galactic coordinates of the mean difference of the G magnitudes between Gaia EDR3 and Gaia DR2.
In the text

	Fig. 36. Comparison between mean G magnitudes provided in the Gaia DR2 and Gaia EDR3 catalogues for 140 635 RR Lyrae variables. Black and red points show sources with a difference in magnitudes of less and more than 0.5 mag, respectively.
In the text

	Fig. 38. Colour-magnitude diagram in the inner (blue) and outer (black) regions of NGC 5986 in three different areas: inner 50% and outer 50% (left); inner 25% and outer 25% (centre); and inner 10% and outer 10% (right).
In the text

	Fig. 39. Residuals of the difference between G and F606W HST magnitudes as a function of G in M4 in the colour range (G_BP − G_RP) = 1.4–1.5 mag (see text for details).
In the text

	Fig. 40. Comparison of 2D KLD between Gaia EDR3 and Gaia DR2. The 1:1 line is shown in red, while subspaces for which the KLD deviated by at least 10% with respect to DR2 are shown as “*”. The blue dashed line is a guide to the low KLD sequence at an offset of about 0.17 from the 1:1 line.
In the text