Gaia Data Release 3 - The Gaia Andromeda Photometric Survey

A&A 674, A4 (2023)

The Gaia Andromeda Photometric Survey

1, L. Eyer2, G. Busso1, M. Riello1, F. De Angeli1, P. W. Burgess1, M. Audard2,4, G. Clementini3, A. Garofalo3, B. Holl2,4, G. Jevardat de Fombelle2, A. C. Lanzafame5,6, I. Lecoeur-Taibi4, N. Mowlavi2,4, K. Nienartowicz7,4, L. Palaversa8 and L. Rimoldini4

1Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge, CB3 0HA, UK
e-mail: dwe@ast.cam.ac.uk
2Department of Astronomy, University of Geneva, Chemin Pegasi 51, 1290 Versoix, Switzerland
3 INAF – Osservatorio di Astrofisica e Scienza dello Spazio di Bologna, Via Piero Gobetti 93/3, 40129 Bologna, Italy
4Department of Astronomy, University of Geneva, Chemin d’Ecogia 16, 1290 Versoix, Switzerland
5Dipartimento di Fisica e Astronomia “Ettore Majorana”, Università di Catania, Via S. Sofia 64, 95123 Catania, Italy
6 INAF – Osservatorio Astrofisico di Catania, Via S. Sofia 78, 95123 Catania, Italy
7 Sednai Sàrl, Geneva, Switzerland
8 Ruđer Bošković Institute, Bijenička cesta 54, 10000 Zagreb, Croatia

Received: 7 June 2022
Accepted: 24 June 2022

Abstract

Context. As part of Gaia Data Release 3 (Gaia DR3), epoch photometry has been released for 1.2 million sources centred on M 31. This is a taster for Gaia Data Release 4 where all the epoch photometry will be released.

Aims. In this paper, the content of the Gaia Andromeda Photometric Survey (GAPS) is described, including statistics to assess the quality of the data. Known issues with the photometry are also outlined.

Methods. Methods are given to improve interpretation of the photometry, in particular, a method for error renormalisation. Also, use of correlations between the three photometric passbands allows clearer identification of variables, and is not affected by false detections caused by systematic effects.

Results. GAPS presents a unique opportunity to look at Gaia epoch photometry that has not been preselected based on variability. This allows investigations to be carried out that can be applied to the rest of the sky using the mean source results. Additionally, scientific studies of variability can be carried out on M 31 and the Milky Way in general.

Key words: instrumentation: photometers / techniques: photometric / Galaxy: general / stars: variables: general / Local Group

© The Authors 2023

1. Introduction

Gaia is one of the most ambitious, diverse, and demanding projects in operation as part of the ESA Astrophysics Science programme. From early on, the data processing and analysis of the Gaia data have been recognised as a challenge of the highest order. More than 450 people have gathered to take part in this enormous task. Ten years after the launch, investigations of alternative algorithms and software development are still ongoing to achieve significant improvements of the data products. In order to mitigate the risk and satisfy the scientific community, the approach has been to release Gaia data in an iterative manner: upon each release, more data have been processed, and there are a larger number of sources with more diverse data products. The data behaviour is better understood and more and more effects are taken into account and therefore the calibrations are also improved. The feedback from the scientific community is also important in this process. The Gaia Andromeda Photometric Survey (GAPS) is such an early release for the epoch photometry.

In the data release plans, the intention for Gaia DR3 was to only release data for the entire catalogue averaged over many observations. The equivalent epoch data, sometimes referred to as time series, would only be published in the fourth data release, which is not expected before the end of 2025. However, the iterative approach can also be taken for the time domain measurements of Gaia. In the first data release, _G_-band epoch photometry was released for 3194 variable stars. In the second data release, this number increased to half a million variable stars. Now, with the third data release, about 10 million variables will be published with their time series. With this approach, it was thought that releasing epoch photometry for all sources –variable and constant– from a limited region of the sky would help the community understand the strengths and limitations of the Gaia epoch photometry.

The paper outline is as follows: Sect. 2 describes the choice of the field for this survey; Sect. 3 describes the data; Sect. 4 presents the overall statistics; Sect. 5 goes through some of the issues that remain in the data; and Sect. 6 gives a few simple examples of what can be done with the data. As with many large missions, many acronyms are used in Gaia publications. Table D.1 lists the ones used in this paper.

2. Choice of field

Several fields were studied to determine whether or not they would be suitable for a data release. Among them were the Andromeda galaxy (M 31), the Ursa Minor dwarf Spheroidal (UMi dSph), the open cluster NGC 2516 (a well-studied intermediate-age cluster), the Kepler field, and the two PLATO Long-duration Observation Phase fields (Nascimbeni et al. 2022).

The Gaia scanning law is peculiar and these fields are sampled very differently. For example, the Ursa Minor dwarf Spheroidal and NGC 2516 lie close to the ecliptic poles and benefited from the Ecliptic Pole Scanning Law (EPSL) used during the first month of operations, before the spacecraft started its Nominal Scanning Law (NSL; Gaia Collaboration 2016). Fields observed with the EPSL were observed very frequently during this period and are therefore excluded, as they would not be representative of the typical Gaia sampling. The PLATO and Kepler fields are in a region with a low number of measurements. The Andromeda galaxy instead is in a region where the number of scans varies significantly within the range covered by Gaia due to its scanning law. The 10th and 90th percentiles of this distribution are 10 and 57 observations. Furthermore, M 31 encompasses regions of different densities, including crowded areas, and so the community will be able to evaluate some spurious variability effects due to crowding. Finally, with a radius of 5.5° centred on M 31 (RA 10.68333°, Dec 41.26917°), this field, in addition to stars from the Andromeda galaxy, combines a large number of Milky Way stars that result in a Hertzsprung–Russell (HR) diagram where all sequences are well populated.
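
As an illustration of the field geometry, the following minimal sketch checks whether a position lies within 5.5° of the quoted field centre. The function names are hypothetical, and the haversine formula is simply one convenient way to compute the separation; membership of the survey itself is defined by the archive flag, not by this calculation.

```python
import math

# Field centre and radius quoted in the text (degrees).
M31_RA_DEG = 10.68333
M31_DEC_DEG = 41.26917
GAPS_RADIUS_DEG = 5.5


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1 to guard against rounding just above the valid range.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def in_gaps_cone(ra_deg, dec_deg):
    """True if the position is within the quoted radius of the field centre."""
    sep = angular_separation_deg(ra_deg, dec_deg, M31_RA_DEG, M31_DEC_DEG)
    return sep <= GAPS_RADIUS_DEG
```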

3. Data description

The data set contains epoch photometry in G, _G_BP, and _G_RP for 1 257 319 sources. The G photometry is the field of view (FoV) average, and so there is only one G value per transit. These data are contained within a DataLink Massive data base associated with the Gaia DR3 archive. This can be accessed from the archive query results by clicking on the DataLink symbol (two chain links) and selecting the appropriate data from the pop-up window.

There are three data structures that can be selected. The RAW data structure option will result in one file with one row per source with arrays for each field. Element i of each array contains data for the _i_th transit. This data structure is described in the online DR3 documentation1 in Sect. 20.7.1. The other two data structures, INDIVIDUAL and COMBINED, have one row per transit and passband type. INDIVIDUAL has one file per source, while COMBINED has all the data within one file. Table 1 gives details about the fields of the epoch photometry contained within the INDIVIDUAL and COMBINED data structures. We note that rejected_by_photometry and rejected_by_variability are independent of each other and are the result of different processes.

Table 1.

General description of the epoch photometry fields of the Gaia DR3 archive (for the INDIVIDUAL and COMBINED data structures).

To identify which sources are part of GAPS, the main gaia_source archive table can be queried by checking in_andromeda_survey = 't'. Other source data can also be extracted at this time. We note that there is a limit associated with the archive DataLink service and only 5000 sources can be extracted in a single query. An example is given in Appendix A of how to automate this and extract more than this limit.
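
A minimal sketch of how such an extraction could be organised is given here. It is not the Appendix A example: it only shows an ADQL selection string and a helper (the name `batches` is hypothetical) that splits a list of source identifiers into groups no larger than the DataLink limit. How each group is then submitted depends on the archive client being used and is not shown.

```python
# ADQL selecting the survey sources; the flag and the 't' literal follow the text.
GAPS_QUERY = (
    "SELECT source_id FROM gaiadr3.gaia_source "
    "WHERE in_andromeda_survey = 't'"
)

# Maximum number of sources per DataLink request, as stated in the text.
DATALINK_LIMIT = 5000


def batches(source_ids, size=DATALINK_LIMIT):
    """Yield consecutive lists of at most `size` source identifiers."""
    ids = list(source_ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

For example, 12 001 identifiers would be split into three requests of 5000, 5000, and 2001 sources.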

As reported in Riello et al. (2021), when generating mean photometry in the processing leading to Gaia EDR3, calibrated epoch fluxes with values lower than 1 e− s−1 were rejected. A similar threshold was set for epoch photometry entering the archive at 0 e− s−1, that is, only positive values were considered valid. Ideally, the negative fluxes should have been retained because they are equally valid. For sources with very low flux, the error distribution is for all practical purposes Gaussian; however, when the fluxes are transformed to magnitudes, they lose this property. Additionally, negative fluxes, if they had been retained, would have undefined magnitudes. An alternative method, not used here, is to use an inverse hyperbolic sine function instead of a logarithmic transform, as proposed by Lupton et al. (1999). Such a transformation allows negative values of flux. Because the logarithmic transform is used here, care must be taken when using the magnitudes: there will be transits with very large magnitudes, especially for _G_BP and _G_RP, corresponding to flux values close to zero. These are well beyond the nominal detection capabilities of Gaia.
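
To illustrate the difference between the two transforms, the sketch below compares the logarithmic magnitude with an inverse hyperbolic sine magnitude of the form proposed by Lupton et al. (1999). The zero point and softening parameter are arbitrary illustrative inputs, not Gaia calibration values, and the function names are hypothetical.

```python
import math

POGSON = 2.5 / math.log(10.0)  # approximately 1.0857


def log_magnitude(flux, zero_point):
    """Classical magnitude; undefined (NaN) for non-positive flux."""
    if flux <= 0.0:
        return math.nan
    return zero_point - 2.5 * math.log10(flux)


def asinh_magnitude(flux, zero_point, softening):
    """Inverse hyperbolic sine magnitude.

    `softening` is expressed in the same units as `flux`; it sets the flux
    scale below which the magnitude becomes linear in flux.
    """
    return zero_point - POGSON * (
        math.asinh(flux / (2.0 * softening)) + math.log(softening)
    )
```

For fluxes much larger than the softening parameter, the two definitions agree closely, while the inverse hyperbolic sine form remains finite for zero and negative fluxes.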

The sky distribution of the sources in this survey is shown in Fig. 1.

Fig. 1. Overall area covered by this survey in equatorial coordinates. As with many sky plots from Gaia, this is not an image but a diagram showing sources identified by Gaia. The density scale is logarithmic. Also visible are M 32 and M 110.

Three of the fields in Table 1 can be further decoded to generate potentially useful information.

The first quantity is the Gaia transit identifier containing information on the FoV, CCD Row, on-board mission time (OBMT), and the across scan AC position of the window. Details on how to decode this can be found in Sect. 20.7.1 (EPOCH_PHOTOMETRY) of the Gaia DR3 documentation. The additional processing flags can also be decoded using the description given in Sect. 19.6.1 of the documentation. One of the bits of this flag indicates if G band flux scatter is larger than expected by the photometry processing. If this is set for a significant number of the G epochs of a source, this could indicate very short timescale variability. However, for some magnitude ranges, this could be due to uncalibrated systematic effects, such as magnitude terms; see Sect. 5.2. Most of the bits in this flag mainly describe which CCD data were rejected or unavailable when forming the G FoV average.

Finally, the source identifier is described in Sect. 20.1.1 (GAIA_SOURCE) of the Gaia DR3 documentation. Contained within this number is a level 12 HEALPix index number which gives the approximate position of the source to the nearest arcmin.
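
As a minimal sketch of this decoding, and assuming the layout given in the Gaia documentation in which the level 12 HEALPix index (nested scheme) is obtained by integer division of the source identifier by 2^35 = 34359738368, the index can be extracted as follows. The function name is hypothetical.

```python
HEALPIX_LEVEL12_DIVISOR = 2 ** 35  # 34359738368


def healpix_index(source_id, level=12):
    """HEALPix index (nested scheme) encoded in a Gaia source identifier.

    Each step to a coarser level merges four pixels, hence the factor
    4 ** (12 - level).
    """
    if not 0 <= level <= 12:
        raise ValueError("level must be between 0 and 12")
    return source_id // (HEALPIX_LEVEL12_DIVISOR * 4 ** (12 - level))
```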

4. General statistics

Figure 2 shows the number of observations at each magnitude for G, _G_BP, and _G_RP, and the peaks of these are approximately at 20.2, 20.2, and 18.9 mag, respectively. The broadness of the peaks, extending well into the faint end, is an effect of the large uncertainties for faint sources, especially for _G_BP and _G_RP. Additionally, the asymmetric and non-linear nature of the transformation from flux to magnitude increases the number of extremely faint epochs. No signal-to-noise-ratio (S/N) filter has been applied to this survey so as to avoid biasing any investigations that the users might want to carry out.

Fig. 2. Number of observations as a function of G, _G_BP, and _G_RP magnitude.

The uncertainties are shown in Figs. 3–5. The G fluxes are formed from a weighted mean of up to nine Astrometric Field (AF) CCDs forming the transit. The uncertainties reflect this. The formulae for calculating the weighted mean and uncertainty can be found in Carrasco et al. (2017). This means that some features, such as gating, which are seen in the equivalent _G_BP and _G_RP plots, are smoothed out. The ridge observed at the faint end, which is about a factor of five above the median line, is formed by transits that only contain one CCD measurement (AF1). This can be identified from the flags described in Appendix B, indicating which AF CCDs contributed to the mean G flux value.
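
The general idea of a weighted mean over the CCD measurements of a transit can be sketched as follows. This is a standard inverse-variance weighted mean and is only an assumption made for illustration; the exact expressions used in the processing are those of Carrasco et al. (2017) and may differ. The function name is hypothetical.

```python
import math


def weighted_mean_flux(fluxes, errors):
    """Inverse-variance weighted mean and its formal uncertainty."""
    weights = [1.0 / e ** 2 for e in errors]
    weight_sum = sum(weights)
    mean = sum(w * f for w, f in zip(weights, fluxes)) / weight_sum
    return mean, math.sqrt(1.0 / weight_sum)
```

With this form, N measurements of equal uncertainty give a formal uncertainty on the mean that is smaller by a factor of sqrt(N), whereas a transit with a single CCD measurement gains nothing from averaging.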

Fig. 3. Distribution of the uncertainty on G in magnitudes as a function of magnitude. The line shows the median of the distribution. The density colour scale is logarithmic.

Some of the features seen above the median line in the G magnitude range 13−16 are caused by variation of the photometry within the transit. This could be due to variability or uncalibrated systematic effects affecting one or more of the CCD measurements within the transit.

At the bright end, G < 12, the bumps seen are the result of both the different effective exposure times caused by gating, and the saturation features. Although gating should mitigate most of the effects of saturation, the on-board choice of gate is affected by on-board photometric errors and is therefore not always optimal, thus causing some saturation to occur. This is different from CCD to CCD and therefore additional scatter is observed.

The _G_BP and _G_RP uncertainties have an easier structure to explain. Brighter than about magnitude 11, the features are all caused by gating. The effect of this is for the transits to have different effective exposure times and therefore uncertainties. We note that the selection of the gate to be used is done by an on-board magnitude estimate which is quite noisy (brighter than about G = 12, this is about 0.5 mag). This explains why there is quite a large overlap in magnitude between the different gated observations.

Brighter than about magnitude 16, the gradients in these plots are all very similar and indicate that the uncertainty is source-limited, that is, limited by the S/N of the source. Fainter than this, the gradient changes, indicating the change into a sky-limited regime.

Figure 6 shows the median of the distribution for all three passbands on the same plot to ease comparison.

Fig. 6. Median of the G, _G_BP, and _G_RP uncertainties in magnitudes as a function of magnitude.

The majority of sources have between 30 and 45 observations, but a reasonable number, 15%, have more than 50 observations. These are located in the stripes seen in the sky distribution and are due to the scanning law of Gaia. The central region of M 31 has very few observations in comparison. This is because of crowding, which causes the observations to fail for a number of reasons. For the G observations, the line spread function (LSF, Rowell et al. 2021) fits fail due to the presence of multiple sources in the window. _G_BP and _G_RP observations have much larger windows, which often overlap in crowded regions and are therefore truncated on board. These have not been processed for Gaia DR3. The occurrence of overlap and therefore truncation depends not only on stellar density but also the scanning direction with respect to the location of the sources in the sky. For this reason, most sources will still get some useful observations but the average number of epochs per source is significantly reduced in such areas.

Figure 9 shows the colour distribution in the GAPS field. The average colour away from M 31 is 1.4 in _G_BP − _G_RP, whereas in the spiral arms of M 31 the average colour is 1.0. This reflects the brighter population of M 31.

Fig. 7. Total number of _G_RP observations for each source within the survey.

The total number of observations per source is shown in Fig. 7 and their sky distribution in Fig. 8. The _G_RP values are shown in these plots as representative of the number of transits in each passband.

Fig. 8. Sky distribution of the sources in equatorial coordinates weighted by the number of _G_RP observations.

Fig. 9. Sky distribution of the sources in equatorial coordinates weighted by the colour of the source.

Figure 10 shows the time distribution of the epochs as Barycentric Julian Day in Barycentric Coordinate Time (TCB) – 2455197.5 in days. This approximately corresponds to the range 27 August 2014 to 24 May 2017. As can be seen, the distribution is highly irregular because of the Gaia scanning law.
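
The time convention can be made concrete with a minimal sketch that converts the archive time (days since the reference epoch) to a Barycentric Julian Day and to an approximate calendar date. The function names are hypothetical, and the calendar conversion deliberately ignores the small differences between TCB and civil time scales, so it is only suitable for orientation.

```python
from datetime import datetime, timedelta

REFERENCE_BJD = 2455197.5               # offset quoted in the text
REFERENCE_DATE = datetime(2010, 1, 1)   # calendar date of JD 2455197.5


def to_bjd(tcb_days):
    """Barycentric Julian Day (TCB) for a time in days since the reference."""
    return tcb_days + REFERENCE_BJD


def approximate_date(tcb_days):
    """Approximate calendar date; time-scale differences are ignored."""
    return REFERENCE_DATE + timedelta(days=tcb_days)
```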

Fig. 10. Distribution in time (TCB days since 2010) of the epochs within the survey. The range of 1700 to 2700 TCB days since 2010 corresponds approximately to 1200 to 5200 OBMT rev which is often used in other Gaia papers. See Gaia Collaboration (2016) for an explanation of Gaia timescales.

5. Known issues with the published photometry

5.1. Error renormalisation

It is often the case that the estimation of errors is as difficult to get right as the main data. A correct estimation of the errors on the single transits is very important, because modelling the data often relies on errors for weighting the data. Usually errors are underestimated in comparison to the observed scatter due to uncalibrated systematic errors. Distinguishing between some systematic errors and random ones may not be important in many modelling cases where the model does not use the parameter driving the systematic error. For example, if there were systematic differences between CCD Rows or FoV, these would not be relevant for scientific modelling such as light curve fitting apart from an apparent increase in the size of the errors.

In many cases, a simple investigation into the unit-weight residuals can give an indication as to the quality of the error estimates. This usually involves a modelling assumption, for example the source is constant. A particularly useful technique is the use of _P_-values which effectively transforms the residuals into a flat distribution that can be interpreted more easily; see Eq. (2) of Evans et al. (2017) for the conversion from _χ_2 to _P_-value. Indeed, in classical hypothesis testing (Kendall & Stuart 1979), the _P_-value is used to accept or reject the null hypothesis. It is defined (for a unilateral test) as the probability of the random variable X to be larger than the obtained value _X_0.

$$ P\text{-}\mathrm{value} = \int_{X_0}^{\infty} X \, \mathrm{d}X. $$

The random variable X is assumed to follow the statistics of the null hypothesis. In the case where all sources follow the null hypothesis, the distribution of _P_-value is flat. As with a _χ_2 test, the _P_-value test is very sensitive because it involves a quadratic residual. If a small fraction of the sources are variable and the uncertainties are well estimated, then the distribution of the _P_-values is flat for a large fraction of the _P_-values, and a peak is present near zero for the variable sources. However, it is not straightforward to disentangle the effect of variability from that of an unrepresentative error estimation. The left panel in Fig. 11 illustrates this issue (variability and error problems), where the original _P_-value distribution is shown for sources in this survey in the magnitude range 16 < G < 17.
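
A minimal sketch of such a test for a single source is given here, under two assumptions that are not taken from the text: the null hypothesis is a constant source fitted by a weighted mean, and the _P_-value is the upper tail probability of a _χ_2 distribution with N − 1 degrees of freedom. The function names are hypothetical, and the incomplete gamma function is evaluated with standard series and continued-fraction expansions so that no external library is needed.

```python
import math


def _gamma_q(a, x):
    """Regularised upper incomplete gamma function Q(a, x)."""
    if x <= 0.0:
        return 1.0
    log_prefactor = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion of the lower function P(a, x); Q = 1 - P.
        term = 1.0 / a
        total = term
        n = a
        for _ in range(10000):
            n += 1.0
            term *= x / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * math.exp(log_prefactor)
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_prefactor)


def chi2_p_value(chi2, dof):
    """Probability of exceeding `chi2` for `dof` degrees of freedom."""
    return _gamma_q(0.5 * dof, 0.5 * chi2)


def constant_source_p_value(values, errors):
    """P-value of the constant-source hypothesis for one time series."""
    weights = [1.0 / e ** 2 for e in errors]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    return chi2_p_value(chi2, len(values) - 1)
```

A source whose scatter is consistent with its quoted errors gives a _P_-value drawn from a flat distribution, whereas a variable source, or one with underestimated errors, gives a value close to zero.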

Fig. 11. G band _P_-value distributions for sources in the range 16 < G < 17 for three cases in which the following magnitude errors were added in quadrature to the errors provided in the archive: 0.0 (left panel), 0.0019 (central panel), 0.01975 (right panel). The middle value corresponds to the selected additional error for this magnitude range.

The most commonly used approach for error renormalisation is to scale the errors in some way. This is what was carried out for the astrometry in Gaia DR2 (Lindegren et al. 2018) where this was the correct mitigation. This approach was attempted for the photometry, but no consistent results could be achieved. It is likely that the main problem affecting the unit-weight residuals of the photometry is caused by uncalibrated systematic effects. As there is likely to be more than one of these, their combined effect is to introduce an additional random Gaussian error, in agreement with the central limit theorem (Kendall & Stuart 1977). This suggests that adding an error in quadrature to the formal error would be a better solution than scaling the errors. As the size of the systematic errors for Gaia is likely to be a function of magnitude, the additive correction should be a function of magnitude. Corrections to the photometric errors have been computed independently for the three different passbands. An example of these corrections is shown in the last two panels of Fig. 11 where two different corrections have been added in quadrature to the formal errors before calculating the _P_-values. In the right-most panel, the additional error is too large, causing an excess of high _P_-values. The addition of 0.0019 mag in the middle panel gives a reasonable solution where most of the distribution is flat and the peak at low _P_-values would be caused by variability only.

The features seen in Fig. 11 lead to a simple algorithm to determine a reasonable additional error that could be added in quadrature to the data. For each magnitude range (±0.5 mag) and starting with a very high additional error, the _P_-value distribution is generated and the ratio of sources in the _P_-value ranges [0.8, 0.9] and [0.9, 1.0] is calculated. Initially, there will be many more sources in the last bin because the additional error is too large. This is gradually decreased for each magnitude range until the number of sources with _P_-value between 0.9 and 1.0 is smaller than the number of sources in the range [0.8, 0.9]. When this condition is met, the corresponding correction is the one to be adopted. The step size of the decrease is 0.05 mmag for G and 0.2 mmag for _G_BP and _G_RP.
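
The search described above can be sketched as follows for a single magnitude range. The callable `p_values_for` is a hypothetical hook that returns the _P_-values of all sources in the range after a given additional error has been added in quadrature; how these are computed is left to the caller. Returning `None` when no solution is found is an assumption about how a failure might be represented.

```python
def find_additional_error(p_values_for, start_error, step):
    """Decrease the additional error until the [0.9, 1.0] bin stops dominating.

    p_values_for: callable mapping an additional error (mag) to the list of
                  P-values of the sources in one magnitude range.
    start_error:  initial, deliberately too large, additional error (mag).
    step:         decrement per iteration (mag), e.g. 0.00005 for G.
    """
    n_steps = int(round(start_error / step))
    # Multiples of the step are used to avoid accumulating rounding errors.
    for k in range(n_steps, 0, -1):
        extra = k * step
        p_values = p_values_for(extra)
        n_low = sum(1 for p in p_values if 0.8 <= p < 0.9)
        n_high = sum(1 for p in p_values if 0.9 <= p <= 1.0)
        if n_high < n_low:
            return extra
    return None  # no additional error satisfied the condition
```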

The results of this analysis are shown in Fig. 12 in comparison to the median quoted errors. Table 2 shows the results as a function of magnitude. These can be interpolated for generating the appropriate error to add in quadrature for each transit. In the cases where the algorithm has failed, for example where there are too few data points, no value is given. Interpolation over these points can be carried out. Extrapolation is not advised and the end values should be used in these cases.
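
Applying the tabulated corrections can be sketched as follows. The function names are hypothetical and the table passed in must be taken from Table 2; no values from that table are reproduced here. Missing entries are represented by `None` and are interpolated over, and magnitudes outside the tabulated range receive the end values rather than an extrapolation.

```python
import math
from bisect import bisect_right


def additional_error(mag, table_mags, table_errors):
    """Additional error at `mag` by linear interpolation of a table.

    table_mags must be in increasing order; entries of table_errors that
    are None (algorithm failed) are skipped.
    """
    points = [(m, e) for m, e in zip(table_mags, table_errors) if e is not None]
    mags = [m for m, _ in points]
    errs = [e for _, e in points]
    if mag <= mags[0]:
        return errs[0]   # use the end value, no extrapolation
    if mag >= mags[-1]:
        return errs[-1]
    i = bisect_right(mags, mag)
    frac = (mag - mags[i - 1]) / (mags[i] - mags[i - 1])
    return errs[i - 1] + frac * (errs[i] - errs[i - 1])


def renormalised_error(quoted_error, extra_error):
    """Quoted and additional errors combined in quadrature."""
    return math.hypot(quoted_error, extra_error)
```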

Fig. 12. First three plots show the results for the error renormalisation analysis for G, _G_BP, and _G_RP respectively. The errors are all in magnitudes. In these plots, the median quoted error is shown along with the additional error from this analysis. We note that the BP and RP plots have a larger ordinate axis maximum. The final plot shows the additional error for all three passbands at the same time to aid comparison. Missing points indicate where the algorithm has failed.

Table 2.

Additional single transit errors for G, _G_BP, and _G_RP as a function of magnitude.

From Fig. 12 and Table 2 it can be seen that the G additional errors are much smaller than those of _G_BP and _G_RP. This is due to the G values being an average of up to nine values. For the additional errors, the G ones are smaller than the median error by a factor of about two, whereas the _G_BP and _G_RP ones are around the same size as the median quoted errors.

5.2. Magnitude-based systematic errors

Within the internal photometric calibrations, no terms depending on magnitude are used (Riello et al. 2021). This is because the reference photometry used for these calibrations is derived from the photometry itself in an iterative loop. Introducing a magnitude-dependent term into the calibration would cause convergence problems to arise that are due to the overall system being degenerate.

Using the data in this survey, it is possible to see the scale of the magnitude-dependent systematic effects in each of the three passbands by looking at the differential magnitude systematic errors. In future processing cycles, these effects could be calibrated out once the mean reference photometry has been determined.

A number of effects can cause systematic deviations as a function of magnitude which can be very different between the G passband data and that of _G_BP and _G_RP. For G, the main effect comes from the fit of the LSF or PSF to the sampled data. If the calibration of the LSF/PSF is not perfect, then magnitude effects can arise that are due to the weighted nature of the fit.

The other significant effect comes from the calibration of the background. This affects G in a similar manner to _G_BP and _G_RP. Problems with this calibration lead to a systematic effect at the faint end, similar to a hockey stick. For the G photometry, an occasional systematic can be seen at around G = 11 which is caused by saturation that is not mitigated by the gating strategy of Gaia.

We note that these systematic errors will be different in each processing cycle because their cause is entangled with the different calibrations that have been carried out. With each processing cycle, the calibrations improve and the sizes of these magnitude terms are reduced.

Figure 13 shows the epoch G residuals to the mean magnitude for the Following FoV and Row 1 as a function of magnitude for different time selections. These correspond to the peaks seen in Fig. 10. Some of the narrower peaks have been grouped together to make the plot clearer. Only the medians of the distributions are shown.

Fig. 13. Comparison between epoch and mean magnitudes for various time selections as a function of magnitude. The lines correspond to the medians of the distribution. The legend on the side indicates the time corresponding to the selection (TCB days since 2010). The data for this plot are from the Following FoV and Row 1.

The main two features that can be seen are the effects of saturation for G < 13 and probable background subtraction issues at the faint end. These are clearly a function of time. No consistent pattern with time is evident.

Also seen in this plot is the outlier behaviour of the data around TCB = 1730. The G epochs for this period can deviate from the mean by a few tenths of a magnitude. This period immediately followed a decontamination event (Gaia Collaboration 2016) and the image quality had not stabilised following the heating up of the focal plane. Thus, the LSFs and PSFs generated for the time were not suitable for the data. We note that the G data from this period are flagged as rejected_by_variability. The corresponding _G_BP and _G_RP epochs are not similarly affected because their photometry is not determined by a profile fit, but effectively by aperture photometry, and is not as affected by image instability.

Figure 14 shows the equivalent residuals for the _G_RP data. Here the only effect seen is at the faint end and is probably caused by difficulties with the background calibrations. The _G_BP residuals show similar behaviour.

Comparisons between FoVs or CCD rows for the same time selection also show similar-sized systematic trends. This indicates that these complex magnitude-based systematic errors depend on row, FoV, and time. We note that the size of these systematic effects is only a fraction, usually less than 25%, of the size of the scatter, which is equivalent to the epoch uncertainty.

5.3. Crowding and background effects

As described in Riello et al. (2021), the corrected _G_BP and _G_RP flux excess factor was introduced as a consistency metric. Below, we reiterate the definition of this quantity:

$$ C^{*} = C - f(G_{\rm BP} - G_{\rm RP}), \qquad (1) $$

where C = (_I_BP + _I_RP)/_I_G is the ratio between the sum of the BP and RP fluxes and the G flux, and f(_G_BP − _G_RP) is a function of the colour of the source. Good and consistent photometry should have C* values close to zero. In Gaia DR3, C* was calculated for all sources from their mean photometry. For the GAPS dataset, it can also be calculated from the epoch photometry to give an indication of the consistency of each transit. Figure 15 shows the sky distribution, zoomed in on the Andromeda galaxy, of the epoch C*: in the centre of the galaxy and in the spiral arms, C* is clearly higher than the background, mainly because of crowding.

The scan angle changes from epoch to epoch, and with it the amount of contaminating flux that a source receives from neighbouring stars. Figure 16 shows some examples of this effect. Panel A is a source that has almost all transits flagged as blended (see footnote 2); because the amount of contaminating flux varies with the scan angle, the crowding does not affect the photometry for the scans in which C* is close to zero. Panel B is a similar case, but only a few transits are flagged as blended. For comparison, panel C shows a source that is always isolated and whose C* is always close to zero. Panel D, on the other hand, is a source that was estimated as never crowded (nBlend = 0) but shows clear variation with scan angle. In this last case, the blending source was probably not in the source catalogue used by the crowding evaluation algorithm, which was based on the Gaia DR2 source catalogue. We note that in the crowded cases, the difference in scan angle between the peaks is about 180°. This is because reversing the scan direction leaves the positions of the sources with respect to the window unchanged, except for a mirror effect.
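As an illustration of the epoch-level calculation, the following minimal sketch evaluates Eq. (1) transit by transit. The function name `epoch_cstar` and the callable `f_colour` are hypothetical; the actual colour polynomial f is given in Riello et al. (2021) and is not reproduced here, so a constant placeholder is used purely for demonstration.

```python
import numpy as np

def epoch_cstar(flux_g, flux_bp, flux_rp, colour, f_colour):
    """Corrected flux excess C* = (I_BP + I_RP)/I_G - f(colour) for each transit.

    f_colour is a user-supplied callable standing in for the colour
    function of Riello et al. (2021); it is an assumption of this sketch.
    """
    flux_g = np.asarray(flux_g, dtype=float)
    flux_bp = np.asarray(flux_bp, dtype=float)
    flux_rp = np.asarray(flux_rp, dtype=float)
    c = (flux_bp + flux_rp) / flux_g
    return c - f_colour(np.asarray(colour, dtype=float))

# Placeholder colour function (NOT the published polynomial).
def toy_f(colour):
    return np.full_like(colour, 1.2)

# Three transits of one source; the last one has extra BP and RP flux,
# as might happen when a neighbour contaminates the window.
cstar = epoch_cstar(
    flux_g=[100.0, 100.0, 100.0],
    flux_bp=[60.0, 61.0, 80.0],
    flux_rp=[60.0, 59.0, 75.0],
    colour=[1.0, 1.0, 1.0],
    f_colour=toy_f,
)
```

Plotting the resulting values against scan angle for a single source is one way to reproduce the kind of diagnostic shown in Fig. 16.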

Fig. 15. Sky distribution of the corrected _G_BP and _G_RP flux excess factor C*, obtained from the epoch photometry.
Fig. 16. Examples of C* variations with the scan angles (in degrees). In every panel, the mean G magnitude is indicated, as well as the total number of transits (nObs) and the number of blended transits (nBlend). See the text for a detailed explanation of the single cases.

Crowding effects are not the only reason for the C* variations. Other possible causes are instrumental effects, calibration issues, or cosmic rays. The epoch C* can therefore be used as a statistical quality indicator. However, we warn users that filtering out transits on the basis of a bad C* could hinder the study of special cases such as binaries or variables (see also Sect. 3.1 of Distefano et al. 2023).

5.4. Spurious periodicity

Spurious periods found in the variability analysis can have different origins. Firstly, noisy data can, purely by chance, mimic a signal that is not really there. More generally, any statistical selection produces some false-positive detections.

Secondly, calibration residuals or the data acquisition strategy can leave their signatures in the calibrated data, and this signal can be mistaken for true variability of the source. For example, we can find spurious signals and incorrect periods (46 and 96 days) in the Gaia astrometric and photometric data emerging from the scan-angle direction of extended or crowded sources (Holl et al. 2023). The periods are consequences of _Gaia_’s scanning law.

Thirdly, the celestial source has a genuine signal, but the data analysis confuses it with a spurious period. A typical example is aliasing in sparsely sampled time series. The convolution theorem states that the Fourier transform of a product f(t)·h(t) is the convolution of the individual Fourier transforms F(ν)⊛H(ν). Therefore, the discrete Fourier transform of the observations results from the convolution of the Fourier transform of the signal with the spectral window, which reflects the regularity pattern of the observing times. If the signal is dominated by a simple low frequency, such as a trend or a long-term periodic phenomenon, then the discrete Fourier transform will mainly reproduce the spectral window. It is therefore possible that one of the peaks of this spectral window has the highest amplitude within the searched frequency interval. In Gaia, spurious frequencies at 4, 8, and 12 cycles per day are possible (see Appendix of Eyer et al. 2017 for the spectral window structure).
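The role of the spectral window can be illustrated with a short sketch. The sampling below is a toy assumption (observations restricted to multiples of 0.25 day, mimicking a 6-hour repetition) and is not the real Gaia scanning law; the function name `spectral_window_power` is likewise hypothetical.

```python
import numpy as np

def spectral_window_power(times, freqs):
    """Squared modulus of the spectral window of the observing times.

    times are in days and freqs in cycles per day. The window is the
    discrete Fourier transform of a unit signal at the sampling times,
    normalised so that the power at zero frequency is 1.
    """
    t = np.asarray(times, dtype=float)
    f = np.asarray(freqs, dtype=float)
    phase = 2.0 * np.pi * np.outer(f, t - t[0])
    window = np.exp(-1j * phase).mean(axis=1)
    return np.abs(window) ** 2

# Toy sampling: 60 epochs drawn at random from a 0.25-day grid spanning 1000 days.
rng = np.random.default_rng(0)
slots = np.sort(rng.choice(4000, size=60, replace=False))
times = 0.25 * slots

freqs = np.arange(0.0, 13.0, 0.001)
power = spectral_window_power(times, freqs)
alias_power = spectral_window_power(times, [4.0, 8.0, 12.0])
```

With this sampling the window reaches its maximum again at 4, 8, and 12 cycles per day, so a slowly varying signal would produce peaks of comparable height at those frequencies in a period search.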

5.5. Reminders about the features noted in Riello et al. (2021)

The following is a list of the known issues with the Gaia EDR3 mean photometry that are discussed in Sect. 8 of Riello et al. (2021) with a comment on how they affect the epoch photometry in this survey.

Overestimated mean _G_BP flux for faint red sources. This effect was caused by the filtering out of fluxes smaller than 1 e− s−1 when forming the mean photometry. For the epoch photometry, no such filter was applied; however, in the processing of the data for the archive, negative fluxes were excluded and do not appear in the survey. We note that low fluxes will be problematic if transformed to magnitudes. Plotting a colour–magnitude diagram of all the epochs will demonstrate this.

Sources with poor Spectrum Shape Coefficients (SSCs). Of the 5 401 215 sources that were identified as having poor colour information in Gaia EDR3, 1250 are within the area covered by GAPS. These sources do not have any mean G photometry in the main section of the archive. For a more thorough explanation, please go to the Known Issues web page for Gaia EDR3. These sources do have epoch photometry in this survey, but it is very unreliable because they have been processed with the unreliable SSC values.

Systematic errors due to the use of default colour in image parameter determinations (IPDs). The systematic error described in Sect. 8.3 of Riello et al. (2021) is not present in the mean G photometry of the sources in the main archive, or in the epoch G values within this survey, because the correction described in this latter paper has been applied.

_G_-band magnitude term for blue and bright sources. Eleven sources within this survey with G < 13 and _G_BP − _G_RP < −0.1 are affected by the magnitude term in G caused by this effect. This is probably caused by issues linked to the PSF/LSF calibration (Rowell et al. 2021).

6. Simple examples of data usage

6.1. Correlations between the passbands

An interesting way to detect variables with this data set is to look at the correlations between the three passbands in the residuals with respect to the mean for each source by use of principal component analysis (PCA, Jolliffe 2002). See Süveges et al. (2012) for a similar investigation using SDSS data. For the Gaia data, the analysis is limited to only three passbands, but this is sufficient for identifying variability, avoiding interference from systematic effects. This is based on the assumption that no correlation will exist between the passbands due to instrumental effects. This shows how to exploit one of the most valuable features of this data set: simultaneous observations in many passbands.

The approach taken in this section is to generate, for each epoch of a source, three residuals with respect to the mean for that passband. These residuals are scaled using the estimated error for that residual, that is, the quadrature sum of the mean source error, the epoch error, and the additional error described in Sect. 5.1. An Eigen decomposition is carried out on these unit-weight residuals, resulting in the principal components. The length of the first principal component (PC1) gives a strong indication of the variation within the photometric signals, and its direction indicates the extent of the correlation between the data of the passbands. Another indication of correlation is the relative size of PC1 with respect to the two other principal components. The metric that was found to be most useful was R = _ν_1/(_ν_2 + _ν_3), where _ν_n are the lengths of the Eigen vectors.
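The procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the per-band mean is taken as the simple mean of the epochs, the input errors are assumed to already contain the quadrature sum described above, and the "length" of each principal component is taken to be the square root of the corresponding eigenvalue. The optional `scale_by_width` flag applies the additional scaling by the measured width of the residual distributions that is discussed later in this section.

```python
import numpy as np

def pca_correlation_metric(mag, mag_err, scale_by_width=False):
    """Return (R, PC1) for one source.

    mag and mag_err have shape (n_epochs, 3), ordered as G, BP, RP.
    R = nu_1 / (nu_2 + nu_3), with nu_n the square roots of the eigenvalues
    (an assumption about what is meant by the lengths of the components).
    """
    mag = np.asarray(mag, dtype=float)
    err = np.asarray(mag_err, dtype=float)
    resid = (mag - mag.mean(axis=0)) / err          # unit-weight residuals
    if scale_by_width:
        resid = resid / resid.std(axis=0)           # equal weight per band
    second_moment = resid.T @ resid / resid.shape[0]
    eigval, eigvec = np.linalg.eigh(second_moment)  # ascending order
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]  # largest first
    lengths = np.sqrt(np.clip(eigval, 0.0, None))
    r = lengths[0] / (lengths[1] + lengths[2])
    return r, eigvec[:, 0]

# Simulated data: one correlated variable and one constant star.
rng = np.random.default_rng(0)
n_epochs = 200
t = rng.uniform(0.0, 100.0, n_epochs)
err = np.full((n_epochs, 3), 0.01)
noise = rng.normal(0.0, 0.01, (n_epochs, 3))
signal = 0.1 * np.sin(2.0 * np.pi * t / 0.214)

r_var, pc1_var = pca_correlation_metric(15.0 + signal[:, None] + noise, err)
r_const, pc1_const = pca_correlation_metric(15.0 + noise, err)
```

In this simulation the variable has R well above 2 with PC1 close to the (1, 1, 1) diagonal, whereas the constant star has R below 1.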

A problem with this method is that the errors for the G passband are significantly smaller than those for _G_BP and _G_RP (see Fig. 6). As the amplitude of the variability in the three passbands is generally of the same order, scaling by the errors means that PC1 will generally point along the axis corresponding to the G passband. This makes it difficult to distinguish between variability and a systematic error solely in G. This is shown in the left panel of Fig. 17. Here the main concentration is towards G, indicating that the most significant variation is in G. The spur heading towards the diagonal (1, 1, 1) is formed by the sources showing correlated variability.

Fig. 17. Directions of the first Eigen vector (principal component) for all the sources in GAPS with more than ten transits and R > 2.0. The left plot shows the results obtained when simply scaling the input residuals by the errors, while in the right plot, the residuals are also scaled by the measured width of the distribution. The orientation of the axes is the same in both plots.

To improve on this, the residuals are further scaled by the measured width of the distributions. This emphasises when there is a strong correlation between the passbands and gives them equal weight. This is shown in the right-hand panel of Fig. 17. The concentration of points in the G direction is probably caused by uncalibrated systematic errors in the G band. The two concentrations in the diagonal direction (1, 1, 1) are likely variables where there is strong correlation between the passbands.

Using selection limits of R > 2 and a distance from the diagonal of less than 20°, a selection of variables can be found using this method. We also note that sources with sufficient transits should be selected. A limit of ten FoV transits is suggested. Figure 18 shows an example of a periodic variable identified from this dataset that is not classified within the catalogue of variables released in Gaia DR3 (Eyer et al. 2023; Rimoldini et al. 2023). This star, Gaia DR3 376526416902123392, was identified as a variable by Heinze et al. (2018) and was classified as SINE.
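A minimal sketch of this selection is given below. The function names are hypothetical, and the transit limit is implemented as "more than ten", following the caption of Fig. 17; this reading of the suggested limit is an assumption. Because the sign of an Eigen vector is arbitrary, the angle is measured to the diagonal line rather than to the (1, 1, 1) direction only.

```python
import numpy as np

DIAGONAL = np.ones(3) / np.sqrt(3.0)

def angle_from_diagonal(pc1):
    """Angle in degrees between PC1 and the (1, 1, 1) line, ignoring sign."""
    v = np.asarray(pc1, dtype=float)
    cos_angle = abs(v @ DIAGONAL) / np.linalg.norm(v)
    return float(np.degrees(np.arccos(np.clip(cos_angle, 0.0, 1.0))))

def is_variable_candidate(r, pc1, n_transits,
                          r_min=2.0, max_angle_deg=20.0, min_transits=10):
    """Apply the R, angle, and transit-count limits quoted in the text."""
    return bool(r > r_min
                and angle_from_diagonal(pc1) < max_angle_deg
                and n_transits > min_transits)

# PC1 close to the diagonal: selected.
selected = is_variable_candidate(3.0, [1.0, 1.0, 0.9], n_transits=25)
# PC1 along the G axis (about 54.7 degrees from the diagonal): rejected.
rejected = is_variable_candidate(3.0, [1.0, 0.0, 0.0], n_transits=25)
```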

Fig. 18. Example of a variable identified using correlations between the three passbands. The photometry shown in the left plot is simply sorted according to time. The vertical lines indicate a time gap of more than 0.5 day between two successive observations. The horizontal line is the median value. This is more useful than displaying purely as a function of time. Due to the time sampling resulting from the scanning law of Gaia, the data points would be strongly grouped (see Fig. 10) and the correlations difficult to see. From this plot, it is clear that the photometry is correlated. The plot on the right shows a folded light curve resulting from the period found.

To identify the period used in Fig. 18, we used the generalised Lomb–Scargle method (Zechmeister & Kürster 2009; Lomb 1976; Scargle 1982) and found a period of 0.214 days. We note that if a general period search is carried out on the GAPS data, spurious concentrations at 0.0355 and 0.083 days will be found. These are caused by the scanning law or satellite rotation and are not due to variability. More detail on this can be found in Holl et al. (2023).
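For readers who wish to experiment, the following sketch implements a floating-mean periodogram by fitting a sinusoid plus a constant at each trial frequency with weighted least squares, which is the idea behind the generalised Lomb–Scargle method. It is not the code used for Fig. 18: the function name `gls_power`, the simulated light curve, and the random time sampling are all assumptions made for illustration. Library implementations such as `astropy.timeseries.LombScargle` may be preferable in practice.

```python
import numpy as np

def gls_power(t, y, dy, freqs):
    """Floating-mean Lomb-Scargle power, (chi2_0 - chi2(f)) / chi2_0.

    chi2_0 is the weighted chi-square about the weighted mean and chi2(f)
    that of the best-fitting sinusoid plus constant at frequency f.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(dy, dtype=float) ** 2
    w = w / w.sum()                                    # normalised weights
    chi2_ref = np.sum(w * (y - np.sum(w * y)) ** 2)    # constant-only model
    sqrt_w = np.sqrt(w)
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        arg = 2.0 * np.pi * f * t
        design = np.column_stack([np.cos(arg), np.sin(arg), np.ones_like(t)])
        coef, *_ = np.linalg.lstsq(design * sqrt_w[:, None], y * sqrt_w,
                                   rcond=None)
        chi2 = np.sum(w * (y - design @ coef) ** 2)
        power[i] = (chi2_ref - chi2) / chi2_ref
    return power

# Simulated sinusoidal variable with the period quoted in the text.
rng = np.random.default_rng(1)
true_period = 0.214                                    # days
t = np.sort(rng.uniform(0.0, 100.0, 80))               # toy sampling, not Gaia's
dy = np.full(t.size, 0.01)
y = 15.0 + 0.1 * np.sin(2.0 * np.pi * t / true_period) + rng.normal(0.0, dy)

freqs = np.arange(1.0, 8.0, 0.002)                     # cycles per day
power = gls_power(t, y, dy, freqs)
best_period = 1.0 / freqs[np.argmax(power)]
```

On real GAPS time series the sampling is far from random, so the highest peak should be checked against the spurious periods mentioned above before it is accepted.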

6.2. Hertzsprung–Russell diagrams

Combining a Hertzsprung–Russell (HR) diagram with the R correlation metric from Sect. 6.1 provides an interesting and useful method for visualising and identifying various variable stars. This is shown in Fig. 19. We note that the stars plotted in this diagram are unlikely to be part of M 31 given the 10% parallax error selection used.

Fig. 19. Colour–absolute magnitude with an auxiliary axis in R from Sect. 6.1. Only stars with a parallax error of better than 10% are used here. Rather than plotting each data point individually, the mean inside an area is shown. This is to avoid the overplotting of data points in the densest regions. The numbered regions are discussed in the text.

The areas of high variability marked in Fig. 19 are interpreted as follows:

1. The seven very variable blue objects below the main sequence are cataclysmic variables. Four of these have been specifically identified as such by variability processing and are in Gaia DR3.

2. This region shows the classical instability strip on the main sequence formed by δ Scuti stars (_p_-mode pulsating stars from the κ mechanism) and γ Doradus stars (_g_-mode pulsating stars) at the lower luminosity.

3. The interesting clump with strong variability to the edge of the giant branch is made up of RS Canum Venaticorum-type variables. The majority of these have similarly been identified and are in Gaia DR3.

4. The very red giants are long-period variable stars (Miras, semi-regular, OGLE small amplitude red giants).

5. The faint ridge in the middle of the main sequence is surprising; it should be noted that a similar feature was present also for the fraction of variable stars in Fig. 8 of Gaia Collaboration (2019).

6. The stronger ridge at the top of the main sequence is composed of different types of variability, many linked with binaries.

More variability might be expected to be seen at the red end of the main sequence than is evident in this figure. Figure 8 of Gaia Collaboration (2019) indicates that, in general, this part of the HR diagram contains many variables. One reason that the R values are not higher here is that this metric depends on a correlation between all three Gaia passbands and that the variability is strong with respect to the uncertainties of the epoch photometry. The stars typically inhabiting this part of the HR diagram will be faint and very red. A consequence of this is that the _G_BP fluxes will be very faint and the variability will not be significant compared to the uncertainties.

Similarly, in light of Fig. 10 of Gaia Collaboration (2019), one might expect ZZ Ceti variables to be visible in this figure. In this case, there are not enough white dwarfs in the survey for these variables to be noticeable.

6.3. Assessment of ‘variability proxies’

While no general variability metric has been provided in the Gaia releases so far, users have been inventive in using the available per-source statistics to provide information on variability. These have commonly been called ‘variability proxies’. Two such examples can be found in Belokurov et al. (2017) and Mowlavi et al. (2021). Both are approximately the same in that they are the fractional error on the mean photometry multiplied by the square root of the number of observations. This is simply a consequence of the error estimate containing a scatter component; see Carrasco et al. (2017) Eq. (3). For constant stars, this gives the best estimate for the error on the mean even in the case where epoch errors have been under- or overestimated. For variable stars with a large variation, this gives an estimate of the variability. By multiplying by the square root of the number of observations, the error on the mean is effectively converted into a scatter measurement. The advantage that GAPS has is that with the epoch data, more reliable variability metrics can be compared to a variability proxy which can be used for the DR3 catalogue where there is no epoch data.
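As a concrete illustration, a proxy of this kind can be formed from the per-source statistics in the main catalogue. The sketch below assumes the G-band quantities are taken from the `phot_g_n_obs`, `phot_g_mean_flux`, and `phot_g_mean_flux_error` columns; the function name is hypothetical, and the published proxies cited above may differ in detail.

```python
import numpy as np

def variability_proxy(n_obs, mean_flux, mean_flux_error):
    """Fractional error on the mean flux times the square root of n_obs.

    Multiplying by sqrt(n_obs) converts the error on the mean back into
    an estimate of the fractional epoch-to-epoch scatter.
    """
    n_obs = np.asarray(n_obs, dtype=float)
    mean_flux = np.asarray(mean_flux, dtype=float)
    mean_flux_error = np.asarray(mean_flux_error, dtype=float)
    return np.sqrt(n_obs) * mean_flux_error / mean_flux

# Two toy sources with the same mean flux and number of observations;
# the second has a five times larger error on the mean.
proxy = variability_proxy(n_obs=[100, 100],
                          mean_flux=[1000.0, 1000.0],
                          mean_flux_error=[1.0, 5.0])
```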

One idea for how to improve on the currently used variability proxies is to account for the intrinsic error of the photometry. This can be done by fitting the variability proxy as a function of magnitude and applying this correction to each value. This can be seen in Fig. 20. In this case, it is a simple quadratic fit in log space with a minimum limit; i.e.

$$\log_{10} \mathrm{correction} = \max(-2.4134,\ 0.0127\,G^{2} - 0.2082\,G - 2.0423). \tag{2}$$

The initial idea was to subtract the correction from the variability proxy in quadrature. However, this will lead to taking a square root of a negative number in about half of the cases, and so leaving the metric without taking the square root is a better solution. This effectively makes the metric a corrected variance. In a similar vein, plotting the corrected variance in log space will remove about half the data points and will result in strange density distributions which naturally result from imperfections in the fit to derive the correction.
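The correction and the resulting corrected variance can be written down directly. In this sketch the corrected variance is taken to be the square of the variability proxy minus the square of the correction, which is our reading of the quadrature subtraction without the final square root; the function names are hypothetical.

```python
import numpy as np

def log10_correction(g_mag):
    """Quadratic fit in log space with a minimum limit, as in Eq. (2)."""
    g = np.asarray(g_mag, dtype=float)
    return np.maximum(-2.4134, 0.0127 * g**2 - 0.2082 * g - 2.0423)

def corrected_variance(proxy, g_mag):
    """Proxy squared minus correction squared; can be negative."""
    correction = 10.0 ** log10_correction(g_mag)
    return np.asarray(proxy, dtype=float) ** 2 - correction**2

# A bright source sitting exactly on the fitted floor has zero corrected variance.
floor = 10.0 ** -2.4134
cv_bright = corrected_variance(floor, 13.0)
# A faint source with a proxy below the fit gives a negative value.
cv_faint = corrected_variance(0.05, 20.0)
```

The negative values are the reason the metric is left as a variance: taking the square root, or the logarithm, would discard roughly half of the sources.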

Figure 21 shows the corrected variance, described above, plotted against the R variability metric from Sect. 6.1. Selection limits of R > 2, a distance from the diagonal of less than 20°, and G < 14 have been used in this plot. Not using the magnitude limit leaves many, probably spurious points in the bottom right of the plot. This indicates that even after correction, the corrected variance (or any variability proxy) has difficulty in identifying variables unless they have large values. This is purely due to the difficulty in measuring scatter with sufficient accuracy as the sources get fainter.

Fig. 21. Corrected variance versus R variability metric for sources with a distance from the diagonal of less than 20° and brighter than G = 14.

7. Conclusions

By delivering all the photometric data in this survey, here we provide the community with an early opportunity to probe the quality of the Gaia epoch photometry and see what level of variability detection can be achieved. It is possible that artefacts could be found in the data that were not identified by the processing team. In this way, the community can participate in the ongoing iterative process that is improving the general quality of the Gaia data. We note that some issues are known to the processing team but due to the time constraints of the complex DPAC processing schedule it is not always possible to address them all in time for the data release.

Also presented are alternative approaches to handling the three passbands using PCA, showing its usefulness. Again, this survey will allow the community to develop alternative approaches to this multivariate dataset.

The variance level as a function of magnitude was calibrated so that intrinsic variability metrics can be derived in order to help with the selection of variables from the mean photometry. Similarly, this survey can be used to estimate and establish the selection function of variability detection and help derive the expected number of true variables.

It is hoped that the community will use this epoch photometry to prepare for the future DR4 and DR5 data releases, where the photometric time series of all sources will be released.


2. The information about the number of blended transits comes from the main source catalogue (phot_bp_n_blended_transits and phot_rp_n_blended_transits).

Acknowledgments

Thanks to Michael Davidson, Claus Fabricius and Jordi Portell for interesting discussions on the effect of the major planets on the GAPS photometry due to higher background. (There is no effect!). Almost all the figures in this paper were generated using TOPCAT written by Mark Taylor. This work presents results from the European Space Agency (ESA) space mission Gaia. Gaia data are being processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC is provided by national institutions, in particular the institutions participating in the Gaia MultiLateral Agreement (MLA). The Gaia mission website is https://www.cosmos.esa.int/gaia. The Gaia archive website is https://archives.esac.esa.int/gaia. Funding agency acknowledgements are given in Appendix C.

References

  1. Belokurov, V., Erkal, D., Deason, A. J., et al. 2017, MNRAS, 466, 4711
  2. Carrasco, J. M., Evans, D. W., Montegriffo, P., et al. 2017, A&A, 601, C1
  3. De Angeli, F., Weiler, M., Montegriffo, P., et al. 2023, A&A, 674, A2 (Gaia DR3 SI)
  4. Distefano, E., Lanzafame, A. C., Brugaletta, E., et al. 2023, A&A, 674, A20 (Gaia DR3 SI)
  5. Evans, D. W., Riello, M., De Angeli, F., et al. 2017, A&A, 600, A51
  6. Eyer, L., Mowlavi, N., Evans, D. W., et al. 2017, ArXiv e-prints [arXiv:1702.03295]
  7. Eyer, L., Audard, M., Holl, B., et al. 2023, A&A, 674, A13 (Gaia DR3 SI)
  8. Gaia Collaboration (Prusti, T., et al.) 2016, A&A, 595, A1
  9. Gaia Collaboration (Eyer, L., et al.) 2019, A&A, 623, A110
  10. Heinze, A. N., Tonry, J. L., Denneau, L., et al. 2018, AJ, 156, 241
  11. Holl, B., Sozzetti, A., Sahlmann, J., et al. 2023, A&A, 674, A10 (Gaia DR3 SI)
  12. Jolliffe, I. 2002, Principal Component Analysis, 2nd edn. (New York: Springer-Verlag)
  13. Kendall, M., & Stuart, A. 1977, The Advanced Theory of Statistics. Vol. 1: Distribution Theory (London: Griffin)
  14. Kendall, M., & Stuart, A. 1979, The Advanced Theory of Statistics. Vol. 2: Inference and Relationship (London: Griffin)
  15. Lindegren, L., Hernández, J., Bombrun, A., et al. 2018, A&A, 616, A2
  16. Lomb, N. R. 1976, Ap&SS, 39, 447
  17. Lupton, R. H., Gunn, J. E., & Szalay, A. S. 1999, AJ, 118, 1406
  18. Mowlavi, N., Rimoldini, L., Evans, D. W., et al. 2021, A&A, 648, A44
  19. Nascimbeni, V., Piotto, G., Börner, A., et al. 2022, A&A, 658, A31
  20. Riello, M., De Angeli, F., Evans, D. W., et al. 2021, A&A, 649, A3
  21. Rimoldini, L., Holl, B., Gavras, P., et al. 2023, A&A, 674, A14 (Gaia DR3 SI)
  22. Rowell, N., Davidson, M., Lindegren, L., et al. 2021, A&A, 649, A11
  23. Scargle, J. D. 1982, ApJ, 263, 835
  24. Süveges, M., Sesar, B., Váradi, M., et al. 2012, MNRAS, 424, 2528
  25. Zechmeister, M., & Kürster, M. 2009, A&A, 496, 577

Appendix A: Downloading epoch photometry data from the Gaia DR3 archive

The epoch photometry data can be obtained using the Datalink feature of the archive. Other types of data such as BP/RP spectra can be obtained in a similar manner (see De Angeli et al. 2023, for instructions). A dedicated tutorial is available at https://www.cosmos.esa.int/web/gaia-users/archive/datalink-products#datalink_jntb_get_all_prods.

In this section we provide an example of how to download epoch photometry data using the Python programming language. It assumes that you already have a tailored list of source IDs to be extracted. Sources that have GAPS light curve data can be identified in the main Gaia DR3 catalogue with the condition in_andromeda_survey = 't'.

There is currently a limit of 5000 Datalink objects within a single query. In the following example, the input list is split into chunks of that size (or less) to overcome this restriction. A new FITS file for each chunk is created in the current folder. Pre-existing files are overwritten.
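The chunking logic described above can be sketched as follows. This is not the original listing: the helper names, the output file naming pattern, and the `fetch` callable are hypothetical. `fetch` stands in for whatever Datalink client call is used to retrieve the epoch photometry for one chunk and write it to a FITS file (for instance a wrapper around the `Gaia.load_data` method of astroquery); here it is replaced by a stub so that the example runs without network access.

```python
from typing import Callable, Iterator, List, Sequence

DATALINK_LIMIT = 5000  # maximum number of Datalink objects per query (see text)

def split_into_chunks(source_ids: Sequence[int],
                      size: int = DATALINK_LIMIT) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most `size` source IDs."""
    for start in range(0, len(source_ids), size):
        yield list(source_ids[start:start + size])

def download_all(source_ids: Sequence[int],
                 fetch: Callable[[List[int], str], None],
                 size: int = DATALINK_LIMIT) -> List[str]:
    """Call fetch(chunk, filename) once per chunk and return the file names."""
    written = []
    for index, chunk in enumerate(split_into_chunks(source_ids, size)):
        filename = f"gaps_epoch_photometry_{index:03d}.fits"  # hypothetical name
        fetch(chunk, filename)  # expected to overwrite any pre-existing file
        written.append(filename)
    return written

# Stub standing in for the real Datalink request.
requests_made = []

def fake_fetch(chunk: List[int], filename: str) -> None:
    requests_made.append((len(chunk), filename))

files = download_all(list(range(12001)), fake_fetch)
```

With 12 001 source IDs this produces three requests of 5000, 5000, and 2001 sources.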

Appendix B: Bitwise coding for other_flags field

This field contains information on the data used to compute the fluxes and their quality. It generally provides debugging information that may be safely ignored for most applications. The field is a collection of binary flags, whose values can be recovered by applying bit shifting or masking operations. Each band has different binary flags in different positions, as shown below. The bit numbering is as follows: least significant bit = 1 and most significant bit = 64.

G band:

BP band:

RP band:
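Individual flags can be recovered with the bit shifting and masking operations mentioned above. The sketch below follows the numbering convention of this appendix (least significant bit = 1, most significant bit = 64); the function names are hypothetical, and no meaning is attached to any particular bit.

```python
from typing import List

def flag_is_set(other_flags: int, bit: int) -> bool:
    """Return True if the given bit (1 = least significant) is set."""
    if not 1 <= bit <= 64:
        raise ValueError("bit must be between 1 and 64")
    return bool((other_flags >> (bit - 1)) & 1)

def set_bits(other_flags: int) -> List[int]:
    """List all bit numbers (1 to 64) that are set in other_flags."""
    return [bit for bit in range(1, 65) if flag_is_set(other_flags, bit)]

# Binary 101: bits 1 and 3 are set, bit 2 is not.
example = 0b101
bits = set_bits(example)
```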

Appendix C: Funding Agency Acknowledgements

The Gaia mission and data processing have financially been supported by, in alphabetical order by country:

Appendix D: Acronyms used in the paper

Table D.1.

_Gaia_-related and other acronyms used in this paper. The first occurrence of the acronym is noted.

All Tables

Table 1.

General description of the epoch photometry fields of the Gaia DR3 archive (for the INDIVIDUAL and COMBINED data structures).

Table 2.

Additional single transit errors for G, _G_BP, and _G_RP as a function of magnitude.


All Figures

thumbnail Fig. 1.Overall area covered by this survey in equatorial coordinates. As with many sky plots from Gaia, this is not an image but a diagram showing sources identified by Gaia. The density scale is logarithmic. Also visible are M 32 and M 110.
In the text
thumbnail Fig. 3.Distribution of the uncertainty on G in magnitudes as a function of magnitude. The line shows the median of the distribution. The density colour scale is logarithmic.
In the text
thumbnail Fig. 6.Median of the G, _G_BP, and _G_RP uncertainties in magnitudes as a function of magnitude.
In the text
thumbnail Fig. 8.Sky distribution of the sources in equatorial coordinates weighted by the number of _G_RP observations.
In the text
thumbnail Fig. 9.Sky distribution of the sources in equatorial coordinates weighted by the colour of the source.
In the text
thumbnail Fig. 10.Distribution in time (TCB days since 2010) of the epochs within the survey. The range of 1700 to 2700 TCB days since 2010 corresponds approximately to 1200 to 5200 OBMT rev which is often used in other Gaia papers. See Gaia Collaboration (2016) for an explanation of Gaia timescales.
In the text
thumbnail Fig. 11.G band _P_-value distributions for sources in the range 16 < G < 17 for three cases in which the following magnitude errors were added in quadrature to the errors provided in the archive: 0.0 (left panel), 0.0019 (central panel), 0.01975 (right panel). The middle value corresponds to the selected additional error for this magnitude range.
In the text
thumbnail Fig. 12.First three plots show the results for the error renormalisation analysis for G, _G_BP, and _G_RP respectively. The errors are all in magnitudes. In these plots, the median quoted error is shown along with the additional error from this analysis. We note that the BP and RP plots have a larger ordinate axis maximum. The final plot shows the additional error for all three passbands at the same time to aid comparison. Missing points indicate where the algorithm has failed.
Fig. 13. Comparison between epoch and mean magnitudes for various time selections as a function of magnitude. The lines correspond to the medians of the distribution. The legend on the side indicates the time corresponding to the selection (TCB days since 2010). The data for this plot are from the Following FoV and Row 1.
Fig. 15. Sky distribution of the corrected _G_BP and _G_RP flux excess factor C*, obtained from the epoch photometry.
Fig. 16. Examples of C* variations with the scan angles (in degrees). In every panel, the mean G magnitude is indicated, as well as the total number of transits (nObs) and the number of blended transits (nBlend). See the text for a detailed explanation of the individual cases.
Fig. 17. Directions of the first eigenvector (principal component) for all the sources in GAPS with more than ten transits and R > 2.0. The left plot shows the results obtained when simply scaling the input residuals by the errors, while in the right plot, the residuals are also scaled by the measured width of the distribution. The orientation of the axes is the same in both plots.
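As a rough illustration of what the direction of the first eigenvector means, the sketch below computes the principal component of error-scaled residuals in the three passbands. It assumes that residuals are taken about the plain mean magnitude in each band and uses power iteration on the 3 × 3 covariance matrix; neither choice is confirmed by the caption, the R metric is not computed, and the function name `principal_direction` and the sample data are hypothetical.

```python
import math

def principal_direction(bands, errors, n_iter=100):
    """Unit vector along the first principal component.

    bands, errors: three equal-length lists of magnitudes and quoted
    errors, ordered (G, BP, RP). Residuals are taken about the mean
    magnitude of each band (an assumption) and scaled by the errors.
    """
    n = len(bands[0])
    scaled = []
    for mags, errs in zip(bands, errors):
        mean = sum(mags) / n
        r = [(m - mean) / e for m, e in zip(mags, errs)]
        r_mean = sum(r) / n
        scaled.append([x - r_mean for x in r])  # centre each column
    # 3 x 3 covariance matrix of the scaled residuals
    cov = [[sum(a * b for a, b in zip(scaled[i], scaled[j])) / n
            for j in range(3)] for i in range(3)]
    # Power iteration, started on the diagonal. It would not converge if
    # the true eigenvector were exactly orthogonal to the diagonal.
    v = [1.0 / math.sqrt(3.0)] * 3
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    if sum(v) < 0.0:
        v = [-x for x in v]  # fix the arbitrary sign
    return v

# Hypothetical source whose three bands vary together.
g = [16.1, 15.9, 16.2, 15.8, 16.0]
bp = [m + 0.5 for m in g]
rp = [m - 0.5 for m in g]
errs = [0.01] * 5
v_corr = principal_direction([g, bp, rp], [errs, errs, errs])

# Hypothetical source in which only G varies.
flat = [17.0] * 5
v_g_only = principal_direction([g, flat, flat], [errs, errs, errs])
```

In this sketch, variability that is fully correlated with equal scaled amplitude in all three bands gives a direction along the diagonal (1, 1, 1)/√3, whereas scatter confined to a single band gives a direction along that band's axis.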
Fig. 18. Example of a variable identified using correlations between the three passbands. The photometry shown in the left plot is simply sorted according to time. The vertical lines indicate a time gap of more than 0.5 day between two successive observations, and the horizontal line is the median value. Sorting in this way is more useful than displaying the photometry purely as a function of time: because of the time sampling resulting from the scanning law of Gaia, the data points would be strongly grouped (see Fig. 10) and the correlations difficult to see. From this plot, it is clear that the photometry is correlated. The plot on the right shows a folded light curve resulting from the period found.
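The two displays described in this caption can be sketched as follows: locating the gaps of more than 0.5 day in the time-sorted sequence, and folding the epochs on a given period. This is a minimal illustration only; the function names `gap_positions` and `fold`, the reference epoch `t0`, and the sample times are hypothetical, and the period search itself is not shown.

```python
def gap_positions(times, min_gap=0.5):
    """Indices i (in time-sorted order) where the gap between observation i
    and observation i + 1 exceeds min_gap days."""
    t = sorted(times)
    return [i for i in range(len(t) - 1) if t[i + 1] - t[i] > min_gap]

def fold(times, period, t0=0.0):
    """Phase in [0, 1) of each epoch for a given period (same unit as times)."""
    return [((t - t0) / period) % 1.0 for t in times]

# Hypothetical epochs in days.
times = [10.0, 10.1, 10.3, 12.0, 12.2]
gaps = gap_positions(times)            # positions where vertical lines go
phases = fold([0.0, 1.0, 2.5, 3.0], period=2.0)
```

Plotting the magnitudes against their index in the sorted sequence, with a vertical line after each index in `gaps`, reproduces the layout of the left plot; plotting them against `phases` gives a folded light curve.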
Fig. 19. Colour–absolute magnitude diagram with an auxiliary axis in R from Sect. 6.1. Only stars with a parallax error better than 10% are used here. Rather than plotting each data point individually, the mean inside an area is shown. This is to avoid the overplotting of data points in the densest regions. The numbered regions are discussed in the text.
Fig. 21. Corrected variance versus R variability metric for sources with a distance from the diagonal of less than 20° and brighter than G = 14.