Automated quality assurance routines for fMRI data applied to a multicenter study

Tony Stöcker et al. Hum Brain Mapp. 2005 Jun.

Abstract

Standard procedures to achieve quality assessment (QA) of functional magnetic resonance imaging (fMRI) data are of great importance. A standardized and fully automated procedure for QA is presented that allows for classification of data quality and the detection of artifacts by inspecting temporal variations. The application of the procedure on phantom measurements was used to check scanner and stimulation hardware performance. In vivo imaging data were checked efficiently for artifacts within the standard fMRI post-processing procedure by realignment. Standardized and routinely carried out QA is essential for extensive data amounts as collected in fMRI, especially in multicenter studies. Furthermore, for the comparison of two different groups, it is important to ensure that data quality is approximately equal to avoid possible misinterpretations. This is shown by example, and criteria to quantify differences of data quality between two groups are defined.

Figures

Figure 1

Fast routine for automated eye removal. From the threshold mask R_T (A), its hull (B) is found by edge detection. The voxels on the hull (C), excluding the eyes, are found by neighbor search. The final mask R_M (D) is determined by intersecting the original mask R_T with the spatial integration R_H of the hull. (See Appendix for details.)

Figure 2

Presentation of the distribution‐type of fMRI data by means of the q–q plot. Data are consistent with the assumption of Gaussian‐distributed data; r_qq and σ are the correlation coefficient and the slope of the data, respectively. They define the similarity to the Gaussian distribution and its standard deviation.
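
The two q–q quantities named in this caption can be illustrated with a minimal sketch. It assumes the sorted data are paired with standard-normal quantiles at the plotting positions (i + 0.5)/n, which is one common convention and not necessarily the one used in the paper; the function name qq_stats is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def qq_stats(values):
    """Return (r_qq, slope) of a normal q-q plot for a sequence of numbers."""
    xs = sorted(values)
    n = len(xs)
    std_normal = NormalDist()
    # Assumed plotting positions (i + 0.5) / n; other conventions exist.
    qs = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    mq, mx = fmean(qs), fmean(xs)
    sqq = sum((q - mq) ** 2 for q in qs)
    sxx = sum((x - mx) ** 2 for x in xs)
    sqx = sum((q - mq) * (x - mx) for q, x in zip(qs, xs))
    r_qq = sqx / sqrt(sqq * sxx)  # Pearson correlation of the q-q points
    slope = sqx / sqq             # least-squares slope, an estimate of sigma
    return r_qq, slope
```

For data that follow a Gaussian distribution closely, r_qq approaches one and the slope approaches the standard deviation of the data.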

Figure 3

Example of the percent signal change (PSC) of an fMRI experimental run corrupted by spikes. SPM2 translational realignment parameters (A) in comparison to the PSC (B). Spikes that are visible in (B) do not necessarily occur in (A) and vice versa. Only the PSC gives valuable information about the corrupted scans. (C) PSC per slice. The marked slice is shown in (D), whereas (E) shows the same slice at the preceding time‐point and (F) shows the difference.
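
How a PSC time course can expose corrupted scans may be sketched as follows. The caption does not give the paper's exact PSC definition, so this sketch assumes the PSC of a scan (or slice) is the deviation of its mean signal from the temporal mean, in percent, and it flags spikes with a median-absolute-deviation threshold; both function names and the factor k are hypothetical.

```python
from statistics import fmean, median


def percent_signal_change(series):
    """PSC of each time point relative to the temporal mean (assumed definition)."""
    mean = fmean(series)
    return [100.0 * (v - mean) / mean for v in series]


def flag_spikes(psc, k=5.0):
    """Indices whose PSC deviates from the median by more than k times the MAD."""
    med = median(psc)
    mad = median(abs(p - med) for p in psc)
    if mad == 0:
        return []  # constant series: nothing to flag
    return [i for i, p in enumerate(psc) if abs(p - med) > k * mad]
```

Applied per slice instead of per scan, the same routine would localize a spike to a single slice, as panel (C) does.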

Figure 4

Scanner drift in an EPI time course of a phantom measurement is illustrated by the realignment parameters (A). The q–q analyses of the raw data (B) and the realigned data (C) differ significantly in their tails. Gradient heating in the readout direction causes small shifts in the spatial encoding. Without correction, the MR signal distribution has the appearance of being non‐Gaussian; this is a consequence of ignoring the shift.

Figure 5

QA of in vivo data. The hemodynamic model of the baseline condition defines the data that are taken into account. Dots mark the scans used for the q–q analysis for each subject so that BOLD‐induced signal variations do not influence the QA.

Figure 6

Percent signal change (PSC) of all phantom measurements in the multicenter study. The symbols denote the seven contributing centers. A high PSC denotes a high noise level. The PSC, however, cannot distinguish between random noise and coherent noise (artifacts). For this, the distribution type has to be considered (Fig. 7).

Figure 7

Distribution‐type estimate for all phantom measurements in the multicenter study. The difference from a Gaussian distribution is shown in (A), (B), and (C) by the Kolmogorov–Smirnov distance D_KS, the q–q correlation coefficient r_qq, and the Anderson–Darling distance D_AD, respectively. Generally, a zero KS or AD distance, as well as a q–q correlation coefficient of one, denotes that the noise in the underlying data is purely random (normally distributed) and no MR artifacts are present. Table (D) shows how these measures correlate with each other. Whereas (A) and (B) are sensitive to variations in the mean of the distribution, (C) is more sensitive to changes in the tails.
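
The two distance measures can be sketched with their textbook definitions, evaluated against a Gaussian fitted to the sample mean and standard deviation. This is an illustration under those assumptions; the paper may normalize D_KS and D_AD differently, and the function name gaussian_distances is hypothetical.

```python
from math import log
from statistics import NormalDist, fmean, stdev


def gaussian_distances(values, eps=1e-12):
    """Return (D_KS, A2) of a sample against a Gaussian fitted to it.

    Needs at least two distinct values so the fitted sigma is positive.
    """
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(fmean(xs), stdev(xs))
    # Clamp the CDF away from 0 and 1 so the logarithms below stay finite.
    cdf = [min(max(fitted.cdf(x), eps), 1.0 - eps) for x in xs]
    # Kolmogorov-Smirnov: largest gap between empirical and fitted CDF.
    d_ks = max(max((i + 1) / n - f, f - i / n) for i, f in enumerate(cdf))
    # Anderson-Darling statistic A^2, which weights the tails more heavily.
    a2 = -n - sum(
        (2 * i + 1) * (log(cdf[i]) + log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    ) / n
    return d_ks, a2
```

Both values are close to zero for Gaussian noise and grow when outliers or coherent artifacts distort the distribution.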

Figure 8

QA of fMRI data in patient studies. The q–q correlation coefficients r_qq of the patient data are plotted vs. r_qq of the control data, yielding another q–q plot (A). Its correlation coefficient r describes the consistency of the distribution type of r_qq for both groups. The parameters of a linear fit (μ and σ²) describe their deviation in mean and variance; thus, ideally, r = 1, μ = 0, and σ = 1 hold for the comparison of fMRI data quality of two groups. The same procedure is also applied to the PSC of all subjects (B). Because the PSC reflects the amount of random noise, the trend shows that subjects with lower PSC show higher activations in fMRI data analyses. For the data presented here, the influence of data quality on group comparisons is negligible because the amount of random noise (PSC) and statistical properties of the noise (r_qq) follow very similar distributions across both groups.
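
The group comparison described in this caption can be sketched as a two-sample q–q fit. The sketch assumes equal group sizes, pairs the sorted patient values with the sorted control values, and reads μ as the intercept and σ as the slope of the least-squares line; the function name compare_groups is hypothetical.

```python
from math import sqrt
from statistics import fmean


def compare_groups(controls, patients):
    """Return (r, mu, sigma) of a two-sample q-q fit, patients vs. controls."""
    if len(controls) != len(patients):
        raise ValueError("this sketch assumes groups of equal size")
    xs, ys = sorted(controls), sorted(patients)
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sigma = sxy / sxx          # slope: ratio of spreads
    mu = my - sigma * mx       # intercept: offset between the groups
    r = sxy / sqrt(sxx * syy)  # consistency of the two distributions
    return r, mu, sigma
```

The inputs can be the per-subject r_qq values or the per-subject PSC values; values near r = 1, μ = 0, and σ = 1 indicate comparable data quality in both groups.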

Figure 9

SPM2 one‐sample t‐test results (random effects) for the working memory contrast in the multicenter study. Groups of size n = 16 were analyzed; C, control subjects; P, patients; +/−, lowest/highest PSC (highest/lowest data quality), respectively. All results are thresholded at P < 0.001 (uncorrected). C+ and C− results are similar; however, the strongest C+ activation corresponds to a t‐value of 14, whereas C− does not contain t‐scores above 9.5. Furthermore, the cluster size is larger in the C+ group. The P− group has extremely low data quality, which is reflected by the low activation in the statistical maps. It is the only case that does not show any activation when thresholding at P < 0.05 with correction for multiple comparisons.

Figure 10

SPM2 two‐sample t‐tests (random effects) comparing the results of the groups depicted in Figure 9. Groups with low data quality (C−, P−) do not show more activations than do groups with high data quality (C+, P+). The contrasts (C+–P−) and (P+–C−) show significant activations at the chosen threshold, P < 0.001 (uncorrected); this activation can also partly be seen when thresholding at P < 0.05 with correction for multiple comparisons. The results clearly show that the activation patterns in group comparisons are influenced strongly by the data quality of each group. For instance, the contrasts (C+–P+) and (P+–C+) (not shown here) resemble the first and the third activation patterns shown above, but with lower activations. The strength of activations in group comparisons thus can only be interpreted if data quality is equal across both groups.
