Clinical Utility of a Plasma-Based miRNA Signature Classifier Within Computed Tomography Lung Cancer Screening: A Correlative MILD Trial Study

Abstract

Purpose

Recent screening trial results indicate that low-dose computed tomography (LDCT) reduces lung cancer mortality in high-risk patients. However, high false-positive rates, costs, and potential harms highlight the need for complementary biomarkers. The diagnostic performance of a noninvasive plasma microRNA signature classifier (MSC) was retrospectively evaluated in samples prospectively collected from smokers within the randomized Multicenter Italian Lung Detection (MILD) trial.

Patients and Methods

Plasma samples from 939 participants, including 69 patients with lung cancer and 870 disease-free individuals (n = 652, LDCT arm; n = 287, observation arm), were analyzed by using a quantitative reverse transcriptase polymerase chain reaction–based assay for MSC. Diagnostic performance of MSC was evaluated in a blinded validation study that used prespecified risk groups.

Results

The diagnostic performance of MSC for lung cancer detection was 87% for sensitivity and 81% for specificity across both arms, and 88% and 80%, respectively, in the LDCT arm. For all patients, MSC had a negative predictive value of 99% and 99.86% for detection and death as a result of disease, respectively. LDCT had sensitivity of 79% and specificity of 81% with a false-positive rate of 19.4%. Diagnostic performance of MSC was confirmed by time dependency analysis. Combination of both MSC and LDCT resulted in a five-fold reduction of LDCT false-positive rate to 3.7%. MSC risk groups were significantly associated with survival (χ²₁ = 49.53; P < .001).

Conclusion

This large validation study indicates that MSC has predictive, diagnostic, and prognostic value and could reduce the false-positive rate of LDCT, thus improving the efficacy of lung cancer screening.

INTRODUCTION

Lung cancer is the primary cause of cancer death worldwide.1 Currently, the majority of lung cancers are detected at an advanced stage in which treatments have limited efficacy and survival rates are low. Detection of lung cancer at an early stage has the possibility of significantly reducing mortality with a greater chance of cure.

European randomized lung cancer screening trials with an observational control arm but limited size have, to date, shown no mortality reductions.2-4 In contrast, the larger National Cancer Institute (NCI)–sponsored National Lung Screening Trial (NLST) reported a 20% reduction in mortality with low-dose computed tomography (LDCT) screening of high-risk individuals with a history of ≥ 30 pack-years and ≤ 15 years since quitting smoking compared with annual chest radiography.5 However, high false-positive rates, the cost of screening the large number of individuals at high risk (estimated at 3.5 million in the United States), and the potential harms associated with LDCT screening highlight the need for complementary biomarkers for standardized diagnostic use.5-7

MicroRNAs (miRNAs) are small noncoding RNAs that modulate gene activity and are aberrantly expressed in most types of cancer.8 As a result of their small size and stability, miRNAs can also be measured in biologic fluids such as plasma and serum and can serve as circulating biomarkers.9,10 Previously, we reported the development and validation of plasma-based miRNA signatures from patients in two independent LDCT screening studies demonstrating that the quantitative measurement by real-time reverse transcriptase polymerase chain reaction (RT-PCR) of 24 circulating miRNAs has diagnostic and prognostic performance.11

Here, we present results from a validation study to determine the diagnostic performance of a prespecified miRNA signature classifier (MSC) algorithm in 939 participants retrospectively evaluated by using samples prospectively collected within the randomized Multicenter Italian Lung Detection (MILD) clinical trial of LDCT versus observation.4 We demonstrated that MSC has significant diagnostic and prognostic performance, and we propose that MSC could complement LDCT screening by reducing false-positive rates.

PATIENTS AND METHODS

Study Population

The MILD trial, a randomized prospective clinical trial, was launched in 2005 at the Istituto Nazionale dei Tumori of Milan. The trial enrolled 4,099 current or former smokers, at least 50 years old and without history of cancer within the prior 5 years: 2,376 (58%) were randomly assigned to the LDCT arms (1,190, annual; 1,186, biennial LDCT) and 1,723 (42%) to the observational arm.4 At the time of enrollment (baseline) and of each annual or biennial recall of all volunteers from the trial, whole blood was collected as described,11 according to the internal review and the ethics boards of the Istituto Nazionale dei Tumori of Milan.

For this study, 1,000 consecutive plasma samples collected from June 2009 to July 2010 among lung cancer–free individuals enrolled onto the trial were used to determine the specificity of the MSC. Plasma samples were first assayed for hemolysis (see Appendix, online only) to remove samples that were potentially contaminated by RBC miRNAs.12,13

Of the 1,000 samples, 130 were not evaluable because of hemolysis. Of the remaining 870 disease-free individuals, 594 (68%) belonged to the LDCT arm and 276 (32%) to the observational arm. To obtain a cohort for determining the sensitivity performance of MSC, plasma samples from almost all patients with lung cancer diagnosed by September 2012 were obtained (n = 85). We favored measuring the negative predictive value (NPV) in a large, unselected series of patients instead of matching cases and controls, which would have greatly reduced the number of controls and the power of the study. For 69 of these 85 patients, at least one evaluable sample was collected. For all patients, we considered the sample closest to the LDCT examination that resulted in a diagnosis of cancer. Specifically, a sample at diagnosis was available for 50 patients and a predisease sample was available for 19 patients (Fig 1). The predisease samples were collected from 8 to 35 months before lung cancer detection with a median lag time of 18 months.

Fig 1.

CONSORT diagram (The Multicentric Italian Lung Detection [MILD] study, 2005 to 2012). LDCT, low-dose computed tomography.

miRNA Profiling

Total RNA was extracted from 200-μL plasma samples with the mirVana PARIS Kit (Life Technologies, Carlsbad, CA) and eluted in 50 μL of buffer. miRNA expression was determined in 3 μL of eluted RNA by using the Multiplex Pools Protocol on a custom-made microfluidics card (Life Technologies) containing the 24 miRNAs spotted in duplicate (see the Appendix). For each sample, cycle thresholds (CTs) of individual miRNAs were determined by using ViiA 7 software (Life Technologies) with a threshold of 0.15 and an automatic baseline. For input into the MSC, the average of the duplicate readings of predefined miRNA ratios was calculated as $2^{-\mathrm{CT}(\mathrm{miR}_x)}/2^{-\mathrm{CT}(\mathrm{miR}_y)}$, as previously described.11
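As a minimal sketch of the ratio calculation described above (not the authors' software), the Python snippet below computes a miRNA ratio from two cycle thresholds. The function names and CT values are hypothetical, and averaging the duplicate ratios on the linear scale before taking log2 is an assumption; the text states only that duplicate readings were averaged.

```python
import math

def mirna_ratio(ct_x, ct_y):
    """Ratio 2**-CT(miR_x) / 2**-CT(miR_y), which equals 2**(ct_y - ct_x)."""
    return 2.0 ** (-ct_x) / 2.0 ** (-ct_y)

def mean_log2_ratio(duplicates):
    """Average the ratio over duplicate (ct_x, ct_y) readings, then take log2.

    Assumption: duplicates are averaged on the linear scale.
    """
    ratios = [mirna_ratio(ct_x, ct_y) for ct_x, ct_y in duplicates]
    return math.log2(sum(ratios) / len(ratios))

# Hypothetical CT values, for illustration only.
print(mirna_ratio(20.0, 22.0))
print(mean_log2_ratio([(20.0, 22.0), (21.0, 23.0)]))
```

Because the ratio reduces to a power of two of the CT difference, its log2 value is simply CT(miRy) minus CT(miRx), which is the scale on which the cutoffs in Appendix Table A2 are expressed.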

MSC Algorithm

The MSC algorithm is a prespecified three-level MSC of low, intermediate, or high risk of disease; participants were categorized as belonging to one of these three risk groups on the basis of predefined cut points of positivity for four different expression ratio signatures of 24 miRNAs defined as risk of disease, risk of aggressive disease, presence of disease, and presence of aggressive disease.11 The miRNA ratios of the four signatures with the respective cut points were determined within two training sets of participant samples, independent of the 939-participant validation cohort, for optimization of sensitivity followed by specificity as described (see the Appendix). For the validation cohort, samples were assayed, and MSC risk scores were calculated without knowledge of clinical outcome.

Statistical Analysis

MSC risk scores blinded to clinical outcome for individual participants were submitted to an independent research center (Istituto Mario Negri of Milan), and data analysis was completed according to a prespecified statistical analysis plan. Sensitivity (SE), specificity (SP), positive predictive value (PPV), and NPV of the MSC were computed to evaluate the MSC's discriminatory performance in classifying patients diagnosed with lung cancer versus disease-free patients overall and for both the LDCT and observational arms of the study. For diagnostic performance, individuals categorized within the MSC low-risk group were compared with those within either the MSC intermediate-risk group or the MSC high-risk group. We also computed SE and SP for the combined use of binary MSC and LDCT, considering single-positive and double-positive tests.
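The four measures can be illustrated with a short Python sketch. The function name is hypothetical; the counts are taken from Table 2, with MSC high or intermediate risk treated as a positive test and MSC low risk as a negative test, as specified above. After rounding, the output agrees with the overall SE, SP, PPV, and NPV reported in the Results.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV, and NPV as fractions."""
    return {
        "SE": tp / (tp + fn),
        "SP": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }

# Counts from Table 2 (all 939 participants).
tp = 31 + 29    # lung cancer, MSC high + intermediate
fn = 9          # lung cancer, MSC low
fp = 32 + 130   # no lung cancer, MSC high + intermediate
tn = 708        # no lung cancer, MSC low

metrics = diagnostic_metrics(tp, fn, fp, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```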

To account for the time dependency of MSC as a predictor of disease development, the SE, SP, PPV, and NPV were calculated for the various time intervals from blood sample collection to lung cancer diagnosis (6, 12, 18, and 24 months) by using the methodology described by Heagerty et al.14,15

To determine the prognostic performance of MSC, all three risk groups were examined, and a survival curve was obtained as the Kaplan-Meier estimator from the date of blood sample collection according to MSC among all 939 participants.16 We estimated the heterogeneity of MSC in survival by using Cox proportional hazards models, considering models further adjusted for age and sex, and the χ²₁ test between high/intermediate and low MSC was computed.
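For readers unfamiliar with the Kaplan-Meier estimator, the following is a minimal pure-Python sketch of the product-limit calculation. It is not the analysis code used in the study, the follow-up data are invented for illustration, and the Cox model is not covered here.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time for each participant.
    events: 1 if the participant died at that time, 0 if censored.
    Returns a list of (time, survival) pairs at each time with a death.
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Group all participants who share the same follow-up time.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up in months; not data from the MILD trial.
months = [6, 12, 12, 24, 36]
died = [1, 0, 1, 0, 0]
curve = kaplan_meier(months, died)
print(curve)
```

In practice, one curve would be estimated per MSC risk group, and censored participants (those alive at last follow-up) leave the risk set without lowering the survival estimate.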

RESULTS

Patient Characteristics

Characteristics of 939 participants with evaluable plasma samples enrolled onto the MILD trial from 2005 to 2012 (LDCT, n = 652; observational, n = 287), including 69 participants with lung cancer and 870 individuals without lung cancer, are indicated according to age, sex, tobacco smoking status, smoking duration, and number of cigarettes smoked per day (Table 1). Patients with lung cancer were older than individuals without lung cancer (P < .001), and the proportion of males was higher (81.2% v 63.3%; P = .0029). Smoking status was not significantly different between participants with or without cancer, but patients who developed cancer had smoked for a longer time (P < .001).

Table 1.

Distribution of Participants According to Age, Sex, and Tobacco Smoking Status (Multicentric Italian Lung Detection [MILD] study, 2005-2012)

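| Variable | Lung Cancer (n = 69), No. (%) | No Lung Cancer (n = 870), No. (%) | P* |
| --- | --- | --- | --- |
| Age (years) | | | <.001 |
| <55 | 13 (18.8) | 377 (43.3) | |
| 55-59 | 15 (21.7) | 242 (27.8) | |
| 60-64 | 23 (33.3) | 165 (19.0) | |
| ≥ 65 | 18 (26.1) | 86 (9.9) | |
| Mean | 60.9 | 56.4 | |
| SD | 6.3 | 5.8 | |
| Range | 50-77 | 50-75 | |
| Sex | | | .0029 |
| Female | 13 (18.8) | 319 (36.7) | |
| Male | 56 (81.2) | 551 (63.3) | |
| Smoking status | | | .9 |
| Ex-smoker | 14 (20.3) | 180 (20.7) | |
| Current smoker | 55 (79.7) | 690 (79.3) | |
| Duration of smoking (years) | | | <.001 |
| <40 | 19 (27.5) | 500 (57.5) | |
| 40-49 | 41 (59.4) | 326 (37.5) | |
| ≥ 50 | 9 (13.1) | 44 (5.1) | |
| No. of cigarettes per day | | | .1201 |
| <20 | 14 (20.3) | 235 (27.0) | |
| 20-29 | 31 (44.9) | 440 (50.6) | |
| 30-39 | 11 (15.9) | 96 (11.0) | |
| ≥ 40 | 13 (18.8) | 99 (11.4) | |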

For patients with cancer, median time from random assignment to diagnosis was 29 months (range, 1 to 82 months), and median time from plasma sampling to diagnosis was 2 months (range, 0 to 35 months). In participants without lung cancer, median time from random assignment to plasma sampling was 44 months (range, 0 to 58 months), and median time from plasma sampling to last follow-up was 27 months (range, 3 to 41 months).

Diagnostic and Prognostic Performance of MSC

Evaluable plasma samples obtained before or at diagnosis from 939 participants across LDCT and observational arms were analyzed by using a real-time RT-PCR–based assay with a prespecified MSC algorithm of low, intermediate, and high risk of cancer groups. MSC risk groups were examined for all 939 participants according to lung cancer occurrence, death as a result of lung cancer, and tumor stage (Table 2). MSC intermediate and high correctly classified 60 of 69 patients with lung cancer with 87% SE, 81% SP, 27% PPV, and 99% NPV. Of the 19 patients with lung cancer who died during follow-up, 18 were positive at the MSC test, with 95% SE, 81% SP, 10% PPV, and 100% NPV. No deaths as a result of causes other than lung cancer were observed during the follow-up. Comparative diagnostic performance of MSC for lung cancer detection within the two arms was similar with 88% SE, 80% SP, 31% PPV, and 99% NPV for the LDCT arm and 82% SE, 83% SP, 16% PPV, and 99% NPV for the observational arm.

Table 2.

Distribution of Participants With and Without Lung Cancer According to Lung Cancer Prevalence, Stage, Death, and MSC, With Corresponding SE, SP, PPV, and NPV (Multicentric Italian Lung Detection [MILD] study, 2005-2012)

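| Status | Total No. of Patients | MSC High Risk, No. (%) | MSC Intermediate Risk, No. (%) | MSC Low Risk, No. (%) |
| --- | --- | --- | --- | --- |
| All participants | 939 | 63 (6.7) | 159 (16.9) | 717 (76.4) |
| No lung cancer | 870 | 32 (3.7) | 130 (14.9) | 708 (81.4) |
| Lung cancer* | 69 | 31 (44.9) | 29 (42.0) | 9 (13.0) |
| Lung cancer stage | | | | |
| I | 37 | 14 (37.8) | 19 (51.4) | 4 (10.8) |
| II to III | 12 | 5 (41.7) | 4 (33.3) | 3 (25.0) |
| IV | 19 | 11 (57.9) | 6 (31.6) | 2 (10.5) |
| Lung cancer deaths | 19 | 12 (63.2) | 6 (31.6) | 1§ (5.3) |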

Across all three MSC risk groups, a significant trend in the proportion of death as a result of disease was observed with an increasing proportion of lung cancer deaths associated with low, intermediate, and high risk, respectively (P = .0336). MSC risk groups were not significantly associated (P = .40) with various tumor stages (I, II to III, or IV; Table 2).

No significant differences were observed between MSC risk groups and histologic subtypes (χ²₁ = 1.60; P = .4485) or between adenocarcinoma and squamous cell carcinoma (χ²₁ = 0.55; P = .759).

Time dependency analysis of diagnostic performance of MSC showed similar values of SE, SP, PPV, and NPV at 6-, 12-, 18-, and 24-month intervals between blood sampling and lung cancer diagnosis (Table 3).

Table 3.

Time Dependency Analysis of MSC (Multicentric Italian Lung Detection [MILD] study, 2005-2012)

| Months From Blood Sampling to Lung Cancer Detection | SE (%) | SP (%) | PPV (%) | NPV (%) |
| --- | --- | --- | --- | --- |
| 6 | 83 | 80 | 18 | 99 |
| 12 | 86 | 81 | 22 | 99 |
| 18 | 86 | 81 | 23 | 99 |
| 24 | 87 | 81 | 25 | 99 |

Complementary Diagnostic Performance of LDCT and MSC

When the analysis was restricted to the 652 participants in the LDCT arm, LDCT identified 46 of 58 patients with lung cancer, for an SE of 79% (Table 4). It missed three patients among the 251 individuals with no pulmonary nodule detected and nine patients with an interval cancer. The three cancers with no pulmonary nodule consisted of one nonsolid lesion, one mediastinal adenopathy, and one pleural effusion. Prespecified binary risk groups of MSC (considering high and intermediate v low) identified 40 of 46 LDCT-detected cancers, eight of nine interval cancers, and all three patients with no pulmonary nodule.

Table 4.

Distribution of 939 Participants According to MSC and LDCT, by LDCT (including screen-detected and non–screen-detected lung cancers) and Observational Arms (Multicentric Italian Lung Detection [MILD] study, 2005-2012)

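| Variable | Total | No Lung Cancer: Overall | No Lung Cancer: MSC High | No Lung Cancer: MSC Intermediate | No Lung Cancer: MSC Low | Lung Cancer: Overall | Lung Cancer: MSC High | Lung Cancer: MSC Intermediate | Lung Cancer: MSC Low |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LDCT arm | 652 | 594 | 22 | 94 | 478 | 58 | 27 | 24 | 7 |
| LDCT detected | 643 | 594 | 22 | 94 | 478 | 49 | 22 | 21 | 6 |
| No nodule | 251 | 248 | 7 | 42 | 199 | 3 | 2 | 1 | 0 |
| Nodule diameter, mm | | | | | | | | | |
| ≤ 5 | 231 | 231 | 12 | 33 | 186 | 0 | 0 | 0 | 0 |
| >5 to ≤ 10 | 102 | 94 | 2 | 16 | 76 | 8 | 4 | 2 | 2 |
| >10 | 59 | 21 | 1 | 3 | 17 | 38 | 16 | 18 | 4 |
| Interval cancer | 9 | | | | | 9 | 5 | 3 | 1 |
| Observational arm | 287 | 276 | 10 | 36 | 230 | 11 | 4 | 5 | 2 |
| Total | 939 | 870 | 32 | 130 | 708 | 69 | 31 | 29 | 9 |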

LDCT had an SP of 81% for the clinically actionable subgroup of noncalcified nodules more than 5 mm and an associated false-positive rate of 19.4% (115 of 594; Table 4). When double-positive (LDCT and MSC) participants were considered, the false-positive rate decreased to 3.7% (22 of 594), with a decrease in SE (40 [69%] of 58). Considering a participant with at least one positive test (LDCT or MSC) as positive, the combined use of LDCT and MSC identified 57 of 58 patients, with an SE of 98% and an SP of 65%. Conversely, MSC detected nine (82%) of 11 lung cancers that occurred in the observational arm (Table 4).
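The single-positive and double-positive figures above can be rederived from Table 4 with a brief Python sketch. The cross-tabulated counts below are derived from that table under three stated assumptions: LDCT is positive for a noncalcified nodule larger than 5 mm, MSC is positive for high or intermediate risk, and interval cancers and cancers with no nodule count as LDCT negative. Variable and function names are hypothetical.

```python
# LDCT-arm counts keyed by (ldct_positive, msc_positive), derived from Table 4.
no_cancer = {(True, True): 22, (True, False): 93,
             (False, True): 94, (False, False): 385}
cancer = {(True, True): 40, (True, False): 6,
          (False, True): 11, (False, False): 1}

def rate(counts, rule):
    """Fraction of participants in `counts` called positive by `rule`."""
    total = sum(counts.values())
    positive = sum(n for (ldct, msc), n in counts.items() if rule(ldct, msc))
    return positive / total

rules = {
    "LDCT alone": lambda ldct, msc: ldct,
    "double positive": lambda ldct, msc: ldct and msc,
    "either positive": lambda ldct, msc: ldct or msc,
}

results = {}
for name, rule in rules.items():
    se = rate(cancer, rule)       # sensitivity among 58 cancers
    fpr = rate(no_cancer, rule)   # false-positive rate among 594 controls
    results[name] = (se, fpr)
    print(f"{name}: SE {se:.1%}, false-positive rate {fpr:.1%}")
```

The sketch makes the trade-off explicit: requiring both tests to be positive lowers the false-positive rate at the cost of sensitivity, whereas accepting either test raises sensitivity at the cost of specificity.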

Association of MSC Risk Groups With Survival

Analysis was completed to determine the prognostic performance of the three predefined MSC risk groups to predict overall survival from plasma samples collected for all participants with 3-year follow-up (N = 939). Two-year survival was 100% for participants with low MSC, 98% for intermediate MSC, and 87% for high MSC, whereas 3-year survival was 100% for low MSC, 97% for intermediate MSC, and 77% for high MSC (Fig 2). The difference in survival between high/intermediate and low MSC was statistically significant (χ²₁ = 49.53; P < .001). The heterogeneity was still significant after adjustment for age and sex (χ²₁ = 12.57; P < .001).

Fig 2.

Three-year survival from date of blood sample collection according to microRNA (miRNA) signature classifier among all patients (Multicentric Italian Lung Detection [MILD] study, 2005 to 2012). (*) No death occurred due to other causes in lung cancer–free participants.

DISCUSSION

MSC had satisfactory diagnostic performance for early detection of lung cancer within this large validation study of plasma samples prospectively collected from 939 participants enrolled onto the randomized MILD screening trial.

The diagnostic characteristic of high SE coupled with an NPV of 99% indicates that MSC is a clinically useful screening test. Moreover, the diagnostic performance of MSC as a predictor of lung cancer development was confirmed by the time dependency analysis.

MSC identified patients with a high likelihood of death as a result of disease. As a binary diagnostic, MSC had an SE of 95% and an NPV of 100% for death as a result of disease for 939 participants across both arms. Furthermore, the MSC risk groups were associated with significantly different survival at 3 years for the entire cohort of 939 participants (100% for low MSC, 97% for intermediate MSC, and 77% for high MSC). These findings indicate that plasma miRNAs identify both malignancy and aggressiveness of the tumor.

To date, three small European randomized trials, including MILD, have reported nonsignificant mortality reductions.2-4 The NLST, a randomized clinical screening trial enrolling 53,454 participants with three rounds of annual LDCT screening versus chest radiographs, demonstrated a 20% reduction of lung cancer mortality.5 After three rounds of screening, 24.2% of participants were classified as positive, with 96.4% of these being false positive, and 320 participants needed to be screened to prevent one lung cancer death. In a systematic review of all randomized clinical trials that examined the benefits and harms of LDCT screening, the average nodule detection rate was 20%, with 90% of nodules being benign.17

In this study, MSC classified 74% (485) as low risk among the 652 individuals in the LDCT arm: 478 were true negative and seven were false negative. Thus, if MSC were used alone as a screening test, seven lung cancers would have been missed. However, none of these individuals died at 3 years after the miRNA test examination. Conversely, MSC was able to identify eight of nine interval cancers undetected by LDCT. Thus, integration of MSC and LDCT (at least one positive test) would raise screening sensitivity from 87% for MSC and 84% for LDCT alone, to 98% (57 of 58), with a false-positive rate of 35%. Conversely, considering double-positive participants (MSC and LDCT positive), the integration of MSC with LDCT would reduce the false-positive rate of LDCT screening more than five-fold. Specifically, the frequency of double false-positive MSC and LDCT was only 3.7% (22 of 594) compared with 19.4% (115 of 594) for LDCT alone, while reducing the SE to 69%. Consequently, MSC could complement LDCT screening by reducing false-positive results and enable greater standardization of diagnostic algorithms, thereby decreasing health care costs. Moreover, further repetitions of MSC rather than LDCT could be proposed for individuals with a low MSC, given the absence of mortality at 3 years for participants with low MSCs.

Health care costs of LDCT screening and associated follow-up procedures are significant. A recent budget impact model considering LDCT as it is widely adopted in the United States indicated that LDCT screening would avoid up to 8,100 premature lung cancer deaths at a 75% screening rate with an additional screening cost of $240,000 to avoid one lung cancer death.7 By reducing downstream costs, complementing LDCT screening with a noninvasive biomarker test might increase the number of individuals enrolled in LDCT screening.18

Blood-based biomarkers for early detection of lung cancer have been previously reported.19 A commercially available serum test that examines increased autoantibodies of six tumor-related antigens has been validated in patients with non–small-cell lung cancer with 31% SE and 84% SP.20 Notably, a serum-based assay of a 34-miRNA signature was discovered and validated in sample sets from the Continuous Observation of Smoking Subjects (COSMOS) LDCT screening trial with 80% accuracy for lung cancer detection in high-risk asymptomatic participants.21 This study further validates the use of blood-based miRNA biomarkers for detection of lung cancer.

Stable circulating miRNAs can be quantitatively measured in serum and plasma. Circulating miRNAs are packaged in exosomes or associated with ribonucleoprotein complexes, specifically argonaute 2, the effector component of the miRNA silencing complex that regulates messenger RNA repression in cells.22-24 It has been hypothesized that circulating miRNA silencing complexes are not only biomarkers that indicate tumor load but also, through paracrine action, may regulate recipient cells and be associated with cancer pathogenesis.19,25

In this study, the prognostic and diagnostic performance of MSC, independent of tumor stage, suggests that the miRNAs measured within MSC are not simply an output of tumor load, but rather signals of pathogenesis related to tumor aggressiveness.

Fundamental to the development of MSC was a nonbiased computational approach of screening 4,950 ratios of 100 different plasma miRNAs for selection of the optimal set of miRNA ratios for lung cancer detection and association with poor prognosis.11 These miRNA ratios may reflect regulation between competing mechanisms of miRNA regulation of messenger RNAs within different cellular components of the tumor and the surrounding microenvironment. It is unlikely that these miRNA ratios are reflective only of different blood cell types, given that there is no significant correlation between miRNAs highly expressed by these cell types and respective complete blood count values. An intriguing scenario could be that stromal cells activated by the inflamed lung microenvironment release specific miRNAs into the circulation that could be functionally engaged in the regulation of target genes associated with neoplastic transformation.

We report in this large validation study from the MILD trial the use of a robust qRT-PCR assay of plasma-derived miRNA signatures (MSCs) that has diagnostic performance for malignant disease presence, risk of future malignancy, and the ability to distinguish lung cancers from the large majority of benign LDCT-detected pulmonary nodules. In a synergistic approach, MSCs could improve the effectiveness of LDCT for lung cancer screening by avoiding further rounds of LDCT in a large proportion of individuals and unnecessary invasive diagnostic follow-up.

A limitation of this study is that MSC validation and subsequent conclusions are based on the analysis of a single randomized screening trial, and generalizability of the reported findings should be assessed by correlative studies of other prospective randomized LDCT screening trials. However, this blinded study of a prespecified plasma-based assay represents the largest study testing a biomarker within an LDCT screening trial and indicates the possibility of superior diagnostic performance of LDCT if combined with MSC.

Acknowledgment

We acknowledge Monica Tortoreto, Angela Pettinicchio, Elisabetta Corna, Elena Vaghi, Maida De Bortoli, Federica Facchinetti, and Patrizia Gasparini for collection of blood samples in the Multicentric Italian Lung Detection (MILD) trial. We thank Antonella Zambon and Andrea Arfé for biostatistical support on time-dependence analysis. We are grateful to Elena Bertocchi, Carolina Ninni, and Annamaria Calanca for handling volunteers in the trial and for administrative and editorial assistance.

Appendix

Custom-Made Microfluidic Cards

For plasma samples, microRNA (miRNA) profiling used a custom-made microfluidics card (Life Technologies) containing the 24 miRNAs that composed the ratios that were found differentially expressed in the training set: hsa-miR-101-2253, hsa-miR-106a-2169, hsa-miR-126-2228, hsa-miR-133a-2246, hsa-miR-140-3p-2234, hsa-miR-140-5p-1187, hsa-miR-142-3p-464, hsa-miR-145-2278, hsa-miR-148a-470, hsa-miR-15b-390, hsa-miR-16-391, hsa-miR-17-2308, hsa-miR-197-497, hsa-miR-19b-396, hsa-miR-21-397, hsa-miR-221-524, hsa-miR-28-3p-2446, hsa-miR-30b-602, hsa-miR-30c-419, hsa-miR-320-2277, hsa-miR-451-1141, hsa-miR-486-5p-1278, hsa-miR-660-1515, and hsa-miR-92a-431. Eight plasma samples were simultaneously analyzed in each card in duplicate.

Evaluation of Hemolysis Affecting Plasma Samples

Plasma samples with the presence of hemolysis were removed from subsequent analyses because of the known release of contaminating miRNAs by hemolysis of blood cells such as RBCs or platelets.12,13 Two quality control (QC) measurements were used for this evaluation. First, in a preanalytic step before RNA extraction, a spectrophotometric analysis was completed by measuring the absorbance at different wavelengths (414, 541, and 576 nm) to identify the presence and amount of free hemoglobin in the sample, as described by Kirschner et al.12 A second QC step was implemented to obtain even greater sensitivity to hemolysis by analyzing expression levels of hemolysis-related miRNAs contained within the miRNA signature classifier (MSC; mir-451, mir-486-5p, mir-16, and mir-92a) for all samples. Plasma samples with expression levels of hemolysis-related miRNAs that exceeded two standard deviations from the overall mean of all samples were excluded from subsequent analyses. Samples that were excluded by this second QC step also included all samples with detectable spectrophotometrically measured hemolysis. No difference in the frequency or in the amount of hemolysis as measured either spectrophotometrically or by hemolysis-related miRNA analysis was observed in the cancer versus control samples.
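The second QC step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually coded for the study: the function name and the expression values are hypothetical, the exclusion rule is taken to be one-sided (levels above the mean plus two standard deviations), and the sample standard deviation is used.

```python
import statistics

def flag_hemolyzed(levels, n_sd=2.0):
    """Return indices of samples whose level exceeds mean + n_sd * SD.

    levels: expression level of one hemolysis-related miRNA per sample.
    Assumptions: one-sided rule and sample (n - 1) standard deviation.
    """
    mean = statistics.mean(levels)
    sd = statistics.stdev(levels)
    cutoff = mean + n_sd * sd
    return [i for i, level in enumerate(levels) if level > cutoff]

# Hypothetical expression levels for ten samples; the last one is an outlier.
levels = [1.0] * 9 + [10.0]
print(flag_hemolyzed(levels))
```

In a real workflow the rule would presumably be applied to each of the four hemolysis-related miRNAs (mir-451, mir-486-5p, mir-16, and mir-92a), with a sample excluded according to whatever combination rule the laboratory prespecified.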

In addition to the potential of contaminating miRNAs released by hemolysis, it has recently been described that miRNAs differentially expressed in plasma could simply reflect different blood cell counts.13 To test this hypothesis, miRNA ratios composed of neutrophil-expressed and RBC-expressed miRNAs (according to Pritchard et al13) were compared with ratios between the neutrophil and RBC counts obtained by complete blood count (CBC) from 23 patients with lung cancer in this study. As reported in Appendix Table A1 (online only), none of the miRNA ratios present in our signatures were found to have a significant correlation with respective CBC ratios.

MSC Algorithm

The MSC algorithm is a prespecified three-level MSC of low, intermediate, or high risk of disease with participant categorization to one of these three risk groups on the basis of predefined cut points of positivity for four different expression ratio signatures of 24 miRNAs as defined: risk of disease (RD), risk of aggressive disease (RAD), presence of disease (PD), and presence of aggressive disease (PAD). For the development of this algorithm, different gene expression ratios of 24 different miRNAs were generated starting from 4,950 ratios of 100 different miRNAs stably circulating in plasma in a training set of samples from patients with lung cancer prospectively collected before or at diagnosis from the Istituto Nazionale dei Tumori/Istituto Europeo di Oncologia lung cancer screening trial as previously described.11 The development of the prespecified MSC algorithm used in this study was refined from the algorithm previously described.

First, we removed samples with detectable hemolysis from the original training set belonging to an observational pilot low-dose computed tomography (LDCT) trial (Pastorino et al: Lancet 362:593-597, 2003) and described in Boeri et al.11 Subsequent to our publication (Boeri et al11), investigators reported that hemolysis affected accurate expression measurements of miRNAs in plasma and serum samples. Thus, to generate an optimal training set for MSC, five samples within the original training set were excluded because of detectable hemoglobin by spectrophotometric analysis and increase in hemolysis-related miRNA levels (as previously described). So, miRNA ratios were used in the optimal training set to develop the miRNA signatures of RD, PD, RAD, and PAD (Appendix Table A2, online only).

Second, we have generated predefined ratio cutoff values to obtain ≥ 80% specificity by using plasma samples from a training set of 84 disease-free individuals from the MILD trial that were not included in the 939-participant validation cohort. The use of plasma samples from single participants, instead of pools (used in Boeri et al11), allowed us to consider all 24 miRNAs originally identified in the training set and to generate accurate cutoffs. Therefore, we could include mir-101, mir-145, and mir-133a in this study, which were excluded in the former validation set because of a high variability among the individuals within the control pools used in that study, as disclosed on page 3715 of Boeri et al.11

To build a three-level risk categorization for disease (MSC low, intermediate, and high), the training set was also used to establish the minimum number of ratios exceeding the respective cutoff value needed for a signature to be considered positive: 10/27 for RD, 9/27 for PD, 14/28 for RAD, and 14/28 for PAD. The three-level MSC was then defined as follows: low risk (L) if RDneg ∩ PDneg ∩ RADneg ∩ PADneg; intermediate risk (I) if (RDpos ∪ PDpos) ∩ RADneg ∩ PADneg; or high risk (H) if RADpos ∪ PADpos. These prespecified risk groups were then used to test diagnostic and prognostic performance within an independent set of 939 participants from the MILD screening trial.
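The categorization rule above can be expressed as a short Python sketch. It is an illustration, not the authors' implementation: function and variable names are hypothetical, and the cutoff dictionary is assumed to hold, for each ratio label, the direction and log2 cutoff listed in Appendix Table A2.

```python
import operator

# Minimum number of ratios beyond their cutoff for a signature to be positive.
MIN_HITS = {"RD": 10, "PD": 9, "RAD": 14, "PAD": 14}
OPS = {">": operator.gt, "<": operator.lt}

def signature_positive(name, log2_ratios, cutoffs):
    """Decide whether one signature (RD, PD, RAD, or PAD) is positive.

    log2_ratios: maps a ratio label (e.g. "197/660") to its measured value.
    cutoffs: maps the same labels to (direction, cutoff) pairs.
    """
    hits = sum(
        OPS[direction](log2_ratios[label], cutoff)
        for label, (direction, cutoff) in cutoffs.items()
    )
    return hits >= MIN_HITS[name]

def msc_risk(rd, pd, rad, pad):
    """Combine the four signature calls into low, intermediate, or high."""
    if rad or pad:
        return "high"
    if rd or pd:
        return "intermediate"
    return "low"

print(msc_risk(rd=True, pd=False, rad=False, pad=False))
```

Checking the aggressive-disease signatures first means that a participant positive for RAD or PAD is classified as high risk regardless of the RD and PD calls, which matches the set definitions given above.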

Table A1.

Plasma miRNA Ratios Correlation to Blood Cell Counts in 23 Patients With Lung Cancer

| Neutrophil miRNA/RBC miRNA Ratio* | Pearson Correlation With Neutrophil and RBC Counts |
| --- | --- |
| 197/16 | .34 |
| 197/486-5p | .20 |
| 197/451 | .15 |
| 197/92a | −.07 |
| 142-3p/16 | .16 |
| 142-3p/486-5p | .01 |
| 142-3p/451 | .07 |
| 142-3p/92a | −.10 |
| 140-5p/16 | .27 |
| 140-5p/486-5p | .15 |
| 140-5p/451 | .13 |
| 140-5p/92a | −.11 |
| 17/16 | .19 |
| 17/486-5p | .11 |
| 17/451 | .10 |
| 17/92a | −.19 |
| 21/16 | .43 |
| 21/486-5p | .14 |
| 21/451 | .23 |
| 21/92a | .06 |

Table A2.

Refined miRNA Ratios' Signatures and Corresponding Cutoff Values (log2)

| RD: Ratio and Cutoff | RAD: Ratio and Cutoff | PD: Ratio and Cutoff | PAD: Ratio and Cutoff |
| --- | --- | --- | --- |
| 197/660 > 4.30 | 197/451 > −1.75 | 106a/142-3p > 2.02 | 197/486-5p > −1.39 |
| 17/660 > 9.26 | 28-3p/451 > −2.41 | 106a/140-5p > 5.50 | 197/451 > −1.75 |
| 28-3p/660 > 3.36 | 320/451 > 1.00 | 106a/660 > 9.32 | 197/660 > 4.30 |
| 133a/660 > 0.59 | 126/451 > 5.85 | 106a/92a > 3.97 | 17/486-5p > 3.75 |
| 106a/660 > 9.32 | 197/92a > −1.28 | 142-3p/17 < −1.95 | 17/451 > 3.33 |
| 197/451 > −1.75 | 28-3p/92a > −1.81 | 140-5p/17 < −5.58 | 17/660 > 9.26 |
| 17/451 > 3.33 | 320/92a > 1.65 | 17/660 > 9.26 | 106a/486-5p > 3.75 |
| 28-3p/451 > −2.41 | 126/92a > 3.39 | 17/92a > 3.81 | 106a/451 > 3.33 |
| 133a/451 > −5.49 | 142-3p/197 < 3.25 | 142-3p/197 < 3.25 | 106a/660 > 9.32 |
| 19b/660 > 8.38 | 142-3p/28-3p < 3.75 | 140-5p/197 < −0.34 | 126/486-5p > 2.55 |
| 197/19b > −4.12 | 126/142-3p > 0.81 | 197/660 > 4.30 | 126/451 > 2.85 |
| 142-3p/15b < 2.88 | 19b/451 > 2.85 | 142-3p/28-3p < 3.75 | 126/660 > 8.55 |
| 15b/660 > 5.00 | 197/660 > 4.30 | 140-5p/28-3p < 0.26 | 16/197 < 5.00 |
| 320/660 > 6.77 | 197/30c > −1.41 | 28-3p/660 > 3.36 | 140-5p/197 < −0.34 |
| 126/660 > 8.55 | 197/21 > −0.40 | 126/142-3p > 0.81 | 197/92a > −1.28 |
| 140-3p/660 > −0.21 | 17/451 > 3.33 | 126/140-5p > 4.76 | 197/30b > −3.24 |
| 16/197 < 5.00 | 106a/451 > 3.33 | 126/660 > 8.55 | 197/30c > −1.41 |
| 197/92a > −1.28 | 197/30b > −3.24 | 142-3p/145 < 3.62 | 19b/660 > 8.38 |
| 17/92a > 3.81 | 106a/142-3p > 2.02 | 320/660 > 6.77 | 28-3p/486-5p > −2.40 |
| 133a/92a > −4.18 | 142-3p/17 < −1.95 | 142-3p/15b < 2.88 | 28-3p/451 > −2.41 |
| 101/140-3p < −0.20 | 21/28-3p < 0.70 | 19b/660 > 8.38 | 16/17 < −0.50 |
| 15b/30c > −0.67 | 126/21 > 3.98 | 142-3p/148a < 6.21 | 106a/16 > 0.55 |
| 106a/92a > 3.97 | 197/19b > −4.12 | 197/92a > −1.28 | 19b/486-5p > 2.85 |
| 15b/30b > −2.47 | 28-3p/660 > 3.36 | 142-3p/30b < −0.40 | 19b/451 > 2.85 |
| 15b/21 > 0.36 | 21/221 < −1.35 | 142-3p/21 < 2.62 | 320/486-5p > 1.40 |
| 106a/451 > 3.33 | 145/197 < −1.10 | 142-3p/221 < 1.69 | 320/451 > 1.00 |
| 15b/451 > −0.90 | 28-3p/30c > −1.99 | 133a/142-3p > −7.20 | 320/660 > 6.77 |
| | 28-3p/30b > −3.60 | | 16/320 < 1.71 |

Footnotes

See accompanying editorial on page 725

Supported by Investigator Grants No. 10096, 1227, 11991, 10068, and 12162 (Special Program “Innovative Tools for Cancer Risk Assessment and early Diagnosis,” 5x1000) from the Italian Association for Cancer Research, Grant No. RF-2010 from the Italian Ministry of Health, Grant EDRN UO1 CA166905 from the National Cancer Institute, and by Gensignia.

G.S. and M.B. contributed equally to this study. C.L.V. and U.P. had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.

AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST

Although all authors completed the disclosure declaration, the following author(s) and/or an author's immediate family member(s) indicated a financial or other interest that is relevant to the subject matter under consideration in this article. Certain relationships marked with a “U” are those for which no compensation was received; those relationships marked with a “C” were compensated. For a detailed description of the disclosure categories, or for more information about ASCO's conflict of interest policy, please refer to the Author Disclosure Declaration and the Disclosures of Potential Conflicts of Interest section in Information for Contributors.

Employment or Leadership Position: None. Consultant or Advisory Role: None. Stock Ownership: None. Honoraria: None. Research Funding: None. Expert Testimony: None. Patents, Licenses, and Royalties: Gabriella Sozzi, Mattia Boeri, and Ugo Pastorino are coinventors for two patent applications regarding the miRNA signature disclosed in this article. Other Remuneration: None.

AUTHOR CONTRIBUTIONS

Conception and design: Gabriella Sozzi, Mattia Boeri, Alfonso Marchiano, Ugo Pastorino

Collection and assembly of data: Paola Suatoni, Davide Conte, Nicola Sverzellati

Data analysis and interpretation: Gabriella Sozzi, Mattia Boeri, Marta Rossi, Francesca Bravi, Davide Conte, Eva Negri, Carlo La Vecchia, Ugo Pastorino

Manuscript writing: All authors

Final approval of manuscript: All authors
