p16INK4a immunocytochemistry versus HPV testing for triage of women with minor cytological abnormalities: A systematic review and meta-analysis

Author manuscript; available in PMC: 2014 Oct 15.

Published in final edited form as: Cancer Cytopathol. 2012 Jun 14;120(5):294–307. doi: 10.1002/cncy.21205

Abstract

Background

The best method to identify women with minor cervical lesions that require diagnostic work-up remains unclear. We performed a meta-analysis to assess the accuracy of p16INK4a immunocytochemistry compared to hrHPV DNA testing with hybrid capture II (HC2) to detect cervical intraepithelial neoplasia (CIN2+ and CIN3+) in women with a cervical cytology showing atypical squamous cells of undetermined significance (ASC-US) or low-grade cervical lesions (LSIL).

Methods

A literature search was performed in three electronic databases to identify studies eligible for this meta-analysis.

Results

Seventeen studies were included in the meta-analysis. The pooled sensitivity of p16INK4a to detect CIN2+ was 83.2% (95%CI: 76.8–88.2%) and 83.8% (95%CI: 73.5–90.6%) in ASC-US and LSIL cervical cytology respectively; pooled specificities were 71.0% (95%CI: 65.0–76.4%) and 65.7% (95%CI: 54.2–75.6%). Eight studies provided both HC2 and p16INK4a triage data. p16INK4a and HC2 have a similar sensitivity and p16INK4a has significantly higher specificity in the triage of women with ASC-US (relative sensitivity: 0.95 (95%CI: 0.89–1.01); relative specificity: 1.82 (95%CI: 1.57–2.12)). In the triage of LSIL, p16INK4a has a significantly lower sensitivity but higher specificity compared to HC2 (relative sensitivity: 0.87 (95%CI: 0.81–0.94); relative specificity: 2.74 (1.99–3.76)).

Conclusion

The published literature indicates an improved accuracy of p16INK4a compared to HC2 testing in the triage of ASC-US. In LSIL triage p16INK4a is more specific but less sensitive.

Keywords: cervical cancer, cervical intraepithelial neoplasia, ASCUS, LSIL, triage, p16INK4a, cyto-immunochemistry, HPV testing, diagnostic accuracy, systematic review, meta-analysis

Introduction

Cervical cancer is the third most common cancer in women worldwide. It is estimated that, in 2008, approximately 530,000 women developed cervical cancer and 275,000 died from the disease1. Well-organized screening for, and management of, precancerous lesions could reduce the incidence of cervical cancer2. Women with high-grade cervical abnormalities should be referred immediately for colposcopy or even treatment. However, the optimal management of women with atypical squamous cells of undetermined significance (ASC-US) or low-grade squamous intraepithelial lesions (LSIL) remains elusive and continues to be the object of intensive research.

Testing for carcinogenic HPV DNA has been proposed as a triage method to identify women at increased risk of cervical cancer precursors and cervical cancer. Numerous clinical studies, most prominently the ALTS study3, and a meta-analysis4 indicated that the hybrid capture II assay is more accurate (higher sensitivity, similar specificity) than repeat Pap testing for detecting CIN2+ in women with ASC-US cytology. However, for LSIL, the possible advantages of HPV triage remain unclear5. LSIL is the morphological correlate of a productive HPV infection6. Therefore, HPV-DNA testing nearly always yields positive results and cannot provide additional risk stratification to distinguish between women with and without underlying or developing high-grade lesions7.

Much research is devoted to developing objective biomarkers that can distinguish transforming from productive HPV infections and predict disease severity. The cellular tumor suppressor protein p16INK4a has been identified as a biomarker for transforming HPV infections. It is a cyclin-dependent kinase inhibitor that decelerates the cell cycle by inactivating the cyclin-dependent kinases (CDK4/6) involved in the phosphorylation of the retinoblastoma protein (pRb)8. In the presence of the HR-HPV oncogene E7, p16INK4a transcription is induced by the histone demethylase KDM6B9 and not by a pRb feedback mechanism as previously assumed10;11. As a result, p16INK4a protein accumulates in the cell, which can be considered a surrogate marker of a transforming infection.

Recently, an immunocytochemical dual-staining protocol that simultaneously detects p16INK4a and Ki-67 expression has been established. The simultaneous detection of p16INK4a overexpression and the proliferation marker Ki-67 within the same cervical epithelial cell indicates deregulation of the cell cycle and does not require morphology-based interpretation12.

A previous meta-analysis demonstrated the correlation between the frequency of p16INK4a overexpression and the severity of preneoplastic cervical lesions in cellular and tissue specimens13. No hypotheses regarding clinical applications of p16INK4a immunostaining were addressed in that systematic review13. Establishing a correlation between p16INK4a expression and the severity of cancer precursors is a first step in generating evidence for potential clinical applications in screening for cervical cancer or in the management of screen-positive women14. We therefore conducted a meta-analysis to explore the performance of p16INK4a immunocytochemistry in the triage of women with minor cytological cervical lesions.

Material and methods

PICOS question

Prior to literature search, a clinical question and corresponding PICOS were defined (Population – Index test – Comparator test – Outcomes – Studies):

Can p16INK4a be used to identify women with minor cytological abnormalities who need referral to colposcopy? Is it better than repeat cytology, HPV testing (HC2, other HPV assays) or other biomarkers? In other words: is p16INK4a immunocytochemistry a good triage test to manage women with ASC-US or LSIL?

Search strategy

Three electronic databases were searched – PubMed-MedLine, Embase and CENTRAL. The following search string was used in PubMed-MedLine: (cervix OR cervical OR vaginal) AND (cancer OR carcinoma OR dysplas* OR neoplasm* OR CIN OR SIL OR “pap smear” or cytology) AND (p16* OR p16INK4a OR protein p16 OR p16 protein). No language or publication date restrictions were applied.

The references of the retrieved articles were hand-searched in order to identify other eligible studies. Eligibility was verified against the inclusion and exclusion criteria independently by two investigators (JR and MR). When no consensus could be reached, a third investigator was involved (MA). Extraction of the data was done by JR and checked by MA.

Inclusion and exclusion of studies

We included all studies that assessed p16INK4a immunostaining or p16INK4a/Ki-67 dual staining, with or without hybrid capture 2 (HC2) testing as comparator test, on liquid based cytology (LBC) or conventional cytology (CC) specimens showing ASC-US or LSIL cytology, and in which the diagnosis was verified with a reference standard. Studies were excluded if the population contained fewer than 20 women with ASC-US or LSIL cytology. If the data were not separated according to ASC-US or LSIL cytology, separate data were requested from the authors. When the authors did not respond, the studies were excluded. When duplicate publications of the same study were found, the most comprehensive one was included.

Participants

Two groups of participants were considered: women with equivocal cervical lesions or ASC-US (triage group I) and women with low-grade cytological lesions or LSIL (triage group II).

For the first group we considered women with atypical squamous cells of undetermined significance (ASCUS) as defined in the 1988 version of The Bethesda System (TBS)15. For studies using the TBS-2001 criteria, only the data for ASC-US cases were extracted. Studies reporting data exclusively on atypical squamous cells-favor reactive (ASC-R) or atypical squamous cells-cannot exclude high-grade squamous intraepithelial lesion (ASC-H) or atypical glandular cells (AGC) were excluded. For this meta-analysis only one term “ASC-US” was used for both versions of the Bethesda System.

For the second group we considered women with low-grade squamous intraepithelial lesions (LSIL). Studies that used the terminology of the British Society of Clinical Cytology (BSCC)16 were translated into TBS-1988. The BSCC terms borderline and mild dyskaryosis were considered as similar to ASCUS and LSIL respectively17.

Types of outcome measures

Outcome measures were defined prior to the literature search. The primary outcome was the absolute sensitivity and specificity of p16INK4a immunocytochemistry to detect underlying disease (CIN2+ or CIN3+/AIS) in the triage of women with equivocal or low-grade cytological abnormalities. The secondary outcome was the relative sensitivity and specificity of p16INK4a immunostaining versus hrHPV testing in studies with comparator testing.
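The two outcome measures can be illustrated with a minimal Python sketch. It assumes each test is summarized by a 2x2 table against the reference standard; the class name `TwoByTwo` and all counts below are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cross-classification of a triage test against the reference standard."""
    tp: int  # test positive, disease (e.g. CIN2+) present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


# Hypothetical counts for the same 100 women tested with both assays.
p16 = TwoByTwo(tp=17, fp=24, fn=3, tn=56)
hc2 = TwoByTwo(tp=18, fp=48, fn=2, tn=32)

# Relative accuracy: ratio of the index test to the comparator test.
rel_sens = p16.sensitivity / hc2.sensitivity
rel_spec = p16.specificity / hc2.specificity

print(f"p16INK4a: sensitivity={p16.sensitivity:.2f}, specificity={p16.specificity:.2f}")
print(f"HC2:      sensitivity={hc2.sensitivity:.2f}, specificity={hc2.specificity:.2f}")
print(f"relative sensitivity={rel_sens:.2f}, relative specificity={rel_spec:.2f}")
```

A ratio close to 1 means the two tests perform similarly on that parameter; a ratio above 1 favors p16INK4a. This sketch shows only the per-study arithmetic, not the pooled meta-analytic estimates.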

Reference standard

We considered the following categories of reference standards:

  1. Colposcopy and LLETZ or conization on all women
  2. Colposcopy, punch biopsies of colposcopically suspicious areas and random biopsies of colposcopic normal zones on all women
  3. Colposcopy and more than 1 biopsy on all women (type of biopsy unknown)
  4. Colposcopy and one or more biopsies of colposcopic suspected zone. Women are considered free of CIN2+ if colposcopy is negative
  5. Colposcopy and/or biopsy on all women (no further information)
  6. Retrospective collection of biopsy/histology data

Data extraction and statistical analyses

Study characteristics and covariates that could influence study outcomes were tabulated: the primary p16INK4a antibody used, the reference standard and the positivity criterion for p16INK4a. The QUADAS checklist for evaluating the quality of diagnostic test studies was used to assess the quality of the studies18. The most important quality items reviewed in the QUADAS checklist are the acceptability of the reference standard, the delay between tests, blinding of results, incorporation bias and verification bias18.

The absolute accuracy of p16INK4a immunocytochemistry and hrHPV testing was pooled using the Stata-10 procedure metandi (Stata Corp., College Station, Texas, US). This procedure fits a two-level mixed logistic regression model, with independent binomial distributions for the true positives and true negatives conditional on the sensitivity and specificity in each study, and a bivariate normal model for the logit transforms of sensitivity and specificity between studies19;20.
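The model described above can be written out as follows. This is the standard form of the bivariate model as it is usually presented; the symbols (study index i, counts n_{1i} and n_{0i}, means \mu and covariance matrix \Sigma) are notation introduced here for illustration and do not appear in the original text.

```latex
% Level 1: within-study binomial variation
% n_{1i}: women with disease in study i; n_{0i}: women without disease
\mathrm{TP}_i \sim \mathrm{Binomial}\!\left(n_{1i},\, Se_i\right), \qquad
\mathrm{TN}_i \sim \mathrm{Binomial}\!\left(n_{0i},\, Sp_i\right)

% Level 2: between-study variation on the logit scale
\begin{pmatrix} \operatorname{logit}(Se_i) \\ \operatorname{logit}(Sp_i) \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix},\;
\Sigma = \begin{pmatrix} \sigma_{Se}^{2} & \sigma_{Se,Sp} \\ \sigma_{Se,Sp} & \sigma_{Sp}^{2} \end{pmatrix}
\right)

% Pooled estimates are back-transformed from the logit scale
\widehat{Se} = \frac{1}{1 + e^{-\hat{\mu}_{Se}}}, \qquad
\widehat{Sp} = \frac{1}{1 + e^{-\hat{\mu}_{Sp}}}
```

The off-diagonal term \sigma_{Se,Sp} allows for the correlation between sensitivity and specificity across studies, which is what distinguishes this approach from pooling the two parameters separately.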

The relative sensitivity and specificity of p16INK4a compared to hrHPV testing was computed using metadas, a SAS (SAS Institute Inc., Cary, NC, USA) macro for meta-analysis of diagnostic accuracy studies which allows the inclusion of “test” as a covariate making comparison of two or more tests possible21;22.

Multivariate analyses for p16INK4a immunocytochemistry were done using metadas. Different covariates were included for test-positivity criterion used for p16INK4a, primary antibody, preparation method index cytology and the reference standard used.

Results

Included studies

The electronic search yielded 810 articles (the last search was performed on August 24, 2011). The majority of articles were found in PubMed-Medline (619). An additional 191 articles were retrieved from Embase. The CENTRAL database yielded no further results. The PRISMA flow-chart (Figure 1) shows the harvest of selected references and the reasons for exclusion. Finally, 17 reports were retained that contained data fulfilling the inclusion criteria and allowing the PICOS question to be addressed.

Figure 1. PRISMA-diagram: flowchart of study selection.

Two studies provided data on the accuracy of p16INK4a immunocytochemistry in women with LSIL cytology23;24, 5 in women with ASC-US cytology25–29 and another 10 studies on the triage of both ASC-US and LSIL cytology12;30–38. Study characteristics and technical information of the included papers are shown in Table 1 and Table 2 respectively.

Table 1. Study characteristics of the included studies.

| Study | Country | Study size | Triage group | Triage tests | Outcomes | Gold Standard* |
|---|---|---|---|---|---|---|
| Nieh, 2005 | Taiwan | 66 | ASCUS | p16INK4a cytology | CIN2+ | 3 |
| | | | | HC2 | | |
| Holladay, 2006 | USA | 100 | ASC-US | p16INK4a cytology | CIN2+ | 6 |
| | | 100 | LSIL | HC2 | | |
| Meyer, 2007 | USA | 28 | LSIL | p16INK4a cytology | CIN2+ | 5 |
| | | 15 | ASC-US | HC2 | | |
| Monsonego, 2007 | France | 98 | ASC-US | p16INK4a cytology | CIN2+ | 3 |
| | | 105 | LSIL | HC2 | CIN3+ | |
| Wentzensen, 2007 | France | 137 | ASCUS | p16INK4a cytology | CIN2+ | 3 |
| | | 88 | LSIL | | | |
| Schledermann, 2008 | Denmark | 43 | ASC | p16INK4a cytology | CIN2+ | 6 |
| | Sweden | 36 | LSIL | | | |
| Szarewski, 2008 | UK | 104 | ASCUS | p16INK4a cytology | CIN2+ | 3 |
| | | 617 | LSIL | HC2 | CIN3+ | |
| Denton, 2010 | Switzerland | 385 | ASC-US | p16INK4a cytology | CIN2+ | 6 |
| | Italy | 425 | LSIL | HC2 | CIN3+ | |
| Passamonti, 2010 | Italy | 91 | ASC-US | p16INK4a cytology | CIN2+ | 4 |
| | | 60 | LSIL | | CIN3+ | |
| Samarawardana, 2010 | USA | 164 | ASC-US | p16INK4a cytology | CIN2+ | 4 |
| | | 42 | LSIL | | | |
| Sung, 2010 | Korea | 66 | ASC-US | p16INK4a cytology | CIN2+ | 3 |
| Tsoumpou, 2010 | Greece | 216 | LSIL | p16INK4a cytology | CIN2+ | 4 |
| Alameda, 2011 | Spain | 109 | ASCUS | p16INK4a cytology | CIN2+ | 4 |
| | | | | HC2 | | |
| Edgerton, 2011 | USA | 63 | ASC-US | Dual stain (p16INK4a/Ki-67) | CIN2+ | 6 |
| Guo, 2011 | USA | 65 | ASC-US | p16INK4a cytology | CIN2+ | 5 |
| | | | | | CIN3+ | |
| Nasioutziki, 2011 | Greece | 53 | ASCUS | p16INK4a cytology | CIN2+ | 5 |
| | | 277 | LSIL | HC2 | | |
| Schmidt, 2011 | Switzerland | 361 | ASCUS | Dual stain (p16INK4a/Ki-67) | CIN2+ | 3 |
| | Italy | 415 | LSIL | HC2 | | |

Table 2. Technical details of the included studies.

| Study | p16INK4a antibody | Positivity criterion p16INK4a | Preparation method cytology | Collection device cytology |
|---|---|---|---|---|
| Nieh, 2005 | Clone E6H4 | Nuclear/cytoplasmic staining ≥1 cytological abnormal cervical cell | Conventional cytology | Wooden spatula/cytobrush |
| Holladay, 2006 | Clone E6H4 | Cytoplasmic/nuclear staining ≥1 cytological abnormal cervical cell | LBC (PreservCyt, ThinPrep) | ND |
| Meyer, 2007 | Clone E6H4 | Nuclear/cytoplasmic staining ≥1 cytological abnormal cervical cell | LBC (PreservCyt, ThinPrep) | ND |
| Monsonego, 2007 | Clone E6H4 | Nuclear/cytoplasmic staining ≥1 cytological abnormal cervical cell | LBC (PreservCyt, ThinPrep) | ND |
| Wentzensen, 2007 | Clone E6H4 | Nuclear score* >2 | LBC (CYTO-screen system fixative fluid) | Flexible brush |
| Schledermann, 2008 | Clone E6H4 | Nuclear staining ≥1 cytological abnormal cervical cell | LBC (ThinPrep, PreservCyt) | Plastic spatula; endocervical cytobrush |
| Szarewski, 2008 | Clone E6H4 | Nuclear score* >2 | LBC (ThinPrep, PreservCyt) | Cervex broom |
| Denton, 2010 | Clone E6H4 | Cytotechnologist 1 + pathologist: presence of ≥1 p16INK4a stained cervical cell; Cytotechnologist 2: nuclear score* ≥2 | LBC | ND |
| Passamonti, 2010 | Clone JC8 | Nuclear/cytoplasmic staining ≥1 cytological abnormal cervical cell | 151 conventional cytology; 95 LBC (ThinPrep, PreservCyt) | ND |
| Samarawardana, 2010 | 16P04 | Nuclear/cytoplasmic strong staining in ≥30 metaplastic, koilocytotic, or cytological equivocal cells | LBC (ThinPrep, PreservCyt) | Broom-like device |
| Sung, 2010 | Clone E6H4 | Nuclear/cytoplasmic staining ≥1 cytological abnormal cervical cell | LBC | Cytobrush |
| Tsoumpou, 2010 | Clone E6H4 | Nuclear/cytoplasmic staining ≥1 cytological abnormal cervical cell | LBC (ThinPrep, PreservCyt) | ND |
| Alameda, 2011 | Clone E6H4 | Nuclear score* >2 | LBC (ND) | ND |
| Edgerton, 2011 | CINtec PLUS | Simultaneous dual staining of ≥1 cervical cell | LBC (ND, SurePath) | ND |
| Guo, 2011 | Clone 6H12 | Nuclear staining of ≥1 cytological abnormal cervical cell with/without cytoplasmic staining | LBC (SurePath) | ND |
| Nasioutziki, 2011 | Clone E6H4 | Nuclear score* >2 | LBC (PreservCyt; ThinPrep) | Ayre’s spatula & cytobrush |
| Schmidt, 2011 | CINtec PLUS kit; Clone E6H4; Clone 274-11 AC3 | Simultaneous dual staining of ≥1 cervical cell | LBC (ThinPrep, PreservCyt) | ND |

In two studies12;26, p16INK4a/Ki-67 dual staining using the CINtec Plus kit was performed; the other 15 studies23–25;27–38 applied single p16INK4a immunocytochemistry. Twelve studies23–25;28–33;36–38 used clone E6H4 as the primary antibody for p16INK4a; the other primary antibodies used were Clone 6H12 (ref. 27), Clone JC8 (ref. 34) and 16P04 (ref. 35). Positivity criteria for p16INK4a immunostaining differed between the studies. Five studies25;30;33;37;38 made use of the nuclear scoring proposed by Wentzensen and Bergeron39. This scoring system takes into account nuclear staining and nuclear abnormalities (increased size, granular/hyperchromatic chromatin, irregular shape or variable morphology from cell to cell). When a cervical cell shows nuclear p16INK4a staining and one of the nuclear abnormalities mentioned above, a score of 2 is given. If the stained nucleus shows an increased size and 1 or more nuclear abnormality, a score of 3 and 4 is given respectively. A nuclear score of >2 or ≥2 is used as a cut-off for p16INK4a positivity. For the studies that applied p16INK4a/Ki-67 dual staining, simultaneous red nuclear and brown cytoplasmic staining in at least one cervical cell was set as the positivity criterion12;26. In the remaining 10 studies, the presence of staining in at least 1 (or, in one study, at least 30) cytologically abnormal cervical cells was interpreted as a positive p16INK4a reaction. However, these studies differed in the localization of immunostaining they required. Two studies27;36 considered only nuclear staining as a positive p16INK4a reaction, while 8 studies23;24;28;29;31;32;34;35 considered nuclear and/or cytoplasmic staining as a positive reaction.

Triage of atypical cells of undetermined significance

Fifteen studies contained accuracy data for p16INK4a immunostaining in the triage of women with ASC-US cytology12;25–38. A total of 1740 women were enrolled. Eight studies performed a direct comparison with HC2 triage data12;25;28;30–33;37. The study of Denton et al.30 provided p16INK4a immunocytochemistry data interpreted independently by 2 pathologists and 1 cytotechnologist. To prevent this study from having undue influence, each interpretation was weighted by a factor of 0.33.

Absolute accuracy p16INK4a-triage

The pooled estimates of absolute sensitivity and specificity and their 95% confidence intervals (CI) are shown in Table 3. The pooled sensitivity was 83.2% (95%CI: 76.8–88.2%) and 85.4% (95%CI: 71.7–93.1%) for an outcome of CIN2+ and CIN3+ respectively. To predict the absence of CIN2+ or CIN3+, the pooled absolute specificity was 71.0% (95%CI: 65.0–76.4%) and 61.1% (95%CI: 57.2–64.9%) respectively. The hierarchical summary receiver-operator characteristic (HSROC) curve for p16INK4a triage for an outcome of CIN2+ is shown in Figure 2.

Table 3. Pooled absolute accuracy estimates of p16INK4a and HC2 in the triage of women with ASCUS or LSIL cytology.

| Test | Triage group | Outcome | N° of studies | Parameter | Accuracy, % (95% CI) |
|---|---|---|---|---|---|
| p16INK4a | ASCUS/-US | CIN2+ | 17* | Sensitivity | 83.2 (76.8–88.2) |
| | | | 17* | Specificity | 71.0 (65.0–76.4) |
| | | CIN3+ | 8 | Sensitivity | 85.4 (71.7–93.1) |
| | | | 8 | Specificity | 61.1 (57.2–64.9) |
| | LSIL | CIN2+ | 14 | Sensitivity | 83.8 (73.5–90.6) |
| | | | 14 | Specificity | 65.7 (54.2–75.6) |
| | | CIN3+ | 7 | Sensitivity | 87.7 (78.6–93.2) |
| | | | 7 | Specificity | 48.9 (36.2–61.7) |
| HC2 | ASCUS/-US | CIN2+ | 8 | Sensitivity | 91.6 (85.9–95.1) |
| | | | 8 | Specificity | 40.5 (33.5–47.9) |
| | | CIN3+ | 3 | Sensitivity | 92.2 (85.1–99.4) § |
| | | | 3 | Specificity | 41.0 (33.1–48.8) § |
| | LSIL | CIN2+ | 7 | Sensitivity | 99.5 (82.6–100.0) |
| | | | 7 | Specificity | 28.9 (16.4–45.6) |
| | | CIN3+ | 3 | Sensitivity | 98.6 (95.9–101.3) § |
| | | | 3 | Specificity | 22.5 (15.3–29.6) § |

Figure 2. Meta-analysis of the sensitivity and specificity of p16INK4a immunostaining in the triage of women with ASC-US (left) or LSIL (right) to detect CIN2+ (top) and CIN3+ (bottom). Black square: summary point; small circles: individual studies; green line: SROC curve; interrupted brown line: 95% confidence ellipse.

Relative accuracy of p16INK4a- versus HC2-triage

The relative accuracy measures and their CIs are shown in Table 4. The relative sensitivity of p16INK4a versus HC2 for CIN2+ and CIN3+ was 0.95 (95%CI: 0.89–1.01) and 0.98 (95% CI: 0.86–1.12) respectively. The relative specificity was 1.82 (95% CI: 1.57–2.12) and 1.64 (95% CI: 1.44–1.87) for predicting the absence of CIN2+ or CIN3+ respectively. The corresponding HSROC curve is shown in Figure 3. In the upper graph the two summary points are at almost the same height (equal sensitivity), but the summary point of p16INK4a is located further to the left (higher specificity) than that of HC2. This means that HC2 and p16INK4a have a similar sensitivity in the triage of ASC-US to detect CIN2+, whereas the specificity of p16INK4a is higher than that of HC2.
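How the ratios in Table 4 are read can be sketched in a few lines of Python. The helper `interpret_ratio` is hypothetical and uses the simple rule that a 95% CI excluding 1 indicates a difference at roughly the 5% level; the p-values reported in Table 4 come from the fitted model, not from this rule.

```python
def interpret_ratio(lower: float, upper: float) -> str:
    """Classify a p16INK4a-vs-HC2 ratio from the bounds of its 95% CI."""
    if lower > 1.0:
        return "significantly higher"
    if upper < 1.0:
        return "significantly lower"
    return "no significant difference"


# (ratio, lower bound, upper bound) for the CIN2+ outcome, as reported in Table 4.
TABLE4_CIN2 = {
    ("ASC-US", "sensitivity"): (0.95, 0.89, 1.01),
    ("ASC-US", "specificity"): (1.82, 1.57, 2.12),
    ("LSIL", "sensitivity"): (0.87, 0.81, 0.94),
    ("LSIL", "specificity"): (2.74, 1.99, 3.76),
}

for (group, parameter), (ratio, lower, upper) in TABLE4_CIN2.items():
    verdict = interpret_ratio(lower, upper)
    print(f"{group} {parameter}: {ratio:.2f} ({lower:.2f}-{upper:.2f}) -> {verdict}")
```

Applied to the CIN2+ rows, this rule agrees with the reported p-values: only the ASC-US sensitivity ratio has an interval that includes 1.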

Table 4. Pooled relative accuracy of p16INK4a vs HC2 in the triage of women with ASCUS or LSIL cytology.

| Triage group | Outcome | Parameter | Ratio (p16INK4a vs HC2) | p-value |
|---|---|---|---|---|
| ASCUS/-US | CIN2+ | Sensitivity | 0.95 (0.89–1.01) | 0.1287 |
| | | Specificity | 1.82 (1.57–2.12) | <0.0001 |
| | CIN3+ | Sensitivity* | 0.98 (0.86–1.12) | 0.780 |
| | | Specificity* | 1.64 (1.44–1.87) | 0.000 |
| LSIL | CIN2+ | Sensitivity | 0.87 (0.81–0.94) | 0.0002 |
| | | Specificity | 2.74 (1.99–3.76) | <0.0001 |
| | CIN3+ | Sensitivity | 0.88 (0.81–0.95) | 0.0013 |
| | | Specificity | 2.81 (2.38–3.33) | <0.0001 |

Figure 4. Forest plot of sensitivity (left) and specificity (right) ratios of p16INK4a triage versus HC2 in women with ASC-US (top) or LSIL (bottom) to detect CIN2+.

Triage of low grade squamous intraepithelial lesions

A total of 2019 women were enrolled in the 12 studies12;23;24;30–38 reporting p16INK4a triage accuracy data for LSIL. Seven studies allowed comparison of p16INK4a with HC2 triage12;23;30–33;37.

Absolute accuracy p16INK4a-triage

The pooled absolute sensitivity was similar to that in the triage of ASC-US: 83.8% (95%CI: 73.5–90.6%) for CIN2+ and 87.7% (95%CI: 78.6–93.2%) for CIN3+. The absolute specificity for predicting the absence of CIN2+ or CIN3+ lesions was somewhat lower than in ASC-US triage, with pooled estimates of 65.7% (95% CI: 54.2–75.6%) and 48.9% (95%CI: 36.2–61.7%) respectively (Table 3, Figure 2).

Relative accuracy p16INK4a- versus HC2-triage

In contrast with ASC-US triage, p16INK4a showed a lower sensitivity than HC2 to predict CIN2+ or CIN3+ lesions. The relative sensitivity for CIN2+ and CIN3+ lesions was 0.87 (95%CI: 0.81–0.94) and 0.88 (95%CI: 0.81–0.95) respectively. In concordance with ASC-US triage, p16INK4a showed a statistically significantly higher specificity than HC2, with pooled ratios of 2.74 (95%CI: 1.99–3.76) and 2.81 (95% CI: 2.38–3.33) for the CIN2+ and CIN3+ outcomes respectively. The corresponding HSROC curve is shown in Figure 3, lower graph. The summary point of p16INK4a is located lower (lower sensitivity) and further to the left (higher specificity) than that of HC2 testing. In other words, p16INK4a triage has a higher specificity but a lower sensitivity than HC2 to detect CIN2+ lesions in women with LSIL cytology (Table 4, Figure 3 and Figure 4).

Influence of study characteristics

The multivariate analysis showed a higher sensitivity and specificity for studies that used the nuclear scoring system to interpret p16INK4a results and for studies that applied dual staining for p16INK4a and Ki-67, compared to studies that only looked at simple p16INK4a expression in cytologically abnormal cells (Table 5). However, these differences were generally not statistically significant (p>0.05); the one exception was the specificity of nuclear scoring in ASC-US triage (p=0.036, Table 5).

The studies that applied both p16INK4a and HC2 triage tests showed no significant differences in sensitivity and equal specificity compared to studies that only assessed p16INK4a immunocytochemistry. The type of p16INK4a antibody used also did not significantly influence the accuracy measures.

Discussion

Our meta-analysis showed better accuracy of p16INK4a triage of ASC-US than HC2 (similar sensitivity but better specificity) considering both outcomes CIN2+ and CIN3+. In LSIL triage, p16INK4a-staining was more specific than HC2 but less sensitive.

Triage of ASC-US

It has been shown in large randomized trials and meta-analyses that HC2 performs better than repeat cytology to triage women with ASC-US4;5;40;41. Nevertheless, the triage specificity of HC2 is still not optimal (often in the 40–60% range), resulting in colposcopy referral of many women without disease. With a pooled specificity of 71% (1.82 times higher than HC2), p16INK4a immunostaining appears to be a test that meets the demand for a more specific triage test without losing sensitivity. The specificity of HC2 in ASC-US triage in our meta-analysis of 8 studies (40.5%, 95%CI: 33.5–47.9%) seems lower than in a previous meta-analysis including 20 studies (62.5%, 95% CI: 57.8–67.3%)5, but not so different from the specificity reported in the ALTS study (48%)3; this could be explained by differences in the age composition of the study populations. Age could not be controlled for in previous meta-analyses since age-stratified data were not sufficiently reported in the included studies. However, within each of the 8 studies included in our meta-analysis, age could not cause bias since the two compared tests were performed on the same women.
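The practical meaning of the specificity difference can be illustrated with a rough Python sketch that applies the pooled ASC-US estimates for CIN2+ from Table 3 to a hypothetical cohort of 1000 women. The CIN2+ prevalence of 10% is an assumption chosen purely for illustration and is not a figure from this meta-analysis; the function name `triage_outcomes` is likewise hypothetical.

```python
def triage_outcomes(n_women: int, prevalence: float,
                    sensitivity: float, specificity: float) -> dict:
    """Expected triage results in a cohort, given test accuracy and prevalence."""
    diseased = n_women * prevalence
    healthy = n_women - diseased
    true_pos = diseased * sensitivity
    false_pos = healthy * (1.0 - specificity)
    return {
        "referred": true_pos + false_pos,   # all triage-positive women
        "detected": true_pos,               # CIN2+ cases sent to colposcopy
        "missed": diseased - true_pos,      # CIN2+ cases with a negative triage test
        "unnecessary": false_pos,           # referrals without CIN2+
    }


ASSUMED_PREVALENCE = 0.10  # hypothetical CIN2+ prevalence among women with ASC-US

# Pooled ASC-US estimates for CIN2+ (Table 3).
p16 = triage_outcomes(1000, ASSUMED_PREVALENCE, sensitivity=0.832, specificity=0.710)
hc2 = triage_outcomes(1000, ASSUMED_PREVALENCE, sensitivity=0.916, specificity=0.405)

for name, result in (("p16INK4a", p16), ("HC2", hc2)):
    summary = ", ".join(f"{key}={value:.0f}" for key, value in result.items())
    print(f"{name}: {summary}")
```

Under this assumed prevalence, p16INK4a triage would refer roughly 344 women per 1000 and HC2 roughly 627, at the cost of about 8 additional missed CIN2+ cases. The figures shift with prevalence, and the pooled estimates for the two tests come from partly different sets of studies, so the comparison is indicative only.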

Triage of LSIL

HC2 does not perform well in many studies because of its very low specificity7;40;42. However, these findings are not universal and depend on the quality of cytological interpretation and the HPV test used. In our meta-analysis, the pooled specificity values for HC2 were very similar to those of previous meta-analyses (in the range of 22% to 28% for CIN3+ and CIN2+ outcomes)5. There is clearly a need for assays that are universally usable in the triage of LSIL and that are as sensitive as, and more specific than, HC2. Our meta-analysis shows that p16INK4a is indeed more specific but, in contrast to the triage of ASC-US, it is less sensitive.

Influence of study characteristics

The use of p16INK4a immunocytochemistry in clinical applications remains controversial due to the variation in procedures used. The most important difference between the studies is the interpretation of p16INK4a expression31. Since a purely color-based approach to identifying abnormal cells in cervical smears using p16INK4a is hampered by the fact that a few normal endocervical, squamous metaplastic, or atrophic cells may also display some p16INK4a expression, Wentzensen et al.38 defined morphologic criteria that would enable scoring of p16INK4a-positive squamous cells. A major concern in using morphology-based biomarkers is achieving adequate reproducibility. While the nuclear scoring showed high reproducibility in the initial reports38;39, it was not consistently applied in subsequent studies, and reproducibility was not evaluated on a larger scale. The recent p16INK4a/Ki-67 dual staining could eliminate the need for a standardized methodology because it allows cells with a deregulated cell cycle to be identified in cervical cytology specimens independently of morphology-based parameters. We presumed that the studies applying the nuclear scoring system or p16INK4a/Ki-67 dual staining would have a greater accuracy (higher sensitivity and specificity) to identify women with CIN2+ compared to studies that only looked at simple p16INK4a expression in cytologically abnormal cells without scoring. Multivariate analyses showed higher sensitivity and specificity of the ASC-US studies applying nuclear scoring or dual staining compared to those applying simple p16INK4a immunostaining but, in general, these differences were not statistically significant. Only the specificity of p16INK4a immunostaining with nuclear scoring in women with ASC-US was significantly higher compared to the other studies (p=0.04). p16INK4a/Ki-67 dual staining was used in only 2 ASC-US-triage studies12;26. One study12 reported excellent sensitivity (92%) and specificity (81%) for CIN2+ using dual staining, where the sensitivity was similar to that of HC2 (ratio 1.01, 95% CI: 0.92–1.16) but with increased specificity (ratio 2.22, 95% CI: 1.89–2.62). Another study26 using dual staining showed substantially lower sensitivity (64%) and specificity (53%) for the same outcome, without comparison with HC2. This could be due to the fact that this study did not follow the manufacturer’s instructions for CINtec PLUS dual staining. In LSIL triage only one study used dual staining, with findings similar to those for ASC-US triage: high sensitivity (94%) and rather good specificity (68%) for CIN2+, with a sensitivity similar to that of HC2 (ratio 0.98, 95% CI: 0.93–1.03) but a higher specificity than HC2 (ratio 3.57, 95% CI: 2.76–4.60)12.

The gold standard used can influence accuracy estimates of the triage test. In this meta-analysis we considered colposcopy and histology as the gold standard and distinguished 6 types of verification. However, none of these methods of verification significantly influenced the accuracy estimates of triage. In addition, staining of biopsies can also affect the outcome assessment. Two studies12;30 used p16 immunohistochemistry in addition to the normal haematoxylin & eosin (HE) staining for the histological interpretation of biopsies. Previous studies have shown that this improved gold standard increases the sensitivity of the histological interpretation43;44. Multivariate analysis showed no significant difference in the absolute sensitivity of triage using p16INK4a immunocytochemistry between studies that used HE staining and those that used p16INK4a staining of biopsies (p=0.17 and p=0.22 for ASC-US and LSIL respectively). Furthermore, outcome adjudication using p16 will bias results in favor of p16 cytology because of autocorrelation.

Future research on triage of ASC-US and LSIL

The meta-analysis presented in this paper is part of an international effort including a series of ongoing meta-analyses addressing the accuracy of triage of minor cytological abnormalities using other methods, such as hrHPV DNA tests other than HC2, assays detecting viral RNA, assays detecting a restricted number of HPV types (in particular HPV types 16 and 18), as well as other protein markers such as ProExC (BD Diagnostics-TriPath, Burlington, NC, USA). All these meta-analyses will address questions of follow-up of screen-positive women participating in cytology-based screening. Investigators and authors should be encouraged to follow the STARD guidelines for good diagnostic research, which involves application of one or more markers followed by verification with colposcopy and colposcopy-targeted biopsies, with or without additional random punch biopsies, for all patients with ASC-US and LSIL14;45. This gold standard verification should preferably be blinded to the results of the markers and take place within a short delay (<10 weeks) to avoid development of disease after the triage tests. Future research should also target longitudinal outcomes, in particular the risk of developing CIN3 in women with triage-positive and triage-negative results over 3 to 5 years (longitudinal PPV and 1-NPV).
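For orientation, the cross-sectional analogues of the two risk measures named above can be written in terms of sensitivity, specificity and disease prevalence. The symbols Se, Sp and \pi are notation introduced here; in a longitudinal study the corresponding risks would be estimated directly from follow-up data rather than from these identities.

```latex
% Se: sensitivity, Sp: specificity, \pi: prevalence of disease (e.g. CIN3+)

% Risk of disease in triage-positive women
\mathrm{PPV} = \frac{Se \cdot \pi}{Se \cdot \pi + (1 - Sp)(1 - \pi)}

% Risk of disease in triage-negative women
1 - \mathrm{NPV} = \frac{(1 - Se) \cdot \pi}{(1 - Se) \cdot \pi + Sp\,(1 - \pi)}
```

A triage test is clinically useful when the PPV is high enough to justify referral and 1-NPV is low enough to justify a return to routine screening.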

Conclusion

Based on the currently published data, we can conclude that p16INK4a immunocytochemistry could be recommended for use in the triage of women with ASC-US due to the higher specificity without loss of sensitivity compared to HC2 testing. In LSIL triage, p16INK4a is less sensitive but more specific than HC2. It can therefore be used as a first step triage justifying further diagnostic work-up of p16INK4a-positive women. However, women with LSIL testing p16INK4a negative cannot be referred back to normal screening. Those women should be re-invited for a repeat testing. Dual staining in LSIL triage could be as sensitive as HC2 but this was observed in only one observational study, which is insufficient to justify clinical recommendations. More studies using the dual stain are currently ongoing and may have an influence on the current conclusions.

Figure 3. HSROC plot of the relative sensitivity and specificity of p16INK4a immunostaining versus HC2 in the triage of women with ASC-US (top) or LSIL (bottom) to detect CIN2+ lesions.

Table 5. Multivariate meta-analysis of the absolute sensitivity and specificity of p16INK4a triage of ASC-US and LSIL for an outcome of CIN2+ according to covariates.

| Triage group | Covariate | Covariate level | N° studies | Sensitivity (%) | p-value | Specificity (%) | p-value |
|---|---|---|---|---|---|---|---|
| ASC-US | Test cutoff criterion | p16INK4a expression in >1 cell | 10 | 81.6 (70.2–89.3) | REF | 66.8 (62.3–70.9) | REF |
| | | NS>2 | 5 | 85.9 (75.5–92.3) | 0.504* | 83.1 (60.8–94.0) | 0.036* |
| | | Dual staining >1 cell | 2 | 84.6 (69.1–93.1) | 0.696* | 70.2 (56.7–80.9) | 0.595* |
| | Nb of triage tests evaluated | Both triage tests§ | 10 | 87.0 (81.1–91.3) | 0.119* | 73.3 (62.9–81.7) | 0.452* |
| | | Only p16INK4a testing** | 7 | 77.3 (65.1–86.1) | REF | 68.7 (60.9–75.6) | REF |
| LSIL | Test cutoff criterion | p16INK4a expression in >1 cell | 9 | 79.7 (65.8–88.9) | REF | 58.9 (46.1–70.7) | REF |
| | | NS>2 | 4 | 82.1 (65.9–91.6) | 0.778* | 77.1 (56.4–89.7) | 0.086* |
| | | Dual staining >1 cell | 1 | 94.4 (55.0–99.6) | 0.106* | 68.0 (45.5–84.4) | 0.445* |
| | Nb of triage tests evaluated | Both triage tests | 9 | 85.2 (74.4–91.1) | 0.644* | 64.1 (49.3–76.6) | 0.681* |
| | | Only p16INK4a testing | 5 | 80.2 (54.9–93.1) | REF | 68.7 (49.9–82.8) | REF |

Acknowledgments

Financial support was received from: (1) the European Commission through the PREHDICT Network, coordinated by the Free University of Amsterdam (the Netherlands), funded by the 7th Framework program of DG Research (Brussels, Belgium), and through the ECCG (European Cooperation on development and implementation of Cancer screening and prevention Guidelines, via IARC, Lyon, France), funded by Directorate of SANCO (Luxembourg, Grand-Duchy of Luxembourg); (2) The Belgian Foundation Against Cancer, Brussels, Belgium; (3) the Gynaecological Cancer Cochrane Review Collaboration (Bath, United Kingdom).

The authors acknowledge M. Nasioutziki, M. Guo 2010 for the provision of additional data.

References