Combination of Transient Elastography and an Enhanced Liver Fibrosis Test to Assess the Degree of Liver Fibrosis in Patients with Chronic Hepatitis B

Abstract

Background/Aims

Liver stiffness (LS) was assessed using transient elastography, and the enhanced liver fibrosis (ELF) test was performed to accurately assess fibrotic burden. We validated the LS-ELF algorithm and investigated whether the sequential LS-ELF algorithm performs better than the concurrent combination of these tests in chronic hepatitis B (CHB) patients.

Methods

Between 2009 and 2013, 222 CHB patients who underwent liver biopsy (LB), as well as LS measurement and the ELF test, were enrolled.

Results

Advanced fibrosis (≥F3) and cirrhosis (F4) were identified in 141 (63.6%) and 118 (53.2%) patients, respectively. Areas under the receiver operating characteristic curve for LS predictions of ≥F3 (0.887 vs 0.703) and F4 (0.853 vs 0.706) were significantly higher than those for the ELF test (all p<0.001). Based on the LS-ELF algorithm, 60.4% to 71.6% and 55.7% to 66.3% of patients could have avoided LB to exclude ≥F3 and F4, respectively, whereas 68.0% to 78.7% and 63.5% to 66.1% of patients could have avoided LB to confirm ≥F3 and F4, respectively. When confirmation and exclusion strategies were applied simultaneously, 69.4% to 72.5% and 60.8% to 65.3% of patients could have avoided LB and been diagnosed as ≥F3 and F4, respectively. The proportion of patients who correctly avoided LB for the prediction of ≥F3 (69.4% to 72.5% vs 42.3% to 59.0%) and F4 (60.8% to 65.3% vs 23.9% to 49.5%) based on the sequential LS-ELF algorithm was significantly higher than that based on the concurrent combination (all p<0.05).

Conclusions

The sequential LS-ELF algorithm conferred a greater probability of avoiding LB when diagnosing advanced fibrosis and cirrhosis in CHB patients, and this algorithm performed significantly better than the concurrent combination.

Keywords: Hepatitis B, chronic, Enhanced liver fibrosis, Liver cirrhosis, Transient elastography, Liver stiffness

INTRODUCTION

Chronic hepatitis B (CHB) includes a diverse spectrum of diseases, ranging from asymptomatic infection to severely fulminant hepatitis and hepatic decompensation.1,2 The extent of liver fibrosis is one of the main prognostic factors in CHB and is correlated with the risk of developing cirrhosis and liver-related complications.3 Fortunately, recently developed potent antiviral agents for CHB can prevent disease progression, reduce the risk of hepatocellular carcinoma (HCC), and even reverse hepatitis B virus (HBV)-induced liver fibrosis. However, because the remaining fibrotic burden cannot be completely resolved in most CHB patients, they remain at higher risk of developing liver-related complications, such as HCC.3 Thus, the degree of liver fibrosis must be evaluated in any assessment of the risk of HBV-related HCC development and determination of long-term prognosis, even in this era of potent and active antiviral therapy.4,5

Liver biopsy (LB) is considered a reference method for evaluating the extent of liver fibrosis;6 however, it has some issues including sampling error and inter-/intra-observer interpretational variability.7,8 In addition, LB is associated with a low risk of life-threatening procedure-related complications such as bleeding, perforation, and even death.9 Thus, various biochemical surrogates for LB, such as the FibroTest (FT), enhanced liver fibrosis (ELF) test, and Wisteria floribunda agglutinin-positive human Mac-2 binding protein,10,11 as well as physical tools such as transient elastography (TE) and acoustic radiation force impulse elastography, have been proposed to noninvasively assess the fibrotic burden in patients with chronic liver diseases.12,13

Of these noninvasive surrogates, the ELF test incorporates three serologic markers: hyaluronic acid (HA), N-terminal propeptide of collagen type III (PIIINP), and tissue inhibitor of metalloproteinase-1 (TIMP-1).14 The ELF test has demonstrated high reproducibility, automaticity, and performance in predicting the extent of liver fibrosis in patients with chronic liver diseases of varying etiologies.15–18 In addition, several longitudinal studies have reported a significant association between the ELF test and liver-related events, HCC, and liver-related deaths in patients with chronic liver diseases. Similarly, liver stiffness (LS) assessed using TE exhibits high accuracy and reproducibility in diagnosing the degree of liver fibrosis.19 Several longitudinal studies have demonstrated the ability of TE to predict long-term prognosis and assess the risk of HCC and liver-related event development.20–22

Based on the accuracy of LS and ELF, Wong et al.23 recently proposed a sequential LS-ELF algorithm to diagnose advanced fibrosis and cirrhosis in CHB patients. An exclusion and confirmation strategy based on this algorithm could improve on the diagnostic accuracy of the LS value or ELF test alone, allowing approximately 60% of patients to avoid LB. However, this algorithm has not been validated or compared to the concurrent use of LS and ELF. In addition, the performance of ELF in the study by Wong et al.23 was lower than that previously reported (area under receiver operating characteristic curve [AUC], 0.69 vs 0.86 to 0.92).15–18

Thus, in this single-center, retrospective cohort study, we investigated the diagnostic performance of LS assessed using TE and of the ELF test for advanced fibrosis and cirrhosis, validated the LS-ELF algorithm, and determined whether the sequential LS-ELF algorithm is superior to the concurrent combination of the two tests in terms of preventing unnecessary LB in CHB patients.

MATERIALS AND METHODS

1. Patients

From 2009 to 2013, 265 CHB patients who underwent TE and LB at Severance Hospital, Yonsei University College of Medicine (Seoul, Korea) were considered eligible for inclusion in this study. The study population included 99 CHB patients who were recruited in our previous study between 2009 and 2010.14 CHB was defined as the persistent presence of serum HBV surface antigen for more than 6 months and HBV DNA positivity by the polymerase chain reaction assay. Before starting antiviral therapy, LB was performed to assess the severity of liver fibrosis. With informed consent, serum samples taken at the time of LB were stored in the Yonsei Liver Blood Bank system (approval number: 4-2009-0725).

The exclusion criteria were as follows: (1) measurement failure of LS value (valid shot=0); (2) unreliable LS value; (3) measurement failure of the ELF test; (4) a previous history of antiviral therapy and hepatic decompensation; (5) HCC at the time of LB or a past history of it;24 (6) liver specimen less than 15 mm in length; (7) alanine aminotransferase (ALT) level >5×the upper limit of normal (ULN); (8) co-infection with human immunodeficiency virus or hepatitis C virus; (9) alcohol ingestion in excess of 40 g/day for more than 5 years;25 and (10) heart failure or pregnancy (Supplementary Fig. 1).

This study was performed in accordance with the ethical guidelines of the 1975 Declaration of Helsinki and was approved by the Institutional Review Board of Severance Hospital. Given its retrospective nature, written informed consent to access clinical data was not required.

2. Histologic evaluation

LB specimens were fixed in formalin and embedded in paraffin. Then standard hematoxylin and eosin and trichrome (Masson) stains were performed using 4-μm sections. All of the liver tissue samples were evaluated by an experienced pathologist blinded to the patient clinical data, including the results of the TE and ELF tests. Liver histology was scored semi-quantitatively according to the Batts and Ludwig system.26 Fibrosis was staged (0 to 4) as follows: F0, no fibrosis; F1, portal fibrosis without septa; F2, portal fibrosis and few septa; F3, numerous septa without cirrhosis; and F4, cirrhosis.

3. ELF measurement

ELF was tested using serum samples stored at the time of LB. HA, PIIINP, and TIMP-1 levels were estimated using an ADVIA Centaur XP automated immunoanalyzer (Siemens Healthcare Diagnostics, Tarrytown, NY, USA). ELF was calculated using the following algorithm provided by the manufacturer: ELF=2.278+0.851 ln(HA)+0.751 ln(PIIINP)+0.394 ln(TIMP-1).
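As an illustration, the manufacturer's formula quoted above can be written as a short function. This is a minimal sketch: the function name, the input validation, and the assumption that all three markers are supplied in ng/mL are ours and are not stated in the text.

```python
import math


def elf_score(ha: float, piiinp: float, timp1: float) -> float:
    """Return the ELF score from HA, PIIINP, and TIMP-1 concentrations.

    Assumption: all three concentrations are in ng/mL; the text does not
    state the units. Coefficients are those quoted in the Methods.
    """
    if min(ha, piiinp, timp1) <= 0:
        # The natural logarithm is undefined for non-positive values.
        raise ValueError("marker concentrations must be positive")
    return (
        2.278
        + 0.851 * math.log(ha)
        + 0.751 * math.log(piiinp)
        + 0.394 * math.log(timp1)
    )
```

Because each marker enters through its natural logarithm, doubling any one concentration raises the score by a fixed amount (for HA, 0.851 × ln 2) regardless of the starting level.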

4. LS measurement using TE

At the time of enrollment, TE was performed by a well-trained technician (>50,000 examinations). Details of the technique and examination procedure were previously reported.19,27–29 LS values were expressed in kilopascals (kPa). The interquartile range (IQR) was defined as an index of intrinsic variability of LS values, corresponding to the interval of LS measurement results containing 50% of the valid measurements between the 25th and 75th percentiles. In this study, only LS values with at least 10 valid measurements and a success rate of at least 60% were considered reliable. The median of the successfully measured values was considered representative of LS in a given patient only when the IQR-to-median ratio was <0.3.
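The reliability criteria described above can be sketched as a single check. The function name and inputs are hypothetical, and the quartile method (Python's default "exclusive" method in `statistics.quantiles`) is an assumption, because the text does not say how the device computes the IQR.

```python
import statistics


def ls_is_reliable(valid_kpa: list[float], total_shots: int) -> bool:
    """Apply the three reliability criteria stated in the Methods.

    valid_kpa   -- the valid LS measurements (kPa) for one patient
    total_shots -- all attempted measurements, valid or not
    """
    n_valid = len(valid_kpa)
    if total_shots < n_valid:
        raise ValueError("total_shots cannot be smaller than the valid count")
    if n_valid < 10:                      # at least 10 valid measurements
        return False
    if n_valid / total_shots < 0.60:      # success rate of at least 60%
        return False
    median = statistics.median(valid_kpa)
    if median <= 0:
        return False
    # Assumption: quartiles use the default "exclusive" method.
    q1, _, q3 = statistics.quantiles(valid_kpa, n=4)
    return (q3 - q1) / median < 0.3       # IQR-to-median ratio below 0.3
```

A measurement set that fails any one criterion is treated as unreliable, which matches exclusion criterion (2) in the patient selection.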

5. Sequential LS-ELF algorithm

Wong et al.23 proposed a sequential LS-ELF algorithm, in which confirmation and exclusion by the LS value were performed first. When the value was indeterminate, confirmation and exclusion by the ELF test were performed in sequence, and when both values were nondiagnostic, LB was considered. A confirmatory strategy was defined as accurate if the posttest probability was >90%, and the exclusion strategy was defined as accurate if the posttest probability was <10%. The LS and ELF value cutoffs proposed by Wong et al.23 were adopted as external cutoff values in this study. The external LS value cutoffs were as follows: ≥F3, 6.0 kPa and F4, 7.5 kPa for at least 90% sensitivity; ≥F3, 10.2 kPa and F4, 12.0 kPa for at least 90% specificity; and ≥F3, 9.0 kPa and F4, 10.0 kPa for the maximum sum of sensitivity and specificity. Those for the ELF test were as follows: ≥F3, 8.4 and F4, 8.8 for at least 90% sensitivity; ≥F3, 10.8 and F4, 11.1 for at least 90% specificity; and ≥F3, 9.8 and F4, 9.5 for the maximum sum of sensitivity and specificity.
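The decision flow described above can be sketched as follows, using the external cutoffs for ≥F3 quoted in this section. The names are hypothetical. How values exactly at a cutoff are handled (here, a value at or above the high-specificity cutoff confirms, and a value below the high-sensitivity cutoff excludes) is an assumption, since the text does not specify it.

```python
from typing import NamedTuple


class Cutoffs(NamedTuple):
    exclude: float  # cutoff with at least 90% sensitivity
    confirm: float  # cutoff with at least 90% specificity


# External cutoffs for >=F3 quoted in the text (from Wong et al.).
LS_F3 = Cutoffs(exclude=6.0, confirm=10.2)    # kPa
ELF_F3 = Cutoffs(exclude=8.4, confirm=10.8)


def classify(value: float, cutoffs: Cutoffs) -> str:
    """Classify one test result. Boundary handling is an assumption."""
    if value >= cutoffs.confirm:
        return "confirmed"
    if value < cutoffs.exclude:
        return "excluded"
    return "indeterminate"


def sequential_ls_elf(ls_kpa: float, elf: float,
                      ls_cutoffs: Cutoffs = LS_F3,
                      elf_cutoffs: Cutoffs = ELF_F3) -> str:
    """LS first; ELF only if LS is indeterminate; biopsy if both are."""
    first = classify(ls_kpa, ls_cutoffs)
    if first != "indeterminate":
        return first
    second = classify(elf, elf_cutoffs)
    if second != "indeterminate":
        return second
    return "biopsy"
```

In this sketch the ELF result is consulted only when the LS value falls between its two cutoffs, which is what distinguishes the sequential approach from a concurrent one.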

6. Concurrent LS-ELF algorithm

The accuracy of the concurrent use of LS and ELF test values was compared with that of the sequential LS-ELF algorithm of Wong et al.23 To establish a strategy for the concurrent use of LS and ELF test values in our study cohort, we adopted two strategies that have been used in previous studies. First, Boursier et al.30 proposed the concurrent combination of FibroMeter and LS values. In this algorithm, a forward binary logistic regression was performed using both values. Using the regression score, thresholds corresponding to a 90% negative predictive value (NPV) and a 90% positive predictive value (PPV) were set. LB was considered required when the regression score was in the indeterminate zone between the two thresholds. Second, Castéra et al.10 proposed a different concurrent combination of FT and LS values, in which LB was considered when the FT and LS values did not match.
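The second (agreement-based) strategy can be sketched as below, with ELF substituted for FT as in this study. The function name is hypothetical. The default cutoffs are the external values for ≥F3 that maximize the sum of sensitivity and specificity; which cutoffs were actually applied to this strategy is not stated in the text, so their use here is an assumption.

```python
def concurrent_agreement(ls_kpa: float, elf: float,
                         ls_cutoff: float = 9.0,
                         elf_cutoff: float = 9.8) -> str:
    """Both tests are read together; disagreement sends the patient to biopsy.

    Assumption: a value at or above its cutoff counts as a positive result.
    """
    ls_positive = ls_kpa >= ls_cutoff
    elf_positive = elf >= elf_cutoff
    if ls_positive != elf_positive:
        return "biopsy"          # the two tests do not match
    return "confirmed" if ls_positive else "excluded"
```

Unlike a sequential rule, this approach requires both tests in every patient, and any discordant pair is left undiagnosed.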

7. ULN of ALT and stratification according to ALT level

Because patients at the same fibrosis stage but with higher ALT levels tend to exhibit higher LS values, the diagnostic performances of LS, ELF, and the LS-ELF algorithm were calculated separately according to ALT level (ALT ≤ULN and ALT 1–5× ULN).31 In this study, ULNs of 67 IU/L for males and 55 IU/L for females, as used by Wong et al.,23 were adopted. In addition, we used another ULN of 40 IU/L (ULNKorea) for both sexes which has been used in Korea.14,20,27,28 Based on ULNKorea, 218 patients showed an ALT level ≤5×ULN and were used for subgroup analyses.
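The stratification described above can be sketched as a small helper. The function name and the stratum labels are hypothetical, and the handling of values exactly at a boundary (ALT equal to ULN or to 5×ULN) is an assumption.

```python
def alt_stratum(alt_iu_l: float, sex: str, use_korean_uln: bool = False) -> str:
    """Assign an ALT stratum using the ULN definitions quoted in the text.

    Default ULN: 67 IU/L (males) and 55 IU/L (females), as in Wong et al.
    Korean ULN: 40 IU/L for both sexes.
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    if use_korean_uln:
        uln = 40.0
    else:
        uln = 67.0 if sex == "male" else 55.0

    if alt_iu_l <= uln:
        return "ALT <=ULN"
    if alt_iu_l <= 5 * uln:
        return "ALT 1-5x ULN"
    return "ALT >5x ULN"   # such patients were excluded from the study
```

The same ALT value can fall into different strata under the two definitions; for example, 42 IU/L in a male patient is within the normal range under the first definition but above it under the Korean one.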

8. Statistical analysis

SPSS version 18.0 (SPSS Inc., Chicago, IL, USA) was used for all statistical analyses. Data are expressed as medians (IQR) or number (%) as appropriate. The Pearson correlation coefficient was used to test the correlations of LS and ELF values with the other variables. The performance of LS and ELF in terms of predicting the degree of liver fibrosis was evaluated by AUCs with 95% confidence intervals (CIs). The DeLong method was used to compare the AUC values of LS and ELF. The sensitivity, specificity, PPV, NPV, and positive and negative likelihood ratios of LS, ELF, and the LS-ELF algorithm were calculated to evaluate the accuracy of the determinants. Internal cutoffs were calculated from our study population and external cutoffs were taken from Wong et al.23 Cutoff_se was defined as the cutoff value with at least 90% sensitivity and cutoff_sp as the cutoff value with at least 90% specificity. Cutoff_se+sp was determined to maximize the sum of sensitivity and specificity from receiver operating characteristic curve analyses, and the corresponding diagnostic indices were calculated. The proportion of patients with correctly avoided biopsy was compared among the algorithms using the McNemar test. The cumulative incidence rate of HCC was calculated using the Kaplan-Meier method and compared using the log-rank test. A value of p<0.05 was considered statistically significant.
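For reference, the diagnostic indices listed above follow directly from a 2×2 table of test result against biopsy stage. The sketch below uses the standard definitions; the function name is hypothetical. The counts in the usage note are not reported in the paper: they were back-calculated by us from the percentages reported for the external LS cutoff of 6.0 kPa for ≥F3 (141 patients with ≥F3 and 81 without) and should be read as an assumption.

```python
def diagnostic_indices(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard diagnostic indices from a 2x2 table.

    tp/fn -- diseased patients testing positive/negative
    fp/tn -- non-diseased patients testing positive/negative
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column total must be non-zero")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Likelihood ratios are unbounded when the denominator is zero.
        "lr_pos": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        "lr_neg": (1 - sensitivity) / specificity if specificity > 0 else float("inf"),
    }
```

Calling `diagnostic_indices(tp=137, fp=45, fn=4, tn=36)` with the back-calculated counts gives a sensitivity of 97.2%, specificity of 44.4%, PPV of 75.3%, and NPV of 90.0% after rounding, in line with the corresponding reported values.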

RESULTS

1. Patient characteristics

A total of 265 consecutive CHB patients who underwent LB, LS measurement, and the ELF test were considered eligible for this study. Fifteen patients were excluded due to measurement failure of LS value, unreliable LS value, or measurement failure of ELF test. Of the 250 patients with valid LS values and ELF test results, 28 were excluded according to the exclusion criteria. Finally, a total of 222 patients were selected for statistical analysis.

The baseline characteristics of the study population (144 males and 78 females) are shown in Table 1. The median age and body mass index were 48 years and 23.7 kg/m², respectively. Histological fibrosis staging was as follows: F1 in 39, F2 in 42, F3 in 23, and F4 in 118 patients. The median LS and ELF values were 10.2 kPa and 9.7, respectively.

Table 1.

Baseline Characteristics (n=222)

Variable Value
Demographic
Age, yr 48 (37–55)
Male sex 144 (64.9)
Body mass index, kg/m² 23.7 (21.7–25.6)
Diabetes mellitus 14 (6.3)
Hypertension 27 (12.2)
Laboratory
Alanine aminotransferase, IU/L 42 (30–64)
Serum albumin, g/dL 4.2 (3.9–4.5)
Total bilirubin, mg/dL 0.7 (0.6–0.9)
γ-Glutamyl transpeptidase, IU/L 37 (25–69)
Platelet count, ×10⁹/L 169 (122–201)
HBeAg positivity 115 (51.8)
HBV DNA, copies/mL 616,000 (19,950–14,775,000)
Histological
Length of biopsy samples, cm 1.7 (1.6–2.0)
Fibrosis stage
F1/F2/F3/F4 39 (17.6)/42 (18.9)/23 (10.4)/118 (53.1)
Noninvasive fibrosis assessment
LS value, kPa 10.2 (6.9–15.9)
ELF test 9.7 (8.8–10.4)

2. Correlation among LS and ELF values with the other variables

Age, total bilirubin, γ-glutamyl transpeptidase, and fibrosis stage significantly increased with increasing LS and ELF values, whereas serum albumin and platelet count significantly decreased (all p<0.05). LS and ELF values were significantly correlated (r=0.500, p<0.001), whereas ALT values did not show significant correlations with LS or ELF values (p=0.067 and p=0.494, respectively) (Supplementary Fig. 1).

3. Diagnostic performance of LS and ELF values

The AUCs of LS value to predict ≥F2, ≥F3, and F4 fibrosis stage were 0.857 (95% CI, 0.804 to 0.900), 0.887 (95% CI, 0.837 to 0.925), and 0.853 (95% CI, 0.799 to 0.897), respectively; and those of ELF were 0.802 (95% CI, 0.743 to 0.852), 0.703 (95% CI, 0.638 to 0.762), and 0.706 (95% CI, 0.642 to 0.765), respectively (Fig. 1). The performance of the LS and ELF values to predict ≥F2 was statistically similar (p=0.286), whereas the LS value was significantly superior to the ELF value in predicting ≥F3 and F4 (both p<0.001).

Fig. 1.

Receiver operating characteristic curves of liver stiffness (LS) and enhanced liver fibrosis (ELF) values to predict fibrosis stages. LS and ELF values were similarly predictive of ≥F2 stage (A) (area under receiver operating characteristic curve [AUC], 0.857 vs 0.802; p=0.286), whereas the LS value was superior to the ELF value for predicting the ≥F3 (B) (AUC, 0.887 vs 0.703; p<0.001) and F4 stages (C) (AUC, 0.853 vs 0.706; p<0.001).

Of the study population (n=222), 163 patients (73.4%) had ALT levels ≤ULN and 59 patients (26.6%) had ALT levels of 1–5× ULN. If ULNKorea was used (n=218), 98 patients (45.0%) had ALT levels ≤ULN and 120 patients (55.0%) had ALT levels of 1–5× ULN. When the patients were stratified according to their ALT level (ALT ≤ULN vs ALT 1–5× ULN) and ULN criteria, the LS value was significantly superior to the ELF value for predicting ≥F3 (AUC, 0.859 to 0.910 vs 0.697 to 0.714) and F4 (AUC, 0.838 to 0.886 vs 0.697 to 0.720) (all p<0.05) (Supplementary Table 1).

4. Cutoff values of LS and ELF for predicting ≥F3 and F4

Diagnostic indices of LS and ELF values for predicting ≥F3 and F4 were calculated according to external cutoff values from Hong Kong23 and internal cutoff values from our cohort (Table 2). In addition, the LS and ELF value cutoffs according to ALT level (ALT ≤ULN vs ALT 1–5× ULN) were determined (Table 2).

Table 2.

Diagnostic Indices of LS and ELF Values to Assess ≥F3 and F4 Stages according to External and Internal Cutoff Values

Variable ≥F3 (n=141) F4 (n=118)
Cutoff Sn, % Sp, % PPV, % NPV, % LR+ LR− Cutoff Sn, % Sp, % PPV, % NPV, % LR+ LR−
LS value*
Based on external cutoff
All
Sn >90% 6.0 97.2 44.4 75.3 90.0 1.74 0.06 7.5 91.5 51.0 67.9 84.1 1.86 0.16
Sp >90% 10.2 70.2 87.7 90.8 62.8 5.68 0.33 12.0 59.3 91.3 88.6 66.4 6.85 0.44
Sn+Spmax 9.0 80.1 81.5 88.3 70.2 4.32 0.24 10.0 75.4 74.0 76.7 72.6 2.90 0.33
ALT ≤ULN
Sn >90% 6.0 97.4 41.3 80.9 86.4 1.66 0.06 6.0 98.0 31.3 68.8 90.9 1.42 0.06
Sp >90% 9.0 80.3 82.6 92.2 62.3 4.61 0.23 12.0 57.6 89.1 89.1 57.6 5.26 0.47
Sn+Spmax 9.0 80.3 82.6 92.2 62.3 4.61 0.23 10.0 74.7 71.9 80.4 64.8 2.65 0.35
ALT 1–5× ULN
Sn >90% 7.5 95.8 65.7 65.7 95.8 2.79 0.06 7.5 94.7 57.5 51.4 95.8 2.22 0.09
Sp >90% 12.0 58.3 97.1 93.3 77.3 20.40 0.42 12.0 68.4 95.0 86.7 86.4 13.68 0.33
Sn+Spmax 11.0 75.0 94.3 90.0 84.6 13.12 0.26 11.0 78.9 87.5 75.0 89.7 6.31 0.24
Based on internal cutoff
All
Sn >90% 7.6 90.8 64.2 81.5 80.0 2.53 0.14 7.6 91.5 51.0 67.9 84.1 1.86 0.16
Sp >90% 10.5 66.0 90.1 92.1 60.3 6.67 0.37 11.8 60.2 91.3 88.8 66.9 6.95 0.43
Sn+Spmax 9.0 80.1 81.5 88.3 70.2 4.32 0.24 11.0 66.1 85.6 83.9 69.0 4.58 0.39
ALT ≤ULN
Sn >90% 7.5 90.6 60.9 85.5 71.8 2.31 0.15 7.6 90.9 50.0 73.8 78.0 1.81 0.18
Sp >90% 10.4 66.7 91.3 95.1 51.9 7.66 0.36 12.3 55.6 90.6 90.2 56.9 5.92 0.49
Sn+Spmax 9.0 80.3 82.6 92.2 62.3 4.61 0.23 9.0 82.8 68.8 80.4 72.1 2.65 0.24
ALT 1–5× ULN
Sn >90% 8.0 91.7 68.6 66.7 92.3 2.91 0.12 8.0 94.7 57.5 51.4 95.8 2.22 0.09
Sp >90% 10.6 79.2 91.4 86.4 86.5 9.23 0.22 11.5 78.9 90.0 78.9 90.0 7.89 0.23
Sn+Spmax 10.6 79.2 91.4 86.4 86.5 9.23 0.22 11.5 78.9 90.0 78.9 90.0 7.89 0.23
ELF test
Based on external cutoff
All
Sn >90% 8.4 95.0 34.6 71.7 80.0 1.42 0.14 8.8 86.4 38.5 61.4 71.4 1.40 0.35
Sp >90% 10.8 24.8 92.6 85.4 41.4 3.35 0.81 11.1 21.2 91.3 73.5 50.5 2.44 0.86
Sn+Spmax 9.8 53.2 71.6 76.5 46.8 1.87 0.65 9.5 66.1 58.7 64.5 60.4 1.59 0.57
ALT ≤ULN
Sn >90% 8.4 94.9 30.4 77.6 70.0 1.36 0.18 8.8 85.9 39.1 68.5 64.1 1.40 0.36
Sp >90% 10.8 23.9 91.3 87.5 32.1 2.75 0.83 11.1 22.2 92.2 81.5 43.4 2.84 0.84
Sn+Spmax 9.8 51.3 76.1 84.5 38.0 2.14 0.64 9.5 65.7 62.5 73.0 54.1 1.50 0.54
ALT 1–5× ULN
Sn >90% 9.2 75.0 48.6 50.0 73.9 1.45 0.51 9.5 68.4 52.5 40.6 77.8 1.44 0.60
Sp >90% 10.8 29.2 94.3 77.8 66.0 5.10 0.75 11.1 15.8 90.0 42.9 69.2 1.57 0.93
Sn+Spmax 9.8 62.5 65.7 55.6 71.9 1.82 0.57 9.5 68.4 52.5 40.6 77.8 1.44 0.60
Based on internal cutoff
All
Sn >90% 8.6 90.8 35.8 71.1 69.0 1.41 0.25 8.7 90.7 37.5 62.2 78.0 1.45 0.24
Sp >90% 10.8 24.8 92.6 85.4 41.4 3.35 0.81 10.7 28.8 91.3 79.1 53.1 3.32 0.77
Sn+Spmax 8.4 95.0 34.6 71.7 80.0 1.42 0.14 8.4 98.3 31.7 62.0 94.3 1.43 0.05
ALT ≤ULN
Sn >90% 8.6 90.6 32.6 77.4 57.7 1.34 0.28 8.7 90.9 37.5 69.2 72.7 1.38 0.35
Sp >90% 10.7 25.6 91.3 88.2 32.6 2.94 0.81 10.6 31.3 90.6 83.8 46.0 3.34 0.75
Sn+Spmax 9.0 77.8 52.2 80.5 48.0 1.62 0.42 8.5 97.0 31.3 68.6 87.0 1.41 0.09
ALT 1–5× ULN
Sn >90% 8.6 91.7 40.0 51.2 87.5 1.52 0.20 8.6 94.7 37.5 41.9 93.8 1.51 0.14
Sp >90% 10.8 29.2 94.3 77.8 66.0 5.10 0.75 11.1 15.8 90.0 42.9 69.2 1.57 0.93
Sn+Spmax 8.4 95.8 40.0 52.3 93.3 1.59 0.10 8.4 100.0 37.5 43.2 100.0 1.60 0.00

For prediction of ≥F3, the internal cutoff_se, cutoff_sp, and cutoff_se+sp values were 7.6, 10.5, and 9.0 kPa, respectively, for LS; and those for ELF were 8.6, 10.8, and 8.4, respectively. An elevated ALT level (1–5× ULN) was associated with a slightly increased LS cutoff value compared to a normal ALT value (≤ULN) (cutoff_se 7.5 kPa→8.0 kPa, cutoff_sp 10.4 kPa→10.6 kPa, and cutoff_se+sp 9.0 kPa→10.6 kPa), but was not associated with a change in the ELF cutoff value (cutoff_se 8.6→8.6, cutoff_sp 10.7→10.8, and cutoff_se+sp 9.0→8.4). For prediction of F4, an elevated ALT level had varying effects on the LS and ELF cutoff values. Similar results were obtained when ULNKorea was used (Supplementary Table 2).

5. Diagnostic performance of LS, ELF, and the LS-ELF algorithm

The diagnostic performance of LS, ELF, and the LS-ELF algorithm to exclude and confirm ≥F3 and F4 is shown in Table 3. For reference, the published diagnostic indices of Wong’s cohort are also given in Table 3. Using the exclusion strategy of the LS-ELF algorithm in our cohort, 60.4% (n=49) and 55.7% (n=58) of patients could avoid LB to exclude ≥F3 and F4, respectively, using external cutoffs, whereas 71.6% (n=58) and 66.3% (n=69) of patients could avoid LB to exclude ≥F3 and F4, respectively, using internal cutoffs. The proportions of patients who could avoid LB to confirm ≥F3 and F4 using external and internal cutoffs are also summarized in Table 3. When ULNKorea was used, the overall results were similar (Supplementary Table 3).

Table 3.

Diagnostic Performance of LS, ELF, and the LS-ELF Algorithms to Exclude and Confirm ≥F3 and F4

| Variable | ELF test, internal cutoff | ELF test, external cutoff | LS value*, internal cutoff | LS value*, external cutoff |
| Exclusion strategy | | | | |
| ≥F3 (n=81) | | | | |
| Cutoff value | 8.6 | 8.4 | 7.5 (ALT ≤ULN); 8.0 (ALT 1–5× ULN) | 6.0 (ALT ≤ULN); 7.5 (ALT 1–5× ULN) |
| Sensitivity, % | 91.0 | 95.0 | 90.8 | 97.2 |
| Specificity, % | 33.3 | 34.6 | 64.2 | 51.9 |
| PPV, % | 68.6 | 71.7 | 81.5 | 77.8 |
| NPV, % | 69.8 | 80.0 | 80.0 | 91.3 |
| LR+ | 1.4 | 1.4 | 2.5 | 2.0 |
| LR− | 0.3 | 0.1 | 0.1 | 0.1 |
| No. of biopsy correctly avoided (%) | 29 (35.8) | 28 (34.0) | 52 (64.2) | 42 (51.8) |
| No. of incorrect diagnosis (%) | 13 (16.0) | 7 (8.6) | 13 (16.0) | 4 (4.9) |
| F4 (n=104) | | | | |
| Cutoff value | 8.7 | 8.8 | 7.6 (ALT ≤ULN); 8.0 (ALT 1–5× ULN) | 6.0 (ALT ≤ULN); 7.5 (ALT 1–5× ULN) |
| Sensitivity, % | 90.7 | 86.4 | 90.7 | 97.5 |
| Specificity, % | 37.5 | 38.5 | 53.8 | 41.3 |
| PPV, % | 62.2 | 61.4 | 69.0 | 65.3 |
| NPV, % | 78.0 | 71.4 | 83.6 | 93.5 |
| LR+ | 1.5 | 1.4 | 2.0 | 1.7 |
| LR− | 0.2 | 0.4 | 0.2 | 0.1 |
| No. of biopsy correctly avoided (%) | 39 (37.5) | 40 (38.4) | 56 (53.8) | 43 (41.3) |
| No. of incorrect diagnosis (%) | 11 (10.5) | 16 (15.3) | 11 (9.3) | 3 (2.8) |
| Confirmation strategy | | | | |
| ≥F3 (n=141) | | | | |
| Cutoff value | 10.8 | 10.8 | 10.5 (ALT ≤ULN); 11.0 (ALT 1–5× ULN) | 9.0 (ALT ≤ULN); 12.0 (ALT 1–5× ULN) |
| Sensitivity, % | 24.8 | 24.8 | 63.2 | 66.0 |
| Specificity, % | 92.6 | 92.6 | 90.0 | 87.8 |
| PPV, % | 85.4 | 85.4 | 91.0 | 89.6 |
| NPV, % | 41.4 | 41.4 | 60.4 | 61.7 |
| LR+ | 3.4 | 3.4 | 6.3 | 5.4 |
| LR− | 0.8 | 0.8 | 0.4 | 0.4 |
| No. of biopsy correctly avoided (%) | 35 (24.8) | 35 (24.8) | 91 (64.5) | 95 (67.3) |
| No. of incorrect diagnosis (%) | 8 (5.6) | 8 (5.6) | 9 (6.3) | 11 (7.8) |
| F4 (n=118) | | | | |
| Cutoff value | 10.7 | 11.1 | 12.0 (ALT ≤ULN); 13.0 (ALT 1–5× ULN) | 12.0 (ALT ≤ULN); 12.0 (ALT 1–5× ULN) |
| Sensitivity, % | 28.8 | 21.2 | 59.2 | 60.0 |
| Specificity, % | 91.3 | 91.3 | 91.2 | 89.5 |
| PPV, % | 79.1 | 73.5 | 87.7 | 85.7 |
| NPV, % | 53.1 | 50.5 | 68.0 | 68.0 |
| LR+ | 3.3 | 2.8 | 6.7 | 5.7 |
| LR− | 0.8 | 0.8 | 0.4 | 0.4 |
| No. of biopsy correctly avoided (%) | 34 (28.8) | 25 (21.1) | 71 (60.1) | 72 (61.0) |
| No. of incorrect diagnosis (%) | 9 (7.6) | 9 (7.6) | 10 (8.4) | 12 (10.1) |

6. Diagnostic performance of the sequential and concurrent LS-ELF algorithm to predict ≥F3 and F4

The diagnostic performances of the sequential LS-ELF algorithm and concurrent LS-ELF algorithms assessed using strategies by Castéra et al.10 and Boursier et al.30 to predict ≥F3 and F4 are shown in Table 4. When the sequential LS-ELF algorithm was applied using a confirmation and exclusion strategy, 69.4% to 72.5% and 60.8% to 65.3% of patients could avoid LB to diagnose ≥F3 and F4, respectively, according to internal and external cutoff values. The proportion of patients who could avoid LB to diagnose ≥F3 and F4, when the concurrent LS-ELF algorithm based on the strategies of Castéra and Boursier was applied, are also summarized in Table 4.

Table 4.

Diagnostic Performance of the Sequential and Concurrent LS-ELF Algorithms to Predict ≥F3 and F4

| Variable | Sequential combination, internal cutoff | Sequential combination, external cutoff | Concurrent combination, Castéra’s strategy10, internal cutoff | Concurrent combination, Castéra’s strategy10, external cutoff | Concurrent combination, Boursier’s strategy30 |
| ≥F3 | | | | | |
| Sensitivity, % | 97.0 | 97.4 | 66.7 | 100.0 | 97.9 |
| Specificity, % | 67.8 | 40.2 | 90.1 | 55.6 | 88.9 |
| PPV, % | 81.5 | 63.6 | 92.2 | 79.7 | 93.9 |
| NPV, % | 93.8 | 93.5 | 60.8 | 100.0 | 96.0 |
| LR+ | 3.0 | 1.6 | 6.8 | 2.25 | 8.8 |
| LR− | 0.0 | 0.1 | 0.4 | 0.00 | 0.0 |
| No. of biopsy correctly avoided (%) | 154 (69.4) | 161 (72.5) | 94 (42.3) | 131 (59.0) | 128 (57.7) |
| No. of incorrect diagnosis (%) | 36 (16.2) | 24 (10.8) | 8 (3.6) | 36 (16.2) | 12 (5.4) |
| F4 | | | | | |
| Sensitivity, % | 86.4 | 97.2 | 44.9 | 100.0 | 96.6 |
| Specificity, % | 84.6 | 51.9 | 88.5 | 63.5 | 90.4 |
| PPV, % | 86.4 | 77.8 | 81.5 | 75.6 | 91.9 |
| NPV, % | 84.6 | 91.3 | 58.6 | 100.0 | 95.9 |
| LR+ | 5.6 | 2.0 | 3.9 | 2.7 | 10.0 |
| LR− | 0.2 | 0.1 | 0.6 | 0.0 | 0.0 |
| No. of biopsy correctly avoided (%) | 145 (65.3) | 135 (60.8) | 53 (23.9) | 96 (43.2) | 110 (49.5) |
| No. of incorrect diagnosis (%) | 36 (16.2) | 32 (14.4) | 12 (5.4) | 38 (17.1) | 14 (6.3) |

The proportion of patients with correctly avoided LB in predicting ≥F3 according to the sequential LS-ELF algorithm (69.4% by internal cutoffs and 72.5% by external cutoffs) was significantly higher than the proportion of those with correctly avoided LB in predicting ≥F3 according to the concurrent LS-ELF algorithm by Castéra et al.10 (42.3% using internal cutoffs and 59.0% using external cutoffs) and Boursier et al.30 (57.7%) (all p<0.05 by McNemar test, with the exception of the borderline statistical significance between Boursier’s strategy and the current study using external cutoff values [p=0.057]). Similarly, the proportion of patients with correctly avoided LB in predicting F4 was significantly higher with the sequential LS-ELF algorithm than the concurrent algorithms (all p<0.05 by McNemar test except for the borderline statistical significance between Castéra’s strategy and the current study using internal cutoff values [p=0.059]). When ULNKorea was used, similar findings were observed (data not shown).

7. Schematic diagram of the sequential LS-ELF algorithm

A schematic diagram of the sequential LS-ELF algorithm is shown in Fig. 2. Using the internal cutoff values, 83 patients were excluded from ≥F3 (58 with correct exclusion and 25 with incorrect exclusion). In addition, 107 patients were confirmed to have ≥F3 (96 with correct confirmation and 11 with incorrect confirmation) (Fig. 2A). Regarding an F4 diagnosis, 87 patients were excluded (59 with correct exclusion and 28 with incorrect exclusion). Furthermore, among the 94 patients confirmed with F4, 86 had a correct diagnosis and eight had an incorrect diagnosis (Fig. 2B). Diagrams of diagnoses using external cutoff values to predict ≥F3 and F4 are shown in Fig. 2C and D. In addition, all of the diagnostic diagrams using ULNKorea showed similar patterns in terms of predicting ≥F3 and F4 (Supplementary Fig. 2).
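The counts in the paragraph above can be tallied to recover the headline proportions for the internal cutoffs. The sketch below is plain arithmetic on the numbers quoted in the text; the function name is hypothetical, and the number still requiring LB is derived here by subtraction rather than reported.

```python
def avoided_biopsy_summary(n_total: int,
                           excluded_correct: int, excluded_incorrect: int,
                           confirmed_correct: int, confirmed_incorrect: int) -> dict:
    """Tally correct and incorrect non-invasive diagnoses for one target stage."""
    correct = excluded_correct + confirmed_correct
    incorrect = excluded_incorrect + confirmed_incorrect
    biopsy = n_total - correct - incorrect   # derived: neither excluded nor confirmed
    return {
        "correctly_avoided": correct,
        "correctly_avoided_pct": round(100 * correct / n_total, 1),
        "incorrect": incorrect,
        "incorrect_pct": round(100 * incorrect / n_total, 1),
        "biopsy_needed": biopsy,
    }


# Counts quoted in the text for the internal cutoffs (n=222).
f3 = avoided_biopsy_summary(222, 58, 25, 96, 11)
f4 = avoided_biopsy_summary(222, 59, 28, 86, 8)
```

For ≥F3 this gives 154 correctly avoided biopsies (69.4%) and 36 incorrect diagnoses (16.2%); for F4 it gives 145 (65.3%) and 36 (16.2%), which agrees with the sequential internal-cutoff column of Table 4.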

Fig. 2.

Prediction of ≥F3 and F4 stages using the combined LS-ELF algorithm with internal (A, B) and external cutoff values (C, D).

LS, liver stiffness; ELF, enhanced liver fibrosis; CHB, chronic hepatitis B.

8. Prognostic value of LS-ELF algorithm

When the study population was divided into two groups according to the combined LS-ELF algorithm using the internal cutoff values for prediction of ≥F3, a significantly higher incidence of HCC was observed in patients with advanced fibrosis confirmed (n=107) than in patients with advanced fibrosis excluded (n=83) (p<0.001 by log-rank test) (Supplementary Fig. 3). When the internal cutoff value for prediction of F4 and external cutoff values for prediction of ≥F3 and F4 were used, similar findings were also observed (data not shown).

DISCUSSION

Although TE can accurately assess the degree of liver fibrosis, its diagnostic performance for detecting early stage liver fibrosis is unsatisfactory.19 Similarly, ELF can reliably predict the degree of liver fibrosis.15–18 Recently, Wong et al.23 proposed a sequential diagnostic algorithm using LS and ELF values, which was effective for avoiding LB in CHB patients. In our study, we found that the LS-ELF algorithm prevented unnecessary LB in 69.4% to 72.5% of patients with advanced liver fibrosis, similar to the results of a Hong Kong study (61% to 66%). Moreover, the sequential LS-ELF algorithm was significantly superior to the concurrent use of two surrogates in terms of avoiding LB in CHB patients.

This study had several clinical strengths. First, we validated the diagnostic performance of a recently proposed LS-ELF algorithm. The diagnostic accuracy of the LS value for predicting liver fibrosis was similar in our cohort and the Hong Kong study23 (AUC, 0.857 to 0.887 vs 0.82 to 0.83), whereas the accuracy of the ELF test in our cohort was higher than that in the Hong Kong study23 (AUC, 0.703 to 0.802 vs 0.59 to 0.69). Nevertheless, our data demonstrated that the combined use of LS and ELF values improves diagnostic accuracy, which is consistent with Wong et al.23 Because ELF shows better performance in predicting early stage fibrosis (≥F2) (AUC: 0.802 in our study and 0.82 to 0.90 in previous studies16–18) and LS shows better performance in predicting advanced-stage fibrosis (≥F3) (AUC: 0.887 in our study and 0.83 to 0.90 in previous studies23,32), their combined use enhances the overall accuracy of assessment of fibrotic burden. Similarly, the combined use of two surrogates, such as the combination of FT and LS by Castéra et al.10 in CHC patients and the combination of FibroMeter and LS values by Boursier et al.,30 improves diagnostic accuracy compared to use of either marker alone, which supports our findings. In addition, we demonstrated the prognostic value of the LS-ELF algorithm by showing a significantly higher incidence rate of HCC in patients in whom advanced fibrosis or cirrhosis was confirmed than in those in whom it was excluded. Although validation studies in other ethnic groups are required, the LS-ELF algorithm can be applied to CHB patients in Asian countries, which have a high prevalence of HBV.

Second, our study revealed that the sequential use of LS and ELF values is superior to their concurrent use. According to the sequential LS-ELF algorithm, the proportion of patients with correctly avoided LB in predicting ≥F3 (69.4% to 72.5% vs 42.3% to 59.0%) and F4 (60.8% to 65.3% vs 23.9% to 49.5%) was significantly higher than that with concurrent use of LS and ELF. Although further prospective studies confirming our results and addressing the cost-efficiency of this diagnostic algorithm are required, a sequential strategy may potentially lead to cost savings. That is, the concurrent use of the LS value and the ELF test to screen all CHB patients incurs greater expense than the sequential measurement of the two surrogates.

Third, we validated the LS-ELF algorithm in various clinical settings to investigate the potential influence of clinical confounders. To determine whether the main results were influenced by the specific characteristics of our cohort, we tested the LS-ELF algorithm using both internal cutoff values and external cutoff values from Hong Kong,23 as well as the different ULNs of the ALT level used in Hong Kong (67 IU/L for males and 55 IU/L for females) and Korea (40 IU/L for both sexes). The overall results were not influenced by these potential confounders, which supports the applicability of the LS-ELF algorithm. However, due to the different distributions of fibrosis stage and ALT level between our work and the Hong Kong study, further research is required.

In our study, the internal cutoff values of LS according to fibrosis stage were 9.0 to 11.0 kPa and those of ELF were 8.4 to 9.8, similar to previous reports.23,25,28,29 Although the LS cutoff increased with increasing ALT level, LS values were not correlated with ALT levels (p=0.067); therefore, ALT had only minimal effects on the internal cutoff values of LS and ELF in our study, in contrast to previous reports.31,33–35 This might be because the influence of a high ALT level was attenuated by the exclusion of CHB patients with ALT levels >5×ULN. The LS results of CHB patients with acute exacerbation are not reliable for diagnosing cirrhosis,34 and the cutoff LS values of patients with ALT levels 1–5× ULN were higher than those of patients with normal ALT levels.31,33 However, a recent multicenter study reported that the overestimating influence of an ALT 1–10× ULN was modest, as the accuracy of LS values was similar irrespective of ALT adjustment.35 Because of this controversy, we stratified the patients according to ALT level (ALT ≤ULN vs ALT 1–5× ULN) and ULN criteria (ULN vs ULNKorea), but ALT level had no significant effects on the results. In this study, ELF values did not correlate with ALT levels, and thus different ELF cutoff values according to ALT level could not be used.

We are aware of several issues with this study. First, because of its retrospective design, the study was subject to potential selection bias. Although we confirmed the diagnostic accuracy of the LS-ELF algorithm, histological information could not be obtained for all of the patients who started antiviral therapy during the study period. Thus, further prospective validation of the LS-ELF algorithm is required before it can be applied in clinical practice. In addition, although we excluded patients with liver samples shorter than 1.5 cm, a threshold chosen both to secure more reliable pathological interpretation and to retain as large a sample size as possible, the relatively insufficient quality of the liver samples might have limited the accuracy of histological assessment of liver fibrosis. Second, because validation of the LS-ELF algorithm was restricted to patients with ALT <5×ULN to prevent the confounding influence of high ALT levels on LS values, it cannot be recommended for patients with ALT values >5×ULN. However, most of the unusually high LS values during acute liver injury showed a progressive and rapid decrease in parallel with a decrease in ALT levels.34 Thus, the applicability of the LS-ELF algorithm after stabilizing the ALT level by antiviral or conservative therapy should be investigated. Third, probably because of the low NPV of the exclusion strategy, our study showed higher rates of incorrect diagnosis among patients who avoided LB compared to Wong et al.23 (NPV, 69.9% to 83.1% in this work vs 90% in Wong’s study). Fourth, this study had a potential spectrum bias, with a high proportion of patients with cirrhosis (53.2%). In contrast, in Wong et al.,23 15% of patients had F3 fibrosis and 25% had cirrhosis. However, despite the different fibrosis stage distribution between Korea and Hong Kong, the LS-ELF algorithm consistently performed better than either LS or ELF alone. Lastly, although the sequential LS-ELF diagnostic algorithm exhibited enhanced overall diagnostic accuracy, it is associated with higher costs than LS alone. Thus, a cost-efficiency analysis of this algorithm is warranted.

In conclusion, the sequential LS-ELF algorithm conferred a higher probability of avoiding LB in CHB patients when diagnosing advanced fibrosis and cirrhosis, and it performed significantly better than the concurrent combination. However, further prospective validation studies are required before this algorithm can be used clinically.

ACKNOWLEDGEMENTS

This study was supported in part by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Science, ICT & Future Planning (2016R1A1A1A05005138).

Footnotes

CONFLICTS OF INTEREST

No potential conflict of interest relevant to this article was reported.

REFERENCES
