Deep learning supported discovery of biomarkers for clinical prognosis of liver cancer

Data availability

The TCGA diagnostic whole-slide data and corresponding clinical information are available from the NIH Genomic Data Commons (https://portal.gdc.cancer.gov/projects/TCGA-LIHC). The PAIP histology data and corresponding annotations are available from the Pathology AI Platform 2019 challenge (https://paip2019.grand-challenge.org/Dataset/). Restrictions apply to the availability of the QHCG data, including the WSIs and the generated PaSegNet dataset, which were used with institutional permission through institutional review board approval for the current study and are thus not publicly available. Please email all requests for academic use of raw and processed data to the corresponding author. All requests will be evaluated on the basis of institutional and departmental policies to determine whether the data requested are subject to intellectual property or patient privacy obligations. Data can only be shared for non-commercial academic purposes and will require a formal material transfer agreement. Source data are provided with this paper.

Code availability

All code was implemented in Python using PyTorch as the primary DL package. All code and scripts to reproduce the experiments of this paper are available at https://github.com/Biooptics2021/PathFinder. The code is also available at https://zenodo.org/record/7628549 (ref. 55).

References

Kather, J. N. & Calderaro, J. Development of AI-based pathology biomarkers in gastrointestinal and liver cancer. Nat. Rev. Gastroenterol. Hepatol. https://doi.org/10.1038/s41575-020-0343-3 (2020).
Ludwig, J. A. & Weinstein, J. N. Biomarkers in cancer staging, prognosis and treatment selection. Nat. Rev. Cancer 5, 845–856 (2005).
Bosman, F. T. & True, L. D. Prognostic biomarkers: an introduction. Virchows Arch. 464, 253–256 (2014).
Mandalà, M. & Massi, D. Tissue prognostic biomarkers in primary cutaneous melanoma. Virchows Arch. 464, 265–281 (2014).
Hamilton, P. W. et al. Digital pathology and image analysis in tissue biomarker research. Methods 70, 59–73 (2014).
Bera, K., Schalper, K. A., Rimm, D. L., Velcheti, V. & Madabhushi, A. Artificial intelligence in digital pathology—new tools for diagnosis and precision oncology. Nat. Rev. Clin. Oncol. 16, 703–715 (2019).
Kleppe, A. et al. Designing deep learning studies in cancer diagnostics. Nat. Rev. Cancer 21, 199–211 (2021).
Courtiol, P. et al. Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat. Med. 25, 1519–1525 (2019).
Shi, J.-Y. et al. Exploring prognostic indicators in the pathological images of hepatocellular carcinoma based on deep learning. Gut https://doi.org/10.1136/gutjnl-2020-320930 (2020).
Saillard, C. et al. Predicting survival after hepatocellular carcinoma resection using deep learning on histological slides. Hepatology 72, 2000–2013 (2020).
Chen, R. J. et al. Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis. IEEE Trans. Med. Imaging 0062, 1–1 (2020).
Chen, R. J. et al. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 40, 865–878 (2022).
Watson, D. S. et al. Clinical applications of machine learning algorithms: beyond the black box. Br. Med. J. 364, 10–13 (2019).
Boehm, K. M., Khosravi, P., Vanguri, R., Gao, J. & Shah, S. P. Harnessing multimodal data integration to advance precision oncology. Nat. Rev. Cancer 22, 114–126 (2022).
Vasey, B. et al. DECIDE-AI: new reporting guidelines to bridge the development-to-implementation gap in clinical artificial intelligence. Nat. Med. 27, 186–187 (2021).
Kundu, S. AI in medicine must be explainable. Nat. Med. 27, 1328 (2021).
Collins, G. S. & Moons, K. G. M. Reporting of artificial intelligence prediction models. Lancet 393, 1577–1579 (2019).
Elmarakeby, H. A. et al. Biologically informed deep neural network for prostate cancer discovery. Nature 598, 348–352 (2021).
Barredo Arrieta, A. et al. Explainable Artificial Intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 58, 82–115 (2020).
Gunning, D. et al. XAI—Explainable artificial intelligence. Sci. Robot. 4, eaay7120 (2019).
Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1, 206–215 (2019).
Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).
Bulten, W. et al. Automated deep-learning system for Gleason grading of prostate cancer using biopsies: a diagnostic study. Lancet Oncol. 21, 233–241 (2020).
Skrede, O. J. et al. Deep learning for prediction of colorectal cancer outcome: a discovery and validation study. Lancet 395, 350–360 (2020).
Xie, S., Girshick, R., Dollár, P., Tu, Z. & He, K. Aggregated residual transformations for deep neural networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 5987–5995 (2017). https://doi.org/10.1109/CVPR.2017.634
Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. 2nd Int. Conf. Learn. Represent. ICLR 2014 - Work. Track Proc. 1–8 (2014).
McShane, L. M. et al. REporting recommendations for tumour MARKer prognostic studies (REMARK). Br. J. Cancer 93, 387–391 (2005).
Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 68, 394–424 (2018).
Shaban, M. et al. A novel digital score for abundance of tumour infiltrating lymphocytes predicts disease free survival in oral squamous cell carcinoma. Sci. Rep. 9, 1–13 (2019).
Tellez, D., Litjens, G., Van Der Laak, J. & Ciompi, F. Neural image compression for gigapixel histopathology image analysis. IEEE Trans. Pattern Anal. Mach. Intell. 43, 567–578 (2021).
Davies, A. et al. Advancing mathematics by guiding human intuition with AI. Nature 600, 70–74 (2021).
Bijelic, L. & Rubio, E. R. Tumor necrosis in hepatocellular carcinoma—unfairly overlooked? Ann. Surg. Oncol. 28, 600–601 (2021).
Wei, T. et al. Tumor necrosis impacts prognosis of patients undergoing curative-intent hepatocellular carcinoma. Ann. Surg. Oncol. 28, 797–805 (2021).
Ling, Y. H. et al. Tumor necrosis as a poor prognostic predictor on postoperative survival of patients with solitary small hepatocellular carcinoma. BMC Cancer 20, 1–9 (2020).
Vakkila, J. & Lotze, M. T. Inflammation and necrosis promote tumour growth. Nat. Rev. Immunol. 4, 641–648 (2004).
Minervini, A. et al. Prognostic role of histological necrosis for nonmetastatic clear cell renal cell carcinoma: correlation with pathological features and molecular markers. J. Urol. 180, 1284–1289 (2008).
Trentin, L. et al. Tumour-infiltrating lymphocytes bear the 75 kDa tumour necrosis factor receptor. Br. J. Cancer 71, 240–245 (1995).
Mercan, E. et al. Assessment of machine learning of breast pathology structures for automated differentiation of breast cancer and high-risk proliferative lesions. JAMA Netw. Open 2, 1–11 (2019).
Javed, S., Mahmood, A., Werghi, N., Benes, K. & Rajpoot, N. Multiplex cellular communities in multi-gigapixel colorectal cancer histology images for tissue phenotyping. IEEE Trans. Image Process. 29, 1–1 (2020).
Wu, R. et al. Comprehensive analysis of spatial architecture in primary liver cancer. Sci. Adv. 7, eabg3750 (2021).
Liu, Y. et al. High-spatial-resolution multi-omics sequencing via deterministic barcoding in tissue. Cell 183, 1665–1681 (2020).
Xie, W. et al. Prostate cancer risk stratification via non-destructive 3D pathology with deep learning-assisted gland analysis. Cancer Res. https://doi.org/10.1158/0008-5472.can-21-2843 (2021).
Dehaene, O., Camara, A., Moindrot, O., de Lavergne, A. & Courtiol, P. Self-supervision closes the gap between weak and strong supervision in histology. Preprint at arXiv https://doi.org/10.48550/arXiv.2012.03583 (2020).
Whang, S. E., Roh, Y., Song, H. & Lee, J.-G. Data collection and quality challenges in deep learning: a data-centric AI perspective. VLDB J. https://doi.org/10.1007/s00778-022-00775-9 (2023).
Yamashita, R. et al. Deep learning model for the prediction of microsatellite instability in colorectal cancer: a diagnostic study. Lancet Oncol. 22, 132–141 (2021).
Kather, J. N. et al. Predicting survival from colorectal cancer histology slides using deep learning: a retrospective multicenter study. PLoS Med. 16, 1–22 (2019).
Otsu, N. A threshold selection method from gray level histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66 (1979).
Deng, J. et al. ImageNet: a large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (2009). https://doi.org/10.1109/CVPR.2009.5206848
Vahadane, A. et al. Structure-preserving color normalization and sparse stain separation for histological images. IEEE Trans. Med. Imaging 35, 1962–1971 (2016).
Verweij, P. J. M. & Van Houwelingen, H. C. Penalized likelihood in Cox regression. Stat. Med. 13, 2427–2436 (1994).
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. 34th Int. Conf. Mach. Learn. ICML 2017 7, 5109–5118 (2017).
Xu, K. et al. Show, attend and tell: neural image caption generation with visual attention. 32nd Int. Conf. Mach. Learn. ICML 2015 37, 2048–2057 (2015).
Horn, H. S. Measurement of ‘overlap’ in comparative ecological studies. Am. Nat. 100, 419–424 (1966).
Kokhlikyan, N. et al. Captum: a unified and generic model interpretability library for PyTorch An Overview of the Algorithms. Preprint at arXiv https://doi.org/10.48550/arXiv.2009.07896 (2020).
Liang, J. & Kong, L. PathFinder. Zenodo https://doi.org/10.5281/zenodo.7628549 (2023).

Acknowledgements

We thank Y. Gao, S. Yang and X. Chen for helpful comments on the manuscript. The study by L.K. and J.L. was partially supported by the STI2030-Major Projects (no. 2022ZD0212000), National Natural Science Foundation of China (NSFC) (nos. 61831014 and 32021002), Tsinghua-Foshan Innovation Special Fund (TFISF) (no. 2021THFS0207) and the Guoqiang Institute, Tsinghua University (no. 2021GQG1024). Y.X. was supported by the Beijing Tsinghua Changgung Hospital Fund (no. 12021C1009).

Author information

Author notes

Meilong Wu
Present address: Division of Hepatobiliary and Pancreas Surgery, Department of General Surgery, Shenzhen People’s Hospital, The Second Clinical Medical College, Jinan University, Shenzhen, China
Meilong Wu
Present address: Division of Hepatobiliary and Pancreas Surgery, Department of General Surgery, The First Affiliated Hospital, Southern University of Science and Technology, Shenzhen, China

Authors and Affiliations

State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing, China
Junhao Liang, Weisheng Zhang & Lingjie Kong
Department of Pathology, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
Jianghui Yang, Hongfang Yin & Ying Xiao
School of Clinical Medicine, Tsinghua University, Beijing, China
Meilong Wu
Department of Automation, Tsinghua University, Beijing, China
Qionghai Dai
IDG/McGovern Institute for Brain Research, Tsinghua University, Beijing, China
Lingjie Kong

Authors

Junhao Liang
Weisheng Zhang
Jianghui Yang
Meilong Wu
Qionghai Dai
Hongfang Yin
Ying Xiao
Lingjie Kong

Contributions

L.K. and J.L. conceived the idea. L.K. supervised the project. J.L. and Y.X. performed the experiments. Y.X., Y.J. and W.M. curated the QHCG dataset. J.L., Y.X. and W.Z. analysed the results. Q.D. and H.Y. provided helpful discussions on the project design. J.L. and L.K. prepared the manuscript with inputs from all co-authors.

Corresponding authors

Correspondence to Qionghai Dai, Hongfang Yin, Ying Xiao or Lingjie Kong.

Ethics declarations

Competing interests

The authors declare that they have no competing financial interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Jiguang Wang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 The macro mode and micro mode in our model.

Macro mode mainly focuses on global information at low WSI resolution. In this case, the spatial distribution of different tissue types is included, while high-resolution cell morphology is discarded. In contrast, micro mode mainly focuses on region-level information at high spatial resolution. In this case, high-resolution cell morphology is included, while the spatial distribution of tissue and contextual information are ignored. Scale bar: 100 μm.

Extended Data Fig. 2 The gap between pathological annotation and deep learning, and the pipeline of meta annotation.

a, The distributions of data points and decision boundaries in latent feature space in different situations. In an ideal situation, DL can learn an ideal decision boundary from sufficient, class-balanced data. However, the actual data distribution is often not clear. The observed dataset usually either has noisy labels near the decision boundary, which makes the learned decision boundary jitter around the ideal boundary, or is class-unbalanced, which makes the learned decision boundary deviate from the ideal boundary. A meta-annotated dataset, which collects a small number of representative data points in each class, can still bring the decision boundary close to the ideal boundary. b, Conventional pathological annotation method. It usually takes a long time to complete pixel-level annotation of complex tissues. Scale bar, 500 μm. c, WSI example. Scale bar, 2000 μm. d, The borders or interiors of tumor regions annotated by conventional methods still contain other types of tissue. Scale bar, 200 μm. e, An example of regions that are very difficult to annotate. Multiple classes of tissue are mixed together. Scale bar, 100 μm. f, The tumor morphologies at different spatial locations of the WSI are similar. Scale bar, 100 μm. g, The pipeline of our proposed meta annotation. Scale bar: 100 μm (above), 200 μm (below).

Extended Data Fig. 3 Summary of study design and data usage.

a, Information on the datasets. b, Training and validation of PaSegNet. c, Acquiring macro mode and micro mode by WSI decoupling and sparsification. d, Ten-fold cross-validation of the prognosis networks on the TCGA dataset. e, Generalization ability test. The prognosis networks were first trained on the TCGA dataset and then tested on the QHCG dataset. f, Discovery, characterization, and verification of new biomarkers. g, Exploration of macro mode robustness and the selection rule for multiple WSIs.

Extended Data Fig. 4 Neural network architectures and detailed processes of various modes.

a, The process of obtaining probability heatmaps, segmentation maps, and tumor patches based on PaSegNet. b, c, d, Neural network architectures and detailed processes of MacroNet (b), MicroNet (c) and M2MNet (d), respectively. e, The detailed architecture of each neural network module in the model.

Extended Data Fig. 5 Segmentation results.

a, Segmentation results of QHCG WSIs. b, Segmentation results of TCGA WSIs. c, Segmentation results of PAIP WSIs. d, Segmentation results of small key lesion regions. Left, necrosis regions and corresponding probability heatmap. Scale bar, 250 μm. Right, tumor regions and corresponding probability heatmap. Scale bar, 1 mm. TUM, tumor; Nor, normal; FIB, fibrosis; INF, inflammation; NEC, necrosis; REA, bile duct reaction; STE, steatosis.

Extended Data Fig. 6 Survival and recurrence analyses on the TCGA and QHCG datasets, and the correlation maps of clinical parameters.

a, b, Kaplan-Meier analyses of patient stratification of low and high death risk patients via M2MNet on TCGA dataset (a) and QHCG dataset (b). c-f, Kaplan-Meier analyses of patient stratification of low and high recurrence risk patients via M2MNet (c), MacroNet (d), TND (e), and NEC (f) on QHCG dataset. g-i, Multivariable analyses of factors associated with recurrence and MacroNet (g), TND (h), and NEC (i) on QHCG dataset (n = 83 patients); the data are presented as hazard ratio estimates (squares) and the error bars show the 95%-confidence interval of the hazard ratio estimate, according to multivariable Cox proportional hazards model. The results of univariate, multivariate analyses, and the abbreviations of each variable are detailed in Supplementary Table 3. j, k, Correlation maps of clinical parameters on TCGA dataset (j) and QHCG dataset (k). P values according to two-sided log-rank test (a-f) and multivariable Cox proportional hazards model (g-i). n, sample size; HR, hazard ratio; Stage, AJCC staging; TIL, tumor infiltrating lymphocytes digital score; BDT, bile duct thrombosis; AFP, alpha-fetoprotein; MVI, microvascular invasion.
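
The Kaplan-Meier stratification described in panels a-f can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the toy cohort, and the use of a median split to form the low and high risk groups are all assumptions made for illustration, and the log-rank test is omitted.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  -- follow-up time per patient
    events -- 1 if the event (death or recurrence) was observed, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        # Patients still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Events observed exactly at time t.
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve


def split_by_median(risk_scores):
    """Label a patient 'high' if the risk score exceeds the cohort median (assumed rule)."""
    cutoff = median(risk_scores)
    return ["high" if r > cutoff else "low" for r in risk_scores]


# Hypothetical toy cohort, not data from the study.
risk = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
time = [5, 8, 12, 20, 25, 30]
event = [1, 1, 0, 1, 0, 0]

groups = split_by_median(risk)
for name in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == name]
    curve = kaplan_meier([time[i] for i in idx], [event[i] for i in idx])
    print(name, curve)
```

Each curve drops only at observed event times; censored patients leave the at-risk set without lowering the survival estimate.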

Extended Data Fig. 7 Quantification analysis of macro mode, and the indicator distributions among all WSIs.

a, Quantification of tissue fraction on TCGA dataset (n = 330 patients). b, Quantification of TIL on TCGA dataset (n = 330 patients). c, Quantification of tissue fraction on QHCG dataset (n = 83 patients). d, Quantification of TIL on QHCG dataset (n = 83 patients). e, Distribution of NEC score from different WSIs of the same patient. f, Distribution of TND score from different WSIs of the same patient. a-d, The median risk score value is taken as the cutoff value between the high risk group and the low risk group; the significance level shown is determined using a two-sided Mann-Whitney-Wilcoxon test; boxplot whiskers extend to the smallest and largest value within 1.5 times the interquartile range of the hinges, and box centre and hinges indicate median and first and third quartiles, respectively. TIL, tumor infiltrating lymphocytes digital score; TUM, tumor; Nor, normal; FIB, fibrosis; INF, inflammation; NEC, necrosis; REA, bile duct reaction; STE, steatosis.
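
The boxplot convention stated in this caption (hinges at the first and third quartiles, whiskers reaching the most extreme values within 1.5 times the interquartile range of the hinges) can be sketched as follows. The function name and the sample values are hypothetical, and the quartile definition (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption, since plotting packages differ in how they interpolate quartiles.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Compute box centre, hinges, whisker ends and outliers for one group."""
    # Quartile definition is an assumption; other tools may interpolate differently.
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        # Whiskers stop at the most extreme data points inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


# Hypothetical scores for one risk group, not data from the study.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(stats)
```

In this toy example the value 30 lies beyond the upper fence, so the upper whisker ends at 8 and 30 is reported as an outlier.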

Extended Data Fig. 8 The localization results of corresponding pathological features of TND and NEC.

a, TND heatmaps and pathological features of its localization. b, NEC heatmaps and pathological features of its localization. The zoom-in views of pathological slides are from the heatmaps labelled in black and red boxes. Scale bar: 500 μm.

Extended Data Fig. 9 Robustness of TND under different segmentation accuracies.

a, TND scores calculated for each patient based on segmentation results generated by 11 CNNs. The TND scores corresponding to ResNeXt50 (the CNN used in this study) are marked with an opaque blue asterisk. Patients are ranked based on TND scores corresponding to ResNeXt50. b, Classification performance, segmentation results, TND heatmaps, and prognostic performance of different CNNs. Histograms include recall, precision, and F1-score for each CNN’s ‘tumor’ category tested on QHCG test set, as well as TND prognostic performance (C-Index) based on segmentation maps generated by each CNN. c, Prognostic performance distributions of different CNNs (n = 11 networks). Boxplot whiskers extend to the smallest and largest value within 1.5 times the interquartile ranges of hinges, and box centre and hinges indicate median and first and third quartiles, respectively.
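
The C-Index used here as the measure of prognostic performance can be illustrated with a small sketch of Harrell's concordance index. This is a simplified version under stated assumptions, not the authors' code: the function name and toy values are hypothetical, pairs with tied follow-up times are ignored, and tied risk scores count as half concordant.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (simplified sketch).

    times  -- follow-up time per patient
    events -- 1 if the event was observed, 0 if censored
    risks  -- predicted risk score; higher should mean a shorter time to event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy values, not data from the study.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to a risk score that orders every comparable pair correctly.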

Extended Data Fig. 10 Robustness of NEC under different segmentation accuracies.

a, NEC scores calculated for each patient based on segmentation results generated by 11 CNNs. The NEC scores corresponding to ResNeXt50 (the CNN used in this study) are marked with an opaque blue asterisk. Patients are ranked based on NEC scores corresponding to ResNeXt50. b, Classification performance, segmentation results, NEC heatmaps, and prognostic performance of different CNNs. Histograms include recall, precision, and F1-score for each CNN’s ‘necrosis’ category tested on QHCG test set, as well as NEC prognostic performance (C-Index) based on segmentation maps generated by each CNN. c, Prognostic performance distributions of different CNNs (n = 11 networks). Boxplot whiskers extend to the smallest and largest value within 1.5 times the interquartile ranges of hinges, and box centre and hinges indicate median and first and third quartiles, respectively.
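
The per-category recall, precision, and F1-score reported in panel b follow the standard one-versus-rest definitions. A minimal sketch is given below; the function name and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something stated in the paper.

```python
def class_metrics(y_true, y_pred, positive):
    """Precision, recall and F1-score for one class, treated one-versus-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Zero-denominator cases return 0.0 (an assumed convention).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical patch labels, not data from the study.
y_true = ["NEC", "NEC", "NEC", "TUM", "TUM", "NOR"]
y_pred = ["NEC", "NEC", "TUM", "NEC", "TUM", "NOR"]
print(class_metrics(y_true, y_pred, positive="NEC"))
```

Here two of three necrosis patches are recovered and two of three necrosis predictions are correct, so precision, recall, and F1-score all equal 2/3.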

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liang, J., Zhang, W., Yang, J. et al. Deep learning supported discovery of biomarkers for clinical prognosis of liver cancer. Nat Mach Intell 5, 408–420 (2023). https://doi.org/10.1038/s42256-023-00635-3

Received: 06 April 2022
Accepted: 27 February 2023
Published: 03 April 2023
Issue Date: April 2023
DOI: https://doi.org/10.1038/s42256-023-00635-3