A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study

Authors: Sarah N Dudgeon (1), Si Wen (1), Matthew G Hanna (2), Rajarsi Gupta (3), Mohamed Amgad (4), Manasi Sheth (5), Hetal Marble (6), Richard Huang (6), Markus D Herrmann (7), Clifford H. Szu (8), Darick Tong (8), Bruce Werness (8), Evan Szu (8), Denis Larsimont (9), Anant Madabhushi (10), Evangelos Hytopoulos (11), Weijie Chen (1), Rajendra Singh (12), Steven N. Hart (13), Joel Saltz (3), Roberto Salgado (14), Brandon D Gallas (1) ((1) United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics & Software Reliability, White Oak, MD, (2) Memorial Sloan Kettering Cancer Center, New York, NY, (3) Stony Brook Medicine Dept of Biomedical Informatics, Stony Brook, NY, (4) Department of Pathology, Northwestern University, Rubloff Building, Chicago, Illinois, (5) United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Product Quality and Evaluation, Office of Clinical Evidence and Analysis, Division of Biostatistics, White Oak, MD, (6) Massachusetts General Hospital/Harvard Medical School, Boston, MA, (7) Computational Pathology, Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, (8) Arrive Origin, San Francisco, CA, (9) Department of Pathology, Institut Jules Bordet, Brussels, Belgium, (10) Case Western Reserve University, Cleveland, OH, (11) iRhythm Technologies Inc., San Francisco, CA, (12) Northwell Health and Zucker School of Medicine, New York, NY, (13) Department of Health Sciences Research, Mayo Clinic, Rochester, MN, (14) Division of Research, Peter MacCallum Cancer Centre, Melbourne, Australia, Department of Pathology, GZA-ZNA Hospitals, Antwerp, Belgium)

Abstract: Purpose: In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images (WSIs). We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer. Methods: We digitized 64 glass slides of hematoxylin- and eosin-stained ductal carcinoma core biopsies prepared at a single clinical site. We created training materials and workflows to crowdsource pathologist image annotations in two modes: an optical microscope and two digital platforms. For each region of interest (ROI), the workflows collect the ROI type, a decision on whether the ROI is appropriate for estimating the density of sTILs, and, if appropriate, the sTIL density value for that ROI. Results: The pilot study yielded an abundance of cases with nominal sTIL infiltration. Furthermore, we found that sTIL densities are correlated within a case and that there is notable pathologist variability. Consequently, we outline plans to improve our ROI and case sampling methods. We also outline statistical methods to account for ROI correlations within a case and pathologist variability when validating an algorithm. Conclusion: We have built workflows for efficient data collection and tested them in a pilot study. As we prepare for pivotal studies, we will consider what it will take for the dataset to be fit for a regulatory purpose: study size, patient population, and pathologist training and qualifications. To this end, we will elicit feedback from the FDA via the Medical Device Development Tool program and from the broader digital pathology and AI community. Ultimately, we intend to share the dataset, statistical methods, and lessons learned.

Submission history

From: Brandon Gallas
[v1] Wed, 14 Oct 2020 12:16:07 UTC (471 KB)