Classification of potential nuclei in prostate histology images using shape manifold learning

Gleason grading of prostate histology utilizing manifold regularization via statistical shape model of manifolds

SPIE Proceedings, 2012

Gleason patterns of prostate cancer histopathology, characterized primarily by morphological and architectural attributes of histological structures (glands and nuclei), have been found to be highly correlated with disease aggressiveness and patient outcome. Gleason patterns 4 and 5 are highly correlated with more aggressive disease and poorer patient outcome, while Gleason patterns 1-3 tend to reflect more favorable patient outcome. Because Gleason grading is done manually by a pathologist visually examining glass (or digital) slides, subtle morphologic and architectural differences of histological attributes may result in grading errors and hence cause high inter-observer variability. Recently, some researchers have proposed computerized decision support systems to automatically grade Gleason patterns by using features pertaining to nuclear architecture, gland morphology, and tissue texture. Automated characterization of gland morphology has been shown to distinguish between intermediate Gleason patterns 3 and 4 with high accuracy. Manifold learning (ML) schemes attempt to generate a low dimensional manifold representation of a higher dimensional feature space while simultaneously preserving nonlinear relationships between object instances. Classification can then be performed in the low dimensional space with high accuracy. However, ML is sensitive to the samples contained in the dataset; changes in the dataset may alter the manifold structure. In this paper we present a manifold regularization technique to constrain the low dimensional manifold to a specific range of possible manifold shapes, the range being determined via a statistical shape model of manifolds (SSMM). In this work we demonstrate applications of the SSMM in (1) identifying samples on the manifold which contain noise, defined as those samples which deviate from the SSMM, and (2) accurate out-of-sample extrapolation (OSE) of newly acquired samples onto a manifold constrained by the SSMM.
We demonstrate these applications of the SSMM in the context of distinguishing between Gleason patterns 3 and 4 using glandular morphologic features in a prostate histopathology dataset of 58 patient studies. Identifying and eliminating noisy samples from the manifold via the SSMM results in a statistically significant improvement in classification accuracy (CA), 93.0 ± 1.0% with removal of noisy samples compared to a CA of 90.9 ± 1.1% without removal of samples. The use of the SSMM for OSE of new independent test instances also shows statistically significant improvement in CA, 87.1 ± 0.8% with the SSMM compared to 85.6 ± 0.1% without the SSMM. Similar improvements were observed for the synthetic Swiss Roll and Helix datasets.

Explicit shape descriptors: Novel morphologic features for histopathology classification

Medical Image Analysis, 2013

Object morphology, defined as shape and size characteristics, observed on medical imagery is often an important marker for disease presence and/or aggressiveness. In the context of prostate cancer histopathology, gland morphology is an integral component of the Gleason grading system which enables discrimination between low and high grade disease. However, clinicians are often unable to distinguish between subtle differences in object morphology, as evidenced by high inter-observer variability in Gleason grading. Boundary-based morphologic descriptors, such as the variance in the distance from points on the boundary of an object to its center, may not have the requisite discriminability to separate objects with subtle shape differences. In this paper, we present a set of novel explicit shape descriptors (ESDs) which are capable of distinguishing subtle shape differences between prostate glands of intermediate Gleason grades (grades 3 and 4) on prostate cancer histopathology. Calculation of ESDs involves: (1) representing object morphology using an explicit shape model (e.g. medial axis); (2) aligning the shape models via a non-rigid registration scheme with a diffeomorphic constraint and quantifying shape model dissimilarity; and (3) applying a non-linear dimensionality reduction scheme (e.g. Graph Embedding) to learn a low dimensional projection encoding the shape differences between objects. ESDs are hence the principal eigenvectors in the reduced embedding space. In this work we demonstrate that ESDs in conjunction with a Support Vector Machine classifier are able to correctly distinguish between 888 prostate glands corresponding to different Gleason grades (benign, grade 3, or grade 4) of prostate cancer from 58 needle biopsy specimens with a maximum accuracy of 0.89 and corresponding area under the receiver operating characteristic curve of 0.78.
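
For orientation, the boundary-based descriptor mentioned in this abstract (the variance in the distance from boundary points to the object's center) can be sketched in a few lines. This is a minimal illustration, not the paper's ESD pipeline; the function name `radial_distance_variance` and the two synthetic contours are assumptions made for the example, and the center is taken to be the centroid of the sampled boundary points.

```python
import numpy as np

def radial_distance_variance(boundary):
    """Variance of the distances from boundary points to their centroid.

    boundary: (n_points, 2) array of x, y coordinates along an object contour.
    """
    boundary = np.asarray(boundary, dtype=float)
    center = boundary.mean(axis=0)  # centroid of the sampled boundary points
    radii = np.linalg.norm(boundary - center, axis=1)
    return radii.var()

# Two hypothetical contours: a unit circle and a mildly elongated ellipse.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
ellipse = np.column_stack([1.2 * np.cos(theta), 0.8 * np.sin(theta)])

circle_var = radial_distance_variance(circle)
ellipse_var = radial_distance_variance(ellipse)
print(circle_var, ellipse_var)
```

A single scalar of this kind summarizes the whole contour, which is one way to see why it may fail to separate shapes that differ only subtly.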

Automated gland and nuclei segmentation for grading of prostate and breast cancer histopathology

2008

Automated detection and segmentation of nuclear and glandular structures is critical for classification and grading of prostate and breast cancer histopathology. In this paper, we present a methodology for automated detection and segmentation of structures of interest in digitized histopathology images. The scheme integrates image information from across three different scales: (1) low-level information based on pixel values, (2) high-level information based on relationships between pixels for object detection, and (3) domain-specific information based on relationships between histological structures. Low-level information is utilized by a Bayesian classifier to generate a likelihood that each pixel belongs to an object of interest. High-level information is extracted in two ways: (i) by a level-set algorithm, where a contour is evolved in the likelihood scenes generated by the Bayesian classifier to identify object boundaries, and (ii) by a template matching algorithm, where shape models are used to identify glands and nuclei from the low-level likelihood scenes. Structural constraints are imposed via domain-specific knowledge in order to verify whether the detected objects do indeed belong to structures of interest. In this paper we demonstrate the utility of our glandular and nuclear segmentation algorithm in accurate extraction of various morphological and nuclear features for automated grading of (a) prostate cancer, (b) breast cancer, and (c) distinguishing between cancerous and benign breast histology specimens. The efficacy of our segmentation algorithm is evaluated by comparing breast and prostate cancer grading and benign vs. cancer discrimination accuracies with corresponding accuracies obtained via manual detection and segmentation of glands and nuclei.
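
The low-level step described above, in which a Bayesian classifier turns pixel values into a likelihood scene, can be illustrated with a small sketch. The abstract does not specify the class-conditional models, so the Gaussian intensity models, the equal priors, and the names `gaussian_pdf` and `object_posterior` are all assumptions made for this example rather than the authors' implementation.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Gaussian density evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def object_posterior(image, obj_mean, obj_std, bg_mean, bg_std, obj_prior=0.5):
    """Per-pixel probability of belonging to the object class (Bayes' rule).

    Assumes each class has a Gaussian intensity model (an illustrative choice).
    """
    obj = gaussian_pdf(image, obj_mean, obj_std) * obj_prior
    bg = gaussian_pdf(image, bg_mean, bg_std) * (1.0 - obj_prior)
    return obj / (obj + bg)

# Hypothetical 3x3 grayscale patch: dark pixels (nucleus-like) in the top-left
# corner, bright pixels (background-like) elsewhere.
image = np.array([[0.20, 0.25, 0.80],
                  [0.22, 0.75, 0.85],
                  [0.78, 0.82, 0.90]])

likelihood_scene = object_posterior(image, obj_mean=0.2, obj_std=0.1,
                                    bg_mean=0.8, bg_std=0.1)
print(np.round(likelihood_scene, 3))
```

In a pipeline like the one described, a scene of this kind would then be the input to the level-set and template matching stages.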

Statistical shape model for manifold regularization: Gleason grading of prostate histology

Computer Vision and Image Understanding, 2013

Gleason patterns of prostate cancer histopathology, characterized primarily by morphological and architectural attributes of histological structures (glands and nuclei), have been found to be highly correlated with disease aggressiveness and patient outcome. Gleason patterns 4 and 5 are highly correlated with more aggressive disease and poorer patient outcome, while Gleason patterns 1-3 tend to reflect more favorable patient outcome. Because Gleason grading is done manually by a pathologist visually examining glass (or digital) slides, subtle morphologic and architectural differences of histological attributes, in addition to other factors, may result in grading errors and hence cause high inter-observer variability. Recently, some researchers have proposed computerized decision support systems to automatically grade Gleason patterns by using features pertaining to nuclear architecture, gland morphology, and tissue texture. Automated characterization of gland morphology has been shown to distinguish between intermediate Gleason patterns 3 and 4 with high accuracy. Manifold learning (ML) schemes attempt to generate a low dimensional manifold representation of a higher dimensional feature space while simultaneously preserving nonlinear relationships between object instances. Classification can then be performed in the low dimensional space with high accuracy. However, ML is sensitive to the samples contained in the dataset; changes in the dataset may alter the manifold structure. In this paper we present a manifold regularization technique to constrain the low dimensional manifold to a specific range of possible manifold shapes, the range being determined via a statistical shape model of manifolds (SSMM).
In this work we demonstrate applications of the SSMM in (1) identifying samples on the manifold which contain noise, defined as those samples which deviate from the SSMM, and (2) accurate out-of-sample extrapolation (OSE) of newly acquired samples onto a manifold constrained by the SSMM. We demonstrate these applications of the SSMM in the context of distinguishing between Gleason patterns 3 and 4 using glandular morphologic features in a prostate histopathology dataset of 58 patient studies. Identifying and eliminating noisy samples from the manifold via the SSMM results in a statistically significant improvement in area under the receiver operating characteristic curve (AUC), 0.832 ± 0.048 with removal of noisy samples compared to an AUC of 0.779 ± 0.075 without removal of samples. The use of the SSMM for OSE of newly acquired glands also shows statistically significant improvement in AUC, 0.834 ± 0.051 with the SSMM compared to 0.779 ± 0.054 without the SSMM. Similar results were observed for the synthetic Swiss Roll and Helix datasets.
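
The abstract does not describe how the SSMM is built, so the following is only a loose sketch of the general idea of flagging samples that deviate from a typical manifold shape. It assumes a stack of embeddings of the same samples (for example, from resampled datasets) that have already been aligned to each other, uses the mean embedding as a stand-in for the shape model, and applies a median-based cut-off. The function name `flag_deviating_samples`, the threshold rule, and the synthetic data are all hypothetical.

```python
import numpy as np

def flag_deviating_samples(embeddings, n_std=3.0):
    """Flag samples whose position is unstable across aligned embeddings.

    embeddings: (n_manifolds, n_samples, n_dims) array, assumed pre-aligned.
    Returns a boolean mask of length n_samples.
    """
    mean_shape = embeddings.mean(axis=0)                     # (n_samples, n_dims)
    dists = np.linalg.norm(embeddings - mean_shape, axis=2)  # (n_manifolds, n_samples)
    per_sample = dists.mean(axis=0)                          # mean deviation per sample
    # Robust cut-off: median plus n_std scaled median absolute deviations.
    median = np.median(per_sample)
    mad = np.median(np.abs(per_sample - median))
    return per_sample > median + n_std * 1.4826 * mad

# Synthetic stand-in: 20 aligned 2-D embeddings of 40 samples with small jitter,
# plus one sample (index 5) whose position changes strongly between embeddings.
rng = np.random.default_rng(0)
base = rng.normal(size=(40, 2))
embeddings = base + 0.01 * rng.normal(size=(20, 40, 2))
embeddings[:, 5, :] += rng.normal(size=(20, 2))

flags = flag_deviating_samples(embeddings)
print(np.flatnonzero(flags))
```

In practice, embeddings obtained from separate eigendecompositions differ by sign and rotation, so an alignment step (such as a Procrustes fit) would be needed before any comparison of this kind.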

Cancer diagnosis by nuclear morphometry using spatial information

Methods for extracting quantitative information regarding nuclear morphology from histopathology images have been long used to aid pathologists in determining the degree of differentiation in numerous malignancies. Most methods currently in use, however, employ the naïve Bayes approach to classify a set of nuclear measurements extracted from one patient. Hence, the statistical dependency between the samples (nuclear measurements) is often not directly taken into account. Here we describe a method that makes use of statistical dependency between samples in thyroid tissue to improve patient classification accuracies with respect to standard naïve Bayes approaches. We report results in two sample diagnostic challenges.

Classification of Tumor Histology via Morphometric Context

Proceedings / CVPR, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2013

Image-based classification of tissue histology, in terms of different components (e.g., normal signature, categories of aberrant signatures), provides a series of indices for tumor composition. Subsequently, aggregation of these indices in each whole slide image (WSI) from a large cohort can provide predictive models of clinical outcome. However, the performance of the existing techniques is hindered as a result of large technical and biological variations that are always present in a large cohort. In this paper, we propose two algorithms for classification of tissue histology based on robust representations of morphometric context, which are built upon nuclear level morphometric features at various locations and scales within the spatial pyramid matching (SPM) framework. These methods have been evaluated on two distinct datasets of different tumor types collected from The Cancer Genome Atlas (TCGA), and the experimental results indicate that our methods are (i) extensible to differ...

Out-of-Sample Extrapolation utilizing Semi-Supervised Manifold Learning (OSE-SSL): Content Based Image Retrieval for Histopathology Images

Scientific Reports, 2016

Content-based image retrieval (CBIR) retrieves database images most similar to the query image by (1) extracting quantitative image descriptors and (2) calculating similarity between database and query image descriptors. Recently, manifold learning (ML) has been used to perform CBIR in a low dimensional representation of the high dimensional image descriptor space to avoid the curse of dimensionality. ML schemes are computationally expensive, requiring an eigenvalue decomposition (EVD) for every new query image to learn its low dimensional representation. We present out-of-sample extrapolation utilizing semi-supervised ML (OSE-SSL) to learn the low dimensional representation without recomputing the EVD for each query image. OSE-SSL incorporates semantic information, in the form of partial class labels, into an ML scheme such that the low dimensional representation co-localizes semantically similar images. In the context of prostate histopathology, gland morphology is an integral component of the Gle...

Novel morphometric based classification via diffeomorphic based shape representation using manifold learning

Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, 2010

Morphology of anatomical structures can provide important diagnostic information regarding disease. Implicit features of morphology, such as contour smoothness or perimeter-to-area ratio, have been used in the context of computerized decision support classifiers to aid disease diagnosis. These features are usually specific to the domain and application (e.g., margin irregularity is a predictor of malignant breast lesions on DCE-MRI). In this paper we present a framework for extracting Diffeomorphic Based Similarity (DBS) features to capture subtle morphometric differences between shapes that may not be captured by implicit features. Object morphology is represented using the medial axis model and objects are compared by determining correspondences between medial axis models using a cluster-based diffeomorphic registration scheme. To visualize and classify morphometric differences, a manifold learning scheme (Graph Embedding) is employed to identify nonlinear dependencies between med...

Out-of-sample extrapolation using semi-supervised manifold learning (OSE-SSL): Content-based image retrieval for prostate histology grading

2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2011

In this paper, we present an out-of-sample extrapolation (OSE) scheme in the context of semi-supervised manifold learning (OSE-SSL). Manifold learning (ML) takes samples with high dimensionality and learns a set of low dimensional embeddings. Embeddings generated by ML preserve nonlinear relationships between samples, allowing dataset visualization, classification, or evaluation of object similarity. Semi-supervised ML (SSL), a recent ML extension, exploits known class labels to learn embeddings, which may result in greater separation between samples of different classes compared to unsupervised ML schemes. Most ML schemes utilize the eigenvalue decomposition (EVD) to learn embeddings. For instance, Graph Embedding (GE) learns embeddings by EVD on a similarity matrix that models high dimensional feature vector similarity between samples. In datasets where new samples are acquired, such as a content-based image retrieval (CBIR) system, recalculating EVD is infeasible. OSE schemes obtain new embeddings without recalculating EVD. The Nyström method (NM) is an OSE algorithm where new embeddings are estimated as a weighted sum of known embeddings. Known embeddings must describe the embedding space for NM to accurately estimate new embeddings. In this paper, NM and semi-supervised GE (SSGE) are combined to learn embeddings which cluster samples by class and rapidly calculate embeddings for new samples without recalculating EVD. OSE-SSL is compared to (i) NM paired with GE (NM-GE), and (ii) SSGE obtained for the full database, where SSGE results represent ground truth embeddings. OSE-SSL, NM-GE, and SSGE are evaluated on: (1) their ability to cluster samples by label, measured by Silhouette Index (SI); and (2) CBIR accuracy, measured by area under the precision-recall curve (AUPRC).
In a synthetic Swiss roll dataset of 2000 samples, OSE-SSL requires training on 50% of the dataset to achieve SI and AUPRC similar to SSGE, while NM-GE requires 70% of the dataset to achieve SI and AUPRC similar to GE. For a prostate histology dataset of 888 glands, a CBIR system was evaluated on its ability to retrieve images according to Gleason grade. OSE-SSL had AUPRC of 0.6 while NM-GE had AUPRC of 0.3.
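
The Nyström idea summarized in this abstract, estimating a new embedding as a weighted sum of known embeddings without repeating the EVD, can be sketched as follows. This is a simplified illustration and not the OSE-SSL algorithm: it eigendecomposes a plain Gaussian similarity matrix, omits the normalization and class-label terms that GE and SSGE would use, and runs on random data. The function names and the `sigma` value are assumptions made for the example.

```python
import numpy as np

def gaussian_similarity(A, B, sigma):
    """Pairwise Gaussian similarity between rows of A and rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def fit_embedding(X_train, n_dims, sigma):
    """One-off EVD of the training similarity matrix; keeps the top eigenpairs."""
    W = gaussian_similarity(X_train, X_train, sigma)
    eigvals, eigvecs = np.linalg.eigh(W)        # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_dims]    # indices of the largest eigenvalues
    return eigvals[top], eigvecs[:, top]

def nystrom_extend(X_new, X_train, eigvals, eigvecs, sigma):
    """Embed new samples as similarity-weighted sums of known embeddings."""
    W_new = gaussian_similarity(X_new, X_train, sigma)  # (n_new, n_train)
    return (W_new @ eigvecs) / eigvals

# Hypothetical data: 50 training samples and 10 new samples in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
X_train, X_new = X[:50], X[50:]

eigvals, eigvecs = fit_embedding(X_train, n_dims=2, sigma=2.0)
new_embeddings = nystrom_extend(X_new, X_train, eigvals, eigvecs, sigma=2.0)

# Sanity check: extending a training sample reproduces its known embedding.
train_again = nystrom_extend(X_train, X_train, eigvals, eigvecs, sigma=2.0)
print(np.allclose(train_again, eigvecs))
```

Each new embedding costs one row of similarities and a matrix product, which is why the estimate is only as good as the coverage of the embedding space by the training samples.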