Raghu Machiraju - Academia.edu
Papers by Raghu Machiraju
Biocomputing 2020, 2019
Integration of transcriptomic and proteomic data should reveal multi-layered regulatory processes governing cancer cell behaviors. Traditional correlation-based analyses have demonstrated limited ability to identify the post-transcriptional regulatory (PTR) processes that drive the non-linear relationship between transcript and protein abundances. In this work, we ideate an integrative approach to explore the variety of post-transcriptional mechanisms that dictate relationships between genes and corresponding proteins. The proposed workflow utilizes the intuitive technique of scatterplot diagnostics, or scagnostics, to characterize and examine the diverse scatterplots built from transcript and protein abundances in a proteogenomic experiment. The workflow includes representing gene-protein relationships as scatterplots, clustering on geometric scagnostic features of these scatterplots, and finally identifying and grouping the potential gene-protein relationships according to their disposition to various PTR mechanisms. Our study verifies the efficacy of the implemented approach to excavate possible regulatory mechanisms by utilizing comprehensive tests on a synthetic dataset. We also propose a variety of 2D pattern-specific downstream analysis methodologies such as mixture modeling, and mapping miRNA post-transcriptional effects to
Studies in health technology and informatics, 2017
Systematic Reviews (SRs) of biomedical literature summarize evidence from high-quality studies to inform clinical decisions, but are time and labor intensive due to the large number of article collections. Article similarities established from textual features have been shown to assist in the identification of relevant articles, thus making the article screening process more efficient. In this study, we visualized article similarities to extend their utilization in practical settings for SR researchers, aiming to promote human comprehension of article distributions and hidden patterns. To prompt an effective visualization in an interpretable, intuitive, and scalable way, we implemented a graph-based network visualization with three network sparsification approaches and a distance-based map projection via dimensionality reduction. We evaluated and compared three network sparsification approaches and the visualization types (article network vs. article map). We demonstrated the effec...
BMC Bioinformatics, 2021
Background: Assigning chromatin states genome-wide (e.g. promoters, enhancers, etc.) is commonly performed to improve functional interpretation of these states. However, computational methods to assign chromatin state suffer from the following drawbacks: they typically require data from multiple assays, which may not be practically feasible to obtain, and they depend on peak calling algorithms, which require careful parameterization and often exclude the majority of the genome. To address these drawbacks, we propose a novel learning technique built upon the Self-Organizing Map (SOM), Self-Organizing Map with Variable Neighborhoods (SOM-VN), to learn a set of representative shapes from a single, genome-wide, chromatin accessibility dataset to associate with a chromatin state assignment in which a particular RE is prevalent. These shapes can then be used to assign chromatin state using our workflow. Results: We validate the performance of the SOM-VN workflow on 14 different samples of v...
Metabolites, 2020
As researchers are increasingly able to collect data on a large scale from multiple clinical and omics modalities, multi-omics integration is becoming a critical component of metabolomics research. This introduces a need for increased understanding by the metabolomics researcher of computational and statistical analysis methods relevant to multi-omics studies. In this review, we discuss common types of analyses performed in multi-omics studies and the computational and statistical methods that can be used for each type of analysis. We pinpoint the caveats and considerations for analysis methods, including required parameters, sample size and data distribution requirements, sources of a priori knowledge, and techniques for the evaluation of model accuracy. Finally, for the types of analyses discussed, we provide examples of the applications of corresponding methods to clinical and basic research. We intend that our review may be used as a guide for metabolomics researchers to choose ...
BMC Bioinformatics, 2019
Background: Proteomic measurements, which closely reflect phenotypes, provide insights into gene expression regulation and mechanisms underlying altered phenotypes. Further, integration of data on proteome and transcriptome levels can validate gene signatures associated with a phenotype. However, proteomic data is not as abundant as genomic data, and it is thus beneficial to use genomic features to predict protein abundances when matching proteomic samples or measurements within samples are lacking. Results: We evaluate and compare four data-driven models for prediction of proteomic data from mRNA measured in breast and ovarian cancers using the 2017 DREAM Proteogenomics Challenge data. Our results show that Bayesian network, random forests, LASSO, and fuzzy logic approaches can predict protein abundance levels with median ground truth-predicted correlation values between 0.2 and 0.5. However, the most accurately predicted proteins differ considerably between approaches. Conclusions: ...
Journal of Thoracic Oncology, 2018
Introduction: Despite apparently complete surgical resection, approximately half of resected early-stage lung cancer patients relapse and die of their disease. Adjuvant chemotherapy reduces this risk by only 5% to 8%. Thus, there is a need for better identifying who benefits from adjuvant therapy, the drivers of relapse, and novel targets in this setting. Methods: RNA sequencing and liquid chromatography/liquid chromatography-mass spectrometry proteomics data were generated from 51 surgically resected non-small cell lung tumors with known recurrence status. Results: We present a rationale and framework for the incorporation of high-content RNA and protein measurements into integrative biomarkers and show the potential of this approach for predicting risk of recurrence in a group of lung adenocarcinomas. In addition, we characterize the relationship between mRNA and protein measurements in lung adenocarcinoma and show that it is outcome specific. Conclusions: Our results suggest that mRNA and protein data possess independent biological and clinical importance, which can be leveraged to create higher-powered expression biomarkers.
Biomedical Informatics Insights, 2018
Convolutional neural networks (CNNs) have gained steady popularity as a tool to perform automatic classification of whole slide histology images. While CNNs have proven to be powerful classifiers in this context, they fail to explain this classification, as the network-engineered features used for modeling and classification are interpretable only by the CNNs themselves. This work aims at enhancing a traditional neural network model to perform histology image modeling, patient classification, and interpretation of the distinctive features identified by the network within the histology whole slide images (WSIs). We synthesize a workflow which (a) intelligently samples the training data by automatically selecting only image areas that display visible disease-relevant tissue state and (b) isolates regions most pertinent to the trained CNN prediction and translates them to observable and qualitative features such as color, intensity, cell and tissue morphology and texture. We use the Ca...
Bioinformatics (Oxford, England), Jan 17, 2018
Technologies that generate high-throughput 'omics data are flourishing, creating enormous, publicly available repositories of multi-omics data. As many data repositories continue to grow, there is an urgent need for computational methods that can leverage these data to create comprehensive clusters of patients with a given disease. Our proposed approach creates a patient-to-patient similarity graph for each data type as an intermediate representation of each omics data type and merges the graphs through subspace analysis on a Grassmann manifold. We hypothesize that this approach generates more informative clusters by preserving the complementary information from each level of 'omics data. We applied our approach to a TCGA breast cancer data set and show that by integrating gene expression, microRNA, and DNA methylation data, our proposed method can produce clinically useful subtypes of breast cancer. We then investigate the molecular characteristics underlying these subtypes...
Cancer’s cellular behavior is driven by alterations in the processes that cells use to sense and respond to diverse stimuli. Underlying these processes are a series of chemical processes (enzyme-substrate, protein-protein, etc.). Here we introduce a set of mathematical techniques for describing and characterizing these processes.
Lecture Notes in Computer Science, 2010
The use of multivariate pattern recognition for the analysis of neural representations encoded in fMRI data has become a significant research topic, with wide applications in neuroscience and psychology. A popular approach is to learn a mapping from the data to the observed behavior. However, identifying the instantaneous cognitive state without reference to external conditions is a relatively unexplored problem and could provide important insights into mental processes. In this paper, we present preliminary but promising results from the application of an unsupervised learning technique to identify distinct brain states. The temporal ordering of the states was seen to be synchronized with the experimental conditions, while the spatial distribution of activity in a state conformed with the expected functional recruitment.
Lecture Notes in Computer Science
In this work, we propose a novel method for deformable tensor-to-tensor registration of Diffusion Tensor Images. Our registration method models the distances between the tensors with Geodesic-Loxodromes and employs a version of the Multi-Dimensional Scaling (MDS) algorithm to unfold the manifold described with this metric. Defining the same shape properties as tensors, the vector images obtained through MDS are fed into a multi-step vector-image registration scheme and the resulting deformation fields are used to reorient the tensor fields. Results on brain DTI indicate that the proposed method is very suitable for deformable fiber-to-fiber correspondence and DTI-atlas construction.
2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2010
In fMRI analysis, general linear modelling (GLM) is commonly used because of its explanatory power, statistical simplicity and computational efficiency. Such models primarily measure the parametric effects of experimental conditions on the amplitude of activation, and neglect other important effects on the nature of the hemodynamic response, including its temporal characteristics such as relative latency (delay). In this paper, we present a GLM approach to estimate experimental effects on not only activation amplitude but also latency. We validate the statistical properties of our method through simulations, and show that on in vivo fMRI data, latency can characterize aspects of neural recruitment during different cognitive tasks that amplitude alone cannot.
2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings, 2010
We present a study of the spatial variation of nuclear morphology of stromal and cancer-associated fibroblasts in the mouse mammary gland. The work is part of a framework being developed for the analysis of the tumor microenvironment in breast cancer. Recent research has uncovered the role of stromal cells in promoting tumor growth and progression. Specifically, studies have indicated that stromal fibroblasts, formerly considered to be passive entities in the extra-cellular matrix, play an active role in the progression of tumors in mammary tissue. We have focused on the analysis of the nuclear morphology of fibroblasts, which several studies have shown to be a critical phenotype in cancer. An essential component of our approach is that the nuclear morphology is studied within the 3D spatial context of the tissue, thus enabling us to pose questions about how the locus of a cell relates to its morphology, and possibly to its function. In order to make quantitative comparisons between nuclear populations, we build statistical shape models of cell populations and infer differences between the populations through these models. We present our observations on both normal and tumor tissues from the mouse mammary gland.
Lecture Notes in Computer Science, 2011
Methods to quantify cellular-level phenotypic differences between genetic groups are a key tool in genomics research. In disease processes such as cancer, phenotypic changes at the cellular level frequently manifest in the modification of cell population profiles. These changes are hard to detect due to the ambiguity in identifying distinct cell phenotypes within a population. We present a methodology which enables the detection of such changes by generating a phenotypic signature of cell populations in a data-derived feature-space. Further, this signature is used to estimate a model for the redistribution of phenotypes that was induced by the genetic change. Results are presented on an experiment involving deletion of a tumor-suppressor gene dominant in breast cancer, where the methodology is used to detect changes in nuclear morphology between control and knockout groups.
Lecture Notes in Computer Science, 2011
In systems-based approaches for studying processes such as cancer and development, identifying and characterizing individual cells within a tissue is the first step towards understanding the large-scale effects that emerge from the interactions between cells. To this end, nuclear morphology is an important phenotype to characterize the physiological and differentiated state of a cell. This study focuses on using nuclear morphology to identify cellular phenotypes in thick tissue sections imaged using 3D fluorescence microscopy. The limited label information, heterogeneous feature set describing a nucleus, and existence of sub-populations within cell-types make this a difficult learning problem. To address these issues, a technique is presented to learn a distance metric from labeled data which is locally adaptive to account for heterogeneity in the data. Additionally, a label propagation technique is used to improve the quality of the learned metric by expanding the training set using unlabeled data. Results are presented on images of tumor stroma in breast cancer, where the framework is used to identify fibroblasts, macrophages and endothelial cells, three major stromal cells involved in carcinogenesis.
2012 9th IEEE International Symposium on Biomedical Imaging (ISBI), 2012
3D cell nuclei segmentation from fluorescence microscopy images is a key application in many biological studies. We propose a new, fully automated and nonparametric method that takes advantage of the resolution anisotropy in fluorescence microscopy. The cell nuclei are first detected in 2D at each image plane and then tracked over depth through a graph-based decision to recover their 3D profiles. As the tracking fails to separate very close cell nuclei along depth, we also propose a corrective step based on an intensity projection criterion. Experimental results on real data demonstrate the efficacy of the proposed method.
Ultrasound in Medicine & Biology, 2011
Manual segmentation of lymph nodes within an ultrasound image is challenging due to operator dependency and speckle. A group of 23 healthy female volunteers consented to a short imaging session to capture a maximum of three axillary lymph nodes. A feasibility study was completed using both automatic and manual segmentation techniques to analyze a sample of 45 three-dimensional (3-D) nodal volume sets. Level-set segmentation based on geodesic active contours and shape-space learning based on a level-set segmentation approach were used to capture global node shapes. Most of the image feature based segmentation methods failed; however, a more precise automatic segmentation algorithm was obtained using a superimposed shape model. Shape model based segmentation significantly improved the segmentation compared with standard level sets. The best segmentation results were achieved when an experienced sonographer assisted with setting seed surfaces. The initialization of seed surfaces improved the capture of the global shape and lymphatic vessels.
NeuroImage, 2011
Understanding the highly complex, spatially distributed and temporally organized phenomena entailed by mental processes using functional MRI is an important research problem in cognitive and clinical neuroscience. Conventional analysis methods focus on the spatial dimension of the data discarding the information about brain function contained in the temporal dimension. This paper presents a fully spatio-temporal multivariate analysis method using a state-space model (SSM) for brain function that yields not only spatial maps of activity but also its temporal structure along with spatially varying estimates of the hemodynamic response. Efficient algorithms for estimating the parameters along with quantitative validations are given. A novel low-dimensional feature-space for representing the data, based on a formal definition of functional similarity, is derived. Quantitative validation of the model and the estimation algorithms is provided with a simulation study. Using a real fMRI study for mental arithmetic, the ability of this neurophysiologically inspired model to represent the spatio-temporal information corresponding to mental processes is demonstrated. Moreover, by comparing the models across multiple subjects, natural patterns in mental processes organized according to different mental abilities are revealed.
Biocomputing 2020, 2019
Integration of transcriptomic and proteomic data should reveal multi-layered regulatory processes... more Integration of transcriptomic and proteomic data should reveal multi-layered regulatory processes governing cancer cell behaviors. Traditional correlation-based analyses have demonstrated limited ability to identify the post-transcriptional regulatory (PTR) processes that drive the non-linear relationship between transcript and protein abundances. In this work, we ideate an integrative approach to explore the variety of post-transcriptional mechanisms that dictate relationships between genes and corresponding proteins. The proposed workflow utilizes the intuitive technique of scatterplot diagnostics or scagnostics, to characterize and examine the diverse scatterplots built from transcript and protein abundances in a proteogenomic experiment. The workflow includes representing gene-protein relationships as scatterplots, clustering on geometric scagnostic features of these scatterplots, and finally identifying and grouping the potential gene-protein relationships according to their disposition to various PTR mechanisms. Our study verifies the efficacy of the implemented approach to excavate possible regulatory mechanisms by utilizing comprehensive tests on a synthetic dataset. We also propose a variety of 2D pattern-specific downstream analyses methodologies such as mixture modeling, and mapping miRNA post-transcriptional effects to
Studies in health technology and informatics, 2017
Systematic Reviews (SRs) of biomedical literature summarize evidence from high-quality studies to... more Systematic Reviews (SRs) of biomedical literature summarize evidence from high-quality studies to inform clinical decisions, but are time and labor intensive due to the large number of article collections. Article similarities established from textual features have been shown to assist in the identification of relevant articles, thus facilitating the article screening process efficiently. In this study, we visualized article similarities to extend its utilization in practical settings for SR researchers, aiming to promote human comprehension of article distributions and hidden patterns. To prompt an effective visualization in an interpretable, intuitive, and scalable way, we implemented a graph-based network visualization with three network sparsification approaches and a distance-based map projection via dimensionality reduction. We evaluated and compared three network sparsification approaches and the visualization types (article network vs. article map). We demonstrated the effec...
BMC Bioinformatics, 2021
Background Assigning chromatin states genome-wide (e.g. promoters, enhancers, etc.) is commonly p... more Background Assigning chromatin states genome-wide (e.g. promoters, enhancers, etc.) is commonly performed to improve functional interpretation of these states. However, computational methods to assign chromatin state suffer from the following drawbacks: they typically require data from multiple assays, which may not be practically feasible to obtain, and they depend on peak calling algorithms, which require careful parameterization and often exclude the majority of the genome. To address these drawbacks, we propose a novel learning technique built upon the Self-Organizing Map (SOM), Self-Organizing Map with Variable Neighborhoods (SOM-VN), to learn a set of representative shapes from a single, genome-wide, chromatin accessibility dataset to associate with a chromatin state assignment in which a particular RE is prevalent. These shapes can then be used to assign chromatin state using our workflow. Results We validate the performance of the SOM-VN workflow on 14 different samples of v...
Metabolites, 2020
As researchers are increasingly able to collect data on a large scale from multiple clinical and ... more As researchers are increasingly able to collect data on a large scale from multiple clinical and omics modalities, multi-omics integration is becoming a critical component of metabolomics research. This introduces a need for increased understanding by the metabolomics researcher of computational and statistical analysis methods relevant to multi-omics studies. In this review, we discuss common types of analyses performed in multi-omics studies and the computational and statistical methods that can be used for each type of analysis. We pinpoint the caveats and considerations for analysis methods, including required parameters, sample size and data distribution requirements, sources of a priori knowledge, and techniques for the evaluation of model accuracy. Finally, for the types of analyses discussed, we provide examples of the applications of corresponding methods to clinical and basic research. We intend that our review may be used as a guide for metabolomics researchers to choose ...
BMC Bioinformatics, 2019
Background Proteomic measurements, which closely reflect phenotypes, provide insights into gene e... more Background Proteomic measurements, which closely reflect phenotypes, provide insights into gene expression regulations and mechanisms underlying altered phenotypes. Further, integration of data on proteome and transcriptome levels can validate gene signatures associated with a phenotype. However, proteomic data is not as abundant as genomic data, and it is thus beneficial to use genomic features to predict protein abundances when matching proteomic samples or measurements within samples are lacking. Results We evaluate and compare four data-driven models for prediction of proteomic data from mRNA measured in breast and ovarian cancers using the 2017 DREAM Proteogenomics Challenge data. Our results show that Bayesian network, random forests, LASSO, and fuzzy logic approaches can predict protein abundance levels with median ground truth-predicted correlation values between 0.2 and 0.5. However, the most accurately predicted proteins differ considerably between approaches. Conclusions ...
Journal of Thoracic Oncology, 2018
Introduction: Despite apparently complete surgical resection, approximately half of resected earl... more Introduction: Despite apparently complete surgical resection, approximately half of resected early-stage lung cancer patients relapse and die of their disease. Adjuvant chemotherapy reduces this risk by only 5% to 8%. Thus, there is a need for better identifying who benefits from adjuvant therapy, the drivers of relapse, and novel targets in this setting. Methods: RNA sequencing and liquid chromatography/ liquid chromatography-mass spectrometry proteomics data were generated from 51 surgically resected non-small cell lung tumors with known recurrence status. Results: We present a rationale and framework for the incorporation of high-content RNA and protein measurements into integrative biomarkers and show the potential of this approach for predicting risk of recurrence in a group of lung adenocarcinomas. In addition, we characterize the relationship between mRNA and protein measurements in lung adenocarcinoma and show that it is outcome specific. Conclusions: Our results suggest that mRNA and protein data possess independent biological and clinical importance, which can be leveraged to create higherpowered expression biomarkers.
Biomedical Informatics Insights, 2018
Convolutional neural networks (CNNs) have gained steady popularity as a tool to perform automatic... more Convolutional neural networks (CNNs) have gained steady popularity as a tool to perform automatic classification of whole slide histology images. While CNNs have proven to be powerful classifiers in this context, they fail to explain this classification, as the network engineered features used for modeling and classification are ONLY interpretable by the CNNs themselves. This work aims at enhancing a traditional neural network model to perform histology image modeling, patient classification, and interpretation of the distinctive features identified by the network within the histology whole slide images (WSIs). We synthesize a workflow which (a) intelligently samples the training data by automatically selecting only image areas that display visible disease-relevant tissue state and (b) isolates regions most pertinent to the trained CNN prediction and translates them to observable and qualitative features such as color, intensity, cell and tissue morphology and texture. We use the Ca...
Bioinformatics (Oxford, England), Jan 17, 2018
Technologies that generate high-throughput 'omics data are flourishing, creating enormous, pu... more Technologies that generate high-throughput 'omics data are flourishing, creating enormous, publicly available repositories of multi-omics data. As many data repositories continue to grow, there is an urgent need for computational methods that can leverage these data to create comprehensive clusters of patients with a given disease. Our proposed approach creates a patient-to-patient similarity graph for each data type as an intermediate representation of each omics data type and merges the graphs through subspace analysis on a Grassmann manifold. We hypothesize that this approach generates more informative clusters by preserving the complementary information from each level of 'omics data. We applied our approach to a TCGA breast cancer data set and show that by integrating gene expression, microRNA, and DNA methylation data, our proposed method can produce clinically useful subtypes of breast cancer. We then investigate the molecular characteristics underlying these subtypes...
Cancer’s cellular behavior is driven by alterations in the processes that cells use to sense and ... more Cancer’s cellular behavior is driven by alterations in the processes that cells use to sense and respond to diverse stimuli. Underlying these processes are a series of chemical processes (enzyme-substrate, protein-protein, etc.). Here we introduce a set of mathematical techniques for describing and characterizing these processes.
Lecture Notes in Computer Science, 2010
The use of multivariate pattern recognition for the analysis of neural representations encoded in... more The use of multivariate pattern recognition for the analysis of neural representations encoded in fMRI data has become a significant research topic, with wide applications in neuroscience and psychology. A popular approach is to learn a mapping from the data to the observed behavior. However, identifying the instantaneous cognitive state without reference to external conditions is a relatively unexplored problem and could provide important insights into mental processes. In this paper, we present preliminary but promising results from the application of an unsupervised learning technique to identify distinct brain states. The temporal ordering of the states were seen to be synchronized with the experimental conditions, while the spatial distribution of activity in a state conformed with the expected functional recruitment.
Lecture Notes in Computer Science
In this work, we propose a novel method for deformable tensor-to-tensor registration of Diffusion... more In this work, we propose a novel method for deformable tensor-to-tensor registration of Diffusion Tensor Images. Our registration method models the distances in between the tensors with Geode-sic-Loxodromes and employs a version of Multi-Dimensional Scaling (MDS) algorithm to unfold the manifold described with this metric. Defining the same shape properties as tensors, the vector images obtained through MDS are fed into a multi-step vector-image registration scheme and the resulting deformation fields are used to reorient the tensor fields. Results on brain DTI indicate that the proposed method is very suitable for deformable fiber-to-fiber correspondence and DTI-atlas construction.
2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2010
In fMRI analysis, general linear modelling (GLM) is commonly used because of its explanatory power, statistical simplicity and computational efficiency. Such models primarily measure the parametric effects of experimental conditions on the amplitude of activation, and neglect other important effects on the nature of the hemodynamic response, including its temporal characteristics such as relative latency (delay). In this paper, we present a GLM approach to estimate experimental effects on not only activation amplitude but also latency. We validate the statistical properties of our method through simulations, and show that on in vivo fMRI data, latency can characterize aspects of neural recruitment during different cognitive tasks that amplitude alone cannot.
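The abstract does not say how latency enters the GLM. One widely used option, shown here purely as an assumption and not as the authors' method, adds the temporal derivative of the predicted response as a second regressor: to first order, a response delayed by d equals the undelayed response minus d times its derivative, so the ratio of the two fitted coefficients approximates the delay. The response shape, event timing, and noise level below are all hypothetical.

```python
import numpy as np

def response_shape(t, peak=6.0):
    """Single-gamma curve peaking at `peak` seconds (illustrative HRF stand-in)."""
    t = np.clip(t, 0.0, None)
    return (t / peak) ** peak * np.exp(peak - t)

dt = 0.1                                   # sampling step in seconds
n = 3000                                   # number of samples (300 s)
kernel = response_shape(np.arange(0.0, 30.0, dt))
stim = np.zeros(n)
stim[np.arange(50, 2800, 200)] = 1.0       # one event every 20 s

def predicted(stim_train):
    """Convolve an event train with the response shape, trimmed to n samples."""
    return np.convolve(stim_train, kernel)[:n]

# Simulated data: amplitude 2.0, response delayed by 0.5 s, small Gaussian noise.
true_amp, true_delay = 2.0, 0.5
rng = np.random.default_rng(0)
shifted = np.roll(stim, int(round(true_delay / dt)))
y = true_amp * predicted(shifted) + 0.05 * rng.normal(size=n)

# Design matrix: undelayed response, its temporal derivative, and an intercept.
base = predicted(stim)
deriv = np.gradient(base, dt)
X = np.column_stack([base, deriv, np.ones(n)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

amp_est = beta[0]                          # amplitude estimate
latency_est = -beta[1] / beta[0]           # first-order (Taylor) delay estimate
```

The delay estimate relies on a first-order approximation, so it is only reasonable when the delay is small relative to the width of the response.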
2010 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2010 - Proceedings, 2010
We present a study of the spatial variation of nuclear morphology of stromal and cancer-associated fibroblasts in the mouse mammary gland. The work is part of a framework being developed for the analysis of the tumor microenvironment in breast cancer. Recent research has uncovered the role of stromal cells in promoting tumor growth and progression. Specifically, studies have indicated that stromal fibroblasts, formerly considered to be passive entities in the extracellular matrix, play an active role in tumor progression in mammary tissue. We have focused on the analysis of the nuclear morphology of fibroblasts, which several studies have shown to be a critical phenotype in cancer. An essential component of our approach is that the nuclear morphology is studied within the 3D spatial context of the tissue, thus enabling us to pose questions about how the locus of a cell relates to its morphology, and possibly to its function. In order to make quantitative comparisons between nuclear populations, we build statistical shape models of cell populations and infer differences between the populations through these models. We present our observations on both normal and tumor tissues from the mouse mammary gland.
Lecture Notes in Computer Science, 2011
Methods to quantify cellular-level phenotypic differences between genetic groups are a key tool in genomics research. In disease processes such as cancer, phenotypic changes at the cellular level frequently manifest in the modification of cell population profiles. These changes are hard to detect due to the ambiguity in identifying distinct cell phenotypes within a population. We present a methodology which enables the detection of such changes by generating a phenotypic signature of cell populations in a data-derived feature-space. Further, this signature is used to estimate a model for the redistribution of phenotypes that was induced by the genetic change. Results are presented on an experiment involving deletion of a tumor-suppressor gene dominant in breast cancer, where the methodology is used to detect changes in nuclear morphology between control and knockout groups.
Lecture Notes in Computer Science, 2011
In systems-based approaches for studying processes such as cancer and development, identifying and characterizing individual cells within a tissue is the first step towards understanding the large-scale effects that emerge from the interactions between cells. To this end, nuclear morphology is an important phenotype to characterize the physiological and differentiated state of a cell. This study focuses on using nuclear morphology to identify cellular phenotypes in thick tissue sections imaged using 3D fluorescence microscopy. The limited label information, heterogeneous feature set describing a nucleus, and existence of sub-populations within cell types make this a difficult learning problem. To address these issues, a technique is presented to learn a distance metric from labeled data which is locally adaptive to account for heterogeneity in the data. Additionally, a label propagation technique is used to improve the quality of the learned metric by expanding the training set using unlabeled data. Results are presented on images of tumor stroma in breast cancer, where the framework is used to identify fibroblasts, macrophages and endothelial cells, three major stromal cell types involved in carcinogenesis.
2012 9th IEEE International Symposium on Biomedical Imaging (ISBI), 2012
3D cell nuclei segmentation from fluorescence microscopy images is a key application in many biological studies. We propose a new, fully automated and nonparametric method that takes advantage of the resolution anisotropy in fluorescence microscopy. The cell nuclei are first detected in 2D at each image plane and then tracked over depth through a graph-based decision to recover their 3D profiles. As the tracking fails to separate very close cell nuclei along depth, we also propose a corrective step based on an intensity projection criterion. Experimental results on real data demonstrate the efficacy of the proposed method.
Ultrasound in Medicine & Biology, 2011
Manual segmentation of lymph nodes within an ultrasound image is challenging due to operator dependency and speckle. A group of 23 healthy female volunteers consented to a short imaging session to capture a maximum of three axillary lymph nodes. A feasibility study was completed using both automatic and manual segmentation techniques to analyze a sample of 45 three-dimensional (3-D) nodal volume sets. Level-set segmentation based on geodesic active contours and shape-space learning based on a level-set segmentation approach were used to capture global node shapes. Most of the image-feature-based segmentation methods failed; however, a more precise automatic segmentation algorithm was obtained using a superimposed shape model. Shape-model-based segmentation significantly improved the segmentation compared with standard level sets. The best segmentation results were achieved when an experienced sonographer assisted with setting seed surfaces. The initialization of seed surfaces improved the capture of the global shape and lymphatic vessels.
NeuroImage, 2011
Understanding the highly complex, spatially distributed and temporally organized phenomena entailed by mental processes using functional MRI is an important research problem in cognitive and clinical neuroscience. Conventional analysis methods focus on the spatial dimension of the data, discarding the information about brain function contained in the temporal dimension. This paper presents a fully spatio-temporal multivariate analysis method using a state-space model (SSM) for brain function that yields not only spatial maps of activity but also its temporal structure along with spatially varying estimates of the hemodynamic response. Efficient algorithms for estimating the parameters along with quantitative validations are given. A novel low-dimensional feature-space for representing the data, based on a formal definition of functional similarity, is derived. Quantitative validation of the model and the estimation algorithms is provided with a simulation study. Using a real fMRI study for mental arithmetic, the ability of this neurophysiologically inspired model to represent the spatio-temporal information corresponding to mental processes is demonstrated. Moreover, by comparing the models across multiple subjects, natural patterns in mental processes organized according to different mental abilities are revealed.