Angus Ng - Academia.edu
Papers by Angus Ng
Wiley StatsRef: Statistics Reference Online, 2022
R topics documented: bootstrap, conplot, ddmix, ddmsn, ddmst, ddmvn
Description: Fit multivariate mixture models via the EM algorithm. The multivariate distributions include the normal distribution, t-distribution, skew normal distribution and skew t-distribution. The emmix is an updated version of EMMIX with new features such as clustering the degenerated data and fitting skew mixture models.
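As a rough illustration of the EM iteration the description refers to, the sketch below fits a two-component univariate normal mixture. It is a deliberately simplified stand-in, not the emmix or EMMIX implementation: the function names, the restriction to one dimension and two components, and the variance floor are all assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_normals(data, mean1, mean2, var1=1.0, var2=1.0, weight1=0.5, n_iter=50):
    """Hypothetical helper: EM for a two-component univariate normal mixture.

    Assumes both components keep a nonzero share of the data throughout.
    """
    n = len(data)
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 1.
        resp = []
        for x in data:
            a = weight1 * normal_pdf(x, mean1, var1)
            b = (1.0 - weight1) * normal_pdf(x, mean2, var2)
            resp.append(a / (a + b) if a + b > 0.0 else 0.5)
        # M-step: weighted updates of the mixing proportion, means and variances.
        n1 = sum(resp)
        n2 = n - n1
        mean1 = sum(r * x for r, x in zip(resp, data)) / n1
        mean2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        # The 1e-6 floor is an assumption that keeps a variance from collapsing to zero.
        var1 = max(sum(r * (x - mean1) ** 2 for r, x in zip(resp, data)) / n1, 1e-6)
        var2 = max(sum((1.0 - r) * (x - mean2) ** 2 for r, x in zip(resp, data)) / n2, 1e-6)
        weight1 = n1 / n
    return weight1, (mean1, var1), (mean2, var2)


# Two well-separated groups of points, with rough starting means 1 and 4.
sample = [-0.2, 0.0, 0.2, 4.8, 5.0, 5.2]
weight, (m1, v1), (m2, v2) = em_two_normals(sample, mean1=1.0, mean2=4.0)
```

The multivariate normal, t and skew variants named in the description follow the same E-step and M-step pattern but need component-specific updates that are not shown here.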
Citation: Lee, Andy and Zhao, Yun and Yau, Kelvin and Ng, Shu. 2008. Survival mixture modelling of recurrent infections, in Mizuta, M. and Nakano, J. (ed), Joint Meeting of 4th World Conference of the IASC and 6th Conference of the Asian Regional Section of the IASC on Computational Statistics and Data Analysis, Dec 5 2008, pp. 1008-1014, Yokohama, Japan: Japanese Society of Computational Statistics.
British Journal of Cancer, 2004
Kallikrein 6 (hK6, also known as protease M/zyme/neurosin) is a member of the human kallikrein gene family. We have previously cloned the cDNA for this gene by differential display and shown the overexpression of the mRNA in breast and ovarian primary tumour tissues and cell lines. To thoroughly characterise the expression of this kallikrein in ovarian cancer, we have developed a novel monoclonal antibody specific to hK6 and employed it in immunohistochemistry with a wide range of ovarian tumour samples. The expression was found elevated in 67 of 80 cases of ovarian tumour samples and there was a significant difference in the expression levels between normal and benign ovarian tissues and the borderline and invasive tumours (P<0.001). There was no difference of expression level between different subtypes of tumours. More significantly, a high level of kallikrein 6 expression was found in many early-stage and low-grade tumours, and elevated hK6 proteins were found in benign epithelia coexisting with borderline and invasive tissues, suggesting that overexpression of hK6 is an early phenomenon in the development of ovarian cancer. Quantitative real-time reverse transcription-polymerase chain reactions also showed elevated kallikrein 6 mRNA expression in ovarian tumours. Genomic Southern analysis of 19 ovarian tumour samples suggested that gene amplification is one mechanism for the overexpression of hK6 in ovarian cancer.
Journal of Statistical Software, 1997
Purpose: To fit a mixture of two Gompertz distributions to censored survival data. Input files: (1) data file "surv.dat"; (2) initial estimates file "para.dat". Output file: final estimates file "fort.25".
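To make the stated purpose concrete, the following sketch evaluates the log-likelihood of a two-component Gompertz mixture on right-censored data. It is an illustration under assumptions, not the program described above: the parameterisation (hazard lam * exp(gamma * t) with gamma nonzero), the function names and the (time, event) data layout are all chosen for this example, and the layout of "surv.dat" and "para.dat" is not reproduced.

```python
import math


def gompertz_survival(t, lam, gamma):
    """S(t) = exp(-(lam / gamma) * (exp(gamma * t) - 1)); assumes gamma != 0."""
    return math.exp(-(lam / gamma) * (math.exp(gamma * t) - 1.0))


def gompertz_density(t, lam, gamma):
    """f(t) = h(t) * S(t) with hazard h(t) = lam * exp(gamma * t)."""
    return lam * math.exp(gamma * t) * gompertz_survival(t, lam, gamma)


def mixture_loglik(observations, weight1, comp1, comp2):
    """Log-likelihood of a two-component Gompertz mixture.

    observations: iterable of (time, event) pairs, where event is 1 for an
    observed failure and 0 for a right-censored time.
    comp1, comp2: (lam, gamma) tuples for the two components.
    """
    total = 0.0
    for t, event in observations:
        if event:
            # An observed failure contributes the mixture density.
            value = (weight1 * gompertz_density(t, *comp1)
                     + (1.0 - weight1) * gompertz_density(t, *comp2))
        else:
            # A censored time contributes the mixture survival function.
            value = (weight1 * gompertz_survival(t, *comp1)
                     + (1.0 - weight1) * gompertz_survival(t, *comp2))
        total += math.log(value)
    return total


# Made-up example: one failure at t = 1.0 and one time censored at t = 2.0.
example = [(1.0, 1), (2.0, 0)]
loglik = mixture_loglik(example, 0.3, (0.5, 0.2), (0.1, 0.4))
```

Maximising this function over the mixing weight and the four component parameters, starting from user-supplied initial estimates, is one way such a fit could be carried out.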
Methods in molecular biology (Clifton, N.J.), 2013
There are two distinct but related clustering problems with microarray data. One problem concerns the clustering of the tissue samples (gene signatures) on the basis of the genes; the other concerns the clustering of the genes on the basis of the tissues (gene profiles). The clusters of tissues so obtained in the first problem can play a useful role in the discovery and understanding of new subclasses of diseases. The clusters of genes obtained in the second problem can be used to search for genetic pathways or groups of genes that might be regulated together. Also, in the first problem, we may wish first to summarize the information in the very large number of genes by clustering them into groups (of hyperspherical shape), which can be represented by some metagenes, such as the group sample means. We can then carry out the clustering of the tissues in terms of these metagenes. We focus here on mixtures of normals to provide a model-based clustering of tissue samples (gene signature...
Australian family physician, 2011
Forms of address between patients and general practitioners are an underexplored area which may influence productive dialogue within a consultation. This article aims to describe how Australian patients prefer to be addressed by their GP, how patients prefer to address their GP, and the factors influencing these preferences. Twenty consecutive patients of 13 randomly selected GPs (n=260) were surveyed on preferences for use of names in consultations and the factors influencing these preferences. Ninety percent of patients prefer to be addressed by their first name. Thirty-five percent of patients prefer to call the GP by first name, 27% by title and last name, 21% by title only, and 10% by title and first name. A range of influencing factors was identified. These findings allow GPs to feel confident in addressing their patients informally. They indicate the diversity of patient preferences for addressing their GP and the factors influencing these choices.
Australian family physician, 2009
The extent to which a fear of needles influences health decisions remains largely unexplored. This study investigated the prevalence of fear of needles in a southeast Queensland community, described associated symptoms, and highlighted health care avoidance tendencies of affected individuals. One hundred and seventy-seven participants attending an outer urban general practice responded to a questionnaire on fear of needles, symptoms associated with needles and its influence on their use of medical care. Twenty-two percent of participants reported a fear of needles. Affected participants were more likely than participants with no fear to report vasovagal symptoms, have had a previous traumatic needle experience (46.2 vs. 16.4%, p<0.001) and avoid medical treatment involving needles (20.5 vs. 2.3%, p<0.001). Fear of needles is common and is associated with health care avoidance. Health professionals could better identify and manage patients who have a fear of needles by recognis...
Complex Systems, 2005
Time-course experiments with microarrays are often used to study dynamic biological systems and genetic regulatory networks (GRNs) that model how genes influence each other in cell-level development of organisms. The inference for GRNs provides important insights into the fundamental biological processes such as growth and is useful in disease diagnosis and genomic drug design. Due to the experimental design, multilevel data hierarchies are often present in time-course gene expression data. Most existing methods, however, ignore the dependency of the expression measurements over time and the correlation among gene expression profiles. Such independence assumptions violate regulatory interactions and can result in overlooking certain important subject effects and lead to spurious inference for regulatory networks or mechanisms. In this paper, a multilevel mixed-effects model is adopted to incorporate data hierarchies in the analysis of time-course data, where temporal and subject effects are both assumed to be random. The method starts with the clustering of genes by fitting the mixture model within the multilevel random-effects model framework using the expectation-maximization (EM) algorithm. The network of regulatory interactions is then determined by searching for regulatory control elements (activators and inhibitors) shared by the clusters of co-expressed genes, based on a time-lagged correlation coefficients measurement. The method is applied to two real time-course datasets from the budding yeast (Saccharomyces cerevisiae) genome. It is shown that the proposed method provides clusters of cell-cycle regulated genes that are supported by existing gene function annotations, and hence enables inference on regulatory interactions for the genetic network.
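The time-lagged correlation mentioned in this abstract can be illustrated with a short sketch. The abstract does not give the paper's exact definition, so the version below is an assumption: a plain Pearson correlation between a candidate regulator at time t and a target at time t + lag, with a hypothetical function name and made-up expression values.

```python
import math


def lagged_correlation(regulator, target, lag):
    """Pearson correlation between regulator[t] and target[t + lag], lag >= 0.

    Assumes equal-length series that are not constant over the compared window.
    """
    if len(regulator) != len(target):
        raise ValueError("series must have the same length")
    if lag < 0 or lag > len(regulator) - 2:
        raise ValueError("lag must leave at least two paired time points")
    xs = regulator[:len(regulator) - lag]
    ys = target[lag:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (sd_x * sd_y)


# Made-up profiles: the target repeats the regulator one time point later.
regulator_profile = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
target_profile = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0]
r_lag1 = lagged_correlation(regulator_profile, target_profile, lag=1)
```

Under this definition, a strongly positive value at some lag would be consistent with an activator-like relationship and a strongly negative one with an inhibitor-like relationship, although correlation alone does not establish regulation.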
COMPSTAT 2008
In this paper, we consider the use of mixtures of linear mixed models to cluster data which may be correlated and replicated and which may have covariates. For each cluster, a regression model is adopted to incorporate the covariates, and the correlation and replication structure in the data are specified by the inclusion of random effects terms. The procedure is illustrated in its application to the clustering of gene-expression profiles.
Statistical Advances in the Biomedical Sciences
Chapter 21: Clustering of Microarray Data via Mixture Models. Geoffrey J. McLachlan and Angus Ng, Department of Mathematics, University of Queensland, Brisbane, Australia ... For example, the gene-shaving methodology of Hastie et al. ...
Methods in Molecular Biology, 2008
Recently, we reported the development and use of a "reverse capture" antibody microarray for the purpose of investigating antigen-autoantibody profiling. This platform was developed to allow researchers to characterize and compare the autoantibody profiles of normal and diseased patients. Our "reverse capture" protocol is based on the dual-antibody sandwich immunoassay of enzyme-linked immunosorbent assay (ELISA), and we have previously reported its use to detect autoimmunity to epitopes found on native antigens derived from tumor cell lines. In this protocol, we used ovarian cancer as a model system to adapt the "reverse capture" procedure for use with native antigens derived from frozen tissue samples. The use of this platform in studies of autoimmunity is valuable because it allows for the detection of autoantibody reactivity with epitopes found on the post-translational modifications (PTMs) of native antigens, a feature not present with other protein array platforms. In the first step in the "reverse capture" process, tissue-derived native antigens are immobilized onto the 500 monoclonal antibodies that are spotted in duplicate on the array surface. Using the captured antigens as "baits," we then incubate the array with labeled IgG from test and control samples, and perform a two-slide dye-swap to account for any dye effects. Here, we present a detailed description of the "reverse capture" autoantibody microarray for use with tissue-derived native antigens.
Health Care Management Science, 2001
With obstetrical delivery being the most frequent cause for hospital admissions, it is important to determine health- and patient-related characteristics affecting maternity length of stay (LOS). Although the average inpatient LOS has decreased steadily over the years, the issue of the appropriate LOS after delivery is complex and hotly debated, especially since the introduction of the mandatory minimum-stay legislation in the USA. The purpose of this paper is to identify factors associated with maternity LOS and to model variations in LOS. A Gamma mixture risk-adjusted model is proposed in order to analyze heterogeneity of maternity LOS within obstetrical Diagnosis Related Groups (DRGs). The determination of pertinent factors would help hospital administrators and clinicians to manage LOS and expenditures efficiently.
Statistics in Medicine, 2004
A two-component survival mixture model is proposed to analyse a set of ischaemic stroke-specific mortality data. The survival experience of stroke patients after index stroke may be described by a subpopulation of patients in the acute condition and another subpopulation of patients in the chronic phase. To adjust for the inherent correlation of observations due to random hospital effects, a mixture model of two survival functions with random effects is formulated. Assuming a Weibull hazard in both components, an EM algorithm is developed for the estimation of fixed effect parameters and variance components. A simulation study is conducted to assess the performance of the two-component survival mixture model estimators. Simulation results confirm the applicability of the proposed model in a small sample setting. Copyright © 2004 John Wiley & Sons, Ltd.
Statistics in Medicine, 2001
A mixture model incorporating long-term survivors has been adopted in the field of biostatistics where some individuals may never experience the failure event under study. The surviving fractions may be considered as cured. In most applications, the survival times are assumed to be independent. However, when the survival data are obtained from a multi-centre clinical trial, it is conceived that the environmental conditions and facilities shared within a clinic affect the proportion cured as well as the failure risk for the uncured individuals. This necessitates a long-term survivor mixture model with random effects. In this paper, the long-term survivor mixture model is extended for the analysis of multivariate failure time data using the generalized linear mixed model (GLMM) approach. The proposed model is applied to analyse a numerical data set from a multi-centre clinical trial of carcinoma as an illustration. Some simulation experiments are performed to assess the applicability of the model based on the average biases of the estimates formed.
Pak. J. Statist, 2010
Studies on genetic profiling demonstrate its potential utility for classifying tumours, leading to a "genetic-staging" system for predicting disease outcomes. A precise prediction of individual disease outcome is important to identify patients who have a high risk of disease recurrence, and to tailor treatments to the individual patient. Genes, however, are not the sole determinants of disease outcomes. Non-genetic factors have roles in many stages of tumourigenesis, and the simultaneous use of genetic-staging and clinical risk factors may therefore improve the prediction of disease outcome. In this paper, we aim to quantify the prognostic value of genetic-staging from gene expressions by using mixture model-based clustering methods. We also investigate via the use of logistic regression whether a more accurate prediction of disease outcome can be obtained by using genetic-staging in conjunction with clinical risk factors. The proposed method is illustrated using a real example of breast cancer data. It shows that genetic-staging provides significant additional prognostic information when it is obtained by applying a sophisticated model-based clustering method for the identification of marker-genes that are relevant to predict disease outcomes.
Population Health Metrics, 2011
Background: Multimorbidity is becoming more prevalent. Previously-used methods of assessing multimorbidity relied on counting the number of health conditions, often in relation to an index condition (comorbidity), or grouping conditions based on body or organ systems. Recent refinements in statistical approaches have resulted in improved methods to capture patterns of multimorbidity, allowing for the identification of nonrandomly occurring clusters of multimorbid health conditions. This paper aims to identify nonrandom clusters of multimorbidity. Methods: The Australian Work Outcomes Research Cost-benefit (WORC) study cross-sectional screening dataset (approximately 78,000 working Australians) was used to explore patterns of multimorbidity. Exploratory factor analysis was used to identify nonrandomly occurring clusters of multimorbid health conditions. Results: Six clinically-meaningful groups of multimorbid health conditions were identified. These were: factor 1: arthritis, osteoporosis, other chronic pain, bladder problems, and irritable bowel; factor 2: asthma, chronic obstructive pulmonary disease, and allergies; factor 3: back/neck pain, migraine, other chronic pain, and arthritis; factor 4: high blood pressure, high cholesterol, obesity, diabetes, and fatigue; factor 5: cardiovascular disease, diabetes, fatigue, high blood pressure, high cholesterol, and arthritis; and factor 6: irritable bowel, ulcer, heartburn, and other chronic pain. These clusters do not fall neatly into organ or body systems, and some conditions appear in more than one cluster. Conclusions: Considerably more research is needed with large population-based datasets and a comprehensive set of reliable health diagnoses to better understand the complex nature and composition of multimorbid health conditions.