Min Lu | University of Miami

Papers by Min Lu

Editorial: Machine learning applications in educational studies

Frontiers in Education, 2023

Computing within-study covariances, data visualization, and missing data solutions for multivariate meta-analysis with metavcov

Frontiers in Psychology, 2023

Multivariate meta-analysis (MMA) is a powerful statistical technique that can provide more reliable and informative results than traditional univariate meta-analysis, which allows for comparisons across outcomes with increased statistical power. However, implementing appropriate statistical methods for MMA can be challenging due to the requirement of various specific tasks in data preparation. The metavcov package aims for model preparation, data visualization, and missing data solutions to provide tools for different methods that cannot be found in accessible software. It provides sufficient constructs for estimating coefficients from other well-established packages. For model preparation, users can compute both effect sizes of various types and their variance-covariance matrices, including correlation coefficients, standardized mean difference, mean difference, log odds ratio, log risk ratio, and risk difference. The package provides a tool to plot the confidence intervals for the primary studies and the overall estimates. When specific effect sizes are missing, single imputation is available in the model preparation stage; a multiple imputation method is also available for pooling the results in a statistically principled manner from models of users' choice. The package is demonstrated in two real data applications and a simulation study to assess methods for handling missing data.

Acculturative Stress, Resilience, and a Syndemic Factor Among Latinx Immigrants

Nursing Research, 2023

Background: The process of immigration and subsequent adaptation can expose Latinx immigrants to chronic and compounding challenges (i.e., acculturative stress), but little is known about how resilience factors and these stressors interact to influence syndemic conditions, intertwined epidemics that disproportionally affect historically marginalized communities. Objectives: The purpose of this study was to describe the influence of acculturative stress and resilience on the syndemic factor underlying substance abuse, intimate partner violence, HIV risk, and mental conditions. Methods: Baseline cross-sectional data from a community-engaged, longitudinal study of 391 adult (ages 18-44 years) Latinx immigrants in North Carolina were obtained using standardized measures available in English and Spanish. Structural equation modeling tested the syndemic model, and random forest variable importance identified the most influential types of acculturative stressors and resilience factors, including their interactions, on the syndemic factor. Results: Results indicated that a single syndemic factor explained variations in heavy drinking, drug use, intimate partner violence, depression, and anxiety and fit the data well. Age, being a woman, acculturative stress, acculturation to the United States, and emotional support were significantly related to the syndemic factor. The relationship between acculturative stress and the syndemic factor was buffered by ethnic pride, coping, enculturation, social support, and individual resilience. The most influential acculturative stressors were marital, family, and occupation/economic stress. Discussion: Findings from this study underscore the importance of considering the co-occurrence of behavioral and mental health conditions among Latinx immigrants. Health promotion programs for Latinx immigrants should address acculturative stress and bolster ethnic pride, social support, and coping as sources of resilience.

Telehealth utilization in U.S. Medicare beneficiaries aged 65 years and older during the COVID-19 pandemic

BMC Public Health, 2023

Background: The COVID-19 pandemic has become a serious public health concern for older adults and amplified the value of deploying telehealth solutions. The purpose of this study was to investigate telehealth offered by providers among U.S. Medicare beneficiaries aged 65 years and older during the COVID-19 pandemic. Methods: This cross-sectional study analyzed Medicare beneficiaries aged 65 years and older using data from the Medicare Current Beneficiary Survey, Winter 2021 COVID-19 Supplement (n = 9,185). We identified variables that were associated with telehealth offered by primary care physicians and beneficiaries' access to the Internet through a multivariate classification analysis utilizing Random Forest machine learning techniques.

Exploring Factors That Affected Student Well-Being during the COVID-19 Pandemic: A Comparison of Data-Mining Approaches

Int. J. Environ. Res. Public Health, 2022

COVID-19-related school closures caused unprecedented and prolonged disruption to daily life, education, and social and physical activities. This disruption in the life course affected the well-being of students from different age groups. This study proposed analyzing student well-being and determining the most influential factors that affected student well-being during the COVID-19 pandemic. With this aim, we adopted a cross-sectional study designed to analyze the student data from the Responses to Educational Disruption Survey (REDS) collected between December 2020 and July 2021 from a large sample of grade 8 or equivalent students from eight countries (n = 20,720), including Burkina Faso, Denmark, Ethiopia, Kenya, the Russian Federation, Slovenia, the United Arab Emirates, and Uzbekistan. We first estimated a well-being IRT score for each student in the REDS student database. Then, we used 10 data-mining approaches to determine the most influential factors that affected the well-being of students during the COVID-19 outbreak. Overall, 178 factors were analyzed. The results indicated that the most influential factors on student well-being were multifarious. The most influential variables on student well-being were students’ worries about contracting COVID-19 at school, their learning progress during the COVID-19 disruption, their motivation to learn when school reopened, and their excitement to reunite with friends after the COVID-19 disruption.

Access to care through telehealth among U.S. Medicare beneficiaries in the wake of the COVID-19 pandemic

Frontiers in Public Health, 2022

Background: The coronavirus disease 2019 (COVID-19) public health emergency has amplified the potential value of deploying telehealth solutions. Less is known about how trends in access to care through telehealth changed over time.

randomForestSRC: Forest Weights, In-Bag (IB) and Out-of-Bag (OOB) Ensembles Vignette

Cite this vignette as
H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: forest weights, in-bag (IB) and out-of-bag (OOB) ensembles vignette.” http://randomforestsrc.org/articles/forestWgt.html.

@misc{HemantGettingStarted,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: forest weights, in-bag (IB) and out-of-bag (OOB) ensembles vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/forestWgt.html}
}

randomForestSRC: AUC Splitting for Multiclass Problems Vignette

@misc{HemantAUCsplit,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: AUC splitting for multiclass problems vignette},
year = {2022},
url = {http://randomforestsrc.org/articles/aucsplit.html}
}

Standard errors and confidence intervals for variable importance in random forest regression, classification, and survival

Statistics in Medicine, Jan 4, 2018

Random forests are a popular nonparametric tree ensemble procedure with broad applications to data analysis. While its widespread popularity stems from its prediction performance, an equally important feature is that it provides a fully nonparametric measure of variable importance (VIMP). A current limitation of VIMP, however, is that no systematic method exists for estimating its variance. As a solution, we propose a subsampling approach that can be used to estimate the variance of VIMP and for constructing confidence intervals. The method is general enough that it can be applied to many useful settings, including regression, classification, and survival problems. Using extensive simulations, we demonstrate the effectiveness of the subsampling estimator and in particular find that the delete-d jackknife variance estimator, a close cousin, is especially effective under low subsampling rates due to its bias correction properties. These 2 estimators are highly competitive when compare...

randomForestSRC: Partial Plots Vignette

Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods

Journal of Computational and Graphical Statistics

randomForestSRC: Variable Importance (VIMP) with Subsampling Inference Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: variable importance (VIMP) with subsampling inference vignette.” http://randomforestsrc.org/articles/vimp.html.

@misc{HemantVIMP,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: variable importance {(VIMP)} with subsampling inference vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/vimp.html}
}

randomForestSRC: Getting Started with randomForestSRC Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: getting started with randomForestSRC vignette.” http://randomforestsrc.org/articles/getstarted.html.

@misc{HemantGettingStarted,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: getting started with {randomForestSRC} vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/getstarted.html}
}

randomForestSRC: sidClustering Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, A. Mantero, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: sidClustering vignette.” http://randomforestsrc.org/articles/sidClustering.html.

@misc{HemantsidClustering,
author = "Hemant Ishwaran and Alejandro Mantero and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: {sidClustering} vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/sidClustering.html}
}

randomForestSRC: Multivariate Splitting Rule Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, F. Tang, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: multivariate splitting rule vignette.” http://randomforestsrc.org/articles/mvsplit.html.

@misc{HemantMultiv,
author = "Hemant Ishwaran and Fei Tang and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: multivariate splitting rule vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/mvsplit.html}
}

randomForestSRC: Random Forests Quantile Classifier (RFQ) Vignette

randomForestSRC Vignettes, 2021

Random forest classification for imbalanced data
H. Ishwaran, R. O’Brien, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: random forests quantile classifier (RFQ) vignette.” http://randomforestsrc.org/articles/imbalance.html.

@misc{HemantRFQv,
author = "Hemant Ishwaran and Robert O'Brien and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: random forests quantile classifier {(RFQ)} vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/imbalance.html}
}

randomForestSRC: Competing Risks Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, T. A. Gerds, B. M. Lau, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: competing risks vignette.” http://randomforestsrc.org/articles/competing.html.

@misc{HemantCompeting,
author = "Hemant Ishwaran and Thomas A. Gerds and Bryan M. Lau and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: competing risks vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/competing.html}
}

randomForestSRC: Random Survival Forests Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, M. S. Lauer, E. H. Blackstone, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: random survival forests vignette.” http://randomforestsrc.org/articles/survival.html.

@misc{HemantRandomS,
author = "Hemant Ishwaran and Michael S. Lauer and Eugene H. Blackstone and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: random survival forests vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/survival.html}
}

Interactions between Staphylococcal Enterotoxins A and D and Superantigen-Like Proteins 1 and 5 for Predicting Methicillin and Multidrug Resistance Profiles among Staphylococcus aureus Ocular Isolates

PLOS ONE, 2021

Background: Methicillin-resistant Staphylococcus aureus (MRSA) and multidrug-resistant (MDR) S. aureus strains are well recognized as posing substantial problems in treating ocular infections. S. aureus has a vast array of virulence factors, including superantigens and enterotoxins. Their interactions and ability to signal antibiotics resistance have not been explored. Objectives: To predict the relationship between superantigens and methicillin and multidrug resistance among S. aureus ocular isolates. Methods: We used a DNA microarray to characterize the enterotoxin and superantigen gene profiles of 98 S. aureus isolates collected from common ocular sources. The outcomes contained phenotypic and genotypic expressions of MRSA. We also included the MDR status as an outcome, categorized as resistance to three or more drugs, including oxacillin, penicillin, erythromycin, clindamycin, moxifloxacin, tetracycline, trimethoprim-sulfamethoxazole and gentamicin. We identified gene profiles that predicted each outcome through a classification analysis utilizing Random Forest machine learning techniques. Findings: Our machine learning models predicted the outcomes accurately utilizing 67 enterotoxin and superantigen genes. Strong correlates predicting the genotypic expression of MRSA were enterotoxins A, D, J and R and superantigen-like proteins 1, 3, 7 and 10. Among these virulence factors, enterotoxin D and superantigen-like proteins 1, 5 and 10 were also significantly informative for predicting both MDR and MRSA in terms of phenotypic expression. Strong interactions were identified including enterotoxins A (entA) interacting with superantigen-like protein 1 (set6-var1 11), and enterotoxin D (entD) interacting with superantigen-like protein 5 (ssl05/set3 probe 1): MRSA and MDR S. aureus are associated with the presence of both entA and set6-var1 11, or both entD and ssl05/set3 probe 1, while the absence of these genes in pairs indicates non-multidrug-resistant and methicillin-susceptible S. aureus. Conclusions: MRSA and MDR S. aureus show a different spectrum of ocular pathology than their non-resistant counterparts. When assessing the role of enterotoxins in predicting antibiotics resistance, it is critical to consider both main effects and interactions.

Cure and death play a role in understanding dynamics for COVID-19: data-driven competing risk compartmental models, with and without vaccination

PLOS ONE, 2021

Several factors have played a strong role in influencing the dynamics of COVID-19 in the U.S. One being the economy, where a tug of war has existed between lockdown measures to control disease versus loosening of restrictions to address economic hardship. A more recent effect has been availability of vaccines and the mass vaccination efforts of 2021. In order to address the challenges in analyzing this complex process, we developed a competing risk compartmental model framework with and without vaccination compartment. This framework separates instantaneous risk of removal for an infectious case into competing risks of cure and death, and when vaccinations are present, the vaccinated individual can also achieve immunity before infection. Computations are performed using a simple discrete time algorithm that utilizes a data driven contact rate. Using population level pre-vaccination data, we are able to identify and characterize three wave patterns in the U.S. Estimated mortality rates for second and third waves are 1.7%, which is a notable decrease from 8.5% of a first wave observed at onset of disease. This analysis reveals the importance cure time has on infectious duration and disease transmission. Using vaccination data from 2021, we find a fourth wave, however the effect of this wave is suppressed due to vaccine effectiveness. Parameters playing a crucial role in this modeling were a lower cure time and a significantly lower mortality rate for the vaccinated.

Research paper thumbnail of Editorial: Machine learning applications in educational studies

Frontiers in Education, 2023

Research paper thumbnail of Computing within-study covariances, data visualization, and missing data solutions for multivariate meta-analysis with metavcov

Frontiers in Psychology, 2023

Multivariate meta-analysis (MMA) is a powerful statistical technique that can provide more reliab... more Multivariate meta-analysis (MMA) is a powerful statistical technique that can provide more reliable and informative results than traditional univariate meta-analysis, which allows for comparisons across outcomes with increased statistical power. However, implementing appropriate statistical methods for MMA can be challenging due to the requirement of various specific tasks in data preparation. The metavcov package aims for model preparation, data visualization, and missing data solutions to provide tools for different methods that cannot be found in accessible software. It provides sufficient constructs for estimating coefficients from other well-established packages. For model preparation, users can compute both effect sizes of various types and their variance-covariance matrices, including correlation coefficients, standardized mean difference, mean difference, log odds ratio, log risk ratio, and risk difference. The package provides a tool to plot the confidence intervals for the primary studies and the overall estimates. When specific effect sizes are missing, single imputation is available in the model preparation stage; a multiple imputation method is also available for pooling the results in a statistically principled manner from models of users' choice. The package is demonstrated in two real data applications and a simulation study to assess methods for handling missing data.

Research paper thumbnail of Acculturative Stress, Resilience, and a Syndemic Factor Among Latinx Immigrants

Nursing Research, 2023

Background: The process of immigration and subsequent adaptation can expose Latinx immigrants to ... more Background: The process of immigration and subsequent adaptation can expose Latinx immigrants to chronic and compounding challenges (i.e., acculturative stress), but little is known about how resilience factors and these stressors interact to influence syndemic conditions, intertwined epidemics that disproportionally affect historically marginalized communities. Objectives: The purpose of this study was to describe the influence of acculturative stress and resilience on the syndemic factor underlying substance abuse, intimate partner violence, HIV risk, and mental conditions. Methods: Baseline cross-sectional data from a community-engaged, longitudinal study of 391 adult (ages 18-44 years) Latinx immigrants in North Carolina were obtained using standardized measures available in English and Spanish. Structural equation modeling tested the syndemic model, and random forest variable importance identified the most influential types of acculturative stressors and resilience factors, including their interactions, on the syndemic factor. Results: Results indicated that a single syndemic factor explained variations in heavy drinking, drug use, intimate partner violence, depression, and anxiety and fit the data well. Age, being a woman, acculturative stress, acculturation to the United States, and emotional support were significantly related to the syndemic factor. The relationship between acculturative stress and the syndemic factor was buffered by ethnic pride, coping, enculturation, social support, and individual resilience. The most influential acculturative stressors were marital, family, and occupation/economic stress. Discussion: Findings from this study underscore the importance of considering the co-occurrence of behavioral and mental health conditions among Latinx immigrants. 
Health promotion programs for Latinx immigrants should address acculturative stress and bolster ethnic pride, social support, and coping as sources of resilience.

Research paper thumbnail of Telehealth utilization in U.S. Medicare beneficiaries aged 65 years and older during the COVID-19 pandemic

BMC Public Health, 2023

Background The COVID-19 pandemic has become a serious public health concern for older adults and ... more Background The COVID-19 pandemic has become a serious public health concern for older adults and amplified the value of deploying telehealth solutions. The purpose of this study was to investigate telehealth offered by providers among U.S. Medicare beneficiaries aged 65 years and older during the COVID-19 pandemic. Methods This cross-sectional study analyzed Medicare beneficiaries aged 65 years and older using data from the Medicare Current Beneficiary Survey, Winter 2021 COVID-19 Supplement (n = 9, 185). We identified variables that were associated with telehealth offered by primary care physicians and beneficiaries' access to the Internet through a multivariate classification analysis utilizing Random Forest machine learning techniques.

Research paper thumbnail of Exploring Factors That Affected Student Well-Being during the COVID-19 Pandemic: A Comparison of Data-Mining Approaches

Int. J. Environ. Res. Public Health, 2022

COVID-19-related school closures caused unprecedented and prolonged disruption to daily life, edu... more COVID-19-related school closures caused unprecedented and prolonged disruption to daily life, education, and social and physical activities. This disruption in the life course affected the well-being of students from different age groups. This study proposed analyzing student well-being and determining the most influential factors that affected student well-being during the COVID-19 pandemic. With this aim, we adopted a cross-sectional study designed to analyze the student data from the Responses to Educational Disruption Survey (REDS) collected between December 2020 and July 2021 from a large sample of grade 8 or equivalent students from eight countries (n = 20,720), including Burkina Faso, Denmark, Ethiopia, Kenya, the Russian Federation, Slovenia, the United Arab Emirates, and Uzbekistan. We first estimated a well-being IRT score for each student in the REDS student database. Then, we used 10 data-mining approaches to determine the most influential factors that affected the well-being of students during the COVID-19 outbreak. Overall, 178 factors were analyzed. The results indicated that the most influential factors on student well-being were multifarious. The most influential variables on student well-being were students’ worries about contracting COVID-19 at school, their learning progress during the COVID-19 disruption, their motivation to learn when school reopened, and their excitement to reunite with friends after the COVID-19 disruption.

Research paper thumbnail of Access to care through telehealth among U.S. Medicare beneficiaries in the wake of the COVID-pandemic

Frontiers in Public Health, 2022

Background: The coronavirus disease (COVID-) public health emergency has amplified the potential ... more Background: The coronavirus disease (COVID-) public health emergency has amplified the potential value of deploying telehealth solutions. Less is known about how trends in access to care through telehealth changed over time.

Research paper thumbnail of randomForestSRC: Forest Weights, In-Bag (IB) and Out-of-Bag (OOB) Ensembles Vignette

Cite this vignette as H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: forest weigh... more Cite this vignette as
H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: forest weights, in-bag (IB) and out-of-bag (OOB) ensembles vignette.” http://randomforestsrc.org/articles/forestWgt.html.

@misc{HemantGettingStarted,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: forest weights, in-bag (IB) and out-of-Bag (OOB) ensembles vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/forestWgt.html}
}
Contents
Introduction
Formal Description of In-Bag and OOB Ensemble
Illustration

Research paper thumbnail of randomForestSRC: AUC Splitting for Multiclass Problems Vignette

@misc{HemantAUCsplit, author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur", title = {{r... more @misc{HemantAUCsplit,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: AUC splitting for multiclass problems vignette},
year = {2022},
url = {http://randomforestsrc.org/articles/aucsplit.html}
}

Research paper thumbnail of Standard errors and confidence intervals for variable importance in random forest regression, classification, and survival

Statistics in medicine, Jan 4, 2018

Random forests are a popular nonparametric tree ensemble procedure with broad applications to dat... more Random forests are a popular nonparametric tree ensemble procedure with broad applications to data analysis. While its widespread popularity stems from its prediction performance, an equally important feature is that it provides a fully nonparametric measure of variable importance (VIMP). A current limitation of VIMP, however, is that no systematic method exists for estimating its variance. As a solution, we propose a subsampling approach that can be used to estimate the variance of VIMP and for constructing confidence intervals. The method is general enough that it can be applied to many useful settings, including regression, classification, and survival problems. Using extensive simulations, we demonstrate the effectiveness of the subsampling estimator and in particular find that the delete-d jackknife variance estimator, a close cousin, is especially effective under low subsampling rates due to its bias correction properties. These 2 estimators are highly competitive when compare...

Research paper thumbnail of randomForestSRC: Partial Plots Vignette

Research paper thumbnail of Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods

Journal of Computational and Graphical Statistics

Research paper thumbnail of randomForestSRC: Variable Importance (VIMP) with Subsampling Inference Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: variable importance (VIMP) with su... more H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: variable importance (VIMP) with subsampling inference vignette.” http://randomforestsrc.org/articles/vimp.html.

@misc{HemantVIMP,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: variable importance {(VIMP)} with subsampling inference vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/vimp.html}
}

Research paper thumbnail of randomForestSRC: Getting Started with randomForestSRC Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: getting started with randomForestS... more H. Ishwaran, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: getting started with randomForestSRC vignette.” http://randomforestsrc.org/articles/getstarted.html.

@misc{HemantGettingStarted,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: getting started with {randomForestSRC} vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/getstarted.html}
}

Research paper thumbnail of randomForestSRC: sidClustering Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, A. Mantero, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: sidClustering vignette... more H. Ishwaran, A. Mantero, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: sidClustering vignette.” http://randomforestsrc.org/articles/sidClustering.html.

@misc{HemantsidClustering,
author = "Hemant Ishwaran and Alejandro Mantero and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: {sidClustering} vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/sidClustering.html}
}

Research paper thumbnail of randomForestSRC: Multivariate Splitting Rule Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, F. Tang, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: multivariate splitting ru... more H. Ishwaran, F. Tang, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: multivariate splitting rule vignette.” http://randomforestsrc.org/articles/mvsplit.html.

@misc{HemantMultiv,
author = "Hemant Ishwaran and Fei Tang and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: multivariate splitting rule vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/mvsplit.html}
}

Research paper thumbnail of randomForestSRC: Random Forests Quantile Classifier (RFQ) Vignette

randomForestSRC Vignettes, 2021

Random forest classification for imbalanced data H. Ishwaran, R. O’Brien, M. Lu, and U. B. Kogal... more Random forest classification for imbalanced data
H. Ishwaran, R. O’Brien, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: random forests quantile classifier (RFQ) vignette.” http://randomforestsrc.org/articles/imbalance.html.

@misc{HemantRFQv,
author = "Hemant Ishwaran and Robert O'Brien and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: random forests quantile classifier {(RFQ)} vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/imbalance.html}
}

randomForestSRC: Competing Risks Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, T. A. Gerds, B. M. Lau, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: competing risks vignette.” http://randomforestsrc.org/articles/competing.html.

@misc{HemantCompeting,
author = "Hemant Ishwaran and Thomas A. Gerds and Bryan M. Lau and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: competing risks vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/competing.html}
}

randomForestSRC: Random Survival Forests Vignette

randomForestSRC Vignettes, 2021

H. Ishwaran, M. S. Lauer, E. H. Blackstone, M. Lu, and U. B. Kogalur. 2021. “randomForestSRC: random survival forests vignette.” http://randomforestsrc.org/articles/survival.html.

@misc{HemantRandomS,
author = "Hemant Ishwaran and Michael S. Lauer and Eugene H. Blackstone and Min Lu
and Udaya B. Kogalur",
title = {{randomForestSRC}: random survival forests vignette},
year = {2021},
url = {http://randomforestsrc.org/articles/survival.html}
}

Interactions between Staphylococcal Enterotoxins A and D and Superantigen-Like Proteins 1 and 5 for Predicting Methicillin and Multidrug Resistance Profiles among Staphylococcus aureus Ocular Isolates

Plos One, 2021

Background: Methicillin-resistant Staphylococcus aureus (MRSA) and multidrug-resistant (MDR) S. aureus strains are well recognized as posing substantial problems in treating ocular infections. S. aureus has a vast array of virulence factors, including superantigens and enterotoxins. Their interactions and ability to signal antibiotic resistance have not been explored.
Objectives: To predict the relationship between superantigens and methicillin and multidrug resistance among S. aureus ocular isolates.
Methods: We used a DNA microarray to characterize the enterotoxin and superantigen gene profiles of 98 S. aureus isolates collected from common ocular sources. The outcomes contained phenotypic and genotypic expressions of MRSA. We also included the MDR status as an outcome, categorized as resistance to three or more drugs, including oxacillin, penicillin, erythromycin, clindamycin, moxifloxacin, tetracycline, trimethoprim-sulfamethoxazole and gentamicin. We identified gene profiles that predicted each outcome through a classification analysis utilizing Random Forest machine learning techniques.
Findings: Our machine learning models predicted the outcomes accurately utilizing 67 enterotoxin and superantigen genes. Strong correlates predicting the genotypic expression of MRSA were enterotoxins A, D, J and R and superantigen-like proteins 1, 3, 7 and 10. Among these virulence factors, enterotoxin D and superantigen-like proteins 1, 5 and 10 were also significantly informative for predicting both MDR and MRSA in terms of phenotypic expression. Strong interactions were identified, including enterotoxin A (entA) interacting with superantigen-like protein 1 (set6-var1 11) and enterotoxin D (entD) interacting with superantigen-like protein 5 (ssl05/set3 probe 1): MRSA and MDR S. aureus are associated with the presence of both entA and set6-var1 11, or both entD and ssl05/set3 probe 1, while the absence of these genes in pairs indicates non-multidrug-resistant and methicillin-susceptible S. aureus.
Conclusions: MRSA and MDR S. aureus show a different spectrum of ocular pathology than their non-resistant counterparts. When assessing the role of enterotoxins in predicting antibiotic resistance, it is critical to consider both main effects and interactions.

Cure and death play a role in understanding dynamics for COVID-19: data-driven competing risk compartmental models, with and without vaccination

Plos One, 2021

Several factors have played a strong role in influencing the dynamics of COVID-19 in the U.S. One is the economy, where a tug of war has existed between lockdown measures to control disease and loosening of restrictions to address economic hardship. A more recent effect has been the availability of vaccines and the mass vaccination efforts of 2021. To address the challenges in analyzing this complex process, we developed a competing risk compartmental model framework with and without a vaccination compartment. This framework separates the instantaneous risk of removal for an infectious case into competing risks of cure and death, and when vaccinations are present, the vaccinated individual can also achieve immunity before infection. Computations are performed using a simple discrete time algorithm that utilizes a data driven contact rate. Using population level pre-vaccination data, we are able to identify and characterize three wave patterns in the U.S. Estimated mortality rates for the second and third waves are 1.7%, a notable decrease from the 8.5% of the first wave observed at the onset of disease. This analysis reveals the importance cure time has on infectious duration and disease transmission. Using vaccination data from 2021, we find a fourth wave; however, the effect of this wave is suppressed due to vaccine effectiveness. Parameters playing a crucial role in this modeling were a lower cure time and a significantly lower mortality rate for the vaccinated.
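The abstract above describes the framework only at a high level. As a rough illustration, the following is a minimal sketch of how a discrete-time compartmental model with competing risks of cure and death could be stepped forward one day at a time. It is not the authors' algorithm: the function name `simulate`, the four compartments, the constant cause-specific hazards, and the `contact_rate` callback (standing in for the data-driven contact rate) are all assumptions made for this example, and the vaccination compartment is omitted.

```python
import math


def simulate(population, initial_infectious, contact_rate,
             cure_hazard, death_hazard, days):
    """Step a susceptible/infectious/cured/dead model forward in daily steps.

    contact_rate is a hypothetical function mapping a day index to a daily
    transmission rate; cure_hazard and death_hazard are hypothetical constant
    daily hazards for the two competing causes of removal.
    """
    susceptible = float(population - initial_infectious)
    infectious = float(initial_infectious)
    cured = 0.0
    dead = 0.0
    total_hazard = cure_hazard + death_hazard
    # Probability that an infectious case is removed by either cause in one day.
    p_removed = 1.0 - math.exp(-total_hazard)
    history = []
    for day in range(days):
        alive = susceptible + infectious + cured
        # Daily infection probability for a susceptible individual.
        p_infect = 1.0 - math.exp(-contact_rate(day) * infectious / alive)
        new_infections = susceptible * p_infect
        removed = infectious * p_removed
        # Competing risks: removals are split in proportion to the
        # cause-specific hazards of cure and death.
        new_cured = removed * cure_hazard / total_hazard
        new_dead = removed * death_hazard / total_hazard
        susceptible -= new_infections
        infectious += new_infections - removed
        cured += new_cured
        dead += new_dead
        history.append((susceptible, infectious, cured, dead))
    return history


# Illustrative values only; none of these numbers come from the paper.
history = simulate(
    population=1_000_000,
    initial_infectious=100,
    contact_rate=lambda day: 0.3 if day < 60 else 0.1,  # stricter measures after day 60
    cure_hazard=0.098,
    death_hazard=0.002,
    days=200,
)
final_susceptible, final_infectious, final_cured, final_dead = history[-1]
```

With constant hazards, the share of removals that end in death equals `death_hazard / (cure_hazard + death_hazard)`, which is one simple way to see how cure time and mortality interact in a model of this kind.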

All about the Major of Biostatistics

All about statistics: (1) Major, (2) Career, (3) Visa, (4) Job Hunting, (5) Live in the US

Chinese version An Online prediction tool for COVID19

An Online prediction tool for COVID19

Min Lu dissertation defense slides

Personalized Treatment with Non-Overlapping Treatment Groups

Estimation of multiple treatment effects in observational survival data is complicated due to confounding, heterogeneity, and selection bias. A key challenge is assessing overlap and, possibly, estimation of effects strictly within overlapping populations that are eligible for the corresponding treatments. Unfortunately, treatments do not always have clearly defined evidence-based eligibility criteria. Therefore, we propose new random forest methods to address individual therapy overlap. These methods possess the unique feature of being able to incorporate external expert knowledge either in a fully supervised way (i.e., we have a strong belief that knowledge is correct) using multilabel analyses, or in a minimally supervised fashion (i.e., knowledge is not considered gold-standard) using multiclass analyses. We directly estimate individual treatment effect (ITE) and average treatment effect (ATE) through comparison of survival under counterfactual treatment assignments using an extension to random survival forests we call virtual twin random survival forests interaction. Treatment effect is viewed as a dynamic causal procedure for making treatment decisions. Motivation for our methodology arose from the problem of current treatment management for ischemic cardiomyopathy. Using a large observational survival data set, four well established therapies are compared: coronary artery bypass grafting (CABG), CABG combined with surgical ventricular reconstruction (SVR), CABG combined with mitral valve annuloplasty (MVA), and listing for heart transplantation (LCTx).

JCGS Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods

• Invited talk at Prevention Science Methodology Group’s Spring 2017 virtual grand rounds, April 2017

• Travel Award, Biostatistics Workshop, Statistical Inference for Biomedical Big Data, University of Florida, Gainesville, April 2017

Counterfactual Random Forest Individual Causal Inference

Making use of random forests (RF) within the counterfactual framework, we estimate individual treatment effects by directly modeling the response. We find that accurate estimation of individual treatment effects is possible even in complex heterogeneous settings, but that the type of RF approach plays an important role in accuracy. Methods designed to be adaptive to confounding, when used in parallel with out-of-sample estimation, do best. One method found to be especially promising is counterfactual synthetic forests. We illustrate this new methodology by applying it to a large comparative effectiveness trial, Project Aware, in order to explore the role drug use plays in sexual risk. The analysis reveals important connections between risky behavior, drug usage, and sexual risk.

Application of meta-analysis in sport and exercise science. In N. Ntoumanis, & N. D. Myers (Eds.), An introduction to intermediate and advanced statistical analyses for sport and exercise scientists (pp. 233-253)

Ahn, S., Lu, M., Lefevor, G. T., Fedewa, A. L., & Celimli, S. (2015). Application of meta-analysis in sport and exercise science. In N. Ntoumanis, & N. D. Myers (Eds.), An introduction to intermediate and advanced statistical analyses for sport and exercise scientists (pp. 233-253). John Wiley & Sons. You can purchase it from the Wiley website: http://www.wiley-vch.de/publish/dt/books/forthcomingTitles/ST00/1-118-96205-2/?sID=fqcemseuphnnjcaoqmvn5c1v16

An online COVID-19 pandemic prediction tool: Dynamic Modeling COVID-19 for Your Own Region

https://minlu.shinyapps.io/killCOVID19/

Personalized Treatment Management Software

R programming for Categorical data analysis__Class 1

R programming for Categorical data analysis__Class 2

R programming for Categorical data analysis__Class 3

R programming for Categorical data analysis__Class 4: Contingency Tables

R programming for Categorical data analysis__Class 5

R programming for Categorical data analysis__Class 6

R programming for Categorical data analysis__Class 7

R programming for Categorical data analysis__Class 8

R programming for Categorical data analysis__Class 9

R programming for Categorical data analysis__Class 11

R programming for Categorical data analysis__Class 12

Dynamic Competing Risk Modeling COVID-19 in a Pandemic Scenario

arXiv.org

The emergence of coronavirus disease 2019 (COVID-19) in the United States has forced federal and local governments to implement containment measures. Moreover, the severity of the situation has sparked engagement by both the research and clinical community with the goal of developing effective treatments for the disease. This article proposes a time dynamic prediction model with competing risks for the infected individual and develops a simple tool for policy makers to compare different strategies in terms of when to implement the strictest containment measures and how different treatments can increase or suppress infected cases. Two types of containment strategies are compared: (1) a constant containment strategy that could satisfy the needs of citizens for a long period; and (2) an adaptive containment strategy whose level of strictness changes across time. We consider how an effective treatment of the disease can affect the dynamics in a pandemic scenario. For illustration, we consider a region with a population of 2.8 million and 200 initial infectious cases, assuming a 4% mortality rate compared with a 2% mortality rate if a new drug is available. Our results show that, compared with a constant containment strategy, adaptive containment strategies shorten the outbreak length and reduce the maximum daily number of cases. This, along with an effective treatment plan for the disease, can minimize the death rate.