Gaspard Harerimana - Academia.edu
Papers by Gaspard Harerimana
The prognosis of a patient's re-admission and the forecast of future diagnoses are critical tasks in the process of inferring clinical outcomes. The discharge summaries recorded in the Electronic Health Records (EHR) are rich in information, but they are also heterogeneous, sparse, noisy, and biased, which hinders learning algorithms that aim to extract actionable insights from them. Existing approaches use the current admission's International Classification of Diseases (ICD) codes as input, but these do not fully describe the patient's progression. Other systems apply attention mechanisms directly to the notes without the guidance of domain knowledge, resulting in distorted predictions. In this work, we propose a hybrid LSTM-CNN self-guided attention model that uses the discharge narratives to predict the ICD diagnosis most likely to cause the next readmission within 90 days of the current discharge. Because the notes contain unnecessary tokens, the model leverages recent advances in deep learning to predict the patient's future diagnosis by reducing the number of tokens from the notes considered for prediction. We use a 1D Convolutional Neural Network (CNN) to capture all features from the note while, concurrently, a Long Short-Term Memory (LSTM) network extracts the features of clinically meaningful Concept Unique Identifiers (CUIs) that are fetched from the note itself to build a knowledge base. The textual knowledge base guides the learning module about which n-grams from the note to focus on for prediction. We consider three prediction scenarios: diagnosis category prediction, the probability of occurrence of one of the top 20 disease conditions, and ICD9 code prediction. The model achieves a macro-average ROC of 0.82 for diagnosis category prediction, an AUROC of 0.87 for most of the top 20 most frequently appearing diseases, and a micro-recall of 0.84 for ICD9 code prediction. The predictive accuracy of the model is further assessed through the prediction of heart failure onset, and for all these prediction scenarios the results show that the hybrid approach outperforms the existing baselines.
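To make the guidance mechanism concrete, below is a minimal sketch of the hybrid idea, assuming a PyTorch implementation: a 1D CNN encodes n-gram features from the full note, an LSTM summarizes the CUI sequence into a knowledge context, and that context scores which n-grams the classifier attends to. Layer sizes, the fusion step, and all names are illustrative assumptions, not the paper's exact architecture.

```python
# A minimal sketch of the hybrid LSTM-CNN self-guided attention idea.
# Dimensions and the fusion step are assumptions for illustration.
import torch
import torch.nn as nn

class SelfGuidedAttentionClassifier(nn.Module):
    def __init__(self, vocab_size, cui_vocab_size, emb_dim=128,
                 n_filters=128, kernel_size=3, hidden=128, n_classes=20):
        super().__init__()
        self.note_emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.cui_emb = nn.Embedding(cui_vocab_size, emb_dim, padding_idx=0)
        # 1D CNN over the full note captures n-gram features.
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size, padding=1)
        # LSTM over the CUI sequence builds the textual knowledge context.
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.query = nn.Linear(hidden, n_filters)
        self.out = nn.Linear(n_filters, n_classes)

    def forward(self, note_ids, cui_ids):
        # note_ids: (batch, note_len); cui_ids: (batch, cui_len)
        ngrams = self.conv(self.note_emb(note_ids).transpose(1, 2))  # (B, F, L)
        ngrams = torch.relu(ngrams).transpose(1, 2)                  # (B, L, F)
        _, (h_n, _) = self.lstm(self.cui_emb(cui_ids))
        guide = self.query(h_n[-1]).unsqueeze(2)                     # (B, F, 1)
        # Attention scores: which n-grams agree with the CUI knowledge context.
        scores = torch.softmax(ngrams.bmm(guide).squeeze(2), dim=1)  # (B, L)
        context = (ngrams * scores.unsqueeze(2)).sum(dim=1)          # (B, F)
        return self.out(context)  # logits over diagnosis classes
```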
IEEE Access
Deep learning has progressively been the spotlight of innovations that aim to leverage the clinical time-series data longitudinally recorded in the Electronic Health Records (EHR) to forecast patient survival and vital-sign deterioration. However, the recording velocity of these data, as well as their noisiness, hinders the proper adoption of recently proposed benchmarks. Recurrent Neural Networks (RNNs), especially Long Short-Term Memory (LSTM) networks, have achieved strong results in recent studies, but they are hard to train and interpret and fail to properly capture long-term dependencies. Moreover, RNNs struggle with clinical time series because their sequential processing precludes parallelization. Recently, the Transformer was proposed for Natural Language Processing (NLP) tasks and achieved state-of-the-art results. Hence, to tackle the drawbacks suffered by RNNs, we propose a clinical time-series Multi-head Transformer (MHT), a Transformer-based model that forecasts the patient's future time-series variables from the vital signs. To demonstrate the generalization of the model, we use the same model for other critical tasks that describe an Intensive Care Unit (ICU) patient's progression and associated risks: the remaining Length of Stay (LoS), in-hospital mortality, and 24-hour mortality. Our model achieves an Area Under the Receiver Operating Characteristic curve (AUC-ROC) of 0.98 and an Area Under the Precision-Recall curve (AUC-PR) of 0.424 for vital time-series prediction, and an AUC-ROC of 0.875 for mortality prediction. The model performs well for frequently recorded variables such as the Heart Rate (HR) but performs only on par with its LSTM counterparts for intermittently captured records such as the White Blood Count (WBC).
INDEX TERMS: Multi-head transformer, clinical time series, natural language processing, self-attention, encoder-decoder attention, interpolation.
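As a rough illustration of the approach, the following sketch shows a multi-head Transformer encoder forecasting next-step vitals, assuming regularly resampled inputs of shape (batch, timesteps, n_vitals). Dimensions, the forecasting head, and the positional-embedding choice are assumptions for illustration, not the paper's MHT configuration.

```python
# A minimal sketch of a multi-head Transformer encoder for vital-sign
# forecasting. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class VitalsTransformer(nn.Module):
    def __init__(self, n_vitals=12, d_model=64, n_heads=4, n_layers=2,
                 max_len=48):
        super().__init__()
        self.input_proj = nn.Linear(n_vitals, d_model)
        # Learned positional embeddings restore the ordering information
        # that an RNN would otherwise impose sequentially.
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_vitals)  # next-step vitals

    def forward(self, x):
        # x: (batch, timesteps, n_vitals)
        t = torch.arange(x.size(1), device=x.device)
        h = self.input_proj(x) + self.pos(t)
        h = self.encoder(h)         # self-attention over all steps in parallel
        return self.head(h[:, -1])  # forecast the next observation
```

Unlike an RNN, every time step here attends to every other step in one pass, which is what removes the sequential-processing bottleneck the abstract describes.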
Journal of Biomedical Informatics
Leveraging the longitudinal data in Electronic Health Records (EHR) to produce actionable clinical insights has been a central concern of recent studies. Unanticipated extended hospitalizations account for a disproportionate amount of resource use, mediocre inpatient care quality, and avoidable fatalities. The capability to predict the Length of Stay (LoS) and mortality in the early stages of an admission provides opportunities to improve care and prevent avoidable losses. Forecasting in-hospital mortality gives clinicians enough insight to make decisions and hospitals the means to allocate resources, hence predicting LoS and mortality within the first day of admission is a difficult but paramount endeavor. The biggest challenge is that few data are available by this time, so the prediction has to draw on the previous admission history and the free-text diagnosis recorded immediately on admission. We propose a model that uses the multi-modal EHR structured medical codes and key demographic information to classify the LoS into three classes, Short LoS (LoS ⩽ 10 days), Medium LoS (10 < LoS ⩽ 30 days), and Long LoS (LoS > 30 days), and to predict mortality as a binary classification of a patient's death during the current admission. The prediction uses only data available within 24 hours of admission. The key predictors include previous ICD9 diagnosis codes, ICD9 procedures, key demographic data, and the free-text diagnosis of the current admission recorded right on admission. We propose a Hierarchical Attention Network model (HAN-LoS and HAN-Mor) and train it on a dataset of over 45,321 admissions recorded in the de-identified MIMIC-III dataset. For improved prediction, our attention mechanisms can focus on the most influential past admissions and the most influential codes within those admissions. For fair performance evaluation, we implemented and compared the HAN model with previous approaches. With dataset balancing techniques, HAN-LoS achieved an AUROC of over 0.82 and a micro-F1 score of 0.24, and HAN-Mor achieved an AUC-ROC of 0.87, thus outperforming existing baselines that use structured medical codes as well as clinical time series for LoS and mortality forecasting. By predicting mortality and LoS with the same model, we show that with little tuning the proposed model can be used for other clinical predictive tasks such as phenotyping, decompensation, re-admission prediction, and survival analysis.
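A minimal sketch of the hierarchical attention idea follows: additive attention pools the medical codes within each past admission into an admission vector, and a second attention pools the admission vectors into a patient representation. The additive-attention form and all sizes are assumptions, not the exact HAN-LoS/HAN-Mor design.

```python
# A minimal sketch of hierarchical attention over admissions and codes.
# Embedding sizes and the additive-attention form are illustrative assumptions.
import torch
import torch.nn as nn

class Attention(nn.Module):
    """Additive attention pooling over the second-to-last dimension."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.context = nn.Linear(dim, 1, bias=False)

    def forward(self, h):                       # h: (..., seq, dim)
        w = torch.softmax(self.context(torch.tanh(self.proj(h))), dim=-2)
        return (w * h).sum(dim=-2)              # weighted sum -> (..., dim)

class HierarchicalAdmissionNet(nn.Module):
    def __init__(self, n_codes, emb_dim=64, n_classes=3):
        super().__init__()
        self.code_emb = nn.Embedding(n_codes, emb_dim, padding_idx=0)
        self.code_attn = Attention(emb_dim)       # codes within one admission
        self.adm_attn = Attention(emb_dim)        # across past admissions
        self.out = nn.Linear(emb_dim, n_classes)  # Short / Medium / Long LoS

    def forward(self, codes):
        # codes: (batch, n_admissions, codes_per_admission) integer IDs
        h = self.code_emb(codes)                  # (B, A, C, E)
        adm_vecs = self.code_attn(h)              # (B, A, E)
        patient_vec = self.adm_attn(adm_vecs)     # (B, E)
        return self.out(patient_vec)              # LoS class logits
```

The two attention layers are what make the influential admissions and codes inspectable: their softmax weights can be read off directly as importance scores.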
IEEE Access
The vast availability of data has brought additional focus to the health industry, and an increasing number of studies aim to leverage these data to improve healthcare. Health data are growing increasingly large and complex, and their sources have expanded tremendously to include computerized physician order entry, electronic medical records, clinical notes, medical images, cyber-physical systems, the medical Internet of Things, genomic data, and clinical decision support systems. New types of data from sources such as social network services and genomic data are used to build personalized healthcare systems; hence health data are obtained in various forms, from varied sources, contexts, and technologies, and their nature can impede proper analysis. Any analytical research must overcome these obstacles to mine the data and produce meaningful insights to save lives. In this paper, we investigate the key challenges, data sources, techniques, technologies, and future directions in the field of big data analytics in healthcare. We provide a do-it-yourself review that delivers a holistic, simplified, and easily understandable view of the various technologies used to develop an integrated health analytics application.
INDEX TERMS: Big data, cyber-physical systems, health analytics, machine learning, social network analysis.
IEEE Access
Recent technological advancements have led to a deluge of medical data from various domains. However, the data recorded from divergent sources come poorly annotated, noisy, and unstructured. Hence, the data are not fully leveraged to establish actionable insights that can be used in clinical applications. These data, recorded in hospitals' Electronic Health Records (EHR), consist of patient information, clinical notes, charted events, medications, procedures, laboratory test results, diagnosis codes, etc. Traditional machine learning and statistical methods have failed to offer insights that physicians can use to treat patients, as they require expert-assisted features before a predictive model can be built. With the rise of deep learning methods, there is a need to understand how deep learning can save lives. The purpose of this study was to offer an intuitive explanation of possible use cases of deep learning with Electronic Health Records. We reflect on techniques that health informatics professionals can apply, giving technical intuitions and blueprints for how each clinical task can be approached with a deep learning algorithm.
Applied Sciences
There is a need to extract meaningful information from big data, classify it into different categories, and predict end-user behavior or emotions. Large amounts of data are generated from various sources such as social media and websites. Text classification is a representative research topic in the field of natural-language processing (NLP) that categorizes unstructured text data into meaningful categorical classes. The long short-term memory (LSTM) model and the convolutional neural network for sentence classification produce accurate results and have recently been used in various NLP tasks. Convolutional neural network (CNN) models use convolutional layers and max pooling or max-over-time pooling layers to extract higher-level features, while LSTM models can capture long-term dependencies between word sequences and hence are well suited to text classification. However, even with the hybrid approach that leverages the powers of these two deep-learning model...
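The max-over-time pooling mechanism named above can be sketched as follows, in the style of a standard sentence-classification CNN; filter sizes and dimensions are illustrative assumptions.

```python
# A minimal sketch of max-over-time pooling in a sentence-classification CNN.
# Filter widths and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SentenceCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, n_filters=100,
                 kernel_sizes=(3, 4, 5), n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, k) for k in kernel_sizes)
        self.out = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)          # (B, E, L)
        # Max-over-time pooling keeps each filter's strongest activation,
        # making the representation independent of sentence length.
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.out(torch.cat(pooled, dim=1))
```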
IEEE Access, 2019
Q-learning is arguably one of the most widely applied reinforcement learning approaches and a representative off-policy strategy. Since its emergence, many studies have described its uses in reinforcement learning and artificial intelligence problems. However, there is an information gap as to how these powerful algorithms can be leveraged and incorporated into a general artificial intelligence workflow. Early Q-learning algorithms were unsatisfactory in several respects and covered a narrow range of applications. It has also been observed that this otherwise powerful algorithm can learn unrealistically and overestimate action values, thereby degrading overall performance. Recently, with the general advances of machine learning, more variants of Q-learning, such as Deep Q-learning, which combines basic Q-learning with deep neural networks, have been developed and applied extensively. In this paper, we thoroughly explain how Q-learning evolved, unraveling the mathematical machinery behind it as well as its lineage within the reinforcement learning family of algorithms. Improved variants are fully described, and we categorize Q-learning algorithms into single-agent and multi-agent approaches. Finally, we thoroughly investigate up-to-date research trends and key applications that leverage Q-learning algorithms.
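For reference, the core update that the surveyed variants build on is the tabular Q-learning rule, Q(s, a) ← Q(s, a) + α[r + γ max_a' Q(s', a') − Q(s, a)]. The sketch below implements it with an ε-greedy behavior policy; the Gym-style environment interface (reset/step) is an assumption for illustration.

```python
# A minimal sketch of tabular Q-learning. The env interface (reset/step
# returning (next_state, reward, done)) is a Gym-style assumption.
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy behavior policy.
            if np.random.rand() < epsilon:
                a = np.random.randint(n_actions)   # explore
            else:
                a = int(np.argmax(Q[s]))           # exploit
            s_next, r, done = env.step(a)
            # Off-policy: the target bootstraps from the greedy action
            # in s_next, not the action the behavior policy will take.
            target = r + gamma * (0.0 if done else Q[s_next].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q
```

The max in the target is also the source of the overestimation bias mentioned above, which variants such as Double Q-learning were designed to reduce.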