Early Anomaly Detection in Time Series: A Hierarchical Approach for Predicting Critical Health Episodes
Related papers
Conditional anomaly detection methods for patient management alert systems
THE 25TH …, 2008
Anomaly detection methods can be very useful in identifying unusual or interesting patterns in data. A recently proposed conditional anomaly detection framework extends anomaly detection to the problem of identifying anomalous patterns on a subset of attributes in the data. The anomaly always depends (is conditioned) on the value of remaining attributes. The work presented in this paper focuses on instance-based methods for detecting conditional anomalies. The methods rely on the distance metric to identify examples in the dataset that are most critical for detecting the anomaly. We investigate various metrics and metric learning methods to optimize the performance of the instance-based anomaly detection methods. We show the benefits of the instance-based methods on two real-world detection problems: detection of unusual admission decisions for patients with the community-acquired pneumonia and detection of unusual orders of an HPF4 test that is used to confirm Heparin-induced thrombocytopenia, a life-threatening condition caused by the Heparin therapy.
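The instance-based idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the function name, the plain Euclidean metric, the choice of k, and the Laplace smoothing are all assumptions made for illustration. The sketch finds the stored examples whose context attributes are closest to the query and scores the query's target value by how rare it is among those neighbours.

```python
import math
from collections import Counter


def conditional_anomaly_score(query_context, query_target, examples, k=5):
    """Return 1 - P(target | context), estimated from the k nearest examples.

    examples: list of (context_vector, target_label) pairs.
    Euclidean distance on the context attributes is an assumed metric;
    the abstract studies several metrics and metric learning methods.
    """
    # Rank stored examples by distance between context attributes only.
    ranked = sorted(examples, key=lambda ex: math.dist(ex[0], query_context))
    neighbours = ranked[:k]
    counts = Counter(label for _, label in neighbours)
    # Laplace smoothing (an assumption) keeps unseen labels above zero.
    labels = {label for _, label in examples} | {query_target}
    prob = (counts[query_target] + 1) / (len(neighbours) + len(labels))
    return 1.0 - prob
```

A score close to 1 means the observed decision is unusual for cases with a similar context; the threshold at which an alert is raised is left open here.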
Efficient Novelty Detection Methods for Early Warning of Potential Fatal Diseases
Cornell University - arXiv, 2022
Fatal diseases, as Critical Health Episodes (CHEs), represent real dangers for patients hospitalized in Intensive Care Units. These episodes can lead to irreversible organ damage and death. Nevertheless, diagnosing them in time would greatly reduce their inconvenience. This study therefore focused on building a highly effective early warning system for CHEs such as Acute Hypotensive Episodes and Tachycardia Episodes. To facilitate the precocity of the prediction, a gap of one hour was considered between the observation periods (Observation Windows) and the periods during which a critical event can occur (Target Windows). The MIMIC II dataset was used to evaluate the performance of the proposed system. This system first includes extracting additional features using three different modes. Then, the feature selection process allowing the selection of the most relevant features was performed using the Mutual Information Gain feature importance. Finally, the high-performance predictive model LightGBM was used to perform episode classification. This approach called MIG-LightGBM was evaluated using five different metrics: Event Recall (ER), Reduced Precision (RP), average Anticipation Time (aveAT), average False Alarms (aveFA), and Event F1-score (EF1-score). A method is therefore considered highly efficient for the early prediction of CHEs if it exhibits not only a large aveAT but also a large EF1-score and a low aveFA. Compared to systems using Extreme Gradient Boosting, Support Vector Classification or Naive Bayes as a predictive model, the proposed system was found to be highly dominant. It also confirmed its superiority over the Layered Learning approach.
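The observation window, gap, and target window layout described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's pipeline: the function name and parameters are hypothetical, window lengths are given in samples rather than hours, and the episode rule (a minimum fraction of target-window values below a threshold) is assumed, since the abstract does not define it.

```python
def make_examples(signal, obs_len, gap_len, target_len, threshold,
                  min_fraction=0.9):
    """Slice one signal into (observation_window, label) pairs.

    The target window starts gap_len samples after the observation window
    ends. label is 1 when at least min_fraction of the target window lies
    below threshold (an assumed episode rule), else 0.
    """
    examples = []
    span = obs_len + gap_len + target_len
    for start in range(len(signal) - span + 1):
        obs = signal[start:start + obs_len]
        target_start = start + obs_len + gap_len
        target = signal[target_start:target_start + target_len]
        below = sum(1 for value in target if value < threshold)
        label = 1 if below / target_len >= min_fraction else 0
        examples.append((obs, label))
    return examples
```

Each observation window would then be turned into features and passed to a classifier; the gap is what forces the model to predict ahead of the episode rather than during it.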
2019
BACKGROUND More than 20% of patients admitted to the intensive care unit (ICU) develop an adverse event (AE) increasing the risk of further complications and mortality. Despite substantial research on AE prediction, no previous study has leveraged patients’ temporal data to extract features using their structural temporal patterns, i.e. trends. OBJECTIVE To improve AE prediction methods by using structural temporal pattern detection for patients admitted to the ICU by extracting features from their temporal pattern data to capture global and local temporal trends and to demonstrate these improvements in the detection of Acute Kidney Injury (AKI). METHODS Using the MIMIC dataset, we extracted both global and local trends using structural pattern detection methods to predict AKI. Classifiers were built using state-of-the-art models; the optimal classifier was selected for comparisons with previous approaches. The classifier with structural pattern detection features was compared with ...
2020
Many areas of research are characterised by the deluge of large-scale highly-dimensional time-series data. However, using the data available for prediction and decision making is hampered by the current lag in our ability to uncover and quantify true interactions that explain the outcomes. We are interested in areas such as intensive care medicine, which are characterised by i) continuous monitoring of multivariate variables and non-uniform sampling of data streams, ii) the outcomes are generally governed by interactions between a small set of rare events, iii) these interactions are not necessarily definable by specific values (or value ranges) of a given group of variables, but rather, by the deviations of these values from the normal state recorded over time, iv) the need to explain the predictions made by the model. Here, while numerous data mining models have been formulated for outcome prediction, they are unable to explain their predictions. We present a model for uncovering i...
Early detection of sepsis utilizing deep learning on electronic health record event sequences
Background: The timeliness of detection of a sepsis incidence in progress is a crucial factor in the outcome for the patient. Machine learning models built from data in electronic health records can be used as an effective tool for improving this timeliness, but so far the potential for clinical implementations has been largely limited to studies in intensive care units. This study will employ a richer data set that will expand the applicability of these models beyond intensive care units. Furthermore, we will circumvent several important limitations that have been found in the literature: 1) Models are evaluated shortly before sepsis onset without considering interventions already initiated. 2) Machine learning models are built on a restricted set of clinical parameters, which are not necessarily measured in all departments. 3) Model performance is limited by current knowledge of sepsis, as feature interactions and time dependencies are hard-coded into the model. Methods: In this study, we present a model to overcome these shortcomings using a deep learning approach on a diverse multicenter data set. We used retrospective data from multiple Danish hospitals over a seven-year period. Our sepsis detection system is constructed as a combination of a convolutional neural network and a long short-term memory network. We suggest a retrospective assessment of interventions by looking at intravenous antibiotics and blood cultures preceding the prediction time. Results: Results show performance ranging from AUROC 0.856 (3 hours before sepsis onset) to AUROC 0.756 (24 hours before sepsis onset). Evaluating the clinical utility of the model, we find that a large proportion of septic patients did not receive antibiotic treatment or blood culture at the time of the sepsis prediction, and the model could therefore facilitate such interventions at an earlier point in time. 
Conclusion: We present a deep learning system for early detection of sepsis that is able to learn characteristics of the key factors and interactions from the raw event sequence data itself, without relying on a labor-intensive feature extraction work. Our system outperforms baseline models, such as gradient boosting, which rely on specific data elements and therefore suffer from many missing values in our dataset.
Prediction of Sudden Health Crises Owing to Congestive Heart Failure with Deep Learning Models
2021
Received: 23 November 2020; accepted: 6 February 2021. Artificial Intelligence (AI) has its roots in every area in the present scenario. Healthcare is one of the markets in which AI has greatly grown in recent years. The tremendous increase in health data generation and the substantial evolution of the robust data analysis tools have contributed to AI improvement in health care and research, leading to increased service efficiency. Health reporting is stored as Electronic Health Records (EHR), providing information on the patients sought temporarily. EHR data have different issues, such as heterogeneity, missing values, distortion, noise, time, etc. This study reflects the irregularity of appointment that refers to the irregular timing of the operations (patient visits). Congestive heart failure (CHF) is a grave clinical disorder caused by an insufficient blood supply in the bloodstream owing to a heart muscle dysfunction. Most people suffer from CHF which results in death or immediate...
Detecting hazardous intensive care patient episodes using real-time mortality models
2009
The modern intensive care unit (ICU) has become a complex, expensive, data-intensive environment. Caregivers maintain an overall assessment of their patients based on important observations and trends. If an advanced monitoring system could also reliably provide a systemic interpretation of a patient's observations it could help caregivers interpret these data more rapidly and perhaps more accurately. In this thesis I use retrospective analysis of mixed medical/surgical intensive care patients to develop predictive models. Logistic regression is applied to 7048 development patients with several hundred candidate variables. These candidate variables range from simple vitals to long term trends and baseline deviations. Final models are selected by backward elimination on top cross-validated variables and validated on 3018 additional patients. The real-time acuity score (RAS) that I develop demonstrates strong discrimination ability for patient mortality, with an ROC area (AUC) of 0.880. The final model includes a number of variables known to be associated with mortality, but also computationally intensive variables absent in other severity scores. In addition to RAS, I also develop secondary outcome models that perform well at predicting pressor weaning (AUC=0.825), intraaortic balloon pump removal (AUC=0.816), the onset of septic shock (AUC=0.843), and acute kidney injury (AUC=0.742). Real-time mortality prediction is a feasible way to provide continuous risk assessment for ICU patients. RAS offers similar discrimination ability when compared to models computed once per day, based on aggregate data over that day. Moreover, RAS mortality predictions are better at discrimination than a customized SAPS II score (Day 3 AUC=0.878 vs AUC=0.849, p < 0.05). The secondary outcome models also provide interesting insights into patient responses to care and patient risk profiles. 
While models trained for specifically recognizing secondary outcomes consistently outperform the RAS model at their specific tasks, RAS provides useful baseline risk estimates throughout these events and in some cases offers a notable level of predictive utility.
Medical Data Mining for Early Deterioration Warning in General Hospital Wards
2011 IEEE 11th International Conference on Data Mining Workshops, 2011
Data mining on medical data has great potential to improve the treatment quality of hospitals and increase the survival rate of patients. Every year, 4-17% of patients undergo cardiopulmonary or respiratory arrest while in hospitals. Early prediction techniques have become an apparent need in many clinical areas. Clinical study has found early detection and intervention to be essential for preventing clinical deterioration in patients at general hospital units. In this paper, based on data mining technology, we propose an early warning system (EWS) designed to identify the signs of clinical deterioration and provide early warning for serious clinical events. Our EWS is designed to provide reliable early alarms for patients at the general hospital wards (GHWs). EWS automatically identifies patients at risk of clinical deterioration based on their existing electronic medical record. The main task of EWS is a challenging classification problem on high-dimensional stream data with irregular, multi-scale data gaps, measurement errors, outliers, and class imbalance. In this paper, we propose a novel data mining framework for analyzing such medical data streams. The framework addresses the above challenges and represents a practical approach for early prediction and prevention based on data that would realistically be available at GHWs. We assess the feasibility of the proposed EWS approach through a retrospective study that includes data from 28,927 visits at a major hospital. Finally, we apply our system in a real-time clinical trial and obtain promising results. This project is an example of multidisciplinary cyber-physical systems involving researchers in clinical science, data mining, and nursing staff in the hospital. Our early warning algorithm shows promising results: the transfer of patients to ICU was predicted with sensitivity of 0.4127 and specificity of 0.950 in the real time system.
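For readers unfamiliar with the two figures reported at the end of this abstract, the standard definitions are sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The helper below is a small illustrative sketch; its name and the 0/1 label encoding are assumptions, and it does not reproduce the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) from binary labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)
```

With class imbalance of the kind the abstract mentions, a high specificity paired with a moderate sensitivity means few false alarms at the cost of missing a share of true deteriorations.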
Learning Temporal Rules to Forecast Instability in Intensive Care Patients
Intensive care medicine, 2013
Inductive machine learning, and in particular extraction of association rules from data, has been successfully used in multiple application domains, such as market basket analysis, disease prognosis, fraud detection, and protein sequencing. The appeal of rule extraction techniques stems from their ability to handle intricate problems yet produce models based on rules that can be comprehended by humans, and are therefore more transparent. Human comprehension is a factor that may improve adoption and use of data-driven decision support systems clinically via face validity. In this work, we explore whether we can reliably and informatively forecast cardiorespiratory instability (CRI) in step-down unit (SDU) patients utilizing data from continuous monitoring of physiologic vital sign (VS) measurements. We use a temporal association rule extraction technique in conjunction with a rule fusion protocol to learn how to forecast CRI in continuously monitored patients. We detail our approach and present and discuss encouraging empirical results obtained using continuous multivariate VS data from the bedside monitors of 297 SDU patients spanning 29 346 hours (3.35 patient-years) of observation. We present example rules that have been learned from data to illustrate potential benefits of comprehensibility of the extracted models, and we analyze the empirical utility of each VS as a potential leading indicator of an impending CRI event.
Anomaly Detection Paradigm for Multivariate Time Series Data Mining for Healthcare
Applied Sciences
Time series data are significant, and are derived from temporal data, which involve real numbers representing values collected regularly over time. Time series have a great impact on many types of data. However, time series have anomalies. We introduce an anomaly detection paradigm called novel matrix profile (NMP) to solve the all-pairs similarity search problem for time series data in healthcare. The proposed paradigm inherits the features from two state-of-the-art algorithms: Scalable Time series Anytime Matrix Profile (STAMP) and Scalable Time-series Ordered-search Matrix Profile (STOMP). The proposed NMP caches the output in an easy-to-access fashion for single- and multidimensional data. The proposed NMP can be used on large multivariate data sets and generates approximate solutions of high quality in a reasonable time. It is implemented on a Python platform. To determine its effectiveness, it is compared with the state-of-the-art matrix profile algorithms, i.e., STAMP and...
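The all-pairs similarity search that a matrix profile answers can be shown with a brute-force sketch. This is deliberately neither STAMP, STOMP, nor the NMP method of the abstract; it is a quadratic-time reference version, and the function names and the exclusion-zone width are assumptions. For each length-m window it records the z-normalized Euclidean distance to the nearest non-overlapping-enough window; large values mark candidate anomalies.

```python
import math


def znorm(window):
    """Z-normalize a window; constant windows map to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(v - mean) / std for v in window]


def matrix_profile(series, m):
    """Brute-force matrix profile of a univariate series, window length m."""
    n = len(series) - m + 1
    windows = [znorm(series[i:i + m]) for i in range(n)]
    exclusion = max(1, m // 2)  # assumed width; skips trivial self-matches
    profile = []
    for i in range(n):
        best = math.inf
        for j in range(n):
            if abs(i - j) < exclusion:
                continue
            best = min(best, math.dist(windows[i], windows[j]))
        profile.append(best)
    return profile
```

On a periodic signal with one corrupted sample, the windows covering that sample are the only ones without a close match elsewhere, so they carry the highest profile values.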