Rule-based anomaly pattern detection for detecting disease outbreaks

Framework on Outlier Sequential patterns for Outbreak Detection

2011

Many outbreak detection methods are available, using techniques that range from statistics to data mining, including machine learning. With the shift toward spatial-temporal data, research in public health surveillance, especially outbreak or anomaly detection, is a promising area. In this paper we apply data mining techniques to detect outbreaks in public health surveillance. The framework involves three phases: learning, detecting, and repository. From the extracted sequential patterns, an outlier set is identified using outlier detection algorithms.

1.0 Introduction

The main objective of a health surveillance system is to reduce the impact of an outbreak by enabling officials to detect it quickly and implement timely, appropriate interventions. Identifying an outbreak days to weeks earlier than traditional surveillance will result in a reduction in morbidity, mortality, and its economic consequences. This is likely obtainable by improvements in data ...

Conditional anomaly detection methods for patient-management alert systems

THE 25TH …, 2008

Anomaly detection methods can be very useful in identifying unusual or interesting patterns in data. A recently proposed conditional anomaly detection framework extends anomaly detection to the problem of identifying anomalous patterns on a subset of attributes in the data. The anomaly always depends (is conditioned) on the values of the remaining attributes. The work presented in this paper focuses on instance-based methods for detecting conditional anomalies. The methods rely on a distance metric to identify the examples in the dataset that are most critical for detecting the anomaly. We investigate various metrics and metric learning methods to optimize the performance of the instance-based anomaly detection methods. We show the benefits of the instance-based methods on two real-world detection problems: detection of unusual admission decisions for patients with community-acquired pneumonia, and detection of unusual orders of an HPF4 test that is used to confirm Heparin-induced thrombocytopenia, a life-threatening condition caused by Heparin therapy.
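The instance-based idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a plain Euclidean metric (the paper investigates learned metrics), a fixed neighbourhood size `k`, and hypothetical toy data. The function name `conditional_anomaly_score` and the feature layout are illustrative assumptions.

```python
import math

def conditional_anomaly_score(cases, query_x, query_y, k=3):
    """Score how unusual decision query_y is, given context query_x.

    cases: list of (features, decision) pairs from past patients;
           features are equal-length tuples of numbers.
    Returns 1 - (fraction of the k nearest past cases sharing query_y),
    so 0.0 means "everyone similar did the same" and 1.0 means "nobody did".
    """
    # Rank past cases by Euclidean distance on the conditioning attributes only.
    ranked = sorted(cases, key=lambda c: math.dist(c[0], query_x))
    neighbours = ranked[:k]
    agree = sum(1 for _, y in neighbours if y == query_y)
    return 1.0 - agree / len(neighbours)

# Hypothetical toy data: (age, severity score) -> admitted to hospital?
past = [((70, 4.0), True), ((72, 4.2), True), ((68, 3.9), True),
        ((30, 1.0), False), ((28, 1.2), False), ((33, 0.9), False)]

# An elderly, severe case that was NOT admitted disagrees with all neighbours.
print(conditional_anomaly_score(past, (71, 4.1), False))
```

In practice the features would need scaling before a Euclidean distance is meaningful, which is one reason a learned metric is attractive.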

Anomaly Detection in COVID-19 Time-Series Data

SN Computer Science

Anomaly detection and explanation in big volumes of real-world medical data, such as those pertaining to COVID-19, pose some challenges. First, we are dealing with time-series data. Typical time-series data describe behavior of a single object over time. In medical data, we are dealing with time-series data belonging to multiple entities. Thus, there may be multiple subsets of records such that records in each subset, which belong to a single entity are temporally dependent, but the records in different subsets are unrelated. Moreover, the records in a subset contain different types of attributes, some of which must be grouped in a particular manner to make the analysis meaningful. Anomaly detection techniques need to be customized for time-series data belonging to multiple entities. Second, anomaly detection techniques fail to explain the cause of outliers to the experts. This is critical for new diseases and pandemics where current knowledge is insufficient. We propose to address these issues by extending our existing work called IDEAL, which is an LSTM-autoencoder based approach for data quality testing of sequential records, and provides explanations of constraint violations in a manner that is understandable to end-users. The extension (1) uses a novel two-level reshaping technique that splits COVID-19 data sets into multiple temporally-dependent subsequences and (2) adds a data visualization plot to further explain the anomalies and evaluate the level of abnormality of subsequences detected by IDEAL. We performed two systematic evaluation studies for our anomalous subsequence detection. One study uses aggregate data, including the number of cases, deaths, recovered, and percentage of hospitalization rate, collected from a COVID tracking project, New York Times, and Johns Hopkins for the same time period. The other study uses COVID-19 patient medical records obtained from Anschutz Medical Center health data warehouse. 
The results are promising and indicate that our techniques can be used to detect anomalies in large volumes of real-world unlabeled data whose accuracy or validity is unknown.
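The abstract does not spell out the two-level reshaping, so the following is only a sketch of one plausible reading: first split records by entity, then cut each entity's time-ordered values into overlapping fixed-length subsequences. The record layout `(entity, time, value)`, the function name `two_level_reshape`, and the window length are assumptions, not details of IDEAL.

```python
from collections import defaultdict

def two_level_reshape(records, window=3):
    """Group records by entity, then slide a fixed window over each group.

    records: iterable of (entity, time, value) tuples, in any order.
    Returns {entity: [subsequence, ...]} where each subsequence is a list
    of `window` consecutive values in time order.
    """
    # Level 1: separate records that belong to unrelated entities.
    by_entity = defaultdict(list)
    for entity, t, value in records:
        by_entity[entity].append((t, value))

    # Level 2: within each entity, order by time and cut overlapping windows.
    reshaped = {}
    for entity, rows in by_entity.items():
        rows.sort(key=lambda r: r[0])
        values = [v for _, v in rows]
        reshaped[entity] = [values[i:i + window]
                            for i in range(len(values) - window + 1)]
    return reshaped

# Hypothetical daily case counts for two regions, deliberately out of order.
data = [("A", 2, 20), ("B", 1, 5), ("A", 1, 10),
        ("A", 3, 30), ("A", 4, 40), ("B", 2, 6)]
print(two_level_reshape(data, window=3))
```

Entities with fewer records than the window length yield no subsequences; a real pipeline would have to decide whether to pad or drop them.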

Population-wide Anomaly Detection

Early detection of disease outbreaks, particularly an outbreak due to an act of bioterrorism, is a critically important problem due to the potential to reduce both morbidity and mortality. One of the most lethal bioterrorism scenarios is a large-scale release of inhalational anthrax. The Population-wide Anomaly Detection and Assessment (PANDA) algorithm [1] is specifically designed to monitor health-care data for the onset of an outbreak caused by an outdoor, airborne release of inhalational anthrax. At the heart of the PANDA algorithm is a causal Bayesian network which models the effects of the outbreak on a population. The most unique aspect of the PANDA algorithm is an approach we will refer to as population-wide anomaly detection in which each individual in the population is represented as a subnetwork of the overall causal Bayesian network. This paper will describe the benefits of the population-wide approach used by PANDA, which include a coherent way to incorporate background knowledge as well as different types of evidence, the ability to combine multiple data sources indicative of an outbreak, and the capability to identify the evidence that contributes the most to the belief that an anthrax outbreak is occurring.

Bayesian network anomaly pattern detection for disease outbreaks

MACHINE LEARNING-INTERNATIONAL WORKSHOP THEN CONFERENCE-

Early disease outbreak detection systems typically monitor health care data for irregularities by comparing the distribution of recent data against a baseline distribution. Determining the baseline is difficult due to the ...

IRJET-Big Data-Driven Abnormal Behavior Detection in Healthcare based on Association Rules

IRJET, 2021

Healthcare insurance fraud causes millions of dollars of losses to public healthcare funds around the world in various ways, which makes it important to strengthen the management of medical insurance in order to guarantee the steady operation of medical insurance resources. Healthcare fraud detection methods can reduce the losses of healthcare insurance resources and improve medical quality. Existing fraud detection studies generally focus on finding normal behavior patterns and treat those who violate them as fraudsters. However, fraudsters can often disguise themselves with some normal behavior, such as consistent behavior when they seek medical treatment. To address these issues, we combine a MapReduce distributed computing model with association rule mining to propose a medical group behavior detection algorithm based on frequent pattern mining. It can mine certain consistent behaviors of patients in medical treatment activities. By analyzing 1.5 million medical claim records, we validated the effectiveness of the method. Experiments show that this method performs better than several benchmark methods.
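Association rule mining rests on two standard quantities, support and confidence. The sketch below computes both on a handful of hypothetical claim records; it is single-machine and brute-force, so it shows the measures only and not the distributed algorithm the abstract proposes. The item names and the helper names `support` and `confidence` are illustrative assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    sup_a = support(transactions, antecedent)
    # A rule whose antecedent never occurs has no meaningful confidence.
    return support(transactions, both) / sup_a if sup_a else 0.0

# Hypothetical claims, each reduced to the set of items billed together.
claims = [{"visit", "drugX"}, {"visit", "drugX"}, {"visit"}, {"drugY"}]

print(support(claims, {"visit", "drugX"}))       # 2 of 4 claims
print(confidence(claims, {"visit"}, {"drugX"}))  # 2 of the 3 claims with a visit
```

A frequent-pattern miner would keep only itemsets whose support clears a chosen minimum before forming rules, which is what makes the search tractable on millions of records.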

Conditional Outlier Detection for Clinical Alerting

We develop and evaluate a data-driven approach for detecting unusual (anomalous) patient-management actions using past patient cases stored in an electronic health record (EHR) system. Our hypothesis is that patient-management actions that are unusual with respect to past patients may be due to a potential error and that it is worthwhile to raise an alert if such a condition is encountered. We evaluate this hypothesis using data obtained from the electronic health records of 4,486 post-cardiac surgical patients. We base the evaluation on the opinions of a panel of experts. The results support that anomaly-based alerting can have reasonably low false alert rates and that stronger anomalies are correlated with higher alert rates.

A wavelet-based anomaly detector for early detection of disease outbreaks

Workshop on Machine …, 2006

We describe a wavelet-based automated algorithm for detecting disease outbreaks in temporal syndromic data. We describe the method, which improves upon the algorithm and its implementation on a diverse set of real syndromic data from multiple data sources and multiple geographical locations. Our results show a robust performance which is comparable to a few recently suggested methods.
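As a rough illustration of how a wavelet transform can expose abrupt changes in a count series, the sketch below applies one level of the Haar transform and flags pairs of adjacent days whose detail coefficient is large. This is not the detector described in the abstract; the choice of the Haar wavelet, the fixed threshold, and the names `haar_step` and `flag_jumps` are assumptions made for the example.

```python
import math

def haar_step(series):
    """One level of the Haar wavelet transform on an even-length series.

    Returns (approx, detail): scaled pairwise sums and pairwise differences.
    """
    if len(series) % 2:
        raise ValueError("series length must be even")
    s = math.sqrt(2.0)
    approx = [(series[i] + series[i + 1]) / s for i in range(0, len(series), 2)]
    detail = [(series[i] - series[i + 1]) / s for i in range(0, len(series), 2)]
    return approx, detail

def flag_jumps(series, threshold):
    """Start indices of adjacent pairs whose |detail coefficient| exceeds threshold."""
    _, detail = haar_step(series)
    return [2 * i for i, d in enumerate(detail) if abs(d) > threshold]

# Hypothetical daily syndromic counts with a spike on day index 5.
counts = [4, 4, 5, 4, 4, 12, 5, 4]
print(flag_jumps(counts, threshold=3.0))
```

A single Haar level only compares days within aligned pairs, so a jump between index 1 and index 2 would be missed; multi-level or shift-invariant transforms address that limitation.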

Algorithms for rapid outbreak detection: a research synthesis

Journal of Biomedical Informatics, 2005

The threat of bioterrorism has stimulated interest in enhancing public health surveillance to detect disease outbreaks more rapidly than is currently possible. To advance research on improving the timeliness of outbreak detection, the Defense Advanced Research Projects Agency sponsored the Bio-event Advanced Leading Indicator Recognition Technology (BioALIRT) project beginning in 2001. The purpose of this paper is to provide a synthesis of research on outbreak detection algorithms conducted by academic and industrial partners in the BioALIRT project. We first suggest a practical classification for outbreak detection algorithms that considers the types of information encountered in surveillance analysis. We then present a synthesis of our research according to this classification. The research conducted for this project has examined how to use spatial and other covariate information from disparate sources to improve the timeliness of outbreak detection. Our results suggest that use of spatial and other covariate information can improve outbreak detection performance. We also identified, however, methodological challenges that limited our ability to determine the benefit of using outbreak detection algorithms that operate on large volumes of data. Future research must address challenges such as forecasting expected values in high-dimensional data and generating spatial and multivariate test data sets.