A Probabilistic Optimum-Path Forest Classifier for Non-Technical Losses Detection
Related papers
Non-technical losses identification using Optimum-Path Forest and state estimation
2015 IEEE Eindhoven PowerTech, 2015
The consumers are based on typical Brazilian residential consumers. The results indicate that the proposed method improves the performance of previously developed techniques and is suitable for both typical and smart EPDS. The best results are obtained when the DSE and OPF are combined using normalized data as the input for this hybrid approach, with an expected success rate of up to 72.43% for on-site inspections in the EPDS test system.
Fast Non-Technical Losses Identification Through Optimum-Path Forest
2009
Detecting fraud committed by illegal consumers is the most actively pursued topic in the study of non-technical losses by electric power companies. Commonly used supervised pattern recognition techniques, such as artificial neural networks and support vector machines, have been applied to automatic commercial fraud identification; however, they suffer from slow convergence and a high computational burden.
Electricity consumer fraud is a problem faced by all power utilities. Finding efficient measures for detecting fraudulent electricity consumption has been an active research area in recent years. This paper presents an approach to non-technical loss (NTL) detection in power utilities using an artificial-intelligence-based technique, the Support Vector Machine (SVM). The approach provides a data mining method that involves feature extraction from past consumption data. It uses customer load profile information and additional attributes to expose abnormal behavior that is known to be highly correlated with NTL activities. Some key advantages of SVM in data clustering are discussed, among them the ease with which it fits data with a wide range of features. Finally, some major weaknesses of using SVM in clustering for NTL identification are identified, which motivates the Optimum-Path Forest as a new model for NTL identification.
IEEE Access, 2019
With the ever-growing demand for electric power, it is quite challenging to detect and prevent Non-Technical Loss (NTL) in power industries. NTL is committed by meter bypassing, hooking from the main lines, and reversing and tampering with the meters. Manual on-site checking and reporting of NTL remains an unattractive strategy due to the required manpower and associated cost. The use of machine learning classifiers has been an attractive option for NTL detection. It enables data-oriented analysis and a high hit ratio at lower cost and with less manpower. However, there is still a need to explore the results across multiple types of classifiers on a real-world dataset. This paper considers a real dataset from a power supply company in Pakistan to identify NTL. We have evaluated 15 existing machine learning classifiers across 9 types, which also include the recently developed CatBoost, LGBoost and XGBoost classifiers. Our work is validated using extensive simulations. Results show that ensemble methods and Artificial Neural Network (ANN) outperform the other types of classifiers for NTL detection in our real dataset. Moreover, we have also derived a procedure to identify the top-14 features out of a total of 71 features, which together contribute 77% to NTL prediction. We conclude that including more features beyond this threshold does not improve performance, and thus limiting the model to the selected feature set reduces the computation time required by the classifiers. Last but not least, the paper also analyzes the results of the classifiers with respect to their types, which has opened a new area of research in NTL detection.
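The abstract does not say how the top-14 features were chosen. One common way to arrive at such a cutoff, shown here only as an illustrative sketch, is to rank features by an importance score and keep the smallest set whose cumulative share reaches a target. The function name `smallest_top_k`, the toy scores, and the 0.77 target are assumptions for illustration, not the paper's procedure.

```python
from typing import Dict, List, Tuple


def smallest_top_k(importances: Dict[str, float],
                   target_share: float) -> Tuple[List[str], float]:
    """Return the fewest top-ranked features whose summed importance
    reaches `target_share` of the total, plus the share actually reached."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")

    # Rank features from most to least important.
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)

    selected: List[str] = []
    cumulative = 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score
        if cumulative / total >= target_share:
            break
    return selected, cumulative / total


# Hypothetical importance scores for four made-up features.
toy = {"avg_consumption": 5.0, "tariff_class": 3.0,
       "meter_age": 1.0, "region": 1.0}
features, share = smallest_top_k(toy, target_share=0.77)
print(features, round(share, 2))
```

With these toy scores the first two features already cover 80% of the total importance, so the remaining two are dropped.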
Review of Non-Technical Losses Identification Techniques
International Journal on Recent and Innovation Trends in Computing and Communication, 2021
Illegal consumption of electric power, termed non-technical losses by distribution companies, has been a dominant problem all over the world for many years. There are conventional methods to identify these irregularities, such as physical inspection of meters at consumer premises, but they require a large amount of manpower and time and still do not seem to be adequate. Nowadays, various methods and algorithms to detect non-technical losses have been developed and proposed in different research papers. In this paper these methods are reviewed, their important features are highlighted, and their limitations are identified. Finally, a qualitative comparison of various non-technical loss identification algorithms is presented based on their performance, costs, data handling, quality control and execution times. It can be concluded that the graph-based classifier, Optimum-Path Forest algorithm that have both supervised and unsu...
Improving electricity non technical losses detection including neighborhood information
2018 IEEE Power & Energy Society General Meeting (PESGM), 2018
Non-technical losses (NTL) cause significant damage to power supply companies' economies. Detecting abnormal client behavior is an important and difficult task. In this paper we analyze the impact of considering customers' geo-localization information in automatic NTL detection. A methodology is proposed to find optimal grid sizes for computing a set of local features with a random search procedure. The number and size of the grids, and other classification algorithm parameters, are adjusted to maximize the area under the receiver operating characteristic curve (AUC), showing performance improvements on a data set of 6 thousand Uruguayan residential customers. A comparative analysis with different subsets of characteristics, which include the monthly consumption, contractual information and the new local features, is presented. In addition, we show that the raw customers' geographical location used as an input feature gives competitive results as well. We also evaluate an entirely new database of 6 thousand Uruguayan customers, who were inspected on-site by UTE experts between 2015 and 2017.
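For readers unfamiliar with the metric being maximized, the AUC can be read as the probability that a randomly chosen fraudulent customer receives a higher score than a randomly chosen regular one, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the sample labels and scores are assumptions for illustration and are not taken from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. suspected NTL), 0 otherwise.
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for two regular (0) and two fraudulent (1) customers.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of samples, which is fine for a few thousand customers; a rank-based formulation would be the usual choice for larger data sets.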
Comparison of Supervised Learning Techniques for Non-Technical Loss Detection in Power Utility
The study attempts to identify a potentially reliable supervised learning technique for predicting mortality outcomes in altered state of consciousness (ASC) patients. ASC is a state distinguished from ordinary waking consciousness, which is a common phenomenon in the Emergency Department (ED). Thirty (30) distinctive attributes or features are commonly used to recognize ASC. The study accordingly applied these features to model the prediction of mortality in ASC patients. Supervised learning techniques are found to be suitable for such classification problems. Consequently, the study compared five supervised learning techniques that are commonly applied to evaluate the risk of mortality using health-related datasets, namely Decision Tree, Neural Network, Random Forest, Naïve Bayes, and Logistic Regression. The labeled dataset comprised patient records captured by the Universiti Sains Malaysia hospital's Emergency Medicine department from June to November 2008. The cleaned dataset was divided into two parts. The larger part was used for training and the smaller part for evaluation. Since the ratio between training and testing samples varies between individual supervised learning techniques, we studied the performance of the modeled techniques by also varying the proportion of the training data to the dataset. We applied four percentage splits (66%, 75%, 80%, and 90%) to allow for 3-, 4-, 5- and 10-fold cross-validation experiments to evaluate the accuracy of the analyzed techniques. The variation helped to lessen the chance of overfitting, and averaged the effects of various conditions on accuracy. The experiments were conducted in the WEKA environment. The results indicated that Random Forest is the most reliable technique for predicting mortality in ASC patients, with acceptable accuracy, sensitivity, and specificity of 70.9%, 76.3%, and 65.5%, respectively. The results are further confirmed by SROC analysis.
The findings of the study serve as a fundamental step towards a comprehensive study in the future.
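The pairing of percentage splits with fold counts in this abstract follows from the fact that k-fold cross-validation trains on k-1 of the k folds, so the training share is (k-1)/k. The short sketch below checks that arithmetic for the four fold counts mentioned; the function name `training_fraction` is an assumption for illustration.

```python
def training_fraction(k: int) -> float:
    """Share of the data used for training in each round of k-fold CV."""
    if k < 2:
        raise ValueError("k-fold cross-validation needs k >= 2")
    return (k - 1) / k


# Fold counts named in the abstract.
for k in (3, 4, 5, 10):
    print(f"{k}-fold: {training_fraction(k):.1%} training")
```

The 3-fold case gives two thirds, which the abstract reports as 66%; the other three match the stated 75%, 80%, and 90% exactly.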
Learning to Identify Non-Technical Losses with Optimum-Path Forest
In this work we propose an innovative and accurate solution for non-technical losses identification using the Optimum-Path Forest (OPF) classifier and its learning algorithm. Results on two datasets demonstrated that OPF outperformed state-of-the-art pattern recognition techniques, and that OPF with learning achieved better results for automatic non-technical losses identification than those recently reported in the literature. Keywords: Non-Technical Losses, Optimum-Path Forest.
Review of non-technical loss detection methods
Electricity theft has been a major issue for many years. Distribution System Operators (DSOs) have been trying to detect electricity theft; however, the phenomenon persists, and simple meter inspection methods cannot adequately identify most cases of fraud. In this paper the most recent and characteristic research papers on Non-Technical Loss (NTL) detection are reviewed and their key features are summarized. NTL detection schemes are organized in three large categories: data oriented, network oriented and hybrids. Data oriented and network oriented methods are further divided into subcategories, according to the main concept behind NTL detection. Apart from categorizing the various methods, the authors focus on algorithms, data types and size, features, evaluation metrics and NTL detection system response times. An overview of the algorithms used by NTL detection systems is presented, focusing on why they are suitable for the specific application. The data types consumed by each NTL detection system are defined and features typically extracted from these data types are presented. In addition, the authors provide a comprehensive list of performance metrics used and comment on their importance. Finally, a qualitative comparison of NTL detectors is provided, focusing on performance issues, costs, data variety/quality issues and system response times.