A novel algorithm for feature selection using Harmony Search and its application for non-technical losses detection
Related papers
A Probabilistic Optimum-Path Forest Classifier for Non-Technical Losses Detection
IEEE Transactions on Smart Grid, 2018
Probabilistic-driven classification techniques extend the role of traditional approaches that output labels (usually integer numbers) only. Such techniques are more fruitful when dealing with problems where one is interested not only in recognition/identification, but also in monitoring the behavior of consumers and/or machines, for instance. Therefore, by means of probability estimates, one can make better decisions in a number of scenarios. In this paper, we propose a probabilistic-based Optimum-Path Forest (OPF) classifier to handle the problem of non-technical losses (NTL) detection in power distribution systems. The proposed approach is compared against naïve OPF, probabilistic Support Vector Machines, and Logistic Regression, showing promising results both for NTL identification and in the context of general-purpose applications.
Binary Harmony Search Based Feature Selection and Data Classification Technique
Data is now growing at an exponential pace. To deal with this data explosion, we need effective data processing and analysis techniques. Feature selection, a popular machine learning technique, plays a vital role by selecting a subset of features from a dataset that still provides most of the useful information. In real-world applications, the misclassification cost of the minority class can be extremely high. This is especially demanding when the data are high-dimensional, because of increased over-fitting and reduced interpretability of the representation. Feature selection has recently become one of the most popular ways to deal with this problem, by identifying the features that best predict the minority class.
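The abstract above does not give the algorithm's details, so the following is only a minimal sketch of how a binary Harmony Search over feature masks is commonly set up. The parameter names (`hms`, `hmcr`, `par`), the bit-flip pitch adjustment, and the toy fitness function are assumptions made for illustration; in practice the fitness would be a classifier's validation score on the selected features.

```python
import random


def binary_harmony_search(fitness, n_features, hms=10, hmcr=0.9, par=0.3,
                          iters=200, seed=0):
    """Sketch of binary Harmony Search; returns (best_mask, best_score).

    hms  = harmony memory size, hmcr = memory considering rate,
    par  = pitch adjusting rate (all values here are illustrative).
    """
    rng = random.Random(seed)
    # Harmony memory: random 0/1 masks, one bit per feature.
    memory = [[rng.randint(0, 1) for _ in range(n_features)]
              for _ in range(hms)]
    scores = [fitness(h) for h in memory]

    for _ in range(iters):
        new = []
        for j in range(n_features):
            if rng.random() < hmcr:
                # Memory consideration: copy bit j from a stored harmony.
                bit = memory[rng.randrange(hms)][j]
                # Pitch adjustment, taken here to be a bit flip (assumption).
                if rng.random() < par:
                    bit = 1 - bit
            else:
                # Random consideration: draw the bit uniformly.
                bit = rng.randint(0, 1)
            new.append(bit)

        new_score = fitness(new)
        worst = min(range(hms), key=scores.__getitem__)
        # Replace the worst stored harmony only if the new one is better.
        if new_score > scores[worst]:
            memory[worst], scores[worst] = new, new_score

    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


# Hypothetical toy fitness: reward three "relevant" features and
# penalize every selected feature slightly to favor small subsets.
RELEVANT = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    return hits - 0.1 * sum(mask)


best_mask, best_score = binary_harmony_search(toy_fitness, n_features=8)
print(best_mask, best_score)
```

Because a new harmony only ever replaces the worst stored one when it scores higher, the best score in memory never decreases across iterations.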
Non-technical losses identification using Optimum-Path Forest and state estimation
2015 IEEE Eindhoven PowerTech, 2015
The consumers are based on typical Brazilian residential consumers. The results indicate that the proposed method improves the performance of previously developed techniques and is suitable for both typical and smart EPDS. The best results are obtained when the DSE and OPF are combined, using normalized data as the input for this hybrid approach, with an expected success rate of up to 72.43% for on-site inspections in the EPDS test system.
Fast Non-Technical Losses Identification Through Optimum-Path Forest
2009
Detection of fraud by illegal consumers in energy systems is the most actively pursued topic in non-technical losses research by electric power companies. Commonly used supervised pattern recognition techniques, such as artificial neural networks and support vector machines, have been applied to the automatic identification of commercial fraud; however, they suffer from slow convergence and a high computational burden.
Optimum-Path Forest Pruning Parameter Estimation Through Harmony Search
sibgrapi.sid.inpe.br
Pattern recognition in large amounts of data has been paramount in the last decade, since it is not straightforward to design interactive and real-time classification systems. Very recently, the Optimum-Path Forest classifier was proposed to overcome such limitations, together with its training set pruning algorithm, which requires a parameter that has so far been set empirically. In this paper, we propose a Harmony Search-based algorithm that can find near-optimal values for that parameter. The experimental results have shown that our algorithm is able to find proper values for the OPF pruning algorithm parameter.
The search for optimal feature set in power quality event classification
Expert Systems with Applications, 2009
The significance of detection and classification of power quality (PQ) events that disturb the voltage and/or current waveforms in electrical power distribution networks is well known. Consequently, in spite of a large number of research reports in this area, PQ event classification remains an important engineering problem. Several feature construction, pattern recognition, analysis, and classification methods have been proposed for this purpose. In spite of the extensive number of such alternatives, research comparing "how useful these features are with respect to each other using specific classifiers" has been omitted. In this work, a thorough analysis is carried out regarding the classification strengths of an ensemble of celebrated features. The feature items were selected from well-known tools such as spectral information, wavelet extrema across several decomposition levels, and local statistical variations of the waveform. The tests are repeated for classification of several types of real-life data acquired during line-to-ground arcing faults and voltage sags due to induction motor starting under different load conditions. In order to avoid specificity in classifier strength determination, eight different approaches are applied, including the computationally costly "exhaustive search" together with the leave-one-out technique. To further avoid specificity of the feature for a given classifier, two classifiers (Bayes and SVM) are tested. As a result of these analyses, the more useful set among a wider set of features is obtained for each classifier. It is observed that classification accuracy improves for both classifiers when relatively useless feature items are eliminated. Furthermore, the feature selection results change somewhat according to the classifier used. This observation shows that when a new analysis tool or feature is developed and claimed to perform "better" than another, one should always indicate the matching classifier for the feature, because that feature may prove comparably inefficient with other classifiers.
Review of Non-Technical Losses Identification Techniques
International Journal on Recent and Innovation Trends in Computing and Communication, 2021
Illegal consumption of electric power, termed non-technical losses by distribution companies, has been a dominant problem all over the world for many years. Although there are some conventional methods to identify these irregularities, such as physical inspection of meters at consumer premises, they require considerable manpower and time, and even then do not seem to be adequate. Nowadays, various methods and algorithms to detect non-technical losses have been developed and proposed in different research papers. In this paper these methods are reviewed, their important features are highlighted, and their limitations are identified. Finally, a qualitative comparison of various non-technical losses identification algorithms is presented based on their performance, costs, data handling, quality control and execution times. It can be concluded that the graph-based classifier, Optimum-Path Forest algorithm that have both supervised and unsu...
Energies
Electricity fraud in billing is a primary concern for Distribution System Operators (DSOs). It is estimated that billions of dollars are wasted annually due to these illegal activities. DSOs around the world, especially in underdeveloped countries, still use conventional, time-consuming and inefficient methods for Non-Technical Loss (NTL) detection. This research work attempts to solve this problem by developing an efficient energy theft detection model to identify fraudulent customers in a power distribution system. The key motivation for the present study is to assist DSOs in their fight against energy theft. The proposed computational model initially uses a set of distinct features extracted from consumers’ monthly consumption data, obtained from Multan Electric Power Company (MEPCO), Pakistan, to separate honest from fraudulent customers. The Pearson’s chi-square feature selection algorithm is adopted to select the most relevant feat...
Detection of Non-Technical Loss in Power Utilities Using Data Mining Techniques
This paper presents Non-Technical Loss (NTL) in power utilities and describes how to handle it. Non-technical loss has been an influential factor on the benefits of electric power utilities. At the same time, with distributed generation extensively installed, the consumption patterns of dishonest users and normal customers have many similarities. Non-technical loss may be theft of electricity, illegal connection, faulty metering, or billing error. Improving the reliability of NTL detection algorithms therefore becomes particularly important. Data mining techniques are used to detect non-technical loss using classification algorithms. An intelligent computational tool is implemented to identify non-technical losses and to select their most relevant features, considering information from a database of consumer profiles. This work uses the Weka software for the proposed objective, comparing various classification techniques and optimization through an intelligent algorithm.
IEEE Access, 2019
With the ever-growing demand for electric power, it is quite challenging to detect and prevent Non-Technical Loss (NTL) in power industries. NTL is committed by bypassing meters, hooking onto the main lines, and reversing or tampering with meters. Manual on-site checking and reporting of NTL remains an unattractive strategy due to the required manpower and associated cost. The use of machine learning classifiers has been an attractive option for NTL detection. It enables data-oriented analysis and a high hit ratio, along with lower cost and manpower requirements. However, there is still a need to explore the results across multiple types of classifiers on a real-world dataset. This paper considers a real dataset from a power supply company in Pakistan to identify NTL. We have evaluated 15 existing machine learning classifiers across 9 types, which also include the recently developed CatBoost, LGBoost and XGBoost classifiers. Our work is validated using extensive simulations. Results show that ensemble methods and Artificial Neural Networks (ANN) outperform the other types of classifiers for NTL detection in our real dataset. Moreover, we have also derived a procedure to identify the top 14 features out of a total of 71 features, which contribute 77% to predicting NTL. We conclude that including more features beyond this threshold does not improve performance, and thus limiting the classifiers to the selected feature set reduces the computation time they require. Last but not least, the paper also analyzes the results of the classifiers with respect to their types, which has opened a new area of research in NTL detection.
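The abstract above does not describe the procedure used to pick the top features, so the following is only a hedged sketch of one common way to do it: rank features by an importance score and keep the smallest prefix whose cumulative share reaches a target. The function name, the feature names, and the importance values are all hypothetical.

```python
def top_features_by_cumulative_importance(importances, threshold):
    """Return (selected_names, share) for the smallest top-ranked set whose
    cumulative importance reaches `threshold` (a fraction of the total)."""
    total = sum(importances.values())
    # Rank features from most to least important.
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)

    selected, cumulative = [], 0.0
    for name, score in ranked:
        if cumulative >= threshold * total:
            break  # target share reached; stop adding features
        selected.append(name)
        cumulative += score
    return selected, cumulative / total


# Hypothetical importance scores, e.g. taken from a tree ensemble.
scores = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
selected, share = top_features_by_cumulative_importance(scores, threshold=0.77)
print(selected, round(share, 2))
```

With these made-up scores, the two highest-ranked features already cover the 77% target, so the remaining ones are dropped.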