Günther Eibl | Salzburg University of Applied Sciences
Talks by Günther Eibl
Papers by Günther Eibl
e & i Elektrotechnik und Informationstechnik
To ensure continued reliable operation of the energy grid in the face of a rising number of local energy communities (LECs), they need to be integrated in a way that ensures that they do not negatively impact but rather support the overall system, e.g., by providing flexibility. Project ECOSINT aims at intelligent, digital integration of LECs to achieve this goal. Along with several other tasks, this includes the development of a suitable software architecture, which is currently in progress and has yielded a conceptual model as a first result. Among the numerous requirements regarding LECs and their software architecture which have been collected in the initial phase of the project, interoperability has been identified as a crucial factor for success. This is addressed by incorporating AIT’s virtual lab (VLab) framework, which is presented and demonstrated using the example of an electric vehicle (EV) charging scenario as part of the ECOSINT solution.
Neuroepidemiology, 2002
The epidemiology of multiple sclerosis (MS) in Austria is almost unknown. We evaluated the prevalence of MS in Austria using data from questionnaires completed by neurologists, comprising information on a total of 1,006 MS patients who attended 30 specialized outpatient clinics nationwide. Additional data were collected from 2,414 MS patients, who received questionnaires from the Austrian MS Society or their doctor. A novel extrapolation model, based on the frequency of patient visits to MS clinics, was used to estimate the overall prevalence of MS. Considering either disability or the course of the disease, the prevalence of MS in Austria was estimated to be 98.5 per 100,000 people. The prevalence of MS in Austria was found to be similar to that of other countries in Central Europe. Epidemiological studies such as this provide a unique source of data from which key features of a disease and its impact on patients may be examined.
2008 19th International Conference on Pattern Recognition, 2008
Video footage of real crowded scenes still poses severe challenges for automated surveillance. This paper evaluates clustering methods for finding independent dominant motion fields for an observation period based on a recently published real-time optical flow algorithm. We focus on self-tuning spectral clustering and Isomap combined with k-means. Several combinations of feature vector normalizations and distance measures (Euclidean, Mahalanobis and a general additive distance) are evaluated for four image sequences including three publicly available crowd datasets. Evaluation is based on mean accuracy obtained by comparison with a manually defined ground truth clustering. For every dataset at least one approach correctly classified more than 95% of the flow vectors without extra tuning of parameters, providing a basis for an automatic analysis after a view-dependent setup.
Journal of Machine Learning Research, Dec 1, 2005
AdaBoost.M2 is a boosting algorithm designed for multiclass problems with weak base classifiers. The algorithm is designed to minimize a very loose bound on the training error. We propose two alternative boosting algorithms which also minimize bounds on performance measures. These performance measures are not as strongly connected to the expected error as the training error, but the derived bounds are tighter than the bound on the training error of AdaBoost.M2. In experiments the methods have roughly the same performance in minimizing the training and test error rates. The new algorithms have the advantage that the base classifier should minimize the confidence-rated error, whereas for AdaBoost.M2 the base classifier should minimize the pseudo-loss. This makes them more easily applicable to already existing base classifiers. The new algorithms also tend to converge faster than AdaBoost.M2.
Sustainable Cloud and Energy Services, 2017
The smart grid changes the way energy and information are exchanged and offers opportunities for incentive-based load balancing. For instance, customers may shift the time of energy consumption of household appliances in exchange for a cheaper energy tariff. This paves the path towards a full range of modular tariffs and dynamic pricing that incorporate the overall grid capacity as well as individual customer demands. This also allows customers to frequently switch within a variety of tariffs from different utility providers based on individual energy consumption and provision forecasts. For automated tariff decisions it is desirable to have a tool that assists in choosing the optimum tariff based on a prediction of individual energy need and production. However, the revelation of individual load patterns for smart grid applications poses severe privacy threats for customers, as analyzed in depth in the literature. Similarly, accurate and fine-grained regional load forecasts are sensitive business information of utility providers that are not supposed to be released publicly. This paper extends previous work in the domain of privacy-preserving load profile matching, where load profiles from utility providers and load profile forecasts from customers are transformed into a distance-preserving embedding in order to find a matching tariff. The embeddings reveal neither the individual contributions of customers nor those of utility providers. Prior work requires a dedicated entity that needs to be trustworthy at least to some extent for determining the matches. In this paper we propose an adaptation of this protocol in which blockchains and smart contracts are used for the matching process instead. Blockchains are gaining widespread adoption in the smart grid domain as a powerful tool for public commitments and accountable calculations. While the use of a decentralized and trust-free blockchain for this protocol comes at the price of some privacy degradation (for which a mitigation is outlined), this drawback is outweighed by the verifiability, reliability and transparency it enables.
In this thesis the influence of secondary electrons on the stability of the presheath is investigated. Secondary electrons are emitted from the wall due to the impact of primary particles. They get accelerated by the sheath potential and enter the presheath essentially as a cold beam. If the corresponding electron-beam instability is stronger than the damping processes in the presheath, the presheath becomes unstable. The approximate theoretical treatment of this problem mainly includes the derivation of the dispersion relation. The numerical solution of the dispersion relation shows that we can expect an instability induced by the secondary electrons. The problem is also treated by means of particle-in-cell (PIC) simulations, where the instability induced by the secondary electrons is observed as well. In addition to these already known results we propose a mechanism which saturates the instability. As the (initially cold) secondary electron beam enters the presheath, its temperatu...
The energy domain currently struggles with radical legal and technological changes, such as smart meters. This results in new use cases which can be implemented based on business process technology. Understanding and automating business processes requires modeling and testing them. However, existing process testing approaches frequently struggle with the testing of process resources, such as ERP systems, and with negative testing. Hence, this work presents a toolchain which tackles these limitations. The approach uses an open source process engine to generate event logs and applies process mining techniques in a novel way.
Applications based on blockchain technology have become popular. While these applications have clear benefits, users are not yet familiar with their usage, which could hinder further applications of this technology. In this paper, an online survey was conducted with 110 potential users, chosen to represent average citizens. The focus of this survey is to explore their preferences concerning the interaction with blockchain-based applications, mainly with regard to how private keys are handled. To the best of our knowledge this is the first study where average citizens are asked about the preferred management of a private key, which is necessary when interacting with blockchain-based applications. One of the main results was that about 80% of the participants would like to have the benefit of data sovereignty despite the cost of being fully responsible for backing up their
EURASIP Journal on Information Security, 2017
The availability of individual load profiles per household in the smart grid end-user domain combined with non-intrusive load monitoring to infer personal data from these load curves has led to privacy concerns. Privacy-enhancing technologies (PETs) have been proposed to address these concerns. In this paper, the extension of privacy-enhancing technologies by wavelet-based multi-resolution analysis (MRA) is proposed to enhance the options available on the user side. For three types of privacy methods (secure aggregation, masking and differential privacy), we show that MRA not only enhances privacy, but also adds flexibility and control for the end-user. The combination of MRA and PETs is evaluated in terms of privacy, computational demands, and real-world feasibility for each of the three method types.
ArXiv, 2021
Smart meter data aggregation protocols have been developed to address rising privacy threats against customers’ consumption data. However, these protocols do not work satisfactorily in the presence of failures of smart meters or network communication links. In this paper, we propose a lightweight and fault-tolerant aggregation algorithm that can serve as a solid foundation for further research. We revisit an existing error-resilient privacy-preserving aggregation protocol based on masking and improve it by: (i) performing changes in the cryptographic parts that lead to a reduction of computational costs, (ii) simplifying the behaviour of the protocol in the presence of faults, and showing a proof of proper termination under a well-defined failure model, (iii) decoupling the computation part from the data flow so that the algorithm can also be used with homomorphic encryption as a basis for privacy-preservation. To the best of our knowledge, this is the first algorithm that is formulated...
IEEE Transactions on Smart Grid, 2017
There have been a large number of contributions on privacy-preserving smart metering with Differential Privacy, addressing questions from actual enforcement at the smart meter to billing at the energy provider. However, exploitation is mostly limited to application of cryptographic security means between smart meters and energy providers. We illustrate, using the use case of privacy-preserving load forecasting, that Differential Privacy is indeed a valuable addition that unlocks novel information flows for optimization. We show that (i) there are large differences in utility along three selected forecasting methods, (ii) energy providers can enjoy good utility especially under the linear regression benchmark model, and (iii) households can participate in privacy-preserving load forecasting with an individual re-identification risk < 60%, only 10% over random guessing.
IEEE Transactions on Smart Grid, 2016
The deployment of future energy systems promises a number of advantages for a more stable and reliable grid as well as for a sustainable usage of energy resources. The efficiency and effectiveness of such smart grids rely on customer consumption data that is collected, processed and analyzed. This data is used for billing, monitoring and prediction. However, this implies privacy threats. Approaches exist that aim to either encrypt data in certain ways, to reduce the resolution of data or to mask data in a way so that an individual's contribution is untraceable. While the latter is an effective way of protecting customer privacy when aggregating over space or time, one of the drawbacks of these approaches is their limited handling or complete neglect of device failures. In this paper we therefore propose a masking approach for spatio-temporal aggregation of time series for protecting individual privacy while still providing sufficient error-resilience and reliability.
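As a rough illustration of the masking idea mentioned in this abstract, the following Python sketch shows generic additive masking with zero-sum masks. It is not the protocol proposed in the paper: the modulus, the function names and the sample readings are all assumptions made for illustration, and the sketch omits the error-resilience that is the paper's actual contribution.

```python
import secrets

MODULUS = 2**32  # hypothetical ring size, chosen only for this sketch


def make_zero_sum_masks(n, modulus=MODULUS):
    """Return n random masks whose sum is 0 modulo `modulus`."""
    masks = [secrets.randbelow(modulus) for _ in range(n - 1)]
    masks.append((-sum(masks)) % modulus)  # last mask cancels the others
    return masks


def mask_readings(readings, masks, modulus=MODULUS):
    """Each meter adds its own mask before reporting its reading."""
    return [(r + m) % modulus for r, m in zip(readings, masks)]


def aggregate(masked, modulus=MODULUS):
    """The aggregator sums masked values; the masks cancel in the total."""
    return sum(masked) % modulus


readings = [312, 540, 127, 890]  # made-up readings in Wh
masks = make_zero_sum_masks(len(readings))
masked = mask_readings(readings, masks)
total = aggregate(masked)
print(total == sum(readings))
```

In this naive form, a single missing masked value leaves the remaining masks uncancelled and the sum unusable, which is the kind of device-failure problem the abstract points to.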
Through smart metering in the smart grid end-user domain, load profiles are measured per household. Personal data can be inferred from these load profiles by using nonintrusive appliance load monitoring methods, which has led to privacy concerns. Privacy is expected to increase with longer intervals between measurements of load curves. This paper studies the impact of data granularity on edge detection methods, which are the common first step in nonintrusive load monitoring algorithms. It is shown that when the time interval exceeds half the on-time of an appliance, the appliance use detection rate declines. Using one-versus-rest classification modeling, the ability to detect an appliance’s use is evaluated through F-scores. Representing these F-scores visually through a heatmap yields an easily understandable way of presenting potential privacy implications in smart metering to the end-user or other decision makers.
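The effect of measurement interval on edge detection can be sketched with a toy example. The Python snippet below is an assumption-laden illustration, not the method or data of the paper: it builds a synthetic one-hour load at one-minute resolution with a single hypothetical 2000 W appliance running for ten minutes, averages it over longer intervals, and reports the largest step between consecutive samples, which is the quantity a simple threshold-based edge detector would react to.

```python
def downsample(load, interval):
    """Average consecutive samples in blocks of `interval` (drops a partial tail)."""
    n = len(load) // interval
    return [sum(load[i * interval:(i + 1) * interval]) / interval for i in range(n)]


def largest_step(load):
    """Largest absolute change between consecutive samples."""
    return max(abs(b - a) for a, b in zip(load, load[1:]))


BASE_W = 100.0        # hypothetical base load
APPLIANCE_W = 2000.0  # hypothetical appliance power

# One hour at 1-minute resolution; the appliance runs from minute 7 to minute 16.
load = [BASE_W + (APPLIANCE_W if 7 <= t < 17 else 0.0) for t in range(60)]

# Largest step seen at 1-, 5- and 15-minute averaging intervals.
steps = {k: largest_step(downsample(load, k)) for k in (1, 5, 15)}
for interval, step in steps.items():
    print(f"{interval:>2} min interval: largest step {step:.0f} W")
```

In this toy setup the largest step shrinks as the interval grows, so a fixed detection threshold is crossed less readily at coarse resolution. The exact numbers depend on how the appliance run aligns with the interval boundaries.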
Recently, first methods for holiday detection from unsupervised low-resolution smart metering data have been presented. However, due to the unsupervised nature of the problem, previous work only applied the algorithms on a few typical cases and lacks a systematic validation. This paper systematically validates the existing algorithm by visual inspection and shows that numerous cases exist where implicit assumptions are not met and the methods fail. Moreover, it proposes a new, very simple rule-based method which is in principle able to overcome these problems. This method should be seen as a first step towards improvement, since it is not automated and needs a moderate amount of human intervention for each household.
Energy Informatics
Electrical networks of transmission system operators are mostly built up as isolated networks without access to the Internet. With the increasing popularity of smart grids, securing the communication network has become more important to avoid cyber-attacks that could result in possible power outages. For misuse detection, signature-based approaches are already in use and special rules for a wide range of protocols have been developed. However, one big disadvantage of signature-based intrusion detection is that zero-day exploits cannot be detected. Machine-learning-based anomaly detection methods have the potential to achieve that. In this paper, various such methods for intrusion detection in substations, which use the asynchronous communication protocol International Electrotechnical Commission (IEC) 60870-5-104, are tested and compared. The evaluation of the proposed methods is performed by applying them to a data set which includes normal operation traffic and four different atta...
The ability to detect appliances in load data highly depends on the resolution of the data. While a lot of related work exists on detecting appliances in second or sub-second granularity load data, in this paper, we detect swimming pools through their filter pumps in load data with the 15-minute granularity prescribed by the European Union for smart meters. We model the filter pump based on exemplary measurements and describe a prototypical algorithm to extract the filter pump's consumption from the aggregated mains signal of a real-world household. We evaluate pool detection performance with different classifiers on a data set with 843 households, where the information on the existence of a swimming pool is available. We achieve 94.8% detection accuracy with a precision of 68.5% with an off-the-shelf classifier. Decreasing the temporal resolution in several steps to 8 hours negatively affects the recall while the precision stays at the same level. We find that these results rai...
Proceedings of the VLDB Endowment
Differential privacy allows bounding the influence that training data records have on a machine learning model. To use differential privacy in machine learning, data scientists must choose privacy parameters (ϵ, δ). Choosing meaningful privacy parameters is key, since models trained with weak privacy parameters might result in excessive privacy leakage, while strong privacy parameters might overly degrade model utility. However, privacy parameter values are difficult to choose for two main reasons. First, the theoretical upper bound on privacy loss (ϵ, δ) might be loose, depending on the chosen sensitivity and data distribution of practical datasets. Second, legal requirements and societal norms for anonymization often refer to individual identifiability, to which (ϵ, δ) are only indirectly related. We transform (ϵ, δ) to a bound on the Bayesian posterior belief of the adversary assumed by differential privacy concerning the presence of any record in the training dataset. The bou...
Proceedings of the 4th International Conference on Information Systems Security and Privacy, 2018
The planned large-scale smart meter rollout has raised privacy concerns. In this work, holiday detection from smart metering data is presented for the first time. Although holiday detection may seem easier than occupancy detection, it is shown that occupancy detection methods must at least be adapted when used for holiday detection. A new, unsupervised method for holiday detection that applies classification algorithms on a suitable re-formulation of the problem is presented. Several algorithms were applied to a big, realistic smart metering dataset that, compared to existing datasets for occupancy detection, is unique in terms of number of households (869) and measurement duration (>1 year) and has a realistic low time resolution of 15 minutes. This allows for more realistic checks of seemingly plausible but unconfirmed assumptions. This work is merely a starting point for further research in this area, with more research questions raised than answered. While the results of the algorithms look plausible in a visual analysis, testing on data with ground truth is the most pressing need.
e & i Elektrotechnik und Informationstechnik
To ensure continued reliable operation of the energy grid in the face of a rising number of local... more To ensure continued reliable operation of the energy grid in the face of a rising number of local energy communities (LECs), they need to be integrated in a way that ensures that they do not negatively impact but rather support the overall system, e.g., by providing flexibility. Project ECOSINT aims at intelligent, digital integration of LECs to achieve this goal. Along with several other tasks, this includes the development of a suitable software architecture, which is currently in progress and has yielded a conceptual model as a first result. Among the numerous requirements regarding LECs and their software architecture which have been collected in the initial phase of the project, interoperability has been identified as a crucial factor for success. This is addressed by incorporating AIT’s virtual lab (VLab) framework, which is presented and demonstrated using the example of an electric vehicle (EV) charging scenario as part of the ECOSINT solution.
Neuroepidemiology, 2002
The epidemiology of multiple sclerosis (MS) in Austria is almost unknown. We evaluated the preval... more The epidemiology of multiple sclerosis (MS) in Austria is almost unknown. We evaluated the prevalence of MS in Austria using data from questionnaires completed by neurologists, comprising information on a total of 1,006 MS patients who attended 30 out-patient specialized clinics nationwide. Additional data were collected from 2,414 MS patients, who received questionnaires from the Austrian MS Society or their doctor. A novel extrapolation model, based on frequencies of patients visits at MS clinics, was used to estimate the overall prevalence of MS. Considering either disability or the course of the disease, the prevalence of MS patients in Austria was estimated to be 98.5 per 100,000 people. The prevalence of MS in Austria was found to be similar to that of other countries in Central Europe. Epidemiological studies, such as this, provide a unique source of data from which key features of a disease and its impact on patients may be examined.
2008 19th International Conference on Pattern Recognition, 2008
Video footage of real crowded scenes still poses severe challenges for automated surveillance. Th... more Video footage of real crowded scenes still poses severe challenges for automated surveillance. This paper evaluates clustering methods for finding independent dominant motion fields for an observation period based on a recently published real-time optical flow algorithm. We focus on self-tuning spectral clustering and Isomap combined with k-means. Several combinations of feature vector normalizations and distance measures (Euclidean, Mahanalobis and a general additive distance) are evaluated for four image sequences including three publicly available crowd datasets. Evaluation is based on mean accuracy obtained by comparison with a manually defined ground truth clustering. For every dataset at least one approach correctly classified more than 95% of the flow vectors without extra tuning of parameters, providing a basis for an automatic analysis after a view-dependent setup.
Journal of Machine Learning Research, Dec 1, 2005
AdaBoost.M2 is a boosting algorithm designed for multiclass problems with weak base classifiers. ... more AdaBoost.M2 is a boosting algorithm designed for multiclass problems with weak base classifiers. The algorithm is designed to minimize a very loose bound on the training error. We propose two alternative boosting algorithms which also minimize bounds on performance measures. These performance measures are not as strongly connected to the expected error as the training error, but the derived bounds are tighter than the bound on the training error of AdaBoost.M2. In experiments the methods have roughly the same performance in minimizing the training and test error rates. The new algorithms have the advantage that the base classifier should minimize the confidence-rated error, whereas for AdaBoost.M2 the base classifier should minimize the pseudo-loss. This makes them more easily applicable to already existing base classifiers. The new algorithms also tend to converge faster than AdaBoost.M2.
Sustainable Cloud and Energy Services, 2017
The smart grid changes the way how energy and information are exchanged and offers opportunities ... more The smart grid changes the way how energy and information are exchanged and offers opportunities for incentive-based load balancing. For instance, customers may shift the time of energy consumption of household appliances in exchange for a cheaper energy tariff. This paves the path towards a full range of modular tariffs and dynamic pricing that incorporate the overall grid capacity as well as individual customer demands. This also allows customers to frequently switch within a variety of tariffs from different utility providers based on individual energy consumption and provision forecasts. For automated tariff decisions it is desirable to have a tool that assists in choosing the optimum tariff based on a prediction of individual energy need and production. However, the revelation of individual load patterns for smart grid applications poses severe privacy threats for customers as analyzed in depth in literature. Similarly, accurate and fine-grained regional load forecasts are sensitive business information of utility providers that are not supposed to be released publicly. This paper extends previous work in the domain of privacy-preserving load profile matching where load profiles from utility providers and load profile forecasts from customers are transformed in a distance-preserving embedding in order to find a matching tariff. The embeddings neither reveal individual contributions of customers, nor those of utility providers. Prior work requires a dedicated entity that needs to be trustworthy at least to some extent for determining the matches. In this paper we propose an adaption of this protocol, where we use blockchains and smart contracts for this matching process, instead. Blockchains are gaining widespread adaption in the smart grid domain as a powerful tool for public commitments and accountable calculations. 
While the use of a decentralized and trust-free blockchain for this protocol comes at the price of some privacy degradation (for which a mitigation is outlined), this drawback is outweighed for it enables verifiability, reliability and transparency.
In this thesis the influence of secondary electrons on the stability of the presheath is investig... more In this thesis the influence of secondary electrons on the stability of the presheath is investigated. Secondary electrons are emitted from the wall due to the impact of primary particles. They get accelerated by the sheath potential and enter the presheath essentially as a cold beam. If the corresponding electron-beam instability is stronger than the damping processes in the presheath, the presheath becomes unstable. The approximate theoretical treatment of this problem mainly includes the derivation of the dispersion relation. The numerical solution of the dispersion relation shows that we can expect an instability induced by the secondary electrons. The problem is also treated by means of Particle-in-cell (PIC) simulations, where the instability induced by the secondary electrons is observed as well. In addition to these already known results we propose a mechanism which saturates the instability. As the (initially cold) secondary electron beam enters the presheath, its temperatu...
The energy domain currently struggles with radical legal and technological changes, such as, smar... more The energy domain currently struggles with radical legal and technological changes, such as, smart meters. This results in new use cases which can be implemented based on business process technology. Understanding and automating business processes requires to model and test them. However, existing process testing approaches frequently struggle with the testing of process resources, such as ERP systems, and negative testing. Hence, this work presents a toolchain which tackles that limitations. The approach uses an open source process engine to generate event logs and applies process mining techniques in a novel way.
Applications based on blockchain technology have become popular. While these applications have cl... more Applications based on blockchain technology have become popular. While these applications have clear benefits, users are not yet familiar with their usage, which could hinder further applications of this technology. In this paper, an online survey with 110 potential users, as a representative of an average citizen, was conducted. The focus of this survey is to explore their preferences concerning the interaction with blockchain-based applications by mainly focusing on how to handle private keys. To best of our knowledge this is the first study where average citizens are asked about the preferred management of a private key, which is necessary when interacting with blockchain-based applications. One of the main results was that about 80% of the participants would like to have the benefit of data sovereignty despite the cost of being fully responsible to backup their
EURASIP Journal on Information Security, 2017
The availability of individual load profiles per household in the smart grid end-user domain comb... more The availability of individual load profiles per household in the smart grid end-user domain combined with non-intrusive load monitoring to infer personal data from these load curves has led to privacy concerns. Privacy-enhancing technologies have been proposed to address these concerns. In this paper, the extension of privacy-enhancing technologies by wavelet-based multi-resolution analysis (MRA) is proposed to enhance the options available on the user side. For three types of privacy methods (secure aggregation, masking and differential privacy), we show that MRA not only enhances privacy, but also adds additional flexibility and control for the end-user. The combination of MRA and PETs is evaluated in terms of privacy, computational demands, and real-world feasibility for each of the three method types.
ArXiv, 2021
Smart meter data aggregation protocols have been developed to address rising privacy threats agai... more Smart meter data aggregation protocols have been developed to address rising privacy threats against customers’ consumption data. However, these protocols do not work satisfactorily in the presence of failures of smart meters or network communication links. In this paper, we propose a lightweight and fault-tolerant aggregation algorithm that can serve as a solid foundation for further research. We revisit an existing error-resilient privacy-preserving aggregation protocol based on masking and improve it by: (i) performing changes in the cryptographic parts that lead to a reduction of computational costs, (ii) simplifying the behaviour of the protocol in the presence of faults, and showing a proof of proper termination under a well-defined failure model, (iii) decoupling the computation part from the data flow so that the algorithm can also be used with homomorphic encryption as a basis for privacy-preservation. To best of our knowledge, this is the first algorithm that is formulated...
IEEE Transactions on Smart Grid, 2017
There has been a large number of contributions on privacy-preserving smart metering with Differen... more There has been a large number of contributions on privacy-preserving smart metering with Differential Privacy, addressing questions from actual enforcement at the smart meter to billing at the energy provider. However, exploitation is mostly limited to application of cryptographic security means between smart meters and energy providers. We illustrate along the use case of privacy preserving load forecasting that Differential Privacy is indeed a valuable addition that unlocks novel information flows for optimization. We show that (i) there are large differences in utility along three selected forecasting methods, (ii) energy providers can enjoy good utility especially under the linear regression benchmark model, and (iii) households can participate in privacy preserving load forecasting with an individual re-identification risk < 60%, only 10% over random guessing.
IEEE Transactions on Smart Grid, 2016
The deployment of future energy systems promises a number of advantages for a more stable and reliable grid as well as for a sustainable usage of energy resources. The efficiency and effectiveness of such smart grids rely on customer consumption data that is collected, processed and analyzed. This data is used for billing, monitoring and prediction. However, this implies privacy threats. Approaches exist that aim to either encrypt data in certain ways, to reduce the resolution of data or to mask data in a way so that an individual's contribution is untraceable. While the latter is an effective way of protecting customer privacy when aggregating over space or time, one of the drawbacks of these approaches is their limited handling or complete neglect of device failures. In this paper we therefore propose a masking approach for spatio-temporal aggregation of time series for protecting individual privacy while still providing sufficient error-resilience and reliability.
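To make the idea of masking-based aggregation concrete, here is a minimal sketch of generic additive masking with masks that sum to zero. It is not the protocol proposed in the paper: the function names, the modulus and the example readings are assumptions made purely for illustration.

```python
import secrets

# Assumed size of the value space; all arithmetic is done modulo this number.
MODULUS = 2**32


def make_zero_sum_masks(n_meters, modulus=MODULUS):
    """Draw random masks whose sum is 0 modulo `modulus` (hypothetical helper)."""
    masks = [secrets.randbelow(modulus) for _ in range(n_meters - 1)]
    # The last mask cancels all previous ones.
    masks.append((-sum(masks)) % modulus)
    return masks


def mask_reading(reading, mask, modulus=MODULUS):
    """Hide a single meter reading by adding its mask."""
    return (reading + mask) % modulus


def aggregate(masked_readings, modulus=MODULUS):
    """Sum the masked readings; the masks cancel if every meter reports."""
    return sum(masked_readings) % modulus


# Illustrative readings in watt-hours (made-up values).
readings = [412, 1530, 87, 960]
masks = make_zero_sum_masks(len(readings))
masked = [mask_reading(r, m) for r, m in zip(readings, masks)]
total = aggregate(masked)
```

In this sketch the aggregator recovers the exact sum only when all masked readings arrive. If one meter fails, its mask is missing and the remaining masks no longer cancel, which is the kind of device failure the abstract says masking approaches tend to neglect.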
Through smart metering in the smart grid end-user domain, load profiles are measured per household. Personal data can be inferred from these load profiles by using nonintrusive appliance load monitoring methods, which has led to privacy concerns. Privacy is expected to increase with longer intervals between measurements of load curves. This paper studies the impact of data granularity on edge detection methods, which are the common first step in nonintrusive load monitoring algorithms. It is shown that when the time interval exceeds half the on-time of an appliance, the appliance use detection rate declines. Using one-versus-rest classification modeling, the ability to detect an appliance’s use is evaluated through F-scores. Representing these F-scores visually through a heatmap yields an easily understandable way of presenting potential privacy implications in smart metering to the end-user or other decision makers.
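The effect of granularity on edge detection can be illustrated with a toy example. The sketch below is not the method evaluated in the paper: the synthetic load curve, the averaging-based downsampling, the threshold of 80% of the appliance power and all function names are assumptions chosen for illustration.

```python
def downsample(load, factor):
    """Average consecutive blocks of `factor` samples (an incomplete tail is dropped)."""
    n_blocks = len(load) // factor
    return [sum(load[i * factor:(i + 1) * factor]) / factor for i in range(n_blocks)]


def detect_edges(load, threshold):
    """Return (index, delta) for every step whose magnitude reaches `threshold`."""
    edges = []
    for i in range(1, len(load)):
        delta = load[i] - load[i - 1]
        if abs(delta) >= threshold:
            edges.append((i, delta))
    return edges


# Synthetic one-minute load: 100 W base load plus a 2000 W appliance
# that runs for 10 minutes (minutes 20 to 29). All values are made up.
base, appliance = 100.0, 2000.0
load_1min = [base + (appliance if 20 <= t < 30 else 0.0) for t in range(60)]

# Assumed detection threshold: 80% of the appliance's power draw.
threshold = 0.8 * appliance

# Detected edges at 1-minute, 5-minute and 15-minute granularity.
edges_by_interval = {
    factor: detect_edges(downsample(load_1min, factor), threshold)
    for factor in (1, 5, 15)
}
```

With these made-up numbers, both the switch-on and switch-off edges are found at 1-minute and 5-minute resolution, while at 15 minutes the averaged step shrinks below the threshold and no edge is reported. The exact interval at which detection degrades depends on how the appliance run aligns with the measurement intervals, so this toy case only shows the smoothing effect and does not reproduce the paper's result.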
Recently, first methods for holiday detection from unsupervised low-resolution smart metering data have been presented. However, due to the unsupervised nature of the problem, previous work only applied the algorithms to a few typical cases and lacks a systematic validation. This paper systematically validates the existing algorithm by visual inspection and shows that numerous cases exist where implicit assumptions are not met and the methods fail. Moreover, it proposes a new, very simple rule-based method which is in principle able to overcome these problems. This method should be seen as a first step towards improvement, since it is not automated and needs a moderate amount of human intervention for each household.
Energy Informatics
Electrical networks of transmission system operators are mostly built up as isolated networks without access to the Internet. With the increasing popularity of smart grids, securing the communication network has become more important to avoid cyber-attacks that could result in possible power outages. For misuse detection, signature-based approaches are already in use and special rules for a wide range of protocols have been developed. However, one big disadvantage of signature-based intrusion detection is that zero-day exploits cannot be detected. Machine-learning-based anomaly detection methods have the potential to achieve that. In this paper, various such methods for intrusion detection in substations, which use the asynchronous communication protocol International Electrotechnical Commission (IEC) 60870-5-104, are tested and compared. The evaluation of the proposed methods is performed by applying them to a data set which includes normal operation traffic and four different atta...
The ability to detect appliances in load data highly depends on the resolution of the data. While a lot of related work exists on detecting appliances in second or sub-second granularity load data, in this paper, we detect swimming pools through their filter pumps in load data with the 15-minute granularity prescribed by the European Union for smart meters. We model the filter pump based on exemplary measurements and describe a prototypical algorithm to extract the filter pump's consumption from the aggregated mains signal of a real-world household. We evaluate pool detection performance with different classifiers on a data set with 843 households, where the information on the existence of a swimming pool is available. We achieve 94.8% detection accuracy with a precision of 68.5% with an off-the-shelf classifier. Decreasing the temporal resolution in several steps to 8 hours negatively affects the recall while the precision stays at the same level. We find that these results rai...
Proceedings of the VLDB Endowment
Differential privacy allows bounding the influence that training data records have on a machine learning model. To use differential privacy in machine learning, data scientists must choose privacy parameters (ϵ, δ). Choosing meaningful privacy parameters is key, since models trained with weak privacy parameters might result in excessive privacy leakage, while strong privacy parameters might overly degrade model utility. However, privacy parameter values are difficult to choose for two main reasons. First, the theoretical upper bound on privacy loss (ϵ, δ) might be loose, depending on the chosen sensitivity and data distribution of practical datasets. Second, legal requirements and societal norms for anonymization often refer to individual identifiability, to which (ϵ, δ) are only indirectly related. We transform (ϵ, δ) to a bound on the Bayesian posterior belief of the adversary assumed by differential privacy concerning the presence of any record in the training dataset. The bou...
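The abstract above is cut off before it states its bound, so the following is only a rough illustration of how a privacy parameter can be translated into a posterior belief. It uses the textbook Bayes-rule argument for pure ϵ-differential privacy (δ = 0), which may differ from the bound derived in the paper; the function name and the uniform default prior are assumptions.

```python
import math


def posterior_belief_bound(epsilon, prior=0.5):
    """Upper bound on an adversary's posterior belief that a record is present.

    Assumes pure epsilon-DP (delta = 0): the likelihood ratio between the two
    neighbouring datasets is at most exp(epsilon), so Bayes' rule gives
    posterior <= e^eps * prior / (e^eps * prior + (1 - prior)).
    """
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    likelihood_ratio = math.exp(epsilon)
    return likelihood_ratio * prior / (likelihood_ratio * prior + (1.0 - prior))


# Bound for a few illustrative epsilon values with a uniform prior of 0.5.
bounds = {eps: posterior_belief_bound(eps) for eps in (0.0, 0.1, 1.0, 5.0)}
```

With a uniform prior the expression simplifies to 1 / (1 + e^(-ϵ)): at ϵ = 0 the adversary learns nothing beyond the prior of 0.5, and the bound approaches 1 as ϵ grows.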
Proceedings of the 4th International Conference on Information Systems Security and Privacy, 2018
The planned large-scale smart meter rollout has raised privacy concerns. This work presents holiday detection from smart metering data for the first time. Although holiday detection may seem easier than occupancy detection, it is shown that occupancy detection methods must at least be adapted when used for holiday detection. A new, unsupervised method for holiday detection that applies classification algorithms to a suitable re-formulation of the problem is presented. Several algorithms were applied to a big, realistic smart metering dataset that, compared to existing datasets for occupancy detection, is unique in terms of number of households (869) and measurement duration (>1 year) and has a realistic low time resolution of 15 minutes. This allows for more realistic checks of seemingly plausible but unconfirmed assumptions. This work is merely a starting point for further research in this area, with more research questions raised than answered. While the results of the algorithms look plausible in a visual analysis, testing on data with ground truth is what is needed most.
Energy Informatics
The nationwide rollout of smart meters in private households raises privacy concerns: Is it possible to extract privacy-sensitive information from a household's power consumption? For a small sample of 869 Upper Austrian households, information about consumption-heavy amenities and household characteristics is available. This work studies the detection of households with swimming pools (the most common amenity in the dataset) using Convolutional Neural Networks (CNNs) applied to load heatmaps constructed from load profiles. Although only a small dataset is available, results show that by using CNNs, privacy can be broken automatically, i.e., without time-consuming manual feature generation. The method even slightly outperforms a previous approach that relies on a nearest neighbor classifier with engineered features.