Joseph Ndong | IFAN, Cheikh Anta Diop University of Dakar
Papers by Joseph Ndong
Forecasting
Building a sophisticated forecasting framework for solar and photovoltaic power production in geographic zones with severe meteorological conditions is very challenging. This difficulty is linked to the high variability of the global solar radiation on which the energy production depends. A suitable forecasting framework should take this high variability into account and be able to adjust and re-adjust model parameters to reduce sensitivity to estimation errors. The framework should also be able to re-adapt the model parameters whenever the atmospheric conditions change drastically or suddenly; these changes occur according to microscopic variations. This work presents a new methodology to carefully analyze the meaningful features of global solar radiation variability and to extract relevant information about the probabilistic laws that govern its dynamic evolution. The work establishes a framework able to identify the macroscopic variations from the solar irradiance. The different cat...
This paper presents a new approach for anomaly detection based on possibility theory for normal behavioral modeling. Combining subspace identification algorithms and Kalman filtering techniques can provide a good basis for finding a suitable model and building a decision variable, to which a new decision process can be applied to identify anomalous events. A robust final decision scheme can be built by means of possibility distributions to find the abnormal space where anomalies happen. Our system uses a calibrated state-space dynamical linear model whose parameters are found by the principal component analysis framework. The multidimensional Kalman innovation process is used to build the unidimensional decision variable. This variable is then clustered, and possibility distributions are used to separate the clusters into normal and abnormal spaces when anomalies happen. We studied the false alarm rate vs. detection rate trade-off by means of the Receiver Operating Char...
Parametric anomaly detection is generally a three-step process. In the first step, a model of normal behavior is calibrated, and the obtained model is then used to reduce the entropy of the observation. The second step generates an innovation process that is used in the third step to decide whether an anomaly exists in the observed data. Under favorable conditions the innovation process is expected to be Gaussian white noise. In practice, however, this is rarely the case, as the observed signals are frequently not Gaussian themselves. Moreover, long-range dependencies and heavy tails in the observations can lead to significant deviations from normality and independence in the innovation process. As a result, decisions made under the assumption that the innovation process is white and Gaussian frequently yield a large false positive rate. In this paper we deal with this issue. Our approach consists of n...
This paper aims to evaluate the impact of temperature and humidity on short-term solar potential forecasting. To this end, a discrete Kalman filter model based on an AR process and the EM algorithm is applied in Dakar for a lead time of 20 minutes. The model inputs at time t are air temperature, relative humidity and global solar radiation. The expected output at time (t + T) is the global solar radiation. Input data were measured at the Polytechnic School of Dakar and cover one year. Results are verified using performance criteria such as the nRMSE, nMAE and nMBE. The criteria values are about 4.9% for the nRMSE, 0.272% for the nMAE and about -0.7% for the nMBE. Without considering temperature and relative humidity, the criteria take the following values: nRMSE of 4.8%, nMAE of 0.271% and nMBE of 0.4%. The analysis of the results showed that the studied meteorological parameters have a very weak influence (difference of nRMS...
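As a rough illustration of the kind of filter this abstract describes, the sketch below runs a scalar discrete Kalman filter whose hidden state follows an AR(1) process and returns one-step-ahead forecasts together with the innovations. The function name and the values of `phi`, `q`, `r`, `x0` and `p0` are hypothetical. The model in the abstract also takes temperature and humidity as inputs and estimates its parameters with the EM algorithm, which this sketch does not attempt.

```python
def kalman_ar1_forecast(observations, phi=0.95, q=0.01, r=0.1, x0=0.0, p0=1.0):
    """One-step-ahead forecasts from a scalar Kalman filter with an AR(1) state.

    State model:       x[t] = phi * x[t-1] + w,  w ~ N(0, q)
    Observation model: y[t] = x[t] + v,          v ~ N(0, r)
    All parameter values are illustrative assumptions, not fitted values.
    """
    x, p = x0, p0            # current state estimate and its variance
    forecasts = []           # prediction of y[t] made before seeing y[t]
    innovations = []         # prediction errors y[t] - forecast[t]
    for y in observations:
        # Predict step: propagate the estimate through the AR(1) dynamics.
        x_pred = phi * x
        p_pred = phi * phi * p + q
        forecasts.append(x_pred)

        # Innovation: the part of the observation the model did not predict.
        nu = y - x_pred
        innovations.append(nu)

        # Update step: blend prediction and observation using the Kalman gain.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * nu
        p = (1.0 - gain) * p_pred
    return forecasts, innovations
```

Calling `kalman_ar1_forecast(series)` on a list of radiation measurements returns two lists of the same length as `series`; the forecasts can then be scored against the measurements with normalized error criteria.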
This thesis concerns the development of anomaly detection techniques for communication networks, based on probabilistic and possibilistic methods. The probabilistic methods rely essentially on robust statistical techniques, notably Kalman filtering, Gaussian mixture models and hidden Markov models, as well as on principal component analysis and its variants. The possibilistic methods use fuzzy set theory and possibility distributions. Unlike existing work in the literature, the innovation process (decision variable) at the output of the Kalman filter is assumed to be a mixture of normal distributions rather than simple white noise. This makes it possible to apply unsupervised clustering methods to show that anomalies can be detected in a small number of clusters, forming the subspace of the abnormal behavior of the sy...
Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 2018
In this work, we extend previous work in which we proposed a suitable state model built from a Karhunen-Loeve transformation, in order to build a new decision process from which we can extract useful knowledge and information about the underlying sub-communities identified in an initial network. The aim of the method is to build a framework for multi-level knowledge retrieval. Besides the capacity of the methodology to reduce the high dimensionality of the data, the new detection scheme is able to extract the dense subgroups from the sub-communities through the definition and formulation of new quantities related to the notions of energy and co-energy. The energy of a node is defined as the rate of its participation in the set of activities, while the co-energy defines the rate of interaction (link) between two nodes. These two features are used to make each link weighted and bounded, so that we can perform a thorough refinement of the sub-community discovery. This study enables a multi-level analysis by extracting information either per link or within each sub-community. As an improvement of this work, we define the notion of pivot to denote the node(s) with the greatest influence in the network. We propose a tool based on the transformation of a suitable probabilistic model into a possibilistic model to extract these pivots, which are the nodes that control the evolution of the community.
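The abstract does not say which probability-to-possibility transformation is used. As one possible reading, the sketch below applies the classical Dubois-Prade transformation to a discrete distribution: each outcome receives a possibility degree equal to the total probability of all outcomes that are no more probable than it, so the most probable outcome gets possibility 1. The function name and the example values are hypothetical.

```python
def probability_to_possibility(probs, tol=1e-9):
    """Dubois-Prade transformation of a discrete probability distribution.

    pi_i = sum of p_j over all j with p_j <= p_i.
    This choice of transformation is an assumption; the source abstract
    does not specify the one it relies on.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs) - 1.0) > tol:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with equal probability receive equal possibility degrees.
    return [sum(q for q in probs if q <= p) for p in probs]


# Hypothetical example: three nodes with influence probabilities.
possibilities = probability_to_possibility([0.5, 0.3, 0.2])
```

In this hypothetical example the first node, being the most probable, is fully possible, while the others receive strictly smaller degrees; ranking nodes by such degrees is one way a pivot could be singled out.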
In this paper, we present an analysis of anomaly detection by comparing two well-known approaches, namely Principal Component Analysis (PCA) based and Kalman filtering based signal processing techniques. The PCA-based approach is coupled with a Karhunen-Loeve (KL) expansion to further improve detection performance. On the other hand, based on a Kalman filter, we built a new method that combines statistical models, namely a Gaussian mixture and a hidden Markov model, and obtains better performance than the PCA-KL expansion method. For this newer method, we no longer assume that the Kalman innovation process is Gaussian and white. Instead, we assume that the real distribution of the process is a mixture of normal distributions and that there is time dependency in the innovation, which we capture with a Hidden Markov Model. We therefore derive a new decision process and we show that...
E3S Web of Conferences
The prediction of solar potential is an important step toward evaluating PV plant production for better energy planning. In this study, a discrete Kalman filter model was implemented for short-term solar resource forecasting at the Dakar site in Senegal. The model inputs at time t are the air temperature, the relative humidity and the global solar radiation. The expected output at time t+T is the global solar radiation. The model performance is evaluated with the normalized root mean squared error (NRMSE), the normalized mean absolute error (NMAE) and the normalized mean bias error (NMBE). Model validation is carried out with data measured at the Polytechnic Higher School of Dakar over one year. The simulation results for the 20-minute horizon show a good correlation between prediction and measurement, with an NRMSE of 4.8%, an NMAE of 0.27% and an NMBE of 0.04%. This model could contribute to help photo...
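The three criteria named in this abstract can be computed as in the following minimal sketch. Two choices here are assumptions, since the abstract does not state them: errors are normalized by the mean of the measured values, and the bias is taken as predicted minus measured. The function name and return keys are hypothetical.

```python
import math


def normalized_errors(measured, predicted):
    """Return NRMSE, NMAE and NMBE in percent.

    Assumptions (not stated in the source): normalization by the mean of
    the measured series, and error defined as predicted - measured.
    """
    n = len(measured)
    if n == 0 or n != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    mean_measured = sum(measured) / n
    if mean_measured == 0:
        raise ValueError("mean of measured values must be non-zero")

    errors = [p - m for m, p in zip(measured, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    mbe = sum(errors) / n                             # mean bias error (signed)

    return {
        "NRMSE": 100.0 * rmse / mean_measured,
        "NMAE": 100.0 * mae / mean_measured,
        "NMBE": 100.0 * mbe / mean_measured,
    }
```

Other normalizations are also in use, such as dividing by the range or the maximum of the measurements, so reported percentages are only comparable when the same convention is applied.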
Proceeding of the Electrical Engineering Computer Science and Informatics
Possibility theory can be used as a suitable framework to build a normal behavioral model for an anomaly detector. For linear and/or nonlinear systems, sub-optimal filtering approaches based on the Extended Kalman Filter and the Unscented Kalman Filter are calibrated for entropy reduction and can provide a good basis for building a decision variable, to which a decision process can be applied to identify anomalous events. Sophisticated fuzzy clustering algorithms can be used to find a set of clusters built on the decision variable, with anomalies likely to fall inside a few of them. To achieve an efficient detection step, a robust decision scheme is built by means of possibility distributions to separate the clusters into normal and abnormal spaces. We studied the false alarm rate vs. detection rate trade-off by means of ROC (Receiver Operating Characteristic) curves. We validate the approach over different realistic network traffic.
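The false alarm rate vs. detection rate trade-off mentioned in this abstract can be traced by sweeping a threshold over a decision variable. The sketch below is a generic ROC computation, not the authors' evaluation code; the function name, the "score at or above threshold means alarm" rule and the example data are assumptions.

```python
def roc_points(scores, labels, thresholds):
    """Return one (false_alarm_rate, detection_rate) pair per threshold.

    scores: decision-variable values, higher meaning more anomalous (assumed).
    labels: True for a real anomaly, False for normal behavior.
    """
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one anomalous and one normal sample")

    points = []
    for threshold in thresholds:
        # An alarm is raised whenever the score reaches the threshold.
        true_alarms = sum(1 for s, l in zip(scores, labels) if l and s >= threshold)
        false_alarms = sum(1 for s, l in zip(scores, labels) if not l and s >= threshold)
        points.append((false_alarms / negatives, true_alarms / positives))
    return points


# Hypothetical scores for four time bins, two of which contain anomalies.
curve = roc_points([0.1, 0.4, 0.35, 0.8], [False, False, True, True], [0.0, 0.5, 1.0])
```

Lowering the threshold moves along the curve toward more detections at the cost of more false alarms, which is the trade-off the ROC curve visualizes.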
Studies in Computational Intelligence, 2016
2015 11th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), 2015
In this paper, we present an analysis of anomaly detection by comparing two well-known approaches, namely Principal Component Analysis (PCA) based and Kalman filtering based signal processing techniques. The PCA-based approach is coupled with a Karhunen-Loeve (KL) expansion to further improve detection performance. On the other hand, based on a Kalman filter, we built a new method that combines statistical models, namely a Gaussian mixture and a hidden Markov model, and obtains better performance than the PCA-KL expansion method. For this newer method, we no longer assume that the Kalman innovation process is Gaussian and white. Instead, we assume that the real distribution of the process is a mixture of normal distributions and that there is time dependency in the innovation, which we capture with a Hidden Markov Model. We therefore derive a new decision process and show that this approach results in a considerable decrease in false alarm rates. We validate the two approaches over several different realistic traces.
Studies in Computational Intelligence, 2016
Lecture Notes in Computer Science, 2015
This work deals with fluctuating workloads, as in social applications where users interact with each other in a temporary fashion. The data on which a user group focuses form a bundle and can cause a load peak if the frequency of interactions and the number of users are high. To manage such a situation, one solution is to partition the data and/or move them to a more powerful machine while ensuring consistency and effectiveness. However, two problems arise: how to partition the data efficiently, and how to determine which part of the data to move so that the data are located on a single site. To address them, we track the formation and evolution of bundles and measure their related load for two reasons: (1) to be able to partition data based on how they are required by user interactions; and (2) to assess whether a machine is still capable of executing the transactions linked to a bundle with bounded latency. The main gain of our approach is to minimize the number of machines used while maintaining low latency at a low cost.
This paper presents a new approach for anomaly detection based on possibility theory for normal behavioral modeling. Combining subspace identification algorithms and Kalman filtering techniques can provide a good basis for finding a suitable model and building a decision variable, to which a new decision process can be applied to identify anomalous events. A robust final decision scheme can be built by means of possibility distributions to find the abnormal space where anomalies happen. Our system uses a calibrated state-space dynamical linear model whose parameters are found by the principal component analysis framework. The multidimensional Kalman innovation process is used to build the unidimensional decision variable. This variable is then clustered, and possibility distributions are used to separate the clusters into normal and abnormal spaces when anomalies happen. We studied the false alarm rate vs. detection rate trade-off by means of the Receiver Operating...
Lecture Notes in Social Networks, 2014
Conference Proceedings on 3rd Annual International Conference on Network Technology & Communications, 2012
Parametric anomaly detection is generally a three-step process. In the first step, a model of normal behavior is calibrated, and the obtained model is then used to reduce the entropy of the observation. The second step generates an innovation process that is used in the third step to decide whether an anomaly exists in the observed data. Under favorable conditions the innovation process is expected to be Gaussian white noise. In practice, however, this is rarely the case, as the observed signals are frequently not Gaussian themselves. Moreover, long-range dependencies and heavy tails in the observations can lead to significant deviations from normality and independence in the innovation process. As a result, decisions made under the assumption that the innovation process is white and Gaussian frequently yield a large false positive rate. In this paper we deal with this issue. Our approach consists of no longer assuming that the innovation process is Gaussian and white. Instead, we assume that the real distribution of the process is a mixture of Gaussians and that there is some time dependency in the innovation, which we capture with a Hidden Markov Model. We therefore derive a new decision process and show that this approach results in a significant decrease in false alarm rates. We validate this approach over realistic traces.
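To make the mixture idea concrete, the sketch below scores a single innovation value under a one-dimensional Gaussian mixture and returns the posterior probability that it came from a component designated as abnormal. It is only a partial illustration: the mixture parameters are hypothetical rather than fitted, the component roles are assigned by hand, and the time dependency that the abstract captures with a Hidden Markov Model is not modeled.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def abnormal_posterior(nu, weights, means, stds, abnormal_index):
    """Posterior probability that innovation nu belongs to the abnormal component.

    weights, means, stds describe a 1-D Gaussian mixture; all values used
    below are illustrative assumptions, not estimates from real traffic.
    """
    joint = [w * gaussian_pdf(nu, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    if total == 0.0:
        # Every component underflowed: the value is far from all of them.
        return 1.0
    return joint[abnormal_index] / total


# Hypothetical mixture: a narrow "normal" component and a wide "abnormal" one.
WEIGHTS, MEANS, STDS = [0.95, 0.05], [0.0, 0.0], [1.0, 5.0]
is_anomaly = abnormal_posterior(10.0, WEIGHTS, MEANS, STDS, abnormal_index=1) > 0.5
```

With these hypothetical parameters a small innovation is attributed almost entirely to the narrow component, while a large one is attributed to the wide component and flagged. A single Gaussian threshold test would instead treat every large innovation as equally surprising.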