Amine Farhat | ISG-Tunis - Academia.edu

Papers by Amine Farhat

Clustering Heterogeneous Data Streams with Uncertainty over Sliding Window

Lecture Notes in Computer Science, 2013

Existing methods for clustering uncertain data streams over sliding windows do not handle categorical attributes, even though uncertain mixed data are ubiquitous. This paper investigates the problem of clustering heterogeneous data streams pervaded by uncertainty over sliding windows, called SWHU-Clustering. A Heterogeneous Uncertain Temporal Cluster Feature (HUTCF) is introduced to monitor the distribution statistics of mixed data points. Based on this structure, the Exponential Histogram of Heterogeneous Uncertain Cluster Feature (EHHUCF) is presented as a collection of HUTCFs. This structure may help to handle in-cluster evolution and detects temporal changes in the cluster distribution. The approach has several advantages over existing methods: (1) higher execution efficiency, because its design avoids the effects of old data on the final results; (2) reduced algorithmic complexity, because k-NN is incorporated into the clustering process; and (3) efficient management of memory consumption, by limiting the number of HUTCFs in each EHHUCF. Simulations on real databases show the feasibility of SWHU-Clustering and its effectiveness in comparison with the UMicro algorithm.
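To make the idea of a temporal cluster feature kept in an exponential histogram more concrete, here is a minimal Python sketch. It is not the HUTCF/EHHUCF structure from the paper: it covers numeric attributes only, ignores uncertainty and categorical values, and the names `TemporalCF`, `ExpHistogram`, `window` and `max_per_size` are hypothetical. It only illustrates how expired data can be dropped and how memory can be bounded by limiting the number of summaries.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TemporalCF:
    """Simplified numeric-only temporal cluster feature (hypothetical)."""
    n: int            # number of points summarised
    ls: List[float]   # per-dimension linear sum
    ss: List[float]   # per-dimension sum of squares
    t: int            # timestamp of the most recent point

    def merge(self, other: "TemporalCF") -> "TemporalCF":
        # Cluster features are additive, so merging is a component-wise sum.
        return TemporalCF(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            [a + b for a, b in zip(self.ss, other.ss)],
            max(self.t, other.t),
        )


class ExpHistogram:
    """Buckets of TemporalCF, oldest first, for one cluster (assumed design)."""

    def __init__(self, window: int, max_per_size: int = 2):
        self.window = window              # sliding-window length in time units
        self.max_per_size = max_per_size  # cap on buckets holding the same count
        self.buckets: List[TemporalCF] = []

    def insert(self, point: List[float], t: int) -> None:
        # Timestamps are assumed to be non-decreasing.
        self.buckets.append(TemporalCF(1, list(point), [x * x for x in point], t))
        self._expire(t)
        self._compact()

    def _expire(self, now: int) -> None:
        # Drop buckets whose newest point has left the window (now - window, now].
        self.buckets = [b for b in self.buckets if b.t > now - self.window]

    def _compact(self) -> None:
        # Merge the two oldest buckets of a size once that size is over-represented.
        changed = True
        while changed:
            changed = False
            by_size = {}
            for i, b in enumerate(self.buckets):
                by_size.setdefault(b.n, []).append(i)
            for idxs in by_size.values():
                if len(idxs) > self.max_per_size:
                    i, j = idxs[0], idxs[1]
                    self.buckets[i] = self.buckets[i].merge(self.buckets[j])
                    del self.buckets[j]
                    changed = True
                    break

    def count(self) -> int:
        return sum(b.n for b in self.buckets)

    def centroid(self) -> List[float]:
        n = self.count()
        dims = len(self.buckets[0].ls)
        return [sum(b.ls[d] for b in self.buckets) / n for d in range(dims)]


h = ExpHistogram(window=4)
for t in range(1, 7):
    h.insert([float(t)], t)
print(h.count(), h.centroid())
```

Because older points are merged into coarser buckets, the number of buckets grows only logarithmically with the window size, at the cost of an approximate window boundary when a merged bucket straddles it.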

New Algorithm for Frequent Itemsets Mining from Evidential Data Streams

Procedia Computer Science, 2016

Mining frequent itemsets is an important problem in data stream processing and is useful for several real-world applications. The task raises many challenges, such as the one-pass principle and performance problems due to the huge volumes of data streams. Performance is defined in terms of CPU and main memory consumption as well as uncertainty management issues. In this paper, we introduce the concept of Evidential Data Streams and present a new algorithm for mining frequent itemsets from evidential data streams, based on the concepts of evidence theory.
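As an illustration of what support can mean when items carry evidential (Dempster-Shafer) uncertainty, the Python sketch below computes a belief-based support over a fixed-size window. This is an assumption, not the algorithm of the paper: each attribute of a transaction holds a mass function over sets of values, the belief of an item is the total mass of focal sets contained in it, and attributes are treated as independent so that beliefs are multiplied. The attribute names `A` and `B` and the functions `belief` and `support` are hypothetical.

```python
from collections import deque


def belief(mass, target):
    """Bel(target): total mass of non-empty focal sets contained in target."""
    return sum(m for focal, m in mass.items() if focal and focal <= target)


def support(window, itemset):
    """Mean over transactions of the product of per-attribute beliefs."""
    if not window:
        return 0.0
    total = 0.0
    for txn in window:
        b = 1.0
        for attr, values in itemset.items():
            b *= belief(txn[attr], values)
        total += b
    return total / len(window)


# A window holding the two most recent transactions (one pass, bounded memory).
window = deque(maxlen=2)
window.append({
    "A": {frozenset({"a1"}): 0.8, frozenset({"a1", "a2"}): 0.2},
    "B": {frozenset({"b1"}): 1.0},
})
window.append({
    "A": {frozenset({"a2"}): 1.0},
    "B": {frozenset({"b1"}): 0.5, frozenset({"b1", "b2"}): 0.5},
})

itemset = {"A": frozenset({"a1"}), "B": frozenset({"b1"})}
s = support(window, itemset)
print(s)
```

An itemset would then be called frequent when this value reaches a user-chosen minimum support threshold.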

BUILDING A DATA WAREHOUSE FOR NATIONAL SOCIAL SECURITY FUND OF THE REPUBLIC OF TUNISIA

International Journal of Database Management Systems, 2010

The amount of data available to decision makers is increasingly large, given network availability, low-cost storage and the diversity of applications. To maximize the potential of these data within the National Social Security Fund (NSSF) in Tunisia, we have built a data warehouse as a multidimensional database that is cleaned, homogenized, historicized and consolidated. We used Oracle Warehouse Builder to extract, transform and load the source data into the data warehouse, applying the KDD process. We implemented the data warehouse using Oracle OLAP. Knowledge extraction was performed using the Oracle Discoverer tool, which allowed users to take maximum advantage of the knowledge through regular reports or ad hoc queries. We started by implementing the main topic for this public institution: accounting for the movements of insured persons. The success that followed the completion of this work has encouraged the NSSF to implement other topics of interest. In the near future, we suggest using multidimensional data mining to extract hidden knowledge that cannot be predicted by OLAP.

Mining MultiLevel Frequent Itemsets under Constraints

Computing Research Repository, 2010

Mining association rules is a data mining task that extracts knowledge in the form of significant implication relations between useful items (objects) from a database. Mining multilevel association rules uses concept hierarchies, also called taxonomies and defined as 'is-a' relations between objects, to extract rules whose items belong to different levels of abstraction. These rules are more
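The role of an 'is-a' taxonomy in multilevel itemset mining can be sketched as follows. This Python example is only an illustration under stated assumptions, not the method of the paper: it extends each transaction with the ancestors of its items and counts itemsets by brute force rather than with a level-wise or constraint-based search. The taxonomy, the item names and the functions `ancestors`, `extend` and `frequent_itemsets` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical 'is-a' taxonomy: child -> parent.
TAXONOMY = {
    "skim milk": "milk",
    "whole milk": "milk",
    "milk": "dairy",
    "white bread": "bread",
}


def ancestors(item, taxonomy):
    """All generalisations of an item, nearest first."""
    out = []
    while item in taxonomy:
        item = taxonomy[item]
        out.append(item)
    return out


def extend(txn, taxonomy):
    """Add every ancestor of every item to the transaction."""
    ext = set(txn)
    for item in txn:
        ext.update(ancestors(item, taxonomy))
    return ext


def frequent_itemsets(transactions, taxonomy, min_support, max_size=2):
    counts = Counter()
    for txn in transactions:
        items = sorted(extend(txn, taxonomy))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                # Skip itemsets that pair an item with one of its own ancestors.
                if any(a in ancestors(b, taxonomy) for a in combo for b in combo):
                    continue
                counts[combo] += 1
    n = len(transactions)
    return {c: v / n for c, v in counts.items() if v / n >= min_support}


transactions = [
    {"skim milk", "white bread"},
    {"whole milk", "white bread"},
    {"whole milk"},
]
result = frequent_itemsets(transactions, TAXONOMY, min_support=0.6)
print(result)
```

In this toy data, "skim milk" is infrequent on its own, yet its generalisation "milk" is frequent, which is the kind of cross-level pattern a taxonomy makes visible.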
