Fabrizio Angiulli | Università della Calabria

Papers by Fabrizio Angiulli

Principal Directions-Based Pivot Placement

Lecture Notes in Computer Science, 2013

Determining a good set of pivots is a challenging task in metric space indexing. Several techniques to select pivots from the data to be indexed have been introduced in the literature. In this paper, we propose a pivot placement strategy which exploits the natural data orientation in order to select space points which achieve a good alignment with the whole data to be indexed. Comparison with existing methods substantiates the effectiveness of the approach.
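The abstract stops short of algorithmic detail, but the core idea (placing pivots so they align with the data's dominant orientation) can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's method: the function name `place_pivots`, the plain SVD/PCA decomposition, and the offset-by-spread placement rule are all hypothetical choices.

```python
import numpy as np

def place_pivots(data, n_pivots, scale=1.0):
    # Hypothetical sketch: put pivots on the data's principal axes,
    # offset from the mean by the spread measured along each axis.
    mean = data.mean(axis=0)
    _, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    pivots = []
    for i in range(n_pivots):
        axis = vt[i % len(vt)]                      # principal direction
        sigma = s[i % len(s)] / np.sqrt(len(data))  # std. dev. along it
        # After all directions are used once, mirror to the other side.
        sign = 1.0 if (i // len(vt)) % 2 == 0 else -1.0
        pivots.append(mean + sign * scale * sigma * axis)
    return np.vstack(pivots)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8)) @ rng.normal(size=(8, 8))
print(place_pivots(X, n_pivots=4).shape)            # (4, 8)
```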

Outlier Detection Using Disjunctive Logic Programming

European Conference on Artificial Intelligence, 2004

By Fabrizio Angiulli (ICAR-CNR, Via Pietro Bucci 41C, 87030 Rende (CS), Italy, angiulli@icar.cnr.it), Rachel Ben-Eliyahu-Zohary, and Luigi Palopoli.

Outlier Detection by Logic Programming

Computing Research Repository, 2004

The development of effective knowledge discovery techniques has become a very active research area in recent years due to the important impact it has had in several relevant application domains. One interesting task therein is that of singling out anomalous individuals from a given population, e.g., to detect rare events in time-series analysis settings, or to identify objects whose behavior …

Outlier Detection Using Default Logic (Fabrizio Angiulli, Rachel Ben-Eliyahu-Zohary)

Default logic is used to describe regular behavior and normal properties. We suggest exploiting the framework of default logic for detecting outliers: individuals who behave in an unexpected way or feature abnormal properties. The ability to locate outliers can help to maintain knowledge base integrity and to single out irregular individuals. We first formally define the notion of an outlier and an outlier witness. We then show that finding outliers is quite complex. Indeed, we show that several versions of the outlier detection problem lie over the second level of the polynomial hierarchy. For example, the question of establishing if at least one outlier can be detected in a given propositional default theory is -complete. Although outlier detection involves heavy computation, the queries involved can frequently be executed off-line, thus somewhat alleviating the difficulty of the problem. In addition, we show that outlier detection can be done in polynomial time for both the class of acyclic normal unary defaults and the class of acyclic dual normal unary defaults.

Metaquerying: proprietà e tecniche di implementazione (Metaquerying: properties and implementation techniques)

Outlier detection for simple default theories

Artificial Intelligence, Oct 1, 2010

It was noted recently that the framework of default logics can be exploited for detecting outliers. Outliers are observations, expressed by sets of literals, that feature unexpected properties. These observations are not explicitly provided in input (as happens with abduction) but, rather, are hidden in the given knowledge base. Unfortunately, in the two related formalisms for specifying defaults (Reiter's default logic and extended disjunctive logic programs) the most general outlier detection problems turn out to lie at the third level of the polynomial hierarchy. In this note, we analyze the complexity of outlier detection for two very simple classes of default theories, namely NU and DNU, for which the entailment problem is solvable in polynomial time. We show that, for these classes, checking for the existence of an outlier is nevertheless intractable. This result contributes to further showing the inherent intractability of outlier detection in default reasoning.

Computational properties of metaquerying problems

ACM Transactions on Computational Logic, 2003

Metaquerying is a data mining technology by which hidden dependencies among several database relations can be discovered. This tool has already been successfully applied to several real-world applications. Recent papers provide only preliminary results about the complexity of metaquerying. In this paper we define several variants of metaquerying that encompass, as far as we know, all variants defined in the literature. We study both the combined complexity and the data complexity of these variants. We show that, under the combined complexity measure, metaquerying is generally intractable (unless P = NP), lying sometimes quite high in the complexity hierarchies (as high as NP^PP), depending on the characteristics of the plausibility index. However, we are able to single out some tractable and interesting metaquerying cases (whose combined complexity is LOGCFL-complete). As for the data complexity of metaquerying, we prove that, in general, it is in TC^0, but lies within AC^0 in some simpler cases. Finally, we discuss the implementation of metaqueries, providing algorithms to answer them.

On the Complexity of Mining Association Rules

SEBD, 2001

By Fabrizio Angiulli (ISI-CNR c/o Università della Calabria, DEIS, Via P. Bucci 41C, Rende, Italy, angiulli@isi.cs.cnr.it), Giovambattista Ianni (Università della Calabria, DEIS, Via P. Bucci 41C, Rende, Italy, ianni@deis.unical.it), and Luigi Palopoli (Università di Reggio Calabria).

Outlier Detection Using Default Logic

Default logic is used to describe regular behavior and normal properties. We suggest exploiting the framework of default logic for detecting outliers: individuals who behave in an unexpected way or feature abnormal properties. The ability to locate outliers can help to maintain knowledge base integrity and to single out irregular individuals.

On the complexity of inducing categorical and quantitative association rules

Theoretical Computer Science, 2004

Inducing association rules is one of the central tasks in data mining applications. Quantitative association rules induced from databases describe rich and hidden relationships holding within data that can prove useful for various application purposes (e.g., market basket analysis, customer profiling, and others). Although association rules are quite widely used in practice, a thorough analysis of the related computational complexity is missing. This paper intends to provide a contribution in this setting. To this end, we first formally define quantitative association rule mining problems, which include Boolean association rules as a special case; we then analyze the computational complexity of such problems. The general problem as well as some interesting special cases are considered.

An Unsupervised Outlier Detection Approach for Cleaning String Data Entries

Sistemi Evoluti per Basi di Dati, 2008

Top-k closest pairs join query: an approximate algorithm for large high dimensional data

Proceedings of the International Database Engineering and Applications Symposium (IDEAS '04), 2004

In this work we present a novel approximate algorithm to calculate the top-k closest pairs join query of two large and high dimensional data sets. The algorithm has worst case time complexity O(d^2 n k) and space complexity O(nd), and guarantees a solution within an O(d^(1+1/t)) factor of the exact one, where t ∈ {1, 2, ..., ∞} denotes the Minkowski metric L_t of interest and d the dimensionality. It makes use of the concept of space filling curve to establish an order between the points of the space and performs at most d + 1 sorts and scans of the two data sets. During a scan, each point from one data set is compared with its closest points, according to the space filling curve order, in the other data set, and points whose contribution to the solution has already been analyzed are detected and eliminated. Experimental results on real and synthetic data sets show that our algorithm (i) behaves as an exact algorithm in low dimensional spaces; (ii) is able to prune the entire data set (or a considerable fraction of it) even in high dimensions, provided certain separation conditions are satisfied; (iii) in any case returns a solution within a small error of the exact one.
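A compact sketch of the general trick the abstract describes: sort both sets along a space-filling curve, then compare each point only with its neighbors in that order, keeping the k best cross-set pairs in a heap. This is a loose illustration under stated simplifications, not the paper's algorithm: it uses a Z-order (Morton) key rather than the curve used in the paper, performs a single sort instead of the d + 1 shifted sorts, omits the pruning step, and all names (`morton_key`, `approx_top_k_pairs`, `window`) are hypothetical.

```python
import heapq
import numpy as np

def morton_key(p, bits=10):
    # Interleave the bits of the quantized coordinates (Z-order),
    # a simple stand-in for a space-filling-curve position.
    q = np.clip((np.asarray(p) * (2 ** bits - 1)).astype(int), 0, 2 ** bits - 1)
    key = 0
    for b in range(bits):                    # most significant bit first
        for c in q:
            key = (key << 1) | ((int(c) >> (bits - 1 - b)) & 1)
    return key

def approx_top_k_pairs(A, B, k, window=8):
    # Tag every point with its curve position and set of origin,
    # then sort all points along the curve once.
    order = sorted([(morton_key(p), 0, i) for i, p in enumerate(A)] +
                   [(morton_key(p), 1, i) for i, p in enumerate(B)])
    best = []                                # max-heap of the k best pairs
    for x in range(len(order)):
        for y in range(x + 1, min(x + 1 + window, len(order))):
            (_, sx, ix), (_, sy, iy) = order[x], order[y]
            if sx == sy:
                continue                     # join pairs must cross A and B
            a, b = (ix, iy) if sx == 0 else (iy, ix)
            d = float(np.linalg.norm(A[a] - B[b]))
            if len(best) < k:
                heapq.heappush(best, (-d, a, b))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, a, b))
    return sorted((-nd, a, b) for nd, a, b in best)

rng = np.random.default_rng(1)
A, B = rng.random((500, 2)), rng.random((500, 2))
print(approx_top_k_pairs(A, B, k=3))         # three (distance, i, j) pairs
```

Widening `window`, or adding the shifted re-sorts the abstract mentions, trades extra scanning for accuracy.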

Optimal Subset Selection for Classification through SAT Encodings

IFIP – The International Federation for Information Processing, 2008

By Fabrizio Angiulli (DEIS, Università della Calabria, Via P. Bucci 41C, 87036 Rende (CS), Italy, f.angiulli@deis.unical.it) and Stefano Basta (ICAR-CNR, Via P. Bucci 41C, 87036 Rende (CS), Italy, basta@icar.cnr.it).

Detection of Discriminating Rules

On the tractability of minimal model computation for some CNF theories

Artificial Intelligence, 2014

Designing algorithms capable of efficiently constructing minimal models of CNFs is an important task in AI. This paper provides new results along this research line and presents new algorithms for performing minimal model finding and checking over positive propositional CNFs and model minimization over propositional CNFs. An algorithmic schema, called the Generalized Elimination Algorithm (GEA), is presented that computes a minimal model of any positive CNF. The schema generalizes the Elimination Algorithm (EA) [BP97], which computes a minimal model of positive head-cycle-free (HCF) CNF theories. While the EA always runs in polynomial time in the size of the input HCF CNF, the complexity of the GEA depends on the complexity of the specific eliminating operator invoked therein, which may in general turn out to be exponential. Therefore, a specific eliminating operator is defined by which the GEA computes, in polynomial time, a minimal model for a class of CNFs that strictly includes head-elementary-set-free (HEF) CNF theories [GLL06], which form, in their turn, a strict superset of HCF theories. Furthermore, in order to deal with the high complexity associated with recognizing HEF theories, an "incomplete" variant of the GEA (called IGEA) is proposed: the resulting schema, once instantiated with an appropriate elimination operator, always constructs a model of the input CNF, which is guaranteed to be minimal if the input theory is HEF. In the light of the above results, the main contribution of this work is the enlargement of the tractability frontier for the minimal model finding and checking and the model minimization problems.
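The GEA itself is not reproduced in this listing; as background, here is the simplest correct procedure in this family for positive CNFs. Since every superset of a model of a positive CNF is itself a model, greedily dropping one atom at a time from the all-atoms model must terminate in a minimal model. This is the folklore elimination idea, not the paper's GEA or its polynomial-time operator; `minimal_model` and the clause encoding (sets of atoms) are hypothetical choices.

```python
def minimal_model(clauses, atoms):
    # A model is the set of atoms assigned true; a positive, non-empty
    # clause is satisfied iff it contains at least one true atom.
    def satisfies(model):
        return all(clause & model for clause in clauses)

    model = set(atoms)            # all-true satisfies any positive CNF
    removable = True
    while removable:
        removable = False
        for a in sorted(model):
            if satisfies(model - {a}):
                model -= {a}      # supersets of models are models, so this
                removable = True  # greedy descent ends at a minimal model
                break
    return model

# {a or b, b or c}: {b} is a minimal model
clauses = [{"a", "b"}, {"b", "c"}]
print(minimal_model(clauses, {"a", "b", "c"}))  # {'b'}
```

The removal order determines which minimal model is found; the point of the paper's eliminating operators is to reach one in polynomial time for broader theory classes than this quadratic-check loop handles efficiently.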

A Greedy Search Approach to Co-clustering Sparse Binary Matrices

2006 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'06), 2006

By Fabrizio Angiulli, Eugenio Cesario, and Clara Pizzuti (ICAR-CNR, Via P. Bucci 41C, 87036 Rende (CS), Italy, {angiulli,cesario,pizzuti}@icar.cnr.it).

DESCRY: A Density Based Clustering Algorithm for Very Large Data Sets

Lecture Notes in Computer Science, 2004

A novel algorithm, named DESCRY, for clustering very large multidimensional data sets with numerical attributes is presented. DESCRY discovers clusters of different shape, size, and density, even when the data contains noise, by first finding and clustering a small set of points, called meta-points, that well depict the shape of the clusters present in the data set. Final clusters are obtained by assigning each point to one of the partial clusters. The computational complexity of DESCRY is linear both in the data set size and in the data set dimensionality. Experiments show very good qualitative results, comparable with those obtained by state-of-the-art clustering algorithms.
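The sampling-then-assignment scheme the abstract describes can be sketched generically: summarize the data by a small set of meta-points, cluster the meta-points, then let every point inherit the cluster of its nearest meta-point. This is not DESCRY itself (which derives meta-points by its own partitioning, not shown here); the substitution of k-means for the summarization step, single-linkage for the meta-point clustering, and the name `two_phase_clustering` are assumptions.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

def two_phase_clustering(X, n_meta=50, n_clusters=3, sample_size=1000, seed=0):
    # Phase 1: summarize a random sample by a small set of meta-points.
    rng = np.random.default_rng(seed)
    S = X[rng.choice(len(X), size=min(sample_size, len(X)), replace=False)]
    meta = KMeans(n_clusters=n_meta, n_init=4,
                  random_state=seed).fit(S).cluster_centers_
    # Phase 2: cluster the meta-points themselves; single linkage keeps
    # arbitrarily shaped clusters together.
    meta_labels = AgglomerativeClustering(n_clusters=n_clusters,
                                          linkage="single").fit_predict(meta)
    # Phase 3: every point inherits the cluster of its nearest meta-point.
    nearest = ((X[:, None, :] - meta[None, :, :]) ** 2).sum(-1).argmin(1)
    return meta_labels[nearest]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, size=(500, 2))
               for c in ([0, 0], [6, 0], [3, 6])])
print(np.bincount(two_phase_clustering(X)))  # roughly 500 points per cluster
```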

The GPR system: An architecture for integrating active and deductive rules on complex database objects

Theory and Practice of Object Systems, 1997

This paper illustrates a prototype system, called GPRS, supporting the Generalized Production Rules (GPR) database language. The GPR language integrates, in a unified framework, active rules, which allow the specification of event-driven computations on data, and deductive rules, which can be used to derive intensional relations in the style of logic programming. The prototype realizes the operational semantics of GPR using a unique rule-evaluation engine. The data model of reference is object based and the system is implemented on top of an object-oriented DBMS. Hence, the GPRS prototype represents a concrete proposal of an advanced DBMS for complex objects that provides both active and deductive styles of rule programming. © 1997 John Wiley & Sons

Approximate k-Closest-Pairs with Space Filling Curves

Lecture Notes in Computer Science, 2002

An approximate algorithm to efficiently solve the k-Closest-Pairs problem in high-dimensional spaces is presented. The method is based on dimensionality reduction of the space ℝ^d through the Hilbert space filling curve and performs at most d + 1 scans of the data set. After each scan, those points whose contribution to the solution has already been analyzed are eliminated from the data set. The pruning is lossless: in fact, the remaining points, along with the approximate solution found, can be used for the computation of the exact solution. Although we are able to guarantee an O(d^(1+1/t)) approximation to the solution, where t = 1, ..., ∞ denotes the L_t metric used, experimental results give the exact k-Closest-Pairs for all the data sets considered and show that the pruning of the search space is effective.

Fast Outlier Detection in High Dimensional Spaces

Lecture Notes in Computer Science, 2002

In this paper we propose a new definition of distance-based outlier that considers, for each point, the sum of the distances from its k nearest neighbors, called its weight. Outliers are those points having the largest values of weight. In order to compute these weights, we find the k nearest neighbors of each point in a fast and efficient way by …
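The weight definition in this abstract is easy to state in code. The brute-force sketch below illustrates the definition only; the paper's contribution is computing these weights fast, whereas this version is O(n^2 d). The function name `knn_weights` and the planted-outlier demo are hypothetical.

```python
import numpy as np

def knn_weights(X, k):
    # Pairwise Euclidean distances (brute force, O(n^2 d)).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(-1))
    np.fill_diagonal(D, np.inf)      # a point is not its own neighbor
    # Weight of a point = sum of distances to its k nearest neighbors.
    knn = np.sort(D, axis=1)[:, :k]
    return knn.sum(axis=1)

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(size=(200, 3)), [[8, 8, 8]]])  # one planted outlier
w = knn_weights(X, k=10)
print(np.argsort(w)[-1])  # 200: the planted point has the largest weight
```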
