Esposito Floriana | Università degli Studi di Bari
Papers by Esposito Floriana
Lecture Notes in Computer Science, 2011
Horn clause Logic is a powerful representation language, exploited in Logic Programming as a computer programming framework and in Inductive Logic Programming as a formalism for expressing examples and learned theories in domains where relations among objects must be expressed to fully capture the relevant information. While the predicates that make up the description language are defined by the knowledge engineer and handled only syntactically by the interpreters, they sometimes express information that can be properly exploited only with reference to a suitable background knowledge, in order to capture unexpressed and underlying relationships among the concepts described. This is typical when the representation includes numerical information, such as single values or intervals, for which simple syntactic matching is not sufficient. This work proposes an extension of an existing framework for similarity assessment between First-Order Logic Horn clauses that is able to handle numeric information in the descriptions. The viability of the solution is demonstrated on sample problems.
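As a toy illustration of why numeric intervals defeat purely syntactic matching, a similarity between two intervals can be derived from their overlap rather than from literal equality. The function below is only a hedged sketch of this idea (a Jaccard-style overlap ratio on the real line), not the measure actually defined in the paper:

```python
def interval_similarity(a, b):
    """Similarity of two closed numeric intervals, given as (low, high)
    pairs: length of the overlap divided by the length of the joint span.
    Returns 1.0 for identical point intervals, 0.0 for disjoint ones."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    span = max(a[1], b[1]) - min(a[0], b[0])
    return overlap / span if span > 0 else 1.0
```

Syntactically, (0, 10) and (5, 15) never match, yet they share half their range; an overlap-based measure scores them 1/3 instead of 0.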
Evaluation can be highly effective in improving research quality and productivity. To achieve the intended effects, research evaluation should follow established principles, be benchmarked against appropriate criteria, and be sensitive to disciplinary differences. This report confirms the findings of the 2008 report on Research Evaluation for Computer Science, while incorporating recent developments.
Cornell University - arXiv, Nov 15, 2013
Dealing with structured data requires expressive representation formalisms which, however, raise the problem of the computational complexity of the machine learning process. Furthermore, real-world domains require tools able to manage their typical uncertainty. Many statistical relational learning approaches try to deal with these problems by combining the construction of relevant relational features with a probabilistic tool. When the combination is static (static propositionalization), the constructed features are considered as boolean features and used offline as input to a statistical learner; when the combination is dynamic (dynamic propositionalization), feature construction and the probabilistic tool are combined into a single process. In this paper we propose a selective propositionalization method that searches for the optimal set of relational features to be used by a probabilistic learner in order to minimize a loss function. The new propositionalization approach has been combined with the random subspace ensemble method. Experiments on real-world datasets show the validity of the proposed method.
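The static side of this pipeline can be pictured in a few lines: relational examples are turned into boolean feature vectors, and a subset of features is selected greedily to minimize a loss. This is only a hedged sketch under simplifying assumptions — the majority-class-per-pattern scorer below is a crude stand-in for the paper's probabilistic learner, and all names are hypothetical:

```python
from collections import defaultdict

def propositionalize(examples, features):
    """Static propositionalization: apply each boolean feature test
    (a predicate over an example) to every example."""
    return [[int(f(e)) for f in features] for e in examples]

def zero_one_loss(table, labels, idx):
    """Training loss of a majority-class-per-pattern classifier restricted
    to the feature columns in idx (stand-in for a probabilistic learner)."""
    patterns = defaultdict(lambda: defaultdict(int))
    for row, y in zip(table, labels):
        patterns[tuple(row[i] for i in idx)][y] += 1
    errors = 0
    for row, y in zip(table, labels):
        counts = patterns[tuple(row[i] for i in idx)]
        errors += int(max(counts, key=counts.get) != y)
    return errors / len(labels)

def select_features(table, labels, max_feats=3):
    """Greedy forward selection of the feature subset minimizing the loss."""
    chosen, best = [], 1.0
    while len(chosen) < max_feats:
        candidates = [i for i in range(len(table[0])) if i not in chosen]
        if not candidates:
            break
        loss, pick = min((zero_one_loss(table, labels, chosen + [i]), i)
                         for i in candidates)
        if loss >= best:
            break
        chosen.append(pick)
        best = loss
    return chosen, best
```

A real selective propositionalization would score subsets with the probabilistic learner's own loss on held-out data; the greedy loop is the same shape.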
Fundamenta Informaticae, 2013
Standardized processes are important for correctly carrying out activities in an organization. Often the procedures they describe are already in operation, and the need is to understand and formalize them in a model that can support their analysis, replication and enforcement. Manually building these models is complex, costly and error-prone; hence the interest in learning them automatically from examples of actual procedures. Desirable options are incrementality in learning and adapting the models, and the ability to express triggers and conditions on the tasks that make up the workflow. This paper proposes a framework based on First-Order Logic that solves many shortcomings of previous approaches to this problem in the literature, making it possible to deal with complex domains in a powerful and flexible way. Indeed, First-Order Logic provides a single, comprehensive and expressive representation and manipulation environment supporting all of the above requirements. A purposely devised experimental evaluation confirms the effectiveness and efficiency of the proposed solution.
Lecture Notes in Computer Science, 2000
This work concerns a research project aimed at studying whether a machine learning system could reproduce the changes in the concept of force observed in children. The theoretical framework proposed considers learning as a process of formation and revision of a logical theory. INTHELEX, an incremental learning system, was used to emulate the transitions occurring in the human learning process.
Lecture Notes in Computer Science
We present a method based on clustering techniques to detect concept drift or novelty in a knowledge base expressed in Description Logics. The method exploits an effective and language-independent semi-distance measure defined for the space of individuals, based on a finite number of dimensions corresponding to a committee of discriminating features (represented by concept descriptions). In the algorithm, the possible clusterings are represented as strings of central elements (medoids, w.r.t. the given metric) of variable length. The number of clusters is not required as a parameter; the method is able to find an optimal choice by means of the evolutionary operators and of a fitness function. Experiments on some ontologies prove the feasibility of our method and its effectiveness in terms of clustering validity indices. Then, through a supervised learning phase, each cluster can be assigned a refined or newly constructed intensional definition expressed in the adopted language.
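The core ingredients — a semi-distance built from a committee of boolean features and a partition of individuals around medoids — can be sketched compactly. This is a hedged toy version (simple dict-based individuals, committee members as Python predicates, no evolutionary search), not the paper's actual algorithm:

```python
def committee_distance(a, b, committee):
    """Semi-distance between two individuals: the fraction of committee
    features (boolean concept tests) on which they disagree."""
    return sum(f(a) != f(b) for f in committee) / len(committee)

def assign_to_medoids(individuals, medoid_idx, committee):
    """Partition individuals around medoids (given as indices into the
    individuals list), each going to its nearest medoid under the
    committee-based semi-distance."""
    clusters = {m: [] for m in medoid_idx}
    for i, x in enumerate(individuals):
        best = min(medoid_idx,
                   key=lambda m: committee_distance(x, individuals[m], committee))
        clusters[best].append(i)
    return clusters
```

In the actual method the committee consists of discriminating concept descriptions, membership is checked by a reasoner, and the medoid strings are evolved by genetic operators rather than fixed in advance.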
Advances in Applied Artificial Intelligence, 2006
Scientific conference management involves many complex and multi-faceted activities, which make it highly desirable for the organizers to have a Web-based management system that makes some of them a little easier to carry out. One such activity is the assignment of submitted papers to suitable reviewers, involving the authors, the reviewers and the conference chair. Authors that submit a paper must usually fill in a form with the paper title, abstract and a set of conference topics that fit the submission's subject. Reviewers are required to register and declare their expertise on the conference topics (among other things). Finally, the conference chair has to carry out the review assignment taking into account the information provided by both the authors (about their papers) and the reviewers (about their competencies). All of these subtasks are currently carried out manually by the actors. While this can be merely boring for authors and reviewers, for the conference chair the task is also very complex and time-consuming. In this paper we propose the exploitation of intelligent techniques to automatically extract paper topics from titles and abstracts, and reviewer expertise from the titles of their publications available on the Internet. This knowledge is then exploited by an expert system able to automatically perform the assignments. The proposed methods were evaluated on a real conference dataset, obtaining good results when compared to manual assignments, both in terms of quality and user satisfaction and in terms of reduced execution time with respect to humans performing the same process.
Lecture Notes in Computer Science, 2010
We describe a method for learning functions that can predict the ranking of resources in knowledge bases expressed in Description Logics. The method relies on a kernelized version of the PERCEPTRON RANKING algorithm, which is suitable for both batch and online problem settings. The use of specific kernel functions that encode the similarity between individuals in the context of knowledge bases allows the application of the method to ontologies in the standard representations for the Semantic Web. An extensive experimentation reported in this paper proves the effectiveness of the method at the task of ranking the answers to queries expressed by class descriptions, when applied to real ontologies describing simple and complex domains.
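Kernelizing a perceptron means keeping the weight vector implicit as a kernel expansion over the training points on which mistakes occurred. The sketch below shows only that machinery for the simpler binary case, with a toy set-overlap kernel standing in for the individual-similarity kernels used in the paper; it is a hedged illustration, not the ranking algorithm itself:

```python
def kernel_perceptron(train, labels, kernel, epochs=10):
    """Dual (kernelized) perceptron: each mistake increments the support
    coefficient of the offending example; the decision function is a
    kernel expansion over the training points."""
    alpha = [0] * len(train)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(train, labels)):
            score = sum(a * yj * kernel(xj, x)
                        for a, xj, yj in zip(alpha, train, labels))
            if y * score <= 0:      # margin violation: store the mistake
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:           # converged on the training set
            break
    return alpha

def predict(x, train, labels, alpha, kernel):
    """Sign of the kernel expansion at a query point."""
    s = sum(a * y * kernel(xj, x)
            for a, xj, y in zip(alpha, train, labels))
    return 1 if s > 0 else -1
```

The ranking variant additionally learns ordered thresholds to map scores to ranks, but the dual representation — and hence the ability to plug in any kernel over individuals — is the same.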
Inductive Logic Programming, 2000
Induction of recursive theories in the normal ILP setting is a complex task because of the non-monotonicity of the consistency property. In this paper we propose computational solutions to some relevant issues raised by the multiple predicate learning problem. A separate-and-parallel-conquer search strategy is adopted to interleave the learning of clauses supplying predicates with mutually recursive definitions. A novel generality order to be imposed on the search space of clauses is investigated in order to cope with recursion in a more suitable way. The consistency recovery is performed by reformulating the current theory and by applying a layering technique based on the collapsed dependency graph. The proposed approach has been implemented in the ILP system ATRE and tested in the specific context of the document understanding problem within the WISDOM project. Experimental results are discussed and future directions are drawn.
Lecture Notes in Computer Science, 2014
In the Semantic Web context, procedures for deciding the class-membership of an individual with respect to a target concept in a knowledge base are generally based on automated reasoning. However, cases of incompleteness/inconsistency are frequent, owing to the distributed, heterogeneous nature and the Web-scale dimension of the knowledge bases. It has been shown that resorting to models induced from the data may offer comparably effective and efficient solutions for these cases, although skewness in the instance distribution may affect the quality of such models. This is known as the class-imbalance problem. We propose a machine learning approach, based on the induction of Terminological Random Forests — an extension of the notion of Random Forest — to cope with this problem in the case of knowledge bases expressed through the standard Web ontology languages. Experimentally, we show the feasibility of our approach and its effectiveness w.r.t. related methods, especially on imbalanced datasets.
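A common way to make a forest robust to skewed class distributions is to give each tree a class-balanced bootstrap sample instead of a plain one. The snippet below is a hedged sketch of that generic idea under simplifying assumptions; it is not claimed to be the specific mechanism of Terminological Random Forests:

```python
import random

def balanced_bootstrap(examples, labels, rng=None):
    """Draw a bootstrap sample with the same number of instances per class:
    each class is resampled with replacement down to the minority-class
    size, so that every tree sees a balanced training set."""
    rng = rng or random.Random()
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    n = min(len(xs) for xs in by_class.values())  # minority-class size
    sample = []
    for y, xs in by_class.items():
        sample.extend((rng.choice(xs), y) for _ in range(n))
    rng.shuffle(sample)
    return sample
```

Growing each tree on such a sample trades some majority-class information per tree for balanced decision boundaries; the ensemble vote recovers the lost coverage.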
Theories and Methods of Spatio-Temporal Reasoning in Geographic Space, 1992
Any ES research, and planner ES research in particular, shows that knowledge acquisition is a bottleneck in building ES prototypes. From this viewpoint, the possibility of automatically acquiring knowledge for ESs, at least with reference to special themes and problems, may ...
Communications in Computer and Information Science, 2010
Information retrieval effectiveness has become a crucial issue with the enormous growth of available digital documents and the spread of Digital Libraries. Search and retrieval are mostly carried out on the textual content of documents, and traditionally only at the lexical ...
Lecture Notes in Computer Science, 2005
Traditional Machine Learning approaches are based on single inference mechanisms. A step forward concerned the integration of multiple inference strategies within a first-order logic learning framework, taking advantage of the benefits that each approach can bring. Specifically, abduction is exploited to complete the incoming information in order to handle cases of missing knowledge, and abstraction is exploited to eliminate superfluous details that can affect the performance of a learning system. However, these methods require some background information to exploit the specific inference strategy, which must be provided by a domain expert. This work proposes algorithms to automatically discover such information, in order to make the learning task completely autonomous. The proposed methods have been tested on the system INTHELEX, and their effectiveness has been proven by experiments in a real-world domain.
Lecture Notes in Computer Science, 2009
Imitative learning can be considered an essential task in human development. People use instructions and demonstrations provided by other human experts to acquire knowledge. In order to make an agent capable of learning through demonstrations, we propose a relational framework for learning by imitation. Demonstrations and domain-specific knowledge are compactly represented by a logical language able to express complex relational processes. The agent interacts in a stochastic environment and incrementally receives demonstrations. It actively interacts with the human by deciding the next action to execute and requesting demonstrations from the expert based on the currently learned policy. The framework has been implemented and validated with experiments in simulated agent domains.
Artificial Intelligence Frontiers in Statistics, 1993
We present an automated ontology matching methodology, supported by various machine learning techniques, as implemented in the system MoTo. The methodology is two-tiered. In the first stage it uses a meta-learner to elicit certain mappings from those predicted by single matchers induced by a specific base-learner. Then, uncertain mappings are recovered through a validation process, followed by the aggregation of the individual predictions through linguistic quantifiers. Experiments on benchmark ontologies demonstrate the effectiveness of the methodology.
Lecture Notes in Computer Science, 2005
We tackle the problem of learning ontologies expressed in a rich representation like the ALC logic. This task can be cast as a supervised learning problem to be solved by means of operators for this representation which take into account the available metadata. The properties of such operators are discussed and their effectiveness is empirically tested in the experimentation reported in this paper.
Lecture Notes in Computer Science, 1999
IDL (Intelligent Digital Library) is a prototypical intelligent digital library service that is currently being developed at the University of Bari. Among the characterizing features of IDL there are a retrieval engine and several facilities available for the library users. In this paper, we present the learning component, named Learning Server, that has been exploited in IDL for document analysis, ...