Claudio Sartori | Università di Bologna
Papers by Claudio Sartori
In the age of Big Data, scalable algorithm implementations and powerful computational resources are required. For data mining and data analytics, the support of big data platforms is becoming increasingly important, since they provide algorithm implementations with all the resources needed for their execution. However, choosing the best platform may depend on several constraints, including but not limited to computational resources, storage resources, target tasks, and service costs. Sometimes it may be necessary to switch from one platform to another depending on these constraints. As a consequence, it is desirable to reuse as much algorithm code as possible, so as to simplify the setup on new target platforms. Unfortunately, each big data platform has its own peculiarities, especially in how it handles parallelism. This affects algorithm implementations, which generally need to be modified before being executed. This work introduces functional parallel primitives to define the parallelizable parts of algorithms in a uniform, platform-independent way. The primitives are then transformed by a compiler into skeletons, which are finally deployed on vendor-dependent frameworks. The proposed procedure helps not only with code reuse but also with parallelization, because no parallel-programming expertise is demanded of the programmer: the compiler entirely manages and optimizes algorithm parallelization. The experiments performed show that the transformation process does not negatively affect algorithm performance.
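The idea of declaring a parallelizable step once and deploying it on different backends can be sketched in a few lines. Everything below (the names `ParMap` and `run_on`, the backend choices) is invented for illustration and is not the paper's actual primitive set or compiler:

```python
from multiprocessing.dummy import Pool  # thread-pool backend for the sketch

class ParMap:
    """Declares a parallelizable map step without committing to a backend."""
    def __init__(self, fn):
        self.fn = fn

def run_on(primitive, data, backend="serial"):
    """Stand-in for the compiler step: deploy the same primitive on a
    chosen backend; the algorithm code (the primitive) is unchanged."""
    if backend == "serial":
        return [primitive.fn(x) for x in data]
    if backend == "threads":
        with Pool(4) as pool:
            return pool.map(primitive.fn, data)
    raise ValueError(f"unknown backend: {backend}")

square = ParMap(lambda x: x * x)
# The same declaration runs unmodified on both backends.
assert run_on(square, [1, 2, 3]) == run_on(square, [1, 2, 3], backend="threads")
```

A real system would target cluster frameworks rather than a local thread pool, but the separation is the same: the primitive captures *what* is parallelizable, the deployment step decides *how*.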
Users of Web search engines generally express information needs with short and ambiguous queries, leading to irrelevant results. Personalized search methods improve users' experience by automatically reformulating queries before sending them to the search engine, or by rearranging the received results, according to users' specific interests. A user profile is often built from previous queries, clicked results, or the user's browsing history in general; different topics must be distinguished in order to obtain an accurate profile. Quite commonly, a set of user files, locally stored in sub-directories, is organized by the user into a coherent taxonomy corresponding to his or her topics of interest, but only a few methods leverage this potentially useful source of knowledge. We propose a novel method in which a user profile is built from those files, specifically exploiting their consistent arrangement in directories. A bag of keywords is extracted for each directory from the text documents within it. We can then infer the topic of each query and expand it by adding the corresponding keywords, in order to obtain a more targeted formulation. Experiments are carried out on benchmark data through a repeatable, systematic process, in order to evaluate objectively how much our method improves the relevance of query results when applied on top of a third-party search engine.
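The topic-inference-and-expansion step can be illustrated with a toy sketch. The directory names, keyword bags, and overlap heuristic below are all hypothetical; in the paper's setting the bags would be extracted from the text files inside each directory:

```python
# Hypothetical per-directory keyword bags standing in for an extracted profile.
profile = {
    "cycling": {"bike", "gear", "tire", "ride"},
    "cooking": {"recipe", "oven", "flour", "bake"},
}

def expand_query(query, profile, n_extra=2):
    """Infer the query's topic by keyword overlap, then append a few
    topic keywords to obtain a more targeted formulation."""
    terms = set(query.lower().split())
    topic = max(profile, key=lambda t: len(terms & profile[t]))
    if not terms & profile[topic]:
        return query  # no directory topic matched: leave the query as-is
    extra = sorted(profile[topic] - terms)[:n_extra]
    return query + " " + " ".join(extra)

print(expand_query("best tire pressure", profile))  # -> best tire pressure bike gear
```

The expanded query is then what gets sent to the third-party search engine; the original query is used unchanged when no topic matches.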
... Organization: Paolo Ciancarini, University of Bologna, Italy; Costas Courcoubetis, Athens University of Economics and Business, Greece; Yogesh Deshpande, University of Western Sydney, Australia; Asuman ... Khaled Nagi, Iman Elghandour, Birgitta König-Ries. Author Index ...
A recent direction of database research has focused on integrating logic programming and object orientation.
Electronics, Nov 23, 2022
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Sensors
Long document summarization poses obstacles to current generative transformer-based models because of the broad context to process and understand. Indeed, detecting long-range dependencies is still challenging for today's state-of-the-art solutions, which usually require model expansion at the cost of an unsustainable demand for computing and memory capacity. This paper introduces Emma, a novel efficient memory-enhanced transformer-based architecture. By segmenting a lengthy input into multiple text fragments, our model stores and compares the current chunk with previous ones, gaining the capability to read and comprehend the entire context of the document with a fixed amount of GPU memory. This method enables the model to deal with theoretically infinitely long documents, using less than 18 GB and 13 GB of memory for training and inference, respectively. We conducted extensive performance analyses and demonstrate that Emma achieves competitive results on two datasets of different...
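The key property (memory bounded regardless of document length) can be shown with a schematic toy. The function name and the "recurring token" heuristic are inventions for this sketch; the real Emma model stores and compares learned transformer representations, not word lists:

```python
def read_in_chunks(words, chunk_size=128, memory_size=32):
    """Process an arbitrarily long token list chunk by chunk, carrying
    a bounded memory across chunks instead of the full context."""
    memory = []
    for i in range(0, len(words), chunk_size):
        chunk = words[i:i + chunk_size]
        # Compare the current chunk against the memory, then refresh the
        # memory, truncating so it never grows past the fixed bound.
        recurring = [w for w in chunk if w in memory]
        memory = (recurring + chunk)[:memory_size]
    return memory

doc = ["tok%d" % (i % 200) for i in range(10_000)]
assert len(read_in_chunks(doc)) <= 32  # memory is fixed, whatever the length
```

The point of the sketch is purely structural: per-chunk processing plus a truncated carry-over is what decouples memory use from document length.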
The purpose of semantic query optimization is to use semantic knowledge (e.g. integrity constraints) to transform a query into an equivalent one that may be answered more efficiently than the original version. This paper proposes a general method for semantic query optimization in the framework of OODBs (Object Oriented Database Systems). The method is applicable to the class of conjunctive queries and is based on two ingredients: a description logic able to express both class descriptions and integrity constraint rules (IC rules) as types, and subsumption computation between types to evaluate the logical implications expressed by the IC rules.
Proceedings of the 11th International Conference on Enterprise Information, 2009
Traditional techniques for query formulation require knowledge of the database contents, i.e. which data are stored in the data source and how they are represented. In this paper, we discuss the development of a keyword-based search engine for structured data sources. The idea is to couple the ease of use and flexibility of keyword-based search with metadata extracted from data schemata and with extensional knowledge, which together constitute a semantic network of knowledge. By translating keywords into SQL statements, we develop a search engine that is effective, semantic-based, and applicable also when instances are not continuously available, such as in integrated data sources or in data sources extracted from the deep web.
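A minimal sketch of the keyword-to-SQL idea follows. The schema metadata and the matching rule (a keyword either names a column or is treated as a value to filter on) are invented here for illustration and are not the paper's actual algorithm:

```python
# Hypothetical schema metadata: table name -> searchable columns.
schema = {
    "person": ["name", "city", "employer"],
    "company": ["name", "sector"],
}

def keywords_to_sql(keywords, schema):
    """Map keywords to the first table whose columns mention one of them,
    then pair the remaining keywords with the matched columns as filters."""
    for table, columns in schema.items():
        matched = [c for c in columns if c in keywords]
        values = [k for k in keywords if k not in columns]
        if matched:
            where = " AND ".join(f"{c} LIKE '%{v}%'"
                                 for c, v in zip(matched, values))
            cols = ", ".join(matched)
            sql = f"SELECT {cols} FROM {table}"
            return sql + (f" WHERE {where}" if where else sql and "") or sql
    return None  # no keyword matched any schema element

print(keywords_to_sql(["city", "Bologna"], schema))
# -> SELECT city FROM person WHERE city LIKE '%Bologna%'
```

A production system would also rank candidate interpretations and handle joins across tables; the sketch only shows the basic schema-metadata lookup that replaces the user's need to know the database structure.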
Elsevier Science B.V. (North-Holland)
In P2P computing, peers and services forego central coordination and dynamically organise themselves to support knowledge sharing and collaboration, in both cooperative and non-cooperative environments. The success of P2P systems strongly depends on a number of factors. First, the ability to ensure an equitable distribution of content and services: economic and business models which rely on incentive mechanisms to supply contributions to the system are being developed, along with methods for controlling the "free riding" issue. Second, the ability to enforce the provision of trusted services: reputation-based P2P trust management models are becoming a focus of the research community as a viable solution. Trust models must balance both the constraints imposed by the environment (e.g. scalability) and the unique properties of trust as a social and psychological phenomenon. Recently, we are also witnessing a move of the P2P paradigm to embrace mobile computing, in an attempt to achieve even higher ubiquity. The possibility of services related to physical location, and the relation with agents in physical proximity, could introduce new opportunities but also new technical challenges. Although researchers working on distributed computing, MultiAgent Systems, databases, and networks have been using similar concepts for a long time, it is only fairly recently that papers motivated by the current P2P paradigm have started appearing in high-quality conferences and workshops. Research in agent systems in particular appears to be most relevant because, since their inception, MultiAgent Systems have always been thought of as collections of peers. This workshop brings together researchers working on agent systems and P2P computing with the intention of strengthening this connection. http://p2p.ingce.unibo.it
Lecture Notes in Computer Science, 1997
... Suppose that a client wants to retrieve information about 'Joe Chung'; the query expressed in MSL is the following: (Q1) JC :- JC :<cs_person {<name 'Joe Chung'>}>. The object pattern in the tail of the query Q1 is matched against the structure of the objects held in MED. ...
This paper describes the different representations of extensional knowledge in databases and in knowledge bases.
Proceedings: P. Mello (ed.). Bologna
This work analyzes the possibility of performing Semantic Query Optimization using the subsumption relation. The work includes a formalization of complex-object data models, enriched with the notion of subsumption, which identifies all the specialization relationships between object classes on the basis of their descriptions.