Alessandro Adamou - Academia.edu
Papers by Alessandro Adamou
Journal on Computing and Cultural Heritage
Modelling the knowledge behind human experiences is a complex process: it should take into account, among others, the activities performed, human observations and the documentation of the evidence. To represent this knowledge in a declarative way means to support data interoperability in the context of cultural heritage artefacts, as linked datasets on experience documentation have started to appear. With this objective in mind, we describe a study based on an ontology design pattern for modelling experiences through observations, which are considered indirect evidence of a mental process (i.e., the experience). This pattern highlights the structural differences between types of experiential documentation, such as diaries and social media, providing a guideline for the comparability between different domains and for supporting the construction of heterogeneous datasets based on epistemic compatibility. We have performed not only a formal evaluation over the pattern but also an as...
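As a rough illustration of the observation-as-evidence idea, the sketch below builds a tiny RDF graph with rdflib in which an experience is never asserted directly, only evidenced by a documented observation. The class and property names (Experience, Observation, isEvidenceFor, documentedBy) are hypothetical stand-ins, not the published pattern's vocabulary.

```python
# A minimal sketch of the observation-based experience pattern, using rdflib.
# All vocabulary terms below are hypothetical illustrations.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/experience-pattern#")
g = Graph()
g.bind("ex", EX)

# An experience is a mental process; it is never recorded directly.
g.add((EX.exp1, RDF.type, EX.Experience))

# What gets recorded is an observation: a diary entry, a social media post...
g.add((EX.obs1, RDF.type, EX.Observation))
g.add((EX.obs1, EX.isEvidenceFor, EX.exp1))  # the indirect-evidence link
g.add((EX.obs1, EX.documentedBy, Literal("Diary entry, 12 March 1893")))

print(g.serialize(format="turtle"))
```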
Lecture Notes in Computer Science, 2022
Understanding the structure of identifiers in a particular dataset is critical for users/applications that want to use such a dataset, and connect to it. This is especially true in Linked Data where, while benefiting from having the structure of URIs, identifiers are also designed according to specific conventions, which are rarely made explicit and documented. In this paper, we present an automatic method to extract such URI patterns, based on adapting formal concept analysis techniques to the mining of string patterns. The result is a tool that can generate, in a few minutes, the documentation of the URI patterns employed in a SPARQL endpoint by the instances of each class in the corresponding datasets. We evaluate the approach by demonstrating its performance and efficiency on several endpoints of various origins.
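As a toy stand-in for the FCA-based miner (not the paper's algorithm), the sketch below fetches instance URIs of one class from a SPARQL endpoint and collapses them into a single pattern by masking the path segments that vary; the endpoint and class URIs are just examples.

```python
# Toy URI-pattern extraction: group instance URIs of a class and generalise
# the path segments that differ. A real miner would handle mixed URI shapes.
from SPARQLWrapper import SPARQLWrapper, JSON

def uri_pattern(endpoint, cls, limit=500):
    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(f"SELECT ?s WHERE {{ ?s a <{cls}> }} LIMIT {limit}")
    sparql.setReturnFormat(JSON)
    uris = [b["s"]["value"]
            for b in sparql.query().convert()["results"]["bindings"]]
    segments = [u.split("/") for u in uris]
    if len({len(s) for s in segments}) != 1:
        return None  # mixed shapes: a real miner would split into groups
    pattern = []
    for column in zip(*segments):
        pattern.append(column[0] if len(set(column)) == 1 else "{id}")
    return "/".join(pattern)

print(uri_pattern("http://dbpedia.org/sparql",
                  "http://dbpedia.org/ontology/City"))
# e.g. http://dbpedia.org/resource/{id}
```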
One of the existing query recommendation strategies for unknown datasets is “by example”, i.e. based on a query that the user already knows how to formulate on another dataset within a similar domain. In this paper we measure what contribution a structural analysis of the query and the datasets can bring to a recommendation strategy, to go alongside approaches that provide a semantic analysis. Here we concentrate on the case of star-shaped SPARQL queries over RDF datasets. The illustrated strategy performs a least general generalization on the given query, computes the specializations of it that are satisfiable by the target dataset, and organizes them into a graph. It then visits the graph to recommend first the reformulated queries that reflect the original query as closely as possible. This approach does not rely upon a semantic mapping between the two datasets. An implementation as part of the SQUIRE query recommendation library is discussed.
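A minimal sketch of the generalization step, under the simplifying assumption that a star-shaped query can be represented as (predicate, object) pairs around a single subject variable: constants are lifted to fresh variables, yielding the least general generalization that specializations would later re-bind against the target dataset.

```python
# Least general generalisation of a star-shaped query (simplified form).
def generalise(star_query):
    """star_query: list of (predicate, object) pairs; objects starting
    with '?' are already variables."""
    general, bindings = [], {}
    for i, (pred, obj) in enumerate(star_query):
        if obj.startswith("?"):
            general.append((pred, obj))
        else:
            var = f"?o{i}"
            general.append((pred, var))
            bindings[var] = obj  # remember what was abstracted away
    return general, bindings

q = [("rdf:type", "dbo:Film"), ("dbo:director", "dbr:Ridley_Scott")]
print(generalise(q))
# ([('rdf:type', '?o0'), ('dbo:director', '?o1')],
#  {'?o0': 'dbo:Film', '?o1': 'dbr:Ridley_Scott'})
```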
More and more learning activities take place online in a self-directed manner. Therefore, just as the idea of self-tracking activities for fitness purposes has gained momentum in the past few years, tools and methods for awareness and self-reflection on one’s own online learning behavior appear as an emerging need for both formal and informal learners. Addressing this need is one of the key objectives of the AFEL (Analytics for Everyday Learning) project. In this paper, we discuss the different aspects of what needs to be put in place in order to enable awareness and self-reflection in online learning. We start by describing a scenario that guides the work done. We then investigate the theoretical, technical and support aspects that are required to enable this scenario, as well as the current state of the research in each aspect within the AFEL project. We conclude with a discussion of the ongoing plans from the project to develop learner-facing tools that enable awareness and self-reflection.
Data integration problems are commonly viewed as interoperability issues, where the burden of reaching a common ground for exchanging data is distributed across the peers involved in the process. While apparently an effective approach towards standardization and interoperability, it poses a constraint to data providers who, for a variety of reasons, require backwards compatibility with proprietary or non-standard mechanisms. Publishing a holistic data API is one such use case, where a single peer performs most of the integration work in a many-to-one scenario. Incidentally, this is also the base setting of software compilers, whose operational model comprises phases that perform analysis, linkage and assembly of source code and generation of intermediate code. There are several analogies with a data integration process, more so with data that live in the Semantic Web, but what requirements would a data provider need to satisfy, for an integrator to be able to query and transform...
The International Journal of the Humanities: Annual Review, 2015
The Listening Experience Database (http://www.open.ac.uk/Arts/LED) is the first project to collate and interrogate a mass of historical personal experiences of listening to music. Such accounts have previously received only isolated attention because they are challenging to locate and gather en masse. An extensive body of data about the responses of “ordinary listeners” (as opposed to professional critics) thus offers new ways of approaching music-related research. The underlying information system relies on linked data, including a knowledge base that is itself a linked dataset. The data management workflow supports both systematic contributions from the project team and crowdsourced input where knowledgeability and completeness of information can be expected to vary widely. The database demonstrates the potential of a mass of data as a robust evidential base for our understanding of how music functions in society. It also contributes a large body of structured data to the global m...
Proceedings of the 13th International Conference on Semantic Systems, 2017
Virtual data integration takes place at query execution time and relies on transformations of the original query to many target endpoints, where the data reside. In systems that integrate many data sources, this means maintaining many mappings, queries and query templates, as well as possibly issuing separate queries for linking entities in the datasets and retrieving their data. We propose a practical approach to keeping such complexity under control, which manipulates the translation from one client query to many target queries. The method performs just-in-time recompilation of the client query into elements that are combined with a query template into the target queries for multiple sources. It was validated in a setting with a custom star-shaped query language as client API and SPARQL endpoints as sources. The approach has been shown to reduce the number of target queries to issue and of query templates to maintain, using a number of compiler functions that scales with the complexity of the data source, with an overhead that is negligible where the method is most effective.
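The sketch below illustrates the general shape of such a translation, under assumed representations: parameters compiled from a dict-shaped client query are spliced into a per-source SPARQL template. Both the template and the compiler function are placeholders, not the system's actual components.

```python
# Just-in-time translation sketch: client query -> template parameters
# -> target SPARQL. Names and shapes are illustrative only.
TEMPLATE = """SELECT ?entity {projection}
WHERE {{
  ?entity a <{cls}> .
  {patterns}
}}"""

def compile_client_query(client_query):
    """Derive template parameters from a simple dict-shaped client query."""
    patterns = "\n  ".join(
        f"?entity <{p}> {o} ." for p, o in client_query["constraints"])
    projection = " ".join(client_query["fields"])
    return {"cls": client_query["type"], "patterns": patterns,
            "projection": projection}

client = {"type": "http://schema.org/Event",
          "fields": ["?name"],
          "constraints": [("http://schema.org/name", "?name")]}
print(TEMPLATE.format(**compile_client_query(client)))
```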
Proceedings of The International Workshop on Semantic Big Data, 2020
In scenarios where many different, independent and dynamic data sources need to be brought together, mediated data integration at runtime is rapidly gaining interest. In a global-as-view approach, schema mappings express how to get data from each data source according to the global schema of the mediator. Key issues include the effort required to include and map new data sources, and the very need for the global schema to be expressed to data sources. It has been argued that the principles of Linked Data can be used to spread the cost of adding new sources in a pay-as-you-go model. We contribute by describing a data integration framework able to mitigate these issues, by relating data sources under a global schema which is implicit and only partly known at the time a new data source joins. Mappings over a data source only require partial knowledge of it and of the part of the global schema that it will affect. Pay-as-you-go can then be employed to guarantee eventual schema compliance. This approach was adopted in a large-scale data integration system for Smart Cities, where it allowed short time-to-publish for new data and iterative schema refinements.
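A minimal sketch of a mapping that needs only partial knowledge, with illustrative names: source fields with a known global-schema property are lifted to it, while unknown fields are parked under a source-local namespace, to be reconciled later in pay-as-you-go fashion.

```python
# Partial global-as-view mapping sketch. GLOBAL covers only the part of
# the global schema known when the source joins; everything else stays
# source-local until a later refinement maps it.
GLOBAL = {"temp": "http://example.org/schema#temperature",
          "ts":   "http://example.org/schema#observedAt"}
LOCAL = "http://example.org/source/weather-feed#"

def lift(record):
    """Map one source record to (property URI, value) pairs."""
    return [(GLOBAL.get(field, LOCAL + field), value)
            for field, value in record.items()]

print(lift({"temp": 18.5, "ts": "2020-06-01T10:00:00Z", "sensor": "WS-7"}))
# 'sensor' is not yet in the global schema, so it stays source-local.
```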
International Journal on Digital Libraries, 2018
Research has approached the practice of musical reception in a multitude of ways, such as the analysis of professional critique, sales figures and psychological processes activated by the act of listening. Studies in the Humanities, on the other hand, have been hindered by the lack of structured evidence of actual experiences of listening as reported by the listeners themselves, a concern that has been voiced since the early Web era. It was however assumed that such evidence existed, albeit in pure textual form, but could not be leveraged until it was digitised and aggregated. The Listening Experience Database (LED) responds to this research need by providing a centralised hub for evidence of listening in the literature. Not only does LED support search and reuse across nearly 10,000 records, but it also provides machine-readable structured data of the knowledge around the contexts of listening. To take advantage of the mass of formal knowledge that already exists on the Web concerning these contexts, the entire framework adopts Linked Data principles and technologies. This also allows LED to directly reuse open data from the British Library for the source documentation that is already published. Reused data are re-published as open data with enhancements obtained by expanding over the model of the original data, such as the partitioning of published books and collections into individual stand-alone documents. The database was populated through crowdsourcing and seamlessly incorporates data reuse from the very early data entry phases. As the sources of the evidence often contain vague, fragmentary or uncertain information, facilities were put in place to generate structured data out of such fuzziness. Alongside elaborating on these functionalities, this article provides insights into the most recent features of the latest instalment of the dataset and portal, such as the interlinking with the MusicBrainz database, the relaxation of geographical input constraints through text mining, and the plotting of key locations in an interactive geographical browser.
Lecture Notes in Computer Science, 2016
In this demo paper, a SPARQL Query Recommendation Tool (called SQUIRE) based on query reformulation is presented. Based on three steps, Generalization, Specialization and Evaluation, SQUIRE implements the logic of reformulating a SPARQL query that is satisfiable w.r.t. a source RDF dataset, into others that are satisfiable w.r.t. a target RDF dataset. In contrast with existing approaches, SQUIRE aims at recommending queries whose reformulations: i) reflect as much as possible the same intended meaning, structure, type of results and result size as the original query and ii) do not require a mapping between the two datasets. Based on a set of criteria to measure the similarity between the initial query and the recommended ones, SQUIRE demonstrates the feasibility of the underlying query reformulation process, appropriately ranks the recommended queries, and offers valuable support for query recommendations over an unknown and unmapped target RDF dataset, not only assisting the user in learning the data model and content of an RDF dataset, but also supporting its use without requiring the user to have intrinsic knowledge of the data.
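As a toy version of the ranking idea (the criterion below is illustrative, not SQUIRE's own set of measures), recommended reformulations can be scored by how much of the original query's structure they preserve, here via Jaccard overlap of triple patterns:

```python
# Rank candidate reformulations by structural similarity to the original
# query, with queries represented as sets of triple patterns.
def similarity(original, candidate):
    union = original | candidate
    return len(original & candidate) / len(union) if union else 0.0

def rank(original, candidates):
    return sorted(candidates, key=lambda c: similarity(original, c),
                  reverse=True)

q = frozenset({("?s", "a", "dbo:Film"), ("?s", "dbo:director", "?d")})
cands = [frozenset({("?s", "a", "dbo:Film")}),
         frozenset({("?s", "a", "dbo:Film"), ("?s", "dbo:director", "?d")})]
print(rank(q, cands)[0])  # the candidate closest to the original comes first
```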
2016 IEEE International Smart Cities Conference (ISC2), 2016
Semantic Web, 2016
The article reports on the evolution of data.open.ac.uk, the Linked Open Data platform of the Open University, from a research experiment to a data hub for the open content of the university. Entirely based on Semantic Web technologies (RDF and the Linked Data principles), data.open.ac.uk is used to curate, publish and access data about academic degree qualifications, courses, research papers and open educational resources of the university. It exposes a SPARQL endpoint and several other services to support developers, including queries stored server-side and entity lookup using known identifiers such as course codes and YouTube video IDs. The platform is now a key information service at the Open University, with several core systems and websites exploiting linked data through data.open.ac.uk. Example applications include connecting entities such as courses to media objects published in different places (YouTube, Audioboo, OpenLearn, etc.) and providing recommendations of resources based on application-specific queries. Through these applications, data.open.ac.uk is now fulfilling a key role in the overall data infrastructure of the university, and in establishing connections with other educational institutions and information providers.
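A minimal sketch of programmatic access via the SPARQL endpoint, using the SPARQLWrapper library; the endpoint path and the use of rdfs:label for course-code lookup are assumptions for illustration, as the platform's documented lookup services and vocabulary may differ.

```python
# Query a SPARQL endpoint for entities whose label mentions a course code.
# Endpoint path and lookup strategy are assumed, not taken from the paper.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://data.open.ac.uk/sparql")  # assumed path
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?entity ?label WHERE {
        ?entity rdfs:label ?label .
        FILTER(CONTAINS(STR(?label), "M269"))   # an example course code
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["entity"]["value"], "-", row["label"]["value"])
```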
The LinkedUp Catalogue of Web datasets for education is a meta-dataset dedicated to supporting people and applications in discovering, exploring and using Web data for the purpose of innovative educational services. It is also an evolving dataset, with most of its content being contributed by automatically extracting relevant information from external descriptions and from the included datasets themselves. In this paper, we describe the purpose and content of this dataset, as well as the way it is being created, published and maintained.