SPARQL Research Papers - Academia.edu

Community Health Workers (CHWs) act as liaisons between health-care providers and patients in underserved or unserved areas. However, the lack of information sharing and training support impedes the effectiveness of CHWs and their ability to correctly diagnose patients. In this paper, we propose and describe a system for mobile and wearable computing devices called Rafiki (a Swahili word meaning "friend") which assists CHWs in decision making and facilitates collaboration among them. Rafiki can infer possible diseases and treatments by representing the diseases, their symptoms, and patient context in OWL ontologies and by reasoning over this model. The use of a semantic representation of data makes it easier to share knowledge related to disease, symptom, diagnosis guidelines, and patient demography between the various personnel involved in health-care (e.g., CHWs, patients, health-care providers). We describe the Rafiki system with the help of a motivating community health-care scenario and present an Android prototype for smartphones and Google Glass.
Keywords: collaboration in health-care, mobile health, medical diagnosis, Semantic Web, reasoning, community health-care.
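As a rough illustration of the kind of ontology-based inference described above, the SPARQL sketch below selects diseases all of whose recorded symptoms appear in a patient's profile. The vocabulary (ex:Disease, ex:hasSymptom, ex:exhibitsSymptom, ex:suggestedTreatment) is assumed for illustration and is not Rafiki's actual schema.

    # Hypothetical sketch: candidate diseases whose recorded symptoms
    # are all present in patient ex:patient1's profile.
    PREFIX ex: <http://example.org/health#>
    SELECT ?disease ?treatment
    WHERE {
      ?disease a ex:Disease ;
               ex:suggestedTreatment ?treatment .
      # keep ?disease only if none of its symptoms is missing for the patient
      FILTER NOT EXISTS {
        ?disease ex:hasSymptom ?s .
        FILTER NOT EXISTS { ex:patient1 ex:exhibitsSymptom ?s . }
      }
    }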

We present a query formulation language (called MashQL) in order to easily query and fuse structured data on the web. The main novelty of MashQL is that it allows people with limited IT skills to explore and query one (or multiple) data sources without prior knowledge about the schema, structure, vocabulary, or any technical details of these sources. More importantly, to be robust and cover most cases in practice, we do not assume that a data source has an offline or inline schema.
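Such schema-free exploration is possible because the vocabulary of an RDF source can be discovered from the data itself. A minimal sketch of the kind of query a MashQL-style interface could issue behind the scenes (illustrative, not MashQL's actual implementation):

    # Discover the properties actually used in a source, with no schema:
    SELECT DISTINCT ?property
    WHERE { ?subject ?property ?object . }
    LIMIT 100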

In this paper, we propose the Normalized Freebase Distance (NFD), a new measure for determining semantic concept relatedness that is based on similar principles as the Normalized Web Distance (NWD). We illustrate that the NFD is more effective when comparing ambiguous concepts.
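For reference, the NWD on which the NFD builds is commonly defined as follows, where f(x) is the number of pages (or, in the Freebase setting, entities) mentioning x, f(x, y) the number mentioning both, and N the total collection size; the NFD presumably instantiates these counts over Freebase, so treat this as background rather than the paper's exact definition:

    \mathrm{NWD}(x,y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x,y)}{\log N - \min\{\log f(x), \log f(y)\}}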

The need for smart e-learning environments is resulting in new challenges for researchers and practitioners to develop intelligent systems that can be used to automate Higher Education (HE) activities in an intelligent way. Some common examples of such activities are “analyzing, finding, and ranking the right resource to teach a course,” “analyzing and finding the people with common research interests to start joint research projects,” and “using data analytics and machine reasoning techniques for conducting exams with different levels of complexity.” Ontological reasoning and smart data analytics can play an important role in analyzing and automating these HE activities and processes. In this paper, we present a framework named the Higher Education Activities and Processes Automation Framework (HEAPAF). The HEAPAF framework can be used to identify, extract, process, and produce semantically enriched data in a machine-understandable format from different educational resources. We also present the Higher Education Ontology (HEO) that we designed and developed to accommodate the HE data and then to perform analysis and reasoning on it. As a proof of concept, we present a case study on the topic “analyzing, finding, and ranking the right resources to teach a course,” which can dramatically improve the learning patterns of students in the growing smart educational environment. Finally, we provide an evaluation of our framework as evidence of its competency and consistency in improving academic analytics for educational activities and processes by using machine reasoning.
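A minimal sketch of the case-study query, assuming a hypothetical HEO-like vocabulary (heo:hasTopic and heo:covers are illustrative names, not necessarily the actual HEO terms): rank candidate teaching resources by how many of a course's topics they cover.

    PREFIX heo: <http://example.org/heo#>
    SELECT ?resource (COUNT(DISTINCT ?topic) AS ?coveredTopics)
    WHERE {
      heo:CourseX heo:hasTopic ?topic .
      ?resource a heo:TeachingResource ;
                heo:covers ?topic .
    }
    GROUP BY ?resource
    ORDER BY DESC(?coveredTopics)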

Today's ICT systems necessarily involve multiple system domains complying with different management models. Integration of domain-specific management models is required to achieve, for example, agile fault localization in a huge ICT system, but this has not been efficiently achieved due to the difficulty of bridging multiple management models. To address this issue, we propose a systems management architecture that reuses an existing standardized system management model for each domain and integrates them by introducing meta-data modeling. To show the feasibility of our proposal, we demonstrate that managed data of IT and network resources modeled by the Common Information Model (CIM) and the Shared Information/Data (SID) model are bridged by meta-data modeling based on the Resource Description Framework (RDF). The architectural design implies reduced effort thanks to the use of well-deployed and standardized models and a meta-data modeling language with its easy-to-use query language.
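To make the bridging idea concrete, here is a hedged sketch of a cross-domain query once both models are lifted into RDF; all vocabulary names are assumed for illustration, and owl:sameAs stands in for whatever bridge relation the meta-data layer defines.

    PREFIX cim: <http://example.org/cim#>
    PREFIX sid: <http://example.org/sid#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    # Fault localization across domains: from a SID-modeled service to
    # the operational status of the CIM-modeled system realizing it.
    SELECT ?service ?system ?status
    WHERE {
      ?service a sid:Service ;
               sid:realizedBy ?resource .
      ?resource owl:sameAs ?system .
      ?system a cim:ComputerSystem ;
              cim:operationalStatus ?status .
    }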

In this study, a novel metacrawling method is proposed for discovering and monitoring linked data sources on the Web. We implemented the method in a prototype system, named SPARQL Endpoints Discovery (SpEnD). SpEnD starts with a "search keyword" discovery process for finding relevant keywords for the linked data domain and specifically SPARQL endpoints. Then, these search keywords are utilized to find linked data sources via popular search engines (Google, Bing, Yahoo, Yandex). By using this method, most of the currently listed SPARQL endpoints in existing endpoint repositories, as well as a significant number of new SPARQL endpoints, have been discovered. Finally, we have developed a new SPARQL endpoint crawler (SpEC) for crawling and link analysis.
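A common first step when validating a candidate endpoint URL is to send a trivial query and check for a well-formed SPARQL response; the probe below is illustrative, since the paper does not specify SpEnD's exact validation query.

    # Cheap liveness probe: does the URL answer SPARQL at all?
    ASK { ?s ?p ?o }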

This paper is a survey of the research topics in the field of the Semantic Web, Linked Data, and the Web of Data. This study looks at the contributions of this research community over its first twenty years of existence. Compiling several bibliographical sources and bibliometric indicators, we identify the main research trends and we reference some of their major publications to provide an overview of that initial period. We conclude with some perspectives on future research challenges.

We have created the Knowledgebase of Standard Biological Parts (SBPkb) as a publicly accessible Semantic Web resource for synthetic biology (sbolstandard.org). The SBPkb allows researchers to query and retrieve standard biological parts for research and use in synthetic biology. Its initial version includes all of the information about parts stored in the Registry of Standard Biological Parts (partsregistry.org). SBPkb transforms this information so that it is computable, using our semantic framework for synthetic biology parts. This framework, known as SBOL-semantic, was built as part of the Synthetic Biology Open Language (SBOL), a project of the Synthetic Biology Data Exchange Group. SBOL-semantic represents commonly used synthetic biology entities, and its purpose is to improve the distribution and exchange of descriptions of biological parts. In this paper, we describe the data, our methods for transformation to SBPkb, and finally, we demonstrate the value of our knowledgebase with a set of sample queries. We use RDF technology and SPARQL queries to retrieve candidate “promoter” parts that are known to be both negatively and positively regulated. This method provides new web-based data access, enabling searches for parts that were not previously possible.
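A hedged sketch of the promoter query described above; the class and property names are placeholders in the spirit of SBOL-semantic, not its actual terms.

    PREFIX sbol: <http://example.org/sbol#>
    # Promoter parts recorded with both kinds of regulation:
    SELECT ?part
    WHERE {
      ?part a sbol:Promoter ;
            sbol:regulation sbol:PositiveRegulation ;
            sbol:regulation sbol:NegativeRegulation .
    }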

Thesis for the master's degree « Technologies numériques appliquées à l'histoire », 2015. Acknowledgements: I would first like to thank Anne-Marie Turcan-Verkerk and Matthieu Bonicel for granting me this internship and giving me real autonomy in carrying out my work. I also thank all the members of the Pool Biblissima, in particular Pauline Charbonnier, Stefanie Gehrke, Elizabeth MacDonald and Régis Robineau, for their enthusiastic welcome, their precious advice, their trust, their patience and their kindness. I thank Anne-Marie Turcan-Verkerk once again for sharing with me her knowledge of Florus, Mannon and their manuscripts, as well as Pierre Chambert-Protat for lending me his expertise on the manuscripts of Florus. My thanks also go to all the teachers of the École des chartes and to Jean-Baptiste Camps, director of studies of the master's programme « Technologies numériques appliquées à l'histoire ». The skills acquired during the master's programme allowed me to approach this internship with confidence and to carry my work through to completion.

Data exploration and visualization systems are of great importance in the Big Data era. Exploring and visualizing very large datasets has become a major research challenge, of which scalability is a vital requirement. In this survey, we describe the major prerequisites and challenges that should be addressed by modern exploration and visualization systems. Considering these challenges, we present how state-of-the-art approaches from the Database and Information Visualization communities attempt to handle them. Finally, we survey the systems developed by the Semantic Web community in the context of the Web of Linked Data, and discuss to what extent they satisfy the contemporary requirements.

RDF is a knowledge representation language dedicated to the annotation of resources within the framework of the semantic web. Among the query languages for RDF, SPARQL allows querying RDF through graph patterns, i.e., RDF graphs involving variables. Other languages, inspired by the work in databases, use regular expressions for searching paths in RDF graphs. Each approach can express queries that are out of reach of the other one. Hence, we aim at combining these two approaches. For that purpose, we define a language, called PRDF (for "Path RDF") which extends RDF such that the arcs of a graph can be labeled by regular expression patterns. We provide PRDF with a semantics extending that of RDF, and propose a correct and complete algorithm which, by computing a particular graph homomorphism, decides the consequence between an RDF graph and a PRDF graph. We then define the PSPARQL query language, extending SPARQL with PRDF graph patterns and complying with RDF model theoretic semantics. PRDF thus offers both graph patterns and path expressions. We show that this extension does not increase the computational complexity of SPARQL and, based on the proposed algorithm, we have implemented a correct and complete PSPARQL query engine.
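To see what combining graph patterns with path expressions buys, consider reachability over a transport graph. The sketch below uses SPARQL 1.1 property-path syntax, a later standardization of similar ideas, rather than PSPARQL's own notation; the vocabulary is invented for the example.

    PREFIX ex: <http://example.org/transport#>
    # Cities reachable from Paris by one or more train connections:
    SELECT ?city
    WHERE { ex:Paris ex:trainTo+ ?city . }

A plain SPARQL 1.0 graph pattern cannot express this unbounded reachability, while a pure path language cannot express arbitrary joins, which is exactly the gap PSPARQL closes.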

Pervasive computing and Internet of Things (IoT) paradigms have created a huge potential for new business. To fully realize this potential, there is a need for a common way to abstract the heterogeneity of devices so that their functionality can be represented as a virtual computing platform. To this end, we present a novel semantic-level interoperability architecture for pervasive computing and the IoT. There are two main principles in the proposed architecture. First, information and capabilities of devices are represented with semantic web knowledge representation technologies, and interaction with devices and the physical world is achieved by accessing and modifying their virtual representations. Second, the global IoT is divided into numerous local smart spaces managed by a semantic information broker (SIB) that provides a means to monitor and update the virtual representation of the physical world. An integral part of the architecture is a resolution infrastructure that provides a means to resolve the network address of a SIB either by using a physical object identifier as a pointer to information or by searching for SIBs matching a specification represented with SPARQL. We present several reference implementations and applications that we have developed to evaluate the architecture in practice. The evaluation also includes performance studies that, together with the applications, demonstrate the suitability of the architecture to real-life IoT scenarios. In addition, to validate that the proposed architecture conforms to the common IoT-A architecture reference model (ARM), we map the central components of the architecture to the IoT-A ARM.
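The SPARQL-based SIB search could look like the following hedged sketch; the smart-space vocabulary is assumed for illustration, as the abstract does not fix these names.

    PREFIX ss: <http://example.org/smartspace#>
    # Find smart spaces whose SIB advertises a temperature sensor
    # in a particular room:
    SELECT ?space
    WHERE {
      ?space a ss:SmartSpace ;
             ss:hosts ?device .
      ?device a ss:TemperatureSensor ;
              ss:locatedIn ss:Room101 .
    }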

We describe a generic framework for representing and reasoning with annotated Semantic Web data, a task becoming more important with the recent increase in the amount of inconsistent and non-reliable metadata on the web. We formalise the annotated language, the corresponding deductive system, and address the query answering problem. Previous contributions on specific RDF annotation domains are encompassed by our unified reasoning formalism, as we show by instantiating it on (i) temporal, (ii) fuzzy, and (iii) provenance annotations. Moreover, we provide a generic method for combining multiple annotation domains, allowing one to represent, e.g., temporally-annotated fuzzy RDF. Furthermore, we address the development of a query language, AnQL, that is inspired by SPARQL, including several features of SPARQL 1.1 (subqueries, aggregates, assignment, solution modifiers), along with the formal definitions of their semantics.
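As a rough illustration of what an annotated triple looks like (the concrete syntax varies across the annotated-RDF literature, so this notation is assumed, not AnQL's own):

    # A triple annotated with a temporal interval:
    (:Bob, :worksFor, :Acme) : [2008, 2013]
    # The same idea with a fuzzy degree instead of an interval:
    (:tomato, :isA, :Vegetable) : 0.7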

The Web of Data is an open environment consisting of a great number of large inter-linked RDF datasets from various domains. In this environment, organizations and companies adopt the Linked Data practices utilizing Semantic Web (SW) technologies, in order to publish their data and offer SPARQL endpoints (i.e., SPARQL-based search services). On the other hand, the dominant standard for information exchange in the Web today is XML. Additionally, many international standards (e.g., Dublin Core, MPEG-7, METS, TEI, IEEE LOM) in several domains (e.g., Digital Libraries, GIS, Multimedia, e-Learning) have been expressed in XML Schema. The aforementioned have led to an increasing emphasis on XML data, accessed using the XQuery query language. The SW and XML worlds and their developed infrastructures are based on different data models, semantics and query languages. Thus, it is crucial to develop interoperability mechanisms that allow the Web of Data users to access XML datasets, using SPARQL, from their own working environments. It is unrealistic to expect that all the existing legacy data (e.g., Relational, XML, etc.) will be transformed into SW data. Therefore, publishing legacy data as Linked Data and providing SPARQL endpoints over them has become a major research challenge. In this direction, we introduce the SPARQL2XQuery Framework which creates an interoperable environment, where SPARQL queries are automatically translated to XQuery queries, in order to access XML data across the Web. The SPARQL2XQuery Framework provides a mapping model for the expression of OWL–RDF/S to XML Schema mappings as well as a method for SPARQL to XQuery translation. To this end, our Framework supports both manual and automatic mapping specification between ontologies and XML Schemas. In the automatic mapping specification scenario, the SPARQL2XQuery exploits the XS2OWL component which transforms XML Schemas into OWL ontologies. Finally, extensive experiments have been conducted in order to evaluate the schema transformation, mapping generation, query translation and query evaluation efficiency, using both real and synthetic datasets.
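To ground the idea of SPARQL-to-XQuery translation, here is a hedged, hand-written example of the correspondence the framework automates; the schema, the mapping, and the exact XQuery the framework would emit are all assumed for illustration.

    PREFIX ex: <http://example.org/books#>
    # SPARQL over the ontology view of the data:
    SELECT ?title
    WHERE { ?book a ex:Book ; ex:title ?title . }

    # A hand-written XQuery counterpart over the mapped XML document
    # (shown as comments to keep one code language per block):
    #   for $b in doc("books.xml")//book
    #   return $b/title/text()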

RDF is a knowledge representation language dedicated to the annotation of resources within the Semantic Web. Though RDF itself can be used as a query language for an RDF knowledge base (using RDF consequence), the need for added expressivity in queries has led to the definition of the SPARQL query language. SPARQL queries are defined on top of graph patterns

Processing the excessive volumes of information on the Web is an important issue. The Semantic Web paradigm has been proposed as a solution. However, this approach generates several challenges, such as query processing and optimisation. This paper proposes a novel approach for optimising SPARQL queries with different graph shapes. This new method reorders the triple patterns using Ant Colony Optimisation (ACO) algorithms. Reordering the triple patterns is a way of decreasing the execution times of SPARQL queries. The proposed approach is focused on in-memory models of RDF data, and it optimises SPARQL queries by means of the Ant System, Elitist Ant System and MAX-MIN Ant System algorithms. The approach is implemented in the Apache Jena ARQ query engine, which is used for the experimentation, and the new method is compared with normal execution, the Jena reordering algorithms, and the algorithms of Stocker et al. All of the experiments are performed using the LUBM dataset for various shapes of queries, such as chain, star, cyclic, and chain-star. The first contribution is the real-time optimisation of SPARQL query triple pattern orders using ACO algorithms, and the second contribution is the concrete implementation for the ARQ query engine, which is a component of the widely used Semantic Web framework Apache Jena. The experiments demonstrate that the proposed method reduces the execution time of the queries significantly.
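The query "shapes" mentioned above are defined by how triple patterns share variables; reordering matters because the pattern evaluated first determines the size of the intermediate results that later joins must process. Illustrative sketches (the : prefix is a placeholder):

    PREFIX : <http://example.org/#>
    # Chain: each pattern's object is the next pattern's subject.
    SELECT * WHERE { ?x :p ?y . ?y :q ?z . ?z :r ?w . }
    # Star (same idea, as a comment): all patterns share one subject.
    #   SELECT * WHERE { ?x :p ?a . ?x :q ?b . ?x :r ?c . }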

The quantity of data published on the Web according to the principles of Linked Data is increasing rapidly. However, this data is still largely limited to consumption by domain professionals and users who understand Linked Data technologies. Therefore, it is essential to develop tools that enhance intuitive perceptions of Linked Data for lay users. The features of Linked Data pose various challenges for an easy-to-use data presentation. In this paper, Semantic Web and Linked Data technologies are overviewed, the challenges to the presentation of Linked Data are stated, and LOD Explorer is presented, with the aim of delivering a simple application to discover triplestore resources. Furthermore, LOD Explorer hides the technical challenges behind Linked Data and provides both specialist and non-specialist users with an interactive and effective way to explore RDF resources.

Efficient storage and querying of RDF data is of increasing importance, due to the increased popularity and widespread acceptance of RDF on the web and in the enterprise. In this paper, we describe a novel storage and query mechanism for RDF which works on top of existing relational representations. Reliance on relational representations of RDF means that one can take advantage of 35+ years of research on efficient storage and querying, industrial-strength transaction support, locking, security, etc. However, there are significant challenges in storing RDF in relational systems, which include data sparsity and schema variability. We describe novel mechanisms to shred RDF into relational tables, and novel query translation techniques to maximize the advantages of this shredded representation. We show that these mechanisms result in consistently good performance across multiple RDF benchmarks, even when compared with current state-of-the-art stores. This work provides the basis for RDF support in DB2 v.10.1.

The huge amounts of data produced by high-throughput techniques in the life sciences, and the need to integrate heterogeneous data from disparate sources in new fields such as Systems Biology and translational drug development, require better approaches to data integration. The semantic web is anticipated to provide solutions through new formats for knowledge representation and management. Software libraries for semantic web formats are becoming mature, but there exist multiple tools based on foundationally different technologies. SWI-Prolog, a tool with semantic web support, was integrated into the Bioclipse bio- and cheminformatics workbench software and evaluated in terms of performance against non-Prolog-based semantic web tools in Bioclipse, Jena and Pellet, for querying a data set consisting mostly of numerical NMR shift values in the semantic web format RDF. The integration has given access to the convenience of the Prolog language for working with semantic data and defining data management workflows in Bioclipse. The performance comparison shows that SWI-Prolog is superior in terms of performance over Jena and Pellet for this specific dataset, and suggests Prolog-based tools as interesting candidates for further evaluation.

SPARQL is today the standard access language for Semantic Web data. In recent years, XML databases have also acquired industrial importance due to the widespread applicability of XML on the Web. In this paper we present a framework that bridges the heterogeneity gap and creates an interoperable environment where SPARQL queries are used to access XML databases. Our approach assumes that fairly generic mappings between ontology constructs and XML Schema constructs have been automatically derived or manually specified. The mappings are used to automatically translate SPARQL queries to semantically equivalent XQuery queries, which are used to access the XML databases. We present the algorithms and the implementation of the SPARQL2XQuery framework, which is used for answering SPARQL queries over XML databases.

This project is intended to ease the writing process of dynamic SPARQL queries for applications. Its goal is to build an autocomplete form that can be reused in different applications and stays up to date with the latest ontologies, thus bringing Linked Open Data closer to application developers in general. This is done by having a server with an API that returns a JSONP response for each SPARQL query sent by the user application, i.e. the autocomplete form. The autocomplete feature is implemented with AngularJS and helps the user with writing the SPARQL keywords, the ontology classes and properties.

This paper presents an architecture for the development of web applications for exploring semantic knowledge graphs through parameterized interactive visualizations. The web interface and the interactive parameterized visualizations, in the form of a computational book, provide a way in which knowledge graphs can be explored. An important property of this approach is that we can substitute the knowledge graph entities with other entities within the existing interactive visualizations, execute commands in a web-based environment, and get the same visualization for the new entities. With this architecture, various applications for interactive visualization of knowledge graphs can be developed, which can also stimulate interest in exploring the graph and its entities. We also present a publicly available open-source use case that is built using the concepts discussed in this paper.

With the rapid development of internet technologies and their entry into our lives, significant changes have begun to take place in many areas. These changes have increased the need for innovation in fields such as education, knowledge management, and e-commerce in order to meet users' needs. The semantic web, introduced as an extension of the internet, has the potential to meet this knowledge-driven demand through its ability to build relational structures among the information in a rapidly growing data pool. This study reviews the place of the semantic web in the historical development of the internet and aims to shed light on the semantic web's potential to meet the new needs arising in education, knowledge management, and e-commerce alongside developing technology. The study identifies the different approaches the semantic web provides through the technologies it uses for information processing, such as ontologies, RDF structures, and SPARQL, and the new designs that can thereby be applied in different disciplines.

Mapping relational databases to RDF is a fundamental problem for the development of the Semantic Web. We present a solution, inspired by draft methods defined by the W3C, where relational databases are directly mapped to RDF and OWL. Given a relational database schema and its integrity constraints, this direct mapping produces an OWL ontology which provides the basis for generating RDF instances. The semantics of this mapping is defined using Datalog. Two fundamental properties are information preservation and query preservation. We prove that our mapping satisfies both conditions, even for relational databases that contain null values. We also consider two desirable properties: monotonicity and semantics preservation. We prove that our mapping is monotone, and also prove that no monotone mapping, including ours, is semantics preserving. We conclude that monotonicity is an obstacle for semantics preservation and thus present a non-monotone direct mapping that is semantics preserving.
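For intuition, a direct mapping in the spirit of the W3C drafts turns each row into an entity and each column into a property; the IRIs below follow a common row/column naming scheme and are illustrative rather than the paper's exact convention.

    # The relational row  Person(id=7, name='Ada', dept=3),  where dept
    # is a foreign key into Dept, becomes the RDF (Turtle):
    <http://example.org/db/Person/7>
        a <http://example.org/db/Person> ;
        <http://example.org/db/Person#name> "Ada" ;
        <http://example.org/db/Person#dept> <http://example.org/db/Dept/3> .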

What does Big Data mean? Is it a real change or just the latest IT trend? We will try, together, to understand, give answers, and offer guidelines. First of all we'll dive into the main topics affecting Big Data: NoSQL, Parallel Processing, Services and PaaS, Social Networks, Internet of Things, Open Data. We'll discuss the great benefits coming from Big Data, but we will provide an overview of the major Security and Privacy issues, too. One of the most important examples of a Big Data source is Wikipedia, one of the largest information resources of mankind, maintained by hundreds of thousands of people around the world. The Wikidata project centralizes all Wikimedia project links (Wikipedia, Wikisource, Wikivoyage, Wikispecies, Wikibooks, Wikiquote, Wikinotizie, Wikiversità), collecting structured data, creating automatic queries, providing support for third parties. Interoperability is one of the most important benefits of the open data model. Data in isolation have little value; conversely, their value increases significantly when different data sets, produced and published independently by various parties, can be crossed freely by third parties. The codelab is a coding session that allows participants to put into practice the theoretical concepts presented in the talk associated with it. Concretely, we will create an application that, through a web service, will run SPARQL queries on Open Data made available at the endpoints of DBpedia (a project that extracts structured information from Wikipedia and releases it on the web as Linked Open Data in RDF format).
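A query of the kind such a codelab typically runs against the public DBpedia endpoint (https://dbpedia.org/sparql); the properties are real DBpedia vocabulary, though results depend on the dataset version:

    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    # Italian cities with more than 500,000 inhabitants:
    SELECT ?city ?population
    WHERE {
      ?city dbo:country dbr:Italy ;
            dbo:populationTotal ?population .
      FILTER (?population > 500000)
    }
    ORDER BY DESC(?population)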

"In the context of the emergent Web of Data, a large number of organizations, institutes and companies (e.g., DBpedia, Geonames, PubMed ACM, IEEE, NASA, BBC) adopt the Linked Data practices and publish their data utilizing Semantic Web... more

"In the context of the emergent Web of Data, a large number of organizations, institutes and companies (e.g., DBpedia, Geonames, PubMed ACM, IEEE, NASA, BBC) adopt the Linked Data practices and publish their data utilizing Semantic Web (SW) technologies. On the other hand, the dominant standard for information exchange in the Web today is XML. Many international standards (e.g., Dublin Core, MPEG-7, METS, TEI, IEEE LOM) have been expressed in XML Schema resulting to a large number of XML datasets. The SW and XML worlds and their developed infrastructures are based on different data models, semantics and query languages. Thus, it is crucial to provide interoperability and integration mechanisms to bridge the gap between the SW and XML worlds.
In this chapter, we give an overview and a comparison of the technologies and the standards adopted by the XML and SW worlds. In addition, we outline the latest efforts from the W3C groups, including the latest working drafts and recommendations (e.g., OWL 2, SPARQL 1.1, XML Schema 1.1). Moreover, we present a survey of the research approaches which aim to provide interoperability and integration between the XML and SW worlds. Finally, we present the SPARQL2XQuery and XS2OWL Frameworks, which bridge the gap and create an interoperable environment between the two worlds. These Frameworks provide mechanisms for: (a) Query translation (SPARQL to XQuery translation); (b) Mapping specification and generation (Ontology to XML Schema mapping); and (c) Schema transformation (XML Schema to OWL transformation)."

The Semantic Web is a system that allows machines to understand complex human requests and reply based on meaning. Semantics is the study of the meanings of linguistic expressions and is a central branch of contemporary linguistics: it concerns the meaning of words, phrases and texts, and the relations between them. RDF provides essential support to the Semantic Web; it was created to represent distributed information. Applications can create RDF and process it in an adaptive manner. Knowledge representation is done using RDF standards and is machine understandable. This paper describes the creation of a semantic web using RDF, and the retrieval of accurate results using the SPARQL query language.
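A minimal end-to-end illustration of the RDF-plus-SPARQL workflow the paper describes, with invented example data:

    # Data, in Turtle:
    @prefix ex: <http://example.org/> .
    ex:Alice ex:knows ex:Bob .
    ex:Bob   ex:knows ex:Carol .

    # Query, in SPARQL (as comments, to keep one language per block):
    #   PREFIX ex: <http://example.org/>
    #   SELECT ?person WHERE { ex:Alice ex:knows ?person . }
    # Result: ex:Bob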

This paper presents the final results of a research project that aimed to construct a tool that is aided by Artificial Intelligence, through an ontology with a model trained with Machine Learning, and by Natural Language Processing, to support the semantic search of research projects in the Research System of the University of Nariño. For the construction of NATURE, as this tool is called, a methodology was used that includes the following stages: appropriation of knowledge; installation and configuration of tools, libraries and technologies; collection, extraction and preparation of research projects; and design and development of the tool. The main results of the work were three: a) the complete construction of the ontology, with classes, object properties (predicates), data properties (attributes) and individuals (instances) in Protégé, SPARQL queries with Apache Jena Fuseki, and the respective coding with Owlready2, using Jupyter Notebook with Python within an Anaconda virtual environment; b) the successful training of the model, for which Machine Learning and specifically Natural Language Processing algorithms such as spaCy, NLTK, Word2vec and Doc2vec were used, also in Jupyter Notebook with Python within an Anaconda virtual environment and with Elasticsearch; and c) the creation of NATURE by managing and unifying the queries for the ontology and for the Machine Learning model. The tests showed that NATURE was successful in all the searches that were performed, as its results were satisfactory.

The Mighty Storage Challenge (MOCHA) aims to test the performance of solutions for SPARQL processing in several aspects relevant for modern Linked Data applications. The Virtuoso Server by OpenLink is a modern enterprise-grade solution for data access, integration, and relational database management, which provides a scalable RDF Quad Store. In this paper, we present the initial evaluation results of running the Social Network Benchmark queries over the provided challenge datasets, as part of our application for the MOCHA challenge. These initial results will serve as a guideline for improvements in Virtuoso, which will then be tested as part of the MOCHA challenge.

In this paper, we formalize the problem of Basic Graph Pattern (BGP) optimization for SPARQL queries and main-memory graph implementations of RDF data. We define and analyze the characteristics of heuristics for selectivity-based static BGP optimization. The heuristics range from simple triple pattern variable counting to more sophisticated selectivity estimation techniques. Customized summary statistics for RDF data enable the selectivity estimation of joined triple patterns and the development of efficient heuristics. Using the Lehigh University Benchmark (LUBM), we evaluate the performance of the heuristics for the queries provided by the LUBM and discuss some of them in more detail.
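The simplest heuristic named above, variable counting, can be read off a BGP directly; the example below uses a LUBM-flavoured vocabulary with an assumed prefix IRI.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX ub:  <http://example.org/univ-bench#>
    SELECT ?x ?y
    WHERE {
      ?x rdf:type ?y .                # (1) two variables: unselective
      ?x ub:memberOf ub:Department0 . # (2) one variable: evaluate first
    }
    # A selectivity-based optimizer executes (2) before (1), since the
    # pattern with fewer variables is expected to match fewer triples.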

The increasing attention to federated SPARQL query systems emphasizes the necessity of benchmarking systems to evaluate their performance. Most of the existing benchmark systems rely on a set of predefined static queries over a particular set of data sources. Such benchmarks are useful for comparing general-purpose SPARQL query federation systems such as FedX, SPLENDID, etc. However, special-purpose federation systems such as TopFed, SAFE, etc. cannot be tested with these static benchmarks, since these systems operate only on specific data sets and the corresponding queries. To facilitate the process of benchmarking such special-purpose SPARQL query federation systems, we propose QFed, a dynamic SPARQL query set generator that takes into account the characteristics of both datasets and queries, along with the cost of data communication. Our experimental results show that QFed can successfully generate a large set of meaningful federated SPARQL queries to be considered for the performance evaluation of different federated SPARQL query engines.
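The kind of query QFed generates spans multiple endpoints; in SPARQL 1.1 such federation is expressed with SERVICE clauses, as in this hedged sketch with invented endpoints and vocabulary:

    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    SELECT ?drug ?target
    WHERE {
      SERVICE <http://example.org/sparql/drugs> {
        ?drug owl:sameAs ?ext .
      }
      SERVICE <http://example.org/sparql/targets> {
        ?ext <http://example.org/vocab#interactsWith> ?target .
      }
    }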

The logic-based machine-understandable framework of the Semantic Web often challenges naive users when they try to query ontology-based knowledge bases. Existing research efforts have approached this problem by introducing natural language (NL) interfaces to ontologies. These NL interfaces have the ability to construct SPARQL queries based on NL user queries. However, most efforts were restricted to queries expressed in English, and they often benefited from the advancement of English NLP tools. In contrast, little research has been done to support querying the Arabic content on the Semantic Web by using NL queries. This paper presents a domain-independent approach to translate Arabic NL queries to SPARQL by leveraging linguistic analysis. With special consideration given to Noun Phrases (NPs), our approach uses a language parser to extract NPs and their relations from Arabic parse trees and match them to the underlying ontology. It then utilizes knowledge in the ontology to group NPs into triple-based representations. A SPARQL query is finally generated by extracting targets and modifiers and interpreting them into SPARQL. The interpretation of advanced semantic features, including negation and conjunctive and disjunctive modifiers, is also supported. The approach was evaluated using two datasets consisting of OWL test data and queries, and the obtained results have confirmed its feasibility for translating Arabic NL queries to SPARQL.
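To illustrate the translation target (the ontology and names here are invented, not the paper's test data): a question like "What are the capitals of Arab countries?" would be mapped to a triple-based representation and then to SPARQL along these lines:

    PREFIX ex: <http://example.org/geo#>
    SELECT ?capital
    WHERE {
      ?country a ex:ArabCountry ;
               ex:capital ?capital .
    }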

SPARQL queries, OWL Ontologies and Individuals

A semantic-web ontology, simply known as an ontology, comprises a data model and data that should comply with it. Due to their distributed nature, there exists a large number of heterogeneous ontologies, and a strong need for exchanging data amongst them, i.e., populating a target ontology using data that come from one or more source ontologies. Data exchange may be implemented using correspondences that are later transformed into executable mappings; however, exchanging data amongst ontologies is not a trivial task, so tools that help software engineers to exchange data amongst ontologies are a must. In the literature, there are a number of tools to automatically generate executable mappings; unfortunately, they have some drawbacks, namely: 1) they were designed to work with nested-relational data models, which prevents them from being applied to ontologies; 2) they require their users to handcraft and maintain their executable mappings, which is not appealing; or 3) they do not attempt to identify groups of correspondences, which may easily lead to incoherent target data. In this article, we present MostoDE, a tool that assists software engineers in generating SPARQL executable mappings and exchanging data amongst ontologies. The salient features of our tool are as follows: it automates the generation of executable mappings using correspondences and constraints; it integrates several systems that implement semantic-web technologies to exchange data; and it provides visual aids to help software engineers exchange data amongst ontologies.
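An executable mapping in this setting is naturally expressed as a SPARQL CONSTRUCT query that reads source-ontology data and writes target-ontology data; the vocabularies below are invented for illustration.

    PREFIX src: <http://example.org/source#>
    PREFIX tgt: <http://example.org/target#>
    CONSTRUCT {
      ?p a tgt:Researcher ;
         tgt:fullName ?name .
    }
    WHERE {
      ?p a src:Person ;
         src:name ?name .
    }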

Joined-up government depends fundamentally on semantics: on the computable representation of meaning, so that data is associated with appropriate metadata from the start, and this association is maintained as the data is manipulated. This paper summarizes a tutorial and workshop on semantic technologies for supporting electronic government.

A chatbot is a conversational agent that communicates with users in natural language. It is founded on a question answering system which tries to understand the intent of the user. Several chatbot methods rely on template-based models of question answering. However, these approaches are not able to cope with varied questions, which can affect the quality of the results. To address this issue, we propose a new semantic question answering approach combining Natural Language Processing (NLP) methods and Semantic Web techniques to analyze the user's question and transform it into a SPARQL query. An ontology has been developed to represent the domain knowledge of the chatbot. Experiments show that our approach outperforms state-of-the-art methods.

The Semantic Web is an emerging area to augment human reasoning. Various technologies are being developed in this arena which have been standardized by the World Wide Web Consortium (W3C). One such standard is the Resource Description Framework (RDF). Semantic Web technologies can be utilized to build efficient and scalable systems for Cloud Computing. With the explosion of semantic web technologies, large RDF graphs are commonplace. This poses significant challenges for the storage and retrieval of RDF graphs. Current frameworks do not scale for large RDF graphs and as a result do not address these challenges. In this paper, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples by exploiting the cloud computing paradigm. We describe a scheme to store RDF data in the Hadoop Distributed File System. More than one Hadoop job (the smallest unit of execution in Hadoop) may be needed to answer a query, because a single triple pattern in a query cannot simultaneously take part in more than one join in a single Hadoop job. To determine the jobs, we present an algorithm to generate a query plan, whose worst-case cost is bounded, based on a greedy approach to answer a SPARQL Protocol and RDF Query Language (SPARQL) query. We use Hadoop's MapReduce framework to answer the queries. Our results show that we can store large RDF graphs in Hadoop clusters built with cheap commodity-class hardware. Furthermore, we show that our framework is scalable and efficient and can handle large amounts of RDF data, unlike traditional approaches.
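The single-join-per-pattern constraint is easiest to see on a chain-shaped BGP; the vocabulary below is LUBM-flavoured but assumed for illustration.

    PREFIX ub: <http://example.org/univ-bench#>
    SELECT ?x ?u
    WHERE {
      ?x ub:advisor ?y .            # (1)
      ?y ub:worksFor ?z .           # (2) joins (1) on ?y and (3) on ?z
      ?z ub:subOrganizationOf ?u .  # (3)
    }
    # Pattern (2) would have to join on ?y and ?z at once, so one
    # MapReduce job cannot do both: job 1 joins (1) and (2) on ?y,
    # job 2 joins that result with (3) on ?z.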

The Web of Data is an interconnected global dataspace in which discovering resources related to a given resource and recommending relevant ones is still an open research area. This work describes a new recommendation algorithm based on structured data published on the Web (Linked Data). The algorithm exploits existing relationships between resources by dynamically analyzing both the categories to which they belong and their explicit references to other resources. A user study conducted to evaluate the algorithm showed that it provides more novel recommendations than other state-of-the-art algorithms while maintaining satisfying prediction accuracy. The algorithm has been applied in a mobile application to recommend movies by relying on DBpedia (the Linked Data version of Wikipedia), although it could be applied to other datasets on the Web of Data.
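The two signals the algorithm analyzes, category membership and explicit cross-references, are both directly queryable on DBpedia; the sketch below uses real DBpedia vocabulary (dct:subject for categories), though results depend on the dataset version and the seed movie is just an example.

    PREFIX dct: <http://purl.org/dc/terms/>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    # Films sharing at least one Wikipedia category with the seed movie:
    SELECT DISTINCT ?movie
    WHERE {
      dbr:Pulp_Fiction dct:subject ?category .
      ?movie dct:subject ?category ;
             a dbo:Film .
      FILTER (?movie != dbr:Pulp_Fiction)
    }
    LIMIT 20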

The use of RDF to expose semantic data on the Web has seen a dramatic increase over the last few years. Nowadays, RDF datasets are so big and interconnected that, in fact, classical mono-node solutions present significant scalability problems when trying to manage big semantic data. MapReduce, a standard framework for distributed processing of great quantities of data, is earning a place among the distributed solutions facing RDF scalability issues. In this article, we survey the most important works addressing RDF management and querying through diverse MapReduce approaches, with a focus on their main strategies, optimizations and results.

We propose a new method for mining sets of patterns for classification, where patterns are represented as SPARQL queries over RDFS. The method contributes to so-called semantic data mining, a data mining approach where domain ontologies are used as background knowledge, and where the new challenge is to mine knowledge encoded in domain ontologies, rather than only purely empirical data. We have developed a tool that implements this approach. Using it, we have conducted an experimental evaluation including a comparison of our method to state-of-the-art approaches to classification of semantic data, and an experimental study within the emerging subfield of meta-learning called semantic meta-mining. The most important research contributions of the paper to the state of the art are as follows. For pattern mining research, and relational learning in general, the paper contributes a new algorithm for the discovery of a new type of patterns. For Semantic Web research, it theoretically and empirically illustrates how semantic, structured data can be used in traditional machine learning methods through a pattern-based approach for constructing semantic features.
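In this setting a pattern is simply a SPARQL query whose result (e.g., a boolean match per example) becomes a classification feature; a hedged sketch over an invented vocabulary:

    PREFIX ex: <http://example.org/onto#>
    # Feature: does the example participate in a cell-cycle process?
    # (In practice ?x would be bound to the example being classified.)
    ASK {
      ?x a ex:Gene ;
         ex:participatesIn ex:CellCycle .
    }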