Raúl García Castro - Academia.edu

Papers by Raúl García Castro

The SOSA/SSN Ontology: A Joint W3C and OGC Standard Specifying the Semantics of Sensors, Observations, Actuation, and Sampling

The joint W3C (World Wide Web Consortium) and OGC (Open Geospatial Consortium) Spatial Data on the Web (SDW) Working Group developed a set of ontologies to describe sensors, actuators, and samplers, as well as their observations, actuation, and sampling activities. The ontologies have been published both as a W3C recommendation and as an OGC implementation standard. The set includes a lightweight core module called SOSA (Sensor, Observation, Sample, and Actuator), available at http://www.w3.org/ns/sosa/, and a more expressive extension module called SSN (Semantic Sensor Network), available at http://www.w3.org/ns/ssn/. Together they describe systems of sensors and actuators, observations, the procedures used, the subjects and their properties being observed or acted upon, samples and the process of sampling, and so forth. The set of ontologies adopts a modular architecture with SOSA as a self-contained core that is extended by SSN and other modules to add expressivity and breadth. The S...
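To give a concrete flavour of the kind of description SOSA enables, the following minimal sketch renders one observation as Turtle using the sosa namespace cited above; the individuals (`ex:thermometer1`, `ex:airTemperature`) and the helper function are hypothetical illustrations, not part of the standard.

```python
# Minimal sketch of a SOSA observation serialized as Turtle.
# The sosa namespace is the one cited in the abstract; the ex:
# individuals are hypothetical examples.
PREFIXES = """\
@prefix sosa: <http://www.w3.org/ns/sosa/> .
@prefix ex:   <http://example.org/> .
"""

def observation_turtle(obs_id: str, sensor: str, prop: str,
                       value: float, unit: str) -> str:
    """Render one sosa:Observation as a Turtle snippet."""
    return (
        f"ex:{obs_id} a sosa:Observation ;\n"
        f"    sosa:madeBySensor ex:{sensor} ;\n"
        f"    sosa:observedProperty ex:{prop} ;\n"
        f"    sosa:hasSimpleResult \"{value} {unit}\" .\n"
    )

doc = PREFIXES + "\n" + observation_turtle(
    "obs1", "thermometer1", "airTemperature", 21.5, "Cel")
print(doc)
```

The modular design described in the abstract means such a snippet is valid against the SOSA core alone, without importing the fuller SSN module.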

The SSN ontology of the W3C semantic sensor network incubator group

Journal of Web Semantics, 2012

The W3C Semantic Sensor Network Incubator group (the SSN-XG) produced an OWL 2 ontology to describe sensors and observations: the SSN ontology, available at http://purl.oclc.org/NET/ssnx/ssn. The SSN ontology can describe sensors in terms of capabilities, measurement processes, observations, and deployments. This article describes the SSN ontology, gives an example, and describes the use of the ontology in recent research projects.

A Core Ontological Model for Semantic Sensor Web Infrastructures

International Journal on Semantic Web and Information Systems, 2012

Semantic Sensor Web infrastructures use ontology-based models to represent the data that they manage; however, up to now, these ontological models do not allow representing all the characteristics of distributed, heterogeneous, and web-accessible sensor data. This paper describes a core ontological model for Semantic Sensor Web infrastructures that covers these characteristics and that has been built with a focus on reusability. This ontological model is composed of different modules that deal, on the one hand, with infrastructure data and, on the other hand, with data from a specific domain, that is, the coastal flood emergency planning domain. The paper also presents a set of guidelines, followed during the ontological model development, to satisfy a common set of requirements related to modelling domain-specific features of interest and properties. In addition, the paper includes the results obtained after an exhaustive evaluation of the developed ontologies along different aspec...

LOT: An industrial oriented ontology engineering framework

Engineering Applications of Artificial Intelligence

DBpedia SHACL shapes

SHACL Shapes extracted from the DBpedia ontology.

Themis: a tool for validating ontologies through requirements

International Conferences on Software Engineering and Knowledge Engineering, 2019

The validation of ontologies, whose aim is to check whether an ontology matches the conceptualization it is meant to specify, is a key activity for guaranteeing the quality of ontologies. This work focuses on validation through requirements, with the aim of assuring both domain experts and ontology developers that the ontologies they are building or using are complete with regard to their needs. Inspired by software engineering testing processes, this work proposes Themis, a web-based tool that is independent of any ontology development environment and validates ontologies by applying test expressions which, following lexico-syntactic patterns, represent the desired behaviour that an ontology should exhibit if a requirement is satisfied.
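Themis defines its own test-expression syntax; purely to illustrate the idea of checking lexico-syntactic requirement tests against an ontology, the sketch below matches a toy expression of the form "Class subclassOf Class" against a small ontology given as subclass pairs. The pattern, the ontology, and all class names are hypothetical stand-ins.

```python
import re

# Toy ontology: a set of (subclass, superclass) pairs. Hypothetical names.
ONTOLOGY = {("Sensor", "System"), ("Actuator", "System")}

# A lexico-syntactic pattern for one kind of test expression:
#   "<Class> subclassOf <Class>"
PATTERN = re.compile(r"^(\w+)\s+subclassOf\s+(\w+)$")

def check(expression: str) -> bool:
    """Return True if the toy ontology satisfies the test expression."""
    m = PATTERN.match(expression.strip())
    if not m:
        raise ValueError(f"unrecognized test expression: {expression!r}")
    sub, sup = m.groups()
    return (sub, sup) in ONTOLOGY

print(check("Sensor subclassOf System"))    # True: requirement satisfied
print(check("Sensor subclassOf Actuator"))  # False: requirement not satisfied
```

A real implementation would evaluate such expressions against the ontology's inferred axioms rather than a literal pair set, which is what makes tool support valuable.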

Role-based model for Named Entity Recognition

RANLP 2017 - Recent Advances in Natural Language Processing Meet Deep Learning, 2017

Named Entity Recognition (NER) poses new challenges in real-world documents in which entities play different roles according to their purpose or meaning. Retrieving all possible entities in scenarios where only a role-based subset is needed introduces noise that lowers overall precision. This work proposes a NER model that relies on role classification models to recognize entities with a specific role. The proposed model has been applied in two use cases using Spanish drug Summaries of Product Characteristics: identification of therapeutic indications and identification of adverse reactions. The results show how precision increases when a NER model is oriented towards a specific role and discards out-of-scope entities.
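The paper's models are trained on Spanish SPC text; as a purely schematic illustration of the filtering idea (recognize candidates, then keep only those whose role matches the target), the sketch below uses a lookup table as a hypothetical stand-in for the trained role classification models.

```python
# Schematic role-based NER: recognize candidate entities, then keep only
# those whose role matches the target role. CANDIDATES and ROLE are
# hypothetical stand-ins for the paper's trained models and lexicons.
CANDIDATES = {"headache", "fever", "hypertension"}
ROLE = {
    "headache": "adverse_reaction",
    "fever": "adverse_reaction",
    "hypertension": "therapeutic_indication",
}

def recognize(text: str, target_role: str) -> list:
    """Return candidate entities in `text` whose role matches `target_role`."""
    tokens = [w.strip(".,") for w in text.lower().split()]
    found = [w for w in tokens if w in CANDIDATES]
    return [w for w in found if ROLE[w] == target_role]

text = "Indicated for hypertension. May cause headache and fever."
print(recognize(text, "adverse_reaction"))        # ['headache', 'fever']
print(recognize(text, "therapeutic_indication"))  # ['hypertension']
```

The point of the filter is the precision gain reported in the abstract: a plain NER step would return all three mentions for either use case.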

Loupe - An Online Tool for Inspecting Datasets in the Linked Data Cloud

The Linked Data initiative continues to grow, making more datasets available; however, discovering the type of data contained in a dataset, its structure, and the vocabularies used remains a challenge that hinders querying and reuse. VoID descriptions provide a starting point, but a more detailed analysis is required to unveil implicit vocabulary usage such as common data patterns. Such analysis helps in selecting datasets, formulating effective queries, and identifying quality issues. Loupe is an online tool for inspecting datasets by looking at both implicit data patterns and explicit vocabulary definitions in data. This demo paper presents the dataset inspection capabilities of Loupe.

Ontologies for IoT Semantic Interoperability

Interoperability has become a cornerstone for the Internet of Things (IoT). IoT infrastructures that rely on the Web have become pervasive, either publishing their data or enabling their remote management through it. Interoperability at the semantic level provides an environment in which the heterogeneities of the different IoT infrastructures are wrapped and, therefore, systems can interact transparently. Achieving full interoperability requires the implementation of three layers: the technical, the syntactic, and the semantic layer. Furthermore, query-based transparent discovery and distributed access to IoT infrastructures can be implemented on top of these interoperability layers. In this chapter, the implementation of these three layers, with a focus on the semantic layer, is discussed, as well as the implementation of the interoperability services that provide transparent discovery and access to IoT infrastructures on the Web. At the end of each section, a reader ca...

Towards semantic interoperability standards based on ontologies

This work has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements No. 732240 (SynchroniCity) and No. 688467 (VICINITY), and from ETSI under Specialist Task Forces 534, 556, and 566. This work is partially funded by Hazards SEES NSF Award EAR 1520870 and KHealth NIH 1 R01 HD087132-01.

Describing LDP Applications with the Hydra Core Vocabulary

The Linked Data Platform (LDP) W3C Recommendation provides a standard protocol and a set of best practices for developing read-write Linked Data applications based on HTTP access to Web resources that describe their state using the RDF data model. The Hydra Core Vocabulary is an initiative to define a lightweight vocabulary for describing hypermedia-driven Web APIs. By specifying concepts commonly used in Web APIs, such as hypermedia controls, with their explicit semantics, the Hydra Core Vocabulary enables the creation of generic API clients. This paper discusses how LDP applications can benefit from the Hydra Core Vocabulary to describe their APIs. Using Hydra, an LDP application can enable generic clients by describing the semantics of the expected and returned data. Having API documentation will be a more efficient approach for most LDP applications than gathering information about affordances and restrictions in each HTTP interaction. Nevertheless, there are potential conflic...

Developing ontologies for representing data about key performance indicators

Multiple indicators are of interest in smart cities, at different scales and for different stakeholders. In open environments, such as the Web, or when indicator information has to be interchanged across systems, contextual information (e.g., unit of measurement, measurement method) should be transmitted together with the data, and the lack of such information might cause undesirable effects. Describing the data by means of ontologies increases interoperability among datasets and applications. However, methodological guidance is crucial during ontology development in order to transform the art of modelling into an engineering activity. In this paper, we present a methodological approach for modelling data about Key Performance Indicators and their context, together with an application example of such guidelines.

Technical and social aspects of semantic interoperability in the IoT

The Internet of Things (IoT) envisions an ecosystem in which physical entities, systems, and information resources bridge the gap between the physical and the virtual world. The existing heterogeneity in such physical entities, systems, and information resources, intensified by the fact that they originate from different sectors and according to different perspectives, poses numerous challenges to the IoT vision. One of them is the need for interoperability, since capturing the maximum value from the IoT involves multiple IoT systems working together and, therefore, seamlessly interchanging information. However, successfully achieving interoperability requires coping with different aspects, not only technological but also social and/or regulatory ones. This talk will address how these aspects influence semantic interoperability, taking into account that such interoperability requires being aware of both the information interchanged and the data model (i.e., ontology) of such information.

Ontologies and datasets for energy management system interoperability

This document presents a final report of the work carried out as part of work package 2 of the READY4SmartCities project (R4SC), whose goal is to identify the knowledge and data resources that support interoperability for energy management systems. The document is divided into two parts.

A SAREF Extension for Semantic Interoperability in the Industry and Manufacturing Domain

Enterprise Interoperability, 2018


Expanding SNOMED-CT through Spanish Drug Summaries of Product Characteristics

Proceedings of the Knowledge Capture Conference, 2017

Terminologies in the biomedical field are one of the main resources used in clinical practice. Keeping them up to date to meet real-world use cases is a critical operation that, even for well-maintained terminologies such as SNOMED-CT, involves much effort from domain experts. Pharmacological products, or drugs, are constantly being approved and made available in the market, and their clinical information should also be updated in terminologies. Each new drug is provided with its Summary of Product Characteristics (SPC), a natural-language document that contains its essential information. This paper proposes a method for populating the Spanish extension of SNOMED-CT with drug names using SPCs and representing their clinical data sections in the terminology. More precisely, the method has been applied to the therapeutic indication and adverse reaction sections, in which disease names are recognized as named entities in the document and mapped to the terminology. The relations between the drug name and the mapped entities are also represented in the terminology based on the specific roles that they have in the document.

Towards the Use of Ontologies in Remotely Piloted Aircraft Systems Conceptual Design: Opportunities and Challenges

AIAA Scitech 2021 Forum, 2021

RDF shape induction using knowledge base profiling

Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 2018

Knowledge Graphs (KGs) are becoming the core of most artificial intelligence and cognitive applications. Popular KGs such as DBpedia and Wikidata have chosen the RDF data model to represent their data. Despite the advantages, there are challenges in using RDF data, for example, data validation. Ontologies for specifying domain conceptualizations in RDF data are designed for entailment rather than validation, and most ontologies lack the granular information needed for validating constraints. Recent work on RDF Shapes and the standardization of languages such as SHACL and ShEx provides better mechanisms for representing integrity constraints for RDF data. However, manually creating constraints for large KGs is still a tedious task. In this paper, we present a data-driven approach for inducing integrity constraints for RDF data using data profiling. Those constraints can be combined into RDF Shapes and used to validate RDF graphs. Our method is based on machine learning techniques that automatically generate RDF Shapes using profiled RDF data as features. In the experiments, the proposed approach achieved 97% precision in deriving RDF Shapes with cardinality constraints for a subset of DBpedia data.
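The paper's approach feeds profiled features into machine-learning models; the sketch below shows only the simpler profiling intuition behind cardinality induction, deriving min/max occurrence bounds for a predicate from example triples. The triples and names are hypothetical, and a real system would, as the abstract notes, learn shapes rather than read bounds off directly.

```python
from collections import Counter

# Hypothetical triples: (subject, predicate, object).
TRIPLES = [
    ("alice", "name", "Alice"),
    ("alice", "email", "a@x.org"),
    ("alice", "email", "alice@x.org"),
    ("bob", "name", "Bob"),
]

def cardinality_bounds(triples, predicate):
    """Profile how often each subject uses `predicate` and return the
    (min, max) occurrence counts over all subjects in the data."""
    subjects = {s for s, _, _ in triples}
    counts = Counter(s for s, p, _ in triples if p == predicate)
    per_subject = [counts.get(s, 0) for s in subjects]
    return min(per_subject), max(per_subject)

# name occurs exactly once per subject: a candidate for
# sh:minCount 1 / sh:maxCount 1 in a derived shape.
print(cardinality_bounds(TRIPLES, "name"))   # (1, 1)
print(cardinality_bounds(TRIPLES, "email"))  # (0, 2)
```

Bounds like these are exactly the kind of cardinality constraints that can then be emitted as SHACL property shapes and used to validate the graph.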

Astrea: Automatic Generation of SHACL Shapes from Ontologies

The Semantic Web, 2020

Knowledge Graphs (KGs) that publish RDF data modelled using ontologies in a wide range of domains have populated the Web. The SHACL language is a W3C recommendation for encoding value and model restrictions that aim at validating KG data, ensuring data quality. Developing shapes is a complex and time-consuming task that is not feasible to carry out manually at scale. This article presents two resources that aim at automatically generating SHACL shapes for a set of ontologies: (1) Astrea-KG, a KG that publishes a set of mappings encoding the equivalences between ontology constraint patterns and SHACL constraint patterns, and (2) Astrea, a tool that automatically generates SHACL shapes from a set of ontologies by executing the mappings from Astrea-KG. These two resources are openly available at Zenodo, GitHub, and a web application. In contrast to other proposals, these resources cover a large number of SHACL restrictions, producing both value and model restrictions, whereas other proposals consider only a limited number of restrictions or focus only on value or model restrictions.
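Astrea's actual mappings are published in Astrea-KG; the two-entry dictionary below is a hypothetical sketch of the underlying idea only, pairing an ontology constraint pattern with a plausible SHACL counterpart and emitting a property-shape fragment. The mapping choices shown are illustrative, not Astrea's.

```python
# Hypothetical sketch of ontology-to-SHACL constraint mappings, in the
# spirit of Astrea-KG (the real, much larger mapping set lives in the KG).
MAPPINGS = {
    "owl:maxCardinality": "sh:maxCount",
    "rdfs:range": "sh:class",  # illustrative choice for object properties
}

def to_shacl(constraint: str, value: str) -> str:
    """Translate one ontology constraint into a SHACL property fragment."""
    try:
        shacl_term = MAPPINGS[constraint]
    except KeyError:
        raise ValueError(f"no mapping for {constraint}") from None
    return f"{shacl_term} {value} ;"

print(to_shacl("owl:maxCardinality", "1"))  # sh:maxCount 1 ;
print(to_shacl("rdfs:range", "ex:Person"))  # sh:class ex:Person ;
```

Executing a table of such mappings over every restriction found in an ontology is what lets a tool emit complete shapes without hand-authoring them.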

Semantic Discovery in the Web of Things

Current Trends in Web Engineering, 2018

As the number of things present on the Web grows, the ability to discover such things in order to successfully interact with them becomes a challenge, mainly due to heterogeneity. The contribution of this paper is twofold. First, an ontology-based approach to web thing discovery that is transparent to the syntax, protocols, and formats used in thing interfaces is described. Second, a semantic model for describing web things, and for extracting and understanding the information relevant to discovery, is proposed.

Research paper thumbnail of The SOSA / SSN Ontology : A Joint W 3 C and OGC Standard Specifying the Semantics of Sensors , Observations , Actuation , and Sampling

The joint W3C (World Wide Web Consortium) and OGC (Open Geospatial Consortium) Spatial Data on th... more The joint W3C (World Wide Web Consortium) and OGC (Open Geospatial Consortium) Spatial Data on the Web (SDW) Working Group developed a set of ontologies to describe sensors, actuators, samplers as well as their observations, actuation, and sampling activities. The ontologies have been published both as a W3C recommendation and as an OGC implementation standard. The set includes a lightweight core module called SOSA (Sensor, Observation, Sampler, and Actuator) available at: http://www.w3.org/ns/sosa/, and a more expressive extension module called SSN (Semantic Sensor Network) available at: http://www.w3.org/ns/ssn/. Together they describe systems of sensors and actuators, observations, the used procedures, the subjects and their properties being observed or acted upon, samples and the process of sampling, and so forth. The set of ontologies adopts a modular architecture with SOSA as a self-contained core that is extended by SSN and other modules to add expressivity and breadth. The S...

Research paper thumbnail of The SSN ontology of the W3C semantic sensor network incubator group

Journal of Web Semantics, 2012

The W3C Semantic Sensor Network Incubator group (the SSN-XG) produced an OWL 2 ontology to descri... more The W3C Semantic Sensor Network Incubator group (the SSN-XG) produced an OWL 2 ontology to describe sensors and observations-the SSN ontology, available at http://purl.oclc.org/NET/ssnx/ssn. The SSN ontology can describe sensors in terms of capabilities, measurement processes, observations and deployments. This article describes the SSN ontology. It further gives an example and describes the use of the ontology in recent research projects.

Research paper thumbnail of A Core Ontological Model for Semantic Sensor Web Infrastructures

International Journal on Semantic Web and Information Systems, 2012

Semantic Sensor Web infrastructures use ontology-based models to represent the data that they man... more Semantic Sensor Web infrastructures use ontology-based models to represent the data that they manage; however, up to now, these ontological models do not allow representing all the characteristics of distributed, heterogeneous, and web-accessible sensor data. This paper describes a core ontological model for Semantic Sensor Web infrastructures that covers these characteristics and that has been built with a focus on reusability. This ontological model is composed of different modules that deal, on the one hand, with infrastructure data and, on the other hand, with data from a specific domain, that is, the coastal flood emergency planning domain. The paper also presents a set of guidelines, followed during the ontological model development, to satisfy a common set of requirements related to modelling domain-specific features of interest and properties. In addition, the paper includes the results obtained after an exhaustive evaluation of the developed ontologies along different aspec...

Research paper thumbnail of LOT: An industrial oriented ontology engineering framework

Engineering Applications of Artificial Intelligence

Research paper thumbnail of DBpedia SHACL shapes

SHACL Shapes extracted from the DBpedia ontology.

Research paper thumbnail of Themis: a tool for validating ontologies through requirements

International Conferences on Software Engineering and Knowledge Engineering, 2019

The validation of ontologies, whose aim is to check whether an ontology matches the conceptualiza... more The validation of ontologies, whose aim is to check whether an ontology matches the conceptualization it is meant to specify, is a key activity for guaranteeing the quality of ontologies. This work is focused on the validation through requirements, with the aim of assuring, both the domain experts and ontology developers, that the ontologies they are building or using are complete regarding their needs. Inspired by software engineering testing processes, this work proposes a web-based tool called Themis, independent of any ontology development environment, for validating ontologies by means of the application of test expressions which, following lexicosyntactic patterns, represent the desired behaviour that will present an ontology if a requirement is satisfied.

Research paper thumbnail of Role-based model for Named Entity Recognition

RANLP 2017 - Recent Advances in Natural Language Processing Meet Deep Learning, 2017

Named Entity Recognition (NER) poses new challenges in real-world documents in which there are en... more Named Entity Recognition (NER) poses new challenges in real-world documents in which there are entities with different roles according to their purpose or meaning. Retrieving all the possible entities in scenarios in which only a subset of them based on their role is needed, produces noise on the overall precision. This work proposes a NER model that relies on role classification models that support recognizing entities with a specific role. The proposed model has been implemented in two use cases using Spanish drug Summary of Product Characteristics: identification of therapeutic indications and identification of adverse reactions. The results show how precision is increased using a NER model that is oriented towards a specific role and discards entities out of scope.

Research paper thumbnail of Loupe - An Online Tool for Inspecting Datasets in the Linked Data Cloud

The Linked Data initiative continues to grow making more datasets available; however, discovering... more The Linked Data initiative continues to grow making more datasets available; however, discovering the type of data contained in a dataset, its structure, and the vocabularies used still remains a challenge hindering the querying and reuse. VoID descriptions provide a starting point but a more detailed analysis is required to unveil the implicit vocabulary usage such as common data patterns. Such analysis helps the selection of datasets, the formulation of effective queries, or the identification of quality issues. Loupe is an online tool for inspecting datasets by looking at both implicit data patterns as well as explicit vocabulary definitions in data. This demo paper presents the dataset inspection capabilities of Loupe.

Research paper thumbnail of Ontologies for IoT Semantic Interoperability

Interoperability has become a cornerstone for the Internet of Things (IoT). The IoT infrastructur... more Interoperability has become a cornerstone for the Internet of Things (IoT). The IoT infrastructures that rely on the Web have become pervasive, by either publishing their data or enabling their remote management through it. The interoperability at the semantic level provides an environment in which the heterogeneities of the different IoT infrastructures are wrapped, and, therefore, systems can interact transparently. Achieving full interoperability requires the implementation of three layers: the technical, the syntactic, and the semantic layer. Furthermore, query-based transparent discovery and distributed access of IoT infrastructures can be implemented on top of these interoperability layers. In this chapter, the implementation of these three layers, with focus on the semantic layer, is discussed, as well as, the implementation of the interoperability services that provide transparent discovery and access of IoT infrastructures on the Web. At the end of each section, a reader ca...

Research paper thumbnail of Towards semantic interoperability standards based on ontologies

This work has received funding from the European Union’s Horizon 2020 research and innovation pro... more This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreements No.732240 (SynchroniCity) and No. 688467 (VICINITY); from ETSI under Specialist Task Forces 534, 556, and 566. This work is partially funded by Hazards SEES NSF Award EAR 1520870, and KHealth NIH 1 R01 HD087132-01.

Research paper thumbnail of Describing LDP Applications with the Hydra Core Vocabulary

The Linked Data Platform (LDP) W3C Recommendation provides a standard protocol and a set of best ... more The Linked Data Platform (LDP) W3C Recommendation provides a standard protocol and a set of best practices for the development of read-write Linked Data applications based on HTTP access to Web resources that describe their state using the RDF data model. The Hydra Core Vocabulary is an initiative to define a lightweight vocabulary to describe hypermedia-driven Web APIs. By specifying concepts commonly used in Web APIs such as hypermedia controls with their explicit semantics, the Hydra Core Vocabulary enables creation of generic API clients. This paper discusses how LDP applications can benefit from the Hydra Core Vocabulary to describe their APIs. Using Hydra, an LDP application can enable generic clients by describing the semantics of the expected and returned data. Having an API documentation will be a more efficient approach for most LDP applications than gathering information about affordences and restrictions in each HTTP interaction. Nevertheless, there are potential conflic...

Research paper thumbnail of Developing ontologies for representing data about key performance indicators

Multiple indicators are of interest in smart cities at different scales and for different stakeho... more Multiple indicators are of interest in smart cities at different scales and for different stakeholders. In open environments, such as The Web, or when indicator information has to be interchanged across systems, contextual information (e.g., unit of measurement, measurement method) should be transmitted together with the data and the lack of such information might cause undesirable effects. Describing the data by means of ontologies increases interoperability among datasets and applications. However, methodological guidance is crucial during ontology development in order to transform the art of modeling in an engineering activity. In the current paper, we present a methodological approach for modelling data about Key Performance Indicators and their context with an application example of such guidelines.

Research paper thumbnail of Technical and social aspects of semantic interoperability in the IoT

The Internet of Things (IoT) envisions an ecosystem in which physical entities, systems and infor... more The Internet of Things (IoT) envisions an ecosystem in which physical entities, systems and information resources bridge the gap between the physical and the virtual world. The existing heterogeneity in such physical entities, systems and information resources, intensified by the fact that they originate from different sectors and according to different perspectives, poses numerous challenges to the IoT vision. One of them is the need for interoperability, since capturing the maximum value from the IoT involves multiple IoT systems working together and, therefore, seamlessly interchanging information. However, successfully achieving interoperability requires coping with different aspects, not only technological but also social and/or regulatory ones. This talk will address how these aspects influence semantic interoperability, taking into account that such interoperability requires being aware of both the information interchanged and the data model (i.e., ontology) of such information.

Research paper thumbnail of Ontologies and datasets for energy management system interoperability

This document presents a final report of the work carried out as part of work package 2 of the RE... more This document presents a final report of the work carried out as part of work package 2 of the READY4SmartCities project (R4SC), whose goal it is to identify the knowledge and data resources that support interoperability for energy management systems. The document is divided into two parts.

Research paper thumbnail of A SAREF Extension for Semantic Interoperability in the Industry and Manufacturing Domain

Enterprise Interoperability, 2018

People interested in the research are advised to contact the author for the final version of the ... more People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website. • The final author version and the galley proof are versions of the publication after peer review. • The final published version features the final layout of the paper including the volume, issue and page numbers. Link to publication General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal. If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the "Taverne" license above, please follow below link for the End User Agreement:

Research paper thumbnail of Expanding SNOMED-CT through Spanish Drug Summaries of Product Characteristics

Proceedings of the Knowledge Capture Conference, 2017

Terminologies in the biomedical field are one of the main resources used in clinical practice. Keeping them up to date to meet real-world use cases is a critical operation that, even in the case of well-maintained terminologies such as SNOMED-CT, involves much effort from domain experts. Pharmacological products or drugs are constantly being approved and made available on the market, and their clinical information should also be updated in terminologies. Each new drug is provided with a Summary of Product Characteristics (SPC), a document in natural language that contains its essential information. This paper proposes a method for populating the Spanish extension of SNOMED-CT with drug names using SPCs and for representing their clinical data sections in the terminology. More precisely, the method has been applied to the therapeutic indication and adverse reaction sections, in which disease names are recognized as named entities in the document and mapped to the terminology. The relations between the drug name and the mapped entities are also represented in the terminology, based on the specific roles that they have in the document.
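The entity-recognition-and-mapping step described above can be sketched as a dictionary-based lookup. This is a minimal illustration, not the paper's actual pipeline: the lexicon, section text and function name are invented for the example, and a real system would load the Spanish SNOMED-CT extension rather than a two-entry dictionary.

```python
import re

# Toy lexicon mapping Spanish disease names to SNOMED-CT concept IDs
# (illustrative; a real pipeline would load the Spanish extension).
DISEASE_LEXICON = {
    "hipertensión": "38341003",   # hypertensive disorder
    "cefalea": "25064002",        # headache
}

def link_diseases(spc_section: str) -> list[tuple[str, str]]:
    """Recognize disease mentions in an SPC section and map them to
    terminology concepts via whole-word dictionary lookup."""
    text = spc_section.lower()
    hits = []
    for name, concept_id in DISEASE_LEXICON.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            hits.append((name, concept_id))
    return sorted(hits)

# Adverse-reaction section (abridged): each hit would then be related to
# the drug concept with a role reflecting the section it came from.
section = "Reacciones adversas frecuentes: cefalea e hipertensión."
print(link_diseases(section))
```

Each `(name, concept_id)` pair found in the therapeutic-indication or adverse-reaction section would then be linked to the drug concept with the corresponding role.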

Research paper thumbnail of Towards the Use of Ontologies in Remotely Piloted Aircraft Systems Conceptual Design: Opportunities and Challenges

AIAA Scitech 2021 Forum, 2021

Research paper thumbnail of RDF shape induction using knowledge base profiling

Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 2018

Knowledge Graphs (KGs) are becoming the core of most artificial intelligence and cognitive applications. Popular KGs such as DBpedia and Wikidata have chosen the RDF data model to represent their data. Despite the advantages, there are challenges in using RDF data, for example, data validation. Ontologies for specifying domain conceptualizations in RDF data are designed for entailment rather than validation, and most ontologies lack the granular information needed for validating constraints. Recent work on RDF Shapes and the standardization of languages such as SHACL and ShEx provides better mechanisms for representing integrity constraints for RDF data. However, manually creating constraints for large KGs is still a tedious task. In this paper, we present a data-driven approach for inducing integrity constraints for RDF data using data profiling. Those constraints can be combined into RDF Shapes and used to validate RDF graphs. Our method is based on machine learning techniques that automatically generate RDF Shapes using profiled RDF data as features. In the experiments, the proposed approach achieved 97% precision in deriving RDF Shapes with cardinality constraints for a subset of DBpedia data.
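The core idea of profiling data to induce cardinality constraints can be sketched as follows. This is a simplified, rule-based sketch (the paper itself feeds profiled features into machine learning models); the triples and function name are invented for the example.

```python
from collections import defaultdict

# Toy RDF graph as (subject, predicate, object) triples.
TRIPLES = [
    ("db:Berlin", "dbo:populationTotal", "3644826"),
    ("db:Berlin", "dbo:country", "db:Germany"),
    ("db:Munich", "dbo:populationTotal", "1471508"),
    ("db:Munich", "dbo:country", "db:Germany"),
    ("db:Bonn", "dbo:country", "db:Germany"),
]

def profile_cardinalities(triples):
    """Profile, per predicate, how many values each subject has, and turn
    the observed minimum/maximum into candidate SHACL-style constraints."""
    counts = defaultdict(lambda: defaultdict(int))
    subjects = {s for s, _, _ in triples}
    for s, p, _ in triples:
        counts[p][s] += 1
    shapes = {}
    for p, per_subject in counts.items():
        # Subjects that never use the predicate contribute a count of 0,
        # so a predicate missing for some subject yields sh:minCount 0.
        observed = [per_subject.get(s, 0) for s in subjects]
        shapes[p] = {"sh:minCount": min(observed), "sh:maxCount": max(observed)}
    return shapes

print(profile_cardinalities(TRIPLES))
```

On this toy graph, `dbo:country` is induced as mandatory single-valued (min 1, max 1), while `dbo:populationTotal` is optional because `db:Bonn` lacks it.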

Research paper thumbnail of Astrea: Automatic Generation of SHACL Shapes from Ontologies

The Semantic Web, 2020

Knowledge Graphs (KGs) that publish RDF data modelled using ontologies in a wide range of domains have populated the Web. The SHACL language is a W3C recommendation for encoding value and model restrictions that aim at validating KG data, ensuring data quality. Developing shapes is a complex and time-consuming task that is not feasible to achieve manually. This article presents two resources that aim at automatically generating SHACL shapes for a set of ontologies: (1) Astrea-KG, a KG that publishes a set of mappings encoding the equivalent conceptual restrictions between ontology constraint patterns and SHACL constraint patterns, and (2) Astrea, a tool that automatically generates SHACL shapes from a set of ontologies by executing the mappings from the Astrea-KG. These two resources are openly available at Zenodo, GitHub, and a web application. In contrast to other proposals, these resources cover a large number of SHACL restrictions, producing both value and model data restrictions, whereas other proposals consider only a limited number of restrictions or focus only on value or model restrictions.
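The pattern-to-pattern mapping idea can be illustrated with a minimal sketch. This is not Astrea's actual mapping catalogue: the mapping table, prefixes and function are invented for the example, showing only how an ontology restriction could be rewritten as a SHACL property shape.

```python
# Minimal illustration of mapping ontology constraint patterns to SHACL
# constraint patterns (only three mappings; Astrea-KG encodes many more).
OWL_TO_SHACL = {
    "owl:maxCardinality": "sh:maxCount",
    "owl:minCardinality": "sh:minCount",
    "rdfs:range": "sh:class",
}

def restriction_to_shape(cls, prop, owl_term, value):
    """Render a SHACL node shape (as a Turtle-like string) for one
    ontology restriction, using the pattern-to-pattern mapping."""
    sh_term = OWL_TO_SHACL[owl_term]
    return (f"{cls}Shape a sh:NodeShape ;\n"
            f"  sh:targetClass {cls} ;\n"
            f"  sh:property [ sh:path {prop} ; {sh_term} {value} ] .")

print(restriction_to_shape("ex:Building", "ex:hasSpace", "owl:minCardinality", 1))
```

Executing the full mapping catalogue over every class and restriction in an ontology yields the complete set of shapes for validating data modelled with it.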

Research paper thumbnail of Semantic Discovery in the Web of Things

Current Trends in Web Engineering, 2018

While the number of things present on the Web grows, the ability to discover such things in order to successfully interact with them becomes a challenge, mainly due to heterogeneity. The contribution of this paper is twofold. First, an ontology-based approach to leverage web thing discovery that is transparent to the syntax, protocols and formats used in thing interfaces is described. Second, a semantic model for describing web things, and for extracting and understanding the information relevant for discovery, is proposed.