Christophe Debruyne | Trinity College Dublin
Journal Articles by Christophe Debruyne
HRB Open Research, Mar 14, 2019
There is an ongoing challenge as to how best to manage and understand 'big data' in precision medicine settings. This paper describes the potential for a Linked Data approach, using a Resource Description Framework (RDF) model, to combine multiple datasets with temporal and spatial elements of varying dimensionality. This "AVERT model" provides a framework for converting multiple standalone files of various formats, from both clinical and environmental settings, into a single data source. The resulting data source can be queried effectively, shared with outside parties, and more easily understood by multiple stakeholders through standardized vocabularies; it also incorporates provenance metadata and supports temporo-spatial reasoning. The approach has further advantages in terms of data sharing, security and subsequent analysis. We use a case study relating to anti-Glomerular Basement Membrane (GBM) disease, a rare autoimmune condition, to illustrate a technical proof of concept for the AVERT model.
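The core idea, combining heterogeneous clinical and environmental records into a single RDF graph that can be queried and traced back to its sources, can be pictured with a short sketch. This is not the AVERT model itself: the ex: namespace, class names and property names below are invented for illustration; only the PROV terms are standard.

```python
# A minimal sketch of combining clinical and environmental observations
# into one RDF graph with provenance, in the spirit of the AVERT model.
# All ex: terms below are hypothetical, not the actual model.
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import PROV, XSD

EX = Namespace("http://example.org/avert/")

g = Graph()
g.bind("ex", EX)
g.bind("prov", PROV)

# A clinical observation from one standalone file...
obs = EX["observation/1"]
g.add((obs, RDF.type, EX.ClinicalObservation))
g.add((obs, EX.patient, EX["patient/42"]))
g.add((obs, EX.recordedAt, Literal("2016-03-01T09:00:00", datatype=XSD.dateTime)))

# ...and an environmental reading from another, sharing time and place.
reading = EX["reading/7"]
g.add((reading, RDF.type, EX.EnvironmentalReading))
g.add((reading, EX.location, EX["region/dublin"]))
g.add((reading, EX.recordedAt, Literal("2016-03-01T09:00:00", datatype=XSD.dateTime)))

# Provenance: both entities are attributed to their source files.
g.add((obs, PROV.wasDerivedFrom, EX["source/clinic.csv"]))
g.add((reading, PROV.wasDerivedFrom, EX["source/sensors.xml"]))

# The combined graph can now be queried as a single data source.
q = "SELECT ?s ?t WHERE { ?s <http://example.org/avert/recordedAt> ?t }"
for row in g.query(q):
    print(row.s, row.t)
```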
International Journal of Standardization Research, Jan 2018
The General Data Protection Regulation (GDPR) specifies obligations that shape the way information is collected, shared, provided, or communicated, and grants data subjects the right to receive a copy of their personal data in an interoperable format. The sharing of information between entities affected by the GDPR provides a strong motivation towards the adoption of an interoperable model for the exchange of information and demonstration of compliance. This article explores such an interoperability model through the entities identified by the GDPR and their information flows, along with the relevant obligations. The model categorises the information exchanged between entities, and the article discusses its representation using existing standards. We investigate data provided under the Right to Data Portability to explore interoperability in a real-world use case. The findings demonstrate how the use of common data formats hampers usability due to a lack of context. The article discusses the adoption of contextual metadata using a semantic model of interoperability to remedy these identified shortcomings.
International Journal of Web Information Systems, Nov 6, 2017
Purpose: Typically, tools that map non-RDF data into RDF format rely on the technology native to the source of the data when data transformation is required. Depending on the data format, data manipulation can be performed using underlying technology, such as an RDBMS for relational databases or XPath for XML. For CSV/Tabular data there is no such underlying technology; instead, one must either transform the source data into another format or apply pre-/post-processing techniques. In this paper we evaluate the state of the art in CSV uplift tools. Based on this evaluation, a method that incorporates data transformations into uplift mapping languages by means of functions is proposed and evaluated. Design/methodology/approach: In order to evaluate the state of the art in CSV uplift tools, we present a comparison framework and apply it to such tools. A key feature evaluated in the comparison framework is data transformation functions. We argue that existing approaches for transformation functions are complex, in that a number of steps and tools are required. Our proposed method, FunUL, in contrast, defines functions independently of the source data being mapped into RDF, as resources within the mapping itself. Findings: The approach was evaluated using two typical real-world use cases. We compared how well our approach and others (that include transformation functions as part of the uplift mapping) could implement an uplift mapping from CSV/Tabular data into RDF. This comparison indicates that our approach performs well for these use cases. Originality/value: This paper presents a comparison framework and applies it to the state of the art in CSV uplift tools. Furthermore, we describe FunUL, which, unlike other related work, defines functions as resources within the uplift mapping itself, integrating data transformation functions and mapping definitions. This makes the generation of RDF from source data transparent and traceable. Moreover, since functions are defined as resources, they can be reused multiple times within mappings.
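The gist of the FunUL approach, declaring a transformation function once as a named part of the mapping and invoking it from several term maps, can be sketched in plain Python. This is an illustrative re-implementation of the concept, not FunUL's actual syntax or engine; all names, columns and URIs are invented.

```python
# Sketch of the FunUL idea: transformation functions live inside the
# mapping as named, reusable resources instead of in separate
# pre/post-processing steps. Conceptual illustration only.
import csv
import io

# Functions are declared once, as part of the mapping, and can be
# referenced by several term maps (reuse within the same mapping).
functions = {
    "trim_upper": lambda v: v.strip().upper(),
    "to_year": lambda v: v.split("-")[0],
}

# A toy mapping: column -> (predicate, optional function reference).
mapping = {
    "name": ("http://example.org/name", "trim_upper"),
    "dob": ("http://example.org/birthYear", "to_year"),
}

data = io.StringIO("name,dob\n ada lovelace ,1815-12-10\n")

for i, row in enumerate(csv.DictReader(data)):
    subject = f"http://example.org/person/{i}"
    for column, (predicate, fn) in mapping.items():
        value = row[column]
        if fn is not None:
            value = functions[fn](value)  # transformation at uplift time
        print(f'<{subject}> <{predicate}> "{value}" .')
```

Because the function is part of the mapping document rather than an external script, the same mapping fully documents how each RDF term was produced, which is what makes the uplift transparent and traceable.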
Journal on Data Semantics, Sep 27, 2016
Collaborative ontology-engineering methods usually prescribe a set of processes, activities, types of stakeholders and the roles each stakeholder plays in these activities. We, however, believe that the stakeholder community of each ontology-engineering project is different and that one can therefore observe different types of user behavior. It may thus very well be that the prescribed set of stakeholder types and roles does not suffice. If one were able to identify these user behavior types, which we call user profiles, one could complement or revisit those predefined roles. For instance, those user profiles can be used to provide customized interfaces for optimizing activities in certain ontology-engineering projects. We present a method that discovers different user profiles based on the interactions users have with each other in a collaborative ontology-engineering environment. Our approach clusters the users based on the types of interactions they perform, which are retrieved from datasets annotated with an interaction ontology, built on top of SIOC, that we have developed. We demonstrate our method using the databases of two instances of the GOSPL ontology-engineering tool. The databases contain the interactions of two distinct ontology-engineering projects involving 42 and 36 users, respectively. For each dataset, we discuss the findings by analyzing the different clusters. We found that we are able to discover different user profiles, indicating that the approach we have taken is viable, though more experiments are needed to validate the results.
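The clustering step can be pictured as follows: each user is reduced to a vector of counts per interaction type, and those vectors are clustered into profiles. A minimal sketch, assuming invented interaction types and scikit-learn's k-means; the paper's own feature set and clustering algorithm may differ.

```python
# Sketch: cluster users by the types of interactions they perform.
# Interaction types and the use of k-means are assumptions made for
# illustration; the paper's actual setup may differ.
from collections import Counter
from sklearn.cluster import KMeans

# (user, interaction_type) pairs as they might be extracted from an
# annotated interaction log.
log = [
    ("alice", "propose"), ("alice", "propose"), ("alice", "vote"),
    ("bob", "comment"), ("bob", "comment"), ("bob", "comment"),
    ("carol", "vote"), ("carol", "vote"), ("carol", "propose"),
]

types = ["propose", "comment", "vote"]
users = sorted({u for u, _ in log})

# One count vector per user: how often they performed each type.
counts = {u: Counter(t for v, t in log if v == u) for u in users}
X = [[counts[u][t] for t in types] for u in users]

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for user, label in zip(users, labels):
    print(user, "-> profile", label)
```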
International Journal on Digital Libraries, Jul 1, 2016
Irish Record Linkage 1864-1913 is a multidisciplinary project that started in 2014, aiming to create a platform for analyzing events captured in historical birth, marriage and death records by applying semantic technologies for annotating, storing and inferring information from the data contained in those records. This enables researchers to, among other things, investigate to what extent maternal and infant mortality rates were underreported. We report on the semantic architecture, provide motivation for the adoption of RDF and Linked Data principles, and elaborate on the ontology construction process, which was influenced by the requirements of both the digital archivists and the historians. Concerns of digital archivists include the preservation of the archival record and following best practices in preservation, cataloguing and data protection. The historians in this project wish to discover certain patterns in those vital records. An important aspect of the semantic architecture is the clear separation of concerns that reflects those distinct requirements, the transcription and archival authenticity of the register pages on the one hand and the interpretation of the transcribed data on the other, which led to the creation of two distinct ontologies and knowledge bases. The advantage of this clear separation is that the transcription of register pages resulted in a reusable dataset fit for other research purposes. These transcriptions were enriched with metadata according to best practices in archiving for ingestion in suitable long-term digital preservation platforms. We are grateful to the Registrar General of Ireland for permitting us to use the rich digital content contained in the vital records for the purposes of this research project.
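The separation of concerns described here, one knowledge base for faithful transcription and one for interpretation, can be sketched as two linked graphs. All vocabulary terms below are placeholders invented for illustration and do not reflect the project's actual ontologies.

```python
# Sketch of the two-knowledge-base separation: a transcription graph
# preserves the register page verbatim (noise included), while an
# interpretation graph holds derived statements that point back to it.
# All vocabulary terms here are hypothetical.
from rdflib import Graph, Literal, Namespace, RDF

TRANS = Namespace("http://example.org/transcription/")
INTERP = Namespace("http://example.org/interpretation/")

transcription = Graph()
entry = TRANS["entry/1864/123"]
transcription.add((entry, RDF.type, TRANS.RegisterEntry))
# The raw value is kept as-is, misspelling and all, for authenticity.
transcription.add((entry, TRANS.causeOfDeathAsWritten, Literal("Bronchittis")))

interpretation = Graph()
death = INTERP["death/123"]
interpretation.add((death, RDF.type, INTERP.DeathEvent))
# The interpreted statement normalizes the value...
interpretation.add((death, INTERP.causeOfDeath, INTERP["disease/bronchitis"]))
# ...and records which transcription it was derived from.
interpretation.add((death, INTERP.interpretedFrom, entry))

print(interpretation.serialize(format="turtle"))
```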
Journal of Theoretical and Applied Electronic Commerce Research, May 2014
We report on the results of the application of a method and tool for ontology construction in the research information domain, held in the context of an open data initiative of Flanders. The method emphasizes the use of natural language descriptions of concepts next to formal descriptions and uses, for the formal definitions, a fact-oriented formalism grounded in natural language. In this experiment, a group of 36 participants was divided into different groups to build ontologies to establish semantic interoperability between autonomously developed research information systems and to annotate the data of an existing system provided by a public administration. User satisfaction with the tool was measured with the Post-Study System Usability Questionnaire. The result of that survey was that the participants were generally pleased with the platform, with its usefulness scoring best. As for the developed ontologies, their use was demonstrated by the applications developed by the participants. The experiment showed that having a formalism grounded in natural language supports the ontology construction process for the stakeholders. The experiment also shows that a method needs to take into account the collaborative building of workflows within ontology projects, as not all ontology-engineering projects are alike.
Journal on Data Semantics, May 29, 2013
Building ontologies for enabling semantic interoperability is one of the branches in which agreement between a heterogeneous group of stakeholders is of vital importance. As agreements are the result of interactions, appropriate methods should take into account the natural language used by the community during those interactions. In this article, we first extend a fact-oriented formalism for the construction of so-called hybrid ontologies. In hybrid ontologies, concepts are described both formally and informally, and the agreements are grounded in community interactions. We furthermore present GOSPL, a collaborative ontology-engineering method on top of this extension, and describe how agreements on formal and informal descriptions are complementary and how they interplay. We show how the informal descriptions can drive the ontology construction process and how commitments from the ontology to the application are exploited to steer the agreement processes. All of the ideas presented in this article have been implemented in a tool and used in an experiment involving 40+ users, of which a discussion is presented.
Book Chapters by Christophe Debruyne
Shaping the Future Through Standardization, 2020
The General Data Protection Regulation (GDPR) has changed the ecosystem of services involving personal data and information. It emphasises several obligations and rights, amongst which the Right to Data Portability requires providing a copy of the given personal data in a commonly used, structured, and machine-readable format, for interoperability. The GDPR thus explicitly motivates the use and adoption of interoperable data formats. This chapter explores the entities and their interactions in the context of the GDPR to provide an information model for the development of interoperable services. The model categorises information and exchanges, and the chapter explores existing standards and efforts towards their use for interoperable interactions. The chapter concludes with an argument for the use and adoption of structured metadata to enable more expressive services through semantic interoperability.
Business Intelligence - Second European Summer School, eBISS 2012, Brussels, Belgium, July 15-21, 2012, Tutorial Lectures, 2013
Conceptual modeling captures descriptions of business entities in terms of their attributes and relations with other business entities. When those descriptions are needed for interoperability tasks between two or more autonomously developed information systems, ranging from the Web of Data, with no a priori known purposes for the data, to Enterprise Information Management, in which organizations agree on (strict) rules to ensure proper business, those descriptions are often captured in a shared formal specification called an ontology. We present the Business Semantics Management (BSM) method, a fact-oriented approach to knowledge modeling grounded in natural language. We first show how fact-oriented approaches differ from other approaches in terms of, amongst others, expressiveness, complexity, and decidability, and how this formalism makes it easier for users to render their knowledge. We then explain the different processes in BSM and how the tool suite supports those processes. Finally, we show how the ontologies can be transformed into other formalisms suitable for particular interoperability tasks. All the processes and examples are taken from industry cases throughout the lecture.
Conference Papers by Christophe Debruyne
Lecture Notes in Computer Science, 2011
For autonomously developed information systems to interoperate in a meaningful manner, ontologies capturing the intended semantics of that interoperation have to be developed by a community of stakeholders in those information systems. As the requirements of the ontology and the ontology itself evolve, so in general will the community, and vice versa. Ontology construction should thus be viewed as a complex activity leading to formalized semantic agreement, involving various social processes within the community, which may translate into a number of ontology evolution operators to be implemented. The hybrid ontologies that emerge in this way need to support both the social agreement processes in the stakeholder communities and the eventual reasoning implemented in the information systems that are governed by these ontologies. In this paper, we discuss formal aspects of the social processes involved, a so-called fact-oriented methodology and formalism to structure and describe these, as well as certain relevant aspects of the communities in which they occur. We also report on a prototypical tool set that supports such a methodology, and on examples of some early experiments.
Lecture Notes in Computer Science, 2016
It is argued that reflecting on in-game performance in a serious game is important for facilitating learning transfer. A way to facilitate such reflection is by means of a so-called debriefing phase. However, a human-facilitated debriefing is expensive, time-consuming and not always possible. Therefore, an automatic self-debriefing facility for serious games would be desirable. However, a general approach for creating such an automatic self-debriefing system for serious games does not exist. As a first step towards the development of such a framework, we targeted a specific type of serious game, i.e., games displaying realistic behavior and having multiple possible paths to a solution. In addition, we decided to start with the development of a debriefing system for a concrete case, a serious game about cyberbullying in social networks. In particular, in this paper, we focus on different visualizations that could be used for such an automatic debriefing. We combined textual feedback with three different types of visualizations. A prototype was implemented and evaluated with the goal of comparing the three visualizations and gathering first feedback on usability and effectiveness. The results indicate that the visualizations did help the participants gain a better understanding of the outcome of the game and that there was a clear preference for one of the three visualizations.
Journal on Data Semantics, 2016
Proceedings of the 13th International Conference on Semantic Systems - Semantics2017
Building Information Modelling (BIM) is a key enabler for integrating building data within the building life cycle and is an important aspect in supporting a wide range of use cases related to building navigation, control, sustainability, etc. Open BIM faces several challenges related to standardization, data interdependency, data access, and security. In addition to these technical challenges, there remains the barrier among BIM developers who wish to protect their intellectual property, as full 3D BIM development requires expertise and effort. This means that there is often limited availability of BIM models. In Ireland, the Ordnance Survey Ireland (OSi) has a substantial dataset which includes not only GIS data (polygon footprint, geodetic coordinates), but also additional building-specific data (form and function). In this paper we demonstrate the use of an applied and tested methodology for uplifting GIS data (relational data) into RDF (GeoSPARQL and the OSi ontology) and demonstrate how this data is used for interlinking to other building data with an initial, simple exploratory example taken from DBpedia. By interlinking building data and making it available, new insights about buildings in Ireland can be gained that are currently not possible due to the lack of available data. This is an important step towards the iterative integration of ever more complex BIM models into the wider Web of Data to support the aforementioned use cases.
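The uplift-and-interlink pattern described here can be sketched in a few triples: a building's polygon footprint becomes a GeoSPARQL WKT literal, and the building resource is linked to a DBpedia entity. The ex: resource URIs, properties and coordinates below are invented; only the geo: (GeoSPARQL) and owl: terms are standard, and the DBpedia link stands in for the kind of exploratory interlink the paper describes.

```python
# Sketch: uplifting a GIS record (footprint + form/function) into RDF
# using GeoSPARQL, then interlinking to DBpedia. The ex: terms and the
# coordinates are hypothetical; geo: and owl: terms are standard.
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import OWL

GEO = Namespace("http://www.opengis.net/ont/geosparql#")
EX = Namespace("http://example.org/osi/")

g = Graph()
g.bind("geo", GEO)
g.bind("owl", OWL)

building = EX["building/1"]
geometry = EX["geometry/1"]

g.add((building, RDF.type, EX.Building))
g.add((building, EX.function, Literal("library")))  # form/function data

# Polygon footprint from the relational source as a WKT literal.
g.add((building, GEO.hasGeometry, geometry))
wkt = ("POLYGON((-6.2546 53.3441, -6.2540 53.3441, -6.2540 53.3445, "
       "-6.2546 53.3445, -6.2546 53.3441))")
g.add((geometry, GEO.asWKT, Literal(wkt, datatype=GEO.wktLiteral)))

# Interlink to an external description of the same real-world entity.
g.add((building, OWL.sameAs,
       Namespace("http://dbpedia.org/resource/")["Trinity_College_Dublin"]))

print(g.serialize(format="turtle"))
```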
Knowledge and Information Systems
Data processing is increasingly becoming the subject of various policies and regulations, such as the European General Data Protection Regulation (GDPR) that came into effect in May 2018. One important aspect of the GDPR is informed consent, which captures one's permission for using one's personal information for specific data processing purposes. Organizations must demonstrate that they comply with these policies. The fines that come with non-compliance are of such importance that they have driven research in facilitating compliance verification. The state of the art primarily focuses on, for instance, the analysis of prescriptive models and post-hoc analysis of logs to check whether data processing is compliant with the GDPR. We argue that GDPR compliance can be facilitated by ensuring that datasets used in processing activities are compliant with consent from the very start. The problem addressed in this paper is how we can generate datasets that comply with given consent "just-in-time". We propose...
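The "just-in-time" idea, checking each record against the consent on file for the intended purpose before it enters the generated dataset, can be sketched as a simple filter. This is a conceptual illustration under invented data structures, not the paper's actual mechanism.

```python
# Sketch of consent-aware "just-in-time" dataset generation: a record
# enters the output only if its subject has given (and not withdrawn)
# consent for the requested purpose. Purely illustrative structures.
from dataclasses import dataclass

@dataclass
class Consent:
    subject: str
    purpose: str
    withdrawn: bool = False

consents = [
    Consent("alice", "newsletter"),
    Consent("bob", "newsletter", withdrawn=True),
    Consent("carol", "research"),
]

records = [
    {"subject": "alice", "email": "alice@example.org"},
    {"subject": "bob", "email": "bob@example.org"},
    {"subject": "carol", "email": "carol@example.org"},
]

def has_consent(subject: str, purpose: str) -> bool:
    """True if there is current (non-withdrawn) consent for the purpose."""
    return any(
        c.subject == subject and c.purpose == purpose and not c.withdrawn
        for c in consents
    )

def generate_dataset(purpose: str):
    """Build the dataset just in time, admitting only compliant rows."""
    return [r for r in records if has_consent(r["subject"], purpose)]

print(generate_dataset("newsletter"))  # only alice's record survives
```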
On the Move to Meaningful Internet Systems: OTM 2019 Conferences - Confederated International Conferences: CoopIS, ODBASE, C&TC 2019, Rhodes, Greece, October 21-25, 2019, Proceedings, Oct 21, 2019
The creation of interlinks between Linked Data datasets is key to the creation of a global database. One can create such interlinks in various ways: manually, semi-automatically, and automatically. Quite a few tools exist to facilitate this process in a (semi-)automatic manner, often with support for geospatial data. Nevertheless, it is not uncommon that interlinks need to be created manually, e.g., when interlinks need to be authoritative. In this study, we focus on the manual interlinking of geospatial data using maps. The state of the art uses maps to facilitate the search and visualization of such data. Our contribution is to investigate whether maps are useful for the creation of interlinks. We designed and developed such a tool and set up an experiment in which 16 participants used the tool to create links between different Linked Data datasets. We not only describe the tool but also analyze the data we have gathered. The data suggest that the creation of these interlinks from maps is a viable approach. The data also indicate that people had a harder time dealing with Linked Data principles (e.g., content negotiation) than with the creation of interlinks.
The Semantic Web - 16th International Conference, ESWC 2019, Portoroz, Slovenia, June 2-6, 2019, Proceedings, Jun 2, 2019
Consent is an important legal basis for the processing of personal data under the General Data Protection Regulation (GDPR), which is the current European data protection law. The GDPR provides constraints and obligations on the validity of consent, and provides data subjects with the right to withdraw their consent at any time. Determining and demonstrating compliance with these obligations requires information on how the consent was obtained, used, and changed over time. Existing work demonstrates the feasibility of semantic web technologies in modelling information and determining compliance for the GDPR. Although these address consent, they currently do not model all the information associated with it. In this paper, we address this by first presenting our analysis of the information associated with consent under the GDPR. We then present GConsent, an OWL2-DL ontology for the representation of consent and its associated information, such as provenance. The paper presents the methodology used in the creation and validation of the ontology as well as an example use case demonstrating its applicability. The ontology and this paper can be accessed online at https://w3id.org/GConsent.
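To give a flavour of what such a representation captures (who consented, to what, for which purpose, and how that state changed over time), here is a minimal RDF sketch. The ex: class and property names are placeholders and are not taken from the published GConsent vocabulary; consult https://w3id.org/GConsent for the actual terms.

```python
# Sketch of representing a consent instance and its provenance in RDF.
# The ex: terms are placeholders; the actual classes and properties are
# defined by the GConsent ontology at https://w3id.org/GConsent.
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import XSD

EX = Namespace("http://example.org/consent/")

g = Graph()
consent = EX["consent/1"]

g.add((consent, RDF.type, EX.Consent))
g.add((consent, EX.dataSubject, EX["person/alice"]))
g.add((consent, EX.forPurpose, EX["purpose/newsletter"]))
g.add((consent, EX.personalData, EX["data/email"]))
g.add((consent, EX.status, EX.Given))
g.add((consent, EX.givenAt,
       Literal("2018-05-25T00:00:00", datatype=XSD.dateTime)))

# A later withdrawal does not erase history: a new consent state is
# recorded, preserving provenance of how consent changed over time.
withdrawal = EX["consent/1/state/2"]
g.add((withdrawal, RDF.type, EX.Consent))
g.add((withdrawal, EX.status, EX.Withdrawn))
g.add((withdrawal, EX.previousState, consent))

print(g.serialize(format="turtle"))
```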
2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL), Jun 2, 2019
By interlinking internal Linked Data (LD) entities to related LD entities published by authoritative creators and holders of data, libraries have the potential to expose their collections to a larger audience and to allow for richer user searches. While increasing numbers of libraries are devoting time to publishing LD, the full potential of these datasets has not been explored due to limited LD interlinking. In 2018 we conducted a survey which explored the position of Information Professionals (IPs), such as librarians, archivists and cataloguers, with regard to LD. Results indicated that IPs find the process of data interlinking to be a particularly challenging step in the creation of Five Star LD. Consequently, we developed NAISC, an interlinking approach designed specifically for the library domain, aimed at facilitating increased IP engagement in the LD interlinking process. Our paper provides an overview of the design and user evaluation of NAISC. Results indicated that IPs found NAISC easy to use and useful for creating LD interlinks.
2019 13th International Conference on Research Challenges in Information Science (RCIS), May 29, 2019
In this paper, we argue that there is a gap to be bridged between the development and maintenance of services and the various internal and external policies that emerge and evolve outside of these systems. To bridge this gap, we propose a semantic model, i.e., an ontology, for representing data flows and linking them with structured representations of the data that is processed (datasets, databases, queries, etc.). Data Flow Diagramming is a technique for capturing the various data and information flows between an information system and external stakeholders as well as within such a system. This technique is used in the analysis phase of information systems development and captures the inputs and outputs of various processes. Our model allows these data flows to be represented and linked with structured representations of the data that is to be used, consulted, processed, etc. We demonstrate that this model can facilitate compliance verification processes of (intelligent) systems by allowing these flows to be analyzed. The ontology has been made available according to best practices in the field, and we furthermore position our contributions within the state of the art.
2019 IEEE 13th International Conference on Semantic Computing (ICSC), 2019
The development of intelligent (e.g., AI-based) applications increasingly requires governance models and processes, as financial and legal sanctions are more and more being associated with violations of policies. We propose an ontology representing the informed consent that was collected by an organization and argue how it can be used to assess a dataset prior to its use in any type of data processing activity. We demonstrate the utility of our ontology using a particular scenario, where datasets are generated "just in time" for a particular purpose such as sending newsletters. This scenario shows how data processing activities can be managed in such a way as to support compliance verification. This paper furthermore compares the contributions to related work and positions them within prior work concerned with the broader problem of prescribing and analyzing compliance.
On the Move to Meaningful Internet Systems. OTM 2018 Conferences - Confederated International Conferences: CoopIS, C&TC, and ODBASE 2018, Valletta, Malta, October 22-26, 2018, Proceedings, Part II, 2018
Data processing is increasingly the subject of various internal and external regulations, such as the GDPR, which has recently come into effect. Instead of assuming that such processes avail of data sources (such as files and relational databases), we approach the problem in a more abstract manner and view these processes as taking datasets as input. These datasets are then created by pulling data from various data sources. Taking a W3C Recommendation for prescribing the structure of and for describing datasets, we investigate an extension of that vocabulary for the generation of executable R2RML mappings. This results in a top-down approach where one prescribes the dataset to be used by a data process and where to find the data, and where that prescription is subsequently used to retrieve the data for the creation of the dataset "just in time". We argue that this approach to the generation of an R2RML mapping from a dataset description is the first step towards policy-aware mappings, where the generation takes regulations into account to generate mappings that are compliant. In this paper, we describe how one can obtain an R2RML mapping from a data structure definition in a declarative manner using SPARQL CONSTRUCT queries, and we demonstrate this using a running example. Some of the more technical aspects are also described.
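The declarative generation step, a SPARQL CONSTRUCT query turning a dataset prescription into an executable R2RML mapping, can be sketched as follows. The ex: prescription vocabulary is invented for illustration; the rr: namespace is the real R2RML one, and the actual queries in the paper are considerably more elaborate.

```python
# Sketch: a SPARQL CONSTRUCT query turns a dataset prescription into an
# executable R2RML mapping. The ex: vocabulary is hypothetical; rr: is
# the standard R2RML namespace.
from rdflib import Graph

prescription = Graph()
prescription.parse(data="""
@prefix ex: <http://example.org/dsd/> .
ex:peopleDataset ex:fromTable "PERSON" ;
    ex:subjectTemplate "http://example.org/person/{ID}" .
""", format="turtle")

construct = """
PREFIX ex: <http://example.org/dsd/>
PREFIX rr: <http://www.w3.org/ns/r2rml#>
CONSTRUCT {
    ?ds a rr:TriplesMap ;
        rr:logicalTable [ rr:tableName ?table ] ;
        rr:subjectMap  [ rr:template  ?template ] .
} WHERE {
    ?ds ex:fromTable ?table ;
        ex:subjectTemplate ?template .
}
"""

# The result is itself an RDF graph: an R2RML mapping ready to be fed
# to an R2RML engine against the prescribed table.
mapping = prescription.query(construct).graph
print(mapping.serialize(format="turtle"))
```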
Human Mental Workload: Models and Applications - Second International Symposium, H-WORKLOAD 2018, Amsterdam, The Netherlands, September 20-21, 2018, Revised Selected Papers, 2018
Self-reporting procedures have been largely employed in the literature to measure the mental workload experienced by users when executing a specific task. This research proposes the adoption of these mental workload assessment techniques for the task of creating uplift mappings in Linked Data. A user study has been performed to compare the mental workload of "manually" creating such mappings, using a formal mapping language and a text editor, to the use of a visual representation, based on the block metaphor, that generates these mappings. Two subjective mental workload instruments, namely the NASA Task Load Index and the Workload Profile, were applied in this study. Preliminary results show the reliability of these instruments in measuring the perceived mental workload for the task of creating uplift mappings. Results also indicate that participants using the visual representation achieved smaller and more consistent mental workload scores.
JCDL '18: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries, 2018
The aim of this study was to explore the benefits and challenges of using Linked Data (LD) in Libraries, Archives and Museums (LAMs) as perceived by Information Professionals (IPs). The study also aimed to gain an insight into potential solutions for overcoming these challenges, with a particular focus on the idea of LD tooling for IPs as a means of doing so. Data was collected via a questionnaire which was completed by 185 IPs from a range of LAM institutions. Results indicated that there are many challenges relating to the usability and utility of LD tooling that create barriers to IPs engaging with LD. The study shows that LD tools designed with the workflows and expertise of IPs in mind could help break down these barriers.
2018 IEEE 12th International Conference on Semantic Computing (ICSC), 2018
A significant part of the Linked Data web is achieved by converting non-RDF resources into RDF. Even though several approaches and mapping languages have been proposed in the literature, the knowledge required for such a task is still substantial. In prior work, we proposed a visual representation based on the block metaphor and applied it to the W3C Recommendation R2RML. In this paper, we describe a new implementation of this method, called Juma Uplift, that is capable of generating mapping definitions for different uplift mapping languages while remaining fully compliant with each particular uplift mapping specification. Preliminary findings indicate that Juma Uplift is expressive enough to generate accurate mappings for the two syntactically distinct mapping languages under examination, R2RML and SML.
The Semantic Web - ISWC 2017 - 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part II, 2017
Data.geohive.ie aims to provide an authoritative service for serving Ireland's national geospatial data as Linked Data. The service currently provides information on Irish administrative boundaries and the boundaries used for the Irish 2011 census. The service is designed to support two use cases: serving boundary data of geographic features at various levels of detail and capturing the evolution of administrative boundaries. In this paper, we report on the development of the service and elaborate on some of the informed decisions concerning the URI strategy and the use of named graphs for the support of the aforementioned use cases, relating those to similar initiatives. While clear insights on how the data is being used are still being gathered, we provide examples of how and where this geospatial Linked Data dataset is used.
Research and Advanced Technology for Digital Libraries - 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Thessaloniki, Greece, September 18-21, 2017, Proceedings, 2017
It is a best practice to avoid the use of RDF collections and containers when publishing Linked Data, but sometimes vocabularies such as MADS-RDF prescribe these constructs. The Library of Trinity College Dublin is building a new asset management system backed by a relational database and wants to publish their metadata according to these vocabularies. We chose to use the W3C Recommendation R2RML to relate the database to RDF datasets, but R2RML unfortunately does not provide support for collections and containers. In this paper, we propose an extension to R2RML to address this problem. We support gathering collections and containers from different fields in a row of a (logical) table as well as across rows. We furthermore prescribe how the extended R2RML engine deals with named graphs in the RDF dataset as well as with empty sets. Examples and our demonstration on a part of the Library's database prove the feasibility of our approach.
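Standard R2RML term maps emit individual triples, so the head/rest structure of an RDF collection has to be assembled by the extended engine. The sketch below shows the kind of output in question, gathering several fields of one row into a well-formed rdf:List using rdflib; it illustrates the target structure, not the proposed extension's syntax, and the ex: vocabulary and row data are invented.

```python
# Sketch: the kind of output an R2RML extension for collections must
# produce -- several fields of a (logical) table row gathered into one
# well-formed rdf:List. Vocabulary and row data are hypothetical.
from rdflib import BNode, Graph, Literal, Namespace
from rdflib.collection import Collection

EX = Namespace("http://example.org/lib/")

row = {"id": "b1", "topic1": "History", "topic2": "Ireland", "topic3": ""}

g = Graph()
book = EX[f"book/{row['id']}"]

# Gather the non-empty fields of this row, preserving order and
# skipping empties (one of the policies such an extension must pin down).
items = [Literal(row[k]) for k in ("topic1", "topic2", "topic3") if row[k]]

head = BNode()
Collection(g, head, items)          # builds rdf:first/rdf:rest/rdf:nil
g.add((book, EX.topics, head))

print(g.serialize(format="turtle"))
```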
iiWAS '16: Proceedings of the 18th International Conference on Information Integration and Web-based Applications and Services, 2016
Typically, tools that map non-RDF data into RDF format rely on the technology native to the source of the data when manipulation of data during the mapping is required. Depending on the data format, data manipulation can be performed using underlying technology, such as an RDBMS for relational databases or XPath for XML. For CSV/Tabular data there is no such underlying technology; instead, transforming the source data into another format or pre-/post-processing techniques are used. As part of this paper, we present a comparison framework for the state of the art in converting CSV/Tabular data into RDF, where a key feature evaluated is transformation functions. We argue that existing approaches for transformation functions in such tools are complex, in the number of steps and tools involved, and therefore not as traceable and transparent as one would like. We tackle these problems by defining a more generic, usable and amenable method to incorporate functions into uplift mapping languages, called FunUL. As a proof of concept, we show an implementation of our method. Moreover, using a real-world Digital Humanities case study, we compare our approach with other approaches that we have identified to include transformation functions as part of the mapping for CSV/Tabular data.
Proceedings of the 17th International Conference on Information Integration and Web-based Applications &Services - iiWAS '15, 2015
Ontology matching and mapping is concerned with discovering correspondences between two ontologies to create a mapping that enables applications to relate, interlink or integrate data. The construction of such mappings is not trivial, as they are created to serve a purpose and result from collaboration between the different stakeholders. Current ontology-mapping metadata formats only capture a glimpse of the mapping construction process by focusing on the exchange of mappings, and they provide only some limited properties to facilitate reuse and discovery. For mapping governance to be possible, we argue that a suitable metadata model, which is presented in this paper, needs to capture all aspects of the ontology-mapping lifecycle: from the inception of a project to the execution of its mappings. This allows one to formulate queries that not only facilitate discovery and reuse, but also queries that allow one to govern ontology-mapping projects and render the construction processes more transparent and traceable.
Proceedings of the Second International Workshop on Semantic Web for Scientific Heritage co-located with 13th Extended Semantic Web Conference (ESWC 2016), 2016
Linked Data technologies are increasingly being implemented to enhance cataloguing workflows in memory institutions such as libraries, archives and museums. These institutions actively seek authoritative Linked Data datasets to enhance their metadata. This talk presents the results of Linked Logainm, a collaborative project that aimed to create a Linked Data dataset out of Logainm.ie, an online database holding the authoritative hierarchical list of Irish and English language place names in Ireland. We will describe the process of generating RDF from XML, elaborate on the discovery and creation of links with other Linked Data datasets, and report on how the dataset we created was used to enhance the National Library of Ireland's MARCXML metadata records for its Longfield maps collection. We conclude this talk by describing the potential benefits of Linked Logainm, the applications that can be built on top of this dataset, and a reflection on the work we have conducted.
4th IEEE International Conference on Digital Ecosystems and Technologies (DEST 2010), Jan 13, 2010
Semantic Web, Social Web, and new economic challenges are causing major shifts in the pervasive fabric that the internet has become, in particular for the business world. The internet's new role as a participatory medium and its ubiquity lead to dense tri-sortal communities of humans and businesses mixed with computer systems, semantically interoperating in a well-defined sense. Many of the challenges and ongoing (r)evolutions appear to produce as yet seemingly contradictory requirements, and thus potentially very interesting research areas. We argue that linguistics, community-based real-world "social" semantics and pragmatics, scalability, the tri-sortal nature of the communities involved, the balance between usability and reusability, and the methodological requirements for non-disruptive adoption by enterprises of the new technologies provide vectors for fundamental computer science research, for interesting new artefacts, and for new valorisations of enterprise interoperability. We posit that one such development will likely result in hybrid ontologies and their supporting social implementation environments (such as semantic wikis) that accommodate the duality and co-existence of formal reasoning requirements inside systems on the one hand, and of declarative knowledge manipulation underlying human communication and agreement on the other.
Using Semantic Technologies to Create Virtual Families from Historical Vital Records, 1st EUON Workshop, 2014
We report on the semantic architecture and ontology creation of the Irish Record Linkage 1864-1913 project, which aims to create a platform to reconstitute families and create longitudinal health histories by applying semantic technologies to annotate, store and analyze the data contained in historical birth, marriage and death records. This enables researchers to, for instance, investigate to what extent maternal and infant mortality rates were underreported. We make a clear distinction between (i) the curation of the encoded data contained in the records as well as its long-term preservation, managed by a digital archivist, and (ii) the analysis and interpretation of that data to answer specific questions for historians and researchers. To support the two processes and to maintain this clear separation of concerns, two distinct but interrelated knowledge bases (KBs) are developed. The knowledge engineer is responsible for setting up the semantic infrastructure, and interprets the research questions of the historians into queries for the enriched KB. A first KB is set up to contain the records encoded in RDF using "flat" ontologies that capture the information contained in those records in a lexical manner. While encoding, any noise such as errors or missing values is respected to preserve the original historical record and provenance. The forms of these records were "lifted" into an ontology, as available ontologies for vital records were, to the best of our knowledge, non-existent. We then populate more expressive ontologies capturing domain knowledge to create the second and richer KB from the RDF records. Existing ontologies such as the Persona Vocabulary were adopted, and extensions specific to this project were developed. This KB was partly populated by adopting ontology matching techniques to detect correspondences across different records and by creating links to external datasets for additional contextual information (e.g., Logainm.ie, the authority database for historical and contemporary Irish place names). Additional knowledge includes medical information expressed as rules based on the International Classification of Diseases. This allows historians to analyze the content in the records using different parameters and different assumptions, e.g., by using different or newer bodies of medical knowledge. We thank the Registrar General of Ireland for permitting us to use the rich digital content contained in the vital records.
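The separation between the record KB and the richer, interpreted KB can be illustrated with a small rule, sketched below as a SPARQL CONSTRUCT in Python with rdflib; the vocabulary is hypothetical, but the pattern (derive an interpretation without altering the preserved record) follows the abstract. "Phthisis" is a historical term for tuberculosis.

```python
# Sketch: medical knowledge as a rule over the record KB. The CONSTRUCT
# derives triples for the richer KB; the ex: vocabulary is hypothetical.
from rdflib import Graph

records = Graph()
records.parse(data="""
@prefix ex: <http://example.org/irl/> .
ex:death42 ex:causeOfDeathText "phthisis" .
""", format="turtle")

RULE = """
PREFIX ex: <http://example.org/irl/>
CONSTRUCT { ?d ex:interpretedCause ex:Tuberculosis . }
WHERE { ?d ex:causeOfDeathText ?t . FILTER(LCASE(STR(?t)) = "phthisis") }
"""

enriched = records.query(RULE).graph  # derived triples; records stay intact
print(enriched.serialize(format="turtle"))
```

Swapping in a different rule set, e.g., one based on a newer ICD revision, re-derives the second KB without touching the preserved records.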
Companion of The 2019 World Wide Web Conference, WWW 2019, San Francisco, CA, USA, May 13-17, 2019, 2019
As the Web of Data grows, so does the need to establish the quality and trustworthiness of its contents. Increasing numbers of libraries are publishing their metadata as Linked Data (LD). As these institutions are considered authoritative sources of information, it is likely that library LD will be treated with increased credibility over data published by other sources. However, in order to establish this trust, the provenance of library LD must be provided. In 2018 we conducted a survey which explored the position of Information Professionals (IPs), such as librarians, archivists and cataloguers, with regard to LD. Results indicated that IPs find the process of LD interlinking particularly challenging. In order to publish authoritative interlinks, provenance data for the description and justification of the links is required. As such, the goal of this research is to provide a provenance model for the LD interlinking process that meets the requirements of library metadata standards. Many current LD technologies are not accessible to non-technical experts or attuned to the needs of the library domain. By designing a model specifically for libraries, with input from IPs, we aim to facilitate this domain in the process of creating interlink provenance data.
Proceedings of the 3rd International Workshop on Geospatial Linked Data and the 2nd Workshop on Querying the Web of Data co-located with 15th Extended Semantic Web Conference (ESWC 2018), Heraklion, Greece, June 3, 2018, 2018
Challenges in interlinking two datasets have been studied extensively in the state of the art in terms of the complexity of the matching process used for interlinking. However, the challenges in gathering the input datasets to be interlinked and finalizing a link specification, which constitute the preprocessing phase of the link discovery (LD) workflow, are mostly overlooked. In this paper, we highlight these challenges through a case study of interlinking the Ordnance Survey Ireland (OSi) datasets with the geospatial data in the Linked Open Data (LOD) cloud. Our study shows that designing a query and using an interface to retrieve the instances to be interlinked from a SPARQL endpoint is difficult. In finalizing a link specification, additional properties can be critical when labels are ambiguous. Also, the selection of similarity measures to compare these properties is unintuitive. These challenges show that interlinking datasets is not very straightforward, even with the availability of link discovery tools. Since the challenges in the preprocessing phase are not obvious, the analysis documented here can provide guidance in undertaking a project to interlink two datasets.
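The "gathering the input datasets" step amounts to something like the following sketch, which pulls candidate instances from a SPARQL endpoint with SPARQLWrapper; the endpoint, class and limit are placeholders, and shaping this query is precisely the difficulty the paper documents.

```python
# Sketch of retrieving instances to be interlinked; endpoint and class
# URIs are hypothetical placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://example.org/sparql")
sparql.setQuery("""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?label WHERE {
  ?s a <http://example.org/Townland> ;
     rdfs:label ?label .
} LIMIT 100
""")
sparql.setReturnFormat(JSON)

for binding in sparql.query().convert()["results"]["bindings"]:
    print(binding["s"]["value"], binding["label"]["value"])
```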
Proceedings of the Third International Workshop on Visualization and Interaction for Ontologies and Linked Data co-located with the 16th International Semantic Web Conference (ISWC 2017), Vienna, Austria, October 22, 2017, 2017
R2RML is a W3C Recommendation that provides for the declaration of mappings to generate RDF datasets from relational databases. One issue that hampers its adoption is the manual effort needed in the creation and maintenance of such mappings. To tackle this problem, various initiatives have started to emerge. One direction is to investigate how different representations can facilitate the creation and maintenance of such mappings for a wider set of stakeholders. In prior work, we proposed a visual representation of R2RML mappings based on the block metaphor that is compliant with the specification. This representation has been integrated within a tool for creating and managing R2RML mappings. In this paper, we report on a user study evaluating the proposed visual representation with stakeholders with different background knowledge. Preliminary findings indicate that participants were able to create accurate mappings and that the visual representation achieves good results in standard usability evaluations.
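For context, this is roughly the kind of textual R2RML mapping that the block-based visual representation stands in for; the table and column names are invented, but the rr: terms are from the W3C vocabulary.

```python
# A small, self-contained R2RML mapping parsed with rdflib to show the
# textual artefact behind the visual notation; PERSON/NAME are made up.
from rdflib import Graph

R2RML = """
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.org/> .

<#PersonMap> a rr:TriplesMap ;
  rr:logicalTable [ rr:tableName "PERSON" ] ;
  rr:subjectMap [ rr:template "http://example.org/person/{ID}" ;
                  rr:class ex:Person ] ;
  rr:predicateObjectMap [ rr:predicate ex:name ;
                          rr:objectMap [ rr:column "NAME" ] ] .
"""

g = Graph().parse(data=R2RML, format="turtle",
                  publicID="http://example.org/mapping")
print(f"parsed {len(g)} mapping triples")
```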
Proceedings of the 5th Workshop on Society, Privacy and the Semantic Web - Policy and Technology (PrivOn2017) co-located with 16th International Semantic Web Conference (ISWC 2017), Vienna, Austria, October 22, 2017, 2017
The General Data Protection Regulation (GDPR) imposes greater restrictions on obtaining valid user consent involving the use of personal data. A semantic model of consent can make the concepts of consent explicit, establish a common understanding and enable the re-use of consent. Forming a semantic model of consent will therefore satisfy the GDPR requirements of specificity and unambiguity, and is an important step towards ensuring compliance. In this paper, we discuss obtaining an open vocabulary for expressing consent, leveraging existing semantic models of provenance, processes, permission and obligations. We also present a reference architecture for the management of data processing according to consent permission. This data management model utilizes the open vocabulary of consent and incorporates the change of context into the data processing activity. By identifying and incorporating changes to the relational context between data controllers and data subjects into the data processing model, it aims to improve the integration of data management across different information systems specifically adhering to the GDPR, and to help controllers demonstrate compliance.
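A consent record in such an open vocabulary might look like the hedged sketch below, which leans on PROV-O for provenance as the abstract suggests; the cv: terms are hypothetical stand-ins, not the vocabulary proposed in the paper.

```python
# Hedged sketch of a consent record; cv: terms are invented stand-ins,
# while prov:generatedAtTime is a genuine PROV-O property.
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import PROV, XSD

CV = Namespace("http://example.org/consent#")
EX = Namespace("http://example.org/")

g = Graph()
g.bind("prov", PROV)
g.bind("cv", CV)

g.add((EX.consent1, RDF.type, CV.Consent))
g.add((EX.consent1, CV.dataSubject, EX.alice))
g.add((EX.consent1, CV.permittedPurpose, EX.newsletter))
g.add((EX.consent1, PROV.generatedAtTime,
       Literal("2017-10-22T10:00:00", datatype=XSD.dateTime)))

print(g.serialize(format="turtle"))
```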
Proceedings of the 12th International Workshop on Ontology Matching co-located with the 16th International Semantic Web Conference (ISWC 2017), Vienna, Austria, October 21, 2017, 2017
This paper describes an extension to the M-Gov framework that captures queryable metadata about the matcher tools that have been utilized, the users involved, and the discussions of the users during the generation of alignments. This increases the traceability of an alignment creation process and enables an evaluator to more deeply interpret and evaluate an alignment, e.g., for reuse or maintenance. This requires precise information about the alignments being encoded and the decisions undertaken during their creation. This information is not captured by state-of-the-art approaches in a queryable format. The paper also describes an experiment that was undertaken to examine the effectiveness of our approach in enabling traceability in the alignment creation process. In the experiment, stakeholders created an alignment between two different datasets. The results indicate that the users were 93% accurate while creating the alignment. The major traceability achievements demonstrated for the test groups were: 1) the level of participation of the various users of a group during alignment creation; 2) the correspondences most discussed by users of a group; and 3) the accuracy of a group in creating an alignment.
GeoRich '17: Proceedings of the Fourth International ACM Workshop on Managing and Mining Enriched Geo-Spatial Data, May 2017, 2017
The concept of "location" provides a useful dimension to explore, align, combine, and analyze data. Though one can rely on bespoke GIS systems to conduct data analyses, we aim to investigate the feasibility of using Semantic Web technologies to leverage the exploration and enrichment of data in CSV files with the vast amount of geographic and geospatial data available on the Linked Data Web. In this paper, we propose a lightweight method and set of tools for: uplift, transforming non-RDF resources into RDF documents; creating links between RDF datasets; client-side processing of geospatial functions; and downlift, transforming (enriched) RDF documents back into a non-RDF format. With this approach, people who wish to avail of the spatial dimension in data can do so from their client (e.g., in a browser) without the need to rely on bespoke technology. This could be of great utility for decision-makers and scholars, amongst others. We applied our approach to datasets that are hosted on the Irish open data portal, and combined it with authoritative geospatial data made available by Ordnance Survey Ireland (OSi). Albeit aware that our approach cannot compete with specialist tools, we do demonstrate its feasibility. Though currently conducted for enriching datasets hosted on the Irish open data portal, future work will look into broader governance and provenance aspects of managing geospatially enriched datasets.
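The client-side geospatial processing mentioned above boils down to evaluating spatial functions locally; the sketch below does a within-polygon test with shapely under made-up coordinates, standing in for the functions applied to the enriched RDF.

```python
# Client-side spatial test; the boundary and point are invented.
from shapely import wkt
from shapely.geometry import Point

county = wkt.loads(
    "POLYGON((-6.5 53.2, -6.0 53.2, -6.0 53.5, -6.5 53.5, -6.5 53.2))")
amenity = Point(-6.26, 53.35)  # hypothetical coordinate from a CSV row

if county.contains(amenity):
    print("amenity falls inside the boundary; keep the enriched link")
```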
Workshop on Linked Data on the Web co-located with 26th International World Wide Web Conference (WWW 2017), Perth, Australia, April 3rd, 2017, 2017
"Place" is an important concept providing a useful dimension to explore, align and analyze data o... more "Place" is an important concept providing a useful dimension to explore, align and analyze data on the Linked Data Web. ough Linked Data datasets can use standardized geospatial predicates such as GeoSPARQL, access to SPARQL endpoints that supports these is not guaranteed. When not available, one needs to load the data into their own GeoSPARQL-enabled triplestores in order to avail of those predicates. Triple Pa ern Fragments (TPF) is a proposal to make clients more intelligent in processing RDF, thereby lessening the burden carried by servers. In this paper, we propose to extend TPF to support GeoSPARQL. e contribution is a minimal extension of the TPF client that does not rely on a spatial database such that the extension can be run from within a browser. Even though our approach will unlikely outperform GeoSPARQL-enabled triplestores in terms of query execution time, we demonstrate its feasibility by means of a couple of use cases using data provided by data.geohive.ie, an initiative to publish authoritative, high-resolution geospatial data for e Republic of Ireland as Linked Data on the Web. is high-resolution data does cause a lot of network tra c, but related work showed how extending the communication between a TPF client and server reduces the number HTTP calls and some network tra c. e integration of our extension in one such optimization did reduce the overhead. We, however, decided to stick to our rst implementation as it only extended the client in a minimal way. Future work includes investigating how our approach scales, and its usefulness of adding and using a spatial component to datasets.
Workshop on Linked Data on the Web co-located with 26th International World Wide Web Conference (WWW 2017), Perth, Australia, April 3rd, 2017, 2017
In this paper, we argue that layering a question answering system on the Web of Data based on user preferences leads to the derivation of more knowledge from external sources and the customisation of query results based on users' interests. As various users may find different things relevant because of different preferences and goals, we can expect different answers to the same query. We propose a personalised question answering framework for a user to query over Linked Data, which enhances a user query with related preferences stored in his/her user profile with the aim of providing personalised answers. We also propose an extension of the QALD-5 scoring system to define a relevancy metric that measures the similarity of query answers to a user's preferences.
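The relevancy idea can be illustrated with a toy metric: score each candidate answer by the overlap between its topics and the user profile. The Jaccard measure below is a stand-in for exposition, not the proposed QALD-5 extension.

```python
# Toy relevancy metric: Jaccard overlap between answer topics and the
# user's stored preferences; all names and topics are invented.
def relevancy(answer_topics: set[str], preferences: set[str]) -> float:
    union = answer_topics | preferences
    return len(answer_topics & preferences) / len(union) if union else 0.0


profile = {"hiking", "museums", "irish-history"}
answers = {
    "Glendalough": {"hiking", "irish-history"},
    "Temple Bar": {"nightlife"},
}

for name in sorted(answers, key=lambda a: -relevancy(answers[a], profile)):
    print(name, round(relevancy(answers[name], profile), 2))
```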
Computational History and Data-Driven Humanities - Second IFIP WG 12.7 International Workshop, CHDDH 2016, Dublin, Ireland, May 25, 2016, Revised Selected Papers, 2016
We report on the Linked Data platform developed for the administrative boundaries governed by Ordnance Survey Ireland (OSi), as they wished to serve this data as an authoritative Linked Open Data dataset on the Web. To implement this platform, we have adopted best practices and guidelines from industry and academia. We demonstrate how this dataset can be combined with other datasets to add a spatial component to information. We believe that the publication of this dataset not only provides opportunities for third parties (including scholars) in their activities, but that the outcome of this initiative is of importance in itself, as the OSi made the authoritative dataset available. With the current platform deployed, future work will include the inclusion of other (closed) datasets and the investigation of access mechanisms.
2016 IEEE Security and Privacy Workshops, SP Workshops 2016, San Jose, CA, USA, May 22-26, 2016, 2016
Handling personal data in a legally compliant way is an important factor for ensuring the trustworthiness of a service provider. The EU Data Protection Directive (EU DPD) is built in such a way that the outcomes of rules are subject to explanations, contexts with dependencies, and human interpretation. Therefore, the process of obtaining deterministic and formal rules in policy languages from the EU DPD is difficult to fully automate. To tackle this problem, we demonstrate in this paper the use of a Controlled Natural Language (CNL) to encode the rules of the EU DPD in a manner that can be automatically converted into the policy languages XACML and PERMIS. We also show that forming machine-executable rules automatically from the controlled natural language grammar not only has the benefit of ensuring the correctness of those rules, but also has the potential to make the overall process more efficient.
Proceedings of the Workshop on Linked Data on the Web co-located with 25th International World Wide Web Conference (WWW 2016), Montreal, Canada, April 12th, 2016, 2016
Many initiatives have emerged to aid one in publishing structured resources as Linked Data on the Web, with one of the major achievements being the R2RML W3C Recommendation. R2RML and its dialects assume a certain underlying technology (e.g., Core SQL 2008). This means that domain-specific data transformations, such as transforming geospatial coordinates, rely either on that underlying technology or on data pre-processing steps. We argue that one can incorporate and subsequently share that procedural domain knowledge in such mappings. Such an extension would make certain pre-processing steps redundant. One can furthermore attach metadata to these functions, which can be published as well. In this paper, we present R2RML-F, an extension to R2RML that adopts ECMAScript for capturing domain knowledge and for which we have developed a prototype. We demonstrate the viability of the approach with a demonstration and compare its performance with different mappings in some initial experiments. Our preliminary results suggest that there is little or no overhead with respect to relying on the underlying technology.
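The flavour of such a mapping is sketched below: an ECMAScript function travels inside the mapping document itself. The rrf: property names here are illustrative placeholders; consult the paper for the exact R2RML-F vocabulary.

```python
# Illustrative R2RML-F-style mapping; the rrf: namespace and property
# names are placeholders, not the published vocabulary.
from rdflib import Graph

MAPPING = """
@prefix rr:  <http://www.w3.org/ns/r2rml#> .
@prefix rrf: <http://example.org/rrf#> .

<#ToDecimalDegrees> a rrf:Function ;
  rrf:functionName "toDecimalDegrees" ;
  rrf:functionBody
    "function toDecimalDegrees(d, m, s) { return Number(d) + m/60 + s/3600; }" .

<#SiteMap> a rr:TriplesMap ;
  rr:logicalTable [ rr:tableName "SITES" ] ;
  rr:subjectMap [ rr:template "http://example.org/site/{ID}" ] .
"""

g = Graph().parse(data=MAPPING, format="turtle",
                  publicID="http://example.org/mapping")
print(f"parsed {len(g)} triples, including the embedded function body")
```

Since the function is itself a set of triples, it can be published, dereferenced and annotated with metadata like any other resource, which is the sharing argument made above.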
On the Move to Meaningful Internet Systems: OTM 2015 Workshops - Confederated International Workshops: OTM Academy, OTM Industry Case Studies Program, EI2N, FBM, INBAST, ISDE, META4eS, and MSC 2015, Rhodes, Greece, October 26-30, 2015, Proceedings, 2015
In the Irish Record Linkage 1864-1913 (IRL) project, digital archivists transcribe digitized register pages containing vital records into a database, which is then used to generate RDF triples. Historians then use those triples to answer specific research questions on the IRL platform. Though the triples themselves are a highly valuable asset that can be adopted by many, the digitized records and their RDF representations need to be adequately stored and preserved according to best standards and guidelines to ensure they do not get lost over time, a problem that had not been investigated within this project. This paper reports on the creation of Qualified Dublin Core from those triples for ingestion, together with the digitized register pages, into an adequate long-term digital preservation platform and repository. Rather than creating RDF only for the purpose of this project, we demonstrate how we can distill artifacts from the RDF that are fit for discovery, access, and even reuse via that repository, and how we elicit and conserve the knowledge and memories about Ireland, its history and culture contained in those register pages.
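Distilling a preservation-ready description from the project RDF can be pictured as a CONSTRUCT query from project-specific properties to Dublin Core terms; the ex: properties below are hypothetical, while dcterms: is the real DCMI namespace.

```python
# Sketch: derive Dublin Core metadata from project RDF for ingestion;
# the ex: record properties are invented for the example.
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/irl/> .
ex:page7 ex:registeredDate "1885-03-02" ;
         ex:registrarDistrict "Dublin South" .
""", format="turtle")

dc = g.query("""
PREFIX ex: <http://example.org/irl/>
PREFIX dcterms: <http://purl.org/dc/terms/>
CONSTRUCT { ?page dcterms:date ?d ; dcterms:coverage ?place . }
WHERE { ?page ex:registeredDate ?d ; ex:registrarDistrict ?place . }
""").graph

print(dc.serialize(format="turtle"))
```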
On the Move to Meaningful Internet Systems: OTM 2015 Workshops - Confederated International Workshops: OTM Academy, OTM Industry Case Studies Program, EI2N, FBM, INBAST, ISDE, META4eS, and MSC 2015, Rhodes, Greece, October 26-30, 2015, Proceedings, 2015
Linked Data makes available a vast amount of data on the Semantic Web for agents, both human and software, to consume. Linked Data datasets are made available with different ontologies, even when their domains overlap. The interoperability problem that arises when one needs to consume and combine two or more of such datasets to develop a Linked Data application or mashup is still an important challenge. Ontology-matching techniques help overcome this problem. The process, however, often relies on knowledge engineers to carry out the tasks, as they have expertise in ontologies and semantic technologies. It is reasonable to assume that knowledge engineers should require help from domain experts, end users, etc., to contribute to the validation of the results and help distill ontology mappings from these correspondences. However, the current design of ontology-mapping tools does not take into consideration the different types of users expected to be involved in the creation of Linked Data applications or mashups. In this paper, we identify the different users and their roles in the mapping involved in the context of developing Linked Data mashups, and propose a collaborative mapping method in which we prescribe where collaboration between the different stakeholders could, and should, take place. In addition, we propose a tool architecture that brings together an adaptive interface, mapping services, workflow services and agreement services to ease the collaboration between the different stakeholders. This output will be used in an ongoing study on constructing a collaborative mapping platform.
Proceedings of the 1st International IFIP Working Conference on Value-Driven Social Semantics & Collective Intelligence co-located with the 5th Annual ACM Web Science Conference (WebSci 2013), 2015
In this paper, we investigate the relation between Guarino's seminal paper "Formal Ontology and Information Systems" and the DOGMA ontology-engineering framework. As DOGMA is geared towards the development of ontologies for semantic interoperation between autonomously developed and maintained information systems, it follows that the stakeholders in such a project form a community, which adds a social dimension to the ontology project. The goal of this exercise is to examine how the different terminologies and ideas relate to one another, thus providing a reference for clarifying DOGMA's ideas and notation inside Guarino's framework.
Knowledge Representation for Health Care KR4HC 2014, 2014
The Irish Record Linkage 1864-1913 project aims to create a knowledge base containing historical birth, marriage and death records encoded into RDF to reconstitute families and create longitudinal health histories. The goal is to interlink the different persons across these records as well as with supplementary datasets that provide additional context. With the help of knowledge engineers, who create the ontologies and set up the platform, and the digital archivist, who curates, ingests and maintains the RDF, historians will be able to analyse reconstructed "virtual" families of Dublin in the 19th and early 20th centuries, allowing them to address questions about the accuracy of officially reported maternal mortality and infant mortality rates. In the longer term, this platform will allow researchers to investigate how official historical datasets can contribute to modern-day epidemiological planning.
On the Move to Meaningful Internet Systems: OTM 2013 Workshops - Confederated International Workshops: OTM Academy, OTM Industry Case Studies Program, ACM, EI2N, ISDE, META4eS, ORM, SeDeS, SINCOM, SMS, and SOMOCO 2013, Graz, Austria, September 9 - 13, 2013, Proceedings, 2013
In an effort to continuously improve a research prototype for collaborative ontology engineering, we report on the reapplication of a usability test within an ontology-engineering experiment involving 36 users. The tool offers additional functionalities, and measures were taken to address the problems identified in a previous study. The evaluation criteria proposed in that study were developed by taking into account the people involved, the processes and their outcomes, focusing on the user experience in an approach that goes beyond usability; users were asked whether the tool helped them achieve their goals. We identify the problems the users encountered while using the system and also investigate whether the measures tackled the problems observed in the first study. A set of recommendations is proposed in order to overcome these new problems and to improve the user experience with the system.
On the Move to Meaningful Internet Systems: OTM 2013 Workshops - Confederated International Workshops: OTM Academy, OTM Industry Case Studies Program, ACM, EI2N, ISDE, META4eS, ORM, SeDeS, SINCOM, SMS, and SOMOCO 2013, Graz, Austria, September 9 - 13, 2013, Proceedings, 2013
Semantic Decision Rule Language (SDRule-L), an extension to the Object-Role Modelling language (ORM), is designed for modelling semantic decision support rules. An SDRule-L model may contain static rules (e.g., data constraints) and dynamic rules (e.g., sequences of events). In this paper, we illustrate its supporting tool, called Metis, with which we can graphically design SDRule-L models, verbalize them and reason over them. We can store and publish those models in its markup language, SDRule-ML, which can be partly mapped into OWL 2. The embedded reasoning engine of Metis is used to check consistency.
Proceedings of the sixth international conference on Knowledge capture - K-CAP '11, 2011
Domain rules are important for businesses to obtain good data governance. Although efficient for storing and processing data, the use of popular semantic technologies alone does not suffice. As the Web is gaining a prominent role for enterprises (and communities in general), appropriate methods and tools are required for data governance, with a proper emphasis on facts in natural language. This paper presents the Business Semantics Glossary, which supports a method called Business Semantics Management.
The Semantic Web: ESWC 2019 Satellite Events - ESWC 2019 Satellite Events, Portoroz, Slovenia, June 2-6, 2019, Revised Selected Papers, 2019
The General Data Protection Regulation (GDPR) has established transparency and accountability in the context of personal data usage and collection. While its obligations clearly apply to data explicitly obtained from data subjects, the situation is less clear for data derived from existing personal data. In this paper, we address this issue with an approach for identifying potential data derivations using a rule-based formalisation of examples documented in the literature, expressed with Semantic Web standards. Our approach is useful for identifying risks of potential data derivations from given data and provides a starting point towards an open catalogue documenting known derivations, for the privacy community but also for data controllers, in order to raise awareness of the ways in which their data collections could become problematic.
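The rule-based formalisation can be pictured as forward chaining over "collected data implies derivable data" rules. The plain-Python sketch below uses one derivation commonly documented in the literature (location history revealing a home address); the encoding is a stand-in for the Semantic Web rule representation used in the paper.

```python
# Toy forward chaining over derivation rules; the rule set is a small,
# hypothetical illustration, not the paper's catalogue.
RULES = [
    ({"location-history"}, "home-address"),
    ({"home-address", "name"}, "identity"),
]


def potential_derivations(collected: set[str]) -> set[str]:
    derived = set(collected)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for inputs, output in RULES:
            if inputs <= derived and output not in derived:
                derived.add(output)
                changed = True
    return derived - collected


print(potential_derivations({"location-history", "name"}))
# -> {'home-address', 'identity'}
```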
The Semantic Web: ESWC 2018 Satellite Events - ESWC 2018 Satellite Events, Heraklion, Crete, Greece, June 3-7, 2018, Revised Selected Papers, 2018
The Linked Open Data cloud contains several knowledge bases with overlapping concepts. In order to reduce heterogeneity and enable greater interoperability, semantic mappings between resources can be established. These mappings are usually represented using mapping languages, where visual representations are often used to support user involvement. In prior work, we proposed a visual representation based on the block metaphor, called Juma, and applied it to uplift mappings. In this paper, we extend its applicability and propose the use of this visual representation for semantic mappings that automatically generate executable mappings between knowledge bases. We also demonstrate the viability of our approach in the representation of real mappings through a use case.
Research and Advanced Technology for Digital Libraries - 21st International Conference on Theory and Practice of Digital Libraries, TPDL 2017, Thessaloniki, Greece, September 18-21, 2017, Proceedings, 2017
By generating bibliographic records in RDF, libraries can publish and interlink their metadata on the Semantic Web. However, there are currently many barriers which prevent libraries from doing this. This paper describes the process of developing an RDF-enabled cataloguing tool for a university library in an attempt to overcome some of these obstacles.
The Semantic Web: ESWC 2017 Satellite Events - ESWC 2017 Satellite Events, Portoroz, Slovenia, May 28 - June 1, 2017, Revised Selected Papers, 2017
R2RML is the W3C standard mapping language used to define customized mappings from relational databases into RDF. One issue that hampers its adoption is the effort needed in the creation of such mappings, as they are stored as RDF documents. To address this problem, several tools that represent mappings as graphs have been proposed in the literature. In this paper, we describe a visual representation based on a block metaphor for creating and editing such mappings that is fully compliant with the R2RML specification. Preliminary findings from users using the tool indicate that the visual representation was helpful in the creation of R2RML mappings, with good usability results. In future work, we intend to conduct more experiments focusing on different types of users and to abstract the visual representation from the R2RML mapping language so that it supports the serialization of other uplift mapping languages.
Games and Learning Alliance - 5th International Conference, GALA 2016, Utrecht, The Netherlands, December 5-7, 2016, Proceedings, 2016
It is argued that reflecting on in-game performance in a serious game is important for facilitating learning transfer. A way to facilitate such reflection is by means of a so-called debriefing phase. However, a human-facilitated debriefing is expensive, time consuming and not always possible. Therefore, an automatic self-debriefing facility for serious games would be desirable, yet a general approach for creating such a system does not exist. As a first step towards the development of such a framework, we targeted a specific type of serious game, i.e., games displaying realistic behavior and having multiple possible paths to a solution. In addition, we decided to start with the development of a debriefing system for a concrete case, a serious game about cyberbullying in social networks. In particular, in this paper, we focus on different visualizations that could be used for such an automatic debriefing. We combined textual feedback with three different types of visualizations. A prototype was implemented and evaluated with the goal of comparing the three visualizations and gathering first feedback on usability and effectiveness. The results indicate that the visualizations did help the participants in having a better understanding of the outcome of the game and that there was a clear preference for one of the three visualizations.
The Semantic Web - ESWC 2016 Satellite Events, Heraklion, Crete, Greece, May 29 - June 2, 2016, Revised Selected Papers, 2016
Many solutions have been developed to convert data to RDF. A common task during this conversion is applying data manipulation functions to obtain the desired output. Depending on the data format, one can rely on the underlying technology, such as RDBMS for relational databases or XQuery for XML, to manipulate the data, to a certain extent, while generating RDF. For CSV files, however, there is no such underlying technology. One has to resort to pre- or post-processing techniques when data manipulation is needed, which renders the process of generating RDF more complex (in terms of the number of steps), and therefore also less traceable and transparent. Another solution is to declare functions in mappings. KR2RML provides data manipulation functions as part of the mapping, but due to its complex format, it is difficult to create or maintain mappings without its editor. In this paper, we propose a method to incorporate functions into mapping languages in a more amenable way.
Proceedings of the ISWC 2016 Posters & Demonstrations Track co-located with 15th International Semantic Web Conference (ISWC 2016), Kobe, Japan, October 19, 2016, 2016
We present data.geohive.ie, which aims to provide an authoritative platform for serving Ireland's national geospatial data, including Linked Data. Currently, the platform provides information on Irish administrative boundaries and was designed to support two use cases: serving boundary data of geographic features at various levels of detail and capturing the evolution of administrative boundaries. We report on the decisions taken for modeling and serving the data, such as the adoption of an appropriate URI strategy, the development of the necessary ontologies, and the use of (named) graphs to support the aforementioned use cases.
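The named-graph design can be sketched with rdflib's Dataset: each generation of a boundary is kept in its own graph so that current and historical versions remain separately queryable. The URIs and figures below are invented, not the data.geohive.ie URI strategy.

```python
# Sketch of versioned boundary data in named graphs; all values made up.
from rdflib import Dataset, Literal, Namespace, URIRef

EX = Namespace("http://example.org/boundary/")
ds = Dataset()

g2014 = ds.graph(URIRef(EX + "county-x/2014"))
g2014.add((EX["county-x"], EX.population, Literal(1200000)))

g2019 = ds.graph(URIRef(EX + "county-x/2019"))
g2019.add((EX["county-x"], EX.population, Literal(1350000)))

for g in ds.contexts():  # the default graph appears here too, empty
    print(g.identifier, len(g))
```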
Lecture Notes in Computer Science, 2012
On the Move to Meaningful Internet Systems: OTM 2012 Workshops, Confederated International Workshops: OTM Academy, Industry Case Studies Program, EI2N, INBAST, META4eS, OnToContent, ORM, SeDeS, SINCOM, and SOMOCO 2012, Rome, Italy, September 10-14, 2012. Proceedings, 2012
We demonstrate a collaborative knowledge management platform in which communities representing autonomously developed information systems build ontologies to achieve semantic interoperability between those systems. The tool is called GOSPL, which stands for Grounding Ontologies with Social Processes and natural Language, and supports the method bearing the same name. Ontologies in GOSPL are hybrid, meaning that concepts are described both informally in natural language and formally. Agreements on these two levels are made simultaneously, and the social interaction between and across communities drives the ontology evolution process.
WEBIST 2011, Proceedings of the 7th International Conference on Web Information Systems and Technologies, Noordwijkerhout, The Netherlands, 6-9 May, 2011, 2011
This paper presents a platform for requests for proposals and describes how ontologies drive its different components: the creation of a proposal, the annotation of vendor data, the transformation of vendor data into other formats, and the semantic matching of a proposal against annotated vendor data. The ontology construction started from DOGMA, a methodology with its grounding in the linguistic representation of knowledge that is suitable for community participation in the creation process. The ontologies were created in a modular way, with general product and meta-models that can be extended depending on the domain. In the case of the pilot, the products were holiday packages, more precisely winter sports holiday packages.
2010 Seventh International Conference on Information Technology: New Generations, Jan 1, 2010
In this paper, we present the GOSPL application, which supports communities during the ontology engineering process by exploiting Social Web technologies and natural language. The resulting knowledge can then be transformed into RDF(S).
Semantic Web in Libraries (SWIB) 2019, 2019
At SWIB 2018, we presented our early-stage work on a Linked Data (LD) interlinking approach for the library domain called NAISC (Novel Authoritative Interlinking of Schema and Concepts). The aim of NAISC is to meet the unique interlinking requirements of the library domain and to improve LD accessibility for domain expert users. At SWIB 2019 we will present our progress in the development of NAISC, including an improved graphical user interface (GUI), user testing results, and a demonstration of NAISC's interlink provenance components. NAISC consists of an Interlinking Framework, a Provenance Model and a GUI. The Framework describes the steps of entity selection, link-type selection, and RDF generation for the creation of interlinks between entities, such as people, places, or works, stored in a library dataset and related entities held in another institution. NAISC specifically targets librarians by providing access to commonly used datasets and ontologies. NAISC includes interlink provenance to allow data users to assess the authoritativeness of each link generated. Our provenance model adopts PROV-O as the underlying ontology, which we extended to provide interlink-specific data. An instantiation of NAISC is provided through a GUI, which reduces the need for expert LD knowledge by guiding users in choosing suitable link-types. We will present NAISC and demonstrate the use of our GUI as a means of interlinking LD entities across libraries and other authoritative datasets. We will also discuss our user-evaluation processes and results, including a NAISC usability test, a field test/real-world application of NAISC, and a review of interlink quality. Finally, we will demonstrate our provenance model and discuss how the provenance data could be modelled as LD develops over time.
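To give a flavour of the provenance components, here is a hedged sketch of an interlink described with PROV-O: the link is an entity generated by a linking activity and associated with a cataloguer. The naisc: extension terms are hypothetical stand-ins, while the prov: terms are genuine PROV-O.

```python
# Sketch of interlink provenance; naisc: terms are invented stand-ins.
from rdflib import Graph, Literal, Namespace, RDF, URIRef
from rdflib.namespace import PROV

NAISC = Namespace("http://example.org/naisc#")
EX = Namespace("http://example.org/")

g = Graph()
g.bind("prov", PROV)
g.bind("naisc", NAISC)

link = EX.link1
g.add((link, RDF.type, PROV.Entity))
g.add((link, NAISC.linkType, URIRef("http://www.w3.org/2002/07/owl#sameAs")))
g.add((link, NAISC.justification, Literal("matched on name and birth year")))
g.add((link, PROV.wasGeneratedBy, EX.linking1))
g.add((EX.linking1, RDF.type, PROV.Activity))
g.add((EX.linking1, PROV.wasAssociatedWith, EX.cataloguer7))

print(g.serialize(format="turtle"))
```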
Proceedings of the 2nd Annual Virtual Heritage Network Ireland Conference, 2016, 2016
Motivation: MODS is a highly flexible XML metadata schema that can be used to catalogue a great variety of cultural heritage materials, and offers the capability to describe hierarchical relationships between objects. It was developed as a subset of MARC21; however, unlike MARC, MODS uses textual field labels rather than numeric fields, and its structure allows for metadata elements to be regrouped and reorganised within the metadata record.
Semantic Web in Libraries (SWIB) 2018, 2018
Through the use of Linked Data (LD), Libraries, Archives and Museums (LAMs) have the potential to expose their collections to a larger audience and to allow for more efficient user searches. Despite this, relatively few LAMs have invested in LD projects, and the majority of these display limited interlinking across datasets and institutions. A survey was conducted to understand Information Professionals' (IPs') position with regard to LD, with a particular focus on the interlinking problem. The survey was completed by 185 librarians, archivists, metadata cataloguers and researchers. Results indicated that, when interlinking, IPs find the process of ontology and property selection to be particularly challenging, and LD tooling to be technologically complex and unsuitable for their needs. Our research is focused on developing an authoritative interlinking framework for LAMs with a view to increasing IP engagement in the linking process. Our framework will provide a set of standards to facilitate IPs in the selection of link types, specifically when linking local resources to authorities. The framework will include guidelines for authority, ontology and property selection, and for adding provenance data. A user interface will be developed which will direct IPs through the resource interlinking process as per our framework. Although there are existing tools in this domain, our framework differs in that it is designed with the needs and expertise of IPs in mind. This is achieved by involving IPs in the design and evaluation of the framework. A mock-up of the interface has already been tested, and adjustments have been made based on the results. We are currently working on developing a minimal viable product so as to allow for further testing of the framework. We will present our updated framework, interface, and proposed interlinking solutions.
IFIP Advances in Information and Communication Technology, 2017
On the Move to Meaningful Internet Systems: OTM 2019 Conferences, 2019
The Do-It-Yourself (DIY) culture has been continuously articulated since the mid-1920s. The goal of DIY has gradually shifted from solving the "time-rich and money-poor" situation to the confirmation of personal creativity and the needs of outsourcing and social contact. This paper addresses the design of a DIY environment for managing data semantics from different intelligent components in the ITEA Do-It-Yourself Smart Experiences project (DIY-SE). In particular, it presents a flexible and idea-inspiring ontology-based DIY architecture named Onto-DIY. Besides the DIY aspect, Onto-DIY also takes social and community aspects into account.
Ontologies, being shared formal specifications of a domain, are an important lever for developing meaningful internet systems. However, the problem is not in what ontologies are, but in how they become operationally relevant and sustainable over longer periods of time. Fact-oriented and layered approaches such as DOGMA have been successful in facilitating domain experts in representing and understanding semantically stable ontologies, while emphasising reusability and scalability. DOGMA-MESS, extending DOGMA, is a collaborative ontology evolution methodology that supports stakeholders in iteratively interpreting and modeling their common ontologies in their own terminology and context, and feeding back these results to the owning community. In this paper we extend DOGMA Studio with a set of collaborative ontology evolution support modules.
For businesses to obtain good data governance, references to the instances used in the application domain, and domain rules, are often required. The technologies on which Linked Data is based, RDF and URIs, are insufficient for enterprise data governance, as common domain rules either cannot be imposed or cannot always be properly modeled. As the Web is gaining a prominent role for enterprises and other types of communities, appropriate semantic methods and tools are required for data governance. This paper presents the Business Semantics Glossary, a tool currently being validated by industry that supports the Business Semantics Management methodology, based on the OMG SBVR standard and the DOGMA ontology framework. Our examples and claims are drawn from a case study at the Flemish Institute of Education and Training, where the tool is currently in use.