A language for the integration of web sources
Related papers
Modeling Interactive Web Sources for Information Mediation
Lecture Notes in Computer Science, 1999
We propose a method for modeling complex Web sources that have active user interaction requirements. Here "active" refers to the fact that certain information in these sources is only reachable through interactions like filling out forms or clicking on image maps. Typically, the former interaction can be automated by wrapper software (e.g., using parameterized URLs or POST commands), while the latter cannot and thus requires explicit user interaction. We propose a modeling technique for such interactive Web sources and the information they export, based on so-called interaction diagrams. The nodes of an interaction diagram model sources and their exported information, whereas edges model transitions and their interactions. The paths of a diagram correspond to sequences of interactions and allow one to derive the various query capabilities of the source. Based on these, one can determine which queries are supported by a source and derive query plans with minimal user interaction. This technique can be used offline to support the design and implementation of wrappers, or at runtime when the mediator generates query plans against such sources.
Ariadne: a system for constructing mediators for Internet sources
Proceedings of the …, 1998
The Web is based on a browsing paradigm that makes it difficult to retrieve and integrate data from multiple sites. Today, the only way to achieve this integration is by building specialized applications, which are time-consuming to develop and difficult to maintain. We are addressing this problem by creating the technology and tools for rapidly constructing information mediators that extract, query, and integrate data from web sources. The resulting system, called Ariadne, makes it feasible to rapidly build information mediators that access existing web sources.
Modeling web sources for information integration
Proceedings of the …, 1998
The Web is based on a browsing paradigm that makes it difficult to retrieve and integrate data from multiple sites. Today, the only way to do this is to build specialized applications, which are time-consuming to develop and difficult to maintain. We are addressing this problem by creating the technology and tools for rapidly constructing information agents that extract, query, and integrate data from web sources. Our approach is based on a simple, uniform representation that makes it efficient to integrate multiple sources. Instead of building specialized algorithms for handling web sources, we have developed methods for mapping web sources into this uniform representation. This approach builds on work from knowledge representation, machine learning and automated planning. The resulting system, called Ariadne, makes it fast and cheap to build new information agents that access existing web sources. Ariadne also makes it easy to maintain these agents and incorporate new sources as they become available.
The Ariadne Approach to Web-Based Information Integration
International Journal of Cooperative Information Systems, 2001
The Web is based on a browsing paradigm that makes it difficult to retrieve and integrate data from multiple sites. Today, the only way to do this is to build specialized applications, which are time-consuming to develop and difficult to maintain. We have addressed this problem by creating the technology and tools for rapidly constructing information agents that extract, query, and integrate data from web sources. Our approach is based on a uniform representation that makes it simple and efficient to integrate multiple sources. Instead of building specialized algorithms for handling web sources, we have developed methods for mapping web sources into this uniform representation. This approach builds on work from knowledge representation, databases, machine learning and automated planning. The resulting system, called Ariadne, makes it fast and easy to build new information agents that access existing web sources. Ariadne also makes it easy to maintain these agents and incorporate new sources as they become available.
Web Information Integration Tool: Data Structure Modelling
The paper describes a method for estimating a relational data model from input web data and the use of this method. It also covers the method's principal limitations and shows how the model is used for more effective storage in a repository. The repository is implemented as the universal relation. The properties of the model are described as well. robots. The information retrieval "on the web" (1) approach continually develops: there are new methods for mapping web pages, for evaluating similarity between documents, and for ranking pages. Many of them interpret a web page as a simple list of words, nothing else. Is there a kind of web page that can be read by humans and can also be processed by automatic tools, at a higher level, in a way that is helpful for humans? One type of these pages could be a web interface to information systems or to content managers. Information is presented in this case as a view of a database and the view is formatted to the XHTML page, to the ...
Design and Implementation of WNDL---Web Navigation Description Language
Utilization of the World Wide Web can be boosted if we can explore the “deep Web” and integrate information from various Web sites. However, automating deep Web exploration and data integration requires custom-made software to accommodate the differences among Web sites, and is thus error-prone and time-consuming. This paper defines a language called Web Navigation Description
Using Agents for Generation and Maintenance of Mediators in a Data Integration System on the Web
2001
In this paper we present a system for data integration on the web, where an XML-based mediator plays a key role providing a homogeneous view of different data sources. One novelty of our approach is that we also propose solutions for the problems of generation and maintenance of mediators. Observe that, in dynamic environments, such as the Web, individual data sources may change not only their data but also their schemas. As a result, whenever a local schema changes, the mediator needs to be updated to reflect the modifications. The system uses agents to support mediator generation and maintenance. We specify a set of tasks that must be performed by the agents in order to support these two tasks. In our approach, we use correspondence assertions for specifying the semantics of XML-based mediators. We also discuss how this high-level specification of the mediator can be used to automate the generation and maintenance of mediators.