Storing and Querying of XML Documents Without Redundant Path Information

Abstract

We propose an improved approach for storing and querying a large volume of XML documents in a relational database. It removes redundant path information and builds an inverted index over the reduced path information. To store an XML document in a relational database, the document is decomposed into nodes according to its tree structure, and each node is stored in relational tables together with the path from the root node to that node. Existing XML storage methods based on the relational data model usually store path information for every node, which increases storage overhead and degrades query processing performance as the data volume grows. Our approach stores path information only for the leaf nodes of the XML tree and derives the path information of internal nodes from the leaf node paths. In this manner, it reduces the data volume for a large collection of XML documents to some degree, and it also reduces the size of the inverted index on path information, which then needs fewer posting lists for its keywords. We show the effectiveness of the approach through several experiments that compare its XPath query performance with that of existing methods.
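
The idea of keeping only leaf node paths can be illustrated with a minimal sketch. It is not the authors' schema or algorithm, since the abstract does not give the table layout or index format. It assumes that a path is a tuple of element names, that internal node paths are recovered as proper prefixes of leaf paths, and that the inverted index maps each element name to the ids of the leaf paths containing it. All function names (`leaf_paths`, `internal_paths`, `build_inverted_index`, `match_prefix`) are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def leaf_paths(xml_text):
    """Return the distinct root-to-leaf label paths, in document order."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(elem, prefix):
        path = prefix + (elem.tag,)
        children = list(elem)
        # Only elements without child elements contribute a stored path.
        if not children and path not in paths:
            paths.append(path)
        for child in children:
            walk(child, path)

    walk(root, ())
    return paths


def internal_paths(paths):
    """Derive internal-node paths as the proper prefixes of the leaf paths."""
    return {p[:i] for p in paths for i in range(1, len(p))}


def build_inverted_index(paths):
    """Map each element name (keyword) to the ids of leaf paths containing it."""
    index = defaultdict(set)
    for pid, path in enumerate(paths):
        for tag in path:
            index[tag].add(pid)
    return index


def match_prefix(query, paths, index):
    """Return stored leaf paths that start with a simple path such as '/a/b'."""
    steps = tuple(s for s in query.split("/") if s)
    if not steps:
        return []
    # Intersect posting lists first, then verify the order of the steps.
    candidates = set.intersection(*(index.get(s, set()) for s in steps))
    return [paths[pid] for pid in sorted(candidates)
            if paths[pid][:len(steps)] == steps]


if __name__ == "__main__":
    doc = ("<book><title>XML</title>"
           "<author><name>Kim</name><email>k@x.org</email></author></book>")
    stored = leaf_paths(doc)
    idx = build_inverted_index(stored)
    print(stored)
    print(internal_paths(stored))
    print(match_prefix("/book/author", stored, idx))
```

In this sketch the path `/book/author` is never stored. It is found only because two stored leaf paths begin with it, which is the kind of saving the abstract describes, though the real system would keep these structures in relational tables.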

Author information

Authors and Affiliations

  1. College of Electronics and Information, Kyung Hee University, Kyung-gi, 449-701, Korea
    Byeong-Soo Jeong & Young-Koo Lee

Editor information

Editors and Affiliations

  1. Department of Computer Science, University of Calgary, 2500 University Drive N.W., T2N 1N4, Calgary, AB, Canada
    Marina L. Gavrilova
  2. Department of Mathematics and Computer Science, University of Perugia, via Vanvitelli, 1, I-06123, Perugia, Italy
    Osvaldo Gervasi
  3. William Norris Professor, Head of the Computer Science and Engineering Department, University of Minnesota, USA
    Vipin Kumar
  4. OptimaNumerics Ltd., Cathedral House, 23-31 Waring Street, BT1 2DX, Belfast, UK
    C. J. Kenneth Tan
  5. Clayton School of IT, Monash University, 3800, Clayton, Australia
    David Taniar
  6. Department of Chemistry, University of Perugia, Via Elce di Sotto, 8, I-06123, Perugia, Italy
    Antonio Laganá
  7. School of Computing, Soongsil University, Seoul, Korea
    Youngsong Mun
  8. School of Information and Communication Engineering, Sungkyunkwan University, Korea
    Hyunseung Choo

Rights and permissions

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jeong, BS., Lee, YK. (2006). Storing and Querying of XML Documents Without Redundant Path Information. In: Gavrilova, M.L., et al. Computational Science and Its Applications - ICCSA 2006. ICCSA 2006. Lecture Notes in Computer Science, vol 3981. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751588_53