Querying and Repairing Inconsistent XML Data

Abstract

The problem of repairing XML data that are inconsistent and incomplete with respect to a set of integrity constraints and a DTD is addressed. The existence of repairs (i.e., minimal sets of update operations that make the data consistent) is investigated and shown to be undecidable in the general case. The problem remains undecidable when data are interpreted as “incomplete” (so that they can be repaired by insert operations only), although it becomes decidable when particular classes of constraints are considered. The existence of repairs is proved to be decidable and, in particular, \(\mathcal{NP}\)-complete when inconsistent data are interpreted as “dirty” data (so that repairs are data-cleaning operations consisting only of deletions). The existence of general repairs (containing both insert and delete operations) is also investigated for a special class of integrity constraints, namely functional dependencies. Finally, for all the cases where the existence of a repair is decidable, the complexity of providing consistent answers to a query (issued on inconsistent data) is characterized.
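
As an informal illustration of the two notions used above (deletion-only repairs and consistent answers), the following minimal sketch may help. It is not the paper's algorithm: it assumes a toy document whose root has a flat list of children, a single key constraint on a hypothetical `id` field, and no DTD, and it enumerates repairs by brute force. With a DTD, deleting nodes can itself violate the schema, which this toy setting ignores. All names (`DOC`, `is_consistent`, `deletion_repairs`, `consistent_answers`) are invented for the example.

```python
from itertools import combinations

# Toy "document": the children of a root element, each an (id, name) pair.
# Hypothetical data, not taken from the paper.
DOC = [("1", "Ann"), ("1", "Bob"), ("2", "Cy")]


def is_consistent(elems):
    """Assumed key constraint: no two elements share the same id."""
    ids = [e[0] for e in elems]
    return len(ids) == len(set(ids))


def deletion_repairs(doc):
    """Return all set-inclusion-minimal deletion sets (as index sets)
    whose removal leaves a consistent document."""
    repairs = []
    # Candidates are visited by increasing size, so every proper subset
    # of a candidate has already been examined when we reach it.
    for k in range(len(doc) + 1):
        for deleted in combinations(range(len(doc)), k):
            d = set(deleted)
            if any(r <= d for r in repairs):
                continue  # contains a smaller repair, so it is not minimal
            if is_consistent([e for i, e in enumerate(doc) if i not in d]):
                repairs.append(d)
    return repairs


def consistent_answers(doc, query):
    """Answers that the query returns on every repaired document."""
    results = []
    for d in deletion_repairs(doc):
        repaired = [e for i, e in enumerate(doc) if i not in d]
        results.append(set(query(repaired)))
    # Deleting everything is always consistent here, so results is non-empty.
    return set.intersection(*results)
```

In this toy setting the two elements sharing `id` "1" conflict, so each minimal repair deletes exactly one of them; a query asking for all names then has a consistent answer only for the element that survives in every repair. The exhaustive enumeration is exponential and is meant only to make the definitions concrete.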

Author information

Authors and Affiliations

  1. D.E.I.S. – Università della Calabria, 87036, Rende (CS), Italy
    S. Flesca, F. Furfaro, S. Greco & E. Zumpano

Editor information

Editors and Affiliations

  1. Texas State University, San Marcos, TX
    Anne H. H. Ngu
  2. Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, 153-8505, Tokyo, Japan
    Masaru Kitsuregawa
  3. University of Vienna, Vienna, Austria
    Erich J. Neuhold
  4. IBM Research Division, Thomas J. Watson Research Center, P.O. Box 218, 10598, New York, Yorktown Heights, USA
    Jen-Yao Chung
  5. School of Computer Science and Engineering, University of New South Wales, NSW 2052, Sydney, Australia
    Quan Z. Sheng

Rights and permissions

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Flesca, S., Furfaro, F., Greco, S., Zumpano, E. (2005). Querying and Repairing Inconsistent XML Data. In: Ngu, A.H.H., Kitsuregawa, M., Neuhold, E.J., Chung, JY., Sheng, Q.Z. (eds) Web Information Systems Engineering – WISE 2005. WISE 2005. Lecture Notes in Computer Science, vol 3806. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11581062_14
