RAF: An Activation Framework for Refining Similarity Queries Using Learning Techniques

Abstract

In many applications that involve similarity search, a user may not have an exact specification of his information need, may not be able to formulate a query that exactly captures his notion of similarity, or both. A promising way to mitigate this problem is to let the user submit a rough approximation of the desired query and then use relevance feedback on the retrieved objects to refine it. In this paper, we explore such a refinement strategy for a general class of structured similarity queries. Our approach casts the refinement problem as one of concept learning, using the tuples on which the user provides feedback as a labeled training set. Under this setup, similarity query refinement consists of two learning tasks: learning the structure of the query and learning the relative importance of query components. The paper develops machine learning approaches suited to each of the two tasks. The primary contribution of the paper is the Refinement Activation Framework (RAF), which decides when each learner is invoked. Experimental analysis over many real-life datasets shows that our strategy significantly outperforms existing approaches in terms of retrieval quality.
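The abstract does not spell out either learner, so the following is only a minimal sketch of the second task (learning the relative importance of query components) under stated assumptions. It assumes a query is scored as a weighted sum of per-component similarity scores in [0, 1], and it re-weights each component by how well it separates relevant from non-relevant feedback tuples. This separation heuristic, the names `rank_score` and `refine_weights`, and the component names in the usage note are hypothetical illustrations, not the method developed in the paper.

```python
from typing import Dict, List, Tuple

# Per-component similarity scores of one tuple, each assumed to lie in [0, 1].
Scores = Dict[str, float]


def rank_score(scores: Scores, weights: Dict[str, float]) -> float:
    """Weighted-sum similarity of one tuple under the current query weights."""
    return sum(weights[c] * scores[c] for c in weights)


def refine_weights(
    weights: Dict[str, float],
    feedback: List[Tuple[Scores, bool]],
) -> Dict[str, float]:
    """Re-weight components from labeled feedback (illustrative heuristic).

    Each feedback item is (scores, is_relevant). A component gains weight
    in proportion to the gap between its mean score on relevant tuples and
    its mean score on non-relevant tuples; negative gaps are clipped to 0.
    """
    relevant = [s for s, ok in feedback if ok]
    non_relevant = [s for s, ok in feedback if not ok]
    if not relevant or not non_relevant:
        return dict(weights)  # not enough signal to separate the classes

    def mean(rows: List[Scores], component: str) -> float:
        return sum(r[component] for r in rows) / len(rows)

    raw = {
        c: max(0.0, mean(relevant, c) - mean(non_relevant, c))
        for c in weights
    }
    total = sum(raw.values())
    if total == 0.0:
        return dict(weights)  # no component separates the classes
    return {c: v / total for c, v in raw.items()}  # normalize to sum to 1
```

For instance, if feedback tuples marked relevant score high on a hypothetical `price` component while a hypothetical `location` component scores the same on relevant and non-relevant tuples, the refined weights shift entirely toward `price`. Learning the structure of the query (the first task) is not covered by this sketch.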

Author information

Authors and Affiliations

  1. Department of Computer Science, University of California, Irvine, USA
    Yiming Ma, Sharad Mehrotra, Dawit Yimam Seid & Qi Zhong

Editor information

Editors and Affiliations

  1. Department of Computer Science, National University of Singapore, Singapore
    Mong Li Lee
  2. School of Computing, National University of Singapore, Singapore
    Kian-Lee Tan
  3. School of Engineering and Technology, Asian Institute of Technology, P.O. Box 4, 12120, Klong Luang, Pathum Thani, Thailand
    Vilas Wuwongse

Rights and permissions

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ma, Y., Mehrotra, S., Seid, D.Y., Zhong, Q. (2006). RAF: An Activation Framework for Refining Similarity Queries Using Learning Techniques. In: Lee, M.L., Tan, K.-L., Wuwongse, V. (eds) Database Systems for Advanced Applications. DASFAA 2006. Lecture Notes in Computer Science, vol 3882. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733836_41
