RAF: An Activation Framework for Refining Similarity Queries Using Learning Techniques
Abstract
In many applications that involve similarity search, a user may not have an exact specification of the information need, or may not be able to formulate a query that exactly captures the intended notion of similarity. A promising way to mitigate this problem is to let the user submit a rough approximation of the desired query and then use relevance feedback on the retrieved objects to refine it. In this paper, we explore such a refinement strategy for a general class of structured similarity queries. Our approach casts refinement as a concept-learning problem in which the tuples the user provides feedback on serve as a labeled training set. Under this setup, similarity query refinement consists of two learning tasks: learning the structure of the query and learning the relative importance of the query components. The paper develops machine learning approaches suited to each of the two tasks. The primary contribution of the paper is the Refinement Activation Framework (RAF), which decides when each learner is invoked. Experimental analysis over many real-life datasets shows that our strategy significantly outperforms existing approaches in terms of retrieval quality.
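To make the second learning task concrete, the following is a minimal sketch of re-estimating the relative importance of query components from labeled feedback. It is an illustration under stated assumptions, not the learner developed in the paper: it assumes a weighted-sum ranking function, per-predicate similarity scores in [0, 1], and a simple mean-difference heuristic for the weights. The names `rank_score` and `reweight`, and the predicates `price` and `location` in the usage note, are hypothetical. Structure learning and the activation logic of RAF are not covered.

```python
from typing import Dict, List, Tuple

# Per-predicate similarity scores of one tuple, each assumed to lie in [0, 1].
Scores = Dict[str, float]


def rank_score(scores: Scores, weights: Dict[str, float]) -> float:
    """Aggregate per-predicate similarity scores with a weighted sum (assumed form)."""
    return sum(weights[p] * scores[p] for p in weights)


def reweight(feedback: List[Tuple[Scores, bool]],
             weights: Dict[str, float]) -> Dict[str, float]:
    """Re-estimate predicate weights from feedback labeled relevant (True) or not (False).

    Heuristic (an assumption, not the paper's method): a predicate matters more
    when relevant tuples score higher on it than non-relevant ones. Negative
    differences are clipped to zero and the result is normalized to sum to 1.
    """
    relevant = [s for s, is_relevant in feedback if is_relevant]
    non_relevant = [s for s, is_relevant in feedback if not is_relevant]
    if not relevant or not non_relevant:
        # Without both classes there is nothing to contrast; keep the old weights.
        return dict(weights)

    raw: Dict[str, float] = {}
    for p in weights:
        mean_rel = sum(s[p] for s in relevant) / len(relevant)
        mean_non = sum(s[p] for s in non_relevant) / len(non_relevant)
        raw[p] = max(mean_rel - mean_non, 0.0)

    total = sum(raw.values())
    if total == 0.0:
        # No predicate separates the two classes; keep the old weights.
        return dict(weights)
    return {p: v / total for p, v in raw.items()}
```

For example, given feedback in which relevant tuples score high on a hypothetical `price` predicate and non-relevant tuples score low on it, while `location` scores are about the same for both groups, `reweight` shifts nearly all weight to `price`, and `rank_score` then ranks tuples mainly by that predicate.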
Author information
Authors and Affiliations
- Department of Computer Science, University of California, Irvine, USA
Yiming Ma, Sharad Mehrotra, Dawit Yimam Seid & Qi Zhong
Editor information
Editors and Affiliations
- Department of Computer Science, National University of Singapore, Singapore
Mong Li Lee
- School of Computing, National University of Singapore, Singapore
Kian-Lee Tan
- School of Engineering and Technology, Asian Institute of Technology, P.O. Box 4, 12120, Klong Luang, Pathum Thani, Thailand
Vilas Wuwongse
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ma, Y., Mehrotra, S., Seid, D.Y., Zhong, Q. (2006). RAF: An Activation Framework for Refining Similarity Queries Using Learning Techniques. In: Lee, M.L., Tan, K.-L., Wuwongse, V. (eds) Database Systems for Advanced Applications. DASFAA 2006. Lecture Notes in Computer Science, vol 3882. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733836_41
- DOI: https://doi.org/10.1007/11733836_41
- Publisher Name: Springer, Berlin, Heidelberg
- Print ISBN: 978-3-540-33337-1
- Online ISBN: 978-3-540-33338-8
- eBook Packages: Computer Science, Computer Science (R0), Springer Nature Proceedings Computer Science