Algorithms for Indexing and Searching: Readings (original) (raw)
Algorithms for Indexing and Searching: Readings
- Introduction to traditional IR (Week 1)
- Course notes on Indexing and Searching from a course on Algorithms in the Real World. Contains several pointers to other sources of information on the web.
- Introduction to recent work (Week 2)
- A. Broder and M. Henzinger (1998). Information retrieval on the web: Tools and algorithmic issues. Invited tutorial at Foundations of Computer Science (FOCS).
- Latent Semantic Indexing (Week 3)
- Michael W. Berry and Susan T. Dumais, and Gavin W. O'Brien.Using Linear Algebra for Intelligent Information Retrieval. Published in SIAM Review 37:4 (1995), pp. 573-595.
- Christos Papadimitriou, Prabhakar Raghavan, Hisao Tamaki and Santosh Vempala.Latent Semantic Indexing: A Probabilistic Analysis. Proc. 17th ACM Symposium on the Principles of Database Systems, Seattle, 1998.
- Link-based Search Analysis (Week 4)
- J. Kleinberg. Authoritative sources in a hyperlinked environment. In Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 668-677, January 1998.
- K. Bharat, A.Z. Broder, M. Henzinger, P. Kumar, and S. Venkatasubramanian. The connectivity server: Fast access to linkage information on the web. In Proceedings of the Seventh International World Wide Web Conference (WWW7), pages 469-477.
- S. Chakrabarti, B. Dom, R.P., S. Rajagopalan, D. Gibson, and J. Kleinberg. Automatic resource compilation by analyzing hyperlink structure and associated text. In Proceedings of the Seventh International World Wide Web Conference (WWW7), pages 65-74.
- The Google search engine (also Week 4)
- Sergey Brin and Lawrence Page.The Anatomy of a Large-Scale Hypertextual Web Search Engine. The 7th International World Wide Web Conference. April 1998.
- A Large-Scale Hypertextual Web Search Engine. Slides from a talk on Google
- Approximating the SVD (Week 5)
- P. Drineas, Ravi Kannan, Alan Frieze, Santosh Vempala and V. Vinay. Clustering in large graphs and matrices, Proc. of the 10th ACM-SIAM Symposium on Discrete Algorithms, Baltimore, 1999.
- Alan Frieze, Ravi Kannan and Santosh Vempala, Fast Monte-Carlo Algorithms for Finding Low-Rank Approximations, Proc. of the 39th Foundations of Computer Science, Palo Alto, 1998.
- High-Dimensional Nearest Neighbors (Week 6)
- E. Kushilevitz, R. Ostrovsky, and Y. Rabani. Efficient Search for Approximate Nearest Neighbor in High Dimensional Spaces, in Proceedings of the 30th Symposium on Theory of Computing, 1998.
- P. Indyk and R. Motwani. Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionalityin Proceedings of the 30th Symposium on Theory of Computing, 1998.
- Spectral Partitioning (Week 7)
- Stephen Guattery and Gary L. Miller. On the performance of the spectral graph partitioning methods. Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 233--242.
- Dynamic Search (Week 8)
- Dan Pelleg. Dynamic Search: A practical perspective.
- Michael Hersovici1, Michal Jacovi, Yoelle S. Maarek, Dan Pelleg, Menachem Shtalhaim, Sigalit Ur, The shark-search algorithm.
- Finding Near Duplicate Documents (Week 10)
- Andrei Z. Broder, Steven C. Glassman, Mark S. Manasse, Geoffrey Zweig,Syntactic Clustering of the Web, SRC Technical Note 1997-015, July 25, 1997
- Andrei Broder, On the resemblance and containment of documents, In Compression and Complexity of Sequences (SEQUENCES'97), pp 21-29.
- Andrei Broder, Moses Charikar, Alan Frieze, and Michael Mitzenmacher. Min-wise independent permutations. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pages 327-336, May 1998.
- Web Caching (Week 11)
- D. Karger, E. Lehman, T. Leighton, M. Levine, D. Lewin and R. Panigrahy,Consistent hashing and random trees, Distributed caching protocols for relieving hot spots on the World Wide Web, STOC'97. slides
- D. Karger, E. Lehman, T. Leighton, M. Levine, D. Lewin and R. Panigrahy,Consistent hashing and random trees, Distributed caching protocols for relieving hot spots on the World Wide Web, STOC'97. slides