M. Lalmas - Academia.edu
Papers by M. Lalmas
Lecture Notes in Computer Science, 2000
This paper proposes an approach for query reformulation based on the generation of appropriate query-biased concepts. Query-biased concepts are generated from retrieved documents using their content and structure. In this paper, we focus on three aspects of concept generation: the selection of query-biased concepts from retrieved documents, the effect of the structure, and the number of retrieved documents used for generating the concepts.
Lecture Notes in Computer Science, 2014
Lecture Notes in Computer Science, 2002
Lecture Notes in Computer Science, 2009
This paper presents a user study that evaluated the effectiveness of an aggregated search interface in the context of non-navigational search tasks. An experimental system was developed to present search results aggregated from multiple information sources, and was compared with a conventional tabbed interface. Sixteen participants were recruited to evaluate the performance of the two interfaces. Our results suggest that the aggregated search interface is a promising way of supporting non-navigational search tasks. The quantity and diversity of the retrieved items that participants accessed to complete a task increased with the aggregated interface. Participants also found that the aggregated presentation made it easier to access retrieved items and to find relevant information, compared to the conventional interface.
User engagement is a key concept in designing user-centred web applications. It refers to the quality of the user experience that emphasises the positive aspects of the interaction, and in particular the phenomena associated with being captivated by technology. This definition is motivated by the observation that successful technologies are not just used, but are engaged with. Numerous methods have been proposed in the literature to measure engagement; however, little has been done to validate and relate these measures and so ...
Documents formatted in eXtensible Markup Language (XML) are becoming increasingly available in collections of various document types. In this paper, we present an approach for the summarisation of XML documents. The novelty of this approach lies in that it is based on features not only from the content of documents, but also from their logical structure. We follow a sentence extraction-based summarisation method that employs a novel machine learning approach. To find which features are more effective for producing ...
Aggregated search interfaces are a common way to present web search results, mixing different types of results into one single result page. Although numerous efforts have been made to infer users' information needs in "standard" search, we know little about users' information needs within the context of aggregated search. This paper presents the outcomes of a survey of 117 respondents, investigating users' preferences for their type of search result (image, news, video) and their type of information need (informational, navigational and transactional). The survey reveals that users' result preferences differ based on their underlying information needs, suggesting that the taxonomy provided by Broder [1] requires updating to reflect user information needs in the context of aggregated search. For instance, respondents indicated a preference for diverse results (news and reviews about a particular software product) for navigational and transactional queries rather than a single result (the web page to download that software product).
Quantum Informatics Symposium. AAAI Fall Symposia Series, Jan 11, 2010
D. Song 1, M. Lalmas 2, CJ van Rijsbergen 2, I. Frommholz 2, ... B. Piwowarski 2, J. Wang 1, P. Zhang 1, G. Zuccon 2, PD Bruza 4, S. Arafat 2, ... L. Azzopardi 2, E. Di Buccio 5, A. Huertas-Rosero 2, Y. Hou 6, M. Melucci 5, S. Rüger 3 ... 1 The Robert Gordon University, UK; 2 University of Glasgow, UK; 3 The Open University, UK; ... 4 Queensland University of Technology, Australia; 5 University of Padua, Italy; 6 Tianjin University, China ... This position paper provides an overview of work conducted and an outlook of future directions within ...
Democracy, Design, and Development in Community Content Creation: Lessons From the StoryBank Project.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2011
International Conference on Information and Knowledge Management, Proceedings, 2011
Aggregating search results from a variety of distributed heterogeneous sources, i.e. so-called verticals, such as news, image, video and blog, into a single interface has become a popular paradigm in large-scale web search. As various distributed vertical search techniques (also known as aggregated search) have been proposed, it is crucial to be able to properly evaluate those systems on a large-scale standard test set. A test collection for aggregated search requires a number of verticals, each populated by items (e.g. documents, images, etc.) of that vertical type, a set of topics expressing information needs relating to one or more verticals, and relevance assessments, indicating the relevance of the items and their associated verticals to each of the topics. Building a large-scale test collection for aggregated search is costly in terms of time and resources. In this paper, we propose a methodology to build such a test collection by reusing existing test collections, which allows the investigation of aggregated search approaches. We report on experiments, based on twelve simulated aggregated search systems, that show the impact of misclassification of items into verticals on the evaluation of systems.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012
Aggregating search results from a variety of heterogeneous sources, i.e. so-called verticals [1], such as news, image, video and blog, into a single interface has become a popular paradigm in web search. In this paper, we present the results of a user study that collected more than 1,500 assessments of vertical intent over 320 web topics. Firstly, we show that users prefer diverse vertical content for many queries and that the level of inter-assessor agreement for the task is fair [2]. Secondly, we propose a methodology to predict the ...
Aggregating search results from a variety of heterogeneous sources or verticals such as news, image and video into a single interface is a popular paradigm in web search. Although various approaches exist for selecting relevant verticals or optimising the aggregated search result page, evaluating the quality of an aggregated page is an open question. This paper proposes a general framework for evaluating the quality of aggregated search pages. We evaluate our approach by collecting annotated user preferences over a set of aggregated search pages for 56 topics and 12 verticals. We empirically demonstrate the fidelity of metrics instantiated from our proposed framework by showing that they strongly agree with the annotated user preferences of pairs of simulated aggregated pages. Furthermore, we show that our metrics agree with the majority user preference more often than the current diversity-based information retrieval metrics. Finally, we demonstrate the flexibility of our framework by showing that personalised historical preference data can improve the performance of our proposed metrics.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013
To cope with the uncertainty involved with ambiguous or underspecified queries, search engines often diversify results to return documents that cover multiple interpretations, e.g. the car brand, animal or operating system for the query 'jaguar'. Current diversity evaluation measures take the popularity of the subtopics into account and aim to favour systems that promote the most popular subtopics earliest in the result ranking. However, this subtopic popularity is assumed to be static over time. In this paper, we hypothesise that temporal subtopic popularity change is common for many topics and argue that this characteristic should be considered when evaluating diversity. Firstly, to support our hypothesis, we analyse temporal subtopic popularity changes for ambiguous queries through historic Wikipedia article viewing statistics. Further, by simulation, we demonstrate the impact of this temporal intent variability on diversity evaluation.
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06, 2006
For many years it has been commonly held that a user who adds structural "hints" to a query will improve precision in an element retrieval search. At INEX 2005 we conducted an experiment to test this assumption. We present the unexpected result that structural hints in queries do not improve precision. An analysis of the topics and the judgments suggests that this is because users are particularly bad at giving structural hints.
Proceedings of the 22nd International Conference on World Wide Web - WWW '13 Companion, 2013
Users interact with online news in many ways, one of them being sharing content through online social networking sites such as Twitter. There is a small but important group of users that devote a substantial amount of effort and care to this activity. These users monitor a large variety of sources on a topic or around a story, carefully select interesting material on this topic, and disseminate it to an interested audience ranging from thousands to millions. These users are news curators, and are the main subject of study of this paper. We adopt the perspective of a journalist or news editor who wants to discover news curators among the audience engaged with a news site.
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '01, 2001
In this paper we report on a series of experiments designed to investigate query modification techniques motivated by the area of abductive reasoning. In particular we use the notion of abductive explanation, an explanation being a description of data that highlights important features of the data. We describe several methods of creating abductive explanations, exploring term reweighting and query reformulation techniques, and demonstrate their suitability for relevance feedback.