A generic ranking function discovery framework by genetic programming for information retrieval☆
Introduction
The information retrieval (IR) field is undergoing dramatic development and change due to advances in information technology and computation techniques. The large amount of digital information increasingly available in our society makes information retrieval research one of the most exciting and important fields. According to SearchEngineWatch.com, 75% of online users use search engines to traverse the web, which indicates the importance of information retrieval in our daily life. Despite the recent advances in information retrieval and search technologies, studies (Gordon & Pathak, 1999) show that the performance of search engines is not quite up to the expectations of end users. Users often spend a lot of time sifting through hit lists full of irrelevant results.
Various reasons contribute to the dissatisfaction of end users, including imprecise query formulation and unfamiliarity with system usage. We argue in this paper that the ranking strategies adopted by these search engines also deserve part of the blame. Ranking strategies, often called ranking functions in IR, are used to order search results by decreasing estimated relevance to a user's search query. Most IR systems use a single fixed ranking strategy to support the information seeking task of all users for all queries, irrespective of the heterogeneity of end users and queries. This is the so-called “consensus search”, in which the computed relevancy for the entire population is presumed appropriate for each user (Pitkow et al., 2002). One major benefit of “consensus search” is that all users get the same results, which fosters result-sharing among users (Pitkow et al., 2002). However, there are many other cases where users prefer search results tailored to their own personal preferences, the so-called personalized search or personalized ranking (Pitkow et al., 2002). Most current search engines do not support such an advanced personalized search feature.
Both consensus search and personalized search require a good ranking function to obtain good performance. Although there are various ranking functions available, most of them are manually designed by IR experts based on heuristics, experience, or observations. Although some of these ranking functions, such as that used in Okapi (see Eq. (2)), are designed based on probabilistic theory, their performance for each individual query is not guaranteed. In other words, even though those theoretically justified ranking functions may work reasonably well on average for a set of queries, they may not work well for each individual query. In fact, various ranking function evaluations and comparative studies (Salton, 1989; Zobel & Moffat, 1998) showed that these ranking functions do not work consistently well across queries. Moreover, it requires a lot of human effort to design a personalized ranking function for each individual query. Finding an optimal ranking function for a particular query or a group of queries remains a design challenge for IR research.
In this paper, we introduce a systematic and automatic discovery framework to aid the ranking function design process. This ranking function discovery is based on an artificial intelligence technique called Genetic Programming, which is widely used in various optimal design and data mining applications (Koza, 1992). We show through various experiments using real textual data that the new ranking function discovery framework is a flexible and powerful discovery tool for optimal ranking function design.
The remainder of this paper is organized as follows. In Section 2 we review related research on ranking function design and evaluation. In Section 3 we give a formal definition of the ranking function discovery problem and describe our ranking function discovery framework based on Genetic Programming. Section 4 discusses several experiments validating our ranking function discovery framework. We discuss related work in Section 5 and conclude the paper with implications of this study and future research directions in Section 6.
Prior research on ranking function design and evaluation
IR systems use ranking functions to order documents according to their estimated match with a user query. To facilitate this relevance estimation, both documents and user queries need to be transformed into a form that computers can process effectively. One of the most successful models is the so-called Vector Space Model (VSM) (Salton, 1971; Salton, 1989).
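As a rough illustration of the VSM idea, the sketch below represents tokenized documents and a query as weighted term vectors and ranks the documents by vector similarity. The tf-idf weighting, the cosine measure, and all function names are assumptions chosen for illustration; they are not the specific ranking functions studied in this paper.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one tf-idf weight dict per tokenized document.

    The smoothed idf (log(N / df) + 1) is an assumed, commonly used variant.
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    vectors, idf = tf_idf_vectors(docs)
    # Query terms unseen in the collection get zero weight.
    q = {term: tf * idf.get(term, 0.0) for term, tf in Counter(query).items()}
    scores = [(cosine(q, vec), i) for i, vec in enumerate(vectors)]
    # Sort by score descending; break ties by document index.
    return [i for _, i in sorted(scores, key=lambda p: (-p[0], p[1]))]


docs = [
    ["genetic", "programming", "ranking"],
    ["stock", "market", "news"],
    ["ranking", "functions", "retrieval", "ranking"],
]
print(rank(["ranking", "retrieval"], docs))
```

Here the third document shares both query terms and is ranked first, while the second shares none and is ranked last. Swapping in a different weighting scheme changes only `tf_idf_vectors`, which is the kind of design choice a ranking function encodes.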
The VSM is the underlying model for this study for two reasons:
- (1) Ease of interpretation
The VSM is a well-grounded
Nature of the ranking function discovery problem
The problem of finding a good ranking function is illustrated in Fig. 1. In the “Rele.” column of both document tables, “1” and “0” stand for “relevant” and “non-relevant”, respectively.
The problem of finding a ranking function can be formalized as follows:
Given as input a user query (or a set of queries) and a set of training documents with known relevance judgments for the query (or queries), a ranking function is sought by the discovery framework that can potentially rank all relevant documents
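To make this problem statement concrete, the following sketch scores one candidate ranking function against training documents with known relevance judgments. It assumes, purely for illustration, that a candidate is an expression tree over per-document features (as is typical in Genetic Programming) and that fitness is measured by average precision; the feature names `tf` and `length`, the operator set, and the fitness measure are hypothetical choices, not details taken from this paper. The evolutionary loop itself (selection, crossover, mutation) is omitted.

```python
# Binary operators allowed inside a candidate tree (an assumed set).
OPS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if b != 0 else 0.0,  # protected division
}


def evaluate(tree, features):
    """Evaluate a candidate tree on one document's feature dict.

    A tree is a tuple (op, left, right), a feature name, or a numeric constant.
    """
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, features), evaluate(right, features))
    if isinstance(tree, str):
        return features[tree]
    return float(tree)


def average_precision(ranked_relevance):
    """Average precision of a ranked list of 0/1 relevance judgments."""
    hits = 0
    total = 0.0
    for position, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / position
    return total / hits if hits else 0.0


def fitness(tree, training_docs):
    """Rank training documents by the candidate's score and measure the result."""
    ranked = sorted(
        training_docs,
        key=lambda doc: evaluate(tree, doc["features"]),
        reverse=True,
    )
    return average_precision([doc["relevant"] for doc in ranked])


# Toy training data with hypothetical feature values.
training_docs = [
    {"features": {"tf": 3.0, "length": 100.0}, "relevant": 1},
    {"features": {"tf": 5.0, "length": 1000.0}, "relevant": 0},
    {"features": {"tf": 1.0, "length": 20.0}, "relevant": 1},
    {"features": {"tf": 0.0, "length": 50.0}, "relevant": 0},
]
print(fitness("tf", training_docs))
print(fitness(("/", "tf", "length"), training_docs))
```

On this toy data, the length-normalized candidate places both relevant documents first and so scores higher than raw term frequency. A discovery procedure would generate and recombine many such trees and keep those with the highest fitness.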
Experiments
To test the ranking function discovery framework, we used the Associated Press (AP) news collection from the TREC conference (Harman, 1996) as our textual data. This news collection contains more than 240,000 news articles from 1988 to 1990 and covers a variety of domains and topics. It has been used widely in the IR field to test new retrieval algorithms. More specifically, we use AP88 (79,919 documents) as the training data, AP89 (84,678 documents) as the validation data, and AP90 (78,321
Related work
There have been several efforts on ranking function optimization in the IR literature.
The earliest work is by Fox (1983), who used polynomial regression to optimize the ranking function. Fuhr et al. (Fuhr & Buckley, 1991; Fuhr & Pfeifer, 1994) used probabilistic models as machine learning approaches. The concept of relevance description used in Fuhr and Buckley (1991) and Fuhr and Pfeifer (1994) is very similar to the weighting evidence (tf, df, …) we used for ranking. The difference in
Conclusions
In this paper, by effectively leveraging the clues of different weighting features used by many IR experts, we demonstrated that a machine intelligence tool like GP can help us automate the discovery of better ranking functions for a variety of contexts, a task that would otherwise be very tedious and difficult for any human being. More specifically, the new ranking function discovery framework based on GP can be used to effectively discover either personalized ranking functions for each individual
References (32)
- et al. (1999). Finding information on the World Wide Web: the retrieval effectiveness of search engines. Information Processing and Management.
- et al. (2000). Real life, real users, and real needs: a study and analysis of user queries on the web. Information Processing and Management.
- et al. (1988). Term weighting approaches in automatic text retrieval. Information Processing and Management.
- et al. (1996). Document length normalization. Information Processing and Management.
- et al. (1998). Genetic programming: an introduction––on the automatic evolution of computer programs and its applications.
- Bartell, B. T., Cottrell, G. W., & Belew, R. K. (1994). Automatic combination of multiple ranked retrieval systems. In...
- et al. (1998). A smart itsy bitsy spider for the web. Journal of the American Society for Information Science.
- Fan, W., Gordon, M. D., & Pathak, P. (2000). Personalization of search engine services for effective retrieval and...
- Fox, E. A. (1983). Extending the boolean and vector space models of information retrieval with p-norm queries and...
- Fox, E. A., Koushik, M. P., Shaw, J., Modlin, R., & Rao, D. (1993). Combining evidence from multiple searches. In...
- A probabilistic learning approach for document indexing. ACM Transactions on Information Systems.
- Probabilistic information retrieval as combination of abstraction, inductive learning and probabilistic assumptions. ACM Transactions on Information Systems.
- Probabilistic and genetic algorithms for document retrieval. Communications of ACM.
- User-based document clustering by redescribing subject descriptions with a genetic algorithm. Journal of the American Society for Information Science.
☆ An earlier version of this paper was presented at the 2000 International Conference on Information Systems by Fan, Gordon, and Pathak (2000).