Relevance feedback based on genetic programming for image retrieval

https://doi.org/10.1016/j.patrec.2010.05.015

Abstract

This paper presents two content-based image retrieval frameworks with relevance feedback based on genetic programming. The first framework exploits only the user indication of relevant images. The second one considers not only the relevant images but also those indicated as non-relevant.

Several experiments were conducted to validate the proposed frameworks. These experiments employed three different image databases and color, shape, and texture descriptors to represent the content of database images. The proposed frameworks were compared with, and outperformed, six other relevance feedback methods regarding their effectiveness and efficiency in image retrieval tasks.

Introduction

Large image collections have been created and managed in several applications, such as digital libraries, medicine, and biodiversity information systems (da S. Torres and Falcão, 2006). Given the large size of these collections, it is essential to provide efficient and effective mechanisms to retrieve images. This is the objective of the so-called content-based image retrieval (CBIR) systems (Veltkamp, 2000). In these systems, the search process consists of, for a given query image, finding the most similar images stored in the database. The search process relies on the use of image descriptors. A descriptor can be characterized by two functions (da S. Torres and Falcão, 2006): feature vector extraction and distance function. The feature vectors encode image properties, such as color, texture, and shape. The similarity between two images is computed as a function of the distance between their feature vectors.
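The descriptor abstraction described above can be sketched as a pair of functions. This is a minimal illustration rather than the paper's implementation: the toy feature (`mean_intensity`), the L1 distance, and the mapping `1 / (1 + d)` from distance to similarity are all assumptions chosen for brevity, and images are represented as flat lists of pixel intensities.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = Sequence[float]


@dataclass(frozen=True)
class Descriptor:
    """A descriptor as a pair: feature vector extraction plus distance function."""
    extract: Callable[[Sequence[float]], Vector]
    distance: Callable[[Vector, Vector], float]

    def similarity(self, image_a: Sequence[float], image_b: Sequence[float]) -> float:
        d = self.distance(self.extract(image_a), self.extract(image_b))
        # Assumption: one of many possible distance-to-similarity mappings.
        return 1.0 / (1.0 + d)


def mean_intensity(pixels: Sequence[float]) -> Vector:
    """Hypothetical toy feature: a 1-D vector holding the mean pixel value."""
    return [sum(pixels) / len(pixels)]


def l1_distance(u: Vector, v: Vector) -> float:
    """City-block distance between two feature vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v))


def rank(query: Sequence[float], database: List[Sequence[float]],
         descriptor: Descriptor) -> List[int]:
    """Return database indices sorted from most to least similar to the query."""
    return sorted(range(len(database)),
                  key=lambda i: descriptor.similarity(query, database[i]),
                  reverse=True)


toy_descriptor = Descriptor(extract=mean_intensity, distance=l1_distance)
```

A real color, texture, or shape descriptor would replace both functions while keeping the same interface.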

Usually, different descriptors are statically combined (Vadivel et al., 2004, da S. Torres et al., 2009), that is, the descriptor combination is fixed (e.g., linear combination) and used to process all queries submitted to the retrieval system. Nevertheless, different people may have distinct visual perceptions of the same image. Therefore, a static combination of descriptors may not properly characterize this diversity of perception. Furthermore, it is not easy for a user to map his/her visual perception of an image into low-level features such as color and shape (“semantic gap”). Motivated by these limitations, relevance feedback (RF) approaches were incorporated into CBIR systems (Rui et al., 1998, Liu et al., 2008, Cord et al., 2007, Liu et al., 2007, de Ves et al., 2006).

The image retrieval process with relevance feedback comprises four steps: (i) showing a small number of retrieved images to the user; (ii) user indication of relevant and non-relevant images; (iii) learning the user's needs by taking his/her feedback into account; and (iv) selecting a new set of images to be shown. This procedure is repeated until a satisfactory result is reached.
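The four-step loop can be sketched as follows. This is an illustrative skeleton under stated assumptions, not the authors' code: the callbacks `ask_user` and `learn_and_rerank` are hypothetical placeholders, already-judged images are not shown again, and "satisfactory result" is approximated by a target number of relevant images or an iteration limit.

```python
from typing import Callable, List, Set


def relevance_feedback_session(
    initial_ranking: List[int],
    ask_user: Callable[[List[int]], Set[int]],
    learn_and_rerank: Callable[[Set[int], Set[int]], List[int]],
    page_size: int = 5,
    max_iterations: int = 10,
    enough_relevant: int = 5,
) -> Set[int]:
    """Run an RF session and return the ids the user marked as relevant."""
    ranking = list(initial_ranking)
    relevant: Set[int] = set()
    non_relevant: Set[int] = set()
    for _ in range(max_iterations):
        # (i) Show a small number of retrieved images (skipping judged ones).
        judged = relevant | non_relevant
        shown = [i for i in ranking if i not in judged][:page_size]
        if not shown:
            break
        # (ii) The user indicates which of the shown images are relevant.
        marked = ask_user(shown)
        relevant |= marked
        non_relevant |= set(shown) - marked
        # Assumed stopping rule standing in for "a satisfactory result".
        if len(relevant) >= enough_relevant:
            break
        # (iii) Learn from the feedback and (iv) select a new ranking.
        ranking = learn_and_rerank(relevant, non_relevant)
    return relevant
```

Any learner, including a GP-based one, can be plugged in through `learn_and_rerank` as long as it returns a new ordering of image ids.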

An important element of a relevance feedback technique is the learning process. Several relevance feedback methods designed for CBIR systems implement the learning of the user needs by assigning different weights to the descriptors used in the search process (Rui et al., 1998, Rui and Huang, 2000, Doulamis and Doulamis, 2006). This strategy allows only a linear combination of the similarity values defined by each descriptor. However, more complex combination functions may be necessary to characterize specific user visual perceptions.
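For contrast with the non-linear combinations discussed later, a weight-based scheme can be sketched as a normalized weighted sum of per-descriptor similarity values. The function name and the normalization by the weight total are assumptions made for illustration; the cited methods differ in how the weights are learned.

```python
from typing import Sequence


def linear_combination(similarities: Sequence[float],
                       weights: Sequence[float]) -> float:
    """Weighted average of per-descriptor similarity values."""
    if len(similarities) != len(weights):
        raise ValueError("one weight is required per descriptor")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, similarities)) / total


# Example: color, shape, and texture similarities, with color weighted twice.
combined = linear_combination([0.8, 0.2, 0.5], [2.0, 1.0, 1.0])
```

Whatever the weights, the result is always a linear function of the individual similarities, which is the limitation the paragraph above points out.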

Another common drawback of existing RF methods is that they generally ignore the similarity function defined for each available descriptor. In some RF approaches, the learning process is based only on feature vectors (Tong and Chang, 2001, Cord et al., 2007). Others define specific distance functions for computing the similarity between two images (Rui and Huang, 2000, Doulamis and Doulamis, 2006). In both cases, the overall CBIR system effectiveness may decrease if the similarity functions of the descriptors are not used.

In this paper, two new relevance feedback methods for interactive image search are proposed. These methods adopt a genetic programming approach to learn user preferences in a query session. Genetic programming (GP) (Koza, 1992) is a machine learning technique used in many applications, such as data mining, signal processing, and regression (Bhanu and Lin, 2004, Fan et al., 2004b, Zhang et al., 2004). This technique is based on the theory of evolution and searches for near-optimal solutions. It is a kind of evolutionary algorithm (Bäck et al., 2002) that is distinguished from the others mainly by its representation of individuals. The use of GP in this work is motivated by the previous success of this technique in information retrieval (Fan et al., 2004b, de Almeida et al., 2007) and CBIR (da S. Torres et al., 2009) tasks.

The main contribution of this paper is the proposal of new RF frameworks that use GP to find a function that non-linearly combines the similarity values computed by different descriptors. Furthermore, in our approach, the similarity functions defined for each available descriptor are used to compute the overall similarity between two images.
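To make the idea of a non-linear combination function concrete, a GP individual can be pictured as an expression tree whose leaves are per-descriptor similarity values. The sketch below only evaluates such a tree; the nested-tuple encoding, the function set (`+`, `*`, protected `/`, protected `sqrt`), and the descriptor names are assumptions for illustration, not the configuration reported in the paper.

```python
import math
import operator
from typing import Dict, Union

Tree = Union[str, float, tuple]


def protected_div(a: float, b: float) -> float:
    """Division that returns 1.0 instead of failing on a near-zero divisor."""
    return a / b if abs(b) > 1e-9 else 1.0


def protected_sqrt(a: float) -> float:
    """Square root of the absolute value, so it is defined for any input."""
    return math.sqrt(abs(a))


# Assumed function set for the internal nodes of the tree.
BINARY = {"+": operator.add, "*": operator.mul, "/": protected_div}
UNARY = {"sqrt": protected_sqrt}


def evaluate(tree: Tree, sims: Dict[str, float]) -> float:
    """Evaluate a tree given the similarity value of each descriptor."""
    if isinstance(tree, str):           # leaf: a descriptor name
        return sims[tree]
    if isinstance(tree, (int, float)):  # leaf: a numeric constant
        return float(tree)
    op, *children = tree                # internal node: (operator, child, ...)
    values = [evaluate(child, sims) for child in children]
    if op in BINARY:
        return BINARY[op](*values)
    return UNARY[op](*values)


# Hypothetical individual: color * shape + sqrt(texture)
individual = ("+", ("*", "color", "shape"), ("sqrt", "texture"))
score = evaluate(individual, {"color": 0.5, "shape": 0.4, "texture": 0.25})
```

In a full GP run, a population of such trees would be evolved through selection, crossover, and mutation, with the fitness of each tree derived from how well its ranking agrees with the user's feedback.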

The effectiveness and efficiency of the proposed methods are compared with other relevance feedback techniques (Rui et al., 1998, Rui and Huang, 2000, Tong and Chang, 2001, Doulamis and Doulamis, 2006, Min and Cheng, 2009) for image retrieval tasks. Experiments conducted considering three different image collections and the use of color, texture, and shape descriptors demonstrate that the proposed frameworks are effective and efficient for CBIR, outperforming the (state-of-the-art) baselines.

This paper differs from our previously published work (Ferreira et al., 2008, dos Santos et al., 2008) in the following aspects:

  • the relevance feedback method was extended to incorporate not only relevant (GP+ RF method) but also non-relevant images (GP± RF method). Our hypothesis here is that non-relevant images provide useful information that can be used to generate better retrieval results.

  • the relevance feedback method was modified. The main changes concern:

    • the definition of the training set (see Section 4.1.2). Both GP+ and GP± RF methods adopt the same training set composition, which simplifies their implementation.

    • the definition of the query pattern (see Sections 4.1 and 4.2). Previous implementations (Ferreira et al., 2008, dos Santos et al., 2008) considered all images defined as relevant in the query pattern. The current implementation considers only a subset of those images (usually, half of them). This speeds up the fitness computation process, making the method more efficient.

    • the modification of the fitness computation process (see Section 4.1.2.2). A different utility function is used to assess the quality (fitness) of a GP individual. The experiments showed that the proposed RF methods are more effective when the new utility function is used.

  • the execution of more experiments aiming to compare the proposed relevance feedback approaches with a recently proposed SVM-based method (Min and Cheng, 2009) (see Sections 5 and 6). Note that the proposed frameworks are now compared with five other RF methods: WDheu (Rui et al., 1998), WDopt (Rui and Huang, 2000), QSstr (Doulamis and Doulamis, 2006), SVMactive (Tong and Chang, 2001), and SVMfuzzy (Min and Cheng, 2009). We also included the GPLSP (dos Santos et al., 2008) relevance feedback method as a baseline in the experiments that considered a colorful dataset. The experiments demonstrate that the proposed frameworks outperformed the baselines regarding their effectiveness and efficiency in image retrieval tasks.

This paper is organized as follows: Section 2 discusses related work; Section 3 describes the CBIR model used (Section 3.1) and gives a brief overview of the basic concepts of genetic programming (Section 3.2); Section 4 details the GP-based frameworks proposed in this paper; experimental design and results are reported in Sections 5 and 6, respectively; finally, conclusions and future work are discussed in Section 7.

Related work

Relevance feedback (RF) (Zhou and Huang, 2003, Liu et al., 2007, Datta et al., 2008) is a technique initially proposed for document retrieval that has been used with great success for human–computer interaction in CBIR. RF addresses two issues in the CBIR process. The first is the semantic gap between high-level visual properties of images and the low-level features used to describe them. Usually, it is not easy for a user to map his/her visual perception of an image into low-level

Background

This section presents the CBIR model adopted in our work and a brief overview of the basic concepts of genetic programming.

GP-based relevance feedback frameworks

This section presents two novel frameworks for RF in CBIR systems based on genetic programming. The first framework, GP+, incorporates only the user indication of relevant (positive) images into the query pattern. The second one, GP±, considers not only the relevant but also the non-relevant (negative) images indicated by the user.

Experiments

This section describes in detail the experiments performed to validate our proposed frameworks.

Experiment results

Two different experiments were conducted: the first one aims to determine the best GP parameters to be used in the RF frameworks (Section 6.1); the second compares the proposed methods with the baseline RF techniques with regard to their effectiveness and efficiency (Sections 6.2 and 6.3).

In our experiments, the presence of users is simulated. In this simulation, all images belonging to the same class as the query image are considered relevant. Ten
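The class-based simulation of user feedback can be sketched as below. The function name and the dictionary of class labels are hypothetical; only the rule itself, that an image is relevant when it shares the query image's class, comes from the text.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def simulated_feedback(
    shown_ids: Iterable[int],
    labels: Dict[int, Hashable],
    query_label: Hashable,
) -> Tuple[Set[int], Set[int]]:
    """Split the shown images into (relevant, non_relevant) by class label."""
    shown = set(shown_ids)
    # An image is relevant when its class matches the class of the query image.
    relevant = {i for i in shown if labels[i] == query_label}
    return relevant, shown - relevant


# Hypothetical ground-truth classes for four database images.
labels = {0: "cat", 1: "dog", 2: "cat", 3: "bird"}
relevant, non_relevant = simulated_feedback([0, 1, 3], labels, "cat")
```

Replacing a human judge with this rule makes experiments repeatable, at the cost of assuming that class membership matches what a real user would consider relevant.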

Conclusions

We have presented two novel relevance feedback-based CBIR frameworks. These methods use genetic programming to learn the user preferences, using the similarity functions defined for all available descriptors. The objective of the GP-based learning methods is to find a descriptor combination function that best represents the user perception.

Experiments were performed on three different image collections using several descriptors to characterize color, texture, and shape features. In these

Acknowledgments

This work was partially supported by the Brazilian Institute of Science and Technology for the Web (Grant MCT/CNPq 573871/2008-6) and by the InfoWeb project (Grant MCT/CNPq/CT-INFO 550874/2007-0). The authors also acknowledge their individual grants and scholarships from CNPq, CAPES, FAPESP (Grants 2009/18438-7, 2008/58528-2, 2007/53607-9, and 2005/58228-0) and FAPEMIG. The authors are also grateful to the Virtual Institute FAPESP-Microsoft Research (eFarms project).

References (41)

  • R. Liu et al. SVM-based active feedback in image retrieval using clustering and unlabeled data. Pattern Recognition (2008)

  • R. Min et al. Effective image retrieval using dominant color descriptor and fuzzy support vector machine. Pattern Recognition (2009)

  • Z. Stejić et al. Genetic algorithm-based relevance feedback for image retrieval using local similarity patterns. Inform. Process. Manage. (2003)

  • M. Unser et al. A family of polynomial spline wavelet transforms. Signal Process. (1993)

  • M. Arevalillo-Herráez et al. A relevance feedback CBIR algorithm based on fuzzy sets. Signal Process. Image Comm. (2008)

  • T. Bäck et al. Evolutionary Computation 1: Basic Algorithms and Operators (2002)

  • I.J. Cox et al. The Bayesian image retrieval system, PicHunter: Theory, implementation, and psychophysical experiments. IEEE Trans. Image Process. (2000)

  • R. da S. Torres et al. Content-based image retrieval: Theory and applications. Revista de Informática Teórica e Aplicada (2006)

  • R. da S. Torres et al. Contour salience descriptors for effective image retrieval and analysis. Image Vision Comput. (2007)

  • R. da S. Torres et al. A genetic programming framework for content-based image retrieval. Pattern Recognition (2009)