Relevance feedback based on genetic programming for image retrieval

doi:10.1016/j.patrec.2010.05.015

Pattern Recognition Letters

Volume 32, Issue 1, 1 January 2011, Pages 27-37

https://doi.org/10.1016/j.patrec.2010.05.015

Abstract

This paper presents two content-based image retrieval frameworks with relevance feedback based on genetic programming. The first framework exploits only the user indication of relevant images. The second one considers not only the relevant images but also those indicated as non-relevant.

Several experiments were conducted to validate the proposed frameworks. These experiments employed three different image databases and used color, shape, and texture descriptors to represent the content of database images. The proposed frameworks were compared with, and outperformed, six other relevance feedback methods regarding their effectiveness and efficiency in image retrieval tasks.

Introduction

Large image collections have been created and managed in several applications, such as digital libraries, medicine, and biodiversity information systems (da S. Torres and Falcão, 2006). Given the large size of these collections, it is essential to provide efficient and effective mechanisms to retrieve images. This is the objective of the so-called content-based image retrieval (CBIR) systems (Veltkamp, 2000). In these systems, the search process consists of, for a given query image, finding the most similar images stored in the database. The search process relies on the use of image descriptors. A descriptor can be characterized by two functions (da S. Torres and Falcão, 2006): feature vector extraction and distance function. The feature vectors encode image properties, such as color, texture, and shape. The similarity between two images is computed as a function of the distance between their feature vectors.
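The two-function view of a descriptor described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the paper: the `Descriptor` class, the toy gray-level histogram, the L1 distance, and the representation of an image as a flat list of gray levels.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical image representation: a flat sequence of gray levels in 0..255.
Image = Sequence[int]
FeatureVector = List[float]


@dataclass(frozen=True)
class Descriptor:
    """A descriptor as a pair: feature vector extraction + distance function."""
    extract: Callable[[Image], FeatureVector]
    distance: Callable[[FeatureVector, FeatureVector], float]

    def dissimilarity(self, a: Image, b: Image) -> float:
        # Distance between the feature vectors of the two images.
        return self.distance(self.extract(a), self.extract(b))


def gray_histogram(image: Image, bins: int = 4) -> FeatureVector:
    """Toy extraction function: normalized histogram of gray levels."""
    counts = [0.0] * bins
    for level in image:
        counts[min(level * bins // 256, bins - 1)] += 1.0
    total = float(len(image)) or 1.0  # avoid division by zero on empty input
    return [c / total for c in counts]


def l1_distance(u: FeatureVector, v: FeatureVector) -> float:
    """Toy distance function: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(u, v))


# An illustrative descriptor, not one of those used in the paper.
histogram_descriptor = Descriptor(gray_histogram, l1_distance)


def rank(query: Image, database: Sequence[Image], descriptor: Descriptor) -> List[int]:
    """Return database indices ordered from most to least similar to the query."""
    return sorted(
        range(len(database)),
        key=lambda i: descriptor.dissimilarity(query, database[i]),
    )
```

Calling `rank(query, database, histogram_descriptor)` returns the indices of the database images sorted by increasing distance to the query, which is the basic search step of a CBIR system with a single descriptor.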

Usually, different descriptors are statically combined (Vadivel et al., 2004, da S. Torres et al., 2009), that is, the descriptor combination is fixed (e.g., linear combination) and used to process all queries submitted to the retrieval system. Nevertheless, different people may have distinct visual perceptions of the same image. Therefore, a static combination of descriptors may not properly characterize this diversity of perception. Furthermore, it is not easy for a user to map his/her visual perception of an image into low-level features such as color and shape (the “semantic gap”). Motivated by these limitations, relevance feedback (RF) approaches were incorporated into CBIR systems (Rui et al., 1998, Liu et al., 2008, Cord et al., 2007, Liu et al., 2007, de Ves et al., 2006).

Basically, the image retrieval process with relevance feedback comprises four steps: (i) showing a small number of retrieved images to the user; (ii) user indication of relevant and non-relevant images; (iii) learning the user needs by taking into account his/her feedback; and (iv) selecting a new set of images to be shown. This procedure is repeated until a satisfactory result is reached.
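The four-step loop can be sketched as follows. This is only an illustration under stated assumptions: the function `relevance_feedback_session`, its parameters, and the one-dimensional toy features are hypothetical, and the learning step is a deliberately simple placeholder (distance to the nearest image already marked relevant), not the genetic programming learner proposed in this paper. The user is simulated by a callback that marks an image as relevant when it belongs to the same class as the query.

```python
from typing import Callable, Dict, List, Set


def relevance_feedback_session(
    query: str,
    features: Dict[str, float],
    user_says_relevant: Callable[[str], bool],
    page_size: int = 2,
    max_iterations: int = 3,
) -> List[str]:
    """Run the four-step loop and return the images the user marked relevant."""
    relevant: Set[str] = {query}
    non_relevant: Set[str] = set()

    for _ in range(max_iterations):
        # (iii) Placeholder learner (NOT genetic programming): score an image
        # by its distance to the nearest image known to be relevant.
        def score(image_id: str) -> float:
            return min(abs(features[image_id] - features[r]) for r in relevant)

        # (iv) and (i): select and show the best images not yet labeled.
        unlabeled = [i for i in features if i not in relevant and i not in non_relevant]
        shown = sorted(unlabeled, key=score)[:page_size]
        if not shown:
            break

        # (ii) The user labels each shown image as relevant or non-relevant.
        for image_id in shown:
            if user_says_relevant(image_id):
                relevant.add(image_id)
            else:
                non_relevant.add(image_id)

    return sorted(relevant - {query})


# Toy data (hypothetical): one scalar feature and one class label per image.
features = {"q": 0.0, "a": 0.1, "b": 0.5, "c": 0.9, "d": 1.0, "e": 0.55}
labels = {"q": "cat", "a": "cat", "b": "cat", "c": "dog", "d": "dog", "e": "cat"}

# Simulated user: relevant means "same class as the query image".
found = relevance_feedback_session("q", features, lambda i: labels[i] == labels["q"])
print(found)
```

In this toy run, image `e` is far from the query but close to `b`, so it is only retrieved after `b` has been marked relevant in the first iteration. The placeholder uses the non-relevant set only to avoid showing an image twice.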

An important element of a relevance feedback technique is the learning process. Several relevance feedback methods designed for CBIR systems implement the learning of the user needs by assigning different weights to the descriptors used in the search process (Rui et al., 1998, Rui and Huang, 2000, Doulamis and Doulamis, 2006). This strategy allows only a linear combination of the similarity values defined by each descriptor. However, more complex combination functions may be necessary to characterize specific user visual perceptions.

Another common drawback of existing RF methods is that they generally ignore the similarity function defined for each available descriptor. In some RF approaches, the learning process is based only on feature vectors (Tong and Chang, 2001, Cord et al., 2007). Others define specific distance functions for computing the similarity between two images (Rui and Huang, 2000, Doulamis and Doulamis, 2006). In both cases, the overall CBIR system effectiveness may decrease if the similarity functions of the descriptors are not used.

In this paper, two new relevance feedback methods for interactive image search are proposed. These methods adopt a genetic programming approach to learn user preferences in a query session. Genetic programming (GP) (Koza, 1992) is a machine learning technique used in many applications, such as data mining, signal processing, and regression (Bhanu and Lin, 2004, Fan et al., 2004b, Zhang et al., 2004). This technique draws on the theory of evolution to find near-optimal solutions. It is a kind of evolutionary algorithm (Bäck et al., 2002) that is distinguished from the others mainly by its representation of individuals. The use of GP in this work is motivated by the previous success of this technique in information retrieval (Fan et al., 2004b, de Almeida et al., 2007) and CBIR (da S. Torres et al., 2009) tasks.

The main contribution of this paper is the proposal of new RF frameworks that use GP to find a function that non-linearly combines the similarity values computed by different descriptors. Furthermore, in our approach, the similarity functions defined for each available descriptor are used to compute the overall similarity between two images.
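To make the idea of a combination function concrete, the sketch below represents a candidate function as an expression tree whose leaves are per-descriptor similarity values, which is the usual way GP individuals are encoded. The tuple encoding, the operator set (`add`, `mul`, `sqrt`), and the two example trees are assumptions chosen for illustration; this excerpt does not specify the function set or terminals actually used, and no evolutionary search is shown here.

```python
import math
from typing import Sequence, Tuple

# Hypothetical encoding of a GP individual as nested tuples:
#   ("var", i)      -> similarity value computed by descriptor i
#   ("const", c)    -> numeric constant
#   (op, child...)  -> inner node, with op in {"add", "mul", "sqrt"}
Tree = Tuple


def evaluate(tree: Tree, sims: Sequence[float]) -> float:
    """Evaluate a combination tree on the per-descriptor similarity values."""
    op = tree[0]
    if op == "var":
        return sims[tree[1]]
    if op == "const":
        return tree[1]
    args = [evaluate(child, sims) for child in tree[1:]]
    if op == "add":
        return args[0] + args[1]
    if op == "mul":
        return args[0] * args[1]
    if op == "sqrt":
        return math.sqrt(abs(args[0]))  # protected: never fails on negatives
    raise ValueError(f"unknown operator: {op}")


# A weighted linear combination, the only form that weight-based RF can express.
linear = ("add",
          ("mul", ("const", 0.5), ("var", 0)),
          ("mul", ("const", 0.5), ("var", 1)))

# A non-linear combination that a weight-based scheme cannot represent.
non_linear = ("mul", ("var", 0), ("sqrt", ("var", 1)))

sims = [0.5, 0.25]  # e.g. similarities from a color and a texture descriptor
print(evaluate(linear, sims), evaluate(non_linear, sims))
```

A GP-based method would search the space of such trees, keeping those that rank the images marked by the user most appropriately, instead of tuning only the constants of the linear form.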

The effectiveness and efficiency of the proposed methods are compared with other relevance feedback techniques (Rui et al., 1998, Rui and Huang, 2000, Tong and Chang, 2001, Doulamis and Doulamis, 2006, Min and Cheng, 2009) for image retrieval tasks. Experiments conducted considering three different image collections and the use of color, texture, and shape descriptors demonstrate that the proposed frameworks are effective and efficient for CBIR, outperforming the (state-of-the-art) baselines.

This paper differs from the papers published in (Ferreira et al., 2008, dos Santos et al., 2008) with regard to the following aspects:

• The relevance feedback method was extended to incorporate not only relevant images (GP⁺ RF method) but also non-relevant images (GP^± RF method). Our hypothesis is that non-relevant images provide useful information that can be used to generate better retrieval results.
• The relevance feedback method was modified. The main changes concern:
  – the definition of the training set (see Section 4.1.2). Both the GP⁺ and GP^± RF methods adopt the same training set composition, which simplifies their implementation.
  – the definition of the query pattern (see Sections 4.1 and 4.2). Previous implementations (Ferreira et al., 2008, dos Santos et al., 2008) included in the query pattern all images marked as relevant. The current implementation considers only a subset of those images (usually half of them), which speeds up the fitness computation and makes the method more efficient.
  – the fitness computation process (see Section 4.1.2.2). A different utility function is used to assess the quality (fitness) of a GP individual. The experiments showed that the proposed RF methods are more effective with the new utility function.
• More experiments were conducted to compare the proposed relevance feedback approaches with a recently proposed SVM-based method (Min and Cheng, 2009) (see Sections 5 and 6). The proposed frameworks are now compared with five other RF methods: WD_heu (Rui et al., 1998), WD_opt (Rui and Huang, 2000), QS_str (Doulamis and Doulamis, 2006), SVM_active (Tong and Chang, 2001), and SVM_fuzzy (Min and Cheng, 2009). We also included the GP_LSP (dos Santos et al., 2008) relevance feedback method as a baseline in the experiments on a dataset of color images. The experiments demonstrate that the proposed frameworks outperformed the baselines in both effectiveness and efficiency in image retrieval tasks.

This paper is organized as follows: Section 2 discusses related work; Section 3 describes the CBIR model used (Section 3.1) and gives a brief overview of the basic concepts of genetic programming (Section 3.2); Section 4 details the GP-based frameworks proposed in this paper; the experimental design and results are reported in Sections 5 and 6, respectively; finally, conclusions and future work are discussed in Section 7.

Section snippets

Related work

Relevance feedback (RF) (Zhou and Huang, 2003, Liu et al., 2007, Datta et al., 2008) is a technique initially proposed for document retrieval that has been used with great success for human–computer interaction in CBIR. RF addresses two questions referring to the CBIR process. The first one is the semantic gap between high-level visual properties of images and low-level features used to describe them. Usually, it is not easy for a user to map his/her visual perception of an image into low-level

Background

This section presents the CBIR model adopted in our work and a brief overview of the Genetic Programming basic concepts.

GP-based relevance feedback frameworks

This section presents two novel frameworks for RF in CBIR systems based on genetic programming. The first framework, GP⁺, incorporates only the user indication of relevant (positive) images into the query pattern. The second one, GP^±, considers not only the relevant but also the non-relevant (negative) images indicated by the user.

Experiments

This section describes in detail the experiments performed to validate our proposed frameworks.

Experiment results

Two different experiments were conducted: the first one aims to determine the best GP parameters to be used in the RF frameworks (Section 6.1); the second compares the proposed methods with the baseline RF techniques with regard to their effectiveness and efficiency (Sections 6.2 and 6.3).

In our experiments, the presence of users is simulated. In this simulation, all images belonging to the same class as the query image are considered relevant. Ten

Conclusions

We have presented two novel relevance feedback-based CBIR frameworks. These methods use genetic programming to learn the user preferences, using the similarity functions defined for all available descriptors. The objective of the GP-based learning methods is to find a descriptor combination function that best represents the user perception.

Experiments were performed on three different image collections using several descriptors to characterize color, texture, and shape features. In these

Acknowledgments

This work was partially supported by the Brazilian Institute of Science and Technology for the Web (Grant MCT/CNPq 573871/2008-6) and by the InfoWeb project (Grant MCT/CNPq/CT-INFO 550874/2007-0). The authors also acknowledge their individual grants and scholarships from CNPq, CAPES, FAPESP (Grants 2009/18438-7, 2008/58528-2, 2007/53607-9, and 2005/58228-0) and FAPEMIG. The authors are also grateful to the Virtual Institute FAPESP-Microsoft Research (eFarms project).

References (41)

N. Arica et al. BAS: A perceptual shape descriptor based on the beam angle statistics. Pattern Recognition Lett. (2003)
B. Bhanu et al. Object detection in multi-modal images using genetic programming. Appl. Soft Comput. (2004)
M. Cord et al. Stochastic exploration and active learning for image retrieval. Image Vision Comput. (2007)
R. da S. Torres et al. A graph-based approach for multiscale shape analysis. Pattern Recognition (2004)
E. de Ves et al. A novel bayesian framework for relevance feedback in image content-based retrieval systems. Pattern Recognition (2006)
N. Doulamis et al. Evaluation of relevance feedback schemes in content-based in retrieval systems. Signal Process. Image Comm. (2006)
L. Duan et al. Adaptive relevance feedback based on Bayesian inference for image retrieval. Signal Process. (2005)
W. Fan et al. A generic ranking function discovery framework by genetic programming for information retrieval. Inform. Process. Manage. (2004)
T. León et al. Applying logistic regression to relevance feedback in image retrieval systems. Pattern Recognition (2007)
Y. Liu et al. A survey of content-based image retrieval with high-level semantics. Pattern Recognition (2007)
R. Liu et al. SVM-based active feedback in image retrieval using clustering and unlabeled data. Pattern Recognition (2008)
R. Min et al. Effective image retrieval using dominant color descriptor and fuzzy support vector machine. Pattern Recognition (2009)
Z. Stejić et al. Genetic algorithm-based relevance feedback for image retrieval using local similarity patterns. Inform. Process. Manage. (2003)
M. Unser et al. A family of polynomial spline wavelet transforms. Signal Process. (1993)
M. Arevalillo-Herráez et al. A relevance feedback CBIR algorithm based on fuzzy sets. Signal Process. Image Comm. (2008)
T. Bäck et al. Evolutionary Computation 1: Basic Algorithms and Operators (2002)
I.J. Cox et al. The Bayesian image retrieval system, PicHunter: Theory, implementation, and psychophysical experiments. IEEE Trans. Image Process. (2000)
R. da S. Torres et al. Content-based image retrieval: Theory and applications. Revista de Informática Teórica e Aplicada (2006)
R. da S. Torres et al. Contour salience descriptors for effective image retrieval and analysis. Image Vision Comput. (2007)
R. da S. Torres et al. A genetic programming framework for content-based image retrieval. Pattern Recognition (2009)

