Incorporating multiple distance spaces in optimum-path forest classification to improve feedback-based learning

https://doi.org/10.1016/j.cviu.2011.12.001

Abstract

In content-based image retrieval (CBIR) using feedback-based learning, the user marks the relevance of returned images and the system learns how to return more relevant images in the next iteration. In this learning process, image comparison may be based on distinct distance spaces due to multiple visual content representations. This work improves the retrieval process by incorporating multiple distance spaces in a recent method based on optimum-path forest (OPF) classification. For a given training set with relevant and irrelevant images, an optimization algorithm finds the best distance function to compare images as a combination of their distances according to different representations. Two optimization techniques are evaluated: a multi-scale parameter search (MSPS), never used before for CBIR, and a genetic programming (GP) algorithm. The combined distance function is used to design an OPF classifier and to rank images classified as relevant for the next iteration. The ranking process takes into account relevant and irrelevant representatives, previously found by the OPF classifier. Experiments show the advantages in effectiveness of the proposed approach with both optimization techniques over the same approach with a single distance space and over another state-of-the-art method based on multiple distance spaces.

Highlights

► Two feedback-based learning methods based on OPF and multiple distance spaces.
► They solve image retrieval in a few iterations of relevance feedback.
► Considerable gains in effectiveness are demonstrated.

Introduction

Large image collections have increased the demand for efficient and effective information retrieval methods based on the visual content of the images. The simplest visual content representation is a feature vector, which encodes color, texture, and/or shape measures of an image. The similarity between images can be measured by the distance between their feature vectors. More complex representations are very specific to shape, texture, or color, and require special distance functions [1], [2], [3], [4], [5]. In this case, color, texture, and shape features are not simple measures that can be concatenated into a single feature vector. Each pair of feature extraction function and distance function must be interpreted as an independent entity, referred to as a simple descriptor. Therefore, the use of multiple simple descriptors creates the problem of dealing with multiple distance spaces. Solutions to this problem find the best combination of distance values from simple descriptors, resulting in a composite descriptor [6].

For a given query image, a content-based image retrieval (CBIR) system can rank the most relevant images based on their distances to the query image. However, a semantic gap usually occurs between this result and the user’s expectation, due to the absence of relevant information in those simple/composite descriptors. In order to reduce the semantic gap, feedback-based learning approaches have been investigated. In these methods, the user usually indicates which images are relevant (or irrelevant) within a small set of returned images and the CBIR system learns how to better rank and return more relevant images in the next iteration. This search process can be repeated until the user is satisfied. The learning process usually focuses on selecting a more effective simple descriptor [7], [8] by changing features and/or distance function, on building a better composite descriptor [9], [10], on changing the query strategy [11], [12], or on training a more accurate pattern classifier [13], [14], [15], [16].

In this paper we propose a method that simultaneously improves the composite descriptor, the query strategy, and the pattern classifier during the feedback-based learning process. Our method extends our previous work [17], [16] by incorporating optimization techniques to compute composite descriptors. Two techniques are evaluated in this context: a multi-scale parameter search (MSPS) [18], never used for CBIR, and a genetic programming (GP) algorithm [6]. The MSPS method is a recent optimization approach under development. This paper presents a slightly different implementation of the algorithm in [18], which was better suited for distance combination. The choice of GP is also motivated by successful recent works [9], [10], [19]. Santos et al. [9], [19], for example, showed that the GP-based method improves the retrieval effectiveness, outperforming methods based on genetic algorithms.

Support vector machines (SVMs) have become a popular classifier in feedback-based learning [13], [14], [20], [15]. Our method is based on the optimum-path forest (OPF) classifier [21], because we have recently demonstrated its gain in effectiveness and efficiency over SVMs for CBIR [17], [16]. The OPF classifier uses the Image Foresting Transform algorithm [22], which is a generalization of Dijkstra’s algorithm for multiple sources and more general path-cost functions in a graph. Its training set is interpreted as a complete graph weighted on the arcs by the combined distance values between nodes. Therefore, the composite descriptor should produce higher arc weights between relevant and irrelevant images, and lower arc weights within each class. The path-cost function assigns to any path its maximum arc weight, but the paths are constrained to start in a special set of representative images (called prototypes) from both classes. The prototypes compete among themselves and each prototype conquers its most closely connected images by paths of minimum cost, partitioning the graph into an optimum-path forest (classifier). The classification of database images based on OPF follows the same rule, by considering minimum-cost paths from the prototypes to the new image and assigning it to the class of the most closely connected prototype. The choice of prototypes is a key aspect in the OPF approach. First, they must “defend” their classes in order to avoid paths from prototypes of distinct classes. Since the method chooses paths whose maximum arc weight is minimum, the best prototypes will be the closest images between the relevant and irrelevant classes [21]. Second, the prototypes are used as new query points, changing the ranking process to sort images based on a normalized average distance to the relevant and irrelevant prototypes.
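The training and classification rules described above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation of [21] or [17]: the function names are hypothetical, the distance matrix is assumed to be given, both classes are assumed to be present, and prototypes are assumed to be the endpoints of minimum-spanning-tree arcs that join the two classes, which is one common way of obtaining the closest images between classes.

```python
import heapq


def mst_prototypes(dist, labels):
    """Endpoints of MST arcs linking different classes (assumed prototype rule)."""
    n = len(labels)
    in_tree = [False] * n
    best = [float("inf")] * n
    parent = [-1] * n
    best[0] = 0.0
    prototypes = set()
    for _ in range(n):
        # Prim's algorithm on the complete graph: take the cheapest outside node.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0 and labels[parent[u]] != labels[u]:
            prototypes.update((u, parent[u]))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u
    return sorted(prototypes)


def train_opf(dist, labels):
    """Dijkstra-like propagation from the prototypes; path cost = maximum arc weight."""
    n = len(labels)
    cost = [float("inf")] * n
    out = list(labels)
    heap = [(0.0, p) for p in mst_prototypes(dist, labels)]
    for _, p in heap:
        cost[p] = 0.0
    heapq.heapify(heap)
    done = [False] * n
    while heap:
        c, u = heapq.heappop(heap)
        if done[u]:
            continue  # stale queue entry
        done[u] = True
        for v in range(n):
            new_cost = max(c, dist[u][v])
            if not done[v] and new_cost < cost[v]:
                cost[v], out[v] = new_cost, out[u]  # u conquers v
                heapq.heappush(heap, (new_cost, v))
    return cost, out


def classify(dist_to_train, cost, train_labels):
    """Label of the training node that offers the cheapest extended path."""
    path_costs = [max(c, d) for c, d in zip(cost, dist_to_train)]
    best = min(range(len(path_costs)), key=path_costs.__getitem__)
    return train_labels[best]


# Toy 1-D example: label 1 = relevant, label 0 = irrelevant.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
labels = [1, 1, 1, 0, 0, 0]
dist = [[abs(a - b) for b in points] for a in points]
cost, train_labels = train_opf(dist, labels)
print(classify([abs(3.0 - p) for p in points], cost, train_labels))
```

In this toy example the two prototypes are the points 2.0 and 10.0, the closest pair across the classes, and each one conquers the rest of its own cluster. With a composite descriptor, only the way `dist` is computed would change.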

This paper is organized as follows. Section 2 provides a survey of related works and Section 3 reviews the original CBIR approach based on relevance feedback and OPF classification [17]. The method is initially described for a simple descriptor and then the optimization techniques that integrate multiple distance spaces are presented in Section 4 for composite descriptors. Exhaustive experiments involving several datasets, two state-of-the-art approaches [17], [10], various simple descriptors, and the composite descriptors from the MSPS and the GP techniques are presented in Section 5. Finally, Section 6 states the conclusions and discusses future work.

Related work

Over the past decades, several software systems for content-based image retrieval (CBIR) have been developed, such as IBM QBIC [23], UIUC MARS [24], PicHunter [25], TinEye [26], and Windsurf [27]. However, CBIR is still an open problem, as demonstrated by the ImageCLEF and PASCAL VOC challenges.

The research on CBIR initially received great influence from text retrieval [28], taking

Relevance feedback with optimum-path forest

Fig. 1 shows the overview of our approach for feedback-based learning using the OPF classifier and a simple descriptor $D = (v, d)$, where $v$ is the feature extraction function and $d(s, t)$ is the distance function between two image representations (e.g., $d(s, t) = \lVert v(t) - v(s) \rVert$) [60]. Its extension to composite descriptors is explained in Section 4. At the first iteration, we have a simple search by similarity where, for a given image database $Z$, the user specifies a query image $q$ and the system simply ranks

Extension to multiple descriptors

Now consider a set $\mathcal{D} = \{D_1, D_2, \ldots, D_n\}$ of simple descriptors such that a distance $d_i(s, t)$, $i = 1, 2, \ldots, n$, between any given pair of images $s, t \in Z$ can be computed as a function of their respective representations $v_i(s)$ and $v_i(t)$.
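As a rough illustration of this setting, the sketch below models each simple descriptor as a pair of functions and merges the per-descriptor distances into a single value. The weighted sum, the `SimpleDescriptor` type, the toy "images" (lists of numbers), and the two toy feature extractors are all assumptions made for illustration; the combination function found by an optimization technique may take a different form.

```python
from typing import Callable, NamedTuple, Sequence


class SimpleDescriptor(NamedTuple):
    """Hypothetical container for a pair (v_i, d_i)."""
    extract: Callable[[Sequence[float]], float]  # v_i: image -> representation
    distance: Callable[[float, float], float]    # d_i: distance between representations


def composite_distance(descriptors, weights, s, t):
    """Weighted sum of d_i(v_i(s), v_i(t)); the weighted sum is an assumed form."""
    return sum(
        w * d.distance(d.extract(s), d.extract(t))
        for w, d in zip(weights, descriptors)
    )


def absolute_difference(a, b):
    return abs(a - b)


# Two toy descriptors over "images" given as lists of pixel intensities.
mean_descriptor = SimpleDescriptor(lambda img: sum(img) / len(img), absolute_difference)
range_descriptor = SimpleDescriptor(lambda img: max(img) - min(img), absolute_difference)

s, t = [0.0, 2.0], [1.0, 5.0]
combined = composite_distance([mean_descriptor, range_descriptor], [0.5, 2.0], s, t)
print(combined)
```

Each descriptor keeps its own distance space; only the scalar distance values are combined, so representations of different kinds never need to be concatenated.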

We wish to find the best combination of simple descriptors in order to retrieve relevant images from Z with respect to a given query image and user. This combination is found through a composite descriptor [60] that integrates the respective distance spaces. That is, a composite

Experiments and results

We call OPFMSPS and OPFGP the feedback-based learning process using OPF classification and the optimization techniques MSPS and GP, respectively.

Table 1 shows the GP parameter values [64] used in the OPFGP feedback-based learning method. Table 2 presents the population size and the number of generations of GP in OPFGP for each dataset.

For each image database, we simulate the user behavior by using each image as initial query point and marking the relevant points (images from the

Conclusion

We extended a recent feedback-based learning approach for CBIR using OPF classification (OPFRF) to handle composite descriptors. The new methods, OPFMSPS and OPFGP, use the optimization techniques multi-scale parameter search (MSPS) and genetic programming (GP), respectively, to find the best combination function for a given set of simple descriptors at each iteration of relevance feedback.

Recent works [10], [16] compare the baseline methods with the most effective approaches according to the

Acknowledgments

The authors thank CAPES, CNPq (grants 140968/2007-5, 481556/2009-5, 303673/2010-9, and 306587/2009-2), and FAPESP (grants 2007/52015-0, 2008/57428-4, 2008/58528-2, and 2009/18438-7) for financial support.

References (75)

  • I. Laptev et al.

    Local velocity-adapted motion events for spatio-temporal recognition

    Computer Vision Image Understanding

    (2007)
  • B. Tao et al.

    Texture recognition and image retrieval using gradient indexing

    J. Visual Commun. Image Represent.

    (2000)
  • M. Arevalillo-Herráez et al.

    A naive relevance feedback model for content-based image retrieval using multiple similarity measures

    Pattern Recog.

    (2010)
  • C.-H. Lee et al.

    Ego-similarity measurement for relevance feedback

    Expert Syst. Appl.

    (2010)
  • H. Nezamabadi-pour et al.

    Concept learning by fuzzy k-NN classification and relevance feedback for efficient image retrieval

    Expert Syst. Appl.

    (2009)
  • G. Giacinto et al.

    Bayesian relevance feedback for content-based image retrieval

    Pattern Recog.

    (2004)
  • R. Min et al.

    Effective image retrieval using dominant color descriptor and fuzzy support vector machine

    Pattern Recog.

    (2009)
  • J. Zhang et al.

    Local aggregation function learning based on support vector machines

    Signal Process.

    (2009)
  • Z. Ye et al.

    Incorporating rich features to boost information retrieval performance: a SVM-regression based re-ranking approach

    Expert Syst. Appl.

    (2011)
  • C.-H. Lin et al.

    A smart content-based image retrieval system based on color and texture feature

    Image Vision Computing

    (2009)
  • S.D. MacArthur et al.

    Interactive content-based image retrieval using relevance feedback

    Computer Vision Image Understanding

    (2002)
  • J. Peng

    Multi-class relevance feedback content-based image retrieval

    Computer Vision Image Understanding

    (2003)
  • N. Doulamis et al.

    Evaluation of relevance feedback schemes in content-based in retrieval systems

    Signal Process.: Image Commun.

    (2006)
  • T.-C. Lu et al.

    Color image retrieval technique based on color features and image bitmap

    Inform. Process. Manage.

    (2007)
  • R.O. Stehling, M.A. Nascimento, A.X. Falcão, A compact and efficient image retrieval approach based on border/interior...
  • A. Williams et al.

    Content-based image retrieval using joint correlograms

    Multimedia Tools Appl.

    (2007)
  • J.A. Montoya-Zegarra, N.J. Leite, R. da S.Torres, Rotation-invariant and scale-invariant steerable pyramid...
  • J.A. Santos, C.D. Ferreira, R.S. Torres, A genetic programming approach for relevance feedback in region-based image...
  • Y. Rui, T. Huang, Optimizing learning in image retrieval, in: IEEE Conference on Computer Vision and Pattern...
  • D. Liu et al.

    Fast query point movement techniques for large CBIR systems

    IEEE Trans. Knowl. Data Eng.

    (2009)
  • S. Tong, E. Chang, Support vector machine active learning for image retrieval, in: ACM international conference on...
  • X. Wang, L. Yang, Application of SVM relevance feedback algorithms in image retrieval, in: International Symposium on...
  • A.T. Silva et al.

    Active learning paradigms for CBIR systems based on optimum-path forest classification

    Pattern Recog.

    (2011)
  • A.T. Silva et al.

    A new CBIR approach based on relevance feedback and optimum-path forest classification

    J. WSCG

    (2010)
  • G. Ruppert et al.

    Fast and accurate image registration using the multiscale parametric space and grayscale watershed transform

  • J. Santos et al.

    A relevance feedback method based on genetic programming for classification of remote sensing images

    Inform. Sci.

    (2011)

This paper has been recommended for acceptance by C.-S. Li.