Solving the ocean color problem using a genetic programming approach

https://doi.org/10.1016/S1568-4946(01)00007-2

Abstract

The ocean color problem consists of estimating the concentrations of ocean components (phytoplankton, sediment and yellow substance) from sunlight reflectance or luminance values at selected wavelengths in the visible band. Interest in this application is growing with the availability of new satellite sensors. Moreover, monitoring phytoplankton concentrations is a key point for a wide set of problems ranging from the greenhouse effect to industrial fishing and the signaling of toxic algae blooms. To our knowledge, this is the first attempt at this regression problem with genetic programming (GP). We show that GP outperforms traditional polynomial fits and rivals artificial neural nets in the case of open ocean waters. We improve on previous work by also solving a range of coastal water types, providing detailed results on estimation errors. To our knowledge, we are the first to publish numerical results regarding coastal waters. Experiments were conducted with a dynamic fitness GP algorithm in order to reduce computing time through a process of progressive learning.

Introduction

One of the advantages of genetic programming (GP) lies in its ability to adapt to many types of problems. Within the image analysis field, it has been successfully applied to several difficult tasks, such as feature extraction, image recognition [1], automatic detection [2], and image discrimination [3]. Most of these works use GP primitives based on statistics of pixel data, like averaging pixel values in an n×n box, or using standard deviations on a pixel grid [4]. All of these works except those of Daida et al. [5], [6] operate on the visible spectrum and do not use multi-spectral data. In this paper, we present the first results of a study dealing with multi-spectral remote sensing data analysis, to solve the ocean color problem. The objective of this application is to evaluate phytoplankton concentration in oceanic and coastal waters, using remote sensing measurements of the reflected sunlight. These measurements are made at selected wavelengths in the visible band, hence the name “ocean color”. To our knowledge, this is the first time that a GP scheme has been applied to the ocean color problem.

We also deal with a more technical matter: speeding up GP execution time. It is well known that most of the CPU time during a GP run is usually spent on fitness evaluation. Thus, depending on the problem, the computation time requirements may quickly become impractical. This fitness evaluation problem has been addressed in several ways, for example by using a staged approach [2], by limiting fitness evaluation to a subpopulation [7], and by stopping fitness evaluation as soon as a given threshold of bad individuals has been reached [8]. We experiment with a GP algorithm in which new cases are gradually added to the set of fitness cases during the run. The underlying idea is that it is very unlikely that individuals fit for all cases will be found quickly. So, to allow a smooth adaptation to an increasingly complex learning situation, our GP variant starts with a subset of the fitness cases; once a given fitness threshold is reached, new fitness cases are added. This process is iterated until all fitness cases have been seen or the maximum number of generations is reached.
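The progressive scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the GP tree search is replaced here by a simple mutate-and-select loop over the two coefficients of a linear model, and the names `progressive_search`, `rms_error`, `threshold` and `pop_size` are hypothetical. Only the schedule (start with one subset of fitness cases, add the next subset once the best individual reaches the threshold, stop at the generation limit) follows the text; stopping once every subset is active and the threshold is met is an added assumption.

```python
import random


def rms_error(coeffs, cases):
    """RMS error of the stand-in model y = a*x + b over (x, y) fitness cases."""
    a, b = coeffs
    return (sum((a * x + b - y) ** 2 for x, y in cases) / len(cases)) ** 0.5


def progressive_search(classes, threshold, max_generations, pop_size=30, seed=0):
    """Grow the active fitness-case set whenever the best error meets the threshold.

    `classes` is a list of fitness-case subsets. The population is a list of
    (a, b) pairs, a stand-in for GP trees (assumption for illustration only).
    """
    rng = random.Random(seed)
    population = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(pop_size)]
    active = list(classes[0])  # start with the first subset only
    next_class = 1
    history = []               # number of active fitness cases at each generation

    for _ in range(max_generations):
        population.sort(key=lambda c: rms_error(c, active))
        history.append(len(active))
        if rms_error(population[0], active) <= threshold:
            if next_class == len(classes):
                break  # assumption: stop once every case is active and fitted
            active.extend(classes[next_class])
            next_class += 1
        # Stand-in variation step: keep the better half, refill with mutants.
        survivors = population[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.choice(survivors)
            children.append((a + rng.gauss(0, 0.1), b + rng.gauss(0, 0.1)))
        population = survivors + children

    best = min(population, key=lambda c: rms_error(c, active))
    return best, history


# Usage: three subsets of three cases each, drawn from y = 2x + 1.
classes = [[(x, 2 * x + 1) for x in range(i, i + 3)] for i in (0, 3, 6)]
best, history = progressive_search(classes, threshold=0.5, max_generations=200)
```

In this sketch the cost of one generation is proportional to the number of active cases, so early generations are cheaper than in a run that evaluates every case from the start.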

Section 2 introduces the ocean color problem. Section 3 briefly introduces evolutionary computation and the general GP method. Section 4 describes the algorithms we use, and presents some non-GP schemes that are currently used or proposed to solve the ocean color problem. Section 5 presents the results and some comparisons with other known methods. In Section 6 we sum up the lessons learned from these experiments and sketch how future work could be directed both towards refining the ocean color application and towards using dynamic fitness cases in other applications.

Section snippets

The ocean color problem

Being able to monitor the evolution of ocean water characteristics is an important challenge. Knowledge of phytoplankton concentration is especially interesting, since it allows quantitative assessment of primary production in ocean and coastal waters. In turn, this primary production plays a key role in the evaluation of the global carbon cycle and is thus of great scientific concern, notably for understanding the so-called greenhouse effect. Phytoplankton is also the base of the marine life

A short tour of evolutionary computation

Evolutionary computation (EC) has been a growing field over the last two decades. Using natural evolution as an inspiration for solving computational problems may sound challenging and utopian at first glance. However, natural evolution can be seen as a method for designing innovative solutions to complex problems: the amazing diversity of living beings is a testimony to this innovative power. In the same way, evolutionary computation heuristics are stochastic schemes, whose search method

Overview

We tried both standard runs of GP and a “progressive” multi-stage variant, inspired by the one that Howard and Roberts proposed in [2]. In these experiments rough algorithms are expected to be developed quickly, and then refined in later stages. To obtain this progressive development, we partition the fitness cases into several subsets of equal size, called “classes”. The details of the class partition are described in the next subsection. In the standard run, all classes are used from the

Comparing standard and progressive GP

For this first experiment, we tried standard GP and our progressive algorithm based on an adjusted fitness threshold. Each algorithm was run 30 times on a dataset that simulates open ocean conditions. The parameter settings are given in Table 3, and Table 4 sums up the results. Relative RMS is shown in the following format: best (mean/S.D.). In this case it appears that the progressive variant gives better results than the standard runs.
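As an illustration of the reporting format, the following sketch computes a relative RMS error and summarizes a set of runs as best (mean/S.D.). The snippet does not give the exact definition used in the paper, so the formula below (RMS of the per-case error divided by the true value) and the use of the sample standard deviation are assumptions, and the function names are hypothetical.

```python
import math
import statistics


def relative_rms(predicted, observed):
    """Relative RMS error, assuming the form sqrt(mean(((p - o) / o)**2))."""
    ratios = [((p - o) / o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(ratios) / len(ratios))


def summarize_runs(errors):
    """Format per-run errors as 'best (mean/S.D.)', lower error being better."""
    best = min(errors)
    mean = statistics.mean(errors)
    sd = statistics.stdev(errors)  # sample standard deviation (assumption)
    return f"{best:.4f} ({mean:.4f}/{sd:.4f})"


# Usage with made-up numbers, not results from the paper.
error = relative_rms([1.1, 2.2], [1.0, 2.0])
summary = summarize_runs([0.1, 0.2, 0.3])
```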

Progressive GP versus traditional methods

In this second experiment, we used the same dataset as

Conclusion

The easy K1 problem has been solved with good precision: GP outperforms traditional polynomial fits and rivals artificial neural nets, and the progressive scheme behaves even better than the “classical” GP scheme. The K2 precision, as expected, is not as good as that for K1, but is still considered very good by physicists. To our knowledge, this is the first study that provides a model solving a range of K2 waters, accompanied by detailed results and estimation errors. Following previous

Acknowledgements

We thank M. Chami and R. Santer, at the LISE laboratory, for providing us with the physical models for generating fitness case sets and for their many helpful hints. We also thank the anonymous referee for his helpful comments.
