Elsevier

Ecological Informatics

Volume 1, Issue 1, January 2006, Pages 43-53

Discovery of predictive rule sets for chlorophyll-a dynamics in the Nakdong River (Korea) by means of the hybrid evolutionary algorithm HEA

https://doi.org/10.1016/j.ecoinf.2005.08.001

Abstract

This paper presents a hybrid evolutionary algorithm (HEA) that discovers complex rule sets for predicting the concentration of chlorophyll-a (Chl.a) from measured meteorological, hydrological and limnological variables in the hypertrophic Nakdong River. The HEA is designed (1) to evolve the structure of rule sets by using genetic programming and (2) to optimise the random parameters in the rule sets by means of a genetic algorithm. Time series of input–output data from 1995 to 1998, both without time lags and with time lags of up to 7 days, were used for training the HEA. Independent input–output data for 1994 were used for testing. The HEA successfully discovered rule sets for multiple nonlinear relationships between physical and chemical variables and Chl.a, and these rule sets proved to be both predictive for unseen data and explanatory. A comparison of the HEA with recurrent artificial neural networks previously applied to the same data with input–output time lags of 3 days revealed similarly good performance for both methods. The sensitivity analysis of the best-performing predictive rule set revealed relationships between seasons, specific input variables and Chl.a that correspond to some degree with known properties of the Nakdong River. The statistics of numerous random runs of the HEA also made it possible to identify the most relevant input variables without a priori knowledge.

Introduction

It has been demonstrated that ecological time series, which are highly complex and nonlinear, can be successfully unraveled and predicted by artificial neural networks (ANN) and genetic algorithms (e.g. Recknagel et al., 1997, Recknagel et al., 1998, Maier et al., 1998, Stockwell, 1999, Liu and Yao, 1999, Jeong et al., 2001, Whigham and Recknagel, 2001, Recknagel et al., 2002, Jeong et al., 2003a, Jeong et al., 2003b, Lee et al., 2004, Recknagel et al., 2005). Even though ANN are very competitive in classifying or predicting noisy data by minimizing the root mean square error of approximations, they lack an explicit representation. By contrast, Whigham and Recknagel (2001) proposed grammar-based genetic programming to evolve functions and rules, and Bobbin and Recknagel (2003) applied an evolutionary learning algorithm to discover predictive rules for population dynamics in limnological data. Although both approaches made it possible to discover predictive rules for ecological relationships, they had the following limitations: (1) the rules were relatively simple, with attributes associated only with constant parameters rather than with functions that reflect complex relationships between multiple attributes, and (2) the parameters that determine the output values of the rules were generated randomly rather than being optimised simultaneously during the evolution. Whigham and Recknagel (2001) performed hill-climbing mutation to fine-tune the random real numbers, and Bobbin and Recknagel (2003) adopted a self-adapting evolutionary algorithm to modify these parameters. However, both methods fail when the number of parameters increases with the complexity of the rule.

This research aims at rule-based prediction and explanation of chlorophyll-a (Chl.a) dynamics by means of a hybrid evolutionary algorithm (HEA). The HEA evolves the structure of the rule set by using genetic programming and optimises the random parameters in the rule set by using a general genetic algorithm. Rules discovered by the HEA have an IF-THEN-ELSE structure and can embed complex functions synthesised from various predefined arithmetic operators. The maximum tree depth and the rule set size control the complexity of the rule sets.
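As a rough illustration of the two ingredients named above, the following Python sketch represents a single IF-THEN-ELSE rule whose branches are small arithmetic expression trees, and tunes its numeric parameters with a simple genetic algorithm. It is a minimal sketch under stated assumptions, not the authors' implementation (which was written in C): the rule structure is written by hand rather than evolved by genetic programming, the input name `water_temp` and the training data are invented for the example, and the selection, crossover and mutation settings are arbitrary choices.

```python
import operator
import random

# Arithmetic operators allowed inside rule expressions (illustrative subset).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(node, inputs, params):
    """Evaluate an expression tree.

    A node is ("var", name), ("const", index into params),
    or (operator symbol, left subtree, right subtree).
    """
    kind = node[0]
    if kind == "var":
        return inputs[node[1]]
    if kind == "const":
        return params[node[1]]
    left = evaluate(node[1], inputs, params)
    right = evaluate(node[2], inputs, params)
    return OPS[kind](left, right)


def predict(rule, inputs, params):
    """IF if_lhs > if_rhs THEN then-expression ELSE else-expression."""
    lhs = evaluate(rule["if_lhs"], inputs, params)
    rhs = evaluate(rule["if_rhs"], inputs, params)
    branch = rule["then"] if lhs > rhs else rule["else"]
    return evaluate(branch, inputs, params)


def rmse(rule, params, data):
    """Root mean square error of the rule over (inputs, target) pairs."""
    squared = sum((predict(rule, x, params) - y) ** 2 for x, y in data)
    return (squared / len(data)) ** 0.5


def optimise_params(rule, n_params, data, pop_size=30, generations=60, seed=0):
    """Tune the rule's parameters with a small elitist genetic algorithm."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 20.0) for _ in range(n_params)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: rmse(rule, p, data))
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover followed by Gaussian mutation of every gene.
            children.append([rng.choice(pair) + rng.gauss(0.0, 0.3)
                             for pair in zip(a, b)])
        pop = parents + children
    return min(pop, key=lambda p: rmse(rule, p, data))


# Hypothetical rule: IF water_temp > c0 THEN c1 * water_temp ELSE c2.
RULE = {
    "if_lhs": ("var", "water_temp"),
    "if_rhs": ("const", 0),
    "then": ("*", ("const", 1), ("var", "water_temp")),
    "else": ("const", 2),
}

# Synthetic data generated from known parameters, for demonstration only.
TRUE_PARAMS = [15.0, 2.0, 5.0]
DATA = [({"water_temp": float(t)},
         predict(RULE, {"water_temp": float(t)}, TRUE_PARAMS))
        for t in range(5, 26)]

best = optimise_params(RULE, n_params=3, data=DATA)
print("tuned parameters:", best)
print("training RMSE:", rmse(RULE, best, DATA))
```

Keeping the constants in a separate parameter vector, referenced by index from the tree, is what lets a parameter search run on a fixed rule structure without touching the tree itself.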

The results demonstrate that the HEA can discover rule sets that predict unseen data well and represent causal relationships between physical and chemical variables and Chl.a dynamics. Moreover, the statistics of numerous random runs of the HEA made it possible to identify the most relevant input variables without a priori knowledge.
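One plausible way to derive such statistics is to count how often each input variable appears in the best rule set of each run. The Python sketch below does exactly that; it is an assumption about how the run statistics could be summarised, not a description of the procedure used in the paper, and the variable names in the example are hypothetical.

```python
from collections import Counter


def variable_frequency(best_rule_variables):
    """Return the fraction of runs whose best rule set uses each variable.

    best_rule_variables holds one collection of variable names per run.
    A variable is counted at most once per run, and the result is ordered
    from the most to the least frequently used variable.
    """
    n_runs = len(best_rule_variables)
    if n_runs == 0:
        raise ValueError("at least one run is required")
    counts = Counter(name
                     for used in best_rule_variables
                     for name in set(used))
    return {name: count / n_runs for name, count in counts.most_common()}


# Hypothetical variable usage in the best rule sets of four runs.
runs = [
    ["water_temp", "turbidity"],
    ["water_temp"],
    ["water_temp", "ph", "ph"],
    ["turbidity"],
]
frequency = variable_frequency(runs)
print(frequency)
```

Variables that recur across many independent runs are candidates for the most relevant inputs, although a high frequency alone does not establish a causal role.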


Hybrid evolutionary algorithm

Evolutionary algorithms (EA) are adaptive methods which mimic processes of biological evolution, natural selection and genetic variation. They search for suitable representations of a problem solution by means of genetic operators and the principle of “survival of the fittest”. Due to their merits of self-organization, self-learning, intrinsic parallelism and generality, EA have been successfully applied to pattern recognition, economic prediction, optimum control and parallel processing (

The Nakdong River dataset

The Nakdong River basin is situated in the southeastern part of South Korea. South Korea experiences four distinct seasons, and is characterized by heavy rainfall during the monsoon season and several typhoon events. The annual mean precipitation across the river basin is about 1200 mm, but more than 50% of the annual rainfall is concentrated during summer (June–August). The annual mean water temperature at the study site was 13.7 °C. The mean water temperature was 2.2 °C during the coldest

Parameter settings and measures

To examine the effectiveness of the HEA, we applied it to data sets without time delay and to data sets with inputs time-lagged by 1–7 days. For each data set, 100 independent runs were conducted. All experiments were performed on a Hydra supercomputer (IBM eServer 1350 Linux) with a peak speed of 1.2 TFlops, using the programming language C. The parameter settings of the HEA are listed in Table 3.
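The construction of a time-lagged data set can be sketched as follows in Python. The sketch assumes a gap-free daily series and interprets a lag of k days as pairing the inputs measured on day t with the Chl.a value on day t + k; the function name `make_lagged_pairs` and the example values are hypothetical and are not taken from the paper.

```python
def make_lagged_pairs(inputs, target, lag):
    """Pair the inputs of day t with the target of day t + lag.

    inputs: list of per-day input records (one entry per day).
    target: list of per-day target values, same length as inputs.
    lag:    non-negative number of days between input and output.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if len(inputs) != len(target):
        raise ValueError("inputs and target must cover the same days")
    n_pairs = max(len(target) - lag, 0)
    return [(inputs[t], target[t + lag]) for t in range(n_pairs)]


# Ten days of invented measurements, for demonstration only.
daily_inputs = [{"water_temp": float(day)} for day in range(10)]
daily_chla = [float(day * 10) for day in range(10)]

# One data set without delay and one per lag from 1 to 7 days.
datasets = {lag: make_lagged_pairs(daily_inputs, daily_chla, lag)
            for lag in range(0, 8)}
print({lag: len(pairs) for lag, pairs in datasets.items()})
```

Each additional day of lag shortens the data set by one pair, because the last inputs have no matching future target.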

In addition, in order to validate the results of different rule sets not only the training error (fitness) but

Conclusion

A hybrid evolutionary algorithm (HEA) has been developed to discover predictive rule sets in complex ecological data. It has been designed to evolve the structure of rule sets by using genetic programming and to optimise the random parameters in the rule sets by means of a genetic algorithm.

HEA was successfully applied to meteorological, hydrological and limnological time series data of the hypertrophic Nakdong River in order to predict Chl.a. The results have demonstrated that HEA is able to

