Research papers
Wavelet-linear genetic programming: A new approach for modeling monthly streamflow
Introduction
Streamflow is an important factor in stream ecosystems that directly influences many of the physical characteristics of a stream. Depending on the discharge, aquatic life is modified in both its chemical and biological aspects. The streamflow regime is increasingly cited as a master variable that structures aquatic systems and their habitats (Poff and Ward, 1989, Richter et al., 1996). Water commissioners under the office of the state engineer and various other water officials need accurate streamflow predictions for filling reservoirs and delivering supplies with a minimum of waste, in accordance with water right seniority. Predicting streamflow, which is a non-linear and complex phenomenon, is not a simple task. Conventional data processing models such as the autoregressive integrated moving average (ARIMA) and multiple linear regression (MLR) are widely used for hydrological time series modeling (Ladlani et al., 2014). These models, which are essentially linear, assume that the data are stationary and have a limited capacity to capture non-stationarities and non-linearities in hydrologic datasets; hence they are no longer sufficient for solving this kind of problem (Nourani et al., 2007). In recent years, artificial intelligence models such as artificial neural networks (ANN) and linear genetic programming (LGP) have received more attention due to their ability to handle such non-linear and non-stationary data (DanandehMehr et al., 2014b, Zounemat-Kermani et al., 2013). A genetic algorithm (GA) was also applied to optimize the projection direction of the projection pursuit cluster model for karstic water pollution evaluation by Wang et al. (2006) and for the evaluation of a regional partition of water resources in China by Wang and Ni (2008).
In hydrology, ANNs have been successfully applied to hydrological predictions (Atabay et al., 2012), to model daily chlorophyll-a dynamics in a river (Wu et al., 2013) and to predict the water temperature in a river (Piotrowski et al., 2013). The LGP model has been applied to predict monthly streamflow (DanandehMehr et al., 2014b), to predict flow discharge (Zahiri and Azamathulla, 2014), to estimate the longitudinal velocity field in open channel junctions (Zaji and Bonakdari, 2015), to simulate the friction factor in alluvial channels (Roushangar et al., 2014) and to model monthly pan evaporation (Guven and Kisi, 2013). In recent decades, hybrid models have been commonly used in water resources and environmental studies. Examples include the prediction of water quality time series by a hybrid neural network and ARIMA model (Faruk, 2009), spatio-temporal groundwater level forecasting by a hybrid artificial neural network-geostatistics model (Nourani et al., 2010), the simulation of groundwater level variation using wavelets combined with ANN, MLR and support vector machine (Ebrahimi and Rajaee, 2017) and the forecasting of chlorophyll-a by hybrid support vector regression and genetic algorithm (GA-SVR) (Rajaee and Broumand, 2015). Methods for dealing with non-stationary data are not as advanced as those for stationary data, and additional research is required to examine methods that are better able to handle non-stationary datasets. The discrete wavelet transform (DWT) has become a useful method for analyzing variations in time series; it gives information in both the time and frequency domains of a signal and provides considerable insight into the physical form of the data. A number of studies have coupled the wavelet transform with other intelligence models and reported superior accuracy compared with the corresponding single models and regression models for water resources and environmental problems.
The characteristics of these works are presented in Table 1.
Khani and Rajaee (2017) predicted dissolved oxygen (DO) using conjunction wavelet-regression (WR) and wavelet-ANN (WANN) models. In the developed wavelet-based models, the daily observed streamflow (Q) and DO data were decomposed into several sub-time series at different scales by the DWT. The Daubechies wavelets of order 1 and 2 (db1 and db2), sym2 and coif1 were used in their study to decompose the time series. The wavelet sub-series were then supplied as inputs to build the WANN and WR models. Their results indicated that the WANN technique performed better than the WR, MLR and ANN models in predicting DO. Ravansalar et al. (2015) compared the ability of the WANN and simple ANN models to model dissolved oxygen in the River Calder in the UK. For the wavelet model, they used four mother wavelets (Db4, Sym1, Coif1 and Dmey) to decompose the time series. The decompositions were linked to the ANN at levels 1 to 5, and the model with the best efficiency was selected. In their research, the WANN model with the Dmey function at level 3 was found to be significantly superior to the ANN model. In another study, Ravansalar and Rajaee (2015) investigated the ability of the hybrid WANN model and compared it with the single ANN model for predicting the electrical conductivity at a gauging station on the Asi River, Turkey. They applied seven mother wavelets at five decomposition levels (1–5) and linked the results to the ANN to form the WANN model. They found that the WANN model developed with the Dmey function at level 3 was the best among the seven functions, and that the WANN predictions of electrical conductivity were superior to those of the ANN model. DanandehMehr et al. (2014a) investigated the accuracy of the wavelet-LGP (WLGP) model in drought forecasting with 3-, 6- and 12-month lead times in the state of Texas.
In their study, the main time series of the El Niño-Southern Oscillation indicator (NINO) and Palmer’s modified drought index (PMDI) were decomposed into several sub-time series by the continuous wavelet transform (CWT). Significant bands were then determined using the significant variances method in the wavelet spectra and used as inputs to the LGP model for drought forecasting. They compared the model results with those of the LGP, ANN, WANN, fuzzy logic (FL) and wavelet-FL (WFL) models and found that the WLGP model performed better than those models in drought forecasting. In that research, the standalone LGP model did not forecast drought satisfactorily. Nourani et al. (2013) coupled a self-organizing map (SOM), a feed-forward neural network (FFNN) and the wavelet transform for space-time pre-processing of satellite precipitation and runoff data to forecast rainfall-runoff. The results of this combined model were compared with those of the SOM-FFNN and conventional autoregressive integrated moving average with exogenous input (ARIMAX) models. They found that the SOM-WT-FFNN model, which used the wavelet transform to extract dynamic and multi-scale features of the non-stationary runoff time series and to remove noise, provided more accurate forecasts than the other models. In a review paper, Nourani et al. (2014) explained the capability of wavelet theory and of combining it with other artificial intelligence (AI) models.
The aim of this study was to predict monthly streamflow one month ahead using the hybrid WLGP model at two gauging stations on the Beshar River, located in Yasuj, Iran. For this purpose, the DWT was used to generate the wavelet coefficients of several sub-time series. These sub-series were then supplied as inputs to the LGP model to predict streamflow one month ahead.
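As a rough illustration of the decomposition step, the following minimal sketch splits a short series into approximation and detail sub-series with a Haar (db1) DWT written in plain Python. The choice of the Haar wavelet, the two-level depth, the function names and the sample values are all assumptions made for illustration; they are not the wavelet, level or data used in this study.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt_step(signal):
    """One Haar (db1) DWT level: return (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even at every level")
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt_step(approx, detail):
    """Invert one Haar level, interleaving the two recovered samples per pair."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def haar_decompose(signal, levels):
    """Return [A_levels, D_levels, ..., D_1], coarsest sub-series first."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

def haar_reconstruct(coeffs):
    """Rebuild the original series from [A_levels, D_levels, ..., D_1]."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_idwt_step(approx, detail)
    return approx

# Hypothetical monthly flows (m3/s); not data from the Beshar River.
flow = [12.0, 15.0, 30.0, 42.0, 25.0, 14.0, 9.0, 7.0]
coeffs = haar_decompose(flow, levels=2)   # [A2, D2, D1]
restored = haar_reconstruct(coeffs)       # matches flow up to rounding
```

Each element of `coeffs` is one sub-series at a different scale. Note that the coefficient vectors are shorter than the original series; wavelet-hybrid studies often reconstruct each band back to full length before using it as a model input, a step this sketch leaves out.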
The following sections introduce the LGP, ANN, DWT and MLR methods. The new WLGP model combination is then proposed and its application is presented. A subsequent section describes the gauging stations and the statistical analysis of the dataset. Finally, the results are discussed and the conclusions of this work are presented, together with suggestions for similar future studies.
Section snippets
Linear genetic programming
Genetic programming (GP), of which linear genetic programming (LGP) is a variant, was first proposed by Koza (1992). The LGP model automatically solves problems using computers and was inspired by biological evolution. Standard genetic programming is based on a tree structure. Problems are solved by applying the Darwinian principle of reproduction and analogs of naturally occurring genetic operations such as reproduction. The LGP system solves a problem in several steps: 1) Input selection. 2) Functional and terminal sets. 3)
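To illustrate what distinguishes a linear representation from a tree, the following minimal sketch evaluates one hypothetical LGP individual encoded as a sequence of register instructions. The instruction format, the register count, the protected division and the example program are assumptions for illustration only, not the configuration used in this study; the evolutionary search that would generate and select such programs is omitted.

```python
import operator

def protected_div(a, b):
    """Division that returns 1.0 for a near-zero divisor (a common GP convention)."""
    return a / b if abs(b) > 1e-12 else 1.0

# Hypothetical function set; a real study would choose its own.
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": protected_div,
}

def read_operand(operand, registers):
    """An operand is ("r", index) for a register or ("c", value) for a constant."""
    kind, value = operand
    return registers[value] if kind == "r" else value

def run_program(program, inputs, n_registers=4):
    """Execute instructions (dest, op, src1, src2) in order and return register 0."""
    if len(inputs) > n_registers:
        raise ValueError("more inputs than registers")
    # Inputs occupy the first registers; the remaining registers start at zero.
    registers = list(inputs) + [0.0] * (n_registers - len(inputs))
    for dest, op, src1, src2 in program:
        a = read_operand(src1, registers)
        b = read_operand(src2, registers)
        registers[dest] = FUNCTIONS[op](a, b)
    return registers[0]

# A hand-written example individual (not an evolved one).
program = [
    (2, "+", ("r", 0), ("r", 1)),   # r2 = r0 + r1
    (3, "*", ("r", 2), ("c", 0.5)), # r3 = r2 * 0.5
    (0, "-", ("r", 3), ("r", 1)),   # r0 = r3 - r1
]
output = run_program(program, [10.0, 4.0])
```

Because the program is a flat list, genetic operators can act on whole instructions or on single fields of an instruction, which is the practical difference from tree-based GP.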
Gauging Station and statistical analysis
The data used in this study are the monthly streamflow time series at the Pataveh (downstream) and Shahmokhtar (upstream) gauging stations (51° 19′ 59″ E, 30° 51′ 56″ N and 51° 31′ 25″ E, 30° 41′ 31″ N, respectively) on the Beshar River, Boyer Ahmad (Yasuj), Iran, and were taken from the Kohgilooye & Boyerahmad Regional Water Company (KBRW). Because the data are not available online, they were received as a soft copy from KBRW. The regular data were measured daily by KBRW. Monthly
Model performance
In this part of the study, the results of the LGP, WLGP, ANN, WANN and MLR models were investigated; they are presented in Table 4 and Table 5 for the Pataveh and Shahmokhtar stations, respectively. As is evident in Table 4, for the Pataveh Station the third combination, which contains Qt, Q(t−1) and Q(t−2) as input patterns for the proposed WLGP model, gave the best result, with E = 0.877, RMSE = 19.664 m³/s and MAE = 12.308 m³/s. Within the presented WLGP models, all input combinations use the streamflow in
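The three performance criteria quoted above can be sketched as follows. This assumes that E denotes the Nash-Sutcliffe efficiency, which the excerpt does not state explicitly; the function names and the sample values are hypothetical and are not results from this study.

```python
import math

def nash_sutcliffe(observed, predicted):
    """E = 1 - sum((o - p)^2) / sum((o - mean(o))^2); 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def rmse(observed, predicted):
    """Root mean square error, in the units of the data (here m3/s)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error, in the units of the data (here m3/s)."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

# Hypothetical observed and predicted monthly flows (m3/s).
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 39.0]

e_value = nash_sutcliffe(obs, pred)
rmse_value = rmse(obs, pred)
mae_value = mae(obs, pred)
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives some indication of whether a model's error is dominated by a few poorly predicted peak flows.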
Conclusions
In this study, a new hybrid wavelet-LGP model was proposed for streamflow time series prediction at two gauging stations, Pataveh and Shahmokhtar, on the Beshar River in Iran. For this purpose, the discrete wavelet transform, which captures the multi-scale features of a signal, was first used to decompose the time series into several sub-series. These sub-series were then supplied to the LGP model as inputs to predict the streamflow one month ahead. The
References (54)
- et al. Treatment of multi-dimensional data to enhance neural network estimators in regression problems. J. Expert Syst. Appl. (2007)
- et al. Streamflow prediction using linear genetic programming in comparison with a neuro-wavelet technique. J. Hydrol. (2013)
- et al. A gene–wavelet model for long lead time drought forecasting. J. Hydrol. (2014)
- et al. Linear genetic programming application for successive-station monthly streamflow prediction. J. Comput. Geosci. (2014)
- et al. Simulation of groundwater level variations using wavelet combined with neural network, linear regression and support vector machine. Global Planet. Change (2017)
- Back-propagation neural networks for modeling complex systems. J. Artif. Intell. Eng. (1995)
- et al. Monthly pan evaporation modeling using linear genetic programming. J. Hydrol. (2013)
- et al. Analysis of hydraulic characteristics for hollow semi-circular weirs using artificial neural networks. J. Flow Meas. Instrum. (2014)
- et al. Rainfall–runoff relation for karstic spring. Part 2: continuous wavelet and discrete orthogonal multi resolution analyses. J. Hydrol. (2000)
- et al. River flow forecasting through conceptual models part I — a discussion of principles. J. Hydrol. (1970)