Research papers
Prediction of surface water total dissolved solids using hybridized wavelet-multigene genetic programming: New approach
Introduction
River water is one of the essential naturally available fresh surface water resources, supplying multiple human uses such as drinking, irrigation, and industrial production. Monitoring and management of surface water quality (WQ) play an undeniable role in environmental protection and sustainable use of these water resources (Ahmadianfar et al., 2020). Over the past few decades, efforts have been devoted to improving WQ management and sustainability through the precise simulation of the physical, chemical, and biological processes of various pollutants. Total dissolved solids (TDS) is a well-accepted WQ indicator that is used for assessing the suitability of water for drinking and irrigation supply. TDS consists of a variety of inorganic salts (e.g., sodium (Na⁺), magnesium (Mg²⁺), calcium (Ca²⁺), and potassium (K⁺) as cations, as well as chloride (Cl⁻), sulfate (SO₄²⁻), nitrate (NO₃⁻), and bicarbonate (HCO₃⁻) as anions) and dissolved organic matter. According to the standard reported by the World Health Organization (WHO), the acceptable range of TDS for drinking water is 300–600 mg/l (WHO, 2011). The permissible TDS concentration range for agricultural water is 450–2000 mg/l (Ayers and Westcot, 1985).
Laboratory investigations and empirical calculation methods have been reported for TDS quantification (Tiyasha et al., 2020). However, laboratory tests and manual calculations have drawbacks: they are time consuming, prone to unintentional errors, and difficult to generalize. Artificial intelligence (AI) models have shown remarkable progress in modeling the TDS of river water (Banadkooki et al., 2020). The widespread adoption of AI models is due to limitations recognized in the classical mathematical models, in addition to the highly stochastic patterns associated with WQ (Abba et al., 2020). In particular, the classical models can only provide predictions for linear and stationary datasets (Deng et al., 2015). The strength of the AI models lies in their ability to handle the non-linearity and complexity of environmental and hydrological processes, overcoming this drawback of the traditional models (Alizadeh et al., 2018, Das et al., 2020, Gholami et al., 2016, Maier et al., 2014, Naganna and Deka, 2019, Rezaie-Balf et al., 2019, Tiyasha et al., 2020, Wu and Chau, 2013). AI models have been successfully employed to predict a variety of water quality variables such as water quality index (WQI), dissolved oxygen (DO), nitrate (NO3), electrical conductivity (EC), chemical oxygen demand (COD), biochemical oxygen demand (BOD), ammoniacal nitrogen (NH3-N), pH, and sodium adsorption ratio (SAR) (Tiyasha et al., 2020).
The examples of these AI models include: artificial neural network (ANN), support vector machine (SVM), adaptive neuro fuzzy inference system (ANFIS), random forest (RF), decision tree (DT), genetic programming (GP), linear genetic programming (LGP), extreme learning machine (ELM), and gene expression programming (GEP) (Ay and Kisi, 2014, Azad et al., 2017, Emamgholizadeh et al., 2014, Heydari et al., 2013, Olyaie et al., 2017, Sengorur et al., 2015, Sepahvand et al., 2019, Takdastan et al., 2018, Tiwari et al., 2018).
Recently, the capacity of the SVM model was tested to assess different WQ variables in rivers (Mahmoudi et al., 2016); to forecast the Carlson's trophic state index in reservoirs (Chou et al., 2018); and to predict some WQ parameters in the Sefid Rud River basin in Iran (Bozorg-Haddad et al., 2017). The GEP, DT, and LGP were used to forecast TDS levels in the Zarinehroud basin in Iran (Zaman Zad Ghavidel and Montaseri, 2014); and to assess BOD, DO, and COD in the Karoun River in Iran (Najafzadeh et al., 2018).
WQ time series data are highly stochastic and chaotic, so an individual AI based model has limitations for WQ modeling (Yaseen et al., 2018). Integrating time series pre-processing approaches can facilitate decomposition of the time series and improve the predictive performance of the AI models. Among several powerful data pre-processing techniques, the discrete wavelet transform (DWT) has been demonstrated to be a satisfactory approach for decomposing environmental, hydrological, and ecological time series data (Nourani et al., 2014). By providing a time–frequency representation of the analyzed signal, together with information about the physical structure of the input time series, the wavelet transform can lead to accurate predictions, especially when input data are limited (Ghimire et al., 2019). Recently, some researchers investigated the possibility and the advantage of integrating the wavelet transform (WT) approach with AI based models for diverse river WQ simulations. Barzegar et al. (2016, 2017) integrated WT with ELM, ANFIS, and ANN models to predict EC and salinity. Their findings showed that the pre-processing approach improved the prediction accuracy. Several other researchers conducted similar studies on the integration of the WT with AI models and demonstrated its successful implementation for river WQ simulations (Montaseri et al., 2018, Rajaee et al., 2018, Ravansalar et al., 2016b, Ravansalar et al., 2016a, Ravansalar and Rajaee, 2015). These studies indicated that the hybridization of the WT with AI models is a promising new computer-aided approach for environmental modeling. The exploration of new robust and reliable soft computing predictive models is an emerging modeling trend for better watershed management and sustainability.
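The decomposition idea described above can be illustrated with a minimal sketch. It uses the Haar wavelet, chosen here only because it fits in a few lines of pure Python; it is not one of the mother wavelets named in this paper. The function names (`haar_dwt`, `haar_idwt`, `wavedec`) and the eight monthly TDS values are hypothetical and serve only to show how a series is split into one approximation sub-series and several detail sub-series.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the signal from one level of coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def wavedec(signal, level):
    """Multi-level decomposition, returned as [A_level, D_level, ..., D_1]."""
    coeffs = []
    current = list(signal)
    for _ in range(level):
        current, detail = haar_dwt(current)
        coeffs.insert(0, detail)   # coarser details go in front of finer ones
    coeffs.insert(0, current)      # final approximation comes first
    return coeffs

# Hypothetical monthly TDS values (mg/l), for illustration only.
tds_demo = [420.0, 435.0, 510.0, 498.0, 610.0, 640.0, 455.0, 430.0]
sub_series = wavedec(tds_demo, level=2)
print([len(c) for c in sub_series])  # lengths of A2, D2, D1
```

In a hybrid wavelet-AI model of the kind discussed here, sub-series like these would be supplied to the predictive model in place of, or alongside, the raw series.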
In the current research, a new hybrid artificial intelligence model, called wavelet-multigene genetic programming (W-MGGP), is developed for accurately predicting the monthly TDS levels of the Sefid Rud River in Iran. The MGGP model was selected owing to its feasibility in modeling highly non-linear time series (Mehr and Safari, 2020, Mohammad-Azari et al., 2020). In this study, the influence of the discrete wavelet transform is explored in combination with the MGGP and GEP models. The newer variants of GP, including MGGP, have been applied to only a limited number of hydrological and environmental forecasting problems (Danandeh Mehr and Demirel, 2016, Danandeh Mehr and Nourani, 2017), and thus the current research is devoted to TDS prediction.
Section snippets
Multigene genetic programming
GP is an optimization technique based on the principle of Darwin’s theory (Gandomi et al., 2010). The principle of GP is similar to that of the genetic algorithm (GA), and both methods use the same three main operators: crossover, mutation, and selection (Danandeh Mehr et al., 2018). The main difference between the two methods is how solutions are represented. The GA represents solutions as strings of fixed length, while the GP expresses solutions as tree structures of varying sizes (
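The contrast between fixed-length strings and variable-size trees can be made concrete with a small sketch. The nested-tuple tree encoding, the function names, and the variable names `q_t` and `tds_lag1` are hypothetical. The weighted sum of gene outputs plus a bias follows the usual multigene formulation, which is assumed here because the snippet above is truncated before it describes the method.

```python
import operator

# Binary operators allowed at internal tree nodes (a deliberately tiny set).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, variables):
    """Evaluate a tree: (op, left, right) tuple, variable name, or constant."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, variables), evaluate(right, variables))
    if isinstance(tree, str):
        return variables[tree]
    return float(tree)

def tree_size(tree):
    """Number of nodes in a tree; GP individuals may differ in this size."""
    if isinstance(tree, tuple):
        return 1 + tree_size(tree[1]) + tree_size(tree[2])
    return 1

def multigene_predict(genes, weights, bias, variables):
    """Assumed multigene form: bias plus a weighted sum of gene outputs."""
    return bias + sum(w * evaluate(g, variables) for g, w in zip(genes, weights))

# A GA individual: a string (list) of fixed length.
ga_individual = [1, 0, 1, 1, 0, 0, 1, 0]

# Two GP genes of different sizes (hypothetical expressions).
gene_a = ("*", "q_t", 0.5)                       # 3 nodes
gene_b = ("+", ("*", "tds_lag1", 0.5), "q_t")    # 5 nodes

inputs = {"q_t": 10.0, "tds_lag1": 400.0}
print(tree_size(gene_a), tree_size(gene_b))
print(multigene_predict([gene_a, gene_b], [2.0, 1.0], 1.0, inputs))
```

In practice the gene weights and bias would be fitted to training data (commonly by least squares) rather than set by hand as they are here.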
Case study and data analysis
In this study, the data (monthly time scale) are related to the Astane gauging station located on the Sefid Rud River (Longitude 49° 37′ 40″, Latitude 36° 57′ 02″). The observed data are employed to build the AI based models for TDS prediction. The Sefid Rud River has a length of 670 km and a drainage area of 13,450 km². It is the longest river in Northern Iran. In this study, monthly discharge and TDS data were obtained over a 20-year period (1985–2005, 240 months). Fig. 3 displays the
Proposed wavelet-multigene genetic programming method
In this research, the W-MGGP model is developed by combining the DWT and MGGP models. To do so, the datasets of discharge (Q) and TDS are decomposed into several sub-datasets. To decompose a dataset using the wavelet transform, it is very important to choose a mother wavelet and a decomposition level for modeling. According to Nourani et al. (2014), the mother wavelet db4 is the most efficient in producing time localization properties for time series. In addition, the mother wavelets of bior6.8 and dmey are
Results and discussion
The efficiencies of the MGGP, W-MGGP, GEP, and W-GEP models were compared and evaluated. Their statistical performance results for the seven input combinations are shown in Tables 4 and 5. For the MGGP model (Table 4), combination (6), which used the TDS of the first, second, and third successive previous months and the Q of the current, first, and second successive previous months as inputs, yielded a better performance than the other combinations in terms of R (0.396), RMSE (239.718), and
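As a rough illustration of input combination (6) and of two of the reported metrics, the sketch below builds the lagged input rows and computes RMSE and R. The function names and default lag tuples are hypothetical, and R is assumed here to be the Pearson correlation coefficient, which the snippet above does not state explicitly.

```python
import math

def build_lagged_inputs(tds, q, tds_lags=(1, 2, 3), q_lags=(0, 1, 2)):
    """Return (rows, targets): each row holds TDS(t-k) for tds_lags followed by
    Q(t-k) for q_lags, and the matching target is TDS(t)."""
    start = max(max(tds_lags), max(q_lags))
    rows, targets = [], []
    for t in range(start, len(tds)):
        row = [tds[t - k] for k in tds_lags] + [q[t - k] for k in q_lags]
        rows.append(row)
        targets.append(tds[t])
    return rows, targets

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(observed, predicted):
    """Pearson correlation coefficient (assumed meaning of R)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical short series, for illustration only.
tds_series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
q_series = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
rows, targets = build_lagged_inputs(tds_series, q_series)
print(rows[0], targets[0])
```

The first three months are dropped because the third TDS lag is not available for them, so a series of N months yields N - 3 training rows under this combination.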
Conclusions
In the current study, a hybrid wavelet-multigene genetic programming (W-MGGP) model using three mother wavelets (db4, bior6.8, and dmey) was developed to simulate the monthly TDS levels of river water. In particular, the W-MGGP was compared with the W-GEP, MGGP, and GEP models, and their efficiencies and performances were evaluated. The time series of river discharge and TDS over a 20-year period were utilized for the development of the predictive models. Statistical analysis (i.e.,
CRediT authorship contribution statement
Mehdi Jamei: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing. Iman Ahmadianfar: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing. Xuefeng Chu: Conceptualization,
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgement
The authors express their appreciation to the dataset provider, “Guilan regional water company.” In addition, the authors are very thankful to the editors and reviewers for their constructive comments and suggestions.
References (63)
- et al. Evolutionary computational intelligence algorithm coupled with self-tuning predictive model for water quality index determination. J. Hydrol. (2020)
- et al. A wavelet neural network conjunction model for groundwater level forecasting. J. Hydrol. (2011)
- et al. Modelling of chemical oxygen demand by using ANNs, ANFIS and k-means clustering techniques. J. Hydrol. (2014)
- et al. Determining quality of water in reservoir using machine learning. Ecol. Inf. (2018)
- et al. A Pareto-optimal moving average-multigene genetic programming model for rainfall-runoff modelling. Environ. Modell. Software (2017)
- et al. A novel hybrid water quality time series prediction method based on cloud model and fuzzy forecasting. Chemom. Intell. Lab. Syst. (2015)
- et al. Multi-stage genetic programming: a new strategy to nonlinear system modeling. Inf. Sci. (2011)
- et al. An integrated SRM-multi-gene genetic programming approach for prediction of factor of safety of 3-D soil nailed slopes. Eng. Appl. Artif. Intell. (2014)
- et al. Wavelet-based 3-phase hybrid SVR model trained with satellite-derived predictors, particle swarm optimization and maximum overlap discrete wavelet transform for solar radiation prediction. Renewable Sustainable Energy Rev. (2019)
- et al. Precipitation forecasting by using wavelet-support vector machine conjunction model. Eng. Appl. Artif. Intell. (2012)
- Understanding the behaviour and optimising the performance of back-propagation neural networks: an empirical study. Environ. Modell. Software
- Evolutionary algorithms and other metaheuristics in water resources: current status, research challenges and future directions. Environ. Modell. Software
- River flow forecasting through conceptual models part I — a discussion of principles. J. Hydrol.
- Applications of hybrid wavelet–Artificial Intelligence models in hydrology: a review. J. Hydrol.
- A comparative analysis among computational intelligence techniques for dissolved oxygen prediction in Delaware River. Geosci. Front.
- A wavelet-linear genetic programming model for sodium (Na+) concentration forecasting in rivers. J. Hydrol.
- Neuro-fuzzy inference system Prediction of stability indices and Sodium absorption ratio in Lordegan rural drinking water resources in west Iran. Data in Brief
- A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. J. Hydrol.
- Prediction of rainfall time series using modular soft computing methods. Eng. Appl. Artif. Intell.
- A novel hybrid wavelet-locally weighted linear regression (W-LWLR) model for electrical conductivity (EC) prediction in water surface. J. Contam. Hydrol.
- Effect of river flow on the quality of estuarine and coastal waters using machine learning models. Eng. App. Comput. Fluid Mech.
- Water Quality for Agriculture
- Prediction of water quality parameters using ANFIS optimized by intelligence algorithms (Case study: Gorganrood River). KSCE J. Civ. Eng.
- Estimation of total dissolved solids (TDS) using new hybrid machine learning models. J. Hydrol.
- Application of wavelet-artificial intelligence hybrid models for water quality prediction: a case study in Aji-Chay River, Iran. Stochastic Environ. Res. Risk Assess.
- Multi-step water quality forecasting using a boosting ensemble multi-wavelet extreme learning machine model. Stochastic Environ. Res. Risk Assess.
- Modeling water-quality parameters using genetic algorithm-least squares support vector regression and genetic programming. J. Environ. Eng.
- On the calibration of multigene genetic programming to simulate low flows in the Moselle river. Uludağ Univ. J. Faculty Eng.
- Genetic programming in water resources engineering: a state-of-the-art review. J. Hydrol.
- Hybrid wavelet packet machine learning approaches for drought modeling. Environ. Earth Sci.