# Predicting water quality through daily concentration of dissolved oxygen using improved artificial intelligence – Scientific Reports

This paper offers four novel models for DO prediction. The models consist of an MLP neural network as the core, with the TLBO, SCA, WCA, and EFO as the training algorithms. All models were developed and implemented in the MATLAB 2017 environment.

### Optimization and training

Proper training of the MLP depends on the strategy employed by the algorithm appointed for this task (as described in previous sections for the TLBO, SCA, WCA, and EFO). This section discusses this characteristic through the results of hybridizing the MLP.

An MLPNN is considered the basis of the hybrid models. As per Section “The MLPNN”, this model has three layers. The input layer receives the data and has 3 neurons, one for each of WT, pH, and SC. The output layer has one neuron for releasing the final prediction (i.e., DO). However, the hidden layer can have various numbers of neurons. In this study, a trial-and-error effort was carried out to determine the most suitable number. Ten models were tested with 1, 2, …, and 10 neurons in the hidden layer, and 6 neurons gave the best performance. Hence, the final model is structured as 3 × 6 × 1. With the same logic, the activation functions of the output and hidden neurons were selected as Purelin (*y* = *x*) and Tansig (described in Section “Formula presentation”), respectively ^{83}.
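This architecture search can be sketched in a few lines of Python (the paper used MATLAB; the function names, the random stand-in data, and the loop shown here are illustrative assumptions, not the authors' code):

```python
import numpy as np

def tansig(x):
    # Hyperbolic tangent sigmoid, 2 / (1 + exp(-2x)) - 1, equivalent to tanh(x)
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def mlp_forward(X, W_hidden, b_hidden, w_out, b_out):
    """Forward pass of a 3 x n_hidden x 1 MLP: Tansig hidden layer, Purelin output."""
    H = tansig(X @ W_hidden.T + b_hidden)   # (n_samples, n_hidden)
    return H @ w_out + b_out                # Purelin: identity on the output neuron

# Hypothetical trial-and-error over hidden-layer sizes 1..10, as in the study
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))               # stand-in for [WT, pH, SC] samples
for n_hidden in range(1, 11):
    W = rng.normal(size=(n_hidden, 3))
    b = rng.normal(size=n_hidden)
    w_out = rng.normal(size=n_hidden)
    y_hat = mlp_forward(X, W, b, w_out, 0.0)
    # ...evaluate RMSE for this size; the paper found n_hidden = 6 best
```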

Next, the training dataset was exposed to the selected MLPNN. The relationship between the DO and water conditions is established by means of weights and biases within the MLPNN (Fig. 4). In this study, the task of tuning these weights and biases is assigned to the named metaheuristic algorithms. For this purpose, the MLPNN configuration is first transformed into mathematical equations with adjustable weights and biases (the equations are shown in Section “Formula presentation”). Training the MLPNN using metaheuristic algorithms is an iterative effort. Hereupon, the RMSE between the modeled and measured DOs is introduced as the objective function of the TLBO, SCA, WCA, and EFO. This function is used to monitor the optimization behavior of the algorithms. Since the RMSE is an error indicator, the algorithms aim to minimize it over time to improve the quality of the weights and biases. Designating the appropriate number of iterations is another important step. By analyzing the convergence behavior of the algorithms, as well as referring to previous similar studies, 1000 iterations were set for the TLBO, SCA, and WCA, while the EFO was implemented with 30,000 iterations. The final solution is used to construct the optimized MLPNN. Figure 5 illustrates the optimization flowchart.
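The encoding of the network into a flat solution vector, together with the RMSE objective evaluated by the optimizers, can be sketched as follows (a minimal Python sketch; `unpack` and `rmse_objective` are hypothetical names, and the 31-variable layout assumes the 3 × 6 × 1 architecture described above):

```python
import numpy as np

def unpack(solution, n_in=3, n_hidden=6):
    """Split a flat solution vector into the MLP's weights and biases.
    For a 3 x 6 x 1 network: 18 hidden weights + 6 hidden biases
    + 6 output weights + 1 output bias = 31 variables."""
    i = 0
    W = solution[i:i + n_hidden * n_in].reshape(n_hidden, n_in); i += n_hidden * n_in
    b = solution[i:i + n_hidden]; i += n_hidden
    w_out = solution[i:i + n_hidden]; i += n_hidden
    b_out = solution[i]
    return W, b, w_out, b_out

def rmse_objective(solution, X, y):
    """Objective minimized by TLBO/SCA/WCA/EFO: RMSE between modeled and measured DO."""
    W, b, w_out, b_out = unpack(solution)
    H = np.tanh(X @ W.T + b)        # Tansig hidden layer
    y_hat = H @ w_out + b_out       # Purelin output
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))
```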

Furthermore, each algorithm was implemented with nine swarm sizes (*N*_{SW}s) to achieve the best model configuration. The tested *N*_{SW}s were 10, 25, 50, 75, 100, 200, 300, 400, and 500 for the TLBO, SCA, and WCA, and 25, 30, 50, 75, 100, 200, 300, 400, and 500 for the EFO^{84}. Collecting the obtained objective functions (i.e., the RMSEs) produced a convergence curve for each tested *N*_{SW}. Figure 6 depicts the convergence curves of the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN.
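A minimal sketch of such a swarm-size sweep follows. A random search stands in for the four optimizers, and the sphere function stands in for the MLP's RMSE; both `random_search` and the toy objective are illustrative assumptions, not the paper's method:

```python
import numpy as np

def random_search(objective, n_sw, n_iter, dim, seed=0):
    """Stand-in optimizer (random search) used only to illustrate the sweep;
    in the paper the TLBO/SCA/WCA/EFO play this role. Returns the running
    best objective per iteration, i.e., a convergence curve."""
    rng = np.random.default_rng(seed)
    best, curve = np.inf, []
    for _ in range(n_iter):
        pop = rng.uniform(-1.0, 1.0, size=(n_sw, dim))   # one candidate per agent
        best = min(best, min(objective(p) for p in pop))
        curve.append(best)
    return curve

# Sphere function as a toy objective; the study minimized the MLP's RMSE instead
sphere = lambda p: float(np.sum(p ** 2))

swarm_sizes = [10, 25, 50, 75, 100, 200, 300, 400, 500]  # values tested for TLBO/SCA/WCA
curves = {n: random_search(sphere, n, n_iter=50, dim=31) for n in swarm_sizes}
best_n_sw = min(curves, key=lambda n: curves[n][-1])     # lowest final objective wins
```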

As seen, each algorithm follows a different strategy for training the MLPNN. According to the above charts, the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN attained their lowest RMSEs with respective *N*_{SW}s of 500, 400, 400, and 50. This means that, for each model, the MLPNNs trained with these configurations acquired more promising weights and biases than with the eight other *N*_{SW}s. Table 2 collects the final parameters of each model.

### Training and testing results

The RMSE of the recognized elite models (i.e., the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN with the *N*_{SW}s of 500, 400, 400, and 50) was 1.3231, 1.4269, 1.3043, and 1.3210, respectively. These values plus the MAEs of 0.9800, 1.1113, 0.9624, and 0.9783, and the NSEs of 0.7730, 0.7359, 0.7794, and 0.7737 indicate that the MLP has been suitably trained by the proposed algorithms. In order to graphically assess the quality of the results, Fig. 7a,c,e, and g are generated to show the agreement between the modeled and measured DOs. The calculated R_{P}s (i.e., 0.8792, 0.8637, 0.8828, and 0.8796) demonstrate a large degree of agreement for all used models. Moreover, the outcome of \({DO}_{{i}_{expected }}- {DO}_{{i}_{predicted}}\) is referred to as “error” for every sample, and the frequency of these values is illustrated in Fig. 7b,d,f, and h. These charts show larger frequencies for the error values close to 0; meaning that accurately predicted DOs outnumber those with considerable errors.
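For reference, the four accuracy indicators reported throughout this section can be computed with their standard definitions (a Python sketch; the function names are illustrative):

```python
import numpy as np

def rmse(obs, pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def mae(obs, pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(obs - pred)))

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    return float(1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - np.mean(obs)) ** 2))

def r_p(obs, pred):
    """Pearson correlation coefficient (R_P)."""
    return float(np.corrcoef(obs, pred)[0, 1])
```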

Evaluating the testing accuracies revealed the high competency of all used models in predicting the DO for new values of WT, pH, and SC. In other words, the models successfully generalized the DO pattern captured from the 2014–2018 data to the data of the fifth year. For example, Fig. 8 shows the modeled and measured DOs for two different periods: (a) October 01, 2018 to December 01, 2018 and (b) January 01, 2019 to March 01, 2019. For the first period, the upward DO patterns were well followed by all four models. For the second period, the models showed high sensitivity to the fluctuations in the DO pattern.

Figure 9a,c,e, and g show the errors obtained for the testing data. The RMSE and MAE of the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN were 1.2980 and 0.9728, 1.4493 and 1.2078, 1.3096 and 0.9915, and 1.2903 and 1.0002, respectively. These values, along with the NSEs of 0.7668, 0.7092, 0.7626, and 0.7695, imply that the models have predicted unseen DOs with a tolerable level of error. Moreover, Fig. 9b,d,f, and h present the corresponding scatterplots illustrating the correlation between the modeled and measured DOs in the testing phase. Based on the R_{p} values of 0.8785, 0.8587, 0.8762, and 0.8815, a very satisfying correlation can be seen for all used models.

### Efficiency comparison and discussion

To compare the efficiency of the employed models, the most accurate model is first determined by comparing the obtained accuracy indicators; then, a comparison of the optimization times is carried out. Table 3 collects all accuracy criteria calculated in this study.

In terms of all accuracy criteria (i.e., RMSE, MAE, R_{P}, and NSE), the WCA-MLPNN emerged as the most reliable model in the training phase. In other words, the WCA presented the highest-quality training of the MLP, followed by the EFO, TLBO, and SCA. However, the results of the testing data need more discussion. In this phase, while the EFO-MLPNN achieved the smallest RMSE (1.2903), the largest R_{P} (0.8815), and the largest NSE (0.7695) at the same time, the smallest MAE (0.9728) was obtained by the TLBO-MLPNN. As for the SCA-based ensemble, it yielded the poorest predictions in both phases.

Additionally, Figs. 10 and 11 compare the accuracy of the models in the form of boxplots and a Taylor diagram, respectively. The results of these two figures are consistent with the above comparison. They indicate the high accordance between the models’ outputs and the target DOs, and also reflect the higher accuracy of the WCA-MLPNN, EFO-MLPNN, and TLBO-MLPNN compared to the SCA-MLPNN.

In comparison with some previous literature, it can be said that our models attained a higher accuracy of DO prediction. For instance, in the study by Yang et al.^{85}, three metaheuristic algorithms, namely the multi-verse optimizer (MVO), shuffled complex evolution (SCE), and black hole algorithm (BHA), were combined with an MLPNN and applied to the same case study (Klamath River Station). The best training performance was achieved by the MLP-MVO (with respective RMSE, MAE, and R_{P} of 1.3148, 0.9687, and 0.8808), while the best testing performance was achieved by the MLP-SCE (with respective RMSE, MAE, and R_{P} of 1.3085, 1.0122, and 0.8775). As per Table 3, it can be inferred that the WCA-MLPNN suggested in this study provides better training results. As far as the testing results are concerned, both the WCA-MLPNN and TLBO-MLPNN outperformed all models tested by Yang et al.^{85}. In another study by Kisi et al.^{42}, an ensemble model called BMA was suggested for the same case study, and it achieved training and testing RMSEs of 1.334 and 1.321, respectively (see Table 5 of the cited paper). These error values are higher than the RMSEs of the TLBO-MLPNN, WCA-MLPNN, and EFO-MLPNN in this study. Consequently, these models outperform the conventional benchmark models that were tested by Kisi et al.^{42} (i.e., ELM, CART, ANN, MLR, and ANFIS). With the same logic, the superiority of the suggested hybrid models over some conventional models employed in previous studies^{49,65} for different stations on the Klamath River can be inferred. Altogether, these comparisons indicate that this study has achieved considerable improvements in the field of DO prediction.

Table 4 presents the times elapsed in optimizing the MLP by each algorithm. According to this table, the EFO-MLPNN, despite requiring a greater number of iterations (i.e., 30,000 for the EFO vs. 1000 for the TLBO, SCA, and WCA), accomplishes the optimization in a considerably shorter time. Specifically, the times for the TLBO, SCA, and WCA range within [181.3, 12,649.6] s, [88.7, 6095.2] s, and [83.2, 4804.0] s, while those of the EFO were bounded between 277.2 and 296.0 s. Another difference between the EFO and the other algorithms concerns the two smallest *N*_{SW}s: since an *N*_{SW} of 10 was not viable for implementing the EFO, values of 25 and 30 were considered instead.

Based on the above discussion, the TLBO, WCA, and EFO showed higher capability compared to the SCA. Examining the time of the selected configurations of the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN (i.e., 12,649.6, 5295.7, 4733.0, and 292.6 s for the *N*_{SW}s of 500, 400, 400, and 50, respectively) shows that the WCA needs around 37% of the TLBO’s time to train the MLP. The EFO, however, provides the fastest training.

Apart from these comparisons, the successful prediction carried out by all four hybrid models demonstrates the compatibility of the MLPNN with metaheuristic optimization for creating predictive ensembles. The employed algorithms could nicely optimize the relationship between the DO and water conditions (i.e., WT, pH, and SC) at the Klamath River Station. The basic model was a 3 × 6 × 1 MLPNN containing 24 weights and 7 biases (Fig. 4). Therefore, each algorithm provided a solution composed of 31 variables in each iteration. Considering the number of tested *N*_{SW}s and iterations for each algorithm (i.e., 30,000 iterations of the EFO and 1000 iterations of the WCA, SCA, and TLBO, all with nine *N*_{SW}s), the outstanding solution (belonging to the EFO algorithm) was selected from a large number of candidates (= 1 × 30,000 × 9 + 3 × 1000 × 9).
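The parameter count and the size of the candidate pool follow directly from the architecture; a quick arithmetic check:

```python
# Parameter count of the 3 x 6 x 1 MLPNN
n_in, n_hidden, n_out = 3, 6, 1
n_weights = n_in * n_hidden + n_hidden * n_out   # 18 hidden + 6 output = 24 weights
n_biases = n_hidden + n_out                      # 6 hidden + 1 output = 7 biases
dim = n_weights + n_biases                       # 31 variables per candidate solution

# Total optimization runs across all configurations:
# 1 algorithm (EFO) x 30,000 iterations x 9 swarm sizes
# + 3 algorithms (WCA, SCA, TLBO) x 1000 iterations x 9 swarm sizes
candidates = 1 * 30_000 * 9 + 3 * 1_000 * 9
```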

However, concerning the limitations of this work in terms of data and methodology, potential ideas can be raised for future studies. First, it is suggested to update the applied models with the most recent hydrological data, as well as the records of other water quality stations, in order to enhance the generalizability of the models. Moreover, further metaheuristic algorithms can be tested in combination with different basic models such as ANFIS and SVM to conduct comparative studies.

### Formula presentation

The higher efficiency of the WCA and EFO (in terms of both time and accuracy) was established in the previous section. Hereupon, the MLPNNs constructed from the optimal responses of these two algorithms are presented mathematically in this section to give two formulas for predicting the DO. Referring to Fig. 4, the calculations of the output neuron in the WCA-MLPNN and EFO-MLPNN are expressed by Eqs. (5) and (6), respectively.

$$ \begin{aligned} DO_{WCA-MLPNN} & = 0.395328 \times O_{HN1} + 0.193182 \times O_{HN2} - 0.419852 \times O_{HN3} + 0.108298 \times O_{HN4} \\ & \quad + 0.686191 \times O_{HN5} + 0.801148 \times O_{HN6} + 0.340617 \\ \end{aligned} $$

(5)

$$ \begin{aligned} DO_{EFO-MLPNN} & = 0.033882 \times {O_{HN1}}^{\prime} - 0.737699 \times {O_{HN2}}^{\prime} - 0.028107 \times {O_{HN3}}^{\prime} - 0.700302 \times {O_{HN4}}^{\prime} \\ & \quad + 0.955481 \times {O_{HN5}}^{\prime} - 0.757153 \times {O_{HN6}}^{\prime} + 0.935491 \\ \end{aligned} $$

(6)

In the above relationships, \({O}_{HNi}\) and \({{O}_{HNi}}^{\prime}\) represent the outcome of the *i*^{th} hidden neuron in the WCA-MLPNN and EFO-MLPNN, respectively. Given \(Tansig\left(x\right)=\frac{2}{1+{e}^{-2x}}-1\) as the activation function of the hidden neurons, \({O}_{HNi}\) and \({{O}_{HNi}}^{\prime}\) are calculated by the equations below. As is seen, these parameters are calculated from the inputs of the study, i.e., WT, pH, and SC.

$$ \left[ {\begin{array}{*{20}c} {O_{HN1 } } \\ {O_{HN2 } } \\ {O_{HN3 } } \\ {O_{HN4 } } \\ {O_{HN5 } } \\ {O_{HN6 } } \\ \end{array} } \right] = Tansig\left( {\left( {\left[ {\begin{array}{*{20}c} { – 1.818573} & {1.750088} & { – 0.319002} \\ {0.974577} & {0.397608} & { – 2.316006} \\ { – 1.722125} & { – 1.012571} & {1.575044} \\ {0.000789} & { – 2.532009} & { – 0.246384} \\ { – 1.288887} & { – 1.724770} & {1.354887} \\ {0.735724} & { – 2.250890} & {0.929506} \\ \end{array} } \right] \left[ {\begin{array}{*{20}c} {WT} \\ {pH} \\ {SC} \\ \end{array} } \right] } \right) + \left[ {\begin{array}{*{20}c} {2.543969} \\ { – 1.526381} \\ {0.508794} \\ {0.508794} \\ { – 1.526381} \\ {2.543969} \\ \end{array} } \right]} \right) $$

(7)

$$ \left[ {\begin{array}{*{20}c} {O_{HN1}{\prime} } \\ {O_{HN2}{\prime} } \\ {O_{HN3}{\prime} } \\ {O_{HN4}{\prime} } \\ {O_{HN5}{\prime} } \\ {O_{HN6}{\prime} } \\ \end{array} } \right] = Tansig\left( {\left( {\left[ {\begin{array}{*{20}c} {1.323143} & { – 2.172674} & { – 0.023590} \\ {1.002364} & {0.785601} & {2.202243} \\ {1.705369} & { – 1.245099} & { – 1.418881} \\ { – 0.033210} & { – 1.681758} & {1.908498} \\ {1.023548} & { – 0.887137} & { – 2.153396} \\ {0.325776} & { – 1.818692} & { – 1.748715} \\ \end{array} } \right] \left[ {\begin{array}{*{20}c} {WT} \\ {pH} \\ {SC} \\ \end{array} } \right] } \right) + \left[ {\begin{array}{*{20}c} { – 2.543969} \\ { – 1.526381} \\ { – 0.508794} \\ { – 0.508794} \\ {1.526381} \\ {2.543969} \\ \end{array} } \right]} \right) $$

(8)

More clearly, the integration of Eqs. (5) and (7) results in the WCA-MLPNN formula, while the integration of Eqs. (6) and (8) results in the EFO-MLPNN formula. Given the excellent accuracy of these two models and their superiority over some previous models in the literature, either formula can be used for practical estimations of the DO, especially for addressing water quality within the Klamath River.
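For convenience, the two-step WCA-MLPNN computation can be combined into a single routine. The sketch below hard-codes the coefficients of Eqs. (5) and (7); note it assumes the inputs WT, pH, and SC are supplied on the same (possibly normalized) scale used when the network was trained, which the formulas themselves do not restate:

```python
import numpy as np

# Hidden-layer weights of the WCA-MLPNN (Eq. 7); columns correspond to WT, pH, SC
W_HIDDEN = np.array([
    [-1.818573,  1.750088, -0.319002],
    [ 0.974577,  0.397608, -2.316006],
    [-1.722125, -1.012571,  1.575044],
    [ 0.000789, -2.532009, -0.246384],
    [-1.288887, -1.724770,  1.354887],
    [ 0.735724, -2.250890,  0.929506],
])
B_HIDDEN = np.array([2.543969, -1.526381, 0.508794, 0.508794, -1.526381, 2.543969])

# Output-layer weights and bias (Eq. 5)
W_OUT = np.array([0.395328, 0.193182, -0.419852, 0.108298, 0.686191, 0.801148])
B_OUT = 0.340617

def tansig(x):
    # Activation of the hidden neurons: 2 / (1 + exp(-2x)) - 1
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def do_wca_mlpnn(wt, ph, sc):
    """Evaluate the combined Eqs. (5) and (7): the WCA-MLPNN DO prediction."""
    o_hn = tansig(W_HIDDEN @ np.array([wt, ph, sc]) + B_HIDDEN)  # hidden outcomes
    return float(W_OUT @ o_hn + B_OUT)                            # Purelin output
```

The EFO-MLPNN formula follows the same pattern with the coefficients of Eqs. (6) and (8).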