CN119673325A - A water quality prediction method and related device based on double decomposition and hybrid model

Info

Publication number: CN119673325A
Application number: CN202411733154.1A
Authority: CN
Inventors: 徐贻霆; 刘兆炬
Original assignee: Individual
Current assignee: Individual
Priority date: 2024-11-28
Filing date: 2024-11-28
Publication date: 2025-03-21

Abstract

The invention discloses a water quality prediction method and related device based on double decomposition and a hybrid model, and relates to the technical field of water quality monitoring. The method comprises: determining data to be processed based on the Pearson correlation coefficient method and preprocessed data; performing complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) on the data to be processed to obtain a plurality of intrinsic mode functions (IMFs); performing k-means clustering on the plurality of IMFs; performing variational mode decomposition (VMD) on the high-frequency mode among the clustered IMFs to obtain a plurality of high-frequency sub-modes; and inputting the plurality of high-frequency sub-modes, the medium-frequency mode and the low-frequency mode into Parallel Transformer-LSTM models respectively to obtain a final prediction result. The invention can improve the accuracy and stability of water quality parameter prediction.

Description

Water quality prediction method and related device based on double decomposition and hybrid model

Technical Field

The invention relates to the technical field of water quality monitoring, and in particular to a water quality prediction method and related device based on double decomposition and a hybrid model.

Background

Water quality monitoring is critical to ensure the safety and health of water resources. Along with the promotion of industrialization and urbanization, the water pollution problem is increasingly serious, and the ecological environment and human health are affected. Accurate prediction and monitoring of water quality parameters such as dissolved oxygen, ammonia nitrogen, total phosphorus and the like are helpful for timely finding and controlling pollution sources, so that effective treatment measures are adopted. Currently, methods for predicting water quality mainly include statistical methods, traditional machine learning methods (such as linear regression, support vector machines, etc.), and deep learning methods (such as LSTM, CNN, etc.). These methods have certain limitations in practical applications:

(1) Statistical methods assume that data obeys a specific distribution, and it is difficult to capture complex nonlinear relations.

(2) Traditional machine learning methods improve prediction accuracy to a certain extent, but they handle the temporal dependencies in time series data poorly.

(3) Deep learning methods such as LSTM are excellent in processing time series data, but there is room for improvement in facing high-dimensional, complex water quality data.

Therefore, how to improve the accuracy and stability of water quality parameter prediction is a technical problem to be solved in the art.

Disclosure of Invention

The invention aims to provide a water quality prediction method and related device based on double decomposition and a hybrid model, which can improve the accuracy and stability of water quality parameter prediction and provide strong technical support for water quality monitoring and treatment.

In order to achieve the above object, the present invention provides the following solutions:

In a first aspect, the present invention provides a water quality prediction method based on double decomposition and a hybrid model, the method comprising:

Obtaining raw water quality time series data, wherein the raw water quality time series data comprise water temperature (WT), pH, dissolved oxygen (DO), permanganate index (CODMn), ammonia nitrogen (NH3-N), total phosphorus (TP), total nitrogen (TN), conductivity (EC) and turbidity (TU).

Preprocessing the raw water quality time series data to obtain preprocessed data, wherein the preprocessing comprises denoising and normalization.

Determining the data to be processed based on the Pearson correlation coefficient method and the preprocessed data, wherein the data to be processed are the index data correlated with the index to be predicted.

Performing complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) on the data to be processed to obtain a plurality of intrinsic mode functions (IMFs).

Performing k-means clustering on the plurality of IMFs to obtain clustered IMFs, wherein the clustered IMFs comprise a high-frequency mode, a medium-frequency mode and a low-frequency mode.

Performing variational mode decomposition (VMD) on the high-frequency mode among the clustered IMFs to obtain a plurality of high-frequency sub-modes.

Inputting the plurality of high-frequency sub-modes, the medium-frequency mode and the low-frequency mode into Parallel Transformer-LSTM models respectively to obtain a final prediction result, wherein the Parallel Transformer-LSTM model is a model built from a Transformer model and an LSTM model.

In a second aspect, the invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the computer program to implement the above water quality prediction method based on double decomposition and a hybrid model.

In a third aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above water quality prediction method based on double decomposition and a hybrid model.

In a fourth aspect, the present invention provides a computer program product comprising a computer program which, when executed by a processor, implements the above water quality prediction method based on double decomposition and a hybrid model.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

The invention provides a water quality prediction method and related device based on double decomposition and a hybrid model. First, raw water quality time series data are obtained, comprising water temperature, pH, dissolved oxygen, permanganate index, ammonia nitrogen, total phosphorus, total nitrogen, conductivity and turbidity. The raw data are preprocessed (denoising and normalization) to obtain preprocessed data, and the data to be processed, namely the index data correlated with the index to be predicted, are determined based on the Pearson correlation coefficient method and the preprocessed data. Second, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is performed on the data to be processed to obtain a plurality of intrinsic mode functions (IMFs), so that non-stationary, nonlinear water quality time series signals are decomposed more accurately. K-means clustering is then performed on the IMFs to obtain clustered IMFs comprising a high-frequency mode, a medium-frequency mode and a low-frequency mode, which reduces the dimensionality of the data while retaining the characteristics of the different frequency bands. Variational mode decomposition (VMD) is then performed on the high-frequency mode among the clustered IMFs to obtain a plurality of high-frequency sub-modes, which further extracts detail features from the signal and strengthens feature extraction for complex water quality data. Finally, the high-frequency sub-modes, the medium-frequency mode and the low-frequency mode are input into Parallel Transformer-LSTM models respectively to obtain a final prediction result, the Parallel Transformer-LSTM model being built from a Transformer model and an LSTM model. The Transformer model captures long-term dependencies through a self-attention mechanism and is suited to data spanning long time ranges. The LSTM model handles short-term dependencies through its memory cell and gating mechanism and is suited to data spanning short time ranges. Compared with a serial Transformer-LSTM model, the parallel model can capture long-term and short-term dependencies in the time series data simultaneously, thereby improving prediction performance.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a diagram of an application environment of a water quality prediction method based on double decomposition and a hybrid model according to an embodiment of the present invention.

FIG. 2 is a flow chart of a water quality prediction method based on double decomposition and a hybrid model according to an embodiment of the present invention.

FIG. 3 is a schematic overall flow chart of a water quality prediction method based on double decomposition and a hybrid model according to an embodiment of the present invention.

FIG. 4 is a Pearson correlation heat map of the water quality indexes according to an embodiment of the present invention.

FIG. 5 is a schematic diagram of one part of the CEEMDAN decomposition of the raw total nitrogen data according to an embodiment of the present invention.

FIG. 6 is a schematic diagram of another part of the CEEMDAN decomposition of the raw total nitrogen data according to an embodiment of the present invention.

FIG. 7 is a schematic diagram of the k-means clustering results of the IMF components according to an embodiment of the present invention.

FIG. 8 is a schematic diagram of the three high-frequency components obtained by VMD decomposition, together with the medium-frequency and low-frequency components, according to an embodiment of the present invention.

FIG. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.

To overcome the limitations of existing methods, the invention provides a water quality prediction method based on CEEMDAN, k-means clustering, VMD, a parallel Transformer-LSTM model and random forest (RF). The method combines the advantages of signal decomposition, deep learning and ensemble learning, aims to improve the accuracy and stability of water quality parameter prediction, and provides strong technical support for water quality monitoring and treatment.

The water quality prediction method based on double decomposition and a hybrid model provided by the embodiment of the invention can be applied in the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process; it may be provided separately, integrated on the server 104, or placed on a cloud or other server. The terminal 102 may send the obtained raw water quality time series data to the server 104, the raw water quality time series data including water temperature, pH, dissolved oxygen, permanganate index, ammonia nitrogen, total phosphorus, total nitrogen, conductivity and turbidity. After receiving the raw water quality time series data, the server 104 preprocesses them (denoising and normalization) to obtain preprocessed data; determines the data to be processed, namely the index data correlated with the index to be predicted, based on the Pearson correlation coefficient method and the preprocessed data; performs complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) on the data to be processed to obtain a plurality of intrinsic mode functions (IMFs); performs k-means clustering on the IMFs to obtain clustered IMFs comprising a high-frequency mode, a medium-frequency mode and a low-frequency mode; performs variational mode decomposition (VMD) on the high-frequency mode among the clustered IMFs to obtain a plurality of high-frequency sub-modes; and inputs the high-frequency sub-modes, the medium-frequency mode and the low-frequency mode into Parallel Transformer-LSTM models respectively to obtain a final prediction result, the Parallel Transformer-LSTM model being built from a Transformer model and an LSTM model. The server 104 may feed back the obtained prediction result to the terminal 102. In addition, in some embodiments, the method may be implemented by the server 104 or the terminal 102 alone; for example, the terminal 102 may directly perform water quality prediction on the raw water quality time series data, or the server 104 may obtain the raw water quality time series data from the data storage system and perform water quality prediction on them.

The terminal 102 may be, but is not limited to, a variety of desktop computers, notebook computers, smart phones, and tablet computers. The server 104 may be implemented as a stand-alone server or a server cluster composed of a plurality of servers, or may be a cloud server.

In an exemplary embodiment, as shown in FIG. 2, a water quality prediction method based on double decomposition and a hybrid model is provided. The method is executed by a computer device, which may be a terminal or a server, or by a terminal and a server together. In this embodiment, the method is described as applied to the server 104 in FIG. 1 and includes the following steps:

S1, acquiring raw water quality time series data, wherein the raw water quality time series data comprise water temperature, pH, dissolved oxygen, permanganate index, ammonia nitrogen, total phosphorus, total nitrogen, conductivity and turbidity.

S2, preprocessing the raw water quality time series data to obtain preprocessed data, wherein the preprocessing comprises denoising and normalization.

S3, determining the data to be processed based on the Pearson correlation coefficient method and the preprocessed data, wherein the data to be processed are the index data correlated with the index to be predicted.

S4, performing complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) on the data to be processed to obtain a plurality of intrinsic mode functions (IMFs).

S5, performing k-means clustering on the plurality of IMFs to obtain clustered IMFs, wherein the clustered IMFs comprise a high-frequency mode, a medium-frequency mode and a low-frequency mode.

S6, performing variational mode decomposition (VMD) on the high-frequency mode among the clustered IMFs to obtain a plurality of high-frequency sub-modes.

S7, inputting the plurality of high-frequency sub-modes, the medium-frequency mode and the low-frequency mode into Parallel Transformer-LSTM models respectively to obtain a final prediction result, wherein the Parallel Transformer-LSTM model is a model built from a Transformer model and an LSTM model.

By implementing steps S1 to S7, CEEMDAN is used for signal decomposition, so that non-stationary, nonlinear water quality time series signals are decomposed more accurately into intrinsic mode functions (IMFs), which improves decomposition accuracy and better captures the features in the water quality data. K-means clustering classifies the IMFs obtained by decomposition into a high-frequency mode, a medium-frequency mode and a low-frequency mode, which reduces the dimensionality of the data and improves computational efficiency while retaining the characteristics of the different frequency bands. VMD is introduced for a secondary decomposition of the high-frequency mode among the clustered IMFs, which further extracts detail features from the signal, strengthens feature extraction for complex water quality data and improves the accuracy of the data representation. Feature extraction is performed by a parallel Transformer-LSTM model. The Transformer model captures long-term dependencies through a self-attention mechanism and is suited to data spanning long time ranges. The LSTM model handles short-term dependencies through its memory cell and gating mechanism and is suited to data spanning short time ranges. Compared with a serial Transformer-LSTM model, the parallel model can capture long-term and short-term dependencies in the time series data simultaneously, thereby improving prediction performance.

The overall scheme is shown in FIG. 3. In step S1, raw water quality time series data are collected from a data source (e.g., sensors or a historical database). The raw water quality time series data comprise nine water quality indexes: water temperature, pH, dissolved oxygen, permanganate index, ammonia nitrogen, total phosphorus, total nitrogen, conductivity and turbidity.

In step S2, the data are normalized to eliminate the influence of the different value ranges of the features. Min-max normalization is generally adopted:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

Wherein $x'$ is the normalized data, $x$ is the raw water quality time series data, $x_{\min}$ is the minimum value of the raw water quality time series data, and $x_{\max}$ is the maximum value of the raw water quality time series data.
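The min-max normalization described above can be sketched as follows. This is a minimal illustration rather than the patented implementation; the function name `min_max_normalize`, the sample readings and the handling of a constant series are assumptions made for the example.

```python
def min_max_normalize(values):
    """Scale a sequence of numbers to [0, 1] using min-max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series would divide by zero; mapping it to 0.0 is an
        # assumption of this sketch, not something the source specifies.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical dissolved-oxygen readings (mg/L)
do_readings = [6.0, 8.0, 7.0, 10.0]
normalized = min_max_normalize(do_readings)
print(normalized)
```

Each index would be normalized separately, so that indexes with large numeric ranges such as conductivity do not dominate those with small ranges such as pH.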

In step S3, the Pearson correlation coefficient method is used to select the feature parameters most relevant to the parameter to be predicted:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} \tag{2}$$

Where $\rho_{X,Y}$ is the correlation coefficient between the feature $X$ and the target $Y$, $\operatorname{cov}(X, Y)$ is their covariance, $\sigma_X$ is the standard deviation of the feature, and $\sigma_Y$ is the standard deviation of the target.
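A small sketch of Pearson-based feature selection is given below. The threshold of 0.5, the helper names and the sample series are hypothetical; the source does not state which cut-off is used to decide that an index is "correlated" with the target.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / math.sqrt(var_x * var_y)


def select_features(features, target, threshold=0.5):
    """Keep feature names whose |rho| with the target reaches the (assumed) threshold."""
    return [name for name, series in features.items()
            if abs(pearson(series, target)) >= threshold]


# Hypothetical data: total nitrogen as the target index
tn = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "NH3-N": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with TN
    "WT": [5.0, 4.0, 3.0, 2.0, 1.0],      # falls as TN rises
    "pH": [7.0, 7.1, 7.0, 7.1, 7.0],      # essentially unrelated
}
selected = select_features(features, tn)
print(selected)
```

Using the absolute value keeps strongly negative correlations as well as positive ones, which is one reasonable reading of "index data with correlation".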

As an optional embodiment, in step S4, the raw water quality time series data are decomposed into several intrinsic mode functions (IMFs) to capture nonlinear and non-stationary features. The basic steps of CEEMDAN are as follows:

(1) Noise addition: generate noise-assisted signals.

White noise sequences $n_i(t)$ with the same variance are superimposed on the original signal $x(t)$ to generate noise-assisted signals:

$$x_i(t) = x(t) + \varepsilon \cdot n_i(t) \tag{3}$$

wherein $x_i(t)$ is the noise-assisted signal, $\varepsilon$ is a very small positive number controlling the noise amplitude, and $i$ is a positive integer indexing the different noise realizations.

(2) First IMF: EMD is performed on each noise-assisted signal $x_i(t)$, and the resulting first modes are averaged to obtain the first IMF:

$$\mathrm{IMF}_1(t) = \frac{1}{N} \sum_{i=1}^{N} \mathrm{IMF}_1^{(i)}(t) \tag{4}$$

Wherein $\mathrm{IMF}_1(t)$ is the first IMF, $\mathrm{IMF}_1^{(i)}(t)$ is the first mode obtained by EMD of $x_i(t)$, and $N$ is the number of noise additions.

(3) Residual calculation.

The first IMF is removed from the original signal to obtain the residual signal:

$$r_1(t) = x(t) - \mathrm{IMF}_1(t) \tag{5}$$

Wherein $r_1(t)$ is the residual signal.

(4) Recursive calculation of subsequent IMFs: the residual signal is taken as input, steps (1) and (2) are repeated, and EMD is continued on the residual to obtain the next IMF, until the residual is close to zero.
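The structure of steps (1) to (3) above (add noise, average the first modes, subtract to get the residual) can be sketched as follows. A full EMD sifting routine is out of scope here, so `moving_average_detail` is a deliberately crude, hypothetical stand-in for "first mode from EMD"; it is not EMD, and the noise amplitude, number of realizations and seed are likewise assumptions.

```python
import math
import random


def moving_average_detail(signal):
    """Stand-in for the first EMD mode: signal minus its 3-point moving average.

    This is NOT empirical mode decomposition; it only isolates fast
    oscillations so that the surrounding ensemble logic can be run.
    """
    n = len(signal)
    detail = []
    for t in range(n):
        window = signal[max(0, t - 1):min(n, t + 2)]
        detail.append(signal[t] - sum(window) / len(window))
    return detail


def first_stage(x, first_mode=moving_average_detail,
                n_realizations=50, eps=0.05, seed=0):
    """One noise-assisted stage: returns (first IMF estimate, residual)."""
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n_realizations):
        # Step (1): noise-assisted signal x_i(t) = x(t) + eps * n_i(t)
        noisy = [v + eps * rng.gauss(0.0, 1.0) for v in x]
        # Step (2): first mode of each realization, accumulated for averaging
        acc = [a + m for a, m in zip(acc, first_mode(noisy))]
    imf1 = [a / n_realizations for a in acc]
    # Step (3): residual r_1(t) = x(t) - IMF_1(t)
    residual = [v - m for v, m in zip(x, imf1)]
    return imf1, residual


x = [math.sin(0.3 * t) + 0.2 * math.sin(2.5 * t) for t in range(40)]
imf1, residual = first_stage(x)
```

Step (4) would call the same stage again with `residual` as input. In practice a dedicated EMD or CEEMDAN implementation would be passed as `first_mode` instead of the stand-in.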

As an optional embodiment, in step S5, the IMFs obtained by decomposition are clustered: the complexity of each IMF component is measured by its sample entropy, and the components are divided into a high-frequency mode, a medium-frequency mode and a low-frequency mode. The goal of clustering is to group IMFs with similar frequency characteristics. The distance from each IMF component to every cluster center is calculated and the component is assigned to the nearest center, which minimizes the within-cluster sum of squared distances:

$$J = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2 \tag{6}$$

Where $x$ is the feature vector of an IMF component, $k$ is the number of clusters, $C_i$ is the $i$-th cluster, and $\mu_i$ is the center of the $i$-th cluster. The IMF components are labelled high-frequency, medium-frequency or low-frequency according to the final clustering result. Dividing the IMFs produced by CEEMDAN into these three categories simplifies subsequent processing and allows the different frequency components to be handled in a targeted way, in particular the further removal of high-frequency noise.

Specifically, the k-means clustering comprises the following steps:

(1) The complexity of each IMF component is calculated by the sample entropy SE, which is used to distinguish high, medium and low frequencies. The basic steps of the sample entropy calculation are as follows:

Step one: input parameters and data preprocessing.

1) Input data: the IMF component $\mathrm{IMF}_k(t)$.

2) Embedding dimension $m$: the dimension of the constructed vectors (typically 2 or 3).

3) Similarity tolerance $r$: used to determine whether two subsequences are similar, typically 0.1 to 0.25 times the standard deviation of the input sequence.

4) Delay time $\tau$: the time interval used to construct the embedding matrix; the default value is 1.

Step two: construct the embedding matrix.

For a time series $x(t)$ of length $N$, construct the sequence of embedding vectors of length $m$:

$$X^m(i) = \{x(i), x(i+1), \ldots, x(i+m-1)\}, \quad i = 1, 2, \ldots, N-m+1 \tag{7}$$

Wherein $X^m(i)$ is a subsequence of length $m$.

Step three: calculate the distance between embedding vectors.

The distance between two embedding vectors $X^m(i)$ and $X^m(j)$ is defined as the Chebyshev distance:

$$d\big[X^m(i), X^m(j)\big] = \max_{k = 0, \ldots, m-1} \lvert x(i+k) - x(j+k) \rvert \tag{8}$$

This distance is used to determine the similarity between the two subsequences.

Step four: count the matched samples.

For each embedding vector $X^m(i)$, count the vectors whose distance from it is less than the threshold $r$, excluding self-matches:

$$B_i^m(r) = \frac{1}{N-m} \, \#\big\{\, j \ne i : d\big[X^m(i), X^m(j)\big] < r \,\big\} \tag{9}$$

Wherein $B_i^m(r)$ is the proportion of vectors similar to $X^m(i)$.

Step five: average over all matched samples.

Calculate the average similarity over all embedding vectors:

$$B^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} B_i^m(r) \tag{10}$$

Step six: increase the embedding dimension.

Increase the embedding dimension from $m$ to $m+1$ and repeat the above steps to calculate $B^{m+1}(r)$.

Step seven: calculate the sample entropy.

The sample entropy is defined as the negative logarithm of the ratio of the similarities in the two embedding dimensions:

$$SE(m, r) = -\ln \frac{B^{m+1}(r)}{B^m(r)} \tag{11}$$

The more complex the pattern in the signal, the larger the sample entropy; the simpler the pattern, the smaller the sample entropy.
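The seven steps above can be sketched in plain Python as follows. This is an illustrative version under stated assumptions: delay time 1, tolerance `r_factor` times the standard deviation (0.2 is an assumed default inside the range given above), and the common convention of using the same N − m templates for both embedding dimensions, which differs slightly from the N − m + 1 vectors mentioned in the text.

```python
import math
import random


def sample_entropy(series, m=2, r_factor=0.2):
    """Sample entropy with Chebyshev distance and self-matches excluded."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    r = r_factor * std

    def count_matches(length):
        # Same N - m starting points for both lengths, so counts are comparable.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i, a in enumerate(templates):
            for j, b in enumerate(templates):
                if i != j and max(abs(p - q) for p, q in zip(a, b)) < r:
                    count += 1
        return count

    b = count_matches(m)       # similar pairs at dimension m
    a = count_matches(m + 1)   # similar pairs at dimension m + 1
    if a == 0 or b == 0:
        return float("inf")    # undefined ratio: treat as maximally irregular
    return -math.log(a / b)


# A strictly periodic series versus hypothetical Gaussian noise
periodic = [0.0, 1.0] * 25
rng = random.Random(42)
noisy = [rng.gauss(0.0, 1.0) for _ in range(200)]

se_periodic = sample_entropy(periodic)
se_noisy = sample_entropy(noisy)
```

The periodic series is perfectly predictable, so every match at dimension m extends to dimension m + 1 and its entropy is zero, while the noise series scores higher. That ordering is what lets sample entropy separate high-frequency from low-frequency IMFs.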

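(2) Construct the feature matrix: the sample entropy and the frequency energy of each IMF are assembled into a feature matrix $F$:

$$F = \begin{bmatrix} SE_1 & \mathrm{Spectrum}_1 \\ SE_2 & \mathrm{Spectrum}_2 \\ \vdots & \vdots \\ SE_M & \mathrm{Spectrum}_M \end{bmatrix} \tag{12}$$

Where $SE_k$ is the sample entropy of the $k$-th IMF, $\mathrm{Spectrum}_k$ is the spectrum (frequency energy) of the $k$-th IMF, and $M$ is the number of IMFs.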

(3) Apply k-means clustering.

The IMFs are clustered into three classes, representing the high-frequency, medium-frequency and low-frequency modes respectively. The sample entropy values of several IMF components (three in this invention) are randomly selected as the initial cluster centers. The Euclidean distance between each IMF component and every cluster center is calculated, and the component is assigned to the nearest center.

Euclidean distance formula:

$$d(x_i, c_j) = \sqrt{\sum_{l=1}^{n} (x_{il} - c_{jl})^2} \tag{13}$$

Where $d(x_i, c_j)$ is the distance between data point $x_i$ and cluster center $c_j$, $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})$ is the coordinate of data point $i$ in $n$-dimensional space, and $c_j = (c_{j1}, c_{j2}, \ldots, c_{jn})$ is the coordinate of cluster center $j$ in $n$-dimensional space.

For each cluster, the mean of all data points in the cluster is calculated and taken as the new cluster center:

$$c_j = \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i \tag{14}$$

Where $C_j$ is the set of all data points currently belonging to cluster $j$, $\lvert C_j \rvert$ is the number of data points in the cluster, and $x_i$ is a data point belonging to $C_j$.

The assignment and update steps are repeated until the cluster centers no longer change. The IMFs in each class are then summed according to the clustering result to obtain the high-frequency mode, the medium-frequency mode and the low-frequency mode.
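The assignment and update loop, followed by the per-cluster summation of IMFs, might look like the sketch below. The text uses random initial centers; here explicit initial centers can be passed so that the example is reproducible. The feature values, the toy IMFs and the function names are all hypothetical.

```python
import random


def kmeans(points, k=3, init=None, max_iter=100, seed=0):
    """Plain k-means on lists of feature vectors; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(c) for c in (init if init is not None else rng.sample(points, k))]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest center (squared Euclidean distance gives
        # the same nearest center as the Euclidean distance itself).
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(
                [sum(col) / len(members) for col in zip(*members)] if members else centers[j]
            )
        if new_centers == centers:  # centers stopped moving
            break
        centers = new_centers
    return labels, centers


def sum_by_cluster(imfs, labels, k=3):
    """Add up the IMFs of each cluster, point by point, to form one mode per cluster."""
    return [
        [sum(vals) for vals in zip(*[imf for imf, lab in zip(imfs, labels) if lab == j])]
        for j in range(k)
    ]


# Hypothetical sample-entropy feature for six IMFs (high values = more irregular)
features = [[1.9], [1.8], [1.0], [0.9], [0.2], [0.1]]
labels, centers = kmeans(features, k=3, init=[[1.9], [1.0], [0.1]])

# Toy two-sample IMFs, only to show the summation
imfs = [[1, -1], [2, -2], [3, 3], [4, 4], [5, 6], [7, 8]]
high, medium, low = sum_by_cluster(imfs, labels)
```

With these initial centers, cluster 0 collects the high-entropy components, so `high` is the high-frequency mode that would be handed on to the VMD step.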

In an exemplary embodiment, step S6 specifically includes:

S61, determining the number of high-frequency sub-modes, and initializing the center frequency and mode function of each high-frequency sub-mode.

S62, performing bandwidth-minimizing decomposition on the high-frequency mode to obtain a plurality of modal components.

S63, based on the plurality of modal components, iteratively updating the mode functions and center frequencies by a variational method, minimizing the bandwidth of the modes until convergence, to obtain the plurality of high-frequency sub-modes.

VMD is further applied to the clustered high-frequency mode to generate sub-modes. VMD is an adaptive decomposition method for non-stationary signals that decomposes a signal into several sub-modes, each with its own center frequency, by solving a variational problem. The basic steps of VMD are as follows:

(1) Initialize the center frequency of each sub-mode.

(2) Iteratively update the sub-modes and center frequencies by a variational method until convergence.

Specifically, VMD is applied to the clustered high-frequency mode for secondary decomposition, further extracting detail features:

(1) Initialization of the number of modes and the frequencies: set the number of modes K (the invention decomposes the high-frequency signal into 3 components) and the initial center frequencies (spectrum analysis is first performed on the signal to find its main frequency components, which are taken as the initial center frequencies of the modes).

(2) Signal decomposition: perform bandwidth-minimizing decomposition on the input high-frequency signal to obtain a plurality of modal components.

(3) Alternating optimization: minimize the bandwidth of the modes by alternately updating the modal components and center frequencies.

(4) Convergence check: determine whether the convergence condition is met; if so, output each modal component, otherwise continue iterating.
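Only the initialization in step (1) is sketched here: picking the K strongest spectral components as initial center frequencies. The sketch uses a naive discrete Fourier transform to stay dependency-free; the function name and the test signal are hypothetical, and the alternating variational updates of VMD itself are not shown.

```python
import cmath
import math


def initial_center_frequencies(signal, k=3):
    """Return the k dominant normalized frequencies (cycles per sample)."""
    n = len(signal)
    magnitudes = []
    for f in range(1, n // 2):  # skip the DC bin, keep positive frequencies
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        magnitudes.append((abs(coeff), f))
    top = sorted(magnitudes, reverse=True)[:k]
    return sorted(f / n for _, f in top)


# Hypothetical high-frequency mode built from three sinusoids
n = 64
signal = [
    math.sin(2 * math.pi * 5 * t / n)
    + 0.5 * math.sin(2 * math.pi * 12 * t / n)
    + 0.25 * math.sin(2 * math.pi * 20 * t / n)
    for t in range(n)
]
omega_init = initial_center_frequencies(signal, k=3)
print(omega_init)
```

On real data, neighbouring bins of a single broad peak can crowd out weaker components, so a peak-picking rule with a minimum spacing may be preferable to taking the top K bins directly.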

As an optional embodiment, the water quality prediction method based on double decomposition and a hybrid model further comprises data reconstruction, in which the medium-frequency mode, the low-frequency mode and the high-frequency sub-modes are recombined to generate a feature data set for prediction. The feature data set is divided into a training set and a test set, typically with 80% of the data used for training and 20% for testing.
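The recombination and the 80/20 split can be illustrated as below. The source does not say whether the split is chronological or shuffled; this sketch assumes a chronological split, which is the usual choice for time series, and the helper names and component values are hypothetical.

```python
def build_feature_rows(components):
    """Zip equally long component series into one feature row per time step."""
    return [list(row) for row in zip(*components)]


def chronological_split(rows, train_ratio=0.8):
    """Earliest rows go to training, latest rows to testing (no shuffling)."""
    cut = int(len(rows) * train_ratio)
    return rows[:cut], rows[cut:]


# Hypothetical components: three high-frequency sub-modes, then the
# medium-frequency mode and the low-frequency mode, ten time steps each.
components = [[float(c * 10 + t) for t in range(10)] for c in range(5)]
rows = build_feature_rows(components)
train, test = chronological_split(rows)
```

Keeping the test rows strictly later than the training rows avoids leaking future information into the model during training.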

In an exemplary embodiment, step S7 specifically includes:

S71, dividing the plurality of high-frequency sub-modes into a plurality of parallel streams.

S72, inputting the plurality of parallel streams into a Transformer model and an LSTM model respectively to obtain a plurality of first features at different time scales.

S73, concatenating the plurality of first features at different time scales at the feature level to obtain a plurality of first prediction results.

S74, inputting the medium-frequency mode into a Transformer model and an LSTM model respectively to obtain second features at different time scales.

S75, concatenating the second features at different time scales at the feature level to obtain a second prediction result.

S76, inputting the low-frequency mode into a Transformer model and an LSTM model respectively to obtain third features at different time scales.

S77, concatenating the third features at different time scales at the feature level to obtain a third prediction result.

S78, performing regression prediction on the plurality of first prediction results, the second prediction result and the third prediction result with a random forest to obtain the final prediction result.

The invention uses a random forest model to make the final prediction from the features extracted by the Transformer-LSTM model. This combines the predictions of the individual IMFs from the Parallel Transformer-LSTM models more effectively, reduces the bias and error introduced by directly superimposing the predictions, and improves the stability, noise resistance, accuracy and robustness of the model.

(Conventional superposition increases the error for the following reasons:

Bias of the IMF predictions: the prediction for each IMF carries a bias, and these biases accumulate when the predictions are superimposed, producing a large error.

Correlation between IMFs: the IMFs are correlated with one another, so direct superposition can introduce additional error.

Error propagation: the error of each IMF prediction is amplified during superposition.

Therefore, instead of simple superposition, a regression model is used to learn how to combine the predictions of the IMFs.)

Random forest is an ensemble learning method. By integrating multiple decision trees, it improves the stability, noise immunity, accuracy and robustness of the model and reduces the risk of overfitting, bias and error accumulation.

The Parallel Transformer-LSTM model is described below.

The Transformer model is used to capture long-term dependencies in the time series data. A weighted sum over all positions in the sequence is calculated by a self-attention mechanism:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\mathsf{T}}}{\sqrt{d_k}}\right) V \tag{15}$$

Where $Q$ is the query matrix, $K$ is the key matrix, $V$ is the value matrix, and $d_k$ is the dimension of the key vectors.

The steps are as follows:

① Input data embedding: the time series data are embedded into a high-dimensional space.

② Self-attention mechanism: the attention weights between each time step and the other time steps are calculated, capturing global dependencies.

③ Feedforward neural network: the self-attention output is processed to generate the final feature representation.
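Step ②, scaled dot-product self-attention, can be written out for small matrices as in the sketch below. It covers a single attention head with no learned projections, masking or batching; the matrices are hypothetical toy values chosen so the result is easy to check by hand.

```python
import math


def softmax(row):
    """Numerically stable softmax of one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention; Q, K, V are lists of row vectors."""
    d_k = len(K[0])
    scores = [
        [sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d_k) for k_row in K]
        for q_row in Q
    ]
    weights = [softmax(row) for row in scores]  # one weight per (query, key) pair
    output = [
        [sum(w * v_row[c] for w, v_row in zip(w_row, V)) for c in range(len(V[0]))]
        for w_row in weights
    ]
    return output, weights


# Two time steps with identical keys, so each query attends equally to both
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [1.0, 0.0]]
V = [[2.0, 4.0], [4.0, 8.0]]
output, weights = attention(Q, K, V)
print(weights)  # [[0.5, 0.5], [0.5, 0.5]]
print(output)   # [[3.0, 6.0], [3.0, 6.0]]
```

Because every time step can attend to every other time step directly, the distance between two positions in the sequence does not weaken their interaction, which is why this mechanism is used for long-term dependencies.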

The LSTM model is used to capture short-term dependencies in the time series data; it processes the time series data through a forget gate, an input gate, a candidate memory state and an output gate. The steps are as follows:

Forget gate: determines which information in the memory cell state of the previous time step is retained.

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \quad (16)$$

① Meaning: the forget gate determines which information in the memory cell state $C_{t-1}$ of the previous time step needs to be forgotten.

② Calculation: the inputs are the previous hidden state $h_{t-1}$ and the current input $x_t$, which are linearly transformed by the weight matrix $W_f$ and the bias $b_f$ and then passed through the sigmoid function $\sigma$ (output between 0 and 1). The output $f_t$ is a vector, each element of which represents the proportion of information retained at the corresponding position of the memory cell state.

Input gate: determines which part of the current input information is written into the memory cell state.

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \quad (17)$$

① Meaning: the input gate determines which information in the current input $x_t$ needs to be written to the memory cell state $C_t$.

② Calculation: the inputs are the previous hidden state $h_{t-1}$ and the current input $x_t$, which are linearly transformed by the weight matrix $W_i$ and the bias $b_i$ and then passed through the sigmoid function $\sigma$. The output $i_t$ is a vector, each element of which represents the proportion of the information at the corresponding position of the current input that is written into the memory cell state.

Candidate memory cell state: generates the new candidate memory content of the current time step.

$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) \quad (18)$$

① Meaning: a new candidate memory cell state $\tilde{C}_t$ is generated, representing the new memory content calculated at the current time step from the current input and the previous hidden state.

② Calculation: the inputs are the previous hidden state $h_{t-1}$ and the current input $x_t$, which are linearly transformed by the weight matrix $W_C$ and the bias $b_C$ and then passed through the tanh function (output between -1 and 1). The output $\tilde{C}_t$ is a vector, each element of which represents candidate memory content for the current time step.

Memory cell state update: the memory state of the previous time step and the candidate memory state of the current time step are integrated to generate a new memory cell state.

$$C_t = f_t \cdot C_{t-1} + i_t \cdot \tilde{C}_t \quad (19)$$

① Meaning: the memory cell state $C_t$ is updated by integrating the memory state $C_{t-1}$ of the previous time step and the candidate memory state $\tilde{C}_t$ of the current time step.

② Calculation: the memory cell state $C_t$ is an element-wise weighted sum of the previous memory cell state $C_{t-1}$ and the current candidate memory state $\tilde{C}_t$. The forget gate $f_t$ determines which information in the previous memory state is retained, and the input gate $i_t$ determines which of the currently input information is written into the new memory state.

Output gate: determines which information in the memory cell state is output to the hidden state.

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \quad (20)$$

① Meaning: the output gate determines which information in the current memory cell state $C_t$ needs to be output to the hidden state $h_t$.

② Calculation: the inputs are the previous hidden state $h_{t-1}$ and the current input $x_t$, which are linearly transformed by the weight matrix $W_o$ and the bias $b_o$ and then passed through the sigmoid function $\sigma$. The output $o_t$ is a vector, each element of which represents the proportion of the information at the corresponding position of the memory cell state that is output to the hidden state.

Hidden state calculation: the hidden state is the output of the LSTM unit and contains the memory and state information of the current time step.

$$h_t = o_t \cdot \tanh(C_t) \quad (21)$$

① Meaning: the hidden state $h_t$ of the current time step is calculated; it is the output of the LSTM unit.

② Calculation: the hidden state $h_t$ is the element-wise product of the output gate $o_t$ and the current memory cell state $C_t$ after it has been passed through the tanh function. The output gate $o_t$ controls which information in the memory cell state is output to the hidden state.
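The gate equations above can be followed step by step in a minimal sketch of a single LSTM time step. The hidden size (2), the input size (1), the constant weights of 0.1 and the dictionary layout of `params` are assumptions chosen only to keep the example small; in practice the weights would be learned during training.

```python
import math


def sigmoid(z):
    """Logistic function, output between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(w, b, vec):
    """Compute W·vec + b for a weight matrix given as a list of rows."""
    return [sum(wi * vi for wi, vi in zip(row, vec)) + bi for row, bi in zip(w, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step: gates, cell-state update and hidden state."""
    concat = h_prev + x_t  # the concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(params["W_f"], params["b_f"], concat)]       # forget gate
    i_t = [sigmoid(z) for z in affine(params["W_i"], params["b_i"], concat)]       # input gate
    c_tilde = [math.tanh(z) for z in affine(params["W_C"], params["b_C"], concat)]  # candidate state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]         # cell-state update
    o_t = [sigmoid(z) for z in affine(params["W_o"], params["b_o"], concat)]       # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                             # hidden state
    return h_t, c_t


# Hypothetical sizes and weights, for illustration only.
hidden_size, input_size = 2, 1
params = {}
for gate in ("f", "i", "C", "o"):
    params["W_" + gate] = [[0.1] * (hidden_size + input_size) for _ in range(hidden_size)]
    params["b_" + gate] = [0.0] * hidden_size

h, c = [0.0] * hidden_size, [0.0] * hidden_size
for value in [0.2, 0.4, 0.6]:  # a short normalized series
    h, c = lstm_step([value], h, c, params)
```

After the loop, `h` is the hidden state for the last time step and could be passed on as the LSTM branch feature.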

Parallel Transformer-LSTM model processing:

① Data parallelization: the data is divided into a plurality of parallel streams, which are input into the Transformer and the LSTM respectively to capture features at different time scales.

② Model fusion: the outputs of the Transformer and the LSTM are fused by feature-level connection and passed through a fully connected layer for the final prediction. The model thus uses the self-attention mechanism of the Transformer and the time-series modeling capability of the LSTM at the same time, combining two different types of feature extraction to provide a richer feature representation for the final prediction result.
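As a hedged illustration of the fusion step, feature-level connection can be read as concatenating the two branch outputs and applying a single fully connected output layer. The feature values, the weights and the bias below are hypothetical; in a trained model they would be learned, and the function name is chosen only for this sketch.

```python
def fuse_and_predict(transformer_features, lstm_features, weights, bias):
    """Concatenate the two feature vectors and apply one linear output layer."""
    fused = list(transformer_features) + list(lstm_features)
    if len(fused) != len(weights):
        raise ValueError("weight vector must match the fused feature length")
    prediction = sum(w * f for w, f in zip(weights, fused)) + bias
    return fused, prediction


# Hypothetical branch outputs for one IMF at one forecast step.
transformer_features = [0.2, -0.1, 0.4]
lstm_features = [0.3, 0.5]
weights = [0.5, 0.5, 0.5, 1.0, 1.0]  # would be learned during training
bias = 0.1

fused, y_hat = fuse_and_predict(transformer_features, lstm_features, weights, bias)
```

Concatenation keeps the features of both branches intact, so the fully connected layer is free to weight the long-range (Transformer) and short-range (LSTM) features differently.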

Random forest regression prediction:

The outputs of the Parallel Transformer-LSTM model are post-processed using a random forest to reduce the effect of the offset and error of each IMF. The random forest is an ensemble learning method that improves predictive performance and stability by building multiple decision trees and averaging their results. The basic steps of the random forest are as follows:

① Several subsamples are randomly sampled from the training set.

② A decision tree is built for each sub-sample.

③ The prediction results of all decision trees are averaged to obtain the final prediction value.
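The three steps above can be sketched in plain Python as follows. For brevity each "tree" is a depth-1 regression stump fitted on one randomly chosen feature, which is a deliberate simplification of a full decision tree; the function names, the number of trees and the synthetic data are assumptions for this example.

```python
import random


def fit_stump(rows, targets, feature):
    """Fit a depth-1 regression tree on one feature by minimising squared error."""
    best = None
    values = sorted(set(r[feature] for r in rows))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [y for r, y in zip(rows, targets) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, targets) if r[feature] > threshold]
        l_mean, r_mean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - l_mean) ** 2 for y in left) + sum((y - r_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    if best is None:  # all feature values identical: always predict the mean
        mean = sum(targets) / len(targets)
        return feature, float("inf"), mean, mean
    return feature, best[1], best[2], best[3]


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Steps 1 and 2: draw bootstrap sub-samples and fit one stump per sub-sample."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sampling with replacement
        feature = rng.randrange(n_features)          # random feature for this stump
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Step 3: average the predictions of all trees."""
    votes = [l_mean if row[f] <= t else r_mean for f, t, l_mean, r_mean in forest]
    return sum(votes) / len(votes)


# Synthetic example: each row stands for three per-IMF predictions (made-up values).
rng = random.Random(1)
rows = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(60)]
targets = [sum(r) for r in rows]
forest = fit_forest(rows, targets)
estimate = predict(forest, [0.5, 0.5, 0.5])
```

A practical system would more likely rely on an established random forest implementation with full-depth trees; the sketch is only meant to make the sampling, fitting and averaging steps concrete.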

Model evaluation and prediction.

1) Evaluating model performance.

Error indexes are calculated; the mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and coefficient of determination (R²) are used as evaluation indexes:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{N}(y_i - \bar{y})^2}$$

where MSE is the mean square error, RMSE is the root mean square error, MAE is the mean absolute error, R² is the coefficient of determination, $N$ is the number of samples, $y_i$ is the true value, $\hat{y}_i$ is the predicted value of the model, and $\bar{y}$ is the mean of the true values.

MSE, RMSE and MAE measure the deviation between the true and predicted values; smaller values indicate more accurate model predictions. R² reflects how well the model fits the data and generally ranges from 0 to 1; the closer R² is to 1, the better the fitting capability of the model.
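For illustration, the four evaluation indexes can be computed as in the following sketch. The function name and the four sample values are hypothetical and are not taken from the experiments of the invention.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for two equally long sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}


# Made-up true and predicted values.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
# metrics["MSE"] is 0.125, metrics["MAE"] is 0.25 and metrics["R2"] is 0.9
```

Because RMSE is the square root of MSE, the two indexes always rank models identically; MAE is less sensitive to individual large errors.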

2) Generating a prediction result.

A prediction result is generated using the trained model and inversely normalized to the original range. The trained model is deployed to a production environment for real-time prediction and decision support.
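Inverse normalization can be sketched as follows, assuming the min-max normalization defined in claim 2, in which the minimum and maximum of the original series are kept so that predictions can be mapped back. The function names and the three sample values are hypothetical.

```python
def fit_min_max(values):
    """Record the minimum and maximum of the original series."""
    return min(values), max(values)


def normalize(values, x_min, x_max):
    """Map values to [0, 1] using x' = (x - x_min) / (x_max - x_min)."""
    span = x_max - x_min
    if span == 0:
        raise ValueError("series is constant; min-max normalization is undefined")
    return [(v - x_min) / span for v in values]


def denormalize(scaled, x_min, x_max):
    """Invert the mapping: x = x' * (x_max - x_min) + x_min."""
    span = x_max - x_min
    return [s * span + x_min for s in scaled]


# Made-up total nitrogen readings.
series = [1.2, 2.0, 3.2]
x_min, x_max = fit_min_max(series)
scaled = normalize(series, x_min, x_max)
restored = denormalize(scaled, x_min, x_max)
```

The minimum and maximum should be taken from the training data only, so that the same mapping can be applied to model outputs at prediction time.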

In one exemplary embodiment, the method specifically comprises the following steps:

(1) Data acquisition and preprocessing: water quality monitoring data is collected and the necessary preprocessing, including denoising and normalization, is carried out.

(2) CEEMDAN decomposition: the preprocessed data is decomposed by CEEMDAN to obtain a series of IMFs.

(3) K-means clustering: k-means clustering is performed on the IMFs to divide them into high-frequency, medium-frequency and low-frequency components.

(4) VMD secondary decomposition: VMD is performed on the high-frequency component of the clustered IMFs to further extract detail features.

(5) Parallel Transformer-LSTM feature extraction: the decomposed features are input into the Transformer and LSTM models at the same time for feature extraction, which improves processing efficiency and fuses multiple kinds of feature information.

(6) Random forest final prediction: the features output by the Parallel Transformer-LSTM model are finally predicted using a random forest to obtain the predicted value of the water quality parameter.

(7) Result analysis and evaluation: the prediction result is analyzed, the performance of the model is evaluated, and adjustment and optimization are carried out as required.

In one exemplary embodiment, the water quality dataset is derived from the published data of the China National Environmental Monitoring Centre (https://www.cnemc.cn/). The total nitrogen of the section of the child under the Zhejiang river is predicted.

(1) Data processing and feature extraction.

Abnormal data is deleted and the deleted values are filled in by spline interpolation. Based on the Pearson correlation coefficient method, the parameters whose correlation coefficient with total nitrogen is greater than 0.2 are water temperature, dissolved oxygen, ammonia nitrogen and total phosphorus. Therefore, water temperature, dissolved oxygen, ammonia nitrogen, total phosphorus and total nitrogen are selected for water quality prediction. The Pearson correlation heat map of the water quality indexes is shown in fig. 4.
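The feature selection described above can be sketched as follows. The readings are made up for illustration, and the comparison uses the absolute value of the coefficient so that strongly negative correlations are also kept; that reading of "greater than 0.2" is an assumption, as are the function and column names.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient: cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_features(table, target, threshold=0.2):
    """Keep indicators whose |correlation| with the target exceeds the threshold.

    Using the absolute value is an assumption made for this sketch.
    """
    return [
        name
        for name, series in table.items()
        if name != target and abs(pearson(series, table[target])) > threshold
    ]


# Hypothetical readings for six sampling times.
table = {
    "total_nitrogen": [2.0, 2.4, 2.1, 2.8, 3.0, 2.6],
    "ammonia_nitrogen": [0.20, 0.26, 0.22, 0.30, 0.33, 0.27],
    "dissolved_oxygen": [9.0, 8.1, 8.8, 7.5, 7.0, 7.9],
    "pH": [7.2, 7.6, 7.4, 7.2, 7.4, 7.6],
}
selected = select_features(table, "total_nitrogen")
```

With these made-up values, ammonia nitrogen (positively correlated) and dissolved oxygen (negatively correlated) pass the threshold, while pH does not.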

(2) CEEMDAN signal decomposition.

As shown in fig. 5 and fig. 6, the original water quality time series data is decomposed into 14 intrinsic mode functions (IMFs).

(3) K-means clustering.

As shown in fig. 7, the sample entropy of each IMF obtained by the decomposition is calculated, and the IMFs are clustered accordingly into a high-frequency mode, a medium-frequency mode and a low-frequency mode.
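A minimal sketch of this step is given below, assuming that k-means is run on the scalar sample entropy of each IMF and that a higher entropy indicates a higher-frequency, more irregular component. The embedding dimension m = 2, the tolerance of 0.2 standard deviations, the quantile-based initial centres and the six entropy values are all assumptions for illustration.

```python
import math
import random


def sample_entropy(series, m=2, r_factor=0.2):
    """Sample entropy SampEn(m, r) with r = r_factor * standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    r = r_factor * std

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is within r.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b, a = count_matches(m), count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)


def kmeans_1d(values, k=3, iterations=100):
    """Plain k-means on scalars; centres start at evenly spaced quantiles (k >= 2)."""
    ordered = sorted(values)
    centres = [ordered[(len(ordered) - 1) * c // (k - 1)] for c in range(k)]
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centres[c])
        if updated == centres:
            break
        centres = updated
    return labels, centres


# Sample entropy separates an irregular signal from a regular one.
random.seed(0)
noisy = [random.gauss(0.0, 1.0) for _ in range(200)]
smooth = [math.sin(2 * math.pi * i / 50) for i in range(200)]
entropy_noisy, entropy_smooth = sample_entropy(noisy), sample_entropy(smooth)

# Hypothetical sample entropy values for six IMFs, clustered into three bands.
entropies = [2.0, 1.9, 1.0, 0.9, 0.1, 0.05]
labels, centres = kmeans_1d(entropies)
bands = [("low", "medium", "high")[lab] for lab in labels]
```

Because the initial centres are sorted in ascending order, cluster 0 ends up with the lowest entropies and cluster 2 with the highest, which is what allows the labels to be mapped directly to low, medium and high frequency.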

(4) VMD secondary decomposition. As shown in fig. 8, the high-frequency signal is further decomposed by VMD to obtain IMF1, IMF2 and IMF3; IMF4 and IMF5 correspond to the medium-frequency and low-frequency signals in fig. 7.

(5) Comparison of the prediction results of the water quality model with other prediction methods. The comparison results are shown in Table 1.

Table 1 Comparison of prediction performance

Model	MSE	RMSE	MAE	R²
The proposed method	0.19334	0.37234	0.30162	0.93821
Parallel-Transformer-LSTM	0.36571	0.40708	0.35207	0.91451
Serial-Transformer-LSTM	0.34412	0.43963	0.34088	0.91816

The invention also provides an application scenario to which the water quality prediction method based on the double decomposition and hybrid model is applied. The method can be applied to a water quality prediction scenario, which comprises a data acquisition link, a data processing link, a complete ensemble empirical mode decomposition link, a clustering link, a variational mode decomposition link and a prediction link. Firstly, original water quality time series data is acquired, comprising water temperature, pH, dissolved oxygen, permanganate index, ammonia nitrogen, total phosphorus, total nitrogen, conductivity and turbidity. The original water quality time series data is preprocessed to obtain preprocessed data, the preprocessing comprising denoising and normalization, and the data to be processed is determined based on the Pearson correlation coefficient method and the preprocessed data; the data to be processed is the index data correlated with the index to be predicted. Secondly, complete ensemble empirical mode decomposition is performed on the data to be processed to obtain a plurality of intrinsic mode functions. Further, k-means clustering is performed on the plurality of intrinsic mode functions to obtain clustered intrinsic mode functions, which comprise a high-frequency mode, a medium-frequency mode and a low-frequency mode. Variational mode decomposition is then performed on the high-frequency mode among the clustered intrinsic mode functions to obtain a plurality of high-frequency sub-modes. Finally, the plurality of high-frequency sub-modes, the medium-frequency mode and the low-frequency mode are respectively input into the Parallel Transformer-LSTM model to obtain the final prediction result.

In an exemplary embodiment, a computer device, which may be a server or a terminal, is provided, and an internal structure thereof may be as shown in fig. 9. The computer device includes a processor, a memory, an input/output (I/O) interface and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing original water quality time series data. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the water quality prediction method based on the double decomposition and hybrid model.

It will be appreciated by persons skilled in the art that the architecture shown in fig. 9 is merely a block diagram of some of the architecture relevant to the present inventive arrangements and is not limiting as to the computer device to which the present inventive arrangements are applicable, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

In an exemplary embodiment, there is also provided a computer device including a memory and a processor, the memory storing a computer program, the processor implementing the above-described method embodiments when executing the computer program.

In one exemplary embodiment, a computer-readable storage medium is provided, storing a computer program that, when executed by a processor, implements the above-described method embodiments.

In an exemplary embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the above-described method embodiments.

It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present invention are both information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data are required to meet the related regulations.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can take various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).

The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present invention may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.

The technical features of the above embodiments may be arbitrarily combined. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between the combinations of the technical features, they should be considered to be within the scope of this description.

The principles and embodiments of the present invention have been described herein with reference to specific examples, which are intended only to facilitate an understanding of the method and core idea of the invention. Persons of ordinary skill in the art may, in accordance with the idea of the invention, make changes to the specific embodiments and the scope of application. In view of the foregoing, this description should not be construed as limiting the invention.

Claims

1. A water quality prediction method based on a double decomposition and hybrid model, characterized in that the water quality prediction method based on a double decomposition and hybrid model comprises:

Obtaining original water quality time series data; the original water quality time series data includes: water temperature, pH, dissolved oxygen, permanganate index, ammonia nitrogen, total phosphorus, total nitrogen, conductivity and turbidity;

Preprocessing the original water quality time series data to obtain preprocessed data; the preprocessing includes: denoising and normalization;

Based on the Pearson correlation coefficient method and the pre-processed data, determining the data to be processed; the data to be processed is the index data that is correlated with the index to be predicted;

Performing complete ensemble empirical mode decomposition on the data to be processed to obtain a plurality of intrinsic mode functions;

Performing k-means clustering on a plurality of the intrinsic mode functions to obtain clustered intrinsic mode functions; the clustered intrinsic mode functions include: high-frequency mode, medium-frequency mode and low-frequency mode;

Performing variational mode decomposition on the high-frequency modes in the clustered intrinsic mode functions to obtain a plurality of high-frequency sub-modes;

Several of the high-frequency sub-modes, the medium-frequency modes and the low-frequency modes are respectively input into the Parallel Transformer-LSTM model to obtain the final prediction result; the Parallel Transformer-LSTM model is a model established based on the Transformer model and the LSTM model.

2. The water quality prediction method based on double decomposition and hybrid model according to claim 1, characterized in that the normalization expression is:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

wherein $x'$ is the normalized data; $x$ is the original water quality time series data; $x_{\min}$ is the minimum value of the original water quality time series data; $x_{\max}$ is the maximum value of the original water quality time series data.

3. The water quality prediction method based on double decomposition and hybrid model according to claim 1, characterized in that the expression of the Pearson correlation coefficient method is:

$$\rho_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$$

wherein $\rho_{X,Y}$ is the correlation coefficient between feature $X$ and target $Y$, $\mathrm{cov}(X, Y)$ is the covariance, $\sigma_X$ is the standard deviation of the feature, and $\sigma_Y$ is the standard deviation of the target.

4. The water quality prediction method based on double decomposition and hybrid model according to claim 1 is characterized in that the high-frequency modes in the clustered intrinsic mode function are subjected to variational mode decomposition to obtain a plurality of high-frequency sub-modes, specifically including:

Determining the number of high-frequency sub-modes, and initializing the center frequency and modal function of each high-frequency sub-mode;

Performing bandwidth minimization decomposition on the high frequency mode to obtain a plurality of modal components;

Based on the several modal components, the modal function and the center frequency are iteratively updated by using the variational method to minimize the bandwidth of the mode until convergence, thereby obtaining several high-frequency sub-modes.

5. The water quality prediction method based on double decomposition and hybrid model according to claim 1 is characterized in that a plurality of the high-frequency sub-modes, the intermediate-frequency modes and the low-frequency modes are respectively input into the Parallel Transformer-LSTM model to obtain the final prediction result, which specifically includes:

dividing a plurality of the high frequency sub-modes into a plurality of parallel streams;

Inputting the plurality of parallel streams into a Transformer model and an LSTM model respectively to obtain features of a plurality of first different time scales;

Connecting a plurality of features of the first different time scales at a feature level to obtain a plurality of first prediction results;

Inputting the intermediate frequency mode into the Transformer model and the LSTM model respectively to obtain features of a second different time scale;

Performing feature-level connection on the features of the second different time scales to obtain a second prediction result;

Inputting the low-frequency mode into the Transformer model and the LSTM model respectively to obtain features of a third different time scale;

Performing feature-level connection on the features of the third different time scales to obtain a third prediction result;

Using a random forest to perform regression prediction on the plurality of first prediction results, the second prediction result and the third prediction result to obtain the final prediction result.

6. The water quality prediction method based on double decomposition and hybrid model according to claim 1, characterized in that the water quality prediction method based on double decomposition and hybrid model further comprises:

The Parallel Transformer-LSTM model is evaluated by using evaluation indicators, including mean square error, root mean square error, mean absolute error and determination coefficient.

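7. The water quality prediction method based on double decomposition and hybrid model according to claim 6, characterized in that the calculation formula of the mean square error is:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2$$

The calculation formula of the root mean square error is:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2}$$

The calculation formula of the mean absolute error is:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$

The calculation formula of the determination coefficient is:

$$R^2 = 1 - \frac{\sum_{i=1}^{N}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{N}(y_i - \bar{y})^2}$$

wherein MSE is the mean square error, RMSE is the root mean square error, MAE is the mean absolute error, R² is the determination coefficient, $N$ is the number of samples; $y_i$ is the true value; $\hat{y}_i$ is the predicted value of the model; $\bar{y}$ is the mean of the true values.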

8. A computer device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the water quality prediction method based on the double decomposition and hybrid model as described in any one of claims 1 to 7.

9. A computer-readable storage medium having a computer program stored thereon, characterized in that when the computer program is executed by a processor, the water quality prediction method based on the double decomposition and hybrid model described in any one of claims 1 to 7 is implemented.

10. A computer program product, comprising a computer program, characterized in that when the computer program is executed by a processor, the water quality prediction method based on the double decomposition and hybrid model described in any one of claims 1 to 7 is implemented.