CN114936620B

CN114936620B - Bias correction method for sea surface temperature numerical forecast based on attention mechanism

Info

Publication number: CN114936620B
Application number: CN202210209290.5A
Authority: CN
Inventors: 汪祥; 朱俊星; 费童涵; 张卫民; 陈祥国; 王辉赞; 陈妍
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2022-03-04
Filing date: 2022-03-04
Publication date: 2024-05-03
Anticipated expiration: 2042-03-04
Also published as: CN114936620A

Abstract

The invention discloses a bias correction method for sea surface temperature (SST) numerical forecasts based on an attention mechanism, comprising the following basic steps: preprocessing the forecast data of the model to be corrected and constructing an input sequence; constructing an SST correction model; correcting the model data with the constructed model; and evaluating the correction accuracy of the model. The method considers the spatial distribution of the training data, adds feature factors such as salinity, and also accounts for the importance of historical information. The model effectively extracts the spatiotemporal dependencies in the sea temperature field data, thereby achieving high-precision SST correction.

Description

Bias correction method for sea surface temperature numerical forecast based on attention mechanism

Technical Field

The present invention belongs to the technical field of marine meteorological forecast correction, and in particular relates to a bias correction method for sea surface temperature numerical forecasts based on an attention mechanism.

Background

The ocean covers more than 70% of the Earth's surface and is closely tied to human activity. Sea surface temperature (SST) is an important physical quantity in global climate research, marine ecosystem research and ocean-related applications. Changes in sea temperature strongly affect navigation, marine disaster prevention and mitigation, and marine fisheries, so accurate observation and forecasting of SST is of great significance. Numerical model forecasting is a common method in ocean forecasting. However, numerical models cannot yet fully describe the physical processes in the ocean, their initial fields are uncertain, and the numerical solution process introduces unavoidable computational errors. The forecasts therefore still contain errors, and numerical forecast products need further correction. Forecast models based on machine learning are purely data-driven and have a clear advantage in capturing the nonlinear relationship between predictors and forecast targets. Even so, element forecasts based on deep learning alone struggle to meet the needs of current operational forecasting, while forecasts based on numerical models carry unavoidable errors. Neither the data-driven machine learning approach nor the theory-driven numerical model approach meets the accuracy requirements of ocean element forecasting on its own. It is therefore urgent to fuse the theory-driven numerical model approach with the data-driven deep learning approach. In particular, deep learning methods can be used to post-process numerical model output and correct the forecast errors of numerical forecast products, improving the forecast accuracy of the various elements.

Summary of the invention

In view of this, the present invention proposes a bias correction method for sea surface temperature numerical forecasts based on an attention mechanism. The method builds on ConvLSTM and CBAM and comprises data preprocessing, construction of the input sequence, construction of the correction model, correction of the numerical model data, and evaluation of the correction accuracy.

The bias correction method disclosed by the present invention comprises the following steps:

Step 1: Preprocess the model forecast data and the remote sensing satellite sea temperature data and construct the input sequence, including:

(1) Obtain the model forecast data and the remote sensing satellite sea temperature data used as reference values, and extract the marine environment data;

(2) Construct a time column, obtain the number of time features with a sliding window, and then add the ocean feature factors (sea temperature, salinity, and the U and V components of the current) to obtain the historical SST sequence used as model input;

(3) Standardize the historical SST sequence data obtained in the above steps;

Step 2: Construct the correction model, including:

(1) Spatial feature extraction: 3D convolution convolves a 3D kernel with the cube formed by stacking multiple consecutive matrices, extracting the spatial dependence features of the sea temperature data and the relationships among the environmental variables;

(2) Apply the 3D-CBAM attention mechanism to the data sequence produced by the 3D convolution, to improve the utilization of the spatial features of the 3D convolutional network and to show the importance of the different environmental variables to the result; the 3D-CBAM attention mechanism consists of two parts, a channel attention module and a spatial attention module;

Multiply the channel attention matrix Mc by the input feature matrix F to obtain F', the input of the spatial attention module;

Multiply the spatial attention matrix Ms by the module input F' to obtain the final feature matrix;

(3) Input the feature matrix sequence obtained above into the ConvLSTM part of the model; using the attention mechanism, add a custom attention layer after the ConvLSTM network, which uses the hidden state of the ConvLSTM model at every step and assigns a temporal attention weight to the hidden state of each time step;

Adjust the final ConvLSTM output to obtain the final correction result;

(4) Build and train the correction model obtained in the above steps, adjust the parameters repeatedly, and select the best parameters to obtain the sea surface temperature correction model;

Step 3: Use the correction model to correct the model forecast data;

(1) Input the standardized time-series forecast data into the sea surface temperature correction model obtained in Step 2 to obtain the corrected output;

(2) De-standardize the model output, using the inverse of the standardization method in Step 1, to obtain the corrected SST values;

Step 4: Evaluate the accuracy of the corrected results using the MAE, MAPE, MSE and RMSE metrics.

Furthermore, bilinear interpolation is used to unify the temporal and spatial resolutions of the forecast data and the remote sensing satellite observation data.

Furthermore, for the historical data sequence X, the data Xt formed by the SST and the other marine environmental variables at any time t is grid data of size W×H×C, and the input of the entire model is a five-dimensional tensor of shape B×T×C×W×H, where B is the number of training samples in a batch, T is the length of the sequence, W and H are the length and width of the SST field, and C is the number of marine environmental variables.

Furthermore, the standardization formula is as follows:

$$X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$

where $X_{\max}$ and $X_{\min}$ are the maximum and minimum values in the sequence data, respectively.
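As an illustration only, the standardization in Step 1 and the matching de-standardization in Step 3 can be sketched with NumPy, assuming the usual min-max form implied by the maximum and minimum values named above. The function names and the sample values are hypothetical, and whether the extremes are taken per variable or over the whole dataset is not stated in the source.

```python
import numpy as np

def minmax_fit(x):
    """Return (x_min, x_max) computed over the whole sequence."""
    return float(np.min(x)), float(np.max(x))

def minmax_normalize(x, x_min, x_max):
    """Scale x to [0, 1] with x' = (x - x_min) / (x_max - x_min)."""
    return (x - x_min) / (x_max - x_min)

def minmax_denormalize(x_norm, x_min, x_max):
    """Invert the scaling: x = x' * (x_max - x_min) + x_min."""
    return x_norm * (x_max - x_min) + x_min

# Made-up SST values in degrees Celsius, for illustration only.
sst = np.array([26.0, 27.5, 29.0, 28.0])
lo, hi = minmax_fit(sst)
sst_norm = minmax_normalize(sst, lo, hi)        # values in [0, 1]
sst_back = minmax_denormalize(sst_norm, lo, hi) # recovers the original values
```

Keeping `lo` and `hi` from the fitting stage is what lets the model output be mapped back to physical SST values later.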

Furthermore, the channel attention module is computed as follows:

$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{MaxPool3D}(F)) + \mathrm{MLP}(\mathrm{AvgPool3D}(F))\big)$$

where MLP denotes a multilayer perceptron, MaxPool3D denotes max pooling, AvgPool3D denotes average pooling, σ is the sigmoid function, and F is the input feature matrix;

The spatial attention matrix Ms of the spatial attention module is:

$$M_s(F) = \sigma\big(f^{3\times 3}([\mathrm{MaxPool3D}(F);\ \mathrm{AvgPool3D}(F)])\big)$$

where $f^{3\times 3}$ denotes a convolution with a 3×3 kernel.

Furthermore, the ConvLSTM network is computed as follows:

$$i_t = \sigma(w_{xi} * x_t + w_{hi} * h_{t-1} + w_{ci} \circ c_{t-1} + b_i)$$

$$f_t = \sigma(w_{xf} * x_t + w_{hf} * h_{t-1} + w_{cf} \circ c_{t-1} + b_f)$$

$$c_t = f_t \circ c_{t-1} + i_t \circ \tanh(w_{xc} * x_t + w_{hc} * h_{t-1} + b_c)$$

$$o_t = \sigma(w_{xo} * x_t + w_{ho} * h_{t-1} + w_{co} \circ c_t + b_o)$$

$$h_t = o_t \circ \tanh(c_t)$$

where $*$ denotes convolution and $\circ$ the element-wise product; $i_t$ is the input gate, $f_t$ the forget gate, $o_t$ the output gate, and $c_t$ the cell state. The $w$ terms are weight matrices: $w_{xi}$ is the weight matrix applied to $x_t$ in the input gate, $w_{xf}$ the one applied to $x_t$ in the forget gate, $w_{xc}$ the one applied to $x_t$ in the cell state computation, and $w_{xo}$ the one applied to $x_t$ in the output gate. The $b$ terms are biases: $b_i$ for the input gate, $b_f$ for the forget gate, $b_c$ for the cell state, and $b_o$ for the output gate. $x_t$ is the input matrix, $h_{t-1}$ is the hidden state at time $t-1$, and $c_{t-1}$ is the memory state at time $t-1$.

Furthermore, the attention layer takes the output $h_t$ of each ConvLSTM iteration as input; a Softmax operation then gives the weight $AT(h_t)$ of each output vector; finally, the attention weight $AT(h_t)$ is multiplied by the hidden state $h_t$ to obtain the final correction result $Y_t$, calculated as follows:

where W is the weight matrix.

Furthermore, the mean square error is used as the loss function, computed as:

$$\mathrm{LOSS} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2$$

where n is the number of grid points in the gridded data, $\hat{y}_i$ is the true value at grid point i, and $y_i$ is the network-corrected value. After the model is built, set the input dimensions of the model and the time step of the input data; set the optimizer and learning rate; set the number of hidden-layer neurons; set the number of training iterations; adjust the parameters repeatedly, use the model loss to check convergence, and select the best-converging parameters to form the final sea temperature correction model.

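Furthermore, the mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) are calculated as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$

where $y_i$ is the true value, $\hat{y}_i$ is the estimated value, and n is the number of samples.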

The beneficial effects of the present invention are as follows:

Mining the variation patterns of sea surface temperature data from the spatiotemporal sequence gives good correction results with high correction accuracy;

The CBAM module of the present invention is small, so it adds few parameters, shortens model training time, and still performs well.

Brief description of the drawings

Figure 1 is a flow chart of the present invention;

Figure 2 is a framework diagram of the correction model;

Figure 3 shows the correction results of the correction model for different time steps;

Figure 4 shows the correction results of the correction model for different learning rates;

Figure 5 shows the correction results of the correction model for different numbers of training epochs.

Detailed description

The present invention is further described below with reference to the accompanying drawings, without limiting it in any way. Any change or substitution made on the basis of the teachings of the present invention falls within its scope of protection.

The present invention is a sea surface temperature correction method with an attention mechanism based on ConvLSTM and CBAM, comprising data preprocessing, construction of the input sequence, construction of the correction model, correction of the numerical model data, and evaluation of the correction accuracy. The flow chart is shown in Figure 1.

Step 1 specific steps are:

1. Preprocess the model forecast data and the remote sensing satellite sea temperature data used as labels, and construct the input sequence. Bilinear interpolation is used to unify the spatial and temporal resolution of the forecast data and the satellite observation data to 0.25°×0.25°, daily.

2. For the historical data sequence X, the data Xt formed by the SST and the other marine environmental variables at any time t is grid data of size W×H×C, so the input of the entire model is a five-dimensional tensor of shape B×T×C×W×H. Here B is the number of training samples in a batch and T is the length of the sequence. W and H are the length and width of the SST field, and C is the number of marine environmental variables. In the experiments, the two spatial dimensions W and H correspond to longitude and latitude. The number of time features is obtained with a sliding window: for example, if the historical data of the past 3 days is used to correct the SST of the current day, the length of the time series is 4, which is the value of T. In the experiments, three factors are added besides sea temperature, namely salinity and the U and V components of the current, so C is 4. The result is a 5-dimensional tensor used as the input sequence of the model. U is the eastward component and V the northward component of the current; they are also referred to as the zonal and meridional velocity.

3. Standardize the resulting input data sequence and use the standardized sequence as the input of the CBAM-ConvLSTM model, with input shape B×T×C×W×H.
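The sliding-window construction described in item 2 can be sketched as follows. This is a minimal illustration on synthetic data, not the patent's preprocessing code: the function name `build_sequences`, the grid size and the batch size are assumptions, while T = 4 for 3 days of history and C = 4 follow the text.

```python
import numpy as np

def build_sequences(fields, history_days=3):
    """Turn a daily series of shape (N, C, W, H) into samples of shape
    (N - history_days, T, C, W, H), where T = history_days + 1
    (the past days plus the day being corrected)."""
    t_len = history_days + 1
    n_samples = fields.shape[0] - history_days
    windows = [fields[i:i + t_len] for i in range(n_samples)]
    return np.stack(windows, axis=0)

# Synthetic stand-in for 10 days of gridded data with
# C = 4 variables (SST, salinity, U, V) on a 16 x 16 grid.
rng = np.random.default_rng(0)
daily = rng.random((10, 4, 16, 16))

samples = build_sequences(daily, history_days=3)  # (7, 4, 4, 16, 16)
batch = samples[:2]                               # one batch with B = 2
```

Each sample overlaps the next by T - 1 days, which is what a stride-1 sliding window produces.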

Step 2 specific steps are:

Build the correction model CBAM-ConvLSTM. The framework of the CBAM-ConvLSTM method is shown in Figure 2.

1. Spatial feature extraction. In this step, 3D convolution is used to extract the spatial dependence features of the sea temperature data and the relationships among the environmental variables. 3D convolution extends 2D convolution: a 3D kernel is convolved with the cube formed by stacking multiple consecutive matrices.

2. Apply the 3D-CBAM attention mechanism to the data sequence produced by the 3D convolution, to improve the utilization of the spatial features of the 3D convolutional network and to show the importance of the different environmental variables to the result.

Channel attention module: the input feature matrix F (B×T×H×W×C) is passed through global max pooling and global average pooling over width and height, giving two B×T×1×1×C feature maps. Each is fed into a shared two-layer neural network (MLP): the first layer has C/r neurons (r is the reduction ratio) with ReLU activation, and the second layer has C neurons. The two MLP outputs are summed and passed through a sigmoid activation to produce the channel attention matrix Mc. Finally, Mc is multiplied by the input feature matrix F to produce F', the input feature required by the spatial attention module.

Spatial attention module: the feature matrix F' output by the channel attention module is the input of this module. Channel-wise global max pooling and global average pooling give two B×T×H×W×1 feature matrices, which are concatenated along the channel dimension. A convolution then reduces the result to a single channel, and a sigmoid activation generates the spatial attention matrix Ms. Finally, Ms is multiplied by the module input F' to obtain the final feature matrix X.
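The two attention modules can be sketched in NumPy as follows, using the B×T×H×W×C layout given above. This is a rough sketch under stated assumptions, not the patent's implementation: the weights are random rather than trained, pooling is applied per time step as the text describes, and the spatial convolution is taken to be a 3×3 kernel applied to each time step (the source mentions both 3×3 and 3×3×3 kernels). All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(f, w1, w2):
    """f: (B, T, H, W, C); w1: (C, C // r); w2: (C // r, C). Returns Mc."""
    max_pool = f.max(axis=(2, 3), keepdims=True)    # (B, T, 1, 1, C)
    avg_pool = f.mean(axis=(2, 3), keepdims=True)   # (B, T, 1, 1, C)

    def mlp(v):                                     # shared two-layer MLP
        return np.maximum(v @ w1, 0.0) @ w2         # ReLU between the layers

    return sigmoid(mlp(max_pool) + mlp(avg_pool))

def conv3x3_same(x, kernel):
    """x: (B, T, H, W, 2); kernel: (3, 3, 2). Zero-padded, one output channel."""
    b, t, h, w, _ = x.shape
    padded = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1), (0, 0)))
    out = np.zeros((b, t, h, w, 1))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, :, dy:dy + h, dx:dx + w, :]
            out[..., 0] += np.sum(patch * kernel[dy, dx], axis=-1)
    return out

def spatial_attention(f_prime, kernel):
    """f_prime: (B, T, H, W, C). Returns Ms with shape (B, T, H, W, 1)."""
    max_map = f_prime.max(axis=-1, keepdims=True)
    avg_map = f_prime.mean(axis=-1, keepdims=True)
    stacked = np.concatenate([max_map, avg_map], axis=-1)  # channel concat
    return sigmoid(conv3x3_same(stacked, kernel))

def cbam(f, w1, w2, kernel):
    f_prime = channel_attention(f, w1, w2) * f              # F' = Mc * F
    return spatial_attention(f_prime, kernel) * f_prime     # X = Ms * F'

# Synthetic input and random (untrained) weights, for shape checking only.
rng = np.random.default_rng(0)
B, T, H, W, C, r = 2, 4, 8, 8, 4, 2
f = rng.standard_normal((B, T, H, W, C))
w1 = rng.standard_normal((C, C // r)) * 0.1
w2 = rng.standard_normal((C // r, C)) * 0.1
kernel = rng.standard_normal((3, 3, 2)) * 0.1
mc = channel_attention(f, w1, w2)
x = cbam(f, w1, w2, kernel)
```

Because both attention matrices pass through a sigmoid, they only rescale features between 0 and 1; the output keeps the shape of the input.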

3. Input the X obtained above into the ConvLSTM part of the model. Because the sea temperature correction uses a one-day interval, the hidden state at the previous time step is $h_{t-1}$ and the memory state at the previous time step is $c_{t-1}$. ConvLSTM has a forget gate $f_t$, an input gate $i_t$ and an output gate $o_t$. At the current time t, the forget gate $f_t$ controls how much of the previous cell state $c_{t-1}$ is kept in the current cell state $c_t$; the input gate $i_t$ controls how much of the candidate state at the current time enters the current cell state $c_t$; the output gate $o_t$ controls how much of the current cell state $c_t$ is passed to the hidden-layer output $h_t$ at the current time. The gates are computed as:

$$i_t = \sigma(w_{xi} * x_t + w_{hi} * h_{t-1} + w_{ci} \circ c_{t-1} + b_i)$$

$$f_t = \sigma(w_{xf} * x_t + w_{hf} * h_{t-1} + w_{cf} \circ c_{t-1} + b_f)$$

$$o_t = \sigma(w_{xo} * x_t + w_{ho} * h_{t-1} + w_{co} \circ c_t + b_o)$$

where $*$ denotes convolution and $\circ$ the element-wise product; $w_{xi}$, $w_{xf}$, $w_{xo}$ are the weight matrices of the input gate, forget gate and output gate, respectively; $b_i$, $b_f$, $b_o$ are the bias terms of the input gate, forget gate and output gate, respectively; and σ is the sigmoid function.

The current cell state $c_t$ is determined jointly by the forget gate $f_t$, the previous cell state $c_{t-1}$, the input gate $i_t$ and the candidate cell state computed from the current input:

$$c_t = f_t \circ c_{t-1} + i_t \circ \tanh(w_{xc} * x_t + w_{hc} * h_{t-1} + b_c)$$

where $w_{xc}$ is the weight matrix of the candidate cell state, $b_c$ is its bias term, and tanh is the hyperbolic tangent function.

The hidden-layer output $h_t$ of ConvLSTM at the current time is determined jointly by the output gate $o_t$ and the current cell state $c_t$:

$$h_t = o_t \circ \tanh(c_t)$$
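To make the gate equations concrete, here is a small NumPy sketch of one ConvLSTM step applied over a short sequence. It is an illustration under assumptions, not the patent's PyTorch code: the weights are random, the hidden size is 8 rather than the 32 used in the experiment, and `conv2d_same`, `init_params` and `convlstm_step` are hypothetical helper names.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv2d_same(x, w):
    """x: (C_in, H, W); w: (C_out, C_in, k, k), k odd. Zero padding, stride 1."""
    c_out, _, k, _ = w.shape
    pad = k // 2
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, h, wd))
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + h, dx:dx + wd]
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out

def init_params(c_in, hidden, height, width, k=3, seed=0):
    """Random weights: w_x*, w_h* are conv kernels, w_c* are peephole weights."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in "ifco":
        p[f"w_x{gate}"] = rng.standard_normal((hidden, c_in, k, k)) * 0.1
        p[f"w_h{gate}"] = rng.standard_normal((hidden, hidden, k, k)) * 0.1
        p[f"b_{gate}"] = np.zeros((hidden, 1, 1))
    for gate in "ifo":
        p[f"w_c{gate}"] = rng.standard_normal((hidden, height, width)) * 0.1
    return p

def convlstm_step(x_t, h_prev, c_prev, p):
    conv = conv2d_same
    i_t = sigmoid(conv(x_t, p["w_xi"]) + conv(h_prev, p["w_hi"])
                  + p["w_ci"] * c_prev + p["b_i"])
    f_t = sigmoid(conv(x_t, p["w_xf"]) + conv(h_prev, p["w_hf"])
                  + p["w_cf"] * c_prev + p["b_f"])
    c_t = f_t * c_prev + i_t * np.tanh(
        conv(x_t, p["w_xc"]) + conv(h_prev, p["w_hc"]) + p["b_c"])
    o_t = sigmoid(conv(x_t, p["w_xo"]) + conv(h_prev, p["w_ho"])
                  + p["w_co"] * c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Run T = 4 steps on synthetic input with C = 4 channels on a 6 x 6 grid.
c_in, hidden, height, width, steps = 4, 8, 6, 6, 4
params = init_params(c_in, hidden, height, width)
rng = np.random.default_rng(1)
xs = rng.standard_normal((steps, c_in, height, width))
h = np.zeros((hidden, height, width))
c = np.zeros((hidden, height, width))
hidden_states = []
for x_t in xs:
    h, c = convlstm_step(x_t, h, c, params)
    hidden_states.append(h)
hidden_states = np.stack(hidden_states)  # (T, hidden, H, W)
```

The stacked hidden states are what a temporal attention layer placed after the ConvLSTM would consume.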

At the same time, this stage uses the attention mechanism: a custom attention layer is added after the ConvLSTM network so that the hidden state of the ConvLSTM model at every step is used, and a temporal attention weight is assigned to the hidden state of each time step. The attention layer takes the output $h_t$ of each ConvLSTM iteration as input; a Softmax operation then gives the weight $AT(h_t)$ of each output vector; finally, the attention weight $AT(h_t)$ is multiplied by the hidden state $h_t$ to obtain the final correction result $Y_t$, calculated as follows:

where W is the weight matrix.
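One plausible reading of this temporal attention layer is sketched below. The scoring function is an assumption: each flattened hidden state is scored with a single weight vector `w`, the scores are normalized over time with Softmax, and the weighted hidden states are summed. The function name and the final sum over time are hypothetical choices and may differ from the patent's formula.

```python
import numpy as np

def temporal_attention(hidden_states, w):
    """hidden_states: (T, D) flattened hidden states; w: (D,) scoring weights.
    Returns the attention weights over time and the weighted combination."""
    scores = hidden_states @ w                   # one score per time step
    scores = scores - scores.max()               # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()   # Softmax over time
    weighted = weights[:, None] * hidden_states       # AT(h_t) * h_t per step
    return weights, weighted.sum(axis=0)

# Synthetic hidden states: T = 4 steps, each an 8 x 6 x 6 map flattened to D = 288.
rng = np.random.default_rng(0)
T, D = 4, 8 * 6 * 6
hs = rng.standard_normal((T, D))
w = rng.standard_normal(D) * 0.05
weights, context = temporal_attention(hs, w)
```

The weights sum to one, so they can be read as the relative contribution of each historical day to the corrected field.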

The entire SST forecast correction model can be expressed as:

The present invention uses the mean square error (MSE) as the loss function (LOSS), computed as:

$$\mathrm{LOSS} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2$$

where n is the number of grid points in the gridded data, $\hat{y}_i$ is the true value at grid point i, and $y_i$ is the network-corrected value. Preferably, after the model is built, set the input dimensions of the model and the time step of the input data; set the optimizer and learning rate; set the number of hidden-layer neurons; set the number of training iterations; adjust the parameters repeatedly, use the model loss to check convergence, and select the best-converging parameters to form the final sea temperature correction model.

Step 3 specific steps are:

1. Input the standardized time-series forecast data into the sea surface temperature correction model obtained in Step 2 to obtain the corrected output. The historical data is a time series used mainly to train the model, while the current-day data is the data at the current time that is to be corrected. The current-day data is usually included in the historical data, but it can also be corrected when it is not.

2. De-standardize the model output, using the inverse of the standardization method in Step 1, to obtain the corrected SST values at a resolution of 0.25°×0.25°, daily.

Step 4: To verify the effectiveness of the CBAM-ConvLSTM model, four metrics are used to evaluate it: mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). MAE reflects the overall error of the estimate, while RMSE is more sensitive to large errors and extreme values; for both, smaller values indicate better results. The metrics are calculated as follows:

where $y_i$ is the true observed value, $\bar{y}$ is the mean of the true observed values, and $\hat{y}_i$ is the predicted value.
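The four metrics have standard definitions, which can be written in NumPy as below. The function names and the sample values are illustrative only and are not taken from the experiments.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error."""
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(mse(y_true, y_pred)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Undefined if y_true has zeros."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Made-up observed and corrected SST values in degrees Celsius.
y_true = np.array([28.0, 29.0, 30.0, 27.0])
y_pred = np.array([28.5, 28.5, 30.0, 28.0])

scores = {
    "MSE": mse(y_true, y_pred),    # 0.375
    "RMSE": rmse(y_true, y_pred),
    "MAE": mae(y_true, y_pred),    # 0.5
    "MAPE": mape(y_true, y_pred),
}
```

Note that MAPE depends on the temperature unit, since it divides by the observed value; the same errors expressed in kelvin would give a much smaller percentage.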

The following example illustrates the effect of the SST correction method based on 3DCBAM-ConvLSTM. The experiment uses the forecast data of the global 1/12.5° Hybrid Coordinate Ocean Model (HYCOM) released by the US National Oceanic and Atmospheric Administration (NOAA) as the forecast data to be corrected, and the NOAA 1/4° Daily OI SST Analysis satellite remote sensing observations as the ground truth for evaluating the correction accuracy. The experiment takes the sea area spanning 110°E-114°E and 8°N-12°N in the South China Sea, at 0.25°×0.25° resolution, as an example. The historical SST data within this area is extracted for the period 2019-01-01 to 2019-12-25, a total of 360 days.

HYCOM provides forecasts every three hours. The daily mean temperature is computed from the HYCOM data, and bilinear interpolation is used to interpolate the HYCOM forecast data onto the grid points of the OI SST Analysis, unifying the spatiotemporal resolution of the forecast data and the satellite observation data. Because the values of the different variables differ considerably in magnitude, the data is normalized. Normalization speeds up model convergence, improves model accuracy, and helps prevent exploding gradients. A time column is generated from the data timestamps, and all processed SST data is split: 75% is used as the training set to train the parameters of the CBAM-ConvLSTM model, and the remaining 25% is used as the validation set to verify what the model has learned.
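The preprocessing described here (daily averaging of three-hourly fields, bilinear interpolation onto the 0.25° grid, and a 75/25 split) can be sketched as follows. This is a simplified illustration on synthetic data: the function names are hypothetical, the grids are assumed regular and ascending, and the split is assumed to be chronological, which the source does not state.

```python
import numpy as np

def daily_mean(three_hourly):
    """(N_days * 8, H, W) -> (N_days, H, W): average each block of 8 time steps."""
    n = three_hourly.shape[0] // 8
    blocks = three_hourly[: n * 8].reshape(n, 8, *three_hourly.shape[1:])
    return blocks.mean(axis=1)

def bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Interpolate field (len(src_lat), len(src_lon)) onto the destination grid.
    Source axes must be ascending and cover the destination points."""
    def locate(src, dst):
        idx = np.clip(np.searchsorted(src, dst) - 1, 0, len(src) - 2)
        frac = (dst - src[idx]) / (src[idx + 1] - src[idx])
        return idx, frac

    iy, fy = locate(src_lat, dst_lat)
    ix, fx = locate(src_lon, dst_lon)
    # Interpolate along longitude on the two bracketing latitude rows.
    lower = field[np.ix_(iy, ix)] * (1 - fx) + field[np.ix_(iy, ix + 1)] * fx
    upper = field[np.ix_(iy + 1, ix)] * (1 - fx) + field[np.ix_(iy + 1, ix + 1)] * fx
    # Then interpolate along latitude.
    return lower * (1 - fy)[:, None] + upper * fy[:, None]

# Synthetic fine grid (0.08 degree spacing) and target 0.25 degree grid.
src_lat = np.linspace(8.0, 12.0, 51)
src_lon = np.linspace(110.0, 114.0, 51)
dst_lat = np.arange(8.0, 12.0 + 1e-9, 0.25)
dst_lon = np.arange(110.0, 114.0 + 1e-9, 0.25)

# Two days of synthetic three-hourly fields: a linear field plus a time offset.
rng = np.random.default_rng(0)
base = 2.0 * src_lat[:, None] + 3.0 * src_lon[None, :]
three_hourly = base[None] + rng.standard_normal((16, 1, 1))

daily = daily_mean(three_hourly)                       # (2, 51, 51)
regridded = np.stack([
    bilinear_regrid(day, src_lat, src_lon, dst_lat, dst_lon) for day in daily
])                                                     # (2, 17, 17)

# Chronological 75/25 split of a 360-day index (assumed, see lead-in).
days = np.arange(360)
n_train = int(0.75 * len(days))
train_days, val_days = days[:n_train], days[n_train:]
```

A linear test field is used because bilinear interpolation reproduces it exactly, which makes the regridding easy to check.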

The model is built with PyTorch. The training data and input data are reshaped and converted into the tensor format required by the PyTorch framework. The parameters of the CBAM-ConvLSTM model are then defined, including the input step (the input sequence length), the number of hidden layers, the output sequence length, and the number of neurons in each layer. In this experiment, the convolution part of the model consists of a Conv3D layer and a BN (batch normalization) layer. The BN layer keeps the distribution of the inputs to each layer relatively stable, speeds up learning, alleviates the vanishing gradient problem, and has some regularization effect. The kernel size of the Conv3D layer is 3×3×3. The kernel size of the convolution in the attention of the CBAM part is 3×3×3. For the ConvLSTM part, because the sequences used in the experiment are short, a single ConvLSTM layer is used, with 32 hidden-layer neurons and 1 output-layer neuron. After the model parameters are defined, the loss function and optimizer are defined: the MSE loss function and the Adam optimizer are selected, and the model is trained for a suitable number of epochs. After training, the test data is fed to the model, and the model output is de-normalized to obtain the bias-corrected sea temperature values. The correction effect is checked by comparing the metrics before and after bias correction.

To demonstrate the effectiveness of the proposed CBAM-ConvLSTM hybrid model, the experimental results are compared with two traditional machine learning algorithms used in sea surface temperature correction: linear regression and the support vector machine (SVM). Linear regression is robust to interference and fast to train, but it cannot model nonlinear relationships, its accuracy is limited, and it tends to underfit. The SVM generalizes well, is not prone to overfitting, and can perform well with little data, but it is sensitive to missing data, parameter choices, and the kernel function. To match the input form of these algorithms, SST and the other ocean variables have generally been treated as independent features, so the spatiotemporal relationships between variables could not be taken into account. Here, both algorithms are implemented by first flattening every sample into a form the algorithm can handle and then calling the sklearn machine learning package for prediction. Because few methods have previously been used to correct two-dimensional sea temperature fields, a series of deep learning models is also compared with CBAM-ConvLSTM: an LSTM model that considers only temporal relationships and ignores spatial ones; ConvLSTM, an improved variant that incorporates convolution; a ConvLSTM with only a 3D CNN added; a ConvLSTM with only the temporal attention mechanism added (ConvLSTM-AT); and a hybrid ConvLSTM with both the CNN and the temporal attention added (3DCNN-ConvLSTM-AT). The learning rate is set to LR = 0.01, the number of training epochs to Epoch = 300, and 3 days of historical data are used for SST correction; the parameter settings are discussed in detail below. The models are evaluated using RMSE, MSE, MAE and MAPE. The correction results of the different methods are shown in Table 1.
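The flattening step used for the traditional baselines can be sketched as follows. This is a minimal illustration on synthetic data, not the experiment itself: the array shapes, the helper name `flatten_samples`, and the use of NumPy least squares in place of the sklearn regressors named in the text are all assumptions made to keep the example dependency-light.

```python
import numpy as np

def flatten_samples(x):
    """Flatten (N, T, C, H, W) samples into (N, T*C*H*W) feature rows."""
    return x.reshape(x.shape[0], -1)

rng = np.random.default_rng(0)
# Hypothetical toy data: 200 samples, 3 time steps, 2 variables, 2x3 grid.
x = rng.normal(size=(200, 3, 2, 2, 3))
# Synthetic target field that is linear in the inputs, so the fit can be checked.
y = x.mean(axis=(1, 2)) + 0.5                 # shape (200, 2, 3)

X = flatten_samples(x)                        # (200, 36): grid structure is lost here
Y = y.reshape(len(y), -1)                     # (200, 6)

# Ordinary least squares with an intercept column (stand-in for a library regressor).
X1 = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(X1, Y, rcond=None)
pred = (X1 @ coef).reshape(y.shape)
```

After flattening, each grid point of each variable becomes an unrelated column, which is why these baselines cannot exploit neighbourhood structure in space or time.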

Table 1. Comparison of the results of the various correction methods

Table 1 shows that, among the traditional machine learning methods, SVR is more accurate than linear regression. Among the other deep learning models, 3DCNN-ConvLSTM-AT gives the best results. The proposed CBAM-ConvLSTM model, however, reaches an MSE of 0.3520, an MAE of 0.2641 and a MAPE of 0.9546% in the correction experiment, outperforming all the other models.

Table 1 also shows that LSTM outperforms the traditional machine learning methods, which indicates the importance of temporal correlation for SST correction. ConvLSTM, an existing improvement on LSTM, outperforms LSTM, which confirms the importance of spatial correlation. ConvLSTM-AT, which adds the attention mechanism, performs better still, because it assigns different weights to express the different influence that each historical SST field has on the SST to be forecast. 3DCNN-ConvLSTM-AT and ConvLSTM-AT differ only in that the former adds a convolutional layer in front of the ConvLSTM. With identical parameters and data sets, 3DCNN-ConvLSTM-AT achieves a higher correction accuracy than ConvLSTM-AT, so the added convolutional layer helps improve SST forecast accuracy. The main reason is that the local features extracted from the SST data by ConvLSTM's own convolution are not distinct enough; an additional convolutional layer in front of the model strengthens feature extraction and makes the spatial features of the data more pronounced inside the ConvLSTM.

The CBAM-ConvLSTM model adds the CBAM attention mechanism to the 3DCNN-ConvLSTM-AT model; its RMSE is 0.35, the best correction result. CBAM-ConvLSTM further extracts spatial features and assigns weights to the environmental information and spatial features, which improves information utilization, brings the model closer to reality with more comprehensive information, and ultimately improves SST forecast accuracy. In summary, compared with traditional machine learning correction methods such as LR and SVR and with deep learning methods such as LSTM, ConvLSTM and ConvLSTM-AT, CBAM-ConvLSTM gives the best SST correction performance, which verifies the effectiveness of the method.

The time step is an important parameter in how the model learns temporal information. Here it denotes how far back the input data reaches when the model is trained; timestep = 1, for example, means that the previous day's historical data is used to correct the SST. In general, a longer time step makes more information available for correction, but it also increases error accumulation. SST is therefore corrected with timestep = 1, 3, 5, 7, 10 and 15, and indicators such as RMSE and MAPE are compared to choose a suitable value. Figure 3 compares the evaluation indicators of CBAM-ConvLSTM under the different time steps. Comparing RMSE across time steps, timestep = 3 gives better results than timestep = 1, 5, 7, 10 or 15; beyond timestep = 10 the results level off and the temporal information has less influence on the correction. The amount of spatial and temporal information supplied to the model should therefore be moderate, since too much or too little degrades performance. Accordingly, timestep = 3 is used to correct the SST.
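The effect of the time step on the training samples can be illustrated with a sliding-window sketch. It assumes a daily series stored as an array of shape (N, C, H, W) and assumes that sample k stacks the `timestep` days preceding the day to be corrected; the function name `make_windows` and the toy shapes are hypothetical.

```python
import numpy as np

def make_windows(series, timestep):
    """Build model inputs from a daily series of shape (N, C, H, W).

    Sample k stacks days k .. k+timestep-1; the day to correct is k+timestep.
    Returns inputs of shape (N - timestep, timestep, C, H, W) and the
    indices of the corresponding target days.
    """
    n = series.shape[0]
    if timestep < 1 or timestep >= n:
        raise ValueError("timestep must be between 1 and N-1")
    inputs = np.stack([series[k:k + timestep] for k in range(n - timestep)])
    target_days = np.arange(timestep, n)
    return inputs, target_days

# Toy series: 10 days, 4 variables, 2x2 grid.
series = np.arange(10 * 4 * 2 * 2, dtype=float).reshape(10, 4, 2, 2)
x3, days3 = make_windows(series, timestep=3)   # 7 samples, each 3 days long
```

A longer time step gives each sample more history but leaves fewer samples from the same series, which is one practical reason to compare several values experimentally.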

To determine the best number of training epochs for the data set, experiments were run with different epoch settings. As Figure 4 shows, the RMSE stabilizes at about 300 epochs. Balancing accuracy and performance, 300 training epochs are therefore used in the experiments.

The learning rate is an important hyperparameter: it determines whether, and how quickly, the objective function converges to a local minimum. A suitable learning rate lets the objective function converge to a local minimum in a reasonable time. At the start of training with an adaptive optimizer such as Adam, the model was found not to converge, so the learning rate (lr) and the other hyperparameters were tuned with the model architecture held fixed. The learning rate was first reduced from 0.1 to 0.001 in factor-of-10 steps; at the $10^{-2}$ level the training and validation losses decreased steadily. Further experiments varied the learning rate, and Figure 5 shows how indicators such as RMSE and MAPE change with lr. By RMSE the optimal learning rate is 0.01; by MAPE it is 0.004. For this data set the best learning rate therefore lies between $10^{-3}$ and $10^{-2}$.

Complexity and training time analysis. The experimental environment is Windows 10 with an Intel Core i5 11 processor at 2.4 GHz and 16 GB of RAM; the algorithms are implemented in Python 3.

Table 2. Number of network parameters and training time for each model

| Model | Parameters | Train (s) | Test (s) |
| --- | --- | --- | --- |
| LSTM | 13601 | 271 | 1 |
| ConvLSTM | 44993 | 236 | 1 |
| ConvLSTM-AT | 46079 | 437 | 1 |
| 3DCNN-ConvLSTM-AT | 13197 | 272 | 1 |
| CBAM-ConvLSTM | 13560 | 223 | 1 |

Table 2 lists the training time and the number of parameters of each model used in the experiments. The CBAM-ConvLSTM model has roughly 3.3 times fewer trainable parameters than ConvLSTM (13,560 versus 44,993), which makes training faster and better suited to practical use. The parameter count of 3DCNN-ConvLSTM-AT (13,197) is close to that of CBAM-ConvLSTM, indicating that the CBAM module is small; adding it also shortens training time (223 s versus 272 s). The proposed CBAM-ConvLSTM model has the shortest training time and a small parameter count, while still delivering good performance.

As used herein, the word "preferred" means serving as an example, instance, or illustration. Any aspect or design described herein as "preferred" is not necessarily to be construed as advantageous over other aspects or designs. Rather, use of the word "preferred" is intended to present concepts in a concrete fashion. The term "or" as used in this application is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless specified otherwise or clear from context, "X uses A or B" means any of the natural inclusive permutations. That is, if X uses A; X uses B; or X uses both A and B, then "X uses A or B" is satisfied under any of the foregoing instances.

Moreover, although the present disclosure has been shown and described with respect to one or more implementations, equivalent alterations and modifications will occur to those skilled in the art upon reading and understanding this specification and the accompanying drawings. The present disclosure includes all such modifications and alterations and is limited only by the scope of the appended claims. In particular, with regard to the various functions performed by the above-described components (e.g., elements), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component that performs the specified function of the described component (i.e., that is functionally equivalent), even if it is not structurally equivalent to the disclosed structure that performs the function in the exemplary implementations of the present disclosure shown herein. In addition, although a particular feature of the present disclosure may have been disclosed with respect to only one of several implementations, such a feature may be combined with one or more other features of the other implementations as may be desired and advantageous for any given or particular application. Furthermore, to the extent that the terms "includes", "has", "contains" or variants thereof are used in the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising".

The functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk or an optical disc. Each of the above devices or systems may execute the storage method in the corresponding method embodiment.

In summary, the above embodiment is one implementation of the present invention, but implementations of the present invention are not limited to it; any other change, modification, substitution, combination or simplification that does not depart from the spirit and principles of the present invention is an equivalent replacement and falls within the protection scope of the present invention.

Claims

1. A method for correcting the deviation of sea surface temperature numerical forecast based on attention mechanism, characterized in that it comprises the following steps:

Step 1: Preprocess the model forecast data and remote sensing satellite sea temperature data to construct the input sequence, including:

(1) Obtain model forecast data and remote sensing satellite sea temperature data as reference values, and extract marine environment data;

(2) Construct a time column, obtain the number of time features through a sliding window, and then add ocean feature influencing factors, including sea temperature, salinity, and the U and V vectors of water currents, to obtain the input historical SST sequence of the model;

(3) Standardize the historical SST series data obtained in the above steps;

Step 2: Construct the correction model, including:

(1) Spatial feature extraction: Using 3D convolution, the 3D kernel is convolved with a cube formed by superimposing multiple continuous matrices to extract the spatial dependency characteristics of SST data and the relationship between multiple environmental variables;

(2) Using the 3D-CBAM attention mechanism on the data sequence after 3D convolution to improve the utilization rate of the spatial features of the 3D convolution network and show the importance of different environmental variables to the results; the 3D-CBAM attention mechanism consists of two parts: a channel attention module and a spatial attention module;

Multiply the channel attention map Mc by the input feature matrix F to obtain F’, the input of the spatial attention module;

Multiply the spatial attention map Ms by the module input F’ to obtain the final generated feature matrix;

(3) Input the feature matrix data sequence obtained in the above steps into the ConvLSTM part of the model; use the Attention mechanism to add a custom attention layer after the ConvLSTM network, and use the hidden layer state of each step of the ConvLSTM model to assign the temporal attention weight to the hidden layer state of each time step;

Adjust the final ConvLSTM output to get the final corrected result;

The attention layer takes the output $h_t$ of each iteration of ConvLSTM as input; the weight $AT(h_t)$ of each output vector is then obtained through a Softmax operation; finally, the attention weight $AT(h_t)$ is multiplied by the hidden layer state $h_t$ to obtain the final correction result $Y_t$, calculated as follows:

$$AT(h_t) = \mathrm{Softmax}(W h_t), \qquad Y_t = AT(h_t) \cdot h_t$$

where W is the weight matrix;

(4) Building and training the correction model obtained in the above steps, continuously adjusting the parameters, and selecting the best parameters to obtain the sea surface temperature correction model;

Step 3: Use the correction model to correct the model forecast data;

(1) Inputting the standardized time series forecast data into the sea surface temperature correction model obtained in step 2 to obtain the corrected output result;

The mean square error is used as the loss function, calculated as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$

where n is the number of grid points in the grid data, $\hat{y}_i$ is the true value at grid point i, and $y_i$ is the network-corrected value; after building the model, set the model input dimension and the time step of the input data; set the model optimizer and learning rate; set the number of hidden nodes; set the number of model iterations; continuously adjust the parameters, check the degree of convergence using the model loss, select the parameters that give the best convergence, and form the final SST correction model;

(2) De-standardize the model output results. The de-standardization method corresponds to the standardization method in step 1, and the corrected SST value is obtained;

Step 4: Use the mean absolute error (MAE), mean absolute percentage error (MAPE), mean square error (MSE) and root mean square error (RMSE) evaluation indicators to evaluate the accuracy of the results after model correction.
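The temporal attention step of claim 1 can be sketched in NumPy as follows. This is a minimal illustration under assumptions not stated in the claim: each hidden state is flattened to a vector, the weight matrix W is taken to be a single scoring vector `w`, and the weighted states are summed over time to give one output. The function name `temporal_attention` is hypothetical.

```python
import numpy as np

def temporal_attention(h, w):
    """Weight hidden states over time with a softmax.

    h: hidden states of shape (T, D), one flattened state per time step.
    w: scoring vector of shape (D,) (assumed form of the weight matrix W).
    Returns the attention weights (T,) and the weighted sum of states (D,).
    """
    scores = h @ w                          # one scalar score per time step
    scores = scores - scores.max()          # stabilise the exponentials
    weights = np.exp(scores) / np.exp(scores).sum()
    context = (weights[:, None] * h).sum(axis=0)
    return weights, context

rng = np.random.default_rng(0)
h = rng.normal(size=(3, 8))                 # 3 time steps, 8 hidden features
weights, context = temporal_attention(h, rng.normal(size=8))
```

With an all-zero scoring vector every time step receives the same weight, so the output reduces to the plain average of the hidden states; training moves the weights away from that uniform baseline.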

2. The method for correcting the deviation of sea surface temperature numerical forecast based on the attention mechanism according to claim 1, characterized in that bilinear interpolation is used to unify the temporal and spatial resolutions of the forecast data and the remote sensing satellite observation data.

3. The method for correcting the deviation of sea surface temperature numerical forecast based on the attention mechanism according to claim 1, characterized in that, for the historical data sequence X, the SST data and other marine environmental variables $X_t$ at any time t are grid data of size W×H×C, and the input of the entire model is a five-dimensional tensor of shape B×T×C×W×H, where B is the number of training samples in a batch, T is the length of the sequence data, W and H are the length and width of the SST field, and C is the number of marine environmental variables added.

4. The method for correcting the deviation of sea surface temperature numerical forecast based on the attention mechanism according to claim 1, characterized in that the standardization is calculated as follows:

$$X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$

where $X_{\max}$ and $X_{\min}$ are the maximum and minimum values in the sequence data, respectively.
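The standardization of claim 4 and the matching de-standardization of step 3 can be sketched as a pair of functions. This assumes the usual min-max scaling implied by the maximum and minimum values named in the claim; the function names are hypothetical, and in practice the extrema would presumably be computed on the training data and reused for the test data.

```python
import numpy as np

def minmax_fit(x):
    """Return (minimum, maximum) of the sequence data."""
    return float(np.min(x)), float(np.max(x))

def minmax_apply(x, x_min, x_max):
    """Scale x into [0, 1] using the stored extrema."""
    if x_max == x_min:
        raise ValueError("constant data cannot be min-max scaled")
    return (x - x_min) / (x_max - x_min)

def minmax_invert(x_norm, x_min, x_max):
    """Map normalized model output back to physical units."""
    return x_norm * (x_max - x_min) + x_min

sst = np.array([[10.0, 20.0], [30.0, 15.0]])   # toy SST values
lo, hi = minmax_fit(sst)
scaled = minmax_apply(sst, lo, hi)
restored = minmax_invert(scaled, lo, hi)
```

Keeping the extrema from the fitting step is what makes the inverse exact, so the corrected SST comes back in the same units as the input.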

5. The method for correcting the deviation of sea surface temperature numerical forecast based on the attention mechanism according to claim 1 is characterized in that the calculation formula of the channel attention module is as follows:

$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{MaxPool3D}(F)) + \mathrm{MLP}(\mathrm{AvgPool3D}(F))\big)$$

where MLP denotes a multi-layer perceptron, MaxPool3D denotes maximum pooling, AvgPool3D denotes average pooling, and F is the input feature matrix;

The spatial attention map $M_s$ is:

$$M_s(F) = \sigma\big(f^{3\times 3}([\mathrm{MaxPool3D}(F); \mathrm{AvgPool3D}(F)])\big)$$

where $f^{3\times 3}$ is a 3×3 convolution operation.
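The two attention modules of claim 5 can be sketched in NumPy as below. Several choices here are assumptions rather than details from the claim: the feature tensor has shape (C, D, H, W), the shared MLP has one hidden layer with ReLU, the spatial convolution uses a 3×3×3 kernel with zero padding, and the spatial module is applied to the channel-refined features. All function and parameter names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(f, w1, w2):
    """f: (C, D, H, W); shared MLP weights w1: (C//r, C), w2: (C, C//r)."""
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)
    pooled_max = f.max(axis=(1, 2, 3))        # global max pool per channel
    pooled_avg = f.mean(axis=(1, 2, 3))       # global average pool per channel
    return sigmoid(mlp(pooled_max) + mlp(pooled_avg))          # (C,)

def spatial_attention(f, kernel):
    """f: (C, D, H, W); kernel: (2, 3, 3, 3). Pool over channels, then convolve."""
    stacked = np.stack([f.max(axis=0), f.mean(axis=0)])        # (2, D, H, W)
    padded = np.pad(stacked, ((0, 0), (1, 1), (1, 1), (1, 1)))
    windows = sliding_window_view(padded, (3, 3, 3), axis=(1, 2, 3))
    return sigmoid(np.einsum("cdhwijk,cijk->dhw", windows, kernel))  # (D, H, W)

def cbam(f, w1, w2, kernel):
    """Channel attention first, then spatial attention on the refined features."""
    f_prime = channel_attention(f, w1, w2)[:, None, None, None] * f
    return spatial_attention(f_prime, kernel)[None] * f_prime

rng = np.random.default_rng(0)
f = rng.normal(size=(4, 3, 5, 6))             # 4 channels, 3x5x6 volume
w1 = rng.normal(size=(2, 4))
w2 = rng.normal(size=(4, 2))
kernel = rng.normal(size=(2, 3, 3, 3))
out = cbam(f, w1, w2, kernel)
```

The module adds only the MLP weights and one small convolution kernel, which is consistent with the small parameter overhead reported for CBAM in the description.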

6. The method for correcting the deviation of sea surface temperature numerical forecast based on the attention mechanism according to claim 1, characterized in that the calculation formula of the ConvLSTM network is as follows:

$$i_t = \sigma(w_{xi} * x_t + w_{hi} * h_{t-1} + w_{ci} \circ c_{t-1} + b_i)$$

$$f_t = \sigma(w_{xf} * x_t + w_{hf} * h_{t-1} + w_{cf} \circ c_{t-1} + b_f)$$

$$c_t = f_t \circ c_{t-1} + i_t \circ \tanh(w_{xc} * x_t + w_{hc} * h_{t-1} + b_c)$$

$$o_t = \sigma(w_{xo} * x_t + w_{ho} * h_{t-1} + w_{co} \circ c_t + b_o)$$

$$h_t = o_t \circ \tanh(c_t)$$

where $*$ denotes convolution and $\circ$ denotes the element-wise product; $i_t$ is the input gate, $f_t$ is the forget gate, $o_t$ is the output gate, and $c_t$ is the unit state; w is a weight matrix, with $w_{xi}$ the weight matrix applied to $x_t$ in the input gate, $w_{xf}$ the weight matrix applied to $x_t$ in the forget gate, $w_{xc}$ the weight matrix applied to $x_t$ in the unit state calculation, and $w_{xo}$ the weight matrix applied to $x_t$ in the output gate; b is a bias term, with $b_i$ the bias of the input gate, $b_f$ the bias of the forget gate, $b_c$ the bias of the unit state, and $b_o$ the bias of the output gate; $x_t$ is the input matrix, $h_{t-1}$ is the hidden layer state at time t-1, and $c_{t-1}$ is the memory state at time t-1.
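A single ConvLSTM update can be sketched in NumPy as below. It follows the standard ConvLSTM form, in which every gate convolves the input and the previous hidden state and the peephole terms are element-wise products with the cell state. The 3×3 kernels, zero padding, parameter dictionary layout and function names are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv2d_same(x, k):
    """x: (Cin, H, W); k: (Cout, Cin, kh, kw) with odd kh, kw; zero padded."""
    ph, pw = k.shape[-2] // 2, k.shape[-1] // 2
    padded = np.pad(x, ((0, 0), (ph, ph), (pw, pw)))
    windows = sliding_window_view(padded, k.shape[-2:], axis=(1, 2))
    return np.einsum("chwij,ocij->ohw", windows, k)

def init_params(c_in, c_hid, height, width, rng, k=3):
    """Random parameters: conv kernels, biases, and peephole weights."""
    p = {}
    for g in "ifco":
        p[f"w_x{g}"] = rng.normal(scale=0.1, size=(c_hid, c_in, k, k))
        p[f"w_h{g}"] = rng.normal(scale=0.1, size=(c_hid, c_hid, k, k))
        p[f"b_{g}"] = np.zeros((c_hid, 1, 1))
    for g in "ifo":
        p[f"w_c{g}"] = rng.normal(scale=0.1, size=(c_hid, height, width))
    return p

def convlstm_step(x, h_prev, c_prev, p):
    """One time step; returns the new hidden state and cell state."""
    conv = conv2d_same
    i = sigmoid(conv(x, p["w_xi"]) + conv(h_prev, p["w_hi"]) + p["w_ci"] * c_prev + p["b_i"])
    f = sigmoid(conv(x, p["w_xf"]) + conv(h_prev, p["w_hf"]) + p["w_cf"] * c_prev + p["b_f"])
    c = f * c_prev + i * np.tanh(conv(x, p["w_xc"]) + conv(h_prev, p["w_hc"]) + p["b_c"])
    o = sigmoid(conv(x, p["w_xo"]) + conv(h_prev, p["w_ho"]) + p["w_co"] * c + p["b_o"])
    return o * np.tanh(c), c

rng = np.random.default_rng(0)
params = init_params(c_in=4, c_hid=2, height=5, width=6, rng=rng)
x = rng.normal(size=(4, 5, 6))
h, c = convlstm_step(x, np.zeros((2, 5, 6)), np.zeros((2, 5, 6)), params)
```

Because the state-to-state transitions are convolutions rather than dense products, the hidden and cell states keep the H×W grid layout of the SST field at every step.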

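7. The method for correcting the deviation of sea surface temperature numerical forecast based on the attention mechanism according to claim 1, characterized in that the mean square error MSE, root mean square error RMSE, mean absolute error MAE and mean absolute percentage error MAPE are calculated as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$

where $y_i$ is the true value, $\hat{y}_i$ is the estimated value, and n is the number of samples.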
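The four evaluation indicators of claim 7 can be computed with a short helper. This is a sketch using the standard definitions of MSE, RMSE, MAE and MAPE (with MAPE expressed in percent); the function name `correction_metrics` and the toy values are hypothetical.

```python
import numpy as np

def correction_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and MAPE (in percent) over all grid points."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    return {
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "MAE": float(np.mean(np.abs(err))),
        # MAPE is undefined where the true value is zero.
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100.0),
    }

# Toy example: two grid points with true SST 20 and 25, corrected to 21 and 24.
scores = correction_metrics([20.0, 25.0], [21.0, 24.0])
```

Computing the same indicators for the raw forecast and for the corrected output gives the before-and-after comparison used to judge the correction.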