Background
China borders the northwest Pacific, the ocean basin that generates the most tropical cyclones each year, accounting for about 30% of the global annual total. Tropical cyclones are most active from May to November. In this period about 25 typhoons form over the northwest Pacific each year, and on average about 6.9 tropical cyclones per year enter China's offshore waters and make landfall in China. The severe weather they bring, such as gales, rainstorms and storm surges, seriously affects daily life and work, personal safety, and agricultural, fishery and industrial production. The tropical cyclone is therefore one of the weather systems of greatest concern to meteorological support services in summer, and improving tropical cyclone forecast accuracy is of great significance for disaster prevention and mitigation.
At present, synoptic-scale tropical cyclone forecasting within 10 days is mature. It relies mainly on numerical weather prediction, in particular ensemble forecasting; forecast accuracy improves year by year, track errors keep decreasing, and the results largely meet the needs of short-term activities. Long-term planning, however, requires forecasts beyond 10 days or even at the monthly scale, which makes monthly-scale tropical cyclone prediction particularly important. Unlike synoptic-scale forecasting, monthly-scale tropical cyclone climate prediction has its own characteristics in terms of influencing factors, forecast content and technical approach. Synoptic-scale numerical prediction of tropical cyclones is an initial value problem whose results depend strongly on the initial conditions, whereas monthly-scale prediction is governed by external forcing factors such as monthly variations in global sea temperature, atmospheric oscillations, snow cover, sea ice and solar radiation, so the theoretical basis of synoptic-scale prediction does not apply. The theory of monthly-scale tropical cyclone prediction is less mature than that of synoptic-scale prediction and remains a difficult problem in international meteorological research.
The generation frequency of tropical cyclones has always been a key quantity in climate-scale tropical cyclone prediction. Many studies have addressed the annual tropical cyclone frequency, but no method is yet available for predicting the tropical cyclone frequency at the monthly scale.
The tropical cyclone generation frequency is abbreviated as TCF.
Disclosure of Invention
To predict the monthly-scale generation frequency of tropical cyclones over the northwest Pacific, the invention provides, for the first time, a prediction method based on sea temperature principal component information. The method first establishes a prediction model for the monthly-scale northwest Pacific tropical cyclone generation frequency (TCF) and then uses that model to make predictions.
The method comprises the following specific steps:
Step 1. Perform principal component analysis on the sea temperature anomaly field. Let the sea temperature anomaly field be X, with m variables and n samples (m = 72 × 36 grid points, n = 30 years × 12 months). In matrix form:

$$X_{m\times n}=\begin{pmatrix}x_{11}&x_{12}&\cdots&x_{1n}\\x_{21}&x_{22}&\cdots&x_{2n}\\\vdots&\vdots&\ddots&\vdots\\x_{m1}&x_{m2}&\cdots&x_{mn}\end{pmatrix} \tag{1}$$

The sea temperature in month t can be written as the vector

$$X_t=(x_{1t},x_{2t},\ldots,x_{mt})^{T} \tag{2}$$

Principal component analysis of X is a linear transformation of X such that

$$X_{m\times n}=V_{m\times M}\,a_{M\times n} \tag{3}$$

where V is an m × M matrix whose column vectors are eigenvectors of the covariance matrix of X, and $a_t$ is the column vector formed by the values in month t of the M principal components of the empirical orthogonal function (EOF) decomposition, an M-dimensional time series:

$$a_t=(a_{1t},a_{2t},\ldots,a_{Mt})^{T} \tag{4}$$

The principal components a are mutually orthogonal, that is, uncorrelated. The principal components of the leading modes describe the most important patterns of sea temperature and account for most of its variability, so the large volume of sea temperature data, with its many grid points and observation samples, is greatly reduced.
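The decomposition in step 1 can be sketched as follows. This is a minimal illustration, assuming the EOFs are computed with a singular value decomposition of the anomaly matrix; the function name `eof_decompose` and the random input are hypothetical stand-ins, not the data or code of the method.

```python
import numpy as np

def eof_decompose(X, M):
    """Decompose an anomaly matrix X (m grid points x n months) as X ~ V @ a.

    Returns V (m x M, orthonormal columns), a (M x n, principal components)
    and the fraction of total variance explained by each retained mode.
    """
    # Remove the time mean at every grid point so the rows of X are anomalies.
    X = X - X.mean(axis=1, keepdims=True)
    # Thin SVD: X = U @ diag(s) @ Wt; the columns of U are the EOF patterns.
    U, s, Wt = np.linalg.svd(X, full_matrices=False)
    V = U[:, :M]
    a = V.T @ X                      # project the field onto the EOFs
    explained = s[:M] ** 2 / np.sum(s ** 2)
    return V, a, explained

# Synthetic stand-in for the 72 x 36 grid and 30 x 12 months (random, not real data).
rng = np.random.default_rng(0)
X = rng.standard_normal((72 * 36, 30 * 12))
V, a, explained = eof_decompose(X, M=10)
```

With this construction the columns of `V` are orthonormal and the rows of `a` are mutually orthogonal, which is the property the method relies on when it treats the principal components as independent predictors.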
Step 2. Establish a linear prediction model for the sea temperature principal components and predict the principal components of future months.

Assuming that the sea temperature $X_t$ satisfies a linear evolution relation, the linear model for a τ-step prediction is

$$X_{t+\tau}=F(\tau)X_t+\varepsilon_{t+\tau} \tag{5}$$

where $F(\tau)$ is an m × m constant matrix of autoregression coefficients and $\varepsilon_{t+\tau}$ is an error vector, assumed to be Gaussian white noise.

Substituting equation (3) into equation (5) gives

$$Va_{t+\tau}=F(\tau)Va_t+\varepsilon_{t+\tau} \tag{6}$$

Left-multiplying equation (6) by $V^{T}$ and using the orthonormality of the eigenvectors, $V^{T}V=I$, gives

$$a_{t+\tau}=V^{T}F(\tau)Va_t+V^{T}\varepsilon_{t+\tau} \tag{7}$$

It can be seen that the sea temperature principal component $a_t$ also satisfies linear evolution. Writing $\eta_{t+\tau}=V^{T}\varepsilon_{t+\tau}$ and $G(\tau)=V^{T}F(\tau)V$ and substituting into equation (7) gives

$$a_{t+\tau}=G(\tau)a_t+\eta_{t+\tau} \tag{8}$$

where $G(\tau)$ is an M × M constant matrix, which can be obtained from

$$G(\tau)=C_\tau C_0^{-1} \tag{9}$$

$$C_\tau=\langle a_{t+\tau}a_t^{T}\rangle \tag{10}$$

$$C_0=\langle a_t a_t^{T}\rangle \tag{11}$$

In equation (11), the variance $\langle a_t a_t\rangle$ of each principal component is its eigenvalue λ, and the covariance between different principal components is 0. $C_\tau$ is estimated as

$$C_\tau(k,l)=\frac{1}{n-\tau}\sum_{t=1}^{n-\tau}a_{k,t+\tau}\,a_{l,t} \tag{12}$$

Once $G(\tau)$ is obtained, a linear prediction model of the sea temperature principal components can be established. The prediction equation of the k-th sea temperature principal component is

$$\hat a_{k,t+\tau}=\sum_{l=1}^{M}G_{kl}(\tau)\,a_{l,t} \tag{13}$$
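The step 2 model can be illustrated with a short sketch. It assumes the M × M constant matrix of the principal component evolution is estimated as the lagged covariance times the inverse of the lag-0 covariance, with both covariances taken as sample averages; the names `fit_propagator`, `predict_pcs` and `G_true` are hypothetical, and the input is a synthetic autoregressive series rather than real principal components.

```python
import numpy as np

def fit_propagator(a, tau):
    """Estimate G(tau) = C_tau @ inv(C_0) from principal components a (M x n)."""
    n = a.shape[1]
    a0 = a[:, : n - tau]             # a_t
    a1 = a[:, tau:]                  # a_{t+tau}
    C_tau = a1 @ a0.T / (n - tau)    # lagged covariance <a_{t+tau} a_t^T>
    C_0 = a0 @ a0.T / (n - tau)      # lag-0 covariance <a_t a_t^T>
    # Solve G @ C_0 = C_tau without forming the inverse explicitly.
    return np.linalg.solve(C_0.T, C_tau.T).T

def predict_pcs(G, a_t):
    """Forecast a_{t+tau} = G(tau) @ a_t with the noise term set to zero."""
    return G @ a_t

# Synthetic check: generate a 3-mode series with a known one-step matrix.
rng = np.random.default_rng(1)
G_true = np.diag([0.9, 0.5, -0.3])
a = np.zeros((3, 5000))
for t in range(1, a.shape[1]):
    a[:, t] = G_true @ a[:, t - 1] + rng.standard_normal(3)

G = fit_propagator(a, tau=1)         # should be close to G_true
a_next = predict_pcs(G, a[:, -1])    # one-step forecast of all three modes
```

For real principal components the lag-0 covariance is diagonal, so solving the linear system reduces to dividing each column of the lagged covariance by the corresponding eigenvalue.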
Step 3. Establish a linear prediction model between the sea temperature principal components and the TCF. The TCF is converted to monthly anomaly form. The historical sea temperature principal components of each month and the TCF of the corresponding month are extracted, and the principal components $a_l$, l = 1, 2, …, N, of the first N modes are retained. The linear relation between the TCF and the sea temperature principal components in month i is then

$$TCF_i=\sum_{l=1}^{N}A_{l,i}\,a_{l,i} \tag{14}$$

The coefficient A is given by

$$A_i=\langle TCF_i\,a_i^{T}\rangle\,\langle a_i a_i^{T}\rangle^{-1} \tag{15}$$

where $a_i=(a_{1,i},\ldots,a_{N,i})^{T}$ and $A_i=(A_{1,i},\ldots,A_{N,i})$.

Step 4. Predict with the model. Substituting the predicted sea temperature principal components of the future month into the monthly-scale northwest Pacific tropical cyclone generation frequency prediction model gives the predicted TCF of month t + τ:

$$TCF_{t+\tau}=\sum_{l=1}^{N}A_{l,i}\,\hat a_{l,t+\tau} \tag{16}$$

where $\hat a_{l,t+\tau}$ is the predicted l-th principal component and i is the predicted month.
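Steps 3 and 4 amount to a linear regression followed by a substitution, which can be sketched as below. The sketch assumes an ordinary least-squares fit and assumes that the climatological monthly mean is added back to the predicted anomaly to obtain a count; the function names and all numbers are hypothetical and synthetic.

```python
import numpy as np

def fit_tcf_model(a_month, tcf_anom, N):
    """Least-squares coefficients A for tcf_anom ~ A @ a_month[:N].

    a_month:  (M x years) principal components for one calendar month
    tcf_anom: (years,) TCF anomalies for the same calendar month
    """
    P = a_month[:N].T                          # years x N design matrix
    A, *_ = np.linalg.lstsq(P, tcf_anom, rcond=None)
    return A

def predict_tcf(A, a_pred, tcf_mean):
    """Turn predicted principal components into a TCF count (anomaly + mean)."""
    return float(A @ a_pred[: len(A)] + tcf_mean)

# Synthetic example: 5 modes, 30 years of one calendar month, N = 2.
rng = np.random.default_rng(2)
a_june = rng.standard_normal((5, 30))
A_true = np.array([1.5, -0.7])
tcf_anom = A_true @ a_june[:2]                 # noise-free synthetic anomalies
A = fit_tcf_model(a_june, tcf_anom, N=2)
tcf_hat = predict_tcf(A, np.array([1.0, 2.0, 0.0, 0.0, 0.0]), tcf_mean=4.0)
```

Because the synthetic anomalies are noise-free, the fit recovers `A_true` and the prediction is 1.5 × 1.0 − 0.7 × 2.0 + 4.0 = 4.1.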
The model is built from the following data:
the Kaplan global mean sea temperature anomaly data released by the UK Met Office, with a resolution of 5° × 5°, 72 grid points in longitude and 36 grid points in latitude, covering January 1856 to the present;
the CMA tropical cyclone best track dataset released by the Shanghai Typhoon Institute of the China Meteorological Administration, covering January 1949 to the present and containing information such as the life history, position and intensity of every tropical cyclone generated each year.
The method uses the 30 years of data covered by both the sea temperature data and the tropical cyclone data for prediction modeling, with the principal components of the global sea temperature anomaly as predictors and the monthly-scale northwest Pacific tropical cyclone generation frequency (TCF) as the predictand.
The number N of sea temperature principal component modes retained in equation (14) is an important parameter of the prediction model. To determine N, hindcasts and optimal parameter calibration were carried out for the May to November TCF over the 30 years from 1989 to 2018. January to April and December are the inactive season for tropical cyclones in the northwest Pacific: the multi-year mean tropical cyclone frequency in these months is below 1, and few of these cyclones make landfall in China. The prediction model and optimal parameter calibration are therefore established only for the TCF of May to November. The criterion for selecting the optimal N is that the standard deviation of the 30-year hindcast TCF for each month is smallest and the accuracy is highest. The standard deviation is computed as

$$\sigma=\sqrt{\frac{1}{Y}\sum_{j=1}^{Y}\left(TCF_{p,j}-TCF_{o,j}\right)^{2}} \tag{17}$$

where $TCF_{p,j}$ is the hindcast tropical cyclone frequency of a given month in year j, $TCF_{o,j}$ is the observed tropical cyclone frequency of that month in year j, and Y is the number of hindcast years.
The accuracy is the percentage of hindcast years in which the hindcast for a given month is correct. A hindcast is counted as correct when the absolute deviation between the hindcast TCF and the observed TCF is less than 2. According to the hindcast results, the optimal N of the TCF prediction model for each month from May to November is: 2 (May), 3 (June), 5 (July), 5 (August), 2 (September), 8 (October), 4 (November).
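The two hindcast scores can be computed as in the sketch below. It assumes the standard deviation is the root-mean-square difference between hindcast and observed TCF over the hindcast years, and it uses the stated correctness threshold of an absolute deviation below 2; the function names and the four-year sample values are hypothetical.

```python
import math

def hindcast_std(tcf_pred, tcf_obs):
    """Root-mean-square deviation between hindcast and observed TCF."""
    n = len(tcf_obs)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(tcf_pred, tcf_obs)) / n)

def hindcast_accuracy(tcf_pred, tcf_obs, tol=2.0):
    """Percentage of years with |hindcast - observed| strictly below tol."""
    hits = sum(abs(p - o) < tol for p, o in zip(tcf_pred, tcf_obs))
    return 100.0 * hits / len(tcf_obs)

# Made-up values for four years of one calendar month.
pred = [3.0, 5.5, 4.0, 7.0]
obs = [4.0, 5.0, 6.0, 4.0]
std = hindcast_std(pred, obs)        # sqrt((1 + 0.25 + 4 + 9) / 4)
acc = hindcast_accuracy(pred, obs)   # deviations 1 and 0.5 count, 2 and 3 do not
```

Note that a deviation of exactly 2 is not counted as correct, because the criterion is "less than 2".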
The beneficial effects of the invention are:
(1) the prediction model extracts independent and more representative principal component information from massive historical sea temperature data as predictors, taking into account both the influence of the ocean on tropical cyclone generation and the seasonal variation of tropical cyclone frequency;
(2) the method achieves, for the first time, month-by-month rolling prediction of the monthly-scale tropical cyclone frequency over the northwest Pacific, with a long lead time and high prediction accuracy.
Detailed Description
The method is further described with reference to the following specific embodiments and the accompanying drawings.
Before the monthly-scale northwest Pacific tropical cyclone generation frequency prediction model is built, the data needed for modeling and prediction must be downloaded. The data required by the method comprise:
the Kaplan global mean sea temperature anomaly data released by the UK Met Office, with a resolution of 5° × 5°, 72 grid points in longitude and 36 grid points in latitude, covering January 1856 to the present; download address: https://www.esrl.noaa.gov/psd/data/gridded/data.
the CMA tropical cyclone best track dataset released by the Shanghai Typhoon Institute of the China Meteorological Administration, covering January 1949 to the present; download address: http://g.hyyb.org/systems/TY/info/tcdataCMA/zjjsjj_zlhq.html.
A climatological analysis of the monthly-scale tropical cyclone frequency was performed to understand how tropical cyclone activity varies by month. Fig. 1 shows the mean number of tropical cyclones in each month over the last 30 years. As Fig. 1 shows, January to April and December are inactive periods for tropical cyclones in the northwest Pacific, with a mean generation frequency below 1 in each of these months. The method therefore predicts the northwest Pacific tropical cyclone generation frequency (TCF) only for May to November.
The method uses the 30 years of data covered by both the sea temperature data and the tropical cyclone data for prediction modeling. For example, to predict the TCF for June 2018 with a prediction step of 2 months, historical sea temperature data for 30 years × 12 months in total between April 1989 and April 2018, and June tropical cyclone data for the 29 years from 1989 to 2017, are required. The sea temperature principal components serve as predictors and the monthly-scale northwest Pacific tropical cyclone generation frequency as the predictand.
Following the flowchart of the TCF prediction method shown in Fig. 2, in this example:
Step 1. Perform principal component analysis on the global sea temperature anomaly field. Let the sea temperature anomaly field be X, with m variables and n samples, where m = 72 × 36 grid points and n = 30 years × 12 months. The sea temperature anomaly can be written in matrix form as

$$X_{m\times n}=\begin{pmatrix}x_{11}&x_{12}&\cdots&x_{1n}\\x_{21}&x_{22}&\cdots&x_{2n}\\\vdots&\vdots&\ddots&\vdots\\x_{m1}&x_{m2}&\cdots&x_{mn}\end{pmatrix}$$

Principal component analysis of X is a linear transformation of X such that $X_{m\times n}=V_{m\times M}\,a_{M\times n}$, where V is an m × M matrix whose column vectors are eigenvectors of the covariance matrix of X, and $a_t$ is the column vector of the values of the M EOF principal components at time t:

$$a_t=(a_{1t},a_{2t},\ldots,a_{Mt})^{T}$$
TABLE 1
Table 1 lists the cumulative variance contribution of the first 10 modes obtained from the sea temperature principal component analysis. As Table 1 shows, the cumulative variance contribution of the first 10 modes reaches 86.9%; these are the most important modes describing sea temperature variation and represent its main characteristics. The principal components a are mutually orthogonal, so the predictors are independent of one another, which greatly reduces overfitting of the prediction model. Using the principal components of only a few modes as predictors also greatly reduces the volume of sea temperature data, which has many grid points and observation samples.
Step 2. Using the historical sea temperature principal components obtained in step 1, establish a linear prediction model of the sea temperature principal components and predict the principal components of future months. Assuming that the sea temperature $X_t$ satisfies a linear evolution relation, the sea temperature principal component $a_t$ also satisfies linear evolution, i.e.

$$a_{t+\tau}=G(\tau)\,a_t+\eta_{t+\tau}$$

where $\eta_{t+\tau}$ is a noise term and $G(\tau)$ is an M × M constant matrix, which can be obtained from

$$G(\tau)=C_\tau C_0^{-1}$$

where $C_\tau=\langle a_{t+\tau}a_t^{T}\rangle$, $C_0=\langle a_t a_t^{T}\rangle=\mathrm{diag}(\lambda_1,\ldots,\lambda_M)$, and λ denotes the eigenvalues. Once $G(\tau)$ is obtained, a linear prediction model of the sea temperature principal components is established, and the prediction equation of the k-th sea temperature principal component is

$$\hat a_{k,t+\tau}=\sum_{l=1}^{M}G_{kl}(\tau)\,a_{l,t}$$

where k = 1, 2, …, M. This prediction equation yields the principal components of the future months to be predicted.
Step 3. Using the historical sea temperature principal components obtained in step 2, establish a linear prediction model between the sea temperature principal components and the TCF. The TCF is converted to monthly anomaly form, and the sea temperature principal components of each historical month and the tropical cyclone frequency of the corresponding month are extracted. For example, predicting the TCF for June 2018 requires extracting all June principal components from the 360 months between April 1989 and April 2018 obtained in step 2 and establishing a linear relation with the June TCF from 1989 to 2018.
Retaining the sea temperature principal components $a_l$, l = 1, 2, …, N, of the first N modes and establishing a linear prediction model with the TCF, the linear relation between the TCF and the sea temperature principal components in month i is

$$TCF_i=\sum_{l=1}^{N}A_{l,i}\,a_{l,i}$$

where i = 1, 2, …, 12, and the coefficient A is given by

$$A_i=\langle TCF_i\,a_i^{T}\rangle\,\langle a_i a_i^{T}\rangle^{-1}$$

with $a_i=(a_{1,i},\ldots,a_{N,i})^{T}$ and $A_i=(A_{1,i},\ldots,A_{N,i})$.
In the TCF prediction model, the number N of retained sea temperature principal component modes is an important parameter. To determine N, hindcasts and optimal parameter calibration were carried out for the May to November TCF over the 30 years from 1989 to 2018. The criterion for the optimal N is that the standard deviation of the 30-year hindcast TCF for each month is smallest and the accuracy is highest. Fig. 3 shows how the standard deviation and accuracy of the hindcast TCF for May to November of 1989 to 2018 vary with the number of modes. As Fig. 3 shows, both vary considerably with the number of modes N, and the optimal number of modes gives the smallest standard deviation and the highest accuracy. The standard deviation is computed as

$$\sigma=\sqrt{\frac{1}{Y}\sum_{j=1}^{Y}\left(TCF_{p,j}-TCF_{o,j}\right)^{2}}$$

where $TCF_{p,j}$ is the hindcast tropical cyclone frequency of a given month in year j, $TCF_{o,j}$ is the observed tropical cyclone frequency of that month in year j, and Y is the number of hindcast years. The accuracy is the percentage of the 30 years in which the hindcast for a given month is correct. A hindcast is counted as correct when the absolute deviation between the hindcast and observed tropical cyclone frequency is less than 2.
TABLE 2
Table 2 lists the optimal number of modes, the standard deviation (unit: count) and the accuracy (unit: %) of the tropical cyclone frequency prediction model for each month. As Table 2 shows, the optimal N of the TCF prediction model for each month from May to November is: 2 (May), 3 (June), 5 (July), 5 (August), 2 (September), 8 (October), 4 (November). With the optimal parameters, the standard deviation of the hindcast results is below 2 and the accuracy is above 80% in every month, which shows that the prediction model performs well.
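The calibration of N can be sketched as a sweep over candidate mode counts. The text does not say how each hindcast year is produced, so this sketch assumes a leave-one-year-out least-squares hindcast, picks the N with the smallest standard deviation, and uses accuracy only to break ties; `select_modes` and the synthetic data are hypothetical.

```python
import numpy as np

def select_modes(a_month, tcf_anom, max_modes, tol=2.0):
    """Return (N, std, accuracy) for the N with the smallest hindcast std."""
    years = tcf_anom.size
    best = None
    for N in range(1, max_modes + 1):
        P = a_month[:N].T                      # years x N predictors
        pred = np.empty(years)
        for y in range(years):                 # leave year y out, then hindcast it
            keep = np.arange(years) != y
            A, *_ = np.linalg.lstsq(P[keep], tcf_anom[keep], rcond=None)
            pred[y] = P[y] @ A
        err = pred - tcf_anom
        std = float(np.sqrt(np.mean(err ** 2)))
        acc = float(100.0 * np.mean(np.abs(err) < tol))
        # Smallest std wins; higher accuracy breaks ties.
        if best is None or (std, -acc) < (best[1], -best[2]):
            best = (N, std, acc)
    return best

# Synthetic data: 8 modes, 30 years; the anomaly depends on modes 1 and 3 only.
rng = np.random.default_rng(3)
a_month = rng.standard_normal((8, 30))
tcf_anom = 2.0 * a_month[0] - 1.0 * a_month[2] + 0.1 * rng.standard_normal(30)
N, std, acc = select_modes(a_month, tcf_anom, max_modes=8)
```

In this synthetic case any N below 3 omits a mode that carries signal, so the selected N is at least 3; with real data the two criteria may disagree, and how to weigh them is a design choice the sketch does not settle.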
Step 4. Substitute the sea temperature principal components of the future month obtained in step 2 into the monthly-scale northwest Pacific tropical cyclone generation frequency prediction model obtained in step 3 to obtain the predicted tropical cyclone frequency of month t + τ:

$$TCF_{t+\tau}=\sum_{l=1}^{N}A_{l,i}\,\hat a_{l,t+\tau}$$

where $\hat a_{l,t+\tau}$ is the predicted l-th principal component and i is the predicted month.