CN115081335B - Method for predicting the spatial distribution of soil heavy metals using an improved deep extreme learning machine
- Publication number
- CN115081335B (application CN202210778405.2A)
- Authority
- CN
- China
- Prior art keywords
- heavy metal
- soil
- learning machine
- extreme learning
- delm
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/24—Earth materials
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2119/00—Details relating to the type or aim of the analysis or the optimisation
- G06F2119/02—Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Chemical & Material Sciences (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Remote Sensing (AREA)
- Environmental & Geological Engineering (AREA)
- General Life Sciences & Earth Sciences (AREA)
- Geology (AREA)
- Geometry (AREA)
- Computer Hardware Design (AREA)
- Food Science & Technology (AREA)
- Medicinal Chemistry (AREA)
- Analytical Chemistry (AREA)
- Biochemistry (AREA)
- Immunology (AREA)
- Pathology (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A method for predicting the spatial distribution of soil heavy metals using an improved deep extreme learning machine comprises the following steps: step 1: determining the survey grid in the study area; step 2: collecting soil samples in the study area; step 3: measuring the heavy metal concentrations of the collected soil samples; step 4: determining the required multi-source auxiliary variables; step 5: screening the multi-source auxiliary variables; step 6: constructing a deep extreme learning machine from the soil heavy metal concentration data and the auxiliary variable data; step 7: optimizing the constructed deep extreme learning machine with the tuna swarm optimization algorithm; step 8: obtaining a spatial distribution map of soil heavy metals. The purpose of the invention is to provide a method that uses a deep extreme learning machine (DELM) to predict soil heavy metal concentrations spatially, so as to improve the accuracy of that prediction and thereby support better prevention and control of soil heavy metal pollution.
Description
Technical Field
The invention belongs to the technical field of soil heavy metal prediction, and in particular relates to a method for predicting the spatial distribution of soil heavy metals.
Background Art
In research on soil heavy metals, the pollution information obtained from only a small number of sampling points has long been very limited, and it is insufficient to characterize the spatial distribution, degree, and risk of soil heavy metal pollution at the regional scale. With the rapid development of geographic information system (GIS) technology and high-resolution remote sensing imagery, GIS provides technical support for visualizing soil heavy metal concentrations spatially, so that describing soil heavy metal pollution no longer relies solely on simple summary statistics; digital mapping reflects the pollution status across a region more intuitively. Related studies have shown that the sources and diffusion of heavy metals in farmland soil are jointly affected by multiple environmental factors. Exploiting the relationships between certain environmental variables and soil heavy metals can improve the accuracy of a prediction model.
Existing studies mainly exploit the linear relationship between environmental variables and the target soil heavy metal. However, the relationships between soil heavy metals and environmental variables are mostly complex and nonlinear, and how to measure this nonlinearity and use it to improve the prediction accuracy for soil heavy metals remains an open problem. Heavy metal content is an important soil property. To obtain accurate soil heavy metal concentration data, the study area is usually sampled at randomly laid-out points; then, based on the soil heavy metal data at the known sample points and the complex relationships among the data, an appropriate spatial prediction model is selected to predict the soil heavy metal distribution at unsampled locations. The main prediction algorithms currently include geostatistical interpolation, linear regression, and neural networks. Geostatistical interpolation considers only spatial autocorrelation and ignores other influencing factors, while linear regression assumes that the relationship between soil heavy metal concentration and the auxiliary variables is linear, whereas in practice it is nonlinear. It is therefore necessary to build a nonlinear machine learning model to predict the spatial distribution of soil heavy metals.
Summary of the Invention
The purpose of the invention is to provide a method that uses a deep extreme learning machine (DELM) to predict soil heavy metal concentrations spatially, so as to improve the accuracy of that prediction and thereby support better prevention and control of soil heavy metal pollution.
A method for predicting the spatial distribution of soil heavy metals using an improved deep extreme learning machine comprises the following steps:
Step 1: Determine the survey grid within the study area;
Step 2: Collect soil samples in the study area;
Step 3: Measure the heavy metal concentrations of the collected soil samples;
Step 4: Determine the required multi-source auxiliary variables;
Step 5: Screen the multi-source auxiliary variables;
Step 6: Construct a deep extreme learning machine (DELM) from the soil heavy metal concentration data and the auxiliary variable data, using the auxiliary variable data in the training set as the DELM input and the soil heavy metal concentration data in the training set as the DELM output;
Step 7: Optimize the constructed DELM with the tuna swarm optimization algorithm;
Step 8: Obtain the spatial distribution map of soil heavy metals.
In step 3, the heavy metals include one or more of Pb, Cd, As, Cr, and Hg.
In step 4, the multi-source auxiliary variables include terrain factor variables, remote sensing variables, spatial position variables, and soil property variables.
In step 5, the multi-source auxiliary variable factors are screened by detecting the explanatory power of each influencing factor on the soil heavy metal elements.
In step 6, the expression of the deep extreme learning machine DELM is:
$$x_l = \beta \cdot g(w, b, x_{l-1})$$
The hidden-layer output weights are:
$$\beta = \left(\frac{I}{C} + H^{T}H\right)^{-1} H^{T} X$$
where $x_{l-1}$ is the output of the $(l-1)$-th hidden layer for the soil heavy metal concentration data, $x_l$ is the output of the $l$-th hidden layer, $\beta$ is the output weight, $w$ is the input weight, $b$ is the hidden-layer bias, $g(\cdot)$ is the activation function, $H$ is the hidden-layer output feature matrix, $C$ is the regularization coefficient, $I$ is the identity matrix, and $X$ is the input auxiliary variable data.
In step 7, the tuna swarm optimization algorithm selects the input weights w and hidden-layer biases b of the DELM, yielding the improved DELM spatial prediction model for soil heavy metals. This specifically comprises the following steps:
(1) Randomly initialize the positions $X_{i,j}$ ($i = 1, 2, \ldots, D$; $j = 1, 2, \ldots, M$) of the tuna population as the initial population NP, where M is the number of tuna and D is the dimension of the input auxiliary variables;
(2) Construct the opposite population OP of the initial population NP using the opposition-based learning strategy, then merge populations NP and OP;
(3) Compute the root mean square error of the trained DELM network with formula (1) as the fitness function, and select the M individuals with the best fitness values as the initial population;
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - t_i\right)^2} \qquad (1)$$
where $y_i$ is the soil heavy metal concentration predicted by the trained DELM, $t_i$ is the measured soil heavy metal concentration, and N is the number of training samples.
(4) Initialize the relevant parameters in the search space: the population produced by the opposition-based learning strategy, the maximum number of iterations $t_{max}$, the current iteration number t, and the upper bound ub and lower bound lb of the search space;
$$X_i^{int} = rand \cdot (ub - lb) + lb, \quad i = 1, 2, \ldots, NP$$
where $X_i^{int}$ is the initial position of the i-th individual, ub and lb are the upper and lower bounds of the search space, NP is the size of the tuna population, and rand is a random vector uniformly distributed in [0, 1];
(5) Assign the free parameters a and z;
(6) As in step 6, construct the DELM from the soil heavy metal concentration data and the auxiliary variable data, using the auxiliary variable data in the training set as the DELM input and the soil heavy metal concentration data in the training set as the DELM output;
(7) Use formula (1) to compute the root mean square error between the soil heavy metal concentrations predicted by the trained DELM and the measured concentrations of the training samples, and take it as the fitness value of the tuna swarm;
(8) Check whether the current iteration number t has reached the maximum number of iterations; if so, go to step (10), otherwise go to step (9);
(9) Update the position of the current best individual: if rand < z, return to step (1) and reinitialize; otherwise hunt cooperatively using the two foraging strategies of the tuna swarm. If rand < 0.5, update the current best individual position with the spiral foraging strategy,
where $X_i^{t+1}$ is the i-th individual at iteration t+1, $X_{best}^{t}$ is the current best individual, $X_{rand}^{t}$ is a randomly generated reference point in the search space, $\alpha_1$ is the weight coefficient controlling the tendency of an individual to move toward the best individual, $\alpha_2$ is the weight coefficient controlling the tendency of an individual to move toward the previous individual, a is a constant that determines how closely a tuna follows the best individual and the previous individual in the initial stage, $\beta$ is the degree to which a tuna follows the best individual and the previous individual in the initial stage, t is the current iteration number, $t_{max}$ is the maximum number of iterations, and b is a random number uniformly distributed between 0 and 1.
Otherwise, update the current best individual position with the parabolic foraging strategy, set t = t + 1, recompute the fitness values of the tuna swarm with formula (1), and compare them with the current best individual; if a better individual is found, update the current best individual, otherwise go to step (8);
where TF is a random number equal to 1 or -1 and p is the parabola control coefficient.
(10) Return the best individual $X_{best}$ and the best fitness value $F(X_{best})$, and extract from them the input-layer weights w and hidden-layer biases b required by the DELM network.
(11) The model is trained while the improved tuna swarm algorithm selects the input-layer weights w and hidden-layer biases b, which yields the DELM prediction model optimized by the improved tuna swarm algorithm. The test data are then fed into this model to complete the prediction and output the predicted soil heavy metal concentrations.
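The equations for the two foraging strategies in step (9) are not reproduced in the text. For orientation, the forms below follow the tuna swarm optimization algorithm as it is commonly published, written with the symbols defined above. They are an assumption about what the patent's formula images show, and the auxiliary exponent $l$ is a name introduced here that does not appear in the text.

```latex
% Spiral foraging (assumed form). Early in the run, X_{rand}^{t} may replace X_{best}^{t}.
\begin{aligned}
X_i^{t+1} &=
\begin{cases}
\alpha_1\left(X_{best}^{t} + \beta\,\lvert X_{best}^{t} - X_i^{t}\rvert\right) + \alpha_2 X_i^{t}, & i = 1\\[4pt]
\alpha_1\left(X_{best}^{t} + \beta\,\lvert X_{best}^{t} - X_i^{t}\rvert\right) + \alpha_2 X_{i-1}^{t}, & i = 2, \ldots, NP
\end{cases}\\
\alpha_1 &= a + (1 - a)\,\frac{t}{t_{max}}, \qquad
\alpha_2 = (1 - a) - (1 - a)\,\frac{t}{t_{max}}\\
\beta &= e^{bl}\cos(2\pi b), \qquad
l = e^{\,3\cos\left(\left(\left(t_{max} + 1/t\right) - 1\right)\pi\right)}
\end{aligned}

% Parabolic foraging (assumed form)
\begin{aligned}
X_i^{t+1} &=
\begin{cases}
X_{best}^{t} + rand\cdot\left(X_{best}^{t} - X_i^{t}\right) + TF\cdot p^{2}\cdot\left(X_{best}^{t} - X_i^{t}\right), & rand < 0.5\\[4pt]
TF\cdot p^{2}\cdot X_i^{t}, & rand \ge 0.5
\end{cases}\\
p &= \left(1 - \frac{t}{t_{max}}\right)^{t/t_{max}}
\end{aligned}
```

Under these forms, $\alpha_1$ grows and $\alpha_2$ shrinks as $t$ approaches $t_{max}$, so individuals rely less on their predecessor and more on the best individual as the search proceeds.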
In step 8, the predicted soil heavy metal concentrations are saved in plain-text (Notepad) format, imported into ArcGIS 10.8, and converted to raster data to draw the spatial prediction map of soil heavy metal concentrations.
Compared with the prior art, the invention has the following beneficial effects:
1) The invention builds the soil heavy metal spatial distribution prediction model from multi-source auxiliary variables and a machine learning method, the deep extreme learning machine (DELM). Traditional soil heavy metal spatial prediction methods consider only the linear relationship between the multi-source auxiliary variables and soil heavy metals and ignore the nonlinear relationship between them. The nonlinear DELM instead learns to capture that nonlinear relationship automatically, which improves the prediction accuracy of the model;
2) In a conventional DELM the input weights and hidden-layer biases are generated randomly, which makes the algorithm's results unstable. The tuna swarm algorithm is used to optimize the DELM input weights and hidden-layer biases, improving the reliability of the algorithm and the prediction accuracy of the model;
3) The tuna swarm algorithm uses a purely random strategy in its population initialization stage, which affects the convergence speed of the population to some extent and makes the algorithm unstable. The opposition-based learning strategy is used to improve the individuals of the initial population and thus the stability of the tuna swarm algorithm. Optimizing the DELM soil heavy metal spatial prediction model with this improved tuna swarm algorithm raises the accuracy of spatial prediction of soil heavy metals, which has practical significance and guiding value for the prediction, prevention, control, and remediation of soil heavy metal pollution.
Brief Description of the Drawings
The invention is further described below with reference to the drawings and embodiments:
Fig. 1 is a flow chart of the invention;
Fig. 2 shows the optimization process of the improved tuna swarm algorithm.
Detailed Description
As shown in Fig. 1, a method for spatially predicting soil heavy metal concentrations using an improved deep extreme learning machine comprises the following steps:
Step 1: Determine the survey grid within the study area according to the relevant regulations;
Step 2: Collect soil samples in the study area, as follows:
1) Select a typical farmland soil environment, divide it into regions, lay out points randomly, and determine the sampling points;
2) During collection, take the topsoil within a depth of 20 cm as the soil sample;
3) Remove impurities from the collected soil samples, air-dry them, grind them, and pass them through a 100-mesh nylon sieve.
To obtain accurate auxiliary variable information for the soil, samples can be collected in a relatively dry season, because the sampled soil is then bare, which makes the hyperspectral remote sensing auxiliary variables more accurate.
Step 3: Measure the heavy metal concentrations of the collected soil samples: digest the samples with HNO3-HCl-H2O2 and use an atomic absorption spectrometer to determine the concentrations of the heavy metals Pb, Cd, As, Cr, and Hg in the digestion solution.
Step 4: Determine the required multi-source auxiliary variables, which cover the following aspects:
1) terrain factor variables, comprising n1 auxiliary variables, where n1 is the number of terrain factor variables; 2) remote sensing variables, comprising n2 auxiliary variables, where n2 is the number of remote sensing variables; 3) spatial position variables, comprising n3 auxiliary variables, where n3 is the number of spatial position variables; 4) soil property variables, comprising n4 auxiliary variables, where n4 is the number of soil property variables.
Step 5: Screen the multi-source auxiliary variable factors with the Geodetector (geographical detector): use its factor detector to measure the explanatory power of each influencing factor on the soil heavy metal elements. A total of np auxiliary variables are selected, where np is the number of auxiliary variables after screening.
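The screening in step 5 can be sketched as follows. This is a minimal illustration of the factor-detector q-statistic as it is usually defined for the Geodetector, q = 1 - (sum over strata of N_h * var_h) / (N * var), and not the patent's own implementation. The function names, the `threshold` cut-off, and the assumption that each auxiliary variable has already been discretized into strata are all hypothetical.

```python
import numpy as np


def factor_detector_q(y, strata):
    """q-statistic: share of the variance of y explained by a stratified factor."""
    y = np.asarray(y, dtype=float)
    strata = np.asarray(strata)
    total_ss = y.size * y.var()  # N * sigma^2, using the population variance
    if total_ss == 0.0:
        return 0.0  # a constant target has nothing to explain
    within_ss = 0.0
    for h in np.unique(strata):
        y_h = y[strata == h]
        within_ss += y_h.size * y_h.var()  # N_h * sigma_h^2 for stratum h
    return 1.0 - within_ss / total_ss


def screen_variables(y, stratified_factors, threshold=0.1):
    """Keep factors whose q reaches the (hypothetical) threshold."""
    q = {name: factor_detector_q(y, s) for name, s in stratified_factors.items()}
    return {name: value for name, value in q.items() if value >= threshold}
```

A q close to 1 means the strata of a factor explain most of the variance of the measured concentrations, and a q close to 0 means the factor explains almost none of it. The variables that pass the screening would then form the np inputs of the DELM.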
Step 6: Construct the deep extreme learning machine DELM from the soil heavy metal concentration data and the auxiliary variable data, using the auxiliary variable data in the training set as the DELM input and the soil heavy metal concentration data in the training set as the DELM output. This specifically comprises the following steps:
6-1: Build the DELM model and set the number of its input-layer nodes to np, the number of auxiliary variables after screening;
6-2: Set the number of hidden layers in the DELM to 2, with 15 nodes in the first hidden layer and 10 nodes in the second. Each hidden layer is an extreme learning machine autoencoder (ELM-AE), and the output of the first hidden layer is the input of the second. The ELM-AE model in the DELM network is therefore given by:
$$x_l = \beta \cdot g(w, b, x_{l-1})$$
The hidden-layer output weights are:
$$\beta = \left(\frac{I}{C} + H^{T}H\right)^{-1} H^{T} X$$
where $x_{l-1}$ is the output of the $(l-1)$-th hidden layer for the soil heavy metal concentration data, $x_l$ is the output of the $l$-th hidden layer, $\beta$ is the output weight, $w$ is the input weight, $b$ is the hidden-layer bias, $g(\cdot)$ is the activation function, $H$ is the hidden-layer output feature matrix, $C$ is the regularization coefficient, $I$ is the identity matrix, and $X$ is the input auxiliary variable data.
6-3: Set the number of output-layer nodes of the DELM neural network model to 1;
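The network of steps 6-1 to 6-3 can be sketched as below. This is a minimal sketch assuming the stacking scheme commonly used for multilayer ELMs: each ELM-AE learns output weights that reconstruct its own input, and the transposed weights encode the features for the next layer. The patent text does not spell out this detail, so the scheme, the sigmoid activation, the default `C`, and the `params` hook for externally chosen (w, b) pairs are assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def ridge_solve(H, T, C):
    """Regularized least squares: beta = (I/C + H^T H)^-1 H^T T."""
    n_hidden = H.shape[1]
    return np.linalg.solve(np.eye(n_hidden) / C + H.T @ H, H.T @ T)


def train_delm(X, t, hidden_sizes=(15, 10), C=1e3, params=None, seed=0):
    """Stack ELM autoencoders, then fit a single-node regression output layer.

    params: optional list of (w, b) per hidden layer, e.g. chosen by an
    optimizer; when None, w and b are drawn at random as in a plain DELM.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    t = np.asarray(t, dtype=float).reshape(-1, 1)
    betas, features = [], X
    for k, n_hidden in enumerate(hidden_sizes):
        n_in = features.shape[1]
        if params is None:
            w = rng.uniform(-1.0, 1.0, size=(n_in, n_hidden))
            b = rng.uniform(-1.0, 1.0, size=n_hidden)
        else:
            w, b = params[k]
        H = sigmoid(features @ w + b)           # hidden-layer output features
        beta = ridge_solve(H, features, C)      # (n_hidden, n_in): reconstructs the input
        betas.append(beta)
        features = sigmoid(features @ beta.T)   # encoded features for the next layer
    beta_out = ridge_solve(features, t, C)      # output layer with one node
    return betas, beta_out


def predict_delm(X, betas, beta_out):
    features = np.asarray(X, dtype=float)
    for beta in betas:
        features = sigmoid(features @ beta.T)
    return (features @ beta_out).ravel()
```

With the default `hidden_sizes=(15, 10)` the sketch mirrors the 15-node and 10-node hidden layers and the single output node described above. Passing `params` is where weights and biases selected by an optimizer would enter.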
步骤7:,由于深度极限学习机DELM中的输入权重w和隐藏层偏置值b是随机产生的,易影响模型预测精度,提出了一种改进的金枪鱼群算法优化深度极限学习机DELM土壤重金属空间预测模型,通过改进的金枪鱼群优化算法选取所述深度极限学习机DELM模型的输入权重w和隐藏层偏置值b,得到基于改进的金枪鱼群算法优化的深度极限学习机神经网络土壤重金属空间预测模型,改进的金枪鱼群算法通过将反向学习策略引入到金枪鱼群算法的种群初始化过程中,一种改进的金枪鱼群算法优化深度极限学习机DELM土壤重金属空间预测模型具体包括以下步骤:Step 7: Since the input weights w and hidden layer bias values b in the deep extreme learning machine DELM are randomly generated, they are easy to affect the prediction accuracy of the model. An improved tuna school algorithm is proposed to optimize the deep extreme learning machine DELM soil heavy metal spatial prediction model. The input weights w and hidden layer bias values b of the deep extreme learning machine DELM model are selected by the improved tuna school optimization algorithm to obtain a deep extreme learning machine neural network soil heavy metal spatial prediction model based on the improved tuna school algorithm optimization. The improved tuna school algorithm introduces the reverse learning strategy into the population initialization process of the tuna school algorithm. An improved tuna school algorithm optimizes the deep extreme learning machine DELM soil heavy metal spatial prediction model specifically includes the following steps:
(1)随机初始化金枪鱼种群的位置xi,j,(i=1,2,...,D,j=1,2,···,M)作为初始种群NP,其中,M表示金枪鱼的数量,D表示输入辅助变量的维度;(1) Randomly initialize the positions of the tuna population x i,j , (i=1, 2, ..., D, j=1, 2, ..., M) as the initial population NP, where M represents the number of tunas and D represents the dimension of the input auxiliary variables;
(2)根据反向学习策略方法构建初始种群NP的反向种群OP;再合并种群NP和种群OP;(2) Constructing the reverse population OP of the initial population NP according to the reverse learning strategy method; then merging the population NP and the population OP;
(3)通过公式(1)计算经过深度极限学习机DELM训练的的均方根误差,作为适应度函数选取M个适应度值高的个体作为初始种群;(3) Calculate the root mean square error of the deep extreme learning machine DELM training by formula (1), and select M individuals with high fitness values as the initial population as the fitness function;
其中,yi为深度极限学习机DELM训练的土壤重金属浓度预测值,ti为土壤重金属浓度的实际值,N为训练样本数。Among them, yi is the predicted value of soil heavy metal concentration trained by deep extreme learning machine DELM, ti is the actual value of soil heavy metal concentration, and N is the number of training samples.
(4)在搜索空间内随机初始化相关参数:反向学习策略初始化过的种群、最大迭代次数tmax=50、当前迭代次数t=1、搜索空间的上界ub、下界lb;(4) Randomly initialize relevant parameters in the search space: the population initialized by the reverse learning strategy, the maximum number of iterations t max = 50, the current number of iterations t = 1, the upper bound ub and the lower bound lb of the search space;
其中,是第i个个体的初始位置,ub和lb分别是搜索空间的上界和下界,NP是金枪鱼种群的数量,rand是一个均匀分布的[0,1]内的随机向量;in, is the initial position of the ith individual, ub and lb are the upper and lower bounds of the search space, NP is the number of tuna populations, and rand is a uniformly distributed random vector in [0,1];
(5)分配自由参数a和z,a=0.7,z=0.05;(5) Assign free parameters a and z, a = 0.7, z = 0.05;
(6)根据步骤6,基于土壤重金属浓度数据和辅助变量数据构建深度极限学习机DELM,将训练集中的辅助变量数据作为深度极限学习机DELM的输入,将训练集中的土壤重金属弄浓度数据作为深度极限学习机DELM的输出;(6) According to step 6, a deep extreme learning machine DELM is constructed based on the soil heavy metal concentration data and the auxiliary variable data, the auxiliary variable data in the training set is used as the input of the deep extreme learning machine DELM, and the soil heavy metal concentration data in the training set is used as the output of the deep extreme learning machine DELM;
(7)通过公式(1)计算经过深度极限学习机DELM训练的土壤重金属浓度预测值与训练样本土壤重金属浓度实际值的均方根误差,作为金枪鱼群适应度值fitness;(7) Calculate the root mean square error between the predicted value of soil heavy metal concentration trained by the deep extreme learning machine DELM and the actual value of soil heavy metal concentration of the training sample by formula (1), and use it as the fitness value of the tuna group;
(8)判断当前迭代次数t是否达到最大迭代次数;若达到,则执行步骤(10),否则执行步骤(9);(8) Determine whether the current number of iterations t reaches the maximum number of iterations; if so, execute step (10); otherwise, execute step (9);
(9)计算当前最佳个体位置更新:若rand<z,则返回步骤(1)重新初始化,否则通过金枪鱼群的两种觅食策略进行合作狩猎。若rand<0.5,则利用金枪鱼螺旋觅食策略来更新当前最佳个体位置,(9) Calculate the current best individual position update: If rand < z, return to step (1) to reinitialize, otherwise use the two foraging strategies of the tuna school to cooperatively hunt. If rand < 0.5, use the tuna spiral foraging strategy to update the current best individual position.
其中,是第t+1次迭代的第i个个体,是当前最佳个体,是是搜索空间中随机生成的参考点,α1是控制个体向最佳个体移动趋势的权重系数,α2是控制个体向前一个个体移动趋势的权重系数,a是一个常数,用于确定金枪鱼在初始阶段跟随最佳个体和前一个体的程度,β为金枪鱼在初始阶段跟随最佳个体和前一个体的程度,t表示当前迭代次数,tmax表示最大迭代次数,b是均匀分布在0到1之间的随机数。in, is the i-th individual in the t+1-th iteration, Is the best individual at the moment, is a randomly generated reference point in the search space, α1 is the weight coefficient that controls the tendency of the individual to move toward the best individual, α2 is the weight coefficient that controls the tendency of the individual to move toward the previous individual, a is a constant that determines the degree to which the tuna follows the best individual and the previous individual in the initial stage, β is the degree to which the tuna follows the best individual and the previous individual in the initial stage, t represents the current number of iterations, tmax represents the maximum number of iterations, and b is a random number uniformly distributed between 0 and 1.
否则,选择金枪鱼抛物线觅食策略更新当前最佳个体位置,当前迭代次数t=t+1,再根据公式(1)更新金枪鱼群的适应度值,与当前最佳个体进行比较,若优于当前最佳个体则更新,否则执行步骤(8);Otherwise, select the tuna parabolic foraging strategy to update the current best individual position, the current iteration number t = t + 1, and then update the fitness value of the tuna group according to formula (1), and compare it with the current best individual. If it is better than the current best individual, update it, otherwise execute step (8);
其中,TF是一个值为1或-1的随机数,p为抛物线控制系数。Among them, TF is a random number with a value of 1 or -1, and p is the parabola control coefficient.
(10)返回最佳个体Xbest和最佳适应值F(Xbest),从中提取出DELM网络所需的输入层权重w和隐含层偏置b。(10) The best individual X best and the best fitness value F(X best ) are returned, from which the input layer weights w and hidden layer biases b required by the DELM network are extracted.
(11) The model is trained while the improved tuna school algorithm selects the input-layer weights w and hidden-layer biases b, which yields the DELM prediction model optimized by the improved tuna school algorithm. The test data are then fed into this model, which completes the prediction and outputs the predicted soil heavy metal concentrations.
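The position updates in step (9) can be sketched as below. The equations themselves did not survive in this text, so the sketch assumes the commonly published Tuna Swarm Optimization formulas for α1, α2, β, and p; the function name `tso_update`, the default values of `a` and `z`, the per-individual reinitialization, and the clamping to `[lower, upper]` are all illustrative assumptions rather than the patent's confirmed method.

```python
import math
import random


def tso_update(population, best, t, t_max, lower, upper, a=0.7, z=0.05, rng=random):
    """Return new positions for one iteration (t >= 1) of a TSO-style update.

    Assumed formulas (standard TSO, not confirmed by this text):
      alpha1 = a + (1 - a) * t / t_max
      alpha2 = (1 - a) - (1 - a) * t / t_max
      beta   = exp(b * l) * cos(2 * pi * b),  l = exp(3 * cos(((t_max + 1/t) - 1) * pi))
      p      = (1 - t / t_max) ** (t / t_max)
    """
    dim = len(best)
    ratio = t / t_max
    alpha1 = a + (1 - a) * ratio          # weight toward the best individual
    alpha2 = (1 - a) - (1 - a) * ratio    # weight toward the preceding individual
    p = (1 - ratio) ** ratio              # parabola control coefficient
    new_population = []
    for i, x in enumerate(population):
        if rng.random() < z:
            # Reinitialize this individual at random inside the bounds (assumption).
            new_x = [lower + rng.random() * (upper - lower) for _ in range(dim)]
        elif rng.random() < 0.5:
            # Spiral foraging around the best individual or a random reference point.
            b = rng.random()
            l = math.exp(3 * math.cos(((t_max + 1 / t) - 1) * math.pi))
            beta = math.exp(b * l) * math.cos(2 * math.pi * b)
            if rng.random() < ratio:
                target = best
            else:
                target = [lower + rng.random() * (upper - lower) for _ in range(dim)]
            previous = population[i - 1] if i > 0 else x
            new_x = [alpha1 * (tg + beta * abs(tg - xi)) + alpha2 * pv
                     for tg, xi, pv in zip(target, x, previous)]
        else:
            # Parabolic foraging; TF is +1 or -1 with equal probability.
            tf = rng.choice((1, -1))
            if rng.random() < 0.5:
                new_x = [bi + rng.random() * (bi - xi) + tf * p ** 2 * (bi - xi)
                         for bi, xi in zip(best, x)]
            else:
                new_x = [tf * p ** 2 * xi for xi in x]
        # Keep every coordinate inside the search bounds (assumption).
        new_population.append([min(max(v, lower), upper) for v in new_x])
    return new_population
```

In a full optimizer, each returned position would be decoded into DELM weights and biases, scored with the RMSE fitness, and compared against the current best individual before the next iteration.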
Step 8: Draw the prediction map of the spatial distribution of soil heavy metals. Save the predicted soil heavy metal concentrations as a plain-text (Notepad) file, import the file into ArcGIS 10.8, and convert it to raster data to draw the spatial prediction map of soil heavy metal concentration.
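A minimal sketch of the export half of this step is shown below. It writes the predictions to a delimited text file that a GIS package can read as a point table. The function name `write_prediction_table`, the column names `x`, `y`, and `concentration`, and the pairing of each prediction with sample coordinates are hypothetical; the import and raster conversion are carried out inside ArcGIS and are not shown.

```python
import csv


def write_prediction_table(path, rows):
    """Write (x, y, predicted concentration) rows to a comma-delimited text file.

    The header names are illustrative; use whatever field names the GIS
    project expects.
    """
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["x", "y", "concentration"])
        for x, y, value in rows:
            # Fixed decimal formatting keeps the table easy to inspect by eye.
            writer.writerow([f"{x:.6f}", f"{y:.6f}", f"{value:.4f}"])
```

For example, `write_prediction_table("predictions.txt", [(120.5, 31.2, 0.35)])` produces a two-line file with a header row followed by one data row.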
An improved tuna school algorithm, characterized in that it comprises the following steps:
(1) Randomly initialize the positions of the tuna population X_{i,j} (i = 1, 2, ..., D; j = 1, 2, ..., M) as the initial population NP, where M is the number of tuna and D is the dimension of the input auxiliary variables;
(2) Construct the opposite population OP of the initial population NP using the opposition-based learning strategy, then merge populations NP and OP;
(3) Use formula (1) to compute the root mean square error between the soil heavy metal concentrations predicted by the trained deep extreme learning machine DELM and the measured concentrations of the training samples. This error serves as the fitness value of the tuna school, and the M individuals with the best fitness values are selected from the merged population as the initial population;
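Steps (1) to (3) can be sketched as follows. The sketch assumes the usual opposition-based learning definition, in which the opposite of a coordinate x in [lower, upper] is lower + upper - x, and it assumes that a lower RMSE means a better individual. The names `rmse` and `opposition_init` are illustrative, and `fitness` stands in for the DELM training-and-scoring step, which is not reproduced here.

```python
import math
import random


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def opposition_init(m, dim, lower, upper, fitness, rng=random):
    """Build an initial population of size m using opposition-based learning.

    fitness maps a position (list of floats) to an error; lower is treated
    as better (assumption for an RMSE-based fitness).
    """
    # Step (1): random initial population NP.
    population = [[lower + rng.random() * (upper - lower) for _ in range(dim)]
                  for _ in range(m)]
    # Step (2): opposite population OP, mirrored inside the bounds, then merged.
    opposite = [[lower + upper - v for v in individual] for individual in population]
    merged = population + opposite
    # Step (3): keep the m individuals with the smallest error.
    merged.sort(key=fitness)
    return merged[:m]
```

With a real model, `fitness` would decode a position into DELM weights and biases, train the network, and return `rmse(predictions, measured_values)` on the training samples.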
Compared with the existing tuna school algorithm, the present invention uses an opposition-based learning strategy to improve the individuals of the initial population. This improvement speeds up population convergence to a certain extent, keeps the algorithm stable, and gives better overall performance.
The purpose of the present invention is to address the problem that the input weights and hidden-layer bias values of the deep extreme learning machine DELM are generated randomly, which degrades the accuracy of the spatial prediction model of soil heavy metal concentration. To obtain the predicted spatial distribution of soil heavy metals from a more accurate model, an improved tuna school optimization algorithm is proposed to optimize these two parameters so that they have universal applicability. The method first improves the initial tuna school population with an opposition-based learning strategy, and then uses the improved tuna school algorithm to optimize the input weights and hidden-layer bias values of the DELM. This raises the accuracy of the spatial prediction model of soil heavy metal concentration and supports precise prevention and control of soil heavy metal pollution.
Claims (3)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210778405.2A CN115081335B (en) | 2022-06-30 | 2022-06-30 | Soil heavy metal spatial distribution prediction method for improved deep extreme learning machine |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN115081335A (en) | 2022-09-20 |
| CN115081335B (en) | 2024-07-16 |
Family
ID=83258004
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210778405.2A Active CN115081335B (en) | 2022-06-30 | 2022-06-30 | Soil heavy metal spatial distribution prediction method for improved deep extreme learning machine |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN115081335B (en) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116050241A (en) * | 2022-11-02 | 2023-05-02 | 西安石油大学 | Prediction method of corrosion rate of submarine pipeline based on PCA-TSO-BPNN model |
| CN116227743B (en) * | 2023-05-06 | 2023-09-01 | 中国华能集团清洁能源技术研究院有限公司 | Method and system for determining abnormal rate of photovoltaic power generation based on tuna swarm algorithm |
| CN118330051B (en) * | 2024-01-18 | 2024-09-17 | 江苏省环境科学研究院 | Polluted site investigation and distribution method based on machine learning |
| CN118067960B (en) * | 2024-02-27 | 2024-07-30 | 镇江苏鹤环境科技有限公司 | A soil environmental pollution monitoring system and method based on multi-source data |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112614552A (en) * | 2021-01-04 | 2021-04-06 | 武汉轻工大学 | BP neural network-based soil heavy metal content prediction method and system |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180240018A1 (en) * | 2016-05-19 | 2018-08-23 | Jiangnan University | Improved extreme learning machine method based on artificial bee colony optimization |
Also Published As
| Publication number | Publication date |
|---|---|
| CN115081335A (en) | 2022-09-20 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN115081335B (en) | Soil heavy metal spatial distribution prediction method for improved deep extreme learning machine | |
| CN110232471B (en) | A method and device for optimizing node layout of precipitation sensor network | |
| CN117077509B (en) | A thermal error modeling method of electric spindle based on KELM neural network optimized by Northern Goshawk algorithm | |
| CN115728463B (en) | An interpretable water quality prediction method based on semi-embedded feature selection | |
| CN116701868A (en) | A Probabilistic Prediction Method for Short-term Wind Power Range | |
| CN120124819A (en) | Dynamic prediction method of carbon sink in ecological restoration area based on time series remote sensing | |
| WO2025025531A1 (en) | Intelligent operation optimization method for municipal solid waste incineration process | |
| CN113011559A (en) | Automatic machine learning method and system based on kubernets | |
| CN118228613B (en) | Soft measurement method for improving TSO optimization deep learning model | |
| CN117895484A (en) | Stacking rainy season photovoltaic power prediction method and system based on BiGRU-Attention | |
| CN118054400A (en) | Wind power prediction method and system based on interpretability and model fusion | |
| CN117194968A (en) | Educational building operation stage carbon emission prediction method and system | |
| CN113449466B (en) | Solar radiation prediction method and system based on PCA and chaotic GWO optimizing RELM | |
| CN110909492A (en) | Sewage treatment process soft measurement method based on extreme gradient lifting algorithm | |
| CN113297805A (en) | Wind power climbing event indirect prediction method | |
| CN110276478A (en) | Short-term wind power prediction method based on SVM optimization based on segmented ant colony algorithm | |
| CN119227909A (en) | A method and system for predicting carbon emissions of power transmission line projects based on improved Beluga algorithm | |
| CN119692559A (en) | A method, device and medium for predicting boiler NOx emission concentration | |
| CN119272950A (en) | A photovoltaic power prediction method | |
| CN116776921B (en) | A solar radiation prediction method and device based on improved patch-informer | |
| CN119626384A (en) | A method for predicting coagulation and dosing in high-density sewage treatment tanks | |
| CN111626468A (en) | A photovoltaic interval prediction method based on a partial convex loss function | |
| CN118657225A (en) | Interpretability evaluation method and system for hydrological and meteorological deep learning forecasting models | |
| CN114879281A (en) | Deep learning-based precipitation prediction method | |
| CN120541808B (en) | Parameter identification method, device, medium and equipment of photovoltaic system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |