CN117371812A - Aircraft group collaborative decision generation method, system and equipment

Info

Publication number: CN117371812A
Application number: CN202311325343.0A
Authority: CN
Inventors: 李雄; 秦小营; 倪晓升; 张易东; 蒋燕梅; 吕雅丽; 熊宇涵; 冼军; 成诚
Original assignee: Sun Yat Sen University
Current assignee: Sun Yat Sen University
Priority date: 2023-10-12
Filing date: 2023-10-12
Publication date: 2024-01-09
Anticipated expiration: 2043-10-12
Also published as: CN117371812B

Abstract

The invention discloses a method, system and device for generating collaborative decisions for an aircraft group, and relates to the technical field of aircraft. Event pseudo-labels are calculated with a preset density peak clustering algorithm, and an initial event extraction model is trained on the resulting event pseudo-label set to obtain a target event extraction model. The target event extraction model extracts events from the aircraft group data set and the ground control station data set respectively, and the events are compared to obtain a plan comparison result. When the plan comparison result is inconsistent, the optimal instruction sequence is selected according to the authorization data of the aircraft group and a preset improved multi-agent deep deterministic policy gradient algorithm to obtain the collaborative decision plan of the aircraft group. When the plan comparison result is consistent, the aircraft group plan is used as the collaborative decision plan. Because the initial event extraction model is fine-tuned on event pseudo-labels before events are extracted, the overall situation of the flight environment can be grasped quickly and a corresponding plan can be formulated to deal with emergencies.

Description

Method, system and device for generating collaborative decisions for an aircraft group

Technical Field

The present invention relates to the technical field of aircraft, and in particular to a method, system and device for generating collaborative decisions for an aircraft group.

Background

Aircraft carry out flight missions inside or outside the atmosphere in fields such as transportation, reconnaissance, military operations and scientific research. In transportation, aircraft are used for long-distance travel and cargo transport, which greatly shortens travel time and improves transport efficiency. In scientific research, aircraft are widely used in fields such as astronomy, meteorology and geology. An aircraft group is a collective of multiple aircraft that work together to complete a specific task, such as reconnaissance and monitoring, search and rescue, or logistics transport. An aircraft group can operate more efficiently and more precisely and, in some cases, offers greater capability and flexibility.

A group of aircraft flying at high speed places higher demands on flight control and stability, because the aircraft must cooperate effectively while performing a mission and must also avoid interfering with one another. The aircraft group therefore needs real-time, efficient communication and cooperation to keep the individual aircraft coordinated.

However, aircraft face complex challenges in the aerodynamic environment, such as turbulence and aerodynamic instability, which may make attitude control difficult or unstable or may even cause loss of control. Existing collaborative decisions for aircraft groups are usually set in advance and cannot be adjusted in response to emergencies. As a result, the failure of a single aircraft in a group flying at high speed may prevent the operation from completing normally, and may even interfere with the normal operation of the other aircraft and cause heavy losses.

Summary of the Invention

The present invention provides a method, system and device for generating collaborative decisions for an aircraft group, which solve the technical problem that existing collaborative decisions for aircraft groups cannot be adjusted in response to emergencies, so that the failure of a single aircraft in the group easily affects the normal completion of the operation.

The method for generating collaborative decisions for an aircraft group provided by the present invention includes:

When emergency data of the aircraft group is received, using a preset density peak clustering algorithm to calculate the event pseudo-labels corresponding to the aircraft group data set associated with the emergency data and to the ground control station data set generated by the ground control station, to obtain an event pseudo-label set;

Training an initial event extraction model based on the event pseudo-label set to obtain a target event extraction model;

Extracting events from the aircraft group data set and the ground control station data set respectively through the target event extraction model and comparing the events, to obtain a plan comparison result;

When the plan comparison result is inconsistent, selecting the optimal instruction sequence according to the authorization data corresponding to the aircraft group and a preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision plan corresponding to the emergency data;

When the plan comparison result is consistent, using the aircraft group plan in the aircraft group data set as the aircraft group collaborative decision plan corresponding to the emergency data.

Optionally, the step of using, when emergency data of the aircraft group is received, a preset density peak clustering algorithm to calculate the event pseudo-labels corresponding to the aircraft group data set associated with the emergency data and to the ground control station data set generated by the ground control station, to obtain an event pseudo-label set, includes:

When emergency data of the aircraft group is received, obtaining the aircraft group data set generated by the aircraft group based on the emergency data and the ground control station data set generated by the ground control station;

Extracting the feature vectors of the aircraft group data set and the ground control station data set through a preset Skip-gram model to obtain a feature vector set;

Calculating the density of each point in the feature vector set using a preset density calculation formula;

The preset density calculation formula is:

where ρ(x_i) denotes the density of point x_i; k denotes the number of neighbors of a point, that is, the k points nearest to the point under the Euclidean distance; the distance term in the formula denotes the Euclidean distance between point x_i and point x_j; u denotes the dimension of a point in the feature vector set, that is, the number of features; g is a subscript that indexes the features; KNN(x_i) denotes the set of the k points in the data set with the smallest Euclidean distance to point x_i; i denotes the i-th point x_i in the data set; and j denotes the j-th point x_j in the data set;

Substituting the density into a preset relative density calculation formula to calculate the relative density of each point;

The preset relative density calculation formula is:

where r_ρ(x_i) denotes the relative density of point x_i; ρ(x_i) denotes the density of point x_i; ρ(x_j) denotes the density of point x_j; k denotes the number of neighbors of a point, that is, the k points nearest to the point under the Euclidean distance; KNN(x_i) denotes the set of the k points in the data set with the smallest Euclidean distance to point x_i; i denotes the i-th point x_i in the data set; and j denotes the j-th point x_j in the data set;

Selecting the points whose relative density is greater than the mean of the relative densities to obtain multiple cluster centers;

Dividing the feature vector set according to the cluster centers using a preset domain calculation formula to obtain multiple sets;

The preset domain calculation formula is:

D_{mn}(x_p) = \{\, x_q \mid x_q \in MN(x_p) \lor (x_m \in MN(x_p) \land x_q \in MN(x_m)) \,\};

where D_{mn}(x_p) denotes the domain of point x_p; MN(x_p) denotes the set of all mutual neighbors of point x_p; MN(x_m) denotes the set of all mutual neighbors of point x_m; x_p denotes the p-th point in the feature vector set; x_m denotes the m-th point in the feature vector set; and x_q denotes the q-th point in the feature vector set;

Assigning to the points of each set the label of the corresponding cluster center, to obtain the event pseudo-label set.
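The clustering steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation. The density and relative density formulas are not reproduced in the text, so the code assumes a common kNN form: density is k divided by the summed distance to the k nearest neighbors, and relative density is a point's density divided by the mean density of its neighbors. The function names (`knn_sets`, `pseudo_labels`), the order in which centers are processed, and the label `-1` for points that no domain reaches are likewise assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance over all feature dimensions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_sets(points, k):
    """For each point, the indices of its k nearest neighbours."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: euclidean(p, points[j]))
        result.append(others[:k])
    return result

def pseudo_labels(points, k):
    n = len(points)
    knn = knn_sets(points, k)
    # ASSUMED density: k over the summed distance to the k nearest neighbours.
    rho = [k / (sum(euclidean(points[i], points[j]) for j in knn[i]) + 1e-12)
           for i in range(n)]
    # ASSUMED relative density: own density over the neighbours' mean density.
    rel = [rho[i] / (sum(rho[j] for j in knn[i]) / k) for i in range(n)]
    # Cluster centres: relative density above the mean relative density.
    mean_rel = sum(rel) / n
    centers = [i for i in range(n) if rel[i] > mean_rel]
    # Mutual neighbours: j is in KNN(i) and i is in KNN(j).
    mn = [{j for j in knn[i] if i in knn[j]} for i in range(n)]
    labels = [-1] * n  # -1 marks a point that no centre's domain reaches
    next_label = 0
    for c in sorted(centers, key=lambda i: -rel[i]):
        if labels[c] == -1:
            labels[c] = next_label
            next_label += 1
        # Domain D_mn(c): mutual neighbours plus their mutual neighbours.
        domain = set(mn[c])
        for m in mn[c]:
            domain |= mn[m]
        for q in domain:
            if labels[q] == -1:
                labels[q] = labels[c]
    return labels

# Two well-separated groups of hypothetical 2-D feature vectors.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
labels = pseudo_labels(points, k=2)
```

In this toy input the middle point of each group has the highest relative density, becomes a cluster center, and passes its label to the other two points through the mutual-neighbor domain.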

Optionally, the step of training an initial event extraction model based on the event pseudo-label set to obtain a target event extraction model includes:

Inputting the event pseudo-label set into the initial event extraction model to obtain an initial loss function value;

Fine-tuning the initial event extraction model according to the initial loss function value to obtain an intermediate event extraction model;

Inputting the event pseudo-label set into the intermediate event extraction model to obtain a target loss function value;

Determining whether the target loss function value is a preset function value;

If so, using the intermediate event extraction model corresponding to the target loss function value as the target event extraction model;

If not, using the intermediate event extraction model as the initial event extraction model, and returning to the step of inputting the event pseudo-label set into the initial event extraction model to obtain an initial loss function value.
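The fine-tuning loop above can be illustrated with a deliberately small stand-in model. In this sketch the event extraction model is replaced by a one-parameter linear model with a mean squared error loss, and the check against the preset function value is interpreted as "loss at or below a preset threshold". Both choices, along with the names `fine_tune_step`, `train_until_preset` and the `max_rounds` guard, are assumptions made for illustration.

```python
def loss(w, samples):
    """Mean squared error of the toy model y = w * x on the labelled samples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)

def fine_tune_step(w, samples, lr):
    """One gradient step: turns the current model into the intermediate model."""
    grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
    return w - lr * grad

def train_until_preset(w, samples, preset_loss, lr=0.1, max_rounds=1000):
    """Repeat fine-tuning until the loss reaches the preset value."""
    for _ in range(max_rounds):
        w = fine_tune_step(w, samples, lr)      # intermediate model
        if loss(w, samples) <= preset_loss:     # target loss check
            return w                            # target model
        # otherwise the intermediate model becomes the new initial model
    return w  # guard against non-convergence (not part of the source flow)

# Hypothetical pseudo-labelled samples generated by y = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w_target = train_until_preset(0.0, samples, preset_loss=1e-6)
```

The `max_rounds` limit is a practical safeguard added here; the flow described above loops until the preset value is reached.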

Optionally, the step of extracting events from the aircraft group data set and the ground control station data set respectively through the target event extraction model and comparing the events, to obtain a plan comparison result, includes:

Extracting events from the aircraft group data set and the ground control station data set respectively through the target event extraction model to obtain aircraft group events and ground events;

Comparing the aircraft group events with the ground events according to trigger word category and argument similarity to obtain event comparison results;

Constructing the plan comparison result from all of the event comparison results.
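A minimal sketch of the comparison step follows. The source does not say how argument similarity is measured or how event comparison results are combined, so this example assumes Jaccard similarity over argument sets, a hypothetical threshold of 0.5, and a plan-level result that is consistent only when every aircraft group event has a matching ground event. The `Event` class and all function names are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    trigger_type: str      # trigger word category
    arguments: frozenset   # event arguments, treated as an unordered set

def jaccard(a, b):
    """ASSUMED similarity measure: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def events_match(e1, e2, threshold=0.5):
    """Two events match when trigger categories agree and arguments are similar."""
    return (e1.trigger_type == e2.trigger_type
            and jaccard(e1.arguments, e2.arguments) >= threshold)

def plans_consistent(aircraft_events, ground_events, threshold=0.5):
    """ASSUMED aggregation: every aircraft event needs a matching ground event."""
    if len(aircraft_events) != len(ground_events):
        return False
    return all(any(events_match(a, g, threshold) for g in ground_events)
               for a in aircraft_events)

aircraft = [Event("avoid", frozenset({"obstacle", "left"}))]
ground_same = [Event("avoid", frozenset({"obstacle", "left", "slow"}))]
ground_diff = [Event("land", frozenset({"runway"}))]
```

With these inputs, `plans_consistent(aircraft, ground_same)` is `True` (same trigger category, Jaccard similarity 2/3) and `plans_consistent(aircraft, ground_diff)` is `False`.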

Optionally, the step of selecting, when the plan comparison result is inconsistent, the optimal instruction sequence according to the authorization data corresponding to the aircraft group and a preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision plan corresponding to the emergency data, includes:

When the plan comparison result is inconsistent, determining whether the authorization data corresponding to the aircraft group indicates that the aircraft group is authorized;

If so, determining the aircraft group collaborative decision plan corresponding to the emergency data according to the instruction sequence within a preset time and the aircraft group plan;

If not, selecting an instruction sequence from the aircraft group data set and the ground control station data set according to the preset improved multi-agent deep deterministic policy gradient algorithm to obtain a target instruction sequence;

Determining whether the target instruction sequence is a preset sequence;

If so, using the aircraft group plan as the aircraft group collaborative decision plan corresponding to the emergency data;

If not, using the target instruction sequence as the aircraft group collaborative decision plan corresponding to the emergency data.

Optionally, the step of determining the aircraft group collaborative decision plan corresponding to the emergency data according to the instruction sequence within a preset time and the aircraft group plan includes:

Determining whether the aircraft group and the ground control station have found an optimal instruction sequence within the preset time;

If so, using the optimal instruction sequence as the aircraft group collaborative decision plan corresponding to the emergency data;

If not, using the aircraft group plan as the aircraft group collaborative decision plan corresponding to the emergency data.
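The branching described in the two optional steps above, together with the consistent case of the main method, reduces to a short decision function. The sketch below only mirrors the control flow. The function name `decide` and its parameters are hypothetical, `negotiated` stands for the optimal instruction sequence found within the preset time (or `None` if none was found), and `selected` stands for the target instruction sequence returned by the selection algorithm.

```python
def decide(consistent, authorized, aircraft_plan, negotiated, selected, preset):
    """Return the collaborative decision plan for one emergency.

    consistent   : plan comparison result
    authorized   : whether the aircraft group is authorized
    aircraft_plan: plan produced by the aircraft group
    negotiated   : optimal sequence found within the preset time, or None
    selected     : target instruction sequence from the selection algorithm
    preset       : the preset (fallback) sequence
    """
    if consistent:
        return aircraft_plan
    if authorized:
        # Use the optimal sequence if one was found in time, else the group's plan.
        return negotiated if negotiated is not None else aircraft_plan
    # Not authorized: a preset result means selection failed, so keep the group's plan.
    if selected == preset:
        return aircraft_plan
    return selected
```

For example, `decide(False, False, "A", None, "S", "P")` returns `"S"`, while the same call with `selected="P"` falls back to `"A"`.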

Optionally, the step of selecting an instruction sequence from the aircraft group data set and the ground control station data set according to the preset improved multi-agent deep deterministic policy gradient algorithm to obtain a target instruction sequence includes:

Constructing an initial decision criterion set from the decision criteria corresponding to the aircraft group data set and the ground control station data set;

Selecting the decision criterion with the highest priority in the initial decision criterion set to obtain a priority decision criterion;

According to the priority decision criterion, using an instruction acquisition algorithm to assign preset weights to the instructions corresponding to the aircraft group data set and the ground control station data set respectively, to obtain an instruction sequence set;

Using the preset improved multi-agent deep deterministic policy gradient algorithm to select, from the instruction sequence set, multiple initial instruction sequences that meet a preset score threshold, and constructing an initial instruction queue;

Selecting the initial instruction sequence with the highest score in the initial instruction queue as an intermediate instruction sequence;

Determining the target instruction sequence according to the aircraft group's own constraints and the intermediate instruction sequence.

Optionally, the step of determining the target instruction sequence according to the aircraft group's own constraints and the intermediate instruction sequence includes:

Determining whether the aircraft group can execute the intermediate instruction sequence under its own constraints;

If so, using the intermediate instruction sequence as the target instruction sequence;

If not, deleting the intermediate instruction sequence from the initial instruction queue to obtain an intermediate instruction queue;

Determining whether the intermediate instruction queue is empty;

If the intermediate instruction queue is empty, deleting the priority decision criterion from the initial decision criterion set to obtain a target decision criterion set;

Determining whether the target decision criterion set is empty;

If the target decision criterion set is empty, using the preset sequence as the target instruction sequence;

If the target decision criterion set is not empty, using the target decision criterion set as the initial decision criterion set, and returning to the step of selecting the decision criterion with the highest priority in the initial decision criterion set to obtain a priority decision criterion;

If the intermediate instruction queue is not empty, using the intermediate instruction queue as the initial instruction queue, and returning to the step of selecting the initial instruction sequence with the highest score in the initial instruction queue as an intermediate instruction sequence.
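The nested fallback described above (try the best-scoring sequence, then the next one, then the next criterion, then the preset sequence) can be written as two loops. This sketch shows only that control flow. The improved multi-agent deep deterministic policy gradient algorithm is not implemented; a caller-supplied `score` function stands in for it, and `candidates_for` stands in for the instruction acquisition algorithm. All names and the example data are hypothetical.

```python
def select_target_sequence(criteria, candidates_for, score, feasible,
                           threshold, preset):
    """criteria: list of (priority, name); a larger priority is tried first."""
    remaining = sorted(criteria, key=lambda c: c[0], reverse=True)
    while remaining:
        _, criterion = remaining.pop(0)           # priority decision criterion
        # Initial instruction queue: sequences meeting the score threshold.
        queue = [s for s in candidates_for(criterion) if score(s) >= threshold]
        queue.sort(key=score, reverse=True)       # best-scoring first
        for seq in queue:                         # intermediate instruction sequence
            if feasible(seq):                     # executable under own constraints?
                return seq                        # target instruction sequence
        # Queue exhausted: drop this criterion and try the next one.
    return preset                                 # no criterion left

# Hypothetical example data.
criteria = [(1, "fuel"), (2, "safety")]
candidates = {"safety": [("climb",), ("dive",)], "fuel": [("glide",)]}
scores = {("climb",): 0.9, ("dive",): 0.7, ("glide",): 0.8}

result = select_target_sequence(
    criteria, candidates.get, scores.get,
    feasible=lambda s: s == ("glide",), threshold=0.5, preset=("hold",))
fallback = select_target_sequence(
    criteria, candidates.get, scores.get,
    feasible=lambda s: False, threshold=0.5, preset=("hold",))
```

Here both "safety" candidates are infeasible, so the search moves to the "fuel" criterion and returns `("glide",)`. When nothing is feasible, the preset sequence `("hold",)` is returned.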

The present invention also provides a system for generating collaborative decisions for an aircraft group, including:

An event pseudo-label set obtaining module, configured to, when emergency data of the aircraft group is received, use a preset density peak clustering algorithm to calculate the event pseudo-labels corresponding to the aircraft group data set associated with the emergency data and to the ground control station data set generated by the ground control station, to obtain an event pseudo-label set;

A target event extraction model obtaining module, configured to train an initial event extraction model based on the event pseudo-label set to obtain a target event extraction model;

A plan comparison result obtaining module, configured to extract events from the aircraft group data set and the ground control station data set respectively through the target event extraction model and compare the events, to obtain a plan comparison result;

A first aircraft group collaborative decision obtaining module, configured to, when the plan comparison result is inconsistent, select the optimal instruction sequence according to the authorization data corresponding to the aircraft group and a preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision plan corresponding to the emergency data;

A second aircraft group collaborative decision obtaining module, configured to, when the plan comparison result is consistent, use the aircraft group plan in the aircraft group data set as the aircraft group collaborative decision plan corresponding to the emergency data.

The present invention also provides an electronic device, including a memory and a processor, where the memory stores a computer program that, when executed by the processor, causes the processor to perform the steps of any one of the above methods for generating collaborative decisions for an aircraft group.

It can be seen from the above technical solutions that the present invention has the following advantages:

The present invention uses a preset density peak clustering algorithm to calculate the event pseudo-labels corresponding to the aircraft group data set associated with the emergency data and to the ground control station data set generated by the ground control station, to obtain an event pseudo-label set. An initial event extraction model is trained based on the event pseudo-label set to obtain a target event extraction model. The target event extraction model extracts events from the aircraft group data set and the ground control station data set respectively, and the events are compared to obtain a plan comparison result. When the plan comparison result is inconsistent, the optimal instruction sequence is selected according to the authorization data corresponding to the aircraft group and a preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision plan corresponding to the emergency data. When the plan comparison result is consistent, the aircraft group plan in the aircraft group data set is used as the aircraft group collaborative decision plan corresponding to the emergency data. This solves the technical problem that existing collaborative decisions for aircraft groups cannot be adjusted in response to emergencies, so that the failure of a single aircraft in the group easily affects the normal completion of the operation. The initial event extraction model is fine-tuned on event pseudo-labels and then used to extract events, so that the overall situation of the flight environment is grasped quickly and a corresponding plan is formulated to deal with the emergency. The improved multi-agent deep deterministic policy gradient algorithm is used to search for instruction sequences so that, as far as possible, the aircraft group avoids threats, completes its tasks and responds successfully to emergencies.

Brief Description of the Drawings

To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Figure 1 is a flow chart of the steps of a method for generating collaborative decisions for an aircraft group provided by Embodiment 1 of the present invention;

Figure 2 is a flow chart of the steps of a method for generating collaborative decisions for an aircraft group provided by Embodiment 2 of the present invention;

Figure 3 shows the framework of the CNN-based event extraction model provided by Embodiment 2 of the present invention;

Figure 4 is a flow diagram of the event extraction model fine-tuning process provided by Embodiment 2 of the present invention;

Figure 5 is a flow diagram of executing the instruction acquisition algorithm provided by Embodiment 2 of the present invention;

Figure 6 is a flow diagram of a method for generating collaborative decisions for an aircraft group provided by Embodiment 2 of the present invention;

Figure 7 is a schematic diagram of an emergency response system that applies the method for generating collaborative decisions for an aircraft group provided by Embodiment 2 of the present invention;

Figure 8 is a structural block diagram of a system for generating collaborative decisions for an aircraft group provided by Embodiment 3 of the present invention.

Detailed Description

Embodiments of the present invention provide a method, system and device for generating collaborative decisions for an aircraft group, which solve the technical problem that existing collaborative decisions for aircraft groups cannot be adjusted in response to emergencies, so that the failure of a single aircraft in the group easily affects the normal completion of the operation.

To make the purpose, features and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

Please refer to Figure 1, which is a flow chart of the steps of a method for generating collaborative decisions for an aircraft group provided by Embodiment 1 of the present invention.

The method for generating collaborative decisions for an aircraft group provided by Embodiment 1 of the present invention includes:

Step 101: When emergency data of the aircraft group is received, use a preset density peak clustering algorithm to calculate the event pseudo-labels corresponding to the aircraft group data set associated with the emergency data and to the ground control station data set generated by the ground control station, to obtain an event pseudo-label set.

In this embodiment of the present invention, the aircraft group can be regarded as multiple agents. When these agents encounter an emergency, they generate a new instruction sequence to cope with it. The new instruction sequence is often inconsistent with the old instruction sequence that the ground control station gave the aircraft group before departure. The old instruction sequence is called the ground control station plan, and the new instruction sequence is called the aircraft group plan. The aircraft group data set includes the aircraft group plan and aircraft group data. The ground control station data set includes the ground control station plan and ground control station data.

When emergency data of the aircraft group is received, the aircraft group data set generated by the aircraft group based on the emergency data and the ground control station data set generated by the ground control station are obtained. The feature vectors of the two data sets are extracted through a preset Skip-gram model to obtain a feature vector set. The density of each point in the feature vector set is calculated using the preset density calculation formula. The density is substituted into the preset relative density calculation formula to calculate the relative density of each point. The points whose relative density is greater than the mean of the relative densities are selected as cluster centers. The feature vector set is divided according to the cluster centers using the preset domain calculation formula to obtain multiple sets. The points of each set are assigned the label of the corresponding cluster center, which yields the event pseudo-label set.
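As background for the feature extraction step, a Skip-gram model learns a vector for each token by predicting the tokens that surround it. The sketch below shows only how the (center, context) training pairs are formed from a token sequence; it does not train embeddings. The function name `skipgram_pairs`, the window size and the example tokens are assumptions, since the text does not specify how the data sets are tokenized.

```python
def skipgram_pairs(tokens, window=1):
    """Build (center, context) pairs for every token within the given window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:                      # a token is not its own context
                pairs.append((center, tokens[j]))
    return pairs

# Hypothetical tokens taken from one instruction in a data set.
pairs = skipgram_pairs(["climb", "to", "altitude"], window=1)
```

A trained Skip-gram model maps each token to a dense vector, and those vectors serve as the points on which the density calculations operate.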

步骤102、基于事件伪标签集训练初始事件抽取模型，得到目标事件抽取模型。Step 102: Train an initial event extraction model based on the event pseudo-label set to obtain a target event extraction model.

在本发明实施例中，通过将事件伪标签集输入初始事件抽取模型，得到初始损失函数值。按照初始损失函数值，微调初始事件抽取模型，得到中间事件抽取模型。将事件伪标签集输入中间事件抽取模型，得到目标损失函数值。判断目标损失函数值是否为预设函数值。若是，则将目标损失函数值对应的中间事件抽取模型作为目标事件抽取模型。若否，则将中间事件抽取模型作为初始事件抽取模型，并跳转执行将事件伪标签集输入初始事件抽取模型，得到初始损失函数值的步骤。In the embodiment of the present invention, the initial loss function value is obtained by inputting the event pseudo-label set into the initial event extraction model. According to the initial loss function value, the initial event extraction model is fine-tuned to obtain the intermediate event extraction model. Input the event pseudo-label set into the intermediate event extraction model to obtain the target loss function value. Determine whether the target loss function value is the preset function value. If so, the intermediate event extraction model corresponding to the target loss function value is used as the target event extraction model. If not, use the intermediate event extraction model as the initial event extraction model, and jump to the step of inputting the event pseudo-label set into the initial event extraction model to obtain the initial loss function value.

Step 103: Use the target event extraction model to extract events from the aircraft group data set and the ground control station data set respectively and compare the events, to obtain the plan comparison result.

In this embodiment of the present invention, the target event extraction model extracts events from the aircraft group data set and the ground control station data set respectively, yielding aircraft group events and ground events. The aircraft group events and the ground events are compared by trigger word category and parameter similarity to obtain event comparison results. The plan comparison result is constructed from all of the event comparison results.

Step 104: When the plan comparison result is inconsistent, select the optimal instruction sequence according to the authorization data corresponding to the aircraft group and the preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision-making plan corresponding to the emergency situation data.

In this embodiment of the present invention, when the plan comparison result is inconsistent, it is determined whether the authorization data corresponding to the aircraft group indicates authorization. If so, the aircraft group collaborative decision-making plan corresponding to the emergency situation data is determined from the instruction sequence obtained within the preset time and the aircraft group plan. If not, an instruction sequence is selected from the aircraft group data set and the ground control station data set according to the preset improved multi-agent deep deterministic policy gradient algorithm to obtain a target instruction sequence, and it is determined whether the target instruction sequence is the preset sequence. If it is, the aircraft group plan is used as the aircraft group collaborative decision-making plan corresponding to the emergency situation data. If it is not, the target instruction sequence is used as that plan.

Step 105: When the plan comparison result is consistent, use the aircraft group plan in the aircraft group data set as the aircraft group collaborative decision-making plan corresponding to the emergency situation data.

In this embodiment of the present invention, the ground control station plan and the aircraft group plan are considered consistent when more than 80% of their events are the same. When the two plans are consistent, the aircraft group plan is executed; that is, the aircraft group plan in the aircraft group data set is used as the aircraft group collaborative decision-making plan corresponding to the emergency situation data.
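The branching of steps 104 and 105 can be sketched in Python as follows. This is only an illustrative reading of the text: the function and parameter names are hypothetical, `select_sequence` stands in for the improved multi-agent deep deterministic policy gradient selection (not implemented here), and the empty tuple used as the preset sequence is an assumed sentinel.

```python
PRESET_SEQUENCE = ()  # assumed sentinel for "the preset sequence"

def collaborative_decision(consistent, authorized, timed_optimal_sequence,
                           select_sequence, aircraft_plan):
    """Return the collaborative decision plan for one emergency.

    timed_optimal_sequence: sequence found within the preset time, or None.
    select_sequence: callable standing in for the improved MADDPG selection.
    """
    if consistent:
        # Step 105: the two plans agree, so the aircraft group plan is used.
        return aircraft_plan
    if authorized:
        # Authorized branch: use the sequence found within the preset time,
        # otherwise fall back to the aircraft group plan.
        if timed_optimal_sequence is not None:
            return timed_optimal_sequence
        return aircraft_plan
    # Unauthorized branch: run the (stubbed) sequence selection.
    target = select_sequence()
    if target == PRESET_SEQUENCE:
        return aircraft_plan
    return target

plan = ["hold formation"]
r_consistent = collaborative_decision(True, False, None, lambda: ("climb",), plan)
r_timed = collaborative_decision(False, True, ["evade"], lambda: ("climb",), plan)
r_timeout = collaborative_decision(False, True, None, lambda: ("climb",), plan)
r_preset = collaborative_decision(False, False, None, lambda: PRESET_SEQUENCE, plan)
r_target = collaborative_decision(False, False, None, lambda: ("climb",), plan)
```

Passing the selection step as a callable keeps the branching logic testable without committing to any particular reinforcement-learning implementation.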

In this embodiment of the present invention, a preset density peak clustering algorithm is used to calculate the event pseudo-labels corresponding to the aircraft group data set (derived from the emergency situation data) and the ground control station data set generated by the ground control station, yielding the event pseudo-label set. The initial event extraction model is trained on the event pseudo-label set to obtain the target event extraction model. The target event extraction model extracts events from the aircraft group data set and the ground control station data set respectively, and the events are compared to obtain the plan comparison result. When the plan comparison result is inconsistent, the optimal instruction sequence is selected according to the authorization data corresponding to the aircraft group and the preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision-making plan corresponding to the emergency situation data. When the plan comparison result is consistent, the aircraft group plan in the aircraft group data set is used as that plan. This solves the technical problem that existing aircraft group collaborative decision-making cannot be adjusted in response to emergencies, so that the failure of a single aircraft in the group can easily prevent the operation from completing normally. The initial event extraction model is fine-tuned on event pseudo-labels and then used to extract events, so that the overall situation of the flight environment is grasped quickly and a corresponding plan is formulated to deal with the emergency. The improved multi-agent deep deterministic policy gradient algorithm is used to find the instruction sequence, so that the aircraft group can, as far as possible, avoid threats, complete its tasks, and respond successfully to emergencies.

Please refer to Figure 2, which is a flow chart of the steps of an aircraft group collaborative decision-making generation method provided by Embodiment 2 of the present invention.

Another aircraft group collaborative decision-making generation method, provided by Embodiment 2 of the present invention, includes:

Step 201: When emergency situation data of the aircraft group is received, use the preset density peak clustering algorithm to calculate the event pseudo-labels corresponding to the aircraft group data set (derived from the emergency situation data) and the ground control station data set generated by the ground control station, to obtain the event pseudo-label set.

Further, step 201 may include the following sub-steps S11-S17:

S11. When emergency situation data of the aircraft group is received, obtain the aircraft group data set generated by the aircraft group from the emergency situation data and the ground control station data set generated by the ground control station.

S12. Extract the feature vectors of the aircraft group data set and the ground control station data set with the preset skip-gram model to obtain a feature vector set.

S13. Use the preset density calculation formula to calculate the density of each point in the feature vector set.

S14. Substitute the density into the preset relative density calculation formula to calculate the relative density of each point.

S15. Select the points whose relative density is greater than the mean relative density of all points to obtain multiple cluster centers.

S16. According to the cluster centers, partition the feature vector set with the preset domain calculation formula to obtain multiple sets.

S17. Give the points of each set the label of the corresponding cluster center to obtain the event pseudo-label set.

In this embodiment of the present invention, the density peak clustering algorithm is improved in order to generate more accurate event pseudo-labels and thereby raise the accuracy of event extraction. Density peak clustering is a simple and effective cluster analysis method, but it has the following limitations. First, the user must select the cluster centers manually while the algorithm runs, which introduces a degree of subjectivity. Second, its allocation strategy assigns a point to the cluster of the nearest point whose density is greater than its own. This strategy often assigns most of the points of a low-density cluster to a high-density cluster, which is known as the domino effect. To address these limitations, the present invention proposes a preset density peak clustering algorithm that includes a method for determining cluster centers automatically and a new point allocation strategy.

To determine the cluster centers automatically, the density and the relative density of each point must be calculated. For any point x_i, the smaller the average distance from x_i to its K nearest neighbors (K-Nearest Neighbors, KNN), the greater its density. The density of point x_i is therefore given by the density calculation formula, that is, the preset density calculation formula:

Here, ρ(x_i) denotes the density of point x_i; k denotes the number of neighbors of a point, that is, the k points closest to it under the Euclidean distance; the distance term denotes the Euclidean distance between points x_i and x_j, where u is the dimension of a point in the feature vector set (the number of features) and g is the subscript that indexes the individual features; KNN(x_i) denotes the set of the k points in the data set with the smallest Euclidean distance to x_i; i denotes the i-th point x_i in the data set and j denotes the j-th point x_j, so that for a data set of v points, i = 1, 2, ..., v and j = 1, 2, ..., v. The data set here refers to the aircraft group data set and the ground control station data set.

The relative density of a point is calculated with the preset relative density calculation formula, as follows:

Here, r_ρ(x_i) denotes the relative density of point x_i; ρ(x_i) denotes the density of point x_i; ρ(x_j) denotes the density of point x_j; k denotes the number of neighbors of a point, that is, the k points closest to it under the Euclidean distance; KNN(x_i) denotes the set of the k points in the data set with the smallest Euclidean distance to x_i; i denotes the i-th point x_i in the data set and j denotes the j-th point x_j.
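The two formulas themselves are not reproduced in the text, so the following Python sketch is only one reading of the surrounding definitions. It assumes that the density is the reciprocal of the mean Euclidean distance to the k nearest neighbors, and that the relative density is a point's density divided by the mean density of its k nearest neighbors. Both forms and all function names are assumptions made for illustration, not the patent's formulas.

```python
import math

def euclidean(a, b):
    """Euclidean distance over the u feature dimensions of two points."""
    return math.sqrt(sum((ag - bg) ** 2 for ag, bg in zip(a, b)))

def knn(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: euclidean(points[i], points[j]))
    return others[:k]

def density(points, i, k):
    """ASSUMED form: reciprocal of the mean distance to the k nearest neighbors.

    Duplicate points (mean distance 0) are not handled in this sketch.
    """
    mean_dist = sum(euclidean(points[i], points[j]) for j in knn(points, i, k)) / k
    return 1.0 / mean_dist

def relative_density(points, i, k):
    """ASSUMED form: density of i divided by the mean density of its neighbors."""
    neighbor_mean = sum(density(points, j, k) for j in knn(points, i, k)) / k
    return density(points, i, k) / neighbor_mean

# Four tightly packed points and one outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
rho = [density(points, i, k=2) for i in range(len(points))]
r_rho = [relative_density(points, i, k=2) for i in range(len(points))]
```

Under these assumed forms the outlier receives both a lower density and a relative density below 1, which matches the stated intuition that a smaller mean neighbor distance means a larger density.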

The automatic cluster center determination method proceeds as follows. The relative density r_ρ and the high-density nearest-neighbor distance δ of the points are each normalized; r_ρ and δ continue to denote the normalized results. Let v₁ = r_ρ·δ, and let v₂ denote the result of sorting v₁ in descending order. For the first point in v₂, if its relative density is greater than the mean relative density of all points and its high-density nearest-neighbor distance is greater than the mean high-density nearest-neighbor distance of all points, the point is a cluster center. The next point in v₂ is then examined in the same way, and so on until the number of cluster centers reaches the set value. The number of cluster centers is a parameter that must be set in advance.
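The selection procedure above can be sketched as follows. The text does not say which normalization is used, so min-max normalization is assumed here; the function names and the sample values are likewise illustrative.

```python
def min_max(values):
    """Min-max normalization to [0, 1] (the normalization method is an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def select_centers(r_rho, delta, n_centers):
    """Pick up to n_centers cluster centers from relative densities and delta values."""
    r = min_max(r_rho)
    d = min_max(delta)
    mean_r = sum(r) / len(r)
    mean_d = sum(d) / len(d)
    # v1 = r_rho * delta; v2 is the point order after sorting v1 from large to small.
    v2 = sorted(range(len(r)), key=lambda i: r[i] * d[i], reverse=True)
    centers = []
    for i in v2:
        if len(centers) == n_centers:
            break
        # A center must exceed both means.
        if r[i] > mean_r and d[i] > mean_d:
            centers.append(i)
    return centers

r_rho = [1.2, 0.9, 1.0, 1.3, 0.4, 0.8]   # illustrative relative densities
delta = [4.0, 0.5, 0.6, 3.5, 0.3, 0.7]   # illustrative high-density neighbor distances
centers = select_centers(r_rho, delta, n_centers=2)
```

With these sample values, points 0 and 3 are the only ones that exceed both means, so they are returned as the two centers.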

The new point allocation strategy is based on the domain of a point. The domain of a point is calculated with the preset domain calculation formula:

In the formula, D_mn(x_p) denotes the domain of point x_p, and MN(x_p) denotes the set of all mutual neighbors of x_p. If x_i ∈ KNN(x_j) and x_j ∈ KNN(x_i), then x_j is a mutual neighbor of x_i, written x_j ∈ MN(x_i); here x_i and x_j are arbitrary points of the data set. MN(x_m) and MN(x_p) are defined in the same way as MN(x_i). x_p denotes the p-th point of the data set, so that for a data set of v points, p = 1, 2, ..., v. x_m denotes any point in MN(x_p), and x_q denotes any point in D_mn(x_p). The domain D_mn(x_p) is, in essence, the set of points that are either mutual neighbors of x_p or mutual neighbors of a mutual neighbor of x_p. In other words, if x_q ∈ MN(x_p), then x_q is in the set MN(x_p) and therefore in the domain; if x_q is not in MN(x_p) but x_q ∈ MN(x_m) and x_m ∈ MN(x_p), then x_q is also in the set D_mn(x_p).
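The mutual-neighbor and domain definitions can be illustrated with a short Python sketch. It follows the verbal definition (mutual neighbors of x_p together with the mutual neighbors of those neighbors); the function names are hypothetical, and the sketch drops x_p itself from its own domain, which is a choice of this example rather than something the text states.

```python
import math

def knn_sets(points, k):
    """Map each index to the set of its k nearest neighbors (Euclidean distance)."""
    result = {}
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        result[i] = set(others[:k])
    return result

def mutual_neighbors(knn):
    """x_j is a mutual neighbor of x_i when each lies in the other's KNN set."""
    return {i: {j for j in knn[i] if i in knn[j]} for i in knn}

def domain(mn, p):
    """Mutual neighbors of p plus the mutual neighbors of those neighbors."""
    dom = set(mn[p])
    for m in mn[p]:
        dom |= mn[m]
    dom.discard(p)  # example-specific choice: a point is not listed in its own domain
    return dom

# One-dimensional points: a chain 0-1-2-3 and an isolated point at 10.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
mn = mutual_neighbors(knn_sets(points, k=2))
dom0 = domain(mn, 0)
dom4 = domain(mn, 4)
```

The isolated point has no mutual neighbors and hence an empty domain, which is what later keeps it from being pulled into a distant cluster by the allocation step.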

New point allocation strategy: for points that are not cluster centers, the clustering process is carried out within the domain. Starting from the point x_i with the highest density, if the high-density nearest neighbor x_j of x_i is in D_mn(x_i), then x_i is assigned to the cluster containing x_j. If the high-density nearest neighbor x_j of x_i is not in D_mn(x_i), let x_k denote the point in D_mn(x_i) that is closest to x_i and has already been assigned, and assign x_j to the cluster containing x_k. This yields the set corresponding to each cluster center.
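A minimal Python sketch of this allocation strategy is given below, under several stated assumptions. The high-density nearest neighbor is taken to be the closest point with a larger density (the usual density peak definition, which the text does not spell out). In the second case the sketch assigns x_i itself to the cluster of x_k, on the reasoning that x_j, having a higher density, has already been processed. Points whose domain contains no assigned point are left unlabeled. The domains in the example are written by hand for illustration.

```python
import math

def assign_points(points, rho, centers, domains):
    """Assign non-center points to clusters in order of decreasing density."""
    labels = {c: c for c in centers}  # each center labels its own cluster
    order = sorted(range(len(points)), key=lambda i: rho[i], reverse=True)
    for i in order:
        if i in labels:
            continue
        # ASSUMED definition: closest point whose density is larger than rho[i].
        higher = [j for j in range(len(points)) if rho[j] > rho[i]]
        j = min(higher, key=lambda h: math.dist(points[i], points[h]), default=None)
        if j is not None and j in domains[i] and j in labels:
            labels[i] = labels[j]      # case 1: neighbor lies inside the domain
            continue
        # Case 2: use the nearest already-assigned point inside the domain.
        assigned = [q for q in domains[i] if q in labels]
        if assigned:
            k = min(assigned, key=lambda q: math.dist(points[i], points[q]))
            labels[i] = labels[k]      # ASSUMPTION: x_i is the point being assigned
    return labels

points = [(0.0,), (1.0,), (2.0,), (6.0,), (9.0,), (10.0,)]
rho = [0.9, 1.0, 0.8, 0.7, 0.95, 0.6]
centers = [1, 4]
# Hand-written domains for illustration only.
domains = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: {5}, 5: {4}}
labels = assign_points(points, rho, centers, domains)
```

Point 3 shows the effect of the domain restriction: its nearest higher-density point is center 4, but that point lies outside its domain, so it joins the cluster of its domain neighbor instead.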

The ground control station plan, the ground control station data, the aircraft group plan and the aircraft group data are referred to as the relevant data. Event pseudo-labels are obtained as follows. The feature vectors of the relevant data are extracted with the Skip-gram model, that is, the preset skip-gram model. These feature vectors are input into the improved density peak clustering algorithm, that is, the preset density peak clustering algorithm, which calculates the similarity between the feature vectors from their Euclidean distance and then carries out the clustering process. The algorithm outputs several cluster centers and, for each cluster center, all the points that lie in the same cluster. Here a point represents an event, and the number of cluster centers is the number of event types. Each cluster center is given a label, and every point in the same cluster receives the same label as its cluster center. For example, if there are currently three event types A, B and C, the improved density peak clustering algorithm outputs three cluster centers together with the cluster membership of every point. The cluster center corresponding to type A events is labeled "A", the one corresponding to type B events is labeled "B", and the one corresponding to type C events is labeled "C". These labels "A", "B" and "C" are the event labels, also called event pseudo-labels. In supervised learning algorithms, most labels are annotated manually, are relatively accurate, and are true labels. The improved density peak clustering algorithm in this patent is an unsupervised learning algorithm. In unsupervised learning there are no manually annotated labels, only labels generated by the algorithm. In most cases these are less accurate than manual labels and may be wrong, for example a type A event labeled as type B, so the labels generated by an unsupervised learning algorithm are called event pseudo-labels.
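The label propagation described above amounts to a lookup from each point's cluster center to that center's label. A minimal sketch, with hypothetical names and made-up cluster assignments:

```python
def pseudo_labels(cluster_of, center_labels):
    """Give every point (event) the label of the cluster center it belongs to.

    cluster_of: maps a point index to the index of its cluster center.
    center_labels: maps a cluster center index to its event label.
    """
    return {point: center_labels[center] for point, center in cluster_of.items()}

# Illustrative clustering output: three centers (points 1, 4 and 7).
cluster_of = {0: 1, 1: 1, 2: 1, 3: 4, 4: 4, 5: 7, 6: 7, 7: 7}
center_labels = {1: "A", 4: "B", 7: "C"}
event_pseudo_labels = pseudo_labels(cluster_of, center_labels)
```

Because the labels come from cluster membership rather than annotation, any clustering mistake propagates directly into the pseudo-labels, which is why they are treated as approximate.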

Step 202: Train the initial event extraction model on the event pseudo-label set to obtain the target event extraction model.

Further, step 202 may include the following sub-steps S21-S26:

S21. Input the event pseudo-label set into the initial event extraction model to obtain the initial loss function value.

S22. Fine-tune the initial event extraction model according to the initial loss function value to obtain the intermediate event extraction model.

S23. Input the event pseudo-label set into the intermediate event extraction model to obtain the target loss function value.

S24. Determine whether the target loss function value is the preset function value. If so, execute step S25. If not, execute step S26.

S25. Use the intermediate event extraction model corresponding to the target loss function value as the target event extraction model.

S26. Use the intermediate event extraction model as the initial event extraction model, and jump to the step of inputting the event pseudo-label set into the initial event extraction model to obtain the initial loss function value.

In this embodiment of the present invention, as shown in Figure 3, the initial event extraction model is a CNN-based event extraction model, and event extraction is modeled as a multi-class classification task. The sentence length is set to sl: sentences longer than sl are truncated and sentences shorter than sl are padded. The jieba toolkit is used to segment the Chinese sentences in the data set into tokens (a token being a basic unit of text). The Skip-gram model (a word embedding model) is used to obtain the word embedding of each token, with dimension d. Let the number of event trigger word types be n₁ and the number of event parameter role types be n₂. A convolutional layer then extracts the features of semantic units of specific lengths, and a multi-layer perceptron extracts sentence features. When a convolutional neural network is applied to a sentence, the filter size determines the length of the semantic units whose features are captured: a filter of size 3×d captures the features of semantic units of length 3. This patent uses filters of sizes 2×d, 3×d, 4×d and 5×d to extract the features of semantic units of the corresponding lengths, with 128 filters of each size. The convolutional layer therefore encodes a sentence as a vector of length 128*4 = 512. The word embeddings of all tokens in a sentence are concatenated into a vector of length sl*d. The first layer of the multi-layer perceptron has sl*d neurons, each subsequent layer has half as many neurons as the previous one, and the output layer has at least 2.5(n₁+n₂) neurons; let the number of output-layer neurons be n₀. The semantic unit features from the convolutional layer and the sentence features from the multi-layer perceptron are concatenated into a vector of length 512+n₀, which is input to the convolutional layer to obtain the classification result. The loss function is calculated from the classification result and the error is back-propagated, thereby determining the model parameters.
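The vector lengths in this architecture can be checked with a few lines of Python. The rule used below for deciding where the halving stops (keep halving while the next layer would still have at least 2.5(n₁+n₂) neurons) is an assumption, since the text only gives the lower bound on the output layer; the values of sl, d, n₁ and n₂ are made up for the example.

```python
import math

def conv_feature_length(filter_heights=(2, 3, 4, 5), filters_per_size=128):
    """Length of the sentence encoding produced by the convolutional layer."""
    return len(filter_heights) * filters_per_size

def mlp_layer_sizes(sl, d, n1, n2):
    """Layer widths: sl*d first, then halved while staying >= 2.5*(n1+n2).

    The stopping rule is an ASSUMPTION; the text only bounds the output width.
    """
    floor = math.ceil(2.5 * (n1 + n2))
    sizes = [sl * d]
    while sizes[-1] // 2 >= floor:
        sizes.append(sizes[-1] // 2)
    return sizes

# Illustrative values: 50 tokens per sentence, 100-dimensional embeddings,
# 8 trigger word types and 12 parameter role types.
sizes = mlp_layer_sizes(sl=50, d=100, n1=8, n2=12)
n0 = sizes[-1]                              # output width of the perceptron
concat_length = conv_feature_length() + n0  # length of the concatenated vector
```

With these example values the perceptron narrows from 5000 to 78 neurons, so the concatenated vector passed on for classification has length 512 + 78.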

As shown in Figure 4, the improved density peak clustering algorithm is applied to the relevant data to obtain event pseudo-labels and event parameter role pseudo-labels. The pseudo-labels are input into the initial event extraction model to obtain the loss function value, and the error is back-propagated to fine-tune the initial event extraction model, yielding the intermediate event extraction model. The event pseudo-label set is then input into the CNN-based event extraction model, that is, the intermediate event extraction model, to obtain the target loss function value. The preset function value refers to the condition that the target loss function value no longer decreases. When the target loss function value no longer decreases, that is, when the model has converged, the fine-tuning process stops and the intermediate event extraction model corresponding to the target loss function value is used as the target event extraction model. Otherwise, the intermediate event extraction model is taken as the initial event extraction model, and execution jumps to the step of inputting the event pseudo-label set into the initial event extraction model to obtain the initial loss function value.
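The fine-tuning loop with its "loss no longer decreases" stopping condition can be sketched generically. In the sketch below the model is a single number and the loss is a simple quadratic, purely so that the loop runs on its own; the tolerance, the round limit and all names are assumptions, and a real implementation would replace `loss_fn` and `update_fn` with the CNN's loss evaluation and back-propagation step.

```python
def fine_tune(model, loss_fn, update_fn, tol=1e-6, max_rounds=1000):
    """Repeat fine-tuning until the loss stops decreasing by more than tol."""
    prev = loss_fn(model)                 # initial loss function value
    for _ in range(max_rounds):
        candidate = update_fn(model)      # intermediate model after one update
        cur = loss_fn(candidate)          # target loss function value
        if prev - cur <= tol:             # loss no longer decreasing: converged
            return candidate, cur
        model, prev = candidate, cur      # intermediate model becomes the initial one
    return model, prev

# Toy stand-ins: a one-parameter "model" with loss (w - 3)^2 and gradient descent.
def toy_loss(w):
    return (w - 3.0) ** 2

def toy_update(w, lr=0.1):
    return w - lr * 2.0 * (w - 3.0)

final_w, final_loss = fine_tune(0.0, toy_loss, toy_update)
```

Comparing successive loss values against a small tolerance is one common way to operationalize "no longer decreases"; patience-based early stopping would be an alternative.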

Step 203: Use the target event extraction model to extract events from the aircraft group data set and the ground control station data set respectively and compare the events, to obtain the plan comparison result.

Further, step 203 may include the following sub-steps S31-S33:

S31. Use the target event extraction model to extract events from the aircraft group data set and the ground control station data set respectively, to obtain aircraft group events and ground events.

S32. Compare the aircraft group events with the ground events according to trigger word category and parameter similarity, to obtain event comparison results.

S33. Construct the plan comparison result from all of the event comparison results.

In this embodiment of the present invention, events are extracted from the relevant data for two purposes: first, to determine from the extraction results whether the aircraft group plan is consistent with the ground control station plan; second, to input the extraction results into the improved multi-agent deep deterministic policy gradient algorithm to obtain the optimal instruction sequence. The fine-tuned CNN-based event extraction model, that is, the target event extraction model, is used to extract events from the relevant data. Both the ground control station plan and the aircraft group plan consist of a series of instructions, and instructions correspond one-to-one to events. Let e_i denote an event in the ground control station plan and e_j an event in the aircraft group plan. If the trigger words of e_i and e_j belong to the same category and more than 50% of the parameters of e_i and e_j are the same, then e_i and e_j are the same event. When more than 80% of the events in the ground control station plan and the aircraft group plan are the same, the two plans are considered consistent. When the two plans are consistent, the aircraft group plan is executed and the algorithm ends, that is, step 210 is executed; otherwise step 204 is executed.
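The two thresholds can be illustrated with a short Python sketch. The text does not say what the percentages are measured against, so the sketch assumes that parameter overlap is measured relative to the larger parameter set and that event overlap is measured relative to the longer plan; the event representation (a dictionary with `trigger_type` and `params`) is also hypothetical.

```python
def same_event(e_i, e_j, param_threshold=0.5):
    """Same trigger category and more than 50% shared parameters."""
    if e_i["trigger_type"] != e_j["trigger_type"]:
        return False
    larger = max(len(e_i["params"]), len(e_j["params"]))
    if larger == 0:
        return True
    shared = len(e_i["params"] & e_j["params"])
    return shared / larger > param_threshold   # ASSUMED denominator: larger set

def plans_consistent(ground, aircraft, event_threshold=0.8):
    """Plans agree when more than 80% of their events can be matched one-to-one."""
    unmatched = list(aircraft)
    matched = 0
    for e_i in ground:
        for e_j in unmatched:
            if same_event(e_i, e_j):
                unmatched.remove(e_j)   # each aircraft event is matched at most once
                matched += 1
                break
    total = max(len(ground), len(aircraft))    # ASSUMED denominator: longer plan
    return total > 0 and matched / total > event_threshold

ground = [
    {"trigger_type": "avoid", "params": {"uav1", "north", "200m"}},
    {"trigger_type": "return", "params": {"uav2", "base"}},
]
aircraft_same = [
    {"trigger_type": "avoid", "params": {"uav1", "north", "300m"}},
    {"trigger_type": "return", "params": {"uav2", "base"}},
]
aircraft_diff = [
    {"trigger_type": "avoid", "params": {"uav1", "north", "300m"}},
    {"trigger_type": "land", "params": {"uav2", "field"}},
]
consistent_same = plans_consistent(ground, aircraft_same)
consistent_diff = plans_consistent(ground, aircraft_diff)
```

The first pair of plans differs only in one parameter of one event and is judged consistent; the second pair shares only half of its events and is not.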

Step 204: When the plan comparison result is inconsistent, determine whether the authorization data corresponding to the aircraft group indicates authorization. If so, execute step 205. If not, execute step 206.

In this embodiment of the present invention, to make the collaborative decision-making algorithm practical, an authorization module is added to decide whether the aircraft group is authorized to act on its own. When it is authorized, step 205 is executed; when it is not authorized, step 206 is executed.

Step 205: Determine the aircraft group collaborative decision-making plan corresponding to the emergency situation data according to the instruction sequence obtained within the preset time and the aircraft group plan.

Further, step 205 may include the following sub-steps S41-S43:

S41. Determine whether the aircraft group and the ground control station have found the optimal instruction sequence within the preset time. If so, execute step S42. If not, execute step S43.

S42. Use the optimal instruction sequence as the aircraft group collaborative decision-making plan corresponding to the emergency situation data.

S43. Use the aircraft group plan as the aircraft group collaborative decision-making plan corresponding to the emergency situation data.

In this embodiment of the present invention, the preset time is a time interval set according to actual needs. If the aircraft group or the ground control station has output the optimal instruction sequence within the preset time, that instruction sequence is executed. If no optimal instruction sequence has been output, the aircraft group plan is taken as the optimal instruction sequence and executed.

Step 206: Select an instruction sequence from the aircraft group data set and the ground control station data set according to the preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the target instruction sequence.

Further, step 206 may include the following sub-steps S51-S56:

S51. Use the decision criteria corresponding to the aircraft group data set and the ground control station data set to construct an initial decision criterion set.

S52. Select the decision criterion with the highest priority in the initial decision criterion set to obtain the priority decision criterion.

S53. According to the priority decision criterion, use the instruction acquisition algorithm to assign preset weights to the instructions corresponding to the aircraft group data set and the ground control station data set respectively, to obtain an instruction sequence set.

S54. Use the preset improved multi-agent deep deterministic policy gradient algorithm to select, from the instruction sequence set, multiple initial instruction sequences that meet the preset score threshold, and construct an initial instruction queue.

S55. Select the initial instruction sequence with the highest score in the initial instruction queue as the intermediate instruction sequence.

S56. Determine the target instruction sequence according to the aircraft group's own constraints and the intermediate instruction sequence.

Further, step S56 may include the following sub-steps S561-S569:

S561. Determine whether the aircraft group can execute the intermediate instruction sequence under its own constraints. If so, execute step S562. If not, execute step S563.

S562. Use the intermediate instruction sequence as the target instruction sequence.

S563. Delete the intermediate instruction sequence from the initial instruction queue to obtain an intermediate instruction queue.

S564. Determine whether the intermediate instruction queue is empty. If so, execute step S565. If not, execute step S569.

S565. Delete the priority decision criterion from the initial decision criterion set to obtain the target decision criterion set.

S566. Determine whether the target decision criterion set is empty. If so, execute step S567. If not, execute step S568.

S567. Use the preset sequence as the target instruction sequence.

S568. Use the target decision criterion set as the initial decision criterion set, and jump to the step of selecting the decision criterion with the highest priority in the initial decision criterion set to obtain the priority decision criterion.

S569. Use the intermediate instruction queue as the initial instruction queue, and jump to the step of selecting the initial instruction sequence with the highest score in the initial instruction queue as the intermediate instruction sequence.
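The control flow of sub-steps S52 to S56 and S561 to S569 is a nested fallback: try candidate sequences in score order under the highest-priority criterion, then move to the next criterion, and finally fall back to the preset sequence. A minimal Python sketch follows; the names are hypothetical, `candidates_for` stands in for the weighting and improved multi-agent deep deterministic policy gradient scoring of S53 and S54 (not implemented here), and `executable` stands in for the aircraft group's constraint check.

```python
def choose_target_sequence(criteria, candidates_for, executable, preset_sequence):
    """criteria: decision criteria sorted from highest to lowest priority.

    candidates_for(criterion) returns a list of (score, sequence) pairs that
    already meet the score threshold; executable(sequence) applies the
    aircraft group's own constraints.
    """
    remaining = list(criteria)
    while remaining:                                      # S566: criteria left?
        criterion = remaining[0]                          # S52: highest priority
        queue = sorted(candidates_for(criterion),
                       key=lambda c: c[0], reverse=True)  # S53-S54: scored queue
        while queue:                                      # S564: queue not empty
            _score, sequence = queue.pop(0)               # S55, then S563 on failure
            if executable(sequence):                      # S561
                return sequence                           # S562
        remaining.pop(0)                                  # S565: drop the criterion
    return preset_sequence                                # S567

# Illustrative data only.
candidates = {
    "safety": [(0.7, ("dive",)), (0.9, ("climb", "turn"))],
    "speed": [(0.8, ("hold",))],
}
preset = ("keep plan",)
found = choose_target_sequence(["safety", "speed"], candidates.get,
                               lambda seq: seq == ("hold",), preset)
fallback = choose_target_sequence(["safety", "speed"], candidates.get,
                                  lambda seq: False, preset)
```

In the first call neither "safety" candidate is executable, so the search moves on to "speed"; in the second call nothing is executable and the preset sequence is returned.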

In this embodiment of the present invention, the aircraft group and the ground control station each determine instruction weights according to different decision criteria, use the improved multi-agent deep deterministic policy gradient algorithm (that is, the preset improved multi-agent deep deterministic policy gradient algorithm) to determine several candidate instruction sequences, and select the target instruction sequence in light of the aircraft group's own constraints. Step 206 involves three aspects: the principle of the instruction acquisition algorithm; the improved multi-agent deep deterministic policy gradient algorithm; and obtaining candidate instruction sequences based on the improved multi-agent deep deterministic policy gradient algorithm.

(1) Principle of the instruction acquisition algorithm. To complete the collaborative decision-making task more quickly and allocate resources more rationally, the decision criteria are sorted from highest to lowest priority. On the aircraft group, the high-priority decision criteria are input into the instruction acquisition algorithm to search for the optimal instruction sequence. On the ground control station, the low-priority decision criteria are input into the instruction acquisition algorithm to search for the optimal instruction sequence. When the aircraft group can find an optimal instruction sequence from the high-priority decision criteria, it executes that sequence; otherwise, it communicates with the ground control station and obtains the optimal instruction sequence found there. The optimal instruction sequence so obtained guides the next phase of the aircraft group's actions.

Process of the instruction acquisition algorithm. The decision criterion with the highest priority is taken from the decision criterion set, and instructions are weighted according to it. Instructions related to the ground control station plan and the aircraft group plan are given higher weights, and instructions related to the ground control station data and the aircraft group data are given lower weights. That is, if an agent completes an instruction related to the ground control station plan or the aircraft group plan, it receives a high reward; if it completes an instruction related to the ground control station data or the aircraft group data, it receives a low reward, which is still positive. The improved multi-agent deep deterministic policy gradient algorithm is used to select the several highest-scoring candidate instruction sequences from these instructions and store them in a queue. The highest-scoring instruction sequence is taken from the queue. If the remaining energy of the aircraft group is sufficient to complete this instruction sequence, it is the optimal instruction sequence; otherwise, it is deleted from the queue and the other instruction sequences in the queue are examined in turn. If the remaining energy of the aircraft group cannot support any instruction sequence in the queue, the decision criterion is replaced, and the process repeats until an optimal instruction sequence is found. This process is shown in Figure 5.

The reason for giving higher weights to instructions related to the ground control station plan and the aircraft group plan, and lower weights to instructions related to the ground control station data and the aircraft group data, is as follows. The ground control station plan is formulated after careful deliberation, and the aircraft group plan is formulated after the emergency has been taken into account, so the instructions involved in both plans are highly trustworthy. On the other hand, the ground control station data mostly reflects global information about the flight environment, while the aircraft group data mostly reflects local information. Combining global and local information reflects the flight environment of the aircraft group more accurately. Therefore, not only the instructions related to the two plans but also the instructions related to the ground control station data and the aircraft group data have reference value.

(2) The improved multi-agent deep deterministic policy gradient algorithm, that is, the preset improved multi-agent deep deterministic policy gradient algorithm. The decision scenario of the present invention involves the own-side aircraft groups and the equipment group at the target. All own-side aircraft groups have the same function. To improve the generalization ability of the collaborative decision-making method, the present invention sets the scenario as a game between two equipment groups. As in a football match, there is confrontation between teams and cooperation within a team, and different roles within a team serve different purposes. Different equipment corresponds to different roles within the team. The multi-agent deep deterministic policy gradient algorithm is a multi-agent deep reinforcement learning algorithm designed for exactly this setting of inter-team confrontation and intra-team cooperation. However, in this algorithm each agent has a global critic, and the input space of the Critic network (i.e., the value network) grows with the number of agents. This leads to a long learning period, so the algorithm is suitable only for scenarios with a small number of agents. To address this problem, the present invention first improves the multi-agent deep deterministic policy gradient algorithm and then, for a given decision criterion, uses the improved algorithm to search for several candidate instruction sequences.

The improved multi-agent deep deterministic policy gradient algorithm. The term central controller refers to the N value networks, and the term policy area refers to the N policy networks provided for the N agents. To make the logical relationships among the agents clearer, a fusion layer is added between the central controller and the policy area. The fusion layer consists of two pseudo-agents, so the central controller is responsible for only two pseudo-agents. The first pseudo-agent integrates the actions, observations, and rewards of all own-side agents into one set of actions, observations, and rewards. The second pseudo-agent does the same for all agents on the target side. The two resulting sets are then uploaded to the central controller. The central controller faces only the two sides of the game and needs to consider only the competitive relationship between the two pseudo-agents, which to some extent solves the difficulty the multi-agent deep deterministic policy gradient algorithm has in handling large numbers of agents. A single pseudo-agent faces all real agents on one side of the game and needs to consider only cooperative relationships. Overall, the multi-agent deep deterministic policy gradient algorithm with the fusion layer inserted not only alleviates the growth of algorithm complexity with the input space but can also handle complex multi-agent relationships in which cooperation and competition coexist. In this multi-agent game, what ultimately matters is only whether the own side wins, not which agent earns the most reward. The whole multi-agent game scenario is therefore divided into two views: one faces the own side, that is, the first pseudo-agent, and the other faces the equipment group at the target. The observations captured by the own-side agents are not fully identical, so their union is taken and input to the first pseudo-agent. The actions and rewards of all own-side agents are fused in the same way. The central controller evaluates the two pseudo-agents. Each pseudo-agent helps the real agents it represents to improve their policies. The multiple real agents corresponding to one pseudo-agent share the same goal, namely winning, and are in a fully cooperative relationship; they only need to work together to carry out the improvement suggestions that the pseudo-agent obtains from the central controller.
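By way of illustration only, the fusion performed by one pseudo-agent may be sketched in Python as follows. The class names `AgentStep` and `PseudoAgentStep`, the function `fuse`, and the representation of an observation as a set of identifiers are assumptions made for this sketch. The text states only that rewards are fused in the same way as observations; summing them here is a further assumption, chosen because the win condition described later compares reward sums.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class AgentStep:
    """Data of one real agent at one time step (illustrative structure)."""
    agent_id: str
    observation: FrozenSet[str]   # e.g. identifiers of the objects it has detected
    action: str
    reward: float


@dataclass(frozen=True)
class PseudoAgentStep:
    """Fused view of one side, as uploaded to the central controller."""
    observation: FrozenSet[str]
    actions: Dict[str, str]
    reward: float


def fuse(steps: Iterable[AgentStep]) -> PseudoAgentStep:
    """Fuse all real agents of one side into a single pseudo-agent record."""
    steps = list(steps)
    # Observations differ between agents, so their union is taken.
    observation = frozenset().union(*(s.observation for s in steps))
    # Actions are kept per agent so that no individual action is lost.
    actions = {s.agent_id: s.action for s in steps}
    # Assumption: the side's reward is the sum of its agents' rewards.
    reward = sum(s.reward for s in steps)
    return PseudoAgentStep(observation, actions, reward)
```

With this layout the central controller receives two records per time step, one per side, regardless of how many real agents each side contains.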

(3) Obtaining candidate instruction sequences based on the improved multi-agent deep deterministic policy gradient algorithm. The present invention sets the decision scenario as a confrontation between the equipment groups of two sides. For now, equipment is assumed to have three functions, namely reconnaissance, penetration, and strike; further functions can be added. Let A = {a₁, a₂, ..., a_i, ..., a_n1}, B = {b₁, b₂, ..., b_i, ..., b_n2}, and C = {c₁, c₂, ..., c_i, ..., c_n3} denote the sets of own-side equipment with the reconnaissance, penetration, and strike functions, respectively. Let a_i, b_i, and c_i denote the i-th type of equipment with the reconnaissance, penetration, or strike function, respectively. n₁, n₂, and n₃ denote the numbers of types of equipment with the reconnaissance, penetration, and strike functions, respectively. A, B, and C may intersect, because one type of equipment can have one or more functions. Let X = {x₁, x₂, ..., x_i, ..., x_n4}, Y = {y₁, y₂, ..., y_i, ..., y_n5}, and Z = {z₁, z₂, ..., z_i, ..., z_n6} denote the sets of target-side equipment with the reconnaissance, penetration, and strike functions, respectively. Let x_i, y_i, and z_i denote the i-th type of equipment with the reconnaissance, penetration, or strike function, respectively. n₄, n₅, and n₆ denote the numbers of types of equipment with the reconnaissance, penetration, and strike functions, respectively. X, Y, and Z may intersect, because one type of equipment can have one or more functions. None of the sets A, B, C, X, Y, and Z contains duplicate elements. Let a_ij denote the quantity of the own-side i-th type of equipment with the reconnaissance function; b_ij and c_ij have analogous meanings. Let x_ij denote the quantity of the target-side i-th type of equipment with the reconnaissance function; y_ij and z_ij have analogous meanings. Let c'_i denote the number of strikes that the own-side i-th type of equipment with the strike function can carry out; z'_i has an analogous meaning. For all types of equipment, the number of times the reconnaissance and penetration functions can be performed is unlimited. However, once a piece of equipment has been defeated by another agent, it ends its service and can no longer perform reconnaissance, strike, or penetration.

Setting the action space. For any agent, the actions it can take come from the set A = {strike, reconnaissance, penetration, stationary, motion}; an agent is not required to be capable of every action in A.

Setting the state space. The state space is set as S = {types of detected targets, number of detected targets, types of own-side equipment, quantity of own-side equipment, action being performed by each piece of equipment}.

Setting the reward function. When own-side equipment a_i detects a target, the reward of the agent corresponding to a_i is increased by r₁, and the reward of the agent corresponding to the detected target is decreased by r₁. When own-side equipment b_i penetrates successfully, the reward of the agent corresponding to b_i is increased by r₂, and the reward of the agent corresponding to the target that failed to stop the penetration of b_i is decreased by r₂. When own-side equipment c_i strikes a target, the reward of the agent corresponding to c_i is increased by r₃, the reward of the agent corresponding to the struck target is decreased by r₃, and the struck target then ends its service. At a set cutoff time, if the sum of the rewards obtained by the own-side equipment is greater than the sum of the rewards obtained by the equipment at the target, the own side wins; otherwise, the own side does not win.
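The reward rule above may be illustrated with the following Python sketch. The numeric values assigned to r₁, r₂, and r₃, the event encoding `(kind, acting agent, affected agent)`, and the names `apply_events` and `own_side_wins` are assumptions for illustration; the original text does not specify them. The sketch also applies the same rule to events initiated by either side, which the text describes only from the own side's point of view.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, Set, Tuple

# Assumed magnitudes for r1, r2 and r3; the text only names the symbols.
REWARDS = {"detect": 1.0, "penetrate": 2.0, "strike": 3.0}

# An event is (kind, acting agent, affected agent); the encoding is illustrative.
Event = Tuple[str, str, str]


def apply_events(events: Iterable[Event]) -> Tuple[DefaultDict[str, float], Set[str]]:
    """Accumulate per-agent rewards and collect the agents taken out of service."""
    rewards: DefaultDict[str, float] = defaultdict(float)
    retired: Set[str] = set()
    for kind, actor, affected in events:
        if actor in retired:
            continue                    # a defeated agent can no longer act
        r = REWARDS[kind]
        rewards[actor] += r             # the acting agent gains r
        rewards[affected] -= r          # the affected agent loses r
        if kind == "strike":
            retired.add(affected)       # a struck target ends its service
    return rewards, retired


def own_side_wins(rewards, own_agents, target_agents) -> bool:
    """The own side wins only if its reward sum is strictly greater."""
    own_total = sum(rewards[a] for a in own_agents)
    target_total = sum(rewards[t] for t in target_agents)
    return own_total > target_total
```

Because every gain is matched by an equal loss on the other side, the two reward sums always add up to zero in this sketch, so a tie means that neither side has a net advantage.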

The game between the agents of the two sides is a process in which different agents each perform a series of actions within one time period and one space. There are currently two decision plans: the aircraft group plan and the ground control station plan. For the game between the two sides, these two plans are two pieces of prior information. Each piece of prior information is first input separately into the improved multi-agent deep deterministic policy gradient algorithm, and whether the own side wins is observed. The two pieces of prior information are then input together into the improved algorithm; where they conflict, the conflicting part is cut out, and whether the own side wins is again observed. Finally, the top few instruction sequences with which the own side wins and obtains the most reward are selected.

Step 207: Determine whether the target instruction sequence is the preset sequence. If so, execute step 208; if not, execute step 209.

In this embodiment of the present invention, the preset sequence indicates that the target instruction sequence is not an optimal instruction sequence; the preset sequence may also be an empty set. This step determines whether the target instruction sequence, obtained by selecting instruction sequences for the aircraft group dataset and the ground control station dataset with the preset improved multi-agent deep deterministic policy gradient algorithm, is the optimal instruction sequence.

Step 208: Use the aircraft group plan as the aircraft group collaborative decision plan corresponding to the emergency data.

In this embodiment of the present invention, in the extreme case where neither the aircraft group nor the ground control station finds an optimal instruction sequence, the aircraft group plan is used as the optimal instruction sequence. A series of events can be extracted from the aircraft group plan, and these events correspond one-to-one with instructions, so the aircraft group plan is in essence an instruction sequence.

Step 209: Use the target instruction sequence as the aircraft group collaborative decision plan corresponding to the emergency data.

In this embodiment of the present invention, because the aircraft group searches for the optimal instruction sequence based on the high-priority decision criteria, the aircraft group executes its own optimal instruction sequence whenever it can find one. Only when the aircraft group cannot find an optimal instruction sequence is the optimal instruction sequence found by the ground control station executed.

Step 210: When the plan comparison result is consistent, use the aircraft group plan in the aircraft group dataset as the aircraft group collaborative decision plan corresponding to the emergency data.

In this embodiment of the present invention, the specific implementation of step 210 is similar to that of step 105 and is not repeated here.

In this embodiment of the present invention, as shown in Figure 6, the inputs are the ground control station plan, the ground control station data, the aircraft group plan, and the aircraft group data, and the output is the optimal instruction sequence. Specifically, the density peak clustering algorithm is improved and used to generate event pseudo-labels, an existing dynamic multi-pooling convolutional neural network model is fine-tuned, events are extracted from the relevant data, and the events are mapped to instructions. Based on the event extraction results, it is determined whether the ground control station plan and the aircraft group plan are consistent. If the two plans are inconsistent, it is determined whether the aircraft group is authorized to act on its own. If the aircraft group is not authorized to act on its own, the aircraft group and the ground control station each call the instruction acquisition algorithm to obtain an optimal instruction sequence. If the aircraft group finds an optimal instruction sequence I₁ based on the high-priority decision criteria, I₁ guides the actions of the aircraft group. If the aircraft group does not find an optimal instruction sequence but the ground control station finds an optimal instruction sequence I₂, I₂ guides the actions of the aircraft group. If the ground control station does not find an optimal instruction sequence either, the aircraft group plan is used as the optimal instruction sequence. If the aircraft group is authorized to act on its own and I₁ or I₂ has been found within the set time, I₁ or I₂ is executed; otherwise, the aircraft group plan is used as the optimal instruction sequence. When the aircraft group plan is consistent with the ground control station plan, the aircraft group plan is used as the optimal instruction sequence.
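The branching described above may be summarized by the following Python sketch, offered as one possible reading rather than a definitive implementation. The function name `collaborative_decision` and its parameters are assumptions introduced here. `i1` and `i2` stand for I₁ and I₂ and are `None` when the corresponding search found nothing. The text does not state which of I₁ and I₂ takes precedence in the authorized case; the sketch assumes that I₁ is preferred, as in the unauthorized case.

```python
from typing import List, Optional


def collaborative_decision(
    plans_consistent: bool,
    authorized: bool,
    aircraft_plan: List[str],
    i1: Optional[List[str]] = None,
    i2: Optional[List[str]] = None,
    found_in_time: bool = True,
) -> List[str]:
    """Pick the instruction sequence that guides the aircraft group."""
    if plans_consistent:
        return aircraft_plan        # consistent plans: use the aircraft group plan
    if authorized and not found_in_time:
        return aircraft_plan        # authorized, but nothing found within the set time
    if i1 is not None:
        return i1                   # found by the aircraft group (high-priority criteria)
    if i2 is not None:
        return i2                   # found by the ground control station
    return aircraft_plan            # neither side found an optimal sequence
```

In every branch the aircraft group plan serves as the fallback, which matches the handling of the preset sequence in steps 207 and 208.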

Because real scenarios in which an aircraft group encounters an emergency are complex, uncertain, and short of data samples, the types of emergencies, the processing of data, the manner of collaborative decision-making, and the allocation and use of computing resources have been studied in detail. Event pseudo-labels are generated by the preset density peak clustering algorithm, which solves the problem of having no event labels. The CNN-based event extraction model is fine-tuned on the event pseudo-labels and then used to extract events, so that the overall situation of the flight environment is grasped quickly. The ground control station decides whether the aircraft group may act on its own, and a corresponding plan for handling the emergency is formulated in either case. Having the aircraft group process the high-priority decision criteria and the ground control station process the low-priority decision criteria makes reasonable use of the computing resources of both. The instruction sequence is closely related to the flight route and task execution of the aircraft group. The improved multi-agent deep deterministic policy gradient algorithm is used to search for the instruction sequence so that, as far as possible, the aircraft group avoids threats, completes its tasks, and handles the emergency successfully. Handling emergencies faced by the aircraft group with the proposed collaborative decision-making method can improve the accuracy and efficiency with which the flight scenario is grasped and can provide a scientific basis for the aircraft's decisions in the next phase of action.

As shown in Figure 7, the embodiment of the present invention further includes an emergency system that applies the aircraft group collaborative decision generation method provided by the embodiment. The emergency system includes a decision processing module, a data processing module, a flight control module, a sensor module, a database module, and a communication module. In terms of hardware, the emergency system comprises three major components: the aircraft group, the wireless communication link system, and the ground control station. The aircraft group generally carries payloads, that is, equipment carried to accomplish specific tasks. The wireless communication link system transfers data between the aircraft group and the ground control station. The ground control station controls the aircraft group and assigns and coordinates tasks.

The decision processor module executes the improved density peak clustering algorithm. The data processing module provides computing resources for the decision processor and is used to analyze the flight data of the aircraft group in order to assess flight quality or generate decision plans. The flight control module serves the flight management and control system; it is responsible for starting the collaborative decision-making algorithm, carrying out the results computed by the decision processor module, and passing the results to the ground control station through the communication module. The sensor module collects data such as speed, altitude, attitude, acceleration, and angular rate during the flight of the aircraft group. The database module stores the data collected by the sensor module, the execution results of the decision processor module, the commands and captured data of the ground control station, the various data transmitted by the aircraft group, and the data required to maintain normal flight of the aircraft group.

The communication module transmits the various data of the aircraft group to the ground control station, and receives the data that the ground control station sends to the aircraft group as well as the data that the aircraft group sends to the ground control station. The control module conveys control instructions, remotely controls the aircraft group, interacts with the data processing module, analyzes flight data, and generates decision instructions.

Further, the emergency system also includes a human-computer interaction interface module, which generates a visual interface through which an operator can issue instructions or analyze flight data via the ground control station. The communication module is a wireless communication link and is also used to transfer data between the aircraft group and the ground control station. The sensor module stores its data in the database module. When the flight control module detects abnormal data, an emergency has occurred. The flight control module then starts the decision processor module, which executes the collaborative decision-making algorithm. When the computing resources of the aircraft group are insufficient, the aircraft group communicates with the ground control station through the wireless communication link system and completes the collaborative decision-making algorithm with the help of the ground control station. The collaborative decision-making algorithm outputs the optimal instruction sequence, which guides the next phase of the aircraft group's actions.

Please refer to Figure 8, which is a structural block diagram of an aircraft group collaborative decision generation system provided in Embodiment 3 of the present invention.

The aircraft group collaborative decision generation system provided in Embodiment 3 of the present invention includes:

An event pseudo-label set obtaining module 801, used to, when emergency data of the aircraft group is received, use the preset density peak clustering algorithm to compute the event pseudo-labels of the aircraft group dataset corresponding to the emergency data and of the ground control station dataset generated by the ground control station, to obtain an event pseudo-label set.

A target event extraction model obtaining module 802, used to train an initial event extraction model based on the event pseudo-label set to obtain a target event extraction model.

A plan comparison result obtaining module 803, used to extract events from the aircraft group dataset and the ground control station dataset respectively through the target event extraction model and to compare the events, to obtain a plan comparison result.

An aircraft group collaborative decision first obtaining module 804, used to, when the plan comparison result is inconsistent, select the optimal instruction sequence according to the authorization data corresponding to the aircraft group and the preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision plan corresponding to the emergency data.

An aircraft group collaborative decision second obtaining module 805, used to, when the plan comparison result is consistent, use the aircraft group plan in the aircraft group dataset as the aircraft group collaborative decision plan corresponding to the emergency data.

Optionally, the event pseudo-label set obtaining module 801 includes:

An aircraft group dataset and ground control station dataset acquisition module, used to, when emergency data of the aircraft group is received, acquire the aircraft group dataset generated by the aircraft group based on the emergency data and the ground control station dataset generated by the ground control station.

A feature vector set obtaining module, used to extract the feature vectors of the aircraft group dataset and the ground control station dataset through a preset skip-gram model, to obtain a feature vector set.

A density obtaining module, used to compute the density of each point in the feature vector set with a preset density calculation formula, to obtain the densities.

The preset density calculation formula is:

where ρ(x_i) denotes the density of point x_i; k denotes the number of neighbors of a point, that is, the k points closest to it under the Euclidean distance metric; the distance term denotes the Euclidean distance between points x_i and x_j; u denotes the dimension of a point in the feature vector set, that is, the number of features; g is a subscript used to index the individual features; KNN(x_i) denotes the set of the k points in the dataset with the smallest Euclidean distance to point x_i; i denotes the i-th point x_i in the dataset; and j denotes the j-th point x_j in the dataset.

相对密度得到模块，用于将密度代入预设相对密度计算公式计算点的相对密度，得到相对密度。The relative density obtaining module is used to substitute the density into the preset relative density calculation formula to calculate the relative density of the point and obtain the relative density.

预设相对密度计算公式为：The preset relative density calculation formula is:

簇中心得到模块，用于选取相对密度大于相对密度对应的密度平均值的点，得到多个簇中心。The cluster center obtaining module is used to select points whose relative density is greater than the density average corresponding to the relative density and obtain multiple cluster centers.
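
The center-selection rule can be sketched in a few lines of Python. The sketch reads "the density average corresponding to the relative density" as the mean of all relative densities, which is an interpretation rather than something the text states explicitly; `select_cluster_centers` is a hypothetical name.

```python
def select_cluster_centers(relative_density):
    """Return indices of points whose relative density exceeds the mean.

    relative_density: list of floats, one per point in the feature vector set.
    The mean-of-relative-densities threshold is an assumed reading of the text.
    """
    mean = sum(relative_density) / len(relative_density)
    return [i for i, r in enumerate(relative_density) if r > mean]
```

For example, with relative densities `[0.5, 1.5, 0.9, 2.0]` the mean is 1.225, so the points at indices 1 and 3 would be chosen as cluster centers.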

集合得到模块，用于按照簇中心，采用预设域计算公式对特征向量集进行划分，得到多个集合。The set obtaining module is used to divide the feature vector set according to the cluster center and using the preset domain calculation formula to obtain multiple sets.

预设域计算公式为：The preset domain calculation formula is:

$$D_{mn}(x_p)=\{x_q \mid x_q\in MN(x_p)\ \lor\ (x_m\in MN(x_p)\ \land\ x_q\in MN(x_m))\}$$

式中，D_mn(x_p)表示点x_p的域；MN(x_p)表示点x_p所有互邻的集合；MN(x_m)表示点x_m所有互邻的集合；x_p表示特征向量集中第p个点；x_m表示特征向量集中第m个点；x_q表示特征向量集中第q个点。In the formula, D_mn(x_p) denotes the domain of point x_p; MN(x_p) denotes the set of all mutual neighbors of point x_p; MN(x_m) denotes the set of all mutual neighbors of point x_m; x_p denotes the p-th point in the feature vector set; x_m denotes the m-th point in the feature vector set; and x_q denotes the q-th point in the feature vector set.
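
As stated in claim 2, the domain D_mn(x_p) collects the mutual neighbors of x_p together with the mutual neighbors of those neighbors. A minimal Python sketch follows. It assumes that two points are mutual neighbors when each appears in the other's k-nearest-neighbor set, which is the usual definition but is not spelled out in the text; `mutual_neighbors` and `domain` are hypothetical names.

```python
def mutual_neighbors(knn_sets, p):
    """MN(x_p): points q with q in KNN(p) and p in KNN(q) (assumed definition)."""
    return {q for q in knn_sets[p] if p in knn_sets[q]}

def domain(knn_sets, p):
    """D_mn(x_p): mutual neighbors of p plus the mutual neighbors of each of them."""
    mn_p = mutual_neighbors(knn_sets, p)
    result = set(mn_p)
    for m in mn_p:
        result |= mutual_neighbors(knn_sets, m)
    return result

# knn_sets maps each point index to the set of its k nearest neighbors.
knn_sets = {0: {1, 2}, 1: {0, 2}, 2: {1, 3}, 3: {2, 1}}
print(domain(knn_sets, 0))
```

Read literally, the formula places x_p in its own domain whenever it has at least one mutual neighbor, because x_p is a mutual neighbor of that neighbor.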

事件伪标签集得到子模块，用于按照簇中心对应的标签分别对各集合对应的点设置相应的标签，得到事件伪标签集。The event pseudo-label set obtaining sub-module is used to set corresponding labels for the points corresponding to each set according to the labels corresponding to the cluster centers to obtain the event pseudo-label set.

可选地，目标事件抽取模型得到模块802包括：Optionally, the target event extraction model obtaining module 802 includes:

初始损失函数值得到模块，用于将事件伪标签集输入初始事件抽取模型，得到初始损失函数值。The initial loss function value obtaining module is used to input the event pseudo-label set into the initial event extraction model to obtain the initial loss function value.

中间事件抽取模型得到模块，用于按照初始损失函数值，微调初始事件抽取模型，得到中间事件抽取模型。The intermediate event extraction model obtaining module is used to fine-tune the initial event extraction model according to the initial loss function value to obtain an intermediate event extraction model.

目标损失函数值得到模块，用于将事件伪标签集输入中间事件抽取模型，得到目标损失函数值。The target loss function value obtaining module is used to input the event pseudo-label set into the intermediate event extraction model to obtain the target loss function value.

目标损失函数值判断模块，用于判断目标损失函数值是否为预设函数值。The target loss function value judgment module is used to judge whether the target loss function value is the preset function value.

目标事件抽取模型得到子模块，用于若是，则将目标损失函数值对应的中间事件抽取模型作为目标事件抽取模型。The target event extraction model obtaining sub-module is used to, if so, take the intermediate event extraction model corresponding to the target loss function value as the target event extraction model.

跳转执行模块，用于若否，则将中间事件抽取模型作为初始事件抽取模型，并跳转执行将事件伪标签集输入初始事件抽取模型，得到初始损失函数值的步骤。The jump execution module is used to, if not, use the intermediate event extraction model as the initial event extraction model, and jump to the step of inputting the event pseudo-label set into the initial event extraction model to obtain the initial loss function value.
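
The fine-tuning cycle performed by these sub-modules can be sketched as a loop. In this minimal Python sketch, `loss_fn` and `update_fn` are hypothetical callables standing in for the event extraction model's loss computation and fine-tuning step. The stopping test `loss <= target_loss` is an assumed reading of "is the preset function value", and `max_rounds` is a safety cap that the text does not mention.

```python
def fine_tune_until(model, pseudo_labels, loss_fn, update_fn, target_loss, max_rounds=100):
    """Repeat fine-tuning until the loss reaches the preset value.

    loss_fn(model, data) -> float and update_fn(model, data, loss) -> model
    are hypothetical; max_rounds is an added safety cap, not part of the source.
    """
    loss = loss_fn(model, pseudo_labels)               # initial loss function value
    for _ in range(max_rounds):
        model = update_fn(model, pseudo_labels, loss)  # intermediate model
        loss = loss_fn(model, pseudo_labels)           # target loss function value
        if loss <= target_loss:                        # assumed reading of the preset-value test
            break
        # Otherwise the intermediate model becomes the new initial model.
    return model, loss
```

The "jump back" step in the text corresponds to the next loop iteration: the intermediate model is reused as the starting model and its loss drives the next update.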

可选地，方案比对结果得到模块803包括：Optionally, the solution comparison result obtaining module 803 includes:

飞行器群事件和地面事件得到模块，用于通过目标事件抽取模型分别对飞行器群数据集和地面控制站数据集进行事件抽取，得到飞行器群事件和地面事件。The aircraft group event and ground event obtaining module is used to extract events from the aircraft group data set and the ground control station data set respectively through the target event extraction model to obtain the aircraft group event and ground event.

事件比对结果得到模块，用于按照触发词类别和参数相似度，将飞行器群事件和地面事件进行比对，得到事件比对结果。The event comparison result obtaining module is used to compare aircraft group events and ground events according to trigger word categories and parameter similarities, and obtain event comparison results.

方案比对结果得到子模块，用于采用全部事件比对结果，构建方案比对结果。The plan comparison result obtaining sub-module is used to construct the plan comparison result from all of the event comparison results.
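
A minimal Python sketch of this comparison is given below. The text does not say how events are paired or how parameter similarity is measured, so the sketch assumes that events are paired by position, that similarity is the Jaccard overlap of the argument sets, and that a threshold of 0.8 counts as a match. All names (`argument_similarity`, `compare_events`, `plans_consistent`) and the event dictionary layout are hypothetical.

```python
def argument_similarity(args_a, args_b):
    """Jaccard overlap of two argument lists (an assumed similarity measure)."""
    a, b = set(args_a), set(args_b)
    return len(a & b) / len(a | b) if a | b else 1.0

def compare_events(fleet_events, ground_events, threshold=0.8):
    """Compare events pairwise by trigger category and argument similarity."""
    results = []
    for fe, ge in zip(fleet_events, ground_events):
        same_trigger = fe["trigger"] == ge["trigger"]
        similar = argument_similarity(fe["args"], ge["args"]) >= threshold
        results.append(same_trigger and similar)
    return results

def plans_consistent(fleet_events, ground_events, threshold=0.8):
    """The plan comparison result is consistent only if every event pair matches."""
    if len(fleet_events) != len(ground_events):
        return False
    return all(compare_events(fleet_events, ground_events, threshold))
```

Treating a mismatch in the number of events as an inconsistency is also an assumption; the source only says that the plan comparison result is built from all event comparison results.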

可选地，飞行器群协同决策第一得到模块804包括：Optionally, the aircraft group collaborative decision-making first obtaining module 804 includes:

授权数据判断模块，用于当方案比对结果为不一致时，判断飞行器群对应的授权数据是否为授权。The authorization data judgment module is used to judge whether the authorization data corresponding to the aircraft group is authorized when the plan comparison results are inconsistent.

飞行器群协同决策得到第一子模块，用于若是，则根据预设时间内的指令序列和飞行器群方案，确定突发状况数据对应的飞行器群协同决策方案。The aircraft group collaborative decision-making first obtaining sub-module is used to, if so, determine the aircraft group collaborative decision-making plan corresponding to the emergency situation data based on the instruction sequence within the preset time and the aircraft group plan.

目标指令序列得到模块，用于若否，则按照预设改进的多智能体深度确定性策略梯度算法对飞行器群数据集和地面控制站数据集进行指令序列选取，得到目标指令序列。The target instruction sequence obtaining module is used to, if not, select an instruction sequence from the aircraft group data set and the ground control station data set according to the preset improved multi-agent deep deterministic policy gradient algorithm to obtain the target instruction sequence.

目标指令序列判断模块，用于判断目标指令序列是否为预设序列。The target instruction sequence judgment module is used to judge whether the target instruction sequence is a preset sequence.

飞行器群协同决策得到第二子模块，用于若是，则将飞行器群方案作为突发状况数据对应的飞行器群协同决策方案。The aircraft group collaborative decision-making second obtaining sub-module is used to, if so, take the aircraft group plan as the aircraft group collaborative decision-making plan corresponding to the emergency situation data.

飞行器群协同决策得到第三子模块，用于若否，则将目标指令序列作为突发状况数据对应的飞行器群协同决策方案。The aircraft group collaborative decision-making third obtaining sub-module is used to, if not, take the target instruction sequence as the aircraft group collaborative decision-making plan corresponding to the emergency situation data.

可选地，飞行器群协同决策得到第一子模块可以执行以下步骤：Optionally, the aircraft group collaborative decision-making first obtaining sub-module can perform the following steps:

判断预设时间内飞行器群和地面控制站是否找到最优指令序列；Determine whether the aircraft group and ground control station have found the optimal command sequence within the preset time;

若是，则将最优指令序列作为突发状况数据对应的飞行器群协同决策方案；If so, the optimal command sequence will be used as the aircraft group collaborative decision-making solution corresponding to the emergency situation data;

若否，则将飞行器群方案作为突发状况数据对应的飞行器群协同决策方案。If not, the aircraft group plan is used as the aircraft group collaborative decision-making plan corresponding to the emergency data.
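
The branching performed by modules 804 and 805 can be summarized in a short Python sketch. The two callables are hypothetical placeholders: `find_optimal_within_timeout` is assumed to return `None` when no optimal instruction sequence is found within the preset time, and `select_target_sequence` stands in for the improved multi-agent deep deterministic policy gradient selection, which is not implemented here.

```python
def collaborative_decision(consistent, authorized, fleet_plan,
                           find_optimal_within_timeout,
                           select_target_sequence,
                           preset_sequence):
    """Return the collaborative decision plan for one emergency situation."""
    if consistent:
        # Plan comparison result is consistent: keep the aircraft group plan.
        return fleet_plan
    if authorized:
        # Authorized branch: use the optimal sequence if found in time.
        optimal = find_optimal_within_timeout()
        return optimal if optimal is not None else fleet_plan
    # Unauthorized branch: run the instruction sequence selection.
    target = select_target_sequence()
    # The preset sequence signals that nothing better than the fleet plan was found.
    return fleet_plan if target == preset_sequence else target
```

In every branch the aircraft group plan is the fallback, so the function always returns some plan even when neither side finds an instruction sequence.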

可选地，目标指令序列得到模块可以执行以下步骤：Optionally, the target instruction sequence obtaining module can perform the following steps:

采用飞行器群数据集和地面控制站数据集对应的决策准则，构建初始决策准则集合；Use the decision criteria corresponding to the aircraft group data set and the ground control station data set to construct an initial set of decision criteria;

选取初始决策准则集合中优先级最高的决策准则，得到优先决策准则；Select the decision criterion with the highest priority in the initial decision criterion set to obtain the priority decision criterion;

按照优先决策准则，采用指令获取算法分别对飞行器群数据集和地面控制站数据集对应的指令赋予预设权重，得到指令序列集；According to the priority decision-making criterion, the instruction acquisition algorithm is used to assign preset weights to the instructions corresponding to the aircraft group data set and the ground control station data set, respectively, to obtain the instruction sequence set;

采用预设改进的多智能体深度确定性策略梯度算法选取指令序列集中满足预设得分阈值的多个初始指令序列，构建初始指令队列；The preset improved multi-agent deep deterministic policy gradient algorithm is used to select multiple initial instruction sequences that meet the preset score threshold in the instruction sequence set and build the initial instruction queue;

选取初始指令队列中得分最高的初始指令序列作为中间指令序列；Select the initial instruction sequence with the highest score in the initial instruction queue as the intermediate instruction sequence;

根据飞行器群在自身约束条件和中间指令序列，确定目标指令序列。The target command sequence is determined based on the aircraft group's own constraints and intermediate command sequences.

可选地，目标指令序列得到模块还可以执行以下步骤：Optionally, the target instruction sequence obtaining module can also perform the following steps:

判断飞行器群在自身约束条件下是否可以执行中间指令序列；Determine whether the aircraft group can execute the intermediate command sequence under its own constraints;

若是，则将中间指令序列作为目标指令序列；If so, use the intermediate instruction sequence as the target instruction sequence;

若否，则将中间指令序列从初始指令队列中删除，得到中间指令队列；If not, delete the intermediate instruction sequence from the initial instruction queue to obtain the intermediate instruction queue;

判断中间指令队列是否为空集；Determine whether the intermediate instruction queue is an empty set;

若是，则将优先决策准则从初始决策准则集合中删除，得到目标决策准则集合；If so, delete the priority decision criteria from the initial decision criteria set to obtain the target decision criteria set;

判断目标决策准则集合是否为空集；Determine whether the target decision criteria set is an empty set;

若是，则将预设序列作为目标指令序列；If so, use the preset sequence as the target instruction sequence;

若否，则将目标决策准则集合作为初始决策准则集合，并跳转执行选取初始决策准则集合中优先级最高的决策准则，得到优先决策准则的步骤；If not, use the target decision criterion set as the initial decision criterion set, and jump to the step of selecting the decision criterion with the highest priority in the initial decision criterion set to obtain the priority decision criterion;

若否，则将中间指令队列作为初始指令队列，并跳转执行选取初始指令队列中得分最高的初始指令序列作为中间指令序列的步骤。If not, use the intermediate instruction queue as the initial instruction queue, and jump to the step of selecting the initial instruction sequence with the highest score in the initial instruction queue as the intermediate instruction sequence.
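
The two nested fallback cycles described above (over instruction sequences within a queue, then over decision criteria) can be written as ordinary loops instead of jumps. The following Python sketch makes several assumptions: `build_sequences`, `score`, and `feasible` are hypothetical callables, `score` stands in for the score assigned by the improved multi-agent deep deterministic policy gradient algorithm, and a larger priority number is taken to mean a higher priority.

```python
def select_target_sequence(criteria, build_sequences, score, feasible,
                           score_threshold, preset_sequence):
    """Pick the first feasible, highest-scoring instruction sequence.

    criteria: list of (priority, criterion) pairs; larger priority is tried
    first (an assumption). Falls back to preset_sequence if nothing qualifies.
    """
    ordered = sorted(criteria, key=lambda c: c[0], reverse=True)
    for _, criterion in ordered:                      # priority decision criterion
        candidates = build_sequences(criterion)       # weighted instruction sequence set
        queue = [s for s in candidates if score(s) >= score_threshold]
        queue.sort(key=score, reverse=True)           # initial instruction queue
        for seq in queue:                             # intermediate instruction sequence
            if feasible(seq):                         # aircraft group's own constraints
                return seq
        # Queue exhausted: drop this criterion and try the next one.
    return preset_sequence                            # criterion set is empty
```

Iterating over the sorted queue is equivalent to repeatedly removing the highest-scoring infeasible sequence, and iterating over the sorted criteria is equivalent to repeatedly removing the exhausted priority criterion.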

本发明实施例还提供了一种电子设备，电子设备包括：存储器及处理器，存储器中储存有计算机程序；计算机程序被处理器执行时，使得处理器执行如上述任一实施例的基于飞行器群协同决策生成方法。Embodiments of the present invention also provide an electronic device. The electronic device includes a memory and a processor, and a computer program is stored in the memory; when the computer program is executed by the processor, it causes the processor to perform the aircraft group collaborative decision-making generation method of any of the above embodiments.

存储器可以是诸如闪存、EEPROM(电可擦除可编程只读存储器)、EPROM、硬盘或者ROM之类的电子存储器。存储器具有用于执行上述方法中的任何方法步骤的程序代码的存储空间。例如，用于程序代码的存储空间可以包括分别用于实现上面的方法中的各种步骤的各个程序代码。这些程序代码可以从一个或者多个计算机程序产品中读出或者写入到这一个或者多个计算机程序产品中。这些计算机程序产品包括诸如硬盘，紧致盘(CD)、存储卡或者软盘之类的程序代码载体。程序代码可以例如以适当形式进行压缩。这些代码当由计算处理设备运行时，导致该计算处理设备执行上面所描述的基于飞行器群协同决策生成方法中的各个步骤。The memory may be electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk or ROM. The memory has storage space for program code for performing any of the method steps described above. For example, the storage space for the program code may include individual program codes respectively used to implement various steps in the above method. These program codes can be read from or written into one or more computer program products. These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks. The program code may, for example, be compressed in a suitable form. These codes, when run by the computing processing device, cause the computing processing device to perform various steps in the aircraft group-based collaborative decision generation method described above.

所属领域的技术人员可以清楚地了解到，为描述的方便和简洁，上述描述的系统，装置和单元的具体工作过程，可以参考前述方法实施例中的对应过程，在此不再赘述。Those skilled in the art can clearly understand that for the convenience and simplicity of description, the specific working processes of the systems, devices and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be described again here.

在本申请所提供的几个实施例中，应该理解到，所揭露的系统，装置和方法，可以通过其它的方式实现。例如，以上所描述的装置实施例仅仅是示意性的，例如，单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如多个单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口，装置或单元的间接耦合或通信连接，可以是电性，机械或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division of units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.

作为分离部件说明的单元可以是或者也可以不是物理上分开的，作为单元显示的部件可以是或者也可以不是物理单元，即可以位于一个地方，或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。A unit described as a separate component may or may not be physically separate. A component shown as a unit may or may not be a physical unit, that is, it may be located in one place, or it may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

另外，在本发明各个实施例中的各功能单元可以集成在一个处理单元中，也可以是各个单元单独物理存在，也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现，也可以采用软件功能单元的形式实现。In addition, each functional unit in various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit. The above integrated units can be implemented in the form of hardware or software functional units.

集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储介质中。基于这样的理解，本发明的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储介质中，包括若干指令用以使得一台计算机设备(可以是个人计算机，服务器，或者网络设备等)执行本发明各个实施例方法的全部或部分步骤。而前述的存储介质包括：U盘、移动硬盘、只读存储器(ROM，Read-Only Memory)、随机存取存储器(RAM，Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

以上实施例仅用以说明本发明的技术方案，而非对其限制；尽管参照前述实施例对本发明进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本发明各实施例技术方案的精神和范围。The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for generating collaborative decision-making for aircraft groups, which is characterized by including:

When emergency situation data of an aircraft group is received, a preset density peak clustering algorithm is used to calculate the event pseudo-labels corresponding to the aircraft group data set corresponding to the emergency situation data and to the ground control station data set generated by the ground control station, to obtain an event pseudo-label set;

Train an initial event extraction model based on the event pseudo-label set to obtain a target event extraction model;

The target event extraction model is used to extract events from the aircraft group data set and the ground control station data set respectively and perform event comparison to obtain a plan comparison result;

When the plan comparison result is inconsistent, the optimal instruction sequence is selected based on the authorization data corresponding to the aircraft group and a preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision-making plan corresponding to the emergency situation data;

When the plan comparison results are consistent, the aircraft group plan in the aircraft group data set is used as the aircraft group collaborative decision-making plan corresponding to the emergency situation data.

2. The aircraft group collaborative decision-making generation method according to claim 1, characterized in that the step of, when emergency situation data of the aircraft group is received, using a preset density peak clustering algorithm to calculate the event pseudo-labels corresponding to the aircraft group data set corresponding to the emergency situation data and to the ground control station data set generated by the ground control station, to obtain the event pseudo-label set, includes:

When receiving the emergency situation data of the aircraft group, obtain the aircraft group data set generated by the aircraft group based on the emergency situation data and the ground control station data set generated by the ground control station;

Extract the feature vectors of the aircraft group data set and the ground control station data set through a preset skip-gram model to obtain a feature vector set;

Calculate the density corresponding to each point in the feature vector set using a preset density calculation formula to obtain the density;

The preset density calculation formula is:

Where ρ(x_i) denotes the density of point x_i; k denotes the number of neighbors of a point, that is, the k points nearest to the point under the Euclidean distance metric; the distance term denotes the Euclidean distance between point x_i and point x_j; u denotes the dimension of a point in the feature vector set, that is, the number of features; g is a subscript that indexes the features; KNN(x_i) denotes the set of the k points in the data set with the smallest Euclidean distance to point x_i; i indexes the i-th point x_i in the data set; and j indexes the j-th point x_j in the data set;

Substitute the density into the preset relative density calculation formula to calculate the relative density of the point to obtain the relative density;

The preset relative density calculation formula is:

Where r_ρ(x_i) denotes the relative density of point x_i; ρ(x_i) denotes the density of point x_i; ρ(x_j) denotes the density of point x_j; k denotes the number of neighbors of a point, that is, the k points nearest to the point under the Euclidean distance metric; KNN(x_i) denotes the set of the k points in the data set with the smallest Euclidean distance to point x_i; i indexes the i-th point x_i in the data set; and j indexes the j-th point x_j in the data set;

Select points where the relative density is greater than the density average corresponding to the relative density to obtain multiple cluster centers;

According to the cluster center, the feature vector set is divided using a preset domain calculation formula to obtain multiple sets;

The calculation formula of the preset domain is:

$$D_{mn}(x_p)=\{x_q \mid x_q\in MN(x_p)\ \lor\ (x_m\in MN(x_p)\ \land\ x_q\in MN(x_m))\};$$

In the formula, D_mn(x_p) denotes the domain of point x_p; MN(x_p) denotes the set of all mutual neighbors of point x_p; MN(x_m) denotes the set of all mutual neighbors of point x_m; x_p denotes the p-th point in the feature vector set; x_m denotes the m-th point in the feature vector set; and x_q denotes the q-th point in the feature vector set;

According to the labels corresponding to the cluster centers, corresponding labels are set for the points corresponding to each set to obtain an event pseudo-label set.

3. The aircraft group collaborative decision-making generation method according to claim 1, characterized in that the step of training an initial event extraction model based on the event pseudo-label set to obtain a target event extraction model includes:

Input the event pseudo-label set into the initial event extraction model to obtain the initial loss function value;

Fine-tune the initial event extraction model according to the initial loss function value to obtain an intermediate event extraction model;

Input the event pseudo-label set into the intermediate event extraction model to obtain the target loss function value;

Determine whether the target loss function value is a preset function value;

If so, use the intermediate event extraction model corresponding to the target loss function value as the target event extraction model;

If not, use the intermediate event extraction model as the initial event extraction model, and jump to the step of inputting the event pseudo-label set into the initial event extraction model to obtain an initial loss function value.

4. The aircraft group collaborative decision-making generation method according to claim 1, characterized in that the step of using the target event extraction model to extract events from the aircraft group data set and the ground control station data set respectively and performing event comparison to obtain the plan comparison result includes:

Extract events from the aircraft group data set and the ground control station data set respectively through the target event extraction model to obtain aircraft group events and ground events;

Compare the aircraft group event and the ground event according to the trigger word category and parameter similarity to obtain an event comparison result;

Using all the event comparison results, a plan comparison result is constructed.

5. The aircraft group collaborative decision-making generation method according to claim 1, characterized in that the step of, when the plan comparison result is inconsistent, selecting the optimal instruction sequence based on the authorization data corresponding to the aircraft group and the preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision-making plan corresponding to the emergency situation data, includes:

When the plan comparison results are inconsistent, determine whether the authorization data corresponding to the aircraft group is authorized;

If so, determine the aircraft group collaborative decision-making plan corresponding to the emergency situation data based on the command sequence within the preset time and the aircraft group plan;

If not, perform instruction sequence selection on the aircraft group data set and the ground control station data set according to the preset improved multi-agent deep deterministic policy gradient algorithm to obtain the target instruction sequence;

Determine whether the target instruction sequence is a preset sequence;

If so, use the aircraft group plan as the aircraft group collaborative decision-making plan corresponding to the emergency situation data;

If not, the target command sequence is used as the aircraft group collaborative decision-making solution corresponding to the emergency situation data.

6. The aircraft group collaborative decision-making generation method according to claim 5, characterized in that the step of determining the aircraft group collaborative decision-making plan corresponding to the emergency situation data based on the instruction sequence within the preset time and the aircraft group plan includes:

Determine whether the aircraft group and the ground control station have found the optimal command sequence within the preset time;

If so, use the optimal instruction sequence as the aircraft group collaborative decision-making solution corresponding to the emergency situation data;

If not, the aircraft group plan is used as the aircraft group collaborative decision-making plan corresponding to the emergency situation data.

7. The aircraft group collaborative decision-making generation method according to claim 5, characterized in that the step of performing instruction sequence selection on the aircraft group data set and the ground control station data set according to the preset improved multi-agent deep deterministic policy gradient algorithm to obtain the target instruction sequence includes:

Using the decision criteria corresponding to the aircraft group data set and the ground control station data set to construct an initial decision criterion set;

Select the decision criterion with the highest priority in the initial decision criterion set to obtain the priority decision criterion;

According to the priority decision-making criterion, an instruction acquisition algorithm is used to assign preset weights to instructions corresponding to the aircraft group data set and the ground control station data set, respectively, to obtain an instruction sequence set;

Using a preset improved multi-agent deep deterministic policy gradient algorithm to select multiple initial instruction sequences that meet the preset score threshold in the instruction sequence set, and construct an initial instruction queue;

Select the initial instruction sequence with the highest score in the initial instruction queue as the intermediate instruction sequence;

The target instruction sequence is determined based on the aircraft group's own constraints and the intermediate instruction sequence.

8. The aircraft group collaborative decision-making generation method according to claim 7, characterized in that the step of determining the target instruction sequence based on the aircraft group's own constraints and the intermediate instruction sequence includes:

Determine whether the aircraft group can execute the intermediate command sequence under its own constraints;

If so, use the intermediate instruction sequence as the target instruction sequence;

If not, delete the intermediate instruction sequence from the initial instruction queue to obtain an intermediate instruction queue;

Determine whether the intermediate instruction queue is an empty set;

If so, delete the priority decision criterion from the initial decision criterion set to obtain a target decision criterion set;

Determine whether the set of target decision criteria is an empty set;

If so, use the preset sequence as the target instruction sequence;

If not, use the target decision criterion set as the initial decision criterion set, and jump to the step of selecting the decision criterion with the highest priority in the initial decision criterion set to obtain the priority decision criterion;

If not, use the intermediate instruction queue as the initial instruction queue, and jump to the step of selecting the initial instruction sequence with the highest score in the initial instruction queue as the intermediate instruction sequence.

9. An aircraft group collaborative decision-making generation system, characterized by including:

The event pseudo-label set obtaining module is used to, when emergency situation data of an aircraft group is received, use a preset density peak clustering algorithm to calculate the event pseudo-labels corresponding to the aircraft group data set corresponding to the emergency situation data and to the ground control station data set generated by the ground control station, to obtain the event pseudo-label set;

A target event extraction model obtaining module is used to train an initial event extraction model based on the event pseudo-label set to obtain a target event extraction model;

A plan comparison result obtaining module is used to extract events from the aircraft group data set and the ground control station data set respectively through the target event extraction model and perform event comparison to obtain a plan comparison result;

The aircraft group collaborative decision-making first obtaining module is used to, when the plan comparison result is inconsistent, select the optimal instruction sequence based on the authorization data corresponding to the aircraft group and the preset improved multi-agent deep deterministic policy gradient algorithm, to obtain the aircraft group collaborative decision-making plan corresponding to the emergency situation data;

The aircraft group collaborative decision-making second obtaining module is used to, when the plan comparison result is consistent, take the aircraft group plan in the aircraft group data set as the aircraft group collaborative decision-making plan corresponding to the emergency situation data.

10. An electronic device, characterized in that it includes a memory and a processor, a computer program is stored in the memory, and when the computer program is executed by the processor, it causes the processor to perform the steps of the aircraft group collaborative decision-making generation method according to any one of claims 1 to 8.