
CN117636086B - Source-free domain adaptive object detection method and device - Google Patents

Publication number: CN117636086B (granted; earlier publication CN117636086A)
Application number: CN202311332829.7A
Authority: CN (China)
Legal status: Active
Prior art keywords: target, image, targets, feature, model
Inventors: 张璐, 张思琦, 刘智勇, 乔红
Applicant and assignee: Institute of Automation, Chinese Academy of Sciences
Original language: Chinese (zh)

Classifications

    • G06V 10/774 — Image or video recognition using machine learning: generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/761 — Image or video pattern matching: proximity, similarity or dissimilarity measures
    • G06V 10/762 — Image or video recognition using machine learning: clustering, e.g. of similar faces in social networks
    • G06V 10/98 — Detection or correction of errors; evaluation of the quality of the acquired patterns
    • G06V 2201/07 — Indexing scheme: target detection
    • Y02T 10/40 — Engine management systems (climate-change mitigation technologies related to transportation)


Abstract

The present invention provides a source-free domain adaptive object detection method and device, comprising: constructing multiple feature prototypes of each target class based on first instance features of each class extracted by a teacher model from a subset of images of the target-domain dataset; correcting, according to these prototypes, the detection results obtained by the teacher model for each image in the target-domain dataset, to obtain pseudo-labels for each image; and training a student model with each image of the target-domain dataset as a sample and its pseudo-labels as labels, then using the trained student model to detect targets in an image to be detected. The teacher model and the student model are both obtained by pre-training an object detection model on a source-domain dataset. The invention uses multiple per-class feature prototypes in the target domain to guide the generation of more accurate pseudo-labels as supervision for model training, thereby improving the accuracy of object detection.

Description

Source-Free Domain Adaptive Object Detection Method and Device

Technical Field

The present invention relates to the field of computer vision, and in particular to a source-free domain adaptive object detection method and device.

Background

In object detection, when a model trained on a training set is used to detect targets in a new environment, the data distributions of the training set (the source domain) and the test set (the target domain) are inconsistent, so the trained detector performs poorly in the new environment. Collecting and annotating a sufficiently large dataset for the new environment is time-consuming, labor-intensive, and costly.

To solve this problem, source-free domain adaptive object detection was proposed. It transfers the knowledge of a detector trained on the source domain to the target domain and improves the detector's performance on the target domain without annotating the target-domain data or accessing the source-domain data, greatly reducing labeling cost.

Existing source-free domain adaptive object detection methods usually adopt a pseudo-label generation strategy: a detection model pre-trained on the source domain generates pseudo-labels for the target-domain dataset, which serve as supervision for fine-tuning the detector on the target domain. In such methods, pseudo-label quality determines the detector's performance after adaptation. For example, class imbalance introduces noisy pseudo-labels, because common classes receive more attention, which hurts detection of rare classes. Problems such as inaccurate pseudo-labels and the detector's sensitivity to domain shift therefore lead to low detection accuracy.

Summary of the Invention

The present invention provides a source-free domain adaptive object detection method and device to overcome a defect of the prior art, namely that the pseudo-labels generated by the detection model for the target-domain dataset are of poor quality and limit detection accuracy. The invention improves the quality of the pseudo-labels and thereby improves the accuracy of object detection.

The present invention provides a source-free domain adaptive object detection method, comprising:

constructing multiple feature prototypes of each target class based on first instance features of each class extracted by a teacher model from a subset of images of the target-domain dataset;

correcting, according to the multiple feature prototypes of each class, the object detection results obtained by the teacher model for each image in the target-domain dataset, to obtain pseudo-labels for each image;

training a student model with each image of the target-domain dataset as a sample and its pseudo-labels as labels, and using the trained student model to detect targets in an image to be detected;

wherein the teacher model and the student model are both obtained by pre-training an object detection model on a source-domain dataset.

According to the source-free domain adaptive object detection method provided by the present invention, constructing multiple feature prototypes of each target class based on the first instance features of each class extracted by the teacher model from a subset of images of the target-domain dataset comprises:

randomly sampling a subset of images from the target-domain dataset;

detecting, with the teacher model, a first class label and a first bounding box for each target in the sampled images, and taking the image region inside each target's first bounding box as the target's first instance feature;

grouping the targets in the sampled images into classes according to their first class labels;

constructing multiple feature prototypes of each class from the first instance features of the targets in that class.

According to the source-free domain adaptive object detection method provided by the present invention, constructing multiple feature prototypes of each class from the first instance features of that class comprises:

clustering the first instance features of each class several times, using a different number of clusters each time;

computing the silhouette score of each clustering, and taking the cluster centers of the clustering with the highest silhouette score as the feature prototypes of that class.
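The prototype construction step above can be sketched as follows. This is a minimal illustration, assuming one class's instance features arrive as a NumPy array and using scikit-learn's KMeans and silhouette_score; the candidate cluster counts and random seed are illustrative choices, not taken from the patent.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def build_prototypes(features, k_candidates=(2, 3, 4, 5), seed=0):
    """Cluster one class's instance features with several cluster counts and
    keep the cluster centers of the run with the highest silhouette score."""
    best_score, best_centers = -1.0, None
    for k in k_candidates:
        if k >= len(features):  # silhouette score needs k < n_samples
            continue
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(features)
        score = silhouette_score(features, km.labels_)
        if score > best_score:
            best_score, best_centers = score, km.cluster_centers_
    return best_centers  # shape: (best_k, feature_dim)

# Toy features for one class: two well-separated groups in 2-D.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
prototypes = build_prototypes(feats)
print(prototypes.shape)
```

For such well-separated data the silhouette comparison selects two clusters, so two prototypes are kept for the class.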

According to the source-free domain adaptive object detection method provided by the present invention, correcting the object detection results obtained by the teacher model for each image of the target-domain dataset according to the multiple feature prototypes of each class, to obtain pseudo-labels for each image, comprises:

detecting, with the teacher model, for each target in each image of the target-domain dataset, a second class label, a second bounding box, and a confidence score that the target belongs to the second class label;

taking the image region inside each target's second bounding box as the target's second instance feature;

correcting each target's second class label, and its confidence score of belonging to that label, according to the similarities between the target's second instance feature and the multiple feature prototypes of each class;

determining a pseudo-label for each target from its second bounding box, its corrected second class label, and its corrected confidence score.

According to the source-free domain adaptive object detection method provided by the present invention, correcting each target's second class label and confidence score according to the similarities between its second instance feature and the multiple feature prototypes of each class comprises:

determining, for each class, a first maximum among the similarities between the target's second instance feature and that class's multiple feature prototypes;

taking the class label whose first maximum is the largest over all classes as the target's corrected second class label;

taking the largest of the first maxima over all classes as the target's corrected confidence score.
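A sketch of this correction rule, assuming cosine similarity as the similarity measure (the patent does not fix one) and the prototypes stored per class in a dict of NumPy arrays:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def correct_label(instance_feat, prototypes_by_class):
    """For each class, take the max similarity over its prototypes (the
    'first maximum'); the class with the largest first maximum gives the
    corrected label, and that value becomes the corrected confidence score."""
    best_cls, best_sim = None, -1.0
    for cls, protos in prototypes_by_class.items():
        first_max = max(cosine(instance_feat, p) for p in protos)
        if first_max > best_sim:
            best_cls, best_sim = cls, first_max
    return best_cls, best_sim

protos = {
    "car":    [np.array([1.0, 0.0]), np.array([0.9, 0.1])],
    "person": [np.array([0.0, 1.0])],
}
label, score = correct_label(np.array([0.95, 0.05]), protos)
print(label, round(score, 3))
```

The example feature lies close to a "car" prototype, so both the corrected label and the corrected score are taken from that class.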

According to the source-free domain adaptive object detection method provided by the present invention, taking the image region inside each target's second bounding box as the target's second instance feature comprises:

comparing each target's confidence score of belonging to its second class label with a preset threshold, the preset threshold being determined by the number of detection classes preset for the object detection model;

when the confidence score is less than or equal to the preset threshold, taking the image region inside the target's second bounding box as the target's second instance feature.
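The patent only states that the threshold is derived from the number of detection classes. The sketch below uses the uniform-probability heuristic 1/C purely as an assumption to make the gating concrete:

```python
def needs_prototype_correction(confidence, num_classes):
    """Low-confidence detections (score <= threshold) are the ones whose
    box regions become second instance features and are compared against
    the prototypes. The threshold 1/num_classes is an assumption; the
    patent does not give the exact formula."""
    threshold = 1.0 / num_classes
    return confidence <= threshold

print(needs_prototype_correction(0.10, 8))  # 0.10 <= 1/8 = 0.125
print(needs_prototype_correction(0.60, 8))
```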

According to the source-free domain adaptive object detection method provided by the present invention, training the student model with each image of the target-domain dataset as a sample and its pseudo-labels as labels comprises:

detecting, with the student model, for each target in each image of the target-domain dataset, a third class label, a third bounding box, and a confidence score that the target belongs to the third class label;

taking the image region inside each target's third bounding box as the target's third instance feature;

determining a first loss for each target from the loss between its third class label and its corrected second class label, and the loss between its third bounding box and its second bounding box;

determining a second loss for each target from the similarities between its second and third instance features and the multiple feature prototypes of each class;

determining a third loss for each target from its confidence score of belonging to the third class label and its corrected confidence score;

training the student model on the first loss, the second loss, and the third loss.

According to the source-free domain adaptive object detection method provided by the present invention, determining the second loss of each target from the similarities between its second and third instance features and the multiple feature prototypes of each class comprises:

determining a first maximum among the similarities between the target's second instance feature and the multiple feature prototypes of each class;

determining a second maximum among the similarities between the target's third instance feature and the multiple feature prototypes of each class;

determining the target's second loss from the first maximum and the second maximum.
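The patent leaves the exact form of the second loss open. One plausible instantiation, shown only as an assumption, penalizes the squared gap between the teacher-side and student-side maxima so that the student's instance features stay as close to the class prototypes as the teacher's:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def second_loss(feat_teacher, feat_student, prototypes):
    """first/second maximum = max similarity of the teacher/student
    instance feature to the class prototypes. The squared-difference
    loss form is an assumption, not taken from the patent."""
    first_max = max(cosine(feat_teacher, p) for p in prototypes)
    second_max = max(cosine(feat_student, p) for p in prototypes)
    return (first_max - second_max) ** 2

protos = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
loss = second_loss(np.array([1.0, 0.0]), np.array([1.0, 0.0]), protos)
print(loss)  # identical teacher/student features -> zero loss
```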

According to the source-free domain adaptive object detection method provided by the present invention, while the student model is being trained with each image of the target-domain dataset as a sample and its pseudo-labels as labels, the method further comprises:

determining new cluster centers for each class from the second instance features newly produced every preset number of iterations of the training;

selecting, from each class's feature prototypes, the prototype most similar to that class's new cluster center;

updating the selected prototype of each class with that class's new cluster center.
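A sketch of this online prototype refresh for one class. Both the similarity measure (Euclidean distance) and the update rule (overwriting the selected prototype outright rather than blending) are assumptions; the patent only says the most similar prototype is selected and updated with the new cluster center.

```python
import numpy as np

def refresh_prototypes(prototypes, new_center):
    """Find the prototype nearest to the new cluster center (Euclidean
    distance, an assumption) and overwrite it with the new center."""
    dists = np.linalg.norm(prototypes - new_center, axis=1)
    idx = int(np.argmin(dists))
    prototypes[idx] = new_center
    return prototypes, idx

protos = np.array([[0.0, 0.0], [5.0, 5.0]])
protos, idx = refresh_prototypes(protos, np.array([4.5, 5.2]))
print(idx)  # the new center is nearest to the second prototype
```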

The present invention also provides a source-free domain adaptive object detection device, comprising:

a construction module configured to construct multiple feature prototypes of each target class based on first instance features of each class extracted by a teacher model from a subset of images of the target-domain dataset;

a generation module configured to correct the object detection results obtained by the teacher model for each image in the target-domain dataset according to the multiple feature prototypes of each class, to obtain pseudo-labels for each image;

a training module configured to train the student model with each image of the target-domain dataset as a sample and its pseudo-labels as labels, and to use the trained student model to detect targets in an image to be detected.

The present invention also provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements any of the source-free domain adaptive object detection methods described above.

The present invention also provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements any of the source-free domain adaptive object detection methods described above.

The present invention also provides a computer program product comprising a computer program which, when executed by a processor, implements any of the source-free domain adaptive object detection methods described above.

In the source-free domain adaptive object detection method and device provided by the present invention, multiple feature prototypes of each target class are constructed from the first instance features that the teacher model extracts from a subset of images of the target-domain dataset, providing more representative class information for the target domain. After these prototypes are used to correct the teacher model's detection results on each image of the target-domain dataset, more accurate pseudo-labels are obtained as supervision for training the student model, which improves the accuracy of object detection.

Brief Description of the Drawings

To illustrate the technical solutions of the present invention and of the prior art more clearly, the drawings used in describing the embodiments and the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; those of ordinary skill in the art can obtain further drawings from them without inventive effort.

FIG. 1 is a flow chart of the source-free domain adaptive object detection method provided by the present invention;

FIG. 2 is a schematic diagram of the framework of the source-free domain adaptive object detection method provided by the present invention;

FIG. 3 is a schematic diagram of the structure of the source-free domain adaptive object detection device provided by the present invention;

FIG. 4 is a schematic diagram of the structure of the electronic device provided by the present invention.

Detailed Description

To make the purpose, technical solution, and advantages of the present invention clearer, the technical solution is described below clearly and completely in conjunction with the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained from them by those of ordinary skill in the art without inventive effort fall within the scope of protection of the present invention.

A source-free domain adaptive object detection method of the present invention is described below with reference to FIG. 1, and comprises the following steps.

Step 101: construct multiple feature prototypes of each target class based on first instance features of each class extracted by the teacher model from a subset of images of the target-domain dataset.

The target-domain dataset contains Nt images, none of which are labeled. A subset of images is sampled from it.

The sampled images are fed into the teacher model θtea for object detection, which yields each target's first instance feature ft, i.e. the image region inside the target's detection box, together with the target's first class label.

The targets in the sampled images are grouped by their first class labels, and the first instance features of each class are analyzed to obtain multiple feature prototypes per class. A class's multiple feature prototypes represent the features of that class.

Cluster centers generated adaptively from the distribution of each class's first instance features can serve as that class's multiple feature prototypes, providing more representative class information for the target domain. Constructing class-specific multi-prototype sets guides the transfer of knowledge from the source domain to the target domain. This embodiment does not limit how the feature prototypes are constructed.

Step 102: according to the multiple feature prototypes of each class, correct the object detection results obtained by the teacher model for each image in the target-domain dataset to obtain pseudo-labels for each image.

Each image in the target-domain dataset is fed into the teacher model to obtain its object detection results.

Before being fed into the teacher model, each image may be weakly augmented, e.g. by random horizontal flipping; that is, the teacher model receives weakly augmented images.

Because the teacher model is trained on the source-domain dataset, the pseudo-labels it predicts directly for each image are inaccurate. Multiple class-specific feature prototypes from the target-domain dataset are therefore introduced to correct the teacher model's predictions, so that the targets in each image are assigned more accurate pseudo-labels even in the presence of intra-class variation.

More accurate pseudo-labels can be obtained from the distances between the second instance features that the teacher model extracts from each image of the target-domain dataset and the multiple feature prototypes of each class.

Step 103: train the student model with each image of the target-domain dataset as a sample and its pseudo-labels as labels, and use the trained student model to detect targets in an image to be detected. The object detection model trained on the source-domain dataset serves as both the teacher model and the student model.

The source-domain dataset contains multiple images, and each target in each image is annotated with a class label and a bounding box.

The object detection model may be the two-stage detector Faster R-CNN (Faster Region-based Convolutional Neural Network), but is not limited to this model.

The object detection model is trained on the source-domain dataset to obtain an initial detection model θ, which serves as both the initial teacher model θtea and the initial student model θstu. Once this training is complete, the source-domain dataset is no longer used.

Each image of the target-domain dataset is fed as a sample into the student model θstu to obtain the detection results it predicts. The parameters of θstu are adjusted according to the difference between its predictions and the corresponding pseudo-labels, until that difference is smaller than a set value.

The parameters of the teacher model are updated from those of the student model by an exponential moving average (EMA):

θtea = η·θtea + (1-η)·θstu;

where η is the EMA coefficient, which may be set to 0.99.
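The EMA update above can be sketched for a parameter dictionary, with plain Python floats standing in for the model's weight tensors:

```python
def ema_update(teacher, student, eta=0.99):
    """In-place exponential moving average:
    theta_tea = eta * theta_tea + (1 - eta) * theta_stu."""
    for name, w_stu in student.items():
        teacher[name] = eta * teacher[name] + (1.0 - eta) * w_stu
    return teacher

teacher = {"conv.weight": 1.0}
student = {"conv.weight": 0.0}
ema_update(teacher, student)
print(teacher["conv.weight"])  # 0.99 * 1.0 + 0.01 * 0.0
```

With η close to 1, the teacher changes slowly, which smooths the pseudo-label source across training iterations.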

Forward inference then continues with the updated teacher model, and the trained student model can detect targets in the new environment.

Before being fed into the student model, each image of the target-domain dataset may be strongly augmented with one or more of random grayscale, Gaussian blur, and color jitter; that is, the student model receives strongly augmented images. The framework of the source-free domain adaptive object detection method provided by this embodiment is shown in FIG. 2.

Experiments are conducted on multiple domain-adaptive object detection benchmarks, including the Cityscapes, Foggy Cityscapes, KITTI, Sim10k, BDD100k, PASCAL VOC, and Watercolor datasets. The results show that the method of this embodiment outperforms other source-free domain adaptive object detection methods.

By constructing multiple feature prototypes of each class from the first instance features that the teacher model extracts from a subset of images of the target-domain dataset, this embodiment provides more representative class information for the target domain. After these prototypes correct the teacher model's predictions on each image of the target-domain dataset, more accurate pseudo-labels are obtained as supervision for training the student model, improving the accuracy of object detection.

在上述实施例的基础上,本实施例中基于教师模型从目标域数据集的部分图像中提取的各类目标的第一实例特征,构建各类目标的多个特征原型,包括:On the basis of the above embodiment, in this embodiment, based on the first instance features of various types of targets extracted by the teacher model from some images of the target domain data set, multiple feature prototypes of various types of targets are constructed, including:

从目标域数据集中随机抽取部分图像;Randomly extract some images from the target domain dataset;

基于教师模型检测部分图像中各目标的第一类别标签和第一检测框,并将目标的第一检测框内的图像区域作为第一实例特征;Detecting a first category label and a first detection frame of each target in a partial image based on the teacher model, and using an image region within the first detection frame of the target as a first instance feature;

根据各目标的第一类别标签,确定部分图像中属于同一类的目标;According to the first category label of each target, determine the targets belonging to the same category in some images;

根据各类目标的第一实例特征,构建各类目标的多个特征原型。According to the first instance features of each category of targets, multiple feature prototypes of each category of targets are constructed.

从目标域数据集中随机抽取部分图像,如随机采样500张图像。Randomly sample a subset of images from the target domain dataset, e.g., 500 images.

使用教师模型对部分图像中的各图像进行目标检测,得到每张图像中各目标的第一类别标签、第一检测框和各目标属于第一类别标签的置信度分数。The teacher model is used to perform target detection on each image in some images to obtain the first category label, the first detection frame and the confidence score of each target belonging to the first category label for each target in each image.

在确定目标的第一类别标签时,教师模型先预测目标属于每个预设类别标签的置信度分数,将置信度分数最高的预设类别标签作为目标的第一类别标签。When determining the first category label of the target, the teacher model first predicts the confidence score that the target belongs to each preset category label, and takes the preset category label with the highest confidence score as the first category label of the target.

对于抽取的部分图像的每张图像中的每类目标,可仅保留每类目标中置信度分数最高的目标的实例特征,作为具有类别代表性的第一实例特征。For each class of targets in each of the sampled images, only the instance feature of the target with the highest confidence score in that class may be retained, as the class-representative first instance feature.

根据部分图像中具有同一第一类别标签的目标的第一实例特征,构建各类目标的多个特征原型。Multiple feature prototypes of each class of targets are constructed from the first instance features of targets sharing the same first category label in the sampled images.

在上述实施例的基础上,本实施例中根据各类目标的第一实例特征,构建各类目标的多个特征原型,包括:On the basis of the above embodiment, in this embodiment, multiple feature prototypes of various types of targets are constructed according to the first instance features of various types of targets, including:

对各类目标的第一实例特征进行多次聚类,每次聚类的簇数量不同;The first instance features of each type of target are clustered multiple times, with a different number of clusters each time;

确定每次聚类的轮廓分数,将最大的轮廓分数对应的聚类中每个簇的聚类中心作为各类目标的特征原型。Determine the silhouette score of each clustering, and take the cluster center of each cluster in the cluster corresponding to the largest silhouette score as the feature prototype of each type of target.

对第i类目标的第一实例特征分别进行多次聚类。每次聚类将第i类目标的第一实例特征分成多组,使得每组内的第一实例特征之间的区别较小,不同组的第一实例特征之间的区别较大。第一实例特征分成的组数即为簇数量。The first instance features of the i-th category target are clustered multiple times. Each clustering divides the first instance features of the i-th category target into multiple groups, so that the difference between the first instance features in each group is small, and the difference between the first instance features of different groups is large. The number of groups into which the first instance features are divided is the number of clusters.

计算每次聚类的轮廓分数,轮廓分数用于表征聚类的每个簇的紧致度,其值越大越好。可将每次聚类中各簇的轮廓分数的平均值作为每次聚类的轮廓分数S。保留轮廓分数最大的那次聚类,将该次聚类的簇数量n作为第i类目标的特征原型数量。Calculate the silhouette score of each clustering. The silhouette score characterizes the compactness of each cluster; the larger the value, the better. The average of the silhouette scores of the clusters in each clustering can be taken as the silhouette score S of that clustering. The clustering with the largest silhouette score is kept, and its number of clusters n is taken as the number of feature prototypes for the i-th class of targets.

将该次聚类中每个簇的聚类中心作为第i类目标的特征原型:The cluster center of each cluster in this clustering is taken as a feature prototype of the i-th class of targets:

p_i^j = (1/|G_ij|) Σ_{f∈G_ij} f,  j = 1, …, n

其中,p_i^j为第i类目标的第j个特征原型,|G_ij|是属于第i类目标的第j个簇G_ij的第一实例特征的数量。Here, p_i^j is the j-th feature prototype of the i-th class of targets, and |G_ij| is the number of first instance features in the j-th cluster G_ij of the i-th class.

例如,对第i类目标的第一实例特征进行三次聚类,分别将第i类目标的第一实例特征划分成2、3和4个簇。第一次聚类的轮廓分数为0.82,第二次聚类的轮廓分数为0.92,第三次聚类的轮廓分数为0.88,则第i类目标的特征原型数量为第二次聚类中划分的簇数量2。第i类目标的特征原型为第二次聚类中两个簇的聚类中心。For example, the first instance feature of the i-th target is clustered three times, and the first instance feature of the i-th target is divided into 2, 3, and 4 clusters respectively. The silhouette score of the first clustering is 0.82, the silhouette score of the second clustering is 0.92, and the silhouette score of the third clustering is 0.88. Then the number of feature prototypes of the i-th target is the number of clusters divided in the second clustering, 2. The feature prototype of the i-th target is the cluster center of the two clusters in the second clustering.
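上述按轮廓分数选取簇数量并取聚类中心作为原型的过程,可用scikit-learn示意如下(候选簇数量与数据均为示例;sklearn的silhouette_score按样本平均计算轮廓系数,与按簇平均略有差异)。The above selection of the cluster count by silhouette score can be sketched with scikit-learn as follows (candidate cluster counts and data are illustrative; sklearn's silhouette_score averages the silhouette coefficient over samples, slightly different from averaging over clusters):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def build_prototypes(features, candidate_ks=(2, 3, 4)):
    """Cluster one class's instance features several times and keep the
    clustering with the highest silhouette score; return its centers."""
    best_score, best_centers = -1.0, None
    for k in candidate_ks:
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features)
        score = silhouette_score(features, km.labels_)
        if score > best_score:
            best_score, best_centers = score, km.cluster_centers_
    return best_centers, best_score

# toy features with two well-separated groups -> k=2 should win
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 0.1, (20, 4)), rng.normal(5, 0.1, (20, 4))])
protos, score = build_prototypes(feats)
assert len(protos) == 2
```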

在上述实施例的基础上,本实施例中根据各类目标的多个特征原型,对教师模型获取的目标域数据集中各图像的目标检测结果进行纠正,得到各图像的伪标签,包括:On the basis of the above embodiment, in this embodiment, according to multiple feature prototypes of various types of targets, the target detection results of each image in the target domain data set obtained by the teacher model are corrected to obtain pseudo labels of each image, including:

基于教师模型从目标域数据集的各图像中检测各目标的第二类别标签、第二检测框和各目标属于第二类别标签的置信度分数;Based on the teacher model, detect the second category label, the second detection box and the confidence score that each target belongs to the second category label of each target from each image of the target domain dataset;

将各目标的第二检测框内的图像区域作为各目标的第二实例特征;Using the image area within the second detection frame of each target as the second instance feature of each target;

根据各目标的第二实例特征与各类目标的多个特征原型之间的相似度,对各目标的第二类别标签和各目标属于第二类别标签的置信度分数进行纠正;According to the similarity between the second instance feature of each target and multiple feature prototypes of each target, the second category label of each target and the confidence score of each target belonging to the second category label are corrected;

根据各目标的第二检测框、纠正后的第二类别标签和纠正后的置信度分数,确定各目标的伪标签。A pseudo label of each target is determined according to the second detection frame of each target, the corrected second category label and the corrected confidence score.

将目标域数据集中的第i张图像x_t^i输入教师模型,得到教师模型预测的目标检测结果:The i-th image x_t^i in the target domain dataset is input into the teacher model to obtain the target detection result predicted by the teacher model:

(ŷ_t, B_t, P_t) = F(x_t^i)

其中,ŷ_t是预测的目标域数据集中第i张图像中目标的第二类别标签,B_t是预测的目标域数据集中第i张图像中目标的第二检测框,P_t是分类层如softmax输出的目标属于第二类别标签的置信度分数,F是教师模型。Here, ŷ_t is the predicted second category label of the targets in the i-th image of the target domain dataset, B_t is the predicted second detection box, P_t is the confidence score output by the classification layer (e.g., softmax) that the target belongs to the second category label, and F is the teacher model.

由于源域和目标域之间存在域偏移,目标检测结果中包含噪声样本,比如错误分类的正样本。Due to the domain shift between the source domain and the target domain, the object detection results contain noisy samples, such as misclassified positive samples.

为了进一步纠正错误分类的正样本,利用目标的第二实例特征和各类目标的多个特征原型之间的相似度对目标检测结果进行纠正,得到各目标的伪标签。In order to further correct the misclassified positive samples, the target detection results are corrected using the similarity between the second instance features of the target and multiple feature prototypes of each type of target, and the pseudo labels of each target are obtained.

在上述实施例的基础上,本实施例中根据各目标的第二实例特征与各类目标的多个特征原型之间的相似度,对各目标的第二类别标签和各目标属于第二类别标签的置信度分数进行纠正,包括:On the basis of the above embodiment, in this embodiment, according to the similarity between the second instance feature of each target and multiple feature prototypes of each type of target, the second category label of each target and the confidence score of each target belonging to the second category label are corrected, including:

确定各目标的第二实例特征与各类目标的多个特征原型之间的相似度中的第一最大值;Determine a first maximum value among similarities between a second instance feature of each target and a plurality of feature prototypes of each class of targets;

将所有类目标对应的第一最大值中的最大值所对应的第一类别标签,作为各目标纠正后的第二类别标签;The first category label corresponding to the maximum among the first maximum values of all classes is taken as the corrected second category label of each target;

将所有类目标对应的第一最大值中的最大值作为各目标纠正后的置信度分数。The maximum among the first maximum values of all classes is taken as the corrected confidence score of each target.

根据各目标的第二实例特征与各类目标的多个特征原型之间的相似度,对教师模型预测的目标检测结果进行纠正和过滤的公式可表示如下:According to the similarity between the second instance feature of each target and the multiple feature prototypes of each class, the formulas for correcting and filtering the target detection results predicted by the teacher model can be expressed as follows:

ŷ_t′ = argmax_i max_n (f_t′ · p_i^n)

P̂_t = exp(max_n (f_t′ · p_{ŷ_t′}^n) / τ) / Σ_{j=1}^{N_c} exp(max_n (f_t′ · p_j^n) / τ)

y_pl = (ŷ_t′, B_t, P̂_t)

其中,f_t′是第二检测框B_t对应的第二实例特征,p_i^n为第i类目标的第n个特征原型,f_t′与p_i^n之间的相似度可用两者的点积表示,N_c为目标的类别总数,p_j^n为第j类目标的第n个特征原型,ŷ_t′是纠正后的第二类别标签,exp表示指数函数,P̂_t是纠正后的置信度分数,τ为设定常数,y_pl为最终得到的伪标签。Here, f_t′ is the second instance feature corresponding to the second detection box B_t, p_i^n is the n-th feature prototype of the i-th class, the similarity between f_t′ and p_i^n can be expressed by their dot product, N_c is the total number of target categories, p_j^n is the n-th feature prototype of the j-th class, ŷ_t′ is the corrected second category label, exp denotes the exponential function, P̂_t is the corrected confidence score, τ is a set constant, and y_pl is the final pseudo label.
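上述基于原型相似度的伪标签纠正可用如下NumPy代码示意(原型、实例特征与温度系数τ的取值均为示例假设)。The prototype-similarity-based pseudo-label correction described above can be sketched in NumPy as follows (prototypes, instance feature and temperature τ are illustrative assumptions):

```python
import numpy as np

def correct_pseudo_label(f, prototypes, tau=0.1):
    """prototypes: dict class_id -> (n_i, d) array of that class's prototypes.
    Corrected label: class whose best-matching prototype is most similar to f.
    Corrected confidence: temperature softmax over per-class best similarities."""
    classes = sorted(prototypes)
    # per-class best dot-product similarity between f and the class prototypes
    s = np.array([float(np.max(prototypes[c] @ f)) for c in classes])
    probs = np.exp(s / tau) / np.exp(s / tau).sum()
    best = int(np.argmax(s))
    return classes[best], float(probs[best])

protos = {0: np.array([[1.0, 0.0], [0.8, 0.2]]),
          1: np.array([[0.0, 1.0]])}
label, conf = correct_pseudo_label(np.array([0.9, 0.1]), protos)
assert label == 0 and conf > 0.5
```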

在上述实施例的基础上,本实施例中将各目标的第二检测框内的图像区域作为各目标的第二实例特征,包括:On the basis of the above embodiment, in this embodiment, the image area within the second detection frame of each target is used as the second instance feature of each target, including:

将各目标属于第二类别标签的置信度分数与预设阈值进行比较,预设阈值根据目标检测模型对应的预设检测类别数目确定;Compare the confidence score of each target belonging to the second category label with a preset threshold, where the preset threshold is determined according to the number of preset detection categories corresponding to the target detection model;

在置信度分数小于等于预设阈值的情况下,将各目标的第二检测框内的图像区域作为各目标的第二实例特征。When the confidence score is less than or equal to a preset threshold, the image area within the second detection frame of each target is used as the second instance feature of each target.

由于源域和目标域之间存在域偏移,目标检测结果中包含噪声样本,比如负样本。为了过滤掉负样本,通过置信度分数进行过滤,预设阈值可取:Due to the domain shift between the source domain and the target domain, the target detection results contain noise samples, such as negative samples. To filter out negative samples, filtering is performed by the confidence score, and the preset threshold can be taken as:

δ = 1 / C

其中,C为目标检测模型对应的预设检测类别数目。目标检测模型在进行目标检测时,输出目标属于每个预设检测类别的置信度分数,将置信度分数最高的预设检测类别作为目标的类别标签。Where C is the number of preset detection categories corresponding to the target detection model. When performing target detection, the target detection model outputs the confidence score that the target belongs to each preset detection category, and takes the preset detection category with the highest confidence score as the category label of the target.

根据过滤掉负样本的目标检测结果,确定目标的第二实例特征。根据目标的第二实例特征对过滤掉负样本的目标检测结果进行进一步纠正。According to the target detection result after filtering out the negative samples, the second instance feature of the target is determined, and the target detection result after filtering out the negative samples is further corrected according to the second instance feature of the target.

在上述实施例基础上,本实施例中将目标域数据集的各图像作为样本,将各图像的伪标签作为标签对学生模型进行训练,包括:Based on the above embodiment, in this embodiment, each image of the target domain data set is used as a sample, and the pseudo label of each image is used as a label to train the student model, including:

基于学生模型从目标域数据集的各图像中检测各目标的第三类别标签、第三检测框和各目标属于第三类别标签的置信度分数;Based on the student model, a third category label, a third detection box, and a confidence score that each target belongs to the third category label are detected from each image of the target domain dataset;

将各目标的第三检测框内的图像区域作为各目标的第三实例特征;Using the image area within the third detection frame of each target as the third instance feature of each target;

根据目标域数据集的各图像中各目标的第三类别标签和纠正后的第二类别标签之间的损失,以及第三检测框和第二检测框之间的损失,确定各目标的第一损失;Determine a first loss for each target according to a loss between a third category label of each target in each image of the target domain dataset and a corrected second category label, and a loss between a third detection frame and a second detection frame;

根据各目标的第二实例特征、第三实例特征与各类目标的多个特征原型之间的相似度,确定各目标的第二损失;Determine the second loss of each target according to the similarity between the second instance feature, the third instance feature of each target and multiple feature prototypes of each type of target;

根据各目标属于第三类别标签的置信度分数和纠正后的置信度分数,确定各目标的第三损失;Determine the third loss of each target based on the confidence score of each target belonging to the third category label and the corrected confidence score;

根据第一损失、第二损失和第三损失,对学生模型进行训练。The student model is trained based on the first loss, the second loss, and the third loss.

对于目标检测的一致性学习策略,目标检测模型通过保持原始图像与风格扰动图像之间的一致性,来降低对域变化的敏感性。现有研究在单一的预测层面应用一致性正则化,对性能提升有限,或者引入辅助分支来构建一致性正则化,增加了模型的参数量。For the consistency learning strategy of object detection, the object detection model reduces the sensitivity to domain changes by maintaining the consistency between the original image and the style perturbation image. Existing studies apply consistency regularization at a single prediction level, which has limited performance improvement, or introduce auxiliary branches to construct consistency regularization, which increases the number of model parameters.

本实施例设计多层次一致性正则化,包括基于原型的一致性正则化和预测一致性正则化,进一步提高了一致性学习在缓解域偏移问题上的性能,并且引入的额外模型参数较少。This embodiment designs multi-level consistency regularization, including prototype-based consistency regularization and prediction consistency regularization, which further improves the performance of consistency learning in alleviating the domain shift problem and introduces fewer additional model parameters.

将目标域数据集中各图像的伪标签作为学生模型的监督信息,构建自训练损失,即第一损失。自训练损失函数构建为:The pseudo labels of each image in the target domain dataset are used as the supervision information of the student model to construct the self-training loss, i.e., the first loss. The self-training loss function is constructed as:

L_self = L_cls + L_reg

其中,L_cls和L_reg分别表示各图像中目标的分类损失和检测框回归损失。分类损失根据学生模型预测的目标的第三类别标签和伪标签中纠正后的第二类别标签得到。检测框回归损失根据学生模型预测的目标的第三检测框和伪标签中的第二检测框得到。Here, L_cls and L_reg denote the classification loss and the detection box regression loss of the targets in each image, respectively. The classification loss is obtained from the third category label predicted by the student model and the corrected second category label in the pseudo label. The detection box regression loss is obtained from the third detection box predicted by the student model and the second detection box in the pseudo label.

通过利用多原型生成的更加准确的伪标签作为学生模型在无标签目标域上的监督信号,有助于提高目标检测器在目标域上的检测性能。By utilizing the more accurate pseudo labels generated by multiple prototypes as supervision signals for the student model in the unlabeled target domain, it helps to improve the detection performance of the object detector in the target domain.

对于基于原型的一致性正则化,可通过利用多类别原型来计算实例特征的类别概率分布实现。具体可根据教师模型提取的第二实例特征与各类目标的多个特征原型之间的相似度,确定第二实例特征的类别概率分布。可根据学生模型提取的第三实例特征与各类目标的多个特征原型之间的相似度,确定第三实例特征的类别概率分布。根据第二实例特征的类别概率分布和第三实例特征的类别概率分布,构建基于原型的一致性损失,即第二损失。For prototype-based consistency regularization, this can be achieved by using multi-class prototypes to compute the category probability distributions of instance features. Specifically, the category probability distribution of the second instance feature can be determined from the similarity between the second instance feature extracted by the teacher model and the multiple feature prototypes of each class, and the category probability distribution of the third instance feature can be determined from the similarity between the third instance feature extracted by the student model and the multiple feature prototypes of each class. The prototype-based consistency loss, i.e., the second loss, is constructed from these two category probability distributions.

根据第二损失进行训练,可最小化教师模型与学生模型提取的实例特征的类别概率分布之间的差异,有效降低目标检测模型对数据领域变化的敏感性。Training according to the second loss can minimize the difference between the category probability distribution of instance features extracted by the teacher model and the student model, effectively reducing the sensitivity of the target detection model to changes in the data domain.

利用教师模型和学生模型预测的置信度分数,构建预测一致性正则化损失,即第三损失。对于目标域数据集中的每张图像,对弱增强图像和强增强图像的预测保持一致性有助于目标检测模型学习更好的可迁移特征。The confidence scores predicted by the teacher model and the student model are used to construct the prediction consistency regularization loss, i.e., the third loss. For each image in the target domain dataset, keeping the predictions of weakly enhanced images and strongly enhanced images consistent helps the object detection model learn better transferable features.

将教师模型和学生模型预测的置信度分数分别表示为P_tea和P_stu,并将预测一致性正则化损失定义为:Denote the confidence scores predicted by the teacher model and the student model as P_tea and P_stu, respectively, and define the prediction consistency regularization loss as:

L_pred = ‖P_tea − P_stu‖²

使用构建的第一损失、第二损失和第三损失,监督学生模型训练。通过如下所示的目标函数来更新学生模型θ_stu的参数:Use the constructed first loss, second loss and third loss to supervise the training of the student model. The parameters of the student model θ_stu are updated through the objective function shown below:

L_total = L_self + L_proto + L_pred

其中,L_self是通过伪标签监督学生模型的自训练损失,L_proto是基于原型的一致性正则化损失,L_pred是预测一致性正则化损失。Here, L_self is the self-training loss of the student model supervised by pseudo labels, L_proto is the prototype-based consistency regularization loss, and L_pred is the prediction consistency regularization loss.
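三项损失的组合可用如下NumPy代码示意(预测一致性此处取置信度分数之间的平方差作为距离度量,各损失数值均为示例假设)。The combination of the three losses can be sketched in NumPy as follows (here the squared difference between confidence scores is assumed as the prediction-consistency distance; all loss values are illustrative assumptions):

```python
import numpy as np

def prediction_consistency_loss(p_tea, p_stu):
    # squared difference between teacher and student confidence scores
    # (the concrete distance is an assumption in this sketch)
    return float(np.sum((p_tea - p_stu) ** 2))

def total_loss(l_self, l_proto, l_pred):
    # overall objective: self-training + prototype consistency + prediction consistency
    return l_self + l_proto + l_pred

l_pred = prediction_consistency_loss(np.array([0.9, 0.1]), np.array([0.8, 0.2]))
loss = total_loss(1.0, 0.05, l_pred)
print(round(loss, 3))  # 1.07
```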

在上述实施例的基础上,本实施例中根据各目标的第二实例特征、第三实例特征与各类目标的多个特征原型之间的相似度,确定各目标的第二损失,包括:On the basis of the above embodiment, in this embodiment, the second loss of each target is determined according to the similarity between the second instance feature and the third instance feature of each target and multiple feature prototypes of each type of target, including:

确定各目标的第二实例特征与各类目标的多个特征原型之间的相似度中的第一最大值;Determine a first maximum value among similarities between a second instance feature of each target and a plurality of feature prototypes of each class of targets;

确定各目标的第三实例特征与各类目标的多个特征原型之间的相似度中的第二最大值;Determine the second maximum value of the similarities between the third instance feature of each target and the plurality of feature prototypes of each class of targets;

根据第一最大值和第二最大值,确定各目标的第二损失。A second loss for each target is determined based on the first maximum value and the second maximum value.

可确定教师模型提取的第二实例特征ft与各类目标的多个特征原型之间的相似度中的第一最大值作为第二实例特征与各类目标的特征原型之间的相似度。确定所有类目标对应的第一最大值的总和,将各类目标对应的第一最大值与所有类目标对应的第一最大值的总和之间的比值,作为第二实例特征的类别概率分布ZteaThe first maximum value among the similarities between the second instance feature f t extracted by the teacher model and the multiple feature prototypes of each class of targets can be determined as the similarity between the second instance feature and the feature prototypes of each class of targets. The sum of the first maximum values corresponding to all class targets is determined, and the ratio between the first maximum value corresponding to each class of targets and the sum of the first maximum values corresponding to all class targets is used as the category probability distribution Z tea of the second instance feature.

可确定学生模型提取的第三实例特征f_s与各类目标的多个特征原型之间的相似度中的第二最大值,作为第三实例特征与各类目标的特征原型之间的相似度。确定所有类目标对应的第二最大值的总和,将各类目标对应的第二最大值与所有类目标对应的第二最大值的总和之间的比值,作为第三实例特征的类别概率分布Z_stu,公式如下:The second maximum value among the similarities between the third instance feature f_s extracted by the student model and the multiple feature prototypes of each class can be determined as the similarity between the third instance feature and the feature prototypes of each class. The sum of the second maximum values over all classes is determined, and the ratio of the second maximum value of each class to this sum is taken as the category probability distribution Z_stu of the third instance feature, as follows:

Z_tea^i = max_n sim(f_t, p_i^n) / Σ_{j=1}^{N_c} max_n sim(f_t, p_j^n)

Z_stu^i = max_n sim(f_s, p_i^n) / Σ_{j=1}^{N_c} max_n sim(f_s, p_j^n)

其中,sim为相似度函数,可为余弦距离。Here, sim is a similarity function, which can be the cosine distance.

通过最小化基于原型的一致性正则化损失,即第二损失L_proto,来迫使Z_stu和Z_tea之间的一致性:Consistency between Z_stu and Z_tea is enforced by minimizing the prototype-based consistency regularization loss, i.e., the second loss L_proto:

L_proto = D_KL(Z_stu ‖ Z_tea)

其中,D_KL是Kullback-Leibler散度,用于衡量两个类别概率分布之间的差异程度。Here, D_KL is the Kullback-Leibler divergence, which measures the degree of difference between two category probability distributions.
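基于原型的一致性正则化可用如下NumPy代码示意:先由各类最大原型相似度归一化得到类别概率分布,再计算KL散度(此处以点积作为相似度,数据为示例假设,并假设相似度非负以便归一化)。The prototype-based consistency regularization can be sketched in NumPy as follows: normalise the per-class best prototype similarities into a class distribution, then compute the KL divergence (dot-product similarity and toy data are assumptions; similarities are assumed non-negative so that normalisation is valid):

```python
import numpy as np

def class_distribution(f, prototypes):
    # per-class best prototype similarity, normalised into a distribution
    s = np.array([float(np.max(P @ f)) for P in prototypes])
    return s / s.sum()

def kl(p, q, eps=1e-8):
    # Kullback-Leibler divergence D_KL(p || q), with eps for numerical safety
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

protos = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])]
z_tea = class_distribution(np.array([0.9, 0.1]), protos)  # teacher feature
z_stu = class_distribution(np.array([0.8, 0.2]), protos)  # student feature
loss = kl(z_stu, z_tea)
assert loss >= 0.0
```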

在上述实施例的基础上,本实施例中在将目标域数据集的各图像作为样本,将各图像的伪标签作为标签对学生模型进行训练的同时,还包括:On the basis of the above embodiment, in this embodiment, while taking each image of the target domain data set as a sample and taking the pseudo label of each image as a label to train the student model, the following is also included:

根据训练的过程中每预设次数的迭代新产生的各目标的第二实例特征,确定各类目标的新聚类中心;Determine new cluster centers of each type of target according to the second instance features of each target newly generated at each preset number of iterations during the training process;

从各类目标的特征原型中选择与各类目标的新聚类中心最相似的特征原型;Select the feature prototype that is most similar to the new cluster center of each type of target from the feature prototypes of each type of target;

使用各类目标的新聚类中心,对选择的各类目标的特征原型进行更新。Using the new cluster centers of each class of targets, the feature prototypes of each class of selected targets are updated.

各类目标的特征原型根据教师模型提取的第二实例特征进行动态更新。可使用记忆库存储教师模型提取的第二实例特征。The feature prototypes of various types of targets are dynamically updated according to the second instance features extracted by the teacher model. The second instance features extracted by the teacher model can be stored in a memory bank.

在学生模型的训练过程中,每次迭代教师模型提取一张图像的第二实例特征生成伪标签。每预设次数,如100次的迭代中,教师模型新产生100张图像中目标的第二实例特征。将新产生的各类目标的第二实例特征的平均值作为各类目标的新聚类中心v_i,i = 1, …, C,其中C表示目标的类别数量,并清空记忆库。During the training of the student model, the teacher model extracts the second instance features of one image in each iteration to generate pseudo labels. Every preset number of iterations, e.g., 100 iterations, the teacher model newly generates the second instance features of the targets in 100 images. The average of the newly generated second instance features of each class is taken as the new cluster center v_i, i = 1, …, C, of that class, where C denotes the number of target categories, and the memory bank is then cleared.

在同一类目标的特征原型中选择与新聚类中心v_i最相似的原型p_i^k,可通过动量策略更新该原型:Among the feature prototypes of the same class, the prototype p_i^k most similar to the new cluster center v_i is selected and can be updated by a momentum strategy:

p_i^k ← α p_i^k + (1 − α) v_i

其中,α是动量系数,可设置为0.99。Here, α is the momentum coefficient and can be set to 0.99.
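上述动量更新可用如下NumPy代码示意(原型与新聚类中心为示例数据,α取0.9以便观察更新效果)。The momentum prototype update above can be sketched in NumPy as follows (toy prototypes and new cluster center; α = 0.9 so the update is visible):

```python
import numpy as np

def update_prototypes(prototypes, new_center, alpha=0.99):
    # select the prototype most similar (dot product) to the new cluster
    # center and move it slightly toward that center
    sims = prototypes @ new_center
    k = int(np.argmax(sims))
    prototypes[k] = alpha * prototypes[k] + (1 - alpha) * new_center
    return prototypes

protos = np.array([[1.0, 0.0], [0.0, 1.0]])
v = np.array([0.9, 0.1])
protos = update_prototypes(protos, v, alpha=0.9)
# nearest prototype [1, 0] moves toward v: 0.9*[1,0] + 0.1*[0.9,0.1] = [0.99, 0.01]
assert np.allclose(protos[0], [0.99, 0.01])
```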

下面对本发明提供的无源域适应目标检测装置进行描述,下文描述的无源域适应目标检测装置与上文描述的无源域适应目标检测方法可相互对应参照。The passive domain adaptive target detection device provided by the present invention is described below. The passive domain adaptive target detection device described below and the passive domain adaptive target detection method described above can be referred to each other.

如图3所示,该装置包括构建模块301、生成模块302和训练模块303,其中:As shown in FIG3 , the device includes a construction module 301, a generation module 302 and a training module 303, wherein:

构建模块301用于基于教师模型从目标域数据集的部分图像中提取的各类目标的第一实例特征,构建各类目标的多个特征原型;The construction module 301 is used to construct multiple feature prototypes of various types of targets based on the first instance features of various types of targets extracted by the teacher model from the partial images of the target domain data set;

生成模块302用于根据各类目标的多个特征原型,对教师模型获取的目标域数据集中各图像的目标检测结果进行纠正,得到各图像的伪标签;The generation module 302 is used to correct the target detection results of each image in the target domain data set obtained by the teacher model according to multiple feature prototypes of various types of targets, and obtain pseudo labels for each image;

训练模块303用于将目标域数据集的各图像作为样本,将各图像的伪标签作为标签对学生模型进行训练,使用训练后的学生模型检测待检测图像中的目标;The training module 303 is used to use each image of the target domain data set as a sample, use the pseudo label of each image as a label to train the student model, and use the trained student model to detect the target in the image to be detected;

教师模型和学生模型通过预先使用源域数据集对目标检测模型进行训练得到。The teacher model and the student model are obtained by pre-training the object detection model using the source domain dataset.

本实施例通过使用教师模型从目标域数据集的部分图像中提取的各类目标的第一实例特征,构建各类目标的多个特征原型,为目标域提供更具代表性的类别信息,使得经各类目标的多个特征原型对教师模型预测的目标域数据集中各图像的目标检测结果进行纠正后,得到更加准确的伪标签作为学生模型训练的监督信息,从而提高目标检测的准确性。This embodiment constructs multiple feature prototypes of each type of target by using the first instance features of each type of target extracted by the teacher model from some images of the target domain dataset, thereby providing more representative category information for the target domain. After the target detection results of each image in the target domain dataset predicted by the teacher model are corrected by the multiple feature prototypes of each type of target, more accurate pseudo-labels are obtained as supervision information for student model training, thereby improving the accuracy of target detection.

图4示例了一种电子设备的实体结构示意图,如图4所示,该电子设备可以包括:处理器(processor)410、通信接口(Communications Interface)420、存储器(memory)430和通信总线440,其中,处理器410,通信接口420,存储器430通过通信总线440完成相互间的通信。处理器410可以调用存储器430中的逻辑指令,以执行无源域适应目标检测方法,该方法包括:基于教师模型从目标域数据集的部分图像中提取的各类目标的第一实例特征,构建各类目标的多个特征原型;根据各类目标的多个特征原型,对教师模型获取的目标域数据集中各图像的目标检测结果进行纠正,得到各图像的伪标签;将目标域数据集的各图像作为样本,将各图像的伪标签作为标签对学生模型进行训练,使用训练后的学生模型检测待检测图像中的目标;教师模型和学生模型通过预先使用源域数据集对目标检测模型进行训练得到。FIG4 illustrates a schematic diagram of the physical structure of an electronic device. As shown in FIG4 , the electronic device may include: a processor 410, a communication interface 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other through the communication bus 440. The processor 410 may call the logic instructions in the memory 430 to execute the passive domain adaptation target detection method, which includes: constructing multiple feature prototypes of each type of target based on the first instance features of each type of target extracted from part of the images of the target domain data set by the teacher model; correcting the target detection results of each image in the target domain data set obtained by the teacher model according to the multiple feature prototypes of each type of target, and obtaining pseudo labels of each image; using each image of the target domain data set as a sample, using the pseudo labels of each image as a label to train the student model, and using the trained student model to detect the target in the image to be detected; the teacher model and the student model are obtained by pre-training the target detection model using the source domain data set.

此外,上述的存储器430中的逻辑指令可以通过软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本发明各个实施例方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。In addition, the logic instructions in the above-mentioned memory 430 can be implemented in the form of a software functional unit and can be stored in a computer-readable storage medium when it is sold or used as an independent product. Based on such an understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art or the part of the technical solution, can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including a number of instructions for a computer device (which can be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of various embodiments of the present invention. The aforementioned storage medium includes: U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), disk or optical disk, etc. Various media that can store program codes.

另一方面,本发明还提供一种计算机程序产品,计算机程序产品包括计算机程序,计算机程序可存储在非暂态计算机可读存储介质上,计算机程序被处理器执行时,计算机能够执行上述各方法所提供的无源域适应目标检测方法,该方法包括:基于教师模型从目标域数据集的部分图像中提取的各类目标的第一实例特征,构建各类目标的多个特征原型;根据各类目标的多个特征原型,对教师模型获取的目标域数据集中各图像的目标检测结果进行纠正,得到各图像的伪标签;将目标域数据集的各图像作为样本,将各图像的伪标签作为标签对学生模型进行训练,使用训练后的学生模型检测待检测图像中的目标;教师模型和学生模型通过预先使用源域数据集对目标检测模型进行训练得到。On the other hand, the present invention also provides a computer program product, which includes a computer program. The computer program can be stored on a non-transitory computer-readable storage medium. When the computer program is executed by a processor, the computer can execute the passive domain adaptive target detection method provided by the above methods, which includes: constructing multiple feature prototypes of each type of target based on the first instance features of each type of target extracted by a teacher model from part of the images of a target domain data set; correcting the target detection results of each image in the target domain data set obtained by the teacher model according to the multiple feature prototypes of each type of target, and obtaining a pseudo label for each image; using each image of the target domain data set as a sample, and using the pseudo label of each image as a label to train a student model, and using the trained student model to detect targets in the image to be detected; the teacher model and the student model are obtained by pre-training the target detection model using the source domain data set.

又一方面,本发明还提供一种非暂态计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现以执行上述各方法提供的无源域适应目标检测方法,该方法包括:基于教师模型从目标域数据集的部分图像中提取的各类目标的第一实例特征,构建各类目标的多个特征原型;根据各类目标的多个特征原型,对教师模型获取的目标域数据集中各图像的目标检测结果进行纠正,得到各图像的伪标签;将目标域数据集的各图像作为样本,将各图像的伪标签作为标签对学生模型进行训练,使用训练后的学生模型检测待检测图像中的目标;教师模型和学生模型通过预先使用源域数据集对目标检测模型进行训练得到。On the other hand, the present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, is implemented to execute the passive domain adaptive target detection method provided by the above-mentioned methods, the method comprising: constructing multiple feature prototypes of each type of target based on the first instance features of each type of target extracted by a teacher model from part of the images of a target domain data set; correcting the target detection results of each image in the target domain data set obtained by the teacher model according to the multiple feature prototypes of each type of target, and obtaining a pseudo label for each image; using each image of the target domain data set as a sample and the pseudo label of each image as a label to train a student model, and using the trained student model to detect targets in the image to be detected; the teacher model and the student model are obtained by pre-training the target detection model using the source domain data set.

以上所描述的装置实施例仅仅是示意性的,其中作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性的劳动的情况下,即可以理解并实施。The device embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, i.e., they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the scheme of this embodiment. Those of ordinary skill in the art may understand and implement it without creative effort.

From the description of the above implementations, those skilled in the art can clearly understand that each implementation may be realized by means of software plus a necessary general-purpose hardware platform, or, of course, by hardware. Based on this understanding, the above technical solution, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the embodiments or of certain parts thereof.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of the technical features therein may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A passive domain adaptive target detection method, comprising:
constructing a plurality of feature prototypes for each class of targets based on first instance features of each class of targets extracted by a teacher model from partial images of a target domain data set;
correcting, according to the plurality of feature prototypes of each class of targets, target detection results obtained by the teacher model for each image in the target domain data set, to obtain a pseudo label for each image;
training a student model by taking each image of the target domain data set as a sample and the pseudo label of each image as a label, and detecting targets in an image to be detected by using the trained student model;
wherein the teacher model and the student model are obtained by training a target detection model in advance using a source domain data set; and wherein constructing the plurality of feature prototypes for each class of targets based on the first instance features extracted by the teacher model from partial images of the target domain data set comprises:
randomly extracting partial images from the target domain data set;
detecting, based on the teacher model, a first class label and a first detection frame for each target in the partial images, and taking the image region within the first detection frame of each target as its first instance feature;
determining, according to the first class labels of the targets, the targets in the partial images that belong to the same class; and
constructing the plurality of feature prototypes for each class of targets according to the first instance features of that class, the plurality of feature prototypes being used to represent features of the respective class of targets.
2. The passive domain adaptive target detection method of claim 1, wherein constructing the plurality of feature prototypes for each class of targets according to the first instance features of that class comprises:
clustering the first instance features of each class of targets a plurality of times, with a different number of clusters each time; and
determining a silhouette score for each clustering, and taking the cluster centers of the clustering with the largest silhouette score as the feature prototypes of that class of targets.
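As a non-authoritative illustration of the clustering step in claim 2 (repeated k-means over several cluster counts, keeping the clustering with the best silhouette score), a minimal NumPy sketch follows. The candidate cluster counts and the deterministic initialisation are assumptions, and the simplified silhouette here scores singleton clusters as 1 (sklearn's convention is 0); it also assumes every clustering yields at least two non-empty clusters:

```python
import numpy as np

def kmeans(feats, k, iters=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [feats[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(feats - c, axis=1) for c in centers], axis=0)
        centers.append(feats[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = np.linalg.norm(feats[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = feats[labels == j].mean(axis=0)
    return centers, labels

def silhouette(feats, labels):
    """Mean silhouette coefficient (simplified: singleton clusters score 1)."""
    scores = []
    for i, x in enumerate(feats):
        same = feats[labels == labels[i]]
        a = np.linalg.norm(same - x, axis=1).sum() / max(len(same) - 1, 1)
        b = min(np.linalg.norm(feats[labels == c] - x, axis=1).mean()
                for c in set(labels.tolist()) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

def build_prototypes(feats, candidate_ks=(2, 3, 4)):
    """Cluster one class's instance features for several cluster counts and
    keep the cluster centers of the clustering with the best silhouette."""
    best_score, best_centers = -2.0, None
    for k in candidate_ks:
        centers, labels = kmeans(feats, k)
        score = silhouette(feats, labels)
        if score > best_score:
            best_score, best_centers = score, centers
    return best_centers
```

On two well-separated groups of instance features, the silhouette criterion selects k = 2 and the returned prototypes are the two group means.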
3. The passive domain adaptive target detection method of claim 1, wherein correcting, according to the plurality of feature prototypes of each class of targets, the target detection results obtained by the teacher model for each image in the target domain data set, to obtain the pseudo label of each image, comprises:
detecting, based on the teacher model, a second class label, a second detection frame, and a confidence score of each target belonging to its second class label from each image of the target domain data set;
taking the image region within the second detection frame of each target as the second instance feature of that target;
correcting the second class label of each target and the confidence score of the target belonging to that label according to the similarities between the second instance feature of the target and the feature prototypes of each class of targets; and
determining the pseudo label of each target according to the second detection frame, the corrected second class label, and the corrected confidence score of the target.
4. The passive domain adaptive target detection method of claim 3, wherein correcting the second class label of each target and the confidence score of the target belonging to that label according to the similarities between the second instance feature of the target and the feature prototypes of each class of targets comprises:
determining, for each class of targets, a first maximum among the similarities between the second instance feature of the target and the plurality of feature prototypes of that class;
taking the class label corresponding to the largest of the first maxima over all classes of targets as the corrected second class label of the target; and
taking the largest of the first maxima over all classes of targets as the corrected confidence score of the target.
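Claims 4 and 5 together suggest the following correction rule for a single low-confidence detection. Cosine similarity and a threshold of 1/num_classes are assumptions — the patent states only that the threshold is derived from the number of preset detection classes, and does not fix the similarity metric:

```python
import numpy as np

def cosine(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def correct_detection(inst_feat, cls_label, score, prototypes, num_classes):
    """Correct one teacher detection against per-class prototype banks.
    `prototypes` maps class id -> list of feature prototypes."""
    threshold = 1.0 / num_classes  # assumed form of the class-count-based threshold
    if score > threshold:
        return cls_label, score    # confident detections are left untouched
    # "first maximum": best similarity to each class's prototypes (claim 4)
    per_class = {c: max(cosine(inst_feat, p) for p in protos)
                 for c, protos in prototypes.items()}
    new_label = max(per_class, key=per_class.get)
    return new_label, per_class[new_label]
```

A detection whose feature sits near another class's prototypes is relabelled to that class and receives the corresponding similarity as its corrected confidence, while high-confidence detections pass through unchanged.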
5. The passive domain adaptive target detection method of claim 3, wherein taking the image region within the second detection frame of each target as the second instance feature of that target comprises:
comparing the confidence score of each target belonging to its second class label with a preset threshold, the preset threshold being determined according to the number of detection classes preset for the target detection model; and
taking the image region within the second detection frame of the target as its second instance feature when the confidence score is less than or equal to the preset threshold.
6. The passive domain adaptive target detection method of claim 3, wherein training the student model by taking each image of the target domain data set as a sample and the pseudo label of each image as a label comprises:
detecting, based on the student model, a third class label, a third detection frame, and a confidence score of each target belonging to its third class label from each image of the target domain data set;
taking the image region within the third detection frame of each target as the third instance feature of that target;
determining a first loss for each target in each image of the target domain data set according to the loss between the third class label of the target and the corrected second class label, and the loss between the third detection frame and the second detection frame;
determining a second loss for each target according to the similarities between the second instance feature and the third instance feature of the target and the plurality of feature prototypes of each class of targets;
determining a third loss for each target according to the confidence score of the target belonging to its third class label and the corrected confidence score; and
training the student model based on the first loss, the second loss, and the third loss.
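A hedged sketch of the three losses in claims 6 and 7 for a single target follows. The concrete loss forms — cross-entropy for classification, L1 for boxes, squared error for the similarity and confidence terms — are assumptions; the claims name only which quantities each loss compares:

```python
import numpy as np

def student_losses(stu_logits, stu_box, stu_score,
                   pl_label, pl_box, pl_score,
                   teacher_sim_max, student_sim_max):
    """Sketch of the three claimed losses for one target (forms assumed)."""
    # first loss: classification (cross-entropy) + box (L1) against the pseudo label
    logits = np.asarray(stu_logits, float)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    first = -np.log(probs[pl_label] + 1e-12) \
            + np.abs(np.asarray(stu_box, float) - np.asarray(pl_box, float)).mean()
    # second loss: keep the student's best prototype similarity consistent
    # with the teacher's (claim 7's first and second maxima)
    second = (teacher_sim_max - student_sim_max) ** 2
    # third loss: align the student's confidence with the corrected confidence
    third = (stu_score - pl_score) ** 2
    return float(first), float(second), float(third)
```

When the student already agrees with the pseudo label and matches the teacher's prototype similarity and confidence, all three terms are at or near zero, which is the sanity check the sketch is built to pass.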
7. The passive domain adaptive target detection method of claim 6, wherein determining the second loss for each target according to the similarities between the second instance feature and the third instance feature of the target and the plurality of feature prototypes comprises:
determining a first maximum among the similarities between the second instance feature of the target and the plurality of feature prototypes of each class of targets;
determining a second maximum among the similarities between the third instance feature of the target and the plurality of feature prototypes of each class of targets; and
determining the second loss of the target according to the first maximum and the second maximum.
8. The passive domain adaptive target detection method of claim 3, wherein training the student model by taking each image of the target domain data set as a sample and the pseudo label of each image as a label further comprises:
determining a new cluster center for each class of targets according to the second instance features of that class newly generated within every preset number of iterations during training;
selecting, from the feature prototypes of each class of targets, the feature prototype most similar to the new cluster center of that class; and
updating the selected feature prototype of each class of targets with the new cluster center of that class.
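The prototype update of claim 8 could look like the following sketch; cosine similarity and direct replacement of the selected prototype are assumptions, since the claim fixes neither the similarity metric nor the update rule:

```python
import numpy as np

def update_prototypes(prototypes, recent_feats):
    """Claim 8 sketch: every preset number of iterations, take the mean of
    the newly collected instance features of one class and overwrite the
    most similar existing prototype of that class with it."""
    new_center = np.asarray(recent_feats, float).mean(axis=0)
    sims = [new_center @ p / (np.linalg.norm(new_center) * np.linalg.norm(p) + 1e-12)
            for p in prototypes]
    prototypes[int(np.argmax(sims))] = new_center
    return prototypes
```

Only the single most similar prototype changes per update, so the rest of the prototype bank keeps representing the other modes of the class.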
9. A passive domain adaptive target detection apparatus, comprising:
a construction module, configured to construct a plurality of feature prototypes for each class of targets based on first instance features of each class of targets extracted by a teacher model from partial images of a target domain data set;
a generation module, configured to correct, according to the plurality of feature prototypes of each class of targets, target detection results obtained by the teacher model for each image in the target domain data set, to obtain a pseudo label for each image;
a training module, configured to train a student model by taking each image of the target domain data set as a sample and the pseudo label of each image as a label, and to detect targets in an image to be detected by using the trained student model;
wherein the teacher model and the student model are obtained by training a target detection model in advance using a source domain data set; and
the construction module is specifically configured to:
randomly extract partial images from the target domain data set;
detect, based on the teacher model, a first class label and a first detection frame for each target in the partial images, and take the image region within the first detection frame of each target as its first instance feature;
determine, according to the first class labels of the targets, the targets in the partial images that belong to the same class; and
construct the plurality of feature prototypes for each class of targets according to the first instance features of that class, the plurality of feature prototypes being used to represent features of the respective class of targets.
CN202311332829.7A 2023-10-13 2023-10-13 Passive domain adaptive target detection method and device Active CN117636086B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311332829.7A CN117636086B (en) 2023-10-13 2023-10-13 Passive domain adaptive target detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311332829.7A CN117636086B (en) 2023-10-13 2023-10-13 Passive domain adaptive target detection method and device

Publications (2)

Publication Number Publication Date
CN117636086A CN117636086A (en) 2024-03-01
CN117636086B true CN117636086B (en) 2024-09-24

Family

ID=90036630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311332829.7A Active CN117636086B (en) 2023-10-13 2023-10-13 Passive domain adaptive target detection method and device

Country Status (1)

Country Link
CN (1) CN117636086B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118736206B (en) * 2024-09-02 2024-12-24 苏州高新区测绘事务所有限公司 Target detection method, device, computer equipment and storage medium
CN119672491B (en) * 2024-12-02 2025-05-27 中国科学院自动化研究所 Training method of unsupervised defect detection model based on continuous learning, defect detection method, device, electronic device, storage medium and computer program product
CN120147606A (en) * 2025-02-08 2025-06-13 清华大学深圳国际研究生院 Passive domain adaptive target detection model generation method and target detection method based on class prototype alignment

Citations (2)

Publication number Priority date Publication date Assignee Title
CN112861616A (en) * 2020-12-31 2021-05-28 电子科技大学 Passive field self-adaptive target detection method
CN113221903A (en) * 2021-05-11 2021-08-06 中国科学院自动化研究所 Cross-domain self-adaptive semantic segmentation method and system

Family Cites Families (10)

Publication number Priority date Publication date Assignee Title
US11100373B1 (en) * 2020-11-02 2021-08-24 DOCBOT, Inc. Autonomous and continuously self-improving learning system
CN115393599A (en) * 2021-05-21 2022-11-25 北京沃东天骏信息技术有限公司 Method, device, electronic equipment and medium for constructing image semantic segmentation model and image processing
US12462538B2 (en) * 2021-11-15 2025-11-04 Nec Corporation Source-free cross domain detection method with strong data augmentation and self-trained mean teacher modeling
CN116524289A (en) * 2022-01-21 2023-08-01 华为技术有限公司 Model training method and related system
CN114581350B (en) * 2022-02-23 2022-11-04 清华大学 A semi-supervised learning method suitable for monocular 3D object detection task
CN114943965B (en) * 2022-05-31 2024-05-10 西北工业大学宁波研究院 Course learning-based self-adaptive remote sensing image semantic segmentation method for unsupervised domain
CN115082757A (en) * 2022-07-13 2022-09-20 北京百度网讯科技有限公司 Pseudo label generation method, target detection model training method and device
CN116433909A (en) * 2023-04-16 2023-07-14 广西大学 Semi-supervised image semantic segmentation method based on similarity weighted multi-teacher network model
CN116310655B (en) * 2023-04-23 2025-07-18 中国人民解放军国防科技大学 Infrared dim small target detection method and device based on semi-supervised hybrid domain adaptation
CN116645512A (en) * 2023-06-01 2023-08-25 航天科工深圳(集团)有限公司 Self-adaptive semantic segmentation method and device under severe conditions

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN112861616A (en) * 2020-12-31 2021-05-28 电子科技大学 Passive field self-adaptive target detection method
CN113221903A (en) * 2021-05-11 2021-08-06 中国科学院自动化研究所 Cross-domain self-adaptive semantic segmentation method and system

Also Published As

Publication number Publication date
CN117636086A (en) 2024-03-01

Similar Documents

Publication Publication Date Title
CN117636086B (en) Passive domain adaptive target detection method and device
CN113326731B (en) Cross-domain pedestrian re-identification method based on momentum network guidance
US11562591B2 (en) Computer vision systems and methods for information extraction from text images using evidence grounding techniques
CN109993236B (en) A Few-Shot Manchurian Matching Method Based on One-shot Siamese Convolutional Neural Network
WO2022121289A1 (en) Methods and systems for mining minority-class data samples for training neural network
CN109086654B (en) Handwriting model training method, text recognition method, device, equipment and medium
CN113378632A (en) Unsupervised domain pedestrian re-identification algorithm based on pseudo label optimization
CN111709242B (en) Chinese punctuation mark adding method based on named entity recognition
US20180144246A1 (en) Neural Network Classifier
CN115270872A (en) Radar radiation source individual small sample learning and identifying method, system, device and medium
CN111259917B (en) Image feature extraction method based on local neighbor component analysis
CN109034280B (en) Handwriting model training method, handwriting character recognition method, device, equipment and medium
CN115761408A (en) A federated domain adaptation method and system based on knowledge distillation
WO2023088174A1 (en) Target detection method and apparatus
CN114781554B (en) Open set identification method and system based on small sample condition
CN113408418A (en) Calligraphy font and character content synchronous identification method and system
WO2019232861A1 (en) Handwriting model training method and apparatus, text recognition method and apparatus, and device and medium
CN113792574B (en) Cross-dataset expression recognition method based on metric learning and teacher student model
CN108985442B (en) Handwriting model training method, handwritten character recognition method, device, equipment and medium
CN113705215A (en) Meta-learning-based large-scale multi-label text classification method
CN104978569B (en) A kind of increment face identification method based on rarefaction representation
CN114693997A (en) Image description generation method, device, equipment and medium based on transfer learning
CN109034279B (en) Handwriting model training method, handwriting character recognition method, device, equipment and medium
CN116469169B (en) Robust incremental behavior recognition model building method and device
CN109101984B (en) Image identification method and device based on convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant