CN117011603B - Image detection method based on improved FCOS network - Google Patents

Image detection method based on improved FCOS network

Info

Publication number
CN117011603B
Authority
CN
China
Prior art keywords
convolution
data set
image
module
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310959453.6A
Other languages
Chinese (zh)
Other versions
CN117011603A (en)
Inventor
王子民
关挺强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN202310959453.6A
Publication of CN117011603A
Application granted
Publication of CN117011603B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses an image detection method based on an improved FCOS network, comprising the following steps: 1) preparing a dataset; 2) annotating the images in the dataset; 3) preprocessing the images; 4) adding the names of the categories used to the initialization file; 5) setting parameters; 6) using the images preprocessed in step 3) as the input to the network model; 7) obtaining output C1; 8) obtaining output C2; 9) obtaining output C3; 10) obtaining output C4; 11) obtaining output C5; 12) obtaining outputs S3, S4 and S5; 13) obtaining P3, P4, P5, P6 and P7; 14) setting a loss function before detection and classification; and 15) feeding P3 to P7 obtained in step 13) into the detection head to obtain the final prediction result. The method enlarges the convolutional receptive field through internal communication between features, which increases the diversity of the output features, and it addresses the information overload problem so that the model focuses on the information most critical to the current task.

Description

Image detection method based on improved FCOS network
Technical Field
The invention belongs to the technical field of artificial intelligence and image detection, and particularly relates to an image detection method based on an improved FCOS network.
Background
Object detection is an important technology in computer vision whose main goal is to accurately identify and locate objects of interest in an image or video. With the rapid development of artificial intelligence and deep learning, object detection has made great progress. Before the rise of deep learning, object detection relied mainly on traditional computer vision methods, which extract image features through feature engineering and then classify and locate objects with conventional machine learning algorithms (e.g., SVMs and decision trees). Because targets are diverse and complex, these methods often cannot handle complex scenes and large-scale data efficiently. The rise of deep learning, in particular convolutional neural networks (CNNs), revolutionized object detection: a CNN learns high-level image features automatically, which greatly improves detection accuracy and efficiency. R-CNN (Region-based Convolutional Neural Networks), the two-stage network proposed by Ross Girshick et al. in 2014, was a milestone in applying CNNs to object detection and laid the foundation for subsequent work. Later algorithms in the R-CNN family, such as Faster R-CNN and Mask R-CNN, have been applied to image detection and segmentation. These two-stage detectors have achieved some success in image detection, but they have high computational complexity, slow detection speed and large resource consumption, and therefore require high-end hardware.
Disclosure of Invention
The invention aims to solve problems such as slow image detection, difficult feature extraction and high consumption of computing resources in first-stage network models, and provides an image detection method based on an improved FCOS network. The method adopts SCConv convolution, which enlarges the convolutional receptive field through internal communication between features and thereby increases the diversity of the output features. Through its self-calibration operation, SCConv adaptively builds long-range spatial and inter-channel dependencies around each spatial location, which helps the CNN generate more discriminative feature representations that carry richer information. When computing power is limited, the attention mechanism allocates computing resources to the more important tasks and addresses the information overload problem, so that the model focuses on the information most critical to the current task.
The technical scheme for realizing the aim of the invention is as follows:
an image detection method based on an improved FCOS network, comprising the steps of:
1) First, a dataset for training and testing is prepared. The dataset consists of MRI-T2 images of human lumbar intervertebral discs and is divided into train, val and test sets in a ratio of 8:1:1;
2) The input images in the dataset are resized to a fixed 768x768 pixels and annotated in the COCO data format;
3) Data augmentation, including flipping and scaling, is applied to all input images, and the augmented images are preprocessed with the top-hat operation and gray-level stretching;
4) Detection is performed on the general-purpose object detection platform MMDetection, a deep-learning-based object detection framework with which a detection network can be built quickly. First, the COCO dataset code is modified so that its 80 categories are replaced by the 2 categories of this dataset, normal and diseased, and the category names are then added to the initialization file;
5) Training is optimized with stochastic gradient descent (SGD), with an initial learning rate of 0.005 and a momentum of 0.9;
6) The images preprocessed in step 3) are used as the input to the network model;
7) The backbone applies a convolution with a 7x7 kernel and a stride of 2 to the input image, followed by max pooling with a 3x3 window and a stride of 2, to obtain the output C1;
8) C1 is fed into the first self-calibrated convolution module SCConv_1 to obtain the output C2;
9) C2 is fed into the second self-calibrated convolution module SCConv_2 to obtain the output C3;
10) C3 is fed into the third self-calibrated convolution module SCConv_3 to obtain the output C4;
11) C4 is fed into the fourth self-calibrated convolution module SCConv_4 to obtain the output C5;
12) C3, C4 and C5 are fed into the SE Attention module. Global average pooling serves as the squeeze operation, and two FC layers form a bottleneck structure that models the correlation between channels and outputs as many weights as there are input feature channels. The first FC layer reduces the feature dimension to 1/r of the input; after a ReLU activation, the second FC layer restores it to the original dimension. Compared with a single FC layer, this structure fits the complex correlation between channels better and greatly reduces the number of parameters and the amount of computation. A Sigmoid gate then produces normalized weights between 0 and 1, and a Scale operation applies these weights to the features of each channel. The SE Attention module changes neither the spatial size nor the number of channels, and its outputs for C3, C4 and C5 are S3, S4 and S5 respectively;
13) S3, S4 and S5 are fed into the FPN module, which generates P3, P4 and P5 from S3, S4 and S5 respectively. P6 is obtained from P5 by a convolution layer with a 3x3 kernel and a stride of 2, and P7 is obtained from P6 by another convolution layer with a 3x3 kernel and a stride of 2;
14) Before detection and classification, the loss function must be set. The network has three output branches (classification, regression and center-ness), so the loss consists of three parts: the classification loss $L_{cls}$, the localization loss $L_{reg}$ and the center-ness loss $L_{ctrness}$. It is computed as follows:
$$L(\{p_{x,y}\},\{t_{x,y}\},\{s_{x,y}\}) = \frac{1}{N_{pos}}\sum_{x,y} L_{cls}(p_{x,y}, c^{*}_{x,y}) + \frac{1}{N_{pos}}\sum_{x,y} \mathbb{1}_{\{c^{*}_{x,y}>0\}}\, L_{reg}(t_{x,y}, t^{*}_{x,y}) + \frac{1}{N_{pos}}\sum_{x,y} \mathbb{1}_{\{c^{*}_{x,y}>0\}}\, L_{ctrness}(s_{x,y}, s^{*}_{x,y})$$
where $p_{x,y}$ is the score predicted for each category at point (x, y) of the feature map; $c^{*}_{x,y}$ is the true class label corresponding to point (x, y); $\mathbb{1}_{\{c^{*}_{x,y}>0\}}$ is 1 when point (x, y) is matched as a positive sample and 0 otherwise; $t_{x,y}$ is the target bounding box information predicted at point (x, y); $t^{*}_{x,y}$ is the true target bounding box information corresponding to point (x, y); $s_{x,y}$ is the center-ness predicted at point (x, y); $s^{*}_{x,y}$ is the true center-ness corresponding to point (x, y); and $N_{pos}$ is the number of positive samples;
15) P3 to P7 obtained in step 13) are fed into the detection head, which is shared across P3 to P7. The head has three branches: Classification, Regression and Center-ness, where Regression and Center-ness are two sub-branches of the same branch. Each branch first passes through four stacked Conv2d+GN+ReLU blocks and then through a convolution layer with a 3x3 kernel and a stride of 1 to produce the final prediction result.
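To make the data flow of steps 7) to 13) concrete, the following is a minimal sketch of the feature-map side lengths for a 768x768 input. The kernel sizes and strides of the stem and of P6/P7 come from the text; everything else is an assumption made for illustration: the paddings (3 for the 7x7 convolution, 1 for 3x3 windows), SCConv_1 keeping the resolution, SCConv_2 to SCConv_4 each halving it as in a ResNet-style backbone, and FPN preserving the size of each level. The helper names `conv_out` and `feature_map_sizes` are hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_sizes(input_size: int = 768) -> dict:
    """Side length of each named feature map for a square input."""
    sizes = {}
    # Step 7: 7x7 conv, stride 2 (padding 3 assumed), then 3x3 max pool, stride 2 (padding 1 assumed).
    x = conv_out(input_size, kernel=7, stride=2, padding=3)
    x = conv_out(x, kernel=3, stride=2, padding=1)
    sizes["C1"] = x
    # Step 8: assumed that SCConv_1 keeps the spatial resolution.
    sizes["C2"] = x
    # Steps 9-11: assumed that each later SCConv stage halves the resolution.
    for name in ("C3", "C4", "C5"):
        x = conv_out(x, kernel=3, stride=2, padding=1)
        sizes[name] = x
    # Steps 12-13: SE Attention keeps size and channels; FPN levels are assumed to keep the size.
    for level in (3, 4, 5):
        sizes[f"S{level}"] = sizes[f"C{level}"]
        sizes[f"P{level}"] = sizes[f"S{level}"]
    # Step 13: P6 and P7 come from 3x3 convolutions with stride 2 (padding 1 assumed).
    sizes["P6"] = conv_out(sizes["P5"], kernel=3, stride=2, padding=1)
    sizes["P7"] = conv_out(sizes["P6"], kernel=3, stride=2, padding=1)
    return sizes


if __name__ == "__main__":
    for name, side in feature_map_sizes(768).items():
        print(f"{name}: {side}x{side}")
```

Under these assumptions C1 is 192x192 and P3 to P7 are 96, 48, 24, 12 and 6 pixels on a side, i.e. the usual FCOS strides of 8 to 128.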
This technical scheme is realized with an anchor-free FCOS network model, SCConv self-calibrated convolution modules and an SE Attention module. SCConv enlarges the convolutional receptive field through internal communication between features, which increases the diversity of the output features. SE Attention makes better use of the dynamic relationships between feature channels.
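The SE Attention computation described in step 12) (squeeze by global average pooling, a two-layer FC bottleneck with reduction ratio r, a Sigmoid gate, then channel-wise scaling) can be sketched in NumPy as follows. This is an illustrative forward pass only, not the patented implementation: the function name `se_attention`, the explicit weight and bias arguments and the single-image `(C, H, W)` layout are assumptions.

```python
import numpy as np


def se_attention(x, w1, b1, w2, b2):
    """Squeeze-and-excitation forward pass for one feature map.

    x:  (C, H, W) input feature map
    w1: (C // r, C) weights of the reducing FC layer, b1: (C // r,)
    w2: (C, C // r) weights of the restoring FC layer, b2: (C,)
    Returns the rescaled feature map and the per-channel weights.
    """
    z = x.mean(axis=(1, 2))                    # squeeze: global average pooling -> (C,)
    h = np.maximum(w1 @ z + b1, 0.0)           # FC down to C / r, then ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))   # FC back to C, then Sigmoid gate in (0, 1)
    return x * s[:, None, None], s             # scale: weight every channel


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    channels, r = 8, 4
    x = rng.standard_normal((channels, 6, 6))
    w1 = rng.standard_normal((channels // r, channels))
    w2 = rng.standard_normal((channels, channels // r))
    y, s = se_attention(x, w1, np.zeros(channels // r), w2, np.zeros(channels))
    print(y.shape, s.round(3))
```

The output has the same spatial size and channel count as the input, which matches the statement that S3, S4 and S5 keep the shapes of C3, C4 and C5. The bottleneck uses roughly 2C²/r weights instead of the C² of a single full FC layer.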
The technical scheme has the advantages or beneficial effects that:
The object detection method of this technical scheme builds on the FCOS network and uses SCConv convolution. Unlike standard convolution, which fuses spatial and channel information with small kernels (e.g., 3×3), SCConv adaptively builds long-range spatial and inter-channel dependencies around each spatial location through its self-calibration operation. It therefore helps the CNN generate more discriminative feature representations, because these carry richer information. The SE Attention module used in this technical scheme allocates computing resources to the more important tasks when computing power is limited, addresses the information overload problem, and lets the model focus on the information most critical to the current task.
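For reference, the self-calibration operation can be written out as below. This follows the commonly cited formulation of the original self-calibrated convolution (the structure the text refers to as FIG. 2(a)) and is given only as a sketch; the modified structure of FIG. 2(b) is not described in enough detail here to reproduce. The symbols are assumptions taken from that formulation: the input X is split along the channel axis into X_1 and X_2, F_1 to F_4 are convolution layers, AvgPool_r is average pooling with window and stride r, Up is upsampling back to the input resolution, and sigma is the Sigmoid function.

```latex
\begin{aligned}
T_1  &= \mathrm{AvgPool}_r(X_1) \\
X_1' &= \mathrm{Up}\big(F_2(T_1)\big) \\
Y_1' &= F_3(X_1) \odot \sigma\big(X_1 + X_1'\big) \\
Y_1  &= F_4(Y_1') \\
Y_2  &= F_1(X_2) \\
Y    &= \mathrm{Concat}(Y_1,\, Y_2)
\end{aligned}
```

Because T_1 is computed on a downsampled copy of X_1, each position of the gate sees a region r times larger than the kernel itself, which is the sense in which the receptive field is enlarged without using a larger kernel.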
Drawings
FIG. 1 is a block diagram of the network in an embodiment;
FIG. 2 is a block diagram of SCConv in an embodiment;
FIG. 3 shows the structure of the SC module in an embodiment;
FIG. 4 is a diagram of the attention mechanism module in an embodiment;
FIG. 5 is a diagram of the detection head structure in an embodiment;
FIG. 6 is a flow diagram of network inference in an embodiment;
FIG. 7 is an original spinal MRI image in an embodiment;
FIG. 8 shows spinal MRI detection results of an embodiment.
Detailed Description
The present invention will now be further illustrated, but not limited, by the following figures and examples.
Examples:
Referring to FIG. 6, an image detection method based on an improved FCOS network includes the steps of:
1) First, a dataset for training and testing is prepared. The dataset is a non-public set of human lumbar intervertebral disc MRI-T2 images collected from the Internet, as shown in FIG. 7. Its 470 images are divided into a train set (376 images), a val set (47 images) and a test set (47 images) in a ratio of 8:1:1;
2) The input images in the dataset are resized to a fixed 768x768 pixels and annotated in the COCO data format;
3) Data augmentation, including flipping and scaling, is applied to all input images, and the augmented images are preprocessed with the top-hat operation and gray-level stretching;
4) Detection is performed on the object detection platform MMDetection. First, the COCO dataset code is modified so that its 80 categories are replaced by the 2 categories of this dataset, normal and diseased, and the category names are then added to the initialization file;
5) Training is optimized with stochastic gradient descent (SGD), with an initial learning rate of 0.005 and a momentum of 0.9;
6) The images preprocessed in step 3) are used as the input to the network model, whose structure is shown in FIG. 1;
7) The backbone applies a convolution with a 7x7 kernel and a stride of 2 to the input image, followed by max pooling with a 3x3 window and a stride of 2, to obtain the output C1;
8) C1 is fed into the first self-calibrated convolution module SCConv_1 to obtain the output C2. The SCConv module is shown in FIG. 2, where FIG. 2(a) is the original structure and FIG. 2(b) is the modified structure of this embodiment; the internal structure of the SC module is shown in FIG. 3;
9) C2 is fed into the second self-calibrated convolution module SCConv_2 to obtain the output C3;
10) C3 is fed into the third self-calibrated convolution module SCConv_3 to obtain the output C4;
11) C4 is fed into the fourth self-calibrated convolution module SCConv_4 to obtain the output C5;
12) C3, C4 and C5 are fed into the SE Attention module, whose structure in this embodiment is shown in FIG. 4. Global average pooling serves as the squeeze operation, and two FC layers form a bottleneck structure that models the correlation between channels and outputs as many weights as there are input feature channels. The first FC layer reduces the feature dimension to 1/r of the input; after a ReLU activation, the second FC layer restores it to the original dimension. Compared with a single FC layer, this structure has more nonlinearity, fits the complex correlation between channels better, and greatly reduces the number of parameters and the amount of computation. A Sigmoid gate then produces normalized weights between 0 and 1, and a Scale operation applies these weights to the features of each channel. The SE Attention module changes neither the spatial size nor the number of channels, and its outputs for C3, C4 and C5 are S3, S4 and S5 respectively;
13) S3, S4 and S5 are fed into the FPN module, which generates P3, P4 and P5 from S3, S4 and S5 respectively. P6 is obtained from P5 by a convolution layer with a 3x3 kernel and a stride of 2, and P7 is obtained from P6 by another convolution layer with a 3x3 kernel and a stride of 2;
14) Before detection and classification, the loss function must be set. The network has three output branches (classification, regression and center-ness), so the loss consists of three parts: the classification loss $L_{cls}$, the localization loss $L_{reg}$ and the center-ness loss $L_{ctrness}$. It is computed as follows:
$$L(\{p_{x,y}\},\{t_{x,y}\},\{s_{x,y}\}) = \frac{1}{N_{pos}}\sum_{x,y} L_{cls}(p_{x,y}, c^{*}_{x,y}) + \frac{1}{N_{pos}}\sum_{x,y} \mathbb{1}_{\{c^{*}_{x,y}>0\}}\, L_{reg}(t_{x,y}, t^{*}_{x,y}) + \frac{1}{N_{pos}}\sum_{x,y} \mathbb{1}_{\{c^{*}_{x,y}>0\}}\, L_{ctrness}(s_{x,y}, s^{*}_{x,y})$$
where $p_{x,y}$ is the score predicted for each category at point (x, y) of the feature map; $c^{*}_{x,y}$ is the true class label corresponding to point (x, y); $\mathbb{1}_{\{c^{*}_{x,y}>0\}}$ is 1 when point (x, y) is matched as a positive sample and 0 otherwise; $t_{x,y}$ is the target bounding box information predicted at point (x, y); $t^{*}_{x,y}$ is the true target bounding box information corresponding to point (x, y); $s_{x,y}$ is the center-ness predicted at point (x, y); $s^{*}_{x,y}$ is the true center-ness corresponding to point (x, y); and $N_{pos}$ is the number of positive samples;
15) P3 to P7 obtained in step 13) are fed into the detection head, which is shared across P3 to P7 and whose structure is shown in FIG. 5. The head has three branches: Classification, Regression and Center-ness, where Regression and Center-ness are two sub-branches of the same branch. Each branch first passes through four stacked Conv2d+GN+ReLU blocks and then through a convolution layer with a 3x3 kernel and a stride of 1 to produce the final prediction result, as shown in FIG. 8;
16) The trained network model is evaluated on the test set, and the results of the original FCOS network are compared with those of the present method. The method of this embodiment achieves noticeably higher accuracy than the original FCOS network.
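Step 3) of the embodiment names the top-hat operation and gray-level stretching but gives no parameters. The sketch below shows one plausible reading in NumPy and should be taken as an illustration under stated assumptions rather than the embodiment's actual preprocessing: a white top-hat (image minus its morphological opening) with a flat square structuring element of odd side `k`, edge-replicated borders, and a linear min-max stretch to the range 0 to 255. The function names and the default `k=15` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _window_reduce(img, k, reducer):
    """Apply `reducer` (np.min or np.max) over every k x k neighbourhood."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")          # replicate borders (assumption)
    windows = sliding_window_view(padded, (k, k))   # shape (H, W, k, k) for odd k
    return reducer(windows, axis=(-2, -1))


def white_top_hat(img, k=15):
    """White top-hat: the image minus its opening (erosion followed by dilation)."""
    img = img.astype(np.float64)
    eroded = _window_reduce(img, k, np.min)
    opened = _window_reduce(eroded, k, np.max)
    return img - opened


def gray_stretch(img, lo=0.0, hi=255.0):
    """Linear gray-level stretch of the image range onto [lo, hi]."""
    img = img.astype(np.float64)
    vmin, vmax = img.min(), img.max()
    if vmax == vmin:                                 # flat image: nothing to stretch
        return np.full_like(img, lo)
    return (img - vmin) / (vmax - vmin) * (hi - lo) + lo


def preprocess(img, k=15):
    """Top-hat followed by gray stretching, in the order given in step 3)."""
    return gray_stretch(white_top_hat(img, k))


if __name__ == "__main__":
    demo = np.full((20, 20), 10.0)
    demo[8:11, 8:11] = 100.0                         # small bright structure on a flat background
    print(preprocess(demo, k=5)[7:12, 7:12])
```

The top-hat keeps bright structures smaller than the structuring element and suppresses the slowly varying background, and the stretch then spreads what remains over the full gray range. The structuring element size would need to be tuned to the scale of the structures of interest in the MRI images.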

Claims (1)

1. An image detection method based on an improved FCOS network, comprising the steps of:
1) First, a dataset for training and testing is prepared. The dataset consists of MRI-T2 images of human lumbar intervertebral discs and is divided into train, val and test sets in a ratio of 8:1:1;
2) The input images in the dataset are resized to a fixed 768x768 pixels and annotated in the COCO data format;
3) Data augmentation, including flipping and scaling, is applied to all input images, and the augmented images are preprocessed with the top-hat operation and gray-level stretching;
4) Detection is performed on the general-purpose object detection platform MMDetection. First, the COCO dataset code is modified so that its 80 categories are replaced by the 2 categories of this dataset, normal and diseased, and the category names are then added to the initialization file;
5) Training is optimized with stochastic gradient descent (SGD), with an initial learning rate of 0.005 and a momentum of 0.9;
6) The images preprocessed in step 3) are used as the input to the network model;
7) The backbone applies a convolution with a 7x7 kernel and a stride of 2 to the input image, followed by max pooling with a 3x3 window and a stride of 2, to obtain the output C1;
8) C1 is fed into the first self-calibrated convolution module SCConv_1 to obtain the output C2;
9) C2 is fed into the second self-calibrated convolution module SCConv_2 to obtain the output C3;
10) C3 is fed into the third self-calibrated convolution module SCConv_3 to obtain the output C4;
11) C4 is fed into the fourth self-calibrated convolution module SCConv_4 to obtain the output C5;
12) C3, C4 and C5 are fed into the SE Attention module. Global average pooling serves as the squeeze operation, and two FC layers form a bottleneck structure that models the correlation between channels and outputs as many weights as there are input feature channels. The first FC layer reduces the feature dimension to 1/r of the input; after a ReLU activation, the second FC layer restores it to the original dimension. A Sigmoid gate then produces normalized weights between 0 and 1, and a Scale operation applies these weights to the features of each channel. The SE Attention module changes neither the spatial size nor the number of channels, and its outputs for C3, C4 and C5 are S3, S4 and S5 respectively;
13) S3, S4 and S5 are fed into the FPN module, which generates P3, P4 and P5 from S3, S4 and S5 respectively. P6 is obtained from P5 by a convolution layer with a 3x3 kernel and a stride of 2, and P7 is obtained from P6 by another convolution layer with a 3x3 kernel and a stride of 2;
14) Before detection and classification, the loss function must be set. The network has three output branches (classification, regression and center-ness), so the loss consists of three parts: the classification loss $L_{cls}$, the localization loss $L_{reg}$ and the center-ness loss $L_{ctrness}$. It is computed as follows:
$$L(\{p_{x,y}\},\{t_{x,y}\},\{s_{x,y}\}) = \frac{1}{N_{pos}}\sum_{x,y} L_{cls}(p_{x,y}, c^{*}_{x,y}) + \frac{1}{N_{pos}}\sum_{x,y} \mathbb{1}_{\{c^{*}_{x,y}>0\}}\, L_{reg}(t_{x,y}, t^{*}_{x,y}) + \frac{1}{N_{pos}}\sum_{x,y} \mathbb{1}_{\{c^{*}_{x,y}>0\}}\, L_{ctrness}(s_{x,y}, s^{*}_{x,y})$$
where $p_{x,y}$ is the score predicted for each category at point (x, y) of the feature map; $c^{*}_{x,y}$ is the true class label corresponding to point (x, y); $\mathbb{1}_{\{c^{*}_{x,y}>0\}}$ is 1 when point (x, y) is matched as a positive sample and 0 otherwise; $t_{x,y}$ is the target bounding box information predicted at point (x, y); $t^{*}_{x,y}$ is the true target bounding box information corresponding to point (x, y); $s_{x,y}$ is the center-ness predicted at point (x, y); $s^{*}_{x,y}$ is the true center-ness corresponding to point (x, y); and $N_{pos}$ is the number of positive samples;
15) P3 to P7 obtained in step 13) are fed into the detection head, which is shared across P3 to P7. The head has three branches: Classification, Regression and Center-ness, where Regression and Center-ness are two sub-branches of the same branch. Each branch first passes through four stacked Conv2d+GN+ReLU blocks and then through a convolution layer with a 3x3 kernel and a stride of 1 to produce the final prediction result.
CN202310959453.6A 2023-08-01 2023-08-01 Image detection method based on improved FCOS network Active CN117011603B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310959453.6A CN117011603B (en) 2023-08-01 2023-08-01 Image detection method based on improved FCOS network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310959453.6A CN117011603B (en) 2023-08-01 2023-08-01 Image detection method based on improved FCOS network

Publications (2)

Publication Number Publication Date
CN117011603A CN117011603A (en) 2023-11-07
CN117011603B CN117011603B (en) 2025-09-19

Family

ID=88568551

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310959453.6A Active CN117011603B (en) 2023-08-01 2023-08-01 Image detection method based on improved FCOS network

Country Status (1)

Country Link
CN (1) CN117011603B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118429813B (en) * 2024-05-29 2025-11-04 山东锋士信息技术有限公司 A Method for Detecting Abnormal Floating Objects in Rivers Based on Self-Similarity Enhancement and Dense Processing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836713A (en) * 2021-03-12 2021-05-25 南京大学 Identification and Tracking Method of Mesoscale Convective System Based on Image Anchorless Frame Detection
CN114937151A (en) * 2022-05-06 2022-08-23 西安电子科技大学 Lightweight target detection method based on multi-receptive-field and attention feature pyramid

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210319420A1 (en) * 2020-04-12 2021-10-14 Shenzhen Malong Technologies Co., Ltd. Retail system and methods with visual object tracking
CN113887455B (en) * 2021-10-11 2024-05-28 东北大学 A face mask detection system and method based on improved FCOS

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant