
CN116206276A - Linear object recognition method and device, vehicle and equipment - Google Patents

Linear object recognition method and device, vehicle and equipment

Info

Publication number
CN116206276A
CN116206276A
Authority
CN
China
Prior art keywords
image
sampling
linear target
linear
sampling point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310170650.XA
Other languages
Chinese (zh)
Inventor
赵起超
袁金伟
张振林
Current Assignee
China Automotive Innovation Corp
Original Assignee
China Automotive Innovation Corp
Priority date
Filing date
Publication date
Application filed by China Automotive Innovation Corp
Priority to CN202310170650.XA
Publication of CN116206276A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems


Abstract

The present application relates to a linear target recognition method and device, and to a vehicle and equipment using them. The method comprises the following steps: acquiring a perceptual image that contains a linear target object to be recognized; performing discrete sampling on the perceptual image to obtain a sampled image containing the discrete sampling results; recognizing feature regions of the sampled image with a convolutional neural network to obtain position information of the sampling points corresponding to the linear target object; and determining the recognition result of the linear target object according to the position information of those sampling points. By discretely sampling the perceptual image, the method solves the problem of a limited number of linear-target outputs; it requires little computation, applies widely, and extends conveniently to the detection of other linear objects.

Description

Linear target recognition method, device, vehicle and equipment

Technical Field

The present application relates to the technical field of perception and recognition for automatic driving, and in particular to a recognition method, device, vehicle, and equipment for linear targets.

Background

A neural network is a tool for large-scale, multi-parameter optimization. Given a large amount of training data, a neural network can learn hidden features of the data that are difficult to summarize by hand, and can thereby perform complex tasks such as face detection, image semantic segmentation, object detection, motion tracking, and natural language translation. Neural networks are widely used in the field of artificial intelligence. In recent years, convolutional neural networks (CNNs) have developed rapidly in the field of autonomous driving. For an autonomous driving system, the perception system acts as the eyes of the vehicle: it must quickly identify dynamic and static obstacles around the vehicle body and the structure of the surrounding road environment. With the rapid adoption of embedded CNNs, the perception systems of current self-driving vehicles use neural networks for road-environment recognition; compared with traditional methods, convolutional neural networks generally offer higher recognition accuracy and robustness. However, when the image-recognition approach of a general convolutional neural network is applied to linear targets, it leads to inaccurate recognition and a complicated calculation process.

Summary of the Invention

In view of the above, it is necessary to provide a neural-network-based linear target recognition method, device, vehicle, and computer equipment that address these technical problems.

In a first aspect, the present application provides a neural-network-based linear target recognition method, the method comprising:

acquiring a perceptual image, the perceptual image including a linear target;

performing discrete sampling and processing on the pixels of the perceptual image to obtain a perceptual image containing the discrete sampling results, i.e. a sampled image;

recognizing feature regions of the perceptual image containing the discrete sampling results with a convolutional neural network, to obtain position information of the sampling points corresponding to the linear target; and

determining the recognition result according to the position information of the sampling points.

In one embodiment, the convolutional neural network includes a backbone network and a branch network; the backbone network identifies and extracts the feature regions in the sampled image and outputs feature images of a preset size to the branch network.

In one embodiment, the branch network includes at least a first branch network and a second branch network; the first and second branch networks generate feature images of different scales, and/or feature images corresponding to different regions of the perceptual image.

In one embodiment, the first branch network classifies the perceptual image, sensing whether each sampling point in the feature region contains the linear target, and the second branch network senses the position information of the sampling points that contain the linear target.

In one embodiment, after the perceptual image is acquired, the method further includes preprocessing the perceptual image.

In one embodiment, performing discrete sampling on the perceptual image to obtain the sampled image includes:

sampling the pixels of the perceptual image at intervals of a preset number of points;

establishing a coordinate system based on the perceptual image;

obtaining the coordinates of the sampling points; and

obtaining, based on the sampling-point coordinates, the sampled image corresponding to the perceptual image.

In one embodiment, the offset of the sampling-point coordinates relative to the center point of the feature region is the position information of the sampling point.

In one embodiment, determining the recognition result of the linear target according to the position information of its corresponding sampling points includes:

determining the offset of the linear target in the sampling direction according to the position information of the sampling points corresponding to the linear target; and

determining the recognition result of the linear target according to its offset in the sampling direction.

In one embodiment, the sampling-point coordinates are (xi, yi) with xi = si − 1, and the sampling-point data input to the neural network are the yi values, where xi is the abscissa of the sampling point, yi is its ordinate, s is the preset pixel sampling interval of the perceptual image, and i = 1, 2, 3, ….

In a second aspect, the present application also provides a neural-network-based linear target recognition device, the device comprising:

an acquisition module, configured to acquire a perceptual image that includes the linear target to be recognized;

a processing module preconfigured with a convolutional neural network; the processing module performs discrete sampling on the pixels of the perceptual image to obtain a sampled image containing the discrete sampling results, recognizes the sampled image to obtain position information of the sampling points corresponding to the linear target, and determines the recognition result of the linear target according to that position information; and

a transmission module, configured to transmit data and the recognition results obtained by the processing module.

In a third aspect, the present application also provides a vehicle that employs the above recognition device and further comprises:

a control device connected to the recognition device; the control device performs calculations according to the output of the processing module and judges whether the vehicle may pass; if so, it controls the vehicle to move forward.

In one embodiment, the control device performs the following steps:

selecting or setting a reference frame, and fitting the recognition results to obtain an equation of the linear target;

calculating, based on the linear-target equation, the relative position of the linear target with respect to the reference frame, and judging the state of the target according to this relative position; and

using the target state as the basis for deciding whether the vehicle moves forward.
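A minimal numerical sketch of these three control-device steps: the recognized points are fitted to a line y = a·x + b by least squares, and the arm state is judged against a clearance height. The clearance value, the pass rule, and all sample points below are illustrative assumptions, not values from the patent.

```python
# Sketch of the control-device logic: fit the recognised points to a line,
# then decide pass/no-pass. Clearance height and sample points are invented.

def fit_line(points):
    """Least-squares fit y = a*x + b over (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def may_proceed(points, clearance_y=220):
    """Vehicle may pass only if the fitted arm stays above the clearance line."""
    a, b = fit_line(points)
    lowest = min(a * x + b for x, _ in points)  # lowest fitted point of the arm
    return lowest >= clearance_y

# Example: a raised (nearly vertical-clearance) arm, points near y = 251
raised = [(3, 250.0), (103, 252.0), (203, 251.0), (303, 253.0)]
```

The decision rule here is deliberately simple; a real system would also account for camera geometry when converting image coordinates to clearance.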

In a fourth aspect, the present application also provides computer equipment. The computer equipment includes a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the following steps:

acquiring a perceptual image that includes the linear target to be recognized;

performing discrete sampling on the pixels of the perceptual image to obtain a sampled image containing the discrete sampling results;

recognizing feature regions of the sampled image with the convolutional neural network to obtain position information of the sampling points corresponding to the linear target; and

obtaining the recognition result of the linear target according to the position information of its corresponding sampling points.

In one embodiment, the convolutional neural network includes a backbone network and a branch network, the branch network including a first branch network and a second branch network; recognizing the sampled image with the convolutional neural network to obtain the position information of the sampling points corresponding to the linear target in the perceptual image includes:

extracting feature regions of the sampled image with the backbone network, and transmitting the identified feature regions to the first branch network and the second branch network respectively; and

sensing, with the first branch network, whether each sampling point in the feature region contains the linear target, and sensing, with the second branch network, the position information of the sampling points that contain the linear target.
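One way the two branch outputs might be combined per sampling column can be sketched as follows; the scores, the 0.5 threshold, and the ordinate values are invented for illustration, since the patent does not fix them.

```python
# Combine the two branch outputs per sampling column: the first branch
# scores whether a column contains the linear target, the second branch
# supplies its ordinate; columns below the score threshold report y = 0.
# Scores, threshold and ordinates are invented for illustration.

def decode_columns(scores, ys, threshold=0.5, s=4):
    points = []
    for i, (score, y) in enumerate(zip(scores, ys), start=1):
        x = s * i - 1                              # abscissa x_i = s*i - 1
        points.append((x, y if score >= threshold else 0))
    return points

points = decode_columns([0.1, 0.9, 0.8, 0.2], [10.0, 150.0, 151.0, 5.0])
```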

In one embodiment, after the perceptual image is acquired, the method further includes preprocessing the perceptual image.

In one embodiment, performing discrete sampling on the perceptual image to obtain the sampled image includes:

sampling the pixels of the perceptual image at intervals of a preset number of points;

establishing a coordinate system based on the perceptual image;

obtaining the coordinates of the sampling points; and

obtaining, based on the sampling-point coordinates, the sampled image corresponding to the perceptual image.

In one embodiment, the offset of the sampling-point coordinates relative to the center point of the feature image is the position information of the sampling point.

In one embodiment, determining the recognition result of the linear target according to the position information of its corresponding sampling points includes:

determining the offset of the linear target in the sampling direction according to the position information of the sampling points corresponding to the linear target; and

determining the recognition result of the linear target according to its offset in the sampling direction.

In one embodiment, the sampling-point coordinates are (xi, yi) with xi = si − 1, and the sampling-point data input to the neural network are the yi values, where xi is the abscissa of the sampling point, yi is its ordinate, s is the preset pixel sampling interval of the perceptual image, and i = 1, 2, 3, ….

The linear target recognition method, device, vehicle, and equipment described above solve the problem of a limited number of linear-target outputs by discretely sampling the perceptual image. The method requires little computation, applies widely, and extends conveniently to the detection of other linear objects.

Brief Description of the Drawings

Fig. 1 is a flowchart of the linear target recognition method in one embodiment;

Fig. 2 is a schematic structural diagram of the convolutional neural network in one embodiment;

Fig. 3 is a schematic diagram of a sampled image in one embodiment;

Fig. 4 is a schematic structural diagram of a neural-network-based linear target recognition device in another embodiment.

Detailed Description

To make the purpose, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present application and do not limit it.

First, the application scenarios of the embodiments of the present application and some of the terms involved are explained.

The linear target recognition method, device, and equipment provided in the embodiments can be applied wherever a driven vehicle performs image recognition; they are not limited to fully automatic driving and should be understood to cover related image-recognition fields such as driver assistance.

The executing subject of the linear target recognition method provided in the embodiments may be a linear target recognition device, computer equipment, a storage medium, or a computer program product. Illustratively, the device may be implemented in software and/or hardware.

The linear-target images and/or perceptual images containing a linear target in the embodiments are mainly images of the barrier arm of a gate, but in other embodiments they may also include, without limitation, at least one of: utility-pole images, zebra-crossing images, or lane-line images.

Deep learning performs nonlinear transformation or representation learning on its input with a neural network of at least two hidden layers; by constructing deep neural networks, it supports a wide range of analysis tasks. Neural networks fall into several families according to the type of neuron and connectivity: common ones include feedforward neural networks (FNNs); convolutional neural networks (CNNs), which encode spatial correlation; recurrent neural networks (RNNs), which encode temporal correlation; and generative adversarial networks (GANs). In the broad sense, a deep neural network includes any of the above and their combinations; in the narrow sense, it refers to a feedforward network. A simple feedforward network may consist of only an input layer and an output layer, while a more complex one with hidden layers, also called a multilayer perceptron (MLP), typically has three parts: an input layer, an output layer, and at least one hidden layer. Compared with other image-classification algorithms, a CNN needs relatively little preprocessing: the network learns the filters that traditional algorithms design by hand. This independence from prior knowledge and manual feature design is a major advantage. Through detection and classification, CNNs support image- and video-recognition applications such as the one described here.

For the scenario of an autonomous vehicle passing through a gate, the key information is whether the state of the gate's barrier arm allows passage. In the prior art, such images are usually fed to a neural network to directly train a classifier; but because the critical point between states is hard to distinguish, classification errors are frequent. In the visual detection tasks of deep neural networks, a target is usually represented by a rectangular bounding box.

For the barrier arm of the gate in this embodiment, a rectangular box is a poor representation. When the arm is closed, the width of the box is far greater than its height because of the arm's own proportions; when the arm is raised, the height is far greater than the width. As the arm moves from closed to raised, the aspect ratio of the box changes drastically, which hampers the learning of neural-network features.
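The scale of this problem can be checked with illustrative dimensions (a roughly 4 m arm about 10 cm thick, seen side-on; these numbers are assumptions, not from the patent):

```python
# Bounding-box aspect ratio of the barrier arm in its two extreme states.
# Arm dimensions (400 cm long, 10 cm thick) are illustrative assumptions.

ARM_LEN_CM, ARM_THICK_CM = 400, 10

closed_ratio = ARM_LEN_CM / ARM_THICK_CM    # width / height, arm horizontal
raised_ratio = ARM_THICK_CM / ARM_LEN_CM    # width / height, arm vertical
change = (ARM_LEN_CM * ARM_LEN_CM) / (ARM_THICK_CM * ARM_THICK_CM)
```

The width-to-height ratio flips from 40:1 to 1:40, a 1600-fold swing that a box regressor would have to cover, which motivates the linear representation adopted here.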

To solve this technical problem, the present application provides a linear target recognition method that exploits the inherent linearity of the gate's barrier arm and abstracts it as a linear target representation.

Accordingly, an embodiment of the present application provides a linear target recognition method; Fig. 1 is a schematic flowchart of the method. As shown in Fig. 1, the linear target recognition method provided in this embodiment may include:

S100: Acquire a perceptual image, the perceptual image including the linear target to be recognized.

In this step, the perceptual image, which includes an image of the linear target, is acquired by an image-sensing device (for example, a camera). The image may be captured by the device in real time or may have been collected and stored in advance, in which case acquisition includes reading it from a storage module.

After the perceptual image is acquired, it may be preprocessed before discrete sampling. Preprocessing includes compressing or cropping the image to a preset initial size W×H, where W is the initial image width and H the initial image height. In one preprocessing scheme of this embodiment, the perceptual image is compressed to an initial size of 512×288 pixels. The perceptual image may be, but is not limited to, an RGB image.
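As a sketch of this preprocessing step, a nearest-neighbour resize to the preset 512×288 initial size can be written as follows; a production system would use an image library, and the 1024×576 source size is an assumed example.

```python
# Illustrative nearest-neighbour resize to the preset initial size W x H
# (512 x 288 in this embodiment). Pure Python, single-channel, to show
# only the geometry of the step; real systems use an image library.

def resize_nearest(image, out_w, out_h):
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

# e.g. compress an assumed 1024 x 576 capture to the 512 x 288 initial image
src = [[(r + c) % 256 for c in range(1024)] for r in range(576)]
initial = resize_nearest(src, 512, 288)
```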

S200: Perform discrete sampling on the pixels of the perceptual image to obtain a sampled image containing the discrete sampling results.

In this embodiment, the perceptual image in step S200 may also denote the initial image produced by the preprocessing above; in other embodiments, step S200 may be executed directly on the perceptual image without preprocessing.

Specifically, step S200 includes the following steps:

S201: Sample the pixels of the perceptual image or initial image at intervals of s points, s = 0, 1, 2, 3, …; further, one or both of the horizontal and vertical directions may be selected for interval sampling, where horizontal means parallel to the image-width direction W and vertical means parallel to the image-height direction H.

In this embodiment, the initial image size W×H is 512×288 pixels and s is 4: a sample is taken every four pixels along the width of the image.
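The numbers in this embodiment can be checked directly: with W = 512 and s = 4, the abscissas xi = si − 1 run 3, 7, …, 511, one sample every four pixels and 128 sampling columns in total.

```python
# The embodiment's sampling grid, checked numerically: W = 512, s = 4,
# abscissas x_i = s*i - 1 for i = 1 .. W/s.

W, s = 512, 4
xs = [s * i - 1 for i in range(1, W // s + 1)]
```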

S202: Establish a coordinate system based on the perceptual image.

In this embodiment, the coordinate system used to describe the sampling-point coordinates takes the bottom edge of the perceptual image as the x-axis, the left edge as the y-axis, and their intersection as the origin.

S203: Obtain the sampling-point coordinates as the discrete sampling result.

In this embodiment, the target is the barrier arm of the gate. A sampling point's coordinates are written (xi, yi) with xi = si − 1; the sampling-point data input to the neural network are the yi values, where xi is the abscissa of the sampling point, yi is its ordinate, and i = 1, 2, 3, ….

Specifically, this embodiment samples each row of the perceptual or initial image in turn, obtaining a set of sampling-point coordinates for each row. For example, with a sampling interval s and an image of 512×288 pixels, discrete sampling of the row in which the linear target lies collects 128 sampling points containing the linear target, described mathematically as:

pole = [(x1, y1), ..., (xi, yi), ..., (x128, y128)]    (1)

Further, if the sampling interval is 4, substituting s = 4 gives xi = 4i − 1, and formula (1) simplifies to:

pole = [(3, y1), ..., (4i − 1, yi), ..., (511, y128)]    (2)

It should be noted that, as shown in Fig. 3, the vertical lines in the figure are the sampling lines. Since the linear target (e.g., the barrier arm) occupies only a middle stretch of the full 128 sampling points, the ordinate is zero wherever the target does not intersect a sampling line. Formulas (1) and (2) represent only the set of sampling-point coordinates for rows of the perceptual or initial image that contain the linear target; because every row is sampled, multiple sets of sampling points of the form of (1) and (2) are obtained. In each set, the x-coordinates of the sampling points are known and the y-coordinates are unknown; for example, the x-coordinates in every set are, in order, 3, …, 4i − 1, …, 511.
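Building the per-row sample vector of formula (2) can be sketched as follows; the horizontal "arm" at y = 150 spanning columns 100 to 400 is an invented example, used only to show where the zero ordinates appear.

```python
# Build the sample vector of formula (2) for one row: 128 columns at
# x_i = 4i - 1, with ordinate y > 0 only where the linear target crosses
# the sampling line. The arm position (y = 150, columns 100..400) is an
# invented example, not data from the patent.

def sample_row(target_y, x_start, x_end, s=4, width=512):
    pole = []
    for i in range(1, width // s + 1):
        x = s * i - 1
        y = target_y if x_start <= x <= x_end else 0  # zero off-target
        pole.append((x, y))
    return pole

pole = sample_row(target_y=150, x_start=100, x_end=400)
```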

S204: Determine, based on the sampling-point coordinates, the sampled image corresponding to the perceptual image.

In this embodiment, the perceptual image is divided into a grid based on the position coordinates of the sampling points in the sets collected in S203, yielding the sampled image corresponding to the perceptual image; Fig. 3, for example, shows the sampled image obtained by discretely sampling a perceptual image. Here the sampling result is the coordinate matrix formed by all discrete sampling-point coordinates (xi, yi). The coordinates of all sampling points are annotated on the image, i.e., each perceptual or initial image corresponds to one annotation text containing the sampling-point coordinate matrix. For sampling points that do not contain the linear target, the coordinates in the sampling result are known; for sampling points that do contain it, the sampling result has the form of formula (2) and contains unknown y-coordinate information. Since the abscissa of every sampling point is already determined, the convolutional neural network in the subsequent steps only needs to predict the corresponding y. In particular, the offset of a sampling point's coordinates relative to the center point of the feature image is chosen as the sampling point's position information.
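One possible reading of this offset encoding, sketched in Python: each ordinate is stored as an offset from the centre of the grid cell it falls in. The 16-pixel cell height is an assumed value; the patent states only that the offset is taken relative to the centre point of the feature image.

```python
# Encode a sampling point's ordinate as an offset from the centre of the
# grid cell it falls in, and decode it back. The 16-px cell height is an
# assumed parameter, not specified by the patent.

CELL = 16

def encode(y):
    cell = int(y) // CELL
    centre = cell * CELL + CELL / 2
    return cell, y - centre        # (which cell, offset from its centre)

def decode(cell, offset):
    return cell * CELL + CELL / 2 + offset
```

Regressing a small bounded offset rather than an absolute coordinate is a common design choice in anchor-based detectors, which is the rationale assumed here.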

S300. Recognize the sampled image through the convolutional neural network to obtain the position information of the sampling points.

A convolutional neural network consists mainly of an input layer, convolutional layers, pooling layers, and fully connected layers. The input layer feeds raw data, or data preprocessed by other algorithms, into the network.

As shown in Figure 2, the convolutional neural network 1 of this embodiment includes a backbone network 20 and branch networks 30, the branch networks 30 comprising a first branch network 31 and a second branch network 32. The backbone network 20 recognizes feature regions in the sampled image and outputs feature images of a preset size to the branch networks; further, it transmits the recognized feature regions to the first branch network 31 and the second branch network 32 respectively. The backbone network 20 consists mainly of multiple convolutional and pooling layers; in this embodiment there are at least two of each. The convolutional layers of convolutional neural network 1 extract features from the input data, abstracting the implicit correlations in the raw data through convolution-kernel matrices. A pooling layer is typically inserted periodically between successive convolutional layers; its role is to gradually reduce the spatial size of the data volume, which reduces the number of parameters in the network and thus the computational cost.

The convolutional neural network recognizes the feature regions in the perceptual image. Further, after the sampled image 10 is input into the backbone network 20 of convolutional neural network 1, feature extraction is performed by the backbone network 20 to obtain the intermediate feature image 21 (i.e., the feature region). In this embodiment, as shown in Figure 2, the size of the intermediate feature image 21 output by the backbone network 20 is w×h = W/32 × H/32 = 16×9 pixels.

The backbone network 20 is connected to at least two branch networks 30, including a first branch network 31 and a second branch network 32; the first and second branch networks generate feature maps with different scales, and/or feature maps corresponding to different regions of the perceptual image. The first branch network 31 senses whether each sampling point in the feature region contains the linear target, and the second branch network 32 senses the position information of the sampling points that contain the linear target. It should be noted that, in this embodiment, "connected" may mean that, in the direction of signal transmission, the output of one network serves as the input of the next; for example, "the backbone network 20 is connected to at least two branch networks 30" may mean that the output of the backbone network 20 is the common input of the two branch networks 30. The first branch network 31 classifies the intermediate feature image 21, specifically judging whether each sampling point contains the linear target, i.e., assigning each sampling point either to the class that contains the linear target or to the class that does not.
The second branch network 32 supervises the first branch network 31. Further, the first branch network 31 judges all intermediate feature images 21 according to its training and determines, by computing probability values or similar means, which class each belongs to. The first branch network 31 has two channels, outputting respectively the class of sampling points that contain the linear target and the class that does not; in this embodiment, the linear target is a barrier lever. Further, the second branch network 32 is a regression network that continuously computes coordinate values and differences to determine the final recognition result. In this embodiment, the second branch network 32 has 128 channels, the intermediate feature image 21 in each channel containing the corresponding sampling point; this branch outputs the position information of the sampling points that contain the linear target. Optionally, for sampling points that do not contain the linear target, the position information may be omitted or set to 0 by default.
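The tensor shapes implied by the description above can be sketched as follows: a W×H = 512×288 input downsampled by a factor of 32 to a 16×9 intermediate feature image, a two-channel classification head, and a 128-channel regression head (one y offset per sampling line). This is a shape-only sketch, not the patent's network; the helper name `head_shapes` is illustrative.

```python
import numpy as np

# Backbone stride and input size taken from the text: W x H = 512 x 288,
# downsampled by 32 to a w x h = 16 x 9 grid of cells.
W, H, STRIDE = 512, 288, 32
w, h = W // STRIDE, H // STRIDE  # 16 x 9

def head_shapes():
    cls_out = np.zeros((2, h, w))    # first branch: per-grid class scores
    reg_out = np.zeros((128, h, w))  # second branch: per-grid y offsets
    return cls_out.shape, reg_out.shape

print(head_shapes())  # ((2, 9, 16), (128, 9, 16))
```

Keeping the classification and regression outputs on the same 16×9 grid lets the regression branch be read out only at the cells the classification branch marks as containing the lever.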

Before the convolutional neural network makes actual predictions, it must be trained on a preset training set. The training process of convolutional neural network 1 includes: obtaining or inputting sample sampled images, which contain linear targets, to construct the training set; and inputting the training set into the convolutional neural network and training it with a loss function to obtain the final network model. Specifically, when constructing the training set, the sample perceptual images are sampled and the annotated sample sampled images are fed to the network for learning; the sampling method is as described above. For training, a corresponding loss function must be constructed; the loss function designed for linear targets is as follows:

loss = (1/N_cls)·Σ_(i=1..w) Σ_(j=1..h) [loss_cls + l_obj·loss_offset]    (3)

In formula (3), the loss function loss_cls of the first branch network 31 and the loss function loss_offset of the second branch network 32 are computed respectively as:

loss_cls = -[c_t·ln(c_p) + (1 - c_t)·ln(1 - c_p)]    (4)

loss_offset = Σ_k (y_k^p - y_k^gt)²    (5)

For the classification branch, the loss function used in this embodiment is the cross-entropy loss; for the coordinate-offset regression branch, it is the squared-error loss. Here,

l_obj = 1 when a barrier lever passes through the (i, j)-th grid cell, and l_obj = 0 otherwise; N_cls is the size of the final feature map, i.e., the number of grid cells, N_cls = w×h; c_t is the ground-truth value in the (i, j)-th grid cell; c_p is the predicted value in the (i, j)-th grid cell; y_k^p is the offset value predicted by the neural network; and y_k^gt is the ground-truth offset. The training objective of the convolutional neural network is to minimize the system loss; the parameters of the network are adjusted continuously according to the training results until the required convolutional neural network 1 is constructed.
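The loss described above can be sketched numerically: cross-entropy per grid cell for the classification branch, squared error summed over the sampling points for the offset branch, the latter counted only where a lever passes (l_obj = 1), averaged over the N_cls cells. The exact weighting of formula (3) is not fully recoverable from the published text, so this follows the stated definitions only; the function name is illustrative.

```python
import numpy as np

def total_loss(c_t, c_p, l_obj, y_p, y_gt):
    """Loss in the spirit of formulas (3)-(5): per-cell cross-entropy plus
    squared offset error masked by l_obj, averaged over N_cls = w*h cells.
    c_t, c_p, l_obj: arrays of shape (h, w); y_p, y_gt: (channels, h, w)."""
    eps = 1e-9  # guard against log(0)
    loss_cls = -(c_t * np.log(c_p + eps) + (1 - c_t) * np.log(1 - c_p + eps))
    loss_off = np.sum((y_p - y_gt) ** 2, axis=0)  # sum over sampling points k
    n_cls = c_t.size
    return float(np.sum(loss_cls + l_obj * loss_off) / n_cls)
```

With a perfect prediction (c_p = c_t and y_p = y_gt) the value is essentially zero, and it grows as either branch's prediction degrades, which is the behavior the minimization objective above relies on.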

Further, the backbone network 20 of convolutional neural network 1 convolves and pools the sampled image to obtain the intermediate feature image 21 (i.e., the feature region); the branch networks classify and regress the intermediate feature image 21 to obtain the position information of the sampling points corresponding to the linear target in the perceptual image.

Optionally, in this embodiment, the position information of the sampling points corresponding to the linear target in the perceptual image, obtained in this step after classification by the first branch network 31 and the supervised computation of the second branch network 32, may be the offsets Δy_ci = y_ci - y_center of the c discrete sampling points y_ci of the linear target (barrier lever) relative to the grid center point y_center in the y direction.

S400. Obtain the recognition result according to the position information of the sampling points corresponding to the linear target.

The recognition result of the linear target in this embodiment may be the position of the linear target, the state of the linear target, the tilt angle of the linear target relative to the sampling direction, and so on.

Further, if the recognition result of the linear target is the state of the linear target, S400 includes the following steps:

S401. Select or set a reference frame; in this embodiment, the bottom edge and the side edge of the perceptual image are chosen as the x-axis and y-axis of the coordinate system, respectively.

S402. Process the position information of the sampling points corresponding to the linear target and fit an equation of the linear target;

In step S302, the convolutional neural network has already obtained the position information of the sampling points corresponding to the linear target; this may be the x- and y-axis coordinates of the sampling points, or the y-axis offset Δy_ci. If the position information is an offset, the coordinates of the sampling points corresponding to the linear target, i.e., the coordinates of the barrier-lever pixels in the feature region, can be obtained by a simple calculation. From the position coordinates of these sampling points, this embodiment fits a straight line by the least-squares method to obtain the equation of the barrier lever. If the position information is already the position coordinates of the sampling points, the equation of the linear target (such as a barrier lever) can be fitted directly.
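The two operations described above, recovering the absolute y coordinates from the predicted offsets (y_ci = y_center + Δy_ci) and fitting the lever line by least squares, can be sketched as follows. This is a sketch of step S402 under the offset convention stated earlier; the function name is illustrative.

```python
import numpy as np

def lever_line(xs, dys, y_centers):
    """Recover sampling-point coordinates from predicted y offsets and fit
    the lever line y = k*x + b by least squares (np.polyfit, degree 1)."""
    ys = np.asarray(y_centers, dtype=float) + np.asarray(dys, dtype=float)
    k, b = np.polyfit(np.asarray(xs, dtype=float), ys, deg=1)
    return k, b
```

Because the x coordinates of the sampling lines are fixed, the fit is a one-dimensional regression of the recovered y values, which is why a single offset channel per sampling line suffices.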

S403. Based on the linear-target equation, calculate the relative position of the linear target and the reference frame, judge the target state according to the relative position, and output the recognition result.

Further, in this embodiment, the slope of the barrier-lever equation can be calculated from the above equation and coordinate system, and the tilt angle of the lever derived from it; from the change in the tilt angle, it is judged whether the gate state allows the vehicle to pass, and the recognition result is output.

Further, if the recognition result of the linear target is its tilt angle, S400 may also comprise: determining, from the position information of the sampling points corresponding to the linear target, the offset of the linear target in the sampling direction; and determining the recognition result of the linear target from that offset. Specifically, the difference in the sampling-direction coordinate between a sampling point corresponding to the linear target and the center point of the sampling grid cell containing it can be computed as the offset of the linear target in the sampling direction; based on this offset, the tilt angle of the linear target relative to a given direction is determined and taken as the final recognition result. If the linear target is a gate lever, the tilt angle of the lever relative to the horizontal can be determined as the gate-state recognition result.
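The angle-and-threshold decision described above can be sketched as follows. The 30-degree threshold is an assumed placeholder, not a value from this patent, and the function names are illustrative.

```python
import math

def lever_angle_deg(k: float) -> float:
    # Tilt of the lever line y = k*x + b relative to the horizontal axis.
    return math.degrees(math.atan(abs(k)))

def gate_open(k: float, threshold_deg: float = 30.0) -> bool:
    # Assumed decision rule: treat the gate as passable once the lever has
    # lifted past the threshold angle (the threshold value is illustrative).
    return lever_angle_deg(k) >= threshold_deg

print(gate_open(0.05), gate_open(5.0))  # nearly horizontal vs. steeply lifted
```

Tracking this angle over successive frames gives the rising trajectory of the lever, so the critical closed-to-open transition can be detected rather than inferred from a single image.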

This embodiment takes a barrier lever as the linear target. By detecting the lever's sampling points, the corresponding angle change during the lifting process is obtained, and threshold control then yields an effective determination of whether the gate allows passage, making it convenient to track the lever state in real time. Through the angle change, the critical value at which the lever passes from closed to lifted-and-passable can be obtained more precisely, so the reported gate state is more accurate. The present method of classifying the image directly is robust, and can additionally obtain the lever angle, which is convenient for informing the driver of the current lever state.

It should be understood that, although the steps in the flowcharts of the above embodiments are displayed sequentially in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering restriction on their execution, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may comprise multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is also not necessarily sequential, and they may be executed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.

Based on the same inventive concept, an embodiment of the present invention also provides a linear object recognition device for implementing the neural-network-based linear object recognition method described above. The solution provided by this device is similar to the implementation described in the method above, so for the specific limitations in the one or more device embodiments provided below, reference may be made to the limitations of the linear object recognition method above; they are not repeated here.

As shown in Figure 4, in this embodiment the linear object recognition device 100 includes an acquisition module 101, a processing module 102, a data storage module 103, and a transmission module 104, and the device 100 is arranged on a terminal 200. The terminal 200 communicates with the linear object recognition device 100 through the transmission module 104. The data storage module 103 may store or cache the data that the processing module 102 needs to process; it may be integrated on the processing module 102, or placed on the cloud or another server, where the server may be an independent server or a cluster of multiple servers. The acquisition module 101 acquires the perceptual image, which contains the linear target; further, the acquisition module is a camera or the like, and the perceptual image is a picture or a video. The processing module 102 is preset with the trained convolutional neural network described above; it discretely samples and processes the pixels of the perceptual image, recognizes the feature regions of the perceptual image to obtain feature images containing the sampling results, and obtains the recognition result from the feature images. The processing module 102 transmits the recognition result to the terminal 200 through the transmission module 104.
The terminal 200 may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet, an IoT device, or a portable wearable device; IoT devices include smart speakers, smart TVs, smart air conditioners, smart in-vehicle devices, and the like, and portable wearable devices include smart watches, smart bracelets, head-mounted devices, and the like. In this embodiment, the terminal 200 is a vehicle.

Each module in the above linear object recognition device may be implemented wholly or partly in software, hardware, or a combination of the two. The modules may be embedded in, or independent of, the processor of a computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.

An embodiment of the present invention also provides an application scenario for the neural-network-based linear object recognition method above: a vehicle provided with the linear object recognition device 100. The vehicle further includes a control device connected to the device 100; the control device performs calculations according to the recognition result of the processing module 102, judges whether the vehicle may pass, and, if so, controls the vehicle to move forward. The control device performs the following steps: selecting or setting a reference frame and fitting the recognition result to obtain the target equation; calculating the relative position of the target and the reference frame, and judging the target state according to the relative position; and using the target state as the basis for judging whether the vehicle moves forward. Further, in this embodiment, the control device performs step S400 of the neural-network-based linear object recognition method above: it judges, from the change in the tilt angle of the barrier lever, whether the gate state allows the vehicle to pass, comparing the angle with a preset threshold; once the passing requirement is met, the control device controls the vehicle to move forward or issues a go-ahead signal.
Further, when the vehicle of this embodiment uses automatic-driving technology or is set to automatic-driving mode, the control device controls the vehicle forward automatically; when the vehicle of this embodiment is an ordinary vehicle, the control device issues a go-ahead signal.

In one embodiment, a computer device is provided, which may be a server. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device provides computing and control capability. The memory includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system, a computer program, and a database, and the internal memory provides the environment in which the operating system and computer program in the non-volatile storage medium run. The database of the computer device is used to store images and computation-process data. The network interface of the computer device communicates with external terminals through a network connection. When executed by the processor, the computer program implements the neural-network-based linear object recognition method described above.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be accomplished by instructing the relevant hardware through a computer program, which may be stored in a non-volatile computer-readable storage medium; when executed, the computer program may include the processes of the embodiments of the above methods. Any reference to memory, database, or other media used in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive RAM (ReRAM), magnetoresistive RAM (MRAM), ferroelectric RAM (FRAM), phase-change memory (PCM), graphene memory, and the like. Volatile memory may include random-access memory (RAM) or an external cache; by way of illustration and not limitation, RAM may take many forms, such as static RAM (SRAM) or dynamic RAM (DRAM).
The databases involved in the embodiments provided in this application may include at least one of a relational database and a non-relational database; non-relational databases may include blockchain-based distributed databases and the like, without limitation. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, quantum-computing-based data-processing logic devices, and the like, without limitation.

The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of these features are described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.

The above embodiments express only several implementations of the present application; their description is relatively specific and detailed, but should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, and these all fall within its protection scope. Therefore, the protection scope of the present application shall be determined by the appended claims.

Claims (10)

1.一种线状目标识别方法,其特征在于,所述方法包括:1. A linear target recognition method, characterized in that the method comprises: 获取感知图像,所述感知图像中包括待识别的线状目标物;Acquiring a perceptual image, the perceptual image includes a linear target to be identified; 对感知图像进行离散采样,得到含有离散采样结果的采样图像;Perform discrete sampling on the perceptual image to obtain a sampled image containing discrete sampling results; 通过卷积神经网络识别所述采样图像的特征区域,得到线状目标物对应的采样点的位置信息;Identifying the feature region of the sampled image through a convolutional neural network to obtain the position information of the sampling point corresponding to the linear target; 根据所述线状目标物对应的采样点的位置信息,确定线状目标物的识别结果。The recognition result of the linear target is determined according to the position information of the sampling point corresponding to the linear target. 2.根据权利要求1所述的线状目标识别方法,其特征在于,所述卷积神经网络包括主干网络和分支网络,所述主干网络用于识别和提取所述采样图像中的特征区域,并输出预设尺寸的特征图像至分支网络;所述分支网络至少包括第一分支网络和第二分支网络,第一分支网络和第二分支网络生成具有不同尺度的特征图像,或生成对应于所述采样图像的不同区域的特征图像;所述第一分支网络对所述采样图像进行分类;所述第二分支网络感知包含线状目标物的采样点的位置信息。2. linear target recognition method according to claim 1, is characterized in that, described convolutional neural network comprises backbone network and branch network, and described backbone network is used for identifying and extracting the characteristic area in described sampling image, And output a feature image of a preset size to the branch network; the branch network includes at least a first branch network and a second branch network, the first branch network and the second branch network generate feature images with different scales, or generate feature images corresponding to the The feature images of different regions of the sampled image; the first branch network classifies the sampled image; the second branch network perceives the position information of the sampling point containing the linear target. 3.根据权利要求1所述的线状目标识别方法,其特征在于,获取所述感知图像后,还包括:对所述感知图像进行预处理。3 . 
The linear target recognition method according to claim 1 , further comprising: preprocessing the perceptual image after acquiring the perceptual image. 4 . 4.根据权利要求1至3任意一项所述的线状目标识别方法,其特征在于,所述对感知图像进行离散采样,得到采样图像,包括:4. The linear target recognition method according to any one of claims 1 to 3, wherein said performing discrete sampling on the perceptual image to obtain a sampled image comprises: 对所述感知图像的像素每间隔预设个点采样;Sampling the pixels of the perceptual image at intervals of preset points; 以所述感知图像为基准建立坐标系;Establishing a coordinate system based on the perceived image; 获得采样点坐标;Obtain the coordinates of the sampling point; 基于采样点坐标,得到所述感知图像对应的采样图像。Based on the coordinates of the sampling points, a sampling image corresponding to the perceived image is obtained. 5.根据权利要求1至3任意一项所述的线状目标识别方法,其特征在于,所述采样点坐标相对于所述特征区域的中心点的偏移值为所述采样点的位置信息。5. The linear target recognition method according to any one of claims 1 to 3, wherein the offset value of the sampling point coordinates relative to the center point of the feature area is the position information of the sampling point . 6.根据权利要求4所述的线状目标识别方法,其特征在于,所述采样点坐标为(xi,yi),则xi=si-1,输入所述神经网络的采样点数据为yi值,其中,xi为采样点的横坐标,yi为采样点的纵坐标,s为所述感知图像的像素采样间隔的预设值,i=0,1,2,3……。6. linear target recognition method according to claim 4, is characterized in that, described sampling point coordinate is ( xi , y i ), then x i =si-1, input the sampling point data of described neural network is the value of y i , wherein, x i is the abscissa of the sampling point, y i is the ordinate of the sampling point, s is the preset value of the pixel sampling interval of the perceptual image, i=0,1,2,3... … 7.一种线状目标识别装置,其特征在于,包括:7. 
7. A linear target recognition device, characterized in that it comprises:
an acquisition module, configured to acquire a perceptual image, the perceptual image including a linear target to be identified; and
a processing module, preset with a convolutional neural network; the processing module performs discrete sampling on the pixels of the perceptual image to obtain a sampled image containing the discrete sampling results; the processing module recognizes the sampled image to obtain the position information of the sampling points corresponding to the linear target; and according to the position information of the sampling points corresponding to the linear target, the recognition result of the linear target is determined.
8. A vehicle using the recognition device according to claim 7, characterized in that it further comprises:
a control device, connected to the recognition device, the control device performing calculation according to the result of the processing module and judging the passability of the vehicle; if the judgment is affirmative, the control device controls the vehicle to move forward.
9. The vehicle according to claim 8, characterized in that the control device performs the following steps:
selecting or setting a reference frame, fitting the recognition result, and obtaining a linear target equation;
based on the linear target equation, calculating the relative position of the linear target with respect to the reference frame, and judging the state of the target according to the relative position; and
using the state of the target as the basis for judging whether the vehicle may move forward.
10. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 6 when executing the computer program.
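The flow recited in claims 7 to 9 — fitting the recognized sampling points to a linear target equation, computing the relative position of that line to a reference frame, and judging whether the vehicle may proceed — can be sketched roughly as below. This is an illustrative reconstruction, not the patented implementation: the coordinate convention, the first-order least-squares fit, the reference point, and the clearance threshold are all assumptions introduced for the example.

```python
import numpy as np

def fit_linear_target(points):
    """Fit a line y = a*x + b through the sampling points recognized
    for a linear target (the 'linear target equation' of claim 9).
    `points` is an (N, 2) sequence of (x, y) coordinates."""
    pts = np.asarray(points, dtype=float)
    a, b = np.polyfit(pts[:, 0], pts[:, 1], deg=1)  # first-order fit (assumption)
    return a, b

def distance_to_line(a, b, x0, y0):
    """Perpendicular distance from the reference point (x0, y0)
    to the fitted line y = a*x + b."""
    return abs(a * x0 - y0 + b) / np.hypot(a, 1.0)

def may_proceed(points, reference=(0.0, 0.0), clearance=1.5):
    """Judge passability: proceed only if the linear target lies farther
    from the reference point than the clearance threshold
    (the threshold value here is purely illustrative)."""
    a, b = fit_linear_target(points)
    return distance_to_line(a, b, *reference) > clearance

# Sampling points lying roughly on the line y = 0.5*x + 4
pts = [(0, 4.1), (2, 5.0), (4, 6.0), (6, 7.1)]
print(may_proceed(pts))
```

With these points the fitted line sits about 3.6 units from the origin, so the sketch judges the path clear; tightening `clearance` above that distance flips the decision.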
CN202310170650.XA 2023-02-27 2023-02-27 Linear object recognition method and device, vehicle and equipment Pending CN116206276A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310170650.XA CN116206276A (en) 2023-02-27 2023-02-27 Linear object recognition method and device, vehicle and equipment


Publications (1)

Publication Number Publication Date
CN116206276A true CN116206276A (en) 2023-06-02

Family

ID=86507425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310170650.XA Pending CN116206276A (en) 2023-02-27 2023-02-27 Linear object recognition method and device, vehicle and equipment

Country Status (1)

Country Link
CN (1) CN116206276A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119758696A (en) * 2025-03-05 2025-04-04 四川京炜交通工程技术有限公司 Quick detection method and system for rise and fall time of electric railing for charging

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090456A (en) * 2017-12-27 2018-05-29 北京初速度科技有限公司 A kind of Lane detection method and device
CN109345589A (en) * 2018-09-11 2019-02-15 百度在线网络技术(北京)有限公司 Location detection method, device, device and medium based on autonomous vehicle
US20190188530A1 (en) * 2017-12-20 2019-06-20 Baidu Online Network Technology (Beijing) Co., Ltd . Method and apparatus for processing image
WO2021175006A1 (en) * 2020-03-04 2021-09-10 深圳壹账通智能科技有限公司 Vehicle image detection method and apparatus, and computer device and storage medium
CN113591967A (en) * 2021-07-27 2021-11-02 南京旭锐软件科技有限公司 Image processing method, device and equipment and computer storage medium



Similar Documents

Publication Publication Date Title
US11775574B2 (en) Method and apparatus for visual question answering, computer device and medium
US12136262B2 (en) Segmenting objects by refining shape priors
CN111666921B (en) Vehicle control method, apparatus, computer device, and computer-readable storage medium
US11768876B2 (en) Method and device for visual question answering, computer apparatus and medium
CN110929622B (en) Video classification method, model training method, device, equipment and storage medium
CN110458095B (en) Effective gesture recognition method, control method and device and electronic equipment
CN112784869B (en) A fine-grained image recognition method based on attention perception and adversarial learning
WO2021218786A1 (en) Data processing system, object detection method and apparatus thereof
WO2021093435A1 (en) Semantic segmentation network structure generation method and apparatus, device, and storage medium
CN111368672A (en) Construction method and device for genetic disease facial recognition model
CN110070107A (en) Object identification method and device
CN108830285A (en) A kind of object detection method of the reinforcement study based on Faster-RCNN
CN113762327B (en) Machine learning method, machine learning system and non-transitory computer readable medium
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
US20240362818A1 (en) Method and device with determining pose of target object in query image
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
CN115731530A (en) Model training method and device
CN115965786A (en) Occluded Object Recognition Method Based on Local Semantic-Aware Attention Neural Network
US20240242365A1 (en) Method and apparatus with image processing
KR20230164384A (en) Method For Training An Object Recognition Model In a Computing Device
CN115775377B (en) Automatic driving lane line segmentation method with fusion of image and steering angle of steering wheel
CN116206276A (en) Linear object recognition method and device, vehicle and equipment
CN110111358B (en) Target tracking method based on multilayer time sequence filtering
CN108596013B (en) Pedestrian detection method and device based on multi-granularity deep feature learning
CN118262258B (en) Ground environment image aberration detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination