TWI745204B - High-efficiency LiDAR object detection method based on deep learning - Google Patents
Info
- Publication number
- TWI745204B (application TW109146375A)
- Authority
- TW
- Taiwan
- Prior art keywords
- dimensional
- lidar
- point cloud
- deep learning
- efficiency
- Prior art date
Images
Landscapes
- Image Analysis (AREA)
- Optical Radar Systems And Details Thereof (AREA)
Abstract
The present invention provides a high-efficiency LiDAR object detection method based on deep learning, comprising the steps of: (A) obtaining three-dimensional point cloud data from a LiDAR, the three-dimensional point cloud data being N×4-dimensional information, where N is the number of LiDAR points and each LiDAR point carries spatial information (x-axis, y-axis, and z-axis values) and reflection intensity information; (B) rotating and translating the spatial information of the three-dimensional point cloud data by means of a spatial transformation matrix; (C) applying multiple one-dimensional convolutions to the three-dimensional point cloud data to expand the feature dimension, and pooling to obtain a high-dimensional feature; (D) mapping the three-dimensional point cloud data onto a two-dimensional image and extracting two-dimensional point cloud features; and (E) feeding the high-dimensional feature and the two-dimensional point cloud features into an adjustment model for feature adjustment.
Description
The present invention relates to a high-efficiency LiDAR object detection method based on deep learning, and more particularly to such a method that accelerates and improves object feature extraction.
Among existing object detection methods, image-based approaches have been established for years, with numerous techniques available through both traditional image recognition and deep learning. However, to understand the full surroundings and perceive distance more faithfully, recognition based on point clouds acquired by LiDAR (light detection and ranging) is the more reliable approach for future autonomous vehicles. Although its detection range is shorter than that of millimeter-wave radar, LiDAR offers higher spatial resolution, sufficient to distinguish objects by their shape. LiDAR has therefore become an important sensor and is widely used in autonomous driving.
In the prior art, it has been proposed to extract hand-crafted features from LiDAR data to obtain a feature vector and classify pedestrians inside an office. Among three-dimensional deep learning methods, point clouds have been converted into voxels, and features computed by a 3D convolutional neural network have been used to assign class labels to the three-dimensional point cloud data. Furthermore, robust features that preserve the permutation invariance of the point cloud have been extracted through convolution and pooling; such features can be used for classification or semantic segmentation.
Most of the techniques above achieve good results with vision-based methods, but only under clear image conditions. Real conditions are harsher: weather or ambient brightness can blur the image and degrade recognition. The LiDAR used here is highly robust to such external conditions. However, although the LiDAR covers a detection range of 50 meters, beam divergence makes the point cloud sparser as the detection distance grows, so obstacles become harder to recognize; this is one of the difficulties of LiDAR-based detection.
Another issue is that processing LiDAR data is cumbersome. Unlike images, which have a pixelated structure, the objects collected by a LiDAR are merely point clouds, i.e., unstructured sets of points. For a specific object, features tailored to that object can be designed for discrimination, but such features are less accurate on other objects. Emerging deep learning methods avoid the problem of hand-crafted features and have recently been applied to this kind of three-dimensional data, usually by voxelizing the point cloud structure; voxelization, however, is too time-consuming, so such methods are unsuitable for autonomous driving, which requires both real-time operation and accuracy.
In summary, current voxelized network models are both complex and time-consuming. The applicant has therefore developed, through painstaking research, a network architecture for a high-efficiency LiDAR object detection method based on deep learning that obtains a concise and fast three-dimensional feature by processing the three-dimensional point data directly; we call it the Fast LiDar Net architecture.
In view of the shortcomings of the known techniques described above, the main objective of the present invention is to provide a high-efficiency LiDAR object detection method based on deep learning, proposing a network architecture that obtains a concise and fast three-dimensional feature by processing the three-dimensional point data directly, thereby overcoming the complexity and time consumption of current voxelized network models.
To achieve the above objective, one aspect of the present invention provides a high-efficiency LiDAR object detection method based on deep learning, comprising the steps of: (A) obtaining three-dimensional point cloud data from a LiDAR, the three-dimensional point cloud data being N×4-dimensional information, where N is the number of LiDAR points and each LiDAR point carries spatial information (x-axis, y-axis, and z-axis values) and reflection intensity information; (B) rotating and translating the spatial information of the three-dimensional point cloud data by means of a spatial transformation matrix; (C) applying multiple one-dimensional convolutions to the three-dimensional point cloud data to expand the feature dimension, and pooling to obtain a high-dimensional feature; (D) mapping the three-dimensional point cloud data onto a two-dimensional image and extracting two-dimensional point cloud features; and (E) feeding the high-dimensional feature and the two-dimensional point cloud features into an adjustment model for feature adjustment.
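For illustration only, the following is a minimal PyTorch-style sketch of steps (B) and (C): a small transform network estimates a 3×3 spatial transformation matrix from the x, y, z channels, the transformed points are expanded by repeated one-dimensional convolutions, and max pooling yields a single high-dimensional feature vector. The class name `PointFeatureNet`, all layer sizes, and hyperparameters are assumptions, not taken from the patent.

```python
# Hedged sketch of steps (B)-(C); names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class PointFeatureNet(nn.Module):
    """Spatial transform, 1-D convolutions, and max pooling over N points."""
    def __init__(self, out_dim=1024):
        super().__init__()
        # Predicts a 3x3 spatial transformation matrix from the xyz channels (step B).
        self.tnet = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1), nn.Flatten(),
            nn.Linear(128, 9),
        )
        # Repeated 1-D convolutions expand the per-point feature dimension (step C).
        self.mlp = nn.Sequential(
            nn.Conv1d(4, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, out_dim, 1), nn.ReLU(),
        )

    def forward(self, points):            # points: (B, N, 4) = xyz + intensity
        xyz, intensity = points[..., :3], points[..., 3:]
        t = self.tnet(xyz.transpose(1, 2)).view(-1, 3, 3)
        xyz = torch.bmm(xyz, t)           # apply the estimated spatial transformation
        feats = torch.cat([xyz, intensity], dim=-1).transpose(1, 2)
        feats = self.mlp(feats)           # (B, out_dim, N)
        return feats.max(dim=2).values    # max pooling -> one high-dimensional feature
```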
Preferably, in step (D), different dilation coefficients may be applied to the two-dimensional image according to distance.
Preferably, in step (D), the three image channels of the two-dimensional image may be assigned meaningful values derived from the reflection intensity information and the reflection point distance.
Preferably, the three image channels may include a blue channel, a green channel, and a red channel: the blue channel holds the reflection intensity of the LiDAR point, rescaled to 0–255 according to its value range; the green channel is filled with the value 255 inside the contour of the LiDAR projection image; and the red channel holds the distance between the LiDAR point and the center of the two-dimensional image, rescaled to 0–255 according to its value range.
Preferably, the method may be integrated using the Robot Operating System and deployed on an embedded system.
Preferably, in step (A), a ground removal step may further be performed, using Random Sample Consensus (RANSAC) to remove the LiDAR ground points; before applying RANSAC, the point cloud is downsampled by voxelization (a voxel filter) so that the ground has the same y-axis value.
Preferably, an object clustering step may further be performed, grouping LiDAR points by the distance between them, with a K-D tree used as the search structure; when the distance between two LiDAR points is less than 0.2 m, the two points are labeled as the same group.
The foregoing summary, the following detailed description, and the accompanying drawings all serve to further illustrate the means, measures, and effects adopted by the present invention to achieve the intended objective. Other objectives and advantages of the present invention will be set forth in the subsequent description and drawings.
S1-S5, S11-S15: steps
The first figure is a flowchart of object feature extraction in the deep-learning-based high-efficiency LiDAR object detection method of the present invention.
The second figure is a flowchart of the deep-learning-based high-efficiency LiDAR object detection method of the present invention.
The third figure is a schematic comparison, according to the present invention, before and after the reflection intensity information and reflection point distance of the LiDAR points are added.
The fourth figure is a schematic diagram of the Fast LiDar Net architecture of the present invention.
The following describes embodiments of the present invention through specific examples; those skilled in the art can readily understand the advantages and effects of the invention from the content disclosed in this specification.
Please refer to the first figure, a flowchart of object feature extraction in the deep-learning-based high-efficiency LiDAR object detection method of the present invention, and to the second figure, a flowchart of the method itself. The present invention provides a high-efficiency LiDAR object detection method based on deep learning, comprising: step S1, obtaining three-dimensional point cloud data from a LiDAR (i.e., the data collection of step S11), the three-dimensional point cloud data being N×4-dimensional information, where N is the number of LiDAR points and each LiDAR point carries spatial information (x-axis, y-axis, and z-axis values) and reflection intensity information. The object feature extraction of step S14 comprises, in detail: step S2, rotating and translating the spatial information of the three-dimensional point cloud data with a spatial transformation matrix. More specifically, to accelerate the overall model, we first change how the three-dimensional deep learning features are computed. Whereas previous voxelized network models are complex and time-consuming, we propose a network architecture that obtains a concise and fast three-dimensional feature by processing the three-dimensional point data directly: a spatial transformation matrix is first estimated from only the LiDAR spatial information (x, y, z axes) through a Transform Net, and after the three-dimensional point cloud data is preprocessed with this spatial transformation matrix, point cloud objects of the same class should be more spatially consistent. Step S3, applying multiple one-dimensional convolutions to the three-dimensional point cloud data to expand the feature dimension, and pooling to obtain a high-dimensional feature; this feature allows three-dimensional information to be acquired quickly. The architecture of steps S2 to S3 is shown in the fourth figure; we call it Fast LiDar Net, and this improved deep learning network is used for feature extraction in place of the previous, computationally expensive deep learning features and hand-crafted features, effectively improving computational efficiency. Step S4, mapping the three-dimensional point cloud data onto a two-dimensional image and extracting two-dimensional point cloud features: although the Fast LiDar Net architecture helps capture three-dimensional spatial information, its accuracy alone is not high enough and would lower the recognition rate, so the two-dimensional image of step S4 is fused in to raise accuracy. Compared with pure three-dimensional information this helps accuracy considerably without reducing speed very much, a compromise between speed and accuracy. And step S5, feeding the high-dimensional feature and the two-dimensional point cloud features into the adjustment model for feature adjustment: in detail, this step feeds the computation time of each feature, together with its uniqueness and effectiveness, into the model, which selects efficient feature values for fusion and yields fast and accurate results. To achieve real-time LiDAR object detection, the feature fusion method of this approach can, for a single point cloud, separate out and classify the 20 objects most likely to be detection targets at a processing speed of about 80 milliseconds, equivalent to 14 FPS, with an accuracy of 94.3%.
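The two-dimensional branch of step S4 is not specified layer by layer in the text; the following is a hedged sketch of one plausible form, a small convolutional network that reduces the projected three-channel image to a compact feature vector. The name `ImageFeatureNet` and every layer size are assumptions.

```python
# Illustrative sketch (not the patent's exact network) of the 2-D feature branch.
import torch
import torch.nn as nn

class ImageFeatureNet(nn.Module):
    def __init__(self, out_dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, out_dim), nn.ReLU(),
        )

    def forward(self, proj_image):         # proj_image: (B, 3, H, W) projected point cloud
        return self.features(proj_image)   # (B, out_dim) two-dimensional point cloud feature
```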
As noted above, in the object feature extraction of step S14, in addition to extracting features from the three-dimensional LiDAR point information, step S14 also maps the three-dimensional point cloud onto a two-dimensional image and extracts two-dimensional point cloud features.
In this embodiment, in step S4, different dilation coefficients may be applied to the two-dimensional image according to distance, and the three channels of the two-dimensional image may be assigned meaningful values derived from the reflection intensity information and reflection point distance. More specifically, in the process of converting the LiDAR data into a two-dimensional image, the mechanical characteristics of the LiDAR cause the point cloud to have different sparsity at different distances. To make the LiDAR image suitable for deep-learning image recognition, different dilation coefficients are used at different object distances to solve the problem of holes in the two-dimensional image; specifically, the dilation coefficient is doubled every five meters, and finally the object contour is found and filled. Meaningful values are also assigned to the three image channels: the original single-channel image is augmented with the LiDAR reflection intensity information and the physically meaningful reflection point distance, turning it into a three-channel image. One can think of the original single-channel image as a grayscale image that is now changed into a three-channel color image (RGB), except that the three channels do not hold RGB color information but the LiDAR reflection intensity and the physically meaningful reflection point distance: the blue channel holds the reflection intensity of the LiDAR point, rescaled to 0–255 according to its value range; the green channel is filled with the value 255 inside the contour of the LiDAR projection image; and the red channel holds the distance between the LiDAR point and the image center, rescaled to 0–255 according to its value range. The values of these three channels help the deep learning model recognize objects more effectively (as shown in the third figure).
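The projection itself can be sketched as follows with NumPy and OpenCV. Only the channel semantics (blue = intensity, green = filled contour, red = distance to the image center, each rescaled to 0–255) and the idea of a distance-dependent dilation come from the text; the image size, the upstream projection of points onto the image plane, and the exact growth schedule of the dilation kernel are assumptions.

```python
# Hedged sketch of encoding one clustered object as a three-channel image.
import numpy as np
import cv2

def encode_object_image(points_uv, intensity, object_range_m, img_size=64):
    """points_uv: (N, 2) image-plane coordinates of one clustered object (assumed
    to come from an upstream projection step); intensity: (N,) LiDAR reflectances;
    object_range_m: assumed mean range of the cluster in meters."""
    img = np.zeros((img_size, img_size, 3), np.uint8)
    cx = cy = img_size // 2

    u = np.clip(points_uv[:, 0].astype(int), 0, img_size - 1)
    v = np.clip(points_uv[:, 1].astype(int), 0, img_size - 1)

    # Blue channel: reflection intensity rescaled to 0-255.
    lo, hi = intensity.min(), intensity.max() + 1e-6
    img[v, u, 0] = np.interp(intensity, (lo, hi), (0, 255)).astype(np.uint8)

    # Red channel: distance from each projected point to the image center, 0-255.
    dist = np.hypot(u - cx, v - cy)
    img[v, u, 2] = np.interp(dist, (0, dist.max() + 1e-6), (0, 255)).astype(np.uint8)

    # Distance-dependent dilation to fill holes in sparse, far-away objects.
    # The text doubles the coefficient every five meters; a simple per-five-meter
    # step is used here as a stand-in for that tuning choice.
    k = max(1, int(object_range_m // 5))
    kernel = np.ones((2 * k + 1, 2 * k + 1), np.uint8)
    mask = cv2.dilate((img.sum(axis=2) > 0).astype(np.uint8), kernel)

    # Green channel: 255 inside the dilated object contour.
    img[mask > 0, 1] = 255
    return img
```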
In this embodiment, the method of the present invention may be integrated using the Robot Operating System and deployed on an embedded system; it does not consume large computing resources and can run on a simple embedded device with good execution efficiency.
In this embodiment, a ground removal step S12 may further be performed, using Random Sample Consensus (RANSAC) to remove the LiDAR ground points; before applying RANSAC, the point cloud is downsampled by voxelization (a voxel filter) so that the ground has the same y-axis value. More specifically, the ground removal of step S12 serves the effective clustering of the next step S13: the LiDAR ground rings must first be segmented out, and the non-ground portion is passed to the next step for clustering. RANSAC is used to remove the ground, and before RANSAC is applied, voxel-filter downsampling makes the ground have as nearly the same y-axis value as possible, which also speeds up RANSAC and lets it segment the ground more precisely.
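A minimal sketch of this step, assuming the Open3D library for both the voxel filter and the RANSAC plane fit; the voxel size, distance threshold, and iteration count are illustrative values, not from the patent.

```python
# Hedged sketch of ground removal: voxel downsample, RANSAC plane fit, drop inliers.
import numpy as np
import open3d as o3d

def remove_ground(xyz: np.ndarray) -> np.ndarray:
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(xyz)

    # Voxel-filter downsampling flattens the ground so its points share
    # (nearly) the same height before the plane fit.
    down = pcd.voxel_down_sample(voxel_size=0.2)

    # RANSAC fits the dominant plane (the ground) and returns its inlier indices.
    _, inliers = down.segment_plane(distance_threshold=0.1,
                                    ransac_n=3,
                                    num_iterations=100)

    # Keep everything that is NOT on the fitted ground plane.
    non_ground = down.select_by_index(inliers, invert=True)
    return np.asarray(non_ground.points)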
In this embodiment, an object clustering step S13 may further be performed, grouping LiDAR points by the distance between them, with a K-D tree used as the search structure; when the distance between two LiDAR points is less than 0.2 m, the two points are labeled as the same group. Finally, the length, width, and height of each group are screened once more to match plausible object sizes.
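A hedged sketch of the clustering rule, with SciPy's cKDTree standing in for whatever k-d tree implementation the method actually uses: a breadth-first flood fill merges any points within 0.2 m of each other, and the subsequent size screening is left as a note.

```python
# Hedged sketch of Euclidean clustering with a k-d tree neighbour search.
import numpy as np
from scipy.spatial import cKDTree
from collections import deque

def euclidean_cluster(xyz: np.ndarray, tolerance: float = 0.2) -> np.ndarray:
    tree = cKDTree(xyz)
    labels = np.full(len(xyz), -1, dtype=int)
    current = 0
    for seed in range(len(xyz)):
        if labels[seed] != -1:
            continue
        # Breadth-first flood fill: pull in every point within `tolerance`
        # of a point already assigned to the cluster.
        queue = deque([seed])
        labels[seed] = current
        while queue:
            idx = queue.popleft()
            for nb in tree.query_ball_point(xyz[idx], r=tolerance):
                if labels[nb] == -1:
                    labels[nb] = current
                    queue.append(nb)
        current += 1
    return labels  # clusters would then be filtered by bounding-box length/width/height
```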
In this embodiment, an object classification step S15 may further be performed; finally, the extracted and integrated features are passed through a fully connected layer for a one-hot vector classification. Performing deep-learning classification with the features extracted after separating each object establishes the class and position of the objects around the vehicle.
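A minimal sketch of this final stage, assuming the three-dimensional global feature and the two-dimensional image feature are simply concatenated before the fully connected layers; the feature-adjustment weighting described above is omitted, and the feature dimensions and class count are assumptions.

```python
# Illustrative sketch of feature fusion and the fully connected classification head.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, dim_3d=1024, dim_2d=256, num_classes=5):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(dim_3d + dim_2d, 512), nn.ReLU(),
            nn.Linear(512, 128), nn.ReLU(),
            nn.Linear(128, num_classes),   # logits; argmax gives the one-hot class
        )

    def forward(self, feat_3d, feat_2d):
        fused = torch.cat([feat_3d, feat_2d], dim=1)
        return self.head(fused)
```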
In summary, the present invention uses a fast three-dimensional deep model to obtain the advantages of features in three-dimensional space, while the LiDAR projection image builds on the established foundation of deep learning on two-dimensional images, retaining the strengths of two-dimensional images for object discrimination. Combined with two- and three-dimensional hand-set features, a model that trades off accuracy against execution efficiency tunes the method to a suitable operating point: this adjustment model organizes all the features and then, according to their influence on speed and accuracy, extracts the more advantageous ones, so that the whole classifier delivers fast and accurate results after adjustment. In addition, LiDAR emits highly directional, high-precision laser beams to scan the surrounding road environment without depending on lighting conditions, and has been widely explored for autonomous driving in recent years. Relying on the development of LiDAR and the progress of hardware, the system achieves better environmental robustness than pure image recognition, and the proposed algorithm further improves the speed and accuracy of detecting surrounding objects. Moreover, equipping future self-driving systems with LiDAR gives the computer the ability to judge surrounding objects accurately day or night; LiDAR offers a detection range of at least 50 meters over a full 360 degrees and better dynamic discrimination than image recognition, making it very suitable for self-driving vehicles. The deep-learning-based LiDAR point cloud object detection proposed by the present invention is intended to provide self-driving vehicles with a more comprehensive perception of their surroundings in future applications.
The above embodiments are merely illustrative of the features and effects of the present invention and are not intended to limit the scope of its essential technical content. Anyone familiar with the art may modify and vary the above embodiments without departing from the spirit and scope of the invention. The scope of protection of the present invention shall therefore be as listed in the claims set forth below.
S1-S5: steps
Claims (7)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW109146375A TWI745204B (en) | 2020-12-28 | 2020-12-28 | High-efficiency LiDAR object detection method based on deep learning |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW109146375A TWI745204B (en) | 2020-12-28 | 2020-12-28 | High-efficiency LiDAR object detection method based on deep learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TWI745204B true TWI745204B (en) | 2021-11-01 |
| TW202225730A TW202225730A (en) | 2022-07-01 |
Family
ID=79907383
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW109146375A TWI745204B (en) | 2020-12-28 | 2020-12-28 | High-efficiency LiDAR object detection method based on deep learning |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWI745204B (en) |
- 2020-12-28 TW TW109146375A patent/TWI745204B/en active
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102317972A (en) * | 2009-02-13 | 2012-01-11 | 哈里公司 | Registration of 3d point cloud data to 2d electro-optical image data |
| CN108604301A (en) * | 2015-12-04 | 2018-09-28 | 欧特克公司 | The point based on key point of scalable automatic global registration for big RGB-D scannings is to feature |
| US10762673B2 (en) * | 2017-08-23 | 2020-09-01 | Tusimple, Inc. | 3D submap reconstruction system and method for centimeter precision localization using camera-based submap and LiDAR-based global map |
| TW201920982A (en) * | 2017-08-25 | 2019-06-01 | 大陸商北京嘀嘀無限科技發展有限公司 | Methods and systems for detecting environmental information of a vehicle |
| TW202017784A (en) * | 2018-11-07 | 2020-05-16 | 國家中山科學研究院 | Car detection method based on LiDAR by proceeding the three-dimensional feature extraction and the two-dimensional feature extraction on the three-dimensional point cloud map and the two-dimensional map |
| CN111507982A (en) * | 2019-06-28 | 2020-08-07 | 浙江大学 | Point cloud semantic segmentation method based on deep learning |
| CN112099046A (en) * | 2020-09-16 | 2020-12-18 | 辽宁工程技术大学 | Airborne LIDAR 3D Plane Detection Method Based on Multivalued Voxel Model |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220406014A1 (en) * | 2021-06-21 | 2022-12-22 | Cyngn, Inc. | Granularity-flexible existence-based object detection |
| WO2022271742A1 (en) * | 2021-06-21 | 2022-12-29 | Cyngn, Inc. | Granularity-flexible existence-based object detection |
| US11747454B2 (en) * | 2021-06-21 | 2023-09-05 | Cyngn, Inc. | Granularity-flexible existence-based object detection |
| TWI831234B (en) * | 2022-06-06 | 2024-02-01 | 遠創智慧股份有限公司 | Methods for detecting and classifying objects, and related systems |
| TWI814503B (en) * | 2022-07-26 | 2023-09-01 | 鴻海精密工業股份有限公司 | Method for training depth identification model, identifying depth of image and related devices |
Also Published As
| Publication number | Publication date |
|---|---|
| TW202225730A (en) | 2022-07-01 |