WO2019147024A1

WO2019147024A1 - Object detection method using two cameras having different focal distances, and apparatus therefor

Info

Publication number: WO2019147024A1
Application number: PCT/KR2019/000987
Authority: WO
Inventors: 빈 딘꾸앙; 전문구; 윤재웅
Original assignee: Gwangju Institute of Science and Technology
Current assignee: Gwangju Institute of Science and Technology
Priority date: 2018-01-23
Filing date: 2019-01-23
Publication date: 2019-08-01
Anticipated expiration: 2020-07-23
Also published as: KR20190095597A; KR102013781B1

Abstract

A front object detection method for an autonomous vehicle is disclosed. The front object detection method for an autonomous vehicle, according to an embodiment of the present invention, comprises the steps of: obtaining a first image from a first camera having a first focal distance; obtaining a second image from a second camera having a second focal distance that is relatively longer than the first focal distance; detecting an object from the first image and the second image; determining corresponding regions between the first image and the second image; removing an image characteristic from the corresponding region of the first image; and detecting a front object on the basis of the object detection result of the second image and the object detection result of the first image from which the image characteristic has been removed.

Description

Method and apparatus for object detection using two cameras having different focal lengths

The present invention relates to an object detection method and apparatus. Specifically, it relates to an object detection method and apparatus that use two cameras having different focal lengths to detect small objects that are otherwise difficult to detect.

Object detection in computer vision and machine learning is of great importance in a variety of fields, from medicine to robotics. Human vision is very robust at localizing and identifying objects in a scene, but the task is very difficult for computers. The ultimate goal of object detection is to localize and identify instances of the same or different objects in an image.

Conventional object detection methods use color and texture information, shape cues, and local features. The advent of deep neural networks revolutionized object detection. Complex ensemble methods that combined several low-level features for object detection have been displaced, because their accuracy is much lower than that of deep networks.

A deep convolutional neural network (CNN) has previously been proposed for object detection. It proposes regions using selective search and then uses a CNN to build feature representations hierarchically. The final layer uses support-vector networks (SVM) to classify whether each region is an object and fits a bounding box to the object. Fitting the bounding box, however, is treated as a regression problem.

An improved version of the previously known R-CNN was developed as Fast-RCNN. Fast-RCNN improves the time efficiency and accuracy of object detection. Specifically, it removes the redundant CNNs trained for each region proposal and trains only a single CNN per image. Regions are then proposed using the trained CNN features.

Recently, vehicles are often equipped with front-view and rear-view cameras. It would be very economical if these cameras could be reused for vehicle-related applications such as vehicle and pedestrian detection and lane and traffic sign detection. Detecting and recognizing distant objects is currently a challenging problem and is a requirement for vehicles traveling at high speed.

A camera with a longer focal length can capture more distant scenes. In contrast, a camera with a short focal length has a wider field of view and can capture more information. Using two cameras therefore helps detect objects over a wide area as well as distant objects.

Prior Art Documents:

Ryoji Tanabe and Alex Fukunaga, "Improving the Search Performance of SHADE Using Linear Population Size Reduction", Proc. IEEE Congress on Evolutionary Computation, 2014.

D. G. Lowe, "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 2004.

W. Liu et al., "SSD: Single Shot MultiBox Detector", ArXiv151202325 Cs, vol. 9905, pp. 21-37, 2016.

Ren, Jimmy and Chen, Xiaohao and Liu, Jianbo and Sun, Wenxiu and Pang, Jiahao and Yan, Qiong and Tai, Yu-Wing and Xu, Li, "Accurate Single Stage Detector Using Recurrent Rolling Convolution", CVPR, 2017.

Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian, "Faster R-CNN: Towards Real-time Object Detection with Region Proposal Networks", Proceedings of the 28th International Conference on Neural Information Processing Systems.

The present invention proposes an object detection apparatus and method capable of detecting even small, hard-to-detect objects without using an expensive high-definition camera.

A front object detection method for an autonomous vehicle according to an embodiment of the present invention includes: obtaining a first image from a first camera having a first focal length; obtaining a second image from a second camera having a second focal length relatively longer than the first focal length; detecting objects from the first image and the second image; determining a corresponding region between the first image and the second image; removing image features from the corresponding region of the first image; and detecting front objects based on the object detection result of the second image and the object detection result of the first image from which the image features have been removed.

In this case, detecting the front objects may include detecting objects in the first image excluding the corresponding region, as a result of removing the image features from the corresponding region of the first image.

Meanwhile, determining the corresponding region between the first image and the second image may include determining, based on parameters, the region of the first image that corresponds to the region of an object detected in the second image.

In this case, the parameters may include the ratio between the focal length of the first camera and the focal length of the second camera, and the pixel of the first image that corresponds to the origin of the second image.

Meanwhile, the viewing angle of the second camera having the second focal length may be narrower than that of the first camera having the first focal length.

Meanwhile, the range captured by the second camera may be a range in which many objects are expected, as learned from existing detection data.

Meanwhile, a front object detection system for an autonomous vehicle according to an embodiment of the present invention includes a first camera having a first focal length, a second camera having a second focal length relatively longer than the first focal length, and a control unit that performs object detection based on a first image acquired from the first camera and a second image acquired from the second camera. The control unit detects objects from the first image and the second image, determines a corresponding region between the first image and the second image, removes image features from the corresponding region of the first image, and detects front objects based on the object detection result of the second image and the object detection result of the first image from which the image features have been removed.

In this case, by removing the image features from the corresponding region of the first image, the control unit may detect objects in the first image excluding the corresponding region.

Meanwhile, the control unit may determine, based on parameters, the region of the first image that corresponds to the region of an object detected in the second image.

Meanwhile, the viewing angle of the second camera having the second focal length may be narrower than that of the first camera having the first focal length.

By using two cameras having different focal lengths, the present invention can detect even small, hard-to-detect objects without using an expensive camera or a high-performance processor.

FIG. 1 is a conceptual diagram of an object recognition apparatus according to an embodiment of the present invention.

FIG. 2 illustrates the duplicate detection problem that can generally occur when two input images are used.

FIG. 3 shows the flow for computing the relationship between the two input images.

FIG. 4 shows the process of detecting objects using two cameras with different focal lengths.

FIG. 5 shows an accuracy comparison between the present invention and conventional methods.

FIG. 6 is a flowchart illustrating an object detection method according to an embodiment of the present invention.

Hereinafter, specific embodiments of the present invention are described in detail with reference to the drawings. The spirit of the present invention is not limited to the embodiments below, however. A person skilled in the art who understands the spirit of the invention could readily propose other embodiments within the same scope by adding, changing, or deleting components, and these also fall within the scope of the spirit of the invention.

To present the idea of the invention clearly, the accompanying drawings may omit fine details when illustrating the overall structure, and may not reflect the overall structure when illustrating fine details. In addition, components that function identically are given the same name even if specifics such as their installation position differ, for ease of understanding. Where there are several identical components, only one is described; the same description applies to the others and is omitted.

Existing object detection methods use a single camera, and the problem that arises is detecting objects of different sizes, especially small ones. Recently introduced object detection methods attempt to detect small objects using all the computed feature maps of a convolutional network. However, their performance is limited, and they cannot detect distant objects.

As shown in FIG. 1, an object recognition apparatus according to an embodiment of the present invention includes a main camera (1) having a short focal length and a wide field of view and, conversely, a support camera (2) having a long focal length and a narrow field of view. The apparatus may further include a control unit (not shown) that processes the image acquired from the main camera (1) and the image acquired from the support camera (2) to perform object detection. The following describes the operation of the control unit, or of a system including the control unit, that processes the images acquired from the main camera (1) and the support camera (2). Here, the control unit may be a processor, specifically a processor that executes a program configured according to the algorithm described below.

In the present invention, as shown in FIG. 1, objects are detected using two cameras with different focal lengths. The camera with the short focal length is referred to as the main camera (1, Cm), and the camera with the long focal length as the support camera (2, Cs). The main camera (1) has a wide viewing angle and can capture a wide scene. In contrast, the support camera (2) has a narrow field of view and can capture distant scenes. The focal lengths of the two cameras are arbitrary, so a sufficiently long focal length makes it possible to detect very distant objects. This capability is particularly useful for vehicles traveling at high speed, which must alert the driver to traffic signals or obstacles as early as possible.
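The trade-off between focal length and field of view can be made concrete with the ideal pinhole relation FOV = 2·atan(w / 2f). The sketch below applies it to the 12 mm and 35 mm focal lengths used in the FIG. 2 example. The sensor width is an assumed value chosen only for illustration; it does not come from the patent.

```python
import math


def horizontal_fov_deg(focal_length_mm: float, sensor_width_mm: float) -> float:
    """Horizontal field of view of an ideal pinhole camera, in degrees."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


SENSOR_WIDTH_MM = 6.4  # assumption for illustration, not a value from the patent

main_fov = horizontal_fov_deg(12.0, SENSOR_WIDTH_MM)     # main camera, short focal length
support_fov = horizontal_fov_deg(35.0, SENSOR_WIDTH_MM)  # support camera, long focal length
focal_ratio = 35.0 / 12.0  # how much larger the same object appears in the support image

print(f"main FOV {main_fov:.1f} deg, support FOV {support_fov:.1f} deg, ratio {focal_ratio:.2f}")
```

Under these assumptions the support camera sees roughly a third of the angle covered by the main camera, but renders the same object nearly three times larger.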

However, using two cameras with different focal lengths raises several problems. The relationship between the two cameras must be established so that the corresponding regions between them can be computed.

In addition, the same object may appear in both the image captured by the main camera (hereinafter the 'main image') and the image captured by the support camera (hereinafter the 'support image'). As a result, duplicate detection can become a problem.

In FIG. 2, (a) shows an image captured by a camera with a 35 mm focal length, and (b) shows an image captured by a camera with a 12 mm focal length. (c) shows the object detection result for (a), and (d) shows the object detection result for (b).

Taking FIG. 2 as an example, the support image is captured with a 35 mm focal length camera, and the main image is captured with a 12 mm focal length camera. Faster-RCNN, a known algorithm, is used to detect objects in the two images independently. As a result, the same vehicle is detected twice, so the two images cannot be used directly as inputs. Solving this problem requires another approach that uses both images while avoiding duplicate object detection.

An object detection method according to an embodiment of the present invention uses the L-SHADE algorithm (prior art document 1) to optimize a constructed energy function.

Like other evolutionary algorithms, L-SHADE is a black-box optimization algorithm. Applying L-SHADE to a problem requires two main things. The first is to construct an encoding for the problem under consideration: the parameters to be optimized must be identified and encoded in a form that a computer can process (for example, an array structure). The second is to construct an energy function whose minimization yields the optimized parameter values.
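The two ingredients named above, an array encoding and an energy function, are all a black-box optimizer needs. The sketch below shows this with plain differential evolution (DE/rand/1/bin), which is a deliberately simpler stand-in and not L-SHADE itself: L-SHADE additionally adapts its control parameters from a success history and shrinks the population linearly over time. The function name, default settings, and bound clamping are illustrative assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple


def differential_evolution(
    cost: Callable[[List[float]], float],
    bounds: Sequence[Tuple[float, float]],
    pop_size: int = 20,
    generations: int = 100,
    f: float = 0.5,
    cr: float = 0.9,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize `cost` over the box `bounds` with plain DE/rand/1/bin (not L-SHADE)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Encoding: each candidate is a flat array with one entry per parameter.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated coordinate
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == j_rand or rng.random() < cr:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clamp to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = cost(trial)
            # Greedy selection: a candidate is only replaced by one that is no worse.
            if trial_cost <= costs[i]:
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]
```

Because selection is greedy, the best cost never gets worse from one generation to the next, which is the property the optimizer relies on regardless of how the energy function is defined.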

The object detection framework according to an embodiment of the present invention comprises two separate processes. The first process, shown in FIG. 3, computes the relationship between the two input images and is performed offline. This relationship is determined by three parameters: the ratio of the two focal lengths, and the pixel (x, y) of the main image that corresponds to the origin (0, 0) of the support image.

In the present invention, SIFT (prior art document 2) is used to find a number of corresponding points between the two images. An energy function is then constructed from the computed correspondences, and the L-SHADE algorithm is used to optimize it. The optimization yields the best estimates of the parameters r, x, and y.

The online process detects objects using the two images, as shown in FIG. 4. First, objects are detected in the support image using an existing object detection method. The regions of the detected objects are then mapped onto the main image so that the corresponding regions in the main image are not detected again. Next, the object detection method is applied to the processed main image. Finally, the objects detected in the two detection steps are combined and output as the final result for the main image.
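The online flow can be sketched as below. This is a minimal illustration under stated assumptions, not the patent's implementation: `detect` stands for any existing detector, `map_to_main` for the support-to-main coordinate mapping obtained offline, and these names, the box layout, and the list-of-rows image type are all hypothetical.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2); x2 and y2 are exclusive
Image = List[List[int]]          # grayscale image stored as rows of pixel values


def detect_with_two_images(
    main: Image,
    support: Image,
    detect: Callable[[Image], List[Box]],
    map_to_main: Callable[[Box], Box],
) -> List[Box]:
    """Detect on the support image first, blank those regions in the main image,
    detect on the blanked main image, and merge both results in main coordinates."""
    # Step 1: detect objects in the long-focal-length (support) image.
    support_boxes = detect(support)
    # Step 2: map each detected region into main-image coordinates.
    mapped = [map_to_main(box) for box in support_boxes]
    # Step 3: remove image features in those regions (zero fill) on a copy.
    masked = [row[:] for row in main]
    height, width = len(masked), len(masked[0])
    for x1, y1, x2, y2 in mapped:
        for yy in range(max(0, y1), min(height, y2)):
            for xx in range(max(0, x1), min(width, x2)):
                masked[yy][xx] = 0
    # Step 4: detect the remaining objects in the processed main image.
    main_boxes = detect(masked)
    # Step 5: combine both detection results as the final output.
    return mapped + main_boxes
```

Working on a copy of the main image keeps the original frame available for display or logging; whether the actual system does so is not stated in the source.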

First, relation computation is described.

In the present invention, SIFT is used to detect the Nr best corresponding points between the stereo images. The present invention also proposes a method for estimating the relationship between the main image and the support image. As described above, the proposed method is based on L-SHADE. Two points are important when using L-SHADE: defining the structure of the problem and constructing the cost function.

1) Structure of the problem: in offline mode, three parameters, r, x, and y, must be estimated. r is the ratio of the two focal lengths, so its value can range from 1 to fmax. x and y give the pixel of the main image that corresponds to the origin (0, 0) of the support image, and this pixel can lie anywhere in the main image. Thus 0 < x < W and 0 < y < H, where W and H are the width and height of the main image, respectively.

2) Cost function: the present invention evaluates the error of a cost function constructed from the computed correspondences between the main image and the support image. SIFT is used to compute the best correspondences between the two images.

With the optimal values of r, x, and y, the distance between a support-image point (xis, yis) and the origin (0, 0), after scaling by r, should be similar to the distance between the corresponding main-image point (xim, yim) and the point (x, y) that corresponds to that origin. Given these definitions of the problem structure and the cost function, L-SHADE is applied to obtain the values of r, x, and y that minimize the defined error.
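The text describes the cost only in words, so the following is one plausible reading rather than the patent's exact energy function. It assumes that r is the long-to-short focal length ratio (so support-image distances are divided by r), that both images share the same pixel scale, and that the per-point errors are summed as absolute differences. `relation_cost` and `search_bounds` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def relation_cost(
    params: Sequence[float],
    support_pts: Sequence[Point],
    main_pts: Sequence[Point],
) -> float:
    """Sum over correspondences of |d((xis, yis), (0, 0)) / r - d((xim, yim), (x, y))|."""
    r, x, y = params
    total = 0.0
    for (xs, ys), (xm, ym) in zip(support_pts, main_pts):
        d_support = math.hypot(xs, ys) / r    # support distance rescaled to main-image scale
        d_main = math.hypot(xm - x, ym - y)   # distance from the mapped origin in the main image
        total += abs(d_support - d_main)
    return total


def search_bounds(width: int, height: int, f_max: float) -> List[Tuple[float, float]]:
    """Encoding of the problem structure: 1 <= r <= f_max, 0 <= x <= W, 0 <= y <= H."""
    return [(1.0, f_max), (0.0, float(width)), (0.0, float(height))]


# Synthetic correspondences generated with r = 2, (x, y) = (100, 50).
support_pts = [(200.0, 0.0), (0.0, 100.0), (60.0, 80.0)]
main_pts = [(200.0, 50.0), (100.0, 100.0), (130.0, 90.0)]
print(relation_cost((2.0, 100.0, 50.0), support_pts, main_pts))
```

A function of this shape, together with the bounds, is what would be handed to the black-box optimizer; the optimizer itself never needs to know what r, x, and y mean.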

Second, region mapping is described.

In the online process, when an object is detected in the support image, the detected object region must be mapped onto the main image so that its image features can be removed. As a result, those objects are not detected again in the main image.

Assume the object region in the support image is a rectangle with top-left corner (xs1, ys1) and bottom-right corner (xs2, ys2). Mapping the object region amounts to mapping these two corner points. The points in the main image corresponding to (xs1, ys1) and (xs2, ys2) are calculated as in Equations 1 and 2 below.

Here, k = 1, 2.

Once the corresponding object region in the main image has been computed, its image features are removed. In the present invention, the region is simply filled with zeros.
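Equations 1 and 2 do not appear in this text, so the mapping below is a hedged reconstruction from the parameter definitions only: if (x, y) is the main-image pixel corresponding to the support-image origin and r is the long-to-short focal length ratio, a support-image point (xsk, ysk) would land at (x + xsk / r, y + ysk / r). This assumes both images share the same pixel scale and ignores lens distortion and parallax; the function names are hypothetical.

```python
from typing import Tuple

BoxF = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def map_support_point(xs: float, ys: float, r: float, x: float, y: float) -> Tuple[float, float]:
    """Assumed mapping: shrink support coordinates by r, then shift by the mapped origin."""
    return (x + xs / r, y + ys / r)


def map_support_box(box: BoxF, r: float, x: float, y: float) -> BoxF:
    """Map a rectangle by mapping its top-left (k = 1) and bottom-right (k = 2) corners."""
    xs1, ys1, xs2, ys2 = box
    xm1, ym1 = map_support_point(xs1, ys1, r, x, y)
    xm2, ym2 = map_support_point(xs2, ys2, r, x, y)
    return (xm1, ym1, xm2, ym2)


# A 200 x 100 box at the support origin, with r = 2 and mapped origin (100, 50).
print(map_support_box((0.0, 0.0, 200.0, 100.0), 2.0, 100.0, 50.0))
```

The mapped rectangle is the region that would then be filled with zeros in the main image, after rounding to integer pixel indices.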

Third, the determination of the support image is described.

The first and second steps described above require knowing in advance which of the two input images is the support image (the one with the longer focal length). The present invention therefore proposes reusing the SIFT correspondences to estimate which image is the support image.

Specifically, let I1 and I2 be the two input images, and let s1 and s2 be the two sets of Nd corresponding points for I1 and I2, respectively. The method compares the sums of the Euclidean distances between the SIFT points in I1 and in I2. The image with the larger sum of Euclidean distances is the support image. Let D(s1) and D(s2) denote the sums of Euclidean distances for the two sets of SIFT points s1 and s2; D(s1) and D(s2) can be calculated as in Equation 3 below.

Here, d( , ) denotes the Euclidean distance between two points, and si denotes the i-th point in set s. If D(s1) - D(s2) > 0, I1 is the support image; otherwise, I2 is the support image.
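Equation 3 does not appear in this text, so the sketch below assumes that D(s) sums the Euclidean distance over every pair of points in the set; the original may sum over a different pairing. The intuition is that the same scene points lie farther apart in the image taken with the longer focal length. The function names are hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[float, float]


def distance_sum(points: Sequence[Point]) -> float:
    """Assumed form of D(s): sum of d(si, sj) over all unordered pairs in the set."""
    return sum(math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in combinations(points, 2))


def pick_support_image(s1: Sequence[Point], s2: Sequence[Point]) -> int:
    """Return 1 if I1 is the support image (D(s1) - D(s2) > 0), otherwise 2."""
    return 1 if distance_sum(s1) - distance_sum(s2) > 0 else 2


# The same three matched points, spread three times wider in the second image.
s_wide = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
s_zoom = [(0.0, 0.0), (9.0, 12.0), (18.0, 24.0)]
print(pick_support_image(s_zoom, s_wide))
```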

FIG. 5 shows the results on six sequences for the conventional Faster-RCNN and RRC, which use only one image, and for the method that uses two images (DF) as in an embodiment of the present invention. Because Faster-RCNN and RRC are not designed to detect vehicles across two images, as noted above, simply applying them to images of different focal lengths produces duplicate detections. Their performance is therefore relatively low. In contrast, the proposed DF + Faster-RCNN and DF + RRC account for the duplicate detection problem and achieve higher performance than the conventional methods.

An object detection system according to an embodiment of the present invention acquires a main image and a support image (S101). Here, the main image is acquired from a camera with a relatively short focal length and a wide capture range, and the support image is acquired from a camera with a relatively long focal length and a narrow capture range.

In a specific embodiment, the object detection system may be installed in an autonomous vehicle and used to detect objects ahead while the vehicle is driving. For example, the autonomous vehicle may be equipped with two cameras having different focal lengths. The first camera may be a short focal length camera that captures a wide range in front of the vehicle, and the second camera may be a long focal length camera that captures a specific narrow range in front of the vehicle.

Here, as one example, the specific range captured by the second camera may be the portion recognized as road in the image captured by the first camera. As another example, it may be a portion in which relatively many objects are expected, as learned through training on existing detection data.

The object detection system detects objects in the main image and in the support image (S103), using a known object detection method. The known object detection method may be an algorithm such as RCNN.

The object detection system determines the relationship between the main image and the support image and thereby determines the corresponding region (S105). The system can determine this relationship using a known algorithm, which may be SIFT.

The object detection system removes the image features in the corresponding region of the main image (S107). Because duplicate detection of an object can occur where the main image and the support image overlap, the system removes the image features in the corresponding region of the main image so that no object is detected in that region.

The object detection system performs object detection by combining the detection result from the main image with the detection result from the support image (S109). Specifically, it merges the detection result obtained from the support image with the detection result obtained from the main image excluding the corresponding region to produce the overall detection result.

Therefore, an autonomous vehicle equipped with cameras having different focal lengths can detect objects that are captured at a relatively small size using only commonly used cameras, without installing a high-resolution camera.

The present invention described above can be implemented as computer-readable code on a medium on which a program is recorded. The computer-readable medium includes all kinds of recording devices in which data readable by a computer system is stored. Examples of the computer-readable medium include a hard disk drive (HDD), a solid state disk (SSD), a silicon disk drive (SDD), a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device, and the medium may also be implemented in the form of a carrier wave (for example, transmission over the Internet).

Accordingly, the above detailed description should not be construed as limiting in any respect and should be considered illustrative. The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in the scope of the present invention.

Claims

Obtaining a first image from a first camera having a first focal length;

Obtaining a second image from a second camera having a second focal length relatively longer than the first focal length;

Detecting an object from the first image and the second image;

Determining a corresponding region between the first image and the second image;

Removing an image feature in the corresponding region of the first image; and

Detecting a forward object based on the object detection result in the second image and the object detection result in the first image from which the image feature has been removed

A method of detecting a forward object of an autonomous vehicle.

The method according to claim 1,

Wherein the step of detecting the forward object comprises:

Removing the image feature in the corresponding region of the first image, and detecting an object in the first image excluding the corresponding region

A method of detecting a forward object of an autonomous vehicle.

The method according to claim 1,

Wherein determining the corresponding region between the first image and the second image comprises:

Determining the corresponding region of the first image that corresponds to an area of the object detected in the second image, based on a parameter

A method of detecting a forward object of an autonomous vehicle.

The method of claim 3,

Wherein the parameter comprises:

A ratio of the focal length of the first camera to the focal length of the second camera, and a pixel of the first image corresponding to an origin of the second image

A method of detecting a forward object of an autonomous vehicle.

The method according to claim 1,

Wherein the viewing angle of the second camera having the second focal length is smaller than the viewing angle of the first camera having the first focal length

A method of detecting a forward object of an autonomous vehicle.

The method according to claim 1,

The range photographed by the second camera is a range in which objects are expected to be large through learning using existing detection data

A method of detecting a forward object of an autonomous vehicle.

A first camera having a first focal length;

A second camera having a second focal length that is relatively longer than the first focal length; and

A control unit for performing object detection based on a first image acquired from the first camera and a second image acquired from the second camera,

Wherein the control unit detects an object from the first image and the second image, determines a corresponding region between the first image and the second image, removes the image feature from the corresponding region of the first image, and detects a forward object based on the object detection result in the second image and the object detection result in the first image from which the image feature has been removed

A front object detection system for an autonomous vehicle.

The system of claim 7,

Wherein the control unit:

Removes the image feature in the corresponding region of the first image, and detects an object in the first image excluding the corresponding region

A front object detection system for an autonomous vehicle.

The system of claim 7,

Wherein,

A front object detection system for an autonomous vehicle.

The system of claim 9,

The parameter may comprise:

A front object detection system for an autonomous vehicle.

The system of claim 7,

A front object detection system for an autonomous vehicle.

The system of claim 7,

A front object detection system for an autonomous vehicle.