
WO2020054260A1 - Image recognition device - Google Patents

Image recognition device

Info

Publication number
WO2020054260A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional object
recognition
area
image
characteristic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2019/030823
Other languages
French (fr)
Japanese (ja)
Inventor
郭介 牛場
小林 正幸
都 堀田
裕史 大塚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Astemo Ltd
Original Assignee
Hitachi Automotive Systems Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Automotive Systems Ltd filed Critical Hitachi Automotive Systems Ltd
Priority to JP2020546756A priority Critical patent/JP6983334B2/en
Priority to CN201980054785.1A priority patent/CN112639877A/en
Publication of WO2020054260A1 publication Critical patent/WO2020054260A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis

Definitions

  • The present invention relates to an image recognition device.
  • Patent Literature 1 proposes a recognition device that, in a situation where a moving three-dimensional object and another three-dimensional object overlap, detects the moving three-dimensional object, such as a pedestrian, existing inside a predetermined area including the three-dimensional object by tracking feature points inside that area.
  • The conventional apparatus, however, has a problem in that recognition performance is reduced when the entire three-dimensional object cannot be detected properly.
  • An image recognition device according to a first aspect of the present invention includes: a three-dimensional object region setting unit that sets a three-dimensional object region by enlarging or reducing a three-dimensional object detection area set on an image captured by an imaging unit, based on detection characteristic information of the three-dimensional object; and a recognition processing unit that performs recognition processing to specify the type of the three-dimensional object in the three-dimensional object region set by the three-dimensional object region setting unit.
  • An image recognition device according to a second aspect of the present invention includes: a three-dimensional object region setting unit that sets a three-dimensional object region by enlarging or reducing a three-dimensional object detection area set on an image captured by an imaging unit, based on first characteristic information of the three-dimensional object; a recognition magnification setting unit that, using the three-dimensional object region obtained by the three-dimensional object region setting unit as a reference size, defines recognition areas of a plurality of sizes based on second characteristic information of the three-dimensional object; a scanning area setting unit that sets, for the plurality of recognition areas defined by the recognition magnification setting unit, a plurality of scanning areas wider than the recognition areas based on third characteristic information of the three-dimensional object; and a recognition processing unit that performs recognition processing using the scanning areas set by the scanning area setting unit.
  • According to the present invention, it is possible to provide an image recognition device that accurately detects a three-dimensional object and improves recognition performance.
  • FIG. 1 is a block diagram showing the overall configuration of an image recognition device. FIG. 2 is a flowchart showing the operation of the image recognition device. FIG. 3 is a diagram showing a three-dimensional object detection area set on an image by detection processing. FIG. 4 is a flowchart showing the details of recognition processing. FIG. 5 is a diagram explaining the principle of three-dimensional object area setting processing. FIG. 6 is a diagram explaining the principle of recognition magnification setting processing. FIG. 7 is a diagram explaining normalization in the recognition magnification setting processing. FIG. 8 is a diagram explaining the principle of scanning area setting processing. FIG. 9 is a diagram explaining the principle of magnification-based scan recognition processing. FIG. 10 is a diagram explaining the principle of optimum magnification setting processing. FIG. 11 is a diagram explaining the principle of detailed recognition position determination processing. FIG. 12 is a diagram explaining the principle of detailed recognition processing. FIG. 13 is a block diagram showing the overall configuration of an image recognition device according to a modification.
  • FIG. 1 is a block diagram showing the overall configuration of an image recognition device 100 according to the present embodiment.
  • The image recognition device 100 includes a left camera 101 and a right camera 102 that are mounted on a vehicle and arranged on the left and right at the front of the vehicle.
  • The cameras 101 and 102 constitute a stereo camera and capture images of three-dimensional objects such as pedestrians, vehicles, traffic signals, signs, white lines, and the tail lamps and headlights of cars.
  • The image recognition device 100 recognizes the environment outside the vehicle based on the image information of the area ahead of the vehicle captured by the cameras 101 and 102. The vehicle (own vehicle) then controls braking, steering, and the like based on the recognition result of the image recognition device 100.
  • The image recognition device 100 takes in the images captured by the cameras 101 and 102 through the image input interface 103.
  • The image information taken in through the image input interface 103 is sent to the image processing unit 104 via the internal bus 109.
  • It is then processed by the arithmetic processing unit 105, and intermediate results and final-result image information are stored in the storage unit 106.
  • The image processing unit 104 compares the first image obtained from the image sensor of the left camera 101 with the second image obtained from the image sensor of the right camera 102, performs on each image the correction of device-specific deviations originating in the image sensor and image corrections such as noise interpolation, and stores the results in the storage unit 106. It further computes mutually corresponding points between the first image and the second image to obtain disparity information, which is stored in the storage unit 106 as distance information corresponding to each pixel on the image.
  • The image processing unit 104 is connected to the arithmetic processing unit 105, the CAN interface 107, and the control processing unit 108 via the internal bus 109.
  • The arithmetic processing unit 105 uses the image information and the distance information (disparity information) stored in the storage unit 106 to recognize three-dimensional objects in order to grasp the environment around the vehicle. Part of the recognition results for three-dimensional objects and part of the intermediate processing results are stored in the storage unit 106.
  • After recognizing three-dimensional objects in the captured image, the arithmetic processing unit 105 performs vehicle control calculations using the recognition results.
  • The vehicle control policy obtained as a result of these calculations, and part of the recognition results, are transmitted to the in-vehicle network CAN 110 via the CAN interface 107, whereby the vehicle is controlled.
  • The control processing unit 108 monitors whether each processing unit is operating abnormally, whether errors have occurred during data transfer, and the like, and prevents abnormal operation.
  • The image processing unit 104, the arithmetic processing unit 105, and the control processing unit 108 may be configured as a single computer unit or as a plurality of computer units.
  • FIG. 2 is a flowchart showing the operation of the image recognition device 100.
  • Images are captured by the left camera 101 and the right camera 102 provided in the image recognition device 100, and image processing 205, such as correction for absorbing the peculiar characteristics of each image sensor, is performed on each piece of captured image information 203 and 204.
  • The processing result of the image processing 205 is stored in the image buffer 206.
  • The image buffer 206 is provided in the storage unit 106 in FIG. 1.
  • Next, parallax processing 207 is performed. Specifically, the two corrected images are matched against each other, thereby obtaining disparity information between the images captured by the left camera 101 and the right camera 102. From the disparity between the left and right images, the distance from a point of interest on the image of a three-dimensional object to that object is obtained according to the principle of triangulation.
  • The image processing 205 and the parallax processing 207 are performed by the image processing unit 104 in FIG. 1, and the finally obtained image information and parallax information are stored in the storage unit 106.
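As a concrete illustration of the triangulation step, the sketch below converts a disparity map into per-pixel distances. It is a minimal sketch assuming a calibrated pinhole stereo pair; the focal length and baseline values in the example are hypothetical, not taken from the patent.

    import numpy as np

    def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
        """Convert a disparity map (pixels) into a depth map (meters) by triangulation."""
        disparity = np.asarray(disparity_px, dtype=np.float64)
        depth = np.full(disparity.shape, np.inf)
        valid = disparity > 0  # zero disparity means no match / infinitely far
        depth[valid] = focal_length_px * baseline_m / disparity[valid]
        return depth

    # Example with assumed calibration: focal length 1400 px, baseline 0.35 m.
    # A 10 px disparity then corresponds to 1400 * 0.35 / 10 = 49 m.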
  • FIG. 3 is a diagram illustrating a detection area of a three-dimensional object set on an image by the detection processing 208.
  • FIG. 3 shows a pedestrian detection area 301 and a vehicle detection area 302 detected by the cameras 101 and 102 on the image as a result of the detection processing 208.
  • These detection areas 301 and 302 may be rectangular as shown in FIG. 3, or may be irregular areas obtained from parallax and distance. They are generally treated as rectangles to facilitate handling on a computer in subsequent processing.
  • Hereinafter, the regions are treated as rectangles, and a pedestrian is mainly used as an example of a three-dimensional object.
  • Next, in the recognition process 209, recognition processing for specifying the type of the three-dimensional object is performed on the detection area set on the image by the detection process 208.
  • The three-dimensional objects to be recognized by the recognition process 209 are, for example, pedestrians, vehicles, traffic signals, signs, white lines, and the tail lamps and headlights of cars, and the type is identified as one of these.
  • For stable recognition, the detection region on the image and the target region to be recognized need to match; however, the brightness of the outside environment and variations in imaging performance can prevent a complete match.
  • The details of the recognition process 209, which solves this problem, will be described later.
  • Next, in the vehicle control process 210, taking into consideration the recognition result for the three-dimensional object and the state of the own vehicle (speed, steering angle, and so on), control is performed such as issuing a warning to the occupants and adjusting the braking and steering angle of the own vehicle. Alternatively, avoidance control for the recognized three-dimensional object is determined, and the result is output via the CAN interface 107 as automatic control information.
  • The recognition process 209 and the vehicle control process 210 are performed by the arithmetic processing unit 105 in FIG. 1.
  • The programs shown in the flowchart of FIG. 2 and in the flowchart of FIG. 4 described below can be executed by a computer including a CPU, memory, and the like. All or part of the processing may be realized by hardwired logic circuits. Furthermore, the programs can be provided stored in advance in a storage medium of the image recognition device 100, stored and provided in an independent storage medium, or recorded and stored into the storage medium of the image recognition device 100 via a network line. They may also be supplied as computer-readable computer program products in various forms, such as data signals (carrier waves).
  • FIG. 4 is a flowchart showing the details of the recognition process 209. As shown in FIG. 4, the recognition process consists of a three-dimensional object area setting process 401, a recognition magnification setting process 402, a scanning area setting process 403, a magnification-based scan recognition process 404, an optimum magnification setting process 406, a detailed recognition position determination process 407, and a detailed recognition process 408.
  • Each process is described in order below. Note that these processes are described on the assumption that a stereo camera is used.
  • First, in the three-dimensional object area setting process 401, the detection area 301 obtained by the detection process 208 is enlarged or reduced based on detection characteristic information of the three-dimensional object to set a three-dimensional object area 501.
  • FIG. 5 is a diagram for explaining the principle of the three-dimensional object area setting processing 401.
  • FIG. 5 shows an example in which the three-dimensional object area 501 is set by enlarging the pedestrian detection area 301 based on the three-dimensional object detection characteristic information.
  • The detection characteristic information includes, for example, (1) the identifiability of the three-dimensional object, (2) the distance to the three-dimensional object, (3) the size of the three-dimensional object, (4) the assumed size of the three-dimensional object, (5) the brightness of the outside environment, (6) the headlight orientation, (7) the height of the road surface where the three-dimensional object exists, and (8) the sensor resolution.
  • Hereinafter, these pieces of detection characteristic information are described.
  • Regarding (1) the identifiability of the three-dimensional object: with the cameras 101 and 102, for example, a three-dimensional object may be difficult to extract because of its combination with the background area.
  • Pedestrian clothing of the same color as the road surface, or the upper body of a pedestrian at night, corresponds to this case.
  • Another case is when the cameras 101 and 102 are partially blurred by raindrops or the like and part of the three-dimensional object region is missing. In such cases, the detection area is enlarged.
  • Conversely, a detected three-dimensional object may combine a person with a region other than the person in three-dimensional space.
  • In that case, the detection area is reduced based on the color, luminance, and edges of the image.
  • When the mounting height differs between sensors of different types, such as a radar sensor and the cameras 101 and 102, occlusion occurs mainly in the upward direction and the three-dimensional object appears small. In a configuration with such a processing characteristic, the detection area is enlarged.
  • Regarding (2) the distance to the three-dimensional object: the three-dimensional object region is enlarged if the object is far away and reduced if it is near.
  • This magnification may be determined from the resolution of the sensor, including the cameras 101 and 102, because the farther away the object is, the larger the size in three-dimensional space occupied by one pixel becomes, and the larger the error grows.
  • Regarding (3) the size of the three-dimensional object: the three-dimensional object region is enlarged if the object is small and reduced if it is large.
  • Regarding (4) the assumed size of the three-dimensional object: assuming, for example, that the object is a pedestrian, the three-dimensional object region is enlarged for an object that is too small for a pedestrian and reduced for one that is too large.
  • The degree of enlargement or reduction may also be determined in consideration of (5) the brightness of the outside environment and (6) the direction of the headlights. For example, the three-dimensional object area is reduced in a bright daytime environment and enlarged in a dark nighttime environment.
  • Regarding the direction of the headlights: if the headlights are on low beam, the light hits only the feet, so the three-dimensional object area is enlarged in the height direction; if the headlights are on high beam, the entire body is illuminated, so the three-dimensional object area is reduced.
  • The three-dimensional object area may also be enlarged or reduced depending on the distance to the three-dimensional object or on (7) the height of the road surface at the position where the three-dimensional object exists. For example, when the road surface is low and the headlights are on high beam, the light does not reach the feet, so the three-dimensional object area is enlarged in the downward direction.
  • Regarding (8) the sensor resolution: the region is enlarged or reduced by combining the size of the three-dimensional object with its distance. For example, when the object is nearby, the three-dimensional resolution per pixel is high, so the three-dimensional object region is enlarged; when the object is distant, the resolution per pixel is low, so the region is reduced.
  • The three-dimensional object region may also be reduced depending on the characteristics of the area acquired as the three-dimensional object region, for example, when the parallax region or the sensor response region from which the three-dimensional object region is obtained tends to be set larger than the object; in such a case, the region is reduced.
  • In this way, the three-dimensional object area 501 is set based on the detection characteristic information of the three-dimensional object. For example, for a pedestrian, the limb regions, which change greatly on the image, are likely to be missing, making the detected area smaller. At night, if the person has black hair, the head is difficult to detect because it blends into the background. In such cases, the three-dimensional object area 501 on the image is adjusted based on the size of one pixel in three-dimensional space: for example, in bright daytime the margin added at the head is 0 cm and that at the feet is 10 cm, whereas at night the head margin is set to 10 cm and the foot margin to 0 cm.
  • The width is likewise enlarged or reduced as appropriate, and the correction may also be based on the amount of change of the width over time.
  • The recognition area may also be reduced depending on the content of the recognition processing in the subsequent stage; for example, in the case of a pedestrian, recognition may be performed using only the upper body.
  • The amount of enlargement or reduction may be determined as a predetermined ratio or as a size on the image; however, by setting it on the basis of size in three-dimensional space, sizes that cannot correspond to a recognition target can be excluded. Also, depending on the relationship between distance in three-dimensional space and pixels, the enlarged or reduced three-dimensional object area 501 may turn out to be the same as the detection area 301.
  • The three-dimensional object area setting process 401 sets the three-dimensional object area 501 with higher accuracy by combining a plurality of pieces of the detection characteristic information.
  • For example, by combining the distance, the brightness of the outside environment, the light direction, the road surface height, the sensor resolution, and the like, a three-dimensional object area 501 that is less affected by the difference between day and night is set.
  • Note that the numbers of pixels and sizes given here are merely examples, and the present invention is not limited to these ranges.
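A rough sketch of how a detection box might be enlarged by metric margins of the kind described above follows. It converts head and foot margins given in meters into pixels under a pinhole model (pixels per meter at distance Z is approximately focal length / Z); the Box type, function name, and all numeric values are illustrative assumptions, not from the patent.

    from dataclasses import dataclass

    @dataclass
    class Box:
        x: int  # left edge in pixels
        y: int  # top edge in pixels (y grows downward)
        w: int  # width in pixels
        h: int  # height in pixels

    def set_object_region(det: Box, distance_m: float, focal_px: float,
                          head_margin_m: float, foot_margin_m: float) -> Box:
        """Enlarge (or, with negative margins, reduce) a detection box by metric margins."""
        px_per_m = focal_px / distance_m               # pinhole model: pixels per meter at this distance
        top = int(round(head_margin_m * px_per_m))     # extension toward the head
        bottom = int(round(foot_margin_m * px_per_m))  # extension toward the feet
        return Box(det.x, det.y - top, det.w, det.h + top + bottom)

    # Nighttime example from the text: 10 cm margin at the head, 0 cm at the feet.
    region_501 = set_object_region(Box(600, 300, 40, 120), distance_m=20.0,
                                   focal_px=1400.0, head_margin_m=0.10, foot_margin_m=0.0)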
  • Next, the recognition magnification setting process 402 shown in FIG. 4 is described. The three-dimensional object area 501 set in the three-dimensional object area setting process 401 serves as the reference size for the recognition process 209, but this reference size is not necessarily the optimum recognition area. Therefore, in the recognition magnification setting process 402, recognition areas of a plurality of sizes are determined using recognition characteristic information of the three-dimensional object. Since the optimal recognition area is unknown at this point, the recognition area is enlarged or reduced around the reference size to determine recognition areas of a plurality of sizes.
  • The recognition characteristic information includes, for example, (1) the distance to the three-dimensional object, (2) the size of the three-dimensional object, (3) the limit size of the three-dimensional object, and (4) the sensor resolution. Hereinafter, these pieces of recognition characteristic information are described.
  • Regarding (1) the distance to the three-dimensional object: this is an index for determining the amount of enlargement or reduction of the recognition area. For example, when the three-dimensional object is far away, the size of the object per pixel increases. Since the image is input to a later-described discriminator 901 that performs the recognition processing, the number of pixels by which to enlarge or reduce is smaller when the object is far away than when it is nearby. Therefore, based on the reference size, the recognition area is enlarged or reduced in accordance with the distance to the object, and recognition areas of a plurality of sizes are determined.
  • Regarding (2) the size of the three-dimensional object: this is likewise an index for determining the amount of enlargement or reduction. For example, when the object is large, the number of pixels used for enlargement or reduction on the image is smaller than when the object is small. When recognition areas of a plurality of sizes are determined based on size in real space, the number of pixels used for enlargement or reduction is smaller at a distance than nearby. If the recognition area is not set in sub-pixel units, the size may end up the same as the reference size.
  • Regarding (3) the limit size of the three-dimensional object: this is the limit size assumed for the recognition target. For example, when the three-dimensional object is a pedestrian, if the height of the object is greater than 2.5 meters, areas are set in the reduction direction; conversely, if the height is less than 0.8 meters, areas are set in the enlargement direction. If the height is in between, areas are set in both directions.
  • The upper and lower limits of the areas to be set may be determined based on the three-dimensional object to be recognized, restrictions of the recognition processing, and the like.
  • Regarding (4) the sensor resolution: the range of enlargement or reduction can be determined based on the sensor resolution. For example, at a distance where the size in three-dimensional space per pixel exceeds 20 cm, the range of enlargement or reduction is set to a small range of one or two pixels. Conversely, at a short distance where the size of one pixel in three-dimensional space is less than 1 cm, the area is enlarged or reduced over a large range such as 10 or 20 pixels.
  • The size on the image may also be calculated from the size of the three-dimensional object in three-dimensional space.
  • Variations of the recognition area may also be set using the detection characteristic information in the recognition magnification setting process 402; in that case, the conditions under which the recognition area is enlarged or reduced are the same as those described for the three-dimensional object area setting process 401.
  • Note that the numbers of pixels and sizes given here are merely examples, and the present invention is not limited to these ranges.
  • FIG. 6 illustrates the principle of the recognition magnification setting process 402.
  • The three-dimensional object area 501 is set as the recognition area of the reference size, and recognition areas 601 and 602 are determined by reducing or enlarging it.
  • The recognition area 601 is a recognition area with a small recognition magnification, obtained by reducing the reference size, and the recognition area 602 is a recognition area with a large recognition magnification, obtained by enlarging the reference size.
  • Here, two recognition areas enlarged or reduced with respect to the reference size are shown; however, more variations may be used as long as there is a margin in the recognition processing time.
  • Whether to enlarge or reduce may be determined from the settings of the detection process 208 or the three-dimensional object area setting process 401.
  • The amount of enlargement or reduction of the recognition area is set based on the recognition characteristic information.
  • Depending on the resolution of the image, an enlarged or reduced recognition area may coincide with the recognition area of the reference size.
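A minimal sketch of generating several recognition areas around the reference size follows, reusing the Box type from the earlier sketch. The scale factors are placeholders; in the patent, the actual amounts are derived from distance, object size, limit size, and sensor resolution.

    def recognition_areas(ref: Box, scales=(0.9, 1.0, 1.1)) -> list:
        """Return recognition areas of several sizes, scaled about the center of the reference box."""
        areas = []
        for s in scales:
            w, h = int(round(ref.w * s)), int(round(ref.h * s))
            cx, cy = ref.x + ref.w // 2, ref.y + ref.h // 2
            areas.append(Box(cx - w // 2, cy - h // 2, w, h))
        return areas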
  • FIG. 7 is a diagram illustrating the normalization in the recognition magnification setting process 402.
  • In this processing, the area to be normalized is determined for when the recognition areas (501, 601, and 602) are passed to the subsequent recognition processing.
  • The recognition area indicates the range in which recognition processing is performed in the recognition processing described later.
  • Whether the recognition area 501 of the reference size captures the target object neatly, and how the object should be captured, depend on the characteristics of the recognition processing implemented in the apparatus; therefore, the region to be normalized is set in advance.
  • In FIG. 7, the recognition area 501 nearly contains the head and the feet, whereas in the recognition area 601 with the small recognition magnification the top of the head and the limbs protrude, and in the recognition area 602 with the large recognition magnification margins are formed above the head and below the feet.
  • Note that this normalization processing does not necessarily have to be performed in the recognition magnification setting process 402; it may instead be implemented as part of the later-described magnification-based scan recognition process 404 or detailed recognition process 408.
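One plausible form of this normalization is cropping the recognition area and resampling it to the classifier's fixed input size, as sketched below. The 64 x 32 input size is an assumed value, the image is assumed to be a grayscale array, and nearest-neighbor resampling is used only to keep the example dependency-free.

    import numpy as np

    def normalize_area(image: np.ndarray, box: Box, out_hw=(64, 32)) -> np.ndarray:
        """Crop a recognition area (clipped to the image) and resample it to a fixed size."""
        h, w = image.shape[:2]
        x0, y0 = max(box.x, 0), max(box.y, 0)
        x1, y1 = min(box.x + box.w, w), min(box.y + box.h, h)
        crop = image[y0:y1, x0:x1]
        rows = np.linspace(0, crop.shape[0] - 1, out_hw[0]).astype(int)
        cols = np.linspace(0, crop.shape[1] - 1, out_hw[1]).astype(int)
        return crop[np.ix_(rows, cols)]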
  • The scanning area setting process 403 sets, for each recognition area, a scanning area larger than the recognition area based on arrangement characteristic information of the three-dimensional object.
  • The scanning area is set as an area on the image, and in the recognition processing the set scanning area is scanned with the recognition area. That is, the recognition area indicates the range over which recognition processing is performed, and the scanning area is the range within which the recognition area is moved. The recognition processing is thus performed while moving the recognition area within the scanning area.
  • The arrangement characteristic information for determining the size of the scanning area includes, for example, (1) the near/far position of the three-dimensional object and (2) the height of the road surface where the three-dimensional object exists. Hereinafter, these pieces of arrangement characteristic information are described.
  • Regarding (1) the near/far position of the three-dimensional object: this is an index for setting the scanning area. For example, when the object is nearby, the scanning area on the image is set large; when the object is far away, the scanning area is set small. This is because, when the object is near, the sensor resolution is high and the amount of movement in three-dimensional space when scanning by one pixel is only a few millimeters, whereas at a distance it exceeds 10 cm.
  • The scanning area is also determined by characteristics such as the amount of detection deviation produced by the three-dimensional object detection.
  • For example, the variance or dispersion of the difference between the detected horizontal center of the three-dimensional object and the horizontal center of the actual recognition target can be used, setting the scanning area so that the horizontal center of the recognition target falls within it.
  • Regarding (2) the height of the road surface where the three-dimensional object exists: this is likewise an index for setting the scanning area. For example, when the road surface is rising and a three-dimensional object (such as a pedestrian) is at a position higher than the own vehicle, occlusion on the head side increases and the height on the head side appears smaller than it actually is. If the object is at a low position, the feet may be cut off by the angle of view or hidden by a bumper. The scanning area is enlarged or reduced in accordance with such conditions.
  • The scanning area setting process 403 may also determine the scanning area using the detection characteristic information or the recognition characteristic information; in that case, the conditions under which the scanning area is enlarged or reduced are the same as in the three-dimensional object area setting process 401 and the recognition magnification setting process 402. The numbers of pixels and sizes given here are, again, merely examples, and the present invention is not limited to these ranges.
  • FIG. 8 is a diagram for explaining the principle of the scanning area setting process 403.
  • The scanning area setting process 403 determines the scanning areas 801, 802, and 803 for the recognition areas 501, 601, and 602, respectively.
  • The scanning areas 801, 802, and 803 are the same size as or larger than the recognition areas 501, 601, and 602. However, since the recognition areas 501, 601, and 602 are scanned within the scanning areas 801, 802, and 803, the amount of scanning is not necessarily large.
  • The scanning areas 801, 802, and 803 are determined as regions on the image from the arrangement characteristic information. Depending on the resolution of the image, a recognition area and its scanning area may coincide on the image.
  • Here, a scanning area is determined individually for each recognition area. However, if there is enough processing time, the single largest scanning area may be used for all recognition areas; if processing time is short, one small scanning area may be applied to each recognition area.
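The sketch below sets a scanning area around a recognition area, shrinking the margin with distance because one pixel covers more of the scene far away. The margin heuristic and the image size are assumptions for illustration only; the Box type is reused from the earlier sketch.

    def scanning_area(recognition: Box, distance_m: float, image_wh=(1280, 960)) -> Box:
        """Return a scanning area at least as large as its recognition area, clipped to the image."""
        margin = max(1, int(round(recognition.w * 0.3 * min(1.0, 10.0 / distance_m))))
        x, y = max(recognition.x - margin, 0), max(recognition.y - margin, 0)
        w = min(recognition.w + 2 * margin, image_wh[0] - x)
        h = min(recognition.h + 2 * margin, image_wh[1] - y)
        return Box(x, y, w, h)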
  • Next, the magnification-based scan recognition process 404 shown in FIG. 4 is described.
  • In this process, the image and the parallax (distance) regions corresponding to the scanning areas 801, 802, and 803 are scanned with the recognition areas 501, 601, and 602, recognition processing is performed at each scanning position for each size, and it is determined whether the scanned position contains the target three-dimensional object.
  • The vehicle control process 210 may also be performed directly using the result of the magnification-based scan recognition process 404, as shown by the broken line 405 in FIG. 4.
  • The magnification-based scan recognition process 404 may yield a plurality of results depending on the magnification, the scanning position, and the like. In this case, the results are narrowed down by processing such as selecting the one with the best recognition result.
  • FIG. 9 is a view explaining the principle of the magnification-based scan recognition process 404. While scanning each of the scanning areas 801, 802, and 803 with the recognition areas 501, 601, and 602, response positions 902 are obtained as the results of recognition by the discriminator 901 that performs the recognition processing.
  • The response positions 902 are indicated by × in FIG. 9. The greater the number of response positions 902, the better the recognition.
  • In this example, as shown by the response positions 902 in the scanning areas 801', 802', and 803', the number of responses is largest in the scanning area 801'.
  • The discriminator 901 may use machine learning or heuristic threshold determination. If the result of this determination is sufficient, recognition may be terminated with this result, as shown by the broken line 405 in FIG. 4; in that case, for example, the result with the best recognition response is adopted.
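A bare-bones sliding-window version of this scan is sketched below. The classify callback stands in for the discriminator 901 and is assumed to map a grayscale patch to a confidence in [0, 1]; the step size and threshold are illustrative values, not from the patent.

    def scan_recognition(image, recognition: Box, scan: Box, classify, step_px=4, thresh=0.5):
        """Slide the recognition area over the scanning area and collect response positions."""
        responses = []
        for y in range(scan.y, scan.y + scan.h - recognition.h + 1, step_px):
            for x in range(scan.x, scan.x + scan.w - recognition.w + 1, step_px):
                patch = image[y:y + recognition.h, x:x + recognition.w]
                score = classify(patch)
                if score > thresh:
                    responses.append((x, y, score))  # one response position ("x" mark in FIG. 9)
        return responses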
  • The optimum magnification setting process 406 shown in FIG. 4 selects the recognition area optimal for the detailed recognition process from among the recognition areas of a plurality of sizes created in the recognition magnification setting process 402.
  • The selection uses, for example, the number of positions judged to be recognition targets in the scan results and their reliability, the number of positions judged not to be recognition targets and their reliability, the distribution of the recognition results, and the like.
  • These quantities and reliabilities are compared across the recognition areas of the plurality of sizes, and the optimum recognition area is adopted.
  • Note that the optimum magnification setting process 406 may be omitted if there is not enough processing time.
  • FIG. 10 is a view explaining the principle of the optimum magnification setting process 406. From the recognition results at the plurality of magnifications, the optimum magnification with the best response is selected. As described above, the optimum magnification is selected using the number of responses in the scanning areas of the recognition processing and their reliability. In the example of FIG. 10, the scanning area 801' with the largest number of responses is selected. This scanning area 801' corresponds to the scanning area 801 of the reference size, which in turn corresponds to the recognition area 501 of the reference size.
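One plausible reading of this selection rule, ranking magnifications by response count and mean reliability, is sketched below; the ranking key is an assumption, since the patent lists several usable criteria.

    def optimum_scale(responses_by_scale):
        """Pick the magnification whose scan produced the most, and most reliable, responses.

        responses_by_scale maps a scale factor to the (x, y, score) tuples
        collected by scan_recognition for that scale."""
        def key(scale):
            rs = responses_by_scale[scale]
            mean_conf = sum(r[2] for r in rs) / len(rs) if rs else 0.0
            return (len(rs), mean_conf)
        return max(responses_by_scale, key=key)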
  • The detailed recognition position determination process 407 shown in FIG. 4 determines a representative position at which detailed recognition is performed, for the optimal magnification obtained in the optimum magnification setting process 406. For the detailed recognition, for example, the position with the highest reliability in the recognition results obtained by the magnification-based scan recognition process 404 is selected. Alternatively, the position may be determined using a clustering method such as the mean shift method. When the optimum magnification setting process 406 is not performed, the detailed recognition position determination process 407 may be performed for each magnification.
  • FIG. 11 is a diagram explaining the principle of the detailed recognition position determination process 407. From the one or more response positions obtained by the magnification-based scan recognition process 404, a representative position 111 for performing the detailed recognition process 408 is determined. When there are a plurality of response points, a clustering technique such as the mean shift method is used. The area centered on the determined representative position 111 becomes the detailed recognition area.
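A compact mean-shift iteration for condensing the response positions into a single representative position is sketched below; the Gaussian bandwidth and convergence tolerance are assumed values.

    import numpy as np

    def representative_position(points: np.ndarray, bandwidth: float = 8.0, iters: int = 20):
        """Mean-shift the (x, y[, score]) response points to one representative position.

        Starts from the highest-scoring point when scores are present,
        otherwise from the centroid."""
        pts = points[:, :2].astype(float)
        mode = pts[points[:, 2].argmax()] if points.shape[1] > 2 else pts.mean(axis=0)
        for _ in range(iters):
            w = np.exp(-np.sum((pts - mode) ** 2, axis=1) / (2 * bandwidth ** 2))
            new_mode = (pts * w[:, None]).sum(axis=0) / w.sum()
            if np.linalg.norm(new_mode - mode) < 0.5:
                break
            mode = new_mode
        return mode

    # Example: representative_position(np.array([[100, 80, 0.9], [104, 82, 0.7], [98, 79, 0.8]]))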
  • The detailed recognition process 408 shown in FIG. 4 performs detailed recognition at the representative position 111 determined in the detailed recognition position determination process 407, and calculates the type of the target and its reliability. Alternatively, detailed recognition is performed using the recognition area of the optimal size selected based on the response positions of the magnification-based scan recognition process 404, and the type of the target and its reliability are calculated. For the detailed recognition process 408, a discriminator 120 with classification performance equal to or higher than that of the recognition processing used in the magnification-based scan recognition process 404 is used.
  • FIG. 12 is a diagram for explaining the principle of the detailed recognition processing 408.
  • In FIG. 12, detailed recognition is performed on the representative position 111 obtained by the detailed recognition position determination process 407, using the discriminator 120, to determine the type of the three-dimensional object.
  • The type of the three-dimensional object is, for example, a pedestrian, a vehicle, a traffic signal, a sign, a white line, or the tail lamp or headlight of a car.
  • The recognition processing used in the magnification-based scan recognition process 404 and the detailed recognition process 408 includes, for example, the following techniques: template matching, in which a template having the likeness of the recognition target prepared in advance is compared with the recognition area; discrimination based on machine learning; and recognition of edge shapes and the like by artificially determined threshold judgments.
  • The magnification-based scan recognition process 404 and the detailed recognition process 408 also include the image processing necessary to carry out these techniques, such as resizing, smoothing, edge extraction, normalization, isolated-point removal, gradient extraction, color conversion, and histogram creation.
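As a concrete instance of the template-matching technique mentioned above, the sketch below scores a patch against a prepared template with normalized cross-correlation; a function like this could serve as the classify callback in the earlier scanning sketch. The patch and template are assumed to be grayscale arrays of the same shape.

    import numpy as np

    def template_match_score(patch: np.ndarray, template: np.ndarray) -> float:
        """Normalized cross-correlation in [-1, 1] between a patch and a template."""
        p = patch.astype(float) - patch.mean()
        t = template.astype(float) - template.mean()
        denom = np.sqrt((p * p).sum() * (t * t).sum())
        return float((p * t).sum() / denom) if denom > 0 else 0.0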
  • FIG. 13 is a diagram illustrating the processing operation of an image recognition device 100′ according to a modification.
  • The same parts as in the image recognition device 100 shown in FIG. 2 are denoted by the same reference numerals, and their description is omitted.
  • The image recognition device 100′ includes an optical camera 1301 and a radar sensor 1302, with which a three-dimensional object is detected. An image is captured by the optical camera 1301, and image processing 205, such as correction for absorbing the peculiar characteristics of the image sensor, is performed on the captured image information. The processing result of the image processing 205 is stored in the image buffer 206. The distance to the three-dimensional object is obtained by the radar sensor 1302, and the detection process 1303 detects a three-dimensional object in three-dimensional space based on this distance. The recognition process 209 then performs recognition processing to specify the type of the three-dimensional object in the detection area set by the detection process 1303.
  • The detection process 1303, which takes as input the distance to the three-dimensional object output by the radar sensor 1302, must perform detection in consideration of the sensor characteristics of the radar sensor 1302 used for distance measurement; however, once the detection area is determined, the subsequent processing can be performed in the same manner as in the stereo camera configuration described for the image recognition device 100. Furthermore, the image recognition device 100′ does not require a plurality of images in the image processing 205.
  • As described above, the image recognition devices 100 and 100′ perform the three-dimensional object area setting process 401, which enlarges or reduces the detection area 301 set on the images captured by the cameras 101 and 102 based on detection characteristic information of the three-dimensional object to set the three-dimensional object area 501, and perform the recognition process 209, which specifies the type of the three-dimensional object in the three-dimensional object area 501 set by the three-dimensional object area setting process 401. The detection characteristic information is at least one of, for example, the identifiability of the three-dimensional object, the distance to the three-dimensional object, the size of the three-dimensional object, the assumed size of the three-dimensional object, the brightness of the outside environment, the direction of the headlights, the height of the road surface on which the three-dimensional object exists, and the sensor resolution of the imaging unit. This makes it possible to provide an image recognition device that accurately detects a three-dimensional object and improves recognition performance.
  • The image recognition devices 100 and 100′ also perform: the three-dimensional object area setting process 401, which sets the three-dimensional object area 501 by enlarging or reducing the detection area 301 set on the images captured by the cameras 101 and 102, based on first characteristic information of the three-dimensional object; the recognition magnification setting process 402, which, using the three-dimensional object area 501 as the reference size, determines recognition areas 601 and 602 of a plurality of sizes based on second characteristic information of the three-dimensional object; the scanning area setting process 403, which sets, for the plurality of recognition areas 601 and 602 determined in the recognition magnification setting process 402, a plurality of scanning areas 802 and 803 wider than the recognition areas based on third characteristic information of the three-dimensional object; and the recognition process 209, which performs recognition using the scanning areas 802 and 803. The first to third characteristic information is at least one of, for example, the identifiability of the three-dimensional object, the distance to the three-dimensional object, the size of the three-dimensional object, the assumed size of the three-dimensional object, the brightness of the outside environment, the direction of the headlights, the height of the road surface on which the three-dimensional object exists, the sensor resolution of the imaging unit, the limit size of the three-dimensional object, and the near/far position of the three-dimensional object.
  • The present invention is not limited to the above embodiment; other forms conceivable within the scope of the technical idea of the present invention are also included in the scope of the invention, as long as the features of the present invention are not impaired. A configuration combining the above embodiment and the modification may also be adopted.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)
  • Image Processing (AREA)

Abstract

There is a problem in that recognition performance degrades when the entirety of a solid object cannot be adequately sensed. In the present invention, a solid object region setting process 401 is carried out to set a solid object region 501 through enlargement or reduction of the sensing region 301 of a solid object obtained through a sensing process 208, using sensing characteristic information for the solid object. When the solid object region 501 set through the solid object region setting process 401 is used as the reference size in a recognition process 209, the reference size is not necessarily the optimal recognition region for recognition. As such, in a recognition magnification setting process 402, the recognition region is corrected using recognition characteristic information for the solid object. In a scan region setting process 403, positioning characteristic information is used to set, for each recognition region, a scan region that is larger than the recognition region.

Description

Image recognition device

The present invention relates to an image recognition device.

In recent years, there has been increasing demand for improved performance of the image recognition devices required for driving support, automatic driving, and the like. For example, for the collision safety function for pedestrians, performance improvements are being required, such as the addition of a nighttime pedestrian collision safety test to automobile assessments. Realizing this requires high recognition performance for three-dimensional objects such as pedestrians.

Patent Literature 1 proposes a recognition device that, in a situation where a moving three-dimensional object and another three-dimensional object overlap, detects the moving three-dimensional object, such as a pedestrian, existing inside a predetermined area including the three-dimensional object by tracking feature points inside that area.

JP-A-2017-142760

However, the conventional apparatus has a problem in that recognition performance is reduced when the entire three-dimensional object cannot be detected properly.

An image recognition device according to a first aspect of the present invention includes: a three-dimensional object region setting unit that sets a three-dimensional object region by enlarging or reducing a three-dimensional object detection area set on an image captured by an imaging unit, based on detection characteristic information of the three-dimensional object; and a recognition processing unit that performs recognition processing to specify the type of the three-dimensional object in the three-dimensional object region set by the three-dimensional object region setting unit.
An image recognition device according to a second aspect of the present invention includes: a three-dimensional object region setting unit that sets a three-dimensional object region by enlarging or reducing a three-dimensional object detection area set on an image captured by an imaging unit, based on first characteristic information of the three-dimensional object; a recognition magnification setting unit that, using the three-dimensional object region obtained by the three-dimensional object region setting unit as a reference size, defines recognition areas of a plurality of sizes based on second characteristic information of the three-dimensional object; a scanning area setting unit that sets, for the plurality of recognition areas defined by the recognition magnification setting unit, a plurality of scanning areas wider than the recognition areas based on third characteristic information of the three-dimensional object; and a recognition processing unit that performs recognition processing using the scanning areas set by the scanning area setting unit.

According to the present invention, it is possible to provide an image recognition device that accurately detects a three-dimensional object and improves recognition performance.

FIG. 1 is a block diagram showing the overall configuration of an image recognition device.
FIG. 2 is a flowchart showing the operation of the image recognition device.
FIG. 3 is a diagram showing a three-dimensional object detection area set on an image by detection processing.
FIG. 4 is a flowchart showing the details of recognition processing.
FIG. 5 is a diagram explaining the principle of three-dimensional object area setting processing.
FIG. 6 is a diagram explaining the principle of recognition magnification setting processing.
FIG. 7 is a diagram explaining normalization in the recognition magnification setting processing.
FIG. 8 is a diagram explaining the principle of scanning area setting processing.
FIG. 9 is a diagram explaining the principle of magnification-based scan recognition processing.
FIG. 10 is a diagram explaining the principle of optimum magnification setting processing.
FIG. 11 is a diagram explaining the principle of detailed recognition position determination processing.
FIG. 12 is a diagram explaining the principle of detailed recognition processing.
FIG. 13 is a block diagram showing the overall configuration of an image recognition device according to a modification.

FIG. 1 is a block diagram showing the overall configuration of an image recognition device 100 according to the present embodiment. The image recognition device 100 includes a left camera 101 and a right camera 102 that are mounted on a vehicle and arranged on the left and right at the front of the vehicle. The cameras 101 and 102 constitute a stereo camera and capture images of three-dimensional objects such as pedestrians, vehicles, traffic signals, signs, white lines, and the tail lamps and headlights of cars. The image recognition device 100 recognizes the environment outside the vehicle based on the image information of the area ahead of the vehicle captured by the cameras 101 and 102. The vehicle (own vehicle) then controls braking, steering, and the like based on the recognition result of the image recognition device 100.

The image recognition device 100 takes in the images captured by the cameras 101 and 102 through the image input interface 103. The image information taken in through the image input interface 103 is sent to the image processing unit 104 via the internal bus 109. It is then processed by the arithmetic processing unit 105, and intermediate results and final-result image information are stored in the storage unit 106.

The image processing unit 104 compares the first image obtained from the image sensor of the left camera 101 with the second image obtained from the image sensor of the right camera 102, performs on each image the correction of device-specific deviations originating in the image sensor and image corrections such as noise interpolation, and stores the results in the storage unit 106. It further computes mutually corresponding points between the first image and the second image to obtain disparity information, which is stored in the storage unit 106 as distance information corresponding to each pixel on the image. The image processing unit 104 is connected to the arithmetic processing unit 105, the CAN interface 107, and the control processing unit 108 via the internal bus 109.

The arithmetic processing unit 105 uses the image information and the distance information (disparity information) stored in the storage unit 106 to recognize three-dimensional objects in order to grasp the environment around the vehicle. Part of the recognition results for three-dimensional objects and part of the intermediate processing results are stored in the storage unit 106. After recognizing three-dimensional objects in the captured image, the arithmetic processing unit 105 performs vehicle control calculations using the recognition results. The vehicle control policy obtained as a result of these calculations, and part of the recognition results, are transmitted to the in-vehicle network CAN 110 via the CAN interface 107, whereby the vehicle is controlled.

The control processing unit 108 monitors whether each processing unit is operating abnormally, whether errors have occurred during data transfer, and the like, and prevents abnormal operation. The image processing unit 104, the arithmetic processing unit 105, and the control processing unit 108 may be configured as a single computer unit or as a plurality of computer units.

FIG. 2 is a flowchart showing the operation of the image recognition device 100.
Images are captured by the left camera 101 and the right camera 102 provided in the image recognition device 100, and image processing 205, such as correction for absorbing the peculiar characteristics of each image sensor, is performed on each piece of captured image information 203 and 204. The processing result of the image processing 205 is stored in the image buffer 206. The image buffer 206 is provided in the storage unit 106 in FIG. 1.

Next, parallax processing 207 is performed. Specifically, the two corrected images are matched against each other, thereby obtaining disparity information between the images captured by the left camera 101 and the right camera 102. From the disparity between the left and right images, the distance from a point of interest on the image of a three-dimensional object to that object is obtained according to the principle of triangulation. The image processing 205 and the parallax processing 207 are performed by the image processing unit 104 in FIG. 1, and the finally obtained image information and parallax information are stored in the storage unit 106.

Then, in the next detection process 208, a three-dimensional object in three-dimensional space is detected using the parallax information in which the parallax or distance of each pixel of the left and right images was obtained by the parallax processing 207. FIG. 3 is a diagram showing the detection areas of three-dimensional objects set on an image by the detection process 208. FIG. 3 shows a pedestrian detection area 301 and a vehicle detection area 302 detected on the image by the cameras 101 and 102 as a result of the detection process 208. These detection areas 301 and 302 may be rectangular as shown in FIG. 3, or may be irregular areas obtained from parallax and distance; they are generally treated as rectangles to facilitate handling on a computer in subsequent processing. Hereinafter, in the present embodiment, the regions are treated as rectangles, and a pedestrian is mainly used as an example of a three-dimensional object.

 Next, in the recognition processing 209, a recognition process that identifies the type of the three-dimensional object is performed on the detection area set on the image by the detection processing 208. The objects to be recognized are, for example, pedestrians, vehicles, traffic signals, signs, white lines, and vehicle tail lamps or headlights, and the recognition determines which of these the object is. For the recognition processing 209 to recognize three-dimensional objects stably, the detection area on the image must coincide with the region of the target to be recognized. In practice, however, the cameras 101 and 102 may be unable to make these regions coincide exactly because of the brightness of the external environment, variations in imaging performance between the cameras, and so on. The same applies when a radar, such as a millimeter-wave radar, is combined with an image sensor such as a camera. The details of the recognition processing 209, which solves this problem, are described later.

 Next, in the vehicle control processing 210, taking into account the recognition results for three-dimensional objects and the state of the host vehicle (speed, steering angle, and so on), the system, for example, issues a warning to the occupants and controls the braking and steering angle of the host vehicle. Alternatively, it determines avoidance control for the recognized three-dimensional object and outputs the result as automatic control information via the CAN interface 107. The recognition processing 209 and the vehicle control processing 210 are performed by the arithmetic processing unit 105 in FIG. 1.

 The programs shown in the flowchart of FIG. 2 and in the flowchart of FIG. 4 described later can be executed by a computer having a CPU, memory, and the like. All or part of the processing may instead be realized by hard-wired logic circuits. The programs can be provided pre-stored in a storage medium of the image recognition device 100, stored and provided on an independent storage medium, or recorded into the storage medium of the image recognition device 100 over a network line. They may also be supplied as computer-readable computer program products in various other forms, such as data signals (carrier waves).

 FIG. 4 is a flowchart showing the details of the recognition processing 209. As shown in FIG. 4, it consists of a three-dimensional object area setting processing 401, a recognition magnification setting processing 402, a scanning area setting processing 403, a per-magnification scan recognition processing 404, an optimum magnification setting processing 406, a detailed recognition position determination processing 407, and a detailed recognition processing 408. Each processing is described below in order, assuming a stereo camera.

[Three-dimensional object area setting processing]
 In the three-dimensional object area setting processing 401, the detection area 301 of the three-dimensional object obtained by the detection processing 208 is enlarged or reduced based on the detection characteristic information of the three-dimensional object to set a three-dimensional object area 501.

 FIG. 5 illustrates the principle of the three-dimensional object area setting processing 401. It shows an example in which the pedestrian detection area 301 is enlarged based on the detection characteristic information to set the three-dimensional object area 501. The detection characteristic information includes, for example: (1) the distinguishability of the three-dimensional object, (2) the distance to the object, (3) the size of the object, (4) the assumed size of the object, (5) the brightness of the external environment, (6) the direction of the headlights, (7) the height of the road surface on which the object stands, and (8) the sensor resolution. Each of these is described below.

(1) Distinguishability of the three-dimensional object: with the cameras 101 and 102, a three-dimensional object may be hard to extract depending on its combination with the background, such as a pedestrian wearing clothes the same color as the road surface, or the top of a pedestrian's head at night. The cameras may also lose part of the object region when raindrops or the like blur part of the target. In such cases the detection area is enlarged. Conversely, a detected object may be a fusion of a person with a non-person region in three-dimensional space, such as a person merged with a roadside utility pole or fence, which is difficult to separate before identification; in such cases the detection area is reduced based on the color, luminance, and edges of the image. Furthermore, with a different type of sensor such as a radar sensor, or when the cameras 101 and 102 are mounted at different heights, occlusion occurs mainly in the upper part and the object appears smaller; with such processing characteristics, the detection area is enlarged.

(2) Distance to the three-dimensional object: the farther the object, the more the object area is enlarged; the nearer, the more it is reduced. The enlargement ratio may be determined from the resolution of the sensors, including the cameras 101 and 102, because the farther away the target is, the larger the three-dimensional-space size covered by one pixel and the larger the resulting error.

(3) Size of the three-dimensional object: the object area is enlarged if the object is small, and reduced if the object is large.

(4) Assumed size of the three-dimensional object: assuming, for example, that the object is a pedestrian, the object area is enlarged for an object too small for a pedestrian and reduced for one too large. How far to extend the target range may also take into account (5) the brightness of the external environment and (6) the direction of the headlights. For example, the object area is reduced in a bright daytime environment and enlarged in a dark nighttime environment. Depending on the headlight direction, if the headlights are on low beam the light falls on the feet, so the object area is enlarged in the height direction; if they are on high beam the whole body is illuminated, so the object area is reduced. The object area may also be enlarged or reduced according to the distance to the object and (7) the height of the road surface at the object's position. For example, when the road surface is low and the headlights are on high beam, the light does not reach the feet, so the object area is enlarged downward.

(8) Sensor resolution: if the sensors are the cameras 101 and 102, the size covered by one pixel varies with distance, so the object area is enlarged or reduced by combining the sensor resolution with the object's size and distance. For example, when the object is nearby, the three-dimensional-space resolution per pixel is high, so the object area is enlarged; when the object is far away, that resolution is low, so the object area is reduced. The object area is also reduced depending on the characteristics of the region acquired as the object area; for example, when the disparity region or sensor response region from which the object area was obtained tends to be set larger, the object area is reduced.

 An example of setting the three-dimensional object area 501 based on the detection characteristic information follows. With a pedestrian, the limb regions, which change greatly between images, are easily lost and come out small. At night, the head of a person with black hair blends into the background and is hard to detect. In such cases, the three-dimensional object area 501 on the image is adjusted based on the three-dimensional-space size covered by one pixel. For example, in bright daytime, the region presumed to be the head is extended by 0 cm and the foot region by 10 cm; at night, the head region is extended by 10 cm and the foot region by 10 cm; within the reach of the vehicle's low beams, the head region is extended by 10 cm and the foot region by 0 cm. The width is likewise enlarged or reduced as appropriate, and a correction based on the change in width over time may also be applied. Depending on the content of the subsequent recognition processing, the recognition area may instead be reduced, for example when a pedestrian is recognized from the upper body alone. The amount of enlargement or reduction may be determined as a fixed ratio or as a size on the image, but setting it on the basis of three-dimensional-space size makes it possible to exclude sizes that are impossible for the recognition target.
 Note that, depending on the relationship between three-dimensional-space distance and pixels, the size of the object area 501 after this enlargement or reduction may coincide with that of the detection area 301.
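
The conversion from a margin given in three-dimensional-space centimetres to a pixel margin can be sketched as follows; the focal length is a hypothetical calibration value and the margins are the illustrative figures above, not normative ones.

```python
def cm_to_px(margin_cm: float, distance_m: float, f_px: float = 1200.0) -> int:
    """Pixels spanned by margin_cm at distance_m: px = f * X / Z."""
    return round(f_px * (margin_cm / 100.0) / distance_m)

def expand_box(box, distance_m, top_cm=10, bottom_cm=10, side_cm=5):
    """Grow a detection rectangle by 3D-space margins converted to pixels."""
    x0, y0, x1, y1 = box
    return (x0 - cm_to_px(side_cm, distance_m),
            y0 - cm_to_px(top_cm, distance_m),
            x1 + cm_to_px(side_cm, distance_m),
            y1 + cm_to_px(bottom_cm, distance_m))

# At 30 m a 10 cm margin is only 4 px; at 5 m it would be 24 px.
print(expand_box((140, 60, 169, 179), 30.0))  # -> (138, 56, 171, 183)
```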

 The detection characteristic information has been described above; in the three-dimensional object area setting processing 401, combining several of these items sets the object area 501 more accurately. For example, combining distance, brightness of the external environment, light direction, road surface height, and sensor resolution yields an object area 501 less affected by day/night differences. The pixel counts and sizes given here are examples and are not limiting.

[Recognition magnification setting processing]
 Next, the recognition magnification setting processing 402 shown in FIG. 4 is described.
 If the three-dimensional object area 501 set by the object area setting processing 401 is taken as the reference size for the recognition processing 209, this reference size is not necessarily the optimum recognition area. The recognition magnification setting processing 402 therefore defines recognition areas of multiple sizes using the recognition characteristic information of the three-dimensional object. Since the optimum recognition area is unknown at this point, recognition areas of multiple sizes are obtained by enlarging or reducing the area from the reference size. The recognition characteristic information includes, for example: (1) the distance to the object, (2) the size of the object, (3) the limit size of the object, and (4) the sensor resolution. Each of these is described below.

(1) Distance to the three-dimensional object: this is an index for deciding the amount of enlargement or reduction of the recognition area. For example, when the object is far away, the object size per pixel is large. Since the input to the classifier 901 (described later) that performs the recognition processing is an image, the number of pixels by which the area is enlarged or reduced is smaller for a distant object than for a nearby one. The recognition areas of multiple sizes are therefore defined by enlarging or reducing the area from the reference size according to the distance to the object.

(2) Size of the three-dimensional object: this is likewise an index for deciding the amount of enlargement or reduction. For example, when the object is large, the number of pixels involved in enlarging or reducing on the image is smaller than when the object is small. When recognition areas of multiple sizes are defined based on real-space size, the number of pixels involved is smaller for a distant object than for a nearby one. If recognition areas are not set at sub-pixel precision, a resulting area may coincide with the reference size.

(3) Limit size of the three-dimensional object: this is the size limit assumed when the object is a recognition target. For example, when the object is a pedestrian, if it is large, say taller than 2.5 meters, areas are set in the reduction direction; conversely, if it is small, say shorter than 0.8 meters, areas are set in the enlargement direction; in between, areas are set in both directions. The upper and lower limits of the areas to set may be determined from the recognition target and the constraints of the recognition processing.

(4) Sensor resolution: if the sensors are the cameras 101 and 102, for example, the size per pixel changes with distance, so the range of enlargement or reduction can be determined from the sensor resolution. At long range, where one pixel covers more than 20 cm of three-dimensional space, the enlargement or reduction range is set to a small value such as one or two pixels; conversely, at close range, where one pixel covers less than 1 cm, the area is enlarged or reduced over a large range such as 10 or 20 pixels.

 The size on the image may also be back-calculated from the object's size in three-dimensional space. For detection characteristic information not taken into account when setting the object area 501 in the object area setting processing 401, variations of the recognition area may be set using that information in the recognition magnification setting processing 402; which conditions lead to enlargement or reduction is the same as described for the object area setting processing 401. The pixel counts and sizes given here are examples and are not limiting.

 FIG. 6 illustrates the principle of the recognition magnification setting processing 402, which takes the three-dimensional object area 501 as the reference-size recognition area and defines recognition areas 601 and 602 by reducing and enlarging it: recognition area 501 is the reference size, recognition area 601 is a smaller-magnification area obtained by reducing the reference size, and recognition area 602 is a larger-magnification area obtained by enlarging it. The example of FIG. 6 shows two recognition areas scaled from the reference size, but many more variations may be used if the recognition processing time allows. Depending on the settings of the detection processing 208 and the object area setting processing 401, only enlargement or only reduction may be applied. The amount of enlargement or reduction of a recognition area is set based on the recognition characteristic information. As with the object area setting processing 401, depending on the image resolution a scaled area may coincide with the reference-size recognition area.
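
Deriving the scaled recognition areas around the reference size can be sketched as below; the scale factors are assumed values, whereas the device derives the amounts from the recognition characteristic information.

```python
def scaled_boxes(box, scales=(0.9, 1.0, 1.1)):
    """Scale a rectangle about its centre; 1.0 keeps the reference size."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    w, h = x1 - x0, y1 - y0
    return [(round(cx - s * w / 2), round(cy - s * h / 2),
             round(cx + s * w / 2), round(cy + s * h / 2))
            for s in scales]

for b in scaled_boxes((138, 56, 171, 183)):
    print(b)  # reduced, reference, and enlarged recognition areas
```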

 FIG. 7 illustrates normalization in the recognition magnification setting processing 402.
 As shown in FIG. 7, a region to be normalized is defined for each recognition area (501, 601, 602) when the subsequent recognition processing is performed. A recognition area indicates the range over which the recognition processing, described later, is performed. Recognition area 501 is the reference size, recognition area 601 is the smaller-magnification area obtained by reducing the reference size, and recognition area 602 is the larger-magnification area obtained by enlarging it.

 The recognition processing requires its inputs to have the same number of dimensions. There is no guarantee that the reference-size recognition area 501 captures the target object cleanly, and how the target should be framed depends on the characteristics of the recognition processing implemented in the device; the region to be normalized is therefore set in advance. In the example of FIG. 7, recognition area 501 roughly contains the head and feet, whereas in the smaller recognition area 601 the top of the head and the limbs stick out, and in the larger recognition area 602 there are margins around the head and feet. Normalizing these recognition areas to the same size yields the normalized recognition areas 701, 702, and 703 shown in FIG. 7, allowing the same processing to be applied in the recognition stage described later. This normalization need not be performed within the recognition magnification setting processing 402; it may instead be carried out as part of the per-magnification scan recognition processing 404 or the detailed recognition processing 408 described later.
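
Normalization to a common classifier input could look like the sketch below, using nearest-neighbour resampling; the 32x64 output size is an assumption, not a size specified here.

```python
import numpy as np

def normalize_crop(img: np.ndarray, box, out_w=32, out_h=64) -> np.ndarray:
    """Crop a recognition area and resample it to a fixed input size."""
    x0, y0, x1, y1 = box
    crop = img[y0:y1, x0:x1]
    ys = (np.arange(out_h) * crop.shape[0] / out_h).astype(int)
    xs = (np.arange(out_w) * crop.shape[1] / out_w).astype(int)
    return crop[ys][:, xs]

img = np.random.randint(0, 256, (240, 320), dtype=np.uint8)
patches = [normalize_crop(img, b) for b in [(138, 56, 171, 183),
                                            (136, 50, 173, 189)]]
print([p.shape for p in patches])  # both (64, 32) despite different boxes
```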

[Scanning area setting processing]
 Next, the scanning area setting processing 403 of FIG. 4 is described. The scanning area setting processing 403 sets, for each recognition area, a scanning area larger than the recognition area based on the arrangement characteristic information of the three-dimensional object. The scanning area is set as a region on the image, and during recognition the recognition area is swept across it. That is, the recognition area indicates the range over which the recognition processing, described later, is performed, and the scanning area is the range within which that recognition area is moved; recognition is thus performed while moving the recognition area within the scanning area. The arrangement characteristic information that determines the size of the scanning area includes, for example, (1) the near/far position of the object and (2) the height of the road surface on which the object stands. Each of these is described below.

(1) Near/far position of the three-dimensional object: this is an index for setting the scanning area. For example, when the object is nearby, the scanning area on the image is set large; when the object is far away, it is set small. This is because near the camera the sensor resolution is high and scanning by one pixel corresponds to only a few millimeters in three-dimensional space, whereas at long range it exceeds 10 cm. The scanning area is also determined by characteristics such as the amount of detection offset produced by the object detection. For example, when using a recognition process that performs best when centered on the object's lateral position, the scanning area may be set so that the lateral center of the actual recognition target falls within it, based on the offset and variance between the detected lateral center and the true one.

(2) Height of the road surface on which the object stands: this is another index for setting the scanning area. For example, when the road surface rises and the object (such as a pedestrian) is higher than the host vehicle, occlusion of the head increases and the measured height comes out smaller than it really is. When the object is at a low position, the feet may be cut off by the field of view or hidden by the bumper, depending on the viewing angle. The scanning area is enlarged or reduced to match such conditions.

 For detection characteristic information or recognition characteristic information not taken into account when setting the object area or the recognition areas in the processings 401 and 402, the scanning area setting processing 403 may use that information to define the scanning area. Which conditions lead to enlarging or reducing the scanning area is the same as for the object area setting processing 401 and the recognition magnification setting processing 402. The pixel counts and sizes given here are examples and are not limiting.

 FIG. 8 illustrates the principle of the scanning area setting processing 403, which defines scanning areas 801, 802, and 803 for the recognition areas 501, 601, and 602, respectively. The scanning areas 801, 802, and 803 are the same size as or larger than the recognition areas 501, 601, and 602; since each scanning area is swept by its recognition area, however, the amount of scanning is not necessarily large. The scanning areas 801, 802, and 803 are defined as regions on the image from the arrangement characteristic information. Depending on the image resolution, a recognition area and its scanning area may coincide on the image. A scanning area is defined individually for each recognition area, but if processing time allows, the single largest scanning area may be adopted for all; conversely, if processing time is tight, one small scanning area may be applied to every recognition area.
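
Padding a recognition area into a scanning area whose margin shrinks with distance could be sketched as follows; the margin schedule (about 40/Z pixels) is a hypothetical choice, not taken from this description.

```python
def scan_area(box, distance_m, img_w=320, img_h=240):
    """Pad a recognition area into a scanning area, clipped to the image."""
    margin = max(1, round(40 / distance_m))  # hypothetical: ~40/Z pixels
    x0, y0, x1, y1 = box
    return (max(0, x0 - margin), max(0, y0 - margin),
            min(img_w - 1, x1 + margin), min(img_h - 1, y1 + margin))

print(scan_area((138, 56, 171, 183), 5.0))   # wide margin when near
print(scan_area((138, 56, 171, 183), 30.0))  # narrow margin when far
```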

[Per-magnification scan recognition processing]
 Next, the per-magnification scan recognition processing 404 shown in FIG. 4 is described. In this processing, the image and disparity (distance) regions corresponding to the scanning areas 801, 802, and 803 are scanned with the recognition areas 501, 601, and 602, and recognition is performed at each scanning position of each size to determine whether the target at that position is a three-dimensional object.

 If the performance of the recognition processing is sufficient here, the vehicle control processing 210 may be carried out using the result of the per-magnification scan recognition processing 404, as indicated by the broken line 405 in FIG. 4. The per-magnification scan recognition processing 404 may produce multiple results depending on magnification, scanning position, and so on; these are narrowed down, for example by selecting the one with the best recognition result.

 FIG. 9 illustrates the principle of the per-magnification scan recognition processing 404. While each of the scanning areas 801, 802, and 803 is swept with the recognition areas 501, 601, and 602, the response positions 902, the positions recognized by the classifier 901 that performs the recognition processing, are collected. The response positions 902 are marked with x in FIG. 9; the more response positions, the better the recognition. In the example results of recognizing within the scanning areas with the classifier 901, shown as scanning areas 801', 802', and 803', scanning area 801' has the most response positions 902.
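
The scan itself can be sketched as a sliding window that collects response positions; `classify` below is a dummy stand-in for the classifier 901, present only so the loop runs.

```python
import numpy as np

def classify(patch: np.ndarray) -> float:
    """Dummy discriminator: scores 1.0 when the mean intensity is high."""
    return float(patch.mean() > 128)

def scan(img, scan_box, win_w, win_h, step=2):
    """Slide a win_w x win_h recognition window over the scanning area."""
    sx0, sy0, sx1, sy1 = scan_box
    hits = []
    for y in range(sy0, sy1 - win_h + 1, step):
        for x in range(sx0, sx1 - win_w + 1, step):
            if classify(img[y:y + win_h, x:x + win_w]) > 0.5:
                hits.append((x, y))  # a response position
    return hits

img = np.random.randint(0, 256, (240, 320), dtype=np.uint8)
print(len(scan(img, (130, 50, 180, 190), 33, 127)))
```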

 The classifier 901 may use machine learning or a heuristic threshold decision. If this result is sufficient, recognition may end here, as indicated by the broken line 405 in FIG. 4; in that case, for example, the result with the best recognition is adopted.

 When the performance of the per-magnification scan recognition processing 404 is insufficient, for example because its computational cost has been reduced, detailed processing may be carried out using its results. This embodiment describes the case where the detailed processing comprises the optimum magnification setting processing 406, the detailed recognition position determination processing 407, and the detailed recognition processing 408.

[Optimum magnification setting processing]
 The optimum magnification setting processing 406 shown in FIG. 4 selects, from the recognition areas of multiple sizes created by the recognition magnification setting processing 402, the recognition area best suited to the detailed recognition processing. The selection uses, for example, the number and reliability of positions judged to be the recognition target in the scan results, the number and reliability of positions judged not to be the target, and the distribution of recognition results; the number and reliability of responses are compared across the recognition areas of different sizes, and the best recognition area is used. The optimum magnification setting processing 406 may be omitted if there is not enough processing time.

 FIG. 10 illustrates the principle of the optimum magnification setting processing 406. From the recognition results at the multiple magnifications, the optimum magnification, the one with the best response, is selected using, as described above, the number of responses in the scanning area and their reliability. In the example of FIG. 10, scanning area 801', which had the most responses, is selected; it corresponds to the reference-size scanning area 801, which in turn corresponds to the reference-size recognition area 501.
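
Selecting the optimum magnification by comparing per-scale responses could be as simple as the sketch below; tie-breaking by classifier confidence, which the text also allows, is omitted for brevity.

```python
def best_scale(responses_per_scale: dict) -> float:
    """Return the scale whose scan produced the most response positions."""
    return max(responses_per_scale, key=lambda s: len(responses_per_scale[s]))

responses = {0.9: [(1, 2)],
             1.0: [(3, 4), (3, 6), (5, 4)],
             1.1: [(7, 8)]}
print(best_scale(responses))  # -> 1.0, the scale with three responses
```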

[Detailed recognition position determination processing]
 The detailed recognition position determination processing 407 shown in FIG. 4 determines, for the optimum magnification obtained by the optimum magnification setting processing 406, the representative position at which detailed recognition is performed. For example, the position with the highest recognition reliability obtained in the per-magnification scan recognition processing 404 is chosen, or the position may be determined using a clustering method such as the mean shift method. When the optimum magnification setting processing 406 is not performed, the detailed recognition position determination processing 407 may be carried out for each magnification.

 FIG. 11 illustrates the principle of the detailed recognition position determination processing 407. From the one or more response positions obtained by the per-magnification scan recognition processing 404, the representative position 111 at which the detailed recognition processing 408 is performed is determined. When multiple response points exist, a clustering technique such as the mean shift method is used. The region centered on the determined representative position 111 becomes the detailed identification region.
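
A fixed-bandwidth mean-shift step for condensing the response positions into the representative position 111 could look like this sketch; the bandwidth is an assumed value.

```python
import numpy as np

def mean_shift_mode(points: np.ndarray, bandwidth: float = 25.0,
                    iters: int = 20) -> np.ndarray:
    """Move an estimate toward the densest cluster of response positions."""
    mode = points.mean(axis=0)
    for _ in range(iters):
        d = np.linalg.norm(points - mode, axis=1)
        near = points[d < bandwidth]
        if len(near) == 0:
            break
        mode = near.mean(axis=0)
    return mode

pts = np.array([(150, 60), (152, 62), (151, 61), (190, 120)], dtype=float)
print(mean_shift_mode(pts))  # converges near the dense cluster at (151, 61)
```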

[Detailed recognition processing]
 The detailed recognition processing 408 shown in FIG. 4 performs detailed recognition on the representative position 111 determined by the detailed recognition position determination processing 407 and computes the type and reliability of the target. Alternatively, detailed recognition is performed using the optimum-size recognition area selected on the basis of the response positions from the per-magnification scan recognition processing 404, and the type and reliability of the target are computed. The detailed recognition processing 408 uses a classifier 120 whose classification performance is equal to or better than that of the recognition processing used in the per-magnification scan recognition processing 404.

 FIG. 12 illustrates the principle of the detailed recognition processing 408. Detailed recognition is performed on the representative position 111 obtained by the detailed recognition position determination processing 407 using the classifier 120, and the type of the three-dimensional object is determined: for example, pedestrian, vehicle, traffic signal, sign, white line, or vehicle tail lamp or headlight.

 The recognition processing used in the per-magnification scan recognition processing 404 and the detailed recognition processing 408 includes, for example, the following techniques: template matching, which compares the recognition area against a template prepared in advance that captures the appearance of the recognition target; and classifiers that combine features such as luminance images, HOG, or Haar-like features with machine learning methods such as support vector machines, AdaBoost, or deep learning. Recognition may also be based on hand-set threshold decisions on edge shapes and the like. The per-magnification scan recognition processing 404 and the detailed recognition processing 408 include the image processing needed to carry these out, such as resizing, smoothing, edge extraction, normalization, isolated-point removal, gradient extraction, color conversion, and histogram creation.
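
As one example of the listed ingredients, a single-cell histogram-of-oriented-gradients feature can be sketched in a few lines; real HOG adds a cell grid and block normalization, and this is not the feature pipeline of the device.

```python
import numpy as np

def gradient_histogram(patch: np.ndarray, bins: int = 8) -> np.ndarray:
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    gy, gx = np.gradient(patch.astype(np.float32))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # fold to [0, pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-6)

patch = np.random.randint(0, 256, (64, 32), dtype=np.uint8)
print(gradient_histogram(patch).shape)  # -> (8,)
```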

(Modification)
 This embodiment has been described with the image recognition device 100 using a stereo camera, but it may also be realized with an image recognition device 100' that does not use a stereo camera.
 FIG. 13 shows the processing operation of the image recognition device 100'. Parts identical to those of the image recognition device 100 shown in FIG. 2 are given the same reference numerals, and their description is omitted.

 The image recognition device 100' includes an optical camera 1301 and a radar sensor 1302, with which it detects three-dimensional objects. An image is captured by the optical camera 1301, and the captured image information undergoes image processing 205, such as corrections that absorb the characteristic quirks of the image sensor; the result is stored in the image buffer 206. The radar sensor 1302 provides the distance to the three-dimensional object. The detection processing 1303 detects objects in three-dimensional space based on that distance, and the recognition processing 209 identifies the type of object within the detection area set by the detection processing 1303.

 The detection processing 1303, which takes as input the distance to the object output by the radar sensor 1302, must account for the sensor characteristics of the radar sensor 1302 used for distance measurement, but the processing after the detection area is determined can be the same as in the stereo-camera configuration described for the image recognition device 100. The image recognition device 100' also does not require multiple images in the image processing 205.

 According to the embodiment described above, the following effects are obtained.
(1) The image recognition devices 100 and 100' include a three-dimensional object area setting processing 401 that sets a three-dimensional object area 501 by enlarging or reducing the detection area 301 of a three-dimensional object set on the images captured by the cameras 101 and 102, based on the detection characteristic information of the object, and a recognition processing 209 that identifies the type of the object within the object area 501 set by the processing 401. The detection characteristic information is, for example, at least one of: the distinguishability of the object, the distance to the object, the size of the object, the assumed size of the object, the brightness of the external environment, the direction of the headlights, the height of the road surface on which the object stands, and the sensor resolution of the imaging unit. This provides an image recognition device that detects three-dimensional objects accurately and with improved recognition performance.

(2) The image recognition devices 100 and 100' include: a three-dimensional object area setting processing 401 that sets a three-dimensional object area 501 by enlarging or reducing the detection area 301 set on the captured images, based on first characteristic information of the object; a recognition magnification setting processing 402 that, taking the object area 501 obtained by the processing 401 as a reference size, defines recognition areas 601 and 602 of multiple sizes based on second characteristic information of the object; a scanning area setting processing 403 that, for the recognition areas 601 and 602 defined by the processing 402, sets scanning areas 802 and 803 wider than the recognition areas based on third characteristic information of the object; and a recognition processing 209 that performs recognition using the scanning areas 802 and 803 set by the processing 403. The first through third characteristic information is, for example, at least one of: the distinguishability of the object, the distance to the object, the size of the object, the assumed size of the object, the brightness of the external environment, the direction of the headlights, the height of the road surface on which the object stands, the sensor resolution of the imaging unit, the limit size of the object, and the near/far position of the object. This likewise provides an image recognition device that detects three-dimensional objects accurately and with improved recognition performance.

 The present invention is not limited to the embodiment above; other forms conceivable within the scope of its technical idea are also included in the scope of the invention, as long as its features are not impaired. The embodiment and the modification described above may also be combined.

100, 100' image recognition device; 101, 102 camera; 103 image input interface; 104 image processing unit; 105 arithmetic processing unit; 106 storage unit; 107 CAN interface; 108 control processing unit; 109 internal bus; 110 in-vehicle network CAN

Claims (12)

1. An image recognition device comprising:
 a three-dimensional object area setting unit that sets a three-dimensional object area by enlarging or reducing a detection area of a three-dimensional object set on an image captured by an imaging unit, based on detection characteristic information of the three-dimensional object; and
 a recognition processing unit that performs recognition processing for identifying the type of the three-dimensional object on the three-dimensional object area set by the three-dimensional object area setting unit.
2. The image recognition device according to claim 1, wherein the detection characteristic information is at least one of: the distinguishability of the three-dimensional object, the distance to the three-dimensional object, the size of the three-dimensional object, the assumed size of the three-dimensional object, the brightness of the external environment, the direction of the headlights, the height of the road surface on which the three-dimensional object exists, and the sensor resolution of the imaging unit.
3. The image recognition device according to claim 1 or 2, further comprising a recognition magnification setting unit that, taking the three-dimensional object area obtained by the three-dimensional object area setting unit as a reference size, defines recognition areas of a plurality of sizes based on recognition characteristic information of the three-dimensional object,
 wherein the recognition processing unit performs the recognition processing on each of the recognition areas of the plurality of sizes defined by the recognition magnification setting unit.
4. The image recognition device according to claim 3, wherein the recognition characteristic information is at least one of: the distance to the three-dimensional object, the size of the three-dimensional object, the limit size of the three-dimensional object, and the sensor resolution of the imaging unit.
5. The image recognition device according to claim 3, further comprising a scanning area setting unit that sets, for the plurality of recognition areas defined by the recognition magnification setting unit, a plurality of scanning areas wider than the recognition areas based on arrangement characteristic information of the three-dimensional object,
 wherein the recognition processing unit performs the recognition processing using the scanning areas set by the scanning area setting unit.
6. The image recognition device according to claim 5, wherein the arrangement characteristic information is at least one of the near/far position of the three-dimensional object and the height of the road surface on which the three-dimensional object exists.
7. The image recognition device according to claim 5, further comprising a per-magnification scan recognition processing unit that scans the plurality of scanning areas with the plurality of recognition areas to obtain response positions of recognition results.
8. The image recognition device according to claim 7, further comprising an optimum magnification setting unit that selects the recognition area of the optimum size based on the response positions from the per-magnification scan recognition processing unit.
9. The image recognition device according to claim 8, further comprising a detailed recognition position determination processing unit that determines a representative position for performing the recognition processing based on the response positions in the scanning area corresponding to the recognition area selected by the optimum magnification setting unit.
10. The image recognition device according to claim 9, further comprising a detailed recognition processing unit that performs the recognition processing using the recognition area of the optimum size or the representative position and identifies the type of the three-dimensional object to be recognized.
11. An image recognition device comprising:
 a three-dimensional object area setting unit that sets a three-dimensional object area by enlarging or reducing a detection area of a three-dimensional object set on an image captured by an imaging unit, based on first characteristic information of the three-dimensional object;
 a recognition magnification setting unit that, taking the three-dimensional object area obtained by the three-dimensional object area setting unit as a reference size, defines recognition areas of a plurality of sizes based on second characteristic information of the three-dimensional object;
 a scanning area setting unit that sets, for the plurality of recognition areas defined by the recognition magnification setting unit, a plurality of scanning areas wider than the recognition areas based on third characteristic information of the three-dimensional object; and
 a recognition processing unit that performs recognition processing for identifying the type of the three-dimensional object using the scanning areas set by the scanning area setting unit.
12. The image recognition device according to claim 11, wherein the first characteristic information, the second characteristic information, and the third characteristic information are each at least one of: the distinguishability of the three-dimensional object, the distance to the three-dimensional object, the size of the three-dimensional object, the assumed size of the three-dimensional object, the brightness of the external environment, the direction of the headlights, the height of the road surface on which the three-dimensional object exists, the sensor resolution of the imaging unit, the limit size of the three-dimensional object, and the near/far position of the three-dimensional object.
PCT/JP2019/030823 2018-09-12 2019-08-06 Image recognition device Ceased WO2020054260A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2020546756A JP6983334B2 (en) 2018-09-12 2019-08-06 Image recognition device
CN201980054785.1A CN112639877A (en) 2018-09-12 2019-08-06 Image recognition device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2018170737 2018-09-12
JP2018-170737 2018-09-12

Publications (1)

Publication Number Publication Date
WO2020054260A1 (en)

Family

ID=69777082

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/030823 Ceased WO2020054260A1 (en) 2018-09-12 2019-08-06 Image recognition device

Country Status (3)

Country Link
JP (1) JP6983334B2 (en)
CN (1) CN112639877A (en)
WO (1) WO2020054260A1 (en)


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2635280C2 (en) * 2012-03-01 2017-11-09 Ниссан Мотор Ко., Лтд. Device for detecting three-dimensional objects
JP6163453B2 (en) * 2014-05-19 2017-07-12 本田技研工業株式会社 Object detection device, driving support device, object detection method, and object detection program
JP6397801B2 (en) * 2015-06-30 2018-09-26 日立オートモティブシステムズ株式会社 Object detection device
CN107993256A (en) * 2017-11-27 2018-05-04 广东工业大学 Dynamic target tracking method, apparatus and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005316607A (en) * 2004-04-27 2005-11-10 Toyota Motor Corp Image processing apparatus and image processing method
JP2010128919A (en) * 2008-11-28 2010-06-10 Hitachi Automotive Systems Ltd Object detection apparatus
JP2010250501A (en) * 2009-04-14 2010-11-04 Hitachi Automotive Systems Ltd Vehicle external recognition device and vehicle system using the same
JP2013161241A (en) * 2012-02-03 2013-08-19 Toyota Motor Corp Object recognition device and object recognition method
WO2014103433A1 (en) * 2012-12-25 2014-07-03 本田技研工業株式会社 Vehicle periphery monitoring device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116391119A (en) * 2020-11-06 2023-07-04 株式会社电装 Raindrop detection device
CN112487917A (en) * 2020-11-25 2021-03-12 中电科西北集团有限公司 Template matching cabinet indicator lamp identification method and device and storage medium
JPWO2022113470A1 (en) * 2020-11-30 2022-06-02
WO2022113470A1 (en) * 2020-11-30 2022-06-02 日立Astemo株式会社 Image processing device and image processing method
DE112021004901T5 (en) 2020-11-30 2023-07-27 Hitachi Astemo, Ltd. IMAGE PROCESSING DEVICE AND IMAGE PROCESSING METHOD
WO2024157413A1 (en) * 2023-01-26 2024-08-02 日立Astemo株式会社 Environment recognition device and environment recognition method

Also Published As

Publication number Publication date
JP6983334B2 (en) 2021-12-17
CN112639877A (en) 2021-04-09
JPWO2020054260A1 (en) 2021-08-30

Similar Documents

Publication Publication Date Title
JP7369921B2 (en) Object identification systems, arithmetic processing units, automobiles, vehicle lights, learning methods for classifiers
US7566851B2 (en) Headlight, taillight and streetlight detection
US7957559B2 (en) Apparatus and system for recognizing environment surrounding vehicle
US8908924B2 (en) Exterior environment recognition device and exterior environment recognition method
CN102779430B Vision-based night-time rear collision warning system, controller, and method of operating thereof
US8848980B2 (en) Front vehicle detecting method and front vehicle detecting apparatus
JP6983334B2 (en) Image recognition device
JP2015143979A (en) Image processing apparatus, image processing method, program, and image processing system
JP2003296736A (en) Obstacle detection apparatus and method
JP7032280B2 (en) Pedestrian crossing marking estimation device
JP5874831B2 (en) Three-dimensional object detection device
JP6569280B2 (en) Road marking detection device and road marking detection method
WO2018008461A1 (en) Image processing device
JP7201706B2 (en) Image processing device
JP7460282B2 (en) Obstacle detection device, obstacle detection method, and obstacle detection program
CN101978392B (en) Image processing device for vehicle
JP4969359B2 (en) Moving object recognition device
JP6174960B2 (en) Outside environment recognition device
JP2018163530A (en) Object detection device, object detection method, and object detection program
KR102629639B1 (en) Apparatus and method for determining position of dual camera for vehicle
JP2020126304A (en) Out-of-vehicle object detection apparatus
CN111133439A (en) Panoramic monitoring system
JP6034713B2 (en) Outside environment recognition device and outside environment recognition method
JP6273156B2 (en) Pedestrian recognition device
JP6582891B2 (en) Empty vehicle frame identification system, method and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 19859548
    Country of ref document: EP
    Kind code of ref document: A1
ENP Entry into the national phase
    Ref document number: 2020546756
    Country of ref document: JP
    Kind code of ref document: A
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 19859548
    Country of ref document: EP
    Kind code of ref document: A1