WO2015030689A2 - A tool and method for robust, scale and orientation invariant object detection and classification - Google Patents

A tool and method for robust, scale and orientation invariant object detection and classification

Info

Publication number
WO2015030689A2
WO2015030689A2, PCT/TR2014/000318, TR2014000318W
Authority
WO
WIPO (PCT)
Prior art keywords
hog
unit
mlp
camera
hog feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/TR2014/000318
Other languages
French (fr)
Other versions
WO2015030689A3 (en)
Inventor
Halis ALTUN
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Publication of WO2015030689A2 (en)
Publication of WO2015030689A3 (en)
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention relates to a shape detection method and system, and more particularly to an automatic shape detection method and system that detects the shape of an object in an image acquired by a camera.

Description

DESCRIPTION
A TOOL AND METHOD FOR ROBUST, SCALE AND ORIENTATION INVARIANT OBJECT DETECTION AND CLASSIFICATION
TECHNICAL FIELD
The invention relates to a shape detection method and system, and more particularly to an automatic shape detection method and system that detects the shape of an object in an image acquired by a camera.
PRIOR TECHNIQUE
Object detection is used in a variety of industrial applications, and a large number of methods have therefore been proposed to solve this problem. For a successful solution, a method should be robust against noise and illumination effects, and should be invariant to the scale and orientation of the object, because the object may appear in various poses and positions in front of the camera. In the literature, various approaches have been advised to tackle these problems. In the present invention, we propose a new, robust and fast approach.
BRIEF SUMMARY OF THE INVENTION
Automatically recognizing an object in an image using features related to its shape is often a fundamental step in the automatic detection and classification of objects. Shape detection therefore plays an important role and is widely used in a variety of applications.
In the invention, the adverse effects of illumination and noise are alleviated using edge information. Furthermore, a normalization scheme is proposed to tackle the problems caused by different sizes of the object in an image; this is the case, for example, when the distance of the object from the camera is not fixed. An alignment scheme is also proposed to address the orientation problem caused by the different positions the object may take in an image.
In the invention, the Histogram of Oriented Gradient (HOG) algorithm is proposed to extract shape-related features from edge information. The average magnitude difference function (AMDF) is employed to obtain a rotation-invariant shape detection algorithm. AMDF is a dissimilarity function commonly used in speech analysis to measure the similarity between waveforms. In the invention, the obtained HOG features form a waveform that exhibits a circular periodicity over 0-360 degrees. The invention uses an AMDF module, for the first time in the literature, to find the rotation angle present in the image of the object by comparing the current waveform (i.e. the HOG feature vector) to the waveforms obtained from the original template objects and stored in memory. The AMDF module provides a score indicating the similarity between the original waveforms and the present waveform of the given object, together with the rotation angle between the two. Based on this information, an alignment stage removes the rotation from the object shape before the features of the object are passed to the recognition module. In the invention, a Multilayer Perceptron (MLP) type Artificial Neural Network (ANN) is proposed as the classifier in the recognition module. The ANN is trained using the original reference waveforms belonging to the shapes to be detected.
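The summary does not state the AMDF expression itself; for reference, a standard circular AMDF between the current HOG waveform x and a stored template waveform t, both of N bins, can be written as follows (the notation is an assumption, not taken from the patent):

    \mathrm{AMDF}(k) \;=\; \frac{1}{N}\sum_{n=0}^{N-1}\bigl|\, x\bigl((n+k)\bmod N\bigr) - t(n) \,\bigr|, \qquad k = 0, 1, \ldots, N-1

Under this reading, the shift k that minimizes AMDF(k) corresponds to the rotation of the object (each histogram bin spanning 360/N degrees), and the minimum value itself serves as the match score between the two waveforms.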
DETAILED DESCRIPTION OF THE INVENTION
The present invention provides an object detection and classification method and system. The invention requires acquiring an image with a camera, in which the object is detected automatically by the proposed method. The image undergoes a preprocessing stage that provides a smoothed version of the acquired image. The smoothed image is then used to extract feature vectors based on edge information, so that the obtained feature vectors are robust against illumination and noise. The feature vectors further undergo a process that makes them robust against the scale and orientation of the object in the image.
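The description does not name a particular smoothing filter or edge operator; the short Python sketch below, which assumes a Gaussian blur and the Canny operator from OpenCV purely for illustration, shows one plausible form of this preprocessing stage:

    import cv2

    def preprocess(image_bgr):
        # Smooth the acquired image and extract a binary edge map.
        # The choice of Gaussian blur and Canny is an assumption, not specified by the patent.
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        smoothed = cv2.GaussianBlur(gray, (5, 5), 1.5)   # suppresses noise before edge extraction
        edges = cv2.Canny(smoothed, 50, 150)             # edge information used by the HOG stage
        return smoothed, edges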
In the proposed method, HOG features are obtained from the edge information produced by one of the edge operators available in the literature.
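As an illustration of the kind of feature this implies, the sketch below builds a single global orientation histogram over 0-360 degrees, weighted by gradient magnitude and restricted to edge pixels; the Sobel gradients and the bin count of 36 are assumptions rather than values given in the patent:

    import numpy as np
    import cv2

    def hog_waveform(smoothed, edges, n_bins=36):
        # Global histogram of gradient orientations (0-360 degrees), computed only on edge
        # pixels, producing the circular HOG "waveform" that the later stages operate on.
        gx = cv2.Sobel(smoothed, cv2.CV_64F, 1, 0, ksize=3)
        gy = cv2.Sobel(smoothed, cv2.CV_64F, 0, 1, ksize=3)
        magnitude = np.hypot(gx, gy)
        angle = (np.degrees(np.arctan2(gy, gx)) + 360.0) % 360.0   # map angles to [0, 360)
        mask = edges > 0
        hist, _ = np.histogram(angle[mask], bins=n_bins, range=(0.0, 360.0),
                               weights=magnitude[mask])
        return hist.astype(np.float64)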
The present invention eliminates the adverse effects of illumination and noise by relying on edge information. Furthermore, in order to alleviate the problems caused by orientation, the present invention suggests using AMDF as an indication of the degree of rotation of the object in the image; and in order to alleviate the problems caused by scale and size variation, it suggests normalizing the HOG vectors based on an average value of the HOG feature vector.
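The exact normalization rule is not spelled out; one plausible reading, in which every bin is divided by the mean bin value so that uniform scaling of the object cancels out, is sketched below:

    def normalize_hog(hog, eps=1e-12):
        # Scale-normalize the HOG waveform by its average bin value (assumed interpretation).
        return hog / (hog.mean() + eps)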
In the present invention, a new method is advised for the alignment of HOG feature vectors in order to correctly classify an object that might lie in different positions and orientations in front of the camera. In the proposed method, the stored HOG vectors, which belong to the pre-determined objects and are already obtained and held in memory, are compared to the HOG vector extracted from the current input image using the AMDF module. The AMDF module provides a score as the degree of similarity between the current HOG feature vector and the HOG feature vectors stored in memory. It is easy to show that the orientation of the object in the current image introduces a shift on the HOG feature vector. If a method were able to remove this shift, a high similarity score would be obtained between the current HOG feature vector and one of the template HOG feature vectors. In operation, the AMDF module introduces a shift on the current HOG feature vector, between 0 and 360 degrees, and calculates the degree of similarity at each step. The highest degree of similarity therefore occurs when the introduced shift matches the shift already present in the current HOG feature vector, which indicates the amount of shift that must be removed to obtain an orientation-invariant method. After the alignment process, the HOG feature vector is sent to the recognition unit, which consists of an artificial neural network used as a classifier.
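A compact sketch of this circular-shift search, following the AMDF expression given earlier (the lowest AMDF value marks the best alignment; converting the best shift to degrees assumes uniformly spaced bins over 360 degrees):

    import numpy as np

    def amdf_align(current, template):
        # Evaluate the AMDF for every circular shift of the current HOG waveform,
        # pick the shift with the lowest dissimilarity, and remove it.
        n = len(template)
        scores = np.array([np.mean(np.abs(np.roll(current, -k) - template))
                           for k in range(n)])
        k_best = int(np.argmin(scores))          # highest similarity = minimum AMDF
        aligned = np.roll(current, -k_best)      # orientation shift removed
        rotation_deg = k_best * 360.0 / n        # estimated rotation of the object
        return aligned, float(scores[k_best]), rotation_deg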
In the next step, a multilayer perceptron (MLP) neural network (NN) classifies the aligned HOG feature vector. Because the MLP NN is trained with various samples of each object, a high success rate is obtained even under noise, occlusion and deformation of the object's shape. The number of outputs of the MLP NN unit equals the number of shapes to be detected. If the present shape is recognized successfully, the corresponding output of the MLP NN is activated. For example, in the case of successful recognition, if the present object belongs to the third class, the third output of the MLP NN is activated while the remaining outputs are deactivated. The detected label and the location information of the object are then sent to a robot, which performs the defined operation on the detected object accordingly.
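The patent fixes only the output convention of the classifier (one output per shape class, a single active output on success); everything else in the sketch below, including the use of scikit-learn's MLPClassifier, the hidden-layer size and the training settings, is an illustrative assumption:

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def train_shape_classifier(X_train, y_train):
        # X_train: aligned, normalized HOG waveforms of the reference shapes (one per row);
        # y_train: integer class labels. Architecture and solver settings are illustrative.
        clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
        clf.fit(X_train, y_train)
        return clf

    def classify_shape(clf, aligned_hog, n_classes):
        # Return the detected label plus a one-hot output vector in which only the output
        # corresponding to the detected class is active, as described above.
        label = int(clf.predict(aligned_hog.reshape(1, -1))[0])
        outputs = np.zeros(n_classes)
        outputs[label] = 1.0
        return label, outputs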

Claims

1. An object detection and classification system based on the shape of the object, which compares a set of features stored in memory with the object in an image taken by a camera, and consists of
- The Histogram of Oriented Gradient (HOG) unit which extracts oriented gradients from the image taken by a camera
- The average magnitude difference function (AMDF) unit which finds the degree of dissimilarity between two HOG feature vectors
- The multilayer perceptron (MLP) neural network (NN) classifier unit which detects and classifies the object
2. The Histogram of Oriented Gradient (HOG) unit mentioned in claim 1, wherein normalized HOG feature vectors are extracted by the HOG unit in a way that alleviates the problems
- caused by the fact that the size of the object in the captured image could be different from its original size due to the non-fixed distance between the object and the camera; or
- caused by the fact that the objects might be produced with different scale and size.
3. The average magnitude difference function (AMDF) unit mentioned in claim 1, wherein the degree of dissimilarity is calculated by the AMDF unit using
- The HOG features of the object in the current image acquired by the camera, which are extracted and provided by the preceding HOG unit
- The set of pre-stored HOG feature vectors which belong to each of the reference objects
4. The average magnitude difference function (AMDF) unit mentioned in claim 1, wherein the degree of dissimilarity is calculated by the AMDF unit in a way that provides the maximum possible similarity between two HOG vectors by removing the shift in the feature vector extracted from the image of the object, caused by the orientation of the object in the image.
5. The multilayer perceptron (MLP) neural network (NN) mentioned in claim 1, wherein the multilayer perceptron (MLP) neural network (NN) classifier unit has as many outputs as the number of possible classes, and each output identifies and labels the corresponding class of the detected object.
6. The multilayer perceptron (MLP) neural network (NN) mentioned in claim 1, wherein the multilayer perceptron (MLP) neural network (NN) classifier unit sets only one of its outputs as an active output, which indicates the detected class of the object, while the rest of the outputs are deactivated.
7. The method for robust, scale and orientation invariant object detection and classification which operates using the steps of:
- capturing the image of the object by a camera
- extracting the HOG feature vectors from the edge information obtained by one of the available edge detection algorithms
- alleviating the effects of illumination and noise by using the edge information obtained by one of the available edge detection algorithms
- normalizing the HOG feature vectors in order to remove the problems caused by the fact that the objects might be produced with different scales and sizes, or that the distance between the camera and the object is not fixed
- finding the amount of shift caused by the orientation of the object by calculating, in the AMDF unit, the degree of similarity between two HOG feature vectors: one HOG feature vector obtained from the image acquired by the camera, and one HOG feature vector stored in the memory available on the system for each of the reference objects
- aligning the HOG feature vector by removing the present shift and sending the aligned HOG feature vector to the multilayer perceptron (MLP) neural network (NN) classifier unit
- setting only the corresponding output of the multilayer perceptron (MLP) neural network (NN) classifier unit as an active output in order to indicate the detected object
- sending the label of the detected object and the location information to a robot which performs the pre-defined operation on the detected object accordingly.
PCT/TR2014/000318 2013-08-27 2014-08-27 A tool and method for robust, scale and orientation invariant object detection and classification Ceased WO2015030689A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
TR2013/10136 2013-08-27
TR201310136 2013-08-27

Publications (2)

Publication Number Publication Date
WO2015030689A2 true WO2015030689A2 (en) 2015-03-05
WO2015030689A3 WO2015030689A3 (en) 2015-04-23

Family

ID=52146638

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/TR2014/000318 Ceased WO2015030689A2 (en) 2013-08-27 2014-08-27 A tool and method for robust, scale and orientation invariant object detection and classification

Country Status (1)

Country Link
WO (1) WO2015030689A2 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017161167A1 (en) * 2016-03-18 2017-09-21 President And Fellows Of Harvard College Automatically classifying animal behavior
CN108647665A (en) * 2018-05-18 2018-10-12 西安电子科技大学 Vehicle real-time detection method of taking photo by plane based on deep learning
CN108960310A (en) * 2018-06-25 2018-12-07 北京普惠三农科技有限公司 A kind of agricultural pest recognition methods based on artificial intelligence
US11020025B2 (en) 2015-10-14 2021-06-01 President And Fellows Of Harvard College Automatically classifying animal behavior
US11263444B2 (en) 2012-05-10 2022-03-01 President And Fellows Of Harvard College System and method for automatically discovering, characterizing, classifying and semi-automatically labeling animal behavior and quantitative phenotyping of behaviors in animals

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5786495B2 (en) * 2011-06-30 2015-09-30 富士通株式会社 Image recognition apparatus, image recognition method, and computer program for image recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
None

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11263444B2 (en) 2012-05-10 2022-03-01 President And Fellows Of Harvard College System and method for automatically discovering, characterizing, classifying and semi-automatically labeling animal behavior and quantitative phenotyping of behaviors in animals
US11020025B2 (en) 2015-10-14 2021-06-01 President And Fellows Of Harvard College Automatically classifying animal behavior
US11622702B2 (en) 2015-10-14 2023-04-11 President And Fellows Of Harvard College Automatically classifying animal behavior
US11944429B2 (en) 2015-10-14 2024-04-02 President And Fellows Of Harvard College Automatically classifying animal behavior
WO2017161167A1 (en) * 2016-03-18 2017-09-21 President And Fellows Of Harvard College Automatically classifying animal behavior
US10909691B2 (en) 2016-03-18 2021-02-02 President And Fellows Of Harvard College Automatically classifying animal behavior
US11669976B2 (en) 2016-03-18 2023-06-06 President And Fellows Of Harvard College Automatically classifying animal behavior
CN108647665A (en) * 2018-05-18 2018-10-12 西安电子科技大学 Vehicle real-time detection method of taking photo by plane based on deep learning
CN108960310A (en) * 2018-06-25 2018-12-07 北京普惠三农科技有限公司 A kind of agricultural pest recognition methods based on artificial intelligence

Also Published As

Publication number Publication date
WO2015030689A3 (en) 2015-04-23

Similar Documents

Publication Publication Date Title
CN108235770B (en) Image recognition method and cloud system
CN111428731B (en) Multi-category identification positioning method, device and equipment based on machine vision
CN103824091B (en) A kind of licence plate recognition method for intelligent transportation system
Nalepa et al. Fast and accurate hand shape classification
Mccullagh Face detection by using haar cascade classifier
WO2015165365A1 (en) Facial recognition method and system
US10311287B2 (en) Face recognition system and method
US8090151B2 (en) Face feature point detection apparatus and method of the same
WO2015030689A2 (en) A tool and method for robust, scale and orientation invariant object detection and classification
Muralidharan et al. Combining local and global feature for object recognition using SVM-KNN
KR20170108339A (en) Method for recognizing plural object in image
CN109558773B (en) Information identification method and device and electronic equipment
Khurana et al. Static hand gestures recognition system using shape based features
CN109426769A (en) The iris identification method and iris authentication system of face auxiliary
Hamouz et al. Affine-invariant face detection and localization using gmm-based feature detector and enhanced appearance model
Shitole et al. Recognition of handwritten devanagari characters using linear discriminant analysis
Kumari et al. A machine learning algorithm for automatic number plate recognition
Shariat et al. A new adaptive segmental matching measure for human activity recognition
Bousmaha et al. YOLOv7-face: A real-time face detector
JP5171362B2 (en) Strong classifier learning apparatus and method, and strong classifier learning program
Ahilapriyadharshini et al. Weber local descriptor based object recognition
Zainuddin et al. The Performance of Face Recognition Using the Combination of Viola-Jones, Local Binary Pattern Histogram and Euclidean Distance
Vijayarani et al. Facial feature extraction based on FPD and GLCM algorithms
CN105574494A (en) Multi-classifier posture identification method and method
Singh et al. Detecting face region in binary image

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14819095

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 02/06/2016)

122 Ep: pct application non-entry in european phase

Ref document number: 14819095

Country of ref document: EP

Kind code of ref document: A2