WO2011136276A1 - Method and apparatus for creating an image database for three-dimensional object recognition - Google Patents
Method and apparatus for creating an image database for three-dimensional object recognition
- Publication number
- WO2011136276A1 (PCT/JP2011/060277)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- subspace
- query
- recognition
- database
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
Definitions
- the present invention relates to a method and apparatus for creating a three-dimensional object recognition image database.
- A local feature is a vector description of a local region of an image.
- Hundreds to thousands of local features are usually obtained from various parts of a single image. Therefore, even when only a part of the object appears in the query image, or when part of the object is hidden, the 3D object corresponding to the visible part can still be recognized. Methods based on local features are also robust to transformations such as similarity transformations and rotations that arise when shooting conditions differ.
- The simplest way to recognize a three-dimensional object using local features is to photograph the various objects in advance and register the local features obtained from those images in a database. An object is then recognized by comparing each local feature obtained from the query image with the local features registered in the database. For the recognition to be accurate, the query image must be recognizable from any viewpoint; it is therefore better to photograph each object from many viewpoints, extract local features from those images, and register them all. However, if all of the local features are stored as they are, an enormous memory capacity is required.
- One known approach reduces the memory capacity by selecting local features and thereby reducing the number registered in the database (for example, Non-Patent Document 5). However, if local features are discarded indiscriminately, the recognition rate may decrease.
- The inventors therefore focused on local regions that are obtained fairly consistently from a set of continuously changing images, and newly considered using only the local features obtained from such regions. If only these local features are registered in the database, the memory capacity should be reducible without a significant loss of recognition accuracy. Furthermore, to increase the memory-saving effect, the inventors considered registering the local features obtained from the same part of the object together as a group.
- The CLAFIC method (see, for example, Kenichiro Ishii, Eisaku Maeda, Noriyoshi Ueda, Hiroshi Murase, "Easy-to-understand Pattern Recognition", Ohmsha, August 1998, pp. 147-151, or E. Oja) can be used as the subspace method for this purpose.
- The present invention has been made in view of the above circumstances. It provides a technique that reduces the memory capacity by selecting the local features to be registered in the image database for 3D object recognition, while suppressing the accompanying drop in recognition rate to a level comparable to that of the unreduced database.
- The present invention provides a method comprising: an extraction step of, when a plurality of images of a three-dimensional object viewed from different viewpoints is input, extracting local features of each image and representing each as a feature vector; a subspace generation step of generating a plurality of sets of feature vectors, each set representing local features of the same part of the object viewed from a series of adjacent viewpoints, and generating a plurality of subspaces representing the characteristics of each set by a subspace method; and a registration step of registering each subspace in a three-dimensional object recognition database in association with an identifier of the object.
- The present invention also provides an apparatus comprising: an extraction unit that, when a plurality of images of a three-dimensional object viewed from different viewpoints is input, extracts local features of each image and represents them as feature vectors; a subspace generation unit that generates a plurality of sets of feature vectors, each set representing local features of the same part of the object viewed from a series of adjacent viewpoints, and generates subspaces representing the characteristics of each set by a subspace method; and a registration unit that registers each subspace in a three-dimensional object recognition database in association with an identifier of the object. The database is accessed by a three-dimensional object recognition device. When an image of an object viewed from some viewpoint is given as a search query, the recognition device extracts a plurality of feature vectors, each representing a local feature of the search query, as query feature vectors, determines the subspace most similar to each query feature vector, performs an aggregation process over the object IDs associated with those subspaces for all query feature vectors, and obtains the object most similar to the search query. This is the apparatus for creating an image database for three-dimensional object recognition.
- Because the method for creating an image database for 3D object recognition comprises the step of generating a plurality of sets of feature vectors, each set representing local features of the same part of the object viewed from a series of adjacent viewpoints, the step of generating a plurality of subspaces representing the characteristics of each set by a subspace method, and the registration step of registering each subspace in a three-dimensional object recognition database in association with the identifier of the object, feature vectors representing the same part of the object under successive viewpoint changes are converted into an approximating subspace, and the subspace, rather than the individual feature vectors, is registered in the database. As a result, the drop in recognition rate that is the price of reducing the memory capacity can be kept close to the level achieved when every feature vector is registered as it is. According to the effect-verification experiments described later, the memory capacity can be reduced to about 1/18 of the unreduced state.
- Each step of database creation is executed by a computer. The computer that creates the database and the computer that executes the recognition process may be the same computer or different computers. The computer here may be a single machine, but the term also covers configurations in which a plurality of devices together realize the function of one computer, as in so-called cloud computing.
- Unlike a two-dimensional (planar) object, a three-dimensional object is one in which parts become visible or hidden as the viewpoint (shooting angle) changes.
- the subspace can be generated by applying a known subspace method.
- a characteristic aspect of the present invention is that feature vectors of the same part of the same object viewed from a series of viewpoints are combined into one by the subspace method, thereby saving the memory capacity of the database.
- In a preferred aspect, viewpoint data representing the series of viewpoints is attached to each subspace, which summarizes a set of feature vectors.
- a specific aspect of the viewpoint data is a shooting angle.
- the shooting angle corresponds to the rotation angle of the turntable when a three-dimensional object is rotated once on the turntable and the object is shot from different directions.
- The registration step may register a combination of the identifier of the object and viewpoint data representing the series of viewpoints in association with each subspace, and the recognition process may then perform the aggregation over the combinations associated with the subspaces to obtain both the object most similar to the search query and the most similar viewpoint.
- In this way, because the aggregation over (object ID, viewpoint data) combinations is performed for each query feature vector, the drop in recognition rate that is the price of reducing memory capacity can be kept comparable to registering every feature vector as it is.
- According to the effect-verification experiments described later, the memory capacity can be reduced to about 1/18 of the unreduced state. Moreover, not only the type of object in the query image but also the approximate viewpoint of the search query can be estimated.
- The subspace generation step may calculate the distance between each pair of feature vectors corresponding to adjacent viewpoints, exclude as noise those correspondences whose distances differ from the other candidate pairs by more than a predetermined criterion, and treat the remaining feature vectors as a set representing the same part. In this way, sets of feature vectors representing the same part can be obtained stably from the distance calculation between feature vectors of adjacent viewpoints, with noise excluded.
- The subspace generation step may be configured to generate a subspace when the feature vectors of a set cover a range wider than a predetermined amount of viewpoint change, and not to generate a subspace when they do not.
- the partial space generation step may exclude, as noise, a case where the difference between the pair with the closest distance and the pair with the second closest distance exceeds the reference.
- In the recognition process, each query feature vector may be projected onto the bases that define the coordinate system of each subspace registered in the database, the similarity between the query feature vector and each subspace may be calculated from the magnitude of the projection components, and the subspace with the highest similarity may be determined to be the subspace most similar to that query feature vector. In this way, the similarity can be calculated from the magnitude of the components projected onto the bases of each subspace.
- Alternatively, the recognition process may determine, for each subspace, the basis of the first principal component associated with the largest eigenvalue of the set of feature vectors whose characteristics that subspace represents, place a point at the same normalized distance from the origin on each first-principal-component basis, obtain the distance between each such point and each query feature vector by applying an approximate nearest neighbor search method, and determine the subspace at the shortest distance to be the subspace most similar to that query feature vector.
- In this way, the similarity calculation is handled as a distance calculation, and by applying approximate nearest neighbor search to that distance calculation, the processing time can be shortened compared with projecting onto every basis.
- After the candidates have been narrowed down in this way, each query feature vector may be projected onto the bases of each candidate subspace, the similarity between the query feature vector and each candidate may be calculated from the magnitude of the projection components, and the subspace with the highest similarity may be determined to be the most similar one. In this way, a higher recognition rate can be expected than when only the first principal component of each subspace is used, while the processing time is still reduced because the targets (candidates) of the similarity calculation have been narrowed down.
- Instead of determining the subspace most similar to each query feature vector, the recognition process may proceed as follows: (1) when a plurality of images viewed from a series of viewpoints is given as a search query, a plurality of sets of query feature vectors is generated, each set representing local features of the same part of the object viewed from a series of adjacent viewpoints, and a plurality of query subspaces representing the characteristics of each set is generated by a subspace method; (2) a query reference point is placed at a predetermined normalized distance from the origin on each basis defining the coordinate system of each query subspace, and a reference point is placed at the same predetermined normalized distance from the origin on each basis of each subspace registered in the database; and (3) the subspace having the reference point at the shortest distance from each query reference point is determined using an approximate nearest neighbor search method, thereby determining the subspace most similar to each query subspace.
- the viewpoint data may be data of a shooting angle of the object.
- In this aspect, the shooting angle is used as the viewpoint data, and a subspace is generated on the condition that feature vectors of the same part are contained in the set continuously over a sufficient range of shooting-angle change.
- the CLAFIC method may be applied to collect the feature vectors as a set to generate a subspace.
- the number of dimensions of the subspace may be 1 or more and 3 or less.
- The SIFT feature used in the embodiments is a 128-dimensional vector, but the number of dimensions can be reduced by converting it into a subspace. How far the dimensionality can be reduced must be determined by balancing the recognition rate against the memory capacity.
- In the effect-verification experiments described later, in the aspect that attaches viewpoint data, a recognition rate of 98.9% relative to the unreduced state was maintained when the subspace was three-dimensional, and increasing the dimensionality further did not yield a recognition rate above 98.9%. Even when the dimensionality was reduced to one, a recognition rate of 98.0% relative to the unreduced state was maintained (see FIG. 6 and Table 1). Without viewpoint data, the maximum recognition rate of 98.3% was obtained at three dimensions, and 95.8% was still maintained at one dimension.
- the various preferable aspects shown here can also be combined.
- Features that can be obtained from an image are broadly divided into two types: global features and local features.
- The former uses the pixel values of an image as a feature for the entire image and can be obtained relatively easily. However, when the pixel values of the database image and the query image differ greatly, for example because only part of the object appears in the query image or part of it is hidden, the feature value also fluctuates greatly, making recognition difficult.
- The latter, the local feature, is obtained by extracting a characteristic local region from an image and describing that region with a vector; because it is described as a vector, it is also called a feature vector. Since hundreds to thousands of local features can be obtained from various parts of a single image, they are known to be effective in dealing with the problems of global features.
- SIFT features (see, for example, D. Lowe: "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 60, 2, pp. 91-110 (2004)) are one such local feature.
- For a basic method of three-dimensional object recognition using local features, see Non-Patent Document 4.
- SIFT: Scale-Invariant Feature Transform
- FIG. 1 is an explanatory diagram showing a state where SIFT feature values, which are conventional local feature values, are extracted from image data viewed from two different viewpoints for a certain object.
- FIGS. 1A and 1B are images of the same object taken at different shooting angles. What is represented by a white arrow in the figure is the SIFT feature.
- FIG. 2 is an explanatory diagram showing a state in which SIFT feature values, which are conventional local feature values, are extracted from image data viewed from two different viewpoints for an object different from FIG.
- 3D object recognition method: A 3D object recognition method is briefly described here. First, registration in the database. Assume that the user has prepared images of the object to be registered, photographed from various viewpoints (shooting angles). As the registration process, the computer extracts SIFT features from each image of the object to be registered, and the extracted SIFT features are registered in the database together with an identifier unique to the object (the object ID). The object ID identifies each object registered in the database.
- Next, the recognition process is described. Assume that an object serving as a search query is photographed from some viewpoint and the photographed image is given.
- The computer searches the database for the same object as the given search query image and, if the same object is registered, identifies its object ID.
- The procedure of the recognition process is as follows. As in the registration process described above, SIFT features are extracted from the query image. The distance between each extracted SIFT feature and each SIFT feature registered in the database is calculated, and the SIFT feature at the smallest distance from the query's SIFT feature, i.e., its nearest neighbor, is found. A vote is then cast for the object ID of the object to which that nearest SIFT feature belongs; the other SIFT features of the query image likewise each vote for some object ID. The object ID that receives the most votes as a result of this voting process becomes the recognition result for the search query.
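The conventional baseline just described can be summarized in a short sketch. The following stores every SIFT vector with its object ID and recognizes by nearest-neighbor voting; the function names, array shapes, and use of NumPy are illustrative assumptions, not part of the patent.

```python
# Minimal sketch of the conventional baseline (not the invention itself):
# every SIFT vector of every registration image is stored with its object ID,
# and recognition is a nearest-neighbour vote over the query's SIFT vectors.
import numpy as np

def register(db_vectors, db_ids, feats, object_id):
    """Append all local features of one registration image to the database."""
    db_vectors.append(np.asarray(feats, dtype=float))   # feats: (n, 128) SIFT vectors
    db_ids.append(np.full(len(feats), object_id))
    return db_vectors, db_ids

def recognize(db_vectors, db_ids, query_feats):
    """Vote the object ID of the nearest stored vector for each query vector."""
    X = np.vstack(db_vectors)                  # (N, 128) all stored vectors
    ids = np.concatenate(db_ids)               # (N,) object ID of each stored vector
    votes = {}
    for q in query_feats:                      # q: (128,) one query SIFT vector
        d = np.linalg.norm(X - q, axis=1)      # distance to every stored vector
        winner = int(ids[np.argmin(d)])        # object ID of the nearest vector
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)           # most-voted object ID
```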
- CLAFIC method and classification of local features: the CLAFIC method (CLAss-Featuring Information Compression) is a kind of subspace method proposed by Watanabe in 1969. It classifies patterns using subspaces created by a technique that finds, for each class, the subspace that best approximates the distribution of that class's vectors (the Karhunen-Loeve expansion). In the present invention, the local features of the same part, extracted from a plurality of images of an object taken at different shooting angles, are grouped into the same class using the CLAFIC method; this amounts to clustering of local features.
- In general, to create a subspace by the CLAFIC method, a large number of sample vectors x is first prepared for each class, and their autocorrelation matrix Q is computed as Q = (1/n) Σ_{i=1}^{n} x_i x_i^T.
- The eigenvectors are then obtained by solving the eigenvalue problem Q u = λ u. If there are n original sample vectors x, generally n eigenvectors are obtained, and a space spanned by some of these eigenvectors is called a subspace. If the desired subspace has k dimensions, the space whose basis (coordinate system) consists of the eigenvectors u_1, ..., u_k corresponding to the k largest eigenvalues is the subspace to be obtained.
- The component of a sample vector x with respect to the basis u_i corresponds to the length (projection component) obtained when x is orthogonally projected onto u_i.
- The basis u_1 is the basis corresponding to the largest eigenvalue of the whole class and is referred to in this specification as the first principal component.
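A minimal sketch of the CLAFIC computation described above, assuming NumPy and an (n, d) array of sample vectors per class; the function name is illustrative.

```python
# From the n sample vectors of one class (one trajectory), build the
# autocorrelation matrix Q and keep the eigenvectors of the k largest
# eigenvalues as the subspace basis u_1 ... u_k.
import numpy as np

def clafic_subspace(samples, k):
    """samples: (n, d) feature vectors of one class; returns a (k, d) basis."""
    X = np.asarray(samples, dtype=float)
    Q = X.T @ X / len(X)                   # Q = (1/n) sum_i x_i x_i^T
    eigvals, eigvecs = np.linalg.eigh(Q)   # symmetric eigendecomposition, ascending order
    order = np.argsort(eigvals)[::-1]      # sort eigenvalues in descending order
    basis = eigvecs[:, order[:k]].T        # rows are u_1 ... u_k; u_1 = first principal component
    return basis
```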
- The processing for finding the local features obtained from the same part of an object across a plurality of images whose shooting angles change continuously is as follows. First, for two images whose shooting angles differ only slightly, the distance is computed between every local feature of the first frame and every local feature of the second frame, and mutually nearest local features are associated with each other, linking the local features of the first and second frame images. At this point, a pair of nearest-neighbor local features may in fact come from unrelated parts because of noise or other influences. Therefore, letting d1 be the distance to the nearest local feature, d2 the distance to the second-nearest local feature, and θ a threshold, a correspondence is accepted only when d1/d2 ≤ θ; otherwise it is discarded as unreliable.
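The correspondence (trace) step between two adjacent frames can be sketched as follows, assuming NumPy arrays of SIFT vectors and at least two features per frame; the names and the default θ = 0.6 (the value used in Example 1) are illustrative.

```python
# Match features of two adjacent frames by nearest neighbour, keeping a match
# only when the nearest distance d1 is sufficiently smaller than the
# second-nearest distance d2 (ratio test d1/d2 <= theta).
import numpy as np

def match_adjacent(frame_a, frame_b, theta=0.6):
    """frame_a: (na, d), frame_b: (nb, d) with nb >= 2; returns (i, j) index pairs."""
    pairs = []
    for i, f in enumerate(frame_a):
        d = np.linalg.norm(frame_b - f, axis=1)
        j1, j2 = np.argsort(d)[:2]          # nearest and second-nearest in frame_b
        if d[j1] / d[j2] <= theta:          # ratio test; otherwise treat as noise
            pairs.append((i, int(j1)))
    return pairs
```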
- FIG. 3 is an explanatory view schematically showing the state of the trace processing according to the present invention.
- Rectangular frames such as “first frame” and “second frame” indicate images obtained by photographing an object from different viewpoints.
- a, b, and c represent trajectories in which feature vectors representing the same part of the object appear successively in images at different shooting angles (viewpoints).
- For example, trajectory a represents a case in which the feature vector α1 belonging to the first frame corresponds to the feature vector α2 belonging to the second frame, but α2 does not correspond to any feature vector of the third frame.
- the feature vector set b represents a state in which feature vectors continuously correspond to the viewpoints from the first frame to the fifth frame.
- the feature vector set c represents a state in which feature vectors continuously correspond to the viewpoints from the third frame to the fifth frame.
- The computer combines each set of local features obtained by the trace processing into one by creating a subspace for that set, and registers the subspace together with the object ID. If subspaces were also created from sets with short trajectories, such as trajectory a in FIG. 3, and registered in the database, the number of registered subspaces would grow and little memory would be saved. A threshold is therefore placed on the trajectory length, and subspaces are created and registered only from sets of local features that remain associated over a sufficient number of consecutive frames. This is because a local feature with a short trajectory covers a smaller identifiable range of shooting angles than one with a long trajectory.
- the subspace created by the present invention is specifically composed of k direction vectors that define a k-dimensional base (coordinate system).
- Each trajectory corresponds to the class described in the description of the CLAFIC method.
- the sample vector x for each class corresponds to a local feature amount belonging to each trajectory.
- the autocorrelation matrix of local features is obtained by the CLAFIC method, and a subspace with the eigenvector as the basis is set. It is only necessary to register k eigenvectors constituting this partial space in the database.
- the registered eigenvector approximately represents each local feature amount belonging to the subspace.
- the dimension number k of the subspace takes any dimension number between 1 and 128.
- In the registration process, what is registered in the database together with the object ID is not the local features themselves but the subspace that summarizes a set of local features.
- In addition, the range of shooting angles covered by the subspace, i.e., from which viewpoint to which viewpoint the change of the local feature is expressed, is also registered.
- The registration process thus registers each subspace in association with the object ID and the shooting-angle range.
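One possible layout of a database record as described above is sketched below; the dataclass fields are an assumption, chosen to hold the object ID, the shooting-angle range, and the k basis vectors of the subspace.

```python
# Sketch of one database record: the k basis vectors of a subspace together
# with the object ID and the shooting-angle range covered by its trajectory.
from dataclasses import dataclass
import numpy as np

@dataclass
class SubspaceRecord:
    object_id: int
    angle_range: tuple      # e.g. (20, 40): shooting angles covered by the trajectory
    basis: np.ndarray       # (k, 128) orthonormal basis u_1 ... u_k; u_1 = first principal component
```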
- Recognition processing: when a search query image is given, the computer extracts local features from it, projects each local feature onto every subspace in the database, and calculates the similarity between the local feature and each subspace. The subspace with the highest similarity is found, and a vote is cast for the object ID that owns that subspace. This is done for all local features, and finally the object ID with the most votes is output as the recognition result. Because the number of subspaces differs from object to object, the vote counts are normalized by the number of subspaces.
- Without normalization, object IDs with many subspaces would attract many votes and could cause misrecognition. If N_ω is the number of subspaces that can identify shooting angle ω of an object and G_ω is the number of votes obtained for shooting angle ω, the normalized score is S_ω = G_ω / N_ω.
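A sketch of this recognition and normalized-voting step, reusing the `SubspaceRecord` layout assumed earlier; treating each exact (object ID, angle range) pair as the unit for N_ω is a simplification of the per-angle count described in the text.

```python
# Each query feature votes for the (object ID, angle range) of its most
# similar subspace; vote counts G_w are normalised by the number of
# subspaces N_w carrying that (object ID, angle range) pair.
import numpy as np

def similarity(q, basis):
    """Squared projection norm of the normalised query onto the subspace."""
    q = q / np.linalg.norm(q)
    return float(np.sum((basis @ q) ** 2))   # sum of squared projection components

def recognize_with_subspaces(records, query_feats):
    votes, counts = {}, {}
    for r in records:                                        # N_w per (object, angle range)
        key = (r.object_id, r.angle_range)
        counts[key] = counts.get(key, 0) + 1
    for q in query_feats:
        best = max(records, key=lambda r: similarity(q, r.basis))
        key = (best.object_id, best.angle_range)
        votes[key] = votes.get(key, 0) + 1                   # G_w
    scores = {k: votes[k] / counts[k] for k in votes}        # S_w = G_w / N_w
    return max(scores, key=scores.get)                       # winning (object ID, angle range)
```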
- FIG. 4 is an explanatory diagram showing the voting process according to the present invention. Suppose, for example, that the subspaces with the highest similarity to each of five local features obtained from the query image are found; five subspaces, each carrying an object ID and a shooting-angle range, are obtained as shown on the left of FIG. 4. If, according to one aspect of the present invention, votes are cast simply per object ID, the result is as shown in FIG. 4(a). However, even if two local features correspond to subspaces with the same object ID, they can be said to show different things if their viewpoint ranges are completely different. Therefore, in another aspect of the present invention, voting takes the shooting angle of the object into account, as shown in FIG. 4(b).
- FIG. 4 shows an example in which five subspaces are retrieved for the search query.
- FIG. 4(a) shows the voting result when votes are cast only for the object ID. In this result, object 1 and object 2 obtain the same, highest number of votes, so it cannot be decided which should be the recognition result.
- FIG. 4(b) shows the voting result when votes are cast for combinations of object ID and shooting angle. In this result, the most votes are obtained for object 1 in the shooting-angle range of 20 to 40 degrees, so object 1 can be taken as the recognition result. Furthermore, the range of 20 to 40 degrees can be estimated as the approximate shooting angle of the query image.
- FIG. 13 is an explanatory diagram showing the flow of processing of 3D object recognition according to the present invention.
- the three-dimensional object recognition process is roughly divided into a registration process and a recognition process.
- the registration process is a process of creating a database of three-dimensional object images used for recognition.
- the recognition process is a process in which when an image of a three-dimensional object is given as a search question, the object image indicated in the image is searched from the database and the object is specified.
- (a) shows the flow of registration processing
- (b) shows the flow of recognition processing.
- In the registration process, a plurality of images obtained by photographing the same object from different viewpoints is input.
- the computer that performs the registration process extracts a local feature amount from each image (step S1).
- the local feature amount of the same location continuously included in the images of different viewpoints is obtained using distance calculation, and a partial space is created in which these local feature amounts are combined into one (step S3).
- the identifier of the object and the range of the shooting angle of the viewpoint are added to the generated partial space and registered in the database (step S5).
- the recognition process is performed on the assumption that multiple data are registered in the database.
- an image of a search question is given as an input.
- the search question is an image obtained by photographing a certain object from a certain viewpoint.
- The computer performing the recognition process extracts local features from the search query (step S11). For each local feature extracted from the search query, the most similar subspace is retrieved from the database, and a majority decision over the object IDs attached to the retrieved subspaces determines one object ID (step S13). The object identified by that object ID is then output so that an operator can recognize it (step S15).
- The database creation device can be regarded as a computer that carries out the processes related to registration.
- Each process is realized by common hardware centered on the computer, but because the program differs for each process, the function realized by the computer also differs. This corresponds to combining a plurality of parts with different functions into one apparatus as a whole; the invention can therefore be understood as an apparatus that combines parts realizing the functions of the respective processes.
- That is, the present invention can be understood as a database creation apparatus comprising an extraction unit responsible for the local feature extraction process, a subspace generation unit responsible for the subspace creation process, and a registration unit responsible for the database registration process.
- Similarly, the recognition side can be understood as a device comprising a part responsible for extracting local features from the search query, a part responsible for retrieving the subspace most similar to each extracted local feature and determining one object ID by majority processing over the object IDs attached to those subspaces, and a part responsible for outputting the determined object ID.
- Reduction of processing time: In the method described above, a subspace is created by KL expansion from the set of features obtained from the same local region over multiple frames, so that a plurality of local features is represented together. In this way the number of features registered in the database, and hence the memory capacity, can be greatly reduced.
- ANN is also applied to the mutual subspace method (for details of the mutual subspace method, see Maeda and Watanabe: "Pattern matching method with local structure introduced", IEICE Transactions (D), pp. 345-352 (1985)). As a result, the recognition rate is improved and the processing is accelerated.
- In the method that calculates the similarity to the k-dimensional subspace of each class, each query feature vector (each local feature) must, at recognition time, be projected onto every basis of the subspace of every class in the database to obtain the similarity. The processing time therefore grows as more objects are registered.
- In this embodiment, a method that searches for the subspace with approximately the highest similarity is adopted to reduce the processing time.
- In recognition by projection onto k-dimensional subspaces, each query feature vector is projected onto every basis of every subspace to find the subspace with the highest similarity, which takes considerable processing time. This embodiment therefore proposes obtaining the most similar subspace by computing the distance between the local feature obtained from the query and a point on the first principal component of each subspace. If the most similar subspace can be found by a distance calculation, the search can be accelerated by any of the various approximate nearest neighbor search methods that have already been proposed.
- The procedure for obtaining the most similar subspace by distance calculation is as follows. First, a point equidistant from the origin is placed on the coordinate axis of every subspace in the database. Then the distance between each of these points and the local feature obtained from the query is calculated. The shorter this distance, the longer the projection of the local feature onto that subspace, i.e., the higher the similarity.
- ANN reduces the number of cells subjected to distance calculation by shrinking the radius of the hypersphere by a factor of 1/(1+ε).
- ANN is a method for finding points approximately nearest to a given point; it cannot by itself determine the similarity between a local feature and a subspace. To use ANN with the subspace method, the subspace with the highest similarity to a local feature must therefore be obtainable by a distance calculation. This embodiment describes a way of realizing this.
- For a one-dimensional subspace based only on the eigenvector with the largest eigenvalue to suffice, the set of original local features from which the subspace was created must be expressed sufficiently well by the first principal component alone.
- In the present invention, a set of continuously changing local features is obtained by the trace processing and the subspace is created from that set; there are therefore almost no outliers, and the set can be approximated well even by a one-dimensional subspace.
- The similarity between a local feature and a subspace is inversely related to the distance between the local feature and the basis of the subspace; the most similar subspace can therefore be found by finding the subspace closest to the local feature.
- In general, the distance between a local feature and the basis of a subspace is obtained as the length of the perpendicular dropped from the local feature onto that basis.
- By performing the distance calculation using only the first principal component of each subspace, the processing time required for recognition can be reduced compared with calculating the similarity to the full k-dimensional subspace.
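The distance trick described above can be sketched as follows: with unit vectors, ||q − u₁||² = 2 − 2 q·u₁, so minimizing the distance to the point placed at distance 1 on each first principal component maximizes the projection onto that component. Exact search is shown here; an ANN index would replace it in practice. Names reuse the earlier sketches and are assumptions.

```python
# Place a unit point on the first principal component of every subspace and
# pick the subspace whose point is closest to the normalised query feature.
import numpy as np

def build_points(records):
    """Stack the first principal component (unit length) of every subspace."""
    return np.vstack([r.basis[0] / np.linalg.norm(r.basis[0]) for r in records])

def most_similar_subspace(points, q):
    q = q / np.linalg.norm(q)
    d = np.linalg.norm(points - q, axis=1)   # exact search; ANN would approximate this
    return int(np.argmin(d))                 # index of the most similar subspace
```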
- However, a higher recognition rate can be expected when several axes of the subspace are used. As another embodiment, recognition can therefore adopt a two-stage process.
- In the first stage, only the first principal component of each subspace is used, and candidates likely to be the most similar subspace are narrowed down quickly by approximate nearest neighbor search.
- In the second stage, the query is projected in higher dimensions using several axes of each candidate subspace, and the truly most similar subspace is obtained.
- Compared with projecting every query feature onto every basis of every subspace, the processing time is reduced because the targets (candidates) of the similarity calculation have been narrowed down.
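A sketch of the two-stage procedure, reusing the earlier helpers (assumptions); the shortlist size `n_candidates` is illustrative.

```python
# Stage 1 shortlists candidate subspaces using only the first principal
# component (an ANN index can be substituted for the exact search), and
# stage 2 re-ranks the shortlist with the full k-dimensional projection.
import numpy as np

def recognize_two_stage(records, points, q, n_candidates=10):
    q = q / np.linalg.norm(q)
    d = np.linalg.norm(points - q, axis=1)            # stage 1: first principal component only
    candidates = np.argsort(d)[:n_candidates]         # shortlist (ANN in practice)
    best = max(candidates,                            # stage 2: full projection similarity
               key=lambda i: float(np.sum((records[i].basis @ q) ** 2)))
    return records[int(best)]
```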
- Use of ANN in the mutual subspace method
- This embodiment is effective when a moving image or a plurality of images taken continuously is used as a search question.
- the tracing process is also performed on a plurality of images related to the search question.
- a subspace is created from a set of local features obtained by the trace processing. By doing so, the partial space of the database can be compared with the partial space obtained from the query image.
- the subspace obtained from the search query is referred to as a query subspace in this specification.
- To compare subspaces and find the optimum one, the canonical angles between them are generally used, but that approach cannot be accelerated. Here, therefore, the optimum subspace is again obtained by a distance calculation.
- As in the ANN distance calculation already described, a point equidistant from the origin is placed on the axis of each subspace and the distances between these points are calculated. Note that because the query is itself a subspace in this embodiment, a point must also be placed on the coordinate axis of the query subspace. A more detailed description follows.
- the mutual subspace method is used when a moving image or an image of a plurality of viewpoints can be used as a search question.
- a query subspace is also created from a set of local feature values obtained from the search question, and the search question is recognized by comparing the query subspace with the subspace in the database. Since the partial space has less fluctuation than the local feature amount, a high recognition rate can be expected by using it as a search question.
- the similarity between the query subspace created from the search question and the subspace in the database can be obtained by calculating the canonical angle formed by both.
- the processing time required to calculate the canonical angle is enormous.
- Specifically, a point S_i at distance 1 from the origin is placed on each basis of every subspace registered in the database.
- A corresponding point Q is also placed on the basis of the query subspace obtained from the search query.
- The subspace whose point S_i minimizes the distance to the point Q is the subspace to be obtained.
- The point S_i with the minimum distance from the point Q can be obtained approximately using ANN.
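A sketch of this mutual-subspace variant under the same assumptions, reusing the `clafic_subspace` and `build_points` helpers from the earlier sketches.

```python
# Build a query subspace from the traced query features, take the point Q at
# distance 1 on its first principal component, and return the database
# subspace whose reference point S_i is nearest to Q (exact search stands in
# for ANN).
import numpy as np

def query_reference_point(query_trajectory, k=1):
    basis = clafic_subspace(query_trajectory, k)     # same CLAFIC step as registration
    u1 = basis[0]
    return u1 / np.linalg.norm(u1)                   # point Q at distance 1 on the query base

def nearest_database_subspace(points, q_point):
    d = np.linalg.norm(points - q_point, axis=1)     # distances |Q - S_i|
    return int(np.argmin(d))                         # most similar database subspace
```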
- Example 1: A comparison experiment was performed between the basic conventional method of object recognition using local features and the method of the present invention, which represents a plurality of local features by a subspace.
- the comparison index is the memory capacity and the recognition rate.
- FIG. 5 is an explanatory diagram showing some examples of objects used in the experiment according to the present invention.
- For the experiment, 55 three-dimensional objects were each rotated once on a turntable and a moving image was captured; registration images to be stored in the database and search query images were produced from the frame images.
- For the registration images, 360 images were created for each object by changing the shooting angle in steps of about 1 degree.
- For the query images, 10 images were created for each object at intervals of about 36 degrees.
- The registration images and the query images were different images; that is, no registration image is completely identical to a search query image.
- The threshold θ (see Equation 5 above), used in the trace processing to decide whether local features are obtained from the same part of the object, was set to 0.6.
- a partial space consisting of a set of each local feature amount was obtained, and an object ID and a corresponding range of shooting angles were added to the obtained partial space and registered in the database.
- the number of subspaces obtained from one object was about 100 to 400 each.
- FIG. 6 is a graph showing a first experimental result according to the present invention.
- the horizontal axis represents the number of dimensions of the used subspace, and the vertical axis represents the recognition rate. From the experimental results, even in the case of voting for each object ID, a recognition rate exceeding 95% was obtained, and it was 98.3% when the number of dimensions of the subspace was three-dimensional.
- When voting was performed for each shooting angle of the object, erroneous votes were dispersed and the recognition rate improved further, regardless of the subspace dimensionality.
- FIG. 7 is an explanatory diagram showing the object A used as a search question in the experiment of the present invention.
- FIG. 8 is a graph showing the result of voting only on the object ID for the object A in FIG.
- FIG. 9 is an explanatory diagram showing an example in which the object A in FIG. 7 is correctly recognized by voting on the combination of the object ID and the shooting angle.
- the question image in FIG. 7 is obtained by photographing the object A at a certain photographing angle.
- FIG. 8 shows scores obtained by voting on each object ID and normalizing the number of votes obtained. Originally, the score of object A should be the highest, but the scores of other objects B and C were higher. Therefore, when voting is performed for each object, the question image is recognized as the object C.
- FIG. 9 shows the score at each shooting angle of object A, object B, and object C when voting for a combination of the object ID and the shooting angle.
- For object C, which had the highest score when voting only on the object ID, the votes were dispersed when voting on combinations of object ID and shooting angle, resulting in a low score.
- For object A, in contrast, votes concentrated in a certain range of shooting angles. Thus, a query that was misrecognized as object C under object-ID-only voting was correctly recognized as object A by voting on combinations of object ID and shooting angle.
- FIG. 10 is an explanatory diagram showing 68 degree and 69 degree images of the object A used in the experiment of the present invention.
- (a) is an image of 68 degrees and
- (b) is an image of 69 degrees.
- FIG. 11 is an explanatory diagram showing the object D used in the experiment of the present invention.
- FIG. 12 shows scores of shooting angles of the object D and the object E when the object D shown in FIG. 11 is used as a question image.
- Ideally, the range of 132 to 144 degrees, where object D scores highest, should be obtained as the recognition result. However, when a score is high even though the votes are somewhat dispersed, as with object E, misrecognition results.
- Table 1 shows the recognition rate and memory capacity when the one-dimensional subspace with the smallest memory capacity is used (see database (1) in Table 1) and when the three-dimensional subspace with the highest recognition rate is used (see database (2) in Table 1).
- the memory capacity when a three-dimensional subspace is used in the present invention can be reduced to about 1/18 of the memory capacity of the database (3) which is not reduced.
- By creating and using subspaces, the memory capacity was reduced to about 1/7 while the recognition rate, which had fallen slightly from the unreduced state, could be recovered.
- When the one-dimensional subspace is used, the memory capacity increased slightly compared with database (5), but a higher recognition rate was obtained.
- the reason why the recognition rate improved compared to (5) is as follows.
- the local feature amount at the center of the trace is merely a representative vector of a plurality of local feature amounts associated by the trace process, and does not represent all the local feature amounts of the set.
- the subspace used in the present invention well represents the characteristics of any local feature included in the set.
- Suppose, for example, that the database contains a set A of local features of object F whose trace range is 20 to 60 degrees and a set B whose trace range is 40 to 80 degrees, and that the query image shows object F photographed at 40 degrees.
- With representative vectors, set A is correctly associated with object F because its representative is the 40-degree local feature, but set B, whose representative is the 60-degree local feature, becomes difficult to associate correctly with object F. Erroneous votes therefore occur frequently and misrecognition is likely.
- With the method of the present invention, a subspace covering the 40-degree direction is created for both set A and set B, so both can be associated correctly.
- The subspace method classifies data so that the characteristics of the members of each class are well represented. This is considered to be why the method of the present invention, which uses the subspace method, reduced erroneous voting and improved the recognition rate.
- A representative vector is a single local feature and therefore represents a "point", whereas the range of shooting angles represents an "interval". The subspace obtained by the present invention also represents an "interval", so no such mismatch arises.
- The cases in which a query that was correctly recognized by per-object voting became misrecognized by per-shooting-angle voting can be explained as follows. Because there are many erroneous votes, the votes for the correct object may be dispersed across shooting angles, or conversely some erroneous votes may happen to concentrate on a shooting angle of an incorrect object. In addition, normalizing the vote counts can lower the score of the correct object's shooting angle. For example, suppose object G has a set of local features covering 20 to 60 degrees, a set covering 60 to 100 degrees, and a set covering 60 to 110 degrees.
- As described above, local features that change continuously can be registered in the database by creating a subspace that approximates the change well, and the memory capacity is thereby reduced; it could be reduced to 1/18 of the unreduced state. Moreover, by voting for each shooting angle of each object, not only the type of object in the query image but also its approximate orientation could be estimated.
- Example 2: Experiments on the recognition of 1002 objects were conducted to verify (1) the effectiveness of using ANN and (2) the improvement in recognition rate achieved by the mutual subspace method. In the first experiment, objects were recognized by comparing local features with subspaces; in the second, objects were recognized using the mutual subspace method. 2-1. Experiment preparation: The database used in this experimental example is as follows. For each of 1002 three-dimensional objects, the object was rotated once on a turntable and moving images were captured from the front and from elevation angles of 15 degrees and 30 degrees.
- FIG. 14 is an explanatory diagram illustrating an example of an object registered in the database. A frame image was obtained from the captured video and used as a database image.
- SIFT features were extracted from these images, trace processing was performed, and subspaces were created.
- the set of local feature values for creating the subspace is a set of local feature values corresponding to 50 frames or more continuously by the trace processing.
- the created subspace was registered in the database along with the object ID.
- the number of subspaces is about 550 per object.
- the search question used in this experimental example will be described. Of the 1002 objects used in the database, 100 were selected at random, and the video was shot with the selected objects in hand. A frame image was obtained from the captured moving image and used as a search question image.
- FIG. 15 is an explanatory diagram illustrating an example of an object photographed for a search query. SIFT features were also extracted from the search query images. 2-2. Experiment on distance calculation using ANN.
- a, b, c: feature vector sets; α1, α2: feature vectors
Abstract
Description
As techniques for recognizing a three-dimensional object by computer processing, there are methods that use the geometric shape of the object (see, for example, Non-Patent Documents 1 and 2) and methods that recognize the object using planar images obtained by photographing it (see, for example, Non-Patent Documents 3 and 4). The present invention focuses on the latter, i.e., techniques that recognize a three-dimensional object using planar images.
The apparatus for creating an image database for three-dimensional object recognition according to the present invention, which substantially corresponds to the above method, provides the same operational effects.
In the method for creating an image database for three-dimensional object recognition according to the present invention, the registration step may register a combination of the identifier of the object and viewpoint data representing the series of viewpoints in association with each subspace, and the recognition process may be a step of performing aggregation over the combinations associated with the subspaces to obtain the object most similar to the search query and the most similar viewpoint.
In this way, because the aggregation over combinations of object ID and viewpoint data is performed for each query feature vector, the drop in recognition rate that is the price of reducing the memory capacity can be suppressed to a level comparable to registering each feature vector as it is.
According to the effect-verification experiments described later, a recognition rate of 98.9% relative to the unreduced state, in which each feature vector is registered as it is, was maintained, while the memory capacity was reduced to about 1/18 of the unreduced state.
Moreover, in this way, not only can the type of object in the query image be recognized, but the approximate viewpoint of the search query can also be estimated.
The subspace generation step may exclude as noise those pairs for which the difference between the closest-distance pair and the second-closest-distance pair exceeds the criterion.
In this way, a higher recognition rate can be expected than when only the first principal component of each subspace is used. At the same time, compared with projecting each feature vector onto every basis of every subspace to obtain the similarity, the targets (candidates) of the similarity calculation are narrowed down, so the processing time can be reduced.
The subspace generation step may apply the CLAFIC method to collect feature vectors into sets and generate the subspaces.
The various preferred aspects described here can also be combined.
≪Description of Related Art≫
To facilitate a better understanding of the invention, the techniques on which it is based are first described briefly.
Features obtainable from an image are broadly divided into global features and local features. The former treats an entire image as a feature, using its pixel values and the like, and can be obtained relatively easily. However, when the pixel values of the database image and the query image differ completely, for example because only part of the object appears in the query image or part of it is hidden, the feature value also fluctuates greatly and recognition of the object becomes difficult. The latter, the local feature, extracts a characteristic local region from an image and describes that region with a vector; since it is described as a vector, it is also called a feature vector. Because several hundred to several thousand local features can be obtained from various parts of a single image, they are known to be an effective way of dealing with the problems of global features.
The SIFT feature proposed by Lowe et al. obtains local regions using intensity gradients in the image and represents their coordinates, orientation, scale, and so on as a feature vector. A SIFT feature is a 128-dimensional vector and is robust to transformations such as similarity transformations and rotations.
FIG. 1 is an explanatory diagram showing SIFT features, which are conventional local features, extracted from image data of a certain object viewed from two different viewpoints. FIGS. 1(a) and 1(b) are images of the same object taken at different shooting angles; the white arrows in the figures indicate SIFT features. FIG. 2 is an explanatory diagram showing SIFT features extracted from image data of an object different from that of FIG. 1, viewed from two different viewpoints.
A 3D object recognition method is briefly described. First, registration in the database: assume that the user has prepared images of the object to be registered, taken from various viewpoints (shooting angles). As the registration process, the computer extracts SIFT features from each image of the object to be registered and registers the extracted SIFT features in the database together with an identifier unique to the object (the object ID). The object ID identifies each object registered in the database.
The CLAFIC method (CLAss-Featuring Information Compression) is a kind of subspace method proposed by Watanabe in 1969; it performs classification using subspaces created for each class by the KL expansion (short for Karhunen-Loeve expansion, a technique for finding the subspace that best approximates the distribution of vectors). In the present invention, the local features of the same part extracted from a plurality of images of an object taken at different shooting angles are grouped into the same class using the CLAFIC method; this is clustering of local features.
In general, to create a subspace by the CLAFIC method, a large number of sample vectors x is first prepared for each class and their autocorrelation matrix Q is computed as Q = (1/n) Σ_{i=1}^{n} x_i x_i^T.
As described in connection with the recognition of three-dimensional objects, the conventional object-recognition approach using SIFT features registers in the database, as they are, the local features (feature vectors) obtained from images taken from various viewpoints (shooting angles) in order to obtain highly accurate recognition results. The memory capacity therefore becomes enormous. In the present invention, the local features of the same part obtained from the image data of a series of viewpoints are combined into one using a subspace, and the combined result is registered in the database, thereby reducing the memory capacity. In addition, by associating shooting-angle information with each subspace, the approximate shooting angle of the search query is obtained together with the object recognition result.
As a solution to the problem of the enormous memory required to store local features, consider reducing the number of local features registered in the database. However, if local features are removed at random, the recognition rate may drop substantially. The invention therefore focuses on local regions that are obtained with a certain consistency from a plurality of images of the same object whose shooting angles change continuously, and uses only the local features obtained from such regions for object recognition. The reason is that if a local feature is not obtained with some consistency as the shooting angle changes, it is likely to be noise caused by the change of shooting angle. In a plurality of images whose shooting angles change continuously, the local features obtained from the same part of the object change little by little. By creating a subspace that can describe such changes approximately and accurately, a plurality of local features is represented together as one.
FIG. 3 is an explanatory diagram schematically showing the trace processing according to the present invention. Rectangular frames such as "first frame" and "second frame" indicate images of the object photographed from different viewpoints. The symbols a, b, and c denote trajectories in which feature vectors representing the same part of the object appear in succession in images at different shooting angles (viewpoints). For example, trajectory a indicates that the feature vector α1 belonging to the first frame corresponds to the feature vector α2 belonging to the second frame, but α2 does not correspond to any feature vector of the third frame. Trajectory b shows feature vectors corresponding continuously over the viewpoints from the first to the fifth frame, and trajectory c shows feature vectors corresponding continuously from the third to the fifth frame.
In the registration process according to the present invention, the computer combines each set of local features obtained by the trace processing into one by creating a subspace for that set, and registers the subspace together with the object ID. If subspaces were also created from sets with short trajectories, such as trajectory a in FIG. 3, and registered in the database, the number of registered subspaces would grow and little memory would be saved. A threshold is therefore placed on the trajectory length, and subspaces are created and registered only from sets of local features that remain associated continuously beyond a certain extent. This is because local features with short trajectories cover a smaller identifiable range of shooting angles than those with long trajectories.
In the recognition process according to the present invention, when a search query image is given, the computer extracts local features from it, projects each local feature onto every subspace in the database, and calculates the similarity between the local feature and each subspace. It then finds the subspace with the highest similarity and casts a vote for the object ID that owns that subspace. This is done for all local features, and finally the object ID with the most votes is output as the object recognition result. Because the number of subspaces differs from object to object, the vote counts are normalized by the number of subspaces; without normalization, object IDs with many subspaces would attract many votes and could cause misrecognition. If N_ω is the number of subspaces that can identify a shooting angle ω of an object and G_ω is the number of votes for shooting angle ω, then the normalized score is S_ω = G_ω / N_ω.
FIG. 13 is an explanatory diagram showing the flow of the three-dimensional object recognition processing according to the present invention. As shown in FIG. 13, the processing is broadly divided into a registration process and a recognition process. The registration process creates the database of three-dimensional object images used for recognition. The recognition process, when an image of a three-dimensional object is given as a search query, searches the database for the image of the object shown in that image and identifies the object.
The registration process takes as input a plurality of images of the same object photographed from different viewpoints. Given this input, the computer performing the registration process extracts local features from each image (step S1). It then uses distance calculations to find the local features of the same part that appear continuously in the images of different viewpoints, and creates a subspace that combines these local features into one (step S3). Finally, the identifier of the object and the range of shooting angles of the viewpoints are attached to the generated subspace, which is registered in the database (step S5).
The recognition side can likewise be regarded as a device comprising a part responsible for extracting local features from the search query, a part responsible for retrieving the subspace most similar to each extracted local feature and determining one object ID by majority processing over the object IDs attached to those subspaces, and a part responsible for outputting the determined object ID.
In the method described above, a subspace is created by KL expansion from the set of features obtained from the same local region over multiple frames, and a plurality of local features is represented together as one. By doing so, the number of features registered in the database can be greatly reduced, and the memory capacity can be reduced.
In the above method of calculating the similarity to the k-dimensional subspace corresponding to each class, each query feature vector (local feature) must be projected onto every basis of the subspace of every class in the database at recognition time to obtain the similarity. The processing time therefore grows as the number of registered objects increases. This embodiment adopts a method that searches for the subspace with approximately the highest similarity, thereby reducing the processing time.
In recognition by projection onto k-dimensional subspaces, each query feature vector is projected onto every basis of every subspace in the database to find the most similar subspace, which takes considerable processing time. This embodiment therefore proposes finding the most similar subspace by computing the distance between the local feature obtained from the query and a point on the first principal component of each subspace. If the most similar subspace can be found by a distance calculation, the search can be accelerated by the various approximate nearest neighbor search methods that have already been proposed.
One proposed approximate nearest neighbor search method is called ANN. It is based on a tree structure (kd-tree), and since processing software is available, ANN processing can easily be tried. An outline of processing with ANN is given below.
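As an illustration only (the patent refers to the ANN library itself), a kd-tree-based approximate nearest neighbor query with an ε parameter can be written with SciPy's cKDTree as a stand-in; its `eps` argument allows the returned neighbor to be up to (1 + eps) times farther away than the true nearest neighbor.

```python
# Approximate nearest neighbour search over the reference points (e.g. the
# unit points on the first principal components), using a kd-tree.
import numpy as np
from scipy.spatial import cKDTree

def build_index(points):
    return cKDTree(points)                 # kd-tree over the (n, d) reference points

def approx_nearest(tree, q, epsilon=0.1):
    # eps > 0 permits an approximate answer: the neighbour returned may be up
    # to (1 + eps) times farther than the exact nearest neighbour.
    dist, idx = tree.query(q, k=1, eps=epsilon)
    return int(idx), float(dist)
```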
As described above, the method of performing the distance calculation using only the first principal component of each subspace reduces the processing time required for recognition compared with calculating the similarity to the k-dimensional subspace. However, a higher recognition rate can be expected when several axes of the subspace are used.
As another embodiment, therefore, recognition can adopt a two-stage process. In the first stage, only the first principal component of each subspace is used, and candidates likely to be the most similar subspace are narrowed down quickly by approximate nearest neighbor search. In the second stage, projection is performed in higher dimensions using several axes of each candidate subspace, and the truly most similar subspace is obtained. Adding the second stage allows a higher recognition rate to be expected than when only the first principal component of each subspace is used. At the same time, compared with a second-stage-only approach, i.e., projecting each feature vector onto every basis of every subspace to obtain the similarity, the targets (candidates) of the similarity calculation are narrowed down, so the processing time can be reduced.
This embodiment is effective when a moving image or a plurality of continuously captured images is used as the search query. In this aspect, the trace processing is performed on the plurality of query images just as when the database is created, and a subspace is created from the set of local features obtained by the trace processing. In this way, the subspaces in the database can be compared with the subspace obtained from the query image; the subspace obtained from the search query is called the query subspace in this specification. To compare subspaces and find the optimum one, canonical angles are generally used, but that approach cannot be accelerated. Here, therefore, the optimum subspace is again found by distance calculation. As in the ANN distance calculation already described, a point equidistant from the origin is placed on the axis of each subspace and the distances between points are calculated. Note that because the query is itself a subspace in this embodiment, a point must also be placed on the coordinate axis of the query subspace. A more detailed description follows.
A comparison experiment was carried out between the basic conventional method of object recognition using local features and the method of the present invention, which represents a plurality of local features by a subspace. The comparison indices are memory capacity and recognition rate.
The dataset used in this experiment is as follows. In this experiment, 55 three-dimensional objects were each rotated once on a turntable and moving images were captured.
FIG. 5 is an explanatory diagram showing some examples of the objects used in the experiment according to the present invention. The frame images constituting the captured moving images were acquired, and registration images to be stored in the database and search query images were produced from them. For the registration images, 360 images were created for each object by changing the shooting angle in steps of about 1 degree. For the query images, 10 images were created for each object at intervals of about 36 degrees. The registration images and the query images were different images; that is, no registration image is completely identical to a search query image. To build the database, SIFT features were extracted from the registration images, yielding roughly 100 to 400 local features per registration image. Assuming 300 local features per image, one object yields 300 × 360 = 108,000 local features, and the total for 55 objects is 55 times that, i.e., 5,940,000. The search queries number 10 × 55 = 550 images in total. Object recognition is performed for each image, so up to 550 recognition trials are run; the recognition rate on the vertical axis of FIG. 6 is their average.
First, a first experiment was carried out to examine how much the recognition rate is affected when, in the recognition process of the present invention, votes are cast taking into account not only the object but also its shooting angle, as in FIG. 4(b).
FIG. 6 is a graph showing the results of the first experiment according to the present invention. The horizontal axis shows the number of dimensions of the subspace used, and the vertical axis shows the recognition rate. The results show that even when voting per object ID, a recognition rate exceeding 95% was obtained, reaching 98.3% when the subspace was three-dimensional. When voting per object shooting angle, erroneous votes were dispersed and the recognition rate improved further regardless of the subspace dimensionality. When voting on combinations of object ID and shooting angle rather than the object ID alone, the recognition rate reached 98.9% with a three-dimensional subspace. Surprisingly, increasing the dimensionality further did not yield a recognition rate above 98.9%. A likely reason is that increasing the subspace dimensionality increases the overlap between subspaces and lowers their discriminability: not only the similarity to the desired subspace but also the similarity to other subspaces may become high.
FIG. 8 is a graph showing the result of voting only on the object ID for object A of FIG. 7.
FIG. 9 is an explanatory diagram showing an example in which object A of FIG. 7 came to be recognized correctly by voting on combinations of object ID and shooting angle. The query image of FIG. 7 shows object A photographed at a certain shooting angle. FIG. 8 shows the scores obtained by voting on each object ID and normalizing the vote counts. The score of object A should have been the highest, but the scores of objects B and C were higher; consequently, when voting per object, the query image was recognized as object C.
FIG. 11 is an explanatory diagram showing object D used in the experiment of the present invention. FIG. 12 shows the scores at each shooting angle of object D and object E when the object D shown in FIG. 11 is used as the query image. Ideally, the range of 132 to 144 degrees, where object D scores highest, should be obtained as the recognition result. However, when a score is high even though the votes are somewhat dispersed, as with object E, misrecognition results.
For the recognition experiments on 1002 objects, experiments were conducted to verify (1) the effectiveness of using ANN and (2) the improvement in recognition rate achieved by the mutual subspace method. In the first experiment, objects were recognized by comparing local features with subspaces; in the second, objects were recognized using the mutual subspace method.
2-1.実験準備
この実験例で用いたデータベースについて説明する。実験例では、1002 個の3次元物体につき各物体をターンテーブルで1回転させ、正面、上15 度および上30 度の仰角からの動画をそれぞれ撮影した。
図14は、データベースに登録した物体の一例を示す説明図である。撮影された動画からフレーム画像を取得し、データベース用画像とした。これらの画像からSIFT 特徴量を抽出し、トレース処理を行い、部分空間を作成した。ここで、部分空間を作成する局所特徴量の集合は、トレース処理によって50 フレーム以上連続して対応づいた局所特徴量の集合とした。作成した部分空間を物体ID とともにデータベースに登録した。部分空間の数は、1物体あたり約550 個である。
Next, the queries used in this experimental example are described. Out of the 1,002 objects used for the database, 100 objects were selected at random, and a video of each selected object was captured while holding it by hand. Frame images were extracted from the captured videos and used as query images. FIG. 15 is an explanatory view showing an example of an object photographed for a query. SIFT features were also extracted from the query images.
2-2. Experiment on distance computation using ANN
As the first experiment, in order to verify the effectiveness of distance computation using ANN, the recognition rate and processing time of exact nearest neighbor search were compared with those of approximate nearest neighbor search using ANN. For each query image, the photographed object was recognized.
An experiment was also conducted to confirm the improvement in recognition rate achieved by the mutual subspace method. Trace processing was applied to the query frame images of each object to create query subspaces. Only sets of local features matched over T or more frames by the trace processing were used to create the query subspaces, and the value of T was varied over [8, 13, 25, 38, 50].
In this embodiment, a technique has been shown for speeding up, within the subspace method, the process of finding the subspace with the highest similarity. According to this embodiment, the subspace with the highest similarity to a local feature is determined not from the projection components onto the bases of each subspace, but from the relative magnitudes of the distances to fixed points placed on the bases of each subspace. Because the determination relies on relative distances, approximate nearest neighbor search with ANN can be applied, and a faster processing time was realized.
As a result, whereas the conventional subspace method required 190 seconds of processing time with a recognition rate of 34%, using ANN reduced the processing time to 0.012 seconds while keeping the recognition rate at roughly the same level. The proposed technique was also applied to the mutual subspace method, enabling a substantial improvement in recognition rate together with the speedup in processing time.
α1, α2: feature vectors
Claims (13)
- A method of compiling an image database for three-dimensional object recognition, comprising:
a step of, when a plurality of images of a three-dimensional object viewed from different viewpoints is input, extracting local features of each image and representing each of them as a feature vector;
a subspace generation step of generating a plurality of sets of feature vectors, each set representing local features of the same portion of the object viewed from a series of adjacent viewpoints, and generating, by a subspace method, a plurality of subspaces each representing the characteristics of one of the sets; and
a registration step of associating an identifier of the object with each subspace and registering them in a database for three-dimensional object recognition,
wherein the database is accessed by a computer for recognition processing of three-dimensional objects, and
the recognition processing is realized by a step of, when a single image of an object viewed from a certain viewpoint or a plurality of images viewed from a series of viewpoints is given as a query, extracting a plurality of feature vectors each representing a local feature of the query as query feature vectors, determining for each query feature vector the subspace most similar to it, tallying the object IDs associated with the respective subspaces, and obtaining the object most similar to the query.
- The method according to claim 1, wherein the registration step registers, in association with each subspace, a combination of the identifier of the object and viewpoint data representing the series of viewpoints, and
the recognition processing is a step of tallying the combinations associated with the respective subspaces and obtaining the object and the viewpoint most similar to the query.
- The method according to claim 1 or 2, wherein the subspace generation step calculates the distance between the feature vectors of each pair corresponding to adjacent viewpoints, excludes as noise those pairs whose distances differ from those of other pairs beyond a predetermined criterion, and treats the remaining feature vectors as a set of feature vectors representing the same portion.
- The method according to any one of claims 1 to 3, wherein the subspace generation step generates a subspace when the feature vectors of a set span a range wider than a predetermined amount of viewpoint change, and does not generate a subspace when the range falls short of that amount.
- The method according to claim 4, wherein the subspace generation step excludes as noise those pairs for which the difference between the pair with the closest distance and the pair with the second closest distance exceeds the criterion.
- The method according to any one of claims 1 to 5, wherein the recognition processing projects each query feature vector onto the bases defining the coordinate system of each subspace registered in advance in the database, calculates the similarity between each query feature vector and each subspace on the basis of the magnitudes of the projection components, and determines the subspace with the highest similarity to be the subspace most similar to that query feature vector.
- The method according to any one of claims 1 to 5, wherein the recognition processing determines, for each subspace, the basis of the first principal component associated with the largest eigenvalue of the set of feature vectors whose characteristics the subspace represents, places a point on the first-principal-component basis of each subspace at a position of equal normalized distance from the origin, finds the distance between each such point and each query feature vector by applying an approximate nearest neighbor search technique, and determines the subspace at the closest distance to be the subspace most similar to that query feature vector.
- A method of compiling an image database for three-dimensional object recognition, wherein the recognition processing narrows the subspaces similar to each query feature vector down to several candidates by the method according to claim 7, then projects that query feature vector onto each basis of each candidate subspace, calculates the similarity between each query feature vector and each subspace on the basis of the magnitudes of the projection components, and determines the subspace with the highest similarity to be the subspace most similar to that query feature vector.
- The method according to any one of claims 1 to 5, wherein the recognition processing, in place of the processing of determining the subspace most similar to each query feature vector,
(1) generates, when a plurality of images viewed from a series of viewpoints is given as a query, a plurality of sets of query feature vectors relating to the query, each set representing local features of the same portion of the object viewed from a series of adjacent viewpoints, and generates, by the subspace method, a plurality of query subspaces each representing the characteristics of one of the sets,
(2) places a query reference point on each basis defining the coordinate system of each query subspace at a predetermined position of normalized distance from the origin, and places a reference point on each basis of each subspace registered in the database at the same predetermined position of normalized distance from the origin, and
(3) determines, using an approximate nearest neighbor search technique, the subspace having the reference point at the shortest distance from each query reference point,
thereby determining the subspace most similar to each query subspace.
- The method according to any one of claims 1 to 9, wherein the viewpoint data is data on the shooting angle of the object.
- The method according to any one of claims 1 to 10, wherein the subspace generation step collects the feature vectors into sets by applying the CLAFIC method and generates the subspaces.
- The method according to any one of claims 1 to 6, wherein the number of dimensions of the subspaces is not less than 1 and not more than 3.
- An apparatus for compiling an image database for three-dimensional object recognition, comprising:
an extraction unit that, when a plurality of images of a three-dimensional object viewed from different viewpoints is input, extracts local features of each image and represents each of them as a feature vector;
a subspace generation unit that generates a plurality of sets of feature vectors, each set representing local features of the same portion of the object viewed from a series of adjacent viewpoints, and generates, by a subspace method, a plurality of subspaces each representing the characteristics of one of the sets; and
a registration unit that associates an identifier of the object with each subspace and registers them in a database for three-dimensional object recognition,
wherein the database is accessed by a three-dimensional object recognition apparatus, and
the recognition apparatus has a function of, when a single image of an object viewed from a certain viewpoint or a plurality of images viewed from a series of viewpoints is given as a query, extracting a plurality of feature vectors each representing a local feature of the query as query feature vectors, determining for each query feature vector the subspace most similar to it, tallying, for each query feature vector, the object IDs associated with the respective subspaces, and obtaining the object most similar to the query.
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201180021296.XA CN103109307B (zh) | 2010-04-28 | 2011-04-27 | Method and apparatus for creating an image database for three-dimensional object recognition |
| HK13111622.8A HK1184267B (en) | 2010-04-28 | 2011-04-27 | Creation method and creation device of three-dimensional object recognition-use image database |
| US13/643,284 US8971610B2 (en) | 2010-04-28 | 2011-04-27 | Method and apparatus of compiling image database for three-dimensional object recognition |
| JP2012512888A JP5818327B2 (ja) | 2010-04-28 | 2011-04-27 | Method and apparatus for creating an image database for three-dimensional object recognition |
| EP11775057.0A EP2565844B1 (en) | 2010-04-28 | 2011-04-27 | Creation method and creation device of three-dimensional object recognition-use image database |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2010103543 | 2010-04-28 | ||
| JP2010-103543 | 2010-04-28 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2011136276A1 true WO2011136276A1 (ja) | 2011-11-03 |
Family
ID=44861574
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2011/060277 Ceased WO2011136276A1 (ja) | 2010-04-28 | 2011-04-27 | 三次元物体認識用画像データベースの作成方法および作成装置 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US8971610B2 (ja) |
| EP (1) | EP2565844B1 (ja) |
| JP (1) | JP5818327B2 (ja) |
| CN (1) | CN103109307B (ja) |
| WO (1) | WO2011136276A1 (ja) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103136751A (zh) * | 2013-02-05 | 2013-06-05 | 电子科技大学 | 一种改进型sift图像特征匹配算法 |
| WO2015015554A1 (ja) * | 2013-07-29 | 2015-02-05 | Necソリューションイノベータ株式会社 | 3dプリンタ装置、3dプリント方法及び立体造形物の製造方法 |
| WO2017017808A1 (ja) * | 2015-07-29 | 2017-02-02 | 株式会社日立製作所 | 画像処理システム、画像処理方法及び記憶媒体 |
| CN110047275A (zh) * | 2018-01-13 | 2019-07-23 | 丰田自动车株式会社 | 多个连接的车辆的观察结果之间的关联和相似度学习 |
Families Citing this family (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP5906071B2 (ja) * | 2011-12-01 | 2016-04-20 | キヤノン株式会社 | 情報処理方法、情報処理装置、および記憶媒体 |
| US9418427B2 (en) * | 2013-03-15 | 2016-08-16 | Mim Software Inc. | Population-guided deformable registration |
| US9495389B2 (en) * | 2013-03-15 | 2016-11-15 | Qualcomm Incorporated | Client-server based dynamic search |
| CN104133807B (zh) * | 2014-07-29 | 2017-06-23 | 中国科学院自动化研究所 | 学习跨平台多模态媒体数据共同特征表示的方法及装置 |
| TW201719572A (zh) * | 2015-11-19 | 2017-06-01 | 國立交通大學 | 三維模型分析及搜尋方法 |
| CN109074369B (zh) * | 2016-03-08 | 2022-03-04 | 河谷控股Ip有限责任公司 | 用于基于图像的对象识别的图像特征组合 |
| US20170323149A1 (en) * | 2016-05-05 | 2017-11-09 | International Business Machines Corporation | Rotation invariant object detection |
| CN110147460B (zh) * | 2019-04-23 | 2021-08-06 | 湖北大学 | 基于卷积神经网络与多视角图的三维模型检索方法及装置 |
| CN110389703A (zh) * | 2019-07-25 | 2019-10-29 | 腾讯数码(天津)有限公司 | 虚拟物品的获取方法、装置、终端和存储介质 |
| US11328170B2 (en) * | 2020-02-19 | 2022-05-10 | Toyota Research Institute, Inc. | Unknown object identification for robotic device |
| WO2022015236A1 (en) * | 2020-07-17 | 2022-01-20 | Hitachi, Ltd. | Method of image processing for object identification and system thereof |
| CN112463952B (zh) * | 2020-12-22 | 2023-05-05 | 安徽商信政通信息技术股份有限公司 | 一种基于近邻搜索的新闻文本聚合方法及系统 |
| US11886445B2 (en) * | 2021-06-29 | 2024-01-30 | United States Of America As Represented By The Secretary Of The Army | Classification engineering using regional locality-sensitive hashing (LSH) searches |
| JP7113469B1 (ja) * | 2021-12-24 | 2022-08-05 | 学校法人明治大学 | 物体認識システム、物体認識プログラム及び、物体認識方法 |
| CN117033718B (zh) * | 2023-09-14 | 2024-06-07 | 上海交通大学 | 基于光线追踪的近似近邻搜索方法、系统、介质及设备 |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2004008392A1 (ja) * | 2002-07-10 | 2004-01-22 | Nec Corporation | Image matching system, image matching method, and image matching program using a three-dimensional object model |
| US7706603B2 (en) * | 2005-04-19 | 2010-04-27 | Siemens Corporation | Fast object detection for augmented reality systems |
- 2011
- 2011-04-27 CN CN201180021296.XA patent/CN103109307B/zh not_active Expired - Fee Related
- 2011-04-27 WO PCT/JP2011/060277 patent/WO2011136276A1/ja not_active Ceased
- 2011-04-27 US US13/643,284 patent/US8971610B2/en not_active Expired - Fee Related
- 2011-04-27 JP JP2012512888A patent/JP5818327B2/ja not_active Expired - Fee Related
- 2011-04-27 EP EP11775057.0A patent/EP2565844B1/en not_active Not-in-force
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2008026414A1 (fr) * | 2006-08-31 | 2008-03-06 | Osaka Prefecture University Public Corporation | Image recognition method, image recognition device, and image recognition program |
| WO2009133855A1 (ja) * | 2008-04-30 | 2009-11-05 | Osaka Prefecture University Public Corporation | Method for creating image database for three-dimensional object recognition, processing apparatus, and processing program |
| JP2010009332A (ja) * | 2008-06-27 | 2010-01-14 | Denso It Laboratory Inc | Image search database device, image search database management method, and program |
Non-Patent Citations (11)
| Title |
|---|
| D. LOWE: "Distinctive image features from scale-invariant keypoints", INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 60, no. 2, 2004, pages 91 - 110, XP019216426, DOI: doi:10.1023/B:VISI.0000029664.99615.94 |
| E. OJA: "Subspace methods of pattern recognition", 1983, RESEARCH STUDIES PRESS |
| F. ROTHGANGER; S. LAZEBNIK; C. SCHMID; J. PONCE: "3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints", INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 66, 2006, pages 3 |
| HONDO; KISE: "Inspection of Memory Reduction Methods for Specific Object Recognition: Approaches by quantization and selection of Local Features", TECHNICAL REPORT OF THE INSTITUTE OF ELECTRONICS, INFORMATION, AND COMMUNICATION ENGINEERS, 2009 |
| INOUE; MIYAKE; KISE: "Experimental Investigation of a Memory Reduction for 3D Object Recognition Based on Selection of Local Features", THE IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS D, vol. J92-D, no. 9, 2009, pages 1686 - 1689 |
| KENICHIRO ISHII; EISAKU MAEDA; NAONORI UEDA; HIROSHI MURASE: "Plain Pattern Recognition", August 1998, OHMSHA LTD., pages: 147 - 151 |
| MAEDA; WATANABE: "Pattern matching using local structure", THE TRANSACTIONS OF THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS (D, 1985, pages 345 - 352 |
| MURASE; S. K. NAYAR: "3D object Recognition from appearance - parametric eigenspace method", THE TRANSACTIONS OF THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS (D-II, vol. J77-D-II, no. 11, 1994, pages 2179 - 2187 |
| P. J. BESL; R. C. JAIN: "Three-Dimensional Object Recognition", ACM COMPUTING SURVEYS, vol. 17, no. 1, 1985, pages 75 - 145, XP008018762, DOI: doi:10.1145/4078.4081 |
| S. ARYA; D. M. MOUNT; N. S. NETANYAHU; R. SILVERMAN; A. Y. WU: "An optimal algorithm for approximate nearest neighbor searching", JOURNAL OF THE ACM, vol. 45, no. 6, 1998, pages 891 - 923 |
| See also references of EP2565844A4 |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103136751A (zh) * | 2013-02-05 | 2013-06-05 | 电子科技大学 | 一种改进型sift图像特征匹配算法 |
| WO2015015554A1 (ja) * | 2013-07-29 | 2015-02-05 | Necソリューションイノベータ株式会社 | 3dプリンタ装置、3dプリント方法及び立体造形物の製造方法 |
| WO2017017808A1 (ja) * | 2015-07-29 | 2017-02-02 | 株式会社日立製作所 | 画像処理システム、画像処理方法及び記憶媒体 |
| JPWO2017017808A1 (ja) * | 2015-07-29 | 2018-03-08 | 株式会社日立製作所 | 画像処理システム、画像処理方法及び記憶媒体 |
| CN110047275A (zh) * | 2018-01-13 | 2019-07-23 | 丰田自动车株式会社 | 多个连接的车辆的观察结果之间的关联和相似度学习 |
Also Published As
| Publication number | Publication date |
|---|---|
| US20130039569A1 (en) | 2013-02-14 |
| CN103109307B (zh) | 2015-11-25 |
| JP5818327B2 (ja) | 2015-11-18 |
| CN103109307A (zh) | 2013-05-15 |
| EP2565844A4 (en) | 2014-12-03 |
| HK1184267A1 (zh) | 2014-01-17 |
| EP2565844A1 (en) | 2013-03-06 |
| US8971610B2 (en) | 2015-03-03 |
| EP2565844B1 (en) | 2016-04-06 |
| JPWO2011136276A1 (ja) | 2013-07-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP5818327B2 (ja) | 三次元物体認識用画像データベースの作成方法および作成装置 | |
| JP4963216B2 (ja) | コンピュータにより実施される、データサンプルのセットについて記述子を作成する方法 | |
| US9141871B2 (en) | Systems, methods, and software implementing affine-invariant feature detection implementing iterative searching of an affine space | |
| Darom et al. | Scale-invariant features for 3-D mesh models | |
| Mian et al. | On the repeatability and quality of keypoints for local feature-based 3d object retrieval from cluttered scenes | |
| Bagri et al. | A comparative study on feature extraction using texture and shape for content based image retrieval | |
| US9542593B2 (en) | Image-based feature detection using edge vectors | |
| Salti et al. | A performance evaluation of 3d keypoint detectors | |
| US20110286628A1 (en) | Systems and methods for object recognition using a large database | |
| JP5705147B2 (ja) | 記述子を用いて3dオブジェクトまたはオブジェクトを表す方法 | |
| US20120301014A1 (en) | Learning to rank local interest points | |
| CN105069457B (zh) | 图像识别方法和装置 | |
| CN106650580A (zh) | 基于图像处理的货架快速清点方法 | |
| Salti et al. | On the affinity between 3D detectors and descriptors | |
| Mouine et al. | Combining leaf salient points and leaf contour descriptions for plant species recognition | |
| CN118351113A (zh) | 基于人工智能的微塑料智能识别定位和尺寸计算系统 | |
| Wu et al. | A vision-based indoor positioning method with high accuracy and efficiency based on self-optimized-ordered visual vocabulary | |
| Tang et al. | A GMS-guided approach for 2D feature correspondence selection | |
| Razzaghi et al. | A new invariant descriptor for action recognition based on spherical harmonics | |
| KR101306576B1 (ko) | 차분 성분을 고려한 조명 변화에 강인한 얼굴 인식 시스템 | |
| WO2015178001A1 (ja) | 画像照合システム、画像照合方法、およびプログラムを記憶する記録媒体 | |
| Kalaiyarasi et al. | Enhancing logo matching and recognition using local features | |
| HK1184267B (en) | Creation method and creation device of three-dimensional object recognition-use image database | |
| Robert et al. | Efficient real-time contour matching | |
| Meenakshi et al. | Multi-View Point Panorama Construction With Wide-Baseline Geo-Graphical Images |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | Wipo information: entry into national phase | Ref document number: 201180021296.X; Country of ref document: CN |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 11775057; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2012512888; Country of ref document: JP |
| | WWE | Wipo information: entry into national phase | Ref document number: 13643284; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | Wipo information: entry into national phase | Ref document number: 2011775057; Country of ref document: EP |