WO2009047561A1 - Determination of Values - Google Patents
Determination of Values
- Publication number
- WO2009047561A1 (PCT/GB2008/050923)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target object
- determining
- steps
- attribute
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- A—HUMAN NECESSITIES
- A43—FOOTWEAR
- A43D—MACHINES, TOOLS, EQUIPMENT OR METHODS FOR MANUFACTURING OR REPAIRING FOOTWEAR
- A43D37/00—Machines for roughening soles or other shoe parts preparatory to gluing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/178—Human faces, e.g. facial parts, sketches or expressions estimating age from face image; using age information for improving recognition
Definitions
- the present invention relates to a method and apparatus for determining values associated with unknown attributes of objects visible in images.
- the present invention relates to the real time, simultaneous determination of attributes, such as age or gender, associated with faces visible in still or moving images.
- computers and other processing units are able to process graphic images in digital format.
- images may be digital or digitized photographs or still frames from video footage or the like.
- From time to time it is desirable or indeed required to identify objects of interest in such images and subsequently determine one or more attributes associated with the identified objects.
- One such particular field of interest is the detection of one or more human faces in an image and the determination of one or more attributes associated with the face.
- Such use has a broad range of possible applications such as surveillance and/or demographic analysis or the like.
- the target object may be a face, but alternatively a broad range of target objects, such as vehicles, buildings, planets or animals, could be targeted.
- a method of determining at least one unknown attribute value associated with a target object visible in an image comprising the steps of: providing a sample point in multi-dimensional space associated with the target object; determining at least one conditional probability distribution for at least one attribute associated with the target object; and determining at least one conditional expectation value, indicating a respective attribute value associated with the target object, from the conditional probability distribution.
- apparatus for determining at least one unknown attribute value associated with a target object visible in an image comprising: a vector generator that generates an N-dimensional vector associated with a target object; and an attribute determiner that determines at least one conditional probability distribution for at least one attribute associated with the target object, and from the distribution determines at least one conditional expectation value indicating a respective attribute value associated with the target object.
- a method of providing an N-dimensional regularised Gaussian probability distribution comprising the steps of: providing a plurality of images of a selected class or sub-class of target object; hand labelling attributes associated with the target object; and generating a regularised model of the target object.
- Embodiments of the present invention provide a method and apparatus for determining values for unknown attributes and/or confirming known attribute values associated with target objects visible in an image.
- Embodiments of the present invention can determine attributes such as gender and/or race and/or glasses wearer and/or age and/or demeanour or the like for a person or persons whose face is visible in an image.
- Embodiments of the present invention can carry out the determination of attribute values on more than one face in an image simultaneously and in real time, so that frame-by-frame image processing is possible.
- Embodiments of the present invention provide values for unknown attributes.
- if the attribute is a binary attribute, the values can be used to determine the incidence or lack of incidence of the attribute.
- for non-binary attributes, the values can be used to indicate a number within a possible range of values for the attribute. The number gives an indication of the attribute.
- Embodiments of the present invention may also provide a further value in addition to the unknown attribute value indicating a reliability associated with each determined attribute value.
- Embodiments of the present invention can be broadly applied to the determination of attributes associated with a broad range of target objects such as, but not limited to, faces, vehicles, buildings, animals, and others as detailed hereinafter.
- Embodiments of the present invention are also broadly applicable to a wide variety of applications including, but not limited to, surveillance, biometrics, demographic analysis, movie logging, key frame selection, video conferencing, control of further systems, portrait beautification or safety monitoring.
- Embodiments of the present invention are useful in "description" of objects when an object is visible in an image.
- Embodiments of the present invention may be used in combination with known forms of object “detection”, “recognition”, and/or “verification”.
- Figure 1 is a block diagram illustrating detection and description of a target object
- Figure 2 illustrates a detection and description system
- Figure 3 illustrates images and the detection and description of facial attributes
- Figure 4 illustrates an N-dimension sampling point
- Figure 5 illustrates model training
- Figure 6 illustrates an image including multiple instances of target objects
- Figure 7 illustrates tagging of an image with determined attributes
- Figure 8 illustrates face/non face classification performance with training set size
- Figure 9 illustrates an optimisation surface corresponding to the CCMI result
- Figure 10 illustrates examples of face detection.
- Embodiments of the present invention provide a method and apparatus for determining values associated with unknown attributes of target objects.
- target object(s): this term is to be interpreted broadly to cover any object whatsoever having one or more attributes associated with it which may be the target of analysis. Further discussion is made, by way of example only, with reference to a class of target object being a face. Different classes of object may be target objects, such as, but not limited to, vehicles, buildings, animals, planets, stars, hands or other body parts, whole people, or objects within medical images such as microcalcifications in mammograms or the like.
- an attribute is to be broadly construed as meaning any feature by which the class of object may be partitioned into a sub-class.
- attributes associated with a class of object being a face
- Table 1 also indicates possible examples for applications of the attributes.
- attributes are referred to herein as binary attributes. Such attributes may in real life have only one of two values. For example, with respect to gender of a human face, the gender must be male or female. Equally, some attributes will be non-binary attributes. A value within a range of possible values for such attributes, such as age, etc, can thus be determined.
- Figure 1 illustrates schematically the flow of steps followed according to embodiments of the present invention in order to define one or more unknown attributes associated with a target object or to confirm known attribute values.
- the process starts at step S100 with an input image which might be a still image or stream of moving images which may be separated into frame-by-frame still images.
- Figure 2 illustrates a block diagram of an object detection and description system 200 which includes a digital processor 201 and data storage 202.
- the processor includes an image digitiser 203 which receives an input image 204.
- the image may or may not include an object 205 being targeted displayed in a sub-window 206.
- the image digitiser 203 is a routine, module or firmware or hardware entity or software program or the like that creates a digital image representation of the image 204.
- An object detector 207 detects the object, such as a face, at step S101. If no target object is detected the system flags this and moves on to the next image to be processed or stops if all images are exhausted.
- the object detector 207 is a routine, module, firmware or hardware entity or software program or the like that detects instances of objects in an image 204.
- a broad range of face detectors which provide framing coordinates for a sub-window including a face in the image.
- suitable face detectors include cascade classifiers based on the work of Viola and Jones (P. Viola and M. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", Proc. IEEE CVPR, 2001).
- the instance of the face is extracted at step S102 and then the size and/or luminance of the extracted image is adjusted at step S103.
- the relevant area from the input picture 204 is extracted and resized to the nominal size of face models utilised during the face description process as described below.
- Each face may also be normalised for luminance e.g. to a standard mean and variance.
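The luminance normalisation described above can be sketched as follows. The target mean and standard deviation are assumptions for illustration; the patent only states that each face is normalised to a standard mean and variance, without fixing particular values.

```python
import numpy as np

def normalise_luminance(patch, target_mean=0.0, target_std=1.0):
    """Normalise a greyscale patch to a standard mean and variance.

    A generic sketch of the luminance normalisation step; the target
    statistics and the flat-patch policy are assumptions, not values
    taken from the patent.
    """
    patch = patch.astype(float)
    std = patch.std()
    if std == 0:                      # flat patch: only shift the mean
        return np.full_like(patch, target_mean)
    return (patch - patch.mean()) / std * target_std + target_mean

# Example: a 2x2 greyscale patch normalised to zero mean, unit variance.
x = np.array([[10., 20.], [30., 40.]])
y = normalise_luminance(x)
```

After this step every extracted face occupies the same intensity range, so differences between probe and model reflect structure rather than lighting level.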
- FIG. 3 illustrates two examples of still frames from the film Kill Bill by Miramax Films 2003.
- Each image 204₁, 204₂ shown includes at least one face as a target object.
- a respective sub-window 206₁, 206₂ is detected in each image.
- an N-dimensional vector is generated in the attribute determiner 208.
- Figure 4 illustrates the generation of an N-dimensional vector for a detected target object in a sub-window.
- the image is divided into multiple pixels (16 shown in Figure 4).
- Each pixel 400₁ to 400₁₆ has a grey scale value determined for it.
- the grey scale value determined for each pixel is stored in a corresponding entry in a column vector 401.
- the grey scale value for pixel 400₁ is loaded into entry 401₁ whilst the grey scale value for pixel 400₂ is stored in entry 401₂, etc.
- the values corresponding to the grey scale values from the target object image thus provide known values in respective entries in the vector 401.
- the vector 401 has dimension N.
- N is equal to the number of known entries plus a number of unknown entries 402₁ to 402₄. It will be understood that N can be any integer and is determined by the pixellation scheme used as well as the number of attributes which are known/are to be determined. The number of unknown attributes may be selected as appropriate to the application.
- entry 402₁ corresponds to the gender attribute associated with the face whilst entry 402₂ corresponds to the race associated with the face, entry 402₃ corresponds to the age of the face and entry 402₄ corresponds to the degree to which the face is smiling.
- these unknown attributes 402 will include blank entries in the N-dimensional vector 401 generated for the object.
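The probe-vector construction above can be sketched as follows. The 4x4 pixellation and the four attribute slots mirror Figure 4; representing the unknown entries with NaN placeholders is an implementation choice for illustration, not something the patent specifies.

```python
import numpy as np

# Hypothetical 4x4 greyscale face crop (16 pixels), as in Figure 4.
pixels = np.arange(16, dtype=float).reshape(4, 4)

# Known entries: the 16 grey scale values, flattened in raster order
# (pixel 400_1 first, pixel 400_16 last).
known = pixels.ravel()

# Unknown entries 402_1..402_4: placeholders for gender, race, age, smile.
n_attributes = 4
unknown = np.full(n_attributes, np.nan)

# The N-dimensional probe vector, N = 16 known + 4 unknown = 20.
probe = np.concatenate([known, unknown])
```

The known part of `probe` is what is later conditioned on; the NaN slots mark the dimensions whose conditional expectation is to be computed.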
- the vector is applied to a relevant model at step S105, supplied from a training module 209. It will be appreciated that whilst the module 209 is illustrated as a separate unit in Figure 2, the functionality of the module could be held in any part of the processor 201.
- Embodiments of the present invention use conditional density estimation as described hereinbelow in more detail to derive the expectation for the unknown attribute values from a regularised model of faces probed by the input.
- models of target objects such as faces
- the models used can consist of a single multi-variate normal characterised by a single mean vector and covariance matrix or optionally of a multiple (mixture) model consisting of several multi-variate normals. In either case the models are generated from training samples.
- the target object is a face these are face images which have been framed and scaled to a consistent size then hand-labelled with attributes.
- the attributes are converted to numerical values which, for any particular image, are appended to the values derived from the pixels of that image to form a feature vector.
- the lengths of the vectors provide the dimensionality of the space in which all the image and attribute values are processed.
- the normal models are regularised so that in deriving the parameters of each model from training samples the statistics are combined in ways that optimise performance.
- Figure 5 illustrates a step in the training process which is carried out prior to the face description.
- Many training samples 500 each being an image 501 , including an object instance 502 displayed in a respective sub-window region 503.
- Each window 503 is extracted, sized and adjusted for luminance properties to provide an example image of an object 504₁ to 504ₙ, with the object, such as a face, in each image having certain characteristics.
- Grey scale values for pixels in each sized sub-window in the image are determined according to a number of divisions of the image which will ultimately be used during processing of unknown images. For example, as per Figure 4, each image 504 is split into 16 pixels and grey scale values corresponding to each respective pixel are stored in the first 16 entries 505₁ to 505₁₆ of the column vector for that object.
- Next, values for the 4 remaining entries, which each correspond to a respective attribute, are entered by a human supervisor.
- a human supervisor will look at each image in the training sample and determine a gender, race, age and level of smile, and a respective value will be stored in respective entries 506₁ to 506₄. It will be appreciated that other numbers of attributes may be utilised according to embodiments of the present invention.
- the training samples are stored in a data store of the training module 209 or data store 202 accessible to the training module.
- N-dimensional vectors are generated for each image in the training sample set.
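The assembly of training vectors and the single-normal statistics derived from them can be sketched as below. The sample counts, dimensions and random values are placeholders standing in for real hand-labelled face images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: 100 faces of 16 pixels each, plus 4
# hand-labelled attribute values (gender, race, age, smile) appended
# per face, converted to numerical values of comparable scale.
n_samples, n_pixels, n_attrs = 100, 16, 4
pixel_part = rng.random((n_samples, n_pixels))
attr_part = rng.random((n_samples, n_attrs))

# Feature vectors: pixel values with attribute values appended,
# one row per training sample.
X = np.hstack([pixel_part, attr_part])       # shape (100, 20)

# Single-normal model: mean vector and sample covariance over the
# joint pixel+attribute space.
mu = X.mean(axis=0)
S = np.cov(X, rowvar=False, bias=True)       # ML estimate (divide by N)
```

The dimensionality of `X`'s rows (here 20) is exactly the space in which both the training statistics and any later probe vector live.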
- a regularised normal model is generated from the full set of training samples.
- the training set is first partitioned according to the scheme described in section "MULTIPLE-MODEL APPROACH", then a regularised normal model is generated for each subset.
- Normal models used in the invention are derived from training samples according to the following scheme for regularising covariance estimates.
- the characterization of class i amounts to estimating the mean vector μᵢ and covariance matrix Σᵢ.
- the estimated values are denoted μ̂ᵢ and Σ̂ᵢ.
- the sample mean of the class i training samples, mᵢ, is the maximum likelihood estimate. While there is sometimes a justification for moving the sample mean towards a prior, there is usually no reason to modify it on the basis of any out-of-class samples.
- mᵢ will therefore be adopted as μ̂ᵢ.
- the class i training samples also provide a sample-covariance matrix Sᵢ, which weights all training samples equally.
- the common sample-covariance is computed over all classes, with N the total number of samples and μ the mean vector of all samples.
- Equation 9 indicates that mixing for different classes will use the same weightings of these components (the {α}), but that overall class volumes will then be adjusted by class-specific weights (the {γ}).
- the class-specific scalars γᵢ are meant to undo the differential effects on the volume of the different class-specific contributions to the estimate (like Sᵢ) by the same global contributions (like I).
- γ₁ is set to 1; the remainder are set relative to this.
- each class has a highly ellipsoidal distribution.
- An n-dimensional hyperellipsoid has three fundamental properties - its volume (a scalar), its shape (the lengths of its n axes; n-1 free if the volume is known) and its orientation (specified by n-1 angles). In principle these can be manipulated independently. Changing the volume of the sample-covariance distribution corresponds to changing the matrix determinant or scaling all the eigenvalues equally. Changing the shape corresponds to a transformation of the eigenvalues that is not a pure multiplication. Changing the orientation corresponds to multiplication by an orthogonal matrix.
- volume change is less well motivated and the use of the average of the diagonal of a sample-covariance matrix in both RDA and Mixed-LOOC to "normalize" the identity matrix term is questionable, first because it takes the volume of the non-regularized sample-covariance as a reliable estimator of covariance volume, and second because it corresponds in the measurement (pixel) domain to adding different amounts of uncorrelated noise to each class.
- the estimator according to embodiments of the present invention uses the identity matrix and the common sample-covariance as components for weighted corrections to the class sample-covariance. Following point 3, matrix diagonals are not included.
- the argument of 1 b relates to the weighting of the components.
- the estimator according to embodiments of the present invention adopts global parameters for combining a class's sample-covariance matrix with the common sample- covariance and the identity matrix, but recognizing that this produces different volume changes for each class, compensates by rescaling every class's volume independently.
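The estimator described above can be sketched as follows. Equation 9 itself is not reproduced in this text, so the mixture below (global weights over the class sample-covariance, common sample-covariance and identity, rescaled by a class-specific volume factor) is a plausible reading, not the patent's exact formula.

```python
import numpy as np

def regularised_covariance(S_i, S_common, alphas, gamma_i):
    """One plausible form of the regularised class covariance:

        Sigma_hat_i = gamma_i * (a1*S_i + a2*S_common + a3*I)

    The alphas are shared across classes; gamma_i rescales each
    class's volume independently. This is an assumption standing in
    for equation 9, which is not quoted in the text.
    """
    a1, a2, a3 = alphas
    n = S_i.shape[0]
    mix = a1 * S_i + a2 * S_common + a3 * np.eye(n)
    return gamma_i * mix
```

Note how the identity term regularises without using diag(Sᵢ), in line with the argument that uncorrelated pixel noise is equal across components.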
- the ⁇ a ⁇ and ⁇ ⁇ are thus tested on a grid of possible values.
- a subset of size (m-1 ) ⁇ //m of the training samples is used to develop sample-covariance matrices while the remaining N/m samples are processed according to the covariances estimated from equation 9 and evaluated according to the application's objective function.
- the objective function is always that of the underlying task.
- the ordinary Maximum Likelihood (ML) decision rules (2), (3) classify the validation set to produce error rates. The process is repeated m times (m-fold cross-validation) with different validation sets and the results summed. The best-performing parameter combination is selected.
- Cross-validation equates to repeatedly dividing the training set into two parts, deriving statistics from the first part, processing the second with regularization parameter values to obtain objective function values, then choosing the values that give best overall performance. Doing this so that the second part of the training set contains only one sample, i.e. leave-one-out cross-validation, is the most exhaustive and accurate approach, but because the whole process must be run N times, it is time-consuming.
- the validation set can have more members, say 1/m of the total training set, and the process repeats m times, always with different samples in the validation set. This m-fold cross-validation takes just over m/N of the time of full leave-one-out validation.
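The m-fold grid search described above can be sketched generically as below. The `fit` and `score` callables stand in for deriving the regularised covariances from the training fold and applying the ML decision rule to the validation fold; their concrete forms here are assumptions for illustration.

```python
import numpy as np
from itertools import product

def grid_search_cv(X, y, param_grid, fit, score, m=3, seed=0):
    """m-fold cross-validated grid search over regularisation
    parameters: split into m folds, train on (m-1)N/m samples, score
    on the held-out N/m, sum scores over folds, keep the best
    parameter combination.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), m)
    best, best_score = None, -np.inf
    for params in product(*param_grid):
        total = 0.0
        for k in range(m):
            val = folds[k]
            train = np.concatenate([folds[j] for j in range(m) if j != k])
            model = fit(X[train], y[train], params)
            total += score(model, X[val], y[val])
        if total > best_score:
            best, best_score = params, total
    return best
```

Leave-one-out is the special case m = N; smaller m trades a little accuracy in the objective estimate for an m/N speed-up, as the text notes.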
- CCM1 has positive distinguishing characteristics: (i) it combines partial estimators similarly for all classes, then compensates for volume distortions via scale parameters, (ii) it evaluates only by an application-specific objective function, and (iii) it permits cross-validation with fewer steps than leave-one-out.
- in appearance-based image processing it is reasonable not to include diag(Sᵢ).
- the dimensions are all pixels, and uncorrelated variation in individual component variances can be assumed to be due to equal noise, and is therefore captured by the component I.
- the single normal model is trained from all available training samples or a subset selected to match the training set more closely to the expected types of faces in the application. In this way the most appropriate part of the available training set can be used without extraneous training cases.
- the single normal model estimates attributes by transforming the face covariance into a set of matrices for conditional density estimation. For each face to be described, the relevant area from the input picture is extracted and sized to the nominal size of the face model. This is then used as a probe with a single linear transformation being applied to the probe in line with conditional density estimation to derive an expectation value for the missing dimensions which are the descriptive attributes.
- variances for the descriptive attributes which are unknown, can be obtained to provide a measure of confidence of the estimation.
- the numerical values for the attributes are converted as necessary to descriptive tags (e.g. for male/female) or via a simple transformation to coordinates (e.g. of landmarks).
- P₁ and P₂ are column vectors of dimensionality q and p respectively (P₁'s values are undefined and may correspond to unknown attributes).
- ⁇ i represents the mean of the values in X 1
- ⁇ 2 represents the mean of the values in X 2
- ⁇ n is the covariance matrix associated with the Xi component values
- ⁇ 22 is the covariance matrix associated with the X 2 component values
- ⁇ i2 is a sub matrix of overall covariance which represents covariance between components in the top X, and bottom X 2 (the probe) values.
- ⁇ 21 is the transpose of ⁇ 12 .
- ⁇ perm is the mean vector reordered to match RP, the reordered probe vector.
- ⁇ penn is the covariance matrix with rows and columns appropriately reordered.
- a matrix A can then be defined with submatrices and dimensions as shown:
- Equation (16) above is the motivation for choosing A: when the covariance matrix is calculated, the top right and bottom left corners turn out to be 0 submatrices, meaning
- the covariance matrix of the estimated data has been derived as Σ₁₁ − Σ₁₂Σ₂₂⁻¹Σ₂₁. This can be used to explore the principal components of variation, i.e. the modes in which the real values are likely to differ from the estimate. This provides an indication of the reliability of the determined values.
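The conditional density estimation defined above follows directly from the partitioned-normal formulas in the text: the expectation of the unknown part given the probe is μ₁ + Σ₁₂Σ₂₂⁻¹(x₂ − μ₂), and its covariance is Σ₁₁ − Σ₁₂Σ₂₂⁻¹Σ₂₁. A direct sketch:

```python
import numpy as np

def conditional_gaussian(mu, Sigma, q, x2):
    """Conditional distribution of the first q components (the unknown
    attributes) given the observed remainder x2 (the probe), for a
    joint normal with mean mu and covariance Sigma partitioned as in
    the text:

        E[X1 | X2 = x2]   = mu1 + S12 S22^{-1} (x2 - mu2)
        Cov[X1 | X2 = x2] = S11 - S12 S22^{-1} S21
    """
    mu1, mu2 = mu[:q], mu[q:]
    S11 = Sigma[:q, :q]
    S12 = Sigma[:q, q:]
    S22 = Sigma[q:, q:]
    gain = S12 @ np.linalg.inv(S22)       # the single linear transform
    cond_mean = mu1 + gain @ (x2 - mu2)   # expected attribute values
    cond_cov = S11 - gain @ S12.T         # confidence of the estimate
    return cond_mean, cond_cov
```

The diagonal of `cond_cov` gives the per-attribute variances mentioned in the text as a measure of confidence, and its eigenvectors give the principal modes in which the real values are likely to differ from the estimate.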
- an image generator 210 uses these values, together with the input image, to display an output image 211 including the target object 205 together with one or more tags 212 associated with the object in the input image.
- the face description stops at step S108. Alternatively if the image is a frame of a series of frames the next frame in the flow can be input for processing.
- the multiple normal model is created by partitioning the training set, either manually (i.e. in a supervised way) or automatically (unsupervised).
- Supervised partitioning of the training set may be by selection of an attribute and a threshold for that attribute, thereby defining a partition.
- the sex of a face is encoded as an attribute with value between 0 (male) and 1 (female). Choosing that attribute and the threshold 0.5 would define a partition of the training set where the male samples belong to one subset and the female samples belong to another.
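The threshold partition just described is a simple mask over one attribute column. In the sketch below the sex attribute is assumed to sit in the last column of each feature vector; the patent does not fix a column ordering.

```python
import numpy as np

# Feature vectors with the sex attribute stored (by assumption) in the
# last column, encoded between 0 (male) and 1 (female).
X = np.array([[0.2, 0.9, 0.1],
              [0.8, 0.1, 0.7],
              [0.5, 0.5, 0.9]])

sex = X[:, -1]
male_subset = X[sex < 0.5]      # one partition of the training set
female_subset = X[sex >= 0.5]   # the other
print(len(male_subset), len(female_subset))   # -> 1 2
```

Each resulting subset is then used independently to derive its own regularised normal models, as the following passages describe.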
- supervised partitioning could be by a criterion that involves a set of attributes. For example, a formula involving landmark location could be used to partition training images into those looking left, those looking forward and those looking upwards.
- supervised partitioning could specify a criterion on the pixel values rather than the descriptive attributes. For example a criterion based on the distribution of luminance in the picture could be used to partition based on illumination.
- Unsupervised methods of partitioning the training set include standard clustering techniques whereby some or all dimensions within the feature space can be used to group the training samples into clusters which then function as partitions for later stages. Following partition of the training set, each subset is used independently to derive normal models. For each subset, more than one model is formed. The reason is that the requirements of classification and conditional density estimation lead to different regularization criteria for those two tasks. Since the description stage involves both classification and conditional density estimation, the two models are used separately in the two stages.
- the multiple-model version may involve partitioning the training set in multiple ways, e.g. according to sex and according to illumination. For each of these different ways, each partition will have its own normal model for classification and its own normal model for conditional density estimation.
- the multiple-model version estimates attributes similarly to the single-model version in taking as input a resized and normalized face area. However, this area is then classified to one of the partitions defined during training. Classification is done by standard statistical means using the distributions regularized for this purpose. The face is then used as a probe in the conditional density model for that partition and the attributes read out appropriately. Alternatively, several models may be probed by the face, and the output description based on a combination of these depending on its closeness to boundaries in the classifier stage.
- Embodiments of the present invention can be applied to images containing a single instance of a target object.
- images containing more than one target object may be described.
- Figure 7 illustrates use of the input image shown in Figure 6 with output images displayed in which certain landmarks associated with each face are described, together with numerical values for certain further attributes. These can be shown on a user interface such as a monitor in real time acting as tags visible to a user.
- CCM1 can be used for estimating covariances both for conditional distribution estimation and for classification between the component models of a multiple-model realisation.
- the trials here are concerned with classification, and evaluate CCM1 on diverse problems.
- Four sets of experiments are described below referring to CCM1-estimated full-dimensional normal models.
- Section A presents two-class discrimination - face vs. non-face and smiling vs. neutral face - where CCM1 is compared against other estimators for a range of training set sizes. This affords direct comparison between CCM1, regularised discriminant analysis (RDA) and Mixed-LOOC1 (leave-one-out covariance) matrix estimators.
- Section B shows results for the face/non-face discriminator used to find faces. These results are indicators of performance which may be compared against other face finding approaches.
- Section C illustrates face classification experiments with 40 and 200 classes representing the identities of 40 and 200 individuals. Distance measurement in full dimensional space is compared with a well-known subspace method. Finally Section D uses the same face datasets as C to investigate dimensionality reduction.
- Face classification experiments used 19x19 greyscale pictures for a total dimensionality of 361. All training images were normalized to the same luminance mean and variance and the face training images were centred just above the nose tip. The applications were discrimination of faces from non-faces and discrimination between smiling and neutral faces. In the first case the number of training images per class was many times the dimensionality, in the second it was of about the same size. In both cases an extended superset (pool) of images that encompassed both classes was available. For the smiling/neutral comparison, this larger pool was the set of training faces from the face/non-face case.
- Tables 2 to 5 and figure 8 summarize the experiment.
- the tables show details for runs which used all available training images to compare CCM1, RDA, Mixed-LOOC1, unregularized quadratic ML classification and pooled-sample-covariance linear ML classification.
- Tables 2 and 3 show face/non-face classification; tables 4 and 5 smiling/neutral classification.
- Tables 2 and 4 use just the training sets for the classes being discriminated, tables 3 and 5 incorporate the larger pools.
- Two cases are shown for CCM1: first the leave-one-out training result, where every one of the training images was used in turn for cross-validation, then the result for three-fold cross-validation.
- the graph in figure 8 compares the same methods for face/non-face discrimination, but with a range of training set sizes. The far left points on each curve in figure 8 correspond to the results of table 2.
- CCM1 has only one α parameter.
- Table 2 illustrates face/non-face discrimination with no extra pool of training images. All images 19x19 greyscale. Training images: 3608 faces, 4479 non-faces. Test images: 1370 faces, 1276 non-faces.
- Table 3 illustrates face/non-face discrimination with extra pool of training images. All images 19x19 greyscale. Training images: 3608 faces, 4479 non-faces, 13200 superset of both. Test images: 1370 faces, 1276 non-faces.
- Table 4 illustrates smiling/neutral face discrimination with no extra pool of training images. All images 19x19 greyscale. Training images: 164 smiling faces, 300 neutral faces. Test images: 69 smiling faces, 136 neutral faces.
- Table 5 illustrates smiling/neutral face discrimination with extra pool of training images. All images 19x19 greyscale. Training images: 164 smiling faces, 300 neutral faces, 2249 superset of both. Test images: 69 smiling faces, 136 neutral faces.
- CCM1 outperforms both Mixed-LOOC and RDA, particularly when the volumes of the class covariances are dissimilar (e.g. in table 2 and figure 8, where a low-volume face class is compared with a high-volume non-face class).
- a reason for this is CCM1's class-specific correction for changes to covariance matrix volume - i.e. the γ₂ parameter.
- Evidence for this comes from a supplementary experiment to that of
- Figure 10 shows examples of a face detection scanner using maximum likelihood classification of each 19x19 window as face or non-face, according to regularized covariance matrices. Errors are illustrated on the bottom row of figure 10.
- Faces detected as in figure 10 can be fed to the smiling/neutral classifier according to embodiments of the present invention.
- embodiments of the present invention can be used in conjunction with any type of object detector/methodology.
- Embodiments of the present invention provide a method and apparatus for providing an image processing scheme which can describe a picture of an object, such as a face, with a list of textual attributes.
- The attributes can include the location of facial landmarks, interpretation of facial expression, identification of the presence of glasses, a beard, a moustache or the like, and intrinsic properties of the subject, such as race.
- Embodiments of the present invention can also determine other properties, such as head orientation, gaze direction, arousal and valence.
- By monitoring attributes over time, still further properties can be determined, such as detection of speaking when the object is a face and, in principle, lip-reading and temporal tracking of expression.
- Embodiments of the present invention which are directed to the determining of attributes of a face have a broad range of applications which include but are not limited to video logging, surveillance, demography analysis, selection of key frames or portraits with particular attributes (e.g. eye visibility) from a photoset or video, preprocessing for recognition, resynthesis, beautification, animation, video telephony, and the use of the face as a user interface.
- Embodiments of the present invention provide a detailed, rich, holistic description of face images which can be utilised for a broad range of purposes. For example, developers of well-known beautification applications may appreciate that such applications benefit from automatic location of landmarks; embodiments of the present invention additionally make it possible to exploit knowledge of age, sex and other attributes when carrying out beautification.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
A method and apparatus are provided for determining at least one unknown attribute value associated with a target object visible in an image. The method comprises the steps of providing a sample point in a multidimensional space associated with the target object, determining at least one conditional probability distribution for at least one attribute associated with the target object, and determining, from the conditional probability distribution, at least one conditional expectation value indicative of a respective attribute value associated with the target object.
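When the sample point and the attribute are modelled as jointly Gaussian, the conditional expectation described in the abstract has a closed form. The sketch below uses that Gaussian assumption for illustration; it is not necessarily the distribution used in the patent:

```python
import numpy as np

def conditional_expectation(x, mu_x, mu_a, cov_xx, cov_ax):
    """E[a | x] for jointly Gaussian (x, a):
    mu_a + cov_ax @ inv(cov_xx) @ (x - mu_x)."""
    return mu_a + cov_ax @ np.linalg.solve(cov_xx, x - mu_x)

# Toy example: a scalar attribute perfectly correlated with the first
# coordinate of a 2-D sample point.
mu_x = np.zeros(2)
mu_a = np.zeros(1)
cov_xx = np.eye(2)
cov_ax = np.array([[1.0, 0.0]])     # attribute tracks x[0]
a_hat = conditional_expectation(np.array([2.0, -1.0]), mu_x, mu_a, cov_xx, cov_ax)
# a_hat equals [2.0]: the expected attribute is the first coordinate
```

The same formula extends to vector-valued attributes by widening `cov_ax`, one row per attribute.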
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GB0719527.4 | 2007-10-08 | ||
| GBGB0719527.4A GB0719527D0 (en) | 2007-10-08 | 2007-10-08 | Value determination |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2009047561A1 true WO2009047561A1 (fr) | 2009-04-16 |
Family
ID=38739227
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/GB2008/050923 Ceased WO2009047561A1 (fr) | 2007-10-08 | 2008-10-08 | Détermination de valeurs |
Country Status (2)
| Country | Link |
|---|---|
| GB (1) | GB0719527D0 (fr) |
| WO (1) | WO2009047561A1 (fr) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113792569B (zh) * | 2020-11-12 | 2023-11-07 | Beijing Jingdong Zhenshi Information Technology Co., Ltd. | Object recognition method and apparatus, electronic device and readable medium |
- 2007
  - 2007-10-08: GB application GBGB0719527.4A published as GB0719527D0 (en), not active (ceased)
- 2008
  - 2008-10-08: WO application PCT/GB2008/050923 published as WO2009047561A1 (fr), not active (ceased)
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070086660A1 (en) * | 2005-10-09 | 2007-04-19 | Haizhou Ai | Apparatus and method for detecting a particular subject |
Non-Patent Citations (5)
| Title |
|---|
| BOR-CHEN KUO ET AL: "A Covariance Estimator for Small Sample Size Classification Problems and Its Application to Feature Extraction", IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 40, no. 4, 1 April 2002 (2002-04-01), XP011073115, ISSN: 0196-2892 * |
| FEITOSA R Q ET AL: "A New Covariance Estimate for Bayesian Classifiers in Biometric Recognition", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 14, no. 2, 1 February 2004 (2004-02-01), pages 214 - 223, XP011108231, ISSN: 1051-8215 * |
| ROBINSON J A ET AL.: "Estimation of face depths by conditional densities", PROC. BRITISH MACHINE VISION CONFERENCE BMVC 2005, vol. 1, September 2005 (2005-09-01), Oxford, UK, pages 609 - 618, XP002507655 * |
| ROBINSON J R ET AL.: "Covariance matrix estimation for appearance-based face image processing", PROC. BRITISH MACHINE VISION CONFERENCE BMVC 2005, vol. 1, September 2005 (2005-09-01), pages 389 - 398, XP002507657 * |
| ZHONG JING ET AL: "Glasses detection for face recognition using Bayes rules", ADVANCES IN MULTIMODAL INTERFACES-ICMI 2000. THIRD INTERNATIONAL CONFERENCE (LECTURE NOTES IN COMPUTER SCIENCE VOL.1948) SPRINGER VERLAG BERLIN, GERMANY, 2000, pages 127 - 134, XP002507656, ISBN: 3-540-41180-1 * |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107092931A (zh) * | 2017-04-24 | 2017-08-25 | Hebei University of Technology | Method for individual identification of dairy cows |
| WO2019229607A1 (fr) * | 2018-05-31 | 2019-12-05 | International Business Machines Corporation | Machine learning (ML) modeling by DNA computing |
| GB2589237A (en) * | 2018-05-31 | 2021-05-26 | Ibm | Machine learning (ML) modeling by DNA computing |
| US11531934B2 (en) | 2018-05-31 | 2022-12-20 | Kyndryl, Inc. | Machine learning (ML) modeling by DNA computing |
| GB2589237B (en) * | 2018-05-31 | 2023-02-08 | Kyndryl Inc | Machine learning (ML) modeling by DNA computing |
| US11928603B2 (en) | 2018-05-31 | 2024-03-12 | Kyndryl, Inc. | Machine learning (ML) modeling by DNA computing |
| US11580455B2 (en) | 2020-04-01 | 2023-02-14 | Sap Se | Facilitating machine learning configuration |
| US11880740B2 (en) | 2020-04-01 | 2024-01-23 | Sap Se | Facilitating machine learning configuration |
| CN118072248A (zh) * | 2024-02-28 | 2024-05-24 | Tuoli County Zhaojin Beijiang Mining Co., Ltd. | Heat-hazard prevention and control method based on analysis of multi-source mine data |
Also Published As
| Publication number | Publication date |
|---|---|
| GB0719527D0 (en) | 2007-11-14 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Sung et al. | Example-based learning for view-based human face detection | |
| Kollreider et al. | Real-time face detection and motion analysis with application in “liveness” assessment | |
| JP4543423B2 (ja) | Automatic object recognition and matching method and apparatus | |
| Sahbi et al. | A Hierarchy of Support Vector Machines for Pattern Detection. | |
| Prince et al. | Tied factor analysis for face recognition across large pose differences | |
| Huang et al. | Face detection from cluttered images using a polynomial neural network | |
| Shih et al. | Face detection using discriminating feature analysis and support vector machine | |
| US9355303B2 (en) | Face recognition using multilayered discriminant analysis | |
| Yang et al. | Face detection using multimodal density models | |
| Zheng et al. | A subspace learning approach to multishot person reidentification | |
| Zuobin et al. | Feature regrouping for CCA-based feature fusion and extraction through normalized cut | |
| WO2009047561A1 (fr) | Détermination de valeurs | |
| Hu et al. | Probabilistic linear discriminant analysis based on L 1-norm and its Bayesian variational inference | |
| Maronidis et al. | Improving subspace learning for facial expression recognition using person dependent and geometrically enriched training sets | |
| Lin et al. | Recognizing human actions using NWFE-based histogram vectors | |
| Pham et al. | Face detection by aggregated bayesian network classifiers | |
| Liu et al. | Human action recognition using manifold learning and hidden conditional random fields | |
| Meynet et al. | Fast multi-view face tracking with pose estimation | |
| Masip et al. | Feature extraction methods for real-time face detection and classification | |
| Popovici et al. | Face detection using an SVM trained in eigenfaces space | |
| Robinson | Covariance Matrix Estimation for Appearance-based Face Image Processing. | |
| Colmenarez | Facial analysis from continuous video with application to human-computer interface | |
| Zhao et al. | Evolutionary discriminant feature extraction with application to face recognition | |
| Urschler et al. | Robust facial component detection for face alignment applications | |
| Bayık | Automatic target recognition in infrared imagery |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 08806739; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 08806739; Country of ref document: EP; Kind code of ref document: A1 |