
NL2021481B1 - A method for automatically annotating and identifying a living being or an object with an identifier, such as RFID, and computer vision. - Google Patents


Info

Publication number
NL2021481B1
Authority
NL
Netherlands
Prior art keywords
image
subject
annotated
learning model
reader
Prior art date
Application number
NL2021481A
Other languages
Dutch (nl)
Inventor
Marc Jean Baptist Van Oldenborgh
Original Assignee
Kepler Vision Tech Bv
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kepler Vision Tech Bv filed Critical Kepler Vision Tech Bv
Priority to NL2021481A priority Critical patent/NL2021481B1/en
Priority to US17/042,063 priority patent/US11308358B2/en
Priority to EP19756032.9A priority patent/EP3740894A1/en
Priority to PCT/NL2019/050533 priority patent/WO2020036490A1/en
Application granted granted Critical
Publication of NL2021481B1 publication Critical patent/NL2021481B1/en
Priority to US17/659,574 priority patent/US11961320B2/en
Priority to US18/601,933 priority patent/US20240212385A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/30 Scenes; Scene-specific elements in albums, collections or shared content, e.g. social network photos or video
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for training a machine learning model to identify a subject having at least one machine readable identifier providing a subject ID, said method comprising: - providing a computer vision system with an image capturing system comprising at least one image capturing device, and a reader system comprising at least one reader for reading said at least one machine readable identifier; - defining said machine learning model in said computer vision system; - capturing an image using said image capturing system, said image showing said subject, - reading said subject ID using said reader system when capturing said image, and linking said subject ID with said image, said linking providing said image with a linked subject ID, resulting in at least one annotated image, - capturing at least one further image showing said subject, linking said linked subject ID to said at least one further image providing at least one further annotated image, and - subjecting said annotated image and said at least one further annotated image to said machine learning model for training said machine learning model.

Description

A method for automatically annotating and identifying a living being or an object with an identifier, such as RFID, and computer vision.
Field of the invention
The invention relates to a method, device, and computer program product for training a machine learning model to identify a subject having at least one machine readable identifier providing a subject ID.
Background of the invention
Artificial intelligence (AI) is developing rapidly and AI applications are supporting or will support all industries, including the aerospace industry, agriculture, chemical industry, computer industry, construction industry, defense industry, education industry, energy industry, entertainment industry, financial services industry, food industry, health care industry, hospitality industry, information industry, manufacturing, mass media, mining, telecommunication industry, transport industry, water industry and direct selling industry.
Human-machine communication is becoming more and more important, as machines (such as computers, smartphones, tablets and robots) are penetrating society rapidly.
Computer vision is an area of AI wherein machine learning is used to classify living beings and objects in images. Training a machine learning model for computer vision involves providing a training set with annotated images. Often a large number of images need to be annotated manually to establish a computer vision system with sufficient accuracy. Automatic annotation, instead of manual annotation, of living beings and objects in images can reduce the time and costs of annotation dramatically.
“Automatic Image Annotation via Label Transfer in the Semantic Space”, May 2016, by Tiberio Uricchio et al. (https://arxiv.org/abs/1605.04770), according to its abstract describes: “Automatic image annotation is among the fundamental problems in computer vision and pattern recognition, and it is becoming increasingly important in order to develop algorithms that are able to search and browse large-scale image collections. In this paper, we propose a label propagation framework based on Kernel Canonical Correlation Analysis (KCCA), which builds a latent semantic space where correlation of visual and textual features are well preserved into a semantic embedding.
The proposed approach is robust and can work either when the training set is well annotated by experts, as well as when it is noisy such as in the case of user-generated tags in social media. We report extensive results on four popular datasets. Our results show that our KCCA-based framework can be applied to several state-of-the-art label transfer methods to obtain significant improvements. Our approach works even with the noisy tags of social users, provided that appropriate denoising is performed. Experiments on a large scale setting show that our method can provide some benefits even when the semantic space is estimated on a subset of training images.”
US20070086626, with title “Individual identity authentication systems”, according to its abstract describes “A single image from a camera (14) is captured of an individual (40) seeking entry through a door held by a door latch (24). An image processor (16) looks for and locates a tag (42) worn by the individual (40) in the image and reads an identification (ID) code from the tag (42). A comparator (20) compares this ID code with ID codes in an identification database (22) to find a match. Once a match of ID codes is found, the image processor (16) looks for and locates a face (44) of the individual (40) in the image and extracts facial features from the face (44). The comparator (20) compares the extracted facial features with facial features associated with the matched ID code, from the identification database (22), to find a match. Once there is a match of facial features, the door latch (24) is released.”
“Automatic image annotation and retrieval using cross-media relevance model”, July 2003, by J. Jeon et al. (http://hpds.ee.kuas.edu.tw/download/parallel_processing/97/97present/20081226/Automatic%20Image%20Annotation%20and%20Retrieval%20using.pdf), according to its abstract describes: “Libraries have traditionally used manual image annotation for indexing and then later retrieving their image collections. However, manual image annotation is an expensive and labor intensive procedure and hence there has been great interest in coming up with automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate and retrieve images given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way.
Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) than a model based on word-blob co-occurrence model and twice as good as a state of the art model derived from machine translation. Our approach shows the usefulness of using formal information retrieval models for the task of image annotation and retrieval.”
US 8,380,558, with title “Method and system for analyzing shopping behavior in a store by associating RFID data with video-based behavior and segmentation data”, according to its abstract describes “The present invention is a method and system for analyzing shopping behavior by associating RFID data, such as tracking data by the RFID tag identifications, with video-based behavior and segmentation data, such as behavior analysis and demographic composition analysis of the customers, utilizing a plurality of means for sensing and using RFID tags, a plurality of means for capturing images, and a plurality of computer vision technologies. In the present invention, the association can further comprise the association of the RFID with the transaction data or any time-based measurement in the retail space. The analyzed shopping behavior in the present invention helps people to better understand business elements in a retail space. It is one of the objectives of the present invention to provide an automatic video-based segmentation of customers in the association with the RFID based tracking of the customers, based on a novel usage of a plurality of means for capturing images and a plurality of computer vision technologies on the captured visual information of the people in the retail space. The plurality of computer vision technologies can comprise face detection, person tracking, body parts detection, and demographic classification of the people, on the captured visual information of the people in the retail space.”
CN107066605, with title “Image identification-based device information automatic retrieval and display method”, according to its abstract describes “The invention relates to an image identification-based device information automatic retrieval and display method. The method is mainly and technically characterized by comprising the following steps of establishing a real scene map of a substation; obtaining a view angle picture of the position of a browser, and identifying a device type of a device contained in the picture in real time; obtaining a monitoring information account corresponding to the device type; and dynamically displaying the monitoring information account on the real scene map. By adopting the method, a user does not need to perform manual annotation; the information retrieval is performed according to the device type automatically identified in the picture and a device ID; and the information display is more intelligent and quicker.”
“Attention-based Deep Multiple Instance Learning”, Feb 2018, by Maximilian Ilse et al. (https://arxiv.org/abs/1802.04712), according to its abstract describes: “Multiple instance learning (MIL) is a variation of supervised learning where a single class label is assigned to a bag of instances. In this paper, we state the MIL problem as learning the Bernoulli distribution of the bag label where the bag label probability is fully parameterized by neural networks. Furthermore, we propose a neural network-based permutation-invariant aggregation operator that corresponds to the attention mechanism. Notably, an application of the proposed attention-based operator provides insight into the contribution of each instance to the bag label. We show empirically that our approach achieves comparable performance to the best MIL methods on benchmark MIL datasets and it outperforms other methods on a MNIST-based MIL dataset and two real-life histopathology datasets without sacrificing interpretability.”
Summary of the invention
In order to train a machine learning (ML) model for computer vision, often a training set with a large number of annotated images should be provided. Annotating images manually is a tedious job. Annotating images automatically saves resources and is therefore efficient, but often lacks the accuracy needed for training a ML model when a high reliability of the model is required.
Hence, it is an aspect of the invention to provide an improved and/or alternative method for annotating images which automates the annotating process and preferably further, at least partly, obviates one or more of above-described drawbacks, in particular by increasing the accuracy of the labeled data by automatic annotation.
The method according to the invention allows AI systems to improve over time due to the increasing availability of labelled or annotated data. In many cases it would no longer be necessary to pre-train a ML model for a specific application.
There is provided a method for training a machine learning model to identify a subject having at least one machine readable identifier providing a subject ID, said method comprising:
- providing a computer vision system with an image capturing system comprising at least one image capturing device, and a reader system comprising at least one reader for reading said at least one machine readable identifier;
- defining said machine learning model in said computer vision system;
- capturing an image using said image capturing system, said image showing said subject;
- reading said subject ID using said reader system when capturing said image, and linking said subject ID with said image, said linking providing said image with a linked subject ID, resulting in at least one annotated image;
- capturing at least one further image showing said subject, linking said linked subject ID to said at least one further image providing at least one further annotated image, and
- subjecting said annotated image and said at least one further annotated image to said machine learning model for training said machine learning model.
There is further provided a system for identifying a subject having at least one machine readable identifier providing a subject ID, said system comprising:
- a computer vision system comprising an image capturing system comprising at least one image capturing device, and a reader system comprising at least one reader for reading said at least one machine readable identifier;
- a machine learning model defined in said computer vision system;
said computer vision system in operation:
- capturing an image using said image capturing system, said image showing said subject;
- reading said subject ID using said reader system when capturing said image, and linking said subject ID with said image, said linking providing said image with a linked subject ID, resulting in at least one annotated image;
- capturing at least one further image showing said subject, linking said linked subject ID to said at least one further image providing at least one further annotated image, and
- subjecting said annotated image and said at least one further annotated image to said machine learning model for training said machine learning model.
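In code, the recited steps amount to a capture-read-link loop. The following minimal Python sketch is illustrative only; capture_image and read_subject_ids are hypothetical callables standing in for the image capturing system and the reader system, not part of any existing library.

```python
from dataclasses import dataclass

@dataclass
class AnnotatedImage:
    pixels: bytes            # raw image data from the image capturing device
    subject_ids: list[str]   # subject IDs read when the image was captured

def annotate_once(capture_image, read_subject_ids) -> AnnotatedImage:
    """Capture an image and link it with the subject IDs read by the
    reader system in the same timeframe, yielding one annotated image."""
    image = capture_image()        # image capturing system
    ids = read_subject_ids()       # reader system (RFID, barcode, chip card, ...)
    return AnnotatedImage(pixels=image, subject_ids=ids)

def build_training_set(capture_image, read_subject_ids, n: int) -> list[AnnotatedImage]:
    """Repeat the capture-and-link step so that the image and the at least
    one further image each carry their linked subject IDs."""
    return [annotate_once(capture_image, read_subject_ids) for _ in range(n)]
```

The annotated images collected this way form the training set that is subjected to the machine learning model.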
A subject can be an animal, a person or an object. A product is an example of an object.
A reader is a device for reading machine readable identifiers. A reader can consist of an antenna to receive a signal. Examples of readers are an RFID reader, a barcode scanner/camera, a QR scanner/camera, a chip-and-PIN card reader, a biometric reader (such as for fingerprint and iris recognition) and an audio analyser (for voice and sound recognition).
An image capturing device is a device that can provide an image, in particular a digital image or digital picture. Such a device can comprise a camera or a filming (motion picture) device. Examples are devices comprising a CCD or similar imaging elements. As such, these devices are known to a skilled person.
In order to detect and localize a subject in a scene from a captured image, an embodiment uses a method to detect subjects. Such a method uses machine learning techniques (mainly deep learning) to design and train a model which detects subjects given an input of a visual representation, e.g. an RGB image, as the system perceives it. The model is trained on a large amount of annotated data; this data comprises images with and without subjects, in which the locations of the subjects are annotated.
In the case of deep learning, a detection framework such as Faster-RCNN, SSD, R-FCN, Mask-RCNN, or one of their derivatives can be used. A base model structure can be VGG, AlexNet, ResNet, GoogLeNet, adapted from the previous, or a new one. A model can be initialized with weights trained on similar tasks to improve and speed up the training. Optimizing the weights of a model, in the case of deep learning, can be done with the help of deep learning frameworks such as TensorFlow, Caffe, or MXNet. To train a model, optimization methods such as Adam or RMSprop can be used. Classification loss functions such as Hinge Loss or Softmax Loss can be used. Other approaches which utilize handcrafted features (such as LBP, SIFT, or HOG) and conventional classification methods (such as SVM or Random Forest) can also be used.
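As one concrete instantiation of such a detection setup, a COCO-pretrained Faster-RCNN can be fine-tuned for a two-class subject/background problem. This is a sketch, not the patented method itself; the paragraph above names TensorFlow, Caffe and MXNet, and PyTorch/torchvision is used here merely as an equivalent, widely available framework, with the learning rate chosen arbitrarily.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Faster-RCNN with a ResNet-50 FPN backbone; pretrained weights initialize the model.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
# Replace the box predictor for two classes: background and subject.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, as named above

def train_step(images, targets):
    """images: list of CHW float tensors; targets: list of dicts with
    'boxes' (N x 4 tensor) and 'labels' (N tensor) per image."""
    model.train()
    loss_dict = model(images, targets)   # classification + box-regression losses
    loss = sum(loss_dict.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```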
In an embodiment, after localizing subjects in a scene from captured images, trained multiple instance neural networks (MINN) are used to match the correct subject IDs with subjects.
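A minimal sketch of such an attention-based MIL head, in the spirit of Ilse et al. cited above, could look as follows; the layer sizes are illustrative assumptions, and the module shows only the attention pooling step, not a full ID-matching pipeline.

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Pools a bag of instance features (e.g. detected subject crops in one
    image) into a single bag embedding; the learned attention weights hint
    at which instance carries the bag-level subject ID."""
    def __init__(self, dim: int = 512, hidden: int = 128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, instances: torch.Tensor):
        # instances: (n_instances, dim) feature vectors of one bag (image)
        weights = torch.softmax(self.attention(instances), dim=0)  # (n, 1)
        bag_embedding = (weights * instances).sum(dim=0)           # (dim,)
        return bag_embedding, weights
```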
In an embodiment, after localizing subjects in a scene from retrieved images, a deep neural network (DNN) is trained to compare subjects from different captured images with each other in order to detect similar subjects.
In order to detect similar subjects in different captured images, an embodiment uses machine learning techniques (mainly deep learning) to design and train a model which detects the similarity of subjects, given an input of a visual representation, e.g. RGB images, as the system perceives them. The model is trained on a large amount of annotated data; it comprises images of subjects wherein similar subjects are annotated.
For example, a DNN pretrained on ImageNet, e.g. VGGNet, AlexNet, ResNet, Inception or Xception, can be adapted by taking the convolution layers from these pretrained DNN networks, adding on top of them new layers specially designed for detecting similar subjects, and training the network as described in the previous paragraph.
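A sketch of such an adaptation, assuming a torchvision ResNet-50 as the pretrained backbone (any of the networks named above would do) and a new, hypothetical projection head trained on top of its convolution layers:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class SubjectSimilarityNet(nn.Module):
    """Keeps the pretrained convolutional layers and adds a new embedding
    head; crops of the same subject should map to nearby embeddings."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        backbone = torchvision.models.resnet50(weights="DEFAULT")
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop final fc
        self.head = nn.Linear(2048, embed_dim)  # newly added, trained from scratch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.features(x).flatten(1)          # (batch, 2048)
        return F.normalize(self.head(f), dim=1)  # unit-length embeddings

def similarity(model: SubjectSimilarityNet, crop_a, crop_b) -> torch.Tensor:
    """Cosine similarity in [-1, 1]; near 1 means 'likely the same subject'."""
    return (model(crop_a) * model(crop_b)).sum(dim=1)
```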
In case similar subjects are detected with sufficient reliability, the subject in the different captured images is automatically annotated with one or more subject IDs which are consistent with the subject IDs retrieved by a reader system for the captured images. For example, if a similar subject is detected in both captured image A and captured image B while for these images multiple subject IDs have been retrieved, then the similar subject in both image A and image B will automatically be annotated with the intersection of the subject IDs of image A and the subject IDs of image B.
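This ID-narrowing step is plain set intersection; the toy numbers below mirror the cow example of FIGs 3A-3C and are illustrative only.

```python
def ids_for_matched_subject(ids_image_a: set, ids_image_b: set) -> set:
    """A subject detected as similar in both images can only carry a
    subject ID that was read for both images: the intersection."""
    return ids_image_a & ids_image_b

# Image A was captured while IDs {11, 12, 13} were read, image B while
# {13, 14, 15} were read; a subject appearing in both must be ID 13.
assert ids_for_matched_subject({11, 12, 13}, {13, 14, 15}) == {13}
```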
In an embodiment, the method further comprises providing said subject with said machine readable identifier providing a subject ID.
In an embodiment, when capturing said at least one further image, a further subject ID is read using said reader system and said further subject ID is linked to said at least one further image.
In an embodiment, said annotated image and said at least one further annotated image are included in a training dataset that is built during performing said method, and said training dataset is used for at least one of training and additionally training said machine learning model.
In an embodiment, the machine learning model comprises a machine learning model part for localizing subjects in at least one of said captured image and said captured at least one further image.
In an embodiment, the reader system comprises at least a first reader and a second reader, wherein said first reader reads said subject ID when said image is captured, and said second reader reads said subject ID when said at least one further image is captured.
In this respect, “when” can be in a timeframe around said capturing such that it is ensured that the image shows the subject of the subject ID.
In an embodiment, the subject comprises at least a first and a second machine readable identifier, said first reader reads said first machine readable identifier for providing said subject ID, and said second reader reads said second machine readable identifier for providing said subject ID.
In an embodiment, the first and second reader and said first and a second machine readable identifier are of a different type, wherein said first and second reader provide a first and second identifier, and in particular said vision system provides said subject ID from said first and second identifier. For instance, the first reader is an RFID reader and the second reader is a chip card reader.
In an embodiment, reading of at least one selected from said linked subject ID and a further subject ID is repeated.
In an embodiment, the capturing of said at least one further image and said linking of said linked subject ID to said at least one further image are continuously repeated, providing a series of said at least one further annotated image; in particular, said capturing is repeated when there are one or more subjects in a field of view of said image capturing system.
In an embodiment, the capturing of said at least one further image is continuously repeated, and said reader system repeats reading said subject ID when said at least one further image is captured, each time providing a renewed subject ID, linking said renewed subject ID with said at least one further image, said linking providing said at least one further image with a linked subject ID, resulting in at least one further annotated image, for providing a series of annotated images.
In an embodiment, the annotating images is continued until a predetermined reliability level for identifying said subject in an image is reached.
In an embodiment, the method further is for training a machine learning model to identify a plurality of subjects each having at least one machine readable identifier providing a subject ID for each subject, wherein said reader system reads said machine readable identifiers of at least part of said plurality of subjects, providing a series of subject IDs, said image capturing system captures said image with said at least part of said plurality of subjects, and links said image with said at least part of said plurality of subjects with said series of subject IDs, providing said annotated image.
In an embodiment, the image capturing system captures said at least one further image with said at least part of said plurality of subjects, and links said image with said at least part of said plurality of subjects with said series of subject IDs, providing said annotated image.
The method is in an embodiment further provided for training a machine-learning model to identify an animal among a group of animals, in particular a livestock animal amidst a group of livestock animals, using the method described above.
There is further provided a computer program product for running on a data processor on a computer vision system, wherein said computer program product, when running on said data processor, enables said computer vision system to perform the method described above.
The term “statistically” when used herein, relates to dealing with the collection, analysis, interpretation, presentation, and organization of data. In particular, it comprises modelling behavior of a population. Using probability distributions, a probability of optimizing transmission reliability is calculated and predicted.
The term “substantially” herein, such as in “substantially all emission” or in “substantially consists”, will be understood by the person skilled in the art. The term “substantially” may also include embodiments with “entirely”, “completely”, “all”, etc. Hence, in embodiments the adjective substantially may also be removed. Where applicable, the term “substantially” may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%. The term “comprise” includes also embodiments wherein the term “comprises” means “consists of”.
The term functionally will be understood by, and be clear to, a person skilled in the art. The term “substantially” as well as “functionally” may also include embodiments with “entirely”, “completely”, “all”, etc. Hence, in embodiments the adjective functionally may also be removed. When used, for instance in “functionally parallel”, a skilled person will understand that the adjective “functionally” includes the term substantially as explained above. Functionally in particular is to be understood to include a configuration of features that allows these features to function as if the adjective “functionally” was not present. The term “functionally” is intended to cover variations in the feature to which it refers, and which variations are such that in the functional use of the feature, possibly in combination with other features it relates to in the invention, that combination of features is able to operate or function. For instance, if an antenna is functionally coupled or functionally connected to a communication device, electromagnetic signals that are received by the antenna can be used by the communication device. The word “functionally” as for instance used in “functionally parallel” is used to cover exactly parallel, but also the embodiments that are covered by the word “substantially” explained above. For instance, “functionally parallel” relates to embodiments that in operation function as if the parts are for instance parallel. This covers embodiments for which it is clear to a skilled person that it operates within its intended field of use as if it were parallel.
Furthermore, the terms first, second, third and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
The devices or apparatus herein are amongst others described during operation. As will be clear to the person skilled in the art, the invention is not limited to methods of operation or devices in operation.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. Use of the verb to comprise and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. The article a or an preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the device or apparatus claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The invention further applies to an apparatus or device comprising one or more of the characterizing features described in the description and/or shown in the attached drawings. The invention further pertains to a method or process comprising one or more of the characterizing features described in the description and/or shown in the attached drawings.
The various aspects discussed in this patent can be combined in order to provide additional advantages. Furthermore, some of the features can form the basis for one or more divisional applications.
Brief description of the drawings
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying schematic drawings in which corresponding reference symbols indicate corresponding parts, and in which:
FIG 1 schematically depicts an embodiment for training a machine learning model to identify products labeled with a barcode;
FIGs 2A-2C schematically depict an embodiment for training a machine learning model to identify cows earmarked with a RFID chip;
FIGs 3A-3C schematically depict another embodiment for training a machine learning model to identify cows earmarked with a RFID chip;
FIGs 4A-4C schematically depict an embodiment for training a machine learning model to identify travelers using a boarding pass, and
FIGs 5A-5C schematically depict an embodiment for training a machine learning model to identify a woman identifying herself at different locations.
The drawings are not necessarily to scale.
Description of preferred embodiments
FIG 1 schematically depicts an embodiment in a warehouse 106 for training a machine learning model 9''', defined in a computer vision system 99, to identify products 10 labeled with a barcode as subject ID. The computer vision system 99 is operationally coupled with scanner 5 and cameras 1 and 1'. The barcodes of the products 10 are scanned by scanner 5 and the cameras 1 and 1' capture images of the products 10. An annotated image of product 10' captured by camera 1, comprising a scanned barcode of product 10', is subjected to machine learning model 9'''. A further annotated image of product 10' captured by camera 1', comprising a scanned barcode of product 10', is also subjected to machine learning model 9'''. Product 10' in the captured images is automatically labeled or annotated with a unique subject ID belonging to its barcode. Machine learning model 9''', trained in this way, can thus be applied to identify product 10'.
FIGs 2A-2C schematically depict an embodiment, at farmyards 101 and 103, for training a machine learning model 9', defined in a computer vision system, to identify cow 13 among cows 14 and 15. Cow 13 is earmarked with an RFID chip 23, cow 14 with an RFID chip 24 and cow 15 with an RFID chip 25. The signals 33, 34 and 35, belonging respectively to the RFID chips 23, 24 and 25, comprise unique subject IDs for cows 13, 14 and 15 respectively. Antennas 3 and 3' are operationally coupled to an RFID reader. The RFID reader and cameras 1 and 1' are operationally coupled to the computer vision system.
In FIG 2A, the three cows 13, 14 and 15 are grouped at farmyard 101. The signals 33, 34 and 35 are received by antenna 3. Camera 1 captured an image of the cows 13, 14 and 15. An annotated image 201'' (FIG 2C) captured by camera 1, comprising the subject IDs of cows 13, 14 and 15, is subjected to the machine learning model 9'.
In FIG 2B, cow 13 is eating at a cratch 8 in a designated area at farmyard 103. The signal 33 is received by antenna 3. Camera 1' captured a further image of cow 13. A further annotated image 203 (FIG 2C) captured by camera 1', comprising the unique subject ID of cow 13, is subjected to the machine learning model 9'.
In FIG 2C, cow 13 in the annotated image 201'' and cow 13 in the annotated image 203 are thus automatically labeled or annotated with the unique subject ID belonging to RFID chip 23 (marked with an arrow), which is in the intersection of the subject IDs of annotated images 201'' and 203. Machine learning model 9', trained in this way, can thus be applied to identify cow 13 in an image.
In practice, the computer vision system will continuously capture images of one or more cows and read subject IDs. These will be automatically linked to provide annotated images and applied to the machine learning model 9’. In this way, the machine learning model 9’ can be (additionally) trained and improved. If the machine learning model 9’ qualifies the annotated image as being below a predefined threshold, the annotated image may be disregarded in the training process, and/or the annotated image may even be removed from the system.
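In pseudo-Python, that continuous regime could look as follows; capture, read_ids, annotation_quality and train_step are hypothetical callables, and the threshold value is an arbitrary illustration of the predefined threshold mentioned above.

```python
def continuous_annotation(capture, read_ids, annotation_quality, train_step,
                          threshold: float = 0.8):
    """Continuously capture images, link them with the subject IDs read at
    capture time, and feed them to the model; annotated images scored
    below the predefined threshold are disregarded."""
    while True:
        image = capture()
        ids = read_ids()
        if not ids:              # no subject in the field of view
            continue
        if annotation_quality(image, ids) < threshold:
            continue             # disregard (or remove) the annotated image
        train_step(image, ids)   # (additionally) train the machine learning model
```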
FIGs 3A-3C schematically depict an embodiment, at a farmyard 101, for training a machine learning model 9, defined in a computer vision system, to identify cow 13 among cows 11, 12, 13, 14 and 15. Cow 11 is earmarked with an RFID chip 21, cow 12 with an RFID chip 22, cow 13 with an RFID chip 23, cow 14 with an RFID chip 24 and cow 15 with an RFID chip 25. The signals 31, 32, 33, 34 and 35, belonging respectively to the RFID chips 21, 22, 23, 24 and 25, comprise unique subject IDs for cows 11, 12, 13, 14 and 15 respectively. Antenna 3 is operationally coupled to an RFID reader. The RFID reader and camera 1 are operationally coupled to the computer vision system.
In FIG 3A, the three cows 11, 12 and 13 are grouped at a farmyard 101. The signals 31, 32 and 33 are received by antenna 3. Camera 1 captured an image of the cows 11, 12 and 13. An annotated image 201 (FIG 3C) captured by camera 1, comprising the subject IDs of cows 11, 12 and 13, is subjected to the machine learning model 9.
In FIG 3B, the three cows 13, 14 and 15 are grouped at a farmyard 101. The signals 33, 34 and 35 are received by antenna 3. Camera 1 captured a further image of the cows 13, 14 and 15. A further annotated image 201' (FIG 3C) captured by camera 1, comprising the subject IDs of cows 13, 14 and 15, is subjected to the machine learning model 9.
In FIG 3C, cow 13 in the annotated image 201 and cow 13 in the annotated image 201' are thus automatically labeled or annotated with the unique subject ID belonging to RFID chip 23 (marked with an arrow), which is in the intersection of the subject IDs of annotated images 201 and 201'. Machine learning model 9, trained in this way, can thus be applied to identify cow 13 in an image.
The RFID chip can either be active or passive.
FIGs 4A-4C schematically depict an embodiment, at airport halls 104 and 105, for training a machine learning model 9'', defined in a computer vision system, to identify a person 16 among a crowd. Person 16 is carrying a chip card 26. The chip card 26 comprises a unique subject ID for person 16. Chip card reader 4 and camera 1 are operationally coupled to the computer vision system.
In FIG 4A, person 16 is in the process of entering the airport in airport hall 104 by unlocking turnstile 7, putting his chip card 26 in card reader 4. Camera 1 captured an image of person 16. An annotated image 204 (FIG 4C) captured by camera 1, comprising the subject ID of person 16, is subjected to the machine learning model 9''.
In FIG 4B, person 16 is walking in an airport hall 105. Camera 1 captured an image of person 16. An image 205 (FIG 4C) captured by camera 1 is subjected to the machine learning model 9''.
In FIG 4C, person 16 in the annotated image 204 and person 16 in the image 205 are automatically labeled or annotated with the unique subject ID belonging to chip card 26, since person 16 in annotated image 204 and image 205 are detected as likely similar. Machine learning model 9'', trained in this way, can thus be applied to identify person 16 in an image.
FIGs 5A-5C schematically depict an embodiment for training a machine learning model, defined in a computer vision system, to identify a woman 17 identifying herself at different locations 107, 108 and 109, in various situations. Turnstile 7' with fingerprint reader 4', ATM cash machine 6 with a bank card reader, ID card reader 4'' and image capturing device 1 are operationally coupled to the computer vision system.
In FIG 5A, woman 17 in an office entrance 107 identifies herself at turnstile 7' by putting her finger 27 on a fingerprint reader 4' while image capturing device 1 captures at least one image of her.
In FIG 5B, woman 17, in a designated area 108, withdraws cash from an ATM cash machine 6 with a bank card reader, and identifies herself by a bank card 27' and by typing her PIN code on the ATM cash machine while image capturing device 1 captures at least one image of her.
In FIG 5C, woman 17 in a town hall 109 identifies herself at a counter by showing her ID card 27'' to an ID card reader 4'' while image capturing device 1 captures at least one image of her.
It will also be clear that the above description and drawings are included to illustrate some embodiments of the invention, and not to limit the scope of protection.
Starting from this disclosure, many more embodiments will be evident to a skilled person. These embodiments are within the scope of protection and the essence of this invention and are obvious combinations of prior art techniques and the disclosure of this patent.

Claims (17)

1. A method for training a machine learning model to identify a subject having at least one machine-readable identifier providing a subject ID, the method comprising:
- providing a computer vision system with an image capturing system comprising at least one image capturing device, and a reader system comprising at least one reader for reading the at least one machine-readable identifier;
- defining the machine learning model in the computer vision system;
- capturing an image using the image capturing system, the image showing the subject;
- reading the subject ID using the reader system when capturing the image, and linking the subject ID with the image, the linking providing the image with a linked subject ID, resulting in at least one annotated image;
- capturing at least one further image showing the subject, linking the linked subject ID to the at least one further image, providing at least one further annotated image, and
- subjecting the annotated image and the at least one further annotated image to the machine learning model for training the machine learning model.
2. The method according to claim 1, further comprising providing the subject with the machine-readable identifier providing the subject ID.
3. The method according to claim 1 or 2, wherein, when capturing the at least one further image, a further subject ID is read using the reader system and the further subject ID is linked to the at least one further image.
4. The method according to any of the preceding claims, wherein the annotated image and the at least one further annotated image are included in a training dataset that is built up while performing the method, and the training dataset is used for at least one selected from training and additionally training the machine learning model.
5. The method according to any of the preceding claims, wherein the machine learning model comprises a machine learning model part for localizing subjects in at least one selected from the captured image and the at least one further captured image.
6. The method according to any of the preceding claims, wherein the reader system comprises at least a first reader and a second reader, wherein the first reader reads the subject ID when the image is captured, and the second reader reads the subject ID when the at least one further image is captured.
7. The method according to any of the preceding claims when dependent on claim 6, wherein the subject comprises at least a first and a second machine-readable identifier, the first reader reads the first machine-readable identifier for providing the subject ID, and the second reader reads the second machine-readable identifier for providing the subject ID.
8. The method according to any of the preceding claims, wherein the first and second reader and the first and second machine-readable identifier are of a different type, wherein the first and second reader provide a first and a second identifier, and in particular the vision system provides the subject ID from the first and second identifier.
9. The method according to any of the preceding claims, wherein reading of at least one selected from the linked subject ID and a further subject ID is repeated.
10. The method according to any of the preceding claims, wherein the capturing of the at least one further image and the linking of the linked subject ID to the at least one further image are continuously repeated, providing a series of the at least one further annotated image; in particular, the capturing is repeated when there are one or more subjects in a field of view of the image capturing system.
11. The method according to any of the preceding claims, wherein the capturing of the at least one further image is continuously repeated and the reader system repeats reading the subject ID when one of the at least one further image is captured, each time providing a renewed subject ID, linking the renewed subject ID with the at least one further image, the linking providing the at least one further image with a linked subject ID, resulting in at least one further annotated image, for providing a series of annotated images.
12. The method according to any of the preceding claims, wherein annotating images is continued until a predetermined reliability level for identifying the subject in an image is reached.
13. The method according to any of the preceding claims, for training a machine learning model to identify a plurality of subjects each having at least one machine-readable identifier providing a subject ID for each subject, wherein the reader system reads the machine-readable identifiers of at least part of the plurality of subjects, providing a series of subject IDs, wherein the image capturing system captures the image with the at least part of the plurality of subjects, and links the image with the at least part of the plurality of subjects with the series of subject IDs, providing the annotated image.
14. The method according to claim 13, wherein the image capturing system captures the at least one further image with the at least part of the plurality of subjects, and links the image with the at least part of the plurality of subjects with the series of subject IDs, for providing the annotated images.
15. A method for training a machine learning model to identify an animal among a group of animals, in particular a livestock animal amidst a group of livestock animals, using the method according to any of the preceding claims.
16. A system for identifying a subject having at least one machine-readable identifier providing a subject ID, the system comprising:
- a computer vision system comprising an image capturing system comprising at least one image capturing device, and a reader system comprising at least one reader for reading the at least one machine-readable identifier;
- a machine learning model defined in the computer vision system;
the computer vision system in operation:
- capturing an image using the image capturing system, the image showing the subject;
- reading the subject ID using the reader system when capturing the image, and linking the subject ID with the image, the linking providing the image with a linked subject ID, resulting in at least one annotated image;
- capturing at least one further image showing the subject, wherein linking the linked subject ID with the at least one further image provides the at least one further annotated image, and
- subjecting the annotated image and the at least one further annotated image to the machine learning model for training the machine learning model.
17. A computer program product for running on a data processor on a computer vision system, wherein the computer program product, when running on the data processor, enables the computer vision system to perform the method according to any of the preceding claims 1-14.
NL2021481A 2018-08-17 2018-08-17 A method for automatically annotating and identifying a living being or an object with an identifier, such as RFID, and computer vision. NL2021481B1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
NL2021481A NL2021481B1 (en) 2018-08-17 2018-08-17 A method for automatically annotating and identifying a living being or an object with an identifier, such as RFID, and computer vision.
US17/042,063 US11308358B2 (en) 2018-08-17 2019-08-15 Method and system for automatically annotating and identifying a living being or an object with an identifier providing a subject identification
EP19756032.9A EP3740894A1 (en) 2018-08-17 2019-08-15 A method and system for automatically annotating and identifying a living being or an object with an identifier providing a subject identification
PCT/NL2019/050533 WO2020036490A1 (en) 2018-08-17 2019-08-15 A method and system for automatically annotating and identifying a living being or an object with an identifier providing a subject identification
US17/659,574 US11961320B2 (en) 2018-08-17 2022-04-18 Method and system for automatically annotating and identifying a living being or an object with an identifier providing a subject identification
US18/601,933 US20240212385A1 (en) 2018-08-17 2024-03-11 Method and system for automatically annotating and identifying a living being or an object with an identifier providing a subject identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
NL2021481A NL2021481B1 (en) 2018-08-17 2018-08-17 A method for automatically annotating and identifying a living being or an object with an identifier, such as RFID, and computer vision.

Publications (1)

Publication Number Publication Date
NL2021481B1 true NL2021481B1 (en) 2020-02-24

Family

ID=64427159

Family Applications (1)

Application Number Title Priority Date Filing Date
NL2021481A NL2021481B1 (en) 2018-08-17 2018-08-17 A method for automatically annotating and identifying a living being or an object with an identifier, such as RFID, and computer vision.

Country Status (1)

Country Link
NL (1) NL2021481B1 (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030164878A1 (en) * 1998-10-27 2003-09-04 Hitoshi Iizaka Method of and device for aquiring information on a traffic line of persons
EP1212939A1 (en) * 2000-12-08 2002-06-12 N.V. Nederlandsche Apparatenfabriek NEDAP Farm management system provided with cameras for monitoring animals on the farm
US20070086626A1 (en) 2003-10-08 2007-04-19 Xid Technologies Pte Ltd Individual identity authentication systems
US8380558B1 (en) 2006-12-21 2013-02-19 Videomining Corporation Method and system for analyzing shopping behavior in a store by associating RFID data with video-based behavior and segmentation data
WO2013085985A1 (en) * 2011-12-06 2013-06-13 Google Inc. System and method of identifying visual objects
WO2015149610A1 (en) * 2014-04-03 2015-10-08 Beijing Zhigu Rui Tuo Tech Co., Ltd Association methods and association devices
CN107066605A (en) 2017-04-26 2017-08-18 国家电网公司 Facility information based on image recognition has access to methods of exhibiting automatically
US10025950B1 (en) * 2017-09-17 2018-07-17 Everalbum, Inc Systems and methods for image recognition

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
J. JEON ET AL., AUTOMATIC IMAGE ANNOTATION AND RETRIEVAL USING CROSS-MEDIA RELEVANCE MODEL, July 2003 (2003-07-01), Retrieved from the Internet <URL:http://hpds.ee.kuas.edu.tw/download/parallel_processing/97/97present/20081226/Automatic%20Image%20Annotation%20and%20Retrieval%20using.pdf>
MAXIMILIAN ILSE ET AL., ATTENTION-BASED DEEP MULTIPLE INSTANCE LEARNING, February 2018 (2018-02-01), Retrieved from the Internet <URL:https://arxiv.org/abs/1802.04712>
TIBERIO URICCHIO ET AL., AUTOMATIC IMAGE ANNOTATION VIA LABEL TRANSFER IN THE SEMANTIC SPACE, May 2016 (2016-05-01), Retrieved from the Internet <URL:https://arxiv.org/abs/1605.04770>

Similar Documents

Publication Publication Date Title
US11961320B2 (en) Method and system for automatically annotating and identifying a living being or an object with an identifier providing a subject identification
US8478048B2 (en) Optimization of human activity determination from video
Yang et al. Multi-object tracking with discriminant correlation filter based deep learning tracker
Li et al. Clothing attributes assisted person reidentification
US20160350336A1 (en) Automated image searching, exploration and discovery
Hoai et al. Improving human action recognition using score distribution and ranking
Patil et al. A perspective view of cotton leaf image classification using machine learning algorithms using WEKA
Nguyen et al. Inductive and transductive few-shot video classification via appearance and temporal alignments
Mar et al. Cow detection and tracking system utilizing multi-feature tracking algorithm
EP3002710A1 (en) System and method for object re-identification
Layne et al. Re-id: Hunting Attributes in the Wild.
Yaghoubi et al. SSS-PR: A short survey of surveys in person re-identification
CN114419363B (en) Target classification model training method and device based on unlabeled sample data
Kaur et al. Cattle identification system: a comparative analysis of SIFT, SURF and ORB feature descriptors
Singh Machine learning in pattern recognition
Matzen et al. Bubblenet: Foveated imaging for visual discovery
Roth et al. On the exploration of joint attribute learning for person re-identification
NL2021481B1 (en) A method for automatically annotating and identifying a living being or an object with an identifier, such as RFID, and computer vision.
Bebawy et al. Active shape model vs. deep learning for facial emotion recognition in security
Gao et al. UD-YOLOv5s: Recognition of cattle regurgitation behavior based on upper and lower jaw skeleton feature extraction
Bhoir et al. FIODC Architecture: the architecture for fashion image annotation
Mazo et al. Evaluation of two computer vision approaches for grazing dairy cow identification
Dai et al. Object detection based on visual memory: a feature learning and feature imagination process
Andrade et al. Dog Face Recognition Using Deep Features Embeddings
Verma et al. Age and gender prediction using deep learning framework