
WO2020147430A1 - Method, device, apparatus and medium for displaying commodities based on image recognition

Info

Publication number
WO2020147430A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
recognized
target
images
video data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/120988
Other languages
English (en)
Chinese (zh)
Inventor
罗琳耀
徐国强
邱寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
Original Assignee
OneConnect Smart Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Smart Technology Co Ltd filed Critical OneConnect Smart Technology Co Ltd
Publication of WO2020147430A1 publication Critical patent/WO2020147430A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions

Definitions

  • This application relates to the field of intelligent decision-making, and in particular to a method, device, apparatus and medium for displaying commodities based on image recognition.
  • The embodiments of the present application provide a method, device, apparatus and medium for displaying commodities based on image recognition, to solve the problem that displayed commodities are insufficiently attractive.
  • A product display method based on image recognition, including:
  • obtaining target video data of the commodity to be displayed, where the target video data includes at least two frames of images to be recognized;
  • using a face detection model to perform face recognition and clustering on the at least two frames of images to be recognized, to obtain the number of customers corresponding to the target video data and the image cluster set corresponding to each customer, where each image cluster set includes at least one frame of image to be recognized;
  • using a pre-trained micro-expression recognition model to recognize the images to be recognized in each image cluster set, to obtain the single-frame emotion of each image to be recognized;
  • a product display device based on image recognition including:
  • a data acquisition module for acquiring target video data of a commodity to be displayed, where the target video data includes at least two frames of images to be recognized;
  • the image cluster set acquisition module is used to perform face recognition and clustering on at least two frames of images to be recognized by using a face detection model, to acquire the number of customers corresponding to the target video data and the image cluster set corresponding to each customer, where each image cluster set includes at least one frame of image to be recognized;
  • the single-frame emotion determination module is configured to, if the number of customers is greater than the preset number, use a pre-trained micro-expression recognition model to recognize the images to be recognized in each image cluster set, and obtain the single-frame emotion of each image to be recognized;
  • a target emotion obtaining module configured to obtain the target emotion of the customer corresponding to the image cluster set based on the single frame emotion of at least one frame of the image to be recognized;
  • the final emotion obtaining module is used to obtain the final emotion according to the number of customers and the target emotion of the customer corresponding to each image cluster set;
  • the target display product obtaining module is configured to obtain the target display product according to the final emotion corresponding to the product to be displayed.
  • a computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor.
  • when the processor executes the computer-readable instructions, the following steps are implemented:
  • obtaining target video data of the commodity to be displayed, where the target video data includes at least two frames of images to be recognized;
  • using a face detection model to perform face recognition and clustering on the at least two frames of images to be recognized, to obtain the number of customers corresponding to the target video data and the image cluster set corresponding to each customer, where each image cluster set includes at least one frame of image to be recognized;
  • using a pre-trained micro-expression recognition model to recognize the images to be recognized in each image cluster set, to obtain the single-frame emotion of each image to be recognized;
  • One or more readable storage media storing computer-readable instructions, where, when the computer-readable instructions are executed by one or more processors,
  • the one or more processors perform the following steps:
  • obtaining target video data of the commodity to be displayed, where the target video data includes at least two frames of images to be recognized;
  • using a face detection model to perform face recognition and clustering on the at least two frames of images to be recognized, to obtain the number of customers corresponding to the target video data and the image cluster set corresponding to each customer, where each image cluster set includes at least one frame of image to be recognized;
  • using a pre-trained micro-expression recognition model to recognize the images to be recognized in each image cluster set, to obtain the single-frame emotion of each image to be recognized;
  • FIG. 1 is a schematic diagram of an application environment of a commodity display method based on image recognition in an embodiment of the present application
  • FIG. 2 is a flowchart of a method for displaying products based on image recognition in an embodiment of the present application
  • FIG. 3 is a flowchart of a method for displaying products based on image recognition in an embodiment of the present application
  • FIG. 4 is a flowchart of a method for displaying products based on image recognition in an embodiment of the present application
  • FIG. 5 is a flowchart of a method for displaying products based on image recognition in an embodiment of the present application
  • FIG. 6 is a flowchart of a method for displaying products based on image recognition in an embodiment of the present application
  • FIG. 7 is a flowchart of a method for displaying products based on image recognition in an embodiment of the present application.
  • FIG. 8 is a functional block diagram of a product display device based on image recognition in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a computer device in an embodiment of the present application.
  • the product display method based on image recognition provided by the embodiment of the present application can be applied in the application environment as shown in FIG. 1, wherein the client communicates with the server through the network.
  • the product display method based on image recognition is applied on the server: the target video data corresponding to each product to be displayed is analyzed and recognized, the emotion corresponding to each customer in the target video data is obtained, and each customer's final emotion toward the product to be displayed is determined according to the emotion corresponding to that customer;
  • the target display product is then determined according to the final emotion corresponding to each product to be displayed, so that the target display product is a product that customers pay more attention to, improving the attractiveness of the target display product and attracting customers to purchase the displayed products.
  • the client can be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • a product display method based on image recognition is provided.
  • the method is applied to the server in FIG. 1 as an example for description, which specifically includes the following steps:
  • S10 Obtain target video data of the commodity to be displayed, where the target video data includes at least two frames of images to be recognized.
  • the target video data refers to the video data obtained by filtering the initial video data corresponding to the commodity to be displayed, and may specifically be video data that meets some conditions. For example, video data satisfying the attribute information of the product to be displayed.
  • the attribute information may include appropriate age and appropriate gender.
  • the collected initial video data corresponding to the commodity to be displayed is filtered by suitable age and suitable gender to obtain the target video data.
  • the image to be recognized refers to an image that is screened according to a suitable age and a suitable gender.
  • the initial video data is the video data corresponding to each commodity to be displayed collected by the video collection tool.
  • a video collection tool is configured in advance for each area where the commodity to be displayed is located.
  • the video collection tool is used to collect images or video data.
  • when the video collection tool detects that a customer appears within its collection range, the video collection tool automatically triggers and collects image or video data of the customer.
  • the video collection tool is specifically a camera, through which the initial video data within the collection range corresponding to each commodity to be displayed can be collected in real time. Since each collection tool corresponds to one product to be displayed, the initial video data of each product's display area is collected through its camera, and that initial video data is filtered to obtain the target video data corresponding to each product to be displayed.
  • the collected initial video data carries a product identifier corresponding to the product to be displayed, so that the corresponding target video data can be determined subsequently through the product identifier.
  • for example, if the initial video data collected by the video collection tool carries product identifier A, that initial video data corresponds to product identifier A, and filtering it yields the target video data corresponding to product identifier A.
  • the product identifier refers to a unique identifier used to distinguish different products to be displayed.
  • the product identification may consist of at least one of numbers, letters, words or symbols.
  • the product identifier may be the serial number or serial number of the product to be displayed.
  • S20 Use the face detection model to perform face recognition and clustering on the at least two frames of images to be recognized, and obtain the number of customers corresponding to the target video data and the image cluster set corresponding to each customer, where each image cluster set includes at least one frame of image to be recognized.
  • the face detection model refers to a pre-trained model used to detect whether each frame of the image to be recognized contains a human face area.
  • the number of customers refers to the number determined according to different customers in the target video data.
  • the image cluster set is formed by clustering the images to be recognized that correspond to the same customer.
  • the server is connected to the database over the network, and the face detection model is stored in the database. The target video data contains images to be recognized corresponding to different customers, and the face detection model is used to recognize each image to be recognized and obtain the images that contain a face, where a face image is an image of a customer's face area.
  • the server inputs each image to be recognized in the acquired target video data into the face detection model, performs face recognition on each image through the model to determine whether it is a face image, and, if it is, clusters the images to be recognized that correspond to the same face to obtain the image cluster set corresponding to each customer; the number of customers in the target video data is determined by the number of image cluster sets.
  • specifically, a feature extraction algorithm is used to extract the facial features of the face image corresponding to each image to be recognized, and feature similarity is calculated between the facial features of different images. If the feature similarity is greater than a preset threshold, the corresponding face images belong to the same customer, and the images to be recognized for that customer are clustered into one image cluster set; the number of customers is then the number of image cluster sets. The preset threshold is a value, set in advance, for judging whether the similarity is high enough to conclude that two faces belong to the same customer.
  • Feature extraction algorithms include, but are not limited to, CNN (Convolutional Neural Network, convolutional neural network) algorithms.
  • the CNN algorithm can be used to extract the facial features of the facial image corresponding to the image to be recognized.
  • the micro-expression recognition model captures the local features of the customer's face in the image to be recognized, identifies each target facial action unit of the face according to those local features, and then determines the emotion according to the recognized target facial action units.
  • a single-frame emotion is the emotion obtained by applying the micro-expression recognition model to one image to be recognized, determined from the recognized target facial action units.
  • the micro-expression recognition model may be a neural network recognition model based on deep learning, a local recognition model based on classification, or a local emotion recognition model based on local binary pattern (LBP).
  • the micro-expression recognition model is a partial recognition model based on classification.
  • the micro-expression recognition model is trained in advance: a large amount of training image data is collected, containing positive samples and negative samples of each facial action unit, and the training image data is trained with a classification algorithm to obtain the micro-expression recognition model.
  • a large amount of training image data may be trained by an SVM classification algorithm to obtain SVM classifiers corresponding to N facial action units.
  • the micro-expression recognition model is formed through N SVM classifiers, and the more SVM classifiers obtained, the more accurate the emotions recognized by the formed micro-expression recognition model.
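  • As a hedged illustration of the training setup just described, the sketch below trains one binary SVM per facial action unit with scikit-learn; the function name and data layout are illustrative assumptions, not the application's actual implementation.

```python
# Minimal sketch: one binary SVM classifier per facial action unit (AU),
# trained on positive and negative samples of that AU, as described above.
from sklearn.svm import SVC

def train_au_classifiers(samples_by_au):
    """samples_by_au maps an AU name to (features, labels), where `features`
    is an (n_samples, n_features) array and `labels` holds 1 for positive
    samples of that AU and 0 for negative samples."""
    classifiers = {}
    for au, (features, labels) in samples_by_au.items():
        clf = SVC(kernel="linear", probability=True)  # probabilities are needed later
        clf.fit(features, labels)
        classifiers[au] = clf
    return classifiers
```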
  • the preset number is a preset value, and the preset number corresponding to each commodity to be displayed is the same.
  • the server determines that the number of customers is greater than the preset number, and uses a pre-trained micro-expression recognition model to recognize the images to be recognized in each image cluster set, obtaining the single-frame emotion corresponding to each image to be recognized.
  • Using a pre-trained micro-expression recognition model to recognize the images to be recognized in each image cluster set specifically includes the following steps.
  • the server first performs face key point detection and feature extraction on the images to be recognized to obtain the corresponding local features, and then inputs the local features corresponding to each image to be recognized into the pre-trained micro-expression recognition model.
  • the micro-expression recognition model includes N SVM classifiers; each SVM classifier recognizes one local feature corresponding to an image to be recognized, and all input local features are recognized by the N SVM classifiers to obtain N probability values.
  • each facial action unit whose probability value is greater than the preset probability threshold is determined to be a target facial action unit corresponding to the image to be recognized.
  • the preset probability threshold is a preset value.
  • the target facial action unit refers to the facial action unit (Action Unit, AU) obtained by recognizing the image to be recognized according to the micro-expression recognition model.
  • the micro-expression recognition model includes 54 SVM classifiers, and a facial action unit number mapping table is established, and each facial action unit is represented by a predetermined number.
  • AU1 means the inner eyebrows are raised,
  • AU2 means the outer eyebrows are raised,
  • AU5 means the upper eyelids are raised, and
  • AU26 means the lower jaw drops.
  • Each facial action unit has a corresponding SVM classifier trained.
  • the SVM classifier corresponding to the inner eyebrows outputs the probability that a local feature belongs to raised inner eyebrows, and
  • the SVM classifier corresponding to the outer eyebrows outputs the probability that a local feature belongs to raised outer eyebrows.
  • the probability value is a value between 0 and 1. If the output probability value is 0.6 and the preset probability threshold is 0.5, then since 0.6 is greater than 0.5, the facial action unit corresponding to that probability is taken as a target facial action unit of the image to be recognized.
  • the server can determine the single-frame emotion of each image to be recognized according to its target facial action units, that is, it queries the evaluation table with the target facial action units corresponding to the image and obtains the single-frame emotion corresponding to that image.
  • the 54 SVM classifiers are used to recognize the local features, all target facial action units corresponding to the image to be recognized are determined, and the evaluation table is looked up based on all target facial action units to determine the single-frame emotion corresponding to the image, improving the accuracy of the customer's single-frame emotion.
  • the evaluation table is a pre-configured table. One or more facial action units combine to form different emotions; the facial action unit combination corresponding to each emotion is obtained in advance, and each combination is stored in association with its emotion to form the evaluation table.
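  • A minimal sketch of this lookup, assuming the table is kept as a mapping from AU combinations to emotions; the two entries follow the AU12/AU6/AU7 (joy) and AU9/AU10/AU17/AU24 (disgust) examples given later in this document and are illustrative only.

```python
# Evaluation table: each facial action unit combination maps to an emotion.
EVALUATION_TABLE = {
    frozenset({"AU6", "AU7", "AU12"}): "joy",
    frozenset({"AU9", "AU10", "AU17", "AU24"}): "disgust",
}

def single_frame_emotion(target_aus):
    """Look up the emotion whose stored AU combination matches the target
    facial action units recognized in one image; None if nothing matches."""
    return EVALUATION_TABLE.get(frozenset(target_aus))
```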
  • S40 Obtain the target emotion of the customer corresponding to the image cluster set based on the emotion of a single frame of at least one frame to be recognized.
  • the target emotion is the emotion determined from the single-frame emotions of the images to be recognized in one image cluster set. Understandably, an image cluster set contains the images to be recognized of a single customer, and that customer's target emotion is determined from the single-frame emotions of those images.
  • the server obtains the single-frame emotion corresponding to each image to be recognized in the image cluster set and analyzes them to obtain the target emotion of the customer corresponding to that set. Specifically, it judges whether the single-frame emotions in the set are all the same: if so, that single-frame emotion is the target emotion; if at least two single-frame emotions differ, the single-frame emotion that occurs most often in the set is taken as the target emotion.
  • the target emotions of the customers corresponding to each image cluster set in the target video data corresponding to the products to be displayed are sequentially acquired, that is, the emotions of each customer for the products to be displayed are acquired.
  • the target video data corresponding to product A includes 100 image cluster sets, and the target sentiment of the customer corresponding to each image cluster set is acquired, that is, the target sentiment of 100 customers on product A is acquired.
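  • The S40 computation reduces to a majority vote over the single-frame emotions of one image cluster set; a minimal sketch:

```python
# Majority vote: if all single-frame emotions agree, that emotion wins;
# otherwise the most frequent single-frame emotion is the target emotion.
from collections import Counter

def target_emotion(single_frame_emotions):
    return Counter(single_frame_emotions).most_common(1)[0][0]
```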
  • S50 Obtain the final emotion according to the number of customers and the target emotion of the customer corresponding to each image cluster set.
  • the final emotion refers to the emotion corresponding to the product to be displayed obtained through quantitative analysis of the target emotion of the product to be displayed by each customer.
  • the server judges whether the target emotions of the customers corresponding to all image cluster sets are the same. If they differ, it counts, across the number of customers, how many image cluster sets correspond to each target emotion, and takes the target emotion with the largest count as the final emotion. For example, the number of customers corresponding to the target video data of product A is 100, so there are also 100 image cluster sets.
  • if, of the target emotions corresponding to the 100 image cluster sets, 50 are joy, 30 are calm, and 20 are indifferent, then the most frequent target emotion (joy) is taken as the final emotion of product A. If at least two target emotions are tied for the largest count, the final emotion is determined by the emotion category of the tied target emotions; preferably, a tied target emotion that is positive is taken as the final emotion. For example, with 100 customers and 100 image cluster sets, if 50 target emotions are joy and 50 are indifferent, joy is taken as the final emotion of the product to be displayed. If at least two target emotions are tied for the largest count and all of them are negative, any of them may be selected as the final emotion, since the final emotion corresponding to a displayed target product should be a positive one.
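  • A sketch of step S50, assuming a configurable set of positive emotions for breaking ties, as the rule above requires:

```python
# Count target emotions across customers; on a tie for the largest count,
# prefer a positive emotion, otherwise any tied emotion may be chosen.
from collections import Counter

POSITIVE_EMOTIONS = {"joy", "surprise"}  # illustrative assumption

def final_emotion(target_emotions):
    counts = Counter(target_emotions)
    top = max(counts.values())
    tied = [e for e, n in counts.items() if n == top]
    for e in tied:
        if e in POSITIVE_EMOTIONS:
            return e
    return tied[0]  # all tied emotions are negative: pick any
```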
  • the target display product refers to the displayable product obtained from the products to be displayed according to the final emotion corresponding to each product to be displayed.
  • the server obtains the target display product from the product to be displayed according to the final emotion corresponding to each product to be displayed, specifically including the following steps:
  • (1) determine whether each commodity to be displayed is a commodity that can be displayed according to its final emotion. Specifically, displayable preset emotions are configured in advance, and the final emotion corresponding to each product to be displayed is matched against the preset emotions. If the final emotion matches a preset emotion successfully, the product to be displayed is a product that can be displayed. This step prevents a product whose final emotion does not match the preset emotions from becoming a target display product; since the preset emotions are generally positive, products associated with negative emotions are not taken as target display products.
  • for example, if the final emotion corresponding to product A to be displayed is joy, most customers are interested in the product; joy is matched against the preset emotions, and if the match succeeds, product A is determined to be a product that can be displayed.
  • if the final emotion corresponding to product B to be displayed is disgust, anger, or disappointment, most customers do not like the product; matching B's final emotion against the preset emotions fails, and B is not taken as a displayable product.
  • the emotion ranking table is a preset table in which more positive emotions rank higher, for example: joy, surprise, calm, disgust, anger, disappointment.
  • for example, the final emotions of the displayable products are obtained as follows: the final emotion of product A is joy,
  • the final emotion of product B is joy,
  • the final emotion of product C is surprise, and
  • the final emotion of product D is calm.
  • the displayable products are sorted according to the emotion ranking table as A, B, C, D, and
  • the first preset-value products in the ranking are taken as the target display products; with a preset value of three, products A, B, and C are the target display products.
  • when the displayable products sorted by the emotion ranking table are tied, first determine whether the target display products can be obtained within the preset value; if not, determine for each tied product the number of target video data matching its final emotion (i.e., the count behind the largest target emotion), rank the tied final emotions by that number, and obtain the preset-value target display products.
  • for example, the final emotions of the displayable products are obtained as follows:
  • the final emotion of product A is joy,
  • the final emotion of product B is joy,
  • the final emotion of product C is calm, and
  • the final emotion of product D is calm.
  • the displayable products are sorted according to the emotion ranking table as A, B, C, and D, with C and D tied,
  • so the target display products cannot be determined within the preset value alone.
  • if the number of target video data matching C's final emotion is 50 and the number matching D's is 60, then D, with a count of 60, is ranked ahead of C, with a count of 50,
  • and the preset-value products to be displayed are obtained as the target display products, that is, products A, B, and D.
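  • The two examples above amount to sorting displayable products by the emotion ranking table and breaking ties by the count of matching target video data; a hedged sketch, in which the rank order and field names are assumptions:

```python
# Rank displayable products by final emotion (lower rank = more positive),
# break ties by the count behind the final emotion, and keep the first
# `preset_value` products as the target display products.
EMOTION_RANK = {"joy": 0, "surprise": 1, "calm": 2,
                "disgust": 3, "anger": 4, "disappointment": 5}

def select_target_display_products(displayable, preset_value=3):
    """displayable: products that passed the preset-emotion match, as dicts
    like {"id": "A", "final_emotion": "joy", "emotion_count": 60}."""
    ordered = sorted(displayable,
                     key=lambda p: (EMOTION_RANK[p["final_emotion"]],
                                    -p["emotion_count"]))
    return [p["id"] for p in ordered[:preset_value]]

# Second example above: C (calm, 50) and D (calm, 60) tie on emotion, so
# D's larger count ranks it ahead of C, giving A, B, D.
```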
  • in the product display method based on image recognition, the target video data of the products to be displayed is obtained, and the face detection model is used to perform face recognition and clustering on the images to be recognized in the target video data, obtaining the number of customers corresponding to the target video data and the image cluster set corresponding to each customer,
  • so that the target display products can subsequently be determined according to the target emotions of a sufficient number of customers, improving the accuracy of the target display products. If the number of customers is greater than the preset number, the pre-trained micro-expression recognition model is used to recognize the images to be recognized in each image cluster set, and the single-frame emotion of each image to be recognized is obtained, realizing the recognition of customer emotions.
  • the target emotion of the customer corresponding to each image cluster set is obtained to determine whether the commodity to be displayed is one that customers are interested in.
  • the target display products are obtained according to the final emotions corresponding to the products to be displayed, which improves the accuracy of the target display products and makes them products that most customers are more interested in.
  • step S10, that is, obtaining the target video data of the commodity to be displayed,
  • where the target video data includes at least two frames of images to be recognized, specifically includes the following steps:
  • S11 Obtain initial video data of the commodity to be displayed, where the initial video data includes at least two frames of initial video images.
  • the initial video data refers to the video data corresponding to each commodity to be displayed collected by the video collection tool.
  • a video capture tool is used to collect the initial video data corresponding to each product to be displayed.
  • the initial video data includes at least two frames of initial video images, and the initial video data includes the corresponding product identifier of the product to be displayed.
  • the initial video data can be subsequently analyzed to obtain the final emotion corresponding to each commodity to be displayed.
  • S12 Obtain attribute information of the commodity to be displayed, and the attribute information includes a suitable age and a suitable gender.
  • the appropriate age refers to the age corresponding to the commodity to be displayed.
  • the appropriate gender refers to the gender corresponding to the product to be displayed.
  • the attribute information corresponding to each commodity to be displayed is stored in the database.
  • the server searches the database according to the products to be displayed, and obtains attribute information corresponding to each product to be displayed.
  • the attribute information includes suitable age and suitable gender.
  • for example, a certain product to be displayed is clothing, and the attribute information corresponding to the clothing includes a suitable age of 20-24 and a suitable gender of female.
  • if the product to be displayed is cosmetics, the suitable age in the attribute information corresponding to the cosmetics is 25-30 years old, and the suitable gender is male.
  • the commodities to be displayed are not specifically limited in this embodiment.
  • S13 Screen the initial video images by suitable age and suitable gender to obtain the images to be recognized, and form the target video data based on at least two frames of images to be recognized.
  • specifically, a pre-trained classifier is used to recognize the at least two frames of initial video images in the initial video data, and the target age and target gender corresponding to each initial video image are obtained.
  • the server matches the target age with the suitable age and the target gender with the suitable gender; an initial video image that matches both the suitable age and the suitable gender is determined to be an image to be recognized, unmatched initial video images are deleted, and the target video data is formed from at least two frames of images to be recognized.
  • the target age refers to the age obtained by recognizing the initial video image through a pre-trained classifier.
  • the target gender refers to the gender obtained by recognizing the initial video image through a pre-trained classifier.
  • Steps S11-S13 screen the initial video images according to the suitable age and suitable gender corresponding to the products to be displayed, obtain the images to be recognized, and form the target video data based on at least two frames of images to be recognized, so that the target video data better matches the products to be displayed and the accuracy of the target display products is improved.
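  • A minimal sketch of this screening; `predict_age_gender` stands in for the pre-trained classifiers described below and is passed in rather than defined here.

```python
# Keep only frames whose predicted age and gender match the product's
# attribute information; the kept frames form the target video data.
def screen_frames(initial_frames, suitable_age, suitable_gender,
                  predict_age_gender):
    """suitable_age: (low, high) inclusive range, e.g. (20, 24);
    predict_age_gender(frame) -> (age, gender) from the classifiers."""
    low, high = suitable_age
    images_to_recognize = []
    for frame in initial_frames:
        age, gender = predict_age_gender(frame)
        if low <= age <= high and gender == suitable_gender:
            images_to_recognize.append(frame)
    return images_to_recognize
```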
  • the method for displaying goods based on image recognition further includes:
  • Super-Resolution refers to reconstructing a corresponding high-resolution image from the acquired low-resolution image.
  • the initial video image in the initial video data is a low-resolution image.
  • the image to be determined refers to the high-resolution image converted from the initial video image.
  • the server obtains the initial video data.
  • the initial video data includes at least two frames of initial video images.
  • the initial video images are in low-resolution (LR) space.
  • the feature maps of the low-resolution space are extracted through the ESPCN algorithm, and an efficient sub-pixel convolutional layer enlarges the initial video image from low resolution to high resolution: the final low-resolution feature maps are upgraded to a high-resolution feature map, the high-resolution image corresponding to each initial video image is obtained from the high-resolution feature map, and the high-resolution image is used as the image to be determined.
  • the core concept of the ESPCN algorithm is the sub-pixel convolutional layer.
  • the input is a low-resolution image (that is, the initial video image).
  • the feature maps obtained have the same size as the input image, but with r² feature channels, where r is the target magnification of the image.
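  • The sub-pixel rearrangement at the heart of ESPCN can be sketched directly in numpy: a feature map with r² channels per output channel is reshuffled into an image r times larger in each dimension. This shows only the rearrangement, not the preceding convolutions.

```python
import numpy as np

def pixel_shuffle(features, r):
    """Rearrange a (C*r*r, H, W) low-resolution feature map into a
    (C, H*r, W*r) high-resolution output, as ESPCN's sub-pixel layer does."""
    c_r2, h, w = features.shape
    c = c_r2 // (r * r)
    x = features.reshape(c, r, r, h, w)  # split the channel axis into (c, r, r)
    x = x.transpose(0, 3, 1, 4, 2)       # interleave: (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)
```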
  • step S13, that is, screening the initial video images by suitable age and suitable gender to obtain the images to be recognized, specifically includes the following steps:
  • S131 Use a pre-trained classifier to identify at least two frames of images to be determined, and obtain the target age and target gender corresponding to each image to be determined.
  • the pre-trained classifier includes a gender classifier and an age classifier, and the image to be determined is recognized through the gender classifier and the age classifier respectively to obtain the target age and target gender corresponding to the image to be determined.
  • the target gender refers to the gender obtained by recognizing the image to be determined through the gender classifier.
  • the target age refers to the age obtained by recognizing the image to be determined by the age classifier.
  • the training image data contains face images of different ages and genders, and each face image in the training image data is annotated with age and gender.
  • the annotated training image data is input into a deep neural network and trained through the deep neural network.
  • the deep neural network includes at least two convolutional layers; the predicted age is compared with the annotated age to adjust the weights and biases of each layer in the deep neural network until the model converges, and the age classifier is obtained.
  • likewise, the predicted gender is compared with the annotated gender to adjust the weights and biases of each layer in the deep neural network until the model converges, and the gender classifier is obtained.
  • a pre-trained gender classifier is used to recognize the image to be determined.
  • the image to be determined is an image containing the customer's face.
  • face key point detection and feature extraction are performed on the image to be determined to obtain the facial features.
  • the extracted facial features are input into the pre-trained gender classifier, which recognizes them to obtain the target gender corresponding to the image to be determined;
  • the extracted facial features are also input into the pre-trained age classifier, which classifies them to obtain the target age corresponding to the image to be determined.
  • the pre-trained gender classifier and age classifier are used to estimate the gender and age of the customer on the image to be determined, so as to improve the accuracy of obtaining the target gender and target age.
  • S132 Match the target age with the appropriate age, and match the target gender with the appropriate gender.
  • the appropriate age may be an age group, for example, 20-24 years old.
  • the server matches the target age with the appropriate age, mainly to determine whether the target age is within the appropriate age range.
  • the suitable gender is female and male, and the identified target gender is matched with the suitable gender.
  • if the server determines that the target age is within the suitable age range and the target gender matches the suitable gender successfully, the corresponding image to be determined is used as the image to be recognized.
  • Steps S131-S132 use a pre-trained classifier to recognize at least two frames of images to be determined and obtain the target age and target gender corresponding to each image, realizing determination of the target age and target gender through the classifier and improving the speed of acquiring the target display products.
  • the image to be determined whose target age and target gender match is used as the image to be recognized, so that the acquired images to be recognized match the attribute information of the products to be displayed,
  • making the obtained target display products more accurate and improving their acquisition accuracy.
  • step S20, that is, using a face detection model to perform face recognition and clustering on at least two frames of images to be recognized and obtaining the number of customers corresponding to the target video data and the image cluster set corresponding to each customer,
  • specifically includes the following steps:
  • S21 Use a face detection model to perform face recognition on at least two frames of images to be recognized, and obtain a face image corresponding to each image to be recognized;
  • the server obtains the target video data, uses a face detection model to perform face recognition on each frame of the image to be recognized in the target video data, and obtains a face image corresponding to each image to be recognized in the target video data.
  • face recognition means that for any given frame of image, a certain strategy is used to search it to determine whether the image contains a face.
  • the face detection model is a pre-trained model used to detect whether each frame of the image to be recognized contains a human face area.
  • the server inputs each frame of image to be recognized into the face detection model and detects whether it contains a human face; if it does, the face image corresponding to that image to be recognized in the target video data is obtained.
  • S22 Cluster the face images corresponding to the image to be recognized, and obtain at least two image cluster sets, and each image cluster set includes at least one frame of the image to be recognized.
  • the server clusters the acquired facial images corresponding to the image to be recognized, clusters the facial images containing the same customer, and acquires at least two image cluster sets, where each image cluster set Includes at least one frame to be recognized.
  • specifically, a feature extraction algorithm is used to extract the facial features of the face image corresponding to each image to be recognized, and feature similarity is calculated between the facial features. If the feature similarity is greater than the preset threshold, the face images belong to the same customer, and the images to be recognized corresponding to that customer's face images are clustered to obtain the image cluster set corresponding to each customer. That is, one customer corresponds to one image cluster set, and each image cluster set includes at least one frame of image to be recognized.
  • the number of image cluster sets corresponding to each commodity to be displayed is counted, and the number of image cluster sets is taken as the number of customers corresponding to the target video data.
  • Steps S21-S23 use the face detection model to perform face recognition on at least two frames of images to be recognized and obtain the face image corresponding to each image to be recognized, so as to determine whether each image to be recognized is a face image and avoid clustering images that contain no face,
  • improving the acquisition speed of the subsequent image cluster sets.
  • the face images corresponding to the images to be recognized are clustered to obtain at least two image cluster sets, and the number of customers corresponding to the target video data is obtained from the number of image cluster sets, determining the number of customers and ensuring the accuracy of that number.
  • each image to be recognized corresponds to a time mark, which is the time at which the image to be recognized was collected.
  • step S22, that is, clustering the face images corresponding to the images to be recognized and obtaining at least two image cluster sets, specifically includes the following steps:
  • S221 According to the time mark, use the first recognized face image in at least two frames of images to be recognized as a reference image.
  • the reference image refers to the face image recognized for the first time from the image to be recognized.
  • the server obtains the time stamps corresponding to at least two frames of images to be recognized, and according to the time stamps, first determines the first recognized face image in the at least two frames of images to be recognized, and uses the face image as the reference image. By determining the reference image, the acquisition speed of the image cluster set can be improved.
  • a similarity algorithm is used to calculate the feature similarity between the reference image and the remaining images to be recognized except for the reference image in the at least two frames of images to be recognized to obtain the feature similarity.
  • the similarity algorithm may be Euclidean distance algorithm, Manhattan distance algorithm, Minkowski distance algorithm, or cosine similarity algorithm.
  • the cosine similarity algorithm is used to calculate the characteristic similarity between the reference image and the remaining images to be recognized, which can speed up the acquisition of image clustering sets and improve the acquisition efficiency of target display products.
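  • The similarity measures named above, sketched in numpy; the document opts for cosine similarity for speed, and the distance-to-similarity mapping is an illustrative convention.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two facial feature vectors; close to 1
    for faces of the same customer."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_similarity(a, b):
    """Euclidean distance mapped into (0, 1] so larger means more similar."""
    return 1.0 / (1.0 + float(np.linalg.norm(a - b)))
```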
  • the preset threshold is a preset value. If the server determines that the feature similarity between the reference image and a remaining image to be recognized is greater than the preset threshold, the reference image and that remaining image are considered to match successfully and to be images of the same customer, and the image to be recognized whose feature similarity with the reference image is greater than the preset threshold is attributed to the same image cluster set as the reference image.
  • for example, if the feature similarity between reference image 1 and remaining image 2 to be recognized is 80%,
  • the feature similarity between reference image 1 and remaining image 3 to be recognized is 99%,
  • and the preset threshold is 90%,
  • then the feature similarity between reference image 1 and remaining image 3 is greater than the preset threshold, and reference image 1 and remaining image 3 are attributed to the same image cluster set.
  • S224 If the feature similarity is not greater than the preset threshold, then according to the time mark, the first of the remaining images to be recognized whose feature similarity is not greater than the preset threshold is updated to be the new reference image, and the steps of selecting the reference image by time mark and
  • calculating the feature similarity between the reference image and the remaining images to be recognized with the similarity algorithm are repeated until the clustering of the at least two frames of images to be recognized is complete, forming at least two image cluster sets.
  • if the server determines that the feature similarity between the reference image and a remaining image to be recognized is not greater than the preset threshold, the match is considered to have failed, and the customer corresponding to the reference image differs from the customer corresponding to that remaining image.
  • the first image among the remaining images to be recognized whose feature similarity is not greater than the preset threshold is updated to be the new reference image.
  • for example, if the feature similarity between reference image 1 and remaining image 2 to be recognized is 80% and the preset threshold is 90%, the feature similarity is not greater than the preset threshold, and, according to the time mark, remaining image 2 is updated to be the new reference image.
  • Steps S221-S224: according to the time mark, the first recognized face image among the at least two frames of images to be recognized is used as the reference image, and the similarity algorithm is used to calculate the feature similarity between the reference image and the remaining images to be recognized, to determine whether the reference image and each remaining image show the same customer. If the feature similarity is greater than the preset threshold, the image to be recognized is attributed to the same image cluster set as the reference image, clustering the images of the same customer together.
  • this repeats until the clustering of the at least two frames of images to be recognized is complete, forming at least two image cluster sets and achieving clustering of the images of each customer, so that each customer's target emotion toward the product to be displayed can subsequently be determined.
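  • A sketch of the whole S221-S224 loop, using one of the similarity functions above; the 0.9 threshold is illustrative.

```python
def cluster_by_reference(faces_in_time_order, similarity, threshold=0.9):
    """faces_in_time_order: facial feature vectors sorted by time mark.
    The first face becomes the reference image; matching faces join its
    cluster, and the first non-matching face becomes the next reference."""
    references = []  # reference feature of each cluster (one per customer)
    clusters = []    # lists of faces, one list per image cluster set
    for face in faces_in_time_order:
        for ref, cluster in zip(references, clusters):
            if similarity(face, ref) > threshold:  # same customer
                cluster.append(face)
                break
        else:  # matched no reference: new customer, new reference image
            references.append(face)
            clusters.append([face])
    return clusters  # len(clusters) is the number of customers
```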
  • step S30, that is, using a pre-trained micro-expression recognition model to recognize the images to be recognized in each image cluster set and obtaining the single-frame emotion of each image to be recognized, specifically includes the following steps:
  • S31 Use the face key point algorithm to perform face recognition on the images to be recognized in each image cluster set, and obtain the face key points corresponding to each image to be recognized.
  • the face key point algorithm can be, but is not limited to, the Ensemble of Regression Trees (ERT) algorithm, the SIFT (scale-invariant feature transform) algorithm, the SURF (Speeded-Up Robust Features) algorithm, the LBP (Local Binary Patterns) algorithm, or the HOG (Histogram of Oriented Gradients) algorithm.
  • the ERT algorithm is a regression-based method, expressed as $\hat{S}^{(t+1)} = \hat{S}^{(t)} + r_t(I, \hat{S}^{(t)})$, where $\hat{S}^{(t+1)}$ is the shape (the coordinates of the feature points) of the image to be recognized obtained at iteration t+1, t is the cascade level, $\hat{S}^{(t)}$ is the predicted shape at iteration t, I is the image to be recognized input to the regressor, and $r_t$ is the regressor of level t.
  • each regressor is composed of many regression trees, which are obtained through training; the face key points corresponding to each image to be recognized are obtained through the regression trees.
  • S32 Use a feature extraction algorithm to perform feature extraction on the face key points corresponding to each image to be recognized, and obtain local features corresponding to the face key points.
  • the feature extraction algorithm can be the CNN (Convolutional Neural Network, convolutional neural network) algorithm.
  • the CNN algorithm is used to extract the local features of the face key points corresponding to the image to be recognized; specifically, the local features are extracted according to the locations of the facial action units.
  • the CNN algorithm is a feed-forward neural network, and its artificial neurons can respond to a part of the surrounding units within the coverage area, and can quickly and efficiently perform image processing.
  • a pre-trained convolutional neural network is used to quickly extract local features corresponding to key points of a human face.
  • the face key points corresponding to each image to be recognized are convolved with several convolution kernels, and the result of the convolution gives the local features corresponding to the face detection points. The operation can be written as $y_{mn} = f\left(\sum_{i=1}^{I}\sum_{j=1}^{J} w_{ij}\, x_{m+i-1,\,n+j-1} + b\right)$, where
  • y is the output local feature,
  • x is a two-dimensional input of size (M, N), formed from the coordinates of the L face key points,
  • $w_{ij}$ is a convolution kernel of size I×J,
  • b is the bias,
  • the output size is M×N, and
  • the activation function is denoted by f.
  • each convolution kernel is convolved with the face key points of the input image to be recognized from the previous layer, and each convolution kernel produces a corresponding local-feature map.
  • because the convolution kernels share weights across the local features, the number of parameters is greatly reduced, which greatly improves the training speed of the network.
  • the local features corresponding to the facial action unit can be obtained.
  • for example, AU1, AU2, AU5, and AU26 correspond to the local features of raised inner eyebrows, raised outer eyebrows, raised upper eyelids, and an opened lower jaw.
  • the convolutional neural network is used to extract the local features of the key points of the face in the image to be recognized, so as to subsequently determine the target facial action unit based on the local features, and determine the customer's emotions based on the recognized target facial action unit.
  • the use of convolutional neural network for recognition is faster and the recognition accuracy is higher.
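  • A direct numpy rendering of the convolution formula above, with zero padding so the output keeps the stated M×N size; a real system would use a trained CNN rather than this explicit loop.

```python
import numpy as np

def conv2d_local_feature(x, w, b, f=np.tanh):
    """Convolve input x of size (M, N) with kernel w of size (I, J), add
    bias b, and apply activation f, giving one (M, N) local-feature map."""
    M, N = x.shape
    I, J = w.shape
    xp = np.pad(x, ((I // 2, I - 1 - I // 2), (J // 2, J - 1 - J // 2)))
    out = np.empty((M, N))
    for m in range(M):
        for n in range(N):
            out[m, n] = np.sum(w * xp[m:m + I, n:n + J]) + b
    return f(out)
```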
  • S33 Use a pre-trained classifier to recognize local features, and obtain a target facial action unit corresponding to each local feature.
  • the local features are recognized by each SVM classifier in the pre-trained micro-expression recognition model, where the number of SVM classifiers equals the number of recognizable facial action units; that is, if 54 facial action units can be recognized, there are 54 pre-trained SVM classifiers.
  • each classifier outputs a probability value, the obtained probability values are compared with the preset probability threshold, and
  • each facial action unit whose probability value is greater than the preset probability threshold is taken as a target facial action unit corresponding to the local features; all target facial action units corresponding to the local features are thus acquired.
  • the evaluation table is a pre-configured table.
  • the evaluation table stores the correspondence between facial action unit combinations and emotions; for example, the combination of AU12, AU6, and AU7 corresponds to joy, and the combination of AU9, AU10, AU17, and AU24 corresponds to disgust.
  • the server searches the evaluation table through the target facial action unit corresponding to each local feature, obtains the combination that matches the target facial action unit, and uses the emotion corresponding to the combination as the single frame emotion corresponding to the image to be recognized.
  • the face key point algorithm is used to perform face recognition on the images to be recognized in each image cluster set and obtain the face key points corresponding to each image, providing technical support for the subsequent extraction of local features and improving its accuracy; the feature extraction algorithm is used to extract features from the face key points and quickly obtain the corresponding local features, making the subsequently extracted target facial action units more accurate; and the pre-trained classifiers are used to recognize the local features and quickly obtain the target facial action unit corresponding to each local feature, realizing the determination of the target facial action units.
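  • Tying the pieces together, a sketch of step S33 using classifiers like those from the earlier training sketch; the 0.5 threshold follows the example above, and the positive class is assumed to be labeled 1.

```python
def detect_target_aus(classifiers, local_features, prob_threshold=0.5):
    """classifiers: {au_name: fitted SVC}; local_features: {au_name: 1-D
    feature vector}. Returns the set of target facial action units."""
    target_aus = set()
    for au, clf in classifiers.items():
        feat = local_features[au].reshape(1, -1)  # one sample
        prob = clf.predict_proba(feat)[0, 1]      # P(AU present)
        if prob > prob_threshold:
            target_aus.add(au)
    return target_aus
```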
  • a product display device based on image recognition is provided, and the product display device based on image recognition has a one-to-one correspondence with the product display method based on image recognition in the foregoing embodiment.
  • the image recognition-based product display device includes a data acquisition module 10, an image cluster set acquisition module 20, a single-frame emotion determination module 30, a target emotion acquisition module 40, a final emotion acquisition module 50, and a target display product acquisition module 60.
  • the detailed description of each functional module is as follows:
  • the data acquisition module 10 is configured to acquire target video data of the commodity to be displayed, and the target video data includes at least two frames of images to be recognized.
  • the image cluster set acquisition module 20 is used to perform face recognition and clustering on at least two frames of images to be recognized using the face detection model, and to acquire the number of customers corresponding to the target video data and the image cluster set corresponding to each customer, where each image cluster set includes at least one frame of image to be recognized.
  • the single frame emotion determination module 30 is configured to use a pre-trained micro-expression recognition model to recognize images to be recognized in each image cluster set if the number of customers is greater than the preset number, and obtain a single frame of each image to be recognized Frame emotions.
  • the target emotion obtaining module 40 is configured to obtain the target emotion of the customer corresponding to the image cluster set based on the single frame emotion of at least one frame to be recognized.
  • the final emotion obtaining module 50 is used to obtain the final emotion according to the number of customers and the target emotion of the customer corresponding to each image cluster set.
  • the target display product obtaining module 60 is configured to obtain the target display product according to the final emotion corresponding to the product to be displayed.
  • the data acquisition module 10 includes an initial video data acquisition unit 11, an attribute information determination unit 12 and a target video data formation unit 13.
  • the initial video data acquiring unit 11 is configured to acquire initial video data of the commodity to be displayed, and the initial video data includes at least two frames of initial video images.
  • the attribute information determining unit 12 is used to obtain attribute information of the commodity to be displayed, and the attribute information includes a suitable age and a suitable gender.
  • the target video data forming unit 13 is configured to filter the initial video images according to the suitable age and suitable gender, obtain the images to be recognized, and form the target video data based on at least two frames of the images to be recognized.
  • before the target video data forming unit, the commodity display device based on image recognition further includes an image resolution conversion unit.
  • the image resolution conversion unit is used to process at least two frames of initial video images by using super-resolution technology, obtain high-resolution images corresponding to the at least two frames of initial video images, and use the high-resolution images as the images to be determined.
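  • As one illustration of the image resolution conversion unit, the sketch below uses OpenCV's dnn_superres contrib module with a pre-trained ESPCN model; the text only says "super-resolution technology", so the ESPCN choice and the model path are assumptions.

```python
import cv2  # requires opencv-contrib-python for the dnn_superres module

# Build the super-resolution engine once; "ESPCN_x2.pb" is a hypothetical
# path to a downloaded pre-trained model file.
sr = cv2.dnn_superres.DnnSuperResImpl_create()
sr.readModel("ESPCN_x2.pb")
sr.setModel("espcn", 2)  # ESPCN with 2x upscaling

def to_high_resolution(initial_video_images):
    """Upscale each initial video image; the results are the images to be determined."""
    return [sr.upsample(image) for image in initial_video_images]
```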
  • the target video data forming unit includes a target age and target gender determination subunit, a matching subunit, and a target image determination subunit.
  • the target age and target gender determination subunit is used to identify at least two frames of images to be determined using a pre-trained classifier, and obtain the target age and target gender corresponding to each image to be determined.
  • the matching subunit is used to match the target age with the suitable age, and to match the target gender with the suitable gender.
  • the to-be-recognized image determination subunit is used to take the image to be determined corresponding to the target age and target gender as the image to be recognized if the target age is successfully matched with the suitable age and the target gender is successfully matched with the suitable gender.
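  • A compact sketch of the filtering chain formed by these subunits might look as follows; predict_age_gender is a hypothetical callable standing in for the pre-trained classifier, since the text does not specify its interface.

```python
def filter_by_attributes(images_to_determine, suitable_age, suitable_gender,
                         predict_age_gender):
    """Keep only frames whose predicted attributes match the commodity.

    predict_age_gender returns a (target_age, target_gender) pair for one
    image; suitable_age is assumed to be a range, e.g. range(18, 36).
    """
    images_to_recognize = []
    for image in images_to_determine:
        target_age, target_gender = predict_age_gender(image)
        if target_age in suitable_age and target_gender == suitable_gender:
            images_to_recognize.append(image)
    return images_to_recognize  # at least two frames then form the target video data
```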
  • the image cluster set acquisition module 20 includes a face image acquisition unit, an image cluster set acquisition unit, and a customer number determination unit.
  • the face image acquisition unit is configured to use a face detection model to perform face recognition on at least two frames of images to be recognized, and obtain a face image corresponding to each image to be recognized.
  • the image clustering set obtaining unit is configured to cluster the face images corresponding to the image to be recognized to obtain at least two image cluster sets, and each image cluster set includes at least one frame of the image to be recognized.
  • the customer number determining unit is used to obtain the number of customers corresponding to the target video data according to the number of image clustering sets.
  • each image to be recognized corresponds to a time stamp.
  • the image cluster set acquisition unit includes a reference image determination subunit, a feature similarity calculation unit, a first image cluster set determination subunit, and a second image cluster set determination subunit.
  • the reference image determination subunit is used to use the first recognized face image in at least two frames of images to be recognized as the reference image according to the time stamp.
  • the feature similarity calculation unit is used to calculate the feature similarity between the reference image and the remaining images to be recognized by sequentially adopting a similarity algorithm according to the time mark.
  • the first image cluster set determining subunit is configured to, if the feature similarity is greater than the preset threshold, attribute the image to be recognized and the reference image with the feature similarity greater than the preset threshold to the same image cluster set.
  • the second image cluster set determining subunit is used to, if the feature similarity is not greater than the preset threshold, update the reference image to the first image, ordered by time stamp, among the remaining images to be recognized whose feature similarity is not greater than the preset threshold, and to repeat the step of sequentially calculating the feature similarity between the reference image and the remaining images to be recognized according to the time stamps, until the clustering of all images to be recognized is completed and at least two image cluster sets are formed; a sketch of this procedure follows.
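  • The greedy clustering described by these two subunits can be sketched as below; cosine similarity and the 0.8 threshold are assumptions, since the text only speaks of "a similarity algorithm" and "a preset threshold".

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cluster_by_reference(frames, threshold=0.8):
    """Greedy clustering of time-stamped face features.

    frames is a list of (timestamp, feature_vector) pairs, one per image
    to be recognized.
    """
    remaining = sorted(frames, key=lambda f: f[0])  # order by time stamp
    clusters = []
    while remaining:
        reference, rest = remaining[0], remaining[1:]
        cluster, remaining = [reference], []
        for frame in rest:
            if cosine_similarity(reference[1], frame[1]) > threshold:
                cluster.append(frame)    # same customer as the reference image
            else:
                remaining.append(frame)  # earliest of these becomes the next reference
        clusters.append(cluster)
    return clusters  # the number of customers equals len(clusters)
```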
  • the single frame emotion determination module 30 includes a face key point acquisition unit, a local feature extraction unit, a target facial action unit acquisition unit, and a single frame emotion acquisition unit.
  • the face key point acquisition unit is used to perform face recognition on the image to be recognized in each image clustering set by using the face key point algorithm, and obtain the face key point corresponding to each image to be recognized.
  • the local feature extraction unit is used to perform feature extraction on the key points of the face corresponding to each image to be recognized by using a feature extraction algorithm to obtain the local features corresponding to the key points of the face.
  • the target facial action unit acquisition unit is used to recognize the local features using a pre-trained classifier, and acquire the target facial action unit corresponding to each local feature.
  • the single frame emotion acquisition unit is used to look up the evaluation table based on the target facial action unit corresponding to each local feature, and acquire the single frame emotion of each image to be recognized.
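  • Chaining the four units of module 30 gives a sketch like the following; detect_keypoints, extract_local_features, and classify_aus are hypothetical callables standing in for the face key point algorithm, the feature extraction algorithm, and the pre-trained classifier, and the 0.5 probability threshold is likewise an assumption.

```python
def image_target_aus(image, detect_keypoints, extract_local_features,
                     classify_aus, threshold=0.5):
    """Collect the target facial action units for one image to be recognized."""
    keypoints = detect_keypoints(image)
    target_aus = set()
    for local_feature in extract_local_features(keypoints):
        # classify_aus returns {action_unit: probability} for one local feature
        for action_unit, probability in classify_aus(local_feature).items():
            if probability > threshold:
                target_aus.add(action_unit)  # target facial action unit
    # the single-frame emotion then comes from the evaluation-table lookup
    # sketched earlier (lookup_single_frame_emotion)
    return target_aus
```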
  • the various modules in the above-mentioned image recognition-based product display device can be implemented in whole or in part by software, hardware and their combination.
  • the above-mentioned modules may be embedded, in hardware form, in or independent of the processor of the computer device, or may be stored in software form in the memory of the computer device, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 9.
  • the computer device includes a processor, a memory, a network interface, and a database connected by a system bus; the processor of the computer device provides computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store the face detection model and the attribute information of the commodity to be displayed.
  • the network interface of the computer device is used to communicate with external terminals through a network connection.
  • a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the steps of the product display method based on image recognition in the foregoing embodiment are implemented, for example, steps S10 to S60 shown in FIG. 2 or the steps shown in FIG. 3 to FIG. 7.
  • when the processor executes the computer-readable instructions, the functions of the modules/units of the commodity display apparatus based on image recognition in the foregoing embodiments are implemented, for example, the functions of modules 10 to 60 shown in FIG. 8; to avoid repetition, details are not repeated here.
  • one or more readable storage media storing computer-readable instructions are provided; when the computer-readable instructions are executed by one or more processors, the one or more processors perform the steps of the product display method based on image recognition in the above method embodiment, for example, steps S10 to S60 shown in FIG. 2 or the steps shown in FIG. 3 to FIG. 7.
  • when the computer-readable instructions are executed by the processor, the functions of the modules/units of the commodity display device based on image recognition in the above embodiment are realized, for example, the functions of modules 10 to 60 shown in FIG. 8; to avoid repetition, details are not repeated here.
  • the readable storage medium in this embodiment includes a nonvolatile readable storage medium and a volatile readable storage medium.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

A product display method based on image recognition, a device, an apparatus, and a medium are provided. The method comprises: performing, by means of a face detection model, face recognition and clustering on at least two images to be recognized in target video data of a displayed product, and acquiring the number of customers corresponding to the target video data and the respective image cluster sets corresponding to the customers; if the number of customers is greater than a preset number, recognizing the images to be recognized in each of the image cluster sets by means of a pre-trained micro-expression recognition model, and acquiring the respective single-frame emotions of the images (S30); acquiring, based on the single-frame emotions of the images to be recognized, the target emotions of the customers corresponding to the image cluster sets (S40); acquiring a final emotion according to the number of customers and the respective target emotions of the customers corresponding to the image cluster sets (S50); and acquiring a target display product according to the final emotion corresponding to the product to be displayed (S60). The method solves the problem of insufficient attractiveness of displayed products.
PCT/CN2019/120988 2019-01-17 2019-11-26 Product display method based on image recognition, device, apparatus and medium Ceased WO2020147430A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910042541.3A 2019-01-17 2019-01-17 Product display method, device, equipment and medium based on image recognition
CN201910042541.3 2019-01-17

Publications (1)

Publication Number Publication Date
WO2020147430A1 true WO2020147430A1 (fr) 2020-07-23

Family

ID=66604505

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/120988 Ceased WO2020147430A1 (fr) Product display method based on image recognition, device, apparatus and medium

Country Status (2)

Country Link
CN (1) CN109815873A (fr)
WO (1) WO2020147430A1 (fr)


Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815873A (zh) * 2019-01-17 2019-05-28 深圳壹账通智能科技有限公司 Product display method, device, equipment and medium based on image recognition
CN110650306B (zh) * 2019-09-03 2022-04-15 平安科技(深圳)有限公司 Method and device for adding expressions in video chat, computer equipment and storage medium
CN110598790A (zh) * 2019-09-12 2019-12-20 北京达佳互联信息技术有限公司 Image recognition method and device, electronic equipment and storage medium
CN111144241B (zh) * 2019-12-13 2023-06-20 深圳奇迹智慧网络有限公司 Target recognition method and device based on image verification, and computer equipment
CN111062786B (zh) * 2019-12-25 2023-05-23 创新奇智(青岛)科技有限公司 Model updating method based on establishing a mapping table of product appearance features
CN111291623A (zh) * 2020-01-15 2020-06-16 浙江连信科技有限公司 Method and device for predicting psychological and physiological characteristics based on face information
CN111310602A (zh) * 2020-01-20 2020-06-19 北京正和恒基滨水生态环境治理股份有限公司 Exhibit attention analysis system and method based on emotion recognition
CN111563503A (zh) * 2020-05-09 2020-08-21 南宁市第三中学 Method for identifying ethnic minority cultural symbols
CN114153342B (zh) * 2020-08-18 2024-11-26 深圳市万普拉斯科技有限公司 Visual information display method and device, computer equipment and storage medium
CN112363624B (zh) * 2020-11-16 2022-09-09 新之航传媒科技集团有限公司 Interactive exhibition hall system based on emotion analysis
CN113269035B (zh) * 2021-04-12 2025-01-17 北京爱奇艺科技有限公司 Image processing method, device, equipment and storage medium
CN113177603B (zh) * 2021-05-12 2022-05-06 中移智行网络科技有限公司 Classification model training method, video classification method and related equipment
CN113762156B (zh) * 2021-09-08 2023-10-24 北京优酷科技有限公司 Viewing data processing method, device and storage medium
CN113947422A (zh) * 2021-09-13 2022-01-18 青岛颐中科技有限公司 Marketing method and device based on multi-dimensional features, and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018143630A1 (fr) * 2017-02-01 2018-08-09 삼성전자 주식회사 Device and method for recommending products
CN108509941A (zh) * 2018-04-20 2018-09-07 北京京东金融科技控股有限公司 Emotion information generation method and device
CN108858245A (zh) * 2018-08-20 2018-11-23 深圳威琳懋生物科技有限公司 Shopping guide robot
CN109048934A (zh) * 2018-08-20 2018-12-21 深圳威琳懋生物科技有限公司 Intelligent shopping guide robot system
CN109191190A (zh) * 2018-08-20 2019-01-11 深圳威琳懋生物科技有限公司 Control method for a shopping guide robot and computer-readable storage medium
CN109815873A (zh) * 2019-01-17 2019-05-28 深圳壹账通智能科技有限公司 Product display method, device, equipment and medium based on image recognition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106843028B (zh) * 2016-12-04 2019-07-02 上海如晶新材料科技有限公司 Smart wardrobe with pushed advertising and intelligent filtering thereof


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215653A (zh) * 2020-10-12 2021-01-12 珠海格力电器股份有限公司 Commodity recommendation method and device, server and storage medium
CN113761275A (zh) * 2020-11-18 2021-12-07 北京沃东天骏信息技术有限公司 Video preview animation generation method, device, equipment and readable storage medium
CN112434711A (zh) * 2020-11-27 2021-03-02 杭州海康威视数字技术股份有限公司 Data management method, device and electronic equipment
CN112434711B (zh) * 2020-11-27 2023-10-13 杭州海康威视数字技术股份有限公司 Data management method, device and electronic equipment
CN113705329A (zh) * 2021-07-07 2021-11-26 浙江大华技术股份有限公司 Re-identification method, training method for a target re-identification network, and related equipment
CN114255420A (zh) * 2021-12-10 2022-03-29 华院计算技术(上海)股份有限公司 Emotion recognition method and device, storage medium and terminal
CN115512418A (zh) * 2022-09-29 2022-12-23 太保科技有限公司 Question-and-answer emotion evaluation method and device based on action units (AU) and micro-expressions
CN116310939A (zh) * 2022-12-28 2023-06-23 北京爱奇艺科技有限公司 Commodity matching method, device, equipment and storage medium
CN116665278A (zh) * 2023-06-09 2023-08-29 平安科技(深圳)有限公司 Micro-expression recognition method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN109815873A (zh) 2019-05-28

Similar Documents

Publication Publication Date Title
WO2020147430A1 (fr) Product display method based on image recognition, device, apparatus and medium
Pérez-Borrero et al. A fast and accurate deep learning method for strawberry instance segmentation
US11423076B2 (en) Image similarity-based group browsing
Kao et al. Visual aesthetic quality assessment with a regression model
US9633045B2 (en) Image ranking based on attribute correlation
Fabian Benitez-Quiroz et al. Emotionet: An accurate, real-time algorithm for the automatic annotation of a million facial expressions in the wild
US20160350336A1 (en) Automated image searching, exploration and discovery
Liu et al. Age classification using convolutional neural networks with the multi-class focal loss
WO2021003938A1 (fr) Image classification method and apparatus, computer device and storage medium
CN110175298B (zh) User matching method
Sumi et al. Human gender detection from facial images using convolution neural network
CN109815920A (zh) Gesture recognition method based on convolutional neural network and adversarial convolutional neural network
CN113780145A (zh) Sperm morphology detection method and device, computer equipment and storage medium
Dakshina et al. Saree texture analysis and classification via deep learning framework
Fahira et al. Classical machine learning classification for javanese traditional food image
CN112149449A (zh) Face attribute recognition method and system based on deep learning
Bashir et al. A comprehensive review on apple leaf disease detection
Dhanashree et al. Fingernail analysis for early detection and diagnosis of diseases using machine learning techniques
Sadeghzadeh et al. Triplet loss-based convolutional neural network for static sign language recognition
CN114066564A (zh) Service recommendation time determination method and device, computer equipment and storage medium
Vasavi et al. Age detection in a surveillance video using deep learning technique
CN110414562A (zh) X-ray film classification method and device, terminal and storage medium
Singla et al. Age and gender detection using Deep Learning
CN112580527A (zh) Facial expression recognition method based on a convolutional long short-term memory network
Juliet Image-based bird species identification using machine learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19910035

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 08.11.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19910035

Country of ref document: EP

Kind code of ref document: A1