WO2018113526A1 - Interactive authentication system and method based on voiceprint recognition and face recognition - Google Patents

Info

Publication number
WO2018113526A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
recognition
user
terminal
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/114928
Other languages
English (en)
Chinese (zh)
Inventor
刘�东
李晓冬
杨震泉
彭世伟
孙云松
孟庆康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd
Publication of WO2018113526A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0861 Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Definitions

  • The invention relates to authentication technology, in particular to authentication based on face recognition and voiceprint recognition.
  • Biometric technology authenticates a person's identity through physiological or behavioral characteristics of the human body, such as fingerprints, irises, facial image recognition, and DNA sequence matching.
  • Fingerprints are easy to forge: a forger only needs to lift the target's fingerprint from everyday objects that person has touched. Fingerprint recognition is therefore used mainly in applications with low security requirements, such as daily attendance records.
  • Iris recognition uses camera equipment to capture the annular region between the black pupil and the white sclera, which contains many interlaced spots, filaments, crowns, stripes, and crypts. The hardware requirements for the camera are therefore relatively high, and the technology is not easy to commercialize on a large scale or promote to ordinary users.
  • Face recognition verification based on a single image is easily impersonated with a static image (photo), while DNA sequence matching has a high recognition threshold and requires direct contact with the human body, so it is not suitable for "short, flat, fast" Internet platforms.
  • The human voice is rich in information along multiple dimensions, such as speech content, tone, and sound characteristics.
  • Voiceprint recognition is a technique for distinguishing speakers by the characteristics of their voices; differences in vocal tract structure determine the uniqueness of each voiceprint.
  • The object of the present invention is to solve the problem that face recognition authentication is easily defeated by impersonation, by providing an interactive authentication system and method based on face recognition and voiceprint recognition.
  • The interactive authentication system based on face recognition and voiceprint recognition comprises a terminal and a server connected through a network, wherein:
  • the terminal is configured to acquire a facial video of the detected user, collect voice audio data input by the user, send them to the server, and display prompt information sent by the server;
  • the server is configured to match the user facial feature parameters and the user voiceprint feature vector, and to intersect the voiceprint recognition result with the face recognition result. If the intersection contains exactly one result, the verification is successful and verification success information is returned to the terminal.
  • Matching the user facial feature parameters and the user voiceprint feature vector means the following. The server acquires the user facial feature parameters from the received facial video of the detected user and matches them against all user facial feature parameters pre-stored on the server. If the matching succeeds, the face recognition result is obtained and the preset voice password text is sent to the terminal. After receiving the voice audio data sent by the voice collection module of the terminal, the server converts the voice audio data into text content and matches the text content against the previously sent voice password text. If that matching succeeds, the server extracts the voiceprint feature vector from the voice audio data and matches it against all user voiceprint feature vectors pre-stored on the server; a successful match yields the voiceprint recognition result.
  • The terminal includes a display module, a face video capture module, a voice collection module, and a first communication module, and the server includes a face recognition module, a voice recognition module, a verification module, a database, and a second communication module. The display module, the face video capture module, and the voice collection module are each connected to the first communication module. The face recognition module, the voice recognition module, and the verification module are each connected to the second communication module, and the face recognition module and the voice recognition module are each connected to the verification module. The database is connected to the face recognition module, the voice recognition module, and the verification module, and the first communication module and the second communication module are connected through a network.
  • The face video capture module is configured to acquire a facial video of the detected user and send the video to the face recognition module through the first communication module and the second communication module;
  • The voice collection module is configured to collect voice audio data input by the user and send the voice audio data to the voice recognition module through the first communication module and the second communication module;
  • The display module is configured to display prompt information sent by the server, including face recognition failure information, voice password input incorrect information, verification failure information, voice password text, and verification success information;
  • The first communication module and the second communication module are used for information interaction between the terminal and the server;
  • The face recognition module is configured to filter and denoise the facial video of the detected user, extract key frames, acquire the user facial feature parameters from the key frames, select key feature parameters, and match them against all user facial feature parameters stored in the database. If the matching succeeds, the matching success result, which is the face recognition result, is sent to the verification module; if the matching fails, face recognition failure information is returned to the terminal;
  • The voice recognition module is configured to, after receiving the voice recognition request sent by the verification module, send the preset voice password text to the terminal so that the terminal displays it through the display module. The module converts the voice audio data sent by the voice collection module of the terminal into text content and matches the text content against the previously sent voice password text. If the matching fails, recognition is deemed to have failed and voice password input incorrect information is returned to the terminal. If the matching succeeds, the module extracts the voiceprint feature vector from the voice audio data and matches it against all user voiceprint feature vectors stored in the database. If that matching fails, recognition is deemed to have failed and voice recognition failure information is returned to the terminal; if it succeeds, the matching success result, which is the voiceprint recognition result, is sent to the verification module;
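
The text-matching step of the voice recognition module can be sketched as follows. The source only says the converted text is "matched" against the voice password text; the normalization used here (Unicode NFKC, case folding, stripping punctuation and whitespace) and the function names `normalize` and `password_matches` are assumptions for illustration, and the speech-to-text conversion itself is out of scope.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Fold case and drop punctuation/whitespace so that only the spoken
    content is compared (an assumed normalization, not from the source)."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return re.sub(r"[\W_]+", "", folded)


def password_matches(transcript: str, password_text: str) -> bool:
    """Return True if the transcribed speech equals the voice password text
    after normalization."""
    return normalize(transcript) == normalize(password_text)
```

A stricter or fuzzier comparison (for example, tolerating recognizer errors) would be a design choice the source does not specify.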
  • The verification module is configured to send a voice recognition request to the voice recognition module after receiving the matching success result sent by the face recognition module and, after receiving the matching success result sent by the voice recognition module, to intersect it with the matching success result sent by the face recognition module. If the intersection is empty, the current user verification is deemed to have failed and verification failure information is returned to the terminal. If the intersection contains exactly one result, the verification is successful and verification success information is returned to the terminal. If the intersection contains more than one result, the voiceprint feature is deemed not distinctive enough and the voice recognition request is resent to the voice recognition module; if the predetermined number of voice recognition requests has already been sent, user verification is deemed to have failed and verification failure information is returned to the terminal.
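
A minimal sketch of the verification module's decision rule, assuming both recognition results are sets of candidate user IDs. The names `Outcome` and `verify` and the parameters `requests_sent` and `max_requests` are hypothetical; the source fixes only the three cases (empty, exactly one, more than one) and the retry cap.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "verification_success"
    FAILURE = "verification_failure"
    RETRY_VOICE = "resend_voice_recognition_request"


def verify(face_ids, voice_ids, requests_sent, max_requests):
    """Intersect the two candidate sets and decide the outcome.

    requests_sent counts voice recognition requests already sent in this
    authentication; max_requests is the predetermined cap.
    """
    common = set(face_ids) & set(voice_ids)
    if not common:
        return Outcome.FAILURE, None
    if len(common) == 1:
        return Outcome.SUCCESS, next(iter(common))
    # More than one candidate: the voiceprint is not distinctive enough.
    if requests_sent >= max_requests:
        return Outcome.FAILURE, None
    return Outcome.RETRY_VOICE, None
```

Returning the single matched user ID alongside the outcome is a convenience of this sketch, not something the source requires.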
  • The face video capture module is a camera module, and the voice collection module is a pickup.
  • The face recognition module is configured with an image similarity preset value. When the key feature parameters in the user facial feature parameters are matched against the user facial feature parameters stored in the database, the matching is determined to be successful if, in the matching result, the similarity threshold of each user's facial feature parameters is smaller than the image similarity preset value; otherwise the matching is determined to have failed.
  • The matching success result of the face recognition module includes user information, and the user information includes user age information.
  • The voice recognition request sent by the verification module to the voice recognition module includes user age information or a request to send the voice password text used at registration.
  • When the verification module sends the voice recognition request to the voice recognition module for the preset-numbered time, the request includes a request to send the voice password text used at registration.
  • The preset voice password text is a passage of easy-to-read text, a string of digits, a passage of news text, or a voice password text corresponding to the user information.
  • The voice recognition module further selects the preset voice password text according to the voice recognition request. If the request asks for the voice password text used at registration, the selected preset voice password text is a voice password text corresponding to the user information. If the request contains user age information, the user's age is determined from it: if the user is an elderly person or a minor, the selected preset voice password text is a passage of easy-to-read text or a string of digits; otherwise it is a passage of news text.
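
The selection rule above can be sketched as a small function. The source does not define the age boundaries for "minor" or "elderly", so `minor_below=18` and `elderly_from=60` are assumed placeholders, as are the names `VoiceRecognitionRequest` and `choose_password_kind`. Falling back to news text when the request carries no age information is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceRecognitionRequest:
    age: Optional[int] = None            # user age information, if present
    use_registration_text: bool = False  # ask for the text used at registration


def choose_password_kind(request, minor_below=18, elderly_from=60):
    """Return which kind of voice password text to send to the terminal."""
    if request.use_registration_text:
        return "registration"
    age = request.age
    if age is not None and (age < minor_below or age >= elderly_from):
        return "easy_text_or_digits"
    return "news"
```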
  • After sending the preset voice password text to the terminal, the voice recognition module starts timing and determines whether the voice audio data sent by the terminal is received within a preset time. If the timer reaches the preset time and no voice audio data has been received from the terminal, the module replaces the preset voice password text, resends the replaced text to the terminal, restarts the timing, and returns to the step of determining whether the voice audio data sent by the terminal is received within the preset time.
  • An interactive authentication method based on face recognition and voiceprint recognition is applied to the above interactive authentication system based on face recognition and voiceprint recognition, and comprises the following steps:
  • Step 1 The user uses the terminal to perform user registration with the server, and the server stores the user information, the facial feature parameters of the user, and the user voiceprint feature vector in the database;
  • Step 2 When authenticating, the terminal acquires a facial video of the detected user and sends the video to the server;
  • Step 3 The server filters and denoises the facial video of the detected user, extracts key frames, acquires the user facial feature parameters from the key frames, selects key feature parameters, and matches them against all user facial feature parameters stored in the database. If the matching succeeds, the face recognition result is obtained and the process proceeds to step 5; if the matching fails, the process proceeds to step 4;
  • Step 4 The server returns face recognition failure information to the terminal, the terminal displays the face recognition failure to prompt the user, and the process returns to step 2;
  • Step 5 The server generates and sends a preset voice password text to the terminal.
  • Step 6 the terminal displays the voice password text, and collects the voice audio data input by the user and uploads it to the server;
  • Step 7 The server converts the received voice audio data into text content and matches the text content against the previously sent voice password text. If the matching fails, recognition is deemed to have failed, voice password input incorrect information is returned to the terminal, and the process proceeds to step 8; if the matching succeeds, the process proceeds to step 9;
  • Step 8 The terminal displays the voice password input incorrect information, and the process returns to step 2;
  • Step 9 The server extracts the voiceprint feature vector from the voice audio data and matches it against all user voiceprint feature vectors stored in the database. If the matching fails, recognition is deemed to have failed, voice recognition failure information is returned to the terminal, and the process proceeds to step 10; if the matching succeeds, the voiceprint recognition result is obtained and the process proceeds to step 11;
  • Step 10 The terminal displays the voice recognition failure information, and returns to step 2;
  • Step 11 The server intersects the face recognition result with the voiceprint recognition result. If the intersection is empty, the current user verification is deemed to have failed, verification failure information is returned to the terminal, and the process proceeds to step 12. If the intersection contains exactly one result, the verification is deemed successful and verification success information is returned to the terminal. If the intersection contains more than one result, the voiceprint feature is deemed not distinctive enough, and the server determines whether the preset number of voice password texts has already been sent in the current authentication. If so, user verification is deemed to have failed, verification failure information is returned to the terminal, and the process proceeds to step 12; otherwise the server regenerates a preset voice password text, sends it to the terminal, and the process returns to step 6;
  • Step 12 The terminal displays the verification failure information, and the process returns to step 2.
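
The control flow of steps 2 to 12 can be sketched as a single function with the recognition components injected as callables. Everything named here (`authenticate`, the callable parameters, the status strings, `max_passwords`) is hypothetical; the sketch shows only one pass and leaves "return to step 2" to the caller.

```python
def authenticate(face_match, send_password, get_audio, transcribe,
                 voice_match, max_passwords=3):
    """Run one authentication pass and return (status, user_id).

    face_match()       -> set of candidate user IDs (steps 2-3)
    send_password()    -> voice password text sent to the terminal (step 5)
    get_audio(text)    -> audio recorded by the terminal (step 6)
    transcribe(audio)  -> text content of the audio (step 7)
    voice_match(audio) -> set of candidate user IDs (step 9)
    """
    face_ids = face_match()
    if not face_ids:
        return "face_recognition_failed", None        # step 4
    for _ in range(max_passwords):
        text = send_password()
        audio = get_audio(text)
        if transcribe(audio) != text:
            return "voice_password_incorrect", None   # step 8
        voice_ids = voice_match(audio)
        if not voice_ids:
            return "voice_recognition_failed", None   # step 10
        common = face_ids & voice_ids                 # step 11
        if not common:
            return "verification_failed", None        # step 12
        if len(common) == 1:
            return "verification_succeeded", next(iter(common))
        # More than one candidate: loop to send another password text.
    return "verification_failed", None                # preset number reached
```

Passing the components in as callables keeps the sketch independent of any particular face or voice recognition library.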
  • Step 1 includes the following steps:
  • Step 101 The user inputs user information to the terminal and captures a face video or multiple face images through the terminal, and the terminal uploads the user information and the face video or face images to the server;
  • Step 102 The server extracts multiple face images from the face video, or uses the received images, as face samples to obtain the user facial feature parameters, performs face modeling, associates the result with the user information, and stores it in the database. It then randomly generates a voice password text and sends it to the terminal;
  • Step 103 The terminal displays the voice password text, and collects voice audio data of the user, and uploads the collected voice and audio data to the server;
  • Step 104 The server performs voiceprint feature vector extraction on the voice audio data, and associates the extracted voiceprint feature vector, voice audio data, and corresponding voice password text with the user information, and stores the data in the database.
  • In step 102, sending the randomly generated voice password text to the terminal means randomly generating at least one voice password text and sending the texts to the terminal in sequence;
  • In step 103, the terminal displays the voice password texts in sequence. After the user's voice audio data has been collected three times for one voice password text, the terminal displays the next voice password text. Once three voice audio recordings have been obtained for every voice password text, the terminal sends them to the server;
  • In step 104, after receiving all the voice audio data, the server extracts a voiceprint feature vector from each recording and, for each voice password text, selects the recording whose voiceprint feature vector is the most distinct. The voice password text, the selected voice audio data, and the voiceprint feature vector are associated with the user information and stored in the database.
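
The source does not say how the "most distinct" recording is chosen among the three. As one possible stand-in, and purely as an assumption, the sketch below picks the medoid: the recording whose feature vector has the smallest total Euclidean distance to the others. The function name `pick_representative` is hypothetical.

```python
import math


def pick_representative(vectors):
    """Return the index of the medoid, i.e. the vector with the smallest
    summed Euclidean distance to all vectors in the list."""
    def total_distance(i):
        return sum(math.dist(vectors[i], other) for other in vectors)
    return min(range(len(vectors)), key=total_distance)
```

For example, `pick_representative([[0, 0], [1, 0], [10, 0]])` selects index 1, the recording that sits closest to the other two.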
  • In step 11, when the preset voice password text is regenerated and sent to the terminal, the regenerated text is one of the voice password texts used at registration that corresponds to the user information.
  • In step 3, an image similarity preset value is set in the server. When the key feature parameters in the user facial feature parameters are matched against the user facial feature parameters stored in the database, the matching is determined to be successful if, in the matching result, the user facial feature parameter similarity threshold is smaller than the image similarity preset value; otherwise the matching is determined to have failed.
  • The preset voice password text is a randomly generated passage of easy-to-read text, a randomly generated string of digits, a randomly generated passage of news text, or a voice password text used at registration that corresponds to the user information.
  • The user information includes user age information;
  • In step 3, the face recognition result includes the user information;
  • In step 5, when the server generates and sends a preset voice password text to the terminal, if the user information in the face recognition result indicates an elderly person or a minor, the selected preset voice password text is a passage of easy-to-read text or a string of digits; otherwise the selected preset voice password text is a passage of news text.
  • In step 9, if the matching fails, it is further determined whether the preset number minus one of voice password texts have already been generated. If so, recognition is deemed to have failed, voice recognition failure information is returned to the terminal, and the process proceeds to step 10; otherwise a preset voice password text is regenerated and sent to the terminal, and the process returns to step 6.
  • The preset voice password text that is regenerated and sent to the terminal is a randomly generated passage of easy-to-read text, a randomly generated string of digits, or a randomly generated passage of news text, and it is longer than the preset voice password text generated last time.
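
For the digit-string case, regenerating a password text that is longer than the previous one can be sketched as follows. The function name `next_digit_password` and the growth increment `step` are assumptions; the source states only that the new text is longer than the last.

```python
import secrets


def next_digit_password(previous: str, step: int = 2) -> str:
    """Return a random digit string that is `step` digits longer than the
    previously generated one (previous is assumed to contain only digits)."""
    length = len(previous) + step
    return "".join(secrets.choice("0123456789") for _ in range(length))
```

A longer prompt gives the voiceprint extractor more speech to work with, which is presumably why the source lengthens the text on retry.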
  • In step 9, a voiceprint similarity preset value is set in the server. When the server matches the voiceprint feature vector extracted from the voice audio data against all user voiceprint feature vectors stored in the database, the matching is determined to be successful if, in the matching result, the similarity threshold of each user's voiceprint feature vector is less than the voiceprint similarity preset value; otherwise the matching is determined to have failed.
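
One way to read the "less than the preset value" rule is as a distance test: an enrolled user is a candidate when the distance between voiceprint vectors falls below the preset value. The sketch below assumes cosine distance and non-zero vectors; the metric and the names `cosine_distance` and `match_voiceprints` are not from the source.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_voiceprints(probe, enrolled, preset_value):
    """Return the IDs of enrolled users whose voiceprint vector lies within
    preset_value (cosine distance) of the probe vector.

    enrolled maps user ID -> voiceprint feature vector.
    """
    return {uid for uid, vec in enrolled.items()
            if cosine_distance(probe, vec) < preset_value}
```

Because the function returns a set rather than a single best match, its output can feed directly into the intersection with the face recognition result described in step 11.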
  • In step 5, after the server generates and sends the preset voice password text to the terminal, it also starts timing;
  • In step 9, after the server regenerates and sends the preset voice password text to the terminal, it also starts timing;
  • In step 11, after the server regenerates and sends the preset voice password text to the terminal, it also starts timing;
  • After step 5, the following steps are further included:
  • Step A The server determines whether the voice audio data sent by the terminal is received within a preset time. If the timer reaches the preset time and no voice audio data has been received from the terminal, the process proceeds to step B; otherwise it proceeds to step 7;
  • Step B The server replaces the preset voice password text, resends the replaced text to the terminal, restarts the timing, and returns to step A. The replaced preset voice password text is a newly and randomly generated passage of easy-to-read text, string of digits, or passage of news text.
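
Steps A and B amount to a wait-with-timeout loop. The sketch below uses Python's standard `queue.Queue` as a stand-in for the channel on which audio arrives from the terminal. The source places no limit on how many times the text may be replaced; `max_replacements` is an added assumption so the sketch always terminates, and `await_audio`, `send`, and `make_text` are hypothetical names.

```python
import queue


def await_audio(audio_queue, send, make_text, timeout_s, max_replacements=2):
    """Send a voice password text and wait for audio (Step A); on timeout,
    replace and resend the text and wait again (Step B).

    Returns (text, audio), where audio is None if every wait timed out.
    """
    text = make_text()
    send(text)
    replacements = 0
    while True:
        try:
            # Step A: block until audio arrives or the preset time elapses.
            return text, audio_queue.get(timeout=timeout_s)
        except queue.Empty:
            if replacements == max_replacements:
                return text, None
            # Step B: replace the text, resend it, and restart the timer.
            replacements += 1
            text = make_text()
            send(text)
```

Returning the text together with the audio lets the caller compare the transcription against the text that was actually on screen when the user spoke.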
  • In step 9, if the matching fails, after returning voice recognition failure information to the terminal, the server also proceeds to step 13;
  • In step 11, if the verification is deemed successful, after returning verification success information to the terminal, the server also proceeds to step 13; if the current user verification is deemed to have failed, after returning verification failure information to the terminal, the server also proceeds to step 13;
  • Step 13 The server optimizes the face model corresponding to the user information in the face recognition result using the face images received in the current authentication.
  • The beneficial effect of the invention is that, with the above interactive authentication system and method based on face recognition and voiceprint recognition, face recognition and voiceprint recognition are used together to achieve more secure authentication.
  • FIG. 1 is a system block diagram of an interactive authentication system based on face recognition and voiceprint recognition according to an embodiment of the present invention.
  • The system block diagram of the interactive authentication system based on face recognition and voiceprint recognition according to the present invention is shown in FIG. 1. The system includes a terminal and a server connected through a network. The terminal is configured to acquire a facial video of the detected user, collect voice audio data input by the user, send them to the server, and display prompt information sent by the server. The server is configured to match the user facial feature parameters and the user voiceprint feature vector, and to intersect the voiceprint recognition result with the face recognition result; if the intersection contains exactly one result, the verification is successful and verification success information is returned to the terminal.
  • The interactive authentication method based on face recognition and voiceprint recognition is applied to the above interactive authentication system based on face recognition and voiceprint recognition. First, the user uses the terminal to register with the server, and the server stores the user information, the user facial feature parameters, and the user voiceprint feature vector in the database.
  • During authentication, the terminal acquires the facial video of the detected user and sends it to the server. The server filters and denoises the facial video, extracts key frames, acquires the user facial feature parameters from the key frames, selects key feature parameters, and matches them against all user facial feature parameters stored in the database. If the matching fails, the server returns face recognition failure information to the terminal, the terminal displays the face recognition failure to prompt the user, and the process returns to the authentication step to re-authenticate.
  • If the matching succeeds, the face recognition result is obtained, and the server generates a preset voice password text and sends it to the terminal. The terminal displays the voice password text, collects the voice audio data input by the user, and uploads it to the server. The server converts the received voice audio data into text content and matches the text content against the previously sent voice password text. If the matching fails, recognition is deemed to have failed: the server returns voice password input incorrect information to the terminal, the terminal displays it, and the process returns to the authentication step to re-authenticate. If the matching succeeds, the server extracts the voiceprint feature vector from the voice audio data and matches it against all user voiceprint feature vectors stored in the database. If that matching fails, recognition is deemed to have failed: the server returns voice recognition failure information to the terminal, the terminal displays it, and the process returns to the authentication step to re-authenticate. If it succeeds, the voiceprint recognition result is obtained.
  • The server then intersects the face recognition result with the voiceprint recognition result. If the intersection is empty, the user verification is deemed to have failed: the server returns verification failure information to the terminal, the terminal displays it, and the process returns to the authentication step to re-authenticate. If the intersection contains exactly one result, the verification is deemed successful and verification success information is returned to the terminal. If the intersection contains more than one result, the voiceprint feature is deemed not distinctive enough, and the server determines whether the preset number of voice password texts has already been sent in this authentication. If so, user verification is deemed to have failed: the server returns verification failure information to the terminal, the terminal displays it, and the process returns to the authentication step to re-authenticate. Otherwise the server regenerates a preset voice password text, sends it to the terminal, and the process returns to the step in which the terminal displays the voice password text.
  • An interactive authentication system based on face recognition and voiceprint recognition according to an embodiment of the present invention is shown in FIG. 1 and includes a terminal and a server connected through a network. The terminal may include a display module, a face video capture module, a voice collection module, and a first communication module. The server may include a face recognition module, a voice recognition module, a verification module, a database, and a second communication module. The display module, the face video capture module, and the voice collection module are each connected to the first communication module. The face recognition module, the voice recognition module, and the verification module are each connected to the second communication module, and the face recognition module and the voice recognition module are each connected to the verification module. The database is connected to the face recognition module, the voice recognition module, and the verification module, and the first communication module and the second communication module are connected through a network.
  • The terminal is configured to acquire the facial video of the detected user, collect the voice audio data input by the user, send them to the server, and display the prompt information sent by the server.
  • the terminal may include a display module, a face video acquisition module, a voice collection module, and a first communication module.
  • The face video capture module is configured to obtain the facial video of the detected user and send it to the face recognition module through the first communication module and the second communication module; it can be a camera module such as a camera.
  • The voice collection module is configured to collect the voice audio data input by the user and send it to the voice recognition module through the first communication module and the second communication module; it can be a pickup such as a microphone.
  • The display module is configured to display prompt information sent by the server, including face recognition failure information, voice password input incorrect information, verification failure information, voice password text, and verification success information.
  • the first communication module is used for information interaction between the terminal and the server.
  • The server is configured to match the user facial feature parameters and the user voiceprint feature vector, and to intersect the voiceprint recognition result with the face recognition result. If the intersection contains exactly one result, the verification is successful and verification success information is returned to the terminal.
  • the matching of the user facial feature parameters and the matching of the user voiceprint feature vector is preferably: the server acquires the user facial feature parameters from the received facial video of the detected user, and acquires the obtained user facial feature parameters and all the pre-stored parameters of the server.
  • the user facial feature parameters are matched, and the face recognition result is obtained after the matching is successful, and then the preset voice password text is sent to the terminal, and after receiving the voice audio data sent by the voice collection module of the terminal, the voice audio data is converted into text content, and Matching the text content with the previously sent voice password text, and if the matching is successful, extracting the voiceprint feature vector in the voice audio data, and matching it with all user voiceprint feature vectors pre-stored by the server, and matching is successful. Voiceprint recognition results.
  • the server may include a face recognition module, a voice recognition module, a verification module, a database, and a second communication module.
  • the second communication module is used for information interaction between the terminal and the server.
  • the face recognition module is configured to filter and denoise the face video of the detected user, extract key frames, acquire the user facial feature parameters from the key frames, and match the key feature parameters selected from them against all user facial feature parameters stored in the database. If the matching succeeds, the matching success result, which is the face recognition result, is sent to the verification module. If the matching fails, face recognition failure information is returned to the terminal.
  • the face recognition module may set an image similarity preset value. When the key feature parameters in the user facial feature parameters are matched against the user facial feature parameters stored in the database, the matching is determined to be successful if the similarity threshold of the user facial feature parameters in the matched result is smaller than the image similarity preset value; otherwise the matching is determined to have failed.
  • the matching result of the face recognition module may include user information, and the user information includes user age information.
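  • The threshold rule for face matching can be sketched as follows. This is a minimal illustration, not the method of the invention: it assumes the compared score is a Euclidean distance between feature vectors (so that "smaller than the preset value" means a match), and the names `euclidean`, `match_faces`, and the preset of 0.6 are hypothetical. It returns every enrolled user that passes the threshold, because the later intersection step works on a set of candidates.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_faces(probe, enrolled, preset=0.6):
    """Return the ids of all enrolled users whose stored features pass the threshold.

    probe    -- feature vector taken from the key frames (assumed already extracted)
    enrolled -- dict mapping user id to stored feature vector
    preset   -- hypothetical image similarity preset value, treated as a distance bound
    """
    return {uid for uid, feat in enrolled.items() if euclidean(probe, feat) < preset}

enrolled = {"alice": [0.0, 0.0], "bob": [1.0, 1.0]}
candidates = match_faces([0.1, 0.0], enrolled)
```

  • An empty result corresponds to a failed match, in which case face recognition failure information would be returned to the terminal.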
  • the voice recognition module is configured to send the preset voice password text to the terminal after receiving the voice recognition request sent by the verification module, so that the terminal displays the voice password text through the display module. After receiving the voice audio data sent by the voice collection module of the terminal, the voice recognition module converts it into text content and matches the text content against the previously sent voice password text. If the matching fails, the recognition is considered to have failed and voice password input error information is returned to the terminal. If the matching succeeds, the voiceprint feature vector is extracted from the voice audio data and matched against all user voiceprint feature vectors stored in the database. If that matching fails, the recognition is considered to have failed and voice recognition failure information is returned to the terminal. If it succeeds, the matching success result, which is the voiceprint recognition result, is sent to the verification module.
  • the preset voice password text is a piece of easy-to-read text, a string of digits, a piece of news text, or the voice password text from registration corresponding to the user information. When the voice recognition module sends the preset voice password text to the terminal, the selection may be made according to the voice recognition request. If the voice recognition request contains a request to send the registration voice password text, the voice recognition module selects the registration voice password text corresponding to the user information as the preset voice password text. If the voice recognition request contains user age information, the user's age is judged from it: if the user is elderly or a minor, the preset voice password text is a piece of easy-to-read text or a string of digits; otherwise the selected preset voice password text is a piece of news text.
  • in addition, after the voice recognition module sends the preset voice password text to the terminal, it starts a timer and determines whether the voice audio data sent by the terminal is received within a preset time (for example, 10 seconds). If no voice audio data has been received from the terminal when the preset time is reached, the module replaces the preset voice password text, resends the replaced text to the terminal, restarts the timer, and returns to the step of determining whether the voice audio data is received within the preset time.
  • the verification module is configured to send a voice recognition request to the voice recognition module after receiving the matching success result sent by the face recognition module, and, after receiving the matching success result sent by the voice recognition module, to take the intersection of the face recognition result and the voiceprint recognition result. If the intersection is empty, the user verification is considered to have failed and verification failure information is returned to the terminal. If the intersection contains exactly one result, the verification is considered successful and verification success information is returned to the terminal. If the intersection contains more than one result, the voiceprint feature is considered insufficiently distinctive and the voice recognition request is resent to the voice recognition module; if a preset number of voice recognition requests has already been sent at that point, the current user verification is considered to have failed and verification failure information is returned to the terminal.
  • the voice recognition request sent by the verification module to the voice recognition module may include the user age information or a request to send the registration voice password text. It may also be that, when the voice recognition request is being sent for the preset-number-th time (for example, the third time when the preset number is 3), the request includes the request to send the registration voice password text.
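  • The decision made by the verification module can be sketched as a small function. This is an illustrative sketch only: the name `verify`, the string outcomes, and the representation of each recognition result as a set of user ids are assumptions, and the default of 3 simply mirrors the example preset number given above.

```python
def verify(face_ids, voice_ids, requests_sent, max_requests=3):
    """Combine face and voiceprint candidates and decide the outcome.

    face_ids, voice_ids -- user ids matched by each recogniser (assumed to be sets)
    requests_sent       -- voice recognition requests already sent in this authentication
    max_requests        -- preset number of voice recognition requests (example value)
    Returns (outcome, user_id), where outcome is "success", "failure" or "retry".
    """
    common = set(face_ids) & set(voice_ids)
    if len(common) == 1:
        # Exactly one user satisfies both recognisers.
        return "success", next(iter(common))
    if not common:
        # No user satisfies both recognisers.
        return "failure", None
    # Several users remain: the voiceprint was not distinctive enough.
    if requests_sent >= max_requests:
        return "failure", None
    return "retry", None

outcome = verify({"alice", "bob"}, {"bob", "carol"}, requests_sent=1)
```

  • A "retry" outcome corresponds to resending the voice recognition request so that a new voice password text is shown to the user.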
  • the processing method is as follows:
  • Step 1 The user uses the terminal to perform user registration with the server, and the server stores user information, the user facial feature parameter, and the user voiceprint feature vector in the database.
  • the user information preferably includes user age information, and the step may specifically include the following steps:
  • Step 101 The user inputs user information to the terminal, and collects a face video or a plurality of face images through the terminal, and the terminal uploads the user information and the face video or the plurality of face images to the server.
  • Step 102 The server extracts multiple face images from the face video, or uses the received face images, as face samples to obtain the user facial feature parameters, performs face modeling, associates the result with the user information and stores it in the database, and sends randomly generated voice password text to the terminal.
  • when the randomly generated voice password text is sent to the terminal, at least one piece of voice password text may be randomly generated and sent to the terminal in sequence; for example, three pieces of voice password text are randomly generated, randomly ordered, and then sent to the terminal one after another.
  • the number of pieces of voice password text that are randomly generated is determined by the security level of the service being authenticated. Generally, the higher the security requirement of the service, the more pieces of voice password text are randomly generated at registration.
  • Step 103 The terminal displays the voice password text, and collects voice audio data of the user, and uploads the collected voice audio data to the server.
  • if the terminal receives several pieces of voice password text in sequence, it displays them in order: once the user's voice audio data for one piece of voice password text has been collected three times, the next piece is displayed, until three voice audio recordings have been obtained for every piece of voice password text, and all of them are sent to the server. For example, when the terminal receives two pieces of voice password text in sequence, it first displays the first piece and collects three times the voice audio data the user inputs for it, then displays the second piece and collects three times the voice audio data the user inputs for it, and finally sends the three recordings for the first piece together with the three recordings for the second piece, six user voice audio recordings in total, to the server.
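  • The collection order at registration can be illustrated with a short sketch. The function name `collection_plan` and the tuple layout are hypothetical; only the rule of three recordings per piece of voice password text comes from the description above.

```python
def collection_plan(texts, repeats=3):
    """List (password_text, attempt_number) pairs in the order the terminal prompts for them.

    Each piece of voice password text is shown until `repeats` recordings are collected,
    then the next piece is shown.
    """
    return [(text, attempt) for text in texts for attempt in range(1, repeats + 1)]

plan = collection_plan(["first text", "second text"])
total_recordings = len(plan)  # two texts with three recordings each
```

  • With two pieces of voice password text the plan contains six entries, which matches the six recordings sent to the server in the example.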
  • Step 104 The server performs voiceprint feature vector extraction on the voice audio data, and associates the extracted voiceprint feature vector, voice audio data, and corresponding voice password text with the user information, and stores the data in the database.
  • if the server receives several voice audio recordings, it extracts the voiceprint feature vector from each of them after all have been received, and for each piece of voice password text selects the one voice audio recording whose voiceprint feature vector is the most distinctive. The voice password text, the selected voice audio recording and its voiceprint feature vector are associated with the user information and stored in the database. That is, one piece of voice password text corresponds to one voice audio recording, and the other two recordings can be deleted.
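  • The description does not define how the "most distinctive" voiceprint feature vector is chosen. The sketch below uses one possible criterion as an assumption: pick the recording whose vector lies farthest from the nearest vector of any other enrolled user. The names `distance` and `most_distinctive` are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length voiceprint feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def most_distinctive(candidates, other_users):
    """Return the index of the candidate vector best separated from other users.

    candidates  -- vectors extracted from the recordings of one voice password text
    other_users -- vectors already stored for other users (may be empty)
    Assumed criterion: maximise the distance to the closest other user.
    """
    if not other_users:
        return 0  # nothing to compare against, keep the first recording
    def separation(vec):
        return min(distance(vec, other) for other in other_users)
    return max(range(len(candidates)), key=lambda i: separation(candidates[i]))

recordings = [[0.1, 0.0], [0.9, 0.9], [0.5, 0.5]]
keep = most_distinctive(recordings, other_users=[[0.0, 0.0]])
```

  • Only the recording at the returned index would be stored with its voice password text; the remaining recordings can be discarded.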
  • Step 2 During authentication, the terminal acquires the face video of the detected user and sends it to the server.
  • Step 3 The server filters and denoises the face video of the detected user, extracts key frames, acquires the user facial feature parameters from the key frames, and matches the key feature parameters selected from them against all user facial feature parameters stored in the database. If the matching succeeds, the face recognition result is obtained and the process proceeds to step 5. If the matching fails, the process proceeds to step 4.
  • an image similarity preset value may be set in the server. When the key feature parameters in the user facial feature parameters are matched against the user facial feature parameters stored in the database, the matching is determined to be successful if the similarity threshold of the user facial feature parameters in the matched result is smaller than the image similarity preset value; otherwise the matching is determined to have failed.
  • the face recognition result preferably includes user information, which, as can be seen from step 1, preferably includes user age information.
  • Step 4 The server returns face recognition failure information to the terminal; the terminal displays that face recognition failed and prompts the user, and the process returns to step 2.
  • Step 5 The server generates and sends a preset voice password text to the terminal.
  • the preset voice password text may be a randomly generated piece of readable text or a randomly generated piece of numbers or a randomly generated piece of news type text or a voice password text at the time of registration corresponding to the user information.
  • when the server generates and sends the preset voice password text to the terminal, if the user information in the face recognition result shows (as can be judged from the user age information) that the user is elderly or a minor, the preset voice password text is a piece of easy-to-read text or a string of digits, so as to ensure that the user can understand and read the voice password text aloud. Otherwise the user information shows that the user is an adult; since adults can generally understand and read the voice password text, a piece of news text is selected to increase recognition accuracy.
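  • The age-based choice of voice password text in step 5 can be sketched as follows. The age boundaries (under 18 for a minor, 60 and over for elderly), the text pools, the six-digit length, and the name `choose_password_text` are all assumptions made for illustration; the description only states the rule itself.

```python
import random

# Hypothetical text pools; the description does not say where the texts come from.
EASY_TEXTS = ["the sky is blue", "open the front door"]
NEWS_TEXTS = ["the city council approved the new transit budget on Tuesday"]

def choose_password_text(age, rng, minor_below=18, elderly_from=60):
    """Return (kind, text) for the preset voice password text.

    age -- user age from the face recognition result, or None if unknown
    rng -- a random.Random instance, passed in so the choice is reproducible
    """
    if age is not None and (age < minor_below or age >= elderly_from):
        # Minors and elderly users get easy-to-read text or a digit string.
        if rng.random() < 0.5:
            return "easy", rng.choice(EASY_TEXTS)
        return "digits", "".join(rng.choice("0123456789") for _ in range(6))
    # Adults (or users of unknown age) get a piece of news text.
    return "news", rng.choice(NEWS_TEXTS)

kind, text = choose_password_text(age=35, rng=random.Random(0))
```

  • Passing the random generator as a parameter keeps the sketch testable; a deployed system would more likely draw from a non-deterministic source.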
  • Step 6 The terminal displays the voice password text, and collects the voice audio data input by the user and uploads it to the server.
  • Step 7 The server converts the received voice audio data into text content and matches the text content against the previously sent voice password text. If the matching fails, the recognition is considered to have failed, voice password input error information is returned to the terminal, and the process proceeds to step 8. If the matching succeeds, the process proceeds to step 9.
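  • The text comparison in step 7 can be sketched as below. Speech-to-text conversion is outside the scope of this sketch and is assumed to have already produced a transcript. The normalisation rule (ignore case, punctuation and spacing) and the names `normalize` and `password_text_matches` are assumptions; the description only says that the text content is matched against the voice password text.

```python
def normalize(text):
    """Keep only letters and digits, lower-cased, so punctuation and spacing are ignored."""
    return "".join(ch.lower() for ch in text if ch.isalnum())

def password_text_matches(transcript, expected):
    """Return True when the recognised transcript equals the sent voice password text."""
    return normalize(transcript) == normalize(expected)

ok = password_text_matches("Open the door, please!", "open the door please")
```

  • A False result corresponds to returning voice password input error information to the terminal.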
  • Step 8 The terminal displays the voice password input error information, and the process returns to step 2.
  • Step 9 The server extracts the voiceprint feature vector from the voice audio data and matches it against all user voiceprint feature vectors stored in the database. If the matching fails, the recognition is considered to have failed, voice recognition failure information is returned to the terminal, and the process proceeds to step 10. If the matching succeeds, the voiceprint recognition result is obtained and the process proceeds to step 11.
  • in step 9, if the matching fails, it may also be determined whether the preset number minus one of preset voice password texts have already been generated (for example, if the preset number is 3, whether 2 voice password texts have been generated). If so, the recognition is considered to have failed, voice recognition failure information is returned to the terminal, and the process proceeds to step 10. Otherwise the preset voice password text is regenerated and sent to the terminal, and the process returns to step 6. The regenerated preset voice password text is a randomly generated piece of easy-to-read text, a randomly generated string of digits, or a randomly generated piece of news text whose length is greater than that of the previously generated preset voice password text; as can be seen, this may correspond to the generation method in step 5.
  • a voiceprint similarity preset value may also be set in the server. When the voiceprint feature vector extracted from the voice audio data is matched against all user voiceprint feature vectors stored in the database, the matching is determined to be successful if the similarity threshold of the user voiceprint feature vector in the matched result is smaller than the voiceprint similarity preset value; otherwise the matching is determined to have failed.
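  • The optional retry rule for a failed voiceprint match in step 9 can be sketched as follows. The function name `on_voiceprint_mismatch`, the use of digit strings, and the choice to lengthen the text by two characters are assumptions; the description only requires that the regenerated text be longer than the previous one and that the retry stop once the preset number minus one of texts have been generated.

```python
import random

def on_voiceprint_mismatch(texts_generated, preset_count, previous_text, rng):
    """Decide what happens after a failed voiceprint match.

    texts_generated -- voice password texts generated so far in this authentication
    preset_count    -- preset number (for example 3)
    previous_text   -- the voice password text that was just used
    Returns ("fail", None) or ("retry", new_text), where new_text is longer than before.
    """
    if texts_generated >= preset_count - 1:
        return "fail", None
    new_length = len(previous_text) + 2  # assumed increment, any longer text would do
    new_text = "".join(rng.choice("0123456789") for _ in range(new_length))
    return "retry", new_text

action, new_text = on_voiceprint_mismatch(1, 3, "123456", random.Random(0))
```

  • A "retry" result corresponds to returning to step 6 with the new text, and a "fail" result to proceeding to step 10.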
  • Step 10 The terminal displays the voice recognition failure information, and returns to step 2.
  • Step 11 The server takes the intersection of the face recognition result and the voiceprint recognition result. If the intersection is empty, the current user verification is considered to have failed, verification failure information is returned to the terminal, and the process proceeds to step 12. If the intersection contains exactly one result, the verification is considered successful and verification success information is returned to the terminal. If the intersection contains more than one result, the voiceprint feature is considered insufficiently distinctive, and it is determined whether the preset number of voice password texts has already been sent during the current authentication. If so, the user verification is considered to have failed, verification failure information is returned to the terminal, and the process proceeds to step 12; otherwise the preset voice password text is regenerated and sent to the terminal, and the process returns to step 6.
  • here, the regenerated preset voice password text is one of the voice password texts from registration corresponding to the user information, that is, in this example, one of the voice password texts randomly generated in step 102; when there is only one, that voice password text is selected directly. If no random voice password text was generated at registration as in step 102, and the user voice audio data was instead collected directly and the user's voiceprint feature vector obtained from it, the voice password text corresponding to that user voice audio data may be selected (it can be obtained by converting the user voice audio data into text).
  • Step 12 The terminal displays the verification failure information, and the process returns to step 2.
  • in step 5, when the server generates and sends the preset voice password text to the terminal, a timer may also be started. This applies whether the server is generating and sending the preset voice password text for the first time in the current authentication or regenerating and sending it during the current authentication; that is, the timer starts whenever the server generates and sends the preset voice password text to the terminal.
  • after step 5, the following steps may also be included:
  • Step A The server determines whether the voice audio data sent by the terminal is received within a preset time. If no voice audio data has been received from the terminal when the preset time is reached, the process proceeds to step B; otherwise it proceeds to step 7;
  • Step B The server replaces the preset voice password text, resends the replaced preset voice password text to the terminal, restarts the timer, and returns to step A. The replaced preset voice password text is a newly randomly generated piece of easy-to-read text, string of digits, or piece of news text.
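  • Steps A and B amount to a send, wait, replace loop, sketched below. The callables `send_text`, `receive_audio` and `make_text` stand in for the communication module and the text generator and are hypothetical, as is the `max_rounds` cap: the description loops without a stated limit, and the cap is added here only so that the sketch always terminates.

```python
def await_voice(send_text, receive_audio, make_text, timeout_s=10.0, max_rounds=5):
    """Send a voice password text and wait for audio, replacing the text on each timeout.

    send_text     -- callable that delivers a text to the terminal
    receive_audio -- callable(timeout_s) returning audio bytes, or None on timeout
    make_text     -- callable producing a fresh voice password text
    Returns (text, audio) for the text that was answered, or (None, None) if none was.
    """
    for _ in range(max_rounds):
        text = make_text()                 # step 5 or step B: (re)generate the text
        send_text(text)                    # send it; the timer starts with the wait below
        audio = receive_audio(timeout_s)   # step A: wait up to the preset time
        if audio is not None:
            return text, audio             # proceed to step 7 with this text and audio
    return None, None

# Simulated terminal that stays silent twice and then answers.
sent = []
texts = iter(["text one", "text two", "text three"])
replies = iter([None, None, b"pcm-bytes"])
result = await_voice(sent.append, lambda timeout: next(replies), lambda: next(texts))
```

  • Returning the text together with the audio matters because step 7 must compare the transcript against the text that was actually on screen when the user spoke.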
  • in step 9, if the matching fails, then after returning the voice recognition failure information to the terminal the server may also proceed to step 13, while the terminal still proceeds to step 10;
  • in step 11, if the verification is determined to be successful and verification success information is returned to the terminal, the server may further proceed to step 13. If the user verification is determined to have failed and verification failure information is returned to the terminal, the server may also proceed to step 13, while the terminal still proceeds to step 12.
  • Step 13 may be: the server optimizes the face modeling corresponding to the user information in the face recognition result by using the face image received in the current authentication.
  • the purpose is as follows: since the face recognition succeeded, the face images or face video collected for this recognition are correct, and this correct face image information can be used to optimize the face modeling, improve the accuracy of face recognition, and delete invalid user facial feature parameters, thereby improving computational efficiency.
  • the server may further use the voice audio data received in the current authentication to optimize the voiceprint feature data corresponding to the user information in the face recognition result.
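  • The description does not say how the stored face model or voiceprint data is optimized in step 13. As one possible illustration only, the sketch below blends a newly observed feature vector into the stored one with an exponential moving average; this technique, the name `refine_template`, and the weight of 0.1 are assumptions and not part of the disclosed method.

```python
def refine_template(stored, observed, weight=0.1):
    """Blend a newly observed feature vector into the stored template.

    stored   -- feature vector currently held in the database
    observed -- feature vector from the current, successfully recognised sample
    weight   -- assumed share given to the new observation (0 keeps the old template)
    """
    if len(stored) != len(observed):
        raise ValueError("feature vectors must have the same length")
    return [(1 - weight) * s + weight * o for s, o in zip(stored, observed)]

updated = refine_template([1.0, 0.0], [0.0, 1.0], weight=0.5)
```

  • A small weight lets the template follow gradual changes in appearance or voice while limiting the effect of any single sample.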
  • in the present invention, face recognition is performed first and voiceprint recognition afterwards.
  • the reasons are as follows. First, face recognition has been developed over several decades; the technology is relatively mature, the algorithms are efficient, and processing is fast. Voiceprint recognition, by contrast, differs from the recognition of other physiological features.
  • the features used for voiceprint recognition must be "personalized" features, while the features used to recognize the speaker (that is, the user who needs voiceprint recognition) must, for that speaker, be "common" features.
  • features that can characterize a person include: 1) acoustic features related to the anatomical structure of the human vocal mechanism (such as spectrum, cepstrum, formants, pitch, and reflection coefficients), as well as nasal sounds, deep breath sounds, hoarseness, laughter, and so on; 2) semantics, rhetoric, pronunciation, speech habits, and so on, which are affected by socioeconomic status, education level, place of birth, and so on; 3) personal characteristics, or characteristics such as rhythm, speed, intonation, and volume that are influenced by the parents.
  • the features currently usable in automatic voiceprint recognition models include: 1) acoustic features (cepstrum); 2) lexical features (speaker-dependent word n-grams and phoneme n-grams); 3) prosodic features (pitch and energy "gestures" described by n-grams); 4) language, dialect, and accent information; 5) channel information (which channel is used). Therefore, in the solution of the present invention, the preset voice password text may be randomly generated based on the user information.
  • since the specific methods of face recognition and voiceprint recognition mentioned in the present invention are relatively mature technologies, they are not described in detail here.

Abstract

The present invention relates to authentication technology. It solves the problem that the detection result of existing face recognition authentication can be easily spoofed, and provides an interactive authentication system and method based on face recognition and voiceprint recognition. The technical solution of the present invention can be summarized as follows: the interactive authentication system based on face recognition and voiceprint recognition comprises a terminal and a server connected through a network; the terminal is used to obtain a face video of a detected user, collect voice audio data input by the user, send the face video data and the voice audio data to the server, and display prompt information sent by the server; the server is used to match user facial feature parameters, match a user voiceprint feature vector, and take the intersection of the voiceprint recognition result and the face recognition result; if the intersection contains only one result, the authentication is considered successful and authentication success information is returned to the terminal. The present invention has the beneficial effect of improved security and is applicable to authentication systems.
PCT/CN2017/114928 2016-12-20 2017-12-07 Interactive authentication system and method based on voiceprint recognition and face recognition Ceased WO2018113526A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611181543.3A CN106790054A (zh) 2016-12-20 2016-12-20 基于人脸识别和声纹识别的交互式认证系统及方法
CN201611181543.3 2016-12-20

Publications (1)

Publication Number Publication Date
WO2018113526A1 true WO2018113526A1 (fr) 2018-06-28

Family

ID=58890935

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/114928 Ceased WO2018113526A1 (fr) Interactive authentication system and method based on voiceprint recognition and face recognition 2016-12-20 2017-12-07

Country Status (2)

Country Link
CN (1) CN106790054A (fr)
WO (1) WO2018113526A1 (fr)

Cited By (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108694767A (zh) * 2018-07-13 2018-10-23 北京工业职业技术学院 身份认证装置和智能门禁系统
CN108846676A (zh) * 2018-08-02 2018-11-20 平安科技(深圳)有限公司 生物特征辅助支付方法、装置、计算机设备及存储介质
CN109543377A (zh) * 2018-10-17 2019-03-29 深圳壹账通智能科技有限公司 身份验证方法、装置、计算机设备和存储介质
CN109842805A (zh) * 2019-01-04 2019-06-04 平安科技(深圳)有限公司 视频看点的生成方法、装置、计算机设备及存储介质
CN110074519A (zh) * 2019-04-10 2019-08-02 南京启诺信息技术有限公司 一种语言识别手环
CN110163630A (zh) * 2019-04-15 2019-08-23 中国平安人寿保险股份有限公司 产品监管方法、装置、计算机设备及存储介质
CN110287363A (zh) * 2019-05-22 2019-09-27 深圳壹账通智能科技有限公司 基于深度学习的资源推送方法、装置、设备及存储介质
CN110309570A (zh) * 2019-06-21 2019-10-08 济南大学 一种具有认知能力的多模态仿真实验容器及方法
CN110363278A (zh) * 2019-07-23 2019-10-22 广东小天才科技有限公司 一种亲子互动方法、机器人、服务器及亲子互动系统
CN110427468A (zh) * 2019-07-10 2019-11-08 深圳市一恒科电子科技有限公司 一种基于儿童云服务的学习方法及学习机
CN110442033A (zh) * 2019-07-30 2019-11-12 恒大智慧科技有限公司 家居设备的权限控制方法、装置、计算机设备及存储介质
CN110472394A (zh) * 2019-07-24 2019-11-19 天脉聚源(杭州)传媒科技有限公司 一种预留信息处理方法、系统、装置和存储介质
CN110751471A (zh) * 2018-07-06 2020-02-04 上海博泰悦臻网络技术服务有限公司 基于声纹识别的车内支付方法与云端服务器
WO2020029496A1 (fr) * 2018-08-10 2020-02-13 珠海格力电器股份有限公司 Procédé et dispositif de distribution sélective d'informations
CN110807630A (zh) * 2019-09-19 2020-02-18 平安科技(深圳)有限公司 基于人脸识别的支付方法、装置、计算机设备和存储介质
CN111063358A (zh) * 2019-12-18 2020-04-24 浙江中辰城市应急服务管理有限公司 一种具有生命体识别功能的早期火灾预警和逃生指示系统
CN111079443A (zh) * 2019-12-26 2020-04-28 上海传英信息技术有限公司 一种视频通话人机交互方法及装置
CN111103966A (zh) * 2018-10-25 2020-05-05 安徽黑洞科技有限公司 一种智能展品统筹控制系统
CN111128144A (zh) * 2019-10-16 2020-05-08 国网浙江省电力有限公司金华供电公司 一种语音电网调度系统及方法
CN111341464A (zh) * 2020-03-25 2020-06-26 北京金和网络股份有限公司 疫情信息采集与分析方法及系统
CN111368737A (zh) * 2020-03-04 2020-07-03 开放智能机器(上海)有限公司 一种自动分析员工工作行为的系统及方法
CN111767805A (zh) * 2020-06-10 2020-10-13 云知声智能科技股份有限公司 多模态数据自动清洗与标注方法与系统
CN111803955A (zh) * 2019-04-12 2020-10-23 奇酷互联网络科技(深圳)有限公司 通过可穿戴设备管理账号的方法及系统、存储装置
CN112000939A (zh) * 2020-08-04 2020-11-27 叶兵 一种基于数字证书认证的律师远程法律服务系统及方法
CN112069484A (zh) * 2020-11-10 2020-12-11 中国科学院自动化研究所 基于多模态交互式的信息采集方法及系统
CN112185363A (zh) * 2020-10-21 2021-01-05 北京猿力未来科技有限公司 音频处理方法及装置
CN112202912A (zh) * 2020-10-12 2021-01-08 安徽兴安电气设备股份有限公司 一种二次供水远程自动监控系统
CN112235682A (zh) * 2020-11-17 2021-01-15 歌尔科技有限公司 耳机通话保密方法以及通话装置
CN112651610A (zh) * 2020-12-17 2021-04-13 韦福瑞 一种基于声音判断与识别模拟环境适应能力的检查方法和系统
CN112819061A (zh) * 2021-01-27 2021-05-18 北京小米移动软件有限公司 口令信息识别方法、装置、设备及存储介质
CN113032758A (zh) * 2021-03-26 2021-06-25 平安银行股份有限公司 视讯问答流程的身份识别方法、装置、设备及存储介质
CN113034110A (zh) * 2021-03-30 2021-06-25 泰康保险集团股份有限公司 基于视频审核的业务处理方法、系统、介质与电子设备
CN113112664A (zh) * 2021-02-23 2021-07-13 广州李博士科技研究有限公司 一种人脸识别立式门禁设备
CN113221672A (zh) * 2021-04-22 2021-08-06 国网安徽省电力有限公司 一种用于电力仪表库房的面部识别设备
CN113239041A (zh) * 2021-05-13 2021-08-10 大连交通大学 一种计算机大数据处理的采集系统及方法
CN113343211A (zh) * 2021-06-24 2021-09-03 工银科技有限公司 数据处理方法、处理系统、电子设备及存储介质
CN113469012A (zh) * 2021-06-28 2021-10-01 广州云从鼎望科技有限公司 用户刷脸验证的方法、系统、介质及装置
CN113890736A (zh) * 2021-11-22 2022-01-04 国网四川省电力公司成都供电公司 一种基于国密sm9算法的移动终端身份认证方法及系统
CN114007043A (zh) * 2021-10-27 2022-02-01 北京鼎普科技股份有限公司 基于视频数据指纹特征的视频解码方法、装置及系统
CN114038087A (zh) * 2020-07-20 2022-02-11 阜阳万瑞斯电子锁业有限公司 一种用于电子锁语音识别的开锁系统及方法
CN114168722A (zh) * 2021-11-23 2022-03-11 安徽经邦软件技术有限公司 基于人工智能技术的财务问答机器人
CN114187630A (zh) * 2021-11-29 2022-03-15 华人运通(上海)云计算科技有限公司 一种人脸特征的比对方法及系统
CN114511941A (zh) * 2022-02-16 2022-05-17 中国工商银行股份有限公司 防作弊签到方法、装置、设备、介质和程序产品
CN114580034A (zh) * 2022-03-10 2022-06-03 合肥工业大学 一种基于fpga的ro puf双重身份认证系统及其控制方法
CN114842513A (zh) * 2022-04-02 2022-08-02 湖南麓木和择科技有限公司 一种基于互联网的数据采集系统及信息采集装置
CN114876321A (zh) * 2022-05-23 2022-08-09 江苏德普尔门控科技有限公司 一种智能化自动感应式带家居系统的入户门
CN114979543A (zh) * 2021-02-24 2022-08-30 中国联合网络通信集团有限公司 一种智能家居控制方法及装置
CN115189911A (zh) * 2022-05-30 2022-10-14 平安科技(深圳)有限公司 面签文件的生成方法、装置、设备及存储介质
CN115311713A (zh) * 2022-08-05 2022-11-08 南京甄视智能科技有限公司 终端提取深度学习网络人脸特征值并网内高效复用的方法与系统
CN115412284A (zh) * 2022-07-04 2022-11-29 国网浙江省电力有限公司杭州市临安区供电公司 一种电力现场故障信息安全传输方法
CN115641105A (zh) * 2022-12-01 2023-01-24 中网道科技集团股份有限公司 一种监控社区矫正对象请假外出的数据处理方法
CN116189680A (zh) * 2023-05-04 2023-05-30 北京水晶石数字科技股份有限公司 一种展演智能设备的语音唤醒方法
CN116259095A (zh) * 2023-03-31 2023-06-13 南京审计大学 一种基于计算机的识别系统及方法
CN117273747A (zh) * 2023-09-28 2023-12-22 广州佳新智能科技有限公司 基于人脸图像识别的支付方法、装置、存储介质和设备
CN117376854A (zh) * 2023-10-30 2024-01-09 深圳中网讯通技术有限公司 多媒体短信内容的生成方法、装置、设备及存储介质
CN113127827B (zh) * 2021-05-08 2024-03-08 上海日羲科技有限公司 一种基于ai系统的用户指令处理方法
CN120410783A (zh) * 2025-07-03 2025-08-01 遇见美好文旅科技集团有限公司 一种中老年旅游erp业务全流程管理方法及系统

Families Citing this family (52)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106790054A (zh) * 2016-12-20 2017-05-31 四川长虹电器股份有限公司 基于人脸识别和声纹识别的交互式认证系统及方法
CN106878344A (zh) * 2017-04-25 2017-06-20 北京洋浦伟业科技发展有限公司 一种生物特征认证、注册方法及装置
CN109147770B (zh) * 2017-06-16 2023-07-28 阿里巴巴集团控股有限公司 声音识别特征的优化、动态注册方法、客户端和服务器
CN107358699B (zh) * 2017-07-17 2020-04-24 深圳市斑点猫信息技术有限公司 一种安全验证方法及系统
CN107481449A (zh) * 2017-08-25 2017-12-15 南京真格邦软件有限公司 一种基于人脸识别和语音识别的vtm机
CN107564541B (zh) * 2017-09-04 2018-11-02 南方医科大学南方医院 一种便携式婴儿啼哭声识别器及其识别方法
CN107832720B (zh) * 2017-11-16 2022-07-08 北京百度网讯科技有限公司 基于人工智能的信息处理方法和装置
CN108154884A (zh) * 2017-12-07 2018-06-12 浙江海洋大学 一种防替考的身份识别系统
CN108074310B (zh) * 2017-12-21 2021-06-11 广东汇泰龙科技股份有限公司 基于语音识别模块的语音交互方法及智能锁管理系统
CN108171137B (zh) * 2017-12-22 2021-12-28 深圳市泛海三江科技发展有限公司 一种人脸识别方法及系统
CN110022454B (zh) 2018-01-10 2021-02-23 华为技术有限公司 一种在视频会议中识别身份的方法及相关设备
CN108600627A (zh) * 2018-04-25 2018-09-28 东莞职业技术学院 一种智慧校园视频处理系统
CN108734114A (zh) * 2018-05-02 2018-11-02 浙江工业大学 一种结合面部和声纹的宠物识别方法
CN110555918B (zh) * 2018-06-01 2022-04-26 杭州海康威视数字技术股份有限公司 考勤管理的方法和考勤管理设备
CN110634472B (zh) * 2018-06-21 2024-06-04 中兴通讯股份有限公司 一种语音识别方法、服务器及计算机可读存储介质
CN110647729A (zh) * 2018-06-27 2020-01-03 深圳联友科技有限公司 一种登录验证方法及系统
CN110875905A (zh) * 2018-08-31 2020-03-10 百度在线网络技术(北京)有限公司 账号管理方法、装置及存储介质
CN109450850B (zh) * 2018-09-26 2022-10-11 深圳壹账通智能科技有限公司 身份验证方法、装置、计算机设备和存储介质
CN108965341A (zh) * 2018-09-28 2018-12-07 北京芯盾时代科技有限公司 登录认证的方法、装置及系统
CN109542216B (zh) * 2018-10-11 2022-11-22 平安科技(深圳)有限公司 人机交互方法、系统、计算机设备及存储介质
CN111083278A (zh) * 2018-10-21 2020-04-28 内蒙古龙腾睿昊智能有限公司 基于智能手机监测呼吸、步伐及定位人员信息的采集识别
CN109560941A (zh) * 2018-12-12 2019-04-02 深圳市沃特沃德股份有限公司 会议记录方法、装置、智能终端及存储介质
CN109767335A (zh) * 2018-12-15 2019-05-17 深圳壹账通智能科技有限公司 双录质检方法、装置、计算机设备及存储介质
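CN109815806B (zh) * 2018-12-19 2024-06-28 平安科技(深圳)有限公司 Face recognition method and apparatus, computer device, and computer storage medium
CN109769099B (zh) * 2019-01-15 2021-01-22 三星电子(中国)研发中心 Method and apparatus for detecting an abnormal call participant
CN109829691B (zh) * 2019-01-16 2021-11-23 北京影谱科技股份有限公司 C/S clock-in method and apparatus based on location and deep-learning multi-biometric features
CN109658579A (zh) * 2019-02-28 2019-04-19 中新智擎科技有限公司 Access control method, system, device, and storage medium
CN110210935B (zh) * 2019-05-22 2022-05-17 未来(北京)黑科技有限公司 Security authentication method and apparatus, storage medium, and electronic apparatus
CN110472485A (zh) * 2019-07-03 2019-11-19 华为技术有限公司 Identity recognition method and apparatus
CN110349583A (zh) * 2019-07-15 2019-10-18 高磊 Game-based education method and system based on speech recognition
CN110599325A (zh) * 2019-08-27 2019-12-20 杭州深景数据技术有限公司 Method, apparatus, device, and storage medium for reading a notification letter
CN112446395B (zh) 2019-08-29 2023-07-25 杭州海康威视数字技术股份有限公司 Network camera, video surveillance system, and method
CN111124109B (zh) * 2019-11-25 2023-05-05 北京明略软件系统有限公司 Interaction mode selection method, smart terminal, device, and storage medium
CN110963382B (zh) * 2019-12-31 2022-03-15 界首市迅立达电梯有限公司 Voice-assistant-based elevator floor selection control system and method
CN111401218B (zh) * 2020-03-12 2023-05-26 上海虹点智能科技有限公司 Smart city monitoring method and system
CN111417018A (zh) * 2020-04-29 2020-07-14 苏州思必驰信息科技有限公司 Method and apparatus for registering and using a smart remote control for a smart video playback device
WO2021257000A1 (fr) * 2020-06-19 2021-12-23 National University Of Singapore Cross-modal speaker verification
CN111882739B (zh) * 2020-07-21 2022-05-17 中国工商银行股份有限公司 Access control verification method, access control apparatus, server, and system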
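CN112016452A (zh) * 2020-08-27 2020-12-01 四川卫宁软件有限公司 Medical behavior analysis method, analysis system thereof, and computer terminal
CN112214298B (zh) * 2020-09-30 2023-09-22 国网江苏省电力有限公司信息通信分公司 Dynamic priority scheduling method and system based on voiceprint recognition
CN112491844A (zh) * 2020-11-18 2021-03-12 西北大学 Voiceprint and face recognition verification system and method based on a trusted execution environment
CN112466057B (zh) * 2020-12-01 2022-07-29 上海旷日网络科技有限公司 Interactive authentication pickup system based on face recognition and speech recognition
CN112863513A (zh) * 2021-01-21 2021-05-28 中国南方电网有限责任公司超高压输电公司柳州局 Method for issuing control instructions through face and voice recognition combined with identity verification
CN113160826B (zh) * 2021-03-01 2022-09-02 特斯联科技集团有限公司 Face-recognition-based family member communication method and system
CN113329013A (zh) * 2021-05-28 2021-08-31 南京国网电瑞系统工程有限公司 Digital-certificate-based security encryption method and system for a power dispatching data network
CN113271587B (zh) * 2021-06-11 2023-12-26 北京白龙马云行科技有限公司 Internet of Things trusted authentication system for vehicles
CN113658357A (zh) * 2021-08-11 2021-11-16 四川长虹电器股份有限公司 Method for remotely controlling a smart door lock based on sound and image recognition
CN113806703B (zh) * 2021-08-26 2025-02-18 江苏苏商银行股份有限公司 Multi-dimensional identity recognition and authentication system and method
CN114710328A (zh) * 2022-03-18 2022-07-05 中国建设银行股份有限公司 Identity recognition processing method and apparatus
CN115830683A (zh) * 2022-12-08 2023-03-21 石家庄职业技术学院 Artificial-intelligence-based Internet big data processing system and method
CN115981184A (zh) * 2023-03-20 2023-04-18 太原重工股份有限公司 Remote emergency stop control system and method based on dual face and voice authentication
CN116416726B (zh) * 2023-04-10 2024-06-25 深圳智慧空间信息技术有限公司 High-security access control recognition method and system based on multi-feature verification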

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
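CN102708867A (zh) * 2012-05-30 2012-10-03 北京正鹰科技有限责任公司 Anti-recording-spoofing identity recognition method and system based on voiceprint and speech
CN103634118A (zh) * 2013-12-12 2014-03-12 山东神思电子技术股份有限公司 Proof-of-life authentication method based on ID card and composite biometric recognition
CN103841108A (zh) * 2014-03-12 2014-06-04 北京天诚盛业科技有限公司 User biometric authentication method and system
KR20140093459A (ko) * 2013-01-18 2014-07-28 한국전자통신연구원 Automatic interpretation method
WO2014117583A1 (fr) * 2013-01-29 2014-08-07 Tencent Technology (Shenzhen) Company Limited Method and apparatus for authenticating a user based on audio and video data
CN104834849A (zh) * 2015-04-14 2015-08-12 时代亿宝(北京)科技有限公司 Two-factor identity authentication method and system based on voiceprint recognition and face recognition
CN105426723A (zh) * 2015-11-20 2016-03-23 北京得意音通技术有限责任公司 Identity authentication method and system based on voiceprint recognition, face recognition, and synchronous liveness detection
CN106790054A (zh) * 2016-12-20 2017-05-31 四川长虹电器股份有限公司 Interactive authentication system and method based on face recognition and voiceprint recognition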

Cited By (74)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
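CN110751471A (zh) * 2018-07-06 2020-02-04 上海博泰悦臻网络技术服务有限公司 In-vehicle payment method based on voiceprint recognition, and cloud server
CN108694767A (zh) * 2018-07-13 2018-10-23 北京工业职业技术学院 Identity authentication apparatus and smart access control system
CN108846676B (zh) * 2018-08-02 2023-07-11 平安科技(深圳)有限公司 Biometric-assisted payment method, apparatus, computer device, and storage medium
CN108846676A (zh) * 2018-08-02 2018-11-20 平安科技(深圳)有限公司 Biometric-assisted payment method, apparatus, computer device, and storage medium
WO2020029496A1 (fr) * 2018-08-10 2020-02-13 珠海格力电器股份有限公司 Method and device for selective information distribution
CN109543377A (zh) * 2018-10-17 2019-03-29 深圳壹账通智能科技有限公司 Identity verification method, apparatus, computer device, and storage medium
CN111103966A (zh) * 2018-10-25 2020-05-05 安徽黑洞科技有限公司 Intelligent exhibit coordination control system
CN109842805A (zh) * 2019-01-04 2019-06-04 平安科技(深圳)有限公司 Video highlight generation method, apparatus, computer device, and storage medium
CN109842805B (zh) * 2019-01-04 2022-10-21 平安科技(深圳)有限公司 Video highlight generation method, apparatus, computer device, and storage medium
CN110074519A (zh) * 2019-04-10 2019-08-02 南京启诺信息技术有限公司 Language recognition wristband
CN111803955A (zh) * 2019-04-12 2020-10-23 奇酷互联网络科技(深圳)有限公司 Method and system for managing accounts through a wearable device, and storage apparatus
CN110163630A (zh) * 2019-04-15 2019-08-23 中国平安人寿保险股份有限公司 Product supervision method, apparatus, computer device, and storage medium
CN110163630B (zh) * 2019-04-15 2024-04-05 中国平安人寿保险股份有限公司 Product supervision method, apparatus, computer device, and storage medium
CN110287363A (zh) * 2019-05-22 2019-09-27 深圳壹账通智能科技有限公司 Deep-learning-based resource push method, apparatus, device, and storage medium
CN110309570B (zh) * 2019-06-21 2022-11-04 济南大学 Multimodal simulated experiment container with cognitive capability, and method
CN110309570A (zh) * 2019-06-21 2019-10-08 济南大学 Multimodal simulated experiment container with cognitive capability, and method
CN110427468A (zh) * 2019-07-10 2019-11-08 深圳市一恒科电子科技有限公司 Learning method based on a children's cloud service, and learning machine
CN110363278A (zh) * 2019-07-23 2019-10-22 广东小天才科技有限公司 Parent-child interaction method, robot, server, and parent-child interaction system
CN110472394A (zh) * 2019-07-24 2019-11-19 天脉聚源(杭州)传媒科技有限公司 Reserved information processing method, system, apparatus, and storage medium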
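CN110442033A (zh) * 2019-07-30 2019-11-12 恒大智慧科技有限公司 Permission control method and apparatus for household devices, computer device, and storage medium
CN110807630A (zh) * 2019-09-19 2020-02-18 平安科技(深圳)有限公司 Face-recognition-based payment method, apparatus, computer device, and storage medium
CN111128144A (zh) * 2019-10-16 2020-05-08 国网浙江省电力有限公司金华供电公司 Voice-based power grid dispatching system and method
CN111063358A (zh) * 2019-12-18 2020-04-24 浙江中辰城市应急服务管理有限公司 Early fire warning and escape guidance system with living-body recognition function
CN111079443A (zh) * 2019-12-26 2020-04-28 上海传英信息技术有限公司 Human-computer interaction method and apparatus for video calls
CN111368737A (zh) * 2020-03-04 2020-07-03 开放智能机器(上海)有限公司 System and method for automatically analyzing employee work behavior
CN111341464A (zh) * 2020-03-25 2020-06-26 北京金和网络股份有限公司 Epidemic information collection and analysis method and system
CN111767805A (zh) * 2020-06-10 2020-10-13 云知声智能科技股份有限公司 Method and system for automatic cleaning and labeling of multimodal data
CN114038087B (zh) * 2020-07-20 2024-03-15 阜阳万瑞斯电子锁业有限公司 Unlocking system and method for electronic lock speech recognition
CN114038087A (zh) * 2020-07-20 2022-02-11 阜阳万瑞斯电子锁业有限公司 Unlocking system and method for electronic lock speech recognition
CN112000939A (zh) * 2020-08-04 2020-11-27 叶兵 Remote legal service system and method for lawyers based on digital certificate authentication
CN112000939B (zh) * 2020-08-04 2023-10-27 叶兵 Remote legal service system and method for lawyers based on digital certificate authentication
CN112202912A (zh) * 2020-10-12 2021-01-08 安徽兴安电气设备股份有限公司 Remote automatic monitoring system for secondary water supply
CN112202912B (zh) * 2020-10-12 2022-08-09 安徽兴安电气设备股份有限公司 Remote automatic monitoring system for secondary water supply
CN112185363A (zh) * 2020-10-21 2021-01-05 北京猿力未来科技有限公司 Audio processing method and apparatus
CN112185363B (zh) * 2020-10-21 2024-02-13 北京猿力未来科技有限公司 Audio processing method and apparatus
CN112069484A (zh) * 2020-11-10 2020-12-11 中国科学院自动化研究所 Multimodal interactive information collection method and system
CN112235682A (zh) * 2020-11-17 2021-01-15 歌尔科技有限公司 Earphone call privacy method and call apparatus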
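CN112651610A (zh) * 2020-12-17 2021-04-13 韦福瑞 Inspection method and system for assessing adaptability to a simulated environment based on sound judgment and recognition
CN112651610B (zh) * 2020-12-17 2024-02-02 韦福瑞 Inspection method and system for assessing adaptability to a simulated environment based on sound judgment and recognition
CN112819061B (zh) * 2021-01-27 2024-05-10 北京小米移动软件有限公司 Password information recognition method, apparatus, device, and storage medium
CN112819061A (zh) * 2021-01-27 2021-05-18 北京小米移动软件有限公司 Password information recognition method, apparatus, device, and storage medium
CN113112664A (zh) * 2021-02-23 2021-07-13 广州李博士科技研究有限公司 Vertical face recognition access control device
CN114979543A (zh) * 2021-02-24 2022-08-30 中国联合网络通信集团有限公司 Smart home control method and apparatus
CN113032758A (zh) * 2021-03-26 2021-06-25 平安银行股份有限公司 Identity recognition method, apparatus, device, and storage medium for a video question-and-answer process
CN113034110A (zh) * 2021-03-30 2021-06-25 泰康保险集团股份有限公司 Video-review-based service processing method, system, medium, and electronic device
CN113034110B (zh) * 2021-03-30 2023-12-22 泰康保险集团股份有限公司 Video-review-based service processing method, system, medium, and electronic device
CN113221672A (zh) * 2021-04-22 2021-08-06 国网安徽省电力有限公司 Face recognition device for a power instrument warehouse
CN113127827B (zh) * 2021-05-08 2024-03-08 上海日羲科技有限公司 User instruction processing method based on an AI system
CN113239041A (zh) * 2021-05-13 2021-08-10 大连交通大学 Collection system and method for computer big data processing
CN113343211A (zh) * 2021-06-24 2021-09-03 工银科技有限公司 Data processing method, processing system, electronic device, and storage medium
CN113469012A (zh) * 2021-06-28 2021-10-01 广州云从鼎望科技有限公司 Method, system, medium, and apparatus for user face-scan verification
CN113469012B (zh) * 2021-06-28 2024-05-03 广州云从鼎望科技有限公司 Method, system, medium, and apparatus for user face-scan verification
CN114007043A (zh) * 2021-10-27 2022-02-01 北京鼎普科技股份有限公司 Video decoding method, apparatus, and system based on fingerprint features of video data
CN114007043B (zh) * 2021-10-27 2023-09-26 北京鼎普科技股份有限公司 Video decoding method, apparatus, and system based on fingerprint features of video data
CN113890736A (zh) * 2021-11-22 2022-01-04 国网四川省电力公司成都供电公司 Mobile terminal identity authentication method and system based on the Chinese national cryptographic SM9 algorithm
CN113890736B (zh) * 2021-11-22 2023-02-28 国网四川省电力公司成都供电公司 Mobile terminal identity authentication method and system based on the Chinese national cryptographic SM9 algorithm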
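CN114168722A (zh) * 2021-11-23 2022-03-11 安徽经邦软件技术有限公司 Financial question-answering robot based on artificial intelligence technology
CN114187630A (zh) * 2021-11-29 2022-03-15 华人运通(上海)云计算科技有限公司 Facial feature comparison method and system
CN114511941A (zh) * 2022-02-16 2022-05-17 中国工商银行股份有限公司 Anti-cheating sign-in method, apparatus, device, medium, and program product
CN114580034A (zh) * 2022-03-10 2022-06-03 合肥工业大学 FPGA-based RO PUF dual identity authentication system and control method thereof
CN114842513A (zh) * 2022-04-02 2022-08-02 湖南麓木和择科技有限公司 Internet-based data collection system and information collection apparatus
CN114876321A (zh) * 2022-05-23 2022-08-09 江苏德普尔门控科技有限公司 Intelligent automatic-sensing entry door with a home system
CN115189911A (zh) * 2022-05-30 2022-10-14 平安科技(深圳)有限公司 Method, apparatus, device, and storage medium for generating face-to-face signing documents
CN115412284A (zh) * 2022-07-04 2022-11-29 国网浙江省电力有限公司杭州市临安区供电公司 Secure transmission method for on-site power fault information
CN115311713A (zh) * 2022-08-05 2022-11-08 南京甄视智能科技有限公司 Method and system for terminal extraction of deep-learning-network facial feature values and efficient in-network reuse
CN115641105B (zh) * 2022-12-01 2023-08-08 中网道科技集团股份有限公司 Data processing method for monitoring leave-of-absence outings of community correction subjects
CN115641105A (zh) * 2022-12-01 2023-01-24 中网道科技集团股份有限公司 Data processing method for monitoring leave-of-absence outings of community correction subjects
CN116259095A (zh) * 2023-03-31 2023-06-13 南京审计大学 Computer-based recognition system and method
CN116189680B (zh) * 2023-05-04 2023-09-26 北京水晶石数字科技股份有限公司 Voice wake-up method for smart exhibition and performance equipment
CN116189680A (zh) * 2023-05-04 2023-05-30 北京水晶石数字科技股份有限公司 Voice wake-up method for smart exhibition and performance equipment
CN117273747A (zh) * 2023-09-28 2023-12-22 广州佳新智能科技有限公司 Payment method, apparatus, storage medium, and device based on face image recognition
CN117273747B (zh) * 2023-09-28 2024-04-19 广州佳新智能科技有限公司 Payment method, apparatus, storage medium, and device based on face image recognition
CN117376854A (zh) * 2023-10-30 2024-01-09 深圳中网讯通技术有限公司 Method, apparatus, device, and storage medium for generating multimedia message content
CN120410783A (zh) * 2025-07-03 2025-08-01 遇见美好文旅科技集团有限公司 Full-process ERP business management method and system for middle-aged and elderly tourism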

Also Published As

Publication number Publication date
CN106790054A (zh) 2017-05-31

Similar Documents

Publication Publication Date Title
WO2018113526A1 (fr) Interactive authentication system and method based on voiceprint recognition and face recognition
CN104834849B (zh) Two-factor identity authentication method and system based on voiceprint recognition and face recognition
US8812319B2 (en) Dynamic pass phrase security system (DPSS)
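CN106782572B (zh) Voice password authentication method and system
CN108075892B (zh) Speech processing method, apparatus, and device
CN106850648B (zh) Identity verification method, client, and service platform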
US10276168B2 (en) Voiceprint verification method and device
US9979721B2 (en) Method, server, client and system for verifying verification codes
US11665153B2 (en) Voice biometric authentication in a virtual assistant
WO2017197953A1 (fr) Voiceprint-based identity recognition method and device
WO2016123900A1 (fr) Dynamic-password voice-based identity authentication system and method with self-learning function
CN110169014A (zh) Apparatus, method, and computer program product for authentication
US9721079B2 (en) Image authenticity verification using speech
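CN103841108A (zh) User biometric authentication method and system
CN106373575A (zh) User voiceprint model construction method, apparatus, and system
CN103714282A (zh) Interactive biometric-based identification method
EP3001343B1 (fr) System and method of enhanced identity recognition incorporating random actions
CN103177238A (zh) Terminal and user identification method
CN110110513A (zh) Identity authentication method, apparatus, and storage medium based on face and voiceprint
CN109785834B (zh) Verification-code-based voice data sample collection system and method thereof
CN107451185B (zh) Recording method, read-aloud system, computer-readable storage medium, and computer apparatus
CN109727342A (zh) Recognition method and apparatus for an access control system, access control system, and storage medium
CN110516426A (zh) Identity authentication method, authentication terminal, apparatus, and readable storage medium
CN112133314A (zh) Method, apparatus, device, and storage medium for setting and verifying a voiceprint password
CN111611569A (zh) Face and voiceprint review terminal and identity authentication method thereof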

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17884689

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 17884689

Country of ref document: EP

Kind code of ref document: A1