WO2017113370A1 - Method and apparatus for voiceprint detection - Google Patents

Method and apparatus for voiceprint detection

Info

Publication number
WO2017113370A1
WO2017113370A1 PCT/CN2015/100286 CN2015100286W WO2017113370A1 WO 2017113370 A1 WO2017113370 A1 WO 2017113370A1 CN 2015100286 W CN2015100286 W CN 2015100286W WO 2017113370 A1 WO2017113370 A1 WO 2017113370A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal portion
preset
feature
characteristic
audio signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2015/100286
Other languages
English (en)
Chinese (zh)
Inventor
范姝男
郜文美
魏卓
秦超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to PCT/CN2015/100286 (WO2017113370A1)
Priority to CN201580079562.2A (CN107533415B)
Publication of WO2017113370A1
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit

Definitions

  • The present invention relates to the field of electronic technologies, and in particular to a method and apparatus for voiceprint detection.
  • Terminal devices have become an indispensable part of people's daily lives.
  • Most terminal devices provide a password-protected unlock function.
  • When the terminal device is in the locked state, the user can unlock it only by entering the correct password.
  • Because voice unlocking offers higher security than other unlocking methods, it has become widely used.
  • The terminal device or application software provides a voice unlocking function that verifies the user by voice and then unlocks the terminal device or provides services.
  • Voice unlocking mainly authenticates the user by voiceprint: at unlock time, the sound signal input by the user is compared with a preset sound signal, and if the voiceprint input by the user matches the preset voiceprint, the speaker is determined to be the same person and the device is unlocked.
  • However, a recording attack cannot be prevented: if the text spoken by the user is recorded and the recording is played back, the voiceprint unlock still succeeds.
  • Voiceprint unlocking therefore carries a security risk, and its security is not high.
  • The invention provides a method and a device for voiceprint detection that improve the security of voiceprint unlocking.
  • The method for voiceprint detection comprises: a terminal detects whether a sound signal is present; if the terminal detects a sound signal, the terminal receives it; the terminal extracts an audio signal portion and a judgment signal portion of the sound signal; the voiceprint feature of the audio signal portion is compared with a preset voiceprint feature, and the expiratory airflow characteristic of the judgment signal portion is compared with the expiratory airflow characteristic of the audio signal portion; when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, and the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold, the voiceprint detection result is determined to be successful.
  • When the terminal recognizes the sound signal, the sound signal is divided into an audio signal portion and a judgment signal portion. This realizes double recognition of the sound signal, effectively defeats the attack in which a user blows air while playing a recording, and improves the security of voiceprint unlocking.
  • Expiratory airflow characteristics in the judgment signal portion that are greater than a preset airflow threshold are received; the expiratory airflow characteristics are quantized; the quantized expiratory airflow characteristics are compared with the expiratory airflow characteristics of the text corresponding to the audio signal portion; if the matching degree between the quantized expiratory airflow characteristics and the expiratory airflow characteristics of the audio signal portion exceeds a preset threshold, it is determined that the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds the preset threshold.
  • The expiratory airflow characteristic is compared with a preset airflow threshold: if it is greater than the preset airflow threshold, it is quantized to 1; otherwise, it is quantized to 0. If at least one of the following two conditions holds, the matching degree between the quantized expiratory airflow characteristic and the expiratory airflow characteristic of the audio signal portion exceeds the preset threshold: the expiratory airflow characteristic is quantized to 1 and the text corresponding to the audio signal portion is an aspirated sound; the expiratory airflow characteristic is quantized to 0 and the text corresponding to the audio signal portion is a non-aspirated sound.
  • The expiratory airflow characteristic is thus quantized by comparing it with the preset airflow threshold.
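The quantization and comparison described above can be sketched as follows. This is only an illustration under assumptions that are not in the source: the function names, the idea that one airflow level is measured per spoken unit, and the aggregation of per-unit agreement into a ratio are all hypothetical.

```python
from typing import List


def quantize_airflow(levels: List[float], airflow_threshold: float) -> List[int]:
    """Quantize each airflow level to 1 if it exceeds the threshold, else 0."""
    return [1 if level > airflow_threshold else 0 for level in levels]


def airflow_match_ratio(quantized: List[int], aspirated: List[bool]) -> float:
    """Return the fraction of units where airflow and expected aspiration agree.

    A unit agrees when it is quantized to 1 and its text is an aspirated sound,
    or quantized to 0 and its text is a non-aspirated sound.
    """
    if not quantized or len(quantized) != len(aspirated):
        raise ValueError("inputs must be non-empty and of equal length")
    agreeing = sum(1 for q, a in zip(quantized, aspirated) if (q == 1) == a)
    return agreeing / len(quantized)
```

The returned ratio would then be compared with the preset threshold; how the aspirated flags are derived from the recognized text is left open here.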
  • the range it is judged that the voiceprint detection result is successful.
  • The angle of the pointing direction of the judgment signal portion and the angle of the pointing direction of the audio signal portion are each compared with a preset pointing angle threshold; if both angles are smaller than the preset pointing angle threshold, the pointing direction characteristic of the judgment signal portion and the pointing direction characteristic of the audio signal portion are within the preset range.
  • Comparing each of the two angles with the preset pointing angle threshold thus determines whether the pointing direction characteristics of the judgment signal portion and the audio signal portion are within the preset range.
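A minimal sketch of this pointing-direction check is shown below. The function name, the use of degrees, and taking the absolute value of each angle (so that deviations to either side count equally) are assumptions made for illustration.

```python
def pointing_within_range(judgment_angle_deg: float,
                          audio_angle_deg: float,
                          angle_threshold_deg: float) -> bool:
    """Return True when both pointing angles are smaller than the threshold."""
    return (abs(judgment_angle_deg) < angle_threshold_deg
            and abs(audio_angle_deg) < angle_threshold_deg)
```

For example, with a hypothetical threshold of 30 degrees, angles of 10 and 15 degrees pass, while angles of 10 and 45 degrees do not.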
  • The sensed temperature characteristic of the judgment signal portion is compared with a preset temperature threshold; when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold, the pointing direction characteristics of the judgment signal portion and the audio signal portion are within the preset range, and the sensed temperature characteristic of the judgment signal portion is greater than or equal to the preset temperature threshold, the voiceprint detection result is determined to be successful.
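The combined condition can be written as a single conjunction. The sketch below is illustrative only: the `LivenessEvidence` type, its field names, the use of one shared matching threshold, and degrees Celsius for temperature are assumptions, not details given in the text.

```python
from dataclasses import dataclass


@dataclass
class LivenessEvidence:
    voiceprint_match: float      # matching degree in [0, 1]
    airflow_match: float         # matching degree in [0, 1]
    direction_in_range: bool     # result of the pointing-direction check
    sensed_temperature_c: float  # temperature sensed while the user speaks


def detection_successful(evidence: LivenessEvidence,
                         match_threshold: float,
                         temperature_threshold_c: float) -> bool:
    """Succeed only if all four conditions hold at the same time."""
    return (evidence.voiceprint_match > match_threshold
            and evidence.airflow_match > match_threshold
            and evidence.direction_in_range
            and evidence.sensed_temperature_c >= temperature_threshold_c)
```

Note that the two matching degrees must strictly exceed their threshold, whereas the temperature only needs to be greater than or equal to its threshold, as stated above.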
  • The method further includes: the terminal separates the sound signal into the audio signal portion and the judgment signal portion. Specifically, the terminal filters the sound signal with a filter of a first preset frequency to obtain the audio signal portion, and filters the sound signal with a filter of a second preset frequency to obtain the judgment signal portion, where the filter of the first preset frequency is a high-pass filter and the filter of the second preset frequency is a low-pass filter.
  • Passing the sound signal through filters of preset frequencies thus separates it into an audio signal portion and a judgment signal portion.
  • The voiceprint feature of the audio signal portion includes at least one of a voiceprint waveform and a signal frequency. The comparison covers at least one of the following two cases: the voiceprint waveform of the audio signal portion is compared with the characteristic waveform of a preset voiceprint sample; the signal frequency of the audio signal portion is compared with the characteristic frequency of the preset voiceprint sample. If the matching degree between the voiceprint waveform of the audio signal portion and the characteristic waveform of the preset voiceprint sample exceeds a preset threshold, and/or the matching degree between the signal frequency of the audio signal portion and the characteristic frequency of the preset voiceprint sample exceeds a preset threshold, then the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds the preset threshold.
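The text does not say how a matching degree is computed. As one possible sketch, the code below uses cosine similarity for waveforms and a min/max ratio for frequencies; both measures, all function names, and the `require_all` switch that models the "and/or" are assumptions chosen for illustration.

```python
import math
from typing import Optional, Sequence


def waveform_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length waveforms."""
    if not a or len(a) != len(b):
        raise ValueError("waveforms must be non-empty and of equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def frequency_similarity(freq_hz: float, reference_hz: float) -> float:
    """Ratio of the smaller to the larger frequency, in (0, 1]."""
    if freq_hz <= 0 or reference_hz <= 0:
        raise ValueError("frequencies must be positive")
    return min(freq_hz, reference_hz) / max(freq_hz, reference_hz)


def voiceprint_matches(waveform_score: Optional[float],
                       frequency_score: Optional[float],
                       threshold: float,
                       require_all: bool = False) -> bool:
    """Apply the threshold to whichever scores are available (the "and/or")."""
    scores = [s for s in (waveform_score, frequency_score) if s is not None]
    if not scores:
        raise ValueError("at least one score is required")
    passed = [s > threshold for s in scores]
    return all(passed) if require_all else any(passed)
```

Setting `require_all=True` gives the stricter "and" reading, in which every available feature must exceed the threshold.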
  • The method further includes: the terminal collects a sound signal uttered by the user, performs feature analysis on the sound signal, obtains the preset voiceprint feature, and stores it. Collecting the user's sound signal in advance and storing the analyzed result as the preset voiceprint feature ensures the accuracy of the preset voiceprint feature, which improves the accuracy of matching the voiceprint feature of the audio signal portion against the preset voiceprint feature and thus the security of voiceprint unlocking.
  • The method further includes: the terminal acquires the characteristic of the airflow exhaled when the user outputs the sound corresponding to the sound signal.
  • Acquiring the expiratory airflow characteristic of the judgment signal portion makes it possible to compare it with the expiratory airflow characteristic of the audio signal portion.
  • The method further includes: the terminal acquires the direction in which the user outputs the sound corresponding to the sound signal.
  • Acquiring the pointing direction characteristic of the judgment signal portion makes it possible to check whether the pointing direction characteristics of the judgment signal portion and the audio signal portion are within the preset range.
  • The method further includes: the terminal acquires the temperature when the user outputs the sound corresponding to the sound signal.
  • Acquiring the sensed temperature characteristic of the judgment signal portion makes it possible to compare it with the preset temperature threshold.
  • The terminal includes: a detecting module, configured to detect whether a sound signal is present; a receiving module, configured to receive the sound signal; an extracting module, configured to extract an audio signal portion and a judgment signal portion of the sound signal; a matching module, configured to compare the voiceprint feature of the audio signal portion with the preset voiceprint feature and to compare the expiratory airflow characteristic of the judgment signal portion with the expiratory airflow characteristic of the audio signal portion, where the expiratory airflow characteristic is a characteristic of the airflow exhaled when the user outputs the sound corresponding to the sound signal; and a determining module, configured to determine that the voiceprint detection result is successful when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold and the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold.
  • The sound signal is divided into the audio signal portion and the judgment signal portion, which realizes double recognition of the sound signal, effectively defeats the attack in which a user blows air while playing a recording, and improves the security of voiceprint unlocking.
  • The terminal includes a microphone and a processor. The microphone is configured to detect whether there is a sound signal and, if a sound signal is detected, to receive it. The processor is configured to extract an audio signal portion and a judgment signal portion of the sound signal; to compare the voiceprint feature of the audio signal portion with the preset voiceprint feature; to compare the expiratory airflow characteristic of the judgment signal portion with the expiratory airflow characteristic of the audio signal portion, where the expiratory airflow characteristic is a characteristic of the airflow exhaled when the user outputs the sound corresponding to the sound signal; and to determine that the voiceprint detection result is successful when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold and the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold.
  • The sound signal is divided into the audio signal portion and the judgment signal portion, which realizes double recognition of the sound signal, effectively defeats the attack in which a user blows air while playing a recording, and improves the security of voiceprint unlocking.
  • The present invention provides a non-transitory computer readable storage medium storing computer instructions for causing an apparatus for controlling a cache to perform the operations of the above method.
  • In the method and device for voiceprint detection, a terminal detects whether there is a sound signal; if the terminal detects a sound signal, the terminal receives it; the terminal extracts the audio signal portion and the judgment signal portion of the sound signal; the voiceprint feature of the audio signal portion is compared with the preset voiceprint feature, and the expiratory airflow characteristic of the judgment signal portion is compared with the expiratory airflow characteristic of the audio signal portion; when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds the preset threshold, and the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold, the voiceprint detection result is determined to be successful.
  • When the terminal recognizes the sound signal, the sound signal is divided into the audio signal portion and the judgment signal portion, which realizes double recognition of the sound signal, effectively defeats the attack in which a user blows air while playing a recording, and improves the security of voiceprint unlocking.
  • FIG. 1A is a schematic diagram of a scene for unlocking a voiceprint according to an embodiment of the present invention
  • FIG. 1B is a schematic diagram of a scenario for setting a voiceprint password according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for detecting voiceprint according to Embodiment 1 of the present invention
  • FIG. 3A is a schematic diagram of quantification of a blowing signal according to Embodiment 1 of the present invention.
  • FIG. 3B is a schematic diagram of quantification of a blowing signal according to Embodiment 2 of the present invention.
  • FIG. 4 is a schematic diagram of a process of voiceprint detection according to Embodiment 1 of the present invention.
  • FIG. 5 is a flowchart of a method for voiceprint detection according to Embodiment 2 of the present invention.
  • FIG. 6 is a schematic diagram of an angle of a pointing direction of a sound signal according to Embodiment 1 of the present invention.
  • FIG. 7 is a flowchart of a method for voiceprint detection according to Embodiment 3 of the present invention.
  • FIG. 8 is a flowchart of a method for voiceprint detection according to Embodiment 4 of the present invention.
  • FIG. 9 is a schematic structural diagram of a terminal according to Embodiment 1 of the present invention.
  • FIG. 10 is a schematic structural diagram of a terminal according to Embodiment 2 of the present invention.
  • FIG. 11 is a schematic structural diagram of a terminal according to Embodiment 3 of the present invention.
  • FIG. 12 is a schematic structural diagram of a terminal according to Embodiment 4 of the present invention.
  • FIG. 13 is a schematic structural diagram of a device for detecting a voiceprint according to Embodiment 1 of the present invention.
  • FIG. 1A is a schematic diagram of a scene of a voiceprint unlocking according to an embodiment of the present invention.
  • The terminal device or the application software provides a voiceprint unlocking function.
  • The user speaks the corresponding voiceprint password, and the terminal verifies the user by voiceprint and then unlocks the device or provides a service.
  • Voiceprint recognition generally includes two types: 1.
  • The text content recognized during voiceprint recognition is preset: at each unlock, the user repeats the same user-preset text (for example, "open sesame"); or
  • the electronic device randomly generates a text or digit password, and the user reads out the prompted random password, which ensures the security of the voiceprint recognition; 2.
  • FIG. 1B is a schematic diagram of a scene of voiceprint password setting according to an embodiment of the present invention. As shown in FIG. 1B, a user can set a voiceprint password, which can be defined in advance. For example, after the user speaks the voiceprint password "open sesame", the terminal records the user's voiceprint password through the microphone. When the user later logs in to an account with the voiceprint password, the terminal decides whether to allow the login by verifying the voiceprint password input by the user.
  • FIG. 2 is a flowchart of a method for voiceprint detection according to Embodiment 1 of the present invention. As shown in FIG. 2, the method provided by the embodiment of the present invention includes:
  • S201: The terminal detects whether there is a sound signal.
  • The terminal has the function of receiving voice and may include, but is not limited to, a mobile communication device such as a mobile phone or a tablet computer.
  • When the user needs to be verified for unlocking, the user sends a sound signal (speech signal) to the terminal.
  • The voice signal sent by the user may be the user speaking the preset voiceprint password "open sesame", or the user calling the name of a voice assistant, such as "small ice" or "hello google"; it may also be the user reading aloud a text or digit password randomly generated by the terminal, or the user saying an arbitrary passage.
  • When the terminal is in the to-be-unlocked state, it detects whether there is a voice signal sent by the user. If the terminal detects such a voice signal in this state, that is, when a voiceprint recognition signal is detected, the voice signal sent by the user is recognized.
  • The terminal is not always in the live voiceprint recognition mode; rather, when the terminal detects the voiceprint recognition signal, it enters the live voiceprint recognition mode and then recognizes the voice signal sent by the user.
  • The terminal is in a to-be-unlocked (standby) state.
  • When voiceprint recognition is required, for example when the terminal screen is to be unlocked, application software is to be unlocked, the user's mouth is close to the microphone, or the user is to be identified, the live voiceprint recognition mode is entered.
  • Whether the mouth is close to the microphone can be judged by a sensor such as a proximity sensor, an ultrasonic sensor, or an infrared sensor, after which the live voiceprint recognition mode is entered.
  • The live voiceprint recognition mode requires the terminal to turn on the corresponding modules to analyze and process the received voiceprint recognition signal, for example any one or a combination of a recording module, a voiceprint recognition module, a thermometer module, a light sensor module, a directional monitor module, an ultrasonic sensor, and an infrared sensor.
  • The terminal in the embodiment of the present invention may also remain in the live voiceprint recognition mode, so that the voice signal sent by the user can be recognized as soon as the voiceprint recognition signal is detected.
  • The embodiment of the present invention mainly describes entering the live voiceprint recognition mode when the terminal detects the sound signal, but is not limited thereto.
  • A voiceprint is the sound wave spectrum of a sound signal (speech signal) as displayed by an electroacoustic instrument. Because people have different vocal habits, their vocal airflow differs, which produces differences in sound quality and tone, so each person's voiceprint is different.
  • Voiceprint recognition is a type of biometric recognition used to confirm whether a certain utterance was spoken by a designated person.
  • The voiceprint recognition signal is a sound signal (speech signal) detected when the terminal is in the to-be-unlocked state. It contains the user's voiceprint, and from it the terminal can recognize whether the voiceprint is that of the specified user, confirming whether the detected voice signal was spoken by the specified user.
  • When the terminal detects a sound signal, it can receive the sound signal through the microphone.
  • The terminal receives the sound signal and stores the received sound signal.
  • The terminal may always be in the listening state and buffer the received sound signal, so that when the terminal enters the live voiceprint recognition mode, the complete voiceprint recognition signal is available for analysis and processing.
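One common way to buffer audio while listening is a fixed-size ring buffer that keeps only the most recent frames. The sketch below uses Python's `collections.deque` for this; the `SoundBuffer` class, its method names, and the fixed capacity are assumptions for illustration, since the text does not specify how the buffering is done.

```python
from collections import deque
from typing import Deque, List


class SoundBuffer:
    """Keep the most recent audio frames while the terminal is listening."""

    def __init__(self, max_frames: int) -> None:
        # A deque with maxlen silently discards the oldest frame when full.
        self._frames: Deque[bytes] = deque(maxlen=max_frames)

    def push(self, frame: bytes) -> None:
        """Store one newly received frame."""
        self._frames.append(frame)

    def snapshot(self) -> List[bytes]:
        """Return the buffered frames, oldest first, for later analysis."""
        return list(self._frames)
```

When the live voiceprint recognition mode is entered, `snapshot()` would hand the buffered frames to the analysis step, so that speech uttered just before the mode switch is not lost.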
  • S203: The terminal extracts an audio signal portion and a judgment signal portion of the sound signal.
  • The sound signal may include the audio signal of the user's speaking voice together with the sensed temperature when the user speaks, the direction of the sound signal, or the signal exhaled when the user speaks.
  • The terminal may divide the sound signal into an audio signal portion and a judgment signal portion, where the audio signal portion may include the voiceprint feature of the audio signal in the sound signal, and the judgment signal portion may include at least one of the sensed temperature when the user speaks, the direction of the sound signal, and the signal exhaled when the user speaks. For example, the terminal can obtain the sensed temperature associated with the sound signal; the terminal can also obtain the pointing direction of the sound signal through a microphone array; and the terminal can also obtain the signal exhaled when the user speaks through a filter of a preset frequency (a low-pass filter).
  • The terminal compares the voiceprint feature of the audio signal portion with the preset voiceprint feature and determines whether they match.
  • Before the terminal enters the standby state, the user may set up live voiceprint recognition in the terminal. This includes receiving a voice signal preset by the user: for example, given the text "open sesame", the user reads the preset text and the terminal records the user's voice signal. The voice signal includes the audio signal of the user reading the preset text; that audio signal has a voiceprint recognition feature, which is used as the preset voiceprint feature.
  • The voiceprint feature may include at least one of a voiceprint waveform of the audio signal and a signal frequency of the audio signal. The voiceprint feature of the audio signal portion may be compared with the preset voiceprint feature in at least one of the following two ways:
  • the voiceprint waveform of the audio signal portion is compared with the characteristic waveform of the preset voiceprint sample;
  • the signal frequency of the audio signal portion is compared with the characteristic frequency of the preset voiceprint sample.
  • S205: The expiratory airflow characteristic of the judgment signal portion is compared with the expiratory airflow characteristic of the audio signal portion.
  • The expiratory airflow characteristic is a characteristic of the airflow exhaled when the user outputs the sound corresponding to the sound signal.
  • The terminal captures the microphone input, that is, the sound signal received by the microphone, using a recorder or the like.
  • When a person produces sound, the exhaled airflow must push the glottis open. The Bernoulli effect then draws the glottis closed again, and the pressure below the glottis builds up enough to open it once more.
  • This repeated opening and closing forms a periodic vibration, so airflow is exhaled whenever sound is produced. This airflow is referred to herein as a blowing signal; that is, the blowing signal is the expiratory airflow characteristic corresponding to the sound output by the user.
  • The microphone receives the effective airflow as a blow signal. The audio in the sound signal lies at about 300-3000 hertz (Hz), whereas the sound of air blowing onto the microphone is mainly a low-frequency signal. The high-frequency components that do not come from blowing can therefore be filtered out by low-pass filtering to obtain the blow signal, which separates the audio signal from the blow signal.
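The separation can be illustrated with a very simple filter. The sketch below uses a first-order (one-pole) low-pass filter to estimate the blow signal and treats the remainder as the audio signal; the filter type, the function names, and any particular cutoff value (for example 100 Hz, chosen to sit below the 300 Hz lower edge of the audio band) are assumptions, since the text only says that low-pass filtering is used.

```python
import math
from typing import List, Sequence, Tuple


def split_blow_and_audio(samples: Sequence[float],
                         sample_rate_hz: float,
                         cutoff_hz: float) -> Tuple[List[float], List[float]]:
    """Split samples into a low-frequency blow part and the remaining audio part."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    blow: List[float] = []
    previous = 0.0
    for x in samples:
        # One-pole low-pass: move a fraction alpha toward the new sample.
        previous = previous + alpha * (x - previous)
        blow.append(previous)
    # The complementary high-pass output is the input minus the low-pass output.
    audio = [x - low for x, low in zip(samples, blow)]
    return blow, audio


def rms(values: Sequence[float]) -> float:
    """Root-mean-square level of a non-empty sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))
```

With a 16 kHz sample rate and a 100 Hz cutoff, a 20 Hz tone passes almost entirely into the blow output, while a 1000 Hz tone is strongly attenuated there and ends up in the audio output. A real terminal would likely use a sharper filter than this first-order one.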
  • When the terminal detects that there is a sound signal and the extracted judgment signal portion includes a blow signal, the terminal converts the audio signal into the corresponding text, determines for each character or word of the text whether its expiratory airflow characteristic is aspirated or unaspirated, and compares the blow signal with the expiratory airflow characteristic of the audio signal portion to determine whether the user's blow signal matches the audio signal. For example, if a certain word is pronounced with aspiration in the preset voiceprint sample, but no aspiration is detected when that word is pronounced during voiceprint verification, it is determined that the user's blow signal does not match the audio signal.
  • The terminal may learn the expiratory airflow characteristics of each user's audio signal from at least one of the user's daily calls and the user's use of the voice assistant, according to the user's vocal habits. For example, one user may blow more strongly when speaking a particular character or word, while other users blow less on the same word. Learning these habits improves the accuracy of the user's expiratory airflow characteristics.
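The text does not describe how such learning is carried out. One very simple possibility is to keep a running average of the observed airflow level for each spoken unit, as sketched below; the `AirflowProfile` class, its method names, and the use of a plain mean are all assumptions for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Optional


class AirflowProfile:
    """Per-user record of the average airflow level observed for each spoken unit."""

    def __init__(self) -> None:
        self._total: DefaultDict[str, float] = defaultdict(float)
        self._count: DefaultDict[str, int] = defaultdict(int)

    def observe(self, unit: str, airflow_level: float) -> None:
        """Record one airflow measurement for a character or word."""
        self._total[unit] += airflow_level
        self._count[unit] += 1

    def expected(self, unit: str) -> Optional[float]:
        """Return the mean airflow level for the unit, or None if never observed."""
        if self._count.get(unit, 0) == 0:
            return None
        return self._total[unit] / self._count[unit]
```

During verification, the expected level for each unit could replace a fixed, user-independent assumption about how strongly that unit is aspirated.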
  • the terminal compares the voiceprint feature of the audio signal portion with the preset voiceprint feature, determines that the voiceprint feature of the audio signal portion matches the preset voiceprint feature, and the terminal determines the expiratory airflow characteristic of the signal portion and The characteristics of the expiratory airflow in the audio signal portion are compared, and the characteristics of the expiratory airflow of the signal portion are matched with the characteristics of the expiratory airflow of the audio feature portion.
  • the voiceprint detection result is successful.
  • the terminal is unlocked, and the user can be at the terminal. Complete the corresponding operations, such as unlocking the phone, logging in to WeChat, and so on.
  • the terminal may first determine whether the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, if the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature does not exceed The preset threshold value does not match the voiceprint feature of the audio signal portion and the preset voiceprint feature, and the terminal determines that the voiceprint detection fails, and the terminal can directly exit the voiceprint detection mode; if the voiceprint portion of the audio signal portion and the preset sound If the matching degree of the pattern feature exceeds a preset threshold, it is further determined whether the matching degree of the expiratory airflow characteristic of the determination signal portion and the expiratory airflow characteristic of the audio characteristic portion exceeds a preset threshold value, and if the expiratory airflow characteristic of the signal portion is determined The matching degree of the expiratory airflow characteristic with the audio feature portion exceeds a preset threshold, and the voiceprint feature of the audio signal portion matches the preset voiceprint feature, and the exhalation airflow characteristic of the signal portion and the ex
  • the airflow feature is matched, the terminal determines that the voiceprint detection is successful, and the terminal unlocks; if the signal part of the expiratory airflow characteristic and the audio feature is judged Matching feature of the expiratory flow does not exceed a preset threshold, wherein the expiratory flow expiratory flow characteristic signal portion and an audio portion is determined characteristic does not match, the terminal determines a failure detector voiceprint, voiceprint detection mode terminal exits.
  • The preset threshold may be determined according to actual conditions. For example, if the voiceprint feature matching accuracy of the terminal is high, the preset threshold may be set to 95%; if the matching accuracy is low, the preset threshold may be set to 90%.
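  • The two-stage decision described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `detect`, the score scale of 0 to 1, and the reuse of 0.95 as the default for both thresholds are assumptions made for the example.

```python
from enum import Enum


class Result(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


def detect(voiceprint_score: float,
           airflow_score: float,
           voiceprint_threshold: float = 0.95,
           airflow_threshold: float = 0.95) -> Result:
    """Two-stage check: voiceprint first, expiratory airflow second.

    Scores are matching degrees in [0, 1] (an assumption of this sketch).
    "Exceeds" is read as strictly greater than the threshold.
    """
    # Stage 1: the voiceprint must match the preset voiceprint feature.
    if voiceprint_score <= voiceprint_threshold:
        return Result.FAILURE  # exit voiceprint detection mode directly
    # Stage 2: the airflow of the judgment signal portion must match
    # the airflow expected from the audio signal portion.
    if airflow_score <= airflow_threshold:
        return Result.FAILURE
    return Result.SUCCESS
```

  • Checking the voiceprint first lets the terminal exit early without evaluating the airflow at all when the speaker is not the enrolled user.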
  • In the voiceprint detection method provided by this embodiment, the terminal detects whether there is a sound signal. If a sound signal is detected, the terminal receives it and extracts the audio signal portion and the judgment signal portion of the sound signal. The voiceprint feature of the audio signal portion is compared with the preset voiceprint feature, and the expiratory airflow feature of the judgment signal portion is compared with the expiratory airflow feature of the audio signal portion. When the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, and the matching degree between the two expiratory airflow features exceeds a preset threshold, it is determined that the voiceprint detection is successful. In this way, when the terminal recognizes the sound signal, the sound signal is divided into an audio signal portion and a judgment signal portion so that the sound signal is checked twice. This effectively guards against the case in which a user mouths the words and blows while playing a recording, and the security of voiceprint unlocking is improved.
  • the method for voiceprint detection further includes:
  • An expiratory airflow characteristic that is greater than a preset airflow threshold in the portion of the determination signal is received.
  • the characteristics of the expiratory flow are quantified.
  • the quantified expiratory airflow characteristics are compared to the expiratory airflow characteristics of the audio signal portion.
  • That the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold includes:
  • the matching degree between the quantified expiratory airflow characteristic and the expiratory airflow characteristic of the audio signal portion exceeds the preset threshold.
  • The terminal determines whether the magnitude of the expiratory airflow of the blowing signal is greater than a preset airflow threshold, and receives the expiratory airflow in the judgment signal portion that is greater than the preset airflow threshold.
  • The expiratory airflow is then quantified according to its magnitude.
  • The preset airflow threshold in this embodiment of the present invention may be 0.10 liters per second (L/s).
  • the characteristics of the expiratory flow are quantified, including:
  • the expiratory airflow characteristic is compared with a preset airflow threshold. If the expiratory airflow characteristic is greater than the preset airflow threshold, the expiratory airflow characteristic is quantized to 1; otherwise, the expiratory flow characteristic is quantized to zero.
  • That the matching degree between the quantified expiratory airflow characteristic and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold includes at least one of the following two cases:
  • the expiratory airflow characteristic is quantized to 1, and the text corresponding to the audio signal portion is an aspirated sound.
  • the expiratory airflow characteristic is quantized to 0, and the text corresponding to the audio signal portion is an unaspirated sound.
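  • The binary quantization and the two matching cases above can be sketched as follows. The function names and the boolean `is_aspirated` input (which would come from analysing the recognized text) are hypothetical; the 0.10 L/s default is the example threshold given in the text.

```python
def quantize_airflow(flow_l_per_s: float, threshold: float = 0.10) -> int:
    """Quantize an expiratory airflow measurement to 1 or 0.

    Returns 1 only when the airflow is strictly greater than the
    preset airflow threshold, otherwise 0.
    """
    return 1 if flow_l_per_s > threshold else 0


def airflow_matches(quantized: int, is_aspirated: bool) -> bool:
    """Check the two matching cases from the text.

    Case 1: quantized value 1 and the text is an aspirated sound.
    Case 2: quantized value 0 and the text is an unaspirated sound.
    """
    return (quantized == 1 and is_aspirated) or (quantized == 0 and not is_aspirated)
```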
  • The blowing may be divided into several levels, for example, 10 levels. If the airflow of the received blowing signal is greater than or equal to level 5, it is determined that the preset threshold is reached, and the blowing signal is determined to be 1. If the airflow of the received blowing signal is less than level 5, it is determined that the preset threshold is not reached, and the blowing signal is determined to be 0.
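  • The level-based variant can be sketched in the same way. The function name, the assumption that levels are numbered from 1 to 10, and the range check are illustrative choices; the default threshold of level 5 follows the example in the text.

```python
def quantize_level(level: int, threshold_level: int = 5, max_level: int = 10) -> int:
    """Map a discrete blowing level to a binary blowing signal.

    Levels are assumed to run from 1 to max_level. A level at or
    above threshold_level yields 1, anything lower yields 0.
    """
    if not 1 <= level <= max_level:
        raise ValueError(f"level must be between 1 and {max_level}")
    return 1 if level >= threshold_level else 0
```

  • Raising `threshold_level` (for example to 8) makes the check stricter, at the cost of rejecting users who speak softly.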
  • FIG. 3A is a schematic diagram of quantification of a blowing signal according to Embodiment 1 of the present invention. As shown in FIG.
  • FIG. 3B is a schematic diagram of quantification of a blowing signal according to Embodiment 2 of the present invention. As shown in FIG. 3B, when the airflow of the blowing signal reaches level 8, the blowing signal is determined to be 1.
  • FIG. 4 is a schematic diagram of a process of voiceprint detection according to Embodiment 1 of the present invention.
  • A user utters the voiceprint recognition signal "Open sesame". After the microphone of the terminal receives it, the separation module separates the voiceprint recognition signal into an audio signal and a blowing signal, and the audio signal is sent to the voiceprint recognition module to complete voiceprint recognition.
  • The audio-to-text module converts the audio into the corresponding text, and determines whether each character or word of the text corresponds to an aspirated sound or an unaspirated sound.
  • The blowing module quantizes the received blowing signal: a blowing signal equal to or greater than the threshold is defined as 1, and a blowing signal below the threshold is defined as 0. The module outputs the resulting binary signal. The judgment module then compares the characters or words output by the audio-to-text module with the binary signal output by the blowing signal recognition module. For example, the blowing signal for "Open sesame" as spoken by the user is "0", "0", "1", "0"; the blowing signal for "top" is "1", "1"; and the blowing signal for "sport" is "0", "1".
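  • The comparison performed by the judgment module can be sketched as follows. The function name is hypothetical, the expected bit pattern is assumed to be supplied by the audio-to-text analysis, and the airflow values in the usage example are invented for illustration.

```python
from typing import Sequence


def blow_pattern_matches(expected_bits: Sequence[int],
                         measured_flows: Sequence[float],
                         threshold: float = 0.10) -> bool:
    """Compare the expected aspiration pattern with measured airflow.

    expected_bits: one 0/1 value per character or word of the text,
        1 for an aspirated sound and 0 for an unaspirated sound.
    measured_flows: one airflow measurement (L/s) per character or word.
    """
    if len(expected_bits) != len(measured_flows):
        return False
    # Quantize each measurement to a binary blowing signal.
    measured_bits = [1 if flow > threshold else 0 for flow in measured_flows]
    return measured_bits == list(expected_bits)


# Pattern "0", "0", "1", "0" from the example in the text.
live_speech = blow_pattern_matches([0, 0, 1, 0], [0.02, 0.05, 0.30, 0.01])
# A played-back recording produces no airflow at all.
recording = blow_pattern_matches([0, 0, 1, 0], [0.0, 0.0, 0.0, 0.0])
```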
  • FIG. 5 is a flowchart of a method for voiceprint detection according to Embodiment 2 of the present invention. This is another specific implementation of the method provided by the embodiments of the present invention. As shown in FIG. 5, the method includes:
  • S501 The terminal detects whether there is a sound signal.
  • S503 The terminal extracts an audio signal portion and a judgment signal portion of the sound signal.
  • S504 Compare the voiceprint feature of the audio signal portion with the preset voiceprint feature.
  • S501, S502, S503, and S504 are implemented in the same manner as S201, S202, S203, and S204, respectively. For details, refer to the descriptions of S201, S202, S203, and S204; they are not described herein again.
  • the determining signal portion may include a pointing direction feature, wherein the pointing direction feature is a direction in which the user corresponding to the sound signal outputs the sound.
  • The sound signal received by the terminal may have the problem that the audio signal and the blowing signal come from different directions. That is, another user plays a recording as the audio signal and produces the blowing signal separately, so that the audio signal and the blowing signal do not come from the same voice signal and their directions are inconsistent. For example, the attacker mouths "Open sesame" without voicing it, so the direction from which the recording is played differs from the direction of the blowing.
  • The terminal determines whether the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion are within a preset range, in order to determine whether the audio signal and the blowing signal come from the same pointing direction, thereby avoiding recording attacks.
  • Suppose the terminal is set up so that the voice signal of user A can unlock it. If user B plays a recording of user A's voice while mouthing the words and blowing, without actually voicing them, the direction of the recording playback and the direction of the blowing may differ. With ordinary voiceprint unlocking, however, user B could still unlock the terminal successfully, which is a security risk. In this embodiment of the present invention, it is determined whether the two pointing directions are within a preset range of the microphone array.
  • If the two pointing directions are within the preset range of the microphone array, the sound signals come from the same pointing direction and there is no recording attack; if the two pointing directions are not within the preset range of the microphone array, the sound signals come from different pointing directions and there is a recording attack.
  • Determining whether the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion are within a preset range includes: comparing the angle of the pointing direction of the judgment signal portion and the angle of the pointing direction of the audio signal portion, respectively, with a preset pointing angle threshold.
  • That the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion are within the preset range includes: the angle of the pointing direction of the judgment signal portion and the angle of the pointing direction of the audio signal portion are both smaller than the preset pointing angle threshold.
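  • A minimal sketch of this angle comparison is shown below. The function name, the 30-degree default threshold, and the convention that both angles are measured in degrees from the reference axis of the microphone array are assumptions for illustration; the text does not specify them.

```python
def within_pointing_range(audio_angle_deg: float,
                          blow_angle_deg: float,
                          threshold_deg: float = 30.0) -> bool:
    """Return True when both pointing angles are below the threshold.

    Each angle is compared separately with the preset pointing angle
    threshold; the check passes only if both are smaller than it.
    """
    return abs(audio_angle_deg) < threshold_deg and abs(blow_angle_deg) < threshold_deg
```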
  • Microphone directional reception can be used to prevent recording attacks. The terminal can include a mode in which the microphone receives signals directionally; that is, the microphone enters a directional listening mode and receives only audio and blowing signals within a preset range of angles. Limiting the range from which the microphone accepts audio and blowing signals avoids recording attacks.
  • Directional reception is realized by sound source localization, which can be implemented with a microphone array. The array captures sound arriving from different directions, and an algorithm steers the microphones toward a specific direction, forming a "beam" that points at the sound and captures the audio signal from that direction, so that the voice signal is received directionally. Because sound waves reach each microphone in the array with slightly different delays, a microphone array achieves better directivity than a single microphone.
  • In a specific implementation, the microphone array can steer the beam to a range of angles: the delay is estimated, for example, by generalized cross-correlation, the smoothed coherence transform, the phase transform, or maximum likelihood, and the receiving direction is then adjusted according to the delay and the placement of the microphone array.
  • The directional receiving region is a cone with angle θ. It is further determined whether the audio signal and the blowing signal of the received sound source S both arrive from a direction with angle θ1 that lies inside the cone of angle θ, in which case they are valid signals.
  • FIG. 6 is a schematic diagram of an angle of a pointing direction of a sound signal according to Embodiment 1 of the present invention.
  • A mobile phone has two microphones A and B. The distance between A and B is fixed at a known value d, and the speed of sound is fixed at C. From the time difference τ between the arrival of the sound at microphones A and B, the angle θ1 between the sound source (the sound signal) and microphone B can be calculated, and θ1 is used to judge whether the source lies within the cone of angle θ that defines the valid sound source direction. It can thus be judged whether the audio signal and the blowing signal of the sound source are valid signals received by the directional microphone.
  • Here, τ is the delay between the arrival of the sound at the two microphones;
  • d is the distance between the two microphones;
  • θ1 is the pointing direction angle of the voice signal;
  • C is the speed of sound.
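  • The formula itself did not survive in this text, so the sketch below uses the standard far-field two-microphone relation τ = d·cos(θ1)/C, that is θ1 = arccos(C·τ/d), as an assumption. It also assumes that θ1 is measured from the line joining the two microphones; the function name and the 343 m/s default (the approximate speed of sound in air at room temperature) are likewise illustrative.

```python
import math


def source_angle_deg(delay_s: float,
                     mic_distance_m: float,
                     speed_of_sound: float = 343.0) -> float:
    """Estimate the source angle from the inter-microphone delay.

    Far-field model (assumed): delay = d * cos(theta1) / C, so
    theta1 = arccos(C * delay / d), measured from the microphone axis.
    """
    if mic_distance_m <= 0:
        raise ValueError("mic_distance_m must be positive")
    ratio = speed_of_sound * delay_s / mic_distance_m
    # Measurement noise can push the ratio slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.acos(ratio))
```

  • Under this model a delay of zero means the source is broadside to the microphone pair (90 degrees), and a delay of d/C means the source lies on the line through both microphones (0 degrees).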
  • A threshold on the distance between the sound source and the microphone can also be set.
  • The distance between the sound source and the microphone is determined by a light sensor, an infrared sensor, an ultrasonic sensor, or the like, and the distance threshold is set so that the terminal can reliably check whether the direction of a recording attack is consistent with the direction of the blowing signal. The reason is that if the sound source is very close to the microphone, the recording and the blowing signal will appear to come from the same direction.
  • S506 When the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, and the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion are within a preset range, determine that the voiceprint detection is successful.
  • That is, when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds the preset threshold, and the audio signal and the blowing signal in the voiceprint recognition signal come from the same pointing direction, the voiceprint detection result is determined to be successful.
  • In the voiceprint detection method provided by this embodiment, the terminal detects whether there is a sound signal. If a sound signal is detected, the terminal receives it and extracts the audio signal portion and the judgment signal portion of the sound signal. The voiceprint feature of the audio signal portion is compared with the preset voiceprint feature, and it is determined whether the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion are within a preset range. When the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, and the two pointing direction features are within the preset range, it is determined that the voiceprint detection is successful. In this way, when the terminal recognizes the sound signal, the sound signal is divided into an audio signal portion and a judgment signal portion so that the sound signal is checked twice. This effectively guards against the case in which the direction of the recording playback and the direction of the mouthed blowing are inconsistent, and the security of voiceprint unlocking is improved.
  • FIG. 7 is a flowchart of a method for voiceprint detection according to Embodiment 3 of the present invention.
  • This is a further implementation of the method provided by the embodiments of the present invention. As shown in FIG. 7, the method includes:
  • S701 The terminal detects whether there is a sound signal.
  • S703 The terminal extracts an audio signal portion and a judgment signal portion of the sound signal.
  • S704 Compare the voiceprint feature of the audio signal portion with the preset voiceprint feature.
  • S701, S702, S703, and S704 are implemented in the same manner as S201, S202, S203, and S204, respectively. For details, refer to the descriptions of S201, S202, S203, and S204; they are not described herein again.
  • the determining signal portion may include a sensing temperature characteristic, wherein the sensing temperature characteristic is a temperature when the user corresponding to the sound signal outputs the sound.
  • the sensing temperature characteristic of the determining signal portion is compared with a preset temperature threshold to determine whether the sensing temperature characteristic of the determining signal portion is greater than or equal to a preset temperature threshold.
  • The terminal can sense the temperature near the microphone through an infrared sensor to determine that the voice signal comes from a human body, such as a user, rather than from an electronic device playing a recording.
  • the preset temperature threshold may be determined according to a temperature range of the human body, and the preset temperature threshold is generally set to a minimum temperature within a normal range of the human body, such as 36 degrees Celsius.
  • By means of the preset temperature threshold, it may be determined that the voice signal received by the terminal comes from the user.
  • In the voiceprint detection method provided by this embodiment, the terminal detects whether there is a sound signal. If a sound signal is detected, the terminal receives it and extracts the audio signal portion and the judgment signal portion of the sound signal. The voiceprint feature of the audio signal portion is compared with the preset voiceprint feature, and the sensing temperature feature of the judgment signal portion is compared with the preset temperature threshold. When the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, and the sensing temperature feature of the judgment signal portion is greater than or equal to the preset temperature threshold, it is determined that the voiceprint detection is successful.
  • The preset temperature threshold is used to determine that the voice signal received by the terminal comes from the user rather than from an electronic device playing a recording, thereby avoiding recording attacks and improving the security of voiceprint unlocking.
  • FIG. 8 is a flowchart of a method for voiceprint detection according to Embodiment 4 of the present invention.
  • The method provided by this embodiment is another specific implementation of the method of Embodiment 1 shown in FIG. 2.
  • the method provided by the embodiment of the present invention includes:
  • Before the voiceprint recognition signal is detected and the terminal enters the living voiceprint recognition mode, the method further includes:
  • the terminal detects whether there is a voiceprint recognition signal, where the voiceprint recognition signal is a sound signal detected while the terminal has not been unlocked.
  • That the terminal detects whether there is a voiceprint recognition signal includes: while the terminal has not been unlocked, the terminal detects whether there is a sound signal; if the terminal detects a sound signal, that sound signal is the voiceprint recognition signal.
  • S802 The terminal receives the voiceprint recognition signal and stores it.
  • S803 The terminal extracts an audio signal portion and a judgment signal portion of the voiceprint recognition signal.
  • S804 The terminal determines whether the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold. If the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, then S805 is performed; otherwise, S808 is performed.
  • The terminal may compare the voiceprint feature of the audio signal portion with the preset voiceprint feature to determine whether the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds the preset threshold.
  • S805 The terminal determines whether the audio signal of the audio signal portion and the air blowing signal of the determination signal portion are from the same directivity direction. If the audio signal of the audio signal portion and the air blowing signal of the determination signal portion are from the same directivity direction, then S806 is performed; otherwise, S808 is performed.
  • The terminal may determine whether the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion are within a preset range, in order to determine whether the audio signal of the audio signal portion and the blowing signal of the judgment signal portion come from the same pointing direction.
  • S806 The terminal determines whether the text corresponding to the audio signal portion matches the expiratory airflow in the judgment signal portion. If the text corresponding to the audio signal portion matches the expiratory airflow in the judgment signal portion, S807 is performed; otherwise, S808 is performed.
  • The terminal may compare the expiratory airflow characteristic of the judgment signal portion with the expiratory airflow characteristic of the audio signal portion to determine whether the text corresponding to the audio signal portion matches the expiratory airflow in the judgment signal portion.
  • The method further includes: determining whether the sensing temperature characteristic of the judgment signal portion is greater than or equal to a preset temperature threshold. The living voiceprint detection is successful when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold, the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion are within a preset range, and the sensing temperature characteristic of the judgment signal portion is greater than or equal to the preset temperature threshold.
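  • The sequence of checks in S804 to S806, followed by the temperature check, can be sketched as a short cascade. The `Measurements` container, the function name, and the default thresholds (0.95 and 36 degrees Celsius, taken from examples elsewhere in the text) are assumptions of this sketch; the temperature check has no step number here, so the label "temperature" is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Measurements:
    voiceprint_score: float               # matching degree in [0, 1]
    same_direction: bool                  # result of the pointing direction check
    airflow_matches_text: bool            # result of the airflow/text comparison
    temperature_c: Optional[float] = None  # None if no temperature was sensed


def living_voiceprint_check(m: Measurements,
                            voiceprint_threshold: float = 0.95,
                            temperature_threshold_c: float = 36.0
                            ) -> Tuple[bool, Optional[str]]:
    """Run the checks in order and report the first one that fails."""
    if m.voiceprint_score <= voiceprint_threshold:
        return False, "S804"
    if not m.same_direction:
        return False, "S805"
    if not m.airflow_matches_text:
        return False, "S806"
    # The temperature check applies only when a reading is available.
    if m.temperature_c is not None and m.temperature_c < temperature_threshold_c:
        return False, "temperature"
    return True, None
```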
  • In the voiceprint detection method provided by this embodiment, when the voiceprint recognition signal is detected, the terminal enters the living voiceprint recognition mode, receives the voiceprint recognition signal and stores it, and extracts the audio signal portion and the judgment signal portion of the voiceprint recognition signal. When the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, and the judgment feature of the judgment signal portion satisfies a preset judgment condition, it is determined that the voiceprint detection is successful. In this way, when the terminal recognizes the voiceprint recognition signal, the signal is divided into an audio signal portion and a judgment signal portion so that it is checked twice, which improves the security of voiceprint unlocking.
  • Before the terminal extracts the audio signal portion and the judgment signal portion of the voiceprint recognition signal, the method further includes:
  • the terminal separates the voiceprint recognition signal into an audio signal portion and a judgment signal portion.
  • That the terminal separates the voiceprint recognition signal into an audio signal portion and a judgment signal portion includes:
  • the terminal filters the voiceprint recognition signal by using a filter of a first preset frequency to obtain the audio signal portion;
  • the terminal filters the voiceprint recognition signal by using a filter of a second preset frequency to obtain the judgment signal portion.
  • The filter of the first preset frequency is a high-pass filter.
  • The filter of the second preset frequency is a low-pass filter.
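  • The separation step can be illustrated with a very simple filter pair. The text only states that a high-pass filter yields the audio signal portion and a low-pass filter yields the judgment signal portion; the first-order (one-pole) filter, the complementary construction of the high-pass output, the single shared cutoff, and all names below are assumptions of this sketch.

```python
import math
from typing import List, Sequence, Tuple


def split_signal(samples: Sequence[float],
                 sample_rate_hz: float,
                 cutoff_hz: float) -> Tuple[List[float], List[float]]:
    """Split a signal into (audio_part, judgment_part).

    judgment_part is the output of a one-pole low-pass filter and
    audio_part is the complementary high-pass output (input minus
    low-pass), so the two parts sum back to the input.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)  # smoothing factor of the low-pass filter
    state = samples[0] if samples else 0.0
    audio_part: List[float] = []
    judgment_part: List[float] = []
    for x in samples:
        state += alpha * (x - state)   # low-pass update
        judgment_part.append(state)
        audio_part.append(x - state)   # high-pass is the remainder
    return audio_part, judgment_part
```

  • A real terminal would likely use sharper filters with separately chosen first and second preset frequencies; the one-pole pair is used here only because it is short and easy to verify.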
  • FIG. 9 is a schematic structural diagram of a terminal according to Embodiment 1 of the present invention.
  • the terminal provided by the embodiment of the present invention includes: a detecting module 901, a receiving module 902, an extracting module 903, a first matching module 904, and a determining module 905.
  • the detecting module 901 is configured to detect whether there is a sound signal.
  • the receiving module 902 is configured to receive a sound signal.
  • the extraction module 903 is configured to extract an audio signal portion and a determination signal portion of the sound signal.
  • the first matching module 904 is configured to compare the voiceprint feature of the audio signal portion with the preset voiceprint feature; and compare the expiratory airflow characteristic of the determination signal portion with the expiratory airflow characteristic of the audio signal portion.
  • the exhalation airflow is characterized by the airflow exhaled when the user corresponding to the sound signal outputs the sound.
  • the determining module 905 is configured to determine that the voiceprint detection is successful when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, and the matching degree between the expiratory airflow feature of the judgment signal portion and the expiratory airflow feature of the audio signal portion exceeds a preset threshold.
  • the terminal of the embodiment of the present invention is used to perform the technical solution of the method embodiment shown in FIG. 2, and the implementation principle and the technical effect are similar, and details are not described herein again.
  • the receiving module 902 is further configured to receive an expiratory airflow feature that is greater than a preset airflow threshold in the determination signal portion.
  • the terminal further includes: a quantization module.
  • a quantification module for quantifying the characteristics of the expiratory flow.
  • the first matching module 904 is further configured to compare the quantized expiratory airflow characteristics with the expiratory airflow characteristics of the audio signal portion.
  • That the matching degree, as judged by the determining module 905, between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold includes: the matching degree between the quantized expiratory airflow characteristic and the expiratory airflow characteristic of the audio signal portion exceeds the preset threshold.
  • the first matching module 904 is specifically configured to: compare the expiratory airflow characteristic with the preset airflow threshold; if the expiratory airflow characteristic is greater than the preset airflow threshold, the expiratory airflow characteristic is quantized to 1; otherwise, the expiratory airflow characteristic is quantized to 0.
  • That the matching degree, as determined by the determining module 905, between the quantized expiratory airflow characteristic and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold includes at least one of the following two cases:
  • the expiratory airflow characteristic is quantized to 1, and the text corresponding to the audio signal portion is an aspirated sound.
  • the expiratory airflow characteristic is quantized to 0, and the text corresponding to the audio signal portion is an unaspirated sound.
  • FIG. 10 is a schematic structural diagram of a terminal according to Embodiment 2 of the present invention. As shown in FIG. 10, the terminal provided by the embodiment of the present invention further includes: a second matching module 906, based on the foregoing embodiment.
  • the second matching module 906 is configured to determine whether the pointing direction feature of the determining signal portion and the pointing direction feature of the audio signal portion are within a preset range.
  • the determining module 905 is further configured to determine that the voiceprint detection is successful when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold, and the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion are within the preset range.
  • the terminal of the embodiment of the present invention is used to implement the technical solution of the method embodiment shown in FIG. 5, and the implementation principle and the technical effect are similar, and details are not described herein again.
  • the second matching module 906 is specifically configured to: respectively compare an angle of a pointing direction of the determining signal portion and an angle of a pointing direction of the audio signal portion with a preset pointing angle threshold.
  • That the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion, as determined by the determining module 905, are within a preset range includes: the angle of the pointing direction of the judgment signal portion and the angle of the pointing direction of the audio signal portion are both smaller than the preset pointing angle threshold.
  • FIG. 11 is a schematic structural diagram of a terminal according to Embodiment 3 of the present invention. As shown in FIG. 11, the terminal provided by the embodiment of the present invention further includes: a third matching module 907, based on the foregoing embodiment.
  • the third matching module 907 is configured to compare the perceived temperature feature of the determination signal portion with a preset temperature threshold.
  • the determining module 905 is further configured to determine that the voiceprint detection is successful when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, the matching degree between the expiratory airflow characteristic of the judgment signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold, the pointing direction feature of the judgment signal portion and the pointing direction feature of the audio signal portion are within a preset range, and the sensing temperature characteristic of the judgment signal portion is greater than or equal to the preset temperature threshold.
  • the terminal of the embodiment of the present invention is used to implement the technical solution of the method embodiment shown in FIG. 7.
  • the implementation principle and the technical effect are similar, and details are not described herein again.
  • the terminal further includes: a separation module.
  • a separation module configured to separate the sound signal into an audio signal portion and a determination signal portion before the extraction module extracts the audio signal portion of the sound signal and the determination signal portion.
  • the separation module is specifically configured to: filter the sound signal by using a filter of a first preset frequency to obtain an audio signal portion; and filter the sound signal by using a filter of a second preset frequency to obtain a judgment signal portion.
  • the filter of the first preset frequency is a high pass filter
  • the filter of the second preset frequency is a low pass filter
  • FIG. 12 is a schematic structural diagram of a terminal according to Embodiment 4 of the present invention.
  • a terminal provided by an embodiment of the present invention includes: a microphone 1201, a memory 1202, and a processor 1203.
  • The microphone 1201 may correspond to the detection module 901 of the terminal and is configured to detect whether there is a sound signal and, if a sound signal is detected, to receive the sound signal.
  • The microphone 1201 can also be configured to receive a determination signal portion whose expiratory airflow characteristic is greater than a preset airflow threshold.
  • The memory 1202 is configured to store execution instructions, and the processor 1203 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. When the terminal is running, the processor 1203 communicates with the memory 1202, and the processor 1203 invokes the execution instructions to perform the following operations:
  • The expiratory airflow characteristic is a characteristic of the airflow exhaled when the user corresponding to the sound signal utters the sound; when the matching degree between the voiceprint feature of the audio signal portion and the preset voiceprint feature exceeds a preset threshold, and the matching degree between the expiratory airflow characteristic of the determination signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold, it is determined that the voiceprint detection result is a successful detection.
  • the terminal may further include: a recorder 1204.
  • the recorder 1204 can be used to collect sound signals emitted by the user, perform feature analysis on the sound signals, and acquire preset voiceprint features and store them.
  • the processor 1203 is further configured to perform the following operations:
  • Quantizing the expiratory airflow characteristic, and comparing the quantized expiratory airflow characteristic with the expiratory airflow characteristic of the audio signal portion;
  • That the matching degree, as determined by the processor 1203, between the expiratory airflow characteristic of the determination signal portion and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold includes: the matching degree between the quantized expiratory airflow characteristic and the expiratory airflow characteristic of the audio signal portion exceeds the preset threshold.
  • the processor 1203 is further configured to perform the following operations:
  • the degree of matching between the quantized expiratory airflow characteristic determined by the processor 1203 and the expiratory airflow characteristic of the audio signal portion exceeds a preset threshold, including: at least one of the following two cases:
  • The expiratory airflow characteristic is quantized to 0, and the text corresponding to the audio signal portion is an unaspirated sound.
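The quantized comparison can be illustrated with a short sketch. Only the case "quantized to 0 with an unaspirated sound" is stated in the text above; the complementary case (quantized to 1 with an aspirated sound) and the threshold-based quantization rule are assumptions made here for illustration, as are both function names.

```python
def quantize_airflow(airflow_level: float, airflow_threshold: float) -> int:
    """Quantize a measured airflow level to 1 (airflow present) or 0 (absent).

    Assumption: airflow counts as present when it is greater than the threshold.
    """
    return 1 if airflow_level > airflow_threshold else 0


def airflow_matches_text(quantized_airflow: int, is_aspirated: bool) -> bool:
    """Check whether the quantized airflow is consistent with the spoken text."""
    if quantized_airflow == 0 and not is_aspirated:
        return True  # case stated in the description: no airflow, unaspirated sound
    if quantized_airflow == 1 and is_aspirated:
        return True  # assumed complementary case: airflow present, aspirated sound
    return False
```

For example, `airflow_matches_text(quantize_airflow(0.02, 0.1), is_aspirated=False)` evaluates the stated case, with `0.02` and `0.1` being arbitrary illustrative values.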
  • the processor 1203 is further configured to perform the following operations:
  • That the pointing direction feature of the determination signal portion and the pointing direction feature of the audio signal portion, as determined by the processor 1203, are within a preset range includes: the angle between the pointing direction of the determination signal portion and the pointing direction of the audio signal portion is smaller than the preset pointing angle threshold.
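A possible form of this pointing-direction check is sketched below. It assumes that each pointing direction is available as a single angle in degrees and that the comparison should wrap around at 360 degrees; neither assumption is stated in the patent text, and the function names are hypothetical.

```python
def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two directions, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def directions_within_range(
    determination_deg: float, audio_deg: float, pointing_angle_threshold_deg: float
) -> bool:
    """True when the two pointing directions differ by less than the threshold."""
    return angular_difference(determination_deg, audio_deg) < pointing_angle_threshold_deg
```

The wrap-around handling means that directions of 350 and 10 degrees are treated as 20 degrees apart rather than 340.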
  • the processor 1203 is further configured to perform the following operations:
  • the signal frequency of the audio signal portion is compared with the preset voiceprint sample characteristic frequency.
  • the processor 1203 is further configured to perform the following operations:
  • the sound signal is separated into an audio signal portion and a judgment signal portion.
  • The sound signal is filtered by using a filter of a first preset frequency to obtain the audio signal portion, and the sound signal is filtered by using a filter of a second preset frequency to obtain the determination signal portion; wherein the filter of the first preset frequency is a high-pass filter and the filter of the second preset frequency is a low-pass filter.
  • FIG. 13 is a schematic structural diagram of an apparatus for detecting a voiceprint according to Embodiment 1 of the present invention.
  • The device provided by the embodiment of the present invention can be implemented as a single device, or can be integrated into various voice assistant devices, such as a set-top box, a mobile phone, a tablet personal computer, a laptop computer, a multimedia player, a digital camera, a personal digital assistant (PDA), a navigation device, a mobile Internet device (MID), or a wearable device.
  • the apparatus provided by the embodiment of the present invention may include one or more of the following units: an input unit, a storage unit, a processor unit, a communication unit, a peripheral interface, an output unit, and a power source.
  • the microphone can be used as an input unit, and the input unit can input an audio signal to detect whether the terminal has a voiceprint recognition signal.
  • The memory can be used as a storage unit, and the storage unit can store execution instructions, such as the execution instructions of an operating program and application programs, or the execution instructions of specific modules such as a blow signal recognition module, a blow signal and audio signal separation module, and a blow signal determination module.
  • The processor may be a processor unit, and the processor unit may be a central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • When the terminal is running, the processor unit communicates with the storage unit, and the processor unit invokes the execution instructions to perform the operations in the above method embodiments.
  • The communication unit can be used for wired or wireless communication between the terminal and other devices.
  • the peripheral interface can be used to provide an interface between the terminal and the peripheral interface module, wherein the peripheral interface module can be a keyboard, a button, or the like.
  • the output unit can be used to output an audio signal.
  • the power supply can be used to provide power to the various units of the terminal.
  • Embodiments of the present invention also provide a non-transitory computer readable storage medium, such as a storage unit including instructions that are executable by a processor of a voiceprint detecting device to perform the above method.
  • The non-transitory computer readable storage medium can be, for example, a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
  • a non-transitory computer readable storage medium storing computer instructions for causing an apparatus for controlling a cache to perform an operation in the above-described method embodiments.
  • When the instructions in the storage medium are executed by the processor of the terminal, the terminal is enabled to perform the operations in the above method embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Telephone Function (AREA)
  • Toys (AREA)

Abstract

A method and an apparatus for voiceprint detection are provided. The method includes: when a matching degree between a voiceprint feature of an audio signal portion and a preset voiceprint feature exceeds a preset threshold, and a matching degree between an expiratory airflow characteristic of a determination signal portion and an expiratory airflow characteristic of the audio signal portion exceeds the preset threshold, determining that a voiceprint detection result indicates that the detection succeeded (S206). The voiceprint detection method and apparatus improve the security of voiceprint unlocking.
PCT/CN2015/100286 2015-12-31 2015-12-31 Method and apparatus for voiceprint detection Ceased WO2017113370A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2015/100286 WO2017113370A1 (fr) 2015-12-31 2015-12-31 Method and apparatus for voiceprint detection
CN201580079562.2A CN107533415B (zh) 2015-12-31 2015-12-31 声纹检测的方法和装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/100286 WO2017113370A1 (fr) 2015-12-31 2015-12-31 Method and apparatus for voiceprint detection

Publications (1)

Publication Number Publication Date
WO2017113370A1 true WO2017113370A1 (fr) 2017-07-06

Family

ID=59224366

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/100286 Ceased WO2017113370A1 (fr) 2015-12-31 2015-12-31 Method and apparatus for voiceprint detection

Country Status (2)

Country Link
CN (1) CN107533415B (fr)
WO (1) WO2017113370A1 (fr)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019141139A1 (fr) * 2018-01-17 2019-07-25 Huawei Technologies Co., Ltd. Echoprint user authentication
CN113707182A (zh) * 2021-09-17 2021-11-26 北京声智科技有限公司 声纹识别方法、装置、电子设备及存储介质
CN113744431A (zh) * 2020-05-14 2021-12-03 大富科技(安徽)股份有限公司 一种共享单车车锁控制装置、方法、设备及介质
CN116092167A (zh) * 2023-02-23 2023-05-09 唯思电子商务(深圳)有限公司 一种基于读数的人脸活体检测方法
CN119207432A (zh) * 2024-09-14 2024-12-27 维沃移动通信有限公司 声纹校验方法、装置、电子设备及可读存储介质
GB2635257A (en) * 2021-09-07 2025-05-07 Pi A Creative Systems Ltd Method for detecting user input to a breath input configured user interface

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115346340B (zh) * 2022-07-21 2023-11-17 浙江极氪智能科技有限公司 改善驾驶疲劳的装置及方法

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06105388A (ja) * 1992-09-18 1994-04-15 Matsushita Electric Ind Co Ltd 呼気流センサ
US20060036441A1 (en) * 2004-08-13 2006-02-16 Canon Kabushiki Kaisha Data-managing apparatus and method
CN101441869A (zh) * 2007-11-21 2009-05-27 联想(北京)有限公司 语音识别终端用户身份的方法及终端
CN102737634A (zh) * 2012-05-29 2012-10-17 百度在线网络技术(北京)有限公司 一种基于语音的认证方法及装置
CN102866844A (zh) * 2012-08-13 2013-01-09 上海华勤通讯技术有限公司 移动终端及其解锁方法
CN202841290U (zh) * 2012-06-04 2013-03-27 百度在线网络技术(北京)有限公司 移动终端的解锁装置及具有该解锁装置的移动终端
CN104021790A (zh) * 2013-02-28 2014-09-03 联想(北京)有限公司 声控解锁方法以及电子设备

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101897627B (zh) * 2010-06-30 2012-10-24 广州医学院第一附属医院 一种小鼠咳嗽模型的建立和检测方法
CN102523347A (zh) * 2011-12-16 2012-06-27 广东步步高电子工业有限公司 一种应用于电子产品中的吹气操控方法和装置
CN103886861B (zh) * 2012-12-20 2017-03-01 联想(北京)有限公司 一种控制电子设备的方法及电子设备

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019141139A1 (fr) * 2018-01-17 2019-07-25 Huawei Technologies Co., Ltd. Echoprint user authentication
US10853463B2 (en) 2018-01-17 2020-12-01 Futurewei Technologies, Inc. Echoprint user authentication
US11461447B2 (en) 2018-01-17 2022-10-04 Futurewei Technologies, Inc. Echoprint user authentication
CN113744431A (zh) * 2020-05-14 2021-12-03 大富科技(安徽)股份有限公司 一种共享单车车锁控制装置、方法、设备及介质
CN113744431B (zh) * 2020-05-14 2024-04-09 大富科技(安徽)股份有限公司 一种共享单车车锁控制装置、方法、设备及介质
GB2635257A (en) * 2021-09-07 2025-05-07 Pi A Creative Systems Ltd Method for detecting user input to a breath input configured user interface
GB2635257B (en) * 2021-09-07 2025-12-03 Pi A Creative Systems Ltd Method for detecting user input to a breath input configured user interface
CN113707182A (zh) * 2021-09-17 2021-11-26 北京声智科技有限公司 声纹识别方法、装置、电子设备及存储介质
CN116092167A (zh) * 2023-02-23 2023-05-09 唯思电子商务(深圳)有限公司 一种基于读数的人脸活体检测方法
CN119207432A (zh) * 2024-09-14 2024-12-27 维沃移动通信有限公司 声纹校验方法、装置、电子设备及可读存储介质

Also Published As

Publication number Publication date
CN107533415B (zh) 2020-09-11
CN107533415A (zh) 2018-01-02

Similar Documents

Publication Publication Date Title
US11735191B2 (en) Speaker recognition with assessment of audio frame contribution
CN108305615B (zh) 一种对象识别方法及其设备、存储介质、终端
US11475899B2 (en) Speaker identification
CN107533415B (zh) 声纹检测的方法和装置
US11735189B2 (en) Speaker identification
Wang et al. Voicepop: A pop noise based anti-spoofing system for voice authentication on smartphones
US8589167B2 (en) Speaker liveness detection
US20250225983A1 (en) Detection of replay attack
US10950245B2 (en) Generating prompts for user vocalisation for biometric speaker recognition
US11042616B2 (en) Detection of replay attack
Wang et al. Secure your voice: An oral airflow-based continuous liveness detection for voice assistants
GB2608710A (en) Speaker identification
JP6220304B2 (ja) 音声識別装置
Gupta et al. Deep convolutional neural network for voice liveness detection
Sahidullah et al. Robust speaker recognition with combined use of acoustic and throat microphone speech
Zhang et al. A phoneme localization based liveness detection for text-independent speaker verification
Chang et al. Vogue: Secure user voice authentication on wearable devices using gyroscope
KR20110079161A (ko) 이동 단말기에서 화자 인증 방법 및 장치
WO2019006587A1 (fr) Speaker recognition system, speaker recognition method, and in-ear device
WO2023047893A1 (fr) Authentication device and authentication method
Zhang et al. A Continuous Liveness Detection System for Text-independent Speaker Verification
Korshunov et al. Presentation attack detection in voice biometrics

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15911995

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15911995

Country of ref document: EP

Kind code of ref document: A1