CN108986790A - Method and apparatus for recognizing a contact by voice
- Publication number
- CN108986790A (application number CN201811148211.4A)
- Authority
- CN
- China
- Prior art keywords
- contact person
- default
- phoneme sequence
- identification
- recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Telephonic Communication Services (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Embodiments of the present application disclose a method and apparatus for recognizing a contact by voice. One specific embodiment of the method includes: performing speech recognition on a received voice query, and extracting from the recognition result the phoneme sequence corresponding to a target identifier that identifies the queried target contact; matching the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers that identify the preset contacts in a preset contact set, and determining, according to the matching result, the target contact queried by the voice query from the preset contact set. This embodiment improves the efficiency and accuracy of offline speech recognition.
Description
Technical field
Embodiments of the present application relate to the field of computer technology, specifically to the field of speech technology, and in particular to a method and apparatus for recognizing a contact by voice.
Background
Speech recognition on a mobile terminal usually relies on the computing power of a server; in some scenarios, offline speech recognition can provide the user with a more accurate voice service. Offline voice recognition of contacts covers the scenario in which the operating system has no network access: an offline speech recognition application compares the name in the voice information uttered by the user with the contact names stored locally, and obtains the contact name that best matches what the user expects to find.
In the above technique for offline voice recognition of contacts, taking Chinese as an example, after the local speech recognition application recognizes the user's speech, it returns combinations of commonly used Chinese characters whose pronunciation is similar to the user's. These character combinations are then converted into their full pinyin, which is compared one by one with the full pinyin of each contact to obtain the matching result.
Summary of the invention
Embodiments of the present application propose a method and apparatus for recognizing a contact by voice.
In a first aspect, an embodiment of the present application provides a method for recognizing a contact by voice, comprising: performing speech recognition on a received voice query, and extracting from the recognition result the phoneme sequence corresponding to a target identifier that identifies the queried target contact; and matching the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers that identify the preset contacts in a preset contact set, and determining, according to the matching result, the target contact queried by the voice query from the preset contact set.
In some embodiments, the method further comprises: determining the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
In some embodiments, determining the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set comprises: according to the correspondence between single characters and phonemes in a character library, decomposing the contact identifier of each preset contact in the preset contact set into phonemes, character by character, to obtain the phoneme combination corresponding to the contact identifier of each preset contact in the preset contact set.
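The character-by-character decomposition described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the table `CHAR_TO_PHONEMES`, the initial/final-with-tone phoneme notation, and the function names `decompose` and `build_phoneme_index` are all assumptions introduced for the example.

```python
# Hypothetical character-to-phoneme table (assumed initial/final-with-tone notation).
# A real system would load this mapping from its own character library.
CHAR_TO_PHONEMES = {
    "张": ("zh", "ang1"),
    "三": ("s", "an1"),
    "李": ("l", "i3"),
    "四": ("s", "i4"),
}


def decompose(contact_identifier):
    """Return the phonemes of a contact identifier, character by character."""
    phonemes = []
    for char in contact_identifier:
        if char not in CHAR_TO_PHONEMES:
            raise KeyError(f"no phonemes known for character {char!r}")
        phonemes.extend(CHAR_TO_PHONEMES[char])
    return tuple(phonemes)


def build_phoneme_index(contact_identifiers):
    """Map every contact identifier to its phoneme combination."""
    return {name: decompose(name) for name in contact_identifiers}
```

Building the index once, when the contact list changes, would keep the per-query work limited to matching; whether the patent's system precomputes it this way is not stated.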
In some embodiments, performing speech recognition on the received voice query and extracting from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact comprises: decoding the voice query based on an acoustic model to obtain the phoneme sequence corresponding to the voice query; converting the phoneme sequence corresponding to the voice query into a corresponding text recognition result based on a language model; matching the text recognition result against preset instruction templates, and extracting from the text recognition result the instruction text segment that matches a preset instruction template; and removing the phoneme sequence corresponding to the instruction text segment from the phoneme sequence corresponding to the voice query, to obtain the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
In some embodiments, performing speech recognition on the received voice query and extracting from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact comprises: inputting the voice query into a trained person-identifier phoneme extraction model to obtain the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
In a second aspect, an embodiment of the present application provides an apparatus for recognizing a contact by voice, comprising: a recognition unit, configured to perform speech recognition on a received voice query and extract from the recognition result the phoneme sequence corresponding to a target identifier that identifies the queried target contact; and a matching unit, configured to match the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers that identify the preset contacts in a preset contact set, and to determine, according to the matching result, the target contact queried by the voice query from the preset contact set.
In some embodiments, the apparatus further comprises: a determination unit, configured to determine the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
In some embodiments, the determination unit is further configured to determine the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set as follows: according to the correspondence between single characters and phonemes in a character library, decomposing the contact identifier of each preset contact in the preset contact set into phonemes, character by character, to obtain the phoneme combination corresponding to the contact identifier of each preset contact in the preset contact set.
In some embodiments, the recognition unit is further configured to perform speech recognition on the received voice query, and to extract from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact, as follows: decoding the voice query based on an acoustic model to obtain the phoneme sequence corresponding to the voice query; converting the phoneme sequence corresponding to the voice query into a corresponding text recognition result based on a language model; matching the text recognition result against preset instruction templates, and extracting from the text recognition result the instruction text segment that matches a preset instruction template; and removing the phoneme sequence corresponding to the instruction text segment from the phoneme sequence corresponding to the voice query, to obtain the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
In some embodiments, the recognition unit is further configured to perform speech recognition on the received voice query, and to extract from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact, as follows: inputting the voice query into a trained person-identifier phoneme extraction model to obtain the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
In a third aspect, an embodiment of the present application provides an electronic device, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for recognizing a contact by voice provided in the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method for recognizing a contact by voice provided in the first aspect.
The method and apparatus for recognizing a contact by voice of the above embodiments of the present application perform speech recognition on a received voice query and extract from the recognition result the phoneme sequence corresponding to a target identifier that identifies the queried target contact; they then match the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers that identify the preset contacts in a preset contact set, and determine, according to the matching result, the target contact queried by the voice query from the preset contact set. This optimizes the process of recognizing a contact by voice: the steps of converting the phonemes corresponding to the voice query into Chinese characters and then converting the Chinese characters into the corresponding pinyin are eliminated, which improves contact matching efficiency. In addition, because a phoneme is a smaller phonetic unit than a pinyin syllable, a contact recognition method based on phoneme matching is better able to distinguish between phoneme sequences and phoneme combinations with similar pronunciations, so the method for recognizing a contact by voice of the above embodiments can also improve recognition accuracy.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 is a diagram of an exemplary system architecture to which embodiments of the present application can be applied;
Fig. 2 is a flowchart of one embodiment of the method for recognizing a contact by voice according to the present application;
Fig. 3 is a flowchart of another embodiment of the method for recognizing a contact by voice according to the present application;
Fig. 4 is a schematic structural diagram of one embodiment of the apparatus for recognizing a contact by voice of the present application;
Fig. 5 is a schematic structural diagram of a computer system suitable for implementing the electronic device of the embodiments of the present application.
Detailed description of the embodiments
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which the method or the apparatus for recognizing a contact by voice of the present application can be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101 and 102, a network, and a server 103. The network serves as the medium that provides communication links between the terminal devices 101, 102 and the server 103. The network may include various connection types, such as wired or wireless communication links or fiber optic cables.
The terminal devices 101, 102 may interact with the server 103 over the network to receive or send messages and the like. Various voice interaction applications may be installed on the terminal devices 101, 102, such as voice assistant applications, information search applications, map applications, social platform applications, and audio and video playback applications.
The terminal devices 101, 102 may be devices with an audio signal acquisition function, and may be any of various electronic devices that have a microphone and support Internet access, including but not limited to in-vehicle terminals, smart speakers, smartphones, tablet computers, smart watches, notebook computers, laptop computers, and e-book readers.
The server 103 may be a server that provides audio signal processing, such as a speech recognition server. When the network communication quality is good, the server 103 may decode the audio signal sent by the terminal devices 101, 102 and recognize the text corresponding to the audio signal. The server 103 may feed the recognition result of the speech signal back to the terminal devices 101, 102 over the network.
When the network communication quality is poor or the network is unavailable, the terminal devices 101, 102 may also parse the collected audio signal of the user 110 themselves, determine the user's intent, and respond. For example, if the user utters the audio signal "call XXX", the terminal devices 101, 102 may look up, in the offline address book, the name of the contact the user wishes to reach and place the call. The terminal devices 101, 102 may include components for performing the computation (for example, processors such as GPUs). The method for recognizing a contact by voice provided by the embodiments of the present application may be executed by the terminal devices 101, 102; correspondingly, the apparatus for recognizing a contact by voice may be provided in the terminal devices 101, 102.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires. Moreover, in the embodiments of the present application, the above system architecture may omit the network and the server.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for recognizing a contact by voice according to the present application is shown. The method for recognizing a contact by voice comprises the following steps:
Step 201, performing speech recognition on the received voice query, and extracting from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
In this embodiment, the execution body of the method for recognizing a contact by voice may receive a voice query. The voice query may be generated from a voice query request uttered by the user. Specifically, the voice query may be a speech signal generated by encoding the voice query request uttered by the user, and may include the speech encoding of the content being requested.
In practice, the user may utter to the execution body a voice query request asking for a target contact, for example the voice request "call Zhang San". The execution body may generate the corresponding voice query from the user's voice request.
The execution body may recognize the received voice query locally. During recognition, it may extract acoustic features of the voice query, such as fundamental frequency features and Mel-frequency cepstral features, and parse each speech frame of the voice query based on the extracted acoustic features to obtain the phoneme corresponding to each speech frame. Consecutive identical phonemes are then merged to form the phoneme sequence corresponding to the voice query.
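The merging of consecutive identical frame-level phonemes can be sketched in a few lines. The function name `collapse_frames` and the sample labels are assumptions made for illustration; the patent does not specify how the merge is implemented.

```python
from itertools import groupby


def collapse_frames(frame_phonemes):
    """Merge runs of identical consecutive frame labels into single phonemes."""
    # groupby yields one (key, run) pair per run of equal adjacent items,
    # so keeping only the keys collapses each run to a single label.
    return [phoneme for phoneme, _run in groupby(frame_phonemes)]
```

For example, the frame labels `["zh", "zh", "ang1", "ang1", "ang1", "s", "an1", "an1"]` would collapse to `["zh", "ang1", "s", "an1"]`. One limitation of this simple merge is that a phoneme genuinely spoken twice in a row, with no other label between the two runs, would also be collapsed into one.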
Then, based on the phoneme corresponding to each speech frame, a language model is used to search for the optimal decoding path. During this search, the single characters or words corresponding to the voice query are output one by one, and at this point it can be judged whether an output character or word is one used to identify a contact. Specifically, a key-character library and a keyword library may be built from a library of common person identifiers (such as names and forms of address); for example, a keyword library containing common surnames may be built. During the search for the optimal path, it can be judged whether the character or word currently obtained by decoding is in the key-character library or the keyword library. If so, the currently decoded character or word can be determined to be a character or word in the target identifier that identifies the queried target contact. Optionally, if the currently decoded character or word is in the key-character library or the keyword library, its context may additionally be used to determine whether it is a character or word in the target identifier that identifies the queried target contact. Taking Chinese as an example, if the decoded character is "Liu" (刘) and "Liu" is matched in the common-surname library, the character is determined to be a character in the target identifier that identifies the target contact; if the decoded character is "Si" (司) and the character after it is "Ma" (马), the surname "Sima" (司马) is matched in the common-surname library, and "Si" and "Ma" can likewise be determined to be characters in the target identifier that identifies the queried target contact.
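The surname lookup with one character of lookahead, as in the "Liu" and "Sima" examples above, might look like the following sketch. The two small sets and the function name `surname_length_at` are hypothetical; a real keyword library would be far larger and might also hold forms of address.

```python
# Hypothetical keyword libraries; real ones would be built from a
# library of common person identifiers.
SINGLE_SURNAMES = {"刘", "张", "李"}
COMPOUND_SURNAMES = {"司马", "欧阳"}


def surname_length_at(text, index):
    """Return how many characters at `index` form a known surname (0 if none)."""
    if index >= len(text):
        return 0
    # Check the two-character window first so that a compound surname
    # is recognized as a whole rather than by its first character.
    if text[index:index + 2] in COMPOUND_SURNAMES:
        return 2
    if text[index] in SINGLE_SURNAMES:
        return 1
    return 0
```

Checking the compound surnames before the single-character ones is a design choice made here so that the longest match wins; the source only says that context may be taken into account.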
During the search for the optimal decoding path, if one or more characters or words are determined to form the target identifier that identifies the target contact, the phoneme sequence corresponding to that target identifier can be extracted from the phoneme sequence corresponding to the voice query, yielding the phoneme sequence corresponding to the target identifier. Here, the target identifier may be a form of address for the target contact, and may be expressed by the target contact's name, job title, social relationship to the local user (for example, kinship terms such as "cousin" or "second uncle"), and so on.
In some optional implementations of this embodiment, the voice query uttered by the user may be a combination of a contact identifier and a corresponding operation instruction, such as "make a phone call to XX", where "make a phone call to ..." is the operation instruction and "XX" is the contact identifier of the queried target contact. The phoneme sequence corresponding to the target identifier that identifies the queried target contact can therefore be extracted by separating the contact identifier from the operation instruction in the voice query. Speech recognition may be performed on the received voice query, and the phoneme sequence corresponding to the target identifier extracted from the recognition result, as follows: first, the voice query is decoded based on an acoustic model to obtain the phoneme sequence corresponding to the voice query; next, the phoneme sequence corresponding to the voice query is converted into a corresponding text recognition result based on a language model; the text recognition result is then matched against preset instruction templates, and the instruction text segment that matches a preset instruction template is extracted from the text recognition result; finally, the phoneme sequence corresponding to the instruction text segment is removed from the phoneme sequence corresponding to the voice query, yielding the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
Specifically, the acoustic features of the voice query may first be extracted, and the acoustic features then decoded based on the acoustic model to obtain the phoneme sequence corresponding to the voice query. The phoneme sequence is then decoded based on the language model, searching for the optimal decoding path, to obtain the text recognition result of the voice query. After that, the text recognition result corresponding to the voice query may be matched against the preset instruction templates using either fuzzy or exact matching. A preset instruction template may be a template for an instruction to perform a preset operation, such as "phone ...", "send a WeChat message to ...", or "call ...". The text segment that matches the preset instruction template may be extracted from the text recognition result, and the remaining text segments of the text recognition result, other than the one matching the preset instruction template, are taken as the target identifier that identifies the target contact. Finally, the phonemes corresponding to the text segment that matches the preset instruction template may be located in the phoneme sequence corresponding to the voice query and removed, which yields the phoneme sequence corresponding to the target identifier.
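The removal of the instruction segment's phonemes can be sketched as below, under two assumptions that the source does not spell out: that the recognizer can supply a per-character alignment between the recognized text and its phonemes, and that exact template matching is used. The template list, the alignment format, and the function name `extract_target_phonemes` are hypothetical.

```python
# Hypothetical instruction templates ("phone ...", "call ...").
INSTRUCTION_TEMPLATES = ["打电话给", "呼叫"]


def extract_target_phonemes(aligned):
    """Drop the phonemes of the matched instruction segment.

    `aligned` is a list of (character, phoneme tuple) pairs in spoken order,
    one pair per recognized character. Returns the remaining phonemes,
    or None when no instruction template matches.
    """
    text = "".join(char for char, _phonemes in aligned)
    for template in INSTRUCTION_TEMPLATES:
        start = text.find(template)
        if start == -1:
            continue
        end = start + len(template)
        # One pair per character, so text offsets are also list offsets.
        remaining = aligned[:start] + aligned[end:]
        return [p for _char, phonemes in remaining for p in phonemes]
    return None
```

Given the alignment for "呼叫张三", the function would keep only the phonemes of "张三". Fuzzy template matching, which the text also allows, is not covered by this sketch.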
In other optional implementations of this embodiment, speech recognition may be performed on the received voice query, and the phoneme sequence corresponding to the target identifier that identifies the queried target contact extracted from the recognition result, as follows: the voice query is input into a trained person-identifier phoneme extraction model to obtain the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
Specifically, the person-identifier phoneme extraction model may be trained in advance. For training, speech data corresponding to instruction texts that contain person identifiers may be used as sample speech data; the standard pronunciation of the person identifier in each such instruction text is annotated and converted into the corresponding phoneme sequence. During training, the parameters of the model being trained are adjusted so that its predictions of the phoneme sequences corresponding to the person identifiers contained in the sample speech data converge toward the annotations. After the person-identifier phoneme extraction model has been trained on a large amount of sample speech data, a received voice query can be input into the trained model, which extracts the phoneme sequence corresponding to the person identifier in it. This yields, from the recognition result of the voice query, the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
Step 202, matching the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers that identify the preset contacts in the preset contact set, and determining, according to the matching result, the target contact queried by the voice query from the preset contact set.
In this embodiment, the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set may be obtained, where a contact identifier identifies a preset contact. The phoneme sequence corresponding to the target identifier extracted in step 201 is then matched against the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set. The preset contact set may be a local contact set, for example the set of all contacts in the address book stored by the execution body, and the contact identifier of a preset contact may be a form of address for that contact, such as a name or a job title. Various string matching methods may be used to match the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set, for example by computing a similarity or a dissimilarity. Specifically, the distance between the two strings may be computed using, for example, the edit distance (Levenshtein distance), and a matching score determined from the computed distance. The smaller the distance, the more similar the two strings being matched and the higher the matching score.
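As an illustration of the distance computation, the sketch below applies the standard Levenshtein dynamic program to sequences of phonemes rather than characters. The normalization in `match_score` (one minus the distance divided by the longer length) is an assumption made here; the source only states that a smaller distance gives a higher score.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, one phoneme per element."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute x with y, or keep
            ))
        previous = current
    return previous[-1]


def match_score(a, b):
    """Assumed scoring: 1.0 for identical sequences, lower as distance grows."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest
```

Treating each phoneme as one symbol means that confusing "s" with "sh" costs a single substitution, whereas a comparison of the spelled-out strings would depend on how many letters differ.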
A matching score may be computed between the phoneme sequence corresponding to the target identifier and the phoneme combination corresponding to the contact identifier of each local preset contact, and the preset contact with the highest matching score can then be determined to be the target contact identified by the target identifier. In this way, the target contact queried by the user is recognized from the user's voice query.
Optionally, at least one matching result may be provided according to the order of the matching scores; that is, at least one candidate for the target contact is determined, and the matching results are ranked by matching score.
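Returning several ranked candidates could be sketched as follows. To keep the example short it uses `difflib.SequenceMatcher.ratio()` from the Python standard library as a stand-in similarity measure; this is not the edit distance named in the text, and any scoring function could be substituted. The function name `rank_contacts` and the `top_k` parameter are assumptions.

```python
import heapq
from difflib import SequenceMatcher


def rank_contacts(target_phonemes, contact_index, top_k=3):
    """Return up to `top_k` (contact, score) pairs, best match first.

    `contact_index` maps each contact identifier to its phoneme combination.
    """
    scored = (
        (SequenceMatcher(None, target_phonemes, phonemes).ratio(), name)
        for name, phonemes in contact_index.items()
    )
    # nlargest sorts by score first, then by name when scores tie.
    return [(name, score) for score, name in heapq.nlargest(top_k, scored)]
```

A caller might take the first entry as the target contact, or present the whole list for the user to confirm when the top scores are close.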
It should be noted that the contact identifier of each preset contact corresponds to at least one phoneme combination. In some scenarios the contact identifier of a preset contact contains polyphonic characters, for example when the name of a preset contact contains characters such as "都", "乐", or "重". In that case the contact identifier may correspond to as many phoneme combinations as there are combinations of the pronunciations of the polyphonic characters it contains.
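Enumerating one phoneme combination per pronunciation combination can be sketched with a Cartesian product over the readings of each character. The table `CHAR_READINGS`, its phoneme notation, and the function name `phoneme_combinations` are hypothetical.

```python
from itertools import product

# Hypothetical table: every character maps to a list of readings,
# and each reading is a tuple of phonemes. "乐" has two readings.
CHAR_READINGS = {
    "张": [("zh", "ang1")],
    "乐": [("l", "e4"), ("y", "ue4")],
}


def phoneme_combinations(contact_identifier):
    """Return one phoneme combination per combination of character readings."""
    readings_per_char = [CHAR_READINGS[char] for char in contact_identifier]
    return [
        tuple(p for reading in combination for p in reading)
        for combination in product(*readings_per_char)
    ]
```

The number of combinations is the product of the reading counts, so a name with two characters that each have two readings would yield four phoneme combinations to match against.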
The method for recognizing a contact by voice of the above embodiment of the present application performs speech recognition on a received voice query and extracts from the recognition result the phoneme sequence corresponding to a target identifier that identifies the queried target contact; it then matches the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers that identify the preset contacts in a preset contact set, and determines, according to the matching result, the target contact queried by the voice query from the preset contact set. This optimizes the process of offline voice recognition of contacts: the steps of converting the phonemes corresponding to the voice query into Chinese characters and then converting the Chinese characters into the corresponding pinyin are eliminated, which improves contact matching efficiency. In addition, because a phoneme is a smaller phonetic unit than a pinyin syllable, a contact recognition method based on phoneme matching is better able to distinguish between phoneme sequences and phoneme combinations with similar pronunciations, so the method of the above embodiment can also improve recognition accuracy.
With continued reference to Fig. 3, a flow chart of another embodiment of the method for voice recognition of contacts according to the present application is shown. As shown in Fig. 3, the flow 300 of the method of this embodiment includes the following steps:
Step 301, perform speech recognition on the received voice query, and extract from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
In this embodiment, the execution body of the method for voice recognition of contacts may receive a voice query generated from the user's spoken request to look up a contact, and may then recognize the voice query. Specifically, acoustic features of the voice query may be extracted, and an acoustic model may then be used to convert the voice query into a corresponding phoneme sequence. A language model may then be used to decode the phoneme sequence and output, in order, the characters or words in the voice query. While the language model decodes the phoneme sequence, each currently decoded character or word may be checked to see whether it is a character or word in a preset keyword library built from the characters or words contained in the contact identifiers of a frequent-contact identifier library. If so, the phonemes corresponding to the currently decoded character or word may be extracted, and the phoneme sequence corresponding to the target identifier that identifies the target contact is thereby obtained.
In some optional implementations of this embodiment, the voice query may first be decoded based on an acoustic model to obtain the phoneme sequence corresponding to the voice query; the phoneme sequence is then converted into a corresponding text recognition result based on a language model; the text recognition result is then matched against preset instruction templates, and the instruction text segment that matches a preset instruction template is extracted from the text recognition result; finally, the phoneme sequence corresponding to the instruction text segment is removed from the phoneme sequence of the voice query, yielding the phoneme sequence corresponding to the target identifier that identifies the queried target contact. Here, a preset instruction template may be a template of an instruction to perform a preset operation, such as "phone ...", "send a WeChat message to ...", "call ..." and so on.
In other optional implementations of this embodiment, the voice query may be input into a trained person-identifier phoneme extraction model to obtain the phoneme sequence corresponding to the target identifier that identifies the queried target contact. The trained person-identifier phoneme extraction model can be used to extract, from input voice data, the phoneme sequence corresponding to a person identifier.
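The template-based variant of step 301, in which the phonemes of the instruction text segment are removed from the query's phoneme sequence, might look roughly like the Python sketch below. It assumes the recognizer yields a per-character alignment of text to phonemes; the regular-expression templates, the named group `name`, and the function `extract_target_phonemes` are hypothetical and only illustrate the idea.

```python
import re
from typing import List, Tuple

# Hypothetical instruction templates; the named group "name" marks the contact slot.
TEMPLATES = [
    re.compile(r"^打电话给(?P<name>.+)$"),
    re.compile(r"^给(?P<name>.+)发微信$"),
]


def extract_target_phonemes(aligned: List[Tuple[str, List[str]]]) -> List[str]:
    """aligned holds one (character, phonemes) pair per recognized character.

    Returns the phonemes left after removing the instruction text segment,
    or an empty list when no template matches.
    """
    text = "".join(ch for ch, _ in aligned)
    for template in TEMPLATES:
        m = template.match(text)
        if m:
            start, end = m.span("name")
            # One entry per character, so character offsets index `aligned` directly.
            return [p for _, phones in aligned[start:end] for p in phones]
    return []


aligned = [
    ("给", ["g", "ei"]), ("王", ["w", "ang"]), ("乐", ["l", "e"]),
    ("发", ["f", "a"]), ("微", ["w", "ei"]), ("信", ["x", "in"]),
]
target = extract_target_phonemes(aligned)
```

Note that the text is used only to locate the instruction segment; the contact name itself is carried forward as phonemes and is never converted back from characters.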
Step 302, determine the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
In this embodiment, the preset contact set may be obtained, and the contact identifiers of the preset contacts in it may then be converted into corresponding phoneme combinations. The preset contact set may be the set of contacts in the address book of the user who issued the voice query request, and may be obtained from the address book stored locally on the execution body. The contact identifier of a preset contact may be a form of address for that contact, such as a name, a job title, or a social-relationship title.
Specifically, the pinyin or pronunciation corresponding to the contact identifier of a preset contact may be annotated according to a Chinese pinyin dictionary or a pronunciation dictionary of another language; the annotated pinyin or other-language pronunciation is then decomposed into phonemes according to the corresponding language, yielding the phoneme combination corresponding to the contact identifier of the preset contact.
In some optional implementations of this embodiment, the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set may be determined in the manner of step 3021 below:
Step 3021, according to the correspondence between single characters and phonemes in a character library, decompose the contact identifiers of the preset contacts in the preset contact set into phonemes character by character, obtaining the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
Specifically, in step 3021, each character or word in the contact identifier of a preset contact may be converted into the corresponding phonemes according to a pre-built mapping between phonemes and the characters of a basic character library, and the phonemes may then be combined in order to form the phoneme combination corresponding to the contact identifier. In this way, the phoneme combination corresponding to the contact identifier of a preset contact can be determined quickly and directly from the phoneme-to-character mapping.
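Step 3021 can be sketched in Python as a table lookup followed by in-order concatenation. The table `CHAR_TO_PHONEMES`, its phoneme units, and the function names are assumptions for illustration; a real character library would be far larger and would also have to handle polyphonic characters.

```python
from typing import Dict, Iterable, List

# Hypothetical character-to-phoneme table standing in for the basic character library.
CHAR_TO_PHONEMES: Dict[str, List[str]] = {
    "张": ["zh", "ang"],
    "伟": ["w", "ei"],
    "李": ["l", "i"],
    "娜": ["n", "a"],
}


def contact_to_phonemes(contact_id: str) -> List[str]:
    """Decompose a contact identifier character by character, keeping the order."""
    phonemes: List[str] = []
    for ch in contact_id:
        phonemes.extend(CHAR_TO_PHONEMES[ch])  # raises KeyError for unknown characters
    return phonemes


def build_phoneme_index(contacts: Iterable[str]) -> Dict[str, List[str]]:
    """Map every contact identifier in the address book to its phoneme combination."""
    return {contact: contact_to_phonemes(contact) for contact in contacts}


index = build_phoneme_index(["张伟", "李娜"])
```

Because the index is rebuilt from the current address book, it can be regenerated whenever contacts change, which is cheap when each lookup is a dictionary access.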
Step 303, match the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers that identify the preset contacts in the preset contact set, and determine, from the preset contact set according to the matching result, the target contact queried by the voice query.
In this embodiment, the phoneme sequence corresponding to the target identifier extracted in step 301 may be matched against the phoneme combinations corresponding to the contact identifiers of the preset contacts determined in step 302. Various string matching approaches may be used to match the phoneme sequence of the target identifier against the phoneme combinations of the contact identifiers that identify the preset contacts in the preset contact set, for example by computing a similarity or a difference measure. As an example, the edit distance (Levenshtein distance) may be used to compute the distance between two strings, and the matching score is determined from the calculated distance. The smaller the distance, the more similar the two strings being matched and the higher the matching score.
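A minimal Python sketch of edit-distance matching over phoneme sequences follows. The application only states that a smaller distance gives a higher score; the normalization `1 - distance / max_length` used in `match_score`, and all function names, are assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance where each phoneme counts as one symbol."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def match_score(query: Sequence[str], candidate: Sequence[str]) -> float:
    """Assumed scoring rule: 1.0 for identical sequences, lower as distance grows."""
    longest = max(len(query), len(candidate))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(query, candidate) / longest


def best_match(query: Sequence[str], index: Dict[str, List[str]]) -> str:
    """Return the contact whose phoneme combination scores highest against the query."""
    return max(index, key=lambda name: match_score(query, index[name]))


index = {"Zhang Wei": ["zh", "ang", "w", "ei"], "Li Na": ["l", "i", "n", "a"]}
winner = best_match(["zh", "ang", "w", "ei"], index)
```

Operating on lists of phonemes rather than on raw characters keeps multi-letter phonemes such as "zh" from being counted as two separate edits.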
Steps 301 and 303 of the method flow of this embodiment are consistent with steps 201 and 202 of the foregoing embodiment, respectively. For the specific implementation of steps 301 and 303, reference may be made to the description of steps 201 and 202, which is not repeated here.
By adding the step of determining the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set, the method of this embodiment can, in an offline state, quickly build in real time the phoneme combinations of the preset contacts against which the user's queried target contact is matched. Because the phoneme combinations are built for each contact in the local address book at that time, this avoids the effect on matching results of predetermined phoneme combinations not being updated promptly after the address book changes, and can further improve the recognition accuracy of offline voice recognition of contacts.
With further reference to Fig. 4, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a device for voice recognition of contacts. This device embodiment corresponds to the method embodiments shown in Fig. 2 and Fig. 3, and the device may be applied in various electronic devices.
As shown in Fig. 4, the device 400 for voice recognition of contacts of this embodiment includes a recognition unit 401 and a matching unit 402. The recognition unit 401 may be configured to perform speech recognition on the received voice query and extract from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact. The matching unit 402 may be configured to match the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers that identify the preset contacts in the preset contact set, and to determine, from the preset contact set according to the matching result, the target contact queried by the voice query.
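The division of device 400 into a recognition unit and a matching unit can be pictured with the following Python sketch, in which each unit is a pluggable callable. The class `ContactRecognizer`, the stub recognition function, and the exact-match rule are placeholders invented for illustration; they stand in for a real acoustic front end and a real scoring rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ContactRecognizer:
    """Composes a recognition unit and a matching unit, as in device 400."""
    recognize: Callable[[bytes], List[str]]                            # recognition unit
    match: Callable[[List[str], Dict[str, List[str]]], Optional[str]]  # matching unit
    index: Dict[str, List[str]]                                        # contact -> phonemes

    def query(self, audio: bytes) -> Optional[str]:
        target_phonemes = self.recognize(audio)
        return self.match(target_phonemes, self.index)


def stub_recognition_unit(audio: bytes) -> List[str]:
    # Placeholder: a real unit would run acoustic decoding on the audio.
    return ["l", "i", "n", "a"]


def exact_matching_unit(query: List[str], index: Dict[str, List[str]]) -> Optional[str]:
    # Placeholder: returns the contact whose phonemes equal the query, if any.
    for name, phonemes in index.items():
        if phonemes == query:
            return name
    return None


device = ContactRecognizer(
    recognize=stub_recognition_unit,
    match=exact_matching_unit,
    index={"Zhang Wei": ["zh", "ang", "w", "ei"], "Li Na": ["l", "i", "n", "a"]},
)
result = device.query(b"")
```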
In some embodiments, the device 400 may further include a determination unit configured to determine the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
In some embodiments, the determination unit may be further configured to determine the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set as follows: according to the correspondence between single characters and phonemes in a character library, decompose the contact identifiers of the preset contacts in the preset contact set into phonemes character by character, obtaining the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
In some embodiments, the recognition unit 401 may be further configured to perform speech recognition on the received voice query and extract from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact as follows: decode the voice query based on an acoustic model to obtain the phoneme sequence corresponding to the voice query; convert the phoneme sequence corresponding to the voice query into a corresponding text recognition result based on a language model; match the text recognition result against preset instruction templates, and extract from the text recognition result the instruction text segment that matches a preset instruction template; remove the phoneme sequence corresponding to the instruction text segment from the phoneme sequence corresponding to the voice query, obtaining the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
In some embodiments, the recognition unit 401 may be further configured to perform speech recognition on the received voice query and extract from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact as follows: input the voice query into a trained person-identifier phoneme extraction model, obtaining the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
It should be understood that the units recorded in device 400 correspond to the respective steps of the methods described with reference to Fig. 2 and Fig. 3. The operations and features described above for the methods therefore also apply to device 400 and the units it contains, and are not repeated here.
The device 400 for voice recognition of contacts of the above embodiment of the present application performs contact matching using the phoneme sequence, obtained by recognizing the voice query, that corresponds to the target identifier identifying the queried target contact. This optimizes the contact recognition process: it eliminates the steps of converting the phonemes of the voice query into Chinese characters and then converting those characters into pinyin, and can therefore improve contact matching efficiency. At the same time, because a phoneme is a smaller phonetic unit than pinyin, a phoneme-based contact recognition device is better at distinguishing phoneme sequences and phoneme combinations with similar pronunciations, so the device can also improve recognition accuracy.
Referring now to Fig. 5, a structural schematic diagram of a computer system 500 of an electronic device suitable for implementing the embodiments of the present application is shown. The electronic device shown in Fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in read-only memory (ROM) 502 or a program loaded from storage section 508 into random access memory (RAM) 503. RAM 503 also stores the various programs and data required for the operation of system 500. CPU 501, ROM 502 and RAM 503 are connected to one another through bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to I/O interface 505: an input section 506 including a keyboard, a mouse and the like; an output section 507 including a cathode ray tube (CRT), a liquid crystal display (LCD) and the like, as well as a loudspeaker and the like; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A driver 510 is also connected to I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disc, a magneto-optical disk or a semiconductor memory, is mounted on driver 510 as needed, so that a computer program read from it can be installed into storage section 508 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flow chart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through communication section 509, and/or installed from removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above functions defined in the method of the present application are performed.
It should be noted that the computer-readable medium of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in combination with an instruction execution system, apparatus or device.
In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that medium can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, electric wire, optical cable, RF and the like, or any suitable combination of the above.
Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flow charts and block diagrams in the drawings illustrate the possible architecture, functions and operations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each box in a flow chart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logic function. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flow charts, and combinations of boxes in the block diagrams and/or flow charts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, it may be described as: a processor including a recognition unit and a matching unit. The names of these units do not, under certain conditions, constitute a limitation on the units themselves. For example, the recognition unit may also be described as "a unit that performs speech recognition on a received voice query and extracts from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact".
In another aspect, the present application also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: perform speech recognition on a received voice query, and extract from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact; match the phoneme sequence corresponding to the target identifier against the phoneme combinations corresponding to the contact identifiers that identify the preset contacts in the preset contact set, and determine, from the preset contact set according to the matching result, the target contact queried by the voice query.
The above description is only of the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features. It should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.
Claims (12)
1. A method for voice recognition of contacts, comprising:
performing speech recognition on a received voice query, and extracting from the recognition result a phoneme sequence corresponding to a target identifier that identifies the queried target contact;
matching the phoneme sequence corresponding to the target identifier against phoneme combinations corresponding to contact identifiers that identify preset contacts in a preset contact set, and determining, from the preset contact set according to the matching result, the target contact queried by the voice query.
2. The method according to claim 1, wherein the method further comprises:
determining the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
3. The method according to claim 2, wherein the determining the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set comprises:
according to the correspondence between single characters and phonemes in a character library, decomposing the contact identifiers of the preset contacts in the preset contact set into phonemes character by character, obtaining the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
4. The method according to any one of claims 1 to 3, wherein the performing speech recognition on the received voice query, and extracting from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact, comprises:
decoding the voice query based on an acoustic model to obtain the phoneme sequence corresponding to the voice query;
converting the phoneme sequence corresponding to the voice query into a corresponding text recognition result based on a language model;
matching the text recognition result against a preset instruction template, and extracting from the text recognition result the instruction text segment that matches the preset instruction template;
removing the phoneme sequence corresponding to the instruction text segment from the phoneme sequence corresponding to the voice query, obtaining the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
5. The method according to any one of claims 1 to 3, wherein the performing speech recognition on the received voice query, and extracting from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact, comprises:
inputting the voice query into a trained person-identifier phoneme extraction model, obtaining the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
6. A device for voice recognition of contacts, comprising:
a recognition unit, configured to perform speech recognition on a received voice query and extract from the recognition result a phoneme sequence corresponding to a target identifier that identifies the queried target contact;
a matching unit, configured to match the phoneme sequence corresponding to the target identifier against phoneme combinations corresponding to contact identifiers that identify preset contacts in a preset contact set, and to determine, from the preset contact set according to the matching result, the target contact queried by the voice query.
7. The device according to claim 6, wherein the device further comprises:
a determination unit, configured to determine the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
8. The device according to claim 7, wherein the determination unit is further configured to determine the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set as follows:
according to the correspondence between single characters and phonemes in a character library, decomposing the contact identifiers of the preset contacts in the preset contact set into phonemes character by character, obtaining the phoneme combinations corresponding to the contact identifiers of the preset contacts in the preset contact set.
9. The device according to any one of claims 6 to 8, wherein the recognition unit is further configured to perform speech recognition on the received voice query and extract from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact as follows:
decoding the voice query based on an acoustic model to obtain the phoneme sequence corresponding to the voice query;
converting the phoneme sequence corresponding to the voice query into a corresponding text recognition result based on a language model;
matching the text recognition result against a preset instruction template, and extracting from the text recognition result the instruction text segment that matches the preset instruction template;
removing the phoneme sequence corresponding to the instruction text segment from the phoneme sequence corresponding to the voice query, obtaining the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
10. The device according to any one of claims 6 to 8, wherein the recognition unit is further configured to perform speech recognition on the received voice query and extract from the recognition result the phoneme sequence corresponding to the target identifier that identifies the queried target contact as follows:
inputting the voice query into a trained person-identifier phoneme extraction model, obtaining the phoneme sequence corresponding to the target identifier that identifies the queried target contact.
11. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 5.
12. A computer-readable medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 5.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811148211.4A CN108986790A (en) | 2018-09-29 | 2018-09-29 | The method and apparatus of voice recognition of contact |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811148211.4A CN108986790A (en) | 2018-09-29 | 2018-09-29 | The method and apparatus of voice recognition of contact |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN108986790A true CN108986790A (en) | 2018-12-11 |
Family
ID=64543126
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201811148211.4A Pending CN108986790A (en) | 2018-09-29 | 2018-09-29 | The method and apparatus of voice recognition of contact |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN108986790A (en) |
Cited By (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109996026A (en) * | 2019-04-23 | 2019-07-09 | 广东小天才科技有限公司 | Video special effect interaction method, device, equipment and medium based on wearable equipment |
| CN110310631A (en) * | 2019-06-28 | 2019-10-08 | 北京百度网讯科技有限公司 | Speech recognition method, device, server and storage medium |
| CN111147444A (en) * | 2019-11-20 | 2020-05-12 | 维沃移动通信有限公司 | Interaction method and electronic equipment |
| CN111312226A (en) * | 2020-02-17 | 2020-06-19 | 出门问问信息科技有限公司 | Voice recognition method, voice recognition equipment and computer readable storage medium |
| CN112309398A (en) * | 2020-09-30 | 2021-02-02 | 音数汇元(上海)智能科技有限公司 | Working time monitoring method and device, electronic equipment and storage medium |
| CN112331207A (en) * | 2020-09-30 | 2021-02-05 | 音数汇元(上海)智能科技有限公司 | Service content monitoring method and device, electronic equipment and storage medium |
| CN112447176A (en) * | 2019-08-29 | 2021-03-05 | 株式会社东芝 | Information processing apparatus, keyword detection apparatus, and information processing method |
| CN112735394A (en) * | 2020-12-16 | 2021-04-30 | 青岛海尔科技有限公司 | Semantic parsing method and device for voice |
| CN113808593A (en) * | 2020-06-16 | 2021-12-17 | 阿里巴巴集团控股有限公司 | Voice interaction system, related method, device and equipment |
| CN113889083A (en) * | 2021-11-03 | 2022-01-04 | 广州博冠信息科技有限公司 | Voice recognition method and device, storage medium and electronic equipment |
| CN114124875A (en) * | 2021-11-04 | 2022-03-01 | 维沃移动通信有限公司 | Voice message processing method and device, electronic equipment and medium |
| CN115101064A (en) * | 2022-07-20 | 2022-09-23 | 安克创新科技股份有限公司 | Instruction word recognition method and device, electronic equipment and storage medium |
| EP4150489A1 (en) * | 2020-05-15 | 2023-03-22 | Sanofi | Information system and electronic device |
| WO2024188235A1 (en) * | 2023-03-13 | 2024-09-19 | 北京罗克维尔斯科技有限公司 | Speech recognition method and apparatus, electronic device, storage medium, and vehicle |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1167950A (en) * | 1996-03-19 | 1997-12-17 | 西门子公司 | Speech recognition computer module and digit and speech signal transformation method based on phoneme |
| CN107016994A (en) * | 2016-01-27 | 2017-08-04 | 阿里巴巴集团控股有限公司 | The method and device of speech recognition |
| US9747891B1 (en) * | 2016-05-18 | 2017-08-29 | International Business Machines Corporation | Name pronunciation recommendation |
| CN108122555A (en) * | 2017-12-18 | 2018-06-05 | 北京百度网讯科技有限公司 | The means of communication, speech recognition apparatus and terminal device |
Cited By (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109996026B (en) * | 2019-04-23 | 2021-01-19 | 广东小天才科技有限公司 | Video special effect interaction method, device, equipment and medium based on wearable equipment |
| CN109996026A (en) * | 2019-04-23 | 2019-07-09 | 广东小天才科技有限公司 | Video special effect interaction method, device, equipment and medium based on wearable equipment |
| CN110310631A (en) * | 2019-06-28 | 2019-10-08 | 北京百度网讯科技有限公司 | Speech recognition method, device, server and storage medium |
| CN112447176A (en) * | 2019-08-29 | 2021-03-05 | 株式会社东芝 | Information processing apparatus, keyword detection apparatus, and information processing method |
| CN111147444A (en) * | 2019-11-20 | 2020-05-12 | 维沃移动通信有限公司 | Interaction method and electronic equipment |
| CN111147444B (en) * | 2019-11-20 | 2021-08-06 | 维沃移动通信有限公司 | An interactive method and electronic device |
| CN111312226A (en) * | 2020-02-17 | 2020-06-19 | 出门问问信息科技有限公司 | Voice recognition method, voice recognition equipment and computer readable storage medium |
| EP4150489A1 (en) * | 2020-05-15 | 2023-03-22 | Sanofi | Information system and electronic device |
| US12314293B2 (en) | 2020-05-15 | 2025-05-27 | Sanofi | Information system and electronic device |
| CN113808593A (en) * | 2020-06-16 | 2021-12-17 | 阿里巴巴集团控股有限公司 | Voice interaction system, related method, device and equipment |
| CN112309398A (en) * | 2020-09-30 | 2021-02-02 | 音数汇元(上海)智能科技有限公司 | Working time monitoring method and device, electronic equipment and storage medium |
| CN112331207A (en) * | 2020-09-30 | 2021-02-05 | 音数汇元(上海)智能科技有限公司 | Service content monitoring method and device, electronic equipment and storage medium |
| CN112735394A (en) * | 2020-12-16 | 2021-04-30 | 青岛海尔科技有限公司 | Semantic parsing method and device for voice |
| CN113889083A (en) * | 2021-11-03 | 2022-01-04 | 广州博冠信息科技有限公司 | Voice recognition method and device, storage medium and electronic equipment |
| CN114124875A (en) * | 2021-11-04 | 2022-03-01 | 维沃移动通信有限公司 | Voice message processing method and device, electronic equipment and medium |
| CN114124875B (en) * | 2021-11-04 | 2023-12-19 | 维沃移动通信有限公司 | Voice message processing method, device, electronic equipment and medium |
| CN115101064A (en) * | 2022-07-20 | 2022-09-23 | 安克创新科技股份有限公司 | Instruction word recognition method and device, electronic equipment and storage medium |
| CN115101064B (en) * | 2022-07-20 | 2025-01-24 | 安克创新科技股份有限公司 | Instruction word recognition method, device, electronic device and storage medium |
| WO2024188235A1 (en) * | 2023-03-13 | 2024-09-19 | 北京罗克维尔斯科技有限公司 | Speech recognition method and apparatus, electronic device, storage medium, and vehicle |
Similar Documents
| Publication | Title |
|---|---|
| CN108986790A (en) | The method and apparatus of voice recognition of contact |
| CN111933129B (en) | Audio processing method, language model training method and device and computer equipment |
| CN114186563B (en) | Electronic device and semantic parsing method, medium and human-computer dialogue system thereof |
| CN113205817B (en) | Speech semantic recognition method, system, device and medium |
| CN107657017B (en) | Method and apparatus for providing voice service |
| CN107492379B (en) | Voiceprint creating and registering method and device |
| CN109036384B (en) | Audio recognition method and device |
| CN107945786B (en) | Speech synthesis method and apparatus |
| US20240021202A1 (en) | Method and apparatus for recognizing voice, electronic device and medium |
| CN107610709B (en) | Method and system for training voiceprint recognition model |
| US11217236B2 (en) | Method and apparatus for extracting information |
| CN114357973A (en) | Intention recognition method and device, electronic equipment and storage medium |
| KR102046486B1 (en) | Information inputting method |
| CN107481720A (en) | A kind of explicit method for recognizing sound-groove and device |
| US11036996B2 (en) | Method and apparatus for determining (raw) video materials for news |
| CN112906380A (en) | Method and device for identifying role in text, readable medium and electronic equipment |
| CN112036186B (en) | Corpus annotation method, device, computer storage medium and electronic device |
| CN117093687B (en) | Question answering method and device, electronic device, and storage medium |
| CN109462482A (en) | Method for recognizing sound-groove, device, electronic equipment and computer readable storage medium |
| CN112906381A (en) | Recognition method and device of conversation affiliation, readable medium and electronic equipment |
| CN114528851A (en) | Reply statement determination method and device, electronic equipment and storage medium |
| CN113836945B (en) | Intention recognition method, device, electronic equipment and storage medium |
| CN115376496A (en) | Speech recognition method, device, computer equipment and storage medium |
| CN110647613A (en) | Courseware construction method, courseware construction device, courseware construction server and storage medium |
| CN118471191A (en) | Audio generation method, model training method, device, equipment and storage medium |
Legal Events
| Code | Title | Description |
|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20181211 |