
CN116052660A - Method, device and storage medium for executing voice commands of intelligent equipment - Google Patents

Method, device and storage medium for executing voice commands of intelligent equipment

Info

Publication number
CN116052660A
Authority
CN
China
Prior art keywords
word
database
acoustic wave
voice
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211651927.2A
Other languages
Chinese (zh)
Inventor
张文杰
李绍斌
唐杰
姜旭东
黄鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Original Assignee
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gree Electric Appliances Inc of Zhuhai and Zhuhai Lianyun Technology Co Ltd
Priority to CN202211651927.2A
Publication of CN116052660A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/2803 Home automation networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Automation & Control Theory (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method, a device and a storage medium for executing a voice command of a smart device. The method comprises the following steps: acquiring a voice command with which a user controls the smart device; segmenting the sentence corresponding to the voice command into words and comparing the segmentation result with the words in a preset database; comparing the sound wave diagram of a first word, which is in the segmentation result but not in the database, with the sound wave diagram of a second word in the database; and, when the similarity of the sound wave diagrams exceeds a first preset threshold, executing the voice command in combination with the semantics of the second word. The method and the device solve the problem in the related art that differences in pronunciation cause deviations in voice recognition, which in turn makes voice control of a smart home work poorly.

Description

Method, device and storage medium for executing voice commands of intelligent equipment

Technical field

The present application relates to the field of voice processing, and in particular to a method, a device and a storage medium for executing voice commands of smart devices.

Background

With the development of technology, smart homes have become part of daily life for most people, and voice operation of smart products has become the mainstream trend. Voice control also brings its own problems. For example, each person pronounces words differently, which introduces deviations when the system recognizes speech, and some homophones are recognized incorrectly. As a result, speech recognition efficiency is low, and controlling a smart home through voice commands works poorly.

Summary of the invention

The present application provides a method, a device and a storage medium for executing voice commands of smart devices, to solve the problem in the related art that differences in pronunciation cause deviations in voice recognition, which in turn makes voice control of a smart home work poorly.

In a first aspect, the present application provides a method for executing a voice command of a smart device, including: acquiring a voice command with which a user controls the smart device; segmenting the sentence corresponding to the voice command into words, and comparing the segmentation result with the words in a preset database; comparing the sound wave diagram of a first word, which is in the segmentation result but not in the database, with the sound wave diagram of a second word in the database; and, when the similarity of the sound wave diagrams exceeds a first preset threshold, executing the voice command in combination with the semantics of the second word.

In a second aspect, the present application provides a device for executing a voice command of a smart device, including: an acquisition module, configured to acquire a voice command with which a user controls the smart device; a processing module, configured to segment the sentence corresponding to the voice command into words and compare the segmentation result with the words in a preset database; a comparison module, configured to compare the sound wave diagram of a first word, which is in the segmentation result but not in the database, with the sound wave diagram of a second word in the database; and an execution module, configured to execute the voice command in combination with the semantics of the second word when the similarity of the sound wave diagrams exceeds a first preset threshold.

In a third aspect, an electronic device is provided, including a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

the memory is configured to store a computer program;

the processor is configured to implement, when executing the program stored in the memory, the steps of the method for executing a voice command of a smart device according to any embodiment of the first aspect.

In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the method for executing a voice command of a smart device according to any embodiment of the first aspect.

Compared with the prior art, the technical solutions provided by the embodiments of the present application have the following advantages:

In the embodiments of the present application, after the voice command with which the user controls the smart device is acquired, the command is segmented into words. If the segmentation result contains a word that is not in the preset database, that word is compared with the words in the database to find a word with a similar waveform diagram. The word that is not in the database is then corrected, the correct semantics of the voice command are obtained, and the corrected command is executed based on those semantics. In other words, even if an accent causes the user's speech to be recognized inaccurately, the result can be corrected into the right voice command, so that subsequent execution is more accurate and misoperation or unrecognizable commands are avoided. This solves the problem in the related art that differences in pronunciation cause deviations in voice recognition, which in turn makes voice control of a smart home work poorly.

Description of drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.

To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. A person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.

FIG. 1 is the first schematic flowchart of a method for executing a voice command of a smart device provided by an embodiment of the present application;

FIG. 2 is the second schematic flowchart of a method for executing a voice command of a smart device provided by an embodiment of the present application;

FIG. 3 is a schematic flowchart of a semantic correction method based on sound wave recognition provided by an embodiment of the present application;

FIG. 4 is the first schematic structural diagram of a device for executing a voice command of a smart device provided by an embodiment of the present application;

FIG. 5 is the second schematic structural diagram of a device for executing a voice command of a smart device provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.

Detailed description

To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in the present application without creative effort fall within the protection scope of the present application.

FIG. 1 is a schematic flowchart of a method for executing a voice command of a smart device provided by an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:

Step 101: acquire the voice command with which the user controls the smart device;

It should be noted that the smart device in the embodiments of the present application may be a smart home appliance, such as an air conditioner, a television, a washing machine or a refrigerator; a smart office device, such as a printer or a smart seat; or a personal smart device, such as a mobile phone or a smart watch.

Step 102: segment the sentence corresponding to the voice command into words, and compare the segmentation result with the words in the preset database;

In the embodiments of the present application, a voice command is voice information used to control a smart device, for example "查询净水器滤芯寿命" ("query the water purifier filter element life"), "查询空调保养时间" ("query the air conditioner maintenance time") or "开启打印机" ("turn on the printer"). Taking "查询净水器滤芯寿命" as an example, the segmentation result is "查询" ("query"), "净水器" ("water purifier"), "滤芯" ("filter element"), "寿命" ("life"). Because of the user's accent, the same command may instead be recognized as "查询净水器旅行寿命" ("query the water purifier travel life"), in which case the segmentation result is "查询", "净水器", "旅行" ("travel"), "寿命". Word segmentation thus splits the sentence according to the conventional word combinations of Chinese into the smallest units whose semantics can be recognized.
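Step 102 can be sketched as follows. The patent does not name a segmentation algorithm, so this sketch assumes forward maximum matching; the `LEXICON` and `COMMAND_DB` word sets and the function names are hypothetical and exist only for illustration.

```python
# Hypothetical general lexicon used only to segment the sentence (assumption).
LEXICON = {"查询", "净水器", "滤芯", "旅行", "寿命", "空调", "保养", "时间", "开启", "打印机"}
# Hypothetical preset database of words that occur in known voice commands (assumption).
COMMAND_DB = {"查询", "净水器", "滤芯", "寿命", "空调", "保养", "时间", "开启", "打印机"}


def segment(sentence, lexicon=LEXICON, max_len=4):
    """Forward maximum matching: at each position take the longest lexicon word."""
    tokens, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # A single character is emitted as a fallback when nothing longer matches.
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def split_by_database(tokens, database=COMMAND_DB):
    """Separate words found in the preset database from words that are not."""
    known = [t for t in tokens if t in database]
    unknown = [t for t in tokens if t not in database]
    return known, unknown


tokens = segment("查询净水器旅行寿命")
known, unknown = split_by_database(tokens)
print(tokens)   # segmentation result
print(unknown)  # candidate "first words" for the sound wave comparison
```

With the misrecognized command from the example, "旅行" is the only word left outside the database, so it becomes the first word passed to step 103.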

Step 103: compare the sound wave diagram of a first word, which is in the segmentation result but not in the database, with the sound wave diagram of a second word in the database;

It should be noted that the second word may be any word in the database, or a word that is particularly error-prone. Continuing the example, suppose the user intends the voice command "查询净水器滤芯寿命" but, because of an accent, the command is recognized as "查询净水器旅行寿命". After segmentation, "旅行" ("travel") is clearly not what the command was meant to express, so it must be corrected to recover the correct semantics of the command. Voice control commands for smart devices most commonly take the form "open ...", "close ..." or "query ...", so a database holding commonly used voice commands can be set up in advance. When a voice command is received from the user, its words are compared with the words in the database. For example, if "滤芯" ("filter element") is the only word in the database whose waveform diagram is close to that of "旅行", and "滤芯" also appears in a preset voice command, "旅行" can be corrected to "滤芯". In other words, the database in the embodiments can pre-store voice commands together with their segmentation results, so that the command issued by the current user can be identified quickly.

Step 104: when the similarity of the sound wave diagrams exceeds the first preset threshold, execute the voice command in combination with the semantics of the second word.

It should be noted that the first preset threshold can be set according to actual needs. For example, the comparison results for both the amplitude and the frequency of the two waveform diagrams may be required to be greater than 50%, or 60%, 80%, and so on.

It can be seen that, in the embodiments of the present application, after the voice command with which the user controls the smart device is acquired, the command is segmented into words. If the segmentation result contains a word that is not in the preset database, that word is compared with the words in the database to find a word with a similar waveform diagram. The word that is not in the database is then corrected, the correct semantics of the voice command are obtained, and the corrected command is executed based on those semantics. In other words, even if an accent causes the user's speech to be recognized inaccurately, the result can be corrected into the right voice command, so that subsequent execution is more accurate and misoperation or unrecognizable commands are avoided. This solves the problem in the related art that differences in pronunciation cause deviations in voice recognition, which in turn makes voice control of a smart home work poorly.

In an optional implementation of the embodiments of the present application, comparing the segmentation result with the words in the preset database in step 102 may further include:

Step 11: count the words in the segmentation result that exist in the database;

Step 12: divide the number of words in the segmentation result that exist in the database by the total number of words in the segmentation result, to obtain the effective rate of the voice command;

As a specific example, take "查询净水器旅行寿命", whose segmentation result is "查询", "净水器", "旅行", "寿命". Here "查询", "净水器" and "寿命" have corresponding words in the database, while "旅行" does not. The segmentation result therefore contains 4 words, of which 3 exist in the database, so the effective rate of "查询净水器旅行寿命" is 3/4 = 0.75, that is, 75%.

Step 13: if the effective rate exceeds a second preset threshold, determine that the voice command is a valid command;

In the embodiments of the present application, the second preset threshold can be set according to actual needs, for example to 60%, 65% or 70%.

Step 14: if the effective rate does not exceed the second preset threshold, determine that the voice command is an invalid command. The sound wave diagram comparison is performed only on first words, absent from the database, that belong to valid commands.

The second preset threshold is set to ensure that only a few words in the voice command were not recognized normally, so that the correct semantics of the command can be recovered by the subsequent correction. If the recognition error rate of the current voice command is so high that the effective rate does not exceed the threshold, the command cannot be corrected; it is an invalid command, and the user needs to issue the voice command again.
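Steps 11 to 14 amount to a simple ratio test, sketched below. The function names and the default threshold of 0.6 are illustrative assumptions; the patent only gives 60%, 65% and 70% as example settings.

```python
def effective_rate(tokens, database):
    """Steps 11-12: share of segmented words that exist in the preset database."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in database) / len(tokens)


def is_valid_command(tokens, database, second_threshold=0.6):
    """Steps 13-14: valid only if the effective rate exceeds the second threshold."""
    return effective_rate(tokens, database) > second_threshold


database = {"查询", "净水器", "滤芯", "寿命"}
tokens = ["查询", "净水器", "旅行", "寿命"]
print(effective_rate(tokens, database))    # 3 of 4 words are in the database
print(is_valid_command(tokens, database))  # compared against the assumed 0.6
```

Raising `second_threshold` makes the system reject more commands outright, which saves the cost of waveform comparison on input that is unlikely to be recoverable.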

In an optional implementation of the embodiments of the present application, comparing the sound wave diagram of the first word, which is in the segmentation result but not in the database, with the sound wave diagram of the second word in the database in step 103 may further include:

Step 21: obtain the waveform parameters of the sound wave diagram of the first word and the waveform parameters of the sound wave diagram of the second word, where the waveform parameters include at least one of the following: amplitude, frequency;

Step 22: compare the waveform parameters of the sound wave diagram of the first word with the waveform parameters of the sound wave diagram of the second word;

Step 23: obtain the ratio between the waveform parameters of the sound wave diagram of the first word and the waveform parameters of the sound wave diagram of the second word, where the ratio is used to characterize the similarity.

In the embodiments of the present application, the waveform diagram is obtained from the pronunciation and intonation of a word: each word has a corresponding waveform diagram, and each waveform diagram has a corresponding amplitude and frequency. The present application mainly addresses speech recognition errors caused by accent or pronunciation, and such errors usually involve two words that are pronounced similarly and sound alike, such as "旅行" and "滤芯". The similarity between two words can therefore be determined by comparing the amplitude and frequency of their waveform diagrams; a high similarity indicates that the second word is the correct word.
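Steps 21 to 23 can be sketched as below. The patent does not say how amplitude and frequency are extracted from a sound wave diagram or how the ratio is oriented, so this sketch makes three assumptions: amplitude is the peak absolute sample, frequency is estimated from zero crossings, and the ratio is the smaller value divided by the larger. All names, and the synthetic `tone` helper used in place of real recordings, are hypothetical.

```python
import math


def waveform_parameters(samples, sample_rate):
    """Step 21: return (peak amplitude, estimated frequency in Hz) of a mono signal."""
    amplitude = max(abs(s) for s in samples)
    # A periodic signal changes sign twice per cycle, so count sign changes.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    duration = len(samples) / sample_rate
    return amplitude, crossings / (2 * duration)


def ratio(a, b):
    """Step 23: smaller over larger; 1.0 means identical, near 0 means very different."""
    larger = max(a, b)
    return 1.0 if larger == 0 else min(a, b) / larger


def is_similar(first, second, sample_rate, first_threshold=0.8):
    """Steps 22-23: both the amplitude and the frequency ratio must exceed the threshold."""
    amp1, freq1 = waveform_parameters(first, sample_rate)
    amp2, freq2 = waveform_parameters(second, sample_rate)
    return ratio(amp1, amp2) > first_threshold and ratio(freq1, freq2) > first_threshold


def tone(freq, amp, sample_rate=8000, seconds=0.1):
    """Synthetic sine wave standing in for a recorded word (illustration only)."""
    count = int(sample_rate * seconds)
    return [amp * math.sin(2 * math.pi * freq * t / sample_rate) for t in range(count)]


print(is_similar(tone(200, 1.0), tone(210, 0.9), 8000))  # close in both parameters
print(is_similar(tone(200, 1.0), tone(400, 1.0), 8000))  # frequency differs by 2x
```

Real speech is not a single tone, so a production system would likely compare richer features; the sketch only shows how a per-parameter ratio can be tested against the first preset threshold.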

It should be noted that if several words in the current database have a similarity that satisfies the first preset threshold, semantic recognition can be performed on those words, and the word whose semantics best fit the current voice command is chosen as the replacement for the first word.

In an optional implementation of the embodiments of the present application, executing the voice command in combination with the semantics of the second word in step 104 may further include:

Step 31: correct the first word in the segmentation result to the second word;

Step 32: based on the semantics of the voice command corresponding to the corrected segmentation result, execute that voice command.

For steps 31 and 32, again take "查询净水器旅行寿命", whose segmentation result is "查询", "净水器", "旅行", "寿命". Here "查询", "净水器" and "寿命" have corresponding words in the database, while "旅行" does not. The waveform diagram of "旅行" is compared with the waveform diagrams of the words in the database. If the word finally identified in the database by waveform similarity is "滤芯", the voice command is corrected based on "滤芯", and the correct command "查询净水器滤芯寿命" is executed. Several words may also satisfy the waveform similarity, for example "滤芯" ("filter element"), "铝芯" ("aluminum core") and "绿心" ("green heart"). In that case it can be further determined whether the semantics of each candidate match the current voice command. In this example "滤芯" matches the current command, while "铝芯" and "绿心" do not, so the command is corrected with "滤芯" and the correct command "查询净水器滤芯寿命" is executed.
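When several candidates pass the waveform test, one simple way to check whether a candidate "matches the current voice command" is to see whether the corrected sentence is one of the pre-stored commands. The patent leaves the semantic check unspecified, so this matching rule, the `PRESET_COMMANDS` set and the function name are assumptions made for illustration.

```python
# Hypothetical pre-stored voice commands (assumption).
PRESET_COMMANDS = {"查询净水器滤芯寿命", "查询空调保养时间", "开启打印机"}


def pick_replacement(tokens, first_word, candidates, preset_commands=PRESET_COMMANDS):
    """Steps 31-32: try each candidate and keep the one that yields a known command."""
    for candidate in candidates:
        corrected = [candidate if t == first_word else t for t in tokens]
        if "".join(corrected) in preset_commands:
            return candidate, corrected
    # No candidate produces a known command: leave the tokens unchanged.
    return None, tokens


tokens = ["查询", "净水器", "旅行", "寿命"]
word, corrected = pick_replacement(tokens, "旅行", ["铝芯", "绿心", "滤芯"])
print(word, "".join(corrected))
```

Only "滤芯" turns the sentence into a stored command, so "铝芯" and "绿心" are discarded even though their waveforms are similar.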

In the embodiments of the present application, after the voice command is executed in combination with the semantics of the second word, as shown in FIG. 2, the method may further include:

Step 105: back up the first word to the database.

It can be seen that the first word can be added to the database. As a specific example, for "查询净水器旅行寿命", "旅行" can be added to the database. The next time "查询净水器旅行寿命" is received, the command can be executed directly without adjustment, because its correct semantics, namely "查询净水器滤芯寿命", are already known. In other words, even if the pronunciation is not standard, the correct semantics can be recognized directly, which improves the experience of controlling smart devices by voice command.
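Step 105 can be sketched as a small lookup table. The patent only says the first word is added to the database; storing it together with the second word it was corrected to is an assumption made here so that the correct semantics can be recovered on the next request. The class and method names are hypothetical.

```python
class CorrectionStore:
    """Backup of misrecognized words, keyed to the word each one was corrected to."""

    def __init__(self):
        self._aliases = {}

    def backup(self, first_word, second_word):
        """Step 105: remember that first_word should be read as second_word."""
        self._aliases[first_word] = second_word

    def resolve(self, word):
        """Return the stored correction, or the word itself if none is stored."""
        return self._aliases.get(word, word)


store = CorrectionStore()
store.backup("旅行", "滤芯")
tokens = [store.resolve(t) for t in ["查询", "净水器", "旅行", "寿命"]]
print("".join(tokens))
```

After the first correction, the same misrecognition resolves with a dictionary lookup and no waveform comparison.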

下面结合本申请实施例的具体实施方式对本申请进行解释说明,该具体实施方式提供了一种基于声波识别的语义修正方法,该方法通过将未识别的部分词语,生成对应的声波图,将声波图与已有数据库中的数据进行比对,若达到设定的值,则将词语修正,同时将原词语添加至备用预料库中,若下次出现相同的词,即可识别成功。如图3所示,该基于声波识别的语义修正方法的步骤包括:The application will be explained below in conjunction with the specific implementation of the embodiment of the application. This specific implementation provides a semantic correction method based on sound wave recognition. The graph is compared with the data in the existing database. If the set value is reached, the word will be corrected, and the original word will be added to the backup prediction database. If the same word appears next time, the recognition will be successful. As shown in Figure 3, the steps of the semantic correction method based on sound wave recognition include:

Step 301: collect the voice command spoken by the user.

For example, the user intends to say "query water purifier filter element life" (查询净水器滤芯寿命), and the speech is converted to text.

Step 302: the voice command is extracted as "query water purifier travel life" (查询净水器旅行寿命).

In step 302, the user's pronunciation or other factors may make the extraction inaccurate, so that "query water purifier filter element life" is recognized as "query water purifier travel life". In Mandarin, "filter element" (滤芯, lǜxīn) and "travel" (旅行, lǚxíng) sound alike.

Step 303: word segmentation splits the sentence into "query" (查询), "water purifier" (净水器), "travel" (旅行), and "life" (寿命). The system can recognize the semantics of "query", "water purifier", and "life" but not of "travel", so the current validity rate is 3/4 = 0.75.

In a specific example, a threshold can be set on this rate to limit resource consumption: a command whose rate exceeds the threshold is most likely valid, while a rate below the threshold indicates that the current voice command is invalid.

Step 304: generate an acoustic wave graph for "travel" and compare it with the acoustic wave graphs in the existing database.

In a specific example, a threshold can be set to decide whether the similarity is sufficient. It can be chosen according to how much error the specific scenario tolerates; the larger the value, the higher the accuracy of the system.

Step 305: if the set threshold is reached, the original word is treated as the matched word, and the system can then identify the intent. The original word can also be added to a backup database.

Adding the original word to the backup database means that when the same sentence occurs again, its intent can be analyzed successfully as soon as step 303 is reached.

Speech recognition may encounter homophones, or near-homophones produced by pronunciation problems, which lower the recognition rate. Through steps 301 to 305, the present application generates acoustic wave graphs for the unrecognized words and compares them with the data for existing hot words in order to correct the unrecognized content. This improves the efficiency of semantic recognition and expands the database automatically.
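Steps 303 to 305 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `correct_command`, the `backup` dictionary, the default thresholds, and the pluggable `similarity` callable (standing in for the acoustic wave graph comparison) are all hypothetical, and the input is assumed to be already segmented.

```python
from typing import Callable, Dict, List, Optional, Set


def correct_command(
    tokens: List[str],
    lexicon: Set[str],
    backup: Dict[str, str],
    similarity: Callable[[str, str], float],
    validity_threshold: float = 0.7,    # assumed value
    similarity_threshold: float = 0.8,  # assumed value
) -> Optional[List[str]]:
    """Return the corrected word list, or None if the command is rejected."""
    if not tokens:
        return None
    # Words corrected on an earlier pass are resolved from the backup store,
    # so a repeated sentence is already understood at step 303.
    tokens = [backup.get(word, word) for word in tokens]

    # Step 303: share of segmented words that the lexicon recognizes.
    known = sum(1 for word in tokens if word in lexicon)
    if known / len(tokens) <= validity_threshold:
        return None  # treated as an invalid command

    corrected: List[str] = []
    for word in tokens:
        if word in lexicon:
            corrected.append(word)
            continue
        # Step 304: find the lexicon word whose acoustic graph is closest.
        best = max(lexicon, key=lambda cand: similarity(word, cand), default=None)
        if best is None or similarity(word, best) <= similarity_threshold:
            return None  # no sufficiently similar word
        # Step 305: treat the word as its match and remember the pairing.
        backup[word] = best
        corrected.append(best)
    return corrected
```

The sketch rejects a command when the similarity merely equals the threshold, following the "exceeds" wording of the claims; step 305 says "reached", so a real system would have to pick one convention.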

Corresponding to Figure 1 above, an embodiment of the present application further provides an apparatus for executing voice commands of a smart device. As shown in Figure 4, the apparatus includes:

an acquisition module 42, configured to acquire a voice command with which the user controls the smart device;

a processing module 44, configured to segment the sentence corresponding to the voice command into words and compare the segmentation result with the words in a preset database;

a comparison module 46, configured to compare the acoustic wave graph of a first word, which appears in the segmentation result but not in the database, with the acoustic wave graph of a second word in the database;

an execution module 48, configured to execute the voice command in combination with the semantics of the second word when the similarity of the acoustic wave graphs exceeds a first preset threshold.

It can be seen that, in the embodiments of the present application, after the voice command with which the user controls the smart device is acquired, the command is segmented into words. If the segmentation yields a word that is not in the preset database, that word is compared with the words in the database to find a word with a similar waveform graph. The word that is not in the database is then corrected to obtain the correct semantics of the voice command, and the corrected command is executed on that basis. In other words, even if an accent causes the user's speech to be recognized inaccurately, the result can be corrected into the right voice command, so that subsequent execution is more accurate and misoperation or unrecognized commands are avoided. This solves the problem in the related art that differences in pronunciation bias speech recognition and make voice control of smart homes perform poorly.

In an optional implementation of the embodiments of the present application, the processing module 44 may further include: a counting unit, configured to count the words in the segmentation result that exist in the database; a processing unit, configured to divide the number of words in the segmentation result that exist in the database by the total number of words in the segmentation result to obtain the validity rate of the voice command; a first determining unit, configured to determine that the voice command is a valid command when the validity rate exceeds a second preset threshold; and a second determining unit, configured to determine that the voice command is an invalid command when the validity rate does not exceed the second preset threshold. The acoustic wave graph comparison is performed on the first word of a valid command that does not exist in the database.

As a specific example, take "query water purifier travel life", which is segmented into "query", "water purifier", "travel", and "life". "Query", "water purifier", and "life" have corresponding words in the database, whereas "travel" does not. The segmentation therefore yields 4 words, 3 of which exist in the database, so the validity rate of "query water purifier travel life" is 3/4 = 0.75, that is, 75%. In the embodiments of the present application, the second preset threshold can be set according to actual needs, for example to 60%, 65%, or 70%. The purpose of the second preset threshold is to ensure that only a few words in the voice command were not recognized normally, so that the correct semantics of the command can be obtained in the subsequent correction. If the second preset threshold is set low, a command in which many words were misrecognized may still pass the check even though it cannot be corrected; such a command is in effect invalid, and the user has to issue the voice command again.
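The arithmetic of the validity rate can be made concrete with a short Python sketch. The function names `validity_rate` and `is_valid_command` and the default threshold are illustrative assumptions; the source specifies only the ratio and the "exceeds the second preset threshold" test.

```python
from typing import Iterable, Set


def validity_rate(tokens: Iterable[str], lexicon: Set[str]) -> float:
    """Words found in the lexicon divided by all words in the segmentation."""
    words = list(tokens)
    if not words:
        return 0.0  # an empty segmentation is treated as entirely unrecognized
    known = sum(1 for word in words if word in lexicon)
    return known / len(words)


def is_valid_command(tokens: Iterable[str], lexicon: Set[str],
                     threshold: float = 0.7) -> bool:
    """A command is valid only if its rate exceeds the (assumed) threshold."""
    return validity_rate(tokens, lexicon) > threshold


# The example from the text: "travel" (旅行) is the only unknown word.
lexicon = {"查询", "净水器", "滤芯", "寿命"}
command = ["查询", "净水器", "旅行", "寿命"]
print(validity_rate(command, lexicon))          # 0.75
print(is_valid_command(command, lexicon, 0.7))  # True
print(is_valid_command(command, lexicon, 0.8))  # False
```

With the thresholds the text mentions (60%, 65%, 70%) the example command passes; a stricter threshold such as 80% would reject it and require the user to repeat the command.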

In an optional implementation of the embodiments of the present application, the comparison module 46 may further include: a first acquisition unit, configured to acquire the waveform parameters of the acoustic wave graph of the first word and the waveform parameters of the acoustic wave graph of the second word, where the waveform parameters include at least one of amplitude and frequency; a comparison unit, configured to compare the waveform parameters of the acoustic wave graph of the first word with those of the acoustic wave graph of the second word; and a second acquisition unit, configured to obtain the ratio between the waveform parameters of the acoustic wave graph of the first word and those of the acoustic wave graph of the second word, where the ratio characterizes the similarity.
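One way to turn a ratio of waveform parameters into a similarity score is sketched below. The source states only that a ratio between the parameters characterizes the similarity; dividing the smaller value by the larger, averaging the amplitude and frequency ratios with equal weight, and the `WaveformParams` type are all assumptions made for illustration. How the amplitude and frequency are measured from the audio is outside this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaveformParams:
    amplitude: float  # e.g. peak or RMS amplitude of the word's waveform
    frequency: float  # e.g. dominant frequency in Hz


def _ratio(a: float, b: float) -> float:
    """Smaller value over larger: 1.0 when equal, toward 0.0 as they diverge."""
    if a <= 0 or b <= 0:
        return 0.0
    return min(a, b) / max(a, b)


def waveform_similarity(first: WaveformParams, second: WaveformParams) -> float:
    """Equal-weight mean of the amplitude ratio and the frequency ratio."""
    amplitude_ratio = _ratio(first.amplitude, second.amplitude)
    frequency_ratio = _ratio(first.frequency, second.frequency)
    return (amplitude_ratio + frequency_ratio) / 2
```

The score lies in [0, 1] and is symmetric in its arguments, which makes it directly comparable with a first preset threshold such as 0.8.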

In the embodiments of the present application, the waveform graph is derived from the pronunciation and intonation of a word: each word has a corresponding waveform graph, and each waveform graph has a corresponding amplitude and frequency. The present application mainly addresses speech recognition errors caused by accent or pronunciation. In such errors the two words usually sound alike, for example "travel" (旅行, lǚxíng) and "filter element" (滤芯, lǜxīn). The similarity between two words can therefore be determined by comparing the amplitude and frequency of their waveform graphs, and a high similarity indicates that the second word is the correct word.

It should be noted that if several words in the current database have a similarity that satisfies the first preset threshold, semantic recognition can be performed on those words, and the word whose semantics best fit the current voice command is chosen to replace the first word.

In an optional implementation of the embodiments of the present application, the execution module 48 may further include: a correction unit, configured to correct the first word in the segmentation result to the second word; and an execution unit, configured to execute the voice command corresponding to the corrected segmentation result based on the semantics of that command.

As an example, take "query water purifier travel life", which is segmented into "query", "water purifier", "travel", and "life". "Query", "water purifier", and "life" have corresponding words in the database, whereas "travel" (旅行) does not. The waveform graph of "travel" is compared with the waveform graphs of the words in the database, and based on waveform similarity the word identified in the database is "filter element" (滤芯). The voice command is then corrected using "filter element", and the correct command "query water purifier filter element life" is executed. Several words may satisfy the waveform similarity, for example "filter element" (滤芯, lǜxīn), "aluminum core" (铝芯, lǚxīn), and "green heart" (绿心, lǜxīn). In that case it can be further determined whether the semantics of each candidate match the current voice command. In this example "filter element" matches, whereas "aluminum core" and "green heart" do not, so the voice command is corrected using "filter element" and the correct command "query water purifier filter element life" is executed.
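The choice among several acoustically similar candidates can be sketched as follows. The source does not say how the semantic match is computed; this sketch assumes, purely for illustration, that a candidate fits when substituting it yields a command from a known set. The function name `pick_by_semantics` and the `known_commands` set are hypothetical.

```python
from typing import Optional, Sequence, Set, Tuple


def pick_by_semantics(
    tokens: Sequence[str],
    index: int,
    candidates: Sequence[str],
    known_commands: Set[Tuple[str, ...]],
) -> Optional[str]:
    """Return the first candidate that turns the sentence into a known command.

    tokens      -- segmented sentence containing the unrecognized word
    index       -- position of the unrecognized word in tokens
    candidates  -- database words whose waveform similarity passed the threshold
    """
    for candidate in candidates:
        trial = tuple(tokens[:index]) + (candidate,) + tuple(tokens[index + 1:])
        if trial in known_commands:
            return candidate
    return None  # none of the candidates fits the command


# Example from the text: only 滤芯 (filter element) yields a meaningful command.
known = {("查询", "净水器", "滤芯", "寿命")}
sentence = ["查询", "净水器", "旅行", "寿命"]
print(pick_by_semantics(sentence, 2, ["铝芯", "绿心", "滤芯"], known))  # 滤芯
```

An exact-match lookup is the simplest possible stand-in; a production system would more plausibly score candidates with an intent classifier or language model.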

In an optional implementation of the embodiments of the present application, as shown in Figure 5, the apparatus further includes a backup module 52, configured to back up the first word to the database after the voice command has been executed in combination with the semantics of the second word.

It can be seen that, in the embodiments of the present application, the first word can be added to the database. In the specific example of "query water purifier travel life", "travel" can be added to the database. The next time "query water purifier travel life" is received, the command can be executed directly without adjustment, because its correct semantics, "query water purifier filter element life", are already known. In other words, even non-standard pronunciation can be mapped directly to its correct semantics, which improves the experience of controlling smart devices by voice command.

As shown in Figure 6, an embodiment of the present application provides an electronic device including a processor 111, a communication interface 112, a memory 113, and a communication bus 114, wherein the processor 111, the communication interface 112, and the memory 113 communicate with one another through the communication bus 114.

The memory 113 is configured to store a computer program.

In one embodiment of the present application, the processor 111, when executing the program stored in the memory 113, implements the method for executing voice commands of a smart device provided by any of the foregoing method embodiments. Its effects are similar and are not repeated here.

An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for executing voice commands of a smart device provided by any of the foregoing method embodiments.

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Furthermore, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The above are only specific embodiments of the present invention, provided so that those skilled in the art can understand or implement the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Accordingly, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features claimed herein.

Claims (10)

1. A method for executing voice commands of a smart device, characterized by comprising:
acquiring a voice command with which a user controls the smart device;
segmenting the sentence corresponding to the voice command into words, and comparing the segmentation result with the words in a preset database;
comparing the acoustic wave graph of a first word, which is in the segmentation result but not in the database, with the acoustic wave graph of a second word in the database;
executing the voice command in combination with the semantics of the second word when the similarity of the acoustic wave graphs exceeds a first preset threshold.

2. The method according to claim 1, characterized in that comparing the segmentation result with the words in the preset database comprises:
counting the words in the segmentation result that exist in the database;
dividing the number of words in the segmentation result that exist in the database by the number of all words in the segmentation result to obtain the validity rate of the voice command;
determining that the voice command is a valid command when the validity rate exceeds a second preset threshold;
determining that the voice command is an invalid command when the validity rate does not exceed the second preset threshold;
wherein the acoustic wave graph comparison is performed on the first word of the valid command that does not exist in the database.

3. The method according to claim 1, characterized in that comparing the acoustic wave graph of the first word, which is in the segmentation result but not in the database, with the acoustic wave graph of the second word in the database comprises:
acquiring the waveform parameters of the acoustic wave graph of the first word and the waveform parameters of the acoustic wave graph of the second word, wherein the waveform parameters include at least one of the following: amplitude, frequency;
comparing the waveform parameters of the acoustic wave graph of the first word with the waveform parameters of the acoustic wave graph of the second word;
obtaining the ratio between the waveform parameters of the acoustic wave graph of the first word and the waveform parameters of the acoustic wave graph of the second word, wherein the ratio characterizes the similarity.

4. The method according to claim 1, characterized in that, after executing the voice command in combination with the semantics of the second word, the method further comprises:
backing up the first word to the database.

5. The method according to claim 1, characterized in that executing the voice command in combination with the semantics of the second word comprises:
correcting the first word in the segmentation result to the second word;
executing the voice command corresponding to the corrected segmentation result based on the semantics of that voice command.

6. An apparatus for executing voice commands of a smart device, characterized by comprising:
an acquisition module, configured to acquire a voice command with which a user controls the smart device;
a processing module, configured to segment the sentence corresponding to the voice command into words and compare the segmentation result with the words in a preset database;
a comparison module, configured to compare the acoustic wave graph of a first word, which is in the segmentation result but not in the database, with the acoustic wave graph of a second word in the database;
an execution module, configured to execute the voice command in combination with the semantics of the second word when the similarity of the acoustic wave graphs exceeds a first preset threshold.

7. The apparatus according to claim 6, characterized in that the processing module comprises:
a counting unit, configured to count the words in the segmentation result that exist in the database;
a processing unit, configured to divide the number of words in the segmentation result that exist in the database by the number of all words in the segmentation result to obtain the validity rate of the voice command;
a first determining unit, configured to determine that the voice command is a valid command when the validity rate exceeds a second preset threshold;
a second determining unit, configured to determine that the voice command is an invalid command when the validity rate does not exceed the second preset threshold;
wherein the acoustic wave graph comparison is performed on the first word of the valid command that does not exist in the database.

8. The apparatus according to claim 6, characterized in that the comparison module comprises:
a first acquisition unit, configured to acquire the waveform parameters of the acoustic wave graph of the first word and the waveform parameters of the acoustic wave graph of the second word, wherein the waveform parameters include at least one of the following: amplitude, frequency;
a comparison unit, configured to compare the waveform parameters of the acoustic wave graph of the first word with the waveform parameters of the acoustic wave graph of the second word;
a second acquisition unit, configured to obtain the ratio between the waveform parameters of the acoustic wave graph of the first word and the waveform parameters of the acoustic wave graph of the second word, wherein the ratio characterizes the similarity.

9. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1-5 when executing the program stored in the memory.

10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method steps of any one of claims 1-5.
CN202211651927.2A 2022-12-21 2022-12-21 Method, device and storage medium for executing voice commands of intelligent equipment Pending CN116052660A (en)


Publications (1)

Publication Number Publication Date
CN116052660A true CN116052660A (en) 2023-05-02

Family

ID=86119423



CN114333817A (en) Remote controller and remote controller voice recognition method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination