
CN111784971B - Alarm processing method and system, computer readable storage medium and electronic device - Google Patents

Alarm processing method and system, computer readable storage medium and electronic device

Info

Publication number
CN111784971B
Authority
CN
China
Prior art keywords
alarm
audio information
voice recognition
user
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910272717.4A
Other languages
Chinese (zh)
Other versions
CN111784971A (en
Inventor
唐琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Horizon Robotics Technology Research and Development Co Ltd
Original Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Horizon Robotics Technology Research and Development Co Ltd filed Critical Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority to CN201910272717.4A priority Critical patent/CN111784971B/en
Publication of CN111784971A publication Critical patent/CN111784971A/en
Publication of CN111784971B publication Critical patent/CN111784971B/en

Classifications

    • GPHYSICS
    • G08SIGNALLING
    • G08BSIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/02Alarms for ensuring the safety of persons
    • G08B21/04Alarms for ensuring the safety of persons responsive to non-activity, e.g. of elderly persons
    • G08B21/0438Sensor means for detecting
    • G08B21/0469Presence detectors to detect unsafe condition, e.g. infrared sensor, microphone

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Gerontology & Geriatric Medicine (AREA)
  • Business, Economics & Management (AREA)
  • Emergency Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)
  • Alarm Systems (AREA)

Abstract

Disclosed are an alarm processing method and apparatus, a computer-readable storage medium, and an electronic device, wherein the method includes: detecting a voice signal; recording audio information in a current scene in response to detecting a preset wake-up word from the voice signal; and performing voice recognition on the audio information, and performing alarm processing based on a voice recognition result. Embodiments of the disclosure make it possible to raise an alarm when it is inconvenient or impossible for the user to do so by operating a mobile phone.

Description

Alarm processing method and system, computer readable storage medium and electronic device
Technical Field
The present disclosure relates to voice technology, and in particular, to an alarm processing method and apparatus, a computer-readable storage medium, and an electronic device.
Background
In daily life, a user may encounter an emergency, such as an indoor robbery, in which the user cannot raise an alarm by operating a mobile phone to dial an emergency number directly, which leaves the user in a more dangerous situation.
Disclosure of Invention
In the process of implementing the invention, the inventor found that although existing mobile phones all provide an emergency contact feature that lets a user place a call without unlocking the phone in an emergency, this feature still requires the user to operate the phone in order to place the call. When the phone is not at the user's side, the user cannot raise an alarm by operating the phone to make a call.
In order to solve the above technical problem, the embodiments of the present disclosure provide a technical solution for alarm processing.
According to an aspect of the embodiments of the present disclosure, there is provided an alarm processing method, including:
detecting a voice signal;
recording audio information in a current scene in response to detecting a preset wake-up word from the voice signal;
and performing voice recognition on the audio information, and performing alarm processing based on a voice recognition result.
According to another aspect of the embodiments of the present disclosure, there is provided an alarm processing apparatus including:
a receiving module, configured to detect a voice signal;
a recording module, configured to record audio information in a current scene in response to detecting a preset wake-up word in the voice signal obtained from the receiving module;
and an alarm module, configured to perform voice recognition on the audio information recorded by the recording module and to perform alarm processing based on a voice recognition result.
According to a further aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the method of any of the above embodiments.
According to still another aspect of an embodiment of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to perform the method according to any of the above embodiments.
With the alarm processing method and apparatus, the computer-readable storage medium, and the electronic device provided by the embodiments of the present disclosure, recording of the audio information in the current scene is triggered by a preset wake-up word in the voice signal, and alarm processing is performed based on that audio information. An alarm can therefore be raised when it is inconvenient or impossible for the user to do so by operating an electronic device (e.g., a mobile phone), which protects the personal safety of the user to the greatest extent. The alarm is also raised immediately, which helps to prevent crime, and the recorded audio information can serve as evidence against the criminals.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
FIG. 1 is a flow diagram of an alarm processing method according to some embodiments of the present disclosure;
FIG. 2 is a flow diagram of an alert process based on voiceprint information and user information of registered users according to some embodiments of the present disclosure;
FIG. 3 is a flow diagram of an alert process based on speech recognition of audio information according to some embodiments of the present disclosure;
FIG. 4 is a flow diagram of identifying the number of people speaking in audio information according to some embodiments of the present disclosure;
FIG. 5 is a schematic diagram of an alarm processing device according to some embodiments of the present disclosure;
FIG. 6 is a schematic structural diagram of an alarm processing apparatus according to another embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an alarm module according to some embodiments of the present disclosure;
FIG. 8 is a schematic structural diagram of an alarm module according to further embodiments of the present disclosure;
FIG. 9 is a schematic structural diagram of an electronic device according to some embodiments of the present disclosure.
Detailed Description
Hereinafter, example embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and are not intended to imply any particular technical meaning or any necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure merely describes an association relationship between associated objects and means that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the associated objects before and after it are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The disclosed embodiments may be applied to electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with electronic devices, such as terminal devices, computer systems, servers, and the like, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
In daily life, a user may encounter an emergency, such as an indoor robbery, in which the user cannot raise an alarm by operating a mobile phone to dial an emergency number directly, which leaves the user in a more dangerous situation.
Existing mobile phones all provide an emergency contact feature that lets a user place a call, and thereby raise an alarm, without unlocking the phone in an emergency. However, this feature still requires the user to operate the phone in order to place the call. When the phone is not at the user's side, the user cannot raise an alarm by operating the phone to make a call.
Voice wakeup is a technique for waking up a device from a sleep state by detecting a specific wakeup word in real time in a continuous speech stream. With the development of artificial intelligence, voice wake-up functions have been increasingly applied to smart devices.
In some embodiments, when a user encounters an emergency, the alarm processing method and/or apparatus of the embodiments of the present disclosure may be used: based on voice wake-up technology, a preset wake-up word starts a recording function, and an alarm is raised by performing voice recognition on the recorded audio information. An alarm can thus be raised even when it is inconvenient or impossible for the user to make a call by operating a mobile phone, so that the personal safety of the user is protected to the greatest extent.
FIG. 1 is a flowchart of an alarm processing method according to some embodiments of the present disclosure. The method may be performed by a terminal device alone, or jointly by a terminal device and a server. Examples of the terminal device include a smart speaker, a smart camera, a smart alarm clock, and a smartphone. The alarm processing method includes the following steps:
102, detecting a voice signal.
In the embodiments of the present disclosure, the voice signal may be detected by a microphone, for example a piezoelectric microphone or an electret microphone; the embodiments of the present disclosure do not limit the type of microphone. In an optional example, the microphone may support far-field pickup at three to five meters. Optionally, after the voice signal is detected, it may be processed, for example by A/D conversion, noise reduction, energy enhancement, and feature extraction, which is not limited in this disclosure.
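As a rough illustration of the optional processing mentioned above, the following Python sketch applies a simple peak normalization (one possible reading of "energy enhancement") and computes short-time frame energies as a minimal feature. The function names, the target peak, and the frame parameters are assumptions made for illustration and do not come from the disclosure.

```python
from typing import List


def normalize_peak(samples: List[float], target_peak: float = 0.9) -> List[float]:
    """Scale the signal so that its largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def frame_energies(samples: List[float], frame_len: int, hop: int) -> List[float]:
    """Split the signal into frames and return the mean energy of each frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    return energies
```

A real device would more likely perform these steps in dedicated signal-processing hardware or an audio library; the sketch only shows the order of operations.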
104, recording audio information in the current scene in response to detecting a preset wake-up word from the voice signal.
In the embodiments of the present disclosure, the preset wake-up word may be a specific word registered in advance by the user. For example, the preset wake-up word may be registered through an APP, and once registration is complete, the preset wake-up word is bound to the start of the recording function. Optionally, the preset wake-up word may be registered in text form, for example by the user manually entering it at a predetermined position in the APP; alternatively, it may be registered in sound form, for example by the user recording, through the APP, audio of the user speaking the preset wake-up word. The embodiments of the present disclosure do not limit the manner in which the preset wake-up word is registered.
In an optional example, to limit the effect of false wake-ups, 2 to 3 preset wake-up words may be registered, and each preset wake-up word may be about 4 words long; however, the embodiments of the present disclosure limit neither the number of preset wake-up words nor the length of each. In another optional example, so that uttering the preset wake-up word does not appear obtrusive, inconspicuous everyday expressions such as "hello" may be registered as preset wake-up words; the embodiments of the present disclosure do not limit the type of the preset wake-up word.
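The registration options described above can be sketched as a small registry in Python. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, a voiceprint is represented as a plain list of floats, and for voice registration the spoken wake-up word is assumed to have already been transcribed to text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class WakeWord:
    text: str
    # Present only when the word was registered in sound form (assumption).
    voiceprint: Optional[List[float]] = None


@dataclass
class WakeWordRegistry:
    words: Dict[str, WakeWord] = field(default_factory=dict)

    def register_text(self, text: str) -> None:
        """Register a wake-up word typed in by the user."""
        self.words[text] = WakeWord(text)

    def register_voice(self, text: str, voiceprint: List[float]) -> None:
        """Register a spoken wake-up word together with the speaker's voiceprint."""
        self.words[text] = WakeWord(text, list(voiceprint))

    def starts_recording(self, detected_text: str) -> bool:
        """A registered wake-up word is bound to the start of recording."""
        return detected_text in self.words
```

Because the registry is keyed by text, registering the same word again simply overwrites the earlier entry.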
Optionally, the detected voice signal may be processed to detect whether the voice signal includes a preset wake-up word, for example: processing the voice signal by using machine learning methods such as a Dynamic Time Warping (DTW) Model, a Hidden Markov Model-Gaussian Mixture Model (HMM-GMM) or an Artificial Neural Network (ANN) to detect whether the voice signal includes the preset wakeup word, and the implementation manner of detecting the preset wakeup word is not limited in the embodiment of the present disclosure.
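Of the detection methods listed above, dynamic time warping is the simplest to show in a few lines. The following Python sketch compares a sequence of feature vectors against stored wake-word templates; the length normalization, the threshold, and the function names are illustrative assumptions rather than details from the disclosure.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dtw_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Classic DTW with Euclidean local cost and steps (1,0), (0,1), (1,1)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def contains_wake_word(features: Sequence[Vector],
                       templates: Sequence[Sequence[Vector]],
                       threshold: float) -> bool:
    """Return True if any template is within the length-normalized threshold."""
    if not features:
        return False
    for template in templates:
        if not template:
            continue
        score = dtw_distance(features, template) / (len(features) + len(template))
        if score <= threshold:
            return True
    return False
```

DTW tolerates differences in speaking rate, which is why a template with a repeated frame still matches exactly in the usage below.

```python
template = [[0.0], [1.0], [1.0], [2.0]]
print(contains_wake_word([[0.0], [1.0], [2.0]], [template], threshold=0.1))
```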
In an optional example, when the electronic device is in an operating state and the preset wake-up word is detected in the voice signal, the preset wake-up word may directly start the recording function of the device, and recording of the audio information in the current scene begins. In another optional example, when the electronic device is in a standby state or a sleep state and the preset wake-up word is detected in the voice signal, the device first needs to be woken from that state, after which the preset wake-up word starts the recording function and recording of the audio information in the current scene begins. In this case, the preset wake-up word needs to be bound to the wake-up function of the electronic device in addition to the start of the recording function.
Optionally, when the recording function is started by the preset wake-up word and/or the electronic device is woken up by it, the user may be notified discreetly, without alerting the criminals, so that the user can go on to give more useful information when calling for help. The manner of notifying the user may be chosen according to the form of the electronic device, for example a dim flashing light or a slight rotation or shake of the device, which is not limited by the embodiments of the present disclosure.
106, performing voice recognition on the audio information, and performing alarm processing based on a voice recognition result.
In the embodiments of the present disclosure, voice recognition may be performed on the recorded audio information to convert it into corresponding text information, and alarm processing is performed based on the obtained text information. Optionally, a speech feature sequence may be obtained by performing feature extraction on the audio information, for example Linear Predictive Coding (LPC) coefficients, Mel-Frequency Cepstral Coefficients (MFCC), or Mel-scale Filter Bank (FBank) features. The speech feature sequence is then processed through an acoustic model and a language model, which convert it into a character sequence, yielding the text information corresponding to the audio information. For example, the acoustic model may be a Gaussian Mixture Model-Hidden Markov Model (GMM-HMM), a Recurrent Neural Network (RNN), or a Feedforward Sequential Memory Network (FSMN), and the language model may be an N-Gram language model or a Neural Network Language Model (NNLM).
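The two cases above amount to a small state machine: a sleeping or standby device is woken first, then recording starts. The Python sketch below shows one way to express this; the state names, the `Device` class, and the action log are assumptions introduced for illustration.

```python
from enum import Enum
from typing import List, Optional


class DeviceState(Enum):
    SLEEPING = "sleeping"   # covers the standby and sleep states
    WORKING = "working"
    RECORDING = "recording"


class Device:
    def __init__(self, state: DeviceState = DeviceState.SLEEPING) -> None:
        self.state = state
        self.actions: List[str] = []  # ordered record of what the device did

    def on_wake_word(self) -> None:
        """Handle detection of the preset wake-up word."""
        if self.state is DeviceState.RECORDING:
            return  # already recording, nothing to do
        if self.state is DeviceState.SLEEPING:
            # The wake-up word is also bound to the device's wake-up function.
            self.actions.append("wake")
            self.state = DeviceState.WORKING
        self.actions.append("start_recording")
        self.state = DeviceState.RECORDING
```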
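To make the role of the language model concrete, the following Python sketch implements a toy bigram (N = 2) model with add-one smoothing and uses it to score candidate word sequences, as a recognizer might when choosing between hypotheses from the acoustic model. The training sentences, the smoothing choice, and the class name are assumptions for illustration; the disclosure does not specify them.

```python
import math
from collections import Counter
from typing import List


class BigramModel:
    def __init__(self, corpus: List[List[str]]) -> None:
        self.history_counts: Counter = Counter()
        self.bigram_counts: Counter = Counter()
        self.vocab = set()
        for sentence in corpus:
            tokens = ["<s>"] + sentence + ["</s>"]
            self.vocab.update(tokens)
            self.history_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens[:-1], tokens[1:]))

    def log_prob(self, sentence: List[str]) -> float:
        """Log-probability of a sentence under add-one smoothed bigrams."""
        tokens = ["<s>"] + sentence + ["</s>"]
        vocab_size = len(self.vocab)
        total = 0.0
        for prev, cur in zip(tokens[:-1], tokens[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.history_counts[prev] + vocab_size
            total += math.log(numerator / denominator)
        return total
```

A hypothesis whose word order was seen in training scores higher than a scrambled one:

```python
model = BigramModel([["help", "me"], ["call", "the", "police"]])
print(model.log_prob(["help", "me"]) > model.log_prob(["me", "help"]))
```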
Optionally, the alarm may be raised in one or more ways, such as making a call, sending a short message, or sending a WeChat message, which is not limited in this disclosure. In an optional example, the content of the call, short message, or WeChat message may be the recorded audio information, or the text information obtained by performing voice recognition on the recorded audio information. In another optional example, the content may be a preset fixed audio or text template: the text information obtained by voice recognition of the recorded audio information is analyzed to determine the matching template, and the alarm is raised using that template.
According to the alarm processing method, recording of the audio information in the current scene is triggered by a preset wake-up word in the voice signal, and alarm processing is performed based on that audio information. An alarm can therefore be raised when it is inconvenient or impossible for the user to do so by operating an electronic device (such as a mobile phone), which protects the personal safety of the user to the greatest extent. The alarm is also raised immediately, which helps to prevent crime, and the recorded audio information can serve as evidence against the criminals.
In some embodiments, before audio information in the current scene is recorded in response to detecting a preset wake-up word from the voice signal, the preset wake-up word needs to be registered so as to bind it to the start of the recording function. When the preset wake-up word is registered in voice form, voiceprint information of the registered user can be collected at the same time as the preset wake-up word is obtained, so that during alarm processing based on the recorded audio information, whether the alarm initiator is the registered user can be determined from that voiceprint information. Likewise, user information of the registered user can be collected before recording, so that when the alarm initiator is determined to be the registered user, the user information is sent to a preset alarm address. The flow of alarm processing based on the voiceprint information and the user information of the registered user according to some embodiments of the present disclosure is described in detail below with reference to the example of FIG. 2.
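The two content options above (forward the recognized text, or pick a matching fixed template) can be sketched as follows in Python. The template texts, the keyword lists, and the substring-matching rule are all hypothetical stand-ins for whatever analysis an implementation actually uses.

```python
from typing import Dict, Tuple

# Hypothetical fixed text templates, keyed by situation.
ALARM_TEMPLATES: Dict[str, str] = {
    "robbery": "Emergency: a robbery is in progress. Please send help.",
    "fire": "Emergency: a fire has been reported. Please send help.",
}

# Hypothetical keywords used to decide which template matches.
TEMPLATE_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "robbery": ("rob", "knife", "wallet"),
    "fire": ("fire", "smoke"),
}


def build_alarm_message(transcript: str) -> str:
    """Return the matching fixed template, or the transcript if none matches."""
    text = transcript.lower()
    for name, keywords in TEMPLATE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return ALARM_TEMPLATES[name]
    return transcript  # fall back to forwarding the recognized text itself
```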
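Steps 102 to 106 can be summarized as a short control loop. In the Python sketch below, every component (wake-word detector, recorder, recognizer, alarm sender) is passed in as a callable, because the disclosure leaves each implementation open; the function name and signature are assumptions.

```python
from typing import Callable, Iterable, TypeVar

Chunk = TypeVar("Chunk")


def process_alarm(voice_chunks: Iterable[Chunk],
                  has_wake_word: Callable[[Chunk], bool],
                  record: Callable[[], bytes],
                  recognize: Callable[[bytes], str],
                  alarm: Callable[[str], None]) -> bool:
    """Run the three steps once; return True if an alarm was processed."""
    for chunk in voice_chunks:          # step 102: detect the voice signal
        if has_wake_word(chunk):        # step 104: wake-up word found
            audio = record()            #           record the current scene
            text = recognize(audio)     # step 106: voice recognition
            alarm(text)                 #           alarm processing
            return True
    return False
```

Injecting the components keeps the loop testable with simple stand-ins:

```python
sent = []
process_alarm(["noise", "hello there"], lambda c: "hello" in c,
              lambda: b"audio", lambda a: "help me", sent.append)
print(sent)
```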
Fig. 2 is a flowchart of performing an alarm processing based on voiceprint information and user information of a registered user according to some embodiments of the present disclosure, and as shown in fig. 2, the alarm processing method may include:
202, identifying whether the audio information includes voiceprint information of the registered user.
If the audio information includes voiceprint information of the registered user, operation 204 is performed; otherwise, the process ends.
204, determining that the alarm initiator is the registered user, acquiring the user information of the registered user, and sending the user information of the registered user to a preset alarm address.
Optionally, a speech feature sequence, such as MFCC or FBank features, may be obtained by performing feature extraction on the audio information, and the speech feature sequence may then be processed by a voiceprint recognition model to identify whether the audio information includes the voiceprint information of the registered user. The voiceprint recognition model may be, for example, a DTW model, a Probabilistic Linear Discriminant Analysis (PLDA) model, or a GMM-UBM model. The embodiments of the present disclosure do not limit the implementation of voiceprint recognition.
Optionally, for a preset wake-up word registered in sound form, the audio of the preset wake-up word may be converted into corresponding text for storage, but the embodiments of the present disclosure do not limit this. Optionally, the preset wake-up word may be updated by registering a new preset wake-up word to replace it, but the embodiments of the present disclosure do not limit this. Optionally, when the preset wake-up word is detected in the voice signal, the voiceprint of the detected utterance may be compared with the voiceprint information of the registered user to determine whether the person who triggered the recording is the registered user, which further restricts who can trigger an alarm.
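As a simplified stand-in for the voiceprint models named above, the following Python sketch compares a fixed-length voiceprint embedding of the recorded audio against the enrolled embedding using cosine similarity. Cosine scoring is a swapped-in technique chosen for brevity, not the PLDA or GMM-UBM scoring mentioned in the text, and the embedding representation and threshold value are assumptions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_registered_user(embedding: Sequence[float],
                       enrolled: Sequence[float],
                       threshold: float = 0.8) -> bool:
    """Decide whether the audio's voiceprint matches the enrolled voiceprint."""
    return cosine_similarity(embedding, enrolled) >= threshold
```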
Optionally, the user information of the registered user may include: the name, age, sex, address, telephone, etc. of the user, but the embodiment of the present disclosure does not limit this. Optionally, the preset alarm address may be a pre-stored alarm address, and may be determined according to a voice recognition result obtained by performing voice recognition on the recorded audio information. Optionally, the user information of the registered user may be sent to the preset alarm address in one or more of a short message sending mode and a WeChat sending mode, which is not limited in the embodiment of the present disclosure.
In an optional example, when the alarm is raised by dialing a telephone number based on voice recognition of the recorded audio information, the preset alarm address may be an address associated with the alarm telephone number, such as a short message address or a WeChat address associated with that number. In this case, the user information of the registered user can be sent by short message or WeChat message while the alarm call is being placed. In another optional example, when the alarm is raised by sending a short message or a WeChat message based on voice recognition of the recorded audio information, the preset alarm address may be the same address to which the alarm message is sent, and the user information of the registered user may be sent together with the alarm message.
According to the embodiments of the present disclosure, voiceprint information and user information of the registered user are collected when the user registers the preset wake-up word, so that it can be determined whether the alarm initiator is the registered user. When the alarm initiator is determined to be the registered user, the user information of the registered user is sent to the preset alarm address. This tightens control over who can use the alarm function, ties each alarm to a real identity, and gives the police more valuable information.
In some embodiments, when voice recognition is performed on the audio information and alarm processing is performed based on the voice recognition result, semantic analysis of the voice recognition result can be used to determine whether an emergency is currently occurring, and an alarm is raised when the semantic analysis indicates that it is. The flow of alarm processing based on voice recognition of the audio information according to some embodiments of the present disclosure is described in detail below with reference to the example of FIG. 3.
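Operations 202 and 204 can be sketched in Python as follows. The `UserInfo` fields follow the examples of user information given in the text (name, age, sex, address, telephone); the function name, the message format, and the `send` callable, which stands in for a short message or WeChat sender, are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class UserInfo:
    name: str
    age: int
    sex: str
    address: str
    phone: str


def report_registered_user(voiceprint_matches: bool,
                           user: UserInfo,
                           alarm_address: str,
                           send: Callable[[str, str], None]) -> bool:
    """Send the user's details to the alarm address if the voiceprint matched."""
    if not voiceprint_matches:
        return False  # operation 202 failed: the process ends
    # Operation 204: the initiator is the registered user, so report the details.
    body = "; ".join(f"{key}: {value}" for key, value in asdict(user).items())
    send(alarm_address, body)
    return True
```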
Fig. 3 is a flowchart of an alarm processing based on speech recognition of audio information according to some embodiments of the present disclosure, and as shown in fig. 3, the alarm processing method may include:
302, performing voice recognition on the audio information to obtain a voice recognition result.
Optionally, a speech feature sequence, such as LPC, MFCC, or FBank features, may be obtained by performing feature extraction on the audio information. The speech feature sequence is then processed through an acoustic model and a language model, which convert it into a character sequence, and the resulting text information corresponding to the audio information serves as the voice recognition result. For example, the acoustic model may be a GMM-HMM, an RNN, or an FSMN, and the language model may be an N-Gram language model or an NNLM, which is not limited in this disclosure.
304, performing semantic analysis based on the voice recognition result to obtain a semantic analysis result.
Optionally, the semantic analysis performed on the speech recognition result may be word-level semantic analysis, sentence-level semantic analysis, or chapter-level semantic analysis, which is not limited in the embodiment of the present disclosure. In an optional example, semantic analysis may be performed on the text information obtained by speech recognition, and semantic representation of the speech recognition result is obtained through the semantic analysis as a semantic analysis result; in another alternative example, semantic analysis may be performed on the text information obtained by speech recognition, and whether a preset word, phrase or sentence is included in the speech recognition result may be recognized as a result of the semantic analysis. The implementation manner of performing semantic analysis on the speech recognition result is not limited in the embodiments of the present disclosure.
Alternatively, the speech recognition result may be semantically analyzed through a topic model such as Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (PLSA), or Latent Dirichlet Allocation (LDA), or may be semantically analyzed through an artificial neural network such as RNN, Long Short-Term Memory network (LSTM), which is not limited in the embodiment of the present disclosure.
306, raising an alarm in response to the semantic analysis result meeting an alarm condition.
Optionally, the alarm condition may be preset according to the semantic analysis method used. For example, the alarm condition may be that the voice recognition result includes a preset word, phrase, or sentence, but the embodiments of the present disclosure do not limit this. In an optional example, when the semantic analysis result indicates that a preset word, phrase, or sentence is present, the corresponding alarm processing may be performed. In the embodiments of the present disclosure, alarm processing may cover emergencies such as fire alarms, crime (robbery) alarms, traffic alarms, and emergency medical rescue, which is not limited by the embodiments of the present disclosure.
According to the embodiments of the present disclosure, semantic analysis is performed on the voice recognition result, and an alarm is raised only when the semantic analysis result meets the alarm condition. This helps ensure that alarms are accurate, prevents false alarms caused by false wake-ups, and improves the usability of the alarm processing.
In some embodiments, when voice recognition is performed on the audio information and alarm processing is performed based on the voice recognition result, information such as the number of people speaking in the audio information can also be identified. Whether to perform semantic analysis can then be decided from the identified number of speakers, and an alarm is raised when the semantic analysis result meets the alarm condition. The flow of identifying the number of people speaking in the audio information according to some embodiments of the present disclosure is described in detail below with reference to the example of FIG. 4.
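The preset-phrase variant of operations 304 and 306 can be sketched in Python as follows. The phrase lists and category names are hypothetical; matching on word boundaries is an illustrative choice so that, for example, "fired" does not trigger the "fire" category.

```python
import re
from typing import Callable, Dict, Set, Tuple

# Hypothetical preset phrases, grouped by the kind of emergency they indicate.
PRESET_PHRASES: Dict[str, Tuple[str, ...]] = {
    "fire": ("fire", "smoke"),
    "police": ("robbery", "robbing", "thief"),
    "traffic": ("car accident", "crash"),
    "medical": ("ambulance", "not breathing"),
}


def semantic_analysis(transcript: str) -> Set[str]:
    """Operation 304: return the categories whose preset phrases appear."""
    text = transcript.lower()
    matched = set()
    for category, phrases in PRESET_PHRASES.items():
        if any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases):
            matched.add(category)
    return matched


def alarm_if_needed(transcript: str, raise_alarm: Callable[[str], None]) -> bool:
    """Operation 306: raise one alarm per matched category, if any."""
    categories = semantic_analysis(transcript)
    for category in sorted(categories):
        raise_alarm(category)
    return bool(categories)
```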
Fig. 4 is a flowchart illustrating an example of identifying the number of people speaking in audio information according to some embodiments of the present disclosure, and as shown in fig. 4, the method for identifying audio information may include:
402, it is identified whether the audio information includes sounds of two or more different users.
If the audio information includes sounds of two or more different users, the process proceeds to block 404; otherwise, if the audio information does not include the sounds of two or more different users, the operation is ended.
404, performing semantic analysis operation based on the voice recognition result.
Optionally, whether the audio information includes sounds of two or more different users may be identified based on the voiceprint information of the users. For example, whether the audio information includes the sounds of two or more different users may be identified based on the voiceprint information of the registered user and the voiceprint information of other users stored in a database. For another example, whether the audio information includes the sounds of two or more different users may be identified by matching the voiceprint information of the registered user against the voice recognition result. The embodiment of the present disclosure does not limit the implementation manner of recognizing the number of speakers in the audio information.
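One possible way to decide from voiceprints whether two or more different users are speaking is sketched below. It assumes that each speech segment has already been converted into a fixed-length voiceprint vector by some upstream model, and it groups segments greedily by cosine similarity. The vector representation, the threshold value of 0.8 and all function names are assumptions for illustration, not part of the disclosure.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def count_distinct_speakers(segment_voiceprints: Sequence[Sequence[float]],
                            threshold: float = 0.8) -> int:
    """Greedily group segment voiceprints; each group stands for one speaker.

    A segment joins an existing group when its similarity to that group's
    first voiceprint reaches the (assumed) threshold; otherwise it starts
    a new group.
    """
    representatives: List[Sequence[float]] = []
    for voiceprint in segment_voiceprints:
        if not any(cosine_similarity(voiceprint, rep) >= threshold
                   for rep in representatives):
            representatives.append(voiceprint)
    return len(representatives)


def has_multiple_speakers(segment_voiceprints: Sequence[Sequence[float]],
                          threshold: float = 0.8) -> bool:
    """True when the audio appears to contain two or more different users."""
    return count_distinct_speakers(segment_voiceprints, threshold) >= 2
```

In a real system the threshold would need tuning against the voiceprint model in use; the value here is only a placeholder.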
Optionally, after identifying whether the audio information includes sounds of two or more different users, if the audio information includes the sound of only one user, semantic analysis need not be performed on the speech recognition result, and alarm processing may still be performed directly based on the speech recognition result. For example, when a user suddenly falls ill while alone at home, or a fire breaks out where the user is alone, an alarm can be given according to the result of voice recognition on the user's audio information without semantic analysis, which shortens the time to alarm and speeds up the handling of emergency alarms, but the embodiment of the present disclosure does not limit this.
The embodiment of the present disclosure identifies whether the audio information includes the sounds of two or more different users, and performs semantic analysis based on the voice recognition result when it does, which provides more information for the semantic analysis and improves its accuracy; identifying whether the audio information includes the sounds of two or more different users can also provide more valuable information to the police.
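The branching of blocks 402 and 404, together with the optional single-speaker case, can be sketched as follows. The function name, its parameters and the `alarm_on_single_speaker` switch are hypothetical; the semantic check is passed in as a callable because the disclosure leaves its implementation open.

```python
from typing import Callable


def should_alarm(recognized_text: str,
                 speaker_count: int,
                 semantic_check: Callable[[str], bool],
                 alarm_on_single_speaker: bool = True) -> bool:
    """Decide whether to alarm, following the flow of fig. 4.

    - Two or more speakers: run semantic analysis (block 404) and alarm
      only if its result meets the alarm condition.
    - Exactly one speaker: optionally alarm directly on the recognition
      result, skipping semantic analysis.
    - No speaker detected: no alarm.
    """
    if speaker_count >= 2:
        return semantic_check(recognized_text)
    if speaker_count == 1:
        return alarm_on_single_speaker
    return False
```

Setting `alarm_on_single_speaker=False` corresponds to the variant in which the operation simply ends when fewer than two users are heard.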
In an application scenario, the alarm processing method according to any of the above embodiments of the present disclosure may be executed by a terminal device to implement alarm processing, for example, a terminal device such as a smart speaker, a smart camera, a smart alarm clock or a smartphone, which is not limited in the embodiment of the present disclosure.
In another application scenario, the alarm processing method according to any of the above embodiments of the present disclosure may be executed jointly by a terminal device and a server to implement alarm processing. In an optional example, the terminal device may detect a voice signal, record audio information in the current scene in response to detecting a preset wake-up word in the voice signal, and then send the recorded audio information to the server; the server performs voice recognition on the audio information and performs alarm processing based on the voice recognition result. For example, the terminal device may be a smart speaker, a smart camera, a smart alarm clock, a smartphone or the like, and the server may be a cloud server or the like, which is not limited in the embodiment of the present disclosure.
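The division of work between terminal device and server described above might look roughly like the following in-process sketch. No real network transport or speech recognizer is used: the class names, the `recognize` callable and the text-based wake-word check are stand-ins assumed purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AlarmServer:
    """Stand-in for the server side; `recognize` is a hypothetical ASR callable."""
    recognize: Callable[[bytes], str]
    alarms: List[str] = field(default_factory=list)

    def handle_audio(self, audio: bytes) -> bool:
        """Run voice recognition, then (placeholder) alarm processing."""
        text = self.recognize(audio)
        if text:
            self.alarms.append(text)  # placeholder for real alarm processing
            return True
        return False


@dataclass
class Terminal:
    """Stand-in for the terminal device that detects the wake-up word."""
    wake_word: str
    server: AlarmServer

    def on_voice_signal(self, signal_text: str, recorded_audio: bytes) -> bool:
        """Forward the recording to the server only if the wake-up word was heard.

        `signal_text` stands in for the output of a wake-word detector.
        """
        if self.wake_word.lower() not in signal_text.lower():
            return False
        return self.server.handle_audio(recorded_audio)
```

In this sketch the terminal does nothing beyond wake-word gating and forwarding, which mirrors the optional example in which recognition and alarm processing run on the server.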
Optionally, when the terminal device collects the voiceprint information and user information of the registered user, the collected voiceprint information and user information need to be uploaded to the server, so that, during alarm processing, the voiceprint information can be used to identify the registered user and the user information of the registered user can be sent to the preset alarm address. Optionally, the voiceprint information and the user information of the registered user stored in the server may be encrypted, so as to protect the user's information and prevent it from being leaked.
Fig. 5 is a schematic structural diagram of an alarm processing apparatus according to some embodiments of the present disclosure. The apparatus may be arranged on the terminal device, or on the terminal device and the server, and executes the alarm processing method of any one of the above embodiments of the present disclosure. As shown in fig. 5, the apparatus may include: a receiving module 510, a recording module 520, and an alarm module 530, wherein:
the receiving module 510 is configured to detect a voice signal;
the recording module 520 is configured to record the audio information in the current scene in response to detecting the preset wake-up word from the voice signal obtained by the receiving module 510;
the alarm module 530 is configured to perform voice recognition on the audio information recorded by the recording module 520, and perform alarm processing based on a voice recognition result.
The alarm processing apparatus of this embodiment records the audio information in the current scene when a preset wake-up word is detected in the voice signal, and performs alarm processing based on the audio information, so that an alarm can be raised when it is inconvenient or impossible for the user to make a call by operating an electronic device (for example, a mobile phone). This protects the user's personal safety to the greatest extent, enables an immediate alarm that helps prevent crime, and allows the recorded audio information to serve as evidence against the offender.
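The hand-off from the receiving module to the recording module can be pictured as a small loop over incoming audio frames: nothing is kept until the wake-up word is detected, and only what follows is recorded. In the sketch below, the frame type, the `is_wake_word` detector and the `max_frames` limit are all assumptions introduced for illustration.

```python
from typing import Callable, Iterable, List, TypeVar

Frame = TypeVar("Frame")


def record_after_wake_word(frames: Iterable[Frame],
                           is_wake_word: Callable[[Frame], bool],
                           max_frames: int = 3) -> List[Frame]:
    """Return the frames recorded after the wake-up word is first detected.

    Frames before the wake-up word, and the wake-up frame itself, are not
    recorded. Recording stops after `max_frames` frames (an assumed limit).
    """
    recording: List[Frame] = []
    awake = False
    for frame in frames:
        if awake:
            recording.append(frame)
            if len(recording) >= max_frames:
                break
        elif is_wake_word(frame):
            awake = True
    return recording
```

The returned recording is what the alarm module would then pass to voice recognition.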
Fig. 6 is a schematic structural diagram of an alarm processing apparatus according to another embodiment of the present disclosure. As shown in fig. 6, the apparatus may include: a registration module 610, a receiving module 620, a recording module 630 and an alarm module 640, wherein:
the registration module 610 is configured to register a preset wake-up word, and collect voiceprint information and user information of a registered user;
the receiving module 620 is configured to detect a voice signal;
the recording module 630 is configured to record the audio information in the current scene in response to detecting the preset wake-up word registered in the registration module 610 from the voice signal obtained by the receiving module 620;
the alarm module 640 is configured to perform voice recognition on the audio information recorded by the recording module 630, and perform alarm processing based on a voice recognition result.
Optionally, as shown in fig. 6, the alarm module 640 may further include a voiceprint recognition unit 641 and an information sending unit 642. The voiceprint recognition unit 641 is configured to recognize whether voiceprint information of a registered user is included in the audio information. The information sending unit 642 is configured to, according to the recognition result of the voiceprint recognition unit 641 and in response to the audio information including the voiceprint information of the registered user, determine that the alarm initiator is the registered user, obtain user information of the registered user, and send the user information of the registered user to a preset alarm address.
Optionally, the user information of the registered user may include: the name, age, sex, address, telephone, etc. of the user, but the embodiment of the present disclosure does not limit this.
According to the embodiment of the present disclosure, voiceprint information and user information of the registered user are collected when the user registers the preset wake-up word, so that whether the alarm initiator is the registered user can be determined, and when the alarm initiator is determined to be the registered user, the user information of the registered user is sent to the preset alarm address. In this way, restrictions on the use of the alarm function can be strengthened, real-name alarming is achieved, and the police can obtain more valuable information.
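The cooperation of the voiceprint recognition unit and the information sending unit can be sketched as below. Voiceprints are assumed to be fixed-length vectors compared by cosine similarity, and `send` stands in for whatever transport delivers data to the preset alarm address; these choices, the 0.8 threshold and all names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Sequence


@dataclass(frozen=True)
class RegisteredUser:
    name: str
    address: str
    phone: str
    voiceprint: Sequence[float]  # assumed fixed-length voiceprint vector


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_alarm_initiator(audio_voiceprint: Sequence[float],
                             users: Iterable[RegisteredUser],
                             threshold: float = 0.8) -> Optional[RegisteredUser]:
    """Return the best-matching registered user, or None if none reaches the threshold."""
    best, best_score = None, threshold
    for user in users:
        score = _cosine(audio_voiceprint, user.voiceprint)
        if score >= best_score:
            best, best_score = user, score
    return best


def send_user_info(audio_voiceprint: Sequence[float],
                   users: Iterable[RegisteredUser],
                   send: Callable[[Dict[str, str]], None]) -> bool:
    """If the initiator is a registered user, send their user information via `send`."""
    user = identify_alarm_initiator(audio_voiceprint, users)
    if user is None:
        return False
    send({"name": user.name, "address": user.address, "phone": user.phone})
    return True
```

When no registered voiceprint matches, nothing is sent, which reflects the case where the alarm initiator is not a registered user.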
Fig. 7 is a schematic structural diagram of an alarm module according to some embodiments of the present disclosure. As shown in fig. 7, the alarm module may include: a speech recognition unit 710, a semantic analysis unit 720 and an alarm unit 730, wherein:
the speech recognition unit 710 is configured to perform speech recognition on the audio information to obtain a speech recognition result;
the semantic analysis unit 720 is configured to perform semantic analysis based on the speech recognition result obtained by the speech recognition unit 710 to obtain a semantic analysis result;
the alarm unit 730 is configured to give an alarm in response to the semantic analysis result obtained by the semantic analysis unit 720 meeting an alarm condition.
Optionally, the semantic analysis unit 720 may identify whether the speech recognition result includes a preset word, phrase or sentence and use the outcome as the semantic analysis result, and the alarm unit 730 may give an alarm when the speech recognition result includes the preset word, phrase or sentence, but the embodiment of the present disclosure does not limit this.
According to the embodiment of the present disclosure, semantic analysis is performed on the result obtained by voice recognition, and an alarm is given when the semantic analysis result meets the alarm condition, so that the accuracy of the alarm can be ensured, false alarms caused by false wake-ups are prevented, and the performance of alarm processing is improved.
Fig. 8 is a schematic structural diagram of an alarm module according to another embodiment of the present disclosure. As shown in fig. 8, the alarm module may include: a voice recognition unit 810, a person number recognition unit 820, a semantic analysis unit 830, and an alarm unit 840, wherein:
the voice recognition unit 810 is configured to perform voice recognition on the audio information to obtain a voice recognition result;
the person number recognition unit 820 is configured to identify whether the sounds of two or more different users are included in the audio information;
the semantic analysis unit 830 is configured to, according to the recognition result of the person number recognition unit 820 and in response to the audio information including the sounds of two or more different users, perform semantic analysis based on the voice recognition result obtained by the voice recognition unit 810 to obtain a semantic analysis result;
the alarm unit 840 is configured to give an alarm in response to the semantic analysis result obtained by the semantic analysis unit 830 meeting an alarm condition.
The embodiment of the present disclosure identifies whether the audio information includes the sounds of two or more different users, and performs semantic analysis based on the voice recognition result when it does, which provides more information for the semantic analysis and improves its accuracy; identifying whether the audio information includes the sounds of two or more different users can also provide more valuable information to the police.
Next, an electronic apparatus according to an embodiment of the present disclosure is described with reference to fig. 9. The electronic device may be either or both of the first device 100 and the second device 200, or a stand-alone device separate from them that may communicate with the first device and the second device to receive the collected input signals therefrom.
FIG. 9 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure.
As shown in fig. 9, the electronic device 10 includes one or more processors 11 and memory 12.
The processor 11 may be a Central Processing Unit (CPU) or another form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 10 to perform desired functions.
Memory 12 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer readable storage medium and executed by processor 11 to implement the alarm processing methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 10 may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
For example, when the electronic device is the first device 100 or the second device 200, the input device 13 may be a microphone or a microphone array as described above for capturing an input signal of a sound source. When the electronic device is a stand-alone device, the input means 13 may be a communication network connector for receiving the acquired input signals from the first device 100 and the second device 200.
The input device 13 may also include, for example, a keyboard, a mouse, and the like.
The output device 14 may output various information including the determined distance information, direction information, and the like to the outside. The output devices 14 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for simplicity, only some of the components of the electronic device 10 relevant to the present disclosure are shown in fig. 9, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 10 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage medium
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the alarm processing method according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.
The computer program product may include program code for carrying out operations of embodiments of the present disclosure, written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in an alarm processing method according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
The block diagrams of devices, apparatuses and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses and systems may be connected, arranged and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (8)

1. An alarm processing method, comprising:
detecting a voice signal;
recording audio information in a current scene in response to detecting a preset wake-up word from the voice signal;
carrying out voice recognition on the audio information, and carrying out alarm processing based on a voice recognition result;
the voice recognition of the audio information and the alarm processing based on the voice recognition result comprise:
carrying out voice recognition on the audio information to obtain a voice recognition result;
identifying whether sounds of two or more different users are included in the audio information;
if the audio information only comprises the sound of one user, alarming based on the voice recognition result;
if the audio information comprises the sounds of two or more different users, performing semantic analysis based on the voice recognition result to obtain a semantic analysis result;
and giving an alarm in response to the semantic analysis result meeting an alarm condition.
2. The method of claim 1, wherein the recording of the audio information in the current scene in response to detecting a preset wake up word from the voice signal further comprises:
registering the preset wake-up word, and collecting voiceprint information and user information of a registered user.
3. The method of claim 1, further comprising:
identifying whether the voiceprint information of the registered user is included in the audio information;
if the audio information comprises voiceprint information of the registered user, determining that an alarm initiator is the registered user, acquiring user information of the registered user, and sending the user information of the registered user to a preset alarm address.
4. The method of claim 2 or 3, wherein the user information comprises: any one or more of name, address, and contact information.
5. The method according to any one of claims 1-3, wherein the performing semantic analysis based on the speech recognition result to obtain a semantic analysis result comprises:
identifying whether the voice recognition result comprises preset words;
and the semantic analysis result meeting the alarm condition comprises:
the voice recognition result comprising the preset words.
6. An alarm processing apparatus comprising:
a receiving module, configured to detect a voice signal;
a recording module, configured to record audio information in a current scene in response to detecting a preset wake-up word in the voice signal obtained by the receiving module;
the alarm module is used for carrying out voice recognition on the audio information recorded by the recording module and carrying out alarm processing based on a voice recognition result, wherein the voice recognition is carried out on the audio information to obtain a voice recognition result, whether the audio information comprises the sounds of two or more different users is recognized, and if the audio information comprises only the sound of one user, the alarm is carried out based on the voice recognition result; and if the audio information comprises the sounds of two or more different users, performing semantic analysis based on the voice recognition result to obtain a semantic analysis result, and giving an alarm in response to the semantic analysis result meeting an alarm condition.
7. A computer-readable storage medium, the storage medium storing a computer program for executing the method of any of the preceding claims 1 to 5.
8. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor configured to perform the method of any of the preceding claims 1 to 5.
CN201910272717.4A 2019-04-04 2019-04-04 Alarm processing method and system, computer readable storage medium and electronic device Active CN111784971B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910272717.4A CN111784971B (en) 2019-04-04 2019-04-04 Alarm processing method and system, computer readable storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910272717.4A CN111784971B (en) 2019-04-04 2019-04-04 Alarm processing method and system, computer readable storage medium and electronic device

Publications (2)

Publication Number Publication Date
CN111784971A CN111784971A (en) 2020-10-16
CN111784971B true CN111784971B (en) 2022-01-14

Family

ID=72755390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910272717.4A Active CN111784971B (en) 2019-04-04 2019-04-04 Alarm processing method and system, computer readable storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN111784971B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112287691B (en) * 2020-11-10 2024-02-13 深圳市天彦通信股份有限公司 Conference recording method and related equipment
CN114202891A (en) * 2021-12-28 2022-03-18 深圳市锐明技术股份有限公司 Method and device for sending alarm indication
CN114495930B (en) * 2022-01-28 2025-12-12 苏州科医世凯半导体技术有限责任公司 Voice interaction methods, devices, storage media and electronic devices
CN114822528A (en) * 2022-02-25 2022-07-29 北京百度网讯科技有限公司 Alarm method, device, equipment and storage medium
CN114913881A (en) * 2022-04-27 2022-08-16 北京三快在线科技有限公司 Safety early warning system, method and device, storage medium and electronic equipment
CN115694663A (en) * 2022-10-27 2023-02-03 普联技术有限公司 Alarm method, alarm system, computer readable storage medium and monitoring device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103188392A (en) * 2011-12-30 2013-07-03 牟颖 Voice alarm mobile phone
CN104240438A (en) * 2014-09-01 2014-12-24 百度在线网络技术(北京)有限公司 Method and device for achieving automatic alarming through mobile terminal and mobile terminal
CN104506717A (en) * 2014-12-10 2015-04-08 广东欧珀移动通信有限公司 Alarm method and device for a mobile device
CN105025271A (en) * 2015-07-28 2015-11-04 深圳英飞拓科技股份有限公司 Behavior monitoring method and device
CN107591151A (en) * 2017-08-22 2018-01-16 百度在线网络技术(北京)有限公司 Far field voice awakening method, device and terminal device
CN108010289A (en) * 2017-12-28 2018-05-08 深圳市永达电子信息股份有限公司 A kind of internet alarm method and system based on Application on Voiceprint Recognition

Also Published As

Publication number Publication date
CN111784971A (en) 2020-10-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant