
CN114898774B - A method and device for detecting audio dropouts - Google Patents

A method and device for detecting audio dropouts

Info

Publication number
CN114898774B
Authority
CN
China
Prior art keywords
sequence
amplitude
audio
signal sequence
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210489774.XA
Other languages
Chinese (zh)
Other versions
CN114898774A (en)
Inventor
黄伟隆
冯津伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dingtalk China Information Technology Co Ltd
Original Assignee
Dingtalk China Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dingtalk China Information Technology Co Ltd
Priority to CN202210489774.XA
Publication of CN114898774A
Application granted
Publication of CN114898774B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/60 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/87 Detection of discrete points within a voice signal

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

A method and a device for detecting audio dropouts. The method comprises: obtaining a first signal sequence and a second signal sequence, wherein the first signal sequence is obtained by an audio system sampling an input audio signal at a target sampling rate; obtaining the amplitudes of the first signal sequence and of the second signal sequence to obtain a first amplitude sequence and a second amplitude sequence, respectively; performing a linear transformation on the first amplitude sequence and the second amplitude sequence to obtain a first target amplitude sequence; performing the linear transformation on the first signal sequence and the second signal sequence to obtain a third signal sequence; obtaining the amplitude of the third signal sequence to obtain a second target amplitude sequence; and determining the dropout condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence. With this technical solution, the audio dropout condition can be determined from the similarity of the two amplitude measures.

Description

Audio drop detection method and device
Technical Field
The present application relates to the field of audio data processing, and in particular, to a method and apparatus for detecting an audio drop point.
Background
When an audio system samples a signal, some of the acquired samples may be dropped. The drop condition is an important indicator of the quality of an audio system: for example, when samples are dropped, the collected sample data may degrade the subjective listening experience or cause some audio algorithms within the audio system to fail. It is therefore necessary to detect audio drops in the audio system.
However, audio drops are currently detected using professional audio test equipment, which makes detection costly.
Therefore, how to reduce the cost of detecting audio drops in an audio system is a technical problem to be solved.
Disclosure of Invention
The application provides a method and a device for detecting audio drop points, which can reduce the cost when detecting the audio drop points of an audio system.
In a first aspect, an embodiment of the present application provides a method for detecting an audio drop. The method includes: obtaining a first signal sequence, where the first signal sequence is a signal sequence obtained by an audio system sampling an input audio signal at a target sampling rate; obtaining a second signal sequence; obtaining the amplitude of the first signal sequence and the amplitude of the second signal sequence to obtain a first amplitude sequence and a second amplitude sequence, respectively; performing a linear transformation on the first amplitude sequence and the second amplitude sequence to obtain a first target amplitude sequence; performing the linear transformation on the first signal sequence and the second signal sequence to obtain a third signal sequence; obtaining the amplitude of the third signal sequence to obtain a second target amplitude sequence; and obtaining the audio drop condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence.
In this embodiment, the first signal sequence refers to the signal sequence obtained by the audio system sampling the input audio signal at the target sampling rate. The input audio signal may be a sinusoidal digital audio signal or a sinusoidal analog audio signal. The target sampling rate refers to the sampling frequency used by the audio system when sampling the input audio signal. Note that when the input audio signal in this embodiment is a sinusoidal digital audio signal, the sampling frequency of that signal is the target sampling rate.
In this embodiment, the second signal sequence refers to the signal sequence that corresponds to sampling the input audio signal at the target sampling rate before the input audio signal is input to the audio system.
For example, when the input audio signal is a sinusoidal digital audio signal with a sampling frequency equal to the target sampling frequency, the second signal sequence is the sinusoidal digital audio signal.
For example, when the input audio signal is a sinusoidal analog audio signal, the second signal sequence refers to a signal sequence obtained by sampling sinusoidal analog audio by a preset audio system using a target sampling rate, where the preset audio system refers to an audio system in which no point drop occurs.
It should be appreciated that in this embodiment, the lengths of the first signal sequence and the second signal sequence may be the same. For example, the length may be denoted n, indicating that the first signal sequence and the second signal sequence each contain n samples.
In this embodiment, the first target amplitude sequence may be regarded as the target amplitude sequence obtained by applying a linear transformation to the first amplitude sequence and the second amplitude sequence. For example, the first target amplitude sequence is the amplitude sequence obtained by summing the first amplitude sequence and the second amplitude sequence.
In this embodiment, the third signal sequence refers to the signal sequence obtained by applying a linear transformation to the first signal sequence and the second signal sequence. This linear transformation is identical to the one used to obtain the first target amplitude sequence from the first amplitude sequence and the second amplitude sequence.
For example, if the first target amplitude sequence is the amplitude sequence obtained by summing the first amplitude sequence and the second amplitude sequence, then the third signal sequence is the signal sequence obtained by summing the first signal sequence and the second signal sequence.
It should be appreciated that in this embodiment, the third signal sequence may have the same length as the first signal sequence and the second signal sequence. For example, when the first signal sequence and the second signal sequence each contain n samples, the third signal sequence also contains n samples.
It should be appreciated that the first and second target amplitude sequences substantially coincide when no drop-out of audio occurs in the audio system. Therefore, in the present embodiment, after the first target amplitude sequence and the second target amplitude sequence are obtained, the audio drop condition of the audio system can be obtained through the similarity of the first target amplitude sequence and the second target amplitude sequence.
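One way to see why the two target amplitude sequences coincide only when nothing is dropped is the idealized sinusoidal case below. It is a sketch under assumptions that are not stated in the source: the linear transformation is summation, the analytic signals of both sequences have the same constant amplitude A, edge transients are ignored, and \delta_i is a hypothetical symbol for the phase offset that dropped samples introduce at index i. X_{1,i} and X_{2,i} denote the i-th elements of the first and second target amplitude sequences.

```latex
\begin{aligned}
z_{2,i} &= A\,e^{j\theta_i}, \qquad z_{1,i} = A\,e^{j(\theta_i+\delta_i)} \\
X_{1,i} &= |z_{1,i}| + |z_{2,i}| = 2A \\
X_{2,i} &= |z_{1,i} + z_{2,i}| = A\,\bigl|1 + e^{j\delta_i}\bigr| = 2A\,\bigl|\cos(\delta_i/2)\bigr| \\
X_{2,i} &\le X_{1,i}, \quad \text{with equality iff } \delta_i \equiv 0 \pmod{2\pi}
\end{aligned}
```

Under these assumptions a drop lowers the second target amplitude sequence from the point of the drop onward, unless the dropped samples happen to span a whole number of periods of the sinusoid, in which case only the transient around the drop differs.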
With reference to the first aspect, in one possible implementation, obtaining the amplitude of the first signal sequence and the amplitude of the second signal sequence includes: performing a Hilbert transform on the first signal sequence to obtain a first analytic signal sequence of the first signal sequence; performing a Hilbert transform on the second signal sequence to obtain a second analytic signal sequence of the second signal sequence; obtaining the amplitude sequence of the first analytic signal sequence to obtain the first amplitude sequence; and obtaining the amplitude sequence of the second analytic signal sequence to obtain the second amplitude sequence. Correspondingly, obtaining the amplitude of the third signal sequence to obtain the second target amplitude sequence includes: performing a Hilbert transform on the third signal sequence to obtain a third analytic signal sequence of the third signal sequence; and obtaining the amplitude sequence of the third analytic signal sequence to obtain the second target amplitude sequence.
In this embodiment, the first analytic signal sequence refers to the signal sequence obtained by applying the Hilbert transform to the first signal sequence, and the second analytic signal sequence refers to the signal sequence obtained by applying the Hilbert transform to the second signal sequence.
It will be appreciated that once the analytic signal corresponding to a signal has been obtained through the Hilbert transform, the amplitude and phase of the signal can be obtained from it.
Therefore, in this embodiment, after the first signal sequence is Hilbert-transformed to obtain the first analytic signal sequence, the amplitude sequence corresponding to the first signal sequence can be obtained; this amplitude sequence is referred to as the first amplitude sequence. Likewise, after the second signal sequence is Hilbert-transformed to obtain the second analytic signal sequence, the amplitude sequence corresponding to the second signal sequence can be obtained; this amplitude sequence is referred to as the second amplitude sequence.
In this embodiment, the third analytic signal sequence refers to the signal sequence obtained by applying the Hilbert transform to the third signal sequence. Similarly, after the third signal sequence is Hilbert-transformed to obtain the third analytic signal sequence, the amplitude sequence corresponding to the third signal sequence can be obtained; this amplitude sequence is referred to as the second target amplitude sequence.
It should be understood that, in this embodiment, when the third signal sequence contains n samples, the third analytic signal sequence contains n analytic samples. The n analytic samples correspond one-to-one to the n samples in the third signal sequence, each being the analytic-signal value at the position of the corresponding sample.
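The steps described so far can be sketched in Python using only the standard library. This is a minimal illustration, not the patent's implementation: the function names, the O(n^2) DFT-based analytic signal, the choice of summation as the linear transformation, and the simulated test signal (256 samples, 32 cycles, 2 samples dropped at index 128) are all assumptions made for the example.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a non-empty real sequence via a one-sided DFT spectrum."""
    n = len(x)
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    # Forward DFT (O(n^2), kept simple for clarity).
    spectrum = [sum(x[t] * twiddle[(k * t) % n] for t in range(n)) for k in range(n)]
    # Keep DC (and Nyquist for even n), double positive bins, zero negative bins.
    weights = [0.0] * n
    weights[0] = 1.0
    for k in range(1, (n + 1) // 2):
        weights[k] = 2.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
    # Inverse DFT of the weighted spectrum.
    return [
        sum(weights[k] * spectrum[k] * twiddle[(-k * t) % n] for k in range(n)) / n
        for t in range(n)
    ]


def amplitude_sequence(x):
    """Amplitude (modulus of the analytic signal) at every sample."""
    return [abs(z) for z in analytic_signal(x)]


def target_amplitude_sequences(s1, s2):
    """First and second target amplitude sequences, with summation as the linear transformation."""
    a1, a2 = amplitude_sequence(s1), amplitude_sequence(s2)
    first_target = [p + q for p, q in zip(a1, a2)]
    s3 = [p + q for p, q in zip(s1, s2)]  # third signal sequence
    second_target = amplitude_sequence(s3)
    return first_target, second_target


def sine(n, cycles, skip_at=None, skipped=0):
    """n samples of a sine; optionally simulate `skipped` dropped samples at index skip_at."""
    idx = [i if skip_at is None or i < skip_at else i + skipped for i in range(n)]
    return [math.sin(2 * math.pi * cycles * i / n) for i in idx]


def mean(values):
    return sum(values) / len(values)


n = 256
reference = sine(n, cycles=32)                        # second signal sequence
dropped = sine(n, cycles=32, skip_at=128, skipped=2)  # first signal sequence with a drop

clean_first, clean_second = target_amplitude_sequences(reference, reference)
drop_first, drop_second = target_amplitude_sequences(dropped, reference)

clean_gap = mean(clean_first) - mean(clean_second)
drop_gap = mean(drop_first) - mean(drop_second)
print("mean gap without drop:", clean_gap)
print("mean gap with drop:   ", drop_gap)
```

With identical sequences the two target amplitude sequences agree up to rounding, while the simulated drop leaves a clearly positive gap. A production implementation would normally use an FFT rather than the quadratic-time DFT shown here.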
With reference to the first aspect, in one implementation, obtaining the audio drop condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence includes: obtaining the average of the amplitudes in the first target amplitude sequence to obtain a first average value; obtaining the average of the amplitudes in the second target amplitude sequence to obtain a second average value; obtaining the difference between the first average value and the second average value to obtain a mean difference; determining that the audio of the audio system has dropped when the mean difference is greater than or equal to a first preset value; and determining that the audio of the audio system has not dropped when the mean difference is less than the first preset value.
In this implementation, the audio drop condition of the audio system is determined by comparing the difference between the average amplitude of the first target amplitude sequence and the average amplitude of the second target amplitude sequence with the first preset value.
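The mean-based decision can be sketched as follows. The function and parameter names are hypothetical, and taking the absolute value of the difference is an assumption, since the text only speaks of "the difference" between the two averages.

```python
from statistics import fmean


def drop_detected_by_mean(first_target, second_target, first_preset):
    """Return True when the mean difference reaches the first preset value."""
    # Assumption: the difference is compared as an absolute value.
    mean_difference = abs(fmean(first_target) - fmean(second_target))
    return mean_difference >= first_preset
```

The first preset value is a tuning parameter; the text does not give a concrete value for it.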
With reference to the first aspect, in one implementation, obtaining the audio drop condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence includes: obtaining the variance of the amplitudes in the first target amplitude sequence to obtain a first variance; obtaining the variance of the amplitudes in the second target amplitude sequence to obtain a second variance; obtaining the difference between the first variance and the second variance to obtain a variance difference; determining that the audio of the audio system has dropped when the variance difference is greater than or equal to a second preset value; and determining that the audio of the audio system has not dropped when the variance difference is less than the second preset value.
In this implementation, the audio drop condition of the audio system is determined by comparing the difference between the variance of the amplitudes in the first target amplitude sequence and the variance of the amplitudes in the second target amplitude sequence with the second preset value.
In combination with the first aspect, in one implementation, obtaining the audio drop condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence includes: obtaining the standard deviation of the amplitudes in the first target amplitude sequence to obtain a first standard deviation; obtaining the standard deviation of the amplitudes in the second target amplitude sequence to obtain a second standard deviation; obtaining the difference between the first standard deviation and the second standard deviation to obtain a standard deviation difference; determining that the audio of the audio system has dropped when the standard deviation difference is greater than or equal to a third preset value; and determining that the audio of the audio system has not dropped when the standard deviation difference is less than the third preset value.
In this implementation, whether the audio of the audio system has dropped is determined by comparing the difference between the standard deviation of the amplitudes in the first target amplitude sequence and the standard deviation of the amplitudes in the second target amplitude sequence with the third preset value.
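The variance-based and standard-deviation-based decisions follow the same pattern and can be sketched together. The names are hypothetical; using the population variance and standard deviation (rather than the sample versions) and comparing absolute differences are assumptions, because the text does not specify either.

```python
from statistics import pstdev, pvariance


def drop_detected_by_variance(first_target, second_target, second_preset):
    """Return True when the variance difference reaches the second preset value."""
    # Assumption: population variance, absolute difference.
    variance_difference = abs(pvariance(first_target) - pvariance(second_target))
    return variance_difference >= second_preset


def drop_detected_by_std(first_target, second_target, third_preset):
    """Return True when the standard deviation difference reaches the third preset value."""
    # Assumption: population standard deviation, absolute difference.
    std_difference = abs(pstdev(first_target) - pstdev(second_target))
    return std_difference >= third_preset
```

A flat first target amplitude sequence compared against a second one that dips after a drop yields a nonzero spread difference, which is what both checks respond to.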
In combination with the first aspect, in one implementation, obtaining the audio drop condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence includes: obtaining a first similarity according to a preset relation, where i is an integer from 1 to n, n represents the length of the first target amplitude sequence, b represents the similarity, min() represents taking the minimum value, X_{1,i} represents the i-th element of the first target amplitude sequence, and X_{2,i} represents the i-th element of the second target amplitude sequence; determining that the audio of the audio system has dropped when the first similarity is greater than or equal to a fourth preset value; and determining that the audio of the audio system has not dropped when the first similarity is less than the fourth preset value.
In this implementation, whether the audio of the audio system has dropped is determined by comparing the similarity between the first target amplitude sequence and the second target amplitude sequence, obtained through the preset relation, with the fourth preset value.
In a second aspect, the application provides a device for detecting an audio drop, which comprises an acquisition module and a processing module. The acquisition module is used for acquiring a first signal sequence, where the first signal sequence is a signal sequence obtained by an audio system sampling an input audio signal at a target sampling rate; the acquisition module is also used for acquiring a second signal sequence; the acquisition module is also used for acquiring the amplitude of the first signal sequence and the amplitude of the second signal sequence to obtain a first amplitude sequence and a second amplitude sequence, respectively; the processing module is used for performing a linear transformation on the first amplitude sequence and the second amplitude sequence to obtain a first target amplitude sequence; the processing module is also used for performing the linear transformation on the first signal sequence and the second signal sequence to obtain a third signal sequence; the acquisition module is also used for acquiring the amplitude of the third signal sequence to obtain a second target amplitude sequence; and the processing module is also used for obtaining the audio drop condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence.
With reference to the second aspect, in one possible implementation, the acquisition module is further configured to: perform a Hilbert transform on the first signal sequence to obtain a first analytic signal sequence of the first signal sequence; perform a Hilbert transform on the second signal sequence to obtain a second analytic signal sequence of the second signal sequence; obtain the amplitude sequence of the first analytic signal sequence to obtain the first amplitude sequence; and obtain the amplitude sequence of the second analytic signal sequence to obtain the second amplitude sequence. Correspondingly, the acquisition module is further configured to perform a Hilbert transform on the third signal sequence to obtain a third analytic signal sequence of the third signal sequence, and to obtain the amplitude sequence of the third analytic signal sequence to obtain the second target amplitude sequence.
With reference to the second aspect, in one possible implementation, the input audio signal is a sinusoidal digital audio signal or a sinusoidal analog audio signal, and the sampling frequency of the sinusoidal digital audio signal is the target sampling rate. When the input audio signal is the sinusoidal digital audio signal, the second signal sequence is the sinusoidal digital audio signal sequence; when the input audio signal is the sinusoidal analog audio signal, the second signal sequence is a signal sequence obtained by a preset audio system sampling the sinusoidal analog audio signal at the target sampling rate.
With reference to the second aspect, in one possible implementation, the processing module is specifically configured to: obtain the average of the amplitudes in the first target amplitude sequence to obtain a first average value; obtain the average of the amplitudes in the second target amplitude sequence to obtain a second average value; obtain the difference between the first average value and the second average value to obtain a mean difference; determine that the audio of the audio system has dropped when the mean difference is greater than or equal to a first preset value; and determine that the audio of the audio system has not dropped when the mean difference is less than the first preset value.
With reference to the second aspect, in one possible implementation manner, the processing module is specifically configured to obtain a variance of the amplitude in the first target amplitude sequence, obtain a first variance, obtain a variance of the amplitude in the second target amplitude sequence, obtain a second variance, obtain a variance difference value by obtaining a difference value between the first variance and the second variance, determine that the audio of the audio system has a dropped point if the variance difference value is greater than or equal to a second preset value, and determine that the audio of the audio system has no dropped point if the variance difference value is less than the second preset value.
With reference to the second aspect, in one possible implementation manner, the processing module is specifically configured to obtain a standard deviation of the amplitude in the first target amplitude sequence, obtain a first standard deviation, obtain a standard deviation of the amplitude in the second target amplitude sequence, obtain a second standard deviation, obtain a difference between the first standard deviation and the second standard deviation, obtain a standard deviation difference, determine that the audio of the audio system drops when the standard deviation difference is greater than or equal to a third preset value, and determine that the audio of the audio system does not drop when the standard deviation difference is less than the third preset value.
With reference to the second aspect, in one possible implementation, the processing module is specifically configured to: obtain the similarity between the first target amplitude sequence and the second target amplitude sequence according to a preset relation, so as to obtain a first similarity, where i is an integer from 1 to n, n represents the length of the first target amplitude sequence, b represents the similarity, min() represents taking the minimum value, X_{1,i} represents the i-th element of the first target amplitude sequence, and X_{2,i} represents the i-th element of the second target amplitude sequence; determine that the audio of the audio system has dropped if the first similarity is greater than or equal to a fourth preset value; and determine that the audio of the audio system has not dropped if the first similarity is less than the fourth preset value.
In a third aspect, an audio system is provided, comprising the device of the second aspect or of any possible implementation of the second aspect.
In a fourth aspect, an audio drop detection device is provided, comprising a processor for invoking a computer program from a memory; when the computer program is executed, the processor performs the method of the first aspect or of any possible implementation of the first aspect.
In a fifth aspect, a computer readable storage medium is provided for storing a computer program comprising code for performing the method of the first aspect or any possible implementation of the first aspect.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic structural diagram of an audio system according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a drop detection system according to an embodiment of the application;
FIG. 3 is a flowchart illustrating a method for detecting an audio drop according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the first amplitude sequence and the second amplitude sequence according to an embodiment of the present application;
FIG. 5 is a flowchart illustrating a method for detecting an audio drop point according to another embodiment of the present application;
FIG. 6 is a schematic structural diagram of an audio drop detection device according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a drop point detecting device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
For ease of understanding, a number of terms referred to in the embodiments of the present application will first be described.
1. Hilbert transform
In digital signal processing, the Hilbert transform refers to convolving a signal s(t) with 1/(πt) to obtain a transformed signal s'(t). The Hilbert transform leaves the amplitude of each frequency component unchanged but shifts its phase by 90 degrees: positive-frequency components lag by π/2 and negative-frequency components lead by π/2. Combining the signal with its Hilbert transform gives the analytic signal s(t) + j·s'(t), from which the amplitude and phase of the signal can be obtained.
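In standard notation, the definition and its use for amplitude extraction can be written as follows. Here s'(t) denotes the Hilbert transform of s(t) as in the text; the symbols z(t), a(t), \varphi(t), A, \omega_0 and \varphi_0 are introduced for illustration only, and the last line is a worked example for a single sinusoid with \omega_0 > 0.

```latex
\begin{aligned}
s'(t) &= \frac{1}{\pi}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty} \frac{s(\tau)}{t-\tau}\,d\tau,
\qquad S'(\omega) = -j\,\operatorname{sgn}(\omega)\,S(\omega) \\
z(t) &= s(t) + j\,s'(t), \qquad a(t) = |z(t)| = \sqrt{s(t)^2 + s'(t)^2}, \qquad \varphi(t) = \arg z(t) \\
s(t) &= A\cos(\omega_0 t + \varphi_0) \;\Rightarrow\; s'(t) = A\sin(\omega_0 t + \varphi_0),
\quad z(t) = A\,e^{j(\omega_0 t + \varphi_0)}, \quad a(t) = A
\end{aligned}
```

For a pure sinusoid the amplitude obtained this way is constant, which is why a sinusoidal input gives flat amplitude sequences when no samples are dropped.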
2. Audio signal
An audio signal (audio signals) is a signal representing a mechanical wave, and is an information carrier in which the wavelength and intensity of the mechanical wave change. According to the characteristics of the mechanical wave, it can be classified into regular signals and irregular signals.
At present, with the rapid development of the audio and video field, various audio systems are widely used in life. Illustratively, fig. 1 is a schematic structural diagram of an audio system according to an embodiment of the present application. As shown in fig. 1, when an input audio signal is input to the audio system 101, the audio system 101 may process the input audio signal to obtain a target audio signal and output the target audio signal. More specifically, the audio system 101 samples an input audio signal before processing the input audio signal, obtains a signal sequence corresponding to the input audio signal, and then processes the signal sequence.
The audio system 101 in the present application may be, for example, a smart speaker, a smart sound pickup, or an integrated audio/video device; the present application does not limit this. Note also that this embodiment does not limit the specific structure of the audio system 101. For example, the audio system 101 may include an input source, a processor, an output source, and the like.
However, in the audio system shown in fig. 1, samples may be dropped while the audio system 101 is sampling. The drop condition is an important indicator of the quality of an audio system: for example, when samples are dropped, the collected sample data may degrade the subjective listening experience or cause some audio algorithms within the audio system to fail. It is therefore necessary to detect audio drops in the audio system.
However, at present, professional audio detection equipment is used for detecting the audio drop condition of the audio system, so that the detection cost is high. Therefore, how to reduce the cost of detecting the audio drop of the audio system is a technical problem to be solved.
In view of the above, the application provides a method and a device for detecting an audio drop. In the audio drop detection method, an input audio signal of an audio system and a signal sequence obtained by sampling the input audio signal by using a target sampling rate are obtained respectively, then the amplitudes of the two signals are subjected to linear operation to obtain a corresponding amplitude measurement value, and in addition, the amplitudes of the input audio signal and the signal sequence obtained by sampling the input audio signal by using the target sampling rate are subjected to linear operation to obtain another amplitude measurement value. It should be appreciated that if no drop-out occurs in the audio system, the two amplitude metrics will substantially coincide. Therefore, in the technical scheme provided by the application, the condition of audio drop of the audio system can be determined through the similarity of the two amplitude measurement values.
Fig. 2 is a schematic structural diagram of a drop detection system according to an embodiment of the present application. As shown in fig. 2, the drop detection system includes an audio system 201 and a detection device 202. In the drop detection system, the input audio signal yields an output audio signal after passing through the audio system 201. The output audio signal and the input audio signal are then input into the detection device 202, which determines the drop condition of the audio system 201 from the two signals.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 3 is a flowchart of a method for detecting an audio drop according to an embodiment of the application. As shown in fig. 3, the method of the present embodiment may include S301, S302, S303, S304, S305, S306, and S307. The method for detecting audio drop points in the present embodiment may be performed by the detection device 202 in the drop point detection system shown in fig. 2.
S301, acquiring a first signal sequence, wherein the first signal sequence is a signal sequence obtained by the audio system sampling an input audio signal at a target sampling rate.
In this embodiment, the first signal sequence (also denoted as S1 in the embodiments of the present application) refers to the signal sequence obtained by the audio system after sampling an input audio signal at a target sampling rate. The target sampling rate refers to the sampling frequency used when the audio system samples the input audio signal. For example, the target sampling rate used by the audio system may be 48 kHz.
For ease of understanding, the drop point detection system shown in fig. 2 is taken as an example. As shown in fig. 2, the first signal sequence S1 is the output audio signal in fig. 2.
It should be appreciated that the input audio signal may be a sinusoidal digital audio signal or a sinusoidal analog audio signal.
In the present embodiment, when the input audio signal is a sinusoidal digital audio signal, the sampling frequency of the sinusoidal digital audio signal is the target sampling rate.
When the input audio signal is a sinusoidal digital audio signal, the frequency of the input audio signal is not limited in the embodiments of the present application. For example, when the audio system uses a target sampling rate of 48 kHz, the frequency of the input audio signal may be 11 kHz, 17 kHz, 20 kHz, etc., which does not limit the application.
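As an illustration of the sinusoidal digital input described above, the following minimal sketch generates a test tone sampled at the target sampling rate. The function name `make_sine_sequence`, its parameters, and the Nyquist check are assumptions introduced for this example and are not part of the described method.

```python
import math


def make_sine_sequence(freq_hz, sample_rate_hz, num_samples, amplitude=1.0):
    """Return num_samples samples of a sine tone sampled at sample_rate_hz."""
    # Assumption: the tone must lie below half the sampling rate (Nyquist),
    # otherwise it cannot be represented without aliasing.
    if not 0 < freq_hz < sample_rate_hz / 2:
        raise ValueError("freq_hz must lie between 0 and sample_rate_hz / 2")
    step = 2.0 * math.pi * freq_hz / sample_rate_hz  # phase advance per sample
    return [amplitude * math.sin(step * t) for t in range(num_samples)]


# Example from the text: an 11 kHz tone at a 48 kHz target sampling rate.
s2 = make_sine_sequence(11_000, 48_000, 480)
```

With these values, 480 samples cover 10 ms of signal. The 17 kHz and 20 kHz tones mentioned above also pass the check, since both lie below 24 kHz.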
S302, acquiring a second signal sequence.
In this embodiment, the second signal sequence (also denoted as S2 in the embodiments of the present application) refers to the signal sequence obtained by sampling the input audio signal at the target sampling rate before the input audio signal is input to the audio system.
Specifically, the drop detection system shown in fig. 2 is taken as an example for explanation. As shown in fig. 2, in implementation, when the input audio signal is a sinusoidal digital audio signal and the sampling frequency of the sinusoidal digital audio signal is the target sampling rate of the audio system 201, the second signal sequence is the sinusoidal digital audio signal sequence itself. When the input audio signal is a sinusoidal analog audio signal, the second signal sequence in the embodiment refers to a signal sequence obtained by a preset audio system sampling the sinusoidal analog audio signal at the target sampling rate. The preset audio system refers to an audio system in which no drop occurs.
It should be appreciated that in this embodiment, the lengths of the first signal sequence and the second signal sequence may be the same. For example, the length may be denoted as n, indicating that the first signal sequence and the second signal sequence each contain n sampled signals.
S303, acquiring the amplitude of the first signal sequence and the amplitude of the second signal sequence to obtain a first amplitude sequence and a second amplitude sequence respectively.
For example, in one possible implementation manner, the first amplitude sequence and the second amplitude sequence are obtained as follows: a Hilbert transform is performed on the first signal sequence to obtain a first analysis signal sequence of the first signal sequence (that is, its analytic signal, also referred to herein as the first resolved signal sequence), and the amplitude sequence of the first analysis signal sequence is obtained as the first amplitude sequence; likewise, a Hilbert transform is performed on the second signal sequence to obtain a second analysis signal sequence of the second signal sequence (also referred to herein as the second resolved signal sequence), and the amplitude sequence of the second analysis signal sequence is obtained as the second amplitude sequence.
In this implementation, after the first signal sequence S1 is acquired, a Hilbert transform is performed on the first signal sequence S1, that is, the formula A1 = HilbertTransform(S1) is applied, to obtain the corresponding analysis signal A1 (also referred to as the first analysis signal sequence A1 in this embodiment). It can be understood that the first analysis signal sequence A1 has the same length in time as the first signal sequence S1; the difference is that the first analysis signal sequence A1 is a complex signal.
It should be appreciated that in this implementation, when the first signal sequence comprises n sampled signals, the first resolved signal sequence will comprise n resolved signals. The n analysis signals are in one-to-one correspondence with n sampling signals in the first signal sequence, and each analysis signal is obtained by performing Hilbert transformation on the corresponding sampling signal.
In this implementation, after the second signal sequence S2 is acquired, a Hilbert transform is performed on the second signal sequence S2, that is, the formula A2 = HilbertTransform(S2) is applied, to obtain the corresponding analysis signal A2 (also referred to as the second analysis signal sequence A2). It can be understood that the second analysis signal sequence A2 has the same length in time as the second signal sequence S2; the difference is that the second analysis signal sequence A2 is a complex signal.
It should be understood that, in the present embodiment, when the second signal sequence includes n sampling signals, the second resolved signal sequence includes n resolved signals. The n analysis signals are in one-to-one correspondence with n sampling signals in the second signal sequence, and each analysis signal is obtained by performing Hilbert transform on the corresponding sampling signal.
It will be appreciated that the amplitude and phase of a signal may be obtained after the corresponding analysis signal is obtained by a hilbert transformation.
Therefore, in this implementation manner, after the first signal sequence S1 is subjected to Hilbert transformation to obtain the first resolved signal sequence A1, the amplitude sequence corresponding to the first resolved signal sequence A1 (i.e., the first amplitude sequence) may be obtained; likewise, after the second signal sequence S2 is subjected to Hilbert transformation to obtain the second resolved signal sequence A2, the amplitude sequence corresponding to the second resolved signal sequence A2 (i.e., the second amplitude sequence) may be obtained.
Illustratively, in one embodiment, the first amplitude sequence of the first analysis signal sequence A1 may be obtained by the formula $\|A1\|_2$, where $\|A1\|_2$ denotes taking the L2 norm of the first analysis signal sequence A1. For the concept of the L2 norm and its detailed explanation, reference may be made to the description in the related art; they are not repeated herein.
Illustratively, in one embodiment, the second amplitude sequence of the second analysis signal sequence A2 may be obtained by the formula $\|A2\|_2$, where $\|A2\|_2$ denotes taking the L2 norm of the second analysis signal sequence A2.
It should be understood that, in the present embodiment, when the second signal sequence includes n sampling signals, the second resolved signal sequence includes n resolved signals, and further, n magnitudes are obtained. The n amplitudes are in one-to-one correspondence with n sampling signals in the second signal sequence or with n analytic signals in the second analytic signal sequence.
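Step S303 can be sketched as follows. This is a minimal, standard-library-only illustration that builds the analysis (analytic) signal with the usual frequency-domain construction (zero the negative frequencies, double the positive ones) and then takes the per-sample magnitude. The function names, the direct O(n²) DFT, and the test tone are assumptions made for this example; a practical implementation would more likely use an FFT-based routine.

```python
import cmath
import math


def analytic_signal(x):
    """Return the complex analytic signal of a non-empty real sequence x."""
    n = len(x)
    # Forward DFT of the real input (direct O(n^2) form, for illustration).
    spectrum = [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and the Nyquist bin for even n), double the positive
    # frequencies, and zero the negative frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        weights[k] = 2.0
    # The inverse DFT of the one-sided spectrum is the analytic signal.
    return [
        sum(spectrum[k] * weights[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def amplitude_sequence(x):
    """Per-sample magnitude of the analytic signal of x (n values for n samples)."""
    return [abs(a) for a in analytic_signal(x)]


# A unit-amplitude tone with a whole number of cycles in the window.
s1 = [math.cos(2.0 * math.pi * 11 * t / 48) for t in range(48)]
a1_amplitude = amplitude_sequence(s1)
```

For this tone the amplitude sequence is flat at the tone's amplitude, up to floating-point rounding, and has the same length n as the input, matching the one-to-one correspondence described above.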
S304, performing linear transformation on the first amplitude sequence and the second amplitude sequence to obtain a first target amplitude sequence.
It should be appreciated that the first amplitude sequence includes a plurality of amplitudes, and the second amplitude sequence includes a plurality of amplitudes, so that, in a specific implementation, the amplitudes at the same position in the first amplitude sequence and the second amplitude sequence may be added to obtain the amplitudes at the corresponding positions in the first target amplitude sequence (the first target amplitude sequence is also denoted by X1 in the embodiment of the present application).
Illustratively, assuming that the first amplitude sequence and the second amplitude sequence are as shown in fig. 4, it is now assumed that the length of the line segment at each position in fig. 4 represents the amplitude at that position. When the first target amplitude sequence is obtained by the first amplitude sequence shown in fig. 4 and the second amplitude sequence shown in fig. 4, the amplitudes at the same position in the first amplitude sequence and the second amplitude sequence can be added to obtain the amplitudes at the corresponding position in the first target amplitude sequence. For example, the magnitudes at the corresponding positions in the first target magnitude sequence are obtained by adding the magnitudes at position 1 in both the first and second magnitude sequences.
It should be appreciated that in this embodiment, the first target amplitude sequence may include n amplitudes, where the n amplitudes are in one-to-one correspondence with n sampled signals in the first signal sequence or with n resolved signals in the second resolved signal sequence.
S305, performing linear transformation on the first signal sequence and the second signal sequence to obtain a third signal sequence.
For example, in one implementation, the first signal sequence and the second signal sequence may be summed to obtain a third signal sequence.
It should be appreciated that in this embodiment, the third signal sequence may have the same length as the first signal sequence and the second signal sequence. For example, the length may be denoted as n, indicating that when the first signal sequence and the second signal sequence each contain n sampled signals, the third signal sequence also contains n sampled signals.
S306, acquiring the amplitude of the third signal sequence to obtain a second target amplitude sequence.
In one possible implementation, the second target amplitude sequence is obtained by performing hilbert transformation on the third signal sequence to obtain a third analysis signal sequence of the third signal sequence, and obtaining the amplitude sequence of the third analysis signal sequence to obtain the second target amplitude sequence.
In this implementation, after the third signal sequence is obtained, the third signal sequence is subjected to hilbert transform.
For example, assuming that the third signal sequence is obtained by summing the first signal sequence and the second signal sequence, i.e., the third signal sequence is equal to S1 + S2, the corresponding analysis signal Asum (also referred to as the third analysis signal sequence Asum in the embodiments of the present application) can be obtained by the formula Asum = HilbertTransform(S1 + S2).
It should be understood that, in the present embodiment, when the third signal sequence includes n sampling signals, the third resolved signal sequence includes n resolved signals. The n analysis signals are in one-to-one correspondence with n sampling signals in the third signal sequence, and each analysis signal is obtained by performing hilbert transformation on the corresponding sampling signal.
Illustratively, in one embodiment, the amplitude sequence (i.e., the second target amplitude sequence) of the third resolved signal sequence Asum may be obtained by the formula $\|Asum\|_2$, where $\|Asum\|_2$ denotes taking the L2 norm of the third analysis signal sequence Asum.
In this embodiment of the present application, the second target amplitude sequence is also denoted as X2.
It should be appreciated that in this embodiment, the second target amplitude sequence may include n amplitudes, where the n amplitudes are in one-to-one correspondence with n sampled signals in the third signal sequence or with n resolved signals in the third resolved signal sequence.
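Steps S305 and S306 can be sketched in the same way: sum the two signal sequences sample by sample, then take the magnitude of the analytic signal of the sum. The helper names and the two test cases (an identical tone and an inverted tone) are assumptions made for this example.

```python
import cmath
import math


def analytic_signal(x):
    """Return the complex analytic signal of a non-empty real sequence x."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # One-sided spectrum: keep DC/Nyquist, double positive frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        weights[k] = 2.0
    return [
        sum(spectrum[k] * weights[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def second_target_amplitude(s1, s2):
    """Sum S1 and S2 sample by sample, then take the analytic-signal magnitude."""
    if len(s1) != len(s2):
        raise ValueError("S1 and S2 must have the same length")
    s3 = [a + b for a, b in zip(s1, s2)]  # third signal sequence (S305)
    return [abs(a) for a in analytic_signal(s3)]  # X2 (S306)


tone = [math.cos(2.0 * math.pi * 11 * t / 48) for t in range(48)]
inverted = [-v for v in tone]  # the same tone shifted by half a cycle
x2_in_phase = second_target_amplitude(tone, tone)
x2_opposed = second_target_amplitude(tone, inverted)
```

When the two sequences are in phase, X2 equals the sum of the individual amplitudes; when they are half a cycle apart, the sum cancels and X2 collapses to zero. Intermediate phase offsets give values in between, which is what makes X2 sensitive to a phase shift between the two sequences.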
S307, obtaining the drop-off condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence.
Typically, if there is no phase difference in the hardware system, X1 and X2 substantially coincide, i.e., D(n) = X1(n) - X2(n) = 0 for any time index n. However, in actual measurement, D is not exactly equal to 0 but is a time sequence of very small values. Therefore, in the embodiments of the present application, a threshold may be set in a specific implementation; after the similarity between the first target amplitude sequence and the second target amplitude sequence is calculated, the audio drop condition of the audio system is determined by the magnitude relationship between the similarity value and the set threshold.
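The role of the phase difference can be made explicit with a short derivation. The analytic signals at sample n are written in polar form, with amplitudes a_1, a_2 and phases φ_1, φ_2 (symbols introduced here for illustration), and the linear transformation is assumed to be the plain sum. Because the analytic-signal construction is linear, the analytic signal of S1 + S2 is A1 + A2:

```latex
\begin{aligned}
A_1(n) &= a_1(n)\,e^{j\varphi_1(n)}, \qquad A_2(n) = a_2(n)\,e^{j\varphi_2(n)},\\
X_1(n) &= a_1(n) + a_2(n),\\
X_2(n) &= \lvert A_1(n) + A_2(n) \rvert
        = \sqrt{a_1^2(n) + a_2^2(n)
          + 2\,a_1(n)\,a_2(n)\cos\bigl(\varphi_1(n) - \varphi_2(n)\bigr)},\\
D(n) &= X_1(n) - X_2(n) \ge 0,
        \quad \text{with } D(n) = 0 \text{ when } \varphi_1(n) = \varphi_2(n).
\end{aligned}
```

Under these assumptions, a missing sample advances the phase of the first signal sequence relative to the second from that point onward, so the cosine term falls below 1 and D(n) becomes positive there.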
Illustratively, in one possible implementation manner, S307 includes: obtaining an average value of the amplitudes in the first target amplitude sequence X1 to obtain a first average value; obtaining an average value of the amplitudes in the second target amplitude sequence X2 to obtain a second average value; obtaining a difference value between the first average value and the second average value to obtain a mean value difference value; determining that the audio of the audio system is dropped when the mean value difference value is greater than or equal to a first preset value; and determining that the audio of the audio system is not dropped when the mean value difference value is less than the first preset value.
In this implementation, when determining the audio drop of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence, the audio drop of the audio system is determined by comparing the difference between the average value of the amplitudes in the first target amplitude sequence and the average value of the amplitudes in the second target amplitude sequence with the first preset value.
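The mean-difference criterion can be sketched as below. The function name, the use of the absolute value of the difference, and the sample values are assumptions made for this example; the text only specifies a comparison of the mean value difference with the first preset value.

```python
from statistics import fmean


def mean_difference_indicates_drop(x1, x2, first_preset):
    """Return True when |mean(X1) - mean(X2)| reaches the first preset value."""
    # Assumption: the difference is compared by its absolute value.
    mean_difference = abs(fmean(x1) - fmean(x2))
    return mean_difference >= first_preset


x1 = [2.0, 2.0, 2.0, 2.0]          # hypothetical first target amplitude sequence
x2_clean = [2.0, 2.0, 2.0, 2.0]    # X2 when no drop occurs
x2_dropped = [2.0, 2.0, 1.0, 1.0]  # X2 reduced after a drop

clean_flag = mean_difference_indicates_drop(x1, x2_clean, first_preset=0.25)
dropped_flag = mean_difference_indicates_drop(x1, x2_dropped, first_preset=0.25)
```

A suitable first preset value depends on the measurement noise of the system under test and would need to be chosen empirically.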
Illustratively, in one possible implementation, S307 includes obtaining a variance of the amplitude in the first target amplitude sequence X1 to obtain a first variance, obtaining a variance of the amplitude in the second target amplitude sequence X2 to obtain a second variance, obtaining a difference between the first variance and the second variance to obtain a variance difference, determining that the audio of the audio system has dropped when the variance difference is greater than or equal to a second preset value, and determining that the audio of the audio system has not dropped when the variance difference is less than the second preset value.
In this implementation, when determining an audio drop of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence, the audio drop of the audio system is determined by comparing the difference between the variance of the amplitudes in the first target amplitude sequence and the variance of the amplitudes in the second target amplitude sequence with the second preset value.
Illustratively, in one possible implementation manner, S307 includes obtaining a standard deviation of the amplitude in the first target amplitude sequence X1 to obtain a first standard deviation, obtaining a standard deviation of the amplitude in the second target amplitude sequence X2 to obtain a second standard deviation, obtaining a difference between the first standard deviation and the second standard deviation to obtain a standard deviation difference, determining that the audio of the audio system is dropped when the standard deviation difference is greater than or equal to a third preset value, and determining that the audio of the audio system is not dropped when the standard deviation difference is less than the third preset value.
In this implementation, when determining the dropping situation of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence, it is determined whether the audio of the audio system is dropped by comparing the difference between the standard deviation of the amplitudes in the first target amplitude sequence and the standard deviation of the amplitudes in the second target amplitude sequence with a third preset value.
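The variance and standard-deviation criteria follow the same pattern. In the sketch below, the function names, the choice of population (rather than sample) statistics, and the use of absolute differences are assumptions made for this example.

```python
from statistics import pstdev, pvariance


def variance_difference_indicates_drop(x1, x2, second_preset):
    """Return True when |var(X1) - var(X2)| reaches the second preset value."""
    # Assumption: population variance and an absolute difference are used.
    return abs(pvariance(x1) - pvariance(x2)) >= second_preset


def stddev_difference_indicates_drop(x1, x2, third_preset):
    """Return True when |std(X1) - std(X2)| reaches the third preset value."""
    # Assumption: population standard deviation and an absolute difference.
    return abs(pstdev(x1) - pstdev(x2)) >= third_preset


x1 = [2.0, 2.0, 2.0, 2.0]  # flat sequence: variance 0
x2 = [2.0, 2.0, 1.0, 1.0]  # population variance 0.25, standard deviation 0.5

variance_flag = variance_difference_indicates_drop(x1, x2, second_preset=0.2)
stddev_flag = stddev_difference_indicates_drop(x1, x2, third_preset=0.4)
```

Because a drop typically lowers X2 only from the drop position onward, it tends to increase the spread of X2 relative to X1, which is what these two criteria respond to.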
Illustratively, in one possible implementation manner, S307 includes: obtaining a first similarity between the first target amplitude sequence X1 and the second target amplitude sequence X2 according to a preset relational expression, where i is an integer taken from 1 to n, n represents the length of the first target amplitude sequence, b represents the first similarity, min() represents taking a minimum value, $X_{1,i}$ represents the i-th element in the first target amplitude sequence, and $X_{2,i}$ represents the i-th element in the second target amplitude sequence; determining that the audio of the audio system has a dropped point if the first similarity is greater than or equal to a fourth preset value; and determining that the audio of the audio system has no dropped point if the first similarity is less than the fourth preset value.
In the implementation manner, when determining that the audio of the audio system drops based on the similarity of the first target amplitude sequence and the second target amplitude sequence, whether the audio of the audio system drops is determined through the magnitude relation between the similarity between the first target amplitude sequence and the second target amplitude sequence, which is obtained through a preset relation, and a fourth preset value.
The detection device may further determine whether the audio of the audio system is dropped by combining at least two of the following: the magnitude relation between the mean value difference and the first preset value, the magnitude relation between the variance difference and the second preset value, the magnitude relation between the standard deviation difference and the third preset value, and the magnitude relation between the first similarity and the fourth preset value. For example, the audio of the audio system is determined to be dropped if the mean value difference is greater than or equal to the first preset value and the standard deviation difference is greater than or equal to the third preset value; or the audio of the audio system is determined to be dropped if the first similarity is greater than or equal to the fourth preset value and the variance difference is greater than or equal to the second preset value.
In this embodiment, when determining the first amplitude sequence corresponding to the first signal sequence, the second amplitude sequence corresponding to the second signal sequence, and the second target amplitude sequence corresponding to the third signal sequence, obtaining the corresponding amplitude sequences by Hilbert transformation is only one possible implementation manner. Other transformations that have the same property as the Hilbert transformation and can yield the amplitude sequences may also be used, which does not limit the present application.
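One of the combinations mentioned above (mean value difference together with standard deviation difference) can be sketched as follows. The function name, the absolute differences, and the population standard deviation are assumptions made for this example.

```python
from statistics import fmean, pstdev


def combined_drop_decision(x1, x2, first_preset, third_preset):
    """Report a drop only if both the mean and std-dev criteria are met."""
    # Assumption: absolute differences and population statistics are used.
    mean_hit = abs(fmean(x1) - fmean(x2)) >= first_preset
    std_hit = abs(pstdev(x1) - pstdev(x2)) >= third_preset
    return mean_hit and std_hit


x1 = [2.0, 2.0, 2.0, 2.0]  # hypothetical first target amplitude sequence
x2 = [2.0, 2.0, 1.0, 1.0]  # hypothetical second target amplitude sequence

both_met = combined_drop_decision(x1, x2, first_preset=0.25, third_preset=0.25)
one_missed = combined_drop_decision(x1, x2, first_preset=0.25, third_preset=0.75)
```

Requiring two criteria at once trades some sensitivity for fewer false alarms; whether that trade is appropriate depends on the system being tested.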
It is noted that, in the embodiment shown in fig. 3, when the first amplitude sequence and the second amplitude sequence are subjected to linear transformation, the amplitudes at the corresponding positions in the first target amplitude sequence are obtained by adding the amplitudes at the same positions in the first amplitude sequence and the second amplitude sequence, which is just an example of linear transformation, and other linear transformations may also be included. For example, in another linear transformation mode, the first amplitude sequence may be multiplied by the coefficient k1 to obtain a third amplitude sequence, the second amplitude sequence may be multiplied by the coefficient k2 to obtain a fourth amplitude sequence, and then the amplitudes located at the same position in the third amplitude sequence and the fourth amplitude sequence may be added to obtain the amplitudes located at the corresponding position in the first target amplitude sequence. It should be appreciated that the linear transformation of the first and second amplitude sequences should be identical to the linear transformation of the first and second signal sequences. Therefore, when the linear transformation performed on the first amplitude sequence and the second amplitude sequence is the other linear transformation method described above, the linear transformation performed on the first signal sequence and the second signal sequence should also be the other linear transformation method described above.
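The weighted form of the linear transformation can be sketched as below. The function name `linear_combine`, the coefficient values, and the sample data are hypothetical. The point of the example is that the same coefficients k1 and k2 are used on both the amplitude path and the signal path.

```python
def linear_combine(seq_a, seq_b, k1=1.0, k2=1.0):
    """Return k1 * seq_a + k2 * seq_b, element by element."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [k1 * a + k2 * b for a, b in zip(seq_a, seq_b)]


K1, K2 = 2.0, 0.5  # hypothetical coefficients

# Amplitude path: weighted sum of the first and second amplitude sequences.
first_target = linear_combine([1.0, 1.0], [1.0, 1.0], K1, K2)
# Signal path: the same weighted sum applied to the raw signal sequences.
third_signal = linear_combine([0.5, -0.5], [0.5, -0.5], K1, K2)
```

With the default coefficients the function reduces to the plain sum used in the embodiment of fig. 3. Note that, in this sketch, the two target amplitude sequences coincide in the no-drop case only for non-negative k1 and k2, because scaling a signal by k scales its amplitude by the absolute value of k.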
In addition, it should be noted that a linear transformation may also be applied to the first signal sequence or the second signal sequence individually in the embodiments of the present application, which also falls within the concept of the embodiments of the present application. For example, in the embodiment shown in fig. 3, a linear transformation may be applied to the first signal sequence before the Hilbert transformation of the first signal sequence; it should be understood that in this case the same linear transformation should also be applied to the first signal sequence before the linear transformation of the first signal sequence and the second signal sequence. In addition or alternatively, a linear transformation may be applied to the second signal sequence before the Hilbert transformation of the second signal sequence; it should be understood that in this case the same linear transformation should also be applied to the second signal sequence before the linear transformation of the first signal sequence and the second signal sequence.
In addition, it should be noted that the embodiments of the present application do not limit the number of linear transformations performed; any such number falls within the concept of the embodiments of the present application.
For easy understanding, fig. 5 is a flow chart of a method for detecting an audio drop point according to the present application. As shown in fig. 5, the drop detection method includes:
S501, performing Hilbert transform on the first signal sequence to obtain a first analysis signal sequence of the first signal sequence, and performing Hilbert transform on the second signal sequence to obtain a second analysis signal sequence of the second signal sequence.
In implementation, as shown in fig. 5, S501 may include the following steps:
S5011, performing Hilbert transform on the first signal sequence to obtain a first analysis signal sequence of the first signal sequence.
S5012, performing Hilbert transform on the second signal sequence to obtain a second analysis signal sequence of the second signal sequence.
S502, obtaining a first amplitude sequence of the first analysis signal sequence and a second amplitude sequence of the second analysis signal sequence.
In implementation, as shown in fig. 5, S502 may include the following steps:
S5021, obtaining a first amplitude sequence of a first analysis signal sequence.
S5022, obtaining a second amplitude sequence of the second analysis signal sequence.
S503, performing linear transformation on the first amplitude sequence and the second amplitude sequence to obtain a first target amplitude sequence.
S504, performing linear transformation on the first signal sequence and the second signal sequence to obtain a third signal sequence.
S505, performing Hilbert transform on the third signal sequence to obtain a third analysis signal sequence of the third signal sequence.
S506, the amplitude of the third analysis signal sequence is obtained, and a second target amplitude sequence is obtained.
S507, acquiring the audio dropping condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence.
For the concepts involved in this embodiment, reference may be made to the description of the embodiment shown in fig. 3; they are not repeated here.
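The flow S501 to S507 can be sketched end to end as follows. This is a minimal illustration under stated assumptions: the analytic signal is built with a direct DFT using only the standard library, the linear transformation is the plain sum, the decision uses the mean-difference criterion with an arbitrarily chosen first preset value of 0.05, and the drop is simulated by deleting one sample from a generated tone. All function and variable names are hypothetical.

```python
import cmath
import math
from statistics import fmean


def analytic_signal(x):
    """Return the complex analytic signal of a non-empty real sequence x."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # One-sided spectrum: keep DC/Nyquist, double positive frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        weights[k] = 2.0
    return [
        sum(spectrum[k] * weights[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def amplitude(x):
    """Amplitude sequence of x: magnitude of its analytic signal."""
    return [abs(a) for a in analytic_signal(x)]


def detect_drop(s1, s2, first_preset):
    """Follow S501-S507 with a plain sum and the mean-difference criterion."""
    amp1, amp2 = amplitude(s1), amplitude(s2)          # S501, S502
    x1 = [a + b for a, b in zip(amp1, amp2)]           # S503
    s3 = [a + b for a, b in zip(s1, s2)]               # S504
    x2 = amplitude(s3)                                 # S505, S506
    return abs(fmean(x1) - fmean(x2)) >= first_preset  # S507


# Simulated data: 11 cycles per 48 samples (11 kHz at 48 kHz), 96 samples.
reference = [math.sin(2.0 * math.pi * 11 * t / 48) for t in range(97)]
s2 = reference[:96]
s1_clean = reference[:96]
s1_dropped = reference[:40] + reference[41:]  # sample 40 is missing

clean_result = detect_drop(s1_clean, s2, first_preset=0.05)
dropped_result = detect_drop(s1_dropped, s2, first_preset=0.05)
```

In this simulation the clean sequence is not flagged, while the sequence with one missing sample is flagged, because every sample after the drop is phase-shifted relative to the reference. Real measurements contain noise and gain differences, so the preset value would need to be tuned for the system under test.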
Fig. 6 is a schematic structural diagram of an audio drop detection device 600 according to an embodiment of the application. As shown in fig. 6, the apparatus 600 includes an acquisition module 601 and a processing module 602.
The acquisition module 601 is configured to acquire a first signal sequence, where the first signal sequence is a signal sequence obtained by the audio system sampling an input audio signal at a target sampling rate. The acquisition module 601 is further configured to acquire a second signal sequence, and to acquire the amplitude of the first signal sequence and the amplitude of the second signal sequence to obtain a first amplitude sequence and a second amplitude sequence respectively. The processing module 602 is configured to perform a linear transformation on the first amplitude sequence and the second amplitude sequence to obtain a first target amplitude sequence, and to perform the linear transformation on the first signal sequence and the second signal sequence to obtain a third signal sequence. The acquisition module 601 is further configured to acquire the amplitude of the third signal sequence to obtain a second target amplitude sequence. The processing module 602 is further configured to obtain the audio drop condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence.
In a possible implementation manner, the obtaining module 601 is further configured to perform hilbert transformation on the first signal sequence to obtain a first resolved signal sequence of the first signal sequence, and perform hilbert transformation on the second signal sequence to obtain a second resolved signal sequence of the second signal sequence, obtain an amplitude sequence of the first resolved signal sequence to obtain the first amplitude sequence, obtain an amplitude sequence of the second resolved signal sequence to obtain the second amplitude sequence, and correspondingly, the obtaining module 601 is further configured to perform hilbert transformation on the third signal sequence to obtain a third resolved signal sequence of the third signal sequence, and obtain an amplitude sequence of the third resolved signal sequence to obtain the second target amplitude sequence.
In one possible implementation manner, the input audio signal is a sinusoidal digital audio signal or a sinusoidal analog audio signal, the sampling frequency of the sinusoidal digital audio signal is the target sampling rate, when the input audio signal is the sinusoidal digital audio signal, the second signal sequence is the sinusoidal digital audio signal sequence, and when the input audio signal is the sinusoidal analog audio signal, the second signal sequence is a signal sequence obtained by sampling the sinusoidal analog audio signal by a preset audio system using the target sampling rate.
In a possible implementation manner, the processing module 602 is specifically configured to obtain an average value of the amplitudes in the first target amplitude sequence to obtain a first average value, obtain an average value of the amplitudes in the second target amplitude sequence to obtain a second average value, obtain a difference value between the first average value and the second average value to obtain a mean value difference value, determine that the audio of the audio system is dropped when the mean value difference value is greater than or equal to a first preset value, and determine that the audio of the audio system is not dropped when the mean value difference value is less than the first preset value.
In a possible implementation manner, the processing module 602 is specifically configured to obtain a variance of the amplitude in the first target amplitude sequence, obtain a first variance, obtain a variance of the amplitude in the second target amplitude sequence, obtain a difference between the first variance and the second variance, obtain a variance difference, determine that the audio of the audio system has a dropped point if the variance difference is greater than or equal to a second preset value, and determine that the audio of the audio system has not a dropped point if the variance difference is less than the second preset value.
In one possible implementation manner, the processing module 602 is specifically configured to obtain a standard deviation of the amplitudes in the first target amplitude sequence to obtain a first standard deviation, obtain a standard deviation of the amplitudes in the second target amplitude sequence to obtain a second standard deviation, obtain a difference between the first standard deviation and the second standard deviation to obtain a standard deviation difference, determine that the audio of the audio system has a dropped point if the standard deviation difference is greater than or equal to a third preset value, and determine that the audio of the audio system has no dropped point if the standard deviation difference is less than the third preset value.
In a possible implementation manner, the processing module 602 is specifically configured to obtain a first similarity according to a preset relational expression, where i is an integer taken from 1 to n, n represents the length of the first target amplitude sequence, b represents the first similarity, min() represents taking a minimum value, $X_{1,i}$ represents the i-th element in the first target amplitude sequence, and $X_{2,i}$ represents the i-th element in the second target amplitude sequence; determine that the audio of the audio system drops when the first similarity is greater than or equal to a fourth preset value; and determine that the audio of the audio system does not drop when the first similarity is less than the fourth preset value.
Fig. 7 is a schematic structural diagram of a drop point detecting device 700 according to an embodiment of the application. The drop detection device 700 is used to perform the method performed by the detection device above.
The drop detection device 700 includes a processor 710, where the processor 710 is configured to execute a computer program or instructions stored in a memory 720, or to read data stored in the memory 720, to perform the methods in the method embodiments above. Optionally, there may be one or more processors 710.
Optionally, as shown in fig. 7, the drop detection device 700 further comprises a memory 720, where the memory 720 is configured to store computer programs or instructions and/or data. The memory 720 may be integrated with the processor 710 or may be separate from it. Optionally, there may be one or more memories 720.
Optionally, as shown in fig. 7, the drop detection device 700 further includes a communication interface 730, where the communication interface 730 is used for receiving and/or transmitting signals. For example, processor 710 is configured to control the reception and/or transmission of signals by communication interface 730.
Optionally, the drop detection device 700 is configured to implement the operations performed by the detection device in the above method embodiments.
For example, the processor 710 is configured to execute computer programs or instructions stored in the memory 720 to implement the relevant operations of the detection apparatus of the above method embodiments. For example, the processor 710 may be configured to: obtain a first signal sequence, where the first signal sequence is a signal sequence obtained by the audio system sampling an input audio signal at a target sampling rate; obtain a second signal sequence, where the second signal sequence is related to the input audio signal; obtain the amplitude of the first signal sequence and the amplitude of the second signal sequence to obtain a first amplitude sequence and a second amplitude sequence, respectively; perform a linear transformation on the first amplitude sequence and the second amplitude sequence to obtain a first target amplitude sequence; perform the same linear transformation on the first signal sequence and the second signal sequence to obtain a third signal sequence; obtain the amplitude of the third signal sequence to obtain a second target amplitude sequence; and obtain a dropped point condition of the audio of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence.
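The order of operations described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the names `pointwise_abs`, `difference` and `target_amplitude_sequences` are hypothetical, the linear transformation is assumed to be an element-wise difference (the text does not fix a particular transformation), and a pointwise absolute value stands in for the amplitude extraction.

```python
def pointwise_abs(sequence):
    # Placeholder amplitude extractor (assumption): absolute value per sample.
    return [abs(v) for v in sequence]


def difference(a, b):
    # Assumed linear transformation: element-wise difference of two sequences.
    return [x - y for x, y in zip(a, b)]


def target_amplitude_sequences(first_signal, second_signal,
                               amplitude_fn=pointwise_abs,
                               linear_fn=difference):
    """Return (first_target, second_target) amplitude sequences."""
    if len(first_signal) != len(second_signal):
        raise ValueError("signal sequences must have the same length")
    # Path 1: take the amplitudes first, then apply the linear transformation.
    first_target = linear_fn(amplitude_fn(first_signal),
                             amplitude_fn(second_signal))
    # Path 2: apply the same linear transformation to the signals,
    # then take the amplitude of the resulting third signal sequence.
    third_signal = linear_fn(first_signal, second_signal)
    second_target = amplitude_fn(third_signal)
    return first_target, second_target
```

Under these assumptions, two inputs that agree sample for sample yield two all-zero sequences, whereas a discontinuity in the captured signal leaves the first sequence near zero and produces non-zero values in the second.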
In one possible implementation manner, acquiring the amplitude of the first signal sequence and the amplitude of the second signal sequence includes: performing a Hilbert transform on the first signal sequence to obtain a first analytic signal sequence of the first signal sequence; performing a Hilbert transform on the second signal sequence to obtain a second analytic signal sequence of the second signal sequence; acquiring the amplitude sequence of the first analytic signal sequence to obtain the first amplitude sequence; and acquiring the amplitude sequence of the second analytic signal sequence to obtain the second amplitude sequence. Correspondingly, acquiring the amplitude of the third signal sequence to obtain the second target amplitude sequence includes: performing a Hilbert transform on the third signal sequence to obtain a third analytic signal sequence of the third signal sequence; and acquiring the amplitude sequence of the third analytic signal sequence to obtain the second target amplitude sequence.
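As an illustration of the Hilbert-based amplitude extraction, the following Python sketch builds the discrete analytic signal by zeroing the negative-frequency half of the spectrum and then takes its magnitude. The function names are hypothetical, and the naive O(n^2) DFT is used only to keep the example dependency-free; in practice an FFT-based routine such as SciPy's `scipy.signal.hilbert` would normally be preferred.

```python
import cmath
import math


def analytic_signal(x):
    """Discrete analytic signal of a real sequence x (naive DFT, O(n^2))."""
    n = len(x)
    if n == 0:
        return []
    # Forward DFT of the real input.
    spectrum = [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0
    # Inverse DFT of the one-sided spectrum.
    return [sum(weights[k] * spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]


def amplitude_sequence(x):
    # The amplitude (envelope) is the magnitude of the analytic signal.
    return [abs(z) for z in analytic_signal(x)]


# Usage: a sine with a whole number of cycles has a constant envelope.
samples = [0.5 * math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
envelope = amplitude_sequence(samples)
```

For a sine containing a whole number of cycles, as in the usage lines, the envelope is constant and equal to the sine's peak amplitude.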
In one possible implementation manner, the input audio signal is a sinusoidal digital audio signal or a sinusoidal analog audio signal, the sampling frequency of the sinusoidal digital audio signal is the target sampling rate, when the input audio signal is the sinusoidal digital audio signal, the second signal sequence is the sinusoidal digital audio signal sequence, and when the input audio signal is the sinusoidal analog audio signal, the second signal sequence is a signal sequence obtained by sampling the sinusoidal analog audio signal by a preset audio system using the target sampling rate.
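For the digital case, the second signal sequence is simply the sinusoidal sequence itself at the target sampling rate. A minimal sketch of generating such a reference is shown below; the function name `sine_reference` and the example frequency and sampling rate are illustrative assumptions, not values taken from this application.

```python
import math


def sine_reference(freq_hz, sample_rate_hz, num_samples, amplitude=1.0):
    """Sinusoidal digital sequence sampled at sample_rate_hz."""
    step = 2 * math.pi * freq_hz / sample_rate_hz  # phase advance per sample
    return [amplitude * math.sin(step * i) for i in range(num_samples)]


# Usage: a 1 kHz tone at an assumed 48 kHz target sampling rate.
reference = sine_reference(1000, 48000, 480)
```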
In some examples, the processor 710 is further configured to obtain an average value of the magnitudes in the first target magnitude sequence to obtain a first average value, obtain an average value of the magnitudes in the second target magnitude sequence to obtain a second average value, obtain a difference value between the first average value and the second average value to obtain a mean value difference value, determine that the audio of the audio system is dropped when the mean value difference value is greater than or equal to a first preset value, and determine that the audio of the audio system is not dropped when the mean value difference value is less than the first preset value.
In some examples, the processor 710 is further configured to obtain a variance of the magnitudes in the first target magnitude sequence to obtain a first variance, obtain a variance of the magnitudes in the second target magnitude sequence to obtain a second variance, obtain a difference between the first variance and the second variance to obtain a variance difference, determine that the audio of the audio system is dropped if the variance difference is greater than or equal to a second preset value, and determine that the audio of the audio system is not dropped if the variance difference is less than the second preset value.
In some examples, the processor 710 is further configured to obtain a standard deviation of the magnitudes in the first target magnitude sequence, obtain a first standard deviation, obtain a standard deviation of the magnitudes in the second target magnitude sequence, obtain a second standard deviation, obtain a difference between the first standard deviation and the second standard deviation, obtain a difference between the standard deviations, determine that the audio of the audio system is dropped when the difference between the standard deviations is greater than or equal to a third preset value, and determine that the audio of the audio system is not dropped when the difference between the standard deviations is less than the third preset value.
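The mean, variance and standard deviation criteria described above share one shape: compute a statistic of each target amplitude sequence, take the difference, and compare it with a preset value. The sketch below illustrates this with Python's standard `statistics` module. The function name `dropout_by_statistic` is hypothetical, and two choices are assumptions rather than statements from the text: the difference is taken as an absolute value, and population (not sample) variance and standard deviation are used.

```python
import statistics

# Assumption: population statistics; the text does not say which form is used.
_STATISTICS = {
    "mean": statistics.fmean,
    "variance": statistics.pvariance,
    "std": statistics.pstdev,
}


def dropout_by_statistic(first_target, second_target, preset_value,
                         statistic="mean"):
    """Return True when the statistic difference reaches the preset value."""
    stat = _STATISTICS[statistic]
    # Assumption: the difference is compared as an absolute value.
    diff = abs(stat(first_target) - stat(second_target))
    return diff >= preset_value


# Usage with small hand-made sequences.
quiet = [0.0, 0.0, 0.0, 0.0]
disturbed = [0.0, 0.0, 1.0, 1.0]
flagged = dropout_by_statistic(quiet, disturbed, 0.1, "mean")
```

The preset values themselves would have to be chosen for the audio system under test; the 0.1 used here is arbitrary.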
In some examples, the processor 710 is further configured to obtain a first similarity by obtaining a similarity between the first target amplitude sequence and the second target amplitude sequence according to a preset relational expression, where i is an integer and is taken from 1 to n, where n represents a length of the first target amplitude sequence, b represents the similarity, min () represents a minimum value, X 1,i represents an i-th element in the first target amplitude sequence, X 2,i represents an i-th element in the second target amplitude sequence, determine that an audio of the audio system has a dropped point if the first similarity is greater than or equal to a fourth preset value, and determine that the audio of the audio system has no dropped point if the first similarity is less than the fourth preset value.
It should be noted that, the drop point detecting device 700 in fig. 7 may be the detecting device in the foregoing embodiment, or may be a chip, which is not limited herein.
In an embodiment of the present application, the processor is a circuit with signal processing capability. In one implementation, the processor may be a circuit capable of reading and executing instructions, such as a CPU, a microprocessor, a GPU (which may be understood as a kind of microprocessor), or a DSP. In another implementation, the processor may implement a function through the logic relationship of a hardware circuit, where the logic relationship is fixed or reconfigurable; for example, the processor is a hardware circuit implemented as an ASIC or a PLD, such as an FPGA. For a reconfigurable hardware circuit, the process in which the processor loads a configuration file to configure the hardware circuit may be understood as a process in which the processor loads instructions to implement the functions of some or all of the above units. The processor may also be a hardware circuit designed for artificial intelligence, which may be understood as a kind of ASIC, such as an NPU, a TPU, or a DPU.
It will be seen that each of the units in the above apparatus may be one or more processors (or processing circuits) configured to implement the above methods, e.g., CPU, GPU, NPU, TPU, DPU, a microprocessor, DSP, ASIC, FPGA, or a combination of at least two of these processor forms.
Furthermore, the units in the above apparatus may be integrated together in whole or in part, or may be implemented independently. In one implementation, these units are integrated together and implemented in the form of a system-on-a-chip (SOC). The SOC may include at least one processor for implementing any of the methods above or for implementing the functions of the units of the apparatus, where the at least one processor may be of different types, including, for example, a CPU and an FPGA, a CPU and an artificial intelligence processor, a CPU and a GPU, and the like.
Accordingly, embodiments of the present application also provide a computer-readable storage medium storing a computer program, which when executed by a processor causes the processor to perform the steps of the method performed by the detection apparatus of fig. 3.
Accordingly, embodiments of the present application also provide a computer program product comprising a computer program/instructions which, when executed by a processor, cause the processor to carry out the steps of the method performed by the detection apparatus of fig. 3.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (17)

1. A method for detecting audio dropouts, characterized by comprising:
acquiring a first signal sequence, wherein the first signal sequence is a signal sequence obtained by an audio system sampling an input audio signal at a target sampling rate;
acquiring a second signal sequence, the second signal sequence being related to the input audio signal;
acquiring the amplitude of the first signal sequence and the amplitude of the second signal sequence to obtain a first amplitude sequence and a second amplitude sequence, respectively;
performing a linear transformation on the first amplitude sequence and the second amplitude sequence to obtain a first target amplitude sequence;
performing the linear transformation on the first signal sequence and the second signal sequence to obtain a third signal sequence;
acquiring the amplitude of the third signal sequence to obtain a second target amplitude sequence; and
acquiring a dropped point condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence.
2. The method of claim 1, wherein the obtaining the amplitude of the first signal sequence and the amplitude of the second signal sequence to obtain a first amplitude sequence and a second amplitude sequence, respectively, comprises:
performing a Hilbert transform on the first signal sequence to obtain a first analytic signal sequence of the first signal sequence, and performing a Hilbert transform on the second signal sequence to obtain a second analytic signal sequence of the second signal sequence;
acquiring an amplitude sequence of the first analytic signal sequence to obtain the first amplitude sequence;
acquiring an amplitude sequence of the second analytic signal sequence to obtain the second amplitude sequence;
correspondingly, the obtaining the amplitude of the third signal sequence to obtain a second target amplitude sequence includes:
performing a Hilbert transform on the third signal sequence to obtain a third analytic signal sequence of the third signal sequence; and
acquiring an amplitude sequence of the third analytic signal sequence to obtain the second target amplitude sequence.
3. The method according to claim 1 or 2, wherein the input audio signal is a sinusoidal digital audio signal or a sinusoidal analog audio signal, the sampling frequency of the sinusoidal digital audio signal being the target sampling rate;
When the input audio signal is the sinusoidal digital audio signal, the second signal sequence is a sinusoidal digital audio signal sequence, and when the input audio signal is the sinusoidal analog audio signal, the second signal sequence is a signal sequence obtained by sampling the sinusoidal analog audio signal by a preset audio system by using the target sampling rate.
4. The method of claim 3, wherein the obtaining a dropped point condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence comprises:
acquiring an average value of the amplitudes in the first target amplitude sequence to obtain a first average value;
acquiring an average value of the amplitudes in the second target amplitude sequence to obtain a second average value;
obtaining a difference value of the first average value and the second average value to obtain an average value difference value;
determining that the audio of the audio system is dropped when the mean value difference is larger than or equal to a first preset value;
And under the condition that the mean value difference value is smaller than the first preset value, determining that no drop occurs in the audio of the audio system.
5. The method of claim 3, wherein the obtaining a dropped point condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence comprises:
acquiring the variance of the amplitude values in the first target amplitude value sequence to obtain a first variance;
Acquiring the variance of the amplitude values in the second target amplitude value sequence to obtain a second variance;
obtaining a difference value of the first variance and the second variance to obtain a variance difference value;
determining that the audio of the audio system is dropped when the variance difference is greater than or equal to a second preset value;
And under the condition that the variance difference value is smaller than the second preset value, determining that no drop occurs in the audio of the audio system.
6. The method of claim 3, wherein the obtaining a dropped point condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence comprises:
obtaining a standard deviation of the amplitude values in the first target amplitude value sequence to obtain a first standard deviation;
Obtaining a standard deviation of the amplitude values in the second target amplitude value sequence to obtain a second standard deviation;
obtaining a difference value of the first standard deviation and the second standard deviation to obtain a standard deviation difference value;
determining that the audio of the audio system is dropped when the standard deviation difference value is larger than or equal to a third preset value;
And under the condition that the standard deviation difference value is smaller than the third preset value, determining that no drop occurs in the audio of the audio system.
7. The method of claim 3, wherein the obtaining a dropped point condition of the audio system based on the similarity of the first target amplitude sequence and the second target amplitude sequence comprises:
Obtaining the similarity between the first target amplitude sequence and the second target amplitude sequence according to a preset relational expression to obtain first similarity, wherein i is an integer and is taken from 1 to n, n represents the length of the first target amplitude sequence, b represents the similarity, min () represents a minimum value, X 1,i represents an ith element in the first target amplitude sequence, and X 2,i represents an ith element in the second target amplitude sequence;
Determining that the audio of the audio system is dropped when the first similarity is larger than or equal to a fourth preset value;
And under the condition that the first similarity is smaller than the fourth preset value, determining that no drop occurs in the audio of the audio system.
8. An audio drop detection device, which is characterized by comprising:
The acquisition module is used for acquiring a first signal sequence, wherein the first signal sequence is a signal sequence obtained by acquiring an input audio signal by using a target sampling rate by an audio system;
The acquisition module is further configured to acquire a second signal sequence, where the second signal sequence is related to the input audio signal;
the acquisition module is further used for acquiring the amplitude of the first signal sequence and the amplitude of the second signal sequence to obtain a first amplitude sequence and a second amplitude sequence respectively;
The processing module is used for carrying out linear transformation on the first amplitude sequence and the second amplitude sequence to obtain a first target amplitude sequence;
the processing module is further configured to perform the linear transformation on the first signal sequence and the second signal sequence to obtain a third signal sequence;
the acquisition module is further configured to acquire an amplitude of the third signal sequence, so as to obtain a second target amplitude sequence;
the processing module is further configured to obtain a drop-off condition of audio of the audio system based on similarity of the first target amplitude sequence and the second target amplitude sequence.
9. The apparatus of claim 8, wherein the acquisition module is further configured to:
perform a Hilbert transform on the first signal sequence to obtain a first analytic signal sequence of the first signal sequence, and perform a Hilbert transform on the second signal sequence to obtain a second analytic signal sequence of the second signal sequence;
acquire an amplitude sequence of the first analytic signal sequence to obtain the first amplitude sequence;
acquire an amplitude sequence of the second analytic signal sequence to obtain the second amplitude sequence;
correspondingly, the acquisition module is further configured to:
perform a Hilbert transform on the third signal sequence to obtain a third analytic signal sequence of the third signal sequence; and
acquire an amplitude sequence of the third analytic signal sequence to obtain the second target amplitude sequence.
10. The apparatus according to claim 8 or 9, wherein the input audio signal is a sinusoidal digital audio signal or a sinusoidal analog audio signal, the sampling frequency of the sinusoidal digital audio signal being the target sampling rate;
When the input audio signal is the sinusoidal digital audio signal, the second signal sequence is a sinusoidal digital audio signal sequence, and when the input audio signal is the sinusoidal analog audio signal, the second signal sequence is a signal sequence obtained by sampling the sinusoidal analog audio signal by a preset audio system by using the target sampling rate.
11. The apparatus of claim 10, wherein the processing module is specifically configured to:
acquiring an average value of the amplitudes in the first target amplitude sequence to obtain a first average value;
acquiring an average value of the amplitudes in the second target amplitude sequence to obtain a second average value;
obtaining a difference value of the first average value and the second average value to obtain an average value difference value;
determining that the audio of the audio system is dropped when the mean value difference is larger than or equal to a first preset value;
And under the condition that the mean value difference value is smaller than the first preset value, determining that no drop occurs in the audio of the audio system.
12. The apparatus of claim 10, wherein the processing module is specifically configured to:
acquiring the variance of the amplitude values in the first target amplitude value sequence to obtain a first variance;
Acquiring the variance of the amplitude values in the second target amplitude value sequence to obtain a second variance;
obtaining a difference value of the first variance and the second variance to obtain a variance difference value;
determining that the audio of the audio system is dropped when the variance difference is greater than or equal to a second preset value;
And under the condition that the variance difference value is smaller than the second preset value, determining that no drop occurs in the audio of the audio system.
13. The apparatus of claim 10, wherein the processing module is specifically configured to:
obtaining a standard deviation of the amplitude values in the first target amplitude value sequence to obtain a first standard deviation;
Obtaining a standard deviation of the amplitude values in the second target amplitude value sequence to obtain a second standard deviation;
obtaining a difference value of the first standard deviation and the second standard deviation to obtain a standard deviation difference value;
determining that the audio of the audio system is dropped when the standard deviation difference value is larger than or equal to a third preset value;
And under the condition that the standard deviation difference value is smaller than the third preset value, determining that no drop occurs in the audio of the audio system.
14. The apparatus of claim 10, wherein the processing module is specifically configured to:
Obtaining the similarity between the first target amplitude sequence and the second target amplitude sequence according to a preset relational expression to obtain first similarity, wherein i is an integer and is taken from 1 to n, n represents the length of the first target amplitude sequence, b represents the similarity, min () represents a minimum value, X 1,i represents an ith element in the first target amplitude sequence, and X 2,i represents an ith element in the second target amplitude sequence;
Determining that the audio of the audio system is dropped when the first similarity is larger than or equal to a fourth preset value;
And under the condition that the first similarity is smaller than the fourth preset value, determining that no drop occurs in the audio of the audio system.
15. An audio system comprising the apparatus of any one of claims 8 to 14.
16. A computer readable storage medium storing instructions for computer execution, which when executed, cause the method of any one of claims 1 to 7 to be performed.
17. A computer program product comprising computer programs/instructions which, when executed by a processor, cause the method of any of claims 1 to 7 to be performed.
CN202210489774.XA 2022-05-06 2022-05-06 A method and device for detecting audio dropouts Active CN114898774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210489774.XA CN114898774B (en) 2022-05-06 2022-05-06 A method and device for detecting audio dropouts

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210489774.XA CN114898774B (en) 2022-05-06 2022-05-06 A method and device for detecting audio dropouts

Publications (2)

Publication Number Publication Date
CN114898774A CN114898774A (en) 2022-08-12
CN114898774B true CN114898774B (en) 2025-06-13

Family

ID=82720534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210489774.XA Active CN114898774B (en) 2022-05-06 2022-05-06 A method and device for detecting audio dropouts

Country Status (1)

Country Link
CN (1) CN114898774B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106791825A (en) * 2016-12-23 2017-05-31 深圳创维数字技术有限公司 A kind of audio automated testing method and terminal
CN113077821A (en) * 2021-03-23 2021-07-06 平安科技(深圳)有限公司 Audio quality detection method and device, electronic equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101608947B (en) * 2008-06-19 2012-05-16 鸿富锦精密工业(深圳)有限公司 Sound testing method
JP6136218B2 (en) * 2012-12-03 2017-05-31 富士通株式会社 Sound processing apparatus, method, and program
WO2015059947A1 (en) * 2013-10-22 2015-04-30 日本電気株式会社 Speech detection device, speech detection method, and program
CN106887241A (en) * 2016-10-12 2017-06-23 阿里巴巴集团控股有限公司 A kind of voice signal detection method and device
CN113035223B (en) * 2021-03-12 2023-11-14 北京字节跳动网络技术有限公司 Audio processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN114898774A (en) 2022-08-12

Similar Documents

Publication Publication Date Title
CN111506731B (en) Method, device and equipment for training field classification model
CN109145981B (en) Deep learning automatic model training method and equipment
CN114219306B (en) Method, apparatus, medium for establishing welding quality detection model
US20240310191A1 (en) Method and Apparatus for Global Phase In-phase/Quadrature Demodulation of Optical fiber DAS data
CN117171696B (en) Sensor production monitoring method and system based on Internet of things
CN112948937B (en) Intelligent pre-judging method and device for concrete strength
CN118041255A (en) Signal noise reduction method and system for double-channel adjustable analog signal amplifier
CN114898774B (en) A method and device for detecting audio dropouts
US20240038278A1 (en) Method and device for timing alignment of audio signals
CN113299298B (en) Residual error unit, network and target identification method, system, device and medium
CN108093356B (en) Howling detection method and device
CN112380125B (en) Recommendation method and device for test cases, electronic equipment and readable storage medium
CN110032624B (en) Sample screening method and device
CN114093376A (en) Method and device for identifying audio data packaging format, storage medium and equipment
CN115758377B (en) Container vulnerability processing method and device, storage medium and electronic equipment
CN111489739A (en) Phoneme recognition method and device and computer readable storage medium
CN115033179B (en) Data storage method, device, equipment and medium
CN115187191B (en) A scientific research project progress monitoring method and system based on teaching centralized control management
CN115857467A (en) Fault diagnosis method and device based on multi-head convolution and differential self-attention
CN115510998A (en) Transaction abnormal value detection method and device
CN116072147A (en) Music detection model training method and device, electronic equipment and storage medium
US20220335964A1 (en) Model generation method, model generation apparatus, and program
CN113627681A (en) Data prediction method and device based on prediction model, computer equipment and medium
CN118585133B (en) Data security storage processing method and system for database server
CN119164842B (en) Atmospheric particulate matter analysis method and device based on high and low accelerating voltage coupling analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant