WO2010009659A1 - Encoding/decoding method, apparatus and system
Encoding/decoding method, apparatus and system
- Publication number
- WO2010009659A1, PCT/CN2009/072793, CN2009072793W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- code stream
- signal
- signals
- stereo
- core
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0033—Recording/reproducing or transmission of music for electrophonic musical instruments
- G10H1/0041—Recording/reproducing or transmission of music for electrophonic musical instruments in coded form
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
Definitions
- The present invention relates to the field of audio coding and decoding, and more particularly to an encoding/decoding method, apparatus and system.
- Wideband voice and music transmission is becoming more and more common, and wideband voice/music coding technology has developed along with the transmission of wideband audio data.
- The existing wideband voice/music codec technology is mainly implemented as follows:
- The input monophonic voice/music signal is narrowband encoded in the time domain to obtain a core code stream.
- One copy of the core code stream is sent directly to the receiving end, while another copy is decoded and differenced against the original input monophonic voice/music signal to obtain a residual signal.
- The residual signal is encoded in the frequency domain and then transmitted to the receiving end.
- The receiving end receives the core code stream and the residual signal, restores the voice/music signal in the reverse order of the transmitting end, and outputs it.
- Embodiments of the present invention provide a codec method, apparatus, and system capable of improving the restoration of a voice/music signal.
- An encoding method including:
- downmixing the signals of the left and right channels and performing narrowband coding to obtain a core code stream;
- a decoding method including:
- An encoding device comprising:
- a narrowband coding unit configured to downmix the signals of the left and right channels and perform narrowband coding to obtain a core code stream;
- a residual signal intercepting unit configured to decode the core code stream and compare it with the signals of the original left and right channels to obtain residual signals of the left and right channels respectively;
- a stereo processing unit configured to obtain a stereo code stream and a spreading code stream from the residual signals of the left and right channels;
- a decoding device comprising:
- a demultiplexing unit configured to demultiplex the received encoded code stream into a core code stream, a stereo code stream, and a spreading code stream;
- a narrowband decoding unit configured to restore the core code stream to a narrowband mono signal by narrowband decoding
- a spreading code stream decoding unit configured to extend and decode the spreading code stream
- the left and right channel signal restoration unit is configured to restore the left and right channel signals according to the stereo code stream, the narrowband mono signal, and the extended decoded stream.
- a codec system comprising:
- an encoding module configured to narrowband-encode the left and right channel input signals and to stereo-encode the residual signals of the left and right channels, and to send the result;
- a decoding module configured to restore the left and right channel signals and the narrowband mono signal according to the narrowband-encoded left and right channel input signals and the stereo-encoded residual signals of the left and right channels.
- In the codec method, device and system provided by the embodiments of the present invention, the residual signals of the left and right channels are extracted separately at the transmitting end, the residual signals are then stereo processed, and the core code stream is sent to the receiving end together with the stereo-processed residual signals. The receiving end can restore the left and right channel signals from the core code stream and the stereo-processed residual signals, so that when the input is a multi-source voice/music signal, compared with the prior-art monophonic voice/music codec scheme, the embodiments of the present invention can improve the restoration of the voice/music signal through the difference between the left and right channel signals.
- FIG. 1 is a transmitting end coding process of a first embodiment of the method according to the present invention;
- FIG. 3 is a transmitting end coding process of a second embodiment of the method according to the present invention;
- FIG. 5 shows the difference between the codec output and the original signal for a 16 kHz clean speech signal in AMR-WB mode 2; FIG. 6 shows the difference between the codec output and the original signal for 16 kHz female singing in AMR-WB mode 1; FIG. 7 is a structural diagram of a transmitting end of the first embodiment of the apparatus of the present invention.
- Figure 8 is a structural diagram of a receiving end of a first embodiment of the apparatus of the present invention.
- Figure 9 is a structural diagram of a transmitting end of a second embodiment of the apparatus of the present invention.
- Figure 10 is a structural diagram of a receiving end of a second embodiment of the apparatus of the present invention.
- FIG. 11 is a structural diagram of a transmitting end of a system embodiment of the present invention.
- FIG. 12 is a structural diagram of a receiving end of a system embodiment of the present invention.
- The embodiments of the present invention are mainly directed to wideband voice and music, and propose a stereo codec scheme.
- the method, device and system for encoding and decoding according to embodiments of the present invention are described in detail below with reference to the accompanying drawings.
- a first implementation of the codec method of the present invention is as follows:
- the transmitting end coding method is as shown in FIG. 1 and includes:
- S101. Downmix the signals of the left and right channels and perform narrowband coding to obtain a core code stream.
- A significant difference between the embodiments of the present invention and the prior art is that signal processing is performed for the left and right channels, whereas the prior art only processes a monophonic voice/music signal. This step combines the two input signals of the left and right channels into one signal, mainly so that a single narrowband coding pass can be applied to save system resources.
- This step decodes the narrowband-encoded core code stream back into a downmix signal.
- This narrowband-encoded signal is distorted. By differencing this distorted data against the input signals of the original left and right channels that have not been narrowband encoded, that is, by subtracting, the part of the data lost in narrowband coding, i.e. the wideband data outside the narrow band, can be obtained. This wideband data is the residual signal finally obtained in this step.
- As in the prior art, the core code stream is decoded and differenced against the original input signal. The difference here is that the input of this embodiment is two-channel, so the decoded core code stream must be differenced against the original left and right channel input signals respectively, obtaining the left channel residual signal and the right channel residual signal.
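As a concrete illustration of this residual-extraction step, here is a minimal Python sketch; `core_encode` and `core_decode` are hypothetical placeholders for the narrowband codec (the text does not prescribe a particular coder), and the delay alignment and resampling discussed later are omitted.

```python
import numpy as np

def extract_residuals(left, right, core_encode, core_decode):
    """Compute per-channel residuals against the locally decoded core layer."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    downmix = 0.5 * (left + right)        # merge the two channels into one signal
    core_stream = core_encode(downmix)    # narrowband core code stream
    decoded = core_decode(core_stream)    # decoded (distorted) downmix
    res_left = left - decoded             # data lost by narrowband coding, left channel
    res_right = right - decoded           # data lost by narrowband coding, right channel
    return core_stream, res_left, res_right
```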
- In the prior art, the residual signal of the single-channel input is simply extension-encoded in the frequency domain and the extended code stream is transmitted.
- In this embodiment, the residual signals are first stereo-encoded in the frequency domain to extract the stereo code stream, and are then extension-encoded to obtain the extended code stream.
- The frequency-domain input consists of the left and right residual signals, one more channel than before. Since the input is no longer a single signal, the two channels can convey the stereo effect through the slight differences between them; that is, multiple sound sources can be distinguished.
- Whether the difference between the residual signals of the left and right channels can be expressed is the key to whether the stereo image can be restored. If the two residual signals were simply downmixed at this point and then extension-encoded and output, the left and right channel residual signals could not be restored at the receiving end from the single extended code stream.
- The stereo coding in this step extracts this difference; combined with the extended code stream, it allows the residual signals of the left and right channels to be restored at the receiving end.
- this step brings the following benefits:
- the main input signals of the left and right channels are not subjected to time-frequency conversion, and subsequent stereo processing is not performed, which reduces system complexity and delay.
- the core code stream, the stereo code stream, and the extended code stream are multiplexed into one code stream and transmitted.
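A minimal sketch of this multiplexing step and its inverse at the receiving end is shown below; the length-prefixed layout is purely an assumption, since the text does not specify a bitstream syntax.

```python
import struct

def multiplex(core: bytes, stereo: bytes, extended: bytes) -> bytes:
    """Pack the three code streams into one frame with 32-bit length prefixes."""
    out = bytearray()
    for payload in (core, stereo, extended):
        out += struct.pack(">I", len(payload)) + payload
    return bytes(out)

def demultiplex(frame: bytes):
    """Recover (core, stereo, extended) from a frame built by multiplex()."""
    streams, offset = [], 0
    for _ in range(3):
        (length,) = struct.unpack_from(">I", frame, offset)
        offset += 4
        streams.append(frame[offset:offset + length])
        offset += length
    return tuple(streams)
```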
- The receiving end decoding method is shown in Figure 2 and includes: S201. Demultiplex the received encoded code stream into a core code stream, a stereo code stream, and a spreading code stream. This step corresponds to the multiplexing at the transmitting end.
- Since the core code stream has not undergone time-frequency conversion or stereo processing, it can be restored to a narrowband mono signal by narrowband decoding alone.
- the process is simple and the system delay is minimized.
- S204. Restore the left and right channel signals from the stereo code stream, the narrowband mono signal, and the extension-decoded extended code stream.
- The core code stream provides the main component; together with the residual signals recovered from the stereo code stream and the extension-decoded code stream, the stereo signals of the left and right channels can be restored.
- In this embodiment, the residual signals of the left and right channels are extracted separately at the transmitting end, the residual signals are then stereo processed, and the core code stream is sent to the receiving end together with the stereo-processed residual signals.
- The receiving end can restore the left and right channel signals from the core code stream and the stereo-processed residual signals, so that when the input is a multi-source voice/music signal, compared with the prior-art monophonic voice/music codec solution, the embodiment of the present invention can restore the stereo effect of the original multi-source voice/music signal through the difference between the left and right channel signals.
- The present embodiment is designed as a stereo voice/music codec scheme with two-channel input and output, but the same design idea can be applied to stereo input/output designs with more channels.
- a second implementation of the codec method of the present invention is as follows:
- the transmitting end coding method is as shown in FIG. 3, including:
- S301. Downmix the left and right channel signals into one signal, and perform band-pass filtering and downsampling to bring the signal from the input sampling rate down to the internal sampling rate of the core encoder. S302. Perform core coding on the band-pass filtered and downsampled signal to obtain the core code stream.
- The signal obtained after core coding and decoding always differs from the downmix signal M.
- For example, a 16 kHz monophonic speech signal is encoded and decoded with AMR-WB, whose internal sampling rate is 12.8 kHz, in mode 2 (12.65 kbps), and the decoded signal is subtracted from the original signal (after a 6 ms delay adjustment). The difference signal is shown in Figure 5 (the middle horizontal line in Figure 5 represents the reference signal M, and the signal near the horizontal line is the difference from M); it can be seen that the difference from M is very significant. For 16 kHz female singing the difference is even more significant, as shown in Figure 6 (the middle horizontal line in Figure 6 represents the reference signal M, and the signal near the horizontal line is the difference from M).
- This step is an improvement made to eliminate this error.
- The decoded downmix signal is upsampled to the sampling rate of the original left and right channel input signals.
- It can be seen from step S303 that the residual signals of the left and right channels are not identical, so this step must obtain the residual signals of the left and right channels separately for stereo coding.
- The purpose of the time-frequency transform is to perform stereo processing on the signal in the frequency domain. If stereo processing were performed in the time domain, the existing time-domain stereo techniques would predict one channel from the other by means of linear regression and prediction filters; their resolution for stereo signals containing multiple sound sources is not high, and the direct result is a very poor stereo effect. Frequency-domain processing effectively avoids the problems encountered in the time domain and gives good sound-image separation of multiple sound sources.
- The time-frequency transform is applied to the residual signals so that the next step, stereo processing, can be carried out in the frequency domain. This preserves the sound-image resolution better than stereo processing of the residual signals in the time domain.
- The stereo code stream mainly consists of the phase difference, level difference, correlation, and maximum-correlation rotation angle between the two residual signals.
- Stereo encoding is done in the frequency domain, which reduces complexity and reduces system latency.
- Stereo information of the residual signals is extracted per sub-band, such as the Inter-channel Phase Difference (IPD), the Inter-channel Level Difference (ILD), and the Inter-channel Coherence (IC) extracted by the parametric stereo method, or the maximum-correlation rotation angle extracted by the maximal-correlation stereo method.
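For illustration, a sketch of per-band extraction of these parameters from the left and right residual spectra follows; the exact formulas are assumptions based on common parametric-stereo practice, since the text only names the parameters.

```python
import numpy as np

def stereo_parameters(left_spec, right_spec, band_edges):
    """Per-band IPD, ILD and IC from complex left/right residual spectra.

    band_edges: list of (start, stop) FFT-bin indices defining the sub-bands.
    """
    params = []
    for lo, hi in band_edges:
        l, r = left_spec[lo:hi], right_spec[lo:hi]
        cross = np.sum(l * np.conj(r))
        e_l = np.sum(np.abs(l) ** 2)
        e_r = np.sum(np.abs(r) ** 2)
        ipd = np.angle(cross)                                 # inter-channel phase difference
        ild = 10.0 * np.log10((e_l + 1e-12) / (e_r + 1e-12))  # level difference in dB
        ic = np.abs(cross) / np.sqrt(e_l * e_r + 1e-12)       # inter-channel coherence
        params.append((ipd, ild, ic))
    return params
```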
- the low frequency and high frequency portions use different quantization and entropy coding methods to reflect different characteristics of the low frequency residual signal and the high frequency residual signal.
- the spread code stream includes spectrally encoded quantized data.
- the core code stream, the stereo code stream, and the extended code stream are multiplexed into one coded code stream and sent.
- the receiving end decoding method is shown in Figure 4, including:
- S401. Demultiplex the received coded code stream into a core code stream, a stereo code stream, and a spreading code stream. Since the three streams require different subsequent processing, this step first separates them by demultiplexing. S402. Perform core decoding on the core code stream.
- The processing path for the core code stream is the shortest, which helps reduce the delay.
- S404. Extension-decode the extended code stream, perform the inverse time-frequency transform, and combine the inverse-transformed signal with the narrowband mono signal into a wideband mono signal.
- Although the spreading code stream undergoes a relatively complicated processing flow at the transmitting end, the amount of data involved is relatively small, so the delay and phase distortion of the wideband mono signal formed after combination with the narrowband mono signal can be kept to a minimum.
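A sketch of the combination performed in S404, under the 12.8 kHz internal rate and 16 kHz output rate used in the AMR-WB example above; delay and gain alignment are omitted, and the helper is an assumption rather than the patent's exact procedure.

```python
import numpy as np
from scipy.signal import resample_poly

def wideband_mono(mono_nb, ext_time, fs_core=12800, fs_out=16000):
    """Upsample the narrowband mono signal and add the extension-layer signal."""
    g = np.gcd(fs_out, fs_core)                       # 3200 for 12.8 kHz -> 16 kHz
    mono_up = resample_poly(np.asarray(mono_nb, dtype=float), fs_out // g, fs_core // g)
    ext_time = np.asarray(ext_time, dtype=float)
    n = min(len(mono_up), len(ext_time))              # crude length alignment
    return mono_up[:n] + ext_time[:n]
```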
- The stereo code stream and the extension-decoded code stream are subjected to stereo decoding to obtain the residual frequency-domain signals of the left and right channels.
- The reason the left and right channel residual frequency-domain signals are split into a stereo code stream and a spreading code stream for transmission is that this reduces the amount of data transmitted between the transmitting end and the receiving end, at the cost of adding this processing step.
- The left and right channel residual frequency-domain signals undergo the inverse time-frequency transform to obtain the residual signals of the left and right channels.
- The left and right channel residual frequency-domain signals are converted into time-domain signals so that they can be combined with the time-domain narrowband mono signal to obtain the final left and right channel output signals.
- The residual signal of the left channel is combined with the narrowband mono signal to obtain the left channel signal, and the right channel is handled in the same way.
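The final synthesis step can be sketched as below; signal lengths are assumed to be already aligned (a real implementation would compensate for codec delay).

```python
import numpy as np

def synthesize_channels(mono, res_left, res_right):
    """Add each channel's time-domain residual back onto the decoded mono signal."""
    mono = np.asarray(mono, dtype=float)
    res_left = np.asarray(res_left, dtype=float)
    res_right = np.asarray(res_right, dtype=float)
    n = min(len(mono), len(res_left), len(res_right))
    return mono[:n] + res_left[:n], mono[:n] + res_right[:n]
```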
- the second embodiment has the following advantages:
- When a multi-source voice/music signal is input, compared with the prior-art monophonic codec scheme, the second embodiment can restore the stereo effect of the original multi-source voice/music signal through the difference between the left and right channel signals.
- Stereo processing is performed in the frequency domain, where stereo information can conveniently be extracted per sub-band.
- The sound sources contained in the signal are usually distributed in different frequency bands, so sub-band processing can separate sound sources distributed in different frequency bands.
- The frequency bands are divided according to the nonlinear characteristics of the human ear, so even if different sound sources appear in the same band, they are perceived as a single source because of the limited resolution of the human ear.
- The subsequent frequency-domain processing, including stereo coding, downmixing of the two residual signals, and extension coding, is performed on the same set of frequency bands. This avoids the frequent forward and inverse transforms between different frequency bands that arise when different processing steps are performed on different frequency bands, thereby reducing windowing operations and buffering.
- The processing flow is correspondingly reduced, which lowers the overall codec complexity and system complexity.
- the transmission signal is divided into three parts: a core code stream, a spreading code stream, and a stereo code stream.
- The narrowband mono signal can be obtained from the core code stream alone, without the extended code stream or the stereo code stream; the wideband mono signal can be obtained from the core code stream and the extended code stream, without the stereo code stream; and with all three transmitted streams, the wideband stereo signal can be reconstructed.
- This embodiment can therefore adapt well to the transmission line environment. If the actual transmission line bandwidth is limited and only a narrowband signal can be transmitted, then with the codec method of this embodiment the narrowband mono signal alone can still be transmitted.
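The layered structure described above can be sketched as follows. The `dec` object bundles hypothetical decoder callables (`core`, `extend`, `combine`, `stereo`); they are placeholders for the steps described in this embodiment, not an API defined by the patent, and whether the residuals are added to the narrowband or the wideband mono signal is an assumption here.

```python
def decode_layers(streams, dec):
    """Decode whichever code streams were actually received.

    streams: dict that may contain 'core', 'extended' and 'stereo' payloads.
    dec: object bundling placeholder decoder callables.
    """
    mono_nb = dec.core(streams["core"])                 # narrowband mono, always available
    if "extended" not in streams:
        return {"narrowband_mono": mono_nb}
    ext = dec.extend(streams["extended"])               # extension-layer signal
    mono_wb = dec.combine(mono_nb, ext)                 # wideband mono
    if "stereo" not in streams:
        return {"wideband_mono": mono_wb}
    res_l, res_r = dec.stereo(streams["stereo"], ext)   # left/right residual signals
    return {"left": mono_wb + res_l, "right": mono_wb + res_r}
```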
- a first embodiment of the codec device of the present invention is as follows:
- the transmitting end coding device is as shown in FIG. 7, and includes:
- Narrowband coding unit 1 is configured to downmix the signals of the left and right channels and then perform narrowband coding to obtain the core code stream.
- This unit operates as in the prior art, mainly merging the two input signals of the left and right channels into one signal for unified narrowband processing.
- The residual signal intercepting unit 2 is configured to decode the core code stream produced by the narrowband coding unit 1 and compare it with the signals of the original left and right channels, obtaining the residual signals of the left and right channels respectively.
- This unit decodes the narrowband-encoded core code stream back into a downmix signal.
- This narrowband-encoded signal is distorted. The distorted data is differenced against, that is, subtracted from, the original left and right channel input signals that have not been narrowband encoded, yielding the part of the data lost in narrowband coding, i.e. the wideband data outside the narrow band, which is the residual signal finally obtained by this unit.
- As in the prior art, the core code stream is decoded and differenced against the original input signal. The difference here is that the input of this embodiment is two-channel, so the decoded core code stream must be differenced against the original left and right channel input signals respectively, obtaining the left channel residual signal and the right channel residual signal.
- Stereo processing unit 3 is configured to obtain a stereo code stream and a spreading code stream from the residual signals of the left and right channels obtained by the residual signal intercepting unit 2.
- In the prior art, the residual signal of the single-channel input is simply extension-encoded in the frequency domain and the extended code stream is transmitted.
- In this embodiment, the residual signals are first stereo-encoded in the frequency domain to extract the stereo code stream, and are then extension-encoded to obtain the extended code stream.
- The frequency-domain input consists of the left and right residual signals, one more channel than before. Since the input is no longer a single signal, the two channels can convey the stereo effect through the slight differences between them; that is, multiple sound sources can be distinguished.
- Whether the difference between the residual signals of the left and right channels can be expressed is the key to whether the stereo image can be restored. If the two residual signals were simply downmixed at this point and then extension-encoded and output, the left and right channel residual signals could not be restored at the receiving end from the single extended code stream.
- This unit brings the following benefits: the main input signals of the left and right channels are not subjected to time-frequency conversion or subsequent stereo processing, which reduces system complexity and delay.
- Multiplexing unit 4 is configured to multiplex the core code stream, the stereo code stream and the extended code stream into one coded code stream for transmission.
- the receiving end decoding device is as shown in FIG. 8, and includes:
- Demultiplexing unit 5 for demultiplexing the received encoded code stream into a core code stream, a stereo code stream, and a spreading code stream.
- This unit corresponds to the transmitting side multiplexing unit 4.
- Narrowband decoding unit 6 is configured to restore the core code stream obtained by the demultiplexing unit 5 to a narrowband mono signal by narrowband decoding.
- Since the core code stream has not undergone time-frequency conversion or stereo processing, it can be restored to a narrowband mono signal by narrowband decoding alone.
- the process is simple and the system delay is minimized.
- The extended code stream decoding unit 7 is configured to perform extension decoding on the spreading code stream obtained by the demultiplexing unit 5.
- Although the spreading code stream undergoes a relatively complicated processing flow at the transmitting end, the amount of data involved is relatively small, so the delay and phase distortion of the wideband mono signal formed after combination with the narrowband mono signal can be kept to a minimum.
- Left and right channel signal restoration unit 8 is configured to restore the left and right channel signals according to the stereo code stream, the narrowband mono signal, and the extension-decoded code stream.
- In this way, the residual signals of the left and right channels are extracted separately at the transmitting end, the residual signals are then stereo processed, and the core code stream is sent to the receiving end together with the stereo-processed residual signals.
- At the receiving end, the left and right channel signals can be restored from the core code stream and the stereo-processed residual signals, thus completing the stereo encoding and decoding process for the two-channel voice/music signal.
- A second embodiment of the codec device of the present invention is as follows:
- the transmitting end coding device is as shown in FIG. 9, and includes:
- Narrowband coding unit 1 is configured to downmix the signals of the left and right channels and then perform narrowband coding to obtain the core code stream.
- Residual signal intercepting unit 2 is configured to decode the core code stream obtained by the narrowband coding unit 1 and compare it with the signals of the original left and right channels, obtaining the residual signals of the left and right channels respectively.
- Stereo processing unit 3 is configured to perform stereo coding and extension coding in the frequency domain on the residual signals of the left and right channels obtained by the residual signal intercepting unit 2; stereo coding yields the stereo code stream and extension coding yields the extended code stream.
- The multiplexing unit 4 is configured to multiplex the stereo code stream and the extended code stream obtained by the stereo processing unit 3 and the core code stream obtained by the narrowband coding unit 1 into one coded code stream for transmission.
- the narrowband coding unit 1 includes:
- Downmix sub-unit 11 is configured to mix the input left and right channel signals evenly into one signal, and to perform band-pass filtering and downsampling.
- The downsampling changes the mono signal from the input sampling rate to the internal sampling rate, for example converting a 16 kHz mono signal into a 12.8 kHz mono signal.
- The core coding sub-unit 12 is configured to perform core coding on the band-pass filtered and downsampled signal from the downmix sub-unit 11 to obtain the core code stream.
- The core coding sub-unit 12 may be a low-rate speech coder, such as the core encoder of AMR-WB or G.729.1; its input is a mono signal at the internal sampling rate and its output is the core encoded data.
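A sketch of the processing done by the downmix sub-unit 11 is shown below, assuming a 16 kHz input rate and the 12.8 kHz internal rate mentioned above; folding the band-pass filtering into the polyphase resampler's anti-aliasing filter is an implementation choice, not something the text prescribes.

```python
import numpy as np
from scipy.signal import resample_poly

def downmix_and_downsample(left, right, fs_in=16000, fs_core=12800):
    """Average the two channels and resample to the core coder's internal rate."""
    mono = 0.5 * (np.asarray(left, dtype=float) + np.asarray(right, dtype=float))
    g = np.gcd(fs_core, fs_in)                # 3200 for 16 kHz -> 12.8 kHz (ratio 4/5)
    # The resampler's low-pass design stands in for the band-pass filter here.
    return resample_poly(mono, fs_core // g, fs_in // g)
```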
- The residual signal intercepting unit 2 includes: a signal restoration sub-unit 21, configured to decode the core code stream into a downmix signal.
- The upsampling sub-unit 22 is configured to upsample the downmix signal restored by the signal restoration sub-unit 21 to the sampling rate of the original left and right channel input signals; upsampling is the inverse of downsampling.
- The residual processing sub-unit 23 is configured to subtract the upsampled downmix signal obtained by the upsampling sub-unit 22 from the original left and right channel input signals respectively, to obtain the residual signals of the left and right channels.
- the stereo processing unit 3 includes:
- the time-frequency transform sub-unit 31 is configured to perform time-frequency transform on the residual signals of the left and right channels to obtain residual frequency domain signals of the left and right channels.
- The time-frequency transform used by sub-unit 31 can be of two types, a complex transform or a real transform: the former is the FFT and the latter is the Modified Discrete Cosine Transform (MDCT).
- For example, when the extension encoder uses TCX, the transform is the FFT and the residual signals of the left and right channels are likewise transformed into the complex frequency domain by the FFT; when the extension encoder uses MPEG-2/4 AAC, the transform is the MDCT and the residual signals of the left and right channels are likewise transformed into the real frequency domain by the MDCT.
- Stereo encoding sub-unit 32 is configured to stereo-encode the left and right channel residual frequency-domain signals obtained by the time-frequency transform sub-unit 31 to obtain the stereo code stream.
- Downmixing and spreading coding sub-unit 33 is configured to downmix the left and right channel residual frequency-domain signals, from which the stereo encoding sub-unit 32 has extracted the stereo code stream, into one channel signal, and to apply extension coding to form the spreading code stream.
- The multiplexing unit 4 multiplexes the core code stream, the stereo code stream, and the extended code stream into one coded code stream for transmission.
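To illustrate the real-transform option mentioned for the time-frequency transform sub-unit 31, here is a minimal direct-form MDCT of one windowed frame; the sine window and the frame layout are assumptions, and a practical implementation would use an FFT-based fast algorithm.

```python
import numpy as np

def mdct(frame):
    """Direct-form MDCT of one 2N-sample frame using a sine window."""
    n2 = len(frame)                      # frame length 2N
    n_half = n2 // 2                     # number of output coefficients N
    n = np.arange(n2)
    window = np.sin(np.pi / n2 * (n + 0.5))
    x = np.asarray(frame, dtype=float) * window
    k = np.arange(n_half)
    # X[k] = sum_n x[n] * cos(pi/N * (n + 0.5 + N/2) * (k + 0.5))
    basis = np.cos(np.pi / n_half * np.outer(n + 0.5 + n_half / 2.0, k + 0.5))
    return x @ basis
```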
- the receiving end decoding device is as shown in FIG. 10, and includes:
- Demultiplexing unit 5 for demultiplexing the received encoded code stream into a core code stream, a stereo code stream, and a spreading code stream.
- Narrowband decoding unit 6 is configured to restore the core code stream obtained by the demultiplexing unit 5 to a narrowband mono signal by narrowband decoding.
- The extended code stream decoding unit 7 is configured to perform extension decoding on the spreading code stream obtained by the demultiplexing unit 5.
- The extended code stream decoding unit 7 is further configured to perform the inverse time-frequency transform on the extension-decoded code stream and to combine the inverse-transformed signal with the narrowband mono signal into a wideband mono signal.
- Left and right channel signal restoration unit 8 is configured to restore the left and right channel signals according to the stereo code stream, the narrowband mono signal, and the extension-decoded code stream.
- the narrowband decoding unit 6 includes:
- the core decoding subunit 61 is configured to perform core decoding on the core code stream.
- the core decoding subunit 61 corresponds to the core encoding subunit 12.
- If the core encoder is an AMR-WB encoder, the corresponding decoder is an AMR-WB decoder.
- Its input is the 12.8 kHz core encoded data and its output is a mono signal at the internal sampling rate.
- The upsampling sub-unit 62 is configured to upsample the signal core-decoded by the core decoding sub-unit 61 to obtain the narrowband mono signal; upsampling is the inverse of downsampling.
- the extended code stream decoding unit 7 includes:
- the extended decoding sub-unit 71 is configured to perform spreading decoding of the extended coded signal.
- the extended decoding time-frequency inverse transform sub-unit 72 is configured to perform time-frequency inverse transform on the signal that is extended and decoded by the extended decoding sub-unit 71.
- Wideband mono signal synthesis sub-unit 73 is configured to combine the inverse-transformed signal from the extension-decoding time-frequency inverse transform sub-unit 72 with the narrowband mono signal into a wideband mono signal.
- the left and right channel signal restoration unit 8 includes:
- The stereo decoding sub-unit 81 is configured to perform stereo decoding on the stereo code stream and the extension-decoded spreading code stream to obtain the residual frequency-domain signals of the left and right channels.
- the time-frequency inverse transform sub-unit 82 is configured to perform a time-frequency inverse transform on the left and right channel residual frequency domain signals processed by the stereo decoding sub-unit 81 to obtain residual signals of the left and right channels.
- Left channel signal synthesis sub-unit 83 for combining the left channel residual signal obtained by the time-frequency inverse transform sub-unit 82 with the narrow-band mono signal to obtain a left channel signal.
- the right channel signal synthesis sub-unit 84 is configured to combine the right channel residual signal obtained by the time-frequency inverse transform sub-unit 82 with the narrow-band mono signal to obtain a right channel signal.
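The stereo decoding performed by sub-unit 81 can be sketched as an upmix of the extension-decoded (downmixed) residual spectrum driven by the per-band parameters, as below; this particular rule (level split from the ILD, symmetric phase split from the IPD, coherence unused) is an illustrative assumption, not the method prescribed by the patent.

```python
import numpy as np

def upmix_residual(mix_spec, params, band_edges):
    """Rebuild left/right residual spectra from the downmixed residual spectrum."""
    left_spec = np.zeros_like(mix_spec, dtype=complex)
    right_spec = np.zeros_like(mix_spec, dtype=complex)
    for (lo, hi), (ipd, ild, _ic) in zip(band_edges, params):
        g = 10.0 ** (ild / 20.0)              # left/right amplitude ratio from the ILD
        gl = g / np.sqrt(1.0 + g * g)         # energy-preserving split gains
        gr = 1.0 / np.sqrt(1.0 + g * g)
        band = mix_spec[lo:hi]
        scale = np.sqrt(2.0)                  # convention: roughly preserve band energy
        left_spec[lo:hi] = scale * gl * band * np.exp(1j * ipd / 2.0)
        right_spec[lo:hi] = scale * gr * band * np.exp(-1j * ipd / 2.0)
    return left_spec, right_spec
```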
- this embodiment has the following advantages:
- When a multi-source voice/music signal is input, compared with the prior-art monophonic codec scheme, the second embodiment can restore the stereo effect of the original multi-source voice/music signal through the difference between the left and right channel signals.
- Stereo processing is performed in the frequency domain, where stereo information can conveniently be extracted per sub-band.
- the sound sources contained therein are usually distributed in different frequency bands, so the sub-band processing can separate the sound sources distributed in different frequency bands.
- The frequency bands are divided according to the nonlinear characteristics of the human ear, so different sound sources appearing in the same band are perceived as a single source because of the limited resolution of the human ear.
- the transmission signal is divided into three parts: a core code stream, a spreading code stream, and a stereo code stream.
- The narrowband mono signal can be obtained from the core code stream alone, without the extended code stream or the stereo code stream; the wideband mono signal can be obtained from the core code stream and the extended code stream, without the stereo code stream; and with all three transmitted streams, the wideband stereo signal can be reconstructed.
- This embodiment can adapt well to the transmission line environment during transmission. If the actual transmission line bandwidth is limited and only narrowband signals can be transmitted, then with the codec method of this embodiment the narrowband mono signal alone can still be transmitted.
- the codec system of the present invention is as follows:
- Encoding module: configured to narrowband-encode the left and right channel input signals and to stereo-encode the residual signals of the left and right channels, and to send the result.
- Decoding module: configured to restore the left and right channel signals, the wideband mono signal, and the narrowband mono signal according to the narrowband-encoded left and right channel input signals and the stereo-encoded residual signals of the left and right channels.
- the coding module is shown in Figure 11, and includes:
- the narrowband coding submodule 111 is configured to downmix the input signals of the left and right channels and perform narrowband coding to obtain a core code stream.
- The residual signal intercepting sub-module 112 is configured to decode the core code stream obtained by the narrowband coding sub-module 111 and compare it with the input signals of the original left and right channels to obtain the residual signals of the left and right channels respectively.
- Stereo processing sub-module 113 is configured to perform stereo coding and extension coding in the frequency domain on the residual signals of the left and right channels obtained by the residual signal intercepting sub-module 112; stereo coding yields the stereo code stream and extension coding yields the extended code stream.
- The multiplexing sub-module 114 is configured to multiplex the stereo code stream and the extended code stream obtained by the stereo processing sub-module 113 and the core code stream into one encoded code stream for transmission.
- the decoding module is shown in Figure 12 and includes:
- the demultiplexing sub-module 121 is configured to demultiplex the received encoded code stream into a core code stream, a stereo code stream, and a spreading code stream.
- the narrowband decoding sub-module 122 is configured to restore the core code stream solved by the demultiplexing sub-module 121 to a narrowband mono signal by narrowband decoding.
- The extended code stream decoding sub-module 123 is configured to perform extension decoding on the spreading code stream obtained by the demultiplexing sub-module 121.
- The extended code stream decoding sub-module 123 is further configured to perform the inverse time-frequency transform on the extension-decoded code stream and to combine the inverse-transformed signal with the narrowband mono signal into a wideband mono signal.
- The left and right channel signal restoration sub-module 124 is configured to restore the left and right channel signals according to the stereo code stream, the narrowband mono signal, and the extension-decoded code stream.
- In this way, the residual signals of the left and right channels are extracted separately at the transmitting end, the residual signals are then stereo processed, and the core code stream is sent to the receiving end together with the stereo-processed residual signals.
- The receiving end can restore the left and right channel signals from the core code stream and the stereo-processed residual signals, thus completing the stereo codec process for the two-channel voice/music signal.
- the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Mathematical Physics (AREA)
- Stereophonic System (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
This invention relates to an encoding/decoding method, apparatus and system. The method comprises the steps of: performing narrowband coding on the left and right audio channels after the signals have been downmixed, so as to obtain a core code stream; taking the difference between the restored (decoded) core code stream and the signals of the original left and right audio channels, so as to obtain the residual signals of the left and right channels respectively; obtaining the stereo code stream and the extended code stream of the left and right channels; and multiplexing the core code stream, the stereo code stream and the extended code stream into one encoded code stream and sending it.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2008101322967A CN101635145B (zh) | 2008-07-24 | 2008-07-24 | 编解码方法、装置和系统 |
| CN200810132296.7 | 2008-07-24 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2010009659A1 true WO2010009659A1 (fr) | 2010-01-28 |
Family
ID=41570018
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2009/072793 Ceased WO2010009659A1 (fr) | 2008-07-24 | 2009-07-16 | Procédé, appareil et système de codage/décodage |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN101635145B (fr) |
| WO (1) | WO2010009659A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9123329B2 (en) | 2010-06-10 | 2015-09-01 | Huawei Technologies Co., Ltd. | Method and apparatus for generating sideband residual signal |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114708874A (zh) | 2018-05-31 | 2022-07-05 | 华为技术有限公司 | 立体声信号的编码方法和装置 |
| EP4138396A4 (fr) | 2020-05-21 | 2023-07-05 | Huawei Technologies Co., Ltd. | Procédé de transmission de données audio et dispositif associé |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1148297A (zh) * | 1995-10-16 | 1997-04-23 | 王亚伦 | 调频l-r数据广播系统及其数据信号的处理方法 |
| CN1941648A (zh) * | 1998-01-22 | 2007-04-04 | 英国电讯有限公司 | 接收具有窄带干扰的扩频信号 |
| CN101188878A (zh) * | 2007-12-05 | 2008-05-28 | 武汉大学 | 一种立体声音频信号的空间参数量化及熵编码方法及其所用系统结构 |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR100335611B1 (ko) * | 1997-11-20 | 2002-10-09 | 삼성전자 주식회사 | 비트율 조절이 가능한 스테레오 오디오 부호화/복호화 방법 및 장치 |
| BRPI0516376A (pt) * | 2004-12-27 | 2008-09-02 | Matsushita Electric Industrial Co Ltd | dispositivo de codificação de som e método de codificação de som |
| CN101202042A (zh) * | 2006-12-14 | 2008-06-18 | 中兴通讯股份有限公司 | 可扩展的数字音频编码框架及其扩展方法 |
- 2008-07-24: CN CN2008101322967A patent/CN101635145B/zh active Active
- 2009-07-16: WO PCT/CN2009/072793 patent/WO2010009659A1/fr not_active Ceased
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1148297A (zh) * | 1995-10-16 | 1997-04-23 | 王亚伦 | 调频l-r数据广播系统及其数据信号的处理方法 |
| CN1941648A (zh) * | 1998-01-22 | 2007-04-04 | 英国电讯有限公司 | 接收具有窄带干扰的扩频信号 |
| CN101188878A (zh) * | 2007-12-05 | 2008-05-28 | 武汉大学 | 一种立体声音频信号的空间参数量化及熵编码方法及其所用系统结构 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9123329B2 (en) | 2010-06-10 | 2015-09-01 | Huawei Technologies Co., Ltd. | Method and apparatus for generating sideband residual signal |
Also Published As
| Publication number | Publication date |
|---|---|
| CN101635145B (zh) | 2012-06-06 |
| CN101635145A (zh) | 2010-01-27 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP7507207B2 (ja) | Audio encoder and decoder using a frequency-domain processor, a time-domain processor and a cross processor for continuous initialization | |
| TWI570710B (zh) | Audio encoder, audio decoder, method for encoding an audio signal, method for decoding an encoded audio signal, and computer program therefor | |
| KR101945309B1 (ko) | Encoding/decoding apparatus and method using phase information and residual signal | |
| TWI629681B (zh) | Apparatus, method and computer program for encoding or decoding a multi-channel signal using spectral-domain resampling | |
| RU2665214C1 (ru) | Stereophonic encoder and decoder of audio signals | |
| CN100571043C (zh) | Spatial parameter stereo encoding/decoding method and apparatus thereof | |
| JP6535730B2 (ja) | Apparatus and method for generating an enhanced signal using independent noise filling | |
| CN101202043A (zh) | Encoding method and system, and decoding method and system, for audio signals | |
| JP2009532712A (ja) | Media signal processing method and apparatus | |
| WO2010009659A1 (fr) | Encoding/decoding method, apparatus and system | |
| CN101754086A (zh) | Multi-channel audio decoding apparatus and method based on sound source position cues | |
| CN101361114B (zh) | Apparatus for processing a media signal and method thereof | |
| HK40067463A (en) | Audio encoding and decoding using a frequency domain processor, a time domain processor, and a cross processor for continuous initialization | |
| HK1127665B (en) | Apparatus for processing media signal and method thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 09799974; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 09799974; Country of ref document: EP; Kind code of ref document: A1 |