HK40016914B - Method and apparatus for generating a mixed spatial/coefficient domain representation of HOA signals

Info

Publication number
HK40016914B
Authority
HK
Hong Kong
Prior art keywords
vector
hoa
signal
domain signals
coefficient domain
Application number
HK42020007060.5A
Other languages
Chinese (zh)
Other versions
HK40016914A (en
Inventor
Sven Kordon
Alexander Krüger
Original Assignee
Dolby International AB
Application filed by Dolby International AB
Publication of HK40016914A
Publication of HK40016914B

Description

Method and apparatus for generating a hybrid spatial/coefficient domain representation of an HOA signal
The present application is a divisional application of Chinese patent application No. 201480038940.8, filed on June 24, 2014 and entitled "Method and apparatus for generating a hybrid spatial/coefficient domain representation of HOA signals from the coefficient domain representation of said HOA signals".
Technical Field
The invention relates to a method and apparatus for generating a mixed spatial/coefficient domain representation of HOA signals from the coefficient domain representation of said HOA signals, wherein the number of HOA signals can be variable.
Background
Higher Order Ambisonics, denoted HOA, is a mathematical description of a two-dimensional or three-dimensional sound field. The sound field may be captured by a microphone array, designed from synthetic sound sources, or be a combination of both. HOA can be used as a transport format for two-dimensional or three-dimensional surround sound. An advantage of HOA over loudspeaker-based surround sound representations is that the sound field can be reproduced on different loudspeaker arrangements. HOA is therefore suitable as a universal audio format.
The spatial resolution of HOA is determined by the HOA order. This order defines the number of HOA signals that describe the sound field. There are two representations of HOA, called the spatial domain and the coefficient domain. In most cases HOA is originally represented in the coefficient domain, and this representation can be converted to the spatial domain by a matrix multiplication (or transformation), as described in EP 2469742 A2. The spatial domain consists of the same number of signals as the coefficient domain. However, in the spatial domain each signal is related to a direction, and the directions are uniformly distributed on the unit sphere. This facilitates analysis of the spatial distribution of the HOA representation. Both the coefficient domain representation and the spatial domain representation are time-domain representations.
Disclosure of Invention
In the following, the basic aim is to use the spatial domain as far as possible for PCM transmission of HOA representations, in order to provide the same dynamic range for each direction. This means that the PCM samples of the HOA signals in the spatial domain have to be normalized to a predefined value range. However, a drawback of this normalization is that the dynamic range of the HOA signals in the spatial domain is smaller than in the coefficient domain. This is caused by the transformation matrix that generates the spatial domain signals from the coefficient domain signals.
In some applications HOA signals are transmitted in the coefficient domain. For example, in the processing described in EP 13305558.2 all signals are transmitted in the coefficient domain, because a constant number of HOA signals and a variable number of additional HOA signals are to be transmitted. However, as mentioned above and shown in EP 2469742 A2, transmission in the coefficient domain is not beneficial. As a solution, the constant number of HOA signals can be transmitted in the spatial domain and only the variable number of additional HOA signals in the coefficient domain. Transmitting the additional HOA signals in the spatial domain is not possible, because a time-varying number of HOA signals would result in a time-varying coefficient-to-spatial-domain transformation matrix, and discontinuities could occur in all spatial domain signals, which is sub-optimal for subsequent perceptual coding of the PCM signals.
In order to transmit these additional HOA signals without exceeding a predefined value range, an invertible normalization process can be used that is designed to prevent such signal discontinuities and that also allows efficient transmission of the inversion parameters.
Regarding the dynamic range of the two HOA representations and the normalization of HOA signals for PCM encoding, it is derived in the following whether such normalization should take place in the coefficient domain or in the spatial domain.
In the coefficient domain, the HOA representation consists of successive frames of $N$ coefficient signals $d_n(k)$, $n = 0, \ldots, N-1$, where $k$ denotes the sample index and $n$ the signal index.
These coefficient signals are collected in a vector $d(k) = [d_0(k), \ldots, d_{N-1}(k)]^T$ to obtain a compact representation.
The transformation into the spatial domain is performed by the $N \times N$ transformation matrix $\Psi$ defined in EP 12306569.0; see the definition of $\Xi_{GRID}$ given there in connection with equations (21) and (22).
The spatial domain vector $w(k) = [w_0(k), \ldots, w_{N-1}(k)]^T$ is obtained from
$$w(k) = \Psi^{-1} d(k) \qquad (1)$$
where $\Psi^{-1}$ is the inverse of matrix $\Psi$.
The inverse transformation from the spatial domain to the coefficient domain is performed by
$$d(k) = \Psi\, w(k) \qquad (2)$$
If a value range is defined for the samples in one domain, the transformation matrix $\Psi$ automatically determines the value range in the other domain. The sample index $(k)$ is omitted in the following.
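The two transformations in equations (1) and (2) can be sketched in a few lines. This is a minimal illustration only: it assumes a hypothetical 2×2 matrix `PSI` as a stand-in for the real $N \times N$ matrix defined in EP 12306569.0, and the helper names are not taken from the application.

```python
# Hypothetical 2x2 stand-in for the N x N transformation matrix Psi.
PSI = [[1.0, 1.0],
       [1.0, -1.0]]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def inverse_2x2(matrix):
    """Closed-form inverse, valid only for the 2x2 example matrix."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


PSI_INV = inverse_2x2(PSI)


def to_spatial(d):
    """Equation (1): w = Psi^-1 d."""
    return matvec(PSI_INV, d)


def to_coefficient(w):
    """Equation (2): d = Psi w."""
    return matvec(PSI, w)


d = [0.5, -0.25]
w = to_spatial(d)
d_back = to_coefficient(w)   # recovers d up to floating-point error
```

The round trip shows that the two domains carry the same information; only the value range of the samples differs.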
Since the HOA representation is actually reproduced in the spatial domain, the value range, the loudness and the dynamic range are defined in this domain. The dynamic range is determined by the bit resolution of the PCM encoding. In this application, "PCM encoding" means converting samples from a floating-point representation into an integer representation in fixed-point notation.
For PCM encoding of an HOA representation, the $N$ spatial domain signals must be normalized to the value range $-1 \le w_n < 1$ so that they can be scaled up to the maximum PCM value $W_{\max}$ and rounded to the fixed-point integer PCM notation
$$w'_n = [\, w_n W_{\max} \,] \qquad (3)$$
Note that this is a generalized representation of PCM encoding. The value range of the samples in the coefficient domain can be computed from the infinity norm of matrix $\Psi$, i.e. its maximum absolute row sum,
$$\|\Psi\|_\infty = \max_n \sum_m |\psi_{n,m}|$$
where $\psi_{n,m}$ are the elements of $\Psi$. Together with the maximum absolute value $w_{\max} = 1$ in the spatial domain, this gives $-\|\Psi\|_\infty w_{\max} \le d_n < \|\Psi\|_\infty w_{\max}$. Since $\|\Psi\|_\infty$ is greater than '1' for the definition of matrix $\Psi$ used, the value range of $d_n$ increases.
Conversely, PCM encoding of the signals in the coefficient domain requires normalization by $\|\Psi\|_\infty$, because $-1 \le d_n / \|\Psi\|_\infty < 1$. However, this normalization reduces the dynamic range of the signals in the coefficient domain, which results in a lower signal-to-quantization-noise ratio. PCM encoding of the spatial domain signals is therefore preferred.
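The relationship between the infinity norm and PCM encoding can be illustrated as follows. This is a sketch under stated assumptions: the matrix `PSI` is a hypothetical 2×2 example, `W_MAX = 2**15` is an assumed 16-bit setting, and the bracket in equation (3) is taken to mean rounding down, which the text does not state explicitly.

```python
import math


def infinity_norm(matrix):
    """Maximum absolute row sum of a matrix given as a list of rows."""
    return max(sum(abs(v) for v in row) for row in matrix)


def pcm_encode(samples, w_max):
    """Scale samples in [-1, 1) by w_max and round down (assumed rounding)."""
    for s in samples:
        if not -1.0 <= s < 1.0:
            raise ValueError("sample outside the range -1 <= w < 1")
    return [math.floor(s * w_max) for s in samples]


# Hypothetical example values, not taken from the application.
PSI = [[1.0, 1.0], [1.0, -1.0]]
W_MAX = 2 ** 15

norm = infinity_norm(PSI)            # 2.0 for this example matrix
d = [1.5, -2.0]                      # coefficient samples within [-norm, norm)
codes = pcm_encode([x / norm for x in d], W_MAX)
```

Dividing by `norm` before encoding is what costs dynamic range: a coefficient signal that never approaches the worst-case bound still gives up the corresponding share of the available bits.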
The problem to be solved by the invention is how to transmit, in the coefficient domain and using normalization, that part of the HOA signals which would ideally be transmitted in the spatial domain, without reducing the dynamic range in the coefficient domain. Furthermore, the normalized signals should not contain signal level jumps, so that they can be perceptually coded without a loss of quality caused by such jumps.
In principle, the inventive generation method is adapted to generate a mixed spatial/coefficient domain representation of HOA signals from a coefficient domain representation of said HOA signals, wherein the number of HOA signals is capable of varying over time in successive coefficient frames, the method comprising the steps of:
-separating the vector of HOA coefficient domain signals into a first vector of coefficient domain signals having a constant number of HOA coefficients and a second vector of coefficient domain signals having a time-variable number of HOA coefficients;
-transforming said first vector of coefficient domain signals into a corresponding vector of spatial domain signals by multiplying said vector of coefficient domain signals by an inverse of a transformation matrix;
-PCM encoding said vectors of the spatial domain signal to obtain vectors of the PCM encoded spatial domain signal;
-normalizing the second vector of coefficient domain signals by a normalization factor, wherein the normalization is an adaptive normalization for a current value range of HOA coefficients of the second vector of coefficient domain signals, and in which normalization the available value range of HOA coefficients for a vector is not exceeded, and in which normalization a uniform continuous transfer function is applied to coefficients of the current second vector to continuously change the gain in that vector from the gain in the previous second vector to the gain in the following second vector, and the normalization provides side information for de-normalization of the respective decoder side;
-PCM encoding said vector of normalized coefficient domain signals to obtain a PCM encoded and normalized coefficient domain signal vector;
-multiplexing said vector of PCM encoded spatial domain signals with said vector of PCM encoded and normalized coefficient domain signals.
In principle, the generating device of the invention is adapted to generate a mixed spatial/coefficient domain representation of HOA signals from the coefficient domain representation of said HOA signals, wherein the number of HOA signals is capable of varying over time in successive coefficient frames, the device comprising:
-means adapted to separate the vector of HOA coefficient domain signals into a first vector of coefficient domain signals having a constant number of HOA coefficients and a second vector of coefficient domain signals having a time-variable number of HOA coefficients;
-means adapted to transform said first vector of coefficient domain signals into a corresponding vector of spatial domain signals by multiplying said vector of coefficient domain signals by an inverse of a transform matrix;
-means adapted to PCM encode said vectors of the spatial domain signal to obtain vectors of the PCM encoded spatial domain signal;
-means adapted to normalize said second vector of coefficient domain signals by a normalization factor, wherein said normalization is an adaptive normalization for a current value range of HOA coefficients of said second vector of coefficient domain signals, and in which normalization the available value range of HOA coefficients for a vector is not exceeded, and in which normalization a uniform continuous transfer function is applied to coefficients of the current second vector to continuously change the gain in that vector from the gain in the previous second vector to the gain in the following second vector, and said normalization provides side information for the corresponding decoder-side de-normalization;
-means adapted to PCM encode said vector of normalized coefficient domain signals to obtain a PCM encoded and normalized coefficient domain signal vector;
-means adapted to multiplex said vector of PCM encoded spatial domain signals with said vector of PCM encoded and normalized coefficient domain signals.
In principle, the decoding method of the present invention is adapted to decode a mixed spatial/coefficient domain representation of an encoded HOA signal, wherein the number of HOA signals is capable of varying over time in successive coefficient frames, and wherein the mixed spatial/coefficient domain representation of an encoded HOA signal is generated according to the generating method of the above invention, the decoding comprising the steps of:
-demultiplexing said multiplexed vector of PCM encoded spatial domain signals and PCM encoded and normalized coefficient domain signals;
-transforming said vector of PCM encoded spatial domain signals into a corresponding vector of coefficient domain signals by multiplying said vector of PCM encoded spatial domain signals by said transformation matrix;
-denormalizing the vector of PCM encoded and normalized coefficient domain signals, wherein the denormalizing comprises:
-computing a transformation vector $h_n(j-1)$ using the corresponding exponent $e_n(j-1)$ of the received side information and a recursively computed gain value $g_n(j-2)$, wherein the gain value $g_n(j-1)$ is kept for the corresponding processing of a following vector of the PCM encoded and normalized coefficient domain signals to be processed, $j$ being the running index of an input matrix of HOA signal vectors;
-applying the corresponding inverse gain values to the current vector of the PCM encoded and normalized signal, thereby obtaining a corresponding vector of the PCM encoded and denormalized signal;
-combining said vector of coefficient domain signals with a vector of denormalized coefficient domain signals, resulting in a combined vector of HOA coefficient domain signals which may have a variable number of HOA coefficients.
In principle, the decoding device of the present invention is adapted to decode a mixed spatial/coefficient domain representation of an encoded HOA signal, wherein the number of HOA signals is capable of varying over time in successive coefficient frames, and wherein the mixed spatial/coefficient domain representation of an encoded HOA signal is generated according to the generating method of the above-described invention, the decoding device comprising:
-means adapted to de-multiplex said multiplexed vectors of PCM encoded spatial domain signals and PCM encoded and normalized coefficient domain signals;
-means adapted to transform said vector of PCM encoded spatial domain signals into a corresponding vector of coefficient domain signals by multiplying said vector of PCM encoded spatial domain signals by said transform matrix;
-means adapted to denormalise the vector of PCM encoded and normalized coefficient domain signals, wherein the denormalising comprises:
-computing a transformation vector $h_n(j-1)$ using the corresponding exponent $e_n(j-1)$ of the received side information and a recursively computed gain value $g_n(j-2)$, wherein the gain value $g_n(j-1)$ is kept for the corresponding processing of a following vector of the PCM encoded and normalized coefficient domain signals to be processed, $j$ being the running index of an input matrix of HOA signal vectors;
-applying the corresponding inverse gain values to the current vector of the PCM encoded and normalized signal, thereby obtaining a corresponding vector of the PCM encoded and denormalized signal;
-means adapted to combine said vector of coefficient domain signals with a vector of denormalized coefficient domain signals, resulting in a combined vector of HOA coefficient domain signals which may have a variable number of HOA coefficients.
Drawings
Exemplary embodiments of the present invention are described with reference to the accompanying drawings, in which:
Fig. 1 shows PCM transmission in the spatial domain of an HOA representation that is originally in the coefficient domain;
Fig. 2 shows combined transmission of the HOA representation in the coefficient domain and the spatial domain;
Fig. 3 shows combined transmission of the HOA representation in the coefficient domain and the spatial domain using block-wise adaptive normalization for the signals in the coefficient domain;
Fig. 4 shows the adaptive normalization processing for an HOA signal $x_n(j)$ represented in the coefficient domain;
Fig. 5 shows a transfer function used for a smooth transition between two different gain values;
Fig. 6 shows the adaptive de-normalization processing;
Fig. 7 shows FFT spectra of the transfer functions $h_n(l)$ for different exponents $e_n$, where the maximum amplitude of each function is normalized to 0 dB;
Fig. 8 shows example transfer functions for three consecutive signal vectors.
Detailed Description
Regarding PCM encoding of HOA representations in the spatial domain, it is assumed that $-1 \le w_n < 1$ is satisfied (in floating-point representation), so that PCM transmission of the HOA representation can be performed as shown in Fig. 1. A converter step or stage 11 at the input of the HOA encoder converts the coefficient domain signal $d$ of the current input signal frame into the spatial domain signal $w$ using equation (1). PCM encoding step or stage 12 converts the floating-point samples $w$ into PCM encoded integer samples $w'$ in fixed-point notation using equation (3). In a multiplexer step or stage 13, the samples $w'$ are multiplexed into the HOA transport format.
In a demultiplexer step or stage 14, the HOA decoder demultiplexes the signals $w'$ from the received HOA transport format, and in a step or stage 15 transforms them back into coefficient domain signals $d'$ using equation (2). The inverse transformation increases the dynamic range of $d'$, so the transformation from the spatial domain to the coefficient domain always includes a format conversion from integer (PCM) to floating point.
If the matrix $\Psi$ is time-varying, which is the case when the number or the indices of the HOA signals vary over time for successive HOA coefficient sequences (i.e. successive input signal frames), the standard HOA transmission of Fig. 1 will fail. As mentioned above, one example of this case is the HOA compression processing described in EP 13305558.2: a constant number of HOA signals is transmitted continuously, and a variable number of HOA signals with varying signal indices $n$ is transmitted in parallel. As described above, all signals are then transmitted in the coefficient domain, which is not preferable.
The process described in connection with fig. 1 is extended as shown in fig. 2 according to the present invention.
In step or stage 20, the HOA encoder separates the HOA vector $d$ into two vectors $d_1$ and $d_2$, where the number $M$ of HOA coefficients in vector $d_1$ is constant and vector $d_2$ contains a variable number $K$ of HOA coefficients. Since the signal indices $n$ are time-invariant for vector $d_1$, PCM encoding is performed in the spatial domain in steps or stages 21, 22, 23, 24 and 25 (corresponding to steps/stages 11 to 15 of Fig. 1), with the corresponding signals $w_1$ and $w'_1$ shown in the lower signal path of Fig. 2. However, multiplexer step/stage 23 receives an additional input signal $d''_2$, and demultiplexer step/stage 24 in the HOA decoder provides an additional output signal $d''_2$.
The number of HOA coefficients, i.e. the size $K$ of vector $d_2$, is time-varying, and the indices $n$ of the transmitted HOA signals can change over time. This prevents transmission in the spatial domain, because a time-varying transformation matrix would be required, which would result in signal discontinuities in all perceptually encoded HOA signals (the perceptual encoding step or stage is not shown). Such signal discontinuities should be avoided because they would reduce the quality of the perceptual coding of the transmitted signals. Therefore $d_2$ is transmitted in the coefficient domain. Because of the larger value range of the signals in the coefficient domain, the signals are scaled by the factor $1/\|\Psi\|_\infty$ in step or stage 26 before PCM encoding is applied in step or stage 27. However, a drawback of this scaling is that the maximum absolute value $\|\Psi\|_\infty$ is a worst-case estimate, and the maximum absolute sample value will not occur very often because the value range normally to be expected is smaller. As a result, the available resolution for PCM encoding is not used efficiently and the signal-to-quantization-noise ratio is low.
In step or stage 28, the output signal $d''_2$ of demultiplexer step/stage 24 is inversely scaled using the factor $\|\Psi\|_\infty$. The resulting signal $d'''_2$ is combined in step or stage 29 with signal $d'_1$ to produce the decoded coefficient domain HOA signal $d'$.
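The fixed scaling path of Fig. 2 (steps 26 to 28) can be sketched as below to make its drawback concrete. The values `PSI_NORM` and `W_MAX` are hypothetical, the function names are illustrative, and rounding down is an assumption about the PCM encoder.

```python
import math


def encode_fixed_scaling(d2, psi_norm, w_max):
    """Steps 26/27: scale by 1/||Psi||_inf, then PCM encode (rounding down assumed)."""
    return [math.floor(x / psi_norm * w_max) for x in d2]


def decode_fixed_scaling(codes, psi_norm, w_max):
    """Step 28: convert back to floating point and undo the scaling."""
    return [c / w_max * psi_norm for c in codes]


PSI_NORM = 4.0       # hypothetical value of ||Psi||_inf
W_MAX = 2 ** 15      # hypothetical 16-bit PCM resolution

d2 = [1.0, -2.5, 0.001]
codes = encode_fixed_scaling(d2, PSI_NORM, W_MAX)
d2_decoded = decode_fixed_scaling(codes, PSI_NORM, W_MAX)

# The quantization step is PSI_NORM / W_MAX for every sample, however small.
step = PSI_NORM / W_MAX
```

In this example the sample 0.001 is represented by the integer code 8, so its relative error is far larger than that of the two bigger samples. This is the inefficiency that the adaptive normalization is meant to avoid.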
According to the invention, the efficiency of PCM encoding in the coefficient domain can be increased by using a signal-adaptive normalization. However, this normalization must be invertible and uniformly continuous from sample to sample. The required block-wise adaptive processing is shown in Fig. 3. The $j$-th input matrix $D(j) = [d(jL+0) \ldots d(jL+L-1)]$ comprises $L$ HOA signal vectors $d$ (index $j$ is not shown in Fig. 3). As in the processing of Fig. 2, matrix $D$ is separated into two matrices $D_1$ and $D_2$. The processing of $D_1$ in steps or stages 31 to 35 corresponds to the spatial domain processing described in connection with Fig. 2 and Fig. 1. The encoding of the coefficient domain signal, however, includes a block-wise adaptive normalization step or stage 36 that automatically adapts to the current value range of the signal, followed by PCM encoding step or stage 37. The side information required for de-normalization of each PCM encoded signal in matrix $D''_2$ is stored and transmitted in a vector $e$, which contains one value per signal. The corresponding adaptive de-normalization step or stage 38 of the decoder at the receiving side uses the information from the transmitted vector $e$ to invert the normalization, converting the signals $D''_2$ into $D'''_2$. The resulting signal $D'''_2$ is combined in step or stage 39 with signal $D'_1$ to produce the decoded coefficient domain HOA signal $D'$.
In the adaptive normalization of step/stage 36, a uniformly continuous transfer function is applied to the samples of the current input coefficient block in order to change the gain continuously from that of the previous input coefficient block to that of the next input coefficient block. This type of processing requires a delay of one block, because a change of the normalization gain has to be detected one input coefficient block ahead. The advantage is that the introduced amplitude modulation is small, so that perceptual coding of the modulated signal has almost no effect on the de-normalized signal.
In one implementation, the adaptive normalization is performed independently for each HOA signal of $D_2(j)$. The signals are represented by the row vectors $x_n^T(j)$ of the matrix $D_2(j)$,
where $n$ denotes the index of a transmitted HOA signal. $x_n$ is transposed because it is originally a column vector, whereas a row vector is required here.
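The block structure can be illustrated with a short sketch: $L$ consecutive sample vectors form the columns of the $j$-th input matrix, and each row then holds the $L$ samples of one HOA signal for that block. The function name `input_block` and the toy data are assumptions made for this illustration.

```python
def input_block(d_vectors, j, L):
    """Return the j-th block as a list of rows.

    d_vectors is a sequence of sample vectors d(k); the block uses
    d(jL) ... d(jL + L - 1) as columns, so row n contains the L samples
    of HOA signal n for block j.
    """
    columns = d_vectors[j * L:(j + 1) * L]
    return [list(row) for row in zip(*columns)]


# Toy data: two HOA signals, six sample vectors, block length L = 3.
d_vectors = [[k, 10 + k] for k in range(6)]
block_1 = input_block(d_vectors, j=1, L=3)
x_0 = block_1[0]     # samples of signal n = 0 in block j = 1
```

Processing each row separately is what allows a different gain and a different exponent per HOA signal.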
Fig. 4 shows this adaptive normalization of step/stage 36 in more detail. The input values of the processing are:
the temporally smoothed maximum value $x_{n,\max,sm}(j-2)$,
the gain value $g_n(j-2)$, i.e. the gain that was applied to the last coefficient of the corresponding signal vector block $x_n(j-2)$,
the signal vector $x_n(j)$ of the current block,
the signal vector $x_n(j-1)$ of the previous block.
When processing of the first block $x_n(0)$ starts, the recursive input values are initialized with predefined values: the coefficients of vector $x_n(-1)$ can be set to zero, gain value $g_n(-2)$ should be set to '1', and $x_{n,\max,sm}(-2)$ should be set to a predefined average amplitude value.
The outputs of the processing are then the gain value $g_n(j-1)$ of the last block, the corresponding value $e_n(j-1)$ of the side information vector $e(j-1)$, the temporally smoothed maximum value $x_{n,\max,sm}(j-1)$ and the normalized signal vector $x'_n(j-1)$.
The purpose of this processing is to change the gain values applied to signal vector $x_n(j-1)$ continuously from $g_n(j-2)$ to $g_n(j-1)$, such that gain value $g_n(j-1)$ normalizes the signal vector $x_n(j)$ to the appropriate value range.
In a first processing step or stage 41, each coefficient of signal vector $x_n(j) = [x_{n,0}(j) \ldots x_{n,L-1}(j)]$ is multiplied by gain value $g_n(j-2)$, where $g_n(j-2)$ has been kept from the normalization processing of signal vector $x_n(j-1)$ as the basis for a new normalization gain. From the resulting normalized signal vector $x_n(j)$, the maximum $x_{n,\max}$ of the absolute values is obtained in step or stage 42 using equation (5):
$$x_{n,\max} = \max_{0 \le l < L} \left| g_n(j-2)\, x_{n,l}(j) \right| \qquad (5)$$
In step or stage 43, temporal smoothing is applied to $x_{n,\max}$. The smoothing uses the previous value $x_{n,\max,sm}(j-2)$ of the smoothed maximum and produces the current temporally smoothed maximum $x_{n,\max,sm}(j-1)$. The purpose of this smoothing is to slow down the adaptation of the normalization gain over time, which reduces the number of gain changes and therefore the amplitude modulation of the signal. The temporal smoothing is applied only if the value $x_{n,\max}$ lies within a predefined value range. Otherwise $x_{n,\max,sm}(j-1)$ is set to $x_{n,\max}$ (i.e. the value of $x_{n,\max}$ is kept as it is), because the subsequent processing must attenuate the actual value of $x_{n,\max}$ to the predefined value range. Thus, temporal smoothing is active only when the normalization gain is constant or when the signal $x_n(j)$ can be amplified without leaving the value range.
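In step/stage 43, $x_{n,\max,sm}(j-1)$ is calculated as follows:
$$x_{n,\max,sm}(j-1) = \begin{cases} x_{n,\max} & \text{if } x_{n,\max} > 1 \\ (1-a)\, x_{n,\max,sm}(j-2) + a\, x_{n,\max} & \text{otherwise} \end{cases} \qquad (6)$$
where $0 < a \le 1$ is the attenuation constant.
To reduce the bit rate needed for transmitting vector $e$, the normalization gain is computed from the current temporally smoothed maximum $x_{n,\max,sm}(j-1)$ and is transmitted as an exponent to the base '2'. Therefore
$$x_{n,\max,sm}(j-1)\, 2^{e_n(j-1)} \le 1 \qquad (7)$$
must be satisfied, and the quantized exponent $e_n(j-1)$ is obtained in step or stage 44 from
$$e_n(j-1) = \left\lfloor \log_2 \frac{1}{x_{n,\max,sm}(j-1)} \right\rfloor \qquad (8)$$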
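Steps 41 to 44 can be sketched as three small functions. This is an illustration under assumptions, not the application's implementation: the predefined value range is taken to be magnitudes up to 1, the smoother is assumed to be a first-order recursion with constant `a`, and the exponent is assumed to be the largest integer power of two that keeps the smoothed maximum at or below 1. All function names are hypothetical.

```python
import math


def block_maximum(x_block, g_prev):
    """Equation (5): peak magnitude of the block after applying g_n(j-2)."""
    return max(abs(g_prev * x) for x in x_block)


def smooth_maximum(x_max, x_max_sm_prev, a):
    """Smooth only while x_max is inside the value range (assumed limit: 1)."""
    if x_max > 1.0:
        return x_max                      # out of range: keep the actual peak
    return (1.0 - a) * x_max_sm_prev + a * x_max


def quantized_exponent(x_max_sm):
    """Largest integer e with x_max_sm * 2**e <= 1."""
    return math.floor(math.log2(1.0 / x_max_sm))


x_max = block_maximum([0.1, -0.3, 0.2], g_prev=2.0)           # peak is 0.6
x_max_sm = smooth_maximum(x_max, x_max_sm_prev=0.2, a=0.5)    # about 0.4
e = quantized_exponent(x_max_sm)                              # room for one doubling
```

A positive exponent means the signal can be amplified further, a negative one means the signal must be attenuated to fit the value range.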
In periods in which the signal is re-amplified (i.e. the value of the total gain increases over time) in order to exploit the resolution available for efficient PCM encoding, the exponent $e_n(j)$, and hence the gain difference between successive blocks, can be limited to a small maximum value, e.g. '1'. This has two beneficial effects. On the one hand, small gain differences between successive blocks lead to only small amplitude modulation by the transfer function, so that cross-talk between adjacent sub-bands of the FFT spectrum is reduced (see the description of the effect of the transfer function on perceptual coding in connection with Fig. 7). On the other hand, the bit rate for encoding the exponent is reduced by constraining its value range.
The value of the total maximum amplification
$$g_n(j-1) = g_n(j-2)\, 2^{e_n(j-1)} \qquad (9)$$
can also be limited, e.g. to '1'. The reason is that, if one of the coefficient signals shows a large amplitude change between two successive blocks, of which the first has very small amplitudes and the second has the largest possible amplitude (assuming normalization of the HOA representation in the spatial domain), a very large gain difference between these two blocks would lead to strong amplitude modulation by the transfer function, resulting in severe cross-talk between adjacent sub-bands of the FFT spectrum. This is sub-optimal for the subsequent perceptual coding, as discussed below.
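One possible way to apply both limits is sketched below. The text only states that the exponent and the total amplification can be limited; the specific clamping rule, the function name `limit_exponent` and the default limits are assumptions chosen for illustration. The total gain is taken to be the previous gain multiplied by two raised to the exponent.

```python
def limit_exponent(e, g_prev, e_max=1, g_max=1.0):
    """Clamp e so that the step is at most e_max and g_prev * 2**e <= g_max.

    Negative exponents (attenuation) pass through unchanged as long as
    g_prev itself does not exceed g_max.
    """
    e = min(e, e_max)
    while g_prev * 2.0 ** e > g_max:
        e -= 1
    return e


# A quiet block would allow three doublings, but only one is permitted.
e_quiet = limit_exponent(3, g_prev=0.25)
# With the total gain already at the limit, no further amplification is allowed.
e_at_limit = limit_exponent(2, g_prev=1.0)
```

Keeping the per-block step small trades a slower recovery of the gain for a gentler amplitude modulation and a smaller alphabet of exponents to encode.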
In step or stage 45, the exponent value $e_n(j-1)$ is applied to the transfer function to obtain the current gain value $g_n(j-1)$. For a continuous transition from gain value $g_n(j-2)$ to gain value $g_n(j-1)$, the function shown in Fig. 5 is used. The computation rule of this function is
$$f(l) = 0.25 \cos\!\left(\frac{\pi l}{L-1}\right) + 0.75 \qquad (10)$$
where $l = 0, 1, 2, \ldots, L-1$. For the continuous fade from $g_n(j-2)$ to $g_n(j-1)$, the actual transfer function vector $h_n(j-1) = [h_n(0) \ldots h_n(L-1)]^T$ is used, with
$$h_n(l) = g_n(j-2)\, f(l)^{-e_n(j-1)} \qquad (11)$$
For each value of $e_n(j-1)$, $h_n(0)$ is equal to $g_n(j-2)$ because $f(0) = 1$. The last value $f(L-1)$ is equal to 0.5, so that $h_n(L-1) = g_n(j-2)\, 0.5^{-e_n(j-1)}$ yields the amplification $g_n(j-1)$ required for the normalization of $x_n(j)$ according to equation (9).
In step or stage 46, the samples of signal vector $x_n(j-1)$ are weighted by the gain values of the transformation vector $h_n(j-1)$ to obtain
$$x'_n(j-1) = h_n(j-1) \otimes x_n(j-1) \qquad (12)$$
where the operator $\otimes$ denotes element-wise multiplication of two vectors. This multiplication can also be regarded as an amplitude modulation of signal $x_n(j-1)$.
More specifically, the coefficients of the transformation vector $h_n(j-1) = [h_n(0) \ldots h_n(L-1)]^T$ are multiplied by the corresponding coefficients of signal vector $x_n(j-1)$, where $h_n(0) = g_n(j-2)$ and $h_n(L-1) = g_n(j-1)$. The transfer function therefore fades continuously from gain value $g_n(j-2)$ to gain value $g_n(j-1)$, as shown in the example of Fig. 8, which depicts the gain values of the transfer functions $h_n(j)$, $h_n(j-1)$ and $h_n(j-2)$ applied to the corresponding signal vectors $x_n(j)$, $x_n(j-1)$ and $x_n(j-2)$ of three consecutive blocks. The advantage for the downstream perceptual coding is that the applied gains are continuous at the block boundaries. The transfer function $h_n(j-1)$ thus fades the gain for the coefficients of $x_n(j-1)$ continuously from $g_n(j-2)$ to $g_n(j-1)$.
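Steps 45 and 46 can be sketched as follows. The text fixes only the end points $f(0) = 1$ and $f(L-1) = 0.5$; the raised-cosine shape used here is an assumption that is consistent with those end points and with a smooth transition. The function names are hypothetical, and the block length must be at least 2.

```python
import math


def fade(l, L):
    """Assumed raised-cosine fade: 1.0 at l = 0, 0.5 at l = L - 1."""
    return 0.25 * math.cos(math.pi * l / (L - 1)) + 0.75


def transition_vector(g_prev, e, L):
    """h(l) = g_prev * f(l)**(-e); runs from g_prev to g_prev * 2**e."""
    return [g_prev * fade(l, L) ** (-e) for l in range(L)]


def normalize_block(x_prev_block, g_prev, e):
    """Weight the previous block sample by sample; return it with the new gain."""
    h = transition_vector(g_prev, e, len(x_prev_block))
    x_norm = [hl * xl for hl, xl in zip(h, x_prev_block)]
    return x_norm, h[-1]


h_up = transition_vector(g_prev=1.0, e=1, L=8)      # fades from 1.0 up to 2.0
h_down = transition_vector(g_prev=1.0, e=-1, L=8)   # fades from 1.0 down to 0.5
x_norm, g_new = normalize_block([0.1, 0.2, -0.3, 0.05], g_prev=1.0, e=1)
```

Because each block starts at the gain where the previous one ended, the applied gain has no jump at block boundaries.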
The adaptive de-normalization processing at the decoder or receiver side is shown in Fig. 6. The input values are the PCM encoded and normalized signal $x''_n(j-1)$, the corresponding exponent $e_n(j-1)$ and the gain value $g_n(j-2)$ of the last block. The gain value $g_n(j-2)$ of the last block is computed recursively, where $g_n(j-2)$ needs to be initialized with the same predefined value that is used in the encoder. The outputs are the gain value $g_n(j-1)$ from step/stage 61 and the de-normalized signal $x'''_n(j-1)$ from step/stage 62.
In step or stage 61, the exponent is applied to the transfer function. To recover the value range of $x_n(j-1)$, equation (11) is used to compute the transformation vector $h_n(j-1)$ from the received exponent $e_n(j-1)$ and the recursively computed gain $g_n(j-2)$. The gain $g_n(j-1)$ for the processing of the next block is set equal to $h_n(L-1)$.
In step or stage 62, the inverse gain is applied. The amplitude modulation applied by the normalization processing is inverted by
$$x'''_n(j-1) = x''_n(j-1) \otimes h_n(j-1)^{-1} \qquad (13)$$
where $h_n(j-1)^{-1} = [1/h_n(0) \ldots 1/h_n(L-1)]^T$ and $\otimes$ is the element-wise vector multiplication that was used at the encoder or transmitter side. The samples of $x'''_n(j-1)$ cannot be represented in the input PCM format of $x''_n(j-1)$, so the de-normalization requires conversion to a format with a larger value range, such as a floating-point format.
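The recursive de-normalization can be sketched together with a matching encoder side to check that the round trip is exact apart from floating-point error. As before, the raised-cosine fade with end points 1 and 0.5 is an assumed shape, PCM quantization is left out so that only the gain handling is shown, and all names are hypothetical.

```python
import math


def transition_vector(g_prev, e, L):
    """h(l) = g_prev * f(l)**(-e) with an assumed raised-cosine f(l), L >= 2."""
    return [g_prev * (0.25 * math.cos(math.pi * l / (L - 1)) + 0.75) ** (-e)
            for l in range(L)]


def normalize_block(x, g_prev, e):
    """Encoder side: apply the gain fade; return the block and the new gain."""
    h = transition_vector(g_prev, e, len(x))
    return [hl * xl for hl, xl in zip(h, x)], h[-1]


def denormalize_block(x_norm, g_prev, e):
    """Decoder side: rebuild h from e and the kept gain, then divide it out."""
    h = transition_vector(g_prev, e, len(x_norm))
    return [xn / hl for xn, hl in zip(x_norm, h)], h[-1]


blocks = [[0.1, 0.2, -0.3, 0.05], [0.4, -0.4, 0.2, 0.1]]
exponents = [1, -1]                 # side information, one value per block
g_enc = g_dec = 1.0                 # both sides start from the same gain
recovered = []
for x, e in zip(blocks, exponents):
    x_norm, g_enc = normalize_block(x, g_enc, e)
    x_out, g_dec = denormalize_block(x_norm, g_dec, e)
    recovered.append(x_out)
```

The decoder needs only the exponent of each block and its own running gain, which is why both sides must start from the same initial gain.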
Regarding the transmission of the side information, it cannot be assumed that the exponents $e_n(j-1)$ are uniformly distributed, because the applied normalization gain will be constant over consecutive blocks with the same value range. Entropy coding, for example Huffman coding, can therefore be applied to the exponent values to reduce the required data rate.
One drawback of the described processing could be the recursive computation of gain value $g_n(j-2)$. As a consequence, the de-normalization processing can start only at the beginning of the HOA stream.
A solution to this problem is to add access units to the HOA format that regularly provide the information for computing $g_n(j-2)$. In this case, the access unit has to provide, for every $t$-th block, the exponent
$$e_{n,access} = \log_2 g_n(j-2) \qquad (14)$$
so that $g_n(j-2) = 2^{e_{n,access}}$ can be computed and the de-normalization can start at every $t$-th block.
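Equation (14) and its inverse amount to very little code. The sketch below assumes that the running gain is always an exact power of two, which follows if the gain starts at 1 and every block multiplies it by two raised to an integer exponent. The function names are hypothetical.

```python
import math


def access_exponent(g_prev):
    """e_access = log2(g_prev); g_prev is assumed to be an exact power of two."""
    e = math.log2(g_prev)
    if e != int(e):
        raise ValueError("gain is not an integer power of two")
    return int(e)


def gain_from_access_exponent(e_access):
    """Recover the running gain at an access unit."""
    return 2.0 ** e_access


e_access = access_exponent(0.25)                  # stored in the access unit
g_restart = gain_from_access_exponent(e_access)   # decoder restarts from this gain
```

A decoder that joins the stream at an access unit can then continue the recursion from `g_restart` instead of from the initial gain.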
Through a function h n (l) Frequency response of (2)To analyze the normalized signal x' n (j-1) effects of perceptual coding. The frequency response is represented by h as shown in equation (15) n (l) Is defined by a Fast Fourier Transform (FFT).
FIG. 7 shows the FFT spectrum H of normalized (to 0 dB) size n (u) to clarify the spectral distortion introduced by the amplitude modulation. I H n (u) | is relatively steep for small indices and relatively flat for larger indices.
Due to the passage of h in the time domain n (l) For x n (j-1) the amplitude modulation being equal to the pass H in the frequency domain n Convolution of (u), thus frequency response H n The steep decline of (u) reduces x' n Cross-talk between adjacent subbands of the FFT spectrum of (j-1). This is in accordance with x' n The subsequent perceptual coding of (j-1) is highly correlated because sub-band cross-talk affects the estimated perceptual characteristics of the signal. Thus, for H n (u) steep decline, for x' n The perceptual coding assumption of (j-1) is for an un-normalized signal x n (j-1) is also effective.
This shows that, for small exponents, perceptual coding of $x'_n(j-1)$ is almost equal to perceptual coding of $x_n(j-1)$, and that perceptual coding of the normalized signal has almost no effect on the de-normalized signal as long as the magnitude of the exponent is small.
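The dependence of the spectral decay on the exponent can be illustrated numerically. The Python sketch below uses a hypothetical smooth gain ramp (a raised-cosine transition from 1 to 2**exponent over one block) as a stand-in for $h_n(l)$; the actual function is the one defined by the normalization processing. The block length of 64 and all function names are likewise assumptions.

```python
import cmath
import math


def ramp(exponent, length):
    """Hypothetical smooth gain ramp from 1 to 2**exponent over one block."""
    return [
        2.0 ** (exponent * 0.5 * (1.0 - math.cos(math.pi * idx / (length - 1))))
        for idx in range(length)
    ]


def normalized_magnitude_db(h):
    """Magnitude of the DFT of h, normalized so that bin 0 is 0 dB."""
    n = len(h)
    spectrum = [
        abs(sum(h[i] * cmath.exp(-2j * math.pi * u * i / n) for i in range(n)))
        for u in range(n // 2)
    ]
    return [20.0 * math.log10(max(m, 1e-12) / spectrum[0]) for m in spectrum]


small = normalized_magnitude_db(ramp(1, 64))
large = normalized_magnitude_db(ramp(6, 64))
# small[1] lies well below large[1]: the weaker modulation leaks less
# energy into neighbouring bins.
```

Under these assumptions, the first bin next to DC is attenuated much more strongly for the small exponent than for the large one, which matches the qualitative behaviour described for FIG. 7.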
The inventive processing may be carried out by a single processor or electronic circuit at the transmitting side and at the receiving side, or by several processors or electronic circuits operating in parallel and/or operating on different parts of the inventive processing.

Claims (5)

1. A method for decoding a multiplexed and perceptually encoded HOA signal, the decoding comprising:
demultiplexing the multiplexed vector of the PCM encoded spatial domain signal represented by HOA and the PCM encoded and normalized coefficient domain signal;
transforming said vector of PCM encoded spatial domain signals into a corresponding vector of coefficient domain signals by multiplying the vector of PCM encoded spatial domain signals represented by HOA by a transformation matrix;
de-normalizing the vector of PCM encoded and normalized coefficient domain signals, wherein the de-normalizing comprises:
determining a transformation vector based on a respective exponent of the side information and a recursively calculated gain value, wherein the respective exponent and the gain value are based on a running index of an input matrix of the HOA signal vector;
applying the respective inverse gain value to the vector of PCM encoded and normalized coefficient domain signals, thereby determining a respective vector of PCM encoded and denormalized signals; and
combining said vector of coefficient domain signals with said vector of denormalized coefficient domain signals to determine a combined vector of HOA coefficient domain signals capable of having a variable number of HOA coefficients,
wherein the multiplexed and perceptually encoded HOA signal is perceptually decoded accordingly prior to being demultiplexed.
2. An apparatus for decoding a multiplexed and perceptually encoded HOA signal, the apparatus comprising:
a demultiplexer for demultiplexing a multiplexing vector of the PCM encoded spatial domain signal represented by HOA and the PCM encoded and normalized coefficient domain signal;
a first processing unit for transforming vectors of PCM encoded spatial domain signals represented by HOA into corresponding vectors of coefficient domain signals by multiplying said vectors of PCM encoded spatial domain signals by a transformation matrix; and
a second processing unit for de-normalizing the vector of PCM encoded and normalized coefficient domain signals, wherein the second processing unit is adapted to:
determining a transformation vector based on a respective exponent of the side information and a recursively calculated gain value, wherein the respective exponent and the gain value are based on a running index of an input matrix of the HOA signal vector; and
applying the respective inverse gain value to the vector of PCM encoded and normalized coefficient domain signals, thereby determining a respective vector of PCM encoded and denormalized signals; and
a combiner for combining said vector of coefficient domain signals with said vector of denormalized coefficient domain signals to determine a combined vector of HOA coefficient domain signals capable of having a variable number of HOA coefficients,
wherein the multiplexed and perceptually encoded HOA signal is perceptually decoded accordingly prior to being demultiplexed.
3. A non-transitory storage medium containing or storing or recording a digital audio signal decoded according to claim 1.
4. An apparatus for decoding a multiplexed and perceptually encoded HOA signal, comprising:
a memory configured to store program instructions, and
a processor coupled to the memory, configured to execute the program instructions,
wherein the program instructions, when executed by the processor, cause the processor to perform the method according to claim 1.
5. A computer readable storage medium having stored thereon program instructions which, when executed by a processor, cause the processor to perform the method of claim 1.
HK42020007060.5A 2013-07-11 2020-05-08 Method and apparatus for generating a mixed spatial/coefficient domain representation of hoa signals HK40016914B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP13305986.5 2013-07-11

Publications (2)

Publication Number Publication Date
HK40016914A (en) 2020-09-18
HK40016914B (en) 2023-08-25
