WO2019066348A1 - Method and device for processing an audio signal - Google Patents
Method and device for processing an audio signal
- Publication number
- WO2019066348A1 (PCT/KR2018/010926)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio signal
- listener
- processing apparatus
- sound
- signal processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
- H04S7/303—Tracking of listener position or orientation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S5/00—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
- H04S5/02—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation of the pseudo four-channel type, e.g. in which rear channel signals are derived from two-channel stereo signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
Definitions
- the present disclosure relates to an audio signal processing method and apparatus, and more particularly, to an audio signal processing method and apparatus for providing an immersive sound for a portable device including an HMD (Head Mounted Display) device.
- HMD Head Mounted Display
- 3D audio technology refers to signal processing, transmission, encoding, and rendering technologies that provide sound in a three-dimensional space.
- 3D audio can reproduce a sound scene with a height direction added to a sound scene on a horizontal plane (2D) where surround audio is reproduced.
- the audio device can use a larger number of speakers than a conventional one.
- the audio device is required to have a rendering technique that causes the sound image to be formed at a virtual position where no speaker is present.
- 3D audio rendering technology is all the more necessary because the sense of presence is more important in a virtual reality (VR) or augmented reality (AR) space reproduced using an HMD device or the like.
- VR virtual reality
- AR augmented reality
- binaural rendering, the most typical 3D audio rendering technology, models the 3D audio signal as an audio signal delivered to the user's two ears.
- the user can feel a stereoscopic effect through a binaurally rendered 2-channel audio output signal played over headphones or earphones. Specifically, the user can recognize the position and direction of the sound source corresponding to a sound through the sound heard through both ears.
- the audio signal processing apparatus can reproduce the 3D sense of 3D audio by modeling the 3D audio signal in the form of a two-channel audio signal transmitted to both ears of the user.
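As a concrete illustration of this two-ear modeling, the minimal sketch below convolves a mono source signal with a pair of head related impulse responses (HRIRs), the time-domain counterpart of the HRTF, to produce a 2-channel binaural signal. The function name and array layout are illustrative assumptions, not the disclosed implementation.

```python
import numpy as np

def binaural_render(mono: np.ndarray, hrir_left: np.ndarray,
                    hrir_right: np.ndarray) -> np.ndarray:
    """Render a mono source to a 2-channel binaural signal by
    convolving it with the HRIR measured for each ear."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])  # shape: (2, num_samples)
```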
- an acoustic space that realistically simulates an environment in which a plurality of objects and a user interact can be an important factor in increasing the user's sense of immersion.
- the acoustic characteristics of each of the plurality of objects may be reflected in a complex manner.
- the interaction between an object and a user can be defined relatively simply, by making the sound heard at a specific position according to the relative position of the sound source.
- the acoustic space may vary depending on the relative position and size of the object between the user and the sound source. This is because various acoustic phenomena may occur depending on the interaction between the objects.
- sound object
- non-sound object (passive object, non-audio object, scene object, acoustic element)
- the audio signal processing apparatus can simulate the interaction between the sound source and the user by using the relative position of the sound source with respect to the listener in the virtual space.
- the sound space may vary depending on the position and size of the object. For example, when there is an object blocking the sound, such as a wall, between a listener and a sound source in a virtual space, the level of the audio signal corresponding to the sound source may be attenuated when it reaches the listener, compared to when the object is absent. Also, the sound corresponding to the sound source can be reflected by the wall surface.
- a technique for simulating the above-described acoustic characteristics is required because a user often interacts with various terrains and objects in a virtual reality space, particularly in fields such as gaming.
- One embodiment of the present disclosure aims to reproduce a more realistic spatial sound to the user.
- the present disclosure aims to efficiently simulate a spatial sound including an occlusion effect caused by an obstacle between a sound source and a listener.
- one embodiment of the present disclosure aims to simulate the occlusion effect on an input audio signal in which various audio signals coexist.
- one embodiment of the present disclosure is directed to simulating the interaction between audio signals in various formats and non-sound objects that do not produce sound.
- An apparatus for processing an audio signal may include a processor for outputting an output audio signal generated based on an input audio signal.
- the processor is configured to obtain the input audio signal and information about a virtual space in which the input audio signal is simulated, determine, with respect to the listener of the virtual space, the position of each of at least one object included in the virtual space and the position of the sound source corresponding to the input audio signal, determine whether there is a blocking object blocking the direct acoustic path between the sound source and the listener, and binaurally render the input audio signal based on the determination result to generate an output audio signal.
- the output audio signal may include a transmitted audio signal simulating the sound corresponding to the input audio signal that passes through the blocking object to the listener.
- the processor may filter the input audio signal based on the length of the section in which the direct acoustic path between the sound source and the listener overlaps the blocking object and on the acoustic transmittance of the blocking object, and can thereby generate the transmitted audio signal.
- the acoustic transmittance of the blocking object may have different values depending on the frequency bin.
- the output audio signal may include a diffracted audio signal that simulates sound that is diffracted by the blocking object to arrive at the listener.
- the processor may determine, based on the shape of the blocking object, at least one diffraction point at which the sound corresponding to the input audio signal is diffracted at the surface of the blocking object, and may binaurally render the input audio signal based on the at least one diffraction point to generate the diffracted audio signal.
- the processor may obtain a first HRTF corresponding to the at least one diffraction point based on the head direction of the listener, and binaurally render the input audio signal using the first HRTF to generate the diffracted audio signal.
- the processor may determine, as the at least one diffraction point, the point on the surface of the object at which the sum of the distance of a first path from the point to the listener and the distance of a second path from the point to the sound source is minimized.
- the first path and the second path may be shortest paths that do not cross the object.
- the processor may binaurally render the input audio signal based on the first HRTF and on an attenuation gain determined from the diffraction distance, which represents the sum of the distance of the first path and the distance of the second path through the at least one diffraction point, thereby generating the diffracted audio signal.
- the attenuation gain may have different values according to the frequency bin of the audio signal.
- the processor may mix the diffracted audio signal and the transmitted audio signal to generate the output audio signal.
- the output audio signal may include a two-channel output audio signal corresponding to each of the two ears of the listener.
- the processor may determine, based on the position of each of the listener's two ears, whether the blocking object is present for each of the listener's right and left sides, and may generate the output audio signal for each channel based on the determination result.
- the blocking object may include a first blocking object that blocks only one of the listener's right and left sides.
- the 2-channel output audio signal may also include a reflected audio signal that simulates sound that is reflected by the blocking object to the listener and that corresponds to the input audio signal.
- the processor may determine, based on the position of the listener's ear on the other (unblocked) side and the shape of the first blocking object, a reflection point at which the sound corresponding to the input audio signal is reflected at the surface of the first blocking object, and binaurally render the input audio signal based on the position of the reflection point to generate a first reflected audio signal corresponding to the first blocking object.
- the processor may obtain a second HRTF corresponding to the reflection point with respect to the head direction of the listener, and binaurally render the input audio signal using the second HRTF to generate the first reflected audio signal.
- the processor may determine, based on the position of the first blocking object, which channel of the two-channel output audio signal contains the first reflected audio signal, and generate the output audio signal accordingly.
- for example, when the first blocking object blocks only one side of the listener, the channel audio signal corresponding to the other side may include the first reflected audio signal.
- the processor may determine a position of each of the ears of the listener based on the head size of the listener.
- the processor may obtain, from an HRTF set comprising a plurality of HRTFs indexed by azimuth angle and elevation angle and measured at a reference distance, an ipsilateral HRTF and a contralateral HRTF corresponding respectively to the listener's ipsilateral and contralateral ears, based on the reference distance, the position of each of the listener's ears, and the position of the sound source, and binaurally render the input audio signal based on the ipsilateral HRTF and the contralateral HRTF.
- the ipsilateral HRTF and the contralateral HRTF may be HRTFs corresponding to different positions among the plurality of HRTFs.
- the virtual space may include a plurality of divided spaces, each with a different reverberation filter.
- when the listener's two ears are located in different divided spaces, the processor may filter the input audio signal with different reverberation filters for the listener's right and left sides, respectively, generating a reverberant audio signal corresponding to each of the right and left sides.
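A minimal sketch of this per-ear reverberation selection follows. The sub-space lookup, the filter table, and all names are illustrative assumptions; the disclosure does not specify this structure.

```python
import numpy as np

def reverb_per_ear(x, ear_left_pos, ear_right_pos, find_subspace, reverb_irs):
    """Filter the input with a possibly different reverberation impulse
    response per ear when the two ears lie in different divided spaces.
    `find_subspace` maps a position to a sub-space id (assumed helper);
    `reverb_irs` maps a sub-space id to a reverberation impulse response."""
    rir_left = reverb_irs[find_subspace(ear_left_pos)]
    rir_right = reverb_irs[find_subspace(ear_right_pos)]
    return np.convolve(x, rir_left), np.convolve(x, rir_right)
```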
- the blocking object may be a non-sound object that outputs no sound in the virtual space.
- the processor may receive metadata indicating information about a non-sound object included in the virtual space together with the input audio signal.
- An operation method of an audio signal processing apparatus for rendering an input audio signal may include: obtaining the input audio signal and information about a virtual space in which the input audio signal is simulated; determining, with respect to the listener of the virtual space, the position of each of at least one object included in the virtual space and the position of the sound source corresponding to the input audio signal; determining whether there is a blocking object blocking the direct acoustic path between the sound source and the listener; binaurally rendering the input audio signal based on the determination result to generate an output audio signal; and outputting the output audio signal.
- the audio signal processing apparatus can provide an immersive three-dimensional audio signal.
- the audio signal processing apparatus can efficiently simulate a spatial sound including an occlusion effect caused by an obstacle between a sound source and a listener.
- the audio signal processing apparatus can simulate the occlusion effect on an input audio signal in which audio signals of various formats coexist. Further, the audio signal processing apparatus according to an embodiment of the present disclosure can simulate an interaction between audio signals in various formats and a non-sound object that does not produce sound.
- FIG. 1 is a diagram showing that characteristics of an audio signal are changed by an acoustic occlusion effect according to an embodiment of the present disclosure.
- FIG. 2 is a block diagram showing a configuration of an audio signal processing apparatus according to an embodiment of the present disclosure.
- FIG. 3 is a diagram illustrating a method by which an audio signal processing apparatus according to an embodiment of the present disclosure generates a transmitted audio signal based on an input audio signal.
- FIGS. 4 and 5 are diagrams illustrating a method by which an audio signal processing apparatus according to an embodiment of the present disclosure generates a diffracted audio signal based on an input audio signal.
- FIG. 6 is a diagram showing HRTFs determined based on the listener's head direction and sound source position with respect to the head center of the listener.
- FIGS. 7 and 8 are diagrams showing HRTF pairs obtained when the sound source is located closer to or farther from the listener than the reference distance at which the HRTF set was generated.
- FIG. 9 is a diagram showing the operation of the audio signal processing apparatus when the presence or absence of an object is different in each acoustic path between each of the ears of the listener and the sound source.
- FIG. 10 is a diagram illustrating an example in which an output audio signal according to an embodiment of the present disclosure is configured differently for each ear of the listener.
- FIG. 11 is a diagram illustrating a method by which an audio signal processing apparatus according to an embodiment of the present disclosure generates a reflected audio signal.
- FIG. 12 is a diagram showing a method of generating a reverberation audio signal corresponding to each of the two ears of the listener.
- FIG. 13 is a block diagram illustrating a process of processing an input audio signal by an audio signal processing apparatus according to an embodiment of the present disclosure.
- FIG. 14 is a block diagram showing the preprocessing operation of the audio signal processing apparatus in more detail.
- FIG. 15 is a block diagram showing the audio signal preprocessing operation of the audio signal processing apparatus in more detail.
- FIG. 16 is a view showing the binaural rendering process described in FIG. 13 in more detail.
- FIG. 17 is a block diagram showing in detail the configuration of an audio signal processing apparatus according to an embodiment of the present disclosure.
- FIG. 18 is a block diagram showing the configuration of an audio signal processing apparatus according to an embodiment of the present disclosure in detail.
- FIG. 19 is a block diagram showing in detail the configuration of an audio signal processing apparatus according to an embodiment of the present disclosure.
- FIG. 20 is a block diagram specifically illustrating an object renderer according to an embodiment of the present disclosure.
- FIG. 21 is a diagram showing an object renderer further including a coordinate transformation processing unit according to an embodiment of the present disclosure.
- FIG. 22 is a block diagram specifically illustrating an ambisonic renderer according to an embodiment of the present disclosure.
- FIG. 23 is a block diagram specifically illustrating a channel renderer according to an embodiment of the present disclosure.
- the audio signal processing device can simulate acoustic occlusion effects caused by the object(s) blocking the direct acoustic path between the sound source and the listener in the virtual space. In this way, the audio signal processing device can provide the user with a realistic output audio signal.
- a direct acoustic path or acoustic path can be used to denote an acoustic path of a direct sound between a sound source and a listener.
- the present disclosure relates to an audio signal processing apparatus for binaurally rendering an input audio signal based on object-related information about an object included in a virtual space, and for simulating the acoustic occlusion effect.
- an object blocking the path between a sound source and a listener may be referred to as a blocking object(s).
- a listener may represent a listener in a virtual space unless otherwise noted.
- FIG. 1 is a diagram showing that characteristics of an audio signal are changed by an acoustic occlusion effect according to an embodiment of the present disclosure.
- the acoustic path of the direct sound, through which the sound output from the sound source O is directly transmitted to the listener, can be modeled as the shortest path connecting the sound source O to the head center of the listener A.
- the characteristics of the audio signal corresponding to the sound source O may be changed.
- the direct sound output from the sound source O may be attenuated depending on the acoustic transmittance that indicates the degree to which the sound passes through the object W.
- the audio signal processing apparatus can simulate a direct sound attenuated by the object by attenuating the audio signal corresponding to the sound source O. At this time, the audio signal processing apparatus can set the degree of attenuation of the audio signal differently for each frequency component.
- a method by which an audio signal processing apparatus simulates a direct sound attenuated by an object will be described in detail with reference to FIG. 3. Further, the sound output from the sound source O may be diffracted at a specific point (for example, 'a' in FIG. 1) on the surface of the object W and transmitted to the listener A.
- a method by which the audio signal processing device simulates a diffracted sound that is diffracted on the surface of the object W and transmitted to the listener A will be described in detail with reference to FIGS. 4 and 5.
- the acoustic path may include a first acoustic path and a second acoustic path with respect to each of the ears of the listener A, respectively.
- the first acoustic path and the second acoustic path may be different from each other.
- the first acoustic path and the second acoustic path may be modeled as a shortest path connecting each of the ears of the listener A from the sound source O.
- the audio signal processing apparatus can simulate acoustic occlusion effects for each of the first acoustic path and the second acoustic path, rather than for one acoustic path based on the head center of the listener A.
- the occlusion effect by the blocking object may be different for each of the first acoustic path and the second acoustic path.
- a blocking object may exist only in either the first acoustic path or the second acoustic path.
- the object on the first acoustic path and the object on the second acoustic path may be different.
- a method by which the audio signal processing apparatus distinguishes the first acoustic path and the second acoustic path to simulate the acoustic occlusion effect will be described in detail with reference to FIGS. 6 through 10.
- FIG. 2 is a block diagram showing a configuration of an audio signal processing apparatus 10 according to an embodiment of the present disclosure.
- the audio signal processing apparatus 10 may further include components not shown in FIG. 2.
- the audio signal processing apparatus 10 may integrate at least two different components into a single unit.
- the audio signal processing apparatus 10 may be implemented as one semiconductor chip.
- each component may be implemented as a separate hardware component, such as an individual circuit.
- the audio signal processing apparatus 10 may include a receiving unit 11, a processor 12, and an output unit 13.
- the receiving unit 11 may receive an input audio signal input to the audio signal processing apparatus 10.
- the receiving unit 11 can receive an input audio signal to be processed by the processor 12 for audio signal processing.
- the output unit 13 may also transmit the output audio signal generated by the processor 12.
- the input audio signal may include at least one of an object signal, an ambisonic signal, and a channel signal.
- the output audio signal may be an audio signal rendered from the input audio signal.
- the receiving unit 11 may include receiving means for receiving an audio signal.
- the receiving unit 11 may include an audio signal input / output terminal for receiving an audio signal transmitted through a wire.
- the receiving unit 11 may include a wireless audio receiving module for transmitting and receiving an audio signal transmitted wirelessly.
- the receiving unit 11 can receive an audio signal wirelessly transmitted using a Bluetooth or Wi-Fi communication method.
- the receiving unit 11 may receive a bitstream into which the input audio signal has been encoded. In this case, the bitstream may be decoded to obtain the input audio signal.
- the decoder may be implemented through the processor 12, which will be described later.
- the receiving unit 11 may receive information related to the input audio signal together with the input audio signal.
- the bitstream may additionally include information related to the input audio signal in addition to the input audio signal. This will be described in detail with reference to FIGS. 17 through 19.
- the receiving unit 11 may include one or more components communicating with other devices outside the audio signal processing apparatus 10. Also, the receiving unit 11 may include at least one antenna for receiving the bitstream. Also, the receiving unit 11 may include hardware for wired communication for receiving the bitstream.
- the processor 12 can control the overall operation of the audio signal processing apparatus 10. The processor 12 can control each component of the audio signal processing apparatus 10. The processor 12 may perform arithmetic processing and processing of various data and signals.
- the processor 12 may be implemented in hardware in the form of a semiconductor chip or an electronic circuit, or may be implemented in software that controls hardware.
- the processor 12 may be implemented as a combination of hardware and software.
- the processor 12 can control the operations of the receiving unit 11 and the output unit 13 by executing at least one program included in the software.
- the processor 12 may execute at least one program to perform the operations of the audio signal processing apparatus 10 described later with reference to FIGS. 3 to 23.
- the processor 12 may render the input audio signal based on the spatial information and the listener information to generate an output audio signal.
- the spatial information may include information about a plurality of objects included in a virtual space in which the input audio signal is simulated.
- the information on the plurality of objects may include at least one of the position, the structural characteristic, or the physical characteristic of each of the plurality of objects.
- the structural characteristics of the object may include at least one of the size or the shape of the object.
- the physical property of the object may include at least one of information indicating the material of the object or the transmittance of the object.
- the listener information may also include information associated with the listener in the virtual space. Specifically, the listener information may include listener position information indicating the position of the listener in the virtual space. In addition, the listener information may include head direction information indicating the head direction of the listener according to the head movement of the listener. Head direction information can be acquired in real time via the head-mounted display and sensors attached to the hardware. Also, the listener's location and heading direction information may be obtained based on the user's input. At this time, the user may be a user who controls the operation of the listener in a game environment provided by a device such as a PC or a mobile. The listener information may include head size information indicating the head size of the listener.
- the processor 12 may estimate the position of both ears of the listener based on the listener's position information and head size information (see the sketch below). Alternatively, the processor 12 may obtain the position of both ears of the listener via listener information that includes information about the position of both ears. For example, the processor 12 may receive at least one of the spatial information or the listener information through the receiving unit 11 described above. The processor 12 may receive the spatial information corresponding to the input audio signal together with the input audio signal through the receiving unit 11. The way in which the processor 12 receives the spatial information will be described later with reference to FIGS. 17 to 19. In addition, the processor 12 may further perform post-processing on the output audio signal.
- Post processing may include at least one of crosstalk removal, dynamic range control (DRC), volume normalization, and peak limiting.
- the audio signal processing apparatus 10 may include a separate post-processing unit for performing post-processing, and the post-processing unit may be included in the processor 12 according to another embodiment.
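The ear-position estimate mentioned above can be sketched as follows: each ear is placed half a head width from the head center, along the axis perpendicular to the facing direction. The yaw-only (flat-head) simplification and all names are our assumptions for illustration.

```python
import numpy as np

def estimate_ear_positions(head_center, head_yaw_rad, head_width):
    """Estimate both ear positions from the listener's head center,
    head direction (yaw, radians) and head size (width, metres)."""
    head_center = np.asarray(head_center, dtype=float)
    # Unit vector pointing to the listener's right (yaw-only rotation).
    right = np.array([np.sin(head_yaw_rad), -np.cos(head_yaw_rad), 0.0])
    half = 0.5 * head_width
    return head_center - half * right, head_center + half * right  # (left, right)
```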
- the output unit 13 can output the output audio signal.
- the output unit 13 may output the output audio signal generated by the processor 12.
- the output unit 13 may include at least one output channel.
- the output audio signal may be a two-channel output audio signal corresponding to each of the listener's two ears.
- the output audio signal may be a binaural 2-channel output audio signal.
- the output unit 13 can output the 3D audio headphone signal generated by the processor 12.
- the output unit 13 may comprise output means for outputting an output audio signal.
- the output unit 13 may include an output terminal for outputting the output audio signal to the outside.
- the audio signal processing apparatus 10 can output an output audio signal to an external device connected to the output terminal.
- the output unit 13 may include a wireless audio transmission module for outputting an output audio signal to the outside.
- the output unit 13 can output an output audio signal to an external device using a wireless communication method such as Bluetooth or Wi-Fi.
- the output unit 13 may include a speaker.
- the audio signal processing apparatus 10 can output the output audio signal through the speaker.
- the output unit 13 may include a plurality of speakers arranged according to a predetermined channel layout.
- the output unit 13 may further include a converter (e.g., a digital-to-analog converter (DAC)) for converting the digital audio signal into an analog audio signal.
- DAC digital-to-analog converter
- the apparatus for processing an audio signal can determine whether there is an object blocking between a sound source and a listener based on information about a virtual space.
- the information about the virtual space may include position information indicating the position of the sound source based on the listener and the position of each of the plurality of objects included in the virtual space.
- the audio signal processing apparatus can binaurally render the input audio signal based on the determination result to generate an output audio signal. For example, if there is no blocking object, then the audio signal processing device may not use the information associated with the blocking object in filtering the input audio signal.
- conversely, when there is a blocking object, the audio signal processing device can filter the input audio signal based on the information associated with the blocking object. In this case, the audio signal processing apparatus can binaurally render the input audio signal using an HRTF corresponding to an additional position in addition to the head related transfer function (HRTF) corresponding to the sound source.
- HRTF head related transfer function
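One simple way to make the blocking determination described above is a segment-versus-object intersection test along the direct acoustic path. The sketch below approximates each object by a bounding sphere; the sphere approximation and the data format are our assumptions, and any geometric representation of the objects would serve.

```python
import numpy as np

def segment_hits_sphere(p0, p1, center, radius):
    """True if the segment from p0 to p1 passes through the sphere."""
    p0, p1, center = (np.asarray(v, dtype=float) for v in (p0, p1, center))
    d = p1 - p0
    t = np.clip(np.dot(center - p0, d) / np.dot(d, d), 0.0, 1.0)
    return np.linalg.norm(p0 + t * d - center) <= radius  # closest approach

def find_blocking_objects(source_pos, listener_pos, objects):
    """Return the objects whose bounding sphere blocks the direct path.
    `objects` is an iterable of (center, radius) pairs (assumed format)."""
    return [(c, r) for c, r in objects
            if segment_hits_sphere(source_pos, listener_pos, c, r)]
```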
- there may be an object W in the acoustic path between the sound source and the listener.
- the object W may be an object other than a listener and a sound source.
- the sound source of the input audio signal to be processed by the audio signal processing apparatus may be a sound source O occluded by the object W from the position of the listener A.
- the audio signal processing apparatus can simulate the occlusion effect caused by the object W.
- the occlusion effect of the object W can be modeled as a transmitted sound, which represents the direct sound attenuated while passing through the object W, a diffracted sound, and a reflected sound.
- the audio signal processing apparatus can generate a transmitted audio signal, a diffracted audio signal, and a reflected audio signal corresponding to the transmitted sound, the diffracted sound, and the reflected sound, respectively, based on the input audio signal.
- the output audio signal described in this disclosure may include at least one of a transmitted audio signal, a diffracted audio signal, or a reflected audio signal.
- FIG. 3 is a diagram illustrating a method by which an audio signal processing apparatus according to an embodiment of the present disclosure generates a transmitted audio signal based on an input audio signal.
- when the transmittance of the object W is equal to or greater than a reference transmittance, the audio signal processing apparatus can generate a transmitted audio signal based on the transmission attenuation gain. If the transmittance of the object is less than the reference transmittance, the audio signal processing apparatus may not generate the transmitted audio signal, since this case is similar to there being no transmitted sound passing through the object to the listener.
- an audio signal processing apparatus may binaurally render an input audio signal based on a transmission attenuation gain to produce a transmitted audio signal.
- the audio signal processing apparatus can generate the transmitted audio signal by adjusting the level of the input audio signal with the transmission attenuation gain.
- the transmission attenuation gain may indicate the ratio of the level of the transmitted audio signal to the level of the input audio signal.
- the transmission attenuation gain may be a filter coefficient that models the proportion of sound lost as it passes through the object W.
- the audio signal processing apparatus may multiply an input audio signal by the transmission attenuation gain to generate a transmitted audio signal.
- the audio signal processing apparatus may filter the input audio signal corresponding to the sound source O based on the length x of the section in which the direct acoustic path overlaps with the object W.
- the audio signal processing apparatus can determine the attenuation gain based on the length (x).
- the attenuation gain may become smaller as the length (x) of the section in which the acoustic path overlaps with the object W is longer. This is because the longer the length (x) of the section through which the original sound output from the sound source passes through the object, the greater the degree of attenuation of the transmitted sound transmitted to the listener.
- the attenuation gain may be inversely proportional to the length (x).
- the audio signal processing apparatus can calculate the length x based on the position of the sound source and the position of the object W with respect to the listener. Further, the audio signal processing apparatus can calculate the length (x) based on the shape of the object (W).
- the audio signal processing apparatus can filter the input audio signal based on the acoustic transmittance of the object W. Specifically, the audio signal processing apparatus can determine the transmission attenuation gain based on the acoustic transmittance of the object W.
- the acoustic transmittance may indicate the degree to which the object W passes the sound.
- the acoustic transmittance of the object W may vary depending on the material constituting the object W.
- the acoustical transmittance may vary according to the frequency component of the audio signal. In the present disclosure, a frequency component may represent a frequency bin of a predetermined magnitude.
- the audio signal processing apparatus can determine the acoustic transmittance based on the information about the material constituting the object W.
- the first material may transmit an audio signal relatively more than the second material.
- for example, if the object W is made of the first material, the acoustic transmittance of the object W may be higher than that of other objects made of the second material.
- the transmittance of the third material may be different from that of the fourth material.
- the third material may transmit the first frequency component relatively more than the second frequency component.
- the acoustic transmittance of the object W may be relatively high in the first frequency component as compared to the second frequency component.
- the first frequency component and the second frequency component may be frequency bands differentiated based on a predetermined frequency in the entire frequency domain.
- the first frequency component may be a frequency band lower than a predetermined frequency.
- the second frequency component may be a frequency band higher than a predetermined frequency.
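The two factors above, the overlap length x and the frequency-dependent material transmittance, can be combined into a single per-bin transmission attenuation gain. The sketch below raises a per-metre transmittance curve to the power of the traversed thickness and applies it in the frequency domain to one signal frame; this exponential model and the STFT-domain application are illustrative assumptions.

```python
import numpy as np

def transmitted_frame(x, transmittance_per_bin, overlap_len_m, n_fft=1024):
    """Attenuate one frame of the input per frequency bin, based on the
    length of the section where the direct path overlaps the object
    (overlap_len_m) and the material's per-metre acoustic transmittance.
    `transmittance_per_bin` holds n_fft // 2 + 1 values in (0, 1]."""
    # Longer overlap -> smaller gain; lossier bins attenuate faster.
    gain = np.asarray(transmittance_per_bin) ** overlap_len_m
    spec = np.fft.rfft(x, n=n_fft)
    return np.fft.irfft(spec * gain, n=n_fft)
```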
- the audio signal processing apparatus can binaurally render the input audio signal based on the HRTF corresponding to the sound source and the attenuation gain to generate a transmitted audio signal.
- the audio signal processing apparatus can obtain the HRTF corresponding to the sound source based on the head direction of the listener and the position of the sound source.
- the HRTF may include an ipsilateral HRTF and contralateral HRTF pair.
- the transfer functions include a Head Related Transfer Function (HRTF), an Interaural Transfer Function (ITF), a Modified ITF (MITF), a Binaural Room Transfer Function (BRTF), a Room Impulse Response (RIR), a Binaural Room Impulse Response (BRIR), a Head Related Impulse Response (HRIR), and modified and edited data thereof, and the present disclosure is not limited thereto.
- the audio signal processing apparatus can acquire the transfer function from a separate database.
- the transfer function may be a fast Fourier transform (FFT) of an impulse response (IR), but the transform method is not limited thereto.
- the transform method may include at least one of a Quadratic Mirror Filterbank (QMF), a Discrete Cosine Transform (DCT), a Discrete Sine Transform (DST), or a wavelet.
- QMF Quadratic Mirror Filterbank
- DCT Discrete Cosine Transform
- DST Discrete Sine Transform
- the audio signal processing apparatus can obtain H0, which is the HRTF corresponding to the sound source. Further, the audio signal processing apparatus can generate a transmitted audio signal based on H0 and the above-described attenuation gain.
- FIGS. 4 and 5 are diagrams illustrating a method by which an audio signal processing apparatus according to an embodiment of the present disclosure generates a diffracted audio signal based on an input audio signal.
- the audio signal processing device may determine a diffraction point at which the sound corresponding to the input audio signal is diffracted at the surface of the blocking object.
- the sound output from the sound source can be diffracted at the diffraction point on the surface of the blocking object to reach the listener.
- the audio signal processing apparatus can determine at least one diffraction point based on the shape of the blocking object.
- the audio signal processing apparatus can determine at least one diffraction point based on the diffraction distance at the surface of the blocking object.
- the audio signal processing apparatus can determine a point at which the diffraction distance is the smallest among the points on the surface of the blocking object as the diffraction point.
- the diffraction distance may represent the sum of the distance of a first path from the sound source to a first point on the blocking object's surface and the distance of a second path from the first point to the listener.
- the first path and the second path may be shortest paths that do not cross the blocking object.
- the longer the distance over which the sound is diffracted, the smaller the level of the sound reaching the listener, and the more the characteristics of the audio signal may be altered.
- the longer the diffraction distance, the greater the degree of attenuation, and reproducing the occlusion effect may be ineffective relative to the required computational complexity.
- the audio signal processing apparatus can efficiently model the diffracted sound based on the diffraction distance.
- the shortest path that does not intersect the blocking object may pass through a plurality of points on the surface of the blocking object.
- the audio signal processing apparatus can determine the last point where the diffraction path of the sound output from the sound source meets the blocking object as the diffraction point.
- the diffraction path represents the entire first path and the second path.
- the audio signal processing device can binaurally render the input audio signal based on the last point at which the acoustic path of the diffracted sound meets the blocking object, to produce a diffracted audio signal.
- the diffraction distance with respect to point a on the surface of the blocking object W may be the sum of the first distance from the position O of the sound source to point a and the second distance from point a to the listener A.
- the audio signal processing apparatus can determine the point (a) having the smallest diffraction distance as the diffraction point.
- Point (a) may be one point with the shortest diffraction distance among the plurality of points on the surface of the blocking object.
- the audio signal processing apparatus may generate a diffracted audio signal based on a plurality of diffraction points.
- the audio signal processing apparatus can divide the blocking object into a plurality of regions to determine a diffraction point for each region.
- the audio signal processing apparatus can determine a point corresponding to the shortest diffraction distance for each divided region as a diffraction point for each region.
- the audio signal processing apparatus can divide the blocking object based on at least one of the size and the shape of the object.
- the audio signal processing apparatus can divide the blocking object into a plurality of regions by referring to a coordinate axis representing a blocking object in a virtual space.
- the blocking object may be a two-dimensional or a three-dimensional object.
- the audio signal processing apparatus can divide the blocking object into a first region including point a and a second region including points b and c, based on the face that includes point a and point c.
- the audio signal processing apparatus can determine the point a having the shortest diffraction distance in the first region as the diffraction point in the first region.
- the audio signal processing apparatus can determine the point c having the shortest diffraction distance in the second region as the diffraction point in the second region.
- the diffraction distance corresponding to point c may be the sum of the distance from the sound source O to point b, the distance from point b to point c, and the distance from point c to the listener.
- the diffraction path of the diffracted sound in the second region can cross a plurality of points on the surface of the blocking object.
- the audio signal processing apparatus can determine the point c, which is the last point where the diffraction path of the sound output from the sound source meets the blocking object, as the diffraction point. Further, the audio signal processing apparatus can binaurally render the input audio signal based on the point c. This will be described later.
- the audio signal processing apparatus can limit the number of diffraction points. For example, the audio signal processing apparatus can determine a maximum number of diffraction points. In addition, the audio signal processing apparatus can generate a diffracted audio signal based on a number of diffraction points equal to or less than the maximum number. For example, the audio signal processing apparatus can generate a diffracted audio signal for each of at most the maximum number of diffraction points selected from among the per-region diffraction points.
- the audio signal processing apparatus can select up to the maximum number of diffraction points in increasing order of diffraction distance, starting from the diffraction point having the shortest diffraction distance. For example, when the maximum number of diffraction points is two and the blocking object is divided into three regions, the audio signal processing apparatus generates a first diffracted audio signal based on the first diffraction point corresponding to the shortest diffraction distance. At this time, if there is exactly one point corresponding to the shortest diffraction distance, the audio signal processing apparatus can generate a second diffracted audio signal based on the second diffraction point corresponding to the second-shortest diffraction distance.
- the number of distinct diffraction points sharing the same diffraction distance may be larger than the number of remaining diffraction point slots.
- in this case, the audio signal processing apparatus can select, from among the points sharing the same diffraction distance, as many points as the remaining maximum number of diffraction points allows.
- the audio signal processing apparatus can set the diffraction point so that the distance between the selected diffraction points becomes maximum. Further, according to one embodiment, the audio signal processing apparatus can determine the maximum number of diffraction points based on the processing performance of the audio signal processing apparatus.
- the processing performance may include the processing speed of the processor included in the audio signal processing apparatus. This is because the resources that can be allocated to the operation of generating the diffracted audio signal may be limited depending on the processing speed of the processor.
- the processing capabilities of the audio signal processing apparatus may include the computing power of the memory or GPU included in the audio signal processing apparatus.
- the audio signal processing apparatus may determine, as a diffraction point, a point on the surface of the blocking object whose diffraction distance is shorter than a predetermined distance. The longer the diffraction distance becomes, the greater the degree of attenuation becomes, and reproducing the occlusion effect may be ineffective relative to the required computation amount.
- in such a case, the audio signal processing apparatus may not generate the diffracted audio signal.
- the diffraction distance based on point c may be longer than the predetermined distance, while the diffraction distance based on point a may be shorter than the predetermined distance.
- in this case, the audio signal processing apparatus may determine only point a as the diffraction point. If there is no point having a diffraction distance shorter than the predetermined distance, the audio signal processing apparatus may not determine any diffraction point. On the other hand, when there are a plurality of points having a diffraction distance shorter than the predetermined distance, the audio signal processing apparatus can select some of the plurality of points.
- the predetermined distance may be a value set based on the distance from the sound source to the listener.
- the predetermined distance may be set to a larger value as the distance from the sound source to the listener becomes longer.
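Putting the selection rules above together, a renderer can sample candidate points on the blocking object's surface, compute each candidate's diffraction distance, discard candidates beyond a threshold that grows with the source-listener distance, and keep at most a maximum number of points in increasing order of distance. The brute-force sampling and the threshold policy below are illustrative assumptions, and the constraint that each path must not cross the object is omitted for brevity.

```python
import numpy as np

def select_diffraction_points(source, listener, surface_points,
                              max_points=2, threshold_scale=3.0):
    """Pick up to `max_points` surface points with the shortest
    diffraction distance (source -> point -> listener), dropping points
    whose distance exceeds a threshold proportional to the direct
    source-listener distance."""
    source, listener = np.asarray(source), np.asarray(listener)
    threshold = threshold_scale * np.linalg.norm(listener - source)
    candidates = []
    for p in surface_points:
        p = np.asarray(p)
        dist = np.linalg.norm(p - source) + np.linalg.norm(listener - p)
        if dist <= threshold:
            candidates.append((dist, tuple(p)))
    candidates.sort()  # ascending diffraction distance
    return candidates[:max_points]  # list of (distance, point)
```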
- the audio signal processing apparatus can obtain the HRTF corresponding to the diffraction point based on the head direction of the listener and the diffraction point.
- the HRTF may be an HRTF corresponding to a different location from the HRTF used to generate the transmitted audio signal.
- the audio signal processing apparatus can obtain the HRTF corresponding to the diffraction point with respect to the head direction of the listener.
- the audio signal processing apparatus can obtain H1 different from H0 which is the HRTF corresponding to the position of the sound source.
- H1 may be the HRTF corresponding to the diffraction point (a) with respect to the head direction of the listener.
- the audio signal processing apparatus can obtain the HRTF corresponding to each of the plurality of diffraction points based on the position of the listener. Further, the audio signal processing apparatus can binaurally render the input audio signal using the HRTF corresponding to the diffraction point. The audio signal processing apparatus may binaurally render the input audio signal using the HRTF corresponding to the diffraction point to generate a diffracted audio signal. As described above, the diffraction path can cross a plurality of points on the surface of the blocking object. In this case, the audio signal processing apparatus can binaurally render the input audio signal to generate a diffracted audio signal based on the HRTF corresponding to the last point where the diffraction path of the sound output from the sound source meets the blocking object.
- the audio signal processing apparatus can generate the diffracted audio signal based on the diffraction distance.
- the audio signal processing apparatus can generate the diffracted audio signal by attenuating the input audio signal based on the diffraction distance corresponding to the diffraction point.
- the size of the sound output from the sound source is attenuated according to the diffraction distance.
- the audio signal processing apparatus can determine the diffraction attenuation gain based on the diffraction distance through the diffraction point.
- the audio signal processing device may multiply the input audio signal by the diffraction attenuation gain. At this time, the audio signal processing apparatus can determine the diffraction attenuation gain differently for each frequency component.
- the audio signal processing apparatus can set the attenuation gain so that the degree of attenuation becomes smaller as the frequency becomes lower.
- the diffracted sound can be delayed compared to the direct sound. This is because the path through which the sound output from the sound source is transmitted to the listener becomes longer.
- the audio signal processing apparatus can generate the diffracted audio signal by delaying the input audio signal based on the diffraction distance.
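The delay and the frequency-dependent attenuation can be applied together as sketched below: the extra path length beyond the direct distance yields a sample delay, and a gain curve that rolls off with frequency models the fact that low frequencies diffract with less attenuation. The specific gain curve, the single-frame processing, and the constants are illustrative assumptions.

```python
import numpy as np

def diffracted_frame(x, diff_dist_m, direct_dist_m,
                     fs=48000, n_fft=1024, c=343.0):
    """Delay one frame of the input by the extra diffraction path length
    and attenuate it per frequency bin, low frequencies least."""
    delay = int(round((diff_dist_m - direct_dist_m) / c * fs))
    delayed = np.concatenate([np.zeros(delay), x])
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    # Illustrative roll-off: unity at DC, stronger with frequency and distance.
    gain = 1.0 / (1.0 + (freqs / 1000.0) * (diff_dist_m / direct_dist_m))
    spec = np.fft.rfft(delayed[:n_fft], n=n_fft)
    return np.fft.irfft(spec * gain, n=n_fft)
```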
- the audio signal processing apparatus may mix the transmitted audio signal generated by the method described with reference to FIG. 3 and the diffracted audio signal to generate an output audio signal.
- the audio signal processing device may mix the binaurally rendered transmitted audio signal and the diffracted audio signal for each ear of the listener.
- the acoustic path from the sound source to the head center of the listener and the acoustic path from the sound source to both ears of the listener may be different from each other. Accordingly, the influence of the object W on each acoustic path from the sound source to both ears of the listener can be changed.
- for example, the object W may be located on the second acoustic path from the sound source to the listener's right ear, while the object W is not located on the first acoustic path from the sound source to the listener's left ear.
- different objects may be located in each of the first acoustic path and the second acoustic path.
- the audio signal processing apparatus may model different transfer functions for the first acoustic path and the second acoustic path, respectively.
- the audio signal processing apparatus can determine an azimuth angle and an elevation angle corresponding to a position O of a sound source from a center of a listener's head in a virtual space.
- the audio signal processing apparatus can binaurally render the input audio signal corresponding to the sound source using the transfer function H0 corresponding to the determined azimuth and elevation angles.
- the reference distance may represent the distance at which the HRTF set including the HRTF was measured, with respect to the listener.
- the transfer function H0 may be part of the HRTF set measured based on the reference distance R.
- the set of HRTFs may be a set of transfer functions centered at the listener's head center and representing properties measured at points on the sphere with the reference distance R as a radius.
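Given such a set measured on a sphere of radius R, a renderer typically looks up the stored HRTF pair nearest to a requested direction. A minimal nearest-neighbour lookup is sketched below; the dictionary layout keyed by (azimuth, elevation) is our assumption about how the set is stored.

```python
def nearest_hrtf(hrtf_set, azimuth_deg, elevation_deg):
    """Return the stored HRTF pair whose measurement direction is
    closest to the requested one. `hrtf_set` is assumed to map
    (azimuth_deg, elevation_deg) tuples to (left, right) HRIR pairs."""
    def squared_angular_dist(key):
        az, el = key
        d_az = (az - azimuth_deg + 180.0) % 360.0 - 180.0  # wrap azimuth
        return d_az ** 2 + (el - elevation_deg) ** 2
    return hrtf_set[min(hrtf_set, key=squared_angular_dist)]
```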
- the audio signal processing apparatus can use the transfer function H0 obtained by the above-described method. However, if the head size of the listener (or the distance between the ears) is greater than or equal to a threshold distance set with respect to the HRTF measurement distance R, binaural rendering using the transfer function H0 may degrade performance. As shown in FIGS. 7 and 8, the HRTF obtained based on the position of each of the listener's ears differs from the HRTF obtained based on the position of the listener's head center.
- HRTF sets having various reference distances could be configured to improve the performance of the binaural rendering.
- however, the number of points to be measured by the apparatus generating the HRTF sets may increase.
- moreover, it may be difficult for the database to store all of the HRTF sets measured at the various reference distances.
- FIGS. 7 and 8 are diagrams showing HRTF pairs obtained when the sound source is located closer to or farther from the listener than the reference distance at which the HRTF set was generated.
- the angle from the listener's left ear to the sound source, with respect to the listener's head direction, is theta_c.
- the angle from the listener's right ear to the sound source, with respect to the listener's head direction, is theta_i.
- theta_c and theta_i may be different from each other.
- theta_c and theta_i may also differ from the angle theta_O from the center of the listener's head to the sound source with respect to the listener's head direction.
- the acoustic path from the source to both ears of the listener may be different from the acoustic path from the source to the listener's head center.
- the transfer functions Hi and Hc, obtained at reference positions projected onto the spherical surface whose radius is the measured reference distance R of the HRTF set, may differ from the transfer function H0 obtained on the basis of the listener's head center.
- the audio signal processing apparatus can obtain HRTFs corresponding to different positions for each of the listener's ears, based on the reference distance at which the HRTF set was generated and the distance between the sound source and the listener.
- the audio signal processing apparatus can obtain the ipsilateral HRTF and the contralateral HRTF, corresponding respectively to the listener's ipsilateral and contralateral ears, based on the reference distance, the position of each of the listener's ears, and the position of the sound source.
- the ipsilateral (left) HRTF may be the HRTF corresponding to the listener's left ear in the transfer function pair corresponding to the position of Hc.
- the contralateral (right) HRTF may be the HRTF corresponding to the listener's right ear in the transfer function pair corresponding to the position of Hi.
- the audio signal processing apparatus can binaurally render the input audio signal based on the obtained ipsilateral HRTF and contralateral HRTF to generate an output audio signal.
- the audio signal processing apparatus can determine the presence or absence of a blocking object independently for each of the listener's ipsilateral and contralateral sides with respect to the sound source, based on the positions of the listener's ears. This is because the influence of the object can change according to the positional relationship between each of the listener's ears and the sound source corresponding to the input audio signal. Specifically, the audio signal processing apparatus can determine whether there is an obstacle between the sound source corresponding to the input audio signal and the listener's ipsilateral side based on the information about the virtual space. Further, the audio signal processing apparatus can determine whether there is an obstacle between the sound source corresponding to the input audio signal and the listener's contralateral side, based on the information about the virtual space.
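The per-ear processing described above can be sketched by combining the earlier helpers: compute each ear's own angle to the sound source (theta_c and theta_i in FIGS. 7 and 8), fetch each ear's HRTF independently, and run the blocking test per ear. `estimate_ear_positions`, `nearest_hrtf`, and `find_blocking_objects` are the illustrative helpers sketched earlier; azimuth-only angles are assumed for brevity.

```python
import numpy as np

def per_ear_setup(source, head_center, head_yaw_rad, head_width,
                  hrtf_set, objects):
    """Per-ear HRTF selection and per-ear occlusion check (sketch),
    reusing the illustrative helpers defined in the earlier sketches."""
    ear_left, ear_right = estimate_ear_positions(head_center, head_yaw_rad,
                                                 head_width)
    setup = {}
    for side, ear in (("left", ear_left), ("right", ear_right)):
        v = np.asarray(source) - ear
        # Azimuth of the source as seen from this ear, relative to head yaw.
        azimuth = np.degrees(np.arctan2(v[1], v[0]) - head_yaw_rad)
        setup[side] = {
            "hrtf": nearest_hrtf(hrtf_set, azimuth, 0.0),
            "blockers": find_blocking_objects(source, ear, objects),
        }
    return setup
```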
- FIG. 9 is a diagram showing the operation of the audio signal processing apparatus when the presence or absence of an object is different in each acoustic path between each of the ears of the listener and the sound source.
- the audio signal processing apparatus can generate an output audio signal using HRTFs obtained at different positions for each of the ipsilateral and contralateral sides corresponding to each of the listener's ears, as in FIGS. 7 and 8.
- the output audio signal may include an ipsilateral output audio signal and a contralateral output audio signal.
- no blocking object may be located in the first acoustic path between the sound source and the left ear L of the listener.
- the audio signal processing apparatus may not apply the effect of the object W to the left output audio signal for the left ear (L) of the listener.
- in this case, the output audio signal for the listener's left ear (L) may be closer to the actual sound when the occlusion effect of the object W is not applied.
- the audio signal processing apparatus can apply the effect of the object W to the right output audio signal for the right ear (R) of the listener.
- the audio signal processing apparatus can generate the left output audio signal based on the left transfer function Hi.
- the audio signal processing apparatus can generate a right output audio signal in which the transmitted audio signal and the diffracted audio signal are mixed, based on the right transfer function Hc and the information on the blocking object.
- the audio signal processing apparatus may generate an indirectly diffracted audio signal that simulates sound diffracted at the diffraction point determined by the method described above with reference to FIGS. 4 and 5 and then diffracted again at another point on the surface of the blocking object that includes the diffraction point, before being delivered to the listener.
- the blocking object may mask only one of the acoustic paths corresponding to each of the ears of the listener.
- the output audio signal corresponding to the other one may not include the diffracted audio signal.
- the audio signal processing device can provide a realistic audio signal to the user.
- the indirect diffraction point may represent a diffraction point determined by treating a direct diffraction point as a virtual sound source.
- the indirectly diffracted audio signal may be an audio signal simulating an indirect diffraction sound.
- the indirect diffraction sound may be sound that is output from the sound source, diffracted at the surface of the blocking object, diffracted again at another point on the same blocking object surface, and delivered to the listener.
- hereinafter, a diffraction point that is not an indirect diffraction point is referred to as a direct diffraction point.
- the indirect diffraction point may be a diffraction point determined by using the direct diffraction point as a virtual sound source.
- the audio signal processing apparatus can determine, as the indirect diffraction point, the point at which the sum of the distances of the first path from the sound source to the direct diffraction point, the third path from the direct diffraction point to the indirect diffraction point, and the fourth path from the indirect diffraction point to the listener is minimized.
- each path may be the shortest path that does not traverse the blocking object, like the first path and the second path described above.
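- As a non-normative illustration, an indirect diffraction point satisfying this minimum-distance condition could be searched over sampled points of the object surface as follows; the brute-force search and the omission of the visibility constraint (each leg must not traverse the object) are simplifying assumptions of the sketch.

```python
import numpy as np

def indirect_diffraction_point(source, direct_point, ear, surface_points):
    """Return the candidate surface point minimizing
    |source->direct_point| + |direct_point->candidate| + |candidate->ear|."""
    source, direct_point, ear = map(np.asarray, (source, direct_point, ear))
    leg1 = np.linalg.norm(direct_point - source)        # first path (fixed)
    best_point, best_total = None, np.inf
    for cand in surface_points:
        cand = np.asarray(cand)
        total = (leg1
                 + np.linalg.norm(cand - direct_point)  # third path
                 + np.linalg.norm(ear - cand))          # fourth path
        if total < best_total:
            best_point, best_total = cand, total
    return best_point, best_total
```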
- FIG. 10 is a diagram illustrating an example in which an output audio signal according to an embodiment of the present disclosure is configured differently for each ear of the listener.
- referring to FIG. 10, no blocking object is located in the first acoustic path between the sound source and the left ear (L) of the listener, while a blocking object may be located in the second acoustic path between the sound source and the right ear (R) of the listener.
- the diffraction points for the second acoustic path may be D1 and D3.
- the audio signal processing apparatus can generate the right diffracted audio signal based on the diffraction points D1 and D3 on the surface of the blocking object.
- the audio signal processing apparatus can binaurally render the input audio signal based on the transfer function HD1 corresponding to the diffraction point D1 to generate a first right diffracted audio signal. Further, the audio signal processing apparatus can binaurally render the input audio signal based on the transfer function HD3 corresponding to the diffraction point D3 to generate a second right diffracted audio signal. In addition, the audio signal processing apparatus can binaurally render the input audio signal based on the right transfer function Hi to generate a transmitted audio signal. Next, the audio signal processing apparatus may mix the transmitted audio signal, the first right diffracted audio signal, and the second right diffracted audio signal to generate a right output audio signal.
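- A minimal sketch of this mixing step, assuming the transfer functions are available as impulse responses; the function name and the mixing gains are placeholders of this sketch, not values taken from the disclosure.

```python
import numpy as np

def right_output(x, h_trans, h_d1, h_d3, g_trans=0.6, g_d1=0.2, g_d3=0.2):
    """Convolve the input with each right-ear response and mix the results."""
    parts = [(np.convolve(x, h), g) for h, g in
             ((h_trans, g_trans), (h_d1, g_d1), (h_d3, g_d3))]
    out = np.zeros(max(len(p) for p, _ in parts))
    for p, g in parts:
        out[:len(p)] += g * p
    return out
```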
- the left output audio signal corresponding to the first acoustic path may not include the diffracted audio signal, since the first acoustic path does not overlap with the blocking object. However, the audio signal processing apparatus may generate a left output audio signal including an indirectly diffracted audio signal. Referring to FIG. 10, the audio signal processing apparatus may generate the indirectly diffracted audio signal by using the diffraction points D1 and D3 of the second acoustic path as virtual sound sources. First, in the case of D1, the shortest path from D1 to the left ear (L) of the listener may not pass through another point of the blocking object, so that an indirect diffraction sound may not exist.
- in the case of D3, on the other hand, the audio signal processing apparatus can determine the point D2 as the indirect diffraction point. Further, the audio signal processing apparatus can render the input audio signal based on the indirect diffraction point D2. For example, the audio signal processing apparatus can binaurally render the input audio signal based on the transfer function HD2 corresponding to the diffraction point D2 to generate an indirectly diffracted audio signal.
- the method for determining the direct diffraction point and the method for generating the diffracted audio signal described above can be applied, in the same or a corresponding manner, to the method for determining the indirect diffraction point and the method for generating the indirectly diffracted audio signal, respectively.
- the audio signal processing apparatus may attenuate the input audio signal based on the indirect diffraction distance over which the sound output from the sound source reaches the listener via the direct diffraction point and the indirect diffraction point.
- the audio signal processing apparatus can generate an indirectly diffracted audio signal by delaying the input audio signal based on the indirect diffraction distance.
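- For illustration, the attenuation and delay based on the indirect diffraction distance might be modeled as below, assuming a simple inverse-distance law and a nominal speed of sound; both are modeling assumptions of this sketch rather than values prescribed by the disclosure.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed

def delay_and_attenuate(x, distance, fs, ref_distance=1.0):
    """Apply a 1/r gain and a propagation delay for the given path length."""
    gain = ref_distance / max(distance, ref_distance)
    delay_samples = int(round(distance / SPEED_OF_SOUND * fs))
    return np.concatenate([np.zeros(delay_samples), gain * np.asarray(x)])
```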
- the audio signal processing apparatus can generate the direct audio signal based on the left transfer function Hc.
- the audio signal processing apparatus can mix the direct audio signal and the indirectly diffracted audio signal to generate a left output audio signal.
- the audio signal processing apparatus can determine whether to generate an indirectly diffracted audio signal based on the size of the blocking object. For example, if the size of the blocking object is smaller than the head size of the listener, the audio signal processing apparatus can generate an indirectly diffracted audio signal. In this case, modeling the indirect diffraction sound by the audio signal processing apparatus can help provide a sense of reality to the user. On the other hand, when the size of the blocking object is larger than the head size of the listener, the audio signal processing apparatus may not generate the indirectly diffracted audio signal. Further, the audio signal processing apparatus can determine whether to generate the indirectly diffracted audio signal based on at least one of the position and the shape of the blocking object.
- FIG. 11 is a diagram illustrating a method by which an audio signal processing apparatus according to an embodiment of the present disclosure generates a reflected audio signal. Referring to FIG. 11, no blocking object is located in the first acoustic path between the sound source and the left ear (L) of the listener, while a blocking object may be located in the second acoustic path between the sound source and the right ear (R) of the listener.
- the audio signal processing device can determine the reflection point at which the sound corresponding to the input audio signal is reflected at the surface of the blocking object. Specifically, the audio signal processing apparatus can determine the reflection point based on the position, size, and shape of the blocking object. For example, based on the position of the sound source and the position of the listener, the audio signal processing apparatus can determine as the reflection point a point on the surface of the blocking object at which the angle of incidence from the sound source becomes equal to the angle of reflection toward the listener.
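- The equal-angle condition corresponds to the classical image-source construction. The sketch below, which assumes the relevant surface of the blocking object can be locally treated as a plane, finds the specular point by mirroring the source across that plane; this planar assumption is the sketch's, not the disclosure's.

```python
import numpy as np

def reflection_point_on_plane(source, listener, plane_point, plane_normal):
    """Mirror the source across the plane; the specular point lies where the
    mirrored-source-to-listener segment crosses the plane (equal angles hold there)."""
    source, listener, plane_point, n = map(
        np.asarray, (source, listener, plane_point, plane_normal))
    n = n / np.linalg.norm(n)
    mirrored = source - 2.0 * float((source - plane_point) @ n) * n
    d = listener - mirrored
    denom = float(d @ n)
    if abs(denom) < 1e-9:
        return None                     # path parallel to the plane: no specular point
    t = float((plane_point - mirrored) @ n) / denom
    return mirrored + t * d if 0.0 <= t <= 1.0 else None
```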
- the audio signal processing apparatus can binaurally render the input audio signal based on the listener's head direction and the position of the reflection point to generate a reflected audio signal.
- the audio signal processing apparatus can obtain the HRTF corresponding to the reflection point based on the listener's head direction and the position of the reflection point.
- the audio signal processing apparatus can obtain the transfer function HR corresponding to the reflection point R'.
- the audio signal processing apparatus can binaurally render the input audio signal based on the transfer function HR to generate a reflected audio signal.
- the audio signal processing apparatus can generate the reflected audio signal based on the information about the blocking object.
- the information about the blocking object may include the acoustic reflectance of the object.
- acoustic reflectance can indicate the ratio of the magnitude of the sound reflected by the object to the magnitude of the sound before reflection.
- the audio signal processing apparatus can determine the reflection attenuation gain based on at least one of the information indicating the material constituting the blocking object or the reflectance of the blocking object. This is because the reflection attenuation gain may vary depending on the material constituting the blocking object.
- the audio signal processing apparatus can generate the reflected audio signal based on the reflection distance indicating the length of the reflection path. Specifically, the audio signal processing apparatus can generate a reflected audio signal by attenuating the input audio signal based on the reflection distance corresponding to the reflection point. The magnitude of the reflected sound delivered to the listener is attenuated, relative to the sound output from the sound source, according to the reflection distance. Specifically, the audio signal processing apparatus can determine the reflection attenuation gain based on the reflection distance corresponding to the reflection point. Also, the reflected sound can be delayed compared to the direct sound, because the path over which the sound output from the sound source travels becomes longer.
- the audio signal processing apparatus can generate the reflected audio signal by delaying the input audio signal based on the reflection distance.
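- Illustratively, the reflection attenuation gain could combine the material reflectance with an inverse-distance term, with the delay applied via the delay_and_attenuate sketch above; the multiplicative combination shown here is an assumption of the sketch.

```python
def reflection_gain(reflectance, reflection_distance, ref_distance=1.0):
    """Material reflectance times a simple inverse-distance law (both assumed)."""
    return reflectance * ref_distance / max(reflection_distance, ref_distance)
```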
- the audio signal processing apparatus may mix the direct audio signal, the indirectly diffracted audio signal, and the reflected audio signal generated by the above-described method to generate a left output audio signal.
- the audio signal processing apparatus can mix the transmitted audio signal and the diffracted audio signal generated by the above-described method to generate a right output audio signal.
- the audio signal processing apparatus can generate a reverberant audio signal corresponding to the room reverberation that the virtual space imposes on the sound output from a sound source.
- the reverberation may be performed in the post-processing process by the processor 12 described above.
- FIG. 12 is a diagram showing a method of generating a reverberation audio signal corresponding to each of the two ears of the listener.
- the listener in the virtual space may be located at the boundary of divided spaces having different reverberation characteristics, as shown in FIG. 12. In this case, the two ears of the listener can acquire sound through spaces having different reverberation characteristics.
- the audio signal processing apparatus can generate a reverberant audio signal for each of the listener's ears based on the reverberation filter corresponding to the divided space in which that ear is located.
- the audio signal processing apparatus may filter the input audio signal based on different reverberation filters for each of the right and left sides of the listener.
- the audio signal processing apparatus can determine the right reverberation filter and the left reverberation filter corresponding to the right and left sides of the listener, respectively, based on the position of each of the listener's ears.
- the audio signal processing apparatus binaurally renders an input audio signal on the basis of the right reverberation filter and the left reverberation filter, thereby generating a reverberant audio signal corresponding to each of the right and left sides of the listener.
- the audio signal processing apparatus can generate a reverberant audio signal for the left ear based on the first reverberation filter generated based on the characteristics of the space R_A.
- the audio signal processing apparatus can generate a reverberant audio signal for the right ear based on the second reverberation filter generated based on the characteristics of the space R_B.
- the first and second reverberation filters may be filters having different values of at least one filter coefficient.
- the audio signal processing device may combine the first and second reverberation filters to generate one representative reverberation filter.
- the audio signal processing apparatus may generate reverberant audio signals for left and right using the representative reverberation filter.
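- A minimal sketch of this per-ear reverberation, assuming each divided space is characterized by a reverberation impulse response; the option to average the two filters into one representative filter follows the alternative described above, and the equal 0.5 weights are an assumption.

```python
import numpy as np

def binaural_reverb(x, h_room_left, h_room_right, combine=False):
    """Filter the input with a room-dependent reverb IR per ear; optionally
    average both IRs into one representative filter first."""
    if combine:
        n = max(len(h_room_left), len(h_room_right))
        h = np.zeros(n)
        h[:len(h_room_left)] += 0.5 * np.asarray(h_room_left)
        h[:len(h_room_right)] += 0.5 * np.asarray(h_room_right)
        h_room_left = h_room_right = h
    return np.convolve(x, h_room_left), np.convolve(x, h_room_right)
```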
- each step of the process described below may be performed by a software component executed on a hardware configuration such as a processor.
- for example, the processor 12 described above with reference to FIG. 2 may perform the processing described in the following figures.
- the audio signal processing apparatus may preprocess the input audio signal based on the spatial information and the listener information (S100).
- the input audio signal may include a plurality of object signals.
- the input audio signal may include at least one of an object signal, an ambisonic signal, and a channel signal.
- the audio signal processing apparatus can generate an intermediate audio signal in which the acoustic occlusion effect caused by the plurality of objects included in the virtual space is simulated.
- the intermediate audio signal may be one object signal or a monaural signal.
- an intermediate audio signal may be a multi-channel signal.
- the audio signal processing apparatus can acquire the HRTF used for binaural rendering based on the spatial information and the listener information.
- the HRTF may include an ipsilateral HRTF and contralateral HRTF pair.
- the audio signal processing apparatus may binaurally render the preprocessed audio signal to generate an output audio signal (S200).
- the output audio signal may be a binaural signal.
- the output audio signal may be a 3D audio headphone signal (i.e., a 3D audio 2-channel signal).
- the audio signal processing apparatus can binaurally render the intermediate audio signal using the HRTF pair obtained in the preprocessing step (S100).
- the binaural rendering may be performed in the time domain or the frequency domain.
- the intermediate audio signal may be a two-channel audio signal corresponding to each of the listener's ears.
- the audio signal processing apparatus may further perform post-processing on the output audio signal.
- Post-processing may include Cross-Talk Cancellation, Dynamic Range Control (DRC), Volume Normalization, Peak Limiter, and Reverberator.
- like the binaural rendering, the post-processing can be performed in the time domain or the frequency domain.
- the audio signal processing apparatus may perform frequency / time domain conversion of the output audio signal in the post-processing process.
- the audio signal processing apparatus may include a post-processing block processor for performing the post-processing. Alternatively, the post-processing may be performed through the processor 12 of FIG. 2.
- an audio signal processing apparatus may analyze an acoustic space (S110).
- the audio signal processing apparatus can analyze the acoustic path from the sound source to both ears of the listener based on the position of the listener.
- the audio signal processing apparatus can determine whether there is a blocking object between the listener and the sound source based on the acoustic path.
- the audio signal processing apparatus can determine whether a blocking object exists based on the position of the sound source, the position of the listener, and the position of each of a plurality of objects included in the virtual space.
- the audio signal processing apparatus can determine at least one blocking object from among a plurality of objects.
- the audio signal processing apparatus can generate the modeling information based on the object-related information of each of the determined blocking objects.
- the object related information may be in the form of metadata for the input audio signal.
- the object-related information may include positional information of the object.
- the object-related information may include attribute information indicating whether the object is a sound object or a non-sound object.
- the blocking object may be a non-sound object.
- the blocking object may also be at least one of a passive object, a non-audio object, a scene object, a visual object, an acoustic object, an acoustic element, an occluder, a reflector, or an absorber.
- the object-related information may include information on the material constituting the object.
- the information about the material may include at least one of sound absorption rate, reflectance, transmittance, diffraction rate, and scattering rate for each frequency component of the material constituting the object.
- the object-related information may include a frequency response characteristic in which information about a material constituting the object is reflected.
- the audio signal processing apparatus may select the audio signals on which binaural rendering is performed based on the object-related information of each object. Specifically, when the transmittance of a first blocking object is less than a reference transmittance, the audio signal processing apparatus may exclude the first audio signal corresponding to each of the at least one sound source blocked by the first blocking object. In this case, the audio signal processing apparatus may binaurally render the input audio signal, excluding the first audio signal, to generate an output audio signal.
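- A sketch of this transmittance-based source selection, assuming each source record carries a reference to the object (if any) blocking its path; the dictionary layout and threshold value are assumptions of the sketch.

```python
def select_renderable_sources(sources, reference_transmittance=0.01):
    """Skip sources whose only path to the listener crosses a nearly opaque object."""
    selected = []
    for src in sources:
        blocker = src.get("blocker")            # None when the path is clear
        if blocker is not None and blocker["transmittance"] < reference_transmittance:
            continue                            # effectively inaudible: not rendered
        selected.append(src)
    return selected
```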
- the audio signal processing apparatus can generate binaural information necessary for binaural rendering of the intermediate audio signal.
- the binaural information may include a binaural filter that binaurally renders an audio signal.
- binaural information may include horizontal angle and elevation angle information of a specific point relative to the listener.
- the audio signal processing apparatus can generate binaural information based on the listener position information and the listener's head direction information. For example, the audio signal processing apparatus can obtain the horizontal angle and the altitude angle corresponding to the position of the sound source on the basis of the listener. Further, the audio signal processing apparatus may acquire the HRTF corresponding to the position of the sound source on the basis of the listener.
- the audio signal processing apparatus can generate a binaural filter based on the position, size, and shape of the object. Thereby, the audio signal processing apparatus can model the diffracted sound or the reflected sound. For example, the audio signal processing apparatus may obtain a horizontal angle and an elevation angle representing a specific point on the surface of the object. The audio signal processing apparatus may then obtain a binaural filter used to generate the output audio signal based on the horizontal angle and the elevation angle representing that point.
- the binaural information may include Ipsilateral binaural information and contralateral binaural information.
- the first binaural information and the second binaural information may represent the ipsilateral binaural information and the contralateral binaural information, respectively.
- the ipsilateral binaural information may include at least one binaural filter for modeling the ipsilateral sound.
- the contralateral binaural information may include at least one binaural filter for modeling the contralateral sound.
- the audio signal processing apparatus may use the binaural information to simulate the acoustic occlusion effect caused by blocking objects.
- the audio signal processing apparatus can acquire binaural information including a plurality of binaural filter pairs through the above-described acoustic space analysis (S110). Alternatively, the audio signal processing apparatus may obtain binaural information including a plurality of sets of horizontal angles and elevation angles.
- the audio signal processing apparatus can generate the intermediate audio signal using the binaural information. For example, the audio signal processing apparatus may generate one representative binaural filter pair based on a plurality of binaural filter pairs. At this time, the audio signal processing apparatus can generate a plurality of intermediate audio signals for each of the ipsilateral side and the contralateral side based on the plurality of binaural filter pairs. This is because the binaural filter pairs used may vary depending on the type of sound being modeled (for example, transmitted sound, diffracted sound, and reflected sound). Further, when there are a plurality of blocking objects located between the listener and one sound source, the binaural filter pair may vary depending on the blocking object.
- the audio signal processing apparatus may mix a plurality of intermediate audio signals to generate a final intermediate audio signal.
- the audio signal processing apparatus may generate a representative binaural filter pair through a method of averaging, weighting, or compositing a plurality of binaural filter pairs. In this case, the audio signal processing apparatus can binaurally render the intermediate audio signal based on the generated representative binaural filter pair.
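- One way to form such a representative binaural filter pair, shown here as an optionally weighted average of impulse-response pairs; averaging is only one of the combination methods mentioned above, and all names in this sketch are illustrative.

```python
import numpy as np

def representative_pair(filter_pairs, weights=None):
    """Average several (ipsilateral, contralateral) impulse-response pairs."""
    if weights is None:
        weights = [1.0 / len(filter_pairs)] * len(filter_pairs)
    n = max(max(len(h_i), len(h_c)) for h_i, h_c in filter_pairs)
    ipsi, contra = np.zeros(n), np.zeros(n)
    for (h_i, h_c), w in zip(filter_pairs, weights):
        ipsi[:len(h_i)] += w * np.asarray(h_i)
        contra[:len(h_c)] += w * np.asarray(h_c)
    return ipsi, contra
```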
- the audio signal processing apparatus may generate the intermediate audio signal based on the modeling information obtained in the acoustic space analysis step (S110) (S120).
- the audio signal processing apparatus can generate an intermediate audio signal by filtering the input audio signal based on the modeling information.
- the intermediate audio signal may include a plurality of audio signals processed in a different manner from the input audio signal.
- the intermediate audio signal may include an audio signal that models the transmitted sound through the blocking object.
- the intermediate audio signal may also include an audio signal that models the diffracted sound diffracted at the surface of the blocking object and the reflected sound reflected at the surface of the blocking object.
- the audio signal processing apparatus may model the transmitted sound by preprocessing the input audio signal (S121). For example, the audio signal processing apparatus can filter the input audio signal based on the transmittance of the blocking object to generate a transmitted audio signal. At this time, a different transmittance value may be applied for each frequency bin of the input audio signal.
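- A single-frame sketch of this frequency-dependent transmission filtering; a practical implementation would use an overlap-add STFT, and the FFT size and gain layout here are assumptions of the sketch.

```python
import numpy as np

def transmit_filter(frame, transmittance_per_bin, n_fft=1024):
    """Scale each frequency bin by the blocking object's transmittance.
    transmittance_per_bin must hold n_fft // 2 + 1 gains in [0, 1]."""
    spectrum = np.fft.rfft(frame, n=n_fft)
    gains = np.asarray(transmittance_per_bin)
    return np.fft.irfft(spectrum * gains, n=n_fft)
```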
- the audio signal processing apparatus can pre-process the input audio signal to model at least one of the diffracted sound or the reflected sound (S122).
- the audio signal processing apparatus can model the diffracted sound and the reflected sound based on the time delay and the decay rate generated by the distortion of the acoustic path.
- the audio signal processing apparatus may filter the input audio signal based on the diffraction point on the surface of the blocking object to generate a diffracted audio signal.
- the method described in FIG. 4 can be applied to the method by which the audio signal processing apparatus generates the diffracted audio signal.
- the audio signal processing apparatus can generate a reflected audio signal from the input audio signal based on the reflection point on the surface of the blocking object.
- the method described in FIG. 11 can be applied to a method by which an audio signal processing apparatus generates a reflected audio signal.
- the audio signal processing apparatus may generate at least one intermediate audio signal by mixing the input audio signal for which modeling is bypassed, the transmitted audio signal, the diffracted audio signal, and the reflected audio signal (S123).
- the audio signal processing apparatus can determine the mixing ratio of the input audio signal, the transmitted audio signal, the diffracted audio signal, and the reflected audio signal based on the modeling information, and can mix them based on the determined ratio. For example, if a blocking object is present, the audio signal processing apparatus may exclude the bypassed input audio signal from the mix.
- the audio signal processing apparatus may omit some processing steps based on the modeling information obtained in the acoustic space analysis (S110).
- the audio signal processing apparatus can mix the audio signals required for modeling on each of the ipsilateral side and the contralateral side. Specifically, when a blocking object exists only in the acoustic path corresponding to the ipsilateral side, the audio signal processing apparatus can mix the transmitted audio signal and the diffracted audio signal to generate the ipsilateral intermediate audio signal. Further, the audio signal processing apparatus can mix the bypassed input audio signal and the reflected audio signal to generate the contralateral intermediate audio signal. Thus, the audio signal processing apparatus can provide more realistic spatial sound to the user.
- the intermediate audio signal may be a two-channel audio signal corresponding to each of the two ears of the listener.
- the intermediate audio signal may comprise a first intermediate audio signal and a second intermediate audio signal.
- the audio signal processing apparatus can analyze the acoustic space by dividing the sound path into left and right (or ipsilateral and contralateral) acoustic paths according to the listener's two ears. In this case, the audio signal processing apparatus can process the audio signal according to the divided acoustic paths. For example, in the acoustic space analysis process (S110), the audio signal processing apparatus can generate the ipsilateral and contralateral binaural filters, respectively. Further, in the audio signal preprocessing step (S120), the audio signal processing apparatus can generate the ipsilateral intermediate audio signal and the contralateral intermediate audio signal. In this case, the audio signal processing apparatus can independently process the first intermediate audio signal and the second intermediate audio signal.
- FIG. 16 is a diagram specifically illustrating the binaural rendering process (S200) illustrated in FIG.
- the audio signal processing apparatus can independently generate an output audio signal corresponding to each of the ipsilateral side and the contralateral side.
- the audio signal processing apparatus may binaurally render the first intermediate audio signal based on the first binaural information obtained in the acoustic space analysis step (S110) to generate a first output audio signal (S210).
- the audio signal processing apparatus may binaurally render the second intermediate audio signal based on the second binaural information obtained in the acoustic space analysis step (S110) to generate a second output audio signal (S220).
- the audio signal processing apparatus according to the embodiments of FIGS. 17 to 22 may be the same as or equivalent to the audio signal processing apparatus 10 described above. FIGS. 17 to 23 are block diagrams according to embodiments of the present disclosure. Blocks that are separately displayed logically distinguish elements of the audio signal processing apparatus according to their operations. Each block may be a software component executed by a hardware configuration such as a processor. Thus, the operation of each block illustrated in FIGS. 17 to 23 may be performed through an integrated processor including at least one processor. For example, the operation of each block may be performed by the processor 12 of FIG. 2. Accordingly, portions of the embodiments of FIGS. 17 to 23 that are the same as or correspond to the embodiment of FIG. 2 are not described again.
- the audio signal processing apparatus 160 may include a decoder 100, an object renderer 200, an ambisonic renderer 300, a channel renderer 400, and a mixer 500.
- the audio signal processing apparatus 160 may receive a bitstream in which an input audio signal has been encoded by an apparatus other than the audio signal processing apparatus 160.
- the decoder 100 may decode the input bitstream.
- the decoder 100 may decode the bit stream to obtain an input audio signal.
- the decoder 100 may decode the bitstream using the MPEG-H 3DA standard codec.
- the input audio signal may comprise a plurality of audio signals that are classified in at least one format.
- the input audio signal may include at least one of an object signal, an ambisonic signal, or a channel signal.
- the decoder 100 may classify a plurality of audio signals of different formats included in the input audio signal by format.
- the decoder 100 may decode the bit stream to obtain side information corresponding to each of the audio signals classified according to the format.
- the decoder 100 can acquire additional information corresponding to each of an object signal, an ambisonic signal, and a channel signal.
- the decoder 100 may decode the bit stream to obtain non-sound object side information for a non-sound object that does not make a sound.
- the virtual space in which the input audio signal is simulated may comprise a non-sound object.
- the non-sound object may represent various objects involved in interaction between objects in a virtual space in which the input audio signal is simulated.
- the non-sound object may be an object having no audio signal corresponding to the object.
- the non-sound object may include at least one of a passive object, a non-audio object, a scene object, a visual object, an acoustic object, an acoustic element, an occluder, a reflector, or an absorber.
- the non-sound object side information may be included in an acoustic element.
- an acoustic element may represent a physical object that affects an audio element according to the position and head direction of the listener in a virtual space.
- the audio element constitutes an audio scene and may be one or more audio signals described by the metadata.
- the audio element may include at least one of the above-described object signal, ambisonic signal, or channel signal and the additional information corresponding thereto.
- the audio signal processing apparatus can receive the acoustic element together with the metadata included in the audio object.
- the audio object may include an audio signal and metadata necessary for simulating a sound source corresponding to the audio signal.
- the metadata required to simulate the sound source may include location information.
- the audio object may be an audio object defined by the ISO / IEC 23008-3 standard.
- herein, a case where the input audio signal includes an object signal, an ambisonic signal, and a channel signal is described as an example, but the present disclosure is not limited thereto.
- the audio signal of each format classified by the decoder 100 can be rendered in a format-specific renderer.
- the additional information corresponding to each of the audio signals classified by format may include information about the real acoustic environment in which the input audio signal was recorded, or 6-DOF (degrees of freedom) coordinates of the speaker layout that reproduces the output audio signal.
- the 6-DOF coordinates may include azimuth angle, elevation angle, distance, yaw, pitch and roll information.
- the azimuth, elevation angle, and distance may be information indicating the position of the listener.
- the yaw, pitch and roll may be information indicating the head direction of the listener.
- the object side information corresponding to the object signal may include directional information such as a directivity pattern of the object.
- the non-sound object side information may include information for handling the influence of the non-sound object on sound output from a sound source other than the non-sound object.
- the non-sound object side information may include at least one of a sound absorption ratio, a reflectance, a transmittance, a diffraction rate, and a scattering rate for each frequency component of a material constituting the non-sound object.
- the user interaction information may include the above-described listener information.
- the user interaction information may include a listener's head direction and a listener's location. At this time, the head direction of the listener and the position of the listener can be controlled by user input.
- the user interaction information may include UI (user interface) information, such as moving a sound object or playback/stop.
- the sound object may be an object in which sound corresponding to the object exists, as opposed to a non-sound object.
- the sound object may include at least one of an active object, an audio object, an audio element, or a sound source.
- the renderer corresponding to the format-specific audio signal can generate the intermediate audio signal according to the format of the output audio signal.
- the output audio signal may be a loudspeaker audio signal consisting of a combination such as 5.1, 7.1, 5.1.2, 10.2, or 22.2 channels.
- the output audio signal may be a 2-channel binaural signal output via the headphone / earphone.
- the output audio signal may be a combination of a speaker output signal and a headphone / earphone output signal.
- the output audio signal may be an audio signal corresponding to a virtual space simulated with the user wearing an earphone or headphone in a space where the loudspeaker layout is installed.
- the mixer 500 may mix a plurality of intermediate audio signals generated through the object renderer 200, the ambisonic renderer 300, and the channel renderer 400 to generate an output audio signal.
- a method of generating an intermediate audio signal in each of the renderers will be described in detail with reference to FIGS. 20 to 23. Hereinafter, additional information transmitted in various manners will be described.
- the additional information may be obtained through an interface separate from the input audio signal, unlike the example of FIG. 17. In the embodiments of FIGS. 18 and 19, parts that are the same as or correspond to the embodiment of FIG. 17 are not described again.
- FIG. 18 is a block diagram showing in detail the configuration of an audio signal processing apparatus 170 according to an embodiment of the present disclosure.
- the audio signal processing apparatus 170 may include a first parser 171 and a second parser 172.
- the first parser 171 and the second parser 172 are represented as replacing the decoder 100 of FIG. 17, but each parser may include a decoder internally.
- the audio signal processing apparatus 170 may include a separate decoder.
- the audio signal processing apparatus can receive metadata transmitted separately from an input audio signal.
- the audio signal processing apparatus can receive an input audio signal in the form of pulse-code modulation (PCM) audio.
- the audio signal processing apparatus may receive the input audio signal through a separate audio codec (Codec) for processing the audio signal.
- the additional information corresponding to the input audio signal may be parsed through the second parser 172 in addition to the first parser 171 that processes the input audio signal.
- the first parser 171 can classify the input audio signal into an object signal, an ambisonic signal, and a channel signal.
- the first parser 171 can classify the input audio signal according to the format by referring to the track index information on the input audio signal.
- the second parser 172 may parse the additional information corresponding to the object signal, the ambisonic signal, and the channel signal, respectively.
- the second parser 172 can parse the above-described non-sound object side information.
- FIG. 19 is a block diagram showing in detail the configuration of an audio signal processing apparatus 180 according to an embodiment of the present disclosure.
- there may be a second object signal that is received through a separate interface without a decoding process.
- for example, the second object signal may be received through a voice input interface such as a microphone or a headset, in situations such as voice communication.
- the audio signal of each of the plurality of users may be a second input audio signal other than the predetermined first input audio signal.
- the audio signal processing apparatus can process the second object signal as a separate object signal through the object renderer 200.
- the object renderer 200 may render the second object signal based on the second object side information.
- the object renderer 200 may generate an object intermediate audio signal based on an object signal, object side information, non-sound object side information, and user interaction information.
- the object renderer 200 may include a sound source directivity processing unit 210, an object-to-object (O2O) interaction processing unit 220, and a sound localization processing unit 230.
- the sound source directivity processing unit 210 may filter the object signal output from the object based on the direction information of the object.
- the sound source directivity processing unit can model the directivity characteristic of the object signal. The relative position and direction of the sound source differ depending on the listener's position and head direction in the virtual space.
- the O2O interaction processing unit 220 can process the above-described occlusion effect.
- the O2O interaction processing unit 220 may perform the operations of the audio signal processing apparatus described with reference to FIGS.
- the O2O interaction processing unit 220 may generate at least one of a transmitted audio signal, a diffracted audio signal, or a reflected audio signal based on additional information on at least one blocking object.
- the additional information for the blocking object may include at least one of object side information corresponding to the sound object or non-sound object side information.
- the sound localization processing unit 230 can process the sound image of the object signal.
- the sound localization processing unit 230 can filter the object signal based on the layout through which the output audio signal is output. For example, when the output audio signal is output through a loudspeaker layout, the sound localization processing unit 230 can generate an object intermediate audio signal using 3D panning such as Vector-Base Amplitude Panning (VBAP), as sketched below. Alternatively, the sound localization processing unit 230 may binaurally render the object signal to generate an object intermediate audio signal.
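- A two-dimensional VBAP sketch for a single loudspeaker pair, solving the standard gain equation and normalizing to unit power; the extension to 3D triplets and the handling of multiple pairs are omitted, and the disclosure does not prescribe this particular formulation.

```python
import numpy as np

def vbap_2d_gains(source_azimuth, speaker_azimuths):
    """Gains g solving L @ g = p for one speaker pair (azimuths in radians)."""
    p = np.array([np.cos(source_azimuth), np.sin(source_azimuth)])
    L = np.column_stack([[np.cos(a), np.sin(a)] for a in speaker_azimuths])
    g = np.linalg.solve(L, p)       # fails if the two speakers are collinear
    g = np.clip(g, 0.0, None)       # negative gain: pair does not enclose the source
    return g / (np.linalg.norm(g) + 1e-12)
```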
- the object side information may include an azimuth and an elevation angle of the object corresponding to the object signal. At this time, the sound localization processing unit 230 can binaurally render the object signal using the HRTF determined based on the object side information.
- FIG. 21 is a diagram showing an object renderer 201, which further includes a coordinate transformation processing unit 240 according to an embodiment of the present disclosure.
- the coordinate transformation processing unit 240 can adjust the position information included in the object side information and non-sound object side information based on the user interaction information.
- the user interaction information may include information indicating the position and head direction of the listener.
- the coordinate transformation processing unit 240 may convert coordinates indicating the position of the sound object and the position of the non-sound object based on the position and the head direction of the listener.
- the coordinate transformation processing unit 240 can calculate the relative coordinates indicating the position of the object on the basis of the coordinate indicating the position of the listener in the virtual space.
- FIG. 22 is a block diagram specifically illustrating an ambisonic renderer 300 according to one embodiment of the present disclosure.
- the ambisonic renderer 300 may render an ambisonic signal based on the ambisonic signal, the ambisonic supplemental information, the object supplemental information, the non-sound object supplemental information, and the user interaction information to generate an ambisonic intermediate audio signal.
- the ambisonic renderer 300 may include an ambisonic-to-ambisonic (A2A) interpolation processing unit 310, an ambisonic-to-object (A2O) interaction processing unit 320, and a rotation processing unit 330.
- the A2A interpolation processing unit 310 may perform interpolation for reproducing the acoustic space based on a plurality of ambisonic spatial samples. Each of the ambisonic spatial samples may represent an ambisonic signal obtained at one of a plurality of positions.
- the A2A interpolation processing unit 310 may generate an interpolated ambisonic signal corresponding to a point at which no ambisonic signal was acquired, based on the ambisonic spatial samples. Specifically, the A2A interpolation processing unit 310 may interpolate a plurality of ambisonic spatial samples to generate the interpolated ambisonic signal.
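- An inverse-distance-weighted interpolation sketch; the weighting scheme is an assumption, since the disclosure does not specify the interpolation method, and all samples are assumed to share the same ambisonic order and length.

```python
import numpy as np

def interpolate_ambisonics(samples, positions, query, eps=1e-6):
    """Blend ambisonic spatial samples (each a (channels, frames) array)
    captured at 'positions' to approximate the signal at 'query'."""
    query = np.asarray(query)
    dists = np.array([np.linalg.norm(query - np.asarray(p)) for p in positions])
    w = 1.0 / (dists + eps)
    w /= w.sum()
    return sum(wi * np.asarray(s) for wi, s in zip(w, samples))
```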
- the A2O interaction processing unit 320 can process the occlusion effect on the ambisonic signal. For example, the A2O interaction processing unit 320 may filter the ambisonic signal based on the additional information for at least one blocking object. For example, the A2O interaction processing unit 320 can determine a transmission attenuation gain for each direction component of the ambisonic signal based on the additional information about the blocking object. At this time, the direction component of the ambisonic signal can be specified on the basis of the ambisonic order, which indicates the highest order among the components of the ambisonic signal. In addition, the A2O interaction processing unit 320 can determine the transmission attenuation gain for each frequency component of the ambisonic signal based on the additional information about the blocking object. The rotation processing unit 330 may rotate the ambisonic signal based on the user interaction information to generate a binaurally rendered ambisonic intermediate audio signal.
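- A first-order example of the rotation processing, compensating listener yaw on a B-format signal in ACN channel order (W, Y, Z, X); the sign convention and the restriction to first order are assumptions of this sketch, and higher orders require full spherical-harmonic rotation matrices.

```python
import numpy as np

def rotate_foa_yaw(b_format, yaw_rad):
    """Rotate a first-order ambisonic signal about the vertical axis."""
    w, y, z, x = b_format               # ACN order: W, Y, Z, X
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    return np.stack([w,
                     c * y + s * x,     # rotated Y
                     z,                 # Z unchanged by yaw
                     c * x - s * y])    # rotated X
```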
- the channel renderer 400 may generate a channel intermediate audio signal by rendering a channel signal based on the channel signal, channel additional information, object additional information, non-sound object additional information, and user interaction information.
- the channel renderer 400 may include a channel-to-channel (C2C) interpolation processing unit 410, a channel-to-object (C2O) interaction processing unit 420, and a rotation processing unit 430.
- the C2C interpolation processing unit 410 may perform interpolation for reproducing acoustic space based on a plurality of channel space samples.
- Each of the channel space samples may be a channel signal obtained at a plurality of locations.
- the channel space sample may be a pre-rendered channel signal based on a particular location.
- the C2C interpolation processing unit 410 may generate an interpolation channel signal corresponding to a point where the channel signal is not acquired based on the channel space sample.
- the C2C interpolation processing unit 410 may interpolate a plurality of channel space samples to generate an interpolation channel signal.
- the C2O interaction processing unit 420 can process the occlusion effect on the channel signal. For example, the C2O interaction processing unit 420 may filter the channel signal based on the additional information for at least one blocking object. For example, the C2O interaction processing unit 420 may determine a panning gain for each channel of the channel signal based on the additional information about the blocking object, and may filter the channel signal based on the channel-specific panning gains. The rotation processing unit 430 may rotate the channel signal based on the user interaction information to generate a binaurally rendered channel intermediate audio signal.
- Computer readable media can be any available media that can be accessed by a computer, and can include both volatile and nonvolatile media, removable and non-removable media.
- the computer-readable medium may also include computer storage media.
- Computer storage media may include both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Stereophonic System (AREA)
Abstract
According to the invention, an audio signal processing device comprises a processor for outputting an output audio signal generated from an input audio signal. The processor can acquire information about an input audio signal and a virtual space in which the input audio signal is simulated; can determine, based on the position of each of a plurality of objects present in the virtual space and the position, relative to the listener in the virtual space, of the sound source corresponding to the input audio signal, whether a blocking object that blocks between the sound source and the listener exists among the plurality of objects; and can binaurally render the input audio signal according to the determination result so as to generate an output audio signal.
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR10-2017-0126273 | 2017-09-28 | ||
| KR20170126273 | 2017-09-28 | ||
| KR10-2017-0135488 | 2017-10-18 | ||
| KR20170135488 | 2017-10-18 | ||
| KR10-2018-0082709 | 2018-07-17 | ||
| KR20180082709 | 2018-07-17 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019066348A1 true WO2019066348A1 (fr) | 2019-04-04 |
Family
ID=65902035
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/KR2018/010926 Ceased WO2019066348A1 (fr) | 2017-09-28 | 2018-09-17 | Procédé et dispositif de traitement de signal audio |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2019066348A1 (fr) |
- 2018-09-17: WO PCT/KR2018/010926 patent/WO2019066348A1/fr not_active Ceased
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20090109425A (ko) * | 2008-04-15 | 2009-10-20 | 엘지전자 주식회사 | 가상 입체 음향 구현 장치 및 방법 |
| KR20130010893A (ko) * | 2010-03-26 | 2013-01-29 | 방 앤드 오루프센 에이/에스 | 멀티채널 사운드 재생 방법 및 장치 |
| KR20130080819A (ko) * | 2012-01-05 | 2013-07-15 | 삼성전자주식회사 | 다채널 음향 신호의 정위 방법 및 장치 |
| US20130236040A1 (en) * | 2012-03-08 | 2013-09-12 | Disney Enterprises, Inc. | Augmented reality (ar) audio with position and action triggered virtual sound effects |
| KR20160121778A (ko) * | 2015-04-10 | 2016-10-20 | 세종대학교산학협력단 | 컴퓨터 실행 가능한 사운드 트레이싱 방법, 이를 수행하는 사운드 트레이싱 장치 및 이를 저장하는 기록매체 |
Cited By (16)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112770227A (zh) * | 2020-12-30 | 2021-05-07 | 中国电影科学技术研究所 | 音频处理方法、装置、耳机和存储介质 |
| EP4325887A4 (fr) * | 2021-04-12 | 2024-09-25 | Panasonic Intellectual Property Corporation of America | Procédé de reproduction de son, dispositif de reproduction de son et programme |
| WO2022218986A1 (fr) * | 2021-04-14 | 2022-10-20 | Telefonaktiebolaget Lm Ericsson (Publ) | Rendu d'éléments audio occlus |
| EP4568293A3 (fr) * | 2021-04-14 | 2025-08-06 | Telefonaktiebolaget LM Ericsson (publ) | Rendu d'éléments audio occlus |
| JP7703043B2 (ja) | 2021-04-14 | 2025-07-04 | テレフオンアクチーボラゲット エルエム エリクソン(パブル) | オクルージョンされるオーディオエレメントのレンダリング |
| EP4568293A2 (fr) | 2021-04-14 | 2025-06-11 | Telefonaktiebolaget LM Ericsson (publ) | Rendu d'éléments audio occlus |
| JP2024514170A (ja) * | 2021-04-14 | 2024-03-28 | テレフオンアクチーボラゲット エルエム エリクソン(パブル) | オクルージョンされるオーディオエレメントのレンダリング |
| AU2022256751B2 (en) * | 2021-04-14 | 2025-03-13 | Telefonaktiebolaget Lm Ericsson (Publ) | Rendering of occluded audio elements |
| KR102610263B1 (ko) * | 2022-01-07 | 2023-12-06 | 한국전자통신연구원 | 장애물을 고려한 객체 기반의 오디오 신호의 렌더링 방법 및 장치 |
| US12133062B2 (en) * | 2022-01-07 | 2024-10-29 | Electronics And Telecommunications Research Institute | Method and apparatus for rendering object-based audio signal considering obstacle |
| KR20230106986A (ko) * | 2022-01-07 | 2023-07-14 | 한국전자통신연구원 | 장애물을 고려한 객체 기반의 오디오 신호의 렌더링 방법 및 장치 |
| US20230224661A1 (en) * | 2022-01-07 | 2023-07-13 | Electronics And Telecommunications Research Institute | Method and apparatus for rendering object-based audio signal considering obstacle |
| WO2023246327A1 (fr) * | 2022-06-22 | 2023-12-28 | 腾讯科技(深圳)有限公司 | Procédé et appareil de traitement de signal audio, et dispositif informatique |
| WO2024098221A1 (fr) * | 2022-11-07 | 2024-05-16 | 北京小米移动软件有限公司 | Procédé de rendu de signal audio, appareil, dispositif et support de stockage |
| WO2025086035A1 (fr) * | 2023-10-23 | 2025-05-01 | 瑞声开泰声学科技(上海)有限公司 | Procédé de traitement audio, système et dispositif électronique |
| CN119883177A (zh) * | 2025-01-15 | 2025-04-25 | 湖北工业大学 | 一种基于vr设备的多用户交互场景联动方法 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2019066348A1 (fr) | Procédé et dispositif de traitement de signal audio | |
| US10674262B2 (en) | Merging audio signals with spatial metadata | |
| JP7038725B2 (ja) | オーディオ信号処理方法及び装置 | |
| CN112262585B (zh) | 环境立体声深度提取 | |
| JP7544182B2 (ja) | 信号処理装置および方法、並びにプログラム | |
| Davis et al. | High order spatial audio capture and its binaural head-tracked playback over headphones with HRTF cues | |
| EP3311593B1 (fr) | Reproduction audio binaurale | |
| US11089425B2 (en) | Audio playback method and audio playback apparatus in six degrees of freedom environment | |
| Hacihabiboglu et al. | Perceptual spatial audio recording, simulation, and rendering: An overview of spatial-audio techniques based on psychoacoustics | |
| US8374365B2 (en) | Spatial audio analysis and synthesis for binaural reproduction and format conversion | |
| CN106797525B (zh) | 用于生成和回放音频信号的方法和设备 | |
| KR101004393B1 (ko) | 가상 서라운드에서 공간 인식을 개선하는 방법 | |
| JP2019523913A (ja) | 近/遠距離レンダリングを用いた距離パニング | |
| WO2018182274A1 (fr) | Procédé et dispositif de traitement de signal audio | |
| WO2016089180A1 (fr) | Procédé et appareil de traitement de signal audio destiné à un rendu binauriculaire | |
| KR20050056241A (ko) | 동적인 바이노럴 음향 캡쳐 및 재생 장치 | |
| KR20170106063A (ko) | 오디오 신호 처리 방법 및 장치 | |
| EP2119306A2 (fr) | Spatialisation audio et simulation d'environnement | |
| CN113170271A (zh) | 用于处理立体声信号的方法和装置 | |
| CN113196805B (zh) | 用于获得及再现双声道录音的方法 | |
| CN117242796A (zh) | 渲染混响 | |
| KR102758360B1 (ko) | 오디오 렌더링 방법 및 장치 | |
| US10440495B2 (en) | Virtual localization of sound | |
| KR20210007122A (ko) | 오디오 신호 처리 방법 및 장치 | |
| KR102559015B1 (ko) | 공연과 영상에 몰입감 향상을 위한 실감음향 처리 시스템 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18863694 Country of ref document: EP Kind code of ref document: A1 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 18863694 Country of ref document: EP Kind code of ref document: A1 |