WO2008135049A1 - Spatial sound reproduction system with loudspeakers - Google Patents
- Publication number
- WO2008135049A1 (PCT/DK2008/050100)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- reproduction
- loudspeaker
- listener
- audio signals
- binaural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/02—Spatial or constructional arrangements of loudspeakers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
- H04S1/002—Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/002—Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
Definitions
- the invention relates to the field of reproduction of sound with spatial information, e.g. 3D sound, especially via loudspeakers.
- the invention provides a spatial sound reproduction system and method.
- the method and system are suitable e.g. for reproduction of binaural signals, such as binaural synthesis applications where precise reproduction of a 3D sound image is crucial.
- the method and systems are also suitable for entertainment audio systems where a 3D sound via loudspeakers can be obtained.
- in binaural technology, control of sound pressures at a listener's ear drums provides control of the person's auditory impression.
- by generating proper signals at the person's ear drums it is possible to generate an artificial auditory environment with virtual sound sources in all directions, relative to the listening person.
- FIG. 8 illustrates two loudspeakers positioned in front of the listener and two loudspeakers positioned behind the listener. All loudspeakers are positioned in the horizontal plane, i.e. at the same height as the listener's head. Both the front loudspeakers and the back loudspeakers have separate cross-talk cancellation systems.
- a binaural sound source signal is directed mainly to the front or back set of loudspeakers.
- a source signal representing a sound source directly in front of the listener is directed to the front set of loudspeakers. Both sets of loudspeakers will contribute to reproduction of lateral source signals.
- US 6,760,447 describes another system for generating a 3D sound field based on a set of two loudspeakers positioned in front of the listener at +/-5° in the horizontal plane.
- a cross-talk cancellation processing is applied such that the two loudspeakers move out of phase at low frequencies while they move in phase above a certain frequency.
- EP 1 460 879 describes a vehicle audio system with two loudspeakers placed in a stereo dipole configuration.
- the loudspeakers are positioned directly above the listener, i.e. at 90° elevation, with a distance between the loudspeakers corresponding to an average interaural distance.
- the loudspeakers are designed for reproduction of normal stereo signals, and the audio system does not include any cross-talk cancellation processing since with the loudspeakers mounted in the roof of the vehicle, the distance to the listener is short, and therefore acoustic cross-talk is rather limited.
- US 2004/0247144 A1 describes a 3D sound reproduction system with a set of at least two spaced-apart loudspeakers positioned on the interaural axis, preferably at an elevation higher than the listener's head. It is further described that more pairs of loudspeaker transducers can be used. Especially, the audio signal for reproduction is split into different frequency bands, such as "low", "mid" and "high", with cross-over frequencies of 450 Hz and 3.5 kHz. These frequency bands are then reproduced by respective sets of loudspeakers positioned differently relative to the listener.
- the invention provides a spatial sound reproduction system for sound reproduction of a set of first and second audio signals with spatial cues to a listener, the system including a first reproduction chain including
- a cross-talk cancellation unit arranged to receive the first and second audio signals and generate first and second processed audio signals in response
- first loudspeaker with first and second loudspeaker drivers arranged to reproduce the respective first and second processed audio signals
- the first loudspeaker being arranged for a first position, during reproduction, the first position being described by an elevation angle, relative to the listener's head, being larger than +50° or smaller than -50°
- the cross-talk cancellation unit is arranged to reduce acoustic cross-talk corresponding to the first position
- elevation angles above 50° or below -50° correspond to cones defining positions "above" or "below" the listener, and throughout the description "above" and "below" are used in brief to denote the defined elevations if no more specific positions are described.
- the best performance is normally obtained with the two loudspeaker drivers positioned laterally symmetrical relative to the listener's head.
- loudspeaker drivers above (or below) the listener are suited to reproduce spatial directions differing significantly from their physical positions, e.g. lateral directions, since no confusing spatial cues to the human auditory system are generated, in contrast to loudspeaker drivers positioned in front of the listener.
- with the second reproduction chain, e.g. with a loudspeaker driver placed right in front of or right behind the listener, the front/back confusion problem is further reduced.
- Such second reproduction chain is preferably used to reproduce directions supported by their physical position, i.e. directions in a cone in front of the listener can be reproduced by a loudspeaker in front of the listener at 0° azimuth, while the first reproduction chain serves to reproduce all other directions.
- reproduction is split between the first and second reproduction chains, such that a spatial part of directions associated with the second position is reproduced purely, or at least primarily, by the second reproduction chain, while other directions are reproduced by the first reproduction chain.
- direction splitting is easy to integrate into the binaural synthesis processing.
- the immunity to listener movements is superior to a traditional setup of two loudspeaker drivers in the horizontal plane in front of the listener.
- the reason is that correct cross-talk cancellation with loudspeakers positioned in the horizontal plane in front of the listener, e.g. at +/-30°, requires HRTFs for these directions which are rather sensitive to small changes to both lateral movements or rotations of the listener's head.
- HRTFs for directions above or below the listener exhibit far less spectral details and they will only change insignificantly with movements or head rotations.
- the system according to the first aspect will provide a localization performance less influenced by inter-individual differences in HRTFs. Due to the low complexity of the HRTFs for the above and below directions, inter-individual differences are less pronounced, and thus the system will function for a large variety of listeners, since it is rather easy to select a "standard" above or below HRTF that will suit most listeners' individual HRTFs.
- a yet further advantage is that in preferred embodiments with the first reproduction chain having two closely spaced loudspeaker drivers, the two loudspeaker drivers can be mounted in only one cabinet. Thus, in a very simple embodiment only one loudspeaker box is required for the first reproduction chain. Further, with the second reproduction chain being one single loudspeaker box, e.g. also one cabinet with two closely spaced drivers.
- the reproduction system of the first aspect is suited for a large number of applications where spatial reproduction of audio signals with loudspeakers is required.
- a few examples are: tele-conference systems, game consoles, computer games, arcade games, control rooms such as air traffic control rooms, Virtual Reality setups, auralization systems, simulators such as aircraft training simulators, home entertainment (surround sound systems), car (automotive) audio, audio systems for cinemas and theatres.
- the first position is described by an elevation angle, relative to the listener's head, being larger than +70° or smaller than -70°.
- the first position of the first loudspeaker is preferably above or below the listener.
- the first and second loudspeaker drivers may be positioned with a mutual distance of less than 50 cm, e.g. with a mutual distance of less than 20 cm, such as less than 10 cm, such as 5 cm or even less than 5 cm. Especially, it may be preferred that the mutual distance between the first and second loudspeaker drivers corresponds to the distance between the ears of a human.
- the first and second loudspeaker drivers may be configured to form a stereo dipole, e.g. a stereo dipole such as described in US 6,760,447 B1 by Ole Kirkeby.
- Stereo dipoles are suited for cross-talk cancellation since they are rather immune to lateral movements and thus such stereo dipoles provide a wider "sweet spot". This allows e.g. for a high localization performance for two persons sitting next to each other.
- the reproduction of one reproduction chain is split up into two or three frequency bands e.g. using a cross-over network, and where these frequency bands are reproduced by loudspeaker drivers positioned differently relative to the listener.
- a physical distance between drivers for the low frequency part can in this way be increased, while mid and high frequencies can be reproduced by more closely spaced drivers, e.g. in a stereo dipole configuration.
- This approach for dividing the reproduction signal into frequency bands reproduced by respective differently positioned loudspeaker drivers is further described in US 2004/0247144 A1.
- the first and second loudspeaker drivers are positioned laterally symmetric relative to the listener, preferably at azimuth angles with numeric values in the range 80°-100°.
- the first and second loudspeaker drivers may be positioned at azimuth angles 90° and -90°, respectively, i.e. right to the sides relative to the listener, but at a high or low elevation angle.
- Further reproduction chains with loudspeakers at further different positions may be used.
- with additional reproduction chains arranged for reproduction of certain spatial areas, it is possible to further improve localization performance for specific directions, e.g. front and back.
- the second loudspeaker may further include a fourth loudspeaker driver, and wherein the second reproduction chain further includes an associated second cross-talk cancellation unit arranged to reduce acoustic cross-talk corresponding to the second position.
- the first and second loudspeaker drivers are used to reproduce at least lateral sound source directions and sound source directions above and below the horizontal plane.
- Additional reproduction chains merely serve to reproduce sound source directions in front of and behind the listener to provide the listener with the correct ITD changes during head movements.
- the second reproduction chain with at least one loudspeaker driver in front of the listener may be used to reproduce directions within an area defined by -45° to 45° azimuth and -45° to 45° elevation, i.e. a frontal cone, while the first reproduction chain reproduces all other directions.
- a third reproduction chain with at least one loudspeaker driver behind the listener may reproduce sound source directions within an area defined by an azimuth of below -135° and above 135° and an elevation in the range -45° to 45°.
- the system may further include a direction processor arranged to receive the first and second audio signals and generate in response first and second direction processed audio signals for reproduction via the respective first and second reproduction chains in accordance with the spatial cues in the first and second audio signals.
- Portions of the first and second audio signals representing lateral image directions are mainly or exclusively included in the first direction processed audio signal arranged for reproduction by the first reproduction chain.
- the system includes a binaural synthesis unit arranged to receive a mono audio signal, apply binaural synthesis to the mono audio signal and generate in response a first set of binaural audio signals for reproduction via the first reproduction chain and a second set of binaural audio signals for reproduction via the second reproduction chain, and wherein first image directions are mainly or exclusively represented in the first binaural audio signals while second image directions are mainly or exclusively represented in the second binaural audio signals.
- the first reproduction chain is used for all directions except for frontal image directions (e.g. for directions except those in the area -45°-45° azimuth and -45°-45° elevation).
- the frontal image directions are then reproduced by the second reproduction chain with one or two loudspeaker drivers, e.g. a stereo dipole, positioned in front of the listener (e.g. at 0° azimuth and 0° elevation), and thus, frontal localization performance is enhanced.
- the spatial image direction splitting is easy to perform, since the mono input signal, e.g. representing the voice of one speaker in a teleconference system, is split into one or more known spatial directions.
- the binaural synthesis unit can receive a plurality of mono audio signals and apply separate binaural syntheses thereto and generate a plurality of binaural signal parts in response, i.e. different image directions are assigned to the plurality of mono audio signals.
- the first and second sets of binaural signals for reproduction via the different reproduction chains can then be generated based on sums of the plurality of binaural signals parts depending on the image direction that they represent.
- Applications of these embodiments are such as Virtual Reality systems, computer games, teleconference systems etc.
- the second position may be described by an azimuth angle, relative to the listener's head, being in the range from -60° to +60°, and wherein the system further includes a third reproduction chain including a third loudspeaker with at least one loudspeaker driver, the third loudspeaker being arranged for a third position, during reproduction.
- the third position may be described by an azimuth angle, relative to the listener's head, being smaller than -120° or larger than +120°.
- the second and third reproduction chains may include respective associated second and third cross-talk cancellation units.
- the set of first and second audio signals are preferably binaural audio signals.
- the invention provides a method for reproducing spatial sound to a listener based on a set of first and second audio signals with spatial cues, the method including: - generating first and second processed audio signals by performing a cross-talk cancellation processing on the first and second audio channels, - reproducing the respective first and second processed audio signals by a first loudspeaker with first and second loudspeaker drivers, the first loudspeaker being arranged for a first position, during reproduction, the first position being described by an elevation angle, relative to the listener's head, being larger than +50° or smaller than -50°, and wherein the cross-talk cancellation processing is performed to reduce acoustic cross-talk corresponding to the first position, and - reproducing sound with a second reproduction chain including a second loudspeaker with at least a third loudspeaker driver arranged for a second position, during reproduction, wherein the second position is described by an azimuth angle, relative to the listener's head, being in the range from -60° to +60°, larger than
- Fig. 1 illustrates definition of azimuth and elevation angles relative to the head of the listener
- Fig. 2 illustrates an overall block diagram of a preferred first reproduction chain of a spatial sound reproduction system
- Fig. 3 illustrates positions of loudspeaker drivers in one embodiment
- Fig. 4 illustrates positions of loudspeaker drivers in another embodiment
- Fig. 5 illustrates elements of one spatial sound reproduction system embodiment with three reproduction chains with a total of 6 loudspeaker drivers
- Fig. 6 illustrates elements of another spatial sound reproduction system embodiment with three reproduction chains with a total of 4 loudspeaker drivers
- Fig. 7 illustrates elements of a spatial sound reproduction system embodiment with binaural synthesis and three reproduction chains with a total of 6 loudspeaker drivers.
- Fig. 1 illustrates definition of azimuth and elevation angles used throughout the description and in the claims. Sketches of the listener's head seen from above and from the side are used to define the angles. As seen, both of azimuth and elevation are defined from a reference point being the centre of the listener's head. Positive azimuth angles are to the right of the listener, while positive elevation angles are above the listener. 0° azimuth is right in front of the listener, while 180° azimuth is right behind the listener. 90° elevation is right above the listener, while -90° elevation is right below the listener.
- Fig. 2 illustrates elements of a preferred first reproduction chain of a spatial sound reproduction system.
- the system includes a cross-cancellation unit CC1 arranged to receive two audio signals A1, A2, e.g. a set of binaural signals.
- in response to the input signals A1, A2, the cross-cancellation unit CC1 generates processed audio signals P1, P2 that are applied to respective ones of first and second loudspeaker drivers L1, L2.
- the cross-cancellation unit CC1 functions such as known in the art, i.e. it compensates for acoustic cross-talk introduced by the loudspeaker drivers L1, L2 using knowledge of HRTFs corresponding to the positions of the loudspeaker drivers L1, L2 relative to the listener.
- the loudspeaker drivers L1, L2 may in principle be any known type of loudspeaker capable of reproducing sound at audio frequencies, e.g. a normal electro-dynamic loudspeaker.
- the loudspeaker drivers can be rather small, since the most important localization cues are above some 200-300 Hz.
- the reproduction system can then be assisted by subwoofers to reproduce the frequency range below 200-300 Hz.
- the loudspeaker drivers L1, L2 are to be positioned at elevation angles larger than 50° or lower than -50°, preferably symmetric relative to the listener in lateral direction, e.g. at azimuth angles 90° and -90°, respectively. As will be seen below, elevation angles numerically higher than 50° are preferred.
- the system illustrated in Fig. 2 is intended to be used together with one or more additional reproduction chains, such as will be described below.
- Fig. 3 illustrates a preferred position of the loudspeaker drivers L1, L2, in a front view and in a side view.
- the loudspeaker drivers L1, L2 are positioned at an elevation angle of 75° and at azimuth angles 90° and -90°, respectively.
- since the distance between the loudspeaker drivers L1, L2 is preferred to be rather short, such angular position may be obtained with both loudspeaker drivers L1, L2 mounted in one common loudspeaker box.
- Fig. 4 illustrates another embodiment where the loudspeaker drivers L1, L2 are closely spaced loudspeakers and positioned corresponding to elevation angles 85°-88° and at azimuth angles 90° and -90°, respectively.
- the elevation angle is chosen to be 88°
- the two loudspeaker drivers L1, L2 are so closely spaced that they are configured as a stereo dipole, e.g. as described by Ole Kirkeby in US 6,760,447 B1.
- HRTFs for directions at high elevation angles such as above 50°, especially above 75°, exhibit a rather shallow structure in the frequency domain with only a limited amount of individual peaks and dips.
- incorporating such HRTFs in a cross-talk cancellation system provides a system which is insensitive to listener movements, and performance will only be insignificantly influenced by the fact that the listener's individual HRTFs will be different from the HRTFs used in the cross-talk cancellation system, since inter-individual differences in HRTFs at high elevation angles are small.
- it may be preferred to position the loudspeaker drivers L1, L2 even closer together than illustrated in Fig. 4. However, in order for the cross-talk cancellation to function properly, it is normally preferred that there is at least a distance of 10 cm, e.g. 12-16 cm such as corresponding to the distance between ears of humans, between the loudspeakers.
- Figs. 5 and 6 illustrate two different reproduction systems, each with a total of three reproduction chains.
- Both of Figs. 5 and 6 illustrate a direction processor DP that receives an input audio signal A10, A20 and generates in response audio signals DP1, DP2, DP3 for reproduction by respective first, second and third reproduction chains.
- the direction processor DP should be construed broadly.
- the direction processor DP in Figs. 5 and 6 merely serves to illustrate classes of embodiments where different reproduction chains are used to reproduce different audio signal parts representing different spatial image location areas.
- the direction processor DP can illustrate a processor arranged to extract information in the audio signal A10, A20 regarding direction of different sound sources in the signal and thus split the input signal A10, A20 into separate (or a set of two) audio signals DP1, DP2, DP3 for reproduction via respective reproduction chains.
- the direction processor DP may also be construed as a binaural synthesis system that generates binaural signals DP1, DP2, DP3 representing different spatial locations for reproduction via respective reproduction chains.
- the embodiments in Figs. 5 and 6 merely serve to illustrate that preferred reproduction systems include more than one reproduction chain, and signals DP1, DP2, DP3 for reproduction via the separate reproduction chains represent only, or predominantly only, sound sources with spatial image locations in a limited area. In this way it is possible to mix several reproduction chains with physically different loudspeaker positions and obtain a better reproduction with an improved spatial impression.
- the reproduction chain already illustrated in Figs. 2-4 is suited to reproduce all spatial directions where change of ITDs in response to listener movement or head-turns are not crucial for the listener in order to localize the sound sources.
- this reproduction chain can be used for signal parts representing lateral directions, e.g. for azimuths in the ranges 45° to 135° and -45° to -135°, and in general for all directions with elevation angles larger than 45° or lower than -45°.
- Signal parts representing directions in front of the listener should then preferably be reproduced by the "front" reproduction chain, which may be a stereo dipole with a connected cross-cancellation system, while signal parts representing directions behind the listener should preferably be reproduced by the "back" reproduction chain, which may also be a stereo dipole with a connected cross-talk cancellation system.
- Fig. 5 shows three separate reproduction chains with respective cross-cancellation units CC1, CC2, CC3 and with respective sets of two loudspeaker drivers L1, L2, L3, L4, L5, L6.
- the physical positions of the loudspeaker sets are above, front and back, respectively.
- the loudspeaker drivers L1, L2 of the first reproduction chain are preferably positioned as illustrated in Figs. 3 or 4.
- the loudspeaker drivers L3, L4 of the second reproduction chain may be positioned in a traditional stereo setup, i.e. at azimuth angles 30° and -30°, respectively, both at 0° elevation angle.
- the loudspeaker drivers L5, L6 of the third reproduction chain may be positioned at 120° and -120°, respectively, both at 0° elevation angle.
- the front set of loudspeaker drivers L3, L4 may be closely spaced and form a stereo dipole positioned at 2° and -2° azimuth at 0° elevation
- the back set of loudspeaker drivers L5, L6 are also closely spaced to form a stereo dipole positioned at 178° and -178° azimuth and 0° elevation.
- the system can be implemented with only three loudspeaker boxes each with two loudspeaker drivers.
- in Fig. 6, the illustrated system is a simplified version of the system of Fig. 5.
- the loudspeaker driver sets of the second and third reproduction chains are each replaced by only one loudspeaker driver, L3 and L5 respectively.
- L1 and L2 can be small drivers capable of covering only a frequency range of such as 200-15,000 Hz.
- L1 and L2 are configured as a stereo dipole and mounted in one common loudspeaker box.
- the loudspeaker box with L1 and L2 is then mounted on the ceiling right above the "sweet spot" where the listener is most often seated while listening to the reproduction system. Since L1 and L2 need not be capable of reproducing sound below 200 Hz, the loudspeaker box can be rather small and e.g. hidden behind a suspended ceiling and thereby hardly visible.
- the loudspeaker box may include a signal processor implementing the cross-talk cancellation CC1, a power amplifier and an RF receiver arranged to receive the input set of signals DP1 as a digital audio signal in a Radio Frequency representation, thus allowing wireless operation.
- the front loudspeaker including L3 can be a full-range loudspeaker system including a sub-woofer reproducing below, e.g. positioned in connection with a TV display.
- L5 can be a driver limited to e.g. 200-15,000 Hz and mounted on the wall right behind the "sweet spot".
- L5 may also include an RF receiver and a power amplifier and thus allow wireless operation.
- a conversion processor can be used to convert the 5.1 signal into signals for the front, the above and the back reproduction chains, respectively.
- the reproduction system may of course also be used in connection with e.g. computer games where the computer is capable of directly generating the correct binaural synthesis signals for the three reproduction chains with the desired spatial splitting.
- Fig. 7 illustrates the three reproduction chains of Fig. 5.
- a binaural synthesis unit BS takes a mono audio signal A as input, and depending on which spatial image direction is to be applied, a binaural synthesis processing is performed on the input signal A, and sets of binaural signals DP10, DP20, DP30 are generated in response.
- binaural synthesized signals are applied to one of the three sets of binaural signals DP10, DP20, DP30.
- DP20 then includes binaural signals representing frontal spatial image directions
- DP30 includes binaural signals representing spatial image directions behind the listener, e.g. in conformity with azimuth and elevation definitions as described in the foregoing.
- all three loudspeaker driver sets L1, L2, and L3, L4 and L5, L6 are configured as stereo dipoles and mounted in one box (indicated by the dashed lines).
- the invention provides a spatial sound reproduction system for sound reproduction of a set of audio signals.
- a cross-talk cancellation unit is arranged to receive the set of audio signals and generate a processed set of audio signals in response.
- the processed set of audio signals is then reproduced by a set of loudspeaker drivers positioned at an elevation angle, relative to the listener's head, being larger than +50° or smaller than -50° (i.e. above or below the listener), and with the cross-talk cancellation unit being arranged to reduce acoustic cross-talk corresponding to the actual position of the set of loudspeaker drivers.
- the set of loudspeaker drivers are positioned at an elevation angle larger than 70°, preferably right above the listener, laterally symmetric relative to the listener.
- the set of loudspeaker drivers may be configured as a stereo dipole, e.g. with a mutual distance of 10-20 cm.
- a second reproduction chain with at least one loudspeaker driver is positioned at an azimuth angle in the range -60° to +60°, larger than +120°, or smaller than -120°, e.g. directly in front of or directly behind the listener, in order to improve reproduction of front/back directions.
- Further reproduction chains each with one or two loudspeaker drivers at different positions may be included.
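- the reproduction chain with loudspeakers positioned at a high elevation is then arranged to reproduce lateral sound source directions as well as above and below directions, while the second reproduction chain, e.g. with a front or back loudspeaker, serves to reproduce sound source directions primarily corresponding to the physical position.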
- the second and/or further reproduction chains may each have two loudspeaker driver systems with or without respective cross-talk cancellation units, and the two loudspeaker drivers may be configured as stereo dipoles.
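
The direction splitting described in the list above (a frontal cone for the front chain, a rear area for the back chain, everything else for the elevated chain) can be sketched as a small routing function. This is only an illustration under stated assumptions: the function names `wrap_azimuth` and `select_chain`, the chain labels, and the inclusive treatment of the ±45° and ±135° boundaries are hypothetical and are not specified in the source.

```python
def wrap_azimuth(azimuth_deg):
    """Wrap an azimuth in degrees into the interval (-180, 180]."""
    wrapped = (azimuth_deg + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped


def select_chain(azimuth_deg, elevation_deg):
    """Pick a reproduction chain for one sound source direction.

    Angle convention as in Fig. 1: 0 deg azimuth is straight ahead,
    positive azimuth is to the right, positive elevation is upwards.
    Boundary values are assigned to the front/back chains (assumption).
    """
    az = wrap_azimuth(azimuth_deg)
    if abs(elevation_deg) <= 45.0:
        if abs(az) <= 45.0:
            return "front"   # frontal cone: second reproduction chain
        if abs(az) >= 135.0:
            return "back"    # rear area: third reproduction chain
    # Lateral directions and all high/low elevations: first (elevated) chain.
    return "above"


# Example: talkers in a teleconference placed at fixed virtual directions.
talkers = {"A": (0.0, 0.0), "B": (90.0, 0.0), "C": (180.0, 0.0), "D": (20.0, 60.0)}
routing = {name: select_chain(az, el) for name, (az, el) in talkers.items()}
```

Each mono source would then be binaurally synthesized for its direction and summed into the signal set of the chain selected for it.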
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Stereophonic System (AREA)
Abstract
The invention provides a spatial sound reproduction system for sound reproduction of a set of audio signals. A cross-talk cancellation unit is arranged to receive the set of audio signals and generate a processed set of audio signals in response. The processed set of audio signals is then reproduced by a set of loudspeaker drivers positioned at an elevation angle, relative to the listener's head, being larger than +50° or smaller than -50° (i.e. above or below the listener), and with the cross-talk cancellation unit being arranged to reduce acoustic cross-talk corresponding to the actual position of the set of loudspeaker drivers. Preferably, the set of loudspeaker drivers are positioned at an elevation angle larger than 70°, preferably right above the listener, laterally symmetric relative to the listener. The set of loudspeaker drivers may be configured as a stereo dipole, e.g. with a mutual distance of 10-20 cm. Further, a second reproduction chain with at least one loudspeaker driver is positioned at an azimuth angle in the range -60° to +60°, larger than +120°, or smaller than -120°, e.g. directly in front of or directly behind the listener, in order to improve reproduction of front/back directions. Further reproduction chains each with one or two loudspeaker drivers at different positions may be included. Preferably, the reproduction chain with loudspeakers positioned at a high elevation is arranged to reproduce lateral sound source directions as well as above and below directions, while the second reproduction chain, e.g. with a front or back loudspeaker, serves to reproduce sound source directions primarily corresponding to the physical position. The second and/or further reproduction chains may each have two loudspeaker driver systems with or without respective cross-talk cancellation units, and the two loudspeaker drivers may be configured as stereo dipoles.
Description
SPATIAL SOUND REPRODUCTION SYSTEM WITH LOUDSPEAKERS
Field of the invention
The invention relates to the field of reproduction of sound with spatial information, e.g. 3D sound, especially via loudspeakers. The invention provides a spatial sound reproduction system and method. The method and system are suitable e.g. for reproduction of binaural signals, such as binaural synthesis applications where precise reproduction of a 3D sound image is crucial. However, the method and systems are also suitable for entertainment audio systems where a 3D sound via loudspeakers can be obtained.
Background of the invention
The idea behind binaural technology is that control of sound pressures at a listener's ear drums provides control of the person's auditory impression. Thus, by generating proper signals at the person's ear drums it is possible to generate an artificial auditory environment with virtual sound sources in all directions, relative to the listening person.
Reproduction of binaural signals is possible via headphones. When using a set of stereo loudspeakers positioned in front of the listener for reproduction of binaural signals, acoustic cross-talk is introduced since sound from both loudspeakers reaches both ears of the listener. Thus, true reproduction of recorded or synthesized binaural signals with loudspeakers is not immediately possible. However, a number of cross-talk cancelling systems are well-known, where the acoustic cross-talk is eliminated or at least reduced by proper pre-processing of the electrical signals fed to the loudspeakers, involving the so-called Head Related Transfer Functions (HRTFs). Hereby, binaural reproduction becomes possible, at least in environments without severe reflections and with the listener positioned in the "sweet spot".
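The pre-processing idea described above can be illustrated with a minimal sketch, assuming a symmetric two-loudspeaker setup evaluated at a single frequency. The acoustic paths are modelled as a 2x2 matrix of complex transfer functions, and the canceller is taken to be the inverse of that matrix. The numeric values of `h_same` and `h_cross` and all function names are hypothetical and chosen for illustration only; a practical system would use measured HRTFs at every frequency and typically a regularised inversion.

```python
# Illustrative single-frequency cross-talk canceller.
# All numeric values are assumptions, not taken from the source.


def invert_2x2(m):
    """Return the inverse of a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("acoustic matrix is (nearly) singular at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]


def apply_2x2(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


# Hypothetical transfer functions at one frequency:
# loudspeaker -> same-side ear, and loudspeaker -> opposite ear (cross-talk path).
h_same = 1.0 + 0.0j
h_cross = 0.6 - 0.3j

# Rows are ears (left, right); columns are loudspeakers (left, right).
acoustic = [[h_same, h_cross], [h_cross, h_same]]
canceller = invert_2x2(acoustic)

# Binaural input: a signal intended for the left ear only.
binaural_in = [1.0 + 0.0j, 0.0 + 0.0j]
speaker_feeds = apply_2x2(canceller, binaural_in)  # processed loudspeaker signals
at_ears = apply_2x2(acoustic, speaker_feeds)       # what reaches the two ears
```

In this idealised model the left ear receives the intended signal and the right ear receives nothing, which is the goal of cross-talk cancellation; reflections and listener movement are not modelled.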
However, even with cross-talk cancellation, it is well known that the performance of such systems is rather poor in practice. If stereo loudspeakers are positioned in front of the listener, e.g. a normal +/-30° setup, a good reproduction of frontal sound sources is possible. However, such systems often suffer from a high sensitivity to listener movements. If loudspeakers are placed to the sides of the listener, e.g. a +/-90° setup, reproduction of lateral sound sources is good, but such systems normally suffer from poor front/back reproduction.
US 6,577,736 describes a system for generating a 3D sound field. Figure 8 illustrates two loudspeakers positioned in front of the listener and two loudspeakers positioned behind the listener. All loudspeakers are positioned in the horizontal plane, i.e. at the same height as the listener's head. Both the front loudspeakers and the back loudspeakers have separate cross-talk cancellation systems. Depending on horizontal sound source location, a binaural sound source signal is directed mainly to the front or back set of loudspeakers. Thus, a source signal representing a sound source directly in front of the listener is directed to the front set of loudspeakers. Both sets of loudspeakers will contribute to reproduction of lateral source signals. In US 6,577,736 it is further described, col. 7, lines 39-44, that an additional loudspeaker pair can be positioned at +/-90° in the horizontal plane in order to improve localization performance of lateral sound sources. In general, the system in US 6,577,736 is based on the general knowledge in the art that localization performance is improved for directions where a physical loudspeaker is present. Thus, as in most cross-talk cancellation systems, the system disclosed in US 6,577,736 requires a high number of single loudspeaker drivers at different positions in order to provide a localization of front, back and lateral directions that is precise and immune to listener movement.
US 6,769,447 describes another system for generating a 3D sound field based on a set of two loudspeakers positioned in front of the listener at +/-5° in the horizontal plane. A cross-talk cancellation processing is applied such that the two loudspeakers move out of phase at low frequencies while they move in phase above a certain frequency.
EP 1 460 879 describes a vehicle audio system with two loudspeakers placed in a stereo dipole configuration. The loudspeakers are positioned directly above the listener, i.e. at 90° elevation, with a distance between the loudspeakers corresponding to an average interaural distance. The loudspeakers are designed for reproduction of normal stereo signals, and the audio system does not include any cross-talk cancellation processing since with the loudspeakers mounted in the roof of the vehicle, the distance to the listener is short, and therefore acoustic cross-talk is rather limited.
US 2004/0247144 A1 describes a 3D sound reproduction system with a set of at least two spaced-apart loudspeakers positioned on the interaural axis, and preferably at an elevation higher than the listener's head. It is further described that more pairs of loudspeaker transducers can be used. Especially, the audio signal for reproduction is split into different frequency bands, such as "low", "mid" and "high" with cross-over frequencies of 450 Hz and 3.5 kHz. These frequency bands are then reproduced by respective sets of loudspeakers positioned differently, relative to the listener. It is suggested to reproduce the lowest frequencies with a pair of loudspeakers positioned at +/-90° azimuth, while the "mid" and "high" parts are reproduced by loudspeaker pairs positioned at azimuth angles +/-16° and +/-3.1°, respectively.
Summary of the invention
According to the above description of prior art, it may be seen as an object of the present invention to provide a sound reproduction method and system capable of providing a spatial sound reproduction, e.g. of binaural signals, that is precise and still immune to listener movement, where only a limited number of loudspeaker drivers are required.
In a first aspect, the invention provides a spatial sound reproduction system for sound reproduction of a set of first and second audio signals with spatial cues to a listener, the system including a first reproduction chain including
- a cross-talk cancellation unit arranged to receive the first and second audio signals and generate first and second processed audio signals in response, and
- a first loudspeaker with first and second loudspeaker drivers arranged to reproduce the respective first and second processed audio signals, the first loudspeaker being arranged for a first position, during reproduction, the first position being described by an elevation angle, relative to the listener's head, being larger than +50° or smaller than -50°, and wherein the cross-talk cancellation unit is arranged to reduce acoustic cross-talk corresponding to the first position, and
- a second reproduction chain including a second loudspeaker with at least a third loudspeaker driver arranged for a second position, during reproduction, wherein the second position is described by an azimuth angle, relative to the listener's head, being in the range from -60° to +60°, larger than +120°, or smaller than -120°, and wherein the second position is different from the first set of positions.
For a definition of elevation and azimuth angles, see Fig. 1. In essence, the elevation angles above 50° or below -50° correspond to cones defining positions "above" or "below" the listener, and throughout the description "above" and "below" are used in brief to denote the defined elevations if no more specific positions are described. The best performance is normally obtained with the two loudspeaker drivers positioned laterally symmetrical relative to the listener's head.
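The angular conditions of the first aspect can be written down as two small predicates. The following is a minimal sketch; the function names are hypothetical, angles are in degrees, and azimuth is assumed to be wrapped into the range [-180°, 180°) before testing.

```python
def wrap_azimuth(azimuth_deg):
    """Wrap an azimuth angle in degrees into the range [-180, 180)."""
    return ((azimuth_deg + 180.0) % 360.0) - 180.0


def is_first_position(elevation_deg):
    """True if the elevation lies in the 'above' or 'below' cone (beyond +/-50 degrees)."""
    return elevation_deg > 50.0 or elevation_deg < -50.0


def is_second_position(azimuth_deg):
    """True if the azimuth is frontal (-60..+60) or rearward (beyond +/-120 degrees)."""
    az = wrap_azimuth(azimuth_deg)
    return -60.0 <= az <= 60.0 or az > 120.0 or az < -120.0
```

For example, a driver at 75° elevation satisfies the first condition, a loudspeaker at 0° or 180° azimuth satisfies the second, and a loudspeaker at 90° azimuth does not.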
With a set of loudspeaker drivers positioned at a high (or low) elevation angle relative to the listener's head, i.e. with the loudspeaker drivers above or below the listener, it is possible together with cross-talk cancellation to provide a spatial sound reproduction with a rather precise localization performance of most directions compared to other systems with only two loudspeaker drivers. The reason is that e.g. systems with loudspeaker drivers in front of the listener will provide a very poor localization of directions other than the front direction, since a small rotation of the listener's head will cause the listener's auditory system to reveal that changes in Interaural Time Differences (ITDs) provided by the front loudspeakers do not match the expected ITDs for other sound source directions, and thus confusion will occur even though a correct spectral reproduction is provided. With loudspeakers above or below, the listener will not experience such confusing changes in ITDs leading to poor localization performance, since small head rotations will only lead to insignificant ITD changes. Thus, with loudspeaker drivers above (or below), these small ITD changes in response to head rotations correspond well to the real-life listening situation, e.g. with a sound source to the side of the listener. Thus, surprisingly, loudspeaker drivers above (or below) the listener are suited to reproduce spatial directions differing significantly from their physical positions, e.g. lateral directions, since no confusing spatial cues to the human auditory system are generated, in contrast to loudspeaker drivers positioned in front of the listener. With the inclusion of the second reproduction chain, e.g. with a loudspeaker driver placed right in front of or right behind the listener, the front/back confusion problem is further reduced. Such a second reproduction chain is preferably used to reproduce directions supported by its physical position, i.e. directions in a cone in front of the listener can be reproduced by a loudspeaker in front of the listener at 0° azimuth, while the first reproduction chain serves to reproduce all other directions.
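The ITD argument above can be made concrete with a rough numerical sketch. It assumes a simplified far-field sine-law model, in which the ITD is proportional to the component of the source direction along the interaural axis; this model, the constants `SPEED_OF_SOUND` and `EAR_DISTANCE`, and the function names are assumptions made for illustration and are not part of the described system. Head shadowing and frequency dependence are ignored.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
EAR_DISTANCE = 0.15     # m, assumed effective interaural distance


def itd_seconds(azimuth_deg, elevation_deg, head_yaw_deg=0.0):
    """Sine-law ITD estimate for a far source after the head has turned by head_yaw_deg."""
    az = math.radians(azimuth_deg - head_yaw_deg)
    el = math.radians(elevation_deg)
    # Component of the source direction along the interaural axis.
    lateral = math.sin(az) * math.cos(el)
    return EAR_DISTANCE / SPEED_OF_SOUND * lateral


def itd_change_us(azimuth_deg, elevation_deg, head_yaw_deg=5.0):
    """Absolute ITD change in microseconds caused by a small head rotation."""
    before = itd_seconds(azimuth_deg, elevation_deg)
    after = itd_seconds(azimuth_deg, elevation_deg, head_yaw_deg)
    return abs(after - before) * 1e6


front_speaker = itd_change_us(30.0, 0.0)   # conventional front loudspeaker at 30 deg azimuth
real_lateral = itd_change_us(90.0, 0.0)    # real sound source directly to the side
above_speaker = itd_change_us(90.0, 75.0)  # driver at 75 deg elevation, 90 deg azimuth
```

Under these assumptions a 5° head turn changes the ITD of the front loudspeaker by about 34 µs, while both the real lateral source and the elevated driver change by less than 2 µs. This is consistent with the statement that elevated drivers produce ITD changes resembling those of a lateral source.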
Thus, it is preferred that reproduction is split between the first and second reproduction chains, such that a spatial part of directions associated with the second position is reproduced purely, or at least primarily, by the second reproduction chain, while other directions are reproduced by the first reproduction chain. In binaural synthesis systems such direction splitting is easy to integrate into the binaural synthesis processing.
At the same time, the immunity to listener movements is superior to a traditional setup of two loudspeaker drivers in the horizontal plane in front of the listener. The reason is that correct cross-talk cancellation with loudspeakers positioned in the horizontal plane in front of the listener, e.g. at +/-30°, requires HRTFs for these directions, which are rather sensitive to both small lateral movements and small rotations of the listener's head. On the contrary, HRTFs for directions above or below the listener exhibit far less spectral detail, and they will only change insignificantly with movements or head rotations.
Still further, the system according to the first aspect will provide a localization performance less influenced by inter-individual differences in HRTFs. Due to the low complexity nature of the HRTFs for the above and below directions, inter-individual differences are less pronounced, and thus the system will function for a large variety of listeners since it is rather easy to select a "standard" above or below HRTF that will suit most listeners' individual HRTFs.
A yet further advantage is that in preferred embodiments with the first reproduction chain having two closely spaced loudspeaker drivers, the two loudspeaker drivers can be mounted in only one cabinet. Thus, in a very simple embodiment only one loudspeaker box is required for the first reproduction chain. Further, the second reproduction chain may be one single loudspeaker box, e.g. also one cabinet with two closely spaced drivers.
The reproduction system of the first aspect is suited for a large number of applications where spatial reproduction of audio signals with loudspeakers is required. A few examples are: tele-conference systems, game consoles, computer games, arcade games, control rooms such as air traffic control rooms, Virtual Reality setups, auralization systems, simulators such as aircraft training simulators, home entertainment (surround sound systems), car (automotive) audio, audio systems for cinemas and theatres.
In preferred embodiments, the first position is described by an elevation angle, relative to the listener's head, being larger than +70° or smaller than -70°. The first position of the first loudspeaker is preferably above or below the listener.
The first and second loudspeaker drivers may be positioned with a mutual distance of less than 50 cm, e.g. with a mutual distance of less than 20 cm, such as less than 10 cm, such as 5 cm or even less than 5 cm. Especially, it may be preferred that the mutual distance between the first and second loudspeaker drivers corresponds to the distance between the ears of a human.
Especially, the first and second loudspeaker drivers may be configured to form a stereo dipole, e.g. a stereo dipole such as described in US 6,760,447 B1 by Ole Kirkeby. Stereo dipoles are suited for cross-talk cancellation since they are rather immune to lateral movements, and thus such stereo dipoles provide a wider "sweet spot". This allows e.g. for a high localization performance for two persons sitting next to each other.
In some embodiments, the reproduction of one reproduction chain is split up into two or three frequency bands, e.g. using a cross-over network, and these frequency bands are reproduced by loudspeaker drivers positioned differently relative to the listener. Especially, a physical distance between drivers for the low frequency part can in this way be increased, while mid and high frequencies can be reproduced by more closely spaced drivers, e.g. in a stereo dipole configuration. This approach for dividing the reproduction signal into frequency bands reproduced by respective differently positioned loudspeaker drivers is further described in US 2004/0247144 A1.
Preferably, the first and second loudspeaker drivers are positioned laterally symmetric relative to the listener, preferably at azimuth angles with numeric values in the range 80°-100°. Such lateral symmetric positioning of the loudspeaker drivers is normally desirable for cross-talk cancellation systems, e.g. in order to minimize the effect of listener movement. Especially, the first and second loudspeaker drivers may be positioned at azimuth angles 90° and -90°, respectively, i.e. right to the sides relative to the listener, but at a high or low elevation angle.
Further reproduction chains with loudspeakers at further, different positions may be used. With such additional reproduction chains arranged for reproduction of certain spatial areas, it is possible to further improve localization performance for specific directions, e.g. front and back.
The second loudspeaker may further include a fourth loudspeaker driver, and wherein the second reproduction chain further includes an associated second cross-talk cancellation unit arranged to reduce acoustic cross-talk corresponding to the second position.
It is preferred that the first and second loudspeaker drivers are used to reproduce at least lateral sound source directions and sound source directions above and below the horizontal plane. Additional reproduction chains merely serve to reproduce sound source directions in front of and behind the listener to provide the listener with the correct ITD changes during head movements. E.g. the second reproduction chain with at least one loudspeaker driver in front of the listener may be used to reproduce directions within an area defined by -45° to +45° azimuth and -45° to +45° elevation, i.e. a frontal cone, while the first reproduction chain reproduces all other directions. Alternatively or additionally, a third reproduction chain with at least one loudspeaker driver behind the listener may reproduce sound source directions within an area defined by an azimuth below -135° or above +135° and an elevation in the range -45° to +45°.
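The example split of directions between the reproduction chains can be sketched as a simple selection function. This is an illustration only: the function name and the chain labels "front", "back" and "above" are hypothetical, the boundaries are the example values of 45° and 135° given above, and a practical system might blend between chains near the boundaries rather than switch abruptly.

```python
def select_chain(azimuth_deg, elevation_deg):
    """Return 'front', 'back' or 'above' for a source direction given in degrees."""
    az = ((azimuth_deg + 180.0) % 360.0) - 180.0  # wrap azimuth to [-180, 180)
    in_horizontal_band = -45.0 <= elevation_deg <= 45.0
    if in_horizontal_band and -45.0 <= az <= 45.0:
        return "front"  # frontal cone: second reproduction chain
    if in_horizontal_band and (az < -135.0 or az > 135.0):
        return "back"   # rear area: third reproduction chain
    return "above"      # all other directions: first reproduction chain
```

With these boundaries, a source at 0° azimuth and 0° elevation goes to the front chain, a source at 180° azimuth to the back chain, and lateral or strongly elevated sources to the first chain.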
The system may further include a direction processor arranged to receive the first and second audio signals and generate in response first and second direction processed audio signals for reproduction via the respective first and second reproduction chains in accordance with the spatial cues in the first and second audio signals.
Portions of the first and second audio signals representing lateral image directions are mainly or exclusively included in the first direction processed audio signal arranged for reproduction by the first reproduction chain.
In some embodiments the system includes a binaural synthesis unit arranged to receive a mono audio signal, apply binaural synthesis to the mono audio signal and generate in response a first set of binaural audio signals for reproduction via the first reproduction chain and a second set of binaural audio signals for reproduction via the second reproduction chain, and wherein first image directions are mainly or exclusively represented in the first binaural audio signals while second image directions are mainly or exclusively represented in the second binaural audio signals. Such an embodiment arranged to split the signal between different reproduction chains according to source image directions is suited for a large number of applications where binaural synthesis is used to generate the first and second audio signals, and where different sets of binaural signals are directed to different reproduction systems depending on sound source image directions. Thus, in a preferred embodiment, the first reproduction chain is used for all directions except for frontal image directions (e.g. for directions except those in the area -45° to +45° azimuth and -45° to +45° elevation). The frontal image directions are then reproduced by the second reproduction chain with one or two loudspeaker drivers, e.g. a stereo dipole, positioned in front of the listener (e.g. at 0° azimuth and 0° elevation), and thus, frontal localization performance is enhanced.
Due to the binaural synthesis, the spatial image direction splitting is easy to perform, since the mono input signal, e.g. representing the voice of one speaker in a teleconference system, is split into one or more known spatial directions. In preferred embodiments the binaural synthesis unit can receive a plurality of mono audio signals and apply separate binaural syntheses thereto and generate a plurality of binaural signal parts in response, i.e. different image directions are assigned to the plurality of mono audio signals. The first and second sets of binaural signals for reproduction via the different reproduction chains can then be generated based on sums of the plurality of binaural signal parts depending on the image direction that they represent. Applications of these embodiments include Virtual Reality systems, computer games, teleconference systems etc.
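The summing of binaural signal parts per reproduction chain can be sketched as follows. The sketch assumes that each part has already been binaurally synthesised and is available as a pair of equal-length sample lists together with its image direction; the frontal region of ±45° is the example split mentioned in the description, and the names `is_frontal` and `mix_by_direction` are hypothetical.

```python
def is_frontal(azimuth_deg, elevation_deg):
    """Example split: frontal region within +/-45 degrees azimuth and elevation."""
    return abs(azimuth_deg) <= 45.0 and abs(elevation_deg) <= 45.0


def mix_by_direction(parts, num_samples):
    """Sum binaural parts into one (left, right) bus per reproduction chain.

    parts: list of (azimuth_deg, elevation_deg, left_samples, right_samples).
    Returns a dict with keys 'first' and 'second'.
    """
    buses = {name: ([0.0] * num_samples, [0.0] * num_samples)
             for name in ("first", "second")}
    for azimuth_deg, elevation_deg, left, right in parts:
        name = "second" if is_frontal(azimuth_deg, elevation_deg) else "first"
        bus_left, bus_right = buses[name]
        for n in range(num_samples):
            bus_left[n] += left[n]
            bus_right[n] += right[n]
    return buses


# Three hypothetical talkers: two frontal, one lateral (two samples each).
parts = [
    (0.0, 0.0, [1.0, 0.0], [1.0, 0.0]),
    (90.0, 0.0, [0.2, 0.1], [0.8, 0.4]),
    (10.0, 0.0, [0.5, 0.5], [0.5, 0.5]),
]
buses = mix_by_direction(parts, 2)
```

Here the two frontal talkers are summed into the bus for the second reproduction chain, while the lateral talker alone feeds the first reproduction chain.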
The second position may be described by an azimuth angle, relative to the listener's head, being in the range from -60° to +60°, and wherein the system further includes a third reproduction chain including a third loudspeaker with at least one loudspeaker driver, the third loudspeaker being arranged for a third position, during reproduction. The third position may be described by an azimuth angle, relative to the listener's head, being smaller than -120° or larger than +120°.
The second and third reproduction chains may include respective associated second and third cross-talk cancellation units.
The set of first and second audio signals are preferably binaural audio signals.
In a second aspect, the invention provides a method for reproducing spatial sound to a listener based on a set of first and second audio signals with spatial cues, the method including
- generating first and second processed audio signals by performing a cross-talk cancellation processing on the first and second audio signals,
- reproducing the respective first and second processed audio signals by a first loudspeaker with first and second loudspeaker drivers, the first loudspeaker being arranged for a first position, during reproduction, the first position being described by an elevation angle, relative to the listener's head, being larger than +50° or smaller than -50°, and wherein the cross-talk cancellation processing is performed to reduce acoustic cross-talk corresponding to the first position, and
- reproducing sound with a second reproduction chain including a second loudspeaker with at least a third loudspeaker driver arranged for a second position, during reproduction, wherein the second position is described by an azimuth angle, relative to the listener's head, being in the range from -60° to +60°, larger than +120°, or smaller than -120°, and wherein the second position is different from the first set of positions.
It is appreciated that the same advantages as explained for the first aspect also apply for the second aspect, and it is appreciated that embodiments described for the first aspect apply in equivalent form for the second aspect as well.
Brief description of the drawings
In the following the invention is described in more detail with reference to the accompanying figures, of which
Fig. 1 illustrates definition of azimuth and elevation angles relative to the head of the listener,
Fig. 2 illustrates an overall block diagram of a preferred first reproduction chain of a spatial sound reproduction system,
Fig. 3 illustrates positions of loudspeaker drivers in one embodiment,
Fig. 4 illustrates positions of loudspeaker drivers in another embodiment,
Fig. 5 illustrates elements of one spatial sound reproduction system embodiment with three reproduction chains with a total of 6 loudspeaker drivers,
Fig. 6 illustrates elements of another spatial sound reproduction system embodiment with three reproduction chains with a total of 4 loudspeaker drivers, and
Fig. 7 illustrates elements of a spatial sound reproduction system embodiment with binaural synthesis and three reproduction chains with a total of 6 loudspeaker drivers.
While the invention is susceptible to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and will be described in detail herein. It should be understood, however, that the invention is not intended to be limited to the particular forms disclosed. Rather, the invention is to cover all modifications, equivalents, and alternatives falling within the scope of the invention as defined by the appended claims.
Description of preferred embodiments
Fig. 1 illustrates definition of azimuth and elevation angles used throughout the description and in the claims. Sketches of the listener's head seen from above and from the side are used to define the angles. As seen, both azimuth and elevation are defined from a reference point being the centre of the listener's head. Positive azimuth angles are to the right of the listener, while positive elevation angles are above the listener. 0° azimuth is right in front of the listener, while 180° azimuth is right behind the listener. 90° elevation is right above the listener, while -90° elevation is right below the listener.
Fig. 2 illustrates elements of a preferred first reproduction chain of a spatial sound reproduction system. The system includes a cross-talk cancellation unit CC1 arranged to receive two audio signals A1, A2, e.g. a set of binaural signals. In response to the input signals A1, A2, the cross-talk cancellation unit CC1 generates processed audio signals P1, P2 that are applied to respective ones of first and second loudspeaker drivers L1, L2. The cross-talk cancellation unit CC1 functions such as known in the art, i.e. compensating for acoustic cross-talk introduced by the loudspeaker drivers L1, L2 using knowledge of HRTFs corresponding to the positions of the loudspeaker drivers L1, L2 relative to the listener. Since cross-talk cancellation systems are known in the art, no further details will be given in this regard. The loudspeaker drivers L1, L2 may in principle be any known type of loudspeaker capable of reproducing sound at audio frequencies, e.g. a normal electro-dynamic loudspeaker. The loudspeaker drivers can be rather small, since the most important localization cues are above some 200-300 Hz. The reproduction system can then be assisted by subwoofers to reproduce the frequency range below 200-300 Hz.
The loudspeaker drivers L1, L2 are to be positioned at elevation angles larger than 50° or lower than -50°, preferably symmetric relative to the listener in lateral direction, e.g. at azimuth angles of 90° and -90°, respectively. As will be seen below, elevation angles numerically higher than 50° are preferred.
The system illustrated in Fig. 2 is intended to be used together with one or more additional reproduction chains, such as will be described below.
Fig. 3 illustrates a preferred position of the loudspeaker drivers L1, L2, in a front view and in a side view. As seen, the loudspeaker drivers L1, L2 are positioned at an elevation angle of 75° and at azimuth angles of 90° and -90°, respectively. In cases where the distance between the loudspeaker drivers L1, L2 is preferred to be rather short, such angular position may be obtained with both loudspeaker drivers L1, L2 mounted in one common loudspeaker box.
Fig. 4 illustrates another embodiment where the loudspeaker drivers L1, L2 are closely spaced loudspeakers and positioned corresponding to elevation angles 85°-88° and at azimuth angles 90° and -90°, respectively. In case the elevation angle is chosen to be 88°, the two loudspeaker drivers L1, L2 are so closely spaced that they are configured as a stereo dipole, e.g. as described by Ole Kirkeby in US 6,760,447 B1.
As already mentioned, HRTFs for directions at high elevation angles, such as above 50°, especially above 75°, exhibit a rather shallow structure in the frequency domain with only a limited amount of individual peaks and dips. Thus, incorporating such HRTFs in a cross-talk cancellation system provides a system which is insensitive to listener movements, and performance will only be insignificantly influenced by the fact that the listener's individual HRTFs will be different from the HRTFs used in the cross-talk cancellation system, since inter-individual differences in HRTFs at high elevation angles are small.
It is to be understood that corresponding positions of the loudspeaker drivers as in Figs. 3 and 4, but at negative elevation angles (i.e. -75° or -85° to -88°), in principle have the same advantages, in case it is possible in practice to place loudspeakers at such positions below the listener's head. For some applications it may even be preferred to place loudspeakers below rather than above the listener, e.g. in a car, in a cinema etc.
It may be preferred to position the loudspeaker drivers L1, L2 even closer together than illustrated in Fig. 4. However, in order for the cross-talk cancellation to function properly, it is normally preferred that there is at least a distance of 10 cm, e.g. 12-16 cm such as corresponding to the distance between ears of humans, between the loudspeakers.
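The relation between driver spacing, mounting height and elevation angle can be worked out with elementary trigonometry. The sketch below assumes the two drivers are placed symmetrically at azimuth 90° and -90°, and that the height is measured from the centre of the listener's head, which is the reference point for the angles in Fig. 1. The function names and the numeric example values are hypothetical.

```python
import math


def elevation_for_spacing(spacing_m, height_m):
    """Elevation angle in degrees of each driver, seen from the centre of the head."""
    half_spacing = spacing_m / 2.0  # lateral offset of each driver
    return math.degrees(math.atan2(height_m, half_spacing))


def height_for_elevation(spacing_m, elevation_deg):
    """Height above the head centre needed to reach a given elevation angle."""
    return (spacing_m / 2.0) * math.tan(math.radians(elevation_deg))


# Assumed example: drivers 14 cm apart, mounted 1 m above the centre of the head.
example_elevation = elevation_for_spacing(0.14, 1.0)
# Assumed example: height needed for the same spacing to appear at 88 degrees elevation.
example_height = height_for_elevation(0.14, 88.0)
```

With an assumed spacing of 14 cm, drivers 1 m above the head centre appear at roughly 86° elevation, and an elevation of 88° corresponds to a height of roughly 2 m.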
Figs. 5 and 6 illustrate two different reproduction systems, each with a total of three reproduction chains. Both of Figs. 5 and 6 illustrate a direction processor DP that receives an input audio signal A10, A20 and generates in response audio signals DP1, DP2, DP3 for reproduction by respective first, second and third reproduction chains. The direction processor DP should be construed broadly. The direction processor DP in Figs. 5 and 6 merely serves to illustrate classes of embodiments where different reproduction chains are used to reproduce different audio signal parts representing different spatial image location areas. Thus, the direction processor DP can illustrate a processor arranged to extract information in the audio signal A10, A20 regarding direction of different sound sources in the signal and thus split the input signal A10, A20 into separate (or a set of two) audio signals DP1, DP2, DP3 for reproduction via respective reproduction chains. However, the direction processor DP may also be construed as a binaural synthesis system that generates binaural signals DP1, DP2, DP3 representing different spatial locations for reproduction via respective reproduction chains. The embodiments in Figs. 5 and 6 merely serve to illustrate that preferred reproduction systems include more than one reproduction chain, and signals DP1, DP2, DP3 for reproduction via the separate reproduction chains represent only, or predominantly only, sound sources with spatial image locations in a limited area. In this way it is possible to mix several reproduction chains with physically different loudspeaker positions and obtain a better reproduction with an improved spatial impression.
Especially, the reproduction chain already illustrated in Figs. 2-4 is suited to reproduce all spatial directions where changes of ITDs in response to listener movement or head-turns are not crucial for the listener in order to localize the sound sources. Thus, this reproduction chain can be used for signal parts representing lateral directions, e.g. for azimuths in the ranges 45° to 135° and -45° to -135°, and in general for all directions with elevation angles larger than 45° or lower than -45°. Signal parts representing directions in front of the listener should then preferably be reproduced by the "front" reproduction chain, which may be a stereo dipole with a connected cross-talk cancellation system, while signal parts representing directions behind the listener should preferably be reproduced by the "back" reproduction chain, which may also be a stereo dipole with a connected cross-talk cancellation system. In this way ITDs will change in a natural way when the listener moves his head, and thus the physical position of the loudspeakers will support correct sound source reproduction. However, it is to be understood that the spatial separation should be considered in connection with the selected loudspeaker positions.
Fig. 5 shows three separate reproduction chains with respective cross-talk cancellation units CC1, CC2, CC3 and with respective sets of two loudspeaker drivers L1, L2, L3, L4, L5, L6. The physical positions of the loudspeaker sets are above, front and back, respectively. Thus, the loudspeaker drivers L1, L2 of the first reproduction chain are preferably positioned as illustrated in Fig. 3 or 4. The loudspeaker drivers L3, L4 of the second reproduction chain may be positioned in a traditional stereo setup, i.e. at azimuth angles 30° and -30°, respectively, both at 0° elevation angle. The loudspeaker drivers L5, L6 of the third reproduction chain may be positioned at azimuth angles 120° and -120°, respectively, both at 0° elevation angle. Alternatively, the front set of loudspeaker drivers L3, L4 may be closely spaced and form a stereo dipole positioned at 2° and -2° azimuth at 0° elevation, while the back set of loudspeaker drivers L5, L6 are also closely spaced to form a stereo dipole positioned at 178° and -178° azimuth and 0° elevation. In case stereo dipoles are used for all three reproduction chains, the system can be implemented with only three loudspeaker boxes each with two loudspeaker drivers.
The system illustrated in Fig. 6 is a simplified version of the system of Fig. 5. In the system of Fig. 6 the loudspeaker sets of the second and third reproduction chains are each replaced by only one loudspeaker driver, L3 and L5, respectively. Thus, cross-talk cancellation is only required for the first reproduction chain.
The system illustrated in Fig. 6 may be used for a rather simple home entertainment system. L1 and L2 can be small drivers capable of covering only a frequency range such as 200-15,000 Hz. L1 and L2 are configured as a stereo dipole and mounted in one common loudspeaker box. The loudspeaker box with L1 and L2 is then mounted on the ceiling right above the "sweet spot" where the listener is most often seated while listening to the reproduction system. Since L1 and L2 need not be capable of reproducing sound below 200 Hz, the loudspeaker box can be rather small and e.g. hidden behind a suspended ceiling and thereby hardly visible. The loudspeaker box may include a signal processor implementing the cross-talk cancellation CC1, a power amplifier and an RF receiver arranged to receive the input set of signals DP1 as a digital audio signal in a Radio Frequency representation, thus allowing wireless operation. The front loudspeaker including L3 can be a full-range loudspeaker system including a sub-woofer reproducing the low frequencies, e.g. positioned in connection with a TV display. L5 can be a driver limited to e.g. 200-15,000 Hz and mounted on the wall right behind the "sweet spot". The loudspeaker with L5 may also include an RF receiver and a power amplifier and thus allow wireless operation. To feed the described home entertainment reproduction system with existing sound formats, e.g. the 5.1 surround sound format, a conversion processor can be used to convert the 5.1 signal into signals for the front, the above and the back reproduction chains, respectively. The reproduction system may of course also be used in connection with e.g. computer games where the computer is capable of directly generating the correct binaural synthesis signals for the three reproduction chains with the desired spatial splitting.
Fig. 7 illustrates the three reproduction chains of Fig. 5. However, in Fig. 7 a binaural synthesis unit BS takes a mono audio signal A as input, and depending on which spatial image direction is to be applied, a binaural synthesis processing is performed on the input signal A, and sets of binaural signals DP10, DP20, DP30 are generated in response. Depending on the spatial image direction, the binaural synthesized signals are applied to one of the three sets of binaural signals DP10, DP20, DP30. Preferably, the first set of binaural signals DP10, to be reproduced by loudspeaker drivers L1, L2 positioned above the listener, e.g. close to 90° elevation, includes binaural signals representing all spatial image directions except front and back directions, e.g. with a split up in azimuth and elevation as described in the foregoing. DP20 then includes binaural signals representing frontal spatial image directions, while DP30 includes binaural signals representing spatial image directions behind the listener, e.g. in conformity with azimuth and elevation definitions as described in the foregoing. In preferred embodiments all three loudspeaker driver sets L1, L2, and L3, L4, and L5, L6 are configured as stereo dipoles, each mounted in one box (indicated by the dashed lines).
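The direction-dependent distribution performed by the binaural synthesis unit BS can be sketched as follows. This Python example is a simplified illustration under several assumptions that are not taken from the disclosure: the limits of 60° and 120° azimuth are borrowed from the loudspeaker position ranges and reused here as routing thresholds, the elevation limit `MAX_HORIZONTAL_ELEVATION_DEG` is hypothetical, a hard switch is used (the "exclusively" case rather than a gradual "mainly" split), and the head-related impulse responses `hrir_left` and `hrir_right` are placeholders supplied by the caller and must have equal length.

```python
import numpy as np

FRONT_LIMIT_DEG = 60.0               # assumed: |azimuth| <= 60 deg is "frontal"
BACK_LIMIT_DEG = 120.0               # assumed: |azimuth| >= 120 deg is "behind"
MAX_HORIZONTAL_ELEVATION_DEG = 45.0  # hypothetical: above this, use the top chain

def select_chain(azimuth_deg, elevation_deg=0.0):
    """Return which set of binaural signals (DP10, DP20 or DP30) gets the source."""
    az = (azimuth_deg + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    if abs(elevation_deg) > MAX_HORIZONTAL_ELEVATION_DEG:
        return "DP10"  # above/below directions go to the chain above the listener
    if abs(az) <= FRONT_LIMIT_DEG:
        return "DP20"  # frontal directions
    if abs(az) >= BACK_LIMIT_DEG:
        return "DP30"  # directions behind the listener
    return "DP10"      # remaining (lateral) directions

def binaural_synthesis(mono, hrir_left, hrir_right, azimuth_deg, elevation_deg=0.0):
    """Filter a mono signal with an HRIR pair and place it in one of three sets.

    Returns a dict mapping "DP10", "DP20", "DP30" to arrays of shape (2, N):
    row 0 is the left-ear signal and row 1 the right-ear signal.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    pair = np.stack([left, right])
    target = select_chain(azimuth_deg, elevation_deg)
    return {
        name: pair if name == target else np.zeros_like(pair)
        for name in ("DP10", "DP20", "DP30")
    }

if __name__ == "__main__":
    impulse = np.array([1.0, 0.0, 0.0, 0.0])
    out = binaural_synthesis(impulse, np.array([1.0, 0.5]), np.array([0.0, 0.8]), azimuth_deg=90.0)
    print({name: bool(np.any(sig)) for name, sig in out.items()})
```

Each of the three output sets would then be fed to its own cross-talk cancellation unit and driver pair. Several mono sources can be handled by calling the function once per source and summing the resulting sets.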
To sum up, the invention provides a spatial sound reproduction system for sound reproduction of a set of audio signals. A cross-talk cancellation unit is arranged to receive the set of audio signals and generate a processed set of audio signals in response. The processed set of audio signals is then reproduced by a set of loudspeaker drivers positioned at an elevation angle, relative to the listener's head, being larger than +50° or smaller than -50° (i.e. above or below the listener), and with the cross-talk cancellation unit being arranged to reduce acoustic cross-talk corresponding to the actual position of the set of loudspeaker drivers. Preferably, the set of loudspeaker drivers is positioned at an elevation angle larger than 70°, preferably right above the listener, laterally symmetric relative to the listener. The set of loudspeaker drivers may be configured as a stereo dipole, e.g. with a mutual distance of 10-20 cm. Further, a second reproduction chain with at least one loudspeaker driver is positioned at an azimuth angle in the range -60° to +60°, larger than +120°, or smaller than -120°, e.g. directly in front of or directly behind the listener, in order to improve reproduction of front/back directions. Further reproduction chains, each with one or two loudspeaker drivers at different positions, may be included. Preferably, the reproduction chain with loudspeakers positioned at a high elevation is then arranged to reproduce lateral sound source directions as well as above and below directions, while the second reproduction chain, e.g. with a front or back loudspeaker, serves to reproduce sound source directions primarily corresponding to the physical position. The second and/or further reproduction chains may each have two loudspeaker driver systems with or without respective cross-talk cancellation units, and the two loudspeaker drivers may be configured as stereo dipoles.
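As a rough illustration of what a cross-talk cancellation unit computes for a driver pair placed above the listener, the following Python sketch inverts a 2x2 acoustic transfer matrix at a single frequency. It is not the implementation of the disclosed system. It assumes free-field point sources (no head shadowing, no room reflections), an ear spacing of 18 cm, a driver spacing of 15 cm and a driver height of 1.5 m above the ears. The names `transfer_matrix` and `crosstalk_canceller` and the regularisation parameter `beta` are illustrative only.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def transfer_matrix(freq_hz, driver_positions, ear_positions):
    """Free-field 2x2 matrix H with H[i, j] = path from driver j to ear i."""
    H = np.empty((2, 2), dtype=complex)
    for i, ear in enumerate(ear_positions):
        for j, driver in enumerate(driver_positions):
            r = np.linalg.norm(np.asarray(driver) - np.asarray(ear))
            # Spherical spreading (1/r) and propagation delay (phase term).
            H[i, j] = np.exp(-2j * np.pi * freq_hz * r / SPEED_OF_SOUND) / r
    return H

def crosstalk_canceller(H, beta=0.0):
    """Regularised inverse C = (H^H H + beta I)^-1 H^H.

    With beta = 0 this is the exact inverse, so H @ C is the identity and each
    ear receives only its own signal. A positive beta limits the filter gain.
    """
    Hh = H.conj().T
    return np.linalg.solve(Hh @ H + beta * np.eye(2), Hh)

# Assumed geometry in metres: (x front, y left, z up), ears at the origin height.
EARS = [(0.0, 0.09, 0.0), (0.0, -0.09, 0.0)]       # left ear, right ear
DRIVERS = [(0.0, 0.075, 1.5), (0.0, -0.075, 1.5)]  # left driver, right driver

if __name__ == "__main__":
    H = transfer_matrix(1000.0, DRIVERS, EARS)
    C = crosstalk_canceller(H)
    # Magnitude of the ear signals per unit input; ideally a 2x2 identity.
    print(np.round(np.abs(H @ C), 6))
```

In a complete system such a matrix would be computed per frequency bin (or converted to time-domain filters with a modelling delay), and measured head-related transfer functions for the actual driver position would replace the free-field model.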
Certain specific details of the disclosed embodiment are set forth for purposes of explanation rather than limitation, so as to provide a clear and thorough understanding of the present invention. However, it should be understood by those skilled in this art, that the present invention might be practiced in other embodiments that do not conform exactly to the details set forth herein, without departing significantly from the spirit and scope of this disclosure. Further, in this context, and for the purposes of brevity and clarity, detailed descriptions of well-known apparatuses, circuits and methodologies have been omitted so as to avoid unnecessary detail and possible confusion.
Reference signs are included in the claims; however, the inclusion of the reference signs is only for clarity reasons and should not be construed as limiting the scope of the claims.
Claims
1. A spatial sound reproduction system for sound reproduction of a set of first and second audio signals (A1, A2) with spatial cues to a listener, the system including
- a first reproduction chain including
- a cross-talk cancellation unit (CC1) arranged to receive the first and second audio signals (A1, A2) and generate first and second processed audio signals (P1, P2) in response,
- a first loudspeaker with first and second loudspeaker drivers (L1, L2) arranged to reproduce the respective first and second processed audio signals (P1, P2), the first and second loudspeaker drivers (L1, L2) being arranged for a first set of respective positions, during reproduction, the first set of positions being described by an elevation angle, relative to the listener's head, being larger than +50° or smaller than -50°, and wherein the cross-talk cancellation unit (CC1) is arranged to reduce acoustic cross-talk corresponding to the first position, and
- a second reproduction chain including
- a second loudspeaker with at least a third loudspeaker driver (L3) arranged for a second position, during reproduction, wherein the second position is described by an azimuth angle, relative to the listener's head, being in the range from -60° to +60°, larger than +120°, or smaller than -120°, and wherein the second position is different from the first set of positions.
2. System according to claim 1, wherein the first set of positions is described by an elevation angle, relative to the listener's head, being larger than +70° or smaller than -70°.
3. System according to claim 1 or 2, wherein the first set of positions of the first loudspeaker is above or below the listener.
4. System according to any of the preceding claims, wherein the first and second loudspeaker drivers (L1, L2) are positioned with a mutual distance of less than 50 cm.
5. System according to claim 4, wherein the mutual distance is less than 20 cm.
6. System according to any of the preceding claims, wherein the first and second loudspeaker drivers (L1, L2) are configured to form a stereo dipole.
7. System according to any of the preceding claims, wherein the first and second loudspeaker drivers (L1, L2) are positioned laterally symmetric relative to the listener, preferably at azimuth angles with numeric values in the range 80°-100°.
8. System according to claim 7, wherein the first and second loudspeaker drivers (L1, L2) are positioned at azimuth angles 90° and -90°, respectively.
9. System according to any of the preceding claims, wherein the second loudspeaker further includes a fourth loudspeaker driver (L4), and wherein the second reproduction chain further includes an associated second cross-talk cancellation unit (CC2) arranged to reduce acoustic cross-talk corresponding to the second position.
10. System according to any of the preceding claims, further including a direction processor (DP) arranged to receive a set of audio signals (A10, A20) and generate in response first and second direction processed audio signals (DP1, DP2) for reproduction via the respective first and second reproduction chains in accordance with the spatial cues in the first and second audio signals.
11. System according to any of the preceding claims, wherein portions of the set of audio signals (A10, A20) representing lateral image directions are mainly or exclusively included in the first direction processed audio signal (DP1) arranged for reproduction by the first reproduction chain.
12. System according to any of the preceding claims, wherein the second position is described by an azimuth angle, relative to the listener's head, being in the range from -60° to +60°, and wherein the system further includes a third reproduction chain including a third loudspeaker with at least one loudspeaker driver (L5), the third loudspeaker being arranged for a third position, during reproduction.
13. System according to claim 12, wherein the third position is described by an azimuth angle, relative to the listener's head, being smaller than -120° or larger than +120°.
14. System according to claim 12 or 13, wherein the second and third reproduction chains include respective associated second and third cross-talk cancellation units (CC2, CC3).
15. System according to any of the preceding claims, further including a binaural synthesis unit (BS) arranged to receive a mono audio signal (A), apply binaural synthesis to the mono audio signal (A) and generate in response a first set of binaural audio signals (DP10) for reproduction via the first reproduction chain (CC1, L1, L2) and a second set of binaural audio signals (DP20) for reproduction via the second reproduction chain (CC2, L3, L4), and wherein first image directions are mainly or exclusively represented in the first binaural audio signals (DP10) while second image directions are mainly or exclusively represented in the second binaural audio signals (DP20).
16. System according to claim 15, wherein the first image directions include lateral image directions and the second image directions include frontal image directions.
17. System according to claim 15 or 16, wherein the binaural synthesis unit (BS) is arranged to receive two or more mono audio signals (A) and apply separate binaural syntheses to the two or more mono audio signals (A).
18. System according to any of the preceding claims, wherein the set of first and second audio signals (A1, A2) is a set of binaural audio signals.
19. Method for reproducing spatial sound to a listener based on a set of first and second audio signals with spatial cues, the method including
- generating first and second processed audio signals by performing a cross-talk cancellation processing on the first and second audio channels,
- reproducing the respective first and second processed audio signals by a first loudspeaker with first and second loudspeaker drivers, the first and second loudspeaker drivers being arranged for a first set of respective positions, during reproduction, the first set of positions being described by an elevation angle, relative to the listener's head, being larger than +50° or smaller than -50°, and wherein the cross-talk cancellation processing is performed to reduce acoustic cross-talk corresponding to the first position, and
- reproducing sound with a second reproduction chain including a second loudspeaker with at least a third loudspeaker driver arranged for a second position, during reproduction, wherein the second position is described by an azimuth angle, relative to the listener's head, being in the range from -60° to +60°, larger than +120°, or smaller than -120°, and wherein the second position is different from the first set of positions.
20. Method according to claim 19, wherein the first set of positions is described by an elevation angle, relative to the listener's head, being larger than +70° or smaller than -70°.
21. Method according to claim 19 or 20, wherein the first set of positions of the first loudspeaker is above or below the listener.
22. Method according to any of claims 19-21, wherein the first and second loudspeaker drivers are positioned with a mutual distance of less than 50 cm.
23. Method according to claim 22, wherein the mutual distance is less than 20 cm.
24. Method according to any of claims 19-23, wherein the first and second loudspeaker drivers are configured to form a stereo dipole.
25. Method according to any of claims 19-24, wherein the first and second loudspeaker drivers are positioned laterally symmetric relative to the listener, preferably at azimuth angles with numeric values in the range 80°-100°.
26. Method according to claim 25, wherein the first and second loudspeaker drivers are positioned at azimuth angles 90° and -90°, respectively.
27. Method according to any of claims 19-26, wherein the second loudspeaker further includes a fourth loudspeaker driver, and wherein the method includes generating third and fourth processed audio signals to be reproduced by the respective third and fourth loudspeaker drivers by performing a cross-talk cancellation processing corresponding to the second position.
28. Method according to any of claims 19-27, further including generating a set of direction processed audio signals for reproduction via the respective first and second reproduction chains in accordance with the spatial cues in the first and second audio signals.
29. Method according to any of claims 19-28, wherein portions of the set of audio signals representing lateral image directions are mainly or exclusively included in the first direction processed audio signal arranged for reproduction by the first reproduction chain.
30. Method according to any of claims 19-29, wherein the second position is described by an azimuth angle, relative to the listener's head, being in the range from -60° to +60°, and wherein the system further includes a third reproduction chain including a third loudspeaker with at least one loudspeaker driver, the third loudspeaker being arranged for a third position, during reproduction.
31. Method according to claim 30, wherein the third position is described by an azimuth angle, relative to the listener's head, being smaller than -120° or larger than +120°.
32. Method according to claim 30 or 31, wherein second and third cross-cancellation processings are performed for the respective second and third reproduction chains.
33. Method according to any of claims 19-32, the method further including receiving a mono audio signal, applying binaural synthesis thereto and generating in response a first set of binaural audio signals for reproduction via the first reproduction chain and a second set of binaural audio signals for reproduction via the second reproduction chain, and wherein first image directions are mainly or exclusively represented in the first binaural audio signals while second image directions are mainly or exclusively represented in the second binaural audio signals.
34. Method according to claim 33, wherein the first image directions include lateral image directions and the second image directions include frontal image directions.
35. Method according to claim 33 or 34, the method including receiving two or more mono audio signals and applying separate binaural syntheses to the two or more mono audio signals.
36. Method according to any of claims 19-35, wherein the set of first and second audio signals is a set of binaural audio signals.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DKPA200700680 | 2007-05-07 | | |
| DKPA200700680 | 2007-05-07 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2008135049A1 (en) | 2008-11-13 |
Family
ID=38823743
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/DK2008/050100 Ceased WO2008135049A1 (en) | 2007-05-07 | 2008-05-06 | Spatial sound reproduction system with loudspeakers |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2008135049A1 (en) |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6760447B1 (en) * | 1996-02-16 | 2004-07-06 | Adaptive Audio Limited | Sound recording and reproduction systems |
| US6577736B1 (en) * | 1998-10-15 | 2003-06-10 | Central Research Laboratories Limited | Method of synthesizing a three dimensional sound-field |
| WO2003030589A2 (en) * | 2001-09-28 | 2003-04-10 | Adaptive Audio Limited | Sound reproduction systems |
| EP1460879A2 (en) * | 2003-03-18 | 2004-09-22 | ASK INDUSTRIES S.p.A. | Individual sound system for vehicles |
Cited By (23)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9107023B2 (en) | 2011-03-18 | 2015-08-11 | Dolby Laboratories Licensing Corporation | N surround |
| TWI666944B (en) * | 2011-07-01 | 2019-07-21 | 杜比實驗室特許公司 | Apparatus, method and non-transitory medium for enhanced 3d audio authoring and rendering |
| US10244343B2 (en) | 2011-07-01 | 2019-03-26 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US9204236B2 (en) | 2011-07-01 | 2015-12-01 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US11057731B2 (en) | 2011-07-01 | 2021-07-06 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US9549275B2 (en) | 2011-07-01 | 2017-01-17 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US12047768B2 (en) | 2011-07-01 | 2024-07-23 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US9838826B2 (en) | 2011-07-01 | 2017-12-05 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| TWI701952B (en) * | 2011-07-01 | 2020-08-11 | 美商杜比實驗室特許公司 | Apparatus, method and non-transitory medium for enhanced 3d audio authoring and rendering |
| US11641562B2 (en) | 2011-07-01 | 2023-05-02 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US10609506B2 (en) | 2011-07-01 | 2020-03-31 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| WO2014035728A3 (en) * | 2012-08-31 | 2014-04-17 | Dolby Laboratories Licensing Corporation | Virtual rendering of object-based audio |
| CN104604255A (en) * | 2012-08-31 | 2015-05-06 | 杜比实验室特许公司 | Virtual rendering of object-based audio |
| US9622011B2 (en) | 2012-08-31 | 2017-04-11 | Dolby Laboratories Licensing Corporation | Virtual rendering of object-based audio |
| CN104604255B (en) * | 2012-08-31 | 2016-11-09 | 杜比实验室特许公司 | The virtual of object-based audio frequency renders |
| US10582327B2 (en) | 2017-10-13 | 2020-03-03 | Dolby Laboratories Licensing Corporation | Systems and methods for providing an immersive listening experience in a limited area using a rear sound bar |
| CN114503606A (en) * | 2019-09-24 | 2022-05-13 | 诺基亚技术有限公司 | Audio processing |
| EP4035425A4 (en) * | 2019-09-24 | 2023-10-11 | Nokia Technologies Oy | AUDIO PROCESSING |
| WO2021058858A1 (en) | 2019-09-24 | 2021-04-01 | Nokia Technologies Oy | Audio processing |
| US12231867B2 (en) | 2019-09-24 | 2025-02-18 | Nokia Technologies Oy | Audio processing |
| CN115244952A (en) * | 2020-03-03 | 2022-10-25 | 诺基亚技术有限公司 | Apparatus, method and computer program for enabling reproduction of spatial audio signals |
| WO2021176135A1 (en) * | 2020-03-03 | 2021-09-10 | Nokia Technologies Oy | Apparatus, methods and computer programs for enabling reproduction of spatial audio signals |
| US12439220B2 (en) | 2020-03-03 | 2025-10-07 | Nokia Technologies Oy | Apparatus, methods and computer programs for enabling reproduction of spatial audio signals |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US7333622B2 (en) | Dynamic binaural sound capture and reproduction | |
| Theile | Multichannel natural music recording based on psychoacoustic principles | |
| Gardner | 3-D audio using loudspeakers | |
| EP3253079B1 (en) | System for rendering and playback of object based audio in various listening environments | |
| CN101874414B (en) | Method and apparatus for improving the accuracy of sound field rendering in sweet spot regions | |
| US6577736B1 (en) | Method of synthesizing a three dimensional sound-field | |
| US4418243A (en) | Acoustic projection stereophonic system | |
| US20080056517A1 (en) | Dynamic binaural sound capture and reproduction in focued or frontal applications | |
| CN1658709B (en) | Sound reproduction apparatus and sound reproduction method | |
| CN103053180A (en) | System and method for sound reproduction | |
| EP3895451A1 (en) | Method and apparatus for processing a stereo signal | |
| US20070009120A1 (en) | Dynamic binaural sound capture and reproduction in focused or frontal applications | |
| WO2002015637A1 (en) | Method and system for recording and reproduction of binaural sound | |
| WO2008135049A1 (en) | Spatial sound reproduction system with loudspeakers | |
| CN102598718A (en) | Loudspeaker system for reproducing multi-channel sound with an improved sound image | |
| JP2645731B2 (en) | Sound image localization reproduction method | |
| JP2013504837A (en) | Phase layering apparatus and method for complete audio signal | |
| US6990210B2 (en) | System for headphone-like rear channel speaker and the method of the same | |
| US10440495B2 (en) | Virtual localization of sound | |
| WO2024186771A1 (en) | Systems and methods for hybrid spatial audio | |
| US7050596B2 (en) | System and headphone-like rear channel speaker and the method of the same | |
| JP2000333297A (en) | Stereophonic sound generator, method for generating stereophonic sound, and medium storing stereophonic sound | |
| KR101526014B1 (en) | Multi-channel surround speaker system | |
| US20030068051A1 (en) | Surround sound speaker system | |
| AU2004202113A1 (en) | Depth render system for audio |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 08734555; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 08734555; Country of ref document: EP; Kind code of ref document: A1 |