
WO1997025834A2 - Method and device for processing a multi-channel signal for use with a headphone - Google Patents

Method and device for processing a multi-channel signal for use with a headphone

Info

Publication number
WO1997025834A2
Authority
WO
WIPO (PCT)
Prior art keywords
hrtf
hrtfs
signal
bit
sets
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US1997/000145
Other languages
English (en)
Other versions
WO1997025834A3 (fr
Inventor
Timothy J. Tucker
David M. Green
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Virtual Listening Systems Inc
Original Assignee
Virtual Listening Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US08/582,830 external-priority patent/US5742689A/en
Application filed by Virtual Listening Systems Inc filed Critical Virtual Listening Systems Inc
Priority to AU15271/97A priority Critical patent/AU1527197A/en
Publication of WO1997025834A2 publication Critical patent/WO1997025834A2/fr
Publication of WO1997025834A3 publication Critical patent/WO1997025834A3/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/002 Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
    • H04S3/004 For headphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00 Monitoring arrangements; Testing arrangements
    • H04R29/001 Monitoring arrangements; Testing arrangements for loudspeakers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/01 Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/01 Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/008 Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels

Definitions

  • the present invention relates to a method and device for processing a multi-channel audio signal for reproduction over headphones.
  • the present invention relates to an apparatus and method for creating, over headphones, the sensation of multiple "phantom" loudspeakers in a user matched virtual listening environment.
  • each audio channel of the multi-channel signal is routed to one of several loudspeakers distributed throughout the theater, providing movie-goers with the sensation that sounds are originating all around them.
  • At least one of these formats for example the Dolby Pro Logic® format, has been adapted for use in the home entertainment industry.
  • the Dolby Pro Logic® format is now in wide use in home theater systems.
  • each audio channel of the multi-channel signal is routed to one of several loudspeakers placed around the room, providing home listeners with the sensation that sounds are originating all around them.
  • As the home entertainment system market expands, other multi-channel systems will likely become available to home consumers. When humans listen to sounds produced by loudspeakers, it is termed open-ear listening.
  • Open-ear listening occurs when the ears are uncovered. It is the way we listen in everyday life.
  • the sonic information arriving at the ears provides cues about the location and distance of the sound source. Humans are able to localize a sound to the right or left based on differences in the arrival times and differences in the sound levels at the two ears. Other subtle differences in the spectrum of the sound at each eardrum provide cues about the sound source elevation and front-back location. These differences are related to the filtering effects of several body parts, most notably the head and the pinnae of the ears.
  • the process of listening while the outer surface of the ear is covered is termed closed-ear listening.
  • Covering the ear changes the ear canal resonance characteristics. Due to the physical effects of wearing headphones, sound delivered through headphones lacks the subtle differences in time, level, and spectra caused by location, distance, and the filtering effects of the head and pinna experienced in open-ear listening. Thus, when headphones are used with multi-channel home entertainment systems, the advantages of listening via numerous loudspeakers placed throughout the room are lost, the sound often appearing to be originating inside the listener's head.
  • an object of the present invention is to provide a method for processing the multi-channel output typically produced by home entertainment or like systems such that when presented over headphones, the listener is able to select a best match set of head related transfer functions from a database of measured head related transfer functions to filter the channels such that the listener experiences the sensation of multiple "phantom" loudspeakers placed throughout the room.
  • Another object of the present invention is to provide an apparatus for processing the multi-channel output typically produced by home entertainment or like systems such that when presented over headphones, the listener experiences listening sensations most like that which the listener, as an individual, would experience when listening to multiple loudspeakers placed throughout the room.
  • Another object of the present invention is to provide an apparatus for processing the multi-channel output typically produced by home entertainment or like systems such that when presented over headphones, the listener experiences sensations typical of open-ear (unobstructed) listening.
  • Another object of the present invention is to provide an apparatus and method for measuring the acoustic filtering action produced by the head and pinnae of the human ears so as to produce a useful database of head related transfer functions.
  • Another object of the present invention is to create a database of HRTFs representative of the general listening public by measuring and recording a large enough set of such HRTFs such that any given individual is likely to be able to select a set of HRTFs from the database so that when used to process an audio signal the user perceives the corresponding sounds to be localized in the proper spatial positions.
  • Another object of the present invention is to provide a means of determining the "best-match" of an individual listener to one of the HRTF sets of the representative database such that the individual listener can be matched as closely as possible to an already measured set of HRTFs stored in a database, such that once properly matched, the individual will experience the correct "phantom" locations of the sources of the listening system.
  • Another object of the present invention is to provide a wired or wireless transmission system for dimensionalized listening of sound over headphones.
  • multiple channels of an audio signal are processed through the application of filtering using a head related transfer function (HRTF) or a plurality of HRTFs, selected by a user, such that when reduced to two channels, left and right, each channel contains information that enables the listener to sense the location of multiple phantom loudspeakers when listening over headphones.
  • HRTF head related transfer function
  • multiple channels of an audio signal are processed through the application of filtering using HRTFs chosen from a large database such that when listening through headphones, the listener experiences a sensation that most closely matches the sensation the listener, as an individual, would experience when listening to multiple loudspeakers.
  • the right and left channels are filtered in order to simulate the effects of open-ear listening.
  • a complete set of HRTFs for an individual is measured and recorded, such that the measured HRTFs are an accurate reflection of the filtering effects of that individual's head and pinnae, and in which the measurement takes on the order of a few minutes.
  • HRTFs are measured such that an HRTF is specified for each location in space about the listener with an accuracy of approximately 10° in both the vertical and horizontal dimensions.
  • the HRTFs of a sufficient number of individuals are measured and stored to create a database such that a given individual is able to select a set of HRTFs from the database such that when audio signals are processed with the selected set of HRTFs, the user perceives the corresponding sounds to be localized in the proper spatial positions.
  • the database of HRTFs comprises a representative set of HRTF sets.
  • an individual is matched to a "best-match" set of HRTFs selected from a database of sets of HRTFs measured from a representative sample of the general listening population, where the individual listener participates in the matching of the set of HRTFs by comparing the perception created by different HRTF sets and selecting the HRTF set providing the best spatial perception.
  • a database of HRTF sets, measured from a representative sample of the listening population, is established, such that an individual can select a "best-match" set of HRTFs from the database.
  • a best match set of HRTFs is selected from the database of HRTFs and is used to process signals for wired or wireless transmission to a listener wearing headphones.
  • Figure 1 is a representation of sound waves received at both ears of a listener sitting in a room with a typical multi-channel loudspeaker configuration.
  • Figure 2 is a representation of the listening sensation experienced through headphones according to an exemplary embodiment of the present invention.
  • Figure 3a shows the sound source locations used to measure a set of head related transfer functions (HRTFs) obtained at multiple elevations and azimuths surrounding a listener.
  • HRTFs head related transfer functions
  • Figure 3b is a graph representing the HRTF for 0 degrees elevation and 30 degrees azimuth for three different individuals.
  • Figure 4 is a schematic in block diagram form of a typical multi-channel headphone processing system according to an exemplary embodiment of the present invention.
  • Figure 5 is a schematic in block diagram form of a bass boost circuit according to an exemplary embodiment of the present invention.
  • Figure 6A is a schematic in block diagram form of HRTF filtering as applied to a single channel according to an exemplary embodiment of the present invention.
  • Figure 6B is a schematic in block diagram form of the process of HRTF matching based on an ordered set of HRTFs according to the present invention.
  • Figure 7 is a representation of a typical digital signal transmission system comprising a transmitting station, a connecting medium called a channel and a receiving station.
  • Figure 8A is a block diagram of a novel radio-frequency transmission system for use in a wireless embodiment of this invention.
  • Figure 8B is a representation of an adaptive filter for removing the DC component of a digital signal.
  • Figure 9A shows a computer simulated input gaussian noise source with a variance of 2.5 mV and a mean of 0.5 V.
  • Figure 9B shows the tracking constant, C[k], during a computer simulation of the removal of the DC component of an input gaussian noise source by an adaptive filter.
  • Figure 9C shows the output of an adaptive filter where the input is a gaussian noise source.
  • Figures 9D and 9E show the magnitude frequency response of the input gaussian noise waveform and DC shifted output.
  • Figure 9F is a schematic of a state machine.
  • Figure 9G is a timing diagram of various clock outputs for decoding signals encoded according to one embodiment of this invention.
  • Figure 10 depicts an HRTF matching process according to the present invention.
  • Figure 11 shows an impulse response wave form recorded from one individual at one spatial location for one ear.
  • Figure 12 illustrates critical band filtering according to the present invention.
  • Figure 13 illustrates an exemplary subject filtered HRTF matrix according to the present invention.
  • Figure 14 illustrates a hypothetical hierarchical agglomerative clustering procedure in two dimensions according to the present invention.
  • Figure 15 illustrates a hypothetical hierarchical agglomerative clustering procedure according to an exemplary embodiment of the present invention.
  • Figure 16 is a schematic in block diagram form of a typical reverberation processor constructed of parallel lowpass comb filters.
  • Figure 17 is a schematic in block diagram form of a typical lowpass comb filter.
  • Figure 18a is a schematic of a preferred embodiment of an HRTF measurement means.
  • Figure 18b further illustrates a preferred embodiment of an HRTF measurement means.
  • Figure 19 is a schematic representation of the HRTF measurement control system.
  • Figure 20 is a schematic representation of the HRTF measurement control system software flow chart.
  • Figure 21A is a schematic representation of a front view of a sound room in which HRTFs may be measured to produce the database of HRTFs of this invention.
  • Figure 21B is a schematic representation of a top view of the sound room.
  • Figure 21C shows the detail of the cross section of the wall of the sound room.
  • Figure 22A shows the probability that the RMS distance, between any individual's HRTF and the nearest HRTF already in the database, is less than a certain RMS distance (dB), as a function of the number of HRTF sets in the database.
  • Figure 22B shows the cumulative density function of the distance between each of 150 HRTFs and the mean HRTF.
  • Figure 22C shows the change in average mean as a function of subsample group size.
  • Figure 22D shows the change in average standard deviation as a function of subsample group size.
  • Figure 22E shows the mean minimum distance between any HRTF set of the 150 HRTF sets and one of the stored HRTF sets as a function of the number of stored HRTF sets.
  • Figures 23A, B, C are block diagrams of a circuit according to this invention for processing signals using a best match set of HRTFs selected by a user from the database of this invention.
  • Figure 24 is a detail of an early reflection processing circuit 612 according to Figure 23.
  • Figure 25 is a detail of an HRTF processing circuit 663 according to Figure 23 comprising finite impulse response filters that implement HRTFs selected from the database of this invention.
  • Figure 26 is a detail of a reverberation circuit 671 according to Figure 23.
  • Figure 27 is a detail of a bass boost processing circuit 670 according to Figure 23.
  • Figures 28A, B, C are a schematic representation of the HRTF selection and matching performed by a user to arrive at a best match set of HRTFs which is then used for processing of audio signals according to Figures 25 and 23.
  • Figure 29A, B is an alternate embodiment to that disclosed in Figures 28A, B, and C.
  • the method and device according to the present invention processes audio signals, including multi-channel audio signals having a plurality of channels, each corresponding to a loudspeaker placed in a particular location in a room, in such a way as to create, over headphones, the sensation of multiple "phantom" loudspeakers placed throughout the room.
  • the present invention utilizes Head Related Transfer Functions (HRTFs) that are chosen according to the elevation and azimuth of each intended loudspeaker relative to the listener, each channel being filtered by a set of HRTFs such that when combined into left and right channels and played over headphones, the listener senses that the sound is actually produced by phantom loudspeakers placed throughout the "virtual" room.
  • HRTFs Head Related Transfer Functions
  • the filtering of the present invention utilizes a database collection of sets of HRTFs measured from numerous individuals and subsequent matching of the best HRTF set to an individual listener, thus providing the listener with listening sensations similar to that which the listener, as an individual, would experience when listening to multiple loudspeakers placed throughout the room. Additionally, the present invention utilizes an appropriate transfer function applied to the right and left channel output so that the sensation of open-ear listening may be experienced through closed-ear headphones.
  • the present invention also provides a measurement device and method for measuring and recording complete sets of HRTFs of subjects from a representative sample of the listening population, such that the measured HRTFs are an accurate reflection of the filtering effects of the head and pinnae of each of the subjects measured. For each individual, as many as 360 HRTFs for each ear may be measured, with each HRTF depending on the position or location of the sound source with respect to the listener.
  • FIG. 1 depicts the path of sound waves received at both ears of a listener according to a typical embodiment of a home entertainment system.
  • the multi-channel audio signal is decoded into multiple channels, i.e., a two-channel encoded signal is decoded into a multi-channel signal in accordance with, for example, the Dolby Pro Logic® format.
  • Each channel of the multi-channel signal is then played, for example, through its associated loudspeaker, e.g., one of five loudspeakers: left; right; center; left surround; and right surround.
  • the effect is the sensation that sound is originating all around the listener.
  • Figure 2 depicts the listening experience created by an exemplary embodiment of the present invention.
  • the present invention processes each channel of a multi-channel signal using a set of HRTFs appropriate for the distance and location of each phantom loudspeaker (e.g., the intended loudspeaker for each channel) relative to the listener's left and right ears. All resulting left ear channels are summed, and all resulting right ear channels are summed, producing two channels, left and right. Each channel is then preferably filtered using a transfer function that introduces the effects of open-ear listening. When the two channel output is presented via headphones, the listener senses that the sound is originating from five phantom loudspeakers placed throughout the room, as indicated in Figure 2.
  • HRTF Head Related Transfer Function
  • An HRTF is a transfer function obtained from one individual for one ear for a specific sound source location.
  • An HRTF is described by multiple coefficients that characterize how sound produced at a particular spatial position should be filtered to simulate the filtering effects of the head and outer ear of a particular individual.
  • HRTFs are typically measured at various elevations and azimuths. Typical HRTF measurement locations are illustrated in Figure 3A. In Figure 3A, the horizontal plane located at the center of the listener's head 100 represents
  • HRTF locations are defined by a pair of elevation and azimuth coordinates and are represented by a small sphere 110.
  • HRTFs are measured in 10 degree intervals for the azimuth and 10 degree intervals for the elevation from 30 degrees below the horizon to 60 degrees above the horizon.
  • Associated with each sphere 110 is a set of HRTF coefficients that represent the transfer function for that sound source location.
  • Each sphere 110 is actually associated with two HRTFs, one for each ear.
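The measurement grid described above can be enumerated directly. The following is a minimal sketch, assuming that azimuth covers the full circle starting at 0 degrees and that both elevation endpoints are included; the function and constant names are hypothetical. Under those assumptions the grid has 360 locations, which agrees with the figure of up to 360 HRTFs per ear given elsewhere in the description.

```python
# Hypothetical enumeration of the HRTF measurement grid.
AZIMUTH_STEP_DEG = 10
ELEVATION_STEP_DEG = 10
ELEVATION_MIN_DEG = -30   # 30 degrees below the horizon
ELEVATION_MAX_DEG = 60    # 60 degrees above the horizon


def measurement_grid():
    """Return (elevation, azimuth) pairs in degrees, one per source location."""
    azimuths = range(0, 360, AZIMUTH_STEP_DEG)  # assumed: full circle from 0
    elevations = range(ELEVATION_MIN_DEG,
                       ELEVATION_MAX_DEG + 1,   # assumed: endpoints included
                       ELEVATION_STEP_DEG)
    return [(el, az) for el in elevations for az in azimuths]


grid = measurement_grid()
print(len(grid))      # 360 locations, i.e. 360 HRTFs per ear
print(2 * len(grid))  # 720 HRTFs per listener (one per ear per location)
```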
  • a "universal" set of HRTFs would give very different sensations to each of the three individuals depicted. For instance, if an individual's HRTF had a peak (or valley) at a frequency f, while the universal HRTF had a contradictory valley (or peak) at the same frequency f, the individual would interpret the directional cues of the signal incorrectly. These inaccurate or poorly matched HRTFs degrade the overall 3D perception of the individual, the amount of degradation depending on the individual. This was experimentally demonstrated by Wightman and Kistler (1993).
  • the present invention provides a database of HRTFs collected from a measured group of the general population. For example, the HRTFs are collected from numerous individuals of both sexes with varying physical characteristics. The present invention then employs a unique process whereby the sets of HRTFs obtained from all individuals are organized into an ordered fashion and stored in a read only memory (ROM) or other storage device.
  • ROM read only memory
  • An HRTF matching processor enables each user to select, from the sets of HRTFs stored in the ROM, a set of HRTFs such that when audio signals are processed with the selected set of HRTFs, the user perceives the corresponding sounds to be localized in the proper spatial positions.
  • An exemplary embodiment of the present invention is illustrated in Figure 4.
  • the multi-channel signal has been decoded into its constituent channels, for example channels 1, 2, 3, 4 and 5 in the Dolby Pro Logic® format.
  • selected channels are processed via an optional bass boost circuit 6.
  • channels 1, 2 and 3 are processed by the bass boost circuit 6.
  • Output channels 7, 8 and 9 from the bass boost circuit 6, as well as channels 4 and 5, are then each electronically processed to create the sensation of a phantom loudspeaker for each channel.
  • the HRTF processing circuits can include, for example, a suitably programmed digital signal processor.
  • a best match between the listener and a set of HRTFs is selected via the HRTF matching processor 59.
  • a preferred pair of HRTFs, one for each ear, is selected for each channel as a function of the intended loudspeaker position of each channel of the multi-channel signal.
  • the best match set of HRTFs is selected from an ordered set of HRTFs stored in ROM 65 via the HRTF matching processor 59 and routed to the appropriate HRTF processor 10, 11, 12, 13 and 14.
  • sets of HRTFs stored in the HRTF database 63 are processed by an HRTF ordering processor 64 such that they may be stored in ROM
  • Each channel of the dual channel output from, for example, the HRTF processing circuit 10 is multiplied by a scaling factor as shown, for example, at nodes 16 and 17.
  • This scaling factor reflects signal attenuation as a function of the distance between the phantom loudspeaker and the listener's ear.
  • All right ear channels are summed at node 26.
  • All left ear channels are summed at node 27.
  • the output of nodes 26 and 27 results in two channels, left and right respectively, each of which contains signal information necessary to provide the sensation of left, right, center, and rear loudspeakers intended to be created by each channel of the multi-channel signal, but now configured to be presented over conventional two transducer headphones.
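The signal path just described (per-channel filtering by a left-ear and a right-ear HRTF, distance scaling, then summation into two outputs) can be sketched as follows. This is an illustrative sketch, not the circuit of Figure 4: the function names are hypothetical, the HRTFs are assumed to be time-domain FIR tap lists of equal length, and the scaling factor is assumed to be exactly 1/distance, whereas the text only says it reflects attenuation with distance.

```python
def convolve(signal, taps):
    """Direct-form FIR convolution; returns len(signal) + len(taps) - 1 samples."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(taps):
            out[n + k] += x * h
    return out


def binaural_downmix(channels, hrtf_pairs, distances):
    """Filter each channel with its HRTF pair, scale, and sum per ear.

    channels:   equal-length sample lists, one per loudspeaker channel
    hrtf_pairs: (left_taps, right_taps) per channel, all the same length
    distances:  phantom-loudspeaker distance per channel (arbitrary units)
    """
    n_out = len(channels[0]) + len(hrtf_pairs[0][0]) - 1
    left = [0.0] * n_out
    right = [0.0] * n_out
    for x, (h_left, h_right), d in zip(channels, hrtf_pairs, distances):
        scale = 1.0 / d  # assumed free-field 1/distance attenuation
        for i, v in enumerate(convolve(x, h_left)):
            left[i] += scale * v    # left-ear summing node
        for i, v in enumerate(convolve(x, h_right)):
            right[i] += scale * v   # right-ear summing node
    return left, right


# Toy example: two channels carrying impulses, made-up 2-tap "HRTFs".
channels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
hrtf_pairs = [([1.0, 0.5], [0.25, 0.0]), ([0.0, 0.5], [1.0, 0.0])]
left, right = binaural_downmix(channels, hrtf_pairs, distances=[1.0, 2.0])
print(left)   # [1.0, 0.5, 0.25, 0.0]
print(right)  # [0.25, 0.5, 0.0, 0.0]
```

A real implementation would use measured HRTF coefficients and would more likely run block-based convolution on a DSP, as the description suggests.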
  • parallel reverberation processing may optionally be performed on one or more channels by reverberation circuit 15.
  • the sound signal that reaches the ear includes information transmitted directly from each sound source as well as information reflected off of surfaces such as walls and ceilings. Sound information that is reflected off of surfaces is delayed in its arrival at the ear relative to sound that travels directly to the ear.
  • at least one channel of the multi-channel signal would be routed to the reverberation circuit 15, as shown in Figure 4.
  • one or more channels are routed through the reverberation circuit 15.
  • the circuit 15 includes, for example, numerous lowpass comb filters in parallel configuration. This is illustrated in Figure 16.
  • the input channel is routed to lowpass comb filters 140, 141, 142, 143, 144 and 145. Each of these filters is designed, as is known in the art, to introduce the delays associated with reflection off of room surfaces.
  • the output of the lowpass comb filters is summed at node 146 and passed through an allpass filter 147.
  • the output of the allpass filter is separated into two channels, left and right.
  • a gain, g, is applied to the left channel at node 147.
  • An inverse gain, -g, is applied to the right channel at node 148.
  • the gain g allows the relative proportions of direct and reverberated sounds to be adjusted.
  • Figure 17 illustrates an exemplary embodiment of a lowpass comb filter 140.
  • the input to the comb filter is summed with filtered output from the comb filter at node 150.
  • the summed signal is routed through the comb filter 151 where it is delayed D samples.
  • the output of the comb filter is routed to node 146, shown in Figure 16, and also summed with feedback from the lowpass filter 153 loop at node 152.
  • the summed signal is then input to the lowpass filter 153.
  • the output of the lowpass filter 153 is then routed back through both the comb filter and the lowpass filter, with gains g1 and g2 applied at nodes 154 and 155, respectively.
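The reverberation structure described above can be sketched in code. The following is a minimal sketch of a conventional lowpass-feedback comb filter and a parallel bank of such filters, not a transcription of Figures 16 and 17. Several points are assumptions: the lowpass is taken to be a one-pole filter, g1 is taken as the gain inside the lowpass loop and g2 as the gain fed back to the comb input (the text does not make the assignment unambiguous), the allpass stage is omitted, and all names and the example delay values are hypothetical.

```python
def lowpass_comb(x, delay, g1, g2):
    """Lowpass-feedback comb filter.

    delay: D, the delay-line length in samples
    g1:    feedback gain inside the one-pole lowpass loop (assumed role)
    g2:    gain on the lowpass output fed back to the comb input (assumed role)
    For 0 <= g1 < 1 and g2 >= 0 the loop is stable when g2 / (1 - g1) < 1.
    """
    buf = [0.0] * delay  # circular delay line
    lp = 0.0             # lowpass state
    pos = 0
    y = []
    for sample in x:
        delayed = buf[pos]           # comb output, D samples late
        lp = delayed + g1 * lp       # one-pole lowpass in the feedback path
        buf[pos] = sample + g2 * lp  # input summed with filtered feedback
        pos = (pos + 1) % delay
        y.append(delayed)
    return y


def parallel_reverb(x, delays, g1, g2, g):
    """Sum parallel lowpass combs, then split into +g (left) and -g (right)."""
    combs = [lowpass_comb(x, d, g1, g2) for d in delays]
    mix = [sum(vals) for vals in zip(*combs)]  # summing node
    left = [g * v for v in mix]
    right = [-g * v for v in mix]
    return left, right


impulse = [1.0] + [0.0] * 6
print(lowpass_comb(impulse, delay=3, g1=0.5, g2=0.25))
# [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.25]
rev_left, rev_right = parallel_reverb(impulse, delays=[3, 5], g1=0.5, g2=0.25, g=0.5)
```

The impulse response shows the expected behavior: the first echo arrives after D samples, and later echoes are both attenuated and smoothed by the lowpass in the loop. The description uses six comb filters in parallel; the two delays above are only for illustration.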
  • the effects of open-ear (non-obstructed) resonation are optionally added at circuit 29 in Figure 4.
  • the ear canal resonator according to the present invention is designed to simulate open-ear listening via headphones by introducing the resonances and anti-resonances that are characteristic of open-ear listening. It is generally known in the psychoacoustic art that open-ear listening introduces certain resonances and anti-resonances into the incoming acoustic signal due to the filtering effects of the outer ear.
  • the characteristics of these resonances and anti-resonances are also generally known and may be used to construct a generally known transfer function, referred to as the open-ear transfer function, that, when convolved with a digital signal, introduces these resonances and anti-resonances into the digital signal.
  • a generally known transfer function referred to as the open-ear transfer function
  • Open-ear resonation circuit 29 compensates for the effects introduced by obstruction of the outer ear via, for example, headphones.
  • the open ear transfer function is convolved with each channel, left and right, using, for example, a digital signal processor.
  • the output of the open-ear resonation circuit 29 is two audio channels 30, 31 that, when delivered through headphones, simulate the listener's multi-loudspeaker listening experience by creating the sensation of phantom loudspeakers throughout the simulated room in accordance with the loudspeaker layout provided by the format of the multi-channel signal.
  • the ear resonation circuit according to the present invention allows for use with any headphone, thereby eliminating a need for uniquely designed headphones.
  • Sound delivered to the ear via headphones is typically reduced in amplitude in the lower frequencies.
  • Low frequency energy may be increased, however, through the use of a bass boost system.
  • An exemplary embodiment of a bass boost circuit 6 is illustrated in Figure 5.
  • Output from selected channels of the multi-channel system is routed to the bass boost circuit 6.
  • Low frequency signal information is extracted by performing a low-pass filter at, for example, 100 Hz on one or more channels, via low pass filter 34. Once the low frequency signal information is obtained, it is multiplied by predetermined factor 35, for example k, and added to all channels via summing circuits 38, 39 and 40, thereby boosting the low frequency energy present in each channel.
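The bass boost just described (extract the low band, multiply by k, add it back to every channel) can be sketched as follows. This is a rough sketch under stated assumptions: the low band is extracted from the sum of the selected channels, the lowpass is a simple one-pole filter rather than whatever filter 34 actually is, and the 44.1 kHz sample rate and all function names are hypothetical. Only the 100 Hz cutoff and the factor k come from the text.

```python
import math


def one_pole_lowpass(x, cutoff_hz, sample_rate_hz):
    """First-order IIR lowpass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    y, state = [], 0.0
    for sample in x:
        state += a * (sample - state)
        y.append(state)
    return y


def bass_boost(channels, k, cutoff_hz=100.0, sample_rate_hz=44100.0):
    """Add k times the low-passed channel sum to every channel."""
    mono = [sum(frame) for frame in zip(*channels)]  # assumed: sum of channels
    low = one_pole_lowpass(mono, cutoff_hz, sample_rate_hz)
    return [[s + k * l for s, l in zip(ch, low)] for ch in channels]


# A constant (0 Hz) input on channel 1 passes the lowpass unchanged once settled,
# so channel 1 settles near 1 + k and the silent channels settle near k.
n = 20000
boosted = bass_boost([[1.0] * n, [0.0] * n, [0.0] * n], k=0.5)
print(round(boosted[0][-1], 6), round(boosted[1][-1], 6))  # 1.5 0.5
```

Setting k to 0 leaves every channel unchanged, which makes the boost easy to bypass.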
  • the HRTF coefficients associated with the location of each phantom loudspeaker relative to the listener must be convolved with each channel. This convolution is accomplished using a digital signal processor and may be done in either the time or frequency domains with filter order ranging from 16 to 32 taps. Because HRTFs differ for right and left ears, the single channel input to each HRTF processing circuit 10, 11, 12, 13 and 14 is processed in parallel by two separate HRTFs, one for the right ear and one for the left ear. The result is a dual channel (e.g., right and left ear) output. This process is illustrated in Figure 6A.
  • Figure 6A illustrates the interaction of HRTF matching processor 59 with, for example, the HRTF processing circuit 10.
  • the signal for each channel of the multi-channel signal is convolved with two different HRTFs.
  • Figure 6A shows the left channel signal 7 being applied to the left and right HRTF processing circuits 43, 44 of the HRTF processing circuit 10.
  • One set of HRTF coefficients, corresponding to the spatial location of the phantom loudspeaker relative to the left ear, is applied to signal 7 via the left ear HRTF processing circuit 43; the other set, corresponding to the spatial location of the phantom loudspeaker relative to the right ear, is applied to signal 7 via the right ear HRTF processing circuit 44.
  • the HRTFs applied by HRTF processing circuits 43, 44 are selected from the set of HRTFs that best matches the listener via the HRTF matching processor 59.
  • the output of each circuit 43, 44 is multiplied by a scaling factor via, for example, nodes 16 and 17, also as shown in Figure 4.
  • This scaling factor is used to apply signal attenuation that corresponds to that which would be achieved in a free field environment.
  • the value of the scaling factor is inversely related to the distance between the phantom loudspeaker and the listener's ear. As shown in Figure 4, the right ear output is summed for each phantom loudspeaker via node 26, and left ear output is summed for each phantom loudspeaker via node 27.
  • the signal can be transmitted to conventional two transducer headphones.
  • These signals can be transmitted by wire or wirelessly, for example, by a radio frequency (RF) transmission system. Examples of wireless transmission systems are exemplified in Examples 2, 3, and 4.
  • a central feature of this invention is to provide a sufficiently diverse and comprehensive set of HRTFs so that the user can select from that set one HRTF set which will produce the perception of sound located in the proper spatial position.
  • This selection process is accomplished herein by: (1) collecting a comprehensive database of HRTFs; (2) ordering the database so that a representative subset of the entire collection of HRTFs can be obtained and stored in the device; and (3) providing a means for a user to select from the representative subset.
  • A single HRTF is the spectrum obtained by presenting sound from a single location 110 (see Figure 3A).
  • A listener's HRTF (head-related transfer function) refers to the set of HRTFs obtained from the multiple locations described, for example, in Figure 3A.
  • For each location, two HRTFs are measured, one for the listener's left ear and one for the right ear.
  • If L locations are measured, the set of 2*L spectra represents the HRTF set for a single listener.
  • If S subjects are measured, an entire database consisting of S*L*2 spectra is generated.
  • 360 locations were measured, and HRTFs from over 150 subjects were collected.
  • The total database consists of more than 108,000 spectra.
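  • As a check on the stated database size, taking S = 150 subjects (the lower bound given above) and L = 360 locations:

```latex
N_{\text{spectra}} = S \times L \times 2 = 150 \times 360 \times 2 = 108{,}000
```

  • Because more than 150 subjects were measured, the actual count exceeds this figure.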
  • Prior measurement devices involved the use of multiple, e.g., 12, loudspeakers located on a circular hoop. Each of the multiple loudspeakers was used to create a signal for measuring the head-ear filter characteristics. In using these prior measurement devices, signals from each of the multiple loudspeakers were projected from a different location to allow measurements of HRTFs for different elevations and azimuths.
  • The use of multiple loudspeakers poses a problem. To avoid contamination of the measured HRTF, the different loudspeakers need to have equal output spectra. Unfortunately, it is only possible to equate such spectra to within about 0.5 dB.
  • An improved measurement method is provided by utilizing a single loudspeaker located at the end of a robot arm.
  • The single loudspeaker is used for all HRTF measurements, thereby eliminating the problem of unequal output spectra of different loudspeakers.
  • The single loudspeaker is precisely positioned by a computer-controlled robot arm in each of the locations where an HRTF is to be measured.
  • The present HRTF measurement device can measure and record a complete set of 360 HRTFs for each ear, for an individual, in approximately 10 to 15 minutes, as compared to one to four hours for prior measurement techniques. Because the listener should remain stationary during the entire measurement process, speeding up the measurement process can itself contribute to the accuracy of the measurements.
  • Provided in Figure 18A is a schematic of a preferred embodiment of an HRTF measurement means according to this invention.
  • A speaker, preferably a 4 Ohm, 40 watt speaker, for example, one produced by Pioneer.
  • A lower arm with dimensions of approximately 1" wide, about 2" high and about 29" long.
  • An elbow AC servo motor, preferably capable of high rotational speeds and torques (e.g., about 20,000 rpm and about 200 oz.-in.), and an absolute encoder (e.g., about 500 counts/rev.).
  • An elbow planetary gearbox 203, preferably with a ratio of about 100:1 and a torque capability of about 275 in.-lb.
  • An upper arm 212 is connected to the lower arm 201 through the elbow AC servo motor 202.
  • At the upper end of the upper arm 212, there is provided a shoulder spur gear pair 204, preferably having a ratio of about 11.1111:1. Maintaining the shoulder spur gear in appropriate linkage with the upper arm 212 is a mounting bracket with bearings 205.
  • The mounting bracket 205 is suspended from a rotation shaft 206 having a diameter of about 1-1/4".
  • A rotation spur gear pair 207 is provided with a ratio of about 12.8:1, to rotate the rotation shaft 206.
  • A rotation planetary gearbox 208, having a ratio of about 100:1 and a torque capability of about 275 in.-lb., drives the rotation spur gear pair 207.
  • A rotation servo motor and associated absolute encoder 209, having a speed of about 20,000 rpm and a torque of about 200 oz.-in., with the encoder providing about 500 counts/rev., are provided to actuate the rotation planetary gearbox 208.
  • A shoulder planetary gearbox 210, having a ratio of about 100:1 and a torque output of about 275 lb.-in., is actuated by an associated shoulder servo motor 211 having a speed of about 20,000 rpm and a torque output of about 200 oz.-in.
  • A wrist gearmotor 213, having a speed of about 50 rpm and a torque of about 178 oz.-in., with an associated analog encoder, is provided to position the speaker 200.
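  • For illustration only, the effect of the stated reductions on joint speed and positioning resolution may be estimated as follows. The sketch assumes that the planetary gearbox and the spur gear pair of an axis act in series and that the encoder is mounted on the motor shaft; neither assumption is stated explicitly above, and the function name is hypothetical.

```python
def output_stage(motor_rpm, ratios, encoder_counts_per_rev=None):
    """Return (output rpm, degrees of joint rotation per encoder count).

    `ratios` lists the reductions assumed to act in series. The second
    value is None when no encoder resolution is supplied.
    """
    total = 1.0
    for r in ratios:
        total *= r  # reductions in series multiply
    rpm = motor_rpm / total
    if encoder_counts_per_rev is None:
        return rpm, None
    # Assumes the encoder turns with the motor shaft, ahead of the gearing.
    return rpm, 360.0 / (encoder_counts_per_rev * total)


# Rotation axis: 100:1 planetary gearbox followed by the 12.8:1 spur pair.
rotation_rpm, rotation_deg_per_count = output_stage(20000.0, [100.0, 12.8], 500)

# Shoulder axis: 100:1 planetary gearbox followed by the 11.1111:1 spur pair.
shoulder_rpm, _ = output_stage(20000.0, [100.0, 11.1111])
```

  • Under these assumptions the rotation axis is reduced by 1280:1 overall and the shoulder axis by about 1111:1, so that motor speeds of about 20,000 rpm correspond to joint speeds on the order of 15 to 18 rpm.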
  • In Figure 18B, there is provided a detail of the upper arm 212, the elbow planetary gearbox 203, the elbow AC servo motor and absolute encoder 202, the mounting bracket with bearings 205, the rotation shaft 206, the shoulder planetary gearbox 210, the shoulder servo motor and absolute encoder 211 and the drive shaft 214.
  • In Figure 19, there is provided a schematic representation of the HRTF measurement control system.
  • This includes a central control computer 300 which, in a first loop, controls a servo controller 301 which drives a plurality of servo amps 302a-c, which in turn drive a plurality of linked encoder, servo motor and gearbox assemblies 303a-c.
  • Encoder/servo motor/gearbox 303a drives rotation, while 303b drives the shoulder, and 303c drives the arm (see Figure 18).
  • The central control computer 300 controls data acquisition, signal presentation and speaker control via a feedback loop comprising: an encoder/gear/motor assembly 304 for positioning the speaker 305; an A/D converter 306; a D/A converter 307; and an attenuator 308.
  • The feedback loop links through an amplifier 309 to the speaker 305 and to a microphone pre-amplifier 310 and the left and right microphones 311a and 311b.
  • The above-described hardware may be controlled by software which controls the positioning of the speaker.
  • A preferred embodiment of such software is schematically represented in Figure 20.
  • The software controls system startup at 400, system initialization 401, and display of a main menu 402.
  • Subroutines 403-408 are provided which allow for loading of data 403, speaker calibration 404, headphone measurement 405, performance of an HRTF test run 406, performance of a full HRTF measurement run 407, and termination of the program 408.
  • A schematic of a full HRTF measurement run 407 is shown in steps 407a-407q, all of which are initiated by selection of element 407 at the main menu.
  • At 407a, the full HRTF measurement run is initiated, following which the measured subject is identified 407b and the robot arm is calibrated 407c, via a feedback loop 407d which repeats arm calibration until a calibration "OK" signal 407e is received.
  • The robot arm is set to a zero starting position 407f, and the measurement routine is begun 407g. This includes movement of the robot arm and speaker 407h about the subject whose HRTF sets are being measured.
  • The acquired data is played and recorded 407i, and the HRTF azimuth and elevation are displayed 407j on a monitor.
  • A continuous interrupt query 407k is sent and, as long as no interrupt signal is received, the measurement process is looped 407l back to measurement step 407g.
  • If an interrupt signal is received, the system resets 407p to the main menu 407q. If the measurement routine is continued without interruption, a complete set of HRTFs is measured until the natural termination of the measurement routine is reached 407m. A pause 407n is included in the routine to allow the system to store 407o the acquired HRTFs, after which the system resets to the main menu 407q.
  • The headphone measurement 405 comprises steps 405a-405h, which are initiated by selecting this option at the main menu: at 405a, the routine is initiated, following which sounds are played through the headphone and displayed 405b.
  • A pause 405c is included in the routine to allow time for data retrieval and initiation of a subroutine 405d. If a particular headphone subroutine is not to be initiated 405e, the system resets to the main menu. However, if a particular headphone subroutine is to be initiated, a particular headphone identity is entered 405f and the data acquired for that headphone is stored 405g, following which the system resets to the main menu 405h.
  • The HRTF measurements are made in an appropriately constructed sound room.
  • The measurements are made in a room such as that schematically depicted in Figures 21A, 21B, and 21C.
  • This room, shown in a front view in Figure 21A, provides an exhaust fan 500 and an air outlet channel 510.
  • A latched door 520 is provided, preferably with latches on both the inside and outside.
  • A fresh air fan 530 is provided for replenishment of fresh air from the outside of the room through an air inlet channel 540.
  • In Figure 21B, a schematic of a top view of the sound room is provided, including a representation of the subject seat 550, a monitoring camera 560, a pair of laser pointers 570, and sound absorbent walls 580.
  • In Figure 21C, a detail of the wall cross section is provided, showing a double wall structure in which two layers of drywall 581 are provided, between which is placed a damping material 582, preferably selected from foam rubber, polyurethane or like sound insulating material.
  • A further improvement in the present HRTF measurement device and method is the location of the transducer employed to record the sound signal used in calculating the HRTF.
  • The transducer may be placed at the entrance of the outer ear canal, instead of deep in the outer ear canal near the eardrum.
  • The external location of the transducer provides a much higher S/N ratio than previous locations for the transducer. This higher S/N ratio provides a more accurate HRTF, especially in the "valleys" of the HRTF where the greatest attenuation of the incoming impulse signal exists.
  • The database of measured HRTFs is ordered by comparing the spectra recorded from different individuals. This is accomplished by transforming or pre-processing the raw data to represent the perceptual features of the raw spectra more accurately.
  • The raw HRTFs are measured as the impulse response to a digital signal propagated by a loudspeaker at a given location. The signal so generated is carefully measured in the free field (in the listener's absence) to correct for imperfections in the spectrum of the loudspeaker.
  • The measured impulse response is then converted to the frequency domain using a fast Fourier transform (FFT) according to methods well known in the art. This frequency domain representation is further processed by implementing critical-band filtering and converting the data from a linear frequency scale to a logarithmic scale.
  • Critical-band filtering reflects the fact that the first stage of the auditory system contains bandpass filters whose bandwidth is a constant fraction of the center frequency of the filter.
  • The critical band filters resemble 1/6 octave bandpass filters.
  • The distance along the auditory display is roughly proportional to the logarithm of sound frequency. Therefore, a logarithmic, rather than a linear, frequency scale is imposed on the representation.
  • A gammatone filter is used to perform critical band filtering.
  • The magnitude of the frequency response of the filter is calculated for each frequency, f, and is multiplied by the magnitude of the HRTF at the same frequency, f.
  • The results of this calculation at all frequencies are squared and summed. The square root is then taken. This results in one value representing the magnitude of the internal HRTF for each critical band filter.
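  • A minimal sketch of the band-level calculation just described follows. The Gaussian band shape on a log-frequency axis is merely a stand-in for the gammatone response, chosen to keep the example short; the function names, the 1/6 octave width parameter and the small floor applied before taking the logarithm are assumptions of this sketch.

```python
import math


def band_weight(f, fc, bandwidth_octaves=1.0 / 6.0):
    """Hypothetical band-pass magnitude at frequency f for centre fc.

    A Gaussian on a log2 frequency axis, standing in for a gammatone filter.
    """
    sigma = bandwidth_octaves / 2.0
    return math.exp(-0.5 * (math.log2(f / fc) / sigma) ** 2)


def log_spaced(f_lo, f_hi, n):
    """n centre frequencies equally spaced on a logarithmic axis."""
    step = (math.log(f_hi) - math.log(f_lo)) / (n - 1)
    return [math.exp(math.log(f_lo) + k * step) for k in range(n)]


def internal_hrtf_db(freqs, hrtf_mag, centres):
    """One level in dB per critical band.

    For each band: multiply filter magnitude by HRTF magnitude at every
    frequency, square, sum, take the square root, then convert to decibels.
    """
    levels = []
    for fc in centres:
        energy = sum((band_weight(f, fc) * m) ** 2
                     for f, m in zip(freqs, hrtf_mag))
        magnitude = max(math.sqrt(energy), 1e-12)  # floor avoids log10(0)
        levels.append(20.0 * math.log10(magnitude))
    return levels
```

  • Called with 39 centre frequencies from 3,000 Hz to 18,000 Hz, `log_spaced` gives roughly 15 bands per octave, which lies within the 12 to 18 per octave range discussed in this description.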
  • The hearing system is sensitive to a fixed fractional change in signal magnitude, which is known in the field as "Weber's Law."
  • If stimulus magnitude is represented on a logarithmic scale, such as decibels, the ear is sensitive to a fixed number of decibels.
  • The internal spectrum is represented by the level of the stimulus in decibels at about 12-18 frequencies per octave in the range between 3 and 18 kHz. Outside this frequency range (3 to 18 kHz) the human auditory system gains little or no directional or localization information based on the shape of the stimulus spectrum. In fact, few listeners but the very young can hear sounds above 18,000 Hz. At the lower frequencies, the spectrum of the signal is essentially the same for any azimuth or elevation.
  • An exemplary embodiment of the present invention applies critical band filtering to the set of HRTFs from each individual in the HRTF database 63, resulting in a new set of internal HRTFs.
  • The process is illustrated in Figure 12, wherein an impulse response waveform 80 shown in Figure 11 is filtered via a critical band filter 81 to produce the internal HRTF 82.
  • HRTFs obtained from the different subjects and transformed or pre-processed as described above can now be compared and organized so that their similarities and differences can be quantified.
  • One basic method of comparing two or more spectra is the simple Euclidean distance. As used here, the Euclidean distance is equal to the root-mean-squared (RMS) difference in decibels between the levels measured at the same frequencies in the two or more spectra.
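  • The distance measure may be sketched as follows; the function name is illustrative. The sketch uses the RMS form, which differs from the unnormalized Euclidean distance only by the constant factor of the square root of the number of frequencies.

```python
import math


def rms_distance_db(spectrum_a, spectrum_b):
    """RMS difference between two spectra given as lists of levels in dB.

    Both spectra must be sampled at the same frequencies.
    """
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("spectra must have the same number of levels")
    squared = sum((a - b) ** 2 for a, b in zip(spectrum_a, spectrum_b))
    return math.sqrt(squared / len(spectrum_a))
```

  • Two spectra that differ by a uniform 2 dB at every frequency are therefore 2 dB apart, whatever the number of frequencies.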
  • Now that the HRTFs have been measured and preprocessed, we can return to the issues raised earlier about how the user of the device selects a particular HRTF from those stored in the device. The selection process must ensure that the sound sources appear in their proper spatial position for the individual user. Thus, the first issue to be addressed is whether the entire database of measured HRTFs is sufficiently broad and comprehensive to represent the entire listening population. In one exemplary embodiment, 150 HRTF sets were measured from a population in which both genders and a variety of ages and ethnicities were represented.
  • Tests were conducted to determine whether 150 HRTFs constitute a set size sufficient for the purposes of the subject invention. These tests were all conducted on a sample consisting of 150 sets measured according to this invention. Three HRTFs from each HRTF set were selected for these comparisons, namely, on the horizon (0 elevation) and at 10, 20, and 30 degrees to the left of straight ahead. It is expected that similar conclusions about stability would apply for other positions.
  • Each of the three HRTFs from each HRTF set consists, for example, of values representing the level of the HRTF at a plurality, e.g., 39, of different frequencies. The 39 frequencies are spaced equally, on a logarithmic frequency axis, from about 3,000 to about 18,000 Hz. Few listeners (except the very young) can hear sound above 18,000 Hz.
  • The composite spectra obtained over the 3 positions can be regarded as a vector consisting of 117 levels (dB), i.e., 39 levels for each of the 3 positions.
  • Figure 22A shows a plot of the cumulative probability of that distance for the various different set sizes. For example, if the set size is 20, then the RMS distance in decibels to the nearest neighbor is less than 2 dB for only about 55% of the individual HRTFs.
  • The centroid, itself having 117 levels, is obtained by adding together, for each of the 117 levels, the value representing the level of the HRTF from each of the 150 composite spectra and dividing each sum by the sample size, 150 in the example. If each of the 150 composite spectra is treated as a point in a space of 117 dimensions, the centroid is the center of gravity of the set of 150 points.
  • The Euclidean distance between the centroid and each of the 150 composite spectra can then be measured.
  • The mean of this distance is about 2.53 dB, and the standard deviation is about 0.76 dB.
  • Figure 22B shows an estimate of a cumulative distribution function, which is a plot of the probability of an individual being less than a given distance, x, from the centroid. As is shown in Figure 22B, the nearest individual in the space was about 1 dB from the centroid; approximately half the sample was within 2.5 dB of the centroid and about 95% were within 4 dB.
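  • The centroid and the distance statistics described above may be sketched as follows. The use of the RMS form of the distance and of the sample (n - 1) standard deviation are assumptions of this sketch, and the function names are illustrative.

```python
import math


def centroid(spectra):
    """Level-by-level mean of a list of equal-length composite spectra."""
    n = len(spectra)
    return [sum(levels) / n for levels in zip(*spectra)]


def rms_distance_db(a, b):
    """RMS difference in dB between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def centroid_distance_stats(spectra):
    """Mean and sample standard deviation of each spectrum's distance
    to the centroid of the whole collection."""
    c = centroid(spectra)
    distances = [rms_distance_db(s, c) for s in spectra]
    mean = sum(distances) / len(distances)
    variance = sum((d - mean) ** 2 for d in distances) / (len(distances) - 1)
    return mean, math.sqrt(variance)
```

  • Applied to random subsamples of the full collection, the same two statistics indicate how stable the mean and standard deviation are as the number of measured HRTF sets changes.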
  • The stability of the data is assessed by increasing or decreasing the number of HRTFs measured, thus defining larger or smaller databases of HRTF subsets, and observing the effect this has on the mean and standard deviation.
  • Random subsamples were drawn from the large sample of 150, and the mean and standard deviation of each subsample were calculated.
  • Figure 22C shows the change in the average mean as the number of HRTFs in the subsample increases.
  • Figure 22D shows the change in the average standard deviation as the number of HRTFs in the subsample increases.
  • The average mean changes by about 10% in value as the subsample group size goes from 5 to 80 HRTFs.
  • The last point on the graph is the mean, 2.53 dB, for all 150 HRTFs.
  • The average standard deviation changes by about 25% in value as the subsample group size goes from 5 to 80 HRTFs.
  • The two critical statistics of the 150 measured HRTFs are reasonably stable, and we have found that little statistical improvement would be gained by increasing the sample size much beyond 150 samples. While the preceding has established that the initial database is sufficiently comprehensive to cover an entire population of listeners, it should also be appreciated that not each of the 100-200 HRTFs contributes equally to that result. This is because there is considerable similarity or correlation between certain groups within the entire database. This fact suggests that the raw database can be pruned in some fashion to reduce the total number of HRTFs actually stored in the device.
  • Several different statistical techniques might be used to provide an organization of the database that reveals the underlying correlations. These include one of the variety of multidimensional scaling procedures known in the art. The procedure used in one exemplary embodiment herein was cluster analysis.
  • A hierarchical agglomerative clustering procedure was used, such as that executed by the statistical program S-Plus™. This procedure uses similarities between the HRTFs, as measured in a distance matrix of all 150 HRTFs, to produce an ordered tree-like structure of the data. At the highest node of the cluster, all of the HRTFs are contained. Successive nodes contain HRTFs that are similar to each other and different from the remainder, just as biological animals are classified as orders, genera, and species.
  • Figure 15 shows a sample cluster of HRTFs obtained from four subjects. Implicit in this example is the fact that the HRTFs of the left and right ear of a single subject are usually nearer in distance than is one person's HRTF to any other person's HRTF.
  • Clustering provides a convenient ordering of the entire database, so that subsets of HRTFs can easily be obtained by selecting similar groups determined by the nodes in the cluster. Those skilled in the art will recognize from this disclosure that other methods of ordering known in the art could be used.
  • A representative subset of HRTF sets, from which a listener can be matched, is chosen from the entire set of 150 HRTF sets to simplify the matching process.
  • The HRTF sets within a representative subset are stored for use according to the method of this invention.
  • The disadvantages of having a very large number of HRTF sets stored in the device are that more memory is required to store the HRTF sets, with an accompanying increase in the cost of the device. In addition, it would take more time to match the listener with the best-match HRTF set.
  • The illustrated results from both algorithms show the same trends, whether one selects representative HRTFs from the ordered database based on the "popularity" of the representative HRTF (i.e., an HRTF that is closest to the other HRTFs within a given subcluster), or based on the isolation of the representative HRTF (i.e., an HRTF most distant from other HRTFs within a given subcluster).
  • The mean minimum RMS distance increases slowly.
  • The mean RMS distance increases much more rapidly. The lowest RMS distance is 1 dB because 1 dB is the average RMS deviation between two measurements of the same individual's HRTF set.
  • When an HRTF set randomly chosen from the 150 total HRTF sets is one of the stored HRTF sets, a value of 1 dB, not 0 dB, is used to represent the RMS distance. Accordingly, the lowest possible value for the RMS error is 1 dB.
  • 25 HRTF sets is the number of representative HRTF sets to be stored in the device, for listeners to select from.
  • The listener first chooses from among 5 representative HRTF sets, each representative set representing a set of 5 similar HRTF sets. Once one of the 5 representative sets is selected, the user selects from among the five similar HRTF sets in the set of HRTF sets corresponding to the selected representative HRTF set.
  • 15 HRTF sets is the number of representative HRTF sets to be stored in the device for listeners to select from. This number is approximately at the "knee" of the plot in Figure 22E. Having discovered from the aforedescribed statistical analysis of our large ordered database that 15 representative HRTF sets are sufficient to allow the vast majority of the population to select an HRTF set that will allow proper audio spatialization, the 15 representative HRTFs may be selected as follows: the entire database is ordered such that the distance metric (Euclidean distance, RMS distance, etc.) between every HRTF and every other HRTF in the database is known.
  • Every HRTF set that is within a distance x, e.g., 2 dB, of a particular HRTF set in the database is identified. This identification is made for each HRTF set in the database, and a listing is made of each HRTF set and all of the HRTF sets within x, e.g., 2 dB, of it, ordered from the most popular to the least popular HRTF set.
  • The most popular HRTF set is that set in the database that has the most HRTF sets within x, e.g., 2 dB, of it.
  • The process of selecting 15 representative sets proceeds by first selecting the most popular HRTF set as a representative HRTF set, and then eliminating every HRTF set that was within x, e.g., 2 dB, of the most popular HRTF set from further selection in the database.
  • The next most popular HRTF set, which was not eliminated upon the selection of the most popular HRTF set, is then selected to be the second representative HRTF set, and every remaining HRTF set in the database within x, e.g., 2 dB, of this HRTF set is accordingly eliminated.
  • This process is repeated, moving down the list of popularity of HRTF sets that remain in the database. Once 15 representative HRTF sets have been selected, the process may be terminated.
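  • The popularity-based selection just described may be sketched as follows, assuming a precomputed symmetric matrix of pairwise distances in dB. The function and parameter names are hypothetical; ties in popularity are broken by database order, which is an arbitrary choice of this sketch.

```python
def select_representatives(dist, radius_db=2.0, max_sets=15):
    """Pick representative HRTF sets by popularity.

    dist[i][j] is the distance in dB between HRTF sets i and j.
    Returns the indices of the chosen sets, most popular first.
    """
    n = len(dist)
    # Neighbours of each set: all other sets within radius_db of it.
    neighbours = [
        {j for j in range(n) if j != i and dist[i][j] <= radius_db}
        for i in range(n)
    ]
    # Popularity list, computed once over the whole database.
    order = sorted(range(n), key=lambda i: (-len(neighbours[i]), i))

    eliminated = set()
    chosen = []
    for i in order:
        if i in eliminated:
            continue                 # already covered by an earlier pick
        chosen.append(i)
        eliminated.add(i)
        eliminated |= neighbours[i]  # remove everything within radius_db
        if len(chosen) == max_sets:
            break
    return chosen
```

  • Widening `radius_db` eliminates more sets per pick and so yields fewer representatives; narrowing it has the opposite effect.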
  • Fewer or more representative HRTF sets may be selected, and a stringency, i.e., x, of from greater than about 1 dB to about 4 dB may be imposed around each of the most popular HRTFs so as to arrive at about 15-25 representative HRTF sets from the entire database of measured HRTF sets. From our statistical analysis, we have found that 15-25 representative HRTF sets are preferred for the considerations provided above.
  • The user selects the HRTF set that he/she will use in listening to program material by any of several different methods.
  • One procedure is to present, via headphones, sounds filtered by a variety of HRTFs to convey the impression of phantom sounds rotating about the listener's head.
  • the programmed sounds are in fact all chosen from elevations on the horizon. What is generally true of HRTFs is that the variation in the filtered spectrum decreases as elevation increases. That is, the HRTF is generally flatter as the elevation of the sound increases. It is also true that a listener using an HRTF that is very dissimilar to his/her own will tend to hear the phantom sound much higher in elevation than that programmed. Thus, when a listener hears a sound at a lower elevation, it generally means that the listener better appreciates the structure in those HRTFs. Consequently, if one listens to a set of different HRTFs programmed to produce the circle of phantom sounds on the horizon such as that illustrated in Figure
  • the HRTF set producing the lowest apparent elevation will provide the best means to localize sound in the correct spatial location.
  • The present invention uses HRTF clustering as illustrated in Figure 6B.
  • The present invention collects and stores HRTFs from numerous individuals in the HRTF database 63.
  • HRTFs are pre-processed by the HRTF ordering processor 64, which includes an HRTF pre-processor 71, an HRTF analyzer 72 and an HRTF clustering processor 73.
  • The HRTF pre-processor 71 processes HRTFs so that they more closely match the way in which humans perceive sound, as described above and further below.
  • The smoothed HRTFs are statistically analyzed by HRTF analyzer 72, each one against every other one, to determine the similarities and differences between them.
  • The HRTFs are subjected to a cluster analysis by HRTF clustering processor 73, as is known in the art, and as described above may be "pruned" to arrive at a representative set of HRTFs, resulting in a hierarchical grouping of HRTFs.
  • The HRTFs are then stored in an ordered manner in the ROM 65 for use by a listener. From these ordered HRTFs, the listener selects the set that provides the best match via the HRTF matching processor 59. From the set of HRTFs that best matches the listener, the HRTFs appropriate for the location of each phantom speaker are input to their respective logical HRTF processing circuits 10 to 14 of Figure 4.
  • The listener is matched to or selects a best-match HRTF set from the 15 most representative HRTF sets.
  • The HRTF sets of the most representative group of HRTF sets, including the user-selected best-match set of HRTFs, are stored in an external EEPROM 704 to be accessed during the matching process.
  • Input left 601 and right 602 audio signals, typically from a CD player, VCR, laser disk player, or like source of audio signal, are input to a circuit 600 for processing of the signals to achieve accurate spatialization of the sound transmitted to the user of the headphones.
  • The circuit 600 may be custom burned into read only memory on a silicon or like chip, or an off-the-shelf, commercially available chip, such as a Motorola DSP 56007 chip, may be programmed by downloading the appropriate connectors to an electrically erasable programmable read only memory (EEPROM) 710 which reconfigures the DSP 56007 chip each time the chip
  • the signals are first routed to a Dolby Prologic® or like decoder 603, a well defined Dolby Laboratories standard known in the art.
  • The Dolby Prologic® decoder 603 provides four output channels, left 604, right 605, center 606, and surround 607, intended for loudspeakers located, respectively, to the front left 608, front right 609, front center 610, and rear center 611 of the listener (see Figure 23C).
  • The center channel signal 606 is preprocessed within an early reflection 612 processing circuit, to simulate early reflections that sound waves would encounter in a non-anechoic environment.
  • The output signals of the early reflection processing circuit, the left early reflection 613 and the right early reflection 614 signals, are preferably added 615, 616 to the left channel signal 604 and to the right channel signal 605, respectively, yielding early reflection processed left 627 and right channel 628 signals.
  • One embodiment of this early reflection preprocessing, which is intended to provide a sense of direction and spatial cue, comprises delay tap lines 618, 619 with variable length filter delays 620, 621 and variable magnitude gains 622, 623 for the left and right early reflections, respectively.
  • The length of the delays 620, 621 and the magnitude of the gains 622, 623 can be adjusted, according to the simulated early reflections to be imposed on the signals, by, for example, ambiance 696, theater 624, hall 625, or club 626 control buttons.
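  • The delay-and-gain structure described above may be sketched, under stated assumptions, as a single delay tap per ear applied to the center channel. The preset table, its numerical values and the function names are placeholders invented for this sketch and are not taken from the disclosure.

```python
def delay_tap(signal, delay_samples, gain):
    """One tap of a delay line: the input delayed and scaled."""
    return [0.0] * delay_samples + [gain * x for x in signal]


def mix(a, b):
    """Sample-wise sum of two signals, zero-padding the shorter one."""
    n = max(len(a), len(b))
    return [(a[k] if k < len(a) else 0.0) + (b[k] if k < len(b) else 0.0)
            for k in range(n)]


# Hypothetical presets: (left delay, left gain, right delay, right gain).
# Delays are in samples; every value here is a placeholder.
PRESETS = {
    "ambiance": (2, 0.5, 3, 0.5),
    "theater": (4, 0.5, 5, 0.5),
    "hall": (8, 0.25, 9, 0.25),
    "club": (3, 0.25, 4, 0.25),
}


def add_early_reflections(left, right, center, preset):
    """Add delayed, scaled copies of the center channel to left and right."""
    delay_l, gain_l, delay_r, gain_r = PRESETS[preset]
    return (mix(left, delay_tap(center, delay_l, gain_l)),
            mix(right, delay_tap(center, delay_r, gain_r)))
```

  • Selecting a different preset changes only the delay lengths and gains, which mirrors the role of the control buttons.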
  • Means for achieving early reflection processing are known in the art (see U.S. Patent No. 5,371,799, incorporated here by reference for this purpose).
  • The multiple channels of the signal 627, 628, 606, 607 are processed 663 to create the sensation of phantom loudspeakers by filtering each channel of the signal with a pair of HRTFs, from the best-match HRTF set, corresponding to the intended location for that channel.
  • The user is matched to a best-match HRTF set.
  • The user is preferably matched to a best-match HRTF set, from among the most representative group of HRTF sets of the total database of HRTF sets measured, so that when it is used to process an audio signal the user perceives the corresponding sounds to be localized in the proper spatial positions.
  • The HRTF matching process begins by the user pushing an HRTF match mode control button (Ears control) 629, thus entering the HRTF matching mode. This places the user in match mode 1 630. In match mode 1 630, the user may select from one of five clusters of HRTF sets (sets 1-5) in the test bank. Representative HRTFs from each of the five clusters are copied from the external EEPROM 704, which stores the most representative HRTF sets, into the internal RAM 631 (see Figure 23A) of circuit 600, for testing.
  • The testing is accomplished by presenting the user, upon the user pushing a noise control 703 button, with sound signals produced by a white noise process 632 (Figure 28B) with a linearly decaying envelope 633.
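  • A test stimulus of the kind described above may be sketched as follows. The uniform sample distribution, the burst length and the function name are assumptions of this sketch; the disclosure specifies only a white noise process with a linearly decaying envelope.

```python
import random


def decaying_noise_burst(n_samples, seed=None):
    """White noise whose amplitude envelope falls linearly from 1 toward 0.

    Samples are drawn independently and uniformly from [-1, 1], so the
    underlying noise has a flat spectrum on average.
    """
    rng = random.Random(seed)
    burst = []
    for k in range(n_samples):
        envelope = 1.0 - k / n_samples  # linear decay across the burst
        burst.append(envelope * rng.uniform(-1.0, 1.0))
    return burst
```

  • The burst would then be filtered by the pair of HRTFs for the virtual location under test before presentation over the headphones.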
  • The user is first presented with a sound processed by an HRTF 640 corresponding to a first predetermined virtual location, e.g., the front left speaker 634 (see Figure 28C), and then the user is presented with a sound processed by an HRTF 641 corresponding to a second predetermined virtual location, e.g., the rear left speaker 635, for each of the representative HRTF sets of the five clusters copied to the RAM 631.
  • The user sequentially listens to each representative set by using the HRTF matching control button 636 to step through the representative HRTF sets 1-5, and ultimately selects the sound signal, each signal being generated using a representative HRTF set from one of the five clusters (1-5), that the user perceives as most clearly arriving first from the horizon to the user's front left and then from the horizon to the user's rear left.
  • The user selects the clearest sound signal by pressing the OK button 637.
  • The selected sound signal corresponds to the representative HRTF set 638 from one of the clusters of HRTF sets (1-5) which contains the first approximation of the user's best-match
  • The next step is for the HRTF sets (sets 2.1-2.5 in Figure 28A) from the cluster corresponding to the selected sound signal to be copied 1000 from the external EEPROM 704 into the internal RAM 631 for further selection by the user.
  • The user is presented with sound signals produced by a white noise process 632 with a linearly decaying envelope 633, processed first by the HRTF 640 corresponding to the front left speaker 634 and then by the HRTF 641 corresponding to the rear left speaker 635, for each of the five HRTF sets 2.1-2.5 within the cluster corresponding to the previously selected representative set (set 2 in Figure 28A).
  • The user selects the sound signal, each signal being associated with one of the HRTF sets (sets 2.1-2.5 in Figure 28A) of the selected cluster (2), that the user perceives as most clearly arriving first from the horizon to the user's front left and then from the horizon to the user's rear left.
  • The user selects this sound signal by pressing the OK button 637.
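  • The two-stage matching procedure described above may be sketched as follows. The function names and the convention that the first member of each cluster serves as its representative are assumptions of this sketch; the `prefers` callback stands in for the listener stepping through the candidates and pressing the OK button.

```python
def two_stage_match(clusters, prefers):
    """Select a best-match HRTF set in two listening rounds.

    clusters: list of clusters, each a list of HRTF-set identifiers; the
              first member is treated as the representative (assumption).
    prefers:  callable taking a list of candidates and returning the index
              of the one the listener judges clearest.
    """
    representatives = [members[0] for members in clusters]
    cluster_index = prefers(representatives)  # round 1: choose a cluster
    members = clusters[cluster_index]
    return members[prefers(members)]          # round 2: choose within it
```

  • With five clusters of five sets each, the listener auditions 5 + 5 = 10 candidates rather than all 25.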
  • the device can enable the matching process by producing a transient click-like stimulus e.g., a white noise process
  • HRTF appropriate for the frontal position. Fifteen such HRTFs are used, each appropriate for the set of HRTFs associated with the 15 representative individuals chosen from the entire population of 150 HRTFs. The user selects that HRTF which produces the clearest perception of a phantom sound source located directly in front of the listener. This can enable the matching process to provide a match based on the needs of the application. It should be appreciated that other tests may be more appropriate in other applications, but this simple test is adequate for the current application. For example, if the application requires spatialization of sounds to the sides, HRTFs corresponding to the sides can be used in the matching process.
  • a seat control button 643 which allows the user to select where the user will "sit" in the virtual room with respect to the virtual speakers.
  • the user can select the front-of-the-room 644 seat position, in which case the sound which is to appear from the left 634 and right 645 front phantom speakers will be generated from an HRTF set (2.2.4 in Figure 28A) measured from an appropriate azimuth angle, i.e., 40 degrees azimuth left or right respectively.
  • the front left 634, front center 646, and front right 645 virtual speakers will be louder than the rear virtual speakers.
  • the front left 634 and right 645 virtual speakers will be generated by an HRTF set (2.2.1 in Figure 28A) measured from a smaller azimuth angle, i.e., 10 degrees azimuth left or right respectively.
  • the front left 634, front center 646, and front right 645 virtual speakers will be softer than the rear left (surround left) 635 and rear right (surround right) 648 speakers.
  • 10 HRTFs 651-660 are copied from the external EEPROM 704 to the internal RAM 631 for use as digital filters.
  • the 10 HRTFs correspond to the front left, front center, front right, rear left (surround left), and rear right (surround right) virtual speaker locations, with a left and right HRTF for each position 651, 652, 653, 654, 655, 656, 657, 658, 659, 660.
  • These 10 HRTF sets (651 through 660), from the best-match HRTF set (2.2), provide the user with a best-match to the user's own head and pinnae filtering characteristics and simulate the user's selected seat position. Note that for each of the 4 seat positions 644, 661, 662, 647, 10 different HRTFs are copied to the RAM 631.
  • a fifth channel (second surround channel) 664 may be generated by optionally inverting 665 the single Dolby Prologic® surround channel 607. This inversion 665 aids in decorrelating the two surround channels. These two surround channels 607, 664 then become rear left (surround left) 607 and rear right (surround right) 664 channels. Accordingly, the surround right channel 664 is identical to the surround left 607 channel, although possibly inverted.
  • Each of the five channels (left front 627, center front 606, right front 628, left rear 607, and right rear 664) is then split into a right and left channel for filtering by the corresponding HRTFs (651-660) stored in the RAM 631.
  • an EEPROM 710 stores all current parameters of the system including current HRTFs, and its stored data is not disturbed by power-up/power-down events.
  • This EEPROM can save, after selection by user, multiple operating mode parameter presets, which can be pulled up by a user by, for example, pushing a button.
  • the HRTF filtering of the 5 left and 5 right channels is accomplished by convolving (or mixing) each channel with the HRTF, from the best-match HRTF set, corresponding to the given location and to the given ear.
  • the convolution of these 10 signals with the corresponding HRTFs produces signals which produce sound corresponding to virtual or phantom speakers at locations corresponding to the locations from which the HRTFs were measured.
  • the 5 left signals are summed 666 to generate a summed left signal 668
  • the 5 right signals are summed 667 to generate a summed right signal 669.
  • These left 668 and right 669 summed signals can be sent directly to a set of headphones for virtual speaker generation.
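  • The per-ear filtering and summation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DSP implementation of the device: the function names `convolve` and `render_binaural`, the dictionary layout keyed by speaker position, and the use of short time-domain impulse responses to stand in for the HRTFs are all hypothetical choices made for the example.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def render_binaural(channels, hrtf_left, hrtf_right):
    """Filter each channel with its left-ear and right-ear response, then sum per ear.

    channels, hrtf_left and hrtf_right are dicts keyed by speaker position
    (a hypothetical layout chosen for this sketch).
    """
    length = max(
        len(channels[pos]) + max(len(hrtf_left[pos]), len(hrtf_right[pos])) - 1
        for pos in channels
    )
    left = [0.0] * length
    right = [0.0] * length
    for pos, signal in channels.items():
        # One convolution per ear and per virtual speaker (10 in the five-channel case).
        for n, v in enumerate(convolve(signal, hrtf_left[pos])):
            left[n] += v
        for n, v in enumerate(convolve(signal, hrtf_right[pos])):
            right[n] += v
    return left, right
```

  • With five entries in `channels`, the two returned lists play the role of the summed left 668 and summed right 669 signals.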
  • additional processing of the summed left 668 and right 669 signals to enhance the effect experienced by the user may be performed. This further processing eliminates the impression of being in an anechoic chamber with the five speakers generating the sounds. Sound in an anechoic chamber does not have the same "fullness" of sound as if the user were in an echoic chamber.
  • bass boost 670 and reverberation 671 processing is preferably performed on the signals before presentation to the user over headphones.
  • these are well known processes in the art.
  • both the left 668 and right 669 summed output from the HRTF processing may be directed to a bass boost processing block 670.
  • this circuit 670 comprises, for example, a 100 Hz lowpass filter 672, 673 for each signal, left 668 and right 669, to produce signals 681 and 682 followed by an amplification 674, 675 of gain G B for each signal, left and right.
  • the gain G B can be adjusted, per the user's preference, up or down to adjust the amount of bass boost to the signals by using the bass control button 680.
  • the left 676 and right 677 outputs of the respective amplifiers are then added to the respective left 668 or right 669 input signal to produce a left bass boosted output 678 and a right bass boosted output 679 signal.
  • the left bass boosted output 678 and right bass boosted output 679 signals are essentially the original signal 668, 669 with an added component comprising G B times the respective output 681, 682 of the signal through a 100 Hz lowpass filter
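  • The bass boost amounts to output = input + G B × lowpass(input). A minimal sketch of that relationship follows. The text does not specify the lowpass design or the sample rate, so the one-pole filter, the 44.1 kHz default, and the name `bass_boost` are assumptions made for illustration.

```python
import math


def bass_boost(samples, gain_b, sample_rate=44100.0, cutoff_hz=100.0):
    """Return x[n] + gain_b * lowpass(x)[n] for one channel.

    The lowpass is a one-pole smoother (an assumed design, not taken from the text).
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    low = 0.0
    out = []
    for x in samples:
        low += alpha * (x - low)        # lowpass branch (681 / 682 in the text)
        out.append(x + gain_b * low)    # add the amplified bass back to the input
    return out
```

  • Calling the function once for the left signal and once for the right signal mirrors the two parallel branches of block 670; raising or lowering `gain_b` corresponds to the bass control button 680.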
  • the left bass boosted 678 and right bass boosted 679 output signals are then added to the output of a reverberation processing circuit 671, where the inputs 604, 605, 606, 607 to the reverberation processing block are the original four standard Dolby Prologic® or like outputs before any other processing.
  • the reverberation processing 671 in conjunction with the early reflection processing 612, provides the "fill" or architectural enhancement that an anechoic representation lacks.
  • the reverberation processing circuit 671 comprises two all-pole comb filters 683, 684, in parallel, the summed output of which 692 feeds into two all-pass filters 685, 686 in parallel.
  • the four standard Dolby Prologic® or like outputs are first summed 687 together and the sum 688 is then inputted to the first comb filter 683 and to the second comb filter 684.
  • Each all-pole comb filter 683, 684, as shown in Figure 26, loops the input signal upon itself over and over again with the volume reduced by some fractional amount for each successive loop.
  • the summed output 692 of the two comb filters in parallel feeds two all-pass filters 685, 686 in parallel.
  • These all-pass filters provide a smearing effect in time to the signal at its input without disturbing the frequency characteristics of the input.
  • the all-pass filters are non-linear phase distorters and remove some of the phase information as a function of frequency. This allows decorrelation of the left 693 and right 694 reverberation outputs, even though the input 692 to the left and right all-pass filters is the same, without disturbing the frequency profile which is embedded in the signal from the HRTF processing.
  • the level of the left 693 and right 694 reverberation outputs is a function of gain, G R 695, which is controlled by the ambiance control button 696.
  • the left 693 and right 694 reverberation outputs are summed 697, 698 with the left 678 and right 679 bass boost outputs, respectively.
  • These summed left 701 and right 702 signals are the left audio out 701 and right audio out 702 signals respectively.
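  • The reverberation structure described above (inputs summed, two comb filters in parallel, their sum fed to two all-pass filters in parallel, outputs scaled by G R) can be sketched as follows. The feedback-comb and Schroeder all-pass difference equations are standard textbook forms assumed here; the delay lengths and gains in the defaults are hypothetical values, since the text gives none.

```python
def comb(samples, delay, feedback):
    """All-pole (feedback) comb: y[n] = x[n] + feedback * y[n - delay]."""
    out = []
    for n, x in enumerate(samples):
        fed_back = out[n - delay] if n >= delay else 0.0
        out.append(x + feedback * fed_back)
    return out


def allpass(samples, delay, gain):
    """Schroeder all-pass: y[n] = -g*x[n] + x[n - D] + g*y[n - D]."""
    out = []
    for n in range(len(samples)):
        x_del = samples[n - delay] if n >= delay else 0.0
        y_del = out[n - delay] if n >= delay else 0.0
        out.append(-gain * samples[n] + x_del + gain * y_del)
    return out


def reverb(inputs, gain_r,
           combs=((1557, 0.84), (1617, 0.84)),      # hypothetical delays/gains
           allpasses=((225, 0.5), (341, 0.5))):     # hypothetical delays/gains
    """Sum the input channels, run two parallel combs, then two parallel all-passes."""
    mono = [sum(frame) for frame in zip(*inputs)]
    wet = [a + b for a, b in zip(comb(mono, *combs[0]), comb(mono, *combs[1]))]
    left = [gain_r * v for v in allpass(wet, *allpasses[0])]
    right = [gain_r * v for v in allpass(wet, *allpasses[1])]
    return left, right
```

  • Because the two all-pass filters use different delays, the left and right outputs differ in phase while sharing the same magnitude response, which is the decorrelation effect the text attributes to them.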
  • the left audio out 701 and right audio out 702 can be sent directly to a set of headphones to provide the listener with the sensation that the audio is originating from virtual speakers positioned according to the seat control selection made by the user.
  • the headphones are connected via wire to outputs 701 and 702.
  • 701 and 702 are signals sent via wireless connection to a set of headphones (see Examples 2, 3, and 4).
  • the HRTFs are copied, one at a time, from the external EEPROM into the internal RAM of the DSP chip for testing.
  • the user may test these HRTFs by asserting a test signal, see Figure 29B, which will be comprehended by analogy to Figure 28B.
  • a white noise process with a linearly decaying envelope is played from the Center (C) speaker (see Figure 28C).
  • the user chooses the HRTF set that best fits the following criteria: (a) the sound source is localized directly in front of the user, and (b) the sound source is localized at the horizon (i.e. on a horizontal plane defined by the user's pinnae).
  • the user exits match mode.
  • the seating position can then be adjusted, as described above with reference to Figure 28A, by selecting the 10 HRTFs used by the HRTF processor to localize the virtual sound sources. In this scenario, the user is spared an intermediate step of HRTF matching used in the system shown in Figure 28A.
  • HRTFs used to process an audio signal, for each spatial position is measured from the same individual, a user can instead be matched to separate representative sets of HRTFs for each spatial position. The user would perform a matching step for each spatial location, wherein a subset of each representative set, selected for the desired spatial position, would be used to process the audio signals.
  • This set herein as a Multi-Position Head-Related Transfer Function or
  • the listener would experience a sound source at each location.
  • the sound source may change for each location depending on the objective criterion at that location.
  • the sound source may be speech for a location in which speech is the main information to be presented.
  • Another may be filtered white noise for those locations that will present ambient noise.
  • HRTFs for each location, a listener would be allowed to choose across multiple sets of HRTFs, where a set of HRTFs is defined to be those recorded from a single subject. This allows the listener to custom develop a "user's set of HRTFs" that best describe his/her localization and perception characteristics at each location to be presented. Furthermore, an interpolation algorithm could generate intermediate locations for the user's set of HRTFs as a mixture of the selected HRTF sets.
  • the statistical analysis of HRTFs performed by the HRTF analyzer 72, shown in Figure 6B is performed through computation of eigenvectors and eigenvalues. Such computations are known, for example, using the MATLAB® software program by The MathWorks, Inc.
  • An exemplary embodiment compares HRTFs by computing eigenvectors and eigenvalues for the set of 2S HRTFs at L * N levels.
  • Each subject-ear HRTF set may be described by one or more eigenvalues. Only those eigenvalues computed from eigenvectors that contribute to a large portion of the shared variance are used to describe a set of subject-ear HRTFs.
  • Each subject-ear HRTF may be described by, for example, a set of 10 eigenvalues.
  • the cluster analysis procedure performed by the HRTF clustering processor 73 is performed using a hierarchical agglomerative cluster technique, for example the S-Plus® program, provided by MathSoft, Inc., based on the distance between each set of HRTFs in multi-dimensional space.
  • Each subject-ear HRTF set is represented in multi-dimensional space in terms of eigenvalues. Thus, if 10 eigenvalues are used, each subject-ear HRTF would be represented at a specific location in 10-dimensional space.
  • Distances between each subject-ear position are used by the cluster analysis in order to organize the subject-ear sets of HRTFs into hierarchical groups.
  • Hierarchical agglomerative clustering in two dimensions is illustrated in Figure 14.
  • Figure 15 depicts the same clustering procedure using a binary tree structure.
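  • As an illustration of hierarchical agglomerative clustering on points in a feature space, the sketch below repeatedly merges the two closest clusters and records the merge order, which is the information a binary tree such as that of Figure 15 encodes. The text names S-Plus for this step and does not state a linkage rule; the single-linkage distance, the function name `agglomerate`, and the 2-D toy points are assumptions for the example.

```python
import math


def agglomerate(points):
    """Return the merge history; each cluster is a tuple of point indices."""
    clusters = [(i,) for i in range(len(points))]
    merges = []

    def euclid(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    def dist(a, b):
        # Single linkage (assumed): distance between the closest members.
        return min(euclid(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        pairs = [(dist(a, b), a, b)
                 for i, a in enumerate(clusters)
                 for b in clusters[i + 1:]]
        d, a, b = min(pairs)                       # closest pair of clusters
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges
```

  • In the setting described above, each point would be the vector of eigenvalue-based coordinates for one subject-ear HRTF set rather than a 2-D toy point.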
  • This embodiment stores sets of HRTFs in an ordered fashion in the ROM 65 based on the result of the cluster analysis.
  • the present invention employs an HRTF matching processor 59 in order to allow the user to select the set of HRTFs that best match the user.
  • an HRTF binary tree structure is used to match an individual listener to the best set of HRTFs.
  • the sets of HRTFs stored in the ROM 65 comprise one large cluster.
  • the sets of HRTFs are grouped based on similarity into two sub-clusters. The listener is presented with sounds filtered using representative sets of HRTFs from each of two sub-clusters 49, 50.
  • For each set of HRTFs, the listener hears sounds filtered using specific HRTFs associated with a constant low elevation and varying azimuths surrounding the head. The listener indicates which set of HRTFs appears to be originating at the lowest elevation. This becomes the current "best match set of HRTFs.” The cluster in which this set of HRTFs is located becomes the current "best match cluster.”
  • the "best match cluster” in turn includes two sub-clusters, 51, 52.
  • the listener is again presented with a representative pair of sets of HRTFs from each sub-cluster. Once again, the set of HRTFs that is perceived to be of the lowest elevation is selected as the current "best match set of HRTFs" and the cluster in which it is found becomes the current "best match cluster.”
  • the process continues in this fashion with each successive cluster containing fewer and fewer sets of HRTFs.
  • the representative set of HRTFs selected at this level becomes the listener's final "best match set of HRTFs.” From this set of HRTFs, specific HRTFs are selected as a function of the desired phantom loudspeaker location associated with each of the multiple channels. These HRTFs are routed to multiple HRTF processors for convolution with each channel.
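  • The descent through the binary tree can be sketched as a simple loop: at each level the listener compares the representative sets of the two sub-clusters, and the search continues inside the preferred one. The `Cluster` class, the `match_listener` function, and the `pick_lowest` callback (which stands in for the listener's judgment of which set sounds lowest in elevation) are hypothetical names introduced for this example.

```python
class Cluster:
    def __init__(self, representative, children=()):
        self.representative = representative   # id of the HRTF set standing for this cluster
        self.children = tuple(children)        # empty for a leaf, otherwise two sub-clusters


def match_listener(root, pick_lowest):
    """Walk from the root cluster down to a leaf, following the listener's choices."""
    node = root
    while node.children:
        candidates = [child.representative for child in node.children]
        chosen = pick_lowest(candidates)                 # listener's choice at this level
        node = node.children[candidates.index(chosen)]   # new "best match cluster"
    return node.representative                           # final "best match set"
```

  • Each pass through the loop halves the remaining candidates, so a population of N sets needs on the order of log2(N) listening comparisons rather than N.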
  • left 701 and right 702 audio out signals of Figure 23A can be inputs, for example 754, of a typical digital signal transmission system known in the art, the output of which, for example 762, can be inputted to a set of headphones.
  • Left 701 and right 702 audio out signals can be outputted in digital or analog format. If outputted in analog format, each signal can be converted to digital format 755.
  • the left and right audio signals are interlaced in time to create a single digital signal 755 which carries both the left and right channel information.
  • the single interlaced digital signal 755 can have a first digital word, e.g., 16 bits, that is a right audio channel word, a second digital word that is a left audio channel word and thereafter alternating between right and left (see Figure 9G).
  • This single digital signal 755 carrying both the left and right audio channel information can then be inputted, for example 755 of Figure 7, to a typical digital signal transmission system.
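  • The time interlacing of right and left words into one stream can be illustrated as below. The right-word-first ordering and the 16-bit word size follow the text; the function names `interlace` and `deinterlace` are assumptions made for the sketch.

```python
def interlace(left_words, right_words):
    """Build one stream of 16-bit words: right, left, right, left, ..."""
    stream = []
    for r, l in zip(right_words, left_words):
        stream.append(r & 0xFFFF)   # right channel word comes first
        stream.append(l & 0xFFFF)   # followed by the left channel word
    return stream


def deinterlace(stream):
    """Split the interlaced stream back into (left_words, right_words)."""
    return stream[1::2], stream[0::2]
```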
  • a standard digital signal transmission system typically comprises a transmitting station 751, a connecting medium called a channel 752, and a receiving station 753.
  • the transmitting station 751 can receive an analog signal 754 and convert it to a digital signal 755 or can receive a digital signal 755 directly.
  • Conversion of an analog to a digital signal, for example using an analog-to-digital (A/D) converter 756, requires the analog signal to be sampled and quantized to the nearest of a number of discrete signal levels.
  • A/D analog-to-digital
  • the discrete signal level of the quantized signal is sent to a source encoder 757 where each discrete signal level is converted into a digital representation thereof, typically binary.
  • This representation can consist of digital words, for example 16-bit digital words, wherein each digital word represents the value of a discrete signal level. These digital words can be transmitted sequentially as a serial binary digital bit stream.
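  • A small sketch of the quantize-then-serialize step: each sample is rounded to the nearest of the discrete levels of a 16-bit two's complement word, and the words are then emitted as a serial bit stream. The [-1.0, 1.0] input range, the most-significant-bit-first order, and the function names are assumptions for illustration.

```python
def quantize(sample, bits=16):
    """Map a sample in [-1.0, 1.0] to a two's complement word of `bits` bits."""
    full = 1 << (bits - 1)
    level = int(round(sample * (full - 1)))        # nearest discrete level
    level = max(-full, min(full - 1, level))       # clip out-of-range input
    return level & ((1 << bits) - 1)               # two's complement bit pattern


def to_bitstream(words, bits=16):
    """Serialize words as a flat list of bits, most significant bit first (assumed)."""
    return [(w >> (bits - 1 - i)) & 1 for w in words for i in range(bits)]
```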
  • the binary digital representation is in a particular waveform format, e.g., unipolar or Manchester, and is sent to a modulator 758, which modulates the signal for transmission over the channel 752.
  • the modulator 758 can be a RF modulator, for which the corresponding channel would be air.
  • the channel may be a wire or like transmission means.
  • the receiving station 753 is essentially the inverse of the transmitting station and comprises a demodulator 759, a source decoder 760, and an optional digital-to-analog converter 761.
  • the output from the receiving station can accordingly be either an analog output 762 or a digital output 763.
  • Example 3. Important parameters and design considerations for a digital signal transmission system are the bandwidth of the channel, the costs of the transmitting and receiving stations, the power consumption of the transmitting and receiving stations, and the particular binary waveform chosen for source encoding. Bandwidth is important because it limits the amount of information that can be sent per unit time. The selection of the binary waveform is important because the selection can affect bandwidth and the costs, complexity, and power consumption of the transmitting and receiving stations. This example provides a method for signal transmission that avoids certain problems, discussed below, inherent in known transmission systems for digital signals, and thereby enhances the fidelity of the HRTF processed signal of this invention as it is sent to a listener.
  • a receiver for example, within the receiving station of Example 2, has no clock which is, a priori, synchronized to an incoming digital bit stream
  • the digital bit stream is called an asynchronous signal.
  • the receiver must, therefore, lock-on to the bit rate in order to generate a clock signal, tied to the bit rate, to enable the receiver to decode the signal.
  • Locking-on to the bit rate can be accomplished by known methods, for example, using a phase-locked loop (PLL).
  • PLL phase-locked loop
  • these strings of contiguous zeroes and/or ones can be encountered with audio signals during moments of silence, or idle patterns.
  • These strings of contiguous zeroes and ones can lead to drifting of the output frequency of the PLL due to an imbalance in the charging and discharging events within the PLL.
  • the PLL can lose its lock, resulting in decoding errors, and thus degradation in the performance of the entire transmission system.
  • a binary format digital signal without repeated strings of contiguous zeroes and/or ones would give the PLL a balance of charging and discharging events, allowing the PLL to track the digital signal's frequency more accurately.
  • Manchester, or bi-phase-level, encoding, commonly used for digital audio signals, eliminates the drifting of the PLL.
  • a Manchester encoded waveform transmits the symbol 1 as a positive pulse for half of the symbol interval, followed by a negative pulse for the remainder of the interval; the symbol 0 is conveyed by the same two-pulse sequence but of opposite polarity.
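  • The Manchester convention just described (1 as positive-then-negative, 0 as negative-then-positive) can be written out directly. The representation of pulse levels as +1 and -1 and the function names are assumptions for this sketch.

```python
def manchester_encode(bits):
    """Encode each bit as two half-interval levels: 1 -> (+1, -1), 0 -> (-1, +1)."""
    levels = []
    for b in bits:
        levels.extend((1, -1) if b else (-1, 1))
    return levels


def manchester_decode(levels):
    """Invert manchester_encode; raises ValueError on an invalid level pair."""
    if len(levels) % 2:
        raise ValueError("expected two half-intervals per symbol")
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if (first, second) == (1, -1):
            bits.append(1)
        elif (first, second) == (-1, 1):
            bits.append(0)
        else:
            raise ValueError("not a valid Manchester pair")
    return bits
```

  • Every symbol contains a mid-interval transition, which keeps a PLL supplied with edges, but the line signal carries two levels per bit and therefore needs roughly twice the bandwidth of the raw bit stream.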
  • the subject invention includes a novel encoding, transmission, and decoding technique for binary format digital signals. This is particularly advantageous when applied to signals with frequent idle patterns (e.g. digital audio). Advantages of the subject technique include efficient carrier stabilization and bit clock embedding. In addition, this technology provides a low-cost, low power-consumption transmitter/receiver combination for digital signals, including, but not limited to, digital radio frequency (RF) audio signals processed according to this invention to spatialize sound over headphones.
  • RF radio frequency
  • the subject encoding technique can operate on input binary encoded digital signals, typically encoded in two's complement.
  • the subject technique involves (a) removing the DC component of the input binary encoded digital signal, if present, and, if not already present, adding a small amount of noise to the input binary encoded digital signal, to ensure that each bit location undergoes transitions between the zero and one states, even during idle patterns; (b) inverting, or toggling, every other bit of the binary encoded signal to provide sufficient transitions between adjacent bits to enable the receiver to lock-on to the bit rate and to prevent drifting of the receiver's PLL when long strings of contiguous zeroes and/or ones are present in the input binary encoded digital signal; and (c) encoding a locking bit on the digital signal, for example one locking bit at the start of each word.
  • This locking bit enables the receiver to lock-on to the word pattern of the digital signal, i.e., the position of the digital words within the digital bit stream.
  • the signal should have enough self-noise to ensure frequent transitions from positive to negative values of the signal. Note, if a signal does not have sufficient self-noise, a noise generator is summed with the signal to ensure frequent transitions between positive and negative values for the signal.
  • the subject encoding technique operates on an input binary encoded digital signal, typically encoded in two's complement. The first step of the subject technique is to remove the DC component of the input binary encoded digital signal, if present.
  • this technique is applied to signals where DC coupling is not critical, as in the audio signals of this invention. Since the human ear cannot detect DC sounds, the DC component is not important with respect to digital audio signals. Therefore, this technique is particularly advantageous with respect to processing digital audio signals.
  • the left 701 and right 702 audio out signals can be outputted in digital or analog format. If outputted in analog format, each signal can be converted to digital format 901.
  • the left and right audio signals are interlaced in time to create a single digital signal 901 which carries both the left and right channel information.
  • the single interlaced digital signal 901 can have a first digital word, e.g., 16 bits, that is a right audio channel word, a second digital word that is a left audio channel word and thereafter alternating between right and left (see Figure 9G).
  • This single digital signal 901 carrying both the left and right audio channel information can then be inputted as shown in Figure 8A.
  • the DC be removed 902 from the signal after the signal is in digital form 901, rather than from the analog signal prior to digitization.
  • a small DC component is typically introduced into the digital signal during conversion from analog to digital.
  • This DC component introduced into the digital signal is inherent in known analog-to-digital converters and even though small, is undesirable when implementing the subject invention. For instance, during idle patterns of the signal, this residual DC component can cause bit locations to "stick" (i.e. remain in a zero state or a one state) for long periods.
  • This "sticking" can make it possible for the receiver to mistake a “sticking” bit as a locking bit, which as discussed in greater detail below, is a bit which can be encoded on the digital signal and, typically, is always a zero or always a one.
  • Removing the DC component 902 can be accomplished by many known techniques, for example, by passing the signal through a high pass digital filter.
  • This high-pass filter can be, for example, an infinite impulse response (IIR) high pass digital filter. It is important, when designing the apparatus which is to remove the DC component from the digital signal, that the apparatus does not detrimentally affect the non-DC components of the digital signal.
  • IIR infinite impulse response
  • a first-order Butterworth digital high-pass filter with a 20 Hz corner frequency, is used.
  • an adaptive filter is used to remove the DC component.
  • an adaptive filter such as that shown in Figure 8B is used to remove the DC component 902 of the input binary encoded digital signal 901, generated by interlacing in time the digital format representation of left 701 and right 702 audio out signals of Figure 23A (or left 30 and right 31 earphone signals of Figure 4).
  • the input binary encoded digital signal in a specific embodiment, can be a 16 bit word signal where left and right channel words are interlaced in time such that the first 16 bit word represents the first right channel word and the second 16 bit word represents the first left channel word. Accordingly, each successive 16 bit word alternates between right channel and left channel.
  • the digital word of the input signal 901l is first summed 771 with a tracking constant C[k] 772, which can initially be zero.
  • the tracking control variables, Q1 and Q2, are dependent upon the amount of gain desired in the adaptation control circuit. This adaptive filter effectively integrates out an average, or DC component, and continually removes it from the source signal.
  • the input signal 901l or 901r has sufficient self-noise to ensure transitions between positive and negative values even after the DC component is removed, then it is preferred that Q1 and Q2 be equal in size.
  • a noise generator 924 can be used to add in sufficient noise.
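  • One way to picture a tracking-constant DC remover is the sketch below. The update rule is not given in this excerpt, so the sign-based rule used here (decrease the tracking constant by one step when the output is positive, increase it by another step when the output is negative) is purely an assumption, as are the name `remove_dc` and the parameters `q1` and `q2`, which stand in for the tracking control variables.

```python
def remove_dc(samples, q1=1, q2=1):
    """Add a slowly adapting tracking constant c to each sample.

    Assumed rule: c moves down by q1 after a positive output and up by q2
    after a negative output, so the output is pulled toward zero offset.
    """
    c = 0                 # tracking constant, initially zero
    out = []
    for x in samples:
        y = x + c         # input word summed with the tracking constant
        out.append(y)
        if y > 0:
            c -= q1
        elif y < 0:
            c += q2
    return out
```

  • With equal step sizes the rule balances the number of positive and negative output words, which is the property the encoding relies on for frequent sign transitions.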
  • Figures 9A, 9B, and 9C the results of a computer simulation of removing the DC component from a Gaussian noise source using an adaptive filter, as shown in Figure 8B, are illustrated.
  • Figure 9A shows the original Gaussian noise source waveform
  • Figure 9B shows the value of the tracking constant, C[k]
  • Figure 9C shows the output waveform of the adaptive filter.
  • Figures 9D and 9E the magnitude frequency response of the input Gaussian noise waveform and DC shifted output waveform are shown, where Figure 9D is up to 2×10^4 Hz while Figure 9E shows an expanded view up to 1000 Hz.
  • the next step is to toggle every other bit 903 of the signal.
  • This toggling can be accomplished by known means, for example, by exclusive ORing the signal with a sequence of alternating ones and zeroes, i.e., ...1010...10...
  • the output of an exclusive OR gate is a one if, and only if, only one of the two inputs is a one. Therefore, when an input is exclusive ORed with a zero, the output is the same as the input. However, when an input is exclusive ORed with a one, the output is an inversion of the input. For example, a one exclusive ORed with a one gives an output of zero and a zero exclusive ORed with a one gives an output of one.
  • every other bit of the encoded signal is inverted by exclusive ORing 903 each word of the signal with 10101010101010. It should be noted that one could alternatively exclusive OR the signal with 010101...01 and adjust the receiver accordingly.
  • the purpose of this toggling, or inverting of every other bit is to provide sufficient transitions between adjacent bits to enable a receiver to lock-on to the bit rate.
  • the removal of the DC component, and subsequent inverting of every other bit ensures that there will not be repeated strings of contiguous ones or zeroes, and that each bit location is guaranteed to alternate, or flip flop, between the one and zero states, even during idle patterns of the signal.
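  • The effect of toggling every other bit on an idle-pattern word can be checked with a few lines of Python. The 16-bit word width and the mask 0xAAAA (binary 1010101010101010) are assumed for the example, and `longest_run` is a hypothetical helper that measures the longest string of identical contiguous bits.

```python
TOGGLE_MASK_16 = 0xAAAA   # 1010101010101010, assumed 16-bit mask


def toggle_alternate_bits(word, mask=TOGGLE_MASK_16):
    """Invert every other bit of a 16-bit word by exclusive ORing with the mask."""
    return (word ^ mask) & 0xFFFF


def longest_run(word, width=16):
    """Length of the longest run of identical contiguous bits in the word."""
    bits = format(word, "0{}b".format(width))
    best = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best
```

  • For the small positive word 0x0005, the longest run drops from 13 contiguous zeroes to 4 contiguous ones after toggling, and applying the same exclusive OR a second time restores the original word, which is why the receiver can undo the operation with the same mask.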
  • 24 bit signed two's complement encoding is used.
  • the most significant bit location is the sign bit in the two's complement binary format, where the sign bit is zero for positive and one for negative signal values. Since the DC component of the digital signal has been removed, the digital signal frequently transitions between positive and negative. Therefore, the sign bit location is equally likely to be a one or a zero. Combining the removal of the DC component with the inversion of every other bit ensures each of the remaining 23 bit locations in this 24 bit illustration are also just as likely to be a one or a zero, and there are no repeated strings of contiguous ones or zeroes remaining in the signal.
  • the 24 bit signal would frequently have positive value words having a string of zeroes in the most significant bits during idle patterns, such as 000000000000000000100101, with only the least significant bits being in a different state than their neighbor bits.
  • negative value words with a string of ones in the most significant bits, such as 111111111111111110101110, again with only the least significant bits flip-flopping. If the signal, for example due to noise, were such that the signal remains positive or negative for relatively long periods, then these most significant bits can "stick” at a particular value, zero or one, for an equally long period. These "sticking" bits could be mistaken for a locking bit, wherein a locking bit is a bit which can be encoded on the digital signal and, typically, is always a one or always a zero.
  • a locking bit can be located at a certain bit location within a word to allow a receiver to lock-on to the location of the words within the signal by locking on to the locking bit.
  • a "code violation" within the signal can be used to allow the receiver to determine where each word begins.
  • a locking bit can be placed at certain locations within the signal.
  • right and left channel words can be interlaced in time, where each channel can have, for example, 16 bits as shown in Figure 9G.
  • the locking bit can be located in a certain position of the right channel word, for example, in the least significant bit location.
  • This locking bit then gives the location of the right channel word, as well as the location of the left channel word.
  • This locking bit can be, for example, always a zero or always a one, which allows a receiver to lock on to the locking bit and, therefore, the word pattern of the digital bit stream.
  • each, for example, right word is ANDed 904 with 1111111111111110.
  • AND operation leaves the first 15 bits of the 16 bit word unchanged and necessarily encodes a zero in the 16th bit location. This guarantees that each right word has as a locking bit, a zero in the least significant bit location, to allow determination of the location of each word in the digital signal at the receiver. It is important to note that it is not necessary for each word or even every other word to have a locking bit encoded on it. Indeed, a locking bit could be encoded on every third or fourth word. In fact, the limit as to how far apart locking bits can be spaced is determined by the cost and complexity of the receiver to be used.
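  • Encoding the locking bit is a single AND per right-channel word. The sketch below assumes the interlaced stream starts with a right word, so right words sit at even indices; the name `add_locking_bit` is hypothetical.

```python
LOCK_MASK = 0xFFFE   # 1111111111111110


def add_locking_bit(stream):
    """Force the least significant bit of every right-channel word to zero.

    Assumes right words occupy even indices of the interlaced stream.
    """
    return [w & LOCK_MASK if i % 2 == 0 else w for i, w in enumerate(stream)]
```

  • The cost is one bit of resolution in the right channel, at the least significant position where it is least audible.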
  • the signal can be transmitted via a wired connection to headphones or through the air.
  • the signal is inputted to a frequency shift keying (FSK) transmitter 905, such as a RF9901 FSK transmitter chip from RF Micro Devices, which modulates the signal for transmission from a transmitting loop antenna 906.
  • a corresponding receiving loop antenna 907 receives the incoming FSK modulated signal and sends the signal to a FSK receiver 908, such as a RF9902 FSK receiver chip from RF Micro Devices, which demodulates the signal.
  • the demodulated signal can then be inputted to conventional two transducer headphones for listening.
  • the receiver can comprise a phase lock loop 815, which provides a master clock 804 and aligns the clocking bits with the data bits provided from, for example, an RF demodulator.
  • the receiver can further comprise a state machine 800, which can be the center of the timing for the receiver, and can also perform a number of operations including: clocking functions for the D/A converter, reclocking of the data delivered to the D/A, and control lines for master reset.
  • the state machine can provide a serial clock 805, SCLK, a left/right clock 806, L/R CLK, and data 803, SDATA, to a D/A converter.
  • the state machine 800 can, for example, be a free running eight bit counter. Where the signal is transmitted wirelessly, the state machine 800 receives the RF data 801 (RF Digital) and inverts the bits which were inverted prior to transmission, by exclusive ORing RF Digital 801 with a clocking signal Q3 802 which has a frequency one half of the bit rate (or 1/16 of the master clock). The data stream can then be latched to produce a strong, clean data bit stream, 803 (SDATA), to present to the D/A converter.
  • RF Data 801 RF Digital
  • Q3 802 which has a frequency one half of the bit rate (or 1/16 of the master clock).
  • SDATA strong, clean data bit stream
  • the locking bit is encoded on the incoming data stream, RF Digital 801, to allow the receiver to maintain word lock.
  • the locking bit can be, for example, always 0 (logic level low) in the least significant bit of the digital data word.
  • the state machine 800 looks for the locking bit during a window of time, the locking bit window 808, to determine if lock is being maintained. If a 0 is present, no action is taken; however, if a 1 is detected, the state machine 800 resets itself via its reset control line 809. After resetting, the state machine 800 can, for example, start over at a new data position and the process continues until lock is regained.
  • the locking bit could always be 1 and then the state machine would reset upon detecting a 0 during the locking bit window 808.
  • the demodulated signal output from the FSK receiver 908, called RFDIG 801, is in the same binary format as the signal which entered the FSK transmitter 905.
  • RFDIG 801 the demodulated signal output from the FSK receiver 908, called RFDIG 801
  • PLL 815 is able to lock on to the frequency of the bit rate due to sufficient bit transitions provided by the exclusive ORing of the signal with 1010 ...
  • the output of the PLL 815 is the master clock 804, MCLK, which has a frequency eight times the bit rate.
  • the MCLK is inputted to a divide-by-eight state machine 912, with the output thereof, at a frequency equal to the bit rate, fed through a feedback loop 913 to the
  • MCLK 804 is inputted to a state machine 800 which generates clock signals at MCLK/2 (or Q0) 810, MCLK/4 (or Q1) 811, MCLK/8 (or Q2) 805, MCLK/16 (or Q3) 802, MCLK/32 (or Q4) 812, MCLK/64 (or Q5) 813, MCLK/128 (or Q6) 814, and MCLK/256 (or Q7) 806, wherein MCLK/2 means a clocking signal at the MCLK frequency divided by 2, etc.
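The clock relationships listed above can be checked with a small model of a free-running eight bit counter. This is a sketch, not the circuit itself: it assumes Q0 is the counter's least significant bit, and the helper names `q_outputs`, `divider_ratio`, and `bits_per_cycle` are hypothetical.

```python
MCLK_PER_BIT = 8  # from the text: MCLK runs at eight times the bit rate


def q_outputs(count: int):
    """Return [Q0, ..., Q7] for a given value of the 8-bit counter."""
    return [(count >> k) & 1 for k in range(8)]


def divider_ratio(k: int) -> int:
    """Qk completes one cycle every 2**(k+1) MCLK cycles, i.e. MCLK / 2**(k+1)."""
    return 2 ** (k + 1)


def bits_per_cycle(k: int) -> float:
    """Number of data-bit periods spanned by one full cycle of Qk."""
    return divider_ratio(k) / MCLK_PER_BIT
```

Under these assumptions `bits_per_cycle(3)` evaluates to 2, consistent with Q3 running at one half of the bit rate, and `bits_per_cycle(7)` evaluates to 32, one cycle of the left/right clock per pair of 16-bit words.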
  • Figure 9G shows how these clock signals align with each other, the input signal RF digital
  • Figure 9G shows two 16-bit words, right channel word D15, D14, ... , D0, and left channel word D15, D14, ... , D0, from a digital bit stream, RFDIG 801 in Figure 8A. Note, these two 16-bit words could be considered one 32-bit word.
  • the first D15, D14, ... , D0 can be a right channel word and the next D15, D14, ... , D0 can be a left channel word.
  • MCLK/8 (or Q2
  • SCLK the data clock at twice the bit rate, which can be used to determine the state, one or zero, of each bit.
  • an eight input NAND gate 915 with inputs NOT Q7 817, Q6 814, Q5 813, Q4 812, Q3 802, NOT Q2 818, NOT Q1 819, and a bit value from latch 916, SDATA 803 after inversion,
  • Latch 916 can delay each bit for one cycle of MCLK/4, or one-half the duration of a bit. Therefore, the output from latch 916, SDATA 803, is delayed with respect to the output of the exclusive OR 917, by one-half the duration of a bit. This latching and delay allows the bit to be clean and strong during the locking bit window 808.
  • Figure 9G illustrates the alignment of SDATA 803, and the various clock signals when the state machine is in lock with the locking bit.
  • the bit value during the locking bit window 808, one or zero, from latch 916 is the bit value of Dn, which is any one of D15, D14, ... , D0, D15, D14, ..., D0 from either the left or right channel word as shown in Figure 9G.
  • the bit value of Dn is obtained by Exclusive ORing 917 RFDIG 801 with Q3 802.
  • Exclusive ORing 917 Q3 802 with RFDIG 801 inverts the previously inverted bits to generate a data signal, XOR output 816, which is a replica of the original binary coded format signal 901 with the DC removed.
  • Q3 802 is synchronized with RFDIG 801, by locking on to the bit rate.
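The effect of exclusive ORing with an alternating pattern can be illustrated in software. The sketch below assumes the pattern starts with a one on the first bit; the text gives the pattern only as 1010 ..., and the function names are hypothetical.

```python
def toggle_alternate_bits(bits):
    """XOR a bit sequence with the alternating pattern 1, 0, 1, 0, ..."""
    return [b ^ (1 if i % 2 == 0 else 0) for i, b in enumerate(bits)]


def count_transitions(bits):
    """Count adjacent positions where the bit value changes."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


silence = [0] * 8                        # a run of zeros has no transitions
sent = toggle_alternate_bits(silence)    # transmitter side
recovered = toggle_alternate_bits(sent)  # receiver side, same pattern in phase
```

Because XOR is its own inverse, applying the same pattern at the receiver restores the original bits, provided the receiver's pattern is in phase with the one used at the transmitter. The toggled stream has a transition on every bit even when the audio data is constant, which is what a PLL needs to track the bit rate.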
  • the locking bit is located by first resetting the state machine at a random position within the two 16 bit word cycle. If the output 921 of the NAND gate 915, after inversion by inverter 920, is a zero, then the selected bit is a one and therefore not the locking bit. Alternatively, the inverted NAND gate 915 output 921 will be one only when the inverted bit 922 from SDATA 803 is a one, corresponding to the bit from SDATA 803, the locking bit, being a zero.
  • the inverted NAND gate 915 output 921 can only be a one if the inverted bit 922 from SDATA 803 is a one at the same time that NOT Q7 817 is a one, Q6 814 is a one, Q5 813 is a one, Q4 812 is a one, Q3 802 is a one,
  • NOT Q2 818 is a one
  • NOT Q1 819 is a one
  • based on the inputs to the NAND gate 915 this only occurs at the D0 bit location of the right channel word. Therefore, if Dn (n ≠ 0) is arriving when D0 should arrive, then the inverted NAND 915 output 921 remains zero until Dn eventually becomes a zero. If, in Figures 8A and 9G, Dn is a one, then the inverted NAND gate 915 output 921 is zero, and the state machine 800 can be instructed to reset to the bit following Dn, namely Dn+1.
  • each bit location from D15, D14, ..., D0, D15, D14, ..., D0 is guaranteed to alternate between one and zero, except the locking bit, D0 of the right channel word which is always zero, the state machine can quickly lock on to the location of the locking bit. In this synchronized state, lock-on to the locking bit has been achieved.
  • the need to locate the locking bit is why it is imperative that each of the other bit locations are guaranteed to switch to a one state some time in the bit stream such that no other bit location remains in the zero state long enough to be mistaken as the locking bit.
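The search for the locking bit can be simulated on a synthetic bit stream. The following is a software sketch of the behaviour described above, not the hardware state machine. The stream generator, the choice to hold the left-word least significant bit at one, and all names (`FRAME_BITS`, `LOCK_INDEX`, `make_stream`, `acquire_lock`) are assumptions made for illustration.

```python
FRAME_BITS = 32   # right word D15..D0 followed by left word D15..D0
LOCK_INDEX = 15   # position of the right-channel D0 within the frame


def make_stream(n_frames):
    """Synthetic stream: the locking bit is always 0, the left-word LSB is
    held at 1, and every other position alternates from frame to frame."""
    bits = []
    for f in range(n_frames):
        for i in range(FRAME_BITS):
            if i == LOCK_INDEX:
                bits.append(0)
            elif i == FRAME_BITS - 1:
                bits.append(1)
            else:
                bits.append((f + i) % 2)
    return bits


def acquire_lock(bits, start=0):
    """Return the frame position the window has settled on by the end of the stream."""
    pos = start
    phase = start % FRAME_BITS
    while pos < len(bits):
        phase = pos % FRAME_BITS
        if bits[pos] == 1:
            pos += 1            # a one cannot be the locking bit: reset to the next bit
        else:
            pos += FRAME_BITS   # a zero: keep the window and look again one frame later
    return phase
```

In this model a wrong position survives only until that position carries a one, after which the window slips by one bit. Only the locking bit position is zero in every frame, so the window comes to rest there regardless of the starting offset.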
  • Example 4: In an embodiment such as described in Example 2 or Example 3, if the digital signal is wirelessly transmitted through the air, for example from an FSK transmitter to a FSK receiver, the receiver can be located in a remote unit while the transmitter can be located in a base unit.
  • the base unit can, for example, comprise the HRTF processing circuitry including DSP chip 600, EEPROM 710, and External EPROM 704, such as exemplified in Figure 23A, as well as the signal processing circuitry 901, 924, 902, 903, 904, FSK transmitter 905, and transmitting loop 906, such as exemplified in Figure 8A.
  • the remote unit can, for example, comprise receiving loop 907, FSK receiver 908, PLL 815, state machine 800, NAND gate 800, and associated circuitry exemplified in Figure 8A, as well as input means for HRTF matching control 636, OK control 637, Noise control 703, Bass control 680, Ears control 629, Seat control 643, Ambience control 696, Theater control 624, Hall control 625, and Club control 626.
  • the input means for the aforementioned control functions can instead be located in the base unit.
  • the headphones can be plugged into the base unit or the remote unit to allow the headphone user to listen to the audio signal.
  • the wireless transmission of the signal from the base unit to the remote unit allows the listener a greater range of motion than if connected to the base unit by wire. If the input means for the control features are in the remote unit, it is preferred to have some means for the remote unit to send information to the base unit.
  • the remote unit sends information to the base unit, for example, by an infra-red (IR) signal.
  • the remote unit has input means, for example, buttons, for the listener to enter, for example, club 626, hall 625, theater 624, ambience 696, seat control 643, ears control 629, bass control 680, noise control 703, OK control 637, and/or HRTF matching 636 signals.
  • These command signals are transmitted to the base unit by, for example, IR
  • In order for the remote unit to determine if the base received the IR signal, the base sends a return signal from the base unit to the remote unit, in response to receiving the IR signal from the remote unit.
  • the subject invention encodes a tag bit on the RF digital audio signal which, when received by the remote unit, indicates receipt, by the base unit, of an IR signal from the remote unit.
  • This tag bit is a bit encoded similarly to the locking bit. For example, if the locking bit is encoded in the least significant bit location of the right channel word of the audio signal, then the tag bit is, for example, encoded in the least significant bit location of the left channel word of the audio signal.
  • the tag bit is encoded, as a default value, opposite to the value of the locking bit. For instance, if the locking bit is encoded as a one, or a zero, then the tag bit will be encoded, as a default value, as a zero, or a one, respectively. In a specific embodiment where the locking bit is encoded as a zero, the default value of the tag bit can thus be a one and can therefore be encoded by ORing each left channel word with 0000000000000001.
  • the receiver in the remote unit interprets a one in the tag bit location to mean that no IR signal has been received by the base unit.
  • the base unit encodes a zero value in at least one consecutive tag bit location by ANDing at least one left word with 11...10 instead of ORing with 00...01.
  • a zero value is encoded for eight consecutive tag bits to reduce the effects of noise, i.e. bit errors.
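The tag bit encoding on the left channel words can be sketched as follows. This is an illustration only: the function name `encode_tag_bits` and the parameters `ack_start` and `ack_len` are hypothetical, and the default of eight acknowledging words follows the embodiment described above.

```python
TAG_DEFAULT_OR = 0x0001  # 00...01: sets the tag bit to one (no IR signal received)
TAG_ACK_AND = 0xFFFE     # 11...10: clears the tag bit (IR signal received)


def encode_tag_bits(left_words, ack_start=None, ack_len=8):
    """Set the tag bit on each left word, clearing it for `ack_len`
    consecutive words starting at index `ack_start` (if given)."""
    encoded = []
    for i, word in enumerate(left_words):
        acknowledging = ack_start is not None and ack_start <= i < ack_start + ack_len
        if acknowledging:
            encoded.append(word & TAG_ACK_AND)
        else:
            encoded.append((word | TAG_DEFAULT_OR) & 0xFFFF)
    return encoded
```

Repeating the zero over several consecutive words means a single bit error in the tag position is less likely to hide or fake an acknowledgement.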
  • the state machine 800 monitors the tag bit location, which is known relative to the locking bit location
  • the locking bit is encoded in the least-significant bit location of the right channel word
  • the tag bit is encoded in the least significant bit location of the left channel word.
  • the receiver of the remote unit monitors the tag bit much like it monitors the locking bit.
  • an additional eight input NAND gate similar to NAND gate 915 having inputs Q7 806, Q6 814, Q5 813, Q4 812, Q3 802, NOT Q2 818, NOT Q1 819, and a bit value from latch 916, SDATA 803, after inversion, 922, is used. Note, these are the same inputs for monitoring the locking bit location, except NOT Q7 817 is replaced with Q7 806.
  • Figure 9F illustrates the alignment of SDATA 803, and the various clock signals when the state machine is in lock with the locking bit.
  • the inverted output of the NAND gate is a zero, then the tag bit is a one and therefore no IR signal has been received by the base.
  • the inverted output of the NAND gate will be a one only when the inverted bit 922 from SDATA 803 is a one, corresponding to the bit from
  • SDATA 803 the tag bit, being a zero.
  • a zero value for the tag bit signifies the base unit has received an IR signal from the remote.
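The two NAND gate decodes can be modelled by treating Q0 to Q7 as the bits of an eight bit counter of MCLK cycles. This is a sketch under assumptions: Q0 is taken as the least significant counter bit, the counter is assumed to read zero at the start of the right channel word, and the half-bit latch delay is ignored. The names `window_active` and `inverted_nand_output` are hypothetical.

```python
def window_active(count: int, tag: bool = False) -> bool:
    """True when the clock inputs of the NAND gate are all one.

    Locking bit gate: NOT Q7, Q6, Q5, Q4, Q3, NOT Q2, NOT Q1.
    Tag bit gate:     Q7,     Q6, Q5, Q4, Q3, NOT Q2, NOT Q1.
    """
    q = [(count >> k) & 1 for k in range(8)]
    q7_term = q[7] if tag else 1 - q[7]
    return bool(q7_term and q[6] and q[5] and q[4] and q[3]
                and not q[2] and not q[1])


def inverted_nand_output(count: int, sdata_bit: int, tag: bool = False) -> int:
    """One only when the window is open and the data bit is zero."""
    return int(window_active(count, tag) and sdata_bit == 0)


lock_window = [c for c in range(256) if window_active(c)]
tag_window = [c for c in range(256) if window_active(c, tag=True)]
```

With eight MCLK cycles per bit, the locking bit window falls on counts 120 and 121, inside bit index 15 (the sixteenth bit, D0 of the first word), and the tag bit window falls on counts 248 and 249, inside bit index 31 (D0 of the second word). Swapping NOT Q7 for Q7 therefore moves the window by exactly one 16-bit word.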
  • the state machine 800 only looks for the tag bit during a small window in time, the tag bit window 820, after a command is sent via the IR link.
  • the remote clears the tag bit latch, transmits the command word over the IR, and then watches for a zero bit to be latched onto the tag bit control line. If a zero is latched, then the command was received by the DSP, the base; if a one is latched, then the command was not received and no action is taken by the remote unit. When a one is latched and no action is taken by the remote, the user would be required to press the command button again and resend the command over the IR link. Once the receiver locks on to the locking bit, the location of the tag bit will then be known.
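The acknowledgement check on the remote side can be summarized in a short sketch. The callables `send_ir` and `read_tag_window` are hypothetical stand-ins for the IR transmitter and for the tag bits observed during the tag bit window; neither name comes from the source.

```python
def command_acknowledged(tag_bits) -> bool:
    """True if a zero was seen in the tag bit location during the window."""
    return any(bit == 0 for bit in tag_bits)


def send_command(send_ir, read_tag_window) -> bool:
    """Send one command over IR and report whether the base acknowledged it.

    Starting each call with a fresh read models clearing the tag bit latch.
    """
    send_ir()  # transmit the command word over the IR link
    return command_acknowledged(read_tag_window())
```

A `False` result corresponds to the case where a one is latched: the remote takes no action and the user presses the button again.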

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

The present invention relates to a method and a device for processing multi-channel audio signals, each channel corresponding to a loudspeaker placed at a particular location in a room, so as to give, through headphones, the impression that multiple "phantom" loudspeakers are distributed around the room. Head Related Transfer Functions (HRTFs) are selected by taking into account the elevation and azimuth of each intended loudspeaker relative to the listener. Each channel is filtered with an HRTF so that, when the channels are combined into left and right channels and played over headphones, the listener has the impression that the sound actually comes from phantom loudspeakers distributed around the virtual room. HRTF coefficient sets are collected in a database from a large number of individuals, and using an optimal HRTF set for the listener concerned provides listening impressions similar to those a listener would have when listening alone to multiple loudspeakers distributed throughout a room. Applying an HRTF to the output of the right and left channels makes headphone listening give the impression of listening without headphones.
PCT/US1997/000145 1996-01-04 1997-01-03 Method and device for processing a multi-channel signal for use with a headphone Ceased WO1997025834A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU15271/97A AU1527197A (en) 1996-01-04 1997-01-03 Method and device for processing a multi-channel signal for use with a headphone

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US08/582,830 US5742689A (en) 1996-01-04 1996-01-04 Method and device for processing a multichannel signal for use with a headphone
US08/582,830 1996-01-04
US72659096A 1996-10-07 1996-10-07
US72651896A 1996-10-07 1996-10-07
US08/726,518 1996-10-07
US08/726,590 1996-10-07

Publications (2)

Publication Number Publication Date
WO1997025834A2 (fr) 1997-07-17
WO1997025834A3 (fr) 1997-09-18

Family

ID=27416387

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1997/000145 Ceased WO1997025834A2 (fr) 1996-01-04 1997-01-03 Method and device for processing a multi-channel signal for use with a headphone

Country Status (2)

Country Link
AU (1) AU1527197A (fr)
WO (1) WO1997025834A2 (fr)

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4739513A (en) * 1984-05-31 1988-04-19 Pioneer Electronic Corporation Method and apparatus for measuring and correcting acoustic characteristic in sound field
JPH0739968B2 (ja) * 1991-03-25 1995-05-01 日本電信電話株式会社 音響伝達特性模擬方法
US5404406A (en) * 1992-11-30 1995-04-04 Victor Company Of Japan, Ltd. Method for controlling localization of sound image
US5371799A (en) * 1993-06-01 1994-12-06 Qsound Labs, Inc. Stereo headphone sound source localization system
US5438623A (en) * 1993-10-04 1995-08-01 The United States Of America As Represented By The Administrator Of National Aeronautics And Space Administration Multi-channel spatialization system for audio signals
GB9326092D0 (en) * 1993-12-21 1994-02-23 Central Research Lab Ltd Apparatus and method for audio signal balance control
US6118875A (en) * 1994-02-25 2000-09-12 Moeller; Henrik Binaural synthesis, head-related transfer functions, and uses thereof
EP0760197B1 (fr) * 1994-05-11 2009-01-28 Aureal Semiconductor Inc. Affichage audio virtuel tridimensionnel utilisant des filtres de formation d'images a complexite reduite

Also Published As

Publication number Publication date
WO1997025834A3 (fr) 1997-09-18
AU1527197A (en) 1997-08-01

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AU BA BB BG BR CA CN CU CZ EE GE HU IL IS JP KP KR LC LK LR LT LV MG MK MN MX NO NZ PL RO SG SI SK TR TT UA UG US UZ VN AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 97525321

Format of ref document f/p: F

122 Ep: pct application non-entry in european phase