CN109716795B - Networked microphone device, method therefor, and media playback system

Info

Publication number: CN109716795B
Application number: CN201780057093.3A
Authority: CN
Inventors: 蒂莫西·希恩
Original assignee: Sonos Inc
Current assignee: Sonos Inc
Priority date: 2016-07-15
Filing date: 2017-07-14
Publication date: 2020-12-04
Anticipated expiration: 2037-07-14
Also published as: CN112492502A; CN109716795A; EP4325895A2; CN112492502B; EP4325895A3; WO2018013959A1; EP3485655A1; EP3485655B1

Abstract

Example techniques may involve performing aspects of a spectral calibration using an applied spatial calibration. An example implementation may include receiving data representing spatial filters that correspond to respective playback configurations. The implementation may also involve causing one or more audio drivers to output calibration audio that is divided into a repeating set of frames, the set including a respective frame for each playback configuration. Causing the audio drivers to output the calibration audio may involve causing an audio stage to apply, during each frame, the spatial filter corresponding to the respective playback configuration. The implementation may further include receiving data representing spectral filters that correspond to respective playback configurations, the spectral filters being based on the calibration audio output by the audio drivers. When playing back audio content in a given playback configuration, the audio stage may apply the particular spectral filter corresponding to that configuration.

Description

Networked microphone device, method therefor, and media playback system

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Patent Application No. 15/211,835, filed July 15, 2016, and U.S. Patent Application No. 15/211,822, filed July 15, 2016, the entire contents of which are incorporated herein by reference. Additional incorporations by reference are made throughout the disclosure.

Technical Field

The present disclosure relates to consumer goods and, more particularly, to methods, systems, products, features, services, and other elements directed to media playback or some aspect thereof.

Background

Options for accessing and listening to digital audio in an out-loud setting were limited until 2003, when SONOS, Inc. filed one of its first patent applications, entitled "Method for Synchronizing Audio Playback between Multiple Networked Devices," and began offering a media playback system for sale in 2005. The Sonos Wireless HiFi System enables people to experience music from many sources via one or more networked playback devices. Through a software control application installed on a smartphone, tablet, or computer, a person can play what he or she wants in any room that has a networked playback device. Additionally, using the controller, for example, different songs can be streamed to each room that has a playback device, rooms can be grouped together for synchronous playback, or the same song can be heard in all rooms synchronously.

Given the ever-growing interest in digital media, there continues to be a need to develop consumer-accessible technologies to further enhance the listening experience.

Description of Drawings

Features, aspects, and advantages of the presently disclosed technology may be better understood with regard to the following description, appended claims, and accompanying drawings, in which:

FIG. 1 shows an example media playback system configuration in which certain embodiments may be practiced;

FIG. 2 shows a functional block diagram of an example playback device;

FIG. 3 shows a functional block diagram of an example control device;

FIG. 4 shows an example controller interface;

FIG. 5 shows an example control device;

FIG. 6 shows a smartphone displaying an example control interface, according to an example implementation;

FIG. 7 shows an example movement through an example environment in which an example media playback system is positioned;

FIG. 8 shows an example chirp that increases in frequency over time;

FIG. 9 shows an example Brownian noise spectrum;

FIGS. 10A and 10B show transition frequency ranges of example hybrid calibration sounds;

FIG. 11 shows a frame illustrating an iteration of an example periodic calibration sound;

FIG. 12 shows a series of frames illustrating iterations of an example periodic calibration sound;

FIG. 13 shows an example flow diagram to facilitate a spatial calibration;

FIG. 14 shows example frames illustrating calibration audio that is divided into frames and time slots;

FIG. 15 shows a smartphone displaying an example control interface, according to an example implementation;

FIG. 16 shows a smartphone displaying an example control interface, according to an example implementation;

FIG. 17 shows a smartphone displaying an example control interface, according to an example implementation;

FIG. 18 shows a smartphone displaying an example control interface, according to an example implementation;

FIG. 19 shows an example flow diagram to facilitate a spatial calibration using an applied spatial calibration;

FIG. 20 shows an example flow diagram to facilitate a spatial calibration using an applied spatial calibration; and

FIG. 21 shows a smartphone displaying an example control interface, according to an example implementation.

The drawings are for the purpose of illustrating example embodiments, but it is understood that the invention is not limited to the arrangements and instrumentalities shown in the drawings.

Detailed Description

I. Overview

Embodiments described herein involve, among other things, techniques to facilitate calibration of a media playback system. Some calibration procedures contemplated herein involve a recording device (e.g., a networked microphone device (NMD)) detecting sound waves (e.g., one or more calibration sounds) emitted by one or more playback devices of a media playback system. A processing device, such as the recording device, a playback device, or another device that is communicatively coupled to the media playback system, may analyze the detected sound waves to determine one or more calibrations for the one or more playback devices of the media playback system. When applied, such calibrations may configure the one or more playback devices to a given listening area (i.e., the environment in which the one or more playback devices were positioned while emitting the sound waves).

In some embodiments contemplated herein, the processing device may determine a first type of calibration. For instance, the processing device may determine a spatial calibration that spatially configures the one or more playback devices to a given listening area. Such a calibration may configure the one or more playback devices to one or more particular locations within the environment (e.g., one or more preferred listening positions, such as a favorite seating location), perhaps by adjusting time delay and/or loudness for those particular locations. A spatial calibration may include one or more filters that include delay and/or phase adjustments, gain adjustments, and/or any other adjustment to correct for the spatial placement of the one or more playback devices relative to the one or more particular locations within the environment.

As noted above, during a calibration procedure, one or more playback devices of a media playback system may output calibration sounds. Some example media playback systems may include multiple audio drivers, which may be divided among the one or more playback devices of the media playback system in various arrangements. For instance, an example media playback system may include a soundbar-type playback device with multiple audio drivers (e.g., nine audio drivers). Another playback device might include multiple audio drivers of different types (e.g., tweeters and woofers, perhaps of varying size). Other example playback devices may include a single audio driver (e.g., a single full-range woofer in a playback device, or a large low-frequency woofer in a subwoofer-type device).

In operation, the multiple audio drivers of the media playback system may form multiple "sound axes." Each such sound axis may correspond to a respective input channel of audio content. In some implementations, two or more audio drivers may be arrayed to form a sound axis. For instance, a soundbar-type device might include nine audio drivers that form multiple sound axes (e.g., front, left, and right surround sound channels). Any audio driver may contribute to any number of sound axes. For example, the left axis of a surround sound system may be formed by contributions from all nine audio drivers in the example soundbar-type device. Alternatively, an axis may be formed by a single audio driver.
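The relationship between input channels, sound axes, and audio drivers can be pictured as a table of per-driver weights. The following is a minimal sketch of that idea only. The three-driver layout, the weight values, and the `mix_to_drivers` helper are hypothetical and are not taken from the disclosure.

```python
# Hypothetical weights: AXIS_WEIGHTS[axis][d] is how strongly driver d
# contributes to that sound axis. Three drivers are used here for brevity;
# the values are made up for illustration.
AXIS_WEIGHTS = {
    "left":   [1.0, 0.5, 0.0],
    "center": [0.0, 1.0, 0.0],
    "right":  [0.0, 0.5, 1.0],
}


def mix_to_drivers(channel_samples):
    """Map one sample per input channel (sound axis) to one sample per driver."""
    num_drivers = len(next(iter(AXIS_WEIGHTS.values())))
    out = [0.0] * num_drivers
    for axis, sample in channel_samples.items():
        for d, weight in enumerate(AXIS_WEIGHTS[axis]):
            # A single driver may receive contributions from several axes.
            out[d] += weight * sample
    return out


driver_feed = mix_to_drivers({"left": 1.0, "center": 0.0, "right": 0.0})
```

In this sketch the middle driver contributes to all three axes, while each outer driver contributes to only one, mirroring the statement that any driver may contribute to any number of sound axes.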

Example media playback systems described herein may adopt various playback configurations, each representing a respective set of sound axes. Example playback configurations may include respective configurations based on the number of input channels (e.g., mono, stereo, surround, or any of the above in combination with a subwoofer). Other example playback configurations may be based on the content type. For instance, a first set of axes may be formed by the audio drivers of the media playback system when playing music, and a second set of axes may be formed by the audio drivers when playing audio that is paired with video (e.g., television audio). Other playback configurations may result from various groupings of playback devices within the media playback system. Many examples are possible.

During some example calibration procedures, the multiple audio drivers of the media playback system may form the multiple sound axes such that each sound axis outputs sound during the calibration procedure. For instance, the calibration audio emitted by the multiple audio drivers may be divided into constituent frames. Each frame may in turn be divided into time slots. During each time slot of a given frame, a respective sound axis may be formed by outputting audio. In this manner, an NMD that is recording the audio output of the audio drivers can obtain samples from each sound axis. The frames may repeat, so as to produce multiple samples for each sound axis when recorded by the NMD.
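The frame and time-slot structure described above can be sketched as a simple schedule. This is an illustration under stated assumptions: the axis names, the slot length, and the `build_schedule` helper are hypothetical, and the disclosure does not prescribe this layout.

```python
def build_schedule(axes, num_frames, slot_ms):
    """Return (frame, slot, axis, start_ms) tuples, one slot per axis per frame."""
    schedule = []
    frame_ms = slot_ms * len(axes)  # each frame holds one slot for every axis
    for frame in range(num_frames):
        for slot, axis in enumerate(axes):
            start_ms = frame * frame_ms + slot * slot_ms
            schedule.append((frame, slot, axis, start_ms))
    return schedule


# Two repeated frames, so a recording device obtains two samples per axis.
schedule = build_schedule(["left", "center", "right"], num_frames=2, slot_ms=100)
```

Because every frame contains one slot for each axis, repeating the frame gives the recording device one additional sample of every axis per repetition.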

Another type of calibration that may be produced by the example calibration procedures described herein is a spectral calibration. A spectral calibration may configure the one or more playback devices of the media playback system spectrally across a given listening area. Such a calibration may help offset the acoustic characteristics of the environment generally, rather than being directed relatively more toward particular listening locations, as a spatial calibration is. A spectral calibration may include one or more filters that adjust the frequency response of the playback devices. In operation, one of the two or more calibrations may be applied to playback by the one or more playback devices, perhaps for different use cases. Example use cases include music playback and surround sound (i.e., home theater), among others.
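As a rough illustration of a filter that adjusts frequency response, the sketch below derives a per-band correction gain from a measured response. The flat target, the boost and cut limits, and the `correction_gains_db` helper are assumptions made for this example; the disclosure does not specify how spectral filters are computed.

```python
def correction_gains_db(measured_db, target_db, max_boost_db=6.0, max_cut_db=12.0):
    """Per-band gain (dB) that moves the measured response toward the target.

    The clamping limits are hypothetical safeguards against excessive boost.
    """
    gains = {}
    for band_hz, level_db in measured_db.items():
        gain = target_db - level_db
        gains[band_hz] = max(-max_cut_db, min(max_boost_db, gain))
    return gains


# Made-up measurement: a 5 dB peak at 100 Hz and a 10 dB dip at 8 kHz.
gains = correction_gains_db({100: 5.0, 1000: 0.0, 8000: -10.0}, target_db=0.0)
```

Here the 8 kHz dip would call for 10 dB of boost, which the assumed limit caps at 6 dB.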

In some example calibration procedures contemplated herein, a media playback system may perform a first calibration to determine a spatial calibration for the one or more playback devices of the media playback system. The media playback system may then apply the spatial calibration while the playback devices emit audio during a second calibration, in order to determine a spectral calibration. Such a calibration procedure may yield a calibration that includes both spatial and spectral corrections.

Example techniques may involve performing aspects of a spatial calibration. A first implementation may include detecting a trigger condition that initiates calibration of a media playback system that includes multiple audio drivers forming multiple sound axes, each sound axis corresponding to a respective channel of multi-channel audio content. The first implementation may also include causing the multiple audio drivers to emit calibration audio that is divided into constituent frames, with the multiple sound axes emitting the calibration audio during respective time slots of each constituent frame. The first implementation may further include recording the emitted calibration audio via a microphone. The first implementation may include causing a delay to be determined for each of the multiple sound axes, the determined delay for each sound axis being based on the time slots of the recorded calibration audio that correspond to that sound axis, and causing the multiple sound axes to be calibrated. Calibrating the multiple sound axes may involve causing the audio output of the multiple sound axes to be delayed according to the respective determined delays.
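One way to picture the per-axis delays is sketched below. It assumes that an arrival time at the microphone has already been estimated for each sound axis from its recorded time slots, and that the goal is for all axes to arrive together with the latest one. Both the alignment rule and the `alignment_delays` helper are assumptions for illustration, not the method stated in the disclosure.

```python
def alignment_delays(arrival_ms):
    """Delay (ms) to add to each axis so all axes arrive with the latest one."""
    latest = max(arrival_ms.values())
    return {axis: latest - t for axis, t in arrival_ms.items()}


# Made-up arrival times, e.g. as estimated from each axis's recorded time slot.
delays = alignment_delays({"left": 8.0, "center": 10.0, "right": 9.0})
```

The axis that is already last to arrive receives no added delay, and the others are held back by the difference.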

A second implementation may include receiving data representing one or more spatial filters that correspond to respective playback configurations. Each playback configuration may represent a particular set of sound axes formed via one or more audio drivers, and each sound axis may correspond to a respective channel of audio content. The second implementation may also involve causing the one or more audio drivers to output calibration audio that is divided into a repeating set of frames, the set of frames including a respective frame for each playback configuration. Causing the one or more audio drivers to output the calibration audio may involve causing an audio stage to apply, during each frame, the spatial filter corresponding to the respective playback configuration. The second implementation may further include receiving data representing one or more spectral filters that correspond to respective playback configurations, the one or more spectral filters being based on the calibration audio output by the one or more audio drivers. When playing back audio content in a given playback configuration, the audio stage may apply the particular spectral filter corresponding to the given playback configuration.
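The repeating set of frames, with one frame per playback configuration, can be sketched as a rotation over the configurations. The configuration names, the filter labels, and the `frame_plan` helper below are hypothetical placeholders; actual spatial filters would carry delay, phase, and gain data rather than strings.

```python
from itertools import cycle, islice


def frame_plan(spatial_filters, total_frames):
    """Return (frame_index, configuration, spatial_filter) for each frame.

    Frames rotate through the playback configurations, so each configuration's
    spatial filter is the one applied while that configuration's frame plays.
    """
    rotation = islice(cycle(spatial_filters.items()), total_frames)
    return [(i, name, filt) for i, (name, filt) in enumerate(rotation)]


plan = frame_plan({"music": "spatial_A", "home_theater": "spatial_B"}, total_frames=4)
```

With two configurations and four frames, the set of frames repeats twice, so a recording of the output contains two frames for each configuration from which a spectral filter could later be derived.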

A third implementation may include detecting a trigger condition that initiates calibration of a media playback system for multiple playback configurations. Each playback configuration represents a particular set of sound axes formed via multiple audio drivers of the media playback system, and each sound axis may correspond to a respective channel of audio content. The third implementation may also involve causing the multiple audio drivers to output calibration audio that is divided into a repeating set of frames, the set of frames including a respective frame for each playback configuration. Causing the multiple audio drivers to output the calibration audio may involve causing a respective set of spatial filters to be applied to the multiple audio drivers during each frame of the set of frames, each set of spatial filters including a respective spatial filter for each sound axis. The third implementation may further involve recording, via a microphone, the calibration audio output by the multiple audio drivers, and causing a processing device to determine, based on the recorded calibration audio, respective sets of spectral filters for the multiple playback configurations, each set of spectral filters including a respective spectral filter for each sound axis.

Each of these example implementations may be embodied as a method, a device configured to carry out the implementation, a system of devices configured to carry out the implementation, or a non-transitory computer-readable medium containing instructions that are executable by one or more processors to carry out the implementation, among other examples. One of ordinary skill in the art will appreciate that this disclosure includes numerous other embodiments, including combinations of the example features described herein. Further, any example operation described as being performed by a given device to illustrate a technique may be performed by any suitable device, including the devices described herein. Yet further, any device may cause another device to perform any of the operations described herein.

While some examples described herein may refer to functions performed by given actors such as "users" and/or other entities, it should be understood that this description is for purposes of explanation only. The claims should not be interpreted to require action by any such example actor unless explicitly required by the language of the claims themselves.

II. Example Operating Environment

FIG. 1 shows an example configuration of a media playback system 100 in which one or more embodiments disclosed herein may be practiced or implemented. The media playback system 100 as shown is associated with an example home environment having several rooms and spaces, such as a master bedroom, an office, a dining room, and a living room. As shown in the example of FIG. 1, the media playback system 100 includes playback devices 102 through 124, control devices 126 and 128, and a wired or wireless network router 130.

Further discussion relating to the different components of the example media playback system 100, and how the different components may interact to provide a user with a media experience, may be found in the following sections. While discussions herein may generally refer to the example media playback system 100, the technologies described herein are not limited to applications within, among other things, the home environment shown in FIG. 1. For instance, the technologies described herein may be useful in environments where multi-zone audio may be desired, such as a commercial setting like a restaurant, mall, or airport; a vehicle like a sport utility vehicle (SUV), bus, or car; a ship or boat; an airplane; and so on.

a. Example Playback Devices

FIG. 2 shows a functional block diagram of an example playback device 200 that may be configured to be one or more of the playback devices 102 through 124 of the media playback system 100 of FIG. 1. The playback device 200 may include a processor 202, software components 204, memory 206, audio processing components 208, audio amplifier(s) 210, speaker(s) 212, and a network interface 214 including wireless interface(s) 216 and wired interface(s) 218. In one case, the playback device 200 may not include the speaker(s) 212, but rather a speaker interface for connecting the playback device 200 to external speakers. In another case, the playback device 200 may include neither the speaker(s) 212 nor the audio amplifier(s) 210, but rather an audio interface for connecting the playback device 200 to an external audio amplifier or audio-visual receiver.

In one example, the processor 202 may be a clock-driven computing component configured to process input data according to instructions stored in the memory 206. The memory 206 may be a tangible computer-readable medium configured to store instructions executable by the processor 202. For instance, the memory 206 may be data storage that can be loaded with one or more of the software components 204 executable by the processor 202 to achieve certain functions. In one example, the functions may involve the playback device 200 retrieving audio data from an audio source or another playback device. In another example, the functions may involve the playback device 200 sending audio data to another device or playback device on a network. In yet another example, the functions may involve pairing the playback device 200 with one or more other playback devices to create a multi-channel audio environment.

Certain functions may involve the playback device 200 synchronizing playback of audio content with one or more other playback devices. During synchronous playback, a listener will preferably be unable to perceive time-delay differences between playback of the audio content by the playback device 200 and playback by the one or more other playback devices. U.S. Patent No. 8,234,395, entitled "System and method for synchronizing operations among a plurality of independently clocked digital data processing devices," which is hereby incorporated by reference, provides in more detail some examples of audio playback synchronization among playback devices.

The memory 206 may further be configured to store data associated with the playback device 200, such as one or more zones and/or zone groups that the playback device 200 is a part of, audio sources accessible by the playback device 200, or a playback queue that the playback device 200 (or some other playback device) may be associated with. The data may be stored as one or more state variables that are periodically updated and used to describe the state of the playback device 200. The memory 206 may also include data that is associated with the state of the other devices of the media system and that is shared from time to time among the devices, so that one or more of the devices has the most recent data associated with the system. Other embodiments are also possible.
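The state variables described above might be modeled roughly as follows. This is a sketch under stated assumptions: the `PlaybackDeviceState` class, its field names, and the `version` counter are hypothetical and do not reflect how any actual playback device stores its state.

```python
from dataclasses import dataclass, field


@dataclass
class PlaybackDeviceState:
    """Hypothetical container for the state variables of one playback device."""
    zones: list = field(default_factory=list)           # zones / zone groups
    audio_sources: list = field(default_factory=list)   # accessible sources
    playback_queue: list = field(default_factory=list)  # associated queue
    version: int = 0  # bumped on every update so peers can spot newer data

    def update(self, **changes):
        """Apply changes to known state variables and bump the version."""
        for name, value in changes.items():
            if name not in ("zones", "audio_sources", "playback_queue"):
                raise AttributeError(f"unknown state variable: {name}")
            setattr(self, name, value)
        self.version += 1


state = PlaybackDeviceState()
state.update(zones=["Living Room"])
```

A version counter is one simple way for devices that share state from time to time to decide which copy is the most recent; other schemes, such as timestamps, would serve the same purpose.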

The audio processing components 208 may include one or more digital-to-analog converters (DACs), an audio preprocessing component, an audio enhancement component, or a digital signal processor (DSP), and so on. In one embodiment, one or more of the audio processing components 208 may be a subcomponent of the processor 202. In one example, audio content may be processed and/or intentionally altered by the audio processing components 208 to produce audio signals. The produced audio signals may then be provided to the audio amplifier(s) 210 for amplification and playback through the speaker(s) 212. In particular, the audio amplifier(s) 210 may include devices configured to amplify audio signals to a level suitable for driving one or more of the speakers 212. The speaker(s) 212 may include an individual transducer (e.g., a "driver") or a complete speaker system involving an enclosure with one or more drivers. A particular driver of the speaker(s) 212 may include, for example, a subwoofer (e.g., for low frequencies), a mid-range driver (e.g., for middle frequencies), and/or a tweeter (e.g., for high frequencies). In some cases, each transducer in the one or more speakers 212 may be driven by an individual corresponding audio amplifier of the audio amplifier(s) 210. In addition to producing analog signals for playback by the playback device 200, the audio processing components 208 may be configured to process audio content to be sent to one or more other playback devices for playback.

Audio content to be processed and/or played back by the playback device 200 may be received from an external source, such as via an audio line-in input connection (e.g., an auto-detecting 3.5mm audio line-in connection) or the network interface 214.

The network interface 214 may be configured to facilitate a data flow between the playback device 200 and one or more other devices on a data network. As such, the playback device 200 may be configured to receive audio content over the data network from one or more other playback devices in communication with the playback device 200, from network devices within a local area network, or from audio content sources over a wide area network such as the Internet. In one example, the audio content and other signals transmitted and received by the playback device 200 may be transmitted in the form of digital packet data containing an Internet Protocol (IP)-based source address and IP-based destination addresses. In such a case, the network interface 214 may be configured to parse the digital packet data so that the data destined for the playback device 200 is properly received and processed by the playback device 200.

As shown, the network interface 214 may include wireless interface(s) 216 and wired interface(s) 218. The wireless interface(s) 216 may provide network interface functions for the playback device 200 to communicate wirelessly with other devices (e.g., other playback devices, speakers, receivers, network devices, and control devices within a data network that the playback device 200 is associated with) in accordance with a communication protocol (e.g., any wireless standard, including IEEE 802.11a, 802.11b, 802.11g, 802.11n, 802.11ac, 802.15, 4G mobile communication standards, and so on). The wired interface(s) 218 may provide network interface functions for the playback device 200 to communicate over a wired connection with other devices in accordance with a communication protocol (e.g., IEEE 802.3). While the network interface 214 shown in FIG. 2 includes both wireless interface(s) 216 and wired interface(s) 218, the network interface 214 may in some embodiments include only wireless interface(s) or only wired interface(s).

In one example, the playback device 200 and one other playback device may be paired to play two separate audio components of audio content. For instance, the playback device 200 may be configured to play a left-channel audio component, while the other playback device may be configured to play a right-channel audio component, thereby producing or enhancing a stereo effect of the audio content. The paired playback devices (also referred to as "bonded playback devices") may further play audio content in synchrony with other playback devices.

In another example, the playback device 200 may be sonically consolidated with one or more other playback devices to form a single, consolidated playback device. Because a consolidated playback device may have additional speaker drivers through which audio content may be rendered, it may be configured to process and reproduce sound differently than an unconsolidated playback device or paired playback devices. For instance, if the playback device 200 is a playback device designed to render low-frequency-range audio content (i.e., a subwoofer), the playback device 200 may be consolidated with a playback device designed to render full-frequency-range audio content. In such a case, the full-frequency-range playback device, when consolidated with the low-frequency playback device 200, may be configured to render only the mid-frequency and high-frequency components of the audio content, while the low-frequency-range playback device 200 renders the low-frequency component of the audio content. The consolidated playback device may further be paired with a single playback device or with yet another consolidated playback device.
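The division of labor between a subwoofer and a full-range device in a consolidated playback device can be sketched as a routing decision at a crossover frequency. The 80 Hz value and the `split_bands` helper are assumptions chosen for illustration; the disclosure does not state a crossover frequency, and a real device would use crossover filters on the audio signal rather than sorting a list of frequencies.

```python
CROSSOVER_HZ = 80.0  # hypothetical crossover point, not from the disclosure


def split_bands(component_hz):
    """Route frequency components to the device that renders them."""
    low = [f for f in component_hz if f < CROSSOVER_HZ]
    mid_high = [f for f in component_hz if f >= CROSSOVER_HZ]
    return {"subwoofer": low, "full_range": mid_high}


routing = split_bands([40.0, 60.0, 250.0, 4000.0])
```

Once consolidated, the full-range device in this sketch no longer renders the components below the crossover, which are left to the subwoofer.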

举例来说，SONOS公司目前公开发售(或已经公开发售)某些回放设备，某些回放设备包括“PLAY：1”、“PLAY：3”、“PLAY：5”、“PLAYBAR”、“CONNECT：AMP”、“CONNECT”和“SUB”。另外地或可替选地，任何其他过去的、现在的和/或将来的回放设备可以用于实现本文公开的示例实施方式的回放设备。另外，应当理解的是，回放设备不限于图2中示出的示例或SONOS产品供应。例如，回放设备可以包括有线或无线耳机。在另一示例中，回放设备可以包括用于个人移动媒体回放设备的对接站或与该对接站交互。在又一示例中，回放设备可以集成至另一设备或组件如电视、照明器材或供室内或室外使用的一些其他设备。For example, SONOS Corporation currently offers (or has already publicly offered) certain playback devices including "PLAY: 1", "PLAY: 3", "PLAY: 5", "PLAYBAR", "CONNECT: AMP", "CONNECT", and "SUB". Additionally or alternatively, any other past, present and/or future playback devices may be used to implement playback devices of the example embodiments disclosed herein. Additionally, it should be understood that the playback device is not limited to the example or SONOS product offering shown in FIG. 2 . For example, playback devices may include wired or wireless headphones. In another example, the playback device may include or interact with a docking station for a personal mobile media playback device. In yet another example, the playback device may be integrated into another device or component such as a television, lighting fixture, or some other device for indoor or outdoor use.

b.示例回放区配置 b. Example playback zone configuration

返回参照图1的媒体回放系统100，环境可以具有一个或更多个回放区，每个回放区具有一个或更多个回放设备。媒体回放系统100可以与一个或更多个回放区一起建立，此后，一个或更多个区可以被添加或移除以得到图1中示出的示例配置。可以根据不同的房间或空间——例如办公室、浴室、主卧室、卧室、厨房、餐厅、起居室和/或阳台——给每个区命名。在一种情况下，单个回放区可以包括多个房间或空间。在另一种情况下，单个房间或空间可以包括多个回放区。Referring back to the media playback system 100 of FIG. 1, an environment may have one or more playback zones, each playback zone having one or more playback devices. The media playback system 100 may be established with one or more playback zones, after which one or more zones may be added or removed to achieve the example configuration shown in FIG. 1 . Each zone can be named after a different room or space - such as an office, bathroom, master bedroom, bedroom, kitchen, dining room, living room and/or balcony. In one case, a single playback zone may include multiple rooms or spaces. In another case, a single room or space may include multiple playback zones.

如图1示出的，阳台、餐厅、厨房、浴室、办公室和卧室区各有一个回放设备，而起居室区和主卧室区各有多个回放设备。在起居室区中，回放设备104、106、108和110可以被配置成：作为单独的回放设备、作为一个或更多个绑定的回放设备、作为一个或更多个联合的回放设备或者其任意组合来同步地播放音频内容。类似地，在主卧室的情况下，回放设备122和124可以被配置成：作为单独的回放设备、作为绑定的回放设备或者作为联合的回放设备来同步地播放音频内容。As shown in FIG. 1, the balcony, dining room, kitchen, bathroom, office, and bedroom zones each have one playback device, while the living room and master bedroom zones each have multiple playback devices. In the living room zone, playback devices 104, 106, 108, and 110 may be configured to play audio content in synchrony as individual playback devices, as one or more bonded playback devices, as one or more consolidated playback devices, or as any combination thereof. Similarly, in the case of the master bedroom, playback devices 122 and 124 may be configured to play audio content in synchrony as individual playback devices, as a bonded playback device, or as a consolidated playback device.

在一个示例中，图1的环境中的一个或更多个回放区可以各自正在播放不同的音频内容。例如，用户可以正在阳台区中烧烤并且收听正由回放设备102播放的嘻哈音乐，同时另一用户可以正在厨房区中准备食物并且收听正由回放设备114播放的古典音乐。在另一示例中，回放区可以与另一回放区同步地播放相同的音频内容。例如，用户可以在办公室区中，在该办公室区中回放设备118正在播放与阳台区中的回放设备102正播放的摇滚音乐相同的摇滚音乐。在这样的情况下，回放设备102和回放设备118可以同步播放摇滚音乐，使得当用户在不同回放区之间移动时可以无缝地(或至少基本上无缝地)享受正被大声播放的音频内容。如在先前引用的第8,234,395号美国专利中描述的，可以以与回放设备之间的同步方式类似的方式实现回放区之间的同步。In one example, one or more playback zones in the environment of FIG. 1 may each be playing different audio content. For example, a user may be grilling in the balcony zone and listening to hip hop music being played by playback device 102, while another user may be preparing food in the kitchen zone and listening to classical music being played by playback device 114. In another example, a playback zone may play the same audio content in synchrony with another playback zone. For example, a user may be in the office zone, where playback device 118 is playing the same rock music that playback device 102 is playing in the balcony zone. In such a case, playback devices 102 and 118 may play the rock music in synchrony, so that the user can seamlessly (or at least substantially seamlessly) enjoy the audio content being played out loud while moving between the different playback zones. Synchronization between playback zones may be achieved in a manner similar to synchronization between playback devices, as described in previously referenced U.S. Patent No. 8,234,395.

如上面提出的，可以动态地修改媒体回放系统100的区配置，并且在一些实施方式中，媒体回放系统100支持许多配置。例如，如果用户物理地将一个或更多个回放设备移动至区或从区移出一个或更多个回放设备，则媒体回放系统100可以被重新配置以适应改变。例如，如果用户物理地将回放设备102从阳台区移动至办公室区，则办公室区现在可以包括回放设备118和回放设备102两者。如果需要，则可以经由控制设备如控制设备126和控制设备128将回放设备102与办公室区配对或分组在一起和/或对该回放设备102重命名。另一方面，如果一个或更多个回放设备被移动至家庭环境中的尚不是回放区的特定区域，则可以针对该特定区域创建新的回放区。As suggested above, the zone configurations of the media playback system 100 may be dynamically modified, and in some embodiments the media playback system 100 supports numerous configurations. For example, if a user physically moves one or more playback devices into or out of a zone, the media playback system 100 may be reconfigured to accommodate the change. For instance, if the user physically moves playback device 102 from the balcony zone to the office zone, the office zone may now include both playback device 118 and playback device 102. If desired, playback device 102 may be paired or grouped with the office zone and/or renamed via a control device such as control device 126 or control device 128. On the other hand, if one or more playback devices are moved to a particular area of the home environment that is not already a playback zone, a new playback zone may be created for that area.

此外，可以将媒体回放系统100的不同回放区动态地组合成区组或将其划分成单独的回放区。例如，可以将餐厅区和厨房区114组合成用于晚餐聚会的区组，使得回放设备112和114可以同步地呈现音频内容。另一方面，如果用户希望在起居室空间中收听音乐而另一用户希望看电视，则可以将起居室区划分成包括回放设备104的电视区以及包括回放设备106、108和110的收听区。Furthermore, different playback zones of the media playback system 100 may be dynamically combined into zone groups or split into individual playback zones. For example, the dining room zone and the kitchen zone 114 may be combined into a zone group for a dinner party, so that playback devices 112 and 114 may render audio content in synchrony. On the other hand, if one user wishes to listen to music in the living room space while another user wishes to watch television, the living room zone may be split into a television zone including playback device 104 and a listening zone including playback devices 106, 108, and 110.
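
By way of illustration only, the combining and splitting of zones described above might be modeled as in the following Python sketch. The `Zone` and `ZoneGroup` classes and the `group_zones` and `split_zone` helpers are hypothetical names introduced for this example and are not part of the disclosed system; the device numbers follow FIG. 1.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Zone:
    """Hypothetical model of a playback zone: a name plus its playback devices."""
    name: str
    device_ids: list[int] = field(default_factory=list)


@dataclass
class ZoneGroup:
    """Hypothetical model of a zone group whose devices play in synchrony."""
    zones: list[Zone] = field(default_factory=list)

    def devices(self) -> list[int]:
        # Every device of every member zone takes part in synchronous playback.
        return [d for zone in self.zones for d in zone.device_ids]


def group_zones(*zones: Zone) -> ZoneGroup:
    """Combine individual zones into one zone group."""
    return ZoneGroup(zones=list(zones))


def split_zone(zone: Zone, assignments: dict[str, list[int]]) -> list[Zone]:
    """Split one zone into several; each device must be assigned exactly once."""
    assigned = sorted(d for ids in assignments.values() for d in ids)
    if assigned != sorted(zone.device_ids):
        raise ValueError("each device of the zone must appear in exactly one new zone")
    return [Zone(name, list(ids)) for name, ids in assignments.items()]


# Dinner party: dining room (device 112) grouped with kitchen (device 114).
dinner = group_zones(Zone("Dining Room", [112]), Zone("Kitchen", [114]))

# Living room split into a television zone and a listening zone.
tv, listening = split_zone(
    Zone("Living Room", [104, 106, 108, 110]),
    {"Television": [104], "Listening": [106, 108, 110]},
)
```

In this sketch a split is rejected unless every device of the original zone lands in exactly one new zone; that is a design choice of the example, not a requirement stated in the text.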

c.示例控制设备 c. Example control device

图3示出了示例控制设备300的功能框图，该示例控制设备300可以被配置为媒体回放系统100的控制设备126和控制设备128中的一者或两者。控制设备300还可以被称为控制器300。如示出的，控制设备300可以包括处理器302、存储器304、网络接口306、用户接口308。在一个示例中，控制设备300可以是用于媒体回放系统100的专用控制器。在另一示例中，控制设备300可以是可以安装媒体回放系统控制器应用软件的网络设备，例如，iPhone™、iPad™或者任何其他智能电话、平板计算机或网络设备(例如，联网的计算机如PC或Mac™)。FIG. 3 shows a functional block diagram of an example control device 300 that may be configured as one or both of control device 126 and control device 128 of the media playback system 100. The control device 300 may also be referred to as the controller 300. As shown, the control device 300 may include a processor 302, memory 304, a network interface 306, and a user interface 308. In one example, the control device 300 may be a dedicated controller for the media playback system 100. In another example, the control device 300 may be a network device on which media playback system controller application software can be installed, such as an iPhone™, an iPad™, or any other smartphone, tablet, or network device (e.g., a networked computer such as a PC or Mac™).

处理器302可以被配置成执行与促进对媒体回放系统100的用户访问、控制和配置有关的功能。存储器304可以被配置成存储可由处理器302执行以执行那些功能的指令。存储器304还可以被配置成存储媒体回放系统控制器应用软件以及与媒体回放系统100和用户相关联的其他数据。The processor 302 may be configured to perform functions related to facilitating user access, control, and configuration of the media playback system 100 . Memory 304 may be configured to store instructions executable by processor 302 to perform those functions. Memory 304 may also be configured to store media playback system controller application software and other data associated with media playback system 100 and users.

在一个示例中，网络接口306可以基于行业标准(例如，包括IEEE802.3的红外、无线电、有线标准，包括IEEE 802.11a、802.11b、802.11g、802.11n、802.11ac、802.15的无线标准、4G移动通信标准等)。网络接口306可以提供用于控制设备300与媒体回放系统100中的其他设备进行通信的手段。在一个示例中，可以经由网络接口306在控制设备300与其他设备之间传送数据和信息(例如，如状态变量)。例如，控制设备300可以经由网络接口306从回放设备或另一网络设备接收媒体回放系统100中的回放区和区组配置或者控制设备300可以经由网络接口306将媒体回放系统100中的回放区和区组配置发送至另一回放设备或网络设备。在一些情况下，其他网络设备可以是另一控制设备。In one example, the network interface 306 may be based on an industry standard (e.g., infrared, radio, wired standards including IEEE 802.3, wireless standards including IEEE 802.11a, 802.11b, 802.11g, 802.11n, 802.11ac, and 802.15, 4G mobile communication standards, and so on). The network interface 306 may provide a means for the control device 300 to communicate with other devices in the media playback system 100. In one example, data and information (e.g., state variables) may be communicated between the control device 300 and other devices via the network interface 306. For example, the control device 300 may receive playback zone and zone group configurations in the media playback system 100 from a playback device or another network device via the network interface 306, or the control device 300 may send such configurations to another playback device or network device via the network interface 306. In some cases, the other network device may be another control device.

还可以经由网络接口306将回放设备控制命令如音量控制和音频回放控制从控制设备300传送至回放设备。如上面提出的，用户还可以使用控制设备300来执行对媒体回放系统100的配置的改变。配置改变可以包括：将一个或更多个回放设备添加至区或从区移除一个或更多个回放设备；将一个或更多个区添加至区组或从区组移除一个或更多个区；形成绑定的或联合的播放器；从绑定的或联合的播放器分离一个或更多个回放设备等。因此，有时可以将控制设备300称为控制器，而不论控制设备300是专用控制器还是安装有媒体回放系统控制器应用软件的网络设备。Playback device control commands, such as volume control and audio playback control, may also be communicated from the control device 300 to a playback device via the network interface 306. As suggested above, a user may also use the control device 300 to change the configuration of the media playback system 100. The configuration changes may include: adding one or more playback devices to a zone or removing one or more playback devices from a zone; adding one or more zones to a zone group or removing one or more zones from a zone group; forming a bonded or consolidated player; and separating one or more playback devices from a bonded or consolidated player, among others. Accordingly, the control device 300 may sometimes be referred to as a controller, regardless of whether the control device 300 is a dedicated controller or a network device on which media playback system controller application software is installed.

控制设备300的用户接口308可以被配置成通过提供控制器接口例如图4示出的控制器接口400来促进对媒体回放系统100的用户访问和控制。控制器接口400包括回放控制区域410、回放区区域420、回放状态区域430、回放队列区域440以及音频内容源区域450。示出的用户接口400仅是可以在网络设备如图3的控制设备300(和/或图1的控制设备126和128)上提供并且由用户访问以控制媒体回放系统如媒体回放系统100的用户接口的一个示例。可替选地，可以在一个或更多个网络设备上实现不同格式、样式和交互序列的其他用户接口以提供对媒体回放系统的可比较的控制访问。The user interface 308 of the control device 300 may be configured to facilitate user access to and control of the media playback system 100 by providing a controller interface, such as the controller interface 400 shown in FIG. 4. The controller interface 400 includes a playback control area 410, a playback zone area 420, a playback status area 430, a playback queue area 440, and an audio content sources area 450. The user interface 400 as shown is just one example of a user interface that may be provided on a network device, such as the control device 300 of FIG. 3 (and/or the control devices 126 and 128 of FIG. 1), and accessed by users to control a media playback system such as the media playback system 100. Alternatively, other user interfaces of varying formats, styles, and interaction sequences may be implemented on one or more network devices to provide comparable control access to a media playback system.

回放控制区域410可以包括用于使所选择的回放区或区组中的回放设备播放或暂停、快进、倒回、跳到下一首、跳到上一首、进入/退出随机模式、进入/退出重复模式、进入/退出交叉渐变模式的可选(例如，通过触摸或通过使用光标)图标。回放控制区域410还可以包括用于修改均衡设定和回放音量以及其他可能性的可选图标。The playback control area 410 may include selectable (e.g., by touch or by use of a cursor) icons that cause playback devices in a selected playback zone or zone group to play or pause, fast forward, rewind, skip to the next track, skip to the previous track, enter or exit shuffle mode, enter or exit repeat mode, and enter or exit crossfade mode. The playback control area 410 may also include selectable icons for modifying equalization settings and playback volume, among other possibilities.

回放区区域420可以包括媒体回放系统100内的回放区的表示。在一些实施方式中，回放区的图形表示可以是可选的以产生另外的可选图标，以管理或配置媒体回放系统中的回放区，例如，绑定区的创建、区组的创建、区组的分离和区组的重命名，以及其他可能性。The playback zone area 420 may include representations of the playback zones within the media playback system 100. In some embodiments, the graphical representations of the playback zones may be selectable to bring up additional selectable icons for managing or configuring the playback zones in the media playback system, such as creating bonded zones, creating zone groups, separating zone groups, and renaming zone groups, among other possibilities.

例如，如示出的，可以在回放区的图形表示中的每一个内设置“分组”图标。在特定区的图形表示内设置的“分组”图标可以是可选的，以产生用于选择媒体回放系统中的要与该特定区分组在一起的一个或更多个其他区的选项。一旦被分组，已经与特定区分组在一起的区中的回放设备将被配置成与特定区中的回放设备同步地播放音频内容。类似地，可以在区组的图形表示内设置“分组”图标。在这种情况下，“分组”图标可以是可选的，以产生用于取消选择区组中的要从该区组移除的一个或更多个区的选项。用于经由用户接口如用户接口400对区进行分组和取消分组的其他交互和实现也是可行的。随着回放区或区组配置被修改，可以动态地更新回放区区域420中的回放区的表示。For example, as shown, a "group" icon may be provided within each of the graphical representations of the playback zones. The "group" icon provided within the graphical representation of a particular zone may be selectable to bring up options for selecting one or more other zones in the media playback system to be grouped with that particular zone. Once grouped, playback devices in the zones that have been grouped with the particular zone will be configured to play audio content in synchrony with the playback devices in the particular zone. Similarly, a "group" icon may be provided within the graphical representation of a zone group. In this case, the "group" icon may be selectable to bring up options for deselecting one or more zones in the zone group that are to be removed from the zone group. Other interactions and implementations for grouping and ungrouping zones via a user interface such as the user interface 400 are also possible. The representations of the playback zones in the playback zone area 420 may be dynamically updated as playback zone or zone group configurations are modified.

回放状态区域430可以包括所选择的回放区或区组中的当前正在被播放、先前被播放或被安排接下来要播放的音频内容的图形表示。可以在用户接口上（例如在回放区区域420和/或回放状态区域430内）在视觉上区分所选择的回放区或区组。图形表示可以包括曲目名称、艺术家姓名、专辑名、专辑年份、曲目长度以及对用户了解何时经由用户接口400来控制媒体回放系统而言有用的其他相关信息。The playback status area 430 may include graphical representations of audio content that is currently being played, was previously played, or is scheduled to play next in the selected playback zone or zone group. The selected playback zone or zone group may be visually distinguished on the user interface, for example within the playback zone area 420 and/or the playback status area 430. The graphical representations may include track title, artist name, album name, album year, track length, and other relevant information that may be useful for the user to know when controlling the media playback system via the user interface 400.

回放队列区域440可以包括与所选择的回放区或区组相关联的回放队列中的音频内容的图形表示。在一些实施方式中，每个回放区或区组可以与这样的回放队列相关联，该回放队列包含与用于由回放区或区组回放的零个或更多个音频项相对应的信息。例如，回放队列中的每个音频项可以包括统一资源标识符(URI)、统一资源定位符(URL)或一些其他标识符，其他标识符可以由回放区或区组中的回放设备使用以从本地音频内容源或联网音频内容源查找和/或检索音频项，可能用于由回放设备回放。The playback queue area 440 may include graphical representations of audio content in a playback queue associated with the selected playback zone or zone group. In some embodiments, each playback zone or zone group may be associated with a playback queue containing information corresponding to zero or more audio items for playback by the playback zone or zone group. For example, each audio item in the playback queue may include a uniform resource identifier (URI), a uniform resource locator (URL), or some other identifier that a playback device in the playback zone or zone group may use to find and/or retrieve the audio item from a local audio content source or a networked audio content source, possibly for playback by the playback device.

在一个示例中，可以将播放列表添加至回放队列，在这种情况下，可以将与播放列表中的每个音频项相对应的信息添加至回放队列。在另一示例中，可以将回放队列中的音频项保存为播放列表。在又一示例中，当回放区或区组正在连续播放流式音频内容（例如，可以连续播放直到以其他方式被停止的互联网广播），而不是播放具有回放持续时间的离散音频项时，回放队列可以是空的或者被填充但“不在使用中”。在替选实施方式中，回放队列可以包括互联网广播和/或其他流式音频内容项并且当回放区或区组正在播放那些项时回放队列“在使用中”。其他示例也是可行的。In one example, a playlist may be added to a playback queue, in which case information corresponding to each audio item in the playlist may be added to the playback queue. In another example, audio items in a playback queue may be saved as a playlist. In yet another example, a playback queue may be empty, or populated but "not in use," when the playback zone or zone group is continuously playing streaming audio content, such as Internet radio that may play continuously until otherwise stopped, rather than discrete audio items that have playback durations. In an alternative embodiment, a playback queue may include Internet radio and/or other streaming audio content items and be "in use" when the playback zone or zone group is playing those items. Other examples are also possible.

当回放区或区组被“分组”或被“取消分组”时，可以清除或重新关联与受影响的回放区或区组相关联的回放队列。例如，如果将包括第一回放队列的第一回放区与包括第二回放队列的第二回放区分组在一起，则所建立的区组可以具有相关联的回放队列，该相关联的回放队列最初为空，包含来自第一回放队列的音频项(例如，如果第二回放区被添加至第一回放区)，包含来自第二回放队列的音频项(例如，如果第一回放区被添加至第二回放区)，或者来自第一回放队列和第二回放队列两者的音频项的组合。随后，如果建立的区组被取消分组，则所得到的第一回放区可以与先前的第一回放队列重新关联，或者可以与以下新的回放队列相关联，该新的回放队列是空的或者包含来自在所建立的区组被取消分组之前与所建立的区组相关联的回放队列的音频项。类似地，所得到的第二回放区可以与先前的第二回放队列重新关联，或者与以下新的回放队列相关联，该新的回放队列是空的或者包含来自在所建立的区组被取消分组之前与所建立的区组相关联的回放队列的音频项。其他示例也是可行的。When playback zones or zone groups are "grouped" or "ungrouped," the playback queues associated with the affected playback zones or zone groups may be cleared or re-associated. For example, if a first playback zone including a first playback queue is grouped with a second playback zone including a second playback queue, the established zone group may have an associated playback queue that is initially empty, that contains audio items from the first playback queue (e.g., if the second playback zone was added to the first playback zone), that contains audio items from the second playback queue (e.g., if the first playback zone was added to the second playback zone), or that contains a combination of audio items from both the first and second playback queues. Subsequently, if the established zone group is ungrouped, the resulting first playback zone may be re-associated with the previous first playback queue, or may be associated with a new playback queue that is empty or that contains audio items from the playback queue associated with the established zone group before it was ungrouped. Similarly, the resulting second playback zone may be re-associated with the previous second playback queue, or may be associated with a new playback queue that is empty or that contains audio items from the playback queue associated with the established zone group before it was ungrouped. Other examples are also possible.
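
The queue outcomes enumerated above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `QueuePolicy` enumeration and both function names are hypothetical, a playback queue is modeled simply as a list of item identifiers, and the text does not prescribe which outcome a given system selects.

```python
from enum import Enum


class QueuePolicy(Enum):
    """Hypothetical labels for the four grouping outcomes described in the text."""
    EMPTY = "empty"
    FROM_FIRST = "from_first"    # e.g., second zone was added to the first zone
    FROM_SECOND = "from_second"  # e.g., first zone was added to the second zone
    COMBINED = "combined"


def queue_for_new_group(first_queue, second_queue, policy):
    """Return the playback queue initially associated with a newly formed zone group."""
    if policy is QueuePolicy.EMPTY:
        return []
    if policy is QueuePolicy.FROM_FIRST:
        return list(first_queue)
    if policy is QueuePolicy.FROM_SECOND:
        return list(second_queue)
    # COMBINED: this sketch simply appends the second queue after the first.
    return list(first_queue) + list(second_queue)


def queue_after_ungroup(previous_queue, group_queue, restore_previous, keep_group_items=False):
    """Return the queue for a zone that has just left a zone group.

    The zone is either re-associated with its previous queue, or given a new
    queue that is empty or that holds the items of the group's queue.
    """
    if restore_previous:
        return list(previous_queue)
    return list(group_queue) if keep_group_items else []


first = ["uri:track-a", "uri:track-b"]
second = ["uri:track-c"]
group_queue = queue_for_new_group(first, second, QueuePolicy.COMBINED)
restored = queue_after_ungroup(first, group_queue, restore_previous=True)
```

The ordering used for the combined case is an assumption of this sketch; the text only says the queue may contain a combination of items from both queues.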

返回参照图4的用户接口400，回放队列区域440中的音频内容的图形表示可以包括曲目名称、艺术家姓名、曲目长度以及与回放队列中的音频内容相关联的其他相关信息。在一个示例中，音频内容的图形表示可以是可选的以产生另外的可选图标，以管理和/或操纵回放队列和/或回放队列中呈现的音频内容。例如，所呈现的音频内容可以从回放队列中移除，可以被移动至回放队列内的不同位置，或者被选择以立即播放或者在任何当前播放的音频内容之后播放，以及其他可能性。与回放区或区组相关联的回放队列可以被存储在回放区或区组中的一个或更多个回放设备上的存储器中、不在回放区或区组中的回放设备上的存储器中和/或一些其他指定设备上的存储器中。对这样的回放队列的回放可以涉及一个或更多个回放设备可能以连续顺序或随机顺序回放该队列的媒体项。Referring back to user interface 400 of FIG. 4, the graphical representation of audio content in playback queue area 440 may include track name, artist name, track length, and other relevant information associated with the audio content in the playback queue. In one example, the graphical representation of the audio content may be selectable to generate additional selectable icons to manage and/or manipulate the playback queue and/or audio content presented in the playback queue. For example, the rendered audio content may be removed from the playback queue, moved to a different position within the playback queue, or selected to be played immediately or after any currently playing audio content, among other possibilities. A playback queue associated with a playback zone or zone may be stored in memory on one or more playback devices in the playback zone or zone, in memory on playback devices not in the playback zone or zone and/or or in memory on some other specified device. Playback of such a playback queue may involve one or more playback devices playing back the queue's media items, possibly in sequential order or random order.

音频内容源区域450可以包括可选音频内容源的图形表示，可以从该可选音频内容源检索音频内容并且由所选择的回放区或区组播放音频内容。可以在下面部分中找到关于音频内容源的论述。Audio content sources area 450 may include a graphical representation of selectable audio content sources from which audio content may be retrieved and played by the selected playback zone or group of zones. A discussion of audio content sources can be found in the following sections.

图5描绘了智能电话500，该智能电话包括一个或更多个处理器、有形计算机可读存储器、网络接口和显示器。智能电话500可以是图1的控制设备126或128或者图3的控制设备300或者本文描述的其他控制设备的示例实现。借助示例，将参考智能电话500以及智能电话500在作为媒体回放系统(例如，媒体回放系统100)的控制设备操作时可以显示的某些控制接口、提示和其他图形元素。在示例内，这样的接口和元素可以由任何合适的控制设备（例如智能电话、平板计算机、膝上型计算机或台式计算机、个人媒体播放器或远程控制设备）显示。FIG. 5 depicts a smartphone 500 that includes one or more processors, tangible computer-readable memory, a network interface, and a display. The smartphone 500 may be an example implementation of the control device 126 or 128 of FIG. 1, of the control device 300 of FIG. 3, or of other control devices described herein. By way of example, reference will be made to the smartphone 500 and to certain control interfaces, prompts, and other graphical elements that the smartphone 500 may display when operating as a control device of a media playback system (e.g., the media playback system 100). Within examples, such interfaces and elements may be displayed by any suitable control device, such as a smartphone, tablet computer, laptop or desktop computer, personal media player, or remote control device.

当作为媒体回放系统的控制设备操作时，智能电话500可以显示一个或更多个控制器接口，例如控制器接口400。类似于图4的回放控制区域410、回放区区域420、回放状态区域430、回放队列区域440和/或音频内容源区域450，智能电话500可以显示一个或更多个相应的接口，例如回放控制接口、回放区接口、回放状态接口、回放队列接口和/或音频内容源接口。示例控制设备可能利用例如智能电话或其他手持设备来显示屏幕尺寸相对受限的单独接口(而不是区域)。When operating as a control device of a media playback system, the smartphone 500 may display one or more controller interfaces, such as the controller interface 400. Similar to the playback control area 410, playback zone area 420, playback status area 430, playback queue area 440, and/or audio content sources area 450 of FIG. 4, the smartphone 500 may display one or more corresponding interfaces, such as a playback control interface, a playback zone interface, a playback status interface, a playback queue interface, and/or an audio content sources interface. Example control devices with relatively limited screen size, such as smartphones or other handheld devices, might display separate interfaces (rather than areas).

d.示例音频内容源 d. Example audio content sources

如前面指出的，区或区组中的一个或更多个回放设备可以被配置成从各种可用音频内容源检索用于回放的音频内容(例如，根据音频内容的相应URI或URL)。在一个示例中，可以由回放设备直接从相应的音频内容源(例如，线路输入连接)检索音频内容。在另一示例中，可以经由一个或更多个其他回放设备或网络设备通过网络将音频内容提供给回放设备。As noted previously, one or more playback devices in a zone or group of zones may be configured to retrieve audio content for playback from various available audio content sources (eg, according to a corresponding URI or URL of the audio content). In one example, the audio content may be retrieved by the playback device directly from the corresponding audio content source (eg, line-in connection). In another example, the audio content may be provided to a playback device over a network via one or more other playback devices or network devices.

示例音频内容源可以包括：媒体回放系统例如图1的媒体回放系统100中的一个或更多个回放设备的存储器、一个或更多个网络设备(例如，如控制设备、支持网络的个人计算机或网络附加存储装置(NAS))上的本地音乐库、经由互联网(例如，云)提供音频内容的流式音频服务或者经由回放设备或网络设备上的线路输入连接与媒体回放系统连接的音频源，以及其他可能性。Example audio content sources may include: the memory of one or more playback devices in a media playback system such as the media playback system 100 of FIG. 1; a local music library on one or more network devices (e.g., a control device, a network-enabled personal computer, or network-attached storage (NAS)); a streaming audio service that provides audio content via the Internet (e.g., the cloud); or an audio source connected to the media playback system via a line-in connection on a playback device or network device, among other possibilities.

在一些实施方式中，可以有规律地从媒体回放系统例如图1的媒体回放系统100添加或移除音频内容源。在一个示例中，每当添加、移除或更新一个或更多个音频内容源时，可以执行对音频项编索引。对音频项编索引可以涉及：扫描在能够由媒体回放系统中的回放设备访问的网络上共享的所有文件夹/目录中的可识别音频项；以及生成或更新音频内容数据库，该音频内容数据库包含元数据(例如，名称、艺术家、专辑、曲目长度等)以及其他关联信息，例如找到的每个可识别音频项的URI或URL。用于管理和保持音频内容源的其他示例也是可行的。In some implementations, audio content sources may be added or removed from a media playback system, such as media playback system 100 of FIG. 1, on a regular basis. In one example, indexing audio items may be performed whenever one or more audio content sources are added, removed, or updated. Indexing audio items may involve: scanning for identifiable audio items in all folders/directories shared on a network accessible by playback devices in a media playback system; and generating or updating an audio content database containing Metadata (eg, name, artist, album, track length, etc.) and other associated information such as the URI or URL of each identifiable audio item found. Other examples for managing and maintaining audio content sources are also possible.
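
As a rough illustration of the indexing step described above, the following Python sketch walks a set of shared folders and builds a small in-memory database keyed by URI. The extension list, the `index_audio_items` name, and the use of the file name as the title are assumptions made for this example; reading embedded metadata such as artist or album would need a tag-parsing library and is deliberately left out.

```python
import os
from pathlib import Path

# Assumed set of recognizable audio extensions; the text does not specify one.
AUDIO_EXTENSIONS = {".mp3", ".flac", ".wav", ".m4a", ".ogg"}


def index_audio_items(shared_roots):
    """Scan shared folders and return {uri: metadata} for each recognizable audio item."""
    database = {}
    for root in shared_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in sorted(filenames):
                path = Path(dirpath) / filename
                if path.suffix.lower() not in AUDIO_EXTENSIONS:
                    continue  # not a recognizable audio item
                uri = path.resolve().as_uri()  # file:// URI for the item
                # Only the file name is used here; artist, album, and track
                # length would come from the file's tags in a fuller version.
                database[uri] = {"title": path.stem, "uri": uri}
    return database
```

Calling `index_audio_items` again after a source is added, removed, or updated rebuilds the database from scratch, which is the simplest way to keep it current; an incremental update is equally possible.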

e.示例校准序列 e. Example calibration sequence

如上所述，示例校准过程可以涉及：一个或更多个回放设备发出校准声音，该校准声音可以由记录设备(或多个记录设备)检测。As mentioned above, an example calibration process may involve one or more playback devices emitting calibration sounds, which may be detected by the recording device (or recording devices).

在一些实施方式中，可以在要校准回放设备的频率范围(即，校准范围)上分析检测到的校准声音。因此，由回放设备发出的特定校准声音覆盖校准频率范围。校准频率范围可以包括回放设备能够发出的频率范围(例如15Hz至30000Hz)，并且可以包括被认为处于人类听觉范围(例如，20Hz至20000Hz)内的频率。通过发出并随后检测覆盖这样的频率范围的校准声音，可以针对回放设备来确定包括该范围的频率响应。这样的频率响应可以表示其中回放设备发出校准声音的环境。In some embodiments, the detected calibration sound may be analyzed over the frequency range (ie, the calibration range) over which the playback device is to be calibrated. Therefore, the specific calibration sound emitted by the playback device covers the calibration frequency range. The calibrated frequency range may include the range of frequencies that the playback device is capable of emitting (eg, 15 Hz to 30,000 Hz), and may include frequencies that are considered to be within the range of human hearing (eg, 20 Hz to 20,000 Hz). By emitting and then detecting calibration sounds covering such a frequency range, a frequency response including this range can be determined for the playback device. Such a frequency response may represent the environment in which the playback device emits calibrated sounds.

在一些实施方式中，回放设备可以在校准过程期间重复发出校准声音，使得校准声音在每个重复期间覆盖校准频率范围。利用移动的麦克风，在环境内的不同物理位置处连续检测校准声音的重复。例如，回放设备可能发出周期性校准声音。可以由记录设备在环境内的不同物理位置处检测校准声音的每个周期，从而在该位置处提供样本(即，表示重复的帧)。这样的校准声音因此可以促进对环境的空间平均的校准。当使用多个麦克风时，每个麦克风可以覆盖环境的相应部分(可能具有一些重叠)。In some embodiments, the playback device may repeatedly emit the calibration sound during the calibration process such that the calibration sound covers the calibration frequency range during each repetition. Using a moving microphone, repetitions of the calibration sound are continuously detected at different physical locations within the environment. For example, a playback device may emit periodic calibration sounds. Each cycle of the calibration sound may be detected by the recording device at a different physical location within the environment, providing a sample (ie, representing a repeating frame) at that location. Such calibration sounds may thus facilitate a spatially averaged calibration of the environment. When multiple microphones are used, each microphone may cover a corresponding portion of the environment (possibly with some overlap).
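
One simple way to picture the spatial averaging described above is to treat each detected repetition of the calibration sound as a frame of per-band magnitudes and to average those frames band by band. The Python sketch below does exactly that; the `spatial_average` name, the arithmetic mean, and the representation of a frame as a list of linear magnitudes are assumptions of this example, not the method defined by this disclosure.

```python
from statistics import fmean


def spatial_average(frames):
    """Average per-band magnitudes across frames captured at different locations.

    frames: a list of equal-length lists, one list per detected repetition of
    the calibration sound, each holding one magnitude per frequency band.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    bands = len(frames[0])
    if any(len(frame) != bands for frame in frames):
        raise ValueError("all frames must cover the same frequency bands")
    # zip(*frames) regroups the data by band, so each band is averaged over
    # all locations at which a repetition was detected.
    return [fmean(band_values) for band_values in zip(*frames)]


# Three repetitions detected at three locations, two frequency bands each.
averaged = spatial_average([[1.0, 2.0], [3.0, 4.0], [2.0, 6.0]])
```

Because every repetition covers the whole calibration range, each frame contributes to every band, which is what allows a per-band average over locations.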

此外，记录设备可以测量移动样本和静止样本两者。例如，当一个或更多个回放设备输出校准声音时，记录设备可以在环境内移动。在这样的移动期间，记录设备可以在一个或更多个位置暂停以测量静止样本。这样的位置可以对应于优选的收听位置。在另一示例中，第一记录设备和第二记录设备可以分别包括第一麦克风和第二麦克风。当回放设备发出校准声音时，第一麦克风可以移动，并且第二麦克风可以保持静止，可能在环境内的特定收听位置(例如，最喜欢的椅子)处保持静止。Furthermore, the recording device can measure both moving and stationary samples. For example, the recording device may move within the environment while one or more playback devices output calibration sounds. During such movement, the recording device may pause at one or more locations to measure stationary samples. Such positions may correspond to preferred listening positions. In another example, the first recording device and the second recording device may include a first microphone and a second microphone, respectively. When the playback device emits the calibration sound, the first microphone may move and the second microphone may remain stationary, possibly at a particular listening position within the environment (eg, a favorite chair).

在一些情况下，一个或更多个回放设备可以加入分组，例如绑定区或区组。在这样的情况下，校准过程可以将一个或更多个回放设备作为组来校准。示例分组包括区组或绑定对，以及其他示例配置。In some cases, one or more playback devices may be joined into a grouping, such as a bonded zone or a zone group. In such cases, the calibration process may calibrate the one or more playback devices as a group. Example groupings include zone groups and bonded pairs, among other example configurations.

受校准的一个或多个回放设备可以基于触发条件来启动校准过程。例如，记录设备诸如媒体回放系统100的控制设备126可以检测触发条件，该触发条件使记录设备启动对一个或更多个回放设备(例如，回放设备102至124中的一个或更多个)的校准。可替选地，媒体回放系统的回放设备可以检测这样的触发条件(并且可能将该触发条件的指示转发至记录设备)。The one or more playback devices under calibration may initiate the calibration process based on a trigger condition. For example, a recording device, such as the control device 126 of the media playback system 100, may detect a trigger condition that causes the recording device to initiate calibration of one or more playback devices (e.g., one or more of playback devices 102 to 124). Alternatively, a playback device of the media playback system may detect such a trigger condition (and possibly relay an indication of that trigger condition to the recording device).

在一些实施方式中，检测触发条件可以涉及：检测指示对可选控件的选择的输入数据。例如，记录设备例如控制设备126可以显示包括一个或更多个控件的接口(例如，图4的控制接口400)，该控件在被选择时启动对回放设备或者回放设备的组(例如，区)的校准。In some embodiments, detecting the trigger condition may involve detecting input data indicating a selection of a selectable control. For example, a recording device such as the control device 126 may display an interface (e.g., the control interface 400 of FIG. 4) that includes one or more controls that, when selected, initiate calibration of a playback device or of a group of playback devices (e.g., a zone).

为了说明这样的控件，图6示出了显示示例控制接口600的智能电话500。控制接口600包括：提示当准备好时点击可选控件604(开始)的图形区域602。当被选择时，可选控件604可以启动校准过程。如所示的，可选控件604是按钮控件。尽管通过示例示出了按钮控件，但是也可以设想其他类型的控件。To illustrate such controls, FIG. 6 shows a smartphone 500 displaying an example control interface 600 . Control interface 600 includes a graphics area 602 that prompts to click selectable controls 604 (start) when ready. When selected, selectable control 604 may initiate a calibration process. As shown, selectable controls 604 are button controls. Although a button control is shown by way of example, other types of controls are also contemplated.

控制接口600还包括图形区域606，图形区域包括描绘如何帮助校准过程的视频。一些校准过程可以涉及：将麦克风在环境内移动，以在多个物理位置处获得校准声音的样本。为了提示用户移动麦克风，控制设备可以显示描绘要在校准期间执行的一个或更多个步骤的视频或动画。The control interface 600 also includes a graphics area 606 that includes a video depicting how to assist the calibration process. Some calibration procedures may involve moving the microphone within the environment to obtain samples of the calibration sound at multiple physical locations. To prompt the user to move the microphone, the control device may display a video or animation depicting one or more steps to be performed during calibration.

为了说明控制设备在校准期间的移动，图7示出了图1的媒体回放系统100。图7示出了记录设备(例如，控制设备126)在校准期间可能沿其移动的路径700。如上所述，记录设备可以以各种方式例如借助视频或动画等来指示如何执行这样的移动。记录设备可以在沿着路径700的不同点处检测由媒体回放系统100的一个或更多个回放设备发出的校准声音的迭代，这可以促进对那些回放设备的空间平均的校准。To illustrate movement of the control device during calibration, FIG. 7 shows the media playback system 100 of FIG. 1. FIG. 7 shows a path 700 along which a recording device (e.g., the control device 126) might be moved during calibration. As noted above, the recording device may indicate how to perform such movement in various ways, for example by way of a video or animation. The recording device may detect iterations of the calibration sound emitted by one or more playback devices of the media playback system 100 at different points along the path 700, which may facilitate a spatially averaged calibration of those playback devices.

In other examples, detecting the trigger condition may involve a playback device detecting that it has become uncalibrated, which might be caused by moving the playback device to a different position. For example, the playback device may detect physical movement via one or more sensors that are sensitive to movement (e.g., an accelerometer). As another example, the playback device may detect that it has been moved to a different zone (e.g., from a "Kitchen" zone to a "Living Room" zone), perhaps by receiving an instruction from a control device that causes the playback device to leave a first zone and join a second zone.

In further examples, detecting the trigger condition may involve a recording device (e.g., a control device or a playback device) detecting a new playback device in the system. Such a playback device may not yet have been calibrated for the environment. For example, the recording device may detect the new playback device as part of a setup procedure for the media playback system (e.g., a procedure to configure one or more playback devices into the media playback system). In other cases, the recording device may detect the new playback device by detecting input data indicating a request to configure the media playback system (e.g., a request to configure the media playback system with an additional playback device).

In some cases, the first recording device (or another device) may instruct one or more playback devices to emit the calibration sound. For example, a recording device such as control device 126 of media playback system 100 may send a command that causes a playback device (e.g., one of playback devices 102 to 124) to emit the calibration sound. The control device may send the command via a network interface (e.g., a wired or wireless network interface). The playback device may receive such a command, perhaps via its own network interface, and responsively emit the calibration sound.

The acoustics of an environment may vary from location to location within the environment. Because of this variation, some calibration procedures may be improved by positioning the playback device to be calibrated within the environment in the same way that the playback device will later be operated. In that position, the environment may affect the calibration sound emitted by the playback device in a similar manner to how the environment will affect playback during operation.

Further, some example calibration procedures may involve one or more recording devices detecting the calibration sound at multiple physical locations within the environment, which may further assist in capturing acoustic variability within the environment. To facilitate detecting the calibration sound at multiple points within the environment, some calibration procedures involve a moving microphone. For example, the microphone that is detecting the calibration sound may be moved through the environment while the calibration sound is emitted. Such movement may facilitate detecting the calibration sound at multiple physical locations within the environment, which may provide a better understanding of the environment as a whole.

In some embodiments, the one or more playback devices may repeatedly emit the calibration sound during the calibration procedure such that the calibration sound covers the calibration frequency range during each repetition. With a moving microphone, repetitions of the calibration sound are detected at different physical locations within the environment, thereby providing samples that are spaced throughout the environment. In some cases, the calibration sound may be a periodic calibration signal in which each period covers the calibration frequency range.

To facilitate determining a frequency response, the calibration sound should be emitted with sufficient energy at each frequency to overcome background noise. To increase the energy at a given frequency, a tone at that frequency may be emitted for a longer duration. However, lengthening the period of the calibration sound reduces the spatial resolution of the calibration procedure, because the moving microphone moves farther during each period (assuming a relatively constant velocity). As another technique to increase the energy at a given frequency, a playback device may increase the intensity of the tone. However, in some cases, attempting to emit sufficient energy in a short amount of time may damage the speaker drivers of the playback device.

Some implementations may balance these considerations by instructing the playback device to emit a calibration sound having a period that is approximately 3/8 of a second in duration (e.g., in the range of 1/4 to 1 second in duration). In other words, the calibration sound may repeat at a frequency of 2 Hz to 4 Hz. Such a duration may be long enough to provide a tone of sufficient energy at each frequency to overcome background noise in a typical environment (e.g., a quiet room), yet short enough that spatial resolution is kept within an acceptable range (e.g., less than a few feet, assuming normal walking speed).
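The trade-off between period length and spatial resolution can be made concrete with a little arithmetic. The following sketch is illustrative only: the function names are hypothetical, and the walking speed of 1.4 m/s is an assumed value that does not come from this disclosure.

```python
# Illustrative arithmetic only; the 1.4 m/s walking speed is an assumed value.
FEET_PER_METER = 3.28084


def repetition_rate_hz(period_s):
    """Number of calibration-sound repetitions per second."""
    return 1.0 / period_s


def travel_per_period_ft(period_s, mic_speed_m_s=1.4):
    """Distance the moving microphone covers during one repetition, in feet."""
    return period_s * mic_speed_m_s * FEET_PER_METER


if __name__ == "__main__":
    for period_s in (0.25, 0.375, 0.5):
        print(f"{period_s:.3f} s -> {repetition_rate_hz(period_s):.2f} Hz, "
              f"{travel_per_period_ft(period_s):.2f} ft per repetition")
```

Under these assumptions, a 3/8-second period corresponds to roughly 2.7 repetitions per second and about 1.7 feet of microphone travel per repetition.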

In some embodiments, the one or more playback devices may emit a hybrid calibration sound that combines a first component and a second component having respective waveforms. For example, an example hybrid calibration sound may include a first component that includes noise at certain frequencies and a second component that sweeps through other frequencies (e.g., a swept sine). The noise component may cover the relatively low frequencies of the calibration frequency range (e.g., 10 Hz to 50 Hz), while the swept signal component covers the higher frequencies of that range (e.g., above 50 Hz). Such a hybrid calibration sound may combine the advantages of its component signals.

A swept signal (e.g., a chirp or swept sine) is a waveform whose frequency increases or decreases with time. Including such a waveform as a component of a hybrid calibration sound may facilitate covering the calibration frequency range, because a swept signal can be chosen that increases or decreases through the calibration frequency range (or a portion thereof). For example, a chirp emits each frequency within the chirp for a relatively short time period, so that a chirp can cover the calibration range more efficiently than some other waveforms. FIG. 8 shows a graph 800 that illustrates an example chirp. As shown in FIG. 8, the frequency of the waveform increases over time (plotted on the X-axis), and a tone is emitted at each frequency for a relatively short period of time.
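As a rough illustration of a swept signal, the sketch below generates a linear chirp whose instantaneous frequency moves from a start frequency to an end frequency over one period. The function name, the 44.1 kHz sample rate, and the choice of a linear (rather than, say, logarithmic) sweep are assumptions made for the example, not details of this disclosure.

```python
import math


def linear_chirp(f_start_hz, f_end_hz, duration_s, sample_rate_hz=44100):
    """Return samples of a sine whose frequency moves linearly from
    f_start_hz to f_end_hz over duration_s (rising or falling)."""
    num_samples = int(round(duration_s * sample_rate_hz))
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    samples = []
    for i in range(num_samples):
        t = i / sample_rate_hz
        # Phase is the integral of the instantaneous frequency
        # f(t) = f_start_hz + sweep_rate * t.
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        samples.append(math.sin(phase))
    return samples
```

Passing a start frequency higher than the end frequency yields a falling chirp from the same function.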

However, because each frequency within the chirp is emitted for a relatively short duration, the amplitude (or sound intensity) of the chirp must be relatively high at low frequencies to overcome typical background noise. Some speakers might not be capable of outputting such high-intensity tones without risking damage. Further, such high-intensity tones might be unpleasant to people within audible range of the playback device, as would be expected during a calibration procedure that involves a moving microphone. Accordingly, some embodiments of the calibration sound might not include a chirp that extends to relatively low frequencies (e.g., below 50 Hz). Instead, the chirp or swept signal may cover frequencies between a relatively low threshold frequency (e.g., a frequency of around 50 Hz to 100 Hz) and the maximum of the calibration frequency range. The maximum of the calibration range may correspond to the physical capabilities of the channel emitting the calibration sound, which might be 20,000 Hz or above.

A swept signal may also facilitate reversing the phase distortion caused by the moving microphone. As noted above, a moving microphone causes phase distortion, which may interfere with determining a frequency response from the detected calibration sound. With a swept signal, however, the phase of each frequency is predictable (as a Doppler shift). This predictability facilitates reversing the phase distortion so that the detected calibration sound can be correlated with the emitted calibration sound during analysis. Such a correlation can be used to determine the effect of the environment on the calibration sound.

As noted above, a swept signal may increase or decrease in frequency over time. In some embodiments, the recording device may instruct the one or more playback devices to emit a chirp that descends from the maximum of the calibration range (or above) to the threshold frequency (or below). Because of the physical shape of the human ear canal, a descending chirp may be more pleasant to some listeners than an ascending chirp. While some implementations may use a descending swept signal, an ascending swept signal may also be effective for calibration.

As noted above, example calibration sounds may include a noise component in addition to a swept signal component. Noise refers to a random signal, which in some cases is filtered to have equal energy per octave. In embodiments where the noise component is periodic, the noise component of a hybrid calibration sound might be considered pseudorandom. The noise component of the calibration sound may be emitted for substantially the entire period or repetition of the calibration sound. As a result, each frequency covered by the noise component is emitted for a longer duration, which reduces the signal intensity typically required to overcome background noise.

Moreover, the noise component may cover a smaller frequency range than the chirp component, which may increase the sound energy at each frequency within that range. As noted above, the noise component may cover frequencies between the minimum of the frequency range and a threshold frequency, which might be, for example, a frequency of around 50 Hz to 100 Hz. As with the maximum of the calibration range, the minimum of the calibration range may correspond to the physical capabilities of the channel emitting the calibration sound, which might be 20 Hz or below.

FIG. 9 shows a graph 900 that illustrates example Brownian noise. Brownian noise is a type of noise that is based on Brownian motion. In some cases, the playback device may emit a calibration sound that includes Brownian noise in its noise component. Brownian noise has a "soft" quality, similar to a waterfall or heavy rainfall, which may be pleasant to some listeners. While some embodiments may implement the noise component using Brownian noise, other embodiments may implement the noise component using other types of noise, such as pink noise or white noise. As shown in FIG. 9, the intensity of the example Brownian noise decreases by 6 dB per octave (20 dB per decade).
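One common way to approximate Brownian noise, offered here only as a sketch and not as the method of this disclosure, is to accumulate white Gaussian noise as a random walk. Integration gives an amplitude spectrum proportional to 1/f, which corresponds to the 6 dB per octave and 20 dB per decade slope noted above. The function names and the seed parameter are hypothetical.

```python
import math
import random


def rolloff_db(freq_ratio):
    """Level drop, in dB, of a 1/f amplitude spectrum across freq_ratio."""
    return 20.0 * math.log10(freq_ratio)


def brownian_noise(num_samples, seed=0):
    """Approximate Brownian noise as a running sum of white Gaussian noise,
    with the mean removed and the peak normalized to 1."""
    rng = random.Random(seed)
    level = 0.0
    walk = []
    for _ in range(num_samples):
        level += rng.gauss(0.0, 1.0)  # each step adds one white-noise sample
        walk.append(level)
    mean = sum(walk) / num_samples
    centered = [s - mean for s in walk]
    peak = max(abs(s) for s in centered) or 1.0
    return [s / peak for s in centered]
```

Here `rolloff_db(2)` is about 6.02 and `rolloff_db(10)` is 20, matching the per-octave and per-decade figures.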

Some implementations of a hybrid calibration sound may include a transition frequency range in which the noise component and the swept component overlap. As indicated above, in some examples, the control device may instruct the playback device to emit a calibration sound that includes a first component (e.g., a noise component) and a second component (e.g., a swept signal component). The first component may include noise at frequencies between the minimum of the calibration frequency range and a first threshold frequency, and the second component may sweep through frequencies between a second threshold frequency and the maximum of the calibration frequency range.

To overlap these signals, the second threshold frequency may be a lower frequency than the first threshold frequency. In such a configuration, the transition frequency range includes the frequencies between the second threshold frequency and the first threshold frequency, which might be, for example, 50 Hz to 100 Hz. By overlapping these components, the playback device may avoid emitting a possibly unpleasant sound associated with a harsh transition between the two types of sounds.

FIGS. 10A and 10B illustrate components of example hybrid calibration signals that cover a calibration frequency range 1000. FIG. 10A illustrates a first component 1002A (i.e., a noise component) and a second component 1004A of an example calibration sound. Component 1002A covers frequencies from a minimum 1006A of the calibration range 1000 to a first threshold frequency 1008A. Component 1004A covers frequencies from a second threshold 1010A to the maximum of the calibration frequency range 1000. As shown, threshold frequency 1008A and threshold frequency 1010A are the same frequency.

FIG. 10B illustrates a first component 1002B (i.e., a noise component) and a second component 1004B of another example calibration sound. Component 1002B covers frequencies from a minimum 1006B of the calibration range 1000 to a first threshold frequency 1008B. Component 1004B covers frequencies from a second threshold 1010B to a maximum 1012B of the calibration frequency range 1000. As shown, threshold frequency 1010B is a lower frequency than threshold frequency 1008B, such that component 1002B and component 1004B overlap in a transition frequency range that extends from threshold frequency 1010B to threshold frequency 1008B.

FIG. 11 illustrates one example iteration (e.g., a period or cycle) of an example hybrid calibration sound, represented as a frame 1100. Frame 1100 includes a swept signal component 1102 and a noise component 1104. Swept signal component 1102 is shown as a downward-sloping line to illustrate a swept signal that descends through the frequencies of the calibration range. Noise component 1104 is shown as a region to illustrate low-frequency noise throughout frame 1100. As shown, swept signal component 1102 and noise component 1104 overlap in a transition frequency range. The period 1106 of the calibration sound is approximately 3/8 of a second (e.g., in the range of 1/4 to 1/2 second), which in some implementations is sufficient time to cover the calibration frequency range of a single channel.

FIG. 12 illustrates an example periodic calibration sound 1200. Five iterations (e.g., periods) of hybrid calibration sound 1100 are represented as frames 1202, 1204, 1206, 1208, and 1210. In each iteration, or frame, the periodic calibration sound 1200 covers the calibration frequency range using two components (e.g., a noise component and a swept signal component).

In some embodiments, a spectral adjustment may be applied to the calibration sound to give the calibration sound a desired shape, or roll-off, which may avoid overloading the speaker drivers. For example, the calibration sound may be filtered to roll off at 3 dB per octave, or 1/f. Such a spectral adjustment might not be applied to alter the low frequencies, so as to prevent overloading of the speaker drivers.

In some embodiments, the calibration sound may be pre-generated. Such a pre-generated calibration sound may be stored on the control device, the playback device, or a server (e.g., a server that provides a cloud service to the media playback system). In some cases, the control device or the server may send the pre-generated calibration sound to the playback device via a network interface, and the playback device may retrieve the calibration sound via a network interface of its own. Alternatively, the control device may send the playback device an indication of a source of the calibration sound (e.g., a URI), which the playback device may use to obtain the calibration sound.

Alternatively, the control device or the playback device may generate the calibration sound. For example, for a given calibration range, the control device may generate noise that covers at least the frequencies between the minimum of the calibration frequency range and a first threshold frequency, and a swept sine that covers at least the frequencies between a second threshold frequency and the maximum of the calibration frequency range. The control device may combine the swept sine and the noise into a periodic calibration sound by applying a crossover filter function. The crossover filter function may combine the portion of the generated noise that includes frequencies below the first threshold frequency with the portion of the generated swept sine that includes frequencies above the second threshold frequency to obtain the desired calibration sound. The device generating the calibration sound may have an analog circuit and/or a digital signal processor to generate and/or combine the components of the hybrid calibration sound.
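The generation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crossover filter of this disclosure: a simple linear crossfade over the transition band stands in for the crossover filter function, the noise is built from harmonics of the frame rate with random phases so that it repeats exactly each frame, and the sweep is a falling linear chirp. All names and default values are hypothetical.

```python
import math
import random


def crossover_weights(freq_hz, f_low_hz, f_high_hz):
    """Return (noise_gain, sweep_gain) for a linear crossfade across the
    transition band [f_low_hz, f_high_hz]; the two gains always sum to 1."""
    if freq_hz <= f_low_hz:
        return 1.0, 0.0
    if freq_hz >= f_high_hz:
        return 0.0, 1.0
    sweep_gain = (freq_hz - f_low_hz) / (f_high_hz - f_low_hz)
    return 1.0 - sweep_gain, sweep_gain


def hybrid_frame(f_min_hz=20.0, first_threshold_hz=100.0,
                 second_threshold_hz=50.0, f_max_hz=20000.0,
                 period_s=0.375, sample_rate_hz=44100, seed=0):
    """Build one frame: low-frequency noise plus a falling sweep, overlapping
    between second_threshold_hz and first_threshold_hz."""
    rng = random.Random(seed)
    num_samples = int(period_s * sample_rate_hz)

    # Noise component: harmonics of the frame rate between f_min_hz and the
    # first threshold, with 1/f amplitude (about 6 dB per octave) and random phase.
    harmonics = []
    k = 1
    while k / period_s <= first_threshold_hz:
        freq = k / period_s
        if freq >= f_min_hz:
            noise_gain, _ = crossover_weights(
                freq, second_threshold_hz, first_threshold_hz)
            amplitude = noise_gain * f_min_hz / freq
            harmonics.append((freq, amplitude, rng.uniform(0.0, 2.0 * math.pi)))
        k += 1

    # Sweep component: falls linearly from f_max_hz to the second threshold.
    sweep_rate = (second_threshold_hz - f_max_hz) / period_s  # negative

    frame = []
    for i in range(num_samples):
        t = i / sample_rate_hz
        noise = sum(a * math.sin(2.0 * math.pi * f * t + p)
                    for f, a, p in harmonics)
        f_inst = f_max_hz + sweep_rate * t
        _, sweep_gain = crossover_weights(
            f_inst, second_threshold_hz, first_threshold_hz)
        sweep_phase = 2.0 * math.pi * (f_max_hz * t + 0.5 * sweep_rate * t * t)
        frame.append(noise + sweep_gain * math.sin(sweep_phase))

    peak = max(abs(s) for s in frame) or 1.0
    return [s / peak for s in frame]
```

The relative level of the two components is arbitrary in this sketch; a practical generator would presumably set it from the energy needed at each frequency to overcome background noise.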

Further example calibration procedures are described in U.S. Patent Application No. 14/805,140, filed July 21, 2015, entitled "Hybrid Test Tone For Space-Averaged Room Audio Calibration Using A Moving Microphone"; U.S. Patent Application No. 14/805,340, filed July 21, 2015, entitled "Concurrent Multi-Loudspeaker Calibration with a Single Measurement"; and U.S. Patent Application No. 14/864,393, filed September 24, 2015, entitled "Facilitating Calibration of an Audio Playback Device", each of which is incorporated herein in its entirety.

Calibration may be facilitated via one or more control interfaces, as displayed by one or more devices. Example interfaces are described in U.S. Patent Application No. 14/696,014, filed April 24, 2015, entitled "Speaker Calibration", and U.S. Patent Application No. 14/826,873, filed August 14, 2015, entitled "Speaker Calibration User Interface", each of which is incorporated herein in its entirety.

Turning now to several example implementations, implementations 1300, 1900, and 2000, shown in FIGS. 13, 19, and 20 respectively, present example embodiments of the techniques described herein. These example embodiments can be implemented within an operating environment that includes, for example, the media playback system 100 of FIG. 1, one or more of the playback device 200 of FIG. 2, or one or more of the control device 300 of FIG. 3, as well as other devices described herein and/or other suitable devices. Further, operations illustrated by way of example as being performed by a media playback system can be performed by any suitable device, such as a playback device or a control device of the media playback system. Implementations 1300, 1900, and 2000 may include one or more operations, functions, or actions as illustrated by one or more of the blocks shown in FIGS. 13, 19, and 20. Although the blocks are illustrated in sequential order, these blocks may also be performed in parallel and/or in a different order than the order described herein. Also, the various blocks may be combined into fewer blocks, divided into additional blocks, and/or removed, based upon the desired implementation.

In addition, for the implementations disclosed herein, the flowcharts show the functionality and operation of one possible implementation of the present embodiments. In this regard, each block may represent a module, a segment, or a portion of program code, which includes one or more instructions executable by a processor for implementing specific logical functions or steps in the process. The program code may be stored on any type of computer-readable medium, for example, a storage device including a disk or hard drive. The computer-readable medium may include non-transitory computer-readable media, for example, computer-readable media that store data for short periods of time, such as register memory, processor cache, and random access memory (RAM). The computer-readable medium may also include non-transitory media such as secondary or persistent long-term storage, for example read-only memory (ROM), optical or magnetic disks, and compact-disc read-only memory (CD-ROM). The computer-readable medium may also be any other volatile or non-volatile storage system. The computer-readable medium may be considered, for example, a computer-readable storage medium or a tangible storage device. In addition, for the implementations disclosed herein, each block may represent circuitry that is wired to perform the specific logical functions in the process.

III. Example Techniques for Facilitating Spatial Calibration

As discussed above, embodiments described herein may facilitate the calibration of one or more playback devices by determining a spatial calibration. FIG. 13 illustrates an example implementation 1300 by which a media playback system facilitates such a calibration.

a. Detecting a Trigger Condition

At block 1302, implementation 1300 involves detecting a trigger condition. For example, a networked microphone device may detect a trigger condition that initiates calibration of a media playback system (or perhaps of a group of playback devices within the media playback system). Example networked microphone devices include any suitable device that has a network interface and a microphone. For example, playback devices (e.g., playback device 200) and control devices (e.g., control device 300) may each operate as a networked microphone device. Other example networked microphone devices include control devices 126 and 128 of FIG. 1.

The trigger condition may initiate calibration of multiple audio drivers. In some cases, the multiple audio drivers may be housed in a single playback device. For example, a soundbar-type playback device may include multiple audio drivers (e.g., nine audio drivers). In other cases, the multiple audio drivers may be divided among two or more playback devices. For example, a soundbar with multiple audio drivers may be calibrated together with one or more other playback devices, each having one or more respective audio drivers. Some example playback devices include multiple audio drivers of different types (e.g., tweeters and woofers, possibly of different sizes).

The particular playback devices (and audio drivers) under calibration may correspond to a zone of the media playback system. For example, an example trigger condition may initiate calibration of a given zone of the media playback system (e.g., the Living Room zone of media playback system 100 shown in FIG. 1). In this example, the Living Room zone includes playback devices 104, 106, 108, and 110, which together include multiple audio drivers, and the example trigger condition may therefore initiate calibration of multiple audio drivers.

As described above in connection with the example calibration sequence, various trigger conditions are contemplated herein. Some example trigger conditions include input data instructing the media playback system to initiate calibration. Such input data may be received via a user interface of the networked microphone device (e.g., control interface 600), as shown in FIG. 6, or perhaps via another device that relays the instruction to the networked microphone device and/or the playback devices under calibration.

Other example trigger conditions may be based on sensor data. For example, sensor data from an accelerometer or other suitable sensor may indicate that a given playback device has been moved, which may prompt calibration of that playback device (and perhaps of other playback devices associated with it, such as those in a bonded zone or zone group with the playback device).

Some trigger conditions may involve a combination of input data and sensor data. For example, sensor data may indicate a change in the operating environment of the media playback system, which may cause a prompt to initiate calibration to be displayed on the networked microphone device. The media playback system may proceed with calibration after receiving input data that confirms, via the prompt, that calibration should be initiated.

Further example trigger conditions may be based on changes in the configuration of the media playback system. For example, example trigger conditions include adding a playback device to, or removing a playback device from, the media playback system (or a grouping thereof). Other example trigger conditions include receiving a new type of input content (e.g., receiving multi-channel audio content).

In operation, the multiple audio drivers may form multiple sound axes. For example, two playback devices, each with a respective audio driver, may form respective sound axes. In some cases, two or more audio drivers may be arrayed to form a sound axis. For example, a playback device with multiple audio drivers (e.g., a soundbar with nine audio drivers) may form multiple sound axes (e.g., three sound axes). Any audio driver may contribute to any number of sound axes. For example, a given sound axis may be formed by contributions from all nine audio drivers of the soundbar.

Each sound axis may correspond to a respective input channel of audio content. For example, the audio drivers of a media playback system may form two sound axes corresponding, respectively, to the left and right channels of stereo content. As another example, the audio drivers may form sound axes corresponding to respective channels of surround sound content (e.g., center, front left, front right, rear left, and rear right channels).

Arraying two or more audio drivers to form a given sound axis may enable the two or more audio drivers to "direct" the sound output of the given sound axis in a particular direction. For example, where each of the nine audio drivers of a soundbar contributes a portion of the sound axis corresponding to the left channel of surround sound content, the nine audio drivers may be arrayed (i.e., acoustically summed, possibly using DSP) such that the net polar response of the nine audio drivers directs the sound to the left. Concurrently with the sound axis corresponding to the left channel, the nine audio drivers may also form sound axes corresponding to the center and right channels of the surround sound content, directing sound to the center and to the right, respectively.

The particular set of sound axes formed by the playback devices of a media playback system may be referred to as a playback configuration. In operation, the playback devices of a media playback system may be configured into a given one of multiple possible playback configurations. While in a given playback configuration, the audio drivers of the playback devices may form a particular set of sound axes. In some cases, configuring the playback devices into a new playback configuration may serve as a trigger condition to initiate calibration of the playback devices.

To illustrate, referring back to FIG. 1, playback devices 104, 106, 108, and 110 of the living room zone may be configured into multiple playback configurations. In a first playback configuration, which may be associated with surround sound audio content, playback device 104 may form one or more sound axes (e.g., front, left, and right channels), while playback devices 108 and 110 form respective sound axes (e.g., left and right surround channels). Playback device 110, being a subwoofer-type device, may contribute a separate low-frequency sound axis or the low-frequency portion of the sound axes formed by playback devices 104, 106, and/or 108. In another playback configuration, the audio drivers of playback devices 104, 106, 108, and 110 may combine to form sound axes corresponding to the left and right channels of stereo audio content. Yet another playback configuration may involve the audio drivers forming a single sound axis corresponding to mono audio content.

In operation, playback devices may use a given playback configuration depending on various factors. Such factors may include the zone configuration (e.g., whether the playback devices are in a 5.1, 5.0, or other surround sound configuration, a stereo pair configuration, a playbar-only configuration, etc.). The factors may also include the specific types and capabilities of the playback devices, and the particular type of content provided (or expected to be provided) to the playback devices. For example, the playback devices may adopt a first playback configuration when playing surround sound content and another playback configuration when playing stereo content. As another example, the playback devices may use a given playback configuration when playing music and another playback configuration when playing audio paired with video (e.g., television content). Further example playback configurations include any of the above example configurations with (or without) a subwoofer-type playback device, since adding such a device to (or removing it from) a playback configuration may change the acoustic characteristics and/or the allocation of playback responsibilities in that configuration.

Some example calibration sequences involve calibrating the playback devices for multiple playback configurations. Such a calibration sequence may yield multiple calibration profiles, each of which is applied by the playback devices when they are in the corresponding playback configuration. For example, a given calibration process may calibrate the living room zone of media playback system 100 for a surround sound playback configuration and a music playback configuration. While in the surround sound playback configuration, the playback devices of the living room zone may apply a first calibration profile corresponding to the surround sound playback configuration (e.g., one or more filters that adjust one or more of amplitude response, frequency response, phase, etc.). Likewise, while in the music playback configuration, the playback devices of the living room zone may apply a second calibration profile corresponding to the music playback configuration.

b. Causing multiple audio drivers to emit calibration audio

In FIG. 13, at block 1304, implementation 1300 involves causing multiple audio drivers to emit calibration audio. For example, an NMD may instruct one or more playback devices that include the multiple audio drivers to emit calibration audio via those audio drivers. For instance, control device 126 of media playback system 100 may send a command that causes a playback device (e.g., one of playback devices 102 to 124) to emit calibration audio. The NMD may send the command via a network interface (e.g., a wired or wireless network interface). The playback device may receive such a command, possibly via a network interface, and responsively emit the calibration audio.

The calibration audio may include one or more calibration sounds, such as a frequency sweep ("chirp"), brown noise or other types of noise, or a song, among other example sounds. Additional details on example calibration sounds are given above in connection with the example calibration sequences described in Section II.e and generally throughout this disclosure.

In some examples, the calibration audio is divided into frames. As shown in FIGS. 11 and 12 and described herein, a frame may represent an iteration (e.g., a period or cycle) of an example calibration sound. When recorded, the frames may yield respective samples of the calibration sound as emitted by one or more audio drivers.

As noted above, in some cases a calibration sequence involves calibrating multiple sound axes. Example calibration audio for calibrating multiple sound axes may be divided into constituent frames, where each frame includes calibration audio for every sound axis under calibration. Accordingly, when recorded, each frame may include a sample of the calibration audio produced by each sound axis. The frames may repeat to yield multiple samples for each sound axis.

To include calibration audio for every sound axis under calibration, each frame may be further divided into time slots. Each time slot may include the calibration audio for a respective sound axis under calibration. For example, an example frame for a playbar-type playback device (e.g., playback device 104 shown in FIG. 1) that forms three sound axes (such as left, right, and center channels) may include three time slots. If that device is to be calibrated together with a subwoofer-type device, each frame may include four time slots: one for each sound axis formed by the playbar-type playback device and one for the sound axis produced by the subwoofer. As another example, where the playbar-type playback device is calibrated together with two additional playback devices that produce respective sound axes (e.g., rear left and right channels), each frame may include five time slots (or six time slots if a subwoofer is also calibrated).

As noted above, each time slot may include the calibration audio for a respective sound axis under calibration. The calibration audio in each time slot may include a frequency sweep ("chirp"), brown noise or other types of noise, among other examples. For instance, referring back to FIGS. 11 and 12, the calibration audio in each time slot may include a hybrid calibration sound. The time slots may occur sequentially in a known order, which facilitates matching the time slots within the recorded calibration audio to their respective sound axes. Each time slot may also have a known duration, which likewise facilitates this matching. In other examples, each time slot and/or frame may include a watermark (e.g., a particular sound pattern) that identifies the time slot or frame and that may be used to match time slots within the recorded calibration audio to their respective sound axes.

To illustrate, FIG. 14 shows example calibration audio 1400. Calibration audio 1400 includes frames 1402, 1404, and 1406, each of which is divided into three respective time slots. In particular, frame 1402 includes time slots 1402A, 1402B, and 1402C. Likewise, frames 1404 and 1406 include time slots 1404A, 1404B, and 1404C and time slots 1406A, 1406B, and 1406C, respectively. Each time slot includes an iteration of the hybrid calibration sound 1100 of FIG. 11. During the calibration process, the calibration sound in each time slot may be emitted by a respective sound axis (possibly formed by multiple audio drivers). For example, time slots 1402A, 1404A, and 1406A may correspond to a first sound axis (e.g., the left channel), time slots 1402B, 1404B, and 1406B to a second sound axis, and time slots 1402C, 1404C, and 1406C to a third sound axis. In this manner, when recorded, calibration audio 1400 may yield three samples of each sound axis, assuming that a sufficient portion of frames 1402, 1404, and 1406 is recorded.
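The frame and time-slot layout of FIG. 14 can be sketched in a few lines of Python. This is only an illustrative sketch: the names `Slot` and `build_schedule`, the axis labels, and the 0.5-second slot duration are assumptions made for the example and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    frame: int         # index of the frame this time slot belongs to
    axis: str          # sound axis that emits calibration audio in this slot
    start_s: float     # offset from the start of the calibration audio, in seconds
    duration_s: float  # known slot duration, in seconds


def build_schedule(axes, num_frames, slot_duration_s):
    """Lay out num_frames frames, each holding one slot per sound axis in a fixed order."""
    schedule = []
    t = 0.0
    for frame in range(num_frames):
        for axis in axes:
            schedule.append(Slot(frame, axis, t, slot_duration_s))
            t += slot_duration_s
    return schedule


# Three frames of three slots, mirroring frames 1402, 1404, and 1406 (hypothetical values).
schedule = build_schedule(["left", "center", "right"], num_frames=3, slot_duration_s=0.5)

# Count how many recorded samples each sound axis would yield.
samples_per_axis = {}
for slot in schedule:
    samples_per_axis[slot.axis] = samples_per_axis.get(slot.axis, 0) + 1
```

Because the slot order and duration are fixed, the position of a slot in the schedule is enough to tell which sound axis emitted it, which is the property the recording side relies on.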

As noted above, in some example calibration processes the playback devices of a media playback system may be calibrated for multiple playback configurations. Alternatively, different playback configurations for a set of audio drivers may be calibrated in respective calibration sequences. Example calibration audio for calibrating multiple playback configurations may include a repeating series of frames, where each frame in the series corresponds to a respective playback configuration. For example, example calibration audio for calibrating three playback configurations may include a series of three frames (e.g., frames 1402, 1404, and 1406 of FIG. 14).

As shown in FIG. 14, each frame of the series may be divided into time slots corresponding to the sound axes of the playback configuration to which that frame corresponds. Because different playback configurations may form different sets of sound axes, possibly with different total numbers of axes, the frames in a series may have different numbers of time slots. The series of frames may repeat to yield multiple samples for each sound axis of each playback configuration.
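When the frames of a series correspond to different playback configurations, the frames need not have the same number of time slots. The following Python sketch shows one way such a repeating series could be enumerated; the function name `build_series` and the configuration and axis labels are hypothetical.

```python
def build_series(configurations, repetitions):
    """Enumerate (repetition, configuration, axis) slots for a repeating series of frames.

    configurations maps a playback configuration name to its ordered list of
    sound axes; each configuration contributes one frame per repetition.
    """
    plan = []
    for rep in range(repetitions):
        for config, axes in configurations.items():
            for axis in axes:
                plan.append((rep, config, axis))
    return plan


# Hypothetical configurations with 5, 2, and 1 sound axes respectively.
configs = {
    "surround": ["left", "center", "right", "left_surround", "right_surround"],
    "stereo": ["left", "right"],
    "mono": ["mono"],
}
plan = build_series(configs, repetitions=2)
```

Each repetition of the series here contains eight slots (5 + 2 + 1), so two repetitions yield two samples of every sound axis in every configuration.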

c. Recording the calibration audio

In FIG. 13, at block 1306, implementation 1300 involves recording the emitted calibration audio. For example, an NMD may record, via a microphone, the calibration audio as emitted by the playback devices of a media playback system (e.g., media playback system 100). As noted above, example NMDs include a control device (e.g., control device 126 or 128 of FIG. 1), a playback device, or any suitable device having a microphone or other sensor with which to record the calibration audio. In some cases, multiple NMDs may record the calibration audio via respective microphones.

In practice, some of the calibration sound may be attenuated or drowned out by the environment or by other conditions, which may prevent the recording device from recording all of the calibration sound. As such, the NMD may measure a portion of the calibration sound as emitted by the playback devices of the media playback system. The calibration audio may be any of the example calibration sounds described above with respect to the example calibration processes, or any other suitable calibration sound.

In some cases, the one or more NMDs may remain more or less stationary while recording the calibration audio. For example, an NMD may be positioned at one or more particular locations (e.g., a preferred listening location). Such positioning may facilitate recording the calibration audio as it would be perceived by a listener at that particular location.

Certain playback configurations may suggest particular preferred listening locations. For example, a playback configuration corresponding to surround sound audio or to audio coupled with video may suggest the location at which a user would watch television while listening to the playback devices (e.g., on a sofa or chair). In some examples, the NMD may prompt for movement to a particular location (e.g., a preferred listening location) before calibration begins. When multiple playback configurations are calibrated, the NMD may prompt for movement to certain listening locations corresponding to each playback configuration.

To illustrate such a prompt, in FIG. 15, smartphone 500 is displaying control interface 1500, which includes graphical region 1502. Graphical region 1502 prompts for movement to a particular location (i.e., where the user would normally watch television in the room). Such a prompt may be displayed to guide the user to begin the calibration sequence at a preferred location. Control interface 1500 also includes selectable controls 1504 and 1506, which respectively advance and step backward in the calibration sequence.

FIG. 16 depicts smartphone 500 displaying control interface 1600, which includes graphical region 1602. Graphical region 1602 prompts the user to raise the recording device to eye level. Such a prompt may be displayed to guide the user to position the phone where measurement of the calibration audio is facilitated. Control interface 1600 also includes selectable controls 1604 and 1606, which respectively advance and step backward in the calibration sequence.

Next, FIG. 17 depicts smartphone 500 displaying control interface 1700, which includes graphical region 1702. Graphical region 1702 prompts the user to "set the best location" (i.e., a preferred location within the environment). After smartphone 500 detects selection of selectable control 1704, smartphone 500 may begin measuring the calibration sound at its current location (and perhaps also instruct one or more playback devices to output the calibration audio). As shown, control interface 1700 also includes selectable control 1706, which advances the calibration sequence (e.g., by causing the smartphone to begin measuring the calibration sound at its current location, as with selectable control 1704).

In FIG. 18, smartphone 500 is displaying control interface 1800, which includes graphical region 1802. Graphical region 1802 indicates that smartphone 500 is recording the calibration audio. Control interface 1800 also includes selectable control 1804, which steps backward in the calibration sequence.

d. Causing the recorded calibration audio to be processed

In FIG. 13, at block 1308, implementation 1300 involves causing the recorded calibration audio to be processed. For example, the NMD may cause a processing device to process the recorded calibration audio. In some cases, the NMD may include the processing device. Alternatively, the NMD may send the recorded audio to one or more other processing devices for processing. Example processing devices include a playback device, a control device, a computing device connected to the media playback system via a local area network, a remote computing device such as a cloud server, or any combination of the above.

Processing the calibration audio may involve determining one or more calibrations for each of the multiple sound axes. Each calibration of the multiple sound axes may involve modifying one or more of amplitude response, frequency response, phase adjustment, or any other acoustic characteristic. Such modifications may spatially calibrate the multiple sound axes to one or more locations (e.g., one or more preferred listening locations).

Such modifications may be applied using one or more filters, implemented in a DSP or as analog filters. The calibration data may include the parameters used to implement the filters (e.g., as the coefficients of biquad filters). Filters may be applied per audio driver or per set of two or more drivers (e.g., two or more drivers that form a sound axis, or two or more audio drivers of the same type, among other examples). In some cases, respective calibrations may be determined for each of the multiple playback configurations under calibration.
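As an illustration of how biquad coefficients carried in calibration data might be applied, the following Python sketch implements a generic Direct Form I biquad section. It is a textbook filter structure rather than the filter design of the disclosure, and it assumes the coefficients are normalized so that a0 equals 1.

```python
def biquad(samples, b0, b1, b2, a1, a2):
    """Apply one Direct Form I biquad section (coefficients normalized so a0 == 1).

    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    """
    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs, initially silent
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


# Pass-through coefficients leave the signal unchanged.
identity = biquad([1.0, 2.0, 3.0], 1.0, 0.0, 0.0, 0.0, 0.0)

# A one-pole low-pass expressed as a biquad: impulse response 1, 0.5, 0.25, ...
decay = biquad([1.0, 0.0, 0.0], 1.0, 0.0, 0.0, -0.5, 0.0)
```

A calibration profile could then be represented as a list of such coefficient sets per audio driver or per sound axis, applied in cascade.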

The recorded calibration audio may be processed as it is recorded or after recording is complete. For example, where the calibration audio is divided into frames, the frames may be sent to the processing device as they are recorded, possibly in sets of frames. Alternatively, the recorded frames may be sent to the processing device after the playback devices have finished emitting the calibration audio.

Processing may involve determining a respective delay for each of the multiple sound axes. Ultimately, such delays may be used to align the times at which the respective sound from each sound axis arrives at a particular location (e.g., a preferred listening location). For example, the calibration profile for a given playback configuration may include filters that delay certain sound axes of the playback configuration so that the arrival times of its sound axes at the preferred listening location are aligned. Sound axes may have different arrival times at a particular location because they are formed by audio drivers at different distances from that location. Moreover, some sound axes may be directed away from the particular location (e.g., the left and right channels of a soundbar-type playback device) and are therefore reflected by the environment before reaching it. Such a sound path may increase the effective distance between the audio drivers forming the sound axis and the particular location, which may result in a later arrival time than that of a sound axis with a more direct path. As noted above, for a surround sound playback configuration, such a preferred listening location may be a sofa or chair.

In examples, the processing device may divide the recorded audio into portions corresponding to the different sound axes and/or playback configurations that emitted each portion. For example, where the calibration sound emitted by the playback devices is divided into frames, the processing device may divide the recorded audio into its constituent frames. Where the calibration sound includes a series of frames, the processing device may attribute the frames of each series to the respective playback configurations to which those frames correspond. The processing device may further divide each frame into the respective time slots corresponding to each sound axis. As noted above, the playback devices may emit the frames and time slots in a known order, and each time slot may have a known duration, which facilitates dividing the recorded audio into its constituent parts. In some examples, each time slot and/or frame may include a watermark that identifies the time slot or frame; the watermark may be used to match frames within the recorded calibration audio to their respective playback configurations and/or to match time slots to their respective sound axes.
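Dividing a recording by known slot order and duration can be sketched as follows. The sketch assumes, for simplicity, that the recording starts exactly at the beginning of the first frame and that every slot has the same duration; the function name `split_recording` and its parameters are hypothetical.

```python
def split_recording(samples, sample_rate_hz, axes, slot_duration_s):
    """Split recorded samples into per-axis slot recordings using the known slot order.

    Assumes the recording is aligned with the start of the first frame.
    Any trailing partial frame is ignored.
    """
    slot_len = int(round(slot_duration_s * sample_rate_hz))
    frame_len = slot_len * len(axes)
    by_axis = {axis: [] for axis in axes}
    for frame in range(len(samples) // frame_len):
        for i, axis in enumerate(axes):
            start = frame * frame_len + i * slot_len
            by_axis[axis].append(samples[start:start + slot_len])
    return by_axis


# Toy recording: 12 samples at 2 Hz, three axes, one-second slots (two samples each).
recording = list(range(12))
by_axis = split_recording(recording, sample_rate_hz=2, axes=["L", "C", "R"], slot_duration_s=1.0)
```

In practice the start of the recording would rarely line up with a frame boundary, which is one reason a watermark or other synchronization cue may be useful.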

The processing device may determine an impulse response for each sound axis. Each impulse response may be further processed by generating frequency-filtered responses that divide the impulse response into frequency bands. Different types of audio drivers may array better in different frequency bands. For example, mid-range woofers may array well to form a sound axis in the range of 300 Hz to 2.5 kHz. As another example, tweeters may array well in the range of 8 kHz to 14 kHz. Where an example sound axis is configured to form the center channel of a surround sound configuration, the sound axis should be at its maximum on-axis and attenuated to the right and left. Conversely, for sound axes forming the left and right channels of a surround sound configuration, each array should be attenuated (e.g., nulled) on-axis and at its maximum to the left or right, respectively. Outside of certain ranges, such as those given above, the audio drivers may not form the sound axis as well in the intended direction. These frequency ranges are provided by way of example and may vary with the capabilities and characteristics of different audio drivers.

As another example, for a playback device with multiple audio drivers of different types (e.g., tweeters and woofers), the processing device may determine three band-limited responses. Such responses may include a full-range response, a response covering the mid-range of the woofers (e.g., 300 Hz to 2.5 kHz), and a response covering the high frequencies of the tweeters (e.g., 8 kHz to 14 kHz). Such frequency-filtered responses may facilitate further processing by representing each sound axis more clearly.

Processing the recorded audio may involve comparing the responses of the sound axes with one another. To facilitate such comparison, the impulse responses for the time slots may be time-aligned with one another (since they were emitted during different periods of time). For example, the impulse responses may be aligned to a first reference point, such as the beginning of each time slot. Such time alignment of the impulse responses facilitates identifying particular reference points within each response.

In an example implementation, identifying the particular reference point in each response involves identifying a given second reference point in the impulse response of a reference sound axis. As an example, the reference sound axis may be the sound axis corresponding to the center channel of a surround sound system (e.g., a 3.0, 3.1, 5.0, 5.1, or other multi-channel playback configuration). This sound axis may serve as the reference because sound from it travels more directly to the typical preferred listening location than sound from other sound axes (e.g., the sound axes forming the left and right channels). The given second reference point in this impulse response may be the first peak. The first peak can be assumed to correspond to the direct signal (rather than a reflection) from the one or more audio drivers to the NMD. This given second reference point (i.e., the first peak) is used as the reference for the subsequent arrival times of the other sound axes at the NMD.

To compare the arrival times of the other sound axes at the NMD with the arrival time of the reference sound axis at the NMD, the processing device may identify second reference points in the other impulse responses. These other second reference points correspond to the same second reference point in the reference sound axis. For example, if the first peak in the impulse response of the reference sound axis is used as the given second reference point, then the first peaks in the other impulse responses are identified as their second reference points.

Where the approximate physical configuration of the multiple audio drivers is known, a time window may be applied to limit the portion of each impulse response in which the second reference point is to be identified. For example, where the sound axes form left, right, and center channels, the impulse responses of the sound axes forming the left and right channels may be limited to a time window after the peak in the impulse response of the sound axis forming the center channel. Sound from the sound axes forming the left and right channels travels outward to the left and right (rather than on-axis), so the peak of interest will be the reflection of the sound from those axes by the environment. However, the sound axes forming the left surround channel, the right surround channel, and/or the subwoofer channel may be physically closer to the NMD than the one or more audio drivers forming the center channel. As such, the windows for the impulse responses corresponding to those axes may include time both before and after the given reference point in the reference sound axis, to account for the possibility of either a positive or a negative delay relative to the reference sound axis.

Once the respective second reference points in the impulse responses have been identified, the respective times at which sound from each sound axis arrives at the NMD (i.e., at the microphone of the NMD) can be determined. In particular, the processing device may determine the respective arrival times at the microphone by comparing the respective differences between the first reference point and the second reference point in each impulse response.
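The reference-point logic above can be sketched in Python as follows. The first reference point is taken to be the start of the time slot (sample index 0 of a time-aligned impulse response), and the second reference point is taken to be the first local peak that reaches a fraction of the largest magnitude within an optional time window. The threshold ratio and the function names are assumptions made for illustration; the disclosure does not specify a particular peak-picking rule.

```python
def first_peak_index(impulse_response, threshold_ratio=0.5, window=None):
    """Return the index of the first local peak in |impulse_response| within window.

    A sample counts as a peak if its magnitude is at least threshold_ratio times
    the largest magnitude in the window and is not smaller than its neighbours.
    window is an optional (start, stop) pair of sample indices.
    """
    start, stop = window if window is not None else (0, len(impulse_response))
    segment = [abs(v) for v in impulse_response[start:stop]]
    if not segment:
        raise ValueError("empty search window")
    threshold = threshold_ratio * max(segment)
    for i, v in enumerate(segment):
        left = segment[i - 1] if i > 0 else 0.0
        right = segment[i + 1] if i + 1 < len(segment) else 0.0
        if v >= threshold and v >= left and v >= right:
            return start + i
    raise ValueError("no peak found; threshold_ratio should be between 0 and 1")


def arrival_time_s(impulse_response, sample_rate_hz, window=None):
    """Arrival time as the gap between the slot start (index 0) and the first peak."""
    return first_peak_index(impulse_response, window=window) / sample_rate_hz


# Toy impulse response: small pre-ringing, a direct peak at index 4, a reflection at index 7.
ir = [0.0, 0.1, 0.0, 0.0, 1.0, 0.3, 0.0, 0.6, 0.0]
direct = first_peak_index(ir)
reflection = first_peak_index(ir, window=(6, 9))
```

Restricting the window to samples after the reference peak, as in the second call, corresponds to looking for a reflected arrival from a side-directed sound axis.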

Having determined the respective times at which sound from each sound axis arrives at the NMD, the processing device may determine the respective delay to apply to each sound axis. The processing device may determine the delays relative to a delay target. The delay target may be the sound axis with the latest arrival time. The sound axis used as the delay target might not receive any delay. The other sound axes may be assigned delays that match their arrival times to that of the delay target. In some cases the sound axis forming the center channel cannot be used as the delay target, because a sound axis with a later arrival time cannot be assigned a "negative" delay to match the arrival time of the sound axis forming the center channel.
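Choosing the latest-arriving sound axis as the delay target guarantees that every assigned delay is non-negative. A minimal Python sketch, with a hypothetical function name and made-up arrival times in milliseconds:

```python
def delays_to_latest(arrival_times_ms):
    """Delay each sound axis so that all axes arrive when the latest one does."""
    target = max(arrival_times_ms.values())  # arrival time of the delay target
    return {axis: target - t for axis, t in arrival_times_ms.items()}


# Hypothetical measured arrival times at the microphone, in milliseconds.
arrivals_ms = {"center": 8, "left": 12, "right": 11}
delays_ms = delays_to_latest(arrivals_ms)
```

Here the left axis arrives last and receives no delay, while the center axis, which has the most direct path, receives the largest delay.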

In some cases, the delay for any given sound axis may be capped at a maximum delay threshold. Such a cap may prevent a large delay from causing a noticeable mismatch between the audio content output by the sound axes and video content coupled to that audio content (e.g., lip-sync issues). Such a cap might be applied only to playback configurations that include audio paired with video, since a large delay might not affect the user experience when the audio is not paired with video. Alternatively, if the video display is synchronized with the one or more playback devices, the video may be delayed to avoid a noticeable mismatch between the audio content output by the sound axes and the video content coupled to that audio content, which may eliminate the need for a maximum delay threshold.
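
The capping behavior described above might be sketched as follows in Python. This is an illustrative assumption rather than a definitive implementation; the function name, the `paired_with_video` flag, and the threshold value are hypothetical.

```python
def cap_delays(delays_ms, max_delay_ms, paired_with_video):
    """Cap per-axis delays at a maximum threshold when audio is paired with video."""
    if not paired_with_video:
        # Assumption: without video, large delays are left untouched.
        return dict(delays_ms)
    return {axis: min(delay, max_delay_ms) for axis, delay in delays_ms.items()}
```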

As noted above, the NMD that records the calibration audio might not perform certain portions of the processing (or might not process the calibration audio at all). Rather, the NMD may send data representing the recorded calibration audio to a processing device, perhaps with one or more instructions on how to process the recorded calibration audio. In other cases, the processing device may be programmed to process the recorded calibration audio using certain techniques. In such embodiments, sending the data representing the recorded calibration audio (e.g., data representing raw samples of the calibration audio and/or data representing partially processed calibration audio) may cause the processing device to determine a calibration profile (e.g., filter parameters).

e. Cause calibration of the multiple sound axes

In FIG. 13, at block 1310, implementation 1300 involves causing the multiple sound axes to be calibrated. For example, the NMD may send calibration data to the one or more playback devices that form the multiple sound axes. Alternatively, the NMD may instruct another processing device to send the calibration data to the playback devices. Such calibration data may cause the one or more playback devices to calibrate the multiple sound axes to a particular response.

As noted above, calibration of the multiple sound axes may involve modifying one or more of magnitude response, frequency response, phase adjustment, or any other acoustic characteristic. Such modifications may be applied using one or more filters implemented in a DSP or as analog filters. The calibration data may include the parameters used to implement the filters (e.g., as the coefficients of a biquad filter). Filters may be applied per audio driver or per set of two or more drivers (e.g., two or more drivers that form a sound axis, or two or more audio drivers of the same type, among other examples).
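
For illustration, a biquad filter parameterized by coefficients might be applied to a block of samples as sketched below in Python. The sketch assumes a Direct Form I structure with the leading feedback coefficient normalized to 1; the function name and coefficient names are illustrative assumptions, and no particular coefficient values are implied by this disclosure.

```python
def biquad(samples, b0, b1, b2, a1, a2):
    """Apply y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0  # filter state (previous inputs and outputs)
    out = []
    for x0 in samples:
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out
```

Under this assumption, calibration data for one sound axis could be carried as five numbers per biquad section, and several sections could be cascaded by feeding the output of one call into the next.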

Calibrating the multiple sound axes may include causing the audio output of the multiple sound axes to be delayed according to the respective determined delays for the sound axes. Such delays may be formed by causing respective filters to delay the audio output of the multiple audio drivers according to the respective determined delays for the multiple sound axes. Such filters may implement a circular buffer delay line, among other examples.
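
A circular buffer delay line of the kind mentioned above might be sketched as follows in Python. The class name and the choice to express the delay in whole samples are assumptions made for illustration only.

```python
class DelayLine:
    """Delay a stream of samples by a fixed number of samples using a circular buffer."""

    def __init__(self, delay_samples):
        if delay_samples < 0:
            raise ValueError("delay_samples must be non-negative")
        self._buffer = [0.0] * delay_samples
        self._index = 0  # position of the oldest stored sample

    def process(self, sample):
        if not self._buffer:
            return sample  # zero delay: pass the sample straight through
        delayed = self._buffer[self._index]
        self._buffer[self._index] = sample  # overwrite oldest with newest
        self._index = (self._index + 1) % len(self._buffer)
        return delayed
```

One such delay line per sound axis, each configured with that axis's determined delay, would cause the outputs of the axes to be offset relative to one another.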

In some cases, the delay is dynamic. For example, the response of one axis may overlap the response of another axis within a given range, yet the sound axes may have different times-of-arrival (thus suggesting different delays). In such cases, the delay of each sound axis may be smoothed across the overlapping range. For example, a delay curve may be implemented across the range to smooth the delay. Such smoothing may improve the user experience by avoiding potentially noticeable differences in delay between sound axes in the overlapping range.

As noted above, in some cases, sound produced by certain sound axes may first be reflected by the environment.

IV. Example Techniques to Facilitate Spectral Calibration Using Applied Spatial Calibration

As noted above, embodiments described herein may facilitate calibration of one or more playback devices. FIG. 19 shows an example implementation 1900 by which a playback device facilitates spectral calibration using applied spatial calibration.

a. Receive data representing one or more spatial calibrations

At block 1902, implementation 1900 involves receiving data representing one or more spatial calibrations. For example, a playback device (e.g., any playback device of the media playback system 100 in FIG. 1 or the playback device 300 in FIG. 3) may receive, via a network interface, data representing one or more spatial calibrations (e.g., any of the multiple calibrations described above in connection with implementation 1300 of FIG. 13) from a device such as a processing device or an NMD, among other possible sources. Each calibration may have been determined in advance by a calibration sequence, such as the example calibration sequences described above.

A calibration may include one or more filters. Such filters may modify one or more of magnitude response, frequency response, phase adjustment, or any other acoustic characteristic. Further, such filters may calibrate the one or more playback devices under calibration to one or more particular listening locations within the listening area. As noted above, the filters may be implemented in a DSP (e.g., as the coefficients of a biquad filter), as analog filters, or as a combination thereof. The received calibration data may include a filter for each audio channel, axis, or device under calibration. Alternatively, a filter may be applied to more than one audio channel, axis, or device.

In some cases, multiple calibrations may correspond to respective playback configurations. As noted above, a playback configuration refers to a particular set of sound axes formed by multiple audio drivers. Further, an example spatial calibration may include calibration of the audio drivers in multiple playback configurations. As such, there may be more than one filter (or set of filters) for each audio channel, axis, or device. Each filter (or set of filters) may correspond to a different playback configuration.

As noted above, playback configurations may involve variation in the assignment of audio drivers used to form sound axes. Each sound axis in a playback configuration may correspond to a respective input channel of the audio content. Example playback configurations may correspond to different numbers of input channels, such as mono, stereo, surround (e.g., 3.0, 5.0, 7.0), or any of the foregoing combined with a subwoofer (e.g., 3.1, 5.1, 7.1). Other playback configurations may be based on the type of input content. For example, example playback configurations may correspond to input audio content that includes music, home theater (i.e., audio paired with video), surround sound audio content, spoken word, and so on. These example playback configurations should not be considered limiting. The received calibrations may include one or more filters corresponding to any individual playback configuration or any combination of playback configurations.

The playback device may maintain these calibrations in data storage. Alternatively, such calibrations may be maintained on a device or system that is communicatively coupled to the playback device via a network. The playback device may receive the calibrations from that device or system, perhaps upon request from the playback device.

b. Cause one or more audio drivers to output calibration audio

In FIG. 19, at block 1904, implementation 1900 involves causing one or more audio drivers to output calibration audio. For example, the playback device may cause an audio stage to drive the audio drivers to output the calibration audio. An example audio stage may include one or more amplifiers, signal processing (e.g., a DSP), and possibly other components. In some cases, the playback device may instruct other playback devices under calibration to output the calibration audio, perhaps when acting as the group coordinator for the playback devices under calibration.

The calibration audio may include one or more calibration sounds, such as a frequency sweep ("chirp"), brown noise or other types of noise, or a song, among other examples. Additional details on example calibration sounds are noted above in connection with the example calibration sequences described above.

The calibration audio may be divided into frames. As shown in FIGS. 11 and 12 and described herein, a frame may represent an iteration of an example calibration sound. When recorded, a frame may yield a respective sample of the calibration sound as emitted by the one or more audio drivers. Frames may repeat to yield multiple samples.

As noted above, a calibration sequence may involve calibration of multiple sound axes. In such cases, the calibration audio output may be divided into constituent frames, where each frame includes calibration audio for each sound axis under calibration. As such, when recorded, each frame may include a sample of the calibration audio produced by each sound axis. The frames may repeat to yield multiple samples for each sound axis.

As noted above, in some example calibration procedures, the playback devices of a media playback system may be calibrated for multiple playback configurations. Alternatively, different playback configurations of a set of audio drivers may be calibrated in respective calibration sequences. Example calibration audio for calibrating multiple playback configurations may include a repeating set of frames. Each frame in the set may correspond to a respective playback configuration. For example, example calibration audio for calibrating three playback configurations may include a series of three frames (e.g., frames 1402, 1404, and 1406 of FIG. 14).

During each frame, the playback device may apply the spatial calibration corresponding to the respective playback configuration. Applying a spatial calibration may involve causing an audio stage (or multiple audio stages) to apply one or more respective filters corresponding to each playback configuration. As the input signal passes through the one or more filters, the calibration is applied so as to modify one or more of the magnitude response, frequency response, phase adjustment, or any other acoustic characteristic of the one or more audio drivers as the calibration audio is emitted. As noted above, such filters may modify the emitted calibration audio to suit a particular listening location. For example, example spatial filters may at least partially balance the times-of-arrival of sound from multiple sound axes at a particular listening location.
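
The pairing of frames with playback configurations and their spatial filters might be sketched as follows in Python. This is a simplified illustration under stated assumptions: the configuration labels, the representation of a spatial filter as an opaque value in a dictionary, and the function names are all hypothetical.

```python
from itertools import cycle, islice


def frame_schedule(configurations, total_frames):
    """Return the playback configuration used for each frame, repeating the set in order."""
    return list(islice(cycle(configurations), total_frames))


def filters_per_frame(spatial_filters, configurations, total_frames):
    """Return, frame by frame, the spatial filter an audio stage would apply."""
    return [spatial_filters[config]
            for config in frame_schedule(configurations, total_frames)]
```

With three hypothetical configurations, frames 0, 3, 6, and so on would use the first configuration's spatial filter, so that each repetition of the set yields one more recorded sample per configuration.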

In other embodiments, the spatial calibration may be applied to the calibration audio by a device other than the playback device. The spatial calibration may be applied by any device that stores the calibration audio and/or generates the calibration audio for output by the audio drivers using that device's processor or DSP. Further, the spatial calibration may be applied by any intermediary device between the device that stores the calibration audio and the one or more playback devices under calibration.

To include calibration audio for each sound axis under calibration, each frame may be further divided into slots. Each slot may include the calibration audio for a respective sound axis under calibration. For example, an example frame for a playbar-type playback device (e.g., the playback device 104 shown in FIG. 1) that forms three sound axes (such as a left channel, a right channel, and a center channel) may include three slots. If, for example, that device is to be calibrated with a subwoofer-type device, each frame may include four slots: one for each sound axis formed by the playbar-type playback device and one for the sound axis produced by the subwoofer. As another example, where the playbar-type playback device is calibrated with two additional playback devices that produce respective sound axes (e.g., a surround left channel and a surround right channel), each frame may include five slots (or six slots if calibrated with a subwoofer). FIG. 14 shows example calibration audio with constituent frames that are divided into slots.
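
The division of a frame into slots might be sketched as follows in Python, assuming one slot of equal, known duration per sound axis. The function name, axis labels, and slot duration are hypothetical.

```python
def slot_layout(sound_axes, slot_duration_s):
    """Return (axis, start_s, end_s) for each slot of one frame, in playback order."""
    return [(axis, i * slot_duration_s, (i + 1) * slot_duration_s)
            for i, axis in enumerate(sound_axes)]
```

Under this assumption, a playbar-type device with left, right, and center axes yields three slots per frame, and adding surround left, surround right, and a subwoofer axis yields six.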

As noted above, each slot may include calibration audio for a respective sound axis under calibration. The calibration audio in each slot may include a frequency sweep ("chirp"), brown noise, or other types of noise, among other examples. For example, as shown in FIGS. 11 and 12, the calibration audio in each slot may include a hybrid calibration sound. The slots may occur sequentially in a known order, to facilitate matching the slots within the recorded calibration audio to the respective sound axes. Each slot may have a known duration, which may also facilitate matching the slots within the recorded calibration audio to the respective sound axes. In other examples, each slot and/or frame may include a watermark (e.g., a particular sound pattern) that identifies the slot or frame, which may be used to match the slots within the recorded calibration audio to the respective sound axes.
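
Matching a position in the recorded calibration audio to a sound axis by known slot order and duration might be sketched as follows in Python. The sketch assumes that the recording is aligned to the start of the first frame and that every slot has the same length in samples; both are simplifying assumptions, and the function name is hypothetical.

```python
def axis_for_sample(sample_index, sound_axes, slot_length_samples):
    """Return the sound axis whose slot contains the given recorded sample."""
    frame_length = len(sound_axes) * slot_length_samples
    offset = sample_index % frame_length      # position within the current frame
    return sound_axes[offset // slot_length_samples]
```

A watermark-based approach, as also mentioned above, would not need the alignment assumption, since each slot or frame would identify itself.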

c. Receive data representing one or more spectral calibrations

In FIG. 19, at block 1906, implementation 1900 involves receiving data representing one or more spectral calibrations. For example, the playback device may receive data representing one or more spectral calibrations from a processing device. These spectral calibrations may be based on the calibration audio output by the one or more audio drivers. In particular, the calibration audio output from the one or more audio drivers may be recorded by one or more recording devices (e.g., NMDs). Before being recorded, the calibration audio may interact with the surrounding environment (e.g., be reflected or absorbed) and thereby may represent characteristics of the environment.

An example spectral calibration may offset the acoustic characteristics of the environment to achieve a given response (e.g., a flat response, a response that is considered desirable, or a set equalization). For example, if a given environment attenuates frequencies around 500 Hz and amplifies frequencies around 14000 Hz, the calibration might boost frequencies around 500 Hz and cut frequencies around 14000 Hz to offset these environmental effects.
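
The offsetting described above might be sketched, in simplified form, as per-band gains computed in decibels. The Python sketch below assumes that the measured response is available as a level in dB per frequency band and that the target is a flat response at 0 dB; the function name and the numeric levels used with it are hypothetical and are not taken from this disclosure.

```python
def correction_gains_db(measured_db, target_db=0.0):
    """Return the gain in dB per band that moves the measured response to the target."""
    return {freq_hz: target_db - level_db
            for freq_hz, level_db in measured_db.items()}
```

For example, a hypothetical measurement of -3.0 dB at 500 Hz and +2.0 dB at 14000 Hz would call for a 3.0 dB boost at 500 Hz and a 2.0 dB cut at 14000 Hz.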

Some example techniques for determining a calibration are described in U.S. Patent Application No. 13/536,493, filed June 28, 2012, entitled "System and Method for Device Playback Calibration," and published as US 2014/0003625 A1, which is incorporated herein in its entirety. Example techniques are described in paragraphs [0019] to [0025] and [0068] to [0118], as well as generally throughout the specification.

Other example techniques for determining a calibration are described in U.S. Patent Application No. 14/216,306, filed March 17, 2014, entitled "Audio Settings Based On Environment," and published as US 2015/0263692 A1, which is incorporated herein in its entirety. Example techniques are described in paragraphs [0014] to [0025] and [0063] to [0114], as well as generally throughout the specification.

Further example techniques for determining a calibration are described in U.S. Patent Application No. 14/481,511, filed September 9, 2014, entitled "Playback Device Calibration," and published as US 2016/0014534 A1, which is incorporated herein in its entirety. Example techniques are described in paragraphs [0017] to [0043] and [0082] to [0184], as well as generally throughout the specification.

Example processing devices include NMDs, other playback devices, control devices, computing devices connected to the media playback system via a local area network, remote computing devices such as cloud servers, or any combination of the above. In some cases, the one or more processing devices may send the spatial calibrations to one or more intermediary devices, which may in turn send the spatial calibrations to the playback device. Such intermediary devices may store the data representing the one or more spatial calibrations.

d. Apply a particular spectral filter

At block 1908, implementation 1900 involves applying a particular spectral calibration. For example, when playing back audio content in a given playback configuration, the playback device may apply the particular filter corresponding to that playback configuration. The playback device may maintain, or have access to, respective spectral calibrations corresponding to multiple playback configurations.

In some examples, the playback device may be instructed to enter a particular playback configuration and accordingly apply the particular calibration corresponding to that playback configuration. For example, a control device may send a command to form a particular set of sound axes corresponding to a given playback configuration.

Alternatively, the playback device may detect the appropriate spectral calibration to apply based on its current configuration. As noted above, playback devices may be joined into various groupings, such as zone groups or bonded zones. Each grouping may represent a playback configuration. In some implementations, upon being joined into a grouping with other playback devices, the playback device may apply the particular calibration associated with the playback configuration of that grouping. For example, based on detecting that it has been joined into a particular zone group, the playback device may apply a particular calibration associated with zone groups (or with that particular zone group).

The playback device may detect the spectral calibration to apply based on the audio content that is provided to the playback device (or that it has been instructed to play back). For example, the playback device may detect that it is playing back media content that consists only of audio (e.g., music). In such cases, the playback device may apply the particular calibration associated with a playback configuration that corresponds to music playback. As another example, the playback device may receive media content associated with both audio and video (e.g., a television show or movie). When playing back such content, the playback device may apply a particular calibration corresponding to audio paired with video, or perhaps a calibration corresponding to home theater (e.g., surround sound).

The playback device may apply a particular calibration based on the source of the audio content. Receiving content via a particular one of these sources may trigger a particular playback configuration. For example, receiving content via a network interface may indicate music playback. As such, when receiving content via the network interface, the playback device may apply the particular calibration associated with the particular playback configuration that corresponds to music playback. As another example, receiving content via a particular physical input may indicate home theater use (i.e., playback of audio from a television show or movie). When playing back content from that input, the playback device may apply a different calibration associated with the playback configuration that corresponds to home theater playback.
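
Selecting a calibration from the content source might be sketched as a two-step lookup, as shown below in Python. The source labels, configuration labels, and function name are hypothetical, and the mapping itself is an assumption made for illustration; an actual device could use any rule to relate sources to playback configurations.

```python
# Hypothetical mapping from content source to playback configuration.
SOURCE_TO_CONFIGURATION = {
    "network_interface": "music",
    "tv_physical_input": "home_theater",
}


def select_calibration(source, calibrations, default=None):
    """Return the stored calibration for the configuration implied by the source."""
    configuration = SOURCE_TO_CONFIGURATION.get(source)
    return calibrations.get(configuration, default)
```

Here `calibrations` stands in for whatever store of per-configuration spectral calibrations the playback device maintains or can access.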

A given zone scene may be associated with a particular playback configuration. Upon entering a particular zone scene, and thereby entering a particular playback configuration, the playback device may apply the particular calibration associated with that playback configuration. Alternatively, the content or configuration associated with the zone scene may cause the playback device to apply a particular calibration. For example, a zone scene may involve playback of particular media content or of a particular content source, which causes the playback device to apply a particular calibration.

In further examples, the playback configuration may be indicated to the playback device by way of one or more messages from a control device or another playback device. For example, after receiving input that selects a particular playback configuration, a device may indicate to the playback device that the particular playback configuration was selected. The playback device may then apply the calibration associated with that playback configuration. As another example, the playback device may be a member of a group, such as a bonded zone group. Another playback device, such as the group coordinator device of that group, may detect the playback configuration of the group and send the playback device a message indicating the playback configuration (or the calibration for that configuration).

In some cases, the playback device may also cause the calibration to be applied by one or more additional playback devices. For example, the playback device may be a member (e.g., the group coordinator) of a group (e.g., a zone group). The playback device may send messages instructing the other playback devices in the group to apply the calibration. Upon receiving such a message, those playback devices may apply the calibration.

In some examples, a calibration or calibration state may be shared among the devices of the media playback system using one or more state variables. Some example techniques involving calibration state variables are described in U.S. Patent Application No. 14/793,190, filed July 7, 2015, entitled "Calibration State Variable," and U.S. Patent Application No. 14/793,205, filed July 7, 2015, entitled "Calibration Indicator," both of which are incorporated herein in their entirety.

V. Example Techniques for Facilitating Spectral Calibration Using Applied Spatial Calibration

As noted above, embodiments described herein may facilitate calibration of one or more playback devices. FIG. 20 shows an example implementation 2000 by which an NMD facilitates spectral calibration of a media playback system using applied spatial calibration.

a. Detect a trigger condition

At block 2002, implementation 2000 involves detecting a trigger condition that initiates calibration. For example, the NMD may detect a trigger condition that initiates calibration of the media playback system. The trigger condition may initiate calibration of one or more playback devices of the media playback system for multiple playback configurations, either explicitly or perhaps because the one or more audio drivers of the one or more playback devices have been set up with multiple playback configurations. Example trigger conditions for initiating calibration are described in Section III.a above, as well as generally throughout this disclosure.

b. Cause one or more audio drivers to output calibration audio

In FIG. 20, at block 2004, implementation 2000 involves causing one or more audio drivers to output calibration audio. For example, the NMD may cause multiple audio drivers to output the calibration audio. The NMD may send an instruction to the playback devices under calibration, perhaps via a network interface. Example calibration audio is described above in connection with the example calibration techniques.

c. Record the calibration audio

In FIG. 20, at block 2006, implementation 2000 involves recording the calibration audio. For example, the NMD may record, via a microphone, the calibration audio as output by the one or more audio drivers of the one or more playback devices under calibration. In some cases, multiple NMDs may record the calibration audio via respective microphones.

The NMD may be moved within the environment while recording the calibration audio, so as to measure the calibration sound at different locations. With a moving microphone, repetitions of the calibration sound are detected at different physical locations within the environment. Samples of the calibration sound at different locations may provide a better representation of the surrounding environment than samples taken at a single location. For example, referring back to FIG. 7, the control device 126 of the media playback system 100 may detect, at different points along the path 700 (e.g., at point 702 and/or point 704), the calibration audio emitted by one or more playback devices (e.g., the playback devices 104, 106, 108, and/or 110 of the living room zone). Alternatively, the control device may record the calibration signal along the path.

As such, while the calibration audio is being emitted, the NMD may display one or more prompts to move the NMD. Such prompts may guide the user in moving the recording device during calibration. To illustrate, in FIG. 21, the smartphone 500 is displaying a control interface 2100 that includes graphical regions 2102 and 2104. Graphical region 2102 prompts the user to watch the animation in graphical region 2104. Such an animation may depict an example of how to move the smartphone within the environment during calibration to measure the calibration audio at different locations. While an animation is shown in graphical region 2104 by way of example, the control device may alternatively show a video or other indication that illustrates how to move the control device within the environment during calibration. The control interface 2100 also includes selectable controls 2106 and 2108, which respectively advance and step backward in the calibration sequence.

Other examples of recording calibration audio are described above in Section III.a and generally throughout this disclosure.

d. Determining one or more spectral calibrations

At block 2008, implementation 2000 involves determining spectral calibrations. For example, the NMD may cause a processing device to determine respective sets of spectral filters for the multiple playback configurations under calibration. These spectral calibrations may be based on the recorded calibration audio output by the one or more audio drivers. In some cases, the NMD may include the processing device. Alternatively, the NMD may send the recorded audio to one or more other processing devices. Example processing devices and processing techniques are described above.

When the media playback system is playing back audio content in a given playback configuration, the NMD may cause a particular calibration (e.g., a particular set of spectral filters) corresponding to the given playback configuration to be applied to the sound axes formed by the multiple audio drivers. Further examples of applying calibrations are described above.
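
By way of illustration only, selecting and applying the set of spectral filters for a given playback configuration might be sketched as follows. The sketch assumes that each spectral filter is represented as a list of FIR coefficients and that calibrations are kept in a dictionary keyed by configuration name and sound axis; the names `fir_filter` and `apply_calibration` are hypothetical and do not appear in this disclosure.

```python
from typing import Dict, List

# Hypothetical layout: playback configuration -> sound axis -> FIR coefficients.
SpectralCalibrations = Dict[str, Dict[str, List[float]]]


def fir_filter(samples: List[float], taps: List[float]) -> List[float]:
    """Apply an FIR filter by direct convolution, truncated to the input length."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        out.append(acc)
    return out


def apply_calibration(
    calibrations: SpectralCalibrations,
    configuration: str,
    axis_audio: Dict[str, List[float]],
) -> Dict[str, List[float]]:
    """Filter each sound axis with the spectral filter stored for `configuration`."""
    filters = calibrations[configuration]
    return {axis: fir_filter(audio, filters[axis]) for axis, audio in axis_audio.items()}
```

In such a sketch, switching playback configurations only changes which entry of the dictionary is looked up; the per-axis filtering step itself stays the same.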

VI. Conclusion

The description above discloses, among other things, various example systems, methods, apparatus, and articles of manufacture including, among other components, firmware and/or software executed on hardware. It is understood that such examples are merely illustrative and should not be considered as limiting. For example, it is contemplated that any or all of the firmware, hardware, and/or software aspects or components can be embodied exclusively in hardware, exclusively in software, exclusively in firmware, or in any combination of hardware, software, and/or firmware. Accordingly, the examples provided are not the only way to implement such systems, methods, apparatus, and/or articles of manufacture.

(Feature 1) A method comprising: detecting a trigger condition that initiates calibration of a media playback system including a plurality of audio drivers, the plurality of audio drivers forming a plurality of sound axes, each sound axis corresponding to a respective channel of multi-channel audio content; causing, via a network interface, the plurality of audio drivers to emit calibration audio that is divided into constituent frames, the plurality of sound axes emitting the calibration audio during respective time slots of each constituent frame; recording, via a microphone, the emitted calibration audio; causing a respective delay for each sound axis of the plurality of sound axes to be determined based on the time slots of the recorded calibration audio that correspond to the sound axis; and causing the plurality of sound axes to be calibrated, wherein calibrating the plurality of sound axes comprises causing audio output of the plurality of sound axes to be delayed according to the respective determined delays.

(Feature 2) The method of feature 1, wherein causing the respective delay for each sound axis of the plurality of sound axes to be determined comprises: causing a processing device to determine a respective time of arrival at the microphone for each sound axis of the plurality of sound axes from the time slots of the recorded calibration audio that correspond to each sound axis; and causing a delay for each sound axis of the plurality of sound axes to be determined, each determined delay corresponding to the determined time of arrival of the respective sound axis.

(Feature 3) The method of feature 2, wherein causing the audio output of the plurality of sound axes to be delayed according to the respective determined delays comprises: causing respective filters to delay the audio output of the plurality of audio drivers according to the respective determined delays of the plurality of sound axes.

(Feature 4) The method of feature 2, wherein the NMD comprises the processing device, and wherein causing the processing device to determine the respective time of arrival at the microphone for each sound axis of the plurality of sound axes comprises: dividing the recorded calibration audio into the constituent frames and dividing each constituent frame into the respective time slot for each sound axis; determining a respective impulse response for each sound axis from the respective time slots corresponding to that sound axis; aligning the respective impulse responses to a first reference point; identifying a respective second reference point in each impulse response; and determining the respective times of arrival at the microphone based on the respective difference between the first reference point and the second reference point in each impulse response.
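
As a non-limiting illustration of the dividing step recited in feature 4, the recorded calibration audio might be split into frames and per-axis time slots as sketched here. The sketch assumes equal-length slots that follow a fixed axis order within every frame, which this disclosure does not require; `split_into_slots` and its parameters are hypothetical names.

```python
from typing import Dict, List


def split_into_slots(
    recording: List[float], frame_length: int, axes: List[str]
) -> Dict[str, List[List[float]]]:
    """Divide a recording into frames, then each frame into one slot per sound axis.

    Assumption: every frame holds len(axes) equal-length slots in the order of `axes`.
    A trailing partial frame is ignored.
    """
    slot_length = frame_length // len(axes)
    slots: Dict[str, List[List[float]]] = {axis: [] for axis in axes}
    for frame_start in range(0, len(recording) - frame_length + 1, frame_length):
        for position, axis in enumerate(axes):
            start = frame_start + position * slot_length
            slots[axis].append(recording[start:start + slot_length])
    return slots
```

The slots collected for one sound axis across repeated frames could then serve as the input from which that axis's impulse response is estimated.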

(Feature 5) The method of feature 4, wherein the sound axes consist of a reference sound axis and one or more other sound axes, and wherein identifying the respective second reference point in each impulse response comprises: identifying a peak in the impulse response of the reference sound axis as a given second reference point; and identifying, in a time window subsequent to the given second reference point, respective peaks in the impulse responses of the one or more other sound axes as the other second reference points.
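
The peak identification of features 4 and 5 might be sketched as follows, purely as an example. The sketch assumes that the impulse responses have already been estimated and aligned so that the first reference point is sample index 0, that a "peak" is the largest-magnitude sample, and that the time window begins at the reference peak itself; the function names are hypothetical.

```python
from typing import Dict, List, Optional


def peak_index(response: List[float], start: int = 0, stop: Optional[int] = None) -> int:
    """Return the index of the largest-magnitude sample in response[start:stop]."""
    end = len(response) if stop is None else min(stop, len(response))
    return max(range(start, end), key=lambda i: abs(response[i]))


def arrival_times_from_peaks(
    impulse_responses: Dict[str, List[float]],
    reference_axis: str,
    window: int,
    sample_rate: float,
    first_reference: int = 0,
) -> Dict[str, float]:
    """Estimate a time of arrival (seconds) per sound axis from aligned impulse responses."""
    # Second reference point of the reference axis: its overall peak.
    ref_peak = peak_index(impulse_responses[reference_axis])
    times = {}
    for axis, response in impulse_responses.items():
        if axis == reference_axis:
            second = ref_peak
        else:
            # Other axes: peak inside the window that follows the reference peak.
            second = peak_index(response, ref_peak, ref_peak + window)
        times[axis] = (second - first_reference) / sample_rate
    return times
```

Restricting the search for the other axes to a window after the reference peak keeps later reflections outside the window from being mistaken for the direct arrival.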

(Feature 6) The method of feature 2, wherein the processing device is connected to the NMD via one or more networks, and wherein causing the processing device to determine the respective time of arrival at the microphone for each sound axis of the plurality of sound axes comprises: sending, via the network interface to the processing device, (i) the recorded calibration audio and (ii) an instruction to determine the respective time of arrival at the microphone for each sound axis of the plurality of sound axes; and receiving, via the network interface, the determined respective times of arrival.

(Feature 7) The method of feature 1, wherein each sound axis of the plurality of sound axes corresponds to a respective channel of surround sound audio content.

(Feature 8) The method of feature 7, wherein the media playback system comprises a plurality of playback devices, each playback device comprising a subset of the plurality of audio drivers.

(Feature 9) The method of feature 8, wherein the plurality of playback devices comprises a given playback device that comprises a particular subset of the plurality of audio drivers, wherein the particular subset of the plurality of audio drivers forms three sound axes, the three sound axes corresponding respectively to a left channel of the surround sound audio content, a right channel of the surround sound audio content, and a center channel of the audio content.

(Feature 10) The method of feature 1, wherein detecting the trigger condition that initiates calibration of the media playback system comprises: detecting, via a user interface, input data indicating a command to initiate calibration of the media playback system.

(Feature 11) The method of feature 1, wherein detecting the trigger condition that initiates calibration of the media playback system comprises: detecting configuration of the media playback system into a particular axis configuration in which the plurality of audio drivers form a particular set of sound axes.

(Feature 12) The method of feature 1, wherein causing the delay for each sound axis of the plurality of sound axes to be determined comprises: determining that the time of arrival of a given sound axis exceeds a maximum delay threshold; and causing the delay of the given sound axis to be set at the maximum delay threshold when the media playback system is playing back audio content that is paired with video content.
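
The thresholding recited in feature 12 might be sketched as follows. This is an illustrative sketch only: it assumes, for simplicity, that the uncapped delay of a sound axis is taken to equal its measured time of arrival, whereas the features state only that the delay corresponds to the time of arrival. The name `capped_delay` and its parameters are hypothetical.

```python
def capped_delay(arrival_time_s: float, max_delay_s: float, paired_with_video: bool) -> float:
    """Return the delay for one sound axis, capped when audio is paired with video.

    Assumption (not from the disclosure): the uncapped delay equals the measured
    time of arrival.
    """
    if paired_with_video and arrival_time_s > max_delay_s:
        # The time of arrival exceeds the threshold: hold the delay at the threshold.
        return max_delay_s
    return arrival_time_s
```

Under these assumptions, a sound axis measured at 30 ms with a 20 ms threshold would be given a 20 ms delay for video-paired content and a 30 ms delay otherwise.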

(Feature 13) A tangible, non-transitory computer-readable medium having stored thereon instructions executable by one or more processors to cause a device to perform the method of any one of features 1 to 12.

(Feature 14) A device configured to perform the method of any one of features 1 to 12.

(Feature 15) A media playback system configured to perform the method of any one of features 1 to 12.

(Feature 16) A method comprising: receiving, via a network interface, data representing one or more spatial filters that correspond to respective playback configurations, wherein each playback configuration represents a particular set of sound axes formed via one or more audio drivers, and wherein each sound axis corresponds to a respective channel of audio content; causing, via an audio stage, the one or more audio drivers to output calibration audio that is divided into a repeating set of frames, the set of frames including a respective frame for each playback configuration, wherein causing the one or more audio drivers to output the calibration audio comprises causing the audio stage to apply, during each frame, the spatial filter corresponding to the respective playback configuration; receiving, via the network interface, data representing one or more spectral filters that correspond to respective playback configurations, the one or more spectral filters based on the calibration audio output by the one or more audio drivers; and when playing back audio content in a given playback configuration, causing the audio stage to apply a particular spectral filter corresponding to the given playback configuration.
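
The repeating set of frames in feature 16 might be sketched as a simple schedule in which each frame is tagged with a playback configuration and the spatial filter to apply during that frame. The sketch is illustrative only; it assumes spatial filters can be referred to by an identifier string and that frames cycle through the configurations in a fixed order, and `calibration_frames` is a hypothetical name.

```python
from itertools import cycle, islice
from typing import Dict, List, Tuple


def calibration_frames(
    spatial_filters: Dict[str, str], repetitions: int
) -> List[Tuple[int, str, str]]:
    """Build (frame index, playback configuration, spatial filter id) entries.

    One repetition of the frame set contains exactly one frame per configuration.
    """
    configurations = list(spatial_filters)
    total = repetitions * len(configurations)
    return [
        (index, config, spatial_filters[config])
        for index, config in enumerate(islice(cycle(configurations), total))
    ]
```

For example, with mono, stereo, and surround configurations and two repetitions, such a schedule would contain six frames, with the stereo spatial filter applied during the second and fifth frames.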

(Feature 17) The method of feature 16, wherein receiving data representing one or more spatial filters comprises receiving data representing one or more spatial filters that calibrate the playback device to a particular listening location within a listening area of the playback device, and wherein receiving data representing one or more spectral filters comprises receiving data representing one or more spectral filters that compensate for acoustic characteristics of the listening area.

(Feature 18) The method of feature 16, wherein receiving data representing one or more spatial filters comprises receiving data representing one or more sets of spatial filters, each set of spatial filters including a respective spatial filter for each sound axis, and wherein receiving data representing one or more spectral filters comprises receiving data representing one or more sets of spectral filters, each set of spectral filters including a respective spectral filter for each sound axis.

(Feature 19) The method of feature 18, wherein the one or more spatial filters comprise at least one of: (i) a first filter corresponding to a mono playback configuration, the one or more audio drivers being configured to form a sound axis to output mono audio content when playing back audio content in the mono playback configuration, (ii) a second filter corresponding to a stereo playback configuration, the one or more audio drivers being configured to form one or more sound axes to output one or more channels of stereo audio content when playing back audio content in the stereo playback configuration, and (iii) a third filter corresponding to a surround sound playback configuration, the one or more audio drivers being configured to form one or more sound axes to output one or more channels of surround sound audio content when playing back audio content in the surround sound playback configuration.

(Feature 20) The method of feature 18, wherein the mono playback configuration is a first mono playback configuration, the stereo playback configuration is a first stereo playback configuration, and the surround sound playback configuration is a first surround sound configuration, and wherein the one or more spatial filters comprise at least one of: (i) a fourth filter corresponding to a second mono playback configuration, the one or more audio drivers being configured to form a sound axis to output mono audio content in synchrony with a subwoofer device when playing back audio content in the second mono playback configuration, (ii) a fifth filter corresponding to a second stereo playback configuration, the one or more audio drivers being configured to form one or more sound axes to output one or more channels of stereo audio content in synchrony with the subwoofer device when playing back audio content in the second stereo playback configuration, and (iii) a sixth filter corresponding to a second surround sound playback configuration, the one or more audio drivers being configured to form one or more sound axes to output one or more channels of surround sound audio content in synchrony with the subwoofer device when playing back audio content in the second surround sound playback configuration.

(Feature 21) The method of feature 16, wherein the one or more spatial filters comprise: (i) a first filter corresponding to a music playback configuration, the one or more audio drivers being configured to form one or more sound axes to output music content when playing back audio content in the music playback configuration, and (ii) a second filter corresponding to a home theater playback configuration, the one or more audio drivers being configured to form one or more sound axes to output audio content that is paired with video content when playing back audio content in the home theater playback configuration.

(Feature 22) The method of feature 16, wherein the one or more audio drivers comprise a plurality of audio drivers that form a plurality of sound axes in a given playback configuration, and wherein causing the one or more audio drivers to output the calibration audio comprises: causing the plurality of sound axes to output calibration audio during respective time slots of each frame that corresponds to the given playback configuration.

(Feature 23) The method of feature 22, wherein each sound axis of the plurality of sound axes corresponds to a respective channel of surround sound audio content.

(Feature 24) The method of feature 22, wherein each sound axis of the plurality of sound axes corresponds to a respective channel of stereo audio content.

(Feature 25) The method of feature 16, wherein the one or more audio drivers form a single sound axis in a given playback configuration.

(Feature 26) A tangible, non-transitory computer-readable medium having stored thereon instructions executable by one or more processors to cause a device to perform the method of any one of features 16 to 25.

(Feature 27) A device configured to perform the method of any one of features 16 to 25.

(Feature 28) A media playback system configured to perform the method of any one of features 16 to 25.

(Feature 29) A method comprising: detecting a trigger condition that initiates calibration of a media playback system for a plurality of playback configurations, wherein each playback configuration represents a particular set of sound axes formed via a plurality of audio drivers of the media playback system, and wherein each sound axis corresponds to a respective channel of audio content; causing, via a network interface, the plurality of audio drivers to output calibration audio that is divided into a repeating set of frames, the set of frames including a respective frame for each playback configuration, wherein causing the plurality of audio drivers to output the calibration audio comprises causing a respective set of spatial filters to be applied to the plurality of audio drivers during each frame of the set of frames, each set of spatial filters including a respective spatial filter for each sound axis; recording, via a microphone, the calibration audio output by the plurality of audio drivers; and causing a processing device to determine respective sets of spectral filters for the plurality of playback configurations based on the recorded calibration audio, each set of spectral filters including a respective spectral filter for each sound axis.

(Feature 30) The method of feature 29, further comprising: when the media playback system is playing back audio content in a given playback configuration, causing a particular set of spectral filters corresponding to the given playback configuration to be applied to the sound axes formed by the plurality of audio drivers.

(Feature 31) The method of feature 29, wherein the calibration audio is second calibration audio, the method further comprising: before causing the plurality of audio drivers to output the second calibration audio, causing, via the network interface, the plurality of audio drivers to output first calibration audio that is divided into a repeating set of frames, the set of frames including a respective frame for each playback configuration of the plurality of playback configurations; recording, via the microphone, the first calibration audio output by the plurality of audio drivers; and causing the processing device to determine the respective sets of spatial filters for the plurality of playback configurations based on the recorded first calibration audio, each set of spatial filters including a respective spatial filter for each sound axis.

(Feature 32) The method of feature 29, wherein causing the plurality of audio drivers to output the calibration audio comprises: causing the plurality of audio drivers to form respective sound axes of the plurality of sound axes during respective time slots of each frame.

(Feature 33) The method of feature 29, wherein the plurality of playback configurations comprises two or more of: (i) a mono playback configuration, the plurality of audio drivers being configured to form sound axes to synchronously output mono audio content when playing back audio content in the mono playback configuration, (ii) a stereo playback configuration, the plurality of audio drivers being configured to form sound axes to output channels of stereo audio content when playing back audio content in the stereo playback configuration, and (iii) a surround sound playback configuration, the plurality of audio drivers being configured to form sound axes to output respective channels of surround sound audio content when playing back audio content in the surround sound playback configuration.

(Feature 34) The method of feature 33, wherein the mono playback configuration is a first mono playback configuration, the stereo playback configuration is a first stereo playback configuration, and the surround sound playback configuration is a first surround sound configuration, and wherein the plurality of playback configurations comprises at least one of: (i) a second mono playback configuration, the plurality of audio drivers being configured to form one or more full-range sound axes and a subwoofer sound axis to synchronously output mono audio content when playing back audio content in the mono playback configuration, (ii) a second stereo playback configuration, the plurality of audio drivers being configured to form one or more full-range sound axes to output channels of stereo audio content in synchrony with the subwoofer sound axis when playing back audio content in the second stereo playback configuration, and (iii) a second surround sound playback configuration, the plurality of audio drivers being configured to form one or more full-range sound axes to output respective channels of surround sound audio content in synchrony with the subwoofer sound axis when playing back audio content in the second surround sound playback configuration.

(Feature 35) The method of feature 29, wherein the plurality of playback configurations comprises two or more of: (i) a music playback configuration, the plurality of audio drivers being configured to form sound axes to output music content when playing back audio content in the music playback configuration, and (ii) a home theater playback configuration, the plurality of audio drivers being configured to form sound axes to output audio content that is paired with video content when playing back audio content in the home theater playback configuration.

(Feature 36) The method of feature 29, wherein causing the respective set of spatial filters to be applied to the plurality of audio drivers during each frame of the set of frames comprises: causing the processing device to apply the spatial filters to the calibration audio and to send the calibration audio with the spatial filters applied to one or more playback devices that include the plurality of audio drivers.

(Feature 37) The method of feature 29, wherein the media playback system comprises a plurality of playback devices, each playback device comprising a subset of the plurality of audio drivers.

(Feature 38) A tangible, non-transitory computer-readable medium having stored thereon instructions executable by one or more processors to cause a device to perform the method of any one of features 29 to 37.

(Feature 39) A device configured to perform the method of any one of features 29 to 37.

(Feature 40) A media playback system configured to perform the method of any one of features 29 to 37.

(Feature 41) A playback device comprising: (i) a network interface; (ii) an audio stage arranged to drive one or more audio drivers; (iii) one or more processors; and (iv) a computer-readable medium having stored thereon instructions executable by the one or more processors to cause the playback device to perform operations comprising: (a) receiving, via the network interface, data representing one or more spatial filters that correspond to respective playback configurations, wherein each playback configuration represents a particular set of sound axes formed via the one or more audio drivers, and wherein each sound axis corresponds to a respective channel of audio content; (b) causing, via the audio stage, the one or more audio drivers to output calibration audio that is divided into a repeating set of frames, the set of frames including a respective frame for each playback configuration, wherein causing the one or more audio drivers to output the calibration audio comprises causing the audio stage to apply, during each frame, the spatial filter corresponding to the respective playback configuration; (c) receiving, via the network interface, data representing one or more spectral filters that correspond to respective playback configurations, the one or more spectral filters based on the calibration audio output by the one or more audio drivers; and (d) when playing back audio content in a given playback configuration, causing the audio stage to apply a particular spectral filter corresponding to the given playback configuration.

(Feature 42) The playback device of feature 41, wherein receiving data representing one or more spatial filters comprises receiving data representing one or more spatial filters that calibrate the playback device to a particular listening location within a listening area of the playback device, and wherein receiving data representing one or more spectral filters comprises receiving data representing one or more spectral filters that compensate for acoustic characteristics of the listening area.

(Feature 43) The playback device of feature 41, wherein receiving data representing one or more spatial filters comprises receiving data representing one or more sets of spatial filters, each set of spatial filters including a respective spatial filter for each sound axis, and wherein receiving data representing one or more spectral filters comprises receiving data representing one or more sets of spectral filters, each set of spectral filters including a respective spectral filter for each sound axis.

(Feature 44) The playback device of feature 41, wherein the one or more spatial filters comprise at least one of: (i) a first filter corresponding to a mono playback configuration, the one or more audio drivers being configured to form a sound axis to output mono audio content when playing back audio content in the mono playback configuration, (ii) a second filter corresponding to a stereo playback configuration, the one or more audio drivers being configured to form one or more sound axes to output one or more channels of stereo audio content when playing back audio content in the stereo playback configuration, and (iii) a third filter corresponding to a surround sound playback configuration, the one or more audio drivers being configured to form one or more sound axes to output one or more channels of surround sound audio content when playing back audio content in the surround sound playback configuration.

(Feature 45) The playback device of feature 44, wherein the mono playback configuration is a first mono playback configuration, the stereo playback configuration is a first stereo playback configuration, and the surround sound playback configuration is a first surround sound configuration, and wherein the one or more spatial filters comprise at least one of: (i) a fourth filter corresponding to a second mono playback configuration, the one or more audio drivers being configured to form a sound axis to output mono audio content in synchrony with a subwoofer device when playing back audio content in the second mono playback configuration, (ii) a fifth filter corresponding to a second stereo playback configuration, the one or more audio drivers being configured to form one or more sound axes to output one or more channels of stereo audio content in synchrony with the subwoofer device when playing back audio content in the second stereo playback configuration, and (iii) a sixth filter corresponding to a second surround sound playback configuration, the one or more audio drivers being configured to form one or more sound axes to output one or more channels of surround sound audio content in synchrony with the subwoofer device when playing back audio content in the second surround sound playback configuration.

(Feature 46) The playback device of feature 41, wherein the one or more spatial filters include: (i) a first filter corresponding to a music playback configuration, wherein the one or more audio drivers are configured to form one or more sound axes to output music content when audio content is played back in the music playback configuration, and (ii) a second filter corresponding to a home theater playback configuration, wherein the one or more audio drivers are configured to form one or more sound axes to output audio content paired with video content when audio content is played back in the home theater playback configuration.

(Feature 47) The playback device of feature 41, wherein the one or more audio drivers include a plurality of audio drivers that form a plurality of sound axes in a given playback configuration, and wherein causing the one or more audio drivers to output the calibration audio comprises causing the plurality of sound axes to output calibration audio during respective time slots of each frame corresponding to the given playback configuration.

(Feature 48) The playback device of feature 47, wherein each sound axis of the plurality of sound axes corresponds to a respective channel of surround sound audio content.

(Feature 49) The playback device of feature 47, wherein each sound axis of the plurality of sound axes corresponds to a respective channel of stereo audio content.

(Feature 50) The playback device of feature 41, wherein the one or more audio drivers form a single sound axis in a given playback configuration.

(Feature 51) A tangible, non-transitory computer-readable medium storing instructions executable by one or more processors to cause a networked microphone device (NMD) to perform a method comprising: (i) detecting a trigger condition that initiates calibration of a media playback system for a plurality of playback configurations, wherein each playback configuration represents a particular set of sound axes formed via a plurality of audio drivers of the media playback system, and wherein each sound axis corresponds to a respective channel of audio content; (ii) causing, via a network interface, the plurality of audio drivers to output calibration audio that is divided into a repeating set of frames, the set of frames including a respective frame for each playback configuration, wherein causing the plurality of audio drivers to output the calibration audio comprises causing a respective set of spatial filters to be applied to the plurality of audio drivers during each frame of the set of frames, each set of spatial filters including a respective spatial filter for each sound axis; (iii) recording, via the microphone, the calibration audio output by the plurality of audio drivers; (iv) causing a processing device to determine, based on the recorded calibration audio, respective sets of spectral filters for the plurality of playback configurations, each set of spectral filters including a respective spectral filter for each sound axis.

(Feature 52) The tangible, non-transitory computer-readable medium of feature 51, the method further comprising: when the media playback system plays back audio content in a given playback configuration, causing a particular set of spectral filters corresponding to the given playback configuration to be applied to the sound axes formed by the plurality of audio drivers.

(Feature 53) The tangible, non-transitory computer-readable medium of feature 51, wherein the calibration audio is second calibration audio, the method further comprising: (i) before causing the plurality of audio drivers to output the second calibration audio, causing, via the network interface, the plurality of drivers to output first calibration audio that is divided into a repeating set of frames, the set of frames including a respective frame for each playback configuration of the plurality of playback configurations; (ii) recording, via the microphone, the first calibration audio output by the plurality of audio drivers; and (iii) causing the processing device to determine, based on the recorded first calibration audio, respective sets of spatial filters for the plurality of playback configurations, each set of spatial filters including a respective spatial filter for each sound axis.

(Feature 54) The tangible, non-transitory computer-readable medium of feature 51, wherein causing the plurality of audio drivers to output the calibration audio comprises causing the plurality of audio drivers to form respective sound axes of the plurality of sound axes during respective time slots of each frame.
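As an illustration only, the frame structure recited in features 51 and 54 (a repeating set of frames, one frame per playback configuration, with one time slot per sound axis inside each frame) can be sketched in Python. The names `Slot` and `build_schedule`, the equal slot duration, and the example configurations are assumptions made for this sketch and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    cycle: int          # which repetition of the frame set
    configuration: str  # playback configuration that owns the frame
    axis: str           # sound axis emitting calibration audio in this slot
    start_s: float      # slot start time, in seconds


def build_schedule(configs, slot_s, repetitions):
    """Lay out a repeating set of frames.

    configs maps each playback configuration name to its ordered sound axes.
    Each configuration gets one frame per cycle, and each frame holds one
    time slot per sound axis of that configuration (hypothetical layout).
    """
    schedule = []
    for cycle in range(repetitions):
        for name, axes in configs.items():
            for axis in axes:
                start = len(schedule) * slot_s
                schedule.append(Slot(cycle, name, axis, start))
    return schedule


configs = {"mono": ["center"], "stereo": ["left", "right"]}
for slot in build_schedule(configs, slot_s=0.5, repetitions=2):
    print(slot)
```

In this sketch a recorder that knows the same schedule could map any recorded sample back to the configuration and sound axis that produced it, which is the property the later features rely on.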

(Feature 55) The tangible, non-transitory computer-readable medium of feature 51, wherein the plurality of playback configurations includes two or more of the following playback configurations: (i) a mono playback configuration, wherein the plurality of audio drivers are configured to form sound axes to synchronously output mono audio content when audio content is played back in the mono playback configuration, (ii) a stereo playback configuration, wherein the plurality of audio drivers are configured to form sound axes to output channels of stereo audio content when audio content is played back in the stereo playback configuration, and (iii) a surround sound playback configuration, wherein the plurality of audio drivers are configured to form sound axes to output respective channels of surround sound audio content when audio content is played back in the surround sound playback configuration.

(Feature 56) The tangible, non-transitory computer-readable medium of feature 55, wherein the mono playback configuration is a first mono playback configuration, the stereo playback configuration is a first stereo playback configuration, and the surround sound playback configuration is a first surround sound configuration, and wherein the plurality of playback configurations includes at least one of the following playback configurations: (i) a second mono playback configuration, wherein the plurality of audio drivers are configured to form one or more full-range sound axes and a subwoofer sound axis to synchronously output mono audio content when audio content is played back in the mono playback configuration, (ii) a second stereo playback configuration, wherein the plurality of audio drivers are configured to form one or more full-range sound axes to output channels of stereo audio content in synchronization with a subwoofer sound axis when audio content is played back in the second stereo playback configuration, and (iii) a second surround sound playback configuration, wherein the plurality of audio drivers are configured to form one or more full-range sound axes to output respective channels of surround sound audio content in synchronization with a subwoofer sound axis when audio content is played back in the second surround sound playback configuration.

(Feature 57) The tangible, non-transitory computer-readable medium of feature 51, wherein the plurality of playback configurations includes two or more of the following playback configurations: (i) a music playback configuration, wherein the plurality of audio drivers are configured to form sound axes to output music content when audio content is played back in the music playback configuration, and (ii) a home theater playback configuration, wherein the plurality of audio drivers are configured to form sound axes to output audio content paired with video content when audio content is played back in the home theater playback configuration.

(Feature 58) The tangible, non-transitory computer-readable medium of feature 51, wherein causing the respective set of spatial filters to be applied to the plurality of audio drivers during each frame of the set of frames comprises: causing the processing device to apply the spatial filters to the calibration audio, and sending the calibration audio with the spatial filters applied to one or more playback devices that include the plurality of audio drivers.

(Feature 59) The tangible, non-transitory computer-readable medium of feature 51, wherein the media playback system includes a plurality of playback devices, each playback device including a subset of the plurality of audio drivers.

(Feature 60) A media playback system comprising: (i) one or more playback devices including a plurality of audio drivers that form a plurality of sound axes, each sound axis corresponding to a respective channel of audio content; (ii) a networked microphone device including a microphone; (iii) a processor; and (iv) a computer-readable medium storing instructions executable by one or more processors to cause the media playback system to perform a method comprising: (a) detecting a trigger condition that initiates calibration of the media playback system for a plurality of playback configurations, wherein each playback configuration represents a particular set of sound axes formed via the plurality of audio drivers; (b) causing, via a network interface, the plurality of audio drivers to output calibration audio that is divided into a repeating set of frames, the set of frames including a respective frame for each playback configuration, wherein causing the plurality of audio drivers to output the calibration audio comprises causing a respective set of spatial filters to be applied to the plurality of audio drivers during each frame of the set of frames, each set of spatial filters including a respective spatial filter for each sound axis; (c) recording, via the microphone, the calibration audio output by the plurality of audio drivers; (d) causing a processing device to determine, based on the recorded calibration audio, respective sets of spectral filters for the plurality of playback configurations, each set of spectral filters including a respective spectral filter for each sound axis.

(Feature 61) A tangible, non-transitory computer-readable medium storing instructions executable by one or more processors to cause a networked microphone device (NMD) to perform a method comprising: (i) detecting a trigger condition that initiates calibration of a media playback system that includes a plurality of audio drivers, the plurality of audio drivers forming a plurality of sound axes, each sound axis corresponding to a respective channel of multi-channel audio content; (ii) causing, via a network interface, the plurality of audio drivers to emit calibration audio that is divided into constituent frames, the plurality of sound axes emitting calibration audio during respective time slots of each constituent frame; (iii) recording, via a microphone, the emitted calibration audio; (iv) causing a respective delay for each sound axis of the plurality of sound axes to be determined based on the time slots of the recorded calibration audio that correspond to the sound axes; and (v) causing the plurality of sound axes to be calibrated, wherein calibrating the plurality of sound axes comprises causing audio output of the plurality of sound axes to be delayed according to the respective determined delays.

(Feature 62) The tangible, non-transitory computer-readable medium of feature 61, wherein causing the respective delay for each sound axis of the plurality of sound axes to be determined comprises: (i) causing a processing device to determine a respective time of arrival at the microphone for each sound axis of the plurality of sound axes from the time slots of the recorded calibration audio that correspond to each sound axis; and (ii) causing a delay for each sound axis of the plurality of sound axes to be determined, each determined delay corresponding to the determined time of arrival of the respective sound axis.
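Feature 62 states only that each delay corresponds to the determined time of arrival; it does not give a formula. One plausible mapping, shown here purely as an assumption, delays every sound axis by its lead over the latest-arriving axis so that sound from all axes reaches the microphone together. The function name `delays_from_arrivals` and the millisecond values are hypothetical.

```python
def delays_from_arrivals(arrival_ms):
    """Map per-axis arrival times to per-axis delays (assumed rule).

    The axis whose sound arrives last gets zero delay. Every other axis is
    delayed by the amount it arrives early, so all axes line up at the
    microphone position.
    """
    latest = max(arrival_ms.values())
    return {axis: latest - t for axis, t in arrival_ms.items()}


arrivals = {"left": 10, "center": 12, "right": 15}
print(delays_from_arrivals(arrivals))
```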

(Feature 63) The tangible, non-transitory computer-readable medium of feature 62, wherein causing audio output of the plurality of sound axes to be delayed according to the respective determined delays comprises causing respective filters to delay audio output of the plurality of audio drivers according to the respective determined delays of the plurality of sound axes.

(Feature 64) The tangible, non-transitory computer-readable medium of feature 62, wherein the NMD includes the processing device, and wherein causing the processing device to determine the respective time of arrival at the microphone for each sound axis of the plurality of sound axes comprises: (i) dividing the recorded calibration audio into the constituent frames and dividing each constituent frame into a respective time slot for each sound axis; (ii) determining a respective impulse response of each sound axis from the respective time slots corresponding to that sound axis; (iii) aligning the respective impulse responses to a first reference point; (iv) identifying a respective second reference point in each impulse response; and (v) determining the respective times of arrival at the microphone based on the respective difference between the first reference point and the second reference point in each impulse response.

(Feature 65) The tangible, non-transitory computer-readable medium of feature 64, wherein the sound axes consist of a reference sound axis and one or more other sound axes, and wherein identifying the respective second reference point in each impulse response comprises: (i) identifying a peak in the impulse response of the reference sound axis as a given second reference point; and (ii) identifying, in a time window following the given second reference point, respective peaks of the impulse responses of the one or more other sound axes as other second reference points.
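The reference-point steps of features 64 and 65 can be sketched as follows. This is a minimal illustration under stated assumptions: each impulse response is a list of samples already aligned so that index 0 is the first reference point, all responses have the same length, a "peak" is taken to be the sample of largest magnitude, and the names `peak_index` and `arrival_times` are hypothetical.

```python
def peak_index(response, start=0, stop=None):
    """Index of the largest-magnitude sample in response[start:stop]."""
    stop = len(response) if stop is None else min(stop, len(response))
    return max(range(start, stop), key=lambda i: abs(response[i]))


def arrival_times(responses, reference_axis, window, sample_rate_hz):
    """Estimate per-axis arrival times in seconds.

    The peak of the reference axis is the given second reference point.
    Peaks of the other axes are searched only in a window of `window`
    samples starting at that point. Arrival time is the distance from the
    first reference point (index 0) to each axis's second reference point.
    """
    ref_peak = peak_index(responses[reference_axis])
    times = {reference_axis: ref_peak / sample_rate_hz}
    for axis, response in responses.items():
        if axis == reference_axis:
            continue
        peak = peak_index(response, ref_peak, ref_peak + window)
        times[axis] = peak / sample_rate_hz
    return times


responses = {
    "center": [0.0, 0.0, 1.0, 0.2, 0.0, 0.0, 0.0, 0.0],
    "left":   [0.0, 0.9, 0.0, 0.0, 0.5, 0.1, 0.0, 0.0],
}
print(arrival_times(responses, "center", window=4, sample_rate_hz=1000))
```

In the example data, the "left" response contains a larger spike before the reference peak. Restricting the search to the window after the reference peak causes the later sample at index 4 to be selected instead, which is one reason a windowed search might be preferred over a global maximum.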

(Feature 66) The tangible, non-transitory computer-readable medium of feature 62, wherein the processing device is connected to the NMD via one or more networks, and wherein causing the processing device to determine the respective time of arrival at the microphone for each sound axis of the plurality of sound axes comprises: (i) sending, via the network interface to the processing device, (a) the recorded calibration audio and (b) an instruction to determine the respective time of arrival at the microphone for each sound axis of the plurality of sound axes; and (ii) receiving, via the network interface, the determined respective times of arrival.

(Feature 67) The tangible, non-transitory computer-readable medium of feature 61, wherein each sound axis of the plurality of sound axes corresponds to a respective channel of surround sound audio content.

(Feature 68) The tangible, non-transitory computer-readable medium of feature 67, wherein the media playback system includes a plurality of playback devices, each playback device including a subset of the plurality of audio drivers.

(Feature 69) The tangible, non-transitory computer-readable medium of feature 68, wherein the plurality of playback devices includes a given playback device that includes a particular subset of the plurality of audio drivers, wherein the particular subset of the plurality of audio drivers forms three sound axes that correspond, respectively, to a left channel of the surround sound audio content, a right channel of the surround sound audio content, and a center channel of the audio content.

(Feature 70) The tangible, non-transitory computer-readable medium of feature 61, wherein detecting the trigger condition that initiates calibration of the media playback system comprises detecting, via a user interface, input data indicating a command to initiate calibration of the media playback system.

(Feature 71) The tangible, non-transitory computer-readable medium of feature 61, wherein detecting the trigger condition that initiates calibration of the media playback system comprises detecting that the media playback system is configured in a particular axis configuration in which the plurality of audio drivers form a particular set of sound axes.

(Feature 72) The tangible, non-transitory computer-readable medium of feature 61, wherein causing the delay for each sound axis of the plurality of sound axes to be determined comprises: (i) determining that the time of arrival of a given sound axis exceeds a maximum delay threshold; and (ii) causing the delay of the given sound axis to be set at the maximum delay threshold when the media playback system is playing back audio content paired with video content.
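The capping behavior of feature 72 can be illustrated with a short Python sketch. The function name `capped_delay`, the millisecond units, and the example threshold are assumptions for illustration; the disclosure does not specify a threshold value.

```python
def capped_delay(delay_ms, max_delay_ms, paired_with_video):
    """Cap a sound axis delay while audio is paired with video.

    When the content is paired with video and the measured value exceeds
    the maximum delay threshold, the delay is set at the threshold.
    Otherwise the measured value is used unchanged.
    """
    if paired_with_video and delay_ms > max_delay_ms:
        return max_delay_ms
    return delay_ms


print(capped_delay(40, max_delay_ms=25, paired_with_video=True))
print(capped_delay(40, max_delay_ms=25, paired_with_video=False))
```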

(Feature 73) A method comprising: (i) detecting a trigger condition that initiates calibration of a media playback system that includes a plurality of audio drivers, the plurality of audio drivers forming a plurality of sound axes, each sound axis corresponding to a respective channel of multi-channel audio content; (ii) causing, via a network interface, the plurality of audio drivers to emit calibration audio that is divided into constituent frames, the plurality of sound axes emitting calibration audio during respective time slots of each constituent frame; (iii) recording the emitted calibration audio via a microphone of a networked microphone device (NMD); (iv) causing a processing device to determine a respective time of arrival at the microphone for each sound axis of the plurality of sound axes from the time slots of the recorded calibration audio that correspond to the sound axes; (v) causing a delay for each sound axis of the plurality of sound axes to be determined, each determined delay corresponding to the determined time of arrival of the respective sound axis; and (vi) causing the plurality of sound axes to be calibrated, wherein calibrating the plurality of sound axes comprises causing audio output of the plurality of sound axes to be delayed according to the respective determined delays.

(Feature 74) The method of feature 73, wherein the NMD includes the processing device, and wherein causing the processing device to determine the respective time of arrival at the microphone for each sound axis of the plurality of sound axes comprises: (i) dividing the recorded calibration audio into the constituent frames and dividing each constituent frame into a respective time slot for each sound axis; (ii) determining a respective impulse response of each sound axis from the respective time slots corresponding to that sound axis; (iii) aligning the respective impulse responses to a first reference point; (iv) identifying a respective second reference point in each impulse response; and (v) determining the respective times of arrival at the microphone based on the respective difference between the first reference point and the second reference point in each impulse response.

(Feature 75) The method of feature 74, wherein the sound axes consist of a reference sound axis and one or more other sound axes, and wherein identifying the respective second reference point in each impulse response comprises: (i) identifying a peak in the impulse response of the reference sound axis as a given second reference point; and (ii) identifying, in a time window following the given second reference point, respective peaks of the impulse responses of the one or more other sound axes as other second reference points.

(Feature 76) The method of feature 73, wherein the processing device is connected to the NMD via one or more networks, and wherein causing the processing device to determine the respective time of arrival at the microphone for each sound axis of the plurality of sound axes comprises: (i) sending, via the network interface to the processing device, (a) the recorded calibration audio and (b) an instruction to determine the respective time of arrival at the microphone for each sound axis of the plurality of sound axes; and (ii) receiving, via the network interface, the determined respective times of arrival.

(Feature 77) The method of feature 73, wherein each sound axis of the plurality of sound axes corresponds to a respective channel of surround sound audio content, and wherein the media playback system includes a plurality of playback devices, each playback device including a subset of the plurality of audio drivers.

(Feature 78) The method of feature 77, wherein the plurality of playback devices includes a given playback device that includes a particular subset of the plurality of audio drivers, wherein the particular subset of the plurality of audio drivers forms three sound axes that correspond, respectively, to a left channel of the surround sound audio content, a right channel of the surround sound audio content, and a center channel of the audio content.

(Feature 79) The method of feature 73, wherein detecting the trigger condition that initiates calibration of the media playback system comprises one of: (a) detecting, via a user interface, input data indicating a command to initiate calibration of the media playback system, or (b) detecting that the media playback system is configured in a particular axis configuration in which the plurality of audio drivers form a particular set of sound axes.

(Feature 80) A media playback system comprising: (i) one or more playback devices including a plurality of audio drivers that form a plurality of sound axes, each sound axis corresponding to a respective channel of multi-channel audio content; (ii) a networked microphone device including a microphone; (iii) a processor; and (iv) a computer-readable medium storing instructions executable by one or more processors to cause the media playback system to perform a method comprising: (a) detecting a trigger condition that initiates calibration of the media playback system, and causing, via a network interface, the plurality of audio drivers to emit calibration audio that is divided into constituent frames, the plurality of sound axes emitting calibration audio during respective time slots of each constituent frame; (b) recording the emitted calibration audio via the microphone; (c) causing a processing device to determine a respective time of arrival at the microphone for each sound axis of the plurality of sound axes from the time slots of the recorded calibration audio that correspond to the sound axes; (d) causing a delay for each sound axis of the plurality of sound axes to be determined, each determined delay corresponding to the determined time of arrival of the respective sound axis; and (e) causing the plurality of sound axes to be calibrated, wherein calibrating the plurality of sound axes comprises causing audio output of the plurality of sound axes to be delayed according to the respective determined delays.

This specification is presented largely in terms of illustrative environments, systems, procedures, steps, logic blocks, processing, and other symbolic representations that directly or indirectly resemble the operations of data processing devices coupled to networks. Those skilled in the art typically use these process descriptions and representations to convey the substance of their work most effectively to others skilled in the art. Numerous specific details are set forth to provide a thorough understanding of the present disclosure. However, those skilled in the art will understand that certain embodiments of the present disclosure can be practiced without certain specific details. In other instances, well-known methods, procedures, components, and circuitry have not been described in detail, to avoid unnecessarily obscuring aspects of the embodiments. Accordingly, the scope of the present disclosure is defined by the appended claims rather than by the foregoing description of embodiments.

When any of the appended claims are read to cover a purely software and/or firmware implementation, at least one of the elements in at least one example is hereby expressly defined to include a tangible, non-transitory medium, such as a memory, DVD, CD, or Blu-ray, storing the software and/or firmware.

Claims

1. A method for a networked microphone device (NMD), the method comprising:

detecting a trigger condition that initiates calibration of a media playback system for a plurality of playback configurations, each playback configuration representing a respective set of one or more sound axes formed via a plurality of audio drivers of the media playback system, wherein each sound axis corresponds to a corresponding input channel of the audio content;

causing, via a network interface of the networked microphone device, each audio driver of the plurality of audio drivers of the media playback system to output calibration audio that is divided into a repeating set of frames, the set of frames comprising a respective frame for each playback configuration, wherein the calibration audio is output via the one or more sound axes corresponding to a given playback configuration during respective time slots of each frame corresponding to the respective playback configuration, such that, during each frame of the set of frames, a respective set of spatial filters is applied to the plurality of audio drivers, the respective set of spatial filters including a respective spatial filter for each of the one or more sound axes corresponding to the respective playback configuration, wherein the spatial filters spatially calibrate the media playback system to a given listening area by configuring the plurality of audio drivers to form a particular sound axis so as to direct the sound output of the particular sound axis of the respective set of one or more sound axes in a particular direction;

recording the calibration audio output by the plurality of audio drivers via a microphone; and

causing a processing device to determine, based on the recorded calibration audio, respective sets of spectral filters for the plurality of playback configurations, each set of spectral filters including a respective spectral filter for each sound axis.
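
As an illustrative, non-limiting sketch of the frame structure recited in claim 1, the following Python fragment lays out calibration audio as a repeating set of frames, with one frame per playback configuration and one time slot per sound axis within each frame. The names `Slot` and `build_schedule`, the configuration labels, and the fixed slot duration are assumptions made for illustration only; they do not appear in the claims.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Slot:
    frame_index: int    # position of the frame in the overall sequence
    configuration: str  # playback configuration whose spatial filters apply
    sound_axis: str     # sound axis that outputs calibration audio in this slot
    start_s: float      # slot start time, in seconds


def build_schedule(configs: Dict[str, List[str]],
                   repetitions: int,
                   slot_s: float) -> List[Slot]:
    """Return the slots of `repetitions` repeated sets of frames.

    `configs` maps each (hypothetical) playback configuration name to the
    sound axes it forms. Each set of frames holds one frame per configuration,
    and each frame holds one time slot per sound axis.
    """
    schedule: List[Slot] = []
    frame_index = 0
    for _ in range(repetitions):
        for name, axes in configs.items():
            for axis in axes:
                start = len(schedule) * slot_s  # slots run back to back
                schedule.append(Slot(frame_index, name, axis, start))
            frame_index += 1
    return schedule


# Hypothetical example: a mono and a stereo configuration, repeated twice.
schedule = build_schedule(
    {"mono": ["all"], "stereo": ["left", "right"]},
    repetitions=2,
    slot_s=0.5,
)
```

In this sketch, a recording device that knows the schedule could attribute each recorded slot to one playback configuration and one sound axis, which is the information the spectral-filter determination relies on.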

2. The method of claim 1, further comprising causing the determined set of spectral filters corresponding to the given playback configuration to be applied when the media playback system plays back audio content in the given playback configuration on the acoustic axis formed by the plurality of audio drivers.

3. The method of claim 1 or 2, wherein causing the respective sets of spatial filters to be applied to the plurality of audio drivers comprises causing the processing device to apply the spatial filters to the calibration audio, and sending the calibration audio with the spatial filters applied to one or more playback devices including the plurality of audio drivers.

4. The method of claim 1 or 2, wherein the media playback system comprises a plurality of playback devices, each playback device comprising a subset of the plurality of audio drivers.

5. The method of claim 1 or 2, wherein:

in a surround sound playback configuration:

each sound axis corresponds to a respective channel of surround sound audio content, and

a first spatial filter corresponds to the surround sound playback configuration;

in a stereo playback configuration:

each sound axis corresponds to a respective channel of stereo audio content, and

a second spatial filter corresponds to the stereo playback configuration; and

in a mono playback configuration:

the plurality of audio drivers form a single sound axis, and

a third spatial filter corresponds to the mono playback configuration.

6. The method of claim 5, wherein:

the mono playback configuration is a first mono playback configuration,

the stereo playback configuration is a first stereo playback configuration,

the surround sound playback configuration is a first surround sound playback configuration; and

the plurality of playback configurations includes at least one of the following playback configurations:

a second mono playback configuration in which, when audio content is played back in the second mono playback configuration, the plurality of audio drivers are configured to form one or more full-range sound axes to output mono audio content in synchrony with a subwoofer sound axis, wherein a fourth spatial filter corresponds to the second mono playback configuration;

a second stereo playback configuration in which, when audio content is played back in the second stereo playback configuration, the plurality of audio drivers are configured to form one or more sound axes to output respective channels of stereo audio content in synchrony with the subwoofer sound axis, wherein a fifth spatial filter corresponds to the second stereo playback configuration; and

a second surround sound playback configuration in which, when audio content is played back in the second surround sound playback configuration, the plurality of audio drivers are configured to form one or more full-range sound axes to output respective channels of surround sound audio content in synchrony with the subwoofer sound axis, wherein a sixth spatial filter corresponds to the second surround sound playback configuration.

7. The method of claim 1 or 2, wherein the plurality of playback configurations comprises two or more of the following playback configurations:

a music playback configuration in which, upon playback of audio content, the plurality of audio drivers are configured to form an acoustic axis to output music content, wherein a music playback spatial filter corresponds to the music playback configuration, and

a home theater playback configuration in which, upon playback of audio content, the plurality of audio drivers are configured to form an acoustic axis to output audio content paired with video content, wherein a home theater playback spatial filter corresponds to the home theater playback configuration.

8. The method of claim 3, wherein the calibration audio is a second calibration audio, the method further comprising:

prior to causing the plurality of audio drivers to output the second calibration audio, causing the plurality of audio drivers to output a first calibration audio, the first calibration audio being divided into a repeating set of frames comprising a respective frame for each playback configuration of the plurality of playback configurations;

recording, via the microphone, the first calibration audio output by the plurality of audio drivers; and

causing the processing device to determine the respective sets of spatial filters for the plurality of playback configurations based on the recorded first calibration audio, each set of spatial filters including a respective spatial filter for each acoustic axis.

9. The method of claim 8, wherein:

the determined set of spatial filters calibrates the playback device to a particular listening position within a listening area of the playback device, and

the determined spectral filters compensate for the acoustic properties of the listening area.

10. The method of claim 8, wherein:

causing the plurality of audio drivers to output the first calibration audio includes causing the plurality of audio drivers to emit calibration audio via a plurality of acoustic axes during respective time slots of each frame, each acoustic axis corresponding to a respective channel of multi-channel audio content; and

causing the processing device to determine the respective sets of spatial filters includes:

causing a respective delay for each acoustic axis of the plurality of acoustic axes to be determined based on the time slot of the recorded calibration audio corresponding to that acoustic axis, the respective spatial filters being based on the respective determined delays,

wherein causing the respective delay for each acoustic axis of the plurality of acoustic axes to be determined comprises:

causing the processing device to determine a respective time of arrival at the microphone for each of the plurality of acoustic axes based on the time slot of the recorded calibration audio corresponding to each acoustic axis; and

causing a delay for each of the plurality of acoustic axes to be determined, each determined delay corresponding to the determined time of arrival of the respective acoustic axis.

11. The method of claim 10, wherein causing the processing device to determine a respective time of arrival at the microphone for each of the plurality of acoustic axes comprises:

dividing the recorded calibration audio into constituent frames and dividing each constituent frame into a corresponding time slot for each acoustic axis;

determining a respective impulse response of each acoustic axis from a respective time slot corresponding to the acoustic axis;

aligning the respective impulse responses with a first reference point;

identifying a respective second reference point in each impulse response; and

determining the respective time of arrival at the microphone based on the respective difference between the first reference point and the second reference point in each impulse response,

wherein the plurality of acoustic axes consists of a reference acoustic axis and one or more other acoustic axes, and wherein identifying a respective second reference point in each impulse response comprises:

identifying a peak in the impulse response of the reference acoustic axis as a given second reference point; and

identifying, in a time window following the given second reference point, respective peaks in the impulse responses of the one or more other acoustic axes as other second reference points.
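
The peak-identification steps of claim 11 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the impulse responses are plain lists of samples of equal length, they are already aligned so that index 0 is the first reference point, and the search window begins at the reference peak. The function name `arrival_times` and its parameters are hypothetical.

```python
from typing import Dict, List


def arrival_times(impulse_responses: Dict[str, List[float]],
                  reference_axis: str,
                  window: int,
                  sample_rate: int) -> Dict[str, float]:
    """Estimate a time of arrival, in seconds, for each acoustic axis.

    Assumes every impulse response is aligned so that index 0 is the first
    reference point, and that all responses have the same length.
    """
    def peak_index(ir: List[float], start: int, stop: int) -> int:
        # Index of the largest-magnitude sample in ir[start:stop].
        return max(range(start, min(stop, len(ir))), key=lambda i: abs(ir[i]))

    # Second reference point of the reference axis: its overall peak.
    ref_ir = impulse_responses[reference_axis]
    ref_peak = peak_index(ref_ir, 0, len(ref_ir))
    times = {reference_axis: ref_peak / sample_rate}

    # Other axes: search only the window that follows the reference peak.
    for axis, ir in impulse_responses.items():
        if axis != reference_axis:
            peak = peak_index(ir, ref_peak, ref_peak + window)
            times[axis] = peak / sample_rate
    return times


# Hypothetical data: the early spike on "left" lies before the window
# and is therefore ignored.
times = arrival_times(
    {
        "center": [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        "left": [0.9, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0],
    },
    reference_axis="center",
    window=4,
    sample_rate=1000,
)
```

Restricting the search for the other axes to a window after the reference peak keeps an unrelated earlier spike from being mistaken for the arrival of the calibration audio.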

12. The method of claim 10, wherein causing the processing device to determine a respective time of arrival at the microphone for each of the plurality of acoustic axes comprises:

sending, via the network interface to the processing device:

the recorded calibration audio, and

instructions to determine a respective time of arrival at the microphone for each of the plurality of acoustic axes; and

receiving, via the network interface, the determined respective times of arrival.

13. The method of claim 10, wherein causing the delay of each of the plurality of acoustic axes to be determined comprises:

determining that the time of arrival of a given acoustic axis exceeds a maximum delay threshold; and

causing the delay for the given acoustic axis to be set at the maximum delay threshold when the media playback system is playing back audio content paired with video content.
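
A minimal sketch of the thresholding in claim 13 follows. It assumes, purely for simplicity, that the delay applied to an acoustic axis equals its time of arrival; the claims require only that each delay correspond to the time of arrival. The function name `delay_for_axis` and the example values are hypothetical.

```python
def delay_for_axis(arrival_time_s: float,
                   max_delay_s: float,
                   paired_with_video: bool) -> float:
    """Return the delay, in seconds, to apply to one acoustic axis.

    Assumption: the delay is taken to be equal to the time of arrival.
    When the audio content is paired with video content, the delay is
    capped at the maximum delay threshold.
    """
    if paired_with_video and arrival_time_s > max_delay_s:
        return max_delay_s
    return arrival_time_s


# Hypothetical values: a 30 ms time of arrival and a 20 ms threshold.
capped = delay_for_axis(0.030, 0.020, paired_with_video=True)
uncapped = delay_for_axis(0.030, 0.020, paired_with_video=False)
within = delay_for_axis(0.010, 0.020, paired_with_video=True)
```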

14. The method of claim 1 or 2, wherein detecting the trigger condition that initiates calibration of the media playback system comprises one of the following:

detecting, via a user interface, input data indicating a command to initiate calibration of the media playback system; and

detecting that the media playback system is configured in a particular axis configuration in which the plurality of audio drivers form a particular set of acoustic axes.

15. A networked microphone device configured to perform the method of any of claims 1 to 14.

16. A media playback system comprising:

the networked microphone device of claim 15; and

a playback device configured to cause an audio stage to apply a particular spectral filter corresponding to a given playback configuration when audio content is played back in the given playback configuration.