WO2021008350A1

WO2021008350A1 - Audio playback method and apparatus and computer readable storage medium

Info

Publication number: WO2021008350A1
Application number: PCT/CN2020/099234
Authority: WO
Inventors: 吴宜安
Original assignee: Shenzhen Skyworth RGB Electronics Co Ltd
Current assignee: Shenzhen Skyworth RGB Electronics Co Ltd
Priority date: 2019-07-12
Filing date: 2020-06-30
Publication date: 2021-01-21
Anticipated expiration: 2022-01-12
Also published as: CN110364188A

Abstract

Disclosed in the present application is an audio playback method, the method comprising: acquiring PCM data corresponding to audio data inputted into a sound card; performing frequency band separation on the PCM data and, on the basis of preset audio separation parameters, extracting frequency band data after frequency band separation; on the basis of a preset sound effect algorithm, performing sound effect processing on the frequency band data; writing the PCM data after sound effect processing back to the sound card, and playing the audio data on the sound card. Also disclosed in the present application are an audio playback apparatus and a computer readable storage medium.

Description

Audio playback method, device and computer readable storage medium

Priority information

This application claims priority to Chinese patent application No. 201910633434.8, filed on July 12, 2019 and titled "Audio playback method, device and computer-readable storage medium", the entire content of which is incorporated into this application by reference.

Technical field

This application relates to the technical field of smart TVs, and in particular to an audio playback method, device and computer-readable storage medium.

Background art

At present, nearly all smart TVs support streaming audio and video content from the Internet. Compared with the content supplied by traditional TV content providers, Internet audio and video content covers more scenes and a longer span of production years, so a smart TV has more, and more complicated, things to handle when playing it. For example, for a noisy outdoor scene such as a subway or bus station, the TV needs to help the user clearly distinguish human voices; for a quiet indoor scene, the TV needs to help the user clearly make out every sound detail; when the sound changes abruptly, the TV needs to protect the user's hearing; and when playing very old audio and video, the TV needs to compensate for and improve the parts of the sound quality that have been lost or damaged. Smart TV chip manufacturers usually provide one or more sound effect modes to implement functions such as treble and bass adjustment, crisp vocals, and smart surround, but because these functions are basically free, they rarely achieve good sound. Moreover, as smart TV prices keep falling, the power amplifiers and speakers used in smart TVs currently on the market cannot restore and render these sound scenes well. How to improve the sound effect of a smart TV without increasing hardware cost has therefore become a technical problem that urgently needs to be solved.

Summary of the invention

The main purpose of this application is to provide an audio playback method, device, and computer-readable storage medium, aiming to solve the technical problem of how to improve the sound effect of a smart TV without increasing hardware cost.

To achieve the above objective, this application provides an audio playback method, the audio playback method including:

acquiring PCM data corresponding to audio data input to a sound card;

performing frequency band separation on the PCM data, and extracting the frequency band data after frequency band separation according to preset audio separation parameters;

performing sound effect processing on the frequency band data according to a preset sound effect algorithm;

writing the PCM data after sound effect processing back to the sound card, and playing the audio data on the sound card.

In an embodiment, the step of performing frequency band separation on the PCM data and extracting the frequency band data after frequency band separation according to preset audio separation parameters includes:

performing frequency band separation on the PCM data according to a preset filtering algorithm;

acquiring the preset audio separation parameters;

extracting the frequency band data after frequency band separation according to the audio separation parameters.

In an embodiment, the step of acquiring PCM data corresponding to audio data input to a sound card includes:

acquiring the PCM data corresponding to the audio data input to the sound card, and determining the sound scene corresponding to the TV channel that outputs the PCM data;

setting corresponding sound effect parameters according to the sound scene.

In an embodiment, the step of performing sound effect processing on the frequency band data according to a preset sound effect algorithm includes:

loading the sound effect parameters into the preset sound effect algorithm to perform sound effect processing on the frequency band data.

In an embodiment, the preset sound effect algorithm is a harmony search algorithm.

In an embodiment, the sound effect parameters include the value probability HMCR of the harmony memory bank, and the step of loading the sound effect parameters into the preset sound effect algorithm to perform sound effect processing on the frequency band data includes:

initializing the harmony memory bank, and setting a preset number of harmony variables in the harmony memory bank;

randomly generating a first variable, and determining whether the first variable is less than the value of the HMCR, wherein the first variable is a random number between 0 and 1;

acquiring a target harmony variable according to the determination result, and fine-tuning the target harmony variable;

determining whether all frequency band data has been traversed;

if all frequency band data has been traversed, determining whether the harmony after the fine-tuning perturbation is better than the worst harmony in the harmony memory bank;

if the harmony after the fine-tuning perturbation is better than the worst harmony in the harmony memory bank, replacing the worst harmony with the harmony after the fine-tuning perturbation to form a new harmony memory bank.

In an embodiment, the sound effect parameters include a fine-tuning probability PAR and a fine-tuning bandwidth BW, and the step of acquiring a new harmony variable according to the determination result and fine-tuning the new harmony variable includes:

if the first variable is less than the value of the HMCR, acquiring any harmony variable from the current harmony memory bank as the target harmony variable;

if the first variable is greater than or equal to the value of the HMCR, randomly generating a new harmony variable from the frequency band data as the target harmony variable;

fine-tuning the target harmony variable based on the fine-tuning probability PAR and the fine-tuning bandwidth BW.

In an embodiment, after the step of determining whether all frequency band data has been traversed, the method further includes:

if not all frequency band data has been traversed, randomly generating a new variable as the first variable, and returning to the step of determining whether the first variable is less than the value of the HMCR.

In addition, to achieve the above objective, this application also provides an audio playback device, the audio playback device including: a memory, a processor, and an audio playback program stored in the memory and executable on the processor, wherein the audio playback program, when executed by the processor, implements the steps of any one of the audio playback methods described above.

In addition, to achieve the above objective, this application also provides a computer-readable storage medium on which an audio playback program is stored, wherein the audio playback program, when executed by a processor, implements the steps of any one of the audio playback methods described above.

This application acquires the PCM data corresponding to the audio data input to the sound card, performs frequency band separation on the PCM data, and extracts the frequency band data after frequency band separation according to preset audio separation parameters; performs sound effect processing on the frequency band data according to a preset sound effect algorithm; and writes the PCM data after sound effect processing back to the sound card and plays the audio data on the sound card. This improves the sound effect of the smart TV and enhances its market competitiveness and user experience without adding any hardware cost to the smart TV.

Description of the drawings

FIG. 1 is a schematic structural diagram of the device in the hardware operating environment involved in the embodiments of this application;

FIG. 2 is a schematic flowchart of the first embodiment of the audio playback method of this application;

FIG. 3 is a schematic structural diagram of the Android audio framework;

FIG. 4 is a schematic flowchart of the detailed steps of acquiring the PCM data corresponding to the audio data input to the sound card in FIG. 2;

FIG. 5 is a schematic flowchart of the second embodiment of the audio playback method of this application.

The realization of the purpose, the functional characteristics, and the advantages of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed description

It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.

FIG. 1 is a schematic structural diagram of the device in the hardware operating environment involved in the embodiments of this application.

The device in the embodiments of this application may be a smart TV, or another device capable of audio playback, such as a smartphone, a tablet computer, a PC, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, or a portable computer.

As shown in FIG. 1, the device may include a processor 1001 (for example, a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory. Optionally, the memory 1005 may also be a storage device independent of the processor 1001.

In an embodiment, the device may also include a speaker, a microphone, and the like, which are not described in detail here.

Those skilled in the art will understand that the terminal structure shown in FIG. 1 does not limit the terminal, which may include more or fewer components than shown, combine certain components, or arrange the components differently.

As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and an audio playback program. The processor 1001 may be used to call the audio playback program stored in the memory 1005 and perform the following operations:

Further, the processor 1001 may call the audio playback program stored in the memory 1005 and also perform the following operations:

acquiring the preset audio separation parameters;

acquiring the sound effect parameters corresponding to the PCM data from an audio database according to the sound scene, and loading the sound effect parameters into the preset sound effect algorithm.

The preset sound effect algorithm is a harmony search algorithm.

If not all frequency band data has been traversed, a new variable is randomly generated as the first variable, and the process returns to the step of determining whether the first variable is less than the value of the HMCR.

The specific embodiments of the audio playback device of this application are basically the same as the embodiments of the audio playback method described below, and are not repeated here.

Referring to FIG. 2, which is a schematic flowchart of the first embodiment of the audio playback method of this application, the audio playback method includes:

Step S10: acquire the PCM data corresponding to the audio data input to the sound card.

The device in the embodiments of this application may be a smart TV, or another device capable of audio playback, such as a smartphone, a tablet computer, a PC, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, or a portable computer. For convenience of description, the following embodiments all take a smart TV as an example. At present, most smart TVs run the Android system and play sound through the Android audio framework, as shown in FIG. 3, which is a schematic structural diagram of the Android audio framework. The Android audio framework consists of the Framework, AudioHal, ALSA, KERNEL, and HW layers. The user opens application software at the application layer, such as a player, and uses it to play audio from one of the TV's channels, such as the audio in online video, local media, or third-party video.

During playback, the smart TV calls the AudioHal interface through the audio interface of the Framework layer and writes audio data, such as the original audio data and the microphone, Bluetooth, and ARC return audio, to the sound card registered in the KERNEL layer. The AudioHal layer then opens the sound card in the KERNEL layer through the pcm_open interface and creates a thread in the mediaserver process. Using an IOCTRL operation, this thread reads, through the pcm_read interface, the PCM data that the upper-layer application wrote to the sound card, and feeds the PCM data to the smart sound effect module (SmartSE) in the AudioHal layer. The smart sound effect module applies the preset sound effect algorithm to the PCM data and then writes the processed PCM data back into the sound card's alsa driver through the pcm_write interface. The SOC chip reads the audio from the sound card's alsa driver, encodes it, mixes it, and superimposes sound effects, then outputs it to the power amplifier IC, and the result is presented to the user through the speaker, the digital audio interface, the ARC interface, the AVOUT interface, or the like.

To prevent excessive sound delay and stuttering caused by blocked audio reads and writes, this application sets up two buffers during audio data reading and writing: a read buffer and a write buffer. Two threads are built on these buffers to handle reading and writing of the audio data separately, and reading and writing are configured with the same sampling rate, number of channels, and bit depth, so that audio input and output stay synchronized. Because the two buffers form two FIFO ring queues, when system CPU usage is too high, an action that takes too long to process will not cause the PCM stream to stall, PcmInFIFO to overflow, or PcmOutFIFO to run empty. When designing the software, the user can size the two FIFOs and set their idle time according to the running speed of each platform, to prevent excessive AVSYNC delay and excessive CPU usage.
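The two-buffer arrangement described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `run_pipeline`, `process`, and `fifo_size` are hypothetical names, bounded `queue.Queue` objects stand in for the PcmInFIFO and PcmOutFIFO ring queues, and plain lists of samples stand in for the frames that pcm_read and pcm_write would exchange with the sound card.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker used only in this sketch


def run_pipeline(frames, process, fifo_size=8):
    """Pass frames through a read FIFO, an effect stage, and a write FIFO."""
    pcm_in_fifo = queue.Queue(maxsize=fifo_size)   # stands in for PcmInFIFO
    pcm_out_fifo = queue.Queue(maxsize=fifo_size)  # stands in for PcmOutFIFO
    written = []

    def reader():
        # Stand-in for the thread that would call pcm_read.
        for frame in frames:
            pcm_in_fifo.put(frame)  # blocks when full, so nothing overflows
        pcm_in_fifo.put(SENTINEL)

    def writer():
        # Stand-in for the thread that would call pcm_write.
        while True:
            frame = pcm_out_fifo.get()  # blocks when empty
            if frame is SENTINEL:
                break
            written.append(frame)

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    for thread in threads:
        thread.start()

    # The sound effect stage runs on the calling thread in this sketch.
    while True:
        frame = pcm_in_fifo.get()
        if frame is SENTINEL:
            pcm_out_fifo.put(SENTINEL)
            break
        pcm_out_fifo.put(process(frame))

    for thread in threads:
        thread.join()
    return written
```

Bounding both queues is the design choice that matters here: a slow stage makes its neighbour wait instead of dropping or duplicating frames, and `fifo_size` plays the role of the FIFO size that the text says should be tuned per platform.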

Of course, on other smart TV operating systems, such as YunOS, WebOS, and TIZEN, the PCM data in the sound card can likewise be acquired in software, which is not described in detail here.

Step S20: perform frequency band separation on the PCM data, and extract the frequency band data after frequency band separation according to preset audio separation parameters.

A preset filtering algorithm is used to perform frequency band separation on the acquired PCM data: the PCM data is separated according to the frequency band of the audio it belongs to, and the frequency band data after separation is then extracted according to the preset audio separation parameters. For example, if the audio separation parameters are set to 100 and 500, audio with a frequency below 100 Hz is considered to be in the low band, audio with a frequency above 100 Hz and below 500 Hz is considered to be in the middle band, and audio with a frequency above 500 Hz is considered to be in the high band. More than one audio separation parameter can be set; the more parameters there are, the finer the audio processing. Those skilled in the art will appreciate that the specific number and values of the audio separation parameters can be set according to the user's experience or according to actual needs, which are not enumerated here.
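The 100 Hz and 500 Hz example can be illustrated with a short Python sketch. It assumes the spectral components have already been obtained as (frequency, amplitude) pairs by some filtering or transform step that the text does not specify; `band_index` and `split_into_bands` are hypothetical names, and assigning a component that sits exactly on a cut-off to the upper band is an assumption, since the text leaves that case open.

```python
from bisect import bisect_right


def band_index(freq_hz, cutoffs_hz):
    """Return 0 for the lowest band and len(cutoffs_hz) for the highest."""
    # Assumption: a frequency equal to a cut-off falls into the upper band.
    return bisect_right(sorted(cutoffs_hz), freq_hz)


def split_into_bands(components, cutoffs_hz):
    """Group (frequency_hz, amplitude) pairs into len(cutoffs_hz) + 1 bands."""
    bands = [[] for _ in range(len(cutoffs_hz) + 1)]
    for freq_hz, amplitude in components:
        bands[band_index(freq_hz, cutoffs_hz)].append((freq_hz, amplitude))
    return bands
```

With cut-offs `[100, 500]` this yields three lists (low, middle, high); adding more cut-offs yields more bands, matching the remark that more separation parameters give finer processing.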

Step S30: perform sound effect processing on the frequency band data according to a preset sound effect algorithm.

After the frequency band data is acquired, sound effect processing is performed on it according to the preset sound effect algorithm. This embodiment uses the harmony search algorithm. Just as the genetic algorithm imitates biological evolution, the simulated annealing algorithm imitates physical annealing, and the particle swarm optimization algorithm imitates a flock of birds, the harmony search algorithm imitates the principle of musical performance. For example, if there are n items of frequency band data in the current data, defined as x_1 to x_n, then X = {x_1, x_2, ..., x_n}, and a preset number of harmony variables is randomly generated from the solution space of X. In this embodiment the preset number is set according to the size of the harmony memory bank: if the size of the harmony memory bank is defined as HMS, then HMS harmony variables are generated, namely X^1, X^2, ..., X^HMS. These HMS harmony variables are placed in the harmony memory bank and the corresponding f(X) is recorded, so that the generated harmony memory bank has the form:
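The matrix referred to here is not reproduced in this text. As a hedged reconstruction based only on the definitions above (n frequency band variables per harmony, HMS harmonies, and a recorded value f(X) for each), a harmony memory bank in the layout commonly used for harmony search would read as follows; the superscript indexes the harmony and the subscript indexes the variable, and the exact layout in the original figure may differ.

```latex
\mathrm{HM} =
\left[
\begin{array}{cccc|c}
x_1^{1} & x_2^{1} & \cdots & x_n^{1} & f(X^{1}) \\
x_1^{2} & x_2^{2} & \cdots & x_n^{2} & f(X^{2}) \\
\vdots  & \vdots  & \ddots & \vdots  & \vdots   \\
x_1^{\mathrm{HMS}} & x_2^{\mathrm{HMS}} & \cdots & x_n^{\mathrm{HMS}} & f(X^{\mathrm{HMS}})
\end{array}
\right]
```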

A random number r1 is generated in [0, 1] and compared with the value probability HMCR of the harmony memory bank, and the target harmony variable is acquired according to the result. If r1 is less than HMCR, a harmony variable is randomly taken from the harmony memory bank as the target harmony variable; if r1 is greater than or equal to HMCR, a harmony variable is randomly generated from the solution space of the frequency band data. After the target harmony variable is acquired, it is subjected to a fine-tuning perturbation. If the target harmony variable was taken from the harmony memory bank, it needs to be fine-tuned: a random number r2 is generated in [0, 1]; if r2 is less than the fine-tuning probability PAR, the harmony variable is adjusted according to the fine-tuning bandwidth BW to obtain a new harmony variable Xnew; if r2 is greater than or equal to the fine-tuning probability PAR, no adjustment is made. Finally, Xnew is evaluated as f(Xnew). If it is better than the harmony with the worst function value in the harmony memory bank, that is, f(Xnew) < f(Xworst), Xnew replaces that worst harmony Xworst; otherwise, no change is made. These steps are repeated until the maximum number of iterations is reached or a stopping criterion is met, at which point the loop ends and the optimal solution is output. In this way, sound effect processing is performed on all frequency band data to obtain the sound effect the user needs.
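The loop described above can be sketched in Python as a basic harmony search. This is a minimal illustration, not the module's actual implementation: the objective `f`, the per-variable `bounds`, the default parameter values, and the fixed iteration count are all assumptions, since the text does not define the function being optimized for sound quality. A smaller f is treated as better, consistent with the condition f(Xnew) < f(Xworst).

```python
import random


def harmony_search(f, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.05,
                   max_iter=2000, seed=0):
    """Minimize f over box bounds; returns (best_harmony, best_score)."""
    rng = random.Random(seed)
    # Initialize the harmony memory bank with HMS random harmonies.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [f(x) for x in memory]

    for _ in range(max_iter):
        new = []
        for i, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # r1 < HMCR: take this variable from a random stored harmony.
                value = rng.choice(memory)[i]
                if rng.random() < par:
                    # r2 < PAR: perturb within the fine-tuning bandwidth BW.
                    value += rng.uniform(-1.0, 1.0) * bw
                    value = min(max(value, lo), hi)
            else:
                # r1 >= HMCR: draw a fresh value from the solution space.
                value = rng.uniform(lo, hi)
            new.append(value)

        # Replace the worst stored harmony only if the new one is better.
        worst = max(range(hms), key=scores.__getitem__)
        new_score = f(new)
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score

    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]
```

For instance, `harmony_search(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 3)` searches for three values close to zero. In the patent's setting, each variable would correspond to one item of frequency band data, and HMCR, PAR, and BW would come from the scene-specific sound effect parameters.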

Step S40: write the PCM data after sound effect processing back to the sound card, and play the audio data on the sound card.

After the smart sound effect module completes the sound effect processing, the PCM data is written to the sound card through the pcm_write interface of the AudioHal layer, and the new audio data in the sound card is played, so that what the user hears is the audio after sound effect processing.

In this embodiment, frequency band separation is performed on the PCM data of the audio data, sound effect processing is performed on the frequency band data according to the preset sound effect algorithm, and the processed audio data is then played. Without increasing hardware cost, the smart TV can thus apply intelligent sound effect processing to the audio data for the different sound scenes it contains, which improves the sound effect of the smart TV and enhances its market competitiveness and user experience.

Further, referring to FIG. 4, which is a schematic flowchart of the detailed steps of acquiring the PCM data corresponding to the audio data input to the sound card in FIG. 2, the step of acquiring the PCM data corresponding to the audio data input to the sound card includes:

Step S50: acquire the PCM data corresponding to the audio data input to the sound card, and determine the sound scene corresponding to the TV channel that outputs the PCM data.

Because different TV channels deliver different audio, the sound effect processing applied to the audio also differs. For example, when the audio is a live concert, it is necessary to present the bass of the cello and the spatial surround effect of instruments such as the drum kit and the saxophone; when the audio is a war scene in a film or TV work, it is necessary to restore effects such as voices on the battlefield, the clash of weapons, and the neighing of war horses. Therefore, before sound effect processing is performed on the audio, the PCM data corresponding to the audio data input to the sound card must be acquired, and the sound scene corresponding to the TV channel that outputs the PCM data must be determined.

Step S60: set corresponding sound effect parameters according to the sound scene.

Different sound effect parameters are set in the audio database for different sound scenes. The sound effect parameters may be parameters of the harmony search algorithm, such as the value probability HMCR of the harmony memory bank, the fine-tuning probability PAR, and the fine-tuning bandwidth BW, or other parameters related to the sound effect algorithm. In this embodiment, after the PCM data is acquired from the sound card, the sound scene is determined according to the TV channel that outputs the PCM data, and corresponding sound effect parameters are set for that sound scene. As another implementation, the sound scene of each TV channel can be defined in advance, and the corresponding sound effect parameters can be set and saved in the audio database, to be called directly when needed. Specifically, the sound effect parameters can be saved in the audio database as an ini file, so that when audio data is read into the sound card, the corresponding sound effect parameters in the audio database are acquired automatically and the desired sound effect is achieved directly through the preset sound effect algorithm.
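Storing the per-scene parameters in an ini file could look like the following Python sketch, which uses the standard `configparser` module. The section names (`concert`, `war_movie`), the numeric values, and the function name `load_effect_params` are hypothetical and chosen only for illustration; the text specifies the ini format and the parameter names HMCR, PAR, and BW, but not the file's layout.

```python
import configparser

# Hypothetical ini content: one section per sound scene (values are made up).
EXAMPLE_INI = """
[concert]
HMCR = 0.95
PAR = 0.30
BW = 0.01

[war_movie]
HMCR = 0.85
PAR = 0.45
BW = 0.05
"""


def load_effect_params(ini_text, scene):
    """Return the sound effect parameters of one scene as a dict of floats."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser[scene]  # raises KeyError for an unknown scene
    # configparser lower-cases option names, so restore upper case here.
    return {key.upper(): section.getfloat(key) for key in section}
```

Keeping the parameters in a file like this matches the stated goal of the embodiment: the same algorithm code can serve a new scene or platform by editing the ini file rather than the code.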

In this embodiment, different sound effect parameters are set for different sound scenes, and the corresponding sound effect parameters are acquired automatically when the sound effect algorithm runs to perform sound effect processing on the audio data. There is no need to write multiple sets of algorithms for different platforms and different sound scenes, which reduces the workload of R&D personnel and makes the software easier to port.

Further, referring to FIG. 5, which is a schematic flowchart of the second embodiment of the audio playback method of this application, based on the embodiment shown in FIG. 2, step S30 of performing sound effect processing on the frequency band data according to a preset sound effect algorithm includes:

Step S301: initialize the harmony memory bank, and set a preset number of harmony variables in the harmony memory bank.

In this embodiment, the preset sound effect algorithm is the harmony search algorithm. Before performing sound effect processing on the audio data, the smart sound effect module needs to obtain a storage space of a preset size in the memory as the harmony memory bank, initialize the harmony memory bank, and generate a preset number of harmony variables based on the frequency band data obtained by frequency band separation of the audio data. For example, if there are n items of frequency band data in the current data, defined as x_1 to x_n, then X = {x_1, x_2, ..., x_n}, and a preset number of harmony variables is randomly generated from the solution space of X. In this embodiment the preset number is set according to the size of the harmony memory bank: if the size of the harmony memory bank is defined as HMS, then HMS harmony variables are generated, namely X^1, X^2, ..., X^HMS. These HMS harmony variables are placed in the harmony memory bank and the corresponding f(X) is recorded, so that the generated harmony memory bank has the form:

Step S302: randomly generate a first variable, and determine whether the first variable is less than the value of HMCR, where the first variable is a random number between 0 and 1.

Step S303: obtain the target harmony variable according to the result of the determination, and fine-tune the target harmony variable.

The sound effect parameters corresponding to the current audio data are obtained from the audio database. These parameters include the harmony memory bank value probability HMCR, that is, the probability of taking a value from the harmony memory bank. A first variable r1 is generated at random in the interval [0, 1] and compared with HMCR. If r1 is less than HMCR, step S310 is executed; if r1 is greater than or equal to HMCR, step S311 is executed.
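The branch on r1 can be sketched as follows. The sketch assumes the component-wise reading used in the standard harmony search, where the choice is made once per item of frequency band data; the function name choose_component and the parameters low and high (an assumed value range for the band) are hypothetical. Note that Python's random() draws from [0, 1) rather than the closed interval [0, 1], which does not affect the comparison in practice.

```python
import random


def choose_component(memory, i, low, high, hmcr, rng=random):
    """Choose the i-th component of the target harmony variable.

    Returns (value, from_memory) so that a caller can tell whether the
    fine-tuning step applies to this value.
    """
    r1 = rng.random()  # first variable
    if r1 < hmcr:
        # Step S310: take the component from any stored harmony variable.
        return rng.choice(memory)[i], True
    # Step S311: generate a new value at random (here, uniformly in [low, high]).
    return rng.uniform(low, high), False
```

Returning the from_memory flag keeps the selection separate from the fine-tuning step, which only applies to values taken from the harmony memory bank.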

Specifically, step S303 includes:

Step S310: obtain any harmony variable from the current harmony memory bank as the target harmony variable.

Step S311: randomly generate a new harmony variable from the frequency band data as the target harmony variable.

Step S312: fine-tune the target harmony variable based on the fine-tuning probability PAR and the fine-tuning bandwidth BW.

Specifically, after the target harmony variable is obtained, it is subjected to a fine-tuning perturbation. If the target harmony variable was taken from the harmony memory bank, it is fine-tuned as follows:

A second variable r2 is generated at random in the interval [0, 1]. If r2 is less than the fine-tuning probability PAR in the sound effect parameters, the target harmony variable is adjusted according to the fine-tuning bandwidth BW, yielding a new harmony variable Xnew; if r2 is greater than or equal to PAR, no adjustment is made.

If the target harmony variable is a new harmony variable Xnew generated at random from the frequency band data, the adjustment based on the fine-tuning probability PAR and the fine-tuning bandwidth BW is skipped.
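The fine-tuning rule can be sketched as below. The text does not give the adjustment formula, so the sketch assumes the common harmony search form, a uniform shift of at most BW in either direction, and clamps the result to an assumed range [low, high]; the name fine_tune and these range parameters are hypothetical.

```python
import random


def fine_tune(value, from_memory, par, bw, low, high, rng=random):
    """Apply the PAR/BW fine-tuning perturbation to one component."""
    if not from_memory:
        # Randomly generated values skip the adjustment entirely.
        return value
    r2 = rng.random()  # second variable
    if r2 < par:
        # Assumed adjustment: shift by a uniform amount in [-bw, +bw].
        value += rng.uniform(-1.0, 1.0) * bw
        # Assumed safeguard: keep the value inside the permitted range.
        value = min(max(value, low), high)
    return value
```

A small BW keeps the adjusted value close to the stored one, so the perturbation refines an existing harmony rather than replacing it.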

Step S304: determine whether all frequency band data has been traversed.

Each comparison of a random variable with HMCR traverses one item of frequency band data, and the process loops in turn. All frequency band data in the harmony memory bank must be traversed before every audio frequency band has been adjusted, so the algorithm checks in real time, while it runs, whether all frequency band data has been traversed. If all frequency band data has been traversed, steps S305 and S306 are executed; otherwise, step S307 is executed.

Step S305: determine whether the harmony after the fine-tuning perturbation is better than the worst harmony in the harmony memory bank.

Step S306: if the harmony after the fine-tuning perturbation is better than the worst harmony in the harmony memory bank, replace the worst harmony with the perturbed harmony to form a new harmony memory bank.

Step S307: randomly generate a new variable as the first variable, and return to the step of determining whether the first variable is less than the value of HMCR.

If all frequency band data has been traversed, the obtained Xnew is evaluated by computing its function value f(Xnew). If f(Xnew) is better than the worst function value in the harmony memory bank, that is, f(Xnew) < f(Xworst), Xnew replaces the harmony Xworst that has the worst function value; otherwise, no change is made. Steps S305 and S306 are repeated until all frequency band data has been traversed, at which point the loop ends and the optimal solution is output. In another implementation, the maximum number of iterations, rather than being determined by the number of items of frequency band data, may be a parameter preset in the sound effect parameters; iteration stops once the loop count reaches this maximum.
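Steps S301 to S307 can be put together in a compact sketch. It follows the alternative implementation in which a preset maximum number of iterations ends the loop, and it treats a smaller f as better, consistent with f(Xnew) < f(Xworst). The objective f is not defined in the text, so the caller supplies it; the function name harmony_search, the bounds list (an assumed value range per frequency band), and the uniform fine-tuning formula are assumptions for illustration.

```python
import random


def harmony_search(f, bounds, hms, hmcr, par, bw, max_iter, seed=None):
    """Minimize f over one value per frequency band; returns (best, f(best))."""
    rng = random.Random(seed)
    # S301: fill the harmony memory bank with HMS random harmony variables.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [f(x) for x in memory]
    for _ in range(max_iter):
        new = []
        for i, (lo, hi) in enumerate(bounds):  # S304: traverse every band
            if rng.random() < hmcr:  # S302: first variable r1 against HMCR
                value = rng.choice(memory)[i]  # S310: take from memory
                if rng.random() < par:  # S312: second variable r2 against PAR
                    value += rng.uniform(-1.0, 1.0) * bw
                    value = min(max(value, lo), hi)
            else:
                value = rng.uniform(lo, hi)  # S311: new random value
            new.append(value)
        new_score = f(new)
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:  # S305 and S306: replace the worst harmony
            memory[worst], scores[worst] = new, new_score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]
```

Because only the worst harmony is ever replaced, and only by a better one, the best function value in the memory bank never gets worse from one iteration to the next.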

Those skilled in the art will appreciate that, besides the harmony search algorithm described above, the preset sound effect algorithm may be another sound effect algorithm implemented in software, or a software algorithm combined with hardware such as filters; these are not enumerated here.

In this embodiment, the audio data is processed by a preset sound effect algorithm. Because the sound effect parameters differ between sound scenes, the algorithm yields different results, which realizes sound effect processing for different sound scenes.

In addition, an embodiment of this application further provides a computer-readable storage medium storing an audio playback program which, when executed by a processor, implements the following operations:

Further, when executed by the processor, the audio playback program also implements the following operations:

obtaining preset audio separation parameters;

Further, when executed by the processor, the audio playback program also implements the following operations:

The specific embodiments of the computer-readable storage medium of this application are substantially the same as the embodiments of the audio playback method described above, and are not repeated here.

It should be noted that, in this document, the terms "comprise", "include" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article or system. Without further limitation, an element defined by the phrase "including a..." does not exclude the presence of additional identical elements in the process, method, article or system that includes it.

The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments.

From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. On this understanding, the technical solution of this application, in essence or in the part that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, magnetic disk or optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device or the like) to execute the methods described in the embodiments of this application.

The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

1. An audio playback method, wherein the audio playback method includes:

obtaining PCM data corresponding to the audio data input to the sound card;

performing frequency band separation on the PCM data, and extracting frequency band data after the frequency band separation according to preset audio separation parameters;

performing sound effect processing on the frequency band data according to a preset sound effect algorithm;

writing the PCM data after the sound effect processing back to the sound card, and playing the audio data on the sound card.

2. The audio playback method according to claim 1, wherein the step of performing frequency band separation on the PCM data and extracting frequency band data after the frequency band separation according to preset audio separation parameters comprises:

performing frequency band separation on the PCM data according to a preset filtering algorithm;

obtaining preset audio separation parameters;

extracting frequency band data after the frequency band separation according to the audio separation parameters.

3. The audio playback method according to claim 1, wherein the step of obtaining PCM data corresponding to the audio data input to the sound card comprises:

obtaining PCM data corresponding to the audio data input to the sound card, and determining the sound scene corresponding to the TV channel that outputs the PCM data;

setting corresponding sound effect parameters according to the sound scene.

4. The audio playback method of claim 3, wherein the step of performing sound effect processing on the frequency band data according to a preset sound effect algorithm comprises:

transferring the sound effect parameters into the preset sound effect algorithm to perform sound effect processing on the frequency band data.

5. The audio playback method of claim 4, wherein the preset sound effect algorithm is a harmony search algorithm.

6. The audio playback method according to claim 5, wherein the sound effect parameters include the value probability HMCR of the harmony memory bank, and the step of transferring the sound effect parameters into a preset sound effect algorithm to perform sound effect processing on the frequency band data comprises:

initializing the harmony memory bank, and setting a preset number of harmony variables in the harmony memory bank;

randomly generating a first variable, and determining whether the first variable is less than the value of the HMCR, wherein the first variable is a random number between 0 and 1;

obtaining the target harmony variable according to the result of the determination, and fine-tuning the target harmony variable;

determining whether all frequency band data has been traversed;

if all frequency band data has been traversed, determining whether the harmony after the fine-tuning perturbation is better than the worst harmony in the harmony memory bank;

if the harmony after the fine-tuning perturbation is better than the worst harmony in the harmony memory bank, replacing the worst harmony with the harmony after the fine-tuning perturbation to form a new harmony memory bank.

7. The audio playback method of claim 6, wherein the sound effect parameters include a fine-tuning probability PAR and a fine-tuning bandwidth BW, and the step of acquiring a new harmony variable according to the judgment result and fine-tuning the new harmony variable comprises:

if the first variable is less than the value of the HMCR, obtaining any harmony variable from the current harmony memory bank as the target harmony variable;

if the first variable is greater than or equal to the value of the HMCR, randomly generating a new harmony variable as the target harmony variable according to the frequency band data;

fine-tuning the target harmony variable based on the fine-tuning probability PAR and the fine-tuning bandwidth BW.

8. The audio playback method of claim 6, wherein after the step of determining whether all frequency band data has been traversed, the method further comprises:

if not all frequency band data has been traversed, randomly generating a new variable as the first variable, and returning to the step of determining whether the first variable is less than the value of the HMCR.

9. An audio playback device, wherein the audio playback device includes a memory, a processor, and an audio playback program stored on the memory and executable on the processor, and the audio playback program, when executed by the processor, implements the steps of the audio playback method according to any one of claims 1 to 8.

10. A computer-readable storage medium having an audio playback program stored thereon, wherein the audio playback program, when executed by a processor, implements the steps of the audio playback method according to any one of claims 1 to 8.