WO2024174183A1

WO2024174183A1 - Lung sound enhancement method and system, and device and storage medium

Info

Publication number: WO2024174183A1
Application number: PCT/CN2023/077983
Authority: WO
Inventors: 李坚强; 陈杰; 张劲; 王利; 代雨嫣
Original assignee: Shenzhen University
Current assignee: Shenzhen University
Priority date: 2023-02-23
Filing date: 2023-02-23
Publication date: 2024-08-29
Anticipated expiration: 2025-08-23

Abstract

The embodiments of the present application belong to the technical field of sound signal processing. Disclosed are a lung sound enhancement method and system, and a device and a storage medium. The lung sound enhancement method comprises: acquiring lung sound data to be enhanced; and converting said lung sound data by using a trained lung sound enhancement model, so as to obtain enhanced lung sound data, wherein the lung sound enhancement model is constructed on the basis of a first variational auto-encoder, a second variational auto-encoder and a cyclic consistency loss model, and is trained by means of clean lung sound data and noisy lung sound data, so as to obtain the enhanced lung sound data.

Description

Lung sound enhancement method, system, device and storage medium

Technical Field

本申请涉及一种声音信号处理技术领域，尤其涉及一种肺音增强方法、系统、设备和存储介质。The present application relates to the technical field of sound signal processing, and in particular to a lung sound enhancement method, system, device and storage medium.

Background Art

肺音是在呼吸过程中气流通过身体气道和肺泡引起的振动产生的声音，这种声音通过肺组织和胸壁传递到身体表面。肺音信号具有非高斯随机特性，是一种不稳定的声音信号。在医学领域，肺音是衡量肺部生理和病理健康状况的重要指标。肺部听诊具有快速、低成本、可持续监测等优点，被医务人员广泛使用。Lung sounds are the sounds produced by the vibrations caused by airflow passing through the body's airways and alveoli during breathing. This sound is transmitted to the body surface through the lung tissue and chest wall. Lung sound signals have non-Gaussian random characteristics and are unstable sound signals. In the medical field, lung sounds are an important indicator of the physiological and pathological health of the lungs. Lung auscultation has the advantages of rapidity, low cost, and sustainable monitoring, and is widely used by medical personnel.

由于肺音中容易有大量的噪音，而这些噪声的频带又与肺音的频带重叠，导致现有技术中通常使用小波变换、自适应滤波器等来增强肺音信号。但这些方法需要大量干净的肺音作为模型训练的样本，而干净的肺音本就难以获取，增大了肺音增强的成本。Since lung sounds are prone to contain a lot of noise, and the frequency bands of these noises overlap with the frequency bands of lung sounds, wavelet transforms, adaptive filters, etc. are usually used in the prior art to enhance lung sound signals. However, these methods require a large number of clean lung sounds as samples for model training, and clean lung sounds are difficult to obtain, which increases the cost of lung sound enhancement.

发明内容Summary of the invention

有鉴于此，本申请提供了一种肺音增强方法、系统、设备和存储介质，用于解决现有技术中肺音数据采集难度大，导致肺音增强成本高的问题。为达上述之一或部分或全部目的或是其他目的，本申请提出一种肺音增强方法、系统、设备和存储介质，第一方面：In view of this, the present application provides a lung sound enhancement method, system, device and storage medium, which are used to solve the problem that lung sound data collection is difficult in the prior art, resulting in high cost of lung sound enhancement. In order to achieve one or part or all of the above purposes or other purposes, the present application proposes a lung sound enhancement method, system, device and storage medium, in the first aspect:

一种肺音增强方法，包括：A lung sound enhancement method, comprising:

获取待增强肺音数据；Acquire lung sound data to be enhanced;

利用训练完成的肺音增强模型对所述待增强肺音数据进行转换，得到增强肺音数据；Using the trained lung sound enhancement model to transform the lung sound data to be enhanced, to obtain enhanced lung sound data;

所述肺音增强模型为基于第一变分自动编码器、第二变分自动编码器、循环一致性损失模型进行构建，且通过干净肺音数据、带噪肺音数据进行训练得到增强肺音数据。The lung sound enhancement model is constructed based on a first variational autoencoder, a second variational autoencoder, and a cycle consistency loss model, and is trained with clean lung sound data and noisy lung sound data to obtain enhanced lung sound data.

在一实施例中，在所述利用训练完成的肺音增强模型对所述待增强肺音数据进行转换，得到增强肺音数据之前，所述方法还包括：In one embodiment, before converting the lung sound data to be enhanced using the trained lung sound enhancement model to obtain enhanced lung sound data, the method further includes:

通过所述干净肺音数据对所述第一变分自动编码器进行训练，得到完成训练的所述第一变分自动编码器以及所述第一变分自动编码器对应的第一特征向量数据；Training the first variational autoencoder using the clean lung sound data to obtain the trained first variational autoencoder and first feature vector data corresponding to the first variational autoencoder;

通过所述带噪肺音数据和完成训练的所述第一变分自动编码器对所述第二变分自动编码器进行训练，得到所述第二变分自动编码器对应的第二特征向量数据、所述第一特征向量数据与所述第二特征向量数据之间的映射关系和重构带噪数据；所述第二特征向量数据、所述映射关系和所述重构带噪数据为在所述循环一致性损失模型约束下得到的；The second variational autoencoder is trained by using the noisy lung sound data and the first variational autoencoder that has completed training, to obtain second feature vector data corresponding to the second variational autoencoder, a mapping relationship between the first feature vector data and the second feature vector data, and reconstructed noisy data; the second feature vector data, the mapping relationship, and the reconstructed noisy data are obtained under the constraints of the cycle consistency loss model;

在所述重构带噪数据与所述带噪肺音数据之间满足预设的匹配条件时，判定所述肺音增强模型训练完成。When a preset matching condition is satisfied between the reconstructed noisy data and the noisy lung sound data, it is determined that the training of the lung sound enhancement model is completed.

在一实施例中，所述通过所述带噪肺音数据和完成训练的所述第一变分自动编码器对所述第二变分自动编码器进行训练，得到所述第二变分自动编码器对应的第二特征向量数据、所述第一特征向量数据与所述第二特征向量数据之间的映射关系和重构带噪数据的步骤包括：In one embodiment, the step of training the second variational autoencoder using the noisy lung sound data and the trained first variational autoencoder to obtain second feature vector data corresponding to the second variational autoencoder, a mapping relationship between the first feature vector data and the second feature vector data, and reconstructing the noisy data includes:

利用所述第二变分自动编码器的编码器二对所述带噪肺音数据进行编码处理，得到所述第二特征向量数据；Using encoder 2 of the second variational autoencoder to encode the noisy lung sound data to obtain the second feature vector data;

根据所述循环一致性损失模型构建的所述第一特征向量数据与所述第二特征向量数据之间的映射关系，利用完成训练的所述第一变分自动编码器的解码器一对所述第二特征向量数据进行解码处理，得到净化肺音数据；According to the mapping relationship between the first feature vector data and the second feature vector data constructed by the cycle consistency loss model, the second feature vector data is decoded by using the decoder 1 of the first variational autoencoder that has completed training to obtain purified lung sound data;

利用训练完成的所述第一变分自动编码器的编码器一对所述净化肺音数据进行编码处理，得到所述第一特征向量数据；Encode the purified lung sound data using the first encoder of the first variational autoencoder that has been trained Processing to obtain the first feature vector data;

基于所述映射关系利用所述第二变分自动编码器的解码器二对所述第一特征向量数据进行解码处理，得到所述重构带噪数据。Based on the mapping relationship, the first feature vector data is decoded using decoder 2 of the second variational autoencoder to obtain the reconstructed noisy data.

在一实施例中，在所述得到增强肺音数据之后，所述方法还包括：In one embodiment, after obtaining the enhanced lung sound data, the method further includes:

利用预设的生成对抗网络模型对所述增强肺音数据进行处理，得到判别值数据；Processing the enhanced lung sound data using a preset generative adversarial network model to obtain discriminant value data;

所述生成对抗网络模型包括生成器和判别器；所述第一变分自动编码器和所述第二变分自动编码器均对应有所述生成对抗网络模型；与所述第一变分自动编码器对应的所述生成对抗网络模型将所述解码器一判定为所述生成器；与所述第二变分自动编码器对应的所述生成对抗网络模型将所述解码器二判定为所述生成器；在所述第一变分自动编码器的损失函数收敛后对对应的所述生成对抗网络模型进行训练；在所述第二变分自动编码器的损失函数收敛后对对应的所述生成对抗网络模型进行训练。The generative adversarial network model includes a generator and a discriminator; the first variational autoencoder and the second variational autoencoder both correspond to the generative adversarial network model; the generative adversarial network model corresponding to the first variational autoencoder determines the decoder 1 as the generator; the generative adversarial network model corresponding to the second variational autoencoder determines the decoder 2 as the generator; after the loss function of the first variational autoencoder converges, the corresponding generative adversarial network model is trained; after the loss function of the second variational autoencoder converges, the corresponding generative adversarial network model is trained.

在一实施例中，所述解码器一、解码器二和所述判别器中均包含有两层多头自注意力模型。In one embodiment, the decoder 1, the decoder 2 and the discriminator all include a two-layer multi-head self-attention model.

在一实施例中，在所述利用预设的生成对抗网络模型对所述增强肺音数据进行处理，得到判别值数据之后，所述方法还包括：In one embodiment, after the enhanced lung sound data is processed using a preset generative adversarial network model to obtain discriminant value data, the method further includes:

利用训练完成的肺音相位修正模型对所述增强肺音数据进行处理，得到相位修正肺音数据；Processing the enhanced lung sound data using the trained lung sound phase correction model to obtain phase-corrected lung sound data;

所述肺音相位修正模型利用包含有白噪声的所述干净肺音数据训练。The lung sound phase correction model is trained using the clean lung sound data containing white noise.

在一实施例中，在所述利用训练完成的肺音相位修正模型对所述增强肺音数据进行处理，得到相位修正肺音数据之后，所述方法还包括：In one embodiment, after the enhanced lung sound data is processed using the trained lung sound phase correction model to obtain the phase-corrected lung sound data, the method further includes:

利用训练完成的信噪比预测模型对所述待增强肺音数据和所述增强肺音数据进行处理，得到对应的信噪比预测数据。The trained signal-to-noise ratio prediction model is used to process the lung sound data to be enhanced and the enhanced lung sound data to obtain corresponding signal-to-noise ratio prediction data.

第二方面：Second aspect:

一种肺音增强系统，包括获取模块，用于获取待增强肺音数据；A lung sound enhancement system comprises an acquisition module for acquiring lung sound data to be enhanced;

肺音增强模块，用于利用训练完成的肺音增强模型对所述待增强肺音数据进行转换，得到增强肺音数据；A lung sound enhancement module, used to convert the lung sound data to be enhanced by using the trained lung sound enhancement model to obtain enhanced lung sound data;

在一实施例中，所述系统还包括第一训练模块，用于在所述利用训练完成的肺音增强模型对所述待增强肺音数据进行转换，得到增强肺音数据之前，通过所述干净肺音数据对所述第一变分自动编码器进行训练，得到完成训练的所述第一变分自动编码器以及所述第一变分自动编码器对应的第一特征向量数据；In one embodiment, the system further includes a first training module, which is used to train the first variational autoencoder using the clean lung sound data before converting the lung sound data to be enhanced using the trained lung sound enhancement model to obtain the enhanced lung sound data, so as to obtain the trained first variational autoencoder and first feature vector data corresponding to the first variational autoencoder;

第二训练模块，用于通过所述带噪肺音数据和完成训练的所述第一变分自动编码器对所述第二变分自动编码器进行训练，得到所述第二变分自动编码器对应的第二特征向量数据、所述第一特征向量数据与所述第二特征向量数据之间的映射关系和重构带噪数据；所述第二特征向量数据、所述映射关系和所述重构带噪数据为在所述循环一致性损失模型约束下得到的；A second training module is used to train the second variational autoencoder through the noisy lung sound data and the first variational autoencoder that has completed the training, to obtain second feature vector data corresponding to the second variational autoencoder, a mapping relationship between the first feature vector data and the second feature vector data, and reconstructed noisy data; the second feature vector data, the mapping relationship, and the reconstructed noisy data are obtained under the constraints of the cycle consistency loss model;

迭代模块，用于在所述重构带噪数据与所述带噪肺音数据之间满足预设的匹配条件时，判定所述肺音增强模型训练完成。The iteration module is used to determine that the training of the lung sound enhancement model is completed when a preset matching condition is met between the reconstructed noisy data and the noisy lung sound data.

在一实施例中，所述第二训练模块包括第一编码单元，用于利用所述第二变分自动编码器的编码器二对所述带噪肺音数据进行编码处理，得到所述第二特征向量数据；In one embodiment, the second training module includes a first encoding unit, configured to encode the noisy lung sound data using encoder 2 of the second variational autoencoder to obtain the second feature vector data;

第一解码单元，用于根据所述循环一致性损失模型构建的所述第一特征向量数据与所述第二特征向量数据之间的映射关系，利用完成训练的所述第一变分自动编码器的解码器一对所述第二特征向量数据进行解码处理，得到净化肺音数据；A first decoding unit is configured to decode the second feature vector data using a decoder of the first variational autoencoder that has completed training, according to a mapping relationship between the first feature vector data and the second feature vector data constructed by the cycle consistency loss model, to obtain purified lung sound data;

第二编码单元，用于利用训练完成的所述第一变分自动编码器的编码器一对所述净化肺音数据进行编码处理，得到所述第一特征向量数据；A second encoding unit is used to use the first variational autoencoder encoder pair trained to encode the purification The lung sound data is encoded to obtain the first feature vector data;

第二解码单元，用于基于所述映射关系利用所述第二变分自动编码器的解码器二对所述第一特征向量数据进行解码处理，得到所述重构带噪数据。The second decoding unit is used to decode the first feature vector data using the second decoder of the second variational autoencoder based on the mapping relationship to obtain the reconstructed noisy data.

在一实施例中，所述系统还包括生成对抗网络模块，用于在所述得到增强肺音数据之后，利用预设的生成对抗网络模型对所述增强肺音数据进行处理，得到判别值数据；In one embodiment, the system further comprises a generative adversarial network module, which is used to process the enhanced lung sound data using a preset generative adversarial network model to obtain discriminant value data after the enhanced lung sound data is obtained;

在一实施例中，所述系统还包括修正模块，用于在所述利用预设的生成对抗网络模型对所述增强肺音数据进行处理，得到判别值数据之后，利用训练完成的肺音相位修正模型对所述增强肺音数据进行处理，得到相位修正肺音数据；In one embodiment, the system further includes a correction module for processing the enhanced lung sound data using a trained lung sound phase correction model to obtain phase-corrected lung sound data after the enhanced lung sound data is processed using a preset generative adversarial network model to obtain discriminant value data;

在一实施例中，所述系统还包括信噪比预测模块，用于在所述利用训练完成的肺音相位修正模型对所述增强肺音数据进行处理，得到相位修正肺音数据之后，利用训练完成的信噪比预测模型对所述待增强肺音数据和所述增强肺音数据进行处理，得到对应的信噪比预测数据。In one embodiment, the system further includes a signal-to-noise ratio prediction module, which is used to process the enhanced lung sound data using the trained lung sound phase correction model to obtain phase-corrected lung sound data, and then use the trained signal-to-noise ratio prediction model to process the lung sound data to be enhanced and the enhanced lung sound data to obtain corresponding signal-to-noise ratio prediction data.

第三方面：The third aspect:

一种计算机设备，包括存储器和处理器，所述存储器中存储有肺音增强方法，所述处理器用于在执行所述肺音增强方法时采用上述所述方法。A computer device includes a memory and a processor, wherein a lung sound enhancement method is stored in the memory, and the processor is used to adopt the above-mentioned method when executing the lung sound enhancement method.

第四方面：Fourth aspect:

一种存储介质，存储有能够被处理器加载并执行上述所述方法的计算机程序。A storage medium stores a computer program that can be loaded by a processor and execute the above method.

实施本申请实施例，将具有如下有益效果：Implementing the embodiments of the present application will have the following beneficial effects:

在肺音增强模型的构建过程中使用了循环一致性损失模型；在肺音增强模型的训练过程中，使用了干净肺音数据和带噪肺音数据。使肺音增强模型的训练过程无需使用等比例的干净肺音数据与带噪肺音数据，从而降低干净肺音数据的需求量，降低了训练肺音增强模型的难度，同时也降低了肺音增强模型的训练成本，只需要较少量的干净肺音数据即可完成训练，降低了肺音增强的成本。The cycle consistency loss model is used in the construction process of the lung sound enhancement model; clean lung sound data and noisy lung sound data are used in the training process of the lung sound enhancement model. This makes it unnecessary to use clean lung sound data and noisy lung sound data in equal proportions in the training process of the lung sound enhancement model, thereby reducing the demand for clean lung sound data, reducing the difficulty of training the lung sound enhancement model, and also reducing the training cost of the lung sound enhancement model. Only a small amount of clean lung sound data is needed to complete the training, reducing the cost of lung sound enhancement.

BRIEF DESCRIPTION OF THE DRAWINGS

为了更清楚地说明本申请实施例或现有技术中的技术方案，下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍，显而易见地，下面描述中的附图仅仅是本申请的一些实施例，对于本领域普通技术人员来讲，在不付出创造性劳动的前提下，还可以根据这些附图获得其他的附图。In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for use in the embodiments or the description of the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application. For ordinary technicians in this field, other drawings can be obtained based on these drawings without paying any creative work.

其中：in:

图1为一个实施例中肺音增强方法的流程图。FIG. 1 is a flow chart of a lung sound enhancement method according to an embodiment.

图2为一个实施例中肺音增强方法中得到重构带噪数据的流程图。FIG. 2 is a flow chart of obtaining reconstructed noisy data in a lung sound enhancement method in one embodiment.

图3为一个实施例中肺音映射的共享潜在空间相关结构示意图。FIG. 3 is a schematic diagram of a shared latent space correlation structure of lung sound mapping in one embodiment.

图4为一个实施例中变分自动编码器与生成对抗网络模型结合后的结构示意图。FIG4 is a schematic diagram of the structure of a variational autoencoder combined with a generative adversarial network model in one embodiment.

图5为一个实施例中肺音相位修正的流程图。FIG. 5 is a flow chart of lung sound phase correction in one embodiment.

图6为一个实施例中肺音增强系统的结构框图。 FIG6 is a structural block diagram of a lung sound enhancement system in one embodiment.

图7为一个实施例中肺音增强装置的结构示意图。FIG. 7 is a schematic diagram of the structure of a lung sound enhancement device in one embodiment.

DETAILED DESCRIPTION

下面将结合本申请实施例中的附图，对本申请实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅仅是本申请一部分实施例，而不是全部的实施例。基于本申请中的实施例，本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例，都属于本申请保护的范围。The following will be combined with the drawings in the embodiments of the present application to clearly and completely describe the technical solutions in the embodiments of the present application. Obviously, the described embodiments are only part of the embodiments of the present application, not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by ordinary technicians in this field without creative work are within the scope of protection of this application.

在以下的描述中，涉及到“一些实施例”，其描述了所有可能实施例的子集，但是可以理解，“一些实施例”可以是所有可能实施例的相同子集或不同子集，并且可以在不冲突的情况下相互结合。In the following description, reference is made to “some embodiments”, which describe a subset of all possible embodiments, but it will be understood that “some embodiments” may be the same subset or different subsets of all possible embodiments and may be combined with each other without conflict.

除非另有定义，本文所使用的所有的技术和科学术语与属于本申请的技术领域的技术人员通常理解的含义相同。本文中所使用的术语只是为了描述本申请实施例的目的，不是旨在限制本申请。Unless otherwise defined, all technical and scientific terms used herein have the same meaning as those commonly understood by those skilled in the art to which this application belongs. The terms used herein are only for the purpose of describing the embodiments of this application and are not intended to limit this application.

本申请提供一种肺音增强方法，由于肺音中容易有大量的噪音，而这些噪声的频带又与肺音的频带重叠，导致现有技术中通常使用小波变换、自适应滤波器等来增强肺音信号。但这些方法需要大量干净的肺音作为模型训练的样本，而干净的肺音本就难以获取，增大了肺音增强的成本。The present application provides a lung sound enhancement method. Since lung sounds are prone to contain a large amount of noise, and the frequency bands of these noises overlap with the frequency bands of lung sounds, wavelet transform, adaptive filter, etc. are usually used in the prior art to enhance lung sound signals. However, these methods require a large number of clean lung sounds as samples for model training, and clean lung sounds are difficult to obtain, which increases the cost of lung sound enhancement.

为了解决上述缺陷，本申请实施例提供的一种肺音增强方法，如图1所示，包括：In order to solve the above defects, an embodiment of the present application provides a lung sound enhancement method, as shown in FIG1 , comprising:

101、获取待增强肺音数据。101. Obtain lung sound data to be enhanced.

在一实施例中，待增强肺音数据指需要进行去噪的肺音数据。待增强肺音数据可以是当前执行主体被动获取，也可以是当前执行主体主动获取。例如，在一应用场景中，由肺音收集设备将待增强肺音数据传输给当前执行主体，使当前执行主体被动地获得待增强肺音数据；在另一应用场景中，当前执行主体在检测到或者接收到增强指令后，到预设的路径下或接口处获取待增强肺音数据。需要说明的是，当前执行主体可以是MCU、集成芯片、单片机、计算机等具有数据处理功能的智能设备。In one embodiment, the lung sound data to be enhanced refers to the lung sound data that needs to be denoised. The lung sound data to be enhanced can be passively acquired by the current executing subject, or it can be actively acquired by the current executing subject. For example, in one application scenario, the lung sound collection device transmits the lung sound data to be enhanced to the current executing subject, so that the current executing subject passively obtains the lung sound data to be enhanced; in another application scenario, after detecting or receiving the enhancement instruction, the current executing subject obtains the lung sound data to be enhanced at a preset path or interface. It should be noted that the current executing subject can be an intelligent device with data processing function, such as MCU, integrated chip, single-chip microcomputer, computer, etc.

102、利用训练完成的肺音增强模型对所述待增强肺音数据进行转换，得到增强肺音数据。102. Use the trained lung sound enhancement model to convert the lung sound data to be enhanced to obtain enhanced lung sound data.

在一实施例中，所述肺音增强模型为基于第一变分自动编码器、第二变分自动编码器、循环一致性损失模型进行构建，且通过干净肺音数据、带噪肺音数据进行训练得到。In one embodiment, the lung sound enhancement model is constructed based on a first variational autoencoder, a second variational autoencoder, and a cycle consistency loss model, and is trained by using clean lung sound data and noisy lung sound data.

为了便于理解，对肺音增强模型进行说明。由于需要将待增强肺音数据中的噪音剔除，得到不含或者含有少量噪音的增强肺音数据，因此需要一种能够将待增强肺音数据转换为增强肺音数据的模型。其中，增强肺音数据指经过去噪后，得到增强的肺音数据。在训练肺音增强模型的过程中，使用两个变分自动编码器(VAE)，分别命名为第一变分自动编码器和第二变分自动编码器。通过利用干净肺音数据训练第一变分自动编码器，使训练完成的第一变分自动编码器能够提取到干净肺音数据的特征并完成对干净肺音数据的重构。For ease of understanding, the lung sound enhancement model is explained. Since it is necessary to remove the noise in the lung sound data to be enhanced to obtain enhanced lung sound data with no or little noise, a model that can convert the lung sound data to be enhanced into enhanced lung sound data is needed. Among them, enhanced lung sound data refers to the enhanced lung sound data after denoising. In the process of training the lung sound enhancement model, two variational autoencoders (VAE) are used, named the first variational autoencoder and the second variational autoencoder respectively. By using clean lung sound data to train the first variational autoencoder, the trained first variational autoencoder can extract the features of the clean lung sound data and complete the reconstruction of the clean lung sound data.

在一实施例中，为了减少训练过程中所需干净肺音数据的数量，在第二变分自动编码器的训练过程中加入循环一致性损失模型，解决干净肺音数据和带噪肺音数据之间的不均衡问题。利用循环一致性损失模型构建第一变分自动编码器与第二变分自动编码器两个特征向量空间之间的映射关系，使带噪肺音数据在经过第一变分自动编码器和第二变分自动编码器的处理后，能够转换为增强肺音数据，实现待增强肺音数据到增强肺音数据的转变，解决肺音数据的去噪问题。In one embodiment, in order to reduce the amount of clean lung sound data required during the training process, a cycle consistency loss model is added during the training process of the second variational autoencoder to solve the imbalance problem between clean lung sound data and noisy lung sound data. The cycle consistency loss model is used to construct a mapping relationship between the two feature vector spaces of the first variational autoencoder and the second variational autoencoder, so that the noisy lung sound data can be converted into enhanced lung sound data after being processed by the first variational autoencoder and the second variational autoencoder, realizing the transformation of the lung sound data to be enhanced to the enhanced lung sound data, and solving the denoising problem of the lung sound data.

在肺音增强模型的构建过程中使用了循环一致性损失模型；在肺音增强模型的训练过程中，使用了干净肺音数据和带噪肺音数据。使肺音增强模型的训练过程无需使用等比例的干净肺音数据与带噪肺音数据，从而降低干净肺音数据的需求量，降低了训练肺音增强模型的难度，同时也降低了肺音增强模型的训练成本，只需要较少量的干净肺音数据即可完成训练，降低了肺音增强的成本。 The cycle consistency loss model is used in the construction process of the lung sound enhancement model; clean lung sound data and noisy lung sound data are used in the training process of the lung sound enhancement model. This makes it unnecessary to use clean lung sound data and noisy lung sound data in equal proportions in the training process of the lung sound enhancement model, thereby reducing the demand for clean lung sound data, reducing the difficulty of training the lung sound enhancement model, and also reducing the training cost of the lung sound enhancement model. Only a small amount of clean lung sound data is needed to complete the training, reducing the cost of lung sound enhancement.

编码器一在本申请的另一种实施方式中，在步骤所述利用训练完成的肺音增强模型对所述待增强肺音数据进行转换，得到增强肺音数据之前，所述方法还包括：Encoder 1 In another embodiment of the present application, before the step of converting the lung sound data to be enhanced using the trained lung sound enhancement model to obtain enhanced lung sound data, the method further includes:

201、通过所述干净肺音数据对所述第一变分自动编码器进行训练，得到完成训练的所述第一变分自动编码器以及所述第一变分自动编码器对应的第一特征向量数据。201. Train the first variational autoencoder using the clean lung sound data to obtain the trained first variational autoencoder and first feature vector data corresponding to the first variational autoencoder.

在一实施例中，第一变分自动编码器包括编码器一和解码器一，利用编码器一对干净肺音数据进行处理，得到第一特征向量数据；再利用解码器一对第一特征向量数据进行处理，得到重构后的干净肺音数据。通过比对干净肺音数据和重构后的干净肺音数据的相似度，在相似度达到预设的阈值后，判定第一变分自动编码器训练完成。In one embodiment, the first variational autoencoder includes an encoder 1 and a decoder 1, wherein the encoder 1 processes the clean lung sound data to obtain first feature vector data, and the decoder 1 processes the first feature vector data to obtain reconstructed clean lung sound data. By comparing the similarity between the clean lung sound data and the reconstructed clean lung sound data, when the similarity reaches a preset threshold, it is determined that the training of the first variational autoencoder is completed.

202、通过所述带噪肺音数据和完成训练的所述第一变分自动编码器对所述第二变分自动编码器进行训练，得到所述第二变分自动编码器对应的第二特征向量数据、所述第一特征向量数据与所述第二特征向量数据之间的映射关系和重构带噪数据。202. Train the second variational autoencoder using the noisy lung sound data and the trained first variational autoencoder to obtain second feature vector data corresponding to the second variational autoencoder, a mapping relationship between the first feature vector data and the second feature vector data, and reconstruct the noisy data.

其中，所述第二特征向量数据、所述映射关系和所述重构带噪数据为在所述循环一致性损失模型约束下得到的。The second feature vector data, the mapping relationship and the reconstructed noisy data are obtained under the constraints of the cycle consistency loss model.

为了便于理解，第一变分自动编码器和第二变分自动编码器与自动编码器相比，更注重数据生成。即第一变分自动编码器和第二变分自动编码器并不是将输入数据编码为潜在空间中的单个点，而是将输入数据编码为潜在空间中的概率分布。即，在一实施例中，设第一变分自动编码器对应有第一潜在空间，设第二变分自动编码器对应有第二潜在空间。在训练第二变分自动编码器时，使用完成训练的第一变分自动编码器。根据循环一致性损失模型的约束，利用待训练的第二变分自动编码器对带噪肺音数据进行处理，得到第二特征向量数据。设存在一个共享空间，给定一个干净肺音数据至第一变分自动编码器，再给定一个带噪肺音数据至第二变分自动编码器。使位于第一潜在空间中的第一特征向量数据和位于第二潜在空间中的第二特征向量数据，在共享空间中构建映射关系，满足循环一致性损失算法。For ease of understanding, the first variational autoencoder and the second variational autoencoder pay more attention to data generation compared with the autoencoder. That is, the first variational autoencoder and the second variational autoencoder do not encode the input data as a single point in the latent space, but encode the input data as a probability distribution in the latent space. That is, in one embodiment, it is assumed that the first variational autoencoder corresponds to a first latent space, and the second variational autoencoder corresponds to a second latent space. When training the second variational autoencoder, the first variational autoencoder that has completed the training is used. According to the constraints of the cycle consistency loss model, the noisy lung sound data is processed using the second variational autoencoder to be trained to obtain the second eigenvector data. Assume that there is a shared space, a clean lung sound data is given to the first variational autoencoder, and then a noisy lung sound data is given to the second variational autoencoder. The first eigenvector data located in the first latent space and the second eigenvector data located in the second latent space are constructed to construct a mapping relationship in the shared space to satisfy the cycle consistency loss algorithm.

即构建第一特征向量数据和第二特征向量数据位于共享空间中的映射关系的过程是第二变分自动编码器的训练过程。通过使用待训练的第二变分自动编码器和完成训练的第一变分自动编码器迭代对带噪肺音数据进行处理，对映射关系进行优化，且在解码器二的作用下每次迭代均生成重构带噪数据。That is, the process of constructing the mapping relationship between the first feature vector data and the second feature vector data in the shared space is the training process of the second variational autoencoder. The noisy lung sound data is iteratively processed using the second variational autoencoder to be trained and the first variational autoencoder that has completed training, the mapping relationship is optimized, and each iteration generates reconstructed noisy data under the action of decoder 2.

203、在所述重构带噪数据与所述带噪肺音数据之间满足预设的匹配条件时，判定所述肺音增强模型训练完成。203. When a preset matching condition is satisfied between the reconstructed noisy data and the noisy lung sound data, it is determined that the training of the lung sound enhancement model is completed.

在一实施例中，重构带噪数据是在带噪肺音数据作为输入，依次使用编码器二、解码器一、编码器一和解码器二处理得到。因此重构带噪数据与作为输入的对应的带噪肺音数据应相同或高度相似。因此，在一应用场景中，匹配条件为判断重构带噪数据与对应的带噪肺音数据间的相似度是否大于预设的阈值。若相似度大于阈值，则判定为满足匹配条件，进而判定肺音增强模型训练完成。In one embodiment, the reconstructed noisy data is obtained by using encoder 2, decoder 1, encoder 1, and decoder 2 in sequence with noisy lung sound data as input. Therefore, the reconstructed noisy data should be identical or highly similar to the corresponding noisy lung sound data as input. Therefore, in an application scenario, the matching condition is to determine whether the similarity between the reconstructed noisy data and the corresponding noisy lung sound data is greater than a preset threshold. If the similarity is greater than the threshold, it is determined that the matching condition is met, and then it is determined that the lung sound enhancement model training is completed.

通过使用带噪肺音数据以及循环一致性损失模型对第二变分自动编码器进行训练，减少了干净肺音数据的使用量，有助于降低肺音增强成本。By using noisy lung sound data and a cycle consistency loss model to train the second variational autoencoder, the amount of clean lung sound data used is reduced, which helps to reduce the cost of lung sound enhancement.

此外，在本申请的另一种实施方式中，训练第一变分自动编码器和第二变分自动编码器时，利用短时傅里叶变换(Short Time Fourier Transform，STFT)提取干净肺音数据和带噪肺音数据的幅度谱图，以幅度谱图作为第一变分自动编码器和第二变分自动编码器的输入。In addition, in another embodiment of the present application, when training the first variational autoencoder and the second variational autoencoder, the short time Fourier transform (STFT) is used to extract the amplitude spectra of clean lung sound data and noisy lung sound data, and the amplitude spectra are used as inputs of the first variational autoencoder and the second variational autoencoder.

使用幅度谱图对第一变分自动编码器和第二变分自动编码器进行训练，从而完成肺音增强模型的训练，便于提高肺音增强模型的质量和效率。The amplitude spectrogram is used to train the first variational autoencoder and the second variational autoencoder, thereby completing the training of the lung sound enhancement model, thereby improving the quality and efficiency of the lung sound enhancement model.

在本申请另一种实施方式中，如图2所示，步骤所述通过所述带噪肺音数据和完成训练的所述第一变分自动编码器对所述第二变分自动编码器进行训练，得到所述第二变分自动编码器对应的第二特征向量数据、所述第一特征向量数据与所述第二特征向量数据之间的映射关系和重构带噪数据包括：In another embodiment of the present application, as shown in FIG2, the step of training the second variational autoencoder by using the noisy lung sound data and the first variational autoencoder that has completed the training, obtains the second feature vector data corresponding to the second variational autoencoder, and the first feature vector data and the second feature vector data. The mapping relationship and reconstruction of noisy data include:

301、利用所述第二变分自动编码器的编码器二对所述带噪肺音数据进行编码处理，得到所述第二特征向量数据。301. Use encoder 2 of the second variational autoencoder to encode the noisy lung sound data to obtain the second feature vector data.

为了便于理解，如图3所示，将带噪肺音数据y_i输入编码器二，即可得到与带噪肺音数据对应的第二特征向量数据。For ease of understanding, as shown in FIG3 , the noisy lung sound data _{yi is} input into encoder 2 to obtain the second feature vector data corresponding to the noisy lung sound data.

302、根据所述循环一致性损失模型构建的所述第一特征向量数据与所述第二特征向量数据之间的映射关系，利用完成训练的所述第一变分自动编码器的解码器一对所述第二特征向量数据进行解码处理，得到净化肺音数据。302. According to the mapping relationship between the first feature vector data and the second feature vector data constructed by the cycle consistency loss model, the second feature vector data is decoded by using the decoder of the first variational autoencoder that has completed training to obtain purified lung sound data.

具体的，在一实施例中，由于设置了共享空间，因此在共享空间中，第一特征向量数据与第二特征向量数据间构建有映射关系。在得到第二特征向量数据后，基于映射关系，使第二特征向量数据等于或者趋近于第一特征向量数据。因此，将第二特征向量数据作为解码器一的输入后，解码器一输出净化肺音数据需要说明的是，理想状态下，利用解码器一对第二特征向量数据进行处理后得到的净化肺音数据与利用解码器一对对应的第一特征向量数据进行处理后得到的重构的干净肺音数据相同，从而利用映射关系完成带噪肺音数据向增强肺音数据的转换。基于此特点，在训练第二变分自动编码器的过程中，迭代优化映射关系，使得利用解码器一对第二特征向量数据进行处理后，能够得到趋近于干净肺音数据的净化肺音数据 Specifically, in one embodiment, since a shared space is set, a mapping relationship is established between the first feature vector data and the second feature vector data in the shared space. After obtaining the second feature vector data, based on the mapping relationship, the second feature vector data is made equal to or close to the first feature vector data. Therefore, after the second feature vector data is used as the input of decoder 1, decoder 1 outputs purified lung sound data. It should be noted that, ideally, the purified lung sound data obtained by processing the second feature vector data using the decoder is The same as the reconstructed clean lung sound data obtained by processing the corresponding first feature vector data using a decoder pair, thus completing the conversion of noisy lung sound data to enhanced lung sound data using the mapping relationship. Based on this feature, in the process of training the second variational autoencoder, the mapping relationship is iteratively optimized so that after processing the second feature vector data using a decoder pair, purified lung sound data close to clean lung sound data can be obtained.

303、利用训练完成的所述第一变分自动编码器的编码器一对所述净化肺音数据进行编码处理，得到所述第一特征向量数据。303. Encode the purified lung sound data using encoder 1 of the trained first variational autoencoder to obtain the first feature vector data.

将净化肺音数据作为编码器一的输入，编码器一对净化肺音数据进行处理后，得到第一特征向量数据。由于第一特征向量数据与第二特征向量数据之间存在映射关系，因此在理想状态下，第一特征向量数据等于第二特征向量数据。Purify lung sound data As the input of encoder one, encoder one purifies lung sound data After processing, the first feature vector data is obtained. Since there is a mapping relationship between the first feature vector data and the second feature vector data, in an ideal state, the first feature vector data is equal to the second feature vector data.

304、基于所述映射关系利用所述第二变分自动编码器的解码器二对所述第一特征向量数据进行解码处理，得到所述重构带噪数据。304. Based on the mapping relationship, use decoder 2 of the second variational autoencoder to decode the first feature vector data to obtain the reconstructed noisy data.

将第一特征向量数据作为解码器二的输入，得到所述重构带噪数据其中，由于在理想状态下，第一特征向量数据等于第二特征向量数据。使得重构带噪数据等于加码器二对第二特征向量数据进行处理后得到的重构的带噪肺音数据。基于此特点，通过比对重构带噪数据和带噪肺音数据间的差异即可判断出映射关系的优化进度。在一实施例中，在重构带噪数据与带噪肺音数据件的差异小于预设的阈值时，判定第二变分自动编码器训练完成，从而判定肺音增强模型训练完成。The first feature vector data is used as the input of the decoder 2 to obtain the reconstructed noisy data In the ideal state, the first eigenvector data is equal to the second eigenvector data. is equal to the reconstructed noisy lung sound data obtained after the second eigenvector data is processed by the encoder 2. Based on this feature, the optimization progress of the mapping relationship can be determined by comparing the difference between the reconstructed noisy data and the noisy lung sound data. In one embodiment, when the difference between the reconstructed noisy data and the noisy lung sound data is less than a preset threshold, it is determined that the training of the second variational autoencoder is completed, thereby determining that the training of the lung sound enhancement model is completed.

在一应用场景中，第二变分自动编码器训练过程中的中间表示为：

In one application scenario, the intermediate representation during the training of the second variational autoencoder is:

损失函数为：
The loss function is:

在完成训练的第一变分自动编码器以及循环一致性损失模型的约束下，完成对第二变分自动编码器的训练，从而完成对肺音增强模型的训练。无需使用较多的干净肺音，有助于降低肺音增强成本。Under the constraints of the trained first variational autoencoder and the cycle consistency loss model, the second variational autoencoder is trained, thereby completing the training of the lung sound enhancement model. It is not necessary to use more clean lung sounds, which helps to reduce the cost of lung sound enhancement.

在本申请另一种实施方式中，在步骤得到增强肺音数据之后，所述方法还包括：In another embodiment of the present application, after the step of obtaining enhanced lung sound data, the method further includes:

利用预设的生成对抗网络模型对所述增强肺音数据进行处理，得到判别值数据。The enhanced lung sound data is processed using a preset generative adversarial network model to obtain discriminant value data.

所述生成对抗网络模型包括生成器和判别器；所述第一变分自动编码器和所述第二变分自动编码器均对应有所述生成对抗网络模型；与所述第一变分自动编码器对应的所述生成对抗网络模型将所述解码器一判定为所述生成器；与所述第二变分自动编码器对应的所述生成对抗网络模型将所述解码器二判定为所述生成器；在所述第一变分自动编码器的损失函数收敛后对对应的所述生成对抗网络模型进行训练；在所述第二变分自动编码器的损失函数收敛后对对应的所述生成对抗网络模型进行训练。The generative adversarial network model includes a generator and a discriminator; the first variational autoencoder and the second variational autoencoder both correspond to the generative adversarial network model; the generative adversarial network model corresponding to the first variational autoencoder determines the decoder 1 as the generator; the generative adversarial network model corresponding to the second variational autoencoder The generative adversarial network model determines the decoder 2 as the generator; trains the corresponding generative adversarial network model after the loss function of the first variational autoencoder converges; and trains the corresponding generative adversarial network model after the loss function of the second variational autoencoder converges.

在一实施例中，生成对抗网络模型(GAN)是一个生成模型，判别器应该尽可能地输出一个接近于1的值，而对于假样本，判别器应该尽可能地输出一个接近于0的值。在前面，我们训练了一对VAE，即第一变分自动编码器和第二变分自动编码器，用于实现干净肺音和带噪肺音之间的转换。由于我们只是简单地使用均方误差来计算VAE对于样本的重构损失，因此，VAE生成的星等谱图总是不如GAN生成的那么清晰。如图4所示，在两个VAE上都添加了判别器Discriminator。将两个判别器与两个VAE相结合，在这里可以将VAE的解码器视为GAN的生成器，与判别器相结合进行对抗学习。判别器将会同时输入当前域的肺音频谱图和由VAE生成的从另外一个域切换过来的肺音频谱图并输出其属于当前域的概率。若判断这个频谱图属于当前域那么判别器输出的值将会接近于1，若判断这个频谱图是从另外一个域切换过来的那么判别器输出的值将会接近于0。VAE为了能够欺骗当前的判别器，让判别器将由VAE生成的肺音频谱图分类为当前域原始就存在的肺音频谱图需要生成更加真实的数据。由于训练GAN的过程不稳定，为了加快训练速度，获得更好的效果，在两个VAE具有一定的生成能力后再添加判别器对其进行微调。其中，损失函数收敛后视为判断模型具有生成能力。In one embodiment, the generative adversarial network model (GAN) is a generative model, and the discriminator should output a value as close to 1 as possible, while for fake samples, the discriminator should output a value as close to 0 as possible. Previously, we trained a pair of VAEs, namely the first variational autoencoder and the second variational autoencoder, to achieve the conversion between clean lung sounds and noisy lung sounds. Since we simply use the mean square error to calculate the reconstruction loss of the VAE for the sample, the magnitude spectrogram generated by the VAE is always not as clear as that generated by the GAN. As shown in Figure 4, a discriminator is added to both VAEs. The two discriminators are combined with the two VAEs. Here, the decoder of the VAE can be regarded as the generator of the GAN, and combined with the discriminator for adversarial learning. The discriminator will simultaneously input the lung audio spectrogram of the current domain and the lung audio spectrogram generated by the VAE switched from another domain and output the probability that it belongs to the current domain. If the spectrum graph is judged to belong to the current domain, the value output by the discriminator will be close to 1. If the spectrum graph is judged to be switched from another domain, the value output by the discriminator will be close to 0. In order to deceive the current discriminator and let the discriminator classify the lung audio spectrogram generated by VAE as the original lung audio spectrogram in the current domain, VAE needs to generate more realistic data. Since the process of training GAN is unstable, in order to speed up the training and obtain better results, the discriminator is added to fine-tune it after the two VAEs have a certain generation ability. Among them, after the loss function converges, it is considered that the judgment model has the generation ability.

通过VAE和GAN的结合，有助于判断得到的增强肺音数据的准确率，从而及时发现肺音增强模型的缺陷，提高肺音增强的质量。The combination of VAE and GAN helps to determine the accuracy of the enhanced lung sound data, thereby promptly discovering the defects of the lung sound enhancement model and improving the quality of lung sound enhancement.

在本申请另一种实施方式中，所述解码器一、解码器二和所述判别器中均包含有两层多头自注意力模型。In another embodiment of the present application, the decoder 1, the decoder 2 and the discriminator all include a two-layer multi-head self-attention model.

具体的，在一实施例中，根据Self-Attention GAN结构，在VAE的解码器和GAN的判别器中分别加入两层多头自注意力模型。Specifically, in one embodiment, according to the Self-Attention GAN structure, two layers of multi-head self-attention models are added to the decoder of VAE and the discriminator of GAN respectively.

通过增设多头自注意力模型，能够更好地捕捉时序特征之间的远程交互，多头自注意力模块可以帮助模型关注来自不同位置的不同表示子空间的信息，从而提高模型的表达能力模型。By adding a multi-head self-attention model, the long-range interactions between temporal features can be better captured. The multi-head self-attention module can help the model focus on information from different representation subspaces from different positions, thereby improving the expressiveness of the model.

在本申请另一种实施方式中，在步骤利用预设的生成对抗网络模型对所述增强肺音数据进行处理，得到判别值数据之后，所述方法还包括：In another embodiment of the present application, after the step of processing the enhanced lung sound data using a preset generative adversarial network model to obtain discriminant value data, the method further includes:

利用训练完成的肺音相位修正模型对所述增强肺音数据进行处理，得到相位修正肺音数据。The enhanced lung sound data are processed using the trained lung sound phase correction model to obtain phase-corrected lung sound data.

肺音相位修正模型利用包含有白噪声的所述干净肺音数据训练。The lung sound phase correction model is trained using the clean lung sound data containing white noise.

在一实施例中，在肺音增强模型之后增加一个Unet结构的肺音相位修正模型，如图5所示，将干净肺音数据X与白噪声按不同的信噪比进行混合得到X^*，将X的频谱图与X^*的相位结合生成将作为肺音相位修正模型的训练数据，并以X中对应的肺音作为标签对肺音相位修正模型进行训练。其中，在函数收敛时判定训练完成。In one embodiment, a lung sound phase correction model with a Unet structure is added after the lung sound enhancement model. As shown in FIG5 , the clean lung sound data X is mixed with white noise at different signal-to-noise ratios to obtain X ^* , and the spectrum of X is combined with the phase of X ^* to generate Will As the training data of the lung sound phase correction model, the lung sound phase correction model is trained with the corresponding lung sounds in X as labels. The training is completed when the function converges.

在本申请另一种实施方式中，在步骤利用训练完成的肺音相位修正模型对所述增强肺音数据进行处理，得到相位修正肺音数据之后，所述方法还包括：In another embodiment of the present application, after the step of processing the enhanced lung sound data using the trained lung sound phase correction model to obtain phase-corrected lung sound data, the method further includes:

在一实施例中，信噪比预测模型为支持向量回归模型(SVR)，通过提取拥有不同信噪比的肺音的梅尔频率倒谱系数(MFCC)，以肺音的MFCC作为SVR模型的输入,以其原始信噪比为标签以自监督的方式对SVR回归模型进行训练。待SVR模型收敛后将肺音的MFCC输入到模型中以模型的预测值作为预测出的肺音信噪比。In one embodiment, the signal-to-noise ratio prediction model is a support vector regression model (SVR), which extracts the Mel-frequency cepstral coefficients (MFCC) of lung sounds with different signal-to-noise ratios, uses the MFCC of lung sounds as the input of the SVR model, and uses its original signal-to-noise ratio as a label to train the SVR regression model in a self-supervised manner. After the SVR model converges, the MFCC of the lung sounds is input into the model, and the predicted value of the model is used as the predicted signal-to-noise ratio of the lung sounds.

通过预测肺音信噪比，便于得知肺音增强的质量，从而在质量较差时，及时进行缺陷查询和补救，提高肺音的增强效果。 By predicting the lung sound signal-to-noise ratio, it is easy to know the quality of lung sound enhancement, so that when the quality is poor, defects can be checked and remedied in time to improve the enhancement effect of lung sounds.

在本申请的另一实施例中，还公开一种肺音增强系统，如图6所示，一种肺音增强系统，包括获取模块1，用于获取待增强肺音数据；In another embodiment of the present application, a lung sound enhancement system is also disclosed. As shown in FIG6 , a lung sound enhancement system includes an acquisition module 1 for acquiring lung sound data to be enhanced;

肺音增强模块2，用于利用训练完成的肺音增强模型对所述待增强肺音数据进行转换，得到增强肺音数据；Lung sound enhancement module 2, used to convert the lung sound data to be enhanced by using the trained lung sound enhancement model to obtain enhanced lung sound data;

第二编码单元，用于利用训练完成的所述第一变分自动编码器的编码器一对所述净化肺音数据进行编码处理，得到所述第一特征向量数据；a second encoding unit, configured to encode the purified lung sound data using encoder 1 of the first variational autoencoder that has been trained to obtain the first feature vector data;

这里需要指出的是：以上应用于肺音增强系统实施例项的描述，与上述方法描述是类似的，具有同方法实施例相同的有益效果。对于本申请肺音增强系统实施例中未披露的技术细节，本领域的技术人员请参照本申请方法实施例的描述而理解。It should be noted that the above description of the embodiment of the lung sound enhancement system is similar to the above method description. For technical details not disclosed in the embodiment of the lung sound enhancement system of the present application, those skilled in the art should refer to the description of the embodiment of the method of the present application for understanding.

需要说明的是，本申请实施例中，如果以软件功能模块的形式实现上述方法，并作为独立的产品销售或使用时，也可以存储在一个计算机可读存储介质中。基于这样的理解，本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储介质中，包括若干指令用以使得一台计算机设备(可以是个人计算机、服务器、或者网络设备等)执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括：U盘、移动硬盘、只读存储器(ROM，Read Only Memory)、磁碟或者光盘等各种可以存储程序代码的介质。这样，本申请实施例不限制于任何特定的硬件和软件结合。It should be noted that in the embodiments of the present application, if the above method is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiment of the present application can be essentially or partly reflected in the form of a software product that contributes to the prior art. The computer software product is stored in a storage medium and includes several instructions to enable a computer device (which can be a personal computer, server, or network device, etc.) to execute all or part of the methods described in each embodiment of the present application. The aforementioned storage medium includes: various media that can store program codes, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM, Read Only Memory), a magnetic disk or an optical disk. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.

相应地，本申请实施例还公开一种存储介质，存储有能够被处理器加载并执行上述方法的计算机程序。Correspondingly, an embodiment of the present application also discloses a storage medium storing a computer program that can be loaded by a processor and execute the above method.

本申请实施例还公开一种计算机设备，如图7所示，包括一个处理器100、至少一个通信总线200、用户接口300、至少一个外部通信接口400和存储器500。其中，通信总线200配置为实现这些组件之间的连接通信。其中，用户接口300可以包括显示屏，外部通信接口400可以包括标准的有线接口和无线接口。其中，存储器500中存储有肺音增强方法。其中，处理器100用于在执行存储器500中存储的肺音增强方法时采用上述方法。The embodiment of the present application also discloses a computer device, as shown in FIG7, including a processor 100, at least one communication bus 200, a user interface 300, at least one external communication interface 400 and a memory 500. The communication bus 200 is configured to realize connection and communication between these components. The user interface 300 may include a display screen, and the external communication interface 400 may include a standard wired interface and a wireless interface. The memory 500 stores a lung sound enhancement method. The processor 100 is used to adopt the above method when executing the lung sound enhancement method stored in the memory 500.

以上应用于计算机设备和存储介质实施例的描述，与上述方法实施例的描述是类似的，具有同方法实施例相似的有益效果。对于本申请计算机设备和存储介质实施例中未披露的技术细节，请参照本申请方法实施例的描述而理解。The above description of the computer device and storage medium embodiment is similar to the description of the above method embodiment, and has similar beneficial effects as the method embodiment. For technical details not disclosed in the computer device and storage medium embodiment of this application, please refer to the description of the method embodiment of this application for understanding.

应理解，说明书通篇中提到的“一个实施例”或“一实施例”意味着与实施例有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此，在整个说明书各处出现的“在一个实施例中”或“在一实施例中”未必一定指相同的实施例。此外，这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。应理解，在本申请的各种实施例中，上述各过程的序号的大小并不意味着执行顺序的先后，各过程的执行顺序应以其功能和内在逻辑确定，而不应对本申请实施例的实施过程构成任何限定。上述本申请实施例序号仅仅为了描述，不代表实施例的优劣。It should be understood that "one embodiment" or "an embodiment" mentioned throughout the specification means that specific features, structures or characteristics related to the embodiment are included in at least one embodiment of the present application. Therefore, "in one embodiment" or "in an embodiment" appearing throughout the specification does not necessarily refer to the same embodiment. In addition, these specific features, structures or characteristics can be combined in one or more embodiments in any suitable manner. It should be understood that in various embodiments of the present application, the size of the sequence number of the above-mentioned processes does not mean the order of execution, and the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application. The above-mentioned sequence numbers of the embodiments of the present application are only for description and do not represent the advantages and disadvantages of the embodiments.

需要说明的是，在本文中，术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含，从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素，而且还包括没有明确列出的其他要素，或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下，由语句“包括一个……”限定的要素，并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。It should be noted that, in this article, the terms "include", "comprises" or any other variations thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements, but also other elements not explicitly listed, or also includes elements inherent to such process, method, article or device. In the absence of further restrictions, an element defined by the sentence "comprises a ..." does not exclude the existence of other identical elements in the process, method, article or device including the element.

在本申请所提供的几个实施例中，应该理解到，所揭露的设备和方法，可以通过其它的方式实现。以上所描述的设备实施例仅仅是示意性的，例如，所述单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，如：多个单元或组件可以结合，或可以集成到另一个系统，或一些特征可以忽略，或不执行。另外，所显示或讨论的各组成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口，设备或单元的间接耦合或通信连接，可以是电性的、机械的或其它形式的。In the several embodiments provided in the present application, it should be understood that the disclosed devices and methods can be implemented in other ways. The device embodiments described above are only schematic. For example, the division of the units is only a logical function division. There may be other division methods in actual implementation, such as: multiple units or components can be combined, or can be integrated into another system, or some features can be ignored, or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed can be through some interfaces, and the indirect coupling or communication connection of the devices or units can be electrical, mechanical or other forms.

上述作为分离部件说明的单元可以是、或也可以不是物理上分开的，作为单元显示的部件可以是、或也可以不是物理单元；既可以位于一个地方，也可以分布到多个网络单元上；可以根据实际的需要选择其中的部分或全部单元来实现本实施例方案的目的。The units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; they may be located in one place or distributed on multiple network units; some or all of the units may be selected according to actual needs to achieve the purpose of the present embodiment.

另外，在本申请各实施例中的各功能单元可以全部集成在一个处理单元中，也可以是各单元分别单独作为一个单元，也可以两个或两个以上单元集成在一个单元中；上述集成的单元既可以采用硬件的形式实现，也可以采用硬件加软件功能单元的形式实现。In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately configured as a unit, or two or more units may be integrated into one unit; the above-mentioned integrated units may be implemented in the form of hardware or in the form of hardware plus software functional units.

本领域普通技术人员可以理解：实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成，前述的程序可以存储于计算机可读存储介质中，该程序在执行时，执行包括上述方法实施例的步骤；而前述的存储介质包括：移动存储设备、ROM、磁碟或者光盘等各种可以存储程序代码的介质。A person skilled in the art will understand that all or part of the steps of the above method embodiment can be implemented by a program The aforementioned program can be stored in a computer-readable storage medium. When the program is executed, it executes the steps of the aforementioned method embodiment; and the aforementioned storage medium includes: various media that can store program codes, such as mobile storage devices, ROM, magnetic disks or optical disks.

或者，本申请上述集成的单元如果以软件功能模块的形式实现并作为独立的产品销售或使用时，也可以存储在一个计算机可读取存储介质中。基于这样的理解，本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储介质中，包括若干指令用以使得一台设备执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括：移动存储设备、ROM、磁碟或者光盘等各种可以存储程序代码的介质。Alternatively, if the above-mentioned integrated unit of the present application is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiment of the present application can essentially or in other words, the part that contributes to the prior art can be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for a device to execute all or part of the methods described in each embodiment of the present application. The aforementioned storage medium includes: various media that can store program codes, such as mobile storage devices, ROMs, magnetic disks or optical disks.

以上所揭露的仅为本申请较佳实施例而已，当然不能以此来限定本申请之权利范围，因此依本申请权利要求所作的等同变化，仍属本申请所涵盖的范围。 The above disclosure is only the preferred embodiment of the present application, which certainly cannot be used to limit the scope of rights of the present application. Therefore, equivalent changes made according to the claims of the present application are still within the scope covered by the present application.

Claims

A lung sound enhancement method, comprising:

Acquire lung sound data to be enhanced;

Using the trained lung sound enhancement model to transform the lung sound data to be enhanced, to obtain enhanced lung sound data;

The lung sound enhancement model is constructed based on a first variational autoencoder, a second variational autoencoder, and a cycle consistency loss model, and is trained by clean lung sound data and noisy lung sound data.

The lung sound enhancement method according to claim 1, wherein, before converting the lung sound data to be enhanced using the trained lung sound enhancement model to obtain enhanced lung sound data, the method further comprises:

Training the first variational autoencoder using the clean lung sound data to obtain the trained first variational autoencoder and first feature vector data corresponding to the first variational autoencoder;

The second variational autoencoder is trained by using the noisy lung sound data and the first variational autoencoder that has completed training, to obtain second feature vector data corresponding to the second variational autoencoder, a mapping relationship between the first feature vector data and the second feature vector data, and reconstructed noisy data; the second feature vector data, the mapping relationship, and the reconstructed noisy data are obtained under the constraints of the cycle consistency loss model;

When a preset matching condition is satisfied between the reconstructed noisy data and the noisy lung sound data, it is determined that the training of the lung sound enhancement model is completed.

The lung sound enhancement method according to claim 2, wherein the step of training the second variational autoencoder using the noisy lung sound data and the first variational autoencoder that has completed training, obtaining the second feature vector data corresponding to the second variational autoencoder, the mapping relationship between the first feature vector data and the second feature vector data, and reconstructing the noisy data comprises:

Using encoder 2 of the second variational autoencoder to encode the noisy lung sound data to obtain the second feature vector data;

According to the mapping relationship between the first feature vector data and the second feature vector data constructed by the cycle consistency loss model, the second feature vector data is decoded by using the decoder 1 of the first variational autoencoder that has completed training to obtain purified lung sound data;

Encoding the purified lung sound data using encoder 1 of the first variational autoencoder that has been trained to obtain the first feature vector data;

Based on the mapping relationship, the first feature vector data is decoded using decoder 2 of the second variational autoencoder to obtain the reconstructed noisy data.

The lung sound enhancement method according to claim 3, wherein after obtaining the enhanced lung sound data, the method further comprises:

Processing the enhanced lung sound data using a preset generative adversarial network model to obtain discriminant value data;

The generative adversarial network model includes a generator and a discriminator; the first variational autoencoder and the second variational autoencoder both correspond to the generative adversarial network model; the generative adversarial network model corresponding to the first variational autoencoder determines the decoder 1 as the generator; the generative adversarial network model corresponding to the second variational autoencoder determines the decoder 2 as the generator; after the loss function of the first variational autoencoder converges, the corresponding generative adversarial network model is trained; after the loss function of the second variational autoencoder converges, the corresponding generative adversarial network model is trained.

The lung sound enhancement method as described in claim 4, wherein the decoder 1, the decoder 2 and the discriminator each include a two-layer multi-head self-attention model.

The lung sound enhancement method according to claim 4, wherein, after the enhanced lung sound data is processed using a preset generative adversarial network model to obtain discriminant value data, the method further comprises:

Processing the enhanced lung sound data using the trained lung sound phase correction model to obtain phase-corrected lung sound data;

The lung sound phase correction model is trained using the clean lung sound data containing white noise.

The lung sound enhancement method according to claim 6, wherein, after the enhanced lung sound data is processed using the trained lung sound phase correction model to obtain the phase-corrected lung sound data, the method further comprises:

The trained signal-to-noise ratio prediction model is used to process the lung sound data to be enhanced and the enhanced lung sound data to obtain corresponding signal-to-noise ratio prediction data.

A lung sound enhancement system, comprising an acquisition module for acquiring lung sound data to be enhanced;

A lung sound enhancement module, used to convert the lung sound data to be enhanced by using the trained lung sound enhancement model to obtain enhanced lung sound data;

A computer device comprises a memory and a processor, wherein the memory stores a lung sound enhancement method, and the processor is used to adopt the method described in any one of claims 1 to 7 when executing the lung sound enhancement method.

A storage medium storing a computer program that can be loaded by a processor and execute the method according to any one of claims 1 to 7.