

Thyroid tumor classification and recurrence risk prediction method based on deep learning

Info

Publication number
CN120600315A
Authority
CN
China
Prior art keywords
feature
data
dynamic
ceus
fused
Prior art date
Legal status
Pending
Application number
CN202510766187.4A
Other languages
Chinese (zh)
Inventor
冯嘉伟
Current Assignee
First Peoples Hospital of Changzhou
Original Assignee
First Peoples Hospital of Changzhou
Priority date
Filing date
Publication date
Application filed by First Peoples Hospital of Changzhou
Priority to CN202510766187.4A
Publication of CN120600315A

Landscapes

  • Image Analysis (AREA)

Abstract


The present invention discloses a method for thyroid tumor classification and recurrence risk prediction based on deep learning, which belongs to the field of medical image processing and includes the following steps: collecting and preprocessing multimodal data related to thyroid follicular carcinoma (FTC) or follicular adenoma (FTA); extracting features from the preprocessed multimodal data to obtain image features and pathological features; using different fusion and optimization methods for the combination of image features and pathological features, as well as for the image features, to obtain a first optimized fusion feature and a second optimized fusion feature; inputting the first optimized fusion feature into a preoperative classification model to obtain a thyroid tumor type classification probability, and inputting the second optimized fusion feature and the pathological feature into a postoperative recurrence risk prediction model to obtain a recurrence risk score. The present invention improves the sensitivity and specificity of preoperative classification through specially constructed multimodal image classification models and recurrence risk prediction models, and provides accurate individualized management for postoperative patients.

Description

Thyroid tumor classification and recurrence risk prediction method based on deep learning
Technical Field
The invention relates to the field of medical image processing, in particular to a thyroid tumor classification and recurrence risk prediction method based on deep learning.
Background
Thyroid follicular carcinoma (FTC) is a highly invasive and clinically occult subtype of thyroid carcinoma. Conventional imaging methods such as ultrasound and enhanced CT have low specificity and cannot effectively capture dynamic micro-blood-flow characteristics, making early diagnosis of FTC difficult. Existing pathological diagnostic methods such as fine-needle aspiration biopsy (FNA) also lack sufficient accuracy, especially for distinguishing FTC from follicular adenoma (FTA). The TNM staging system describes the extent of the primary tumor (T), regional lymph nodes (N) and distant metastases (M) of a malignant tumor and is widely used for prognostic evaluation, but it depends largely on the anatomical features of the tumor. About 20% of FTC patients already have occult metastases at initial diagnosis, yet TNM staging relies on imaging detection and therefore lacks accurate prediction of recurrence risk.
Existing methods for thyroid tumor type classification and postoperative recurrence risk prediction mainly fall into the following categories. (1) Deep learning diagnostic models based on single-modality images: some studies have developed deep learning models based on ultrasound images that classify benign and malignant thyroid nodules with convolutional neural networks. However, such methods use only single-modality ultrasound images and cannot comprehensively analyze the multidimensional characteristics of tumors. Enhanced CT images have also been used to build risk prediction models, but diagnostic performance is limited because CT images cannot characterize the dynamic micro-blood-flow of the tumor. (2) Models with simple fusion of multimodal images: some studies have attempted to fuse contrast-enhanced ultrasound (CEUS) and CT image features for malignant tumors such as liver cancer, constructing multimodal models by simple feature concatenation. However, this fusion mode fails to fully exploit the complementarity of different modal features and cannot dynamically adjust the model's focus according to feature importance, so the improvement in diagnostic performance is limited. (3) Separate analysis of images and pathology data: most studies on FTC prognosis analyze only postoperative pathology data and fail to establish an effective link between preoperative images and postoperative pathology; this disconnect between preoperative imaging features and postoperative pathology features limits the depth and breadth of joint analysis. In addition, existing thyroid tumor classification and postoperative recurrence risk prediction methods lack interpretability support: deep learning models in medical image analysis suffer from the 'black box' problem, meaning the decision process of the model lacks transparency for doctors. Although these models can make accurate diagnoses or risk predictions after training on large amounts of data, it is often difficult for a physician to understand why a model makes a particular decision. Especially in complex clinical cases, doctors tend to rely on their own expertise rather than entirely on model predictions, so such models struggle to play a supportive role in clinical decision-making.
Disclosure of Invention
In the prior art, deep-learning thyroid tumor classification models either use only single-modality ultrasound images and cannot comprehensively analyze the multidimensional features of tumors, or build multimodal models by simple feature concatenation and cannot fully exploit the complementarity of different modal features, so the improvement in diagnostic performance is limited. Existing postoperative recurrence risk prediction models analyze only postoperative pathology data and cannot establish an effective link between preoperative images and postoperative pathology data.
In one aspect of the invention, a deep learning-based thyroid tumor classification and recurrence risk prediction method is provided, comprising the following steps: S1, collecting and preprocessing multimodal data related to thyroid follicular carcinoma (FTC) or follicular adenoma (FTA), the multimodal data comprising dynamic micro-blood-flow data, anatomical data and whole-slide pathological image (WSI) data; S2, performing feature extraction on the preprocessed dynamic micro-blood-flow data, anatomical data and WSI data with different feature extraction methods to obtain a dynamic micro-blood-flow feature F_CEUS, an anatomical feature F_CT and a pathological feature F_path; S3, fusing and optimizing the dynamic micro-blood-flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path to obtain a first optimized fusion feature F_fused_optimized, and fusing and optimizing the dynamic micro-blood-flow feature F_CEUS and the anatomical feature F_CT to obtain a second optimized fusion feature F_temporal; S4, inputting the first optimized fusion feature F_fused_optimized into a preoperative classification model to obtain a thyroid tumor type classification probability, and inputting the second optimized fusion feature F_temporal and the pathological feature F_path into a postoperative recurrence risk prediction model to obtain a recurrence risk score.
More specifically, in this aspect, collecting the multimodal data in step S1 comprises: acquiring dynamic micro-blood-flow data by contrast-enhanced ultrasound (CEUS) with a time-series acquisition technique; acquiring anatomical data by enhanced CT with multi-phase scanning and an iterative reconstruction algorithm; and acquiring whole-slide pathological image (WSI) data with a digital scanner. Preprocessing the multimodal data in step S1 comprises: applying adaptive median filtering and wavelet transformation to the dynamic micro-blood-flow data to obtain first dynamic micro-blood-flow data; applying non-local means denoising to the anatomical data to obtain first anatomical data; applying a mutual-information maximization algorithm to the first dynamic micro-blood-flow data and the first anatomical data to obtain dynamic micro-blood-flow data and anatomical data with sub-pixel-level spatial alignment; and applying a pyramid blocking strategy to the WSI data, extracting pixel blocks of the WSI at different magnifications, and applying multi-scale feature fusion to the pixel blocks to obtain fused WSI data.
More specifically, in the above aspect, the feature extraction of the preprocessed multimodal data in step S2 comprises: performing spatial feature extraction and temporal feature modeling on the preprocessed dynamic micro-blood-flow data with a CNN-GRU hybrid architecture to obtain the dynamic micro-blood-flow feature F_CEUS associated with the dynamic micro-blood-flow data; performing multi-level feature extraction on the preprocessed anatomical data with a pre-trained ResNet and fusing the multi-level, multi-scale features with a pyramid pooling module (PPM) to obtain the anatomical feature F_CT associated with the anatomical data; and dividing the preprocessed whole-slide pathological image (WSI) into N tiles, processing each tile with an InceptionResNetV2 model to obtain a corresponding instance feature, calculating the attention weight of each tile with an attention weight network, and performing weighted aggregation of all instance features according to the calculated attention weights to obtain the pathological feature F_path associated with the WSI data.
More specifically, in the above aspect, in step S3 the dynamic micro-blood-flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path are fused and optimized to obtain the first optimized fusion feature F_fused_optimized, comprising: generating a first initial weight vector [α0, β0, γ0] for the dynamic micro-blood-flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path with a multi-head self-attention mechanism; adaptively adjusting the first initial weight vector [α0, β0, γ0] by combining feature quality assessment and task relevance analysis to obtain a first-adjusted weight vector [α1, β1, γ1]; dynamically adjusting the first-adjusted weight vector [α1, β1, γ1] with a task-aware dynamic weight adjustment mechanism to obtain a second-adjusted weight vector [α2, β2, γ2]; normalizing the second-adjusted weight vector [α2, β2, γ2] and applying task-sensitive adaptive adjustment to obtain a first weight vector [α, β, γ] with α+β+γ=1; and performing weighted fusion of the dynamic micro-blood-flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path based on the first weight vector [α, β, γ] to obtain a first fusion feature F_fused, as shown in the following formula:
F_fused = α·F_CEUS + β·F_CT + γ·F_path
The first fusion feature F_fused is then optimized by a Transformer encoder to obtain the first optimized fusion feature F_fused_optimized.
More specifically, in the above aspect, in step S3 the dynamic micro-blood-flow feature F_CEUS and the anatomical feature F_CT are fused and optimized to obtain the second optimized fusion feature F_temporal, comprising: computing a feature similarity matrix for the dynamic micro-blood-flow feature F_CEUS and the anatomical feature F_CT with a multi-head attention mechanism and generating a second initial weight vector [α0', β0']; adjusting the second initial weight vector [α0', β0'] with a task-aware dynamic weight adjustment mechanism to obtain an adjusted weight vector [α1', β1']; normalizing the adjusted weight vector [α1', β1'] to obtain a second weight vector [α', β'] with α'+β'=1; and performing weighted fusion of the dynamic micro-blood-flow feature F_CEUS and the anatomical feature F_CT based on the second weight vector [α', β'] to obtain a second fusion feature F_fused', as shown in the following formula:
F_fused' = α'·F_CEUS + β'·F_CT;
introducing a residual connection to obtain an enhanced feature f_enhanced:
F_enhanced = F_fused' + Dropout(FC(F_fused'))
wherein FC is a fully connected layer and Dropout is a random deactivation operation;
applying layer normalization to the enhanced feature F_enhanced to obtain a normalized enhanced feature F_normalized:
F_normalized = LayerNorm(F_enhanced)
inputting the normalized enhanced feature F_normalized into an attention-enhanced GRU network to obtain the second optimized fusion feature F_temporal:
F_temporal = GRU(F_normalized)。
More specifically, in the above aspect, the preoperative classification model in step S4 is constructed by: building an initial preoperative classification model with a dual-branch fully-connected network structure; collecting multimodal data related to thyroid follicular carcinoma (FTC) or follicular adenoma (FTA) as a dataset; dividing the dataset into a first training set, a first validation set and a first test set in a predetermined proportion with stratified sampling; training the initial preoperative classification model with the first training set while evaluating its performance with the first validation set during training; adjusting the parameters of the initial preoperative classification model with a multi-stage transfer learning strategy, a cross-entropy loss function with an L2 regularization term, a dynamic fine-tuning mechanism based on the characteristics of the CEUS and CT devices, and adaptive learning rate adjustment; obtaining the preoperative classification model after training; and evaluating the performance of the preoperative classification model with the first test set.
More specifically, in this aspect, the postoperative recurrence risk prediction model in step S4 is constructed by: building an initial postoperative recurrence risk prediction model with a multimodal contrastive learning framework; collecting multimodal data related to thyroid follicular carcinoma (FTC) and follicular adenoma (FTA) as a dataset; dividing the dataset into a second training set, a second validation set and a second test set in a predetermined proportion with stratified sampling; training the initial postoperative recurrence risk prediction model with the second training set while evaluating its performance with the second validation set; adjusting the parameters of the initial postoperative recurrence risk prediction model with domain-invariant feature learning based on adversarial training and adaptive early stopping based on validation-set performance; obtaining the postoperative recurrence risk prediction model after training; and evaluating the performance of the postoperative recurrence risk prediction model with the second test set.
In a further embodiment of the above embodiments, a dual-branch contrast learning network and risk scoring mechanism are further employed to enhance performance of the postoperative recurrence risk prediction model.
In further embodiments of the above aspects, heatmaps are generated with Grad-CAM optimization to improve the clarity and reliability of the visualization based on the thyroid tumor type classification probability and the recurrence risk score, and SHAP feature contribution analysis is employed to quantify the contribution of each feature to the prognosis result.
In another aspect of the invention, an electronic device is provided, comprising a processor and a memory, characterized in that a computer program is stored in the memory, which when executed by the processor causes the processor to perform the deep learning based thyroid tumor classification and recurrence risk prediction method of the above aspect of the invention.
The invention has the beneficial effects that:
1. The invention does not rely on image data of a single modality; it fuses three modalities, namely dynamic micro-blood-flow data (CEUS), anatomical data (CT) and whole-slide pathological images (WSI), thereby providing more comprehensive tumor characteristic information.
2. For data of different modalities, the invention adopts dedicated feature extraction methods: dynamic micro-blood-flow features of CEUS are extracted with a CNN-GRU hybrid architecture, multi-level multi-scale anatomical features of CT are extracted with a pre-trained ResNet and a pyramid pooling module (PPM), and pathological features of the WSI are extracted with InceptionResNetV2 and an attention weight network.
3. The invention provides a self-attention-based dynamic weighted fusion mechanism for fusing image data (CEUS and CT) with pathological data, adaptively adjusting the weight of each modal feature by combining feature quality assessment and task relevance analysis. A Transformer encoder then performs global optimization with a deep self-attention network, further improving the expressive power of the fused feature and solving the problem that traditional simple concatenation fusion cannot exploit modal complementarity.
4. For the fusion of image data (CEUS and CT), the invention adopts an attention mechanism based on a feature similarity matrix to generate the weights, combined with residual connection, layer normalization and an attention-enhanced GRU network, which strengthens spatio-temporal feature interaction, captures the synergy between CEUS temporal evolution and CT spatial features, and improves the efficiency of feature fusion between image modalities.
5. The invention builds separate models for preoperative classification and postoperative recurrence risk prediction. The preoperative classification model adopts a dual-branch fully-connected network structure combined with a multi-stage transfer learning strategy and a dynamic fine-tuning mechanism, which improves adaptability across device environments; model parameters are dynamically adjusted based on device characteristics (resolution, frame rate, etc.), and a cyclic-decay learning rate strategy keeps training from getting stuck in local optima, improving convergence speed and cross-device performance stability. The postoperative recurrence risk prediction model is based on a multimodal contrastive learning framework with a dual-branch contrastive learning network, which forces image features and pathological features of same-class samples (recurrent/non-recurrent) to be close in the feature space and those of different classes to be far apart, improving feature discriminability and modal semantic alignment and raising the recurrence-risk AUC by 7-13%; a continuous risk score is output through a weighted mixed loss (binary cross-entropy plus contrastive loss), and adversarial domain-invariant feature learning reduces the influence of device differences on model performance, raising the cross-device performance retention rate to 85-92%.
6. The invention adopts adaptive learning rate adjustment and an early-stopping strategy, effectively avoiding overfitting during model construction, reducing unnecessary training time, and improving the training efficiency and performance of the model.
7. The invention generates heatmaps through Grad-CAM optimization, superimposes time-dimension weights on the CEUS heatmap, and combines CT and pathological features for multimodal fusion visualization, which improves the transparency of the model's decision process and helps doctors understand the key diagnostic regions. SHAP feature contribution analysis quantifies the contribution of each modal feature (such as CEUS peak flow rate, CT boundary blurring and pathological mitotic count) to the prognosis result and generates a downloadable interpretability report, alleviating the 'black box' problem of deep learning models and enhancing clinicians' trust in model predictions.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention, are incorporated in and constitute a part of this specification. The drawings and their description are illustrative of the invention and are not to be construed as unduly limiting the invention.
In the drawings:
fig. 1 is a flowchart of a deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention.
Fig. 2 is a flow chart of data acquisition and preprocessing of the deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention.
Fig. 3 is a feature extraction block diagram of a deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention.
Fig. 4 is a schematic diagram of tri-modal feature fusion and optimization of a deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention.
Fig. 5 is a schematic diagram of bimodal feature fusion and optimization of a deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention.
Fig. 6 is a classification and prediction flow chart of a deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention.
Fig. 7 is a flow chart of an explanatory analysis of the deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention.
Detailed Description
Referring to fig. 1, fig. 1 is a flowchart of the deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention. The method comprises the following steps: S1, collecting and preprocessing multimodal data related to thyroid follicular carcinoma (FTC) or follicular adenoma (FTA), the multimodal data comprising dynamic micro-blood-flow data, anatomical data and whole-slide pathological image (WSI) data; S2, performing feature extraction on the preprocessed dynamic micro-blood-flow data, anatomical data and WSI data with different feature extraction methods to obtain the dynamic micro-blood-flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path; S3, generating a weight vector for the dynamic micro-blood-flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path with a multi-head self-attention mechanism, adaptively adjusting it with a mechanism combining feature quality assessment and task relevance analysis, performing weighted fusion, and optimizing the result with a Transformer encoder to obtain the first optimized fusion feature F_fused_optimized; generating a weight vector for the dynamic micro-blood-flow feature F_CEUS and the anatomical feature F_CT from their feature similarity matrix, performing weighted fusion, and applying residual connection, layer normalization and an attention-enhanced GRU network to obtain the second optimized fusion feature F_temporal; S4, inputting the first optimized fusion feature F_fused_optimized into the preoperative classification model to obtain the thyroid tumor type classification probability, and inputting the second optimized fusion feature F_temporal and the pathological feature F_path into the postoperative recurrence risk prediction model to obtain the recurrence risk score.
The preprocessing, feature extraction, fusion and optimization, model training and construction, and interpretability enhancement techniques in the deep learning-based thyroid tumor classification and recurrence risk prediction method are implemented in Python. In this specification, the dynamic micro-blood-flow feature F_CEUS may also be referred to as the CEUS feature, and the anatomical feature F_CT may also be referred to as the CT feature.
Referring to fig. 2, a flow chart of data acquisition and preprocessing according to the present invention is shown. It mainly comprises the following steps:
1. CEUS data acquisition: acquire CEUS time-series images and capture arterial-phase and venous-phase dynamic characteristics.
2. CT data acquisition: acquire three-phase enhanced CT images covering the arterial phase, venous phase and delayed phase.
3. Pathological slide data acquisition: acquire H&E-stained whole-slide pathology images and perform block segmentation.
4. Data cleaning: remove invalid blank regions in the images and fill in lost or incomplete data.
5. Format conversion: save the CEUS data in standard DICOM format and convert the pathological slides to JPEG format to facilitate deep learning processing.
6. Normalization: normalize brightness and contrast and unify resolution, for example 640×480 for CEUS, 256×256 for CT and 256×256 for pathological slides.
7. Spatial registration: spatially align the CT and CEUS images with Elastix and register the multimodal data to ensure consistency of spatial positions.
8. Preprocessing completion: output the normalized and registered multimodal data.
More specifically, in one embodiment of the present invention, regarding the data acquisition portion, for the multimode characteristics of the FTC, the following standardized acquisition procedure is proposed:
(1) Dynamic micro-blood-flow feature capture (CEUS): a time-series acquisition technique precisely divides the arterial phase (30-60 seconds) and the venous phase (60-120 seconds); a high-frequency ultrasound probe (frequency ≥ 12 MHz) captures the spatio-temporal dynamics of tumor micro-blood-flow perfusion, and the data are stored in DICOM format (resolution 640×480, frame rate 15 fps).
(2) High-resolution anatomical feature extraction (enhanced CT): multi-phase scanning (arterial, venous and delayed phases) with a slice thickness of 1.0-1.5 mm, combined with an iterative reconstruction algorithm to reduce noise and ensure fine depiction of the three-dimensional anatomical structure.
(3) Digital processing of pathology data: whole-slide pathology images (WSI) are digitized with a digital scanner (resolution 0.25 μm/pixel), processed in blocks (256×256 pixels) and combined with a tissue-region screening algorithm (Otsu thresholding) to exclude interference from blank regions (a tile-screening sketch is given after this list).
(4) Data storage and cross-platform compatibility: a standardized data warehouse is constructed, CEUS (DICOM), CT (DICOM) and pathology images (JPEG) are uniformly mapped to the same spatial coordinate system, and lossless transfer and cross-device analysis of multi-center data are supported.
Through the standardized acquisition flow, the standardized acquisition of CEUS, CT and pathological data can be ensured, the accurate acquisition of dynamic micro-blood flow characteristics, high-resolution anatomical characteristics and pathological characteristics is realized, and unified storage and cross-platform compatibility of multi-mode data are supported by constructing a standardized data warehouse, so that nondestructive transmission and analysis of multi-center data are realized.
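As a concrete illustration of the tissue-region screening mentioned in item (3) above, the following is a minimal Python sketch (an assumed helper, not the patented implementation) that applies Otsu thresholding to 256×256 grayscale tiles and discards tiles with too little tissue; the 20% tissue-fraction cutoff is an assumption.

import numpy as np
from skimage.filters import threshold_otsu

def keep_tissue_tiles(tiles, min_tissue_fraction=0.2):
    # tiles: array of shape (N, 256, 256), grayscale WSI patches
    kept = []
    for tile in tiles:
        if tile.max() == tile.min():
            continue                               # skip completely blank tiles
        thresh = threshold_otsu(tile)              # per-tile Otsu threshold
        tissue_fraction = (tile < thresh).mean()   # darker pixels treated as tissue
        if tissue_fraction > min_tissue_fraction:
            kept.append(tile)
    return np.stack(kept) if kept else np.empty((0, *tiles.shape[1:]))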
In one embodiment of the invention, regarding data preprocessing, the following operations are employed:
(1) Spatial registration and dynamic alignment: the Elastix registration framework (based on B-spline elastic transformation) achieves sub-pixel-level spatial alignment of the CEUS and enhanced CT images through a mutual-information maximization algorithm, eliminating artifacts caused by respiratory motion and device offset.
(2) Multi-scale optimization of pathology data: for the WSI data, a pyramid blocking strategy extracts 256×256 pixel blocks at 5×, 10× and 20× magnification, feeds them into an InceptionResNetV2 multi-scale feature fusion module, and outputs 512-dimensional feature vectors containing cross-scale information on cell morphology and tissue structure, realizing a cross-scale representation of cell morphology and tissue structure.
(3) Noise suppression and enhancement (a minimal sketch follows this subsection): adaptive median filtering (window size 3×3) combined with wavelet transformation (Daubechies basis functions) is applied to the CEUS data to effectively separate blood-flow signals from background noise, while non-local means denoising (NL-Means) preserves the anatomical detail of the CT data and suppresses metal artifacts.
Through the data preprocessing operation, sub-pixel level space alignment of CEUS and CT images is realized, artifacts caused by respiratory motion and equipment offset are eliminated, multi-scale optimization is carried out on pathological data, cross-scale representation of cell morphology and tissue structure is extracted, CEUS blood flow signals and background noise are effectively separated through noise suppression and enhancement processing, and CT anatomical details are reserved.
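The noise-suppression step in (3) can be sketched in Python as follows; this is a minimal illustration using SciPy/scikit-image helpers (a plain 3×3 median filter stands in for the adaptive median filter, and 'db4' is an assumed Daubechies basis), not the exact patented pipeline.

import numpy as np
from scipy.ndimage import median_filter
from skimage.restoration import denoise_wavelet, denoise_nl_means, estimate_sigma

def denoise_ceus_frame(frame):
    # 3x3 median filtering followed by wavelet denoising with a Daubechies basis
    filtered = median_filter(frame, size=3)
    return denoise_wavelet(filtered, wavelet='db4', rescale_sigma=True)

def denoise_ct_slice(ct):
    # non-local means denoising: keeps anatomical detail while suppressing noise
    sigma = float(np.mean(estimate_sigma(ct)))
    return denoise_nl_means(ct, h=0.8 * sigma, sigma=sigma, fast_mode=True)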
Referring to fig. 3, a feature extraction block diagram of the deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention is shown. It mainly comprises the following parts:
1. CEUS dynamic micro-blood-flow feature extraction: spatial features of single-frame CEUS images are extracted with convolutional neural networks (CNNs), and time-series features are captured with a gated recurrent unit (GRU) to generate a micro-blood-flow feature vector (e.g., 128-dimensional).
2. CT static anatomical feature extraction: ResNet extracts anatomical features such as tumor boundary blurring and capsule invasion, and a pyramid pooling module (PPM) enhances the combination of global and local features to generate a CT feature vector (e.g., 256-dimensional).
3. Pathological feature extraction: the whole-slide pathology image is divided into small tiles and cleaned; cell morphology is extracted with InceptionResNetV2; based on a multiple-instance learning (MIL) framework, the tile features are weighted and aggregated to generate a pathology feature vector (e.g., 512-dimensional).
More specifically, the extraction of dynamic micro-blood flow features, anatomical features and pathological features according to the invention is achieved by:
1. Extraction of dynamic micro-blood flow characteristics F_CEUS of CEUS
Adopting a CNN-GRU hybrid architecture design, comprising:
Spatial feature extraction: a 3-layer convolutional network (Conv 3×3, ReLU, BatchNorm) outputs a 128-dimensional feature vector for each CEUS frame;
Time-series feature modeling: a bidirectional GRU (hidden layer of 64 units) captures the hemodynamic changes from the arterial phase to the venous phase and outputs the dynamic feature (e.g., 128-dimensional).
Extracting mathematical expression of the CNN spatial features:
F_spatial^i = CNN(I_i)
wherein I_i represents the i-th CEUS frame and F_spatial^i represents the extracted 128-dimensional spatial feature vector.
Modeling GRU time sequence characteristics:
h_t = GRU(F_spatial^t, h_{t-1})
r_t = σ(W_r·[h_{t-1}, F_spatial^t])
z_t = σ(W_z·[h_{t-1}, F_spatial^t])
ñ_t = tanh(W·[r_t * h_{t-1}, F_spatial^t])
h_t = (1 - z_t) * h_{t-1} + z_t * ñ_t
wherein h_t is the hidden state at time step t, r_t is the reset gate, z_t is the update gate, σ is the sigmoid activation function, and ñ_t is the candidate hidden state of the GRU, representing the new candidate state computed at time step t from the current input and the reset-gated history; W is the weight matrix used to compute the candidate hidden state, mapping the reset-gated history state and the current input to the candidate-state space. The dynamic feature is finally obtained as F_CEUS = h_T, where T is the last time step of the sequence.
In summary, through the CNN-GRU hybrid architecture, the hemodynamic changes from arterial phase to venous phase are captured, and 128-dimensional dynamic micro-blood flow characteristics are output.
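A minimal PyTorch sketch of the CNN-GRU hybrid described above is given below; the exact layer widths are assumptions, but the structure follows the text: a 3-layer convolutional network per frame followed by a bidirectional GRU with 64 hidden units, whose last time step gives a 128-dimensional F_CEUS.

import torch
import torch.nn as nn

class CEUSFeatureExtractor(nn.Module):
    """Sketch: per-frame 3-layer CNN + bidirectional GRU over the CEUS time series."""
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim))
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)

    def forward(self, frames):                      # frames: (B, T, 1, H, W)
        b, t = frames.shape[:2]
        spatial = self.cnn(frames.flatten(0, 1))    # F_spatial^i for every frame: (B*T, 128)
        spatial = spatial.view(b, t, -1)
        out, _ = self.gru(spatial)                  # (B, T, 2*hidden)
        return out[:, -1]                           # F_CEUS = h_T, 128-dimensional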
2. Enhanced CT anatomical feature F_CT extraction
Using ResNet-PPM joint optimization, multi-level features (Conv1 to Conv5) are extracted with a pre-trained ResNet50, 4 scale levels (1×1, 2×2, 3×3 and 6×6 pooling) are fused by a pyramid pooling module (PPM), and a 256-dimensional global-local joint feature is output. Because the enhanced-CT data scale is small, directly extracting features with a randomly initialized ResNet would cause parameter redundancy, low feature-learning efficiency and difficult optimization, performing poorly in small-data scenarios; the invention therefore uses a pre-trained ResNet and, through transfer learning, migrates the general feature representations learned on large-scale data to the small dataset, significantly reducing the model's dependence on the target data.
ResNet50 feature extraction:
F_conv_i = ResNet50_layer_i(I_CT), i ∈ {1,2,3,4,5}
wherein I_CT represents the CT input image and F_conv_i represents the output feature map of the i-th convolution stage of ResNet50.
Pyramid Pooling Module (PPM):
F_pool_j = Pool_j(F_conv_5), j ∈ {1×1, 2×2, 3×3, 6×6}
F_ppm_j = Conv(Upsample(F_pool_j))
F_CT = Concat(F_ppm_1, F_ppm_2, F_ppm_3, F_ppm_4)
wherein F_pool_j is the multi-scale feature map obtained by pooling the layer-5 feature map of ResNet50 at different scales (1×1, 2×2, 3×3 and 6×6), F_ppm_j is the feature map obtained by upsampling and convolving the pooled feature F_pool_j back to the original size for subsequent concatenation, and F_CT is the 256-dimensional CT feature vector containing multi-scale feature information.
In summary, based on ResNet-PPM joint optimization, multi-scale anatomical features are extracted, and 256-dimensional global-local joint features are output.
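The ResNet50-PPM extraction can be illustrated with the following PyTorch sketch; the 1×1 reduction convolutions and the final linear head are assumptions, and the pooled bins are flattened rather than upsampled back to the feature-map size, which is a simplification of the PPM formulas above.

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class CTFeatureExtractor(nn.Module):
    """Sketch: pre-trained ResNet50 backbone + simplified pyramid pooling -> 256-dim F_CT."""
    def __init__(self, out_dim=256):
        super().__init__()
        backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")
        self.stem = nn.Sequential(*list(backbone.children())[:-2])   # conv1..conv5: (B, 2048, h, w)
        self.bins = (1, 2, 3, 6)
        self.reduce = nn.ModuleList(nn.Conv2d(2048, 64, 1) for _ in self.bins)
        self.head = nn.Linear(64 * sum(b * b for b in self.bins), out_dim)

    def forward(self, ct):                                            # ct: (B, 3, H, W)
        fmap = self.stem(ct)
        pooled = [red(F.adaptive_avg_pool2d(fmap, b)).flatten(1)      # F_pool_j -> reduced, flattened
                  for b, red in zip(self.bins, self.reduce)]
        return self.head(torch.cat(pooled, dim=1))                    # F_CT: (B, 256)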
3. Extraction of pathological features F_path
Weakly supervised feature learning of pathology images (MIL framework) is adopted: through multiple-instance learning and an attention mechanism, the WSI is divided into N tiles (256×256), 512-dimensional features are extracted with InceptionResNetV2, and an instance-level feature pool is constructed. An attention weight network (fully connected layer + sigmoid) dynamically screens key tiles related to FTC (such as vascular invasion regions) and outputs the aggregated 512-dimensional pathological feature.
Example-level feature extraction:
f_i = InceptionResNetV2(p_i), i ∈ {1,2,...,N}
Where p_i represents the ith pathology tile and f_i represents the corresponding 512-dimensional instance feature.
Attention weight calculation:
w_i = sigmoid(W·f_i + b)
Where W and b are a learnable weight matrix and bias, w_i ε [0,1] represents the attention weight of the ith tile.
And (3) weighting and aggregating:
F_path = Σ(w_i·f_i) / Σ(w_i)
wherein F_path is the final 512-dimensional pathological feature, representing a global representation of the pathology image.
In summary, key pathological patches related to FTC are screened through a multi-instance learning and attention mechanism, and 512-dimensional pathological features are output.
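The attention-weighted aggregation of instance features can be sketched in PyTorch as follows; this is a minimal illustration of the formulas w_i = sigmoid(W·f_i + b) and F_path = Σ(w_i·f_i)/Σ(w_i) above, with the tile features assumed to be precomputed by InceptionResNetV2.

import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Sketch: attention-weighted pooling over WSI tile features (instance features f_i -> F_path)."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, tile_feats):                     # tile_feats: (N, 512) instance features
        w = torch.sigmoid(self.score(tile_feats))      # w_i = sigmoid(W·f_i + b), in [0, 1]
        f_path = (w * tile_feats).sum(0) / w.sum()     # weighted aggregation and normalization
        return f_path, w.squeeze(-1)                   # F_path: (512,), attention weights: (N,)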
Referring to fig. 4, fig. 4 is a schematic diagram of tri-modal feature fusion and optimization of a deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention. It mainly comprises the following parts:
1. Adaptive weighting mechanism: dynamically compute the weights of the dynamic micro-blood-flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path, and adjust the contribution of each modal feature according to task requirements.
2. Deep self-attention network: capture global dependencies among modalities through a multi-head (e.g., 8-head) attention mechanism and optimize the fused feature expression.
3. Inter-modal feature optimization: further improve the expressive power of the fused feature through feature interaction and redundancy elimination.
4. Fused feature output: output the high-dimensional optimized fusion feature for subsequent classification and prediction.
Specifically, the adaptive fusion and optimization of the trimodal features of the present invention includes the following operations:
1. self-attention based dynamic weighted fusion
1.1 A weight-allocation algorithm takes the CEUS (128-dimensional), CT (256-dimensional) and pathology (512-dimensional) features as input, computes the inter-modal correlation matrix through a multi-head self-attention mechanism (number of heads = 8), and generates the first initial weight vector [α0, β0, γ0].
First a query (Q), key (K) and value (V) matrix is calculated:
Q = W_Q·F_concat
K = W_K·F_concat
V = W_V·F_concat
wherein F_concat = [F_CEUS, F_CT, F_path] is the concatenated feature vector.
The attention score is then calculated:
Attention(Q, K, V) = softmax(QK^T/√d_k)V
where d_k is the dimension of the key vector in the attention mechanism for scaling the attention score to prevent the gradient from disappearing.
Multi-head attention splice (head number=8) was then performed:
MultiHead(F_concat) = Concat(head_1, head_2, ..., head_8)·W_O
head_i = Attention(Q_i, K_i, V_i)
wherein W_O is the output weight matrix of the multi-head attention, and is used for mapping the spliced multi-head attention result to the final output feature space.
A first initial weight vector is then generated from the multi-headed attention result:
[α0, β0, γ0] = softmax(W_modal·MultiHead(F_concat))
wherein W_modal is the modal weight generation matrix, which converts the multi-head attention output into the weight coefficients [α0, β0, γ0] of the three modalities.
1.2 Adaptive adjustment of the first initial weight vector [α0, β0, γ0] by combining feature quality assessment and task relevance analysis:
the first initial weight vector [α0, β0, γ0] is dynamically adjusted based on feature quality assessment and task relevance analysis:
1.2.1 Calculating a quality assessment score:
quality_CEUS = w_q·Quality(F_CEUS) + w_r·Relevance(F_CEUS,task)
quality_CT = w_q·Quality(F_CT) + w_r·Relevance(F_CT,task)
quality_path = w_q·Quality(F_path) + w_r·Relevance(F_path,task)
wherein quality_CEUS is the quality score of the CEUS feature, computed as a weighted combination of feature quality and task relevance; w_q is the quality weight coefficient and the Quality function evaluates the quality of the current feature; w_r is the relevance weight coefficient and the Relevance function evaluates how relevant the feature is to the current task; quality_CT is the quality score of the CT feature and quality_path is the quality score of the pathological feature.
1.2.2 Normalized mass fraction:
[q_α, q_β, q_γ] = softmax([quality_CEUS, quality_CT, quality_path])
where q_α, q_β, q_γ are the quality score of the normalized CEUS feature, the quality score of the normalized CT feature, and the quality score of the normalized pathology feature, respectively.
1.2.3 Adjustment is made based on the initial weights:
α1_temp = λ·α0 + (1-λ)·q_α
β1_temp = λ·β0 + (1-λ)·q_β
γ1_temp = λ·γ0 + (1-λ)·q_γ
where λ = 0.7 is the initial weight retention coefficient.
1.2.4 Renormalization:
[α1, β1, γ1] = softmax([α1_temp, β1_temp, γ1_temp])
wherein [α1, β1, γ1] is the first initial weight vector after the first adjustment.
2. Task perception weight optimization:
L_weight = L_task + λ·R(α1, β1, γ1)
Where l_task is task penalty (classification or regression) and R is a weight regularization term.
2.1 The second-adjusted weight vector [α2, β2, γ2] is obtained through gradient-based task-adaptive updating:
α2 = α1 - η_α·∂L_weight/∂α1
β2 = β1 - η_β·∂L_weight/∂β1
γ2 = γ1 - η_γ·∂L_weight/∂γ1
where η_α is the learning rate for the weight α1, controlling its update step size; η_β is the learning rate for the weight β1; and η_γ is the learning rate for the weight γ1.
Finally, the first weight vector [α, β, γ] = softmax([α2, β2, γ2]).
2.2 Task-sensitive adaptive adjustment:
when diagnostic tasks are prioritized, η_α is increased to raise the CEUS feature weight;
when prediction tasks are prioritized, η_γ is increased to raise the pathological feature weight;
validation-set performance is monitored in real time and the final weight vector is adjusted dynamically.
In practice, α is typically larger (biased towards CEUS, α ≈ 0.4) for diagnostic tasks, while γ is larger (biased towards pathological features, γ ≈ 0.5) for recurrence prediction tasks. The weights automatically adapt to the device characteristics and data quality of different institutions.
Finally, the dynamic micro-blood-flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path are fused with the first weight vector to obtain the first fusion feature F_fused, where F_fused = α·F_CEUS + β·F_CT + γ·F_path (α+β+γ=1).
In conclusion, through a multi-head self-attention mechanism, weights of CEUS, CT and pathological features are dynamically distributed, and accurate fusion of cross-modal features is achieved.
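A minimal PyTorch sketch of this dynamic weighted fusion is given below. Since F_CEUS, F_CT and F_path have different dimensions (128, 256, 512), the sketch first projects each modality to a shared 512-dimensional space; that projection, and treating the three modalities as a three-token sequence for the multi-head self-attention, are assumptions not spelled out in the text.

import torch
import torch.nn as nn

class TriModalWeightedFusion(nn.Module):
    """Sketch: derive [α0, β0, γ0] with multi-head self-attention over three modality tokens,
    then compute F_fused as the weighted sum of the (projected) modality features."""
    def __init__(self, dims=(128, 256, 512), d_model=512, heads=8):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(d, d_model) for d in dims)
        self.attn = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.w_modal = nn.Linear(d_model, 1)

    def forward(self, f_ceus, f_ct, f_path):                  # each: (B, dim_m)
        tokens = torch.stack([p(f) for p, f in zip(self.proj, (f_ceus, f_ct, f_path))], dim=1)
        attended, _ = self.attn(tokens, tokens, tokens)        # (B, 3, d_model)
        weights = torch.softmax(self.w_modal(attended).squeeze(-1), dim=-1)  # [α0, β0, γ0]
        f_fused = (weights.unsqueeze(-1) * tokens).sum(dim=1)  # weighted fusion of the modalities
        return f_fused, weights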
2. Global optimization of deep self-attention network
A Transformer-based feature interaction is used: the fused feature (512-dimensional) is fed into a 6-layer Transformer encoder (number of heads = 8, feed-forward layer dimension = 2048), capturing cross-modal global dependencies with positional encoding and layer normalization (LayerNorm);
The specific process is as follows:
X' = LayerNorm(X + MultiHeadAttention(X))
X'' = LayerNorm(X' + FFN(X'))
where MultiHeadAttention(X) = Concat(head_1, ..., head_8)·W_O
head_i = Attention(XW_i^Q, XW_i^K, XW_i^V)
Attention(Q,K,V) = softmax(QK^T/√d_k)V
wherein X' is the result of applying multi-head self-attention and a residual connection to X followed by layer normalization; X'' is the final output feature representation after the feed-forward network FFN, a second residual connection and layer normalization; W_O is the output projection matrix of the multi-head attention mechanism, mapping the concatenated multi-head attention result to the final output dimension; XW_i^Q is the query vector Q_i obtained by transforming the input feature X with the query weight matrix W_i^Q of the i-th head; XW_i^K is the key vector K_i obtained by transforming X with the key weight matrix W_i^K of the i-th head; and XW_i^V is the value vector V_i obtained by transforming X with the value weight matrix W_i^V of the i-th head.
Position coding:
PE(pos, 2i) = sin(pos/10000^(2i/d_model))
PE(pos, 2i+1) = cos(pos/10000^(2i/d_model))
Transformer encoder layer:
x' = LayerNorm(x + MultiHeadAttention(x, x, x))
output = LayerNorm(x' + FFN(x'))
Wherein FFN is feed forward network:
FFN(x) = max(0, xW_1 + b_1)W_2 + b_2
By stacking 6 layers of the Transformer encoder, the final feature is expressed as:
F_fused_optimized = Transformer_6(Transformer_5(...Transformer_1(F_fused)...))
In summary, cross-modal global dependencies are captured by the Transformer encoder, outputting a 512-dimensional optimized feature.
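The stacked encoder can be sketched with PyTorch's built-in Transformer modules as follows; treating the single fused vector as a one-token sequence is a simplifying assumption.

import torch.nn as nn

# 6-layer Transformer encoder, 8 heads, feed-forward dimension 2048, d_model = 512
encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, dim_feedforward=2048,
                                           batch_first=True)
transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)

# f_fused of shape (B, 1, 512) -> F_fused_optimized of shape (B, 1, 512):
# f_fused_optimized = transformer_encoder(f_fused)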
Referring to fig. 5, fig. 5 is a schematic diagram of bimodal feature fusion and optimization of the deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention. It is worth noting that in tri-modal fusion, because of the large semantic gap between modalities, a Transformer encoder with positional encoding is adopted to establish global context for deep cross-modal feature interaction, whereas bimodal fusion mainly handles the two relatively homogeneous image modalities, CEUS and CT, and pays more attention to the temporal evolution of CEUS, so residual connections and a GRU network are adopted to capture time dependence.
Specifically, the bimodal adaptive fusion and optimization of the present invention includes the following operations:
1. Attention-based weight distribution
The CEUS (128-dimensional) and CT (256-dimensional) features are input, the feature similarity matrix is computed, and the second initial weight vector [α0', β0'] is generated.
The specific process is as follows:
Cross-modality similarity score is calculated:
similarity_matrix = F_CEUS·F_CT^T/√d
cross_similarity = mean(softmax(similarity_matrix))
where F_CEUS·F_CT^T denotes the feature dot product, d is the feature dimension, and softmax normalizes the similarities into a probability distribution.
Modality autocorrelation calculation:
self_sim_CEUS = mean(F_CEUS·F_CEUS^T/√128)
self_sim_CT = mean(F_CT·F_CT^T/√256)
weight raw score calculation:
w_CEUS_raw = cross_similarity * self_sim_CEUS
w_CT_raw = (1 - cross_similarity) * self_sim_CT
Weight vector normalization:
[α0', β0'] = softmax([w_CEUS_raw, w_CT_raw])
2. task aware dynamic weight adjustment
Weight optimization is based on gradient descent and task loss:
L_weight' = L_task' + λ'·R' (α0', β0')
Where l_task 'is the primary task loss function, λ' is the regularization coefficient, and R '(α 0', β0') is the weight regularization term. And monitoring the performance of the second verification set in real time, and dynamically fine-tuning the weight parameters.
2.1 Gradient-based weight update
Gradient update formula:
α1' = α0' - η_α'·∂L_weight'/∂α0'
β1' = β0' - η_β'·∂L_weight'/∂β0'
[α', β'] = softmax([α1', β1'])
wherein:
η_α' and η_β' are the learning rate parameters of the CEUS and CT modalities, respectively; α1' and β1' are the temporary weights after the gradient update; and α', β' are the normalized weights, i.e. the second weight vector.
The invention adopts a task self-adaptive adjustment strategy in the fusion process of the dynamic micro blood flow characteristic F_CEUS and the anatomical characteristic F_CT, and the task self-adaptive adjustment strategy is as follows:
when diagnostic tasks are prioritized:
Increasing η_α' emphasizes the micro-blood-flow characteristics of CEUS; a typical weight distribution is α' ≈ 0.6, β' ≈ 0.4.
When other analysis tasks are prioritized:
Increasing η_β' emphasizes the anatomical detail features of CT; a typical weight distribution is α' ≈ 0.45, β' ≈ 0.55.
Validation-set performance can be monitored in real time to dynamically fine-tune the weight parameters; the learning rates η_α' and η_β' are adjusted automatically according to validation-set performance, ensuring optimal weight distribution of the model under different task scenarios.
And finally fusing output to obtain a second fusion characteristic:
F_fused' = α'·F_CEUS + β'·F_CT
Where α '+β' =1.
3. Feature optimization and interaction enhancement
3.1 Residual connection and feature enhancement
Introducing residual connection avoids gradient disappearance, and enhances information flow:
F_enhanced = F_fused' + Dropout(FC(F_fused'))
Where F_enhanced is the enhanced feature, F_fused' is the second fusion feature, FC is the fully connected layer, dropout is a random deactivation operation, preventing overfitting.
And then unifying different modal characteristic distributions by applying layer normalization (Layer Normalization) to obtain normalized enhancement characteristic F_normalized:
F_normalized = LayerNorm(F_enhanced)
3.2, space-time dependency modeling
The normalized enhancement feature f_normalized is input into the attention enhanced GRU network:
F_temporal = GRU(F_normalized)
where GRU is the gating loop and F_temporal is the feature after capture timing dependence, i.e., the second optimized fusion feature.
In summary, the bimodal optimization fusion method of the invention strengthens the space-time feature interaction through a self-attention mechanism (head number=4) by capturing the synergistic effect of the CEUS time sequence feature and the CT space feature.
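The bimodal path can be sketched in PyTorch as follows. Projecting the CT feature to the CEUS dimension and using a per-time-step sigmoid gate as the [α', β'] pair are assumptions introduced for a runnable illustration; the residual FC + Dropout, LayerNorm and GRU steps follow the formulas above.

import torch
import torch.nn as nn

class BiModalFusion(nn.Module):
    """Sketch: similarity-gated CEUS/CT fusion, residual enhancement, LayerNorm, GRU -> F_temporal."""
    def __init__(self, d_ceus=128, d_ct=256, hidden=128):
        super().__init__()
        self.ct_proj = nn.Linear(d_ct, d_ceus)       # assumed projection of F_CT to the CEUS dim
        self.fc = nn.Linear(d_ceus, d_ceus)
        self.drop = nn.Dropout(0.1)
        self.norm = nn.LayerNorm(d_ceus)
        self.gru = nn.GRU(d_ceus, hidden, batch_first=True)

    def forward(self, f_ceus_seq, f_ct):              # f_ceus_seq: (B, T, 128), f_ct: (B, 256)
        ct = self.ct_proj(f_ct).unsqueeze(1).expand_as(f_ceus_seq)
        alpha = torch.sigmoid((f_ceus_seq * ct).sum(-1, keepdim=True) / ct.shape[-1] ** 0.5)
        fused = alpha * f_ceus_seq + (1 - alpha) * ct  # F_fused' = α'·F_CEUS + β'·F_CT
        enhanced = fused + self.drop(self.fc(fused))   # residual connection with FC + Dropout
        normalized = self.norm(enhanced)               # layer normalization
        out, _ = self.gru(normalized)                  # attention-enhanced GRU simplified to a GRU
        return out[:, -1]                              # F_temporal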
Fig. 6 is a classification and prediction flow chart of a deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention. The thyroid tumor classification method is implemented by a preoperative classification model, and the recurrence risk prediction method is implemented by a postoperative recurrence risk prediction model. The construction of the pre-operative classification model and the post-operative recurrence risk prediction model is described below.
1. Construction of preoperative classification model
1.1 Data acquisition and Pre-processing
In one embodiment of the invention, multimodal data from a total of 1500 patients, covering thyroid follicular carcinoma (FTC) and follicular adenoma (FTA) cases, were collected from multiple tertiary hospitals using imaging equipment of different brands and models. The dataset was divided into training, validation and test sets at a ratio of about 70% (1050 cases), 15% (225 cases) and 15% (225 cases), with stratified sampling used to keep the proportions of FTC and FTA consistent in each subset. CEUS data, CT data and pathological data were then acquired and processed in a standardized way:
CEUS data: a time-series acquisition technique precisely divides the arterial phase (30-60 seconds) and the venous phase (60-120 seconds), and a high-frequency ultrasound probe captures the spatio-temporal dynamics of tumor micro-blood-flow perfusion;
CT data: multi-phase scanning with a slice thickness of 1.0-1.5 mm, combined with an iterative reconstruction algorithm to reduce noise;
Pathological data: the whole-slide pathology image is digitized with a digital scanner (resolution 0.25 μm/pixel), using block processing and a tissue-region screening algorithm.
1.2 Training of Pre-operative Classification models
The preoperative classification model is trained with a dual-branch fully-connected network structure: the input layer receives the 512-dimensional fused feature, a two-layer fully-connected network (FC-128) with ReLU activation performs feature dimension reduction, and the output layer outputs the FTC and FTA classification probabilities through a Softmax function. Through the dual-branch fully-connected network, accurate classification of tumor types is achieved and the classification probability is output.
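A minimal sketch of this classification head is shown below; the "dual-branch" wiring is not fully specified in the text, so a single shared trunk from the 512-dimensional fused feature is assumed here.

import torch.nn as nn

# 512-dim fused feature -> FC-128 with ReLU -> 2-way Softmax giving P(FTC) and P(FTA)
classifier = nn.Sequential(
    nn.Linear(512, 128),
    nn.ReLU(),
    nn.Linear(128, 2),
    nn.Softmax(dim=-1),
)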
The preoperative classification model is constructed with a multi-stage transfer learning strategy using a three-stage progressive training framework; the three stages are pre-training, feature migration and fine-tuning. The input is the source-domain data (1050 cases of multi-center training data), the output is the domain-adapted model parameters θ, and optimization targets the specific medical device environment.
The adjustment method of the model parameter theta is as follows:
First stage Pre-training
θ_pretrain = argmin_θ L_src(θ, D_src)
Wherein argmin_θ is an optimization operator that represents finding the optimal parameter θ that minimizes the objective function, and l_src is a source domain loss function that measures the prediction error of the model on the source domain dataset.
Second stage feature migration
θ_transfer = argmin_θ [L_src(θ, D_src) + λ_1·d_MMD(F_src, F_tgt)]
Where d_mmd is the maximum mean difference metric for reducing source domain and target domain feature distribution differences.
Third stage of fine tuning
θ_finetune = argmin_θ [L_tgt(θ, D_tgt) + λ_2·R(θ, θ_transfer)]
Where R is a regularization term, limiting parameter deviations from the pre-trained model are excessive.
The distribution difference of the source domain and the target domain is reduced through progressive migration, the adaptability of the model on new equipment is improved, and the accuracy rate on a verification set is improved by about 8-12%.
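A minimal sketch of the d_MMD alignment term used in the feature-migration stage is given below; the linear kernel (squared distance between batch means) is an assumption, since the text does not specify the kernel.

import torch

def mmd_linear(f_src, f_tgt):
    # Linear-kernel maximum mean discrepancy between source-domain and target-domain
    # feature batches of shape (n, d); used as the d_MMD term in the second stage.
    return ((f_src.mean(dim=0) - f_tgt.mean(dim=0)) ** 2).sum()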
The preoperative classification model is also constructed with a dynamic fine-tuning mechanism, an adaptive parameter adjustment algorithm based on device characteristics. The input is the CEUS and CT device characteristic parameters of the target device, including imaging resolution, frame rate and contrast, and the output is network weights optimized for the specific device characteristics.
The method for adjusting the network weight is as follows:
characterization of device characteristics:
v_device = Encoder(device_params)
And (3) generating condition parameters:
α_device = MLP(v_device)
Dynamic weight adjustment:
W_adjusted = W_base * (1 + α_device * Tanh(v_device))
Where w_base is the base weight and α_device is the device-specific adjustment factor.
By automatically adjusting model parameters according to equipment characteristics, the model can adapt to imaging characteristic differences of equipment of different manufacturers, and the cross-equipment performance attenuation is reduced by 20-35%.
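The device-aware adjustment can be sketched as follows; the encoder and MLP sizes are assumptions, and since v_device is a vector, the Tanh(v_device) term is reduced to a scalar by averaging, which is one possible reading of the formula above.

import torch
import torch.nn as nn

class DeviceAdaptiveWeight(nn.Module):
    """Sketch: encode device parameters into v_device, produce α_device, rescale base weights."""
    def __init__(self, n_params=3, d_embed=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_params, d_embed), nn.ReLU())
        self.mlp = nn.Sequential(nn.Linear(d_embed, 1), nn.Sigmoid())

    def forward(self, w_base, device_params):              # device_params: (B, 3)
        v_device = self.encoder(device_params)
        alpha_device = self.mlp(v_device)                   # α_device in (0, 1)
        scale = 1 + alpha_device * torch.tanh(v_device).mean(dim=-1, keepdim=True)
        return w_base * scale                               # W_adjusted = W_base·(1 + α·tanh(v_device))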
The preoperative classification model is also constructed with adaptive learning rate adjustment, a cyclic-decay learning rate strategy based on validation-set performance. The input is the performance indicators (accuracy, loss value) on the validation set, and the output is the dynamically adjusted learning rate.
The learning rate adjustment method is as follows:
basic cycle learning rate:
η_t = η_min + 0.5*(η_max - η_min)*(1 + cos(t/T_cycle * π))
wherein η_t is the learning rate at training step t, computed dynamically with the cosine annealing formula; η_min is the minimum learning rate, preventing training stagnation caused by an overly small learning rate; η_max is the maximum learning rate, which can be adjusted dynamically according to performance during training; and T_cycle is the number of training steps in a complete learning rate cycle, determining the period length of the cosine annealing.
And (3) performance monitoring:
If val_loss_t>val_loss_{t-1} * (1 - ε):
patience += 1
Else:
patience = 0
wherein val_loss_t is the loss value on the validation set at the t training step for monitoring model performance, patience is a patience counter, which records how many times the validation set performance is not improved continuously for triggering learning rate adjustment.
Learning rate adjustment:
If patience>patience_threshold:
η_max = η_max * 0.5
patience = 0
By automatically reducing the upper limit of the learning rate as training proceeds, the training avoids becoming trapped in local optima, model convergence is promoted, and the average convergence speed is improved by about 30%.
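The cyclic learning rate and the patience-triggered halving of η_max can be packaged in a small scheduler; the sketch below is a plain-Python illustration, with the default hyperparameters chosen arbitrarily.

import math

class CyclicCosineLR:
    # Cosine-annealed learning rate whose upper bound eta_max is halved whenever the
    # validation loss fails to improve for more than patience_threshold checks.
    def __init__(self, eta_min=1e-6, eta_max=1e-3, t_cycle=1000, eps=1e-3, patience_threshold=5):
        self.eta_min, self.eta_max, self.t_cycle = eta_min, eta_max, t_cycle
        self.eps, self.patience_threshold = eps, patience_threshold
        self.patience, self.prev_val_loss = 0, float("inf")

    def lr(self, t: int) -> float:
        # eta_t from the cosine annealing formula above.
        phase = (t % self.t_cycle) / self.t_cycle
        return self.eta_min + 0.5 * (self.eta_max - self.eta_min) * (1 + math.cos(phase * math.pi))

    def observe(self, val_loss: float) -> None:
        # Patience bookkeeping and halving of eta_max, mirroring the rules above.
        if val_loss > self.prev_val_loss * (1 - self.eps):
            self.patience += 1
        else:
            self.patience = 0
        if self.patience > self.patience_threshold:
            self.eta_max *= 0.5
            self.patience = 0
        self.prev_val_loss = val_loss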
The loss function of the preoperative classification model adopts a cross entropy loss function and an L2 regularization term:
L_diag = -Σ y_i log(P(y=i)) + λ||θ||^2
Where y_i is the true label and λ‖θ‖² is the L2 regularization term that prevents overfitting.
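Written out directly, the loss is a cross-entropy term plus an explicit L2 penalty over the trainable parameters; the sketch below assumes a PyTorch model and an arbitrary λ. In practice the same effect is usually obtained through the optimizer's weight_decay argument.

import torch.nn as nn
import torch.nn.functional as F

def classification_loss(logits, labels, model: nn.Module, lam: float = 1e-4):
    # L_diag = cross-entropy + lam * ||theta||^2 over all trainable parameters.
    ce = F.cross_entropy(logits, labels)  # -sum_i y_i log P(y = i)
    l2 = sum((p ** 2).sum() for p in model.parameters() if p.requires_grad)
    return ce + lam * l2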
2. Construction of postoperative recurrence risk prediction model
The postoperative recurrence risk prediction model is based on a multimodal contrastive learning framework: the second optimized fusion feature is passed through a gated recurrent unit (GRU) to extract sequential dependencies, the pathological feature is screened for high-risk regions by self-attention, and a contrastive loss (Contrastive Loss) is used to pull together samples of the same class (recurrence vs. non-recurrence), with a margin threshold δ=0.5.
The construction of the postoperative recurrence risk prediction model adopts a contrastive learning strategy and uses a dual-branch contrastive learning network combined with sample-pair similarity learning to learn discriminative features and align semantics across modalities. The inputs of the dual-branch contrastive learning network are the feature F_temporal extracted by the imaging branch and the feature F_path extracted by the pathology branch, and the output is a semantically aligned representation in the feature space. The process involves the following operations:
positive and negative sample pair construction:
y_ij = 1 if samples i and j are both recurrent or both non-recurrent
y_ij = 0 otherwise
Contrastive loss calculation:
L_contra = Σ_i Σ_j y_ij max(0, δ - cos(F_temporal^i, F_path^j)) + (1-y_ij) max(0, cos(F_temporal^i, F_path^j) - m)
where δ=0.5 is the similarity threshold for positive pairs, meaning the cosine similarity of same-class sample features should exceed this value; m=0.2 is the margin for negative pairs, meaning the cosine similarity of different-class sample features should stay below this value; cos(·,·) denotes cosine similarity; and y_ij=1 indicates that samples i and j are both recurrent or both non-recurrent.
By encouraging the model to learn discriminative features of recurrent and non-recurrent cases while maintaining semantic consistency between the imaging and pathology modalities, the recurrence risk AUC is improved by 7-13%.
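The pairwise loss above can be computed over a mini-batch as follows; the batch-wise construction of the positive/negative mask from the recurrence labels is an assumption about how y_ij would be formed in practice.

import torch
import torch.nn.functional as F

def contrastive_alignment_loss(f_temporal, f_path, recur_labels, delta=0.5, m=0.2):
    # Cosine-similarity contrastive loss between the imaging features F_temporal and
    # the pathology features F_path of a batch: same-outcome pairs are pushed above
    # similarity delta, different-outcome pairs below margin m.
    sim = F.cosine_similarity(f_temporal.unsqueeze(1), f_path.unsqueeze(0), dim=-1)  # (B, B)
    y = (recur_labels.unsqueeze(1) == recur_labels.unsqueeze(0)).float()             # y_ij
    pos_term = y * torch.clamp(delta - sim, min=0.0)
    neg_term = (1.0 - y) * torch.clamp(sim - m, min=0.0)
    return (pos_term + neg_term).sum()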
The construction of the postoperative recurrence risk prediction model adopts a risk scoring mechanism: a risk scoring regressor based on the optimized fusion features, whose inputs are the imaging feature F_temporal and the pathological feature F_path after contrastive learning alignment, and whose output is a continuous recurrence risk score (0-1). The process involves the following operations:
1. Risk score calculation
Risk_score = sigmoid(W_risk·[F_temporal, F_path] + b_risk)
Where W_risk is a trainable risk weight matrix that maps the second optimized fusion feature F_temporal and the pathological feature F_path into the risk space. Its dimension depends on the dimensions of the input features and of the risk score, typically (d_combined × 1), where d_combined is the dimension of the concatenated [F_temporal, F_path]. The weight matrix is learned automatically by back-propagation during training and can assign higher weights to features that are more important for recurrence prediction; the weight values reflect the contribution of the different features to recurrence risk prediction. b_risk is a risk bias term, a scalar bias parameter of the risk prediction model used to adjust the baseline level of the risk score; it ensures that the model output has an appropriate offset, is optimized together with W_risk during training, helps the model handle imbalanced feature distributions, and can be understood as the model's default risk prediction in the absence of any feature input. Together, W_risk and b_risk form the linear part of the risk score, whose output is then mapped between 0 and 1 by a sigmoid function, representing the probability of postoperative recurrence.
2. Weighted mixing loss
L_pred = BCE(Risk_score, y_recur) + α·L_contra
Where L_pred is a weighted sum of two parts, the binary cross-entropy loss (BCE) and the contrastive loss (L_contra). BCE(Risk_score, y_recur) denotes the binary cross-entropy loss between the recurrence risk prediction (Risk_score) and the true recurrence label (y_recur), used to directly optimize the accuracy of the risk prediction, where y_recur ∈ {0,1} indicates whether the patient relapsed (1 for relapse, 0 for no relapse). α is the weight coefficient of the contrastive loss, used to balance the direct prediction loss against the feature contrastive loss; in practice α is typically set between 0.3 and 0.5 and can be tuned on validation set performance. L_contra is the contrastive loss component described above, which pulls together the feature representations of same-class samples (both recurrent or both non-recurrent) and pushes apart those of different classes, thereby enhancing the model's ability to discriminate recurrence risk.
By means of the risk scoring mechanism, the model can provide finer-grained risk predictions rather than only a binary classification result, improving clinical decision support by approximately 25%.
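A compact sketch of the scoring head and the weighted mixed loss is given below (PyTorch); the feature dimensions and α=0.4 are placeholders.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RiskScorer(nn.Module):
    # Maps the concatenated [F_temporal, F_path] vector to a recurrence risk score in (0, 1).
    def __init__(self, d_temporal: int, d_path: int):
        super().__init__()
        self.linear = nn.Linear(d_temporal + d_path, 1)  # W_risk and b_risk

    def forward(self, f_temporal, f_path):
        z = torch.cat([f_temporal, f_path], dim=-1)
        return torch.sigmoid(self.linear(z)).squeeze(-1)  # Risk_score

def prediction_loss(risk_score, y_recur, l_contra, alpha: float = 0.4):
    # L_pred = BCE(Risk_score, y_recur) + alpha * L_contra.
    bce = F.binary_cross_entropy(risk_score, y_recur.float())
    return bce + alpha * l_contra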
The construction of the postoperative recurrence risk prediction model adopts cross-device adaptability optimization, a domain adaptation technique realized as domain-invariant feature learning based on adversarial training. The input is the feature representation of the source domain data (training device) and the output is a domain-invariant feature representation. The process involves the following operations:
1. domain classifier training
L_domain = -Σ[d_i log(D(f_i)) + (1-d_i)log(1-D(f_i))]
2. Feature extractor training (adversarial objective)
L_feature = L_task - λ_d·L_domain
Where D is a domain classifier, d_i is a domain label (0 represents a source domain, 1 represents a target domain), and f_i is a feature representation.
Through adversarial training, the model learns feature representations that are unaffected by device differences, so that on new equipment it can approach its performance on the original equipment without requiring large amounts of data, raising the cross-device performance retention rate to 85-92%.
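One standard way to realize the adversarial objective L_feature = L_task - λ_d·L_domain is the gradient-reversal trick, sketched below; this is an equivalent reformulation chosen for illustration, not necessarily the exact training loop of the invention.

import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; gradient multiplied by -lambda_d in the backward
    # pass, so minimizing the domain loss trains the classifier normally while pushing
    # the feature extractor toward domain-invariant representations.
    @staticmethod
    def forward(ctx, x, lambda_d):
        ctx.lambda_d = lambda_d
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambda_d * grad_output, None

def domain_adversarial_loss(domain_classifier, features, domain_labels, lambda_d=0.1):
    # Binary domain classification loss (0 = source device, 1 = target device)
    # computed on gradient-reversed features.
    reversed_feats = GradReverse.apply(features, lambda_d)
    logits = domain_classifier(reversed_feats).squeeze(-1)
    return F.binary_cross_entropy_with_logits(logits, domain_labels.float())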
The postoperative recurrence risk prediction model is also trained with an early-stopping strategy, an adaptive early-stopping mechanism based on validation set performance. The inputs are the risk prediction performance indicators on the validation set, namely the AUC (Area Under the Curve) and Recall@HighRisk (the recall of the high-risk class, i.e., the model's ability to correctly identify high-risk samples), and the outputs are the training stop signal and the optimal model checkpoint. The process involves the following operations:
1. performance monitoring
metric_t = 0.7 · AUC + 0.3 · Recall@HighRisk
If metric_t>best_metric:
best_metric = metric_t
best_epoch = t
save_model(θ_t)
patience = 0
Else:
patience += 1
Here metric_t is the composite performance indicator of the t-th training epoch, computed as a weighted combination of the AUC and the high-risk recall; best_metric is the best performance indicator recorded so far, used to judge whether the current model has reached a new optimum; best_epoch is the epoch number corresponding to the best performance indicator, recording the training time point of the optimal model; t is the current epoch number, indicating how many epochs the model has been trained; save_model(θ_t) saves the model parameters θ_t of the current epoch t as the optimal model checkpoint; and patience is a patience counter recording how many consecutive epochs the performance has failed to improve, used to trigger the early-stopping mechanism.
2. Early stop determination
If patience>patience_max or t - best_epoch>window_size:
stop_training()
restore_model(best_epoch)
By adopting the early-stopping strategy, overfitting is effectively avoided and unnecessary training time is reduced: average training time drops by 35% while test set performance is maintained or improved.
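The early-stopping bookkeeping can be collected in a small helper; the sketch below follows the rules above, with patience_max and window_size left as assumed hyperparameters.

import copy

class AdaptiveEarlyStopper:
    # Tracks metric_t = 0.7*AUC + 0.3*Recall@HighRisk, keeps the best model state,
    # and signals a stop when the patience or the epoch window is exhausted.
    def __init__(self, patience_max: int = 10, window_size: int = 30):
        self.patience_max, self.window_size = patience_max, window_size
        self.best_metric, self.best_epoch, self.patience = -float("inf"), 0, 0
        self.best_state = None

    def step(self, epoch, auc, recall_high_risk, model_state):
        metric = 0.7 * auc + 0.3 * recall_high_risk
        if metric > self.best_metric:
            self.best_metric, self.best_epoch, self.patience = metric, epoch, 0
            self.best_state = copy.deepcopy(model_state)  # save_model(theta_t)
        else:
            self.patience += 1
        stop = self.patience > self.patience_max or (epoch - self.best_epoch) > self.window_size
        return stop, self.best_state  # restore from best_state when stop is True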
The pre-operative classification model and the recurrence risk prediction model of the present invention perform on the test set (225 cases) as follows:
The accuracy of FTC/FTA classification reaches 91.3%, the sensitivity is 89.6%, and the specificity is 92.5%.
The recurrence risk prediction model has AUC of 0.873 and recurrence rate prediction accuracy of 83.7% in the high risk group.
Through detailed training strategy design and optimization, the invention can maintain stable performance under different medical institutions and equipment environments, and provides reliable support for clinical thyroid cancer diagnosis and risk assessment.
Fig. 7 is a flow chart of the interpretability analysis of the deep learning-based thyroid tumor classification and recurrence risk prediction method of the present invention. The flow chart includes the following parts:
1. Grad-CAM heat map generation:
Image feature input: the CEUS features and CT features used in the classification and prediction models are extracted.
Key region heat map generation: a heat map is generated based on the Grad-CAM technique to highlight the key regions the model attends to.
Dynamic abnormal region labeling: regions of abnormal blood perfusion are highlighted on the CEUS heat map.
Static abnormal region labeling: tumor regions with blurred boundaries or calcification are labeled on the CT heat map.
2. SHAP feature contribution analysis:
Fusion feature input: the input feature vectors of the classification and prediction models.
Feature importance is calculated by analyzing the contribution of input features to the classification result based on the SHAP technique.
2.1 Quantified feature contributions:
CEUS feature contributions, e.g., dynamic perfusion rate and time-series patterns.
CT feature contributions, e.g., boundary blurring and anatomical morphology.
Pathological feature contributions, e.g., cell morphology and histological structure.
3. Interpretability result display:
Grad-CAM heat map display: the key regions in the image are shown in the user interface.
Feature contribution ranking display: the contribution weights of the different modality features are quantified and visualized.
Interpretability report generation: the heat maps and the feature contribution ranking are integrated into a downloadable report.
Specifically, the Grad-CAM optimization and SHAP feature contribution analysis employed by the interpretability enhancement techniques of the present invention are implemented as follows.
1. Grad-CAM optimization:
The optimization strategy superimposes time-dimension weights on the CEUS heat map to accurately locate regions of abnormal arterial-phase perfusion, and combines CT images and pathological features (such as tumor boundary blurring and mitotic count) to realize multimodal data fusion, improving the clarity and reliability of the visualization results.
Grad-CAM is implemented as follows:
For the target class c, the feature map weights α_k^c are calculated:
α_k^c = (1/Z) Σ_i Σ_j ∂y^c/∂A_ij^k
where Z is a normalization factor equal to the total number of pixels in the feature map, used to normalize the weights; Σ_i Σ_j is a double summation over all spatial positions (i, j) of the feature map; and ∂y^c/∂A_ij^k is the partial derivative of the predicted score for the target class c with respect to the activation of the k-th feature map at position (i, j), indicating the influence of that position on the classification result.
Time dimension weight integration (for CEUS sequence):
α_k^c(t) = α_k^c·w_t
where w_t is the weight coefficient of time step t, associated with the perfusion phase.
Grad-CAM heat map calculation:
L_Grad-CAM^c = ReLU(Σ_k α_k^c A^k)
The weights α_k^c of all channels and the corresponding feature maps A^k are weighted and summed, and the Grad-CAM heat map is then obtained through a ReLU activation function, highlighting the feature regions important to the target class c.
CEUS-specific heat map:
L_Grad-CAM^c(CEUS) = ReLU(Σ_k Σ_t α_k^c(t) A^k(t))
On the basis of standard Grad-CAM, the weights α_k^c(t) and the temporal variation of the feature maps A^k(t) at different time points t are further taken into account, yielding a heat map that better matches the characteristics of the CEUS sequence and is used to analyze the distribution of feature importance for the target class over the time series.
The Grad-CAM optimization adopted by the invention can accurately locate regions of abnormal CEUS arterial-phase perfusion, and fused visualization of the multimodal data is realized by combining CT images and pathological features.
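A sketch of the CEUS-specific heat map computation is given below, assuming the per-frame activations and gradients of a chosen convolutional layer have already been captured (e.g., with forward/backward hooks); the tensor layout (T, K, H, W) and the final normalization step are assumptions.

import torch
import torch.nn.functional as F

def ceus_grad_cam(activations: torch.Tensor, gradients: torch.Tensor,
                  time_weights: torch.Tensor) -> torch.Tensor:
    # activations, gradients: (T, K, H, W) per-frame feature maps A^k(t) and their
    # gradients with respect to the target class score; time_weights: (T,) w_t values
    # encoding the perfusion-phase weighting.
    alpha = gradients.mean(dim=(2, 3))                     # alpha_k^c per frame, shape (T, K)
    alpha = alpha * time_weights.view(-1, 1)               # alpha_k^c(t) = alpha_k^c * w_t
    cam = torch.einsum("tk,tkhw->hw", alpha, activations)  # weighted sum over channels and time
    cam = F.relu(cam)                                      # keep only positively contributing regions
    return cam / (cam.max() + 1e-8)                        # normalized heat map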
2. SHAP feature contribution analysis:
Through SHAP feature contribution analysis, the model can quantify the contribution of each feature to the prognostic result, helping to reveal the role of different factors in the prediction. The SHAP value assigns an importance score to each feature; by measuring the influence of each feature on the predicted outcome, a clinician can intuitively understand how factors such as the CEUS peak flow rate, boundary blurring in CT images, and the pathological mitotic count influence the prognosis prediction.
The specific implementation process of SHAP feature contribution analysis is as follows:
SHAP value calculation for feature i:
φ_i = Σ_{S⊆N\{i}} |S|!(|N|-|S|-1)!/|N|! [f_x(S∪{i}) - f_x(S)]
where N is the set of all features, S is the subset that does not contain feature i, and f_x (S) is the prediction of sample x when the model uses feature set S.
Inter-modality feature interaction SHAP values:
φ_ij = Σ_{S⊆N\{i,j}} |S|!(|N|-|S|-2)!/2|N|! [f_x(S∪{i,j}) - f_x(S∪{i}) - f_x(S∪{j}) + f_x(S)]
feature importance quantification:
Importance(i) = Σ_x |φ_i(x)|
modality importance quantification:
Importance(CEUS) = Σ_{i∈CEUS_features} Importance(i)
Importance(CT) = Σ_{i∈CT_features} Importance(i)
Importance(Pathology) = Σ_{i∈Pathology_features} Importance(i)
The SHAP feature contribution analysis adopted by the invention quantifies the contribution of each feature to the prognostic result and reveals the role of different factors in the prediction.
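The per-modality aggregation can be expressed in a few lines; the sketch below assumes a SHAP value matrix of shape (n_samples, n_features), for example as produced by a generic SHAP explainer, and a lookup mapping feature names to modalities.

import numpy as np

def modality_importance(shap_values: np.ndarray, feature_names, modality_of):
    # Importance(i) = sum_x |phi_i(x)| per feature, then aggregated per modality.
    per_feature = np.abs(shap_values).sum(axis=0)  # shape (n_features,)
    scores = {"CEUS": 0.0, "CT": 0.0, "Pathology": 0.0}
    for name, imp in zip(feature_names, per_feature):
        scores[modality_of[name]] += float(imp)
    return scores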
In summary, the invention combines the dynamic micro blood flow characteristics of CEUS with the high-resolution anatomical information of enhanced CT through deep learning and establishes a preoperative classification model based on multimodal image data, comprehensively improving the sensitivity and specificity of preoperative FTC classification. An adaptive dynamic weight adjustment mechanism realizes the optimized combination of the multimodal image features through a deep self-attention network, fully exploiting the complementary information of each modality. Joint analysis of the imaging and pathological data is then used to construct a recurrence risk prediction model based on molecular- and cell-level features, providing an accurate individualized management strategy for high-risk postoperative patients.
Furthermore, the invention also provides electronic equipment, which mainly comprises a processor and a memory, wherein a computer program is stored in the memory, and the computer program can cause the processor to execute the thyroid tumor classification and recurrence risk prediction method based on deep learning.
The memory may be random access memory, read-only memory, non-volatile memory, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), flash memory, optical memory, registers, and so forth.
The processor may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed by such a processor. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
In the above embodiments, the implementation may be wholly or partly realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, tape), an optical medium (e.g., DVD), a solid state disk (SSD), or the like.

Claims (10)

1. A method for thyroid tumor classification and recurrence risk prediction based on deep learning, characterized by comprising the following steps:
S1. Acquiring and preprocessing multimodal data related to follicular thyroid carcinoma (FTC) or follicular adenoma (FTA), wherein the multimodal data include dynamic microblood flow data, anatomical data and whole-slice pathology image WSI data;
S2. Extracting features from the preprocessed dynamic microblood flow data, anatomical data and whole-slice pathology image WSI data with different feature extraction methods, respectively, to obtain a dynamic microblood flow feature F_CEUS, an anatomical feature F_CT and a pathological feature F_path;
S3. Fusing and optimizing the dynamic microblood flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path to obtain a first optimized fusion feature F_fused_optimized; fusing and optimizing the dynamic microblood flow feature F_CEUS and the anatomical feature F_CT to obtain a second optimized fusion feature F_temporal;
S4. Inputting the first optimized fusion feature F_fused_optimized into a preoperative classification model to obtain a thyroid tumor type classification probability, and inputting the second optimized fusion feature F_temporal and the pathological feature F_path into a postoperative recurrence risk prediction model to obtain a recurrence risk score.
2. The deep learning-based thyroid tumor classification and recurrence risk prediction method according to claim 1, wherein collecting the multimodal data in step S1 comprises:
obtaining the dynamic microblood flow data by contrast-enhanced ultrasound (CEUS) using a time-series acquisition technique;
obtaining the anatomical data by enhanced CT using multi-phase scanning and an iterative reconstruction algorithm;
obtaining the whole-slice pathology image WSI data using a digital scanner;
and wherein preprocessing the multimodal data in step S1 comprises:
applying adaptive median filtering and wavelet transform to the dynamic microblood flow data to obtain first dynamic microblood flow data, and applying non-local means denoising to the anatomical data to obtain first anatomical data;
applying a mutual information maximization algorithm to the first dynamic microblood flow data and the first anatomical data to obtain sub-pixel spatially aligned dynamic microblood flow data and anatomical data;
applying a pyramid tiling strategy to the whole-slice pathology image WSI data, extracting pixel tiles of the WSI at different magnifications, and applying multi-scale feature fusion to the pixel tiles to obtain fused whole-slice pathology image WSI data.
3. The deep learning-based thyroid tumor classification and recurrence risk prediction method according to claim 1, wherein extracting features from the preprocessed multimodal data in step S2 comprises:
using a CNN-GRU hybrid architecture to perform spatial feature extraction and temporal feature modeling on the preprocessed dynamic microblood flow data, to obtain the dynamic microblood flow feature F_CEUS associated with the dynamic microblood flow data;
using a pretrained ResNet50 to perform multi-level feature extraction on the preprocessed anatomical data, and fusing multi-scale features through a pyramid pooling module (PPM), to obtain the anatomical feature F_CT associated with the anatomical data;
dividing the preprocessed whole-slice pathology image WSI into N tiles, processing each tile with an InceptionResNetV2 model to obtain corresponding instance features, computing an attention weight for each tile through an attention weight network, and performing weighted aggregation of all instance features according to the computed attention weights, to obtain the pathological feature F_path associated with the whole-slice pathology image WSI data.
4. The deep learning-based thyroid tumor classification and recurrence risk prediction method according to claim 3, wherein, in step S3, fusing and optimizing the dynamic microblood flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path to obtain the first optimized fusion feature F_fused_optimized comprises:
generating a first initial weight vector for the dynamic microblood flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path using a multi-head self-attention mechanism;
adaptively adjusting the first initial weight vector in combination with feature quality evaluation and task relevance analysis, to obtain a first initial weight vector after a first adjustment;
dynamically adjusting the first initial weight vector after the first adjustment using a task-aware dynamic weight adjustment mechanism, to obtain a first initial weight vector after a second adjustment;
normalizing the first initial weight vector after the second adjustment and applying task-sensitive adaptive adjustment, to obtain a first weight vector [α, β, γ], where α+β+γ=1;
performing weighted fusion of the dynamic microblood flow feature F_CEUS, the anatomical feature F_CT and the pathological feature F_path based on the first weight vector [α, β, γ], to obtain a first fused feature F_fused as follows:
F_fused = α·F_CEUS + β·F_CT + γ·F_path
and then optimizing the first fused feature F_fused through a Transformer encoder, to obtain the first optimized fusion feature F_fused_optimized.
5. The deep learning-based thyroid tumor classification and recurrence risk prediction method according to claim 3, wherein, in step S3, fusing and optimizing the dynamic microblood flow feature F_CEUS and the anatomical feature F_CT to obtain the second optimized fusion feature F_temporal comprises:
computing a feature similarity matrix for the dynamic microblood flow feature F_CEUS and the anatomical feature F_CT using a multi-head attention mechanism, to generate a second initial weight vector;
adjusting the second initial weight vector using a task-aware dynamic weight adjustment mechanism, to obtain an adjusted second initial weight vector;
normalizing the adjusted second initial weight vector, to obtain a second weight vector [α', β'], where α'+β'=1;
performing weighted fusion of the dynamic microblood flow feature F_CEUS and the anatomical feature F_CT based on the second weight vector [α', β'], to obtain a second fused feature F_fused' as follows:
F_fused' = α'·F_CEUS + β'·F_CT;
introducing a residual connection to obtain an enhanced feature F_enhanced:
F_enhanced = F_fused' + Dropout(FC(F_fused'))
where FC is a fully connected layer and Dropout is a random deactivation operation;
applying layer normalization to the enhanced feature F_enhanced to obtain a normalized enhanced feature F_normalized:
F_normalized = LayerNorm(F_enhanced)
and inputting the normalized enhanced feature F_normalized into an attention-enhanced GRU network, to obtain the second optimized fusion feature F_temporal:
F_temporal = GRU(F_normalized).
6. The deep learning-based thyroid tumor classification and recurrence risk prediction method according to claim 1, wherein the preoperative classification model in step S4 is constructed by:
constructing an initial preoperative classification model using a dual-branch fully connected network structure;
collecting multimodal data related to follicular thyroid carcinoma (FTC) or follicular adenoma (FTA) as a dataset, and dividing the dataset into a first training set, a first validation set and a first test set according to a predetermined ratio using stratified sampling;
training the initial preoperative classification model with the first training set, evaluating the performance of the initial preoperative classification model with the first validation set during training, and adjusting the parameters of the initial preoperative classification model using a multi-stage transfer learning strategy, a cross-entropy loss function with an L2 regularization term, a dynamic fine-tuning mechanism based on the characteristics of the CEUS and CT devices, and adaptive learning rate adjustment; obtaining the preoperative classification model after training and evaluating its performance with the first test set.
7. The deep learning-based thyroid tumor classification and recurrence risk prediction method according to claim 1, wherein the postoperative recurrence risk prediction model in step S4 is constructed by:
constructing an initial postoperative recurrence risk prediction model using a multimodal contrastive learning framework;
collecting multimodal data related to follicular thyroid carcinoma (FTC) and follicular adenoma (FTA) as a dataset, and dividing the dataset into a second training set, a second validation set and a second test set according to a predetermined ratio using stratified sampling;
training the initial postoperative recurrence risk prediction model with the second training set, evaluating its performance with the second validation set, and adjusting its parameters using domain-invariant feature learning based on adversarial training and an adaptive early stopping mechanism based on validation set performance; obtaining the postoperative recurrence risk prediction model after training and evaluating its performance with the first test set.
8. The deep learning-based thyroid tumor classification and recurrence risk prediction method according to claim 7, wherein a dual-branch contrastive learning network and a risk scoring mechanism are further used to enhance the performance of the postoperative recurrence risk prediction model.
9. The deep learning-based thyroid tumor classification and recurrence risk prediction method according to any one of claims 1 to 8, wherein, based on the thyroid tumor type classification probability and the recurrence risk score, Grad-CAM optimization is used to generate heat maps to improve the clarity and reliability of the visualization results, and SHAP feature contribution analysis is used to quantify the contribution of various features to the prognostic results.
10. An electronic device comprising a processor and a memory, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, it causes the processor to execute the deep learning-based thyroid tumor classification and recurrence risk prediction method according to any one of claims 1 to 9.
CN202510766187.4A 2025-06-10 2025-06-10 Thyroid tumor classification and recurrence risk prediction method based on deep learning Pending CN120600315A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510766187.4A CN120600315A (en) 2025-06-10 2025-06-10 Thyroid tumor classification and recurrence risk prediction method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510766187.4A CN120600315A (en) 2025-06-10 2025-06-10 Thyroid tumor classification and recurrence risk prediction method based on deep learning

Publications (1)

Publication Number Publication Date
CN120600315A true CN120600315A (en) 2025-09-05

Family

ID=96899449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510766187.4A Pending CN120600315A (en) 2025-06-10 2025-06-10 Thyroid tumor classification and recurrence risk prediction method based on deep learning

Country Status (1)

Country Link
CN (1) CN120600315A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination