Disclosure of Invention
In order to solve the above technical problem, the present invention provides a lens segmentation method, comprising:
acquiring an ROI image containing the crystalline lens from the original image;
determining the image segmentation difficulty based on the ROI image, and obtaining a lens segmentation result through a real-time segmentation network based on the ROI image.
Extracting the target region through a preprocessing algorithm greatly reduces the size of the image to be segmented, reduces interference from redundant information, and lowers the computational load of the segmentation network.
Further, acquiring the ROI image containing the lens from the original image includes: filtering the original image and inputting it into a ShuffleSeg network to obtain a segmentation result image; obtaining the left, right, upper and lower boundaries of the ROI from the segmentation result image; and cropping the ROI image from the original image according to those boundaries. Because the ROI is extracted with an image segmentation approach, the poor robustness and boundary overshoot that affect traditional methods and deep learning detection algorithms are avoided, and stability and precision are markedly improved.
Further, determining the image segmentation difficulty based on the ROI image comprises: inputting the ROI image into a ShuffleNet network for encoding; inputting the encoded feature map into a SkipNet network for decoding; average-pooling the encoding and decoding results separately and fusing them; and connecting a fully connected layer to output the image segmentation difficulty. The difficulty level of the lens segmentation serves as an evaluation of the confidence of the image segmentation result.
Further, obtaining the lens segmentation result through the real-time segmentation network based on the ROI image includes: encoding the ROI image with a ShuffleNet network to extract image features, and decoding with a SkipNet network that upsamples the extracted features to compute the final class probability map. A segmentation network built on ShuffleNet and SkipNet greatly reduces the computational load and achieves real-time segmentation while maintaining accuracy, which is of great significance for practical application.
Further, before the ROI image containing the crystalline lens is acquired from the original image, whether the original image is segmentable is judged; if it is not segmentable, the image segmentation difficulty is judged to be the highest. This quality control step improves segmentation efficiency by ensuring that only segmentable images proceed to segmentation.
Further, determining whether the original image is segmentable includes: inputting the original image into a ShuffleNet network for encoding, adding an average pooling layer to the encoder output, and connecting a fully connected layer to output the judgment result.
The present invention also provides a lens segmentation system, comprising:
the ROI image extraction module is used for acquiring an ROI image containing the crystalline lens from the original image;
the segmentation difficulty grading module is used for judging the image segmentation difficulty based on the input ROI image;
the real-time segmentation module is used for carrying out real-time lens segmentation on the input ROI image;
the ROI image extraction module extracts an ROI image from an input original image and inputs the ROI image into the segmentation difficulty grading module and the real-time segmentation module.
In the preprocessing that extracts the region of interest, boundary searching is performed with an image segmentation approach, which avoids the poor robustness and boundary overshoot of traditional methods and deep learning detection algorithms and markedly improves stability and precision. Meanwhile, extracting the target region through the preprocessing algorithm greatly reduces the size of the image to be segmented, reduces interference from redundant information, and lowers the computational load of the segmentation network.
Preferably, the system further comprises a segmentability judging module for judging whether the input original image can be segmented. When the input original image is judged segmentable, this module triggers the ROI image extraction module to extract the ROI region; when it is judged non-segmentable, the image segmentation difficulty is judged to be the highest.
The present invention also provides a computer-readable storage medium comprising computer-readable instructions which, when read and executed by a processor, cause the processor to perform the operations of any one of claims 1-6.
The invention has the following beneficial effects:
(1) The method comprises four modules: image segmentability quality control, region-of-interest extraction, segmentation difficulty level prediction, and structure segmentation. It reduces the influence of human factors while ensuring segmentation precision, makes segmentation repeatable, greatly improves segmentation efficiency, and is of great significance for the diagnosis of cataract.
(2) In the preprocessing that extracts the region of interest, boundary searching is performed with an image segmentation approach, which avoids the poor robustness and boundary overshoot of traditional methods and deep learning detection algorithms and markedly improves stability and precision. Meanwhile, extracting the target region through the preprocessing algorithm greatly reduces the size of the image to be segmented, reduces interference from redundant information, and lowers the computational load of the segmentation network.
(3) The ShuffleSeg segmentation network based on ShuffleNet and SkipNet greatly reduces the computational load and achieves real-time segmentation while maintaining accuracy, which is of great significance for practical application.
(4) The segmentation difficulty level prediction and the structure segmentation share the features extracted by the ShuffleNet feature extraction network; on this basis, the extracted features are used effectively, the classification and segmentation network branches accomplish their different tasks efficiently, and the algorithm running time is greatly reduced.
(5) The segmentation framework of the invention, aimed at AS-OCT images, has strong anti-interference capability and good generalization, and its concept can readily be applied to other image segmentation fields.
Example one
A lens segmentation method, comprising: acquiring an ROI image containing the crystalline lens from the original image; determining the image segmentation difficulty based on the ROI image; and obtaining a lens segmentation result through a real-time segmentation network based on the ROI image. Extracting the target region through a preprocessing algorithm greatly reduces the size of the image to be segmented, reduces interference from redundant information, and lowers the computational load of the segmentation network. Both the segmentation difficulty judgment and the lens segmentation use a real-time segmentation network that encodes with ShuffleNet and decodes with SkipNet. In practice, because a patient's lens may be diseased or replaced by an intraocular lens, the captured AS-OCT image may show a missing or absent lens structure, which affects structure segmentation. For this reason, in the present embodiment, before the ROI image containing the lens is acquired from the original image, whether the original image is segmentable is judged automatically.
Fig. 1 shows the overall flow of the method of this embodiment, which includes four steps:
Step one, judging whether the AS-OCT original image is segmentable.
Step two, acquiring an ROI image containing the crystalline lens from the original image. The AS-OCT original image is large (e.g., 2130 × 1864), and the regions to the left and right of the crystalline lens are redundant information that interferes with structure segmentation and increases processing difficulty and computational load. A preprocessing algorithm searches for the boundaries of the lens region and extracts the key region of the image to be segmented while reducing noise. Without affecting the structure segmentation, the size and extent of the image are greatly reduced, which benefits the subsequent segmentation network and lowers the computational load. Moreover, AS-OCT lesion images are very complex; traditional image processing methods are easily affected by image quality when searching for boundaries, producing inaccurate or wrong boundaries, and they struggle to cope with varied practical scenarios.
Step three, taking the ROI image obtained in step two as input and obtaining the segmentation difficulty level of the original image, which serves as an evaluation of the confidence of the image segmentation result.
Step four, taking the ROI image obtained in step two as input, obtaining the lens structure segmentation result through a real-time segmentation network, and producing a visual segmentation result through post-processing.
In step one, the original image is taken as input and encoded by a ShuffleNet network; an average pooling layer (AvgPool) is added to the encoder output, followed by a fully connected layer (FC), yielding a segmentable or non-segmentable result; the network structure is shown in Fig. 2. In this embodiment, if the image is judged non-segmentable, the image segmentation difficulty may be directly set to level 0, indicating maximum difficulty.
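For illustration only, this quality control classifier can be sketched in PyTorch as follows; this is a minimal sketch, not the patented implementation, and the backbone module and the 960-channel encoder output are assumptions:

```python
import torch
import torch.nn as nn

class SegmentabilityClassifier(nn.Module):
    """Step one sketch: judge whether an AS-OCT image is segmentable.

    `backbone` is any ShuffleNet-style encoder returning a feature map;
    the head is average pooling followed by a fully connected layer, as
    described above. The channel count of 960 is an assumption.
    """
    def __init__(self, backbone: nn.Module, channels: int = 960):
        super().__init__()
        self.backbone = backbone
        self.pool = nn.AdaptiveAvgPool2d(1)   # average pooling layer (AvgPool)
        self.fc = nn.Linear(channels, 2)      # FC: segmentable / non-segmentable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.backbone(x)               # ShuffleNet encoding
        return self.fc(self.pool(feat).flatten(1))
```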
The main purpose of step two is to extract an ROI image (shown as the ROI image in Fig. 3) from the original image, thereby shrinking the image fed into the segmentation network, reducing interference from redundant information, and increasing segmentation speed. Step two takes the original image as input (the Input image in Fig. 3) and applies median filtering with a 5 × 5 kernel to reduce image noise introduced by the acquisition equipment. The filtered image is then input into the ShuffleSeg network at an input size of 240 × 120. The resulting segmentation is shown as the preprocessed segmentation image in Fig. 4, in which the lens region is the foreground and all other regions are background. Searching the segmentation result image from the top downward and from the bottom upward yields the ordinates of the upper and lower boundaries of the lens capsule, denoted y_top and y_bottom. From these the coordinates of the lens center are computed, and searching the non-background region from both sides toward the center yields the abscissas of the left and right boundaries, denoted x_left and x_right. To ensure that the ROI image contains the complete lens capsule region, the upper and lower boundaries are moved outward by a certain distance before the region of interest is cropped, as shown in the ROI image in Fig. 4.
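The boundary search and cropping can be sketched as follows, assuming a binary NumPy mask from the ShuffleSeg network; the margin value is illustrative, and rescaling between the 240 × 120 mask and the full-size original is omitted for clarity:

```python
import numpy as np

def extract_roi(original: np.ndarray, seg: np.ndarray, margin: int = 20):
    """Crop the lens ROI from `original` using the binary lens mask `seg`."""
    rows = np.where(seg.any(axis=1))[0]       # rows containing foreground
    y_top, y_bottom = rows[0], rows[-1]       # top-down / bottom-up search
    y_center = (y_top + y_bottom) // 2        # lens center ordinate
    cols = np.where(seg[y_center] > 0)[0]     # search inward from both sides
    x_left, x_right = cols[0], cols[-1]
    # move the upper and lower boundaries outward so the full capsule is kept
    y_top = max(y_top - margin, 0)
    y_bottom = min(y_bottom + margin, original.shape[0] - 1)
    return original[y_top:y_bottom + 1, x_left:x_right + 1]
```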
In step three, the ROI image obtained in step two is taken as input and encoded with ShuffleNet for feature extraction; the extracted feature map is then decoded with SkipNet. The encoder and decoder outputs are each average-pooled and fused, two fully connected layers follow, and the grading of the image segmentation difficulty is output (in this embodiment grades 1-5, where a higher grade means the image is easier to segment); the network structure is shown in Fig. 5.
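A minimal sketch of this two-stream grading head, assuming PyTorch and generic encoder/decoder modules (the hidden width and channel counts are assumptions, and the decoder is simplified to take only the final encoder map):

```python
import torch
import torch.nn as nn

class DifficultyGrader(nn.Module):
    """Step three sketch: grade segmentation difficulty (levels 1-5).

    Encoder and decoder outputs are each average-pooled, fused by
    concatenation, then passed through two fully connected layers.
    """
    def __init__(self, encoder, decoder, enc_ch=960, dec_ch=4, hidden=128):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(enc_ch + dec_ch, hidden), nn.ReLU(),
            nn.Linear(hidden, 5))             # 5 difficulty grades

    def forward(self, roi):
        enc = self.encoder(roi)               # ShuffleNet encoding
        dec = self.decoder(enc)               # SkipNet decoding
        fused = torch.cat([self.pool(enc).flatten(1),
                           self.pool(dec).flatten(1)], dim=1)
        return self.fc(fused)
```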
In step four, the ROI image obtained in step two is taken as input (see Fig. 4) and uniformly scaled to 256 × 512; in this embodiment a four-class ShuffleSeg real-time segmentation network performs the lens structure segmentation, as shown in Fig. 6. The real-time semantic segmentation network ShuffleSeg comprises encoding and decoding; the network structure, shown as the ShuffleSeg Network Architecture in Fig. 7, consists of an encoding part and a decoding part. The encoding process is based on the ShuffleNet network and is mainly responsible for extracting image features. ShuffleNet reduces the computational load through grouped convolution and maintains excellent accuracy with channel shuffle, improving network performance. The encoder starts with a 3 × 3 convolution with stride 2 for downsampling (Conv1, 24 output channels), each convolutional layer being followed by a ReLU activation, and then 2 × 2 max pooling (Max Pooling [2 × 2]). Three stages follow (Stage 2, Stage 3, Stage 4), each composed of several ShuffleNet units (SU: ShuffleNet Unit). Stages 2 and 4 consist of 3 ShuffleNet units each (SUs = 3), and Stage 3 consists of 7 (SUs = 7); their output channel counts are 240, 480 and 960 respectively. The decoding process is based on the SkipNet network and is mainly responsible for upsampling and computing the final class probability map; SkipNet improves accuracy by exploiting high-resolution feature maps. The output of Stage 4 passes through a 1 × 1 convolution (1 × 1 Conv) to form a score layer (Score Layer), i.e., a probability map, converting the channels to the number of classes. The score layer output is upsampled by a factor of 2 (×2 Upsampling) with a 4 × 4 convolution kernel. The outputs of Stage 2 and Stage 3 serve as skip inputs (Feed1 Layer, Feed2 Layer), each passed through a 1 × 1 convolution (1 × 1 Conv) to produce higher-resolution heat maps. The upsampled score layer output is added element-wise to the Stage 3 heat map to obtain an intermediate layer (denoted Use Feed 1). Use Feed 1 is upsampled with stride 2 and a 4 × 4 kernel to obtain a second score layer (denoted Score Layer 2). Adding the Stage 2 heat map element-wise to Score Layer 2 gives a second intermediate layer (denoted Use Feed 2). Finally, a transposed convolution initialized with bilinear upsampling, with a 16 × 16 kernel and stride 8 (×8 Upsampling), produces the final probability map matching the input size.
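The SkipNet-style decoding path just described can be sketched in PyTorch as follows; the channel counts follow the stage outputs given above (240/480/960) and num_classes = 4 matches the four-class ShuffleSeg, but the padding choices and the module itself are illustrative assumptions:

```python
import torch.nn as nn

class SkipDecoder(nn.Module):
    """Step four decoding sketch: upsample to a class probability map.

    Mirrors the description above: a 1x1 score layer on Stage 4, two x2
    transposed convolutions fused element-wise with 1x1-projected Stage 3
    and Stage 2 heat maps, then a final x8 transposed convolution.
    """
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.score = nn.Conv2d(960, num_classes, 1)   # Score Layer (1x1 Conv)
        self.feed1 = nn.Conv2d(480, num_classes, 1)   # Stage 3 heat map
        self.feed2 = nn.Conv2d(240, num_classes, 1)   # Stage 2 heat map
        self.up2a = nn.ConvTranspose2d(num_classes, num_classes, 4, stride=2, padding=1)
        self.up2b = nn.ConvTranspose2d(num_classes, num_classes, 4, stride=2, padding=1)
        self.up8 = nn.ConvTranspose2d(num_classes, num_classes, 16, stride=8, padding=4)

    def forward(self, stage2, stage3, stage4):
        use_feed1 = self.up2a(self.score(stage4)) + self.feed1(stage3)  # Use Feed 1
        score2 = self.up2b(use_feed1)                                   # Score Layer 2
        use_feed2 = score2 + self.feed2(stage2)                         # Use Feed 2
        return self.up8(use_feed2)            # final probability map (x8 upsampling)
```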
The ShuffleNet unit design draws on the residual network (ResNet) and comprises two basic branches, shown as SU: ShuffleNet Unit in Fig. 7 and Fig. 8. The first branch performs 3 × 3 average pooling (AVG Pool) with stride 2. The second branch performs a 1 × 1 grouped pointwise convolution (GC) with batch normalization (BN) and a ReLU activation, followed by channel shuffle (1 × 1 GC + Shuffle); then a 3 × 3 depthwise convolution (DWC) with stride 2, followed by batch normalization; and finally another 1 × 1 grouped pointwise convolution (1 × 1 GC), again followed by batch normalization. The results of the two branches are merged by channel concatenation (Concat) and passed through a ReLU activation to obtain the output.
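A hedged PyTorch sketch of this stride-2 ShuffleNet unit (the group count of 3 and the bottleneck width of out_ch // 4 are assumptions borrowed from the original ShuffleNet design, not stated in this embodiment):

```python
import torch
import torch.nn as nn

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Channel shuffle: interleave channels across the convolution groups."""
    n, c, h, w = x.shape
    return (x.view(n, groups, c // groups, h, w)
             .transpose(1, 2).reshape(n, c, h, w))

class ShuffleNetUnit(nn.Module):
    """Stride-2 unit: AVG Pool branch concatenated with the GC/DWC branch."""
    def __init__(self, in_ch: int, out_ch: int, groups: int = 3):
        super().__init__()
        mid = out_ch // 4
        branch_ch = out_ch - in_ch            # Concat restores out_ch channels
        self.groups = groups
        self.avgpool = nn.AvgPool2d(3, stride=2, padding=1)   # branch 1
        self.gconv1 = nn.Sequential(          # 1x1 GC + BN + ReLU (then Shuffle)
            nn.Conv2d(in_ch, mid, 1, groups=groups, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True))
        self.dwconv = nn.Sequential(          # 3x3 DWC, stride 2, + BN
            nn.Conv2d(mid, mid, 3, stride=2, padding=1, groups=mid, bias=False),
            nn.BatchNorm2d(mid))
        self.gconv2 = nn.Sequential(          # 1x1 GC + BN
            nn.Conv2d(mid, branch_ch, 1, groups=groups, bias=False),
            nn.BatchNorm2d(branch_ch))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        shortcut = self.avgpool(x)
        out = channel_shuffle(self.gconv1(x), self.groups)
        out = self.gconv2(self.dwconv(out))
        return self.relu(torch.cat([shortcut, out], dim=1))  # Concat + ReLU
```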
Steps three and four share the features extracted by the ShuffleNet feature extraction network and can be executed sequentially or in parallel. On this basis, the extracted features are used effectively, the classification and segmentation branches accomplish their different tasks efficiently, and the algorithm running time is greatly reduced.
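This feature sharing can be expressed as a thin wrapper; a hypothetical sketch reusing the modules above:

```python
import torch.nn as nn

class SharedFeatureModel(nn.Module):
    """Steps three and four run off a single ShuffleNet forward pass.

    `encoder`, `seg_decoder` and `grade_head` are the modules sketched
    earlier; whether `feats` is one map or a tuple of stage outputs
    depends on the encoder, so this wiring is an assumption.
    """
    def __init__(self, encoder, seg_decoder, grade_head):
        super().__init__()
        self.encoder = encoder
        self.seg_decoder = seg_decoder
        self.grade_head = grade_head

    def forward(self, roi):
        feats = self.encoder(roi)             # extracted once, used twice
        return self.seg_decoder(feats), self.grade_head(feats)
```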
Preferably, the network model that grades image segmentation difficulty in step three of this embodiment is built on top of the lens structure segmentation network, as follows. Step 3.1, collect sample data and annotate each sample with reference dividing lines. The sample data are the ROI images obtained in step two. The annotation of step 3.1 marks the true segmentation lines of the lens structure in the ROI image, serving as the criterion for judging whether the segmentation result of step 3.2 is accurate.
Step 3.2, input the sample data into the network used in step four to segment the lens structure in the ROI image and obtain a segmentation result.
Step 3.3, compute the segmentation comprehensive error of the automatic segmentation result for each sample. The error is obtained from the average pixel distance (Pixel Distance) between the segmentation result and the annotation of each sample: for every boundary point on each dividing line in the segmentation result, the shortest distance to the annotated true dividing line is taken as that point's error, and the mean and variance of the point errors along the whole line give the single error of that dividing line. The segmentation comprehensive error of a sample is then aggregated from the single errors of the individual dividing lines; in this embodiment it is the weighted sum of the mean-normalized single errors of the six dividing lines:

E_i = \sum_{k \in B} w_k \cdot \frac{e_{i,k}}{\bar{e}_k}

where B is the set of six dividing lines (the upper and lower boundaries of the lens capsule, of the cortex, and of the lens nucleus); e_{i,k} is the single error of dividing line k for sample data i; \bar{e}_k is the mean of the single errors of dividing line k over all sample data; and w_k is the weight assigned to the single error of dividing line k.
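Assuming the mean-normalized weighted sum above, the single error and comprehensive error of step 3.3 can be sketched in NumPy (function names and the dict-based interface are hypothetical):

```python
import numpy as np

def line_error(pred_pts: np.ndarray, gt_pts: np.ndarray):
    """Single error of one dividing line: mean and variance of the per-point
    shortest pixel distances from predicted points (N, 2) to the annotated
    true line, given as points (M, 2)."""
    d = np.sqrt(((pred_pts[:, None, :] - gt_pts[None, :, :]) ** 2).sum(-1))
    per_point = d.min(axis=1)                 # shortest distance per point
    return per_point.mean(), per_point.var()

def comprehensive_error(errors: dict, means: dict, weights: dict) -> float:
    """E_i = sum over the six dividing lines k of w_k * e_{i,k} / e_bar_k."""
    return sum(weights[k] * errors[k] / means[k] for k in errors)
```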
Step 3.4, determine the segmentation difficulty level of each sample according to its segmentation comprehensive error. The number of difficulty levels and the proportion of each level can be chosen according to actual needs and the comprehensive error distribution of all samples. Taking five difficulty levels as an example, according to the statistics and distribution of the comprehensive errors of all sample data, the samples can be apportioned as level 1: 5%, level 2: 15%, level 3: 45%, level 4: 25% and level 5: 10%, so that the range of the segmentation comprehensive error E corresponding to each level is determined. Difficulty decreases from level 1 to level 5: level 5 means the sample is easiest to segment, the segmentation error is small, and the result is highly reliable; level 1 means segmentation is hardest, the error is large, and the result is unreliable. Once the comprehensive error range of each level is fixed, the difficulty level of every sample follows from its comprehensive error. Training on the samples and their difficulty levels then yields a difficulty judgment network that automatically grades the segmentation difficulty of the lens structure in AS-OCT images.
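Under the stated proportions, mapping comprehensive errors to the five levels amounts to quantile thresholding; a NumPy sketch, illustrative only:

```python
import numpy as np

def assign_levels(E: np.ndarray) -> np.ndarray:
    """Map comprehensive errors to difficulty levels 1-5 by the stated
    proportions (level 1: 5%, 2: 15%, 3: 45%, 4: 25%, 5: 10%).

    Larger errors mean harder samples, so the largest 5% of errors get
    level 1 and the smallest 10% get level 5.
    """
    # cumulative shares from the easiest samples upward:
    # 10% -> level 5, 35% -> level 4, 80% -> level 3, 95% -> level 2
    qs = np.quantile(E, [0.10, 0.35, 0.80, 0.95])
    levels = np.full(E.shape, 1, dtype=int)   # hardest 5% by default
    levels[E <= qs[3]] = 2
    levels[E <= qs[2]] = 3
    levels[E <= qs[1]] = 4
    levels[E <= qs[0]] = 5
    return levels
```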
Step 3.5, build the segmentation difficulty judgment network from the samples and their difficulty levels. A parallel classification branch is added on top of the segmentation network used in step four, so that classification reuses the effective features extracted by the automatic segmentation network.