WO2025051988A1

WO2025051988A1 - Computer implemented method and computer programs for harmonization of magnetic resonance images

Info

Publication number: WO2025051988A1
Application number: PCT/EP2024/075043
Authority: WO
Inventors: Ibor Crespo EDUARDO; Jiménez Pastor ANA; Alberich Bayarri ÁNGEL; García Castro FABIO
Original assignee: Quibim SL
Current assignee: Quibim SL
Priority date: 2023-09-08
Filing date: 2024-09-06
Publication date: 2025-03-13
Anticipated expiration: 2026-03-08

Abstract

A computer implemented method and computer programs for harmonization of magnetic resonance images are proposed. The method comprises selecting a subset of homogeneous images among a set of original multi-contrast magnetic resonance images to be harmonized and considering it as ground truth; generating contrast and texture alterations of the homogeneous images in the selected subset by subjecting each 2D slice of the homogeneous images to different alterations in the frequency domain; returning the generated contrast and texture alterations of the homogeneous images to the spatial domain; pairing each generated contrast and texture alteration of the homogeneous images in the spatial domain with its original ground truth, obtaining paired images as a result; and generating a training set for a deep learning model using the obtained paired images.

Description

COMPUTER IMPLEMENTED METHOD AND COMPUTER PROGRAMS FOR HARMONIZATION OF MAGNETIC RESONANCE IMAGES

DESCRIPTION

Technical Field

The present invention relates to a method and computer programs for harmonization of images, in particular magnetic resonance images (MRI), a type of medical image. More specifically, the invention relates to an image harmonization methodology based on the use of the frequency domain, powered by artificial intelligence (AI).

Background of the Invention

In recent years, different solutions have been proposed to solve the image harmonization problem. These solutions range from traditional computer vision algorithms for image normalization and denoising [1] to more advanced solutions based on AI and, specifically, convolutional neural network (CNN) architectures, such as generative adversarial networks (GAN) or autoencoders.

According to different principles, traditional computer vision algorithms can be divided into filtering-based denoising algorithms and statistical learning-based methods. The former try to preserve information by local smoothing of noisy images, and to eliminate noise by calculating the relationship between noisy image pixels and the surrounding pixels [2, 3]. The latter are used to learn the statistical properties of natural images, noisy images, and noisy signals, and to fuse spatial and transform domain methods to denoise images, with a focus on the determination of parameters [4-6]. Although traditional methods are remarkably useful for remote sensing images, they also present significant drawbacks. They require various hyperparameters to be set manually, as well as significant computational and time costs, as they obtain optimal results by solving for the optimal variance. Finally, they can only handle specific noise intensities [1].

Learning-based methods translate images between sites via non-linear mappings determined using machine learning [7, 8] or deep learning [9, 10], with or without supervision. Machine learning methods predict harmonized images using regression models learned, typically with supervision, from hand-crafted features. In contrast, deep learning techniques automatically extract features pertinent to the harmonization task. Supervised methods, such as the DeepHarmony method [10], typically require training data acquired from traveling human phantoms, which might not always be available for large-scale, multi-site, or longitudinal studies. DeepHarmony uses a U-Net-based deep learning architecture to produce images with consistent contrast. Although it has been proven to harmonize images effectively across different protocols [10], it has a major limitation: training requires images from different acquisition protocols for the same patient at the same time point, a requirement that is very difficult to meet in practice. Unsupervised deep learning techniques [11-13] determine mappings between sites using unpaired images and therefore avoid the need for traveling human phantom data. Among these methods, GAN-based approaches, a family of deep learning methods using adversarial training to learn a generative model of a target distribution, have emerged as a promising alternative, as they have proven to be highly effective at synthesizing imaging data in a wide range of scenarios [14]. GAN-based methods can be used to train deep learning models that are robust to adversarial perturbations [15-18]. In the field of medical imaging, there have been encouraging results using the GAN approach to model site-based variations. In particular, Y. Gao et al. [19] used a GAN-based approach for intensity normalization on multi-site T2-FLAIR MRI data and Modanwal et al. [20] developed a GAN-based image harmonization approach for dynamic contrast-enhanced breast MRI scans.
Other GAN-based methods, such as those developed by Bashyam et al. [21] or Zhu et al. [22], have also demonstrated good results. However, unsupervised methods also have important drawbacks. They do not scale to large multi-site MRI harmonization, as they typically learn pair-wise mappings between sites. For N sites, these methods learn N(N - 1) mappings for all site pairs, and therefore require a large amount of data for learning the multitude of network parameters. These pair-wise methods are also ineffective because they do not fully and jointly utilize the complementary information available from all sites, and they are generally less flexible when it comes to improvements and experiments. Importantly, and specifically for GAN approaches, there is also a potential risk of missing or even altering important information about, for example, abnormal structures such as lesions, as this kind of architecture uses two different sets to transfer style features between them.

Besides the above, European patent application EP386506-A1 discloses a harmonization system for a brain activity classifier that harmonizes brain measurement data obtained at a plurality of sites to realize a discrimination process based on brain functional imaging. The system obtains data, for a plurality of traveling subjects as common objects of measurements at each of the plurality of measurement sites, resulting from measurements of brain activities of a predetermined plurality of brain regions of each of the traveling subjects; calculates, for each of the traveling subjects, prescribed elements of a brain functional connectivity matrix representing the temporal correlation of brain activities of a set of the plurality of brain regions; using a generalized linear mixed model, calculates measurement bias data 3108 for each element of the brain functional connectivity matrix, as a fixed effect at each measurement site with respect to an average of the corresponding element across the plurality of measurement sites and across the plurality of traveling subjects; and thereby executes a harmonizing process.

WO2022192728 discloses an exemplary system, method and computer-accessible medium for harmonizing neuromelanin (NM) data using ComBat directly on an NM database or using ComBat-generated coefficients to harmonize future data. The method comprises, for example, receiving imaging information of a brain of the patient(s) from one MRI scanner, receiving imaging information of a brain of the patient(s) from a second MRI scanner, and using ComBat to harmonize the data between scanners against a reference dataset. The NM concentration of the patient(s) can then be determined based on the harmonized data. The NM concentration can be determined using a voxel-wise analysis procedure. The voxel-wise analysis procedure can be used to determine a topographical pattern(s) within a substantia nigra (SN) of the brain of the patient(s).

US2023014745-A1 discloses systems, methods, and instrumentalities associated with reconstructing MR images based on under-sampled MR data. The MR data include 2D or 3D information, and may encompass multiple contrasts and multiple coils. The MR images are reconstructed using deep learning (DL) methods, which may accelerate the scan and/or image generation process. Challenges imposed by the large quantity of the MR data and hardware limitations are overcome by separately reconstructing MR images based on respective subsets of contrasts, coils, and/or readout segments, and then combining the reconstructed MR images to obtain desired multi-contrast results.

In spite of the known solutions, to date none of the known methods has shown sufficiently good and generalizable performance in the imaging field, and more particularly in the medical imaging field, when dealing with multicentric data. So far, no solution has used the frequency domain to generate altered images to be used as inputs of a subsequent training process as part of a deep learning model, or has harmonized around subset clustering based on DICOM tags.

References:

1. Han, L et al. Remote Sensing Image Denoising Based on Deep and Shallow Feature Fusion and Attention Mechanism. 2022. 14(5).

2. Buades, A et al. Nonlocal Image and Movie Denoising. International Journal of Computer Vision, 2008. 76(2): p. 123-139.

3. Rudin, L et al. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 1992. 60: p. 259-268.

4. Aharon, M et al. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 2006. 54(11): p. 4311-4322.

5. Cybenko, G. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 1989. 2(4): p. 303-314.

6. Zhao, H et al. Statistically Adaptive Image Denoising Based on Overcomplete Topographic Sparse Coding. Neural Processing Letters, 2015. 41(3): p. 357-369.

7. Garcia-Dias, R et al. Neuroharmony: A new tool for harmonizing volumetric MRI data from unseen scanners. Neuroimage, 2020. 220: p. 117127.

8. Jog, A et al. Random forest regression for magnetic resonance image synthesis. Med Image Anal, 2017. 35: p. 475-488.

9. Dinsdale, NK et al. Deep learning-based unlearning of dataset bias for MRI harmonisation and confound removal. Neuroimage, 2021. 228: p. 117689.

10. Dewey, BE et al. DeepHarmony: A deep learning approach to contrast harmonization across scanner changes. Magn Reson Imaging, 2019. 64: p. 160-170.

11. Liu, M et al. Unsupervised Image-to-Image Translation Networks. 2018.

12. Zhu, J et al. Toward multimodal image-to-image translation. 2018.

13. Isola, P et al. Image-to-image translation with conditional adversarial networks. arXiv:1611.07004v3, 2018.

14. Goodfellow, IJ et al. Generative adversarial networks. 2014.

15. Sun, K et al. Enhancing the robustness of deep neural networks by boundary conditional GAN. 2019.

16. Wang, H et al. A direct approach to robust deep learning using adversarial networks. 2019.

17. Lee, H et al. Generative adversarial trainer: Defense to adversarial perturbations with GAN. 2017.

18. Samangouei, P et al. Defense-GAN: Protecting classifiers against adversarial attacks using generative models. 2018.

19. Gao, Y et al. A Universal Intensity Standardization Method Based on a Many-to-One Weak-Paired Cycle Generative Adversarial Network for Magnetic Resonance Images. IEEE Transactions on Medical Imaging, 2019. PP: p. 1-1.

20. Modanwal, G et al. MRI image harmonization using cycle-consistent generative adversarial network. 2020.

21. Bashyam, V et al. Medical Image Harmonization Using Deep Learning Based Canonical Mapping: Toward Robust and Generalizable Learning in Imaging. 2020.

22. Zhu, J et al. Unpaired image-to-image translation using cycle-consistent adversarial networks. 2017.

Description of the Invention

The present invention proposes an innovative self-supervised deep learning method that aims to overcome most of the limitations of the methods described above. Thanks to its novel approach, consisting of an image pre-processing step involving alterations in the frequency domain, and of a deep learning model (e.g., a CNN learning model) that can include a modified version of a U-Net CNN architecture, the present invention allows the harmonization of MRI scans from several sites and acquired with any kind of scanner and/or protocol:

• Without requiring a complex training process

• Avoiding pair-wise methods

• Minimizing information loss and/or alteration by enhancing the field of view of the input set

The above object is fulfilled by a method with the characteristics of claim 1 and by a computer program with the features of claim 8.

Thus, the present invention proposes, according to one aspect, a computer implemented method for harmonization of MRI images (or sequences), comprising selecting, by a processor, a subset of homogeneous images among a set of original multi-contrast images to be harmonized and considering it as ground truth; generating, by a processor, contrast and texture alterations of the homogeneous images in the selected subset by subjecting each 2D slice of the homogeneous images to different alterations in the frequency domain by performing in an iterative loop the following steps: i) generating a centered round binary mask positioned at a center region of the frequency domain with a radius varying from 1 to 5% of the image size and taking values from 1 to 1.5, both the size and the value increasing independently on each iteration; ii) generating a mask composed by a module of the homogeneous images in the frequency domain normalized to a range [0 - 1]; iii) adding together and normalizing to a range [0 - 1] the two generated masks, obtaining a new and final mask as a result; and iv) performing addition and subtraction operations on the original module of the homogeneous images through the final mask; returning, by a processor, the generated contrast and texture alterations of the homogeneous images to the spatial domain; pairing, by a processor, each generated contrast and texture alteration of the homogeneous images in the spatial domain with its original ground truth, obtaining paired images as a result; and generating, by a processor, a training set for a deep learning model (e.g. a CNN architecture) using the obtained paired images.

In some embodiments, the selecting step comprises checking if DICOM tags are available in the set of original multi-contrast images, wherein if DICOM tags are available, the subset of homogeneous images is selected based on them, whereas if DICOM tags are not available, a comparison of histogram plots of the voxel intensities is used for the selection.

In some embodiments, the method also performs a visual exploration on the selected subset of homogeneous images to discard artifacts or distorted information therein.

In some embodiments, the final mask is obtained by giving more weight on each iteration to a region corresponding to true values of the centered round binary mask, increasing at each step of the iteration by 0.1 for the intensity and by 1% for the size.

In some embodiments, the deep learning model is based on a U-Net CNN architecture, that can comprise at least one of: using convolutional filters with kernel size of 4x4 and a stride of 2; adding an additional skip connection between an input of a convolutional block of the U-Net CNN architecture and an output of a deep supervised block of the U-Net CNN architecture, and using a mean square error, MSE, as a loss function with L2 regularizers; and/or adding a deep supervised block at an end of the U-Net CNN architecture.

Other embodiments of the invention that are disclosed herein also include software programs to perform the method embodiment steps and operations summarized above and disclosed in detail below. More particularly, a computer program product is one embodiment that has a computer-readable medium including computer program instructions encoded thereon that when executed on at least one processor in a computer system causes the processor to perform the operations indicated herein as embodiments of the invention.

Brief Description of the Drawings

The previous and other advantages and features will be more fully understood from the following detailed description of embodiments, with reference to the attached figures, which must be considered in an illustrative and non-limiting manner, in which:

Fig. 1: Illustrates an embodiment of the alterations workflow in the frequency domain.

Fig. 2: Shows a sample subset of alterations, including increased and reduced intensities, together with the frequency domain alteration masks that produced each resulting image.

Fig. 3: Illustrates the flow diagram of the process to obtain contrast altered images, according to another embodiment.

Fig. 4: Illustrates an embodiment of the proposed CNN architecture.

Detailed Description of the Invention and of Preferred Embodiments

The present invention proposes a new paradigm for MRI harmonization based on the generation, from a set of original multi-contrast images, of a wide variety of realistic contrast and texture alterations that allow a deep learning algorithm to learn how to bring any distribution of intensities to the same latent space.

The uniqueness of the proposed approach lies in the data pre-processing or preparation. An initial step requires the selection of a cluster/subset of homogeneous images among the set of original images, which is considered as the ground truth. Each of their slices is subjected to different alterations, aimed at recreating a realistic variation of the images in terms of contrast and texture. These alterations are generated in the frequency domain (or k-space), which represents the spatial frequency information of an image, that is, the space covered by the phase and frequency encoding data. The relationship between k-space data and image data is the Fourier transformation. In two-dimensional (2D) Fourier transform imaging, a line of data corresponds to the digitized signal at a particular phase encoding level. The position in k-space is directly related to the gradient across the object being imaged. By changing the gradient over time, the k-space data are sampled in a trajectory through Fourier space. Each point in the raw data matrix contains part of the information for the complete image. However, a point in the raw data matrix does not correspond to a point in the image matrix. The outer rows of the raw data matrix correspond to high spatial frequencies, and provide information regarding the borders and contours of the image, that is, the structural details. The inner rows of the matrix correspond to low spatial frequencies, and provide information on the general contrast of the image.

The approach of the present invention leverages these properties of k-space to generate paired 2D slices (altered-original) with different contrast appearance and intensity distribution. The image is taken to the frequency domain by applying the Fast Fourier transformation, and alterations are performed in the center of the frequency domain, where most of the contrast-related information is found. Slight changes in specific parts of the module in the frequency domain generate a tone-up or tone-down contrast effect, while maintaining the original lighting features, as well as any artifact. Alterations of images in the frequency domain are returned to the spatial/image domain by applying the Inverse Fourier Transform, and subsequently used as inputs to train a deep learning model (e.g. a U-Net-based convolutional neural network (CNN) architecture) in which each alteration is paired with its original ground truth.
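As an illustration only, the round trip between the spatial domain and the frequency domain can be sketched with NumPy as follows. The sketch assumes that "module" denotes the magnitude of the complex spectrum and that the spectrum is shifted so that the zero frequency lies at the center of the matrix; the function names `to_frequency_domain` and `to_spatial_domain` are hypothetical and are not taken from the description.

```python
import numpy as np

def to_frequency_domain(slice_2d):
    """Return (module, phase) of the centered 2D FFT of a slice."""
    spectrum = np.fft.fftshift(np.fft.fft2(slice_2d))
    return np.abs(spectrum), np.angle(spectrum)

def to_spatial_domain(module, phase):
    """Rebuild the image from module and phase with the inverse FFT."""
    spectrum = module * np.exp(1j * phase)
    image = np.fft.ifft2(np.fft.ifftshift(spectrum))
    return np.real(image)

# Synthetic stand-in for a 2D MRI slice (assumption: values in [0, 1]).
rng = np.random.default_rng(0)
slice_2d = rng.random((64, 64))

module, phase = to_frequency_domain(slice_2d)
restored = to_spatial_domain(module, phase)  # unaltered round trip

# Scaling only the central (zero-frequency) coefficient changes the
# overall intensity level while the phase is left untouched.
boosted = module.copy()
boosted[32, 32] *= 1.2
brighter = to_spatial_domain(boosted, phase)
```

In this toy case the unaltered round trip reproduces the input, and scaling the central coefficient by 1.2 scales the mean intensity by the same factor, which is consistent with the statement that the center of the frequency domain carries the general contrast information.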

The proposed harmonization methodology can involve I) a pre-processing phase and II) a training phase.

In the pre-processing phase, the method searches for homogeneous image set clusters. The higher the similarity, the more solid the harmonization pattern that the algorithm will learn. If DICOM tags are available, images are chosen based on them, as DICOM tags provide the most reliable information on tissue representation similarity. Images are then normalized between 0 and 1 and classified according to their DICOM tags (repetition time, echo time, magnetic field, inversion field [if applied]). Alternatively, if DICOM tags are not available, a comparison of histogram plots of the intensities of the voxels to be harmonized is used for image selection. Lesions and any other variable information that could affect the histogram must be discarded.
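A minimal sketch of this selection step is given below, under several assumptions. The tags are assumed to have been read already into plain dictionaries keyed by the standard DICOM keywords `RepetitionTime`, `EchoTime`, `MagneticFieldStrength` and `InversionTime`; the `cases` structure and the function names are hypothetical. The description does not specify how histograms are compared, so histogram intersection is used here purely as one possible similarity measure.

```python
from collections import defaultdict

import numpy as np

TAG_KEYS = ("RepetitionTime", "EchoTime", "MagneticFieldStrength", "InversionTime")

def normalize01(volume):
    """Min-max normalize to [0, 1]; a constant volume maps to zeros."""
    lo, hi = float(volume.min()), float(volume.max())
    if hi == lo:
        return np.zeros_like(volume, dtype=float)
    return (volume - lo) / (hi - lo)

def group_by_tags(cases, keys=TAG_KEYS):
    """Group case identifiers that share the same acquisition tags."""
    groups = defaultdict(list)
    for case_id, tags in cases.items():
        groups[tuple(tags.get(key) for key in keys)].append(case_id)
    return dict(groups)

def histogram_similarity(vol_a, vol_b, bins=64):
    """Histogram intersection in [0, 1]; 1 means identical histograms."""
    hist_a, _ = np.histogram(normalize01(vol_a), bins=bins, range=(0.0, 1.0))
    hist_b, _ = np.histogram(normalize01(vol_b), bins=bins, range=(0.0, 1.0))
    hist_a = hist_a / hist_a.sum()
    hist_b = hist_b / hist_b.sum()
    return float(np.minimum(hist_a, hist_b).sum())

# Hypothetical tag values for three cases; "a" and "b" share a protocol.
cases = {
    "a": {"RepetitionTime": 500, "EchoTime": 15, "MagneticFieldStrength": 3.0},
    "b": {"RepetitionTime": 500, "EchoTime": 15, "MagneticFieldStrength": 3.0},
    "c": {"RepetitionTime": 4000, "EchoTime": 100, "MagneticFieldStrength": 3.0},
}
largest_cluster = max(group_by_tags(cases).values(), key=len)

# Fallback when no tags are available: compare intensity histograms.
rng = np.random.default_rng(0)
volume = rng.random((4, 32, 32))
same = histogram_similarity(volume, volume)
different = histogram_similarity(volume, volume ** 4)
```

Taking the largest tag group as the homogeneous subset is also an assumption; in practice the choice may additionally depend on image quality, as discussed in the text.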

A visual exploration of all the selected cases can also be performed, to discard those with artifacts of any kind that could have distorted the information, or that might not be appropriate as ground truth. Likewise, a visual inspection can also be performed to select the images with the highest quality, as well as those that are more likely to be reactive to AI. Images with a magnetic field strength of 3T are preferred.

For each 2D slice of the subset of selected homogeneous images, a panel of different alterations in the frequency domain or k-space is applied, aiming to recreate a realistic variation of contrasts and textures (Fig. 1). Around 300 2D slices are enough to generate a sufficient number of alterations and an adequate training dataset. For each 2D slice, in an embodiment, 60-80 alterations are generated (see 5 of them in Fig. 2), depending on the investigator's choice (e.g., the importance put on a specific kind of alteration to make the architecture learn specific patterns).

In an embodiment, the phase and module are obtained from the transformation of the image into the k-space or frequency domain. In particular, the proposed method only makes use of the module. The method then generates a mask composed by a representation of the module in the frequency domain normalized between 0 and 1, and another centered round binary mask positioned at the center of the frequency domain with a radius varying from 1 to 5% of the image size and taking values from 1 to 1.5, both the size and the value increasing independently on each iteration. Afterwards, addition and normalization of the two generated masks is performed, particularly giving more weight to the round mask, as its maximum value is higher. One or more mathematical operations (additions or subtractions) can also be executed on the pixels of the image's module (frequency domain) affected by the final mask, resulting in a non-linear increase or decrease in contrast. Thus, i) an increase will result in the augmentation of the perceptible luminance; ii) a decrease will result in the attenuation of the perceptible luminance; iii) a replacement by zero will result in a more extreme attenuation of the perceptible luminance.
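The construction of the two masks and of the final mask can be sketched as follows. This is a hedged illustration rather than the actual implementation: it assumes that the radius is expressed as a fraction of the smaller image side, that the round mask is zero outside the disk, and that min-max scaling is used for normalization. The names `round_mask`, `normalize01` and `final_mask` are hypothetical.

```python
import numpy as np

def normalize01(array):
    """Min-max normalize to [0, 1]; a constant array maps to zeros."""
    lo, hi = float(array.min()), float(array.max())
    if hi == lo:
        return np.zeros_like(array, dtype=float)
    return (array - lo) / (hi - lo)

def round_mask(shape, radius_fraction, value):
    """Centered disk holding `value` inside and 0 outside."""
    rows, cols = shape
    yy, xx = np.ogrid[:rows, :cols]
    cy, cx = rows // 2, cols // 2
    radius = radius_fraction * min(rows, cols)  # assumption: smaller side
    inside = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    return np.where(inside, float(value), 0.0)

def final_mask(module, radius_fraction, value):
    """Add the normalized module and the round mask, then renormalize."""
    combined = normalize01(module) + round_mask(module.shape, radius_fraction, value)
    return normalize01(combined)

# Synthetic module: magnitude of the centered FFT of a random slice.
rng = np.random.default_rng(0)
module = np.abs(np.fft.fftshift(np.fft.fft2(rng.random((100, 100)))))

disk = round_mask(module.shape, radius_fraction=0.05, value=1.5)
mask = final_mask(module, radius_fraction=0.05, value=1.5)
```

Because the disk value (1 to 1.5) is at least as large as the maximum of the normalized module, the central region dominates the sum, so after renormalization the final mask peaks at the center of the frequency domain.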

The alterations previously described can be presented as inputs in training time using original images as ground truths. The balance between higher and lower contrasts is determined by the (most uniform) subset of samples that are intended for harmonization (i.e. the images having more similar acquisition parameters regarding pulse sequence, TE and TR among them). If images are in the upper distribution of intensity levels, the alteration balance is oriented to generate lower intensity alterations. Contrarily, for images in the lower or mid distribution of intensity levels, the alteration balance is oriented to generate upper intensity alterations. This helps the network to learn to generate the ground truth from input images containing any variation possible. Each alteration paired with its original ground truth is used to train the cited deep learning model (e.g. based on a CNN architecture).

Fig. 3 illustrates another embodiment of the proposed method to obtain contrast altered images. In this particular embodiment, the alterations were generated by performing the following steps: (1) The image was transformed into the frequency domain by applying the Fast Fourier transformation; as a result, both the phase and module information were obtained for each reference case. (2) Then, the image corresponding to the frequency domain was normalized to a range between 0 and 1. (3) A mask was generated to guide the alterations, defining the intensity and localization of the frequency values that were altered. The mask varied on each iteration and was composed by a round mask positioned at the center of the frequency matrix with a radius varying from 1 to 5% of the image size and taking values from 1 to 1.5; both the size and the value increased independently on each iteration, producing a wide variety of combinations. The intention behind the round mask was to regulate the weight of the middle area of the frequency domain corresponding to the low frequencies. (4) To create the final mask, the two images generated in the previous steps were added together and normalized again to a range [0-1]. The round mask makes the resulting mask hold more importance in the centered region of the frequency domain image due to being normalized by a higher value, while keeping the module's pixel distribution. (5) Next, the composed mask was multiplied by a factor within a range that was selected empirically between 1.25 and 7.5. The higher the multiplying factor, the more intense the alteration was. (6) Then, addition and subtraction operations were applied to the module of the image in the frequency domain through the mask obtained in the previous steps; the image was reconstructed back to the imaging domain after each calculation by applying the Inverse Fourier transformation to the resulting frequency domain.
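Steps (1) to (6) can be put together in a short sketch. Several points are assumptions and not statements of the described method: the exact form of the "addition and subtraction through the mask" is not specified, so the sketch adds or subtracts the module weighted by the scaled mask and clips negative values to zero; the radius is taken as a fraction of the smaller image side; and the parameter grid treats the 1% and 0.1 increments as a full set of independent combinations. All function names are hypothetical.

```python
import numpy as np

def normalize01(array):
    """Min-max normalize to [0, 1]; a constant array maps to zeros."""
    lo, hi = float(array.min()), float(array.max())
    if hi == lo:
        return np.zeros_like(array, dtype=float)
    return (array - lo) / (hi - lo)

def round_mask(shape, radius_fraction, value):
    """Centered disk holding `value` inside and 0 outside."""
    rows, cols = shape
    yy, xx = np.ogrid[:rows, :cols]
    radius = radius_fraction * min(rows, cols)
    inside = (yy - rows // 2) ** 2 + (xx - cols // 2) ** 2 <= radius ** 2
    return np.where(inside, float(value), 0.0)

def alter_slice(slice_2d, radius_fraction, value, factor, sign):
    """Return one contrast-altered version of a 2D slice (sign is +1 or -1)."""
    # (1) FFT: obtain module and phase of the centered spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(slice_2d))
    module, phase = np.abs(spectrum), np.angle(spectrum)
    # (2) Normalize the module to [0, 1].
    module01 = normalize01(module)
    # (3) Round mask centered on the low frequencies.
    disk = round_mask(module.shape, radius_fraction, value)
    # (4) Add both masks and normalize again to [0, 1].
    mask = normalize01(module01 + disk)
    # (5) Scale the composed mask by the multiplying factor.
    scaled = factor * mask
    # (6) Assumed operation: add or subtract the mask-weighted module,
    #     clipping at zero, then return to the spatial domain.
    altered = np.clip(module + sign * scaled * module, 0.0, None)
    restored = np.fft.ifft2(np.fft.ifftshift(altered * np.exp(1j * phase)))
    return np.real(restored)

def parameter_grid():
    """Radius 1% to 5% in 1% steps combined with values 1.0 to 1.5 in 0.1 steps."""
    radii = [r / 100 for r in range(1, 6)]
    values = [round(1.0 + 0.1 * k, 1) for k in range(6)]
    return [(radius, value) for radius in radii for value in values]

rng = np.random.default_rng(0)
slice_2d = rng.random((64, 64))
tone_up = alter_slice(slice_2d, 0.03, 1.2, factor=1.25, sign=+1)
tone_down = alter_slice(slice_2d, 0.03, 1.2, factor=1.25, sign=-1)
```

Under these assumptions the grid yields 30 mask configurations; combining them with several multiplying factors between 1.25 and 7.5 and with both signs gives a panel of alterations per slice, whose size can be tuned by the investigator.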

By following these empirically designed steps, the original image underwent a series of transformations resulting in an altered version with adjusted contrast. The full range of transformations was applied to each reference image, generating different representations of the original image. After applying these transformations, 300 synthetic variations were generated per 2D image, making a total of 90,320 contrast altered images paired with their respective original image, used as ground truth.

Fig. 4 illustrates an embodiment of the training process where the U-Net architecture from Dewey et al. [10] is adapted by:

• Designing or using convolutional filters with a kernel size of 4x4 and a stride of 2. These filters replace the max pooling layers used to reduce image size. Using convolutional filters instead of max pooling layers also lets the network learn to perform the image dimensionality reduction in the best way possible.

• Adding an additional skip connection between the input (previous to any convolutional block) and the output (deep supervised block output). In this way, the original input is directly sent to the last layer using a 1x1 convolutional filter to adapt the dimensionality. This direct skip connection gives extra context information to the architecture at the time of generating the output. The mean square error (MSE) is used as a loss function with L2 regularizers.

• Adding or using a deep supervised block at the end. This has proven effective in generating an extra level of detail when compared to the same network without this block.
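Two of the adaptations listed above can be illustrated numerically with NumPy, without committing to any deep learning framework. The sketch below shows that a 4x4 filter applied with a stride of 2 halves the spatial size, which is why it can stand in for a max pooling layer, and it shows an MSE loss with an added L2 penalty. The zero padding of 1 pixel, the averaging kernel and the penalty weight are assumptions chosen for the example and are not stated in the description.

```python
import numpy as np

def conv2d_strided(image, kernel, stride=2, padding=1):
    """Single-channel 2D convolution (cross-correlation) with stride and zero padding."""
    padded = np.pad(image, padding, mode="constant")
    k_rows, k_cols = kernel.shape
    out_rows = (padded.shape[0] - k_rows) // stride + 1
    out_cols = (padded.shape[1] - k_cols) // stride + 1
    out = np.empty((out_rows, out_cols), dtype=float)
    for i in range(out_rows):
        for j in range(out_cols):
            window = padded[i * stride:i * stride + k_rows,
                            j * stride:j * stride + k_cols]
            out[i, j] = np.sum(window * kernel)
    return out

def mse_with_l2(prediction, target, weights, l2=1e-4):
    """Mean square error plus an L2 penalty over a list of weight arrays."""
    mse = np.mean((prediction - target) ** 2)
    penalty = l2 * sum(np.sum(w ** 2) for w in weights)
    return float(mse + penalty)

# A 4x4 averaging kernel stands in for a learned filter (assumption).
image = np.ones((8, 8))
kernel = np.full((4, 4), 1.0 / 16.0)
downsampled = conv2d_strided(image, kernel)  # shape (4, 4)

loss = mse_with_l2(np.zeros((2, 2)), np.ones((2, 2)), [np.ones((2, 2))], l2=0.5)
```

In a trained network the kernel values are learned, so the downsampling itself is optimized together with the rest of the model, which is the benefit the first bullet attributes to replacing max pooling.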

Once trained, the CNN algorithm is able to transform input images with appearance variations into harmonized images, reducing visual heterogeneity while preserving semantic information. The exclusive features that make this self-supervised harmonization method unique are:

• A more flexible and realistic range of image alterations compared with other kinds of alterations previously described, thanks to the use of the k-space.

• The lightness of the architecture in comparison with GAN architectures, which lets this approach scale faster with hardware (e.g., training 3D harmonization models is faster than with GANs).

• The possibility of modulating model outputs by choice depending on the alterations balance.

A specific use case of the present invention is described below. In this case, the starting point is a given heterogeneous dataset that needs to be harmonized. During the pre-processing, the method examines the set to select which cases are going to be used as references for the harmonization of the rest of the dataset; around 5 to 8 volumes are roughly enough to generate a solid training set. After selecting the harmonization cases according to the criteria described above (good quality and a certain homogeneity), manually and/or by using tags, the method separates the selected cases from the rest of the set to work only with them in the next steps. The alterations generation pipeline is executed on those cases. The code iterates over the reference cases. It takes 2D slices individually from each volume, and all transformations and alterations described before are performed and saved for each slice individually. Images are collected into a single array of real images and another one of altered images, in which the array of real images (original unprocessed images) will be the ground truth and the array of alterations will be the X. The resulting training set is used to generate the harmonization models, using the deep learning model oriented to image reconstruction.
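The collection of paired arrays described in this use case can be sketched as follows. The sketch assumes that all reference volumes share the same in-plane size and that each alteration is available as a callable taking and returning a 2D slice; the two simple intensity scalings used here are placeholders standing in for the frequency domain alterations, and the name `build_training_set` is hypothetical.

```python
import numpy as np

def build_training_set(volumes, alterations):
    """Pair every altered slice (X) with its original slice (ground truth).

    volumes: list of 3D arrays shaped (slices, rows, cols).
    alterations: list of callables mapping a 2D slice to an altered 2D slice.
    """
    inputs, targets = [], []
    for volume in volumes:
        for slice_2d in volume:
            for alter in alterations:
                inputs.append(alter(slice_2d))
                targets.append(slice_2d)
    return np.stack(inputs), np.stack(targets)

# Placeholder alterations (assumption): simple tone-down and tone-up.
alterations = [
    lambda s: s * 0.5,
    lambda s: np.clip(s * 1.5, 0.0, 1.0),
]

rng = np.random.default_rng(0)
volumes = [rng.random((3, 8, 8)), rng.random((3, 8, 8))]
x_altered, y_original = build_training_set(volumes, alterations)
```

With 2 volumes of 3 slices and 2 alterations, both arrays hold 12 slices, and index i of the altered array always corresponds to index i of the ground truth array.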

Various aspects of the proposed method, as described herein, may be embodied in programming. Program aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Tangible non-transitory “storage” type media include any or all of the memory or other storage for the computers, processors, or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide storage at any time for the software programming.

All or portions of the software may at times be communicated through a network such as the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer of a scheduling system into the hardware platform(s) of a computing environment or other system implementing a computing environment or similar functionalities in connection with image processing. Thus, another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links or the like, also may be considered as media bearing the software. As used herein, unless restricted to tangible “storage” media, terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.

A machine-readable medium may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Nonvolatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s), or the like, which may be used to implement the system or any of its components shown in the drawings. Volatile storage media may include dynamic memory, such as a main memory of such a computer platform. Tangible transmission media may include coaxial cables; copper wire and fiber optics, including the wires that form a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media may include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a physical processor for execution.

The present disclosure and/or some other examples have been described. According to descriptions above, various alterations may be achieved. The topic of the present disclosure may be achieved in various forms and embodiments, and the present disclosure may be further used in a variety of application programs. All applications, modifications and alterations required to be protected in the claims may be within the protection scope of the present disclosure.

The scope of the present invention is defined in the following set of claims.

Claims

1. A computer implemented method for harmonization of magnetic resonance images, comprising: selecting, by a processor, a subset of homogeneous images among a set of original multi-contrast magnetic resonance images to be harmonized and considered as ground truth; generating, by a processor, contrast and texture alterations of the homogeneous images in the selected subset by subjecting each 2D slice of the homogeneous images to different alterations in the frequency domain by performing in an iterative loop the following steps: generating a centered round binary mask positioned at a center region of the frequency domain with a radius varying from 1 to 5% of the image size and taking values from 1 to 1.5, both the size and the value increasing independently on each iteration; generating a mask composed by a module of the homogeneous images in the frequency domain normalized to a range [0 - 1]; adding together and normalizing to a range [0 - 1] the two generated masks, obtaining a new and final mask as a result; and performing addition and subtraction operations on the original module of the homogeneous images through the final mask; returning, by a processor, the generated contrast and texture alterations of the homogeneous images to the spatial domain; pairing, by a processor, each generated contrast and texture alteration of the homogeneous images in the spatial domain with its original ground truth, obtaining paired images as a result; and generating, by a processor, a training set for a deep learning model using the obtained paired images.

2. The method of claim 1, wherein the selecting step comprises checking if DICOM tags are available in the set of original multi-contrast magnetic resonance images, wherein if DICOM tags are available, the subset of homogeneous images is selected based on them, whereas if DICOM tags are not available, a comparison of histogram plot of voxels’ intensities is used for the selection of the subset.

3. The method of claim 2, further comprising performing a visual exploration on the selected subset of homogeneous images to discard artifacts or distorted information therein.

4. The method of claim 1, wherein the final mask is obtained by giving more weight on each iteration to a region corresponding to true values of the centered round binary mask increasing each step of the iteration by 0.1 for an intensity and 1% for a size of the binary value.

5. The method of claim 1, wherein the deep learning model is based on a Convolutional Neural Network, CNN, architecture.

6. The method of claim 1, wherein the deep learning model is based on a U-Net CNN architecture.

7. The method of claim 6, wherein the U-Net CNN architecture comprises at least one of: using convolutional filters with kernel size of 4×4 and a stride of 2; adding an additional skip connection between an input of a convolutional block of the U-Net CNN architecture and an output of a deep supervised block of the U-Net CNN architecture, and using a mean square error, MSE, as a loss function with L2 regularizers; adding a deep supervised block at an end of the U-Net CNN architecture.

8. A non-transitory processor-readable medium storing code representing instructions to be executed by a processor, the instructions comprising code to cause the processor to: select a subset of homogeneous images among a set of original multi-contrast magnetic medical images to be harmonized and considered as ground truth; generate contrast and texture alterations of the homogeneous images in the selected subset by subjecting each 2D slice of the homogeneous images to different alterations in the frequency domain by means of performing in an iterative loop the following steps: generating a centered round binary mask positioned at a center region of the frequency domain with a radius varying from 1 to 5% of the image size and taking values from 1 to 1.5, both the size and the value increasing independently on each iteration; generating a mask composed by a module of the homogeneous images in the frequency domain normalized to a range [0 - 1]; adding together and normalizing to a range [0 - 1] the two generated masks, obtaining a new and final mask as a result; and performing addition and subtraction operations on the original module of the homogeneous images through the final mask; return the generated contrast and texture alterations of the homogeneous images to the spatial domain; pair each generated contrast and texture alteration of the homogeneous images in the spatial domain with its original ground truth, obtaining paired images as a result; and generate a training set for a deep learning model using the obtained paired images.
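By way of non-limiting illustration only, the iterative frequency-domain alteration recited in claims 1 and 8 may be sketched in Python with NumPy as follows. The function names `centered_disk`, `normalize01` and `alter_slice` are hypothetical and do not appear in the disclosure. The sketch further assumes that "image size" means the smaller slice dimension, that the radius and the value "increasing independently" can be read as iterating over every combination of the two in the steps of claim 4, that the phase of the spectrum is kept unchanged, and that the "addition and subtraction operations ... through the final mask" can be read as adding or subtracting the module weighted by the final mask. Other readings of these steps are possible.

```python
import numpy as np

def centered_disk(shape, radius_px, value):
    """Array equal to `value` inside a centered disk and 0 elsewhere."""
    rows, cols = shape
    yy, xx = np.ogrid[:rows, :cols]
    inside = (yy - rows // 2) ** 2 + (xx - cols // 2) ** 2 <= radius_px ** 2
    return np.where(inside, value, 0.0)

def normalize01(a):
    """Linearly rescale an array to [0, 1]; a constant array maps to zeros."""
    span = a.max() - a.min()
    return (a - a.min()) / span if span > 0 else np.zeros_like(a, dtype=float)

def alter_slice(slice_2d):
    """Return a list of (altered slice, original slice) pairs for one 2D slice."""
    # Frequency domain, with the zero-frequency term shifted to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(slice_2d))
    module, phase = np.abs(spectrum), np.angle(spectrum)
    module_mask = normalize01(module)  # module normalized to [0, 1]

    pairs = []
    # ASSUMPTION: every radius (1..5 % of the size) is combined with every value (1..1.5).
    for pct in (0.01, 0.02, 0.03, 0.04, 0.05):
        # ASSUMPTION: "image size" is taken as the smaller slice dimension.
        radius_px = max(1, round(pct * min(slice_2d.shape)))
        for value in (1.0, 1.1, 1.2, 1.3, 1.4, 1.5):
            disk = centered_disk(slice_2d.shape, radius_px, value)
            final_mask = normalize01(disk + module_mask)
            for sign in (1.0, -1.0):
                # ASSUMPTION: addition/subtraction is the module weighted by the final mask.
                new_module = module + sign * final_mask * module
                altered_spectrum = new_module * np.exp(1j * phase)
                # Back to the spatial domain; the residual imaginary part is discarded.
                altered = np.fft.ifft2(np.fft.ifftshift(altered_spectrum)).real
                pairs.append((altered, slice_2d))
    return pairs
```

Under these assumptions, each 2D slice yields one paired image per combination of radius, value and operation, and the pairs collected over all slices of the selected subset would form the training set.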