
WO2025171167A1 - Methods for improved artificial intelligence prediction of a diagnosis - Google Patents

Methods for improved artificial intelligence prediction of a diagnosis

Info

Publication number
WO2025171167A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
concept
images
class
generating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2025/014842
Other languages
French (fr)
Inventor
Ashok PURI
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Nebraska Lincoln
University of Nebraska System
Original Assignee
University of Nebraska Lincoln
University of Nebraska System
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Nebraska Lincoln, University of Nebraska System filed Critical University of Nebraska Lincoln


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/10Machine learning using kernel methods, e.g. support vector machines [SVM]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/01Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/40ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to mechanical, radiation or invasive therapies, e.g. surgery, laser therapy, dialysis or acupuncture
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00ICT specially adapted for the handling or processing of medical images
    • G16H30/40ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • AI models typically are not a hundred percent accurate when deployed in the real world, regardless of how good the evaluation metrics are for the test data. These AI models are taken by users at the face value of model metrics, and currently there exists no method to check the model predictions other than review by a specialist who provides expert domain knowledge. Some of the conventional methods to improve the accuracy of model predictions include using ensemble models, multimodal models, and models with complex architecture and attention mechanisms, yet the verification of model prediction correctness is still lacking when implementing these solutions. AI models could assist in a healthcare setting by providing a diagnosis at the point of care. The more accurate the results, the better it is for patient care and the more acceptable the models are for medical professionals to use in a high-risk domain like medicine.
  • a physician may have key or specific concepts applicable to a disease classification.
  • a classification model that operates similarly in its classification and diagnosis of disease, uses user or domain expert designed concepts in an understandable manner, and improves upon the results of an AI model prediction would have broad applications.
  • Artificial Intelligence imaging models learn the concept of the classes they are trained to classify or segment. The implementation of user or domain expert defined concepts can be applied to determine and confirm the significance of these concepts in underlying model predictions when a new image is presented to the model for classification or diagnosis.
  • Images harboring such concepts are converted into vectors by the applied model and then converted into Concept Activation Vectors (CAVs).
  • an embodiment of the present disclosure provides a computer-implemented method for generating attention guided images for updating predictions made by an image classifier model implemented by one or more processors, the method including: extracting, by a computer system, feature maps from a last convolutional layer or other layers of the image classifier model, the feature maps generated by the image classifier model in response to being provided as input an image and computing as output a prediction of a diagnosis of a disease or condition; generating, by the computer system, a concept weighted feature map based on the feature maps and coefficients of a concept class associated with an image class of the image, the coefficients of the concept class generated by: providing, by the computer system, concept activations of concept images of concept classes associated with the image class and random activations of random images associated with the image class as input to train a linear classifier that is trained to separate the concept activations from the random activations; and obtaining, by the computer system, the coefficients of the concept classes.
  • the present disclosure provides the method according to any of the first to third aspects, the method further including generating, by the computer system, the concept images by extracting, from a dataset of images, portions from each image of the dataset of images based on domain knowledge provided by users, the concept images associated with the concept classes.
  • the present disclosure provides the method according to any of the first to fourth aspects, the method further including generating, by the computer system, synthetic concept images by implementing a generative adversarial network that is provided as input a training dataset of images, the synthetic concept images associated with the concept classes, wherein activations of the synthetic concept images and activations of the random images are provided as the input to train the linear classifier that is trained to separate the concept activations from the random activations to obtain the coefficients of the concept class.
  • the present disclosure provides the method according to any of the first to fifth aspects, wherein the coefficients of the concept class associated with the image class of the image are further generated by providing, by the computer system, the concept activations of the concept images of the concept classes associated with the image class and the random activations of the random images associated with the image class as the input to one or more of a support vector machine (SVM), a trained regression model, or a Ridge Regression technique.
  • the present disclosure provides the method according to any of the first to ninth aspects, wherein generating the weighted average feature map includes implementing the channel wise pooling by sum across the channels or by max pooling across the channels for the concept weighted feature map.
  • the present disclosure provides the method according to any of the first to tenth aspects, wherein generating the weighted average feature map includes implementing one-by-one (1x1) convolutions for generating different weights for each feature map of the feature maps.
  • the present disclosure provides the method according to any of the first to eleventh aspects, wherein the method further includes updating the coefficients of the concept classes by providing updated concept images activations of the concept classes to the trained linear classifier.
  • the present disclosure provides a computer system for generating attention guided images for updating predictions made by an image classifier model, the computer system including one or more hardware processors which, alone or in combination, are configured to perform a method according to any of the first to twelfth aspects.
  • the present disclosure provides the computer system according to the thirteenth aspect, wherein the method further includes generating a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition.
  • the present disclosure provides the computer system according to the thirteenth or fourteenth aspects, wherein the output of the image classifier model includes probabilities that are each associated with a given image class of a plurality of image classes, wherein the method further includes: selecting a coefficient of a second highest probability; generating another concept weighted feature map based on the coefficients of the concept class associated with the given image class of the second highest probability; generating another weighted average feature map by implementing the channel wise pooling by averaging across the channels for the another concept weighted feature map; generating another attention guided image by implementing the element wise multiplication of the another weighted average feature map with the image provided to the image classifier model; and generating another updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the another attention guided image as the input to the image classifier model.
  • the present disclosure provides the computer system according to any of the thirteenth to fifteenth aspects, wherein the method further includes generating the concept images by extracting, from a dataset of images, portions from each image of the dataset of images based on domain knowledge provided by users, the concept images associated with the concept classes.
  • the present disclosure provides the computer system according to any of the thirteenth to sixteenth aspects, wherein the method further includes generating synthetic concept images by implementing a generative adversarial network that is provided as input a training dataset of images, the synthetic concept images associated with the concept classes, wherein activations of the synthetic concept images and activations of the random images are provided as the input to train the linear classifier that is trained to separate the concept activations from the random activations to obtain the coefficients of the concept class.
  • the present disclosure provides the computer system according to any of the thirteenth to seventeenth aspects, wherein the coefficients of the concept class associated with the image class of the image are further generated by providing the concept activations of the concept images of the concept classes associated with the image class and the random activations of the random images associated with the image class as the input to one or more of a support vector machine (SVM), a trained regression model, or a Ridge Regression technique.
  • the present disclosure provides a tangible, non-transitory computer-readable medium having instructions thereon which, upon being executed by one or more processors, provide for generating attention guided images for updating predictions made by an image classifier model, according to any of the first to eighteenth aspects and by execution of the following steps: extracting feature maps from a last convolutional layer or other layers of the image classifier model, the feature maps generated by the image classifier model in response to being provided as input an image and computing as output a prediction of a diagnosis of a disease or condition; generating a concept weighted feature map based on the feature maps and a coefficient of a concept class associated with an image class of the image, the coefficient of the concept class generated by: providing concept activations of concept images of concept classes associated with the image class and random activations of random images associated with the image class as input to train a linear classifier that is trained to separate concept activations from random activations.
  • the present disclosure provides the tangible, non-transitory computer-readable medium according to the nineteenth aspect, wherein the instructions, upon being executed by the one or more processors, are further configured to execute the following step generating a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition, and wherein generating the attention guided image includes using saliency maps or class activation maps with the weighted average feature map and the image provided to the image classifier model.
  • FIG. 1 illustrates an example process for generating concept images for each class, according to embodiments of the present disclosure
  • FIG. 2 illustrates examples of images belonging to a certain class and the concept images generated for said class, according to embodiments of the present disclosure
  • FIG. 3 illustrates an example process for obtaining concept activation vectors/coefficients for each concept class, according to embodiments of the present disclosure
  • FIG. 7 depicts example Optical coherence tomography (OCT) images with overlaid heat maps which represent the region of interest used by the model during an initial diagnosis (that is incorrect), and the region of interest used by the model during a recheck or first iteration (that is correct), according to embodiments of the present disclosure;
  • FIG. 8 depicts an example user interface for presenting an initial diagnosis, a subsequent revised diagnosis, and two images that highlight the region of interest the model focused on when it made the incorrect and correct diagnosis, according to embodiments of the present disclosure;
  • Embodiments of the present disclosure provide a method and system for generating attention guided images for updating predictions made by an image classifier model. While the present disclosure is described primarily in connection with updating predictions made by an image classifier model using attention guided images, as would be recognized by a person of ordinary skill in the art, the disclosure is not so limited and inventive features apply to other scenarios such as updating predictions made by other AI models, such as large language models (LLMs), using attention guided input.
  • LLMs large language models
  • the embodiments of the present disclosure may improve the prediction results and the accuracy of the final prediction made by an AI model without using multiple or complex models, and without making the model more complex and larger.
  • domain expert defined concepts are used in a human understandable form to confirm or improve upon model prediction accuracy by checking model predictions instantly, at the point of care, without the presence of a specialist.
  • the present disclosure may provide an additional level of validation to reduce the impact or occurrence of a wrong diagnosis.
  • the present disclosure may improve the accuracy of a model’s prediction through the use of attention guided images.
  • Attention guided images may be generated based on coefficients derived from activations of the concept images of the predicted class and the feature maps extracted from the image classifier model used to generate the initial prediction; the initial image classifier model uses a given or new image as input to generate a prediction of a diagnosis of a disease or condition as output.
  • a concept weighted feature map may be used in combination with the initial image to generate, ultimately, an attention guided image that leverages domain concepts as guidance provided by the user or domain expert.
  • the attention guided image may be applied to the model (provided as input) and the model may use the attention guided image to refocus the model’s attention and recheck the model’s prior predictions.
  • the accurate predictions made by models using the attention guided images generated by embodiments of the present disclosure can also be used to design treatment strategies for patients or aid clinicians in diagnosing certain cancers, disorders, or diseases in a patient.
  • clinicians and doctors may utilize different chemotherapy treatments or other treatment options (e.g. surgery) based on the predicted diagnosis for a patient.
  • the ability to quickly and accurately predict indications of a disease or a condition represents an important improvement in the technical field of disease classification, where acting quickly to diagnose or provide particular treatments is time-critical and often lifesaving.
  • the systems and methods described herein involve the use of machine learning models, feature extraction, and generating of updated images which are computationally complex and cannot be performed in the human mind.
  • FIG. 1 illustrates an example process for generating concept images for each class, according to embodiments described herein.
  • the process 100 of FIG.1 includes selecting concepts for each class of interest at 102.
  • an expert or person with particular domain knowledge may select concepts for each class of interest 102 for diseases or conditions which can be diagnosed from OCT images of a patient.
  • concept classes of interest may include Diabetic Macular Edema (DME), Choroidal Neovascularization (CNV), Drusen, and Normal.
  • Domain knowledge may be used as user input to crop/extract from input images of each concept class (from 102) to generate a concept image at 104.
  • For example, a user may be provided with images associated with a concept class from 102 and asked to crop or extract the portion or area of the image that is most important to the image being classified as a certain concept (e.g., as DME or CNV) to generate a concept image at 104.
  • the cropped/extracted portions of input images may be used to generate concept images for each class at 106.
  • generating the concept images for each class at 106 may include padding each image (e.g., to a uniform size suitable for the model input).
  • synthetic concept images may be generated and used for obtaining coefficients of a concept class associated with an image class by training a linear classifier using the synthetic images; the synthetic images may be produced by generative adversarial networks (GANs), variational autoencoders (VAEs), DALL-E, stable diffusion, or other suitable generative algorithms.
  • FIG. 2 illustrates examples of images belonging to a certain class and the concept images generated for said class, according to embodiments described herein.
  • original images may be provided as initial images for each class of interest that correspond to one of four classes – DME, CNV, Drusen, and Normal – as depicted on the left side of FIG. 2.
  • Examples of extracted or cropped portions of the initial images are depicted on the right side of FIG. 2 and represent concept images for each concept class (e.g., DME concept 200, CNV concept 202, Drusen concept 204, and Normal concept 206).
  • an expert or a person with particular domain knowledge may be provided with a dataset of images for each class of interest, such as DME, and they can provide input, via a computer system, to extract or crop a portion of the input image (left side of FIG. 2), to generate a set of concept images for that class (e.g., 200-206).
  • the activations of the concept images extracted from the image classifier model, called the concept activations 310, may be used to train a linear classifier that is also provided activations of random images associated with an image class.
  • FIG. 3 illustrates an example process 300 for obtaining concept activation vectors/coefficients for each concept class, according to embodiments described herein.
  • the process 300 includes obtaining concept images of classes the model (AI model/image classifier model) was trained to predict at 302.
  • a way to obtain the activations of images of each concept class includes giving the linear classifier (318) the activations of the images of the concept class under consideration (304 or 312) and using the activations of the images belonging to the other (remaining) classes as random activations.
  • the table below depicts an example of the concepts generated using the above described process.
  • the process 300 includes multiple iterations where one part of the process 300 iterates through one particular concept class under consideration (e.g.304-310) and the other part of the process 300 iterates through the other remaining concept classes (e.g. 306 and 312-316). For example, if 304-310 were iterating for the concept class DME, the process 300 for 306 and 312-316 would iterate through the remaining concept classes CNV, Drusen, and Normal. This iterative process loop is repeated until each concept class is the one under consideration and the remaining classes are provided through 306 and 312-316.
  • the process 300 includes providing the concept images of one class under consideration 304 (e.g., the DME concept images).
  • the process 300 includes extracting activations 308 to generate or obtain concept activations 310 for the class under consideration 304.
  • An example of the model 306 includes ResNet18®.
  • "activations" is a general term that applies to outputs from any layer of a CNN model, like ResNet18®.
  • the feature maps described herein may be referred to as activations that are of a vector size of 512x7x7.
  • the feature maps from the last CNN layer are provided to the global average pooling (GAP) layer of the model 306 where the GAP layer converts the 512x7x7 vector to 512x1 vector.
  • This output is also referred to as activations.
  • the embodiments of the present disclosure include extracting feature maps from a last convolutional layer (before the GAP layer) for any new image given to the image classifier model (e.g. model 306) for prediction.
  • extracting activations 308 from a GAP layer for concept images includes executing or invoking functions of open source deep learning frameworks (e.g. PyTorch) which are defined to extract activations (e.g. vectors) from a specified layer (the GAP layer in this scenario) of the model 306.
  • This function may be invoked for concept images 200-206 for example.
  • the activations extracted from the concept images are referred to as the concept activations 310.
  • the concept activations belonging to the classes other than the class under consideration become the random activations. These are then both used to train a linear classifier 318 that learns to separate the concept activations 310 from the random activations 316.
  • if the DME images 200 are the concept images under consideration, then their activations are the DME concept activations, and the activations from the other concept images CNV 202, Drusen 204, and Normal 206 become the random activations.
  • the rest of the concept images and their respective activations become random activations.
  • another function call may be invoked or executed to extract the feature maps from the model 306 such as PyTorch’s hook function.
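  • As an illustration (not from the disclosure's text), the following minimal sketch shows how such GAP activations and last-convolutional-layer feature maps might be extracted from a torchvision ResNet18 using PyTorch forward hooks; the weights argument, hook names, and the random placeholder input are assumptions for the example.

```python
# Illustrative sketch: extracting the 512x7x7 last-conv feature maps and the
# 512-dim GAP activations from a torchvision ResNet18 via forward hooks.
import torch
import torchvision.models as models

model = models.resnet18(weights=None)  # e.g., a model fine-tuned on OCT images
model.eval()

captured = {}

def save_output(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

# layer4 outputs the last convolutional feature maps (N, 512, 7, 7);
# avgpool is the Global Average Pooling (GAP) layer (N, 512, 1, 1).
model.layer4.register_forward_hook(save_output("feature_maps"))
model.avgpool.register_forward_hook(save_output("gap"))

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)  # placeholder for a preprocessed image
    logits = model(image)

feature_maps = captured["feature_maps"]   # shape (1, 512, 7, 7)
activations = captured["gap"].flatten(1)  # shape (1, 512): the activations
```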
  • concept images of the remaining classes (a mix of concept images of the remaining classes) 312 (e.g., random images from the remaining concept classes) are provided to the same model 306 to extract activations 314 from this image set and generate or obtain random activations 316.
  • the activations extracted at 308 and 314 may be extracted from a last convolutional layer and/or a GAP layer just after the last convolutional layer of the model 306, from one or more convolutional layers of the model 306, derived from an averaging of the layers of the model 306 or some other mathematical operation, or each convolutional layer of the model 306 may be used for a particular class under consideration 304.
  • the computer system implementing the features of the present disclosure may back-propagate the gradients of the model 306 to feature maps of the last convolutional layer of the model 306 and implement global average pooling to obtain activations 308 and 314.
  • the process 300 includes training a linear classifier to separate concept activations 310 from random activations 316 at 318.
  • FIG. 3 depicts using a linear classifier to separate concept activations 310 from random activations 316 at 318.
  • other processes, methods, or models may be used including a Convolutional Neural Network (CNN), a Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Elastic Net, or Ridge regression, or other suitable techniques which can distinguish between positive and negative examples based on the extracted activations (310 and 316) to obtain the coefficients/weights (e.g. 320).
  • the trained linear classifier at 318 may be used to obtain coefficients of the concept for the class under consideration at 320 of the process 300.
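  • A minimal sketch of this step, assuming the GAP activations have already been extracted as arrays: a logistic-regression separator stands in for the linear classifier 318, and its learned weight vector serves as the concept coefficients 320. The array shapes and sample counts below are illustrative, not specified by the disclosure.

```python
# Illustrative sketch: training a linear classifier to separate concept
# activations (310) from random activations (316) and reading out the
# coefficients (320). The disclosure also permits SVM, LDA, Elastic Net,
# or Ridge regression as the separator.
import numpy as np
from sklearn.linear_model import LogisticRegression

# concept_acts: (n_concept, 512) GAP activations of the class under consideration
# random_acts:  (n_random, 512) GAP activations of the remaining concept classes
concept_acts = np.random.randn(100, 512)  # placeholders for extracted activations
random_acts = np.random.randn(300, 512)

X = np.vstack([concept_acts, random_acts])
y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])

clf = LogisticRegression(max_iter=1000).fit(X, y)
concept_coefficients = clf.coef_.reshape(512)  # 512x1 coefficient vector (320)
```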
  • FIG. 4 illustrates an example process 400 for training a classifier model and extracting feature maps and activations from the classifier model, according to embodiments described herein.
  • the process 400 of FIG. 4 includes obtaining training images 402. Examples in the present disclosure utilize retinal OCT images as the training images for training a classifier model.
  • Classifier Model 406 may include ResNet18®, CNNs, Recurrent Neural Networks (RNNs), Long Short-Term Memory Networks (LSTMs), Transformers, Autoencoders, Principal Component Analysis (PCA), SVMs, Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), or Decision Trees.
  • the process 400 includes pre-processing the training images at 404. Pre-processing the training images at 404 may include removing noise or outliers from the training images or using previously annotated or labeled images to train the Classifier Model 406.
  • Pre-processing the training images at 404 may include resizing the images to a size appropriate for the given classifier model (for example, resizing to 224x224 for ResNet18®), implementing random rotation, horizontal and vertical flips, contrast variations, or other pre-processing steps to augment the training images 402 to obtain a robust trained model 406.
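  • The following is a hedged sketch of such a pre-processing pipeline using torchvision transforms; the specific augmentation parameters are assumptions, as the disclosure does not fix them.

```python
# Illustrative sketch of the pre-processing/augmentation described at 404.
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.Resize((224, 224)),          # size expected by ResNet18
    transforms.RandomRotation(degrees=10),  # random rotation
    transforms.RandomHorizontalFlip(),      # horizontal flip
    transforms.RandomVerticalFlip(),        # vertical flip
    transforms.ColorJitter(contrast=0.2),   # contrast variation
    transforms.ToTensor(),
])
```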
  • the process 400 includes training the Classifier Model at 406 using the pre-processed training images from 404 to classify images and/or make predictions of a diagnosis of a condition or disease based on the images.
  • the process 400 includes evaluating the output of the model at 406 and tuning the hyperparameters of the model at 408.
  • training the model 406 may involve multiple feedback loops to update the weights or hyperparameters of the model at 408 until a best trained model is obtained at 410.
  • the process 400 of FIG. 4 also includes using the best trained model 410 to obtain predictions and probabilities of the predictions of a diagnosis of a condition or a disease at 412.
  • the feature maps may be extracted at 414 and activations may be extracted at 416.
  • the feature maps may be extracted 414 prior to batch normalization being executed for the model 410 or for the convolutional layers of the model 410.
  • feature maps are typically extracted after batch normalization, but they may be extracted before it as well.
  • the output of each layer of the model (306) (also called activations) becomes the input to the next layer.
  • during training, the input to each layer is in the form of batches of activations.
  • These activations are normalized by subtracting the batch mean and dividing by batch standard deviation. This helps to accelerate training of the deep neural networks like ResNet18.
  • the feature maps output from the last convolutional layer also undergo this process of batch normalization before they are given to the Global Average Pooling layer.
  • the learned features during model training are present in the batch normalization layer.
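  • For reference, standard batch normalization computes, for each activation $x_i$ in a batch $B$,

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y_i = \gamma\,\hat{x}_i + \beta,$$

where $\mu_B$ and $\sigma_B^2$ are the batch mean and variance, $\epsilon$ is a small constant for numerical stability, and $\gamma$ and $\beta$ are learned scale and shift parameters.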
  • a trained model 502 may be provided with a new image or input image 504 for generating an initial/first prediction 506 of a diagnosis for a disease or condition.
  • the trained model 502 may be an image classifier model such as ResNet18®
  • the input image 504 may be an OCT image for a patient
  • the initial/first prediction 506 may indicate that the patient is suffering from DME.
  • the trained model 502 may be trained in accordance with the process described above with reference to FIG. 4.
  • the trained model 502 may output a list of probabilities (510) of each label or class in the dataset used for model training. The class of highest probability is used for the initial/first prediction 506.
  • the prediction of the diagnosis (506) is associated with the class of highest probability from the above described list of probabilities that are obtained.
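  • A small sketch of how the probability list 510 and the top two classes might be obtained from the model logits, reusing the `logits` from the earlier hook sketch; the class ordering below is an assumption for the four-class OCT example.

```python
# Illustrative sketch: class probabilities (510) and the classes of highest
# (512) and second highest (514) probability from the trained model 502.
import torch

classes = ["CNV", "DME", "Drusen", "Normal"]  # ordering is an assumption
probs = torch.softmax(logits, dim=1)          # list of probabilities per class
top2 = torch.topk(probs, k=2, dim=1)

first_pred = classes[int(top2.indices[0, 0])]   # initial/first prediction (506)
second_pred = classes[int(top2.indices[0, 1])]  # class of second highest probability
```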
  • the flow chart 500 includes obtaining feature maps from a last convolutional layer before or after batch normalization at 508.
  • while FIG. 5A depicts obtaining the feature maps from a last convolutional layer of the trained model 502 after generating the initial/first prediction 506, embodiments of the present disclosure are not limited to just this layer and may obtain feature maps from other convolutional layers or a particular convolutional layer of the trained model 502.
  • a computer system implementing embodiments of the present disclosure may obtain probabilities of each class at 510, predicted or generated by the trained model 502, that are associated with the feature maps after generating the initial/first prediction 506.
  • the flow chart 500 includes operations to be performed for a first updated prediction and a second updated prediction that correspond to using the class of the highest probability 512 and the class of the second highest probability 514, respectively.
  • the flowchart 500 includes generating a concept weighted feature map based on the extracted feature maps and the coefficients of the concept class, for both the class of the highest probability 516 and the class of the second highest probability 518, each with its own feature maps and coefficients of the concept of the class of interest.
  • the trained model 502 may be an example of the model 306 of FIG. 3.
  • the coefficients of the linear classifier for activations of concept class images are obtained 320 as described above.
  • the examples described herein utilize four concept classes, so there are four linear classifiers and four coefficients are obtained, one for each concept of DME, CNV, Drusen, and Normal. These coefficients are vectors of dimension 512x1.
  • the computer system implementing the embodiments of the present disclosure utilizes two outputs – the first being the list of predictions of the model 502 (considering the class/diagnosis it belongs to, which is the class of the highest probability 512, and the class of the second highest probability 514), and the second being the feature maps 508 (512x7x7) from the last convolutional layer of the trained model 502.
  • the predictions of the model 502 may be in the form of a list of probabilities for each class the model 502 is trained to classify.
  • the initial/first prediction 506 is taken as the class with the highest probability.
  • the feature maps extracted 508 are multiplied with the coefficients of the concept defined for the class of highest probability 512 in the first updated prediction (first iteration or first re-check 534), and in the second iteration or second re-check 536 the same feature map is multiplied with the coefficients of the concept defined for the class of the second highest probability 514.
  • the first element of the coefficients multiplies with each element of the first 7x7 matrix of the feature map, the second with the second, and so on, until the 512x7x7 vector is generated.
  • the 512x7x7 vector generated using the described process represents the concept weighted feature maps 516 and 518 because it was multiplied with the coefficients of the linear classifier for that concept.
  • these concept weighted feature maps 516 and 518 are averaged channel wise to generate average feature maps 520 of a size 1x7x7.
  • these average feature maps 520 are then resized 522 to allow element wise multiplication with the image 504 given to the model 502. This generates a map of size image height x image width x 1 (height, width, channel).
  • this resized feature map 522 is normalized 524 using min max scaling. Since the resized, normalized feature map has only one channel and the original image 504 is in three channels (or one or more channels), the resized feature map is converted into three channels 526. This allows its multiplication with each channel of the original image 504 to obtain the first attention guided image 528, which is used for the first iteration re-check at 532 and 534. A sketch of these steps follows below.
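  • Putting steps 516-528 together, a minimal sketch, assuming the 512x7x7 feature maps and 512-dim concept coefficients from the earlier sketches; the function and variable names are illustrative, not from the disclosure.

```python
# Illustrative sketch of steps 516-528: concept weighting, channel-wise
# averaging, resizing, min-max normalization, channel expansion, and
# element-wise multiplication with the original image.
import torch
import torch.nn.functional as F

def attention_guided_image(image, feature_maps, coefficients):
    # image:        (3, H, W) tensor given to the model (504)
    # feature_maps: (512, 7, 7) from the last convolutional layer (508)
    # coefficients: (512,) concept coefficients of the class of interest (320)
    weighted = feature_maps * coefficients.view(-1, 1, 1)  # concept weighted maps (516/518)
    avg_map = weighted.mean(dim=0, keepdim=True)           # channel-wise average, 1x7x7 (520)
    # Alternatives per the disclosure: weighted.sum(dim=0) or weighted.amax(dim=0).
    resized = F.interpolate(avg_map.unsqueeze(0),
                            size=image.shape[-2:],
                            mode="bilinear", align_corners=False)[0]  # 1xHxW (522)
    normed = (resized - resized.min()) / (resized.max() - resized.min() + 1e-8)  # (524)
    mask = normed.expand_as(image)                         # replicate to 3 channels (526)
    return image * mask                                    # attention guided image (528)
```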
  • FIG. 5B continues from FIG. 5A and includes an exemplary process for an updated first and second prediction in accordance with embodiments of the present disclosure.
  • generating the average feature map 520 includes implementing channel wise pooling by averaging across channels for the concept weighted feature map(s) (516 and 518), by implementing channel wise pooling by sum across channels for the concept weighted feature map(s) (516 and 518), or by implementing max pooling across channels for the concept weighted feature map(s) (516 and 518).
  • FIG. 5B depicts several steps for normalizing or preparing the average feature map to generate an attention guided image including resizing to a size similar to that of the original image at 522, performing normalization 524, and converting to one or more channels (e.g. three channels) 526.
  • the first iteration or first recheck 534 may update the prediction of the diagnosis from DME to CNV and the second iteration or second recheck 536 may confirm the first iteration as CNV for the predicted diagnosis.
  • the initial/first prediction 506, the first iteration or first recheck 534, the second iteration or second recheck 536, and associated attention guided images may be presented to a user via a user interface of a computer device to help provide explainability for each prediction.
  • FIG. 6 illustrates examples of defined concept classes DME, CNV, Drusen, and Normal in a given image that represent a visualization of these user defined concepts in the form of a heat map overlaid on the image, according to embodiments of the present disclosure.
  • the visualizations depicted in FIG. 6 may be for defined concepts for each class 200-206 from FIG. 2 that are obtained from the coefficients of the corresponding linear classifier in a given new image.
  • these visualizations are in the form of heat maps overlaid on the original image.
  • 600 depicts a scenario where the model is provided a new image to predict a diagnosis, and illustrates the highlighted region where the CNV concept is present in that image;
  • 602 highlights the region where the DME concept is present in the image;
  • 604 highlights the region where the Drusen concept is present in that image;
  • 606 depicts that the Normal concept is practically absent in the region of interest for that given image.
  • the heat maps for each image 600-606 represent an area of the respective class concepts in the given new image in a human understandable form. They may be present as in 600, 602, and 604, or absent as in 606 in the region of interest selected by the model.
  • the image 700 may represent an original or new image provided to an image classifier model 502 for generating an initial prediction, while image 702 may represent an attention guided image that is generated according to embodiments of the present disclosure and provided to the image classifier 502 to obtain the correct diagnosis of CNV.
  • FIG. 8 depicts an example user interface for presenting an initial diagnosis, a subsequent revised diagnosis, and two images that highlight the region of interest the model focused on when it made the incorrect and correct diagnosis, according to embodiments of the present disclosure.
  • model user interface 800 may include buttons, features, interactable objects, etc., 802 that enable a user to upload or provide an image 804, such as an OCT image, which allows an image classifier model (model) to predict an initial diagnosis for a condition or disease.
  • Several more buttons or interactable objects or portions 808 and 810 may be used for generating a revised diagnosis 812 (e.g., first iteration or first re-check, and second iteration or second re-check).
  • the generation of the revised diagnosis 812 includes generating attention guided image(s) as described herein.
  • the right portion of FIG. 8 depicts image 814 which highlights, using an overlaid heat map, the region the model focused on when it made the wrong initial diagnosis of DME, while image 816, which is also highlighted using an overlay of a heat map, depicts the region the model used to diagnose the image correctly as CNV in the second iteration.
  • FIG. 9 illustrates examples of images that correspond to an initial prediction made by a classifier model and the attention guided image provided to the model using the coefficients of concept classes, according to embodiments described herein.
  • FIG. 9 includes a first recheck, which corresponds to an updated prediction of a diagnosis of a disease or condition as described herein.
  • FIG. 9 depicts the original image 900 provided to the image classifier model that uses the image 900 to generate an initial diagnosis 506.
  • FIG. 10 includes an exemplary process 1000 which may be performed by an environment or architecture such as in FIGs. 1-9 and 11, and by systems and components presented in or described with reference to FIGs. 1-9 and 11. However, it will be recognized that any of the following blocks may be performed in any suitable order and that the process 1000 may be performed in any environment or architecture and by any suitable computing device and/or controller.
  • the image classifier model may include one of a Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory Network (LSTM), Transformer, Autoencoder, Principal Component Analysis, Support Vector Machine (SVM), Gaussian Mixture Models (GMM), Hidden Markov Models (HMM), or Decision Trees.
  • the process 1000 includes generating a concept weighted feature map based on the feature maps and a coefficient of a concept class associated with an image class of the image, the coefficient of the concept class generated by: providing concept activations of concept images of concept classes associated with the image class and random activations of random images associated with the image class as input to train a linear classifier that is trained to separate the concept activations from the random activations, and obtaining coefficients of the concept classes.
  • the concept images may be provided to the image classifier model for obtaining activations from a Global Average Pooling (GAP) layer that is after a last convolutional layer of the image classifier model.
  • the activations of the concept images or concept activations are then provided as input to the linear classifier that learns to separate the concept activations from the random activations.
  • the process 1000 includes generating, by the computer system, a weighted average feature map by implementing channel wise pooling by averaging across channels for the concept weighted feature map.
  • the process 1000 includes generating, by the computer system, an attention guided image by implementing element wise multiplication of the weighted average feature map with the image provided to the image classifier model.
  • the process 1000 includes at step 1010 generating, by the computer system, an updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the attention guided image as the input to the image classifier model.
  • the computer system implementing the features of the present disclosure may generate a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition.
  • a feature map used in generating the attention guided image for updating the prediction in the initial iteration described and depicted in FIG. 10 may be associated with a highest probability for that given image class.
  • the computer system implementing the features of the present disclosure may also generate another attention guided image for generating another updated prediction that corresponds to a second iteration of the diagnosis of a disease or condition. For example, a particular feature map from the feature maps may be selected that has a second highest probability.
  • Another concept weighted feature map may be generated based on the particular feature map and the coefficients of the concept class associated with the given image class.
  • Another weighted average feature map may be generated by implementing the channel wise pooling by averaging across the channels for the another concept weighted feature map.
  • Another attention guided image may be generated by implementing the element wise multiplication of the another weighted average feature map with the image provided to the image classifier model.
  • Another updated prediction of the diagnosis of the disease or condition may be generated as the output of the image classifier model that is provided the another attention guided image as input to the image classifier model.
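  • A short sketch of the two re-checks, reusing the names from the earlier sketches (`model`, `classes`, `top2`, `image`, `feature_maps`, `attention_guided_image`); `concept_coeffs`, a mapping from class name to its 512-dim coefficient tensor, is an assumed helper, not from the disclosure.

```python
# Illustrative sketch of the first and second re-check (534/536): the
# attention guided image built from the concept coefficients of the top-1
# and top-2 classes is fed back to the classifier for an updated prediction.
for rank in (0, 1):  # first iteration, then second iteration
    cls = classes[int(top2.indices[0, rank])]
    guided = attention_guided_image(image[0], feature_maps[0], concept_coeffs[cls])
    with torch.no_grad():
        updated_probs = torch.softmax(model(guided.unsqueeze(0)), dim=1)
    print(f"re-check with {cls} concept:", classes[int(updated_probs.argmax())])
```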
  • the image, the attention guided image, and the another attention guided image may each include or otherwise be associated with a different region of interest for the image classifier model.
  • the changing regions of interest may serve as a digital trail which can provide insights or explainability to users about the image classifier model’s working process and how it arrived at the prediction it outputs.
  • FIG. 11 illustrates a simplified block diagram of one or more devices or systems for generating attention guided images for updating predictions made by an image classifier model according to embodiments of the present disclosure.
  • FIG. 11 is a block diagram of an exemplary system or device 1100 that can be representative of each computing system disclosed herein.
  • the system 1100 includes a processor 1104, such as a central processing unit (CPU), and/or logic, that executes computer executable instructions for performing the functions, processes, and/or methods described herein.
  • the computer executable instructions are locally stored and accessed from a non-transitory computer readable medium, such as storage 1110, which may be a hard drive or flash drive.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Pathology (AREA)
  • Chemical & Material Sciences (AREA)
  • Surgery (AREA)
  • Urology & Nephrology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medicinal Chemistry (AREA)
  • Radiology & Medical Imaging (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Attention guided images for updating predictions made by an image classifier model are generated by extracting feature maps from a last convolutional layer or other layers of the image classifier model, where the image classifier model generates the feature maps in response to being provided as input an image and computing as output a prediction of a diagnosis. A concept weighted feature map is generated based on the extracted feature maps and coefficients of a concept class associated with an image class of the image. A weighted average feature map is generated by implementing channel wise pooling by averaging across channels for the concept weighted feature map. An attention guided image is generated by implementing element wise multiplication of the weighted average feature map with the image. An updated prediction of the diagnosis is generated by the image classifier model based on the attention guided image.

Description

METHODS FOR IMPROVED ARTIFICIAL INTELLIGENCE PREDICTION OF A DIAGNOSIS

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims benefit to Provisional Patent Application No. 62/551,828, filed on February 9, 2024, which is hereby incorporated by reference herein.

FIELD

[0002] The present disclosure relates to Artificial Intelligence (AI) and machine learning (ML), and in particular to a method, system, computer program product, and computer-readable medium for improving predictions of diagnoses generated by AI models.

BACKGROUND

[0003] AI models typically are not a hundred percent accurate when deployed in the real world, regardless of how good the evaluation metrics are for the test data. These AI models are taken by users at the face value of model metrics, and currently there exists no method to check the model predictions other than review by a specialist who provides expert domain knowledge. Some of the conventional methods to improve the accuracy of model predictions include using ensemble models, multimodal models, and models with complex architecture and attention mechanisms, yet the verification of model prediction correctness is still lacking when implementing these solutions. AI models could assist in a healthcare setting by providing a diagnosis at the point of care. The more accurate the results, the better it is for patient care and the more acceptable the models are for medical professionals to use in a high-risk domain like medicine. When diagnosing indications based on images, for example, a physician may have key or specific concepts applicable to a disease classification. A classification model that operates similarly in its classification and diagnosis of disease, uses user or domain expert designed concepts in an understandable manner, and improves upon the results of an AI model prediction would have broad applications.

[0004] Artificial Intelligence imaging models learn the concept of the classes they are trained to classify or segment. The implementation of user or domain expert defined concepts can be applied to determine and confirm the significance of these concepts in underlying model predictions when a new image is presented to the model for classification or diagnosis. Images harboring such concepts are converted into vectors by the applied model and then converted into Concept Activation Vectors (CAVs). The technique of testing with CAVs (TCAV) produces a score, called the TCAV score, which uses directional derivatives to calculate how important a user or domain expert defined concept is for the model to make its prediction. This approach demonstrates how sensitive the model prediction is to the specific concepts, as a reflection of the internal workings of the model and the influence of the concept on the prediction. The TCAV score provides a global, quantitative explanation of the importance of the concept the model uses to classify the image for predictive purposes, versus the local explainability provided by class attribution methods. The calculation and determination of the CAV and TCAV scores is described in Kim, Been, et al., "Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV)," International Conference on Machine Learning, PMLR, 2018, which is incorporated by reference in its entirety herein.
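For reference, the conceptual sensitivity and TCAV score from the cited Kim et al. paper can be written as follows (notation follows that paper, not the present disclosure): the sensitivity of class $k$ to concept $C$ at layer $l$ is the directional derivative

$$S_{C,k,l}(x) = \nabla h_{l,k}\big(f_l(x)\big) \cdot v_C^{\,l},$$

where $f_l(x)$ is the activation of input $x$ at layer $l$, $h_{l,k}$ maps layer-$l$ activations to the logit of class $k$, and $v_C^{\,l}$ is the Concept Activation Vector for concept $C$. The TCAV score is then the fraction of class-$k$ inputs $X_k$ whose sensitivity is positive:

$$\mathrm{TCAV}_{C,k,l} = \frac{\big|\{x \in X_k : S_{C,k,l}(x) > 0\}\big|}{|X_k|}.$$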
Solutions are required that improve the prediction results of an AI model, and the accuracy of its final prediction, without using multiple or complex models and without making the model more complex or larger.

SUMMARY

[0005] In a first aspect, an embodiment of the present disclosure provides a computer-implemented method for generating attention guided images for updating predictions made by an image classifier model implemented by one or more processors, the method including: extracting, by a computer system, feature maps from a last convolutional layer or other layers of the image classifier model, the feature maps generated by the image classifier model in response to being provided as input an image and computing as output a prediction of a diagnosis of a disease or condition; generating, by the computer system, a concept weighted feature map based on the feature maps and coefficients of a concept class associated with an image class of the image, the coefficients of the concept class generated by: providing, by the computer system, concept activations of concept images of concept classes associated with the image class and random activations of random images associated with the image class as input to train a linear classifier that is trained to separate the concept activations from the random activations; and obtaining, by the computer system, the coefficients of the concept classes; generating, by the computer system, a weighted average feature map by implementing channel wise pooling by averaging across channels for the concept weighted feature map; generating, by the computer system, an attention guided image by implementing element wise multiplication of the weighted average feature map with the image provided to the image classifier model; and generating, by the computer system, an updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the attention guided image as the input to the image classifier model.

[0006] In a second aspect, the present disclosure provides the method according to the first aspect, wherein the method further includes generating, by the computer system, a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition.
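To make the steps of the first aspect concrete, the following is a minimal PyTorch sketch of one possible implementation. It assumes a ResNet18-style classifier whose last convolutional layer yields 512x7x7 feature maps; the names `model`, `image`, `feature_maps`, and `coef` are illustrative stand-ins rather than identifiers from the disclosure.

```python
import torch
import torch.nn.functional as F

def attention_guided_recheck(model, image, feature_maps, coef):
    """image: (3, H, W) input tensor; feature_maps: (512, 7, 7) from the last
    conv layer; coef: (512,) concept-class coefficients from the linear
    classifier. Returns updated class probabilities for the attention
    guided image."""
    # Concept weighted feature map: scale each channel by its coefficient.
    weighted = feature_maps * coef.view(-1, 1, 1)             # (512, 7, 7)
    # Channel wise pooling by averaging across channels.
    avg_map = weighted.mean(dim=0, keepdim=True)              # (1, 7, 7)
    # Resize to the input image size and min-max normalize to [0, 1].
    avg_map = F.interpolate(avg_map.unsqueeze(0), size=image.shape[-2:],
                            mode="bilinear", align_corners=False)[0]
    avg_map = (avg_map - avg_map.min()) / (avg_map.max() - avg_map.min() + 1e-8)
    # Broadcast the single-channel map over the image channels and apply
    # element wise multiplication to obtain the attention guided image.
    guided = image * avg_map                                  # (3, H, W)
    with torch.no_grad():
        logits = model(guided.unsqueeze(0))
    return logits.softmax(dim=1)
```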
[0007] In a third aspect, the present disclosure provides the method according to the first aspect or the second aspect, wherein the output of the image classifier model includes probabilities that are each associated with a given image class of a plurality of image classes, wherein the method further includes: selecting, by the computer system, a coefficient of a second highest probability; generating, by the computer system, another concept weighted feature map based on the coefficients of the concept class associated with the given image class of the second highest probability; generating, by the computer system, another weighted average feature map by implementing the channel wise pooling by averaging across the channels for the another concept weighted feature map; generating, by the computer system, another attention guided image by implementing the element wise multiplication of the another weighted average feature map with the image provided to the image classifier model; and generating, by the computer system, another updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the another attention guided image as the input to the image classifier model, wherein the attention guided image and the another attention guided image each include a different region of interest for the image classifier model than the region of interest first used by the image classifier model to generate the output of the prediction of the diagnosis of the disease or condition in response to being provided the input of the image.

[0008] In a fourth aspect, the present disclosure provides the method according to any of the first to third aspects, the method further including generating, by the computer system, the concept images by extracting, from a dataset of images, portions from each image of the dataset of images based on domain knowledge provided by users, the concept images associated with the concept classes.

[0009] In a fifth aspect, the present disclosure provides the method according to any of the first to fourth aspects, the method further including generating, by the computer system, synthetic concept images by implementing a generative adversarial network that is provided as input a training dataset of images, the synthetic concept images associated with the concept classes, wherein activations of the synthetic concept images and activations of the random images are provided as the input to train the linear classifier that is trained to separate the concept activations from the random activations to obtain the coefficients of the concept class.

[0010] In a sixth aspect, the present disclosure provides the method according to any of the first to fifth aspects, wherein the coefficients of the concept class associated with the image class of the image are further generated by providing, by the computer system, the concept activations of the concept images of the concept classes associated with the image class and the random activations of the random images associated with the image class as the input to one or more of a support vector machine (SVM), a trained regression model, or a Ridge Regression technique.
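The sixth aspect allows the coefficients to come from several interchangeable linear models. Below is a brief scikit-learn sketch under that reading, using synthetic stand-in activations; the estimator choices and data are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.svm import LinearSVC

# Synthetic stand-ins: 50 concept and 50 random GAP activations of size 512.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, (50, 512)), rng.normal(0.0, 1.0, (50, 512))])
y = np.concatenate([np.ones(50), np.zeros(50)])

# Any of these estimators yields a coefficient vector usable as the
# concept-class coefficients (Ridge here regresses on the 0/1 labels).
estimators = {
    "linear": LogisticRegression(max_iter=1000),
    "svm": LinearSVC(),
    "ridge": Ridge(alpha=1.0),
}
coefficients = {name: est.fit(X, y).coef_.ravel()
                for name, est in estimators.items()}
```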
[0011] In a seventh aspect, the present disclosure provides the method according to any of the first to sixth aspects, wherein the image classifier model includes one of a Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory Network (LSTM), Transformer, Autoencoder, Principal Component Analysis, Support Vector Machine (SVM), Gaussian Mixture Models (GMM), Hidden Markov Models (HMM), or Decision Trees.

[0012] In an eighth aspect, the present disclosure provides the method according to any of the first to seventh aspects, wherein prior to extracting the feature maps from the last convolutional layer or other layers of the image classifier model, the method further comprises implementing a batch normalization of the feature maps.

[0013] In a ninth aspect, the present disclosure provides the method according to any of the first to eighth aspects, wherein prior to generating the attention guided image, the method further includes: converting, by the computer system, the average feature map to one or more channels; resizing, by the computer system, the weighted average feature map to a size that corresponds to the image provided as the input to the image classifier model; and normalizing, by the computer system, the weighted average feature map to generate a normalized feature map.

[0014] In a tenth aspect, the present disclosure provides the method according to any of the first to ninth aspects, wherein generating the weighted average feature map includes implementing the channel wise pooling by sum across the channels or by max pooling across the channels for the concept weighted feature map.

[0015] In an eleventh aspect, the present disclosure provides the method according to any of the first to tenth aspects, wherein generating the weighted average feature map includes implementing one-into-one convolutions for generating different weights for each feature map of the feature maps.

[0016] In a twelfth aspect, the present disclosure provides the method according to any of the first to eleventh aspects, wherein the method further includes updating the coefficients of the concept classes by providing updated concept images activations of the concept classes to the trained linear classifier.

[0017] In a thirteenth aspect, the present disclosure provides a computer system for generating attention guided images for updating predictions made by an image classifier model, the computer system including one or more hardware processors which, alone or in combination, are configured to perform a method according to any of the first to twelfth aspects.

[0018] In a fourteenth aspect, the present disclosure provides the computer system according to the thirteenth aspect, wherein the method further includes generating a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition.
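The pooling variants and per-feature-map weighting recited in the tenth and eleventh aspects above can be illustrated as follows. Note that this sketch reads "one-into-one convolutions" as 1x1 convolutions, which is an interpretation rather than something stated in the disclosure, and the tensors are stand-ins.

```python
import torch
import torch.nn as nn

weighted = torch.randn(512, 7, 7)  # stand-in concept weighted feature map

# Channel wise pooling variants: average, sum, or max across the channels.
avg_map = weighted.mean(dim=0)   # (7, 7)
sum_map = weighted.sum(dim=0)    # (7, 7)
max_map = weighted.amax(dim=0)   # (7, 7)

# A learnable alternative: a 1x1 convolution assigns each of the 512 feature
# maps its own weight before collapsing them to a single map.
one_by_one = nn.Conv2d(512, 1, kernel_size=1, bias=False)
conv_map = one_by_one(weighted.unsqueeze(0))[0, 0]  # (7, 7)
```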
[0019] In a fifteenth aspect, the present disclosure provides the computer system according to the thirteenth or fourteenth aspects, wherein the output of the image classifier model includes probabilities that are each associated with a given image class of a plurality of image classes, wherein the method further includes: selecting a coefficient of a second highest probability; generating another concept weighted feature map based on the coefficients of the concept class associated with the given image class of the second highest probability; generating another weighted average feature map by implementing the channel wise pooling by averaging across the channels for the another concept weighted feature map; generating another attention guided image by implementing the element wise multiplication of the another weighted average feature map with the image provided to the image classifier model; and generating another updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the another attention guided image as the input to the image classifier model.

[0020] In a sixteenth aspect, the present disclosure provides the computer system according to any of the thirteenth to fifteenth aspects, wherein the method further includes generating the concept images by extracting, from a dataset of images, portions from each image of the dataset of images based on domain knowledge provided by users, the concept images associated with the concept classes.

[0021] In a seventeenth aspect, the present disclosure provides the computer system according to any of the thirteenth to sixteenth aspects, wherein the method further includes generating synthetic concept images by implementing a generative adversarial network that is provided as input a training dataset of images, the synthetic concept images associated with the concept classes, wherein activations of the synthetic concept images and activations of the random images are provided as the input to train the linear classifier that is trained to separate the concept activations from the random activations to obtain the coefficients of the concept class.

[0022] In an eighteenth aspect, the present disclosure provides the computer system according to any of the thirteenth to seventeenth aspects, wherein the coefficients of the concept class associated with the image class of the image are further generated by providing the concept activations of the concept images of the concept classes associated with the image class and the random activations of the random images associated with the image class as the input to one or more of a support vector machine (SVM), a trained regression model, or a Ridge Regression technique.
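The recheck over the two most probable classes (third and fifteenth aspects) can be sketched as a short loop, reusing the hypothetical `attention_guided_recheck` helper from the sketch above; `concept_coefs`, an assumed mapping from class index to that class's concept coefficients, is likewise illustrative.

```python
import torch

def recheck_top2(model, image, feature_maps, concept_coefs):
    """Build one attention guided image per top-2 class concept and re-run
    the classifier on each (first and second iterations/re-checks)."""
    with torch.no_grad():
        probs = model(image.unsqueeze(0)).softmax(dim=1)[0]
    top2 = torch.topk(probs, k=2).indices.tolist()
    updated = []
    for cls in top2:
        guided_probs = attention_guided_recheck(
            model, image, feature_maps, concept_coefs[cls])
        updated.append(guided_probs.argmax(dim=1).item())
    return top2, updated
```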
[0023] In a nineteenth aspect, the present disclosure provides a tangible, non-transitory computer-readable medium having instructions thereon which, upon being executed by one or more processors, provide for generating attention guided images for updating predictions made by an image classifier model, according to any of the first to eighteenth aspects and by execution of the following steps: extracting feature maps from a last convolutional layer or other layers of the image classifier model, the feature maps generated by the image classifier model in response to being provided as input an image and computing as output a prediction of a diagnosis of a disease or condition; generating a concept weighted feature map based on the feature maps and a coefficient of a concept class associated with an image class of the image, the coefficient of the concept class generated by: providing concept activations of concept images of concept classes associated with the image class and random activations of random images associated with the image class as input to train a linear classifier that is trained to separate concept activations from random activations; and obtaining coefficients of the concept classes; generating a weighted average feature map by implementing channel wise pooling by averaging across channels for the concept weighted feature map; generating an attention guided image by implementing element wise multiplication of the weighted average feature map with the image provided to the image classifier model; and generating an updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the attention guided image as the input to the image classifier model.

[0024] In a twentieth aspect, the present disclosure provides the tangible, non-transitory computer-readable medium according to the nineteenth aspect, wherein the instructions, upon being executed by the one or more processors, are further configured to execute the following step: generating a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition, and wherein generating the attention guided image includes using saliency maps or class activation maps with the weighted average feature map and the image provided to the image classifier model.

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] The present disclosure will be described in even greater detail below based on the exemplary figures. The disclosure is not limited to the exemplary embodiments. All features described and/or illustrated herein can be used alone or combined in different combinations in embodiments of the disclosure. The features and advantages of various embodiments of the present disclosure will become apparent by reading the following detailed description with reference to the attached drawings, which illustrate the following:

[0026] FIG. 1 illustrates an example process for generating concept images for each class, according to embodiments of the present disclosure;

[0027] FIG. 2 illustrates examples of images belonging to a certain class and the concept images generated for said class, according to embodiments of the present disclosure;

[0028] FIG. 3 illustrates an example process for obtaining concept activation vectors/coefficients for each concept class, according to embodiments of the present disclosure;
[0029] FIG. 4 illustrates an example process for training a classifier model and extracting feature maps and activations from the classifier model, according to embodiments of the present disclosure;

[0030] FIGs. 5A and 5B illustrate an exemplary flow chart for generating attention guided images to update an initial prediction made by a classifier model using concept weighted feature maps for a class of highest probability and a class of second highest probability, according to embodiments of the present disclosure;

[0031] FIG. 6 illustrates examples of defined concept classes Diabetic Macular Edema (DME), Choroidal Neovascularization (CNV), Drusen, and Normal in a given image that represent a visualization of the concepts in the form of a heat map overlaid on the image, according to embodiments of the present disclosure;

[0032] FIG. 7 depicts example Optical coherence tomography (OCT) images with overlaid heat maps which represent the region of interest used by the model during an initial diagnosis (that is incorrect), and the region of interest used by the model during a recheck or first iteration (that is correct), according to embodiments of the present disclosure;

[0033] FIG. 8 depicts an example user interface for presenting an initial diagnosis, a subsequent revised diagnosis, and two images that highlight the region of interest the model focused on when it made the incorrect and correct diagnoses, according to embodiments of the present disclosure;

[0034] FIG. 9 illustrates examples of images that correspond to an initial prediction made by a classifier model and the attention guided image provided to the model using the coefficients of concept classes, according to embodiments of the present disclosure;

[0035] FIG. 10 illustrates an example flowchart for generating attention guided images for updating predictions made by an image classifier model, according to embodiments of the present disclosure;

[0036] FIG. 11 illustrates a simplified block diagram of one or more devices or systems for generating attention guided images for updating predictions made by an image classifier model, according to embodiments of the present disclosure; and

[0037] FIGs. 12A and 12B illustrate results for a test data set and an external data set with improved accuracy for predictions made by a model using the attention guided images generated according to embodiments of the present disclosure.

DETAILED DESCRIPTION

[0038] Embodiments of the present disclosure provide a method and system for generating attention guided images for updating predictions made by an image classifier model. While the present disclosure is described primarily in connection with updating predictions made by an image classifier model using attention guided images, as would be recognized by a person of ordinary skill in the art, the disclosure is not so limited, and the inventive features apply to other scenarios, such as updating predictions made by other AI models, such as large language models (LLMs), using attention guided input.

[0039] The embodiments of the present disclosure may improve the prediction results of an AI model, and the accuracy of its final prediction, without using multiple or complex models and without making the model more complex or larger.
In some embodiments, domain expert defined concepts are used in a human understandable form to confirm or improve upon model prediction accuracy by checking model predictions instantly, at the point of care, without the presence of a specialist. The present disclosure may provide an additional level of validation to reduce the impact or occurrence of a wrong diagnosis. In embodiments, the present disclosure may improve the accuracy of a model’s prediction through the use of attention guided images. Attention guided images may be generated based on coefficients derived from activations of the concept images of the predicted class and the feature maps extracted from the image classifier model used to generate the initial prediction, where the image classifier model uses a given or new image as input to generate a prediction of a diagnosis of a disease or condition as output. A concept weighted feature map may be used in combination with the initial image to generate, ultimately, an attention guided image that leverages domain concepts as guidance provided by the user or domain expert. The attention guided image may be provided as input to the model, and the model may use the attention guided image to refocus the model’s attention and recheck the model’s prior predictions.

[0040] The attention guided images may provide scope to detect and re-direct any misdirected attention of a classifier model created by an adversarial attack. An adversarial attack may be a deliberate manipulation of machine or deep learning models by introducing crafted input data with malicious or willful intent. In embodiments, the present disclosure may be used to detect and correct such misdirected attention.

[0041] The systems and methods according to embodiments of the present disclosure provide for more accurate predictions for the diagnosis of diseases or conditions by refocusing a model (such as an image classifier model) through the use of generated attention guided images. Previously provided methods in the art for improving the accuracy of predictions made by AI models do not utilize attention guided images which refocus the model to update its prediction, and instead rely on implementing more complex models on top of complex models or providing feedback loops over multiple iterations to improve the model itself. The attention guided images generated in accordance with embodiments of the present disclosure can be used to quickly update the predictions made by a model and improve the predictions made by the model absent any additional training, thereby reducing computer resource usage and providing easier-to-understand explainability for the model and its updated predictions. The accurate predictions made by models using the attention guided images generated by embodiments of the present disclosure can also be used to design treatment strategies for patients or aid clinicians in diagnosing certain cancers, disorders, or diseases in a patient. For example, clinicians and doctors may utilize different chemotherapy treatments or other treatment options (e.g., surgery) based on the predicted diagnosis for a patient. The ability to quickly and accurately predict indications of a disease or a condition represents an important improvement in the technical field of disease classification, where acting quickly to diagnose or provide particular treatments is time-critical and often lifesaving.
[0042] The systems and methods described herein involve the use of machine learning models, feature extraction, and the generation of updated images, which are computationally complex and cannot be performed in the human mind. Additionally, computing feature maps and coefficients and updating predictions of a diagnosis made by an AI model include computationally complex steps which cannot be performed in the human mind, as determining how to update a prediction made by an AI model is already a complex task.

[0043] FIG. 1 illustrates an example process for generating concept images for each class, according to embodiments described herein. The process 100 of FIG. 1 includes selecting concepts for each class of interest at 102. For example, an expert or person with particular domain knowledge may select concepts for each class of interest 102 for diseases or conditions which can be diagnosed from OCT images of a patient. To continue the example, assume four concept classes of interest: Diabetic Macular Edema (DME), Choroidal Neovascularization (CNV), Drusen, and Normal. Domain knowledge may be used as user input to crop/extract from input images of each concept class (from 102) to generate a concept image at 104. For example, a user may be provided with images associated with a concept class from 102 and asked to crop or extract the portion or area of the image that is most important to the image being classified as a certain concept (e.g., as DME or CNV). The cropped/extracted portions of input images may be used to generate concept images for each class at 106.

[0044] In embodiments, generating the concept images for each class at 106 may include padding each image (e.g., with OCT image background) to match the size of the original image (104) or the input image size for each model that will execute the classifying, such as an image classifier model; a minimal sketch of this cropping and padding is shown after this paragraph. In some instances, preserving the aspect ratio between the concept images may help in utilizing some concept classes. In some embodiments, synthetic concept images may be generated and used for obtaining coefficients of a concept class associated with an image class by training a linear classifier using the synthetic images. For example, generative adversarial networks (GANs), variational autoencoders (VAEs), DALL-E, stable diffusion, or other suitable algorithms may be provided, as input, a training dataset of images, and generate the concept images for each class of interest. The generative models described above may be trained using concept images to generate any number of concept images for each class of interest. Using synthetically generated concept images can help increase the dataset size and add more variations as needed. The generative models may also be used to generate subsets of concept images. For example, in the case of DME there may be concept images of small cysts, very small cysts, large cysts, etc. These types of subsets can also be generated absent the use of synthetic concept images. In embodiments, the activations of synthetic concept images and the activations of synthetic random images may be provided as input to train a linear classifier that is trained to separate the concept activations from the random activations to obtain the coefficients of the concept class or for each concept class.
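The cropping and padding just described might look like the following PIL sketch. The function name, the uniform background value standing in for OCT background, and the assumption that the source image already matches the model input size are all illustrative.

```python
from PIL import Image

def make_concept_image(src_path, box, out_size=(224, 224), background=0):
    """Crop the expert-selected region (box = left, upper, right, lower) and
    pad it back onto a background canvas of the model's input size, pasting
    at the original offset so location and scale are preserved."""
    patch = Image.open(src_path).convert("L").crop(box)
    canvas = Image.new("L", out_size, color=background)
    canvas.paste(patch, (box[0], box[1]))
    return canvas
```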
[0045] FIG. 2 illustrates examples of images belonging to a certain class and the concept images generated for said class, according to embodiments described herein. For example, with reference to FIG. 1 above, original images (retinal OCT images) may be provided as initial images for each class of interest that correspond to one of four classes – DME, CNV, Drusen, and Normal – as depicted on the left side of FIG. 2. Examples of extracted or cropped portions of the initial images are depicted on the right side of FIG. 2 and represent concept images for each concept class (e.g., DME concept 200, CNV concept 202, Drusen concept 204, and Normal concept 206). As described above with reference to FIG. 1, an expert or a person with particular domain knowledge may be provided with a dataset of images for each class of interest, such as DME, and they can provide input, via a computer system, to extract or crop a portion of the input image (left side of FIG. 2) to generate a set of concept images for that class (e.g., 200-206). As will be described in more detail below with reference to FIG. 3, the activations of the concept images extracted from the image classifier model, called the concept activations 310, may be used to train a linear classifier that is also provided activations of random images associated with an image class. The trained linear classifier may be trained to separate concept activations from random activations, and coefficients of the concept classes may be obtained using the trained linear classifier.

[0046] FIG. 3 illustrates an example process 300 for obtaining concept activation vectors/coefficients for each concept class, according to embodiments described herein. The process 300 includes obtaining concept images of the classes the model (AI model/image classifier model) was trained to predict at 302. In an exemplary embodiment, a way to obtain the activations of images of each concept class includes giving the linear classifier (318) the activations of the images of the concept class under consideration (304 or 312) and using the activations of the images belonging to the other (remaining) classes as random activations. The table below depicts an example of the concepts generated using the above described process. Table 1

[0047] The process 300 includes multiple iterations, where one part of the process 300 iterates through one particular concept class under consideration (e.g., 304-310) and the other part of the process 300 iterates through the other remaining concept classes (e.g., 306 and 312-316). For example, if 304-310 were iterating for the concept class DME, the process 300 for 306 and 312-316 would iterate through the remaining concept classes CNV, Drusen, and Normal. This iterative process loop is repeated until each concept class has been the one under consideration and the remaining classes have been provided through 306 and 312-316. The process 300 includes providing the concept images of one class under consideration 304 (i.e., the input is concept images for a particular class under consideration) to an AI model or image classifier model 306. The process 300 includes extracting activations 308 to generate or obtain concept activations 310 for the class under consideration 304. An example of the model 306 includes ResNet18®. As used herein, the term “activations” is a general term that applies to outputs from any layer of a CNN model, like ResNet18®. The feature maps described herein may be referred to as activations that are of a vector size of 512x7x7.
The feature maps from the last CNN layer are provided to the global average pooling (GAP) layer of the model 306, where the GAP layer converts the 512x7x7 vector to a 512x1 vector. This output is also referred to as activations. Embodiments of the present disclosure include extracting feature maps from a last convolutional layer (before the GAP layer) for any new image given to the image classifier model (e.g., model 306) for prediction. In embodiments, extracting activations 308 from a GAP layer for concept images includes executing or invoking functions of open source deep learning frameworks (e.g., PyTorch) which are defined to extract activations (e.g., vectors) from a specified layer (the GAP layer in this scenario) of the model 306. This function may be invoked for concept images 200-206, for example. The activations extracted from the concept images are referred to as the concept activations 310. The concept activations belonging to the classes other than the class under consideration become the random activations. These are then both used to train a linear classifier 318 that learns to separate the concept activations 310 from the random activations 316. For example, when the DME images 200 are the concept images under consideration, their activations are the DME concept activations, and the activations from the other concept images CNV 202, Drusen 204, and Normal 206 become the random activations. For each class of concept images and their activations, the rest of the concept images and their respective activations become random activations. In embodiments, another function call may be invoked or executed to extract the feature maps from the model 306, such as PyTorch’s hook function.

[0048] As described above, concept images of the remaining classes (a mix of concept images of the remaining classes) 312 (i.e., the input is random images from the remaining concept classes) are provided to the same model 306 to extract activations 314 from this image set and generate or obtain random activations 316. In embodiments, the activations extracted at 308 and 314 may be extracted from a last convolutional layer and/or a GAP layer which is just after the last convolutional layer of the model 306, from one or more convolutional layers of the model 306, derived from an averaging of the layers of the model 306 or some other mathematical operation, or each convolutional layer of the model 306 may be used for a particular class under consideration 304. In some embodiments, the computer system implementing the features of the present disclosure may back-propagate the gradients of the model 306 to feature maps of the last convolutional layer of the model 306 and implement global average pooling to obtain activations 308 and 314.
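One way to perform the extraction described above is with PyTorch forward hooks. The following sketch captures both the last-convolutional-block feature maps and the GAP activations of a ResNet18, using an untrained torchvision model and a random input as stand-ins for the trained classifier and a concept image.

```python
import torch
import torchvision.models as models

model = models.resnet18(weights=None)  # stand-in for the trained classifier
model.eval()

captured = {}

def save_output(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

# layer4 is the last convolutional block of ResNet18; avgpool is the GAP layer.
model.layer4.register_forward_hook(save_output("feature_maps"))  # (N, 512, 7, 7)
model.avgpool.register_forward_hook(save_output("activations"))  # (N, 512, 1, 1)

with torch.no_grad():
    _ = model(torch.randn(1, 3, 224, 224))  # stand-in concept image

feature_maps = captured["feature_maps"][0]           # 512x7x7
activations = captured["activations"].flatten(1)[0]  # 512-dimensional vector
```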
[0049] The process 300 includes training a linear classifier to separate concept activations 310 from random activations 316 at 318. Although FIG. 3 depicts using a linear classifier to separate concept activations 310 from random activations 316 at 318, other processes, methods, or models may be used, including a Convolutional Neural Network (CNN), a Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Elastic Net, Ridge regression, or other suitable techniques which can distinguish between positive and negative examples based on the extracted activations (310 and 316) to obtain the coefficients/weights (e.g., 320). The trained linear classifier at 318 may be used to obtain coefficients of the concept for the class under consideration at 320 of the process 300. As described above, the process 300 is repeated or iterated through for all concepts under consideration, represented at 322, which results in obtaining coefficients for each concept class at 324 of FIG. 3. In embodiments, the coefficients of the concept classes 324 are updated by providing updated concept images of the concept classes to the trained linear classifier 318 as part of a feedback loop.
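A minimal sketch of the iteration over concept classes described in this paragraph follows, where for each concept the remaining classes' activations serve as the random set. The input `activations_by_class` is a hypothetical dict of GAP activation arrays, and the use of logistic regression is one choice among the linear models named above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_coefficients(activations_by_class):
    """activations_by_class: dict mapping concept name -> (n, 512) array of
    GAP activations. For each concept, the remaining classes' activations
    serve as the random set; the fitted coefficients are that concept's
    coefficient vector."""
    coefs = {}
    for concept, pos in activations_by_class.items():
        neg = np.vstack([a for c, a in activations_by_class.items()
                         if c != concept])
        X = np.vstack([pos, neg])
        y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        coefs[concept] = clf.coef_.ravel()  # (512,) vector
    return coefs
```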
[0050] FIG. 4 illustrates an example process 400 for training a classifier model and extracting feature maps and activations from the classifier model, according to embodiments described herein. The process 400 of FIG. 4 includes obtaining training images 402. Examples in the present disclosure utilize retinal OCT images as the training images for training a classifier model. Examples of the classifier (Classifier Model 406) may include ResNet18®, CNNs, Recurrent Neural Networks (RNNs), Long Short-Term Memory Networks (LSTMs), Transformers, Autoencoders, Principal Component Analysis (PCA), SVMs, Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), or Decision Trees. The process 400 includes pre-processing the training images at 404. Pre-processing the training images at 404 may include removing noise or outliers from the training images or using previously annotated or labeled images to train the Classifier Model 406. Pre-processing the training images at 404 may also include resizing the images to a size appropriate for the given classifier model (for example, resizing to 224x224 for ResNet18®), and implementing random rotation, horizontal and vertical flips, contrast variations, or other pre-processing steps to augment the training images 402 to obtain a robust trained model 406.

[0051] The process 400 includes training the Classifier Model at 406 using the pre-processed training images from 404 to classify images and/or make predictions of a diagnosis of a condition or disease based on the images. The process 400 includes evaluating the output of the model at 406 and tuning the hyperparameters of the model at 408. For example, training the model 406 may involve multiple feedback loops to update the weights or hyperparameters of the model at 408 until a best trained model is obtained at 410. The process 400 of FIG. 4 also includes using the best trained model 410 to obtain predictions and probabilities of the predictions of a diagnosis of a condition or a disease at 412. Once the model has generated a prediction, the feature maps may be extracted at 414 and activations may be extracted at 416. In some embodiments, the feature maps may be extracted at 414 prior to batch normalization being executed for the model 410 or for the convolutional layers of the model 410. For example, feature maps are typically extracted after batch normalization, but they may be extracted before it as well. The output of each layer of the model (also called activations) becomes the input to the next layer. Instead of using a single image as the input to the model during training, a small batch of images is used together. Therefore, each layer’s output is in the form of batches of activations. These activations are normalized by subtracting the batch mean and dividing by the batch standard deviation, which helps to accelerate training of deep neural networks like ResNet18. The feature maps output from the last convolutional layer also undergo this process of batch normalization before they are given to the Global Average Pooling layer. During the inference phase, when only a new single image is given to the model to predict, the parameters learned during model training are present in the batch normalization layer. Feature maps before this layer therefore do not have these parameters applied to them, while feature maps after the batch normalization layer do, so there is a small difference between the two. At this point the feature maps may be extracted from the last convolutional layer for the purposes of the embodiments of the present disclosure, either before batch normalization or after batch normalization.
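For reference, the standard batch normalization transform described above computes, per channel and mini-batch B (this is the textbook formulation, not notation from the disclosure):

```latex
\hat{x} = \frac{x - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y = \gamma\,\hat{x} + \beta
```

where mu_B and sigma_B^2 are the batch mean and variance, epsilon is a small constant for numerical stability, and gamma and beta are the learned parameters that, at inference time (together with running estimates of the mean and variance), are applied to the feature maps of a single image.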
[0052] FIGs. 5A and 5B illustrate an exemplary flow chart 500 for generating attention guided images to update an initial prediction made by a classifier model using concept weighted feature maps for the class of highest probability and the class of second highest probability, according to embodiments described herein. As described herein, a trained model 502 may be provided with a new image or input image 504 for generating an initial/first prediction 506 of a diagnosis for a disease or condition. For example, the trained model 502 may be an image classifier model such as ResNet18®, the input image 504 may be an OCT image for a patient, and the initial/first prediction 506 may indicate that the patient is suffering from DME. The trained model 502 may be trained in accordance with the process described above with reference to FIG. 4. In embodiments, the trained model 502 may output a list of probabilities (510) for each label or class in the dataset used for model training. The class of highest probability is used for the initial/first prediction 506. For example, the prediction of the diagnosis (506) is associated with the class of highest probability from the above described list of probabilities. The flow chart 500 includes obtaining feature maps from a last convolutional layer, before or after batch normalization, at 508. Although FIG. 5A depicts obtaining the feature maps from a last convolutional layer of the trained model 502 after generating the initial/first prediction 506, embodiments of the present disclosure are not limited to just this layer and may obtain feature maps from other convolutional layers or a particular convolutional layer of the trained model 502.

[0053] A computer system implementing embodiments of the present disclosure may obtain probabilities of each class at 510 predicted or generated by the trained model 502 that are associated with the feature maps after generating the initial/first prediction 506. As depicted in FIGs. 5A and 5B, the flow chart 500 includes operations to be performed for a first updated prediction and a second updated prediction that correspond to using the class of the highest probability 512 and the class of the second highest probability 514, respectively. The flow chart 500 includes generating a concept weighted feature map based on the extracted feature maps and the coefficients of the concept class, for both the class of the highest probability 516 and the class of the second highest probability 518, each with their own feature maps and coefficients of the concept of the class of interest. The trained model 502 may be an example of the model 306 of FIG. 3. The coefficients of the linear classifier for activations of concept class images are obtained at 320 as described above. The examples described herein utilize four concept classes, so there are four linear classifiers and four coefficient vectors are obtained, one for each concept of DME, CNV, Drusen, and Normal. These coefficients are vectors of dimension 512x1. When a new image is given to the model 502, the computer system implementing the embodiments of the present disclosure utilizes two outputs – the first being the list of predictions of the model 502 (in particular the class/diagnosis of the highest probability 512 and the class of the second highest probability 514), and the second being the feature maps 508 (512x7x7) from the last convolutional layer of the trained model 502. For example, the predictions of the model 502 may be in the form of a list of probabilities for each class the model 502 is trained to classify. The initial/first prediction 506 is taken as the class with the highest probability. The extracted feature maps 508 are multiplied with the coefficients of the concept defined for the class of highest probability 512 in the first updated prediction (first iteration or first re-check 534), and in the second iteration or second re-check 536 the same feature maps are multiplied with the coefficients of the concept defined for the class of the second highest probability 514. In some embodiments, gradients of the feature maps 508 with respect to the model’s 502 output/prediction 506, obtained by backpropagation, can be used to generate the concept weighted feature map 518 by multiplying with the coefficients to generate the attention guided images 534 and 536; a sketch of this gradient-based variant is provided after the following paragraph.

[0054] For example, if the model 502 has predicted the diagnosis as DME for a given new image (initial/first prediction 506), class DME is the class of highest probability. The coefficients of the linear classifier for the concept of DME (320) are obtained and multiplied with the feature maps extracted from the last convolutional layer for that image 508. The feature maps 508 are 512x7x7 and the coefficients are 512x1x1. The first element of the coefficients multiplies with each element of the first 7x7 matrix of the feature map, the second with the second, and so on, until the 512x7x7 vector is generated. The 512x7x7 vector generated using the described process represents the concept weighted feature maps 516 and 518, because it was multiplied with the coefficients of the linear classifier for that concept. Next, these concept weighted feature maps 516 and 518 are averaged channel wise to generate average feature maps 520 of size 1x7x7. Using the size of the original image, these average feature maps 520 are then resized 522 to match it, which allows element wise multiplication with the image 504 given to the model 502 for prediction. This would generate a map of image height x image width x 1 (height, width, channel). Next, as depicted in FIG. 5B, this resized feature map 522 is normalized 524 using min-max scaling. Since the resized, normalized feature map 522 has only one channel and the original image 504 has three channels (or one or more channels), the resized feature map 522 is converted into three channels 526. This allows its multiplication with each channel of the original image 504 to obtain the first attention guided image 528. This is used for the first iteration to re-check at 532 and 534.
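The gradient-based variant mentioned at the end of paragraph [0053] might be sketched as follows in PyTorch. Here `model.layer4` assumes a ResNet18-style classifier, and the function name is illustrative.

```python
import torch

def gradient_weighted_maps(model, image, coef):
    """Variant: weight the gradients of the last-layer feature maps (rather
    than the feature maps themselves) with the concept coefficients.
    image: (3, H, W); coef: (512,). Returns a (512, 7, 7) tensor."""
    feats = {}
    def hook(module, inputs, output):
        output.retain_grad()       # keep the gradient of this non-leaf tensor
        feats["maps"] = output
    handle = model.layer4.register_forward_hook(hook)
    logits = model(image.unsqueeze(0))
    handle.remove()
    # Backpropagate the top-class score to the captured feature maps.
    score = logits[0, logits[0].argmax()]
    score.backward()
    grads = feats["maps"].grad[0]              # (512, 7, 7)
    return grads * coef.view(-1, 1, 1)         # gradient-based concept weighting
```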
[0055] If the class of second highest probability 514 was CNV, then the same feature maps 508 are multiplied with the coefficients of the linear classifier 518 for the CNV concept image activations, and the same process is repeated to get the second attention guided image 530 that is used for the second iteration 536 for re-prediction. The model 502 prediction is in the form of a list of probabilities. The computer system implementing embodiments of the present disclosure can select the class of highest probability 512 and the class of second highest probability 514. Next, the computer system selects the corresponding coefficients of the concepts of both these classes. For example, if the highest probability 512 was the DME class, then the computer system is configured to select the coefficients of the linear classifier that was trained to separate the DME concept activations from random ones. This is the first iteration for re-prediction 534. The computer system is also configured to determine the class of the second highest probability 514, such as CNV, for example. The computer system in this scenario would select the appropriate coefficients of the CNV concept and continue with the same process of FIGs. 5A and 5B to implement the second iteration for re-prediction 536.

[0056] FIG. 5B continues from FIG. 5A and includes an exemplary process for updated first and second predictions in accordance with embodiments of the present disclosure. For example, FIG. 5B includes generating concept weighted feature maps based on coefficients of the concept class of the highest probability 516 and the second highest probability 518. The flow chart 500 includes generating an average feature map (weighted average feature map) 520 for both the concept class of the highest probability 516 and the second highest probability 518. The following description uses the same reference numerals for both processes, as the steps are similar for both concept classes. In embodiments, generating the average feature map 520 includes implementing channel wise pooling by averaging across channels for the concept weighted feature map(s) (516 and 518), by implementing channel wise pooling by sum across channels for the concept weighted feature map(s) (516 and 518), or by implementing max pooling across channels for the concept weighted feature map(s) (516 and 518).

[0057] FIG. 5B depicts several steps for normalizing or preparing the average feature map to generate an attention guided image, including resizing to a size similar to that of the original image at 522, performing normalization at 524, and converting to one or more channels (e.g., three channels) at 526. For example, the original image (new image 504) provided to the model 502 to generate the prediction 506 has a size or shape of width, height, and channels (3 for RGB images, for example). The average feature maps 520 obtained for this would be 1x7x7, which means a height of 7, a width of 7, and a channel of 1. These average feature maps 520 are resized 522 to match the size of the new image 504 given to the model 502 for prediction 506. The resized feature maps are then normalized 524 using, for example, min-max scaling, also referred to as min-max normalization. This results in each value in the resized feature map being scaled between 0 and 1.
These are then converted into 3 channels 526 (or one or more channels). This conversion allows for element wise multiplication with each of the three channels of the image given for prediction. Other methods of normalization 524 may also be used, such as Z-score normalization, Max-Abs scaling, Robust scaling, Log Transformation, and L2 normalization. The flow chart 500 includes generating an attention guided image at 528 and 530, one for the concept of the class of the highest probability 516 and one for the concept of the class of the second highest probability 518, respectively. The attention guided images 528 and 530 may be provided as input 532 back to the model 502 to obtain, as output, a first iteration or first re-check 534 and a second iteration or second re-check 536 of the initial/first prediction 506. For example, the first iteration or first re-check 534 may update the prediction of the diagnosis from DME to CNV, and the second iteration or second re-check 536 may confirm the first iteration as CNV for the predicted diagnosis. In embodiments, the initial/first prediction 506, the first iteration or first re-check 534, the second iteration or second re-check 536, and the associated attention guided images may be presented to a user via a user interface of a computer device to help provide explainability for each prediction.

[0058] FIG. 6 illustrates examples of the defined concept classes DME, CNV, Drusen, and Normal in a given image that represent a visualization of these user defined concepts in the form of a heat map overlaid on the image, according to embodiments of the present disclosure. For example, the visualizations depicted in FIG. 6 may be for the defined concepts for each class 200-206 from FIG. 2 that are obtained from the coefficients of the corresponding linear classifier in a given new image. In FIG. 6, these visualizations are in the form of heat maps overlaid on the original image. For example, 600 depicts a scenario where the model is provided a new image to predict a diagnosis, and 600 highlights the region where the CNV concept is present in that image. In FIG. 6, 602 depicts the highlights to the region where the DME concept is present in the image, 604 depicts the highlights to the region where the Drusen concept is present in that image, and 606 depicts that the Normal concept is practically absent in the region of interest for that given image. As described herein, the heat maps for each image 600-606 represent the area of the respective class concepts in the given new image in a human understandable form. The concepts may be present, as in 600, 602, and 604, or absent, as in 606, in the region of interest selected by the model.
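Heat map overlays of the kind shown in FIGs. 6-8 can be reproduced with a few lines of matplotlib. This sketch assumes a grayscale image array and a concept map already resized to the image and scaled to [0, 1], as produced earlier in the pipeline; the function name and colormap choice are illustrative.

```python
import matplotlib.pyplot as plt

def overlay_heatmap(image, concept_map, title):
    """image: (H, W) grayscale array; concept_map: (H, W) map in [0, 1],
    e.g. the resized, normalized concept weighted average feature map."""
    plt.imshow(image, cmap="gray")
    plt.imshow(concept_map, cmap="jet", alpha=0.4)  # translucent heat map
    plt.title(title)
    plt.axis("off")
    plt.show()
```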
[0059] FIG. 7 depicts example OCT images with overlaid heat maps which represent the region of interest used by the model during an initial diagnosis (that is incorrect) and the region of interest used by the model during a recheck or iteration (that is correct), according to embodiments of the present disclosure. The image on the left (700) includes an overlaid heat map which depicts the region of interest, or focus, the model used when it incorrectly predicted the image as DME. The image on the right (702) of FIG. 7 highlights, using an overlaid heat map, the region of interest in the image which the model used when it correctly predicted the image as CNV after rechecking. A clear shift in the focus of attention (e.g., region of interest) of the model is depicted in FIG. 7 between images 700 and 702. The image 700 may represent an original or new image provided to an image classifier model 502 for generating an initial prediction, while image 702 may represent an attention guided image that is generated according to embodiments of the present disclosure and provided to the image classifier 502 to obtain the correct diagnosis of CNV.

[0060] FIG. 8 depicts an example user interface for presenting an initial diagnosis, a subsequent revised diagnosis, and two images that highlight the region of interest the model focused on when it made the incorrect and correct diagnoses, according to embodiments of the present disclosure. For example, model user interface 800 may include buttons, features, interactable objects, etc., 802 that enable a user to upload or provide an image 804, such as an OCT image, which allows an image classifier model (model) to predict an initial diagnosis for a condition or disease. The initial diagnosis 806, which in this example is DME, is presented along with the image 804 via the user interface 800. Several more buttons or interactable objects or portions 808 and 810 are provided for generating a revised diagnosis 812 (e.g., a first iteration or first re-check, and a second iteration or second re-check). The generation of the revised diagnosis 812 includes generating attention guided image(s) as described herein. The right portion of FIG. 8 depicts image 814, which highlights, using an overlaid heat map, the region the model focused on when it made the wrong initial diagnosis of DME, while image 816, which is also highlighted using an overlay of a heat map, depicts the region the model used to diagnose the image correctly as CNV in the second iteration. The image 816 may be representative of an attention guided image that is generated using embodiments of the present disclosure and then provided to the model to correct or confirm its initial prediction of the diagnosis.

[0061] FIG. 9 illustrates examples of images that correspond to an initial prediction made by a classifier model and the attention guided image provided to the model using the coefficients of concept classes, according to embodiments described herein. FIG. 9 includes a first recheck, which corresponds to an updated prediction of a diagnosis of a disease or condition as described herein. FIG. 9 depicts the original image 900 provided to the image classifier model that uses the image 900 to generate an initial diagnosis 506. FIG. 9 also depicts the attention guided image 1 (902) that corresponds to the attention guided image generated by embodiments of the present disclosure given the image 900. The attention guided image 1 (902) of FIG. 9 illustrates a focus (darker areas and lighter areas) on certain areas/portions of the image which can refocus the image classifier model to potentially update its initial prediction of the diagnosis. As described above with reference to FIGs. 5A and 5B, the computer system implementing the embodiments of the present disclosure may generate another updated prediction using the second highest probability from the list of probabilities output by the model 502 associated with the feature maps of the image classifier model. This is depicted in FIG. 9 with reference to the second recheck that uses the same original image and coefficients from the concept of the class of the second highest probability to generate attention guided image 2 (904).
The attention guided image 2 (904) may be provided as input to the image classifier model to potentially update the initial prediction of a diagnosis. Each attention guided image 902 and 904 can provide a model (image classifier model) another opportunity to recheck its initial diagnosis and potentially generate a more accurate prediction based on the attention guided images 902 and 904, which refocus the model on areas of the image that are most associated with a certain concept class.

[0062] FIG. 10 illustrates an example flowchart for generating attention guided images for updating predictions made by an image classifier model, according to embodiments described herein. FIG. 10 includes an exemplary process 1000 which may be performed by an environment or architecture such as in FIGs. 1-9 and 11, and by systems and components presented in or described with reference to FIGs. 1-9 and 11. However, it will be recognized that any of the following blocks may be performed in any suitable order and that the process 1000 may be performed in any environment or architecture and by any suitable computing device and/or controller.

[0063] At step 1002, the process 1000 includes extracting feature maps from a last convolutional layer or other layers of the image classifier model, where the feature maps are generated by the image classifier model in response to being provided, as input, an image, and computing, as output, a prediction of a diagnosis of a disease or condition. In embodiments, attention guided images may be generated for updating predictions made by the image classifier model. As an example, the image classifier model may be ResNet18®, which includes four layers. In such a scenario, a computer system implementing the features described herein may be configured to extract four feature maps, each from a different convolutional layer of the image classifier model. An average feature map or weighted average feature map may be generated from the four feature maps in some instances. The image classifier model may include one of a Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory Network (LSTM), Transformer, Autoencoder, Principal Component Analysis, Support Vector Machine (SVM), Gaussian Mixture Models (GMM), Hidden Markov Models (HMM), or Decision Trees.

[0064] At step 1004, the process 1000 includes generating a concept weighted feature map based on the feature maps and a coefficient of a concept class associated with an image class of the image, the coefficient of the concept class generated by: providing concept activations of concept images of concept classes associated with the image class and random activations of random images associated with the image class as input to train a linear classifier that is trained to separate the concept activations from the random activations, and obtaining coefficients of the concept classes. In embodiments, the computer system may provide the concept images of the concept classes associated with the image class and the random images associated with the image class as input to the model 306 to obtain the concept and random activations, which are given to a linear classifier that is trained to separate concept activations from random activations in order to obtain the coefficients of the concept classes.
[0065] At step 1006, the process 1000 includes generating, by the computer system, a weighted average feature map by implementing channel wise pooling by averaging across channels for the concept weighted feature map. At step 1008, the process 1000 includes generating, by the computer system, an attention guided image by implementing element wise multiplication of the weighted average feature map with the image provided to the image classifier model. The process 1000 includes, at step 1010, generating, by the computer system, an updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the attention guided image as the input to the image classifier model. In embodiments, the computer system implementing the features of the present disclosure may generate a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition. In embodiments, generating the weighted average feature map includes implementing the channel wise pooling by sum across the channels or by max pooling across the channels for the concept weighted feature map, and/or implementing one-into-one convolutions for generating different weights for each feature map of the feature maps. In embodiments, generating the attention guided image includes using saliency maps or class activation maps with the weighted average feature map and the image provided to the image classifier model. An example of using saliency maps or class activation maps includes Gradient-weighted Class Activation Mapping (Grad-CAM).
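Continuing the same illustrative assumptions, the sketch below traces steps 1006 through 1010: the learned coefficients weight each channel of the extracted feature maps, the channels are averaged, and the resized, normalized map is multiplied element wise with the input image before re-prediction. The resize-and-normalize steps mirror the conversion of the weighted average feature map described elsewhere herein; the function and its arguments are assumptions.

```python
# Hypothetical sketch of steps 1006-1010, reusing the feature maps and
# coefficients from the previous sketch.
import torch
import torch.nn.functional as F

def attention_guided_recheck(model, image, fmaps, coefficients):
    """image: (1, 3, H, W); fmaps: (1, C, h, w); coefficients: length-C array."""
    w = torch.as_tensor(coefficients, dtype=fmaps.dtype).view(1, -1, 1, 1)
    weighted = fmaps * w                          # step 1006: concept weighted feature map
    avg = weighted.mean(dim=1, keepdim=True)      # channel wise pooling by averaging
    avg = F.interpolate(avg, size=image.shape[-2:],
                        mode="bilinear", align_corners=False)  # resize to input size
    avg = (avg - avg.min()) / (avg.max() - avg.min() + 1e-8)   # normalize to [0, 1]
    guided = image * avg                          # step 1008: element wise multiplication
    with torch.no_grad():
        return model(guided)                      # step 1010: updated prediction
```

In this sketch the normalized map acts as a soft mask, so regions most associated with the concept class dominate the re-presented image.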
[0066] In some embodiments, the feature maps are also each associated with a probability that corresponds to a given image class of a plurality of image classes. A feature map used in generating the attention guided image for updating the prediction in the initial iteration described and depicted in FIG. 10 may be associated with a highest probability for that given image class. In some instances, the computer system implementing the features of the present disclosure may also generate another attention guided image for generating another updated prediction that corresponds to a second iteration of the diagnosis of a disease or condition. For example, a particular feature map from the feature maps may be selected that has a second highest probability. Another concept weighted feature map may be generated based on the particular feature map and the coefficients of the concept class associated with the given image class. Another weighted average feature map may be generated by implementing the channel wise pooling by averaging across the channels for the another concept weighted feature map. Another attention guided image may be generated by implementing the element wise multiplication of the another weighted average feature map with the image provided to the image classifier model. Another updated prediction of the diagnosis of the disease or condition may be generated as the output of the image classifier model by providing the another attention guided image as input to the image classifier model. (An illustrative sketch combining the first and second re-checks is provided below, following the description of FIG. 11.) In such scenarios, the image, the attention guided image, and the another attention guided image may each include or otherwise be associated with a different region of interest for the image classifier model. The changing regions of interest may serve as a digital trail which can provide insights or explainability to users about the image classifier model's working process and how it arrived at the prediction it outputs. [0067] FIG. 11 illustrates a simplified block diagram of one or more devices or systems for generating attention guided images for updating predictions made by an image classifier model according to embodiments of the present disclosure. FIG. 11 is a block diagram of an exemplary system or device 1100 that can be representative of each computing system disclosed herein. The system 1100 includes a processor 1104, such as a central processing unit (CPU), and/or logic, that executes computer executable instructions for performing the functions, processes, and/or methods described herein. In some examples, the computer executable instructions are locally stored and accessed from a non-transitory computer readable medium, such as storage 1110, which may be a hard drive or flash drive. Read Only Memory (ROM) 1106 includes computer executable instructions for initializing the processor 1104, while the random-access memory (RAM) 1108 is the main memory for loading and processing instructions executed by the processor 1104. [0068] The network interface 1112 may connect to a wired network or cellular network and to a local area network or wide area network. The system 1100 may also include a bus 1102 that connects the processor 1104, ROM 1106, RAM 1108, storage 1110, and/or the network interface 1112. The components within the system 1100 may use the bus 1102 to communicate with each other. The components within the system 1100 are merely exemplary and might not be inclusive of every component for embodiments described herein. For instance, in some examples, the system 1100 might not include a network interface 1112. In embodiments, the system 1100 may include one or more components such as input-output devices which can enable electronic, optical, magnetic, and holographic communication with ROM 1106, RAM 1108, and/or storage 1110. The input-output devices can enable wireless communication via WiFi®, Bluetooth®, cellular (e.g., LTE®, CDMA®, GSM®, WiMax®, NFC®), GPS, and the like. The input-output devices can include wired and/or wireless communication pathways and may be configured to present or otherwise display predictions, updated predictions, images, attention-guided images, or generated recommendations for treatment plans associated with an updated diagnosis of a disease or condition.
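As a further non-limiting sketch, the first and second re-checks of FIG. 9 can be expressed as a single loop over the two highest-probability classes, reusing the attention_guided_recheck helper sketched above. The coeffs_by_class mapping is a hypothetical structure holding the learned concept-class coefficients per image class; none of these names reflect the disclosed implementation.

```python
# Hypothetical sketch of the two-pass re-check described for FIG. 9.
# Assumes attention_guided_recheck from the preceding sketch is in scope.
import torch

def recheck_diagnosis(model, image, fmaps, coeffs_by_class):
    """Return the updated predictions from the first and second re-checks."""
    with torch.no_grad():
        probs = torch.softmax(model(image), dim=1).squeeze(0)
    ranked = torch.argsort(probs, descending=True)
    updated = []
    for cls in ranked[:2]:   # highest, then second-highest probability class
        logits = attention_guided_recheck(model, image, fmaps,
                                          coeffs_by_class[int(cls)])
        updated.append(int(logits.argmax(dim=1)))
    return updated
```

Recording the attention map used in each pass would also yield the digital trail of changing regions of interest described above.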
[0069] FIGs. 12A and 12B illustrate results for a test data set and an external data set with improved accuracy for predictions made by a model using the attention guided images generated according to embodiments of the present disclosure. The results in FIGs. 12A and 12B depict the improved accuracy of predictions made by AI models implementing the attention guided images generated according to embodiments of the present disclosure, as well as: improvement in precision, recall, and F1 scores (F-scores or F1 scores represent a measure of predictive performance for a machine learning model's accuracy), improvement in the confusion matrix, improvement in the Area Under the Receiver Operating Characteristic curve (AU-ROC), and improvement in specificity. FIG. 12A illustrates that the image classifier model accuracy was 97.4%. However, after a first re-check the accuracy improved to 97.8%, and after a second re-check the accuracy improved to 98.6%. Other metrics also showed improvement as depicted in FIG. 12A. FIG. 12B illustrates the results with diabetic macular edema images from a different dataset (OCTID, an open-source Optical Coherence Tomography Image Database) to simulate a real-world scenario. The model accuracy dropped to 83.2% on an initial prediction. However, after a first re-check the model accuracy improved to 96.3%, and after a second re-check the model accuracy improved to 98.1%. The errors generated by the model were reduced by 72% after the first re-check and by 88.9% after the second re-check. [0070] While the disclosure has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. It will be understood that changes and modifications may be made by those of ordinary skill within the scope of the following claims. In particular, the present disclosure covers further embodiments with any combination of features from different embodiments described above and below. Additionally, statements made herein characterizing the disclosure refer to an embodiment of the disclosure and not necessarily all embodiments. [0071] The terms used in the claims should be construed to have the broadest reasonable interpretation consistent with the foregoing description. For example, the use of the article "a" or "the" in introducing an element should not be interpreted as being exclusive of a plurality of elements. Likewise, the recitation of "or" should be interpreted as being inclusive, such that the recitation of "A or B" is not exclusive of "A and B," unless it is clear from the context or the foregoing description that only one of A and B is intended. Further, the recitation of "at least one of A, B and C" should be interpreted as one or more of a group of elements consisting of A, B and C, and should not be interpreted as requiring at least one of each of the listed elements A, B and C, regardless of whether A, B and C are related as categories or otherwise. Moreover, the recitation of "A, B and/or C" or "at least one of A, B or C" should be interpreted as including any singular entity from the listed elements, e.g., A, any subset from the listed elements, e.g., A and B, or the entire list of elements A, B and C.

Claims

CLAIMS What is claimed is: 1. A computer-implemented method for generating attention guided images for updating predictions made by an image classifier model implemented by one or more processors, the method comprising: extracting, by a computer system, feature maps from a last convolutional layer or other layers of the image classifier model, the feature maps generated by the image classifier model in response to being provided as input an image and computing as output a prediction of a diagnosis of a disease or condition; generating, by the computer system, a concept weighted feature map based on the feature maps and coefficients of a concept class associated with an image class of the image, the coefficients of the concept class generated by: providing, by the computer system, concept activations of concept images of concept classes associated with the image class and random activations of random images associated with the image class as input to train a linear classifier that is trained to separate the concept activations from the random activations; and obtaining, by the computer system, the coefficients of the concept classes; generating, by the computer system, a weighted average feature map by implementing channel wise pooling by averaging across channels for the concept weighted feature map; generating, by the computer system, an attention guided image by implementing element wise multiplication of the weighted average feature map with the image provided to the image classifier model; and generating, by the computer system, an updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the attention guided image as the input to the image classifier model. 2. The computer-implemented method according to claim 1, further comprising generating, by the computer system, a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition. 3.
The computer-implemented method according to claim 1, wherein the output of the image classifier model includes probabilities that are each associated with a given image class of a plurality of image classes, wherein the method further comprises: selecting, by the computer system, a coefficient of a second highest probability; generating, by the computer system, another concept weighted feature map based on the coefficients of the concept class associated with the given image class of the second highest probability; generating, by the computer system, another weighted average feature map by implementing the channel wise pooling by averaging across the channels for the another concept weighted feature map; generating, by the computer system, another attention guided image by implementing the element wise multiplication of the another weighted average feature map with the image provided to the image classifier model; and generating, by the computer system, another updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the another attention guided image as the input to the image classifier model, wherein the attention guided image and the another attention guided image each include a different region of interest for the image classifier model than the region of interest first used by the image classifier model to generate the output of the prediction of the diagnosis of the disease or condition in response to being provided the input of the image. 4. The computer-implemented method according to claim 1, further comprising generating, by the computer system, the concept images by extracting, from a dataset of images, portions from each image of the dataset of images based on domain knowledge provided by users, the concept images associated with the concept classes. 5. The computer-implemented method according to claim 1, further comprising generating, by the computer system, synthetic concept images by implementing a generative adversarial network that is provided as input a training dataset of images, the synthetic concept images associated with the concept classes, wherein activations of synthetic concept images and activations of the random images are provided as the input to train the linear classifier that is trained to separate the concept activations from the random activations to obtain the coefficients of the concept class. 6. The computer-implemented method according to claim 1, wherein the coefficients of the concept class associated with the image class of the image are further generated by providing, by the computer system, the concept activations of the concept images of the concept classes associated with the image class and the random activations of the random images associated with the image class as the input to one or more of a support vector machine (SVM), a trained regression model, or a Ridge Regression technique. 7. The computer-implemented method according to claim 1, wherein the image classifier model includes one of a Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory Network (LSTM), Transformer, Autoencoder, Principal Component Analysis, Support Vector Machine (SVM), Gaussian Mixture Models (GMM), Hidden Markov Models (HMM), or Decision Trees. 8.
The computer-implemented method according to claim 1, wherein prior to extracting the feature maps from the last convolutional layer or other layers of the image classifier model, the method further comprises implementing a batch normalization of the feature maps. 9. The computer-implemented method according to claim 1, wherein prior to generating the attention guided image, the method further comprises: converting, by the computer system, the average feature map to one or more channels; resizing, by the computer system, the weighted average feature map to a size that corresponds to the image provided as the input to the image classifier model; and normalizing, by the computer system, the weighted average feature map to generate a normalized feature map. 10. The computer-implemented method according to claim 1, wherein generating the weighted average feature map includes implementing the channel wise pooling by sum across the channels or by max pooling across the channels for the concept weighted feature map. 11. The computer-implemented method according to claim 1, wherein generating the weighted average feature map includes implementing one-into-one convolutions for generating different weights for each feature map of the feature maps. 12. The computer-implemented method according to claim 1, further comprising updating the coefficients of the concept classes by providing updated concept image activations of the concept classes to the trained linear classifier. 13. A computer system for generating attention guided images for updating predictions made by an image classifier model, the computer system comprising one or more hardware processors which, alone or in combination, are configured to provide for execution of a method comprising the following steps: extracting feature maps from a last convolutional layer or other layers of the image classifier model, the feature maps generated by the image classifier model in response to being provided as input an image and computing as output a prediction of a diagnosis of a disease or condition; generating a concept weighted feature map based on the feature maps and coefficients of a concept class associated with an image class of the image, the coefficients of the concept class generated by: providing concept activations of concept images of concept classes associated with the image class and random activations of random images associated with the image class as input to train a linear classifier that is trained to separate the concept activations from the random activations; and obtaining the coefficients of the concept classes; generating a weighted average feature map by implementing channel wise pooling by averaging across channels for the concept weighted feature map; generating an attention guided image by implementing element wise multiplication of the weighted average feature map with the image provided to the image classifier model; and generating an updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the attention guided image as the input to the image classifier model. 14. The computer system according to claim 13, wherein the method further comprises generating a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition. 15.
The computer system according to claim 13, wherein the output of the image classifier model includes probabilities that are each associated with a given image class of a plurality of image classes, wherein the method further comprises: selecting a coefficient of a second highest probability; generating another concept weighted feature map based on the coefficients of the concept class associated with the given image class of the second highest probability; generating another weighted average feature map by implementing the channel wise pooling by averaging across the channels for the another concept weighted feature map; generating another attention guided image by implementing the element wise multiplication of the another weighted average feature map with the image provided to the image classifier model; and generating another updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the another attention guided image as the input to the image classifier model. 16. The computer system according to claim 13, wherein the method further comprises generating the concept images by extracting, from a dataset of images, portions from each image of the dataset of images based on domain knowledge provided by users, the concept images associated with the concept classes. 17. The computer system according to claim 13, wherein the method further comprises generating synthetic concept images by implementing a generative adversarial network that is provided as input a training dataset of images, the synthetic concept images associated with the concept classes, wherein activations of the synthetic concept images and activations of the random images are provided as the input to train the linear classifier that is trained to separate the concept activations from the random activations to obtain the coefficients of the concept class. 18. The computer system according to claim 13, wherein the coefficients of the concept class associated with the image class of the image are further generated by providing the concept activations of the concept images of the concept classes associated with the image class and the random activations of the random images associated with the image class as the input to one or more of a support vector machine (SVM), a trained regression model, or a Ridge Regression technique. 19.
A tangible, non-transitory computer-readable medium having instructions thereon which, upon being executed by one or more processors, provide for generating attention guided images for updating predictions made by an image classifier model, by execution of the following steps: extracting feature maps from a last convolutional layer or other layers of the image classifier model, the feature maps generated by the image classifier model in response to being provided as input an image and computing as output a prediction of a diagnosis of a disease or condition; generating a concept weighted feature map based on the feature maps and coefficients of a concept class associated with an image class of the image, the coefficients of the concept class generated by: providing concept activations of concept images of concept classes associated with the image class and random activations of random images associated with the image class as input to train a linear classifier that is trained to separate the concept activations from the random activations; and obtaining the coefficients of the concept classes; generating a weighted average feature map by implementing channel wise pooling by averaging across channels for the concept weighted feature map; generating an attention guided image by implementing element wise multiplication of the weighted average feature map with the image provided to the image classifier model; and generating an updated prediction of the diagnosis of the disease or condition as the output of the image classifier model by providing the attention guided image as the input to the image classifier model. 20. The tangible, non-transitory computer-readable medium according to claim 19, wherein the instructions, upon being executed by the one or more processors, are further configured to execute the following step: generating a recommended treatment plan based on the updated prediction of the diagnosis of the disease or condition, and wherein generating the attention guided image includes using saliency maps or class activation maps with the weighted average feature map and the image provided to the image classifier model.
PCT/US2025/014842 2024-02-09 2025-02-06 Methods for improved artificial intelligence prediction of a diagnosis Pending WO2025171167A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202463551828P 2024-02-09 2024-02-09
US63/551,828 2024-02-09

Publications (1)

Publication Number Publication Date
WO2025171167A1 true WO2025171167A1 (en) 2025-08-14

Family

ID=96700604

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2025/014842 Pending WO2025171167A1 (en) 2024-02-09 2025-02-06 Methods for improved artificial intelligence prediction of a diagnosis

Country Status (1)

Country Link
WO (1) WO2025171167A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10963737B2 * 2017-08-01 2021-03-30 Retina-AI Health, Inc. Systems and methods using weighted-ensemble supervised-learning for automatic detection of ophthalmic disease from images
CN109948667A (en) * 2019-03-01 2019-06-28 桂林电子科技大学 Image classification method and device for prediction of distant metastasis of head and neck cancer
US20200334809A1 (en) * 2019-04-16 2020-10-22 Covera Health Computer-implemented machine learning for detection and statistical analysis of errors by healthcare providers
CN111984772A (en) * 2020-07-23 2020-11-24 中山大学 Medical image question-answering method and system based on deep learning
CN117422964A (en) * 2023-10-19 2024-01-19 徐州医科大学附属医院 Rectal cancer prediction method, system and equipment based on multi-mode data fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GUAN ET AL.: "Diagnose like a radiologist: Attention guided convolutional neural network for thorax disease classification", ARXIV:1801.09927, 30 January 2018 (2018-01-30), pages 1 - 10, XP093181979, DOI: 10.48550/arxiv.1801.09927 *

Similar Documents

Publication Publication Date Title
Singh et al. Deep-learning based system for effective and automatic blood vessel segmentation from Retinal fundus images
Sevastopolsky Optic disc and cup segmentation methods for glaucoma detection with modification of U-Net convolutional neural network
Wang et al. Non-naive Bayesian classifiers for classification problems with continuous attributes
US20220343638A1 (en) Smart diagnosis assistance method and terminal based on medical images
CN109829894A (en) Parted pattern training method, OCT image dividing method, device, equipment and medium
EP3726435B1 (en) Deep neural network training method and apparatus, and computer device
CN110807762A (en) An intelligent segmentation method of retinal blood vessel images based on GAN
CN113762117B (en) Training method of image processing model, image processing model and computer equipment
Datta et al. Hyper parameter tuning based gradient boosting algorithm for detection of diabetic retinopathy: an analytical review
Roy et al. Artificial intelligence in diagnosis of polycystic ovarian syndrome
Gupta et al. Performance Evaluation of Deep Dense Layer Neural Network for Diabetes Prediction.
KR102423048B1 (en) Control method, device and program of system for determining aortic dissection based on non-contrast computed tomography images using artificial intelligence
Wan et al. Retinal Blood Vessels Segmentation With Improved SE‐UNet Model
CN112101438A (en) A left and right eye classification method, device, server and storage medium
KR102648322B1 (en) System for aiding keratoconus diagnosis and method using the system
WO2023060295A1 (en) Mapping brain data to behavior
WO2025171167A1 (en) Methods for improved artificial intelligence prediction of a diagnosis
Zhou et al. A multi-class fundus disease classification system based on an adaptive scale discriminator and hybrid loss
KR20220043271A (en) Control method, device and program of system for determining Pulmonary Thrombo-embolism based on non-contrast computed tomography images using artificial intelligence
Saky et al. Enhanced segnet with integrated grad-cam for interpretable retinal layer segmentation in oct images
KR20210043471A (en) Method for bayesian maximum entropy model estimation and brain function dynamic characteristics evaluation
KR20220065927A (en) Apparatus and method for interpreting medical image
WO2025207875A1 (en) Method and system for quantification of a presence of disease or condition in images provided to artificial intelligence models
CN118609794B (en) A method and device for intelligent auxiliary diagnosis of skin melanoma
Mahmood et al. ‏ Data and image processing for intelligent glaucoma detection and optic disc segmentation using deep convolutional neural network architecture

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 25752841

Country of ref document: EP

Kind code of ref document: A1