
WO2025179210A1 - Multimodal transformer models for biomedical images and associated texts - Google Patents

Multimodal transformer models for biomedical images and associated texts

Info

Publication number
WO2025179210A1
Authority
WO
WIPO (PCT)
Prior art keywords
cancer
subject
recurrence
score
architecture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2025/016896
Other languages
French (fr)
Inventor
Kevin Boehm
Sohrab SHAH
Sarat CHANDARLAPATY
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Memorial Sloan Kettering Cancer Center
Original Assignee
Memorial Sloan Kettering Cancer Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Memorial Sloan Kettering Cancer Center
Publication of WO2025179210A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0475 Generative networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/094 Adversarial learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30068 Mammography; Breast
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30096 Tumor; Lesion
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03 Recognition of patterns in medical or anatomical images
    • G06V2201/032 Recognition of patterns in medical or anatomical images of protuberances, polyps nodules, etc.
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing

Definitions

  • One or more processors coupled with memory may obtain, for a subject at risk of recurrence of cancer, a dataset comprising at least one of: (i) a biomedical image of a tissue sample from an organ associated with the cancer or (ii) a text report identifying a plurality of characteristics of a tumor associated with the cancer.
  • the one or more processors may apply an ML architecture to at least a portion of the biomedical image and at least a portion of the text report of the dataset.
  • the ML architecture may be established using a plurality of examples.
  • Each of the plurality of examples may have, for a respective subject having undergone resection of a respective tumor from a respective organ: (i) a respective dataset comprising at least one of (a) a respective biomedical image of a respective tissue sample from the respective organ or (b) a respective text report identifying a respective plurality of characteristics of the respective tumor; and (ii) a respective score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject.
  • the one or more processors may determine, based on applying the ML architecture, a score indicating a likelihood of recurrence associated with cancer in the subject.
  • The one or more processors may generate a classification identifying the subject as one of a candidate or non-candidate for administration of a therapy for the cancer, in accordance with the score.
  • the one or more processors may store, using one or more data structures, an association between the subject and the classification.
  • the one or more processors may generate, responsive to the score satisfying a threshold, a classification to identify the subject as the candidate for administration of the therapy for recurrence of the cancer.
  • the one or more processors may provide, for presentation, an output based on the classification to identify the subject as the candidate for the administration of the therapy.
  • the subject is administered with the therapy for cancer, in response to the presentation of the output.
  • the one or more processors may generate, responsive to the score not satisfying a threshold, a classification to identify the subject as the non-candidate for administration of the therapy for recurrence of the cancer.
  • the one or more processors may provide an output based on the classification to identify the subject as the non-candidate for the administration of the therapy.
  • the output may include at least one of (i) a first indication to not administer the subject the therapy for recurrence of the cancer, (ii) a second indication to continue monitoring the subject for recurrence of the cancer, or (iii) a third indication to provide a second dataset for the subject.
  • At least one of the plurality of examples may include the respective score indicating the corresponding likelihood of recurrence associated with cancer in the respective subject at a respective time window relative to resection of a respective primary tumor associated with the cancer.
  • the one or more processors may determine the score indicating the likelihood of recurrence associated with cancer in the subject at a time window relative to resection of a primary tumor associated with the cancer in the subject, wherein the time window ranges from three months to eight years.
  • the one or more processors may generate a plurality of tiles using the biomedical images.
  • Each of the plurality of tiles may correspond to a corresponding section of the biomedical images.
  • the one or more processors may generate a plurality of tokens using the text report, each of the plurality of tokens corresponding to one or more respective words of the text report.
  • the one or more processors may apply the ML architecture to the plurality of tiles and the plurality of tokens.
  • an image processor may generate, using the plurality of tiles, a first set of embeddings corresponding to features associated with the recurrence of the cancer;
  • a text processor may generate, using the plurality of tokens, a second set of embeddings corresponding to features associated with the recurrence of the cancer;
  • an aggregator may determine, based on the first set of embeddings from the image processor and the second set of embeddings from the text processor, the score indicating the likelihood of recurrence associated with cancer in the subject.
  • the one or more processors may apply the ML architecture to the plurality of tiles and the plurality of tokens.
  • the image processor may generate, for each tile of the plurality of tiles, a first score of a plurality of first scores indicating a relevance of the tile to the recurrence of the cancer; and the text processor may generate, for each token of the plurality of tokens, a second score of a plurality of second scores indicating a relevance of the token to the recurrence of the cancer.
  • the one or more processors may select (i) a subset of tiles from the plurality of tiles based on the plurality of first scores and (ii) a subset of tokens from the plurality of tokens based on the plurality of second scores.
  • the one or more processors may provide, for presentation, an output based on the subset of tiles and the subset of tokens.
  • At least one example of the plurality of examples may include (i) the respective score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject based on the respective biomedical image and (ii) a respective second score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject based on the respective text report.
  • The ML architecture may be trained by: applying the ML architecture to the respective dataset and the respective text report of the at least one example; determining (i) a first score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject from applying the ML architecture to the respective biomedical image and (ii) a second score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject from applying the ML architecture to the respective text report; and updating one or more of a plurality of weights in the ML architecture based on (i) a first comparison between the respective score and the first score and (ii) a second comparison between the respective second score and the second score.
  • At least one of the plurality of examples further comprises the respective score to indicate at least one of (i) a risk of recurrence of the respective cancer in the respective subject or (ii) a predicted outcome of a respective administration of therapy for the respective subject.
  • the one or more processors may determine the score indicating at least one of (i) a risk of recurrence of the cancer in the subject or (ii) a predicted outcome of the administration of therapy for the subject.
  • The cancer may include cancer affecting a breast of the subject. The subject may have undergone resection of a primary tumor associated with breast cancer from the breast, prior to obtaining the dataset.
  • the adjuvant chemotherapy may include at least one of anthracyclines, taxanes, cyclophosphamide, fluorouracil, or carboplatin.
  • Fig. 1 Developing a multimodal transformer model for breast cancer risk. Early-stage breast tumors are (a) resected, (b) profiled histologically, (c) digitized, and (d) used for downstream modeling of recurrence risk. (e) Number of pathologic slides, pathology reports, and patients included in each split.
  • PR progesterone receptor
  • SF stromal fraction
  • Prolif tumor cell proliferation
  • LISS lymphocyte infiltrating signature score
  • LF lymphocyte fraction
  • Neo.: neoplastic, inflam.: inflammatory, non-neo.: non-neoplastic, unlab.: unlabeled, connec.: connective, necros.: necrosis
  • Fig. 6 Multimodal model performance and benchmarking. (a) regression of predicted versus true recurrence scores, (b) precision-recall curve for high-risk disease, (c) concordance correlation coefficient for all data splits and models, (d) receiver operating characteristic curve for high-risk disease, (e) confusion matrix using score cutoffs of 11 and 25, (f) Pearson correlation for all data splits and models.
  • Fig. 7 Case inclusion diagram. Depicts slides (yellow) and patients (blue) joined to form the full cohort of paired slides and reports with recurrence scores (green).
  • Fig. 8 Quantitative analysis of high- versus low-attention tiles. Full feature titles available in source data. Fig. 9. MSK-BRCA training and validation unimodal vision model performance.
  • the sensitivity and specificity are calculated using a threshold for the predicted recurrence risk score of ⁇ 16 and ⁇ 25, respectively.
  • the analysis does not account for age and nodal status.
  • the sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts.
  • the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
  • Fig. 13 Sensitivity and specificity analysis for the predicted recurrence risk scores of all patients above 50 years of age and having 1-3 positive nodes.
  • the sensitivity and specificity are calculated using a threshold for the predicted recurrence risk score of ⁇ 16 and ⁇ 25, respectively.
  • the sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts.
  • the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
  • Fig. 15 Potential clinical use-case of the Orpheus recurrence risk prediction model.
  • the model is within scope for early-stage hormone receptor positive (HR+) and HER2- breast cancer patients.
  • Fig. 16 Orpheus performance for TAILORx risk stratification.
  • i-l Quantification of stromal fraction (SF), tumor cell proliferation (Prolif), lymphocyte infiltrating signature score (LISS), and lymphocyte fraction (LF) for predicted low- and high-risk patients (50 each) depicted in blue and orange, respectively, in the MSK-BRCA cohort.
  • p-values are generated using an independent two-sided t-test. In box plots, boxes denote 25th-75th percentiles, whiskers denote the range without outliers, and individual points denote outliers. Scale bars denote 64 µm.
  • Fig. 19 Potential clinical use case of the Orpheus recurrence risk prediction model.
  • the Orpheus multimodal prediction model for recurrence risk prediction is potentially capable of guiding decision-making for adjuvant cytotoxic chemotherapy alongside adjuvant endocrine therapy for predicted low- and high-risk patients.
  • the model is within scope for early-stage hormone receptor positive (HR+) and HER2- breast cancer patients.
  • Fig. 20 Visual model generalizes internationally to three test cohorts. (a-c) density plots, (d-f) precision-recall curves, (g-i) confusion matrices, and (j-l) calibration plots for MSK-BRCA, IEO-BRCA, and MDX-BRCA test sets. Positive class is ODX RS > 25.
  • Orpheus+ risk scores versus Oncotype DX ® (ODX) recurrence score (RS) values for cases with or without distant recurrence for any ODX RS value.
  • ODX Oncotype DX ®
  • RS recurrence score
  • e Time-dependent area under the receiver operating characteristic curve and (f) receiver operating characteristic curve for Orpheus+ and ODX RS score to infer recurrence.
  • Fig. 27 Quantitative analysis of high- versus low-attention tiles. Full feature titles available in source data.
  • Fig. 28 (a) Pathology report-derived tokens colorized by language attention. (b) Whole-cohort token importance cloud with size of word scaled by mean importance across the MSK-BRCA test set. [UNK]: unknown.
  • Fig. 29 Performance with ablation of known correlates of Recurrence Score.
  • Fig. 30 Model distinguishes tumors by biologically meaningful features.
  • UMAP embeddings of the MSK-BRCA test set denoting visual, linguistic, and multimodal
  • PR progesterone receptor
  • Fig. 32 Tumor microenvironment quantification of the top attention tiles of the recurrence risk vision prediction model.
  • the analysis is focussed on the patient subset above 50 years of age and having 1-3 positive nodes.
  • the sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts.
  • the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
  • Fig. 35 Sensitivity and specificity analysis for the predicted recurrence risk scores of all patients below 50 years of age. The sensitivity and specificity are calculated using a threshold for the predicted recurrence risk score of ⁇ 11 and > 25, respectively.
  • the analysis is focussed on the patient subset below 50 years of age and having 0 positive nodes.
  • the sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts.
  • the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
  • FIG. 36 depicts a block diagram of a system for determining scores related to recurrence of breast cancers in subjects using multimodal machine learning (ML) architectures, in accordance with an illustrative embodiment.
  • ML machine learning
  • Fig. 37 depicts a block diagram of a process for training ML architectures in the system for determining scores related to recurrence of breast cancers in subjects, in accordance with an illustrative embodiment.
  • Fig. 38 depicts a block diagram of a process for applying ML architectures in the system for determining scores related to recurrence of breast cancers in subjects, in accordance with an illustrative embodiment.
  • Fig. 39 depicts a block diagram of a process for evaluating outputs from ML architectures in the system for determining scores related to recurrence of breast cancers in subjects, in accordance with an illustrative embodiment.
  • Fig. 40 depicts a flow diagram of a method of determining scores related to recurrence of breast cancers in subjects using multimodal machine learning (ML) architectures, in accordance with an illustrative embodiment.
  • Fig. 41 is a block diagram of a computing environment according to an example implementation of the present disclosure.
  • DETAILED DESCRIPTION. Following below are more detailed descriptions of various concepts related to, and embodiments of, systems and methods for determining scores related to recurrence of cancers in subjects using multimodal machine learning (ML) architectures.
  • Section A describes multimodal histopathological models for stratifying hormone receptor-positive early breast cancer.
  • Section B describes inference of recurrence risk and predicted outcome using multimodal histopathological models.
  • Section C describes systems and methods of determining scores related to recurrence of cancers in subjects using multimodal machine learning (ML) architectures.
  • Section D describes a network environment and computing environment which may be useful for practicing various computing related embodiments described herein.
  • A. Multimodal Histopathological Models for Stratifying Hormone Receptor-Positive Early Breast Cancer. A multimodal transformer model was trained and validated using patients, taking both H&E images and their corresponding synoptic text reports as input. Accurate inference of recurrence score was shown from whole-slide images, the raw text of their corresponding reports, and their combination as measured by Pearson’s correlation. Moreover, the model generalizes well to external international cohorts, effectively identifying recurrence risk and high-risk status from whole-slide images. Probing the biologic underpinnings of the model decisions uncovered tumor cell size heterogeneity, immune cell infiltration, a proliferative transcription program, and stromal fraction as correlates of higher-risk predictions.
  • AUROC area under the receiver operating characteristic curve
  • AUPRC area under the precision recall curve
  • ODX calculates a recurrence score (RS) ranging from zero to 100 with both prognostic and predictive value.
  • RS recurrence score
  • Fig. 1, panel a Tissue samples were subjected to H&E staining and immunohistochemical (IHC) analysis for hormone receptors and HER2 according to ASCO/CAP guidelines, and samples were submitted for calculation of RS per clinical practice.
  • IHC immunohistochemical
  • genomic data from clinical MSK-IMPACT targeted sequencing were also available (Fig. 1, panel b). These derivative data were subsequently digitized (Fig. 1, panel c) and used for multimodal modeling (Fig. 1, panel d).
  • Model training. A transformer model was developed to directly regress the ODX RS from whole-slide images of EBC. To train this architecture, a two-step process was employed. First, each slide’s tissue-containing tiles (Fig. 1, panel f) were projected into an informative space using a frozen model trained using SSL on over 30,000 slides (Fig. 1, panel g). Subsequently, a transformer architecture was adapted, which was previously validated in a large multicenter study of colorectal cancer, to map the phenotypic-genotypic correlation between the extracted features and the ODX RS (Fig. 1, panel g). The unimodal and multimodal models were trained to regress RS as a continuous variable (Fig. 1, panel g).
  • Embeddings and predicted score recapitulate clinical and genomic correlates. Uniform manifold approximation and projection (UMAP) over the learned embedding spaces for the visual, linguistic, and multimodal models in the MSK-BRCA test set (Fig. 2) revealed that learned embeddings separated somewhat by histologic grade (Fig. 2, panel a) and progesterone receptor expression (Fig. 2, panel b) (n = 1034), with the gradients appearing along a learned, lyre-shaped manifold for the multimodal model. The same was observed for the ODX RS itself (Fig. 2, panel c). The association of predicted scores with genomic features was further tested.
  • UMAP Uniform manifold approximation and projection
  • the subset of the MSK-BRCA test set with available tumor grades and IHC-derived HR status in the text report as extracted by regular expressions was analyzed (those without matches by regular expressions were excluded).
  • the ability to discriminate high-risk disease of a nomogram based on clinical and pathologist-annotated features was compared to that of the multimodal (Fig. 6, panel g), text-based (Fig. 6, panel h), and image-based (Fig. 6, panel i) models.
  • the multimodal model achieved an AUROC of 0.89 and AUPRC of 0.71 (95% C.I. 0.60 - 0.82), the vision model achieved an AUPRC of 0.63 (95% C.I.
  • a specificity analysis was conducted, wherein a threshold in the MSK-BRCA test set to yield the highest specificity for the largest percentage of the cohort’s population was manually selected. This resulted in a specificity of 0.93 for 13% of the population with a threshold of > 25 for the predicted RS to identify high-risk patients (Fig. 12, panel d) for the test set of the MSK-BRCA cohort. Applying this threshold on the subsequent cohorts, a specificity of 0.76 for 40% of the population (Fig. 12, panel e) and 0.85 for 31% of the population (Fig. 12, panel f) in the IEO-BRCA and MDX-BRCA cohorts, respectively, was achieved.
  • the speed of this assay could enable novel uses such as more precisely defining populations that will benefit from the use of neoadjuvant therapies beyond the currently used clinical characteristics and without the requirement for additional biopsies.
  • Further analysis of the proposed method as a potential pre-screening tool revealed consistent sensitivity and specificity to identify patients with high- and low-risk tumors for approximately 50% of the population across three distinct large cohorts.
  • the risk prediction results remained generally unaffected by subgroup analyses taking into account age and nodal status, despite being clinicopathological factors which impact the recurrence risk in patients with breast cancer, and therefore influence the risk category threshold.
  • the architecture makes use of self-supervised learning to enable training on under 5,000 patients and Cartesian product with dimensionality reduction to commingle the text-based and image-based features.
  • the model also generates an annotated report of the text and image used to estimate the recurrence score.
  • Although deep learning suffers from a general lack of interpretability, the attention paid to each token in the text or each tile in the image enables ordering physicians to perform qualitative quality controls.
  • “lymphovascular” and “invasion” also appeared, reflecting the association of lymphovascular invasion with disease recurrence risk, though this is not a feature of the clinicopathologic nomogram.
  • the cohort also recapitulated previously uncovered genomic biomarkers with adverse prognostic implications, namely MYC amplification, PIK3CA amplification, and TP53 mutation.
  • Tiles were embedded using CTransPath. After a fully connected layer to project the CTransPath-derived tokens into 512-dimensional space and a ReLU, two PyTorch TransformerEncoderLayers with dimensionality 512 and eight heads were stacked before a final LayerNorm and projection to scalar space. No activation function was used, and no positional encodings were used. For training, a maximum learning rate of 2e-5 with linear warmup of 1000 steps, learning rate decay by a factor of 0.9999 every step, and L2 decay of 2e-5 was used. A batch size of one slide per GPU across two GPUs with accumulated gradients over four batches was used, with gradients clipped at 0.5. The model was trained for up to 50 epochs with early stopping.
  • the HuggingFace model tsantos/PathologyBERT was tuned using the HuggingFace BertForSequenceClassification, Trainer, and AutoTokenizer with a batch size of eight part descriptions per GPU, four Nvidia Tesla V100 GPUs, four gradient accumulations per backprop, a learning rate of 2e-5, L2 decay of 0.01, and ten training epochs.
  • Prior to tokenization the text corresponding to the part used to measure the Oncotype score was extracted using regular expressions. Addenda, when available, were concatenated to the part description. Names, initials, and logistical comments were removed prior to tokenization.
  • the tsantos/PathologyBERT tokenizer was not modified.
  • the model with the lowest validation loss was chosen for downstream use.
  • attention rollout was used to attribute attention to each input token, and multiple predictions with dropout enabled were used to estimate confidence intervals.
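Attention rollout can be illustrated with a small pure-Python sketch. The source does not give implementation details, so this follows the commonly used formulation: each layer's head-averaged attention matrix is mixed equally with the identity (to account for residual connections) and the layers are multiplied together. The function names and the toy matrices are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def attention_rollout(layers):
    """Roll attention through the layers, first layer first.

    Each element of `layers` is a head-averaged attention matrix whose
    rows sum to 1. Row i of the result attributes output token i to the
    input tokens.
    """
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Mix with the identity for the residual path; rows still sum to 1.
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0)
                  for j in range(n)] for i in range(n)]
        rollout = matmul(mixed, rollout)
    return rollout


# Toy example: two layers over three tokens (values are made up).
layer1 = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8]]
layer2 = [[0.5, 0.25, 0.25], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6]]
rolled = attention_rollout([layer1, layer2])
```

Which row is read out as the per-token attribution (for example, the row of a classification token) depends on the model and is not specified here.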
  • Multimodal model training Multiple architectures including simple concatenation of embeddings before dense layers, attention-based integration of unimodal embeddings, or averaging of scores were explored using the validation set.
  • the final model chosen took the pre-computed unimodal embeddings as input, projected them into 96-dimensional space, performed tensor fusion by prepending unity and taking the Cartesian product, applied 30% dropout, and passed through a small 96-dimensional regression head to yield a scalar regression score.
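The fusion step can be sketched as follows. This is a minimal illustration assuming the "Cartesian product" is the flattened outer product of the two projected embeddings after a constant 1 is prepended to each; the function name `tensor_fusion` is hypothetical, and the projection, dropout, and regression head are omitted.

```python
def tensor_fusion(image_vec, text_vec):
    """Fuse two embeddings by prepending unity and taking all pairwise products."""
    img = [1.0] + list(image_vec)
    txt = [1.0] + list(text_vec)
    # The prepended 1 keeps each unimodal feature alongside the cross terms.
    return [a * b for a in img for b in txt]


# With two 96-dimensional inputs the fused vector has 97 * 97 entries.
fused = tensor_fusion([0.0] * 96, [0.0] * 96)
```

Because of the prepended 1, the fused vector contains the constant term, every image-only feature, every text-only feature, and every image-text product.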
  • HoVerNet with PanNuke-derived weights was used for instance segmentation, and quantitative features such as solidity, area, and perimeter were calculated for the outline of each nucleus. Within each tile, summative statistics were used to aggregate these features. Comparisons between the high- and low-risk groups were made with the Mann-Whitney U test with corrections for multiple testing. For the volcano plot, log fold change was calculated as the base-two logarithm of the mean value for the high-risk group over the same for the low-risk group. Tiles with fewer than 50 total nuclei were excluded. For comparisons involving standard deviations, tiles with fewer than ten of the relevant cell type were excluded.
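The fold-change calculation and the tile filter described above are simple enough to sketch directly. The function names are illustrative, and the inputs stand for per-tile values of any one aggregated nuclear feature.

```python
import math
from statistics import mean

MIN_NUCLEI_PER_TILE = 50  # tiles with fewer total nuclei are excluded


def keep_tile(total_nuclei):
    """Return True if a tile has enough nuclei to be analyzed."""
    return total_nuclei >= MIN_NUCLEI_PER_TILE


def log2_fold_change(high_risk_values, low_risk_values):
    """Base-two log of the high-risk mean over the low-risk mean."""
    return math.log2(mean(high_risk_values) / mean(low_risk_values))
```

A positive value means the feature is larger on average in the high-risk group; the ratio is only defined when both means are positive.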
  • Tumor microenvironment quantification A pre-trained deep learning model was used for the quantification of the tumor microenvironment for the top 50 predicted high- and low-risk patients by the recurrence risk vision prediction model, specifically for the stromal fraction (SF) and leukocyte fraction (LF) as assessed via DNA methylation analysis, lymphocyte infiltrating signature score (LISS) and proliferation (Prolif) as measured by RNA expression.
  • the deep learning regression model was trained on whole-slide images from a breast cancer cohort from The Cancer Genome Atlas (TCGA) in a weakly-supervised setting using the open-source biomarker data. Statistical significance is measured by an independent t-test, indicating a difference in sample means between predicted high- and low-risk patients (p < 0.05).
  • the scores for the tumor microenvironment quantification are inferred based on the same tile-embeddings (1x768) which were used in training the vision model for the recurrence risk prediction.
  • Generative adversarial network training The highest-attention tiles were identified from cases with measured recurrence scores below 11 or above 25 as chosen by an Attn-MIL model trained on the training set.
  • StudioGAN was used to train a ReACGAN architecture with big_resnet backbone, batch size of 36 per GPU across four NVIDIA Tesla V100 GPUs, and default loss parameters.
  • the conditional architecture encoded three classes: high score, low score, and background. Spectrum plots and canvases were generated as per default StudioGAN code. Genomic analysis Specimens were sequenced by MSK-IMPACT, annotated by OncoKB, and accessed via cBioPortal.
  • 95% confidence intervals were calculated using bootstrapping (random sampling with replacement) 1000 times. Areas under the precision-recall and receiver operating characteristic curves were calculated using binary thresholding of high- and low- (recurrence score >/<25) risk disease. Significance was established using 1000-fold permutation tests. Operating points on the precision recall curve were analyzed by varying the threshold from greatest to lowest and tabulating the respective precision and recall for each value. F1 scores were calculated using the weighted average. Confusion matrices were established using the three risk categories (<16, 16-25, >25), and significance was established using McNemar’s test of homogeneity.
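A bootstrap confidence interval of the kind described above might be computed as in this sketch. The source states only that cases were resampled with replacement 1000 times; the percentile method, the fixed seed, and the function names are assumptions made for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bootstrap_ci(x, y, stat=pearson, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a paired statistic."""
    rng = random.Random(seed)
    n = len(x)
    values = []
    while len(values) < n_boot:
        # Resample cases with replacement, keeping pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # correlation is undefined for a constant resample
        values.append(stat(xs, ys))
    values.sort()
    k = round(n_boot * alpha / 2)
    return values[k], values[n_boot - 1 - k]
```

For example, `bootstrap_ci(measured_scores, predicted_scores)` returns the lower and upper bounds of a 95% interval for the Pearson correlation, where both argument names stand for lists supplied by the caller.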
  • the cohort from the European Institute of Oncology (IEO-BRCA, Milan, Italy) contained a total of 456 early-stage breast cancer patients who received the official Oncotype DX test. Only histopathology slides and corresponding clinicopathological variables were available for analysis. After filtering patients based on histology slide availability, 452 patients in the IEO-BRCA cohort were available for external validation.
  • After filtering patients based on histology slide availability and on nodal, ER, PR, and HER2 status to those who would have been eligible for Oncotype DX, 575 patients in the MDX-BRCA cohort remained for external validation.
  • the research-based scores were calculated using the GeneFu Bioconductor package, based on the original algorithm to calculate the OncotypeDX score. Because research-based versions of OncotypeDX use different data inputs (e.g., microarray/RNA-seq) compared to the official OncotypeDX (e.g., RT-qPCR), this may result in scaling effects when comparing research-based scores with official scores, as demonstrated by the OPTIMA trial group.
  • the Oncotype DX® Recurrence Score (RS) is an assay for hormone receptor-positive early breast cancer with extensively validated predictive and prognostic value. However, its cost and lag time have limited global adoption, and previous attempts to estimate it using clinicopathologic variables have had limited success. To address this, 6,172 cases were assembled across three institutions and used to develop Orpheus, a multimodal deep learning tool to infer the RS from H&E whole-slide images. The model identifies TAILORx high-risk cases (RS
  • ODX calculates a recurrence score (RS) ranging from zero to 100 with both prognostic and predictive value.
  • Model Training A transformer model was developed to directly regress the ODX RS from WSIs of EBC. To train this architecture, a two-step process was employed. First, each slide’s tissue-containing tiles were projected (Fig. 1, panel f) into an informative space using a frozen model trained using SSL on over 30,000 slides (Fig. 1, panel g). Subsequently, a transformer architecture previously validated in a large multicenter study of colorectal cancer was adapted to map the phenotypic-genotypic correlation between the extracted features and the ODX RS (Fig. 1, panel g).
  • the unimodal and multimodal models were trained to regress RS as a continuous variable (Fig. 1, panel g). Deep learning infers recurrence risk score from whole slide images
  • the WSI-based model was developed and tested across the three cohorts to measure generalizability of its performance. In the withheld MSK-BRCA test set, the unimodal WSI-based model achieved a Pearson correlation of 0.60 (95% C.I. 0.55 - 0.65, p < 10^-4; Fig. 20, panel a) and concordance correlation coefficient (CCC) of 0.57 (95% C.I. 0.52 - 0.62), along with area under the precision-recall curve (AUPRC) of 0.55 (95% C.I.
  • the WSI-based model robustly infers RS and accurately identifies high-risk disease across three test cohorts derived from different medical centers and countries.
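The concordance correlation coefficient reported alongside the Pearson correlation penalizes systematic offsets between predicted and measured scores, which Pearson correlation alone does not. A minimal sketch using the standard definition (Lin's CCC, with population variances) is shown below; the function name is illustrative.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # The squared mean difference lowers the score when predictions are biased.
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)
```

For instance, predictions that are perfectly correlated with the measured scores but shifted by a constant have a Pearson correlation of 1 and a CCC below 1.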
  • Deep learning infers recurrence risk score from text-based reports
  • Second, a unimodal text report-based model was developed that achieves a Pearson correlation of 0.59 (95% C.I. 0.53 - 0.65, p < 10^-4) and CCC of 0.53 (95% C.I. 0.47 - 0.58; Fig. 22, panel c) along with AUPRC of 0.53 (95% C.I. 0.45 - 0.61; Fig.
  • Orpheus+ ascertained risk of distant recurrence with an AUROC of 0.77 (95% C.I. 0.68, 0.85). This was superior to the Oncotype DX® RS itself, which was uninformative in the RS < 25 cohort (AUROC 0.51 (95% C.I. 0.41, 0.61)). Scores differed significantly for cases with or without metastatic recurrence for Orpheus+ (p < 1e-4), but not for ODX RS (p > 0.05; Fig. 17, panel a).
  • a generative model was also trained to synthesize fields of view for informative tiles for high- and low-risk disease (Fig. 5, bottom panel).
  • Tiles conditioned on the high-risk class depict confluent clusters of tumor cells with moderate to marked nuclear pleomorphism and prominent nucleoli, and tiles conditioned on the low-risk class depicted trabeculae and clusters of tumor cells with moderate nuclear pleomorphism and inconspicuous nucleoli.
  • Tiles conditioned on the background class depicted stroma without epithelial cells.
  • Orpheus as a conceptual triaging tool for low- and high- risk disease
  • the utility of Orpheus was tested as a pre-screening tool to reduce the load of laboratory testing for breast cancer recurrence risk in clinical workflows.
  • a sensitivity analysis was conducted to evaluate the performance of the predicted recurrence risk score in identifying low-risk patients, defined as those with a risk recurrence score < 11.
  • model performance on the intermediate-risk (RS 11-25) subgroup was analyzed using the AUROC, Cohen’s Kappa, F1 score, accuracy and Matthews correlation coefficient (Supp. Tab. 4), utilizing additional clinically-relevant thresholds of 10, 15 and 25 to binarize the risk predictions.
  • a substantial decrease was observed in all metrics compared to all risk groups (RS 0-100).
  • Orpheus accurately identifies patients with high-risk disease as defined by TAILORx, with a high degree of confidence.
  • the model shows potential to guide adjuvant chemotherapy decisions without the need for multigene assay testing (Fig. 19).
  • adjuvant chemotherapy could be selectively recommended for a subset of patients classified as high-risk with high confidence. This approach could streamline treatment decisions and reduce the need for additional testing, ultimately improving patient care and resource allocation. Furthermore, for patients who are treated per TAILORx with adjuvant chemoendocrine or endocrine therapy based on ODX RS results, Orpheus identifies distant metastatic recurrences more accurately than ODX RS itself in the test set. With further validation, this prognostic value has the potential to refine patient selection for personalization of the adjuvant treatments and follow-up strategies. Discussion
  • Orpheus can precisely identify approximately one quarter of patients with high-risk disease as defined by TAILORx without the need for ODX RS, with superior discrimination of this class compared to a state-of-the-art nomogram, which integrates clinico-pathologic features such as IHC-derived progesterone/estrogen receptor positivity, tumor size, lobular versus ductal histology, Nottingham grade, and age. This would potentially enable physicians to forgo molecular testing in selected cases. Orpheus has the added advantage of not requiring manual curation of these features from the healthcare record.
  • Orpheus further enables emerging applications that tools such as the nomogram would not support, such as identification of risk of local recurrence, clinical trial eligibility, or defining populations that will benefit from the use of neoadjuvant systemic therapies beyond the currently used clinical characteristics. It is further shown that this correlation with RS corresponds to meaningful prognostication: for patients who are treated per TAILORx with adjuvant chemoendocrine or endocrine therapy based on ODX RS results, Orpheus identifies distant metastatic recurrences more accurately than ODX RS itself in the test set.
  • this study advances an improved platform to approximate the ODX RS from routine histopathologic WSIs, outperforming a leading existing method in identification of high-risk disease and—critically—identifying metastatic recurrences for cases with low ODX RS values more accurately than the ODX RS itself.
  • the multimodal model improves performance when pathology text reports are available and is a lightweight and flexible machine learning architecture suitable for application to biomarkers for other histologies.
  • the orthogonal histopathologic and transcriptomic analyses corroborate proliferation and tumor-infiltrating lymphocytes as markers of higher risk.
  • the final model chosen took the pre-computed unimodal embeddings as input, projected them into 96-dimensional space, performed tensor fusion by prepending unity and taking the Cartesian product, applied 30% dropout, and passed through a small 96-dimensional regression head to yield a scalar regression score. Finally, linear regression trained on the training set was used to calibrate the weight of the unimodal and multimodal scores to yield a final score. Multiple predictions with dropout enabled were used to estimate confidence intervals. UMAP plots were generated using the Python umap software package with 10 neighbors and min_dist 0.5 for all plots fit on the training set, and only test sets are shown.
  • Model evaluation When multiple slides were available for a single measured score and all contained relevant tissue as per preprocessing, the relevant tissue tiles were bagged prior to inference or training by the vision model. Models were evaluated primarily by Pearson correlation and associated significance and concordance correlation coefficient. 95% confidence intervals were calculated using bootstrapping (random sampling with replacement) 1000 times. Areas under the precision recall and receiver operating characteristic curve were calculated using binary thresholding of high- and low-risk disease. Significance was established using 1000-fold permutation tests. Operating points on the precision recall curve were analyzed by varying the threshold from greatest to lowest and tabulating the respective precision and recall for each value.
  • F1 scores were calculated using the weighted average. Confusion matrices were established using the three risk categories (<11, 11-25, >25), and significance was established using McNemar’s test of homogeneity. To further evaluate the performance of the model, the AUROC, F1 score, accuracy, Cohen’s Kappa, and Matthews correlation coefficient (MCC) were utilized. These metrics were assessed across all risk groups (0-100) and specifically within the intermediate risk group (11-25). To investigate the model’s performance using various clinically-defined thresholds, binary risk stratification (low versus high) was applied at cut-off values of 10, 15, and 25, analyzing both the entire population and subgroups stratified by age and nodal status.
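The three-category confusion matrix and the binary cut-offs described above can be sketched as follows. The category boundaries come from the text; the label names, function names, and the choice to treat a score strictly above the cut-off as high risk are assumptions for illustration.

```python
from collections import Counter

LABELS = ("low", "intermediate", "high")
CUTOFFS = (10, 15, 25)


def risk_category(score):
    """Map a recurrence score to one of the three risk categories."""
    if score < 11:
        return "low"
    if score <= 25:
        return "intermediate"
    return "high"


def confusion_matrix(measured, predicted):
    """Rows are measured categories, columns are predicted categories."""
    counts = Counter((risk_category(m), risk_category(p))
                     for m, p in zip(measured, predicted))
    return [[counts[(row, col)] for col in LABELS] for row in LABELS]


def is_high_risk(score, cutoff):
    # Assumption: scores strictly above the cut-off count as high risk.
    return score > cutoff
```

The same measured and predicted scores can then be binarized at each value in `CUTOFFS` to compute the threshold-specific metrics.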
  • the model can integrate both biomedical images (e.g., H&E-stained whole-slide images) and their corresponding synoptic text reports to more accurately infer the recurrence score, relative to other techniques.
  • the ML model can extract meaningful image and text features associated with high recurrence risk. These insights can lead to better understanding of the disease informing the risk of recurrence or predicted outcome (including likelihood of distant metastasis) that would not be deducible from unimodal approaches.
  • the ability of the ML model to take unstructured multimodal data can also reduce reliance on extensive manual data extraction, thereby reducing the amount of time and effort on part of specialized experts.
  • the data processing system 105 may include at least one dataset indexer 125, at least one input handler 130, at least one model trainer 135, at least one model applier 140, at least one output evaluator 145, at least one ML architecture 150, and at least one database 155, among others.
  • the ML architecture 150 may include at least one image processor 160, at least one text processor 165, and at least one aggregator 170, among others.
  • Each of the components of the system 100 can be implemented using the computing system as described in Section D.
  • the system 100 may be used to implement the functionalities (e.g., of the Orpheus and Orpheus+ model) detailed herein in Sections A and B.
  • the data processing system 105 can be any computing device comprising one or more processors coupled with memory and software capable of performing the various processes and tasks described herein.
  • the data processing system 105 may be housed within a computing system (e.g., laptop, PC, smart device) or within a server group (e.g., a data center, a branch office, or a server site), and include instructions to manage the identifying of images, generating a classification, and storing an association.
  • the data processing system 105 may be in communication with the imaging device 110, administrative device 115, and the database 155, among others.
  • the data processing system 105 may have one or more components, modules, processes, and threads to perform the various processes and tasks described herein.
  • the output evaluator 145 may generate an output based on the score to identify the subject as a candidate or non-candidate for therapy for cancer.
  • the ML architecture 150 may be any type of artificial intelligence (AI) algorithm or ML model to process biomedical images or text reports (or both) about cancers in subjects to determine scores indicating likelihood of recurrence.
  • the ML architecture 150 may include a deep learning artificial neural network (ANN) (e.g., a transformer architecture, an encoder-decoder model with a convolution neural network architecture, a diffusion model, or an autoencoder model), a clustering algorithm, a support vector machine (SVM), a decision tree, a Bayesian model, or a regression model, among others.
  • the ML architecture 150 may include inputs and outputs related to one another via a set of weights.
  • the set of weights may be in accordance with the AI algorithm or ML model (e.g., transformer models) used to implement the ML architecture 150.
  • the set of weights of the ML architecture may be in accordance with one or more transformer models, each with encoder layers, decoder layers, feed-forward layers, cross-attention layers, and activation layers, among others.
  • the set of weights of the ML architecture 150 may be distributed or arranged across the image processor 160, the text processor 165, and the aggregator 170, among others.
  • the image processor 160 may be any type of AI algorithm or ML model to generate features associated with recurrence of cancer and to determine one or more scores indicating likelihood of recurrence of cancer from the biomedical image.
  • the image processor 160 may include a deep learning artificial neural network (ANN), such as a transformer architecture, an encoder-decoder model with a convolution neural network architecture, a diffusion model, or an autoencoder model, among others.
  • the ML architecture 150 may include multiple image processors 160 for images of different staining modalities (e.g., histological stain or an immunohistochemical (IHC) stain).
  • the text processor 165 may be any type of AI algorithm or ML model to generate features associated with recurrence of cancer and to determine one or more scores indicating likelihood of recurrence of cancer from text reports.
  • the text processor 165 may include a deep learning artificial neural network (ANN), such as a transformer architecture, an encoder-decoder model with a convolution neural network architecture, a diffusion model, or an autoencoder model, among others.
  • the aggregator 170 may be any type of AI algorithm or ML model to determine the score indicating likelihood of recurrence of cancer in the subject using the features from the image processor 160 and the text processor 165.
  • the aggregator 170 may include a deep learning artificial neural network (ANN), such as a tensor-fusion network, a transformer model, an encoder-decoder model with a convolution neural network architecture, a diffusion model, or an autoencoder model, among others.
  • the ML architecture 150 may lack one of the image processor 160, the text processor 165, or the aggregator 170.
  • the ML architecture 150 may include the text processor 165 and lack the image processor 160 and the aggregator 170, to process text data.
  • the ML architecture 150 may include the image processor 160 and lack the text processor 165 and the aggregator 170, to process image data.
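The optional composition of the ML architecture 150 described above can be illustrated with a small sketch. This is a hypothetical structure, not the claimed implementation: the class name, the convention that each processor returns a `(features, score)` pair, and the fallback order are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MLArchitecture:
    """Holds any subset of the image processor, text processor, and aggregator."""
    image_processor: Optional[Callable] = None  # image -> (features, score)
    text_processor: Optional[Callable] = None   # report -> (features, score)
    aggregator: Optional[Callable] = None       # (img_feats, txt_feats) -> score

    def score(self, image=None, report=None):
        multimodal = (self.image_processor and self.text_processor
                      and self.aggregator)
        if multimodal and image is not None and report is not None:
            img_feats, _ = self.image_processor(image)
            txt_feats, _ = self.text_processor(report)
            return self.aggregator(img_feats, txt_feats)
        # Fall back to whichever unimodal path is configured and has input.
        if self.image_processor and image is not None:
            return self.image_processor(image)[1]
        if self.text_processor and report is not None:
            return self.text_processor(report)[1]
        raise ValueError("no configured processor matches the given inputs")


# Toy stand-ins for trained models (illustrative only).
def toy_image_processor(image):
    return [float(len(image))], 10.0


def toy_text_processor(report):
    return [float(len(report))], 20.0


def toy_aggregator(img_feats, txt_feats):
    return img_feats[0] + txt_feats[0]
```

A text-only deployment would construct `MLArchitecture(text_processor=toy_text_processor)` and call `score(report=...)`, mirroring the embodiment that lacks the image processor and the aggregator.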
  • the ML architecture 150 may be an instance of the Orpheus or Orpheus+ models detailed herein in Sections A and B.
  • the imaging device 110 (sometimes herein generally referred to as a whole slide scanner, a digital slide scanner, an imaging device, or an image acquirer) may be any device to acquire biomedical images.
  • the imaging device 110 may execute, carry out, or otherwise perform a scan of a slide with a tissue sample.
  • the tissue sample on the slide may have been dyed using a histological stain or an immunohistochemical (IHC) stain to differentiate colors of certain features (e.g., cells) within the tissue sample.
  • the scanning of the tissue sample on the slide may be in accordance with microscopy (e.g., light microscopy, brightfield microscopy, fluorescence microscopy, confocal microscopy, or multi-photon microscopy).
  • the imaging device 110 may be in communication with the data processing system 105 and the administrative device 115, among others, via the network 120.
  • the administrative device 115 (sometimes herein referred to as a client device, a client, or an end user computing device) may be any computing device comprising one or more processors coupled with memory and software and capable of performing the various processes and tasks described herein.
  • the administrative device 115 may be in communication with the data processing system 105 and the imaging device 110, via the network 120.
  • the administrative device 115 may have at least one display.
  • the administrative device 115 may be associated with an entity (e.g., a clinician) examining a tissue sample (e.g., tumor from resection or biopsy) from the subject.
  • the administrative device 115 may be used to create or generate pathology reports identifying various characteristics of the tissue sample.
  • the display may present information about the subject provided by the data processing system 105. Referring now to Fig. 37, depicted is a block diagram of a process 200 for training ML architectures in the system 100 for determining scores related to recurrence of cancers in subjects.
  • the dataset indexer 125 may receive, retrieve, or otherwise obtain training data 205.
  • the training data 205 may be stored and maintained on the database 155.
  • the training data 205 may be used to initialize, train, and establish the ML architecture 150, including its subcomponents, such as the image processor 160, the text processor 165, and the aggregator 170, among others.
  • the training data 205 may identify or include a set of examples with which the ML architecture 150 is to perform learning (e.g., supervised or weakly supervised learning).
  • Each example of the training data 205 may be for at least one subject 210.
  • the subject 210 (sometimes herein referred to as a sample subject) may be a human or animal subject.
  • the subject 210 may be female or male of any age.
  • the subject 210 may have or may be diagnosed with cancer affecting at least one organ 215 of the subject 210.
  • the subject 210 may be at risk of relapse, metastasis, or recurrence of the cancer, such as reappearing within the organ 215 or spreading to another anatomical site outside of the organ 215 (e.g., lymph node, bones, lungs, or brain).
  • the breast cancer may be, for example, a hormone receptor-positive early breast cancer (HR+/HER2- EBC) of luminal type A or luminal type B, among others.
  • the breast cancer may be an invasive breast cancer (e.g., invasive ductal carcinoma (IDC) or invasive lobular carcinoma (ILC)) or non-invasive (or in situ) breast cancer (e.g., ductal carcinoma in situ (DCIS) or lobular carcinoma in situ (LCIS)), among others.
  • the subject 210 may have undergone treatment previously for the cancer associated with the organ 215.
  • the subject 210 may have undergone surgical removal or resection of primary tumor associated with the cancer from the organ 215.
  • the subject 210 may have undergone other forms of treatments, such as radiation therapy, adjuvant chemotherapy, or endocrine therapy.
  • At least one tissue sample 220 may have been extracted, removed, or obtained from the organ 215 of the subject 210.
  • the biomedical image 225 may be of at least a portion of the tissue sample 220 from the organ 215 of the subject 210.
  • the biomedical image 225 may be generated by scanning the tissue sample 220 placed on a slide using the imaging device 110 in accordance with brightfield microscopy.
  • the biomedical image 225 may be of at least one staining modality. The staining modality may depend on the stain applied on the tissue sample 220 prior to scanning and acquisition of the biomedical image 225.
  • the tissue sample 220 may be stained using a histological stain or an immunohistochemical (IHC) stain for imaging to create the biomedical image 225.
  • the histological stain may function to enhance visualization of cellular, tissue, and tumor structure within the tissue sample 220, and may include, for example, hematoxylin and eosin (H&E), hemosiderin stain, a Sudan stain, a Schiff stain, a Congo red stain, a Gram stain, a Ziehl-Neelsen stain, an auramine-rhodamine stain, a trichrome stain, a silver stain, and Wright’s stain, among others.
  • the IHC stain may serve to detect specific proteins within the cells of the tissue sample 220, and may include, for example, hormone receptor stain (e.g., estrogen receptor (ER) or progesterone receptor (PR)), HER2/neu stain, basal and myoepithelial marker stain (e.g., CK5/6, CK14, epidermal growth factor receptor (EGFR), p63, or SMA), breast and urothelial marker stain (e.g., GATA3), SOX-10 stain, cytokeratin stain, epithelial membrane antigen (EMA) stain, Ki-67 stain, CD markers (e.g., CD3, CD4, CD8, CD20, CD34, CD56, and CD117), mesenchymal marker, or neural markers, among others.
  • the example of the training data 205 may include multiple biomedical images 225 of different staining modalities.
  • one biomedical image 225 may be of at least a portion (e.g., a slice) of the tissue sample 220 in a H&E stain
  • another biomedical image 225 may be of at least another (e.g., another slice) portion of the tissue sample 220 in an IHC stain.
  • the example of the training data 205 may lack the biomedical image 225.
  • the text report 230 may include or identify a set of characteristics defining the tumor (e.g., in the tissue sample 220) associated with the cancer in the subject 210.
  • the set of characteristics may identify or include a gross description of the tumor (e.g., size, shape, or color), a specimen type, a histologic description (e.g., tumor type, histologic grade, or depth), a biomarker identifier (e.g., hormone receptors, proliferation markers, or other markers), among others.
  • the set of characteristics may depend on the type of cancer. For instance, for breast cancer, the set of characteristics identified in the text report 230 may include a histologic subtype, a percent positivity of HR, a percent positivity of HER2, a histologic grade, an anatomic site, a ductal carcinoma in situ (DCIS) indicator, or lobular carcinoma in situ (LCIS) indicator, among others.
  • the text report 230 may include a set of alphanumeric characters or strings defining the set of characteristics.
  • the text report 230 may have been created by a clinician (e.g., a pathologist) examining the tissue sample 220 directly or the biomedical image 225 associated with the subject 210 to evaluate the tumor and to assess the risk of recurrence of cancer in the subject 210.
  • the text report 230 may include structured data.
  • the structured data may include a set of fields and a corresponding set of values for the set of characteristics defining the tumor.
  • the text report 230 may include unstructured text.
  • the text report 230 may include free text data identifying a set of characteristics to define the tumor associated with the cancer in the subject 210.
  • the example of the training data 205 may lack the text report 230.
  • the image score 235 may identify or indicate a likelihood of recurrence associated with cancer in the subject 210, as derived from the biomedical image 225.
  • the image score 235 may have been determined or assigned by a clinician examining the biomedical image 225 of the tissue sample 220 to assess the risk of recurrence of the cancer in the subject 210.
  • the image score 235 may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood.
  • the image score 235 may indicate a risk of recurrence of the cancer in the subject 210. In some embodiments, the image score 235 may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence of) the relapse, metastasis, or recurrence of the cancer in the subject 210. The predicted outcome may be used to determine whether to administer the therapy to the subject 210. In some embodiments, the image score 235 may indicate the likelihood of recurrence within or at a time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210.
  • the time window may range anywhere between 1 month to 5 years, such as 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, or 8 years, among others.
  • the example of the training data 205 may lack the image score 235 (e.g., when the example also lacks the biomedical image 225).
  • the text score 240 may identify or indicate a likelihood of recurrence associated with cancer in the subject 210, as derived from the text report 230.
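  • the text score 240 may have been determined or assigned by a clinician examining the text report 230 to assess the risk of recurrence of the cancer in the subject 210.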
  • the text score 240 may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood. In some embodiments, the text score 240 may indicate a risk of recurrence of the cancer in the subject 210. In some embodiments, the text score 240 may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence of) the relapse, metastasis, or recurrence of the cancer in the subject 210.
  • the example of the training data 205 may identify or include at least one aggregate score.
  • the aggregate score may identify or indicate a likelihood of recurrence associated with cancer in the subject 210.
  • the aggregate score may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood.
  • the aggregate score may indicate a risk of recurrence of the cancer in the subject 210.
  • the aggregate score may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence of) the relapse, metastasis, or recurrence of the cancer in the subject 210.
  • the aggregate score may be calculated, generated, or determined as a function (e.g., an average or a weighted sum) of the image score 235 and the text score 240.
  • the aggregate score may be assigned by a clinician examining the tissue sample 220, the biomedical image 225, or the text report 230.
  • the aggregate score for breast cancer may be generated from a multigene assay, and may be in accordance with any one or more of an Oncotype DX (ODX) recurrence score, a MammaPrint score, a Prosigna (PAM50) risk of recurrence (ROR) score, or a breast cancer index (BCI), among others.
  • the example of the training data 205 may lack one of the image score 235, the text score 240, or the aggregate score.
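As a non-limiting illustration of determining the aggregate score as a function (e.g., a weighted sum) of the image score 235 and the text score 240, a minimal Python sketch follows. The function name `aggregate_score`, the parameter `image_weight`, and the fallback to the remaining score when one score is absent are assumptions made for illustration and are not specified in the description.

```python
from typing import Optional


def aggregate_score(
    image_score: Optional[float],
    text_score: Optional[float],
    image_weight: float = 0.5,
) -> Optional[float]:
    """Combine an image score and a text score into one aggregate score.

    image_weight is a hypothetical parameter; the text score receives the
    complementary weight (1 - image_weight), so 0.5 yields a plain average.
    When one score is missing (None), the other is returned unchanged,
    which is an assumption made for this sketch.
    """
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    if image_score is None and text_score is None:
        return None
    if image_score is None:
        return text_score
    if text_score is None:
        return image_score
    return image_weight * image_score + (1.0 - image_weight) * text_score
```

With the default weight the sketch reduces to an average of the two scores; other weights shift the aggregate toward one modality.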
  • the ML architecture 150 herein may be used in assessing risk of other types of cancers, such as: a bone cancer (e.g., osteosarcoma, chondrosarcoma, chordoma, or Ewing sarcoma), a lung cancer (e.g., non-small cell lung cancer (NSCLC) or small cell lung cancer (SCLC)), a skin cancer (e.g., melanoma, basal cell carcinoma (BCC), or squamous cell carcinoma (SCC)), or a brain cancer (e.g., glioblastoma multiforme (GBM), astrocytoma, oligodendroglioma, meningioma, or medulloblastoma), among others.
  • the input handler 130 may create, produce, or otherwise generate a set of tiles 250A–N (hereinafter generally referred to as tiles 250) using the biomedical image 225.
  • the input handler 130 may partition, section, or otherwise divide the biomedical image 225 to generate the set of tiles 250.
  • Each tile 250 may correspond to a respective portion of the biomedical image 225.
  • Each tile 250 may have dimensions corresponding to dimensions for an input of the ML architecture 150 or the image processor 160.
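As a non-limiting illustration of dividing the biomedical image 225 into the set of tiles 250, a minimal Python sketch follows. The image is represented as a row-major list of pixel rows, and the function name `split_into_tiles`, the square non-overlapping tiles, and the dropping of partial tiles at the image border are assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # row-major grid of pixel values (hypothetical format)


def split_into_tiles(image: Image, tile_size: int) -> List[Image]:
    """Divide an image into non-overlapping tile_size x tile_size tiles.

    Border regions too small to fill a whole tile are dropped, which is an
    assumption of this sketch rather than a requirement of the description.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    tiles: List[Image] = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            # Each tile corresponds to one portion of the source image.
            tile = [row[left:left + tile_size]
                    for row in image[top:top + tile_size]]
            tiles.append(tile)
    return tiles
```

Here `tile_size` would be chosen to match the input dimensions expected by the image processor 160.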
  • the input handler 130 may create, produce, or otherwise generate a set of tokens 255A–N (hereinafter generally referred to as a set of tokens 255) using the text report 230.
  • the model trainer 135 may initialize, train, or otherwise establish the ML architecture 150 using the training data 205.
  • the model trainer 135 may instantiate or create the ML architecture 150 to include the image processor 160, the text processor 165, and the aggregator 170.
  • the model trainer 135 may assign or set values (e.g., random or defined values) to the set of weights arranged across the image processor 160, the text processor 165, and the aggregator 170 in the ML architecture 150.
  • the model trainer 135 may create the ML architecture 150 based on a pre-trained model, and use the training data 205 to further train or fine-tune the ML architecture 150.
  • the model trainer 135 may provide tiles 250 generated from the biomedical image 225 of the H&E stain to the image processor 160 for processing images with H&E stain.
  • the model trainer 135 may provide tiles 250 created from the biomedical image 225 with IHC stain to the image processor 160 for processing images with IHC stain.
  • the image processor 160 may process the set of tiles 250 (e.g., in sequence) in accordance with the set of weights of the image processor 160. From processing the set of tiles 250, the image processor 160 may produce, create, or otherwise generate a set of embeddings 260A–N (hereinafter generally referred to as embeddings 260).
  • the image score 235’ may indicate the likelihood of recurrence within or at the time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210.
  • the image processor 160 may generate the image score 235’ by processing the set of embeddings 260 at the output layer (e.g., last activation or regression layer) within the set of weights in the image processor 160.
  • the image processor 160 may calculate, determine, or otherwise generate a set of attention scores for the set of tiles 250.
  • Each attention score may identify or indicate a degree of relevance of a respective tile 250 with the recurrence of the cancer. For instance, the attention score may quantify an importance of the given tile 250 in relation with other tiles 250 to the desired output of determining likelihood of recurrence as derived from the tiles 250.
  • the image processor 160 may generate the set of attention scores at an attention layer (e.g., multi-head attention layer) within the set of weights of the image processor 160.
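As a non-limiting illustration of attention scores that quantify the importance of each tile 250 relative to the other tiles 250, a minimal Python sketch follows. It uses a single-head, scaled dot-product form with a softmax, which is a simplification of the multi-head attention layer mentioned above. The names `tile_attention_scores` and `attention_pool` and the `query` vector (standing in for learned weights) are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax: outputs are positive and sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def tile_attention_scores(
    embeddings: Sequence[Sequence[float]], query: Sequence[float]
) -> List[float]:
    """Score each tile embedding against a query vector (hypothetical weights)."""
    scale = math.sqrt(len(query))
    logits = [sum(q * x for q, x in zip(query, emb)) / scale
              for emb in embeddings]
    return softmax(logits)


def attention_pool(
    embeddings: Sequence[Sequence[float]], scores: Sequence[float]
) -> List[float]:
    """Weighted sum of tile embeddings using the attention scores."""
    dim = len(embeddings[0])
    return [sum(s * emb[i] for s, emb in zip(scores, embeddings))
            for i in range(dim)]
```

Because the scores sum to one, a tile with a higher score contributes proportionally more to the pooled representation.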
  • the model trainer 135 may input, feed, or otherwise provide the set of tokens 255 to the text processor 165 of the ML architecture 150.
  • the text processor 165 may process the set of tokens 255 (e.g., in sequence) in accordance with the set of weights of the text processor 165.
  • the text processor 165 may produce, create, or otherwise generate a set of embeddings 265A–N (hereinafter generally referred to as embeddings 265).
  • the set of embeddings 265 may correspond to features (e.g., in the form of a feature map or vector) associated with the recurrence of cancer.
  • the set of embeddings 265 may be obtained by an intermediary layer (e.g., before the last activation or regression layer) in the text processor 165.
  • the text processor 165 may calculate, determine, or otherwise generate at least one text score 240’.
  • the text score 240’ may identify or indicate a likelihood of recurrence associated with cancer in the subject 210, as derived from the text report 230 or the set of tokens 255. In some embodiments, the text score 240’ may indicate the likelihood of recurrence within or at the time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210.
  • the text processor 165 may generate the text score 240’ by processing the set of embeddings 265 at the output layer (e.g., last activation or regression layer) within the set of weights in the text processor 165.
  • the text processor 165 may calculate, determine, or otherwise generate a set of attention scores for the set of tokens 255.
  • Each attention score may identify or indicate a degree of relevance of a respective token 255 with the recurrence of the cancer.
  • the attention score may quantify an importance of the given token 255 in relation with other tokens 255 to the desired output of determining likelihood of recurrence as derived from the tokens 255.
  • the text processor 165 may generate the set of attention scores at an attention layer (e.g., multi-head attention layer) within the set of weights of the text processor 165.
  • the model trainer 135 may input, feed, or otherwise provide the set of embeddings 260 from the image processor 160 and the set of embeddings 265 from the text processor 165 to the aggregator 170 of the ML architecture 150.
  • the aggregator 170 may process the set of embeddings 260 and the set of embeddings 265.
  • the aggregator 170 may process the input set of embeddings 260 and 265 in accordance with the set of weights (e.g., in a fusion layer or activation layer). From processing, the aggregator 170 may calculate, generate, or otherwise determine at least one aggregate score 270.
  • the aggregate score 270 may identify or indicate a likelihood of recurrence associated with cancer in the subject 210 as derived from both the biomedical image 225 and the text report 230. In some embodiments, the aggregate score 270 may indicate the likelihood of recurrence within or at the time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210. The time window for the image score 235’, the text score 240’, and the aggregate score 270 may be the same.
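As a non-limiting illustration of the aggregator 170 combining the set of embeddings 260 and the set of embeddings 265 into the aggregate score 270, a minimal Python sketch follows. Mean pooling of each set, concatenation, a single linear fusion layer, and a sigmoid output in the range 0 to 1 are all assumptions made for illustration, as are the names `mean_pool` and `fuse`. The `weights` and `bias` arguments stand in for learned parameters.

```python
import math
from typing import List, Sequence


def mean_pool(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average a set of equal-length embeddings into one vector."""
    count = len(embeddings)
    dim = len(embeddings[0])
    return [sum(emb[i] for emb in embeddings) / count for i in range(dim)]


def fuse(
    image_embeddings: Sequence[Sequence[float]],
    text_embeddings: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Fuse both modalities into a single score between 0 and 1."""
    # Concatenate the pooled image and text features.
    features = mean_pool(image_embeddings) + mean_pool(text_embeddings)
    if len(weights) != len(features):
        raise ValueError("weights must match the concatenated feature size")
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid activation
```

Other fusion schemes (e.g., attention across modalities) would fit the same interface; this sketch only shows the simplest case.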
  • the model trainer 135 may compare the outputs of the ML architecture 150 with the example of the training data 205 used to generate the outputs.
  • the model trainer 135 may calculate, generate, or otherwise determine one or more loss metrics 275A–N (hereinafter generally referred to as loss metrics 275).
  • the loss metric 275 may be calculated in accordance with any number of loss functions, such as a norm loss (e.g., L1 or L2), mean squared error (MSE), a quadratic loss, a cross-entropy loss, and a Huber loss, among others.
  • the model trainer 135 may compare the image score 235’ generated by the image processor 160 and the image score 235 as identified in the corresponding example in the training data 205. Based on the comparison, the model trainer 135 may determine the loss metric 275 for the image processor 160.
  • the loss metric 275 may identify or indicate a degree of deviation between the image scores 235 and 235’.
  • the model trainer 135 may compare the text score 240’ generated by the text processor 165 and the text score 240 as identified in the corresponding example in the training data 205. Based on the comparison, the model trainer 135 may determine the loss metric 275 for the text processor 165.
  • the loss metric 275 may identify or indicate a degree of deviation between the text scores 240 and 240’.
  • the model trainer 135 may compare the aggregate score 270 generated by the ML architecture 150 with the aggregate score as identified in the corresponding example in the training data 205.
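As a non-limiting illustration of loss metrics 275 that measure the deviation between generated scores and the scores identified in the training data 205, a minimal Python sketch of three of the listed loss functions (L1, mean squared error, and Huber) follows. The function names and the default `delta` value are assumptions made for illustration.

```python
from typing import Sequence


def l1_loss(predicted: Sequence[float], expected: Sequence[float]) -> float:
    """Mean absolute deviation between predicted and expected scores."""
    return sum(abs(p - e) for p, e in zip(predicted, expected)) / len(predicted)


def mse_loss(predicted: Sequence[float], expected: Sequence[float]) -> float:
    """Mean squared error between predicted and expected scores."""
    return sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(predicted)


def huber_loss(
    predicted: Sequence[float], expected: Sequence[float], delta: float = 1.0
) -> float:
    """Huber loss: quadratic for small residuals, linear for large ones."""
    total = 0.0
    for p, e in zip(predicted, expected):
        residual = abs(p - e)
        if residual <= delta:
            total += 0.5 * residual ** 2
        else:
            total += delta * (residual - 0.5 * delta)
    return total / len(predicted)
```

The same functions could be applied separately to the image scores, the text scores, and the aggregate scores to obtain one loss metric per component.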
  • the model trainer 135 may update one or more of the set of weights of the ML architecture 150 based on one or more of: the comparison between the image scores 235 and 235’, the comparison between the text scores 240 and 240’, or the comparison between the aggregate score 270 and the expected aggregate score. In some embodiments, the model trainer 135 may update one or more of the set of weights of the image processor 160 using the loss metric 275 determined for the image processor 160 using the image scores 235 and 235’. In some embodiments, the model trainer 135 may update one or more of the set of weights of the text processor 165 using the loss metric 275 determined for the text processor 165 using the text scores 240 and 240’.
  • the model trainer 135 may update one or more of the set of weights of the overall ML architecture 150 (or the aggregator 170) using the loss metric 275 determined using the aggregate scores. In some embodiments, the model trainer 135 may omit end-to-end training of the overall ML architecture 150 (or the aggregator 170) using the loss metric 275.
  • the updating of the weights of the ML architecture 150 may be in accordance with an optimization function (or an objective function).
  • the optimization function may define one or more learning rates or parameters at which the weights of the ML architecture 150 are to be updated.
  • the optimization function may include, for example, adaptive moment estimation (Adam), Adam with weight decay, or stochastic gradient descent (SGD), among others.
  • the weights of the image processor 160, the text processor 165, and the aggregator 170 in the ML architecture 150 may be modified or updated using the respective loss metrics 275 in accordance with the respective objective function (e.g., Adam).
  • the model trainer 135 may further train the ML architecture 150.
  • the training may be iteratively repeated using the examples of the training data 205 until a convergence condition for the ML architecture 150 is satisfied, thereby completing the training and establishment of the ML architecture 150.
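As a non-limiting illustration of iteratively updating weights at a learning rate until a convergence condition is met, a minimal Python sketch follows. It fits a toy two-parameter linear model with full-batch gradient descent on a mean squared error loss, which is a simplification of stochastic gradient descent and is not the ML architecture 150 itself. The function name `train`, the stopping rule (change in loss below `tolerance`), and all default values are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def train(
    inputs: Sequence[float],
    targets: Sequence[float],
    learning_rate: float = 0.1,
    tolerance: float = 1e-10,
    max_epochs: int = 10_000,
) -> Tuple[float, float, int]:
    """Fit score = weight * x + bias; return (weight, bias, epochs used)."""
    weight, bias = 0.0, 0.0
    previous_loss = float("inf")
    count = len(inputs)
    for epoch in range(1, max_epochs + 1):
        errors = [weight * x + bias - t for x, t in zip(inputs, targets)]
        loss = sum(err * err for err in errors) / count
        # Convergence condition: the loss has stopped changing.
        if abs(previous_loss - loss) < tolerance:
            return weight, bias, epoch
        # Gradients of the mean squared error with respect to each weight.
        grad_weight = 2.0 * sum(err * x for err, x in zip(errors, inputs)) / count
        grad_bias = 2.0 * sum(errors) / count
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
        previous_loss = loss
    return weight, bias, max_epochs
```

An adaptive optimizer such as Adam would replace the two update lines with moment-based updates while leaving the loop structure unchanged.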
  • FIG. 38 depicts a block diagram of a process 300 for applying ML architectures in the system 100 for determining scores related to recurrence of cancers in subjects.
  • at least one subject 310 may be under evaluation for risk of relapse, metastasis, or recurrence of the cancer.
  • the subject 310 may be similar to the subject 210, except that the subject 310 is a new individual and thus data associated with the subject 310 may not be in the training data 205.
  • the subject 310 may be a human or animal subject.
  • the subject 310 may be female or male of any age.
  • the subject 310 may have or may be diagnosed with cancer affecting at least one organ 315 of the subject 310.
  • the subject 310 may be at risk of relapse, metastasis, or recurrence of the cancer, such as reappearing within the organ 315 or spreading to another anatomical site outside of the organ 315 (e.g., lymph node, bones, lungs, or brain).
  • the cancer may be, for example, breast cancer, affecting the breast of the subject 310.
  • the breast cancer may be, for example, a hormone receptor-positive early breast cancer (HR+/HER2- EBC) of luminal type A or luminal type B, among others.
  • the breast cancer may be an invasive breast cancer (e.g., invasive ductal carcinoma (IDC) or invasive lobular carcinoma (ILC)) or non-invasive (or in situ) breast cancer (e.g., ductal carcinoma in situ (DCIS) or lobular carcinoma in situ (LCIS)), among others.
  • the cancer may include, for instance, a bone cancer, a lung cancer, skin cancer, brain cancer, head and neck cancer, uterine cancer, stomach cancer, ovarian cancer, cervical cancer, bladder cancer, prostate cancer, or colorectal cancer, among others.
  • the organ 315 may correspond to the anatomical site associated with the cancer in the subject 310.
  • the organ 315 may include, for example, at least a portion of skin, lung, brain, head, neck, uterus, stomach, ovary, cervix, prostate, breast, colon, or rectum, among others.
  • the subject 310 may have undergone treatment previously for the cancer associated with the organ 315.
  • the subject 310 may have undergone surgical removal or resection of the primary tumor associated with the cancer from the organ 315.
  • the subject 310 may have undergone other forms of treatments, such as radiation therapy, adjuvant chemotherapy, or endocrine therapy.
  • At least one tissue sample 320 may have been extracted, removed, or obtained from the organ 315 of the subject 310.
  • the tissue sample 320 may contain or include at least a portion of the tumor associated with the cancer in the subject 310.
  • the tissue sample 320 may include at least a portion or entirety of the tumor (e.g., primary tumor) associated with the cancer within the organ 315 of the subject 310.
  • the tissue sample 320 may have been obtained from the organ 315 as part of the surgical removal or resection of the tumor (e.g., the primary tumor) associated with the cancer.
  • the imaging device 110 may output, produce, or otherwise generate at least one biomedical image 325 of the tissue sample 320.
  • the biomedical image 325 may be similar to the biomedical image 225 described herein.
  • the biomedical image 325 may be of at least a portion of the tissue sample 320 from the organ 315 of the subject 310. For example, a slice of the tissue sample 320 may be placed on a slide and then scanned by the imaging device 110 in accordance with brightfield microscopy to acquire the biomedical image 325.
  • the biomedical image 325 may be of at least one staining modality. The staining modality may depend on the stain applied on the tissue sample 320 prior to scanning and acquisition of the biomedical image 325 by the imaging device 110.
  • the tissue sample 320 may be cut, processed, and stained using a histological stain or an immunohistochemical (IHC) stain for imaging to create the biomedical image 325.
  • the histological stain may function to enhance visualization of cellular, tissue, and tumor structure within the tissue sample 320, and may include, for example, hematoxylin and eosin (H&E), a hemosiderin stain, a Sudan stain, a Schiff stain, a Congo red stain, a Gram stain, a Ziehl-Neelsen stain, an auramine-rhodamine stain, a trichrome stain, a silver stain, and Wright’s stain, among others.
  • the IHC stain may serve to detect specific proteins within the cells of the tissue sample 320, and may include, for example, hormone receptor stain (e.g., estrogen receptor (ER) or progesterone receptor (PR)), HER2/neu stain, basal and myoepithelial marker stain (e.g., CK5/6, CK14, epidermal growth factor receptor (EGFR), p63, or SMA), breast and urothelial marker stain (e.g., GATA3), SOX-10 stain, cytokeratin stain, epithelial membrane antigen (EMA) stain, Ki-67 stain, CD markers (e.g., CD3, CD4, CD8, CD20, CD34, CD56, and CD117), mesenchymal markers, or neural markers, among others.
  • the imaging device 110 may generate multiple biomedical images 325 of one or more tissue samples from the subject 310 in different staining modalities. For instance, one biomedical image 325 may be of at least a portion (e.g., a slice) of the tissue sample 320 in an H&E stain, and another biomedical image 325 may be of at least another portion (e.g., another slice) of the tissue sample 320 in an IHC stain.
  • the imaging device 110 may transmit, send, or otherwise provide the biomedical image 325 to the data processing system 105 or the database 155.
  • the administrative device 115 may output, produce, or otherwise generate at least one text report 330.
  • the text report 330 may include or identify a set of characteristics defining the tumor (e.g., in the tissue sample 320) associated with the cancer in the subject 310.
  • the set of characteristics may identify or include a gross description of the tumor (e.g., size, shape, or color), a specimen type, a histologic description (e.g., tumor type, histologic grade, or depth), a biomarker identifier (e.g., hormone receptors, proliferation markers, or other markers), among others.
  • the set of characteristics may depend on the type of cancer.
  • for instance, for breast cancer, the set of characteristics identified in the text report 330 may include a histologic subtype, a percent positivity of HR, a percent positivity of HER2, a histologic grade, an anatomic site, a ductal carcinoma in situ (DCIS) indicator, or lobular carcinoma in situ (LCIS) indicator, among others.
  • the text report 330 may include a set of alphanumeric characters or strings defining the set of characteristics.
  • the text report 330 may include structured data.
  • the structured data may include a set of fields and a corresponding set of values for the set of characteristics defining the tumor.
  • the text report 330 may include unstructured text.
  • the administrative device 115 may apply optical character recognition (OCR) on a scan of the physical report to recognize and detect the set of characters on the report and to generate the text report 330. With the generation, the administrative device 115 may transmit, send, or otherwise provide the text report 330 to the data processing system 105 or the database 155. In some embodiments, the administrative device 115 may provide the text report 330 together with the biomedical image 325 to the data processing system 105 or the database 155.
  • the dataset indexer 125 may receive, retrieve, or otherwise obtain at least one dataset 305 including at least one of the biomedical image 325 or the text report 330 (or both).
  • the dataset 305 may lack one of the biomedical image 325 or the text report 330.
  • the dataset indexer 125 may extract or identify the biomedical image 325 and the text report 330 for the subject 310.
  • the dataset indexer 125 may receive, retrieve, or otherwise obtain the biomedical image 325 from the imaging device 110 and the text report 330 from the administrative device 115. With the obtaining, the dataset indexer 125 may create, form, or generate the dataset 305 to include the biomedical image 325 and the text report 330.
  • the dataset 305 may be used during inference to apply the ML architecture 150 to newly obtained biomedical images and text reports, outside of the training data 205.
  • the input handler 130 may create, produce, or otherwise generate a set of tiles 350A–N (hereinafter generally referred to as tiles 350) using the biomedical image 325.
  • the input handler 130 may partition, section, or otherwise divide the biomedical image 325 to generate the set of tiles 350.
  • Each tile 350 may correspond to a respective portion of the biomedical image 325.
  • Each tile 350 may have dimensions corresponding to dimensions for an input of the ML architecture 150 or the image processor 160.
  • the tokenizer may be a part of the ML architecture 150 or the text processor 165 to convert or transform the one or more words of the token 355 into corresponding numerical representations (e.g., word vectors).
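As a non-limiting illustration of a tokenizer converting the words of the text report 330 into numerical representations, a minimal Python sketch follows. It maps words to integer identifiers, which an embedding table could subsequently map to word vectors. The lowercase alphanumeric word splitting, the `<unk>` entry for unseen words, and the names `tokenize`, `build_vocabulary`, and `encode` are assumptions made for illustration, and the sample report strings in the usage note are hypothetical.

```python
import re
from typing import Dict, Iterable, List

UNKNOWN_ID = 0  # identifier reserved for words absent from the vocabulary


def tokenize(text: str) -> List[str]:
    """Split report text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(reports: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct word an integer identifier in order of first use."""
    vocabulary: Dict[str, int] = {"<unk>": UNKNOWN_ID}
    for report in reports:
        for word in tokenize(report):
            if word not in vocabulary:
                vocabulary[word] = len(vocabulary)
    return vocabulary


def encode(text: str, vocabulary: Dict[str, int]) -> List[int]:
    """Convert report text into a sequence of integer token identifiers."""
    return [vocabulary.get(word, UNKNOWN_ID) for word in tokenize(text)]
```

For example, a vocabulary built from the hypothetical report "Invasive ductal carcinoma, grade 2" would encode "Ductal carcinoma grade 3" with the unseen word "3" mapped to the unknown identifier.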
  • the model applier 140 may apply the ML architecture 150 to the biomedical image 325 and to the text report 330 of the dataset 305.
  • the model applier 140 may apply the ML architecture 150 to the set of tiles 350 and the set of tokens 355.
  • the model applier 140 may input, feed, or otherwise provide the set of tiles 350 to the image processor 160.
  • the image processor 160 may produce, create, or otherwise generate a set of embeddings 360A–N (hereinafter generally referred to as embeddings 360).
  • the set of embeddings 360 may correspond to features (e.g., in the form of a feature map or vector) associated with the recurrence of cancer.
  • the set of embeddings 360 may be obtained by an intermediary layer (e.g., before the last activation or regression layer) in the image processor 160.
  • the image processor 160 may calculate, determine, or otherwise generate at least one image score 335.
  • the image processor 160 may generate the image score 335 by processing the set of embeddings 360 at the output layer (e.g., last activation or regression layer) within the set of weights in the image processor 160.
  • the image processor 160 may calculate, determine, or otherwise generate a set of attention scores for the set of tiles 350.
  • Each attention score may identify or indicate a degree of relevance of a respective tile 350 with the recurrence of the cancer. For instance, the attention score may quantify an importance of the given tile 350 in relation with other tiles 350 to the desired output of determining likelihood of recurrence as derived from the tiles 350.
  • the image processor 160 may generate the set of attention scores at an attention layer (e.g., multi-head attention layer) within the set of weights of the image processor 160.
  • the model applier 140 may input, feed, or otherwise provide the set of tokens 355 to the text processor 165 of the ML architecture 150.
  • the text processor 165 may process the set of tokens 355 (e.g., in sequence) in accordance with the set of weights of the text processor 165. From processing the set of tokens 355, the text processor 165 may produce, create, or otherwise generate a set of embeddings 365A–N (hereinafter generally referred to as embeddings 365).
  • the set of embeddings 365 may correspond to features (e.g., in the form of a feature map or vector) associated with the recurrence of cancer.
  • the set of embeddings 365 may be obtained by an intermediary layer (e.g., before the last activation or regression layer) in the text processor 165.
  • the text processor 165 may calculate, determine, or otherwise generate at least one text score 340.
  • the text score 340 may identify or indicate a likelihood of recurrence associated with cancer in the subject 310, as derived from the text report 330 or the set of tokens 355.
  • the text score 340 may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood.
  • the text score 340 may indicate a risk of recurrence of the cancer in the subject 310.
  • the text score 340 may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence of) the relapse, metastasis, or recurrence of the cancer in the subject 310. The predicted outcome may be used to determine whether to administer the therapy to the subject 310.
  • the text score 340 may indicate the likelihood of recurrence within or at a time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 310.
  • the time window may range anywhere between 1 month to 5 years, such as 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, or 8 years, among others.
  • the text processor 165 may generate the text score 340 by processing the set of embeddings 365 at the output layer (e.g., last activation or regression layer) within the set of weights in the text processor 165.
  • the text processor 165 may calculate, determine, or otherwise generate a set of attention scores for the set of tokens 355.
  • Each attention score may identify or indicate a degree of relevance of a respective token 355 with the recurrence of the cancer.
  • the attention score may quantify an importance of the given token 355 in relation with other tokens 355 to the desired output of determining likelihood of recurrence as derived from the tokens 355.
  • the text processor 165 may generate the set of attention scores at an attention layer (e.g., multi-head attention layer) within the set of weights of the text processor 165.
  • the model applier 140 may input, feed, or otherwise provide the set of embeddings 360 from the image processor 160 and the set of embeddings 365 from the text processor 165 to the aggregator 170 of the ML architecture 150.
  • the aggregator 170 may process the set of embeddings 360 and the set of embeddings 365.
  • the aggregator 170 may process the input set of embeddings 360 and 365 in accordance with the set of weights (e.g., in a fusion layer or activation layer). From processing, the aggregator 170 may calculate, generate, or otherwise determine at least one aggregate score 370.
  • the aggregate score 370 may identify or indicate a likelihood of recurrence associated with cancer in the subject 310 as derived from both the biomedical image 325 and the text report 330.
  • the aggregate score 370 may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood.
  • the aggregate score 370 may indicate a risk of recurrence of the cancer in the subject 310.
  • the aggregate score 370 may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence of) the relapse, metastasis, or recurrence of the cancer in the subject 310.
  • the predicted outcome may be used to determine whether to administer the therapy to the subject 310.
  • the aggregate score 370 may indicate the likelihood of recurrence within or at a time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 310.
  • the time window may range anywhere between 1 month to 5 years, such as 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, or 8 years, among others.
  • the time window for the image score 335, the text score 340, and the aggregate score 370 may be the same.
  • FIG. 39 depicts a block diagram of a process 400 for evaluating outputs from ML architectures in the system 100 for determining scores related to recurrence of cancers in subjects.
  • the output evaluator 145 may determine or generate at least one classification 405 in accordance with the aggregate score 370 (or the image score 335 or the text score 340).
  • the classification 405 may indicate or identify the subject 310 as one of a candidate or non-candidate for administration of at least one therapy 410 for cancer.
  • the output evaluator 145 may store and maintain an association between the subject 310 (e.g., using an anonymized identifier) and the classification 405 using one or more data structures.
  • the data structures may be of any type, such as an array, matrix, table, linked list, binary tree, heap, stack, queue, class object, or data file, among others.
  • the output evaluator 145 may compare the aggregate score 370 with a threshold.
  • the threshold may identify or define a value for the aggregate score 370 at which to classify the subject 310 as one of the candidate or non-candidate for the therapy.
  • the output evaluator 145 may generate the classification 405 to identify the subject 310 as a candidate for the administration of the therapy 410.
  • the therapy 410 may be to block, inhibit, or otherwise prevent relapse, metastasis, or recurrence of the cancer in the subject 310.
  • the output evaluator 145 may generate the classification 405 to identify the subject 310 as a non-candidate for the administration of the therapy 410.
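As a non-limiting illustration of comparing the aggregate score 370 with a threshold to generate the classification 405, and of maintaining an association between an anonymized subject identifier and the classification, a minimal Python sketch follows. The direction of the comparison (a score at or above the threshold yields a candidate) is an assumption made for illustration, as are the names `Classification`, `classify`, and `record`, the dictionary used as the data structure, and the sample identifiers and threshold in the usage note.

```python
from typing import Dict, NamedTuple


class Classification(NamedTuple):
    subject_id: str      # anonymized identifier (hypothetical format)
    score: float         # aggregate score that was compared
    is_candidate: bool   # True when the subject is a candidate for therapy


def classify(subject_id: str, score: float, threshold: float) -> Classification:
    """Classify a subject as candidate or non-candidate for a therapy.

    Assumes, for this sketch only, that a score at or above the threshold
    marks the subject as a candidate.
    """
    return Classification(subject_id, score, score >= threshold)


def record(
    registry: Dict[str, Classification], classification: Classification
) -> None:
    """Store the association between a subject and its classification."""
    registry[classification.subject_id] = classification
```

For example, with a hypothetical threshold of 0.5, a subject with an aggregate score of 0.72 would be recorded as a candidate, and a subject with a score of 0.31 as a non-candidate.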
  • the therapy may include at least one of an endocrine therapy, an adjuvant chemotherapy, or radiation therapy, among others.
  • the type of therapy 410 identified for the subject 310 may depend on the type of cancer. For instance, for breast cancer (e.g., HR+/HER2- EBC), the therapy may include an endocrine therapy, an adjuvant chemotherapy, or radiation therapy, among others.
  • the endocrine therapy may include, for example, at least one of a selective estrogen receptor modulator (SERM), an aromatase inhibitor (AI), or a selective estrogen receptor degrader (SERD), among others.
  • the adjuvant chemotherapy may include, for example, at least one of anthracyclines, taxanes, cyclophosphamide, fluorouracil, or carboplatin, among others.
  • the radiation therapy may include external beam radiation therapy (EBRT) (e.g., whole or partial breast irradiation) or brachytherapy (e.g., intracavitary brachytherapy or interstitial brachytherapy), among others.
  • the therapy may include, for example, radiation therapy, hormone therapy, or immunotherapy, among others.
  • the surgical procedure may include a resection or mastectomy, among others.
  • the radiation therapy may include EBRT or brachytherapy, among others.
  • the chemotherapy may include capecitabine, 5-fluorouracil, oxaliplatin, irinotecan, trifluridine, cetuximab, or panitumumab, among others.
  • the targeted therapy may include, for example, EGFR inhibitors, anti-angiogenic therapy, or BRAF inhibitors, among others.
  • the output evaluator 145 may produce, create, or otherwise generate at least one output 415 based on the classification 405.
  • the output 415 may include information to be presented via the administrative device 115 (or another computing device) for assessing the risk of recurrence or predicted outcome with respect to recurrence of cancer in the subject 310.
  • the output 415 may also include instructions for presenting the information, for example, defining user interface elements of a graphical user interface (GUI) to display the information.
  • the information may be used (e.g., by a clinician examining the subject 310) to determine whether to provide or administer the subject 310 with the therapy 410.
  • the information may include at least one of: an identifier (e.g., anonymized identifier) for the subject 310; the biomedical image 325; one or more of the tiles 350 from the biomedical image 325; the text report 330; one or more of the tokens 355 from the text report 330; the image score 335; the text score 340; the aggregate score 370; or the time window for the image score 335, the text score 340, or the aggregate score 370 relative to the prior administration of therapy for the subject 310, among others.
  • the output evaluator 145 may generate the output 415 to identify or indicate the subject 310 as a candidate for the administration of the therapy 410.
  • the output 415 may also include an indicator identifying the therapy for the recurrence of the cancer, such as endocrine therapy or adjuvant chemotherapy for breast cancer.
  • the output evaluator 145 may generate the output 415 to identify or indicate the subject 310 as a non-candidate for the administration of the therapy 410.
  • the output 415 may also identify or include at least one indicator.
  • the output 415 may include at least one of an indication to not administer the subject 310 with the therapy for recurrence of the cancer, an indication to continue monitoring the subject 310 for recurrence of the cancer, or an indication to provide a second dataset for the subject 310.
  • the output evaluator 145 may send, transmit, or otherwise provide the output 415 to the administrative device 115 for presentation via the administrative device 115.
  • the output evaluator 145 may identify or select a subset of tiles 350’A–N (hereinafter generally tiles 350’) from the set of tiles 350 based on the set of attention scores generated by the image processor 160.
  • the subset of tiles 350’ may correspond to the tiles most relevant or important to the determination of the likelihood of recurrence of the cancer as indicated in the image score 335.
  • the output evaluator 145 may select the subset of tiles 350’ with the highest attention scores correlated with the most relevant or important determinants in determining the image score 335. With the selection, the output evaluator 145 may add, insert, or otherwise include the subset of tiles 350’ in the output 415, prior to provision of the output 415 to the administrative device 115. In some embodiments, the output evaluator 145 may identify or select a subset of tokens 355’A–N (hereinafter generally tokens 355’) from the set of tokens 355 based on the set of attention scores generated by the text processor 165.
  • the information may be used by a user (e.g., a clinician examining the subject 310) of the administrative device 115 as one of the factors in deciding on the treatment of the subject 310 with respect to the recurrence of cancer.
  • when the output 415 includes the classification 405 identifying the subject 310 as a candidate for therapy, the administrative device 115 may present an indication identifying the subject 310 as the candidate for therapy.
  • the subject 310 may be provided or administered with the therapy 410 (e.g., an endocrine therapy or an adjuvant chemotherapy) as identified in the output 415.
  • the clinician may determine that the subject 310 is at a high risk for recurrence and may decide to administer the subject 310 with the therapy 410 for recurrence of the cancer.
  • when the output 415 includes the classification 405 identifying the subject 310 as a non-candidate for therapy, the administrative device 115 may present an indication identifying the subject 310 as the non-candidate for therapy. In response to the presentation, the administration or provision of the therapy 410 to the subject 310 may be withheld.
  • the ML architecture 150 may be a significant improvement over unimodal models that take either imaging data or text data.
  • the use of the ML architecture 150 on the data processing system 105 may eliminate the reliance on separate, dedicated models for such unimodal data, and by extension reduce the reliance on separate systems to host these unimodal models, thereby freeing up computing capacity and network bandwidth.
  • the integration and leverage of multimodal data can allow the ML architecture 150 to embed and capture complex relationships between different types of data that unimodal models might miss or be unable to detect.
  • the scores (e.g., the aggregate score 370) outputted by the ML architecture 150 can thus be more accurate and more insightful relative to unimodal models. Additionally, the data processing system 105 and the ML architecture 150 thereon can offer substantial improvements to clinical outcomes by providing more accurate and timely predictions of cancer recurrence risk. Techniques that rely on multigene assay tests can take a substantial amount of time to provide results. In contrast, the ML architecture 150 can rely on more readily accessible and available images of tissue samples taken from the subject and the associated pathology text report, thereby reducing the amount of time in obtaining analysis results. By accurately identifying high-risk patients, the ML architecture 150 can guide clinical decision-making, allowing for more personalized and effective treatment plans.
  • the method 500 may be implemented or performed by any of the components detailed herein, such as the system 100 or the system 600.
  • a computing system may obtain a dataset including a biomedical image and a text report associated with a tumor of cancer in a subject (505).
  • the computing system may generate a set of tiles from the biomedical image and a set of tokens from the text report (510).
  • the computing system may apply an ML architecture to the biomedical image (or the set of tiles) and the text report (or the set of tokens) (515).
  • the computing system may determine a score indicating a likelihood of recurrence of cancer in the subject based on applying the ML architecture (520).
  • the computing system may determine whether the score satisfies a threshold (525).
  • FIG. 41 shows a simplified block diagram of a representative server system 600, client computing system 614, and network 626 usable to implement certain embodiments of the present disclosure.
  • server system 600 or similar systems can implement services or servers described herein or portions thereof.
  • Client computing system 614 or similar systems can implement clients described herein.
  • the system 100 described herein can be similar to the server system 600.
  • Server system 600 can have a modular design that incorporates a number of modules 602 (e.g., blades in a blade server embodiment); while two modules 602 are shown, any number can be provided.
  • Each module 602 can include processing unit(s) 604 and local storage 606.
  • Processing unit(s) 604 can include a single processor, which can have one or more cores, or multiple processors.
  • processing unit(s) 604 can include a general-purpose primary processor as well as one or more special-purpose co-processors such as graphics processors, digital signal processors, or the like. In some embodiments, some or all processing units 604 can be implemented using customized circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In other embodiments, processing unit(s) 604 can execute instructions stored in local storage 606. Any type of processors in any combination can be included in processing unit(s) 604.
  • Local storage 606 can include volatile storage media (e.g., DRAM, SRAM, SDRAM, or the like) and/or non-volatile storage media (e.g., magnetic or optical disk, flash memory, or the like). Storage media incorporated in local storage 606 can be fixed, removable, or upgradeable as desired. Local storage 606 can be physically or logically divided into various subunits such as a system memory, a read-only memory (ROM), and a permanent storage device.
  • the system memory can be a read-and-write memory device or a volatile read-and-write memory, such as dynamic random-access memory.
  • the system memory can store some or all of the instructions and data that processing unit(s) 604 need at runtime.
  • the ROM can store static data and instructions that are needed by processing unit(s) 604.
  • the permanent storage device can be a non-volatile read-and-write memory device that can store instructions and data even when module 602 is powered down.
  • storage medium includes any medium in which data can be stored indefinitely (subject to overwriting, electrical disturbance, power loss, or the like) and does not include carrier waves and transitory electronic signals propagating wirelessly or over wired connections.
  • local storage 606 can store one or more software programs to be executed by processing unit(s) 604, such as an operating system and/or programs implementing various server functions such as functions of the system 100 of FIG. 36 or any other system described herein, or any other server(s) associated with system 100 or any other system described herein.
  • “Software” refers generally to sequences of instructions that, when executed by processing unit(s) 604 cause server system 600 (or portions thereof) to perform various operations, thus defining one or more specific machine embodiments that execute and perform the operations of the software programs.
  • the instructions can be stored as firmware residing in read-only memory and/or program code stored in non-volatile storage media that can be read into volatile working memory for execution by processing unit(s) 604.
  • modules 602 can be interconnected via a bus or other interconnect 608, forming a local area network that supports communication between modules 602 and other components of server system 600.
  • Interconnect 608 can be implemented using various technologies including server racks, hubs, routers, etc.
  • a wide area network (WAN) interface 610 can provide data communication capability between the local area network (interconnect 608) and the network 626, such as the Internet. Various technologies can be used, including wired (e.g., Ethernet, IEEE 802.3 standards) and/or wireless technologies (e.g., Wi-Fi, IEEE 802.11 standards).
  • local storage 606 is intended to provide working memory for processing unit(s) 604, providing fast access to programs and/or data to be processed while reducing traffic on interconnect 608. Storage for larger quantities of data can be provided on the local area network by one or more mass storage subsystems 612 that can be connected to interconnect 608. Mass storage subsystem 612 can be based on magnetic, optical, semiconductor, or other data storage media.
  • Direct attached storage, storage area networks, network-attached storage, and the like can be used. Any data stores or other collections of data described herein as being produced, consumed, or maintained by a service or server can be stored in mass storage subsystem 612. In some embodiments, additional data storage resources may be accessible via WAN interface 610 (potentially with increased latency).
  • Server system 600 can operate in response to requests received via WAN interface 610. For example, one of modules 602 can implement a supervisory function and assign discrete tasks to other modules 602 in response to received requests. Work allocation techniques can be used. As requests are processed, results can be returned to the requester via WAN interface 610. Such operation can generally be automated.
  • Processing unit(s) 616 and storage device 618 can be similar to processing unit(s) 604 and local storage 606 described above. Suitable devices can be selected based on the demands to be placed on client computing system 614; for example, client computing system 614 can be implemented as a “thin” client with limited processing capability or as a high-powered computing device. Client computing system 614 can be provisioned with program code executable by processing unit(s) 616 to enable various interactions with server system 600.
  • Network interface 620 can provide a connection to the network 626, such as a wide area network (e.g., the Internet) to which WAN interface 610 of server system 600 is also connected.
  • network interface 620 can include a wired interface (e.g., Ethernet) and/or a wireless interface implementing various RF data communication standards such as Wi-Fi, Bluetooth, or cellular data network standards (e.g., 3G, 4G, LTE, etc.).
  • User input device 622 can include any device (or devices) via which a user can provide signals to client computing system 614. The client computing system 614 can interpret the signals as indicative of particular user requests or information. In various embodiments, user input device 622 can include any or all of a keyboard, touch pad, touch screen, mouse or other pointing device, scroll wheel, click wheel, dial, button, switch, keypad, microphone, and so on.
  • User output device 624 can include any device via which client computing system 614 can provide information to a user.
  • user output device 624 can include a display to display images generated by or delivered to client computing system 614.
  • the display can incorporate various image generation technologies, e.g., a liquid crystal display (LCD), light-emitting diode (LED) including organic light-emitting diodes (OLED), projection system, cathode ray tube (CRT), or the like, together with supporting electronics (e.g., digital-to-analog or analog-to-digital converters, signal processors, or the like).
  • Some embodiments can include a device such as a touchscreen that functions as both an input and an output device.
  • other user output devices 624 can be provided in addition to or instead of a display. Examples include indicator lights, speakers, tactile “display” devices, printers, and so on.
  • Some embodiments include electronic components, such as microprocessors, storage, and memory that store computer program instructions in a computer-readable storage medium. Many of the features described in this specification can be implemented as processes that are specified as a set of program instructions encoded on a computer-readable storage medium. When these program instructions are executed by one or more processing units, they cause the processing unit(s) to perform various operations indicated in the program instructions. Examples of program instructions or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.
  • processing unit(s) 604 and 616 can provide various functionality for server system 600 and client computing system 614, including any of the functionality described herein as being performed by a server or client, or other functionality. It will be appreciated that server system 600 and client computing system 614 are illustrative and that variations and modifications are possible. Computer systems used in connection with embodiments of the present disclosure can have other capabilities not specifically described here. Further, while server system 600 and client computing system 614 are described with reference to particular blocks, it is to be understood that these blocks are defined for convenience of description and are not intended to imply a particular physical arrangement of component parts.
  • blocks can be but need not be located in the same facility, in the same server rack, or on the same motherboard. Further, the blocks need not correspond to physically distinct components. Blocks can be configured to perform various operations, e.g., by programming a processor or providing appropriate control circuitry, and various blocks might or might not be reconfigurable depending on how the initial configuration is obtained. Embodiments of the present disclosure can be realized in a variety of apparatus including electronic devices implemented using any combination of circuitry and software. While the disclosure has been described with respect to specific embodiments, one skilled in the art will recognize that numerous modifications are possible. Embodiments of the disclosure can be realized using a variety of computer systems and communication technologies including but not limited to the specific examples described herein.
  • Embodiments of the present disclosure can be realized using any combination of dedicated components and/or programmable processors and/or other programmable devices.
  • the various processes described herein can be implemented on the same processor or different processors in any combination.
  • components are described as being configured to perform certain operations, such configuration can be accomplished, e.g., by designing electronic circuits to perform the operation, by programming programmable electronic circuits (such as microprocessors) to perform the operation, or any combination thereof.
  • Computer programs incorporating various features of the present disclosure may be encoded and stored on various computer-readable storage media; suitable media include magnetic disk or tape, optical storage media such as compact disk (CD) or DVD (digital versatile disk), flash memory, and other non-transitory media.
  • Computer-readable media encoded with the program code may be packaged with a compatible electronic device, or the program code may be provided separately from electronic devices (e.g., via Internet download or as a separately packaged computer-readable storage medium).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Primary Health Care (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

Presented herein are systems and methods for determining scores related to metastatic recurrence of cancers in subjects using multimodal machine learning (ML) architectures. A computing system may obtain, for a subject at risk of recurrence of cancer, a dataset comprising at least one of: (i) a biomedical image of a tissue sample from an organ associated with the cancer or (ii) a text report identifying a plurality of characteristics of a tumor associated with the cancer. The computing system may apply an ML architecture to at least one of the biomedical image or the text report of the dataset. The computing system may determine, based on applying the ML architecture, a score indicating a likelihood of recurrence associated with cancer in the subject. The computing system may generate a classification of the subject for administration of a therapy for the cancer, in accordance with the score.

Description

Atty. Dkt. No.: 115872-3191

MULTIMODAL TRANSFORMER MODELS FOR BIOMEDICAL IMAGES AND ASSOCIATED TEXTS

CROSS REFERENCES TO RELATED APPLICATIONS

The present application claims priority to US Provisional Patent Application No. 63/556,754, titled “Multimodal Transformer Models for Biomedical Images and Associated Texts,” filed February 22, 2024, which is incorporated by reference in its entirety.

BACKGROUND

A computing device may use various models to process inputs to generate outputs.

SUMMARY

Aspects of the present disclosure are directed to systems, methods, devices, and non-transitory computer readable media for determining scores related to metastatic recurrence of cancers in subjects using multimodal machine learning (ML) architectures. One or more processors coupled with memory may obtain, for a subject at risk of recurrence of cancer, a dataset comprising at least one of: (i) a biomedical image of a tissue sample from an organ associated with the cancer or (ii) a text report identifying a plurality of characteristics of a tumor associated with the cancer. The one or more processors may apply an ML architecture to at least a portion of the biomedical image and at least a portion of the text report of the dataset. The ML architecture may be established using a plurality of examples. Each of the plurality of examples may have, for a respective subject having undergone resection of a respective tumor from a respective organ: (i) a respective dataset comprising at least one of (a) a respective biomedical image of a respective tissue sample from the respective organ or (b) a respective text report identifying a respective plurality of characteristics of the respective tumor; and (ii) a respective score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject. The one or more processors may determine, based on applying the ML architecture, a score indicating a likelihood of recurrence associated with cancer in the subject.
The one or more processors may generate a classification identifying the subject as one of a candidate or non-candidate for administration of a therapy for the cancer, in accordance with the score. The one or more processors may store, using one or more data structures, an association between the subject and the classification.

In some embodiments, the one or more processors may generate, responsive to the score satisfying a threshold, a classification to identify the subject as the candidate for administration of the therapy for recurrence of the cancer. The one or more processors may provide, for presentation, an output based on the classification to identify the subject as the candidate for the administration of the therapy. In some embodiments, the subject is administered the therapy for cancer, in response to the presentation of the output.

In some embodiments, the one or more processors may generate, responsive to the score not satisfying a threshold, a classification to identify the subject as the non-candidate for administration of the therapy for recurrence of the cancer. The one or more processors may provide an output based on the classification to identify the subject as the non-candidate for the administration of the therapy. The output may include at least one of (i) a first indication to not administer the subject the therapy for recurrence of the cancer, (ii) a second indication to continue monitoring the subject for recurrence of the cancer, or (iii) a third indication to provide a second dataset for the subject.

In some embodiments, at least one of the plurality of examples may include the respective score indicating the corresponding likelihood of recurrence associated with cancer in the respective subject at a respective time window relative to resection of a respective primary tumor associated with the cancer.
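The threshold-based classification and the stored association described above can be sketched as follows. This is a minimal illustration only: the names `ClassificationStore` and `classify`, the example scores, the threshold value, and the reading of "satisfying a threshold" as "greater than or equal to" are all assumptions, not details taken from the disclosure.

```python
from dataclasses import dataclass, field

CANDIDATE = "candidate"
NON_CANDIDATE = "non-candidate"


@dataclass
class ClassificationStore:
    """Hypothetical data structure mapping an anonymized subject identifier
    to the classification generated for that subject."""
    records: dict = field(default_factory=dict)

    def classify(self, subject_id: str, score: float, threshold: float) -> str:
        # Assumption: the score "satisfies" the threshold when it is
        # greater than or equal to the threshold value.
        label = CANDIDATE if score >= threshold else NON_CANDIDATE
        # Store the association between the subject and the classification.
        self.records[subject_id] = label
        return label


store = ClassificationStore()
high = store.classify("subject-001", score=0.72, threshold=0.5)
low = store.classify("subject-002", score=0.18, threshold=0.5)
```

A dictionary is used here only for brevity; any of the data structure types the disclosure lists (table, array, class object, and so on) could hold the same association.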
The one or more processors may determine the score indicating the likelihood of recurrence associated with cancer in the subject at a time window relative to resection of a primary tumor associated with the cancer in the subject, wherein the time window ranges from three months to eight years.

In some embodiments, the one or more processors may generate a plurality of tiles using the biomedical image. Each of the plurality of tiles may correspond to a corresponding section of the biomedical image. The one or more processors may generate a plurality of tokens using the text report, each of the plurality of tokens corresponding to one or more respective words of the text report. The one or more processors may apply the ML architecture to the plurality of tiles and the plurality of tokens. In the ML architecture, an image processor may generate, using the plurality of tiles, a first set of embeddings corresponding to features associated with the recurrence of the cancer; a text processor may generate, using the plurality of tokens, a second set of embeddings corresponding to features associated with the recurrence of the cancer; an aggregator may determine, based on the first set of embeddings from the image processor and the second set of embeddings from the text processor, the score indicating the likelihood of recurrence associated with cancer in the subject.

In some embodiments, the one or more processors may apply the ML architecture to the plurality of tiles and the plurality of tokens. In the ML architecture, the image processor may generate, for each tile of the plurality of tiles, a first score of a plurality of first scores indicating a relevance of the tile to the recurrence of the cancer; and the text processor may generate, for each token of the plurality of tokens, a second score of a plurality of second scores indicating a relevance of the token to the recurrence of the cancer.
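The generation of tiles from a biomedical image and of tokens from a text report, as described above, might look like the following sketch. It assumes a toy image represented as a nested list of pixel values, non-overlapping square tiles, and plain lowercase whitespace tokenization; the function names `make_tiles` and `make_tokens` are hypothetical, and the disclosure does not specify the tile geometry or the tokenizer.

```python
from typing import List

Image = List[List[int]]  # hypothetical stand-in: a 2D grid of pixel values


def make_tiles(image: Image, tile_size: int) -> List[Image]:
    """Split the image into non-overlapping tile_size x tile_size sections,
    scanning left to right, then top to bottom."""
    height, width = len(image), len(image[0])
    tiles = []
    # Edge regions that do not fill a whole tile are dropped in this sketch.
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            tile = [row[left:left + tile_size] for row in image[top:top + tile_size]]
            tiles.append(tile)
    return tiles


def make_tokens(report: str) -> List[str]:
    """Lowercase the report and split on whitespace, one token per word.
    Punctuation stays attached to the neighboring word."""
    return report.lower().split()


# A 4x4 toy image whose pixel values are 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = make_tiles(image, tile_size=2)
tokens = make_tokens("Invasive ductal carcinoma, histologic grade 2")
```

In practice each tile and token would then be passed to the image processor and text processor to produce embeddings; that step depends on the trained model and is not sketched here.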
The one or more processors may select (i) a subset of tiles from the plurality of tiles based on the plurality of first scores and (ii) a subset of tokens from the plurality of tokens based on the plurality of second scores. The one or more processors may provide, for presentation, an output based on the subset of tiles and the subset of tokens.

In some embodiments, at least one example of the plurality of examples may include (i) the respective score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject based on the respective biomedical image and (ii) a respective second score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject based on the respective text report. The ML architecture may be trained by: applying the ML architecture to the respective dataset and the respective text report of the at least one example; determining (i) a first score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject from applying the ML architecture to the respective biomedical image and (ii) a second score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject from applying the ML architecture to the respective text report; and updating one or more of a plurality of weights in the ML architecture based on (i) a first comparison between the respective score and the first score and (ii) a second comparison between the respective second score and the second score.

In some embodiments, the one or more processors may obtain the dataset comprising (i) the biomedical image of at least a portion of the tissue sample in a first staining modality of a plurality of staining modalities and (ii) a second biomedical image of at least a portion of the tissue sample in a second staining modality of the plurality of staining modalities.
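The two comparisons used to update the weights during training, described above, can be illustrated with a combined loss. The sketch below assumes squared error as the comparison and a simple weighted sum of the two terms; the disclosure does not name a loss function or a weighting, and `squared_error`, `combined_loss`, and `text_weight` are hypothetical names introduced for illustration.

```python
def squared_error(predicted: float, target: float) -> float:
    """Assumed comparison between a predicted score and its target score."""
    return (predicted - target) ** 2


def combined_loss(
    image_pred: float,
    image_target: float,
    text_pred: float,
    text_target: float,
    text_weight: float = 1.0,
) -> float:
    """Sum the image-based and text-based comparisons into one training loss.

    image_target is the example's score based on the biomedical image and
    text_target is the example's second score based on the text report.
    """
    first_comparison = squared_error(image_pred, image_target)
    second_comparison = squared_error(text_pred, text_target)
    return first_comparison + text_weight * second_comparison


# One training example: the model's two predictions against the two targets.
loss = combined_loss(image_pred=0.6, image_target=0.5, text_pred=0.3, text_target=0.5)
```

A gradient-based optimizer would typically adjust the weights to reduce such a loss; that update step depends on the model framework and is left out of the sketch.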
The plurality of staining modalities may include a histological modality or an immunohistochemical (IHC) modality. The one or more processors may apply the ML architecture to the biomedical image in the first staining modality and the second biomedical image in the second staining modality.

In some embodiments, at least one of the plurality of examples further comprises the respective score to indicate at least one of (i) a risk of recurrence of the respective cancer in the respective subject or (ii) a predicted outcome of a respective administration of therapy for the respective subject. The one or more processors may determine the score indicating at least one of (i) a risk of recurrence of the cancer in the subject or (ii) a predicted outcome of the administration of therapy for the subject.

In some embodiments, the cancer may include cancer affecting a breast of the subject. The subject may have undergone resection of a primary tumor associated with breast cancer from the breast, prior to obtaining the dataset. The plurality of characteristics identified in the text report may include at least one of a histologic subtype, a percent positivity of HR, a percent positivity of HER2, a histologic grade, an anatomic site, a ductal carcinoma in situ (DCIS) indicator, or a lobular carcinoma in situ (LCIS) indicator. The therapy for the cancer may include at least one of an endocrine therapy or an adjuvant chemotherapy. The endocrine therapy may include at least one of a selective estrogen receptor modulator (SERM), an aromatase inhibitor (AI), or a selective estrogen receptor degrader (SERD). The adjuvant chemotherapy may include at least one of anthracyclines, taxanes, cyclophosphamide, fluorouracil, or carboplatin.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and other objects, aspects, features, and advantages of the disclosure will become more apparent and better understood by referring to the following description taken in conjunction with the accompanying drawings, in which:
Fig. 1. Developing a multimodal transformer model for breast cancer risk. Early-stage breast tumors are (a) resected, (b) profiled histologically, (c) digitized, and (d) used for downstream modeling of recurrence risk. (e) Number of pathologic slides, pathology reports, and patients included in each split. (f) Histogram depicting number of slides with a given number of tiles. (g) Tissue detection, tessellation, transformer-based modeling of CTransPath-derived tile embeddings, pathology report scraping, tokenization and transformer-based modeling, nuclear segmentation for interpretation, tensor fusion for multimodal integration. Graphic partially created using BioRender.
Fig. 2. Model distinguishes tumors by biologically meaningful features. UMAP embeddings denoting visual, linguistic, and multimodal representations annotated with (a) histologic grade, (b) progesterone receptor (PR) expression, and (c) recurrence score. (d) TP53 mutation status, (e) MYC copy number amplifications (CNA), (f) PIK3CA CNA, and (g) fraction genome altered (FGA) versus recurrence score. Mann-Whitney U **** q < 0.0001, ** q < 0.01, * q < 0.05.
Fig. 3. Model decisions are clinically explainable. (a) Foreground tissue colorized by visual attention, plotted in (b). One tile-attention value pair is denoted by the white and black arrows. (c) The five highest- and lowest-attention tiles from (a) at greater magnification. (d) Multimodal prediction with confidence interval and true score. (e) Pathology report-derived tokens colorized by language attention. (f) Whole-cohort token importance cloud with size of word scaled by mean importance across the MSK-BRCA test set.
[UNK]: unknown.
Fig. 4. Visual model generalizes internationally to three test cohorts. (a-c) density plots, (d-f) precision-recall curves, (g-i) receiver operating characteristic curves, (j-l) confusion matrices for MSK-BRCA, IEO-BRCA, and MDX-BRCA test sets. MAE: mean absolute error, PRC: precision-recall curve, AUPRC: area under the PRC, AUROC: area under the ROC. p-values calculated using (a-c) comparison against the beta distribution, (d-i) 1000-fold permutation testing, (j-l) McNemar’s exact test. Dashed lines in (e-l) represent performance for the minimally informative classifier.
Fig. 5. Spatial interpretation of tumors. (a) Association of cellular features with high- and low-risk tissue. (b) High and (c) low relative abundance of inflammatory cells. (d) High and (e) low standard deviation of neoplastic cell area. (f-i) Quantification of stromal fraction (SF), tumor cell proliferation (Prolif), lymphocyte infiltrating signature score (LISS), and lymphocyte fraction (LF) for predicted low- and high-risk patients depicted in blue and orange, respectively, in the MSK-BRCA cohort. P-values are generated using an independent t-test. (j-l) Depictions of (j) high-risk (including artifact on right), (k) low-risk, and (l) uninformative tissue synthesized by a generative adversarial network. Scale bars denote 64 µm. Neo.: neoplastic, inflam.: inflammatory, non-neo.: non-neoplastic, unlab.: unlabeled, connec.: connective, necros.: necrosis.
Fig. 6. Multimodal model performance and benchmarking. (a) regression of predicted versus true recurrence scores, (b) precision-recall curve for high-risk disease, (c) concordance correlation coefficient for all data splits and models, (d) receiver operating characteristic curve for high-risk disease, (e) confusion matrix using score cutoffs of 11 and 25, (f) Pearson correlation for all data splits and models.
(g-i) PRCs and (j-l) ROCs for multimodal, language, and vision models, respectively, compared against a clinical nomogram in the full information setting. MAE: mean absolute error, PRC: precision-recall curve, AUPRC: area under the PRC, L: language model, V: vision model, VL: multimodal model, ROC: receiver operating characteristic curve, AUROC: area under the ROC. Error bars (c,f) and shaded linear uncertainty in (a) represent 95% confidence intervals by bootstrapping. p-values calculated using (a) comparison against the beta distribution, (b,d) 1000-fold permutation testing, (e) McNemar’s exact test. Dashed lines in (a,d) represent performance for the minimally informative classifier.
Fig. 7: Case inclusion diagram. Depicts slides (yellow) and patients (blue) joined to form the full cohort of paired slides and reports with recurrence scores (green).
Fig. 8: Quantitative analysis of high- versus low-attention tiles. Full feature titles available in source data.
Fig. 9. MSK-BRCA training and validation unimodal vision model performance. (a-b) density plots, (c-d) precision-recall curves, (e-f) receiver operating characteristic curves, (g-h) confusion matrices for MSK-BRCA training and validation sets (left to right). MAE: mean absolute error, PRC: precision-recall curve, AUPRC: area under the PRC, AUROC: area under the ROC. p-values calculated using (a-d) comparison against the beta distribution, (e-l) 1000-fold permutation testing, (m-p) McNemar’s exact test. Dashed lines in (e-l) represent performance for the minimally informative classifier.
Fig. 10: Tumor microenvironment quantification of the top attention tiles of the recurrence risk vision prediction model.
Quantification of the tumor microenvironment for the top 50 predicted high- and low-risk patients by the recurrence risk vision prediction model, specifically for the stromal fraction (SF) and leukocyte fraction (LF) as assessed via DNA methylation analysis, lymphocyte infiltrating signature score (LISS) and proliferation (Prolif) as measured by RNA expression for the a. MSK cohort (n=100), b. IEO cohort (n=100) and c. MDX cohort (n=100). Statistical significance is measured by an independent t-test, indicating a difference in sample means between predicted high- and low-risk patients (p < 0.05).
Fig. 11. Unimodal language model performance. (a-c) density plots, (d-f) precision-recall curves, (g-i) receiver operating characteristic curves, (j-l) confusion matrices for training, validation, and MSKCC test sets (left to right). MAE: mean absolute error, PRC: precision-recall curve, AUPRC: area under the PRC, AUROC: area under the ROC. p-values calculated using (a-d) comparison against the beta distribution, (e-l) 1000-fold permutation testing, (m-p) McNemar’s exact test. Dashed lines in (e-l) represent performance for the minimally informative classifier.
Fig. 12: Sensitivity and specificity analysis for the predicted recurrence risk scores of all patients. The sensitivity and specificity are calculated using a threshold for the predicted recurrence risk score of < 16 and ≥ 25, respectively. The thresholds are determined in the MSK cohort (n=2338) and are set in the external IEO (n=452) and MDX (n=572) cohorts. The analysis does not account for age and nodal status. The sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts. Moreover, the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
Fig. 13: Sensitivity and specificity analysis for the predicted recurrence risk scores of all patients above 50 years of age and having 1-3 positive nodes.
The sensitivity and specificity are calculated using a threshold for the predicted recurrence risk score of < 16 and ≥ 25, respectively. The thresholds are determined in the MSK cohort (n=987) and are set in the external IEO (n=124) and MDX (n=171) cohorts. The analysis is focused on the patient subset above 50 years of age and having 1-3 positive nodes. The sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts. Moreover, the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
Fig. 14: Sensitivity and specificity analysis for the predicted recurrence risk scores of all patients below 50 years of age and 0 positive nodes. The sensitivity and specificity are calculated using a threshold for the predicted recurrence risk score of < 16 and ≥ 25, respectively. The thresholds are determined in the MSK cohort (n=450) and are set in the external IEO (n=114) and MDX (n=70) cohorts. The analysis is focused on the patient subset below 50 years of age and having 0 positive nodes. The sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts. Moreover, the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
Fig. 15: Potential clinical use-case of the Orpheus recurrence risk prediction model. The Orpheus multimodal prediction model for recurrence risk prediction is potentially capable of guiding decision-making for adjuvant cytotoxic chemotherapy alongside adjuvant endocrine therapy with 94.4% precision and 33.3% recall as measured on the withheld test set of the MSK cohort (n=2338). The model is within scope for early-stage hormone receptor positive (HR+) and HER2- breast cancer patients.
Fig. 16: Orpheus performance for TAILORx risk stratification.
(a-c) The whole slide image (WSI)-based model reliably identifies high-risk disease (RS > 25) as defined by TAILORx across the MSK-BRCA (n=1029), IEO-BRCA (n=452), and MDX-BRCA (n=572) test cohorts. (d) The multimodal model outperforms the WSI- and text-based unimodal models. Error bars by 1,000-fold bootstrapping. (e-f) The multimodal model outperforms a clinicogenomic nomogram in identifying high-risk disease. (g) Calibration plot and (h) predicted score frequencies. All results shown for test sets.
Fig. 17: Orpheus+ performance for identifying distant recurrence. (a-b) Orpheus+ risk scores versus Oncotype DX ® (ODX, N=1464) and Multiplex DX ® (MDX, N=575) recurrence score (RS) values for cases with or without distant recurrence. P-values by Mann-Whitney-Wilcoxon test, two-sided. (c) Time-dependent areas under the receiver operating characteristic curve for Orpheus+ and ODX RS scores against recurrence in the MSK-BRCA test set for cases with ODX RS ≤ 25, with associated calibration plots. Mean value represents AUC over 2-8 years. (d) Receiver operating characteristic curves for Orpheus+ and MDX RS scores against recurrence in the MDX-BRCA test/validation set, with associated calibration plots. **** denotes p ≤ 1e-4, *** denotes p ≤ 1e-3, ** denotes p ≤ 1e-2, and * denotes p ≤ 5e-2. In box plots, boxes denote 25th-75th percentiles with lines at the median, whiskers denote the range without outliers, and individual points denote outliers. Exact p-values, top to bottom, for (a) are 5e-7 and 0.80 and for (b) are 0.0013 and 0.0011.
Fig. 18: Cellular and transcriptomic correlates of risk. (a) Saliency map of contributory foreground tiles, with one tile-attention value pair denoted by the arrows and predicted score. (b) Histogram of tile attention values. (c) The five highest- and lowest-attention tiles at greater magnification. (d) Association of cellular features with high- and low-risk tissue.
Hypothesis testing performed with the two-sided Mann-Whitney U test with corrections for multiple testing. (e) High and (f) low relative abundance of inflammatory cells. (g) High and (h) low standard deviation of neoplastic nuclear area. (i-l) Quantification of stromal fraction (SF), tumor cell proliferation (Prolif), lymphocyte infiltrating signature score (LISS), and lymphocyte fraction (LF) for predicted low- and high-risk patients (50 each) depicted in blue and orange, respectively, in the MSK-BRCA cohort. p-values are generated using an independent two-sided t-test. In box plots, boxes denote 25th-75th percentiles, whiskers denote the range without outliers, and individual points denote outliers. Scale bars denote 64 µm.
Fig. 19: Potential clinical use case of the Orpheus recurrence risk prediction model. The Orpheus multimodal prediction model for recurrence risk prediction is potentially capable of guiding decision-making for adjuvant cytotoxic chemotherapy alongside adjuvant endocrine therapy for predicted low- and high-risk patients. The model is within scope for early-stage hormone receptor positive (HR+) and HER2- breast cancer patients.
Fig. 20: Visual model generalizes internationally to three test cohorts. (a-c) density plots, (d-f) precision-recall curves, (g-i) confusion matrices, and (j-l) calibration plots for MSK-BRCA, IEO-BRCA, and MDX-BRCA test sets. Positive class is ODX RS > 25. MAE: mean absolute error, PRC: precision-recall curve, AUPRC: area under the PRC, AUROC: area under the ROC, m: slope, b: intercept. p-values calculated using (a-c) comparison against the beta distribution, (g-i) McNemar’s exact test. Dashed lines in (d-f) represent performance for the minimally informative classifier.
Fig. 21: MSK-BRCA training and validation unimodal vision model performance.
(a-b) density plots, (c-d) precision-recall curves, (e-f) receiver operating characteristic curves, (g-h) confusion matrices for MSK-BRCA training and validation sets (left to right). MAE: mean absolute error, PRC: precision-recall curve, AUPRC: area under the PRC, AUROC: area under the ROC. p-values calculated using (a-d) comparison against the beta distribution, (e-l) 1000-fold permutation testing, (m-p) McNemar’s exact test. Dashed lines in (e-l) represent performance for the minimally informative classifier.
Fig. 22: Unimodal language model performance. (a-c) density plots, (d-f) precision-recall curves, (g-i) receiver operating characteristic curves, (j-l) confusion matrices for training, validation, and MSK-BRCA test sets (left to right). MAE: mean absolute error, PRC: precision-recall curve, AUPRC: area under the PRC, AUROC: area under the ROC, m: slope, b: intercept. p-values calculated using (a-d) comparison against the beta distribution, (e-l) 1000-fold permutation testing, (m-p) McNemar’s exact test. Dashed lines in (e-l) represent performance for the minimally informative classifier.
Fig. 23: Multimodal model performance in the MSK-BRCA test set. (a) Density plot, (b) precision-recall curve, (c) Pearson correlation, (d) receiver operating characteristic curve for all cases (not just those with sufficient data for nomogram), (e) confusion matrix. MAE: mean absolute error, PRC: precision-recall curve, AUPRC: area under the PRC, AUROC: area under the ROC, m: slope, b: intercept. p-values calculated using (a) comparison against the beta distribution, (e) McNemar’s exact test. Dashed lines in (b) and (d) represent performance for the minimally informative classifier. Positive/high-risk cases are defined as ODX RS > 25, with low-risk cases defined as ODX RS < 11 in the confusion matrix.
Fig. 24: Orpheus performance compared to clinicopathologic nomogram.
Precision-recall curves for (a) multimodal, (b) language-based, and (c) vision-based models and receiver operating characteristic curves for (d) language-based and (e) vision-based models are shown. The positive class is ODX RS > 25. Dashed lines represent performance for the minimally informative classifier.
Fig. 25: Performance of LLM-based ensemble model used to identify recurrences from the electronic medical record. Ground truth labels (on the y axis) were hand curated.
Fig. 26: Orpheus+ performance for distant recurrence prediction. (a) Orpheus+ risk scores versus Oncotype DX ® (ODX) recurrence score (RS) values for cases with or without distant recurrence for any ODX RS value. (b) Time-dependent area under the receiver operating characteristic curve and (c) receiver operating characteristic curve for Orpheus+ and ODX RS score to infer recurrence. (d) Orpheus+ risk scores versus ODX RS values for cases with or without distant recurrence for the MSK-BRCA test set alone, excluding the MSK-BRCA validation set. (e) Time-dependent area under the receiver operating characteristic curve and (f) receiver operating characteristic curve for Orpheus+ and ODX RS score to infer recurrence. **** denotes p ≤ 1e-4, *** denotes p ≤ 1e-3, ** denotes p ≤ 1e-2, and * denotes p ≤ 5e-2. In box plots, boxes denote 25th-75th percentiles, whiskers denote the range without outliers, and individual points denote outliers.
Fig. 27: Quantitative analysis of high- versus low-attention tiles. Full feature titles available in source data.
Fig. 28: (a) Pathology report-derived tokens colorized by language attention. (b) Whole-cohort token importance cloud with size of word scaled by mean importance across the MSK-BRCA test set. [UNK]: unknown.
Fig. 29: Performance with ablation of known correlates of Recurrence Score. Error bars denote 95% confidence interval using 1000-fold bootstrapping.
Fig.
30: Model distinguishes tumors by biologically meaningful features. UMAP embeddings of the MSK-BRCA test set denoting visual, linguistic, and multimodal representations annotated with (a) histologic grade, (b) progesterone receptor (PR) expression, and (c) recurrence score. (d) TP53 mutation status, (e) MYC copy number amplifications (CNA), (f) PIK3CA CNA, (h) BRCA2 mutation status, and (g) fraction genome altered (FGA) versus inferred (multimodal) recurrence score in the entire MSK-BRCA dataset. Mann-Whitney U **** q < 0.0001, *** q < 0.001. The box plots depict the interquartile range (IQR), with the lower, middle and upper edge being the 25th, 50th, and 75th percentile, respectively. The whiskers of the box plots are defined as the minimum and maximum values 1.5 times the IQR away from the lower and upper quartiles of the data, respectively.
Fig. 31: Associations of predicted multimodal score with features from MSK-IMPACT in the test set alone (N=97). The relations persist for (a) TP53 mutations and (b) fraction of genome altered. In box plots, boxes denote 25th-75th percentiles, whiskers denote the range without outliers, and individual points denote outliers.
Fig. 32: Tumor microenvironment quantification of the top attention tiles of the recurrence risk vision prediction model. Quantification of the tumor microenvironment for the top 50 predicted high- and low-risk patients by the recurrence risk vision prediction model, specifically for the stromal fraction (SF) and leukocyte fraction (LF) as assessed via DNA methylation analysis, lymphocyte infiltrating signature score (LISS) and proliferation (Prolif) as measured by RNA expression for the a. MSK cohort (n=100), b. IEO cohort (n=100) and c. MDX cohort (n=100). High-risk is defined as predictions > 0.25 by the recurrence risk vision model.
Statistical significance is measured by an independent t-test, indicating a difference in sample means between predicted high- and low-risk patients (p < 0.05). In box plots, boxes denote 25th-75th percentiles, whiskers denote the range without outliers, and individual points denote outliers.
Fig. 33: Sensitivity and specificity analysis for the predicted recurrence risk scores of all patients. The sensitivity and specificity are calculated using a threshold for the predicted recurrence risk score of < 11 and > 25, respectively. The thresholds are determined by the standard low- and high-risk thresholds for OncotypeDX and applied to the MSK test (n=1029), external IEO (n=452) and external MDX (n=572) cohorts. The analysis does not account for age and nodal status. The sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts. Moreover, the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
Fig. 34: Sensitivity and specificity analysis for the predicted recurrence risk scores of all patients above 50 years of age. The sensitivity and specificity are calculated using a threshold for the predicted recurrence risk score of < 11 and > 25, respectively. The thresholds are determined by the standard low- and high-risk thresholds for OncotypeDX and applied to the MSK test (n=636), external IEO (n=225) and external MDX (n=370) cohorts. The analysis is focused on the patient subset above 50 years of age and having 1-3 positive nodes. The sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts. Moreover, the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
Fig. 35: Sensitivity and specificity analysis for the predicted recurrence risk scores of all patients below 50 years of age.
The sensitivity and specificity are calculated using a threshold for the predicted recurrence risk score of < 11 and > 25, respectively. The thresholds are determined by the standard low- and high-risk thresholds for OncotypeDX and applied to the MSK test (n=305), external IEO (n=227) and external MDX (n=205) cohorts. The analysis is focused on the patient subset below 50 years of age and having 0 positive nodes. The sensitivity versus the patient count is plotted for the a. MSK, b. IEO and c. MDX cohorts. Moreover, the specificity versus the patient count is plotted for the d. MSK, e. IEO and f. MDX cohorts.
Fig. 36 depicts a block diagram of a system for determining scores related to recurrence of breast cancers in subjects using multimodal machine learning (ML) architectures, in accordance with an illustrative embodiment.
Fig. 37 depicts a block diagram of a process for training ML architectures in the system for determining scores related to recurrence of breast cancers in subjects, in accordance with an illustrative embodiment.
Fig. 38 depicts a block diagram of a process for applying ML architectures in the system for determining scores related to recurrence of breast cancers in subjects, in accordance with an illustrative embodiment.
Fig. 39 depicts a block diagram of a process for evaluating outputs from ML architectures in the system for determining scores related to recurrence of breast cancers in subjects, in accordance with an illustrative embodiment.
Fig. 40 depicts a flow diagram of a method of determining scores related to recurrence of breast cancers in subjects using multimodal machine learning (ML) architectures, in accordance with an illustrative embodiment.
Fig. 41 is a block diagram of a computing environment according to an example implementation of the present disclosure.
DETAILED DESCRIPTION
Following below are more detailed descriptions of various concepts related to, and embodiments of, systems and methods for determining scores related to recurrence of cancers in subjects using multimodal machine learning (ML) architectures. It should be appreciated that various concepts introduced above and discussed in greater detail below may be implemented in any of numerous ways, as the disclosed concepts are not limited to any particular manner of implementation. Examples of specific implementations and applications are provided primarily for illustrative purposes.
Section A describes multimodal histopathological models for stratifying hormone receptor-positive early breast cancer.
Section B describes inference of recurrence risk and predicted outcome using multimodal histopathological models.
Section C describes systems and methods of determining scores related to recurrence of cancers in subjects using multimodal machine learning (ML) architectures.
Section D describes a network environment and computing environment which may be useful for practicing various computing related embodiments described herein.
A. Multimodal Histopathological Models for Stratifying Hormone Receptor-Positive Early Breast Cancer
A multimodal transformer model was trained and validated using patients, taking both H&E images and their corresponding synoptic text reports as input. Accurate inference of recurrence score was shown from whole-slide images, the raw text of their corresponding reports, and their combination as measured by Pearson’s correlation. Moreover, the model generalizes well to external international cohorts, effectively identifying recurrence risk and high-risk status from whole-slide images.
Probing the biologic underpinnings of the model decisions uncovered tumor cell size heterogeneity, immune cell infiltration, a proliferative transcription program, and stromal fraction as correlates of higher-risk predictions. In conclusion, at an operating point of 94.4% precision and 33.3% recall, this model could help increase global adoption and shorten lag between resection and adjuvant therapy.
For patients with hormone receptor-positive, early breast cancer without HER2 amplification, multigene expression assays including Oncotype DX ® recurrence score (RS) have been clinically validated to identify patients who stand to derive added benefit from adjuvant cytotoxic chemotherapy. However, cost and turnaround time have limited its global adoption despite recommendation by practice guidelines. Routinely available hematoxylin and eosin (H&E)-stained pathology slides were investigated to see if they could act as a surrogate triaging data substrate by predicting RS using machine learning methods.
A multimodal transformer model, Orpheus, was trained and validated using 6,203 patients across three independent cohorts, taking both H&E images and their corresponding synoptic text reports as input. Accurate inference of recurrence score was shown from whole-slide images (r = 0.63 (95% C.I. 0.58 - 0.68); n = 1,029), the raw text of their corresponding reports (r = 0.58 (95% C.I. 0.51 - 0.64); n = 972), and their combination (r = 0.68 (95% C.I. 0.64 - 0.73); n = 964) as measured by Pearson’s correlation. To predict high-risk disease (RS>25), the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.89 (95% C.I. 0.83 - 0.94), and area under the precision-recall curve (AUPRC) of 0.64 (95% C.I. 0.60 - 0.82), compared to 0.49 (95% C.I. 0.36 - 0.64) for an existing nomogram based on clinical and pathologic features.
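By way of a non-limiting sketch, a Pearson correlation with a 95% confidence interval of the kind reported above may be computed with a percentile bootstrap. The helper names `pearson_r` and `bootstrap_ci`, the synthetic scores, and the choice of 1,000 resamples are assumptions for illustration; the sketch does not reproduce the reported results.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for Pearson's r."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        xs, ys = x[idx], y[idx]
        if xs.std() == 0 or ys.std() == 0:
            continue  # skip degenerate resamples with no variance
        stats.append(pearson_r(xs, ys))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Synthetic "true" and "predicted" scores (illustrative only).
rng = np.random.default_rng(1)
true_rs = rng.uniform(0, 50, size=200)
pred_rs = true_rs + rng.normal(0, 10, size=200)

r = pearson_r(true_rs, pred_rs)
ci_low, ci_high = bootstrap_ci(true_rs, pred_rs)
```

Resampling whole cases, rather than scores independently, preserves the pairing between true and predicted values in each bootstrap replicate.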
Moreover, the model generalizes well to external international cohorts, effectively identifying recurrence risk (r = 0.61, p < 10⁻⁴, n = 452; r = 0.60, p < 10⁻⁴, n = 575) and high-risk status (AUROC = 0.80, p < 10⁻⁴, AUPRC = 0.68, p < 10⁻⁴, n = 452; AUROC = 0.83, p < 10⁻⁴, AUPRC = 0.73, p < 10⁻⁴, n = 575) from whole-slide images. Probing the biologic underpinnings of the model decisions uncovered tumor cell size heterogeneity, immune cell infiltration, a proliferative transcription program, and stromal fraction as correlates of higher-risk predictions. In conclusion, at an operating point of 94.4% precision and 33.3% recall, this model could help increase global adoption and shorten lag between resection and adjuvant therapy.
Introduction
Hormone receptor-positive disease without HER2 overexpression or amplification (HR+/HER2-) is the most common subtype of early breast cancer (EBC), accounting for approximately 70% of diagnoses. A major challenge in the management of this disease has been identifying the cancers for which adjuvant chemotherapy does not meaningfully reduce the risk of recurrence. Risk stratification of HR+/HER2- EBC relies on the integration of traditional clinicopathological features (e.g., tumor size, nodal status, Nottingham grade) with multigene assays to estimate risk of recurrence and personalize adjuvant therapy. Among the assays, the Oncotype DX ® (ODX) assay (Genomic Health, Redwood City, CA) is the most extensively validated and widely used in clinical practice. By measuring transcriptional abundance of 16 genes, including ESR1, PGR, HER2, MKI67, and MMP11, against the abundance of five reference genes using reverse transcription quantitative real-time PCR, ODX calculates a recurrence score (RS) ranging from zero to 100 with both prognostic and predictive value.
Substantial clinical evidence from retrospective and prospective trials has shown that ODX can improve clinical decision-making in breast cancer. Retrospective analyses of the NSABP B14 and TransATAC trials demonstrated the prognostic value of ODX in stratifying the risk of recurrence for HR+/HER2- EBC patients. Similarly, analyses of the NSABP B20 and SWOG8814 clinical trials established the predictive value of ODX by uncovering a survival benefit with the addition of adjuvant chemotherapy to endocrine therapy for patients with high risk of disease relapse. These studies provided the rationale for the prospective evaluation of ODX in the TAILORx (>10,000 patients with node-negative disease) and RxPONDER (5,083 patients with one to three positive lymph nodes) trials and established ODX as the preferred genomic assay for adjuvant treatment decision-making in HR+/HER2- EBC.
While guidelines have recommended the use of ODX or other assays for more than a decade, reimbursement restrictions and global accessibility barriers have limited universal adoption. Beyond the United States, the cost of around 4,000 USD per sample and a turnaround time that delays the start of therapy have created barriers to adoption, despite analyses indicating downstream savings from more tailored adjuvant therapy. Some efforts have been undertaken to develop nomograms that predict ODX scores from clinical and pathologic features annotated during the standard of care. However, such tools require manual extraction of relevant inputs from the unstructured electronic healthcare record and leave room for improvement in terms of performance, with the assay itself still providing greater cost effectiveness than clinical risk tools.
The use of whole-slide images (WSIs) from routinely available formalin-fixed paraffin-embedded (FFPE) tissue slides stained with hematoxylin and eosin (H&E) was investigated to predict RS.
As previous studies have demonstrated, these slides can be effectively analyzed using deep learning algorithms to predict relapse risk. Such algorithms have already been approved for colorectal cancer in Europe, though their widespread adoption is yet to be realized. One possible reason for this delay could be the limited clinical validation against the standard of care. However, the field of deep learning is progressing rapidly. Over the past year, two techniques have markedly enhanced system performance: transformers and self-supervised learning (SSL). Furthermore, recent studies have shown that integrating histopathologic imaging with additional modalities, such as genomics, text, and clinical imaging, uncovers intermodal relationships and often improves predictive performance.
In this study, three independent cohorts were assembled comprising 6,203 patients with HR+/HER2- EBC with surgically resected primary tumors (Fig. 1, panel a). Tissue samples were subjected to H&E staining and immunohistochemical (IHC) analysis for hormone receptors and HER2 according to ASCO/CAP guidelines, and samples were submitted for calculation of RS per clinical practice. For a subset, genomic data from clinical MSK-IMPACT targeted sequencing were also available (Fig. 1, panel b). These derivative data were subsequently digitized (Fig. 1, panel c) and used for multimodal modeling (Fig. 1, panel d). Transformer models were demonstrated to accurately predict RS from H&E-stained whole-slide images and pathology text reports, and their integration was shown to improve performance beyond that of available nomograms. The biological interpretability of predictions was probed through computational analysis, and clinical operating points were suggested to identify high-risk disease.
A new model, Orpheus, was advanced which has the potential to save testing cost and hasten therapeutic decision-making while maintaining the standard of care based on individual tumor transcriptomes for HR+/HER2- EBC.

Results

Data assembly

A retrospective cohort of 5,176 patients (Fig. 1, panel e) with HR+/HER2- EBC (MSK-BRCA; Fig. 1, panel a; Fig. 7) was curated for model training, validation, and testing. For these patients, H&E-stained FFPE tissue specimens of the primary tumor and textual pathology reports were available, along with targeted panel sequencing for a subset (n=332; Fig. 1, panel b). These patients were allocated a priori into either a withheld test set (20%) or a set used for training and validation (80%; Supp. Tab. 1). Moreover, two additional independent cohorts of whole-slide images derived from patients with HR+/HER2- EBC, IEO-BRCA (452 patients) and MDX-BRCA (575 patients), were assembled for external validation.

Model training

A transformer model was developed to directly regress the ODX RS from whole-slide images of EBC. To train this architecture, a two-step process was employed. First, each slide’s tissue-containing tiles (Fig. 1, panel f) were projected into an informative space using a frozen model trained using SSL on over 30,000 slides (Fig. 1, panel g). Subsequently, a transformer architecture previously validated in a large multicenter study of colorectal cancer was adapted to map the phenotypic-genotypic correlation between the extracted features and the ODX RS (Fig. 1, panel g). The unimodal and multimodal models were trained to regress RS as a continuous variable (Fig. 1, panel g).

Embeddings and predicted score recapitulate clinical and genomic correlates

Uniform manifold approximation and projection (UMAP) over the learned embedding spaces for the visual, linguistic, and multimodal models in the MSK-BRCA test set (Fig.
2) revealed that learned embeddings separated somewhat by histologic grade (Fig. 2, panel a) and progesterone receptor expression (Fig. 2, panel b) in the MSK-BRCA test set (n=1034), with the gradients appearing along a learned, lyre-shaped manifold for the multimodal model. The same was observed for the ODX RS itself (Fig. 2, panel c). The association of predicted scores with genomic features was further tested. Limiting to cases with MSK-IMPACT, predicted RS was higher for tumors with TP53 mutation, MYC amplifications, and PIK3CA amplifications (Fig. 2, panels d-f) and trended slightly higher for specimens with greater fraction of genome altered (Fig. 2, panel g).

Model decisions are clinically explainable

The model outputs were interpreted using attention rollout. The last layer’s attention tiles for each slide (Fig. 3, panel a) were visualized, noting that the model designates most tiles as background with low attention scores (Fig. 3, panel b). Though the breakdown varies across slides, higher-attention tiles tended to contain invasive and in situ carcinoma compared to lower-attention tiles, which are more likely to contain fat and stroma (Fig. 3, panel c; Fig. 8). The model yielded predicted point-estimate scores alongside 95% confidence intervals (95% C.I.) for use in clinical decision making (Fig. 3, panel d). Analogously to the tiles, the importance of word tokens comprising the synoptic pathology report (including fields such as histologic subtype, HR and HER2 IHC staining patterns, histologic grade, anatomic site, presence of DCIS and LCIS, and other noted histologic features) for the part from which RS was calculated can be analyzed (Fig. 3, panel e).
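Attention rollout, as it is commonly formulated, combines the attention matrices of successive transformer layers so that attention can be traced back to the input tiles or tokens. The following is a minimal pure-Python sketch under stated assumptions, not the implementation used for Orpheus: attention is assumed to be already averaged over heads, the residual connection is modeled by mixing each matrix equally with the identity, and the helper names (matmul, attention_rollout, token_relevance) and the column-mean summary are illustrative choices.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def attention_rollout(layers):
    """Combine per-layer attention matrices (first layer first).

    Assumption: each matrix is head-averaged and its rows sum to one.
    """
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Model the residual connection: half attention, half identity.
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0)
                  for j in range(n)] for i in range(n)]
        # Re-normalize so every row still sums to one.
        mixed = [[v / sum(row) for v in row] for row in mixed]
        rollout = matmul(mixed, rollout)
    return rollout


def token_relevance(rollout):
    """Illustrative summary: mean attention each input token receives."""
    n = len(rollout)
    return [sum(rollout[i][j] for i in range(n)) / n for j in range(n)]


# Two toy layers over three tokens; token 0 draws most of the attention.
layer1 = [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1], [0.7, 0.1, 0.2]]
layer2 = [[0.9, 0.05, 0.05], [0.5, 0.4, 0.1], [0.6, 0.1, 0.3]]
relevance = token_relevance(attention_rollout([layer1, layer2]))
print([round(r, 3) for r in relevance])
```

In this toy case the relevance values sum to one and the first token ranks highest, which is the kind of ordering used to tint tiles or tokens by relative attention.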
Across the whole withheld test set, a word cloud of tokens showed that words around immunohistochemical analyses for estrogen and progesterone receptors and lymphovascular invasion tended to have highest mean relative attention within a report alongside punctuation and descriptions of Nottingham grade (Fig. 3, panel f).

Visual model reproducibly infers recurrence risk

The reproducibility of the vision model was tested across the three cohorts (Fig. 4). In the withheld MSK-BRCA test set, the unimodal whole-slide image-based model achieved a Pearson correlation of 0.63 (95% C.I. 0.58 - 0.68, p < 10^-4) and concordance correlation coefficient (CCC) of 0.58 (95% C.I. 0.52 - 0.63; Fig. 4, panel a) along with area under the precision-recall curve (AUPRC) of 0.593 (95% C.I. 0.514 - 0.671; Fig. 4, panel d) and area under the receiver operating characteristic curve (AUROC) of 0.864 (95% C.I. 0.831 - 0.895; Fig. 4, panel g). In the external IEO-BRCA test set, the same model achieved a Pearson correlation of 0.61 (95% C.I. 0.55 - 0.67; p < 10^-4) and CCC of 0.60 (95% C.I. 0.533 - 0.650; Fig. 4, panel b) along with AUPRC of 0.675 (95% C.I. 0.601 - 0.745; Fig. 4, panel e) and AUROC of 0.801 (95% C.I. 0.759 - 0.841; Fig. 4, panel h). In the external MDX-BRCA test set, which used an inferred, ODX-like RS (see Methods), the same model achieved a Pearson correlation of 0.60 (95% C.I. 0.54 - 0.65; p < 10^-4) and CCC of 0.44 (95% C.I. 0.384 - 0.486; Fig. 4, panel c) along with AUPRC of 0.734 (95% C.I. 0.672 - 0.795; Fig. 4, panel f) and AUROC of 0.830 (95% C.I. 0.791 - 0.863; Fig. 4, panel i). Full results are detailed in the other panels of Fig. 9, Fig. 4, panel , and in Supp. Tab. 2. In summary, the vision-based model robustly infers RS across three cohorts derived from different medical centers and countries.
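Because both Pearson correlation and CCC are reported, it may help to see how they differ. The sketch below uses Lin's standard definition of the concordance correlation coefficient with population variances; the function names and the toy scores are illustrative assumptions, not data from the cohorts. Unlike Pearson correlation, CCC is reduced by any offset or scale difference between predicted and measured scores, which is one possible contributor to a CCC below the Pearson value, as seen for MDX-BRCA (0.44 versus 0.60).

```python
import math
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient (population moments)."""
    mx, my = fmean(x), fmean(y)
    var_x = fmean([(a - mx) ** 2 for a in x])
    var_y = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)


# Toy example: predictions track the measured scores perfectly
# but are shifted upward by five points.
measured = [5, 12, 18, 26, 40]
predicted = [m + 5 for m in measured]
r = pearson_r(measured, predicted)
ccc = concordance_cc(measured, predicted)
print(f"Pearson r = {r:.3f}, CCC = {ccc:.3f}")
```

Here the correlation is perfect while the CCC is penalized for the constant offset, so CCC is the stricter measure of agreement with the assay value itself.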
Model uncovers morphological features

To further explore the model’s capability of correlating histologic features with ODX RS, the most-attended tiles were identified for high- and low-risk disease. The nuclei of these tiles were segmented, and derivative features of cell type proportions and cellular morphology were tabulated (Fig. 5, panel a). This revealed a relative abundance of inflammatory cells (Fig. 5, panels b-c) and neoplastic cells along with the standard deviation of the neoplastic nuclear area (Fig. 5, panels d-e) as some of the features differing significantly between the groups. Moreover, a model trained on The Cancer Genome Atlas (TCGA) to infer transcriptomic program activity from imaging features revealed that high-risk disease exhibited greater stromal fraction (p < 10^-4, n=100) (Fig. 5, panel f), lymphocyte infiltration signature (p < 10^-4, n=100) (Fig. 5, panel g), tumor cell proliferation (p < 10^-4, n=100) (Fig. 5, panel h), and leukocyte fraction (p < 10^-4, n=100) (Fig. 5, panel i). Extending the tumor microenvironment analysis to all three test cohorts corroborated these results, except for the lymphocyte infiltration signature in the MDX cohort, which showed no statistically significant difference between the predicted high- and low-risk disease patients (Fig. 10). As a further study of differences, a conditional generative adversarial network (GAN) was also trained to synthesize fields of view for informative tiles for high- and low-risk disease (Fig. 5, panels j-l). Tiles conditioned on the high-risk class depicted confluent clusters of tumor cells with moderate to marked nuclear pleomorphism and prominent nucleoli, and tiles conditioned on the low-risk class depicted trabeculae and clusters of tumor cells with moderate nuclear pleomorphism and inconspicuous nucleoli. Tiles conditioned on the background class depicted stroma without epithelial cells.
Integrating imaging and language information improves stratification

In the MSK-BRCA test set, the unimodal text report-based model achieved a Pearson correlation of 0.58 (95% C.I. 0.51 - 0.64, p < 10^-4) and CCC of 0.55 (95% C.I. 0.478 - 0.606; Fig. 11, panel c) along with AUPRC of 0.539 (95% C.I. 0.455 - 0.628; Fig. 11, panel f) and AUROC of 0.820 (95% C.I. 0.779 - 0.854; Fig. 11, panel i). Full results are detailed in the other panels of Fig. 11 and in Supp. Tab. 2.

Multimodal integrations were tested to see if they could improve on the image or text models alone, using tensor fusion of the transformer-based embeddings. In the MSK-BRCA test set, the full multimodal model achieved a Pearson correlation of 0.68 (Fig. 6, panel a; 95% C.I. 0.64-0.73, p < 10^-4) and CCC of 0.65 (95% C.I. 0.59-0.70). For classification of high-risk (RS ≥ 26) disease, the AUPRC was 0.64 (p < 10^-4; 95% C.I. 0.56 - 0.71), with a macro-averaged F1 score of 0.75 (Fig. 6, panel b). The CCC and Pearson’s correlation based on multimodal scores were higher than those based on unimodal scores (Fig. 6, panels c,f). AUROC was 0.88 (Fig. 6, panel d; 95% C.I. 0.85 - 0.91, p < 10^-4). Using <12 and >25 as thresholds for low-, intermediate-, and high-risk disease, the confusion matrix for the withheld test set is depicted in Fig. 6, panel e, showing very low confusion between the extrema and moderate confusion between intermediate and extreme categories (p < 10^-4).

Next, the subset of the MSK-BRCA test set with available tumor grades and IHC-derived HR status in the text report as extracted by regular expressions was analyzed (those without matches by regular expressions were excluded). For this set, the ability to discriminate high-risk disease of a nomogram based on clinical and pathologist-annotated features was compared to that of the multimodal (Fig. 6, panel g), text-based (Fig. 6, panel h), and image-based (Fig. 6, panel i) models.
The multimodal model achieved an AUROC of 0.89 and AUPRC of 0.71 (95% C.I. 0.60 - 0.82), the vision model achieved an AUPRC of 0.63 (95% C.I. 0.50 - 0.75), and the language model achieved an AUPRC of 0.61 (95% C.I. 0.48 - 0.73). By comparison, the nomogram achieved an AUPRC of 0.49 (95% C.I. 0.36 - 0.64). For the multimodal model, an operating point of 29.8 with 94.4% precision and 33.3% recall (Fig. 6, panel g) is suggested.

Assessing clinical utility as a triaging tool for low- and high-risk disease

The utility of the multimodal transformer model as a pre-screening tool to reduce the load of laboratory testing for breast cancer recurrence in clinical workflows was tested. In a sensitivity analysis, a threshold that yields the highest sensitivity for the largest percentage of the cohort’s population was manually selected in the test set of the MSK-BRCA cohort (n=2338). This resulted in a sensitivity of 0.93 for 34% of the population with a threshold of < 16 for the predicted recurrence risk score to determine intermediate/low-risk patients (Fig. 12, panel a). Applying this threshold to the IEO-BRCA (n=452) and MDX-BRCA (n=572) cohorts, a sensitivity of 0.94 for 25% (Fig. 12, panel b) and 0.96 for 18% of the populations (Fig. 12, panel c), respectively, was achieved. Similarly, a specificity analysis was conducted, wherein a threshold that yields the highest specificity for the largest percentage of the cohort’s population was manually selected in the MSK-BRCA test set. This resulted in a specificity of 0.93 for 13% of the population with a threshold of > 25 for the predicted RS to identify high-risk patients (Fig. 12, panel d). Applying this threshold to the two external cohorts, a specificity of 0.76 for 40% of the population (Fig. 12, panel e) and 0.85 for 31% of the population (Fig.
12, panel f) in the IEO-BRCA and MDX-BRCA cohorts, respectively, was achieved. The analyses were repeated with stratification by age and nodal status, specifically for patients with node-negative disease below 50 years of age (Fig. 13) and patients with 1-3 positive nodes at or above 50 years of age (Fig. 14), with similar performance metrics in all cohorts regardless of age and nodal status. In summary, the Orpheus model has the potential to identify patients with high-risk disease accurately and with high confidence. In a potential use case, adjuvant chemotherapy could be recommended for a selected subset of high-confidence high-risk patients without multigene assay testing (Fig. 15).

Discussion

Proper selection of patients with HR+/HER2- EBC who can safely omit adjuvant chemotherapy is a priority in clinical practice. Validated multigene assays, such as ODX RS, have the power to tailor adjuvant treatment decision-making in this setting. However, due to fiscal and logistical barriers, they have faced limited adoption in non-American healthcare systems despite long-standing recommendations for their use. In this study, a large-scale analysis comprising thousands of patients with HR+/HER2- EBC from internationally distinct cohorts showed that machine learning on whole-slide images accurately and reproducibly infers RS from routinely available H&E-stained specimens or their corresponding text report. These models and their multimodal combination outperform a nomogram using clinico-pathologic features, such as IHC-derived progesterone/estrogen receptor positivity, tumor size, lobular versus ductal histology, and Nottingham grade, and clinical features, such as age. The optimal operating point accurately retrieves one third of high-risk disease with minimal false positives, potentially enabling physicians to forgo testing on one in three newly diagnosed patients.
If deployed clinically, the improved accuracy of this technique and reduced requirement for manual curation would be expected to further improve the cost effectiveness. Few institutions have fully digital pathology workflows, but commercial services offer scanning for USD 35 per slide (Biochain, Newark, CA), and model inference is relatively inexpensive at USD 0.90 per hour (Amazon Web Services, Seattle, WA), with average model inference requiring significantly less than one minute per slide. Assuming a cost of USD 4,000 per slide for the laboratory assay, a hypothetical fee of USD 50 per slide for the artificial intelligence-derived test, and the empirically estimated recall of 33.3% (where any patients with scores below the operating point are sent for laboratory assay measurement), this results in an estimated average savings of USD 1,271 per patient without compromising the standard of precision oncology. Moreover, the speed of this assay could enable novel uses such as more precisely defining populations that will benefit from the use of neoadjuvant therapies beyond the currently used clinical characteristics and without the requirement for additional biopsies.

Further analysis of the proposed method as a potential pre-screening tool revealed consistent sensitivity and specificity to identify patients with high- and low-risk tumors for approximately 50% of the population across three distinct large cohorts. Notably, the risk prediction results remained generally unaffected by subgroup analyses taking into account age and nodal status, despite being clinicopathological factors which impact the recurrence risk in patients with breast cancer, and therefore influence the risk category threshold.
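The structure of the savings estimate given earlier in this discussion can be made explicit with a short calculation. The sketch below is a simplified model under stated assumptions: every patient incurs the artificial intelligence fee, the stated 33.3% is treated as the fraction of patients who then skip the laboratory assay, and the function name expected_savings_per_patient is hypothetical. This simplified model yields roughly USD 1,282 rather than the reported USD 1,271; the text does not itemize the remaining difference, so the exact accounting is not reproduced here.

```python
def expected_savings_per_patient(assay_cost, ai_fee, fraction_skipping_assay):
    """Average per-patient savings of AI triage versus testing everyone.

    Assumptions (not itemized in the text): all patients pay the AI fee,
    and only those below the operating point go on to the laboratory assay.
    """
    cost_without_triage = assay_cost
    cost_with_triage = ai_fee + (1.0 - fraction_skipping_assay) * assay_cost
    return cost_without_triage - cost_with_triage


savings = expected_savings_per_patient(
    assay_cost=4000.0,              # USD, laboratory assay
    ai_fee=50.0,                    # USD, hypothetical AI-derived test fee
    fraction_skipping_assay=0.333,  # stated recall, used as skip fraction
)
print(f"Estimated savings: USD {savings:,.0f} per patient")
```

The estimate scales linearly with the fraction of patients who can skip the assay, so it is sensitive to how that fraction is defined for a given population.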
With proper regulatory approval, adoption of clinical artificial intelligence in this paradigm—as a triaging or support tool—is a more measured approach than outright replacement of genomic tests or physician judgment and is more likely to result in widespread adoption. This computational tool has the added benefit of providing confidence intervals rather than pure point estimates, enabling integration of uncertainty into clinical decision making. Similarly, an additional benefit over prior nomograms is the provision of a continuous recurrence score rather than mere risk category, enabling downstream use of the RS for emerging uses, such as the patient selection for adjuvant radiotherapy after breast-conserving surgery, for neoadjuvant chemotherapy and in selecting patients for clinical trials. The architecture makes use of self- supervised learning to enable training on under 5,000 patients and Cartesian product with dimensionality reduction to commingle the text-based and image-based features. For each specimen, the model also generates an annotated report of the text and image used to estimate the recurrence score. Though deep learning suffers from a general lack of interpretability, the attention paid to each token in the text or each tile in the image enables ordering physicians to perform qualitative quality controls. In the analysis of word importance, “lymphovascular” and “invasion” also appeared, reflecting the association of lymphovascular invasion with disease recurrence risk, though this is not a feature of the clinicopathologic nomogram. Furthermore, colocalization of lower progesterone receptor percent positivity and higher Nottingham grade with higher recurrence score in the models’ learned embedding spaces was seen, both of which are known associations. Informative imaging tiles tended to contain invasive carcinoma but sometimes also contained stroma and other known correlates of recurrence risk, such as lobular carcinoma in situ. 
A solution was sought to identify quantitative biologic features underpinning high- and low-risk disease in tiles deemed informative by the model, including cell-level analyses that have begun to yield fruit in other studies. Abundance of inflammatory cells was found to correspond with higher-risk disease, corroborating prior studies showing that tumor-infiltrating lymphocytes are a negative prognostic factor in HR+/HER2- EBC and are associated with a somewhat higher RS. The association is widely validated, but its nature is unknown. Based on the findings that both the fraction of genome altered and inflammatory cell infiltrate are higher for high-risk disease, one putative mechanism submitted is that more aggressive disease exhibits greater chromosomal instability, which in turn increases intratumoral inflammation. The greater nuclear pleomorphism and nucleolar prominence in the high-risk tiles synthesized by the GAN are associated with higher Nottingham grade. The cohort also recapitulated previously uncovered genomic biomarkers with adverse prognostic implications, namely MYC amplification, PIK3CA amplification, and TP53 mutation. By estimating transcriptomic programs from images using the validated model, proliferation was also found to be higher in the analysis of patients with predicted high risk, correlating with grade and with the MKI67 gene included in the calculation of the RS, and perhaps explaining the empiric association of more heterogeneous areas and perimeters of cancer cells with higher-risk disease. The elevated presence of the lymphocyte infiltrating signature score and leukocyte fraction in predicted high-risk tumors hints at biologically aggressive cancer and was found to positively correlate with recurrence risk of HR+/HER2- EBC.
Furthermore, the finding of increased stromal fraction in tumors with predicted high risk across all cohorts corroborates the association between high stromal fraction and cancer-associated fibroblasts with worse prognosis in various breast cancer subtypes, with the analysis specifically building a case for HR+/HER2- tumors. The finding of stroma in the background tiles generated by the GAN is possibly due to the prevalence of uninformative stromal areas further from the tumor-stroma interface. Together, these findings show that the new deep learning method can be used as a tool to make biological discoveries and suggest mechanistic hypotheses.

The greatest limitation of this study is that it relies on a laboratory assay (albeit rigorously validated) for ground truth rather than clinical outcomes, and thus the model cannot be shown to discriminate risk of distal recurrence better than ODX RS. That is, using the RS as the ground truth will penalize deviations, even if they could hypothetically be more closely associated with true clinical risk of recurrence. As clinical outcomes become more readily available at scale, this study aims to test the ability of such models on censored time-to-event modeling, in this case for distal recurrence. A second limitation is the reliance on deep transformer architectures: though they are ensconced as the workhorse of modern artificial intelligence, they lack true interpretability, with post hoc explainability instead standing in. That is, methods to interpret the model’s decisions and robustness must be deployed rather than asking the model to directly explain its reasoning. Nonetheless, recurrence risk models with inherent interpretability capabilities tend to perform worse on the main evaluation metrics such as the AUROC.
In summary, Orpheus, an artificial intelligence model that accurately infers Oncotype DX® Recurrence Score from H&E-stained whole-slide images, was developed and validated across three independent cohorts totaling 6,271 patients internationally. The biological and clinical underpinnings of the model’s decisions have been rigorously analyzed and the architecture can be tailored for application to rapid biomarker inference in any tumor type.

Methods

Cohort curation

Cases from 2013-2020 were selected according to Fig. 7 for this retrospective analysis. All cases were pathologically confirmed HR+/HER2- invasive breast carcinoma without distinction by specific histologic subtype. Pathology reports and slides were joined by the surgical pathology part number rather than case or block. The synoptic pathology text report for the part used to calculate RS was included, with fields such as histologic subtype, HR and HER2 IHC percent positivity, histologic grade, anatomic site, and DCIS and LCIS. Oncotype DX results were recorded manually from the healthcare record by a medical oncologist. Regular expressions were used to extract progesterone, estrogen, and HER2 receptor percentage positivity from the pathology text reports. Before any analysis, the cohort was split into a training/validation set and a withheld test set using 80% and 20% of patient IDs, respectively.

Pathology review

All included cases were hormone receptor-positive and HER2-negative as defined by the American Society of Clinical Oncology/College of American Pathologists clinical practice guidelines. Experienced breast cancer pathologists from each institution reviewed the case to confirm the diagnosis of invasive breast cancer and receptor status.

Vision model training

Images were preprocessed using STAMP with 1.14 microns per pixel, tile edge length of 224 pixels, and Macenko normalization. Tiles were embedded using CTransPath.
After a fully connected layer to project the CTransPath-derived tokens into 512-dimensional space and a ReLU, two PyTorch TransformerEncoderLayers with dimensionality 512 and eight heads were stacked before a final LayerNorm and projection to scalar space. No final activation function was used, and no positional encodings were used. For training, a maximum learning rate of 2e-5 with linear warmup of 1000 steps, learning rate decay by a factor of 0.9999 every step, and L2 decay of 2e-5 was used. A batch size of one slide per GPU across two GPUs with accumulated gradients over four batches was used, with gradients clipped at 0.5. The model was trained for up to 50 epochs with early stopping. This was implemented in PyTorch-Lightning and ran on two NVIDIA Tesla V100 GPUs (CUDA 12.1) on a cluster running Linux. Metrics were tracked using Weights and Biases, and the model with the lowest validation mean squared error was chosen for downstream use. During inference, attention rollout was used to attribute attention to each input token, and multiple predictions with dropout enabled were used to estimate confidence intervals. Regression was used instead of classification because classification discards information and because regression substantially outperformed classification in the validation set.

Language model training

The HuggingFace model tsantos/PathologyBERT was tuned using the HuggingFace BertForSequenceClassification, Trainer, and AutoTokenizer with a batch size of eight part descriptions per GPU, four NVIDIA Tesla V100 GPUs, four gradient accumulations per backprop, a learning rate of 2e-5, L2 decay of 0.01, and ten training epochs. Prior to tokenization, the text corresponding to the part used to measure the Oncotype score was extracted using regular expressions. Addenda, when available, were concatenated to the part description. Names, initials, and logistical comments were removed prior to tokenization.
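The learning-rate settings stated for the vision model (maximum 2e-5, linear warmup over 1000 steps, decay by a factor of 0.9999 every step) can be illustrated with a small schedule function. This is a sketch under an assumption the text does not settle: here the exponential decay starts only once warmup has finished. The function name learning_rate and the step indexing are likewise illustrative.

```python
def learning_rate(step, max_lr=2e-5, warmup_steps=1000, decay=0.9999):
    """Learning rate at a zero-indexed optimizer step.

    Assumption: linear warmup to max_lr, then per-step exponential decay.
    """
    if step < warmup_steps:
        # Linear ramp that reaches max_lr on the last warmup step.
        return max_lr * (step + 1) / warmup_steps
    # Multiply by the decay factor once for every step after warmup.
    return max_lr * decay ** (step - warmup_steps + 1)


for step in (0, 499, 999, 1000, 10000):
    print(step, f"{learning_rate(step):.3e}")
```

With a per-step factor of 0.9999, the rate halves roughly every 6,900 steps after warmup, so the decay is gentle over the course of training.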
The tsantos/PathologyBERT tokenizer was not modified. The model with the lowest validation loss was chosen for downstream use. During inference, attention rollout was used to attribute attention to each input token, and multiple predictions with dropout enabled were used to estimate confidence intervals.

Multimodal model training

Multiple architectures, including simple concatenation of embeddings before dense layers, attention-based integration of unimodal embeddings, and averaging of scores, were explored using the validation set. The final model chosen took the pre-computed unimodal embeddings as input, projected them into 96-dimensional space, performed tensor fusion by prepending unity and taking the Cartesian product, applied 30% dropout, and passed the result through a small 96-dimensional regression head to yield a scalar regression score. Finally, linear regression trained on the training set was used to calibrate the weight of the unimodal and multimodal scores to yield a final score. Multiple predictions with dropout enabled were used to estimate confidence intervals.

Report visualization

Django was used to colorize each token in the description by relative attention. A similar color scale in HSV space was applied to tint each tile by its relative attention, with absolute attention plotted on corresponding histograms with logarithmic counts displayed. The five tiles with highest and lowest absolute attention are displayed for quality control by users.

Nuclear analysis

The three most informative tiles from each slide with the highest 100 and lowest 100 predicted scores were identified. HoVerNet with PanNuke-derived weights was used for instance segmentation, and quantitative features such as solidity, area, and perimeter were calculated for the outline of each nucleus. Within each tile, summative statistics were used to aggregate these features.
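The tensor fusion step described under multimodal model training (prepending unity to each projected embedding and taking the Cartesian product) can be sketched in a few lines. This is an illustration only, with a hypothetical function name tensor_fusion and toy vectors; the projection layers, dropout, and regression head are omitted. Prepending 1 means the fused vector retains each unimodal feature alongside every pairwise image-text product.

```python
def tensor_fusion(image_vec, text_vec):
    """Flattened outer product of [1, image_vec] and [1, text_vec]."""
    a = [1.0] + [float(v) for v in image_vec]
    b = [1.0] + [float(v) for v in text_vec]
    # Row-major flattening: every element of a times every element of b.
    return [x * y for x in a for y in b]


# Toy example: a 2-dimensional image embedding and 1-dimensional text embedding.
fused_small = tensor_fusion([2.0, 3.0], [5.0])
print(fused_small)

# With 96-dimensional projections, the fused vector has 97 * 97 entries.
fused_full = tensor_fusion([0.0] * 96, [0.0] * 96)
print(len(fused_full))
```

In the toy output, the entries 2, 3, and 5 are the original unimodal features, while 10 and 15 are the cross-modal interaction terms.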
Comparisons between the high- and low-risk groups were made with the Mann-Whitney U test with corrections for multiple testing. For the volcano plot, log fold change was calculated as the base-two logarithm of the mean value for the high-risk group over the same for the low-risk group. Tiles with fewer than 50 total nuclei were excluded. For comparisons involving standard deviations, tiles with fewer than ten of the relevant cell type were excluded.

Tumor microenvironment quantification

A pre-trained deep learning model was used for the quantification of the tumor microenvironment for the top 50 predicted high- and low-risk patients by the recurrence risk vision prediction model, specifically for the stromal fraction (SF) and leukocyte fraction (LF) as assessed via DNA methylation analysis, and for the lymphocyte infiltrating signature score (LISS) and proliferation (Prolif) as measured by RNA expression. The deep learning regression model was trained on whole-slide images from a breast cancer cohort from The Cancer Genome Atlas (TCGA) in a weakly-supervised setting using the open-source biomarker data. Statistical significance was measured by an independent t-test, indicating a difference in sample means between predicted high- and low-risk patients (p < 0.05). The scores for the tumor microenvironment quantification were inferred based on the same tile embeddings (1x768) which were used in training the vision model for the recurrence risk prediction.

Generative adversarial network training

The highest-attention tiles were identified from cases with measured recurrence scores below 11 or above 25 as chosen by an Attn-MIL model trained on the training set. Subsequently, using these tiles and a random sampling of low-attention tiles across the same slides, StudioGAN was used to train a ReACGAN architecture with big_resnet backbone, batch size of 36 per GPU across four NVIDIA Tesla V100 GPUs, and default loss parameters.
The conditional architecture encoded three classes: high score, low score, and background. Spectrum plots and canvases were generated as per default StudioGAN code.

Genomic analysis

Specimens were sequenced by MSK-IMPACT, annotated by OncoKB, and accessed via cBioPortal. Variants of known significance in the most commonly altered genes in the MSKCC Clinical Sequencing Cohort (TP53, PTEN, BRCA2, FGF4, split, KMT2C, index, PAK1, FGF19, CDH1, MYC, ARID1A, FGF3, GATA3, CBFB, CCND1, RUNX1, PIK3CA, FGFR1, NSD3, MAP3K1) were analyzed. Using Bonferroni correction, genes associated with high- or low-risk status based on measured recurrence score with a significance of q = 0.05 were identified. Fraction of genome altered, tumor mutational burden, and mutation count were analyzed using the Pearson correlation with the same significance and correction.

Nomogram comparison

Logistic regression for high-risk disease was performed using the formula. The features were extracted using regular expressions from the pathology report. Reports with failed extraction (e.g., due to absence of hormone receptor annotation in the pathology report or unusual formatting precluding the extraction of tumor size) were excluded from the comparison analysis. Precision-recall analyses were performed using the predicted score [0, 100], in the case of the transformer regression, and the logistic regression score [0, 1] in the case of the nomogram’s logistic regression formula.

Model evaluation

When multiple slides were available for a single measured score and all contained relevant tissue as per preprocessing, the mean predicted score was taken. Models were evaluated primarily by Pearson correlation and associated significance and by concordance correlation coefficient. 95% confidence intervals were calculated using bootstrapping (random sampling with replacement) 1000 times.
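The bootstrapped confidence intervals described above can be sketched as follows. This is a minimal illustration under assumptions: the percentile method is used to read off the interval (the text states only that bootstrapping with 1000 resamples was used), the data are synthetic, and the names pearson_r and bootstrap_ci are hypothetical. Paired resampling keeps each measured score together with its prediction.

```python
import math
import random
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(x, y, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a paired metric (assumed method)."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        # Resample patient indices with replacement, keeping pairs intact.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # skip degenerate resamples with zero variance
        stats.append(metric(xs, ys))
    stats.sort()
    m = len(stats)
    lower = stats[int(alpha / 2 * m)]
    upper = stats[min(m - 1, int((1 - alpha / 2) * m))]
    return lower, upper


# Synthetic stand-in for measured and predicted recurrence scores.
data_rng = random.Random(1)
measured = [data_rng.uniform(0, 50) for _ in range(200)]
predicted = [s + data_rng.gauss(0, 10) for s in measured]
r_hat = pearson_r(measured, predicted)
ci_low, ci_high = bootstrap_ci(measured, predicted, pearson_r)
print(f"r = {r_hat:.2f} (95% C.I. {ci_low:.2f} - {ci_high:.2f})")
```

Any paired metric with the same two-argument signature, such as a concordance correlation coefficient, could be passed as metric in place of pearson_r.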
Areas under the precision-recall and receiver operating characteristic curves were calculated using binary thresholding of high- and low-risk disease (recurrence score >/≤ 25). Significance was established using 1000-fold permutation tests. Operating points on the precision-recall curve were analyzed by varying the threshold from greatest to lowest and tabulating the respective precision and recall for each value. F1 scores were calculated using the weighted average. Confusion matrices were established using the three risk categories (<16, 16-25, >25), and significance was established using McNemar’s test of homogeneity.

External cohort curation

The cohort from the European Institute of Oncology (IEO-BRCA, Milan, Italy) contained a total of 456 early-stage breast cancer patients who received the official Oncotype DX test. Only histopathology slides and corresponding clinicopathological variables were available for analysis. After filtering patients based on histology slide availability, 452 patients in the IEO-BRCA cohort were available for external validation. The cohort from MultiplexDX (MDX-BRCA, Bratislava, Slovakia) contained a total of 1013 early-stage breast cancer patients originally obtained for a retrospective study from Biobank Graz of the Medical University of Graz (n=390, Graz, Austria), PATH Biobank (n=592, Munich, Germany), The Biomedical Research Institute of Málaga (n=27, IBIMA-CIMES-UMA, Malaga, Spain), and a commercial company (n=4, AMBIO). Histopathology slides, corresponding clinicopathological variables, and research-based Oncotype DX scores derived from RNA-sequencing were available for analysis. After filtering patients based on histology slide availability and on nodal, ER, PR, and HER2 status to retain those who would have been eligible for Oncotype DX, 575 patients in the MDX-BRCA cohort remained for external validation.
The research-based scores were calculated using the genefu Bioconductor package, based on the original algorithm to calculate the OncotypeDX score. Because research-based versions of OncotypeDX use different data inputs (e.g., microarray/RNA-seq) compared to the official OncotypeDX (e.g., RT-qPCR), this may result in scaling effects when comparing research-based scores with official scores, as demonstrated by the OPTIMA trial group. Consequently, the approach outlined by that group, which provides a linear equation modeling the relationship between research-based and official OncotypeDX scores, was used to rescale the research-based recurrence score into a more realistic range.

split | count | type
train | 39774233 | Tiles
train | 7758 | Slides
train | 3178 | Reports
val. | 8098090 | Tiles
val. | 1570 | Slides
val. | 795 | Reports
test | 11765872 | Tiles
test | 2338 | Slides
test | 972 | Reports
val. | 790 | Patients-Reports
train | 3157 | Patients-Reports
test | 970 | Patients-Reports
val. | 715 | Patients-Images
train | 3404 | Patients-Images
test | 1026 | Patients-Images
val. | 790 | Patients
train | 3420 | Patients
test | 1034 | Patients
all | 5176 | Patients
all | 5145 | Patients-Images
all | 4917 | Patients-Reports

Supp. Table 1. Data characteristics. Note: 68 patients who are in the training set for images are in the validation set for text, so they don’t appear in either set for multimodal integration.

Split | Model | n | r | r_p | r_95%_lower | r_95%_upper | ccc | ccc_95%_lower | ccc_95%_upper
msk-test | language | 972 | 0.58 | 4.43E-88 | 0.51 | 0.64 | 0.55 | 0.48 | 0.61
msk-test | multimodal | 964 | 0.68 | 2.72E-134 | 0.64 | 0.73 | 0.65 | 0.60 | 0.70
msk-test | vision | 1029 | 0.63 | 4.12E-115 | 0.58 | 0.68 | 0.58 | 0.52 | 0.63
mdx-test | vision | 575 | 0.60 | 2.68E-58 | 0.54 | 0.65 | 0.22 | 0.19 | 0.25
mln-test | vision | 452 | 0.61 | 9.05E-47 | 0.55 | 0.67 | 0.60 | 0.53 | 0.65
train | language | 3178 | 0.84 | 0 | 0.82 | 0.85 | 0.83 | 0.81 | 0.84
train | multimodal | 3164 | 0.88 | 0 | 0.87 | 0.90 | 0.88 | 0.87 | 0.89
train | vision | 3428 | 0.75 | 0 | 0.74 | 0.77 | 0.71 | 0.69 | 0.73
val | language | 725 | 0.57 | 1.10E-64 | 0.50 | 0.63 | 0.55 | 0.49 | 0.61
val | multimodal | 719 | 0.67 | 1.15E-94 | 0.62 | 0.72 | 0.65 | 0.59 | 0.69
val | vision | 720 | 0.61 | 4.44E-76 | 0.56 | 0.67 | 0.57 | 0.52 | 0.63

Supp. Table 2. Detailed model performance.

B. Inference of Recurrence Risk and Predicted Outcome Using Multimodal Histopathological Models

The Oncotype DX® Recurrence Score (RS) is an assay for hormone receptor-positive early breast cancer with extensively validated predictive and prognostic value. However, its cost and lag time have limited global adoption, and previous attempts to estimate it using clinicopathologic variables have had limited success. To address this, 6,172 cases were assembled across three institutions, and Orpheus, a multimodal deep learning tool to infer the RS from H&E whole-slide images, was developed. The model identifies TAILORx high-risk cases (RS > 25) with an area under the curve (AUC) of 0.89, compared to a leading clinicopathologic nomogram with 0.73. Furthermore, in patients with RS ≤ 25, Orpheus ascertains risk of metastatic recurrence more accurately than the RS itself (0.75 vs 0.49 mean time-dependent AUC). These findings have the potential to guide adjuvant therapy for high-risk cases and tailor surveillance for patients at elevated metastatic recurrence risk.

Introduction

Hormone receptor-positive disease without HER2 overexpression or amplification (HR+/HER2-) is the most common subtype of early breast cancer (EBC), accounting for approximately 70% of diagnoses. A major challenge in the management of this disease has been identifying the cancers for which adjuvant chemotherapy meaningfully reduces the risk of recurrence.
Risk stratification of HR+/HER2- EBC relies upon the integration of traditional clinicopathological features (e.g., tumor size, nodal status, Nottingham grade) with multigene assays to estimate risk of recurrence and personalize adjuvant therapy. Among the commercially available assays, Oncotype DX® (ODX; Exact Sciences, Madison, WI) is the most extensively validated and widely used in clinical practice. By measuring transcriptional abundance of 16 genes, including ESR1, PGR, HER2, MKI67, and MMP11, against the abundance of five reference genes using reverse transcription quantitative real-time PCR, ODX calculates a recurrence score (RS) ranging from zero to 100 with both prognostic and predictive value.

Substantial clinical evidence from retrospective and prospective trials has shown that ODX can improve clinical decision-making in breast cancer. Retrospective analyses of the NSABP B14 and TransATAC trials demonstrated the prognostic value of ODX in stratifying the risk of recurrence for HR+/HER2- EBC patients. Similarly, analyses of the NSABP B20 and SWOG8814 clinical trials established the predictive value of ODX by uncovering a survival benefit with the addition of adjuvant chemotherapy to endocrine therapy for patients with high risk of disease relapse. These studies provided the rationale for the prospective evaluation of ODX in the TAILORx (>10,000 patients with node-negative disease) and RxPONDER (5,083 patients with one to three positive lymph nodes) trials and established ODX as the preferred genomic assay for adjuvant treatment decision-making in HR+/HER2- EBC.

While guidelines have recommended the use of ODX or other assays for more than a decade, reimbursement restrictions and global accessibility barriers have limited universal adoption.
Beyond the United States, the cost of around 4,000 USD per sample and a turnaround time that delays the start of therapy have created barriers to adoption, despite analyses indicating downstream savings from more tailored adjuvant therapy. Some efforts have been undertaken to develop nomograms, based on clinical and pathologic features annotated during the standard of care, that aim to predict ODX scores. However, such tools require manual extraction of relevant inputs from the unstructured electronic healthcare record and leave room for improvement in terms of performance, with the assay itself still providing greater cost effectiveness than these tools.

The use of whole-slide images (WSIs) from routinely available formalin-fixed paraffin-embedded (FFPE) tissue slides stained with hematoxylin and eosin (H&E) to predict RS was investigated. As previous studies have demonstrated, these slides can be effectively analyzed using deep learning algorithms to predict relapse risk. Such algorithms have already been approved for colorectal cancer in Europe, though their widespread adoption is yet to be realized. One possible reason for this delay could be the limited clinical validation against the standard of care. However, the field of deep learning is progressing rapidly. Recently, two techniques have markedly enhanced system performance: transformers and self-supervised learning (SSL). Furthermore, recent studies have shown that integrating histopathologic imaging with additional modalities, such as genomics, text, and clinical imaging, uncovers intermodal relationships and often improves predictive performance.

In this study, Orpheus, a multimodal deep learning model to infer the ODX RS from H&E-stained whole-slide images, was developed and validated across three independent patient cohorts for the identification of high-risk patients.
Moreover, a head-to-head comparison is made between Orpheus and the ODX RS to identify patients with documented metastatic recurrence. This work advances a multimodal machine learning paradigm in precision oncology, applies it to accurately infer the ODX RS from routine histopathology images, and outperforms the ODX RS in identifying risk of metastatic recurrence in patients with low ODX RS. This study has the potential to extend access to the established and exploratory applications of the well-validated Recurrence Score and to refine sub-stratification for patients treated using the current standard paradigm.

Results

Data Assembly

Three independent cohorts were assembled comprising 6,172 patients with HR+/HER2- EBC with surgically resected primary tumors (Fig. 1, panel a). Tissue samples were subjected to H&E staining and immunohistochemical (IHC) analysis for hormone receptors and HER2 according to ASCO/CAP guidelines, and samples were submitted for calculation of RS per clinical practice. For a subset, genomic data from clinical MSK-IMPACT targeted sequencing were also available (Fig. 1, panel b). These derivative data were subsequently digitized (Fig. 1, panel c) and used for multimodal modeling (Fig. 1, panel d). A retrospective cohort of 5,145 patients (Fig. 1, panel e) with HR+/HER2- EBC (MSK-BRCA; Fig. 1, panel a; Fig. 7) was curated for model training, validation, and testing; their primary tumors had H&E-stained FFPE tissue specimens and textual pathology reports available, with targeted panel sequencing for a subset (n=481; Fig. 1, panel b). These patients were allocated a priori into either a withheld test set (20%) or a set used for training and validation (80%; Supp. Tab. 1). Moreover, two additional independent cohorts of WSIs derived from patients with HR+/HER2- EBC, IEO-BRCA (452 patients) and MDX-BRCA (575 patients), were assembled for external validation.
The patients’ age, sex, and race are reported in Supp. Tab. 2. A patient is considered high-risk with a molecular RS > 25, following TAILORx.

Model Training

A transformer model was developed to directly regress the ODX RS from WSIs of EBC. To train this architecture, a two-step process was employed. First, each slide’s tissue-containing tiles were projected (Fig. 1, panel f) into an informative space using a frozen model trained using SSL on over 30,000 slides (Fig. 1, panel g). Subsequently, a transformer architecture, which was previously validated in a large multicenter study of colorectal cancer, was adapted to map the phenotypic-genotypic correlation between the extracted features and the ODX RS (Fig. 1, panel g). The unimodal and multimodal models were trained to regress RS as a continuous variable (Fig. 1, panel g).

Deep learning infers recurrence risk score from whole slide images

First, the WSI-based model was developed and tested across the three cohorts to measure generalizability of its performance. In the withheld MSK-BRCA test set, the unimodal WSI-based model achieved a Pearson correlation of 0.60 (95% C.I. 0.55 - 0.65, p < 10^-4; Fig. 20, panel a) and concordance correlation coefficient (CCC) of 0.57 (95% C.I. 0.52 - 0.62), along with area under the precision-recall curve (AUPRC) of 0.55 (95% C.I. 0.47 - 0.64; Fig. 20, panel d) and area under the receiver operating characteristic curve (AUROC) of 0.85 (95% C.I. 0.81 - 0.88; Fig. 16, panel a) for high-risk disease. In the external IEO-BRCA test set, the same model achieved a Pearson correlation of 0.60 (95% C.I. 0.55 - 0.65; p < 10^-4; Fig. 20, panel b) and CCC of 0.58 (95% C.I. 0.52 - 0.63; Fig. 18, panel b) along with AUPRC of 0.69 (95% C.I. 0.61 - 0.76; Fig. 20, panel e) and AUROC of 0.81 (95% C.I. 0.77 - 0.85; Fig. 16, panel b).
In the external MDX-BRCA test set, which used an inferred, ODX-like RS (see Methods), the same model achieved a Pearson correlation of 0.58 (95% C.I. 0.53 - 0.63; p < 10^-4; Fig. 20, panel c) and CCC of 0.40 (95% C.I. 0.35 - 0.45) along with AUPRC of 0.71 (95% C.I. 0.65 - 0.78; Fig. 20, panel f) and AUROC of 0.80 (95% C.I. 0.76 - 0.84; Fig. 16, panel c). Full results are detailed in the other panels of Figs. 20-21. In summary, the WSI-based model robustly infers RS and accurately identifies high-risk disease across three test cohorts derived from different medical centers and countries.

Deep learning infers recurrence risk score from text-based reports

Second, a unimodal text report-based model was developed that achieves a Pearson correlation of 0.59 (95% C.I. 0.53 - 0.65, p < 10^-4) and CCC of 0.53 (95% C.I. 0.47 - 0.58; Fig. 22, panel c) along with AUPRC of 0.53 (95% C.I. 0.45 - 0.61; Fig. 22, panel f) and AUROC of 0.81 (95% C.I. 0.76 - 0.85; Fig. 22, panel i) in the MSK-BRCA test set. Full results are detailed in Fig. 22. For the subset of male patients in the MSK-BRCA test set, the model demonstrates comparable performance (Supp. Tab. 3).

Multimodal integration of images and text improves recurrence risk score prediction

Finally, it was evaluated whether the multimodal model integrating WSIs and text-based reports, Orpheus, improves predictive performance relative to the unimodal models. In the MSK-BRCA test set, the multimodal model achieved a Pearson correlation of 0.70 (Fig. 23, panel a; 95% C.I. 0.65 - 0.74, p < 10^-4) and CCC of 0.67 (95% C.I. 0.62 - 0.72). For classification of high-risk (RS > 25) disease, the AUPRC was 0.65 (p < 10^-4; 95% C.I. 0.57 - 0.72), with a macro-averaged F1 score of 0.75 (Fig. 23, panel b). The CCC and Pearson’s correlation based on multimodal scores were higher than those based on unimodal scores (Fig. 16, panel d; Fig. 23, panel e). The AUROC was 0.88 (Fig. 23, panel d; 95% C.I.
0.86 - 0.91, p < 10^-4). A confusion matrix for the withheld test set is depicted in Fig. 23, panel e, showing minimal misclassification between extreme categories, with moderate errors between intermediate and extreme categories (p < 10^-4).

Multimodal recurrence risk model outperforms clinicopathologic nomogram

Next, Orpheus was compared to the state-of-the-art nomogram for predicting the ODX RS. Specifically, the subset of the MSK-BRCA test set with tumor grade and IHC-derived HR status available in the text report, as extracted by regular expressions, was analyzed. The ability of a nomogram based on clinical and pathologist-annotated features to discriminate high-risk disease was compared to that of Orpheus, the multimodal model (Fig. 16, panel e; Fig. 24, panel a), the text-based model (Fig. 24, panels b,d), and the whole slide image-based model (Fig. 24, panels c,e). Orpheus achieved an AUROC of 0.89 and AUPRC of 0.67 (95% C.I. 0.58 - 0.75), the vision model achieved an AUPRC of 0.55 (95% C.I. 0.46 - 0.65), and the language model achieved an AUPRC of 0.57 (95% C.I. 0.48 - 0.66). By comparison, the nomogram achieved an AUPRC of 0.40 (95% C.I. 0.32 - 0.50). For the multimodal model, an operating point of 31.8 with 93% precision and 23% recall (Fig. 16, panel f) is suggested. The model was well calibrated for risk stratification based on >25 as a high-risk threshold, with most predicted scores being low risk (Fig. 16, panels g-h).

Orpheus outperforms molecular risk score to identify patients with recurrent disease

Using 6814 hand-annotated cases, a large language model was tuned (see Methods) to infer recurrences from the electronic medical record, achieving an accuracy of 0.96 (Fig. 25). By analyzing cases with at least two years of follow-up and ODX RS values ≤ 25, linear models were first used to develop Orpheus+ models to infer risk of distant recurrence from H&E-stained WSIs.
In the combined MSK-BRCA test and validation set, Orpheus+ ascertained risk of distant recurrence with an AUROC of 0.77 (95% C.I. 0.68, 0.85). This was superior to the Oncotype DX® RS itself, which was uninformative in the RS ≤ 25 cohort (AUROC = 0.51; 95% C.I. 0.41, 0.61). Scores differed significantly for cases with or without metastatic recurrence for Orpheus+ (p ≤ 1e-4), but not for ODX RS (p > 0.05; Fig. 17, panel a). Next, the dynamic time-dependent AUROC was calculated for Orpheus+ in a time frame from two to eight years after surgery, accounting for censoring events to capture time-to-event dynamics encompassing the clinically relevant 5-year prediction window. Orpheus+ achieved a mean time-dependent AUROC of 0.75, whereas ODX RS attained a value of 0.49 (Fig. 17, panel c). In the MDX-BRCA test set, Orpheus+ identified recurrence risk (AUROC = 0.64; 95% C.I. 0.56, 0.72) comparably to the Oncotype-like Multiplex DX® laboratory assay risk score itself (AUROC = 0.64; 95% C.I. 0.57, 0.70) (Fig. 17, panels b,d). These results show the potential of Orpheus to improve identification of recurrence risk for patients classified as low risk by TAILORx categories.

Considering patients with any ODX RS (rather than only ≤ 25), the Orpheus+ performance is preserved, with a mean time-dependent AUROC of 0.72, and ODX RS achieves a value of 0.59 (Fig. 26, panel a). Considering only patients in the MSK-BRCA test set (rather than the combined test and validation set) for cases with ODX RS ≤ 25, the Orpheus+ performance is preserved, with a mean time-dependent AUROC of 0.88, and ODX RS achieves a value of 0.50 (Fig. 26, panel a). The characteristics of the distant recurrences in the test set are shown. For full results, see Fig. 26.
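As an illustration of the plain (not time-dependent) AUROC comparison restricted to cases with RS ≤ 25, the following is a minimal sketch. The function names and the toy values are hypothetical and are not taken from the study; the time-dependent AUROC with censoring described above is not reproduced here.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties = 0.5)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_low_rs(scores, measured_rs, recurred, cutoff=25):
    """AUROC for distant recurrence among cases with measured RS <= cutoff."""
    keep = [i for i, rs in enumerate(measured_rs) if rs <= cutoff]
    return auroc([scores[i] for i in keep], [recurred[i] for i in keep])


# Hypothetical toy cohort: image-derived risk, measured RS, recurrence (1 = yes).
image_risk = [0.90, 0.20, 0.70, 0.10, 0.95]
measured_rs = [10, 20, 18, 5, 40]
recurred = [1, 0, 1, 0, 1]

print(auroc_low_rs(image_risk, measured_rs, recurred))   # image-derived score
print(auroc_low_rs(measured_rs, measured_rs, recurred))  # measured RS as its own predictor
```

In this toy cohort the image-derived score separates recurrences perfectly within the RS ≤ 25 group while the measured RS is at chance level, which mirrors the direction, though not the magnitude, of the reported comparison.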
Multimodal interpretability shows concordance with biological processes

Next, the decision-making process behind the model’s predictions was examined using Orpheus’ innate interpretability mechanisms. Specifically, using the attention mechanism of the image and text transformers of Orpheus, the importance of features within the WSIs and pathology reports was visualized. Visualizing the attention of the slide (Fig. 18, panel a), the model designates most tiles as background with low attention scores (Fig. 18, panel b). Higher-attention tiles tend to contain invasive and in situ carcinoma compared to lower-attention tiles, which are more likely to contain fat and stroma (Fig. 18, panel c; Fig. 27). Analogously to the tiles, the importance of word tokens comprising the synoptic pathology report can be analyzed (Fig. 28, panel a). For the attention of the reports, analysis shows that words around immunohistochemical analyses for estrogen and progesterone receptors and lymphovascular invasion tend to have the highest mean relative attention, alongside punctuation and descriptions of Nottingham grade (Fig. 28, panel b). Ablating grade or progesterone receptor status from the text report decremented performance in the test set (Fig. 29).

Analyzing the latent space of the learned embeddings of the trained models reveals separation by histologic grade (Fig. 30, panel a) and progesterone receptor expression (Fig. 30, panel b) in the MSK-BRCA test set, with the gradients appearing along a learned, lyre-shaped manifold for the multimodal model. The same was observed for the ODX RS itself (Fig. 30, panel c). The association of predicted multimodal scores with genomic features was further tested. Limiting to cases with MSK-IMPACT, predicted RS was higher for tumors with TP53 mutations, MYC amplifications, PIK3CA amplifications, and BRCA2 mutations (Fig. 30, panels d-g), and it trended slightly higher for specimens with greater fraction of genome altered (Fig.
30, panel h). In the test set alone, the relationships for TP53 and fraction of genome altered persisted (Fig. 31).

Orpheus provides histologic characterization of high-risk disease

To further explore the model’s capability of correlating histologic features with ODX RS, the most-attended tiles for high- and low-risk disease were analyzed. The nuclei of these tiles were segmented, and derivative features of cell type proportions and cellular morphology were tabulated (Fig. 18, panel d). This reveals the relative abundance of inflammatory cells (Fig. 18, panels e-f) and neoplastic cells, along with the standard deviation of the neoplastic nuclear area (Fig. 18, panels g-h), as some of the features that differ significantly between the groups. Analysis of the tumor microenvironment reveals that high-risk disease exhibited greater stromal fraction (p < 10^-4, n=100) (Fig. 18, panel i), tumor cell proliferation (p < 10^-4, n=100) (Fig. 18, panel j), lymphocyte infiltration signature (p = 2 x 10^-3, n=100) (Fig. 18, panel k), and leukocyte fraction (p < 10^-4, n=100) (Fig. 18, panel l). Extending the tumor microenvironment analysis to external cohorts corroborated these results, especially for tumor cell proliferation, which exhibited a significant difference between the predicted high- and low-risk disease patients (p < 10^-4, n=100) in all three cohorts (Fig. 32).

As a further study of differences, a generative model was also trained to synthesize fields of view for informative tiles for high- and low-risk disease (Fig. 5, bottom panel). Tiles conditioned on the high-risk class depicted confluent clusters of tumor cells with moderate to marked nuclear pleomorphism and prominent nucleoli, and tiles conditioned on the low-risk class depicted trabeculae and clusters of tumor cells with moderate nuclear pleomorphism and inconspicuous nucleoli. Tiles conditioned on the background class depicted stroma without epithelial cells.
Orpheus as a conceptual triaging tool for low- and high-risk disease

The utility of Orpheus was tested as a pre-screening tool to reduce the load of laboratory testing for breast cancer recurrence risk in clinical workflows. First, a sensitivity analysis was conducted to evaluate the performance of the predicted recurrence risk score in identifying low-risk patients, defined as those with a recurrence risk score < 11. The analysis yielded a sensitivity for the test sets of MSK-BRCA (n=1029, 16% predicted low-risk), IEO-BRCA (n=452, 6% predicted low-risk) and MDX-BRCA (n=572, 1% predicted low-risk) of 0.90 (Fig. 33, panel a), 0.96 (Fig. 33, panel b) and 0.99 (Fig. 33, panel c), respectively, for the low-risk subgroup. Second, a specificity analysis was conducted to evaluate the performance of the predicted recurrence risk score in identifying high-risk patients, defined as those with a recurrence risk score > 25 and who are most likely to benefit from adjuvant chemotherapy. This resulted in a specificity of 0.94 (Fig. 33, panel d), 0.68 (Fig. 33, panel e), and 0.62 (Fig. 33, panel f) to predict the high-risk subgroup for the test sets of MSK-BRCA (n=1029, 11% predicted high-risk), IEO-BRCA (n=452, 45% predicted high-risk) and MDX-BRCA (n=572, 61% predicted high-risk), respectively. Next, the analyses were repeated stratified by age, specifically patients above 50 years of age (Fig. 34) and patients below or equal to 50 years of age (Fig. 35), and by nodal status, with similar performance metrics in all cohorts regardless of age and nodal status following the TAILORx risk groups. Finally, the model’s performance on the intermediate-risk (RS 11-25) subgroup was analyzed using the AUROC, Cohen’s kappa, F1 score, accuracy and Matthews correlation coefficient (Supp. Tab. 4), utilizing additional clinically relevant thresholds of 10, 15 and 25 to binarize the risk predictions.
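The thresholded sensitivity and specificity calculations can be sketched as follows. This is a minimal illustration and not the study's code: the function names and toy scores are hypothetical, and the convention that a score strictly greater than the threshold counts as positive is an assumption carried over from the RS > 25 high-risk definition to the other thresholds.

```python
def binarize(scores, threshold):
    """True where the score exceeds the threshold (assumed strict inequality)."""
    return [s > threshold for s in scores]


def sensitivity_specificity(measured, predicted, threshold):
    """Sensitivity and specificity of predicted scores against measured scores."""
    truth = binarize(measured, threshold)
    calls = binarize(predicted, threshold)
    tp = sum(t and c for t, c in zip(truth, calls))
    fn = sum(t and not c for t, c in zip(truth, calls))
    tn = sum(not t and not c for t, c in zip(truth, calls))
    fp = sum(not t and c for t, c in zip(truth, calls))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical toy scores, binarized at the thresholds named in the text.
measured_rs = [5, 12, 30, 40, 20, 8]
predicted_rs = [7, 18, 28, 22, 27, 9]
for cutoff in (10, 15, 25):
    print(cutoff, sensitivity_specificity(measured_rs, predicted_rs, cutoff))
```

A triaging workflow of the kind described would call only the confidently high or confidently low predictions and refer the remaining cases for laboratory testing.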
When observing the model’s performance specifically on the intermediate-risk group (RS 11-25), a substantial decrease was observed in all metrics compared to all risk groups (RS 0-100).

In summary, Orpheus accurately identifies patients with high-risk disease as defined by TAILORx, with a high degree of confidence. The model shows potential to guide adjuvant chemotherapy decisions without the need for multigene assay testing (Fig. 19). Specifically, adjuvant chemotherapy could be selectively recommended for a subset of patients classified as high-risk with high confidence. This approach could streamline treatment decisions and reduce the need for additional testing, ultimately improving patient care and resource allocation. Furthermore, for patients who are treated per TAILORx with adjuvant chemoendocrine or endocrine therapy based on ODX RS results, Orpheus identifies distant metastatic recurrences more accurately than ODX RS itself in the test set. With further validation, this prognostic value has the potential to refine patient selection for personalization of the adjuvant treatments and follow-up strategies.

Discussion

The ODX RS is among the most extensively validated predictive and prognostic biomarkers, and given this widespread application, new applications continue to be developed. The model, Orpheus, provides a way to approximate the continuous RS from routine H&E WSIs, extending the ODX RS and its myriad applications to centers where it is not feasible financially or logistically to order the laboratory assay itself.
In one example application, Orpheus can precisely identify approximately one quarter of patients with high-risk disease as defined by TAILORx without the need for ODX RS, with superior discrimination of this class compared to a state-of-the-art nomogram, which integrates clinicopathologic features such as IHC-derived progesterone/estrogen receptor positivity, tumor size, lobular versus ductal histology, Nottingham grade, and age. This would potentially enable physicians to forgo molecular testing in selected cases. Orpheus has the added advantage of not requiring manual curation of these features from the healthcare record. By inferring the continuous RS rather than identifying risk categories, Orpheus further enables emerging applications that tools such as the nomogram would not support, such as identification of risk of local recurrence, clinical trial eligibility, or defining populations that will benefit from the use of neoadjuvant systemic therapies beyond the currently used clinical characteristics. It is further shown that this correlation with RS corresponds to meaningful prognostication: for patients who are treated per TAILORx with adjuvant chemoendocrine or endocrine therapy based on ODX RS results, Orpheus identifies distant metastatic recurrences more accurately than ODX RS itself in the test set. With further validation, this prognostic value has implications ranging from tailoring the frequency of surveillance imaging to use in patient decision-making around treatment escalation and compliance with adjuvant endocrine therapy. Moreover, the finding that multimodal approaches significantly outperform unimodal models further strengthens the broader perspective that integrating multimodal real-world data is a promising direction for AI in oncology.
Orpheus is a flexible machine learning framework building on validated unimodal transformer-based architectures and data integration paradigms from the field of sentiment analysis. The lightweight framework accepts tokens from the rapidly evolving foundation models for subsequent integration, allowing machine learning practitioners to harness the representative power of deep learning with small cohorts comprising only thousands of patients.

From a biological perspective, the study corroborates via orthogonal histopathologic and inferred transcriptomic analyses that greater immune infiltration portends higher-risk disease. This finding is in accordance with prior studies showing that tumor-infiltrating lymphocytes are a negative prognostic factor in HR+/HER2- EBC and may be associated with higher RS. By estimating transcriptomic programs from images using the validated model, proliferation was also found to be higher in the analysis of patients with predicted high risk, correlating with grade, the MKI67 gene included in the calculation of the RS, and the TP53 mutations, and perhaps explaining the empiric association of more heterogeneous areas and perimeters of cancer cells with higher-risk disease. Finally, the greater inferred stromal fraction for higher-risk disease is possibly related to cancer-associated fibroblasts and provides support for this line of inquiry. Together, these findings show that deep learning interpretability is greatly improved by orthogonal molecular and cell-level data, which in turn can yield hypothesis-driven insights for biological discovery. Future work should include spatial transcriptomic data for direct characterization of clonal heterogeneity and immune programming.
Taken together, this study advances an improved platform to approximate the ODX RS from routine histopathologic WSIs, outperforming a leading existing method in identification of high-risk disease and, critically, identifying metastatic recurrences for cases with low ODX RS values more accurately than the ODX RS itself. The multimodal model improves performance when pathology text reports are available and is a lightweight and flexible machine learning architecture suitable for application to biomarkers for other histologies. The orthogonal histopathologic and transcriptomic analyses corroborate proliferation and tumor-infiltrating lymphocytes as markers of higher risk. Clinically, Orpheus has the potential to both expand access to precision medicine through ODX RS approximation and enhance its efficacy by identifying patients at risk of distant metastatic recurrence, even among those deemed low-risk by TAILORx-guided treatment.

Methods

Statistics & Reproducibility

This study was conducted retrospectively. Therefore, no statistical method was used to predetermine sample size, the experiments were not randomized, and the investigators were not blinded to allocation during experiments and outcome assessment. The data are split into a cohort for training and validation of Orpheus, MSK-BRCA, and two external validation cohorts, IEO-BRCA and MDX-BRCA. Model training included both male and female cases irrespective of age and nodal status, followed by stratified analyses to evaluate performance in each subgroup. Sex was determined based on sex assigned at birth reported by the institutional database. Before any analysis, the MSK-BRCA cohort was split into a training/validation set and a withheld test set using 80% and 20% of patient IDs, respectively.
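A patient-level split of the kind described can be sketched as follows. The shuffling procedure, the fixed seed, and the function name `split_by_patient` are assumptions for illustration; the source specifies only that 80% and 20% of patient IDs were allocated before any analysis.

```python
import random


def split_by_patient(patient_ids, test_fraction=0.2, seed=0):
    """Assign each unique patient ID to 'train_val' or 'test'.

    Splitting on patient IDs (rather than on slides or reports) keeps all
    material from one patient on the same side of the split.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    rng.shuffle(unique_ids)
    n_test = round(len(unique_ids) * test_fraction)
    test_ids = set(unique_ids[:n_test])
    return {pid: ("test" if pid in test_ids else "train_val") for pid in unique_ids}


# Hypothetical example: ten patients, each contributing two slides.
slide_patient_ids = ["P%02d" % i for i in range(10)] * 2
assignment = split_by_patient(slide_patient_ids)
slide_split = [assignment[pid] for pid in slide_patient_ids]
```

Because the assignment is keyed by patient ID, both slides of any one patient always receive the same label.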
Model evaluation consisted of two steps: first, comparing Orpheus’s predicted risk scores against OncotypeDX recurrence scores using correlation metrics and risk group classification performance; second, assessing both Orpheus and ODX RS predictions for distant metastatic recurrence through calibration and time-dependent analyses. Unless otherwise specified, high-risk is consistently defined as scores > 25 and low-risk as scores ≤ 25, following TAILORx thresholds. This standardized cut-off applies to both Orpheus and OncotypeDX recurrence scores, independent of age or nodal status, throughout the entire manuscript for direct comparability between the predicted Orpheus scores and OncotypeDX recurrence scores. Additional statistical and experimental details, including sample inclusion and exclusion criteria, are provided in the subsequent cohort curation, model training and model evaluation sections. For FIG. 4, panel c, micrographs were selected by attention scores of one random slide. For FIG. 4, panels e-h, representative fields of view with high or low values of the feature in question were selected. The experiments were not repeated to generate different micrographs.

Cohort Curation

For the MSK cohort, cases from 2013-2020 were selected according to Fig. 7 for this retrospective analysis. All cases were pathologically confirmed HR+/HER2- invasive breast carcinoma without distinction by specific histologic subtype. Pathology reports and slides were joined by the surgical pathology part number rather than case or block. The synoptic pathology text report for the part used to calculate the RS was included, with fields such as histologic subtype, HR and HER2 IHC percent positivity, histologic grade, anatomic site, and DCIS and LCIS. Oncotype DX results were recorded manually from the healthcare record by a medical oncologist.
Regular expressions were used to extract progesterone, estrogen, and HER2 receptor percentage positivity from the pathology text reports.

For the external validation cohorts, the cohort from the European Institute of Oncology (IEO-BRCA, Milan, Italy) contained a total of 456 early-stage breast cancer patients who received the official Oncotype DX test. Only histopathology slides and corresponding clinicopathological variables were available for analysis. After filtering patients based on histology slide availability, 452 patients in the IEO-BRCA cohort were available for external validation. The cohort from MultiplexDX (MDX-BRCA, Bratislava, Slovakia) contained a total of 1013 early-stage breast cancer patients originally obtained for a retrospective study from Biobank Graz of the Medical University of Graz (n=390, Graz, Austria), PATH Biobank (n=592, Munich, Germany), The Biomedical Research Institute of Málaga (n=27, IBIMA-CIMES-UMA, Malaga, Spain), and a commercial company (n=4, AMBIO). Histopathology slides, corresponding clinicopathological variables, and research-based Oncotype DX scores derived from RNA-sequencing were available for analysis. After filtering patients based on histology slide availability and on nodal, ER, PR and HER2 status to those who would have been eligible for Oncotype DX, 575 patients in the MDX-BRCA cohort remained for external validation. The research-based scores were calculated using the genefu Bioconductor package, based on the original algorithm to calculate the OncotypeDX score. Because research-based versions of OncotypeDX use different data inputs (e.g., microarray/RNA-seq) compared to the official OncotypeDX (e.g., RT-qPCR), this may result in scaling effects when comparing research-based scores with official scores, as demonstrated by the OPTIMA trial group.
Consequently, their outlined approach, which provides a linear equation modeling the relationship between research-based and official OncotypeDX scores, was used to rescale the research-based recurrence score into a more realistic range. All included cases were hormone receptor-positive and HER2-negative as defined by the American Society of Clinical Oncology/College of American Pathologists clinical practice guidelines. Experienced breast cancer pathologists from each institution reviewed the case to confirm the diagnosis of invasive breast cancer and receptor status.

Training setup of vision, language and multimodal models

For vision model training, images were preprocessed using STAMP with 1.14 microns per pixel, tile edge length of 224 pixels, and Macenko normalization. Tiles were embedded using CTransPath. After a fully connected layer to project the CTransPath-derived tokens into 512-dimensional space and a ReLU, two PyTorch TransformerEncoderLayers with dimensionality 512 and eight heads were stacked before a final LayerNorm and projection to scalar space. No activation function was used, and no positional encodings were used. For training, a maximum learning rate of 2e-5 with linear warmup of 1000 steps, learning rate decay by a factor of 0.9999 every step, and L2 decay of 2e-5 was used. A batch size of one slide per GPU across two GPUs with accumulated gradients over four batches was used, with gradients clipped at 0.5. The model was trained for up to 50 epochs with early stopping. This was implemented in PyTorch-Lightning and run on two NVIDIA Tesla V100 GPUs (CUDA 12.1) on a cluster running Linux. Metrics were tracked using Weights and Biases, and the model with the lowest validation mean squared error was chosen for downstream use. During inference, attention rollout
was used to attribute attention to each input token, and multiple predictions with dropout enabled were used to estimate confidence intervals. Regression was used instead of classification because classification discards information and because regression substantially outperformed classification in the validation set. For cases with multiple slides corresponding to the same pathologic specimen, all the tiles were bagged prior to any transformer-based analysis.

For language model training, the HuggingFace model tsantos/PathologyBERT was tuned using the HuggingFace BertForSequenceClassification, Trainer, and AutoTokenizer with a batch size of eight part descriptions per GPU, four NVIDIA Tesla V100 GPUs, four gradient accumulation steps per update, a learning rate of 2e-5, L2 decay of 0.01, and ten training epochs. Prior to tokenization, the text corresponding to the part used to measure the Oncotype score was extracted using regular expressions. Addenda, when available, were concatenated to the part description. Names, initials, and the most common logistical comments were removed prior to tokenization. The tsantos/PathologyBERT tokenizer was not modified. The model with the lowest validation loss was chosen for downstream use. During inference, attention rollout was used to attribute attention to each input token, and multiple predictions with dropout enabled were used to estimate confidence intervals.

For multimodal model training, multiple architectures, including simple concatenation of embeddings before dense layers, attention-based integration of unimodal embeddings, and averaging of scores, were explored using the validation set.
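The learning-rate schedule described above for the vision model (linear warmup to a maximum of 2e-5 over 1000 steps, then decay by a factor of 0.9999 every step) can be illustrated with a minimal sketch. The function name is hypothetical, and the sketch assumes that the per-step decay begins only after warmup has finished, which the text does not state explicitly.

```python
MAX_LR = 2e-5        # maximum learning rate stated in the text
WARMUP_STEPS = 1000  # linear warmup length stated in the text
DECAY = 0.9999       # per-step multiplicative decay stated in the text


def learning_rate(step: int) -> float:
    """Learning rate at a 0-indexed optimizer step.

    Assumption: decay is applied only once warmup is complete.
    """
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to MAX_LR over the warmup period.
        return MAX_LR * step / WARMUP_STEPS
    # Exponential decay counted from the end of warmup.
    return MAX_LR * DECAY ** (step - WARMUP_STEPS)


if __name__ == "__main__":
    for s in (0, 500, 1000, 5000):
        print(s, learning_rate(s))
```

Under this assumption the rate peaks at step 1000 and falls slowly afterwards, which is one plausible reading of the stated hyperparameters.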
The final multimodal model chosen took the pre-computed unimodal embeddings as input, projected them into 96-dimensional space, performed tensor fusion by prepending unity and taking the Cartesian product, applied 30% dropout, and passed the result through a small 96-dimensional regression head to yield a scalar regression score. Finally, linear regression trained on the training set was used to calibrate the weight of the unimodal and multimodal scores to yield a final score. Multiple predictions with dropout enabled were used to estimate confidence intervals. UMAP plots were generated using the Python umap software package with 10 neighbors and min_dist 0.5; all plots were fit on the training set, and only test sets are shown.

Model evaluation

When multiple slides were available for a single measured score and all contained relevant tissue as per preprocessing, the relevant tissue tiles were bagged prior to inference or training by the vision model. Models were evaluated primarily by Pearson correlation with its associated significance and by the concordance correlation coefficient. 95% confidence intervals were calculated using bootstrapping (random sampling with replacement) 1000 times. Areas under the precision-recall and receiver operating characteristic curves were calculated using binary thresholding of high- and low-risk disease. Significance was established using 1000-fold permutation tests. Operating points on the precision-recall curve were analyzed by varying the threshold from greatest to lowest and tabulating the respective precision and recall for each value. F1 scores were calculated using the weighted average. Confusion matrices were established using the three risk categories (<11, 11-25, >25), and significance was established using McNemar’s test of homogeneity. To further evaluate the performance of the model, the AUROC, F1 score, accuracy, Cohen’s Kappa, and Matthews Correlation Coefficient (MCC) were utilized.
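As an illustration of the tensor fusion step described above for the multimodal model, the following sketch prepends a constant 1 to each unimodal vector and flattens their outer product. The function name and the toy inputs are hypothetical; the projection layers, dropout, and regression head are omitted.

```python
from typing import List, Sequence


def tensor_fusion(vision: Sequence[float], language: Sequence[float]) -> List[float]:
    """Flattened outer product of [1, vision] and [1, language].

    Prepending 1 keeps a constant term and both unimodal vectors
    alongside every pairwise vision-language product.
    """
    v = [1.0, *vision]
    t = [1.0, *language]
    return [vi * tj for vi in v for tj in t]


# Toy two-dimensional embeddings (illustrative values only).
fused = tensor_fusion([0.5, -2.0], [3.0, 4.0])
print(fused)
```

With two 96-dimensional projections, as in the text, the fused vector would have 97 x 97 = 9409 entries before the regression head.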
These metrics (AUROC, F1 score, accuracy, Cohen’s Kappa, and MCC) were assessed across all risk groups (0-100) and specifically within the intermediate risk group (11-25). To investigate the model’s performance using various clinically-defined thresholds, binary risk stratification (low versus high) was applied at cut-off values of 10, 15, and 25, analyzing both the entire population and subgroups stratified by age and nodal status. This approach allowed us to thoroughly examine the model’s predictive capabilities across populations with different definitions of risk, and to evaluate its performance within the clinically challenging intermediate risk group.

To compare Orpheus against the state-of-the-art clinical nomogram, logistic regression for high-risk disease was performed using the nomogram’s formula, which includes age, histological classification, tumor grade, PR regression status, and tumor size, with corresponding coefficients and an intercept term. The features were extracted using regular expressions from the pathology report. Reports with failed extraction (e.g., due to absence of hormone receptor annotation in the pathology report or unusual formatting precluding the extraction of tumor size) were excluded from the comparison analysis. Precision-recall analyses were performed using the predicted score [0, 100] in the case of the transformer regression, and the logistic regression score [0, 1] in the case of the nomogram’s logistic regression formula.

In summary, Orpheus was evaluated through a two-pronged approach. First, its ability to predict ODX RS values was assessed by comparing sample-level continuous risk predictions using Pearson correlation and concordance correlation coefficient. Its performance in classifying patients into TAILORx risk groups, measuring agreement with the gold-standard RS categorization, was then evaluated through multiple metrics: AUROC, AUPRC, F1 score, accuracy, Cohen’s Kappa, and Matthews Correlation Coefficient. In the second step, both Orpheus and RS were compared against actual clinical outcomes, specifically distant metastatic recurrence. This comparison utilized calibration plots and time-dependent AUROCs, which account for censoring events, to capture time-to-event dynamics beyond the clinically relevant 5-year prediction window.

Patient outcome modeling

A method to identify primary breast cancer patients with metastatic recurrence was developed from a combination of medical oncology notes, radiology notes, tumor marker lab values, and internal referral data using natural language processing (NLP) and machine learning methods. Briefly, a Clinical Longformer model pretrained on MIMIC-III data was selected to predict metastatic recurrence from medical oncology notes. The model was finetuned on a note dataset consisting of the note closest to the follow-up date of the patient and a random sample of three notes following the date of surgery for N=6,814 patients with disease status labels curated from the Breast Disease Management Team at Memorial Sloan Kettering. Notes dated prior to the date of recurrence were labeled ‘early’, while notes following the date of local and metastatic recurrence were labeled ‘local’ and ‘metastatic’, respectively. The finetuned Clinical Longformer model was tested on a held-out dataset of N=1363 primary breast cancer patients. Ensembling was performed: specifically, the local and metastatic probabilities of the NLP model were further included as features of a Random Forest model that also included outputs from a metastatic sites model (Clinical BERT), progression probability (RoBERTa), positive tumor markers (CEA, CA 15-3, CA 125), and indicators of internal referral. The MDX-BRCA cases were manually annotated as distant, non-recurrent, or uncertain recurrences, and only cases with ≥ 24 months of follow-up time were considered for these analyses.
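The note-labeling rule used to build the finetuning dataset above can be sketched as follows. The function name is hypothetical, and two details are assumptions not stated in the text: a note dated on the day of a recurrence is treated as following it, and the ‘metastatic’ label takes precedence when both a local and a metastatic recurrence precede the note.

```python
from datetime import date
from typing import Optional


def label_note(note_date: date,
               local_recurrence: Optional[date],
               metastatic_recurrence: Optional[date]) -> str:
    """Assign 'early', 'local', or 'metastatic' to a clinical note by date."""
    # Assumption: metastatic recurrence outranks local recurrence.
    if metastatic_recurrence is not None and note_date >= metastatic_recurrence:
        return "metastatic"
    if local_recurrence is not None and note_date >= local_recurrence:
        return "local"
    # No recurrence recorded on or before the note date.
    return "early"


if __name__ == "__main__":
    print(label_note(date(2016, 5, 1), date(2017, 1, 1), None))
```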
Distant metastatic recurrences in the MSK-BRCA test set were manually validated and tabulated. For the subsequent outcome modeling, bagged WSI-based embeddings were generated for each case. Using the same train/validation/test split of the MSK-BRCA dataset, a logistic regression model implemented in scikit-learn with C = 1e-5 was trained with class weights calculated by the scikit-learn compute_class_weight utility given the class imbalance. Calibration was performed using the sigmoid method with five-fold cross validation in the training set. The model was applied to the combined MSK-BRCA test and MSK-BRCA validation sets and the MDX-BRCA test set. Orpheus+ models were trained separately for any recurrence, locoregional recurrence, and distant metastatic recurrence. Calibration curves were plotted using the corresponding scikit-learn functions, comparing predicted probabilities from Orpheus with observed outcomes, accounting for censoring events. Moreover, time-dependent area under the receiver operating characteristic (AUROC) curves were plotted using scikit-survival with three-month bins from 24 to 96 months, accounting for censoring events to capture time-to-event dynamics beyond the clinically relevant 5-year prediction window. Mann-Whitney U tests were used to compare inferred risk scores for cases with, and without, the type of recurrence under analysis.

Model interpretability

To interpret model predictions, multiple analytical approaches were employed: visualizing attention patterns in text reports for the language component of the model, and examining nuclear features, generating tissue representations, quantifying tumor microenvironment components, and conducting gene association analyses across risk groups for the vision component of the model. For report visualization, Django was used to colorize each token in the description by relative attention.
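Colorizing tokens by relative attention can be sketched with the Python standard library alone. The source does not specify the color scale, so the hue mapping below (blue for the lowest attention, red for the highest, full saturation and value) and the function name are assumptions for illustration only; the Django rendering layer is omitted.

```python
import colorsys
from typing import List, Sequence


def attention_to_hex(attention: Sequence[float]) -> List[str]:
    """Map attention weights to hex colours via min-max scaling in HSV space."""
    lo, hi = min(attention), max(attention)
    span = (hi - lo) or 1.0  # avoid division by zero when all weights are equal
    colours = []
    for a in attention:
        rel = (a - lo) / span              # relative attention in [0, 1]
        hue = (1.0 - rel) * 240.0 / 360.0  # assumed scale: 240 deg (blue) to 0 deg (red)
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        colours.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colours


if __name__ == "__main__":
    tokens = ["invasive", "ductal", "carcinoma"]
    for tok, col in zip(tokens, attention_to_hex([0.1, 0.3, 0.6])):
        print(tok, col)
```

Each hex string could then be used as a background color for the corresponding token in a rendered report.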
A similar color scale in HSV space was applied to tint each tile by its relative attention, with absolute attention plotted on corresponding histograms with logarithmic counts displayed. The five tiles with highest and lowest absolute attention are displayed for quality control by users.

For nuclear analysis, the three most informative tiles from each slide with the highest 100 and lowest 100 predicted scores were identified. HoVerNet with PanNuke-derived weights was used for instance segmentation, and quantitative features such as solidity, area, and perimeter were calculated for the outline of each nucleus. Within each tile, summative statistics were used to aggregate these features. The feature with the highest variance inflation factor was iteratively removed until the highest variance inflation factor was 10 or less. Comparisons for the values of these remaining features between the high- and low-risk groups were made with the two-sided Mann-Whitney U test with corrections for multiple testing. For the volcano plot, log fold change was calculated as the base-two logarithm of the mean value for the high-risk group over the same for the low-risk group. Tiles with fewer than 50 total nuclei were excluded. For comparisons involving standard deviations, tiles with fewer than ten of the relevant cell type were excluded.

For the generation of tissue representations across the risk groups, the highest-attention tiles were identified from cases with measured recurrence scores below 11 or above 25 as chosen by an Attn-MIL model trained on the training set. Subsequently, using these tiles and a random sampling of low-attention tiles across the same slides, StudioGAN was used to train a ReACGAN architecture with big_resnet backbone, batch size of 36 per GPU across four NVIDIA Tesla V100 GPUs, and default loss parameters. The conditional architecture encoded three classes: high score, low score, and background.
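For the nuclear analysis described above, the volcano-plot log fold change and the tile exclusion rule can be sketched as below. The tile representation (a pair of nucleus count and feature value), the function names, and the sample values are hypothetical, and the sketch assumes both group means are positive so that the logarithm is defined.

```python
import math
from statistics import mean
from typing import List, Sequence, Tuple

MIN_NUCLEI_PER_TILE = 50  # tiles with fewer total nuclei are excluded, per the text


def keep_tiles(tiles: Sequence[Tuple[int, float]]) -> List[float]:
    """Return feature values for tiles given as (nucleus_count, feature_value)."""
    return [value for count, value in tiles if count >= MIN_NUCLEI_PER_TILE]


def log2_fold_change(high_risk: Sequence[float], low_risk: Sequence[float]) -> float:
    """Base-two log of mean(high-risk) over mean(low-risk).

    Assumption: both means are strictly positive.
    """
    return math.log2(mean(high_risk) / mean(low_risk))


# Illustrative values only, not data from the study.
high = keep_tiles([(120, 4.0), (30, 100.0), (75, 4.0)])
low = keep_tiles([(60, 1.0), (90, 1.0)])
print(log2_fold_change(high, low))
```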
For the GAN, spectrum plots and canvases were generated as per the default StudioGAN code.

For the quantification of the tumor microenvironment, a pre-trained deep learning model was applied to the top 50 predicted high- and low-risk patients according to the recurrence risk vision prediction model, specifically for the stromal fraction (SF) and leukocyte fraction (LF) as assessed via DNA methylation analysis, and for the lymphocyte infiltrating signature score (LISS) and proliferation (Prolif) as measured by RNA expression. The deep learning regression model was trained on whole-slide images from a breast cancer cohort from The Cancer Genome Atlas (TCGA) in a weakly-supervised setting using the open-source biomarker data. Statistical significance was measured by an independent t-test, indicating a difference in sample means between predicted high- and low-risk patients (p < 0.05). The scores for the tumor microenvironment quantification were inferred based on the same tile embeddings (1x768) which were used in training the vision model for the recurrence risk prediction.

To analyze the association of genetic patterns with predicted risk groups, the specimens were sequenced by MSK-IMPACT, annotated by OncoKB, and accessed via cBioPortal. Only specimens attributed to a patient without other specimens were considered to avoid part-level mismatch, and specimens across the MSK-BRCA training, validation, and test sets were included. Variants in the most commonly altered genes in the MSKCC Clinical Sequencing Cohort (TP53, PTEN, BRCA2, FGF4, KMT2C, PAK1, FGF19, CDH1, MYC, ARID1A, FGF3, GATA3, CBFB, CCND1, RUNX1, PIK3CA, FGFR1, NSD3, MAP3K1) were considered, provided that the variant in question (e.g., TP53 SNV) occurred in at least two of the considered samples. Passenger mutations were ignored.
Using Bonferroni correction, genes associated with high- or low-risk status based on the measured, ground truth recurrence score with a significance of q = 0.05 were identified. Fraction of genome altered, tumor mutational burden, and mutation count were analyzed using the Pearson correlation with the same significance and correction.

Supplemental Tables

Supp. Table 1. Data characteristics. The train-test-validation split was at the patient level.

split               count       type
MSK-BRCA (test)     1026        patients
MSK-BRCA (train)    3336        patients
MSK-BRCA (val.)     783         patients
IEO-BRCA (test)     452         patients
MDX-BRCA (test)     575         patients
MSK-BRCA (test)     1029        reports (cases)
MSK-BRCA (train)    3359        reports (cases)
MSK-BRCA (val.)     789         reports (cases)
IEO-BRCA (test)     N/A         reports (cases)
MDX-BRCA (test)     N/A         reports (cases)
MSK-BRCA (test)     2338        slides
MSK-BRCA (train)    7588        slides
MSK-BRCA (val.)     1740        slides
IEO-BRCA (test)     452         slides
MDX-BRCA (test)     575         slides
MSK-BRCA (test)     11765872    tiles
MSK-BRCA (train)    38905642    tiles
MSK-BRCA (val.)     8966681     tiles
IEO-BRCA (test)                 tiles
MDX-BRCA (test)                 tiles

Supp. Table 2. Age and self-reported race of patients.

Feature                          Value
Age (years)
  Count                          4768
  Mean                           56.1
  Standard deviation             11.1
  Minimum                        17.0
  Maximum                        86.0
  25th percentile                47.0
  50th percentile                56.0
  75th percentile                65.0
Race (self-reported)
  Asian                          391
  Black or African American      313
  Native American                3
  Pacific Islander               5
  White                          3662
  Other                          159
  Unknown                        612
Sex (self-reported)
  MSK-BRCA  Female               4742
  MSK-BRCA  Male                 50
  MSK-BRCA  Unknown              353
  IEO-BRCA  Female               452
  MDX-BRCA  Female               575

Supp. Table 3. Performance for male patients.

split   model   metric    value            n
train   vis     slope     1.261812896      31
train   vis     pearson   0.8589498758     31
train   vis     mae       0.04878772955    31
train   vis     ccc       0.7926909328     31
test    vis     slope     0.9137974797     12
test    vis     pearson   0.6126447519     12
test    vis     mae       0.05793593528    12
test    vis     ccc       0.5214329362     12
val     vis     slope     0.7171253748     7
val     vis     pearson   0.8259023547     7
val     vis     mae       0.07663633709    7
val     vis     ccc       0.5003988147     7
train   lan     slope     1.355412596      31
train   lan     pearson   0.8028293436     31
train   lan     mae       0.05898198422    31
train   lan     ccc       0.6931256056     31
test    lan     slope     0.7069191898     12
test    lan     pearson   0.5194452816     12
test    lan     mae       0.07271688548    12
test    lan     ccc       0.4304119647     12
val     lan     slope     0.6575342091     7
val     lan     pearson   0.3973613339     7
val     lan     mae       0.1085435926     7
val     lan     ccc       0.1110515147     7
train   mul     slope     1.158483157      31
train   mul     pearson   0.8998618553     31
train   mul     mae       0.04087739137    31
train   mul     ccc       0.86371696       31
test    mul     slope     0.76549028       12
test    mul     pearson   0.6484977878     12
test    mul     mae       0.0544721086     12
test    mul     ccc       0.5714293718     12
val     mul     slope     0.9353418916     7
val     mul     pearson   0.8846110828     7
val     mul     mae       0.08006460394    7
val     mul     ccc       0.4774314761     7

Supp. Table 4. Metrics for evaluation of clinically-defined thresholds from the TAILORx study in all risk groups (0-100) and the intermediate risk group (11-25). The risk subgroups are binarized at 10, 15, and 25, where low-risk is defined as ≤ the threshold, and high risk as > the threshold. The metrics used for evaluation are the area under the curve (AUC), Cohen's Kappa, F1 score, accuracy, and Matthews Correlation Coefficient (MCC). The metrics for all risk groups are calculated for MSK (n=1029), IEO (n=452) and MDX (n=572). The metrics for the intermediate risk group are calculated for MSK (n=645), IEO (n=256) and MDX (n=132).

                           All risk groups           Intermediate risk group
Metric          Cohort     10      15      25        15
AUC             MSK        0.73    0.75    0.85      0.65
AUC             IEO        0.76    0.75    0.81      0.61
AUC             MDX        0.73    0.75    0.8       0.51
Cohen's Kappa   MSK        0.28    0.34    0.41      0.19
Cohen's Kappa   IEO        0.22    0.21    0.41      0.07
Cohen's Kappa   MDX        0.02    0.1     0.29      0.01
F1              MSK        0.86    0.68    0.47      0.64
F1              IEO        0.93    0.8     0.62      0.72
F1              MDX        0.71    0.61    0.58      0.6
Accuracy        MSK        0.77    0.67    0.88      0.6
Accuracy        IEO        0.88    0.7     0.72      0.71
Accuracy        MDX        0.56    0.49    0.63      0.45
MCC             MSK        0.29    0.34    0.42      0.2
MCC             IEO        0.23    0.22    0.42      0.07
MCC             MDX        0.09    0.2     0.33      0.03

C. Systems and Methods for Determining Scores Related to Recurrence of Cancers in Subjects Using Multimodal Machine Learning (ML) Architectures

Predicting the recurrence risk of certain cancers may be used as a factor in determining whether to provide the subject with therapy. Multigene expression assays such as recurrence scores (e.g., for hormone receptor-positive early breast cancer (HR+/HER2- EBC)) may be effective at predicting recurrence risk but may have limitations due to long turnaround times, high cost, and maintenance of samples. Additionally, clinicopathologic tools require manual extraction of data from unstructured electronic health records and often underperform compared to molecular assays. Other approaches, such as unimodal models and nomograms, may also face significant challenges. Unimodal models that rely solely on either imaging or text data may often fail to capture the complex interplay between, and orthogonal data elements from, different types of data, leading to suboptimal predictive performance. Nomograms, on the other hand, may rely on manual extraction of relevant inputs from unstructured electronic health records and may suffer from poor performance in terms of accuracy and reliability.

To address these challenges, a multimodal machine learning (ML) model may be used to integrate and process image data and text data to generate a measure of recurrence risk and outcome for a subject with cancer.
The model can integrate both biomedical images (e.g., H&E-stained whole-slide images) and their corresponding synoptic text reports to more accurately infer the recurrence score, relative to other techniques. By leveraging multimodal data, the ML model can extract meaningful image and text features associated with high recurrence risk. These insights can lead to better understanding of the disease informing the risk of recurrence or predicted outcome (including likelihood of distant metastasis) that would not be deducible from unimodal approaches. The ability of the ML model to take unstructured multimodal data can also reduce reliance on extensive manual data extraction, thereby reducing the amount of time and effort on the part of specialized experts. The use of the predicted or expected likelihoods generated by the ML model from the multimodal data can reduce the reliance on time-consuming molecular assays, and ultimately improve clinical outcomes for subjects with cancer.

Referring now to Fig. 36, depicted is a block diagram of a system 100 for determining scores related to recurrence of cancers in subjects using multimodal machine learning (ML) architectures. In brief overview, the system 100 may include at least one data processing system 105, at least one imaging device 110, and at least one administrative device 115, communicatively coupled with one another via at least one network 120. The data processing system 105 may include at least one dataset indexer 125, at least one input handler 130, at least one model trainer 135, at least one model applier 140, at least one output evaluator 145, at least one ML architecture 150, and at least one database 155, among others. The ML architecture 150 may include at least one image processor 160, at least one text processor 165, and at least one aggregator 170, among others.
Each of the components of the system 100 can be implemented using the computing system as described in Section D. The system 100 may be used to implement the functionalities (e.g., of the Orpheus and Orpheus+ models) detailed herein in Sections A and B.

The data processing system 105 can be any computing device comprising one or more processors coupled with memory and software capable of performing the various processes and tasks described herein. The data processing system 105 may be housed within a computing system (e.g., laptop, PC, smart device) or within a server group (e.g., a data center, a branch office, or a server site), and include instructions to manage the identifying of images, generating a classification, and storing an association. The data processing system 105 may be in communication with the imaging device 110, administrative device 115, and the database 155, among others.

The data processing system 105 may have one or more components, modules, processes, and threads to perform the various processes and tasks described herein. On the data processing system 105, the dataset indexer 125 may receive, retrieve, or otherwise identify a dataset with a biomedical image and a text report about a tumor associated with cancer in a subject. The input handler 130 may create, produce, or otherwise generate tiles from the biomedical image and tokens from the text report for inputting to the ML architecture 150. The model trainer 135 may initialize, train, and establish the ML architecture 150 using training data. The model applier 140 may apply the ML architecture 150 to the biomedical image (or tiles) and the text report (or tokens) to generate a score indicating a likelihood of recurrence of the cancer in the subject. The output evaluator 145 may generate an output based on the score to identify the subject as a candidate or non-candidate for therapy for cancer.
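The flow from dataset to output described above (input handler, model applier, output evaluator) can be sketched as a small pipeline. All names, the toy image representation, the whitespace tokenizer, and the stand-in architecture are hypothetical. The default threshold of 25 mirrors the high-risk cut-off used in the Methods, and mapping a score above the threshold to "candidate" is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Tiles = List[List[float]]
Architecture = Callable[[Tiles, List[str]], float]


@dataclass
class Dataset:
    image: List[float]  # toy stand-in for a biomedical image
    report: str         # text report about the tumor


def input_handler(dataset: Dataset, tile_len: int) -> Tuple[Tiles, List[str]]:
    """Split the image into fixed-length tiles and the report into tokens."""
    tiles = [dataset.image[i:i + tile_len]
             for i in range(0, len(dataset.image), tile_len)]
    tokens = dataset.report.lower().split()  # stand-in tokenizer
    return tiles, tokens


def model_applier(architecture: Architecture, tiles: Tiles, tokens: List[str]) -> float:
    """Apply the ML architecture to tiles and tokens to obtain a score."""
    return architecture(tiles, tokens)


def output_evaluator(score: float, threshold: float) -> str:
    """Assumption: scores above the threshold mark a therapy candidate."""
    return "candidate" if score > threshold else "non-candidate"


def run(dataset: Dataset, architecture: Architecture,
        tile_len: int = 2, threshold: float = 25.0) -> Tuple[float, str]:
    tiles, tokens = input_handler(dataset, tile_len)
    score = model_applier(architecture, tiles, tokens)
    return score, output_evaluator(score, threshold)


def toy_architecture(tiles: Tiles, tokens: List[str]) -> float:
    """Placeholder scoring rule, not a trained model."""
    return 10.0 * len(tiles) + len(tokens)


if __name__ == "__main__":
    ds = Dataset(image=[0.1, 0.2, 0.3, 0.4], report="Invasive ductal carcinoma")
    print(run(ds, toy_architecture))
```

Passing the architecture as a callable keeps the handler, applier, and evaluator independent of any particular model, which matches the modular description in the text.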
The ML architecture 150 may be any type of artificial intelligence (AI) algorithm or ML model to process biomedical images or text reports (or both) about cancers in subjects to determine scores indicating likelihood of recurrence. The ML architecture 150 may include a deep learning artificial neural network (ANN) (e.g., a transformer architecture, an encoder-decoder model with a convolution neural network architecture, a diffusion model, or an autoencoder model), a clustering algorithm, a support vector machine (SVM), a decision tree, a Bayesian model, or a regression model, among others. In general, the ML architecture 150 may include inputs and outputs related to one another via a set of weights. The set of weights may be in accordance with the AI algorithm or ML model (e.g., transformer models) used to implement the ML architecture 150. For instance, the set of weights of the ML architecture 150 may be in accordance with one or more transformer models, each with encoder layers, decoder layers, feed-forward layers, cross-attention layers, and activation layers, among others. The set of weights of the ML architecture 150 may be distributed or arranged across the image processor 160, the text processor 165, and the aggregator 170, among others.

The image processor 160 may be any type of AI algorithm or ML model to generate features associated with recurrence of cancer and to determine one or more scores indicating likelihood of recurrence of cancer from the biomedical image. The image processor 160 may include a deep learning artificial neural network (ANN), such as a transformer architecture, an encoder-decoder model with a convolution neural network architecture, a diffusion model, or an autoencoder model, among others. In some embodiments, the ML architecture 150 may include multiple image processors 160 for images of different staining modalities (e.g., histological stain or an immunohistochemical (IHC) stain).
The text processor 165 may be any type of AI algorithm or ML model to generate features associated with recurrence of cancer and to determine one or more scores indicating likelihood of recurrence of cancer from text reports. The text processor 165 may include a deep learning artificial neural network (ANN), such as a transformer architecture, an encoder-decoder model with a convolution neural network architecture, a diffusion model, or an autoencoder model, among others.

The aggregator 170 may be any type of AI algorithm or ML model to determine the score indicating likelihood of recurrence of cancer in the subject using the features from the image processor 160 and the text processor 165. The aggregator 170 may include a deep learning artificial neural network (ANN), such as a tensor-fusion network, a transformer model, an encoder-decoder model with a convolution neural network architecture, a diffusion model, or an autoencoder model, among others.

In some embodiments, the ML architecture 150 may lack one of the image processor 160, the text processor 165, or the aggregator 170. For instance, the ML architecture 150 may include the text processor 165 and lack the image processor 160 and the aggregator 170, to process text data. Conversely, the ML architecture 150 may include the image processor 160 and lack the text processor 165 and the aggregator 170, to process image data. In some embodiments, the ML architecture 150 may be an instance of the Orpheus or Orpheus+ models detailed herein in Sections A and B.

The imaging device 110 (sometimes herein generally referred to as a whole slide scanner, a digital slide scanner, an imaging device, or an image acquirer) may be any device to acquire biomedical images. The imaging device 110 may execute, carry out, or otherwise perform a scan of a slide with a tissue sample.
The tissue sample on the slide may have been dyed using a histological stain or an immunohistochemical (IHC) stain to differentiate colors of certain features (e.g., cells) within the tissue sample. The scanning of the tissue sample on the slide may be in accordance with microscopy (e.g., light microscopy, brightfield microscopy, fluorescence microscopy, confocal microscopy, or multi-photon microscopy). The imaging device 110 may be in communication with the data processing system 105 and the administrative device 115, among others, via the network 120.

The administrative device 115 (sometimes herein referred to as a client device, a client, or an end user computing device) may be any computing device comprising one or more processors coupled with memory and software and capable of performing the various processes and tasks described herein. The administrative device 115 may be in communication with the data processing system 105 and the imaging device 110, via the network 120. The administrative device 115 may have at least one display. The administrative device 115 may be associated with an entity (e.g., a clinician) examining a tissue sample (e.g., tumor from resection or biopsy) from the subject. The administrative device 115 may be used to create or generate pathology reports identifying various characteristics of the tissue sample. The display may present information about the subject provided by the data processing system 105.

Referring now to Fig. 37, depicted is a block diagram of a process 200 for training ML architectures in the system 100 for determining scores related to recurrence of cancers in subjects. Under the process 200, the dataset indexer 125 may receive, retrieve, or otherwise obtain training data 205. The training data 205 may be stored and maintained on the database 155.
The training data 205 may be used to initialize, train, and establish the ML architecture 150, including its subcomponents, such as the image processor 160, the text processor 165, and the aggregator 170, among others. The training data 205 may identify or include a set of examples with which the ML architecture 150 is to perform learning (e.g., supervised or weakly supervised learning).

Each example of the training data 205 may be for at least one subject 210. The subject 210 (sometimes herein referred to as a sample subject) may be a human or animal subject. The subject 210 may be female or male of any age. The subject 210 may have or may be diagnosed with cancer affecting at least one organ 215 of the subject 210. The subject 210 may be at risk of relapse, metastasis, or recurrence of the cancer, such as reappearing within the organ 215 or spreading to another anatomical site outside of the organ 215 (e.g., lymph node, bones, lungs, or brain). The cancer may be a breast cancer, for example, a hormone receptor-positive early breast cancer (HR+/HER2- EBC) of luminal type A or luminal type B, among others. The breast cancer may be an invasive breast cancer (e.g., invasive ductal carcinoma (IDC) or invasive lobular carcinoma (ILC)) or non-invasive (or in situ) breast cancer (e.g., ductal carcinoma in situ (DCIS) or lobular carcinoma in situ (LCIS)), among others.

The subject 210 may have undergone treatment previously for the cancer associated with the organ 215. In some embodiments, the subject 210 may have undergone surgical removal or resection of the primary tumor associated with the cancer from the organ 215. In some embodiments, the subject 210 may have undergone other forms of treatments, such as radiation therapy, adjuvant chemotherapy, or endocrine therapy. At least one tissue sample 220 may have been extracted, removed, or obtained from the organ 215 of the subject 210.
The tissue sample 220 may contain or include at least a portion of the tumor associated with the cancer in the subject 210. For example, the tissue sample 220 may include at least a portion or entirety of the tumor (e.g., primary tumor) associated with the cancer within the organ 215 of the subject 210. The tissue sample 220 may have been obtained from the organ 215 as part of the surgical removal or resection of the tumor (e.g., the primary tumor) associated with the cancer. The tissue sample 220 may have been obtained as part of creating or forming the training data 205.
In the training data 205, each example may include one or more of: at least one biomedical image 225, at least one text report 230, at least one image score 235, and at least one text score 240, among others. Each example may be associated with the subject 210 or by extension the tissue sample 220 from the organ 215 of the subject 210. The biomedical image 225 may be of at least a portion of the tissue sample 220 from the organ 215 of the subject 210. For example, the biomedical image 225 may be generated by scanning the tissue sample 220 placed on a slide using the imaging device 110 in accordance with brightfield microscopy. The biomedical image 225 may be of at least one staining modality. The staining modality may depend on the stain applied on the tissue sample 220 prior to scanning and acquisition of the biomedical image 225. The tissue sample 220 may be stained using a histological stain or an immunohistochemical (IHC) stain for imaging to create the biomedical image 225.
The histological stain may function to enhance visualization of cellular, tissue, and tumor structure within the tissue sample 220, and may include, for example, hematoxylin and eosin (H&E), hemosiderin stain, a Sudan stain, a Schiff stain, a Congo red stain, a Gram stain, a Ziehl-Neelsen stain, an auramine-rhodamine stain, a trichrome stain, a silver stain, and Wright’s stain, among others. The IHC stain may serve to detect specific proteins within the cells of the tissue sample 220, and may include, for example, hormone receptor stain (e.g., estrogen receptor (ER) or progesterone receptor (PR)), HER2/neu stain, basal and myoepithelial marker stain (e.g., CK5/6, CK14, epidermal growth factor receptor (EGFR), p63, or SMA), breast and urothelial marker stain (e.g., GATA3), SOX-10 stain, cytokeratin stain, epithelial membrane antigen (EMA) stain, Ki-67 stain, CD markers (e.g., CD3, CD4, CD8, CD20, CD34, CD56, and CD117), mesenchymal marker, or neural markers, among others. In some embodiments, the example of the training data 205 may include multiple biomedical images 225 of different staining modalities. For instance, one biomedical image 225 may be of at least a portion (e.g., a slice) of the tissue sample 220 in an H&E stain, and another biomedical image 225 may be of at least another portion (e.g., another slice) of the tissue sample 220 in an IHC stain. In some embodiments, the example of the training data 205 may lack the biomedical image 225.
In addition, the text report 230 may include or identify a set of characteristics defining the tumor (e.g., in the tissue sample 220) associated with the cancer in the subject 210. The set of characteristics may identify or include a gross description of the tumor (e.g., size, shape, or color), a specimen type, a histologic description (e.g., tumor type, histologic grade, or depth), a biomarker identifier (e.g., hormone receptors, proliferation markers, or other markers), among others.
The set of characteristics may depend on the type of cancer. For instance, for breast cancer, the set of characteristics identified in the text report 230 may include a histologic subtype, a percent positivity of HR, a percent positivity of HER2, a histologic grade, an anatomic site, a ductal carcinoma in situ (DCIS) indicator, or lobular carcinoma in situ (LCIS) indicator, among others. The text report 230 may include a set of alphanumeric characters or strings defining the set of characteristics. The text report 230 may have been created by a clinician (e.g., a pathologist) examining the tissue sample 220 directly or the biomedical image 225 associated with the subject 210 to evaluate the tumor and to assess the risk of recurrence of cancer in the subject 210. In some embodiments, the text report 230 may include structured data. The structured data may include a set of fields and a corresponding set of values for the set of characteristics defining the tumor. In some embodiments, the text report 230 may include unstructured text. For instance, the text report 230 may include free text data identifying the set of characteristics to define the tumor associated with the cancer in the subject 210. In some embodiments, the example of the training data 205 may lack the text report 230.
The image score 235 may identify or indicate a likelihood of recurrence associated with cancer in the subject 210, as derived from the biomedical image 225. The image score 235 may have been determined or assigned by a clinician examining the biomedical image 225 of the tissue sample 220 to assess the risk of recurrence of the cancer in the subject 210. The image score 235 may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood. In some embodiments, the image score 235 may indicate a risk of recurrence of the cancer in the subject 210.
In some embodiments, the image score 235 may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence) of the relapse, metastasis, or recurrence of the cancer in the subject 210. The predicted outcome may be used to determine whether to administer the therapy to the subject 210. In some embodiments, the image score 235 may indicate the likelihood of recurrence within or at a time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210. The time window may range anywhere between 1 month to 5 years, such as 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, or 8 years, among others. In some embodiments, the example of the training data 205 may lack the image score 235 (e.g., when the example also lacks the biomedical image 225).
The text score 240 may identify or indicate a likelihood of recurrence associated with cancer in the subject 210, as derived from the text report 230. The text score 240 may have been determined or assigned by a clinician examining the text report 230 of the tissue sample 220 to assess the risk of recurrence of the cancer in the subject 210. The text score 240 may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood. In some embodiments, the text score 240 may indicate a risk of recurrence of the cancer in the subject 210. In some embodiments, the text score 240 may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence) of the relapse, metastasis, or recurrence of the cancer in the subject 210. The predicted outcome may be used to determine whether to administer the therapy to the subject 210.
In some embodiments, the text score 240 may indicate the likelihood of recurrence within or at a time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210. The time window may range anywhere between 1 month to 5 years, such as 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, or 8 years, among others. In some embodiments, the example of the training data 205 may lack the text score 240 (e.g., when the example also lacks the text report 230).
In some embodiments, the example of the training data 205 may identify or include at least one aggregate score. The aggregate score may identify or indicate a likelihood of recurrence associated with cancer in the subject 210. The aggregate score may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood. In some embodiments, the aggregate score may indicate a risk of recurrence of the cancer in the subject 210. In some embodiments, the aggregate score may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence) of the relapse, metastasis, or recurrence of the cancer in the subject 210. The predicted outcome may be used to determine whether to administer the therapy to the subject 210. In some embodiments, the aggregate score may indicate the likelihood of recurrence within or at a time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210. The time window may range anywhere between 1 month to 5 years, such as 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, or 8 years, among others.
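As a non-limiting illustration, one example of the training data 205 could be held in a record whose fields are all optional, since an example may lack the biomedical image 225, the text report 230, or any of the scores. The class name, field names, and sample values below are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TrainingExample:
    """One training example; every field is optional because an example
    may lack an image, a report, or any of the scores."""
    image_path: Optional[str] = None        # hypothetical location of the image
    report_text: Optional[str] = None       # structured or free-text report
    image_score: Optional[float] = None     # label derived from the image
    text_score: Optional[float] = None      # label derived from the report
    aggregate_score: Optional[float] = None

    def available_labels(self) -> Dict[str, float]:
        """Return only the labels that are present for this example."""
        labels = {
            "image": self.image_score,
            "text": self.text_score,
            "aggregate": self.aggregate_score,
        }
        return {name: value for name, value in labels.items() if value is not None}


# Made-up example that carries a report and a text score but no image.
example = TrainingExample(report_text="ER positive, grade 2", text_score=0.3)
print(example.available_labels())
```

Keeping absent fields as `None` rather than as placeholder numbers lets later steps skip a comparison when the corresponding label is missing.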
The time window for the image score 235, the text score 240, and the aggregate score may be the same. In some embodiments, the aggregate score may be calculated, generated, or determined as a function (e.g., an average or a weighted sum) of the image score 235 and the text score 240. In some embodiments, the aggregate score may be assigned by a clinician examining the tissue sample 220, the biomedical image 225, or the text report 230. For instance, the aggregate score for breast cancer may be generated from a multigene assay, and may be in accordance with any one or more of Oncotype DX (ODX) recurrence score, a MammaPrint score, a Prosigna (PAM50) risk of recurrence (ROR) score, or a breast cancer index (BCI), among others. In some embodiments, the example of the training data 205 may lack one of the image score 235, the text score 240, or the aggregate score.
While described primarily with respect to breast cancer, the ML architecture 150 herein may be used in assessing risk of other types of cancers, such as: a bone cancer (e.g., osteosarcoma, chondrosarcoma, chordoma, or Ewing sarcoma), a lung cancer (e.g., non-small cell lung cancer (NSCLC) or small cell lung cancer (SCLC)), skin cancer (e.g., melanoma, basal cell carcinoma (BCC), and squamous cell carcinoma (SCC)), brain cancer (e.g., glioblastoma multiforme (GBM), astrocytoma, oligodendroglioma, meningioma, and medulloblastoma), head and neck cancer (e.g., squamous cell carcinoma of the larynx, nasopharyngeal carcinoma, oropharyngeal carcinoma, salivary gland tumors, and thyroid carcinoma), uterine cancer (e.g., endometrial cancer and uterine sarcoma), stomach cancer (e.g., adenocarcinoma, gastrointestinal stromal tumor (GIST), neuroendocrine tumors, and lymphoma), ovarian cancer (e.g., epithelial, germ cell, and stromal), cervical cancer (e.g., squamous cell carcinoma and adenocarcinoma), bladder cancer (e.g.,
urothelial carcinoma, squamous cell carcinoma, and adenocarcinoma), prostate cancer (e.g., adenocarcinoma or transitional cell carcinoma), or colorectal cancer (e.g., colon or rectal, adenocarcinoma, squamous cell carcinoma, carcinoid tumor, or lymphoma), among others. The organ 215 may correspond to the anatomical site associated with the cancer in the subject 210. The organ 215 may include, for example, at least a portion of skin, lung, brain, head, neck, uterus, stomach, ovary, cervix, prostate, breast, colon, or rectum, among others. For instance, the ML architecture 150 can process biomedical images of tissue samples from the prostate or associated text pathology reports (or both) to output a risk score (e.g., Veracyte Decipher score) for recurrence, relapse, or metastasis of the prostate cancer within the subject.
With the obtaining of the training data 205, the input handler 130 may create, produce, or otherwise generate a set of tiles 250A–N (hereinafter generally referred to as tiles 250) using the biomedical image 225. In some embodiments, the input handler 130 may partition, section, or otherwise divide the biomedical image 225 to generate the set of tiles 250. Each tile 250 may correspond to a respective portion of the biomedical image 225. Each tile 250 may have dimensions corresponding to dimensions for an input of the ML architecture 150 or the image processor 160. In addition, the input handler 130 may create, produce, or otherwise generate a set of tokens 255A–N (hereinafter generally referred to as tokens 255) using the text report 230. In some embodiments, the input handler 130 may apply a tokenizer to the text report 230 to produce the set of tokens 255. Each token 255 may correspond to one or more respective words (e.g., strings, phrases, or group of alphanumeric characters separated by spaces) in the text report 230.
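As a non-limiting illustration of the input handling described above, the following sketch divides an image of a given size into fixed-size tile regions and splits a report into word tokens. The function names, the choice to drop partial tiles at the image edges, and the lowercase alphanumeric tokenization rule are assumptions made for the example; the disclosure does not prescribe a particular tiling scheme or tokenizer.

```python
import re
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile: int) -> List[Box]:
    """Divide a width x height image into non-overlapping tile x tile boxes.

    Edge regions smaller than a full tile are dropped (an assumption).
    """
    return [
        (x, y, x + tile, y + tile)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]


def tokenize(report: str) -> List[str]:
    """Split a report into lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", report.lower())


boxes = tile_boxes(width=1000, height=600, tile=256)
print(len(boxes), boxes[0], boxes[-1])
print(tokenize("Grade 2, ER 95%"))
```

In practice the tile size would be chosen to match the input dimensions expected by the image processor, and a trained subword tokenizer could replace the simple rule used here.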
The set of tokens 255 may be for inputting to the ML architecture 150 or the text processor 165. In some embodiments, the tokenizer may be separate from the ML architecture 150. In some embodiments, the tokenizer may be a part of the ML architecture 150 or the text processor 165 to convert or transform the one or more words of the token 255 into corresponding numerical representations (e.g., word vectors).
The model trainer 135 may initialize, train, or otherwise establish the ML architecture 150 using the training data 205. The model trainer 135 may instantiate or create the ML architecture 150 to include the image processor 160, the text processor 165, and the aggregator 170. The model trainer 135 may assign or set values (e.g., random or defined values) to the set of weights arranged across the image processor 160, the text processor 165, and the aggregator 170 in the ML architecture 150. In some embodiments, the model trainer 135 may create the ML architecture 150 based on a pre-trained model, and use the training data 205 to further train or fine-tune the ML architecture 150. To train the ML architecture 150, the model trainer 135 may apply the ML architecture 150 to the biomedical image 225 and to the text report 230 of each example of the training data 205. In some embodiments, the model trainer 135 may apply the ML architecture 150 to the set of tiles 250 and the set of tokens 255.
In applying, the model trainer 135 may input, feed, or otherwise provide the set of tiles 250 to the image processor 160. In some embodiments, the model trainer 135 may select or identify the image processor 160 corresponding to the staining modality of the biomedical image 225 from which the set of tiles 250 are generated. For example, the model trainer 135 may provide tiles 250 generated from the biomedical image 225 of the H&E stain to the image processor 160 for processing images with H&E stain.
The model trainer 135 may provide tiles 250 created from the biomedical image 225 with IHC stain to the image processor 160 for processing images with IHC stain. With the provision, the image processor 160 may process the set of tiles 250 (e.g., in sequence) in accordance with the set of weights of the image processor 160. From processing the set of tiles 250, the image processor 160 may produce, create, or otherwise generate a set of embeddings 260A–N (hereinafter generally referred to as embeddings 260). The set of embeddings 260 may correspond to features (e.g., in the form of a feature map or vector) associated with the recurrence of cancer. The set of embeddings 260 may be obtained by an intermediary layer (e.g., before the last activation or regression layer) in the image processor 160.
In addition, the image processor 160 may calculate, determine, or otherwise generate at least one image score 235’. The image score 235’ may identify or indicate a likelihood of recurrence associated with cancer in the subject 210, as derived from the biomedical image 225 or the set of tiles 250. In some embodiments, the image score 235’ may indicate the likelihood of recurrence within or at the time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210. The image processor 160 may generate the image score 235’ by processing the set of embeddings 260 at the output layer (e.g., last activation or regression layer) within the set of weights in the image processor 160.
In some embodiments, in processing the set of tiles 250, the image processor 160 may calculate, determine, or otherwise generate a set of attention scores for the set of tiles 250. Each attention score may identify or indicate a degree of relevance of a respective tile 250 with the recurrence of the cancer.
For instance, the attention score may quantify an importance of the given tile 250 in relation with other tiles 250 to the desired output of determining likelihood of recurrence as derived from the tiles 250. The image processor 160 may generate the set of attention scores at an attention layer (e.g., multi-head attention layer) within the set of weights of the image processor 160.
In conjunction, the model trainer 135 may input, feed, or otherwise provide the set of tokens 255 to the text processor 165 of the ML architecture 150. With the provision, the text processor 165 may process the set of tokens 255 (e.g., in sequence) in accordance with the set of weights of the text processor 165. From processing the set of tokens 255, the text processor 165 may produce, create, or otherwise generate a set of embeddings 265A–N (hereinafter generally referred to as embeddings 265). The set of embeddings 265 may correspond to features (e.g., in the form of a feature map or vector) associated with the recurrence of cancer. The set of embeddings 265 may be obtained by an intermediary layer (e.g., before the last activation or regression layer) in the text processor 165.
In addition, the text processor 165 may calculate, determine, or otherwise generate at least one text score 240’. The text score 240’ may identify or indicate a likelihood of recurrence associated with cancer in the subject 210, as derived from the text report 230 or the set of tokens 255. In some embodiments, the text score 240’ may indicate the likelihood of recurrence within or at the time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210. The text processor 165 may generate the text score 240’ by processing the set of embeddings 265 at the output layer (e.g., last activation or regression layer) within the set of weights in the text processor 165.
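As a non-limiting illustration of how attention scores and a per-modality score may relate to a set of embeddings, the following sketch uses a simplified single-query attention pooling followed by a logistic output. This is a deliberately reduced stand-in for the multi-head attention layer mentioned above, not a description of it; the function names, the query vector, the output weights, and the sample embeddings are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(values: Sequence[float]) -> List[float]:
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    embeddings: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (attention scores, attention-weighted mean embedding)."""
    scores = softmax([dot(e, query) for e in embeddings])
    dim = len(embeddings[0])
    pooled = [sum(s * e[i] for s, e in zip(scores, embeddings)) for i in range(dim)]
    return scores, pooled


def score_head(pooled: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Map the pooled embedding to a score in (0, 1) with a logistic output."""
    return 1.0 / (1.0 + math.exp(-(dot(pooled, weights) + bias)))


# Made-up embeddings for three tiles (or tokens), two features each.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attention, pooled = attention_pool(embeddings, query=[2.0, 0.0])
score = score_head(pooled, weights=[0.5, -0.5], bias=0.0)
print(attention, score)
```

The attention scores sum to one, so each can be read as the relative contribution of its tile or token to the pooled embedding from which the score is computed.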
In some embodiments, in processing the set of tokens 255, the text processor 165 may calculate, determine, or otherwise generate a set of attention scores for the set of tokens 255. Each attention score may identify or indicate a degree of relevance of a respective token 255 with the recurrence of the cancer. For example, the attention score may quantify an importance of the given token 255 in relation with other tokens 255 to the desired output of determining likelihood of recurrence as derived from the tokens 255. The text processor 165 may generate the set of attention scores at an attention layer (e.g., multi-head attention layer) within the set of weights of the text processor 165.
Continuing on, the model trainer 135 may input, feed, or otherwise provide the set of embeddings 260 from the image processor 160 and the set of embeddings 265 from the text processor 165 to the aggregator 170 of the ML architecture 150. With the provision, the aggregator 170 may process the set of embeddings 260 and the set of embeddings 265. For example, the aggregator 170 may process the input set of embeddings 260 and 265 in accordance with the set of weights (e.g., in a fusion layer or activation layer). From processing, the aggregator 170 may calculate, generate, or otherwise determine at least one aggregate score 270. The aggregate score 270 may identify or indicate a likelihood of recurrence associated with cancer in the subject 210 as derived from both the biomedical image 225 and the text report 230. In some embodiments, the aggregate score 270 may indicate the likelihood of recurrence within or at the time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 210. The time window for the image score 235’, the text score 240’, and the aggregate score 270 may be the same.
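As a non-limiting illustration of the aggregation step, the following sketch fuses the two sets of embeddings by mean-pooling each set, concatenating the results, and applying a single linear layer with a logistic output. Concatenation followed by a linear layer is only one assumed form of fusion layer; the function names, weights, and sample values are hypothetical.

```python
import math
from typing import List, Sequence


def mean_pool(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average a set of equal-length embeddings feature by feature."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


def aggregate_score(
    image_embeddings: Sequence[Sequence[float]],
    text_embeddings: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Fuse both modalities by concatenation, then apply a logistic layer."""
    fused = mean_pool(image_embeddings) + mean_pool(text_embeddings)
    if len(weights) != len(fused):
        raise ValueError("weights must match the fused embedding length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Made-up embeddings: two tiles and three tokens, two features each.
image_embeddings = [[0.2, 0.4], [0.6, 0.8]]
text_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
result = aggregate_score(
    image_embeddings, text_embeddings, weights=[1.0, -1.0, 0.5, 0.5], bias=0.0
)
print(result)
```

Because the fused vector has a fixed length regardless of how many tiles or tokens were supplied, the same weights can be applied to images and reports of different sizes.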
The model trainer 135 may compare the outputs of the ML architecture 150 with the example of the training data 205 used to generate the outputs. Based on the comparisons, the model trainer 135 may calculate, generate, or otherwise determine one or more loss metrics 275A–N (hereinafter generally referred to as loss metrics 275). The loss metric 275 may be calculated in accordance with any number of loss functions, such as a norm loss (e.g., L1 or L2), mean squared error (MSE), a quadratic loss, a cross-entropy loss, and a Huber loss, among others.
The model trainer 135 may compare the image score 235’ generated by the image processor 160 and the image score 235 as identified in the corresponding example in the training data 205. Based on the comparison, the model trainer 135 may determine the loss metric 275 for the image processor 160. The loss metric 275 may identify or indicate a degree of deviation between the image scores 235 and 235’. In addition, the model trainer 135 may compare the text score 240’ generated by the text processor 165 and the text score 240 as identified in the corresponding example in the training data 205. Based on the comparison, the model trainer 135 may determine the loss metric 275 for the text processor 165. The loss metric 275 may identify or indicate a degree of deviation between the text scores 240 and 240’.
In some embodiments, the model trainer 135 may compare the aggregate score 270 generated by the ML architecture 150 with the aggregate score as identified in the corresponding example in the training data 205. Based on the comparison, the model trainer 135 may determine the loss metric 275 for the overall ML architecture 150 (or the aggregator 170). The loss metric 275 may identify or indicate a degree of deviation between the aggregate score 270 and the expected aggregate score.
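As a non-limiting illustration of the comparisons described above, the following sketch computes one loss metric per output, using squared error, absolute error, or a Huber loss, and skips any output for which the example carries no expected score. The function names, dictionary keys, and sample values are hypothetical.

```python
from typing import Callable, Dict, Optional


def squared_error(predicted: float, expected: float) -> float:
    return (predicted - expected) ** 2


def absolute_error(predicted: float, expected: float) -> float:
    return abs(predicted - expected)


def huber(predicted: float, expected: float, delta: float = 1.0) -> float:
    """Quadratic for small errors and linear for large ones."""
    error = abs(predicted - expected)
    if error <= delta:
        return 0.5 * error * error
    return delta * (error - 0.5 * delta)


def loss_metrics(
    predicted: Dict[str, float],
    expected: Dict[str, Optional[float]],
    loss_fn: Callable[[float, float], float] = squared_error,
) -> Dict[str, float]:
    """Compare each generated score with its label, where a label exists."""
    return {
        name: loss_fn(value, expected[name])
        for name, value in predicted.items()
        if expected.get(name) is not None
    }


# Made-up outputs; the example has an image label but no text or aggregate label.
outputs = {"image": 0.8, "text": 0.4, "aggregate": 0.6}
labels = {"image": 0.5, "text": None}
print(loss_metrics(outputs, labels))
```

Keeping the losses separate per output allows each to be used for updating only the corresponding part of the architecture, or to be omitted when the comparison is skipped.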
In some embodiments, the model trainer 135 can omit the comparison between the aggregate score 270 and the expected aggregate score and the determination of the loss metric 275 based on this comparison.
The model trainer 135 may modify, change, or otherwise update one or more of the set of weights of the ML architecture 150 based on the comparisons. Using the loss metrics 275, the model trainer 135 may update one or more of the set of weights arranged across the image processor 160, the text processor 165, and the aggregator 170 of the ML architecture 150. In some embodiments, the model trainer 135 may update one or more of the set of weights of the ML architecture 150 based on one or more of: the comparison between the image scores 235 and 235’, the comparison between the text scores 240 and 240’, or the comparison between the aggregate score 270 with the expected aggregate score. In some embodiments, the model trainer 135 may update one or more of the set of weights of the image processor 160 using the loss metric 275 determined for the image processor 160 using the image scores 235 and 235’. In some embodiments, the model trainer 135 may update one or more of the set of weights of the text processor 165 using the loss metric 275 determined for the text processor 165 using the text scores 240 and 240’. In some embodiments, the model trainer 135 may update one or more of the set of weights of the overall ML architecture 150 (or the aggregator 170) using the loss metric 275 determined using the aggregate scores. In some embodiments, the model trainer 135 may omit end-to-end training of the overall ML architecture 150 (or the aggregator 170) using the loss metric. The updating of the weights of the ML architecture 150 may be in accordance with an optimization function (or an objective function).
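As a non-limiting illustration of updating weights from a loss metric, the following sketch applies plain gradient descent to a linear scorer under a squared-error loss and repeats until the summed loss falls below a tolerance. The linear scorer, the learning rate, the tolerance-based stopping rule, and the sample data are assumptions chosen to keep the example small; they stand in for whichever processor and optimization function are actually used.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (features, expected score)


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    return sum(w * x for w, x in zip(weights, features))


def gradient_step(
    weights: Sequence[float], features: Sequence[float], target: float, learning_rate: float
) -> List[float]:
    """One update: the gradient of (prediction - target)^2 is 2 * error * x."""
    error = predict(weights, features) - target
    return [w - learning_rate * 2.0 * error * x for w, x in zip(weights, features)]


def train(
    weights: Sequence[float],
    examples: Sequence[Example],
    learning_rate: float = 0.1,
    tolerance: float = 1e-8,
    max_epochs: int = 1000,
) -> List[float]:
    """Repeat passes over the examples until the summed loss is small."""
    current = list(weights)
    for _ in range(max_epochs):
        total_loss = 0.0
        for features, target in examples:
            total_loss += (predict(current, features) - target) ** 2
            current = gradient_step(current, features, target, learning_rate)
        if total_loss < tolerance:  # assumed convergence condition
            break
    return current


# Made-up examples: each feature vector paired with an expected score.
examples = [([1.0, 0.0], 0.2), ([0.0, 1.0], 0.7)]
trained = train([0.0, 0.0], examples)
print(trained)
```

The same loop structure applies when each part of the architecture is updated with its own loss metric, with only the prediction function and the gradient computation changing.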
The optimization function may define one or more learning rates or parameters at which the weights of the ML architecture 150 are to be updated. The optimization function may include, for example, adaptive moment estimation (Adam), Adam with weight decay, or stochastic gradient descent (SGD), among others. For example, the weights of the image processor 160, the text processor 165, and the aggregator 170 in the ML architecture 150 may be modified or updated using the respective loss metrics 275 in accordance with the respective objective function (e.g., Adam). By updating the weights, the model trainer 135 may further train the ML architecture 150. The training may be iteratively repeated using the examples of the training data 205 until a convergence condition for the ML architecture 150 is satisfied, completing the training and establishment of the ML architecture 150.
Referring now to Fig. 38, depicted is a block diagram of a process 300 for applying ML architectures in the system 100 for determining scores related to recurrence of cancers in subjects. Under the process 300, at least one subject 310 may be under evaluation for risk of relapse, metastasis, or recurrence of the cancer. The subject 310 may be similar to the subject 210, except that the subject 310 is a new individual and thus data associated with the subject 310 may not be in the training data 205. The subject 310 may be a human or animal subject. The subject 310 may be female or male of any age. The subject 310 may have or may be diagnosed with cancer affecting at least one organ 315 of the subject 310. The subject 310 may be at risk of relapse, metastasis, or recurrence of the cancer, such as reappearing within the organ 315 or spreading to another anatomical site outside of the organ 315 (e.g., lymph node, bones, lungs, or brain). The cancer may be, for example, breast cancer, affecting the breast of the subject 310.
The breast cancer may be, for example, a hormone receptor-positive early breast cancer (HR+/HER2- EBC) of luminal type A or luminal type B, among others. The breast cancer may be an invasive breast cancer (e.g., invasive ductal carcinoma (IDC) or invasive lobular carcinoma (ILC)) or non-invasive (or in situ) breast cancer (e.g., ductal carcinoma in situ (DCIS) or lobular carcinoma in situ (LCIS)), among others. As discussed herein, while described primarily with respect to breast cancer, the cancer may include, for instance, a bone cancer, a lung cancer, skin cancer, brain cancer, head and neck cancer, uterine cancer, stomach cancer, ovarian cancer, cervical cancer, bladder cancer, prostate cancer, or colorectal cancer, among others. The organ 315 may correspond to the anatomical site associated with the cancer in the subject 310. The organ 315 may include, for example, at least a portion of skin, lung, brain, head, neck, uterus, stomach, ovary, cervix, prostate, breast, colon, or rectum, among others.
The subject 310 may have undergone treatment previously for the cancer associated with the organ 315. In some embodiments, the subject 310 may have undergone surgical removal or resection of the primary tumor associated with the cancer from the organ 315. In some embodiments, the subject 310 may have undergone other forms of treatments, such as radiation therapy, adjuvant chemotherapy, or endocrine therapy. At least one tissue sample 320 may have been extracted, removed, or obtained from the organ 315 of the subject 310. The tissue sample 320 may contain or include at least a portion of the tumor associated with the cancer in the subject 310. For example, the tissue sample 320 may include at least a portion or entirety of the tumor (e.g., primary tumor) associated with the cancer within the organ 315 of the subject 310.
The tissue sample 320 may have been obtained from the organ 315 as part of the surgical removal or resection of the tumor (e.g., the primary tumor) associated with the cancer.
The imaging device 110 may output, produce, or otherwise generate at least one biomedical image 325 of the tissue sample 320. The biomedical image 325 may be similar to the biomedical image 225 described herein. The biomedical image 325 may be of at least a portion of the tissue sample 320 from the organ 315 of the subject 310. For example, a slice of the tissue sample 320 may be placed on a slide and then scanned by the imaging device 110 in accordance with brightfield microscopy to acquire the biomedical image 325. The biomedical image 325 may be of at least one staining modality. The staining modality may depend on the stain applied on the tissue sample 320 prior to scanning and acquisition of the biomedical image 325 by the imaging device 110. Prior to imaging, the tissue sample 320 may be cut, processed, and stained using a histological stain or an immunohistochemical (IHC) stain for imaging to create the biomedical image 325. The histological stain may function to enhance visualization of cellular, tissue, and tumor structure within the tissue sample 320, and may include, for example, hematoxylin and eosin (H&E), hemosiderin stain, a Sudan stain, a Schiff stain, a Congo red stain, a Gram stain, a Ziehl-Neelsen stain, an auramine-rhodamine stain, a trichrome stain, a silver stain, and Wright’s stain, among others.
The IHC stain may serve to detect specific proteins within the cells of the tissue sample 320, and may include, for example, hormone receptor stain (e.g., estrogen receptor (ER) or progesterone receptor (PR)), HER2/neu stain, basal and myoepithelial marker stain (e.g., CK5/6, CK14, epidermal growth factor receptor (EGFR), p63, or SMA), breast and urothelial marker stain (e.g., GATA3), SOX-10 stain, cytokeratin stain, epithelial membrane antigen (EMA) stain, Ki-67 stain, CD markers (e.g., CD3, CD4, CD8, CD20, CD34, CD56, and CD117), mesenchymal marker, or neural markers, among others. In some embodiments, the imaging device 110 may generate multiple biomedical images 325 of one or more tissue samples from the subject 310 in different staining modalities. For instance, one biomedical image 325 may be of at least a portion (e.g., a slice) of the tissue sample 320 in an H&E stain, and another biomedical image 325 may be of at least another portion (e.g., another slice) of the tissue sample 320 in an IHC stain. With the generation, the imaging device 110 may transmit, send, or otherwise provide the biomedical image 325 to the data processing system 105 or the database 155.
In conjunction, the administrative device 115 may output, produce, or otherwise generate at least one text report 330. The text report 330 may include or identify a set of characteristics defining the tumor (e.g., in the tissue sample 320) associated with the cancer in the subject 310. The set of characteristics may identify or include a gross description of the tumor (e.g., size, shape, or color), a specimen type, a histologic description (e.g., tumor type, histologic grade, or depth), a biomarker identifier (e.g., hormone receptors, proliferation markers, or other markers), among others. The set of characteristics may depend on the type of cancer.
For instance, for breast cancer, the set of characteristics identified in the text report 330 may include a histologic subtype, a percent positivity of HR, a percent positivity of HER2, a histologic grade, an anatomic site, a ductal carcinoma in situ (DCIS) indicator, or lobular carcinoma in situ (LCIS) indicator, among others. The text report 330 may include a set of alphanumeric characters or strings defining the set of characteristics. In some embodiments, the text report 330 may include structured data. The structured data may include a set of fields and a corresponding set of values for the set of characteristics defining the tumor. In some embodiments, the text report 330 may include unstructured text. For instance, the text report 330 may include free text data identifying the set of characteristics to define the tumor associated with the cancer in the subject 310.
The text report 330 may have been created by a clinician (e.g., a pathologist) examining the tissue sample 320 directly or the biomedical image 325 associated with the subject 310 to evaluate the tumor and to assess the risk of recurrence of cancer in the subject 310. Upon examination, the clinician may use a graphical user interface provided via the administrative device 115 (or another device) to enter or input text corresponding to the text report 330. In some embodiments, the administrative device 115 may generate the text report 330 upon scanning a physical report created or written by the clinician examining the tissue sample 320 or the biomedical image 325. For example, the administrative device 115 may apply optical character recognition (OCR) on a scan of the physical report to recognize and detect the set of characters on the report and to generate the text report 330. With the generation, the administrative device 115 may transmit, send, or otherwise provide the text report 330 to the data processing system 105 or the database 155.
In some embodiments, the administrative device 115 may provide the text report 330 together with the biomedical image 325 to the data processing system 105 or the database 155. The dataset indexer 125 may receive, retrieve, or otherwise obtain at least one dataset 305 including at least one of the biomedical image 325 or the text report 330 (or both). In some embodiments, the dataset 305 may lack one of the biomedical image 325 or the text report 330. From the dataset 305, the dataset indexer 125 may extract or identify the biomedical image 325 and the text report 330 for the subject 310. In some embodiments, the dataset indexer 125 may receive, retrieve, or otherwise obtain the biomedical image 325 from the imaging device 110 and the text report 330 from the administrative device 115. With the obtaining, the dataset indexer 125 may create, form, or generate the dataset 305 to include the biomedical image 325 and the text report 330. The dataset 305 may be used during inference to apply the ML architecture 150 to newly obtained biomedical images and text reports, outside of the training data 205. The input handler 130 may create, produce, or otherwise generate a set of tiles 350A–N (hereinafter generally referred to as tiles 350) using the biomedical image 325. In some embodiments, the input handler 130 may partition, section, or otherwise divide the biomedical image 325 to generate the set of tiles 350. Each tile 350 may correspond to a respective portion of the biomedical image 325. Each tile 350 may have dimensions corresponding to dimensions for an input of the ML architecture 150 or the image processor 160. In addition, the input handler 130 may create, produce, or otherwise generate a set of tokens 355A–N (hereinafter generally referred to as tokens 355) using the text report 330. In some embodiments, the input handler 130 may apply a tokenizer to the text report 330 to produce the set of tokens 355.
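As a non-limiting illustration, the tiling and tokenization described above might be sketched as follows. The names `make_tiles` and `tokenize`, the nested-list image representation, the choice to skip partial edge tiles, and the whitespace tokenizer are all assumptions made for this sketch; the specification does not prescribe any of them.

```python
from typing import List, Tuple

Image = List[List[int]]  # hypothetical stand-in: a 2-D grid of pixel values


def make_tiles(image: Image, tile_size: int) -> List[Tuple[int, int, Image]]:
    """Divide an image into non-overlapping square tiles.

    Returns (row_offset, col_offset, tile) triples. Edge regions that do not
    fill a whole tile are skipped, which is an assumption of this sketch.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    tiles = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            # Slice tile_size rows, then tile_size columns from each row.
            tile = [row[left:left + tile_size] for row in image[top:top + tile_size]]
            tiles.append((top, left, tile))
    return tiles


def tokenize(report: str) -> List[str]:
    """Split a text report into lowercase, whitespace-separated tokens."""
    return report.lower().split()
```

In practice the tile size would be chosen to match the input dimensions of the image processor, and a trained subword tokenizer would likely replace the whitespace split used here.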
Each token 355 may correspond to one or more respective words (e.g., strings, phrases, or groups of alphanumeric characters separated by spaces) in the text report 330. The set of tokens 355 may be for inputting to the ML architecture 150 or the text processor 165. In some embodiments, the tokenizer may be separate from the ML architecture 150. In some embodiments, the tokenizer may be a part of the ML architecture 150 or the text processor 165 to convert or transform the one or more words of the token 355 into corresponding numerical representations (e.g., word vectors). With the establishment of the ML architecture 150, the model applier 140 may apply the ML architecture 150 to the biomedical image 325 and to the text report 330 of the dataset 305. In some embodiments, the model applier 140 may apply the ML architecture 150 to the set of tiles 350 and the set of tokens 355. In applying, the model applier 140 may input, feed, or otherwise provide the set of tiles 350 to the image processor 160. In some embodiments, the model applier 140 may select or identify the image processor 160 corresponding to the staining modality of the biomedical image 325 from which the set of tiles 350 are generated. For example, the model applier 140 may provide tiles 350 generated from the biomedical image 325 of the H&E stain to the image processor 160 for processing images with H&E stain. The model applier 140 may provide tiles 350 created from the biomedical image 325 with IHC stain to the image processor 160 for processing images with IHC stain. With the provision, the image processor 160 may process the set of tiles 350 (e.g., in sequence) in accordance with the set of weights of the image processor 160. From processing the set of tiles 350, the image processor 160 may produce, create, or otherwise generate a set of embeddings 360A–N (hereinafter generally referred to as embeddings 360).
The set of embeddings 360 may correspond to features (e.g., in the form of a feature map or vector) associated with the recurrence of cancer. The set of embeddings 360 may be obtained from an intermediary layer (e.g., before the last activation or regression layer) in the image processor 160. In addition, the image processor 160 may calculate, determine, or otherwise generate at least one image score 335. The image score 335 may identify or indicate a likelihood of recurrence associated with cancer in the subject 310, as derived from the biomedical image 325 or the set of tiles 350. The image score 335 may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood. In some embodiments, the image score 335 may indicate a risk of recurrence of the cancer in the subject 310. In some embodiments, the image score 335 may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence) of the relapse, metastasis, or recurrence of the cancer in the subject 310. The predicted outcome may be used to determine whether to administer the therapy to the subject 310. In some embodiments, the image score 335 may indicate the likelihood of recurrence within or at a time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 310. The time window may range anywhere between 1 month and 5 years, such as 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, or 8 years, among others. The image processor 160 may generate the image score 335 by processing the set of embeddings 360 at the output layer (e.g., last activation or regression layer) within the set of weights in the image processor 160.
In some embodiments, in processing the set of tiles 350, the image processor 160 may calculate, determine, or otherwise generate a set of attention scores for the set of tiles 350. Each attention score may identify or indicate a degree of relevance of a respective tile 350 to the recurrence of the cancer. For instance, the attention score may quantify an importance of the given tile 350 in relation to other tiles 350 for the desired output of determining likelihood of recurrence as derived from the tiles 350. The image processor 160 may generate the set of attention scores at an attention layer (e.g., multi-head attention layer) within the set of weights of the image processor 160. In conjunction, the model applier 140 may input, feed, or otherwise provide the set of tokens 355 to the text processor 165 of the ML architecture 150. With the provision, the text processor 165 may process the set of tokens 355 (e.g., in sequence) in accordance with the set of weights of the text processor 165. From processing the set of tokens 355, the text processor 165 may produce, create, or otherwise generate a set of embeddings 365A–N (hereinafter generally referred to as embeddings 365). The set of embeddings 365 may correspond to features (e.g., in the form of a feature map or vector) associated with the recurrence of cancer. The set of embeddings 365 may be obtained from an intermediary layer (e.g., before the last activation or regression layer) in the text processor 165. In addition, the text processor 165 may calculate, determine, or otherwise generate at least one text score 340. The text score 340 may identify or indicate a likelihood of recurrence associated with cancer in the subject 310, as derived from the text report 330 or the set of tokens 355. The text score 340 may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood.
In some embodiments, the text score 340 may indicate a risk of recurrence of the cancer in the subject 310. In some embodiments, the text score 340 may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence) of the relapse, metastasis, or recurrence of the cancer in the subject 310. The predicted outcome may be used to determine whether to administer the therapy to the subject 310. In some embodiments, the text score 340 may indicate the likelihood of recurrence within or at a time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 310. The time window may range anywhere between 1 month and 5 years, such as 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, or 8 years, among others. The text processor 165 may generate the text score 340 by processing the set of embeddings 365 at the output layer (e.g., last activation or regression layer) within the set of weights in the text processor 165. In some embodiments, in processing the set of tokens 355, the text processor 165 may calculate, determine, or otherwise generate a set of attention scores for the set of tokens 355. Each attention score may identify or indicate a degree of relevance of a respective token 355 to the recurrence of the cancer. For example, the attention score may quantify an importance of the given token 355 in relation to other tokens 355 for the desired output of determining likelihood of recurrence as derived from the tokens 355. The text processor 165 may generate the set of attention scores at an attention layer (e.g., multi-head attention layer) within the set of weights of the text processor 165.
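As a non-limiting illustration, per-tile or per-token attention scores and an attention-weighted summary of the embeddings might be computed as sketched below. This sketch substitutes a single learned query vector with scaled dot-product scoring for the multi-head attention layer mentioned above; the function names, the query vector, and the scoring rule are assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_scores(embeddings: Sequence[Sequence[float]],
                     query: Sequence[float]) -> List[float]:
    """Score each embedding against a (hypothetical) learned query vector.

    Uses a scaled dot product followed by softmax, so the scores are
    non-negative and sum to 1 across the tiles or tokens.
    """
    scale = math.sqrt(len(query))
    logits = [sum(e * q for e, q in zip(emb, query)) / scale for emb in embeddings]
    return softmax(logits)


def attention_pool(embeddings: Sequence[Sequence[float]],
                   scores: Sequence[float]) -> List[float]:
    """Return the attention-weighted average of the embeddings."""
    dim = len(embeddings[0])
    return [sum(s * emb[d] for s, emb in zip(scores, embeddings)) for d in range(dim)]
```

Under these assumptions, a tile or token whose embedding aligns more closely with the query receives a higher score and contributes more to the pooled representation.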
Continuing on, the model applier 140 may input, feed, or otherwise provide the set of embeddings 360 from the image processor 160 and the set of embeddings 365 from the text processor 165 to the aggregator 170 of the ML architecture 150. With the provision, the aggregator 170 may process the set of embeddings 360 and the set of embeddings 365. For example, the aggregator 170 may process the input sets of embeddings 360 and 365 in accordance with the set of weights (e.g., in a fusion layer or activation layer). From processing, the aggregator 170 may calculate, generate, or otherwise determine at least one aggregate score 370. The aggregate score 370 may identify or indicate a likelihood of recurrence associated with cancer in the subject 310 as derived from both the biomedical image 325 and the text report 330. The aggregate score 370 may be a numerical value (e.g., 0 to 100, -100 to 100, 0 to 10, -10 to 10, 0 to 1, or -1 to 1) indicating the degree of the likelihood. In some embodiments, the aggregate score 370 may indicate a risk of recurrence of the cancer in the subject 310. In some embodiments, the aggregate score 370 may indicate a predicted outcome identifying a degree of risk or likelihood of occurrence (or absence) of the relapse, metastasis, or recurrence of the cancer in the subject 310. The predicted outcome may be used to determine whether to administer the therapy to the subject 310. In some embodiments, the aggregate score 370 may indicate the likelihood of recurrence within or at a time window relative to the previous treatment (e.g., resection of the primary tumor) for the cancer in the subject 310. The time window may range anywhere between 1 month and 5 years, such as 1 month, 2 months, 3 months, 6 months, 9 months, 1 year, 1.5 years, 2 years, 3 years, 4 years, 5 years, 6 years, 7 years, or 8 years, among others.
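As a non-limiting illustration, one simple form of fusion that the aggregator might apply is sketched below: the pooled image and text embeddings are concatenated, passed through a single linear layer, and squashed to a 0-to-1 score with a sigmoid. The function name `fuse`, the single linear layer, and the sigmoid output are assumptions of this sketch and are not asserted to be the fusion layer of the ML architecture 150.

```python
import math
from typing import Sequence


def fuse(image_embedding: Sequence[float],
         text_embedding: Sequence[float],
         weights: Sequence[float],
         bias: float) -> float:
    """Late-fusion sketch: concatenate, apply a linear layer, then a sigmoid.

    `weights` and `bias` stand in for trained parameters (hypothetical here).
    Returns a score in (0, 1), interpreted as a likelihood of recurrence.
    """
    features = list(image_embedding) + list(text_embedding)
    if len(features) != len(weights):
        raise ValueError("weights must match the concatenated embedding length")
    logit = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-logit))
```

A score on another of the ranges listed above (e.g., 0 to 100) could be obtained by rescaling the sigmoid output.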
The time window for the image score 335, the text score 340, and the aggregate score 370 may be the same. Referring now to FIG. 39, depicted is a block diagram of a process 400 for evaluating outputs from ML architectures in the system 100 for determining scores related to recurrence of cancers in subjects. Under the process 400, the output evaluator 145 may determine or generate at least one classification 405 in accordance with the aggregate score 370 (or the image score 335 or the text score 340). The classification 405 may indicate or identify the subject 310 as one of a candidate or non-candidate for administration of at least one therapy 410 for cancer. The output evaluator 145 may store and maintain an association between the subject 310 (e.g., using an anonymized identifier) and the classification 405 using one or more data structures. The data structures may be of any type, such as an array, matrix, table, linked list, binary tree, heap, stack, queue, class object, or data file, among others. To generate the classification 405, the output evaluator 145 may compare the aggregate score 370 with a threshold. The threshold may identify or define a value for the aggregate score 370 at which to classify the subject 310 as one of the candidate or non-candidate for the therapy. If the aggregate score 370 satisfies (e.g., is greater than or equal to) the threshold, the output evaluator 145 may generate the classification 405 to identify the subject 310 as a candidate for the administration of the therapy 410. The therapy 410 may be to block, inhibit, or otherwise prevent relapse, metastasis, or recurrence of the cancer in the subject 310. On the other hand, if the aggregate score 370 does not satisfy (e.g., is less than) the threshold, the output evaluator 145 may generate the classification 405 to identify the subject 310 as a non-candidate for the administration of the therapy 410.
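As a non-limiting illustration, the threshold comparison described above might be sketched as follows. The `Classification` record, the `classify` function, and the example identifier and threshold values are hypothetical and are introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """Association between an anonymized subject identifier and the result."""
    subject_id: str
    score: float
    is_candidate: bool


def classify(subject_id: str, aggregate_score: float, threshold: float) -> Classification:
    """Mark the subject as a therapy candidate when the score meets the threshold.

    "Satisfies" is taken here to mean greater than or equal to, following
    the example given in the text.
    """
    return Classification(subject_id, aggregate_score, aggregate_score >= threshold)


# Example usage with made-up values:
result = classify("anon-001", aggregate_score=0.72, threshold=0.5)
```

The same comparison could be applied to the image score or the text score when only one modality is available.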
The therapy may include at least one of an endocrine therapy, an adjuvant chemotherapy, or radiation therapy, among others. The type of therapy 410 identified for the subject 310 may depend on the type of cancer. For instance, for breast cancer (e.g., HR+/HER2- EBC), the therapy may include an endocrine therapy, an adjuvant chemotherapy, or radiation therapy, among others. The endocrine therapy may include, for example, at least one of a selective estrogen receptor modulator (SERM), an aromatase inhibitor (AI), or a selective estrogen receptor degrader (SERD), among others. The adjuvant chemotherapy may include, for example, at least one of anthracyclines, taxanes, cyclophosphamide, fluorouracil, or carboplatin, among others. The radiation therapy may include external beam radiation therapy (EBRT) (e.g., a whole or partial breast irradiation) or brachytherapy (e.g., intracavitary brachytherapy or interstitial brachytherapy), among others. In addition, for prostate cancer, the therapy may include, for example, radiation therapy, hormone therapy, or immunotherapy, among others. The radiation therapy may include EBRT (e.g., intensity modulated radiation therapy (IMRT), image guided radiation therapy (IGRT), or proton beam therapy), brachytherapy (e.g., with low dose or high dose), or radiopharmaceuticals (e.g., Radium-223, Strontium-89, or Lutetium-177), among others. The hormone therapy may include androgen deprivation therapy (ADT), among others. The immunotherapy may include a cancer vaccine (e.g., Sipuleucel-T) or checkpoint inhibitors (e.g., anti-PD-1), among others. For colorectal cancer, the therapy may include, for example, surgical procedure, radiation therapy, chemotherapy, or targeted therapy, among others. The surgical procedure may include a resection or mastectomy, among others. The radiation therapy may include EBRT or brachytherapy, among others.
The chemotherapy may include capecitabine, 5-fluorouracil, oxaliplatin, irinotecan, trifluridine, cetuximab, or panitumumab, among others. The targeted therapy may include, for example, EGFR inhibitors, anti-angiogenic therapy, or BRAF inhibitors, among others. Although discussed primarily in terms of therapies for breast cancer, other types of therapies may be identified and administered based on the type of cancer for which the subject 310 is under evaluation. The output evaluator 145 may produce, create, or otherwise generate at least one output 415 based on the classification 405. The output 415 may include information to be presented via the administrative device 115 (or another computing device) for assessing the risk of recurrence or predicted outcome with respect to recurrence of cancer in the subject 310. The output 415 may also include instructions for presenting the information, for example, defining user interface elements of a graphical user interface (GUI) to display the information. The information may be used (e.g., by a clinician examining the subject 310) to determine whether to provide or administer the subject 310 with the therapy 410. In some embodiments, the information may include at least one of: an identifier (e.g., anonymized identifier) for the subject 310; the biomedical image 325; one or more of the tiles 350 from the biomedical image 325; the text report 330; one or more of the tokens 355 from the text report 330; the image score 335; the text score 340; the aggregate score 370; or the time window for the image score 335, the text score 340, and the aggregate score 370 relative to the prior administration of therapy for the subject 310, among others. When the classification 405 identifies the subject 310 as a candidate, the output evaluator 145 may generate the output 415 to identify or indicate the subject 310 as a candidate for the administration of the therapy 410.
The output 415 may also include an indicator identifying the therapy for the recurrence of the cancer, such as endocrine therapy or adjuvant chemotherapy for breast cancer. In contrast, when the classification 405 identifies the subject 310 as a non-candidate, the output evaluator 145 may generate the output 415 to identify or indicate the subject 310 as a non-candidate for the administration of the therapy 410. The output 415 may also identify or include at least one indicator. For example, the output 415 may include at least one of an indication not to administer the therapy for recurrence of the cancer to the subject 310, an indication to continue monitoring the subject 310 for recurrence of the cancer, or an indication to provide a subsequent dataset (e.g., including another biomedical image and text report) for the subject 310, among others. With the generation, the output evaluator 145 may send, transmit, or otherwise provide the output 415 to the administrative device 115 for presentation via the administrative device 115. In some embodiments, the output evaluator 145 may identify or select a subset of tiles 350’A–N (hereinafter generally tiles 350’) from the set of tiles 350 based on the set of attention scores generated by the image processor 160. The subset of tiles 350’ may correspond to the tiles most relevant or important to the determination of the likelihood of recurrence of the cancer as indicated in the image score 335. For example, the output evaluator 145 may select the subset of tiles 350’ with the highest attention scores correlated with the most relevant or important determinants in determining the image score 335. With the selection, the output evaluator 145 may add, insert, or otherwise include the subset of tiles 350’ in the output 415, prior to provision of the output 415 to the administrative device 115.
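As a non-limiting illustration, selecting the highest-scoring subset by attention score might be sketched as follows. The function name `select_top_k` and the fixed count `k` are assumptions of this sketch; the specification does not state how many items are selected or whether a score cutoff is used instead.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def select_top_k(items: Sequence[T],
                 scores: Sequence[float],
                 k: int) -> List[Tuple[T, float]]:
    """Return the k items with the highest attention scores, best first.

    Works for tiles or tokens alike; ties keep their original order because
    Python's sort is stable.
    """
    if len(items) != len(scores):
        raise ValueError("items and scores must have the same length")
    ranked = sorted(zip(items, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:max(k, 0)]
```

The returned pairs could then be added to the output alongside the scores so that a reviewer can see which regions or words most influenced the result.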
In some embodiments, the output evaluator 145 may identify or select a subset of tokens 355’A–N (hereinafter generally tokens 355’) from the set of tokens 355 based on the set of attention scores generated by the text processor 165. The subset of tokens 355’ may correspond to the tokens most relevant or important to the determination of the likelihood of recurrence of the cancer as indicated in the text score 340. For example, the output evaluator 145 may select the subset of tokens 355’ with the highest attention scores correlated with the most relevant or important determinants in determining the text score 340. With the selection, the output evaluator 145 may add, insert, or otherwise include the subset of tokens 355’ in the output 415, prior to provision of the output 415 to the administrative device 115. The administrative device 115 may retrieve, identify, or otherwise receive the output 415 from the data processing system 105. With receipt, the administrative device 115 may display, render, or otherwise present the information included in the output 415. The information may be used by a user (e.g., a clinician examining the subject 310) of the administrative device 115 as one of the factors in deciding on the treatment of the subject 310 with respect to the recurrence of cancer. When the output 415 includes the classification 405 identifying the subject 310 as a candidate for therapy, the administrative device 115 may present an indication identifying the subject 310 as the candidate for therapy. In response to the presentation, the subject 310 may be provided or administered with the therapy 410 (e.g., an endocrine therapy or an adjuvant chemotherapy) as identified in the output 415.
For example, having viewed the information of the output 415 presented through the administrative device 115, the clinician may determine that the subject 310 is at a high risk for recurrence and may decide to administer the subject 310 with the therapy 410 for recurrence of the cancer. When the output 415 includes the classification 405 identifying the subject 310 as a non-candidate for therapy, the administrative device 115 may present an indication identifying the subject 310 as the non-candidate for therapy. In response to the presentation, the administration or provision of the therapy 410 to the subject 310 may be withheld. For instance, when the information of the output 415 presented through the administrative device 115 indicates that the subject 310 is not to be administered with the therapy, the clinician may determine that the subject 310 is at a low risk for recurrence and may decide to withhold administration of the therapy 410 for recurrence of the cancer. In addition, based on the time window identified for the aggregate score 370, the clinician may decide to wait before re-examining the subject 310 for the risk of recurrence. In this manner, the ML architecture 150 can integrate multimodal data in producing more accurate scores with respect to recurrence risk or predicted outcome for cancer (e.g., hormone receptor-positive early breast cancer (HR+/HER2- EBC)) in the subject 310. The ML architecture 150 may be a significant improvement over unimodal models that take either imaging data or text data. For one, the use of the ML architecture 150 on the data processing system 105 may eliminate the reliance on separate, dedicated models for such unimodal data, and by extension reduce the reliance on separate systems to host these unimodal models, thereby freeing up computing capacity and network bandwidth.
For another, the integration and leverage of multimodal data can allow the ML architecture 150 to embed and capture complex relationships between different types of data that unimodal models might miss or be unable to detect. The scores (e.g., the aggregate score 370) outputted by the ML architecture 150 can thus be more accurate and more insightful relative to unimodal models. Additionally, the data processing system 105 and the ML architecture 150 thereon can offer substantial improvements to clinical outcomes by providing more accurate and timely predictions of cancer recurrence risk. Techniques that rely on multigene assay tests can take a substantial amount of time to provide results. In contrast, the ML architecture 150 can rely on more readily accessible and available images of tissue samples taken from the subject and the associated pathology text report, thereby reducing the amount of time in obtaining analysis results. By accurately identifying high-risk patients, the ML architecture 150 can guide clinical decision-making, allowing for more personalized and effective treatment plans. This can lead to better patient outcomes (e.g., for the subject 310), as high-risk patients can receive appropriate therapies sooner, potentially reducing the risk of recurrence. Furthermore, the ability of the ML architecture 150 to integrate multimodal data can provide deeper insights, uncovering biologically meaningful features associated with high-risk disease (e.g., in the form of selected tiles 350’ and tokens 355’). These insights can improve the understanding of cancer within a particular subject, ultimately leading to more effective and better patient care. Referring now to FIG. 40, depicted is a flow diagram of a method 500 of determining scores related to recurrence of cancers in subjects using multimodal ML architectures.
The method 500 may be implemented or performed by any of the components detailed herein, such as the system 100 or the system 600. Under the method 500, a computing system may obtain a dataset including a biomedical image and a text report associated with a tumor of cancer in a subject (505). The computing system may generate a set of tiles from the biomedical image and a set of tokens from the text report (510). The computing system may apply an ML architecture to the biomedical image (or the set of tiles) and the text report (or the set of tokens) (515). The computing system may determine a score indicating a likelihood of recurrence of cancer in the subject based on applying the ML architecture (520). The computing system may determine whether the score satisfies a threshold (525). If the score satisfies (e.g., is greater than or equal to) the threshold, the computing system may classify the subject as a candidate for therapy for cancer (530). On the other hand, if the score does not satisfy (e.g., is less than) the threshold, the computing system may classify the subject as a non-candidate for therapy for cancer (535). The computing system may provide an output based on the classification (540).
D. Computing and Network Environment
Various operations described herein can be implemented on computer systems. FIG. 41 shows a simplified block diagram of a representative server system 600, client computing system 614, and network 626 usable to implement certain embodiments of the present disclosure. In various embodiments, server system 600 or similar systems can implement services or servers described herein or portions thereof. Client computing system 614 or similar systems can implement clients described herein. The system 100 described herein can be similar to the server system 600.
Server system 600 can have a modular design that incorporates a number of modules 602 (e.g., blades in a blade server embodiment); while two modules 602 are shown, any number can be provided. Each module 602 can include processing unit(s) 604 and local storage 606. Processing unit(s) 604 can include a single processor, which can have one or more cores, or multiple processors. In some embodiments, processing unit(s) 604 can include a general-purpose primary processor as well as one or more special-purpose co-processors such as graphics processors, digital signal processors, or the like. In some embodiments, some or all processing units 604 can be implemented using customized circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In other embodiments, processing unit(s) 604 can execute instructions stored in local storage 606. Any type of processors in any combination can be included in processing unit(s) 604. Local storage 606 can include volatile storage media (e.g., DRAM, SRAM, SDRAM, or the like) and/or non-volatile storage media (e.g., magnetic or optical disk, flash memory, or the like). Storage media incorporated in local storage 606 can be fixed, removable or upgradeable as desired. Local storage 606 can be physically or logically divided into various subunits such as a system memory, a read-only memory (ROM), and a permanent storage device. The system memory can be a read-and-write memory device or a volatile read-and-write memory, such as dynamic random-access memory. The system memory can store some or all of the instructions and data that processing unit(s) 604 need at runtime. The ROM can store static data and instructions that are needed by processing unit(s) 604.
The permanent storage device can be a non-volatile read-and-write memory device that can store instructions and data even when module 602 is powered down. The term “storage medium” as used herein includes any medium in which data can be stored indefinitely (subject to overwriting, electrical disturbance, power loss, or the like) and does not include carrier waves and transitory electronic signals propagating wirelessly or over wired connections. In some embodiments, local storage 606 can store one or more software programs to be executed by processing unit(s) 604, such as an operating system and/or programs implementing various server functions such as functions of the system 100 of FIG. 36 or any other system described herein, or any other server(s) associated with system 100 or any other system described herein. “Software” refers generally to sequences of instructions that, when executed by processing unit(s) 604 cause server system 600 (or portions thereof) to perform various operations, thus defining one or more specific machine embodiments that execute and perform the operations of the software programs. The instructions can be stored as firmware residing in read-only memory and/or program code stored in non-volatile storage media that can be read into volatile working memory for execution by processing unit(s) 604. Software can be implemented as a single program or a collection of separate programs or program modules that interact as desired. From local storage 606 (or non-local storage described below), processing unit(s) 604 can retrieve program instructions to execute and data to process in order to execute various operations described above. In some server systems 600, multiple modules 602 can be interconnected via a bus or other interconnect 608, forming a local area network that supports communication between modules 602 and other components of server system 600.
Interconnect 608 can be implemented using various technologies including server racks, hubs, routers, etc. A wide area network (WAN) interface 610 can provide data communication capability between the local area network (interconnect 608) and the network 626, such as the Internet. Various technologies can be used, including wired (e.g., Ethernet, IEEE 802.3 standards) and/or wireless technologies (e.g., Wi-Fi, IEEE 802.11 standards). In some embodiments, local storage 606 is intended to provide working memory for processing unit(s) 604, providing fast access to programs and/or data to be processed while reducing traffic on interconnect 608. Storage for larger quantities of data can be provided on the local area network by one or more mass storage subsystems 612 that can be connected to interconnect 608. Mass storage subsystem 612 can be based on magnetic, optical, semiconductor, or other data storage media. Direct attached storage, storage area networks, network-attached storage, and the like can be used. Any data stores or other collections of data described herein as being produced, consumed, or maintained by a service or server can be stored in mass storage subsystem 612. In some embodiments, additional data storage resources may be accessible via WAN interface 610 (potentially with increased latency). Server system 600 can operate in response to requests received via WAN interface 610. For example, one of modules 602 can implement a supervisory function and assign discrete tasks to other modules 602 in response to received requests. Work allocation techniques can be used. As requests are processed, results can be returned to the requester via WAN interface 610. Such operation can generally be automated. Further, in some embodiments, WAN interface 610 can connect multiple server systems 600 to each other, providing scalable systems capable of managing high volumes of activity.
Other techniques for managing server systems and server farms (collections of server systems that cooperate) can be used, including dynamic resource allocation and reallocation. Server system 600 can interact with various user-owned or user-operated devices via a wide-area network such as the Internet. An example of a user-operated device is shown in FIG. 12 as client computing system 614. Client computing system 614 can be implemented, for example, as a consumer device such as a smartphone, other mobile phone, tablet computer, wearable computing device (e.g., smart watch, eyeglasses), desktop computer, laptop computer, and so on. For example, client computing system 614 can communicate via WAN interface 610. Client computing system 614 can include computer components such as processing unit(s) 616, storage device 618, network interface 620, user input device 622, and user output device 624. Client computing system 614 can be a computing device implemented in a variety of form factors, such as a desktop computer, laptop computer, tablet computer, smartphone, other mobile computing device, wearable computing device, or the like. Processing unit(s) 616 and storage device 618 can be similar to processing unit(s) 604 and local storage 606 described above. Suitable devices can be selected based on the demands to be placed on client computing system 614; for example, client computing system 614 can be implemented as a “thin” client with limited processing capability or as a high-powered computing device. Client computing system 614 can be provisioned with program code executable by processing unit(s) 616 to enable various interactions with server system 600. Network interface 620 can provide a connection to the network 626, such as a wide area network (e.g., the Internet) to which WAN interface 610 of server system 600 is also connected.
In various embodiments, network interface 620 can include a wired interface (e.g., Ethernet) and/or a wireless interface implementing various RF data communication standards such as Wi-Fi, Bluetooth, or cellular data network standards (e.g., 3G, 4G, LTE, etc.). User input device 622 can include any device (or devices) via which a user can provide signals to client computing system 614. The client computing system 614 can interpret the signals as indicative of particular user requests or information. In various embodiments, user input device 622 can include any or all of a keyboard, touch pad, touch screen, mouse or other pointing device, scroll wheel, click wheel, dial, button, switch, keypad, microphone, and so on. User output device 624 can include any device via which client computing system 614 can provide information to a user. For example, user output device 624 can include a display to display images generated by or delivered to client computing system 614. The display can incorporate various image generation technologies, e.g., a liquid crystal display (LCD), light-emitting diode (LED) including organic light-emitting diodes (OLED), projection system, cathode ray tube (CRT), or the like, together with supporting electronics (e.g., digital-to-analog or analog-to-digital converters, signal processors, or the like). Some embodiments can include a device such as a touchscreen that functions as both input and output device. In some embodiments, other user output devices 624 can be provided in addition to or instead of a display. Examples include indicator lights, speakers, tactile “display” devices, printers, and so on. Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a computer-readable storage medium.
Many of the features described in this specification can be implemented as processes that are specified as a set of program instructions encoded on a computer-readable storage medium. When these program instructions are executed by one or more processing units, they cause the processing unit(s) to perform various operations indicated in the program instructions. Examples of program instructions or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter. Through suitable programming, processing unit(s) 604 and 616 can provide various functionality for server system 600 and client computing system 614, including any of the functionality described herein as being performed by a server or client, or other functionality. It will be appreciated that server system 600 and client computing system 614 are illustrative and that variations and modifications are possible. Computer systems used in connection with embodiments of the present disclosure can have other capabilities not specifically described here. Further, while server system 600 and client computing system 614 are described with reference to particular blocks, it is to be understood that these blocks are defined for convenience of description and are not intended to imply a particular physical arrangement of component parts. For instance, different blocks can be but need not be located in the same facility, in the same server rack, or on the same motherboard. Further, the blocks need not correspond to physically distinct components. Blocks can be configured to perform various operations, e.g., by programming a processor or providing appropriate control circuitry, and various blocks might or might not be reconfigurable depending on how the initial configuration is obtained.
Embodiments of the present disclosure can be realized in a variety of apparatus including electronic devices implemented using any combination of circuitry and software. While the disclosure has been described with respect to specific embodiments, one skilled in the art will recognize that numerous modifications are possible. Embodiments of the disclosure can be realized using a variety of computer systems and communication technologies including but not limited to the specific examples described herein. Embodiments of the present disclosure can be realized using any combination of dedicated components and/or programmable processors and/or other programmable devices. The various processes described herein can be implemented on the same processor or different processors in any combination. Where components are described as being configured to perform certain operations, such configuration can be accomplished, e.g., by designing electronic circuits to perform the operation, by programming programmable electronic circuits (such as microprocessors) to perform the operation, or any combination thereof. Further, while the embodiments described above may make reference to specific hardware and software components, those skilled in the art will appreciate that different combinations of hardware and/or software components may also be used and that particular operations described as being implemented in hardware might also be implemented in software or vice versa. Computer programs incorporating various features of the present disclosure may be encoded and stored on various computer-readable storage media; suitable media include magnetic disk or tape, optical storage media such as compact disk (CD) or DVD (digital versatile disk), flash memory, and other non-transitory media.
Computer-readable media encoded with the program code may be packaged with a compatible electronic device, or the program code may be provided separately from electronic devices (e.g., via Internet download or as a separately packaged computer-readable storage medium). Thus, although the disclosure has been described with respect to specific embodiments, it will be appreciated that the disclosure is intended to cover all modifications and equivalents within the scope of the following claims.

Claims

WHAT IS CLAIMED IS:
1. A method of determining scores related to recurrence of cancers in subjects using multimodal machine learning (ML) architectures, comprising: obtaining, by one or more processors, for a subject at risk of recurrence of cancer, a dataset comprising at least one of: (i) a biomedical image of a tissue sample from an organ associated with the cancer or (ii) a text report identifying a plurality of characteristics of a tumor associated with the cancer; applying, by the one or more processors, an ML architecture to at least one of the biomedical image or at least a portion of the text report of the dataset, wherein the ML architecture is established using a plurality of examples, each of the plurality of examples having, for a respective subject having undergone resection of a respective tumor from a respective organ: (i) a respective dataset comprising at least one of (a) a respective biomedical image of a respective tissue sample from the respective organ or (b) a respective text report identifying a respective plurality of characteristics of the respective tumor; and (ii) a respective score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject; determining, by the one or more processors, based on applying the ML architecture, a score indicating a likelihood of recurrence associated with cancer in the subject; generating, by the one or more processors, a classification identifying the subject as one of a candidate or non-candidate for administration of a therapy for the cancer, in accordance with the score; and storing, by the one or more processors, using one or more data structures, an association between the subject and the classification.
2. The method of claim 1, wherein generating the classification further comprises generating, responsive to the score satisfying a threshold, the classification to identify the subject as the candidate for administration of the therapy for recurrence of the cancer, and further comprising: providing, by the one or more processors, for presentation, an output based on the classification to identify the subject as the candidate for the administration of the therapy, wherein the subject is administered with the therapy for cancer, in response to the presentation of the output.
3. The method of claim 1, wherein generating the classification further comprises generating, responsive to the score not satisfying a threshold, the classification to identify the subject as the non-candidate for administration of the therapy for recurrence of the cancer, and further comprising: providing, by the one or more processors, an output based on the classification to identify the subject as the non-candidate for the administration of the therapy, wherein the output comprises at least one of (i) a first indication to not administer the subject the therapy for recurrence of the cancer, (ii) a second indication to continue monitoring the subject for recurrence of the cancer, or (iii) a third indication to provide a second dataset for the subject.
4. The method of claim 1, wherein at least one of the plurality of examples comprises the respective score indicating the corresponding likelihood of recurrence associated with cancer in the respective subject at a respective time window relative to resection of a respective primary tumor associated with the cancer, and wherein determining the score further comprises determining the score indicating the likelihood of recurrence associated with cancer in the subject at a time window relative to resection of a primary tumor associated with the cancer in the subject, wherein the time window ranges from three months to eight years.
5. The method of claim 1, further comprising: generating, by the one or more processors, a plurality of tiles using the biomedical images, each of the plurality of tiles corresponding to a corresponding section of the biomedical images; generating, by the one or more processors, a plurality of tokens using the text report, each of the plurality of tokens corresponding to one or more respective words of the text report, and wherein applying the ML architecture further comprises applying the ML architecture to the plurality of tiles and the plurality of tokens, the ML architecture comprising: an image processor configured to generate, using the plurality of tiles, a first set of embeddings corresponding to features associated with the recurrence of the cancer; a text processor configured to generate, using the plurality of tokens, a second set of embeddings corresponding to features associated with the recurrence of the cancer; and an aggregator configured to determine, based on the first set of embeddings from the image processor and the second set of embeddings from the text processor, the score indicating the likelihood of recurrence associated with cancer in the subject.
6. The method of claim 5, wherein applying the ML architecture further comprises applying the ML architecture, the ML architecture comprising: the image processor configured to generate, for each tile of the plurality of tiles, a first score of a plurality of first scores indicating a relevance of the tile to the recurrence of the cancer; the text processor configured to generate, for each token of the plurality of tokens, a second score of a plurality of second scores indicating a relevance of the token to the recurrence of the cancer, and further comprising: selecting, by the one or more processors, (i) a subset of tiles from the plurality of tiles based on the plurality of first scores and (ii) a subset of tokens from the plurality of tokens based on the plurality of second scores; and providing, by the one or more processors, for presentation, an output based on the subset of tiles and the subset of tokens.
7. The method of claim 1, wherein at least one example of the plurality of examples further comprises (i) the respective score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject based on the respective biomedical image and (ii) a respective second score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject based on the respective text report, and wherein the ML architecture is trained by: applying the ML architecture to the respective dataset and the respective text report of the at least one example; determining (i) a first score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject from applying the ML architecture to the respective biomedical image and (ii) a second score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject from applying the ML architecture to the respective text report; and updating one or more of a plurality of weights in the ML architecture based on (i) a first comparison between the respective score and the first score and (ii) a second comparison between the second respective score and the second score.
8. The method of claim 1, wherein obtaining the dataset further comprises obtaining the dataset comprising (i) the biomedical image of at least a portion of the tissue sample in a first staining modality of a plurality of staining modalities and (ii) a second biomedical image of at least a portion of the tissue sample in a second staining modality of the plurality of staining modalities, the plurality of staining modalities comprising a histological modality or an immunohistochemical (IHC) modality, and wherein applying the ML architecture further comprises applying the ML architecture to the biomedical image in the first staining modality and the second biomedical image in the second staining modality.
9. The method of claim 1, wherein at least one of the plurality of examples further comprises the respective score to indicate at least one of (i) a risk of recurrence of the respective cancer in the respective subject or (ii) a predicted outcome of a respective administration of therapy for the respective subject, and wherein determining the score further comprises determining the score indicating at least one of (i) a risk of recurrence of the cancer in the subject or (ii) a predicted outcome of the administration of therapy for the subject.
10. The method of claim 1, wherein the cancer comprises breast cancer affecting a breast of the subject, and wherein the subject has undergone resection of a primary tumor associated with breast cancer from the breast, prior to obtaining the dataset, and wherein the plurality of characteristics identified in the text report comprises at least one of a histologic subtype, a percent positivity of HR, a percent positivity of HER2, a histologic grade, an anatomic site, a ductal carcinoma in situ (DCIS) indicator, or lobular carcinoma in situ (LCIS) indicator, and wherein the therapy for the breast cancer comprises at least one of an endocrine therapy or an adjuvant chemotherapy, wherein the endocrine therapy comprises at least one of a selective estrogen receptor modulator (SERM), an aromatase inhibitor (AI), or a selective estrogen receptor degrader (SERD), and wherein the adjuvant chemotherapy comprises at least one of anthracyclines, taxanes, cyclophosphamide, fluorouracil, or carboplatin.
11. A system for determining scores related to metastatic recurrence of cancers in subjects using multimodal machine learning (ML) architectures, comprising: one or more processors coupled with memory, configured to: obtain, for a subject at risk of recurrence of cancer, a dataset comprising at least one of: (i) a biomedical image of a tissue sample from an organ associated with the cancer or (ii) a text report identifying a plurality of characteristics of a tumor associated with the cancer; apply an ML architecture to at least one of the biomedical image or the text report of the dataset, wherein the ML architecture is established using a plurality of examples, each of the plurality of examples having, for a respective subject having undergone resection of a respective tumor from a respective organ: (i) a respective dataset comprising at least one of (a) a respective biomedical image of a respective tissue sample from the respective organ or (b) a respective text report identifying a respective plurality of characteristics of the respective tumor; and (ii) a respective score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject; determine, based on applying the ML architecture, a score indicating a likelihood of recurrence associated with cancer in the subject; generate a classification identifying the subject as one of a candidate or non-candidate for administration of a therapy for the cancer, in accordance with the score; and store, using one or more data structures, an association between the subject and the classification.
12. The system of claim 11, wherein the one or more processors are further configured to: generate, responsive to the score satisfying a threshold, a classification to identify the subject as the candidate for administration of the therapy for recurrence of the cancer; and provide, for presentation, an output based on the classification to identify the subject as the candidate for the administration of the therapy, wherein the subject is administered with the therapy for cancer, in response to the presentation of the output.
13. The system of claim 11, wherein the one or more processors are further configured to: generate, responsive to the score not satisfying a threshold, a classification to identify the subject as the non-candidate for administration of the therapy for recurrence of the cancer; and provide an output based on the classification to identify the subject as the non-candidate for the administration of the therapy, wherein the output comprises at least one of (i) a first indication to not administer the subject the therapy for recurrence of the cancer, (ii) a second indication to continue monitoring the subject for recurrence of the cancer, or (iii) a third indication to provide a second dataset for the subject.
14. The system of claim 11, wherein at least one of the plurality of examples comprises the respective score indicating the corresponding likelihood of recurrence associated with cancer in the respective subject at a respective time window relative to resection of a respective primary tumor associated with the cancer, wherein the one or more processors are further configured to determine the score indicating the likelihood of recurrence associated with cancer in the subject at a time window relative to resection of a primary tumor associated with the cancer in the subject, wherein the time window ranges from three months to eight years.
15. The system of claim 11, wherein the one or more processors are further configured to: generate a plurality of tiles using the biomedical images, each of the plurality of tiles corresponding to a corresponding section of the biomedical images; generate a plurality of tokens using the text report, each of the plurality of tokens corresponding to one or more respective words of the text report, and apply the ML architecture to the plurality of tiles and the plurality of tokens, the ML architecture comprising: an image processor configured to generate, using the plurality of tiles, a first set of embeddings corresponding to features associated with the recurrence of the cancer; a text processor configured to generate, using the plurality of tokens, a second set of embeddings corresponding to features associated with the recurrence of the cancer; and an aggregator configured to determine, based on the first set of embeddings from the image processor and the second set of embeddings from the text processor, the score indicating the likelihood of recurrence associated with cancer in the subject.
16. The system of claim 15, wherein the one or more processors are further configured to: apply the ML architecture to the plurality of tiles and the plurality of tokens, the ML architecture comprising: the image processor configured to generate, for each tile of the plurality of tiles, a first score of a plurality of first scores indicating a relevance of the tile to the recurrence of the cancer; and the text processor configured to generate, for each token of the plurality of tokens, a second score of a plurality of second scores indicating a relevance of the token to the recurrence of the cancer, select (i) a subset of tiles from the plurality of tiles based on the plurality of first scores and (ii) a subset of tokens from the plurality of tokens based on the plurality of second scores; and provide, for presentation, an output based on the subset of tiles and the subset of tokens.
17. The system of claim 11, wherein at least one example of the plurality of examples further comprises (i) the respective score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject based on the respective biomedical image and (ii) a respective second score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject based on the respective text report, and wherein the ML architecture is trained by: applying the ML architecture to the respective dataset and the respective text report of the at least one example; determining (i) a first score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject from applying the ML architecture to the respective biomedical image and (ii) a second score indicating a corresponding likelihood of recurrence associated with cancer in the respective subject from applying the ML architecture to the respective text report; and updating one or more of a plurality of weights in the ML architecture based on (i) a first comparison between the respective score and the first score and (ii) a second comparison between the second respective score and the second score.
18. The system of claim 11, wherein the one or more processors are further configured to: obtain the dataset comprising (i) the biomedical image of at least a portion of the tissue sample in a first staining modality of a plurality of staining modalities and (ii) a second biomedical image of at least a portion of the tissue sample in a second staining modality of the plurality of staining modalities, the plurality of staining modalities comprising a histological modality or an immunohistochemical (IHC) modality; and apply the ML architecture to the biomedical image in the first staining modality and the second biomedical image in the second staining modality.
19. The system of claim 11, wherein at least one of the plurality of examples further comprises the respective score to indicate at least one of (i) a risk of recurrence of the respective cancer in the respective subject or (ii) a predicted outcome of a respective administration of therapy for the respective subject, and wherein the one or more processors are further configured to determine the score indicating at least one of (i) a risk of recurrence of the cancer in the subject or (ii) a predicted outcome of the administration of therapy for the subject.
20. The system of claim 11, wherein the cancer comprises breast cancer affecting a breast of the subject, and wherein the subject has undergone resection of a primary tumor associated with breast cancer from the breast, prior to obtaining the dataset, and wherein the plurality of characteristics identified in the text report comprises at least one of a histologic subtype, a percent positivity of HR, a percent positivity of HER2, a histologic grade, an anatomic site, a ductal carcinoma in situ (DCIS) indicator, or lobular carcinoma in situ (LCIS) indicator, and wherein the therapy for the breast cancer comprises at least one of an endocrine therapy or an adjuvant chemotherapy, wherein the endocrine therapy comprises at least one of a selective estrogen receptor modulator (SERM), an aromatase inhibitor (AI), or a selective estrogen receptor degrader (SERD), and wherein the adjuvant chemotherapy comprises at least one of anthracyclines, taxanes, cyclophosphamide, fluorouracil, or carboplatin.
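By way of non-limiting illustration only, the data flow recited in the claims (per-tile and per-token embeddings with relevance scores, an aggregator producing a recurrence score, a threshold-based classification, and a stored association) can be sketched as follows. Everything in this sketch is an assumption made for illustration and is not taken from the disclosure: the function names, the softmax-weighted pooling, the logistic aggregator, the weight values, the 0.5 threshold, the reading of "satisfying a threshold" as greater-than-or-equal, and the toy embeddings, which stand in for the outputs of trained image and text processors.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of relevance scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(embeddings: Sequence[Vector], relevance: Sequence[float]) -> Vector:
    """Sum the tile or token embeddings, weighted by softmaxed relevance."""
    weights = softmax(relevance)
    dim = len(embeddings[0])
    return [sum(w * e[i] for w, e in zip(weights, embeddings)) for i in range(dim)]


def top_k(items: Sequence[str], relevance: Sequence[float], k: int) -> List[str]:
    """Pick the k most relevant tiles or tokens for presentation."""
    ranked = sorted(zip(relevance, items), key=lambda pair: pair[0], reverse=True)
    return [item for _, item in ranked[:k]]


def aggregate(image_vec: Vector, text_vec: Vector, weights: Vector, bias: float) -> float:
    """Hypothetical aggregator: logistic layer over the concatenated embeddings."""
    fused = image_vec + text_vec  # list concatenation, not element-wise addition
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def classify(score: float, threshold: float = 0.5) -> str:
    """Assumed rule: a score at or above the threshold marks a therapy candidate."""
    return "candidate" if score >= threshold else "non-candidate"


# Toy inputs standing in for the outputs of trained image and text processors.
tile_ids = ["tile_0", "tile_1", "tile_2"]
tile_embeddings = [[0.2, 0.1], [0.9, 0.8], [0.1, 0.0]]
tile_relevance = [0.1, 2.0, -1.0]
tokens = ["grade", "3", "invasive", "ductal"]
token_embeddings = [[0.5, 0.5], [0.9, 0.1], [0.7, 0.3], [0.2, 0.2]]
token_relevance = [1.0, 1.5, 0.5, -0.5]

image_vec = attention_pool(tile_embeddings, tile_relevance)
text_vec = attention_pool(token_embeddings, token_relevance)
score = aggregate(image_vec, text_vec, weights=[1.0, 1.0, 1.0, 1.0], bias=-1.0)

# Data structure associating each subject with a classification.
registry: Dict[str, str] = {"subject-001": classify(score)}
highlighted_tiles = top_k(tile_ids, tile_relevance, k=1)
highlighted_tokens = top_k(tokens, token_relevance, k=2)
```

In this sketch the same relevance values drive both the pooling and the selection of tiles and tokens for presentation; an actual embodiment may compute them separately or use any other aggregator.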

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202463556754P 2024-02-22 2024-02-22
US63/556,754 2024-02-22

Publications (1)

Publication Number Publication Date
WO2025179210A1 (en) 2025-08-28

Family

ID=96847804

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2025/016896 Pending WO2025179210A1 (en) 2024-02-22 2025-02-21 Multimodal transformer models for biomedical images and associated texts

Country Status (1)

Country Link
WO (1) WO2025179210A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220033912A1 (en) * 2018-12-08 2022-02-03 Pfs Genomics Inc Transcriptomic profiling for prognosis of breast cancer to identify subjects who may be spared adjuvant systemic therapy
US20220189016A1 (en) * 2014-06-10 2022-06-16 Ventana Medical Systems Inc. Assessing risk of breast cancer recurrence
US20230223121A1 (en) * 2019-09-19 2023-07-13 Tempus Labs, Inc. Data based cancer research and treatment systems and methods

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 25758963

Country of ref document: EP

Kind code of ref document: A1