WO2018110903A2 - Classification method of molecular subtype of breast cancer and classification device of molecular subtype of breast cancer using same - Google Patents
- Publication number
- WO2018110903A2 (application PCT/KR2017/014345)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- genes
- foxa1
- sfrp1
- cep55
- fgfr4
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Definitions
- The present invention relates to a method for classifying breast cancer molecular subtypes and a device for classifying breast cancer molecular subtypes using the same, and more particularly, to a method that classifies the molecular subtype of breast cancer by measuring the expression levels of specific genes in tumor cells, and to a device that uses this method.
- Breast cancer is a mass of cancerous cells of the breast.
- Various factors such as female hormones, family history, past history, fertility, and dietary habits have been mentioned as the causes of breast cancer, but there is no clear explanation yet.
- the incidence of breast cancer is increasing rapidly due to increased sensitivity of the mammary gland, westernization of diet, and pollution of living environment.
- Breast tumors can be broadly classified into five molecular subtypes: luminal A, luminal B, HER2, basal-like and normal-like. These molecular subtypes can be classified based on the expression of markers consisting of the estrogen receptor, progesterone receptor, HER2, HER1 and cytokeratin 5/3.
- The molecular subtypes of breast tumors may differ in prognosis and in predicted treatment response. Accordingly, the treatments or therapeutic agents required may differ among breast cancer patients with different molecular subtypes.
- The inventors of the present invention recognized that the expression levels of specific genes differ according to the molecular subtype of breast cancer and that, based on these differences, the molecular subtypes of breast tumors can be effectively classified using a model obtained through machine learning.
- An object of the present invention is to provide a breast cancer molecular subtype classification method that provides a differential expression gene set in tumor cells and can thereby classify the breast cancer molecular subtype of the tumor cells based on the ratios of the expression levels of those genes.
- Another object of the present invention is to provide a breast cancer molecular subtype classification method, and a device using the same, that measure the expression levels of a specific gene set in tumor cells and can classify the subtype using only the absolute gene expression levels of a single sample.
- Another object of the present invention is to provide a breast cancer molecular subtype classification method, and a device using the same, that can accurately calculate the expression ratios of specific genes in tumor cells using an expression ratio corrected with a reference value.
- The breast cancer molecular subtype classification method includes obtaining the expression level of a differential expression gene set measured in tumor cells, inputting the obtained expression level into a classification model having parameters for the differential expression gene set, determining a breast cancer molecular subtype for the tumor cells using the classification model, and providing the determined breast cancer molecular subtype.
- The expression level is the expression level of the first and second gene sets measured in tumor cells using RNA sequencing, or the expression level of the first and third gene sets measured in tumor cells using a microarray, wherein the first gene set includes the ESR1, PGR, ERBB2 and MKI67 genes, and the second and third gene sets may be different from each other.
- The second gene set may include at least one gene selected from the group consisting of the FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2, KRT17, SFRP1, KRT14, MELK, KRT5, MIA, KIF2C, EGFR, NAT1 and ANLN genes.
- the second set of genes is FOXC1 gene or FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT17 gene or FOXA1 and SFRP1 gene or MLPH, FGFR4, MYBL2 and SFRP1 gene Or FOXA1, KRT17 and SFRP1 gene or FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1 gene or FOXC1, FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1 gene or, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1 genes or, GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1 genes or, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1 genes or, FOX
- The third gene set may include at least one gene selected from the group consisting of the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
- the third set of genes is FOXC1 and CEP55 genes or FOXC1, MELK and CEP55 genes or FOXA1, FOXC1, GRB7, CEP55, BIRC5, MELK and MIA genes or FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes or FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, or FOXA1, MELK, SFRP1 and MIA genes, or CEP55, FOXA1, MELK, SFRP1 and MIA genes, or FOXC1, CEP55, FOXA1, MELK, SFRP1 and MIA genes or, FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA genes, or FOXC1, FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA genes
- the method may further include determining at least one gene having a relatively large difference in expression level between genes according to the molecular subtype as a differential expression gene set.
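One way such a selection step could work is sketched below, assuming labeled training samples with log2 expression values. The ranking criterion (spread between per-subtype mean expression), the function name and the toy data are all assumptions for illustration; the source does not specify how the differential expression gene set is chosen.

```python
from collections import defaultdict


def rank_genes_by_subtype_spread(samples, top_k):
    """Rank genes by the gap between their highest and lowest per-subtype mean.

    samples: list of (subtype, {gene: log2_expression}) pairs.
    Assumes every sample reports every gene (illustrative simplification).
    """
    sums = defaultdict(lambda: defaultdict(float))  # gene -> subtype -> total
    counts = defaultdict(int)                       # subtype -> sample count
    for subtype, expression in samples:
        counts[subtype] += 1
        for gene, value in expression.items():
            sums[gene][subtype] += value

    spread = {}
    for gene, per_subtype in sums.items():
        means = [total / counts[s] for s, total in per_subtype.items()]
        spread[gene] = max(means) - min(means)

    # Genes with the largest spread across subtypes come first.
    return sorted(spread, key=spread.get, reverse=True)[:top_k]


# Toy values, invented for illustration only; "CONTROL" is a hypothetical
# gene whose expression does not vary between subtypes.
toy_samples = [
    ("luminal A", {"ESR1": 10.0, "KRT5": 2.0, "CONTROL": 12.0}),
    ("luminal A", {"ESR1": 9.0, "KRT5": 3.0, "CONTROL": 12.0}),
    ("basal-like", {"ESR1": 2.0, "KRT5": 9.0, "CONTROL": 12.0}),
]
selected = rank_genes_by_subtype_spread(toy_samples, top_k=2)
```

In this toy data the two genes that vary between subtypes are kept and the constant one is dropped; a real selection step would likely also use a statistical test, which this sketch omits.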
- The classification model may be defined by a matrix of expression ratio values for pairs of genes selected from a plurality of breast cancer-associated genes, calculated from the expression levels of those genes by Equation 1 below.
- [Equation 1] d_ij = e_i - e_j
- (where e_i is the base-2 logarithm of the expression value of gene i selected from the plurality of breast cancer-associated genes, and
- e_j is the base-2 logarithm of the expression value of gene j selected from the plurality of breast cancer-associated genes.)
- The classification model may be defined by a matrix of expression ratio values for pairs of genes selected from a plurality of breast cancer-associated genes, calculated by correcting d_ij with Equation 2 below.
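The pairwise matrix described here can be sketched as follows, assuming d_ij is the difference of the base-2 log expression values of genes i and j, as the definitions of e_i and e_j imply. The function name and the sample values are hypothetical, and the correction of Equation 2 is not modeled because its form is not recoverable from the text.

```python
import math


def log_ratio_matrix(expression, genes):
    """Return d[i][j] = log2(x_i) - log2(x_j) for every ordered gene pair.

    expression: {gene: positive expression value}; genes fixes the row and
    column order. Values must be > 0 because log2 is undefined otherwise.
    """
    logs = [math.log2(expression[g]) for g in genes]
    return [[e_i - e_j for e_j in logs] for e_i in logs]


# Hypothetical expression values (arbitrary units), for illustration only.
genes = ["ESR1", "PGR", "ERBB2", "MKI67"]
sample = {"ESR1": 8.0, "PGR": 2.0, "ERBB2": 4.0, "MKI67": 1.0}
d = log_ratio_matrix(sample, genes)
```

The matrix is antisymmetric with a zero diagonal, so only the entries with i < j carry independent information.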
- ⁇ is one value selected from the group consisting of 0, 0.01, 0.10, 0.15, 0.20 and 1.0).
- the molecular subtype of breast cancer may include at least one selected from the group consisting of basal type, luminal type A, luminal type B, normal-like type, normal type, and HER2 type.
- A breast cancer molecular subtype classification device includes a communication unit and a processor operatively connected to the communication unit, wherein the processor is configured to obtain the expression level of a differential expression gene set measured in tumor cells, input the obtained expression level into a classification model having parameters for the differential expression gene set, determine a breast cancer molecular subtype for the tumor cells, and provide the determined breast cancer molecular subtype.
- The expression level is the expression level of the first and second gene sets measured in tumor cells using RNA sequencing, or the expression level of the first and third gene sets measured in tumor cells using a microarray, wherein the first gene set includes the ESR1, PGR, ERBB2 and MKI67 genes, and the second and third gene sets may be different from each other.
- The second gene set may include at least one gene selected from the group consisting of the FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2, KRT17, SFRP1, KRT14, MELK, KRT5, MIA, KIF2C, EGFR, NAT1 and ANLN genes.
- the second set of genes is FOXC1 gene or FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT17 gene or FOXA1 and SFRP1 gene or MLPH, FGFR4, MYBL2 and SFRP1 gene Or FOXA1, KRT17 and SFRP1 gene or FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1 gene or FOXC1, FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1 gene or, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1 genes or, GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1 genes or, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1 genes or, FOX
- The third gene set may include at least one gene selected from the group consisting of the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
- the third set of genes is FOXC1 and CEP55 genes or FOXC1, MELK and CEP55 genes or FOXA1, FOXC1, GRB7, CEP55, BIRC5, MELK and MIA genes or FOXA1, FGFR4, CEP55 , BIRC5, SFRP1 and MIA genes, or FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, or FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, or FOXA1, MELK, SFRP1 and MIA genes, or CEP55, FOXA1, MELK, SFRP1 and MIA genes, or FOXC1, CEP55 , FOXA1, MELK, SFRP1 and MIA genes, or FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA genes, or FOXC1, GRB7, CEP55, MELK, SFRP5, SFP1
- The classification model may be defined by a matrix of expression ratio values for pairs of genes selected from a plurality of breast cancer-associated genes, calculated by Equation 1 below.
- [Equation 1] d_ij = e_i - e_j
- (where e_i is the base-2 logarithm of the expression value of gene i selected from the plurality of breast cancer-associated genes, and
- e_j is the base-2 logarithm of the expression value of gene j selected from the plurality of breast cancer-associated genes.)
- The classification model may be defined by a matrix of expression ratio values for pairs of genes selected from a plurality of breast cancer-associated genes, calculated by correcting d_ij with Equation 2 below.
- ⁇ is one value selected from the group consisting of 0, 0.01, 0.10, 0.15, 0.20 and 1.0).
- The present invention provides different gene sets according to the method used to measure expression levels, namely RNA sequencing or microarray, and measures the expression levels of specific genes in tumor cells, and thus has the effect of classifying breast cancer molecular subtypes with high accuracy.
- The gene set may include differentially expressed genes (DEGs), which show relatively large differences in expression level according to molecular subtype; by using these differentially expressed genes, the present invention has the effect of classifying breast cancer molecular subtypes accurately.
- The present invention provides a method that classifies breast cancer molecular subtypes by measuring the expression levels of specific genes in breast tumor cells, using only the tumor cells and no additional subject samples.
- The present invention has the effect of determining the molecular subtype of breast cancer while maintaining accuracy and minimizing resource consumption.
- FIG. 1 is a block diagram showing a schematic configuration of a breast cancer molecular subtype classification device according to an embodiment of the present invention.
- FIG. 2 is a flowchart illustrating a method for classifying breast cancer molecular subtypes according to an embodiment of the present invention.
- FIG. 3A is a flowchart illustrating a learning procedure for breast cancer molecular subtype classification in a breast cancer molecular subtype classification device according to an embodiment of the present invention.
- FIG. 3B is a schematic diagram showing a classification model provided by the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- FIG. 4A shows evaluation results on RNA sequencing data for the various differential expression gene sets provided in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- FIG. 4B shows evaluation results on microarray data for the various differential expression gene sets provided in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- FIG. 5A shows evaluation results on RNA sequencing data for the breast cancer molecular subtype classification algorithms in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- FIG. 5B shows evaluation results on microarray data for the breast cancer molecular subtype classification algorithms in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- FIG. 6 shows a comparison between the breast cancer molecular subtype classification method according to an embodiment of the present invention and a conventional method.
- Shapes, sizes, ratios, angles, numbers, and the like disclosed in the drawings for describing the embodiments of the present invention are exemplary, and the present invention is not limited to the illustrated items.
- In describing the present invention, where a detailed description of related known technology may unnecessarily obscure the subject matter of the present invention, that detailed description is omitted.
- Where 'comprises', 'has', 'consists of' and the like are used in the present specification, other parts may be added unless 'only' is used.
- A component expressed in the singular includes the plural unless specifically stated otherwise.
- Although first, second, etc. are used to describe various components, these components are not limited by these terms. These terms are only used to distinguish one component from another. Therefore, the first component mentioned below may be a second component within the technical spirit of the present invention.
- The features of the various embodiments of the present invention may be partially or wholly combined with each other and may interlock and operate in various technical ways, as can be understood by those skilled in the art, and the embodiments may be implemented independently of each other or together in association.
- breast cancer refers to a mass consisting of cancer cells of the breast.
- Breast cancer may be a type of cancer that originates in the inner lining of the milk ducts or in the lobules that supply the ducts with milk. Cancers originating from the ducts may be ductal carcinomas, and cancers originating from the lobules may be lobular carcinomas. Sites of metastasis from breast cancer may include bone, liver, lung and brain. Breast cancer occurs in humans and other mammals. In humans, most breast cancers occur in women, but they can also occur in men. Treatments for breast cancer may include surgery, medication (hormonal therapy and chemotherapy), radiation therapy and/or immunotherapy/targeted therapy.
- Breast tumors can be largely divided into five molecular subtypes.
- "Molecular subtype" may refer to a subtype of breast tumor characterized by a distinct molecular profile, e.g., a gene expression profile. Specifically, breast tumors can be classified into five molecular subtypes: luminal A, luminal B, HER2, basal-like and normal-like.
- Breast cancer may have a different prognosis depending on the molecular subtype of the breast tumor.
- By receptor status, breast cancers can be grouped into cancers with positive estrogen receptor expression (e.g., luminal A, luminal B), cancers with positive HER2 expression (e.g., HER2 type, luminal B), and triple-negative cancers (e.g., basal-like, normal-like).
- Triple-negative cancers may have lower survival after relapse than other molecular subtypes of breast cancer, and thus a shorter survival time.
- Cancers with positive estrogen receptor expression may be effectively treated with an estrogen receptor antagonist such as tamoxifen.
- Cancers with positive HER2 expression may be effectively treated with an anti-HER2 antibody or a HER receptor tyrosine kinase inhibitor (e.g., lapatinib).
- the method of classifying molecular subtypes of breast tumor may be used as a method of providing information for determining the prognosis of breast cancer or the manner of treating breast cancer.
- tumor cell refers to a cell that autonomously proliferates in the body.
- Preferred tumor cells may be, but are not limited to, breast cancer tumor cells isolated from breast cancer patients.
- the tumor cells may be cells in which breast cancer cells and normal cells are mixed.
- The term "differential expression gene" means a gene whose expression level is significantly increased or significantly decreased in the experimental group compared to the control group.
- Breast tumor cells may have differentially expressed genes that exhibit different levels of expression depending on their molecular subtype.
- the differentially expressed genes according to the molecular subtypes of the breast tumor may be genes having a large difference in expression level according to the molecular subtypes among genes related to breast cancer.
- The differentially expressed genes according to the molecular subtypes of breast tumors may be FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5, KIF2C, EGFR, CCNE1, ESR1, PGR, ERBB2 and MKI67.
- the term “expression level” means a measurable amount of gene product produced by a gene.
- The expression level of a differentially expressed gene may refer to the amount of transcript, including mRNA, produced by transcription of the gene, the amount of DNA of the gene, or the amount of protein produced by translation, but is not limited thereto. Expression level in the present specification may be interpreted as having the same meaning as expression value.
- Expression levels of differentially expressed genes can be measured using polymerase chain reaction (PCR), DNA array, RNA array, Northern blot, Western blot, enzyme-linked immunosorbent assay (ELISA) or protein array, but the measurement method is not limited thereto.
- RNA sequencing refers to a method of determining the sequence of RNA from a subject.
- Microarray refers to a method of screening gene expression using a DNA chip.
- RNA sequencing and microarrays are used herein as means for measuring the expression levels of a plurality of genes, preferably the expression levels of differentially expressed genes for molecular subtype classification of breast tumors, but are not limited thereto.
- the above-described set of differentially expressed genes may be different.
- preferred sets of differentially expressed genes for molecular subtyping of breast tumors are the ESR1, PGR, ERBB2, MKI67 and FOXC1 genes, or the ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, MLPH , FGFR4, BCL2, CEP55, MYBL2 and KRT17 genes or, ESR1, PGR, ERBB2, MKI67, FOXA1 and SFRP1 genes, or ESR1, PGR, ERBB2, MKI67, MLPH, FGFR4, MYBL2 and SFRP1 genes, or ESR1, PGR, ERBB2 , MKI67, FOXA1, KRT17 and SFRP1 genes or, ESR1, PGR, ERBB2, MKI67, FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1
- preferred sets of differentially expressed genes for molecular subtyping of breast tumors are the ESR1, PGR, ERBB2, MKI67, FOXC1 and CEP55 genes, or the ESR1, PGR, ERBB2, MKI67, FOXC1, MELK and CEP55 genes or , ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, GRB7, CEP55, BIRC5, MELK and MIA genes or, ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, or ESR1, PGR , ERBB2, MKI67, FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes or, ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes or, ESR1, PGR, ERBB
- The term "first gene set" may refer to a set of genes, among the aforementioned differentially expressed genes, whose expression levels differ relatively greatly from those of other genes according to the molecular subtype of the breast tumor.
- The first gene set may consist of the ESR1, PGR, ERBB2 and MKI67 genes.
- The term "second gene set" may refer to a set of genes whose expression levels differ significantly according to the molecular subtype of the breast tumor when RNA sequencing is used.
- The second gene set may consist of at least one of the FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2, KRT17, SFRP1, KRT14, MELK, KRT5, MIA, KIF2C, EGFR, NAT1 and ANLN genes.
- The term "third gene set" may refer to a set of genes whose expression levels differ significantly according to the molecular subtype of the breast tumor when a microarray is used.
- The third gene set may consist of at least one of the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
- the second gene set and the third gene set may be different, but are not limited thereto.
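The relationship between the three gene sets and the measurement platform can be sketched as below. The gene names are taken from the lists above; the function name, the platform keys `"rna_seq"` and `"microarray"`, and the validation behavior are assumptions made for illustration.

```python
FIRST_GENE_SET = ("ESR1", "PGR", "ERBB2", "MKI67")

# Candidate pool for the second gene set (RNA sequencing).
SECOND_GENE_POOL = (
    "FOXA1", "FOXC1", "MLPH", "FGFR4", "GRB7", "BCL2", "CEP55", "MYBL2",
    "KRT17", "SFRP1", "KRT14", "MELK", "KRT5", "MIA", "KIF2C", "EGFR",
    "NAT1", "ANLN",
)

# Candidate pool for the third gene set (microarray).
THIRD_GENE_POOL = (
    "FOXA1", "FGFR4", "CEP55", "SFRP1", "ANLN", "FOXC1", "GRB7", "BIRC5",
    "MIA", "BAG1", "MLPH", "MELK", "KRT14", "BLVRA", "BCL2", "UBE2C",
    "KRT17", "CCNB1", "NAT1", "SLC39A6", "CDC20", "KRT5", "CCNE1",
)


def differential_expression_gene_set(platform, extra_genes):
    """Combine the first gene set with platform-specific genes.

    platform: "rna_seq" or "microarray" (hypothetical keys).
    extra_genes: at least one gene drawn from the matching pool.
    """
    pools = {"rna_seq": SECOND_GENE_POOL, "microarray": THIRD_GENE_POOL}
    if platform not in pools:
        raise ValueError(f"unknown platform: {platform!r}")
    if not extra_genes:
        raise ValueError("at least one platform-specific gene is required")
    unknown = [g for g in extra_genes if g not in pools[platform]]
    if unknown:
        raise ValueError(f"genes not in the {platform} pool: {unknown}")
    return FIRST_GENE_SET + tuple(extra_genes)


rna_seq_set = differential_expression_gene_set("rna_seq", ["FOXA1", "SFRP1"])
microarray_set = differential_expression_gene_set("microarray", ["FOXC1", "CEP55"])
```

The two example calls use combinations that the text lists among its preferred sets (FOXA1 with SFRP1 for RNA sequencing, FOXC1 with CEP55 for microarray).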
- Breast tumor cells may differ in the expression level of the differential expression genes in the aforementioned differential expression gene set according to their molecular subtypes.
- The difference in expression level may refer to the relative magnitude, larger or smaller, of the expression levels of the individual genes among a plurality of differentially expressed genes.
- the difference in expression level in the present specification may mean an expression ratio for two selected differential expression genes. Accordingly, the present invention can provide a method for classifying breast cancer molecular subtypes by providing expression ratios for differentially expressed genes.
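The link between a difference of log values and an expression ratio can be written out as a short worked relation. The symbols x_i and x_j for the raw expression values of two selected genes are notation introduced here for illustration, and the numbers are invented.

```latex
\[
  d_{ij} \;=\; \log_2 x_i - \log_2 x_j \;=\; \log_2\!\left(\frac{x_i}{x_j}\right)
\]
% Example with invented values x_i = 8 and x_j = 2:
\[
  d_{ij} \;=\; \log_2 8 - \log_2 2 \;=\; 3 - 1 \;=\; 2
  \quad\Longrightarrow\quad \frac{x_i}{x_j} = 2^{2} = 4
\]
```

A value of 2 therefore corresponds to gene i being expressed at four times the level of gene j in the same sample, which is why the comparison needs no external reference sample.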
- The term "parameter" may mean any variable that can be used for classification in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- The term "classification model" may be defined by a matrix of ratios of expression levels, that is, expression ratio values, for pairs of genes selected from a plurality of breast cancer-associated genes, but is not limited thereto.
- Preferably, the classification model may be defined as a matrix composed of pairs of differentially expressed genes and the differences between them.
- Hereinafter, a breast cancer molecular subtype classification device according to an embodiment of the present invention will be described with reference to FIG. 1.
- the breast cancer molecular subtype classification device 100 includes a communication unit 110, an input unit 120, a display unit 130, a storage unit 140, and a processor 150.
- the breast cancer molecular subtype classification device 100 may obtain the expression level of the differential expression gene set measured in the tumor cells.
- The input unit 120 may be a keyboard, a mouse, a touch screen panel, or the like, but is not limited thereto.
- the breast cancer molecular subtype classification device 100 may be set through the input unit 120, and the operation thereof may be instructed.
- the display unit 130 may display menus in which the breast cancer molecular subtype classification device 100 can be easily set by a user in the molecular subtype classification of breast cancer. Furthermore, the display unit 130 may display the molecular subtype determined by the classification model that receives the expression level for the differential expression gene set through the input unit 120 so that the user can easily recognize the subtype.
- the display unit 130 is a display device including a liquid crystal display, an organic light emitting display, and the like, and may allow menus to be displayed to a user.
- the display unit 130 may be implemented in various forms or methods within the scope to achieve the object of the present invention in addition to the above.
- the storage unit 140 may store the expression level of the differentially expressed gene set in the tumor cells obtained through the communication unit 110.
- the expression level of the differential expression gene set input to the classification model through the input unit 120 may be stored.
- the classification model can be stored.
- the processor 150 performs various instructions for operating the breast cancer molecular subtype classification device 100 according to an embodiment of the present invention.
- The processor 150 is connected to the communication unit 110, obtains through the communication unit 110 the expression level of the differential expression gene set measured in tumor cells, inputs the obtained expression level into a classification model having parameters for the differential expression gene set, determines the breast cancer molecular subtype for the tumor cells, and provides the determined breast cancer molecular subtype.
- Hereinafter, with reference to FIG. 2, the breast cancer molecular subtype classification method according to an embodiment of the present invention and the procedure implemented by the processor of the breast cancer molecular subtype classification device will be described in detail.
- FIG. 2 is a flowchart illustrating a method for classifying breast cancer molecular subtypes according to an embodiment of the present invention.
- The differential expression gene set may be a first gene set and a second gene set measured in tumor cells using RNA sequencing.
- The differential expression gene set may be a first gene set and a third gene set measured in tumor cells using a microarray.
- The first gene set includes the ESR1, PGR, ERBB2 and MKI67 genes, and because the second or third gene set that accompanies the first gene set differs according to the expression level measurement method, the differential expression gene set may differ accordingly.
- The second gene set may include at least one gene selected from the group consisting of the FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2, KRT17, SFRP1, KRT14, MELK, KRT5, MIA, KIF2C, EGFR, NAT1 and ANLN genes.
- Preferred second gene sets are the FOXC1 gene or the FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT17 genes, or the FOXA1 and SFRP1 genes, or the MLPH, FGFR4, MYBL2 and SFRP1 genes, or the FOXA1, KRT17 and SFRP1 genes.
- The third gene set may include at least one gene selected from the group consisting of the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
- Preferred third set of genes are the FOXC1 and CEP55 genes, or the FOXC1, MELK and CEP55 genes, or the FOXA1, FOXC1, GRB7, CEP55, BIRC5, MELK and MIA genes, or the FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, or FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, or FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, or FOXA1, MELK, SFRP1 and MIA genes, or CEP55, FOXA1, MELK, SFRP1 and MIA genes, or FOXC1, CEP55, FOXA1, MELK, SFRP1 and MIA genes Or the FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA genes, or the FOXC1, FGFR4, CEP55, FOXA1,
- By reducing the number of genes used in the breast cancer molecular subtype determination model, the present invention has the effect of minimizing resource consumption while maintaining accuracy, and of determining the molecular subtype of breast cancer based on the ratios of the expression levels of specific genes.
- the expression level of the acquired differential expression gene set is input to a classification model having parameters for the differential expression gene set (S220).
- the classification model may be defined by a matrix of expression ratio values, that is, ratios of the expression levels of two genes selected from a plurality of genes associated with breast cancer.
- more preferably, the base-2 logarithm of the expression level is taken for each of two different genes, and the classification model is defined by a matrix constructed from the differences between these log values.
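The log-difference matrix described here can be illustrated with a minimal sketch. It assumes strictly positive expression levels and stores, for every ordered gene pair, the difference of base-2 logs, which equals the log of the expression ratio. The function name and the numeric values are hypothetical and chosen only for illustration; they are not taken from the patent.

```python
import math


def log_ratio_matrix(expression):
    """Return (genes, d) with d[i][j] = log2(x_i) - log2(x_j).

    `expression` maps gene symbol -> strictly positive expression level.
    Genes are ordered alphabetically so the matrix layout is reproducible.
    """
    genes = sorted(expression)
    logs = [math.log2(expression[g]) for g in genes]
    return genes, [[e_i - e_j for e_j in logs] for e_i in logs]


# Hypothetical expression levels (illustration only).
sample = {"ESR1": 8.0, "ERBB2": 2.0, "MKI67": 4.0}
genes, d = log_ratio_matrix(sample)

# d[row][col] is log2(expression[row gene] / expression[col gene]).
esr1_vs_erbb2 = d[genes.index("ESR1")][genes.index("ERBB2")]
```

Because the matrix is antisymmetric with a zero diagonal, only the entries above the diagonal carry independent information, so an implementation could keep just those as model features.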
- the breast cancer molecular subtype of the tumor cells is determined using the classification model (S230).
- the molecular subtypes of classifiable breast tumors may be basal type, luminal type A, luminal type B, normal-like type, normal type and HER2 type.
- the molecular subtypes of the breast tumor may be variously named according to those skilled in the art.
- the breast cancer molecular subtype classification method according to an embodiment of the present invention and the breast cancer molecular subtype classification device using the same provide differentially expressed gene sets in various combinations, thereby reducing the number of genes used in the breast cancer molecular subtype determination model, and can thus provide a method for determining breast cancer molecular subtypes that minimizes resource consumption while maintaining accuracy.
- the breast cancer molecular subtype classification method and the breast cancer molecular subtype classification device using the same may provide information for setting the treatment direction of a breast cancer patient and, further, on the prognosis of breast cancer.
- Hereinafter, with reference to FIGS. 3A and 3B and Table 1, a method for determining the classification model and the differentially expressed gene set for breast cancer molecular subtype classification in the breast cancer molecular subtype classification method according to an embodiment of the present invention will be described in detail. For clarity, the reference numerals of FIG. 1 are used together. Furthermore, hereinafter, the genes associated with breast cancer (e.g., PAM 50) and the differentially expressed genes are described separately, but are not limited thereto. For example, genes associated with breast cancer and genes constituting the differential expression gene set may be identical to each other, and genes associated with breast cancer may include genes constituting the differential expression gene set.
- FIG. 3A is a flowchart illustrating a learning procedure for breast cancer molecular subtype classification of a breast cancer molecular subtype classification device according to an embodiment of the present invention.
- FIG. 3B is a schematic diagram showing a classification model provided by the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- for breast cancer molecular subtype classification learning, expression level data for a plurality of genes associated with breast cancer, measured using RNA sequencing or microarrays in samples having different molecular subtypes, may be taken from the TCGA BRCA data provided by TCGA, but various other data may be used without being limited thereto.
- Table 1 shows the number of RNA sequencing analysis samples and microarray analysis samples in the TCGA BRCA data, classified into five breast cancer molecular subtypes according to a gold standard set based on the PAM 50 genes associated with molecular subtypes of breast cancer.
- a breast cancer molecular subtype classification device according to an embodiment of the present invention can be learned, and its accuracy can also be evaluated.
- genes associated with breast cancer indicating differences in expression between molecular subtypes are selected (S320).
- the Wilcoxon rank-sum test is used to select a plurality of genes having relatively large differences in expression levels according to molecular subtypes of the breast tumor.
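As a rough illustration of this selection step, the sketch below ranks genes by the absolute Wilcoxon rank-sum z statistic between one subtype and all other samples. The one-versus-rest comparison, the normal approximation without tie correction, and the gene names and values are all assumptions made for the example; the source does not state how the test is applied.

```python
import math


def rank_sum_z(a, b):
    """Wilcoxon rank-sum z statistic for group `a` versus group `b`.

    Uses average ranks for ties and the normal approximation
    (no tie correction in the variance).
    """
    pooled = sorted((value, group) for group, values in enumerate((a, b))
                    for value in values)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_a += average_rank * sum(1 for _, g in pooled[i:j] if g == 0)
        i = j
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (rank_sum_a - mean) / sd


def rank_genes(expr_by_gene, labels, subtype):
    """Order genes by |z| for `subtype` versus all other samples."""
    scores = {}
    for gene, values in expr_by_gene.items():
        inside = [v for v, lab in zip(values, labels) if lab == subtype]
        outside = [v for v, lab in zip(values, labels) if lab != subtype]
        scores[gene] = abs(rank_sum_z(inside, outside))
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: GENE_X separates the groups, GENE_Y does not.
labels = ["Basal", "Basal", "Basal", "LumA", "LumA", "LumA"]
expr = {"GENE_X": [9, 8, 7, 1, 2, 3], "GENE_Y": [5, 1, 4, 2, 6, 3]}
ordered = rank_genes(expr, labels, "Basal")
```

In practice a statistics library would normally supply the test and its p-value; the hand-written version is used here only to keep the example self-contained.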
- genes included in the above-described PAM 50 gene are selected.
- the intergenic expression rate matrix 370 may be determined.
- the gene expression matrix 360 is composed of expression values of a plurality of genes associated with breast cancer, calculated by Equation 1 below.
- e_i is the base-2 log of the expression value of gene i selected from among the plurality of genes associated with breast cancer.
- e_j is the base-2 log of the expression value of gene j selected from among the plurality of genes associated with breast cancer.
- the intergene expression rate matrix 370 may consist of expression ratio values (d'_ij values) between two genes associated with breast cancer, each calculated by correcting the d_ij value using Equation 2 below.
- ⁇ may mean a reference value for correcting the d_ij value
- ⁇ may be 0, 0.01, 0.10, 0.15, 0.20 and 1.0, but is not limited thereto.
- the matrix can also be determined variously according to the application of the various values of ⁇ .
- the intergene expression rate matrix 370 may be determined through the matrix determining step (S330), and may be defined as the classification model in a breast cancer tumor cell classification device according to an embodiment of the present invention.
- the determined intergene expression ratio matrix 370 is split into a learning set and an evaluation set at a ratio of 8:2. On the learning set, 5-fold cross-validation is repeated 100 times. The accuracy evaluation uses the molecular subtype of each TCGA BRCA sample described above as the correct answer.
- on the evaluation set, accuracy is evaluated for the molecular subtype classification device 100 configured with the candidate classification algorithm, candidate ⁇ value, and candidate differentially expressed gene set that showed high accuracy on the learning set.
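The 8:2 split and the repeated 5-fold cross-validation can be sketched with sample indices alone. This is a minimal sketch under stated assumptions: plain random shuffling is used (the source does not say whether the split is stratified by subtype), and the function names, seed, and sample count are hypothetical.

```python
import random


def split_8_2(n_samples, rng):
    """Shuffle sample indices and split them 8:2 into learning/evaluation."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = n_samples * 8 // 10
    return indices[:cut], indices[cut:]


def repeated_kfold(indices, k=5, repeats=100, rng=None):
    """Yield (train, validation) index lists for k-fold CV repeated `repeats` times."""
    rng = rng or random.Random(0)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for held_out in range(k):
            train = [s for f, fold in enumerate(folds) if f != held_out
                     for s in fold]
            yield train, folds[held_out]


rng = random.Random(42)          # fixed seed, for reproducibility only
learning, evaluation = split_8_2(100, rng)
rounds = list(repeated_kfold(learning, k=5, repeats=100, rng=rng))
```

Each of the 500 rounds would train a candidate configuration on `train` and score it on the held-out fold; the evaluation indices stay untouched until the final accuracy check.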
- the evaluation set is evaluated using a sample other than the TCGA BRCA sample used in the above-described learning set.
- a breast cancer molecular subtype classification algorithm, ⁇ value, and differential expression gene set for each of RNA sequencing analysis data and microarray analysis data are determined (S350). Specifically, based on the results of the evaluation step (S340), the determining step (S350) determines, for each of RNA sequencing analysis and microarray analysis, the algorithm with high breast cancer molecular subtype classification accuracy among the above-described four classification algorithms, the ⁇ value with high classification accuracy, and the differential expression gene set with high classification accuracy, which includes the first set of genes together with the second set of genes or the third set of genes.
- the breast cancer molecular subtype classification device of the present invention is set to an algorithm having high accuracy of breast cancer molecular subtype classification and a value of ⁇ having high accuracy of classification, and uses a differential expression gene set with high accuracy of classification.
- the molecular subtype classification device 100 can provide an effective classification of molecular subtypes of breast tumors.
- the breast cancer molecular subtype classification method according to an embodiment of the present invention and the breast cancer molecular subtype classification device 100 using the same provide, for breast cancer molecular subtype determination, the various combinations of differentially expressed gene sets determined in the determining step (S350).
- Hereinafter, with reference to FIGS. 4A, 4B, 5A, and 5B, the differentially expressed gene sets, the classification algorithm, and the ⁇ value for breast cancer molecular subtype classification provided by the breast cancer molecular subtype classification method according to an embodiment of the present invention will be described.
- FIG. 4A shows the evaluation results on RNA sequencing analysis data for the various differentially expressed gene sets provided in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- FIG. 4B shows the evaluation results on microarray analysis data for the various differentially expressed gene sets provided in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- FIG. 5A shows the evaluation results on RNA sequencing analysis data for each breast cancer molecular subtype classification algorithm in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- FIG. 5B shows the evaluation results on microarray analysis data for each breast cancer molecular subtype classification algorithm in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
- a suitable differential expression gene set in RNA sequencing analysis data is a first set of genes comprising the ESR1, PGR, ERBB2 and MKI67 genes and FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2 as described above.
- the classification accuracy across differentially expressed gene sets Nos. 1 to 41 suitable for the RNA sequencing analysis data provided by the present invention is high, at about 87%.
- differentially expressed gene sets Nos. 17 and 34 show a classification accuracy of 89.41%. For set No. 37, the classification accuracy is 85%, which is lower than that of set No. 17 or No. 34.
- however, set No. 37 uses only four genes from the second gene set (MLPH, FGFR4, CEP55, and KRT17) to classify breast cancer molecular subtypes, achieving a high classification accuracy of more than 85% while minimizing resource consumption.
- the preferred differential expression gene set in the RNA sequencing analysis data may be No. 17 or No. 34, or even No. 37, but is not limited thereto, and the various sets in FIG. 4A may be provided as differential expression gene sets.
- as the figure shows, when the ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17, and KRT14 genes are included in the differential expression gene set (or when the SFRP1 gene is further included), the breast cancer molecular subtype classification accuracy is much higher than with sets of other gene combinations.
- differentially expressed gene sets evaluated in microarray analysis data are shown.
- suitable differentially expressed gene sets in microarray analysis data are the first set of genes comprising the aforementioned ESR1, PGR, ERBB2 and MKI67 genes and FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, It can consist of a combination of genes of a third set of genes including the BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
- FIG. 4B different combinations of differentially expressed gene sets evaluated in microarray analysis data are shown.
- the classification accuracy across differentially expressed gene sets Nos. 1 to 41 suitable for the microarray analysis data provided by the present invention is high, at about 86%.
- sets Nos. 27 and 33 show classification accuracies of 89.41% and 88.24%, respectively, higher than the rest of the differentially expressed gene sets.
- set No. 41 shows the highest classification accuracy, 92.94%.
- for set No. 3, the classification accuracy is 84.71%, which is lower than that of set No. 27, No. 33, or No. 41.
- however, set No. 3 uses only two genes from the third set of genes (FOXC1 and CEP55) to classify breast cancer molecular subtypes, achieving a high accuracy of about 85% while minimizing resource consumption.
- the preferred differential expression gene set in the microarray analysis data may be No. 27, No. 33, or No. 41, or even No. 3, but is not limited thereto, and the various sets in FIG. 4B may be provided as differential expression gene sets.
- SLC39A6, CDC20, KRT5, and CCNE1 genes it can be seen that the tumor cancer molecular subtype classification accuracy is much higher than the set of combinations of other genes.
- differentially expressed gene sets including the ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA, and KRT14 genes (or further including the BIRC5 gene) can reduce resources while still showing high breast cancer molecular subtype classification accuracy, as also seen in the RNA sequencing analysis data.
- the algorithm with high accuracy in RNA sequencing analysis data is Random Forest. Specifically, the classification accuracy when applying Random Forest was about 88%, which is higher than that of the other classification algorithms. In addition, when the ⁇ value is set to 0.01, the classification accuracy is highest, at 88.49%.
- an algorithm having high accuracy in microarray analysis data may be random forest as in the result of FIG. 5A.
- the classification accuracy when applying Random Forest was about 90%, which is higher than other classification algorithms.
- when the ⁇ value is set to 1.00, the classification accuracy is highest, at 90.23%.
- RNA sequencing analysis data and microarray analysis data can be provided to the breast cancer molecular subtype classification device in which Random Forest is set as the molecular subtype classification algorithm for breast tumors. Furthermore, the breast cancer molecular subtype classification method of the present invention and the breast cancer molecular subtype classification device using the same provide an effective set of differentially expressed genes for each of RNA sequencing analysis data and microarray analysis data, so faster and more accurate analysis may be possible than with the conventional method of classifying breast cancer molecular subtypes based on gene expression levels.
- the breast cancer molecular subtype classification method of the present invention and the breast cancer molecular subtype classification device using the same reduce the number of genes used in the breast cancer molecular subtype determination model determined through machine learning, so that the molecular subtypes of breast cancer can be determined while maintaining accuracy and minimizing resource consumption.
- the breast cancer molecular subtype classification method of the present invention and the breast cancer molecular subtype classification device using the same can classify breast cancer molecular subtypes by measuring the expression level of the differentially expressed gene set in tumor cells, without comparing the gene expression levels of the target sample and the tumor sample.
- FIG. 6 shows a comparison between the breast cancer molecular subtype classification method according to an embodiment of the present invention and a conventional breast cancer molecular subtype classification method.
- AIMS is a method of classifying molecular subtypes of breast cancer based on the greater-or-lesser relationship between the expression levels of two genes, which differs according to the molecular subtype of breast cancer.
- accuracy is assessed against the PAM 50 genes in the genefu R package.
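The pairwise greater-or-lesser comparison that AIMS is described as relying on can be sketched as a binary feature per gene pair. This is only an illustration of the idea: the gene pairs and expression values below are hypothetical, and the actual AIMS rules and the classifier built on top of them are not given in this text.

```python
def pairwise_features(expression, gene_pairs):
    """Return 1 for each pair whose first gene is expressed below the second, else 0."""
    return [int(expression[a] < expression[b]) for a, b in gene_pairs]


# Hypothetical gene pairs and expression levels (illustration only).
pairs = [("ESR1", "ERBB2"), ("MKI67", "PGR")]
sample = {"ESR1": 2.5, "ERBB2": 9.1, "MKI67": 6.0, "PGR": 1.2}
features = pairwise_features(sample, pairs)
```

Such features depend only on the ordering of expression levels within one sample, which is why a rule-based method of this kind does not need a reference sample either.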
- the classification model of the present invention has a classification accuracy of 94.92% on RNA sequencing analysis data and 81.36% on microarray analysis data. That is, the classification model of the present invention can provide breast cancer molecular subtype classification with higher accuracy than AIMS, whose classification accuracy on both RNA sequencing analysis data and microarray analysis data is 64.41%.
- Table 2 shows the results of evaluating AIMS using the same TCGA BRCA samples, and the same number of samples, as described above.
- Table 2 allows the classification accuracy of AIMS to be compared with that of the present invention, which, according to FIGS. 5A and 5B, is about 88% on RNA sequencing analysis data and about 90% on microarray analysis data. With AIMS, the classification accuracy was 73.38% (317 of 432 samples) on RNA sequencing analysis data and 77.08% (333 of 432 samples) on microarray analysis data. The present invention is therefore more accurate than AIMS.
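The AIMS percentages quoted here follow directly from the sample counts. A short check, with a helper name chosen for this example:

```python
def accuracy_percent(correct, total):
    """Classification accuracy as a percentage, rounded to two decimals."""
    return round(100 * correct / total, 2)


aims_rnaseq = accuracy_percent(317, 432)      # RNA sequencing analysis data
aims_microarray = accuracy_percent(333, 432)  # microarray analysis data
```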
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- General Health & Medical Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Evolutionary Biology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Medical Informatics (AREA)
- Immunology (AREA)
- Analytical Chemistry (AREA)
- Theoretical Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- General Engineering & Computer Science (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Pathology (AREA)
- Hospice & Palliative Care (AREA)
- Oncology (AREA)
Abstract
Description
The present invention relates to a method for classifying breast cancer molecular subtypes and a device for classifying breast cancer molecular subtypes using the same, and more particularly, to a method for classifying the breast cancer molecular subtype of tumor cells by measuring the expression levels of specific genes in the tumor cells, and a device for classifying molecular subtypes of breast cancer using the same.
Breast cancer is a mass composed of cancer cells arising in the breast. Various factors, such as female hormones, family history, medical history, childbearing history, and dietary habits, have been proposed as causes of breast cancer, but none has yet been clearly established. The incidence of breast cancer is increasing rapidly owing to increased sensitivity of mammary gland tissue, the westernization of diet, and pollution of the living environment.
Various treatments for breast cancer are known, including surgery, radiation therapy, chemotherapy, and anti-hormonal therapy. To provide breast cancer patients with an effective treatment among these, it is important to know the molecular subtype of the breast cancer accurately. Accurate classification of molecular subtypes can provide information related to survival rate, prediction of treatment success, and prognosis, and can facilitate the selection of an appropriate treatment. Accordingly, there is great demand for methods that accurately determine the molecular subtype of breast cancer, and related research is ongoing.
The background art is described to facilitate understanding of the present invention. It should not be understood as an admission that the matters described in the background art exist as prior art.
Breast tumors can be broadly classified into five molecular subtypes: Luminal A, Luminal B, HER2, basal-like, and normal-like. These molecular subtypes can be classified based on the expression of markers consisting of the estrogen receptor, progesterone receptor, HER2, HER1, and Cytokeratin 5/3.
The prognosis and the predicted treatment response may differ according to the molecular subtype of the breast tumor, and accordingly the therapies or therapeutic agents required for breast cancer patients with different molecular subtypes may differ.
Accordingly, the inventors of the present invention recognized that the expression levels of specific genes differ according to the molecular subtype of breast cancer and that, based on these differences, the molecular subtypes of breast tumors can be effectively classified by using a model obtained through machine learning.
An object of the present invention is therefore to provide a breast cancer molecular subtype classification method that provides a differentially expressed gene set in tumor cells and can classify the breast cancer molecular subtype of the tumor cells based on the ratios of the expression levels of those genes.
Another object of the present invention is to provide a breast cancer molecular subtype classification method, and a device using the same, that can classify the breast cancer molecular subtype from the absolute gene expression levels of a single sample alone, by measuring the expression levels of a specific gene set in tumor cells.
Yet another object of the present invention is to provide a breast cancer molecular subtype classification method, and a device using the same, that enable accurate calculation of the expression ratios of specific genes in tumor cells by using an expression ratio calculation corrected with a reference value.
The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.
To solve the problems described above, a breast cancer molecular subtype classification method according to an embodiment of the present invention includes: obtaining the expression levels of a differentially expressed gene set measured in tumor cells; inputting the obtained expression levels into a classification model having parameters for the differentially expressed gene set; determining the breast cancer molecular subtype of the tumor cells using the classification model; and providing the determined breast cancer molecular subtype.
According to another feature of the present invention, the expression levels are those of a first gene set and a second gene set measured in tumor cells using RNA sequencing, or those of the first gene set and a third gene set measured in tumor cells using a microarray; the first gene set includes the ESR1, PGR, ERBB2 and MKI67 genes, and the second gene set and the third gene set may differ from each other.
According to another feature of the present invention, when RNA sequencing is used, the second gene set may include at least one gene selected from the group consisting of the FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2, KRT17, SFRP1, KRT14, MELK, KRT5, MIA, KIF2C, EGFR, NAT1 and ANLN genes.
According to another feature of the present invention, the second gene set may include: the FOXC1 gene; or the FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT17 genes; or the FOXA1 and SFRP1 genes; or the MLPH, FGFR4, MYBL2 and SFRP1 genes; or the FOXA1, KRT17 and SFRP1 genes; or the FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1 genes; or the FOXC1, FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1 genes; or the FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1 genes; or the GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1 genes; or the FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1 genes; or the GRB7, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1 genes; or the FOXC1, GRB7, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1 genes; or the FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14 genes; or the KRT17, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14 genes; or the GRB7, KRT17, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14 genes; or the MLPH, FGFR4, MYBL2, SFRP1 and KRT14 genes; or the FOXA1, FOXC1, MLPH, FGFR4, BCL2, MYBL2, SFRP1 and KRT14 genes; or the MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14 genes; or the FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14 genes; or the FOXC1, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14 genes; or the GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14 genes; or the BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14 genes; or the FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14 genes; or the FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14 genes; or the FOXA1, GRB7, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14 genes; or the GRB7, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14 genes; or the FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14 genes; or the FOXC1, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14 genes; or the GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14 genes; or the BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14 genes; or the FOXC1, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14 genes; or the GRB7, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14 genes; or the FOXC1, GRB7, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14 genes; or the MLPH, FGFR4, CEP55 and KRT17 genes; or the FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14 genes; or the BCL2, MIA, FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14 genes; or the KIF2C, EGFR, NAT1, ANLN, BCL2, MIA, FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14 genes.
According to yet another feature of the present invention, when a microarray is used, the third gene set may include at least one gene selected from the group consisting of the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
According to another feature of the present invention, the third gene set may include: the FOXC1 and CEP55 genes; or the FOXC1, MELK and CEP55 genes; or the FOXA1, FOXC1, GRB7, CEP55, BIRC5, MELK and MIA genes; or the FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes; or the FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes; or the FOXA1, MELK, SFRP1 and MIA genes; or the CEP55, FOXA1, MELK, SFRP1 and MIA genes; or the FOXC1, CEP55, FOXA1, MELK, SFRP1 and MIA genes; or the FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA genes; or the FOXC1, FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA genes; or the FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes; or the FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes; or the FOXC1, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes; or the FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes; or the FOXC1, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes; or the FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes; or the FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes; or the CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes; or the FOXC1, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes; or the GRB7, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes; or the FOXC1, GRB7, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes; or the FOXA1, FOXC1, CEP55, SFRP1, MIA and KRT14 genes; or the FGFR4, FOXA1, FOXC1, CEP55, SFRP1, MIA and KRT14 genes; or the FGFR4, FOXA1, BIRC5, CEP55, SFRP1, MIA and KRT14 genes; or the FOXC1, FGFR4, FOXA1, BIRC5, CEP55, SFRP1, MIA and KRT14 genes; or the FOXA1, FOXC1, CEP55, MELK, SFRP1, MIA and KRT14 genes; or the FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA and KRT14 genes; or the FOXC1, FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA and KRT14 genes; or the FOXC1, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14 genes; or the FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14 genes; or the FOXC1, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14 genes; or the FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14 genes; or the FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14 genes; or the FOXC1, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14 genes; or the GRB7, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14 genes; or the FOXC1, GRB7, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14 genes; or the FOXA1, FGFR4, CEP55, SFRP1 and ANLN genes; or the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA and BAG1 genes; or the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14 and BLVRA genes; or the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17 and CCNB1 genes; or the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14 and BLVRA genes; or the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
According to yet another feature of the present invention, the method may further include, before the step of obtaining the expression level: measuring the expression levels of a plurality of breast cancer-associated genes for each of tumor cells having the basal, luminal A, luminal B, normal-like, normal and HER2 molecular subtypes; calculating the difference in the expression levels of the plurality of breast cancer-associated genes between tumor cells of two molecular subtypes selected from among the tumor cells; and determining, as the differentially expressed gene set, at least one gene among the plurality of breast cancer-associated genes whose expression level differs relatively greatly according to molecular subtype.
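The three steps above can be sketched as follows. This is a minimal illustration only, not the method of the specification: the function name `select_deg`, the use of the mean log2 expression per subtype, the "largest absolute difference between any two subtypes" score, and the `top_n` cutoff are all assumptions introduced here, since the text says only that genes with a "relatively large" difference are selected.

```python
from itertools import combinations
from statistics import mean


def select_deg(expr_by_subtype, top_n=2):
    """Rank genes by how strongly their expression differs between subtypes.

    expr_by_subtype maps a subtype name to {gene: [log2 expression per sample]}.
    The scoring rule and top_n cutoff are illustrative assumptions.
    """
    if len(expr_by_subtype) < 2:
        raise ValueError("at least two molecular subtypes are required")

    # Only genes measured in every subtype can be compared.
    genes = set.intersection(*(set(g) for g in expr_by_subtype.values()))

    score = {}
    for gene in genes:
        # Step 1: summarize the measured expression level per subtype.
        means = [mean(samples[gene]) for samples in expr_by_subtype.values()]
        # Step 2: difference between every pair of subtypes; keep the largest.
        score[gene] = max(abs(a - b) for a, b in combinations(means, 2))

    # Step 3: genes with the largest differences form the DEG set.
    ranked = sorted(score, key=lambda g: (-score[g], g))
    return ranked[:top_n]
```

Called with toy values in which FOXA1 and FOXC1 differ strongly between two subtypes and MKI67 barely differs, the function returns FOXA1 and FOXC1 as the selected set.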
According to yet another feature of the present invention, the classification model may be defined by a matrix composed of expression ratio values for pairs of genes selected from the plurality of breast cancer-associated genes, calculated by [Equation 1] below on the basis of the expression levels of the plurality of breast cancer-associated genes.
[Equation 1]
(Here, e_i is the base-2 logarithm of the expression value of gene i selected from the plurality of breast cancer-associated genes, and e_j is the base-2 logarithm of the expression value of gene j selected from the plurality of breast cancer-associated genes.)
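The pairwise matrix described here can be illustrated with a short sketch. Equation 1 itself is not reproduced in this text, so the form used below, d_ij = e_i - e_j (the difference of base-2 logarithms, which equals the base-2 logarithm of the expression ratio), is an assumption and not a statement of the equation in the specification. The function name `log_ratio_matrix` is likewise hypothetical.

```python
import math


def log_ratio_matrix(expression, genes):
    """Build a gene-by-gene matrix of assumed expression ratio values.

    expression maps a gene symbol to a positive expression value.
    ASSUMPTION: d_ij = e_i - e_j with e_x = log2(expression of gene x);
    the actual Equation 1 is not available in this text.
    """
    e = {g: math.log2(expression[g]) for g in genes}
    return [[e[gi] - e[gj] for gj in genes] for gi in genes]
```

Under this assumed form the matrix is antisymmetric with a zero diagonal, so only the values for i < j carry independent information.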
According to yet another feature of the present invention, the classification model may be defined by a matrix composed of expression ratio values for pairs of genes selected from the plurality of breast cancer-associated genes, calculated by correcting the d_ij values with [Equation 2] below.
[Equation 2]
(Here, α is one value selected from the group consisting of 0, 0.01, 0.10, 0.15, 0.20 and 1.0.)
According to yet another feature of the present invention, the molecular subtype of breast cancer may include at least one selected from the group consisting of the basal, luminal A, luminal B, normal-like, normal and HER2 types.
To solve the problems described above, a breast cancer molecular subtype classification device according to an embodiment of the present invention is an operable device for breast cancer molecular subtype classification that includes a processor operatively connected to a communication unit. The processor is configured to obtain, through the communication unit, the expression level of a differentially expressed gene set measured in tumor cells; to input the obtained expression level into a classification model having parameters for the differentially expressed gene set; to determine the breast cancer molecular subtype of the tumor cells; and to provide the determined breast cancer molecular subtype.
According to another feature of the present invention, the expression level is either the expression level of a first gene set and a second gene set measured in the tumor cells using RNA sequencing, or the expression level of the first gene set and a third gene set measured in the tumor cells using a microarray. The first gene set includes the ESR1, PGR, ERBB2 and MKI67 genes, and the second gene set and the third gene set may differ from each other.
According to yet another feature of the present invention, when RNA sequencing is used, the second gene set may include at least one gene selected from the group consisting of the FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2, KRT17, SFRP1, KRT14, MELK, KRT5, MIA, KIF2C, EGFR, NAT1 and ANLN genes.
According to another feature of the present invention, the second gene set may include one of the following combinations of genes: FOXC1; FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT17; FOXA1 and SFRP1; MLPH, FGFR4, MYBL2 and SFRP1; FOXA1, KRT17 and SFRP1; FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1; FOXC1, FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1; FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1; GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1; FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1; GRB7, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1; FOXC1, GRB7, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1; FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14; KRT17, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14; GRB7, KRT17, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14; MLPH, FGFR4, MYBL2, SFRP1 and KRT14; FOXA1, FOXC1, MLPH, FGFR4, BCL2, MYBL2, SFRP1 and KRT14; MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14; FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14; FOXC1, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14; GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14; BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14; FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14; FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14; FOXA1, GRB7, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14; GRB7, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14; FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14; FOXC1, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14; GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14; BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14; FOXC1, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14; GRB7, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14; FOXC1, GRB7, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14; MLPH, FGFR4, CEP55 and KRT17; FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14; BCL2, MIA, FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14; or KIF2C, EGFR, NAT1, ANLN, BCL2, MIA, FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14.
According to yet another feature of the present invention, when a microarray is used, the third gene set may include at least one gene selected from the group consisting of the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
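The platform-dependent choice of genes described above can be sketched as a simple lookup. The gene symbols are taken from the text (the first gene set, and the candidate groups for the second and third gene sets); the function name `build_panel`, the platform keys `"rna_seq"` and `"microarray"`, and the validation behavior are hypothetical and introduced only for illustration.

```python
# First gene set, measured on either platform (from the text).
FIRST_SET = ("ESR1", "PGR", "ERBB2", "MKI67")

# Candidate group for the second gene set (RNA sequencing), from the text.
SECOND_SET_POOL = (
    "FOXA1", "FOXC1", "MLPH", "FGFR4", "GRB7", "BCL2", "CEP55", "MYBL2",
    "KRT17", "SFRP1", "KRT14", "MELK", "KRT5", "MIA", "KIF2C", "EGFR",
    "NAT1", "ANLN",
)

# Candidate group for the third gene set (microarray), from the text.
THIRD_SET_POOL = (
    "FOXA1", "FGFR4", "CEP55", "SFRP1", "ANLN", "FOXC1", "GRB7", "BIRC5",
    "MIA", "BAG1", "MLPH", "MELK", "KRT14", "BLVRA", "BCL2", "UBE2C",
    "KRT17", "CCNB1", "NAT1", "SLC39A6", "CDC20", "KRT5", "CCNE1",
)


def build_panel(platform, chosen):
    """Return the first gene set plus platform-specific genes.

    platform: "rna_seq" or "microarray" (hypothetical keys).
    chosen: at least one gene from the pool that matches the platform.
    """
    pools = {"rna_seq": SECOND_SET_POOL, "microarray": THIRD_SET_POOL}
    if platform not in pools:
        raise ValueError(f"unknown platform: {platform}")
    if not chosen:
        raise ValueError("at least one platform-specific gene is required")
    unknown = [g for g in chosen if g not in pools[platform]]
    if unknown:
        raise ValueError(f"genes not in the {platform} group: {unknown}")
    return list(FIRST_SET) + list(chosen)
```

For example, `build_panel("rna_seq", ["FOXC1"])` yields the four first-set genes followed by FOXC1, while requesting MYBL2 for the microarray platform is rejected because MYBL2 appears only in the RNA sequencing group.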
According to yet another feature of the present invention, the third gene set may include one of the following combinations of genes: FOXC1 and CEP55; FOXC1, MELK and CEP55; FOXA1, FOXC1, GRB7, CEP55, BIRC5, MELK and MIA; FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA; FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA; FOXA1, MELK, SFRP1 and MIA; CEP55, FOXA1, MELK, SFRP1 and MIA; FOXC1, CEP55, FOXA1, MELK, SFRP1 and MIA; FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA; FOXC1, FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA; FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA; FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA; FOXC1, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA; FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA; FOXC1, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA; FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA; FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA; CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA; FOXC1, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA; GRB7, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA; FOXC1, GRB7, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA; FOXA1, FOXC1, CEP55, SFRP1, MIA and KRT14; FGFR4, FOXA1, FOXC1, CEP55, SFRP1, MIA and KRT14; FGFR4, FOXA1, BIRC5, CEP55, SFRP1, MIA and KRT14; FOXC1, FGFR4, FOXA1, BIRC5, CEP55, SFRP1, MIA and KRT14; FOXA1, FOXC1, CEP55, MELK, SFRP1, MIA and KRT14; FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA and KRT14; FOXC1, FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA and KRT14; FOXC1, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14; FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14; FOXC1, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14; FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14; FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14; FOXC1, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14; GRB7, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14; FOXC1, GRB7, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14; FOXA1, FGFR4, CEP55, SFRP1 and ANLN; FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA and BAG1; FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14 and BLVRA; FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17 and CCNB1; FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14 and BLVRA; or FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1.
According to yet another feature of the present invention, the classification model may be defined by a matrix composed of expression ratio values for pairs of genes selected from the plurality of breast cancer-associated genes, calculated by [Equation 1] below on the basis of the expression levels of the plurality of breast cancer-associated genes.
[Equation 1]
(Here, e_i is the base-2 logarithm of the expression value of gene i selected from the plurality of breast cancer-associated genes, and e_j is the base-2 logarithm of the expression value of gene j selected from the plurality of breast cancer-associated genes.)
According to yet another feature of the present invention, the classification model may be defined by a matrix composed of expression ratio values for pairs of genes selected from the plurality of breast cancer-associated genes, calculated by correcting the d_ij values with [Equation 2] below.
[Equation 2]
(Here, α is one value selected from the group consisting of 0, 0.01, 0.10, 0.15, 0.20 and 1.0.)
The present invention provides different gene sets depending on the expression level measurement method, which includes RNA sequencing or a microarray, and thereby has the effect of classifying breast cancer molecular subtypes with high accuracy by measuring the expression levels of specific genes in tumor cells.
Specifically, the gene set may include differentially expressed genes (DEGs), whose expression levels differ relatively greatly according to molecular subtype. The present invention therefore has the effect of accurately classifying breast cancer molecular subtypes using the differentially expressed genes.
In addition, by providing a breast cancer molecular subtype classification method, the present invention has the effect of classifying breast cancer molecular subtypes by measuring the expression levels of specific genes in breast tumor cells using tumor cells alone, without a separate subject sample.
Specifically, by reducing the number of genes used in the breast cancer molecular subtype determination model established through machine learning, the present invention has the effect of determining the molecular subtype of breast cancer while maintaining accuracy and minimizing resource consumption.
The effects of the present invention are not limited to those exemplified above, and further various effects are included in this specification.
FIG. 1 is a block diagram showing a schematic configuration of a breast cancer molecular subtype classification device according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a breast cancer molecular subtype classification method according to an embodiment of the present invention.
FIG. 3A is a flowchart showing the learning procedure for breast cancer molecular subtype classification in a breast cancer molecular subtype classification device according to an embodiment of the present invention.
FIG. 3B is a schematic diagram showing the classification model provided by the breast cancer molecular subtype classification method according to an embodiment of the present invention.
FIG. 4A shows evaluation results on RNA sequencing analysis data for the various differentially expressed gene sets provided in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
FIG. 4B shows evaluation results on microarray analysis data for the various differentially expressed gene sets provided in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
FIG. 5A shows evaluation results on RNA sequencing analysis data according to the breast cancer molecular subtype classification algorithm in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
FIG. 5B shows evaluation results on microarray analysis data according to the breast cancer molecular subtype classification algorithm in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
FIG. 6 shows the results of comparing breast cancer molecular subtype classification by the method according to an embodiment of the present invention with classification by a conventional method.
The advantages and features of the invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and will be implemented in various different forms. The embodiments are provided only to make the disclosure of the present invention complete and to fully convey the scope of the invention to those of ordinary skill in the art to which the present invention pertains, and the present invention is defined only by the scope of the claims.
The shapes, sizes, ratios, angles, numbers and the like disclosed in the drawings for describing the embodiments of the present invention are exemplary, and the present invention is not limited to what is illustrated. In describing the present invention, a detailed description of related known technology is omitted where it is judged that it could unnecessarily obscure the gist of the present invention. Where 'includes', 'has', 'consists of' and the like are used in this specification, other parts may be added unless 'only' is used. Where a component is expressed in the singular, the plural is included unless expressly stated otherwise.
In interpreting a component, it is interpreted as including a margin of error even where there is no separate explicit statement.
Although the terms first, second and the like are used to describe various components, these components are not limited by these terms. The terms are used only to distinguish one component from another. Accordingly, a first component mentioned below may be a second component within the technical idea of the present invention.
Unless otherwise specified, the same reference numerals refer to the same components throughout the specification.
The features of the various embodiments of the present invention may be partially or wholly coupled or combined with one another and, as those skilled in the art will fully understand, may be technically interlocked and operated in various ways. The embodiments may be implemented independently of one another or together in association.
For clarity of interpretation, the terms used in this specification are defined below.
As used herein, the term "breast cancer" refers to a mass composed of cancer cells arising in the breast. Breast cancer may be a type of cancer originating in the breast lobules, which supply the milk ducts with milk, or in the inner lining of the milk ducts. Cancers originating from the ducts may be ductal carcinomas, and cancers originating from the lobules may be lobular carcinomas. Sites of metastasis from the breast may sometimes include the bone, liver, lung and brain. Breast cancer occurs in humans and other mammals. In humans, breast cancer occurs mostly in women but can also occur in men. Treatments for breast cancer may include surgery, drug therapy (hormone therapy and chemotherapy), radiation therapy, and/or immunotherapy or targeted therapy.
Breast tumors can be broadly divided into five molecular subtypes. Here, "molecular subtype" may refer to a subtype of breast tumor characterized by a distinctive molecular profile, for example a gene expression profile. Specifically, breast tumors can be classified into five molecular subtypes: luminal A, luminal B, HER2, basal-like and normal-like.
The prognosis of breast cancer may differ according to the molecular subtype of the breast tumor. For example, cancers that are positive for estrogen receptor expression (for example, luminal A and luminal B) may have a better prognosis than breast cancers of other molecular subtypes. Furthermore, owing to the development of HER receptor blockers, cancers that are positive for HER2 expression (for example, HER2 type and luminal B) may have a better prognosis than triple-negative cancers (for example, basal-like and normal-like). Triple-negative cancers may have a lower survival rate after recurrence than breast cancers of other molecular subtypes and, accordingly, a shorter survival time.
Furthermore, the effective therapeutic agent may also differ according to the molecular subtype of the breast tumor. For example, for cancers that are positive for estrogen receptor expression (for example, luminal A and luminal B), the estrogen receptor antagonist tamoxifen may be an effective therapeutic agent. For cancers that are positive for HER2 expression (for example, HER2 type and luminal B), an anti-HER2 antibody or a HER receptor tyrosine kinase inhibitor (for example, lapatinib) may be an effective therapeutic agent.
Accordingly, a method of classifying the molecular subtype of a breast tumor can be used as a method of providing information for determining the prognosis of breast cancer or the course of breast cancer treatment.
As used herein, the term "tumor cell" refers to a cell that proliferates autonomously and excessively in the body. The tumor cells are preferably breast cancer tumor cells isolated from a breast cancer patient, but are not limited thereto. For example, the tumor cells may be a mixture of breast cancer cells and normal cells.
As used herein, the term "differentially expressed gene" refers to a gene whose expression level is significantly increased or significantly decreased in an experimental group compared with a control group. Breast tumor cells may have differentially expressed genes whose expression levels differ depending on the molecular subtype. A differentially expressed gene according to the molecular subtype of a breast tumor may be, among the genes associated with breast cancer, a gene whose expression level differs greatly between molecular subtypes. For example, the differentially expressed genes according to the molecular subtype of a breast tumor may be FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5, KIF2C, EGFR, CCNE1, ESR1, PGR, ERBB2 and MKI67.
As used herein, the term "expression level" refers to a measurable amount of the gene product produced from a gene. For example, the expression level of a differentially expressed gene may refer to the amount of transcript, including mRNA, produced by transcription of the gene, or to the amount of DNA of the gene, and may further refer to the amount of protein produced by translation, but is not limited thereto. In this specification, "expression level" may be interpreted as having the same meaning as "expression value".
The expression level of a differentially expressed gene may be measured using polymerase chain reaction (PCR), a DNA array, an RNA array, Northern blot, Western blot, enzyme-linked immunosorbent assay (ELISA), a protein array, or the like, but the measurement method is not limited thereto. The molecular subtype of a breast tumor may be classified by measuring the expression levels of differentially expressed genes in the breast tumor. A preferred method of measuring the expression levels of differentially expressed genes for molecular subtype classification of a breast tumor is RNA sequencing or a microarray, but the method is not limited thereto.
Specifically, "RNA sequencing" refers to a method of analyzing the RNA base sequence of a subject, and "microarray" refers to a method of screening gene expression using a DNA chip. In this specification, RNA sequencing and microarrays are used as means for measuring the expression levels of a plurality of genes, preferably as means for measuring the expression levels of differentially expressed genes for molecular subtype classification of a breast tumor, but are not limited thereto.
The set of differentially expressed genes described above may differ depending on the method used to measure expression levels. For example, when RNA sequencing is used, a preferred set of differentially expressed genes for molecular subtype classification of a breast tumor may be any one of the following gene sets:
ESR1, PGR, ERBB2, MKI67 and FOXC1;
ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT17;
ESR1, PGR, ERBB2, MKI67, FOXA1 and SFRP1;
ESR1, PGR, ERBB2, MKI67, MLPH, FGFR4, MYBL2 and SFRP1;
ESR1, PGR, ERBB2, MKI67, FOXA1, KRT17 and SFRP1;
ESR1, PGR, ERBB2, MKI67, FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1;
ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1;
ESR1, PGR, ERBB2, MKI67, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1;
ESR1, PGR, ERBB2, MKI67, GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1;
ESR1, PGR, ERBB2, MKI67, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1;
ESR1, PGR, ERBB2, MKI67, GRB7, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1;
ESR1, PGR, ERBB2, MKI67, FOXC1, GRB7, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1;
ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14;
ESR1, PGR, ERBB2, MKI67, KRT17, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14;
ESR1, PGR, ERBB2, MKI67, GRB7, KRT17, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14;
ESR1, PGR, ERBB2, MKI67, MLPH, FGFR4, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, MLPH, FGFR4, BCL2, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXA1, GRB7, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, GRB7, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, GRB7, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, GRB7, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
ESR1, PGR, ERBB2, MKI67, MLPH, FGFR4, CEP55 and KRT17;
ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14;
ESR1, PGR, ERBB2, MKI67, BCL2, MIA, FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14;
ESR1, PGR, ERBB2, MKI67, KIF2C, EGFR, NAT1, ANLN, BCL2, MIA, FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14.
When a microarray is used, a preferred set of differentially expressed genes for molecular subtype classification of a breast tumor may be any one of the following gene sets:
ESR1, PGR, ERBB2, MKI67, FOXC1 and CEP55;
ESR1, PGR, ERBB2, MKI67, FOXC1, MELK and CEP55;
ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, GRB7, CEP55, BIRC5, MELK and MIA;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXA1, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, CEP55, FOXA1, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXC1, CEP55, FOXA1, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXC1, FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXC1, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXC1, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, GRB7, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXC1, GRB7, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA;
ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, CEP55, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FGFR4, FOXA1, FOXC1, CEP55, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FGFR4, FOXA1, BIRC5, CEP55, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, FGFR4, FOXA1, BIRC5, CEP55, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, CEP55, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, GRB7, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXC1, GRB7, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, SFRP1 and ANLN;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA and BAG1;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14 and BLVRA;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17 and CCNB1;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14 and BLVRA;
ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1.
As used herein, the term "first gene set" may refer to a group of genes, among the differentially expressed genes described above, whose expression levels differ relatively greatly between molecular subtypes of breast tumors compared with the other genes. For example, the first gene set may consist of the ESR1, PGR, ERBB2 and MKI67 genes.
As used herein, the term "second gene set" may refer to a group of genes whose expression levels differ significantly between molecular subtypes of breast tumors when RNA sequencing is used. For example, the second gene set may consist of at least one of the FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2, KRT17, SFRP1, KRT14, MELK, KRT5, MIA, KIF2C, EGFR, NAT1 and ANLN genes.
As used herein, the term "third gene set" may refer to a group of genes whose expression levels differ significantly between molecular subtypes of breast tumors when a microarray is used. For example, the third gene set may consist of the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
Here, the second gene set and the third gene set may differ from each other, but are not limited thereto.
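The relationship between the three gene sets can be sketched in Python as follows. This is only an illustration under stated assumptions: the gene symbols come from the definitions above, while the names `FIRST_GENE_SET`, `SECOND_GENE_POOL`, `THIRD_GENE_POOL` and `build_gene_set` are hypothetical and do not appear in the specification.

```python
# Gene symbols are taken from the definitions above; every identifier
# below is a hypothetical name chosen for this illustration.
FIRST_GENE_SET = ("ESR1", "PGR", "ERBB2", "MKI67")

# Genes from which a second gene set may be drawn (RNA sequencing).
SECOND_GENE_POOL = (
    "FOXA1", "FOXC1", "MLPH", "FGFR4", "GRB7", "BCL2", "CEP55", "MYBL2",
    "KRT17", "SFRP1", "KRT14", "MELK", "KRT5", "MIA", "KIF2C", "EGFR",
    "NAT1", "ANLN",
)

# Genes from which a third gene set may be drawn (microarray).
THIRD_GENE_POOL = (
    "FOXA1", "FGFR4", "CEP55", "SFRP1", "ANLN", "FOXC1", "GRB7", "BIRC5",
    "MIA", "BAG1", "MLPH", "MELK", "KRT14", "BLVRA", "BCL2", "UBE2C",
    "KRT17", "CCNB1", "NAT1", "SLC39A6", "CDC20", "KRT5", "CCNE1",
)

POOLS = {"rna_seq": SECOND_GENE_POOL, "microarray": THIRD_GENE_POOL}


def build_gene_set(platform, selected):
    """Return the first gene set followed by the platform-specific genes.

    Raises ValueError if nothing is selected or if a selected gene is not
    in the pool for the given platform.
    """
    pool = POOLS[platform]
    unknown = [gene for gene in selected if gene not in pool]
    if not selected or unknown:
        raise ValueError(f"invalid selection for {platform}: {unknown}")
    return list(FIRST_GENE_SET) + list(selected)
```

For instance, `build_gene_set("rna_seq", ["FOXC1"])` yields the five-gene set ESR1, PGR, ERBB2, MKI67 and FOXC1, whereas requesting BIRC5 for RNA sequencing is rejected because BIRC5 is listed only for the microarray case.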
Breast tumor cells may differ, depending on their molecular subtype, in the expression levels of the differentially expressed genes within the differentially expressed gene set described above. The difference in expression level may be represented by classifying the expression level of each differentially expressed gene, among the plurality of differentially expressed genes, as high or low. Preferably, the difference in expression level in this specification may refer to the expression ratio between two selected differentially expressed genes. Accordingly, the present invention can provide a method of classifying breast cancer molecular subtypes by providing expression ratios of differentially expressed genes.
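As an illustration of the expression ratio between two selected genes, a minimal Python sketch is given below. The specification does not state a formula, so the plain quotient, the optional `pseudocount` argument and the function name `expression_ratio` are all assumptions made for this example.

```python
def expression_ratio(levels, gene_a, gene_b, pseudocount=0.0):
    """Return the ratio of the expression level of gene_a to that of gene_b.

    `levels` maps gene symbols to non-negative expression levels.
    `pseudocount` is a hypothetical safeguard added to both values so that
    an undetected gene (level 0) does not cause a division by zero.
    """
    numerator = levels[gene_a] + pseudocount
    denominator = levels[gene_b] + pseudocount
    if denominator <= 0:
        raise ValueError(f"non-positive denominator for {gene_b}")
    return numerator / denominator
```

With hypothetical levels of 6.0 for ESR1 and 2.0 for FOXC1, the ratio of ESR1 to FOXC1 is 3.0. A ratio has the practical property that it is unchanged when both values are scaled by the same factor.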
As used herein, the term "parameter" refers to a variable. For example, a parameter may be any variable that can be manipulated for classification in the breast cancer molecular subtype classification method according to an embodiment of the present invention.
As used herein, the term "classification model" may be defined by a matrix of expression-ratio values, that is, ratios of the expression levels of two genes selected from a plurality of genes associated with breast cancer, but is not limited thereto. Preferably, the classification model may be defined as a matrix composed of two differentially expressed genes and the difference between them.
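One way to picture such a matrix is sketched below in Python. The sketch assumes that the expression levels have already been log2-transformed, so that the difference between two genes corresponds to the logarithm of their expression ratio; that assumption, the row and column layout, and the name `pairwise_difference_matrix` are illustrative and are not taken from the specification.

```python
from itertools import combinations


def pairwise_difference_matrix(samples, genes):
    """Build a matrix of pairwise expression differences.

    `samples` is a list of dicts mapping gene symbols to log2 expression
    levels (an assumption of this sketch). Returns the list of gene pairs
    (the columns) and one row of differences per sample.
    """
    pairs = list(combinations(genes, 2))
    rows = [[sample[a] - sample[b] for a, b in pairs] for sample in samples]
    return pairs, rows
```

For three genes the matrix has three columns, one per unordered gene pair, and each row holds the differences for one tumor sample.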
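Hereinafter, a breast cancer molecular subtype classification device according to an embodiment of the present invention is described with reference to FIG. 1.
FIG. 1 is a block diagram showing a schematic configuration of a breast cancer molecular subtype classification device according to an embodiment of the present invention. Referring to FIG. 1, the breast cancer molecular subtype classification device 100 includes a communication unit 110, an input unit 120, a display unit 130, a storage unit 140 and a processor 150.
Through the communication unit 110, the breast cancer molecular subtype classification device 100 can acquire the expression levels of the differentially expressed gene set measured in tumor cells.
The input unit 120 may be a keyboard, a mouse, a touch screen panel or the like, without limitation. Through the input unit 120, the breast cancer molecular subtype classification device 100 can be configured and its operation can be directed.
In classifying the molecular subtype of breast cancer, the display unit 130 may display menus that allow the user to easily configure the breast cancer molecular subtype classification device 100. Furthermore, the display unit 130 may display the molecular subtype, determined using the classification model that received the expression levels of the differentially expressed gene set through the input unit 120, so that the user can easily recognize it. The display unit 130 is a display device including a liquid crystal display, an organic light-emitting display or the like, and can present the menus to the user. In addition to the foregoing, the display unit 130 may be implemented in various forms or by various methods within a scope in which the object of the present invention can be achieved.
The storage unit 140 may store the expression levels of the differentially expressed gene set in tumor cells acquired through the communication unit 110. It may also store the expression levels of the differentially expressed gene set entered into the classification model through the input unit 120. Furthermore, it may store the classification model.
The processor 150 executes various instructions for operating the breast cancer molecular subtype classification device 100 according to an embodiment of the present invention. The processor 150 is connected to the communication unit 110 and is configured to acquire, through the communication unit 110, the expression levels of the differentially expressed gene set measured in tumor cells; to input the acquired expression levels into a classification model having parameters for the differentially expressed gene set; to determine the breast cancer molecular subtype of the tumor cells; and to provide the determined breast cancer molecular subtype.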
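The sequence of operations attributed to the processor (acquire, input into the model, determine, provide) can be sketched in Python as follows. The sketch is hypothetical throughout: `classify_tumor` and `placeholder_model` are invented names, the labels "subtype_A" and "subtype_B" are placeholders, and the threshold rule is not the classification model of the specification.

```python
def classify_tumor(levels, gene_set, model):
    """Pass the expression levels of `gene_set` to `model` and return its label.

    `levels` maps gene symbols to expression levels, `gene_set` fixes the
    order of the inputs, and `model` is any callable that accepts a list of
    numbers and returns a subtype label.
    """
    missing = [gene for gene in gene_set if gene not in levels]
    if missing:
        raise KeyError(f"missing expression levels: {missing}")
    features = [levels[gene] for gene in gene_set]
    return model(features)


def placeholder_model(features):
    # Placeholder only: NOT a real classifier. It thresholds the first
    # input so that the example produces some output.
    return "subtype_A" if features[0] >= 1.0 else "subtype_B"
```

Keeping the model as a separate callable mirrors the description, in which the classification model is stored in the storage unit 140 and applied by the processor 150 to whichever gene set was measured.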
Hereinafter, a breast cancer molecular subtype classification method according to an embodiment of the present invention, and the processor implemented in the breast cancer molecular subtype classification device, are described in detail with reference to FIG. 2.
FIG. 2 is a flowchart illustrating a breast cancer molecular subtype classification method according to an embodiment of the present invention.
First, the expression levels of a differentially expressed gene set measured in tumor cells are acquired (S210). Here, the differentially expressed gene set may be the first gene set and the second gene set measured in tumor cells using RNA sequencing. In another embodiment, the differentially expressed gene set may be the first gene set and the third gene set measured in tumor cells using a microarray. As described above, the first gene set includes the ESR1, PGR, ERBB2 and MKI67 genes, and because the second gene set and the third gene set differ from each other according to the method of measuring expression levels, the differentially expressed gene set may differ accordingly.
For example, when RNA sequencing is used, the second gene set may include at least one gene selected from the group consisting of the FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2, KRT17, SFRP1, KRT14, MELK, KRT5, MIA, KIF2C, EGFR, NAT1 and ANLN genes. A preferred second gene set may include any one of the following groups of genes:
FOXC1;
FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT17;
FOXA1 and SFRP1;
MLPH, FGFR4, MYBL2 and SFRP1;
FOXA1, KRT17 and SFRP1;
FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1;
FOXC1, FOXA1, MLPH, FGFR4, BCL2, MYBL2, KRT17 and SFRP1;
FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1;
GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17 and SFRP1;
FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1;
GRB7, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1;
FOXC1, GRB7, FOXA1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and SFRP1;
FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14;
KRT17, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14;
GRB7, KRT17, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2 and KRT14;
MLPH, FGFR4, MYBL2, SFRP1 and KRT14;
FOXA1, FOXC1, MLPH, FGFR4, BCL2, MYBL2, SFRP1 and KRT14;
MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
FOXC1, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, SFRP1 and KRT14;
FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14;
FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14;
FOXA1, GRB7, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14;
GRB7, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, SFRP1 and KRT14;
FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
FOXC1, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
GRB7, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
FOXC1, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
GRB7, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
FOXC1, GRB7, BCL2, FOXA1, MLPH, FGFR4, CEP55, MYBL2, KRT17, SFRP1 and KRT14;
MLPH, FGFR4, CEP55 and KRT17;
FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14;
BCL2, MIA, FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14;
KIF2C, EGFR, NAT1, ANLN, BCL2, MIA, FOXC1, FOXA1, MLPH, FGFR4, GRB7, CEP55, MYBL2, KRT17, SFRP1, MELK, KRT5 and KRT14.
In addition, when a microarray is used, the third gene set may include at least one gene selected from the group consisting of the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
Preferred third set of genes are the FOXC1 and CEP55 genes, or the FOXC1, MELK and CEP55 genes, or the FOXA1, FOXC1, GRB7, CEP55, BIRC5, MELK and MIA genes, or the FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, or FOXC1, FOXA1, FGFR4, CEP55, BIRC5, SFRP1 and MIA genes, or FOXA1, MELK, SFRP1 and MIA genes, or CEP55, FOXA1, MELK, SFRP1 and MIA genes, or FOXC1, CEP55, FOXA1, MELK, SFRP1 and MIA genes Or the FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA genes, or the FOXC1, FGFR4, CEP55, FOXA1, MELK, SFRP1 and MIA genes, or the FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes, or FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes or, FOXC1, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes or, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes, or FOXC1, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1 and MIA genes or, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes, or FOXA1, FGFR4, BIRC5, MELK, SFRP1 and M IA gene or CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes or, FOX1C, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes, or GRB7, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes or FOXC1, GRB7, CEP55, FOXA1, FGFR4, BIRC5, MELK, SFRP1 and MIA genes, or FOXA1, FOXC1, CEP55, SFRP1, MIA and KRT14 genes, or FGFR4, FOXA1, FOXC1, CEP55, SFRP1 MIA and KRT14 genes or FGFR4, FOXA1, BIRC5, CEP55, SFRP1, MIA and KRT14 genes or, FOXC1, FGFR4, FOXA1, BIRC5, CEP55, SFRP1, MIA and KRT14 genes, or FOXA1, FOXC1, CEP55, MELK, SFRP1 MIA and KRT14 genes or FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA and KRT14 genes or, FOXC1, FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA and KRT14 genes, or FOXC1, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14 genes or, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14 genes or, FOXC1, FGFR4, FOXA1, GRB7, CEP55, MELK, SFRP1, MIA and KRT14 genes FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14 genes, or FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14 genes, or FOXC1, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14 genes or, GRB7, FOXA1, FGFR4, CEP55, BIRC5, 
MELK, SFRP1, MIA and KRT14 genes or, FOXC1, GRB7, FOXA1, FGFR4, CEP55, BIRC5, MELK, SFRP1, MIA and KRT14 genes, or FOXA1, FGFR4, CEP55, SFRP1 and ANLN genes or, FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA and BAG1 genes or, FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5 BAG1, MLPH, MELK, KRT14 and BLVRA genes or FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17 and CCNB1 genes, or FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14 and BLVRA genes or, FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC1, MIA, BAG MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes.
Accordingly, by reducing the number of genes used in the breast cancer molecular subtype determination model, the present invention can determine the molecular subtype of breast cancer from the ratios of the expression levels of specific genes while maintaining accuracy and minimizing resource consumption.
Next, the expression levels of the obtained differentially expressed gene set are input to a classification model having parameters for that gene set (S220). Preferably, the classification model may be defined by a matrix of expression ratio values, that is, ratios of the expression levels of two genes selected from a plurality of breast-cancer-associated genes. More preferably, the classification model may be defined by a matrix whose entries are the differences between the base-2 logarithms of the expression levels of two differentially expressed genes.
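The log-difference matrix described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `log_ratio_matrix`, the dictionary input format and the expression values are hypothetical and do not come from the specification, and all expression values are assumed to be strictly positive so that the logarithm is defined.

```python
import math

def log_ratio_matrix(expression):
    """Return {(gene_i, gene_j): log2(x_i) - log2(x_j)} for every ordered gene pair.

    `expression` maps a gene symbol to a strictly positive expression value.
    The difference of base-2 logs equals log2(x_i / x_j), i.e. a log expression ratio.
    """
    genes = sorted(expression)
    log2_expr = {g: math.log2(expression[g]) for g in genes}
    return {
        (gi, gj): log2_expr[gi] - log2_expr[gj]
        for gi in genes
        for gj in genes
        if gi != gj
    }

# Hypothetical expression values, chosen only to make the arithmetic easy to follow.
example = {"FOXC1": 8.0, "CEP55": 2.0}
d = log_ratio_matrix(example)
print(d[("FOXC1", "CEP55")])  # log2(8) - log2(2)
```

Because each entry depends only on the ratio of two genes measured in the same sample, a sample-wide scaling factor cancels out of every entry.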
Next, the breast cancer molecular subtype of the tumor cells is determined using the classification model (S230). The classifiable molecular subtypes of breast tumors may include the basal, luminal A, luminal B, normal-like, normal and HER2 types. However, the subtypes are not limited thereto, and those skilled in the art may name the molecular subtypes of breast tumors in various ways.
Finally, the determined breast cancer molecular subtype is provided (S240). Providing the molecular subtype of the breast tumor through the providing step (S240) can supply information for predicting treatment according to the subtype. In addition, by providing various combinations of differentially expressed gene sets, the breast cancer molecular subtype classification method according to an embodiment of the present invention, and the classification device that follows it, can provide a way of determining breast cancer molecular subtypes that reduces the number of genes used in the determination model, minimizing resource consumption while maintaining accuracy.
Accordingly, the breast cancer molecular subtype classification method according to an embodiment of the present invention, and the classification device using it, can provide information for setting the treatment direction of a breast cancer patient and, further, information on the prognosis of breast cancer.
Hereinafter, with reference to FIGS. 3A and 3B and [Table 1], a method of determining the classification model and the differentially expressed gene set for breast cancer molecular subtype classification, in the classification method according to an embodiment of the present invention, is described in detail. For clarity, the reference numerals of FIG. 1 described above are used together. Furthermore, the breast-cancer-associated genes (for example, PAM 50) and the differentially expressed genes are described below as separate categories, but the invention is not limited thereto. For example, the breast-cancer-associated genes and the genes constituting the differentially expressed gene set may be identical, or the breast-cancer-associated genes may include the genes constituting the differentially expressed gene set.
FIG. 3A is a flowchart illustrating the training procedure for breast cancer molecular subtype classification in the classification device according to an embodiment of the present invention. FIG. 3B is a schematic diagram illustrating the classification model provided by the classification method according to an embodiment of the present invention.
First, to train the breast cancer molecular subtype classification, expression level data for a plurality of breast-cancer-associated genes, measured by RNA sequencing or by microarray, are obtained for samples having different molecular subtypes (S310). The TCGA BRCA data provided by TCGA may be used for this purpose, but various other data may be used without limitation. [Table 1] shows the numbers of RNA sequencing samples and microarray samples in the TCGA BRCA data, classified into five breast cancer molecular subtypes according to the gold standard established from the PAM 50 genes associated with breast cancer molecular subtypes.
Using the samples of each molecular subtype as the ground truth for classification, the breast cancer molecular subtype classification device according to an embodiment of the present invention can be trained, and its accuracy can also be evaluated.
Next, based on the obtained data, breast-cancer-associated genes whose expression differs between molecular subtypes are selected (S320). In the gene selection step (S320), the Wilcoxon rank-sum test is used to select a plurality of genes whose expression levels differ relatively strongly according to the molecular subtype of the breast tumor. Among these, the genes that belong to the PAM 50 genes described above are selected as the breast-cancer-associated genes.
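As a rough illustration of the Wilcoxon rank-sum test named in step S320, the sketch below computes the rank sum of one group and its normal-approximation z-score. It is an assumption-laden sketch, not the procedure of the specification: the function name `rank_sum_z` and the sample values are hypothetical, tied values receive average ranks, no tie correction is applied to the variance, and the specification does not state which implementation or significance threshold was used.

```python
import math

def rank_sum_z(group_a, group_b):
    """Return (W, z): the rank sum of group_a and its normal-approximation z-score."""
    # Pool both groups, remembering which group each value came from (0 = a, 1 = b).
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    na, nb = len(group_a), len(group_b)
    mean_w = na * (na + nb + 1) / 2
    sd_w = math.sqrt(na * nb * (na + nb + 1) / 12)  # no tie correction
    return w, (w - mean_w) / sd_w

# Hypothetical expression values of one gene in two subtypes.
w, z = rank_sum_z([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
w_tied, _ = rank_sum_z([1.0, 2.0], [2.0, 3.0])
```

A gene with a large absolute z-score differs strongly between the two groups; repeating the test per gene and per subtype comparison gives a ranking from which candidate genes could be picked.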
Next, a matrix consisting of expression ratio values for two genes selected from the plurality of breast-cancer-associated genes is determined (S330). Referring to FIG. 3B, the inter-gene expression ratio matrix (370) can be determined based on the gene expression matrix (360). Specifically, the gene expression matrix (360) consists of the expression values of the plurality of breast-cancer-associated genes calculated by [Equation 1].
Here, e_i is the base-2 logarithm of the expression value of gene i selected from the plurality of breast-cancer-associated genes, and e_j is the base-2 logarithm of the expression value of gene j selected from the plurality of breast-cancer-associated genes.
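[Equation 1] itself is not reproduced in this text. A form consistent with the definitions of e_i and e_j, and with the earlier statement that the matrix entries are differences between base-2 logarithms of two expression levels, would be the following. The symbols x_i and x_j for the raw expression values are introduced here only for illustration, and the exact form in the original may differ.

```latex
d_{ij} = e_i - e_j = \log_2 x_i - \log_2 x_j = \log_2\!\left(\frac{x_i}{x_j}\right)
```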
The inter-gene expression ratio matrix (370) may consist of the expression ratio values between two breast-cancer-associated genes (d'_ij values), obtained by correcting the d_ij values with [Equation 2] below.
Here, α denotes a reference value for correcting the d_ij values; α may be 0, 0.01, 0.10, 0.15, 0.20 or 1.0, but is not limited thereto. The matrix may vary according to the α value applied. Through the matrix determination step (S330), the inter-gene expression ratio matrix (370) can be determined, and this matrix may be defined as the classification model in the breast cancer tumor cell classification device according to an embodiment of the present invention.
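Next, based on the determined inter-gene expression ratio matrix (370), machine learning is performed with various classification algorithms and the results are evaluated (S340). Specifically, the CART, Random Forest, SVM and Naive Bayes algorithms are used for machine learning. To determine the algorithm for a highly accurate breast cancer molecular subtype classification device (100), the determined inter-gene expression ratio matrix (370) is split 8 (training set) : 2 (evaluation set). On the training set, 5-fold cross validation is repeated 100 times. Accuracy is evaluated using the TCGA BRCA samples of each molecular subtype described above as the ground truth. The evaluation set is then used to assess the accuracy of the molecular subtype classification device (100) configured with the candidate classification algorithm, candidate α value and candidate differentially expressed gene set that showed high accuracy on the training set. The evaluation set is evaluated using samples other than the TCGA BRCA samples used in the training set.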
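The data partitioning described above (an 8:2 split followed by 5-fold cross validation repeated 100 times on the training portion) can be sketched with the standard library alone. The function names, the fixed random seed, the sample count of 100 and the unstratified shuffling are all assumptions made for illustration; the specification does not say whether the splits were stratified by subtype or how the randomization was done, and the classifier itself is omitted.

```python
import random

def split_8_2(n_samples, seed=0):
    """Shuffle sample indices and split them 8:2 into (training, evaluation)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(n_samples * 0.8)
    return indices[:cut], indices[cut:]

def repeated_kfold(train_idx, k=5, repeats=100, seed=0):
    """Yield (fit_idx, valid_idx) pairs for `repeats` rounds of k-fold cross validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = train_idx[:]
        rng.shuffle(shuffled)
        # Deal the shuffled indices into k folds of (nearly) equal size.
        folds = [shuffled[f::k] for f in range(k)]
        for f in range(k):
            valid_idx = folds[f]
            fit_idx = [i for g in range(k) if g != f for i in folds[g]]
            yield fit_idx, valid_idx

train, evaluation = split_8_2(100)
pairs = list(repeated_kfold(train))
```

Each yielded pair would be used to fit one candidate configuration (algorithm, α value, gene set) on `fit_idx` and score it on `valid_idx`; the held-out `evaluation` indices are touched only once the candidates have been chosen.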
Finally, the breast cancer molecular subtype classification algorithm, the α value and the differentially expressed gene set are determined for RNA sequencing data and for microarray data, respectively (S350). Specifically, based on the results of the evaluation step (S340), the determining step (S350) selects, for RNA sequencing analysis and for microarray analysis, the algorithm with high subtype classification accuracy among the four classification algorithms described above, the α value with high classification accuracy, and a differentially expressed gene set with high classification accuracy that includes the first gene set, the second gene set or the third gene set. Furthermore, the breast cancer molecular subtype classification device of the present invention is configured with the high-accuracy algorithm and the high-accuracy α value, and uses the high-accuracy differentially expressed gene set. The molecular subtype classification device (100) can thereby classify the molecular subtypes of breast tumors effectively. In addition, by providing the various combinations of differentially expressed gene sets determined in the determining step (S350), the classification method according to an embodiment of the present invention, and the classification device (100) that follows it, can provide a way of determining breast cancer molecular subtypes that reduces the number of genes used in the determination model, minimizing resource consumption while maintaining accuracy.
Example 1: Evaluation of Differentially Expressed Gene Sets and Classification Algorithms
Hereinafter, with reference to FIGS. 4A, 4B, 5A and 5B, the evaluation of the differentially expressed gene sets, the classification algorithms and the α values provided by the breast cancer molecular subtype classification method according to an embodiment of the present invention is described.
FIG. 4A shows evaluation results on RNA sequencing data for the various differentially expressed gene sets provided by the breast cancer molecular subtype classification method according to an embodiment of the present invention. FIG. 4B shows the corresponding evaluation results on microarray data. FIG. 5A shows evaluation results on RNA sequencing data for each breast cancer molecular subtype classification algorithm in the classification method according to an embodiment of the present invention. FIG. 5B shows the corresponding evaluation results on microarray data.
FIG. 4A shows the various combinations of differentially expressed gene sets evaluated on RNA sequencing data. Specifically, as described above, a suitable differentially expressed gene set for RNA sequencing data may consist of a combination of genes from the first gene set, which includes the ESR1, PGR, ERBB2 and MKI67 genes, and the second gene set, which includes the FOXA1, FOXC1, MLPH, FGFR4, GRB7, BCL2, CEP55, MYBL2, KRT17, SFRP1, KRT14, MELK, KRT5, MIA, KIF2C, EGFR, NAT1 and ANLN genes. The classification accuracy of differentially expressed gene sets No. 1 to No. 41 provided by the present invention for RNA sequencing data is about 87 %, a high level. In particular, sets No. 17 and No. 34 show a classification accuracy of 89.41 %. Set No. 37 shows a classification accuracy of 85 %, lower than that of set No. 17 or No. 34. It should be noted, however, that set No. 37 classifies breast cancer molecular subtypes using only four genes of the second gene set, MLPH, FGFR4, CEP55 and KRT17, and thus reaches a classification accuracy of 85 % or more while minimizing resource consumption. Accordingly, the preferred differentially expressed gene set for RNA sequencing data may be No. 17 or No. 34, or further No. 37, but is not limited thereto, and the various sets in FIG. 4A may be provided as differentially expressed gene sets. According to FIG. 4A, when the ESR1, PGR, ERBB2, MKI67, FOXA1, FOXC1, MLPH, FGFR4, BCL2, CEP55, MYBL2, KRT17 and KRT14 genes are included in the differentially expressed gene set (or when the SFRP1 gene is additionally included), the breast cancer molecular subtype classification accuracy is much higher than for sets of other gene combinations.
FIG. 4B shows the various combinations of differentially expressed gene sets evaluated on microarray data. Specifically, a suitable differentially expressed gene set for microarray data may consist of a combination of genes from the first gene set described above, which includes the ESR1, PGR, ERBB2 and MKI67 genes, and the third gene set, which includes the FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes. As with the results of FIG. 4A, the classification accuracy of differentially expressed gene sets No. 1 to No. 41 provided by the present invention for microarray data is about 86 %, a high level. Sets No. 27 and No. 33 show classification accuracies of 89.41 % and 88.24 %, respectively, higher than the remaining sets. In particular, set No. 41 shows the highest classification accuracy, 92.94 %. Set No. 3 shows a classification accuracy of 84.71 %, lower than that of set No. 27, No. 33 or No. 41. It should be noted, however, that set No. 3 classifies breast cancer molecular subtypes using only two genes of the third gene set, FOXC1 and CEP55, and thus reaches a classification accuracy of about 85 % while minimizing resource consumption. Accordingly, the preferred differentially expressed gene set for microarray data may be No. 27, No. 33 or No. 41, or further No. 3, but is not limited thereto, and the various sets in FIG. 4B may be provided as differentially expressed gene sets. According to FIG. 4B, when the ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, SFRP1, ANLN, FOXC1, GRB7, BIRC5, MIA, BAG1, MLPH, MELK, KRT14, BLVRA, BCL2, UBE2C, KRT17, CCNB1, NAT1, SLC39A6, CDC20, KRT5 and CCNE1 genes are included in the differentially expressed gene set, the breast cancer molecular subtype classification accuracy is much higher than for sets of other gene combinations. However, a differentially expressed gene set including the ESR1, PGR, ERBB2, MKI67, FOXA1, FGFR4, CEP55, MELK, SFRP1, MIA and KRT14 genes (or additionally including the BIRC5 gene) can also show high breast cancer molecular subtype classification accuracy on RNA sequencing data while using fewer resources.
Referring to FIG. 5A, among the four breast cancer molecular subtype classification algorithms described above, Random Forest is the most accurate on RNA sequencing data. Specifically, the classification accuracy with Random Forest was about 88 %, considerably higher than with the other classification algorithms. In addition, the classification accuracy was highest, 88.49 %, when the α value was set to 0.01.
Referring to FIG. 5B, among the four breast cancer molecular subtype classification algorithms described above, Random Forest is also the most accurate on microarray data, as in FIG. 5A. Specifically, the classification accuracy with Random Forest was about 90 %, considerably higher than with the other classification algorithms. In addition, the classification accuracy was highest, 90.23 %, when the α value was set to 1.00.
Based on these results, RNA sequencing data and microarray data can be supplied to a breast cancer molecular subtype classification device configured with Random Forest as its breast tumor molecular subtype classification algorithm. Furthermore, by providing differentially expressed gene sets that are effective for RNA sequencing data and for microarray data respectively, the classification method of the present invention, and the classification device using it, enable faster and more accurate analysis than the conventional method, which classified breast cancer molecular subtypes based on the expression levels of all 50 PAM 50 genes. Accordingly, by reducing the number of genes used in the subtype determination model determined through machine learning, the classification method and device of the present invention can determine the molecular subtype of breast cancer while maintaining accuracy and minimizing resource consumption. Furthermore, the classification method and device of the present invention can classify the breast cancer molecular subtype by measuring the expression levels of the differentially expressed gene set in tumor cells, without comparing gene expression levels between a target sample and a tumor sample.
Example 2: Comparison of the Breast Cancer Molecular Subtype Classification Method of the Present Invention with a Conventional Method
Hereinafter, with reference to FIG. 6 and [Table 2], the breast cancer molecular subtype classification method according to an embodiment of the present invention is compared with a conventional breast cancer molecular subtype classification method.
FIG. 6 shows the results of comparing breast cancer molecular subtype classification by the method according to an embodiment of the present invention with classification by a conventional method.
Specifically, AIMS is a method that classifies the molecular subtype of breast cancer based on the greater-than or less-than relationship between the expression levels of two genes, which differs according to the molecular subtype. Accuracy is evaluated based on the PAM 50 genes of the genefu R package. Measured against the PAM 50 gold standard, the classification model of the present invention has a classification accuracy of 94.92 % on RNA sequencing data and 81.36 % on microarray data. That is, the classification model of the present invention can provide more accurate breast cancer molecular subtype classification than AIMS, whose classification accuracy on RNA sequencing data and on microarray data is 64.41 %.
[Table 2] shows the results of evaluating AIMS on the same samples as the TCGA BRCA samples described above, together with the number of samples.
[Table 2] shows the classification accuracy obtained with AIMS, in contrast to the classification accuracy of the present invention evaluated in FIGS. 5A and 5B, which was about 88 % on RNA sequencing data and about 90 % on microarray data. AIMS classified 317 of 432 samples correctly on RNA sequencing data (73.38 %) and 333 of 432 samples correctly on microarray data (77.08 %), confirming that breast cancer molecular subtype classification with the classification model of the present invention is more accurate than with AIMS.
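The two percentages quoted for AIMS follow directly from the stated counts, as the short check below shows. The helper name `accuracy_percent` is hypothetical; only the counts 317, 333 and 432 come from the text.

```python
def accuracy_percent(n_correct, n_total):
    """Return classification accuracy as a percentage rounded to two decimals."""
    return round(100.0 * n_correct / n_total, 2)

aims_rnaseq = accuracy_percent(317, 432)      # RNA sequencing data
aims_microarray = accuracy_percent(333, 432)  # microarray data
print(aims_rnaseq, aims_microarray)  # 73.38 77.08
```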
Although embodiments of the present invention have been described in detail with reference to the accompanying drawings, the present invention is not necessarily limited to these embodiments and may be modified in various ways without departing from its technical idea. The embodiments disclosed herein are therefore intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The embodiments described above should accordingly be understood as illustrative in all respects and not restrictive. The scope of protection of the present invention should be interpreted according to the following claims, and all technical ideas within their equivalent scope should be interpreted as falling within the scope of the present invention.
[Description of Reference Numerals]
100: breast cancer molecular subtype classification device
110: communication unit
120: input unit
130: display unit
140: storage unit
150: processor
S210: obtaining an expression level
S220: inputting the expression level
S230: determining the breast cancer molecular subtype
S240: providing the breast cancer molecular subtype
S310: obtaining expression level data
S320: selecting breast cancer-associated genes
S330: determining the metrics
S340: performing machine learning and evaluation for breast cancer molecular subtype classification
S350: determining a classification algorithm, a differentially expressed gene set, and an α value
Claims (18)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020160170132A KR101874716B1 (en) | 2016-12-14 | 2016-12-14 | Methods for classifyng breast cancer subtypes and a device for classifyng breast cancer subtypes using the same |
| KR10-2016-0170132 | 2016-12-14 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2018110903A2 (en) | 2018-06-21 |
| WO2018110903A3 (en) | 2018-08-09 |
Family
ID=62559072
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2017/014345 Ceased WO2018110903A2 (en) | 2016-12-14 | 2017-12-08 | Classification method of molecular subtype of breast cancer and classification device of molecular subtype of breast cancer using same |
Country Status (2)
| Country | Link |
|---|---|
| KR (1) | KR101874716B1 (en) |
| WO (1) | WO2018110903A2 (en) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113278700A (en) * | 2021-06-04 | 2021-08-20 | 浙江省肿瘤医院 | Primer group and kit for breast cancer typing and prognosis prediction |
| CN114127314A (en) * | 2019-07-19 | 2022-03-01 | 公立大学法人福岛县立医科大学 | Genetic genomes, methods and kits for identifying or classifying subtypes (subtypes) of breast cancer |
| WO2023034955A1 (en) * | 2021-09-02 | 2023-03-09 | University Of Pittsburgh-Of The Commonwealth System Of Higher Education | Machine learning-based systems and methods for predicting liver cancer recurrence in liver transplant patients |
| WO2023089146A1 (en) * | 2021-11-19 | 2023-05-25 | The Institute Of Cancer Research: Royal Cancer Hospital | Prognostic and treatment response predictive method |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR102204958B1 (en) * | 2019-10-28 | 2021-01-20 | 삼성에스디에스 주식회사 | Processing method for result of medical examination |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP1476568A2 (en) * | 2002-02-20 | 2004-11-17 | NCC Technology Ventures Pte Limited | Materials and methods relating to cancer diagnosis |
| HRP20140140T1 (en) * | 2008-05-30 | 2014-05-23 | The University Of North Carolina At Chapel Hill | Gene expression profiles to predict breast cancer outcomes |
| WO2013075059A1 (en) * | 2011-11-18 | 2013-05-23 | Vanderbilt University | Markers of triple-negative breast cancer and uses thereof |
- 2016-12-14: KR application KR1020160170132A, patent KR101874716B1 (en), status: not active, Expired - Fee Related
- 2017-12-08: WO application PCT/KR2017/014345, publication WO2018110903A2 (en), status: not active, Ceased
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114127314A (en) * | 2019-07-19 | 2022-03-01 | 公立大学法人福岛县立医科大学 | Genetic genomes, methods and kits for identifying or classifying subtypes (subtypes) of breast cancer |
| EP4001431A4 (en) * | 2019-07-19 | 2023-09-27 | Public University Corporation Fukushima Medical University | DISTINCTIVE MARKER GENE SET, METHOD AND KIT FOR DIFFERENTIATION OR CLASSIFICATION OF BREAST CANCER SUBTYPES |
| CN113278700A (en) * | 2021-06-04 | 2021-08-20 | 浙江省肿瘤医院 | Primer group and kit for breast cancer typing and prognosis prediction |
| CN113278700B (en) * | 2021-06-04 | 2022-08-09 | 浙江省肿瘤医院 | Primer group and kit for breast cancer typing and prognosis prediction |
| WO2023034955A1 (en) * | 2021-09-02 | 2023-03-09 | University Of Pittsburgh-Of The Commonwealth System Of Higher Education | Machine learning-based systems and methods for predicting liver cancer recurrence in liver transplant patients |
| US20240404707A1 (en) * | 2021-09-02 | 2024-12-05 | University Of Pittsburgh-Of The Commonwealth System Of Higher Education | Machine learning-based systems and methods for predicting liver cancer recurrence in liver transplant patients |
| WO2023089146A1 (en) * | 2021-11-19 | 2023-05-25 | The Institute Of Cancer Research: Royal Cancer Hospital | Prognostic and treatment response predictive method |
Also Published As
| Publication number | Publication date |
|---|---|
| KR101874716B1 (en) | 2018-07-04 |
| KR20180068444A (en) | 2018-06-22 |
| WO2018110903A3 (en) | 2018-08-09 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| WO2018110903A2 (en) | Classification method of molecular subtype of breast cancer and classification device of molecular subtype of breast cancer using same | |
| Lopez et al. | Genomic evidence for local adaptation of hunter-gatherers to the African rainforest | |
| Qi et al. | Cell-type heterogeneity: Why we should adjust for it in epigenome and biomarker studies | |
| Liu et al. | LRRK2 but not ATG16L1 is associated with Paneth cell defect in Japanese Crohn’s disease patients | |
| Yao et al. | Genomic profiling of NETs: a comprehensive analysis of the RADIANT trials | |
| Parnell et al. | Aberrant cell cycle and apoptotic changes characterise severe influenza A infection–a meta-analysis of genomic signatures in circulating leukocytes | |
| CN111033631A (en) | Systems and methods for generating, visualizing, and classifying molecular functional profiles | |
| Lin et al. | Characteristics of gut microbiota in patients with GH-secreting pituitary adenoma | |
| Arbitrio et al. | Polymorphic Variants in NR 1I3 and UGT 2B7 Predict Taxane Neurotoxicity and Have Prognostic Relevance in Patients With Breast Cancer: A Case‐Control Study | |
| Fu et al. | Identification of hub genes using co-expression network analysis in breast cancer as a tool to predict different stages | |
| Trastulla et al. | Computational estimation of quality and clinical relevance of cancer cell lines | |
| Wang et al. | Single-cell RNA sequencing reveals novel gene expression signatures of trastuzumab treatment in HER2+ breast cancer: a pilot study | |
| Rehn et al. | RaScALL: rapid (Ra) screening (Sc) of RNA-seq data for prognostically significant genomic alterations in acute lymphoblastic leukaemia (ALL) | |
| Çalışkan et al. | AI/ML advances in non-small cell lung cancer biomarker discovery | |
| Fortuna et al. | Circulating tumor DNA: Where are we now? A mini review of the literature | |
| Wang et al. | Development and validation of epithelial mesenchymal transition-related prognostic model for hepatocellular carcinoma | |
| Machida et al. | Characterizing tyrosine phosphorylation signaling in lung cancer using SH2 profiling | |
| Chen et al. | A gene signature based method for identifying subtypes and subtype-specific drivers in cancer with an application to medulloblastoma | |
| Durrani et al. | Integrated bioinformatics analyses identifying potential biomarkers for type 2 diabetes mellitus and breast cancer: In SIK1-ness and health | |
| KR101966589B1 (en) | Methods for classifyng breast cancer subtypes and a device for classifyng breast cancer subtypes using the same | |
| Hernandez et al. | Patterns of pharmacogenetic variation in nine biogeographic groups | |
| Duan et al. | Gut microbiome signature in response to neoadjuvant chemoradiotherapy in patients with rectal cancer | |
| WO2024242440A1 (en) | Method and apparatus for predicting treatment response to immune checkpoint inhibitors | |
| Shen et al. | Ancestral origins are associated with SARS-CoV-2 susceptibility and protection in a Florida patient population | |
| Perera et al. | Detection of human leukocyte antigen class I loss of heterozygosity in solid tumor types by next-generation DNA sequencing |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17882145; Country of ref document: EP; Kind code of ref document: A2 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 17882145; Country of ref document: EP; Kind code of ref document: A2 |