WO2010108638A9 - Tumour gene profile - Google Patents
Tumour gene profile
- Publication number
- WO2010108638A9 (PCT/EP2010/001773)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- genes
- nsclc
- set forth
- expression levels
- samples
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
- C12Q2600/118—Prognosis of disease development
- C12Q2600/158—Expression markers
Definitions
- the present invention relates to the diagnosis, categorisation and prognosis of and for lung cancers, in particular non-small cell lung carcinomas (NSCLC).
- Lung cancer is the most frequent cause of cancer deaths in Europe. There were 386,300 new lung cancer cases in 2006, with an estimated 334,800 deaths. This accounts for 1 .5% of all cancer deaths [1].
- lung cancer is in clinical practice sub-divided into four major histological subtypes: small cell lung cancer (SCLC), squamous cell carcinoma (SCC), adenocarcinoma (ADC), and large-cell carcinoma (LCC).
- Although staging can, to some degree, be a prognostic indicator, it is still far from precise.
- the increase in the predictive value of staging over the past decades is mostly due to the increase in the sensitivity of imaging techniques and not related to a better understanding of the biology of a tumor.
- a method for classifying a test tissue sample as a malignant non-small cell lung carcinoma (NSCLC) by analysis of gene expression comprising the steps of: (a) assaying the expression levels of 5 or more genes selected from the genes set forth in Table 1; (b) comparing the expression levels of 5 or more genes with the expression levels of said 5 or more genes in a known non-cancerous tissue sample; wherein a change in the expression levels of said 5 or more genes indicates that the test tissue sample is a malignant NSCLC sample.
- the 5 or more genes comprise, consist of or consist essentially of the genes set forth in Table 2.
- the expression of the 5 or more genes is analysed with a two-dimensional hierarchical clustering of the expression levels of 5 or more genes as set forth in Table 1; wherein a correlation between the expression levels of said 5 or more genes and the pattern of gene expression levels observed in the two-dimensional clustering indicates that the test tissue sample is a malignant NSCLC sample.
- the two-dimensional hierarchical clustering is performed using OmniViz (BioWisdom) software, for example as set forth below.
- a method for classifying a test tissue sample of a malignant non-small cell lung carcinoma (NSCLC) into LCC, ADC or SCC subtypes by analysis of gene expression comprising the steps of: (a) assaying the expression levels of 75 or more genes selected from the genes set forth in Table 3; (b) comparing the expression levels of said 75 or more genes with the expression levels of 75 or more genes from NSCLC tumour samples as set forth in Table 3; wherein a correlation between the expression levels of said 75 or more genes and the pattern of gene expression levels observed in Table 3 indicates that the test tissue sample is a malignant LCC, ADC or SCC subtype NSCLC sample.
- gene expression is analysed by two-dimensional hierarchical clustering, which provides a graphical representation of comparative gene expression and facilitates classification of NSCLC samples into the relevant subtypes.
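The two-dimensional clustering described above groups both genes and samples by the similarity of their expression profiles. The following is a minimal sketch of that idea in plain Python. It is not the OmniViz procedure: the distance measure (1 minus Pearson correlation), the average-linkage rule and all function names are assumptions chosen for illustration.

```python
from math import sqrt


def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # flat profile: treated as uncorrelated (assumption)
    return 1.0 - cov / sqrt(var_a * var_b)


def average_linkage(profiles):
    """Agglomerative clustering; returns the merge tree as nested index tuples."""
    clusters = {i: [i] for i in range(len(profiles))}
    trees = {i: i for i in clusters}
    next_id = len(profiles)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, i in enumerate(ids):
            for j in ids[pos + 1:]:
                # Average pairwise distance between members of the two clusters.
                total = sum(pearson_distance(profiles[p], profiles[q])
                            for p in clusters[i] for q in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        trees[next_id] = (trees.pop(i), trees.pop(j))
        next_id += 1
    return trees[next_id - 1]


def two_dimensional_clustering(matrix):
    """Cluster the rows (genes) and the columns (samples) of an expression matrix."""
    columns = [list(col) for col in zip(*matrix)]
    return average_linkage(matrix), average_linkage(columns)
```

Called on a matrix with one row per gene and one column per sample, `two_dimensional_clustering` returns one tree for the genes and one for the samples; samples that fall in the same branch have similar expression patterns across the chosen genes.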
- the 75 or more genes comprise, consist of or consist essentially of the 75 genes set forth in Table 4.
- the optimised signature set forth in this table allows classification of NSCLC into LCC, ADC or SCC subtypes by genetic analysis using a reduced population of probes.
- a method for predicting the survival time of a patient suffering from a non-small cell lung carcinoma (NSCLC) by analysis of gene expression comprising the steps of: (a) assaying the expression levels of 17 genes selected from the genes set forth in Table 5; (b) either (i) comparing the expression levels of said 17 genes with a two-dimensional hierarchical clustering of the expression levels of 17 genes from NSCLC tumour samples as set forth in Table 5; or (ii) fitting the expression levels of said 17 genes to a survival model to derive a prognostic index.
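Option (ii) above, fitting expression levels to a survival model to derive a prognostic index, can be pictured as a weighted sum of expression levels, as in the linear predictor of a Cox-type model. The sketch below assumes that form. The gene names, coefficients and cut-off are hypothetical placeholders and are not taken from Table 5.

```python
def prognostic_index(expression, coefficients):
    """Weighted sum of expression levels (linear predictor of a Cox-type model).

    `coefficients` maps gene name -> model coefficient (hypothetical values);
    `expression` maps gene name -> measured expression level.
    """
    missing = set(coefficients) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def risk_group(index, cutoff):
    """Assign a risk group by comparing the index with a chosen cut-off."""
    return "high risk" if index > cutoff else "low risk"
```

In such a model, a larger index corresponds to a higher estimated hazard; where the cut-off between groups is placed is a modelling choice that would have to be set on training data.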
- the methods according to the invention are in vitro methods.
- Figure 1 shows a flowchart of the combinations of diagnostic methodologies, leading to an assessment of clinical risk for each patient.
- the invention provides a method for assessing clinical risk in a patient suffering from or suspected of suffering from NSCLC, comprising the steps of: assaying the expression of genes in accordance with the tumour signature set forth in the first aspect of the invention, and determining if the patient is suffering from NSCLC; predicting the survival time of the patient according to the third aspect of the invention; classifying the NSCLC according to the second aspect of the invention; and combining the results of the two previous steps to provide an indication for treatment of the patient on the basis of tumour type and severity.
- kits for assessing the presence, subtype or severity of NSCLC comprise reagents for measuring the presence of mRNA or polypeptides encoded by the genes identified herein.
- kits may contain instructions as to use.
- the kits may contain instructions as to the selection of genes to be screened in the diagnosis of NSCLC as set forth herein.
- the genes are 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5.
- the kit may contain instructions for the detection of the gene products expressed from said mRNA species.
- any method for recognising the levels of expression of a gene may be used in the context of the present invention.
- the genes identified in each gene signature, and the changes in expression levels associated therewith, are identified in the Tables set out below; analysis can be made manually, or using automated means, to compare the expression levels observed in a test sample to those observed in a reference sample.
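An automated comparison of a test sample with a reference sample might look like the following sketch. It assumes strictly positive expression values on a linear scale and uses a log2 fold-change threshold; both the threshold and the function names are illustrative assumptions, not values from the Tables.

```python
from math import log2


def log2_fold_changes(test, reference):
    """Return log2(test / reference) for every gene measured in both samples."""
    return {gene: log2(test[gene] / reference[gene])
            for gene in reference if gene in test}


def changed_genes(test, reference, threshold=1.0):
    """Keep genes whose absolute log2 fold change reaches the threshold.

    The default threshold of 1.0 (a two-fold change) is an assumption.
    """
    changes = log2_fold_changes(test, reference)
    return {gene: value for gene, value in changes.items()
            if abs(value) >= threshold}
```

Positive values indicate higher expression in the test sample than in the reference, negative values lower expression.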
- kits in accordance with the invention may comprise any reagents suitable for measuring gene expression levels.
- Such reagents comprise reagents for measuring levels of mRNA, or cDNA derived from mRNA, and/or reagents suitable for measuring levels of polypeptide gene products.
- a kit may comprise nucleic acid probes which hybridise specifically to mRNA or cDNA specific for the appropriate gene signature, under appropriate conditions.
- the probes may be immobilised onto a solid surface, such as glass slides, membranes of various types, columns or beads, and may be in the form of an addressable array. If the probes are on an array, the identity of each probe is advantageously known as a result of the spatial arrangement on the array itself.
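The idea of an addressable array, where the position of a spot identifies its probe, can be sketched as a lookup from grid coordinates to probe names. The row-major layout and the function names are assumptions made for this illustration.

```python
def build_array_layout(probes, columns):
    """Assign each probe a (row, column) address in row-major order."""
    return {divmod(i, columns): probe for i, probe in enumerate(probes)}


def identify_signals(layout, intensities):
    """Translate spot intensities keyed by address into intensities keyed by probe."""
    return {layout[address]: value for address, value in intensities.items()}
```

For example, with three probes on a grid two columns wide, a signal measured at address `(1, 0)` is attributed to the third probe.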
- Probes may be used in solution, to probe nucleic acids derived from the sample.
- labelling means may be provided, to label either the probes or the sample nucleic acids.
- Primers may also be provided, to prime extension reactions for amplification and/or labelling of sample nucleic acids.
- the primers are specific for mRNA transcribed from the genes identified in the gene signatures set forth herein, or corresponding cDNA.
- kits may alternatively, or in addition, comprise reagents such as immunoglobulins, RNA or peptide aptamers and the like which are capable of specifically detecting the polypeptide gene products of the target genes.
- the present invention provides a diagnostic kit for use in characterising NSCLC tumours, comprising a set of reagents for specifically measuring the abundance of the mRNA species transcribed from the 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5; or the gene products expressed from said mRNA species.
- the reagents comprise a set of oligonucleotide primers or probes which hybridise specifically to said genes, which may advantageously be attached to a solid phase in the form of an array.
- the array consists of a library of oligonucleotides affixed to a solid phase, and said library of oligonucleotides consists substantially of oligonucleotides which are specific for the 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5.
- the reagents are selected from immunoglobulin molecules, RNA aptamers and peptide aptamers.
- the kit is for use in detecting the presence of NSCLC tumour tissue, comprising a set of nucleic acid probes or primers which recognise the transcripts of the genes set forth in Table 2.
- the kit is for use in differentiating between LCC, ADC or SCC subtypes of NSCLC, comprising a set of nucleic acid probes or primers which recognises the transcripts of the genes set forth in Table 4.
- the kit is for use in estimating the prognosis for survival of a patient suffering from NSCLC, comprising a set of nucleic acid probes or primers which recognises the transcripts of the genes set forth in Table 5.
- the kits may further include labelling means.
- immunoglobulins, RNA or peptide aptamers may be substituted for, or may supplement, the nucleic acid reagents in kits according to the invention.
- Figure 1 is a table representing the use of assays according to the invention in the assessment of patients diagnosed with or suspected of suffering from NSCLC.
- Figure 2 shows Kaplan-Meier plots for survival of NSCLC patients, separating the patients with good and poor prognoses as assessed using the gene signature set forth in Table 5. Light grey bars indicate the end of the follow-up.
- Figure 3 shows survival prediction by published prognostic signatures.
- Kaplan-Meier curves for the best performing signatures are shown for 82 Erasmus MC patients (left) and 89 Duke University NSCLC patients (right), fitted by their risk assignments. Grey bars indicate patients at last follow-up, still alive. P-values are between brackets if overall survival of the low risk group is actually lower than that of the high risk group.
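For readers unfamiliar with Kaplan-Meier plots, the estimator behind such curves can be sketched in a few lines. This is a generic textbook implementation, not the analysis used for the figures; the input format (a list of follow-up times and a parallel list of flags, `True` for death and `False` for a patient still alive at last follow-up) is an assumption.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] for each time with at least one death.

    times  -- follow-up time per patient
    events -- True if the patient died at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Censored patients lower the number at risk without producing a step in the curve, which is why they are usually drawn as separate marks on the plot.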
- NSCLC is "classified" by being identified as belonging to the SCC, LCC or ADC subtype, according to the terminology normally used in the art.
- the present invention provides a further level of classification of NSCLC, as set forth below and in Table 5.
- NSCLC can also be assessed as to severity by means of the present invention.
- a prognosis for the patient's survival can be derived by the methods provided herein. Survival prognosis involves providing an estimate of the likelihood that a patient will survive for a given time period, for example 5 years.
- the expression levels of genes are assayed in accordance with the present invention by measuring the levels of either nucleic acids or proteins encoded by the gene which are present in a sample. Expression levels are considered herein to be the amounts of mRNA or polypeptide which are present in a sample; they may be influenced, therefore, by for instance modulations in levels of transcription, translation, mRNA or protein turnover.
- Genes whose expression levels are described herein as being useful for identifying, classifying or measuring the severity of NSCLC are referred to as "target" genes; groups of target genes form gene signatures, which can be used to identify, classify or measure the severity of NSCLC.
- Nucleic acids are nucleic acids as is commonly understood in the art, and include DNA, RNA and artificial nucleic acids. In the context of the present invention, the levels of naturally-occurring nucleic acids will generally be measured using techniques known to those skilled in the art. Probes, primers and other nucleic acid molecules used in the present invention may comprise synthetic nucleotides or other modifications, as is known in the art.
- Reagents for measuring gene expression levels include nucleic acids and ligands, such as antibodies, which are capable of detecting the RNA or polypeptide products of the target genes described herein. Reagents may be selective, in that they bind to or detect only the RNA or polypeptide products of the target genes, or non-selective, capable of binding to or detecting a wider population of genes, with the selectivity being introduced in a later stage of the assay.
- assays can be conducted on arrays that comprise many genes in addition to the target genes, and the detection of changes in the expression levels of the target genes will be achieved by selective analysis of the arrays.
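Selective analysis of a general-purpose array amounts to reading out only the probes that belong to the signature of interest. A minimal sketch, assuming the array results are available as a mapping from gene name to measured level (the data layout and names are hypothetical):

```python
def select_signature(array_data, signature):
    """Split a signature into genes measured on the array and genes not found.

    array_data -- mapping of gene name -> measured expression level
    signature  -- iterable of target gene names
    """
    selected = {gene: array_data[gene] for gene in signature if gene in array_data}
    missing = [gene for gene in signature if gene not in array_data]
    return selected, missing
```

Reporting the missing genes separately makes it visible when an array does not cover the whole signature.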
- the Affymetrix Gene chip analyser is capable of identifying binding to probes on gene chip arrays, thereby measuring the degree of hybridisation to the probe sets representing genes on the array and, at the same time, identifying the probes to which hybridisation has occurred.
- primers may be used to selectively detect the RNA gene products of target genes.
- a "primer" is an oligonucleotide, whether produced naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase, and at a suitable temperature and pH).
- the primer is preferably single stranded for maximum efficiency in the initiation of the reaction, but may alternatively be double stranded.
- If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
- the primer is an oligodeoxyribonucleotide.
- the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
- The term "probe" refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to at least a portion of another oligonucleotide of interest.
- a probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention can be labelled with a reporter molecule so that it is detectable in any detection system, including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
- The term "sample" is used to denote biological samples which may be obtained from animals (including humans) and encompass fluids, solids, tissues, and gases.
- Biological samples include sputum and blood products, such as plasma, serum and the like.
- a sample is ordinarily a tissue sample obtained from normal tissue or a NSCLC.
- "Comparing", as used herein, includes comparison of expression levels of target genes directly with a control, as well as comparison with profiles, as described further herein. In comparisons according to the present invention, a match is sought between the pattern of gene expression seen in a sample and that seen in a control or in a predefined profile.
- The term "isolated", when used in relation to a nucleic acid, refers to a nucleic acid sequence that is identified and separated from at least one component or contaminant with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature.
- isolated polypeptides are polypeptides or proteins separated from at least one component or contaminant with which they are ordinarily associated in their natural source.
- the sample used for analysis comprises tissue sample, which includes tumour tissue, and in particular human lung cancer tumour tissue.
- tissue is, but is not limited to, epithelial tissue and connective tissue; other tissue types may be used as and if they occur in a lung tumour.
- NSCLC are comprised of epithelial tissue.
- Samples are obtained from surgically resected lungs, or may be obtained from patients by standard biopsy techniques.
- microdissection is used to ensure that the cell types subjected to analysis are the intended cell type.
- Normal samples can be obtained from the same patient, adjacent to the tumour, or from patients not suffering from cancer.
- normal samples will be of the same tissue type (i.e. epithelial tissue, connective tissue) as the tumour sample.
- the normal sample is used to establish reference expression profiles to distinguish normal from cancerous tissues, for example as in Table 1. If an analysis model is defined, for example using two-dimensional hierarchical clustering, it is only necessary to analyse a tumor sample from a patient rather than both a tumor sample and a normal sample from the same or different patients.
- the levels of mRNAs present in a sample which are encoded by the genes identified in Tables 1-5 may be measured directly. Analysis is conveniently carried out by labelling the RNA in cells from the sample and assaying the abundance of the desired mRNA species. To prepare RNA from tumour and/or normal samples, total or poly(A)+ RNA is processed according to any suitable technique, for example as set forth below, to produce cDNA and subsequently cRNA, which is conveniently used in microarray analysis.
- Copies of the cRNA or cDNA may be amplified, for example by RT-PCR. Fluorescent tags or digoxigenin-dUTP can then be enzymatically incorporated into the newly synthesized cDNA/cRNA or can be chemically attached to the new strands of DNA or RNA.
- the assessment of expression is performed by gene expression profiling using oligonucleotide-based arrays or cDNA-based arrays of any type; RT-PCR (reverse transcription-Polymerase Chain Reaction), real-time PCR, in-situ hybridisation, Northern blotting, serial analysis of gene expression (SAGE) for example as described by Velculescu et al Science 270 (5235): 484-487, or differential display. Details of these and other methods can be found for example in Sambrook et al, 1989, Molecular Cloning: A Laboratory Manual.
- the assessment uses a microarray assay.
Arrays
- Microarrays can be constructed by a number of available technologies. Array technology and the various techniques and applications associated with it are described generally in numerous textbooks and documents. Gene array technology is particularly suited to the practice of the present invention. Methods for preparing microarrays are well known in the art. These include Lemieux et al. (1998), Molecular Breeding 4, 277-289; Schena and Davis, Parallel Analysis with Biological Chips, in PCR Methods Manual (eds. M. Innis, D. Gelfand, J. Sninsky); Schena and Davis (1999), Genes, Genomes and Chips, in DNA Microarrays: A Practical Approach (ed. M.
- Major applications for array technology include the identification of sequence (nucleotide sequence/nucleotide sequence mutation) and the determination of expression level (abundance) of nucleotide sequences.
- Gene expression profiling may make use of array technology, optionally in combination with proteomics techniques (Celis et al., 2000, FEBS Lett, 480 (1): 2-16; Lockhart and Winzeler, 2000, Nature 405 (6788): 827-836; Khan et al., 1999, 20 (2): 223-9).
- any library may be arranged in an orderly manner into an array, by spatially separating the members of the library.
- libraries for arraying include nucleic acid libraries (including DNA, RNA, oligonucleotide and other nucleic acid libraries), peptide, polypeptide and protein libraries, as well as libraries comprising other types of molecules, such as ligand libraries. Accordingly, where reference is made to a "library", such reference includes reference to a library in the form of an array.
- the members of a library are generally fixed or immobilised onto a solid phase, preferably a solid substrate, to limit diffusion and admixing of the samples.
- the libraries may be immobilised to a substantially planar solid phase, including membranes and non-porous substrates such as plastic and glass.
- the samples are preferably arranged in such a way that indexing (i.e. reference or access to a particular sample) is facilitated.
- the samples are applied as spots in a grid formation.
- Common assay systems may be adapted for this purpose.
- an array may be immobilised on the surface of a microplate, either with multiple samples in a well, or with a single sample in each well.
- the solid substrate may be a membrane, such as a nitrocellulose or nylon membrane (for example, membranes used in blotting experiments).
- Alternative substrates include glass, or silica based substrates.
- the samples are immobilised by any suitable method known in the art, for example, by charge interactions, or by chemical coupling to the walls or bottom of the wells, or the surface of the membrane.
- Other means of arranging and fixing may be used, for example, pipetting, drop-touch, piezoelectric means, ink-jet and bubblejet technology, electrostatic application, etc.
- photolithography may be utilised to arrange and fix the samples on the chip.
- the samples may be arranged by being "spotted" onto the solid substrate; this may be done by hand or by making use of robotics to deposit the sample.
- arrays may be described as macroarrays or microarrays, the difference being the size of the sample spots.
- Macroarrays typically contain sample spot sizes of about 300 microns or larger and may be easily imaged by existing gel and blot scanners.
- the sample spot sizes in microarrays are typically less than 200 microns in diameter and these arrays usually contain thousands of spots.
- microarrays may require specialised robotics and imaging equipment, which may need to be custom made. Instrumentation is described generally in a review by Cortese, 2000, The Engineer 14 [11]: 26.
- targets and probes may be labelled with any readily detectable reporter such as a fluorescent, bioluminescent, phosphorescent, radioactive reporter.
- kits which comprise microarrays which are specific for a desired set of genes.
- microarrays according to the invention may consist of a solid phase and, immobilised thereto, a library of nucleic acid oligonucleotides or probes which consists substantially of one or more of the gene signatures identified herein, and listed in the Tables, especially Tables 2, 4, and 5.
- Such specialised microarrays are less expensive to produce than general purpose microarrays, and less difficult and expensive to analyse.
- the arrays according to the invention may comprise a library of oligonucleotides which is larger than, though still comprising, one or more of the gene signatures described herein, but still smaller than the set consisting of all known genes.
- such arrays may comprise gene signatures which are useful for detecting other forms of cancer, or other types of NSCLC, or which may provide different insights into the prognosis for NSCLC patients, or the like.
- Nucleic acid signatures in accordance with the invention may be detected by nucleic acid analysis which relies on amplification and/or sequencing of sample nucleic acids. Since the invention aims to measure gene expression, the methods used must quantitatively measure transcribed nucleic acid levels. The measured nucleic acids must therefore be mRNA, or nucleic acids derived quantitatively from mRNA such as cDNA.
- Such nucleic acid analysis typically requires nucleic acid amplification.
- Many amplification methods rely on an enzymatic chain reaction (such as a polymerase chain reaction, a ligase chain reaction, or a self-sustained sequence replication), a linear amplification procedure, or on the replication of all or part of the vector into which the desired sequence has been cloned.
- the amplification according to the invention is an exponential amplification, as exhibited by for example the polymerase chain reaction.
- amplification methods can be used in the methods of the present invention, and include polymerase chain reaction (PCR), PCR in situ, ligase amplification reaction (LAR), ligase hybridisation, Qbeta bacteriophage replicase, transcription-based amplification system (TAS), genomic amplification with transcript sequencing (GAWTS), nucleic acid sequence-based amplification (NASBA) and in situ hybridisation.
- Primers suitable for use in various amplification techniques can be prepared according to methods known in the art.
- PCR is a nucleic acid amplification method described inter alia in U.S. Pat. Nos. 4,683,195 and 4,683,202.
- PCR consists of repeated cycles of DNA polymerase-generated primer extension reactions.
- the target DNA is heat denatured and two oligonucleotides, which bracket the target sequence on opposite strands of the DNA to be amplified, are hybridised. These oligonucleotides become primers for use with DNA polymerase.
- the DNA is copied by primer extension to make a second copy of both strands. By repeating the cycle of heat denaturation, primer hybridisation and extension, the target DNA can be amplified a million fold or more in about two to four hours.
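The way two primers bracket a target on opposite strands can be illustrated with a small in silico check. The sketch below assumes exact matches, uppercase DNA written 5' to 3', and a single binding site per primer; the function names and sequences are hypothetical and real primer design involves many further considerations.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template, forward, reverse):
    """Return the template region bracketed by the two primers, or None.

    template -- one strand of the target DNA, written 5' to 3'
    forward  -- primer identical to the start of the region on this strand
    reverse  -- primer that anneals to this strand at the end of the region,
                i.e. the reverse complement of the template sequence there
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]
```

The returned sequence is the stretch that would be copied in each cycle: it begins with the forward primer and ends where the reverse primer anneals.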
- PCR is a molecular biology tool, which must be used in conjunction with a detection technique to determine the results of amplification.
- An advantage of PCR is that it increases sensitivity by amplifying the amount of target DNA by 1 million to 1 billion fold in approximately 4 hours.
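The fold-amplification figures quoted above follow from the doubling of target DNA in each cycle. As a rough worked example, assuming ideal doubling (the `efficiency` parameter is an assumption added here to show how imperfect cycles change the numbers):

```python
from math import ceil, log


def fold_amplification(cycles, efficiency=1.0):
    """Fold increase after a number of cycles; efficiency 1.0 means perfect doubling."""
    return (1 + efficiency) ** cycles


def cycles_needed(target_fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least the target fold increase."""
    return ceil(log(target_fold) / log(1 + efficiency))
```

Under ideal doubling, about 20 cycles give a million-fold amplification (2^20 = 1,048,576) and about 30 cycles give a billion-fold amplification (2^30 = 1,073,741,824).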
- PCR can be used to amplify any known nucleic acid in a diagnostic context (Mok et al. (1994), Gynaecologic Oncology, 52: 247-252).
Self-Sustained Sequence Replication (3SR)
- Self-sustained sequence replication is a variation of TAS, which involves the isothermal amplification of a nucleic acid template via sequential rounds of reverse transcriptase (RT), polymerase and nuclease activities that are mediated by an enzyme cocktail and appropriate oligonucleotide primers (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874).
- Enzymatic degradation of the RNA of the RNA/DNA heteroduplex is used instead of heat denaturation.
- RNase H and all other enzymes are added to the reaction and all steps occur at the same temperature and without further reagent additions.
- Ligation amplification reaction or ligation amplification system uses DNA ligase and four oligonucleotides, two per target strand. This technique is described by Wu, D. Y. and Wallace, R. B. (1989) Genomics 4:560. The oligonucleotides hybridise to adjacent sequences on the target DNA and are joined by the ligase. The reaction is heat denatured and the cycle repeated.
- RNA replicase for the bacteriophage Qβ, which replicates single-stranded RNA, is used to amplify the target DNA, as described by Lizardi et al. (1988) Bio/Technology 6: 1197.
- the target DNA is hybridised to a primer including a T7 promoter and a Qβ 5' sequence region.
- reverse transcriptase generates a cDNA connecting the primer to its 5' end in the process.
- the resulting heteroduplex is heat denatured.
- a second primer containing a Qβ 3' sequence region is used to initiate a second round of cDNA synthesis.
- T7 RNA polymerase then transcribes the double-stranded DNA into new RNA, which mimics the Qβ. After extensive washing to remove any unhybridised probe, the new RNA is eluted from the target and replicated by Qβ replicase. The latter reaction creates 10 fold amplification in approximately 20 minutes.
- rolling circle amplification (Lizardi et al. (1998) Nat Genet 19:225) is an amplification technology available commercially (RCAT™) which is driven by DNA polymerase and can replicate circular oligonucleotide probes with either linear or geometric kinetics under isothermal conditions.
- a geometric amplification occurs via DNA strand displacement and hyperbranching to generate 10^12 or more copies of each circle in 1 hour. If a single primer is used, RCAT generates, in a few minutes, a linear chain of thousands of tandemly linked DNA copies of a target covalently linked to that target.
- SDA strand displacement amplification
- SDA comprises both a target generation phase and an exponential amplification phase.
- In the target generation phase, double-stranded DNA is heat denatured, creating two single-stranded copies.
- a series of specially manufactured primers combine with DNA polymerase (amplification primers for copying the base sequence and bumper primers for displacing the newly created strands) to form altered targets capable of exponential amplification.
- the exponential amplification process begins with altered targets (single-stranded partial DNA strands with restriction enzyme recognition sites) from the target generation phase.
- An amplification primer is bound to each strand at its complementary DNA sequence.
- DNA polymerase uses the primer to identify a location to extend the primer from its 3' end, using the altered target as a template for adding individual nucleotides. The extended primer thus forms a double-stranded DNA segment containing a complete restriction enzyme recognition site at each end.
- a restriction enzyme is then bound to the double stranded DNA segment at its recognition site.
- the restriction enzyme dissociates from the recognition site after having cleaved only one strand of the double-stranded segment, forming a nick.
- DNA polymerase recognises the nick and extends the strand from the site, displacing the previously created strand.
- the recognition site is thus repeatedly nicked and restored by the restriction enzyme and DNA polymerase with continuous displacement of DNA strands containing the target segment.
- Each displaced strand is then available to anneal with amplification primers as above.
- the process continues with repeated nicking, extension and displacement of new DNA strands, resulting in exponential amplification of the original DNA target.
- Identification of nucleic acid sequences can for example be performed by primer extension or sequencing techniques. Such techniques may involve the parallel and/or serial processing of a large number of different template nucleic acid molecules.
- a library of probes on an array may be employed.
- a high sensitivity analytical technique may be used to characterize individually nucleic acid molecules which become immobilised on the array, by hybridisation to the probes.
- primer extension reactions may be used to incorporate labeled nucleotide(s) that can be individually detected in order to sequence individual molecules and/or determine the identity of at least one nucleotide position on individual nucleic acid molecules.
- Detection may involve labeling one or more of the primers and/or extension nucleotides with a detectable label (e.g., using fluorescent label(s), FRET label(s), enzymatic label(s), radio-label(s), etc.).
- Detection may involve imaging, for example using a high sensitivity camera and/or microscope (e.g., a super-cooled camera and/or microscope).
- Suitable techniques may be selected by one of ordinary skill in the art.
- high-throughput sequencing approaches are listed in Y. Chan, Mutation Research 573 (2005) 13-40 and include, but are not limited to, near-term sequencing approaches such as cycle-extension approaches, polymerase reading approaches and exonuclease sequencing, revolutionary sequencing approaches such as DNA scanning and nanopore sequencing, and direct linear analysis.
- Examples of current high-throughput sequencing methods are 454 (pyro)sequencing, Solexa Genome Analysis System, Agencourt SOLiD sequencing method (Applied Biosystems), MS-PET sequencing (Ng et al., 2006, http://nar.oxfordjournals.org/cgi/content/full/34/12/e84).
- a digital analysis (e.g., a digital amplification and subsequent analysis) may be performed to obtain a statistically significant quantitative result.
- Certain digital techniques are known in the art, see for example, US Patent No. 6,440,706 and US Patent No. 6,753,147, incorporated herein by reference.
- an emulsion-based method for amplifying and/or sequencing individual nucleic acid molecules may be used (e.g., BEAMing technology; International Published Application Nos. WO2005/010145, WO00/40712, WO02/22869, WO03/044187, WO99/02671, herein incorporated by reference).
- a sequencing method that can sequence single molecules in a biological sample may be used.
- Sequencing methods are known and being developed for high throughput (e.g., parallel) sequencing of complex genomes by sequencing a large number of single molecules (often having overlapping sequences) and compiling the information to obtain the sequence of an entire genome or a significant portion thereof.
- Suitable sequencing techniques may involve high speed parallel molecular nucleic acid sequencing as described in PCT Application No. WO 01/16375, US Application No. 60/151,580 and U.S. Published Application No. 20050014175, the entire contents of which are incorporated herein by reference.
- Other sequencing techniques are described in PCT Application No. WO 05/73410, PCT Application No. WO 05/54431, PCT Application No. WO 05/39389, PCT Application No.
- Sequencing techniques for use in connection with the invention may involve exposing a nucleic acid molecule to an oligonucleotide primer and a polymerase in the presence of a mixture of nucleotides. Changes in the fluorescence of individual nucleic acid molecules in response to polymerase activity may be detected and recorded.
- the specific labels attached to each nucleic acid and/or nucleotide may provide an emission spectrum allowing for the detection of sequence information for individual template nucleic acid molecules.
- a label is attached to the primer/template and a different label is attached to each type of nucleotide (e.g., A, T, U, C, or G). Each label emits a distinct signal which is distinguished from the other labels.
- Useful sequencing methods include high throughput sequencing using the 454 Life Sciences Instrument System (International Published Application No. WO2004/069849, filed January 28, 2004). Briefly, a sample of single stranded DNA is prepared and added to an excess of DNA capture beads which are then emulsified. Clonal amplification is performed to produce a sample of enriched DNA on the capture beads (the beads are enriched with millions of copies of a single clonal fragment). The DNA enriched beads are then transferred into PicoTiterPlate (TM) and enzyme beads and sequencing reagents are added. The samples are then analyzed and the sequence data recorded. Pyrophosphate and luciferin are examples of the labels that can be used to generate the signal.
- a label includes but is not limited to a fluorophore, for example green fluorescent protein (GFP), a luminescent molecule, for example aequorin or europium chelates, fluorescein, rhodamine green, Oregon green, Texas red, naphthofluorescein, or derivatives thereof.
- the polynucleotide is linked to a substrate.
- a substrate includes but is not limited to, streptavidin-biotin, histidine-Ni, S-tag-S-protein, or glutathione-S-transferase (GST).
- a substrate is pretreated to facilitate attachment of a polynucleotide to a surface
- the substrate can be glass which is coated with a polyelectrolyte multilayer (PEM), or the polynucleotide is biotinylated and the PEM-coated surface is further coated with streptavidin.
- PEM polyelectrolyte multilayer
- single molecule sequencing technology available from US Genomics, Mass., may be used.
- technology described, at least in part, in one or more of US patents 6,790,671; 6,772,070; 6,762,059; 6,696,022; 6,403,311; 6,355,420; 6,263,286; and 6,210,896 may be used.
- sequencing methods may be used to analyze DNA and/or RNA according to methods of the invention. It should be appreciated that a sequencing method does not have to be a single molecule sequencing method, since generally nucleic acid material from a substantial sample or biopsy will be available for analysis.
- the levels of polypeptides encoded by the genes identified in Tables 1-5 can be measured directly, without measuring mRNA levels.
- polypeptides can be detected by differential mobility on protein gels, or by other size analysis techniques such as mass spectrometry.
- Peptides derived from the gene signatures identified herein can be differentiated by size analysis.
- the detection means is sequence-specific, such that a particular gene product can accurately be identified as the product of a member of any given gene signature.
- polypeptide or RNA molecules can be developed which specifically recognise the desired gene products in vivo or in vitro.
- immunoglobulin molecules may be used to specifically bind to the target polypeptides, for instance in a western blot or ELISA.
- the immunoglobulins or the target polypeptides may be labelled, to provide a means of identification and measurement. Ideally, such measurements are carried out on an array of immunoglobulin molecules.
- An "immunoglobulin" is one of a family of polypeptides which retain the immunoglobulin fold characteristic of immunoglobulin (antibody) molecules, which contains two β sheets and, usually, a conserved disulphide bond.
- Members of the immunoglobulin superfamily are involved in many aspects of cellular and non-cellular interactions in vivo, including widespread roles in the immune system (for example, antibodies, T-cell receptor molecules and the like), involvement in cell adhesion (for example the ICAM molecules) and intracellular signalling (for example, receptor molecules, such as the PDGF receptor).
- Preferred immunoglobulins are antibodies, which are capable of binding to target antigens with high specificity.
- Antibodies can be whole antibodies, or antigen-binding fragments thereof.
- the invention includes fragments such as Fv and Fab, as well as Fab' and F(ab')2, and antibody variants such as scFv, single domain antibodies, Dab antibodies and other antigen-binding antibody-based molecules.
- polypeptides encoded by the genes set forth in Tables 1-5, or peptides derived therefrom, can be used to generate antibodies for use in the present invention.
- the peptides used preferably comprise an epitope which is specific for a polypeptide encoded by a gene in accordance with the invention.
- Polypeptide fragments which function as epitopes can be produced by any conventional means (see, for example, U.S. Pat. No. 4,631,211).
- antigenic epitopes preferably contain a sequence of at least 4, at least 5, at least 6, at least 7, more preferably at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, and, most preferably, between about 15 to about 30 amino acids.
- Preferred polypeptides comprising immunogenic or antigenic epitopes are at least 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 amino acid residues in length.
- Antibodies can be generated using antigenic epitopes of polypeptides according to the invention by immunising animals, such as rabbits or mice, with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 µg of peptide or carrier protein and Freund's adjuvant or any other adjuvant known for stimulating an immune response.
- Antibodies for use in the present invention can be fused to marker sequences, such as a peptide which facilitates purification of the fused polypeptide.
- the marker amino acid sequence is a hexa-histidine peptide, such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, Calif., 91311), among others, many of which are commercially available.
- hexa-histidine provides for convenient purification of the fusion protein.
- Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al. (1984) Cell 37:767).
- Antibodies as described herein can be altered antibodies comprising an effector protein such as a label.
- labels which allow the imaging of the distribution of the antibody in vivo.
- Such labels can be radioactive labels or radioopaque labels, such as metal particles, which are readily visualisable within the body of a patient. This can allow an assessment to be made without the need for tissue biopsies.
- they can be fluorescent labels or other labels which are visualisable on tissue.
- the antibody is preferably provided together with means for detecting the antibody, which can be enzymatic, fluorescent, radioisotopic or other means.
- the antibody and the detection means can be provided for simultaneous, simultaneous separate or sequential use, in a diagnostic kit intended for diagnosis.
- the antibodies for use in the invention can be assayed for immunospecific binding by any method known in the art.
- the immunoassays which can be used include but are not limited to competitive and noncompetitive assay systems using techniques such as western blots, radioimmunoassays, ELISA, sandwich immunoassays, immunoprecipitation assays, precipitin reactions, gel diffusion precipitin reactions, immunodiffusion assays, agglutination assays, complement-fixation assays, immunoradiometric assays, fluorescent immunoassays and protein A immunoassays.
- Such assays are routine in the art (see, for example, Ausubel et al, eds, 1994, Current Protocols in Molecular Biology, Vol.
- Immunoprecipitation protocols generally comprise lysing a population of cells in a lysis buffer such as RIPA buffer (1% NP-40 or Triton X-100, 1% sodium deoxycholate, 0.1% SDS, 0.15 M NaCl, 0.01 M sodium phosphate at pH 7.2, 1% Trasylol) supplemented with protein phosphatase and/or protease inhibitors (e.g., EDTA, PMSF, aprotinin, sodium vanadate), adding the antibody of interest to the cell lysate, incubating for a period of time (e.g., 1-4 hours) at 4°C, adding protein A and/or protein G sepharose beads to the cell lysate, incubating for about an hour or more at 4°C, washing the beads in lysis buffer and resuspending the
- Western blot analysis generally comprises preparing protein samples, electrophoresis of the protein samples in a polyacrylamide gel (e.g., 8%-20% SDS-PAGE depending on the molecular weight of the antigen), transferring the protein sample from the polyacrylamide gel to a membrane such as nitrocellulose, PVDF or nylon, blocking the membrane in blocking solution (e.g., PBS with 3% BSA or non-fat milk), washing the membrane in washing buffer (e.g., PBS-Tween 20), exposing the membrane to a primary antibody (the antibody of interest) diluted in blocking buffer, washing the membrane in washing buffer, exposing the membrane to a secondary antibody (which recognises the primary antibody, e.g., an anti-human antibody) conjugated to an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) or radioactive molecule
- ELISAs comprise preparing antigen, coating the well of a microtitre plate with the antigen, adding the antibody of interest conjugated to a detectable compound such as an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) to the well and incubating for a period of time, and detecting the presence of the antigen.
- a second antibody conjugated to a detectable compound can be added following the addition of the antigen of interest to the coated well.
- the binding affinity of an antibody to an antigen and the off-rate of an antibody-antigen interaction can be determined by competitive binding assays.
- a competitive binding assay is a radioimmunoassay comprising the incubation of labelled antigen (e.g., ³H or ¹²⁵I) with the antibody of interest in the presence of increasing amounts of unlabelled antigen, and the detection of the antibody bound to the labelled antigen.
- the affinity of the antibody of interest for a particular antigen and the binding off-rates can be determined from the data by Scatchard plot analysis.
- Competition with a second antibody can also be determined using radioimmunoassays.
- the antigen is incubated with antibody of interest conjugated to a labelled compound (e.g., ³H or ¹²⁵I) in the presence of increasing amounts of an unlabelled second antibody.
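The Scatchard analysis mentioned above can be sketched as follows. This is a minimal illustration, assuming single-site equilibrium binding, where a plot of bound/free against bound is a straight line with slope -1/Kd and x-intercept Bmax. The function name and the synthetic concentrations are hypothetical, and the sketch recovers equilibrium affinity only, not off-rates.

```python
def scatchard_fit(bound, free):
    """Least-squares line through the points (B, B/F); returns (Kd, Bmax)."""
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope            # slope of the Scatchard line is -1/Kd
    bmax = -intercept / slope    # x-intercept is Bmax
    return kd, bmax


# Synthetic, noise-free data (hypothetical values, arbitrary units)
true_kd, true_bmax = 2.0, 10.0
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [true_bmax * f / (true_kd + f) for f in free]

kd, bmax = scatchard_fit(bound, free)
```

With real radioimmunoassay counts the points scatter around the line, so the fitted values are estimates rather than exact recoveries.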
- Polypeptide levels may be measured using alternative peptide-specific reagents.
- Such reagents include peptide or RNA aptamers, which can specifically detect a defined polypeptide sequence. Proteins can be detected by protein gel assay, antibody binding assay, or other detection methods known in the art.
- RNA aptamers can be produced by SELEX. SELEX is a method for the in vitro evolution of nucleic acid molecules with highly specific binding to target molecules.
- the SELEX method involves selection of nucleic acid aptamers, single-stranded nucleic acids capable of binding to a desired target, from a library of oligonucleotides.
- the SELEX method includes steps of contacting the library with the target under conditions favourable for binding, partitioning unbound nucleic acids from those nucleic acids which have bound specifically to target molecules, dissociating the nucleic acid- target complexes, amplifying the nucleic acids dissociated from the nucleic acid-target complexes to yield a ligand-enriched library of nucleic acids, then reiterating the steps of binding, partitioning, dissociating and amplifying through as many cycles as desired to yield highly specific, high affinity nucleic acid ligands to the target molecule.
- SELEX is based on the principle that within a nucleic acid library containing a large number of possible sequences and structures there is a wide range of binding affinities for a given target.
- a nucleic acid library comprising, for example, a 20 nucleotide randomised segment can have 4^20 structural possibilities. Those which have the higher affinity constants for the target are considered to be most likely to bind.
- the process of partitioning, dissociation and amplification generates a second nucleic acid library, enriched for the higher binding affinity candidates. Additional rounds of selection progressively favour the best ligands until the resulting library is predominantly composed of only one or a few sequences. These can then be cloned, sequenced and individually tested for binding affinity as pure ligands.
- the iterative selection/amplification method is sensitive enough to allow isolation of a single sequence in a library containing at least 10^14 sequences.
- the nucleic acids of the library preferably include a randomised sequence portion as well as conserved sequences necessary for efficient amplification.
- Nucleic acid sequence variants can be produced in a number of ways including synthesis of randomised nucleic acid sequences and size selection from randomly cleaved cellular nucleic acids.
- the variable sequence portion can contain fully or partially random sequence; it can also contain subportions of conserved sequence incorporated with randomised sequence. Sequence variation in test nucleic acids can be introduced or increased by mutagenesis before or during the selection/amplification iterations and by specific modification of cloned aptamers.
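The library-size arithmetic and the enrichment principle described for SELEX can be illustrated with a toy simulation. This is only a sketch: each molecule is reduced to a single hypothetical binding probability, and the pool size, seed and number of rounds are arbitrary assumptions rather than values from the method.

```python
import random


def library_size(n_random_positions, alphabet=4):
    """Number of distinct sequences for a fully randomised segment."""
    return alphabet ** n_random_positions


def selex_round(pool, rng):
    """One toy round: bind (probability = affinity), partition, amplify to the original size."""
    bound = [a for a in pool if rng.random() < a]
    return [rng.choice(bound) for _ in range(len(pool))]


size_20mer = library_size(20)  # 4**20 = 1,099,511,627,776

rng = random.Random(0)
pool = [rng.random() for _ in range(5000)]  # hypothetical affinities in [0, 1)
start_mean = sum(pool) / len(pool)
for _ in range(5):
    pool = selex_round(pool, rng)
end_mean = sum(pool) / len(pool)
```

The mean affinity of the pool rises with each round because weak binders are preferentially lost at the partitioning step, which is the behaviour the text attributes to repeated selection and amplification.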
- the expression of genes in a test sample can be compared to the expression of genes in known tumour and normal samples, in order to determine whether tumour cells are present.
- the pattern of gene expression for the genes identified in the present application differs in the tumour sample from that in a normal sample.
- the tumour sample contains those cells which show the physiological and morphological characteristics associated with malignancy, including the ability for unrestricted independent growth and proliferation.
- a normal sample is a sample which does not comprise any cells which show the physiological and morphological characteristics associated with malignancy.
- a normal sample may be a tissue sample isolated from tissue adjacent to a tumour in a patient suffering from cancer. Alternatively, it may be a sample isolated from an individual not suffering from cancer. The normal sample acts as a control.
- Expression levels of genes identified herein may be greater or lower in a tumour sample compared to a normal sample.
- the identification of tumour or normal tissue depends not only on upregulation of certain genes, but on a general pattern of change in gene expression.
- comparison with a normal sample is no longer necessary.
- the presence of a tumour may be assessed by comparison with the pattern associated with the tumour.
- a hierarchical clustering analysis can be applied to construct gene profiles for the identification of tumor tissue. Hierarchical Cluster Analysis is defined as grouping or segmenting a collection of objects into subsets or "clusters".
- the objects to be clustered can be either the genes or the samples: genes can be clustered by comparing their expression profiles across the set of samples, or the samples can be clustered by comparing their expression profiles across the set of genes.
- the genes (or samples) within each cluster are more closely related to one another than genes (or samples) grouped within different clusters.
- the genes (or samples) are not partitioned into a particular cluster in a single step. Instead, a sequential merging of the genes (or samples), from low level to high level, takes place depending on the measurements of pair-wise similarity between expression profiles. At the highest level, there may be a single cluster containing all genes (or samples), while at the lowest level the clusters each consist of singleton genes (or samples).
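The bottom-up merging described above can be sketched as a small agglomerative procedure. The linkage rule (average linkage) and the distance (Euclidean) are assumptions made for illustration, since the text specifies neither at this point, and the four expression profiles are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(profiles):
    """Average-linkage agglomerative clustering.

    Starts from singleton clusters and repeatedly merges the closest pair.
    Returns the merge history as (members_a, members_b, distance) tuples.
    """
    clusters = [[i] for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pair_sum = sum(euclidean(profiles[p], profiles[q])
                               for p in clusters[i] for q in clusters[j])
                d = pair_sum / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical expression profiles: rows are samples (or genes), columns are measurements
profiles = [[1.0, 1.1, 0.9], [1.2, 1.0, 1.0], [5.0, 5.2, 4.9], [5.1, 5.0, 5.3]]
merges = agglomerate(profiles)
```

The merge history runs from the lowest level (singletons) to the highest level (one cluster containing everything), and cutting it at a chosen distance yields the clusters.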
- a similar procedure as described above is performed after the samples are obtained and the selected gene sets are expressed either as individual genes or as sets.
- the expression patterns associated with each type of NSCLC can be differentiated either in the presence of controls, from other, known NSCLC types, or without controls once the expression patterns of the genes set forth herein have been established.
- the prognosis of patients with a tumour uses the same procedure as described above for obtaining the samples and expressing the gene sets either as individual genes or as sets.
- a Cox's proportional hazards regression analysis is then performed for each gene thereby allowing the selection of overall survival associated genes.
- a risk score is then determined for each individual patient as the sum, over the selected genes, of each gene's regression coefficient multiplied by the corresponding expression intensity.
- Cox regression is a method for investigating the effect of several variables upon the time a specified event takes to happen. In the context of an outcome such as death this is known as Cox regression for survival analysis.
- the method does not assume any particular "survival model" but it is not truly non-parametric because it does assume that the effects of the predictor variables upon survival are constant over time and are additive in one scale.
- Based on the median risk score, patients are then categorized as being at high or low risk with respect to overall survival or relapse-free survival. This is determined by a comparison to the corresponding Kaplan-Meier estimates of overall survival.
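The risk score and the median split can be sketched as below. Gene names, coefficients and expression values are hypothetical placeholders; in practice the coefficients would come from the per-gene Cox regression, which is not reproduced here.

```python
from statistics import median


def risk_score(coefficients, expression):
    """Sum over selected genes of regression coefficient times expression intensity."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}


# Hypothetical Cox coefficients and expression intensities
coefficients = {"gene_a": 0.8, "gene_b": -0.5}
patients = {
    "P1": {"gene_a": 2.0, "gene_b": 1.0},
    "P2": {"gene_a": 1.0, "gene_b": 3.0},
    "P3": {"gene_a": 3.0, "gene_b": 0.5},
    "P4": {"gene_a": 0.5, "gene_b": 2.0},
}

scores = {p: risk_score(coefficients, expr) for p, expr in patients.items()}
groups = split_by_median(scores)
```

The two groups would then be compared by plotting their Kaplan-Meier survival estimates.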
- NSCLC non-small cell lung cancer
- RNA isolation
- A genome-wide gene expression analysis using Affymetrix U133 Plus 2.0 arrays was performed on the cohort of 91 patients with NSCLC. All tumor samples were reviewed by two independent pathologists to determine their histopathological types, cancer cell contents, and degree of differentiation. Eight LCC samples had a high level of cell type heterogeneity, presenting with acinar differentiation or squamous cell components. Nineteen percent (17 out of 91) of tumor samples had a discrepancy in histopathological classification, including five rare types of NSCLC with a histological composition of multiple cell types. To obtain a precise histological profile, these 17 samples were excluded from creating histology signatures.
- Total RNA isolation
- RNA pellets were washed with 75% ethanol and dissolved in RNase-free water. If applicable, they were stored at -80 °C for further usage.
- RNA integrity was determined using the Agilent 2100 BioAnalyzer. Samples were kept for further processing if the 28S/18S ratio of the RNA was higher than 1.2. The concentrations of the RNAs were measured with a NanoDrop ND-1000 UV-VIS spectrophotometer.
- cRNA amplification and labelling
- Double-stranded (ds) cDNA synthesis was performed according to the standardized protocol for One-Cycle cDNA synthesis from Affymetrix (Santa Clara, CA). Approximately 5 µg of total RNA was first converted to single-stranded cDNA in a 20 µl First-Strand Reaction Mix, containing poly-A control RNA, 100 ⁇ T7-Oligo Primer, 1x first strand buffer, 0.2 mol DTT, 10 mmol dNTP mix and SuperScript II. In detail, the sample RNA, the poly-A control RNA and the T7-Oligo Primer were mixed and incubated for 10 min at 70 °C.
- the first strand buffer, the DTT and the dNTP mix were added and incubated for 2 min at 42 °C, followed by adding Superscript II and incubation for 1 hour at 42 °C.
- the ds cDNA was prepared from the resultant First-Strand Reaction Mix, mixed with 1x second strand reaction buffer, 30 mmol dNTP mix, E. coli DNA ligase, E. coli DNA Polymerase I and RNaseH. The mix was incubated for 2 hours at 16 °C, then supplemented with T4 DNA Polymerase and then incubated for a further 5 minutes at 16 °C. The reaction was stopped by the addition of EDTA to a final concentration of 5 ⁇ .
- the Sample Cleanup Module and GeneChip IVT Labelling Kit from Affymetrix were used to purify the synthesized ds cDNA, which was then used to generate biotin-labelled cRNA in the presence of 1x IVT Labelling buffer, IVT Labelling NTP Mix, IVT Labelling Enzyme Mix and RNase-free water in a total volume of 40 µl. After an incubation of 16 hours at 37 °C, the concentration and quality of the labelled cRNA were checked with a NanoDrop ND-1000 UV-VIS spectrophotometer. An A260/A280 ratio between 1.9 and 2.1 was considered acceptable.
- Hybridization was conducted following Affymetrix instructions for the GeneChip® Human Genome U133 Plus 2.0 array.
- the GeneArray scanner 3000 (Affymetrix) was then employed to detect the hybridization signals.
- Microarrays that did not pass the quality assessment were removed from further analyses.
- the quality metrics used to exclude microarrays were the summary statistics calculated by the GCOS algorithm during the processing of probe-level data.
- the primary inclusion criteria include: all arrays had to have comparable noise values (Raw Q, measurement for the pixel-to-pixel variation of probe cells on the chip); background values were within the range of 20 to 100; percent of present calls for probe sets on the array should not be below 45%.
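The two numerically specified inclusion criteria can be expressed as a simple filter. This is a sketch only: the function name and the example arrays are hypothetical, and the "comparable noise (Raw Q)" criterion is deliberately left out because the text gives no numeric threshold for it.

```python
def passes_primary_qc(background, percent_present):
    """Apply the two stated thresholds: background within 20-100, present calls not below 45%."""
    return 20 <= background <= 100 and percent_present >= 45.0


# Hypothetical per-array summary statistics: (background, percent present calls)
arrays = {
    "array_01": (55.0, 52.3),
    "array_02": (130.0, 50.1),   # background too high
    "array_03": (60.0, 41.0),    # too few present calls
}
kept = [name for name, (bg, pp) in arrays.items() if passes_primary_qc(bg, pp)]
```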
- the other criteria were: arrays with extremely high or low values for any of these parameters, e.g.
- RMA Robust Multi-array Average
- the intensities of mismatch probes were entirely ignored due to their spurious estimation of non-specific binding.
- the intensities were background-corrected in such a way that all corrected values must be positive.
- the RMA algorithm utilized quantile normalization in which the signal value of individual probes was substituted by the average of all probes with the same rank of intensity on each chip/array. Finally Tukey's median polish algorithm was used to obtain the estimates of expression for normalized probe intensities.
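The quantile normalization step can be sketched in a few lines. This is a minimal illustration, not the RMA implementation: ties are broken by probe position rather than averaged, and the two small arrays are hypothetical.

```python
def quantile_normalize(arrays):
    """Replace each probe value by the mean, across arrays, of the values sharing its rank.

    arrays: list of equal-length lists, one list of probe intensities per chip.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n_probes)]
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])  # probe indices by ascending intensity
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two hypothetical chips with three probes each
chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(chips)
```

After this step every chip has the same distribution of values, and only the assignment of values to probes differs between chips.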
- GCOS v1.4 Global Scaling
- This algorithm was a summary method embedded in GeneChip Operating Software (GCOS) from Affymetrix, and fully described in the Data Analysis Fundamentals manual.
- the signal intensity of each probe was firstly corrected by the overall background.
- the differences between perfect match (PM) and mismatch (MM) probes were examined by using background-adjusted intensities for each probe pair.
- the significance of the differences between PM and MM probe sets was reflected by a p-value calculated by a one-sided Wilcoxon signed-rank test.
- the final signal for a probe set was assigned as the one-step biweight estimate of the combined differences of all probe pairs belonging to one probe set.
- the trimmed mean signal of each array was then scaled to the same Target Intensity (e.g. 250) by a global method to minimize technique-derived discrepancies.
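Global scaling to a common target intensity can be sketched as follows. The target of 250 comes from the text; the trimming fraction (2% removed from each tail) and the function names are assumptions made for illustration.

```python
def trimmed_mean(values, trim_fraction=0.02):
    """Mean after dropping the lowest and highest trim_fraction of the values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def scale_to_target(signals, target=250.0, trim_fraction=0.02):
    """Multiply every signal on one array by a single factor so its trimmed mean equals target."""
    factor = target / trimmed_mean(signals, trim_fraction)
    return [s * factor for s in signals]


# Hypothetical probe-set signals for one array
signals = [float(v) for v in range(1, 101)]
scaled = scale_to_target(signals)
```

Because one multiplicative factor is applied per array, the ranking of probe sets within an array is unchanged.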
- Probe sets were included in further analysis only if their expression levels deviated from the overall mean in at least one array by a minimum factor of 2.5, because the remaining data were unlikely to be informative. The result was that 43,160 probe sets were eliminated, and 11,515 probe sets remained for further analysis.
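The variation filter can be sketched as below. Two points are assumptions: that intensities are on a linear (not log) scale, and that "overall mean" refers to the mean of that probe set across all arrays. A deviation in either direction by a factor of 2.5 is treated as informative.

```python
def is_informative(values, min_factor=2.5):
    """True if at least one array deviates from the probe set's mean by min_factor or more."""
    mean = sum(values) / len(values)
    return any(v >= mean * min_factor or v <= mean / min_factor for v in values)


# Hypothetical linear-scale intensities, one list per probe set, one value per array
probe_sets = {
    "flat": [100.0, 110.0, 90.0, 105.0],
    "moderate": [100.0, 100.0, 100.0, 400.0],
    "strong": [100.0, 100.0, 100.0, 900.0],
}
retained = [name for name, vals in probe_sets.items() if is_informative(vals)]

# The two counts reported in the text add up to the total number of probe sets analysed
total_probe_sets = 43160 + 11515
```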
- Clustering was performed without taking into account any external information such as histology subtypes and tumor stages, with each of the selected 11,515 probe sets using the K-means algorithm (OmniViz). Similarities were measured by magnitude and shape (Euclidean distance). Pairwise similarities between samples were sorted and visualized by the Pearson Correlation Matrix (OmniViz). The order of clusters and individual samples within each cluster was sorted according to the Pearson Correlation Coefficient.
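The pairwise sample similarities can be computed as a Pearson correlation matrix. This sketch does not reproduce the OmniViz software; the function names and the three short sample profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(samples):
    """Symmetric matrix of pairwise correlations between samples."""
    return [[pearson(a, b) for b in samples] for a in samples]


# Hypothetical expression profiles: one list per sample, one value per probe set
samples = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
matrix = correlation_matrix(samples)
```

Sorting samples so that highly correlated ones sit next to each other makes clusters visible as blocks along the diagonal of the matrix.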
- SAM discovered differentially expressed genes among different sample classes, e.g. between non-cancerous tissues and tumours (this Example) or among different histology subtypes (Example 2).
- this algorithm calculated the different expression for each gene between classes relative to the variation expected in the mean difference.
- the false discovery rate (FDR) was controlled by randomly permuting the classes of samples 100 times.
- Signature probe sets for assigned classes were selected by a change factor of 2 and an FDR of less than 1 percent.
- the class comparisons were performed with both RMA- and GCOS-processed data.
- the common probe sets identified by both sets of data were selected as the final signatures.
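The class comparison can be sketched with a SAM-style relative difference, followed by the intersection of the RMA- and GCOS-derived hit lists. This is a simplified illustration: the permutation-based FDR estimate is omitted, the fudge factor `s0` defaults to zero, and all probe set identifiers and values are hypothetical.

```python
import math


def relative_difference(group_a, group_b, s0=0.0):
    """Difference in class means divided by (pooled standard error + s0)."""
    na, nb = len(group_a), len(group_b)
    mean_a, mean_b = sum(group_a) / na, sum(group_b) / nb
    ss = sum((v - mean_a) ** 2 for v in group_a) + sum((v - mean_b) ** 2 for v in group_b)
    s = math.sqrt((1.0 / na + 1.0 / nb) * ss / (na + nb - 2))
    return (mean_a - mean_b) / (s + s0)


def common_signature(rma_hits, gcos_hits):
    """Probe sets selected from both the RMA- and the GCOS-processed data."""
    return sorted(set(rma_hits) & set(gcos_hits))


# Hypothetical log-scale values for one probe set in two classes
d = relative_difference([4.0, 5.0, 6.0], [1.0, 2.0, 3.0])

# Hypothetical hit lists from the two preprocessing pipelines
final_signature = common_signature(["ps1", "ps2", "ps3"], ["ps2", "ps3", "ps4"])
```

Requiring agreement between two preprocessing methods trades some sensitivity for signatures that do not depend on one normalization choice.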
- the resultant signatures from Class Comparison were tested by the nearest shrunken centroids algorithm (PAM) to identify subgroups of genes that best characterized the predefined classes.
- the prediction accuracy of optimized signatures was determined by performing "leave-one-out" cross validation within the training set, with one sample omitted each time and class label being predicted with other samples for the omitted sample.
- the predictive models generated by the optimal subsets were subsequently applied to make predictions of classes for samples in the validation set, which were not involved in the corresponding class comparisons.
- the prediction accuracy on validation samples was calculated by comparing predicted class labels with the histopathological diagnoses for those samples; samples without histopathological records were excluded from the calculations.
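The leave-one-out procedure can be sketched with a plain nearest-centroid classifier. This is a deliberate simplification: PAM additionally shrinks the class centroids towards the overall centroid, which is not implemented here, and the samples and class labels are hypothetical.

```python
def centroid(rows):
    """Per-feature mean of a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def predict(train_x, train_y, sample):
    """Assign the class whose (unshrunken) centroid is closest to the sample."""
    by_class = {}
    for x, y in zip(train_x, train_y):
        by_class.setdefault(y, []).append(x)

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(by_class, key=lambda c: dist2(centroid(by_class[c]), sample))


def leave_one_out_accuracy(xs, ys):
    """Omit each sample in turn, train on the rest, and score the prediction for the omitted one."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical two-gene expression values for six training samples
xs = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]]
ys = ["ADC", "ADC", "ADC", "SCC", "SCC", "SCC"]
accuracy = leave_one_out_accuracy(xs, ys)
```

The same `predict` function, trained on the full training set, would then be applied to the independent validation samples.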
- NSCLC are sub-classified by histology signature genes
- NSCLC is a class of tumor with a high degree of heterogeneity
- genes characterizing histological features were identified using strictly selected tumor samples. The samples used had consistent histological diagnoses from two independent pathologists, and displayed no apparent tissue heterogeneity.
- ADC adenocarcinoma
- SCC squamous cell carcinoma
- LCC large cell carcinoma
- the association between the prognosis profile and clinical parameters was studied.
- the prognosis profile was significantly associated with age (p ⁇ 0.023), smoking years (p ⁇ 0.014), gender (p ⁇ 0.012) and Forced Expiratory Volume 1 (p ⁇ 0.009), a parameter reflecting lung function, but not with tumor stage, tumor cell content, tumor histology and tumor size (Table 7).
- Table 8 shows the Wald statistics and significance for each variable tested.
- Tumor stage and the 17 probe set prognostic predictor were significantly related to the hazard of death.
- the prognostic predictor presented the highest importance which was 21.682 compared to 3.797 from tumor stage.
- the relative hazard ratio predicted by the prognostic predictor was 2.465 (95% confidence interval, 1.686 to 3.604, p ⁇ 1.5E-06), the highest one among all tested risks (Table 6).
- the inclusion of the prognostic predictor to the predictive model resulted in a change in model performance of 19.5, in terms of -2 log likelihood, with a p-value of 9.8E-06, compared to 24.3 and 2.0E-03 introduced by the model comprising all clinical variables.
- the multivariate proportional hazard analysis shows that the gene expression profile-derived prognostic predictor of 17 probe sets is the strongest predictor of the likelihood of death (Table 8).
- N mean: 2log transformation of the mean expression value in normal lung tissue samples (average of all NSCLC and normal lung tissue = 0).
- DLC-1, a GTPase-activating protein for Rho, is associated with cell proliferation, morphology, and migration in human hepatocellular carcinoma.
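The leave-one-out cross-validation and validation-set scoring described in the list above can be illustrated with a minimal Python sketch. The nearest-centroid classifier, the function names, and the toy expression values are assumptions made for illustration only; this passage does not state which classifier or software was used.

```python
from statistics import mean


def nearest_centroid_predict(train_x, train_y, sample):
    """Return the class whose mean expression profile is closest to `sample`.

    Hypothetical stand-in classifier; any class prediction method could be
    substituted here.
    """
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # sorted() makes tie-breaking deterministic
    return min(sorted(centroids), key=lambda lab: sq_dist(centroids[lab], sample))


def leave_one_out_accuracy(xs, ys):
    """Omit each training sample in turn and predict it from the others."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


def validation_accuracy(train_x, train_y, val_x, val_diagnoses):
    """Compare predictions with histopathological diagnoses.

    Validation samples without a diagnosis (None) are excluded.
    """
    scored = [(x, d) for x, d in zip(val_x, val_diagnoses) if d is not None]
    if not scored:
        raise ValueError("no validation samples with a diagnosis")
    hits = sum(nearest_centroid_predict(train_x, train_y, x) == d for x, d in scored)
    return hits / len(scored)


# Toy data (invented): two signature genes, two histological classes.
train_x = [[1.0, 0.2], [0.9, 0.1], [1.1, 0.3], [0.1, 1.0], [0.2, 0.9], [0.0, 1.1]]
train_y = ["ADC", "ADC", "ADC", "SCC", "SCC", "SCC"]
val_x = [[0.95, 0.15], [0.05, 1.05], [0.5, 0.5]]
val_diagnoses = ["ADC", "SCC", None]  # third sample has no histopathological record

loo_acc = leave_one_out_accuracy(train_x, train_y)
val_acc = validation_accuracy(train_x, train_y, val_x, val_diagnoses)
```

Keeping the validation samples entirely out of training mirrors the separation between training and validation sets described above; the cross-validated accuracy is computed only within the training set.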
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Organic Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Engineering & Computer Science (AREA)
- Immunology (AREA)
- Pathology (AREA)
- Analytical Chemistry (AREA)
- Zoology (AREA)
- Genetics & Genomics (AREA)
- Wood Science & Technology (AREA)
- Physics & Mathematics (AREA)
- Biotechnology (AREA)
- Microbiology (AREA)
- Molecular Biology (AREA)
- Hospice & Palliative Care (AREA)
- Biophysics (AREA)
- Oncology (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The present invention relates to gene signatures that are useful for characterizing NSCLC tumors, make it possible to distinguish cancerous tissue from normal tissue, are useful for classifying NSCLC, and can be used to provide a prognosis for patients with NSCLC.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GB0904957.8 | 2009-03-23 | ||
| GB0904957A GB0904957D0 (en) | 2009-03-23 | 2009-03-23 | Tumour gene profile |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2010108638A1 (fr) | 2010-09-30 |
| WO2010108638A9 (fr) | 2011-04-28 |
Family
ID=40640001
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2010/001773 Ceased WO2010108638A1 (fr) | 2009-03-23 | 2010-03-22 | Tumour gene profile |
Country Status (2)
| Country | Link |
|---|---|
| GB (1) | GB0904957D0 (fr) |
| WO (1) | WO2010108638A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9670553B2 (en) | 2004-06-04 | 2017-06-06 | Biotheranostics, Inc. | Determining tumor origin |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| DE10254601A1 (de) | 2002-11-22 | 2004-06-03 | Ganymed Pharmaceuticals Ag | Gene products differentially expressed in tumors and their use |
| DE102004024617A1 (de) | 2004-05-18 | 2005-12-29 | Ganymed Pharmaceuticals Ag | Gene products differentially expressed in tumors and their use |
| MXPA06014116A (es) | 2004-06-04 | 2007-08-08 | Aviaradx Inc | Identification of tumors |
| DK1899484T3 (da) | 2005-06-03 | 2015-11-23 | Biotheranostics Inc | Identification of tumors and tissues |
| EP1790664A1 (fr) | 2005-11-24 | 2007-05-30 | Ganymed Pharmaceuticals AG | Monoclonal antibodies against claudin-18 for the treatment of cancer |
| JP2014500001A (ja) * | 2010-10-21 | 2014-01-09 | オンコセラピー・サイエンス株式会社 | C18orf54 peptides and vaccines containing the same |
| JP2014518086A (ja) * | 2011-06-29 | 2014-07-28 | バイオセラノスティクス インコーポレイテッド | Determining tumor origin |
| WO2013167153A1 (fr) | 2012-05-09 | 2013-11-14 | Ganymed Pharmaceuticals Ag | Antibodies useful in cancer diagnosis |
| WO2018216009A1 (fr) * | 2017-05-22 | 2018-11-29 | The National Institute For Biotechnology In The Negev Ltd. Ben-Gurion University Of The Negev | Biomarkers for lung cancer diagnosis |
Family Cites Families (36)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4631211A (en) | 1985-03-25 | 1986-12-23 | Scripps Clinic & Research Foundation | Means for sequential solid phase organic synthesis and methods using the same |
| US4683195A (en) | 1986-01-30 | 1987-07-28 | Cetus Corporation | Process for amplifying, detecting, and/or-cloning nucleic acid sequences |
| US4683202A (en) | 1985-03-28 | 1987-07-28 | Cetus Corporation | Process for amplifying nucleic acid sequences |
| US5503978A (en) | 1990-06-11 | 1996-04-02 | University Research Corporation | Method for identification of high affinity DNA ligands of HIV-1 reverse transcriptase |
| US5567588A (en) | 1990-06-11 | 1996-10-22 | University Research Corporation | Systematic evolution of ligands by exponential enrichment: Solution SELEX |
| US5654151A (en) | 1990-06-11 | 1997-08-05 | Nexstar Pharmaceuticals, Inc. | High affinity HIV Nucleocapsid nucleic acid ligands |
| US5270163A (en) | 1990-06-11 | 1993-12-14 | University Research Corporation | Methods for identifying nucleic acid ligands |
| US5837832A (en) | 1993-06-25 | 1998-11-17 | Affymetrix, Inc. | Arrays of nucleic acid probes on biological chips |
| WO1996038579A1 (fr) | 1995-06-02 | 1996-12-05 | Nexstar Pharmaceuticals, Inc. | Oligonucleotide ligands with high affinity for growth factors |
| US6403311B1 (en) | 1997-02-12 | 2002-06-11 | Us Genomics | Methods of analyzing polymers using ordered label strategies |
| ATE273381T1 (de) | 1997-02-12 | 2004-08-15 | Eugene Y Chan | Method for analyzing polymers |
| DK1801214T3 (da) | 1997-07-07 | 2011-01-24 | Medical Res Council | In vitro sorting method |
| US6210896B1 (en) | 1998-08-13 | 2001-04-03 | Us Genomics | Molecular motors |
| US6263286B1 (en) | 1998-08-13 | 2001-07-17 | U.S. Genomics, Inc. | Methods of analyzing polymers using a spatial network of fluorophores and fluorescence resonance energy transfer |
| US6790671B1 (en) | 1998-08-13 | 2004-09-14 | Princeton University | Optically characterizing polymers |
| GB9900298D0 (en) | 1999-01-07 | 1999-02-24 | Medical Res Council | Optical sorting method |
| US6818395B1 (en) | 1999-06-28 | 2004-11-16 | California Institute Of Technology | Methods and apparatus for analyzing polynucleotide sequences |
| US6440706B1 (en) | 1999-08-02 | 2002-08-27 | Johns Hopkins University | Digital amplification |
| US6696022B1 (en) | 1999-08-13 | 2004-02-24 | U.S. Genomics, Inc. | Methods and apparatuses for stretching polymers |
| US6762059B2 (en) | 1999-08-13 | 2004-07-13 | U.S. Genomics, Inc. | Methods and apparatuses for characterization of single polymers |
| WO2001016375A2 (fr) | 1999-08-30 | 2001-03-08 | The Government Of The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | High-speed parallel sequencing of nucleic acid molecules |
| GB0022458D0 (en) | 2000-09-13 | 2000-11-01 | Medical Res Council | Directed evolution method |
| US7194269B2 (en) | 2000-10-30 | 2007-03-20 | Her Majesty The Queen In Right Of Canada, As Represented By The Minister Of Industry | Method and wireless communication hub for data communications |
| GB0127564D0 (en) | 2001-11-16 | 2002-01-09 | Medical Res Council | Emulsion compositions |
| CA2513535C (fr) | 2003-01-29 | 2012-06-12 | 454 Corporation | Bead emulsion nucleic acid amplification |
| US20040241725A1 (en) | 2003-03-25 | 2004-12-02 | Wenming Xiao | Lung cancer detection |
| KR101126560B1 (ko) * | 2003-05-30 | 2012-04-05 | 도꾜 다이가꾸 | Method for predicting drug response |
| WO2005010145A2 (fr) | 2003-07-05 | 2005-02-03 | The Johns Hopkins University | Method and compositions for detection and enumeration of genetic variations |
| US20050221341A1 (en) | 2003-10-22 | 2005-10-06 | Shimkets Richard A | Sequence-based karyotyping |
| US7169560B2 (en) | 2003-11-12 | 2007-01-30 | Helicos Biosciences Corporation | Short cycle methods for sequencing polynucleotides |
| US20060019264A1 (en) | 2003-12-01 | 2006-01-26 | Said Attiya | Method for isolation of independent, parallel chemical micro-reactions using a porous filter |
| EP1735458B1 (fr) | 2004-01-28 | 2013-07-24 | 454 Life Sciences Corporation | Nucleic acid amplification with continuous flow emulsion |
| US20070026424A1 (en) * | 2005-04-15 | 2007-02-01 | Powell Charles A | Gene profiles correlating with histology and prognosis |
| GB0519405D0 (en) | 2005-09-23 | 2005-11-02 | Univ Aberdeen | Cancer therapy prognosis and target |
| EP1980627A1 (fr) | 2007-04-13 | 2008-10-15 | Pangaea Biotech, S.A. | Method for determining a chemotherapy regimen and estimating survival for large cell lung cancer based on EGFR/CSF-1/CA IX expression |
| KR20090025898A (ko) * | 2007-09-07 | 2009-03-11 | 삼성전자주식회사 | Marker, kit, microarray and method for predicting the risk of lung cancer recurrence in lung cancer patients |
- 2009-03-23: GB, application GB0904957A, patent GB0904957D0 (en), not active (ceased)
- 2010-03-22: WO, application PCT/EP2010/001773, patent WO2010108638A1 (fr), not active (ceased)
Also Published As
| Publication number | Publication date |
|---|---|
| GB0904957D0 (en) | 2009-05-06 |
| WO2010108638A1 (fr) | 2010-09-30 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2010108638A9 (fr) | | Tumour gene profile |
| JP4938672B2 (ja) | | Methods, systems, and arrays for classifying cancer, predicting prognosis, and diagnosing based on the association between p53 status and gene expression profiles |
| US8026055B2 (en) | | Materials and methods for prognosing lung cancer survival |
| JP6404304B2 (ja) | | Prognosis prediction for melanoma cancer |
| JP2018126154A (ja) | | Proliferation signatures and prognosis in gastrointestinal cancer |
| WO2008089577A1 (fr) | | Breast cancer gene chip |
| EP3090265B1 (fr) | | Prostate cancer gene profiles and methods of using the same |
| US20120329878A1 (en) | | Phenotyping tumor-infiltrating leukocytes |
| US20250369053A1 (en) | | Recurrence gene signature across multiple cancer types |
| WO2016118670A1 (fr) | | Multigene expression assay for patient stratification in colorectal liver metastases after resection |
| WO2008157277A1 (fr) | | Methods for evaluating breast cancer prognosis |
| EP2297360A1 (fr) | | Method for predicting the clinical outcome of patients with large cell bronchial carcinoma |
| CA2588253A1 (fr) | | Methods and systems for prognosis and treatment of solid tumors |
| WO2013079215A1 (fr) | | Method for classifying tumor cells |
| EP1849007A2 (fr) | | Pharmacogenomic markers for prognosis of solid tumors |
| US20130303400A1 (en) | | Multimarker panel |
| EP2457096A2 (fr) | | Phenotyping tumor-infiltrating leukocytes |
| WO2006037485A2 (fr) | | Methods and kits for predicting therapeutic success and relapse-free survival in cancer therapy |
| WO2024178032A2 (fr) | | Methods for diagnosing and treating ovarian cancer |
| WO2025104619A1 (fr) | | Breast cancer markers and uses thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10709983; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase in: | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 10709983; Country of ref document: EP; Kind code of ref document: A1 |