
WO2014167494A1 - Molecular signature and its uses as diagnostic agent - Google Patents

Molecular signature and its uses as diagnostic agent

Info

Publication number
WO2014167494A1
Authority
WO
WIPO (PCT)
Prior art keywords
ptcl
seq
nos
molecular
molecular markers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/IB2014/060529
Other languages
French (fr)
Inventor
Pier Paolo PICCALUGA
Stefano PILERI
Fabio FULIGNI
Maura ROSSI
Antonio DE LEO
Maria Antonella LAGINESTRA
Anna GAZZOLA
Claudio AGOSTINELLI
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Universita di Bologna
Original Assignee
Universita di Bologna
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Universita di Bologna
Publication of WO2014167494A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification

Definitions

  • the present invention relates to a method for determining a peripheral T-cell lymphoma (PTCL), preferably for determining the PTCL subtype, said method being based on the use of a molecular signature.
  • the present invention further relates to said molecular signature and its use as a diagnostic agent.
  • PTCL: Peripheral T-Cell Lymphomas
  • the REAL/WHO (Revised European-American Lymphoma/World Health Organization) classification divides PTCLs into two main categories: specific forms of PTCL and nonspecific forms, the latter also known as NOS (acronym of "Not Otherwise Specified").
  • PTCL/NOS: PTCL, Not Otherwise Specified
  • AITL is the acronym of Angioimmunoblastic T-cell Lymphoma
  • ALCL is the acronym of Anaplastic Large Cell Lymphoma
  • The heterogeneity of PTCLs is evident not only from a classification standpoint, but also from a prognostic standpoint. In fact, each subtype of PTCL is associated with a specific prognosis.
  • the differential diagnosis of PTCL is based on an analysis of the morphological, phenotypic and clinical features characterising the different PTCL subtypes.
  • This type of investigation is highly complex and is conducted exclusively by medical personnel who have acquired solid competence in the field of haematopathology.
  • the differential diagnosis of PTCL is entrusted to a highly expert haematopathologist who carefully analyses a sample from the patient in order to determine the presence or absence of the distinctive morphological features of the different PTCL subtypes.
  • the problem underlying a diagnosis based on anatomopathological criteria resides in the fact that the morphology, phenotype and clinical signs, i.e. the parameters presently used for subtyping PTCLs, vary considerably and are in part common to the different PTCL subtypes.
  • PTCL/NOS are often associated with characteristics that are also typical of PTCL/AITL or PTCL/ALCL-ALK-.
  • PTCL/NOS can exhibit a phenotype correlated to that of the T-helper follicular (Thf) lymphocytes commonly observed in PTCL/AITL; or else PTCL/NOS can be characterised by the presence of a sizable number of large CD30+ cells, wholly analogous to the ones characterising PTCL/ALCL-ALK-.
  • Thf: T-helper follicular
  • a given PTCL belongs to the PTCL/NOS or PTCL/AITL subtype or the PTCL/NOS or PTCL/ALCL-ALK- subtype.
  • PTCL/NOS characterised by CD30+ cells
  • the possibility of distinguishing it from PTCL/ALCL-ALK- is particularly advantageous because, as noted earlier, the two subtypes show a very different prognosis; specifically, PTCL/NOS has the less favourable prognosis.
  • PTCL/NOS can be treated with tyrosine-kinase inhibitors or histone deacetylase inhibitors
  • PTCL/AITL can be treated with antiangiogenic drugs, or monoclonal antibodies such as rituximab
  • PTCL/ALCL-ALK- can benefit from anti-CD30 monoclonal antibodies.
  • in order to replace or supplement the methods presently used for the differential diagnosis of PTCLs, the new diagnostic tool, freed of subjective criteria, must in any case assure a diagnostic accuracy that is comparable to, if not better than, that of the reference standard, i.e. the method based on analysing the morphological-clinical features of the PTCL sample.
  • the Applicant has developed a method for diagnosing PTCL which is based on the use of a molecular signature.
  • the method of the present invention enables a differential diagnosis of PTCL, that is, it enables a determination of the PTCL subtype (i.e. a subtyping of the PTCL).
  • the molecular signature or molecular classifier of the present invention can be used as a diagnostic agent, i.e. it can be used in a method for determining the presence of a pathology in an individual.
  • the molecular signature of the present invention is capable of diagnosing, preferably by subtyping, a PTCL with a diagnostic accuracy that is comparable to if not better than the diagnostic accuracy of the reference standard (i.e. the method presently used for subtyping PTCLs, which is based on analysing the morphological, phenotypic and clinical features).
  • the method for subtyping PTCLs of the present invention is particularly innovative and advantageous because it is the first method to be based on the use and analysis of gene expression profiles. Furthermore, the method of the present invention is the first method characterised by a diagnostic accuracy deriving from a phase III diagnostic study.
  • with the method according to the present invention it is possible to determine whether an individual is affected by a PTCL, and in particular it is possible to determine the subtype it belongs to (i.e. whether it is PTCL/NOS or PTCL/ALCL-ALK-, PTCL/NOS or PTCL/AITL) with a clinically significant diagnostic accuracy as high as 98%.
  • the method according to the present invention shows a high (80-100%) specificity (SP) and is thus ideal as a SPIN test (deriving from SPecificity-rule IN), in the sense that, if positive, said method confirms the presence of the pathology.
  • the method of the present invention shows an equally high (72-100%) sensitivity (SN) and is thus also ideal as a SNOUT test (deriving from SeNsitivity-rule OUT), in the sense that when the result of the method is negative, the disease is ruled out.
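As a minimal sketch of how the specificity, sensitivity and accuracy figures quoted above are conventionally derived, the following computes them from confusion-matrix counts. The function name and the counts are hypothetical illustrations and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true cases the test detects (SNOUT)
    specificity = tn / (tn + fp)  # share of non-cases the test clears (SPIN)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts for one binary comparison; not data from the patent.
sn, sp, acc = diagnostic_metrics(tp=45, fp=2, tn=48, fn=5)
print(f"SN={sn:.2f} SP={sp:.2f} ACC={acc:.2f}")
```

A highly specific test produces few false positives, so a positive result rules the disease in; a highly sensitive test produces few false negatives, so a negative result rules it out.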
  • the method of the invention is particularly useful in clinical practice. In fact, it is highly informative both in cases in which the result of the method is positive, and in cases in which the result is negative. In this regard it is also necessary to emphasize that, to date, the use of the gene expression profiles and the analysis thereof has essentially been confined to the field of research. Their exclusion from the diagnostic field is mainly tied to the necessity of using, as the starting material, only "fresh" or frozen isolated tissue samples which are often not available.
  • the method proposed also overcomes this difficulty because it also works well and accurately on a fixed isolated tissue sample, which may be sectioned and preserved, for example on laboratory slides.
  • the biological sample available for investigating this type of pathology is usually an isolated tissue sample fixed and embedded in ad hoc solutions, for example paraffin-based ones.
  • the method of the invention is the first method based on the analysis of a gene expression profile that has been validated on fixed biological tissue in the realm of lymphomas.
  • Figure 1 shows the hierarchical clustering of genes differentially expressed in a significant manner according to discriminant analysis of PTCL/AITL versus PTCL/NOS (A) and ALK- ALCL versus PTCL/NOS (B).
  • FIG. 2 shows the genes expressed differentially between the PTCL/NOS samples and the PTCL/ALCL-ALK- samples.
  • FIG. 3 shows the genes expressed differentially between the PTCL/NOS samples and the PTCL/AITL samples.
  • FIG. 4 shows the survival curves: of the various PTCL subtypes (A); of the various molecular subtypes (B); and of the PTCL/NOS CD30+ subtype and PTCL/ALCL-ALK- subtype (C).
  • a first aspect of the present invention regards a method, carried out in vitro and based on the use of a molecular signature, which enables a peripheral T-cell lymphoma (PTCL) to be determined.
  • the method of the invention enables a subtyping of PTCLs, preferably as PTCL/NOS, PTCL/AITL or PTCL/ALCL-ALK-.
  • PTCL/NOS peripheral T-cell lymphoma
  • the PTCLs to which the present invention relates are preferably those affecting the nodes.
  • molecular signature means a set of molecular markers.
  • a molecular marker or molecular classifier is a nucleic acid sequence, for example a DNA sequence, in particular a gene sequence.
  • a nucleic acid sequence should be understood as meaning both the sequence of a strand of nucleic acid and the complementary sequence, and/or an RNA sequence.
  • a molecular signature is also definable as a set of DNA sequences or a set of genes (or also a gene pool).
  • the method of the present invention is carried out in vitro, i.e. on a biological sample isolated from an individual who is affected or suspected to be affected by a PTCL, and comprises at least one step of measuring, in a biological sample, the levels of expression of the molecular markers of the signature of the invention as described below (Step 1).
  • the expression levels of the molecular markers measured according to step 1 are analysed, preferably by using a binary classifier (Step 2).
  • the binary classifier is preferably based on a linear discriminant function or on Support Vector Machine algorithms.
  • the method of the invention comprises at least a step 1 and at least a step 2 as described above.
  • the molecular signature comprises at least 5, preferably at least 15, more preferably at least 38 or at least 53 molecular markers selected from among the markers indicated in Table I.
  • Table I shows the name of the molecular marker (i.e. of the gene) and the reference sequence (RefSeq) obtained from NCBI database (http://www.ncbi.nlm.nih.gov/refseq/).
  • the reference sequence identifies a variant of the sequence for each molecular marker described.
  • the method envisages a step of measuring the expression levels of at least 5, preferably at least 15, more preferably at least 38 or at least 53 molecular markers selected from among SEQ ID NO: 1-91.
  • said at least 38 sequences are SEQ ID NO: 1-38.
  • said at least 53 sequences are SEQ ID NO: 39-91.
  • Table II shows the sequences of the molecular signature of the present invention.
  • Table II shows the SEQ ID NO of the sequence of each molecular marker, the name of the molecular marker and the reference sequence of the molecular marker (RefSeq) obtained from the database (RefSeq NCBI http://www.ncbi.nlm.nih.gov/refseq/).
  • the reference sequence identifies a variant of the sequence for each molecular marker described.
  • sequences having at least 70%, 80%, 85%, 90%, or 95% identity with SEQ ID NO: 1-91, listed in Table II and described in their entirety in the appended Sequence Listing, are also to be considered as part of the subject matter of the present invention.
  • FCGR3A SEQ ID NO: 59 NM_000569
  • GNA15 SEQ ID NO: 16 NM_002068
  • KLRG1 SEQ ID NO: 62 NM_005810
  • ICAM2 SEQ ID NO: 17 NM_000873
  • KRT3 SEQ ID NO: 63 NM_057088
  • MIB1 SEQ ID NO: 19 NM_020774
  • LYG2 SEQ ID NO: 65 NM_175735
  • MOAP1 SEQ ID NO: 20 NM_022151
  • MPPE1 SEQ ID NO: 66 NM_001242904
  • MRPS23 SEQ ID NO: 21 NM_016070
  • MRPL21 SEQ ID NO: 67 NM_181514
  • NSD1 SEQ ID NO: 24 NM_022455
  • NR2F2 SEQ ID NO: 70 NM_001145155
  • ATF1 SEQ ID NO: 40 NM_005171
  • TUBD1 SEQ ID NO: 86 NM_001193609
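The percentage-identity thresholds mentioned for sequences related to SEQ ID NO: 1-91 can be illustrated with a small sketch. It assumes the two sequences have already been aligned to the same length (gaps written as "-") and uses the alignment length as denominator; the text does not say which identity definition or alignment tool is intended, so both choices, the function name and the example sequences are assumptions.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned sequences.

    Assumes equal-length aligned strings; a gap ("-") never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    pairs = zip(seq_a.upper(), seq_b.upper())
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# Made-up sequences for illustration only; one mismatch in ten positions.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(f"{identity:.1f}% identity; meets the 90% threshold: {identity >= 90.0}")
```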
  • the molecular markers i.e. the molecular signature
  • molecules hybridising with said molecular markers i.e. the complementary sequences of said molecular markers
  • a solid support is preferably a chip or a plate. Any support known to the person skilled in the art which serves the purpose of chemically binding a molecular marker as described above is to be considered as comprised within the objects of the present invention.
  • a further aspect of the present invention relates to a support, preferably a chip or a plate, on which the molecular signature of the present invention is bound; in other words, a support on which the above-described molecular markers are bound.
  • the molecular signature of the present invention preferably chemically bound on a support as described above, can be used as a diagnostic agent.
  • the markers of the molecular signature in order to diagnose a pathology in an individual.
  • the molecular signature of the invention can be used to define the prognosis of a pathology, or else to monitor the results of a surgical or therapeutic treatment targeted against a pathology (i.e. so-called disease follow-up).
  • the pathology is preferably a peripheral T-cell lymphoma (PTCL).
  • PTCL peripheral T-cell lymphoma
  • the method of the invention is particularly suitable for subtyping PTCLs, preferably nodal PTCLs.
  • the method of the present invention makes it possible to distinguish whether a PTCL is a PTCL/NOS, preferably a PTCL/NOS CD30+, a PTCL/AITL or a PTCL/ALCL-ALK- and, since each PTCL subtype is associated with a different prognosis, the method of the present invention can also be considered as a method for determining the prognosis of a PTCL.
  • a signature comprising at least 5, preferably at least 15 or 38 molecular markers selected from among SEQ ID NO: 1-38, preferably chemically bound to a support as described above, is used as a diagnostic agent, in particular in a method for determining a PTCL, preferably for determining whether a PTCL is of the PTCL/NOS subtype or PTCL/AITL subtype.
  • a molecular signature that comprises at least 5, preferably at least 15 or 38 molecular markers selected from among SEQ ID NO: 1-38 is capable of distinguishing, in a clinically significant manner, individuals affected by a PTCL/NOS from those affected by a PTCL/AITL.
  • the method which uses, in step 1, a molecular signature comprising at least 5, preferably at least 15 or 38 molecular markers selected from among SEQ ID NO: 1-38 is preferably a method for diagnosing or subtyping a PTCL/NOS and/or a PTCL/AITL (i.e. of distinguishing whether it is a case of PTCL/NOS or PTCL/AITL).
  • the method also makes it possible to establish the prognosis tied to a PTCL/NOS and/or a PTCL/AITL.
  • the molecular signature comprising at least 5, preferably at least 15 or 53 molecular markers selected from among SEQ ID NO: 39-91, preferably chemically bound to a support as described above, is used as a diagnostic agent, in particular in a method for determining whether a PTCL is of the PTCL/NOS subtype, preferably PTCL/NOS CD30+, or of the PTCL/ALCL-ALK- subtype.
  • a molecular signature which comprises at least 5, preferably at least 15 or 53 molecular markers selected from among SEQ ID NO: 39-91 is capable of distinguishing, in a clinically significant manner, the individuals affected by a PTCL/NOS, preferably PTCL/NOS CD30+, from those affected by a PTCL/ALCL-ALK-.
  • the method which uses, in step 1, a molecular signature comprising at least 5, preferably at least 15 or 53 molecular markers selected from among SEQ ID NO: 39-91 is preferably a method for diagnosing or subtyping a PTCL/NOS, preferably PTCL/NOS CD30+, and/or a PTCL/ALCL-ALK- (i.e. of distinguishing whether it is a case of PTCL/NOS, preferably PTCL/NOS CD30+, or PTCL/ALCL-ALK-).
  • the method also makes it possible to establish the prognosis tied to a PTCL/NOS, preferably PTCL/NOS CD30+, and/or a PTCL/ALCL-ALK-.
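The pairing of marker ranges with subtype comparisons described above (SEQ ID NO: 1-38 for PTCL/NOS vs. PTCL/AITL, SEQ ID NO: 39-91 for PTCL/NOS vs. PTCL/ALCL-ALK-, each with at least 5 markers) can be sketched as a simple panel check. The function and constant names are hypothetical, and the sketch only encodes the minimum of 5 markers, not the preferred sizes.

```python
AITL_VS_NOS = set(range(1, 39))    # SEQ ID NO: 1-38
ALCL_VS_NOS = set(range(39, 92))   # SEQ ID NO: 39-91
MIN_MARKERS = 5                    # minimum signature size stated in the text


def supported_comparisons(panel):
    """List the binary comparisons a panel of SEQ ID numbers is large enough for."""
    panel = set(panel)
    comparisons = []
    if len(panel & AITL_VS_NOS) >= MIN_MARKERS:
        comparisons.append("PTCL/NOS vs PTCL/AITL")
    if len(panel & ALCL_VS_NOS) >= MIN_MARKERS:
        comparisons.append("PTCL/NOS vs PTCL/ALCL-ALK-")
    return comparisons


print(supported_comparisons([2, 8, 9, 16, 17, 24]))  # six markers from 1-38
```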
  • the method of the present invention is carried out in vitro starting from any isolated biological sample, for example a biopsy.
  • said biological sample is a "fresh" (i.e. just isolated) and/or frozen sample.
  • the biological sample can be fixed and embedded.
  • the biological sample can be fixed using the common fixing solutions utilised in laboratories, for example a formalin solution or alcohol solutions.
  • the embedding is achieved using the common embedding solutions utilised in laboratories, for example paraffin or OCT solutions.
  • the method of the present invention can be carried out starting from a biological sample that has been fixed, embedded, sectioned and preserved on a plate.
  • the biological sample referred to is isolated from any individual; preferably, the sample is isolated from an individual who has been diagnosed with a PTCL, or an individual who is suspected to be affected by PTCL.
  • the biological sample is treated with the aim of obtaining the expression profile of the molecular signature of the invention, i.e. in order to carry out step 1 of the above described method.
  • the genes are the molecular markers (sequences) of the molecular signature of the invention, i.e. at least 5, preferably at least 15, more preferably at least 38 or at least 53 markers selected from among SEQ ID NO: 1-91.
  • the 38 markers are SEQ ID NO: 1-38.
  • the 53 markers are SEQ ID NO: 39-91.
  • the methods for obtaining the expression profile of the molecular markers concerned are those known in the art which serve this purpose, for example, microarrays, quantitative PCR, digital PCR or next generation sequencing (also called deep sequencing).
  • the biological sample is subjected to a method comprising at least one of the following steps:
  • step (a) obtaining the nucleic acid molecules, in particular the transcribed RNA molecules in the biological sample (i.e. the messenger RNA, or mRNA, molecules);
  • step (b) retrotranscribing the RNA molecules obtained from step (a) in order to obtain the corresponding cDNA molecules;
  • step (c) amplifying and/or labelling the cDNA molecules obtained from step (b); and/or
  • step (d) quantifying the molecules amplified during step (c).
  • the quantitative data obtained from step (d) represent the expression profile of the molecular markers of the signature in the sample considered.
  • Obtaining the RNA according to step (a) and the retrotranscribing step (b) rely on techniques that are well known in the art and are thus carried out according to the protocols commonly used in this field, for example the methods described in laboratory manuals such as those by Sambrook et al. 1989, or Ausubel et al. 1994.
  • the amplification step (c) is preferably carried out using PCR-type techniques in the presence of probes (primers), i.e. DNA sequences complementary to the portions of cDNA that one wishes to amplify.
  • the amplification step is carried out by means of a DASL assay, which is known in the art for these purposes.
  • the labelling step is preferably carried out by performing the amplification step in the presence of labelled molecules, in particular in the presence of precursors of the labelled nucleic acids or labelled probes. Labelling is performed, preferably, with fluorescent or biotinylated molecules so as to enable the amplified, labelled DNA to be quantified by means of signal readers.
  • the amplified DNA can be quantified with methods known to the person skilled in the art, for example methods based on quantitative PCR, expression microarrays, RNA-sequencing or ad hoc arrays (i.e. so-called custom arrays).
  • the molecules to be amplified and/or labelled are at least 5, preferably at least 15, more preferably at least 38 or at least 53 markers selected from among SEQ ID NO: 1-91.
  • the 38 markers are SEQ ID NO: 1-38.
  • the 53 markers are SEQ ID NO: 39-91.
  • the amplification/labelling step is followed by a step of hybridising the amplified/labelled molecules on a solid support, for example on a chip, on which, preferably, the whole genome and/or transcriptome has been chemically bound, or on a support on which the molecular markers of the invention, i.e. the molecular signature as previously described or in any case molecules hybridising with said molecular markers, have been chemically bound.
  • the quantification, and hence the data of the expression profile, can be obtained using a reader, for example an iSCAN System and/or Bead Assay reader in the case of DASL Illumina technology, or else a reader like the Scanner 3000 and the evolutions thereof for Affymetrix technology, or TaqMan platforms in the event that the amplification and/or labelling is carried out with quantitative PCR assays.
  • a peripheral T-cell lymphoma i.e. a PTCL
  • the expression profile of the molecular markers of the signature envisages that SEQ ID NO: 2, 8-10, 14, 15, 19, 23, 24, 26, 30, 32, 34, 36-43, 45-47, 49, 50, 55, 56, 58, 59, 62, 64, 66, 68-71, 73, 74, 76, 78-81, 83, 84, 86 and 89-91 are underexpressed in the PTCL/AITL samples or in the PTCL/ALCL-ALK- samples compared to the PTCL/NOS samples.
  • the expression profile of the molecular markers of the signature envisages that SEQ ID NO: 1, 3-7, 11-13, 16-18, 20-22, 25, 27-29, 31, 33, 35, 44, 48, 51-54, 57, 60, 61, 63, 65, 67, 72, 75, 77, 82, 85, 87 and 88 are overexpressed in the PTCL/AITL samples or in the PTCL/ALCL-ALK- samples compared to the PTCL/NOS samples.
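The two SEQ ID lists above can be turned into a lookup of the expected expression direction relative to PTCL/NOS. The sketch below parses the range notation and checks that, as transcribed here, the two lists cover SEQ ID NO: 1-91 without overlap. The helper names are hypothetical.

```python
def parse_seq_ids(text):
    """Expand a list such as '2, 8-10, 14' into a set of integers."""
    ids = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            ids.update(range(first, last + 1))
        else:
            ids.add(int(part))
    return ids


UNDER = parse_seq_ids(
    "2, 8-10, 14, 15, 19, 23, 24, 26, 30, 32, 34, 36-43, 45-47, 49, 50, 55, 56, "
    "58, 59, 62, 64, 66, 68-71, 73, 74, 76, 78-81, 83, 84, 86, 89-91"
)
OVER = parse_seq_ids(
    "1, 3-7, 11-13, 16-18, 20-22, 25, 27-29, 31, 33, 35, 44, 48, 51-54, 57, 60, "
    "61, 63, 65, 67, 72, 75, 77, 82, 85, 87, 88"
)


def expected_direction(seq_id):
    """Direction in PTCL/AITL or PTCL/ALCL-ALK- samples relative to PTCL/NOS."""
    if seq_id in UNDER:
        return "under"
    if seq_id in OVER:
        return "over"
    raise KeyError(f"SEQ ID NO: {seq_id} is not part of the signature")


# Consistency check: every marker has exactly one direction.
assert UNDER.isdisjoint(OVER)
assert UNDER | OVER == set(range(1, 92))
print(expected_direction(16), expected_direction(24))
```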
  • the expression data of the molecular signature of the invention are analysed with the aid of a computerised system programmed with a binary classifier, preferably selected between a linear discriminant function and a Support Vector Machine.
  • Said function and said algorithm have preferably been preset using the expression data of the molecular markers of the signature described earlier, starting from biological samples isolated from individuals with PTCL/NOS, PTCL/AITL and PTCL/ALCL-ALK-.
  • these classifiers enable PTCL to be classified into two subtypes.
  • Said two subtypes are preferably PTCL/NOS vs. PTCL/ALCL-ALK- or PTCL/AITL vs. PTCL/NOS.
  • the linear discriminant function is Function 1 which follows:
  • the value of expression of the molecular marker is preferably the one obtained according to step 1 of the method, hence the value of expression of the molecular marker in the biological sample considered.
  • the discriminant coefficient for each molecular marker of the signature of the present invention is preferably the one indicated in Table III.
  • the discriminant coefficient of the molecular marker i ("i" means any marker of the signature of the invention) is a score associated with each discriminant molecular marker; said score indicates the importance which that given molecular marker has in classifying the PTCLs (the higher the score of the coefficient, the greater the weight this marker has in classifying the PTCL).
  • the discriminant coefficient is calculated taking into consideration the correlations between the expressions of the various molecular markers and is calculated in such a way as to maximise the difference between PTCLs, preferably between PTCL subtypes.
  • the constant is a fixed value that is added to the discriminant function and is likewise calculated in such a way as to maximise the difference between PTCLs, preferably between PTCL subtypes.
  • the values of the function are calculated in such a way as to maximise the difference between the subtypes PTCL/NOS and PTCL/AITL.
  • the values of the function are calculated in such a way as to maximise the difference between the subtypes PTCL/NOS, preferably PTCL/NOS CD30+, and PTCL/ALCL-ALK-.
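Function 1 itself does not appear in this text. From the terms described (an expression value and a discriminant coefficient for each molecular marker i, plus a constant), a linear discriminant function of the usual form would presumably read as follows. The symbols D, c, b_i, x_i and n are notation assumed here, not taken from the source.

```latex
D = c + \sum_{i=1}^{n} b_i \, x_i
```

Here x_i is the expression value of marker i in the biological sample, b_i is its discriminant coefficient, c is the constant, and n is the number of markers in the signature.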
  • GNA15 SEQ ID NO: 16 -0.304
  • KLRG1 SEQ ID NO: 62 -1.351
  • ICAM2 SEQ ID NO: 17 -0.158
  • KRT3 SEQ ID NO: 63 -1.470
  • MOAP1 SEQ ID NO: 20 0.906
  • MPPE1 SEQ ID NO: 66 2.365
  • NSD1 SEQ ID NO: 24 -2.187
  • NR2F2 SEQ ID NO: 70 -6.697
  • ACAD10 SEQ ID NO: 39 -12.539
  • TRAF3IP1 SEQ ID NO: 85 -6.891
  • the values of the discriminant coefficients of SEQ ID NO: 1-38 are discriminant for PTCL/AITL vs. PTCL/NOS, i.e. they enable PTCL/AITL to be distinguished from PTCL/NOS.
  • the values of the discriminant coefficients of SEQ ID NO: 39-91 are discriminant for PTCL/NOS, preferably a PTCL/NOS CD30+, vs. PTCL/ALCL-ALK-, i.e. they enable PTCL/NOS, preferably PTCL/NOS CD30+, to be distinguished from PTCL/ALCL-ALK-.
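A minimal sketch of how a linear discriminant score could be computed from marker expression values follows. It uses only the four PTCL/AITL vs. PTCL/NOS coefficients visible in the Table III excerpt, read as decimal numbers. The constant, the sample values and the function name are hypothetical placeholders, and because the text does not state a cut-off or which sign corresponds to which subtype, the sketch returns the raw score only.

```python
# Coefficients copied from the Table III excerpt (markers within SEQ ID NO: 1-38).
COEFFICIENTS = {"GNA15": -0.304, "ICAM2": -0.158, "MOAP1": 0.906, "NSD1": -2.187}
CONSTANT = 0.0  # hypothetical placeholder: the real constant is not given here


def discriminant_score(expression, coefficients=COEFFICIENTS, constant=CONSTANT):
    """Constant plus the coefficient-weighted sum of marker expression values."""
    missing = set(coefficients) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return constant + sum(coefficients[g] * expression[g] for g in coefficients)


# Made-up expression values for one sample, for illustration only.
sample = {"GNA15": 1.0, "ICAM2": 2.0, "MOAP1": 1.0, "NSD1": 0.5}
print(round(discriminant_score(sample), 4))
```

In practice the full coefficient set and constant from the trained classifier would replace the placeholders, and the score would be compared against the decision threshold established during training.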
  • the classifier is a Support Vector Machine (SVM) or kernel machine that preferably adopts a class prediction algorithm, widely used in the field of machine learning, which enables a type of supervised classification.
  • SVM Support Vector Machine
  • the algorithm has been set using the following parameters, starting from the expression data of the molecular markers of the signature in biological samples isolated from PTCL/NOS, PTCL/AITL and PTCL/ALCL-ALK- individuals:
  • the kernel function is about 1:0.1, preferably it is 1:0.1;
  • the kernel type is preferably linear;
  • the cost is about 100.0, preferably it is 100.0;
  • the maximum number of iterations is about 100000, preferably it is 100000;
  • the ratio is about 1.0, preferably it is 1.0;
  • the kernel function represents the measure adopted to measure the similarity between two different molecular markers and in the proposed model it is of a linear type, i.e. calculated as the product of the expression values of each pair of molecular markers.
  • the maximum number of iterations represents the maximum number of cycles that the SVM algorithm may complete while seeking a convergence of the results.
  • the cost value represents a parameter that measures the complexity of the mathematical model in relation to the probability of misclassifying the various biological samples.
  • a cost value of 100 permits the right compromise to be found between the separation of classes and classification errors.
  • the ratio value indicates the ratio between the cost of misclassifying one class in relation to the other. The ratio used, equal to 1, serves to ensure that there is no increase in false negatives in relation to the potential increase in false positives.
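The SVM settings listed above can be sketched as a small configuration together with a linear kernel. The dictionary keys and function names are hypothetical and do not correspond to any particular SVM library. The kernel is written in the usual SVM sense, as the sum of products of paired expression values from two expression profiles, and the "1:0.1" kernel-function setting is left out because its meaning is not clear from the text. Treating the ratio as a multiplier on the cost of the second class is likewise an assumption.

```python
SVM_PARAMS = {
    "kernel": "linear",
    "cost": 100.0,
    "max_iterations": 100000,
    "ratio": 1.0,
}


def linear_kernel(x, y):
    """Linear kernel: sum of products of paired expression values."""
    if len(x) != len(y):
        raise ValueError("expression profiles must have the same length")
    return sum(a * b for a, b in zip(x, y))


def class_costs(params=SVM_PARAMS):
    """Misclassification cost per class (assumed: second class = cost * ratio)."""
    return params["cost"], params["cost"] * params["ratio"]


# Made-up two-marker expression profiles for two samples.
print(linear_kernel([1.0, 2.0], [3.0, 4.0]))
print(class_costs())
```

With a ratio of 1.0 both classes carry the same cost, which matches the statement that false negatives and false positives are weighted equally.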
  • the analysis of the expression values of the molecular markers of the signature is based on mathematical models (i.e. the linear discriminant function or SVM algorithm) capable of using the information provided by the expression values of the molecular markers of the signature in groups of individuals known with certainty to be affected by PTCL/NOS, PTCL/AITL and PTCL/ALCL-ALK-.
  • This type of analysis of the expression values of the molecular markers of the signature enables the biological samples subjected to the method of the invention to be classified, in a binary manner, into the different PTCL subtypes.
  • step 2 envisages a preceding step (called "training step", from which it follows that the mathematical model is "trained"), in which the data related to the expression of the molecular markers of the signature are collected on the basis of biological samples isolated from individuals with PTCL/NOS, PTCL/ALCL-ALK- and PTCL/AITL.
  • the collected data regard the expression levels of different molecular markers in the single PTCL subtypes. These data (i.e. these expression profiles of the molecular markers) are used to create the weights of the mathematical function on which the binary classifier works.
  • the binary classifier is calibrated, or pre-set, in such a way as to recognise the different PTCL subtypes when the method according to the present invention is applied on a biological test sample. It is clear that the step of calibrating or pre-setting the classifier is carried out only in the step of preparing the classifier and it will thus preferably not be carried out in the method of the invention, which will be used on a routine basis in laboratories to analyse the biological samples.
  • Step 2 of the method of the invention envisages analysing the expression values (i.e. the expression profile) of the molecular markers (i.e. of the signature) obtained from the biological sample of an individual with suspected PTCL, or in any case an individual who is subjected to the method of the invention, using the weights created during the training step, i.e. using the mathematical function which is at the basis of the binary classifier and has been pre-set as described above.
  • the result of this analysis is a score, based on which each biological sample is classified as PTCL/NOS, in particular PTCL/NOS CD30+, PTCL/ALCL-ALK- or PTCL/AITL.
  • the score will vary according to the biological sample or the type of comparison among PTCL subtypes. Therefore, based on the score it will be possible to differentiate a PTCL/NOS sample, preferably a PTCL/NOS CD30+ sample, from a PTCL/ALCL-ALK- sample, or it will be possible to differentiate a PTCL/NOS sample from a PTCL/AITL sample.
  • the linear discriminant function is used when the number of molecular markers utilised in the signature is relatively limited, preferably when the number of molecular markers is around 30-60 markers.
  • the expression profile of the molecular markers (i.e. step 1) is preferably obtained with methods of analysing gene expression, for example with microarrays using a chip.
  • the linear discriminant function can be used for the purpose of carrying out step 2 of the method of the invention also when the number of molecular markers used in the signature is less than around 20, preferably less than around 10 molecular markers.
  • the expression profile of the molecular markers is preferably obtained by means of quantitative PCR, RNA-sequencing or another custom array.
  • step 1 can be achieved by means of a quantitative PCR using the primers, possibly labelled (oligonucleotides), specific for the molecular markers of the signature of the present invention.
  • step 2 of the method according to the invention with a binary classifier such as the SVM, which, under this condition, enables an accurate classification suitable for diagnostic use.
  • the molecular signature of the present invention can further be used in order to develop dedicated assays (so-called custom arrays), which comprise the markers of the molecular signature as described earlier.
  • the method of the invention enables a PTCL to be determined, in particular it enables the subtyping of PTCLs, preferably by distinguishing PTCL/NOS, in particular PTCL/NOS CD30+, from PTCL/ALCL-ALK- and PTCL/AITL from PTCL/NOS, with a sensitivity of around 70-75% and a specificity of around 80-96%.
  • the method of the present invention shows a diagnostic accuracy which ranges from 70 to 95%.
  • This diagnostic accuracy derives from a study of step 3, that is to say, a study aimed at comparing the effectiveness of a system versus the gold standard presently in use, in observance of the rules of Evidence-Based Medicine (EBM).
  • EBM Evidence-Based Medicine
  • a method characterised by such diagnostic accuracy is sufficient to meet clinical requirements and can thus be used in place of and/or alongside the current reference diagnostic method, which is based on analysing the morphological, phenotypic and clinical features belonging to the different PTCL subtypes.
  • the method according to the present invention proves capable of distinguishing the patients affected by PTCL/AITL or patients affected by PTCL/ALCL-ALK- from those affected by PTCL/NOS, in particular those affected by a PTCL/NOS CD30+, with a considerable diagnostic accuracy.
  • PTCL/NOS characterised by CD30+ cells
  • the possibility of being able to distinguish it from PTCL/ALCL-ALK- is particularly advantageous, because the two PTCL subtypes, despite showing a phenotype that partly overlaps, clinically display a very different prognosis.
  • PTCL/NOS has a less favourable prognosis than PTCL/ALCL-ALK- and therefore a therapeutic approach that is targeted from the earliest stages is certainly more effective and more convenient, especially for the affected subject and, depending on the place, also for reducing health care costs.
  • the method according to the present invention shows a high specificity (SP) and is thus preferably a method that can be used as a SPIN test (deriving from SPecificity-rule IN). In fact, if positive, the method of the invention confirms the presence of the pathology.
  • the method of the present invention shows an equally high sensitivity (SN) and is thus also ideal as a SNOUT test (deriving from SeNsitivity-rule OUT), in the sense that when the result of the method is negative, the disease is ruled out.
  • the present invention further relates to a kit for carrying out the method according to the present invention, i.e. a kit for determining a PTCL, preferably for determining the PTCL subtypes, i.e. PTCL/NOS, PTCL/ALCL-ALK- and PTCL/AITL.
  • the kit of the invention comprises reagents for determining the expression of at least 5, preferably at least 15, more preferably at least 38 or at least 53 sequences selected from among SEQ ID NO: 1-91.
  • said reagents are sets of primers for determining the expression of at least 5, preferably at least 15, more preferably at least 38 or at least 53 sequences selected from among SEQ ID NO: 1-91.
  • probes specific for said sequences possibly bound on a solid support, for example a plate, filter or membrane.
  • said reagents are at least a solid support, such as an array/microarray (DNA chip) of molecules of nucleic acids, comprising molecules hybridising with at least 5, preferably at least 15, more preferably at least 38 or at least 53 sequences selected from among SEQ ID NO: 1-91.
  • Said reagents can alternatively be beads comprising said probes or primers, or other reagents for determining the expression of the sequences described in the present invention.
  • the kit there may be at least one enzyme and at least one buffer, for example enzymes like reverse transcriptase, DNA polymerase or ligase, nucleotides, positive control sequences, negative control sequences etc.
  • a total number of 244 samples were subjected to the present study. In particular, they were samples of nodal PTCLs taken from subjects who had given their consent to treatment.
  • the 112 samples comprise: 80 PTCL/NOS samples, 20 PTCL/AITL samples and 12 PTCL/ALCL-ALK- samples.
  • test set a further set of samples consisting of 132 cases of PTCL (i.e. the validation set) was analysed.
  • the validation set includes: 78 cases of PTCL/NOS, 43 cases of PTCL/AITL and 11 cases of PTCL/ALCL-ALK-.
  • raw data of the gene expression profile are available in the GEO database. This information was obtained from “fresh” or frozen biopsy samples.
  • the diagnoses of all samples taken into consideration were verified by at least two expert haematopathologists, who made the diagnosis using WHO classification criteria.
  • the latter data i.e. the diagnoses made with the reference standard (i.e. the method based on morphological and phenotypic criteria), were used in the study as the reference standard.
  • the molecular signature based on GEP data i.e. the molecular classifier which represents the index test, was then applied retrospectively in 2011. The analysis was conducted by two experts in bioinformatics and gene expression analysis, both of whom were unaware of the results of the other tests. The study was conducted according to the principles of the Helsinki declaration, following approval of the Internal Review Board (IRB), reference number 201 _001.
  • IRB Internal Review Board
  • DASL cDNA-mediated Annealing, Selection, extension, and Ligation
  • RNA was extracted from the fixed tissues with the RecoverAll™ Total Nucleic Acid Isolation Kit.
  • RNA was purified by filtration through glass fibres carried out simultaneously with a treatment with DNAse.
  • the RNA molecules were eluted with low-salt buffer and quantified using a NanoDrop spectrophotometer.
  • the DASL assay begins with the conversion of the total RNA molecules into cDNAs using biotinylated Oligo(dT) and random nonamers.
  • DAP DASL Assay Pool
  • these template molecules are labelled using fluorescent primers that are added to the reaction. Subsequently, the products resulting from the PCR are scanned using a BeadArray Reader or an iScan System in order to determine the presence or absence of specific genes.
  • Genes@Work software which is a tool for gene expression analysis based on an algorithm that works by "pattern discovery" and on analysing the location of structural patterns through sequential histograms such as GeneSpring GX 11 (in the case of Genes@Work and GeneSpring GX 11 software, reference should be made to Piccaluga PP, et al. Blood, 117:3596-608, 2011).
  • EASE software was applied in order to establish, through gene ontology, whether the deregulated genes defined specific cell functions of biological processes in a significant manner.
  • the classifier is a scoring function based on the values of a set of genes (gene cluster), which are differentially expressed in two sets of cell types (the samples of the training set) and can therefore be used for classifying the cell type of the samples included in the test set and the samples included in the validation set.
  • the forecasting model constructed on the basis of the values shown by the training set samples was then run on samples of the test set and samples of the validation set. Each sample was then assigned by an algorithm to one of the two categories considered (namely, PTCL/NOS vs. PTCL/AITL in one case and PTCL/NOS vs. PTCL/ALCL-ALK- in the other) when a confidence measure score >0.5 was observed.
  • the reference standard (the anatomical-pathological method) and the index test (method of the invention) were compared in terms of sensitivity, specificity, positive predictive value, negative predictive value, confidence interval and likelihood ratio using CATmaker (Centre for Evidence Based Medicine, Oxford University, http://www.cebm.net) software.
  • stepwise method a statistical prediction test based on discriminant analysis (“stepwise method") was used, relying on SPSS software (IBM, USA). Exactly analogous results were obtained (100% concordance), when the original molecular signature was compared with one reduced through discriminant analysis. This step (reduction of the signature via discriminant analysis) is relevant from a clinical viewpoint, as it allows a simpler diagnostic assay to be performed while maintaining the same effectiveness.
  • the diagnostic accuracy was measured in terms of sensitivity, specificity, positive predictive value, negative predictive value, confidence interval and likelihood ratio. These parameters were obtained using CATmaker software.
  • PTCL/AITL and PTCL/ALCL-ALK- can be distinguished from PTCL/NOS on the basis of the total gene expression.
  • the differentially expressed genes were identified based on the p-value (p ⁇ 005) and fold change.
  • the PTCL/AITL samples are distinguished from the PTCL/NOS samples with a p < 0.0001.
  • the PTCL/ALCL-ALK- samples are distinguished from the PTCL/NOS samples with a p < 0.0001.
  • the molecular signature based on the gene expression profile is capable of efficiently discriminating PTCL/AITL and PTCL/ALCL-ALK- from PTCL/NOS.
  • an SVM algorithm was used, that is, a reproducible system for managing gene expression profile data.
  • the first step was to construct a model using the complete molecular signatures identified and testing them on samples of the training set (25 PTCL/NOS, 10 PTCL/AITL and 6 PTCL/ALCL-ALK- samples).
  • the test has a sensitivity of 90% and specificity of 100%.
  • the positive and negative predictive values are respectively 100% and 98%.
  • the sensitivity of the test is 100% and the specificity is 98%, whilst the positive and negative predictive values are respectively 86% and 100%.
  • the genes used for the differential diagnosis were then reduced through discriminant analysis to 38 genes (for PTCL/NOS vs. PTCL/AITL) and 53 genes (for PTCL/NOS vs. PTCL/ALCL-ALK-) (see Fig 1-3).
  • the validation set consisted of 78 PTCL/NOS samples, 43 PTCL/AITL samples and 11 PTCL/ALCL-ALK- samples.
  • a sensitivity of 72% and specificity of 80% were obtained.
  • the positive and negative predictive values are respectively 66% and 84%. Therefore, the total diagnostic accuracy is 77%.
  • PTCL/NOS were tested versus PTCL/ALCL-ALK- and 8/11 PTCL/ALCL and 75/78 PTCL/NOS were correctly classified.
  • the sensitivity and specificity in this case are respectively 73% and 96%.
  • the positive and negative predictive values are respectively 73% and 96%. Therefore, the total diagnostic accuracy is 93%.
  • the method according to the present invention proves capable of distinguishing patients affected by PTCL/AITL or patients affected by PTCL/ALCL-ALK- from those affected by PTCL/NOS with considerable diagnostic accuracy.
  • the molecular signature according to the present invention is efficient for identifying CD30+ PTCL/NOS.
  • the molecular signatures according to the present invention show a significant effect on the post-test probability of disease.
  • the factor which best confirms the validity of a medical test is its ability to distinguish between the probability of the presence of a given pathological condition before and after the test in question.
  • the probability after the test went from 15% to 100% in the case of a positive result and from 15% to 2% in the case of a negative result in the test set.
  • the probability after the test went from 35% to 66% and to 16%, respectively, in the case of positive and negative results.
  • the probability after the test increased from 10% to 86% in the case of a positive result, whereas it went from 10% to 0% in the case of a negative result.
  • the probability after the test went from 12% to 72% and 4%, respectively, in the case of positive and negative results.
  • the molecular signatures according to the present invention implement prognostic accuracy.
  • the method according to the present invention made it possible to distinguish, in the case of CD30 positive forms, the patients with a better prognosis (indicated by the test as PTCL/ALCL-ALK-) from those with a worse prognosis (indicated as PTCL/NOS) (Figure 4).
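
The validation figures quoted above for PTCL/NOS versus PTCL/ALCL-ALK- (8/11 PTCL/ALCL and 75/78 PTCL/NOS correctly classified) can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original disclosure, which relied on CATmaker software; treating PTCL/ALCL-ALK- as the "positive" class, and the names `Counts` and `diagnostic_metrics`, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Cells of a 2x2 table (index test vs. reference standard)."""
    tp: int  # positives correctly called positive
    fn: int  # positives called negative
    tn: int  # negatives correctly called negative
    fp: int  # negatives called positive


def diagnostic_metrics(c: Counts) -> dict:
    """Return the standard accuracy measures as fractions in [0, 1]."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Assumption: PTCL/ALCL-ALK- is the "positive" class.
# 8 of 11 PTCL/ALCL and 75 of 78 PTCL/NOS were correctly classified.
validation = Counts(tp=8, fn=3, tn=75, fp=3)
metrics = diagnostic_metrics(validation)

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Under that assumption the rounded values agree with the percentages reported for this comparison (73% sensitivity, 96% specificity, 73% and 96% predictive values, 93% overall accuracy).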

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention relates to a method for determining a peripheral T-cell lymphoma (PTCL), in particular for subtyping PTCLs with a clinically significant diagnostic accuracy, said method being based on the use of a molecular signature. Furthermore, the present invention relates to said molecular signature and its use as a diagnostic agent.

Description

"MOLECULAR SIGNATURE AND ITS USES AS A DIAGNOSTIC AGENT"
*******
DESCRIPTION
The present invention relates to a method for determining a peripheral T-cell lymphoma (PTCL), preferably for determining the PTCL subtype, said method being based on the use of a molecular signature. The present invention further relates to said molecular signature and its use as a diagnostic agent.
Peripheral T-Cell lymphomas (hereinafter PTCL) represent 10-15% of all lymphomas and comprise a group of non-Hodgkin lymphomas that develop from T cells at different stages of maturity.
The REAL/WHO (Revised European-American Lymphoma/World Health Organization) classification distinguishes PTCLs into two main categories comprising specific forms of PTCL and nonspecific forms, also known as NOS (acronym of "Not Otherwise Specified").
In Europe and in the USA, 75% of all cases of PTCL fall under the following four subtypes: PTCL/NOS; PTCL/AITL (AITL is the acronym of AngioImmunoblastic T-cell Lymphoma); and PTCL/ALCL (ALCL is the acronym of Anaplastic Large Cell Lymphoma), which in turn is distinguishable into PTCL/ALCL-ALK+ (ALK is the acronym of Anaplastic Lymphoma Kinase) and PTCL/ALCL-ALK-.
The heterogeneity of PTCLs is evident not only from a classification standpoint, but also from a prognostic standpoint. In fact, each subtype of PTCL is associated with a specific prognosis.
In particular, international studies have demonstrated that PTCL/NOS and PTCL/AITL have a significantly worse prognosis than PTCL/ALCL-ALK- forms. However, the impossibility up to now of distinguishing the various subtypes (i.e. PTCL/AITL, PTCL/ALCL-ALK- and PTCL/NOS) makes it difficult to establish a prognosis based solely on a morphological and immunophenotypic evaluation.
It is thus clear that being able to diagnose the specific PTCL subtype affecting a patient is fundamental in order to undertake the therapeutic approach that is most specific, and therefore also most effective, right from the early phases of clinical treatment of the pathology.
At present the differential diagnosis of PTCL, aimed at subtyping these lymphomas (i.e. the so-called reference standard or reference method), is based on an analysis of the morphological, phenotypic and clinical features characterising the different PTCL subtypes. This type of investigation is highly complex and is conducted exclusively by medical personnel who have acquired solid competence in the field of haematopathology. Practically speaking, today, the differential diagnosis of PTCL is entrusted to a highly expert haematopathologist who carefully analyses a sample from the patient in order to determine the presence or absence of the distinctive morphological features of the different PTCL subtypes.
It is clear that, notwithstanding the considerable experience of the expert and the care with which he/she performs an analysis of the sample, the final differential diagnosis will be "biased" (in a variable percentage, as the expert is a variable component) by the subjectivity of whoever makes it. Not coincidentally, a recent international study reported that up to 30% of the diagnoses of PTCL are wrong (Pileri S and Piccaluga PP. J Clin Invest. 2012 Oct 1;122(10):3448-55).
The problem underlying a diagnosis based on anatomopathological criteria resides in the fact that the morphology, phenotype and clinical signs, i.e. the parameters presently used for subtyping PTCLs, vary considerably and are in part common to the different PTCL subtypes.
For example, PTCL/NOS are often associated with characteristics that are also typical of PTCL/AITL or PTCL/ALCL-ALK-. In particular, PTCL/NOS can exhibit a phenotype correlated to that of the T-helper follicular (Thf) lymphocytes commonly observed in PTCL/AITL; or else PTCL/NOS can be characterised by the presence of a sizable number of large CD30+ cells, wholly analogous to the ones characterising PTCL/ALCL-ALK-. The fact that the different types of PTCL share morphological, phenotypic and clinical features creates "grey" areas in which it is not possible to establish which PTCL subtype the patient is affected by. For example, in these clinical cases (i.e. in the so-called grey area) it is not possible to establish with clinically significant diagnostic accuracy whether a given PTCL belongs to the PTCL/NOS or PTCL/AITL subtype or the PTCL/NOS or PTCL/ALCL-ALK- subtype.
In particular, with regard to PTCL/NOS characterised by CD30+ cells, the possibility of distinguishing it from PTCL/ALCL-ALK-, likewise characterised by CD30+ cells, is particularly advantageous, because, as noted earlier, the two subtypes show a very different prognosis; specifically, PTCL/NOS has a less favourable prognosis.
Because of this uncertainty, and because therapy cannot be specifically tailored to a defined PTCL subtype from the earliest stages of the clinical approach, broad spectrum therapies are adopted, but these prove to be clinically less effective and riskier for the patient. To this drawback it is also necessary to add the poor cost-effectiveness of such a clinical approach for the public/private health care provider, which will have to adopt broader, more costly therapies even when the clinical situation is potentially associable with a less severe prognosis.
Today, in fact, the first line of treatment is common to the various PTCL subtypes (CHOP/CHOP-like chemotherapy). However, the available experimental approach would be diversifiable, in the sense that, for example, PTCL/NOS can be treated with tyrosine-kinase inhibitors or histone deacetylase inhibitors, or PTCL/AITL can be treated with antiangiogenic drugs, or monoclonal antibodies such as rituximab, whilst PTCL/ALCL-ALK- can benefit from anti-CD30 monoclonal antibodies.
Therefore, in light of the fact that, from a clinical viewpoint, the PTCL subtypes have a completely different prognosis, it is greatly desirable to be able to develop a therapeutic approach that would enable a specific clinical treatment for each PTCL subtype right from the earliest stages.
However, as discussed above, at present the distinction between PTCL/NOS and PTCL/AITL or between PTCL/NOS and PTCL/ALCL-ALK- at a diagnostic level is still very complex and unresolved. Thus there are clear advantages tied to the development of a method enabling PTCL to be diagnosed, preferably distinguishing the various PTCL subtypes.
In particular, there is a felt need for a method of diagnosing PTCLs which is free of all subjectivity, that is to say, free of any source of diagnostic error, and is also sensitive, reliable and accurate. Such a method would make it possible to establish, from the earliest stages of treatment of a patient affected by PTCL, a specific therapy which takes account of the biology of the particular subtype of lymphoma.
Recently, various studies conducted on tumours of haematological origin have demonstrated that the different PTCL subtypes are distinct from a molecular viewpoint. However, no study has tested the potential diagnostic accuracy of an analysis of gene expression profiles in the diagnosis of PTCLs.
In fact, in order to replace or supplement the methods presently used for the differential diagnosis of PTCLs, the new diagnostic tool, freed of subjective criteria, must in any case assure a diagnostic accuracy that is comparable, if not even better, than that characterising the reference standard, i.e. the method based on analysing the morphological-clinical features of the PTCL sample.
To this end, the Applicant has developed a method for diagnosing PTCL which is based on the use of a molecular signature. In particular, the method of the present invention enables a differential diagnosis of PTCL, that is, it enables a determination of the PTCL subtype (i.e. a subtyping of the PTCL).
In general, the molecular signature or molecular classifier of the present invention can be used as a diagnostic agent, i.e. it can be used in a method for determining the presence of a pathology in an individual.
In particular, the molecular signature of the present invention is capable of diagnosing, preferably by subtyping, a PTCL with a diagnostic accuracy that is comparable to if not better than the diagnostic accuracy of the reference standard (i.e. the method presently used for subtyping PTCLs, which is based on analysing the morphological, phenotypic and clinical features).
The method for subtyping PTCLs of the present invention is particularly innovative and advantageous because it is the first method to be based on the use and analysis of gene expression profiles. Furthermore, the method of the present invention is the first method characterised by a diagnostic accuracy deriving from a phase III diagnostic study.
In particular, by applying the method according to the present invention it is possible to determine whether an individual is affected by a PTCL, and in particular it is possible to determine the subtype it belongs to (i.e. whether it is PTCL/NOS or PTCL/ALCL-ALK-, or PTCL/NOS or PTCL/AITL) with a clinically significant diagnostic accuracy as high as 98%.
Until today, no molecular signature capable of enabling PTCL/NOS to be distinguished from PTCL/AITL and/or from PTCL/ALCL-ALK- was known; that is, there was no known method which, based on an analysis of the expression of the molecular signature of the invention, enabled an accurate differential diagnosis of PTCLs.
In particular, the method according to the present invention shows a high (80-100%) specificity (SP) and is thus ideal as a SPIN test (deriving from SPecificity-rule IN), in the sense that, if positive, said method confirms the presence of the pathology. Moreover, the method of the present invention shows an equally high (72-100%) sensitivity (SN) and is thus also ideal as a SNOUT test (deriving from SeNsitivity-rule OUT), in the sense that when the result of the method is negative, the disease is ruled out.
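The SPIN/SNOUT reasoning can be made concrete through likelihood ratios, which convert a pre-test probability into a post-test probability. The sketch below is only illustrative: the function names are hypothetical, and the input values (sensitivity 0.73, specificity 0.96, pre-test probability 0.12) are assumptions chosen to lie within the ranges mentioned in this document, not figures taken from a specific analysis.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Positive and negative likelihood ratios of a binary test.

    Note: a specificity of exactly 1.0 (or 0.0) makes one ratio undefined.
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Illustrative (assumed) inputs, not results reported by the authors.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.73, specificity=0.96)
p_after_positive = post_test_probability(0.12, lr_pos)
p_after_negative = post_test_probability(0.12, lr_neg)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"post-test probability after a positive result: {p_after_positive:.0%}")
print(f"post-test probability after a negative result: {p_after_negative:.0%}")
```

A high specificity drives the positive likelihood ratio up, so a positive result raises the probability of disease sharply (rule in); a high sensitivity drives the negative likelihood ratio towards zero, so a negative result lowers it (rule out).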
In light of the above-mentioned advantages, the method of the invention is particularly useful in clinical practice. In fact, it is highly informative both in cases in which the result of the method is positive, and in cases in which the result is negative. In this regard it is also necessary to emphasize that, to date, the use of the gene expression profiles and the analysis thereof has essentially been confined to the field of research. Their exclusion from the diagnostic field is mainly tied to the necessity of using, as the starting material, only "fresh" or frozen isolated tissue samples which are often not available.
The method proposed also overcomes this difficulty because it also works well and accurately on a fixed isolated tissue sample, which may be sectioned and preserved, for example on laboratory slides.
This aspect is particularly relevant, since in routine diagnostic activity, both in Italy and abroad, the biological sample available for investigating this type of pathology is usually an isolated tissue sample fixed and embedded in ad hoc solutions, for example paraffin-based ones.
Therefore, the method of the invention is the first method based on the analysis of a gene expression profile that has been validated on fixed biological tissue in the realm of lymphomas.
The present invention is described in detail below, also with the aid of the appended figures, in which:
- Figure 1 shows the hierarchical clustering of genes differentially expressed in a significant manner according to discriminant analysis of PTCL/AITL versus PTCL/NOS (A) and PTCL/ALCL-ALK- versus PTCL/NOS (B).
- Figure 2 (A,B) shows the genes expressed differentially between the PTCL/NOS samples and the PTCL/ALCL-ALK- samples.
- Figure 3 shows the genes expressed differentially between the PTCL/NOS samples and the PTCL/AITL samples.
- Figure 4 shows the survival curves: of the various PTCL subtypes (A); of the various molecular subtypes (B); and of the PTCL/NOS CD30+ subtype and PTCL/ALCL-ALK- subtype (C).
A first aspect of the present invention regards a method, carried out in vitro and based on the use of a molecular signature, which enables a peripheral T-cell lymphoma (PTCL) to be determined. In particular, the method of the invention enables a subtyping of PTCLs, preferably as PTCL/NOS, PTCL/AITL or PTCL/ALCL-ALK-. In other words, by applying the method of the invention, it is possible to distinguish a PTCL/NOS from a PTCL/AITL, or a PTCL/NOS from a PTCL/ALCL-ALK-.
The PTCLs to which the present invention relates are preferably those affecting the nodes.
In the context of the present invention, molecular signature means a set of molecular markers. A molecular marker or molecular classifier is a nucleic acid sequence, for example a DNA sequence, in particular a gene sequence. A nucleic acid sequence should be understood as meaning both the sequence of a filament of nucleic acid and the complementary sequence, and/or an RNA sequence.
Therefore, a molecular signature is also definable as a set of DNA sequences or a set of genes (or also a gene pool).
The method of the present invention is carried out in vitro, i.e. on a biological sample isolated from an individual who is affected or suspected to be affected by a PTCL, and comprises at least one step of measuring, in a biological sample, the levels of expression of the molecular markers of the signature of the invention as described below (Step 1).
Subsequently, the expression levels of the molecular markers measured according to step 1 are analysed, preferably by using a binary classifier (Step 2). The binary classifier is preferably based on a linear discriminant function or on Support Vector Machine algorithms.
Preferably, the method of the invention comprises at least a step 1 and at least a step 2 as described above.
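As a rough illustration of how Step 2 might operate with a linear discriminant function, the sketch below computes a weighted sum of marker expression values and compares it with a threshold. Everything in it is hypothetical: the marker names, weights, bias, threshold and expression values are placeholders invented for the example and are not the trained parameters of the classifier described here.

```python
from typing import Mapping


def discriminant_score(expression: Mapping[str, float],
                       weights: Mapping[str, float],
                       bias: float = 0.0) -> float:
    """Linear discriminant score: bias + sum of weight * expression value."""
    missing = set(weights) - set(expression)
    if missing:
        raise KeyError(f"expression values missing for: {sorted(missing)}")
    return bias + sum(w * expression[marker] for marker, w in weights.items())


def classify(score: float, positive_label: str, negative_label: str,
             threshold: float = 0.0) -> str:
    """Binary decision: scores above the threshold get the positive label."""
    return positive_label if score > threshold else negative_label


# Hypothetical weights that would, in practice, come from the training step.
weights = {"MARKER_A": 0.8, "MARKER_B": -0.5, "MARKER_C": 0.3}
bias = -0.2

# Hypothetical (already normalised) expression values measured in Step 1.
sample = {"MARKER_A": 1.5, "MARKER_B": 0.5, "MARKER_C": 1.0}

score = discriminant_score(sample, weights, bias)
label = classify(score, positive_label="PTCL/AITL", negative_label="PTCL/NOS")
print(f"score = {score:.2f} -> {label}")
```

A Support Vector Machine with a linear kernel yields a decision function of the same form, with the weights and bias obtained by a different training criterion.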
In one embodiment of the invention, the molecular signature comprises at least 5, preferably at least 15, more preferably at least 38 or at least 53 molecular markers selected from among the markers indicated in Table I.
Table I shows the name of the molecular marker (i.e. of the gene) and the reference sequence (RefSeq) obtained from NCBI database (http://www.ncbi.nlm.nih.gov/refseq/). The reference sequence identifies a variant of the sequence for each molecular marker described.
Table I
Marker         RefSeq        Marker         RefSeq        Marker         RefSeq        Marker         RefSeq
ABHD14         NM_001146314  LDOC1L         NM_032287     SNAP91         NM_001242792  LYG2           NM_175735
ADFP           NR_038064     LOC285636      NM_175921     MPPE1          NM_001242904  PPM1G          NM_177983
ADH4           NM_000670     LOC441208      NR_003502     EPDR1          NM_001243076  MRPL21         NM_181514
ALG1           NM_019109     LOC649159      NR_038368     TXN            NM_001244938  KCTD8          NM_198353
AMICA1         NM_001098526  LYRM2          NM_020466     BLK            NM_001715     NPM3           NM_006993
ANXA5          NM_001154     MAK10          NM_024635     F2R            NM_001992     SNHG10         NR_001459
APPBP2         NM_006380     MALAT1         NR_002819     GNA15          NM_002068     FLJ22795       NR_026811
ARSG           NM_014960     MAP3K13        NM_001242314  JUN            NM_002228     FLJ32679       NR_033351
ATG4C          NM_032852     MATR3          NM_001194954  NRAS           NM_002524     NXT1           NM_013248
ATP5A1         NM_001001937  MED28          NM_025205     S100A4         NM_002961     ORC4L          NM_181741
BRCA2          NM_000059     MED30          NM_080651     XRCC4          NM_003401     P2RY8          NM_178129
C10ORF32       NR_037644     MON1B          NM_014940     ZFP36          NM_003407     PARL           NM_001037639
C1ORF59        NM_144584     MPP7           NM_173496     AGPS           NM_003659     PHF11          NM_001040443
C1ORF71        NM_001139459  MRPL20         NM_017971     BNIP3L         NM_004331     PNMA1          NM_006029
C21ORF7        NM_020152     MT2A           NM_005953     ATF1           NM_005171     POLR2D         NM_004805
C3ORF14        NM_020685     NDUFB2         NM_004546     GPR37          NM_005302     PPP2R5A        NM_001199756
C3ORF19        NM_016474     BTK            NM_000061     POU3F2         NM_005604     PSMB4          NM_002796
C8ORF33        NM_054099     NR3C1          NM_000176     SFRS4          NM_005626     R3HDM2         NM_014925
CANT1          NM_001159772  MYO5A          NM_000259     USPL1          NM_005800     RNF181         NM_016494
CCDC107        NM_001195200  UROD           NM_000374     KLRG1          NM_005810     RPAIN          NM_001033002
CDC34          NM_004359     CTSK           NM_000396     ZNF192         NM_006298     RPL22          NM_000983
CELSR3         NM_001407     FCGR3A         NM_000569     COPE           NM_007263     SASS6          NM_194292
COX17          NM_005694     ICAM2          NM_000873     DAPP1          NM_014395     SCAF1          NM_021228
CPSF6          NM_007007     NUMB           NM_001005743  DBC1           NM_014618     SEC61G         NM_001012456
CTNNA1         NM_001903     AOF2           NM_001009999  SPAST          NM_014946     SF3B5          NM_031287
CYFIP2         NM_001037332  CENPP          NM_001012267  CRLF3          NM_015986     SH3BP4         NM_014521
DAGLA          NM_006133     DDR2           NM_001014796  MRPS23         NM_016070     SIX3           NM_005413
DGKQ           NM_001347     S100PBP        NM_001017406  FIT1           NM_016232     SLC35B4        NM_032826
DNAJC2         NM_001129887  ALDH18A1       NM_001017423  SCAND1         NM_016558     SLC36A1        NM_078483
EEF1B2         NM_001037663  COX19          NM_001031617  RSRC1          NM_016625     SPHAR          NM_006542
EFCAB3         NM_001144933  CTLA4          NM_001037631  TRIT1          NM_017646     STS-1          NM_032873
EIF1           NM_005801     PPP4R1         NM_001042388  FAR2           NM_018099     TBC1D15        NM_001146213
EMR2           NM_013447     XIRP2          NM_001079810  ABCF3          NM_018358     TBC1D9         NM_015130
EPM2AIP1       NM_014805     PEAR1          NM_001080471  LMBRD1         NM_018368     TCF7L2         NM_001146274
ERN1           NM_001433     BTLA           NM_001085357  CHPT1          NM_020244     TFB1M          NM_016020
ETS2           NM_005239     CDC45L         NM_001093633  MIB1           NM_020774     TMEM50B        NM_006134
FADS1          NM_013402     SYS1           NM_001099791  PBOV1          NM_021635     TSC22D2        NM_014779
FAM136A        NM_032822     PROK2          NM_001126128  MOAP1          NM_022151     TSHZ2          NM_001193421
PPFIBP2        NM_003621     PLD1           NM_001130081  NSD1           NM_022455     TUSC2          NM_007275
FLJ40142       XM_534714     ACAD10         NM_001136538  BCL11B         NM_022898     UBE2W          NM_001001481
FTSJD1         NM_001099642  TRAF3IP1       NM_001139490  BRIP1          NM_032043     UBE3B          NM_130466
GLO1           NM_006708     TM6SF1         NM_001144903  DKFZP564O0523  NM_032120     UTP23          NM_032334
GPR52          NM_005684     NR2F2          NM_001145155  E2F3           NM_032120     VAMP7          NM_001145149
GPSN2          NM_001086992  NFE2L2         NM_001145412  SPANXB1        NM_032461     ZMYM4          NM_005095
H3F3A          NM_002107     CEP120         NM_001166226  FAM78A         NM_033387     ZNF124         NM_001243740
HLTF           NM_003071     TUBD1          NM_001193609  KRT3           NM_057088     ZNF598         NM_178167
ID2            NM_002166     CAPN1          NM_001198868  ALKBH3         NM_139178     ZNF624         NM_020787
IL1RN          NM_000577     CRABP2         NM_001199723  TIGD1          NM_145702     ZNF681         NM_138286
KARS           NM_001130089  TGOLN2         NM_001206840  PSENEN         NM_172341     ZNF828         NM_001260800
KIAA2018       NM_001009899  PTPN2          NM_001207013  CYP4F22        NM_173483     ZNF92          NM_007139
In a further embodiment of the invention, the method envisages a step of measuring the expression levels of at least 5, preferably at least 15, more preferably at least 38 or at least 53 molecular markers selected from among SEQ ID NO: 1-91. Preferably, said at least 38 sequences are SEQ ID NO: 1-38. Preferably, said at least 53 sequences are SEQ ID NO: 39-91.
The sequences of the molecular signature of the present invention are shown in Table II. In particular, Table II shows the SEQ ID NO of the sequence of each molecular marker, the name of the molecular marker and the reference sequence of the molecular marker (RefSeq) obtained from the database (RefSeq NCBI http://www.ncbi.nlm.nih.gov/refseq/). In this case as well, the reference sequence identifies a variant of the sequence for each molecular marker described.
The complete sequences of the molecular markers present in the signature of the present invention are provided in the appended Sequence Listing. The sequences contained in the appended Sequence Listing are to be understood as incorporated in the present description.
Sequences having at least 70%, 80%, 85%, 90% or 95% identity with SEQ ID NO: 1-91, listed in Table II and described in their entirety in the appended Sequence Listing, are also to be considered part of the subject matter of the present invention.
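The identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (equal length, with "-" marking gaps) and uses the number of aligned columns as the denominator, which is only one of several conventions in use. The function name and the two short fragments are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are pre-aligned: same length, '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap facing a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical 20-nucleotide fragments that differ at two positions.
reference = "ATGGCCTTAGCTAGGCTTAA"
candidate = "ATGGCCTTAGATAGGCTAAA"

identity = percent_identity(reference, candidate)
meets_90_percent = identity >= 90.0
```

In practice the alignment itself would come from a dedicated tool; this sketch only covers the final counting step.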
Table II
Marker name    SEQ ID NO:     RefSeq        Marker name    SEQ ID NO:     RefSeq
ABCF3          SEQ ID NO: 1   NM_018358     CAPN1          SEQ ID NO: 47  NM_001198868
AGPS           SEQ ID NO: 2   NM_003659     COPE           SEQ ID NO: 48  NM_007263
ALDH18A1       SEQ ID NO: 3   NM_001017423  CRLF3          SEQ ID NO: 49  NM_015986
ALKBH3         SEQ ID NO: 4   NM_139178     CTLA4          SEQ ID NO: 50  NM_001037631
AOF2           SEQ ID NO: 5   NM_001009999  CYP4F22        SEQ ID NO: 51  NM_173483
CDC45L         SEQ ID NO: 6   NM_001093633  DAPP1          SEQ ID NO: 52  NM_014395
CENPP          SEQ ID NO: 7   NM_001012267  DBC1           SEQ ID NO: 53  NM_014618
CEP120         SEQ ID NO: 8   NM_001166226  DDR2           SEQ ID NO: 54  NM_001014796
CHPT1          SEQ ID NO: 9   NM_020244     E2F3           SEQ ID NO: 55  NM_032120
COX19          SEQ ID NO: 10  NM_001031617  EPDR1          SEQ ID NO: 56  NM_001243076
CRABP2         SEQ ID NO: 11  NM_001199723  F2R            SEQ ID NO: 57  NM_001992
CTSK           SEQ ID NO: 12  NM_000396     FAR2           SEQ ID NO: 58  NM_018099
FAM78A         SEQ ID NO: 13  NM_033387     FCGR3A         SEQ ID NO: 59  NM_000569
FLJ22795       SEQ ID NO: 14  NR_026811     GPR37          SEQ ID NO: 60  NM_005302
FLJ32679       SEQ ID NO: 15  NR_033351     KCTD8          SEQ ID NO: 61  NM_198353
GNA15          SEQ ID NO: 16  NM_002068     KLRG1          SEQ ID NO: 62  NM_005810
ICAM2          SEQ ID NO: 17  NM_000873     KRT3           SEQ ID NO: 63  NM_057088
JUN            SEQ ID NO: 18  NM_002228     LMBRD1         SEQ ID NO: 64  NM_018368
MIB1           SEQ ID NO: 19  NM_020774     LYG2           SEQ ID NO: 65  NM_175735
MOAP1          SEQ ID NO: 20  NM_022151     MPPE1          SEQ ID NO: 66  NM_001242904
MRPS23         SEQ ID NO: 21  NM_016070     MRPL21         SEQ ID NO: 67  NM_181514
NR3C1          SEQ ID NO: 22  NM_000176     MYO5A          SEQ ID NO: 68  NM_000259
NRAS           SEQ ID NO: 23  NM_002524     NFE2L2         SEQ ID NO: 69  NM_001145412
NSD1           SEQ ID NO: 24  NM_022455     NR2F2          SEQ ID NO: 70  NM_001145155
PPM1G          SEQ ID NO: 25  NM_177983     NUMB           SEQ ID NO: 71  NM_001005743
RSRC1          SEQ ID NO: 26  NM_016625     PBOV1          SEQ ID NO: 72  NM_021635
S100A4         SEQ ID NO: 27  NM_002961     PEAR1          SEQ ID NO: 73  NM_001080471
SCAND1         SEQ ID NO: 28  NM_016558     PLD1           SEQ ID NO: 74  NM_001130081
SFRS4          SEQ ID NO: 29  NM_005626     POU3F2         SEQ ID NO: 75  NM_005604
SPANXB1        SEQ ID NO: 30  NM_032461     PPP4R1         SEQ ID NO: 76  NM_001042388
SYS1           SEQ ID NO: 31  NM_001099791  PROK2          SEQ ID NO: 77  NM_001126128
TIGD1          SEQ ID NO: 32  NM_145702     PSENEN         SEQ ID NO: 78  NM_172341
TM6SF1         SEQ ID NO: 33  NM_001144903  PTPN2          SEQ ID NO: 79  NM_001207013
TRIT1          SEQ ID NO: 34  NM_017646     S100PBP        SEQ ID NO: 80  NM_001017406
TXN            SEQ ID NO: 35  NM_001244938  SNAP91         SEQ ID NO: 81  NM_001242792
USPL1          SEQ ID NO: 36  NM_005800     SNHG10         SEQ ID NO: 82  NR_001459
XRCC4          SEQ ID NO: 37  NM_003401     SPAST          SEQ ID NO: 83  NM_014946
ZNF192         SEQ ID NO: 38  NM_006298     TGOLN2         SEQ ID NO: 84  NM_001206840
ACAD10         SEQ ID NO: 39  NM_001136538  TRAF3IP1       SEQ ID NO: 85  NM_001139490
ATF1           SEQ ID NO: 40  NM_005171     TUBD1          SEQ ID NO: 86  NM_001193609
BCL11B         SEQ ID NO: 41  NM_022898     UROD           SEQ ID NO: 87  NM_000374
BLK            SEQ ID NO: 42  NM_001715     XIRP2          SEQ ID NO: 88  NM_001079810
BNIP3L         SEQ ID NO: 43  NM_004331     ZFP36          SEQ ID NO: 89  NM_003407
BRIP1          SEQ ID NO: 44  NM_032043     DKFZP564O0523  SEQ ID NO: 90  NM_032120
BTK            SEQ ID NO: 45  NM_000061     FIT1           SEQ ID NO: 91  NM_016232
BTLA           SEQ ID NO: 46  NM_001085357
According to a preferred aspect of the present invention, the molecular markers (i.e. the molecular signature) or molecules hybridising with said molecular markers (i.e. the complementary sequences of said molecular markers) are chemically bound on a solid support. Said support is preferably a chip or a plate. Any support known to the person skilled in the art which serves the purpose of chemically binding a molecular marker as described above is to be considered as comprised within the objects of the present invention.
Therefore, a further aspect of the present invention relates to a support, preferably a chip or a plate, on which the molecular signature of the present invention is bound; in other words, a support on which the above-described molecular markers are bound. The molecular signature of the present invention, preferably chemically bound on a support as described above, can be used as a diagnostic agent. In other words, it is possible to use the markers of the molecular signature in order to diagnose a pathology in an individual. Optionally, in addition to diagnosing a pathology, the molecular signature of the invention can be used to define the prognosis of a pathology, or else to monitor the results of a surgical or therapeutic treatment targeted against a pathology (i.e. so-called disease follow-up). In these cases as well, it is advantageous for the molecular signature to be chemically bound on the support as previously described. The pathology is preferably a peripheral T-cell lymphoma (PTCL). In these cases, the method of the invention is particularly suitable for subtyping PTCLs, preferably nodal PTCLs. Therefore, the method of the present invention makes it possible to distinguish whether a PTCL is a PTCL/NOS, preferably a PTCL/NOS CD30+, a PTCL/AITL or a PTCL/ALCL-ALK- and, since each PTCL subtype is associated with a different prognosis, the method of the present invention can also be considered as a method for determining the prognosis of a PTCL.
According to a preferred aspect of the invention, a signature comprising at least 5, preferably at least 15 or 38 molecular markers selected from among SEQ ID NO: 1-38, preferably chemically bound to a support as described above, is used as a diagnostic agent, in particular in a method for determining a PTCL, preferably for determining whether a PTCL is of the PTCL/NOS subtype or PTCL/AITL subtype. In other words, a molecular signature that comprises at least 5, preferably at least 15 or 38 molecular markers selected from among SEQ ID NO: 1-38 is capable of distinguishing, in a clinically significant manner, individuals affected by a PTCL/NOS from those affected by a PTCL/AITL. Therefore, the method which uses, in step 1, a molecular signature comprising at least 5, preferably at least 15 or 38 molecular markers selected from among SEQ ID NO: 1-38 is preferably a method for diagnosing or subtyping a PTCL/NOS and/or a PTCL/AITL (i.e. of distinguishing whether it is a case of PTCL/NOS or PTCL/AITL). Clearly, in this case the method also makes it possible to establish the prognosis tied to a PTCL/NOS and/or a PTCL/AITL.
According to a further preferred aspect of the invention, the molecular signature comprising at least 5, preferably at least 15 or 53 molecular markers selected from among SEQ ID NO: 39-91, preferably chemically bound to a support as described above, is used as a diagnostic agent, in particular in a method for determining whether a PTCL is of the PTCL/NOS subtype, preferably PTCL/NOS CD30+, or of the PTCL/ALCL-ALK- subtype. In other words, a molecular signature which comprises at least 5, preferably at least 15 or 53 molecular markers selected from among SEQ ID NO: 39-91 is capable of distinguishing, in a clinically significant manner, the individuals affected by a PTCL/NOS, preferably PTCL/NOS CD30+, from those affected by a PTCL/ALCL-ALK-. Therefore, the method which uses, in step 1, a molecular signature comprising at least 5, preferably at least 15 or 53 molecular markers selected from among SEQ ID NO: 39-91 is preferably a method for diagnosing or subtyping a PTCL/NOS, preferably PTCL/NOS CD30+, and/or a PTCL/ALCL-ALK- (i.e. of distinguishing whether it is a case of PTCL/NOS, preferably PTCL/NOS CD30+, or PTCL/ALCL-ALK-). Clearly, in this case as well the method also makes it possible to establish the prognosis tied to a PTCL/NOS, preferably PTCL/NOS CD30+, and/or a PTCL/ALCL-ALK-.
The method of the present invention is carried out in vitro starting from any isolated biological sample, for example a biopsy. Preferably, said biological sample is a "fresh" (i.e. just isolated) and/or frozen sample. Alternatively, the biological sample can be fixed and embedded. For example, the biological sample can be fixed using the common fixing solutions utilised in laboratories, for example a formalin solution or alcohol solutions. In the same manner, the embedding is achieved using the common embedding solutions utilised in laboratories, for example paraffin or OCT solutions. According to particular embodiments of the invention, the method of the present invention can be carried out starting from a biological sample that has been fixed, embedded, sectioned and preserved on a plate.
The biological sample referred to is isolated from any individual; preferably, the sample is isolated from an individual who has been diagnosed with a PTCL, or an individual who is suspected to be affected by PTCL.
The biological sample is treated with the aim of obtaining the expression profile of the molecular signature of the invention, i.e. in order to carry out step 1 of the above-described method. In general, one speaks of determining the Gene Expression Profile (GEP). In this specific case, the genes are the molecular markers (sequences) of the molecular signature of the invention, i.e. at least 5, preferably at least 15, more preferably at least 38 or at least 53 markers selected from among SEQ ID NO: 1-91. Preferably, the 38 markers are SEQ ID NO: 1-38. Preferably, the 53 markers are SEQ ID NO: 39-91.
The methods for obtaining the expression profile of the molecular markers concerned are those known in the art which serve this purpose, for example, microarrays, quantitative PCR, digital PCR or next generation sequencing (also called deep sequencing).
In particular, according to a preferred aspect of the present invention, in order to obtain the expression profile of the molecular markers concerned (and thus to carry out step 1 of the method of the invention), the biological sample is subjected to a method comprising at least one of the following steps:
(a) purifying/extracting the nucleic acid molecules, in particular the transcribed RNA molecules in the biological sample (i.e. the messenger RNA-mRNA molecules); and/or
(b) retrotranscribing the RNA molecules obtained from step (a) in order to obtain the corresponding cDNA molecules; and/or
(c) amplifying and/or labelling the cDNA molecules obtained from step (b); and/or
(d) quantifying the molecules amplified during step (c).
The quantitative data obtained from step (d) represent the expression profile of the molecular markers of the signature in the sample considered.
The procedures for extracting RNA according to step (a) and the retrotranscribing step (b) are techniques that are well known in the art and are thus carried out according to the protocols commonly used in this field, for example the methods described in laboratory manuals such as those by Sambrook et al. 1989 or Ausubel et al. 1994.
The amplification step (c) is preferably carried out using PCR-type techniques in the presence of probes (primers), i.e. DNA sequences complementary to the portions of cDNA that one wishes to amplify. Preferably, the amplification step is carried out by means of a DASL assay, which is known in the art for these purposes.
The labelling step is preferably carried out by performing the amplification step in the presence of labelled molecules, in particular in the presence of precursors of the labelled nucleic acids or labelled probes. Labelling is performed, preferably, with fluorescent or biotinylated molecules so as to enable the amplified, labelled DNA to be quantified by means of signal readers. In any case, the amplified DNA can be quantified with methods known to the person skilled in the art, for example methods based on quantitative PCR, expression microarrays, RNA-sequencing or ad hoc arrays (i.e. so-called custom arrays).
Preferably, the molecules to be amplified and/or labelled are at least 5, preferably at least 15, more preferably at least 38 or at least 53 markers selected from among SEQ ID NO: 1-91. Preferably, the 38 markers are SEQ ID NO: 1-38. Preferably, the 53 markers are SEQ ID NO: 39-91.
Preferably, the amplification/labelling step is followed by a step of hybridising the amplified/labelled molecules on a solid support, for example on a chip, on which, preferably, the whole genome and/or transcriptome has been chemically bound, or on a support on which the molecular markers of the invention, i.e. the molecular signature as previously described or in any case molecules hybridising with said molecular markers, have been chemically bound. In this case the quantification, and hence the data of the expression profile, can be obtained using a reader, for example an iSCAN System and/or Bead Assay reader in the case of DASL Illumina technology, or else a reader like the Scanner 3000 and the evolutions thereof for Affymetrix technology, or TaqMan platforms in the event that the amplification and/or labelling is carried out with quantitative PCR assays.
The expression profile of the molecular markers of the signature of the invention in the biological sample concerned, obtained according to step 1, is subjected to step 2 of the method in order to determine whether the sample is a peripheral T-cell lymphoma (i.e. a PTCL), and in particular in order to determine which PTCL subtype the biological sample belongs to, that is, which PTCL subtype the individual is affected by.
In particular, for the purpose of diagnosing a PTCL, preferably for the purpose of subtyping the PTCL (and thus distinguishing PTCL/NOS, preferably PTCL/NOS CD30+, PTCL/AITL and PTCL/ALCL-ALK-), the expression profile of the molecular markers of the signature envisages that SEQ ID NO: 2, 8-10, 14, 15, 19, 23, 24, 26, 30, 32, 34, 36-43, 45-47, 49, 50, 55, 56, 58, 59, 62, 64, 66, 68-71, 73, 74, 76, 78-81, 83, 84, 86 and 89-91 are underexpressed in the PTCL/AITL samples or in the PTCL/ALCL-ALK- samples compared to the PTCL/NOS samples.
Furthermore, the expression profile of the molecular markers of the signature envisages that SEQ ID NO: 1, 3-7, 11-13, 16-18, 20-22, 25, 27-29, 31, 33, 35, 44, 48, 51-54, 57, 60, 61, 63, 65, 67, 72, 75, 77, 82, 85, 87 and 88 are overexpressed in the PTCL/AITL samples or in the PTCL/ALCL-ALK- samples compared to the PTCL/NOS samples. In particular, the expression data of the molecular signature of the invention are analysed with the aid of a computerised system programmed with a binary classifier, preferably selected between a linear discriminant function and a Support Vector Machine.
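As a consistency check on the two lists of SEQ ID NOs above, the following sketch expands the ranges and verifies that every marker from 1 to 91 falls into exactly one of the two groups. The helper names are hypothetical; the range strings are copied from the text, with the OCR spacing removed.

```python
from typing import Set


def expand_ranges(spec: str) -> Set[int]:
    """Expand a string such as '2, 8-10, 14' into the set {2, 8, 9, 10, 14}."""
    numbers: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(piece) for piece in part.split("-"))
            numbers.update(range(start, end + 1))
        else:
            numbers.add(int(part))
    return numbers


# Markers described as underexpressed in PTCL/AITL or PTCL/ALCL-ALK- versus PTCL/NOS.
UNDEREXPRESSED = expand_ranges(
    "2, 8-10, 14, 15, 19, 23, 24, 26, 30, 32, 34, 36-43, 45-47, 49, 50, 55, 56, "
    "58, 59, 62, 64, 66, 68-71, 73, 74, 76, 78-81, 83, 84, 86, 89-91"
)

# Markers described as overexpressed in the same comparison.
OVEREXPRESSED = expand_ranges(
    "1, 3-7, 11-13, 16-18, 20-22, 25, 27-29, 31, 33, 35, 44, 48, 51-54, 57, 60, "
    "61, 63, 65, 67, 72, 75, 77, 82, 85, 87, 88"
)


def expected_direction(seq_id: int) -> str:
    """Return 'under' or 'over' for a SEQ ID NO, following the two lists above."""
    if seq_id in UNDEREXPRESSED:
        return "under"
    if seq_id in OVEREXPRESSED:
        return "over"
    raise KeyError(f"SEQ ID NO: {seq_id} is not part of the signature")


all_markers = set(range(1, 92))
lists_are_disjoint = not (UNDEREXPRESSED & OVEREXPRESSED)
lists_cover_signature = (UNDEREXPRESSED | OVEREXPRESSED) == all_markers
```

Read this way, the two lists are disjoint and together account for all 91 markers.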
Said function and said algorithm have preferably been preset using the expression data of the molecular markers of the signature described earlier, starting from biological samples isolated from individuals with PTCL/NOS, PTCL/AITL and PTCL/ALCL-ALK-. Being binary, these classifiers enable PTCL to be classified into two subtypes. Said two subtypes are preferably PTCL/NOS vs. PTCL/ALCL-ALK- or PTCL/AITL vs. PTCL/NOS. In particular, the linear discriminant function is Function 1 which follows:
Function 1:
$$D(X) = w_0 + \sum_{i=1}^{n} w_i X_i$$
where:
D(X) = discriminant function
w_0 = constant
w_i = discriminant coefficient of molecular marker i
X_i = value of expression of molecular marker i
n = number of molecular markers of the signature that are used
The value of expression of the molecular marker is preferably the one obtained according to step 1 of the method, hence the value of expression of the molecular marker in the biological sample considered.
The discriminant coefficient for each molecular marker of the signature of the present invention is preferably the one indicated in Table III. The discriminant coefficient of the molecular marker i ("i" means any marker of the signature of the invention) is a score associated with each discriminant molecular marker; said score indicates the importance which that given molecular marker has in classifying the PTCLs (the higher the score of the coefficient, the greater the weight this marker has in classifying the PTCL). The discriminant coefficient is calculated taking into consideration the correlations between the expressions of the various molecular markers and is calculated in such a way as to maximise the difference between PTCLs, preferably between PTCL subtypes. The constant is a fixed value that is added to the discriminant function and is likewise calculated in such a way as to maximise the difference between PTCLs, preferably between PTCL subtypes. Preferably, for the molecular markers SEQ ID NO: 1-38, the values of the function are calculated in such a way as to maximise the difference between the subtypes PTCL/NOS and PTCL/AITL. For the molecular markers SEQ ID NO: 39-91, the values of the function are calculated in such a way as to maximise the difference between the subtypes PTCL/NOS, preferably PTCL/NOS CD30+, and PTCL/ALCL-ALK-.
Table III
(Coefficient = discriminant coefficient of molecular marker i)
Marker name    SEQ ID NO:     Coefficient Marker name    SEQ ID NO:     Coefficient
ABCF3          SEQ ID NO: 1   -0.329      CAPN1          SEQ ID NO: 47  1.279
AGPS           SEQ ID NO: 2   1.386       COPE           SEQ ID NO: 48  13.193
ALDH18A1       SEQ ID NO: 3   -0.835      CRLF3          SEQ ID NO: 49  2.923
ALKBH3         SEQ ID NO: 4   0.220       CTLA4          SEQ ID NO: 50  -1.247
AOF2           SEQ ID NO: 5   0.368       CYP4F22        SEQ ID NO: 51  1.132
CDC45L         SEQ ID NO: 6   0.905       DAPP1          SEQ ID NO: 52  5.880
CENPP          SEQ ID NO: 7   -0.201      DBC1           SEQ ID NO: 53  -0.357
CEP120         SEQ ID NO: 8   0.600       DDR2           SEQ ID NO: 54  2.926
CHPT1          SEQ ID NO: 9   -1.268      E2F3           SEQ ID NO: 55  3.657
COX19          SEQ ID NO: 10  1.104       EPDR1          SEQ ID NO: 56  2.483
CRABP2         SEQ ID NO: 11  0.914       F2R            SEQ ID NO: 57  -1.176
CTSK           SEQ ID NO: 12  -0.755      FAR2           SEQ ID NO: 58  2.113
FAM78A         SEQ ID NO: 13  -0.995      FCGR3A         SEQ ID NO: 59  -5.673
FLJ22795       SEQ ID NO: 14  -2.624      GPR37          SEQ ID NO: 60  -5.293
FLJ32679       SEQ ID NO: 15  -0.527      KCTD8          SEQ ID NO: 61  -3.215
GNA15          SEQ ID NO: 16  -0.304      KLRG1          SEQ ID NO: 62  -1.351
ICAM2          SEQ ID NO: 17  -0.158      KRT3           SEQ ID NO: 63  -1.470
JUN            SEQ ID NO: 18  1.447       LMBRD1         SEQ ID NO: 64  -0.855
MIB1           SEQ ID NO: 19  -1.548      LYG2           SEQ ID NO: 65  0.257
MOAP1          SEQ ID NO: 20  0.906       MPPE1          SEQ ID NO: 66  2.365
MRPS23         SEQ ID NO: 21  -0.375      MRPL21         SEQ ID NO: 67  0.267
NR3C1          SEQ ID NO: 22  0.312       MYO5A          SEQ ID NO: 68  -4.208
NRAS           SEQ ID NO: 23  1.617       NFE2L2         SEQ ID NO: 69  -0.981
NSD1           SEQ ID NO: 24  -2.187      NR2F2          SEQ ID NO: 70  -6.697
PPM1G          SEQ ID NO: 25  0.562       NUMB           SEQ ID NO: 71  -6.042
RSRC1          SEQ ID NO: 26  0.540       PBOV1          SEQ ID NO: 72  -1.107
S100A4         SEQ ID NO: 27  0.704       PEAR1          SEQ ID NO: 73  0.668
SCAND1         SEQ ID NO: 28  0.348       PLD1           SEQ ID NO: 74  2.120
SFRS4          SEQ ID NO: 29  0.806       POU3F2         SEQ ID NO: 75  14.377
SPANXB1        SEQ ID NO: 30  -0.470      PPP4R1         SEQ ID NO: 76  -2.717
SYS1           SEQ ID NO: 31  1.273       PROK2          SEQ ID NO: 77  -3.908
TIGD1          SEQ ID NO: 32  1.748       PSENEN         SEQ ID NO: 78  -8.444
TM6SF1         SEQ ID NO: 33  -0.435      PTPN2          SEQ ID NO: 79  0.435
TRIT1          SEQ ID NO: 34  -2.803      S100PBP        SEQ ID NO: 80  5.926
TXN            SEQ ID NO: 35  -0.386      SNAP91         SEQ ID NO: 81  9.737
USPL1          SEQ ID NO: 36  -0.710      SNHG10         SEQ ID NO: 82  -3.461
XRCC4          SEQ ID NO: 37  -0.521      SPAST          SEQ ID NO: 83  -8.229
ZNF192         SEQ ID NO: 38  1.906       TGOLN2         SEQ ID NO: 84  5.519
ACAD10         SEQ ID NO: 39  -12.539     TRAF3IP1       SEQ ID NO: 85  -6.891
ATF1           SEQ ID NO: 40  1.759       TUBD1          SEQ ID NO: 86  9.490
BCL11B         SEQ ID NO: 41  -7.342      UROD           SEQ ID NO: 87  3.791
BLK            SEQ ID NO: 42  0.328       XIRP2          SEQ ID NO: 88  -2.525
BNIP3L         SEQ ID NO: 43  -1.177      ZFP36          SEQ ID NO: 89  5.716
BRIP1          SEQ ID NO: 44  1.393       DKFZP564O0523  SEQ ID NO: 90  -1.480
BTK            SEQ ID NO: 45  1.114       FIT1           SEQ ID NO: 91  -11.385
BTLA           SEQ ID NO: 46  -1.007
Preferably, the values of the discriminant coefficients of SEQ ID NO: 1-38 are discriminant for PTCL/AITL vs. PTCL/NOS, i.e. they enable PTCL/AITL to be distinguished from PTCL/NOS.
Preferably, the values of the discriminant coefficients of SEQ ID NO: 39-91 are discriminant for PTCL/NOS, preferably a PTCL/NOS CD30+, vs. PTCL/ALCL-ALK-, i.e. they enable PTCL/NOS, preferably PTCL/NOS CD30+, to be distinguished from PTCL/ALCL-ALK-.
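A linear discriminant score of the kind described above (a constant plus a weighted sum of expression values) might be computed as in the following sketch. The five coefficients are the Table III entries for AGPS, JUN, NRAS, NSD1 and TRIT1, read as decimal numbers. Everything else is an assumption made for illustration: the constant is not stated in this passage, so a placeholder of 0.0 is used; the expression values are invented; and the function name is hypothetical. Since the coefficients were derived for the full marker set, a five-marker subset only shows the arithmetic and is not a usable classifier.

```python
from typing import Mapping

# Discriminant coefficients of five SEQ ID NO: 1-38 markers, as listed in Table III.
COEFFICIENTS = {
    "AGPS": 1.386,
    "JUN": 1.447,
    "NRAS": 1.617,
    "NSD1": -2.187,
    "TRIT1": -2.803,
}

# Placeholder only: the value of the constant is not given in the text.
CONSTANT_PLACEHOLDER = 0.0


def discriminant_score(
    expression: Mapping[str, float],
    coefficients: Mapping[str, float],
    constant: float,
) -> float:
    """Return constant + sum(coefficient_i * expression_i) over the listed markers."""
    missing = sorted(set(coefficients) - set(expression))
    if missing:
        raise KeyError(f"missing expression values for: {', '.join(missing)}")
    return constant + sum(
        weight * expression[marker] for marker, weight in coefficients.items()
    )


# Invented expression values for one hypothetical sample.
sample_expression = {
    "AGPS": 8.0,
    "JUN": 10.0,
    "NRAS": 7.5,
    "NSD1": 9.0,
    "TRIT1": 6.0,
}

score = discriminant_score(sample_expression, COEFFICIENTS, CONSTANT_PLACEHOLDER)
```

Which side of which cut-off corresponds to which PTCL subtype depends on how the function was calibrated, so the sketch stops at the score and does not assign a class.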
Alternatively, the classifier is a Support Vector Machine (SVM) or kernel machine that preferably adopts a class prediction algorithm, widely used in the field of machine learning, which enables a type of supervised classification.
Preferably, the algorithm has been set using the following parameters, starting from the expression data of the molecular markers of the signature in biological samples isolated from PTCL/NOS, PTCL/AITL and PTCL/ALCL-ALK- individuals:
- the kernel function is about 1:0.1, preferably it is 1:0.1; and/or
- the kernel type is preferably linear; and/or
- the cost is about 100.0, preferably it is 100.0; and/or
- the maximum number of iterations is about 100000, preferably it is 100000; and/or
- the ratio is about 1.0, preferably it is 1.0.
In the context of the present invention the kernel function represents the measure adopted to measure the similarity between two different molecular markers and in the proposed model it is of a linear type, i.e. calculated as the product of the expression values of each pair of molecular markers.
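For illustration, a linear kernel in its usual formulation is the sum of element-wise products of two expression vectors of equal length. The sketch below assumes the two vectors are expression profiles measured over the same ordered list of markers; the function name and the numbers are hypothetical.

```python
from typing import Sequence


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Linear kernel: sum of the element-wise products of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("both vectors must cover the same markers")
    return sum(a * b for a, b in zip(x, y))


# Two invented three-marker expression vectors.
profile_a = [1.0, 2.0, 3.0]
profile_b = [0.5, -1.0, 2.0]

similarity = linear_kernel(profile_a, profile_b)
```

A larger value indicates that the two vectors point in a more similar direction; no non-linear transformation of the expression values is involved.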
In the context of the present invention the maximum number of iterations represents the maximum number of cycles that the SVM algorithm may complete before arriving at a convergence of the results.
In the context of the present invention, the cost value represents a parameter that measures the complexity of the mathematical model in relation to the probability of misclassifying the various biological samples. A cost value of 100 permits the right compromise to be found between the separation of classes and classification errors. In the context of the present invention the ratio value indicates the ratio between the cost of misclassifying one class in relation to the other. The ratio used, equal to 1 , serves to ensure that there is no increase in false negatives in relation to the potential increase in false positives.
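The parameter values listed earlier can be gathered in a small settings object, as in the sketch below. The class and field names are hypothetical and are not tied to any particular SVM library. The "kernel function" value of 1:0.1 is left out because its meaning is not spelled out here. Applying the ratio to the second class is an assumption about how the ratio is used.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SvmSettings:
    """Preferred SVM parameter values as stated in the text."""

    kernel_type: str = "linear"
    cost: float = 100.0
    max_iterations: int = 100_000
    ratio: float = 1.0

    def class_costs(self) -> Tuple[float, float]:
        """Misclassification cost for each class.

        Assumption: the ratio scales the cost of the second class
        relative to the first.
        """
        return (self.cost, self.cost * self.ratio)


settings = SvmSettings()
cost_first_class, cost_second_class = settings.class_costs()
```

With a ratio of 1.0 both classes carry the same misclassification cost, which matches the stated aim of not trading false negatives against false positives.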
Therefore, the analysis of the expression values of the molecular markers of the signature, obtained according to the above-described step 2, is based on mathematical models (i.e. the linear discriminant function or SVM algorithm) capable of using the information provided by the expression values of the molecular markers of the signature in groups of individuals known with certainty to be affected by PTCL/NOS, PTCL/AITL and PTCL/ALCL-ALK-. This type of analysis of the expression values of the molecular markers of the signature enables the biological samples subjected to the method of the invention to be classified, in a binary manner, into the different PTCL subtypes.
Therefore, the mathematical model (which obviously works on a computerised system), characterised by weights enabling the classification (i.e. the distinction) of PTCLs, is "trained", "calibrated" or "pre-set" based on samples diagnosed as PTCL/NOS, PTCL/AITL and PTCL/ALCL-ALK-. In these cases, step 2 envisages a preceding step (called "training step", from which it follows that the mathematical model is "trained"), in which the data related to the expression of the molecular markers of the signature are collected on the basis of biological samples isolated from individuals with PTCL/NOS, PTCL/ALCL-ALK- and PTCL/AITL. The collected data regard the expression levels of different molecular markers in the single PTCL subtypes. These data (i.e. these expression profiles of the molecular markers) are used to create the weights of the mathematical function on which the binary classifier works. In other words, the binary classifier is calibrated, or pre-set, in such a way as to recognise the different PTCL subtypes when the method according to the present invention is applied on a biological test sample. It is clear that the step of calibrating or pre-setting the classifier is carried out only in the step of preparing the classifier and it will thus preferably not be carried out in the method of the invention, which will be used on a routine basis in laboratories to analyse the biological samples.
Step 2 of the method of the invention envisages analysing the expression values (i.e. the expression profile) of the molecular markers (i.e. of the signature) obtained from the biological sample of an individual with suspected PTCL, or in any case an individual who is subjected to the method of the invention, using the weights created during the training step, i.e. using the mathematical function which is at the basis of the binary classifier and has been pre-set as described above.
The result of this analysis is a score, based on which each biological sample is classified as PTCL/NOS, in particular PTCL/NOS CD30+, PTCL/ALCL-ALK- or PTCL/AITL. In particular, the score will vary according to the biological sample or the type of comparison among PTCL subtypes. Therefore, based on the score it will be possible to differentiate a PTCL/NOS sample, preferably a PTCL/NOS CD30+ sample, from a PTCL/ALCL-ALK- sample, or it will be possible to differentiate a PTCL/NOS sample from a PTCL/AITL sample.
In a preferred embodiment, in order to carry out step 2 the linear discriminant function is used when the number of molecular markers utilised in the signature is relatively limited, preferably when the number of molecular markers is around 30-60 markers. In this case the expression profile of the molecular markers (i.e. step 1) is preferably obtained with methods of analysing gene expression, for example with microarrays using a chip.
The linear discriminant function can be used for the purpose of carrying out step 2 of the method of the invention also when the number of molecular markers used in the signature is less than around 20, preferably less than around 10 molecular markers. In these cases the expression profile of the molecular markers is preferably obtained by means of quantitative PCR, RNA-sequencing or another custom array. In the latter case, for example, step 1 can be achieved by means of a quantitative PCR using the primers, possibly labelled (oligonucleotides), specific for the molecular markers of the signature of the present invention.
Alternatively, when a global transcriptional profile is available (i.e. when a substantial number of data are analysed), for example, a case of a whole-genome microarray, it is advisable to carry out step 2 of the method according to the invention with a binary classifier such as the SVM, which, under this condition, enables an accurate classification suitable for diagnostic use.
The molecular signature of the present invention can further be used in order to develop dedicated assays (so-called custom arrays), which comprise the markers of the molecular signature as described earlier.
Advantageously, the method of the invention enables a PTCL to be determined; in particular, it enables the subtyping of PTCLs, preferably by distinguishing PTCL/NOS, in particular PTCL/NOS CD30+, from PTCL/ALCL-ALK- and PTCL/AITL from PTCL/NOS with a sensitivity of around 70-75% and a specificity of around 80-96%. Finally, the method of the present invention shows a diagnostic accuracy which ranges from 70 to 95%.
This diagnostic accuracy derives from a phase 3 study, that is to say, a study aimed at comparing the effectiveness of a system versus the gold standard presently in use, in observance of the rules of Evidence-Based Medicine (EBM).
A method characterised by such diagnostic accuracy is sufficient to meet clinical requirements and can thus be used in place of and/or alongside the current reference diagnostic method, which is based on analysing the morphological, phenotypic and clinical features belonging to the different PTCL subtypes.
In light of these results, the method according to the present invention proves capable of distinguishing the patients affected by PTCL/AITL or patients affected by PTCL/ALCL-ALK- from those affected by PTCL/NOS, in particular those affected by a PTCL/NOS CD30+, with a considerable diagnostic accuracy.
In particular, with regard to PTCL/NOS characterised by CD30+ cells, the possibility of being able to distinguish it from PTCL/ALCL-ALK-, which is likewise characterised by CD30+ cells, is particularly advantageous, because the two PTCL subtypes, despite showing a phenotype that partly overlaps, clinically display a very different prognosis. In fact, PTCL/NOS has a poorer prognosis than PTCL/ALCL-ALK- and therefore a therapeutic approach that is targeted from the earliest stages is certainly more effective and more convenient, especially for the affected subject and, depending on the place, also for reducing health care costs.
The method according to the present invention shows a high specificity (SP) and is thus preferably a method that can be used as a SPIN test (deriving from SPecificity-rule IN). In fact, if positive, the method of the invention confirms the presence of the pathology. The method of the present invention shows an equally high sensitivity (SN) and is thus also ideal as a SNOUT test (deriving from SeNsitivity-rule OUT), in the sense that when the result of the method is negative, the disease is ruled out.
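The performance figures quoted above follow from standard definitions, which can be sketched as below. The counts in the example are invented purely to show the arithmetic and are not data from the study; the function name is hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: diseased cases called positive/negative by the test;
    tn/fp: non-diseased cases called negative/positive by the test.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true cases detected
        "specificity": tn / (tn + fp),  # fraction of non-cases correctly excluded
        "accuracy": (tp + tn) / total,  # fraction of all calls that are correct
    }


# Invented counts for a hypothetical two-subtype comparison.
metrics = diagnostic_metrics(tp=15, fp=5, tn=45, fn=5)
```

A highly specific test produces few false positives, so a positive result tends to rule the condition in; a highly sensitive test produces few false negatives, so a negative result tends to rule it out.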
The present invention further relates to a kit for carrying out the method according to the present invention, i.e. a kit for determining a PTCL, preferably for determining the PTCL subtypes, i.e. PTCL/NOS, PTCL/ALCL-ALK- and PTCL/AITL.
In one embodiment of the invention, the kit of the invention comprises reagents for determining the expression of at least 5, preferably at least 15, more preferably at least 38 or at least 53 sequences selected from among SEQ ID NO: 1-91. Preferably, said reagents are sets of primers for determining the expression of at least 5, preferably at least 15, more preferably at least 38 or at least 53 sequences selected from among SEQ ID NO: 1-91, or else probes specific for said sequences, possibly bound on a solid support, for example a plate, filter or membrane. Alternatively, said reagents are at least a solid support, such as an array/microarray (DNA chip) of molecules of nucleic acids, comprising molecules hybridising with at least 5, preferably at least 15, more preferably at least 38 or at least 53 sequences selected from among SEQ ID NO: 1-91. Said reagents can alternatively be beads comprising said probes or primers, or other reagents for determining the expression of the sequences described in the present invention.
In the kit there may be at least one enzyme and at least one buffer, for example enzymes like reverse transcriptase, DNA polymerase or ligase, nucleotides, positive control sequences, negative control sequences etc.
EXAMPLE
Sample selection.
A total of 244 samples were included in the present study. In particular, they were samples of nodal PTCLs taken from subjects who had given their consent to treatment.
112 samples of PTCL were retrieved from the archives of the centres involved in the study (i.e. University of Bologna and University of Frankfurt) and represent the training set (25 PTCL/NOS, 10 PTCL/AITL and 6 PTCL/ALCL-ALK-) and the test set (55 PTCL/NOS, 10 PTCL/AITL and 6 cases of PTCL/ALCL-ALK-).
These were 112 biopsy samples (preserved in paraffin) isolated from patients who had not yet undergone any clinical treatment for the disease.
The 112 samples comprise: 80 PTCL/NOS samples, 20 PTCL/AITL samples and 12 PTCL/ALCL-ALK- samples.
The original diagnoses of these 112 samples date from the period 1991-2010. In particular, in 105 cases the diagnosis was made by the Haematopathology Unit of the University of Bologna, and in 7 cases by the Institute of Pathology of the Goethe University of Frankfurt.
In addition to this set of samples (training and test sets), a further set of samples consisting of 132 cases of PTCL (i.e. the validation set) was analysed. The validation set includes: 78 cases of PTCL/NOS, 43 cases of PTCL/AITL and 11 cases of PTCL/ALCL-ALK-.
For this set of samples, raw data of the gene expression profile (GEP) are available in the GEO database. This information was obtained from "fresh" or frozen biopsy samples.
The GEO data sets are the following: GSE6338 (http://www.ncbi.nlm.nih.gov/gds?term=GSE6338) and GSE19069 (http://www.ncbi.nlm.nih.gov/gds?term=GSE19069).
The diagnoses of all samples taken into consideration were verified by at least two expert haematopathologists, who made the diagnosis using WHO classification criteria. These diagnoses, i.e. those based on morphological and phenotypic criteria, were used in the study as the reference standard.
The molecular signature based on GEP data, i.e. the molecular classifier which represents the index test, was then applied retrospectively in 2011. The analysis was conducted by two experts in bioinformatics and gene expression analysis, both of whom were blinded to the results of the other tests. The study was conducted according to the principles of the Helsinki declaration, following approval of the Internal Review Board (IRB), reference number 201 _001.
Generation of the gene expression profile.
The analysis of the gene expression profiles of the cases included in the training and test sets (112 cases) was performed by means of whole-genome DASL (cDNA-mediated Annealing, Selection, extension, and Ligation) assay of formalin-fixed paraffin-embedded tissues. A DASL assay was used because the RNA recovered from formalin-fixed paraffin-embedded tissues is usually partially degraded and the DASL assay enables up to thousands of transcripts to be analysed starting from only 50 ng of total RNA, thus providing highly reproducible results.
The RNA was extracted from the fixed tissues with the RecoverAll™ Total Nucleic Acid Isolation Kit.
For each reaction up to five 10 μm sections were processed. The fixed samples were deparaffinized with a series of washes in xylene and ethanol and were then subjected to proteinase digestion. The RNA was purified by filtration through glass fibres carried out simultaneously with a treatment with DNAse. The RNA molecules were eluted with low-salt buffer and quantified using a NanoDrop spectrophotometer. The DASL assay begins with the conversion of the total RNA molecules into cDNAs using biotinylated Oligo(dT) and random nonamers. This enables subsequent annealing, whereby the biotinylated cDNAs are bound to a group of DASL Assay Pool (DAP) probes which contain oligonucleotides specifically designed to identify each target sequence in the transcripts. These probes consist of around 50 bases and make it possible to obtain a partial profile of the degraded RNA. After annealing, the oligonucleotides are amplified and "ligated".
During PCR amplification these template molecules are labelled using fluorescent primers that are added to the reaction. Subsequently, the products resulting from the PCR are scanned using a BeadArray Reader or an iScan System in order to determine the presence or absence of specific genes.
Gene expression analysis.
Gene expression analysis was conducted as described in the following studies (Piccaluga PP, et al. J Clin Invest., 117(3):823-34, 2007; Iqbal J, et al. Blood, 115:11, 2010; Piccaluga PP, et al. Blood, 117:3596-608, 2011).
In order to perform a supervised gene expression analysis, use was made of Genes@Work software, a gene expression analysis tool based on a "pattern discovery" algorithm that analyses the location of structural patterns through sequential histograms, and of GeneSpring GX 11 (for both Genes@Work and GeneSpring GX 11, reference should be made to Piccaluga PP, et al. Blood, 117:3596-608, 2011). In short, via a supervised analysis (two-tailed t-test) of the training set it was possible to identify the genes differentially expressed between the PTCL/NOS samples and the PTCL/AITL samples and between the cases of PTCL/NOS and the cases of PTCL/ALCL-ALK-. The genes were identified on the basis of the p-value (<0.01) and fold change (>2).
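The selection rule just described (two-tailed t-test p-value below 0.01 and fold change above 2) can be illustrated with a minimal standard-library sketch. It is not the Genes@Work or GeneSpring implementation: it assumes a pooled-variance Student's t-test, linear-scale expression values, fold change taken as the ratio of group means, and a numerically integrated p-value. The gene names and expression values are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value, integrating the density from 0 to |t| (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def student_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def select_genes(expr_a, expr_b, p_cut=0.01, fold_cut=2.0):
    """Keep genes passing both the p-value and the fold-change threshold."""
    selected = []
    for gene in expr_a:
        a, b = expr_a[gene], expr_b[gene]
        t, df = student_t(a, b)
        ratio = mean(a) / mean(b)
        fold = max(ratio, 1 / ratio)  # direction-independent fold change
        if two_tailed_p(t, df) < p_cut and fold > fold_cut:
            selected.append(gene)
    return selected


# Hypothetical expression values for two groups of four samples each.
group_nos = {"GENE_A": [8.0, 8.2, 7.9, 8.1], "GENE_B": [5.0, 5.1, 4.9, 5.0]}
group_aitl = {"GENE_A": [2.0, 2.1, 1.9, 2.2], "GENE_B": [5.0, 4.9, 5.1, 5.0]}
print(select_genes(group_nos, group_aitl))
```

In this toy input only GENE_A differs clearly between the groups, so it is the only gene retained.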
The 200 top-ranked genes were then tested for their ability to discriminate cases of PTCL/AITL and PTCL/ALCL-ALK- from cases of PTCL/NOS, in both the samples of the test set and the samples of the validation set.
A Support Vector Machine was used for the classification of each sample, as described by Piccaluga PP, et al. Blood, 117:3596-608, 2011.
EASE software was applied in order to establish, through gene ontology, whether the deregulated genes defined specific cell functions or biological processes in a significant manner.
A method based on the Support Vector Machine algorithm with an N-Fold validation as described by Piccaluga PP, et al. (Piccaluga PP, et al. Blood, 117:3596-608, 2011) was used to classify the cell type (molecular classifier). Briefly, the classifier is a scoring function based on the values of a set of genes (gene cluster), which are differentially expressed in two sets of cell types (the samples of the training set) and can therefore be used for classifying the cell type of the samples included in the test set and the samples included in the validation set.
The higher the score, the greater the probability that that cell type is linked to the phenotype set. In this specific case, the Class Prediction algorithm adopted (Support Vector Machine) was set using the following training set parameters:
- Kernel parameter 1 : 0.1
- Cost : 100.0
- Maximum number of iterations : 100000
- Ratio : 1.0
- Kernel type : Linear
The forecasting model constructed on the basis of the values shown by the training set samples was then run on samples of the test set and samples of the validation set. Each sample was then assigned by the algorithm to one of the two categories considered (namely, PTCL/NOS vs. PTCL/AITL in one case and PTCL/NOS vs. PTCL/ALCL-ALK- in the other) when a confidence measure score >0.5 was observed.
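The scoring and assignment step can be pictured with the following sketch of a linear binary classifier. It does not train a Support Vector Machine and does not use the parameters listed above; the weights, bias, gene names and the logistic mapping from score to confidence are all hypothetical assumptions chosen only to show how a score over a gene cluster and a 0.5 confidence threshold lead to a class label.

```python
import math


def linear_score(weights, bias, expression):
    """Signed score w.x + b of a linear classifier over a gene cluster."""
    return bias + sum(weights[g] * expression[g] for g in weights)


def confidence(score):
    """Map the signed score to (0, 1) with a logistic function (an assumption)."""
    return 1.0 / (1.0 + math.exp(-score))


def assign(weights, bias, expression, positive="PTCL/AITL", negative="PTCL/NOS"):
    """Return the predicted class and the confidence for the positive class."""
    conf = confidence(linear_score(weights, bias, expression))
    return (positive if conf > 0.5 else negative), conf


# Hypothetical two-gene model and two hypothetical samples.
weights = {"GENE_A": 1.5, "GENE_B": -2.0}
bias = -1.0
sample_1 = {"GENE_A": 3.0, "GENE_B": 0.5}
sample_2 = {"GENE_A": 0.5, "GENE_B": 2.0}
print(assign(weights, bias, sample_1))
print(assign(weights, bias, sample_2))
```

With a linear kernel, a trained model reduces to one weight per gene plus a bias, which is why the sketch only needs a weighted sum at prediction time.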
The reference standard (the anatomical-pathological method) and the index test (method of the invention) were compared in terms of sensitivity, specificity, positive predictive value, negative predictive value, confidence interval and likelihood ratio using CATmaker (Centre for Evidence Based Medicine, Oxford University, http://www.cebm.net) software.
To ensure the reproducibility of the index test, a statistical prediction test based on discriminant analysis ("stepwise method") was used, relying on SPSS software (IBM, USA). Exactly analogous results were obtained (100% concordance), when the original molecular signature was compared with one reduced through discriminant analysis. This step (reduction of the signature via discriminant analysis) is relevant from a clinical viewpoint, as it allows a simpler diagnostic assay to be performed while maintaining the same effectiveness.
The gene expression studies were conducted according to MIAME guidelines.
Assessment of diagnostic accuracy.
The ability of two molecular signatures to discriminate PTCL/NOS from PTCL/AITL and from PTCL/ALCL-ALK-, respectively, was tested.
The diagnostic accuracy was measured in terms of sensitivity, specificity, positive predictive value, negative predictive value, confidence interval and likelihood ratio. These parameters were obtained using CATmaker software.
The studies were designed and conducted according to the STARD statement (STAndards for the Reporting of Diagnostic accuracy studies) (http://www.stard-statement.org), following REMARK guidelines and observing the QUADAS model (Quality Assessment of Diagnostic Accuracy Studies) (Whiting PF, et al. Ann Intern Med, 155: 529-36, 2011).
In particular, all of the samples were evaluated from a diagnostic viewpoint using both the index test (that is, the method of the present invention, based on classifiers in turn based on gene expression profiles) and the reference standard (i.e. through the integration of morphological, phenotypic, clinical and molecular data according to the WHO classification).
PTCL/AITL and PTCL/ALCL-ALK- can be distinguished from PTCL/NOS on the basis of the total gene expression.
Using the samples of the training set (25 PTCL/NOS, 10 PTCL/AITL and 6 PTCL/ALCL-ALK-), a supervised classification of PTCL/NOS vs. PTCL/AITL and vs. PTCL/ALCL-ALK- was achieved (by adopting, respectively, a t-test with the Bonferroni post-hoc correction and the Mann-Whitney U test with the Benjamini-Hochberg post-hoc correction).
The differentially expressed genes were identified based on the p-value (p<0.05) and fold change.
In particular, 208 genes were identified which discriminated between PTCL/NOS and PTCL/AITL and 1133 genes which discriminated between PTCL/NOS and PTCL/ALCL-ALK-.
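The two multiple-testing corrections named above differ in how strongly they penalise a list of p-values. The sketch below implements both in their textbook form, as an illustration only; the p-values are hypothetical and are not taken from the study.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
raw = [0.001, 0.008, 0.039, 0.041, 0.2]
print(bonferroni(raw))
print(benjamini_hochberg(raw))
```

Bonferroni controls the chance of any false positive and is the stricter of the two, whereas Benjamini-Hochberg controls the expected proportion of false positives among the selected genes.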
Subsequently, by independently studying and testing the various sets of samples, it was verified whether the molecular signatures identified were actually capable of accurately distinguishing the PTCL/AITL and PTCL/ALCL-ALK- samples from the PTCL/NOS samples. In particular, the samples of the test set, i.e. 55 cases of PTCL/NOS, 10 of PTCL/AITL and 6 of PTCL/ALCL-ALK-, were analysed for this purpose.
When a hierarchical classification was applied, it was found that, based on the molecular signatures identified, it is possible to distinguish the different diseases in a manner that is significant from a clinical standpoint.
In particular, the PTCL/AITL samples are distinguished from the PTCL/NOS samples with a p<0.0001, while the PTCL/ALCL-ALK- samples are distinguished from the PTCL/NOS samples with a p<0.0001.
Comparable results were obtained using the samples of the validation set.
The molecular signature based on the gene expression profile is capable of efficiently discriminating PTCL/AITL and PTCL/ALCL-ALK- from PTCL/NOS.
With the aim of developing a practical tool able to be applied for the differential diagnosis of PTCL, in particular for routine diagnostic investigations, an SVM algorithm was used, that is, a reproducible system for managing gene expression profile data.
The first step was to construct a model using the complete molecular signatures identified and testing them on samples of the training set (25 PTCL/NOS, 10 PTCL/AITL and 6 PTCL/ALCL-ALK- samples).
The same signatures were then tested on the samples of the test set (55 PTCL/NOS, 10 PTCL/AITL and 6 PTCL/ALCL-ALK-). When the PTCL/AITL forecasting model was applied, it was possible to correctly classify 9/10 PTCL/AITL and 55/55 PTCL/NOS.
Therefore, the test has a sensitivity of 90% and specificity of 100%. The positive and negative predictive values are respectively 100% and 98%.
Therefore, the total accuracy of the diagnostic method proved to be 98%.
On the other hand, when the PTCL/ALCL-ALK- forecasting model was applied, 6/6 PTCL/ALCL and 54/55 PTCL/NOS were correctly classified.
The sensitivity of the test is 100% and the specificity is 98%, whilst the positive and negative predictive values are respectively 86% and 100%.
Therefore, the total accuracy of the method is 98%.
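The figures reported for the test set follow directly from the counts of correctly and incorrectly classified samples. The sketch below recomputes them; the function name and the mapping of the reported fractions onto a 2x2 table (the non-NOS subtype taken as the positive class) are assumptions made for illustration.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard accuracy measures from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# PTCL/AITL model on the test set: 9/10 PTCL/AITL and 55/55 PTCL/NOS correct.
aitl = diagnostic_metrics(tp=9, fn=1, tn=55, fp=0)
# PTCL/ALCL-ALK- model on the test set: 6/6 PTCL/ALCL and 54/55 PTCL/NOS correct.
alcl = diagnostic_metrics(tp=6, fn=0, tn=54, fp=1)

for name, result in (("AITL", aitl), ("ALCL-ALK-", alcl)):
    print(name, {k: round(v * 100) for k, v in result.items()})
```

Rounded to whole percentages, the output agrees with the sensitivity, specificity, predictive values and total accuracy stated above for both models.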
Subsequently, a molecular signature containing a more limited number of genes was tested to assess its ability to discriminate among the different PTCL subtypes.
The genes used for the differential diagnosis were then reduced through discriminant analysis to 38 genes (for PTCL/NOS vs. PTCL/AITL) and 53 genes (for PTCL/NOS vs. PTCL/ALCL-ALK-) (see Figs. 1-3).
To make the results more robust, the assay was further tested on the samples of the validation set, i.e. on an independent set of samples. The validation set consisted of 78 PTCL/NOS samples, 43 PTCL/AITL samples and 11 PTCL/ALCL-ALK- samples.
Under these conditions, when the PTCL/AITL forecasting model was applied, 31/43 PTCL/AITL and 62/78 PTCL/NOS were correctly classified.
A sensitivity of 72% and specificity of 80% were obtained. The positive and negative predictive values are respectively 66% and 84%. Therefore, the total diagnostic accuracy is 77%.
Similarly, PTCL/NOS were tested versus PTCL/ALCL-ALK- and 8/11 PTCL/ALCL and 75/78 PTCL/NOS were correctly classified.
The sensitivity and specificity in this case are respectively 73% and 96%. The positive and negative predictive values are respectively 73% and 96%. Therefore, the total diagnostic accuracy is 93%.
In light of these results, the method according to the present invention proves capable of distinguishing patients affected by PTCL/AITL or by PTCL/ALCL-ALK- from those affected by PTCL/NOS with considerable diagnostic accuracy.
The molecular signature according to the present invention is efficient for identifying CD30+ PTCL/NOS.
The ability and efficiency of the molecular signature of the invention in distinguishing cases of PTCL/NOS CD30+ from PTCL/ALCL-ALK-, which represents one of the greatest diagnostic challenges, was subsequently analysed.
Fourteen samples taken from patients with PTCL/NOS CD30+ were analysed, and using the molecular signature according to the present invention it was possible to identify 14/14 cases correctly.
The molecular signatures according to the present invention show a significant effect on the post-test probability of disease.
The factor which best confirms the validity of a medical test is its ability to distinguish between the probability of the presence of a given pathological condition before and after the test in question.
The molecular signature and its use in the diagnostic method for subtyping PTCLs according to the present invention were also tested in this respect.
As regards the ability of the test to distinguish PTCL/AITL samples from PTCL/NOS samples in the test set, the post-test probability of disease rose from 15% to 100% in the case of a positive result and fell from 15% to 2% in the case of a negative result.
In the same manner, with regard to the samples of the validation set, the post-test probability went from 35% to 66% and to 16%, respectively, in the case of positive and negative results. With regard to the ability of the test to distinguish between PTCL/ALCL-ALK- and PTCL/NOS in the samples of the test set, the post-test probability went from 10% to 86% and to 0%, respectively, in the case of positive and negative results.
Similarly, for the samples belonging to the validation set, the post-test probability went from 12% to 72% and to 4%, respectively, in the case of positive and negative results.
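The shift from pre-test to post-test probability can be reproduced from the classification counts with likelihood ratios. The sketch below is an illustration rather than the CATmaker computation; it uses the validation-set counts for the PTCL/AITL model (31/43 PTCL/AITL and 62/78 PTCL/NOS correctly classified) and takes the pre-test probability to be the proportion of PTCL/AITL cases in that set, which is an assumption.

```python
def post_test_probabilities(tp, fn, tn, fp):
    """Pre-test probability and post-test probabilities after a positive
    or a negative result, via likelihood ratios and odds.

    Requires fp > 0 and tn > 0; otherwise a likelihood ratio is undefined.
    """
    total = tp + fn + tn + fp
    pre = (tp + fn) / total            # prevalence in the sample set
    pre_odds = pre / (1 - pre)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec)         # likelihood ratio of a positive result
    lr_neg = (1 - sens) / spec         # likelihood ratio of a negative result
    odds_pos = pre_odds * lr_pos
    odds_neg = pre_odds * lr_neg
    return pre, odds_pos / (1 + odds_pos), odds_neg / (1 + odds_neg)


pre, after_pos, after_neg = post_test_probabilities(tp=31, fn=12, tn=62, fp=16)
print(f"pre-test {pre:.3f}, after positive {after_pos:.3f}, after negative {after_neg:.3f}")
```

The three values come out at roughly 0.355, 0.660 and 0.162, in line with the percentages quoted above for the validation set.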
The molecular signatures according to the present invention improve prognostic accuracy.
In order to evaluate the clinical impact of the test, a comparison was then made of the patient survival curves based on type of disease, as diagnosed with conventional methods (histopathology) or with the molecular test (see Figure 4). Conventional diagnostics did not enable a distinction to be made among groups with a significantly different prognosis; in contrast, the method based on the molecular signature of the invention enabled a definition of groups with a different prognosis. In fact, patients identified as having PTCL/ALCL-ALK- have a significantly higher probability of being cured than those with PTCL/AITL and PTCL/NOS. Furthermore, the method according to the present invention made it possible to distinguish, in the case of CD30 positive forms, the patients with a better prognosis (indicated by the test as PTCL/ALCL-ALK-) from those with a worse prognosis (indicated as PTCL/NOS) (Figure 4).
Overall, these data clearly demonstrate the clinical potential of the method and of the molecular signature of the present invention for the purpose of diagnosing PTCL, and in particular for the purpose of subtyping PTCLs into PTCL/NOS.

Claims

1. In vitro method for determining a peripheral T-cell lymphoma (PTCL), preferably for subtyping PTCLs, said method comprising (i) at least one step of measuring in a biological sample the expression levels of a molecular signature comprising at least 5, preferably at least 15 molecular markers selected from the group consisting of SEQ ID NO: 1-91; and (ii) at least one step of analyzing said expression levels by using a binary classifier.
2. Method according to claim 1, wherein the PTCL subtypes are: PTCL/NOS, preferably PTCL/NOS CD30+, PTCL/ALCL-ALK- or PTCL/AITL.
3. Method according to claim 1 or 2, wherein said molecular markers are at least 38 or at least 53.
4. Method according to claim 3, wherein said at least 38 molecular markers are SEQ ID NO: 1-38.
5. Method according to claim 3, wherein said at least 53 molecular markers are SEQ ID NO: 39-91.
6. Method according to any one of claims 1-5, wherein the expression levels of the molecular markers are measured by putting said biological sample through at least one of the following steps:
(iii) Purifying/extracting the nucleic acid molecules, preferably the transcribed RNA molecules, from the biological sample; and/or
(iv) Retrotranscribing the RNA molecules obtained from step (iii) in order to obtain the corresponding cDNA molecules; and/or
(v) Amplifying and/or labelling the cDNA molecules obtained from step (iv); and/or
(vi) Quantifying the molecules amplified during step (v) wherein the amounts of said amplified molecules represent the expression profile of the molecular markers in said biological sample.
7. Method according to claim 6, wherein said amplified and/or labelled molecules are the molecular markers according to any one of claims 1-5.
8. Method according to any one of claims 1-7, wherein said binary classifier is based on a linear discriminant function or on a Support Vector Machine algorithm.
9. Method according to claim 8, wherein said linear discriminant function is Function 1: $D = w_0 + \sum_{i=1}^{n} w_i x_i$
10. Method according to claim 8, wherein said Support Vector Machine algorithm is set up by using parameters comprising:
- Kernel function parameter having a value of about 1 : 0.1 , preferably 1 : 0.1 ; and/or
- Cost parameter having a value of about 100.0, preferably 100.0; and/or
- Maximum number of iterations parameter having a value of about 100000, preferably 100000; and/or
- Ratio parameter having a value of about 1.0, preferably 1.0; and/or
- a kernel type parameter that is preferably linear.
11. Method according to any one of claims 1-10, wherein the biological sample is isolated from an individual who has been diagnosed with a PTCL, or from an individual suspected to be affected by a PTCL, and wherein the biological sample is a fresh, frozen or fixed and embedded sample.
12. Method according to any one of claims 1-11, wherein the sequences of said molecular markers or their complementary sequences are chemically bound on a solid support, preferably on a chip or a plate.
13. The solid support, preferably a chip, on which the molecular marker sequences according to any one of claims 1 , 3-5 or their complementary sequences are bound.
14. The support according to claim 13 for use as a diagnostic agent.
15. Kit for performing the method according to any one of claims 1-11, said kit comprising reagents for determining the expression levels of a molecular signature comprising the molecular markers according to any one of claims 1, 3-5, wherein said reagents are preferably: a solid support according to claim 13, at least a set of primers for determining the expression levels of said molecular markers, or probes specific for said molecular markers, wherein said probes are preferably bound on a solid support.
PCT/IB2014/060529 2013-04-09 2014-04-08 Molecular signature and its uses as diagnostic agent Ceased WO2014167494A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IT000548A ITMI20130548A1 (en) 2013-04-09 2013-04-09 MOLECULAR SIGNATURE AND ITS USES AS DIAGNOSTIC AGENT
ITMI2013A000548 2013-04-09

Publications (1)

Publication Number Publication Date
WO2014167494A1 true WO2014167494A1 (en) 2014-10-16

Family

ID=48446451

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2014/060529 Ceased WO2014167494A1 (en) 2013-04-09 2014-04-08 Molecular signature and its uses as diagnostic agent

Country Status (2)

Country Link
IT (1) ITMI20130548A1 (en)
WO (1) WO2014167494A1 (en)

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
DE LEVAL LAURENCE ET AL: "Pathobiology and molecular profiling of peripheral T-cell lymphomas.", HEMATOLOGY / THE EDUCATION PROGRAM OF THE AMERICAN SOCIETY OF HEMATOLOGY. AMERICAN SOCIETY OF HEMATOLOGY. EDUCATION PROGRAM 2008, 2008, pages 272 - 279, XP055066049, ISSN: 1520-4391 *
HSI ERIC D ET AL: "Diagnostic Accuracy of a Defined Immunophenotypic and Molecular Genetic Approach for Peripheral T/NK-Cell Lymphomas: A North American PTCL Study Group Project", BLOOD, vol. 120, no. 21, November 2012 (2012-11-01), & 54TH ANNUAL MEETING AND EXPOSITION OF THE AMERICAN-SOCIETY-OF-HEMATOLOGY (ASH); ATLANTA, GA, USA; DECEMBER 08 -11, 2012, pages 1545, XP008162914 *
IQBAL J ET AL., BLOOD, vol. 115, 2010, pages 11
PICCALUGA PIER PAOLO SR ET AL: "Molecular Diagnosis of Peripheral T-Cell Lymphoma/NOS From Formalin Fixed Paraffin Embedded Tissues", BLOOD, vol. 118, no. 21, November 2011 (2011-11-01), & 53RD ANNUAL MEETING AND EXPOSITION OF THE AMERICAN-SOCIETY-OF-HEMATOLOGY (ASH); SAN DIEGO, CA, USA; DECEMBER 10 -13, 2011, pages 1565 - 1566, XP008162915 *
PICCALUGA PP ET AL., BLOOD, vol. 117, 2011, pages 3596 - 608
PICCALUGA PP ET AL., J CLIN INVEST., vol. 117, no. 3, 2007, pages 823 - 34
PIER PAOLO PICCALUGA ET AL: "Gene expression analysis of peripheral T cell lymphoma, unspecified, reveals distinct profiles and new potential therapeutic targets", JOURNAL OF CLINICAL INVESTIGATION, vol. 117, no. 3, 1 March 2007 (2007-03-01), pages 823 - 834, XP055066051, ISSN: 0021-9738, DOI: 10.1172/JCI26833 *
PILERI S; PICCALUGA PP, J CLIN INVEST., vol. 122, no. 10, 1 October 2012 (2012-10-01), pages 3448 - 55
TIFFANY TANG ET AL: "Gene Expression Profiling Identifies the JAK/STAT and NF B Pathways to Be Important in Peripheral T-Cell Lymphomas and Natural-Killer T-Cell Lymphomas", BLOOD (ASH ANNUAL MEETING ABSTRACTS) 2011 118: ABSTRACT 2658, 18 October 2011 (2011-10-18), XP055066046, Retrieved from the Internet <URL:http://abstracts.hematologylibrary.org/cgi/content/abstract/118/21/2658?maxtoshow=&hits=10&RESULTFORMAT=&fulltext=Gene+Expression+Profiling+Identifies+the+JAK%2FSTAT+and+NF+kappa+B+Pathways+to+Be+&searchid=1&FIRSTINDEX=0&sortspec=relevance&resourcetype=HWCIT> [retrieved on 20130610] *
WHITING PF ET AL., ANN INTERN MED, vol. 155, 2011, pages 529 - 36

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019210216A3 (en) * 2018-04-27 2019-12-12 Seattle Children's Hospital D/B/A Seattle Children's Research Institute Talen-based and crispr/cas-based gene editing for bruton's tyrosine kinase
CN112469823A (en) * 2018-04-27 2021-03-09 西雅图儿童医院d/b/a西雅图儿童研究所 TALEN-BASED AND CRISPR/CAS-BASED GENE EDITING OF BRUTON' S tyrosine kinase
CN114839372A (en) * 2022-04-29 2022-08-02 徐州医科大学附属医院 Application of reagent for detecting expression level of pSTAT3 in identification of PTCL-NOS and ALK-ALCL
CN115927594A (en) * 2022-11-18 2023-04-07 南方医科大学 Circular RNA diagnostic marker for placenta implantation pedigree diagnosis and application thereof

Also Published As

Publication number Publication date
ITMI20130548A1 (en) 2014-10-10

Similar Documents

Publication Publication Date Title
JP6908571B2 (en) Gene expression profile algorithms and tests to quantify the prognosis of prostate cancer
CN108368551B (en) Method for diagnosing tuberculosis
EP2572000B1 (en) Methods for diagnosing colorectal cancer
JP5784272B2 (en) Methods and compositions for detecting autoimmune diseases
JP2018514189A (en) Diagnosis of sepsis
CN106795565A (en) Methods Used to Assess Lung Cancer Status
CA2659194A1 (en) Methods for identifying, diagnosing, and predicting survival of lymphomas
WO2014160645A2 (en) Neuroendocrine tumors
AU2013277971A1 (en) Molecular malignancy in melanocytic lesions
CN109477145A (en) Biomarkers of Inflammatory Bowel Disease
WO2011006119A2 (en) Gene expression profiles associated with chronic allograft nephropathy
CA2986787A1 (en) Validating biomarker measurement
WO2014019977A1 (en) Diagnosis of active tuberculosis by determining the mrna expression levels of marker genes in blood
CN104046624B (en) Gene and application thereof for lung cancer for prognosis
US20250137066A1 (en) Compostions and methods for diagnosing lung cancers using gene expression profiles
WO2015179777A2 (en) Gene expression profiles associated with sub-clinical kidney transplant rejection
WO2013160176A1 (en) Diagnostic mirna profiles in multiple sclerosis
CN118755816B (en) Molecular markers of aortic dissection and their applications
WO2014167494A1 (en) Molecular signature and its uses as diagnostic agent
CA2882643A1 (en) Use of interleukin-27 as a diagnostic biomarker for bacterial infection in critically ill patients
KR102229647B1 (en) MiRNA bio-marker for non-invasive differential diagnosis of acute rejection in kidney transplanted patients and uses thereof
EP4023770A1 (en) A method of examining genes for the diagnosis of thyroid tumors, a set for the diagnosis of thyroid tumors and application
KR102101500B1 (en) Urinary mRNA for non-invasive differential diagnosis of acute rejection in kidney transplanted patients and uses thereof
WO2015179771A2 (en) Molecular signatures for distinguishing liver transplant rejections or injuries
CN119842887B (en) Biomarkers and their application in evaluating severe bronchiolitis in children with respiratory syncytial virus infection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14727913

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14727913

Country of ref document: EP

Kind code of ref document: A1