
WO2017083310A1 - Normalization method for sample analysis - Google Patents

Normalization method for sample analysis

Info

Publication number
WO2017083310A1
WO2017083310A1 PCT/US2016/061001 US2016061001W WO2017083310A1 WO 2017083310 A1 WO2017083310 A1 WO 2017083310A1 US 2016061001 W US2016061001 W US 2016061001W WO 2017083310 A1 WO2017083310 A1 WO 2017083310A1
Authority
WO
WIPO (PCT)
Prior art keywords
target analyte
normal
normalizers
analyte
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2016/061001
Other languages
English (en)
Inventor
Xitong Li
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
InKaryo Corp
Original Assignee
InKaryo Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by InKaryo Corp
Publication of WO2017083310A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids

Definitions

  • the present invention relates to methods and systems for normalizing measurements of analytes that may be used to detect abnormality in a sample, such as chromosomal aneuploidy, disorders associated with abnormal copy number of mitochondria, infectious diseases and cancer.

BACKGROUND OF THE INVENTION

  • Accurate quantification of an analyte of interest in a biological sample is critical for proper disease diagnosis, prognosis, and therapeutic treatment (Park et al., Cronin et al., Jennings et al. 1997, Jennings et al. 2012, Walsh et al.).
  • Such an analyte can be, for example, a gene, an RNA, a peptide, a cluster of genes, a chromosome, or a chromosomal region called a bin.
  • Copy number aberrations of chromosomal regions detected in human samples have been indicated as prognostic markers in cancer-associated phenotypes (Sapkota et al., Krepischi et al.).
  • RNA biomarkers have been used for breast cancer prognosis (Park et al.).
  • A high level of the cell surface protein Her2 is a companion diagnostic biomarker for the effectiveness of the cancer drug Herceptin (Dent et al.).
  • Direct quantitative measurements of a particular analyte of interest from biological samples are often not meaningful because these values are affected by variations unrelated to the biological significance of analytes in the samples. Examples of these variations include uneven quality of the input materials and variations introduced through different sample preparation procedures, including enzyme bias and unknown assay condition changes. In aggregate, the variations often make it difficult to correlate a measurement of an analyte generated by one particular assay with the true signals of the analyte present in the samples.
  • Normalization is an approach to transform measurements into corrected values to allow comparison of samples. It is useful for large-scale biological or chemical assays using high throughput technologies such as microarray, quantitative PCR (QPCR), mass spectrometry, antibody array, two dimensional (2D) gel electrophoresis, digital PCR, or massively parallel sequencing, where large numbers of analytes are measured at the same time. These assays include a variety of genome-wide analyses of gene copy number variation (CNV), chromosomal aberration, RNA expression, protein peptide expression, or protein binding patterns and are now routinely performed to compare treatment and reference groups for differentiation of disease and normal tissues.
  • the present disclosure provides a covariation-based normalization (CBN) method that corrects various biases caused by, for example, sample and assay conditions.
  • the method involves selecting normalizers from a set of analytes to normalize an analyte of interest, i.e. a target analyte.
  • the selection is based on a covariation value of a candidate analyte relative to a target analyte, and the covariation value is independent of any distribution assumption of the data.
  • the resulting normalization is thus reliable for correcting data skewness even if the cause is unknown. Therefore, the normalization method presented in this disclosure is robust, consistent, and useful to achieve desired sensitivity and specificity, which is especially critical for high throughput assays.
  • the CBN method of selecting normalizers of a target analyte for use in a high throughput assay disclosed herein comprises calculating covariation values of each candidate analyte and the target analyte among a group of training samples.
  • a candidate analyte is selected as a normalizer if its covariation value is higher than a threshold.
  • a combination of a few or all normalizers that meet the selection criterion can be collectively used to normalize measurements of the target analyte.
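The selection rule described above can be illustrated with a minimal sketch. It assumes Pearson correlation as the covariation value (one of several options the disclosure names), and the function names, analyte names, and measurements are hypothetical values made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def select_normalizers(target, candidates, threshold):
    """Return names of candidates whose covariation with the target exceeds threshold.

    target: measurements of the target analyte, one per training sample.
    candidates: dict mapping candidate name -> measurements in the same sample order.
    """
    return [name for name, values in candidates.items()
            if pearson(target, values) > threshold]

# Hypothetical measurements from five training samples.
target = [100, 120, 90, 150, 110]
candidates = {
    "bin_a": [50, 61, 44, 76, 54],  # rises and falls with the target
    "bin_b": [80, 70, 95, 60, 75],  # moves opposite to the target
    "bin_c": [30, 30, 31, 29, 30],  # nearly flat
}
normalizers = select_normalizers(target, candidates, threshold=0.9)
```

With these toy values only `bin_a` passes the 0.9 threshold; the other two candidates correlate negatively with the target.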
  • the disclosure provides a method of identifying a plurality of normalizers from a pool of candidate analytes for a target analyte in an assay, the method comprising: a) obtaining measurements of the target analyte and measurements of the pool of candidate analytes from a group of training samples; b) calculating a covariation value of the measurement of the target analyte and the measurement of one of the pool of candidate analytes; c) identifying the candidate analyte as a normalizer when the covariation value of the candidate analyte and the target analyte is higher than a threshold, and repeating steps b) and c) with the rest of the pool of candidate analytes until a plurality of normalizers is identified, wherein the plurality of normalizers is used to normalize the measurement of the target analyte in a test sample.
  • the plurality of normalizers is at least 10, 50, 100, 150, 200, 300, 400, 500, 600, 700, 800, 900, or 1000.
  • the group of training samples comprises at least 3, at least 4, at least 5, at least 10, at least 50, at least 70, at least 80, at least 100, at least 150, at least 200, or at least 300 samples.
  • the method further comprises measuring target analyte and the plurality of normalizers in a test sample to obtain the measurement of target analyte and the measurements of the plurality of normalizers in the test sample, and normalizing the
  • the covariation value is a covariance or a correlation. In some embodiments of the invention, the covariation value is a Spearman, Pearson, or Kendall correlation.
  • the threshold of the covariation value used to select normalizers is a decimal number greater than or equal to 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, or 0.99.
  • the threshold of the covariation value is determined by a minimum rank among all candidates’ covariation values ranked from high to low. The minimum rank is a positive integer between 1 and the total number of candidate analytes.
  • the threshold of the covariation value is a value greater than or equal to the 10th, 20th, 30th, 40th, 50th, 55th, 60th, 65th, 70th, 75th, 80th, 85th, 90th, 91st, 92nd, 93rd, 94th, 95th, 96th, 97th, 98th, or 99th percentile of the covariation values for all pairs of a candidate analyte and said target analyte.
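The rank-based and percentile-based thresholds can be sketched as follows. The nearest-rank percentile convention and the function names are assumptions; the disclosure does not specify how a percentile is computed. The covariation values are made up.

```python
from math import ceil

def threshold_from_rank(cov_values, min_rank):
    """Covariation value at position min_rank (1-based) when ranked high to low."""
    if not 1 <= min_rank <= len(cov_values):
        raise ValueError("min_rank must be between 1 and the number of candidates")
    return sorted(cov_values, reverse=True)[min_rank - 1]

def threshold_from_percentile(cov_values, pct):
    """Nearest-rank percentile of the covariation values, for 0 < pct <= 100."""
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(cov_values)
    k = max(1, ceil(pct * len(ordered) / 100))  # 1-based nearest rank
    return ordered[k - 1]

# Hypothetical covariation values of ten candidates with one target analyte.
covs = [0.95, 0.10, 0.80, 0.99, 0.60, 0.85, 0.30, 0.92, 0.75, 0.40]

rank_threshold = threshold_from_rank(covs, min_rank=3)
pct_threshold = threshold_from_percentile(covs, pct=80)

# Keeping values at or above the rank threshold yields min_rank candidates
# when there are no ties (an assumption about how the threshold is applied).
kept = [c for c in covs if c >= rank_threshold]
```

In this toy data, a minimum rank of 3 and the 80th percentile happen to give the same threshold, because three of the ten values lie at or above it.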
  • This invention further provides a method of normalizing test data using a normalized quotient (NQ) of a target analyte in a test sample.
  • the invention provides a method of categorizing a test sample based on a target analyte comprising the steps of a) generating an NQ of the target analyte in the test sample using the methods described above, and b) categorizing the test sample as normal based on the target analyte when said NQ is (1) greater than or equal to the lower limit of a normal NQ range and (2) less than or equal to the upper limit of the normal NQ range, or categorizing the test sample as abnormal when said NQ is (1) lower than the lower limit of the normal NQ range or (2) higher than the upper limit of said normal NQ range.
  • the lower limit of the normal NQ range is determined by the mean of the NQ values of the target analyte in a set of reference samples minus a multiple of the standard deviation of the NQ values of the target analyte in the set of reference samples.
  • said upper limit of the normal NQ range is determined by the mean of the NQ values of the target analyte in the set of reference samples plus a multiple of the standard deviation of the NQ values of the target analyte in the set of reference samples.
  • the multiple is a number that is greater than or equal to 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 7, 8, 9, or 10.
  • the lower limit of the normal NQ range is determined by the median of the NQ values of the target analyte in a set of reference samples minus a multiple of the median absolute deviation (MAD) of the NQ values of the target analyte in the set of reference samples.
  • said upper limit of the normal NQ range is determined by the median of the NQ values of the target analyte in a set of reference samples plus a multiple of the median absolute deviation (MAD) of the NQ values of the target analyte in the set of reference samples.
  • This invention also provides a method of categorizing a test sample based on a target analyte by comparing the target analyte’s NQ with a normal range of NQ.
  • a test sample is categorized as normal based on the target analyte if said NQ falls within a normal range of NQ and abnormal if the NQ is outside the normal range.
  • the normal range of NQ is defined by a low end (a.k.a., “the lower limit”), which is the median minus a decimal number that is greater than or equal to 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, or 5 times the median absolute deviation (MAD) of the NQs of the target analyte from a set of reference samples; and a high end, which is the median plus a decimal number that is greater than or equal to 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, or 5 times the median absolute deviation (MAD) of the NQs of the target analyte from a set of reference samples.
  • the low end of the NQ range is the mean minus a decimal number that is greater than or equal to 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, or 5 times the standard deviation (SD) of the NQs of the target analyte from a set of reference samples; and the high end (a.k.a., the “higher limit”) of the range is the mean plus a decimal number that is greater than or equal to 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, or 5 times the standard deviation (SD) of the NQs of the target analyte from a set of reference samples.
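A minimal sketch of the two normal-range definitions and the resulting categorization. The function names and reference NQ values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) and an unscaled MAD are assumptions, since the text does not say which variants are intended.

```python
from statistics import mean, median, stdev

def normal_range_sd(reference_nqs, multiple):
    """Normal NQ range as mean +/- multiple * SD of the reference NQs."""
    m, s = mean(reference_nqs), stdev(reference_nqs)  # sample SD (assumption)
    return m - multiple * s, m + multiple * s

def normal_range_mad(reference_nqs, multiple):
    """Normal NQ range as median +/- multiple * MAD of the reference NQs."""
    med = median(reference_nqs)
    mad = median(abs(v - med) for v in reference_nqs)  # unscaled MAD (assumption)
    return med - multiple * mad, med + multiple * mad

def categorize(nq, low, high):
    """'normal' if low <= nq <= high (limits included), otherwise 'abnormal'."""
    return "normal" if low <= nq <= high else "abnormal"

# Hypothetical NQs of one target analyte in five reference samples.
reference_nqs = [0.98, 1.00, 1.02, 0.99, 1.01]

low_sd, high_sd = normal_range_sd(reference_nqs, multiple=3)
low_mad, high_mad = normal_range_mad(reference_nqs, multiple=3)

result_a = categorize(1.01, low_mad, high_mad)  # inside both ranges
result_b = categorize(1.08, low_mad, high_mad)  # above both ranges
```

The MAD-based range is less sensitive to outlying reference samples than the SD-based range, which may be one reason to prefer it when the reference set is small.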
  • the step of identifying normalizers can be reiterated; that is to say, the NQs generated for the target analyte and candidate analytes based on a first set of normalizers are used as measurements for the target analyte and candidate analytes to select a second set of normalizers.
  • This reiteration process can be repeated multiple times until a desired set of normalizers is obtained.
  • the target analyte is then normalized using the second or later set of normalizers as described above.
  • the sample can be categorized based on the normalized measurements of the target analyte as described above.
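The reiteration can be sketched as a loop that feeds NQs back in as measurements. This is only one possible reading: it assumes Pearson correlation, an unweighted sum of the current normalizers as the NQ denominator, and a stopping rule (stop when the set is stable, empty, or after `max_rounds`) that the text does not prescribe. All names and data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / sqrt(sxx * syy)

def select(target, candidates, threshold):
    """Names of candidates whose correlation with the target exceeds threshold."""
    return [name for name, v in candidates.items()
            if pearson(target, v) > threshold]

def reselect_normalizers(target, candidates, threshold, max_rounds=3):
    """Repeat normalizer selection, using NQs from the previous set as measurements."""
    selected = select(target, candidates, threshold)
    for _ in range(max_rounds - 1):
        if not selected:
            break  # no normalizers, so no NQ can be formed
        # Per-sample denominator: sum of the current normalizers' measurements.
        denominators = [sum(candidates[name][i] for name in selected)
                        for i in range(len(target))]
        target_nq = [t / d for t, d in zip(target, denominators)]
        candidate_nq = {name: [v / d for v, d in zip(values, denominators)]
                        for name, values in candidates.items()}
        reselected = select(target_nq, candidate_nq, threshold)
        if not reselected or set(reselected) == set(selected):
            break  # keep the previous set (assumed stopping rule)
        selected = reselected
    return selected

# Hypothetical measurements from five training samples.
target = [100, 120, 90, 150, 110]
candidates = {
    "bin_a": [50, 61, 44, 76, 54],
    "bin_b": [52, 59, 46, 74, 56],
    "bin_c": [80, 70, 95, 60, 75],
}
first_set = reselect_normalizers(target, candidates, threshold=0.9, max_rounds=1)
final_set = reselect_normalizers(target, candidates, threshold=0.5)
```

Whether later rounds should keep an earlier set when the reselection comes back empty is a design choice; this sketch keeps the last non-empty set.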
  • the present invention also provides a computer-based method for selecting a set of normalizers for a set of target analytes in an assay with a group of training samples.
  • the method steps include inputting measurements of analytes, including said set of target analytes, into a computer, and executing computer readable instructions on the computer for selecting a set of normalizers for each target analyte.
  • a candidate analyte is selected as a normalizer of the target analyte if the covariation value of the candidate analyte and the target analyte is greater than a threshold.
  • a set of normalizers is then produced for the target analyte.
  • sets of normalizers are produced for said set of target analytes.
  • the number of normalizers that is desired for the target analyte is also provided to the computer as an input parameter, which is then used as the minimum rank value for selection of a covariation threshold.
  • the executing act can be performed either locally or remotely relative to the inputting act.
  • the present invention provides a method for normalizing a measurement of a target analyte in a test sample, the method comprising: receiving, with a computing system, measurements of the target analyte and candidate analytes in a group of training samples; for each of the candidate analytes: determining, with a computing system, a covariation value between the target analyte and the candidate analyte, and selecting the candidate analyte as a normalizer if the covariation value of the candidate analyte and the target analyte is higher than a threshold, thereby selecting a plurality of normalizers; receiving, with a computing system, measurements of the target analyte and the plurality of normalizers in a test sample, and calculating a normalized quotient (NQ) of the target analyte in the test sample, thereby normalizing the measurement of the target analyte.
  • the method further comprises comparing the NQ of the target analyte of the test sample with a normal NQ range, outputting to a user, with a computing system, an indication representing the test sample as normal based on the target analyte if said NQ is greater than or equal to a lower limit of normal NQ range and less than or equal to an upper limit of normal NQ range, or categorizing the test sample as abnormal if said NQ is lower than the lower limit of the normal NQ range or higher than the higher limit of said normal NQ range.
  • the invention provides a computer-readable medium comprising computer program code for controlling a computer system to perform the methods above.
  • the invention provides a computer system comprising the computer-readable medium.
  • This invention also provides a system for categorizing a sample based on a target, the system comprising (1) one or more processors; (2) a memory coupled with the one or more processors via an interconnect; (3) a communications interface coupled with the interconnect and adapted to receive a measurement of a target analyte and measurements of candidate analytes in a group of training samples.
  • the one or more processors are configured to (a) receive values representing measurements of the target analyte and measurements of candidate analytes;
  • the system additionally comprises an output module communicatively coupled with the processor and configured to output an indication, wherein the indication represents the test sample as normal based on the target analyte when said NQ is greater than or equal to a lower limit of normal NQ range and less than or equal to an upper limit of normal NQ range, or categorizing the test sample as abnormal when said NQ is lower than the lower limit of the normal NQ range or higher than the higher limit of said normal NQ range.
  • the present invention also provides a physical computer-readable medium comprising a computer program for performing the following steps: inputting measurements of all measurable analytes including said set of target analytes into a computer; executing a set of computer readable instructions on the computer to select a set of normalizers for each target analyte, wherein a candidate analyte is selected as a normalizer if its covariation value is greater than a threshold; and outputting the set of normalizers for the set of target analytes.
  • the target analyte is a chromosome, a chromosomal bin, or a gene.
  • the target analyte is a human gene.
  • the analytes are measured in a high throughput assay.
  • the target analyte is a clinical diagnostic marker for a disease.
  • the target analyte is mitochondrial DNA.
  • each of the candidate analytes is an autosomal bin.
  • the test sample is a tissue biopsy for a disease being tested.
  • the disease is Charcot-Marie-Tooth type IIA, optic nerve atrophy, diabetes, or cancer.
  • FIG. 1 shows a system adapted for categorizing a sample as normal or abnormal based on a target analyte.
  • FIG. 2 shows a data processing system adapted for use in systems adapted for categorizing a sample as normal or abnormal based on a target analyte in accordance with various embodiments.
  • FIG. 3 illustrates a method of categorizing a sample as normal or abnormal based on a target analyte.
  • FIG. 4 illustrates a computer implemented method of categorizing a sample as normal or abnormal based on a target analyte.
  • FIG. 5 illustrates normalizing the sequencing counts of human sub-chromosomal bins for detection of chromosome 21 trisomy in cfDNA samples from pregnant women’s blood plasma using the methods disclosed herein.

Definitions

  • the term “analyte” refers to the basic unit, element, or entity for analysis.
  • an analyte is a nucleotide sequence, a gene, an amplicon, an RNA, a protein, an array probe, a peptide, an antigen, an antibody, a macromolecule, a SNP, a mutation, a sequence of DNA, a fragment of chromosomal region, a chromosome, or any measurable biomarker.
  • an analyte is quantitatively measured and measurements are normalized for an analysis.
  • the term “target analyte” or “target” refers to an analyte of interest.
  • the term “normalizer” refers to an analyte the measurement of which is used to normalize measurement(s) of the target analyte.
  • the term “normalizers” or “set of normalizers” refers to a collection of such analytes.
  • the term “sets of normalizers” implies normalizers selected for a group of target analytes with each target analyte having its own set of normalizers. In the present invention, the measurements of a target analyte and its normalizers are used to calculate a normalized value called the “normalized quotient”.
  • the term “candidate analyte” or “candidate” refers to an analyte that can serve as a normalizer for a “target analyte”.
  • the present invention provides a method to screen candidate analytes to serve as normalizers.
  • the term “sample” refers to a material source on which an assay is performed in order to obtain quantitative measurements of the material source for analysis.
  • Samples may be ones obtained from an organism or from the environment (e.g., a soil sample, water sample, etc.), or may be those directly obtained from a source (e.g., a biopsy or a tumor, blood, or serum sample) or indirectly obtained, e.g., after culturing and/or one or more processing steps.
  • the terms “assay”, “experiment”, “study”, and “test” are used interchangeably and refer to a biological or chemical assay, or an assay of similar nature, performed with a sample or a group of samples.
  • An assay in this disclosure normally produces quantitative measurements of analytes for each sample in the assay.
  • the term “high throughput” or “high throughput assay” is used to describe an assay in which a plurality of analytes are measured.
  • the term “normalized quotient” (“NQ”) refers to the output value from a normalization formula comprising measurements of the target analyte and normalizers.
  • the normalization formula is a function of target analyte measurements divided by the sum of measurements of all normalizers. Typically the value is generated as the measurement of the target analyte divided by the sum of the measurements of all normalizers.
  • one or more coefficients or weights are included to adjust the measurements for calculation of the normalized quotient value.
  • various coefficients or weights determined empirically or statistically are applied to the measurements of the normalizers to produce a weighted sum, and an NQ is generated by dividing the measurement of the target analyte by the weighted sum.
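The normalized quotient, with and without weights, can be sketched in a few lines. The function name and the example measurements and weights are hypothetical; how weights would be chosen in practice is left open by the text.

```python
def normalized_quotient(target_value, normalizer_values, weights=None):
    """NQ = target measurement / (weighted) sum of normalizer measurements.

    With weights=None every normalizer has weight 1, giving the plain sum.
    """
    if weights is None:
        weights = [1.0] * len(normalizer_values)
    if len(weights) != len(normalizer_values):
        raise ValueError("one weight is required per normalizer")
    denominator = sum(w * v for w, v in zip(weights, normalizer_values))
    if denominator == 0:
        raise ZeroDivisionError("sum of normalizer measurements is zero")
    return target_value / denominator

# Hypothetical measurements in one test sample.
nq_plain = normalized_quotient(120, [60, 40, 100])                  # 120 / 200
nq_weighted = normalized_quotient(120, [60, 40, 100], [1, 0.5, 1])  # 120 / 180
```

Because the same sample-level effects tend to scale both the target and its covarying normalizers, the quotient is intended to cancel much of that shared variation.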
  • the term “normal NQ range” refers to the range of NQs for a target analyte calculated based on a set of a plurality of normalizers in a set of reference samples. The normal NQ range is used to compare with the NQ for a target analyte based on the same set of normalizers in a test sample to determine whether a target analyte in the test sample is normal or abnormal.
  • the term “covariation” refers to the simultaneous change in measurement of two analytes across a group of samples.
  • the terms “measurement of covariation”, “covariation measurement”, and “covariation value” are used interchangeably and refer to a statistical or mathematical similarity measurement, which includes, but is not limited to, the common statistical or mathematical terms “covariance”, “correlation”, “correlation coefficient”, “Pearson correlation”, “Spearman correlation”, and “Kendall correlation”.
  • a target analyte is implied to be one of the two analytes for the calculation thereof; as an example, the term “a covariation value of a candidate analyte” refers to a covariation value of a target analyte and the candidate analyte.
  • the term “covariance” refers to a covariation value through statistical covariance calculation. If X and Y are real-valued variables representing two analytes for an experiment with means E(X), E(Y) and variances var(X), var(Y), respectively, the covariance of (X, Y), or cov(X, Y), is determined by $\operatorname{cov}(X, Y) = E[(X - E(X))(Y - E(Y))]$.
  • the term “correlation” or “correlation coefficient” or “correlation value” refers to a covariation value through statistical correlation calculation. If X and Y are real-valued variables representing two analytes for an experiment with means E(X), E(Y), variances var(X), var(Y), and standard deviations sd(X), sd(Y), respectively, the correlation of (X, Y), or cor(X, Y), is determined by $\operatorname{cor}(X, Y) = \operatorname{cov}(X, Y) / (\operatorname{sd}(X)\,\operatorname{sd}(Y))$.
  • for the Kendall correlation across $n$ samples, $\tau = \dfrac{n_c - n_d}{n(n-1)/2}$, where $n_c$ is the number of concordant pairs and $n_d$ is the number of discordant pairs.
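The covariation measures named above can be computed with the standard library alone. This sketch assumes the population form of covariance, average ranks for ties in the Spearman correlation, and the tau-a form of the Kendall correlation (concordant minus discordant pairs over all pairs); the example series are made up.

```python
from math import sqrt

def covariance(x, y):
    """Population covariance E[(X - E(X))(Y - E(Y))]."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

def pearson(x, y):
    """cov(X, Y) / (sd(X) * sd(Y)); undefined if either series is constant."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Pearson correlation of the rank-transformed series."""
    return pearson(average_ranks(x), average_ranks(y))

def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) / (n * (n - 1) / 2)."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical measurements of two analytes across five samples.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
values = (covariance(x, y), pearson(x, y), spearman(x, y), kendall(x, y))
```

Spearman and Kendall depend only on the ordering of the measurements, which fits the disclosure's point that the covariation value need not rely on a distribution assumption.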
  • the term “threshold value” or “threshold” refers to a minimal value of the
  • the term “covariation-based normalization” or “CBN” refers to the process or the method of selecting normalizers based on candidate analytes’ covariation, covariance, or correlation to the target analyte.
  • the term “test sample” refers to a sample of interest in a study. It is also referred to as a “patient sample” in this disclosure.
  • the term “training samples” refers to a group of samples that are used to generate a training data set for the selection of normalizers for a target analyte. Any sample that has characteristics similar to those of a test sample can be selected as a training sample.
  • the term “reference samples” refers to a group of samples that represent the typical sample populations of a study. NQs of the target analyte in the reference samples are used to determine the median, mean, median absolute deviation (“MAD”), and standard deviation of the NQs of the target analyte, as well as the low end and the high end of the normal range of NQ for the target analyte. Reference samples can be the training samples that are used to select normalizers.
  • the term “control sample” or “control” refers to a sample to be compared within a study. Typically the results from a test sample are compared to those from a control for analysis.
  • control samples can be the training samples used for normalizer selection or the reference samples for the determination of the metrics of samples in the same experiment.
  • the term “Ratio Score” or “RS” refers to the ratio value generated by dividing the NQ of a target analyte by the NQ median or mean of reference samples and adjusting the value by an arbitrary scaling factor or coefficient. RS is a further transformation of NQ and is useful when a statistic or metric is needed for a group of analytes.
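A minimal sketch of the Ratio Score. It assumes the scaling factor is applied by multiplication, which the text leaves open, and the function name and values are hypothetical.

```python
from statistics import mean, median

def ratio_score(nq, reference_nqs, scale=1.0, center=median):
    """RS = scale * NQ / center(reference NQs); center is median or mean."""
    return scale * nq / center(reference_nqs)

# Hypothetical NQs of one target analyte in three reference samples.
reference_nqs = [0.9, 1.0, 1.1]
rs_median = ratio_score(1.5, reference_nqs, scale=2.0)             # 2 * 1.5 / 1.0
rs_mean = ratio_score(1.5, reference_nqs, scale=2.0, center=mean)
```

Dividing by the reference center puts different analytes on a common scale around `scale`, which is what makes a group-level statistic meaningful.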
  • a “CGH array” refers to an array that can be used to compare DNA samples for relative differences in copy number. A CGH array can be used in any assay in which it is desirable to scan a genome with a sample of nucleic acids.
  • a CGH array can be used in a location analysis as described in Wyrick et al.
  • a CGH array thus can also be referred to as a “location analysis array” or an “array for ChIP-chip analysis.”
  • a CGH array provides probes for screening or scanning a genome of an organism and comprises probes from across the genome.
  • the term “SNP” refers to a single nucleotide polymorphism.
  • the polymorphic site is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations).
  • a single nucleotide polymorphism usually arises due to substitution of one nucleotide for another at the polymorphic site.
  • a transition is the replacement of one purine by another purine or one pyrimidine by another pyrimidine.
  • a transversion is the replacement of a purine by a pyrimidine or vice versa.
  • Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele.
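The transition and transversion definitions translate directly into a small classifier. The function name is hypothetical, and the sketch covers single-base DNA substitutions only, not the insertions or deletions also mentioned above.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Classify a single-base DNA substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"

kinds = [classify_substitution(r, a) for r, a in [("A", "G"), ("C", "T"), ("A", "C")]]
```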
  • the term “chromosomal bin” or “bin” refers to a fragment of a chromosome. The bins can be chromosomal fragments that overlap or do not overlap with one another.
  • Chromosomal bins have been commonly used for analysis of subchromosomal regions (Hu et al., Xu et al.).
  • the term “autosomal bin” refers to a bin of an autosome.
  • the term “sequence read” or “read” refers to a nucleotide sequence generated by a sequencing device for a sample.
  • the term “sequencing count” or its simplified version “count” refers to the number of reads that are aligned or mapped to a reference sequence. The count is zero if no read aligns to the reference sequence and a positive integer otherwise.
  • a measurement of the analyte disclosed herein is a sequencing count.
  • the term “chromosomal representation” or “chromosomal binning representation” refers to a numeric measurement for a chromosome or chromosomal bin divided by the summed measurement of all analytes in a sample. This is a way to adjust for input quantity variation in the measurement.
  • the chromosome representation in the context of the assay platform of a sequencing device is the count for the chromosome divided by the count for all chromosomes.
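Chromosomal representation for sequencing counts can be sketched as a simple proportion. The function name and the read counts are hypothetical.

```python
def chromosomal_representation(counts):
    """Divide each chromosome's (or bin's) count by the total count in the sample."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no aligned reads")
    return {name: count / total for name, count in counts.items()}

# Hypothetical read counts for one sample.
counts = {"chr1": 500, "chr21": 100, "chrX": 400}
representation = chromosomal_representation(counts)
```

Because every value is divided by the same sample total, two samples sequenced to different depths become comparable, although this alone does not correct analyte-specific biases.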
  • the term “computer-readable medium” refers to any medium that participates in providing instructions to a processor for execution. Such a medium can take many forms, including but not limited to, non-volatile media, volatile media, and transmission media.
  • a “biological sample” refers generally to a body fluid, tissue, or organ sample from a subject.
  • the biological sample may be a body fluid such as blood, plasma, lymph fluid, serum, urine or saliva.
  • a tissue or organ sample such as a non-liquid tissue sample may be digested, extracted or otherwise rendered to a liquid form; examples of such tissues or organs include cultured cells, blood cells, skin, liver, heart, kidney, pancreas, islets of Langerhans, bone marrow, blood, blood vessels, heart valve, lung, intestine, bowel, spleen, bladder, penis, face, hand, bone, muscle, fat, cornea or the like.
  • a plurality of biological samples may be collected at any one time.
  • the term “categorizing a test sample based on a target analyte” refers to determining whether the measurement of the target analyte in the test sample is within the normal range of the measurement of the target analyte in reference samples.
  • the categorization is carried out by comparing the NQ of the target analyte in the test sample with a normal NQ range determined based on the NQs of the reference samples.
  • a test sample is categorized as normal based on a target analyte if the NQ of the target analyte falls within the normal NQ range (including the lower limit and higher limit) and abnormal if the NQ of the target analyte falls outside the normal NQ range.
  • the term “measurement” refers to any type of quantitative measurement of an analyte, e.g., the amount of the analyte present in a sample, or sequencing counts of the analyte in the sample.
  • a measurement of an analyte can be expressed in various forms, e.g., an amount of an assay signal, or a value resulting from mathematical transformations of the amount of assay signal.
  • Other data processing approaches, such as normalization of assay signals in reference to a population’s mean values, can also be used to produce a measurement for an analyte.
  • This invention can be used to accurately quantify one or more target analytes in a sample.
  • the disclosed method of identifying normalizers and using them to normalize a target analyte measurement is especially useful and advantageous when the number of normalizer candidates for a given target analyte is greater than 10, 100, or 1000, which is typical for high throughput studies, for example, high throughput studies using microarray, PCR, sequencing, or mass spectrometry to quantify levels of a set of DNA, RNA, protein, or metabolite analytes.
  • it is challenging to identify normalizers that perform well across different assay conditions such as different protocols, different samples, different assay times, or different batches of reagents.
  • This disclosure provides methods for identifying normalizers for the target analyte using training samples under a desired assay condition, so that measurements of the target analyte can be normalized and accurately analyzed even if the assay condition changes.
  • These methods disclosed herein include providing a sample comprising a biological sample.
  • the term “providing” is to be construed broadly. The term is not intended to refer exclusively to a subject who provided a biological sample. For example, a technician in an off-site clinical laboratory can be said to “provide” the sample, for example, as the sample is prepared for purification by chromatography.
  • the biological sample is preferably an in vitro sample but is not limited to any particular sample type.
  • the biological sample may also include other components, such as solvents, buffers, anticlotting agents, and the like.
  • a sample used in the disclosure is an aliquot of material, frequently an aqueous solution or an aqueous suspension derived from biological material containing DNA.
  • Samples to be assayed for the presence of the target nucleic acid by the methods of the present invention include, for example, cells, tissues, e.g., tumors, homogenates, lysates, extracts, and other biological molecules and mixtures thereof.
  • samples typically used in the methods of the invention include human and animal body fluids such as whole blood, serum, plasma, cerebrospinal fluid, sputum, bronchial washings, bronchial aspirates, urine, lymph fluids and various external secretions of the respiratory, intestinal and genitourinary tracts, tears, saliva, milk, white blood cells, myelomas and the like; biological fluids such as cell culture supernatants; tissue specimens which may or may not be fixed; and cell specimens which may or may not be fixed.
  • the samples used in the methods of the present invention will vary depending on the assay format and the nature of samples, e.g. the characteristics of the tissues, cells, extracts or other materials, especially biological materials, to be assayed. Methods for preparing e.g.
  • the biological sample is at least about 1-100 μL, at least about 10-75 μL, or at least about 15-50 μL in volume. In certain embodiments, the biological sample is at least about 20 μL in volume.
  • the method disclosed herein can be used to analyze any target analyte present in a sample.
  • Non-limiting examples of the target analytes include a gene, an RNA, a peptide, a cluster of genes, a chromosome, a chromosomal region called a bin, or a Single Nucleotide Polymorphism (SNP).
  • the target analyte is a prognostic or diagnostic marker, the amount of which in the sample is associated with the presence or non-presence of a disease, e.g., cancer. For example, copy number aberrations of chromosomal region detected in human samples have been indicated as prognostic markers in cancer associated phenotypes (Sapkota et al., Krepischi et al.).
  • RNA biomarkers have been used for breast cancer prognosis (Park et al.).
  • High level of cell surface protein Her2 is a companion diagnostic biomarker for the effectiveness of cancer drug Herceptin (Dent et al.).
  • the target analyte is a marker, the amount of which is correlated with the effectiveness of a treatment.
  • the method disclosed herein can be used to analyze measurement of target analytes produced by any assays that generate quantifiable and detectable signals. Such methods are well known in the art, for example, real time PCR, microarray, or sequencing.
  • the invention provides a method of selecting normalizers for a target analyte from candidate analytes.
  • the method includes calculating a covariation value for each candidate analyte and the target analyte in a group of training samples and selecting a candidate analyte as a normalizer if its covariation value is higher than a threshold (see Examples I-V below).
  • a covariation value quantifies the simultaneous change or similarity in measurement of a candidate analyte and the target analyte in a group of training samples.
  • the group of training samples comprises at least 3, 5, or 10 samples.
  • the group of training samples comprises one or more reference samples.
  • the group of training samples comprises 30-100, or 50-150 samples.
  • the group of training samples comprises 100-200 samples.
  • the group of training samples includes the test sample itself.
  • the covariation value is a mathematical correlation such as Spearman correlation, Pearson correlation, or Kendall correlation.
  • the covariation value is a mathematical covariance.
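  • As an illustration of the covariation values named above, the following minimal sketch computes a Pearson correlation and a Spearman (rank) correlation between the measurements of one candidate analyte and the target analyte across training samples. The function names are hypothetical, the handling of a constant analyte is an assumption, and only the Python standard library is used.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based ranks in the run
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long measurement vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant analyte is treated as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

  • Each vector holds one measurement per training sample, in the same sample order for the target and the candidate.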
  • the threshold for selection of a normalizer is determined either empirically or by statistical analysis.
  • the covariation value is a Spearman, a Pearson, or a Kendall correlation that ranges from -1 to 1, and the threshold is set to a decimal number greater than or equal to 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, or 0.99.
  • the threshold of the covariation value is determined by a minimum rank among all candidates’ covariation values ranked high to low. The minimum rank is a positive integer between 1 and the total number of candidate analytes.
  • This rank determined threshold effectively selects the top n number of candidate analytes of highest covariation values as normalizers, where n represents a desired number of normalizers for a target analyte, for example, at least 3, 5, 10, 50, 100, 150, or 200.
  • the threshold of the covariation value is a decimal number greater than or equal to the 10th, 20th, 30th, 40th, 50th, 55th, 60th, 65th, 70th, 75th, 80th, 85th, 90th, 91st, 92nd, 93rd, 94th, 95th, 96th, 97th, 98th, or 99th percentile of the covariation values for all pairs of candidate analytes and said target analyte.
  • This percentile-based threshold allows effective selection of candidate analytes having covariation values within the high percentile of all covariation values as normalizers.
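  • The three ways of setting the threshold described above (a fixed covariation value, a minimum rank, and a percentile) can be sketched as follows. The function names and the dictionary layout (candidate name mapped to covariation value) are hypothetical, and the nearest-rank percentile definition is an assumption; the text does not specify which percentile convention is used.

```python
import math


def select_by_value(covariation, threshold):
    """Keep candidates whose covariation value is higher than a fixed threshold."""
    return [name for name, value in covariation.items() if value > threshold]


def select_top_n(covariation, n):
    """Rank-based threshold: keep the n candidates with the highest values."""
    ranked = sorted(covariation, key=covariation.get, reverse=True)
    return ranked[:n]


def select_by_percentile(covariation, percentile):
    """Keep candidates at or above the given percentile of all values.

    Assumption: the nearest-rank percentile definition is used.
    """
    values = sorted(covariation.values())
    rank = max(1, math.ceil(percentile / 100 * len(values)))
    cutoff = values[rank - 1]
    return [name for name, value in covariation.items() if value >= cutoff]
```

  • For example, with covariation values {"a": 0.9, "b": 0.5, "c": 0.1, "d": -0.3}, a fixed threshold of 0.4 and a 75th percentile cutoff both keep "a" and "b", while a minimum rank of 1 keeps only "a".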
  • Normalized quotient (NQ) is the result generated by dividing the measurement of the target analyte by the sum of the measurements of the normalizers, which is shown in the following formula:
    $$NQ_i = \frac{M_i}{\sum_{j \in S_i} M_j}$$
  • where $NQ_i$ is the NQ value for target analyte $i$, $M_i$ denotes the measurement for target analyte $i$, and $M_j$ denotes the measurement of the $j$-th normalizer in the set of normalizers $S_i$ for target analyte $i$.
  • various coefficients or weights determined empirically or statistically are applied to the measurements of each of the normalizers to calculate a weighted sum in order to determine the NQ value, using the following formula:
    $$NQ_i = \frac{M_i}{\sum_{j \in S_i} w_j M_j}$$
  • where $NQ_i$ is the NQ value for target analyte $i$, $M_i$ denotes the measurement for target analyte $i$, $w_j$ denotes the weight of the $j$-th normalizer in the set of normalizers $S_i$ for target analyte $i$, and $M_j$ denotes the measurement of the $j$-th normalizer in the set of normalizers $S_i$ for target analyte $i$.
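  • The normalized quotient described above (the target measurement divided by the plain or weighted sum of the normalizer measurements) can be sketched as a single function. The function name and argument layout are hypothetical; with no weights supplied, every normalizer is given weight 1, which reduces to the unweighted quotient.

```python
def normalized_quotient(target, normalizers, weights=None):
    """Divide the target measurement by the (weighted) sum of normalizers.

    target      -- measurement of the target analyte in one sample
    normalizers -- measurements of the selected normalizers in the same sample
    weights     -- optional per-normalizer weights (default: all 1.0)
    """
    if weights is None:
        weights = [1.0] * len(normalizers)
    if len(weights) != len(normalizers):
        raise ValueError("one weight is required per normalizer")
    denominator = sum(w * m for w, m in zip(weights, normalizers))
    if denominator == 0:
        raise ValueError("normalizer sum is zero; NQ is undefined")
    return target / denominator
```

  • For example, a target measurement of 10 with normalizer measurements 20 and 30 gives an NQ of 10 / 50 = 0.2; with weights 0.5 and 1.0 the denominator becomes 40 and the NQ is 0.25.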
  • the analytes in the described invention may be polynucleotide probes for detection of aberrations in genomic DNA regions using a microarray chip (Example I).
  • They may be polynucleotide probes for measuring the level of cDNA converted from RNA signals in biological samples using a microarray. They may be amplicons defined by specific PCR primer sets in a high throughput assay that is designed to detect either RNA expression levels (RT-PCR assay) or specific DNA targets.
  • the analytes may also be polynucleotide sequences that are defined bioinformatically or experimentally and detected by high throughput sequencing.
  • the normalized quotient (NQ) of a target analyte can be used to determine whether a biological sample is normal or abnormal. This can be achieved by comparing the NQ of a target analyte to an NQ range determined through a set of reference samples, which comprise at least 3, 5, or 10 samples.
  • the set of reference samples comprises 30-100, or 50-150 samples.
  • the set of reference samples comprises 100-200, 300-500, or 200-400, or 500-1000 samples.
  • the set of reference samples includes the test sample itself.
  • the detection of abnormality of a test sample is based on a statistical z-score of an analyte using the formula:
    $$z_i = \frac{NQ_i - \mu_i}{\sigma_i}$$
  • where $z_i$ represents the analyte’s z-score of the test sample, $NQ_i$ is the NQ of the analyte in the test sample, $\mu_i$ is the NQ median or NQ mean of the reference samples, and $\sigma_i$ is the NQ median absolute deviation (MAD) or standard deviation (SD) of the reference samples.
  • the z-scores obtained in high throughput assays typically follow a Gaussian distribution, and thus the normal range for the z-score is set as -2 to 2, -2.5 to 2.5, -3 to 3, -3.5 to 3.5, -4 to 4, or another fractional value range of similar size.
  • if an analyte’s z-score for a test sample is not within the normal range, then the test sample is classified as abnormal for the analyte.
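  • A minimal sketch of the z-score classification described above, using the median and the median absolute deviation (MAD) of the reference NQs. The function names and the default limit of 3 are illustrative assumptions; the MAD is used unscaled here because the text does not mention a consistency constant.

```python
from statistics import median


def mad(values):
    """Median absolute deviation around the median (unscaled)."""
    center = median(values)
    return median([abs(v - center) for v in values])


def robust_z_score(test_nq, reference_nqs):
    """z-score of a test NQ relative to the reference median and MAD."""
    spread = mad(reference_nqs)
    if spread == 0:
        raise ValueError("reference NQs have zero MAD; z-score is undefined")
    return (test_nq - median(reference_nqs)) / spread


def is_abnormal(z, limit=3.0):
    """Abnormal if z lies outside the normal range [-limit, limit]."""
    return abs(z) > limit
```

  • With reference NQs of 10, 11, 9, 12, and 8, the median is 10 and the MAD is 1, so a test NQ of 14 has a z-score of 4 and is classified as abnormal under a -3 to 3 normal range.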
  • This z-score based classification can, for example, be applied for detection of chromosomal aneuploidy as described in Sehnert et al. or Palomaki et al.
  • In practice, the normal range of NQ based on average and deviation can be used equivalently to that of the z-score.
  • the normal range of NQ is defined by a low end, which is the median of the NQs of the target analytes in a set of reference samples minus a decimal number that is greater than or equal to 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, or 5 times of the median absolute deviation (MAD); and a high end, which is the median plus a decimal number that is greater than or equal to 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, or 5 times of the median absolute deviation (MAD).
  • the low end of the NQ range is the mean of the NQs of the target analytes in a set of reference samples minus 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, or 5 times of the standard deviation (SD) and the high end of the range is the mean plus 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, or 5 times of the standard deviation (SD).
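  • The two definitions of the normal NQ range given above (median plus or minus a multiple of the MAD, or mean plus or minus a multiple of the SD) can be sketched as follows, together with the inclusive categorization rule. The function names, the default multiple of 3, and the use of the sample standard deviation are assumptions made for illustration.

```python
from statistics import mean, median, stdev


def normal_nq_range(reference_nqs, k=3.0, robust=True):
    """Return (low, high) limits of the normal NQ range.

    robust=True  -> median +/- k * MAD
    robust=False -> mean   +/- k * SD (sample SD; an assumption)
    """
    if robust:
        center = median(reference_nqs)
        spread = median([abs(v - center) for v in reference_nqs])
    else:
        center = mean(reference_nqs)
        spread = stdev(reference_nqs)
    return center - k * spread, center + k * spread


def categorize(test_nq, nq_range):
    """Normal if the NQ lies within the range, limits included."""
    low, high = nq_range
    return "normal" if low <= test_nq <= high else "abnormal"
```

  • With reference NQs of 10, 11, 9, 12, and 8 and a multiple of 2, the robust range is 8 to 12; a test NQ of exactly 12 is still categorized as normal because the limits are included.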
  • the training samples and reference samples are of the same tissue type as the test sample. In some embodiments, the training samples and reference samples are not of a different tissue type from the test sample. In some embodiments, training samples used to select the normalizers are of the same tissue type as the reference samples. In some embodiments, training samples also serve as reference samples.
  • the claimed methods are especially advantageous for studies in which the number of data points for each sample is large such that the bin analysis can be performed.
  • the data are generated by whole genome sequencing to detect chromosomal aneuploidy.
  • a chromosome can be digitally divided into fragments called bins.
  • the claimed methods can be performed on data for a collection of bins, in order to detect amplification or deletion of a bin or a part of a bin in a sample.
  • these bins are designated as analytes, and each bin can be normalized using the method described above and also as illustrated in the Examples below.
  • each bin is normalized, i.e., by calculating a NQ
  • the data of the NQs for the bins belonging to the same chromosome can be aggregated to determine whether there is an abnormality for the chromosome. Whether bins belong to the same chromosome can be readily determined by chromosomal mapping.
  • This method of normalizing data at chromosome bins and then aggregating to a chromosome allows more accurate analyses when data are not even across the full chromosome. The method also allows subchromosomal analysis, where an abnormality is detected on only a portion of a chromosome.
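  • The bin-then-chromosome aggregation described above can be sketched as a simple grouping step. The text does not state which aggregate statistic is used, so the median of the bin NQs per chromosome is an assumption here, and the function name and bin identifiers are hypothetical.

```python
from collections import defaultdict
from statistics import median


def aggregate_by_chromosome(bin_nqs, bin_to_chromosome):
    """Group per-bin NQs by chromosome and summarize each group.

    bin_nqs           -- dict mapping bin id to its NQ in one sample
    bin_to_chromosome -- dict mapping bin id to its chromosome (from mapping)
    Assumption: the median is used as the per-chromosome summary.
    """
    grouped = defaultdict(list)
    for bin_id, nq in bin_nqs.items():
        grouped[bin_to_chromosome[bin_id]].append(nq)
    return {chrom: median(values) for chrom, values in grouped.items()}
```

  • The per-chromosome summary can then be compared against a normal range derived from the same summary in reference samples; keeping the per-bin NQs available also permits the subchromosomal analysis mentioned above.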
  • the bins can be of various or equal lengths, and may be overlapping or non-overlapping.
  • bins from human samples were used that are non-overlapping and have lengths ranging from 100 to 50,000,000 nucleotides, e.g., 200-5,000,000, 300-100,000, 1,000-3,000,000, or 5,000-30,000 nucleotides.
  • one bin is designated as a target analyte and all other bins are designated as candidate analytes to select normalizers that are used to normalize the
  • systems for facilitating normalizing a target analyte and/or categorizing a test sample based on the target analyte.
  • Such systems can include one or more computing devices and can be communicatively coupled to a network.
  • Such computing device can include a discrete computing device, a computing device tied into a main-frame system of a medical facility or can include one or more portable devices that are communicatively coupled to a network or server associated with a treating physician or medical facility.
  • one or more of the computing devices can include a portable computing device of a treating physician, such as a tablet or handheld device.
  • Such systems are configured, typically with programmed instructions recorded on a memory thereof, to select normalizers for normalizing the target analyte and categorize the test sample.
  • the system includes a computation engine that determines the covariation value between the target analyte and each of the candidate analytes; selects a candidate analyte as a normalizer if the covariation value of the candidate analyte is higher than a threshold, thereby selecting a plurality of normalizers; calculates NQs of the target analyte in the reference samples and determines a normal NQ range for the target analyte; determines the NQ of the target analyte in the test sample; and compares the NQ of the target analyte of the test sample with the normal NQ range.
  • the computation engine can be defined by programmable instructions recorded on a memory of the system, which can include a memory accessed through a server or a memory coupled with one or more processors of one or more computing devices of the system.
  • FIG.1 depicts an example block diagram of a system configured to normalize measurements of a target analyte and/or categorize a test sample based on the analyte.
  • system 100 includes a computer system 115 coupled to a network or server 110 that includes medical data associated with the patient from one or more data sources 105 (e.g. laboratory output of sample results).
  • Data sources 105 can include values
  • network 110 can be a local area network (LAN), a wide-area network (WAN), a wireless network, a bus connection, an interconnect, or any other means of communicating data or control information across one or more transmission lines or traces in an electronic system.
  • Although data sources 105 are accessed through a network or server 110, it is appreciated that data sources 105 can communicate data directly to the computing system 115, or data can be manually input into computer system 115 through a user input.
  • Computer system 115 includes a processor 101 and a system memory 104 coupled together via an interconnect bus 108.
  • processor 101 and system memory 104 can be directly interconnected, or can be connected indirectly through one or more intermediary components or units.
  • Processor 101 and system memory 104 can be any general-purpose or special-purpose components as are known in the art and are not limited to any particular type of processor or memory system.
  • System memory 104 can be configured to store system and control data for automatically performing the normalization and sample categorization methods described herein.
  • computing system 115 is coupled with a database to receive data.
  • the data stored on database can include data values corresponding to the measurements of the target analyte and candidate analytes from a group of training samples; data values corresponding to the measurements of the target analyte and normalizers from the testing sample and a set of reference samples; and/or data pertaining to the normalization of the target analyte in the test sample.
  • the processor can determine the covariation value between the target analyte and each of the candidate analytes, selecting a candidate as a normalizer if the covariation value of the candidate analyte is equal to or greater than a threshold.
  • the threshold can be stored on system memory 104, or can be automatically obtained from a database as needed or obtained from another data source 105 accessed through communication with network 110.
  • the processor can also determine a NQ for the target analyte in the test sample and each of the set of reference samples based on the measurements of the target analyte and the normalizers in each sample.
  • the processor determines a NQ range having a lower limit, e.g., by calculating a mean minus a multiple of the standard deviation, and a higher limit, by calculating the mean plus a multiple of the standard deviation.
  • One advantage of including programmable instructions that query an external data source for the pre-determined threshold for selecting normalizers, or for the multiple used to determine the limits of the normal NQ range, is that these values can be changed or updated periodically, as needed, without altering the configuration of computing system 115.
  • Computing system 115 receives input data 103 from the various data sources through communications interface 120.
  • Computer system 115 processes the received data according to programmed instructions recorded on memory 104 and provides resulting data pertaining to the differential diagnosis to a user via output module 125.
  • Output module 125 can be
  • the output module 125 outputs an indication representing the categorization of the test sample to the user (e.g.“normal for target analyte”, “abnormal for target analyte”).
  • Output module 125 can further output a list of normalizers used to normalize the target analyte.
  • the output module 125 can output the processed data directly to the network or to a health information database so that the categorization result or associated data can be accessed by various other computing devices communicatively coupled with the network or database.
  • the computing system 115 receives values representing measurements of the target analyte and candidate analytes in a group of training samples, via network 110, and provides those values to computation engine 130. Computation engine 130 then causes the processor 101 to determine, for each of the candidate analytes, the covariation value between the target analyte and the candidate analyte, and to select the candidate analyte as a normalizer if the covariation value of the candidate analyte is higher than a threshold, thereby selecting a plurality of normalizers. The computing system 115 also receives measurements of the target analyte and the plurality of the identified
  • the computation engine then causes the processor to calculate NQs of the target analyte in the reference samples and determine a normal NQ range for the target analyte; determine the NQ of the target analyte in the test sample; and compare the NQ of the target analyte of the test sample with the normal NQ range to determine whether the NQ of the target analyte is within the normal NQ range. If the NQ of the target analyte in the test sample falls within the normal NQ range, a signal to that effect can be output by the output module, indicating that the test sample is normal based on the target analyte.
  • Computation engine 130 may be implemented using specially configured computer hardware or circuitry or general-purpose computing hardware programmed by specially designed software modules or components; or any combination of hardware and software. The techniques described herein are not limited to any specific combination of hardware circuitry or software.
  • computation engine 130 may include off-the-shelf circuitry components or custom-designed circuitry.
  • the computation functionality may be performed in software stored in memory 104 and executed by the processor 101.
  • FIG.2 depicts an example block diagram of a data processing system upon which the disclosed embodiments may be implemented. Embodiments of the present invention may be practiced with various computer system configurations such as hand-held devices,
  • microprocessor systems, microprocessor-based or programmable user electronics
  • FIG.3 depicts a Data Processing System 1000 that can be used with the embodiments described herein.
  • network computers and other data processing systems which have fewer components or perhaps more components may also be used.
  • the data processing system could be distributed across multiple computing devices that are communicatively coupled.
  • the data processing system of FIG.3 can be a personal computer (PC), workstation, tablet, smartphone or other hand-held wireless device, or any device having similar functionality.
  • the data processing system 1001 includes a system bus 1002 which is coupled to a microprocessor 1003, a Read-Only Memory (ROM) 1007, a volatile Random Access Memory (RAM) 1005, as well as other nonvolatile memory 1006.
  • microprocessor 1003 is coupled to cache memory 1004.
  • System bus 1002 can be adapted to interconnect these various components together and also interconnect components 1003, 1007, 1005, and 1006 to a display controller and display device 1008, and to peripheral devices such as input/output (“I/O”) devices 1010.
  • I/O devices can include keyboards, modems, network interfaces, printers, scanners, video cameras, or other devices well known in the art.
  • I/O devices 1010 are coupled to the system bus 1002 through I/O controllers 1009.
  • the I/O controller 1009 includes a Universal Serial Bus (“USB”) adapter for controlling USB peripherals or other type of bus adapter.
  • RAM 1005 can be implemented as dynamic RAM (“DRAM”) which requires power continually in order to refresh or maintain the data in the memory.
  • the other nonvolatile memory 1006 can be a magnetic hard drive, magnetic optical drive, optical drive, DVD RAM, or other type of memory system that maintains data after power is removed from the system.
  • While nonvolatile memory 1006 is shown as a local device coupled with the rest of the components in the data processing system, it will be appreciated that the described techniques can use a nonvolatile memory remote from the system, such as a network storage device coupled with the data processing system through a network interface such as a modem or Ethernet interface (not shown).
  • hardwired circuitry may be used independently, or in combination with software instructions, to implement these techniques.
  • the described functionality may be performed by specific hardware components containing hardwired logic for performing operations, or by any combination of custom hardware components and programmed computer components.
  • the techniques described herein are not limited to any specific combination of hardware circuitry and software.
  • Embodiments herein may also be in the form of computer code stored on a computer-readable storage medium embodied in computer hardware or a computer program product.
  • Computer-readable media can be adapted to store computer program code, which when executed by a computer or other data processing system, such as data processing system 1000, is adapted to cause the system to perform operations according to the techniques described herein.
  • Computer-readable media can include any mechanism that stores information in a form accessible by a data processing device such as a computer, network device, tablet, smartphone, or any device having similar functionality.
  • Examples of computer-readable media include any type of tangible article of manufacture capable of storing information thereon such as a hard drive, floppy disk, DVD, CD-ROM, magnetic-optical disk, ROM, RAM, EPROM, EEPROM, flash memory and equivalents thereto, a magnetic or optical card, or any type of media suitable for storing electronic data.
  • Computer-readable media can also be distributed over a network-coupled computer system, which can be stored or executed in a distributed fashion.
  • FIG.3 shows an exemplary method of categorizing a test sample for a target analyte.
  • the method includes steps of: obtaining measurements of a target analyte and candidate analytes; calculating a covariation value between the target analyte and each of the candidate analytes; for each of the candidate analytes, selecting the candidate as a normalizer if the covariation value is equal to or greater than the threshold, thereby selecting a plurality of normalizers; calculating the NQs of the target analyte in the test sample and reference samples based on the measurements of the selected normalizers; determining a normal range of NQ for the target analyte from reference samples; comparing the NQ of the target analyte with the normal NQ range; categorizing the sample as normal based on the target analyte if the NQ falls within the normal NQ range or as abnormal if the NQ falls outside the normal NQ range.
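  • The sequence of steps listed above can be sketched end to end as follows. This is only an illustrative outline under stated assumptions: the function name and data layout are hypothetical, Pearson correlation stands in for the covariation value, the normal range is taken as the mean plus or minus a multiple of the sample standard deviation, and the default threshold and multiple are arbitrary.

```python
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant vector (assumption)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    return cov / var ** 0.5 if var else 0.0


def categorize_sample(target, training, references, test, threshold=0.4, k=3.0):
    """Select normalizers, compute NQs, and categorize the test sample.

    training   -- dict: analyte -> measurements across the training samples
    references -- list of dicts: analyte -> measurement, one per reference sample
    test       -- dict: analyte -> measurement in the test sample
    """
    # Steps 1-3: covariation of each candidate with the target, then selection.
    normalizers = [
        cand for cand in training
        if cand != target and pearson(training[target], training[cand]) >= threshold
    ]
    if not normalizers:
        raise ValueError("no candidate reached the covariation threshold")

    def nq(sample):
        return sample[target] / sum(sample[n] for n in normalizers)

    # Steps 4-5: reference NQs and the normal NQ range (mean +/- k * SD).
    ref_nqs = [nq(sample) for sample in references]
    center, spread = mean(ref_nqs), stdev(ref_nqs)
    low, high = center - k * spread, center + k * spread

    # Steps 6-7: compare the test NQ with the range, limits included.
    label = "normal" if low <= nq(test) <= high else "abnormal"
    return label, normalizers
```

  • The function returns both the categorization and the list of selected normalizers, so that either can be reported.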
  • FIG.4 shows another exemplary method of categorizing a test sample for a target analyte by use of a computing system adapted for performing such a categorization.
  • the computing system can include one or more computing devices that can be communicatively coupled with a network or server.
  • Such a method optionally includes a step of receiving a request for categorizing a test sample to be categorized for a target analyte. Such a request could be made by a physician for a patient and would be input through a user interface coupled with the computing device.
  • categorization method can be performed automatically for such a patient without requiring a request from the treating physician or other personnel.
  • the method includes steps of: obtaining measurements of a target analyte and candidate analytes; calculating a covariation value between the target analyte and each of the candidate analytes; for each of the candidate analytes, selecting the candidate as a normalizer if the covariation value is equal to or greater than the threshold, thereby selecting a plurality of normalizers; calculating the NQs of the target analyte in the test sample and reference samples based on the measurements of the selected normalizers; determining a normal range of NQ for the target analyte from reference samples; comparing the NQ of the target analyte with the normal NQ range; categorizing the sample as normal based on the target analyte if the NQ falls within the normal NQ range or as abnormal if the NQ falls outside the normal NQ range; outputting a
  • the method optionally includes a step of outputting the selected normalizers.
  • FIG.4 depicts a particular embodiment of a process; other sequences of operations may also be performed in alternative embodiments. For example, certain steps can be performed by another computing device communicatively coupled with the computing device or the above operations could be performed in a different order. Moreover, the individual operations may include multiple sub-steps that may be performed in various sequences as appropriate and additional operations may be added or removed depending on the particular applications.
  • One of ordinary skill in the art would recognize the many possible variations, modifications, and alternatives.
  • the covariation based normalization method disclosed herein can be used in any assay that involves quantitative analyses, for example, assays that are used to quantify biomarker expression or determine the amount of RNA or DNA in biosamples; and assays used to determine the presence of an abnormal number of chromosomes in a cell, which is commonly referred to as aneuploidy.
  • the most common aneuploidies in the human population are chromosome 21 trisomy (associated with Down syndrome), chromosome 18 trisomy (associated with Edwards syndrome), and chromosome 13 trisomy (associated with Patau syndrome).
  • the methods disclosed herein can also be used to quantify mitochondrial DNA (mtDNA)
  • the target analyte is mtDNA.
  • the target analyte is a chromosome or a chromosomal bin comprising mtDNA.
  • Comparative genomic hybridization (CGH) is useful for detecting small chromosomal abnormalities, for example, small genetic imbalances (gains or losses of chromosomal material), also known as chromosome copy number variations (CNVs), because these small genetic imbalances are difficult to detect by routine cytogenetics.
  • In a CGH array, short DNA sequences called probes, corresponding to known chromosomal loci spanning the genome, are fixed to a solid surface. The composition of these sequences affects the size of the smallest detectable chromosomal anomaly.
  • the typical array includes loci of common microdeletion or duplication syndromes, as well as numerous sub-telomeric and peri-centromeric regions. Sub-telomeric locations are sites where DNA copy number alterations frequently occur.
  • the CGH method first hybridizes differentially fluorescently-labeled DNAs from patient samples and control samples to the array. After hybridization, the fluorescence signals from the patient samples and the control samples are detected and the ratios of the fluorescent signal of the patient sample to that of the control sample are calculated. In some cases, the logarithmically transformed counterparts of these ratios (log ratio) are calculated. These data are then processed through software to determine any copy number differences between the control and the patient DNA and serve as bases for clinical diagnosis.
  • This traditional method of analysis of CGH array data is unable to produce reliable results because the signal ratios used by this method are not adequately normalized for at least two reasons.
  • First, the two fluorescently-labeled DNAs from the patient and control samples could interfere with each other during hybridization.
  • Second, the accuracy of signal ratios depends on both the patient samples and the control samples; variation in the measurement of control samples could render the signal ratios significantly skewed, which is especially problematic for the analytes producing low signals in the control sample. Therefore the results from CGH analyses are often not optimal, especially for assays that require high accuracy, e.g., a non-invasive prenatal test using pregnant women’s blood plasma.
  • the claimed methods, i.e., the covariation based normalization (CBN) methods, solve the problems associated with inadequate normalization of the CGH array.
  • the CBN process involves a normalizer selection process using a group of training samples or equivalents.
  • the CBN method does not require two fluorescence labeling as such used in the CGH array and works equally well with one fluorescence labeling assay.
  • STANFORD.24, STANFORD.35, STANFORD.38, and STANFORD.A were selected to serve as the training samples to determine normalizers for each of the 6095 probes.
  • The dataset, consisting of measurements of 6095 probes (designated as probes #1 through #6095) from each of the 37 samples (designated as samples #1 through #37), was preprocessed by the following steps: 1) the data, originally generated as log ratios, were converted to ratio values by an exponential function with base 2; and 2) the ratio values were divided by the sample median on each array to adjust for variations in input DNA quantity.
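  • The two preprocessing steps can be sketched as follows. This is a minimal illustration rather than the code used in the example: it assumes the log ratios are held in a NumPy array with one row per sample (array) and one column per probe, and the function name `preprocess_log_ratios` is hypothetical.

```python
import numpy as np

def preprocess_log_ratios(log2_ratios):
    """Convert log2 ratios to ratios, then divide each sample by its own median.

    log2_ratios: array of shape (n_samples, n_probes); this layout is an assumption.
    """
    # Step 1: exponential function with base 2 turns log ratios back into ratios.
    ratios = np.exp2(np.asarray(log2_ratios, dtype=float))
    # Step 2: one median per sample (row), kept 2-D so it broadcasts over probes.
    sample_medians = np.median(ratios, axis=1, keepdims=True)
    return ratios / sample_medians

# Toy input: 2 samples x 3 probes.
toy = np.array([[0.0, 1.0, -1.0],
                [1.0, 2.0, 3.0]])
adjusted = preprocess_log_ratios(toy)
```

    After this step the median ratio of every sample is 1, so samples with different input DNA quantities become comparable. Real array data usually contain missing values, which this sketch does not handle.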
  • The selection of normalizers began by designating probe #1 as the target analyte and the remaining 6094 probes as candidate analytes.
  • A Spearman correlation between probe #1 (the target analyte) and probe #2 (a candidate analyte) was calculated using the 37 signals of probe #1 (the median-adjusted ratio values for probe #1 from the 37 input samples) and the 37 signals of probe #2 from the same 37 samples. The process was repeated to calculate a Spearman correlation for the probe pairs #1 versus #3, #1 versus #4, and so on through the last pair, #1 versus #6095.
  • A candidate analyte having a Spearman correlation value higher than a threshold of 0.4 was selected as a normalizer.
  • A Spearman threshold of 0.4 typically produces 50 to 500 normalizers from 6094 candidate analytes for a given target analyte.
  • 278 normalizers were selected for probe #1 using the described method.
  • Probe #2 was then designated as the target analyte, and probes #1 and #3 through #6095 were designated as candidate analytes.
  • 135 probes were identified as normalizers for probe #2.
  • The above process was then repeated until normalizers for probes #3 through #6095 were determined.
  • The normalizers determined for each probe on the CGH array were used to normalize the signals for samples (including the training samples and other samples) assayed on the CGH array.
  • The 278 and 135 normalizers for probes #1 and #2 are shown in Table 1 along with their Spearman correlations.
  • The same normalizers can be used to normalize data for an existing assay or any new assay on the same CGH array.
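  • The per-target selection loop described above can be sketched in vectorized form. This is an illustrative sketch under stated assumptions: the preprocessed signals are in a samples-by-analytes NumPy array, Spearman correlation is computed as the Pearson correlation of per-analyte ranks, and the function name `select_normalizers` is hypothetical.

```python
import numpy as np
from scipy.stats import rankdata

def select_normalizers(signals, threshold=0.4):
    """Return {target index: [normalizer indices]} for every analyte.

    signals: array of shape (n_samples, n_analytes); this layout is an assumption.
    """
    # Rank each analyte across samples (ties receive average ranks).
    ranks = rankdata(np.asarray(signals, dtype=float), axis=0)
    # Spearman correlation equals the Pearson correlation of the ranks.
    rho = np.corrcoef(ranks, rowvar=False)
    # A target is never a candidate for itself.
    np.fill_diagonal(rho, -np.inf)
    return {t: np.flatnonzero(rho[t] > threshold).tolist()
            for t in range(rho.shape[0])}

# Toy data: 5 samples x 3 analytes.
toy = np.array([[1.0,  2.0, 5.0],
                [2.0,  4.0, 4.0],
                [3.0,  5.0, 3.0],
                [4.0,  8.0, 2.0],
                [5.0, 10.0, 1.0]])
normalizers = select_normalizers(toy)
```

    In the toy data, analytes 0 and 1 rise together and select each other, while analyte 2 is anti-correlated with both and gets no normalizers. For 6095 probes the full correlation matrix holds about 37 million values, so looping over targets one at a time, as the text describes, trades speed for lower memory use.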
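  • The NQ value for each target probe was calculated as $Q_i = s_i / \sum_{j} N_{ij}$, where $Q_i$ denotes the NQ value for target probe $i$, $s_i$ denotes the signal ratio for target probe $i$, and $N_{ij}$ denotes the signal ratio of the $j$-th normalizer in the set of normalizers $N_i$ for target probe $i$.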
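  • A minimal sketch of the NQ calculation follows. It assumes that NQ is the signal ratio of the target divided by the sum of the signal ratios of its normalizers, as Example II describes. The function name `normalized_quantity` and the dictionary layout of the normalizer sets are hypothetical.

```python
import numpy as np

def normalized_quantity(signals, normalizers):
    """Compute NQ = target signal / sum of normalizer signals, per sample.

    signals: array of shape (n_samples, n_analytes).
    normalizers: dict mapping target index -> list of normalizer indices.
    """
    signals = np.asarray(signals, dtype=float)
    nq = np.full(signals.shape, np.nan)
    for target, norm_idx in normalizers.items():
        if norm_idx:  # targets without normalizers are left as NaN
            nq[:, target] = signals[:, target] / signals[:, norm_idx].sum(axis=1)
    return nq

# Toy data: 2 samples x 3 analytes with hand-picked normalizer sets.
toy_signals = np.array([[2.0, 1.0, 3.0],
                        [4.0, 2.0, 2.0]])
toy_sets = {0: [1, 2], 1: [0], 2: []}
toy_nq = normalized_quantity(toy_signals, toy_sets)
```

    For sample 1 and target 0 this gives 2 / (1 + 3) = 0.5. Leaving targets without normalizers as NaN is a design choice of this sketch, not something the text specifies.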
  • Coding or non-coding RNA expression assays have become powerful tools for the discovery of specific RNA biomarkers for diagnosis or prognosis of cancer, immune disease, or blood disease. They are also very useful for the determination of drug response and
  • Because an RNA expression signal often varies more than that of its encoding DNA, it is critical to have the right normalizers to correct the RNA expression bias commonly associated with these RNA expression assays.
  • The current RNA normalization methods used in these assays are based on a set of reference genes, such as beta-actin or GAPDH, whose RNA expression is relatively constant and less prone to bias (Cronin et al).
  • The CBN method presented in this invention provides an alternative and more effective way to define the reference genes, a.k.a. normalizers.
  • A genome-scale RNA expression assay is typically performed using a high-throughput technology platform such as microarray, multi-gene panel RT-qPCR, massively parallel sequencing, or digital PCR.
  • The CBN method claimed in this application was used to normalize the RNA expression data from the same CGH array (Pollack et al) as in Example I above.
  • STANFORD.35.mRNA, STANFORD.38.mRNA, and STANFORD.A.mRNA were selected to serve as the group of samples to select normalizers for each of the 6095 probes.
  • The dataset, consisting of measurements of the 6095 probes (designated as probes #1 through #6095) from each of the 37 samples (designated as samples #1 through #37), was preprocessed by the following steps: 1) the data, originally generated as log ratios, were transformed into signal ratios by an exponential function with base 2; and 2) the signal ratios were divided by the sample median on each array to adjust for variation in input RNA quantity.
  • The normalizer selection began with probe #1 designated as the target analyte and the remaining 6094 probes as candidate analytes.
  • A Spearman correlation between probe #1 (the target analyte) and probe #2 (a candidate analyte) was calculated using the 37 signal ratios of probe #1 and the 37 signal ratios of probe #2 derived from the 37 samples.
  • The process was repeated to generate a Spearman correlation for each of the probe pairs #1 versus #3, #1 versus #4, and so on through the last pair, #1 versus #6095.
  • A candidate analyte having a correlation value higher than a threshold of 0.4 was selected as a normalizer.
  • A fixed Spearman threshold of 0.4 will typically produce 50 to 500 normalizers from 6094 candidate analytes for a given target analyte.
  • 117 normalizers were selected for probe #1.
  • Probe #2 was then designated as the target analyte, and probes #1 and #3 through #6095 were designated as candidate analytes.
  • A total of 69 probes were selected as the normalizers for probe #2 using the above CBN method.
  • The above process was then repeated to select normalizers for probes #3 through #6095.
  • The 117 and 69 normalizers for probes #1 and #2 are shown in Table 2 along with their Spearman correlations.
  • The normalizers selected for each probe on the RNA expression array were used to normalize the ratio signals for all RNA expression samples on the array in the same assay.
  • The log ratios assayed on the RNA expression array were first transformed into signal ratios with an exponential function with base 2 and then served as the input for the NQ calculation.
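  • The NQ was generated by dividing the signal ratio of the target by the sum of the signal ratios of the normalizers, as shown in the following formula: $Q_i = s_i / \sum_{j} N_{ij}$, where $Q_i$ is the NQ value for target probe $i$, $s_i$ is its signal ratio, and $N_{ij}$ is the signal ratio of the $j$-th normalizer in the set of normalizers $N_i$ for target probe $i$.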
  • Example III Normalizing the sequencing counts of human chromosomes and categorizing samples
  • Massively parallel shotgun sequencing has proven to be an effective platform for detection of fetal chromosomal aneuploidy, i.e., gain or loss of one or more full copies of a fetal chromosome in the presence of maternal chromosomes (Chiu et al. 2008; Fan et al. 2008). This technique is used to sequence the first 36 bases (termed reads) of millions of DNA fragments to determine their specific chromosomal origin.
  • The count, which is the number of reads mapped to a chromosome, is used to generate a ratio value over the counts of all chromosomes (Palomaki et al. 2011) or over the counts of a specific set of chromosomes, called denominators, selected by minimizing the variation of the chromosome ratios (Sehnert et al. 2011). Such a ratio value is then used to generate a standardized z-score in order to detect chromosomal aneuploidy.
  • The CBN method in this invention provides an alternative and more computationally efficient way to find normalizers and generate NQ values for the z-score calculation.
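  • The chromosomal representation was calculated as $\mathrm{chr}_i = \mathrm{counts}_i / \sum_{j} \mathrm{counts}_j$, where $\mathrm{chr}_i$ denotes the chromosomal representation for chromosome $i$ and $\mathrm{counts}_j$ is the number of aligned reads on chromosome $j$.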
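  • As a small illustration, the chromosomal representation can be computed from a table of read counts. The sketch assumes that the representation of a chromosome is its read count divided by the total read count of the sample across all chromosomes. The function name `chromosomal_representation` and the toy counts are hypothetical.

```python
import numpy as np

def chromosomal_representation(counts):
    """Fraction of a sample's aligned reads that fall on each chromosome.

    counts: array of shape (n_samples, n_chromosomes) of aligned read counts.
    """
    counts = np.asarray(counts, dtype=float)
    # Divide every count by the total count of its own sample (row).
    return counts / counts.sum(axis=1, keepdims=True)

# Toy data: 2 samples x 3 "chromosomes".
toy_counts = np.array([[10, 30, 60],
                       [ 5,  5, 10]])
toy_rep = chromosomal_representation(toy_counts)
```

    Each row of the result sums to 1, so samples sequenced to different depths can be compared before normalizers are selected.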
  • The CBN method was applied by designating chromosome 1 as the target analyte and the other 23 chromosomes as candidate analytes. Using the chromosomal representations of the 14 samples, a Spearman correlation was calculated between each of the 23 chromosomes (the candidate analytes) and chromosome 1 (the target analyte). A threshold of 0.4 was chosen, and candidate analytes with Spearman correlation values greater than 0.4 were selected as normalizers for chromosome 1.
  • The chromosomal representation of chromosome Y does not correlate highly with those of the other chromosomes.
  • Table 3 lists the normalizers for all 24 chromosomes generated using the method of this invention.
  • Once a set of normalizers was defined for each of the 24 chromosomes, the NQ values were calculated for a sequenced sample using the formula $Q_i = \mathrm{chr}_i / \sum_{j} N_{ij}$, where $Q_i$ denotes the NQ value for target chromosome $i$, $\mathrm{chr}_i$ denotes the chromosomal representation for target chromosome $i$, and $N_{ij}$ denotes the chromosomal representation of the $j$-th normalizer in the set of normalizers $N_i$ for target chromosome $i$.
  • Sample SRR609105 showed NQ values outside the normal range for 15 of the 24 chromosomes (Table 4), while the NQ values of all chromosomes in the other 13 samples appeared to be normal. This result indicates that sample SRR609105 differs from the others and should be examined further. Indeed, sample SRR609105 is the only blood plasma sample from a pregnant woman.
  • Table 3. Normalizers for all 24 chromosomes generated in Example III
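  • One way to flag NQ values outside a normal range is a z-score against reference samples, in line with the z-score approach mentioned earlier in this example. The text does not state how the normal range was defined, so the cutoff of 3 standard deviations, the function name `flag_out_of_range`, and the toy values below are all assumptions made for illustration.

```python
import numpy as np

def flag_out_of_range(nq_sample, nq_reference, z_cutoff=3.0):
    """Return (z-scores, boolean flags) for one sample against reference samples.

    nq_sample: array of shape (n_targets,).
    nq_reference: array of shape (n_reference_samples, n_targets).
    z_cutoff: assumed definition of the normal range, |z| <= z_cutoff.
    """
    ref = np.asarray(nq_reference, dtype=float)
    mean = ref.mean(axis=0)
    std = ref.std(axis=0, ddof=1)  # sample standard deviation
    z = (np.asarray(nq_sample, dtype=float) - mean) / std
    return z, np.abs(z) > z_cutoff

# Toy reference: 4 samples x 2 targets; the test sample is elevated on target 2.
toy_reference = np.array([[1.0, 2.0],
                          [1.1, 2.1],
                          [0.9, 1.9],
                          [1.0, 2.0]])
toy_z, toy_flags = flag_out_of_range([1.05, 3.0], toy_reference)
```

    A sample with many flagged targets, such as the 15 of 24 chromosomes reported for SRR609105, would be a candidate for closer examination.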
  • Example IV Normalizing the sequencing counts of human sub-chromosomal bins for detection of DNA copy number variation (CNV) or chromosomal aneuploidy
  • The CBN normalization was performed on sequencing counts for a collection of fragmented chromosomes (“bins”) in order to detect amplification or deletion of a bin or a part of a bin in a sample.
  • The bins may be of various or equal lengths, and may be overlapping or non-overlapping. In some instances, non-overlapping bins from human samples were used, with lengths ranging from 100 to 50000000 nucleotides. One bin was designated as the target analyte and all other bins were designated as candidate analytes. Sequencing counts for bins were obtained through a DNA sequencing device. The CBN method was then used to select normalizers for the bin. The process was iterated until normalizers were selected for all bins.
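  • Counting reads per bin can be sketched as below for the simplest case of equal-length, non-overlapping bins. The input format (chromosome name plus 0-based start position of each aligned read), the default bin size of 1000000 nucleotides, and the function name `count_reads_per_bin` are assumptions of this sketch.

```python
from collections import Counter

def count_reads_per_bin(read_positions, bin_size=1_000_000):
    """Count aligned reads in fixed-size, non-overlapping bins.

    read_positions: iterable of (chromosome, 0-based start) pairs.
    Returns a Counter keyed by (chromosome, bin index).
    """
    counts = Counter()
    for chrom, start in read_positions:
        # Integer division assigns each read to exactly one bin.
        counts[(chrom, start // bin_size)] += 1
    return counts

toy_reads = [("chr21", 10), ("chr21", 999_999), ("chr21", 1_000_000), ("chr1", 5)]
toy_bins = count_reads_per_bin(toy_reads)
```

    Overlapping or variable-length bins, which the text also allows, would need an interval lookup instead of integer division.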
  • Chromosomal bin 1 was designated as a target analyte, and all other 2916 bins as candidate analytes. Using the chromosomal binning representation of the 14 samples, a
  • $N_{ij}$ denotes the chromosomal binning representation of the $j$-th normalizer in the set of normalizers $N_i$ for target bin $i$.
  • The NQ values for the bins can then be used for downstream sub-chromosomal analysis such as detection of CNVs or detection of chromosomal aneuploidy.
  • The NQ values can also be used as the input data for the covariation calculation to iteratively improve the normalizer selection.
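  • The iterative use of NQ values can be sketched as a loop that alternates selection and NQ calculation. The text does not spell out the details, so this sketch assumes that correlations in later rounds are computed on the NQ values of the previous round while the NQ values themselves are always recomputed from the original representations. All function names are hypothetical, and targets without normalizers simply keep their input value.

```python
import numpy as np
from scipy.stats import rankdata

def select(data, threshold):
    """Indices of candidates whose Spearman correlation with each target exceeds threshold."""
    rho = np.corrcoef(rankdata(data, axis=0), rowvar=False)
    np.fill_diagonal(rho, -np.inf)  # exclude the target itself
    return [np.flatnonzero(row > threshold) for row in rho]

def compute_nq(signals, normalizers):
    """NQ = target / sum of its normalizers; targets without normalizers are unchanged."""
    nq = signals.copy()
    for target, idx in enumerate(normalizers):
        if idx.size:
            nq[:, target] = signals[:, target] / signals[:, idx].sum(axis=1)
    return nq

def iterate_normalizers(signals, threshold=0.4, rounds=2):
    """Select normalizers, compute NQ, then re-select using NQ as the correlation input."""
    signals = np.asarray(signals, dtype=float)
    basis = signals
    for _ in range(rounds):
        normalizers = select(basis, threshold)
        basis = compute_nq(signals, normalizers)
    return normalizers, basis

rng = np.random.default_rng(0)
toy = rng.uniform(1.0, 2.0, size=(14, 6))  # 14 samples x 6 bins of synthetic data
final_sets, final_nq = iterate_normalizers(toy)
```

    With `rounds=1` the loop reduces to the single-pass selection described in the earlier examples.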
  • Figure 5 is a study analysis flowchart illustrating a process of using bin NQ values for detection of fetal chromosome 21 trisomy in cell-free DNA (“cfDNA”) samples from pregnant women’s blood plasma in real commercial non-invasive prenatal tests.
  • Cell-free DNA samples from pregnant women’s blood plasma have been shown to contain both maternal DNA and fetal DNA (see review by Chiu and Lo 2011). Fetal chromosome 21 trisomy causes an abnormally high chromosome 21 measurement in cfDNA samples, which can be detected with the
  • The data analyses include selection of initial normalizers, calculation of NQ values, re-selection of normalizers, and re-calculation of NQ values for each sub-chromosomal bin of 1000000 nucleotides in size.
  • The bin NQ values thus obtained for chromosome 21 can be directly compared to the mean bin NQ values of chromosome 21 obtained from reference samples through a t-test; a p-value less than 0.01 from the test predicts trisomy of fetal chromosome 21.
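  • The comparison can be sketched with SciPy. The text does not say which t-test was used, so this sketch assumes a paired, two-sided t-test in which each chromosome 21 bin of the test sample is paired with the mean NQ value of the same bin across reference samples. The function name `predicts_trisomy21` and the toy numbers are hypothetical.

```python
import numpy as np
from scipy.stats import ttest_rel

def predicts_trisomy21(sample_bin_nq, reference_mean_bin_nq, alpha=0.01):
    """Paired t-test of a sample's chr21 bin NQ values against reference means.

    Returns (prediction, p-value); prediction is True when p < alpha.
    """
    result = ttest_rel(sample_bin_nq, reference_mean_bin_nq)
    return bool(result.pvalue < alpha), float(result.pvalue)

# Toy reference means for 8 chromosome 21 bins.
reference = np.array([1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 0.97])
# A sample whose bins are consistently elevated, and one that scatters around the reference.
elevated = reference + np.array([0.05, 0.06, 0.04, 0.05, 0.055, 0.045, 0.05, 0.06])
typical = reference + np.array([0.01, -0.01, 0.005, -0.005, 0.0, 0.01, -0.01, 0.0])

elevated_call, elevated_p = predicts_trisomy21(elevated, reference)
typical_call, typical_p = predicts_trisomy21(typical, reference)
```

    A two-sided test also flags bins that are consistently low. If only elevated values are of interest, a one-sided alternative could be used instead.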
  • The CBN normalization was performed on sequencing counts for a collection of fragmented chromosomes (“bins”) comprising mtDNA in order to quantify the copy number of each mtDNA-derived bin in a sample.
  • The bins may be of various or equal lengths, and may be overlapping or non-overlapping. In some instances, non-overlapping bins from human samples were used, with lengths ranging from 100 to 50000000 nucleotides.
  • One bin derived from mtDNA was designated as the target analyte and bins derived from autosomes were designated as candidate analytes. Sequencing counts for bins were obtained through a DNA sequencing device. The CBN method was then used to select normalizers for the bin.
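  • The chromosomal binning representation was calculated as $cb_i = \mathrm{counts}_i / \sum_{j} \mathrm{counts}_j$, where $cb_i$ denotes the chromosomal binning representation for bin $i$ and $\mathrm{counts}_j$ is the number of aligned reads on bin $j$.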
  • A bin comprising mtDNA (“mtDNA bin”) was designated as the target analyte, and all other 2734 bins as candidate analytes.
  • A Spearman correlation was calculated between each of the 2734 bins (the candidate analytes) and the mtDNA bin (the target analyte).
  • Candidate analytes were then ranked from high to low based on their Spearman correlation values.
  • The correlation value ranked 200th was selected as the threshold.
  • Candidate analytes whose Spearman correlation values were higher than or equal to the threshold were selected as normalizers for mtDNA. This process thus selected the 200 most highly correlated candidates as normalizers.
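  • The rank-based threshold can be sketched as follows. The sketch assumes the target and candidate representations are NumPy arrays with one row per sample. The function name `top_k_normalizers` is hypothetical, and the toy call uses k = 2 instead of 200 to keep the data small.

```python
import numpy as np
from scipy.stats import spearmanr

def top_k_normalizers(target, candidates, k=200):
    """Select candidates whose Spearman correlation with the target is in the top k.

    target: array of shape (n_samples,).
    candidates: array of shape (n_samples, n_candidates).
    Returns (selected candidate indices, threshold correlation value).
    """
    rho = np.empty(candidates.shape[1])
    for j in range(candidates.shape[1]):
        rho[j], _ = spearmanr(target, candidates[:, j])
    rho[np.isnan(rho)] = -np.inf           # constant candidates can never be selected
    k = min(k, rho.size)
    threshold = np.sort(rho)[::-1][k - 1]  # correlation value ranked k-th
    return np.flatnonzero(rho >= threshold), threshold

# Toy data: 6 samples, 3 candidates.
toy_target = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
toy_candidates = np.column_stack([
    2.0 * toy_target,                        # perfectly correlated
    toy_target[::-1],                        # perfectly anti-correlated
    np.array([1.0, 3.0, 2.0, 4.0, 6.0, 5.0]),  # mostly correlated
])
selected, threshold = top_k_normalizers(toy_target, toy_candidates, k=2)
```

    Unlike the fixed threshold of 0.4 used in the earlier examples, this rule fixes the number of normalizers. When several candidates tie at the threshold value, slightly more than k candidates are returned.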
  • The NQ value for the mtDNA bin was calculated using the following equation: $Q_i = cb_i / \sum_{j} N_{ij}$, where $Q_i$ denotes the NQ value for the target mtDNA bin, $cb_i$ denotes the chromosomal binning representation for the target mtDNA bin, and $N_{ij}$ denotes the chromosomal binning representation of the $j$-th normalizer in the set of normalizers for the target mtDNA bin.
  • The NQ values for the target mtDNA bin (the mtDNA NQ values) for the six samples were generated as normalized measurements of mtDNA quantity with high accuracy and robustness (Table 5).
  • The mtDNA NQ values can then be used for further analyses, such as computing the mtDNA copy number through an established reference function relating NQ to copy number, and establishing mtDNA quantity ranges for various clinical or physiological conditions of human tissues or cells.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Genetics & Genomics (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Software Systems (AREA)
  • Bioethics (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Algebra (AREA)
  • Epidemiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to covariation-based methods and systems for identifying normalizers for normalizing measurements of one or more target analytes obtained from any quantitative assay. These methods and systems can also be used to classify biological samples as normal or abnormal based on the normalized measurements of the one or more target analytes.
PCT/US2016/061001 2015-11-09 2016-11-08 Normalization method for sample analysis Ceased WO2017083310A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201562252673P 2015-11-09 2015-11-09
US201562252679P 2015-11-09 2015-11-09
US62/252,679 2015-11-09
US62/252,673 2015-11-09

Publications (1)

Publication Number Publication Date
WO2017083310A1 true WO2017083310A1 (fr) 2017-05-18

Family

ID=58695214

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/061001 Ceased WO2017083310A1 (fr) 2015-11-09 2016-11-08 Normalization method for sample analysis

Country Status (1)

Country Link
WO (1) WO2017083310A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114585922A (zh) * 2019-07-31 2022-06-03 细胞逻辑股份有限公司 Method, apparatus and computer-readable medium for adaptive normalization of analyte levels
US20220221449A1 (en) * 2021-01-11 2022-07-14 Meso Scale Technologies, Llc. Assay system calibration systems and methods
CN115132271A (zh) * 2022-09-01 2022-09-30 北京中仪康卫医疗器械有限公司 CNV detection method based on within-batch correction
WO2023216517A1 (fr) * 2022-05-12 2023-11-16 深圳市陆为生物技术有限公司 Method and device for calculating the IgA immune activity index of a sample
CN119044394A (zh) * 2024-11-01 2024-11-29 西尼尔(山东)新材料科技有限公司 Intelligent monitoring method and system for the flame-retardant performance of a flame retardant

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050090021A1 (en) * 2000-10-06 2005-04-28 Walt David R. Self-encoding sensor with microspheres
US20090043171A1 (en) * 2007-07-16 2009-02-12 Peter Rule Systems And Methods For Determining Physiological Parameters Using Measured Analyte Values
US20100204055A1 (en) * 2008-12-05 2010-08-12 Bonner-Ferraby Phoebe W Autoantibody detection systems and methods
US20110137851A1 (en) * 2009-10-15 2011-06-09 Crescendo Bioscience Biomarkers and methods for measuring and monitoring inflammatory disease activity
US20110306856A1 (en) * 2010-06-09 2011-12-15 Optiscan Biomedical Corporation Systems and methods for measuring multiple analytes in a sample
US20120208282A1 (en) * 2009-07-02 2012-08-16 Biocrates Life Sciences Ag Method For Normalization in Metabolomics Analysis Methods with Endogenous Reference Metabolites.
US20140242588A1 (en) * 2011-10-06 2014-08-28 Sequenom, Inc Methods and processes for non-invasive assessment of genetic variations
US20140274795A1 (en) * 2011-10-18 2014-09-18 Twistnostics Llc Detection units and methods for detecting a target analyte
US20140335630A1 (en) * 2012-01-31 2014-11-13 The University Of Toledo Methods and Devices for Detection and Measurement of Analytes
US20150227681A1 (en) * 2012-07-26 2015-08-13 The Regents Of The University Of California Screening, Diagnosis and Prognosis of Autism and Other Developmental Disorders

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SPELLMAN ET AL.: "Development and evaluation of a multiplexed mass spectrometry based assay for measuring candidate peptide biomarkers in Alzheimer's Disease Neuroimaging Initiative (ADNI) CSF.", PROTEOMICS-CLINICAL APPLICATIONS., vol. 9, no. 7-8, 24 April 2015 (2015-04-24), pages 715 - 731, XP055381013, Retrieved from the Internet <URL:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4739636/pdf/nihms703182.pdf> *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114585922A (zh) * 2019-07-31 2022-06-03 细胞逻辑股份有限公司 Method, apparatus and computer-readable medium for adaptive normalization of analyte levels
JP2022546206A (ja) * 2019-07-31 2022-11-04 SomaLogic Operating Co., Inc. Method, apparatus, and computer-readable medium for adaptive normalization of analyte levels
EP4004559A4 (fr) * 2019-07-31 2023-10-04 SomaLogic Operating Co., Inc. Method, apparatus and computer-readable medium for adaptive normalization of analyte levels
JP7748360B2 (ja) 2019-07-31 2025-10-02 SomaLogic Operating Co., Inc. Method, apparatus, and computer-readable medium for adaptive normalization of analyte levels
US20220221449A1 (en) * 2021-01-11 2022-07-14 Meso Scale Technologies, Llc. Assay system calibration systems and methods
TWI891961B (zh) * 2021-01-11 2025-08-01 美商梅梭刻度技術公司 Assay system calibration systems and methods
WO2023216517A1 (fr) * 2022-05-12 2023-11-16 深圳市陆为生物技术有限公司 Method and device for calculating the IgA immune activity index of a sample
CN115132271A (zh) * 2022-09-01 2022-09-30 北京中仪康卫医疗器械有限公司 CNV detection method based on within-batch correction
CN119044394A (zh) * 2024-11-01 2024-11-29 西尼尔(山东)新材料科技有限公司 Intelligent monitoring method and system for the flame-retardant performance of a flame retardant

Similar Documents

Publication Publication Date Title
JP6854272B2 (ja) Methods and processes for non-invasive assessment of genetic variations
US20200263257A1 (en) Method of predicting breast cancer prognosis
EP3967775B1 (fr) Analysis of fragmentation patterns of cell-free DNA
AU2020221845A1 (en) An integrated machine-learning framework to estimate homologous recombination deficiency
JP7301798B2 (ja) Gene expression profile algorithm for calculating a recurrence score for a patient with kidney cancer
EP3359696B1 (fr) Diagnostic assay for urine monitoring of bladder cancer
EP3964589A1 (fr) Assessment of colorectal cancer molecular subtype and uses thereof
MX2012005822A (es) Methods for predicting the clinical outcome of cancer.
WO2017083310A1 (fr) Normalization method for sample analysis
EP2788535A1 (fr) Predicting prognosis in classic Hodgkin lymphoma
US20200105367A1 (en) Methods of Incorporation of Transcript Chromosomal Locus Information for Identification of Biomarkers of Disease Recurrence Risk
WO2022093910A1 (fr) Prognostic gene signature and method for diffuse large B-cell lymphoma prognosis and treatment
WO2014130617A1 (fr) Method of predicting breast cancer prognosis
WO2014130444A1 (fr) Method of predicting breast cancer prognosis
KR20150043790A (ko) Method for extracting biomarkers for diagnosing biliary tract cancer, computing device therefor, biomarkers for diagnosing biliary tract cancer, and biliary tract cancer diagnostic device comprising the same
US20250003001A1 (en) Compositions and methods for identifying transplant rejection or the risk thereof
HK40052881A (en) Gene expression profile algorithm for calculating a recurrence score for a patient with kidney cancer
HK40070813B (en) Analysis of fragmentation patterns of cell-free dna
NZ752676B2 (en) Gene expression profile algorithm for calculating a recurrence score for a patient with kidney cancer
HK1223132B (en) Gene expression profile algorithm for calculating a recurrence score for a patient with kidney cancer
HK1247645B (en) Analysis of fragmentation patterns of cell-free dna

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16864867

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16864867

Country of ref document: EP

Kind code of ref document: A1