US20210198744A1 - Method for detecting cystic fibrosis - Google Patents

Method for detecting cystic fibrosis

Info

Publication number
US20210198744A1
Authority
US
United States
Prior art keywords
exon
cftr
sequence
sequencing
nucleic acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/201,469
Inventor
Steven Patrick Rivera
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quest Diagnostics Investments LLC
Original Assignee
Quest Diagnostics Investments LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Quest Diagnostics Investments LLC
Priority to US17/201,469
Publication of US20210198744A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6827 Hybridisation assays for detection of mutation or polymorphism
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6853 Nucleic acid amplification reactions using modified primers or templates
    • C12Q1/6855 Ligating adaptors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids

Definitions

  • the present invention relates to methods for simultaneously determining the presence or absence of mutations, deletions, duplications and single nucleotide polymorphisms in a cystic fibrosis transmembrane regulator (CFTR) nucleic acid.
  • Nucleotide sequences (such as for primers) used to amplify regions of a CFTR nucleic acid for high throughput, massively parallel sequencing and methods of determining an individual's cystic fibrosis status are also disclosed.
  • Cystic fibrosis is the most common severe autosomal recessive genetic disorder in the Caucasian population. It affects approximately 1 in 2,500 live births in North America (Boat et al, The Metabolic Basis of Inherited Disease, 6th ed, pp 2649-2680, McGraw Hill, NY (1989)). Approximately 1 in 25 persons are carriers of the disease. The major symptoms of cystic fibrosis include chronic pulmonary disease, pancreatic exocrine insufficiency, and elevated sweat electrolyte levels. The symptoms are consistent with cystic fibrosis being an exocrine disorder. Although recent advances have been made in the analysis of ion transport across the apical membrane of the epithelium of CF patient cells, it is not clear that the abnormal regulation of chloride channels represents the primary defect in the disease.
  • CFTR: cystic fibrosis transmembrane regulator
  • mutations exist in both the coding regions (e.g., ΔF508, a mutation found on about 70% of CF alleles, represents a deletion of a phenylalanine at residue 508) and the non-coding regions (e.g., the 5T, 7T, and 9T mutations correspond to a sequence of 5, 7, or 9 thymidine bases located at the splice branch/acceptor site of intron 8) of the CFTR gene.
  • Comparison of the CFTR genomic and cDNA sequences confirms the presence of 27 exons. The exons are numbered 1-27 as shown in NCBI Reference Sequence accession no. NM_000492.3. Each intron is flanked by the consensus GT-AG splice-site sequence as previously reported (Zielenski, et al., (1991) Genomics 10, 214-228).
  • improved methods are needed to efficiently detect the variety of CFTR gene defects which underlie CF and to simultaneously capture both dosage data (e.g., gene copy number) and sequence data. Moreover, improved methods are needed for detecting rare mutations in the CFTR gene. Ideally, methods that can detect multiple classes of CFTR mutations such as those involving small base changes (e.g., missense mutations, nonsense mutations, small insertions or deletions and/or splice-site mutations) and those involving larger deletions and/or duplications in a single assay are desirable.
  • a method for determining the nucleotide sequence of a sample CFTR nucleic acid comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of the sample CFTR nucleic acid and (b) determining the nucleotide sequences of the target segments by sequencing the amplicons in the amplicon library using high throughput massively parallel sequencing.
  • Also provided is a method for determining the presence or absence of a CFTR nucleotide sequence variant in a sample CFTR nucleic acid comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of the sample CFTR nucleic acid; (b) determining the nucleotide sequences of the target segments by sequencing the amplicons in the amplicon library using high throughput massively parallel sequencing; (c) comparing each target segment nucleotide sequence determined in step (b) with the corresponding region of a reference CFTR nucleotide sequence; and (d) determining that the sample CFTR nucleic acid has a variant sequence if or when one or more of the target segment sequences is different from the corresponding region of the reference CFTR nucleotide sequence.
  • a sequence variant is a CFTR sequence that is different from a corresponding region of a reference CFTR nucleic acid sequence.
  • Such differences in the CFTR sequence can include point mutations, insertions, deletions and/or duplications, or copy number variations (CNVs).
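The comparison of a target segment with the corresponding reference region can be sketched as follows. This is a minimal illustration, not the method of the application: it assumes the two sequences have already been aligned to equal length with `-` marking gaps, and the function names and example strings are hypothetical.

```python
from typing import List, Tuple


def find_differences(sample: str, reference: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumption: both strings are pre-aligned and of equal length, with '-'
    marking a gap. Real pipelines perform the alignment themselves.
    """
    if len(sample) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(
            zip(reference.upper(), sample.upper()), start=1
        )
        if ref_base != sample_base
    ]


def has_variant(sample: str, reference: str) -> bool:
    """A segment is called variant if it differs from the reference anywhere."""
    return bool(find_differences(sample, reference))


# Illustrative strings only; they are not taken from the application.
reference_segment = "ATCATCTTTGGTGTT"
sample_segment = "ATCATC---GGTGTT"
print(find_differences(sample_segment, reference_segment))
```

A three-base gap such as the one in the example would be reported as three consecutive differences; grouping adjacent differences into a single deletion call is left out for brevity.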
  • CNVs are gains and losses of genomic sequence >50 bp between two individuals of a species (Mills et al. 2011, Mapping copy number variation by population-scale genome sequencing, Nature 470: 59-65).
  • Such variations can be determined when using next-generation sequencing by using a read depth (i.e., mapping density) approach if amplification is halted during library generation during the exponential phase of PCR.
  • a normal dosage in relation to all other amplicons for a normal specimen will be one, ½ for a homozygous deletion and 1½ for a homozygous duplication.
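One way to express dosage "in relation to all other amplicons" is to divide each amplicon's read count by the count seen in a normal specimen and then rescale by the median of those ratios. The sketch below makes that assumption; the median normalization, the tolerance of 0.25 and all names and counts are illustrative choices, not values from the application.

```python
from statistics import median
from typing import Dict


def relative_dosage(sample_reads: Dict[str, int],
                    normal_reads: Dict[str, int]) -> Dict[str, float]:
    """Per-amplicon dosage of a sample relative to a normal specimen.

    Each sample/normal ratio is divided by the median ratio across all
    amplicons, so differences in total sequencing yield cancel out.
    """
    ratios = {}
    for amplicon, normal_count in normal_reads.items():
        if normal_count <= 0:
            raise ValueError(f"normal specimen has no reads for {amplicon}")
        ratios[amplicon] = sample_reads.get(amplicon, 0) / normal_count
    scale = median(ratios.values())
    if scale == 0:
        raise ValueError("sample has too few reads to normalize")
    return {amplicon: ratio / scale for amplicon, ratio in ratios.items()}


def describe(dosage: float, tolerance: float = 0.25) -> str:
    """Label a dosage as reduced, normal or increased (tolerance is assumed)."""
    if dosage < 1 - tolerance:
        return "reduced"
    if dosage > 1 + tolerance:
        return "increased"
    return "normal"


# Hypothetical read counts: the sample was sequenced twice as deeply as the
# normal specimen, but exon_3 yields only half the expected share of reads.
normal = {"exon_1": 1000, "exon_2": 1200, "exon_3": 800, "exon_4": 1000}
sample = {"exon_1": 2000, "exon_2": 2400, "exon_3": 800, "exon_4": 2000}
for amplicon, dosage in relative_dosage(sample, normal).items():
    print(amplicon, round(dosage, 2), describe(dosage))
```

The median is used here because a single deleted or duplicated amplicon does not shift it, whereas normalizing by total reads would.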
  • the reference CFTR nucleic acid sequence comprises a wild type CFTR nucleic acid sequence. In some embodiments the sequence variant comprises a CFTR nucleotide sequence mutation associated with cystic fibrosis.
  • Another aspect of the present invention provides a method for determining the presence or absence of base changes, gene deletions and gene duplications in a sample CFTR nucleic acid as compared to a reference CFTR nucleotide sequence, said method comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of the sample CFTR nucleic acid, (b) determining the nucleotide sequences of the target segments by sequencing the amplicons using high throughput massively parallel sequencing, (c) comparing each target segment sequence determined in step (b) with the corresponding region of the reference CFTR nucleotide sequence; and (d) determining that one or more base changes, gene deletions and/or gene duplications is present in the sample CFTR nucleic acid if or when one or more of the target segment sequences is different from the corresponding region of the reference CFTR nucleotide sequence.
  • the reference CFTR sequence consists of or, alternatively, comprises a wild type CFTR nucleic
  • Another aspect of the present invention provides a method for diagnosing a genetic basis for cystic fibrosis in an individual comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of a CFTR nucleic acid from said individual, (b) determining the nucleotide sequences of the target segments by sequencing the amplicons using high throughput massively parallel sequencing, and (c) determining that the individual has a genetic basis for cystic fibrosis if or when the nucleotide sequence of one or more of the target segments contains a mutation associated with cystic fibrosis. Genetic mutations associated with cystic fibrosis are well known in the art and include both rare and common mutations.
  • high throughput massively parallel sequencing may be performed using a read depth approach.
  • a sample CFTR nucleic acid may be any form of nucleic acid including, for example, genomic DNA, RNA (such as mRNA) or cDNA.
  • CFTR nucleic acids from more than one sample are sequenced. In some cases all samples are sequenced simultaneously in parallel. In a preferred embodiment, CFTR nucleic acids from at least 5, 10, 20, 30 or 35 up to 40, 45, 48 or 50 different samples are amplified and sequenced using methods of the present invention. All amplicons derived from a single sample may comprise an index sequence that indicates the source from which the amplicon is generated, the index for each sample being different from the indexes from all other samples. As such, the use of indexes permits multiple samples to be pooled per sequencing run and the sample source subsequently ascertained based on the index sequence.
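Assigning pooled reads back to their source sample by index can be sketched as below. It is a simplified illustration that assumes the index occupies the first bases of each read and that all indexes have the same length; on real instruments the index is often read separately. The index sequences and sample names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def demultiplex(reads: Iterable[str],
                index_to_sample: Dict[str, str]) -> Dict[str, List[str]]:
    """Group reads by sample using a leading index sequence.

    Assumption: every read starts with its sample index, and all indexes
    share one length. Reads with an unknown index go to "unassigned".
    """
    lengths = {len(index) for index in index_to_sample}
    if len(lengths) != 1:
        raise ValueError("all indexes must have the same length")
    index_length = lengths.pop()

    by_sample: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        index, insert = read[:index_length], read[index_length:]
        by_sample[index_to_sample.get(index, "unassigned")].append(insert)
    return dict(by_sample)


# Hypothetical four-base indexes for two pooled samples.
indexes = {"ACGT": "sample_1", "TTAG": "sample_2"}
pooled = ["ACGTAAAA", "TTAGCCCC", "ACGTGGGG", "GGGGTTTT"]
print(demultiplex(pooled, indexes))
```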
  • the Access Array™ System (Fluidigm Corp., San Francisco, Calif.) is used to generate a bar coded (indexed) amplicon library by simultaneously amplifying the CFTR nucleic acids from the samples in one set up.
  • the library that is generated then can be used on a sequencing platform such as, for example, Roche/454™ GS FLX™ sequencing system (Roche, Germany), Ion Torrent™ Ion PGM™ Sequencer (Life Technologies, Carlsbad, Calif.) or MiSeq® Personal Sequencer (Illumina, Inc., San Diego, Calif.).
  • sample CFTR target segments are amplified using primers that contain an oligonucleotide sequencing adapter to produce adapter-tagged amplicons.
  • the employed primers do not contain adapter sequences and the amplicons produced are subsequently (i.e. after amplification) ligated to an oligonucleotide sequencing adapter on one or both ends of the amplicons.
  • all sense amplicons contain the same sequencing adapter and all antisense amplicons contain a sequencing adapter having a different sequence from the sense amplicon sequencing adapter.
  • only a single stranded sample CFTR nucleic acid is amplified and/or sequenced.
  • Methods of the present invention may be used to sequence all or part of a CFTR gene or cDNA. In some embodiments, from at least one, two, five, 10 or 20 up to 25 or 28 exons are evaluated. In other embodiments all or a portion of the CFTR promoter region is also evaluated. Some or all CFTR introns may also be evaluated.
  • the CFTR target segments when combined, represent the CFTR coding region and all intron/exon junctions, plus from about 100, 500, 750, 900 or 1000 up to about 1000 nucleotides of the CFTR promoter immediately upstream (in the 5 prime direction) of the first exon plus from about 50, 100, 150 or 200 up to about 200, 250, 300 or 400 nucleotides immediately downstream (in the 3 prime direction) of the CFTR gene.
  • one or more sample CFTR nucleic acids are sequenced using at least one primer that comprises a sequence shown in Table 1 or Table 2. In a preferred embodiment, all of the primers shown in Tables 1 or 2 are used.
  • Oligonucleotides and combinations of oligonucleotides that are useful as primers in the methods of the present invention are also provided. These oligonucleotides are provided as substantially purified material. Kits comprising oligonucleotides for performing amplifications and sequencing as described herein also are provided.
  • Provided by the present invention are methods for simultaneously determining the presence or absence of CFTR gene mutations involving a small number of nucleotides in addition to larger deletions and duplications in a CFTR nucleotide sequence of a sample CFTR nucleic acid in a single assay.
  • an investigator can determine an individual's cystic fibrosis status based on the presence or absence of CFTR mutations associated with cystic fibrosis in the sample obtained from the individual.
  • the methods of the present invention comprise generating an adapter-tagged amplicon library by amplifying multiple target segments of a sample CFTR nucleic acid of one or more samples and determining the target segment sequences by sequencing the amplicons using high throughput massively parallel sequencing (i.e., next generation sequencing).
  • both gene sequence and gene dosage may be determined in a nucleic acid sample.
  • Gene dosage is also referred to as copy number variation.
  • the one or more sample CFTR sequences are compared with a reference CFTR sequence to determine if differences (e.g., difference in sequence or copy number) are present.
  • a reference CFTR sequence may be a CFTR genomic or cDNA sequence, or a portion thereof, from a normal (non-cystic fibrosis afflicted and non-cystic fibrosis carrier) individual.
  • a reference CFTR sequence may comprise a wild type CFTR nucleic acid sequence.
  • Various methods known in the art (e.g., a read depth approach)
  • nucleic acid amplification methods such as PCR, isothermal methods, rolling circle methods, etc., are well known to the skilled artisan. See, e.g., Saiki, “Amplification of Genomic DNA” in PCR Protocols, Innis et al., Eds., Academic Press, San Diego, Calif. 1990, pp 13-20; Wharam et al., Nucleic Acids Res. 2001 Jun.
  • CFTR promoter region refers to a segment of the CFTR gene representing at least the first 250 nucleotides upstream from the translation start site.
  • the promoter region may include the first 250 nt, first 300 nt, first 350 nt, first 400 nt, first 450 nt, first 500 nt, first 1 kb, first 5 kb, first 10 kb, first 15 kb, first 20 kb, first 21 kb or first 22 kb of sequence directly upstream of the start codon.
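Selecting "the first N nucleotides directly upstream of the start codon" amounts to a slice that ends where the start codon begins. The sketch below assumes a 0-based index for the first base of the start codon on a genomic string; the function name and the toy sequence are hypothetical.

```python
def upstream_region(genomic: str, start_codon_index: int, length: int = 250) -> str:
    """Return up to `length` bases immediately 5' of the start codon.

    `start_codon_index` is the 0-based position of the first base of the
    start codon. Fewer bases are returned if the sequence is too short.
    """
    if not 0 <= start_codon_index <= len(genomic):
        raise ValueError("start codon index is outside the sequence")
    begin = max(0, start_codon_index - length)
    return genomic[begin:start_codon_index]


# Toy sequence: ten filler bases, four upstream bases, then a start codon.
toy = "N" * 10 + "ACGT" + "ATGAAA"
print(upstream_region(toy, start_codon_index=14, length=4))
```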
  • a deletion of the promoter region as defined herein may be accompanied by deletion of downstream exons/introns but not all of the CFTR gene.
  • the coordinate deletion involving the CFTR promoter region and downstream CFTR gene sequence involves fewer than about 10 exons, and more typically involves fewer than 5 exons.
  • Deletions or duplications of the CFTR promoter region may be detected using primers that flank the deleted or duplicated sequence.
  • a promoter deletion or duplication involves a segment of four or more nucleotides, more preferably 5 or more, more preferably 8 or more, and even more preferably 12 or more nucleotides.
  • a “CFTR nucleic acid” as used herein refers to a nucleic acid that contains a sequence of a CFTR gene, mRNA, cDNA or a portion of such a CFTR sequence.
  • a CFTR nucleic acid may contain the CFTR coding region.
  • a CFTR nucleic acid may be genomic DNA, cDNA, single stranded DNA or mRNA. In some embodiments, only a single strand of a sample CFTR nucleic acid is amplified and/or sequenced. In some embodiments both strands of double stranded CFTR DNA are amplified and sequenced.
  • a CFTR nucleic acid may be present in a biological sample or it may be isolated from a biological sample.
  • complementarity refers to the base-pairing rules.
  • nucleic acid sequence refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5′ end of one sequence is paired with the 3′ end of the other, is in “antiparallel association.” For example, the sequence “5′-A-G-T-3′” is complementary to the sequence “3′-T-C-A-5′.”
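The antiparallel pairing rule can be illustrated with a short sketch. It handles only the four standard DNA bases, and the function names are hypothetical. Note that the complement of 5′-AGT-3′ is 3′-TCA-5′, which reads ACT when written 5′ to 3′.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if two 5'-to-3' strands pair perfectly in antiparallel association."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


print(complement("AGT"))          # the paired strand read 3' to 5'
print(reverse_complement("AGT"))  # the same strand read 5' to 3'
```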
  • Certain bases not commonly found in natural nucleic acids may be included in the nucleic acids described herein; these include, for example, inosine, 7-deazaguanine, Locked Nucleic Acids (LNA), and Peptide Nucleic Acids (PNA).
  • Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length of the oligonucleotide, base composition and sequence of the oligonucleotide, ionic strength and incidence of mismatched base pairs.
  • a complement sequence can also be a sequence of RNA complementary to the DNA sequence or its complement sequence, and can also be a cDNA.
  • deletion encompasses a mutation that removes one or more nucleotides from nucleic acid.
  • duplication refers to a mutation that inserts one or more nucleotides of identical sequence directly next to this sequence in the nucleic acid. In a preferred embodiment, a deletion or duplication involves a segment of four or more nucleotides.
  • dosage refers to the number of copies of a gene, or portions of a gene, present in a sample.
  • primer means a sequence of nucleotides, preferably DNA, that hybridizes to a substantially complementary target sequence and is recognized by DNA polymerase to begin DNA replication.
  • primer includes all forms of primers that may be synthesized including peptide nucleic acid primers, locked nucleic acid primers, phosphorothioate modified primers, labeled primers, and the like.
  • substantially complementary means that two sequences hybridize under stringent hybridization conditions. The skilled artisan will understand that substantially complementary sequences need not hybridize along their entire length. In particular, substantially complementary sequences may comprise a contiguous sequence of bases that do not hybridize to a target sequence, positioned 3′ or 5′ to a contiguous sequence of bases that hybridize under stringent hybridization conditions to a target sequence.
  • primers hybridize to a target nucleic acid adjoining a region of interest sought to be amplified on the target.
  • preferred primers are pairs of primers that hybridize 5′ from a region of interest, one on each strand of a target double stranded DNA molecule, such that nucleotides may be added to the 3′ end of the primer by a suitable DNA polymerase.
  • Primers that flank a CFTR exon are generally designed not to anneal to the exon sequence but rather to anneal to sequence that adjoins the exon (e.g. intron sequence). However, in some cases, amplification primers may be designed to anneal to the exon sequence.
  • The location of primer annealing for many primer pairs that may be used with the methods is shown in Table 1.
  • “Sequencing depth” or “read depth” as used herein refers to the number of times a sequence has been sequenced (the depth of sequencing).
  • read depth can be determined by aligning multiple sequencing run results and counting the start position of reads in nonoverlapping windows of a certain size (for example, 100 bp). Copy number variation can be determined based on read depth using methods known in the art. For example, using a method described in Yoon et al., Genome Research 2009 September; 19(9): 1586-1592; Xie et al., BMC Bioinformatics 2009 Mar. 6; 10:80; or Medvedev et al., Nature Methods 2009 November; 6(11 Suppl):513-20. Use of this type of method and analysis is referred to as a “read depth approach.”
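Counting read start positions in nonoverlapping windows, as described above, can be sketched in a few lines. The example assumes reads have already been aligned and that start positions are 0-based coordinates on the reference; the function name and positions are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def window_read_counts(read_starts: Iterable[int],
                       window_size: int = 100) -> Dict[int, int]:
    """Count aligned read starts per nonoverlapping window.

    Keys are window numbers: window 0 covers positions 0..window_size-1,
    window 1 covers the next window_size positions, and so on.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return dict(Counter(start // window_size for start in read_starts))


# Hypothetical 0-based start positions of five aligned reads.
print(window_read_counts([0, 50, 99, 100, 250]))
```

Windows with no read starts are simply absent from the result; a downstream copy number analysis would treat them as zero.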
  • Crossing depth refers to the number of nucleotides from sequencing reads that are mapped to a given position.
  • “specific,” as used herein in reference to an oligonucleotide primer, means that the nucleotide sequence of the primer has at least 12 bases of sequence identity with a portion of the nucleic acid to be amplified when the oligonucleotide and the nucleic acid are aligned.
  • An oligonucleotide primer that is specific for a nucleic acid is one that, under the stringent hybridization or washing conditions, is capable of hybridizing to the target of interest and not substantially hybridizing to nucleic acids which are not of interest. Higher levels of sequence identity are preferred and include at least 75%, at least 80%, at least 85%, at least 90%, at least 95% and more preferably at least 98% sequence identity.
  • multiplex PCR refers to amplification of two or more products which are each primed using a distinct primer pair.
  • hybridize refers to a process where two complementary nucleic acid strands anneal to each other under appropriately stringent conditions. Hybridizations are typically and preferably conducted with probe-length nucleic acid molecules, preferably 20-100 nucleotides in length, more preferably 18-50 nucleotides in length. Nucleic acid hybridization techniques are well known in the art. See, e.g., Sambrook, et al., 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Plainview, N.Y.
  • hybridization conditions such that sequences having at least a desired level of complementarity will stably hybridize, while those having lower complementarity will not.
  • hybridization conditions and parameters see, e.g., Sambrook, et al., 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Plainview, N.Y.; Ausubel, F. M. et al. 1994, Current Protocols in Molecular Biology. John Wiley & Sons, Secaucus, N.J.
  • specific hybridization occurs under stringent hybridization conditions.
  • stringent hybridization conditions refers to hybridization conditions at least as stringent as the following: hybridization in 50% formamide, 5× SSC, 50 mM NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5× Denhardt's solution at 42° C. overnight; washing with 2× SSC, 0.1% SDS at 45° C.; and washing with 0.2× SSC, 0.1% SDS at 45° C.
  • stringent hybridization conditions should not allow for hybridization of two nucleic acids which differ over a stretch of 20 contiguous nucleotides by more than two bases.
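The mismatch criterion stated above (no more than two differing bases over any stretch of 20 contiguous nucleotides) can be checked with a sliding window. This sketch tests only that stated criterion on two pre-aligned sequences written in the same orientation; it is not a thermodynamic model of hybridization, and the function names are hypothetical.

```python
def max_mismatches_in_window(seq_a: str, seq_b: str, window: int = 20) -> int:
    """Largest number of mismatches found in any `window` contiguous positions.

    Assumption: the sequences are aligned, equal in length and written in
    the same orientation (one of them being the perfect-match partner of
    the strand that would actually hybridize).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    mismatches = [int(a != b) for a, b in zip(seq_a.upper(), seq_b.upper())]
    if len(mismatches) <= window:
        return sum(mismatches)
    current = best = sum(mismatches[:window])
    for i in range(window, len(mismatches)):
        # Slide the window one base: add the new position, drop the oldest.
        current += mismatches[i] - mismatches[i - window]
        best = max(best, current)
    return best


def meets_mismatch_criterion(seq_a: str, seq_b: str) -> bool:
    """True if no 20-base stretch differs by more than two bases."""
    return max_mismatches_in_window(seq_a, seq_b, window=20) <= 2


target = "A" * 30
print(meets_mismatch_criterion(target, "A" * 10 + "CCC" + "A" * 17))
print(meets_mismatch_criterion(target, "C" + "A" * 28 + "C"))
```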
  • sense strand as used herein means the strand of double-stranded DNA (dsDNA) that includes at least a portion of a coding sequence of a functional protein.
  • Anti-sense strand means the strand of dsDNA that is the reverse complement of the sense strand.
  • forward primer means a primer that anneals to the anti-sense strand of dsDNA.
  • reverse primer anneals to the sense-strand of dsDNA.
  • an “isolated” nucleic acid (e.g., an RNA, DNA or a mixed polymer) is one which is substantially separated from other cellular components which naturally accompany such nucleic acid.
  • the term embraces a nucleic acid sequence which has been removed from its naturally occurring environment, and includes recombinant or cloned DNA isolates, oligonucleotides, and chemically synthesized analogs or analogs biologically synthesized by heterologous systems.
  • nucleic acid sample represents more than 50% of the nucleic acid in a sample.
  • the nucleic acid sample may exist in solution or as a dry preparation.
  • coding sequence means a sequence of a nucleic acid or its complement, or a part thereof, that can be transcribed and/or translated to produce the mRNA for and/or the polypeptide or a fragment thereof. Coding sequences include exons in a genomic DNA or immature primary RNA transcripts, which are joined together by the cell's biochemical machinery to provide a mature mRNA. The anti-sense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
  • non-coding sequence means a sequence of a nucleic acid or its complement, or a part thereof, that is not transcribed into amino acid in vivo, or where tRNA does not interact to place or attempt to place an amino acid.
  • Non-coding sequences include both intron sequences in genomic DNA or immature primary RNA transcripts, and gene-associated sequences such as promoters, enhancers, silencers, etc.
  • NGS: next generation sequencing
  • NGS methods include, for example, sequencing-by-synthesis using reversible dye terminators, and sequencing-by-ligation.
  • Non-limiting examples of commonly used NGS platforms include miRNA BeadArray (Illumina, Inc.), Roche 454™ GS FLX™-Titanium (Roche Diagnostics), and ABI SOLiD™ System (Applied Biosystems, Foster City, Calif.).
  • carrier state or “cystic fibrosis carrier” as used herein means a person who contains one CFTR allele that has a mutant CFTR nucleic acid sequence associated with cystic fibrosis, but a second allele that is not a mutant CFTR nucleic acid sequence.
  • Cystic fibrosis is an “autosomal recessive” disease, meaning that a mutation produces little or no phenotypic effect when present in a heterozygous condition with a non-disease related allele, but produces a “disease state” when a person is homozygous or compound heterozygous, i.e., both CFTR alleles are mutant CFTR nucleic acid sequences.
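The recessive logic in the two definitions above reduces to counting mutant alleles. The sketch below encodes only that rule; the function name and the returned labels are hypothetical, and it does not distinguish homozygous from compound heterozygous genotypes.

```python
def cf_status(allele_1_is_mutant: bool, allele_2_is_mutant: bool) -> str:
    """Map the two CFTR alleles of an individual to a status label.

    One mutant allele corresponds to the carrier state; two mutant alleles
    (homozygous or compound heterozygous) correspond to the disease state.
    """
    mutant_alleles = int(allele_1_is_mutant) + int(allele_2_is_mutant)
    labels = {0: "non-carrier", 1: "carrier", 2: "disease state"}
    return labels[mutant_alleles]


print(cf_status(True, False))
```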
  • wild type refers to the CFTR gene sequence which is found in NCBI GenBank locus IDs M58478 (HUMCFTC), AC000111 and AC000061.
  • a cDNA for a CFTR gene is found in Audrezet et al., Hum. Mutat. (2004) 23 (4), 343-357 and/or Genbank accession number NM_000492.3.
  • a “rare CFTR mutation” is a mutation in the CFTR gene sequence that is present in <0.1% of cystic fibrosis patients.
  • a “private CFTR mutation” is a mutation in the CFTR gene sequence that is found in only a single family or a small group.
  • a “common CFTR mutation” is a mutation in the CFTR gene sequence that is associated with cystic fibrosis and is present in at least 0.1% of patients with cystic fibrosis.
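The frequency-based definitions can be expressed as a single threshold test. The sketch below takes "rare" to mean any frequency below the 0.1% cut-off that defines a common mutation, which is an assumption of this example; a "private" mutation depends on family distribution rather than frequency and is not modeled. The function name is hypothetical.

```python
def classify_by_frequency(percent_of_cf_patients: float) -> str:
    """Classify a CF-associated mutation by how often it is seen in patients.

    "common" means present in at least 0.1% of cystic fibrosis patients;
    anything below that threshold is treated here as "rare".
    """
    if not 0 <= percent_of_cf_patients <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return "common" if percent_of_cf_patients >= 0.1 else "rare"


print(classify_by_frequency(70.0))
print(classify_by_frequency(0.05))
```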
  • a “genetic basis for cystic fibrosis” in an individual refers to the individual's genotype, in particular, of their CFTR nucleic acids and whether the individual possesses at least one CFTR mutation that contributes to cystic fibrosis.
  • sample CFTR nucleic acid is a CFTR nucleic acid in, or isolated from, a biological sample. Processing methods to release or otherwise make available a nucleic acid for detection are well known in the art and may include steps of nucleic acid manipulation, e.g., preparing a cDNA by reverse transcription of RNA from the biological sample.
  • a biological sample may be a body fluid or a tissue sample. In some cases a biological sample may consist of or comprise blood, plasma, sera, urine, feces, epidermal sample, vaginal sample, skin sample, cheek swab, sperm, amniotic fluid, cultured cells, bone marrow sample and/or chorionic villi, and the like.
  • Whole blood samples of about 0.5 to 5 ml collected with EDTA, ACD or heparin as anti-coagulant are suitable.
  • Amniotic fluid of 10-15 ml, cultured cells which are 80-100% confluent in two T-25 flasks and 25 mg of chorionic villi are useful sample amounts for processing.
  • An “individual” is any mammal. In a preferred embodiment, an individual is a human.
  • a CFTR target segment that is amplified and sequenced according to the present invention may represent one or more individual exon(s) or portion(s) of exon(s) of the CFTR gene or one or more portions of a CFTR mRNA.
  • a target segment also may include the CFTR promoter region and/or one or more CFTR introns.
  • the target segments represent the entire CFTR gene or the entire CFTR coding region.
  • the target segments represent the entire CFTR coding region and at least one intron or a portion thereof and an adjacent region located immediately upstream (in the 5′ direction) of the coding sequence.
  • the adjacent, upstream region may consist of from about 100 nucleotides up to about 500, 750, 1000, 1100, or 1200 nucleotides of the sequence located immediately upstream of the CFTR coding sequence.
  • the adjacent, upstream region comprises all or a portion of the CFTR promoter sequence.
  • each CFTR nucleic acid target segment may be amplified with an oligonucleotide primer or primer pair specific to the target segment.
  • a single primer or one or both primers of a primer pair comprise a specific adapter sequence (also referred to as a sequencing adapter) ligated to the 5′ end of the target specific sequence portion of the primer.
  • This sequencing adapter is a short oligonucleotide of known sequence that can provide a priming site for both amplification and sequencing of the adjoining, unknown nucleic acid.
  • adapters allow binding of a fragment to a flow cell for next generation sequencing. Any adapter sequence may be included in a primer used in the present invention.
  • all forward amplicons (i.e., amplicons extended from forward primers that hybridized with antisense strands of a target segment) contain the same adapter sequence.
  • all forward amplicons contain the same adapter sequence and all reverse amplicons (i.e., amplicons extended from reverse primers that hybridized with sense strands of a target segment) contain an adapter sequence that is different from the adapter sequence of the forward amplicons.
  • the “forward” adapter sequence consists of or comprises ACACTGACGACATGGTTCTACA (SEQ ID NO:1) or a sequence 90%, 95% or 99% identical to SEQ ID NO:1, and the reverse adapter sequence consists of or comprises TACGGTAGCAGAGACTTGGTCT (SEQ ID NO:2) or a sequence 90%, 95% or 99% identical to SEQ ID NO:2.
  • amplicons from a single sample source further comprise an identical index sequence (also referred to as an index tag, a “barcode” or a multiplex identifier (MID)).
  • indexed amplicons are generated using primers (for example, forward primers and/or reverse primers) containing the index sequence. Such indexed primers may be included during library preparation as a “barcoding” tool to identify specific amplicons as originating from a particular sample source. Indexed amplicons from more than one sample source are quantified individually and then pooled prior to sequencing. As such, the use of index sequences permits multiple samples (i.e., samples from more than one sample source) to be pooled per sequencing run and the sample source subsequently ascertained based on the index sequence.
  • the adapter sequence and/or index sequence gets incorporated into the amplicon (along with the target-specific primer sequence) during amplification. Therefore, the resulting amplicons are sequencing-competent and do not require the traditional library preparation protocol. Moreover, the presence of the index tag permits the differentiation of sequences from multiple sample sources.
  • sequencing templates are prepared by emulsion-based clonal amplification of target segments using specialized fusion primers (containing an adapter sequence) and capture beads.
  • a single adapter-bound fragment is attached to the surface of a bead, and an oil emulsion containing necessary amplification reagents is formed around the bead/fragment component.
  • Parallel amplification of millions of beads with millions of single strand fragments produces a sequencer-ready library.
  • the amplicons constituting the adapter-tagged (and, optionally, indexed) amplicon library are produced by polymerase chain reaction (PCR).
  • the amplicon library is generated using a multiplexed PCR approach, such as that disclosed in U.S. Pat. No. 8,092,996, incorporated by reference herein in its entirety.
  • Bridge PCR is yet another method for in vitro clonal amplification after a library is generated, in preparation for sequencing.
  • This process is a means to clonally amplify a single target molecule, a member of a library, in a defined physical region such as a solid surface, for example, a bead in suspension or a cluster on a glass slide.
  • fragments are amplified using primers attached to the solid surface forming “DNA colonies” or “DNA clusters”. This method is used in some of the genome analyzer sequencers manufactured by Illumina, Inc. (San Diego, Calif.).
  • each CFTR nucleic acid target segment may be amplified with non-adapter-ligated and/or non-indexed primers and a sequencing adapter and/or an index sequence may be subsequently ligated to each of the resulting amplicons.
  • high throughput, massively parallel sequencing is also referred to as next generation sequencing.
  • Methods for performing high throughput, massively parallel sequencing are known in the art.
  • the capacity offered by next generation sequencing has revolutionized amplicon sequencing. Companies such as RainDance Technologies, Inc. (Lexington, Mass.) and Fluidigm Corporation offer platforms which generate libraries that are sequencing-competent and composed purely of targeted sequences. By enabling high-throughput, mini PCR setup, these technologies are ideal for preparing amplicon libraries.
  • One drawback of PCR-based approaches is the limitation of amplicon length, which is determined by PCR itself. However, by targeting overlapping regions, this problem can be circumvented.
  • high throughput, massively parallel sequencing employs sequencing-by-synthesis with reversible dye terminators.
  • sequencing is performed via sequencing-by-ligation.
  • sequencing is single molecule sequencing.
  • Sequencing by synthesis, like the “old style” dye-termination electrophoretic sequencing, relies on incorporation of nucleotides by a DNA polymerase to determine the base sequence.
  • Reversible terminator methods use reversible versions of dye-terminators, adding one nucleotide at a time and detecting fluorescence at each position, with repeated removal of the blocking group to allow polymerization of another nucleotide.
  • the signal of nucleotide incorporation can vary, with fluorescently labeled nucleotides, phosphate-driven light reactions and hydrogen ion sensing having all been used.
  • the MiSeq® personal sequencing system (Illumina, Inc.) employs sequencing by synthesis with reversible terminator chemistry.
  • the sequencing by ligation method uses a DNA ligase to determine the target sequence.
  • This sequencing method relies on enzymatic ligation of oligonucleotides that are adjacent through local complementarity on a template DNA strand.
  • This technology employs a partition of all possible oligonucleotides of a fixed length, labeled according to the sequenced position. Oligonucleotides are annealed and ligated and the preferential ligation by DNA ligase for matching sequences results in a dinucleotide encoded color space signal at that position (through the release of a fluorescently labeled probe that corresponds to a known nucleotide at a known position along the oligo).
  • This method is primarily used by Life Technologies' SOLiD™ sequencers.
  • the Ion Torrent™ (Life Technologies, Carlsbad, Calif.) amplicon sequencing system employs a flow-based approach that detects pH changes caused by the release of hydrogen ions during incorporation of unmodified nucleotides in DNA replication.
  • a sequencing library is initially produced by generating DNA fragments flanked by sequencing adapters. These fragments are clonally amplified on particles by emulsion PCR. The particles with the amplified template are then placed in a silicon semiconductor sequencing chip. During replication, the chip is flooded with one nucleotide after another, and if a nucleotide complements the DNA molecule in a particular microwell of the chip, then it will be incorporated. A proton is naturally released when a nucleotide is incorporated by the polymerase in the DNA molecule, resulting in a detectable local change of pH. The pH of the solution then changes in that well and is detected by the ion sensor.
  • the 454™ GS FLX™ sequencing system employs a light-based detection methodology in a large-scale parallel pyrosequencing system. Pyrosequencing uses DNA polymerization, adding one nucleotide species at a time and detecting and quantifying the number of nucleotides added to a given location through the light emitted by the release of attached pyrophosphates.
  • adapter-ligated DNA fragments are fixed to small DNA-capture beads in a water-in-oil emulsion and amplified by PCR (emulsion PCR). Each DNA-bound bead is placed into a well on a picotiter plate and sequencing reagents are delivered across the wells of the plate.
  • the four DNA nucleotides are added sequentially in a fixed order across the picotiter plate device during a sequencing run. During the nucleotide flow, millions of copies of DNA bound to each of the beads are sequenced in parallel.
  • When a nucleotide complementary to the template strand is added to a well, the nucleotide is incorporated onto the existing DNA strand, generating a light signal that is recorded by a CCD camera in the instrument.
  • amplicons from more than one sample source are pooled prior to high throughput sequencing.
  • “Multiplexing” is the pooling of multiple adapter-tagged and indexed libraries into a single sequencing run. When indexed primer sets are used, this capability can be exploited for comparative studies.
  • amplicon libraries from up to 48 separate sources are pooled prior to sequencing.
  • one aspect of the present invention provides a method for diagnosing a genetic basis for cystic fibrosis in an individual comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of a sample CFTR nucleic acid from said individual, (b) determining the nucleotide sequences of the target segments by sequencing the amplicons using high throughput massively parallel sequencing, and (c) determining that the individual has a genetic basis for being affected with cystic fibrosis or for being a cystic fibrosis carrier if or when the nucleotide sequence of one or more of the target segments contains a mutation associated with cystic fibrosis.
  • the present invention can additionally be used to detect one or more rare CFTR mutations or private mutations in a CFTR nucleic acid from an individual, thereby identifying an individual who possesses one or more rare or private CFTR mutation(s).
  • the present invention is used to identify rare familial mutations in an obligate cystic fibrosis carrier after the carrier has tested negative in a routine screening test for common mutations.
  • routine screening tests may include Cystic Fibrosis Screen: Detectable Mutations, CF Mutation Screen, Cystic Fibrosis Mutation Screen, CFTR Screen, Cystic Fibrosis Screen, Cystic Fibrosis Carrier Screen, and CF-60.
  • the present invention can also be used to identify rare mutations in a cystic fibrosis-affected (i.e. symptomatic) individual who has not had two CFTR sequence mutations identified by at least one routine cystic fibrosis mutation screening test.
  • the methods disclosed herein are employed to confirm cystic fibrosis carrier status in an individual such as, for example, a parent, a sibling or other relatives of a cystic fibrosis-affected individual with one or more rare or private mutations.
  • the present invention is used for prenatal diagnosis of an individual, in particular, an individual who is related to a cystic fibrosis-affected individual or who is suspected of being a cystic fibrosis carrier.
  • target segments of the CFTR gene may be sequenced with gains and losses of genomic sequence (>50 bp) determined using a read depth approach.
  • 29 target segments are sequenced, representing the CFTR coding region (including all exons/intron junctions).
  • the CFTR coding region (including all exons/intron junctions) in addition to about 1 kb upstream and about 300 kb downstream of the CFTR gene are assayed.
  • the sequences of substantially pure nucleic acid primers which are DNA (or an RNA equivalent) and which are useful for amplifying the promoter region, all of the CFTR exons and intron/exon junctions, and a region immediately downstream of the CFTR gene are shown in Table 1.
  • the letter F or R at the end of the primer name indicates whether the primer is a forward (F) or reverse (R) PCR primer.
  • the primers of Table 1 are used with Ion Torrent Personal Genome Machine™ and/or Illumina MiSeq® Personal Sequencing System.
  • the primers of Table 2 are used with a Roche/454™ GS FLX™ sequencer and/or Sanger sequencing.
  • one or more primers consisting of or comprising any of SEQ ID NOs: 3-54 and 107-140 further comprise sequencing adapter sequence SEQ ID NO:1.
  • one or more primers consisting of or comprising any of SEQ ID NOs: 55-106 and 141-174 further comprise sequencing adapter sequence SEQ ID NO:2.
  • CFTR amplicon libraries were created for samples from 48 different sources.
  • the CFTR gene is one of a select few genes that to date has been extensively and exhaustively sequenced and, as such, has been annotated with many polymorphisms. Avoiding these polymorphisms made the selection of primer and/or probe binding sites particularly difficult. Libraries were generated using primers from Table 1 or Table 2 and size selected using either AMPure® beads or eGel.
  • the forward primers of Tables 1 and 2 each had an adapter oligonucleotide ligated to the 5′ end of the primer.
  • the adapter sequence of the forward primer adapter was 5′-ACACTGACGACATGGTTCTACA-3′ (SEQ ID NO: 1).
  • the reverse primers of Tables 1 and 2 each had an adapter oligonucleotide ligated to the 5′ end of the primer.
  • the sequence of the reverse primer adapter was 5′-TACGGTAGCAGAGACTTGGTCT-3′ (SEQ ID NO: 2).
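
The adapter-tagged primer layout described above (adapter at the 5′ end, target-specific sequence at the 3′ end) can be illustrated with a minimal sketch. The two adapter sequences are the ones quoted in the text (SEQ ID NO:1 and SEQ ID NO:2); the function name `make_fusion_primer` and the two target-specific sequences are hypothetical placeholders and are not taken from Table 1 or Table 2.

```python
# Adapter sequences quoted in the text (SEQ ID NO:1 and SEQ ID NO:2).
FORWARD_ADAPTER = "ACACTGACGACATGGTTCTACA"
REVERSE_ADAPTER = "TACGGTAGCAGAGACTTGGTCT"


def make_fusion_primer(target_specific: str, direction: str) -> str:
    """Return adapter + target-specific sequence, written 5' to 3'.

    direction is "F" for a forward primer or "R" for a reverse primer,
    mirroring the F/R suffix used in the primer names.
    """
    target_specific = target_specific.upper()
    if not target_specific or set(target_specific) - set("ACGT"):
        raise ValueError("target-specific sequence must contain only A, C, G, T")
    if direction == "F":
        return FORWARD_ADAPTER + target_specific
    if direction == "R":
        return REVERSE_ADAPTER + target_specific
    raise ValueError("direction must be 'F' or 'R'")


# Hypothetical target-specific sequences, for illustration only.
example_forward = make_fusion_primer("GATTACAGATTACAGATTAC", "F")
example_reverse = make_fusion_primer("CTTGGAACTTGGAACTTGGA", "R")
print(example_forward)
print(example_reverse)
```

Because the adapter sits 5′ of the target-specific portion, every amplicon extended from such a primer carries the adapter at its end, which is what makes the amplicons sequencing-competent without a separate ligation step.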

Abstract

The present invention relates to methods for simultaneously determining the presence or absence of mutations, deletions, duplications and single nucleotide polymorphisms in a cystic fibrosis transmembrane regulator (CFTR) nucleic acid. Oligonucleotide primers and kits used to amplify regions of a CFTR nucleic acid for high throughput, massively parallel sequencing and methods of determining an individual's cystic fibrosis status are also disclosed.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a Divisional of U.S. application Ser. No. 16/158,823, filed Oct. 12, 2018, which is a Divisional of U.S. application Ser. No. 14/774,331, which is the U.S. National Stage application of PCT/US2014/027870, filed Mar. 14, 2014, which claims priority from U.S. Provisional Application No. 61/785,862, filed Mar. 14, 2013.
  • The instant application contains a Sequence Listing which has been submitted in ASCII format via EFS-WEB and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Mar. 12, 2021, is named seqlisting.txt and is 40,033 bytes.
  • FIELD OF THE INVENTION
  • The present invention relates to methods for simultaneously determining the presence or absence of mutations, deletions, duplications and single nucleotide polymorphisms in a cystic fibrosis transmembrane regulator (CFTR) nucleic acid. Nucleotide sequences (such as for primers) used to amplify regions of a CFTR nucleic acid for high throughput, massively parallel sequencing and methods of determining an individual's cystic fibrosis status are also disclosed.
  • BACKGROUND OF THE INVENTION
  • The following description of the background of the invention is provided simply as an aid in understanding the invention and is not admitted to describe or constitute prior art to the invention.
  • Cystic fibrosis (CF) is the most common severe autosomal recessive genetic disorder in the Caucasian population. It affects approximately 1 in 2,500 live births in North America (Boat et al, The Metabolic Basis of Inherited Disease, 6th ed, pp 2649-2680, McGraw Hill, NY (1989)). Approximately 1 in 25 persons are carriers of the disease. The major symptoms of cystic fibrosis include chronic pulmonary disease, pancreatic exocrine insufficiency, and elevated sweat electrolyte levels. The symptoms are consistent with cystic fibrosis being an exocrine disorder. Although recent advances have been made in the analysis of ion transport across the apical membrane of the epithelium of CF patient cells, it is not clear that the abnormal regulation of chloride channels represents the primary defect in the disease.
  • The gene for CF has been localized to a 250,000 base pair genomic sequence present on the long arm of chromosome 7. This sequence encodes a membrane-associated protein called the “cystic fibrosis transmembrane regulator” (or “CFTR”). There are greater than 1000 different mutations in the CFTR gene, having varying frequencies of occurrence in the population, presently reported to the Cystic Fibrosis Genetic Analysis Consortium. These mutations exist in both the coding regions (e.g., ΔF508, a mutation found on about 70% of CF alleles, represents a deletion of a phenylalanine at residue 508) and the non-coding regions (e.g., the 5T, 7T, and 9T mutations correspond to a sequence of 5, 7, or 9 thymidine bases located at the splice branch/acceptor site of intron 8) of the CFTR gene. Comparison of the CFTR genomic and cDNA sequences confirms the presence of 27 exons. The exons are numbered 1-27 as shown in NCBI Reference Sequence accession no. NM_000492.3. Each intron is flanked by the consensus GT-AG splice-site sequence as previously reported (Zielenski, et al., (1991) Genomics 10, 214-228).
  • Methods for detecting CFTR gene mutations have been described. See e.g., Audrezet et al., “Genomic rearrangements in the CFTR gene: extensive allelic heterogeneity and diverse mutational mechanisms” Hum Mutat. 2004 April; 23(4):343-57; PCT WO 2004/040013 A1 and corresponding US application No. 20040110138, titled “Method for the detection of multiple genetic targets” by Spiegelman and Lem; US patent application No. 20030235834, titled “Approaches to identify cystic fibrosis” by Dunlop et al.; and US patent application No. 20040126760, titled “Novel compositions and methods for carrying out multiple PCR reactions on a single sample” by N. Broude.
  • Currently, however, multiple different analysis and/or detection methods must be employed in order to accurately obtain comprehensive sequence data. For example, traditional Sanger sequencing methodology may be employed to determine the presence or absence of mutations involving a small number of nucleotides in the CFTR gene. Sanger sequencing, though, is unable to detect large deletions and duplications such as those involving one or more exons. As a result, additional methods such as quantitative fluorescent polymerase chain reaction (QF-PCR) are needed to detect these larger types of mutations.
  • Accordingly, improved methods are needed to efficiently detect the variety of CFTR gene defects which underlie CF and to simultaneously capture both dosage data (e.g., gene copy number) and sequence data. Moreover, improved methods are needed for detecting rare mutations in the CFTR gene. Ideally, methods that can detect multiple classes of CFTR mutations such as those involving small base changes (e.g., missense mutations, nonsense mutations, small insertions or deletions and/or splice-site mutations) and those involving larger deletions and/or duplications in a single assay are desirable.
  • SUMMARY OF THE INVENTION
  • Provided is a method for determining the nucleotide sequence of a sample CFTR nucleic acid, the method comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of the sample CFTR nucleic acid and (b) determining the nucleotide sequences of the target segments by sequencing the amplicons in the amplicon library using high throughput massively parallel sequencing.
  • Also provided is a method for determining the presence or absence of a CFTR nucleotide sequence variant in a sample CFTR nucleic acid comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of the sample CFTR nucleic acid; (b) determining the nucleotide sequences of the target segments by sequencing the amplicons in the amplicon library using high throughput massively parallel sequencing; (c) comparing each target segment nucleotide sequence determined in step (b) with the corresponding region of a reference CFTR nucleotide sequence; and (d) determining that the sample CFTR nucleic acid has a variant sequence if or when one or more of the target segment sequences is different from the corresponding region of the reference CFTR nucleotide sequence.
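
Steps (c) and (d) of the comparison described above can be sketched as follows. This is a simplified illustration, not the method of the application: it assumes each target segment sequence is already aligned to its reference region and has the same length, so it reports substitutions only. The function names and the toy segment sequences are hypothetical.

```python
def find_base_changes(reference: str, sample: str):
    """List (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are pre-aligned and of equal length, so only
    substitutions are reported; insertions and deletions need a real aligner.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = zip(reference.upper(), sample.upper())
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(pairs, start=1)
        if ref_base != sample_base
    ]


def has_variant(reference_segments: dict, sample_segments: dict) -> bool:
    """True if any target segment differs from its reference region."""
    return any(
        find_base_changes(reference_segments[name], sequence)
        for name, sequence in sample_segments.items()
    )


# Hypothetical toy segments, not real CFTR sequence.
reference_segments = {"segment_1": "ATGCAGAGGT", "segment_2": "CCTTGGAAAC"}
sample_segments = {"segment_1": "ATGCAGAGGT", "segment_2": "CCTTGGATAC"}

changes = find_base_changes(
    reference_segments["segment_2"], sample_segments["segment_2"]
)
variant_present = has_variant(reference_segments, sample_segments)
print(changes, variant_present)
```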
  • A sequence variant is a CFTR sequence that is different from a corresponding region of a reference CFTR nucleic acid sequence. Such differences in the CFTR sequence can include point mutations, insertions, deletions and/or duplications or copy number variations (CNV). CNVs are gains and losses of genomic sequence >50 bp between two individuals of a species (Mills et al. 2011, Mapping copy number variation by population-scale genome sequencing, Nature 470: 59-65). Such variations can be determined when using next-generation sequencing by using a read depth (i.e., mapping density) approach if amplification is halted during library generation during the exponential phase of PCR. Relative to all other amplicons, the dosage for a normal specimen will be one, ½ for a heterozygous deletion and 1½ for a heterozygous duplication.
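
The read depth idea can be illustrated with a small sketch. It assumes per-amplicon read counts are available for the test sample and for a normal specimen, and it normalizes by the median ratio across amplicons so that differences in total sequencing depth cancel out. The function names, the amplicon names, the counts and the 0.75/1.25 cutoffs are all hypothetical choices made for illustration; the text itself only gives the expected dosage values of 1, ½ and 1½.

```python
from statistics import median


def relative_dosage(sample_counts: dict, normal_counts: dict) -> dict:
    """Per-amplicon read depth relative to the other amplicons and a normal specimen."""
    # Raw depth ratio of the sample to the normal specimen for each amplicon.
    raw = {
        name: sample_counts[name] / normal_counts[name] for name in sample_counts
    }
    # The median ratio absorbs differences in overall sequencing depth,
    # assuming most amplicons are present at the normal copy number.
    scale = median(raw.values())
    return {name: ratio / scale for name, ratio in raw.items()}


def call_dosage(dosage: float, loss_below: float = 0.75, gain_above: float = 1.25) -> str:
    """Classify a dosage value; the cutoffs are illustrative assumptions."""
    if dosage < loss_below:
        return "loss"
    if dosage > gain_above:
        return "gain"
    return "normal"


# Hypothetical read counts for five amplicons.
normal_counts = {"amp_1": 1000, "amp_2": 800, "amp_3": 1200, "amp_4": 1000, "amp_5": 900}
sample_counts = {"amp_1": 2000, "amp_2": 1600, "amp_3": 1200, "amp_4": 3000, "amp_5": 1800}

dosages = relative_dosage(sample_counts, normal_counts)
calls = {name: call_dosage(value) for name, value in dosages.items()}
print(dosages)
print(calls)
```

In this toy data the sample was sequenced about twice as deeply as the normal specimen, yet amp_3 comes out at a dosage of 0.5 (one copy lost) and amp_4 at 1.5 (one copy gained), because each amplicon is judged relative to the others.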
  • In some embodiments the reference CFTR nucleic acid sequence comprises a wild type CFTR nucleic acid sequence. In some embodiments the sequence variant comprises a CFTR nucleotide sequence mutation associated with cystic fibrosis.
  • Another aspect of the present invention provides a method for determining the presence or absence of base changes, gene deletions and gene duplications in a sample CFTR nucleic acid as compared to a reference CFTR nucleotide sequence, said method comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of the sample CFTR nucleic acid, (b) determining the nucleotide sequences of the target segments by sequencing the amplicons using high throughput massively parallel sequencing, (c) comparing each target segment sequence determined in step (b) with the corresponding region of the reference CFTR nucleotide sequence; and (d) determining that one or more base changes, gene deletions and/or gene duplications is present in the sample CFTR nucleic acid if or when one or more of the target segment sequences is different from the corresponding region of the reference CFTR nucleotide sequence. In some embodiments, the reference CFTR sequence consists of or, alternatively, comprises a wild type CFTR nucleic acid sequence.
  • Another aspect of the present invention provides a method for diagnosing a genetic basis for cystic fibrosis in an individual comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of a CFTR nucleic acid from said individual, (b) determining the nucleotide sequences of the target segments by sequencing the amplicons using high throughput massively parallel sequencing, and (c) determining that the individual has a genetic basis for cystic fibrosis if or when the nucleotide sequence of one or more of the target segments contains a mutation associated with cystic fibrosis. Genetic mutations associated with cystic fibrosis are well known in the art and include both rare and common mutations.
  • In any of the aspects of the present invention, high throughput massively parallel sequencing may be performed using a read depth approach.
  • A sample CFTR nucleic acid may be any form of nucleic acid including, for example, genomic DNA, RNA (such as mRNA) or cDNA.
  • In some embodiments of the above methods, CFTR nucleic acids from more than one sample are sequenced. In some cases all samples are sequenced simultaneously in parallel. In a preferred embodiment, CFTR nucleic acids from at least 5, 10, 20, 30 or 35 up to 40, 45, 48 or 50 different samples are amplified and sequenced using methods of the present invention. All amplicons derived from a single sample may comprise an index sequence that indicates the source from which the amplicon is generated, the index for each sample being different from the indexes from all other samples. As such, the use of indexes permits multiple samples to be pooled per sequencing run and the sample source subsequently ascertained based on the index sequence.
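
Assigning pooled reads back to their sample source by index sequence can be sketched as below. The sketch assumes that every read begins with its index and that all indexes have the same length; where the index actually sits in a read depends on the platform and library design. The index sequences, sample names and reads are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group reads by sample source using a leading index sequence.

    Each read is split into its leading index and the remaining insert.
    Inserts whose index is not recognized are collected under "unassigned".
    """
    index_lengths = {len(index) for index in index_to_sample}
    if len(index_lengths) != 1:
        raise ValueError("all index sequences must have the same length")
    index_length = index_lengths.pop()

    by_sample = defaultdict(list)
    for read in reads:
        index, insert = read[:index_length], read[index_length:]
        sample = index_to_sample.get(index, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical indexes and reads, for illustration only.
index_to_sample = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}
pooled_reads = ["ACGTACGGGTTT", "TGCATGCCCAAA", "ACGTACAAATTT", "NNNNNNGGGG"]

assigned = demultiplex(pooled_reads, index_to_sample)
print(assigned)
```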
  • In some embodiments, the Access Array™ System (Fluidigm Corp., San Francisco, Calif.) is used to generate a bar coded (indexed) amplicon library by simultaneously amplifying the CFTR nucleic acids from the samples in one set up. The library that is generated then can be used on a sequencing platform such as, for example, Roche/454™ GS FLX™ sequencing system (Roche, Germany), Ion Torrent™ Ion PGM™ Sequencer (Life Technologies, Carlsbad, Calif.) or MiSeq® Personal Sequencer (Illumina, Inc., San Diego, Calif.).
  • In some embodiments of the present invention, sample CFTR target segments are amplified using primers that contain an oligonucleotide sequencing adapter to produce adapter-tagged amplicons. In other embodiments, the employed primers do not contain adapter sequences and the amplicons produced are subsequently (i.e. after amplification) ligated to an oligonucleotide sequencing adapter on one or both ends of the amplicons. In some embodiments, all sense amplicons contain the same sequencing adapter and all antisense amplicons contain a sequencing adapter having a different sequence from the sense amplicon sequencing adapter. In some embodiments, only a single stranded sample CFTR nucleic acid is amplified and/or sequenced.
  • Methods of the present invention may be used to sequence all or part of a CFTR gene or cDNA. In some embodiments, from at least one, two, five, 10 or 20 up to 25 or 28 exons are evaluated. In other embodiments all or a portion of the CFTR promoter region is also evaluated. Some or all CFTR introns may also be evaluated. In one embodiment, the CFTR target segments, when combined, represent the CFTR coding region and all intron/exon junctions, plus from about 100, 500, 750, 900 or 1000 up to about 1000 nucleotides of the CFTR promoter immediately upstream (in the 5 prime direction) of the first exon plus from about 50, 100, 150 or 200 up to about 200, 250, 300 or 400 nucleotides immediately downstream (in the 3 prime direction) of the CFTR gene. In a preferred embodiment, one or more sample CFTR nucleic acids are sequenced using at least one primer that comprises a sequence shown in Table 1 or Table 2. In a preferred embodiment, all of the primers shown in Tables 1 or 2 are used.
  • In a similar embodiment, all exons and a portion of one or more introns are represented.
  • Oligonucleotides and combinations of oligonucleotides that are useful as primers in the methods of the present invention are also provided. These oligonucleotides are provided as substantially purified material. Kits comprising oligonucleotides for performing amplifications and sequencing as described herein also are provided.
  • DETAILED DESCRIPTION OF THE INVENTION
  • Provided by the present invention are methods for simultaneously determining the presence or absence of CFTR gene mutations involving a small number of nucleotides in addition to larger deletions and duplications in a CFTR nucleotide sequence of a sample CFTR nucleic acid in a single assay. By determining the presence or absence of CFTR nucleotide sequence variants in a sample CFTR nucleic acid, an investigator can determine an individual's cystic fibrosis status based on the presence or absence of CFTR mutations associated with cystic fibrosis in the sample obtained from the individual.
  • The methods of the present invention comprise generating an adapter-tagged amplicon library by amplifying multiple target segments of a sample CFTR nucleic acid of one or more samples and determining the target segment sequences by sequencing the amplicons using high throughput massively parallel sequencing (i.e., next generation sequencing). Using the provided methods, both gene sequence and gene dosage may be determined in a nucleic acid sample. Gene dosage (also referred to as copy number variation) can be determined by performing next generation sequencing and using a read depth approach.
  • In some embodiments, the one or more sample CFTR sequences are compared with a reference CFTR sequence to determine if differences (e.g., difference in sequence or copy number) are present. A reference CFTR sequence may be a CFTR genomic or cDNA sequence, or a portion thereof, from a normal (non-cystic fibrosis afflicted and non-cystic fibrosis carrier) individual. In some cases, a reference CFTR sequence may comprise a wild type CFTR nucleic acid sequence. Various methods known in the art (e.g., read depth approach) can be employed to analyze sequencing data to determine if differences are present as compared to a reference sequence.
  • The term “amplify” as used herein with respect to nucleic acid sequences, refers to methods that increase the representation of a population of nucleic acid sequences in a sample. Nucleic acid amplification methods, such as PCR, isothermal methods, rolling circle methods, etc., are well known to the skilled artisan. See, e.g., Saiki, “Amplification of Genomic DNA” in PCR Protocols, Innis et al., Eds., Academic Press, San Diego, Calif. 1990, pp 13-20; Wharam et al., Nucleic Acids Res. 2001 Jun. 1; 29(11):E54-E54; Hafner et al., Biotechniques 2001 April; 30(4):852-6, 858, 860 passim; Zhong et al., Biotechniques 2001 April; 30(4):852-6, 858, 860.
  • The term “CFTR promoter region” as used herein refers to a segment of the CFTR gene representing at least the first 250 nucleotides upstream from the translation start site. In other embodiments, the promoter region may include the first 250 nt, first 300 nt, first 350 nt, first 400 nt, first 450 nt, first 500 nt, first 1 kb, first 5 kb, first 10 kb, first 15 kb, first 20 kb, first 21 kb or first 22 kb of sequence directly upstream of the start codon. A deletion of the promoter region as defined herein may be accompanied by deletion of downstream exons/introns but not all of the CFTR gene. In some embodiments, the coordinate deletion involving the CFTR promoter region and downstream CFTR gene sequence involves fewer than about 10 exons, and more typically involves fewer than 5 exons. Deletions or duplications of the CFTR promoter region may be detected using primers that flank the deleted or duplicated sequence. In a preferred embodiment, a promoter deletion or duplication involves a segment of four or more nucleotides, more preferably 5 or more, more preferably 8 or more, and even more preferably 12 or more nucleotides.
  • A “CFTR nucleic acid” as used herein refers to a nucleic acid that contains a sequence of a CFTR gene, mRNA, cDNA or a portion of such a CFTR sequence. A CFTR nucleic acid may contain the CFTR coding region. A CFTR nucleic acid may be genomic DNA, cDNA, single stranded DNA or mRNA. In some embodiments, only a single strand of a sample CFTR nucleic acid is amplified and/or sequenced. In some embodiments both strands of double stranded CFTR DNA are amplified and sequenced. A CFTR nucleic acid may be present in a biological sample or it may be isolated from a biological sample.
  • The terms “complementary” or “complementarity” as used herein with reference to polynucleotides (i.e., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid) refer to the base-pairing rules. The complement of a nucleic acid sequence as used herein refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5′ end of one sequence is paired with the 3′ end of the other, is in “antiparallel association.” For example, the sequence “5′-A-G-T-3′” is complementary to the sequence “3′-T-C-A-5′.” Certain bases not commonly found in natural nucleic acids may be included in the nucleic acids described herein; these include, for example, inosine, 7-deazaguanine, Locked Nucleic Acids (LNA), and Peptide Nucleic Acids (PNA). Complementarity need not be perfect; stable duplexes may contain mismatched base pairs, degenerate bases, or unmatched bases. Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length of the oligonucleotide, base composition and sequence of the oligonucleotide, ionic strength and incidence of mismatched base pairs. A complement sequence can also be a sequence of RNA complementary to the DNA sequence or its complement sequence, and can also be a cDNA.
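  • The base-pairing rules above can be illustrated with a minimal sketch. It assumes unmodified A/C/G/T bases only (no inosine, LNA or PNA), and the function names `reverse_complement` and `count_mismatches` are illustrative, not part of the described methods.

```python
# Watson-Crick pairing for DNA (assumption: unmodified A/C/G/T only).
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the antiparallel complement of seq, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def count_mismatches(oligo: str, target: str) -> int:
    """Count non-pairing positions when oligo is aligned antiparallel to target."""
    if len(oligo) != len(target):
        raise ValueError("sequences must have equal length")
    perfect_partner = reverse_complement(target)
    return sum(1 for a, b in zip(oligo.upper(), perfect_partner) if a != b)


# 5'-A-G-T-3' pairs with 3'-T-C-A-5', which reads ACT when written 5'->3'.
print(reverse_complement("AGT"))       # ACT
print(count_mismatches("ACT", "AGT"))  # 0 (perfect complement)
print(count_mismatches("ACA", "AGT"))  # 1 (one mismatched base pair)
```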
  • The term “deletion” as used herein encompasses a mutation that removes one or more nucleotides from nucleic acid. Conversely, the term “duplication” refers to a mutation that inserts one or more nucleotides of identical sequence directly next to this sequence in the nucleic acid. In a preferred embodiment, a deletion or duplication involves a segment of four or more nucleotides.
  • The term “dosage” or “gene dosage” refers to the number of copies of a gene, or portions of a gene, present in a sample.
  • The term “primer” as used herein means a sequence of nucleotides, preferably DNA, that hybridizes to a substantially complementary target sequence and is recognized by DNA polymerase to begin DNA replication. The term primer as used herein includes all forms of primers that may be synthesized including peptide nucleic acid primers, locked nucleic acid primers, phosphorothioate modified primers, labeled primers, and the like.
  • The term “substantially complementary” as used herein means that two sequences hybridize under stringent hybridization conditions. The skilled artisan will understand that substantially complementary sequences need not hybridize along their entire length. In particular, substantially complementary sequences may comprise a contiguous sequence of bases that do not hybridize to a target sequence, positioned 3′ or 5′ to a contiguous sequence of bases that hybridize under stringent hybridization conditions to a target sequence.
  • The term “flanking” as used herein with regard to primers means that a primer hybridizes to a target nucleic acid adjoining a region of interest sought to be amplified on the target. The skilled artisan will understand that preferred primers are pairs of primers that hybridize 5′ from a region of interest, one on each strand of a target double stranded DNA molecule, such that nucleotides may be added to the 3′ end of the primer by a suitable DNA polymerase. Primers that flank a CFTR exon are generally designed not to anneal to the exon sequence but rather to anneal to sequence that adjoins the exon (e.g., intron sequence). However, in some cases, an amplification primer may be designed to anneal to the exon sequence. The location of primer annealing for many primer pairs that may be used with the methods is shown in Table 1.
  • “Sequencing depth” or “read depth” as used herein refers to the number of times a sequence has been sequenced (the depth of sequencing). As an example, read depth can be determined by aligning multiple sequencing run results and counting the start position of reads in nonoverlapping windows of a certain size (for example, 100 bp). Copy number variation can be determined based on read depth using methods known in the art, for example, a method described in Yoon et al., Genome Research 2009 September; 19(9): 1586-1592; Xie et al., BMC Bioinformatics 2009 Mar. 6; 10:80; or Medvedev et al., Nature Methods 2009 November; 6(11 Suppl):S13-20. Use of this type of method and analysis is referred to as a “read depth approach.”
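  • The window-counting step described above can be sketched as follows. This is a simplified illustration, not the method of any cited reference: it assumes reads are already aligned and represented by 0-based start positions, and it omits normalization for total read count and GC content that published read depth methods apply. The names `window_read_depth` and `depth_ratio` are hypothetical.

```python
from collections import Counter


def window_read_depth(read_starts, window_size=100):
    """Count read start positions falling in each non-overlapping window."""
    return dict(Counter(start // window_size for start in read_starts))


def depth_ratio(sample_counts, reference_counts):
    """Per-window sample/reference ratio; windows empty in the reference are skipped."""
    return {
        window: sample_counts.get(window, 0) / ref
        for window, ref in reference_counts.items()
        if ref > 0
    }


sample = window_read_depth([5, 20, 150, 250, 260])
reference = window_read_depth([1, 50, 110, 190, 205, 299])
print(sample)                          # {0: 2, 1: 1, 2: 2}
print(depth_ratio(sample, reference))  # {0: 1.0, 1: 0.5, 2: 1.0}
```

  • In this toy input, window 1 shows half the reference depth, the kind of signal a read depth approach would flag as a candidate single-copy loss.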
  • “Coverage depth” refers to the number of nucleotides from sequencing reads that are mapped to a given position.
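  • As a minimal sketch of this definition, coverage depth at each position can be computed from aligned reads. The representation of a read as a hypothetical `(start, length)` pair with 0-based coordinates is an assumption made for illustration.

```python
def coverage_depth(reads, region_length):
    """Return the number of read bases mapped to each position of the region.

    reads: iterable of (start, length) alignments, 0-based (assumed format).
    """
    depth = [0] * region_length
    for start, length in reads:
        # Clip each read to the region before counting.
        for pos in range(max(start, 0), min(start + length, region_length)):
            depth[pos] += 1
    return depth


print(coverage_depth([(0, 4), (2, 4), (3, 2)], 8))  # [1, 1, 2, 3, 2, 1, 0, 0]
```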
  • The term “specific” as used herein in reference to an oligonucleotide primer means that the nucleotide sequence of the primer has at least 12 bases of sequence identity with a portion of the nucleic acid to be amplified when the oligonucleotide and the nucleic acid are aligned. An oligonucleotide primer that is specific for a nucleic acid is one that, under the stringent hybridization or washing conditions, is capable of hybridizing to the target of interest and not substantially hybridizing to nucleic acids which are not of interest. Higher levels of sequence identity are preferred and include at least 75%, at least 80%, at least 85%, at least 90%, at least 95% and more preferably at least 98% sequence identity.
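  • The two sequence-identity criteria above (a run of at least 12 identical bases, and an overall percent identity) can be checked with a short sketch. It assumes an ungapped alignment of equal-length sequences on the same strand; the function names are illustrative, and the example target is the Table 1 primer E1F with one base changed for demonstration.

```python
def percent_identity(primer: str, segment: str) -> float:
    """Percent of aligned positions that are identical (ungapped, equal length)."""
    if len(primer) != len(segment):
        raise ValueError("sequences must have equal length")
    matches = sum(a == b for a, b in zip(primer.upper(), segment.upper()))
    return 100.0 * matches / len(primer)


def longest_identical_run(primer: str, segment: str) -> int:
    """Length of the longest stretch of consecutive identical aligned bases."""
    best = run = 0
    for a, b in zip(primer.upper(), segment.upper()):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best


primer = "GTCTTTGGCATTAGGAGCTTGAGC"            # E1F from Table 1 (24 nt)
segment = primer[:12] + "C" + primer[13:]     # hypothetical target, one mismatch
print(longest_identical_run(primer, segment))  # 12
print(round(percent_identity(primer, segment), 1))  # 95.8
```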
  • The term “multiplex PCR” as used herein refers to amplification of two or more products which are each primed using a distinct primer pair.
  • The term “hybridize” as used herein refers to a process where two complementary nucleic acid strands anneal to each other under appropriately stringent conditions. Hybridizations are typically and preferably conducted with probe-length nucleic acid molecules, preferably 20-100 nucleotides in length, more preferably 18-50 nucleotides in length. Nucleic acid hybridization techniques are well known in the art. See, e.g., Sambrook, et al., 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Plainview, N.Y. Those skilled in the art understand how to estimate and adjust the stringency of hybridization conditions such that sequences having at least a desired level of complementarity will stably hybridize, while those having lower complementarity will not. For examples of hybridization conditions and parameters, see, e.g., Sambrook, et al., 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Press, Plainview, N.Y.; Ausubel, F. M. et al. 1994, Current Protocols in Molecular Biology. John Wiley & Sons, Secaucus, N.J. In some embodiments, specific hybridization occurs under stringent hybridization conditions.
  • The term “stringent hybridization conditions” as used herein refers to hybridization conditions at least as stringent as the following: hybridization in 50% formamide, 5×SSC, 50 mM NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5× Denhardt's solution at 42° C. overnight; washing with 2×SSC, 0.1% SDS at 45° C.; and washing with 0.2×SSC, 0.1% SDS at 45° C. In another example, stringent hybridization conditions should not allow for hybridization of two nucleic acids which differ over a stretch of 20 contiguous nucleotides by more than two bases.
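  • The second example above (no more than two differences over any stretch of 20 contiguous nucleotides) is a sequence rule that can be sketched in code. This is only an in silico illustration under stated assumptions: the two sequences are compared on the same strand in an ungapped, equal-length alignment, and actual hybridization behavior is determined experimentally. The function name `violates_window_rule` is hypothetical.

```python
def violates_window_rule(seq_a, seq_b, window=20, max_mismatches=2):
    """True if any `window`-base stretch differs at more than `max_mismatches` positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    diffs = [int(a != b) for a, b in zip(seq_a.upper(), seq_b.upper())]
    if len(diffs) <= window:
        return sum(diffs) > max_mismatches
    current = sum(diffs[:window])  # mismatches in the first window
    if current > max_mismatches:
        return True
    for i in range(window, len(diffs)):
        current += diffs[i] - diffs[i - window]  # slide the window by one base
        if current > max_mismatches:
            return True
    return False


reference = "A" * 30
clustered = "CCC" + "A" * 27                            # 3 differences within 20 nt
spread = "C" + "A" * 13 + "C" + "A" * 14 + "C"          # 3 differences, never 3 per window
print(violates_window_rule(reference, clustered))  # True
print(violates_window_rule(reference, spread))     # False
```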
  • The term “sense strand” as used herein means the strand of double-stranded DNA (dsDNA) that includes at least a portion of a coding sequence of a functional protein. “Anti-sense strand” means the strand of dsDNA that is the reverse complement of the sense strand.
  • The term “forward primer” as used herein means a primer that anneals to the anti-sense strand of dsDNA. A “reverse primer” anneals to the sense-strand of dsDNA.
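  • The strand relationships in the two definitions above can be sketched as follows. Because a forward primer anneals to the anti-sense strand, its sequence appears in the sense strand as written; a reverse primer anneals to the sense strand, so its reverse complement appears there. The toy sequences and the name `primer_site` are illustrative assumptions, and exact matching is assumed.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def primer_site(sense_strand: str, primer: str):
    """Classify a primer against the sense strand (both written 5'->3').

    Returns ("forward", index) if the primer anneals to the anti-sense strand,
    ("reverse", index) if it anneals to the sense strand, or None if no exact site.
    """
    sense, primer = sense_strand.upper(), primer.upper()
    index = sense.find(primer)
    if index != -1:
        return ("forward", index)
    index = sense.find(reverse_complement(primer))
    if index != -1:
        return ("reverse", index)
    return None


sense = "GATTACACCCGTTTAGC"            # toy sense strand, not a CFTR sequence
print(primer_site(sense, "GATTACA"))   # ('forward', 0)
print(primer_site(sense, "GCTAAAC"))   # ('reverse', 10)
```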
  • The term “isolated” as used herein with respect to a nucleic acid (e.g., an RNA, DNA or a mixed polymer) is one which is substantially separated from other cellular components which naturally accompany such nucleic acid. The term embraces a nucleic acid sequence which has been removed from its naturally occurring environment, and includes recombinant or cloned DNA isolates, oligonucleotides, and chemically synthesized analogs or analogs biologically synthesized by heterologous systems.
  • The term “substantially pure” as used herein means that a nucleic acid represents more than 50% of the nucleic acid in a sample. The nucleic acid sample may exist in solution or as a dry preparation.
  • The term “coding sequence” as used herein means a sequence of a nucleic acid or its complement, or a part thereof, that can be transcribed and/or translated to produce the mRNA for and/or the polypeptide or a fragment thereof. Coding sequences include exons in a genomic DNA or immature primary RNA transcripts, which are joined together by the cell's biochemical machinery to provide a mature mRNA. The anti-sense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.
  • The term “non-coding sequence” as used herein means a sequence of a nucleic acid or its complement, or a part thereof, that is not translated into amino acids in vivo, or where tRNA does not interact to place or attempt to place an amino acid. Non-coding sequences include both intron sequences in genomic DNA or immature primary RNA transcripts, and gene-associated sequences such as promoters, enhancers, silencers, etc.
  • The term “high throughput, massively parallel sequencing” as used herein refers to sequencing methods that can generate multiple sequencing reactions of clonally amplified molecules and of single nucleic acid molecules in parallel. This allows increased throughput and yield of data. These methods are also known in the art as next generation sequencing (NGS) methods. NGS methods include, for example, sequencing-by-synthesis using reversible dye terminators, and sequencing-by-ligation. Non-limiting examples of commonly used NGS platforms include miRNA BeadArray (Illumina, Inc.), Roche 454™ GS FLX™-Titanium (Roche Diagnostics), and ABI SOLiD™ System (Applied Biosystems, Foster City, Calif.).
  • The term “carrier state” or “cystic fibrosis carrier” as used herein means a person who contains one CFTR allele that has a mutant CFTR nucleic acid sequence associated with cystic fibrosis, but a second allele that is not a mutant CFTR nucleic acid sequence. Cystic fibrosis is an “autosomal recessive” disease, meaning that a mutation produces little or no phenotypic effect when present in a heterozygous condition with a non-disease related allele, but produces a “disease state” when a person is homozygous or a compound heterozygote, i.e., both CFTR alleles are mutant CFTR nucleic acid sequences.
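  • The autosomal recessive logic above can be summarized in a small sketch. Each allele is represented by the label of its cystic fibrosis-associated mutation, or `None` for a non-mutant allele; the labels `mutA` and `mutB` and the function name are hypothetical placeholders, and the sketch is illustrative rather than a clinical interpretation rule.

```python
def classify_genotype(allele_1, allele_2):
    """Classify a CFTR genotype from two alleles.

    Each allele is a mutation label (str) or None for a non-mutant allele.
    """
    if allele_1 is None and allele_2 is None:
        return "no mutant allele"
    if allele_1 is None or allele_2 is None:
        return "carrier"
    if allele_1 == allele_2:
        return "disease state (homozygous)"
    return "disease state (compound heterozygote)"


print(classify_genotype(None, None))      # no mutant allele
print(classify_genotype("mutA", None))    # carrier
print(classify_genotype("mutA", "mutA"))  # disease state (homozygous)
print(classify_genotype("mutA", "mutB"))  # disease state (compound heterozygote)
```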
  • The term “wild type” as used herein with respect to the CFTR gene or a locus thereof refers to the CFTR gene sequence which is found in NCBI GenBank locus IDs M58478 (HUMCFTC), AC000111 and AC000061. A cDNA for a CFTR gene is found in Audrezet et al., Hum. Mutat. (2004) 23 (4), 343-357 and/or GenBank accession number NM_000492.3.
  • A “rare CFTR mutation” is a mutation in the CFTR gene sequence that is present in <0.1% of cystic fibrosis patients.
  • A “private CFTR mutation” is a mutation in the CFTR gene sequence that is found in only a single family or a small group.
  • A “common CFTR mutation” is a mutation in the CFTR gene sequence that is associated with cystic fibrosis and is present in at least 0.1% of patients with cystic fibrosis.
  • A “genetic basis for cystic fibrosis” in an individual refers to the individual's genotype, in particular, of their CFTR nucleic acids and whether the individual possesses at least one CFTR mutation that contributes to cystic fibrosis.
  • The term “about” as used herein means in quantitative terms plus or minus 10%.
  • A “sample CFTR nucleic acid” is a CFTR nucleic acid in, or isolated from, a biological sample. Processing methods to release or otherwise make available a nucleic acid for detection are well known in the art and may include steps of nucleic acid manipulation, e.g., preparing a cDNA by reverse transcription of RNA from the biological sample. A biological sample may be a body fluid or a tissue sample. In some cases a biological sample may consist of or comprise blood, plasma, sera, urine, feces, epidermal sample, vaginal sample, skin sample, cheek swab, sperm, amniotic fluid, cultured cells, bone marrow sample and/or chorionic villi, and the like. Fixed or frozen tissues also may be used. Whole blood samples of about 0.5 to 5 ml collected with EDTA, ACD or heparin as anti-coagulant are suitable. Amniotic fluid of 10-15 ml, cultured cells which are 80-100% confluent in two T-25 flasks and 25 mg of chorionic villi are useful sample amounts for processing.
  • An “individual” is any mammal. In a preferred embodiment, an individual is a human.
  • A CFTR target segment that is amplified and sequenced according to the present invention may represent one or more individual exon(s) or portion(s) of exon(s) of the CFTR gene or one or more portions of a CFTR mRNA. A target segment also may include the CFTR promoter region and/or one or more CFTR introns. In some embodiments the target segments represent the entire CFTR gene or the entire CFTR coding region. In a preferred embodiment the target segments represent the entire CFTR coding region, at least one intron or a portion thereof, and an adjacent region located immediately upstream (in the 5′ direction) of the coding sequence. The adjacent, upstream region may consist of from about 100 nucleotides up to about 500, 750, 1000, 1100, or 1200 nucleotides of the sequence located immediately upstream of the CFTR coding sequence. In some embodiments, the adjacent, upstream region comprises all or a portion of the CFTR promoter sequence.
  • In accordance with the present invention, each CFTR nucleic acid target segment may be amplified with an oligonucleotide primer or primer pair specific to the target segment. In some embodiments a single primer or one or both primers of a primer pair comprise a specific adapter sequence (also referred to as a sequencing adapter) ligated to the 5′ end of the target specific sequence portion of the primer. This sequencing adapter is a short oligonucleotide of known sequence that can provide a priming site for both amplification and sequencing of the adjoining, unknown nucleic acid. As such, adapters allow binding of a fragment to a flow cell for next generation sequencing. Any adapter sequence may be included in a primer used in the present invention.
  • In some embodiments, all forward amplicons (i.e., amplicons extended from forward primers that hybridized with antisense strands of a target segment) contain the same adapter sequence. In some embodiments when double stranded sequencing is performed, all forward amplicons contain the same adapter sequence and all reverse amplicons (i.e., amplicons extended from reverse primers that hybridized with sense strands of a target segment) contain an adapter sequence that is different from the adapter sequence of the forward amplicons.
  • In a particular embodiment, the “forward” adapter sequence consists of or comprises ACACTGACGACATGGTTCTACA (SEQ ID NO:1) or a sequence 90%, 95% or 99% identical to SEQ ID NO:1, and the reverse adapter sequence consists of or comprises TACGGTAGCAGAGACTTGGTCT (SEQ ID NO:2) or a sequence 90%, 95% or 99% identical to SEQ ID NO:2.
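  • A minimal sketch of how an adapter-tagged primer might be assembled in silico: the adapter is placed at the 5′ end of the target-specific portion, as described above. The adapter sequences are SEQ ID NOs: 1 and 2, and the target-specific sequences are primers E1F and E1R from Table 1; the function name `fusion_primer` is an illustrative assumption.

```python
FORWARD_ADAPTER = "ACACTGACGACATGGTTCTACA"  # SEQ ID NO:1
REVERSE_ADAPTER = "TACGGTAGCAGAGACTTGGTCT"  # SEQ ID NO:2


def fusion_primer(adapter: str, target_specific: str) -> str:
    """Join the adapter to the 5' end of the target-specific sequence (5'->3')."""
    return adapter + target_specific


e1_forward = fusion_primer(FORWARD_ADAPTER, "GTCTTTGGCATTAGGAGCTTGAGC")  # E1F
e1_reverse = fusion_primer(REVERSE_ADAPTER, "CTTTCGTGGGCACGTGTCTTTC")    # E1R
print(e1_forward, len(e1_forward))  # 46 nt
print(e1_reverse, len(e1_reverse))  # 44 nt
```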
  • Other adapter sequences are known in the art. Some manufacturers recommend specific adapter sequences for use with the particular sequencing technology and machinery that they offer.
  • In some cases, amplicons from a single sample source further comprise an identical index sequence (also referred to as an index tag, a “barcode” or a multiplex identifier (MID)). In some cases, indexed amplicons are generated using primers (for example, forward primers and/or reverse primers) containing the index sequence. Such indexed primers may be included during library preparation as a “barcoding” tool to identify specific amplicons as originating from a particular sample source. Indexed amplicons from more than one sample source are quantified individually and then pooled prior to sequencing. As such, the use of index sequences permits multiple samples (i.e., samples from more than one sample source) to be pooled per sequencing run and the sample source subsequently ascertained based on the index sequence.
  • When adapter-ligated and/or indexed primers are employed to amplify a CFTR target segment, the adapter sequence and/or index sequence gets incorporated into the amplicon (along with the target-specific primer sequence) during amplification. Therefore, the resulting amplicons are sequencing-competent and do not require the traditional library preparation protocol. Moreover, the presence of the index tag permits the differentiation of sequences from multiple sample sources.
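  • How index tags allow pooled reads to be assigned back to their sample sources can be sketched as follows. The sketch assumes, for illustration only, that the index occupies the first bases of each read and is matched exactly; the 4-nt index sequences, sample names and the function name `demultiplex` are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample, index_length):
    """Group reads by sample using the leading index; the index is trimmed off."""
    by_sample = defaultdict(list)
    for read in reads:
        index = read[:index_length]
        if index in index_to_sample:
            by_sample[index_to_sample[index]].append(read[index_length:])
        else:
            by_sample["unassigned"].append(read)  # keep unrecognized reads intact
    return dict(by_sample)


indexes = {"ACGT": "sample_1", "TGCA": "sample_2"}  # hypothetical index tags
pooled = ["ACGTGGGCCC", "TGCAAATTT", "ACGTTTTT", "NNNNAAAA"]
print(demultiplex(pooled, indexes, 4))
# {'sample_1': ['GGGCCC', 'TTTT'], 'sample_2': ['AATTT'], 'unassigned': ['NNNNAAAA']}
```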
  • In some embodiments, sequencing templates (amplicons) are prepared by emulsion-based clonal amplification of target segments using specialized fusion primers (containing an adapter sequence) and capture beads. A single adapter-bound fragment is attached to the surface of a bead, and an oil emulsion containing necessary amplification reagents is formed around the bead/fragment component. Parallel amplification of millions of beads with millions of single strand fragments produces a sequencer-ready library.
  • In some embodiments the amplicons constituting the adapter-tagged (and, optionally, indexed) amplicon library are produced by polymerase chain reaction (PCR). In some embodiments, the amplicon library is generated using a multiplexed PCR approach, such as that disclosed in U.S. Pat. No. 8,092,996, incorporated by reference herein in its entirety.
  • Bridge PCR is yet another method for in vitro clonal amplification after a library is generated, in preparation for sequencing. This process is a means to clonally amplify a single target molecule, a member of a library, in a defined physical region such as a solid surface, for example, a bead in suspension or a cluster on a glass slide. In this method, fragments are amplified using primers attached to the solid surface forming “DNA colonies” or “DNA clusters”. This method is used in some of the genome analyzer sequencers manufactured by Illumina, Inc. (San Diego, Calif.).
  • Alternatively, each CFTR nucleic acid target segment may be amplified with non-adapter-ligated and/or non-indexed primers and a sequencing adapter and/or an index sequence may be subsequently ligated to each of the resulting amplicons.
  • Following the production of an adapter tagged and, optionally indexed, amplicon library, the amplicons are sequenced using high throughput, massively parallel sequencing (i.e. next generation sequencing). Methods for performing high throughput, massively parallel sequencing are known in the art. The capacity offered by next generation sequencing has revolutionized amplicon sequencing. Companies such as RainDance Technologies, Inc. (Lexington, Mass.) and Fluidigm Corporation offer platforms which generate libraries that are sequencing-competent and composed purely of targeted sequences. By enabling high-throughput, mini PCR setup, these technologies are ideal for preparing amplicon libraries. One drawback of PCR-based approaches is the limitation of amplicon length, which is determined by PCR itself. However, by targeting overlapping regions, this problem can be circumvented.
  • In some embodiments, high throughput, massively parallel sequencing employs sequencing-by-synthesis with reversible dye terminators. In other embodiments, sequencing is performed via sequencing-by-ligation. In yet other embodiments, sequencing is single molecule sequencing.
  • Sequencing by synthesis, like the “old style” dye-termination electrophoretic sequencing, relies on incorporation of nucleotides by a DNA polymerase to determine the base sequence. Reversible terminator methods use reversible versions of dye-terminators, adding one nucleotide at a time, detecting fluorescence at each position by repeated removal of the blocking group to allow polymerization of another nucleotide. The signal of nucleotide incorporation can vary, with fluorescently labeled nucleotides, phosphate-driven light reactions and hydrogen ion sensing all having been used. The MiSeq® personal sequencing system (Illumina, Inc.) employs sequencing by synthesis with reversible terminator chemistry.
  • In contrast to the sequencing by synthesis method, the sequencing by ligation method uses a DNA ligase to determine the target sequence. This sequencing method relies on enzymatic ligation of oligonucleotides that are adjacent through local complementarity on a template DNA strand. This technology employs a partition of all possible oligonucleotides of a fixed length, labeled according to the sequenced position. Oligonucleotides are annealed and ligated and the preferential ligation by DNA ligase for matching sequences results in a dinucleotide encoded color space signal at that position (through the release of a fluorescently labeled probe that corresponds to a known nucleotide at a known position along the oligo). This method is primarily used by Life Technologies' SOLiD™ sequencers.
  • The Ion Torrent™ (Life Technologies, Carlsbad, Calif.) amplicon sequencing system employs a flow-based approach that detects pH changes caused by the release of hydrogen ions during incorporation of unmodified nucleotides in DNA replication. For use with this system, a sequencing library is initially produced by generating DNA fragments flanked by sequencing adapters. These fragments are clonally amplified on particles by emulsion PCR. The particles with the amplified template are then placed in a silicon semiconductor sequencing chip. During replication, the chip is flooded with one nucleotide after another, and if a nucleotide complements the DNA molecule in a particular microwell of the chip, then it will be incorporated. A proton is naturally released when a nucleotide is incorporated by the polymerase in the DNA molecule, resulting in a detectable local change of pH. The pH of the solution then changes in that well and is detected by the ion sensor.
  • The 454™ GS FLX™ sequencing system (Roche, Germany) employs a light-based detection methodology in a large-scale parallel pyrosequencing system. Pyrosequencing uses DNA polymerization, adding one nucleotide species at a time and detecting and quantifying the number of nucleotides added to a given location through the light emitted by the release of attached pyrophosphates. For use with the 454™ system, adapter-ligated DNA fragments are fixed to small DNA-capture beads in a water-in-oil emulsion and amplified by PCR (emulsion PCR). Each DNA-bound bead is placed into a well on a picotiter plate and sequencing reagents are delivered across the wells of the plate. The four DNA nucleotides are added sequentially in a fixed order across the picotiter plate device during a sequencing run. During the nucleotide flow, millions of copies of DNA bound to each of the beads are sequenced in parallel. When a nucleotide complementary to the template strand is added to a well, the nucleotide is incorporated onto the existing DNA strand, generating a light signal that is recorded by a CCD camera in the instrument.
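  • The flow-based detection principle shared by the pH-sensing and pyrosequencing systems described above can be sketched as a simple simulation: nucleotides are flowed one species at a time in a fixed order, and each flow yields a signal proportional to the number of bases incorporated. The flow order `TACG`, the idealized noise-free signal and the function name `flowgram` are assumptions made for illustration; the input is the strand being synthesized, written 5′ to 3′.

```python
def flowgram(nascent_strand: str, flow_order: str = "TACG", max_flows: int = 100):
    """Return (nucleotide, bases_incorporated) for each flow until synthesis completes."""
    seq = nascent_strand.upper()
    signals = []
    position = 0
    flow = 0
    while position < len(seq) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        incorporated = 0
        # A whole homopolymer run is incorporated within a single flow.
        while position < len(seq) and seq[position] == base:
            incorporated += 1
            position += 1
        signals.append((base, incorporated))
        flow += 1
    return signals


print(flowgram("TTAGGC"))
# [('T', 2), ('A', 1), ('C', 0), ('G', 2), ('T', 0), ('A', 0), ('C', 1)]
```

  • A signal of 2 corresponds to a two-base homopolymer; distinguishing longer homopolymer runs from signal intensity alone is a known difficulty of flow-based chemistries.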
  • In some embodiments, amplicons from more than one sample source are pooled prior to high throughput sequencing. “Multiplexing” is the pooling of multiple adapter-tagged and indexed libraries into a single sequencing run. When indexed primer sets are used, this capability can be exploited for comparative studies. In some embodiments, amplicon libraries from up to 48 separate sources are pooled prior to sequencing.
  • The described methods for determining the presence or absence of base changes, gene deletions and gene duplications in a CFTR nucleic acid may be used for determining a genetic basis for cystic fibrosis. Accordingly, one aspect of the present invention provides a method for diagnosing a genetic basis for cystic fibrosis in an individual comprising (a) producing an adapter-tagged amplicon library by amplifying multiple target segments of a sample CFTR nucleic acid from said individual, (b) determining the nucleotide sequences of the target segments by sequencing the amplicons using high throughput massively parallel sequencing, and (c) determining that the individual has a genetic basis for being affected with cystic fibrosis or for being a cystic fibrosis carrier if or when the nucleotide sequence of one or more of the target segments contains a mutation associated with cystic fibrosis.
  • The present invention can additionally be used to detect one or more rare CFTR mutations or private mutations in a CFTR nucleic acid from an individual, thereby identifying an individual who possesses one or more rare or private CFTR mutation(s). In some embodiments, the present invention is used to identify rare familial mutations in an obligate cystic fibrosis carrier after the carrier has tested negative in a routine screening test for common mutations. Such routine screening tests may include Cystic Fibrosis Screen: Detectable Mutations, CF Mutation Screen, Cystic Fibrosis Mutation Screen, CFTR Screen, Cystic Fibrosis Screen, Cystic Fibrosis Carrier Screen, and CF-60. The present invention can also be used to identify rare mutations in a cystic fibrosis-affected (i.e. symptomatic) individual who has not had two CFTR sequence mutations identified by at least one routine cystic fibrosis mutation screening test.
  • In some embodiments, the methods disclosed herein are employed to confirm cystic fibrosis carrier status in an individual such as, for example, a parent, a sibling or other relatives of a cystic fibrosis-affected individual with one or more rare or private mutations. In some embodiments, the present invention is used for prenatal diagnosis of an individual, in particular, an individual who is related to a cystic fibrosis-affected individual or who is suspected of being a cystic fibrosis carrier.
  • In some aspects of the present invention, at least 2, 5, 10, 20, 25, or 28 and up to 25, 29, or 30, target segments of the CFTR gene may be sequenced with gains and losses of genomic sequence (>50 bp) determined using a read depth approach. In one approach, 29 target segments are sequenced, representing the CFTR coding region (including all exons/intron junctions). In another embodiment, the CFTR coding region (including all exons/intron junctions) in addition to about 1 kb upstream and about 300 kb downstream of the CFTR gene are assayed.
  • The sequences of substantially pure nucleic acid primers which are DNA (or an RNA equivalent) and which are useful for amplifying the promoter region, all of the CFTR exons and intron/exon junctions, and a region immediately downstream of the CFTR gene are shown in Table 1. The letter F or R at the end of the primer name indicates whether the primer is a forward (F) or reverse (R) PCR primer. In some embodiments, the primers of Table 1 are used with Ion Torrent Personal Genome Machine™ and/or Illumina MiSeq® Personal Sequencing System. In some embodiments, the primers of Table 2 are used with a Roche/454™ GS FLX™ sequencer and/or Sanger sequencing. In a preferred embodiment, one or more primers consisting of or comprising any of SEQ ID NOs: 3-54 and 107-140 further comprise sequencing adapter sequence SEQ ID NO:1. In another preferred embodiment, one or more primers consisting of or comprising any of SEQ ID NOs: 55-106 and 141-174 further comprise sequencing adapter sequence SEQ ID NO:2.
  • TABLE 1
    CFTR Primer Sequences for Amplicon Sequencing
    SEQ ID NO: Primer Name Primer Sequence Hybridizes to
    3 P1F AAAGGATAGACAAGGAACACATCCTGG promoter
    4 P2F CTAATAAAGCTTGGTTCTTTTCTCCGAC promoter
    5 P3F ACCTTGCAAACGTAACAGGAACCC promoter
    6 P4F CGGTGGCTTCTTCTGTCCTCCA promoter
    7 P5F GTCAGAATCGGGAAAGGGAGGTG promoter
    8 P6F GGGGAAAGAGCAAAAGGAAGGG promoter
    9 E1F GTCTTTGGCATTAGGAGCTTGAGC Exon 1
    10 E2F TCAAGTGAATATCTGTTCCTCCTCTCTTT Exon 2
    11 E3F GCACATGCAACTTATTGGTCCCAC Exon 3
    12 E4aF ATGAAATTTAATTTCTCTGTTTTTCCCC Exon 4
    13 E4bF AGGCTTATGCCTTCTCTTTATTGTGAG Exon 4
    14 E5F TTTGTTGAAATTATCTAACTTTCCATTTTTC Exon 5
    15 E6F CACCTGTTTTTGCTGTGCTTTTATTTTC Exon 6
    16 E7F TACTATTAGATTGATTGATTGATTGATTGATT Exon 7
    17 E8aF CTCAGATCTTCCATTCCAAGATCCC Exon 8
    18 E8bF CTTCCCTATGCACTAATCAAAGGAATC Exon 8
    19 E9F GCTATTCTGATTCTATAATATGTTTTTGCTCTC Exon 9
    20 E9outerF GAGTITATTTCAAATATGATGAATCCTAGTGCTTGGC Exon 9
    21 E10aF CTTTTCAAACTAATTGTACATAAAACAAGCATC Exon 10
    22 E10bF AAACAATAACAATAGAAAAACTTCTAATGGTG Exon 10
    23 E11aF TGACCTAATAATGATGGGTTTTATTTCC Exon 11
    24 E11bF TTTCCTGGATTATGCCTGGCAC Exon 11
    25 E12F ACTAAAAGTGACTCTCTAATTTTCTATTTTTGG Exon 12
    26 I12F AATTTCTTAATTGTGTGCTGAATACAATTTTC Intron 12
    27 E13F GAGAGGAAATGTAATTTAATTTCCATTTTC Exon 13
    28 14CFz GCATGAAGGTAGCAGCTATTTTTATGGG Exon 14
    29 E14aF GCTAAAATACGAGACATATTGCAATAAAGTATT Exon 14
    30 E14bF AAAACTAGGATTTTGGTCACTTCTAAAATG Exon 14
    31 E14cF GAACTCCAAAATCTACAGCCAGACTTTAG Exon 14
    32 E14dF TTCTCATTAGAAGGAGATGCTCCTGTC Exon 14
    33 E14eF CAATCAACTCTATACGAAAATTTTCCATTG Exon 14
    34 E14fF TGTCCTTAGTACCAGATTCTGAGCAGG Exon 14
    35 E14gF CTCAGTTAACCAAGGTCAGAACATTCAC Exon 14
    36 E15F CTGTCTTATTGTAATAGCCATAATTCTTTTATTC Exon 15
    37 E16F AAATCAACTGTGTCTTGTTCCATTCC Exon 16
    38 E17aF TGCCAAATAACGATTTCCTATTTGC Exon 17
    39 E17bF GTGTTTTACATTTACGTGGGAGTAGCC Exon 17
    40 E18F TTTTGAGGAATTTGTCATCTTGTATATTAT Exon 18
    41 E19F CTCACCAACATGTTTTCTTTGATCTTAC Exon 19
    42 E20aF TTGCAATGTTTTCTATGGAAATATTTCAC Exon 20
    43 E20bF CTTACTTTGAAACTCTGTTCCACAAAGC Exon 20
    44 E21F GAGGTTCATTTACGTCTTTTGTGCATC Exon 21
    45 E22aF GTGAAATTGTCTGCCATTCTTAAAAACA Exon 22
    46 E22bF GTGAAGAAAGATGACATCTGGCCC Exon 22
    47 I22F CCTTGTGGATCTAAATTTCAGTTGACTTG Intron 22
    48 E23F CAGAAGTGATCCCATCACTTTTACCTTAT Exon 23
    49 E24F TTCATACTTTCTTCTTCTTTTCTTTTTTGC Exon 24
    50 E25F CTCTGTGGTATCTGAACTATCTTCTCTAACTG Exon 25
    51 E26F GATCATTACTGTTCTGTGATATTATGTGTGG Exon 26
    52 E27aF CTCTGGTCTGACCTGCCTTCTGTC Exon 27
    53 E27bF CCAGAAACTGCTGAACGAGAGGAG Exon 27
    54 3UF CAGAAGAAGAGGTGCAAGATACAAGG 3′ UTR
    55 P1R CATTTACCTTAGCGCTTCCTTTGCG promoter
    56 P2R CTCCTCCTTTTCCCGATGATCCTAG promoter
    57 P3R CTCTCTTTAGGTCCAGTTGGCAACG promoter
    58 P4R CCTTCCTCCTCTCCTCCTTCGCT promoter
    59 P5R AATTCCCCCCACCCACCCCTACTC promoter
    60 P6R CCTTTTCCAGAGGCGACCTCTG promoter
    61 E1R CTTTCGTGGGCACGTGTCTTTC Exon 1
    62 E2R TTCTCTTCTCTAAATAATTAATAATATGAATTTCTC Exon 2
    63 E3R GTGATACATAATGAATGTACAAATGAGATCC Exon 3
    64 E4aR GCTGGGTGTAGGAGCAGTGTCCT Exon 4
    65 E4bR CATGGGGCCTGTGCAAGGAAG Exon 4
    66 E5R TAACCACTAATTACTATTATCTGACCCAGG Exon 5
    67 E6R TTTAAAACTTTCAAGTTATGAAAATAGGTTGC Exon 6
    68 E7R AAGGACAGAATTACTAACAATATTGAAATTATTG Exon 7
    69 E8aR GATGGTGGTGAATATTTTCCIGAG Exon 8
    70 E8bR TATTTAAATCATAGTATATAATGCAGCATTATGGTAC Exon 8
    71 E9R GAAGAAAACAGTTAGGTGTTTAGAGCAAAC Exon 9
    72 E9outerR CGCCATTAGGATGAAATCCITATTCACAAAG Exon 9
    73 E10aR AAGAAGTGAGAAATTACTGAAGAAGAGGCT Exon 10
    74 E10bR CAAATTAAGTTCTTAATAGTGAAGAACAAAAGAAC Exon 10
    75 E11aR ATCATAGGAAACACCAAAGATGATATTTTC Exon 11
    76 E11bR GGTTCATATGCATAATCAAAAAGTTTTCAC Exon 11
    77 E12R GCAAATGCTTGCTAGACCAATAATTAG Exon 12
    78 I12R GAACAGTAATAAAGATGAAGACACAGTTCCC Intron 12
    79 E13R GCATGAGCATTATAAGTAAGGTATTCAAAG Exon 13
    80 14DRz GGTACTAAGGACAGCCTTCTCTCTAAAG Exon 14
    81 E14aR CAAAATTAATATTTTGTCAGCTTTCTTTAAATG Exon 14
    82 E14bR GAAAGAATCACATCCCATGAGTTTTG Exon 14
    83 E14cR AAGATTGTTTTTTTGTTTCTGTCCAGG Exon 14
    84 E14dR CTAAGGACAGCCTTCTCTCTAAAGGC Exon 14
    85 E14eR TCCTTCGTGCCTGAAGCGTGG Exon 14
    86 E14fR CACTTTTCGTGTGGATGCTGTTG Exon 14
    87 E14gR GTGAAATACCCCCAAGCGATGTATAC Exon 14
    88 E15R CTTTAAATCCAGTAATACTTTACAATAGAACATTC Exon 15
    89 E16R ACAAAGTGGATTACAATACATACAAACATAGTG Exon 16
    90 E17aR GAAGAATCCCATAGCAAGCAAAGTG Exon 17
    91 E17bR GGATCAGCAGTTTCATTTCTTAGACCTAG Exon 17
    92 E18R TAATAATACAGACATACTTAACGGTACTTATTTTTAC Exon 18
    93 E19R CAAGATGAGTATCGCACATTCACTGTC Exon 19
    94 E20aR CAAGAACCAGTTGGCAGTATGTAAATTC Exon 20
    95 E20bR CTTAAATGCTTAGCTAAAGTTAATGAGTTCATAG Exon 20
    96 E21R TTTTTCATAAAAGTTAAAAAGATGATAAGACTT Exon 21
    97 E22aR ATCTTTGACAGTCATTTGGCCCC Exon 22
    98 E22bR GTCTAACAAAGCAAGCAGTGTTCAAATC Exon 22
    99 I22R GGTGCTAGCTGTAATTGCATTGTACC Intron 22
    100 E23R CTTTTTTCTGGCTAAGTCCTTTTGC Exon 23
    101 E24R CCTTTCAAAATCATTTCAGTTAGCAGC Exon 24
    102 E25R GTGCTATTAAGTAACAGAACATCTGAAACTC Exon 25
    103 E26R AATTACAAGGGCAATGAGATCTTAAGTAAAG Exon 26
    104 E27aR TGGGGAAAGAGCTTCACCCTGT Exon 27
    105 E27bR GTCCCATGTCAACATTTATGCTGC Exon 27
    106 3UR CATATCAGTGTCCTCAATTCCCCTTAC 3′ UTR
  • TABLE 2
    CFTR Primer Sequences for Amplicon Sequencing
    SEQ ID NO: Primer Name Primer Sequence Hybridizes to
    107 q-PROMOTER-1-1F CGTGTCCTAAGATTTCTGTG promoter
    108 q-PROMOTER-2-1F TGCCAACTGGACCTAAAG promoter
    109 q1e1F CACCCAGAGTAGTAGGTCTTTG Exon 1
    110 q2e2F CATAATTTTCCATATGCCAG Exon 2
    111 s3e1F CTTGGGTTAATCTCCTTGGA Exon 3
    112 q4e1F AAAGTCTTGTGTTGAAATTCTCAGG Exon 4
    113 g5e3F ACATTTATGAACCTGAGAAG Exon 5
    114 q6ae1F GGGGTGGAAGATACAATGAC Exon 6
    115 q6be2F AAAATAATGCCCATCTGTTG Exon 7
    116 q7e3F CTTCCATTCCAAGATCCC Exon 8
    117 q8e1F GATGTAGCACAATGAGAGTATAAAG Exon 9
    118 g9e9F TGGATCATGGGCCATGTGC Exon 10
    119 s10e3F AGCAGAGTACCTGAAACAGGA Exon 11
    120 g11e1F CAGATTGAGCATACTAAAAGTG Exon 12
    121 q11i4F GTGTGCTGAATACAATTTTC Intron 12
    122 s12e1F GTGAATCGATGTGGTGACCA Exon 13
    123 q13-1e1F CGAGGATAAATGATTTGCTCAAAG Exon 14
    124 q13-2e1F TCCTAACTGAGACCTTACAC Exon 14
    125 q14ae5F GTGGCATGAAACTGTACTGT Exon 15
    126 q14be2F ATGGGAGGAATAGGTGAAGA Exon 16
    127 q15e3F GGTTAAGGGTGCATGCTCTTC Exon 17
    128 q16e4F CTACTGTGATCCAAACTTAGTATTG Exon 18
    129 q17ae1F ACACTTTGTCCACTTTGC Exon 19
    130 q17be1F ATCTATTCAAAGAATGGCAC Exon 20
    131 q18e1F TAGATGCTGTGATGAACTG Exon 21
    132 q19e3F CCCGACAAATAACCAAGTGAC Exon 22
    133 q19i2F GAATCATTCAGTGGGTATAAGCAG Intron 22
    134 g20e3F TCTCTATTCTGTTCCAAGG Exon 23
    135 g21e1F TGATGGTAAGTACATGGGTG Exon 24
    136 q22e1F CTGTCAAGGTTGTAAATAGAC Exon 25
    137 q23e1F CTGTTCTGTGATATTATGTGTG Exon 26
    138 q24e1F TATTTTCCTTTGAGCCTG Exon 27
    139 CFTR-22.2F CTTAATTGTGTGCTGAATACAATTTTC Intron 12
    140 CFTR-31.2F GAATCATTCAGTGGGTATAAGCAG Intron 22
    141 q-PROMOTER-1-1R CCTTTCCCGATTCTGACTC promoter
    142 q-PROMOTER-2-1R CCAAACCCAACCCATACAC promoter
    143 q1e1R CAAACCCAACCCATACACAC Exon 1
    144 q2e2R CTATGTTTGCTTTCTCTTCTC Exon 2
    145 s3e2R ATTCACCAGATTTCGTAGTC Exon 3
    146 q4e1R CCAGCTCACTACCTAATTTATGACAT Exon 4
    147 g5e4R CAGAATAGGGAAGCTAGAG Exon 5
    148 q6ae1R CATAGAGCAGTCCTGGTTTTAC Exon 6
    149 q6be2R GTGGAAGTCTACCATGATAAACATA Exon 7
    150 q7E4R GCAAAGTTCATTAGAACTGATC Exon 8
    151 q8e1R CACAAAGAAGAAAACAGTTAGG Exon 9
    152 g9e11R AAAGAGACATGGACACCAAATTAAG Exon 10
    153 s10e3R CCATTCACAGTAGCTTACCCA Exon 11
    154 g11e2R TACATGAATGACATTTACAGCA Exon 12
    155 q11i4R AAGATGAAGACACAGTTCCC Intron 12
    156 s12e1R CTGGTTTAGCATGAGGCGGT Exon 13
    157 q13-1e2R TCGTATAGAGTTGATTGGATTGAGA Exon 14
    158 q13-2e1R TTCTGTGGGGTGAAATAC Exon 14
    159 q14ae6R CACATCCCCAAACTATCTTAA Exon 15
    160 q14be2R TGGATTACAATACATACAAACA Exon 16
    161 q15e4R GGCCCTATTGATGGTGGATC Exon 17
    162 q16e5R AGGTAAGCAGTTCTGACTTATTA Exon 18
    163 q17ae1R CAGATGAGTATCGCACATTC Exon 19
    164 q17be1R GATAACCTATAGAATGCAGC Exon 20
    165 q18e1R GAAGGAAAGAAGAGATAAGG Exon 21
    166 q19e4R CGCTAACACATTGCTTCAGGCTAC Exon 22
    167 q19i3R CTTCAATGCACCTCCTCCC Intron 22
    168 g20e4R ACAAGTATCAAATAGCAG Exon 23
    169 g21e2R CAAAAGTACCTGTTGCTCCA Exon 24
    170 q22e1R AAGCAGGCATAATGATTC Exon 25
    171 q23e1R AATTACAAGGGCAATGAG Exon 26
    172 q24e1R GCAGAGGTAACTGTTCCAC Exon 27
    173 CFTR-22.2F AGTAATAAAGATGAAGACACAGTTCCC Intron 12
    174 CFTR-31.2R CTTCAATGCACCTCCTCCC Intron 22
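  • For readers who want to work with the primer tables programmatically, the following is a minimal sketch of how one row could be read as four fields (SEQ ID NO, primer name, primer sequence, hybridization target) and grouped by target. The `Primer` type and `parse_primer_row` helper are hypothetical names introduced here for illustration; they are not part of the disclosure, and the sketch assumes each row sits on a single line.

```python
from typing import NamedTuple


class Primer(NamedTuple):
    seq_id: int     # SEQ ID NO
    name: str       # primer name as printed in the table
    sequence: str   # primer sequence as printed in the table
    target: str     # region the primer hybridizes to


def parse_primer_row(row: str) -> Primer:
    """Split one table row into its four fields.

    Assumes the layout '<SEQ ID NO> <name> <sequence> <target>', where the
    target may contain a space (for example 'Exon 12' or 'Intron 22').
    """
    seq_id, name, sequence, target = row.split(maxsplit=3)
    return Primer(int(seq_id), name, sequence, target)


# Four rows copied from Table 2.
rows = [
    "109 q1e1F CACCCAGAGTAGTAGGTCTTTG Exon 1",
    "143 q1e1R CAAACCCAACCCATACACAC Exon 1",
    "121 q11i4F GTGTGCTGAATACAATTTTC Intron 12",
    "155 q11i4R AAGATGAAGACACAGTTCCC Intron 12",
]

# Group the parsed primers by the region they hybridize to.
by_target = {}
for row in rows:
    primer = parse_primer_row(row)
    by_target.setdefault(primer.target, []).append(primer)

for target, primers in by_target.items():
    print(target, [p.name for p in primers])
```

  • Grouping by target rather than by primer name keeps the sketch independent of the naming scheme, which differs between Table 1 and Table 2.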
  • The following examples serve to illustrate the present invention. These examples are in no way intended to limit the scope of the invention.
  • EXAMPLE Amplicon Library Generation:
  • Genomic DNA was isolated from either whole blood or paraffin-embedded tissue. CFTR amplicon libraries were created for samples from 48 different sources. The CFTR gene is one of a select few genes that to date have been extensively and exhaustively sequenced and, as such, it has been annotated with many polymorphisms. Avoiding these polymorphisms made the selection of primer and/or probe binding sites particularly difficult. Libraries were generated using primers from Table 1 or Table 2 and size-selected using either AMPure® beads or eGel.
  • The forward primers of Tables 1 and 2 each had an adapter oligonucleotide ligated to the 5′ end of the primer. The sequence of the forward primer adapter was 5′-ACACTGACGACATGGTTCTACA-3′ (SEQ ID NO: 1). The reverse primers of Tables 1 and 2 each had an adapter oligonucleotide ligated to the 5′ end of the primer. The sequence of the reverse primer adapter was 5′-TACGGTAGCAGAGACTTGGTCT-3′ (SEQ ID NO: 2).
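  • The adapter arrangement described above can be sketched as a simple string operation: the adapter is placed 5′ of the target-specific primer, so the tagged oligonucleotide reads adapter first, then primer. The helper name `tag_primer` is hypothetical and used only for illustration; the adapter sequences are SEQ ID NOs: 1 and 2, and the example primers are the exon 19 pair from Table 1 (SEQ ID NOs: 41 and 93).

```python
FORWARD_ADAPTER = "ACACTGACGACATGGTTCTACA"  # SEQ ID NO: 1
REVERSE_ADAPTER = "TACGGTAGCAGAGACTTGGTCT"  # SEQ ID NO: 2


def tag_primer(primer: str, is_forward: bool) -> str:
    """Return the primer with the matching adapter joined to its 5' end.

    Sequences are written 5'->3', so the adapter is prepended.
    """
    adapter = FORWARD_ADAPTER if is_forward else REVERSE_ADAPTER
    return adapter + primer


# Exon 19 primer pair from Table 1.
E19F = "CTCACCAACATGTTTTCTTTGATCTTAC"  # SEQ ID NO: 41
E19R = "CAAGATGAGTATCGCACATTCACTGTC"   # SEQ ID NO: 93

e19_forward = tag_primer(E19F, is_forward=True)
e19_reverse = tag_primer(E19R, is_forward=False)
print(e19_forward)
print(e19_reverse)
```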
  • In addition, the high GC content of the CFTR promoter region made it difficult to determine suitable thermal cycling conditions during library generation. The PCR protocol ultimately employed is shown in Table 3.
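  • As a rough illustration of the GC-content issue, the fraction of G and C bases can be computed directly from a sequence. The sketch below applies it to two promoter primers and one exon primer from Table 1; primer composition is used here only as an accessible stand-in, on the assumption that it loosely reflects the surrounding region, and the function name `gc_fraction` is a hypothetical helper rather than anything taken from the disclosure.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    return sum(base in "GC" for base in sequence) / len(sequence)


# Primer sequences copied from Table 1.
primers = {
    "P4R (promoter)": "CCTTCCTCCTCTCCTCCTTCGCT",            # SEQ ID NO: 58
    "P5R (promoter)": "AATTCCCCCCACCCACCCCTACTC",           # SEQ ID NO: 59
    "E21R (exon 21)": "TTTTTCATAAAAGTTAAAAAGATGATAAGACTT",  # SEQ ID NO: 96
}

for name, sequence in primers.items():
    print(f"{name}: GC fraction {gc_fraction(sequence):.2f}")
```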
  • TABLE 3
    PCR Protocol
    Stage 1 (each step run once):
    50° C. for 2 minutes
    70° C. for 20 minutes
    95° C. for 10 minutes
    Stage 2 (4 cycles):
    95° C. for 30 seconds
    65° C. for 30 seconds
    72° C. for 1 minute
    Stage 3 (8 cycles):
    95° C. for 15 seconds
    80° C. for 30 seconds
    60° C. for 30 seconds
    72° C. for 1 minute
    Stage 4 (8 cycles):
    95° C. for 15 seconds
    60° C. for 30 seconds
    72° C. for 1 minute
    Stage 5 (2 cycles):
    95° C. for 15 seconds
    80° C. for 30 seconds
    60° C. for 30 seconds
    72° C. for 1 minute
    Stage 6 (12 cycles):
    95° C. for 15 seconds
    60° C. for 30 seconds
    72° C. for 1 minute
    Stage 7 (6 cycles):
    95° C. for 15 seconds
    80° C. for 30 seconds
    60° C. for 30 seconds
    72° C. for 1 minute
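  • The protocol of Table 3 can be written down as a small data structure, which makes the total cycle count and the summed hold time easy to check. This is a sketch under one stated assumption: the trailing number on the first row of each block in Table 3 is read as a stage number (1 to 7), and the "X" values as the number of times the steps of that stage are cycled together. Ramp times between temperatures are not given in the table and are ignored.

```python
# Each stage: (list of (temperature in deg C, hold time in seconds), cycles).
# Assumption: stage grouping and cycle counts as read from Table 3.
PROTOCOL = [
    ([(50, 120), (70, 1200), (95, 600)], 1),        # stage 1
    ([(95, 30), (65, 30), (72, 60)], 4),            # stage 2
    ([(95, 15), (80, 30), (60, 30), (72, 60)], 8),  # stage 3
    ([(95, 15), (60, 30), (72, 60)], 8),            # stage 4
    ([(95, 15), (80, 30), (60, 30), (72, 60)], 2),  # stage 5
    ([(95, 15), (60, 30), (72, 60)], 12),           # stage 6
    ([(95, 15), (80, 30), (60, 30), (72, 60)], 6),  # stage 7
]

# Stage 1 is a single pass of hold steps, so only stages 2-7 are counted
# as amplification cycles.
amplification_cycles = sum(cycles for _, cycles in PROTOCOL[1:])

# Sum of programmed hold times across all stages, excluding ramping.
hold_seconds = sum(
    cycles * sum(seconds for _, seconds in steps) for steps, cycles in PROTOCOL
)

print(amplification_cycles)  # 40
print(hold_seconds / 60)     # 111.0
```

  • Under this reading, the protocol comprises 40 amplification cycles and about 111 minutes of programmed hold time.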

Claims (17)

That which is claimed is:
1. A method for determining the nucleotide sequence of a sample CFTR nucleic acid comprising:
(a) producing an adapter-tagged amplicon library by amplifying multiple target segments of the sample CFTR nucleic acid, wherein each target segment is amplified with a pair of oligonucleotide primers, wherein at least one primer of the primer pair is selected from the group consisting of SEQ ID NOs: 3-174; and
(b) determining the nucleotide sequences of the target segments by sequencing the amplicons in the amplicon library using high throughput massively parallel sequencing.
2. The method of claim 1, wherein the multiple target segments are amplified by PCR.
3. The method of claim 1, wherein the sample CFTR nucleic acid is at least one nucleic acid selected from the group consisting of genomic DNA, mRNA and cDNA.
4. The method of claim 1, wherein an adapter sequence is ligated to one or both ends of each amplicon.
5. The method of claim 1, wherein at least one primer of the primer pair is ligated to a sequencing adapter sequence prior to amplification.
6. The method of claim 1, wherein the amplicons are labeled with an index label that indicates the sample source from which the amplicon is generated.
7. The method of claim 6, wherein the index label is an oligonucleotide.
8. The method of claim 1, wherein the multiple target segments of the sample CFTR nucleic acid, together, span the CFTR coding region and all intron/exon junctions.
9. The method of claim 8, wherein the multiple target segments further span about 1000 nucleotides of the promoter region immediately upstream of the first exon.
10. The method of claim 9, wherein the multiple target segments further span about 200 to 350 nucleotides immediately downstream of the CFTR sequence.
11. The method of claim 1, wherein the at least one primer of the primer pair is selected from the group consisting of SEQ ID NOs: 9-54, 61-106, 109-140, and 143-174.
12. The method of claim 1, wherein the high throughput massively parallel sequencing involves a read depth approach.
13. A kit comprising an oligonucleotide primer selected from the group consisting of SEQ ID NOs: 3-174, wherein the primer further comprises a fluorescent label.
14. The kit of claim 13, wherein the oligonucleotide primer is selected from the group consisting of SEQ ID NOs: 9-54, 61-106, 109-140, and 143-174.
15. The kit of claim 13, wherein the oligonucleotide primer is ligated to a sequencing adapter sequence.
16. The kit of claim 15, wherein the sequencing adapter sequence comprises SEQ ID NO: 1.
17. The kit of claim 15, wherein the sequencing adapter sequence comprises SEQ ID NO: 2.
US17/201,469 2013-03-14 2021-03-15 Method for detecting cystic fibrosis Pending US20210198744A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/201,469 US20210198744A1 (en) 2013-03-14 2021-03-15 Method for detecting cystic fibrosis

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201361785862P 2013-03-14 2013-03-14
PCT/US2014/027870 WO2014152822A2 (en) 2013-03-14 2014-03-14 Method for detecting cystic fibrosis
US201514774331A 2015-09-10 2015-09-10
US16/158,823 US10947592B2 (en) 2013-03-14 2018-10-12 Method for detecting cystic fibrosis
US17/201,469 US20210198744A1 (en) 2013-03-14 2021-03-15 Method for detecting cystic fibrosis

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/158,823 Division US10947592B2 (en) 2013-03-14 2018-10-12 Method for detecting cystic fibrosis

Publications (1)

Publication Number Publication Date
US20210198744A1 true US20210198744A1 (en) 2021-07-01

Family

ID=51581749

Family Applications (3)

Application Number Title Priority Date Filing Date
US14/774,331 Active 2034-10-04 US10100361B2 (en) 2013-03-14 2014-03-14 Method for detecting cystic fibrosis
US16/158,823 Active US10947592B2 (en) 2013-03-14 2018-10-12 Method for detecting cystic fibrosis
US17/201,469 Pending US20210198744A1 (en) 2013-03-14 2021-03-15 Method for detecting cystic fibrosis

Country Status (6)

Country Link
US (3) US10100361B2 (en)
EP (3) EP2971151A4 (en)
CN (2) CN105247073A (en)
BR (1) BR112015022331A2 (en)
CA (2) CA3174919A1 (en)
WO (1) WO2014152822A2 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170238349A1 (en) * 2015-05-15 2017-08-17 Ntt Docomo, Inc. User apparatus and base station
GB2544071A (en) * 2015-11-04 2017-05-10 Univ Pretoria Method and kit for identifying gene mutations
EP3397766B1 (en) * 2015-12-31 2023-05-17 Quest Diagnostics Investments LLC Compositions and methods for screening mutations in thyroid cancer
CN106674344B (en) * 2017-01-20 2020-03-27 首都医科大学附属北京儿童医院 Deletion mutant form of CFTR gene of cystic fibrosis patient and application thereof
WO2018195555A1 (en) * 2017-04-21 2018-10-25 The Board Of Trustees Of The Leland Stanford Junior University Crispr/cas 9-mediated integration of polynucleotides by sequential homologous recombination of aav donor vectors
US12093009B2 (en) * 2021-03-24 2024-09-17 Yokogawa Electric Corporation Onboarding distributed control node using secondary channel

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110230365A1 (en) * 2010-03-22 2011-09-22 Elizabeth Rohlfs Mutations Associated With Cystic Fibrosis
US20120088236A1 (en) * 2010-06-02 2012-04-12 Canon U.S. Life Sciences, Inc. Methods and Systems for Sequential Determination of Genetic Mutations and/or Varients

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9020632D0 (en) 1990-09-21 1990-10-31 Hsc Res Dev Corp Stable propagation of modified full length cystic fibrosis transmembrane conductance regulator protein cdna in heterologous systems
US7741028B2 (en) 1999-11-12 2010-06-22 Ambry Genetics Methods of identifying genetic markers in the human cystic fibrosis transmembrane conductance regulator (CFTR) gene
US20040126760A1 (en) 2001-05-17 2004-07-01 Natalia Broude Novel compositions and methods for carrying out multple pcr reactions on a single sample
US20040110138A1 (en) 2002-11-01 2004-06-10 University Of Ottawa Method for the detection of multiple genetic targets
US20050059035A1 (en) * 2003-09-09 2005-03-17 Quest Diagnostics Incorporated Methods and compositions for the amplification of mutations in the diagnosis of cystic fibrosis
US8092996B2 (en) * 2004-09-16 2012-01-10 Quest Diagnostics Investments Incorporated Method for detecting cystic fibrosis
US8163480B2 (en) 2006-10-05 2012-04-24 Quest Diagnostics Investments Incorporated Nucleic acid size detection method
US7794937B2 (en) * 2006-12-22 2010-09-14 Quest Diagnostics Investments Incorporated Cystic fibrosis transmembrane conductance regulator gene mutations
EP2414547B1 (en) * 2009-04-02 2014-03-12 Fluidigm Corporation Multi-primer amplification method for barcoding of target nucleic acids
DK2652155T3 (en) 2010-12-16 2017-02-13 Gigagen Inc Methods for Massive Parallel Analysis of Nucleic Acids in Single Cells

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Bareil et al; Human Mutation, vol 31: 2010, pages 1011-1019 *
DQ388137 (Accession number DQ388137; Genbank, NCBI, NLM, 2006) *
Mizuki, Psalm; Thesis, Rochester Institute of Technology, 2010 *
Mori (Mori et al; Clinical Biochemistry, vol 33, pages 323-327, 2000) *
Vorkas et al; Journal of Molecular Diagnostics, vol 12, pp 697-704; 2010 *

Also Published As

Publication number Publication date
EP2971151A4 (en) 2016-11-30
US20160032385A1 (en) 2016-02-04
EP4324936A3 (en) 2024-05-15
CA3174919A1 (en) 2014-09-25
CN113151436A (en) 2021-07-23
EP3647436A1 (en) 2020-05-06
US10100361B2 (en) 2018-10-16
US20190100804A1 (en) 2019-04-04
CA2905461C (en) 2022-12-06
BR112015022331A2 (en) 2017-10-10
EP4324936A2 (en) 2024-02-21
CA2905461A1 (en) 2014-09-25
CN105247073A (en) 2016-01-13
EP2971151A2 (en) 2016-01-20
WO2014152822A3 (en) 2014-12-24
WO2014152822A2 (en) 2014-09-25
US10947592B2 (en) 2021-03-16

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
STPP Information on status: patent application and granting procedure in general Free format text: NON FINAL ACTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general Free format text: NON FINAL ACTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general Free format text: FINAL REJECTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: ADVISORY ACTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
STPP Information on status: patent application and granting procedure in general Free format text: NON FINAL ACTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general Free format text: FINAL REJECTION COUNTED, NOT YET MAILED
STPP Information on status: patent application and granting procedure in general Free format text: FINAL REJECTION MAILED
STPP Information on status: patent application and granting procedure in general Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER