
US20150051088A1 - Next-generation sequencing libraries - Google Patents

Info

Publication number
US20150051088A1
Authority
US
United States
Prior art keywords
nucleotide
sequence
nucleic acid
target
sequencing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/463,508
Other languages
English (en)
Inventor
Dae Hyun Kim
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Abbott Molecular Inc
Original Assignee
Abbott Molecular Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Abbott Molecular Inc
Priority to US14/463,508
Assigned to ABBOTT MOLECULAR INC. (assignment of assignors interest; see document for details). Assignors: KIM, DAE HYUN
Publication of US20150051088A1
Current legal status: Abandoned

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/10 - Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 - Isolating an individual clone by screening libraries
    • C12N15/1086 - Preparation or screening of expression libraries, e.g. reporter assays
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 - Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/10 - Processes for the isolation, preparation or purification of DNA or RNA
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 - Methods for sequencing
    • C12Q1/6874 - Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation
    • C - CHEMISTRY; METALLURGY
    • C40 - COMBINATORIAL TECHNOLOGY
    • C40B - COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 - Libraries per se, e.g. arrays, mixtures
    • C40B40/04 - Libraries containing only organic compounds
    • C40B40/06 - Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C - CHEMISTRY; METALLURGY
    • C40 - COMBINATORIAL TECHNOLOGY
    • C40B - COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 - Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 - Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • Provided herein is technology relating to next-generation sequencing and particularly, but not exclusively, to methods, compositions, kits, and systems for preparing a next-generation sequencing library comprising overlapping DNA fragments and using the library to sequence one or more target nucleic acids.
  • Nucleic acid sequences encode the necessary information for living things to function and reproduce. Determining such sequences is therefore a tool useful in pure research into how and where organisms live, as well as in applied sciences such as drug development.
  • sequencing tools are used for diagnosis and to develop treatments for a variety of pathologies, including cancer, infectious disease, heart disease, autoimmune disorders, multiple sclerosis, and obesity.
  • sequencing is used to design improved enzymatic processes and synthetic organisms.
  • such tools are used to study the health of ecosystems, for example, and thus have a broad range of utility.
  • NGS: next-generation sequencing
  • next-generation sequencing platforms are available for the high-throughput, massively parallel sequencing of nucleic acids.
  • Many of these systems, such as the HiSeq and MiSeq systems produced by Illumina, use a sequencing-by-synthesis (SBS) approach, wherein a nucleotide sequence is determined using base-by-base detection and identification. Using this particular approach, identifying 1 base requires 1 cycle of the SBS chemistry process (which may involve four separate reactions separated by washes).
  • SBS: sequencing-by-synthesis
  • a standard SBS protocol is used to acquire ~300-500 bases of sequence from the target template (2 × 150 bases or 2 × 250 bases) and, once the sequences are generated, the barcodes are used to parse and assemble the reads to provide the sequence of the original ~10 kbp DNA.
  • Another method involves creation of an overlapping fragment library suitable for an Illumina sequencer, which produces reads ranging from ~400-460 bases by assembling two ~250-base reads that overlap by ~20-50 bases (see, e.g., Lundin, et al. (2012) Scientific Reports 3: 1186).
  • This overlapping library is constructed mainly by tagging fragments with specific adaptor sequences, followed by a digestion step and a precise size selection process.
  • a technology for sequencing that utilizes a relatively short read length (e.g., less than 100 bases, e.g., ~30-50 bases) to achieve a high-quality, long contiguous sequence comparable or superior to conventional technologies.
  • the technology provided requires only a short period of run-time (e.g., ~3-4 hours) on a sequencer (e.g., Illumina MiSeq platform), thus dramatically decreasing the time dedicated to use of the sequencing apparatus required to complete a sequencing run.
  • the technology results in longer sequences (e.g., ~500 bp to 1000 bp or more of high quality sequence) than conventional technology.
  • run-time does not increase as a function of the size of the nucleic acid to be sequenced because the short read size (e.g., ~30-50 bases) remains the same regardless of the size of the nucleic acid to be sequenced.
  • the technology is not limited to any particular sequencing platform, but is generally applicable and platform independent.
  • similar time reductions are achieved for sequences acquired using, e.g., Life Technologies Ion Torrent and Qiagen GeneReader systems.
  • the technology provided herein reduces that time to approximately 20 to 30 minutes.
  • the technology is applicable to emulsion PCR-based, bead-based, and non-bead-based methods, and thus finds use in the Life Technologies SOLiD systems and the Qiagen NGS sequencing platforms.
  • This technology provides high quality sequence in a decreased sequencing time relative to conventional technologies.
  • the technology is platform agnostic and thus is compatible with extant sequencing apparatuses.
  • the technology in some embodiments, enhances existing NGS platforms by, e.g., increasing the read length of extant platforms and shortening the time to sequence acquisition.
  • an added advantage of the present technology is that it reduces consumption of expensive sequencing reagents and thus can decrease the overall per-base cost of sequencing.
  • the technology involves producing a set of defined overlapping short sequence library inserts (e.g., less than 100 bases, e.g., ~30-50 bases) tiled over a region of a nucleic acid to be sequenced and offset from one another by, e.g., 1-20, 1-10, or 1-5 bases (e.g., in some embodiments, by 1 base).
  • sequence quality is high because each base in the nucleic acid to be sequenced is sequenced with high coverage (e.g., 10-fold to 1000-fold coverage, e.g., 50-fold to 500-fold coverage) depending on the length of the short sequences acquired and the offset between adjacent tiled sequences.
  • the high sampling rate at each base minimizes or eliminates sequencing errors by providing increased information to the assembly process that determines the consensus identity of each base.
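The dependence of coverage on read length and offset described above can be illustrated with a short sketch. The function names `tile_reads` and `per_base_coverage`, the toy target, and the chosen parameters are assumptions made for illustration; the sketch assumes error-free reads tiled at a fixed offset and is not the library preparation method itself.

```python
def tile_reads(target, read_len, offset):
    """Return (start, subsequence) pairs tiled across target at a fixed offset."""
    last_start = len(target) - read_len
    return [(s, target[s:s + read_len]) for s in range(0, last_start + 1, offset)]


def per_base_coverage(target_len, reads):
    """Count how many tiled reads cover each position of the target."""
    cov = [0] * target_len
    for start, seq in reads:
        for i in range(start, start + len(seq)):
            cov[i] += 1
    return cov


# Hypothetical 200-base target, 40-base reads offset by 1 base.
target = "ACGT" * 50
reads = tile_reads(target, read_len=40, offset=1)
coverage = per_base_coverage(len(target), reads)

# Away from the ends, depth equals read_len / offset (here 40 / 1 = 40).
interior_depth = coverage[len(target) // 2]
```

Under these assumptions, interior coverage is roughly the read length divided by the offset, so a larger offset or a shorter read lowers the depth at each base, while positions near the ends of the target are covered by fewer reads.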
  • the first bases (e.g., the first ~20-100 bases)
  • the initial bases determined during the first part of each sequencing run (e.g., the first ~30-50 bases)
  • high quality sequence information is used in the assembly. The technology thus minimizes sequencing errors, especially in applications where long sequence reads are desired that retain phasing and linkage information associated with the reads and assemblies.
  • sequencer time is reduced because determining each short sequence (e.g., ~30-50 bases) requires only a small number of sequencing cycles (e.g., 1 cycle per base, e.g., ~30-50 cycles) on the sequencing apparatus.
  • the sequencing time needed to provide the sequence of the nucleic acid to be sequenced is greatly reduced, e.g., to one-eighth to one-tenth of the time needed by conventional technologies to sequence the same nucleic acid to be sequenced.
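The cycle arithmetic behind this reduction can be written out as a small sketch. The helper name `sbs_cycles` is hypothetical, and the sketch assumes that run time scales linearly with the number of SBS cycles (1 cycle per base), ignoring cluster generation, index reads, and other fixed overhead.

```python
def sbs_cycles(read_length, reads_per_template=1):
    """Number of SBS cycles needed, assuming 1 cycle per base."""
    return read_length * reads_per_template


# Conventional paired-end run: 2 x 250 bases.
conventional = sbs_cycles(250, reads_per_template=2)

# Short-read run as described in the text: a single ~50-base read.
short = sbs_cycles(50)

# Fraction of the conventional cycle count needed by the short-read run.
fraction = short / conventional
```

With these example numbers the short-read run needs 50 cycles against 500, which is one-tenth of the conventional cycle count and falls in the range stated above.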
  • This technology for NGS library preparation and sequencing and the subsequent short-read parsing and assembly provides acquisition of more than ~500 bp (e.g., 600, 700, 800 bp or more) of high-quality contiguous sequence with phase information.
  • the technology finds use, e.g., in sequencing unknown regions starting from a known region, for example, to interrogate structural variants such as gene translocations, e.g., the detection and identification of unknown gene fusion partners.
  • the technology enhances existing NGS platforms' sequencing capabilities relative to read length, run time, and cost without any upgrades and/or changes to existing installed hardware and extant sequencing chemistries.
  • the technology is related to a method for determining a target nucleotide sequence, the method comprising determining a first nucleotide subsequence of the target nucleotide sequence, said first nucleotide subsequence having a 5′ end at nucleotide x1 of the target nucleotide sequence and having a 3′ end at nucleotide y1 of the target nucleotide sequence; determining a second nucleotide subsequence of the target nucleotide sequence, said second nucleotide subsequence having a 5′ end at nucleotide x2 of the target nucleotide sequence and having a 3′ end at nucleotide y2 of the target nucleotide sequence; assembling the first nucleotide subsequence and the second nucleotide subsequence to provide a consensus sequence for the target nucleotide sequence, wherein x2 < y1
  • the fragments are less than 100 bp, less than 90 bp, less than 80 bp, less than 70 bp, less than 60 bp, less than 55 bp, less than 50 bp, less than 45 bp, less than 40 bp, or less than 35 bp. Accordingly, in some embodiments, (y1 − x1) < 100, 90, 80, 70, 60, 55, 50, 45, 40, or 35 and (y2 − x2) < 100, 90, 80, 70, 60, 55, 50, 45, 40, or 35. In some embodiments, the fragments are less than 50 bp; accordingly, in some embodiments, (y1 − x1) < 50 and (y2 − x2) < 50.
  • a unique index (a “marker” in some contexts) is used to associate a fragment with the template nucleic acid from which it was produced.
  • a unique index is a unique sequence of synthetic nucleotides or a unique sequence of natural nucleotides that allows for easy identification of the target nucleic acid within a complicated collection of oligonucleotides (e.g., fragments) containing various sequences.
  • unique index identifiers are attached to nucleic acid fragments prior to attaching adaptor sequences.
  • unique index identifiers are contained within adaptor sequences such that the unique sequence is contained in the sequencing reads.
  • homologous fragments can be detected based upon the unique indices that are attached to each fragment, thus further providing for unambiguous reconstruction of a consensus sequence.
  • Homologous fragments may occur for example by chance due to genomic repeats, two fragments originating from homologous chromosomes, or fragments originating from overlapping locations on the same chromosome. Homologous fragments may also arise from closely related sequences (e.g., closely related gene family members, paralogs, orthologs, ohnologs, xenologs, and/or pseudogenes). Such fragments may be discarded to ensure that long fragment assembly can be computed unambiguously.
  • the markers may be attached as described above for the adaptor sequences.
  • the indices e.g., markers
  • the unique index (e.g., index identifier, tag, marker, etc.) is a “barcode”.
  • barcode refers to a known nucleic acid sequence that allows some feature of a nucleic acid with which the barcode is associated to be identified.
  • the feature of the nucleic acid to be identified is the sample or source from which the nucleic acid is derived.
  • barcodes are at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more nucleotides in length. In some embodiments, barcodes are shorter than 10, 9, 8, 7, 6, 5, or 4 nucleotides in length.
  • barcodes associated with some nucleic acids are of a different length than barcodes associated with other nucleic acids.
  • barcodes are of sufficient length and comprise sequences that are sufficiently different to allow the identification of samples based on barcodes with which they are associated.
  • a barcode and the sample source with which it is associated can be identified accurately after the mutation, insertion, or deletion of one or more nucleotides in the barcode sequence, such as the mutation, insertion, or deletion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides.
  • each barcode in a plurality of barcodes differs from every other barcode in the plurality at two or more nucleotide positions, such as at 2, 3, 4, 5, 6, 7, 8, 9, 10, or more positions.
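The requirement that barcodes differ at two or more positions, and the resulting tolerance to errors, can be sketched with a Hamming-distance check. The function names and the example barcodes below are hypothetical. The sketch handles substitutions only and assumes equal-length barcodes, whereas the text also contemplates insertions, deletions, and barcodes of differing lengths.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign_barcode(observed, barcodes, max_mismatches):
    """Return the single barcode within max_mismatches of observed, else None."""
    hits = [b for b in barcodes if hamming(observed, b) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


# Hypothetical barcode set; every pair differs at all 6 positions.
barcodes = ["AAACCC", "CCCGGG", "GGGTTT", "TTTAAA"]
d_min = min_pairwise_distance(barcodes)

# A set with minimum distance d can correct up to (d - 1) // 2 substitutions.
correctable = (d_min - 1) // 2
sample = assign_barcode("AAACCG", barcodes, max_mismatches=correctable)
```

Here the observed barcode `AAACCG` carries one substitution and is still assigned to `AAACCC`, while an observed sequence that is not close to any barcode returns `None` rather than a guess.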
  • one or more adaptors comprise(s) at least one of a plurality of barcode sequences.
  • methods of the technology further comprise identifying the sample or source from which a target nucleic acid is derived based on a barcode sequence to which the target nucleic acid is joined.
  • methods of the technology further comprise identifying the target nucleic acid based on a barcode sequence to which the target nucleic acid is joined.
  • Some embodiments of the method further comprise identifying a source or sample of the target nucleotide sequence by determining a barcode nucleotide sequence. Some embodiments of the method further comprise molecular counting applications (e.g., digital barcode enumeration and/or binning) to determine expression levels or copy number status of desired targets.
  • a barcode may comprise a nucleic acid sequence that when joined to a target nucleic acid serves as an identifier of the sample from which the target polynucleotide was derived.
  • the methods provide a sequence of up to 100 bases or, in some embodiments, a sequence of more than 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or more bases.
  • the technology provides a sequence of more than 1000 bases, e.g., more than 2000, 2500, 3000, 3500, 4000, 4500, or 5000 or more bases.
  • the consensus sequence comprises up to 100 bases or more, e.g., 200, 300, 400, 500, 600, 700, 800, 900, 1000, or more bases; in some embodiments the consensus sequence comprises more than 1000 bases, e.g., more than 2000, 2500, 3000, 3500, 4000, 4500, or 5000 or more bases.
  • an oligonucleotide such as a primer, adaptor, etc. comprises a “universal” sequence.
  • a universal sequence is a known sequence, e.g., for use as a primer or probe binding site using a primer or probe of a known sequence (e.g., complementary to the universal sequence).
  • a template-specific sequence of a primer, a barcode sequence of a primer, and/or a barcode sequence of an adaptor might differ in embodiments of the technology, e.g., from fragment to fragment, from sample to sample, from source to source, or from region of interest to region of interest
  • embodiments of the technology provide that a universal sequence is the same from fragment to fragment, from sample to sample, from source to source, or from region of interest to region of interest so that all fragments comprising the universal sequence can be handled and/or treated in a same or similar manner, e.g., amplified, identified, sequenced, isolated, etc., using similar methods or techniques (e.g., using the same primer or probe).
  • a primer comprising a universal sequence (e.g., universal sequence A), a barcode sequence, and a template-specific sequence.
  • a first adaptor comprising a universal sequence e.g., universal sequence B
  • a second adaptor comprising a universal sequence e.g., universal sequence C
  • Universal sequence A, universal sequence B, and universal sequence C can be any sequence.
  • the universal sequence A of a first nucleic acid (e.g., a fragment) comprising universal sequence A is the same as the universal sequence A of a second nucleic acid (e.g., a fragment) comprising universal sequence A
  • the universal sequence B of a first nucleic acid (e.g., a fragment) comprising universal sequence B is the same as the universal sequence B of a second nucleic acid (e.g., a fragment) comprising universal sequence B
  • the universal sequence C of a first nucleic acid (e.g., a fragment) comprising universal sequence C is the same as the universal sequence C of a second nucleic acid (e.g., a fragment) comprising universal sequence C.
  • While universal sequences A, B, and C are generally different in embodiments of the technology, they need not be. Thus, in some embodiments, universal sequences A and B are the same; in some embodiments, universal sequences B and C are the same; in some embodiments, universal sequences A and C are the same; and in some embodiments, universal sequences A, B, and C are the same. In some embodiments, universal sequences A, B, and C are different.
  • two primers may be used, one primer comprising a first template-specific sequence for priming from the first region of interest and a first barcode to associate the first amplified product with the first region of interest and a second primer comprising a second template-specific sequence for priming from the second region of interest and a second barcode to associate the second amplified product with the second region of interest.
  • These two primers will comprise the same universal sequence (e.g., universal sequence A) for pooling and downstream processing together.
  • Two or more universal sequences may be used and, in general, the number of universal sequences will be less than the number of target-specific sequences and/or barcode sequences for pooling of samples and treatment of pools as a single sample (batch).
  • determining the first nucleotide subsequence and the second nucleotide subsequence comprises priming from a universal sequence. In some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a 3′-O-blocked nucleotide analog.
  • determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a 3′-O-alkynyl nucleotide analog, e.g., in some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a 3′-O-propargyl nucleotide analog. In some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a nucleotide analog comprising a reversible terminator.
  • the obtained short sequence reads are partitioned according to their barcode (i.e., de-multiplexed) and reads originating from the same samples, sources, regions of interest, etc. are binned together, e.g., saved to separate files or held in an organized data structure that allows binned reads to be identified as such. Then the binned short sequences are assembled into a consensus sequence.
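The partitioning and binning step above can be sketched as a simple grouping of reads by barcode. The function name `bin_reads_by_barcode` and the representation of each read as a `(barcode, insert)` pair are assumptions for illustration; the sketch presumes the barcode has already been read and matched exactly.

```python
from collections import defaultdict


def bin_reads_by_barcode(reads):
    """Group insert sequences by barcode; reads is an iterable of (barcode, insert) pairs."""
    bins = defaultdict(list)
    for barcode, insert in reads:
        bins[barcode].append(insert)
    return dict(bins)


# Hypothetical reads from two barcoded sources.
reads = [
    ("AAACCC", "ACGTAC"),
    ("GGGTTT", "TTGACA"),
    ("AAACCC", "CGTACG"),
]
bins = bin_reads_by_barcode(reads)
```

Each bin can then be saved to its own file or passed to assembly independently of the others.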
  • Sequence assembly can generally be divided into two broad categories: de novo assembly and reference genome mapping assembly. In de novo assembly, sequence reads are assembled together so that they form a new and previously unknown sequence. In reference genome mapping, sequence reads are assembled against an existing backbone sequence (e.g., a reference sequence, etc.) to build a sequence that is similar but not necessarily identical to the backbone sequence.
  • target nucleic acids corresponding to each region of interest are reconstructed using a de-novo assembly.
  • short reads are stitched together bioinformatically by finding overlaps and extending them to produce a consensus sequence.
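Stitching by finding and extending overlaps can be sketched with a greedy suffix/prefix merge. This is a minimal illustration under stated assumptions (error-free reads, no repeats longer than the minimum overlap), not the assembly procedure used by any particular tool; the names `overlap_len` and `greedy_stitch` and the toy target are hypothetical.

```python
def overlap_len(left, right, min_overlap):
    """Length of the longest suffix of left that equals a prefix of right."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def greedy_stitch(reads, min_overlap=3):
    """Repeatedly merge the pair with the largest overlap until none remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap_len(a, b, min_overlap)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining pair overlaps by at least min_overlap
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs


# Hypothetical 25-base target tiled with 10-base reads offset by 3 bases,
# supplied in reverse order to show that input order does not matter.
target = "ATGGCGTACGTTAGCCTAGGATCCA"
reads = [target[s:s + 10] for s in range(0, len(target) - 9, 3)][::-1]
stitched = greedy_stitch(reads, min_overlap=5)
```

For this toy input the reads collapse into a single contig equal to the target. Real data with sequencing errors or repeats would need error-tolerant overlap detection and consensus calling at each base.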
  • the method further comprises mapping the consensus sequence to a reference sequence.
  • Methods of the technology take advantage of sequencing quality scores that represent base calling confidence to reconstruct full length fragments.
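One way quality scores can feed into a consensus call is a quality-weighted vote at each position, sketched below. Summing Phred scores as vote weights is a simplifying heuristic chosen for illustration and is an assumption, as are the function name `weighted_consensus` and the premise that each read's start position on the target is already known.

```python
from collections import defaultdict


def weighted_consensus(target_len, placed_reads):
    """Call each base by summed Phred quality.

    placed_reads is an iterable of (start, sequence, phred_qualities).
    Positions with no coverage are reported as "N".
    """
    votes = [defaultdict(float) for _ in range(target_len)]
    for start, seq, quals in placed_reads:
        for offset, (base, q) in enumerate(zip(seq, quals)):
            votes[start + offset][base] += q
    return "".join(max(v, key=v.get) if v else "N" for v in votes)


# Hypothetical placed reads; the third read has a low-quality miscall (T) at position 2.
placed = [
    (0, "ACGT", [30, 30, 30, 30]),
    (1, "CGTA", [30, 30, 30, 30]),
    (2, "TTAC", [5, 30, 30, 30]),
]
consensus = weighted_consensus(6, placed)
```

At position 2 the two high-quality `G` calls outweigh the single low-quality `T`, so the consensus for this toy input is `ACGTAC`.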
  • fragments can be used to obtain phasing (assignment to homologous copies of chromosomes) of genomic variants by observing that consensus sequences originate from either one of the chromosomes.
  • a computer system is implemented for assembly and bioinformatic treatment of sequence information (e.g., identifying barcodes, partitioning, binning, making base calls, determining a consensus identity of each base, stitching reads, assessing quality scores, aligning reads and/or consensus sequences to a reference sequence, etc.).
  • a computer system includes a bus or other communication mechanism for communicating information and a processor coupled with the bus for processing information.
  • the computer system includes a memory, which can be a random access memory (RAM) or other dynamic storage device, coupled to the bus, and instructions to be executed by the processor. The memory also can be used for storing temporary variables or other intermediate information during execution of instructions to be executed by the processor.
  • RAM: random access memory
  • the computer system further includes a read only memory (ROM) or other static storage device coupled to the bus for storing static information and instructions for the processor.
  • ROM: read only memory
  • a storage device such as a solid state drive (e.g., “flash” memory), a magnetic disk, or an optical disk, is provided and coupled to the bus for storing information and instructions.
  • the computer system is coupled via the bus to a display, such as a cathode ray tube (CRT) or liquid crystal display (LCD), for displaying information to a computer user.
  • an input device is coupled to the bus for communicating information and command selections to the processor.
  • a cursor control such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to the processor and for controlling cursor movement on the display.
  • a computer system performs aspects of the present technology. Consistent with certain embodiments of the technology, results are provided by the computer system in response to the processor executing one or more sequences of one or more instructions contained in memory. Such instructions can be read into memory from another computer-readable medium, such as the storage device. Alternatively, hard-wired circuitry can be used in place of or in combination with software instructions to implement the present technology. Thus implementations of the present teachings are not limited to any specific combination of hardware circuitry and software.
  • embodiments of the technology comprise the use of storage and transfer of data using “cloud” computing technology, wired (e.g., fiber optic, cable, copper, ADSL, Ethernet, and the like), and/or wireless technology (e.g., IEEE 802.11 and the like).
  • components of the technology are connected via a local area network (LAN), wireless local area network (WLAN), wide area network (WAN) such as the internet, or any other network type, topology, and/or protocol.
  • the technology comprises use of a portable device such as a hand-held computer, smartphone, tablet computer, laptop computer, palmtop computer, hiptop computer, e.g., to display results, accept input from a user, provide instructions to another computer, store data, and/or perform other steps of methods provided herein.
  • a thin client terminal to display results, accept input from a user, provide instructions to another computer, store data, and/or perform other steps of methods provided herein.
  • Some embodiments provide a method for determining a target nucleotide sequence, the method comprising determining n nucleotide subsequences of the target nucleotide sequence (indexed over m), wherein the mth nucleotide subsequence has a 5′ end at nucleotide x_m of the target nucleotide sequence and has a 3′ end at nucleotide y_m of the target nucleotide sequence; the (m+1)th nucleotide subsequence has a 5′ end at nucleotide x_(m+1) of the target nucleotide sequence and has a 3′ end at nucleotide y_(m+1) of the target nucleotide sequence; and assembling the n nucleotide subsequences to provide a consensus sequence for the target nucleotide sequence, wherein m ranges from 1 to n; x_(m+1) < y_m; and (y_m
  • the fragments are less than 50 bp; accordingly, in some embodiments (y_m − x_m) < 50 and (y_(m+1) − x_(m+1)) < 50. In some embodiments the fragments are less than 40 bp; accordingly in some embodiments (y_m − x_m) < 40 and (y_(m+1) − x_(m+1)) < 40. In some embodiments the fragments are less than 30 bp; accordingly, in some embodiments (y_m − x_m) < 30 and (y_(m+1) − x_(m+1)) < 30.
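The coordinate conditions on consecutive subsequences can be checked with a short sketch. It tests only the two conditions that are fully stated in the text, namely that each subsequence is shorter than a chosen limit and that each subsequence begins before the previous one ends; the function name `is_valid_ladder` and the example coordinates are hypothetical.

```python
def is_valid_ladder(intervals, max_len):
    """Check overlap and length conditions on ordered (x_m, y_m) pairs.

    intervals lists the 5' end x_m and 3' end y_m of each subsequence,
    ordered by m. Returns True when every (y_m - x_m) < max_len and
    every x_(m+1) < y_m.
    """
    for x, y in intervals:
        if not (y - x) < max_len:
            return False
    for (_, y_m), (x_next, _) in zip(intervals, intervals[1:]):
        if not x_next < y_m:
            return False
    return True


# Hypothetical subsequences offset by 1 base, each spanning 39 positions.
ladder = [(1, 40), (2, 41), (3, 42)]
ok = is_valid_ladder(ladder, max_len=50)

# A gap between subsequences violates x_(m+1) < y_m.
gapped = is_valid_ladder([(1, 40), (45, 80)], max_len=50)
```

The first example satisfies both conditions, while the second fails because the second subsequence starts after the first one ends, leaving no overlap to assemble across.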
  • determining the n nucleotide subsequences comprises priming from a universal sequence. In some embodiments, determining the n nucleotide subsequences comprises terminating polymerization with a 3′-O-blocked nucleotide analog. In some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a 3′-O-alkynyl nucleotide analog. In some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a 3′-O-propargyl nucleotide analog. In some embodiments determining the first nucleotide subsequence and the second nucleotide subsequence comprises terminating polymerization with a nucleotide analog comprising a reversible terminator.
  • methods for generating a next-generation sequencing library comprise amplifying a target nucleotide sequence using a primer comprising a target specific sequence, a universal sequence A, and a barcode nucleotide sequence associated with the target nucleic acid to provide an identifiable amplicon; ligating a first adaptor oligonucleotide comprising a universal sequence B to the 3′ end of the amplicon to form an adaptor-amplicon; circularizing the adaptor-amplicon to form a circular template; generating a ladder fragment library from the circular template using a 3′-O-blocked nucleotide analog; and ligating a second adaptor oligonucleotide comprising a universal sequence C to the 3′ ends of the fragments of the ladder fragment library to generate the next-generation sequencing library (e.g., using a ligase or a chemical ligation by, e.g., click chemistry, e.g., a copper
  • the barcode nucleotide sequence comprises 1 to 20 nucleotides. In some embodiments, the first adaptor oligonucleotide comprises 10 to 80 nucleotides. In some embodiments the nucleotide sequences of the fragments of the ladder fragment library correspond to overlapping nucleotide subsequences within the target nucleotide sequence and the nucleotide sequences of the fragments have 3′ ends corresponding to different nucleotides of the target nucleotide sequence.
  • nucleotide sequences of the fragments of the ladder fragment library comprise less than 100 nucleotides, e.g., less than 90, 80, 70, 60, 50, or 40 nucleotides, e.g., 15 to 50, e.g., 15 to 40 nucleotides.
  • the first adaptor oligonucleotide comprises a single-stranded DNA and/or the second adaptor oligonucleotide comprises a single-stranded DNA.
  • generating a ladder fragment library comprises using an oligonucleotide primer complementary to the universal sequence A.
  • the methods further comprise amplifying the next-generation sequencing library.
  • the 3′-O-alkynyl nucleotide analog is a 3′-O-propargyl nucleotide analog.
  • the nucleotide analog comprises a reversible terminator.
  • the technology further provides methods for determining a sequence of a nucleic acid.
  • the method comprises generating a next-generation sequencing library according to the technology provided herein; determining a nucleotide sequence of a fragment of the ladder fragment library, said nucleotide sequence comprising a nucleotide subsequence of the target nucleotide sequence; and determining a barcode nucleotide sequence of the fragment of the ladder fragment library.
  • determining the nucleotide sequence of a fragment of the ladder fragment library comprises using an oligonucleotide primer complementary to universal sequence C.
  • determining the barcode nucleotide sequence of the fragment of the ladder fragment library comprises using an oligonucleotide primer complementary to universal sequence B.
  • the nucleotide sequence of a fragment of the ladder fragment library comprises less than 100 nucleotides, e.g., 15 to 50 nucleotides, e.g., 20 to 50, e.g., 25 to 50, e.g., 30 to 50, e.g., 35 to 50, e.g., 40 to 50 nucleotides.
  • the methods further comprise associating the barcode nucleotide sequence with a source of the target nucleotide sequence.
  • the methods further comprise collecting or binning nucleotide sequences of fragments of the ladder fragment library having the same barcode nucleotide sequence. In some embodiments, the methods further comprise assembling a plurality of nucleotide sequences of fragments of the ladder fragment library to provide a consensus sequence. In some embodiments the methods further comprise mapping the consensus sequence to a reference sequence.
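The binning step described above can be illustrated with a minimal sketch. It assumes each read has already been paired with its decoded barcode; the function name `bin_by_barcode` and the example reads are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def bin_by_barcode(reads):
    """Group (barcode, sequence) pairs into lists keyed by barcode."""
    bins = defaultdict(list)
    for barcode, sequence in reads:
        bins[barcode].append(sequence)
    return dict(bins)


# Hypothetical reads: (barcode, fragment sequence)
reads = [
    ("ACGT", "TTGACCA"),
    ("GGCA", "CCATGAA"),
    ("ACGT", "TGACCAT"),
]
bins = bin_by_barcode(reads)
print(bins["ACGT"])  # ['TTGACCA', 'TGACCAT']
```

Each bin can then be passed to an assembly step to provide a consensus sequence for the target associated with that barcode.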
  • the technology includes attaching labels to the nucleic acids, such as nucleic acid binding proteins, optical labels, nucleotide analogs, and others known in the art.
  • compositions comprising a next-generation sequencing library, wherein the next-generation sequencing library comprises a plurality of nucleic acids, each nucleic acid comprising a universal sequence A, a barcode nucleotide sequence, a second universal sequence B, a nucleotide subsequence of a target nucleotide sequence, and a universal sequence C.
  • the universal sequence B comprises 10 to 100 nucleotides and/or the barcode nucleotide sequence comprises 1 to 20 nucleotides.
  • compositions further comprise a 3′-O-blocked nucleotide analog such as a 3′-O-alkynyl nucleotide analog, e.g., a 3′-O-propargyl nucleotide analog.
  • compositions further comprise a sequencing primer.
  • the compositions further comprise a sequencing primer complementary to the universal sequence C and/or a sequencing primer complementary to the universal sequence B.
  • the barcode nucleotide sequence is associated with the target nucleotide sequence.
  • the plurality of nucleic acids comprises nucleic acids having different barcode nucleotide sequences and different nucleotide subsequences of a target nucleotide sequence, wherein each barcode nucleotide sequence is associated with the target nucleotide sequence.
  • the barcode nucleotide sequence is associated in one-to-one correspondence with the target nucleotide sequence.
  • each nucleic acid of the next-generation sequencing library comprises a 3′-O-blocked nucleotide analog, e.g., a 3′-O-alkynyl nucleotide analog, e.g., a 3′-O-propargyl nucleotide analog.
  • each nucleic acid of the next-generation sequencing library comprises a nucleotide analog comprising a reversible terminator.
  • kits for producing an NGS library and/or for obtaining sequence information from a target nucleic acid comprising a nucleotide analog, e.g., for producing a nucleotide fragment ladder according to the methods provided herein.
  • the nucleotide analog is a 3′-O-blocked nucleotide analog, e.g., a 3′-O-alkynyl nucleotide analog, e.g., a 3′-O-propargyl nucleotide analog.
  • conventional A, C, G, U, and/or T nucleotides are provided in a kit as well as one or more (e.g., 1, 2, 3, or 4) A, C, G, U, and/or T nucleotide analogs.
  • kits comprise a polymerase (e.g., a natural polymerase, a modified polymerase, and/or an engineered polymerase, etc.), e.g., for amplification (e.g., by thermal cycling, isothermal amplification) or for sequencing, etc.
  • kits comprise a ligase, e.g., for attaching adaptors to a nucleic acid such as an amplicon or a ladder fragment or for circularizing an adaptor-amplicon.
  • kits comprise a copper-based catalyst reagent, e.g., for a click chemistry reaction, e.g., to react an azide and an alkynyl group to form a triazole link.
  • Some kit embodiments provide buffers, salts, reaction vessels, instructions, and/or computer software.
  • kits comprise primers and/or adaptors.
  • the adaptors comprise a chemical modification suitable for attaching the adaptor to the nucleotide analog, e.g., by click chemistry.
  • the kit comprises a nucleotide analog comprising an alkyne group and an adaptor oligonucleotide comprising an azide (N₃) group.
  • a “click chemistry” process such as an azide-alkyne cycloaddition is used to link the adaptor to the fragment via formation of a triazole.
  • system embodiments comprise a nucleotide analog for producing a fragment ladder from a target nucleic acid and a computer readable medium storing instructions for determining the sequence of the target nucleic acid from assembling short sequence reads.
  • systems comprise one or more adaptor oligonucleotides (e.g., suitable for attachment to the nucleotide analogs) or other kit components as described above.
  • some system embodiments are associated with assembling (stitching, reconstructing) a nucleic acid sequence.
  • Embodiments of such systems include various components such as, e.g., a nucleic acid sequencer, a sample sequence data storage, a reference sequence data storage, and an analytics computing device/server/node.
  • the analytics computing device/server/node is a workstation, mainframe computer, personal computer, mobile device, etc.
  • the systems comprise functionalities for identifying a barcode, parsing sequences based on a barcode, and binning sequences having common barcodes.
  • the nucleic acid sequencer is configured to analyze (e.g., interrogate) a nucleic acid fragment (e.g., a single fragment, a mate-pair fragment, a paired-end fragment, etc.) utilizing all available varieties of techniques, platforms, or technologies to obtain nucleic acid sequence information.
  • the systems comprise functionalities for making base calls, assessing quality scores, aligning sequences, identifying a barcode, parsing sequences based on a barcode, and binning sequences having common barcodes.
  • the nucleic acid sequencer communicates with the sample sequence data storage either directly via a data cable (e.g., a serial cable, a direct cable connection, etc.) or a bus linkage or, alternatively, through a network connection (e.g., internet, LAN, WAN, WLAN, VPN, etc.).
  • the network connection is a hardwired physical connection.
  • some embodiments provide that the nucleic acid sequencer is communicatively connected (via Category 5 (CAT5), fiber optic, or equivalent cabling) to a data server that is, in turn, communicatively connected (via CAT5, fiber optic, or equivalent cabling) through the internet to the sample sequence data storage.
  • the network connection is a wireless network connection (e.g., Wi-Fi, WLAN, etc.), for example, utilizing an IEEE 802.11 (e.g., a/b/g/n, etc.) or equivalent transmission format.
  • the network connection utilized is dependent upon the particular requirements of the system.
  • the sample sequence data storage is an integrated part of the nucleic acid sequencer.
  • the sample sequence data storage is a database storage device, system, or implementation (e.g., data storage partition, etc.) that is configured to organize and store nucleic acid sequence read data generated by a nucleic acid sequencer (e.g., the short overlapping sequence reads of less than 100 bases, e.g., ~30-50 bases and associated index information such as barcode sequence and metadata associated with the barcode such as sample source, type, target nucleic acid, region of interest, experimental conditions, clinical data, etc.) such that the data can be searched (e.g., by barcode sequence or associated metadata) and retrieved manually (e.g., by a database administrator/client operator) or automatically by way of a computer program/application/software script.
  • the reference data storage can be any database device, storage system, or implementation (e.g., data storage partition, etc.) that is configured to organize and store reference sequences (e.g., whole/partial genome, whole/partial exome, gene, region, chromosome, BAC, etc.) such that the data can be searched and retrieved manually (e.g., by a database administrator/client operator) or automatically by way of a computer program/application/software script.
  • the sample nucleic acid sequencing read data is stored on the sample sequence data storage and/or the reference data storage in a variety of different data file types/formats, including, but not limited to: *.fasta, *.csfasta, *seq.txt, *qseq.txt, *.fastq, *.sff, *prb.txt, *.sms, *srs and/or *.qv.
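As an illustration of reading one of the listed file types, the following is a minimal sketch of a *.fastq reader. It assumes the common four-line record layout (header beginning with "@", sequence, "+" separator, quality string) and does not handle multi-line records; the function name `parse_fastq` and the example records are hypothetical.

```python
import io


def parse_fastq(handle):
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input (or a blank line)
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@"):
            raise ValueError("FASTQ header must start with '@'")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], sequence, quality


# Hypothetical two-record input held in memory
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nHHHH\n")
records = list(parse_fastq(example))
print(records[0])  # ('read1', 'ACGT', 'IIII')
```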
  • sample sequence data storage and the reference data storage are independent standalone devices/systems or implemented on different devices. In some embodiments, the sample sequence data storage and the reference data storage are implemented on the same device/system. In some embodiments, the sample sequence data storage and/or the reference data storage are implemented on the analytics computing device/server/node.
  • the analytics computing device/server/node is in communication with the sample sequence data storage and the reference data storage either directly via a data cable (e.g., a serial cable, a direct cable connection, etc.) or a bus linkage or, alternatively, through a network connection (e.g., internet, LAN, WAN, VPN, etc.).
  • the analytics computing device/server/node hosts an assembler, e.g., a reference mapping engine or a de novo mapping module, and/or a tertiary analysis engine.
  • the de novo mapping module is configured to assemble sample nucleic acid sequence reads from the sample data storage into new and previously unknown sequences.
  • the reference mapping engine is configured to obtain sample nucleic acid sequence reads (e.g., having a common barcode and having been binned together) from the sample data storage and map them against one or more reference sequences obtained from the reference data storage to assemble the reads into a sequence that is similar but not necessarily identical to the reference sequence using all varieties of reference mapping/alignment techniques and methods.
  • the reassembled sequence can then be further analyzed by one or more optional tertiary analysis engines to identify differences in the genetic makeup (genotype, haplotype), gene expression, or epigenetic status of individuals that can result in large differences in physical characteristics (phenotype).
  • the tertiary analysis engine is configured to identify various genomic variants (in the assembled sequence) due to mutations, recombination/crossover, or genetic drift; to identify phasing of genetic information; to identify phylogenetic and/or taxonomic information; to identify an individual; to identify a species, genus, or other phylogenetic classification; to identify a drug resistance or a drug susceptibility (sensitivity) marker; to identify a gene fusion; to identify a copy number variation; to identify a methylation status; to associate the sequence with a disease state; etc.
  • examples of genomic variants include, but are not limited to: single nucleotide polymorphisms (SNPs), copy number variations (CNVs), insertions/deletions (“indels”), inversions, duplications, translocations, integrations, etc.
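The simplest case of variant identification, single-base substitutions between an assembled sequence and a reference, can be sketched as follows. The sketch assumes the two sequences are already aligned and of equal length, so indels and structural variants are out of scope; `find_substitutions` is a hypothetical name and is not the tertiary analysis engine described herein.

```python
def find_substitutions(reference, assembled):
    """Return (position, ref_base, alt_base) per mismatch, 1-based positions."""
    if len(reference) != len(assembled):
        raise ValueError("sequences must have equal length (no indel handling)")
    return [
        (i + 1, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, assembled))
        if ref_base != alt_base
    ]


variants = find_substitutions("ACGTACGT", "ACGAACGT")
print(variants)  # [(4, 'T', 'A')]
```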
  • the various engines and modules hosted on the analytics computing device/server/node can be combined or collapsed into a single engine or module, depending on the requirements of the particular application or system architecture.
  • the analytics computing device/server/node hosts additional engines or modules as needed by the particular application or system architecture.
  • the mapping and/or tertiary analysis engines are configured to process the nucleic acid and/or reference sequence reads in color space. In various embodiments, the mapping and/or tertiary analysis engines are configured to process the nucleic acid and/or reference sequence reads in base space. It should be understood, however, that the mapping and/or tertiary analysis engines can process or analyze nucleic acid sequence data in any schema or format as long as the schema or format conveys the base identity and position of the nucleic acid sequence.
  • sample nucleic acid sequencing read and reference sequence data are supplied to the analytics computing device/server/node in a variety of different input data file types/formats, including, but not limited to: *.fasta, *.csfasta, *seq.txt, *qseq.txt, *.fastq, *.sff, *prb.txt, *.sms, *srs and/or *.qv.
  • the client terminal is, in some embodiments, a thin client or, in some embodiments, a thick client computing device.
  • the client terminal comprises a web browser (e.g., Internet Explorer, Firefox, Safari, Chrome, etc.) that is used to control the operation of the reference mapping engine, the de novo mapping module, and/or the tertiary analysis engine. That is, the client terminal can access the reference mapping engine, the de novo mapping module, and/or the tertiary analysis engine using a browser to control their functions.
  • the client terminal can be used to configure the operating parameters (e.g., mismatch constraint, quality value thresholds, etc.) of the various engines, depending on the requirements of the particular application.
  • the client terminal can also comprise a display to display the results of the analysis performed by the assembler, the reference mapping engine, the de novo mapping module, and/or the tertiary analysis engine.
  • the technology provided herein, in method, composition, kit, and system embodiments, finds use, e.g., to prepare an NGS library for sequencing, to acquire a nucleotide sequence, to map a single nucleotide polymorphism, to distinguish alleles, to sequence a genome, to identify rare minor population variants (e.g., somatic mutations in cancer or a low-abundance pathogen against a large background of host or non-pathogen DNA), etc.
  • Sequencing may be by any method known in the art.
  • sequencing is sequencing by synthesis.
  • sequencing is single molecule sequencing by synthesis.
  • sequencing involves hybridizing a primer to the template to form a template/primer duplex, contacting the duplex with a polymerase enzyme in the presence of detectably labeled nucleotides under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner, detecting a signal from the incorporated labeled nucleotide, and sequentially repeating the contacting and detecting steps at least once, wherein sequential detection of incorporated labeled nucleotides determines the sequence of the nucleic acid.
  • Exemplary detectable labels include radiolabels, fluorescent labels, enzymatic labels, etc.
  • the detectable label may be an optically detectable label, such as a fluorescent label.
  • Exemplary fluorescent labels (for sequencing and/or other purposes such as labeling a nucleic acid, primer, probe, etc.) include cyanine, rhodamine, fluorescein, coumarin, BODIPY, Alexa, or conjugated multi-dyes.
  • Some embodiments provide a method for generating a next-generation sequencing library, the method comprising amplifying a target nucleotide sequence using a primer comprising a target specific sequence, a universal sequence A, and a barcode nucleotide sequence (e.g., comprising 1 to 20 nucleotides) associated with the target nucleic acid to provide an identifiable amplicon; ligating a first adaptor oligonucleotide (e.g., a single-stranded DNA, e.g., comprising 10 to 80 nucleotides) comprising a universal sequence B to the 3′ end of the amplicon to form an adaptor-amplicon; circularizing the adaptor-amplicon to form a circular template; generating from the circular template by use of a primer complementary to the universal sequence A and a 3′-O-blocked nucleotide analog (e.g., a 3′-O-alkynyl nucleotide analog, a 3′-O-propargy
  • Some embodiments provide a method for determining a target nucleotide sequence, the method comprising amplifying a target nucleotide sequence using a primer comprising a target specific sequence, a universal sequence A, and a barcode nucleotide sequence (e.g., comprising 1 to 20 nucleotides) associated with the target nucleic acid to provide an amplicon; ligating a first adaptor oligonucleotide (e.g., a single-stranded DNA, e.g., comprising 10 to 80 nucleotides) comprising a universal sequence B to the 3′ end of the amplicon to form an adaptor-amplicon; circularizing the adaptor-amplicon to form a circular template; generating from the circular template by use of a primer complementary to the universal sequence A and a 3′-O-blocked nucleotide analog (e.g., a 3′-O-alkynyl nucleotide analog, a 3′-O-proparg
  • Some embodiments provide a method for determining a target nucleotide sequence, the method comprising determining a first nucleotide subsequence of the target nucleotide sequence (e.g., by priming from a universal sequence and, e.g., terminating polymerization with a 3′-O-blocked nucleotide analog such as a 3′-O-alkynyl nucleotide analog or a 3′-O-propargyl nucleotide analog or terminating polymerization with a nucleotide analog comprising a reversible terminator), said first nucleotide subsequence having a 5′ end at nucleotide x1 of the target nucleotide sequence and having a 3′ end at nucleotide y1 of the target nucleotide sequence; determining a second nucleotide subsequence of the target nucleotide sequence (e.g., by priming from a universal
  • Some embodiments provide a method for determining a target nucleotide sequence, the method comprising determining n nucleotide subsequences of the target nucleotide sequence (e.g., by priming from a universal sequence and, e.g., terminating polymerization with a 3′-O-blocked nucleotide analog such as a 3′-O-alkynyl nucleotide analog or a 3′-O-propargyl nucleotide analog or terminating polymerization with a nucleotide analog comprising a reversible terminator), wherein the mth nucleotide subsequence has a 5′ end at nucleotide x_m of the target nucleotide sequence and has a 3′ end at nucleotide y_m of the target nucleotide sequence; and the (m+1)th nucleotide subsequence has a 5′ end at nucleotide x
  • compositions for use as a next-generation sequencing library to obtain a sequence of a target nucleic acid comprising a 3′-O-blocked nucleotide analog, a 3′-O-alkynyl nucleotide analog, a 3′-O-propargyl nucleotide analog, or a nucleotide analog comprising a reversible terminator; a sequencing primer (e.g., complementary to a universal sequence C); a second sequencing primer (e.g., complementary to a universal sequence B); and n nucleic acids comprising a 3′-O-blocked nucleotide analog, a 3′-O-alkynyl nucleotide analog, or a 3′-O-propargyl nucleotide analog linked (e.g., by a triazole link formed, e.g., by click chemistry, e.g., by a reaction between an azide and an alky
  • Some embodiments provide a reaction mixture composition comprising a template (e.g., a circular template, e.g., comprising a universal nucleotide sequence and/or a barcode nucleotide sequence), a polymerase, one or more fragments of a ladder fragment library, and a 3′-O-blocked nucleotide analog.
  • Some embodiments provide a reaction mixture composition comprising a library of nucleic acids, the library of nucleic acids comprising overlapping short nucleotide sequences tiled over a target nucleic acid (e.g., the overlapping short nucleotide sequences cover a region of the target nucleic acid comprising 100 bases, 200 bases, 300 bases, 400 bases, 500 bases, 600 bases, 700 bases, 800 bases, 900 bases, 1000 bases, or more than 1000 bases, e.g., 2000 bases, 2500 bases, 3000 bases, 3500 bases, 4000 bases, 4500 bases, 5000 bases, or more than 5000 bases) and offset from one another by 1-20, 1-10, or 1-5 bases (e.g., 1 base) and each nucleic acid of the library comprising less than 100 bases, less than 90 bases, less than 80 bases, less than 70 bases, less than 60 bases, less than 50 bases, less than 45 bases, less than 40 bases, less than 35 bases, or less than 30 bases.
  • kits for generating a sequencing library comprising an adaptor oligonucleotide comprising a first reactive group (e.g., an azide), a 3′-O-blocked nucleotide analog (e.g., a 3′-O-alkynyl nucleotide analog or a 3′-O-propargyl nucleotide analog, e.g., comprising an alkyne group and, e.g., comprising a second reactive group that forms a chemical bond with the first reactive group, e.g., using click chemistry), a polymerase (e.g., a polymerase for isothermal amplification or thermal cycling), a second adaptor oligonucleotide, one or more compositions comprising a nucleotide or a mixture of nucleotides, and a ligase or a copper-based click chemistry catalyst reagent.
  • Some embodiments provide a system for sequencing a target nucleic acid, the system comprising an adaptor oligonucleotide comprising a first reactive group (e.g., an azide), a 3′-O-blocked nucleotide analog (e.g., a 3′-O-alkynyl nucleotide analog or a 3′-O-propargyl nucleotide analog, e.g., comprising an alkyne group and, e.g., comprising a second reactive group that forms a chemical bond with the first reactive group, e.g., using click chemistry, e.g., using a copper-based click chemistry catalyst), a sequencing apparatus, a nucleic acid fragment ladder (e.g., comprising a plurality of nucleic acids having 3′ ends that differ by less than 20 nucleotides, less than 10 nucleotides, less than 5 nucleotides, less than 4 nucleotides, less than 3 nucle
  • FIG. 1 is a schematic depicting an embodiment of the technology for sequencing a nucleic acid.
  • FIG. 2 is a schematic depicting an embodiment of the technology for producing a library for next-generation sequencing.
  • FIG. 2A shows one embodiment of the technology and FIG. 2B shows another embodiment of the technology.
  • FIG. 3 is a schematic depicting an embodiment of the technology for sequencing a nucleic acid.
  • FIG. 4 is a schematic depicting an embodiment of the technology for sequencing a nucleic acid.
  • the technology generally relates to obtaining a nucleotide sequence, such as a consensus sequence or a haplotype sequence.
  • the short overlapping DNA fragments have a range of lengths such that one fragment differs from another fragment by 1-5 bases, preferably 1 base, at their 3′ ends (e.g., a fragment ladder similar to that produced by conventional Sanger sequencing methods).
  • the short overlapping DNA fragments are indexed to generate a next generation sequencing (NGS) library.
  • Acquiring ~30-base to ~50-base sequence reads from the 3′ ends of the short overlapping fragments produces a tiled set of ~30-base to ~50-base sequence reads spanning the larger target DNA to be sequenced and offset from one another by 1-5 bases, preferably offset by 1 base.
  • Assembling the overlapping ~30-50 bp short sequence reads produces a long contiguous read covering a larger region (~800-1000 bp) of the target DNA fragment.
  • each sequence read results from the highest quality bases produced by NGS (e.g., the first 20-100 bases) and each base of the assembly is the consensus of 30-50 independent high quality sequence reads.
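The relationship between read length, a 1-base offset, and per-base consensus depth can be checked with a small simulation. This is a sketch under simplifying assumptions: ladder fragment k is modeled as the first k bases of the target, each read is the last `read_len` bases of a fragment, and strand orientation and sequencing errors are ignored. The function names and the 100-base target are hypothetical.

```python
def ladder_reads(target, read_len):
    """Return (start, read) pairs, one per ladder fragment long enough to read.

    Fragment k is modeled as target[:k]; its read is the last read_len bases.
    Starts are 0-based positions on the target.
    """
    reads = []
    for end in range(read_len, len(target) + 1):
        start = end - read_len
        reads.append((start, target[start:end]))
    return reads


def depth_per_base(target_len, reads):
    """Count how many reads cover each position of the target."""
    depth = [0] * target_len
    for start, read in reads:
        for pos in range(start, start + len(read)):
            depth[pos] += 1
    return depth


target = ("ACGTTGCA" * 13)[:100]  # hypothetical 100-base target
reads = ladder_reads(target, read_len=30)
depth = depth_per_base(len(target), reads)
print(len(reads), max(depth), depth[0])  # 71 30 1
```

With a 1-base offset, every interior base is covered by as many reads as the read length (here 30), while bases near the ends of the target are covered by fewer reads.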
  • the term “or” is an inclusive “or” operator and is equivalent to the term “and/or” unless the context clearly dictates otherwise.
  • the term “based on” is not exclusive and allows for being based on additional factors not described, unless the context clearly dictates otherwise.
  • the meaning of “a”, “an”, and “the” includes plural references.
  • the meaning of “in” includes “in” and “on.”
  • a “nucleotide” comprises a “base” (alternatively, a “nucleobase” or “nitrogenous base”), a “sugar” (in particular, a five-carbon sugar, e.g., ribose or 2-deoxyribose), and a “phosphate moiety” of one or more phosphate groups (e.g., a monophosphate, a diphosphate, or a triphosphate consisting of one, two, or three linked phosphates, respectively). Without the phosphate moiety, the nucleobase and the sugar compose a “nucleoside”.
  • a nucleotide can thus also be called a nucleoside monophosphate or a nucleoside diphosphate or a nucleoside triphosphate, depending on the number of phosphate groups attached.
  • the phosphate moiety is usually attached to the 5′-carbon of the sugar, though some nucleotides comprise phosphate moieties attached to the 2′-carbon or the 3′-carbon of the sugar.
  • Nucleotides contain either a purine (in the nucleotides adenine and guanine) or a pyrimidine base (in the nucleotides cytosine, thymine, and uracil).
  • Ribonucleotides are nucleotides in which the sugar is ribose.
  • Deoxyribonucleotides are nucleotides in which the sugar is deoxyribose.
  • nucleic acid shall mean any nucleic acid molecule, including, without limitation, DNA, RNA, and hybrids thereof.
  • the nucleic acid bases that form nucleic acid molecules can be the bases A, C, G, T and U, as well as derivatives thereof. Derivatives of these bases are well known in the art.
  • the term should be understood to include, as equivalents, analogs of either DNA or RNA made from nucleotide analogs.
  • the term as used herein also encompasses cDNA, that is complementary, or copy, DNA produced from an RNA template, for example by the action of a reverse transcriptase.
  • nucleic acid sequencing data denotes any information or data that is indicative of the order of the nucleotide bases (e.g., adenine, guanine, cytosine, and thymine/uracil) in a molecule (e.g., a whole genome, a whole transcriptome, an exome, oligonucleotide, polynucleotide, fragment, etc.) of DNA or RNA.
  • sequence information obtained using all available varieties of techniques, platforms or technologies, including, but not limited to: capillary electrophoresis, microarrays, ligation-based systems, polymerase-based systems, hybridization-based systems, direct or indirect nucleotide identification systems, pyrosequencing, ion- or pH-based detection systems, electronic signature-based systems, etc.
  • a base may refer to a single molecule of that base or to a plurality of the base, e.g., in a solution.
  • a “polynucleotide”, “nucleic acid”, or “oligonucleotide” refers to a linear polymer of nucleosides (including deoxyribonucleosides, ribonucleosides, or analogs thereof) joined by internucleosidic linkages.
  • a polynucleotide comprises at least three nucleosides.
  • oligonucleotides range in size from a few monomeric units, e.g. 3-4, to several hundreds of monomeric units.
  • a polynucleotide such as an oligonucleotide is represented by a sequence of letters, such as “ATGCCTG,” it will be understood that the nucleotides are in 5′ to 3′ order from left to right and that “A” denotes deoxyadenosine, “C” denotes deoxycytidine, “G” denotes deoxyguanosine, and “T” denotes thymidine, unless otherwise noted.
  • the letters A, C, G, and T may be used to refer to the bases themselves, to nucleosides, or to nucleotides comprising the bases, as is standard in the art.
  • target nucleic acid or “target nucleotide sequence” refers to any nucleotide sequence (e.g., RNA or DNA), the manipulation of which may be deemed desirable for any reason by one of ordinary skill in the art.
  • target nucleic acid refers to a nucleic acid whose nucleotide sequence is to be determined or is desired to be determined.
  • target nucleotide sequence refers to a sequence to which a partially or completely complementary primer or probe is generated.
  • region of interest refers to a nucleic acid that is analyzed (e.g., using one of the compositions, systems, or methods described herein).
  • the region of interest is a portion of a genome or region of genomic DNA (e.g., comprising one or more chromosomes or one or more genes).
  • mRNA expressed from a region of interest is analyzed.
  • the term “corresponds to” or “corresponding” is used in reference to a contiguous nucleic acid or nucleotide sequence (e.g., a subsequence) that is complementary to, and thus “corresponds to”, all or a portion of a target nucleic acid sequence.
  • a clonal plurality of nucleic acids refers to the nucleic acid products that are complete or partial copies of a template nucleic acid from which they were generated. These products are substantially or completely or essentially identical to each other, and they are complementary copies of the template nucleic acid strand from which they are synthesized, assuming that the rate of nucleotide misincorporation during the synthesis of the clonal nucleic acid molecules is 0%.
  • library refers to a plurality of nucleic acids, e.g., a plurality of different nucleic acids.
  • a “subsequence” of a nucleotide sequence refers to any nucleotide sequence contained within the nucleotide sequence, including any subsequence having a size of a single base up to a subsequence that is one base shorter than the nucleotide sequence.
  • consensus sequence refers to a sequence that is common to, or otherwise present in the largest fraction of, an aligned group of sequences.
  • the consensus sequence shows the nucleotide most commonly found at each position within the nucleic acid sequences of the group of sequences.
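The per-position definition above can be illustrated with a minimal sketch that takes the most common symbol in each column. It assumes the sequences are already aligned and of equal length; ties are broken in favor of the symbol seen first in the column. The function name `consensus` and the example alignment are hypothetical.

```python
from collections import Counter


def consensus(aligned):
    """Return the most common symbol at each column of aligned sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned)
    )


aligned = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTTC"]
result = consensus(aligned)
print(result)  # ACGTAC
```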
  • a consensus sequence is often “assembled” from shorter sequence reads.
  • sequence assembly refers to generating nucleotide sequence information from shorter sequences, e.g., experimentally acquired sequence reads. Sequence assembly can generally be divided into two broad categories: de novo assembly and reference genome mapping assembly. In de novo assembly, sequence reads are assembled together so that they form a new and previously unknown sequence. In reference genome “mapping”, sequence reads are assembled against an existing “reference sequence” to build a sequence that is similar to but not necessarily identical to the reference sequence.
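De novo assembly of offset reads can be sketched with a greedy suffix/prefix overlap extension. This is a toy illustration, not an assembler described herein: it assumes error-free reads, that the first read in the list is the leftmost one, and a target without repeats longer than the overlap. The names `overlap`, `greedy_extend`, and the example target are hypothetical.

```python
def overlap(left, right, min_len):
    """Length of the longest suffix of left equal to a prefix of right.

    Returns 0 when no overlap of at least min_len exists.
    """
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def greedy_extend(reads, min_len=3):
    """Extend a contig rightward from reads[0] using the best-overlapping read."""
    contig = reads[0]
    unused = list(reads[1:])
    while unused:
        best = max(unused, key=lambda read: overlap(contig, read, min_len))
        k = overlap(contig, best, min_len)
        if k == 0:
            break  # no remaining read overlaps the contig end
        contig += best[k:]
        unused.remove(best)
    return contig


target = "ATGGCGTACCTTAGGCAAT"  # hypothetical target
tiles = [target[i:i + 8] for i in range(len(target) - 7)]  # 8-base reads, offset 1
shuffled = [tiles[0]] + tiles[:0:-1]  # leftmost read first, the rest reversed
assembled = greedy_extend(shuffled)
print(assembled == target)  # True
```

Reference mapping differs in that each read is placed against a known reference sequence rather than against the other reads.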
  • sequencing run refers to any step or portion of a sequencing experiment performed to determine some information relating to at least one biomolecule (e.g., nucleic acid molecule).
  • dNTP refers to a deoxynucleotide triphosphate, where the nucleotide comprises a nucleotide base, such as A, T, C, G or U.
  • the term “monomer” as used herein means any compound that can be incorporated into a growing molecular chain by a given polymerase.
  • Such monomers include, without limitations, naturally occurring nucleotides (e.g., ATP, GTP, TTP, UTP, CTP, dATP, dGTP, dTTP, dUTP, dCTP, synthetic analogs), precursors for each nucleotide, non-naturally occurring nucleotides and their precursors or any other molecule that can be incorporated into a growing polymer chain by a given polymerase.
  • complementary generally refers to specific nucleotide duplexing to form canonical Watson-Crick base pairs, as is understood by those skilled in the art. However, complementary also includes base-pairing of nucleotide analogs that are capable of universal base-pairing with A, T, G or C nucleotides and locked nucleic acids that enhance the thermal stability of duplexes.
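As a minimal sketch of the canonical Watson-Crick case only, complementarity between two DNA strands can be checked by comparing one strand with the reverse complement of the other. The helper names below are hypothetical, and universal base analogs and locked nucleic acids mentioned above are deliberately not modeled.

```python
# Translation table for canonical DNA base pairs (A-T, C-G), preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_complementary(strand_a, strand_b):
    """True if strand_b (5' to 3') pairs with strand_a (5' to 3') at every position."""
    return reverse_complement(strand_a).upper() == strand_b.upper()
```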
  • hybridization stringency is a determinant in the degree of match or mismatch in the duplex formed by hybridization.
  • a “polymerase” is an enzyme generally for joining 3′-OH 5′-triphosphate nucleotides, oligomers, and their analogs.
  • Polymerases include, but are not limited to, DNA-dependent DNA polymerases, DNA-dependent RNA polymerases, RNA-dependent DNA polymerases, RNA-dependent RNA polymerases, T7 DNA polymerase, T3 DNA polymerase, T4 DNA polymerase, T7 RNA polymerase, T3 RNA polymerase, SP6 RNA polymerase, DNA polymerase I, Klenow fragment, Thermus aquaticus (Taq) DNA polymerase, Thermus thermophilus (Tth) DNA polymerase, Vent DNA polymerase (New England Biolabs), Deep Vent DNA polymerase (New England Biolabs), Bacillus stearothermophilus (Bst) DNA polymerase, DNA Polymerase Large Fragment, Stoffel Fragment, 9° N DNA Polymerase, 9° N m
  • polymerases include wild-type, mutant isoforms, and genetically engineered variants such as exo− polymerases; polymerases with minimized, undetectable, and/or decreased 3′→5′ proofreading exonuclease activity, and other mutants, e.g., that tolerate labeled nucleotides and incorporate them into a strand of nucleic acid.
  • the polymerase is designed for use, e.g., in real-time PCR, high fidelity PCR, next-generation DNA sequencing, fast PCR, hot start PCR, crude sample PCR, robust PCR, and/or molecular diagnostics.
  • Such enzymes are available from many commercial suppliers, e.g., Kapa Enzymes, Finnzymes, Promega, Invitrogen, Life Technologies, Thermo Scientific, Qiagen, Roche, etc.
  • primer refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced, (e.g., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH).
  • the primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
  • the primer is an oligodeoxyribonucleotide.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
  • a “system” denotes a set of components, real or abstract, comprising a whole where each component interacts with or is related to at least one other component within the whole.
  • index shall generally mean a distinctive or identifying mark or characteristic.
  • An index is a short nucleotide sequence used as a “barcode” to identify a longer nucleotide comprising the barcode and other sequence.
  • phase refers to the unique content of the two chromosomes inherited from each parent and/or separating maternally and paternally derived sequence information present on a nucleic acid (e.g., a chromosome).
  • haplotype phasing information describes which nucleotides (e.g., a SNP), regions, portions, or fragments originated from each of the parental chromosomes (or are associated with a specific minor viral quasi-species).
  • a “Sanger ladder”, “DNA ladder”, “fragment ladder”, or “ladder” refers to a library of nucleic acids (e.g., DNA) that each differ in length by a small number of bases, e.g., one to five bases and in some preferred embodiments by one base.
  • the nucleic acids in the ladder have 5′ ends that correspond to the same nucleotide position (or fall within a small range of nucleotide positions, e.g., 1-10 nucleotide positions) in the template from which they were made and have different 3′ ends that correspond to a range of nucleotide positions in the template from which they were made.
  • the technology provided herein provides methods and compositions to create short overlapping DNA fragments that together span a larger DNA fragment.
  • the short DNA fragments compose a population of DNA fragments having a range of sizes that increase in size from one fragment to the next larger fragment by, for example, 1 to 20 base pairs, 1 to 10 base pairs, or 1 to 5 base pairs, preferably by 1 base pair (e.g., as in the case of fragments generated by Sanger sequencing).
  • a short nucleic acid having a universal sequence is appended to the 3′ ends of each fragment (i.e., the end of the fragment where the ladder is generated). Subsequently, the fragments are sequenced using a sequencing primer complementary to the universal sequence.
  • the sequences generated have a range of 5′ (first) bases corresponding to bases distributed along the length of the larger DNA from the first base attached to the universal sequence up to 500 bases or more.
  • the sequences generated have a range of 5′ (first) bases corresponding to each base distributed along the length of the larger DNA.
  • a target nucleic acid is amplified using one or more target specific primers (see, e.g., FIG. 2A , step i).
  • the target nucleic acid may be a DNA or an RNA, e.g., a genomic DNA; mRNA; a cosmid, fosmid, or bacterial artificial chromosome (e.g., comprising an insert), a gene, a plasmid, etc.
  • an RNA is first reverse transcribed to produce a DNA.
  • Amplification may be PCR, limited cycle PCR, isothermal PCR, amplification with Phi29 or Bst enzymes, etc., e.g., as shown in FIG. 2A .
  • the target specific primers include both a universal sequence (e.g., universal sequence A) and a uniquely identifying index sequence (e.g., a barcode sequence; see FIG. 2A , “NNNNN” barcode sequence) that allows tracking and/or identifying the target nucleic acid from which the amplified product (amplicon) was produced.
  • barcode sequences may consist of 1 to 10 or more nucleotides.
  • a 10-base barcode sequence provides 1,048,576 (4^10) combinations of uniquely identifiable target-specific primer molecules. Consequently, with an appropriately designed barcode length, a starting material containing a small to a very large number of target DNA fragments can be reliably tagged and indexed without duplicate tagging with the same barcode sequence.
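The barcode arithmetic above can be checked directly, and the choice of an "appropriately designed barcode length" can be explored with a rough estimate. The sketch below assumes, purely for illustration, that each template molecule receives a barcode drawn uniformly at random from all 4^n sequences; the function names are hypothetical.

```python
def barcode_count(length, alphabet_size=4):
    """Number of distinct barcodes of the given length over A, C, G, T."""
    return alphabet_size ** length


def expected_unique_fraction(n_templates, length):
    """Expected fraction of templates whose barcode is shared with no other template.

    Assumes barcodes are assigned independently and uniformly at random,
    which is a simplification of real primer pools.
    """
    n_barcodes = barcode_count(length)
    return (1 - 1 / n_barcodes) ** (n_templates - 1)
```

Under this assumption, `barcode_count(10)` reproduces the 1,048,576 figure, and `expected_unique_fraction` shows how the chance of duplicate tagging grows as the number of templates approaches the number of available barcodes.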
  • a next step comprises ligating the uniquely barcoded individual amplicons at their 3′ ends to an adaptor oligonucleotide approximately 10 to 80 bases in length and comprising a second universal sequence (e.g., universal sequence B) (see, e.g., FIG. 2A , step ii).
  • the adaptor-amplicon nucleic acids are self-ligated (e.g., circularized) to form a circular template (see, e.g., FIG. 2A , step iii).
  • the circularization brings the universal sequence at the 3′ end adjacent to the barcode sequence at the 5′ end. Intramolecular ligation may be effected using a ligase.
  • CircLigase II (Epicentre) is a thermostable single-stranded DNA ligase that catalyzes intramolecular ligation of single-stranded DNA templates having a 5′ phosphate and a 3′ hydroxyl group.
  • the 3′-O-blocked dNTP analog is a 3′-O-alkynyl nucleotide analog (e.g., an alkyl, having a saturated position (sp3-hybridized) on a molecular framework next to an alkynyl group, and substituted variants thereof).
  • the 3′-O-blocked dNTP analog is a 3′-O-propargyl nucleotide analog having a structure as shown below:
  • B is the base of the nucleotide (e.g., adenine, guanine, thymine, cytosine, or a natural or synthetic nucleobase, e.g., a modified purine such as hypoxanthine, xanthine, 7-methylguanine; a modified pyrimidine such as 5,6-dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine; etc.).
  • Other alkynyl groups are contemplated by the technology and find use in the technology, e.g.,
  • the nucleotide analog comprises a reversible terminator that comprises a blocking group that can be removed to unblock the nucleotide.
  • the nucleotide analog comprises a functional terminator, e.g., that provides a particular desired reactivity for subsequent steps.
  • the nucleotide analogs result in the production of a fragment ladder having fragments over a range of sizes.
  • the fragments have lengths ranging from ~100 bp to ~700 or 800 bp; furthermore, in some embodiments lengths greater than 1000 bp are achieved by adjusting the ratio of dNTPs and 3′-O-blocked dNTP analogs in the reaction mixture.
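To see why the ratio of dNTPs to 3′-O-blocked dNTP analogs influences fragment length, consider a deliberately simplified model: at each template position the polymerase incorporates a terminating analog with a fixed probability equal to the analog fraction, independent of base identity or incorporation efficiency. This model, the function names, and the parameters are assumptions for illustration only and are not taken from the described methods.

```python
import random


def simulate_ladder(template_length, terminator_fraction, n_molecules, seed=0):
    """Simulate extension product lengths under a per-position termination probability."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_molecules):
        length = 0
        for _ in range(template_length):
            length += 1
            # Incorporating a blocked analog ends extension at this position.
            if rng.random() < terminator_fraction:
                break
        # Molecules that never terminate run off at the full template length.
        lengths.append(length)
    return lengths


def expected_length(terminator_fraction):
    """Mean product length for a long template (geometric distribution mean)."""
    return 1.0 / terminator_fraction
```

In this model, lowering the analog fraction from 1% to 0.1% shifts the mean product length from about 100 bases to about 1000 bases, consistent in direction with the statement that longer fragments are obtained by adjusting the ratio.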
  • ddNTP dideoxynucleotide sequencing technologies
  • Sanger-type sequencing chemistries are not appropriate for this step in these embodiments because the lack of a 3′-OH group in the terminating ddNTP creates a non-reactive terminal 3′ end that cannot accept the ligation of the second adaptor oligonucleotide in the subsequent step.
  • a second adaptor oligonucleotide comprising a universal sequence (e.g., universal sequence C) is ligated (enzymatically or chemically) to the 3′ ends of the fragments of the nucleic acid fragment ladder to produce an NGS library (see, e.g., FIG. 2A, step v).
  • limited cycle PCR or another amplification method is performed to amplify the final product.
  • the methods find use in acquiring short sequences, e.g., of ~120-200 bp. Such embodiments find use, e.g., in assessing cancer genes, e.g., to assess mutations of a cancer panel. In some embodiments, the technology finds use in acquiring sequences of 500 bp, 1000 bp, or more.
  • a target nucleic acid is amplified using one or more target specific primers (see, e.g., FIG. 2B , step i).
  • the target nucleic acid may be a DNA or an RNA, e.g., a genomic DNA; mRNA; a cosmid, fosmid, or bacterial artificial chromosome (e.g., comprising an insert), a gene, a plasmid, etc.
  • an RNA is first reverse transcribed to produce a DNA.
  • Amplification may be PCR, limited cycle PCR, isothermal PCR, amplification with Phi29 or Bst enzymes, etc., e.g., as shown in FIG. 2B .
  • the target specific primers include both a universal sequence (e.g., universal sequence A) and a uniquely identifying index sequence (e.g., a barcode sequence; see FIG. 2B , “NNNNN” barcode sequence) that allows tracking and/or identifying the target nucleic acid from which the amplified product (amplicon) was produced.
  • barcode sequences may consist of 1 to 10 or more nucleotides.
  • a 10-base barcode sequence provides 1,048,576 (4^10) combinations of uniquely identifiable target-specific primer molecules. Consequently, with an appropriately designed barcode length, a starting material containing a small to a very large number of target DNA fragments can be reliably tagged and indexed without duplicate tagging with the same barcode sequence.
  • a next step comprises ligating the uniquely barcoded individual amplicons at their 3′ ends to an adaptor oligonucleotide approximately 10 to 80 bases in length and comprising a second universal sequence (e.g., universal sequence B) (see, e.g., FIG. 2B , step ii).
  • the adaptor-amplicon nucleic acids are self-ligated (e.g., circularized) to form a circular template (see, e.g., FIG. 2B , step iii).
  • the circularization brings the universal sequence at the 3′ end adjacent to the barcode sequence at the 5′ end. Intramolecular ligation may be effected using a ligase.
  • CircLigase II (Epicentre) is a thermostable single-stranded DNA ligase that catalyzes intramolecular ligation of single-stranded DNA templates having a 5′ phosphate and a 3′ hydroxyl group.
  • a Sanger fragment-like DNA ladder is generated by a polymerase reaction using a primer complementary to universal sequence A and a mix of dNTPs and 3′-O-blocked dNTP analogs as described herein (see, e.g., FIG. 2B , step iv).
  • the 3′-O-blocked dNTP analog is a 3′-O-alkynyl nucleotide analog (e.g., an alkyl, having a saturated position (sp3-hybridized) on a molecular framework next to an alkynyl group, and substituted variants thereof).
  • the 3′-O-blocked dNTP analog is a 3′-O-propargyl nucleotide analog having a structure as shown below:
  • B is the base of the nucleotide (e.g., adenine, guanine, thymine, cytosine, or a natural or synthetic nucleobase, e.g., a modified purine such as hypoxanthine, xanthine, 7-methylguanine; a modified pyrimidine such as 5,6-dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine; etc.).
  • Other alkynyl groups are contemplated by the technology and find use in the technology, e.g.,
  • the nucleotide analog comprises a reversible terminator that comprises a blocking group that can be removed to unblock the nucleotide.
  • the nucleotide analog comprises a functional terminator, e.g., that provides a particular desired reactivity for subsequent steps. The nucleotide analogs result in the production of a fragment ladder having fragments over a range of sizes.
  • the fragments have lengths ranging from ~100 bp to ~700 or 800 bp; furthermore, in some embodiments, sequence lengths greater than 1000 bp to greater than 10,000 bp are achieved, e.g., by adjusting the ratio of dNTPs and 3′-O-blocked dNTP analogs in the reaction mixture.
  • ddNTP dideoxynucleotide sequencing technologies
  • Sanger-type sequencing chemistries are not appropriate for this step in these embodiments because the lack of a 3′-OH group in the terminating ddNTP creates a non-reactive terminal 3′ end that cannot accept the ligation of the second adaptor oligonucleotide in the subsequent step.
  • nucleic acid fragment ladder is circularized to form a nucleic acid circle library (see, e.g., FIG. 2B , step v).
  • a second adaptor oligonucleotide (e.g., comprising a universal sequence, e.g., universal sequence C) is ligated (enzymatically or chemically) to the 3′ ends of the digestion products of the nucleic acid circle library to produce an NGS library.
  • limited cycle PCR or another amplification method is performed to amplify the final product. Without being limited to any particular method or length of time to perform any steps of the methods provided, in some embodiments the methods described take from ~6 (e.g., ~6.5) hours to ~9 (e.g., ~8.5) hours to complete.
  • the fragments comprise a 3′ alkyne.
  • the second adaptor oligonucleotide comprising a universal sequence comprises a 5′ azide (N₃) group that is reactable with the fragment 3′ alkyne group.
  • a “click chemistry” process such as an azide-alkyne cycloaddition is used to link the adaptor to the fragment via formation of a triazole:
  • the triazole ring linkage has a structure according to:
  • the triazole ring linkage formed by the alkyne-azide cycloaddition has similar characteristics (e.g., physical, biological, chemical characteristics) as a natural phosphodiester bond present in nucleic acids and therefore is a nucleic acid backbone mimic. Consequently, conventional enzymes that recognize natural nucleic acids as substrates also recognize as substrates the products formed by alkyne-azide cycloaddition as provided by the technology described herein. See, e.g., El-Sagheer, et al. (2011) “Biocompatible artificial DNA linker that is read through by DNA polymerases and is functional in Escherichia coli” Proc Natl Acad Sci USA 108(28): 11338-43, which is incorporated herein by reference in its entirety.
  • the final NGS fragment library is then used as the input to a NGS system for sequencing.
  • ~20 to 50 bases of DNA adjacent to the adaptor comprising universal sequence C are sequenced (corresponding to ~20 to 50 bases of the target nucleic acid) and the barcode adjacent to the adaptor comprising universal sequence B is sequenced (see, e.g., FIG. 3).
  • the sequence reads are parsed into bins by the barcode sequences to collect sequence reads that originated from a template molecule tagged with that particular unique barcode sequence (see, e.g., FIG. 3 ).
  • the sequence reads in each bin are aligned to each other and assembled to construct a longer contiguous consensus sequence with phase information intact. This sequence can be aligned to an appropriate reference sequence for downstream sequence analysis.
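The binning step described above can be sketched as a simple grouping operation. The sketch assumes, for illustration, that each sequencing record has already been split into a `(barcode, target_read)` pair; that record layout and the function name are hypothetical, and the subsequent alignment and assembly of each bin into a consensus sequence is left to dedicated assembly software.

```python
from collections import defaultdict


def bin_reads_by_barcode(read_pairs):
    """Group target reads by the barcode of the template molecule they came from.

    read_pairs: iterable of (barcode, target_read) tuples (assumed record layout).
    Returns a dict mapping each barcode to the list of reads carrying it.
    """
    bins = defaultdict(list)
    for barcode, target_read in read_pairs:
        bins[barcode].append(target_read)
    return dict(bins)
```

Because every read in a bin derives from one tagged template molecule, assembling each bin separately is what keeps the phase information intact.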
  • nucleic acid sequencing platforms e.g., computer software and/or hardware
  • nucleic acid assembly e.g., nucleic acid assembly
  • nucleic acid mapping systems e.g., computer software and/or hardware
  • the techniques of “paired-end”, “mate-pair”, and other assembly-related sequencing are generally known in the art of molecular biology (Siegel A. F. et al., Genomics 2000, 68: 237-246; Roach J. C. et al., Genomics 1995, 26: 345-353). These sequencing techniques allow the determination of multiple “reads” of sequence, each from a different place on a single polynucleotide.
  • the distance between the reads or other information regarding a relationship between the reads is known.
  • these sequencing techniques provide more information than does sequencing multiple stretches of nucleic acid sequences in a random fashion.
  • appropriate software tools for the assembly of sequence information e.g., Millikin S. C. et al., Genome Res. 2003, 13: 81-90; Kent, W. J. et al., Genome Res. 2001, 11: 1541-8
  • a nucleotide analog finds use as a functional nucleotide terminator (e.g., in embodiments of compositions, methods, kits, and systems described herein).
  • a functional nucleotide terminator both terminates polymerization of a nucleic acid, e.g., by blocking the 3′ hydroxyl from participating further in the polymerization reaction, and comprises a functional reactive group that can participate in other chemical reactions with other chemical moieties and groups.
  • nucleotide analog comprising an alkynyl group finds use in some embodiments, e.g., having a structure according to:
  • B is a base, e.g., adenine, guanine, cytosine, thymine, or uracil, e.g., having a structure according to:
  • P comprises a phosphate moiety, e.g., to provide a nucleotide having a structure according to:
  • P comprises a tetraphosphate; a triphosphate; a diphosphate; a monophosphate; a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate.
  • P comprises an azide (e.g., N₃, e.g., N=N=N), thus providing, in some embodiments, a directional, bi-functional polymerization agent.
  • the technology comprises use of a nucleotide analog as described in co-pending U.S. Pat. Appl. Ser. No. 61/867,202, which is incorporated herein by reference in its entirety.
  • a propargyl nucleotide analog is a nucleotide analog comprising a base (e.g., adenine, guanine, cytosine, thymine, or uracil), a deoxyribose, and an alkyne chemical moiety attached to the 3′-oxygen of the deoxyribose.
  • Chemical ligation between the polymerase extension products and appropriate conjugation partners e.g., azide modified molecules
  • the 3′ hydroxyl group of the nucleotide analog is capped by a chemical moiety, e.g., an alkyne (e.g., a carbon-carbon triple bond), that halts further elongation of the nucleic acid (e.g., DNA, RNA) chain when incorporated by polymerase (e.g., DNA or RNA polymerase).
  • the alkyne chemical moiety is a well-known conjugation partner of an azide (N₃) group, e.g., in a copper (I)-catalyzed 1,3-dipolar cycloaddition reaction (e.g., a “click chemistry” reaction).
  • the triazole ring linkage, in certain positional arrangements, has characteristics that are similar to a natural phosphodiester bond as found in a conventional nucleic acid backbone and therefore the triazole link is a nucleic acid backbone mimic.
  • use of 3′-O-propargyl-dNTPs creates nucleic acid fragments that have a terminal 3′-O-alkyne group.
  • these nucleic acid fragments can then be chemically ligated using click chemistry to any azide-modified molecules, such as 5′-azide-modified oligonucleotides (e.g., such as adaptors as provided herein or a solid support).
  • the triazole chemical bond is compatible with typical reactions and enzymes used for biochemistry and molecular biology and, as such, does not inhibit enzymatic reactions.
  • the chemically ligated nucleic acid fragments can then be used in subsequent enzymatic reactions, such as a polymerase chain reaction, a sequencing reaction, etc.
  • the nucleotide analog comprises a reversible terminator.
  • the 3′ hydroxyl groups are capped with a chemical moiety that can be removed with a specific chemical reaction, thus regenerating a free 3′ hydroxyl.
  • some embodiments comprise a reaction to remove the reversible terminator and, in some embodiments, an additional purification step to remove the free capping (terminator) moiety.
  • a nucleotide comprising a reversible terminator is as described in U.S. Pat. Appl. Ser. No. 61/791,730, incorporated herein by reference in its entirety.
  • Methods of the technology involve attaching an adaptor to a nucleic acid (e.g., an amplicon or a ladder fragment as described herein).
  • the adaptors are attached to a nucleic acid with an enzyme.
  • the enzyme may be a ligase or a polymerase.
  • the ligase may be any enzyme capable of ligating an oligonucleotide (single stranded RNA, double stranded RNA, single stranded DNA, or double stranded DNA) to another nucleic acid molecule.
  • Suitable ligases include T4 DNA ligase and T4 RNA ligase (such ligases are available commercially, e.g., from New England Biolabs).
  • the ligation may be blunt ended or via use of complementary over hanging ends.
  • the ends of nucleic acids may be phosphorylated (e.g., using T4 polynucleotide kinase), repaired, trimmed (e.g. using an exonuclease), or filled (e.g., using a polymerase and dNTPs), to form blunt ends.
  • the ends may be treated with a polymerase and dATP to form a template independent addition to the 3′ end of the fragments, thus producing a single A overhanging.
  • the polymerase may be any enzyme capable of adding nucleotides to the 3′ and the 5′ terminus of template nucleic acid molecules.
  • an adaptor comprises a functional moiety for chemical ligation to a nucleotide analog.
  • an adaptor comprises an azide group (e.g., at the 5′ end) that is reactive with an alkynyl group (e.g., a propargyl group, e.g., at the 3′ end of a nucleic acid comprising the nucleotide analog), e.g., by a click chemistry reaction (e.g., using a copper-based catalyst reagent).
  • the adaptors comprise a universal sequence and/or an index, e.g., a barcode nucleotide sequence.
  • adaptors can contain one or more of a variety of sequence elements, including but not limited to, one or more amplification primer annealing sequences or complements thereof, one or more sequencing primer annealing sequences or complements thereof, one or more barcode sequences, one or more common sequences shared among multiple different adaptors or subsets of different adaptors (e.g., a universal sequence), one or more restriction enzyme recognition sites, one or more overhangs complementary to one or more target polynucleotide overhangs, one or more probe binding sites (e.g.
  • a sequencing platform such as a flow cell for massive parallel sequencing, such as developed by Illumina, Inc.
  • random or near-random sequences (e.g. one or more nucleotides selected at random from a set of two or more different nucleotides at one or more positions, with each of the different nucleotides selected at one or more positions represented in a pool of adaptors comprising the random sequence)
  • Two or more sequence elements can be non-adjacent to one another (e.g. separated by one or more nucleotides), adjacent to one another, partially overlapping, or completely overlapping.
  • an amplification primer annealing sequence can also serve as a sequencing primer annealing sequence.
  • Sequence elements can be located at or near the 3′ end, at or near the 5′ end, or in the interior of the adaptor oligonucleotide.
  • sequence elements can be located partially or completely outside the secondary structure, partially or completely inside the secondary structure, or in between sequences participating in the secondary structure.
  • sequence elements can be located partially or completely inside or outside the hybridizable sequences (the “stem”), including in the sequence between the hybridizable sequences (the “loop”).
  • the first adaptor oligonucleotides in a plurality of first adaptor oligonucleotides having different barcode sequences comprise a sequence element common among all first adaptor oligonucleotides in the plurality.
  • all second adaptor oligonucleotides comprise a sequence element common among all second adaptor oligonucleotides that is different from the common sequence element shared by the first adaptor oligonucleotides.
  • a difference in sequence elements can be any such that at least a portion of different adaptors do not completely align, for example, due to changes in sequence length, deletion or insertion of one or more nucleotides, or a change in the nucleotide composition at one or more nucleotide positions (such as a base change or base modification).
  • an adaptor oligonucleotide comprises a 5′ overhang, a 3′ overhang, or both that is complementary to one or more target polynucleotides.
  • Complementary overhangs can be one or more nucleotides in length, including but not limited to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more nucleotides in length.
  • Complementary overhangs may comprise a fixed sequence.
  • Complementary overhangs may comprise a random sequence of one or more nucleotides, such that one or more nucleotides are selected at random from a set of two or more different nucleotides at one or more positions, with each of the different nucleotides selected at one or more positions represented in a pool of adaptors with complementary overhangs comprising the random sequence.
  • an adaptor overhang is complementary to a target polynucleotide overhang produced by restriction endonuclease digestion.
  • an adaptor overhang consists of an adenine or a thymine.
  • the adaptor sequences can contain a molecular binding site identification element to facilitate identification and isolation of the target nucleic acid for downstream applications.
  • Molecular binding as an affinity mechanism allows for the interaction between two molecules to result in a stable association complex.
  • Molecules that can participate in molecular binding reactions include proteins, nucleic acids, carbohydrates, lipids, and small organic molecules such as ligands, peptides, or drugs.
  • When a nucleic acid molecular binding site is used as part of the adaptor, it can be used to employ selective hybridization to isolate a target sequence. Selective hybridization may restrict substantial hybridization to target nucleic acids containing the adaptor with the molecular binding site and capture nucleic acids, which are sufficiently complementary to the molecular binding site. Thus, through “selective hybridization” one can detect the presence of the target polynucleotide in an impure sample containing a pool of many nucleic acids.
  • An example of a nucleotide-nucleotide selective hybridization isolation system comprises a system with several capture nucleotides, which are complementary sequences to the molecular binding identification elements, and are optionally immobilized to a solid support.
  • the capture polynucleotides could be complementary to the target sequence itself or to a barcode or unique tag contained within the adaptor.
  • the capture polynucleotides can be immobilized to various solid supports, such as inside of a well of a plate, mono-dispersed spheres, microarrays, or any other suitable support surface known in the art.
  • the hybridized complementary adaptor polynucleotides attached on the solid support can be isolated by washing away the undesirable non-binding nucleic acids, leaving the desirable target polynucleotides behind.
  • spheres can then be mixed in a tube together with the target polynucleotide containing the adaptors.
  • undesirable molecules can be washed away while spheres are kept in the tube with a magnet or similar agent.
  • the desired target molecules can be subsequently released by increasing the temperature, changing the pH, or by using any other suitable elution method known in the art.
  • a barcode is a known nucleic acid sequence that allows some feature of a nucleic acid with which the barcode is associated to be identified.
  • the feature of the nucleic acid to be identified is the sample or source from which the nucleic acid is derived.
  • the barcode sequence generally includes certain features that make the sequence useful in sequencing reactions.
  • the barcode sequences are designed to have minimal or no homopolymer regions, e.g., 2 or more of the same base in a row such as AA or CCC, within the barcode sequence.
  • the barcode sequences are also designed so that they are at least one edit distance away from the base addition order when performing base-by-base sequencing, ensuring that the first and last bases do not match the expected bases of the sequence.
  • the barcode sequences are designed such that each sequence is correlated to a particular target nucleic acid, allowing the short sequence reads to be correlated back to the target nucleic acid from which they came. Methods of designing sets of barcode sequences are shown, for example, in U.S. Pat. No. 6,235,475, the contents of which are incorporated by reference herein in their entirety.
  • the barcode sequences range from about 5 nucleotides to about 15 nucleotides. In a particular embodiment, the barcode sequences range from about 4 nucleotides to about 7 nucleotides.
  • because the barcode sequences are sequenced along with the ladder fragment nucleic acid, in embodiments using longer sequences the barcode is kept to a minimal length so as to permit the longest read from the fragment nucleic acid attached to the barcode.
  • the barcode sequences are spaced from the fragment nucleic acid molecule by at least one base, e.g., to minimize homopolymeric combinations.
  • lengths and sequences of barcode sequences are designed to achieve a desired level of accuracy of determining the identity of nucleic acid.
  • barcode sequences are designed such that after a tolerable number of point mutations, the identity of the associated nucleic acid can still be deduced with a desired accuracy.
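The design constraints above (no homopolymer runs, tolerance to a small number of point mutations) can be illustrated with a minimal sketch. The function names, the greedy selection strategy, and the use of Hamming distance as the error model are assumptions made for illustration; the source does not prescribe a particular design algorithm.

```python
from itertools import product
from typing import List, Optional


def has_homopolymer(seq: str, min_run: int = 2) -> bool:
    """Return True if seq contains min_run or more identical bases in a row."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run >= min_run:
            return True
    return False


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def design_barcodes(length: int, min_distance: int, count: int) -> List[str]:
    """Greedily pick homopolymer-free barcodes that are pairwise at least
    min_distance apart (hypothetical helper, not the patent's method)."""
    chosen: List[str] = []
    for bases in product("ACGT", repeat=length):
        candidate = "".join(bases)
        if has_homopolymer(candidate):
            continue
        if all(hamming(candidate, c) >= min_distance for c in chosen):
            chosen.append(candidate)
            if len(chosen) == count:
                break
    return chosen


def assign_barcode(observed: str, barcodes: List[str],
                   max_mismatches: int) -> Optional[str]:
    """Return the single barcode within max_mismatches of observed, else None."""
    hits = [b for b in barcodes if hamming(observed, b) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None
```

With a minimum pairwise distance of 3, any observed barcode carrying one point mutation is still closer to its true barcode than to any other, so `assign_barcode(observed, barcodes, 1)` can recover the sample identity unambiguously.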
  • a Tn-5 transposase (commercially available from Epicentre Biotechnologies; Madison, Wis.) cuts a nucleic acid into fragments and inserts short pieces of DNA into the cuts. The short pieces of DNA are used to incorporate the barcode sequences.
  • a single barcode is attached to each fragment.
  • a plurality of barcodes, e.g., two barcodes, are attached to each fragment.
  • nucleic acid template molecules are isolated from a biological sample containing a variety of other components, such as proteins, lipids, and non-template nucleic acids.
  • Nucleic acid template molecules can be obtained from any material (e.g., cellular material (live or dead), extracellular material, viral material, environmental samples (e.g., metagenomic samples), synthetic material (e.g., amplicons such as provided by PCR or other amplification technologies)), obtained from an animal, plant, bacterium, archaeon, fungus, or any other organism.
  • Biological samples for use in the present invention include viral particles or preparations thereof.
  • Nucleic acid template molecules can be obtained directly from an organism or from a biological sample obtained from an organism, e.g., from blood, urine, cerebrospinal fluid, seminal fluid, saliva, sputum, stool, hair, sweat, tears, skin, and tissue.
  • Exemplary samples include, but are not limited to, whole blood, lymphatic fluid, serum, plasma, buccal cells, sweat, tears, saliva, sputum, hair, skin, biopsy, cerebrospinal fluid (CSF), amniotic fluid, seminal fluid, vaginal excretions, serous fluid, synovial fluid, pericardial fluid, peritoneal fluid, pleural fluid, transudates, exudates, cystic fluid, bile, urine, gastric fluids, intestinal fluids, fecal samples, and swabs, aspirates (e.g., bone marrow, fine needle, etc.), washes (e.g., oral, nasopharyngeal, bronchial, bronchoalveolar, optic, rectal, intestinal, vaginal, epidermal, etc.), and/or other specimens.
  • any tissue or body fluid specimen may be used as a source for nucleic acid for use in the technology, including forensic specimens, archived specimens, preserved specimens, and/or specimens stored for long periods of time, e.g., fresh-frozen, methanol/acetic acid fixed, or formalin-fixed paraffin embedded (FFPE) specimens and samples.
  • Nucleic acid template molecules can also be isolated from cultured cells, such as a primary cell culture or a cell line. The cells or tissues from which template nucleic acids are obtained can be infected with a virus or other intracellular pathogen.
  • a sample can also be total RNA extracted from a biological specimen, a cDNA library, viral, or genomic DNA.
  • a sample may also be isolated DNA from a non-cellular origin, e.g. amplified/isolated DNA that has been stored in a freezer.
  • Nucleic acid template molecules can be obtained, e.g., by extraction from a biological sample, e.g., by a variety of techniques such as those described by Maniatis, et al. (1982) Molecular Cloning: A Laboratory Manual , Cold Spring Harbor, N.Y. (see, e.g., pp. 280-281).
  • size selection of the nucleic acids is performed to remove very short fragments or very long fragments. Suitable methods to select a size are known in the art. In various embodiments, the size is limited to be 0.5, 1, 2, 3, 4, 5, 7, 10, 12, 15, 20, 25, 30, 50, 100 kb or longer.
  • a nucleic acid is amplified. Any amplification method known in the art may be used. Examples of amplification techniques that can be used include, but are not limited to, PCR, quantitative PCR, quantitative fluorescent PCR (QF-PCR), multiplex fluorescent PCR (MF-PCR), real time PCR (RT-PCR), single cell PCR, restriction fragment length polymorphism PCR (PCR-RFLP), hot start PCR, nested PCR, in situ polony PCR, in situ rolling circle amplification (RCA), bridge PCR, picotiter PCR, and emulsion PCR.
  • Further examples include the ligase chain reaction (LCR), transcription amplification, self-sustained sequence replication, selective amplification of target polynucleotide sequences, consensus sequence primed polymerase chain reaction (CP-PCR), arbitrarily primed polymerase chain reaction (AP-PCR), degenerate oligonucleotide-primed PCR (DOP-PCR), and nucleic acid based sequence amplification (NABSA).
  • Other amplification methods that can be used herein include those described in U.S. Pat. Nos. 5,242,794; 5,494,810; 4,988,617; and 6,582,938.
  • end repair is performed to generate blunt end 5′ phosphorylated nucleic acid ends using commercial kits, such as those available from Epicentre Biotechnologies (Madison, Wis.).
  • nucleic acid sequence data are generated.
  • nucleic acid sequencing platforms e.g., a nucleic acid sequencer
  • a sequencing instrument includes a fluidic delivery and control unit, a sample processing unit, a signal detection unit, and a data acquisition, analysis and control unit.
  • Various embodiments of the instrument provide for automated sequencing that is used to gather sequence information from a plurality of sequences in parallel and/or substantially simultaneously.
  • the fluidics delivery and control unit includes a reagent delivery system.
  • the reagent delivery system includes a reagent reservoir for the storage of various reagents.
  • the reagents can include RNA-based primers, forward/reverse DNA primers, nucleotide mixtures (e.g., compositions comprising nucleotide analogs as provided herein) for sequencing-by-synthesis, buffers, wash reagents, blocking reagents, stripping reagents, and the like.
  • the reagent delivery system can include a pipetting system or a continuous flow system that connects the sample processing unit with the reagent reservoir.
  • the sample processing unit includes a sample chamber, such as a flow cell, a substrate, a micro-array, a multi-well tray, or the like.
  • the sample processing unit can include multiple lanes, multiple channels, multiple wells, or other means of processing multiple sample sets substantially simultaneously.
  • the sample processing unit can include multiple sample chambers to enable processing of multiple runs simultaneously.
  • the system can perform signal detection on one sample chamber while substantially simultaneously processing another sample chamber.
  • the sample processing unit can include an automation system for moving or manipulating the sample chamber.
  • the signal detection unit can include an imaging or detection sensor.
  • the imaging or detection sensor can include a CCD, a CMOS, an ion sensor, such as an ion sensitive layer overlying a CMOS, a current detector, or the like.
  • the signal detection unit can include an excitation system to cause a probe, such as a fluorescent dye, to emit a signal.
  • the detection system can include an illumination source, such as an arc lamp, a laser, a light emitting diode (LED), or the like.
  • the signal detection unit includes optics for the transmission of light from an illumination source to the sample or from the sample to the imaging or detection sensor.
  • the signal detection unit may not include an illumination source, such as for example, when a signal is produced spontaneously as a result of a sequencing reaction.
  • a signal can be produced by the interaction of a released moiety, such as a released ion interacting with an ion sensitive layer, or a pyrophosphate reacting with an enzyme or other catalyst to produce a chemiluminescent signal.
  • changes in an electrical current, voltage, or resistance are detected without the need for an illumination source.
  • a data acquisition, analysis, and control unit monitors various system parameters.
  • the system parameters can include temperature of various portions of the instrument, such as the sample processing unit or reagent reservoirs, volumes of various reagents, the status of various system subcomponents, such as a manipulator, a stepper motor, a pump, or the like, or any combination thereof.
  • Sequencing by synthesis can include the incorporation of dye labeled nucleotides, chain termination, ion/proton sequencing, pyrophosphate sequencing, or the like.
  • Single molecule techniques can include staggered sequencing, where the sequencing reaction is paused to determine the identity of the incorporated nucleotide.
  • the sequencing instrument determines the sequence of a nucleic acid, such as a polynucleotide or an oligonucleotide.
  • the nucleic acid can include DNA or RNA, and can be single stranded, such as ssDNA and RNA, or double stranded, such as dsDNA or a RNA/cDNA pair.
  • the nucleic acid can include or be derived from a fragment library, a mate pair library, a ChIP fragment, or the like.
  • the sequencing instrument can obtain the sequence information from a single nucleic acid molecule or from a group of substantially identical nucleic acid molecules.
  • the sequencing instrument can output nucleic acid sequencing read data in a variety of different output data file types/formats, including, but not limited to: *.txt, *.fasta, *.csfasta, *seq.txt, *qseq.txt, *.fastq, *.sff, *prb.txt, *.sms, *srs, and/or *.qv.
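As one illustration of consuming such output, the sketch below reads records from a *.fastq file. It assumes the common four-line FASTQ layout (header starting with `@`, sequence, separator starting with `+`, quality string) and Phred+33 quality encoding; both are assumptions about the file, not statements from the source, and `parse_fastq` is a hypothetical helper name.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: List[int]


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-record FASTQ stream.

    Quality characters are decoded as Phred+33 (an assumption)."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near " + repr(header))
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in " + repr(header))
        yield FastqRecord(header[1:], seq, [ord(c) - 33 for c in qual])


# Usage with an in-memory example instead of a real file
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
records = list(parse_fastq(example))
```

Reading lazily with a generator keeps memory use flat, which matters because a single run can produce millions of reads.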
  • NGS next-generation sequencing
  • Amplification-requiring methods include pyrosequencing commercialized by Roche as the 454 technology platforms (e.g., GS 20 and GS FLX), the Solexa platform commercialized by Illumina, and the Supported Oligonucleotide Ligation and Detection (SOLiD) platform commercialized by Applied Biosystems.
  • Non-amplification approaches, also known as single-molecule sequencing, are exemplified by the HeliScope platform commercialized by Helicos BioSciences, and emerging platforms commercialized by VisiGen, Oxford Nanopore Technologies Ltd., Life Technologies/Ion Torrent, and Pacific Biosciences.
  • the NGS fragment library is clonally amplified in-situ by capturing single template molecules with beads bearing oligonucleotides complementary to the adaptors.
  • Each bead bearing a single template type is compartmentalized into a water-in-oil microvesicle, and the template is clonally amplified using a technique referred to as emulsion PCR.
  • the emulsion is disrupted after amplification and beads are deposited into individual wells of a picotitre plate functioning as a flow cell during the sequencing reactions. Ordered, iterative introduction of each of the four dNTP reagents occurs in the flow cell in the presence of sequencing enzymes and a luminescent reporter such as luciferase.
  • the resulting production of ATP causes a burst of luminescence within the well, which is recorded using a CCD camera. It is possible to achieve read lengths greater than or equal to 400 bases, and 10^6 sequence reads can be achieved, resulting in up to 500 million base pairs (Mb) of sequence.
  • sequencing data are produced in the form of shorter-length reads.
  • the fragments of the NGS fragment library are captured on the surface of a flow cell that is studded with oligonucleotide anchors.
  • the anchor is used as a PCR primer, but because of the length of the template and its proximity to other nearby anchor oligonucleotides, extension by PCR results in the “arching over” of the molecule to hybridize with an adjacent anchor oligonucleotide to form a bridge structure on the surface of the flow cell.
  • These loops of DNA are denatured and cleaved. Forward strands are then sequenced with reversible dye terminators.
  • the sequence of incorporated nucleotides is determined by detection of post-incorporation fluorescence, with each fluor and block removed prior to the next cycle of dNTP addition. Sequence read length ranges from 36 nucleotides to over 100 nucleotides, with overall output exceeding 1 billion nucleotide pairs per analytical run.
  • Sequencing nucleic acid molecules using SOLiD technology also involves clonal amplification of the NGS fragment library by emulsion PCR. Following this, beads bearing template are immobilized on a derivatized surface of a glass flow-cell, and a primer complementary to the adaptor oligonucleotide is annealed.
  • interrogation probes have 16 possible combinations of the two bases at the 3′ end of each probe, and one of four fluors at the 5′ end. Fluor color, and thus identity of each probe, corresponds to specified color-space coding schemes. Multiple rounds (usually 7) of probe annealing, ligation, and fluor detection are followed by denaturation, and then a second round of sequencing using a primer that is offset by one base relative to the initial primer. In this manner, the template sequence can be computationally re-constructed, and template bases are interrogated twice, resulting in increased accuracy. Sequence read length averages 35 nucleotides, and overall output exceeds 4 billion bases per sequencing run.
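The color-space idea can be sketched as follows. The sketch assumes the widely described two-base encoding in which each color is the XOR of the 2-bit codes of adjacent bases (A=0, C=1, G=2, T=3), so that the 16 dinucleotides map onto four colors. The source does not spell out its coding scheme, so this mapping and the function names are illustrative assumptions.

```python
from typing import List, Tuple

# Assumed 2-bit base codes; a color is the XOR of two adjacent base codes.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode_color_space(seq: str) -> Tuple[str, List[int]]:
    """Return the first base plus one color (0-3) per adjacent base pair."""
    colors = [BASE_TO_CODE[a] ^ BASE_TO_CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode_color_space(first_base: str, colors: List[int]) -> str:
    """Rebuild the base sequence from a known first base and the colors."""
    code = BASE_TO_CODE[first_base]
    bases = [first_base]
    for color in colors:
        code ^= color  # XOR with the color gives the next base code
        bases.append(CODE_TO_BASE[code])
    return "".join(bases)


first, colors = encode_color_space("ATGGC")   # colors == [3, 1, 0, 3]
restored = decode_color_space(first, colors)  # "ATGGC"
```

Because every base contributes to two adjacent colors, a true single-base variant changes two consecutive colors, whereas an isolated color change is more likely a measurement error. This is one way to read the statement that interrogating each template base twice increases accuracy.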
  • HeliScope by Helicos BioSciences is employed (Voelkerding et al., Clinical Chem., 55: 641-658, 2009; MacLean et al., Nature Rev. Microbiol., 7: 287-296; U.S. Pat. No. 7,169,560; U.S. Pat. No. 7,282,337; U.S. Pat. No. 7,482,120; U.S. Pat. No. 7,501,245; U.S. Pat. No. 6,818,395; U.S. Pat. No. 6,911,345; each herein incorporated by reference in their entirety).
  • Sequencing is achieved by addition of polymerase and serial addition of fluorescently-labeled dNTP reagents. Incorporation events result in a fluor signal corresponding to the dNTP, and signal is captured by a CCD camera before each round of dNTP addition. Sequence read length ranges from 25-50 nucleotides, with overall output exceeding 1 billion nucleotide pairs per analytical run.
  • 454 sequencing by Roche is used (Margulies et al. (2005) Nature 437: 376-380). 454 sequencing involves two steps. In the first step, DNA is sheared into fragments of approximately 300-800 base pairs and the fragments are blunt ended. Oligonucleotide adaptors are then ligated to the ends of the fragments. The adaptors serve as primers for amplification and sequencing of the fragments.
  • the fragments can be attached to DNA capture beads, e.g., streptavidin-coated beads using, e.g., an adaptor that contains a 5′-biotin tag.
  • the fragments attached to the beads are PCR amplified within droplets of an oil-water emulsion.
  • the beads are captured in wells (pico-liter sized). Pyrosequencing is performed on each DNA fragment in parallel. Addition of one or more nucleotides generates a light signal that is recorded by a CCD camera in a sequencing instrument. The signal strength is proportional to the number of nucleotides incorporated. Pyrosequencing makes use of pyrophosphate (PPi) which is released upon nucleotide addition. PPi is converted to ATP by ATP sulfurylase in the presence of adenosine 5′ phosphosulfate. Luciferase uses ATP to convert luciferin to oxyluciferin, and this reaction generates light that is detected and analyzed.
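Since the signal strength is proportional to the number of nucleotides incorporated, a run of identical bases produces one proportionally larger signal in a single nucleotide flow. The toy simulation below illustrates this with idealized, noise-free signals. The flow order "TACG", the treatment of the input as the strand being synthesized, and the function names are all assumptions for illustration.

```python
from typing import List, Tuple


def simulate_flowgram(synthesized: str, flow_order: str = "TACG",
                      max_flows: int = 400) -> List[Tuple[str, int]]:
    """Return (flowed base, number incorporated) for each nucleotide flow.

    synthesized is the sequence of the strand being built; the ideal
    signal for a flow equals the length of the homopolymer run consumed."""
    signals: List[Tuple[str, int]] = []
    pos = 0
    flow = 0
    while pos < len(synthesized) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        run = 0
        while pos < len(synthesized) and synthesized[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
        flow += 1
    return signals


def call_bases(signals: List[Tuple[str, float]]) -> str:
    """Convert per-flow signals back into bases by rounding each signal."""
    return "".join(base * int(round(signal)) for base, signal in signals)


flows = simulate_flowgram("TTAGGG")
# flows == [("T", 2), ("A", 1), ("C", 0), ("G", 3)]
```

In real data the signals are noisy, and rounding becomes less reliable as the run length grows, which is why long homopolymers are a characteristic error source for flow-based chemistries.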
  • the Ion Torrent technology is a method of DNA sequencing based on the detection of hydrogen ions that are released during the polymerization of DNA (see, e.g., Science 327(5970): 1190 (2010); U.S. Pat. Appl. Pub. Nos. 20090026082, 20090127589, 20100301398, 20100197507, 20100188073, and 20100137143, incorporated by reference in their entireties for all purposes).
  • a microwell contains a fragment of the NGS fragment library to be sequenced. Beneath the layer of microwells is a hypersensitive ISFET ion sensor. All layers are contained within a CMOS semiconductor chip, similar to that used in the electronics industry.
  • a hydrogen ion is released, which triggers a hypersensitive ion sensor.
  • multiple dNTP molecules will be incorporated in a single cycle. This leads to a corresponding number of released hydrogens and a proportionally higher electronic signal.
  • This technology differs from other sequencing technologies in that no modified nucleotides or optics are used.
  • the per-base accuracy of the Ion Torrent sequencer is ~99.6% for 50 base reads, with ~100 Mb generated per run. The read-length is 100 base pairs.
  • the accuracy for homopolymer repeats of 5 repeats in length is ~98%.
  • the benefits of ion semiconductor sequencing are rapid sequencing speed and low upfront and operating costs. However, the cost of acquiring a pH-mediated sequencer is approximately $50,000, excluding sample preparation equipment and a server for data analysis.
  • Another exemplary nucleic acid sequencing approach that may be adapted for use with the present invention was developed by Stratos Genomics, Inc. and involves the use of Xpandomers.
  • This sequencing process typically includes providing a daughter strand produced by a template-directed synthesis.
  • the daughter strand generally includes a plurality of subunits coupled in a sequence corresponding to a contiguous nucleotide sequence of all or a portion of a target nucleic acid in which the individual subunits comprise a tether, at least one probe or nucleobase residue, and at least one selectively cleavable bond.
  • the selectively cleavable bond(s) is/are cleaved to yield an Xpandomer of a length longer than the plurality of the subunits of the daughter strand.
  • the Xpandomer typically includes the tethers and reporter elements for parsing genetic information in a sequence corresponding to the contiguous nucleotide sequence of all or a portion of the target nucleic acid. Reporter elements of the Xpandomer are then detected. Additional details relating to Xpandomer-based approaches are described in, for example, U.S. Pat. Pub No. 20090035777, entitled “HIGH THROUGHPUT NUCLEIC ACID SEQUENCING BY EXPANSION,” filed Jun. 19, 2008, which is incorporated herein in its entirety.
  • Sequencing reactions are performed using immobilized template, modified phi29 DNA polymerase, and high local concentrations of fluorescently labeled dNTPs. High local concentrations and continuous reaction conditions allow incorporation events to be captured in real time by fluor signal detection using laser excitation, an optical waveguide, and a CCD camera.
  • the single molecule real time (SMRT) DNA sequencing methods using zero-mode waveguides (ZMWs) developed by Pacific Biosciences, or similar methods are employed.
  • DNA sequencing is performed on SMRT chips, each containing thousands of zero-mode waveguides (ZMWs).
  • a ZMW is a hole, tens of nanometers in diameter, fabricated in a 100 nm metal film deposited on a silicon dioxide substrate.
  • Each ZMW becomes a nanophotonic visualization chamber providing a detection volume of just 20 zeptoliters (20 × 10^-21 L). At this volume, the activity of a single molecule can be detected amongst a background of thousands of labeled nucleotides.
  • the ZMW provides a window for watching DNA polymerase as it performs sequencing by synthesis.
  • a single DNA polymerase molecule is attached to the bottom surface such that it permanently resides within the detection volume.
  • Phospholinked nucleotides, each type labeled with a different colored fluorophore, are then introduced into the reaction solution at high concentrations which promote enzyme speed, accuracy, and processivity. Due to the small size of the ZMW, even at these high, biologically relevant concentrations, the detection volume is occupied by nucleotides only a small fraction of the time. In addition, visits to the detection volume are fast, lasting only a few microseconds, due to the very small distance that diffusion has to carry the nucleotides. The result is a very low background.
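The low-occupancy claim can be checked with a short calculation: the mean number of molecules in a volume is concentration × volume × Avogadro's number. The 1 µM nucleotide concentration used below is a hypothetical value chosen only for illustration; the source does not state a concentration.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def mean_molecules(concentration_molar: float, volume_liters: float) -> float:
    """Average number of molecules present in the given volume."""
    return concentration_molar * volume_liters * AVOGADRO


detection_volume = 20e-21      # 20 zeptoliters, as stated in the text
assumed_concentration = 1e-6   # 1 micromolar (assumed, not from the source)

occupancy = mean_molecules(assumed_concentration, detection_volume)
# occupancy is about 0.012 molecules on average
```

A mean well below one molecule means the detection volume is empty most of the time, consistent with the very low background described above.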
  • nanopore sequencing is used (Soni G V and Meller A. (2007) Clin Chem 53: 1996-2001).
  • a nanopore is a small hole, of the order of 1 nanometer in diameter. Immersion of a nanopore in a conducting fluid and application of a potential across it results in a slight electrical current due to conduction of ions through the nanopore. The amount of current which flows is sensitive to the size of the nanopore. As a DNA molecule passes through a nanopore, each nucleotide on the DNA molecule obstructs the nanopore to a different degree. Thus, the change in the current passing through the nanopore as the DNA molecule passes through the nanopore represents a reading of the DNA sequence.
  • a sequencing technique uses a chemical-sensitive field effect transistor (chemFET) array to sequence DNA (for example, as described in US Patent Application Publication No. 20090026082).
  • DNA molecules are placed into reaction chambers, and the template molecules are hybridized to a sequencing primer bound to a polymerase.
  • Incorporation of one or more triphosphates into a new nucleic acid strand at the 3′ end of the sequencing primer can be detected by a change in current by a chemFET.
  • An array can have multiple chemFET sensors.
  • single nucleic acids can be attached to beads, and the nucleic acids can be amplified on the bead, and the individual beads can be transferred to individual reaction chambers on a chemFET array, with each chamber having a chemFET sensor, and the nucleic acids can be sequenced.
  • a sequencing technique uses an electron microscope (Moudrianakis E. N. and Beer M. Proc Natl Acad Sci USA. 1965 March; 53:564-71).
  • individual DNA molecules are labeled using metallic labels that are distinguishable using an electron microscope. These molecules are then stretched on a flat surface and imaged using an electron microscope to measure sequences.
  • “four-color sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators” as described in Turro, et al. PNAS 103: 19635-40 (2006) is used, e.g., as commercialized by Intelligent Bio-Systems.
  • 20080157005 entitled “Methods and systems for simultaneous real-time monitoring of optical signals from multiple sources”, filed Oct. 26, 2007 by Lundquist et al.; 20080153100, entitled “Articles having localized molecules disposed thereon and methods of producing same”, filed Oct. 31, 2007 by Rank et al.; 20080153095, entitled “CHARGE SWITCH NUCLEOTIDES”, filed Oct. 26, 2007 by Williams et al.; 20080152281, entitled “Substrates, systems and methods for analyzing materials”, filed Oct. 31, 2007 by Lundquist et al.; 20080152280, entitled “Substrates, systems and methods for analyzing materials”, filed Oct.
  • a computer-based analysis program is used to translate the raw data generated by the detection assay (e.g., sequencing reads) into data of predictive value for an end user (e.g., medical personnel).
  • the user can access the predictive data using any suitable means.
  • the present technology provides the further benefit that the user, who is not likely to be trained in genetics or molecular biology, need not understand the raw data.
  • the data is presented directly to the end user in its most useful form. The user is then able to immediately utilize the information to determine useful information (e.g., in medical diagnostics, research, or screening).
  • the system can include a nucleic acid sequencer, a sample sequence data storage, a reference sequence data storage, and an analytics computing device/server/node.
  • the analytics computing device/server/node can be a workstation, mainframe computer, personal computer, mobile device, etc.
  • the nucleic acid sequencer can be configured to analyze (e.g., interrogate) a nucleic acid fragment (e.g., single fragment, mate-pair fragment, paired-end fragment, etc.) utilizing all available varieties of techniques, platforms or technologies to obtain nucleic acid sequence information, in particular the methods as described herein using compositions provided herein.
  • the nucleic acid sequencer is in communication with the sample sequence data storage either directly via a data cable (e.g., serial cable, direct cable connection, etc.) or bus linkage or, alternatively, through a network connection (e.g., Internet, LAN, WAN, VPN, etc.).
  • the network connection can be a “hardwired” physical connection.
  • the nucleic acid sequencer can be communicatively connected (via Category 5 (CAT5), fiber optic or equivalent cabling) to a data server that is communicatively connected (via CAT5, fiber optic, or equivalent cabling) through the Internet and to the sample sequence data storage.
  • the network connection is a wireless network connection (e.g., Wi-Fi, WLAN, etc.), for example, utilizing an 802.11a/b/g/n or equivalent transmission format.
  • the network connection utilized is dependent upon the particular requirements of the system.
  • the sample sequence data storage is an integrated part of the nucleic acid sequencer.
  • the sample sequence data storage is any database storage device, system, or implementation (e.g., data storage partition, etc.) that is configured to organize and store nucleic acid sequence read data generated by nucleic acid sequencer such that the data can be searched and retrieved manually (e.g., by a database administrator or client operator) or automatically by way of a computer program, application, or software script.
  • the reference data storage can be any database device, storage system, or implementation (e.g., data storage partition, etc.) that is configured to organize and store reference sequences (e.g., whole or partial genome, whole or partial exome, SNP, gene, etc.) such that the data can be searched and retrieved manually (e.g., by a database administrator or client operator) or automatically by way of a computer program, application, and/or software script.
  • sample nucleic acid sequencing read data can be stored on the sample sequence data storage and/or the reference data storage in a variety of different data file types/formats, including, but not limited to: *.txt, *.fasta, *.csfasta, *seq.txt, *qseq.txt, *.fastq, *.sff, *prb.txt, *.sms, *srs and/or *.qv.
  • sample sequence data storage and the reference data storage are independent standalone devices/systems or implemented on different devices. In some embodiments, the sample sequence data storage and the reference data storage are implemented on the same device/system. In some embodiments, the sample sequence data storage and/or the reference data storage can be implemented on the analytics computing device/server/node.
  • the analytics computing device/server/node can be in communication with the sample sequence data storage and the reference data storage either directly via a data cable (e.g., serial cable, direct cable connection, etc.) or bus linkage or, alternatively, through a network connection (e.g., Internet, LAN, WAN, VPN, etc.).
  • the analytics computing device/server/node can host a reference mapping engine, a de novo mapping module, and/or a tertiary analysis engine.
  • the reference mapping engine can be configured to obtain sample nucleic acid sequence reads from the sample data storage and map them against one or more reference sequences obtained from the reference data storage to assemble the reads into a sequence that is similar but not necessarily identical to the reference sequence using all varieties of reference mapping/alignment techniques and methods.
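A reference mapping step of this kind can be sketched with a deliberately naive, ungapped search that slides each read along the reference and accepts the best placement within a mismatch limit. Production mappers use indexed and gapped alignment; the function name, the `max_mismatches` parameter, and the first-best tie-breaking rule are assumptions for illustration only.

```python
from typing import Optional, Tuple


def map_read(read: str, reference: str,
             max_mismatches: int) -> Optional[Tuple[int, int]]:
    """Return (0-based start, mismatch count) of the best ungapped placement
    of read on reference, or None if none is within max_mismatches.

    Ties are resolved in favor of the leftmost placement."""
    best: Optional[Tuple[int, int]] = None
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mismatches = sum(1 for r, g in zip(read, window) if r != g)
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best


reference = "ACGTACGTTAGC"
exact = map_read("CGTTAG", reference, 1)    # (5, 0)
one_off = map_read("GTTAGA", reference, 1)  # (6, 1)
```

Allowing a small number of mismatches is what lets the mapped reads assemble into a sequence that is similar, but not necessarily identical, to the reference.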
  • the reassembled sequence can then be further analyzed by one or more optional tertiary analysis engines to identify differences in the genetic makeup (genotype), gene expression or epigenetic status of individuals that can result in large differences in physical characteristics (phenotype).
  • the tertiary analysis engine can be configured to identify various genomic variants (in the assembled sequence) due to mutations, recombination/crossover or genetic drift.
  • types of genomic variants include, but are not limited to: single nucleotide polymorphisms (SNPs), copy number variations (CNVs), insertions/deletions (Indels), inversions, etc.
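For the simplest variant type, single-nucleotide differences, a tertiary analysis step might compare an assembled sequence to the reference position by position. The sketch assumes the two sequences are already aligned without gaps and have equal length, so it cannot detect indels, CNVs, or inversions; `find_snvs` is a hypothetical helper name.

```python
from typing import List, Tuple


def find_snvs(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """List (0-based position, reference base, sample base) for every
    position where two ungapped, equal-length sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to equal length")
    return [(i, r, s)
            for i, (r, s) in enumerate(zip(reference, sample))
            if r != s]


variants = find_snvs("ACGTAC", "ACCTAA")
# variants == [(2, "G", "C"), (5, "C", "A")]
```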
  • the optional de novo mapping module can be configured to assemble sample nucleic acid sequence reads from the sample data storage into new and previously unknown sequences. It should be understood, however, that the various engines and modules hosted on the analytics computing device/server/node can be combined or collapsed into a single engine or module, depending on the requirements of the particular application or system architecture. Moreover, in some embodiments, the analytics computing device/server/node can host additional engines or modules as needed by the particular application or system architecture.
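De novo assembly without a reference can be illustrated with a greedy overlap merge: repeatedly join the two sequences that share the longest suffix-to-prefix overlap. This is a toy strategy chosen for brevity, not a method named in the source; it ignores sequencing errors, reverse complements, and reads contained inside other reads, and the names and the `min_overlap` default are assumptions.

```python
from typing import List


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a equal to a prefix of b,
    or 0 if that length is below min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: List[str], min_overlap: int = 3) -> List[str]:
    """Merge reads pairwise by largest overlap until no overlap remains."""
    contigs = list(dict.fromkeys(reads))  # drop duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_overlap)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


contigs = greedy_assemble(["ACGTAC", "TACGGA", "GGATTC"])
# contigs == ["ACGTACGGATTC"]
```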
  • the mapping and/or tertiary analysis engines are configured to process the nucleic acid and/or reference sequence reads in color space. In some embodiments, the mapping and/or tertiary analysis engines are configured to process the nucleic acid and/or reference sequence reads in base space. It should be understood, however, that the mapping and/or tertiary analysis engines disclosed herein can process or analyze nucleic acid sequence data in any schema or format as long as the schema or format can convey the base identity and position of the nucleic acid sequence.
  • sample nucleic acid sequencing read and referenced sequence data can be supplied to the analytics computing device/server/node in a variety of different input data file types/formats, including, but not limited to: *.txt, *.fasta, *.csfasta, *seq.txt, *qseq.txt, *.fastq, *.sff, *prb.txt, *.sms, *srs and/or *.qv.
  • a client terminal can be a thin client or thick client computing device.
  • the client terminal can have a web browser that can be used to control the operation of the reference mapping engine, the de novo mapping module and/or the tertiary analysis engine. That is, the client terminal can access the reference mapping engine, the de novo mapping module and/or the tertiary analysis engine using a browser to control their function.
  • the client terminal can be used to configure the operating parameters (e.g., mismatch constraint, quality value thresholds, etc.) of the various engines, depending on the requirements of the particular application.
  • the client terminal can also display the results of the analysis performed by the reference mapping engine, the de novo mapping module and/or the tertiary analysis engine.
  • the present technology also encompasses any method capable of receiving, processing, and transmitting the information to and from laboratories conducting the assays, information providers, medical personnel, and subjects.
  • the technology is not limited to particular uses, but finds use in a wide range of research (basic and applied), clinical, medical, and other biological, biochemical, and molecular biological applications.
  • Some exemplary uses of the technology include genetics, genomics, and/or genotyping, e.g., of plants, animals, and other organisms, e.g., to identify haplotypes, phasing, and/or linkage of mutations and/or alleles.
  • Particular and non-limiting illustrative examples in the human medical context include testing for cystic fibrosis and fragile X syndrome.
  • the technology finds use in the field of infectious disease, e.g., in identifying infectious agents such as viruses, bacteria, fungi, etc., and in determining viral types, families, species, and/or quasi-species, and to identify haplotypes, phasing, and/or linkage of mutations and/or alleles.
  • HIV human immunodeficiency virus
  • infectious disease examples include characterizing antibiotic resistance determinants; tracking infectious organisms for epidemiology; monitoring the emergence and evolution of resistance mechanisms; identifying species, sub-species, strains, extra-chromosomal elements, types, etc. associated with virulence; monitoring the progress of treatments; etc.
  • the technology finds use in transplant medicine, e.g., for typing of the major histocompatibility complex (MHC), typing of the human leukocyte antigen (HLA), and for identifying haplotypes, phasing, and/or linkage of mutations and/or alleles associated with transplant medicine (e.g., to identify compatible donors for a particular host needing a transplant, to predict the chance of rejection, to monitor rejection, to archive transplant material, for medical informatics databases, etc.).
  • the technology finds use in oncology and fields related to oncology. Particular and non-limiting illustrative examples in the area of oncology are identifying genetic and/or genomic aberrations related to cancer, predisposition to cancer, and/or treatment of cancer. For example, in some embodiments the technology finds use in detecting the presence of a chromosomal translocation associated with cancer; and in some embodiments the technology finds use in identifying novel gene fusion partners to provide cancer diagnostic tests. In some embodiments, the technology finds use in cancer screening, cancer diagnosis, cancer prognosis, measuring minimal residual disease, and selecting and/or monitoring a course of treatment for a cancer.
  • the technology finds use in characterizing nucleotide sequences. For example, in some embodiments, the technology finds use in detecting insertions and/or deletions (“indels”) in a nucleotide (e.g., genome, gene, etc.) sequence. It is contemplated that the technology described herein provides improved indel detection relative to conventional technologies. In addition, the technology finds use in detecting short tandem repeats (STRs), inversions, large insertions, and in sequencing repetitive (e.g., highly repetitive) regions of a nucleotide sequence (e.g., of a genome).
  • the technology described herein decreases instrument run-time, has a higher throughput, and produces a higher percentage of reads with quality scores greater than Q30 with respect to NGS library construction using the Illumina technology.
  • MiSeq Reagent kit v2 Dual-surface scanning, 12-15 million clusters passing filter b) To cover the entire 400 bp amplicon, a 2 × 250 bp paired-end read strategy is implemented where the reads are overlapped by ~100 bp c) Actual sequencing portion only (does not include cluster generation time) d) To calculate coverage for SOD library: [(Total # of reads)/((insert size − SOD readlength) × (# of samples in a run × # of amplicons per sample))] × SOD readlength: e.g., [(15 × 10^6)/((400 − 50) × (8 × 50))] × 50 e) To calculate throughput: [(me
  • the technology described herein decreases instrument run-time and produces a higher percentage of reads with quality scores greater than Q20 with respect to NGS library construction using the Ion Torrent technology.
  • Tables 5 and 6 compare the performance of the technology provided herein with conventional technologies for sequencing long amplicons of approximately 1000 bp (Table 5) and 2000 bp (Table 6). Run-time does not increase with amplicon size for the present technology because the read size is ~30-50 bases regardless of the size of the target nucleic acid to be sequenced.
  • a 2000-bp sequence is produced by the technology provided herein in a time that is an order of magnitude less than the conventional technology (see, e.g., Table 6).
  • the technology provided herein provides a longer sequence read with the same run time as the conventional technology.
  • a consensus sequence of ~127 bp is constructed from a collection of ~35-bp reads produced according to embodiments of the technology provided.
  • the calculated sequencing run time on an Illumina MiSeq DNA sequencing apparatus to produce the ~127-bp sequence using a library produced by the technology provided herein is approximately 2.5 hours.
  • a run time of ~13 hours produces the same ~127-bp sequence read.
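The coverage footnote for the SOD library in the list above can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the patent. It assumes that the operators lost in extraction are a subtraction inside the first parenthesis and multiplications elsewhere, so that coverage = [total reads / ((insert size − read length) × (samples per run × amplicons per sample))] × read length. The function name and parameter names are hypothetical.

```python
def sod_coverage(total_reads, insert_size, read_length,
                 samples_per_run, amplicons_per_sample):
    """Estimate coverage using the footnote formula as reconstructed above.

    Assumption: the garbled operators are read as
    [total / ((insert - readlength) * (samples * amplicons))] * readlength.
    """
    if insert_size <= read_length:
        raise ValueError("insert_size must be larger than read_length")
    # Term (insert size - SOD readlength) from the footnote.
    positions = insert_size - read_length
    # Term (# of samples in a run x # of amplicons per sample).
    amplicons_per_run = samples_per_run * amplicons_per_sample
    reads_per_position = total_reads / (positions * amplicons_per_run)
    return reads_per_position * read_length


if __name__ == "__main__":
    # Values taken from the worked example in the footnote:
    # 15 x 10^6 reads, 400 bp insert, 50-base reads, 8 samples, 50 amplicons.
    print(round(sod_coverage(15e6, 400, 50, 8, 50)))  # prints 5357
```

Under this reading, the footnote's example values give roughly 5,357-fold coverage. If the lost operators were different, the result would differ accordingly.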

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Immunology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Plant Pathology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
US14/463,508 2013-08-19 2014-08-19 Next-generation sequencing libraries Abandoned US20150051088A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/463,508 US20150051088A1 (en) 2013-08-19 2014-08-19 Next-generation sequencing libraries

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201361867224P 2013-08-19 2013-08-19
US14/463,508 US20150051088A1 (en) 2013-08-19 2014-08-19 Next-generation sequencing libraries

Publications (1)

Publication Number Publication Date
US20150051088A1 (en) 2015-02-19

Family

ID=52467240

Family Applications (4)

Application Number Title Priority Date Filing Date
US14/463,508 Abandoned US20150051088A1 (en) 2013-08-19 2014-08-19 Next-generation sequencing libraries
US14/463,498 Active 2036-07-24 US10036013B2 (en) 2013-08-19 2014-08-19 Next-generation sequencing libraries
US16/023,574 Active 2035-04-11 US10865410B2 (en) 2013-08-19 2018-06-29 Next-generation sequencing libraries
US17/097,101 Abandoned US20210062186A1 (en) 2013-08-19 2020-11-13 Next-generation sequencing libraries

Family Applications After (3)

Application Number Title Priority Date Filing Date
US14/463,498 Active 2036-07-24 US10036013B2 (en) 2013-08-19 2014-08-19 Next-generation sequencing libraries
US16/023,574 Active 2035-04-11 US10865410B2 (en) 2013-08-19 2018-06-29 Next-generation sequencing libraries
US17/097,101 Abandoned US20210062186A1 (en) 2013-08-19 2020-11-13 Next-generation sequencing libraries

Country Status (7)

Country Link
US (4) US20150051088A1 (fr)
EP (3) EP3626866B1 (fr)
CN (1) CN105917036B (fr)
CA (1) CA2921620C (fr)
ES (2) ES2873850T3 (fr)
RU (1) RU2698125C2 (fr)
WO (1) WO2015026853A2 (fr)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017123770A1 (fr) * 2016-01-12 2017-07-20 Bio-Rad Laboratories, Inc. Synthèse de séquences code-barres utilisant des blocs de déphasage et leurs utilisations
WO2017181161A1 (fr) * 2016-04-15 2017-10-19 Predicine, Inc. Systèmes et procédés pour détecter des altérations génétiques
WO2018013837A1 (fr) * 2016-07-15 2018-01-18 The Regents Of The University Of California Procédés de production de bibliothèques d'acides nucléiques
WO2018075785A1 (fr) * 2016-10-19 2018-04-26 Illumina, Inc. Méthodes de ligature chimique d'acides nucléiques
WO2019032762A1 (fr) * 2017-08-10 2019-02-14 Rootpath Genomics, Inc. Procédés pour améliorer le séquençage de polynucléotides à l'aide de codes-barres en utilisant une circularisation et une troncature de matrice
US10240196B2 (en) * 2016-05-27 2019-03-26 Agilent Technologies, Inc. Transposase-random priming DNA sample preparation
WO2019090156A1 (fr) * 2017-11-03 2019-05-09 Guardant Health, Inc. Normalisation de la charge de mutation tumorale
WO2019113506A1 (fr) * 2017-12-07 2019-06-13 The Broad Institute, Inc. Procédés et compositions pour multiplexer un séquençage de noyaux isolés et de cellules isolées
WO2020146312A1 (fr) * 2019-01-07 2020-07-16 Agilent Technologies, Inc. Compositions et procédés d'analyse d'expression génique et d'adn génomique dans des cellules uniques
US11118234B2 (en) 2018-07-23 2021-09-14 Guardant Health, Inc. Methods and systems for adjusting tumor mutational burden by tumor fraction and coverage
EP3894595A1 (fr) * 2018-12-14 2021-10-20 Lexogen GmbH Amplification d'acide nucléique et procédé d'identification
WO2021216868A1 (fr) * 2020-04-22 2021-10-28 The Regents Of The University Of California Procédés de détection et de séquençage d'un acide nucléique cible
US11174503B2 (en) 2016-09-21 2021-11-16 Predicine, Inc. Systems and methods for combined detection of genetic alterations
US11186836B2 (en) 2016-06-16 2021-11-30 Haystack Sciences Corporation Oligonucleotide directed and recorded combinatorial synthesis of encoded probe molecules
US11584929B2 (en) 2018-01-12 2023-02-21 Claret Bioscience, Llc Methods and compositions for analyzing nucleic acid
US11629345B2 (en) 2018-06-06 2023-04-18 The Regents Of The University Of California Methods of producing nucleic acid libraries and compositions and kits for practicing same
US11795580B2 (en) 2017-05-02 2023-10-24 Haystack Sciences Corporation Molecules for verifying oligonucleotide directed combinatorial synthesis and methods of making and using the same
US12065684B2 (en) 2020-05-15 2024-08-20 Telesis Bio Inc. Demand synthesis of polynucleotide sequences
US12385090B2 (en) 2018-12-07 2025-08-12 Bgi Shenzhen Method for sequencing long-fragment nucleic acid

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9932623B2 (en) * 2013-08-19 2018-04-03 Abbott Molecular Inc. Nucleotide analogs
US10395759B2 (en) 2015-05-18 2019-08-27 Regeneron Pharmaceuticals, Inc. Methods and systems for copy number variant detection
WO2017003924A1 (fr) * 2015-06-29 2017-01-05 Genesis DNA Inc. Procédé et appareil pour la synthèse d'acides nucléiques en double phase solide
CN109074426B (zh) 2016-02-12 2022-07-26 瑞泽恩制药公司 用于检测异常核型的方法和系统
EP4488686A3 (fr) 2016-04-14 2025-04-30 Guardant Health, Inc. Procédés de détection précoce du cancer
US11384382B2 (en) 2016-04-14 2022-07-12 Guardant Health, Inc. Methods of attaching adapters to sample nucleic acids
GB201609221D0 (en) * 2016-05-25 2016-07-06 Oxford Nanopore Tech Ltd Method
EP3485032B1 (fr) 2016-07-12 2021-02-17 Life Technologies Corporation Compositions et procédés pour détecter un acide nucléique
JP6810559B2 (ja) * 2016-09-09 2021-01-06 株式会社日立製作所 環状型一本鎖核酸、およびその調製方法と使用方法
ES2870639T3 (es) 2016-10-24 2021-10-27 Geneinfosec Inc Ocultación de información presente en los ácidos nucleicos
WO2018112349A1 (fr) * 2016-12-15 2018-06-21 University Of Cincinnati Procédé simplifié de purification par taille de petits oligonucléotides par électrophorèse sur gel
CN106676099B (zh) * 2016-12-21 2019-07-02 中国水稻研究所 构建简化基因组文库的方法及试剂盒
CA3059840C (fr) * 2017-04-23 2022-04-26 Illumina Cambridge Limited Compositions et procedes pour ameliorer l'identification d'echantillons dans des bibliotheques d'acides nucleiques indexees
WO2018197945A1 (fr) 2017-04-23 2018-11-01 Illumina Cambridge Limited Compositions et procédés permettant d'améliorer l'identification d'échantillons dans des bibliothèques d'acides nucléiques indexés
US10995369B2 (en) 2017-04-23 2021-05-04 Illumina, Inc. Compositions and methods for improving sample identification in indexed nucleic acid libraries
FI3842545T3 (fi) 2017-04-23 2023-01-31 Koostumuksia ja menetelmiä näytteiden tunnistuksen parantamiseksi indeksoiduissa nukleiinihappokirjastoissa
EP3635136B1 (fr) * 2017-06-07 2021-10-20 Oregon Health & Science University Banques de génomes entiers de cellules individuelles pour le séquençage de méthylation
US11339424B2 (en) 2017-09-06 2022-05-24 Dxome Co., Ltd. Method for amplification and quantitation of small amount of mutation using molecular barcode and blocking oligonucleotide
CN111315895A (zh) * 2017-09-14 2020-06-19 豪夫迈·罗氏有限公司 用于产生环状单链dna文库的新型方法
WO2019055780A1 (fr) * 2017-09-14 2019-03-21 Alere San Diego Inc. Détection d'amplification par polymérase recombinase à l'aide d'une sonde à double haptène
US10699802B2 (en) 2017-10-09 2020-06-30 Strata Oncology, Inc. Microsatellite instability characterization
CN109694864B (zh) * 2017-10-23 2020-12-25 深圳华大因源医药科技有限公司 基于点击化学的测序接头、双条形码测序文库及其构建方法
CN110021345B (zh) * 2017-12-08 2021-02-02 北京哲源科技有限责任公司 基于spark平台的基因数据分析方法
CN108148910A (zh) * 2017-12-18 2018-06-12 广东省人民医院(广东省医学科学院) 一种肺癌相关的285基因靶向捕获测序试剂盒及其应用
IL276343B2 (en) * 2018-01-29 2024-10-01 St Jude Childrens Res Hospital Inc Method for nucleic acid amplification
IL271454B2 (en) 2018-05-17 2025-04-01 Illumina Inc Rapid single-cell genetic sequencing with low Aggregation bias
KR20210114918A (ko) 2019-01-11 2021-09-24 일루미나 케임브리지 리미티드 복합체 표면-결합 트랜스포좀 복합체
US20220195502A1 (en) * 2019-04-24 2022-06-23 Genepath Diagnostics Inc. Method for detecting specific nucleic acids in samples
JP2022537069A (ja) * 2019-06-21 2022-08-23 サーモ フィッシャー サイエンティフィック バルティックス ユーエービー 次世代配列決定ライブラリを調製するための核酸の標識化に有用なオリゴヌクレオチドが連結された三リン酸ヌクレオチド
CN112342627B (zh) * 2019-08-09 2024-07-23 深圳市真迈生物科技有限公司 一种核酸文库的制备方法及测序方法
WO2021046502A2 (fr) * 2019-09-08 2021-03-11 The University Of Toledo Kits et procédés pour tester des risques de cancer du poumon
IL290043B2 (en) * 2019-12-23 2023-11-01 Baseclick Gmbh Method of amplifying mrnas and for preparing full length mrna libraries
EP3842532A1 (fr) * 2019-12-23 2021-06-30 baseclick GmbH Procédé d'amplification d'arnm et de préparation de bibliothèques d'arnm de pleine longueur
EP4253559B1 (fr) * 2020-02-26 2025-04-02 Illumina, Inc. Kits pour le génotypage
US12359266B2 (en) 2020-06-18 2025-07-15 Board Of Regents, The University Of Texas System Tiled ClickSeq for targeted virus whole genome sequencing
CN113689912B (zh) * 2020-12-14 2024-08-20 广东美格基因科技有限公司 基于宏基因组测序的微生物对比结果校正的方法和系统
EP4308723B1 (fr) * 2021-03-15 2025-04-23 F. Hoffmann-La Roche AG Séquençage ciblé de nouvelle génération par l'intermédiaire d'une extension d'amorce ancrée
EP4347872A2 (fr) * 2021-05-28 2024-04-10 Illumina, Inc. Analogues de nucléotides oligo-modifiés pour la préparation d'acides nucléiques
CN113862263B (zh) * 2021-12-01 2022-03-15 江苏为真生物医药技术股份有限公司 测序文库构建方法及应用
CN119213142A (zh) * 2022-06-17 2024-12-27 深圳华大智造科技股份有限公司 单链核酸环状文库的建库以及测序方法
EP4488381A1 (fr) * 2023-07-07 2025-01-08 ETH Zurich Séquences d'identification d'adn modifiées pour le criblage de modification d'adn

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080242560A1 (en) * 2006-11-21 2008-10-02 Gunderson Kevin L Methods for generating amplified nucleic acid arrays
WO2012134602A2 (fr) * 2011-04-01 2012-10-04 Centrillion Technology Holding Corporation Procédés et systèmes de séquençage de longs acides nucléiques

Family Cites Families (84)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US39793A (en) 1863-09-08 Improvement in gr
US5242794A (en) 1984-12-13 1993-09-07 Applied Biosystems, Inc. Detection of specific sequences in nucleic acids
US4988617A (en) 1988-03-25 1991-01-29 California Institute Of Technology Method of detecting a nucleotide change in nucleic acids
US5494810A (en) 1990-05-03 1996-02-27 Cornell Research Foundation, Inc. Thermostable ligase-mediated DNA amplifications system for the detection of genetic disease
WO1996006190A2 (fr) 1994-08-19 1996-02-29 Perkin-Elmer Corporation Procede de ligature et d'amplification associees
US5604097A (en) 1994-10-13 1997-02-18 Spectragen, Inc. Methods for sorting polynucleotides using oligonucleotide tags
US5846719A (en) 1994-10-13 1998-12-08 Lynx Therapeutics, Inc. Oligonucleotide tags for sorting and identification
US5695934A (en) 1994-10-13 1997-12-09 Lynx Therapeutics, Inc. Massively parallel sequencing of sorted polynucleotides
US5636400A (en) 1995-08-07 1997-06-10 Young; Keenan L. Automatic infant bottle cleaner
GB9620209D0 (en) 1996-09-27 1996-11-13 Cemu Bioteknik Ab Method of sequencing DNA
WO1998023733A2 (fr) 1996-11-27 1998-06-04 University Of Washington Polymerases thermostables presentant une fidelite modifiee
GB9626815D0 (en) 1996-12-23 1997-02-12 Cemu Bioteknik Ab Method of sequencing DNA
US6969488B2 (en) 1998-05-22 2005-11-29 Solexa, Inc. System and apparatus for sequential processing of analytes
US6312904B1 (en) 1997-07-11 2001-11-06 Xzillion Gmbh & Co. Kg Characterizing nucleic acid
JP2002508546A (ja) * 1998-03-26 2002-03-19 インサイト ファーマシューティカルズ インコーポレイテッド 生体分子配列を解析するためのシステムおよび方法
AR021833A1 (es) 1998-09-30 2002-08-07 Applied Research Systems Metodos de amplificacion y secuenciacion de acido nucleico
US6818395B1 (en) 1999-06-28 2004-11-16 California Institute Of Technology Methods and apparatus for analyzing polynucleotide sequences
US7501245B2 (en) 1999-06-28 2009-03-10 Helicos Biosciences Corp. Methods and apparatuses for analyzing polynucleotide sequences
EP1218543A2 (fr) 1999-09-29 2002-07-03 Solexa Ltd. Séquençage de polynucléotides
US6582938B1 (en) 2001-05-11 2003-06-24 Affymetrix, Inc. Amplification of nucleic acids
US6329178B1 (en) 2000-01-14 2001-12-11 University Of Washington DNA polymerase mutant having one or more mutations in the active site
US6936702B2 (en) 2000-06-07 2005-08-30 Li-Cor, Inc. Charge-switch nucleotides
JP2004513619A (ja) 2000-07-07 2004-05-13 ヴィジゲン バイオテクノロジーズ インコーポレイテッド リアルタイム配列決定
GB0102568D0 (en) 2001-02-01 2001-03-21 Magnetic Biosolutions Sweden A Method
US7668697B2 (en) 2006-02-06 2010-02-23 Andrei Volkov Method for analyzing dynamic detectable events at the single molecule level
US7655791B2 (en) * 2001-11-13 2010-02-02 Rubicon Genomics, Inc. DNA amplification and sequencing using DNA molecules generated by random fragmentation
US7871799B2 (en) 2002-11-22 2011-01-18 Lawrence Livermore National Security, Llc Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
US7297490B2 (en) 2003-03-10 2007-11-20 Chinese University Of Hong Kong Authentication of biologic materials using DNA-DNA hybridization on a solid support
US20040259118A1 (en) * 2003-06-23 2004-12-23 Macevicz Stephen C. Methods and compositions for nucleic acid sequence analysis
US7169560B2 (en) 2003-11-12 2007-01-30 Helicos Biosciences Corporation Short cycle methods for sequencing polynucleotides
US7170050B2 (en) 2004-09-17 2007-01-30 Pacific Biosciences Of California, Inc. Apparatus and methods for optical analysis of molecules
WO2006044078A2 (fr) 2004-09-17 2006-04-27 Pacific Biosciences Of California, Inc. Appareil et procede d'analyse de molecules
US20070048748A1 (en) 2004-09-24 2007-03-01 Li-Cor, Inc. Mutant polymerases for sequencing and genotyping
US7482120B2 (en) 2005-01-28 2009-01-27 Helicos Biosciences Corporation Methods and compositions for improving fidelity in a nucleic acid synthesis reaction
US20070141598A1 (en) 2005-02-09 2007-06-21 Pacific Biosciences Of California, Inc. Nucleotide Compositions and Uses Thereof
US7393665B2 (en) 2005-02-10 2008-07-01 Population Genetics Technologies Ltd Methods and compositions for tagging and identifying polynucleotides
US7805081B2 (en) 2005-08-11 2010-09-28 Pacific Biosciences Of California, Inc. Methods and systems for monitoring multiple optical signals from a single source
US7405281B2 (en) 2005-09-29 2008-07-29 Pacific Biosciences Of California, Inc. Fluorescent nucleotide analogs and uses therefor
US7763423B2 (en) 2005-09-30 2010-07-27 Pacific Biosciences Of California, Inc. Substrates having low density reactive groups for monitoring enzyme activity
AU2006320739B2 (en) 2005-11-28 2012-03-29 Pacific Biosciences Of California, Inc. Uniform surfaces for hybrid material substrates and methods for making and using same
US7998717B2 (en) 2005-12-02 2011-08-16 Pacific Biosciences Of California, Inc. Mitigation of photodamage in analytical reactions
AU2006331512B2 (en) 2005-12-22 2012-02-23 Pacific Biosciences Of California, Inc. Active surface coupled polymerases
CA2633524A1 (fr) 2005-12-22 2007-07-05 Pacific Biosciences Of California, Inc. Polymerases permettant d'incorporer des analogues de nucleotides
WO2007084433A2 (fr) 2006-01-13 2007-07-26 The Trustees Of Princeton University Polymorphisme a base sequentielle cartographiant a une resolution de nucleotide simple
US7544473B2 (en) 2006-01-23 2009-06-09 Population Genetics Technologies Ltd. Nucleic acid analysis using sequence tokens
US7537897B2 (en) 2006-01-23 2009-05-26 Population Genetics Technologies, Ltd. Molecular counting
US7715001B2 (en) 2006-02-13 2010-05-11 Pacific Biosciences Of California, Inc. Methods and systems for simultaneous real-time monitoring of optical signals from multiple sources
US7692783B2 (en) 2006-02-13 2010-04-06 Pacific Biosciences Of California Methods and systems for simultaneous real-time monitoring of optical signals from multiple sources
US7995202B2 (en) 2006-02-13 2011-08-09 Pacific Biosciences Of California, Inc. Methods and systems for simultaneous real-time monitoring of optical signals from multiple sources
US8975216B2 (en) 2006-03-30 2015-03-10 Pacific Biosciences Of California Articles having localized molecules disposed thereon and methods of producing same
US20080050747A1 (en) 2006-03-30 2008-02-28 Pacific Biosciences Of California, Inc. Articles having localized molecules disposed thereon and methods of producing and using same
US7563574B2 (en) 2006-03-31 2009-07-21 Pacific Biosciences Of California, Inc. Methods, systems and compositions for monitoring enzyme activity and applications thereof
US7282337B1 (en) 2006-04-14 2007-10-16 Helicos Biosciences Corporation Methods for increasing accuracy of nucleic acid sequencing
CN101915957B (zh) 2006-06-12 2012-12-12 加利福尼亚太平洋生物科学公司 实施分析反应的基材
AU2007260707A1 (en) 2006-06-16 2007-12-21 Pacific Biosciences Of California, Inc. Controlled initiation of primer extension
US20080241951A1 (en) 2006-07-20 2008-10-02 Visigen Biotechnologies, Inc. Method and apparatus for moving stage detection of single molecular events
CA2662521C (fr) 2006-09-01 2016-08-09 Pacific Biosciences Of California, Inc. Substrats, systemes et procedes d'analyse de materiaux
US20080080059A1 (en) 2006-09-28 2008-04-03 Pacific Biosciences Of California, Inc. Modular optical components and systems incorporating same
US20080081330A1 (en) 2006-09-28 2008-04-03 Helicos Biosciences Corporation Method and devices for analyzing small RNA molecules
AU2007309504B2 (en) 2006-10-23 2012-09-13 Pacific Biosciences Of California, Inc. Polymerase enzymes and reagents for enhanced nucleic acid sequencing
AU2007334393A1 (en) 2006-12-14 2008-06-26 Life Technologies Corporation Methods and apparatus for measuring analytes using large scale FET arrays
US8262900B2 (en) 2006-12-14 2012-09-11 Life Technologies Corporation Methods and apparatus for measuring analytes using large scale FET arrays
US8481259B2 (en) 2007-02-05 2013-07-09 Intelligent Bio-Systems, Inc. Methods and devices for sequencing nucleic acids in smaller batches
CA2677216C (fr) 2007-02-05 2015-10-20 Intelligent Bio-Systems, Inc. Dispositif de detection et procedes d'utilisation
US8551704B2 (en) 2007-02-16 2013-10-08 Pacific Biosciences Of California, Inc. Controllable strand scission of mini circle DNA
CN101024851A (zh) * 2007-03-29 2007-08-29 西北农林科技大学 基于梯状回收的基因拷贝数鉴定和各拷贝序列获得的方法
DK2171088T3 (en) 2007-06-19 2016-01-25 Stratos Genomics Inc Nucleic acid sequencing in a high yield by expansion
EP2173909A1 (fr) 2007-07-26 2010-04-14 Roche Diagnostics GmbH Préparation cible pour le séquençage en parallèle de génomes complexes
WO2009024019A1 (fr) * 2007-08-15 2009-02-26 The University Of Hong Kong Procédés et compositions pour un séquençage d'adn au bisulfite à haut débit et leurs utilités
EP3431615A3 (fr) * 2007-10-19 2019-02-20 The Trustees of Columbia University in the City of New York Séquençage d'adn avec des terminateurs nucléotidiques réversibles non fluorescents et terminateurs nucléotidiques modifiées à étiquette clivable
US20100035253A1 (en) 2008-03-19 2010-02-11 Intelligent Bio-Systems, Inc. Methods And Compositions For Incorporating Nucleotides
JP5774474B2 (ja) * 2008-05-02 2015-09-09 エピセンター テクノロジーズ コーポレーションEpicentre Technologies Corporation Rnaへの選択的な5’ライゲーションによるタグの付加
US20100301398A1 (en) 2009-05-29 2010-12-02 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes
US20100137143A1 (en) 2008-10-22 2010-06-03 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes
WO2010056728A1 (fr) 2008-11-11 2010-05-20 Helicos Biosciences Corporation Analyse multiplexée d’acides nucléiques codants
US20120165202A1 (en) * 2009-04-30 2012-06-28 Good Start Genetics, Inc. Methods and compositions for evaluating genetic markers
US20110257889A1 (en) * 2010-02-24 2011-10-20 Pacific Biosciences Of California, Inc. Sequence assembly and consensus sequence determination
WO2011137368A2 (fr) 2010-04-30 2011-11-03 Life Technologies Corporation Systèmes et méthodes d'analyse de séquences d'acides nucléiques
WO2012006116A2 (fr) 2010-06-28 2012-01-12 Life Technologies Corporation Procédés, déroulement des opérations, trousses, appareils et moyens de programmation informatique pour une préparation d'échantillons d'acide nucléique pour le séquençage d'acide nucléique
WO2013085918A1 (fr) * 2011-12-05 2013-06-13 The Regents Of The University Of California Procédés et compositions pour générer des fragments d'acides polynucléiques
JP6445426B2 (ja) 2012-05-10 2018-12-26 ザ ジェネラル ホスピタル コーポレイション ヌクレオチド配列を決定する方法
US9988625B2 (en) 2013-01-10 2018-06-05 Dharmacon, Inc. Templates, libraries, kits and methods for generating molecules
US10428379B2 (en) 2013-03-15 2019-10-01 Ibis Biosciences, Inc. Nucleotide analogs for sequencing
US9932623B2 (en) 2013-08-19 2018-04-03 Abbott Molecular Inc. Nucleotide analogs

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10612089B2 (en) 2016-01-12 2020-04-07 Bio-Rad Laboratories, Inc. Synthesizing barcoding sequences utilizing phase-shift blocks and uses thereof
US11926871B2 (en) 2016-01-12 2024-03-12 Bio-Rad Laboratories, Inc. Synthesizing barcoding sequences utilizing phase-shift blocks and uses thereof
WO2017123770A1 (fr) * 2016-01-12 2017-07-20 Bio-Rad Laboratories, Inc. Synthèse de séquences code-barres utilisant des blocs de déphasage et leurs utilisations
US10988808B2 (en) 2016-01-12 2021-04-27 Bio-Rad Laboratories, Inc. Synthesizing barcoding sequences utilizing phase-shift blocks and uses thereof
WO2017181161A1 (fr) * 2016-04-15 2017-10-19 Predicine, Inc. Systèmes et procédés pour détecter des altérations génétiques
US11702702B2 (en) 2016-04-15 2023-07-18 Predicine, Inc. Systems and methods for detecting genetic alterations
USRE49207E1 (en) * 2016-05-27 2022-09-13 Agilent Technologies, Inc. Transposase-random priming DNA sample preparation
US10240196B2 (en) * 2016-05-27 2019-03-26 Agilent Technologies, Inc. Transposase-random priming DNA sample preparation
US11186836B2 (en) 2016-06-16 2021-11-30 Haystack Sciences Corporation Oligonucleotide directed and recorded combinatorial synthesis of encoded probe molecules
US11299780B2 (en) 2016-07-15 2022-04-12 The Regents Of The University Of California Methods of producing nucleic acid libraries
WO2018013837A1 (fr) * 2016-07-15 2018-01-18 The Regents Of The University Of California Procédés de production de bibliothèques d'acides nucléiques
CN109563543A (zh) * 2016-07-15 2019-04-02 加利福尼亚大学董事会 产生核酸库的方法
US11174503B2 (en) 2016-09-21 2021-11-16 Predicine, Inc. Systems and methods for combined detection of genetic alterations
AU2017345562B2 (en) * 2016-10-19 2024-01-25 Illumina Singapore Pte. Ltd. Methods for chemical ligation of nucleic acids
WO2018075785A1 (fr) * 2016-10-19 2018-04-26 Illumina, Inc. Méthodes de ligature chimique d'acides nucléiques
US11795580B2 (en) 2017-05-02 2023-10-24 Haystack Sciences Corporation Molecules for verifying oligonucleotide directed combinatorial synthesis and methods of making and using the same
WO2019032762A1 (fr) * 2017-08-10 2019-02-14 Rootpath Genomics, Inc. Procédés pour améliorer le séquençage de polynucléotides à l'aide de codes-barres en utilisant une circularisation et une troncature de matrice
EP3704268B1 (fr) 2017-11-03 2025-01-22 Guardant Health, Inc. Normalisation de la charge de mutation tumorale
WO2019090156A1 (fr) * 2017-11-03 2019-05-09 Guardant Health, Inc. Normalisation de la charge de mutation tumorale
US12385097B2 (en) 2017-11-03 2025-08-12 Guardant Health, Inc. Normalizing tumor mutation burden
US11193175B2 (en) 2017-11-03 2021-12-07 Guardant Health, Inc. Normalizing tumor mutation burden
CN111566225A (zh) * 2017-11-03 2020-08-21 夸登特健康公司 归一化肿瘤突变负荷
WO2019113506A1 (fr) * 2017-12-07 2019-06-13 The Broad Institute, Inc. Procédés et compositions pour multiplexer un séquençage de noyaux isolés et de cellules isolées
US11332736B2 (en) 2017-12-07 2022-05-17 The Broad Institute, Inc. Methods and compositions for multiplexing single cell and single nuclei sequencing
US11584929B2 (en) 2018-01-12 2023-02-21 Claret Bioscience, Llc Methods and compositions for analyzing nucleic acid
US11629345B2 (en) 2018-06-06 2023-04-18 The Regents Of The University Of California Methods of producing nucleic acid libraries and compositions and kits for practicing same
US11118234B2 (en) 2018-07-23 2021-09-14 Guardant Health, Inc. Methods and systems for adjusting tumor mutational burden by tumor fraction and coverage
US12291751B2 (en) 2018-07-23 2025-05-06 Guardant Health, Inc. Methods and systems for adjusting tumor mutational burden by tumor fraction and coverage
US12385090B2 (en) 2018-12-07 2025-08-12 Bgi Shenzhen Method for sequencing long-fragment nucleic acid
EP3894595A1 (fr) * 2018-12-14 2021-10-20 Lexogen GmbH Amplification d'acide nucléique et procédé d'identification
EP3894595B1 (fr) * 2018-12-14 2025-10-29 Lexogen GmbH Procédé d'amplification et d'identification d'acide nucléique
US11739321B2 (en) 2019-01-07 2023-08-29 Agilent Technologies, Inc. Compositions and methods for genomic DNA and gene expression analysis in single cells
EP4484573A3 (en) * 2019-01-07 2025-03-26 Agilent Technologies, Inc. Compositions and methods for genomic DNA and gene expression analysis in single cells
CN113272443A (en) * 2019-01-07 2021-08-17 Agilent Technologies, Inc. Compositions and methods for genomic DNA and gene expression analysis in single cells
WO2020146312A1 (en) * 2019-01-07 2020-07-16 Agilent Technologies, Inc. Compositions and methods for genomic DNA and gene expression analysis in single cells
WO2021216868A1 (en) * 2020-04-22 2021-10-28 The Regents Of The University Of California Methods for detecting and sequencing a target nucleic acid
US12065684B2 (en) 2020-05-15 2024-08-20 Telesis Bio Inc. Demand synthesis of polynucleotide sequences

Also Published As

Publication number Publication date
EP3879012A1 (fr) 2021-09-15
WO2015026853A2 (fr) 2015-02-26
RU2016107196A3 (fr) 2018-07-27
US20210062186A1 (en) 2021-03-04
US10036013B2 (en) 2018-07-31
CN105917036B (zh) 2019-08-06
CN105917036A (zh) 2016-08-31
US10865410B2 (en) 2020-12-15
RU2698125C2 (ru) 2019-08-22
CA2921620A1 (fr) 2015-02-26
EP3036359A2 (fr) 2016-06-29
RU2016107196A (ru) 2017-09-26
ES2873850T3 (es) 2021-11-04
CA2921620C (fr) 2021-01-19
EP3626866B1 (fr) 2021-03-24
US20150051116A1 (en) 2015-02-19
ES2764096T3 (es) 2020-06-02
EP3036359B1 (fr) 2019-10-23
EP3036359A4 (fr) 2017-06-21
EP3626866A1 (fr) 2020-03-25
WO2015026853A3 (fr) 2015-04-16
US20180334671A1 (en) 2018-11-22

Similar Documents

Publication Publication Date Title
US10865410B2 (en) Next-generation sequencing libraries
US20150344947A1 (en) Genotyping by next-generation sequencing
US20160115473A1 (en) Multifunctional oligonucleotides
US11359236B2 (en) DNA sequencing
US20190106744A1 (en) DNA sequencing
US20200040390A1 (en) Methods for Sequencing Repetitive Genomic Regions
US20220145287A1 (en) Methods and compositions for next generation sequencing (NGS) library preparation
HK1204337B (en) Genotyping by next-generation sequencing

Legal Events

Date Code Title Description
AS Assignment

Owner name: ABBOTT MOLECULAR INC., ILLINOIS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KIM, DAE HYUN;REEL/FRAME:033914/0503

Effective date: 20141007

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION