HK1245843A1 - DNA sequencing using controlled strand displacement
Description
Cross Reference to Related Applications
This application claims priority to U.S. provisional application nos. 62/117,391 (filed on 17/2/2015) and 62/194,741 (filed on 20/7/2015). The entire contents of each of the above provisional applications are incorporated herein by reference.
Technical Field
The present invention relates to the fields of DNA sequencing, genomics and molecular biology.
Background
The need for low-cost, high-throughput methods for nucleic acid sequencing and re-sequencing has led to the development of "massively parallel sequencing" (MPS) techniques. Improvements in such sequencing methods are of significant value in science, medicine and agriculture.
Disclosure of Invention
The present invention relates to nucleic acid sequencing (e.g., genomic DNA sequencing). In one aspect, methods for paired-end sequencing of single-stranded DNA, such as DNA concatemers (e.g., DNA nanospheres or DNBs), are provided. Typically, the DNA being sequenced comprises a target sequence and at least one adapter sequence.
The present invention provides a method for preparing a DNA strand complementary to a template DNA polynucleotide immobilized on a substrate, the template DNA polynucleotide comprising a first target DNA sequence inserted between a first adaptor 3' to the first target DNA sequence and a second adaptor 5' to the first target DNA sequence. The method comprises hybridizing a first primer to a first primer binding sequence in the first adaptor; extending the first primer using a first DNA polymerase to produce a second strand comprising a sequence complementary to the first target DNA sequence and a sequence complementary to at least a portion of the second adaptor; hybridizing a second primer to a second primer binding sequence that is 3' to the first primer binding sequence; and extending the second primer using a DNA polymerase having strand displacement activity to produce a third strand. The third strand partially displaces the second strand and produces a partially hybridized second strand comprising: 1) a hybridizing portion that hybridizes to the template DNA polynucleotide; and 2) a non-hybridizing overhang comprising a sequence complementary to the first target DNA sequence and a sequence complementary to at least a portion of the second adaptor.
In some embodiments, the template DNA polynucleotide comprises an additional adaptor, i.e., a third adaptor, which is 3' to the first adaptor, and an additional target DNA sequence, i.e., a second target DNA sequence, inserted between the first adaptor and the third adaptor. In one embodiment, the template DNA polynucleotide comprises a third adaptor and the second primer binding sequence is in the third adaptor. In another embodiment, the second primer binding sequence is in the first adaptor, the same adaptor that also comprises the first primer binding sequence.
In one embodiment, the first DNA polymerase used to produce the second strand and the DNA polymerase having strand displacement activity used to produce the third strand are the same polymerase. In one embodiment, the first primer and the second primer hybridize to their respective primer binding sequences or are extended in the same reaction.
In one embodiment, the method further comprises hybridizing a sequencing oligonucleotide to a sequence complementary to at least a portion of the second adaptor and determining the nucleotide sequence of at least a portion of the sequence complementary to the first target DNA sequence.
In one embodiment, the first adaptor, the second adaptor and the third adaptor (if present) have the same nucleotide sequence.
In one embodiment, the template DNA polynucleotide comprises a DNA concatemer, and the first target DNA sequence and the second target DNA sequence have the same nucleotide sequence.
In one embodiment, the template DNA polynucleotide comprises a DNA concatemer and the first primer and the second primer have the same nucleotide sequence.
In one embodiment, a plurality of third strands is generated by hybridizing a plurality of second primers, comprising both extendable and non-extendable primers, to a plurality of second primer binding sequences.
In one embodiment, extension of the second primer to produce the third strand is terminated at a fixed time interval of 5 minutes, 10 minutes, 20 minutes, 30 minutes, 40 minutes, or 60 minutes. In one embodiment, the termination is achieved by chemical termination, i.e. by addition of chemicals. In one embodiment, the chemical used to stop the reaction is a Tris buffer containing 1.5M NaCl. In another embodiment, termination is achieved by the incorporation of a chain terminating nucleotide analog (e.g., ddNTP). In some embodiments, the ddNTP is added after the addition of the chemical terminator.
In one embodiment, the reaction to extend the second primer is controlled by selecting the temperature, enzyme concentration, and primer concentration such that complete displacement of the second strand is avoided.
Drawings
FIG. 1 shows the steps used in a method for generating DNA strands for sequencing.
FIG. 2 shows the steps used in a related method for generating DNA strands for sequencing.
FIG. 3 shows the steps used in determining sequences from DNA strands.
FIG. 4 shows an exemplary method of using an extension primer to generate complementary strands (a series of subsequent fragments) on a DNB using the strand displacement activity of a DNA polymerase.
FIG. 5 shows exemplary adaptor and primer sequences for generating and sequencing DNA strands complementary to DNB.
FIG. 6 is a diagram of an exemplary method for generating DNA strands complementary to the DNA of an immobilized adaptor.
Detailed Description
1. Overview
In certain first aspects, the invention provides methods of preparing DNA strands for sequencing, as well as genetic constructs, libraries, and arrays using DNA strands prepared according to these methods. In certain second aspects, the invention provides methods of sequencing using the DNA strands, genetic constructs, libraries and arrays prepared according to the first aspect.
Preparation of DNA strands for sequencing
In one method, DNA strands for sequencing are generated by:
a) providing a template DNA polynucleotide comprising a first target DNA sequence inserted between a first adaptor 3' to the first target DNA sequence and a second adaptor 5' to the first target DNA sequence, and optionally a third adaptor 3' to the first adaptor and a second target DNA sequence inserted between the first adaptor and the third adaptor, wherein the template DNA polynucleotide is immobilized on a substrate,
b) combining a first primer with the immobilized template DNA polynucleotide and hybridizing the first primer to a first primer binding sequence in a first adaptor, wherein the first primer is not immobilized on the substrate when the first primer is combined with the immobilized template DNA polynucleotide;
c) extending the first primer using a first DNA polymerase to produce a second strand, wherein the second strand comprises a sequence complementary to the first target DNA sequence and a sequence complementary to at least a portion of the second adaptor;
d) combining a second primer with the immobilized template DNA polynucleotide, hybridizing the second primer to a second primer binding sequence, wherein the second primer binding sequence is 3' to the first primer binding sequence, wherein the second primer is not immobilized on the substrate when the second primer is combined with the immobilized template DNA polynucleotide;
e) extending the second primer using a DNA polymerase having strand displacement activity to produce a third strand,
wherein the second primer is extended to produce a third strand, partially displacing the second strand, thereby producing a partially hybridized second strand having:
(i) a hybridizing portion that hybridizes to a template DNA polynucleotide, and
(ii) an unhybridized overhang comprising a sequence complementary to the first target DNA sequence and a sequence complementary to at least a portion of the second adaptor, wherein the unhybridized portion is 5' to the hybridized portion in the second strand.
Fig. 1 shows the above steps (a) - (e).
Panel 1.1 shows a template DNA polynucleotide, which comprises a first target DNA sequence inserted between a first adaptor 3' to the first target DNA sequence and a second adaptor 5' to the first target DNA sequence.
Panel 1.2 shows a first primer hybridized to a first primer binding sequence ① in the first adaptor.
Panel 1.3 shows extension of the first primer using a first DNA polymerase to generate a second strand, wherein the second strand comprises (i) a sequence complementary to the first target DNA sequence ② and (ii) a sequence complementary to at least a portion of the second adaptor ③.
Panel 1.4 shows a second primer hybridized to a second primer binding sequence ④, wherein the second primer binding sequence is 3' to the first primer binding sequence. In the embodiment shown in FIG. 1, the second primer binding sequence is contained in the first adaptor, 3' to the first primer binding sequence (compare FIG. 2, panel 2.4, in which the second primer binding sequence is in the third adaptor).
Panel 1.5 shows extension of the second primer with a DNA polymerase having strand displacement activity to produce a third strand. As shown in panel 1.5, the extension of the third strand partially displaces the second strand. This partial displacement results in a second strand that is partially hybridized to the template DNA polynucleotide (or "first strand"). The partially hybridized second strand has a hybridizing portion ⑤, which hybridizes to the template DNA polynucleotide, and an unhybridized overhang portion ⑥. The unhybridized overhang portion ⑥ contains a sequence complementary to the first target DNA sequence ② and a sequence complementary to at least a portion of the second adaptor ⑦.
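The geometry of steps (a)-(e) can be illustrated with a minimal string-based sketch. It is only a toy model, not the method of the invention: the sequences, the function names (`extend_from_site`, `partially_displace`), and the number of displaced bases are all hypothetical, and the model assumes that the third strand displaces the second strand starting from the end nearest the second primer binding sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_from_site(template: str, site_end: int) -> str:
    """Strand produced by a primer whose binding site ends at
    template[site_end - 1], extended to the 5' end of the template."""
    return reverse_complement(template[:site_end])


def partially_displace(second_strand: str, n_displaced: int):
    """Split the second strand into (overhang, hybridized portion).

    n_displaced counts the bases peeled off by the advancing third strand;
    it must stay below the strand length so displacement is only partial.
    """
    if not 0 < n_displaced < len(second_strand):
        raise ValueError("displacement must be partial")
    return second_strand[:n_displaced], second_strand[n_displaced:]


# Toy template, written 5'->3' (all sequences are hypothetical):
# second adaptor - target - first adaptor (first site, then second site).
second_adaptor = "ACGTACGTAC"
target = "GATTACAGGC"
p1_site = "TTGACCAT"   # first primer binding sequence
p2_site = "CCGGAATT"   # second primer binding sequence, 3' to the first
template = second_adaptor + target + p1_site + p2_site

# Steps (b)-(c): first primer extension gives the second strand.
p1_end = len(second_adaptor) + len(target) + len(p1_site)
second_strand = extend_from_site(template, p1_end)

# Steps (d)-(e): the third strand displaces the primer, the target
# complement, and 4 bases of the second-adaptor complement (assumed extent).
n_displaced = len(p1_site) + len(target) + 4
overhang, hybridized = partially_displace(second_strand, n_displaced)
```

In this sketch the overhang carries the complement of the target plus part of the complement of the second adaptor, while the remaining bases stay paired with the template, which mirrors portions ⑤ and ⑥ of panel 1.5.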
Fig. 2 shows a second scheme illustrating the above steps (a) - (e).
Panel 2.1 shows a template DNA polynucleotide, which comprises (i) a first target DNA sequence inserted between a first adaptor 3' to the first target DNA sequence and a second adaptor 5' to the first target DNA sequence, and (ii) a third adaptor 3' to the first adaptor and a second target DNA sequence inserted between the first adaptor and the third adaptor.
Panel 2.2 shows a first primer hybridized to a first primer binding sequence ① in the first adaptor.
Panel 2.3 shows extension of the first primer using a first DNA polymerase to generate a second strand, wherein the second strand comprises (i) a sequence complementary to the first target DNA sequence ② and (ii) a sequence complementary to at least a portion of the second adaptor ③.
Panel 2.4 shows a second primer hybridized to a second primer binding sequence ④, wherein the second primer binding sequence is 3' to the first primer binding sequence. In the embodiment shown in FIG. 2, the second primer binding sequence is contained in the third adaptor.
Panel 2.5 shows extension of the second primer with a DNA polymerase having strand displacement activity to generate a third strand. As shown in panel 2.5, the extension of the third strand partially displaces the second strand. This partial displacement results in a second strand that is partially hybridized to the template DNA polynucleotide (or "first strand"). The partially hybridized second strand has a hybridizing portion ⑤, which hybridizes to the template DNA polynucleotide, and an unhybridized overhang portion ⑥. The unhybridized overhang portion ⑥ contains a sequence complementary to the first target DNA sequence ② and a sequence complementary to at least a portion of the second adaptor ⑦.
Sequencing DNA strands
DNA sequencing methods can be applied using the partially hybridized second strand as a sequencing template. Since the second strand comprises a sequence complementary to the first target DNA sequence, the method can be used to determine the nucleotide sequence of the first target DNA sequence.
In one method, the sequencing step comprises:
f) hybridizing a sequencing oligonucleotide to a sequence in the third strand that is complementary to at least a portion of the second adaptor, and
g) determining at least a portion of the sequence complementary to the first target DNA sequence. Methods of sequence determination may include, for example and without limitation, sequencing by synthesis (including extension of the sequencing oligonucleotide) and/or sequencing by ligation (including ligation of a probe to the sequencing oligonucleotide), or may include other methods.
FIG. 3 shows a scheme illustrating the above steps (f) - (g).
Panel 3.1 shows the sequencing oligonucleotide ⑧ hybridized to the sequence in the second strand that is complementary to at least a portion of the second adaptor.
Panel 3.2 shows extension of the sequencing oligonucleotide to determine at least a portion of the sequence complementary to the first target DNA sequence (and thus the first target sequence) by a sequencing-by-synthesis method, in which the sequencing oligonucleotide serves as the primer for primer extension to produce extension product ⑨.
Panel 3.3 shows ligation of a probe ⑩ to the sequencing oligonucleotide to generate a ligation product comprising a sequence complementary to the sequence of the second strand, whereby sequencing by ligation is used to determine the sequence of the second strand (and thus the first target sequence).
The following sections describe each of these elements and steps in more detail. It is understood that although aspects of the invention are described with reference to particular embodiments or illustrations, other embodiments will be apparent to those skilled in the art upon reading this disclosure and are considered to be within the scope of the invention.
2. Template DNA Polynucleotide
As used herein, a "template DNA polynucleotide" is a DNA construct that comprises a target DNA sequence inserted between two adapter sequences, referred to herein as a "first adapter" (3' to the target DNA sequence) and a "second adapter" (5' to the target DNA sequence). As used herein, "inserted" refers to a target DNA sequence located between adapter sequences. In some embodiments, the target DNA sequence is contiguous with the adapter sequences, with no other bases or sequences present between the target DNA sequence and the adapter sequences, but this is not required in all embodiments. Sequences inserted between adapters may also be referred to as sequences flanked by adapters.
Using the methods of the invention, at least a portion of a target DNA sequence is determined. The target DNA may be from any number of sources, as described below.
Any method for joining a target DNA sequence of interest to flanking adaptors can be used to generate the template DNA polynucleotide. For example, a target DNA sequence of interest can be obtained from a biological source (e.g., a cell, tissue, organism, or population of cells or organisms), and flanking adapters can be added by ligation, amplification, transposition, insertion, and the like. See, e.g., U.S. Pat. No. 8,445,194 (describing DNA nanospheres comprising an adapter and a target sequence), International Patent Publication No. WO 00/18957 (describing sequencing of target sequences flanked by adapters), and U.S. Patent Publication No. US 2010/0120098 (describing fragmentation), each of which is incorporated herein for all purposes.
3. Template DNA polynucleotide library
In many massively parallel sequencing (MPS) techniques, a library of sequencing templates is generated and individual species in the library are sequenced in parallel. For example, in the DNA nanosphere approach developed by Drmanac et al., genomic DNA is fragmented and individual fragments are used to generate circular DNAs in which platform-specific oligonucleotide adaptors separate genomic DNA sequences (the separated genomic DNA sequences may be contiguous in the genome). The circular DNA is amplified to produce single-stranded concatemers ("DNA nanospheres") that can be immobilized on a substrate. In "Solexa"-type sequencing, genomic DNA is fragmented and the DNA fragments are then ligated to platform-specific oligonucleotide adaptors. The adaptors are used to immobilize the individual fragments on a substrate, where they are amplified in situ to generate clonally clustered amplicons for sequencing. Many other MPS sequencing methods are known.
Thus, it should be recognized that although the invention is sometimes described with respect to a single target DNA (e.g., a single DNB template DNA), MPS sequencing is performed using a large library of sequences, typically on an array of constructs (e.g., an array comprising DNA concatemers or clonal copies of template DNA polynucleotides), which comprise many different target sequences (e.g., different genomic DNA fragments) but share common adaptor sequences.
Methods for making MPS sequencing libraries, and methods of sequencing using such libraries, are well known in the art, and the reader is assumed to be familiar with them. See, e.g., Shendure, J., and H. Ji, "Next-generation DNA sequencing," Nature Biotechnology 26.10 (2008): 1135-; Shendure, J., et al., "Advanced sequencing technologies: methods and goals," Nat. Rev. Genet. 5, 335-344 (2004); Metzker, Michael L., "Sequencing technologies - the next generation," Nature Reviews Genetics 11.1 (2010): 31-46; Drmanac, R., et al., "Accurate whole genome sequencing as the ultimate genetic test," Clinical Chemistry 61.1 (2015): 305-306; Drmanac, R., et al., "Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays," Science 327.5961 (2010): 78-81; Drmanac, S., et al., "Accurate sequencing by hybridization for DNA diagnostics and individual genomics," Nat. Biotechnol. 16, 54-58 (1998); Margulies, M., et al., "Genome sequencing in microfabricated high-density picolitre reactors," Nature 437.7057 (2005): 376-; Ng, S., et al., "Targeted capture and massively parallel sequencing of 12 human exomes," Nature 461.7261 (2009): 272-276; Meng, H-M, et al., "DNA dendrimer: an efficient nanocarrier of functional nucleic acids for intracellular molecular sensing," ACS Nano 8.6 (2014): 6171-; Head, S., et al., "Practical Guide"; Shendure, J., et al., "Accurate multiplex polony sequencing of an evolved bacterial genome," Science 309, 1728-1732 (2005); Brenner, S., et al., "Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays," Nat. Biotechnol. 18, 630-634 (2000); Ronaghi et al., "Real-time DNA sequencing using detection of pyrophosphate release," Anal. Biochem. 242, 84-89 (1996); McKernan, K., et al., "Reagents, methods, and libraries for bead-based sequencing," U.S. patent publication No. 20080003571 (2006); Adessi, C., et al., "Solid phase DNA amplification: characterisation of primer attachment and amplification mechanisms," Nucleic Acids Res. 28, e87 (2000), each of which is incorporated herein in its entirety for all purposes, including to teach DNA sequencing library preparation and MPS sequencing platforms and techniques.
4. Target DNA sequence
The target DNA portion of the template DNA polynucleotide may be from any source, including naturally occurring sequences (e.g., genomic DNA, cDNA, mitochondrial DNA, cell-free DNA, etc.), artificial sequences (e.g., synthetic sequences, products of gene shuffling or molecular evolution, etc.), or combinations thereof. The target DNA can be from a source such as an organism or cell (e.g., from a plant, animal, virus, bacterium, fungus, human, mammal, or insect), a forensic source, etc. The target DNA sequence may be from a population of organisms, such as a population of intestinal bacteria. The target DNA sequence may be obtained directly from the sample, or may be the product of an amplification reaction, a fragmentation reaction, or the like.
The target DNA may have a length within a specific size range, for example, a length of 50 to 600 nucleotides. Other exemplary size ranges include lengths of 25 to 2000, 50 to 1000, 100 to 600, 50-100, 50-300, 100-300, and 100-400 nucleotides. In a template DNA polynucleotide having two or more different target DNAs, the target DNAs may have the same length or different lengths. In a library of template DNA polynucleotides, the members of the library may in some embodiments be of similar length (e.g., all in the range of 25 to 2000 nucleotides or another range).
In one approach, the target DNA may be prepared by fragmenting larger source DNA (e.g., genomic DNA) to produce fragments within a desired size range. In some methods, a size selection step is used to obtain a pool of fragments within a particular size range.
5. Adapter
The template DNA or template DNA polynucleotide used in the methods disclosed herein comprises two or more adaptors. An adapter may comprise elements for immobilizing the template DNA polynucleotide on a substrate, elements for binding to oligonucleotides used in sequence determination (e.g., binding sites for primers extended in sequencing-by-synthesis methods and/or binding sites for probes used in cPAL or other ligation-based sequencing methods, etc.), or elements for both immobilization and sequencing. The adapter may include additional features such as, but not limited to, restriction endonuclease recognition sites, extension primer hybridization sites (for analysis), barcode sequences, unique molecular identifier sequences, and polymerase recognition sequences.
The adapter sequences may have a length, structure, and other characteristics suitable for the particular sequencing platform and intended use. For example, the adaptor may be single-stranded, double-stranded, or partially double-stranded, and may have a length suitable for the intended use. For example, the length of the adapter can be in the range of 10-200 nucleotides, 20-100 nucleotides, 40-100 nucleotides, or 50-80 nucleotides. In some embodiments, the adaptor may comprise one or more modified nucleotides comprising modifications to the base, sugar and/or phosphate moieties.
One skilled in the art will appreciate that different members of a library will typically comprise common adaptor sequences, although different species or subcategories in a library may have unique characteristics, such as subcategory-specific barcodes.
An individual adapter sequence may comprise a plurality of functionally distinct subsequences. For example, as discussed in detail in this disclosure, a single adaptor sequence may comprise two or more primer binding sequences (which may be recognized by different complementary primers or probes). Functionally different sequences within an adapter may be overlapping or non-overlapping. To illustrate, given an adapter that is 40 bases long, in one embodiment, bases 1-20 are the first primer binding site and bases 21-40 are the second primer binding site. In a different embodiment, bases 1-15 are the first primer binding site and bases 21-40 are the second primer binding site. In yet another embodiment, bases 5-25 are the first primer binding site and bases 15-35 are the second primer binding site. Similarly, given an adaptor that is 40 bases long, bases 1-20 can be the immobilization sequence and bases 21-40 can be the primer binding site. Different primer binding sequences in an adaptor (or in different adaptors of a template DNA polynucleotide) may be of the same or different lengths.
The adapters (e.g., first adapters, second adapters, third adapters, etc.) can comprise one, two, or more primer binding sequences. A primer binding sequence is functionally defined as the site or sequence to which a primer (or oligonucleotide) specifically binds. For example, an adaptor having two primer binding sequences may be specifically bound by two different primers. In one approach, two primer binding sequences in the same adaptor are overlapping, i.e., share a portion of their nucleotide sequence. In some embodiments, the overlapping region is no more than 50%, or 40%, or 30%, or 20%, or 10%, or 5% of either of the two overlapping primer binding sequences. In another approach, the primer binding sequences are non-overlapping. In some embodiments, the non-overlapping primer binding sequences are immediately adjacent to each other; in some other embodiments, the non-overlapping primer binding sequences are separated by 1-10, 10-20, 30-40, or 40-50 nucleotides.
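The overlap and spacing of two primer binding sequences within one adaptor can be checked with a short sketch. Sites are represented as hypothetical `(first_base, last_base)` pairs using the 1-based, inclusive numbering of the 40-base examples above; the function names are illustrative only.

```python
def site_length(site):
    """Length of a site given as (first_base, last_base), 1-based inclusive."""
    return site[1] - site[0] + 1


def overlap_length(site_a, site_b):
    """Number of bases shared by two sites (0 if they do not overlap)."""
    start = max(site_a[0], site_b[0])
    end = min(site_a[1], site_b[1])
    return max(0, end - start + 1)


def gap_length(site_a, site_b):
    """Number of bases separating two sites (0 if adjacent or overlapping)."""
    start = max(site_a[0], site_b[0])
    end = min(site_a[1], site_b[1])
    return max(0, start - end - 1)


def overlap_fractions(site_a, site_b):
    """Overlap expressed as a fraction of each site's own length."""
    shared = overlap_length(site_a, site_b)
    return shared / site_length(site_a), shared / site_length(site_b)


# The three 40-base adaptor layouts used as examples in the text.
adjacent = ((1, 20), (21, 40))      # immediately adjacent sites
separated = ((1, 15), (21, 40))     # sites separated by a gap
overlapping = ((5, 25), (15, 35))   # partially overlapping sites
```

For the third layout the two 21-base sites share bases 15-25, i.e., 11 bases, so a percentage limit such as "no more than 50%" can be tested directly against the values returned by `overlap_fractions`.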
The primer binding sequence will be of sufficient length to allow for primer hybridization, with the exact length and sequence depending on the intended function of the primer (e.g., extension primer, ligation substrate, index sequence, etc.). The length of the primer binding sequence is typically at least 10, at least 12, at least 15 or at least 18 bases.
It will be apparent that different adaptors may have the same sequence or different sequences and may have the same primer binding sequence or different primer binding sequences in a given template DNA polynucleotide. See, e.g., section 7 below. Although certain figures are provided to illustrate the invention, the use of cross-hatching or the like to represent adapters should not be construed as indicating sequence identity.
6. Primers
The terms "primer" and "probe" are used interchangeably and refer to an oligonucleotide having a sequence complementary to a primer or probe binding site in a DNA. Such primers may be "extension primers" or "sequencing oligonucleotides". An "extension primer" is used in a primer extension reaction to produce the "second" and "third" DNA strands described above. Thus, an extension primer is a substrate for a DNA polymerase and can be extended by the addition of nucleotides.
Primers and probes useful in the present invention (e.g., primers that are capable of extension or ligation under sequencing assay conditions) can readily be selected or designed by one of ordinary skill in the art. Without limiting the invention, an extension primer typically has a length in the range of 10-100 nucleotides, often 12-80 nucleotides, and usually 15-80 nucleotides.
It will be appreciated that the primers and probes may be fully or partially complementary to the binding sequences in the adapters to which they hybridize. For example, a primer may be at least 85%, 90%, 95%, or 100% complementary to the sequence to which it hybridizes.
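One way to quantify how well a primer matches its binding sequence is to compare the primer with the reverse complement of that sequence, position by position. The sketch below is a simplified, ungapped comparison that assumes the primer and the binding sequence have equal length; the function name `percent_complementarity` and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_complementarity(primer: str, binding_site: str) -> float:
    """Percentage of primer positions that base-pair with the binding site.

    Both sequences are written 5'->3' and must have the same length; no
    gaps or partial alignments are considered in this simplified model.
    """
    if len(primer) != len(binding_site):
        raise ValueError("primer and binding site must have equal length")
    # Positions of the site as they face the primer when the two anneal.
    paired = reverse_complement(binding_site)
    matches = sum(p == s for p, s in zip(primer, paired))
    return 100.0 * matches / len(primer)


# Hypothetical 20-base binding site and two candidate primers.
site = "GATTACAGGCTTGACCATCC"
perfect_primer = reverse_complement(site)
# Introduce a single mismatch at the 5' end of the primer.
first = perfect_primer[0]
mismatched_primer = ("A" if first != "A" else "C") + perfect_primer[1:]
```

With a 20-base site, a single mismatch gives 95%, which would satisfy a 95% threshold but not a 100% threshold.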
The primer may also contain additional sequences at the 5' end of the primer that are not complementary to the primer binding sequence in the adapter. The non-complementary portion of the primer can be of a length that does not interfere with hybridization between the primer and its primer binding sequence. Typically, the non-complementary portion is 1 to 100 nucleotides in length. In some embodiments, the non-complementary portion is 4 to 8 nucleotides in length. Primers may comprise DNA and/or RNA moieties, and in some methods, primers used in the invention may also have one or more modified nucleotides containing modifications to base, sugar, and/or phosphate moieties.
A "sequencing oligonucleotide" may be an extension primer used in a sequencing-by-synthesis reaction (also referred to as "sequencing by extension"). A "sequencing oligonucleotide" may also be an oligonucleotide used in a sequencing-by-ligation method such as "combinatorial probe-anchor ligation" (cPAL), described in U.S. patent publication No. 20140213461, which is incorporated herein by reference for all purposes, including singleplex, duplex, and multiplex cPAL. Briefly, cPAL comprises cycles of the following steps. First, a "sequencing oligonucleotide" (or "anchor") is hybridized to a complementary sequence in an adaptor of the third DNA strand described above. An enzymatic ligation reaction is then performed in which the anchor is ligated to a member of a population of fully degenerate probes, e.g., 8-mer probes, that are labeled (e.g., with fluorescent dyes). Probes may have a length of, for example, about 6 to about 20 bases, or about 7 to about 12 bases. In any given cycle, the population of 8-mer probes used is constructed such that the identity of one or more positions of each probe correlates with the identity of the fluorophore attached to that probe. In variations of basic cPAL known in the art, such as multiplex cPAL, partially or fully degenerate secondary anchors are used to extend the readable sequence.
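The relationship between probe position, fluorophore, and base call in a ligation cycle can be illustrated with a toy model. This is a sketch under strong assumptions, not the cPAL chemistry itself: the dye labels are hypothetical placeholders, positions are counted outward from the ligation junction, and only a probe that base-pairs with the template at the interrogated position is assumed to ligate.

```python
# Hypothetical dye labels; the text does not name specific fluorophores.
DYE_FOR_PROBE_BASE = {"A": "dye_A", "C": "dye_C", "G": "dye_G", "T": "dye_T"}
PAIRS_WITH = {"A": "T", "C": "G", "G": "C", "T": "A"}


def observed_dye(template_next_to_anchor: str, position: int) -> str:
    """Dye expected when `position` (1-based, counted outward from the
    ligation junction) is the interrogated position of the probe pool."""
    template_base = template_next_to_anchor[position - 1]
    probe_base = PAIRS_WITH[template_base]   # base carried by the ligated probe
    return DYE_FOR_PROBE_BASE[probe_base]


def call_template_base(dye: str) -> str:
    """Infer the template base from the dye seen after ligation."""
    probe_base = next(b for b, d in DYE_FOR_PROBE_BASE.items() if d == dye)
    return PAIRS_WITH[probe_base]


def read_positions(template_next_to_anchor: str, positions) -> str:
    """Run one modeled ligation cycle per position and join the base calls."""
    return "".join(
        call_template_base(observed_dye(template_next_to_anchor, pos))
        for pos in positions
    )


# Hypothetical 8 template bases adjacent to the anchor, read at 5 positions.
bases_read = read_positions("GATTACAG", [1, 2, 3, 4, 5])
```

Each cycle in this model reads one position, so reading five positions requires five probe pools, each with a different interrogated position.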
7. Relationship of target sequence to adapter sequence
As described above, the template DNA polynucleotide comprises a first target DNA sequence inserted between the first adaptor 3 'to the first target DNA sequence and the second adaptor 5' to the first target DNA sequence.
The template DNA polynucleotide may comprise a plurality of target DNA sequences (e.g., more than 25 or more than 50; sometimes in the range of 2 to 1000, 50-800 or 300-600 copies), each of which may be flanked by a pair of adapters. Thus, in one embodiment, the template DNA polynucleotide comprises a third adaptor 3' to the first adaptor and a second target DNA sequence inserted between the first adaptor and the third adaptor. In some cases, the target DNA sequence is contained in a single-stranded DNA nanosphere. See, for example, section 7.1 and fig. 2 and 4.
The template DNA polynucleotide may comprise a single target DNA sequence flanked by two adaptors (sometimes referred to as "adaptor-containing target sequences"). See, for example, section 7.2 and fig. 1 and 6.
7.1. Template DNA polynucleotide: concatemers and DNB
In some embodiments, the template DNA polynucleotide used in the present invention is a DNA concatemer. As used herein, the term "concatemer" refers to a long contiguous DNA molecule comprising multiple copies of the same DNA sequence (tandemly linked "monomers" or "monomer sequences"). A "DNA concatemer" may comprise at least 2 monomers, at least 3 monomers, at least 4 monomers, at least 10 monomers, at least 25 monomers, at least 50 monomers, at least 200 monomers, or at least 500 monomers. In some embodiments, the DNA concatemer comprises 25-1000 monomers, such as 50-800 monomers or 300-600 monomers. Each monomer comprises at least one target DNA sequence. The DNA concatemers used in the methods of the invention may be DNA nanospheres or "DNBs". Without intending to limit the invention in any way, DNA nanospheres are described in Drmanac et al., 2010, "Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays," Science 327.5961: 78-81; Dahl et al., "Methods and oligonucleotide designs for insertion of multiple adaptors into library constructs," U.S. Pat. No. 7,897,344 (March 1, 2011); Drmanac et al., "Single Molecule Arrays for Genetic and Chemical Analysis," U.S. Pat. No. 8,445,194 (May 21, 2013); and Drmanac et al., "Methods and compositions for long fragment read sequencing," U.S. Pat. No. 8,592,150 (November 26, 2013), each of which is incorporated herein by reference, as well as in other references described herein. A "DNA nanosphere" or "DNB" is a single-stranded DNA concatemer of sufficient length to form a random coil that fills a roughly spherical volume in solution (e.g., SSC buffer at room temperature). In some embodiments, DNA nanospheres have a diameter of about 100 to 300 nm. The template DNA in a DNB may be referred to as a "DNB template strand".
In one embodiment, the monomers of the concatemer comprise an adaptor sequence and a target DNA sequence. Because the monomers are linked in tandem, each target DNA sequence will be flanked by two adaptor sequences.
In some methods, the target DNA sequence in the monomer is flanked by two "half-adaptor" sequences, such that each target sequence that is ligated in tandem in the concatemer is flanked by two adaptors.
In some methods, the monomeric unit comprises one, two, three, or four or more adaptors. In some embodiments, all adaptors of a monomer (and concatemer) have the same sequence. In other embodiments, the adapters may have different sequences, such as two, three, or four different sequences.
It will be appreciated that a single monomer may comprise more than one template DNA sequence. For example, the monomer may comprise the structure A1-T1-A2-T2, wherein T1 and T2 are template DNAs having the same or different sequences, and A1 and A2 are adaptors having the same or different sequences. The corresponding concatemer will have the structure A1-T1-A2-T2-A1-T1-A2-T2-A1-T1-A2-T2.... In related embodiments, the monomer may comprise the structure A1-T1-A2-T2-A3, wherein T1 and T2 are template DNAs having the same or different sequences, A2 is an adaptor, and A1 and A3 are "half-adaptors". The corresponding concatemer will include the structure A2-T2-A3A1-T1-A2-T2-A3A1-T1-A2-T2-A3A1..., wherein the A3 and A1 half-adaptors together function as an adaptor. For purposes of illustration and not limitation, table 1 shows exemplary concatemer structures. In table 1, N is greater than 1. Typically N is at least 3, often at least 4, at least 10, at least 25, at least 50, at least 200 or at least 500 monomers. In some embodiments, N is in the range of 25-1000, such as 50-800 or 300-600. In the case where the template DNA polynucleotide is a DNA nanosphere, N is at least 25, typically at least 50, often 50-800 or 300-600.
TABLE 1
Concatemer structure
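For purposes of illustration only, the monomer and concatemer layouts described above can be sketched in a few lines of Python. The function names `concatemer_layout` and `merge_half_adaptors` and the element labels are hypothetical and are used here only to make the tandem arrangement concrete; they are not part of the disclosed method.

```python
def concatemer_layout(monomer, n):
    """Return the ordered elements of a concatemer of n tandem monomer copies."""
    if n < 2:
        raise ValueError("a concatemer comprises at least 2 monomers")
    return list(monomer) * n


def merge_half_adaptors(elements, left_half, right_half):
    """Join each adjacent (left_half, right_half) pair into a single label.

    Models half-adaptors (e.g., A3 followed by A1) that act together as one
    adaptor at every junction between monomers.
    """
    merged = []
    i = 0
    while i < len(elements):
        is_junction = (
            i + 1 < len(elements)
            and elements[i] == left_half
            and elements[i + 1] == right_half
        )
        if is_junction:
            merged.append(left_half + right_half)
            i += 2
        else:
            merged.append(elements[i])
            i += 1
    return merged


# Monomer A1-T1-A2-T2 repeated three times (N = 3).
full = concatemer_layout(["A1", "T1", "A2", "T2"], 3)
print("-".join(full))

# Monomer A1-T1-A2-T2-A3, where A1 and A3 are half-adaptors.
half = merge_half_adaptors(
    concatemer_layout(["A1", "T1", "A2", "T2", "A3"], 3), "A3", "A1"
)
print("-".join(half))
```

In the second layout every internal junction appears as the combined label A3A1, so each target sequence other than those at the ends of the concatemer is flanked by two complete adaptors.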
DNA concatemers (including DNA nanospheres) can be prepared by any suitable method. In one approach, a single genomic fragment is used to generate single-stranded circular DNA with adaptors inserted between target sequences that are adjacent or closely spaced in the genome. Circular DNA constructs can be amplified enzymatically (e.g., by rolling circle replication, or by linking monomers to each other). For purposes of illustration and not limitation, DNA nanospheres can be prepared according to the methods described in U.S. Pat. No. 8,445,194 and U.S. Pat. No. 8,592,150.
7.2 template DNA polynucleotides: adapter-containing target sequences
Alternatively, the template DNA polynucleotide may comprise a single target DNA sequence flanked by two adaptors. Template DNA polynucleotides having a single target DNA sequence and a pair of flanking adaptors may be particularly useful in Solexa-type sequencing. See, for example, fig. 6.
In some embodiments, the template DNA is a non-concatemeric DNA construct comprising at least one target DNA sequence and at least two adaptors. In some embodiments, the construct comprises more than two adaptors and/or more than one target DNA sequence.
In some embodiments, complementary strands are first synthesized from a single DNA strand comprising one or more adaptors and one or more target DNA sequences to form double-stranded DNA. One or both strands of the double-stranded DNA may be used as a template DNA.
In some embodiments, clonal copies of non-concatemers are generated and used as template DNA according to the present invention. Methods for preparing cloned copies of DNA sequences including non-concatemers are well known in the art. See the references cited in section 3 above.
8. Substrate and spacer
In some applications, a template DNA polynucleotide is immobilized on a substrate. Typically, immobilization is performed prior to synthesis of the "second" and "third" strands described above. In some cases, immobilization is performed prior to synthesis of the "third" strand described above. Exemplary substrates may be substantially flat (e.g., glass slides) or non-flat, and may be unitary or formed from a plurality of distinct units (e.g., beads). Exemplary materials include glass, ceramics, silica, silicon, metals, elastomers (e.g., silicone), and polyacrylamides (e.g., polyacrylamide hydrogels; see WO 2005/065814). In some embodiments, the substrate comprises an ordered or disordered array of immobilization sites or wells. In some methods, the target DNA polynucleotide is immobilized on a substantially planar substrate, such as a substrate comprising an ordered or disordered array of immobilization sites or wells. In some methods, the target DNA polynucleotide is immobilized on a bead.
Polynucleotides can be immobilized on a substrate by a variety of techniques including covalent and non-covalent attachment. In one embodiment, the surface may comprise capture probes that form complexes (e.g., double-stranded duplexes) with components of the polynucleotide molecule (e.g., adaptor oligonucleotides). In another embodiment, the surface may have reactive functional groups that react with complementary functional groups on the polynucleotide molecule to form covalent bonds. Long DNA molecules, such as a few nucleotides or longer, can also be effectively attached to hydrophobic surfaces, such as clean glass surfaces with low concentrations of various reactive functional groups (e.g., -OH groups). In another embodiment, the polynucleotide molecules may adsorb to the surface through non-specific interactions with the surface or through non-covalent interactions (e.g., hydrogen bonding, van der Waals forces, etc.).
For example, DNA nanospheres can be affixed to discrete spaced apart regions as described in Drmanac et al, U.S. Pat. No.8,609,335. In one method, adaptor-bearing DNA is immobilized on a substrate by hybridization to an immobilized probe sequence, and a clonal cluster comprising DNA template polynucleotides is generated using a solid-phase nucleic acid amplification method. See, for example, WO 98/44151 and WO 00/18957.
In some embodiments, the DNA template polynucleotides are partitioned in emulsions, droplets, beads and/or microwells prior to the primer extension step (Margulies et al., "Genome sequencing in microfabricated high-density picolitre reactors," Nature 437:7057 (2005); Shendure et al., "Accurate multiplex polony sequencing of an evolved bacterial genome," Science 309:1728-1732 (2005)).
9. DNA polymerase
The methods of the invention can be performed using methods, tools, and reagents well known to those of ordinary skill in the art of molecular biology and MPS sequencing, including nucleic acid polymerases (RNA polymerases, DNA polymerases, reverse transcriptases), phosphatases and phosphorylases, DNA ligases, and the like. In particular, certain primer extension steps may be performed using one or more DNA polymerases. Certain extension steps are performed using a DNA polymerase with strand displacement activity.
The methods disclosed herein use the polymerase and strand displacement activity of the DNA polymerase to generate a DNA strand that is complementary to a template DNA. In one method, the present invention uses a DNA polymerase with strong 5'→ 3' strand displacement activity. The polymerase preferably does not have 5'→ 3' exonuclease activity. However, when the activity does not prevent the method of the invention from being carried out, for example by using reaction conditions that inhibit exonuclease activity, a DNA polymerase having 5'-3' exonuclease activity may be used.
The term "strand displacement activity" describes the ability to displace downstream DNA encountered during synthesis. Strand displacement activity is described in U.S. patent publication No. 20120115145 (which is incorporated herein by reference) as follows: "Strand displacement activity" refers to the phenomenon whereby a biological, chemical or physical agent (e.g., DNA polymerase) causes a paired nucleic acid to dissociate from its complementary strand in the 5' to 3' direction, in conjunction with, and close to, template-dependent nucleic acid synthesis. Strand displacement begins at the 5' end of the paired nucleic acid sequence, so that the enzyme carries out nucleic acid synthesis immediately 5' of the site of displacement. The newly synthesized nucleic acid and the displaced nucleic acid generally have the same nucleotide sequence, which is complementary to the template nucleic acid strand. The strand displacement activity may reside on the same molecule as the activity of nucleic acid synthesis (particularly DNA synthesis), or it may be a separate and independent activity. DNA polymerases, such as the Klenow fragment of E. coli DNA polymerase I, DNA polymerase I, T7 or T5 phage DNA polymerase, and HIV reverse transcriptase, are enzymes having both polymerase activity and strand displacement activity. An agent such as a helicase may be used in combination with an inducing agent that does not have strand displacement activity to produce a strand displacement effect, that is, displacement of a nucleic acid coupled with synthesis of a nucleic acid of the same sequence. Likewise, proteins such as RecA or single-stranded binding proteins from E. coli or from another organism can be used, along with other inducers, to generate or facilitate strand displacement (Kornberg and Baker, 1992, DNA Replication, second edition, pp. 113-225, Freeman, NY).
In one approach, the polymerase is Phi29 polymerase. Phi29 polymerase has strong displacement activity at moderate temperatures (e.g., 20-37 ℃).
In one approach, Bst DNA polymerase, large fragment (NEB # M0275) is used. Bst DNA polymerase is active at elevated temperatures (about 65 ℃).
In one approach, the polymerase is Deep-VentR DNA polymerase (NEB # M0258) (Hommelsheim et al, Scientific Reports 4: 5052 (2014)).
10. Preparation of complementary strands
This section describes certain aspects of the steps of preparing the second and third DNA strands.
The generation of a DNA strand complementary to a template DNA or target DNA sequence ("first strand") begins with hybridizing a first primer to a first primer binding sequence in a first adaptor in the template DNA. See fig. 1, panel 1.2 and fig. 2, panel 2.2. The first primer is then extended by a first DNA polymerase to produce a second strand. See fig. 1, panel 1.3 and fig. 2, panel 2.3. The first DNA polymerase may be a polymerase having strand displacement activity or a polymerase not having strand displacement activity.
A third strand is generated by extending a second primer that hybridizes to a second primer binding sequence (3' to the first primer binding sequence in the template DNA). The second primer binding sequence may be in a third adaptor (if present). See fig. 2, panel 2.4. The second primer binding sequence may also be in the same adaptor as the first primer binding sequence and 3' to the first primer binding sequence. See fig. 1, panel 1.4. Extension of the second primer to produce a third strand is performed using a DNA polymerase having strand displacement activity. See fig. 1, panel 1.5 and fig. 2, panel 2.5. During the extension process, the third strand displaces the 5' portion of the second strand that it encounters, causing the second strand to partially dissociate from the template DNA and form an overhang.
The extension-displacement reaction is controlled such that the second strand is not completely displaced, but is partially hybridized and partially unhybridized to the template DNA. The unhybridized portion ("overhang") comprises a first sequence that is complementary to a first target DNA sequence, a second sequence that is complementary to at least a portion of the first adaptor, and a third sequence that is complementary to at least a portion of the second adaptor, wherein the first sequence is flanked by the second sequence and the third sequence. Thus, in one embodiment, the overhang is flanked by an adaptor sequence (or its complement) or a portion thereof.
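The relationship between the overhang and the hybridized portion of the second strand can be made concrete with a simple coordinate model. The following Python sketch is illustrative only: the name `displacement_state`, the coordinate convention (nucleotide positions measured along the template in the direction of synthesis), and the example numbers are all assumptions introduced here, not values from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class DisplacementState:
    overhang_nt: int       # displaced (unhybridized) 5' portion of the second strand
    hybridized_nt: int     # portion of the second strand still paired to the template
    fully_displaced: bool  # True if nothing remains hybridized


def displacement_state(second_5p, second_3p, third_3p):
    """Describe the second strand given how far the third strand has extended.

    second_5p: position of the 5' end of the second strand (first primer)
    second_3p: position reached by the 3' end of the second strand
    third_3p:  position reached by the 3' end of the third strand
    All positions are hypothetical coordinates along the direction of synthesis.
    """
    if second_3p <= second_5p:
        raise ValueError("second strand must have positive length")
    length = second_3p - second_5p
    # The third strand displaces only the part of the second strand it has reached.
    displaced = min(max(third_3p - second_5p, 0), length)
    return DisplacementState(displaced, length - displaced, displaced == length)


# Controlled reaction: the third strand stops inside the second strand.
print(displacement_state(second_5p=100, second_3p=600, third_3p=400))

# Uncontrolled reaction: the third strand runs past the second strand's 3' end.
print(displacement_state(second_5p=100, second_3p=600, third_3p=700))
```

In this simplified picture, controlling the reaction amounts to stopping extension of the third strand while `hybridized_nt` is still large enough to keep the second strand annealed to the template.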
An example of a first target DNA sequence inserted between a first adaptor and a second adaptor is shown in fig. 1. Another example of a first target DNA sequence inserted between a first adaptor and a second adaptor is shown in fig. 2. The embodiment in fig. 2 shows a second target DNA sequence inserted between the first adaptor and the third adaptor. In this case, the first target DNA sequence and the second target DNA sequence may be the same or different, and may be linked in the genome, etc., as described below.
In some embodiments, as shown in items 3, 5, 6, and 7 of table 1 and fig. 2, the template DNA comprises an additional adaptor (e.g., a third adaptor), 3' to the first adaptor, and a second target DNA sequence inserted between the first adaptor and the third adaptor.
In this embodiment, the first adaptor comprises a first primer binding sequence that can bind to a first primer; and the third adaptor comprises a second primer binding sequence capable of binding the second primer. In some embodiments, the first target DNA and the second target DNA have the same nucleotide sequence. In some embodiments, the first target DNA and the second target DNA have different nucleotide sequences. The first adaptor, the second adaptor and the third adaptor may have the same or different nucleotide sequences.
In one embodiment, as shown in figure 1, the first adaptor comprises both a first primer binding sequence capable of binding to the first primer and a second primer binding sequence capable of binding to the second primer. The second primer binding sequence is 3' to the first primer binding sequence. The first adaptor and the second adaptor may have the same or different nucleotide sequences. In a specific embodiment, the first and second adaptors have the same nucleotide sequence and each adaptor comprises two binding sequences for the first and second primers, respectively.
In some embodiments, the second adaptor in the template DNA comprises one or more primer binding sequences for one or more sequencing oligonucleotides. See fig. 3.
10.1 illustrative examples Using DNB primers
In one approach, the template DNA polynucleotide is a DNA concatemer, e.g., a DNB, comprising a monomeric unit of a DNA sequence having the structure shown in figure 1 or figure 2. Fig. 4 shows an example of the generation of complementary strands from such DNBs. In this particular example, the template DNA polynucleotide may be a DNB comprising a monomeric unit of the DNA structure as shown in figure 2, panel 2.1. The DNB includes a plurality of adaptors having the same nucleotide sequence. In (A), the DNB (each monomer unit comprising the adaptor sequence and the inserted genomic DNA sequence) is hybridized to a complementary primer. In one method, a primer hybridizes to an adaptor (e.g., to all or part of an adaptor sequence) on a template DNA strand. In (B), polymerization is carried out to produce two or more complementary strands or subsequent fragments. In (C), when the 3' end of the newly synthesized strand (third strand) reaches the 5' end of the downstream subsequent strand (second strand), the 5' portion of the subsequent DNA strand (second strand) is displaced by the DNA polymerase, resulting in an overhang. Strands hybridized to one or more of the monomer units of each concatemer may be displaced in this manner.
The extension-displacement reaction conditions are controlled to produce a second strand having a total length and an overhang length optimized for complementary strand sequencing. In one approach, the reaction is terminated by introducing ddNTP (or by other means known to those of ordinary skill in the art) at a time determined to provide the desired product. See section 12 below. In (D), after the overhanging segments are generated, the sequencing oligonucleotide may hybridize to the adaptor in each overhanging segment (i.e., the complement of the adaptor sequence of the template). It will be appreciated that in one embodiment, the subsequent fragment comprises, in addition to the adaptor sequence bound to the extension primer, an overhanging portion of sufficient length to include at least one adaptor sequence, and a hybridizing (duplex) portion of sufficient length to keep the subsequent fragment annealed to the DNB template strand. Sequencing chemistry is then performed, which may be sequencing-by-synthesis (SBS) or other sequencing chemistry. The sequence produced will be the inserted (e.g., genomic) DNA adjacent to and upstream of the adaptor. This sequence information can be paired with sequences generated from sequencing the template strand. Typically, sequencing the template strand provides the sequence downstream of the adaptor.
FIG. 5 illustrates primers that can be used to generate complementary strands according to the methods of the invention. The adaptor "Ad 141-2" is ligated to a genomic DNA fragment (not shown) and used to generate a single-stranded DNA loop. The resulting DNA loop comprises the sequence of the top strand of the adaptor "Ad 141-2" (shown in 5 'to 3' orientation) and the sequence of a short target DNA sequence (e.g., genomic DNA). DNB is then generated from the DNA circle by rolling circle amplification. The DNB thus generated comprises the bottom strand sequence of "Ad 141-2" (shown in 3 'to 5' orientation) and can be used as a template DNA polynucleotide (first strand).
An adaptor containing 67 bases has two primer binding sequences that bind to CX117 (second primer) and AD120_3T_21b (first primer), respectively. CX117 and AD120_3T_21b are also referred to as DNB primers in FIG. 5. Extension of AD120_3T_21b produces the second strand and extension of the CX117 primer produces the third strand. Extension of the third strand displaces the second strand, as described in section B, thereby creating an overhanging portion of the second strand. Complementary strand primers ("AD 041_ 5T" and "AD 041_ Helper") are sequencing oligonucleotides that can be used to perform Sequencing By Synthesis (SBS) on the overhanging portion of the second strand.
10.2 preparation of strands complementary to adaptor-containing DNA fragments
In one method, the template DNA polynucleotide is non-concatemeric DNA (e.g., monomeric). Non-concatemeric DNA may have the structure shown in FIG. 1, panel 1.1.
Fig. 6 illustrates one approach. In FIG. 6(A), four immobilized single stranded polynucleotides are shown. Open circles indicate target sequences and filled circles indicate 3 'and 5' adaptor sequences (which may be the same or different). The four immobilized single-stranded polynucleotides may be different, or may be clusters comprising cloned copies of the template DNA polynucleotide. An example is shown in FIG. 6, in which a cloned copy of a single-stranded monomer DNA (template DNA) is immobilized on a substrate. Each template DNA comprises target DNA flanked by a first adaptor at 5 'and a second adaptor at 3'.
Fig. 6 (A): a first primer (indicated by the arrow with an open arrowhead) is hybridized to the first primer binding sequence on the first adaptor.
Fig. 6 (B): the first primer is extended with a DNA polymerase to produce a second strand. The second strand so prepared comprises a sequence complementary to the target DNA sequence and a sequence complementary to the second adaptor.
Fig. 6 (C): a second primer (indicated by the arrow with a filled arrowhead) is hybridized to a second primer binding sequence that is 3' to the first primer binding sequence in the first adaptor. The second primer is extended with a DNA polymerase having strand displacement activity to produce a third strand.
Fig. 6 (D): extension of the third strand is controlled such that the second strand is only partially displaced, i.e., it remains attached to the template DNA by hybridization to the second adaptor.
11. Sequence of primer addition
The order of addition of the extension primers (e.g., first primer, second primer) can vary. For example, in some embodiments, a first primer and a polymerase are added, and (at least partial) synthesis of the second strand is performed prior to addition of the second primer. In another method, the first primer and the second primer are added at about the same time (see, e.g., the examples below). For example, they may be added together in the same composition, or may be added separately within about 1 minute of each other or within about 5 minutes of each other. The first extension primer and the second extension primer can be added in any order.
In methods where the second strand is produced using a DNA polymerase that does not have strand displacement activity, while the third strand will be produced using a DNA polymerase that has strand displacement activity, sequential addition of primers may be necessary.
It will be appreciated that a single oligonucleotide may be used as an extension primer for generating both the second strand and the third strand.
It will be further appreciated that a plurality of different first primers and/or a plurality of different second primers and/or a plurality of different sequencing oligonucleotides may be used in the same sequencing reaction.
The sequencing oligonucleotide for the second strand is typically added after terminating the extension-displacement of the second strand using the methods disclosed herein. See section 12 below, on controlling the extension-displacement reaction to control strand length and avoid complete displacement.
The sequencing oligonucleotide hybridizes to the overhanging portion of the second strand. In some embodiments, the sequencing oligonucleotide has a sequence that is complementary to, and thus hybridizes to, a known sequence within the first target sequence. In some embodiments, the sequencing oligonucleotide hybridizes to a sequence in the second strand that is complementary to at least a portion of the second adaptor. In some embodiments, the sequencing oligonucleotide is partially or fully complementary to the first primer or the second primer.
12. Control of the extension-displacement reaction to control strand length and avoid complete displacement
To generate a partially displaced second strand (subsequent fragment) having both an overhang and a duplex portion attached to the template DNA polynucleotide (e.g., a DNB DNA strand), the extension reaction that generates the third strand may be controlled to avoid complete displacement of the second strand (i.e., the "subsequent strand" or "subsequent fragment") and to generate second and third strands with lengths suitable for sequencing. This can be achieved by selecting a polymerase with an appropriate rate of polymerization or other property, and by adjusting a variety of reaction parameters including, but not limited to, reaction temperature, reaction duration, primer composition, DNA polymerase, primer and dNTP concentrations, additives and buffer composition. The optimum conditions may be determined empirically.
12.1 selection of DNA polymerase
One way to control the extension-displacement reaction is to use a DNA polymerase with suitable strand displacement activity to generate the third strand. DNA polymerases having strand displacement activity include, but are not limited to, Phi29, Bst DNA polymerase, Klenow fragment of DNA polymerase I, and Deep-VentR DNA polymerase (NEB # M0258). These DNA polymerases are known to have strand displacement activities of different strengths. See Kornberg and Baker (1992, DNA Replication, second edition, pp. 113-225, Freeman, N.Y.). One of ordinary skill in the art can select a DNA polymerase suitable for the present invention.
12.2 polymerase, primer and dNTP concentrations
Another method for controlling the extension-displacement reaction is to use an appropriate concentration of the DNA polymerase having strand displacement activity, of the dNTPs, or of the second primer.
12.3 additives
In some embodiments, the extension reaction is controlled by including in the reaction buffer reagents that affect duplex formation between the extension primer and the template DNA, such as DMSO (e.g., 1%-2%), betaine (e.g., 0.5 M), glycerol (e.g., 10%-20%), T4 gene 32 single-stranded DNA binding protein (T4 G32 SSB; e.g., 10-20 ng/ul), or a volume exclusion agent.
12.4 temperature
The reaction temperature may also be controlled to allow for proper rates of polymerization and strand displacement. Higher temperatures generally result in a greater degree of strand displacement. In some embodiments, the reaction temperature is maintained in the range of 20 ℃ to 37 ℃, such as 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃ or 37 ℃, to avoid complete displacement.
In some methods, the extension reaction is controlled by using a mixture of conventional (extendable) and non-extendable primers (i.e., 3' blocked primers). The non-extendable primer blocks extension by, for example, a chemical blocking group that prevents polymerization by a DNA polymerase. By mixing these two different primers in different ratios, the length of the double stranded (hybridizing) portion of the newly synthesized complementary DNA strand (subsequent fragment) can be controlled. For example, in one method, a mixture of first primers is used in which 50-70% are non-extendable ("blocked") and 30-50% are extendable ("unblocked"). Many types of non-extendable primers are known in the art and are suitable for use in the present invention.
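One way to reason about the effect of the primer ratio is a simple probabilistic sketch. It assumes, purely for illustration, a concatemer with identical adaptors in which each primer binding site is independently occupied by an extendable primer with probability equal to the extendable fraction of the mixture; under that assumption the number of adaptor sites between consecutive extending primers follows a geometric distribution. The model and the function names below are hypothetical and are not taken from the disclosure.

```python
def _check_fraction(extendable_fraction):
    if not 0.0 < extendable_fraction <= 1.0:
        raise ValueError("extendable_fraction must be in (0, 1]")


def spacing_probability(extendable_fraction, k):
    """Probability that the next extendable primer sits exactly k sites away.

    Assumes independent occupancy of each site (geometric distribution).
    """
    _check_fraction(extendable_fraction)
    if k < 1:
        raise ValueError("k must be at least 1")
    return extendable_fraction * (1.0 - extendable_fraction) ** (k - 1)


def expected_spacing(extendable_fraction):
    """Mean number of adaptor sites between consecutive extendable primers."""
    _check_fraction(extendable_fraction)
    return 1.0 / extendable_fraction


# Mixtures in which 30%, 40% or 50% of the first primers are extendable.
for fraction in (0.3, 0.4, 0.5):
    print(fraction, round(expected_spacing(fraction), 2))
```

Under these assumptions, lowering the extendable fraction increases the average distance between extending primers, which is one plausible reading of how the ratio influences the length of the hybridized portion. Actual behavior would need to be determined empirically.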
12.5 reaction time
In some embodiments, the extension-displacement reaction is controlled by terminating the reaction after a period of time during which a second strand of the desired length is obtained. In some embodiments, the reaction is terminated 5 minutes, 10 minutes, 20 minutes, 30 minutes, 40 minutes, or 60 minutes after initiation. Methods of reaction termination are well known in the art, for example by incorporating ddNTPs or by addition of chemical solutions (e.g., Tris buffer containing 1.5 M NaCl). In a preferred embodiment, termination is achieved by incorporating ddNTPs after addition of a Tris buffer containing 1.5 M NaCl to the reaction.
13. Sequence determination
In some embodiments, the claimed invention provides methods of determining the sequence of the second strand produced as described above. The method comprises hybridizing a sequencing oligonucleotide to a sequence in the second strand that is complementary to at least a portion of the second adaptor (see figure 3, panel 3.1), and determining the nucleotide sequence of at least a portion of the sequence that is complementary to the first target DNA sequence. Sequencing can be performed using sequencing by synthesis methods (FIG. 3, panel 3.2) or using sequencing by ligation methods (FIG. 3, panel 3.3) or both.
In one embodiment, the resulting DNA strand complementary to the template DNA is used for sequencing of the target DNA. The overhang of the second strand is sequenced by extending a primer that hybridizes to the complementary sequence of the second adaptor, for example, as shown in fig. 3.
In another embodiment, the template DNA strand is also sequenced using a primer that hybridizes to the first adaptor. Sequence information from the complementary strand is paired with the sequence generated from sequencing the template DNA to determine the entire target DNA sequence.
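The pairing of reads from the two strands can be illustrated with a short Python sketch. It assumes a template strand laid out 5'-second adaptor-target-first adaptor-3', and assumes that each read begins immediately at the adaptor-target junction. The function names and the example sequence are hypothetical and serve only to illustrate strand orientation.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def paired_end_reads(target, read_length):
    """Return (read_from_template, read_from_second_strand), both 5'->3'.

    target is the target sequence as it appears in the template strand (5'->3').
    - Sequencing the template from a primer on the first adaptor (3' of the
      target) yields the reverse complement of the target's 3' end.
    - Sequencing the second-strand overhang from an oligonucleotide hybridized
      to the complement of the second adaptor yields the target's 5' end in
      the same sense as the template.
    """
    if not 1 <= read_length <= len(target):
        raise ValueError("read_length must be between 1 and len(target)")
    read_from_template = reverse_complement(target)[:read_length]
    read_from_second_strand = target[:read_length]
    return read_from_template, read_from_second_strand


# Hypothetical 12-base target, 4-base reads from each end.
r1, r2 = paired_end_reads("AACCGGTTAGCT", 4)
print(r1, r2)
```

The two reads come from opposite ends of the same target, which is what allows them to be treated as a pair when the target sequence is assembled.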
It will be apparent to the reader that variations of the specific embodiments outlined herein may be used. In one method, the extension primer and the sequencing oligonucleotide are bound to different portions of the adaptor sequence. In one method, the extension primer and the sequencing oligonucleotide bind to the same portion of the adaptor sequence (e.g., the complement of the portion of the adaptor sequence used for extension and the same portion of the adaptor sequence used for sequencing).
The sequence of the overhang may be determined using any suitable sequence determination method, such as SBS, pyrosequencing, sequencing by ligation, and the like. In some embodiments, more than one sequencing method is used. For example, one method (e.g., cPAL) may be used to sequence the template DNA strand, and a different method (e.g., SBS) may be used to sequence the third strand.
Sequencing by synthesis (SBS) can rely on DNA polymerase activity to perform chain extension during the sequencing reaction step. SBS is well known in the art. See, e.g., U.S. Pat. Nos. 6,210,891; 6,828,100; 6,833,246; 6,911,345; 6,969,488; 6,897,023; and 6,787,308; patent publication Nos. 20040106130; 20030064398; and 20030022207; Margulies et al., 2005, Nature 437:376-380; Ronaghi et al., 1996, Anal. Biochem. 242:84-89; Constans, A., 2003, The Scientist 17(13):36; and Bentley et al., 2008, Nature 456(7218):53-59. Other sequencing methods (e.g., sequencing by hybridization) are known in the art and can be used. Other methods of determining nucleotide sequences may also be used in the present invention, for example, sequencing by ligation (e.g., WO1999019341, WO2005082098, WO2006073504 and Shendure et al., 2005, Science 309:1728-1732) and pyrosequencing (see, e.g., Ronaghi et al., 1996, Anal. Biochem. 242:84-89).
14. Composition and array of DNA complexes
14.1 DNB
In one aspect, the invention includes an array of DNA complexes. In one aspect, the array is a carrier comprising an array of discrete regions, wherein a plurality of regions comprises:
(a) single-stranded DNA concatemers, each concatemer comprising a plurality of monomers, each monomer comprising a target sequence and an adaptor sequence;
(b) wherein each of the plurality of monomers of at least a subset of the DNA concatemers in (a) comprises,
(i) second DNA strands partially hybridized thereto, wherein each second strand DNA comprises a portion complementary to the target sequence and a portion complementary to at least a portion of the adaptor sequence, and wherein a portion of the second strands is not hybridized to the concatemer and a portion of the second strands complementary to at least a portion of the adaptor is hybridized to the adaptor, and
(ii) a third DNA strand comprising a portion complementary to and hybridizing to the target sequence; and
(c) wherein each of at least a subset of the plurality of monomers of (b) comprises a fourth DNA strand hybridized to a third DNA strand at a hybridization site, wherein the fourth DNA strand comprises at least a portion of the sequence of the adaptor and the hybridization site is complementary to at least a portion of the second adaptor sequence.
An array as described above, wherein the single-stranded DNA concatemers are immobilized on the discrete regions by: (i) attractive non-covalent interactions, such as base pairing with capture oligonucleotides, or (ii) covalent interaction with the discrete regions.
It is to be understood that the DNA complexes of the array may comprise any of the properties of the complexes described herein or prepared according to the methods described herein. In addition, the complexes may have any combination of one or more of the following features: (i) the array comprises at least 10^6 discrete regions, (ii) the concatemer comprises at least 50, more often at least 100, more often at least 500 monomers, (iii) the single-stranded DNA concatemer is generated by denaturing a double-stranded concatemer in situ, (iv) the fourth DNA strand comprises at least 10 bases, preferably at least 12 bases, and optionally at least 15 bases of an adaptor sequence, (v) the fourth DNA strand is fully complementary to the second DNA strand to which it is hybridized.
In some embodiments, the fourth DNA strand is an oligonucleotide capable of acting as a primer for primer extension (e.g., a sequencing-by-synthesis reaction), or an extension product of such a primer, or an oligonucleotide capable of acting as an anchor for sequencing by ligation, or a ligation product of such an oligonucleotide and a labeled probe (e.g., a labeled cPAL probe). In one method, the fourth DNA strand comprises a portion complementary to the adaptor sequence and a portion complementary to the target sequence.
14.2 Cluster
In one aspect, the invention includes an array of DNA complexes. In one aspect, the array is a carrier comprising an array of discrete regions, wherein a plurality of the regions comprises:
(a) a clonal cluster of double-stranded or single-stranded DNA, each DNA comprising a target sequence flanked by a first adaptor and a second adaptor;
(b) wherein each of the plurality of DNAs of at least a subset of the clusters in (a) comprises,
(i) second DNA strands partially hybridized thereto, wherein each second strand DNA comprises a portion complementary to the target sequence and a portion complementary to at least a portion of the first adaptor sequence, and wherein a portion of the second strand complementary to the target sequence is not hybridized to the DNA and a portion of the second strand complementary to at least a portion of the first adaptor is hybridized to the DNA, and
(ii) a third DNA strand comprising a portion complementary to and hybridizing to the target sequence and a portion complementary to and hybridizing to the second adaptor sequence; and
(c) wherein each of at least a subset of the plurality of DNAs of (b) comprises a fourth DNA strand hybridized to the third DNA strand at a hybridization site, wherein the fourth DNA strand comprises at least a portion of the sequence of the second adaptor and the hybridization site is complementary to at least a portion of the second adaptor sequence.
It is to be understood that the DNA complexes of the array may comprise any of the properties of the complexes described herein or prepared according to the methods described herein. In addition, the complexes may have any combination of one or more of the following features: (i) the array comprises at least 10^6 discrete regions, (ii) the DNA is single stranded, (iii) the fourth DNA strand comprises at least 10 bases, preferably at least 12 bases and optionally at least 15 bases of the sequence of the adaptor, (iv) the fourth DNA strand is fully complementary to the second DNA strand to which it is hybridized.
In some embodiments, the fourth DNA strand is an oligonucleotide capable of serving as a primer for primer extension (e.g., a sequencing-by-synthesis reaction), or an extension product of such a primer, or an oligonucleotide capable of serving as an anchor for sequencing by ligation, or a ligation product of such an oligonucleotide and a labeled probe (e.g., a labeled cPAL probe). In one method, the fourth DNA strand comprises a portion complementary to the adaptor sequence and a portion complementary to the target sequence.
14.3 Compositions
In one aspect, the invention provides a composition comprising an array as described in section 14.1 or 14.2 and an enzyme selected from a DNA ligase and a DNA polymerase, wherein the DNA polymerase has strand displacement activity. In one embodiment, the composition further comprises fluorescently labeled dNTPs (e.g., dNTP analogs) and/or a labeled oligonucleotide probe pool.
15 Examples
15.1 Example 1: Generating complementary overhangs on DNBs for paired-end sequencing
In this example, sequencing by synthesis of a known adaptor sequence was performed on a Complete Genomics (CGI) DNB array chip (DNB Nanoball™ Array). DNBs were generated by rolling circle amplification from a library of single-stranded circles comprising human genomic DNA fragments and the adaptor Ad141-2. Ad141-2: 5'-AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAAGCTCGAGCTCGAGCGATCGGGCTTCGACTGGAGAC-3' (SEQ ID NO: 1; see FIG. 5). The extension primers Ad120_3T_21bp (1 µM; 5'-GAT CGG GCT TCG ACT GGA GAC-3'; SEQ ID NO: 2; "first extension primer") and CX117 (1 µM; 5'-AAG TCG GAG GCC AAG-3'; SEQ ID NO: 3; "second extension primer") were hybridized to the DNB array at 35 ℃ for 30 minutes (see FIG. 5). In this experiment, the primers were chosen such that 21 bases of the adaptor sequence were determined (thus, all DNBs in the array gave the same sequence read).
The primers were then extended (second and third strand synthesis) for 20 minutes at 35 ℃ in an extension mix containing 1.0 U/µl Phi29 polymerase in 1× Phi29 buffer, 0.1 mg/ml BSA, 20% glycerol, 2% DMSO, and 25 µM dNTPs, to synthesize the complementary strand ("subsequent fragment"). Extension was then terminated by the addition of 250 µM ddNTPs.
The sequencing oligonucleotide (4 µM) AD041_Helper or AD041_5T (FIG. 5) was then hybridized to the single-stranded overhang of the subsequent fragment (third strand). SBS was then performed with Cicada at 35 ℃ and Hot MyChem #2 for 25 cycles for 30 minutes. Four reversible terminator nucleotides (RT), each labeled with a different fluorescent dye, were used in the sequencing reaction. TxR represents Texas Red; FIT represents fluorescein; Cy5 represents cyanine 5; Cy3 represents cyanine 3. The signal averages shown in Table 2 are the averages over all DNBs on the array that incorporated a base carrying the indicated base-specific dye. At each position, the dye with the highest average determines the base called. For example, at position 1 the Cy3 dye, which is associated with base A, has the highest signal average, so the base is called A.
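The base-calling rule described above (take the dye with the highest array-wide signal average at each position) can be sketched as follows. This is only an illustration, not the instrument's actual software: the text states only that Cy3 is associated with base A, so the other three dye-to-base pairings in `DYE_TO_BASE` are hypothetical placeholders, and the signal values are made up rather than taken from Table 2.

```python
# Hypothetical dye-to-base assignment. Only Cy3 -> A is stated in the text;
# the other three pairings are placeholders chosen for illustration.
DYE_TO_BASE = {"Cy3": "A", "Cy5": "C", "TxR": "G", "FIT": "T"}


def call_base(mean_signals):
    """Return the base whose dye has the highest mean signal for one cycle."""
    top_dye = max(mean_signals, key=mean_signals.get)
    return DYE_TO_BASE[top_dye]


def call_read(cycles):
    """Call one base per sequencing cycle and join the calls into a read."""
    return "".join(call_base(cycle) for cycle in cycles)


# Made-up signal averages for three cycles (NOT the values of Table 2).
example_cycles = [
    {"Cy3": 0.82, "Cy5": 0.10, "TxR": 0.05, "FIT": 0.07},
    {"Cy3": 0.08, "Cy5": 0.12, "TxR": 0.77, "FIT": 0.06},
    {"Cy3": 0.79, "Cy5": 0.09, "TxR": 0.11, "FIT": 0.04},
]

print(call_read(example_cycles))
```

With the placeholder mapping, each cycle simply reports the base tied to its brightest dye; swapping in the real dye assignments would only require changing `DYE_TO_BASE`.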
Results: All 21 bases were correctly called, because the sequenced region was AGA CCG CTT GGC CTC CGA CTT, which is the complementary sequence to the adaptor region CX117. Different extension times resulted in different signal strengths (data not shown). The signal of 21 bases from the strand complementary to the adaptor region CX117 was determined. See Table 2.
TABLE 2
Sequencing of 21 bases of the complement complementary to adapter region CX117
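The relationship between the reported 21-base read and the adaptor can be checked directly from the sequences listed in this example. The short sketch below is an editorial sanity check, not part of the original protocol; the variable and function names are illustrative. It compares the read with the reverse complement of the first 21 bases of Ad141-2 as printed, and confirms where the two extension primer sequences sit within the adaptor.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from Example 1 (spaces removed).
AD141_2 = (
    "AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAAGCTCGAGC"
    "TCGAGCGATCGGGCTTCGACTGGAGAC"
)
FIRST_EXTENSION_PRIMER = "GATCGGGCTTCGACTGGAGAC"   # Ad120_3T_21bp
SECOND_EXTENSION_PRIMER = "AAGTCGGAGGCCAAG"        # CX117
REPORTED_READ = "AGACCGCTTGGCCTCCGACTT"            # 21 called bases

# The read should equal the reverse complement of the adaptor's 5' end.
expected_read = reverse_complement(AD141_2[: len(REPORTED_READ)])
print(expected_read == REPORTED_READ)

# CX117 matches the 5' end of the adaptor sequence as printed;
# Ad120_3T_21bp matches its 3' end.
print(AD141_2.startswith(SECOND_EXTENSION_PRIMER))
print(AD141_2.endswith(FIRST_EXTENSION_PRIMER))
```

All three checks hold for the sequences as printed, which is consistent with the statement that the read is complementary to the CX117 end of the adaptor.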
15.2 Example 2: Sequencing of genomic sequences
Multiple DNBs containing genomic sequences have been sequenced using the invention described herein. Table 3 shows the percentage of DNBs on a Complete Genomics (CGI) DNB array chip (DNB Nanoball™ Array) that mapped to the genome exactly 1 time, 0 times, or >1 time. Lanes L01 and L08 represent mapping of the first strand; L02 represents adaptor sequencing (no genomic sequence); and L03-L07 represent second-strand genomic sequencing. Lanes L03-L07 have an even higher proportion of DNBs that mapped uniquely to the genome (exactly 1 time). The percentages are calculated over all DNBs arrayed on the array.
TABLE 3
Mapping results of 25-base genome sequences by SBS sequencing
| # aligned to reference | L01 | L02 | L03 | L04 | L05 | L06 | L07 | L08 |
|---|---|---|---|---|---|---|---|---|
| 0 times | 25.91% | 99.90% | 15.67% | 21.36% | 15.27% | 15.35% | 15.95% | 26.19% |
| Exactly 1 time | 54.03% | 0.09% | 63.11% | 58.94% | 63.72% | 63.48% | 62.64% | 53.49% |
| >1 time | 20.06% | 0.01% | 21.22% | 19.71% | 21.01% | 21.17% | 21.41% | 20.33% |
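As a consistency check on Table 3, the sketch below re-enters the reported percentages and verifies two things: each lane's three categories sum to roughly 100% (allowing for rounding), and every second-strand lane (L03-L07) has a higher exactly-once mapping rate than either first-strand lane (L01, L08). The data structure and names are illustrative assumptions; the numbers are copied from the table.

```python
# Percentages from Table 3 as (0 times, exactly 1 time, >1 time).
TABLE_3 = {
    "L01": (25.91, 54.03, 20.06),
    "L02": (99.90, 0.09, 0.01),
    "L03": (15.67, 63.11, 21.22),
    "L04": (21.36, 58.94, 19.71),
    "L05": (15.27, 63.72, 21.01),
    "L06": (15.35, 63.48, 21.17),
    "L07": (15.95, 62.64, 21.41),
    "L08": (26.19, 53.49, 20.33),
}

FIRST_STRAND_LANES = ("L01", "L08")
SECOND_STRAND_LANES = ("L03", "L04", "L05", "L06", "L07")

# Each lane should account for all arrayed DNBs, up to rounding error.
lane_totals = {lane: sum(values) for lane, values in TABLE_3.items()}
totals_ok = all(abs(total - 100.0) <= 0.02 for total in lane_totals.values())

# Compare unique-mapping rates (second entry of each tuple).
best_first = max(TABLE_3[lane][1] for lane in FIRST_STRAND_LANES)
worst_second = min(TABLE_3[lane][1] for lane in SECOND_STRAND_LANES)

print(totals_ok)
print(worst_second > best_first)
```

Both checks pass for the values as printed: the lowest second-strand unique-mapping rate (58.94%, L04) exceeds the highest first-strand rate (54.03%, L01).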
This application is related to U.S. provisional application No. 62/117,391, filed on 17/2/2015, which is incorporated herein by reference in its entirety.
All publications and patent documents cited herein are incorporated by reference as if each such publication or document were specifically and individually indicated to be incorporated by reference. Although the present invention has been described primarily with reference to specific embodiments, it is also contemplated that other embodiments will become apparent to those of skill in the art upon reading this disclosure and it is intended that such embodiments be included in the methods of the present invention.
Claims (17)
1. A method of preparing a DNA strand for sequencing, comprising:
a) providing a template DNA polynucleotide comprising a first target DNA sequence inserted between a first adaptor 3' to a first target DNA sequence and a second adaptor 5' to the first target DNA sequence, and optionally a third adaptor 3' to the first adaptor and a second target DNA sequence inserted between the first adaptor and the third adaptor, wherein the template DNA polynucleotide is immobilized on a substrate,
b) combining a first primer with the immobilized template DNA polynucleotide and hybridizing the first primer to a first primer binding sequence in the first adaptor, wherein the first primer is not immobilized on the substrate when the first primer is combined with the immobilized template DNA polynucleotide;
c) extending the first primer using a first DNA polymerase to produce a second strand, wherein the second strand comprises a sequence complementary to the first target DNA sequence and a sequence complementary to at least a portion of the second adaptor;
d) combining a second primer with the immobilized template DNA polynucleotide, hybridizing a second primer with a second primer binding sequence, wherein the second primer binding sequence is 3' to the first primer binding sequence, wherein the second primer is not immobilized on the substrate when the second primer is combined with the immobilized template DNA polynucleotide; and
e) extending the second primer using a DNA polymerase having strand displacement activity to produce a third strand,
wherein extending said second primer to produce said third strand partially displaces said second strand, thereby producing a partially hybridized second strand having:
(i) a hybridizing portion that hybridizes to the template DNA polynucleotide, and
(ii) an unhybridized overhang comprising a sequence complementary to the first target DNA sequence and a sequence complementary to at least a portion of the second adaptor, wherein the unhybridized portion is 5' in the second strand to the hybridized portion.
2. The method of claim 1, further comprising:
f) hybridizing a sequencing oligonucleotide to said sequence in said third strand that is complementary to at least a portion of said second adaptor, and
g) determining at least a portion of the sequence complementary to the first target DNA sequence.
3. The method of claim 1, wherein the first adaptor, the second adaptor, and the third adaptor, if present, have the same nucleotide sequence.
4. The method of claim 1, wherein the first DNA polymerase and the DNA polymerase having strand displacement activity are the same polymerase.
5. The method of claim 1, wherein the second primer binding sequence that hybridizes to the second primer is in the first adaptor.
6. The method of claim 1, wherein the template DNA polynucleotide comprises the third adaptor and the second primer binding sequence is in the third adaptor.
7. The method of claim 1, wherein the template DNA polynucleotide comprises a DNA concatemer and the first target DNA sequence and the second target DNA sequence have the same nucleotide sequence.
8. The method of claim 4, wherein the first primer and the second primer hybridize or extend in the same reaction.
9. The method of claim 1, wherein the template DNA polynucleotide comprises a DNA concatemer and the first primer and the second primer have the same nucleotide sequence.
10. The method of claim 1, wherein a plurality of second primers are hybridized to a plurality of second primer binding sequences in step d), and wherein the plurality of second primers comprise extendable and non-extendable primers.
11. The method of claim 1, wherein extension of the second primer is terminated at a fixed time interval of 5 minutes, 10 minutes, 20 minutes, 30 minutes, 40 minutes, or 60 minutes, and wherein extension is terminated by chemical termination and/or addition of ddNTPs.
12. The method of claim 1, wherein the extending the second primer to produce the third strand is controlled by temperature, enzyme concentration, and primer concentration.
13. The method of claim 1, wherein each of the template DNAs is deposited on an array, bead, well, or droplet.
14. The method of claim 1, wherein the sequencing is by synthesis, pyrosequencing, or sequencing by ligation.
15. An array of DNA complexes, which in one aspect is a carrier comprising an array of discrete regions, wherein a plurality of the regions comprise:
(a) single-stranded DNA concatemers, each concatemer comprising a plurality of monomers, each monomer comprising a target sequence and an adaptor sequence;
(b) wherein each of the plurality of monomers of at least a subset of the DNA concatemers in (a) comprises,
(i) second DNA strands partially hybridized thereto, wherein each second strand DNA comprises a portion complementary to the target sequence and a portion complementary to at least a portion of the adaptor sequence, and wherein a portion of the second strands is not hybridized to the concatemer and a portion of the second strands complementary to at least a portion of the adaptor is hybridized to the adaptor, and
(ii) a third DNA strand comprising a portion complementary to and hybridizing to the target sequence; and
(c) wherein each of at least a subset of the plurality of monomers of (b) comprises a fourth DNA strand hybridized to a third DNA strand at a hybridization site, wherein the fourth DNA strand comprises at least a portion of the sequence of the adaptor and the hybridization site is complementary to at least a portion of the second adaptor sequence.
16. An array of DNA complexes, which in one aspect is a carrier comprising an array of discrete regions, wherein a plurality of said regions comprise:
(a) a clonal cluster of double-stranded or single-stranded DNA, each DNA comprising a target sequence flanked by a first adaptor and a second adaptor;
(b) wherein each of the plurality of DNAs of at least a subset of the clusters in (a) comprises,
(i) second DNA strands partially hybridized thereto, wherein each second strand DNA comprises a portion complementary to the target sequence and a portion complementary to at least a portion of the first adaptor sequence, and wherein a portion of the second strand complementary to the target sequence is not hybridized to the DNA and a portion of the second strand complementary to at least a portion of the first adaptor is hybridized to the DNA, and
(ii) a third DNA strand comprising a portion complementary to and hybridizing to the target sequence and a portion complementary to and hybridizing to the second adaptor sequence; and
(c) wherein each of at least a subset of the plurality of DNAs of (b) comprises a fourth DNA strand hybridized to the third DNA strand at a hybridization site, wherein the fourth DNA strand comprises at least a portion of the sequence of the second adaptor and the hybridization site is complementary to at least a portion of the second adaptor sequence.
17. A composition or system comprising the array of claim 15 or claim 16 and an enzyme selected from a DNA ligase and a DNA polymerase, wherein the DNA polymerase has strand displacement activity.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US62/117391 | 2015-02-17 | ||
| US62/194741 | 2015-07-20 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1245843A1 (en) | 2018-08-31 |
| HK1245843B (en) | 2023-05-05 |