CA3238653A1 - Methods of synthesizing nucleic acid molecules - Google Patents
- Publication number
- CA3238653A1
- Authority
- CA
- Canada
- Prior art keywords
- sequence
- dsdna
- sequences
- variable
- molecule
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/26—Preparation of nitrogen-containing carbohydrates
- C12P19/28—N-glycosides
- C12P19/30—Nucleotides
- C12P19/34—Polynucleotides, e.g. nucleic acids, oligoribonucleotides
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
- C12N15/1031—Mutagenizing nucleic acids mutagenesis by gene assembly, e.g. assembly by oligonucleotide extension PCR
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1093—General methods of preparing gene libraries, not provided for in other subgroups
Abstract
The invention provides methods for synthesizing a product DNA molecule of any possible DNA sequence from a universal library of overlapping oligonucleotides. The method involves combining a plurality of the overlapping oligonucleotides in a reaction pool, where the sequences of the plurality of oligonucleotides comprise at least a sub-sequence of the product DNA molecule. The method also involves annealing the plurality of oligonucleotides, performing a ligation step, and performing an amplification step to thereby synthesize a sub-sequence of the product DNA molecule. The invention can be used to synthesize a DNA molecule of any possible sequence from the universal library, which can be accomplished through a hierarchal assembly scheme. In one embodiment the universal library comprises fewer than 10,000 pre-manufactured oligonucleotides that can be assembled into any possible DNA sequence. In any embodiment the product DNA molecule has an error rate of less than 1 error per 2,000 nucleotides.
Description
METHODS OF SYNTHESIZING NUCLEIC ACID MOLECULES
FIELD OF THE INVENTION
[0001] The invention provides compositions, methods, and kits for synthesizing any possible DNA molecule from a limited library of oligonucleotides.
BACKGROUND OF THE INVENTION
[0002] The fields of synthetic biology, gene editing, and therapeutics have a continuing and growing need for oligonucleotides of diverse and known sequences. Existing methods for synthesizing small oligonucleotides involve chemical synthesis via solid-phase, sequential coupling of nucleotides to generate the oligonucleotide of desired length and sequence. The oligonucleotides produced are then released from the solid phase, deprotected, and collected for assembly into larger oligonucleotides by other methods. While automated, these processes are subject to side reactions and base errors, thus limiting the length of the oligonucleotides produced. For applications requiring ultra-high sequence fidelity, these methods have additional limitations.
[0003] Enzymatic methods of synthesizing oligonucleotides also exist and involve the use of enzymes such as terminal deoxynucleotidyl transferase (TdT), a template-independent polymerase that catalyzes the incorporation of deoxyribonucleotides into the 3'-hydroxyl end of DNA templates. But the enzyme shows strong bias for specific nucleotide bases and does not reliably add nucleotides in the desired order and length.
[0004] There is a continuing need for methods of synthesizing oligonucleotides efficiently and with high fidelity so that the user can produce oligonucleotides of any desired length and sequence.
DESCRIPTION OF THE DRAWINGS
[0005] Figures 1A-1B: Figure 1A provides a schematic illustration of the synthesis of a DNA molecule of desired sequence according to one embodiment of the invention. Figure 1B provides further reactions from the product of the synthesis illustrated in Figure 1A.
[0006] Figures 2A-2B: Figure 2A provides a schematic illustration containing details of an embodiment of the DNA synthesis reaction of the invention. Figure 2B shows an additional part of the reaction. The illustrations show complementary 3' and 5' overhang sequences between dsDNA molecules and fragments.
[0007] Figure 3 provides a schematic illustration of the overall scheme for hierarchal assembly using the methods of the invention. By synthesizing dsDNA molecules with overlapping variable sequences, hierarchal assembly can be leveraged. Assembly can also include dsDNA fragments with a 5' and a 3' overhang, which can be added to the final assembly. Lengths of variable sequences are for illustration only.
[0008] Figures 4A-4D: Figure 4A provides a gel image of PCR1 products after the first ligation step (LO), where three oligos from the library are combined, ligated, and PCR amplified using a single universal primer pair. Contained within each of the 98 bp PCR products is 10 bp of synthetic DNA that is leveraged for downstream assembly. Figure 4B provides a gel image showing PCR2 products after the first digest and ligation step (DL1). These PCR products resulted from combining two PCR1 products, removing one flank on each product (by enzymatic digestion), and then ligating the products. Although the PCR2 product total length is shorter (56 bp) than the PCR1 product in Figure 1, the variable sequence of the dsDNA has increased to 16 bp, representing an increase in fragment length as the workflow progressed. Figure 4C provides a gel image showing PCR3 products after the second digestion and ligation step (DL2). These PCR products resulted from combining two PCR2 products and digesting away one flank on each product, followed by ligation. Note that the 1st and 4th PCR products contain capped termini, which aid in reducing downstream mis-ligation events and also provide a universal priming sequence for the subsequent PCR4 amplification. The variable sequence of the dsDNA molecule is 28 bp in length in all the sequences on this gel. Figure 4D provides a gel image showing PCR4 products after the third digestion and ligation step (DL3). These PCR products result from combining four PCR3 products, digesting away one or two flanks on each product, and then ligating them together. The variable sequence of the dsDNA is 100 bp in length in all the sequences on this gel, and it is flanked by 40 bp of flank sequence on both sides, which can be digested away using BsmBI, thus enabling further assembly into even larger pieces of DNA.
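The variable-sequence lengths reported for Figures 4A-4D (10, 16, 28 and 100 bp) are consistent with fragments being joined through short shared overhangs. The following is a minimal sketch of that arithmetic, assuming a 4-nucleotide overlap at every junction. The overlap length is an assumption chosen because it reproduces the stated lengths, and the function name is hypothetical.

```python
def assembled_length(fragment_lengths, overlap=4):
    """Length of the variable sequence after joining fragments end to end.

    Each junction is assumed to share `overlap` nucleotides between
    neighbouring fragments, so those bases are counted only once.
    """
    if not fragment_lengths:
        return 0
    junctions = len(fragment_lengths) - 1
    return sum(fragment_lengths) - overlap * junctions


pcr1 = 10                                  # variable bp after the first ligation (Figure 4A)
pcr2 = assembled_length([pcr1, pcr1])      # DL1: two PCR1 products
pcr3 = assembled_length([pcr2, pcr2])      # DL2: two PCR2 products
pcr4 = assembled_length([pcr3] * 4)        # DL3: four PCR3 products
print(pcr1, pcr2, pcr3, pcr4)
```

Under this assumption the four stages give 10, 16, 28 and 100 bp, matching the lengths described for the gels.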
[0009] Figure 5 provides a schematic illustration of an embodiment of the invention for storing digital information in DNA. A 16 bp product DNA molecule is produced encoding four bytes of information. The example shows how a non-genetic message (here "cat in a hat") can be encoded into DNA using the methods of the invention. Figure 6 discloses SEQ ID NOS 51-53, 55 and 62-68.
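Four bytes in a 16 bp molecule corresponds to two bits per base. The sketch below illustrates that density only. The base-to-bit mapping (A=00, C=01, G=10, T=11), the zero-byte padding, and all function names are assumptions for illustration and are not taken from the patent's own encoding scheme.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Decode a sequence whose length is a multiple of four bases."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def to_blocks(message: str, block_bytes: int = 4) -> list:
    """Split a message into 4-byte blocks, each encoded as a 16-base string."""
    data = message.encode("ascii")
    data += b"\x00" * (-len(data) % block_bytes)  # pad the final block (assumption)
    return [bytes_to_dna(data[i:i + block_bytes])
            for i in range(0, len(data), block_bytes)]


blocks = to_blocks("cat in a hat")
for block in blocks:
    print(block, len(block))
```

With this mapping the 12-character message fills three 16-base blocks, each of which could in principle be built as one product DNA molecule.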
[0010] Figure 6 is a schematic illustration of an embodiment of the invention applied to synthesizing a 120 bp product DNA, which is an initial guide structure having the transcriptional elements of a promoter, a guide RNA, a Cas9 handle, and a terminator. In this embodiment the first cycle of PCR utilizes two primers having two variable bases on their 3' ends. This converts the otherwise 16 bp product DNA molecule into a 20 bp product. Later step(s) of PCR incorporate transcriptional elements.
SUMMARY OF THE INVENTION
[0011] The invention provides methods for synthesizing a product DNA molecule of any possible DNA sequence from a universal library of overlapping oligonucleotides. The method involves combining a plurality of the overlapping oligonucleotides in a reaction pool, where the sequences of the plurality of oligonucleotides comprise at least a sub-sequence of the product DNA molecule. The method also involves annealing the plurality of oligonucleotides, performing a ligation step, and performing an amplification step to thereby synthesize a sub-sequence of the product DNA molecule. The invention can be used to synthesize a DNA molecule of any possible sequence from the universal library, which can be accomplished through a hierarchal assembly scheme. In one embodiment the universal library comprises fewer than 10,000 pre-manufactured oligonucleotides that can be assembled into any possible DNA sequence. The product DNA molecule can be more than 150 base pairs long. When subsequent DNA assembly techniques are employed, DNA molecules of thousands of base pairs can be synthesized. In any embodiment the product DNA molecule has an error rate of less than 1 error per 2,000 nucleotides.
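To give a feel for what an error rate below 1 in 2,000 nucleotides means for a product of a given length, the sketch below estimates the fraction of molecules expected to be error-free. It assumes errors occur independently at each position at exactly the stated bound, which is a simplifying assumption not made in the source; the function name is hypothetical.

```python
def error_free_fraction(length_nt: int, errors_per_nt: float = 1 / 2000) -> float:
    """Fraction of molecules with no errors, assuming independent errors per base."""
    return (1.0 - errors_per_nt) ** length_nt


for length in (100, 150, 1000):
    print(length, round(error_free_fraction(length), 3))
```

Because the stated rate is an upper bound, the values computed this way are lower bounds on the error-free fraction under the independence assumption.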
[0012] In a first aspect the invention provides methods of synthesizing a DNA molecule having a desired sequence. The methods involve annealing at least two oligonucleotides to an anchor strand so that the at least two oligonucleotides annealed to the anchor strand abut one another on the anchor strand. In any embodiment the oligonucleotides can abut on the anchor strand at their variable sequences. The at least two oligonucleotides can each comprise a universal primer binding site on a 3' or 5' end, a variable sequence on the opposing 5' or 3' end, and a conserved flanking sequence in between the universal primer binding site and the variable sequence. The anchor strand can have conserved flanking sequences complementary to those on the at least two oligonucleotides, and further can have at least one variable sequence. At least one portion of the at least one variable sequence on the anchor strand is complementary to at least a portion of the variable sequences on the at least two oligonucleotides. The invention involves a step of ligating the at least two oligonucleotides annealed to the anchor strand to produce a first dsDNA molecule, and performing an amplification step on the first dsDNA molecule having a desired sequence and comprising universal primer binding sites at the 3' and 5' ends, a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the conserved flanking sequences. In one embodiment the first dsDNA molecule has a variable sequence that is about 10 nucleotides long.
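The anchor-templated ligation described above can be modelled on plain strings: the anchor is complementary to flank, variable, variable, flank, and two oligonucleotides are ligatable when they anneal side by side across that region. The following is a simplified sketch only. All sequences, names, and lengths are hypothetical placeholders, and perfect complementarity is assumed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def ligate_on_anchor(left: str, right: str, anchor: str):
    """Return the ligated strand if both oligos abut on the anchor, else None.

    `left` and `right` are 5'->3' oligos; `anchor` is the 5'->3' anchor strand.
    The oligos abut when the anchor's complement spans their junction.
    """
    template = revcomp(anchor)          # sequence the anchor base-pairs with
    joined = left + right
    pos = joined.find(template)
    if pos == -1:
        return None
    junction = len(left) - pos          # junction position inside the template
    return joined if 0 < junction < len(template) else None


# Hypothetical parts: primer site, conserved flank, 5-nt variable sequence.
P5, FL, VL = "TTGACCGA", "CCTGAG", "ACGTA"
VR, FR, P3 = "GGCTT", "CTCAGG", "TCGGTCAA"

left_oligo = P5 + FL + VL               # primer site, flank, variable (3' end)
right_oligo = VR + FR + P3              # variable (5' end), flank, primer site
anchor = revcomp(FL + VL + VR + FR)     # complementary to flanks and variables

product = ligate_on_anchor(left_oligo, right_oligo, anchor)
print(product)
```

In this toy example the ligated strand carries a 10-nucleotide variable region (`VL + VR`) between the two flanks, with primer sites at both ends for amplification.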
[0013] The method can also involve contacting the first dsDNA molecule with a restriction endonuclease to produce first dsDNA fragments having 3' and/or 5' overhang sequences comprising a portion of the variable sequence from the first dsDNA molecule, and providing at least one additional dsDNA fragment comprising a 3' and/or 5' overhang sequence that is at least partially complementary to an overhang sequence of at least one of the first dsDNA fragments. The 3' and/or 5' overhang sequences can contain at least a portion of the variable sequence. The methods also involve annealing the first dsDNA fragments and at least one additional dsDNA fragment by the 3' and/or 5' overhang sequences, and ligating the annealed dsDNA fragments to produce a second dsDNA molecule having a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the 3' and 5' conserved flanking sequences that is longer than the variable sequence on the first dsDNA molecule. In one embodiment the variable sequence of the second dsDNA molecule is about 16 base pairs in length. The method can also involve performing an amplification step on the second dsDNA molecule. In any embodiment the restriction endonuclease can be a Type IIS restriction endonuclease.
[0014] In any embodiment and any step of the methods the at least one additional dsDNA fragment can be the product of a parallel DNA synthesis reaction. The method can further involve contacting the at least one second dsDNA molecule with a restriction endonuclease to produce a plurality of second dsDNA fragments comprising 3' and/or 5' overhang sequences and a conserved flanking sequence inside each of the 3' or 5' ends. The 3' and/or 5' overhang sequences can contain at least a portion of the variable sequence. The method can further involve a step of providing at least one additional dsDNA
fragment comprising a 3' and/or 5' overhang sequence that is at least partially complementary to an overhang sequence of at least one of the second dsDNA fragments, annealing the plurality of second dsDNA fragments to the one or more additional dsDNA fragment(s) by the 3' and/or 5' overhang sequence(s); and performing a step of ligation to produce a third dsDNA
molecule having a conserved flanking sequence on the 3' and 5' ends, and a variable sequence inside the conserved flanking sequences that is longer than the variable sequence of the second dsDNA molecule. In one embodiment the variable sequence is about 28 base pairs long. The method can include performing an amplification step on the third dsDNA
molecule. The at least one additional dsDNA fragment can be the product of a parallel DNA
synthesis reaction.
[0015] The method can further involve contacting the at least one third dsDNA
molecule with a restriction endonuclease to produce a plurality of third dsDNA
fragments comprising 3' and/or 5' overhang sequences and a conserved flanking sequence inside each of the 3' or 5' ends; the fragments can contain at least a portion of the variable sequence on the 3' and/or 5' overhangs. The methods can also involve providing at least one additional dsDNA fragment comprising a 3' and/or 5' overhang sequence that is at least partially complementary to an overhang sequence of at least one of the third dsDNA
fragments. The overhang sequences can contain at least a portion of the variable sequence.
The methods can also involve annealing the plurality of third dsDNA fragments to the one or more additional dsDNA fragment(s) by the 3' and/or 5' overhang sequence(s). The method can further involve performing a step of ligation to produce a fourth dsDNA molecule having a conserved flanking sequence on the 3' and 5' ends, and a variable sequence inside the conserved flanking sequences that is longer than the variable sequence of the third dsDNA
molecule. The methods can also involve performing an amplification step on the fourth dsDNA molecule. In one embodiment the variable sequence is about 100 base pairs long.
[0016] In another embodiment step a) further involves annealing at least two paired oligonucleotides to a paired anchor strand so that the at least two paired oligonucleotides bound to the paired anchor strand abut one another on the paired anchor strand, which can occur at their variable sequences. The at least two paired oligonucleotides can have a universal primer binding site on a 3' or 5' end, and a variable sequence on the opposing 5' or 3' end, and a conserved flanking sequence in between the universal primer binding site and the variable sequence. The paired anchor strand can have conserved flanking sequences complementary to those on the at least two paired oligonucleotides, and further have at least one variable sequence. A portion of the variable sequence on the paired anchor strand can overlap with a portion of the variable sequence on the first anchor strand.
The variable sequence can be located in between the two sequences complementary to the conserved flanking sequences. At least a portion of the variable sequence differs between the first and second anchor strands.
[0017] The method can further involve ligating the at least two paired oligonucleotides annealed to the anchor strand, performing an amplification step to produce a paired dsDNA molecule of desired sequence and comprising a universal primer binding site at a 3' and 5' end, a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the conserved flanking sequences that partially overlaps with the variable sequence of the first dsDNA molecule. In one embodiment the at least two oligonucleotides and first anchor strand, and the at least two paired oligonucleotides and paired anchor strand can be annealed in a simultaneous reaction in the same pool. The method can further involve contacting the first dsDNA molecule and the paired dsDNA
molecule with a restriction endonuclease to produce at least one dsDNA
fragment and at least one paired dsDNA fragment, each comprising at least one 3' and/or 5' overhang sequence;
and at least a portion of a 3' or 5' overhang sequence from the first dsDNA
fragment can be complementary to at least a portion of a 5' or 3' overhang sequence from the paired dsDNA
fragment. The method can further involve annealing the first and paired dsDNA
fragments by their complementary overhang sequences and performing a step of ligation to produce a second dsDNA molecule having a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the 3' and 5' conserved flanking sequences that is longer than the variable sequence on the first dsDNA molecules. The method can also involve performing an amplification step on the second dsDNA molecule.
[0018] In a further embodiment the method can further involve contacting the at least one second dsDNA molecule and an at least one paired second dsDNA molecule with a restriction endonuclease to produce a plurality of second dsDNA fragments and paired second dsDNA fragments, each comprising a 3' and/or 5' overhang sequence(s).
At least two of the plurality comprise a conserved flanking sequence inside each of the 3' or 5' ends. At least a portion of the 3' or 5' overhang sequence from a second dsDNA fragment can be complementary to at least a portion of the 5' or 3' overhang sequence from a paired second dsDNA fragment. The method can further involve annealing the second and paired second dsDNA fragments by their complementary overhang sequences, and performing a step of ligation to produce a third dsDNA molecule comprising a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the 3' and 5' conserved flanking sequences that is longer than the variable sequence on the second dsDNA
molecules. At least a portion of the variable sequence on the third dsDNA molecule can overlap with a portion of the variable sequence on the paired third dsDNA molecule. The methods can also involve performing a step of amplification on the third dsDNA molecule.
[0019] In another embodiment the methods further involve contacting the at least one third dsDNA molecule and an at least one paired third dsDNA molecule with a restriction endonuclease to produce a plurality of third dsDNA fragments and paired third dsDNA
fragments, each comprising a 3' and/or 5' overhang sequence(s); the third dsDNA fragments can have at least a portion of the variable sequence on the 3' and/or 5' overhangs. At least two of the plurality can have a conserved flanking sequence inside each of the 3' or 5' ends.
At least a portion of the 3' or 5' overhang sequence from a third dsDNA
fragment can be complementary to at least a portion of the 5' or 3' overhang sequence from a paired third dsDNA fragment. The methods can further involve a step of annealing the third and paired third dsDNA fragments by their complementary overhang sequences, and performing a step of ligation to produce a fourth dsDNA molecule having a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the 3' and 5' conserved flanking sequences that is longer than the variable sequence on the third dsDNA
molecule. The methods can also involve performing a step of amplification on the fourth dsDNA molecule.
[0020] In any embodiment the first dsDNA molecule can have a variable region of 8-12 base pairs. In any embodiment the paired dsDNA molecule can have a variable region of 8-12 base pairs. In any embodiment the second dsDNA molecule can have a variable sequence of 14-18 base pairs. In any embodiment the third dsDNA molecule can have a variable sequence of 24-32 base pairs. In any embodiment the fourth dsDNA
molecule can have a variable sequence of 90-110 base pairs. In any embodiment the at least two oligonucleotides can have a variable sequence of 4-6 nucleotides. In any embodiment the anchor strands can have the sequences that are complementary to the conserved flanking sequences on the at least two oligonucleotides on the 3' and 5' ends. In any embodiment the amplification step can be performed by the polymerase chain reaction (PCR). In any embodiment the variable sequence can be equal to the lengths of the variable sequences on the at least two oligonucleotides. In any embodiment the anchor strand can have a variable sequence present in between the two sequences complementary to the conserved flanking sequences on the at least two oligonucleotides. In any embodiment the at least two oligonucleotides bound to the anchor strand can abut one another on the anchor strand at their variable sequences. In any embodiment the portion of the variable sequence on the anchor strand that is complementary to the conserved flanking sequence on the at least two oligonucleotides can be 2-6 nucleotides. In any embodiment the at least two oligonucleotides and anchor strand are programmed so that the dsDNA molecule has at least one recognition site for a restriction endonuclease. In any embodiment the restriction endonuclease can be a Type IIS endonuclease. In any embodiment the anchor strand can have 4-6 degenerate nucleotides. In any embodiment the at least one additional dsDNA
fragment can be from a parallel synthesis reaction. In any embodiment the 3' and/or 5' overhang sequences can have the portion of the variable sequence from the first dsDNA
molecule. In any embodiment the step of ligation can occur spontaneously. In any embodiment the at least one additional dsDNA fragment can have a variable sequence at least partially complementary to the variable sequence from the first dsDNA
molecule. In any embodiment the methods can further involve a step of ligating the at least two oligonucleotides bound to the anchor strand.
[0021] In another aspect the invention provides a composition of at least two oligonucleotides, each comprising a universal primer binding site on a 3' or 5' end, and a variable sequence on the opposing 5' or 3' end, and a conserved flanking sequence in between the universal primer binding site and the variable sequence. In the composition an anchor strand can have sequences complementary to the conserved flanking sequences on the at least two oligonucleotides, and further have at least one variable sequence, which can be located in between the two sequences complementary to the conserved flanking sequences.
At least a portion of the variable sequence on the anchor strand can be complementary to at least a portion of the variable sequences on the at least two oligonucleotides. In one embodiment the anchor strand can have the sequences complementary to the conserved flanking sequences on the at least two oligonucleotides at its 3' and 5' ends.
The anchor strand can have the variable sequence in between the two sequences complementary to the conserved flanking sequence.
[0022] In another aspect the invention provides methods of storing data in a DNA
sequence. The methods involve determining a sequence of DNA that encodes a non-genetic message according to a coding scheme that translates the non-genetic message from a reference language into a DNA sequence and vice versa; synthesizing the sequence of DNA
that encodes the non-genetic message according to any method disclosed herein;
and thereby storing data in a DNA sequence.
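As an illustration of a coding scheme that translates a non-genetic message into a DNA sequence and vice versa, the following is a minimal sketch. The disclosure does not specify a particular scheme here; the two-bits-per-base mapping (00=A, 01=C, 10=G, 11=T), the UTF-8 text encoding, and the function names are assumptions made only for this example.

```python
# Minimal sketch of one possible coding scheme (an assumption, not the
# scheme of the disclosure): each byte of a UTF-8 message is written as
# four bases, two bits per base, with 00=A, 01=C, 10=G, 11=T.
BASES = "ACGT"


def encode(message: str) -> str:
    """Translate a text message into a DNA sequence."""
    data = message.encode("utf-8")
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)  # most significant bit pair first
    )


def decode(dna: str) -> str:
    """Translate a DNA sequence produced by encode() back into text."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return out.decode("utf-8")


sequence = encode("Hi")
print(sequence)          # CAGACGGC
print(decode(sequence))  # Hi
```

The sequence returned by `encode` would then be the sequence to synthesize; a practical scheme would likely add constraints (for example on homopolymer runs or error correction) that this sketch does not attempt.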
[0023] In another aspect the invention provides methods of synthesizing a DNA
sequence encoding a guide RNA. The methods involve determining a sequence of DNA that encodes a guide RNA; synthesizing the sequence of DNA that encodes the guide RNA
according to any method disclosed herein.
DETAILED DESCRIPTION OF THE INVENTION
[0024] The invention provides methods of assembling DNA molecules of any sequence with high fidelity using a universal library of oligonucleotides. The methods involve the use of an oligonucleotide library having DNA molecule members such that all possible DNA sequences can be assembled from the library using the methods. In one embodiment the library of oligonucleotides has less than 10,000 members. Many efforts have been made towards achieving methods of assembling any possible DNA
sequence from a library having a limited number of members. The present inventors discovered that any possible DNA sequence can be conveniently assembled using the materials and methods disclosed herein. The invention therefore enables creation of a library of less than 10,000 oligonucleotides, from which all possible oligonucleotide sequences can be assembled. The library of less than 10,000 oligonucleotides can be conveniently provided on a small device (e.g. a DNA chip), and devices and instrumentation provided to selectively assemble any DNA
sequence using only the members of the oligonucleotide library.
Oligo Library
[0025] In various embodiments the oligonucleotide members in the library can be DNA of various lengths. In various embodiments the library of oligonucleotides can have less than 20,000 members or less than 15,000 or less than 12,000 members, or less than 10,000 members, or less than 9,000 members or less than 6,000 members. The methods are able to synthesize all possible polynucleotide sequences using the oligonucleotide members in the library. In various embodiments the invention permits the assembly of over 4 billion (for a 16mer) and up to over 1 trillion (for a 20mer) polynucleotides of distinct sequence beginning only with those oligonucleotides in the library. In various embodiments each oligonucleotide in the library can be used from 100 to 10,000 times in the synthesis of product DNA
molecules. The product DNA molecule assembled can be of any size, for example it can be longer than 1 kb, or longer than 1.5 kb, or longer than 2 kb, or longer than 5 kb or longer than kb or 2-10 kb or 5-15 kb or 5-20 kb or less than 500 kb or less than 1 Mbp or less than 5 Mbp or less than 10 Mbp or less than 12 Mbp, or less than 13 Mbp, or less than 14 Mbp, or less than 15 Mbp or 1-10 Mbp, or 1-12 Mbp, or 1-15 Mbp. The terms "oligo" and "oligonucleotide" are used interchangeably herein, and indicate a polymer of nucleotides of generally shorter length. "Polynucleotide" is a general term denoting a polymer of nucleotides of any length.
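The figures of over 4 billion sequences for a 16mer and over 1 trillion for a 20mer follow from the four standard bases: a sequence of length n can take 4^n distinct forms. A short check, with a hypothetical helper name used only for illustration:

```python
# Number of distinct sequences of a given length over the four standard
# bases A, C, G and T.
def distinct_sequences(length: int) -> int:
    return 4 ** length


print(f"16mer: {distinct_sequences(16):,}")  # 16mer: 4,294,967,296
print(f"20mer: {distinct_sequences(20):,}")  # 20mer: 1,099,511,627,776
```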
[0026] In other embodiments the methods can also be used with even smaller libraries to assemble a significant number of sequences that may be desired, for example to assemble a more limited and directed number of sequences in a defined category where such sequences are needed. Examples of a defined category can include a set of genes related to a specific biological function, or genes from a particular organism. In any embodiment the product DNA molecule synthesized in the method can be synthesized entirely from and only using oligonucleotides from the oligonucleotide library. A "universal library" is a library of polynucleotide molecules from which any possible DNA sequence can be assembled. At a broad level a universal library can be any possible DNA sequence. However, within particular defined categories of DNA sequences smaller libraries can be used containing sequences of interest, for example a universal library of DNA sequences for RNA
metabolism, or for genes or sequences related to transcription, regulation, RNA metabolism, translation, protein folding, protein export, RNA (rRNA, tRNA, small RNAs), ribosome biogenesis, rRNA modification, DNA replication, DNA repair, DNA topology, DNA
metabolism, chromosome segregation, cell division, and tRNA modification.
Definitions of DNA sequences to be included in a defined category or sequences of interest may be subject to some discretion depending on the needs of the application.
Oligonucleotides
[0027] In one embodiment the at least two oligonucleotides can be DNA of any length convenient for the purpose. For example, the at least two oligonucleotides can be greater than 12 nucleotides in length or, without limitation, about 20-35 nucleotides, or 35-55 nucleotides, or 30-60 nucleotides, or 40-50 nucleotides or about 42-48 nucleotides, or about 44 or about 45 nucleotides. Anchor strands used in the method can be from 20-60 nucleotides, or from 20-70 nucleotides, or from 30-60 nucleotides or from 40-60 nucleotides or from 40-50 nucleotides. In one embodiment the at least two oligonucleotides are from nucleotides and the anchor strand is from 35-45 or from 45-55 nucleotides.
Primer binding sites can be added to or included in these oligonucleotide lengths.
Oligonucleotides can be present in any combination or sub-combination of the lengths provided herein.
In any embodiment the oligonucleotides can have only nucleotides having no non-standard bases.
But in any embodiment the oligonucleotides can have only nucleotides having standard bases, i.e. all nucleotides in the oligonucleotide have a base that is either A (adenine), T
(thymine), C (cytosine), or G (guanine). In other embodiments any of the oligonucleotides can contain non-standard bases. The oligonucleotides and/or anchor strands can have sequences for binding a primer, which can be used in PCR or another DNA
amplification procedure. In sizing nucleotides for the library the person of ordinary skill with resort to this disclosure will realize optimal sizes of oligonucleotides to use in the methods by considering the ability of oligo lengths to anneal to other oligos. Any of the oligonucleotides can be programmed to comprise a recognition site for a restriction endonuclease in a resulting dsDNA molecule. The restriction enzyme can be one that recognizes asymmetric DNA
sequences and cleaves a number of nucleotides outside of its recognition sequence (e.g.
within 1-5 or 1-10 or 1-20 nucleotides).
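The behaviour of an enzyme that recognizes an asymmetric site and cleaves a fixed number of nucleotides outside it can be sketched as follows. The recognition site GGTCTC with cut offsets of 1 nucleotide (top strand) and 5 nucleotides (bottom strand) is the known pattern of the Type IIS enzyme BsaI and is used here purely as an illustrative assumption; the disclosure does not name this enzyme at this point. The function name, the example sequence, and the restriction to a single forward-oriented site are likewise assumptions of the sketch.

```python
# Minimal sketch (assumed enzyme parameters, hypothetical sequence): a
# Type IIS-style cut at fixed offsets downstream of a recognition site.
def type_iis_cut(top: str, site: str = "GGTCTC",
                 top_offset: int = 1, bottom_offset: int = 5):
    """Cut a duplex given by its top strand (5'->3').

    Returns (upstream, overhang, downstream):
      upstream   -- top strand of the fragment that keeps the recognition site
      overhang   -- bases left single-stranded on both fragments
      downstream -- top strand of the double-stranded part of the other fragment
    Only the first forward-oriented site is considered.
    """
    start = top.find(site)
    if start < 0:
        raise ValueError("recognition site not found")
    site_end = start + len(site)
    top_cut = site_end + top_offset        # nick position on the top strand
    bottom_cut = site_end + bottom_offset  # nick position on the bottom strand
    if bottom_cut > len(top):
        raise ValueError("sequence too short for the cut offsets")
    return top[:top_cut], top[top_cut:bottom_cut], top[bottom_cut:]


upstream, overhang, downstream = type_iis_cut("AAGGTCTCTACGTGGCC")
print(upstream, overhang, downstream)  # AAGGTCTCT ACGT GGCC
```

Because the cut position depends only on distance from the site, the overhang bases can be any sequence, which is what allows a recognition site placed in a conserved region to expose overhangs drawn from a variable sequence.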
[0028] The methods of the invention synthesize a product dsDNA molecule having a desired sequence, which can be a pre-determined sequence, i.e. one decided by the user prior to beginning the method. The product DNA molecule can be any molecule produced by the method including but not limited to the first dsDNA molecule, the second dsDNA
molecule, the third dsDNA molecule, the fourth dsDNA molecule, the dsDNA fragments, and the additional dsDNA molecule and fragments.
Methods
[0029] Figure 1A depicts a method of synthesizing a product DNA molecule according to a method of the invention. 01 and 02 are the at least two oligonucleotides, and 03 is the anchor strand. In this embodiment 01-02 each have a universal primer binding site 101 on the 5' or 3' end, and a variable sequence 105 on the opposing 3' or 5' end. 01-02 also have a conserved flanking sequence 110, in this embodiment depicted in between the universal primer binding site 101 and the variable sequence 105. The "conserved flanking sequences" (CFS) serve to assist the oligos in annealing to complementary CFSs on target oligos, and can also provide primer binding sites for use later in the method.
In some embodiments the CFSs can be a sequence of 15-20 or 15-30 nucleotides, but any convenient length can be used that is able to aid in annealing and provide a primer binding site.
[0030] In this embodiment the anchor strand 03 has, at the 3' and 5' ends, conserved flanking sequences 110 complementary to the conserved flanking sequences on the at least two oligonucleotides. 03 also has at least one variable sequence 105, here situated in between the two conserved flanking sequences 110. In other embodiments the variable sequence can be moved, as long as sufficient space is left for a CFS able to facilitate annealing and/or provide a primer binding site (if utilized). In some embodiments at least 10 or at least 15 or at least 18 nucleotides of a conserved flanking sequence are present on both sides of the variable sequence of the anchor strand. In this embodiment the variable sequence 105 comprises degenerate nucleotides N, here six degenerate nucleotides as depicted in Figure 1. At least a portion of the at least one variable sequence 105 on the anchor strand is complementary to at least a portion of the variable sequences 105 on the at least two oligonucleotides, and in one embodiment can be complementary across the whole variable sequence. One or more of the at least two oligonucleotides can further be programmed to have a recognition site for a restriction endonuclease when assembled into a dsDNA molecule; the anchor strand can also be programmed to contain a recognition site for a restriction endonuclease so that, when it is bound to the at least two oligonucleotides, the recognition sites are formed by base pairs. In one embodiment the restriction endonuclease can be a Type IIS restriction endonuclease. The recognition site on the at least two oligonucleotides and anchor strand can be programmed to lie outside of the variable sequence on the assembled molecule, while the restriction endonuclease cuts inside of the variable sequence. In one embodiment the recognition sites are comprised within the conserved flanking sequence; and in one embodiment the restriction endonuclease cleaves within the variable sequence of a dsDNA molecule. Sequences or nucleotides are "complementary" when they are able to anneal to each other and form base pairs, for example by standard Watson-Crick base pairing.
[0031] The method involves a step of annealing the at least two oligonucleotides 01-02 to an anchor strand 03 so that the at least two oligonucleotides bound to the anchor strand abut one another on the anchor strand. In one embodiment the at least two oligonucleotides can abut at their variable sequences. In one embodiment the variable sequences 105 of the at least two oligonucleotides are annealed to the variable sequence 105 of the anchor strand. It is noted that in the embodiment illustrated each of 01-02 has a variable sequence of 5 nucleotides and 03 has a variable sequence of ten nucleotides. Upon annealing, the respective variable sequences form base pairs. In the invention "binding" and "annealing" are used interchangeably and refer to formation of a double-stranded polynucleotide sequence by standard Watson-Crick base pairing. Two oligonucleotides abut one another when a first oligonucleotide contains a nucleotide that is adjacent to a nucleotide on a second oligonucleotide when both oligonucleotides are bound to the same (third) complementary oligonucleotide (e.g. an anchor strand) by Watson-Crick base pairing, at least at their variable sequences. Figures 1 and 2 illustrate this concept.
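The abutting arrangement described above can be illustrated with a minimal Python sketch. It assumes only standard bases, writes all sequences 5' to 3', and uses made-up 5-nucleotide variable sequences; the function names and example sequences are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only: checks that two oligo variable sequences,
# placed end to end, base-pair with the full anchor variable sequence.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a 5'->3' sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def abut_on_anchor(oligo1_var: str, oligo2_var: str, anchor_var: str) -> bool:
    """True if oligo1_var followed directly by oligo2_var pairs with anchor_var.

    All three arguments are written 5'->3'. The two oligo variable sequences
    abut when their concatenation is the reverse complement of the anchor's
    variable sequence, leaving no gap between them on the anchor.
    """
    return reverse_complement(oligo1_var + oligo2_var) == anchor_var


# Hypothetical example: two 5-nt variable sequences and a 10-nt anchor.
oligo1_var = "ACGTA"
oligo2_var = "GGCTT"
anchor_var = reverse_complement(oligo1_var + oligo2_var)
```

In this sketch the anchor variable length equals the sum of the two oligo variable lengths, matching the 5 + 5 = 10 arrangement described for the illustrated embodiment.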
[0032] The methods also involve a step of ligating the at least two oligonucleotides annealed to the anchor strand to produce a ("first") dsDNA molecule. The step of ligation or "ligating" can mean allowing ligation to occur spontaneously, or contacting the annealed dsDNA fragments or dsDNA molecules with a ligase. A ligase is an enzyme that catalyzes the joining of two polynucleotide molecules by forming a new chemical bond. In one embodiment the ligase can ligate polynucleotides bound to the same complementary polynucleotide strand. In any of the methods any DNA ligase can be used; T4 DNA ligase and E. coli DNA ligase are just two examples, and another DNA ligase can also be used.
[0033] The methods involve performing an amplification step (in any step of any of the methods amplification can be, for example, PCR, isothermal amplification, rolling circle amplification, loop-mediated isothermal amplification, or another DNA amplification method) on the dsDNA molecules (e.g. 01-03 and 04-06 when present) to produce a first dsDNA molecule 07 (and/or 08). In the embodiment of Figure 1 the variable sequence of 07 is a 10mer as an example, but persons of ordinary skill with resort to this disclosure will realize that any appropriate length of variable sequence can be used in the methods. For example, variable sequences of 6-20 or 6-12 or 10-14 or 10-16 nucleotides or other numbers of nucleotides can be used on anchor strands in any embodiment and, optionally, correspond in length to the sum of the variable sequences on the at least two oligonucleotides. In any embodiment the at least two oligonucleotides can have variable sequences of different lengths. For example, one of the two oligonucleotides can have a variable sequence of 4 nucleotides and the second oligo a variable sequence of 6 nucleotides, or other combinations. In the embodiment depicted in Figure 1 the first dsDNA molecule (e.g. 07 or 08) is synthesized with universal primer binding sites 101 at the 3' and 5' ends, a conserved flanking sequence 110 inside each of the 3' and 5' ends, and a variable sequence 105 inside the conserved flanking sequences 110. With respect to DNA sequences, "inside" refers to a feature present further towards the center of the DNA sequence (and further away from the 5' or 3' ends) than a reference feature.
[0034] In some embodiments a plurality of product DNA molecules can be "multiplexed," i.e. synthesized in the same reaction pool. In other embodiments, where desired, DNA molecules can be synthesized individually (or "in parallel") in their own reaction pools (and combined subsequently). Reactions can be multiplexed with two or more binding sets of the at least two oligonucleotides and at least one anchor strand. As with any of the methods, the method depicted in Figure 1 can be performed as an individual synthesis of one dsDNA molecule, or as a multiplexed synthesis of at least two dsDNA molecules, which can later, optionally, be joined as illustrated in Figures 1-2. When multiplexing is utilized, more than one DNA molecule or more than two DNA molecules are synthesized in a simultaneous reaction in the same pool. In Figure 1 multiplexing is depicted as paired oligos 04-05 and paired anchor strand 06 forming a separate paired dsDNA molecule 08, as a pair to the first dsDNA molecule 07. But a paired dsDNA molecule 08 can also be synthesized in a parallel and separate reaction pool, and the first and paired dsDNA molecules (or fragments therefrom) combined in a subsequent step. In any embodiment the dsDNA molecules (or dsDNA fragments) are "paired" when they have an overlapping sequence at the variable sequence. In this embodiment variable sequences of 10 bp are depicted in the first and paired dsDNA molecules, with an overlap of 4 bp in the variable sequence between the dsDNA molecules. But in any embodiment of the methods the overlap can be at least 1 bp, or at least 2 bp or at least 3 bp or at least 4 bp, or at least 5 bp or at least 6 bp. Any two dsDNA molecules (or dsDNA fragments), whether first dsDNA molecule, second, third, fourth, etc., can be dsDNA molecules (or dsDNA fragments) that are paired. Paired dsDNA fragments can be produced (e.g. by restriction enzyme action on a dsDNA molecule or, in any embodiment, by separate synthesis) that have overhanging 3' and/or 5' sequences, which overhangs can be at their variable sequences and can at least partially overlap. Such dsDNA fragments can therefore be annealed at the 3' and/or 5' overhangs to form a larger dsDNA molecule. Overlapping sequences are those that comprise the same sequence for a series of nucleotides. In any embodiment the methods can utilize polynucleotides that overlap by 1, 2, 3, 4, 5, 6, 7, or 8 nucleotides, or by at least 1 or at least 2 or at least 3 or at least 4 or at least 5 or at least 6 or at least 7 or at least 8, or by more than 4 or more than 5 or more than 6 consecutive nucleotides. The overlap can be at their variable sequences. "Overhangs" or "overhanging" sequences refer to 3' or 5' single-stranded DNA sequences that extend from a double-stranded DNA sequence. In various embodiments the overhangs can be at least 2 or at least 3 or at least 4 nucleotides.
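The notion of "paired" variable sequences can be sketched in Python as a suffix/prefix overlap check followed by a merge. This is an illustration under stated assumptions, not the disclosed procedure: both sequences are written 5' to 3' on the same strand, the example sequences are invented, and the helper names are hypothetical.

```python
# Illustrative sketch only: two variable sequences are treated as "paired"
# when the end of one repeats the start of the other.
def overlap_length(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def merge_paired(left: str, right: str, min_overlap: int = 1) -> str:
    """Join two paired sequences, writing the shared overlap only once."""
    k = overlap_length(left, right)
    if k < min_overlap:
        raise ValueError("sequences are not paired at the required overlap")
    return left + right[k:]


# Hypothetical 10-nt variable sequences sharing a 4-nt overlap ("GCTT").
first_var = "ACGTAGGCTT"
paired_var = "GCTTCAGATC"
merged_var = merge_paired(first_var, paired_var, min_overlap=4)
```

With two 10 bp variable sequences and a 4 bp overlap, the merged sequence is 10 + 10 - 4 = 16 nucleotides long, consistent with the 16mer depicted for the second dsDNA molecule.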
[0035] The methods can involve further steps towards synthesizing a larger product DNA molecule. The methods can involve a step of contacting paired (e.g. the first and paired) dsDNA molecule(s) (e.g. 07-08) with a restriction enzyme to produce first and paired dsDNA fragments that have 3' and/or 5' complementary overhang sequences and a portion of the variable sequence from the first dsDNA molecule(s). In any embodiment the first and paired dsDNA molecules can have variable sequences that overlap, and can have conserved flanking sequences containing a recognition site for a restriction endonuclease (e.g. a Type IIS restriction endonuclease). Thus, the restriction enzyme can cleave within the variable sequence of the first dsDNA molecule (and paired dsDNA molecule when present) to produce complementary overhanging 3' and/or 5' sequences of the first dsDNA fragment and paired dsDNA fragment.
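A restriction enzyme that cuts outside its recognition site can be modeled with a short Python sketch. The recognition site, the cut offsets (1 nucleotide on the top strand, 5 on the bottom strand, giving a 4-nucleotide overhang), the example sequence, and all names below are illustrative assumptions; the source does not specify an enzyme or these values.

```python
# Illustrative sketch only: models a staggered cut downstream of a
# recognition site, using top-strand (5'->3') coordinates throughout.
from typing import NamedTuple


class Digest(NamedTuple):
    site_side: str  # top strand up to the top-strand cut (contains the site)
    overhang: str   # top-strand bases lying between the two staggered cuts
    far_side: str   # top strand after the bottom-strand cut


def type_iis_digest(top: str, site: str, top_offset: int, bottom_offset: int) -> Digest:
    """Cut `top` at fixed distances downstream of the first `site` occurrence.

    `top_offset` and `bottom_offset` count nucleotides past the end of the
    recognition site at which the top and bottom strands are cut. The bases
    between the two cuts remain single-stranded on each resulting fragment.
    """
    start = top.find(site)
    if start < 0:
        raise ValueError("recognition site not found")
    site_end = start + len(site)
    top_cut = site_end + top_offset
    bottom_cut = site_end + bottom_offset
    if not top_cut < bottom_cut <= len(top):
        raise ValueError("cut positions must be staggered and inside the sequence")
    return Digest(top[:top_cut], top[top_cut:bottom_cut], top[bottom_cut:])


# Hypothetical molecule: a site in the flanking region, then variable bases.
result = type_iis_digest("TTGGTCTCAACGTAGGCTTAA", "GGTCTC", 1, 5)
```

Because the cut positions depend only on the distance from the site, the overhang sequence is drawn from whatever lies downstream, which is how a site placed in a conserved flanking sequence can yield an overhang made of variable sequence.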
[0036] The methods can involve a step of providing at least one additional dsDNA fragment that has a 3' and/or 5' overhang sequence complementary to an overhang sequence of at least one other dsDNA fragment in the synthesis reaction (e.g. the first and/or paired dsDNA fragment). The overhang sequence of the additional dsDNA fragment can contain at least a portion of the variable sequence (depicted in Figure 2). The overhang sequences can comprise at least a portion of the variable sequence of the first dsDNA molecule (and paired dsDNA molecule when present), and therefore the fragments can have overlapping sequences at the variable sequence. The at least one additional dsDNA fragment can be from a restriction endonuclease reaction on the dsDNA molecule(s) or its pair (e.g. 07 and 08), or can be a DNA fragment produced in another, parallel reaction, and can also be separately synthesized. Each of these dsDNA fragments has a 3' or 5' overhang sequence complementary to that of at least one other dsDNA fragment in the synthesis reaction. However, in other embodiments a plurality of additional dsDNA fragments can be imported and assembled into the product DNA molecule at the same time. Some dsDNA fragments can have both a 3' and a 5' overhang sequence complementary to two other dsDNA fragments in the reaction, and can therefore be inserted in between two fragments to lengthen the product DNA molecule; these dsDNA fragments therefore will have 3' and 5' overhangs, each complementary to one other 3' or 5' overhang of a dsDNA fragment in the synthesis.
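How fragments with overhangs at one or both ends determine their own order can be sketched in Python. The sketch assumes each fragment is described on the top strand by a left overhang, a double-stranded body, and a right overhang, that two fragments join when one's right overhang sequence equals the next one's left overhang sequence, and that each overhang is unique in the reaction. The `Fragment` type, `assemble` function, and example sequences are hypothetical.

```python
# Illustrative sketch only: orders fragments by matching overhangs and
# returns the top-strand sequence of the joined product.
from typing import List, NamedTuple, Optional


class Fragment(NamedTuple):
    left: Optional[str]   # overhang shared with the previous fragment, or None
    body: str             # double-stranded portion, top strand 5'->3'
    right: Optional[str]  # overhang shared with the next fragment, or None


def assemble(fragments: List[Fragment]) -> str:
    """Chain fragments from the one with no left overhang to the one with no right."""
    starts = [f for f in fragments if f.left is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one fragment without a left overhang")
    by_left = {f.left: f for f in fragments if f.left is not None}
    if len(by_left) != len(fragments) - 1:
        raise ValueError("left overhangs must be unique")
    current = starts[0]
    product = current.body
    used = 1
    while current.right is not None:
        nxt = by_left.get(current.right)
        if nxt is None:
            raise ValueError(f"no fragment matches overhang {current.right}")
        product += current.right + nxt.body  # shared overhang is written once
        current = nxt
        used += 1
        if used > len(fragments):
            raise ValueError("overhangs form a cycle")
    if used != len(fragments):
        raise ValueError("some fragments could not be placed")
    return product


# Hypothetical reaction: the middle fragment has overhangs at both ends and
# is inserted between the two end fragments regardless of input order.
pool = [
    Fragment("GGCA", "CCATGG", None),
    Fragment(None, "GATTACA", "ACGT"),
    Fragment("ACGT", "TTGA", "GGCA"),
]
product = assemble(pool)
```

The requirement that overhangs be unique is a simplification made for this sketch; it is what lets several fragments be combined in one reaction and still join in a single defined order.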
[0037] The additional dsDNA fragment(s) can be used in any embodiment.
"Additional dsDNA fragment" (or additional dsDNA molecule) is a general term, not necessarily specific to any particular step in the methods. Such additional dsDNA fragments can be annealed to another dsDNA fragment at any step in the methods at 3' and/or 5' overhangs. The additional dsDNA fragment can be at least partially complementary to the 3' and/or 5' overhang on at least one other dsDNA fragment in the method, which can be the first dsDNA fragment(s), or second dsDNA fragment(s), or third dsDNA
fragment(s), or fourth dsDNA fragment(s), or other additional dsDNA fragments. In any embodiment such additional dsDNA fragments can be derived from additional dsDNA molecules. The additional dsDNA fragment can therefore have at the 3' and/or 5' overhangs the next series of nucleotides to be synthesized into the final product dsDNA molecule to form the dsDNA
molecule of pre-determined sequence.
[0038] The methods can involve a step of annealing a first dsDNA fragment with at least one additional dsDNA fragment by their complementary overhang sequences.
The method also can involve performing a step of ligation to produce a second dsDNA molecule (09) (here depicted having a 16mer variable sequence) having a conserved flanking sequence (CFS) 110 inside each of the 3' and 5' ends, and a variable sequence 105 inside the 3' and 5' conserved flanking sequences that is longer than the variable sequence on the first dsDNA
molecule.
[0039] The methods therefore can further involve a step of contacting at least one second dsDNA molecule with a restriction enzyme to produce a plurality of second dsDNA
fragments comprising 3' and/or 5' overhang sequences. At least two of the plurality can have a conserved flanking sequence inside each of the 3' or 5' overhangs. The method can further involve a step of annealing the plurality of second dsDNA fragments to one or more additional dsDNA fragments having a complementary 3' or 5' overhang sequence (which can be at the variable sequence), and performing a step of ligation to produce at least one third dsDNA molecule having a conserved flanking sequence on the 3' and 5' ends, and a variable sequence inside the conserved flanking sequences that is longer than the variable sequence of the second dsDNA molecule. In the embodiment depicted in Figures 1 and 2 the at least one third dsDNA molecule has a variable sequence of 28 bp and a 4 bp overlap with at least one paired third dsDNA molecule.
[0040] The methods can further involve a step of reacting the at least one third dsDNA molecule with a restriction enzyme to produce at least one third dsDNA
fragment 125 having 3' and/or 5' overhang sequences, optionally annealing the at least one third dsDNA fragment 125 to one or more additional dsDNA fragments 130 having a complementary 3' or 5' overhang sequence, and performing a step of ligation to produce a fourth dsDNA molecule. Two of the dsDNA fragments in the mixture can have a conserved flanking sequence inside the variable sequence, and a variable sequence at the 3' or 5' overhang. The fourth dsDNA molecule can therefore have conserved flanking sequences inside the 3' and 5' ends, an optional universal primer binding sequence, and a variable sequence (optionally between the CFSs) that is longer than the variable sequence of the third dsDNA molecule. In the embodiment depicted in Figures 1 and 2 the at least one fourth dsDNA molecule has a variable sequence of 100 bp. As in the other steps one or more additional dsDNA fragments can be included into the reaction to further lengthen the variable sequence of the product dsDNA molecule. The at least one additional dsDNA
fragment can be derived from parallel (or multiplexed) reactions and, as depicted in Figure 1B by fragments 125 and 130, can have overhangs at both the 3' and 5' ends with a sequence complementary to the overhang sequences of two other dsDNA fragments in the reaction.
These additional dsDNA fragments can be produced by including two restriction recognition sites on the dsDNA molecule (optionally within the CFSs) contacted with the restriction endonuclease, which can then cleave the dsDNA molecule into at least three fragments. The dsDNA fragments can therefore be joined in annealing and ligation reactions to form a longer product dsDNA molecule. In any of the embodiments or steps two or more dsDNA
fragments can be included in the reaction, and the dsDNA fragments can have a 3' and 5' overhang sequence and not have a CFS, i.e. these dsDNA fragments can be all variable sequence.
[0041] The methods offer great versatility in synthesizing a product dsDNA molecule. For example, in the final step of synthesis at least one dsDNA fragment can be included in the annealing reaction to place a desired sequence on the product dsDNA molecule. In one embodiment a 5' cap and universal priming sequence 120 can be added to the 3' and 5' ends of the product dsDNA molecule. The 5' cap can assist in preventing degradation of the ends of the DNA molecule, and the priming sequence is convenient for amplification when desired.
[0042] Additive reactions can also be performed. In the embodiment depicted in Figure 1B the fourth dsDNA molecule has a variable sequence of 100 bp.
Parallel reactions can produce a plurality of additional dsDNA molecules having complementary and/or overlapping sequences with the fourth dsDNA molecule. The additional dsDNA can also have a variable sequence of, for example, 100 bp or any suitable length. Any of the dsDNA
molecules can be cut with one or two restriction endonuclease(s) to produce a plurality of dsDNA fragments having 3' and/or 5' overhangs that contain complementary sequences with the 3' or 5' overhang of one or two other dsDNA fragments. The overhangs can comprise at least a portion of the variable sequence of each dsDNA molecule. The dsDNA
fragments can be combined to synthesize a much longer variable sequence in a product dsDNA
molecule.
[0043] Figure 2 provides a more detailed illustration of methods of the invention. Again present are at least two oligonucleotides 01-02 and anchor strand 03, and paired oligonucleotides 04-05 and paired anchor strand 06. Figure 2B shows the complementary overlapping 3' and 5' overhang sequences that occur after restriction endonuclease digestion of the first and paired dsDNA molecules (depicted as 07 and 08). The variable sequence is again depicted as a 10mer and forms part of the 3' or 5' overhang sequences. Also depicted is the second dsDNA molecule (again illustrated with a variable sequence of 16 nucleotides) that is synthesized after annealing and amplification (PCR2) of the first and paired dsDNA fragments. In the embodiment depicted in Figure 2, annealing, ligation 0 (L0), and PCR1 occur between the at least two oligonucleotides and anchor strands, here depicted in binding sets 01-03 and 04-06, in multiplex mode, to form the first (07) and paired (08) dsDNA molecules. Digestion with restriction endonuclease is then performed to produce first and paired dsDNA fragments, followed by ligation 1 to additional dsDNA fragments (here the paired dsDNA fragment), then PCR2 to form the second dsDNA molecule, here depicted as having a variable sequence that is a 16mer (09).
[0044] The second dsDNA molecule can then be digested with restriction endonuclease to form second dsDNA fragments, and ligated (L2) with additional dsDNA fragments followed by PCR3 to form the third dsDNA molecule, which is depicted as having a 28mer variable sequence (010). The third dsDNA molecule can in turn be digested with restriction endonuclease to form third dsDNA fragments 125, which can be combined and ligated (L3) with additional dsDNA fragments 130 and PCR3 performed to yield the fourth dsDNA molecule (014), which is depicted as having a 100mer variable sequence. A plurality of additional dsDNA fragments 130 can be included in the reaction, which can derive from multiplexed or parallel synthesis reactions.
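By way of non-limiting illustration, the variable-sequence lengths depicted in Figure 2 (10mer, 16mer, 28mer, 100mer) are consistent with joining fragments whose variable sequences share a short overlap at each junction. The Python sketch below assumes a four-nucleotide overlap, which is an assumption made for this example and is not stated in the text (it matches the length of the overhang left by BsaI), and it assumes that the 100mer is formed from four 28mers. The function name is hypothetical.

```python
def joined_length(fragment_lengths, overlap=4):
    """Length of the variable sequence after joining fragments end to end.

    Each junction is assumed to share `overlap` nucleotides (an assumed
    value, not taken from the specification).
    """
    if not fragment_lengths:
        raise ValueError("at least one fragment is required")
    return sum(fragment_lengths) - overlap * (len(fragment_lengths) - 1)


print(joined_length([10, 10]))   # 16: first and paired 10mers
print(joined_length([16, 16]))   # 28: two 16mers
print(joined_length([28] * 4))   # 100: four 28mers (assumed fragment count)
```

Other overlap lengths or fragment counts can be explored by changing the arguments.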
[0045] The terms "first dsDNA molecule," "second dsDNA molecule," "third dsDNA molecule," "fourth dsDNA molecule," "dsDNA fragments," "additional dsDNA," and "paired dsDNA molecule" are relative terms that are provided to assist in tracking a molecule through any step(s) in the method, and do not necessarily refer to any absolute point or DNA molecule or fragment in the reaction. The first dsDNA molecule contains a variable sequence provided by the at least two oligonucleotides and first anchor strand; its paired dsDNA molecule can also contain a variable sequence that at least partially overlaps with the variable sequence of the first dsDNA molecule. In another embodiment the variable sequence of the first dsDNA molecule will at least partially overlap with the variable sequence of at least one additional dsDNA molecule. The second dsDNA molecule contains a variable sequence of the first and paired dsDNA molecules, and in turn can overlap with paired dsDNA fragments or additional dsDNA fragments. The third dsDNA molecule contains a variable sequence of the at least one second dsDNA molecule, and can further contain a variable sequence of the first dsDNA molecule, and can also have a variable sequence of one or more additional dsDNA molecules. The fourth dsDNA molecule can contain a variable sequence from the first, paired, second dsDNA molecule (and its pair), and third dsDNA molecule (and its pair); in some embodiments the fourth dsDNA molecule contains a variable sequence of a plurality of third dsDNA molecules. This process can continue, and five to ten dsDNA molecules can be synthesized in hierarchical fashion, as generally depicted in Figure 3. When digested by a Type IIS restriction endonuclease the dsDNA molecules will produce a dsDNA fragment having 3' and/or 5' overhang sequences that are complementary to at least one other dsDNA fragment in the mixture (or produced by a parallel synthesis reaction) at their variable sequences.
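By way of non-limiting illustration, the way a Type IIS enzyme leaves an overhang outside its recognition site can be sketched in Python using BsaI, which this document names elsewhere as an example enzyme. BsaI recognizes GGTCTC and cuts one nucleotide downstream of the site on the top strand and five nucleotides downstream on the bottom strand, leaving a four-nucleotide 5' overhang. The sketch models only the top strand and only the first forward-oriented site; the function name and the example sequence are hypothetical.

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence, top strand, 5'->3'


def bsai_digest_top(seq):
    """Cut at the first forward-oriented BsaI site on the top strand.

    Returns (upstream, downstream, overhang), or None when no site with a
    complete cut window is present. Only the top strand is modelled.
    """
    start = seq.find(BSAI_SITE)
    if start == -1:
        return None
    top_cut = start + len(BSAI_SITE) + 1      # 1 nt past the site (top strand)
    bottom_cut = start + len(BSAI_SITE) + 5   # 5 nt past the site (bottom strand)
    if bottom_cut > len(seq):
        return None
    overhang = seq[top_cut:bottom_cut]        # 4-nt 5' overhang on the downstream piece
    return seq[:top_cut], seq[top_cut:], overhang


upstream, downstream, overhang = bsai_digest_top("AAGGTCTCATTCGCCCC")
print(upstream, downstream, overhang)
```

Because the overhang lies outside the recognition site, its four nucleotides can be chosen freely, which is what allows the overhang to fall within the variable sequence.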
[0046] The product DNA molecule can optionally be assembled with flanking sequences, which are useful for continuing procedures (e.g. PCR or other DNA amplification). Flanking sequences can be of any length appropriate for the continuing procedures contemplated, for example about 12 nucleotides, or about 18 nucleotides, or about 18-22 nucleotides or 18-30 nucleotides or 18-60 nucleotides. Flanking sequences can include a 5' cap to discourage degradation of the dsDNA molecule, and a universal primer binding sequence to aid amplification.
[0047] The methods therefore allow the production of a product DNA molecule having a variable sequence of any length without the need for a conventional oligonucleotide synthesizer, which typically relies on chemical synthesis (e.g. phosphoramidite chemistry). Instead, the methods rely on enzymatic-based synthesis, and therefore the DNA molecules or polynucleotides can be produced on demand. DNA molecules can refer to single-stranded polynucleotides or double-stranded DNA bound by Watson-Crick base pairing. The methods can also involve performing multiple cycles of PCR or another DNA amplification procedure on any product DNA molecule.
[0048] In any embodiment the methods or any step of the methods can be performed without cloning or the need for cloning. In any embodiment the methods or any step of the methods can be performed entirely in vitro. In any embodiment the methods or any step of the methods can be performed without the use of live cells. In any embodiment the methods or any step of the methods can produce a scarless product DNA molecule. By scarless DNA is meant DNA that does not have any nucleotide(s) introduced by or from the process of synthesizing the DNA (e.g. residue nucleotides from a linker or flanking sequence). In any embodiment the methods or any step of the methods can produce a product DNA molecule that is barcode free, or free of a nucleotide sequence placed for identification purposes. A barcode can be a sequence that is not otherwise needed but has a particular sequence and is used to identify a sequence of DNA. In various examples and embodiments a barcode sequence is 6-8 nucleotides in length, or 4-10 nucleotides in length. In any embodiment the methods or any step of the methods can be performed without any part of any oligonucleotide used in the method being immobilized, i.e. bound to a solid phase or solid support (e.g. a bead, DNA chip, microfluidic surface, etc.). In any embodiment the oligonucleotides can be annealed in solution, and can be ligated in solution, i.e. without any oligonucleotide in the step or method being bound or partially bound to a solid phase or solid support. In any embodiment the methods or any step of the methods can synthesize the product DNA molecule without the use of and without performing chemical assembly techniques (e.g. phosphoramidite chemistry). In any embodiment the methods or any step of the methods can assemble the product DNA molecule using only enzymatic assembly of oligonucleotides. In any embodiment the methods or any step of the methods can be performed by drawing the at least two oligonucleotides and anchor strands from a library comprising less than 20,000 members, or from any library described herein. In any embodiment the at least two oligonucleotides and anchor strands can be selected from an oligonucleotide library having less than 10,000 members, or from any oligo library described herein. In any embodiment the methods or any step of the methods do not utilize or require the use of a vector.
[0049] The product DNA molecule can optionally be formed having conserved flanking sequences and, optionally, universal primer binding sites on the 3' and 5' ends of the product DNA molecule. The at least two oligonucleotides can be formed with one or more primer binding sites, which can provide binding sites for primers in amplification procedures (e.g. by PCR). Once the anchor strands are no longer necessary (e.g. a sufficiently long product DNA molecule has been synthesized), amplification can be done using primers that bind to the conserved flanking sequences, so that the universal primer binding sites are not needed.
[0050] The method can be facilitated by the use of recognition sites for a restriction endonuclease that can be effectively activated or inactivated. For example, with reference to Figure 1, 01 can have an inactive recognition site and 02 an active site programmed into the sequence when bound to the anchor strand 03. This recognition site can be utilized to digest the formed dsDNA molecules 07 and 08 at one location, leaving 5' and 3' overhangs. When multiplexing is used the restriction sites can be formulated within the sequences so that the restriction site is active on one side of the dsDNA molecule and inactive on the other, and vice versa for the other member of the pair. Thus, when digested each dsDNA will produce two dsDNA fragments, which can then be annealed. However, when a larger dsDNA is arrived at (e.g. one having at least a 20mer or 28mer or similar variable sequence), and when the use of additional dsDNA fragments having a complementary 3' and 5' overhang 130 from parallel reactions is contemplated, the dsDNA molecule can be formulated so that it has active restriction recognition sites on both sides of the dsDNA molecule. Thus, when digested it will be cut into at least three fragments, at least one having both a 3' and 5' overhang sequence, which can be at the variable sequence. This additional dsDNA fragment can then be included within an annealing and ligation reaction with at least one 3' end and at least one 5' end of the dsDNA molecule, per Figure 1B. This allows the length of the variable sequence in the product dsDNA molecule to be greatly increased. The recognition (and restriction) sites can be turned "on" or "off" by utilizing a primer having a nucleotide mismatch, so the product dsDNA molecule no longer has an active recognition site (or does have one where formerly it did not). For example, the restriction site for BsaI is 5'-GGTCTC(N1)-3' (SEQ ID NO: 17). By changing one nucleotide in the sequence, using a primer with a single mismatch, one can turn the restriction site "off" (or vice versa). This can be utilized for any restriction endonuclease and can be used to place or remove a recognition site on either or both ends of the DNA.
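By way of non-limiting illustration, the effect of a single-mismatch primer on a BsaI recognition site can be sketched in Python as a one-base substitution on a sequence string. The sketch checks both orientations of the GGTCTC site. The function names, the example sequence, and the choice of which base to change are hypothetical and are not taken from the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BSAI_SITE = "GGTCTC"


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_bsai_site(seq):
    """True when a BsaI recognition site is present on either strand."""
    return BSAI_SITE in seq or reverse_complement(BSAI_SITE) in seq


def with_mismatch(template, position, new_base):
    """Copy of `template` carrying one substitution, as a mismatched primer would."""
    if template[position] == new_base:
        raise ValueError("the substitution must change the base")
    return template[:position] + new_base + template[position + 1:]


active = "TTGGTCTCAACGT"                 # contains GGTCTC at positions 2-7
inactive = with_mismatch(active, 4, "A") # GGTCTC -> GGACTC, site turned "off"
restored = with_mismatch(inactive, 4, "T")  # site turned back "on"
print(has_bsai_site(active), has_bsai_site(inactive), has_bsai_site(restored))
```

In practice the amplified product carries the primer's sequence, so the substitution propagates to the product dsDNA molecule; the sketch does not model amplification itself.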
[0051] In any embodiment the methods can include a step of removing conserved flanking sequences and/or universal primer binding sites on one side or both sides of the DNA molecule after amplification to yield a product DNA molecule. Methods of removing flanking sequences are known in the art. In some embodiments the conserved flanking sequences and/or universal primer binding sites can be utilized to add length to the product DNA molecule, or to surround the product DNA molecule with transcriptional elements or other beneficial sequences that will be utilized in the final desired sequence. For example, the flanking sequences can be set to provide a promoter in front of the product DNA molecule, and/or to provide a terminator (i.e. regulatory sequences). In one embodiment the product DNA molecule is a gRNA sequence (e.g. of 16-20 bp). The flanking sequences can optionally be set to provide a promoter in front of the gRNA sequence, and a Cas9 handle and terminator after it. Thus, in some embodiments the product DNA molecule can be expanded to encompass the universal primer binding sites and/or flanking sequences and/or one or more regulatory sequences and/or a Cas9 handle, any of which can provide more utility than being only binding sites for primers.
[0052] Any of the methods disclosed herein can be performed in an automated method, for example by an automated instrument. An automated method is one where no human intervention is necessary after the method is initiated; the method goes to completion from that point without a human having to perform any action. The automated instrument can contain components for selecting oligonucleotide members from the oligo library. A DNA sequence to be assembled can be uploaded, recorded on, or stored on a non-transitory computer-readable medium. A non-transitory computer-readable medium can be programmed to execute automated steps when inserted into or otherwise in electronic communication with a processor attached to or comprised within the automated instrument. The automated steps can be any disclosed herein for performing any method disclosed herein. Thus, the invention also provides a non-transitory computer-readable medium that is programmed with the locations of each member of an oligonucleotide library described herein, where the oligonucleotide library is present on a suitable support structure for the oligo library. In one embodiment the non-transitory computer-readable medium is programmed with the locations of at least 6,000 or at least 9,000 oligonucleotide library members. The medium can also be programmed with instructions to combine 4-6 members of a binding set from the library and to assemble the members of the binding set into a product DNA molecule according to the methods described herein. A "member" of a library is one or more polynucleotides at a location. An oligo library can be comprised on any type of medium, for example a multi-well plate or plurality of plates.
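By way of non-limiting illustration, one way a medium could record the location of each library member is a fixed mapping from a member index to a plate and well. The Python sketch below assumes 384-well plates (16 rows by 24 columns) filled row by row, and zero-based member indices; the layout, the function names, and the example indices are assumptions made for this example and are not taken from the specification.

```python
import string

ROWS = string.ascii_uppercase[:16]      # rows A-P of a 384-well plate
COLUMNS = 24
WELLS_PER_PLATE = len(ROWS) * COLUMNS   # 384


def well_location(member_index):
    """Map a zero-based library member index to (plate number, well name)."""
    if member_index < 0:
        raise ValueError("member index must be non-negative")
    plate, offset = divmod(member_index, WELLS_PER_PLATE)
    row, column = divmod(offset, COLUMNS)
    return plate + 1, f"{ROWS[row]}{column + 1}"


def pick_list(binding_set):
    """Locations to draw from for one binding set of 4-6 library members."""
    if not 4 <= len(binding_set) <= 6:
        raise ValueError("a binding set is expected to have 4-6 members")
    return [well_location(index) for index in binding_set]


print(pick_list([0, 24, 384, 8999]))
```

Under this assumed layout a library of 9,000 members occupies 24 plates, the last of which is only partly filled.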
[0053] The invention also provides kits having an oligo library described herein located on a medium. The medium can be any suitable medium, for example one or more of a DNA chip, one or more bead(s), microtubes, one or more 96-well plate(s), one or more 384-well plate(s), one or more 1536-well plate(s), one or more microfluidic reaction support(s), one or more microtiter plate(s), one or more nanotiter plate(s), one or more picotiter plate(s), or other solid support or solid phase surface that can retain oligonucleotide members of the library. When more than one medium is utilized the media can be present in numbers sufficient to accommodate the oligo library. The medium containing the oligonucleotide library can contain members in any suitable volume, and examples include volumes of 1 nl up to 100 ul, or 10 nl up to 100 ul. A DNA chip (or DNA microarray) is a solid surface having a collection of microscopic locations, to which oligonucleotides can be attached and/or stored.
[0054] Any of the methods of the invention can synthesize a product DNA molecule with a very low error rate. In various embodiments the methods can produce any product DNA molecule described herein with error rates of less than 1 in 1,000 nucleotides, or less than 1 in 2,000 nucleotides, or less than 1 in 2,400 nucleotides, or less than 1 in 2,500 nucleotides, or less than 1 in 3,000 nucleotides, or less than 1 in 8,000 nucleotides.
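By way of non-limiting illustration, a per-nucleotide error rate can be translated into an expected number of errors per product molecule and an expected fraction of error-free molecules. The Python sketch below assumes that errors occur independently at each position with the same probability; that simplifying assumption, the function names, and the 100-nucleotide example length are choices made for this example and are not taken from the specification.

```python
def expected_errors(length, one_in_n):
    """Expected number of errors in a molecule of `length` nucleotides
    at an error rate of 1 in `one_in_n` nucleotides."""
    return length / one_in_n


def error_free_fraction(length, one_in_n):
    """Fraction of molecules expected to carry no error, assuming
    independent errors at every position (a simplifying assumption)."""
    return (1.0 - 1.0 / one_in_n) ** length


for rate in (1000, 2000, 2400, 2500, 3000, 8000):
    print(rate, expected_errors(100, rate), round(error_free_fraction(100, rate), 4))
```

Under this model a lower per-nucleotide error rate raises the error-free fraction, and longer products lower it.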
General Steps
[0055] In any embodiment the methods can begin with a pooling of at least two oligonucleotides and an anchor strand from the oligo library. A general embodiment is depicted in Figure 1A-B. In the embodiment described here multiplexing will be utilized, and the at least two oligos and anchor strand have been prepared with restriction sites for BsaI, although any Type IIS restriction enzyme can be utilized.
[0056] The pool of oligos can be subjected to a step of annealing and a step of ligation (e.g. L0 and PCR1). The ligation step can be performed by contacting the pool of oligonucleotides with a ligase, for example T4 DNA ligase, but any ligase can be utilized at any step in the invention. Ligation involves the annealing of complementary 5' and 3' overhang sequences on the dsDNA fragments produced by the digestion with restriction endonuclease. Ligation can also involve contacting the oligos with a ligase. The polymerase chain reaction (PCR) is a common reaction in biology known to persons of ordinary skill. PCR can be used in the invention according to normal procedures and well known techniques. PCR (PCR1) results in amplification of the oligos, depicted in the example in Figure 1A as 07 and 08 and having variable sequences that are 16mers. The methods can involve a step of digestion with restriction endonuclease and annealing with additional dsDNA fragments and ligation, followed by a step of PCR (D1 and PCR2). The oligo set can be digested with the restriction enzyme. Digestion results in dsDNA fragments with complementary 3' and 5' overhangs, which are then joined with other fragments, ligated, and amplified in the PCR2 step to form dsDNA molecules having variable sequence(s), depicted in the embodiment in Figure 1A as a 16mer. Another digestion (DL2) can be performed on the product to result in dsDNA fragments, followed by steps of annealing, ligation, and PCR3 to form dsDNA molecules, depicted in Figure 1A as having variable sequences that are 28mers (Figure 1B). A step of digestion with restriction endonuclease, annealing with additional dsDNA fragments and ligation (DL3) can then be utilized. The ligation can optionally involve dsDNA fragments from parallel reactions or otherwise synthesized that have a 3' and/or 5' overhang complementary to at least one other dsDNA fragment in the mixture. In this manner the length of the variable sequence can be quickly increased. The product dsDNA molecule of desired sequence in this embodiment is a 100mer variable sequence depicted in Figure 1B.
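By way of non-limiting illustration, the repeated cycle described above (digestion, annealing and ligation, then amplification, applied again to the products) can be modelled on plain sequence strings. In the Python sketch below each round joins adjacent fragments in pairs wherever their ends match. The four-nucleotide overlap, the restriction to pairwise joins (the text also contemplates joining more than two fragments in one step), the function names, and the example sequence are assumptions made for this example. Enzymatic steps are not modelled.

```python
def join_by_overlap(left, right, overlap=4):
    """Join two variable sequences whose adjacent ends are identical
    over `overlap` nucleotides (standing in for complementary overhangs)."""
    if left[-overlap:] != right[:overlap]:
        raise ValueError("fragment ends do not match")
    return left + right[overlap:]


def assemble_in_rounds(fragments, overlap=4):
    """Join adjacent fragments pairwise, round by round, until one remains.

    Returns (product, number_of_rounds).
    """
    current = list(fragments)
    if not current:
        raise ValueError("at least one fragment is required")
    rounds = 0
    while len(current) > 1:
        joined = [
            join_by_overlap(current[i], current[i + 1], overlap)
            for i in range(0, len(current) - 1, 2)
        ]
        if len(current) % 2:          # an unpaired fragment waits for the next round
            joined.append(current[-1])
        current = joined
        rounds += 1
    return current[0], rounds


target = "ACGTTGCAAGCTTAGGCTAACCGGTATC"               # hypothetical 28-nt variable sequence
tiles = [target[i:i + 10] for i in range(0, 19, 6)]   # four 10mers overlapping by 4 nt
product, rounds = assemble_in_rounds(tiles)
print(product == target, rounds)
```

With four 10mers as input, the first round yields two 16mers and the second round yields the 28mer.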
Universal Primer Binding Sites
[0057] The universal primer binding sites can be present on some DNA molecules in some embodiments of the methods. In some embodiments the sites can be present on the at least two oligonucleotides and on the at least one first dsDNA molecule. However, in some embodiments these sites can be eliminated in any step after the anchor strand is no longer utilized. For example, the sites can be eliminated after formation of the at least one first or second dsDNA molecule, and the conserved flanking sequences used as primer binding sites thereafter. Thus, the at least two oligonucleotides and the anchor strand can have universal primer binding sites, which then are present in the at least one first dsDNA molecule, but any one or more of the second, third, and fourth dsDNA molecules can lack a universal primer binding site. Universal primer binding sites can also be added to dsDNA molecules at any step where convenient in the methods, e.g. on forming the final product dsDNA molecule it may be found desirable to have a convenient method of amplifying the product. The length of the universal primer binding site can be at least 6 nucleotides or at least 10 or at least 15 or at least 18 or at least 20 nucleotides or at least 25 nucleotides, or less than 15 nucleotides, or less than 12 nucleotides or less than 10 nucleotides or less than 8 nucleotides, but no particular length is necessary, only that the site allow for binding of a primer and amplification of the molecule. In one embodiment the universal primer binding sites have the same sequence on all molecules in a mixture, enabling amplification of the mixture from a single set of primers. In any step of amplification all dsDNA molecules to be amplified can have a universal primer binding site of the same sequence.
Variable Sequence
[0058] Whether the methods are performed in multiplex fashion or in parallel, the variable sequence in the dsDNA molecule can grow longer as the methods proceed, due to progressively combining more DNA containing a variable sequence that will be part of the product dsDNA molecule. In any embodiment the variable sequence in the first dsDNA molecule can equal the variable sequences from the at least two oligonucleotides combined. In any embodiment the variable sequence in the first dsDNA molecule is 6-14 nucleotides, or 8-12 nucleotides or about 10 nucleotides, which can be adjusted depending on the dsDNA molecule to be synthesized. In any embodiment the second dsDNA molecule can have a variable sequence of 8-24 or 10-22 or 14-18 or 15-17 base pairs. In any embodiment the third dsDNA molecule comprises a variable sequence of 18-38 or 20-36 or 24-32 or 26-30 or 27-29 base pairs. In any embodiment the fourth dsDNA molecule can have a variable sequence of 70-130 or 80-120 or 90-110 base pairs. But the length of the variable sequence in any step is not fixed and can be varied to whatever is convenient or desirable in the application.
[0059] The variable sequences can be sequences that will be present in the product dsDNA molecule. Thus, the variable sequences will vary in each construct depending on what portion of the final product DNA molecule it is carrying and what product dsDNA molecule is being synthesized. The product DNA molecule can be the DNA molecule having the desired sequence. In one embodiment all of the variable sequences in the at least two oligonucleotides will be present in the product dsDNA molecule produced at the end of whichever method is performed.
[0060] The variable sequence in the at least two oligonucleotides (or in any step or molecule of the methods) can be at least 4 nucleotides, or at least 5 nucleotides, or at least 6 nucleotides, or at least 10 or at least 12 or at least 15 or at least 18 or at least 20 nucleotides, or 3-7 nucleotides or 4-6 nucleotides, or 4-8 nucleotides, or 6-10 nucleotides, or 6-12 nucleotides, or 12-16 nucleotides, or 14-18 nucleotides. The variable sequence for an anchor strand can be equal to the lengths of the variable sequences in the at least two oligonucleotides. The variable sequence on the anchor strand can anneal entirely with the variable sequences on the at least two oligonucleotides.
[0061] In any embodiment the variable sequence can be present as one consecutive sequence, or the nucleotides of the variable sequence can be separated singly or in groups of two or three or four or more consecutive nucleotides throughout the oligo sequence to comprise a variable region. The variable sequence can be at least a portion of the desired sequence or product dsDNA molecule to be synthesized in the methods. A variable sequence can represent a distinct sequence for each possibility presented by the length of the variable sequence, and each distinct sequence can be present at a distinct location in the oligo library. Thus, each oligo having a distinct variable sequence can be located at a distinct location in the oligo library. For example, 01 of the at least two oligonucleotides has a variable sequence. When the variable sequence is five nucleotides, 01 can have 1024 possible nucleotide sequences, i.e. 4x4x4x4x4 equals 1024 variable sequences for 01. The same is true for 02-06 as depicted in Figure 1. Variable nucleotides can be dispersed in the sequence, singly or in groups as explained above.
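The count of possible variable sequences can be checked with a short enumeration. This is only an illustrative sketch: the function name `all_variable_sequences` is hypothetical and is not part of the disclosed methods.

```python
from itertools import product

BASES = "ACGT"

def all_variable_sequences(length):
    """Return every possible DNA sequence of the given length (4**length entries)."""
    return ["".join(p) for p in product(BASES, repeat=length)]

# A five-nucleotide variable sequence has 4x4x4x4x4 possibilities.
sequences = all_variable_sequences(5)
print(len(sequences))
```

Each entry in the returned list would correspond to one distinct location in the oligo library for a given oligo.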
[0062] In any embodiment the variable sequences of two dsDNA molecules can overlap, i.e. have a common sequence for two or more nucleotides. In some embodiments any of the dsDNA molecules can contain variable sequences that overlap by at least 1 or at least 2 or at least 3, or at least 4, or at least 5, or at least 6, or at least 7, or at least 8 nucleotides, or by 1-6 nucleotides or 2-5 nucleotides, or by 3-10 nucleotides, or by about 4 nucleotides, or by at least 10 nucleotides, or by more than 8 nucleotides. In various embodiments the first or second or third dsDNA molecules, or additional dsDNA molecules described herein can have variable regions that overlap with other dsDNA molecules as described. For example, the first dsDNA molecule can overlap with its paired dsDNA molecule, and the second dsDNA molecule, third dsDNA molecule, or additional dsDNA molecules can all overlap with their paired dsDNA molecule. But dsDNA molecules can also overlap with any other dsDNA molecule (e.g. a second dsDNA can be made to overlap with a third dsDNA from a parallel synthesis reaction).
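As a minimal sketch of what an overlap between two variable sequences means computationally, the following finds the longest sequence shared between the end of one variable sequence and the start of the next, and joins them through it. The function names and the example sequences are hypothetical and chosen only for illustration.

```python
def overlap_length(left, right, min_len=1):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0

def join_by_overlap(left, right, min_len=1):
    """Join two variable sequences through their shared overlap, if any."""
    k = overlap_length(left, right, min_len)
    if k == 0:
        raise ValueError("sequences do not overlap by the required length")
    return left + right[k:]

# Two hypothetical variable sequences sharing a 4-nucleotide overlap (TGCA).
print(overlap_length("ACGTTGCA", "TGCAGGCT"))
print(join_by_overlap("ACGTTGCA", "TGCAGGCT"))
```

The `min_len` parameter mirrors the "at least N nucleotides" language above; it is an illustrative knob, not a value taken from the disclosure.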
[0063] In any embodiment dsDNA fragments can also have a 3' and/or 5' overhang sequence that contains a variable sequence that overlaps by at least 3, or at least 4, or at least 5, or at least 6, or at least 7, or at least 8 nucleotides, or by at least 10 nucleotides, or by more than 8 nucleotides with other (paired) dsDNA fragments. Thus, a first dsDNA fragment can have a variable sequence on the 3' and/or 5' overhang that overlaps with that of a paired dsDNA fragment on its 5' or 3' overhang. A second dsDNA fragment can have a 3' and/or 5' overhang sequence that contains a variable sequence that overlaps with that of its paired dsDNA fragment, or an additional dsDNA fragment. The 3' and/or 5' overhangs can be produced by restriction endonuclease action on a dsDNA molecule, and can also be synthesized separately and provided to any of the reactions. In any step of the methods dsDNA fragments can have 3' and/or 5' overhang sequences that are part of the variable sequence, and which can be used to anneal and combine them with one or more other dsDNA fragments at their variable sequences.
Degenerate Nucleotides
[0064] One or more of the at least two oligonucleotides and the anchor strands used in the methods can, optionally, have one or more degenerate nucleotides. Degenerate nucleotide refers to degenerate positions in the oligo sequence. In any embodiment the degenerate nucleotides can be present within and as part of the variable sequence of the anchor strand or other oligonucleotide in the methods. A degenerate nucleotide in an oligo is a nucleotide that can be any of A, C, T, or G, i.e. a nucleotide position in a library member that has been randomized. Randomization can be performed by simply supplying all four bases during oligo synthesis, producing a randomized oligonucleotide. However, in some embodiments degenerate nucleotides can be a universal base. Universal bases such as deoxy-inosine, 2-deoxyinosine, nitroindole, 2'-deoxynebularine, 3-nitropyrrole, dP, dK, or other universal bases can be used to reduce degeneracy. 3-nitropyrrole 2'-deoxynucleoside and 5-nitroindole 2'-deoxynucleoside can also be used as degenerate bases. An oligo having one or more degenerate nucleotides is a degenerate oligonucleotide. Degenerate oligos can be co-located at the same (degenerate oligonucleotide) location in an oligo library. Degenerate oligos can thus be present at a location as a group of slightly different sequences, with each degenerate oligo having a distinct sequence due to the degenerate nucleotides, yet all co-located at the same location. In some embodiments degenerate nucleotides on one oligo can anneal to nucleotides of a variable sequence on another (target) oligo, such as is depicted in Figures 1-2. In various embodiments any of the oligonucleotides in a binding set can have degenerate nucleotides in its variable sequence. In some embodiments at least one oligonucleotide in a binding set has degenerate nucleotides. In some embodiments only the anchor strand has a variable sequence with degenerate nucleotides. With reference to Figures 1-2 the anchor strands are depicted as having degenerate nucleotides (designated "N") in the variable sequence. A binding set is a group of oligonucleotides that bind to one another in the methods. Thus, 01-03 is a binding set, as is 04-06. In some embodiments oligos of a binding set can bind substantially to one another (i.e. not merely by a small amount). In another embodiment oligos of a binding set bind to each other without mismatched base pairs. In one embodiment at least one oligo of a binding set binds completely to one or two other oligos of the binding set, i.e. with no unmatched bases. A "target" oligo is a second oligo that a first oligo is intended to bind to in the methods.
[0065] One or more anchor strands in a method can have 3 or 4 or 5 or 6 or 7 or 8 or 3-5 or 3-6 or 3-7 or 3-8 or 4-5 or 4-6 or 4-7 or 4-8 or 6-10 or more than 8 or more than 10 or more than 12 degenerate nucleotides in its variable sequence. The one or more degenerate nucleotides in an oligo can be present as one consecutive sequence to comprise a degenerate sequence, or the degenerate nucleotides can be separated singly or in groups of two or more consecutive degenerate nucleotides throughout the oligo (e.g. an anchor strand). In some embodiments degenerate nucleotides are present only within a variable sequence of the oligos, or only within the variable sequence of the anchor strand(s).
[0066] Degenerate oligonucleotides present at a location in the oligo library have multiple sequences at the location and can be grouped together and considered as one member of the library. For example, an anchor oligo (or other oligo) having, for example, five degenerate nucleotides can have 1024 possible sequences (4x4x4x4x4=1024), but all 1024 sequences can be co-located at a single defined location in the library.
A location in the oligo library containing the multiple sequences of degenerate oligonucleotides is termed a degenerate oligo location. Multiple degenerate oligonucleotides (each of slightly different sequence) can be co-located at a single location in the oligo library. While in some embodiments all possible sequences of a degenerate oligonucleotide are provided at the same location (e.g. all 1024 possible sequences of a degenerate oligo having 5 degenerate nucleotides), in other embodiments multiple degenerate oligonucleotides can be located in groups of convenient numbers at multiple different locations in the oligo library. Degenerate nucleotides therefore allow the user to greatly reduce the number of positions in the oligo library.
[0067] Thus, while oligos having one or more degenerate nucleotides can be co-located together at a single defined location in the library, oligos having variable sequences with no degenerate nucleotides can each have their own defined location in the library, i.e. a separate location for each sequence. An oligonucleotide having one or more degenerate nucleotides can be co-located at a single location with all possible sequences of the oligo for each degenerate position present at the single location.
[0068] For illustration, consider anchor strand 03 in Figure 1 having ten variable nucleotides, including six degenerate nucleotide locations. Ten nucleotides of variable sequence would normally require over 1 million locations (4 to the 10th power equals 1,048,576), but 03 has six degenerate nucleotides. Thus, 03 with 4 non-degenerate nucleotides can be present at locations L1-L256 for 03 (4x4x4x4) in the library, with each location containing the specific sequence for the non-degenerate portion of the variable sequence. And each of the 256 locations can have a group of oligos that provide all possible sequences for the degenerate nucleotides, D1 ... D4096 (4x4x4x4x4x4 or 4096). Thus, degenerate sequences D1-D4096 can all be present at each of variable locations L1-L256 for 03, with each location having a distinct sequence for the non-degenerate positions on the sequence, and oligos of all possible sequences at the degenerate positions. Thus, at location L1 for the example 03 can be SEQ ID NO: 3 NNNACTCNNN (V1), plus degenerate sequences D1-D4096 having the non-degenerate portion of variable sequence, and all possible sequences for the degenerate nucleotides in each location. At location L2 for 03, degenerate sequences D1-D4096 will all have non-degenerate sequence V2. At location L3 for 03 degenerate sequences D1-D4096 will all have non-degenerate sequence V3, and so on. Thus, degenerate sequences D1-D4096 for 03 are all present at locations L1-L256 for 03, with each degenerate sequence having the non-degenerate portion of the variable sequence. All anchor strands will thus contain the same set portion of the sequence, but all will vary in sequence at the degenerate nucleotides. Thus, the library can have 256 locations for 03 in this example.
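The arithmetic in this illustration can be reproduced by expanding a pattern in which "N" marks a degenerate position. The sketch below uses the NNNACTCNNN pattern from the text; the function names `expand_degenerate` and `locations_needed` are hypothetical and assume "N" stands for any of A, C, G, or T.

```python
from itertools import product

BASES = "ACGT"

def expand_degenerate(pattern):
    """List every concrete sequence matching a pattern where 'N' is any base."""
    choices = [BASES if ch == "N" else ch for ch in pattern]
    return ["".join(p) for p in product(*choices)]

def locations_needed(pattern):
    """Library locations needed: one per setting of the non-degenerate positions."""
    fixed_positions = sum(1 for ch in pattern if ch != "N")
    return 4 ** fixed_positions

pattern = "NNNACTCNNN"  # four non-degenerate and six degenerate positions
print(len(expand_degenerate(pattern)))  # degenerate sequences co-located at one location
print(locations_needed(pattern))        # locations for the anchor strand
print(4 ** len(pattern))                # locations if no position were degenerate
```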
Oligonucleotide Library
[0069] The invention also provides methods of synthesizing a product DNA molecule from a library of oligonucleotide members. The library of oligonucleotide members can have fewer than 10,000 or fewer than 5,000 oligonucleotide members, and the oligonucleotide members in the library are sufficient to assemble any possible polynucleotide sequence. The method involves assembling oligonucleotide members from the library to obtain the product DNA molecule.
[0070] With reference to Figure 1 all oligonucleotides, 01-06, are members in the library. 01-06 can each have one or more variable sequences. The oligonucleotide library can be comprised on any one or more of a DNA chip, solid support, solid phase, bead, microfluidic surface, plate, etc, or other structure where oligonucleotides can be stored at defined locations and be available for retrieval and use in the methods. In some embodiments the library will contain a distinct location for each of the possible variable sequences of 01-06.
[0071] When 01-02 and 04-05 each have a variable region of 5 variable nucleotides, the number of locations needed to accommodate the possible sequences of each oligo is 4 to the 5th power, i.e. 4x4x4x4x4 equals 1,024. Thus, in some embodiments there is a defined oligo sequence at each of 4,096 defined locations (4x1,024), with a single or unique defined variable sequence for 01-02 and 04-05 present at each location. For example, 01 oligos can have five variable nucleotides and thus 1,024 possible sequences, which can be present at 1,024 defined locations for 01 with a single defined variable sequence at each location, and similarly for 02 and 04-05.
[0072] Adding anchor strands 03 and 06 in this example, each anchor strand has four non-degenerate nucleotides, and six degenerate nucleotides. Thus, the library can also have 256 locations for each of 03 and 06 to accommodate the non-degenerate nucleotides, with each location having a distinct sequence for non-degenerate nucleotides. Additionally each of the 256 locations can have all possible degenerate sequences, thus 4,096 degenerate oligo sequences are present together at each of the 256 locations for the set nucleotides of the variable sequence. This example thus gives a total of only 4,608 distinct locations in the entire library (4x1024 + 2x256=4,608), from which one can assemble all possible DNA sequences. Even doubling the library size to achieve parallel sequence gives only 9,216 members.
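The library-size total in this example can be recomputed directly. The helper below is a hypothetical sketch; its parameter names are illustrative and simply mirror the quantities named in the example (four fully defined oligos with five variable nucleotides, two anchor strands with four non-degenerate nucleotides).

```python
def library_locations(n_defined_oligos, variable_len, n_anchors, anchor_fixed_len):
    """Distinct library locations: 4**length per oligo, counting only
    the non-degenerate positions of each anchor strand."""
    defined = n_defined_oligos * 4 ** variable_len
    anchors = n_anchors * 4 ** anchor_fixed_len
    return defined + anchors

total = library_locations(n_defined_oligos=4, variable_len=5,
                          n_anchors=2, anchor_fixed_len=4)
print(total)      # 4x1024 + 2x256
print(total * 2)  # doubled library

# For comparison: anchors with all ten variable positions individually addressed.
print(library_locations(4, 5, 2, 10))
```

The last line shows why the degenerate positions matter: without them, the two anchor strands alone would require more than two million locations.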
[0073] A location in the oligo library can be a well of a plate, a tube, or any other structure that segregates an oligonucleotide member in a distinct location, spatially separated from other members of the library sufficiently for it to be accessed individually and as a species at this distinct location.
[0074] The oligos can be maintained in their distinct locations as a single molecule (from which a complementary sequence can be synthesized) or as multiple copies of the same molecule (from which a small volume can be taken and used in synthesis procedures).
The distinct locations can be identifiable to a software program that can be configured with a mechanical component or device that retrieves library members from the distinct location for use in a method of the invention where the defined oligonucleotide library member is required. In one embodiment an oligo library can be located in a collection of assay plates or small tubes, each containing a member of the oligo library, and to which instrumentation components can go and retrieve an oligo library member according to software instructions, which can be located on a non-transitory computer-readable medium. The non-transitory computer readable medium can also contain programmed instructions and/or steps for synthesizing a product DNA molecule according to any of the methods disclosed herein, and the programmed instructions and/or steps can be provided to an instrument in communication with the computer-readable medium. The programmed instructions or steps can direct the instrumentation to perform the assembly of a DNA molecule of pre-defined sequence according to any method disclosed herein, or to perform any of the methods provided herein.
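One way software could make library locations identifiable is to compute a plate and well address from the variable sequence itself. The sketch below is purely hypothetical: the 384-well layout (rows A-P, columns 1-24), the base ordering, and the function names are assumptions made for illustration and are not specified in the disclosure.

```python
import string

ROWS = string.ascii_uppercase[:16]   # rows A-P of an assumed 384-well plate
COLS = 24
WELLS_PER_PLATE = len(ROWS) * COLS   # 384
BASE_VALUE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed ordering

def sequence_to_index(seq):
    """Read the sequence as a base-4 number so each sequence gets a unique index."""
    index = 0
    for base in seq:
        index = index * 4 + BASE_VALUE[base]
    return index

def locate(seq):
    """Return (plate number, well name) for a variable sequence."""
    plate, offset = divmod(sequence_to_index(seq), WELLS_PER_PLATE)
    row, col = divmod(offset, COLS)
    return plate, f"{ROWS[row]}{col + 1}"

print(locate("AAAAA"))
print(locate("TTTTT"))
```

With this assumed layout, the 1,024 five-nucleotide sequences of one oligo occupy under three plates, and a retrieval instrument would only need the sequence to find the well.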
[0075] Members of the oligonucleotide library are present at distinct locations, spatially separated from other members of the library. Thus, a member of the library can be a specific sequence present at its location (either singly or multiple copies).
When degenerate sequences are used, the member of the library containing degenerate sequences can be all possible degenerate sequences (or in some embodiments a subset of all possible sequences) in view of the number of degenerate nucleotides, and present at a distinct location.
[0076] In some embodiments some of the possible sequences are not of interest. Thus, only a subset of all possible degenerate sequences need be present at the distinct location to assemble all possible sequences of interest. In any embodiment the distinct location can be defined by any suitable technique, for example reference points in a microscopic picture or grid of the solid support containing the oligo library. In some embodiments the distinct location can be stored on and/or communicated by a non-transitory computer-readable medium.
DNA with Overhangs
[0077] The product DNA molecules can be assembled if desired into larger product dsDNA molecules. In some embodiments the product dsDNA molecules will be double-stranded blunt end DNA. DNA molecules can be synthesized so that the variable sequences between product dsDNA molecules contain an overlapping sequence. The dsDNA can be digested with a restriction endonuclease that cleaves within the variable sequence of each dsDNA and leaves overhang sequences or "sticky ends" in the dsDNA fragments remaining.
These overhang sequences can then be used to assemble the dsDNA fragments into a larger DNA molecule through annealing to complementary 3' and/or 5' sequences on additional dsDNA fragments. In other embodiments the product DNA molecule can be synthesized having single-stranded overhang sequences of one or more nucleotides, or of 4 nucleotides or 5 nucleotides or 6 nucleotides or 7 nucleotides or 8 nucleotides, or a single-stranded overhang of more than 8 nucleotides.
Restriction Recognition Sites
[0078] Type IIS restriction enzymes cleave DNA at a defined distance from their recognition site and leave a 5' or 3' single-stranded overhang. The recognition site can be programmed to lie outside of the variable sequence, and the cleavage site can be programmed to lie within the variable sequence, leaving 3' and/or 5' overhangs on the resulting dsDNA fragments. Type IIS restriction endonucleases also find application in the invention for producing additional dsDNA fragments having single-stranded overhangs. The single-stranded overhangs can be present at the 3' and/or 5' ends, depending on where in the molecule the dsDNA fragment is to be positioned relative to other fragments. dsDNA molecules can be programmed to have active recognition sites on the 3' and 5' sides of the dsDNA molecule and on both sides of the variable sequence. The dsDNA molecules can also be programmed to have cleavage sites towards the 5' and 3' ends of the variable sequence. dsDNA fragments can be joined by annealing dsDNA fragments having complementary overhanging 3' or 5' sequences and ligating to form a longer DNA molecule. Multiple additional dsDNA fragments having 3' and 5' overhangs can be annealed to dsDNA fragments having complementary 3' or 5' overhangs. Thus, dsDNA fragments at any step can be annealed to a plurality of additional dsDNA fragments to more rapidly advance the size of the variable sequence of the product dsDNA molecule. In this hierarchical fashion a dsDNA molecule can be synthesized having a 100 bp variable sequence or larger.
[0079] In any embodiment the restriction enzyme utilized in the invention can be a Type IIS restriction enzyme. In one embodiment the Type IIS restriction enzyme is one that only cleaves dsDNA. In one embodiment Type IIS restriction sites can be encoded into the conserved flanking sequences, as illustrated in Figure 1. Any Type IIS restriction enzyme can be utilized. In various embodiments the restriction sites can be BsmBI sites, or EciI sites, or BspMI sites, or FauI sites, etc. BsmBI recognizes the sequence 5'-CGTCTC(N)-3' (SEQ ID NO: 16). The enzyme generally cleaves to the 3' side of N. BsaI is another Type IIS restriction enzyme, which recognizes the sequence 5'-GGTCTC(N1)-3' (SEQ ID NO: 17) and generally cleaves to the 3' side of N. Persons of ordinary skill with resort to this disclosure will realize many other Type IIS restriction enzymes can be utilized in the invention. Such persons can also easily determine where the enzymes will cut in any particular application. In any embodiment of the methods disclosed herein any of the DNA molecules utilized in or produced by the methods can contain one or more Type IIS restriction endonuclease recognition sites.
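Where a Type IIS enzyme will cut can be worked out from the recognition site position. The sketch below models a BsaI-like cut on the top strand one base past the GGTCTC site, as stated above. The 4-nucleotide stagger of the bottom-strand cut is an assumption taken from commonly published BsaI cleavage positions rather than from this disclosure, and the function names and test sequence are hypothetical.

```python
SITE = "GGTCTC"  # BsaI recognition sequence named in the text
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def cut_after_site(top_strand):
    """Cut at the first forward site; return (left, right, overhang) or None.

    The top strand is cut one base (N) past the site. The bottom strand is
    assumed to be cut 4 bases further on, leaving a 4-nt 5' overhang.
    """
    pos = top_strand.find(SITE)
    if pos < 0:
        return None
    top_cut = pos + len(SITE) + 1
    bottom_cut = top_cut + 4  # assumed stagger, see lead-in
    if bottom_cut > len(top_strand):
        return None
    return top_strand[:top_cut], top_strand[top_cut:], top_strand[top_cut:bottom_cut]

def overhangs_anneal(overhang_a, overhang_b):
    """True if two overhangs, each written 5' to 3', are fully complementary."""
    return overhang_a == reverse_complement(overhang_b)

print(cut_after_site("AAGGTCTCAGATCCCC"))
print(overhangs_anneal("AGTC", "GACT"))
```

Because the recognition site sits entirely in the left fragment, the overhang on the right fragment is made only of downstream sequence, which is how the cleavage site can fall within the variable sequence while the site itself stays in the flanking sequence.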
DNA Data Storage
[0080] DNA is stable even over periods of thousands of years and even in many extreme environments, giving it great advantages for storing information. Any of the methods disclosed herein can be applied to encoding digital data into DNA. One or more product DNA molecule(s) can have a sequence that comprises an encoded non-genetic message. One or more product DNA molecule(s) can have a sequence that corresponds to bytes of information that encode the non-genetic message. The bytes of information can be decoded with reference to a key that assigns one or more language character(s) to each encoded character or byte of information.
[0081] For example, as illustrated in Figure 5, a 16 bp DNA molecule can be synthesized and easily accommodates four bytes of information, where each byte is encoded by an assigned sequence of nucleotides. In this example a four nucleotide sequence represents a byte of information, which can correspond to a character (e.g. a letter or numeral). Because each of the four positions can hold any of the four bases, each byte can take one of 256 values (4x4x4x4), so up to 256 distinct characters can be assigned. Thus, the alphabet of any language in the world can be easily accommodated within these 256 values, together with a sufficient number of numerals and other characters utilized in communication. In various embodiments the message can be encoded in a reference language, such as English, French, German, Italian, Spanish, Latin, Japanese, Hindi, Chinese, Russian, or any language. A reference language can also include numbers and special characters, even though not formally part of the reference language. But any information can be encoded in the DNA sequence in any language.
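By way of illustration only, the following Python sketch encodes each byte as a four nucleotide sequence and decodes it again. The mapping A=0, C=1, G=2, T=3 is an assumption chosen for this sketch and is not the key shown in Figure 5; the function names are likewise hypothetical. The latin-1 text encoding is used only so that each character occupies exactly one byte.

```python
BASES = "ACGT"  # assumed digit order: A=0, C=1, G=2, T=3 (not the key of Figure 5)


def encode_byte(value):
    """Encode an integer 0..255 as four nucleotides (base-4, most significant first)."""
    if not 0 <= value <= 255:
        raise ValueError("value must fit in one byte")
    digits = []
    for _ in range(4):
        digits.append(BASES[value % 4])
        value //= 4
    return "".join(reversed(digits))


def decode_codon(codon):
    """Decode four nucleotides back to an integer 0..255."""
    value = 0
    for base in codon:
        value = value * 4 + BASES.index(base)
    return value


def encode_text(text):
    """Encode text as DNA, one byte (four nucleotides) per character."""
    return "".join(encode_byte(b) for b in text.encode("latin-1"))


def decode_dna(dna):
    """Decode a DNA string whose length is a multiple of four."""
    codons = [dna[i:i + 4] for i in range(0, len(dna), 4)]
    return bytes(decode_codon(c) for c in codons).decode("latin-1")


message = encode_text("DNA!")
print(len(message))         # prints: 16
print(decode_dna(message))  # prints: DNA!
```

Four characters occupy 16 nucleotides, matching the 16 bp, four byte molecule described above.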
[0082] The product DNA can also encode a character (e.g. a letter, a word, a number, a punctuation mark, word character, or other characters utilized in communication) indicating where in the sequence the information encoded by that DNA molecule is to be placed. Figure depicts 16 bp product DNA molecules having four bytes of four nucleotides each. The last byte in each product DNA sequence indicates the location in the message where the preceding three bytes are placed; this is conveniently a numeral but can be any character that can be placed into a definable sequence. While a 4 nucleotide byte provides up to 256 identifiers, the byte can be any convenient length of nucleotides. For example, bytes can be comprised of 5 nucleotides (allowing for 1,024 identifiers) or 6 nucleotides (allowing for 4,096 identifiers), or 7 or even 8 nucleotides, or more than 8 nucleotides, allowing for many more identifiers to be included.
Limited numbers of identifiers can also be expanded by placing DNA molecules in a single well up to the number of identifiers, and then assembling the messages from the DNA in the order of the sequence of wells. Using this method with only a 4 nucleotide identifier, even a single 384 well plate can contain over 98,000 DNA molecules (256 molecules x 384 wells = 98,304), which can be assembled in order to provide almost 300,000 bytes of information (three bytes per molecule, in addition to the identifier). When a five nucleotide identifier is used, 1,024 molecules can be individually identified per well, times 384 wells, i.e. over 393,000 molecules, or over 1 million bytes of information in a single plate. Multiple plates can be used to accommodate much greater amounts of information. Therefore, an unlimited amount of information can be encoded and stored indefinitely according to the methods.
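The plate capacities stated above can be checked with a short calculation. The following Python sketch is illustrative only; the function name and its parameters are hypothetical, and it assumes three data bytes per molecule as in the 16 bp example.

```python
def plate_capacity(identifier_nt, wells=384, data_bytes_per_molecule=3):
    """Return (molecules, data_bytes) for one plate.

    Each well holds one molecule per possible identifier, i.e. 4**identifier_nt.
    """
    molecules_per_well = 4 ** identifier_nt
    molecules = molecules_per_well * wells
    return molecules, molecules * data_bytes_per_molecule


print(plate_capacity(4))  # prints: (98304, 294912)
print(plate_capacity(5))  # prints: (393216, 1179648)
```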
[0083] Thus, the invention provides methods of storing data in a DNA sequence, which can involve determining a sequence of DNA that encodes a non-genetic message according to a coding scheme that can translate the non-genetic message from a reference language into a DNA sequence and vice versa; synthesizing the sequence of DNA
that encodes the non-genetic message according to a method disclosed herein; and thereby storing data in a DNA sequence. A coding scheme is a set of codes (e.g. 4 or 3 nucleotide codons, an example of which is shown in Figure 5) that assigns a particular character of a reference language to a particular codon. For example, the standard DNA codon table is a coding scheme, but it may be advantageous to use a coding scheme that is not easily transcribable.
Other examples of coding schemes are known to persons of ordinary skill in the art. After synthesis of dsDNA molecules according to any method described herein the dsDNA
molecules can be combined using additional DNA joining techniques known in the art to build a much larger dsDNA molecule, which contains the encoded information and can be stored indefinitely.
CRISPR Guide RNA
[0084] The invention can also be applied to the synthesis of guide RNAs (gRNA) for use in CRISPR-Cas9 methods. Using the methods any sequence of gRNA can be quickly constructed. Guide RNA constructs can also be constructed from oligonucleotides in the oligonucleotide library. A product DNA molecule can be synthesized in the methods having a DNA sequence that encodes an initial guide structure. The initial guide RNA
structure can encode a gRNA with the necessary prokaryotic or eukaryotic transcriptional elements for in vitro transcription in proper order, for example any one or more of a promoter, a sequence of gRNA, and a terminator. In some embodiments the gRNA can encode a Cas9-binding hairpin (Cas9 handle). In some embodiments the transcriptional elements include a promoter and/or a terminator. In some embodiments the product DNA molecule can encode 20 bases for the gRNA. Figure 6 depicts one embodiment in which a dsDNA molecule is synthesized into an initial guide structure having the transcriptional elements. In any of the methods disclosed herein the product dsDNA molecule can encode a guide structure or gRNA or other RNA molecule. Since all possible polynucleotide sequences can be assembled from the oligo library, any initial guide structure or gRNA or RNA can be assembled in the methods.
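By way of illustration only, the following Python sketch concatenates the elements of an initial guide structure in the order described above (promoter, 20 base gRNA sequence, Cas9 handle, terminator). The function name is hypothetical, and the promoter, handle and terminator strings in the example are placeholders rather than real sequences; actual element sequences would be supplied by the user.

```python
def build_guide_template(spacer, promoter, cas9_handle, terminator):
    """Assemble promoter + 20-base gRNA sequence + Cas9 handle + terminator.

    Only the 20-base gRNA sequence is validated; the other elements are
    passed through unchanged because their sequences are user-supplied.
    """
    spacer = spacer.upper()
    if len(spacer) != 20 or set(spacer) - set("ACGT"):
        raise ValueError("spacer must be 20 nucleotides of A, C, G or T")
    return promoter + spacer + cas9_handle + terminator


# Placeholder element strings, for illustration only (not real sequences).
template = build_guide_template(
    spacer="ACGTACGTACGTACGTACGT",
    promoter="<PROMOTER>",
    cas9_handle="<CAS9_HANDLE>",
    terminator="<TERMINATOR>",
)
print(template)
```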
Embodiments
[0085] In one embodiment the method involves annealing at least two oligonucleotides of about 30-60 nucleotides in length with an anchor strand about 30-70 nucleotides in length according to the methods disclosed herein.
[0086] In another embodiment the method involves annealing at least two oligonucleotides of about 40-50 nucleotides in length with an anchor strand about 40-50 nucleotides in length according to the methods disclosed herein.
[0087] In another embodiment the method involves annealing at least two oligonucleotides of about or about 40-50 nucleotides in length with an anchor strand about 40-60 nucleotides in length. In different embodiments the anchor strand can utilize 4-6 or 6 degenerate oligonucleotides.
[0088] In another embodiment the method involves annealing at least two oligonucleotides of about or about 40-50 nucleotides in length with an anchor strand about 45-55 nucleotides in length. In different embodiments the anchor strand can utilize 4-6 or 6 degenerate oligonucleotides.
Example 1 - Hierarchal Synthesis
[0089] This example shows the synthesis of a dsDNA molecule of desired sequence having a 100 base pair variable region in a hierarchal method.
[0090] The "L0" reaction included two oligonucleotides O1 and O2 (each 45 nucleotides), each of which had a variable sequence of 5 nucleotides, a conserved flanking sequence of about 20 nucleotides, and a primer binding site of about 20 nucleotides. The anchor strand O3 was programmed to have a variable sequence of 10 nucleotides and be 50 nucleotides in length. The oligonucleotides were selected so that the sequence produced by the O1-O3 synthesis (L0) would be a portion of the 100 nucleotide variable sequence of the pre-determined total dsDNA molecule, and would have a variable sequence of about 10 nucleotides. The oligonucleotides were also selected to encode a restriction site for BsaI, a Type IIS endonuclease, on the 5' side of the DNA molecule (for later ligation with a paired dsDNA molecule having an active recognition site on the 3' side of the DNA molecule).
[0091] A solution was prepared containing oligonucleotides O1-O2 (two oligonucleotides) and O3 (the anchor strand) (2 ul of pool at 100 pM). The oligonucleotides were placed into wells containing T4 DNA ligase buffer (0.5 ul), water (2.4 ul), and T4 DNA ligase (0.1 ul). The solution was incubated for 1 hour at 16 °C, then for 10 minutes at 65 °C.
[0092] After the step of ligation (L0) with T4 DNA ligase, a step of PCR amplification (PCR1) was performed using water (2 ul), tailed 5' and 3' primers (1 ul, 1 uM) directed to the universal primer binding sites, a high fidelity thermostable DNA polymerase (5 ul) (Q5®) (New England Biolabs, Inc., Ipswich, MA), and the L0 reaction product. The PCR protocol was as follows: 98 °C for 30 secs, then 30 cycles of 98 °C (10 secs), 50 °C (10 secs) and 65 °C (15 secs). An enzymatic purification was performed by adding 2 ul of a 10-fold diluted stock of Calf-Intestinal Phosphatase (CIP) + Exonuclease I ("CE") and incubating for 10 minutes at 37 °C. 10-fold diluted Proteinase K (2 ul) was then added and the mixture was incubated for 15 minutes at 37 °C, then 10 minutes at 95 °C. A purified 98 bp product was confirmed on a gel using a 4% EX E-Gel (ThermoFisher Corp., Waltham, MA). The product had a variable sequence of 10 nucleotides.
[0093] A digestion and ligation step DL1 was then performed. Water (2.3 ul), ligation buffer (0.5 ul), BsaI enzyme (0.1 ul), T4 DNA ligase (0.1 ul), and the PCR1 product were mixed together. The mixture was incubated for 1 minute at 37 °C followed by 1 minute at 16 °C and cycled 10 times. Finally, the mixture was held at 80 °C for 20 minutes. A step of PCR (PCR2) was then performed on the DL1 product in a mixture of water (2 ul), 5' and 3' primers (1 uM), the DNA polymerase above (5 ul), and then diluted 150x. PCR cycles and the CIP+CE and proteinase K treatments were performed as above. The dsDNA molecule produced had a variable sequence of 16 nucleotides.
[0094] An additional dsDNA fragment having a variable sequence of 16 nucleotides and a 4 bp overlap with the first dsDNA molecule was added from a parallel synthesis reaction and derived from a dsDNA molecule with a recognition site on the opposite side of the dsDNA molecule. Another digestion and ligation step (DL2) was performed on both dsDNA molecules using 2.3 ul water, 10x T4 ligation buffer (0.5 ul), BsaI (0.1 ul), T4 DNA ligase (0.1 ul), and 2 ul of the PCR2 product. The mixture was incubated for 1 minute at 37 °C followed by 1 minute at 16 °C and cycled 10 times. Finally the mixture was held at 80 °C for 20 minutes. A step of PCR (PCR3) was then performed on the DL2 product in a mixture of water (2 ul), 5' and 3' primers (1 uM), the DNA polymerase above (5 ul), and then diluted 150x. PCR cycles and the CE and proteinase K digestions were performed as in step 5 above. Amplification products were verified on a gel showing the presence of 88, 68, 68, and 88 bp products, which had a variable sequence of nucleotides. The 28mer containing dsDNA molecule was also produced so that it would have a Type IIS restriction site on one side of the molecule.
[0095] A digestion reaction was performed and the resulting dsDNA fragment was combined with dsDNA fragments from parallel reactions, one of which was a dsDNA fragment that was all variable sequence and derived from a digestion of a dsDNA molecule with recognition sites on both sides of the dsDNA, while maintaining the conserved flanking sequences from the 3' and 5' ends to allow for efficient ligation and to enable universal primers to be used in downstream PCR (for example illustrated in Figure 1B). A ligation step was performed on the dsDNA fragments (DL3) using 16.5 ul water, 10x T4 ligation buffer (2.5 ul), BsaI (0.5 ul), T4 DNA ligase (0.5 ul), and 5 ul of the pooled PCR3 product. The mixture was incubated for 1 minute at 37 °C followed by 1 minute at 16 °C and cycled 25 times. Finally the mixture was held at 80 °C for 20 minutes.
[0096] A step of PCR (PCR4) was then performed on the product in a mixture of water (6 ul), 5' and 3' primers (2 ul of 1 uM), the DNA polymerase above (10 ul), and 2 ul of the digestion and ligation product. PCR cycles and CE and proteinase K were performed as in step 5 above. Amplification products were verified on a gel showing the presence of a 180 bp product, which had a variable sequence of 100 nucleotides. The molecule was sequenced and found to have the correct sequence with no errors.
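As a worked check of the variable sequence lengths reported in this example, the following Python sketch assumes that fragments joined in a digestion and ligation step share a 4 nucleotide overlap at each junction, so that joining n fragments yields the sum of their variable lengths minus 4 x (n - 1). The function name is hypothetical, and the grouping of four 28 nucleotide fragments in the final step is an inference from the reported lengths rather than a statement made in the example.

```python
def joined_length(fragment_lengths, overlap=4):
    """Variable-sequence length after joining fragments that overlap at each junction."""
    n = len(fragment_lengths)
    if n == 0:
        raise ValueError("at least one fragment is required")
    return sum(fragment_lengths) - overlap * (n - 1)


after_dl1 = joined_length([10, 10])          # two 10 nt variable sequences
after_dl2 = joined_length([after_dl1] * 2)   # two 16 nt variable sequences
after_dl3 = joined_length([after_dl2] * 4)   # four 28 nt variable sequences (inferred grouping)
print(after_dl1, after_dl2, after_dl3)  # prints: 16 28 100
```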
Example 2 - Oligo Library
[0097] This example shows construction of a universal oligonucleotide library. Considerations in selecting a library included whether the flanking sequences would serve as robust universal priming sequences, and ensuring that the 5' and 3' flanking sequences were distinct enough that the PCR primer sequences would not cross-react in the PCR steps. A common feature in all the flanks was a Type IIS site, which was held constant within the flanking sequence and designed around. These sequences were generated by computational design but can also be generated manually.
[0098] Different flanking sequences were empirically selected by ordering approximately eight sequences from a commercial supplier and testing them directly in PCR.
The best flanking sequence set was then selected. The "flank set" was tested with the 5' and 3' primer pair, the 5' only, and the 3' only to ensure that the expected PCR
product would be generated.
[0099] After selecting the best flanking sequences, the variable sequences were added to the sequences. Note that all possible permutations of the variable bases were needed to be able to construct any DNA sequence. For example, if 5 variable bases were added to the 3' end of O1, there were 4 to the 5th power, or 1,024, different O1 sequences in separate microtiter wells, where 4 is the number of DNA bases available and 5 is the number of variable bases utilized in the O1 oligo. These variable sequences were generated by available computational design programs but can also be generated manually.
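By way of illustration only, the following Python sketch enumerates all variable sequences of a given length and attaches them to the 3' end of a constant portion. The function name and the placeholder constant portion are hypothetical; the design programs actually used are not identified in this example.

```python
from itertools import product


def variable_library(n_variable, constant_5prime=""):
    """Return every oligo formed by appending an n-base variable sequence to a constant portion."""
    return [constant_5prime + "".join(bases) for bases in product("ACGT", repeat=n_variable)]


library = variable_library(5)
print(len(library))             # prints: 1024
print(library[0], library[-1])  # prints: AAAAA TTTTT

# Because the list is ordered like a base-4 counter (A=0, C=1, G=2, T=3),
# a variable sequence maps directly to its position in the list.
print(library.index("AGGGA"))   # prints: 168
```

An ordered enumeration of this kind gives each library member a fixed index, which can then be mapped to a discrete well location.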
[0100] In the case of O1, the variable bases were added to the 3' end (e.g., 5 variable bases). In the case of O2, the variable bases were added to the 5' end (e.g., 5 variable bases). In the case of O3, the non-degenerate bases of the variable sequence were added to the central part of the oligo (e.g., 4 non-degenerate bases) to support the ligation of O1 and O2 at their abutting interfaces, and were then surrounded by degenerate N bases, as these bases prevent the unnecessary expansion of the library. The degenerate N bases were synthesized on the oligo synthesizer by combining all four DNA bases for the N position, thus an O3 anchor oligo was a mixture of sequences. For example, if a single O3 anchor had a total of six N positions there would be a total of 4 to the 6th power, or 4,096, different molecules within a single library well. Not all the molecules in this library well were viable O3 anchors for the ligation of O1 + O2, but only a fraction of the 4,096 molecules were needed to support a robust L0 ligation.
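By way of illustration only, the following Python sketch expands a degenerate anchor pattern into the individual sequences present in one library well. It uses the pattern of SEQ ID NO: 3 (NNNACTCNNN) from the sequence listing as its example and checks that the reverse complement of SEQ ID NO: 7 is one member of the mixture. The function names are hypothetical, and the pairing of these particular SEQ ID NOs is an assumption made for this sketch.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def expand_degenerate(pattern):
    """Expand every N in the pattern to A, C, G and T; other bases are kept fixed."""
    choices = ["ACGT" if base == "N" else base for base in pattern.upper()]
    return ["".join(bases) for bases in product(*choices)]


anchors = expand_degenerate("NNNACTCNNN")  # SEQ ID NO: 3
print(len(anchors))                        # prints: 4096

target = reverse_complement("AGGGAGTTGC")  # SEQ ID NO: 7
print(target, target in anchors)           # prints: GCAACTCCCT True
```

Only the four central bases are fixed, so a single synthesized well covers 4,096 sequences without each one occupying its own position in the library.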
[0101] The oligos that made up the library were then synthesized in microtiter plate format in such a way that all oligo members had a discrete well location within the library.
The wells were in single micro-tubes or microtiter plate formats of 96 and 384-wells, but they can be any format that allows for the physical separation of library oligo members. The location of each member was precisely known and could be accessed when the oligo components were pooled together, either manually or by laboratory liquid handling automation.
[0102] When synthesizing a sequence, for example a 100 bp sequence that is a portion of a specific gene, the following steps were followed:
[0103] Three oligos (O1, O2 & O3) were pooled into a single well and these oligos corresponded to the first 10 bp (bases 1 to 10) of the 100 bp sequence in this example.
[0104] Three more oligos were then pooled (i.e., the next set of O1, O2 & O3) into an adjacent well. These oligos constituted another 10 bp but overlapped the first 10 bp in [0103] above by 4 bp, which constituted bases 7 to 16 of the 100 bp sequence in this example.
[0105] This process was repeated until there were enough starting pools to make the entire 100 bp. In this example, there were a total of 16 starting pools, in which each pool overlapped its neighbor by 4 bp.
[0106] After all the pools were established in the reaction wells, the process of synthesis was started.
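The pooling steps above can be sketched as a tiling calculation. The following Python sketch is illustrative only; the function name is hypothetical. It assumes 10 bp windows that each share 4 bp with the preceding window, and reports 1-based, inclusive base positions.

```python
def tile_windows(total_length=100, window=10, overlap=4):
    """Return (first_base, last_base) for each starting pool, 1-based and inclusive."""
    step = window - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the window")
    if (total_length - window) % step != 0:
        raise ValueError("windows do not tile the sequence exactly")
    return [(start, start + window - 1)
            for start in range(1, total_length - window + 2, step)]


pools = tile_windows()
print(len(pools))           # prints: 16
print(pools[0], pools[1])   # prints: (1, 10) (7, 16)
print(pools[-1])            # prints: (91, 100)
```

With these parameters, 16 windows of 10 bp with fifteen 4 bp overlaps cover 16 x 10 - 15 x 4 = 100 bp, in agreement with the 16 starting pools of this example.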
Library #   O1 Assembly   O2 Assembly   O3 Assembly   O1 Assembly   O2 Assembly   O3 Assembly   Total
Grand Total --> 9216
[0107] Table 1: This table shows the number of oligo members in an entire library set that were needed to build any 10 -> 16 -> 28 -> 100 bp DNA fragment. The total number of oligo members needed was 9216.
Library #   O1 Assembly   O2 Assembly   O3 Assembly   O1 Assembly   O2 Assembly   O3 Assembly
1           45(5)         45(5)         50(4)         45(5)         45(5)         50(4)
2           45(5)         45(5)         50(4)         45(5)         45(5)         50(4)
[0108] Table 2: This table shows the nucleotide lengths for each of the oligo members within the entire library set. The length of the non-degenerate nucleotides of the variable sequence is shown in parenthesis.
Sequences
[0109] SEQ ID NO: 1, DNA, artificial sequence AGGGA
[0110] SEQ ID NO: 2, DNA, artificial sequence CGTTG
[0111] SEQ ID NO: 3, DNA, artificial sequence NNNACTCNNN
[0112] SEQ ID NO: 4, DNA, artificial sequence TTGCG
[0113] SEQ ID NO: 5, DNA, artificial sequence TAGCG
[0114] SEQ ID NO: 6, DNA, artificial sequence NNNTACGNNN
[0115] SEQ ID NO: 7, DNA, artificial sequence AGGGAGTTGC
[0116] SEQ ID NO: 8, DNA, artificial sequence TTGCGTAGCG
[0117] SEQ ID NO: 9, DNA, artificial sequence AGGGAG
[0118] SEQ ID NO: 10, DNA, artificial sequence TTGC
[0119] SEQ ID NO: 11, DNA, artificial sequence GCAACTCCCT
[0120] SEQ ID NO: 12, DNA, artificial sequence TTGCGTAGCG
[0121] SEQ ID NO: 13, DNA, artificial sequence CGCTAC
[0122] SEQ ID NO: 14, DNA, artificial sequence GCAA
[0123] SEQ ID NO: 15, DNA, artificial sequence AGGGAGTTGCGTAGCG
[0124] SEQ ID NO: 16, DNA, BsmBI recognition site, Bacillus stearothermophilus CGTCTC(N)
[0125] SEQ ID NO: 17, DNA, BsaI recognition site, Bacillus stearothermophilus GGTCTC(N)
[0126] Although the invention has been described with reference to the presently preferred embodiment, it should be understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the following claims.
Claims (21)
PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. A method of synthesizing a DNA molecule having a desired sequence comprising:
a) annealing at least two oligonucleotides to an anchor strand so that the at least two oligonucleotides annealed to the anchor strand abut one another on the anchor strand;
wherein the at least two oligonucleotides each comprise a universal primer binding site on a 3' or 5' end, and a variable sequence on the opposing 5' or 3' end, and a conserved flanking sequence in between the universal primer binding site and the variable sequence; and wherein the anchor strand comprises conserved flanking sequences complementary to those on the at least two oligonucleotides, and further comprises at least one variable sequence, wherein at least a portion of the at least one variable sequence on the anchor strand is complementary to at least a portion of the variable sequences on the at least two oligonucleotides;
b) ligating the at least two oligonucleotides annealed to the anchor strand to produce a first dsDNA molecule;
c) performing an amplification step on the first dsDNA molecule having a desired sequence and comprising, a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the conserved flanking sequences.
2. The method of claim 1 further comprising contacting the first dsDNA
molecule with a restriction endonuclease to produce first dsDNA fragments comprising 3' and/or 5' overhang sequences comprising a portion of the variable sequence from the first dsDNA
molecule, providing at least one additional dsDNA fragment comprising a 3' and/or 5' overhang sequence that is at least partially complementary to an overhang sequence of at least one of the first dsDNA fragments;
annealing the first dsDNA fragments and at least one additional dsDNA fragment by the 3' and/or 5' overhang sequences; and ligating the annealed dsDNA fragments to produce a second dsDNA molecule comprising a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the 3' and 5' conserved flanking sequences that is longer than the variable sequence on the first dsDNA molecule; and optionally wherein the at least one additional dsDNA fragment is the product of a parallel DNA synthesis reaction.
3. The method of claim 2 further comprising contacting the at least one second dsDNA
molecule with a restriction endonuclease to produce a plurality of second dsDNA fragments comprising 3' and/or 5' overhang sequences and a conserved flanking sequence inside each of the 3' or 5' ends;
providing at least one additional dsDNA fragment comprising a 3' and/or 5' overhang sequence that is at least partially complementary to an overhang sequence of at least one of the second dsDNA fragments;
annealing the plurality of second dsDNA fragments to the one or more additional dsDNA
fragment(s) by the 3' and/or 5' overhang sequence(s); and performing a step of ligation to produce a third dsDNA molecule comprising a conserved flanking sequence on the 3' and 5' ends, and a variable sequence inside the conserved flanking sequences that is longer than the variable sequence of the second dsDNA
molecule;
optionally wherein the at least one additional dsDNA fragment is the product of a parallel DNA synthesis reaction.
4. The method of claim 3 further comprising reacting the at least one third dsDNA molecule with a restriction endonuclease to produce a plurality of third dsDNA fragments comprising 3' and/or 5' overhang sequences and a conserved flanking sequence inside each of the 3' or 5' ends;
providing at least one additional dsDNA fragment comprising a 3' and/or 5' overhang sequence that is at least partially complementary to an overhang sequence of at least one of the third dsDNA fragments;
annealing the plurality of third dsDNA fragments to the one or more additional dsDNA fragment(s) by the 3' and/or 5' overhang sequence(s); and
performing a step of ligation to produce a fourth dsDNA molecule comprising a conserved flanking sequence on the 3' and 5' ends, and a variable sequence inside the conserved flanking sequences that is longer than the variable sequence of the third dsDNA molecule;
optionally wherein the at least one additional dsDNA fragment is the product of a parallel DNA synthesis reaction.
5. The method of claim 1 wherein step a) further comprises annealing at least two paired oligonucleotides to a paired anchor strand so that the at least two paired oligonucleotides bound to the paired anchor strand abut one another on the paired anchor strand, wherein the at least two paired oligonucleotides comprise a universal primer binding site on a 3' or 5' end, and a variable sequence on the opposing 5' or 3' end, and a conserved flanking sequence in between the universal primer binding site and the variable sequence;
and wherein the paired anchor strand comprises conserved flanking sequences complementary to those on the at least two paired oligonucleotides, and further comprises at least one variable sequence, and wherein a portion of the variable sequence on the paired anchor strand overlaps with a portion of the variable sequence on the first anchor strand,
d) ligating the at least two paired oligonucleotides annealed to the anchor strand;
e) performing an amplification step to produce a paired dsDNA molecule of desired sequence and comprising a universal primer binding site at a 3' and 5' end, a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the conserved flanking sequences that partially overlaps with the variable sequence of the first dsDNA molecule.
6. The method of claim 5 further comprising contacting the first dsDNA molecule and the paired dsDNA molecule with a restriction endonuclease to produce at least one dsDNA fragment and at least one paired dsDNA fragment, each comprising at least one 3' and/or 5' overhang sequence; and
wherein at least a portion of a 3' or 5' overhang sequence from the first dsDNA fragment is complementary to at least a portion of a 5' or 3' overhang sequence from the paired dsDNA fragment,
annealing the at least one first and paired dsDNA fragments by their complementary overhang sequences and performing a step of ligation to produce a second dsDNA molecule comprising a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the 3' and 5' conserved flanking sequences that is longer than the variable sequence on the respective first dsDNA molecules.
7. The method of claim 6 further comprising contacting the at least one second dsDNA molecule and an at least one paired second dsDNA molecule with a restriction endonuclease to produce a plurality of second dsDNA fragments and paired second dsDNA fragments, each comprising a 3' and/or 5' overhang sequence(s), wherein at least two of the plurality comprise a conserved flanking sequence inside each of the 3' or 5' ends; and
wherein at least a portion of the 3' or 5' overhang sequence from a second dsDNA fragment is complementary to at least a portion of the 5' or 3' overhang sequence from a paired second dsDNA fragment,
annealing the second and paired second dsDNA fragments by their complementary overhang sequences; and
performing a step of ligation to produce a third dsDNA molecule comprising a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the 3' and 5' conserved flanking sequences that is longer than the variable sequence on the second dsDNA molecules.
8. The method of claim 7 further comprising contacting the at least one third dsDNA molecule and an at least one paired third dsDNA molecule with a restriction endonuclease to produce a plurality of third dsDNA fragments and paired third dsDNA fragments, each comprising a 3' and/or 5' overhang sequence(s), wherein at least two of the plurality comprise a conserved flanking sequence inside each of the 3' or 5' ends; and
wherein at least a portion of the 3' or 5' overhang sequence from a third dsDNA fragment is complementary to at least a portion of the 5' or 3' overhang sequence from a paired third dsDNA fragment,
annealing the third and paired third dsDNA fragments by their complementary overhang sequences; and
performing a step of ligation to produce a fourth dsDNA molecule comprising a conserved flanking sequence inside each of the 3' and 5' ends, and a variable sequence inside the 3' and 5' conserved flanking sequences that is longer than the variable sequence on the third dsDNA molecule.
9. The method of any one of claims 1-8 wherein the first dsDNA molecule comprises a variable sequence of 8-12 base pairs; or wherein the paired dsDNA molecule comprises a variable sequence of 8-12 base pairs; or the method of any one of claims 2-4 or 6-8 wherein the second dsDNA molecule comprises a variable sequence of 14-18 base pairs; or the method of any one of claims 3-4 or 7-8 wherein the third dsDNA molecule comprises a variable sequence of 24-32 base pairs; or the method of claim 8 wherein the fourth dsDNA molecule comprises a variable sequence of 90-110 base pairs; or the method of any one of claims 1-8 wherein the at least two oligonucleotides have a variable sequence of 4-6 nucleotides.
10. The method of any one of claims 1-9 wherein the anchor strands comprise the sequences complementary to the conserved flanking sequences on the at least two oligonucleotides on the 3' and 5' ends.
11. The method of any one of claims 1-10 wherein the amplification step is performed by the polymerase chain reaction (PCR).
12. The method of any one of claims 1-12 wherein the variable sequence is equal to the lengths of the variable sequences on the at least two oligonucleotides; or
wherein the anchor strand comprises a variable sequence present in between the two sequences complementary to the conserved flanking sequences on the at least two oligonucleotides; or
wherein the at least two oligonucleotides bound to the anchor strand abut one another on the anchor strand at their variable sequences; or
wherein the portion of the variable sequence on the anchor strand that is complementary to the conserved flanking sequence on the at least two oligonucleotides comprises 2-6 nucleotides;
or
13. The method of any one of claims 1-12 wherein the at least two oligonucleotides and anchor strand further comprise a recognition site for a restriction endonuclease, optionally wherein the restriction endonuclease is a Type IIS endonuclease.
14. The method of any one of claims 2-4 wherein the at least one additional dsDNA fragment is from a parallel synthesis reaction.
15. The method of any one of claims 1-14 wherein the anchor strands comprise 4-degenerate nucleotides.
16. The method of claim 15 wherein the degenerate nucleotides comprise a universal or randomized base.
17. A composition comprising at least two oligonucleotides, each comprising a universal primer binding site on a 3' or 5' end, and a variable sequence on the opposing 5' or 3' end, and a conserved flanking sequence in between the universal primer binding site and the variable sequence; and wherein an anchor strand comprising sequences complementary to the conserved flanking sequences on the at least two oligonucleotides, and further comprising at least one variable sequence, wherein at least a portion of the at least one variable sequence on the anchor strand is complementary to at least a portion of the variable sequences on the at least two oligonucleotides.
18. The composition of claim 17 wherein the anchor strand comprises sequences complementary to the conserved flanking sequences on the at least two oligonucleotides at its 3' and 5' ends.
19. The composition of any one of claims 17-18 wherein the anchor strand comprises the variable sequence in between the two sequences complementary to the conserved flanking sequence.
20. A method of storing data in a DNA sequence comprising:
determining a sequence of DNA that encodes a non-genetic message according to a coding scheme that translates the non-genetic message from a reference language into a DNA sequence and vice versa;
synthesizing the sequence of DNA that encodes the non-genetic message according to the method of any one of claims 1-16; and thereby store data in a DNA sequence.
21. A method of synthesizing a DNA sequence encoding a guide RNA comprising:
determining a sequence of DNA that encodes a guide RNA;
synthesizing the sequence of DNA that encodes the guide RNA according to any one of claims 1-16.
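The coding scheme recited in claim 20, which translates a non-genetic message into a DNA sequence and back, can be illustrated with a minimal sketch. The claims do not specify any particular scheme; the two-bits-per-base mapping below, and the names `BITS_TO_BASE`, `encode_message`, and `decode_sequence`, are assumptions chosen only for illustration.

```python
# Hypothetical coding scheme (not taken from the claims):
# each pair of bits in the message maps to one nucleotide.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_message(message: str) -> str:
    """Translate a text message into a DNA sequence (4 bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in message.encode("utf-8"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_sequence(sequence: str) -> str:
    """Translate a DNA sequence produced by encode_message back into text."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    dna = encode_message("Hi")
    print(dna)                   # CAGACGGC
    print(decode_sequence(dna))  # Hi
```

Under this assumed scheme, the sequence returned by `encode_message` would be the "sequence of DNA that encodes a non-genetic message" handed to the synthesis method. A practical scheme would likely add constraints, such as avoiding long homopolymer runs, that this sketch omits.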
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2021/059422 WO2023086109A1 (en) | 2021-11-15 | 2021-11-15 | Methods of synthesizing nucleic acid molecules |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CA3238653A1 (en) | 2023-05-19 |
Family
ID=86336336
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CA3238653A Pending CA3238653A1 (en) | 2021-11-15 | 2021-11-15 | Methods of synthesizing nucleic acid molecules |
Country Status (5)
| Country | Link |
|---|---|
| EP (1) | EP4433605A4 (en) |
| CN (1) | CN118355127A (en) |
| AU (1) | AU2021473848A1 (en) |
| CA (1) | CA3238653A1 (en) |
| WO (1) | WO2023086109A1 (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20250346886A1 (en) * | 2024-05-09 | 2025-11-13 | Telesis Bio, Inc. | Methods of Synthesizing Nucleic Acid Molecules |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9187777B2 (en) * | 2010-05-28 | 2015-11-17 | Gen9, Inc. | Methods and devices for in situ nucleic acid synthesis |
| US20160215316A1 (en) * | 2015-01-22 | 2016-07-28 | Genomic Expression Aps | Gene synthesis by self-assembly of small oligonucleotide building blocks |
| WO2018111914A1 (en) * | 2016-12-14 | 2018-06-21 | Synthetic Genomics, Inc. | Methods for assembling dna molecules |
| JP7742355B2 (en) * | 2020-03-03 | 2025-09-19 | コーデックス ディーエヌエー インコーポレイテッド | Methods for assembling nucleic acids |
2021
- 2021-11-15 AU AU2021473848A patent/AU2021473848A1/en active Pending
- 2021-11-15 CN CN202180104210.3A patent/CN118355127A/en active Pending
- 2021-11-15 EP EP21964265.9A patent/EP4433605A4/en active Pending
- 2021-11-15 WO PCT/US2021/059422 patent/WO2023086109A1/en not_active Ceased
- 2021-11-15 CA CA3238653A patent/CA3238653A1/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| WO2023086109A1 (en) | 2023-05-19 |
| CN118355127A (en) | 2024-07-16 |
| EP4433605A1 (en) | 2024-09-25 |
| AU2021473848A1 (en) | 2024-06-20 |
| EP4433605A4 (en) | 2025-01-08 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US20240368658A1 (en) | Demand Synthesis of Polynucleotide Sequences | |
| CA2707436C (en) | Copy dna and sense rna | |
| US20080108804A1 (en) | Method for modifying RNAS and preparing DNAS from RNAS | |
| CN107075513A (en) | The oligonucleotides of separation and its purposes in nucleic acid sequencing | |
| CN109593757B (en) | Probe and method for enriching target region by using same and applicable to high-throughput sequencing | |
| US11279927B2 (en) | Compositions and methods for constructing strand specific CDNA libraries | |
| CN109312391B (en) | Method for generating single-stranded circular DNA library for single-molecule sequencing | |
| CN112941635A (en) | Second-generation sequencing library building kit and method for improving library conversion rate | |
| CN113015813A (en) | Sequencing algorithm | |
| CA3238653A1 (en) | Methods of synthesizing nucleic acid molecules | |
| US20250346886A1 (en) | Methods of Synthesizing Nucleic Acid Molecules | |
| US20230151402A1 (en) | Methods of synthesizing nucleic acid molecules | |
| CN113817803B (en) | Library construction method for small RNA carrying modification and application thereof | |
| EP3749779B1 (en) | Library preparation | |
| WO2024096856A1 (en) | Methods of synthesizing nucleic acid molecules | |
| JP2018500936A (en) | Bubble primer | |
| CN117701677A (en) | Small RNA library building method for reducing linker dimer and application | |
| WO2018081666A1 (en) | Methods of single dna/rna molecule counting | |
| US20060183123A1 (en) | Polymerase-based protocols for the introduction of combinatorial deletions... | |
| JP2020096564A (en) | RNA detection method, RNA detection nucleic acid, and RNA detection kit | |
| WO2025076733A1 (en) | Method for rapidly constructing nucleic acid library | |
| CN107354148A (en) | A kind of method for efficiently building storehouse for minim DNA | |
| HK1240263B (en) | Isolated oligonucleotide and use thereof in nucleic acid sequencing | |
| HK1168627B (en) | Dna (deoxyribonucleic acid) index library building method based on pcr (polymerase chain reaction) | |
| HK1168393B (en) | Method for accurate detection of whole genome methylation sites by utilizing trace genome dna (deoxyribonucleic acid) |