AU2005201777A1 - A Method for Direct Nucleic Acid Sequencing - Google Patents
A Method for Direct Nucleic Acid Sequencing
- Publication number
- AU2005201777A1
- Authority
- AU
- Australia
- Prior art keywords
- dna
- reaction
- polymerase
- sequencing
- nucleotide
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
- 238000000034 method Methods 0.000 title claims description 97
- 238000012163 sequencing technique Methods 0.000 title description 64
- 102000039446 nucleic acids Human genes 0.000 title description 46
- 108020004707 nucleic acids Proteins 0.000 title description 46
- 150000007523 nucleic acids Chemical class 0.000 title description 46
- 125000003729 nucleotide group Chemical group 0.000 claims description 94
- 239000002773 nucleotide Substances 0.000 claims description 81
- 230000000903 blocking effect Effects 0.000 claims description 11
- 238000002372 labelling Methods 0.000 claims description 10
- 230000004913 activation Effects 0.000 claims description 2
- 238000006243 chemical reaction Methods 0.000 description 88
- 108020004414 DNA Proteins 0.000 description 52
- 108090000623 proteins and genes Proteins 0.000 description 25
- 239000000523 sample Substances 0.000 description 25
- 238000001514 detection method Methods 0.000 description 24
- 102000004190 Enzymes Human genes 0.000 description 22
- 108090000790 Enzymes Proteins 0.000 description 22
- 238000010348 incorporation Methods 0.000 description 22
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 20
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 20
- 125000005647 linker group Chemical group 0.000 description 20
- 239000013615 primer Substances 0.000 description 20
- ZMXDDKWLCZADIW-UHFFFAOYSA-N N,N-Dimethylformamide Chemical compound CN(C)C=O ZMXDDKWLCZADIW-UHFFFAOYSA-N 0.000 description 18
- BFMYDTVEBKDAKJ-UHFFFAOYSA-L disodium;(2',7'-dibromo-3',6'-dioxido-3-oxospiro[2-benzofuran-1,9'-xanthene]-4'-yl)mercury;hydrate Chemical compound O.[Na+].[Na+].O1C(=O)C2=CC=CC=C2C21C1=CC(Br)=C([O-])C([Hg])=C1OC1=C2C=C(Br)C([O-])=C1 BFMYDTVEBKDAKJ-UHFFFAOYSA-L 0.000 description 18
- 230000002068 genetic effect Effects 0.000 description 15
- 238000000492 total internal reflection fluorescence microscopy Methods 0.000 description 15
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 14
- 238000002866 fluorescence resonance energy transfer Methods 0.000 description 14
- 239000012634 fragment Substances 0.000 description 14
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 12
- 210000004027 cell Anatomy 0.000 description 12
- 239000003153 chemical reaction reagent Substances 0.000 description 12
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 11
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 11
- 238000004458 analytical method Methods 0.000 description 11
- 102000053602 DNA Human genes 0.000 description 10
- 239000000872 buffer Substances 0.000 description 10
- 230000001419 dependent effect Effects 0.000 description 10
- 239000011521 glass Substances 0.000 description 10
- XEKOWRVHYACXOJ-UHFFFAOYSA-N Ethyl acetate Chemical compound CCOC(C)=O XEKOWRVHYACXOJ-UHFFFAOYSA-N 0.000 description 9
- 201000010099 disease Diseases 0.000 description 9
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 9
- 230000005284 excitation Effects 0.000 description 9
- 230000035772 mutation Effects 0.000 description 9
- 239000000758 substrate Substances 0.000 description 9
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 8
- 102100034343 Integrase Human genes 0.000 description 8
- 206010028980 Neoplasm Diseases 0.000 description 8
- 230000003321 amplification Effects 0.000 description 8
- 238000005286 illumination Methods 0.000 description 8
- 238000003199 nucleic acid amplification method Methods 0.000 description 8
- 239000012071 phase Substances 0.000 description 8
- 102000004169 proteins and genes Human genes 0.000 description 8
- 210000001519 tissue Anatomy 0.000 description 8
- 230000004888 barrier function Effects 0.000 description 7
- 230000015572 biosynthetic process Effects 0.000 description 7
- PXHVJJICTQNCMI-UHFFFAOYSA-N nickel Substances [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 7
- -1 nucleoside triphosphate Chemical class 0.000 description 7
- 238000003752 polymerase chain reaction Methods 0.000 description 7
- 239000002096 quantum dot Substances 0.000 description 7
- 238000002198 surface plasmon resonance spectroscopy Methods 0.000 description 7
- UHOVQNZJYSORNB-UHFFFAOYSA-N Benzene Chemical compound C1=CC=CC=C1 UHOVQNZJYSORNB-UHFFFAOYSA-N 0.000 description 6
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 6
- 108091034117 Oligonucleotide Proteins 0.000 description 6
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 6
- 230000000295 complement effect Effects 0.000 description 6
- 239000000975 dye Substances 0.000 description 6
- 238000001962 electrophoresis Methods 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- 238000002474 experimental method Methods 0.000 description 6
- 230000003287 optical effect Effects 0.000 description 6
- 238000002360 preparation method Methods 0.000 description 6
- 230000010076 replication Effects 0.000 description 6
- 239000007787 solid Substances 0.000 description 6
- 239000000243 solution Substances 0.000 description 6
- 238000013459 approach Methods 0.000 description 5
- 230000000694 effects Effects 0.000 description 5
- 230000006870 function Effects 0.000 description 5
- 239000000499 gel Substances 0.000 description 5
- 230000003993 interaction Effects 0.000 description 5
- 229940046166 oligodeoxynucleotide Drugs 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 238000012216 screening Methods 0.000 description 5
- 239000001226 triphosphate Substances 0.000 description 5
- 235000011178 triphosphate Nutrition 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Chemical compound O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- VHYFNPMBLIVWCW-UHFFFAOYSA-N 4-Dimethylaminopyridine Chemical compound CN(C)C1=CC=NC=C1 VHYFNPMBLIVWCW-UHFFFAOYSA-N 0.000 description 4
- 102100037084 C4b-binding protein alpha chain Human genes 0.000 description 4
- 241000252212 Danio rerio Species 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- 108010093488 His-His-His-His-His-His Proteins 0.000 description 4
- JUJWROOIHBZHMG-UHFFFAOYSA-N Pyridine Chemical compound C1=CC=NC=C1 JUJWROOIHBZHMG-UHFFFAOYSA-N 0.000 description 4
- 108091028664 Ribonucleotide Proteins 0.000 description 4
- BLRPTPMANUNPDV-UHFFFAOYSA-N Silane Chemical compound [SiH4] BLRPTPMANUNPDV-UHFFFAOYSA-N 0.000 description 4
- 229960002685 biotin Drugs 0.000 description 4
- 235000020958 biotin Nutrition 0.000 description 4
- 239000011616 biotin Substances 0.000 description 4
- 238000004624 confocal microscopy Methods 0.000 description 4
- 239000008367 deionised water Substances 0.000 description 4
- 229910021641 deionized water Inorganic materials 0.000 description 4
- 238000011161 development Methods 0.000 description 4
- 230000018109 developmental process Effects 0.000 description 4
- 208000000283 familial pityriasis rubra pilaris Diseases 0.000 description 4
- 239000002184 metal Substances 0.000 description 4
- 229910052751 metal Inorganic materials 0.000 description 4
- 238000001393 microlithography Methods 0.000 description 4
- MGFYIUFZLHCRTH-UHFFFAOYSA-N nitrilotriacetic acid Chemical compound OC(=O)CN(CC(O)=O)CC(O)=O MGFYIUFZLHCRTH-UHFFFAOYSA-N 0.000 description 4
- 239000002777 nucleoside Substances 0.000 description 4
- 238000011160 research Methods 0.000 description 4
- 239000002336 ribonucleotide Substances 0.000 description 4
- 125000002652 ribonucleotide group Chemical group 0.000 description 4
- 229910000077 silane Inorganic materials 0.000 description 4
- 230000003595 spectral effect Effects 0.000 description 4
- KZNICNPSHKQLFF-UHFFFAOYSA-N succinimide Chemical compound O=C1CCC(=O)N1 KZNICNPSHKQLFF-UHFFFAOYSA-N 0.000 description 4
- 238000003786 synthesis reaction Methods 0.000 description 4
- VDZOOKBUILJEDG-UHFFFAOYSA-M tetrabutylammonium hydroxide Chemical compound [OH-].CCCC[N+](CCCC)(CCCC)CCCC VDZOOKBUILJEDG-UHFFFAOYSA-M 0.000 description 4
- 239000011534 wash buffer Substances 0.000 description 4
- 238000001712 DNA sequencing Methods 0.000 description 3
- 241000196324 Embryophyta Species 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 3
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 3
- 241000282414 Homo sapiens Species 0.000 description 3
- 108060004795 Methyltransferase Proteins 0.000 description 3
- 108091028043 Nucleic acid sequence Proteins 0.000 description 3
- 108020004682 Single-Stranded DNA Proteins 0.000 description 3
- 241000589500 Thermus aquaticus Species 0.000 description 3
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 3
- 238000007792 addition Methods 0.000 description 3
- 239000002299 complementary DNA Substances 0.000 description 3
- 150000001875 compounds Chemical class 0.000 description 3
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 3
- 238000000609 electron-beam lithography Methods 0.000 description 3
- 230000002255 enzymatic effect Effects 0.000 description 3
- 239000005090 green fluorescent protein Substances 0.000 description 3
- 238000009396 hybridization Methods 0.000 description 3
- 238000003384 imaging method Methods 0.000 description 3
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 3
- 230000010354 integration Effects 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 230000001404 mediated effect Effects 0.000 description 3
- 239000002245 particle Substances 0.000 description 3
- 238000006303 photolysis reaction Methods 0.000 description 3
- 230000015843 photosynthesis, light reaction Effects 0.000 description 3
- 230000037452 priming Effects 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 239000000047 product Substances 0.000 description 3
- 239000011535 reaction buffer Substances 0.000 description 3
- 230000006798 recombination Effects 0.000 description 3
- 230000002441 reversible effect Effects 0.000 description 3
- 241000894007 species Species 0.000 description 3
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 2
- YGTNHTPZQUQMKP-UHFFFAOYSA-N 1-[(1e)-1-diazoethyl]-4,5-dimethoxy-2-nitrobenzene Chemical compound COC1=CC(C(C)=[N+]=[N-])=C([N+]([O-])=O)C=C1OC YGTNHTPZQUQMKP-UHFFFAOYSA-N 0.000 description 2
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 2
- 241000713838 Avian myeloblastosis virus Species 0.000 description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 2
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 2
- 108020004635 Complementary DNA Proteins 0.000 description 2
- 108050006400 Cyclin Proteins 0.000 description 2
- HMFHBZSHGGEWLO-SOOFDHNKSA-N D-ribofuranose Chemical compound OC[C@H]1OC(O)[C@H](O)[C@@H]1O HMFHBZSHGGEWLO-SOOFDHNKSA-N 0.000 description 2
- 239000003155 DNA primer Substances 0.000 description 2
- 241000701533 Escherichia virus T4 Species 0.000 description 2
- 108010070675 Glutathione transferase Proteins 0.000 description 2
- 102000005720 Glutathione transferase Human genes 0.000 description 2
- 241000713772 Human immunodeficiency virus 1 Species 0.000 description 2
- 208000026350 Inborn Genetic disease Diseases 0.000 description 2
- 241001465754 Metazoa Species 0.000 description 2
- 241001274216 Naso Species 0.000 description 2
- 102000009339 Proliferating Cell Nuclear Antigen Human genes 0.000 description 2
- PYMYPHUHKUWMLA-LMVFSUKVSA-N Ribose Natural products OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PYMYPHUHKUWMLA-LMVFSUKVSA-N 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- 108010090804 Streptavidin Proteins 0.000 description 2
- 108010020764 Transposases Proteins 0.000 description 2
- 102000008579 Transposases Human genes 0.000 description 2
- HMFHBZSHGGEWLO-UHFFFAOYSA-N alpha-D-Furanose-Ribose Natural products OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 2
- 230000004075 alteration Effects 0.000 description 2
- 238000000137 annealing Methods 0.000 description 2
- 239000007864 aqueous solution Substances 0.000 description 2
- 238000003491 array Methods 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 230000001580 bacterial effect Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 238000010876 biochemical test Methods 0.000 description 2
- 210000004369 blood Anatomy 0.000 description 2
- 239000008280 blood Substances 0.000 description 2
- 239000008366 buffered solution Substances 0.000 description 2
- 201000011510 cancer Diseases 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 239000002738 chelating agent Substances 0.000 description 2
- 239000007795 chemical reaction product Substances 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 238000001816 cooling Methods 0.000 description 2
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 239000005547 deoxyribonucleotide Substances 0.000 description 2
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 238000003818 flash chromatography Methods 0.000 description 2
- 239000007850 fluorescent dye Substances 0.000 description 2
- 208000016361 genetic disease Diseases 0.000 description 2
- 230000007614 genetic variation Effects 0.000 description 2
- 238000003205 genotyping method Methods 0.000 description 2
- RWSXRVCMGQZWBV-WDSKDSINSA-N glutathione Chemical compound OC(=O)[C@@H](N)CCC(=O)N[C@@H](CS)C(=O)NCC(O)=O RWSXRVCMGQZWBV-WDSKDSINSA-N 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 238000010438 heat treatment Methods 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 230000000977 initiatory effect Effects 0.000 description 2
- 150000002500 ions Chemical class 0.000 description 2
- 239000010410 layer Substances 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 229920002521 macromolecule Polymers 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 238000005459 micromachining Methods 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 239000003068 molecular probe Substances 0.000 description 2
- 239000002105 nanoparticle Substances 0.000 description 2
- VOFUROIFQGPCGE-UHFFFAOYSA-N nile red Chemical compound C1=CC=C2C3=NC4=CC=C(N(CC)CC)C=C4OC3=CC(=O)C2=C1 VOFUROIFQGPCGE-UHFFFAOYSA-N 0.000 description 2
- 230000036961 partial effect Effects 0.000 description 2
- 230000002085 persistent effect Effects 0.000 description 2
- XHXFXVLFKHQFAL-UHFFFAOYSA-N phosphoryl trichloride Chemical compound ClP(Cl)(Cl)=O XHXFXVLFKHQFAL-UHFFFAOYSA-N 0.000 description 2
- 238000000206 photolithography Methods 0.000 description 2
- 102000054765 polymorphisms of proteins Human genes 0.000 description 2
- UMJSCPRVCHMLSP-UHFFFAOYSA-N pyridine Natural products COC1=CC=CN=C1 UMJSCPRVCHMLSP-UHFFFAOYSA-N 0.000 description 2
- 238000006862 quantum yield reaction Methods 0.000 description 2
- 230000005855 radiation Effects 0.000 description 2
- 238000005215 recombination Methods 0.000 description 2
- 230000002829 reductive effect Effects 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 230000003252 repetitive effect Effects 0.000 description 2
- 229920006395 saturated elastomer Polymers 0.000 description 2
- 230000035945 sensitivity Effects 0.000 description 2
- 238000004557 single molecule detection Methods 0.000 description 2
- 230000004936 stimulating effect Effects 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 229960002317 succinimide Drugs 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- FPGGTKZVZWFYPV-UHFFFAOYSA-M tetrabutylammonium fluoride Chemical compound [F-].CCCC[N+](CCCC)(CCCC)CCCC FPGGTKZVZWFYPV-UHFFFAOYSA-M 0.000 description 2
- HWCKGOZZJDHMNC-UHFFFAOYSA-M tetraethylammonium bromide Chemical compound [Br-].CC[N+](CC)(CC)CC HWCKGOZZJDHMNC-UHFFFAOYSA-M 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 230000005945 translocation Effects 0.000 description 2
- UNXRWKVEANCORM-UHFFFAOYSA-N triphosphoric acid Chemical compound OP(O)(=O)OP(O)(=O)OP(O)(O)=O UNXRWKVEANCORM-UHFFFAOYSA-N 0.000 description 2
- 230000003612 virological effect Effects 0.000 description 2
- XINQFOMFQFGGCQ-UHFFFAOYSA-L (2-dodecoxy-2-oxoethyl)-[6-[(2-dodecoxy-2-oxoethyl)-dimethylazaniumyl]hexyl]-dimethylazanium;dichloride Chemical compound [Cl-].[Cl-].CCCCCCCCCCCCOC(=O)C[N+](C)(C)CCCCCC[N+](C)(C)CC(=O)OCCCCCCCCCCCC XINQFOMFQFGGCQ-UHFFFAOYSA-L 0.000 description 1
- OWEGMIWEEQEYGQ-UHFFFAOYSA-N 100676-05-9 Natural products OC1C(O)C(O)C(CO)OC1OCC1C(O)C(O)C(O)C(OC2C(OC(O)C(O)C2O)CO)O1 OWEGMIWEEQEYGQ-UHFFFAOYSA-N 0.000 description 1
- 229960000549 4-dimethylaminophenol Drugs 0.000 description 1
- XVMSFILGAMDHEY-UHFFFAOYSA-N 6-(4-aminophenyl)sulfonylpyridin-3-amine Chemical compound C1=CC(N)=CC=C1S(=O)(=O)C1=CC=C(N)C=N1 XVMSFILGAMDHEY-UHFFFAOYSA-N 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- 241001156002 Anthonomus pomorum Species 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 241000724268 Bromovirus Species 0.000 description 1
- MUKWPASLDRXCFX-UHFFFAOYSA-N CCCC[NH+](CCCC)CCCC.CCCC[NH+](CCCC)CCCC.CCCC[NH+](CCCC)CCCC.CCCC[NH+](CCCC)CCCC.CCCC[NH+](CCCC)CCCC.[O-]P([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O Chemical compound CCCC[NH+](CCCC)CCCC.CCCC[NH+](CCCC)CCCC.CCCC[NH+](CCCC)CCCC.CCCC[NH+](CCCC)CCCC.CCCC[NH+](CCCC)CCCC.[O-]P([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O MUKWPASLDRXCFX-UHFFFAOYSA-N 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 108010077544 Chromatin Proteins 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 208000027205 Congenital disease Diseases 0.000 description 1
- 208000029767 Congenital, Hereditary, and Neonatal Diseases and Abnormalities Diseases 0.000 description 1
- GUBGYTABKSRVRQ-WFVLMXAXSA-N DEAE-cellulose Chemical compound OC1C(O)C(O)C(CO)O[C@H]1O[C@@H]1C(CO)OC(O)C(O)C1O GUBGYTABKSRVRQ-WFVLMXAXSA-N 0.000 description 1
- 102000004594 DNA Polymerase I Human genes 0.000 description 1
- 108010017826 DNA Polymerase I Proteins 0.000 description 1
- 102000016559 DNA Primase Human genes 0.000 description 1
- 108010092681 DNA Primase Proteins 0.000 description 1
- 102000004214 DNA polymerase A Human genes 0.000 description 1
- 108010076551 DNA polymerase C Proteins 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 102000016911 Deoxyribonucleases Human genes 0.000 description 1
- 108010053770 Deoxyribonucleases Proteins 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 description 1
- 241000701867 Enterobacteria phage T7 Species 0.000 description 1
- 101000906736 Escherichia phage Mu DNA circularization protein N Proteins 0.000 description 1
- 108060002716 Exonuclease Proteins 0.000 description 1
- 241000193385 Geobacillus stearothermophilus Species 0.000 description 1
- 108010024636 Glutathione Proteins 0.000 description 1
- 241000711557 Hepacivirus Species 0.000 description 1
- 108010025076 Holoenzymes Proteins 0.000 description 1
- 101900297506 Human immunodeficiency virus type 1 group M subtype B Reverse transcriptase/ribonuclease H Proteins 0.000 description 1
- 206010020772 Hypertension Diseases 0.000 description 1
- 241000714216 Levivirus Species 0.000 description 1
- GUBGYTABKSRVRQ-PICCSMPSSA-N Maltose Natural products O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1O[C@@H]1[C@@H](CO)OC(O)[C@H](O)[C@H]1O GUBGYTABKSRVRQ-PICCSMPSSA-N 0.000 description 1
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- 241000713869 Moloney murine leukemia virus Species 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- 206010061309 Neoplasm progression Diseases 0.000 description 1
- 101100436871 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) atp-3 gene Proteins 0.000 description 1
- VEQPNABPJHWNSG-UHFFFAOYSA-N Nickel(2+) Chemical compound [Ni+2] VEQPNABPJHWNSG-UHFFFAOYSA-N 0.000 description 1
- 240000007019 Oxalis corniculata Species 0.000 description 1
- 241000709664 Picornaviridae Species 0.000 description 1
- 208000006994 Precancerous Conditions Diseases 0.000 description 1
- 238000001069 Raman spectroscopy Methods 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- BQCADISMDOOEFD-UHFFFAOYSA-N Silver Chemical compound [Ag] BQCADISMDOOEFD-UHFFFAOYSA-N 0.000 description 1
- 101710142606 Sliding clamp Proteins 0.000 description 1
- 101710119335 Sliding-clamp-loader large subunit Proteins 0.000 description 1
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical class [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 1
- 101000764570 Streptomyces phage phiC31 Probable tape measure protein Proteins 0.000 description 1
- 101710137500 T7 RNA polymerase Proteins 0.000 description 1
- 241000205180 Thermococcus litoralis Species 0.000 description 1
- 241000589499 Thermus thermophilus Species 0.000 description 1
- 102100036407 Thioredoxin Human genes 0.000 description 1
- 241000723848 Tobamovirus Species 0.000 description 1
- 241000710141 Tombusvirus Species 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 241000269457 Xenopus tropicalis Species 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 238000000862 absorption spectrum Methods 0.000 description 1
- 229960000583 acetic acid Drugs 0.000 description 1
- 150000008065 acid anhydrides Chemical class 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 210000004381 amniotic fluid Anatomy 0.000 description 1
- 244000037640 animal pathogen Species 0.000 description 1
- 230000002547 anomalous effect Effects 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 108091000831 antigen binding proteins Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 108091008324 binding proteins Proteins 0.000 description 1
- 230000008033 biological extinction Effects 0.000 description 1
- 238000010804 cDNA synthesis Methods 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 230000003197 catalytic effect Effects 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 238000002144 chemical decomposition reaction Methods 0.000 description 1
- 125000003636 chemical group Chemical group 0.000 description 1
- 210000003483 chromatin Anatomy 0.000 description 1
- 230000001684 chronic effect Effects 0.000 description 1
- 238000003759 clinical diagnosis Methods 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 238000010549 co-Evaporation Methods 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 238000004440 column chromatography Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 238000009223 counseling Methods 0.000 description 1
- 238000011840 criminal investigation Methods 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 1
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000004925 denaturation Methods 0.000 description 1
- 230000036425 denaturation Effects 0.000 description 1
- 238000010511 deprotection reaction Methods 0.000 description 1
- 238000001212 derivatisation Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 239000005546 dideoxynucleotide Substances 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- 238000010494 dissociation reaction Methods 0.000 description 1
- 230000005593 dissociations Effects 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 230000005672 electromagnetic field Effects 0.000 description 1
- 210000001671 embryonic stem cell Anatomy 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000032050 esterification Effects 0.000 description 1
- 238000005886 esterification reaction Methods 0.000 description 1
- 238000005530 etching Methods 0.000 description 1
- 125000001495 ethyl group Chemical group [H]C([H])([H])C([H])([H])* 0.000 description 1
- 230000005281 excited state Effects 0.000 description 1
- 102000013165 exonuclease Human genes 0.000 description 1
- GNBHRKFJIUUOQI-UHFFFAOYSA-N fluorescein Chemical compound O1C(=O)C2=CC=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 GNBHRKFJIUUOQI-UHFFFAOYSA-N 0.000 description 1
- 238000001917 fluorescence detection Methods 0.000 description 1
- 238000000799 fluorescence microscopy Methods 0.000 description 1
- 238000002189 fluorescence spectrum Methods 0.000 description 1
- 230000004907 flux Effects 0.000 description 1
- 238000004374 forensic analysis Methods 0.000 description 1
- 238000011842 forensic investigation Methods 0.000 description 1
- 238000005194 fractionation Methods 0.000 description 1
- 230000002538 fungal effect Effects 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 230000014509 gene expression Effects 0.000 description 1
- 238000007429 general method Methods 0.000 description 1
- 230000004077 genetic alteration Effects 0.000 description 1
- 231100000118 genetic alteration Toxicity 0.000 description 1
- 238000011331 genomic analysis Methods 0.000 description 1
- 239000012362 glacial acetic acid Substances 0.000 description 1
- 229960003180 glutathione Drugs 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 239000010931 gold Substances 0.000 description 1
- 229910052737 gold Inorganic materials 0.000 description 1
- 150000004820 halides Chemical class 0.000 description 1
- 238000004128 high performance liquid chromatography Methods 0.000 description 1
- 229910052739 hydrogen Inorganic materials 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 238000007654 immersion Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 238000012177 large-scale sequencing Methods 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 239000007791 liquid phase Substances 0.000 description 1
- 108010026228 mRNA guanylyltransferase Proteins 0.000 description 1
- 230000000873 masking effect Effects 0.000 description 1
- 230000028161 membrane depolarization Effects 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 238000000386 microscopy Methods 0.000 description 1
- 238000013508 migration Methods 0.000 description 1
- 230000005012 migration Effects 0.000 description 1
- 108091005573 modified proteins Proteins 0.000 description 1
- 102000035118 modified proteins Human genes 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 230000004001 molecular interaction Effects 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 229910052759 nickel Inorganic materials 0.000 description 1
- 229910001453 nickel ion Inorganic materials 0.000 description 1
- 239000012299 nitrogen atmosphere Substances 0.000 description 1
- 150000003833 nucleoside derivatives Chemical class 0.000 description 1
- 230000005257 nucleotidylation Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 239000012044 organic layer Substances 0.000 description 1
- 230000001590 oxidative effect Effects 0.000 description 1
- 244000052769 pathogen Species 0.000 description 1
- 238000010827 pathological analysis Methods 0.000 description 1
- NMHMNPHRMNGLLB-UHFFFAOYSA-N phloretic acid Chemical compound OC(=O)CCC1=CC=C(O)C=C1 NMHMNPHRMNGLLB-UHFFFAOYSA-N 0.000 description 1
- 230000035479 physiological effects, processes and functions Effects 0.000 description 1
- 230000006461 physiological response Effects 0.000 description 1
- 239000013612 plasmid Substances 0.000 description 1
- 230000003234 polygenic effect Effects 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 102000040430 polynucleotide Human genes 0.000 description 1
- 108091033319 polynucleotide Proteins 0.000 description 1
- 239000002157 polynucleotide Substances 0.000 description 1
- 125000006239 protecting group Chemical group 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 238000010791 quenching Methods 0.000 description 1
- 230000000171 quenching effect Effects 0.000 description 1
- 238000002708 random mutagenesis Methods 0.000 description 1
- 239000000376 reactant Substances 0.000 description 1
- 230000007420 reactivation Effects 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000008929 regeneration Effects 0.000 description 1
- 238000011069 regeneration method Methods 0.000 description 1
- 230000029865 regulation of blood pressure Effects 0.000 description 1
- 238000002165 resonance energy transfer Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- PYWVYCXTNDRMGF-UHFFFAOYSA-N rhodamine B Chemical compound [Cl-].C=12C=CC(=[N+](CC)CC)C=C2OC2=CC(N(CC)CC)=CC=C2C=1C1=CC=CC=C1C(O)=O PYWVYCXTNDRMGF-UHFFFAOYSA-N 0.000 description 1
- 108091092562 ribozyme Proteins 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 230000009758 senescence Effects 0.000 description 1
- 238000011896 sensitive detection Methods 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000010008 shearing Methods 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 230000000392 somatic effect Effects 0.000 description 1
- 238000004611 spectroscopical analysis Methods 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
- 230000000638 stimulation Effects 0.000 description 1
- 238000003756 stirring Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000001360 synchronised effect Effects 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 239000010409 thin film Substances 0.000 description 1
- 238000004809 thin layer chromatography Methods 0.000 description 1
- 108060008226 thioredoxin Proteins 0.000 description 1
- 229940094937 thioredoxin Drugs 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000001052 transient effect Effects 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- IMFACGCPASFAPR-UHFFFAOYSA-N tributylamine Chemical compound CCCCN(CCCC)CCCC IMFACGCPASFAPR-UHFFFAOYSA-N 0.000 description 1
- WVLBCYQITXONBZ-UHFFFAOYSA-N trimethyl phosphate Chemical compound COP(=O)(OC)OC WVLBCYQITXONBZ-UHFFFAOYSA-N 0.000 description 1
- 125000002264 triphosphate group Chemical class [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- 230000005751 tumor progression Effects 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 230000004304 visual acuity Effects 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- 238000012070 whole genome sequencing analysis Methods 0.000 description 1
Landscapes
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Description
AUSTRALIA
Patents Act 1990
COMPLETE SPECIFICATION FOR A STANDARD PATENT
Name of Applicant: ASM Scientific, Inc.
Address for Service: CULLEN CO., Level 26, 239 George Street, Brisbane Qld 4000
Invention Title: A Method for Direct Nucleic Acid Sequencing
The following statement is a full description of the invention, including the best method of performing it, known to us:
FIELD OF THE INVENTION
The present invention relates to methods for sequencing nucleic acid samples. More specifically, the present invention relates to methods for sequencing without the need for amplification; prior knowledge of some of the nucleotide sequence to generate the sequencing primers; and the labor-intensive electrophoresis techniques.
BACKGROUND OF THE INVENTION
The sequencing of nucleic acid samples is an important analytical technique in modern molecular biology. The development of reliable methods for DNA sequencing has been crucial for understanding the function and control of genes and for applying many of the basic techniques of molecular biology. These methods have also become increasingly important as tools in genomic analysis and many non-research applications, such as genetic identification, forensic analysis, genetic counseling, medical diagnostics and many others. In these latter applications, both techniques providing partial sequence information, such as fingerprinting and sequence comparisons, and techniques providing full sequence determination have been employed. See, Gibbs et al., Proc. Natl. Acad. Sci. USA 86: 1919-1923 (1989); Gyllensten et al., Proc. Natl. Acad. Sci. USA 85: 7652-7656 (1988); Carrano et al., Genomics 4: 129-136 (1989); Caetano-Annoles et al., Mol. Gen. Genet. 235: 157-165 (1992); Brenner and Livak, Proc. Natl. Acad. Sci. USA 86: 8902-8906 (1989); Green et al., PCR Methods and Applications 1: 77-90 (1991); and Versalovic et al., Nucleic Acids Res. 19: 6823-6831 (1991).
Most currently available DNA sequencing methods require the generation of a set of DNA fragments that are ordered by length according to nucleotide composition. The generation of this set of ordered fragments occurs in one of two ways: chemical degradation at specific nucleotides using the Maxam-Gilbert method or dideoxy nucleotide incorporation using the Sanger method. See Maxam and Gilbert, Proc. Natl. Acad. Sci. USA 74: 560-564 (1977); Sanger et al., Proc. Natl. Acad. Sci. USA 74: 5463-5467 (1977). The type and number of required steps inherently limit both the number of DNA segments that can be sequenced in parallel and the amount of sequence that can be determined from a given site. Furthermore, both methods are prone to error due to the anomalous migration of DNA fragments in denaturing gels. Time and space limitations inherent in these gel-based methods have fueled the search for alternative methods.
In an effort to satisfy the current large-scale sequencing demands, improvements have been made to the Sanger method. For example, the use of fluorescent chain terminators simplifies detection of the nucleotides. The synthesis of longer DNA fragments and improved fragment resolution produces more sequence information from each experiment. Automated analysis of fragments in gels or capillaries has significantly reduced the labor involved in collecting and processing sequence information. See, Prober et al., Science 238: 336-341 (1987); Smith et al., Nature 321: 674-679 (1986); Luckey et al., Nucleic Acids Res. 18: 4417-4421 (1990); Dovichi, Electrophoresis 18: 2393-2399 (1997).
However, current DNA sequencing technologies still suffer from three major limitations. First, they require a large number of identical DNA molecules, which are generally obtained either by molecular cloning or by polymerase chain reaction (PCR) amplification of DNA sequences.
Current methods of detection are insensitive and thus require a minimum critical number of labeled oligonucleotides. Also, many identical copies of the oligonucleotide are needed to generate a sequence ladder. A second limitation is that current sequencing techniques depend on priming from sequence-specific oligodeoxynucleotides that must be synthesized prior to initiating the sequencing procedure. Sanger and Coulson, J. Mol. Biol. 94: 441-448 (1975).
The need for multiple identical templates necessitates the synchronous priming of each copy from the same predetermined site. Third, current sequencing techniques depend on lengthy, labor-intensive electrophoresis techniques that are limited by the rate at which the fragments may be separated and by the number of bases that can be sequenced in a given experiment, which is set by the resolution obtainable on the gel.
In an effort to dispense with the need for electrophoresis techniques, a sequencing method was developed which uses chain terminators that can be uncaged, or deprotected, for further extension. See, U.S. Patent No. 5,302,509; Metzker et al., Nucleic Acids Res. 22: 4259-4267 (1994). This method involves repetitive cycles of base incorporation, detection of incorporation, and re-activation of the chain terminator to allow the next cycle of DNA synthesis. Thus, by detecting each added base while the DNA chain is growing, the need for size-fractionation is eliminated. This method is nevertheless still highly dependent on large amounts of nucleic acid to be sequenced and the use of known sequences for priming the initiation of chain growth. Moreover, this technique is plagued by any inefficiencies of incorporation and deprotection. Because incorporation and 3'-OH regeneration are not completely efficient, a pool of initially identical extending strands can rapidly become asynchronous and sequences cannot be resolved beyond a few limited initial additions.
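The loss of synchrony described above can be illustrated with a short calculation. The following is a minimal sketch, not taken from the specification. It assumes that each strand in the pool independently completes a cycle only if both incorporation and deprotection succeed, and that a strand which misses a cycle stays out of phase. The function name and the example efficiencies are hypothetical.

```python
def synchronized_fraction(p_incorporate: float, p_deprotect: float, cycles: int) -> float:
    """Fraction of strands still in phase after `cycles` cycles.

    Assumption (illustrative only): every cycle succeeds independently with
    probability p_incorporate * p_deprotect, and a strand that misses one
    cycle never returns to phase.
    """
    for p in (p_incorporate, p_deprotect):
        if not 0.0 <= p <= 1.0:
            raise ValueError("efficiencies must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return (p_incorporate * p_deprotect) ** cycles


if __name__ == "__main__":
    # Hypothetical per-step efficiencies of 95 percent each.
    for n in (1, 10, 25, 50):
        print(n, round(synchronized_fraction(0.95, 0.95, n), 4))
```

Under this simple model even high per-step efficiencies leave only a small in-phase fraction after a few tens of cycles, which is consistent with the limitation described for pooled templates.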
Thus, a need still remains in the art for a rapid, cost effective, high throughput method for sequencing unknown nucleic acid samples that eliminates the need for amplification; prior knowledge of some of the nucleotide sequence to generate sequencing primers; and labor-intensive electrophoresis techniques.
SUMMARY OF THE INVENTION
The present invention provides rapid, cost effective, high throughput methods for sequencing unknown nucleic acid samples that eliminate the need for amplification; prior knowledge of some of the nucleotide sequence to generate sequencing primers; and labor-intensive electrophoresis techniques. The methods of the present invention permit direct nucleic acid sequencing (DNAS) of single nucleic acid molecules.
According to the methods of the present invention, a plurality of polymerase molecules is immobilized on a solid support through a covalent or non-covalent interaction. A nucleic acid sample and oligonucleotide primers are introduced to the reaction chamber in a buffered solution containing all four labeled-caged nucleoside triphosphate terminators. Template-driven elongation of a nucleic acid is mediated by the attached polymerases using the labeled-caged nucleoside triphosphate terminators. Reaction centers are monitored by the microscope system until a majority of sites contain immobilized polymerase bound to a nucleic acid template with a single incorporated labeled-caged nucleotide terminator. The reaction chamber is then flushed with a wash buffer. Specific nucleotide incorporation is then determined for each active reaction center. Following detection, the reaction chamber is irradiated to uncage the incorporated nucleotide and flushed with wash buffer once again. The presence of labeled-caged nucleotides is once again monitored before fresh reagents are added to reinitiate synthesis, to verify that reaction centers are successfully uncaged. A persistent failure of release or incorporation, however, indicates failure of a reaction center. A persistent failure of release or incorporation consists of 2-20 cycles, preferably 3-10 cycles, more preferably cycles, wherein the presence of a labeled-caged nucleotide is detected during the second detection step, indicating that the reaction center was not successfully uncaged. The sequencing cycle outlined above is repeated until a large proportion of reaction centers fail.
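The bookkeeping implied by the cycle above can be sketched in code. This is an illustrative sketch only, not the control logic of the disclosed apparatus. The class and function names, the default failure threshold of 3 (chosen from within the stated range of 2-20 cycles), and the stopping fraction of 0.5 are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReactionCenter:
    """State kept for one reaction center (hypothetical structure)."""
    read: List[str] = field(default_factory=list)
    consecutive_failures: int = 0
    awaiting_release: bool = False  # label was still present after the last uncaging
    failed: bool = False


def record_cycle(center: ReactionCenter,
                 called_base: Optional[str],
                 label_after_uncaging: bool,
                 max_failures: int = 3) -> None:
    """Update one center after one incorporate / detect / uncage / re-detect cycle.

    called_base: base seen in the first detection step, or None if no
        incorporation was detected.
    label_after_uncaging: True if the second detection step still sees a label.
    max_failures: assumed threshold for a persistent failure.
    """
    if center.failed:
        return
    # Record a base once; a base that is still caged from the previous cycle
    # is not counted again.
    if called_base is not None and not center.awaiting_release:
        center.read.append(called_base)
    success = called_base is not None and not label_after_uncaging
    center.awaiting_release = label_after_uncaging
    if success:
        center.consecutive_failures = 0
    else:
        center.consecutive_failures += 1
        if center.consecutive_failures >= max_failures:
            center.failed = True


def should_stop(centers: List[ReactionCenter], stop_fraction: float = 0.5) -> bool:
    """Stop the run once an assumed fraction of reaction centers has failed."""
    if not centers:
        return True
    failed = sum(1 for c in centers if c.failed)
    return failed / len(centers) >= stop_fraction
```

A run would call `record_cycle` once per center per cycle and check `should_stop` before adding fresh reagents. The specification says only that the cycle repeats until a large proportion of reaction centers fail, so the stopping fraction is left as a parameter.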
The differentially-labeled nucleotides used in the sequencing methods of the present invention have a detachable labeling group and are blocked at the 3' portion with a detachable blocking group. In a preferred embodiment, the labeling group is directly attached to the detachable 3' blocking group. Uncaging of the nucleotides can be accomplished enzymatically, chemically, or preferably photolytically, depending on the detachable linker used to link the labeling group and the 3' blocking group to the nucleotide.
In another preferred embodiment, the labeling group is attached to the base of each nucleotide with a detachable linker rather than to the detachable 3' blocking group. The labeling group and the 3' blocking group can be removed enzymatically, chemically, or photolytically.
Alternatively, the labeling group can be removed by a different method than the 3' blocking group. For example, the labeling group can be removed enzymatically while the 3' blocking group is removed chemically, or by photochemical activation.
Many independent reactions occur simultaneously within the reaction chamber, each individual reaction center generating a few hundred, or thousands, of base pairs. This apparatus has the capacity to sequence in parallel thousands and possibly millions of separate templates from either specified or random sequence points. The combined sequence from each run is on the order of several million base-pairs of sequence and does not require amplification, prior knowledge of a portion of the target sequence, or resolution of fragments on gels or capillaries.
Simple DNA preparations from any source can be sequenced with the apparatus and methods of the present invention.
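The order of magnitude quoted above follows from simple arithmetic. The sketch below is illustrative only; the function name and the example figures (numbers of active reaction centers and bases read per center) are assumptions chosen from within the ranges stated in the text.

```python
def combined_yield(active_centers: int, bases_per_center: int) -> int:
    """Total bases from one run: active reaction centers times bases read per center."""
    if active_centers < 0 or bases_per_center < 0:
        raise ValueError("inputs must be non-negative")
    return active_centers * bases_per_center


if __name__ == "__main__":
    # Hypothetical runs: thousands to millions of centers, a few hundred bases each.
    for centers, read_length in [(10_000, 300), (1_000_000, 300)]:
        print(centers, read_length, combined_yield(centers, read_length))
```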
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 (Panels A-C) is a schematic representation of labeled-caged terminator nucleotides for use in direct nucleic acid sequencing. Panel A depicts a deoxyadenosine triphosphate modified by attachment of a photolabile linker-fluorochrome conjugate to the 3' carbon of the ribose.
Panel B depicts an alternative configuration, wherein the fluorochrome is attached to the base of the nucleotide by way of a photolabile linker. Panel C depicts the four different nucleotides each labeled with a fluorochrome with distinct spectral properties, which permits the four nucleotides to be distinguished during the detection phase of a direct nucleic acid sequencing reaction cycle.
FIG. 2 is a schematic representation of the steps of one cycle of direct nucleic acid sequencing, wherein step 1 illustrates the incorporation of a labeled-caged nucleotide, step 2 illustrates the detection of the label, and step 3 illustrates the unblocking of the 3'-OH cage.
FIG. 3 is a schematic representation of a reaction center depicting an immobilized polymerase and a nucleic acid sample being sequenced.
FIG. 4 is a schematic representation of the reaction chamber assembly that houses the array of DNAS reaction centers and mediates the exchange of reagents and buffer.
FIG. 5 is a schematic representation of a reaction center array. The left side panel (Microscope Field) depicts the view of an entire array as recorded by four successive detection events (one for each of the separate fluorochromes). The center panel depicts a magnified view of a part of the field showing the spacing of individual reaction centers. The far right panel depicts the camera's view of a single reaction center.
FIG. 6 is a schematic representation of the principle of the evanescent wave.
FIG. 7 is a schematic representation of a direct nucleic acid sequencing set up using total internal reflection fluorescence microscopy.
FIG. 8 is a schematic representation of an example of a data acquisition algorithm obtained from a 3x3 matrix.
DETAILED DESCRIPTION OF THE INVENTION
The present invention provides a novel sequencing apparatus and a novel sequencing method.
The method of the present invention, referred to herein as Direct Nucleic Acid Sequencing (DNAS), offers a rapid, cost effective, high throughput method by which nucleic acid molecules from any source can be readily sequenced without the need for prior amplification.
DNAS can be used to determine the nucleotide sequence of numerous single nucleic acid molecules in parallel.
1. DNAS Reaction Center Array
Polymerases are attached to the solid support, spaced at regular intervals, in an array of reaction centers, present at a periodicity greater than the optical resolving power of the microscope system. Preferably, only one polymerase molecule is present in each reaction center, and each reaction center is located at an optically resolvable distance from the other reaction centers. Sequencing reactions preferably occur in a thin aqueous reaction chamber comprising a sealed cover slip and an optically transparent solid support.
Immobilization of polymerase molecules for use in nucleic acid sequencing has been disclosed by Densham in PCT application WO 99/05315. Densham describes the attachment of selected amino groups within the polymerase to a dextran or N-hydroxysuccinimide ester-activated surface. WO 99/05315; EP-A-0589867; Löfås et al., Biosens. Bioelectron. 10: 813-822 (1995). These techniques can be modified in the present invention to ensure that the activated area is small enough so that steric hindrance will prevent the attachment of more than one polymerase at any given spot in the array.
The array of reaction centers containing a single polymerase molecule is constructed using lithographic techniques commonly used in the construction of electronic integrated circuits.
This methodology has been used in the art to construct microscopic arrays of oligodeoxynucleotides and arrays of single protein motors. See, Chee et al., Science 274: 610-614 (1996); Fodor et al., Nature 364: 555-556 (1993); Fodor et al., Science 251: 767-773 (1991); Gushin et al., Anal. Biochem. 250: 203-211 (1997); Kinosita et al., Cell 93: 21-24 (1998); Kato-Yamada et al., J. Biol. Chem. 273: 19375-19377 (1998); and Yasuda et al., Cell 93: 1117-1124 (1998). Using techniques such as photolithography and/or electron beam lithography [Rai-Choudhury, Handbook of Microlithography, Micromachining, and Microfabrication, Volume I: Microlithography, Volume PM39, SPIE Press (1997); Service, Science 283: 27-28 (1999)], the substrate is sensitized with a linking group that allows attachment of a single modified protein. Alternatively, an array of sensitized sites can be generated using thin-film technology such as Langmuir-Blodgett. See, Zasadzinski et al., Science 263: 1726-1733 (1994).
The regular spacing of proteins is achieved by attachment of the protein to these sensitized sites on the substrate. Polymerases containing the appropriate tag are incubated with the sensitized substrate so that a single polymerase molecule attaches at each sensitized site. The attachment of the polymerase can be achieved via a covalent or non-covalent interaction.
Examples of such linkages common in the art include Ni2+/hexahistidine, streptavidin/biotin, avidin/biotin, glutathione S-transferase (GST)/glutathione, monoclonal antibody/antigen, and maltose binding protein/maltose.
A schematic representation of a reaction center is presented in FIG. 3. A DNA polymerase (e.g., from Thermus aquaticus) is attached to a glass microscope slide. Attachment is mediated by a hexahistidine tag on the polymerase, bound by strong non-covalent interaction to a Ni2+ atom, which is, in turn, held to the glass by nitrilotriacetic acid and a linker molecule. The nitrilotriacetic acid is covalently linked to the glass by a linker attached by silane chemistry.
The silane chemistry is limited to small diameter spots etched at evenly spaced intervals on the glass by electron beam lithography or photolithography. In addition to the attached polymerase, the reaction center includes the template DNA molecule and an oligonucleotide primer both bound to the polymerase. The glass slide constitutes the lower slide of the DNAS reaction chamber.
Housing the array of DNAS reaction centers and mediating the exchange of reagents and buffer is the reaction chamber assembly. An example of a DNAS reaction chamber assembly is illustrated in FIG. 4. The reaction chamber is a sealed compartment with transparent upper and lower slides. The slides are held in place by a metal or plastic housing, which may be assembled and disassembled to allow replacement of the slides. There are two ports that allow access to the chamber. One port allows the input of buffer (and reagents) and the other port allows buffer (and reaction products) to be withdrawn from the chamber. The lower slide carries the reaction center array. In addition, a prism is attached to the lower slide to direct laser light into the lower slide at such an angle as to produce total internal reflection of the laser light within the lower slide. This arrangement allows an evanescent wave to be generated over the reaction center array. A high numerical aperture objective lens is used to focus the image of the reaction center array onto the digital camera system. The reaction chamber housing can be fitted with heating and cooling elements, such as a Peltier device, to regulate the temperature of the reactions.
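The total internal reflection condition mentioned above can be made concrete with the standard optics relations for the critical angle, sin(theta_c) = n2/n1, and for the 1/e intensity penetration depth of the evanescent wave, d = lambda / (4 * pi * sqrt(n1^2 * sin^2(theta) - n2^2)). The sketch below is illustrative only. The refractive indices (glass 1.52, aqueous buffer 1.33), the 488 nm wavelength and the 70 degree incidence angle are assumed example values and are not taken from the specification.

```python
import math


def critical_angle_deg(n_slide: float, n_buffer: float) -> float:
    """Critical angle (degrees) at the slide/buffer interface; requires n_slide > n_buffer."""
    if n_slide <= n_buffer:
        raise ValueError("total internal reflection requires n_slide > n_buffer")
    return math.degrees(math.asin(n_buffer / n_slide))


def penetration_depth_nm(wavelength_nm: float, theta_deg: float,
                         n_slide: float, n_buffer: float) -> float:
    """1/e intensity decay depth of the evanescent wave, in nanometres."""
    term = (n_slide * math.sin(math.radians(theta_deg))) ** 2 - n_buffer ** 2
    if term <= 0:
        raise ValueError("angle is at or below the critical angle")
    return wavelength_nm / (4 * math.pi * math.sqrt(term))


if __name__ == "__main__":
    # Assumed example values: glass slide, aqueous buffer, 488 nm laser, 70 degrees.
    print(round(critical_angle_deg(1.52, 1.33), 2))
    print(round(penetration_depth_nm(488.0, 70.0, 1.52, 1.33), 1))
```

With these assumed values the excitation is confined to a layer on the order of a hundred nanometres above the lower slide, which is why only labels at the immobilized reaction centers are excited efficiently.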
By fixing the site of nucleotide incorporation within the optical system, sequence information can be obtained from many distinct nucleic acid molecules simultaneously. A diagram of the DNAS reaction center array is given in FIG. 5. As described above, each reaction center is attached to the lower slide of the reaction chamber. Depicted in the left side panel (Microscope Field) is the view of an entire array as recorded by four successive detection events (one for each of the separate fluorochromes). The center panel is a magnified view of a part of the field showing the spacing of individual reaction centers. Finally, the far right panel depicts the camera's view of a single reaction center. Each reaction center is assigned 100 pixels to ensure that it is truly isolated. The imaging area of a single pixel relative to the 1 µm × 1 µm area allotted to each reaction center is shown. The density of reaction centers is limited by the optical resolution of the microscope system. Practically, this means that reaction centers must be separated by at least 0.2 µm to be detected as distinct sites.
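The pixel allocation above implies a simple geometry: 100 pixels covering a 1 µm × 1 µm area correspond to a 10 × 10 pixel block, so each pixel images 0.1 µm × 0.1 µm of the slide. The sketch below works this through and estimates how many reaction centers fit in one camera field. It is illustrative only; the function names and the 1000 × 1000 pixel sensor are assumptions, not values from the specification.

```python
import math


def pixel_size_um(center_side_um: float = 1.0, pixels_per_center: int = 100) -> float:
    """Side length (µm) imaged by one pixel, assuming a square block of pixels per center."""
    side_pixels = math.isqrt(pixels_per_center)
    if side_pixels * side_pixels != pixels_per_center:
        raise ValueError("pixels_per_center must be a perfect square")
    return center_side_um / side_pixels


def centers_per_field(sensor_width_px: int, sensor_height_px: int,
                      pixels_per_center: int = 100) -> int:
    """Number of whole reaction-center blocks that fit on the camera sensor."""
    side_pixels = math.isqrt(pixels_per_center)
    if side_pixels * side_pixels != pixels_per_center:
        raise ValueError("pixels_per_center must be a perfect square")
    return (sensor_width_px // side_pixels) * (sensor_height_px // side_pixels)


if __name__ == "__main__":
    print(pixel_size_um())                 # side of one pixel in µm
    print(centers_per_field(1000, 1000))   # hypothetical 1000 x 1000 pixel camera
```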
2. Enzyme Selection
In general, any macromolecule which catalyzes formation of a polynucleotide sequence can be used as the polymerase. In some embodiments, the polymerase can be an enzymatic complex that: 1) promotes the association (e.g., by hydrogen bonding or base-pairing) of a tag (e.g., a normal or modified nucleotide, or any compound capable of specific association with complementary template nucleotides) with the complementary template nucleotide in the active site; 2) catalyzes the formation of a covalent linkage between the tag and the synthetic strand or primer; and 3) translates the active site to the next template nucleotide.
While the polymerases will typically be proteinaceous enzymes, it will be obvious to one of average skill in the art that the polymerase activity need not be associated with a proteinaceous enzyme. For example, the polymerase may be a nucleic acid itself, as in the case of ribozymes or DNA-based enzymes.
A large selection of proteinaceous enzymes is available for use in the present invention. For example, the polymerase can be an enzyme such as a DNA-directed DNA polymerase, an RNA-directed DNA polymerase, a DNA-directed RNA polymerase, or an RNA-directed RNA polymerase. Some polymerases are multi-subunit replication systems made up of a core enzyme and associated factors that enhance the activity of the core (e.g., they increase processivity or fidelity of the core subunit). The enzyme must be modified in order to link it to the support. The enzyme can be cloned by techniques well known in the art, to produce a recombinant protein with a suitable linkage tag. In a preferred embodiment, this linkage is a hexahistidine tag, which permits strong binding to nickel ions on the solid support. Preferred enzymes are highly processive, i.e., they remain associated with the template nucleotide sequence for a succession of nucleotide additions, and are able to maintain a polymerase-polynucleotide complex even when not actively synthesizing. Additionally, preferred polymerases are capable of incorporating 3'-modified nucleotides. Sufficient quantities of an enzyme are obtained using standard recombinant techniques known in the art. See, for example, Dabrowski and Kur, Protein Expr. Purif. 14: 131-138 (1998).
2.1 DNA Polymerase
In a preferred embodiment, sequencing is done with a DNA-dependent DNA polymerase.
DNA-dependent DNA polymerases catalyze the polymerization of deoxynucleotides to form the complementary strand of a primed DNA template. Examples of DNA-dependent DNA polymerases include, but are not limited to, the DNA polymerase from Bacillus stearothermophilus (Bst), the E. coli DNA polymerase I Klenow fragment, E. coli DNA polymerase III holoenzyme, the bacteriophage T4 and T7 DNA polymerases, and those from Thermus aquaticus (Taq), Pyrococcus furiosus (Pfu), and Thermococcus litoralis (Vent). The polymerase from T7 gene 5 can also be used when complexed to thioredoxin. Tabor et al., J. Biol. Chem. 262: 1612-1623 (1987). The Bst DNA polymerase is preferred because it has been shown to efficiently incorporate 3'-O-(2-nitrobenzyl)-dATP into a growing DNA chain, is highly processive, very stable, and lacks exonuclease activity. The coding sequence of this enzyme has been determined. See U.S. Patent Nos. 5,830,714 and 5,814,506, incorporated herein by reference.
In an alternative preferred embodiment where RNA is used as template, the selected DNA-dependent DNA polymerase functions as an RNA-dependent DNA polymerase, or reverse transcriptase. For example, the DNA polymerase from Thermus thermophilus (Tth) has been reported to function as an RNA-dependent DNA polymerase, or reverse transcriptase, under certain conditions. See, Meyers and Gelfand, Biochem. 30: 7661-7666 (1991). Thus, the Tth DNA polymerase is linked to the substrate and the sequencing reaction is conducted under conditions where this enzyme will sequence an RNA template, thereby producing a complementary DNA strand.
In some embodiments, a polymerase subunit or fragment is attached to the support, and other necessary subunits or fragments are added as part of a complex with the sample to be sequenced. This approach is useful for polymerase systems that involve a number of different replication factors. For example, to use the bacteriophage T4 replication system for DNAS sequencing, the gp43 polymerase can be attached to the support. Other replication factors, such as the clamp loader (gp44/62) and sliding clamp (gp45), can be added with the nucleic acid template in order to increase the processivity of the replication system. A similar approach can be used with the E. coli polymerase III system, where the polymerase core is immobilized in the array and the β-dimer subunit (sliding clamp) and τ and γ subassembly (clamp loader) are added to the nucleic acid sample prior to DNAS sequencing. Additionally, this approach can be used with eukaryotic DNA polymerases (e.g., α or δ) and the corresponding PCNA (proliferating cell nuclear antigen). In some embodiments, the sliding clamp is the replication factor that is attached in the array and the polymerase moiety is added in conjunction with the nucleic acid sample.
2.2 Reverse Transcriptase
A reverse transcriptase is an RNA-dependent DNA polymerase, an enzyme that produces a DNA strand complementary to an RNA template. In an alternative preferred embodiment, a reverse transcriptase enzyme is attached to the support for use in sequencing RNA molecules.
This permits the sequencing of RNAs taken directly from tissues, without prior reverse transcription. Examples of reverse transcriptases include, but are not limited to, reverse transcriptase from Avian Myeloblastosis Virus (AMV), Moloney Murine Leukemia Virus, and Human Immunodeficiency Virus-1 (HIV-1). HIV-1 reverse transcriptase is particularly preferred because it is well characterized both structurally and biochemically. See, e.g., Huang et al., Science 282: 1669-1675 (1998).
In an alternative preferred embodiment, the immobilized reverse transcriptase functions as a DNA-dependent DNA polymerase, thereby producing a DNA copy of the sample or target DNA template strand.
2.3 RNA Polymerase
In yet another alternative preferred embodiment, a DNA-dependent RNA polymerase is attached to the support, and uses labeled-caged ribonucleotides to generate an RNA copy of the sample or target DNA strand being sequenced. Preferred examples of these enzymes include, but are not limited to, RNA polymerase from E. coli [Yin et al., Science 270: 1653-1657 (1995)] and RNA polymerases from the bacteriophages T7, T3, and SP6. In an alternative preferred embodiment, a modified T7 RNA polymerase functions as a DNA-dependent DNA polymerase. This RNA polymerase is attached to the support and uses labeled-caged deoxyribonucleotides to generate a DNA copy of a DNA template. See, Izawa et al., J. Biol. Chem. 273: 14242-14246 (1998).
2.4 RNA Dependent RNA Polymerase
Many viruses employ RNA-dependent RNA polymerases in their life-cycles. In a preferred embodiment, an RNA-dependent RNA polymerase is attached to the support, and uses labeled-caged ribonucleotides to generate an RNA copy of a sample RNA strand being sequenced.
Preferred examples of these enzymes include, but are not limited to, RNA-dependent RNA polymerases from the viral families: bromoviruses, tobamoviruses, tombusviruses, leviviruses, hepatitis C-like viruses, and picornaviruses. See, Huang et al., Science 282: 1668-1675 (1998); Lohmann et al., J. Virol. 71: 8416-8428 (1997); Lohmann et al., Virology 249: 108-118 (1998); and O'Reilly and Kao, Virology 252: 287-303 (1998).
3. Sample Preparation
The nucleic acid to be sequenced can be obtained from any source. Example nucleic acid samples to be sequenced include double-stranded DNA, single-stranded DNA, DNA from plasmid, first strand cDNA, total genomic DNA, RNA, cut/end-modified DNA (e.g., with RNA polymerase promoter), and in vitro transposon tagged DNA (e.g., random insertion of RNA polymerase promoter). The target or sample nucleic acid to be sequenced is preferably sheared (or cut) to a certain size, and annealed with oligodeoxynucleotide primers using techniques well known in the art. Preferably, the sample nucleic acid is denatured, neutralized and precipitated and then diluted to an appropriate concentration, mixed with oligodeoxynucleotide primers, heated to 65°C and then cooled to room temperature in a suitable buffer. The nucleic acid is then added to the reaction chamber after the polymerase has been immobilized on the support or, alternatively, is combined with the polymerase prior to the immobilization step.
3.1 In vitro transposon tagging of template DNA
In an alternative preferred embodiment, purified transposases and transposable element tags will be used to randomly insert specific sequences into template double stranded DNA. In one configuration the transposable element contains the promoter for a specific RNA polymerase. Alternatively, the inverted repeats of the transposable elements can be hybridized with complementary oligodeoxynucleotide primers for DNAS with DNA polymerases. Preferred examples of these transposases and transposable elements include, but are not limited to, Tc1 and Tc3A from C. elegans and the engineered teleost system Sleeping Beauty. See Ivics et al., Cell 91: 501-510 (1997); Plasterk, Curr. Top. Microbiol. Immunol. 204: 125-143 (1996); van Luenen et al., EMBO J. 12: 2513-2520 (1993); and Vos et al., Genes Dev. 10: 755-761 (1996).
3.2 Double Stranded Template DNA
In yet another embodiment, double stranded DNA is sequenced by Bst DNA polymerase without the need for primer annealing. See Lu et al., Chin. J. Biotechnol. 8: 29-32 (1992).
3.3 Primers
Various primers and promoters are known in the art and may be suitable for sequence extension in DNAS. Examples include random primers, anchor point primer libraries, single-stranded binding protein masking/primer library, and primase.
In a preferred embodiment, anchor primers are used instead of random primers. Anchor primers are oligonucleotide primers to previously identified sequences. Anchor primers can be used for rapid determination of specific sequences from whole genomic DNA, from cDNAs or RNAs. This will be of particular use for rapid genotyping, and/or for clinical screening to detect polymorphisms or mutations in previously identified disease-related genes or other genes of interest. Once genome projects, and other studies, have identified sequences of particular interest, oligonucleotides corresponding to various locations in and around that sequence can be designed for use in DNAS. This will maximize the quantity of useful data that can be obtained from a single sequencing run, which is particularly useful when complex DNA samples are used. For identification of mutated or polymorphic disease genes this technique will obviate the need to perform genotyping by any other means currently in use, including single strand conformation polymorphism (SSCP) [Orita et al., Genomics 5: 874-879 (1989)], PCR sequencing or DNA array hybridization technology [Hacia, Nat. Genet. 21: 42-47 (1999)]. Direct sequencing of disease genes is superior to SSCP and hybridization technologies because the latter are relatively insensitive and may frequently give false positive or false negative identifications of mutations. Many anchor oligonucleotides can be mixed together so that hundreds or thousands of genes or sequences can be identified simultaneously. In essence, every known or potential disease-related gene can be sequenced simultaneously from a given sample.
4. Labeled-caged Terminating Nucleotides
To be useful as a chain-terminating substrate for the methods of the present invention, a nucleotide must contain a detectable label that distinguishes it from the other three nucleotides. Furthermore, the chain-terminating nucleotide must permit base incorporation, it must terminate elongation upon incorporation, and it must be capable of being uncaged to allow further chain elongation, thereby permitting repetitive cycles of incorporation, monitoring to identify incorporated bases, and uncaging to allow the next cycle of chain elongation. Uncaging of the nucleotides can be accomplished enzymatically, chemically, or preferably photolytically.
The basic molecule is an NTP with modification at the 3'-OH (R), the 2'-OH (R') or the base (R"). In a standard dideoxy NTP, R=H, R'=H, and R"=H.
R=H, R'=OH, and R"=H is a chain terminator for RNA polymerases.
One set of useful chain-terminating nucleotides for the methods of the present invention is R = cage/label, R' = (H or OH), and R" = H. In a preferred embodiment, the modified nucleotide is a label (e.g., a fluorophore) linked to the sugar moiety by a 3'-O-(2-nitrobenzyl) group. The modified 3'-O-(2-nitrobenzyl)-dNTP is incorporated into the growing DNA chain by Bst DNA polymerase linked to a support. In order to resume chain elongation, the nucleotide is uncaged by removal of the 2-nitrobenzyl group (with its corresponding detectable label) by exposure to light of the appropriate frequency. The modified nucleotide 3'-O-(2-nitrobenzyl)-dATP has previously been used in a single round of nucleotide incorporation and uncaging. Metzker et al., Nucleic Acids Res. 22: 4259-4267 (1994). See also Cheeseman, U.S. Patent No. 5,302,509, incorporated herein by reference.
An alternative set of useful chain-terminating nucleotides has the configuration R = cage, R' = (H or OH), and R" = cage/label. In a preferred embodiment, the detachable labeling group is a label (e.g., a fluorophore) linked to the base of the nucleotide by a 2-nitrobenzyl group, and the detachable blocking group is a 3'-O-(2-nitrobenzyl) group. The modified nucleotide is incorporated into the growing DNA chain by Bst DNA polymerase linked to a support. In order to resume chain elongation, the nucleotide is uncaged by removal of both the labeling group and the blocking group by exposure to light of the appropriate frequency.
In either of these configurations it may prove advantageous to place two labels (e.g., two fluorochromes) on each cage, as has been described in WO 98/33939.
For sequencing when the synthetic strand is RNA, labeled-caged ribonucleotides (R' = OH) are synthesized as modified nucleotides designed for incorporation by support-linked RNA polymerase.
4.1 Fluorescent labels
The use of fluorescent tags to identify nucleotides in nucleic acid sequencing is well known in the art. See U.S. Patent Nos. 4,811,218; 5,405,747; 5,547,839 and 5,821,058, each incorporated herein by reference. Metzker and Gibbs have recently disclosed a family of fluorescently tagged nucleotides based on the Cy fluorophores with improved spectral characteristics. U.S. Patent No. 5,728,529, incorporated herein by reference. Alternative sets of fluorophores include: the rhodamine based fluorophores, TAMRA, ROX, JOE, and FAM; the BigDye® fluorophores (Applied Biosystems, Inc.); and the BODIPY® fluorophores (U.S. Patent No. 5,728,529).
In a preferred embodiment of the present invention, a fluorescent label is attached to the photolabile 3' blocking group (cage). Examples of modified nucleotides for DNAS are schematically illustrated in FIG. 1. Panel A depicts a deoxyadenosine triphosphate modified by attachment of a photolabile linker-fluorochrome conjugate to the 3' carbon of the ribose. Photolysis of the linker by <360 nm light causes the fluorochrome to dissociate, leaving the 3'-OH group of the nucleotide intact. Panel B depicts an alternative configuration in which the fluorochrome is attached to the base of the nucleotide by way of a photolabile linker. The 3'-OH is blocked by a separate photolabile group. Modified nucleotides such as those depicted in Panels A and B are examples of labeled-caged deoxyribonucleotides for use in DNAS. A variety of fluorochromes and photolabile groups can be used in the synthesis of labeled-caged deoxyribonucleotides. Additionally, ribonucleotides can also be synthesized for use with RNA polymerases. Four fluorochromes with distinct spectral properties allow the four nucleotides to be distinguished during the detection phase of the DNAS reaction cycle. FIG. 1 (Panel C) provides a schematic representation of four different labeled-caged terminator nucleotides for use in direct nucleic acid sequencing.
After incorporation of the labeled-caged terminator nucleotides by the immobilized polymerase molecules, the fluorophores are illuminated to excite fluorescence in each of the four species of fluorophore. The emission at each point in the array is optically detected and recorded. Once the sequence information has been obtained, the photolabile linkers are removed by illumination with light at the uncaging wavelength (<360 nm).
FIG. 2 depicts a single round of the reaction cycle: the incorporation of a labeled-caged nucleotide; the detection of the labeled nucleotide; and the unblocking of the caged nucleotide. It is through successive rounds of the DNAS reaction cycle that primary sequence information is deduced. The first panel (Step 1) shows an example single stranded template DNA (3'-AGCAGTCAG-5'), a short primer sequence on the left side, and a labeled-caged dGTP undergoing incorporation. In the middle panel (Step 2) the fluorochrome, BODIPY 5 6 is excited by YAG laser illumination at 532 nm. The fluorochrome emits light centered at a wavelength of 570 nm, which is detected by the microscope system. Finally, in Step 3, photolysis of the linker by illumination with <360 nm light simultaneously dissociates the fluorochrome label and releases the 3' block. As a result the primer is extended by one base and the 3'-OH is restored so that another nucleotide can be incorporated on the next cycle.
4.2 Quantum dot labels
In an alternative preferred embodiment of the present invention, each of the caged terminators is labeled with a different type of quantum dot. Recently, highly luminescent semiconductor quantum dots (QDs) have been covalently coupled to biomolecules. Chan and Nie, Science 281: 2016-2018 (1998). These luminescent labels exhibit improved spectral characteristics over traditional organic dyes, and have been shown to allow sensitive detection with a confocal fluorescence microscope at the single dot level. In this embodiment, the caged quantum dot terminators are incorporated, detected, and uncaged in a manner similar to that described above for the fluorescent caged terminators.
4.3 Plasmon resonance particles
In a preferred embodiment, each of the caged terminators is labeled with a colloidal silver plasmon-resonant particle (PRP). Schultz et al., J. Clin. Ligand Assay 22: 214-216 (1999); Schultz et al., Proc. Natl. Acad. Sci. 97: 996-1001 (2000). PRPs are metallic nanoparticles, typically 40-100 nm in diameter, which can be engineered to efficiently scatter light anywhere in the visible range of the spectrum. These particles are bright enough to be used for single molecule detection. PRPs were shown to produce a scattering flux equivalent to that from million fluorescein molecules, and more than 10^5-fold greater than that from typical quantum dots. Schultz et al., Proc. Natl. Acad. Sci. 97: 996-1001 (2000). Furthermore, when imaged by a standard CCD, the spatial peak can be located to a precision of 10 Å, similar precision to that observed with imaging single fluorophores on gold nanoparticles. Denk and Webb, Appl. Opt. 29: 2382-2391 (1990). To facilitate detection, in certain embodiments, each different type of nucleotide is modified with a PRP of a different color. In order to resolve the signal from two PRPs incorporated into a sample at neighboring reaction centers, the reaction centers must at least be separated by a coherence length (approximately the wavelength of the illuminating light). Additionally, Raman scattering may be used to detect the PRPs. Nie and Emory, Science 275: 1102-1106 (1997).
5. Detection of Incorporated Nucleotides
Advances in microscopic techniques have allowed the spectroscopic detection of single molecules. See Nie and Zare, Annu. Rev. Biophys. Biomol. Struct. 26: 567-596 (1997), and Keller et al., Appl. Spectrosc. 50: 12A-32A (1996). For example, single fluorescent molecules in aqueous solution can be visualized under total internal reflection fluorescence microscopy (TIRFM), confocal microscopy, fluorescence resonance energy transfer (FRET), or surface plasmon resonance spectroscopy (SPR). See Dickson et al., Nature 388: 355-358 (1997); Dickson et al., Science 274: 966-969 (1996); Ishijima et al., Cell 92: 161-171 (1998); Iwane et al., FEBS Lett. 407: 235-238 (1997); Nie et al., Science 266: 1018-1021 (1994); Pierce et al., Nature 388: 338 (1997); Ha et al., Proc. Natl. Acad. Sci. USA 93: 6264-6268 (1996); Gordon et al., Biophys. J. 74: 2702-2713 (1998); and Yokota et al., Phys. Rev. Letts. 80: 4606-4609 (1998). Since single molecules can be detected spectroscopically, cloned nucleic acid samples are no longer necessary for sequencing. A single copy of template, contained within a reaction center, is a sufficient sample size. The apparatus and methods of the present invention allow the resolution of signals from single nucleotide tags within an optical plane and their subsequent conversion into digital information. Photons are collected from a thin plane roughly equivalent to the volume within which the enzyme and newly synthesized base reside.
5.1 TIRFM
When light is directed at a particular angle into a refractive medium of set width, such as a glass slide, total internal reflection (TIR) will result. Above the plane of the refractive medium an electromagnetic phenomenon known as an evanescent wave occurs. The principle of the evanescent wave is depicted in FIG. 6. The evanescent wave extends from the surface to a distance of the order of the wavelength of light. Importantly, an evanescent wave can be used to excite fluorochromes within this distance. When this phenomenon is used for microscopy it is called total internal reflection fluorescence microscopy (TIRFM). The arrangement of microscope slides, prism and laser beam depicted in this figure will lead to TIR within the lower slide and thus an evanescent wave will be generated within approximately 150 nm of the upper surface of the lower slide. Fluorochrome molecules, such as those within DNAS reaction centers, will be excited and can be detected optically using the objective lens, microscope and camera system. A high signal-to-noise ratio is achieved using evanescent wave excitation because only those fluorochrome molecules within the evanescent wave are stimulated.
In a preferred embodiment TIRFM is used for detection. Depicted in FIG. 7 is the arrangement of equipment required to carry out DNAS using TIRFM. A standard laboratory microscope stand houses the reaction chamber assembly, objective lens, filter wheel, microchannel plate intensifier, and cooled CCD camera. Laser light is directed into the prism by dichroic mirrors and computer controlled shutters. Evanescent wave excitation is used to stimulate the sample. Evanescent wave excitation is achieved by total internal reflection at the glass-liquid interface. At this interface, the optical electromagnetic field does not abruptly drop to zero, but decays exponentially into the liquid phase. The rapidly decaying field (evanescent wave) can be used to excite fluorescent molecules in a thin layer of approximately 150 nm immediately next to this interface. See PCT Patent Application WO 98/33939, incorporated herein by reference.
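The exponential decay described above can be illustrated with the standard textbook expression for the evanescent field, I(z) = I(0) exp(-z/d), with d = λ / (4π sqrt(n1² sin²θ - n2²)). The following is a minimal sketch only: the function names, the refractive indices (1.52 for glass, 1.33 for the aqueous phase), the 65 degree incidence angle, and the choice of 532 nm excitation are illustrative assumptions, not parameters stated in this specification.

```python
import math


def penetration_depth_nm(wavelength_nm, n_glass, n_liquid, angle_deg):
    """Return the 1/e intensity decay depth of the evanescent field in nm.

    Uses d = wavelength / (4 * pi * sqrt(n_glass^2 * sin^2(theta) - n_liquid^2)),
    which is only defined above the critical angle.
    """
    theta = math.radians(angle_deg)
    critical = math.asin(n_liquid / n_glass)
    if theta <= critical:
        raise ValueError("angle is not above the critical angle; no total internal reflection")
    root = math.sqrt((n_glass * math.sin(theta)) ** 2 - n_liquid ** 2)
    return wavelength_nm / (4 * math.pi * root)


def relative_intensity(z_nm, depth_nm):
    """Fraction of the interface intensity remaining at height z above the slide."""
    return math.exp(-z_nm / depth_nm)


# Assumed example values (hypothetical, for illustration only).
depth = penetration_depth_nm(532.0, n_glass=1.52, n_liquid=1.33, angle_deg=65.0)
print(f"decay depth: {depth:.0f} nm")
print(f"intensity remaining at 150 nm: {relative_intensity(150.0, depth):.2f}")
```

Under these assumed values the decay depth is on the order of 100 nm, which is consistent in magnitude with the thin excitation layer described above; other angles and refractive indices would give different depths.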
The sensitivity that allows single molecule detection arises from the small sample volume probed. One advantage of TIRFM is that the entire reaction center array can be imaged simultaneously. Images of the reaction center array are focused onto the face of the microchannel plate intensifier through barrier filters carried on the filter wheel. The microchannel plate intensifier amplifies the image and transfers it to the face of the cooled CCD camera. Image data are read from the CCD chip and processed on a microcomputer. A stimulating laser, or set of stimulating lasers, is directed to the specimen by way of an optical table. Another laser uncages the 3'-OH protecting group. Additional lasers may be required for optimal fluorochrome stimulation. A filter wheel is also included in the invention to change barrier filters so that the four different fluorochromes (each corresponding to a different type of labeled-caged nucleotide) are unambiguously distinguished.
As shown in FIG. 7, a prism is built onto the microscope slide to direct the laser into the slide from outside the microscope. Ishijima et al., Cell 92: 161-171 (1998). Alternatively, objective-type TIRFM can be used for fluorescence detection. Laser light is directed through an objective lens off-center such that the critical angle is achieved using the objective lens itself. See Tokunaga et al., Biochem. Biophys. Res. Comm. 235: 47-53 (1997).
5.2 Confocal Microscopy
In an alternative preferred embodiment, confocal microscopy is used for detection. In confocal microscopy, a laser beam is brought to its diffraction-limited focus inside a sample using an oil immersion, high numerical-aperture (NA) objective lens. Single molecules have been detected in solution by multi-photon confocal fluorescence. Mertz et al., Opt. Lett. 20: 2532-2534 (1995). In one embodiment of this invention, the nucleotide labels are detected by scanning multi-photon confocal microscopy. Nie et al., Science 266: 1018-1021 (1994).
5.3 Fluorescence Resonance Energy Transfer (FRET)
In an alternative preferred embodiment, FRET technology is used for detection. Fluorescence resonance energy transfer is a distance-dependent interaction between the electronic excited states of two dye molecules in which excitation is transferred from a donor molecule to an acceptor molecule without emission of a photon. FRET is dependent on the inverse sixth power of the intermolecular separation, making it useful over distances comparable with the dimensions of biological macromolecules. Thus, FRET is an important technique for investigating a variety of biological phenomena that produce changes in molecular proximity.
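The inverse sixth power dependence can be made concrete with the standard Förster expression for transfer efficiency, E = 1 / (1 + (r / R0)^6), where R0 is the donor-acceptor separation at which half of the excitation is transferred. The sketch below is illustrative only: the function name and the 50 Å value used for R0 are assumptions chosen for the example and are not taken from this specification.

```python
def fret_efficiency(distance, forster_radius):
    """Return the FRET transfer efficiency for a donor-acceptor separation.

    Implements E = 1 / (1 + (r / R0)**6). Both arguments must use the same
    length unit (for example, angstroms).
    """
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius must be > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


# Assumed Forster radius of 50 angstroms (hypothetical, for illustration).
R0 = 50.0
for r in (25.0, 50.0, 100.0):
    print(f"r = {r:5.1f} A  ->  E = {fret_efficiency(r, R0):.3f}")
```

Halving or doubling the separation around R0 moves the efficiency from near 1 to near 0, which is why the signal is confined to labels in very close proximity to the donor.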
This technique makes use of some unusual properties of dye molecules. In experiments that use fluorescent dyes, the dye molecule is typically excited at one wavelength of light and data is collected at a longer wavelength. However, when two different dye molecules are placed very close together, light can be absorbed by one molecule (the donor), and its excitation energy can then be transferred directly to the adjacent molecule (the acceptor). Light at a still longer wavelength is then emitted from the acceptor. In most applications, the donor and acceptor dyes are different, in which case FRET can be detected by the appearance of sensitized fluorescence of the acceptor or by quenching of donor fluorescence. When the donor and acceptor are the same, FRET can be detected by the resulting fluorescence depolarization.
Three conditions must be met for FRET to occur: the donor and acceptor molecules must be in close proximity (typically 10-100 Å); the absorption spectrum of the acceptor must overlap the fluorescence emission spectrum of the donor; and the donor and acceptor transition dipole orientations must be approximately parallel.
FRET can be employed to increase signal to noise ratios. Additionally, FRET can be used in DNAS to avoid the need for a photolabile linker on the fluorochromes. FRET is commonly used to measure the distance between molecules or parts of them, or to detect transient molecular interactions. In practice candidate molecules, or different parts of the same molecule, are modified with two different fluorescent groups. The solution is then excited by light corresponding to the shorter excitation wavelength of the two fluorochromes. When the second fluorochrome is in close proximity to the first, it will be excited by the emitted energy of the former and emit at its own characteristic wavelength. The efficiency (quantum yield) of the conversion is directly related to the physical distance between the two fluorochromes. For specific application to DNAS, polymerase molecules are tagged with a fluorochrome that behaves as a photon donor for the modified nucleotides. This would limit their excitation to the active site of the polymerase or any other appropriate part of the polymerase. Such an arrangement would significantly increase the signal-to-noise ratio of nucleotide detection. Moreover, because only nucleotides within the polymerase are excitable, FRET as applied to DNAS would render unnecessary the removal of previously incorporated fluorescent moieties. FRET has been performed at the single molecule level as required for DNAS [Ha et al., Proc. Natl. Acad. Sci. USA 93: 6264-6268 (1996)], and has been optimized for quantification in fluorescence microscopy. Gordon et al., Biophys. J. 74: 2702-2713 (1998). Optimally the polymerase would be synthesized as a recombinant green fluorescent protein (GFP) fusion protein, as this would eliminate the need to derivatize the polymerase and, unlike most commonly used fluorochromes, GFP is substantially resistant to photobleaching. However, we may find that the optimal arrangement is a chemically modified polymerase to which a synthetic fluorochrome or quantum dot has been attached.
5.4 Surface Plasmon Resonance
In one embodiment, surface plasmon resonance (SPR) spectroscopy is used to detect the incorporation of label into the nucleic acid sample. SPR is used to measure the properties of a solution by detecting the differences in refractive index between the bulk phase of the solution and the evanescent wave region. SPR has recently been used for single molecule imaging of fluorescently labeled proteins on metal by surface plasmons in aqueous solution. Yokota et al., Phys. Rev. Letts. 80: 4606-4609 (1998). This technique involves coating the reaction chamber surface with a thin layer of metal in order to enhance the signal from fluorescently labeled nucleotides.
The DNAS Detector
The detector is a cooled CCD camera fitted with a microchannel plate intensifier. A block diagram of the instrument set-up is presented in FIG. 7. Recently available intensified-cooled CCD cameras have resolutions of at least 1000x1000 pixels. In a preferred embodiment of this invention, an array consists of 100x100 reaction centers. Thus, when the array is imaged onto the face of the camera, each reaction center is allotted approximately 10x10 pixels. DNAS uses a 63x 1.4 NA lens to image an array (100x100 µm grid) of regularly spaced reaction centers, depicted in FIG. 5. Information can be simultaneously recorded from 10,000 reaction centers. This expected resolution is comparable to that achieved in a recent report, whereby TIRFM was used to image a sample of nile red fluorophores, and produced images of a large number of single molecules. A single nile red molecule was unambiguously imaged in an 8x8 pixel square. Dickson et al., Nature 388: 355-358 (1997).
6. The Sequencing Cycle
Housing the array of DNAS reaction centers and mediating the exchange of reagents and buffer is the reaction chamber assembly. The reaction chamber is a sealed compartment with transparent upper and lower slides. The slides are held in place by a metal or plastic housing, which may be assembled and disassembled to allow replacement of the slides. There are two ports that allow access to the chamber. One port allows the input of buffer (and reagents) and the other port allows buffer (and reaction products) to be withdrawn from the chamber. The lower slide carries the reaction center array. In addition, a prism is attached to the lower slide to direct laser light into the lower slide at such an angle as to produce total internal reflection of the laser light within the lower slide. This arrangement allows an evanescent wave to be generated over the reaction center array. A high numerical aperture objective lens is used to focus the image of the reaction center array onto the digital camera system. The reaction chamber housing can be fitted with heating and cooling elements, such as a Peltier device, to regulate the temperature of the reactions. A nucleic acid sample is introduced to the reaction chamber in buffered solution containing all four labeled nucleoside triphosphate terminators.
A schematic representation of the reaction chamber assembly is presented in FIG. 4. Reaction centers are monitored by the microscope system until a majority of reaction centers contain immobilized polymerase bound to the template with a single incorporated labeled-caged terminator nucleotide. The reaction chamber is then flushed with a wash buffer. Specific nucleotide incorporation is then determined for each reaction center. Following detection, the reaction chamber is irradiated to uncage the incorporated nucleotide and flushed with wash buffer once again. The presence of labeled nucleotides is once again monitored before fresh reagents are added to reinitiate synthesis. This second detection verifies that a reaction center is successfully uncaged. The presence of a labeled nucleotide in the chamber during this step indicates that the reaction center has not been uncaged. Accordingly, the subsequent reading from this reaction center during the next detection step of the cycle will be ignored. Thus, by ignoring the signals from reaction centers that are not successfully uncaged, the methods of the present invention avoid the problems caused by incomplete uncaging in sequencing methods of the prior art. The sequencing cycle outlined above is repeated until a large proportion of reaction centers persistently fail to incorporate or uncage additional nucleotides.
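The bookkeeping implied by the second detection step can be sketched as follows. This is a minimal illustration, not the control software of the invention: the `ReactionCenter` class, its field names, and the `record_cycle` function are hypothetical names introduced here, and the inputs (a called base and a flag for whether label was still seen after uncaging) are assumed to come from the detection steps described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReactionCenter:
    """Per-center state kept across sequencing cycles (hypothetical structure)."""
    read: List[str] = field(default_factory=list)
    ignore_next_reading: bool = False  # set when the last uncaging check failed


def record_cycle(center: ReactionCenter,
                 called_base: Optional[str],
                 label_seen_after_uncaging: bool) -> None:
    """Update one reaction center after detect, uncage, and re-detect.

    If the previous cycle left a label in place, the reading from this cycle
    is discarded, because it would report the same nucleotide again.
    """
    if not center.ignore_next_reading and called_base is not None:
        center.read.append(called_base)
    # A label still visible after irradiation means uncaging failed, so the
    # reading taken in the next detection step must be ignored.
    center.ignore_next_reading = label_seen_after_uncaging


center = ReactionCenter()
record_cycle(center, "A", label_seen_after_uncaging=True)   # A recorded, uncaging failed
record_cycle(center, "A", label_seen_after_uncaging=False)  # stale A ignored, now uncaged
record_cycle(center, "C", label_seen_after_uncaging=False)  # C recorded
print("".join(center.read))  # AC
```

Keeping a single flag per reaction center is one simple way to carry the uncaging result into the next detection step; a fuller implementation would presumably also track centers that persistently fail, as the termination condition above requires.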
Methods for regulating the supply (and removal) of reagents to the reaction centers, as well as the environment of the reaction chamber (e.g., the temperature and oxidative environment), are incorporated into the reaction chamber using techniques common in the art. Examples of this technology are outlined in: Kricka, Clinical Chem. 44: 2008-2014 (1998); see also U.S. Patent No. 5,846,727.
7. Sequence Acquisition Software
The sequence acquisition software acquires and analyzes image data during the sequencing cycle. At the beginning of a sequencing experiment, a bin of pixels containing each reaction center is determined. During each sequencing cycle, four images of the entire array are produced, and each image corresponds to excitation of one of the four fluorescently labeled nucleotide bases (A, C, G, or T). For each reaction center bin, all of the four images are analyzed to determine which nucleotide species has been incorporated at that reaction center during that cycle. As described above, the reaction center bin corresponding to a certain reaction center contains a 10x10 array of pixels. The total number of photons produced by the single fluorophore in that reaction center is determined by the summation of each pixel value in the array. Typically, 500-1500 photons are emitted from a single fluorophore when excited for 100 milliseconds with a laser producing an intensity of 5 kW/cm² at the surface of the microscope slide. Dickson et al., Science 274: 966-969 (1996). The sums of the reaction center bins from each of the four images are compared, and the image that produces a significant sum corresponds to the newly incorporated base at that reaction center. The images are processed for each of the reaction centers and an array of incorporated nucleotides is recorded. An example of a data acquisition algorithm is provided in FIG. 8. Such processing is done in real time at low cost with modern image processing computers.
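The bin summation and comparison described above might be sketched as follows. This is an illustration under stated assumptions rather than the algorithm of FIG. 8: the function names are hypothetical, pixel values are assumed to already be photon counts, and the 500 photon cutoff used to decide what counts as a "significant sum" is an assumption borrowed from the lower end of the 500-1500 photon range quoted above.

```python
def bin_sum(image, row0, col0, size=10):
    """Sum the pixel values in a size x size bin whose top-left corner is (row0, col0)."""
    return sum(sum(row[col0:col0 + size]) for row in image[row0:row0 + size])


def call_base(images, row0, col0, size=10, min_photons=500):
    """Return the base whose image gives the largest significant bin sum, else None.

    `images` maps each base ("A", "C", "G", "T") to a 2D list of pixel values
    acquired with that base's excitation wavelength and barrier filter.
    """
    sums = {base: bin_sum(img, row0, col0, size) for base, img in images.items()}
    best = max(sums, key=sums.get)
    return best if sums[best] >= min_photons else None


def flat_image(value, size=10):
    """Build a size x size test image with every pixel set to `value`."""
    return [[value] * size for _ in range(size)]


# Synthetic example: only the G image carries a fluorophore signal in this bin.
images = {"A": flat_image(1), "C": flat_image(1), "G": flat_image(10), "T": flat_image(1)}
print(call_base(images, 0, 0))  # G
```

A real instrument would also need background subtraction and a rule for bins in which two images exceed the cutoff; neither is specified in the text, so both are left out of the sketch.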
Multiple reads of the reaction center array may be necessary during the detection step to ensure that the four nucleotides are properly distinguished. Exposure times can be as low as 100 msec, and the readout time of the CCD chip can be as long as 250 msec. Thus, the maximum time needed for four complete reads of the array is 1.5 seconds. The total time for a given cycle, including reagent addition, removal, and washes, is certainly less than 10 seconds. Accordingly, a sequencing apparatus consisting of an array of 10,000 reaction centers is able to detect at least 360 bases per site per hour, or 3.6 Megabases per hour of total sequence, as a conservative estimate. This rate is significantly faster than those of traditional sequencing methodologies.
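The arithmetic behind these figures can be checked directly. The short calculation below uses only the numbers quoted above (100 msec exposure, 250 msec readout, four reads per cycle, a 10 second cycle, and 10,000 reaction centers); the variable names are introduced here for illustration.

```python
EXPOSURE_S = 0.100          # exposure time per read, seconds
READOUT_S = 0.250           # CCD readout time per read, seconds
READS_PER_CYCLE = 4         # one image per labeled nucleotide
CYCLE_S = 10                # upper bound on total time per sequencing cycle
REACTION_CENTERS = 100 * 100

# Imaging time for four complete reads of the array.
read_time_s = READS_PER_CYCLE * (EXPOSURE_S + READOUT_S)

# One base is read per reaction center per cycle.
bases_per_site_per_hour = 3600 // CYCLE_S
total_bases_per_hour = bases_per_site_per_hour * REACTION_CENTERS

print(f"imaging time per cycle: {read_time_s:.1f} s")
print(f"bases per site per hour: {bases_per_site_per_hour}")
print(f"total bases per hour: {total_bases_per_hour:,}")
```

Four reads at 350 msec each come to 1.4 seconds, within the 1.5 second maximum stated above, and a 10 second cycle gives 360 cycles per hour, hence 360 bases per site and 3.6 million bases per hour across the array.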
In addition to short sequencing times, the methods of the present invention do not require the time-consuming processes of sample amplification (cloning, or PCR), and gel electrophoresis. The lack of consumables necessary for sample amplification and electrophoresis, coupled with small reagent volumes (the reaction chamber volume is on the order of 10 microliters) and reduced manual labor requirements, drastically reduces the cost per nucleotide sequenced relative to traditional sequencing techniques.
8. Sequence Analysis Software
Depicted in FIG. 8 is an example of DNAS data acquisition using a 3x3 array of reaction centers. In a typical configuration, however, DNAS would utilize an array of 100x100 reaction centers. In this example, four cycles of DNAS are presented. For each cycle, four images of the array are produced. Each image corresponds to a specific excitation wavelength and barrier filter combination, and thus corresponds to the incorporation of a specific modified nucleotide.
Consider the upper left array (Cycle 1). In this case, the modified nucleotide, when using the BODIPY set of modified nucleotides, is 3'-O-(DMNPE-(BODIPy493/ 0 deoxy ATP. Thus the reaction center array is illuminated with 488 nm light from the Ar laser and the image focused through a 503 nm barrier filter. Each of the nine elements in the 3x3 matrix corresponds to a 10x10 pixel area of the CCD camera output. For each of the four images, each reaction center pixel group is analyzed to determine whether the given nucleotide has been incorporated. Thus we see in the example that in Cycle 1, modified deoxyATPs were incorporated at reaction centers X1 and Z1. Hence, in the table the first nucleotides recorded for reaction centers X1 and Z1 are A. If we consider a given reaction center over the four cycles of DNAS, we see that it has incorporated one nucleotide in each of the first, second, third, and fourth cycles, giving the synthesized sequence 5'-ACCT-3'. Hence the sequence fragment of the template DNA bound at that reaction center is the reverse complement of 5'-ACCT-3', which is 5'-AGGT-3'. The primary sequence exists as an array of sequences, each derived from a single reaction center. The length of each reaction center sequence will depend upon the number of cycles a given center remains active in an experiment. Based on the processivity of cloned polymerases reported in the art, sequence lengths of several hundred to several thousand bases are expected.
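Converting the bases recorded at a reaction center into the template sequence is a reverse complement operation, which can be sketched in a few lines. The function and dictionary names below are introduced for illustration only; the input is assumed to be the synthesized strand written 5' to 3' in the order the bases were incorporated.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Bases recorded over four cycles at one reaction center, in order of incorporation.
synthesized = "ACCT"
print(reverse_complement(synthesized))  # AGGT
```

Applying the function twice returns the original sequence, which is a convenient sanity check when processing the full array of reads.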
In one embodiment of the present invention, a nucleic acid sample is sheared prior to inclusion in a reaction center. Once these fragments have been sequenced, sequence analysis software is used to assemble their sequences into contiguous stretches. Many algorithms exist in the art that can compare sequences and deduce their correct overlap. New algorithms have recently been designed to process large amounts of sequence data from shotgun (random) sequencing approaches.
In one preferred embodiment, an algorithm initially reduces the amount of data to be processed by using only two smaller sequences derived from either end of the sequence deduced from a single reaction center in a given experiment. This approach has been proposed for use in shotgun sequencing of the human genome. Rawlinson et al., J. Virol. 70: 8833-8849 (1996); Venter et al., Science 280: 1540-1542 (1998). It employs algorithms developed at The Institute for Genomic Research (TIGR). Sutton et al., Genome Sci. Technol. 1: 9 (1995).
In an alternative preferred embodiment, raw data is compressed into a fingerprint of smaller words (e.g., hexanucleotide restriction enzyme sites) and these fingerprints can be compared and assembled into larger continuous blocks of sequence (contigs). This technique is similar to that used to deduce overlapping sequences after oligonucleotide hybridization. Idury and Waterman, J. Comput. Biol. 2: 291-306 (1995). Yet another embodiment uses existing sequence data, from genetic or physical linkage maps, to assist the assembly of new sequence data from whole genomes or large genomic pieces.
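One way to picture the fingerprint idea is to reduce each read to the ordered list of hexanucleotide sites it contains and then look for reads whose fingerprints share an end. The sketch below is a simplified illustration, not the method of Idury and Waterman or the software of the invention: the three-site dictionary (the EcoRI, BamHI, and HindIII recognition sequences), the function names, and the suffix-prefix matching rule are all assumptions made for the example.

```python
# Small illustrative set of hexanucleotide restriction sites (assumed for the example).
SITES = {"GAATTC": "EcoRI", "GGATCC": "BamHI", "AAGCTT": "HindIII"}


def fingerprint(seq, sites=SITES, word_len=6):
    """Return the ordered list of site names found while scanning seq left to right."""
    found = []
    for i in range(len(seq) - word_len + 1):
        name = sites.get(seq[i:i + word_len])
        if name is not None:
            found.append(name)
    return found


def fingerprint_overlap(left, right):
    """Return the length of the longest suffix of `left` equal to a prefix of `right`."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


read_1 = "TTGAATTCAAGGATCCTT"   # contains EcoRI then BamHI
read_2 = "CCGGATCCAAAAGCTTGG"   # contains BamHI then HindIII
fp_1, fp_2 = fingerprint(read_1), fingerprint(read_2)
print(fp_1, fp_2, fingerprint_overlap(fp_1, fp_2))
```

Here the two reads share a BamHI site at the end of the first fingerprint and the start of the second, suggesting a candidate join in that order. Comparing short fingerprints rather than full reads is what reduces the data volume; any candidate join would still need to be confirmed against the underlying sequence.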
9. Utility of DNAS
Clinical Applications
The importance of genetic diagnoses in medicine cannot be overstated. Most obvious is the use of techniques that can identify carriers of harmful genetic traits for pre-natal and neo-natal diagnosis. Currently, biochemical tests and karyotype analyses are the most commonly used techniques, but these have clear limitations. Biochemical tests are only useful when there is a change in the activity or levels of an enzyme or protein which has been associated with the disease state and for which a specific test has been determined. Even when a protein has been attributed to a disease state, the development of such reagents can be difficult, expensive and time consuming. Karyotypic analyses are only useful for identifying gross genetic disorders such as ploidy, translocations and large deletions. Although it is theoretically possible to determine whether individuals possess defective alleles of a given gene by current DNA techniques, effective screening programs are only currently practicable in cases in which a common mutation is associated with the disease and its presence can be determined by non-sequencing techniques.
The methods of the present invention permit large amounts of DNA sequence data to be determined from an individual patient with little technical effort, and without the need to clone patient DNA or amplify specific sequences by PCR. Single molecules can be sequenced directly from a simple DNA preparation from the patient's blood, tissue samples or from amniotic fluid. Accordingly, DNAS can be used for clinical diagnosis of genetic disorders, traits or other features predictable from primary DNA sequence information, such as prenatal, neo-natal and post-natal diagnoses or detection of congenital disorders; pathological analysis of somatic disease caused by genetic recombination and/or mutation; identification of loss of heterozygosity, point mutations, or other genetic changes associated with cancer, or present in pre-cancerous states.
The methods of the present invention can also be used to identify disease-causing pathogens (e.g., viral, bacterial, fungal) by direct sequencing of affected tissues.
Functional Gene Identification
Large scale genetic screens for genes involved in certain processes, for example during development, are now common and are applied to vertebrates with large genomes such as the zebrafish (Danio rerio) and the amphibian Xenopus tropicalis. Attempts to clone mutant genes in mouse and human have been lengthy and difficult, and even in more genetically amenable organisms like zebrafish it is still time consuming and difficult.
Since the methods of the present invention permit the sequencing of an entire genome the size of a mammalian genome in a short period of time, identification of mutant genes can be achieved by bulk sequence screening, i.e., sequencing whole genomes or large genomic segments of a carrier and comparing them to the sequence of whole genomes or large genomic segments of different members of a given species.
Similarly, the methods of the present invention allow facile sequencing of entire bacterial genomes. Sequence information generated in this fashion can be used for rapid identification of genes encoding novel enzymes from a wide variety of organisms, including extremophilic bacteria.
In addition, the methods of the present invention can also be used for assessment of mutation rates in response to mutagens and radiation in any tissue or cell type. This technique is useful for optimization of protocols for future mutation screens.
Analysis of Genetic Alterations in Tumors
Many cancers, possibly all cancers, begin with specific alterations in the genome of a cell or a few cells, which then grow unchecked by the controls of normal growth. Much of the treatment of cancers is dependent upon the specific physiological response of these abnormal cells to particular agents.
The method of the present invention will allow the rapid generation of a genetic profile from individual tumors, allowing researchers to follow precisely what genetic changes accompany various stages of tumor progression. This information will also permit the design of specific agents to target cancer cells for tailor-made assaults on individual tumors.
Analysis of Genetic Variation
Many important physiological traits, such as control of blood pressure, are controlled by a multiplicity of genetic loci. Currently, these traits are analyzed by quantitative trait linkage (QTL) analysis. Generally, in QTL analysis a set of polymorphic genetic linkage markers is utilized on a group of subjects with a particular trait, such as familial chronic high blood pressure. Through an analysis of the linkage of the markers with the trait, a correlation is drawn between a set of particular loci and the trait. Usually a handful of loci contribute the majority of the trait and a larger group of loci will have minor effects on the trait.
The methods of the present invention permit rapid whole genome sequencing. Thus, using the methods of the present invention, QTL analysis is executed at a very fine scale and, with a large group of subjects, all of the major loci contributing to a given trait and most of the minor loci are easily identified.
Moreover, the method of the present invention can be used for constructing phylogenetic trees and/or kinship relationships by estimation of previous genomic recombinations (e.g., inversion, translocation, deletion, point mutation), or by previous meiotic recombination events affecting the distribution of polymorphic markers. The method of the present invention can be used to identify mutations or polymorphisms, with the aim of associating genotype with phenotype.
The method of the present invention can also be used to identify the sequence of those mutant or polymorphic genes resulting in a specific phenotype, or contributing to a polygenic trait.
Agricultural Applications
Agricultural efficiency and productivity are increased by generating breeds of plants and animals with optimal genetic characteristics. The methods of the present invention can be used, for example, to reveal genetic variation underlying both desirable and undesirable traits in agriculturally important plants and animals. Additionally, the methods of the present invention can be used to identify plant and animal pathogens, and to design methods of combating them.
Forensic Applications
The methods of the present invention can be used in criminal and forensic investigations, or for the purpose of paternity/maternity determination, by genetically identifying samples of blood, hair, skin and other tissues to unambiguously establish a link between a suspected individual and forensically relevant samples. The results obtained will be analogous to results obtained with current genetic fingerprinting techniques, but will provide far more detailed information and will be less likely to provide false positive identification. Moreover, the identity of individuals from a mixed sample can be determined.
Research Applications
The methods of the present invention can be used for several research applications, such as the sequencing of artificial DNA constructs to confirm/elicit their primary sequence, and/or to isolate specific mutant clones from random mutagenesis screens; the sequencing of cDNA from single cells, whole tissues or organisms from any developmental stage or environmental circumstance in order to determine the gene expression profile from that specimen; the sequencing of PCR products and/or cloned DNA fragments of any size isolated from any source.
The methods of the present invention can also be used for the sequencing of DNA fragments generated by analytical techniques that probe higher order DNA structure by their differential sensitivity to enzymes, radiation or chemical treatment (e.g., partial DNase treatment of chromatin), or for the determination of the methylation status of DNA by comparing sequence generated from a given tissue with or without prior treatment with chemicals that convert methyl-cytosine to thymine (or other nucleotide) as the effective base recognized by the polymerase. Further, the methods of the present invention can be used to assay cellular physiology changes occurring during development or senescence at the level of primary sequence.
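The methylation comparison described above can be sketched in Python. The sketch assumes two already aligned reads of equal length from the same locus, one from untreated DNA and one from treated DNA, and it follows the conversion as the text states it (the modified cytosine is read as thymine after treatment). The function name and inputs are hypothetical. With a chemistry such as bisulfite treatment, which instead converts unmethylated cytosine, the test would be inverted.

```python
# Sketch: locate cytosines whose base call changes after chemical treatment.
# Assumes pre-aligned, equal-length reads; the name and inputs are illustrative.

def converted_cytosine_positions(untreated, treated):
    """Return 0-based positions that read 'C' untreated and 'T' after treatment."""
    if len(untreated) != len(treated):
        raise ValueError("reads must be aligned and of equal length")
    return [
        i
        for i, (before, after) in enumerate(zip(untreated, treated))
        if before == "C" and after == "T"
    ]


positions = converted_cytosine_positions("ACGCT", "ATGCT")
```

Under the stated assumption, each reported position marks a cytosine that was affected by the treatment, while cytosines that read the same in both samples were not.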
The methods of the present invention can also be used for the sequencing of whole genomes or large genomic segments of transformed cells to select individuals with the desired integration status. For example, DNAS can be used for the screening of transfected embryonic stem cell lines for correct integration of specific constructs, or for the screening of organisms such as Drosophila, zebrafish, mouse, or human tissues for specific integration events.
Additionally, the method of the present invention can be used to identify novel genes through the identification of conserved blocks of sequence or motifs from evolutionarily divergent organisms. The method of the present invention can also be used for identification of other genetic elements (e.g., regulatory sequences and protein binding sites) by sequence conservation and relative genetic location.
The details of one or more embodiments of the invention have been set forth in the accompanying description above. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are now described. Other features, objects, and advantages of the invention will be apparent from the description and from the claims. In the specification and the appended claims, the singular forms include plural referents unless the context clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All patents and publications cited in this specification are incorporated by reference.
The following EXAMPLES are presented in order to more fully illustrate the preferred embodiments of the invention. These EXAMPLES should in no way be construed as limiting the scope of the invention, as defined by the appended claims.
Example 1: Reaction chamber substratum preparation, Nickel/chelator conjugate.
The fundamental unit of the DNAS methodology is the reaction center (FIG. ). The reaction center comprises a polymerase molecule bound to a template nucleic acid molecule, and tethered to a fixed location on a transparent substrate via a high affinity interaction between groups attached to the polymerase and substrate respectively. In one configuration, DNAS reactions occur in a reaction chamber whose base, the substrate, is made of glass (SiO2) modified so that polymerase molecules can be attached in a regular array. Using electron beam lithography, a square array of dimensions 100 µm x 100 µm is generated. Rai-Choudhury, Handbook of Microlithography, Micromachining, and Microfabrication, Volume I: Microlithography, Volume PM39, SPIE Press (1997). A small spot, <50 nm in diameter, is etched at every 1 µm interval in resist material covering the glass slide. This etching exposes the glass for subsequent derivatization in which a nitrilotriacetic acid group is covalently bound by way of silane chemistry. Schmid, et al., Anal Chem 69: 1979-1985 (1997). Each nitrilotriacetic acid group serves as a chelator for a Ni2+ ion. The coordinated Ni2+ ion can then be bound by hexahistidine moieties engineered into a variety of polymerase molecules. Thus an array of 10,000 polymerase molecules is generated in a 100 µm x 100 µm array, which will be observed in an optical microscope system. In an alternative configuration, biotin is covalently attached to each spot by way of silane chemistry. The biotin is then bound by streptavidin moieties covalently linked to, or engineered into, the polymerase molecules.
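The array dimensions above fix the number of reaction centers. The short Python sketch below generates the spot coordinates for a 100 µm square at a 1 µm pitch. The function name is hypothetical, and it assumes one spot per pitch interval starting at the origin, so each side holds 100 spots rather than 101.

```python
# Sketch of the array geometry; the name and the "one spot per interval"
# convention are illustrative assumptions.

def spot_centers_um(side_um=100, pitch_um=1):
    """Return (x, y) coordinates in micrometres of every spot in the square array."""
    per_side = int(side_um // pitch_um)
    return [
        (col * pitch_um, row * pitch_um)
        for row in range(per_side)
        for col in range(per_side)
    ]


spots = spot_centers_um()
n_reaction_centers = len(spots)  # 100 spots per side, squared
```

Because each spot is under 50 nm across while the pitch is 1 µm, neighbouring reaction centers are separated by far more than their own size, which helps keep their signals distinct on the camera.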
Example 2: Microfluidic reaction chamber allows rapid exchange of reactants, buffer and products.
The reaction chamber is a device that houses the array of reaction centers and regulates the environment. As described in Example 1, the substrate is a glass microscope slide prepared with a regular microscopic array of covalently attached moieties. A prism is attached to the slide on the surface opposite to the array. The prism directs laser light into the slide at such an angle that total internal reflection of the laser light is achieved within the slide. Under this condition an evanescent wave is generated over the array during the sequencing reaction cycle. The slide and prism are fixed into an assembly, which will generate a sealed chamber with a volume of 1-10 µl (FIG. ). Reagents and buffer are pumped into and out of the chamber through microfluidic ports on either side of the chamber. Complete exchanges of volume take place within 1 second and are mediated by electronically controlled valves and pumps.
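The total internal reflection condition can be made concrete with the standard optics relations: the critical angle is arcsin(n2/n1), and the evanescent intensity decays with depth d = λ / (4π · sqrt(n1² sin²θ − n2²)). The Python sketch below evaluates both. The refractive indices (1.52 for glass, 1.33 for aqueous buffer) and the 70 degree incidence angle are assumed illustrative values that the specification does not state; the 488 nm wavelength is the Argon-ion line mentioned in the text.

```python
# Sketch of the TIR geometry; refractive indices and incidence angle are
# assumed values for illustration, not taken from the specification.
import math


def critical_angle_deg(n_glass, n_buffer):
    """Smallest incidence angle (degrees) giving total internal reflection."""
    return math.degrees(math.asin(n_buffer / n_glass))


def penetration_depth_nm(wavelength_nm, incidence_deg, n_glass, n_buffer):
    """Characteristic decay depth of the evanescent intensity, in nanometres."""
    theta = math.radians(incidence_deg)
    term = (n_glass * math.sin(theta)) ** 2 - n_buffer ** 2
    if term <= 0:
        raise ValueError("incidence angle is at or below the critical angle")
    return wavelength_nm / (4 * math.pi * math.sqrt(term))


theta_c = critical_angle_deg(1.52, 1.33)
depth = penetration_depth_nm(488, 70, 1.52, 1.33)
```

With these assumed values the depth is on the order of tens of nanometres, so only fluorochromes held at the surface, such as those on a nucleotide incorporated at a tethered polymerase, are excited efficiently.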
Example 3: Preparation of labeled-caged chain terminating nucleotides
Preparation of fluorochrome-photolabile linker conjugate
Fluorochrome-linked 2-nitrobenzyl derivatives are first generated as described by Anasawa, et al., WO 98/33939. Alternatively, a sensitized photolabile linker (using the DMNPE caging kit, Catalog Number D-2516, Molecular Probes, Inc.) may be first attached to the 3' group of the dNTP as detailed below and then linked to a fluorochrome using succinimide chemistry or otherwise. It may prove optimal to use a linker of variable length between the fluorochrome and the caging group to reduce possible steric hindrance caused by large chemical groups. Brandis, et al., Biochemistry 35: 2189-2200 (1996).
Preparation of 3'-O-modified-2'-deoxynucleotide analogs
3'-O-modified-2'-deoxynucleotides are synthesized by esterification of the 3'-OH group of dATP, dCTP, dGTP and dTTP. This is accomplished by several general methods. Metzker, et al., Nucleic Acids Res 22: 4259-4267 (1994).
Method 1: First, 2'-deoxy-5'-hydroxy-dNTPs are reacted with tert-butyldiphenylsilyl (TBDPS) in the presence of imidazole and dimethylformamide (DMF), producing deoxynucleotides. Then the resulting 2'-deoxy-5'-tert-butyldiphenylsilyl dNTP is dissolved in benzene and mixed with the halide derivative of the fluorochrome-photolabile linker conjugate in the presence of tetrabutylammonium hydroxide (TBAH) (and additionally NaOH in some cases) and stirred at 25°C for 16 hours. The organic layer is extracted with ethyl acetate, washed with deionized water and saturated NaCl, dried over Na2SO4, and purified by flash chromatography using a stepwise gradient (10% methanol/ethyl acetate to 5% methanol/ethyl acetate in 2% intervals).
Method 2: dNTPs prepared as detailed above are reacted directly with the acid anhydride of the fluorochrome-photolabile linker conjugate in dry pyridine in the presence of 4-dimethylaminopyridine (DMAP) at 25°C for 6 hours. The pyridine is then removed under vacuum, the residue is dissolved in deionized water, extracted in chloroform, washed with deionized water, 10% HCl, saturated NaHCO3 and saturated NaCl, dried over Na2SO4, and purified by flash chromatography.
Method 3: 2'-deoxy-5'-tert-butyldiphenylsilyl dNTPs are dried by repeated co-evaporation with pyridine, dissolved in hot DMF and cooled to 0°C in an ice bath. NaOH is dissolved in DMF after washing with dry benzene, then added to the dissolved 2'-deoxy-5'-tert-butyldiphenylsilyl dNTP and stirred for 45 minutes. A halogenated derivative of the fluorochrome-photolabile linker conjugate in DMF is added and the reaction is stirred for a few hours. The reaction is then quenched with cold deionized water and stirred overnight. The solid obtained is filtered, dried, and recrystallized in ethanol.
Method 4: The 3'-caged NTPs can be prepared directly from the triphosphate according to Hiratsuka et al., Biochim Biophys Acta 742: 496-508 (1983).
In the case of methods 1-3, the resulting compounds are subsequently desilylated by the addition of 1.0 equivalents of tetrabutylammonium fluoride (Bu4NF). The reactions are monitored by thin layer chromatography and after completion (about 15 minutes), the reactions are quenched with 1 equivalent of glacial acetic acid. The solvent is removed, and the residues are purified by silica column chromatography. The 5'-triphosphate derivatives of the compounds generated by methods 1-3 are synthesized by the following protocol. The 3'-modified nucleoside (1.0 equivalents) is dissolved in trimethylphosphate under a nitrogen atmosphere. Phosphorus oxychloride (POCl3) (3.0 equivalents) is added and the reaction is stirred at for 4 hours. The reaction is quenched with a solution of tributylammonium triphosphate ( equivalents) in DMF and tributylamine. After stirring vigorously for 10 minutes, the reaction is quenched with TEAB pH 7.5. The solution is concentrated, and the triphosphate derivative is isolated by linear gradient (0.01 M to 0.5 M TEAB) using a DEAE cellulose (HCO3- form) column.
The final synthetic products are purified by HPLC, and may be further purified by enzymatic mop-up if necessary [Metzker, et al., Biotechniques 25: 814-817 (1998)], a technique which utilizes the extreme enzymatic preference of many polymerases for deoxynucleotides versus their 3'-blocked counterparts. This probably results from low efficiency of the catalytic formation of the phosphodiester bond when 3'-modified nucleotides are present in the enzyme active site, so that the enzyme tends to rapidly exhaust the normal contaminating deoxynucleotides first. Brandis, et al., Biochemistry 35: 2189-2200 (1996).
In an alternative configuration, a photolabile group is attached to the 3'-OH using succinimide or other chemistry and a fluorochrome-photolabile linker conjugate is attached directly to the base of the nucleotide as described by Anasawa et al., WO 98/33939. The 3' attached photolabile group will serve as a reversible chain terminator [Metzker, et al., Nucleic Acids Res 22: 4259-4267 (1994)] and the base-attached fluorochrome-photolabile linker will serve as a removable label. In this configuration, with each cycle both photolabile groups will be removed by photolysis before further incorporation is allowed. Such a configuration may be preferred if it is found that steric hindrance of large fluorochrome groups attached to the 3'-OH of the nucleotide prevents the nucleotide from entering the polymerase.
Example 4: DNAS using a cloned hexahistidine-tagged DNA polymerase, random primed single-stranded DNA template and total internal reflection fluorescence microscopy.
There are two phases to the process.
Phase 1:
The first phase is the set-up phase. Hexahistidine-tagged DNA polymerase is washed into the reaction chamber and allowed to attach to the Ni2+-nitrilotriacetic acid array. As an example, hexahistidine-tagged DNA polymerase from Thermus aquaticus might be used. Dabrowski, et al., Acta Biochim Pol 45: 661-667 (1998). Template DNA is prepared by shearing or restriction digestion, followed by denaturation at 95°C and annealing with a mixture of random oligodeoxynucleotide primers. The primed single-stranded DNA template is then pumped into the reaction chamber.
Phase 2:
The second phase of the process is the main sequencing cycle. The cycle is as follows:
1. Reaction buffer containing labeled-caged chain-terminating deoxynucleoside triphosphates (dNTP*s) is pumped into the reaction chamber. Reaction buffer consists of: 10 mM Tris HCl, pH 8.3; 50 mM KCl; and 2.5 mM MgCl2. The dNTP*s are each at a concentration of 0.02-0.2 mM.
2. Reaction buffer without the dNTP*s is rinsed through the reaction chamber.
3. For each of the 10,000 reaction centers, the identity of the newly incorporated nucleotide is determined by total internal reflection fluorescence microscopy (TIRFM).
Multiple recordings of the reaction center array are made so that each of the four nucleotides is distinguished. The fluorochromes used have high extinction coefficients and/or high quantum yields for fluorescence. In addition, the fluorochromes have well resolved excitation and/or emission maxima. There are several fluorochrome families that will be used, for example, the BODIPY family of fluorochromes (Molecular Probes, Inc.). Using BODIPY fluorochromes and the photolabile linker 1-(4,5-dimethoxy-2-nitrophenyl)ethyl (DMNPE), the following set of nucleotide analogs can be employed for DNAS:
3'-O-(DMNPE-(BODIPY 493/503)) deoxy ATP
3'-O-(DMNPE-(BODIPY 530/550)) deoxy CTP
3'-O-(DMNPE-(BODIPY 564/570)) deoxy GTP
3'-O-(DMNPE-(BODIPY 581/591)) deoxy TTP
Thus incorporated 'A's are detected with 488 nm Argon-ion laser illumination and a barrier filter centered at 503 nm. Incorporated 'C's, 'G's and 'T's are detected with 532 nm YAG laser illumination and barrier filters centered at 550 nm, 570 nm, and 591 nm respectively.
For each of the separate illumination events an evanescent wave is generated in the reaction center array and the image of the array is focused through the microscope system onto the face of a micro-channel plate intensified cooled-CCD camera.
4. Newly incorporated nucleotides are optically uncaged by illumination with <360 nm light from another YAG laser. This causes dissociation of the DMNPE-BODIPY from the nascent nucleic acid strand, leaving it intact and prepared to incorporate the next nucleotide.
The removal of the fluorescent moiety is verified by TIRFM and the reaction cycle is repeated until nucleotides are no longer incorporated.
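The detection step of the cycle above can be sketched in Python as a lookup from nucleotide to laser line and barrier filter, followed by a simple call on the brightest channel. The wavelengths come from the text; the function name, the intensity values, and the detection threshold are hypothetical, and a real instrument would need background subtraction and cross-talk correction.

```python
# Sketch of per-center base calling from four filtered images.
# Wavelengths follow the text; names, intensities and threshold are assumptions.

# base -> (laser wavelength in nm, barrier filter centre in nm)
CHANNELS = {
    "A": (488, 503),
    "C": (532, 550),
    "G": (532, 570),
    "T": (532, 591),
}


def call_base(signal_by_base, threshold):
    """Return the base whose channel is brightest, or None if no channel reaches the threshold."""
    base, signal = max(signal_by_base.items(), key=lambda item: item[1])
    return base if signal >= threshold else None


# Hypothetical summed pixel-group intensities for one reaction center.
called = call_base({"A": 12.0, "C": 950.0, "G": 30.0, "T": 8.0}, threshold=100.0)
no_call = call_base({"A": 5.0, "C": 7.0, "G": 6.0, "T": 4.0}, threshold=100.0)
```

A center that returns no call after uncaging and a fresh round of incorporation would be recorded as inactive, matching the rule that the cycle repeats until nucleotides are no longer incorporated.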
Typically, the exposure time for each fluorochrome is 100 msec. The readout time of the CCD chip is ~0.25 sec. Hence, the detection step for each cycle takes <1.5 secs. The total volume of the reaction chamber is 1-10 µl. Less than one second is taken to completely flush the reaction chamber. Hence the total time for a given cycle is less than 10 seconds. Therefore, at 10 seconds/cycle each of the 10,000 reaction centers of the DNAS machine is able to deduce at least 360 bases of sequence per hour, corresponding to 3.6 Mbase/hour of sequence deduced by the DNAS machine as a whole.
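The timing argument above is simple arithmetic, reproduced in the Python sketch below. It assumes that the four fluorochromes are imaged one after another, each needing one exposure and one CCD readout, and it takes 10 seconds as the cycle time, the upper bound given in the text. The function names are hypothetical.

```python
# Sketch of the throughput arithmetic; the sequential-imaging assumption and
# the function names are illustrative.

def detection_time_s(n_channels=4, exposure_s=0.1, readout_s=0.25):
    """Time to image all fluorochrome channels once, one after another."""
    return n_channels * (exposure_s + readout_s)


def bases_per_hour(cycle_time_s, n_centers=1):
    """Bases deduced per hour when every center adds one base per cycle."""
    return int(3600 / cycle_time_s) * n_centers


detect = detection_time_s()               # about 1.4 s, under the 1.5 s bound
per_center = bases_per_hour(10)           # bases per hour at one reaction center
whole_array = bases_per_hour(10, 10_000)  # bases per hour for 10,000 centers
```

Since the stated cycle time is an upper bound, these figures are lower bounds; a shorter cycle raises the throughput proportionally.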
Shutters controlling laser illumination, filter wheels carrying the barrier filters and the CCD camera are all controlled by a microcomputer. Image collection and data analysis are all executed by the same microcomputer. Extracted sequence data and array images are stored permanently on CD ROM as they are collected.
EQUIVALENTS
From the foregoing detailed description of the specific embodiments of the invention, it should be apparent that a unique method and apparatus for nucleic acid sequencing has been described. Although particular embodiments have been disclosed herein in detail, this has been done by way of example for purposes of illustration only, and is not intended to be limiting with respect to the scope of the appended claims that follow. In particular, it is contemplated by the inventors that various substitutions, alterations, and modifications may be made to the invention without departing from the spirit and scope of the invention as defined by the claims.
For instance, the choice of the particular polymerase, the particular linkage of the polymerase to the solid support, or the particular nucleotide terminators is believed to be a matter of routine for a person of ordinary skill in the art with knowledge of the embodiments described herein.
Claims (1)
- 2. The method of claim 1, wherein the 3' blocking group and the labeling group are separated from the incorporated nucleotide by photochemical activation. ASM Scientific, Inc. By their patent attorneys CULLEN CO. Date: 28 April 2005
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2005201777A AU2005201777B2 (en) | 1999-03-10 | 2005-04-28 | A Method for Direct Nucleic Acid Sequencing |
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09266187 | 1999-03-10 | | |
| AU31746/00A AU3174600A (en) | 1999-03-10 | 2000-03-10 | A method for direct nucleic acid sequencing |
| AU2005201777A AU2005201777B2 (en) | 1999-03-10 | 2005-04-28 | A Method for Direct Nucleic Acid Sequencing |
Related Parent Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| AU31746/00A Division AU3174600A (en) | 1999-03-10 | 2000-03-10 | A method for direct nucleic acid sequencing |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| AU2005201777A1 (en) | 2005-05-19 |
| AU2005201777B2 (en) | 2007-08-02 |
Family
ID=34578096
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| AU2005201777A Ceased AU2005201777B2 (en) | 1999-03-10 | 2005-04-28 | A Method for Direct Nucleic Acid Sequencing |
Country Status (1)
| Country | Link |
|---|---|
| AU (1) | AU2005201777B2 (en) |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA2294962C (en) * | 1997-07-28 | 2012-09-18 | Medical Biosystems Ltd. | Nucleic acid sequence analysis |
-
2005
- 2005-04-28 AU AU2005201777A patent/AU2005201777B2/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| AU2005201777B2 (en) | 2007-08-02 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| FGA | Letters patent sealed or granted (standard patent) | ||
| MK14 | Patent ceased section 143(a) (annual fees not paid) or expired |