
WO2012037875A1 - DNA tags and their use - Google Patents

DNA tags and their use

Info

Publication number
WO2012037875A1
Authority
WO
WIPO (PCT)
Prior art keywords
dna
tag
index
pcr
library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2011/079897
Other languages
English (en)
Chinese (zh)
Inventor
樊帆
张俊青
王博
孔淑娟
程玲
胡帅星
汪建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Publication of WO2012037875A1
Legal status: Ceased

Classifications

    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/04: Libraries containing only organic compounds
    • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1065: Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the invention relates to the field of nucleic acid sequencing technology, in particular to the field of DNA sequencing technology.
  • The invention relates to DNA tags for DNA sequencing and their use. More specifically, the present invention provides a DNA tag, a DNA tag linker, a PCR tag primer, a DNA tag library and a method for preparing it, a method for determining DNA sequence information, a method for determining the sequence information of a plurality of DNA samples, and a kit for constructing a DNA tag library.
  • DNA sequencing technology is one of the important methods of molecular biological analysis. It not only provides important data for basic biological research such as gene expression and gene regulation, but also plays an important role in applied research such as disease diagnosis and gene therapy.
  • Solexa DNA sequencing platform (Illumina)
  • SBS (Sequencing By Synthesis)
  • Each Genome Analyzer can generate up to 50 Gb of data per run, and instruments such as Illumina's HiSeq 2000 can generate up to 200 Gb of data per run.
  • By comparison, the size of a typical DNA library is several orders of magnitude smaller than the amount of data generated by each high-throughput sequencer.
  • the size of a Fosmid library is generally only about 40k.
  • When a DNA library is sequenced on a high-throughput sequencer, sequencing only one library per run wastes several orders of magnitude of sequencing capacity. The inventors have found that pooling multiple samples and processing them as one sample saves both the cost of library construction and the cost of sequencing.
  • A DNA tag (herein simply referred to as a "tag"), which can be used to construct a DNA tag library, is proposed.
  • the invention proposes a set of isolated DNA tags.
  • the sample source of the DNA can be accurately characterized by linking the DNA tag to the sample DNA or its equivalent.
  • A DNA tag library (herein sometimes referred to as a "tag library") can be constructed for each of a plurality of samples at the same time, so that DNA tag libraries derived from different samples can be mixed and then sequenced together. The resulting DNA sequences can be classified by DNA tag, yielding the DNA sequence information of each sample. This makes full use of high-throughput sequencing technologies such as Solexa sequencing: multiple DNA tag libraries are sequenced simultaneously, which increases the sequencing efficiency and throughput of the DNA tag libraries.
  • The inventors have surprisingly found that, when DNA tag libraries are constructed using DNA tags according to an embodiment of the present invention, multiple DNA tag libraries can be identified accurately, and the resulting sequencing data are very stable and reproducible.
  • the invention also provides a set of isolated oligonucleotides for introducing the above DNA tag into sample DNA or an equivalent thereof.
  • A set of isolated oligonucleotides according to an embodiment of the invention, each having a first strand and a second strand, the first strand consisting of the nucleotides shown in SEQ ID NO: (3N-1) and the second strand consisting of the nucleotides shown in SEQ ID NO: (3N), respectively.
  • These oligonucleotides are also referred to as "DNA tag linkers" and "tag linkers" in the present specification.
  • a corresponding DNA tag linker having a Y-form structure can be formed by subjecting the sense sequence DNA Index-NF_adapter and its corresponding antisense sequence DNA Index-NR_adapter to an equimolar annealing treatment.
  • Using the above-described oligonucleotides according to an embodiment of the present invention (which may also be referred to as DNA tag linkers), a DNA tag can be efficiently introduced into the DNA of the sample or its equivalent, thereby enabling the construction of a DNA tag library.
  • The inventors have surprisingly found that, when DNA tag libraries containing various DNA tags are constructed for the same sample using oligonucleotides with different tags, the resulting sequencing data are very stable and reproducible.
  • The DNA tag libraries of the Fosmid sample constructed using Index1-48 show a correlation of at least 0.99 when the data are analyzed using the Pearson coefficient. Details of the algorithm for the Pearson coefficient can be found in the relevant literature, for example: 't Hoen, P. A., Y. Ariyurek, et al. (2008).
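For readers unfamiliar with the Pearson coefficient mentioned above, the following is a minimal sketch of how such a correlation between two libraries could be computed. The two depth vectors are invented purely for illustration and are not data from the patent; what quantity the inventors actually compared (for example per-window read depth) is an assumption here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical read depths for the same sample built with two different tags.
depth_tag_a = [10, 12, 15, 20, 18, 25]
depth_tag_b = [11, 12, 16, 19, 18, 26]
r = pearson(depth_tag_a, depth_tag_b)
print(f"Pearson r = {r:.4f}")
```

A value close to 1 would indicate that the two libraries give nearly proportional output, which is the sense in which the text reports a correlation of at least 0.99.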
  • the invention also provides a set of isolated PCR tag primers for introducing the above DNA tag into sample DNA or its equivalent.
  • a set of isolated PCR tag primers according to an embodiment of the invention, each consisting of the nucleotides set forth in SEQ ID NOs: 145-160.
  • Each PCR tag primer in the set carries a specific DNA tag; by a PCR reaction using a PCR tag primer, the corresponding DNA tag can be introduced into the DNA of the sample or its equivalent.
  • PCR1.0 tag primer ( PCR index_N Primer ) sequence
  • ACGTGTGCTCTTCCGATCT ( 157) CAAGCAGAAGACGGCATACGAGATTGCAAGGTGTGACTGGAGTTCAG
  • Using the PCR tag primers of the present invention, a DNA tag can be efficiently introduced into the DNA of the sample or its equivalent, thereby enabling construction of a DNA tag library having a DNA tag.
  • The inventors have surprisingly found that, when DNA tag libraries containing various DNA tags are constructed separately for the same sample using PCR tag primers with different tags, the stability and reproducibility of the resulting sequencing data are very good.
  • two tags can be introduced into a DNA sample by a linker ligation and a PCR reaction using the above-described DNA tag linker and PCR tag primer.
  • A DNA tag linker and a PCR tag primer can be used to introduce different tags into a DNA sample. The sequences of the two introduced tags, together with the position of each tag in the PCR amplification product, make it possible to distinguish DNA tag libraries, so that DNA tag libraries of a plurality of DNA samples can be constructed at the same time. A large number of samples can then be pooled and sequenced together to meet the demands of high-throughput sequencing, thereby reducing the cost of sequencing.
  • the present invention provides a method of preparing a DNA tag library.
  • The method comprises the steps of: fragmenting a DNA sample to obtain DNA fragments; end-repairing the DNA fragments to obtain end-repaired DNA fragments; adding a base A to the 3' end of the end-repaired DNA fragments to obtain DNA fragments having a sticky end A; ligating the DNA fragments having the sticky end A to a DNA tag linker to obtain a ligation product carrying the DNA tag linker, wherein the DNA tag linker comprises one selected from the set of isolated DNA tags according to the above embodiments of the present invention; and subjecting the ligation product to a PCR reaction to obtain a PCR amplification product, wherein the PCR reaction uses a PCR tag primer that comprises a specific DNA tag, and the PCR amplification product comprises a target fragment, a DNA linker and a DNA tag, the sequence of the target fragment corresponding to the sequence of the DNA fragment.
  • the DNA tag of the embodiment of the present invention can be efficiently introduced into a DNA tag library constructed for sample DNA.
  • This allows the DNA tag library to be sequenced to obtain sequence information of the sample DNA and information on the DNA tag, thereby enabling differentiation of the source of the sample DNA.
  • The inventors have surprisingly found that, when DNA tag libraries containing various DNA tags are constructed for the same sample by the above method using different tag combinations (i.e., different combinations of the tags contained in the DNA tag linker and the PCR tag primer), the stability and reproducibility of the resulting sequencing data are very good.
  • the present invention also provides a DNA tag library obtained by the method of constructing a DNA tag library according to an embodiment of the present invention.
  • the present invention also provides a method of determining DNA sample sequence information.
  • A method of determining DNA sample sequence information comprises: constructing a DNA tag library of the DNA sample according to a method of constructing a DNA tag library according to an embodiment of the present invention; and sequencing the DNA tag library to determine the sequence information of the DNA sample. With this method, the sequence information of the DNA sample in the DNA tag library and the sequence information of the DNA tag can be obtained efficiently, so that the source of the DNA sample can be distinguished. Further, the inventors have surprisingly found that determining DNA sample sequence information by the method according to an embodiment of the present invention effectively reduces data output bias and allows a plurality of DNA tag libraries to be distinguished accurately.
  • the present invention also provides a method of determining sequence information of a plurality of DNA samples.
  • A DNA tag library mixture is constructed according to the method described above, wherein the mixture is composed of the DNA tag libraries of each of the plurality of samples, and different DNA samples carry DNA tags of mutually different, known sequences; the DNA tag library mixture is sequenced by Solexa sequencing technology to obtain the sequence information of the DNA samples and of the tags; and the sequence information of the DNA samples is classified based on the sequence information of the tags to determine the DNA sequence information of the plurality of samples.
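The classification step described above can be sketched as a simple lookup from tag sequence to sample. This is only an illustration under stated assumptions: the tag is taken to be the first 6 bases of each read, matching is exact, and the tag sequences and sample names are hypothetical (the real tags are those of Table 1, which is not reproduced in this section).

```python
from collections import defaultdict

TAG_LENGTH = 6  # the text describes tags 6 bp in length


def demultiplex(reads, tag_to_sample):
    """Group reads by sample using an exact match on the leading tag.

    Assumption: each read starts with its tag; reads whose tag is not in
    the lookup table are collected under "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:TAG_LENGTH], read[TAG_LENGTH:]
        sample = tag_to_sample.get(tag, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical tags and reads, for illustration only.
tag_to_sample = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
reads = ["ACGTACGGGTTT", "TGCATGAAACCC", "ACGTACTTTAAA", "NNNNNNACGT"]
result = demultiplex(reads, tag_to_sample)
for sample, inserts in sorted(result.items()):
    print(sample, inserts)
```

Exact matching is the simplest choice; a production pipeline would typically tolerate a sequencing error in the tag, which is one reason the tags are designed to differ from one another by several bases.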
  • The method according to an embodiment of the present invention makes full use of high-throughput sequencing technology, for example Solexa sequencing technology, by sequencing the DNA tag libraries of multiple samples simultaneously. This improves the efficiency and throughput of DNA tag library sequencing, and also the efficiency of determining the sequence information of multiple DNA samples.
  • A kit for constructing a DNA tag library, comprising 48 isolated oligonucleotides according to an embodiment of the present invention, each isolated oligonucleotide having a first strand consisting of the nucleotides shown in SEQ ID NO: (3N-1) and a second strand consisting of the nucleotides shown in SEQ ID NO: (3N), respectively.
  • FIG. 1 is a schematic diagram showing the construction of a DNA library of small-fragment DNA as provided by Illumina (see, for example, Preparing Samples for Sequencing Genomic DNA. Illumina protocol: Part # 11251892 Rev, incorporated herein by reference in its entirety);
  • FIG. 2 is a schematic flow chart showing a method for constructing a DNA tag library according to an embodiment of the present invention
  • FIG. 3 is a flow chart showing a method for constructing a DNA tag library according to an embodiment of the present invention
  • FIG. 4 shows the results of testing, with an Agilent Bioanalyzer 2100, a DNA tag library constructed by the method of constructing a DNA tag library according to an embodiment of the present invention.
  • The terms "first" and "second" are used for descriptive purposes only, and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, features defined as "first" and "second" may explicitly or implicitly include one or more of such features. Further, in the description of the present invention, "multiple" means two or more unless otherwise stated.
  • the present invention proposes a number of isolated DNA tags.
  • The DNA tags consist of the nucleotides shown in SEQ ID NO: (3N-2), respectively, where N is any integer from 1 to 48.
  • DNA as used in the present invention may be any polymer comprising deoxyribonucleotides including, but not limited to, modified or unmodified DNA.
  • a DNA tag library having a tag is obtained by linking the DNA tag to the DNA of the sample or its equivalent.
  • By sequencing the DNA tag library, the sequence of the sample DNA and the sequence of the tag can be obtained, and the sample source of the DNA can be accurately characterized based on the sequence of the tag.
  • a DNA tag library of a plurality of samples can be simultaneously constructed, and the DNA sequence of the sample can be classified based on the DNA tag by mixing and simultaneously sequencing the DNA tag library derived from different samples.
  • In the expression "the DNA tag is linked to the DNA of the sample or its equivalent", the equivalent is a nucleic acid of the same sequence (for example, the corresponding RNA sequence or cDNA sequence, which has the same sequence as the DNA).
  • The inventors of the present application found that designing an effective DNA tag requires several considerations. First, the tag sequences must be distinguishable from one another, with a high recognition rate. Second, when tags are mixed for fewer than 12 samples, the "GT" content at each base position across the mixed tags must be considered. In Solexa sequencing, the bases G and T are excited by the same light, as are the bases A and C, so the "GT" content and the "AC" content at each position must be balanced. A "GT" content of 50% guarantees the highest tag recognition rate and the lowest error rate. Finally, the repeatability and accuracy of the data output must be considered.
  • To construct and sequence DNA tag libraries efficiently, the set of DNA tags must give reliable and highly reproducible results: for the same DNA sample, DNA tag libraries constructed using different tags in the set should yield consistent sequencing results. In addition, runs of 3 or more consecutive identical bases in the tag sequence must be avoided, because they increase the error rate during synthesis or sequencing, and the DNA tag linker and the PCR tag primer should as far as possible not form hairpin structures themselves.
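Two of the design rules above, per-position "GT" balance and the avoidance of runs of 3 or more identical bases, lend themselves to a quick check. The sketch below is an illustration only: the function names and the example tags are hypothetical, and hairpin prediction (which the inventors did with dedicated software) is not attempted.

```python
def gt_fraction_per_position(tags):
    """Fraction of tags carrying G or T at each position of a pooled tag set."""
    length = len(tags[0])
    if any(len(tag) != length for tag in tags):
        raise ValueError("all tags must have the same length")
    return [
        sum(tag[i] in "GT" for tag in tags) / len(tags)
        for i in range(length)
    ]


def has_homopolymer_run(tag, run_length=3):
    """True if the tag contains run_length or more identical bases in a row."""
    run = 1
    for previous, current in zip(tag, tag[1:]):
        run = run + 1 if current == previous else 1
        if run >= run_length:
            return True
    return False


# Hypothetical 6 bp tags chosen so that every position is half G/T, half A/C.
pool = ["ACGTCA", "GTACTG"]
print(gt_fraction_per_position(pool))
print([has_homopolymer_run(tag) for tag in pool])
```

A pool whose per-position fractions all sit near 0.5 satisfies the balance rule described in the text; a tag for which `has_homopolymer_run` returns `True` would be rejected under the run-length rule.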
  • The inventors of the present application carried out extensive screening work and selected a set of isolated DNA tags according to an embodiment of the present invention, which respectively consist of the nucleotides shown in SEQ ID NO: (3N-2)
  • the sequence is as shown in Table 1 above and will not be described again.
  • These tags can be applied to the construction of any DNA tag library. There are currently no reports of these tags being used to construct libraries of DNA samples and sequence them by Solexa.
  • The DNA tag used is a nucleic acid sequence 6 bp in length, and the tags differ from one another by more than 3 bases. The set of DNA tags consists of at least 5, or at least 10, or at least 15, or at least 20, or at least 25, or at least 30, or at least 35, or at least 40, or at least 45, or all 48 of the 48 DNA tags shown in Table 1 or of DNA tags that differ from them by 1 base.
  • the set of DNA tags preferably includes at least DNA index-1 - DNA index-5, or DNA index-6 - DNA index- in 48 DNA tags shown in Table 1.
  • the 1 base difference comprises a substitution, addition or deletion of 1 base in the sequence of 48 tags shown in Table 1.
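The requirement that tags differ from one another by several bases can be checked with a pairwise Hamming distance. The sketch below is a hedged illustration: the three tags are hypothetical, and Hamming distance only captures substitutions, not the additions or deletions that the text also counts as a 1-base difference.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags):
    """Smallest Hamming distance between any two tags in the set."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


# Hypothetical 6 bp tags, for illustration only.
example_tags = ["ACGTCA", "GTACTG", "CATGAC"]
print(min_pairwise_distance(example_tags))
```

The larger the minimum distance, the more sequencing errors a tag can absorb before it becomes confusable with another tag in the set.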
  • the invention also provides the use of a tag according to an embodiment of the invention for the construction and sequencing of a DNA tag library.
  • the DNA tag linker of the DNA tag library comprises a DNA tag according to an embodiment of the present invention, thereby constituting a corresponding DNA tag linker.
  • The DNA tag is inserted into the DNA tag linker, preferably upstream of the "T" base at the 3' end of the DNA tag linker. The PCR tag primer of the DNA tag library likewise comprises a DNA tag according to an embodiment of the present invention, thereby constituting the corresponding PCR tag primer; in this use, the DNA tag is inserted into the PCR tag primer.
  • Oligonucleotides, PCR tag primers, and construction of DNA tag libraries
  • the present invention provides a set of isolated oligonucleotides which can be used to introduce the DNA tag described above into the DNA of a sample, thereby constructing a DNA tag library.
  • The invention provides a set of isolated oligonucleotides, each having a sticky end T and each having a first strand and a second strand; the sticky end T is formed on the first strand of each oligonucleotide.
  • the first strand is composed of the nucleotides represented by SEQ ID NO: (3N-1), and the second strand is composed of the nucleotides represented by SEQ ID NO: (3N), respectively.
  • the N values of the first strand and the second strand are the same, that is, when the corresponding nucleotides in the sequence listing are used as the first strand and the second strand, respectively, the core of the first strand is formed.
  • the corresponding oligonucleotides can be formed by annealing the first strand and the second strand constituting the corresponding oligonucleotide, respectively.
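The SEQ ID numbering scheme used throughout (tag at 3N-2, first strand at 3N-1, second strand at 3N, with the same N for both strands) can be made concrete with a small helper. The function name and dictionary keys are hypothetical; only the arithmetic comes from the text.

```python
def seq_ids_for_index(n):
    """Map a tag index N (1 to 48) to its three SEQ ID NO values."""
    if not 1 <= n <= 48:
        raise ValueError("N must be an integer from 1 to 48")
    return {
        "tag": 3 * n - 2,           # the 6 bp DNA tag itself
        "first_strand": 3 * n - 1,  # first strand of the tag linker
        "second_strand": 3 * n,     # second strand of the tag linker
    }


print(seq_ids_for_index(1))
print(seq_ids_for_index(48))
```

With 48 indices the linker-related entries occupy SEQ ID NO: 1 to 144, which is consistent with the PCR tag primers being listed at SEQ ID NOs: 145-160.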
  • Each of the above oligonucleotides carries a DNA tag according to the embodiments of the present invention as described above and has a sticky end, so the corresponding DNA tag can be introduced into the DNA of the sample or its equivalent by a ligation reaction. The sequences of these oligonucleotides are shown in Table 1 above and are not repeated here.
  • The oligonucleotide sequences (DNA tag linkers) provided according to an embodiment of the present invention have high stability. This finding is based primarily on an analysis of the structural stability of these oligonucleotide sequences with Lasergene software (http://www.dnastar.com/) in accordance with some embodiments of the present invention. Lasergene's PrimerSelect software determines the affinity between the two strands of a duplex by analyzing the energy of the structure they form, and thereby predicts the most stable dimer overall formed by the DNA tag linker and its energy value; the larger the absolute value of the energy (kcal/mol), the more stable the duplex. According to the examples of the present invention, this structural stability and affinity analysis was carried out on the 48 DNA tag linkers shown in Table 1 above, and the results showed that the Y-shaped structure formed by these DNA tag linkers is very stable.
  • The invention provides a DNA tag linker, wherein the DNA tag linker of the DNA tag library comprises the DNA tag described above; preferably, the DNA tag is inserted upstream of the "T" base at the 3' end of the tag linker, and the tag linker is preferably used as both the 5' and the 3' linker.
  • These DNA tag linkers include or consist of at least 5, or at least 10, or at least 15, or at least 20, or at least 25, or at least 30, or at least 35, or at least 40, or at least 45, or all 48 of the 48 DNA tag linkers shown in Table 1 or of DNA tag linkers whose DNA tag sequence differs from theirs by 1 base.
  • these DNA tag linkers preferably include at least DNA index-1F/R_adapter-DNA index-5F/R_adapter, or DNA index-6F/R_adapter in the 48 DNA tag linkers shown in Table 1.
  • a difference of 1 base includes substitution, addition or deletion of 1 base in the tag sequence.
  • The present invention also provides the use of a DNA tag linker for the construction and sequencing of a DNA tag library; preferably, the tag linker is used simultaneously as the 5' and the 3' linker of a tag library.
  • the present invention provides a set of isolated PCR tag primers which can be used to introduce a DNA tag as described above into the DNA of a sample, thereby constructing a DNA tag library.
  • the set of isolated PCR tag primers are composed of the nucleotides shown in SEQ ID NO: 145-160, respectively, and the sequence thereof is shown in Table 2 above, and will not be described herein.
  • the PCR tag primers respectively have a specific DNA tag
  • By a PCR reaction using a PCR tag primer, the corresponding DNA tag can be introduced into the DNA of the sample or its equivalent.
  • The PCR tag primers provided in accordance with embodiments of the present invention have high stability. This finding is based primarily on some embodiments of the invention in which Lasergene's PrimerSelect software was used to predict and analyze the hairpin structures, self-extending dimer structures, and self-dimer structures formed by each of the 16 PCR tag primers; the analysis indicates that the structure of these PCR tag primers is very stable.
  • The invention provides PCR tag primers comprising a specific DNA tag at the 3' end.
  • The PCR tag primers comprise or consist of at least 2, or at least 4, or at least 6, or at least 8, or at least 10, or at least 12, or at least 14, or all 16 of the 16 PCR tag primers shown in Table 2 or of PCR tag primers whose DNA tag sequence differs from theirs by 1 base.
  • These PCR tag primers preferably include at least, among the 16 tag primers shown in Table 2, PCR index_1 Primer - PCR index_2 Primer, or PCR index_3 Primer - PCR index_4 Primer, or PCR index_5 Primer - PCR index_6 Primer, or PCR index_7 Primer - PCR index_8 Primer, or PCR index_9 Primer - PCR index_10 Primer, or PCR index_11 Primer - PCR index_12 Primer, or PCR index_13 Primer - PCR index_14 Primer, or PCR index_15 Primer - PCR index_16 Primer, or any combination of two or more of these.
  • a difference of 1 base includes substitution, addition or deletion of 1 base in the tag sequence.
  • the use of PCR tag primers for DNA tag library construction and sequencing is also provided.
  • a DNA tag library constructed using the above DNA tag linker and PCR tag primer is also provided.
  • the present invention also provides a method of constructing a DNA tag library using the above DNA tag linker and PCR tag primer. Specifically, according to an embodiment of the present invention, referring to FIG. 2, the method includes:
  • a DNA sample is fragmented to obtain a DNA fragment.
  • the DNA sample is fragmented by ultrasonication.
  • the source of the DNA sample is not particularly limited.
  • the DNA sample is a Fosmid sample.
  • the inventors have found that a DNA tag library of various Fosmid samples can be efficiently constructed using the method according to an embodiment of the present invention.
  • the obtained DNA fragment is about 180 bp in length, thereby further improving the efficiency of constructing a DNA tag library and subsequent sequencing.
  • the DNA fragment is end-repaired to obtain a DNA fragment that has been repaired at the end.
  • End repair was performed using T4 DNA polymerase and Klenow polymerase according to an embodiment of the present invention.
  • base A is added to the 3' end of the end-repaired DNA fragment to obtain a DNA fragment having a sticky end A.
  • base A is added to the 3' end of the end-repaired DNA fragment using Klenow polymerase.
  • A DNA fragment having a sticky end A is ligated to a DNA tag linker to obtain a ligation product carrying the DNA tag linker, wherein the DNA tag linker comprises one selected from the set of isolated DNA tags according to the above-described embodiments of the present invention.
  • The DNA tag linker is one selected from the set of isolated oligonucleotides of the present invention. According to an embodiment of the invention, both ends of the DNA fragment are ligated to a DNA tag linker.
  • The DNA fragment having the sticky end A is linked to the DNA tag linker by ligating a DNA tag linker at the 3' end of both oligonucleotide strands of the DNA fragment having the sticky end A.
  • the ligation product is subjected to a PCR reaction to obtain a PCR amplification product
  • the PCR reaction uses a PCR tag primer
  • the PCR tag primer comprises a specific DNA tag
  • the PCR amplification product comprises a target fragment, a DNA linker, and a DNA tag, wherein the sequence of the target fragment corresponds to the sequence of the DNA fragment.
  • The expression "the sequence of the target fragment corresponds to the sequence of the DNA fragment" means that the sequence of the DNA fragment can be derived directly from the sequence of the target fragment. For example, the sequence of the target fragment may be identical to the sequence of the DNA fragment, or fully complementary to it, or may even differ by the addition or removal of a known number of known bases, as long as the sequence of the DNA fragment can be obtained by a limited calculation.
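As a small illustration of the "limited calculation" mentioned above, the sketch below tests whether a target sequence is identical to a fragment or is its reverse complement. The helper names are hypothetical, "fully complementary" is interpreted here as the reverse complement (an assumption), and the case of known added or removed bases is not handled.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def corresponds(target, fragment):
    """True if target equals fragment or its reverse complement."""
    return target == fragment or target == reverse_complement(fragment)


print(reverse_complement("AACG"))
print(corresponds("CGTT", "AACG"))
```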
  • The PCR tag primer for the PCR reaction is one selected from the set of PCR tag primers according to an embodiment of the present invention, and the other primer for the PCR reaction consists of the nucleotide sequence shown in SEQ ID NO: 161.
  • the obtained PCR amplification product is separated and recovered, and the PCR amplification product constitutes a DNA tag library.
  • The method for separating and recovering the amplification product is also not particularly limited; those skilled in the art can select an appropriate method and apparatus according to the characteristics of the amplification product, for example electrophoresis followed by recovery of the PCR amplification product of a specific length.
  • a PCR amplification product having a length of about 400 to 800 bp is preferably recovered.
  • the above method for constructing a DNA tag library according to an embodiment of the present invention is applicable to the preparation of any DNA library, and is particularly suitable for the preparation of a Fosmid library.
  • the present invention provides a method of constructing a Fosmid library, which comprises:
  • fragmentation method includes, but is not limited to, an ultrasonic disruption method
  • T4 DNA polymerase and Klenow polymerase are used for end repair, and the 5' end of the fragment is phosphorylated with T4 PNK; an "A" base is then added to the 3' end of the repaired DNA fragment using, for example but not limited to, Klenow polymerase;
  • Step 4: recovery and PCR reaction
  • Tag primers are used in the PCR reaction; preferably, the recovered and purified DNA is mixed together in groups, and different groups use different tag primers.
  • The tag linker described in the third step of the above method for constructing a Fosmid library according to an embodiment of the present invention is a tag linker according to an embodiment of the present invention.
  • The tag primer described in the fourth step of the method for constructing a Fosmid library according to an embodiment of the present invention is a tag primer according to an embodiment of the present invention.
  • the PCR reaction described in the fourth step of the method for constructing a Fosmid library according to an embodiment of the present invention also uses the primer Index PCR primer.
  • a DNA tag according to an embodiment of the present invention can be efficiently introduced into a DNA tag library constructed for a DNA sample.
  • the DNA tag library can be sequenced to obtain sequence information of the DNA sample and sequence information of the DNA tag, thereby enabling differentiation of the source of the DNA sample.
  • In the method of constructing a DNA tag library according to an embodiment of the present invention, the DNA tag linker and the PCR tag primer each contain a tag, and thus the DNA tag library constructed according to the method carries two kinds of tags.
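Because each library carries one linker tag and one PCR primer tag, the number of distinguishable tag pairs is the product of the two set sizes. The sketch below simply enumerates the pairs for the 48 tag linkers and 16 PCR tag primers described in the text; the label strings are stand-ins modelled on the document's naming, and whether every pair was actually used or validated is not stated in the source.

```python
from itertools import product

# Stand-in labels; the real sequences are in Tables 1 and 2 of the patent.
linker_tags = [f"DNA index-{n}" for n in range(1, 49)]          # 48 linkers
primer_tags = [f"PCR index_{m} Primer" for m in range(1, 17)]   # 16 primers

tag_pairs = list(product(linker_tags, primer_tags))
print(len(tag_pairs))

# Hypothetical assignment of the first few samples to distinct tag pairs.
samples = ["sample_1", "sample_2", "sample_3"]
assignment = dict(zip(samples, tag_pairs))
print(assignment)
```

In principle this gives 48 x 16 = 768 pairs, so far more samples can be pooled in one run than with either tag set alone.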
  • The method for constructing a DNA tag library provided by the present invention is significantly improved: it makes full use of a high-throughput sequencing platform, meets the needs of high-throughput sequencing, and saves sequencing resources, thereby reducing sequencing costs.
  • The way the tag is introduced is optimized so that the tag is introduced with only two PCR primers (the PCR1.0 tag primer and the PCR2.0 tag primer), which reduces the difficulty of the PCR reaction.
  • the specificity of the PCR amplification is improved, and the efficiency of the PCR amplification reaction is improved.
  • the invention also improves the recognition efficiency of the tag sequence, thereby improving the construction efficiency of the DNA tag library and reducing the cost of library construction.
  • see FIGS. 1 and 2, wherein FIG. 1 shows a flowchart of the small-fragment DNA library construction method provided by Illumina, and FIG. 2 shows the DNA tag library construction method according to an embodiment of the present invention. The method shown in FIG. 2 pools a plurality of libraries and processes them as a single library, reduces the number of purification steps, and can construct and sequence DNA libraries for a plurality of DNA samples (especially Fosmid samples), which saves library construction time and reduces sequencing costs.
  • the inventors have surprisingly found that when different combinations of DNA tag linker and PCR tag primer are used for the same sample, sequencing the DNA tag libraries containing the various DNA tags constructed by the above method gives sequencing data with very good stability and repeatability.
  • the present invention also provides a kit for constructing a DNA tag library.
  • a DNA tag according to an embodiment of the present invention can be conveniently introduced into a constructed DNA tag library.
  • other components for constructing a DNA tag library can also be included in the kit, and details are not described herein.
  • the present invention also provides a DNA tag library constructed according to the method of constructing a DNA tag library of the present invention.
  • the tagged DNA tag library can be effectively applied to high-throughput sequencing technologies such as Solexa technology, so that the obtained nucleic acid sequence information such as DNA sequence information can be accurately classified by sample source by obtaining a tag sequence.
  • the present invention also provides a method of determining DNA sample sequence information.
  • a method of determining DNA sample sequence information comprises: constructing a DNA tag library by the method for constructing a DNA tag library according to an embodiment of the present invention; and then sequencing the constructed DNA tag library to determine the sequence information of the DNA sample.
  • the sequence information of the DNA sample in the DNA tag library and the sequence information of the DNA tag can be efficiently obtained, thereby enabling differentiation of the source of the DNA sample.
  • the inventors have surprisingly found that the use of the method according to an embodiment of the present invention to determine DNA sample sequence information can effectively reduce the problem of data output bias and can accurately distinguish a plurality of DNA tag libraries.
  • the constructed DNA tag library can be sequenced by any known method, and the type thereof is not particularly limited. According to some examples of the invention, DNA tag libraries can be sequenced using Solexa sequencing technology. According to an embodiment of the present invention, suitable sequencing primers can be selected for sequencing according to specific conditions.
  • the present invention provides a method of determining sequence information for a plurality of DNA samples.
  • the method comprises the steps of: establishing a DNA tag library mixture by the method for constructing a DNA tag library according to an embodiment of the present invention, wherein the DNA tag library mixture consists of the DNA tag libraries of each of the plurality of samples, and different DNA samples are labeled with DNA tags of different and known sequences; sequencing the DNA tag library mixture using Solexa sequencing technology to obtain sequence information of the DNA samples and sequence information of the tags; and classifying the sequence information of the DNA samples based on the sequence information of the tags to determine the DNA sequence information of the plurality of samples.
  • the term “plurality” (also rendered “various”) as used herein means at least two.
  • the expression “DNA tags of different and known sequences” means that, of the two kinds of DNA tags contained in the DNA tag library constructed for one DNA sample, at least one differs from the two kinds of tags in the DNA tag library of any other sample, and that the sequence of each tag is known.
  • each DNA tag library contains three DNA tags of two kinds, and the positions of the three tags are fixed; the two kinds are the tag introduced by the DNA tag linker and the tag introduced by the PCR tag primer.
  • the three tags comprise two identical tags introduced by two identical DNA tag linkers and one tag introduced by the PCR tag primer; the DNA tag linkers and the PCR tag primer thus introduce two kinds of tags, three tags in total, into the DNA tag library.
  • the two tags in a DNA tag library may be the same or different.
  • a DNA tag library of a plurality of DNA samples is constructed such that the DNA tag library can be distinguished according to the difference in tag sequences in different DNA tag libraries after sequencing the DNA tag library.
  • hybrid sequencing of very large numbers of samples can be achieved in accordance with embodiments of the present invention.
  • the obtained DNA tag libraries of various samples are combined to obtain a DNA tag library mixture, and the obtained DNA tag library mixture is sequenced by Solexa sequencing technology, thereby obtaining sequence information of the DNA sample and sequence information of the tag.
  • sequence information of the DNA sample is classified based on the sequence information of the two tags to determine the sequence information of the plurality of DNA samples.
  • the method according to an embodiment of the present invention can make full use of high-throughput sequencing technology, for example, using Solexa sequencing technology to simultaneously sequence DNA libraries of various samples, thereby improving the efficiency and throughput of DNA library sequencing.
  • the efficiency of determining sequence information of a plurality of DNA samples can be improved.
  • the sequencing method and the sequencing primer used in the prior art have been described in detail above and will not be described again here.
  • the method of preparing the DNA tag library mixture is not particularly limited: the DNA tag libraries may be constructed independently and then mixed, or intermediate products of DNA tag library preparation may be mixed and the remaining preparation steps completed together to obtain a DNA tag library mixture containing a plurality of tags, as long as the sequences of the DNA tags for the different samples are known.
  • a DNA tag library mixture can be prepared by the following method: for each of the plurality of samples, independently constructing a DNA tag library by the method for constructing a DNA tag library of the present invention, wherein different DNA samples are labeled with DNA tags of different and known sequences; and combining the DNA tag libraries of the plurality of samples to obtain a DNA tag library mixture.
  • different tags can be introduced into a plurality of samples by a PCR reaction, thereby greatly increasing the number of samples to which the sequencing method can be applied.
  • End repair: prepare the digestion reaction system on each of the four 96-well plates according to the following table:
  • 10 μL each of the synthesized 100 μM DNA Index-NF_adapter and DNA Index-NR_adapter were mixed, held at 94 °C for 5 minutes, and then placed in a 50 °C water bath for 30 minutes to obtain the 50 μM DNA Index-N adapter annealing product.
  • The corresponding tag linker was added to each of the four 96-well plates according to the tag numbers shown in the table below (which correspond to the DNA Index-NF/R_adapters shown in Table 1 of the specification), and ligation was carried out overnight at 16 °C for 14-18 hours.
  • columns 2-6 of the third plate are pooled and named L-6; columns 7-11 of the third plate are pooled and named L-7; column 12 of the third plate and columns 1-4 of the fourth plate are pooled and named L-8; the remaining columns 5-12 of the fourth plate are pooled with the next batch of samples for library construction.
  • After 240 μl was taken, it was purified with Beckman Ampure magnetic beads. After 2% agarose gel electrophoresis (100 V, 2 h), the 400 bp and 800 bp fragments were recovered from the gel, purified with the QIAquick PCR Purification Kit, and finally dissolved in 30 μl of pure water.
  • the gels of the eight samples were recovered and purified, and the fragments of 400 bp and 800 bp were recovered and dissolved in 30 μl of ultrapure water.
  • the Q-PCR concentration of the PCR product was 2 nM, and fragment size was checked with the Agilent Bioanalyzer 2100; the size deviation was within ±50 bp. After library construction, eight PCR products of the same fragment size were mixed in equal amounts and could be sequenced using the Illumina HiSeq.
  • the eight PCR products of the 800 bp fragment size were combined and sequenced using the index PE101+8+101 cycle program.
  • the specific operation procedure is detailed in the Illumina HiSeq operating instructions.
  • the results of the sequencing run are as follows:
  • the insert size deviation is less than 50 bp, which meets the requirements, and the total data yield is greater than 10 G, indicating that library construction was successful.
  • the sequencing result produced by the Illumina HiSeq is a series of DNA sequences; the sequence information of the sample corresponding to each primer tag is established by searching the sequencing results for the linker sequence, the forward and reverse primer tag sequences, and the primer sequences.
  • E. coli DNA ligase is from NEBNext dsDNA Fragmentase
  • the 8 samples of B12, C12, D12, E12, F12, G12, HI are mixed together to obtain 200 microliters of each of the two tubes.
  • after 2% agarose gel electrophoresis (100 V, 2 h), the 400 bp and 800 bp fragments were recovered from the gel and purified, then amplified by PCR (different tag primers can also be added during amplification); after a second 2% agarose gel electrophoresis (100 V, 2 h), the bands were cut out and the DNA was recovered using the QIAquick PCR Purification Kit, giving 4 samples: A11-H11 400-450, A12-H12 400-450, A11-H11 800 and A12-H12 800.
  • the specific sample labeling information is as follows:
  • the Agilent Bioanalyzer 2100 test results for two of the samples shown in Figure 4 show that the DNA library fragment size meets the requirements; and the Q-PCR results above show that the concentration is greater than 1.0 nanomolar.
  • the above test results indicate that the 400 bp and 800 bp libraries constructed by the method for constructing a DNA tag library according to an embodiment of the present invention were all qualified.
  • the A11-H11 400-450 and A12-H12 400-450 two-tube samples can be mixed together in equal amounts, and the A11-H11 800 and A12-H12 800 two-tube samples can likewise be mixed together in equal amounts; that is, the 16 samples of the present embodiment are mixed together.
  • the method of sequence information and the kit for constructing a DNA tag library can be applied to DNA sequencing and can effectively improve the sequencing throughput of a sequencing platform such as the Solexa sequencing platform.
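
The classification of pooled reads by tag described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it assumes the linker tag and the PCR primer tag have already been extracted from each read, and the tag sequences, sample names and the `demultiplex` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample sheet: each sample is identified by the pair
# (linker tag, PCR primer tag). The sequences here are made up; the
# real tag sequences are those listed in the patent's sequence tables.
SAMPLE_SHEET = {
    ("ACGTAC", "TTAGGCAT"): "sample_A",
    ("ACGTAC", "GATCTGCA"): "sample_B",  # same linker tag as sample_A
    ("TGCATG", "TTAGGCAT"): "sample_C",  # same primer tag as sample_A
}


def demultiplex(reads, sample_sheet):
    """Group insert sequences by their (linker tag, primer tag) pair.

    `reads` is an iterable of (linker_tag, primer_tag, insert) tuples.
    Reads whose tag pair is not in the sample sheet go to "unassigned".
    """
    bins = defaultdict(list)
    for linker_tag, primer_tag, insert in reads:
        sample = sample_sheet.get((linker_tag, primer_tag), "unassigned")
        bins[sample].append(insert)
    return dict(bins)


reads = [
    ("ACGTAC", "TTAGGCAT", "GGGAAACCC"),
    ("ACGTAC", "GATCTGCA", "TTTCCCGGG"),
    ("TGCATG", "TTAGGCAT", "AAATTTGGG"),
    ("TGCATG", "GATCTGCA", "CCCGGGAAA"),  # tag pair not in the sample sheet
]
result = demultiplex(reads, SAMPLE_SHEET)
```

Because the lookup key is the pair of tags, two samples that share one tag are still separated as long as the other tag differs, which matches the requirement that at least one of the two kinds of tags differ between any two samples.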

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Medicinal Chemistry (AREA)
  • Microbiology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to DNA tags for constructing a DNA tag library, DNA tag adapters, PCR tag primers, a DNA tag library and a method for constructing it. The invention also relates to a method for determining sequence information of a DNA sample, a method for determining sequence information of a plurality of DNA samples, and a kit for constructing a DNA tag library. The DNA tags consist of the nucleotides shown in SEQ ID NO:(3N-2), where N is any integer from 1 to 48.
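
As a small worked example of the numbering rule in the abstract, the sketch below lists the SEQ ID NO values 3N-2 for N = 1 to 48. The function name `tag_seq_id` is hypothetical, and the sketch says nothing about what the remaining SEQ ID numbers contain.

```python
def tag_seq_id(n: int) -> int:
    """Return the SEQ ID NO of the DNA tag with index N, i.e. 3N - 2."""
    if not 1 <= n <= 48:
        raise ValueError("N must be an integer from 1 to 48")
    return 3 * n - 2


# SEQ ID NOs of all 48 tags: 1, 4, 7, ... up to 3 * 48 - 2 = 142
tag_seq_ids = [tag_seq_id(n) for n in range(1, 49)]
```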
PCT/CN2011/079897 2010-09-21 2011-09-20 DNA tags and use thereof Ceased WO2012037875A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010299247.X 2010-09-21
CN201010299247XA CN102409043B (zh) 2010-09-21 2010-09-21 Method for constructing a high-throughput, low-cost Fosmid library, and tags and tag adapters used therein

Publications (1)

Publication Number Publication Date
WO2012037875A1 (fr) 2012-03-29

Family

ID=45873440

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/079897 Ceased WO2012037875A1 (fr) 2010-09-21 2011-09-20 DNA tags and use thereof

Country Status (2)

Country Link
CN (1) CN102409043B (fr)
WO (1) WO2012037875A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106591956A (zh) * 2016-11-15 2017-04-26 上海派森诺医学检验所有限公司 Method for constructing a sequencing library

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
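CA2785718C (fr) 2010-01-19 2017-04-04 Verinata Health, Inc. Methods for determining the fraction of fetal nucleic acid in maternal samples
CN204440396U (zh) * 2012-04-12 2015-07-01 维里纳塔健康公司 Kit for determining fetal fraction
CN103290104B (zh) * 2013-01-23 2016-03-02 北京诺禾致源生物信息科技有限公司 Simple and inexpensive genomic sample fragmentation method for second-generation sequencing
CN105442051A (zh) * 2014-09-26 2016-03-30 深圳华大基因科技有限公司 Method for screening a gene library
CN104694635B (zh) * 2015-02-12 2017-10-10 北京百迈客生物科技有限公司 Method for constructing a high-throughput reduced-representation genome sequencing library
CN105671644A (zh) * 2016-02-26 2016-06-15 武汉冰港生物科技有限公司 Method for preparing a genomic pooled-sample sequencing library
CN114717662A (zh) * 2022-04-20 2022-07-08 深圳市易基因科技有限公司 Library construction method, kit and sequencing method for methylation of trace cell-free DNA
CN118957057B (zh) * 2024-09-23 2025-04-04 首都医科大学附属北京儿童医院 Detection of MAP2K1 gene mutations in Langerhans cell histiocytosis by high-depth targeted sequencing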

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1833034A (zh) * 2003-06-20 2006-09-13 埃克斯魁恩公司 Probes, libraries and kits for analyzing mixtures of nucleic acids, and methods for constructing the same
WO2008093098A2 (fr) * 2007-02-02 2008-08-07 Illumina Cambridge Limited Methods for indexing samples and sequencing multiple nucleotide templates
CN101434988A (zh) * 2007-11-16 2009-05-20 深圳华因康基因科技有限公司 High-throughput oligonucleotide sequencing method
WO2010030683A1 (fr) * 2008-09-09 2010-03-18 Rosetta Inpharmatics Llc Methods of generating gene-specific libraries
WO2010053587A2 (fr) * 2008-11-07 2010-05-14 Mlc Dx Incorporated Methods for monitoring diseases by sequence analysis

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
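CN100564618C (zh) * 2007-06-13 2009-12-02 北京万达因生物医学技术有限责任公司 Molecular replacement tag sequencing parallel detection method, i.e. oligonucleotide code tag molecular library microsphere array analysis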

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DONG, CHUNSHENG ET AL.: "A Simple Method Quickly to Construct the DNA Virus Sequencing Library", BIOTECHNOLOGY, vol. 15, no. 4, 31 August 2005 (2005-08-31), pages 34 - 36 *
NG, PATRICK ET AL.: "Multiplex sequencing of paired-end ditags (MS-PET): a strategy for the ultra-high-throughput analysis of transcriptomes and genomes. Art e84", NUCLEIC ACIDS RESEARCH, vol. 34, no. 12, 13 July 2006 (2006-07-13) *
ZHANG, JUNCHENG ET AL.: "Construction of Wheat BAC Shotgun Library", JOURNAL OF TRITICEAE CROPS, vol. 27, no. 3, 31 December 2007 (2007-12-31), pages 374 - 377 *

Also Published As

Publication number Publication date
CN102409043B (zh) 2013-12-04
HK1168625A1 (en) 2013-01-04
CN102409043A (zh) 2012-04-11

Similar Documents

Publication Publication Date Title
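WO2012037875A1 (fr) DNA tags and use thereof
CN102653784B (zh) Tags for multiplex nucleic acid sequencing and methods of using the same
CN106795514B (zh) Bubble adapters and their use in nucleic acid library construction and sequencing
WO2012037882A1 (fr) DNA tags and use thereof
CN105506125B (zh) DNA sequencing method and second-generation sequencing library
CN105400776B (zh) Oligonucleotide adapters and their use in constructing single-stranded circular libraries for nucleic acid sequencing
CN102409049B (zh) PCR-based method for constructing a DNA tag library
CN102181533B (zh) Multi-sample pooled sequencing method and kit
WO2012037876A1 (fr) DNA index and application thereof
CN108138228B (zh) High-molecular-weight DNA sample tracking tags for next-generation sequencing
CN110036117A (zh) Method for increasing the throughput of single-molecule sequencing by concatenating short DNA fragments
WO2013056640A1 (fr) Method for preparing a nucleic acid library, uses thereof, and related kits
WO2012068919A1 (fr) DNA library and method for preparing the same, and method and device for detecting SNPs
CN102690809A (zh) DNA tags and their use in constructing and sequencing paired-end tag libraries
CN102839168A (zh) Nucleic acid probe, and preparation method and use thereof
WO2012037884A1 (fr) DNA tags and use thereof
CN110628890A (zh) Sequencing quality-control standards, and applications and products thereof
CN107604046A (zh) Second-generation sequencing method with dual-molecule self-checking library preparation and hybridization capture for detecting ultra-low-frequency mutations in trace DNA
WO2018148289A2 (fr) Duplex adapters and duplex sequencing
CN106676099B (zh) Method and kit for constructing a reduced-representation genomic library
CN112359093B (zh) Method and kit for library preparation and expression quantification of cell-free miRNA in blood
CN103571822B (zh) Method for enriching multiple target DNA fragments for next-generation sequencing analysis
WO2012037881A1 (fr) Nucleic acid tags and uses thereof
CN104232626A (zh) Barcodes in reduced-representation genome sequencing libraries and design method thereof
CN111471746A (zh) NGS library preparation adapters for detecting samples with low mutation abundance, and preparation method thereof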

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11826402

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.08.2013.)

122 Ep: pct application non-entry in european phase

Ref document number: 11826402

Country of ref document: EP

Kind code of ref document: A1