CN116814752A - Method for detecting lentivirus integration sites and application thereof
- Publication number
- CN116814752A (application number CN202310063306.0A)
- Authority
- CN
- China
- Prior art keywords
- primer
- round
- sequencing
- sequence
- integration
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/30—Detection of binding sites or motifs
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
Abstract
The invention discloses a method for detecting lentivirus integration sites and an application thereof. The detection method comprises the following steps: S100, extracting DNA and fragmenting it; S200, performing end repair and A-tailing on the fragmented DNA and ligating an asymmetric adapter using the adapter primers; S300, using the product obtained in step S200 as a template, amplifying with a first round of primers in the presence of a blocker; S400, using the PCR product of step S300 as a template, amplifying with a second round of primers in the presence of a blocker, while introducing the sequencing primer binding sequence of the sequencing platform; S500, sequencing the PCR product of step S400; S600, filtering the sequencing data; S700, identifying lentivirus integration events from the filtered data. The invention is based on linker-mediated PCR and enriches insertion sites through two-step PCR; compared with other library construction methods, it is simple to perform, has a short experimental cycle, and is low in cost. In addition, introducing a blocker that targets the proviral sequence blocks amplification of uninformative 5' LTR integration-site reads and improves detection sensitivity.
Description
Technical Field
The invention relates to the field of gene therapy and cell therapy drug development, and in particular to a method for detecting lentivirus integration sites and an application thereof.
Background
Lentiviruses are one class of retroviruses. Existing lentiviral vectors are derived from several species, including human immunodeficiency virus (HIV) and simian immunodeficiency virus (SIV). A lentiviral vector is constructed from the lentivirus genome by removing the genes that are nonessential for its expression and replacing them with a therapeutic gene. Lentiviral vectors are stably expressed from the host cell chromosome over the long term and, unlike other retroviruses, can infect non-dividing as well as dividing cells. Because of these advantages, lentiviral vectors are widely used as foreign gene transfer vectors in clinical gene therapy.
The clear advantages of lentiviral vectors are high gene transfer efficiency and a higher integration rate than other gene transfer methods. However, the integration site is one of the factors that determine the expression level of the foreign gene, and a lentiviral vector cannot insert the target gene at a specific site; insertion is random. One risk of lentiviral vector-mediated gene transduction is therefore integration into a proto-oncogene or tumor suppressor gene of the host cell, causing insertional mutagenesis that activates the oncogene or inactivates the tumor suppressor gene and thereby induces cancer. These potential side effects have attracted great attention, so the safety of lentiviral vector-mediated gene transfer needs to be evaluated.
Methods for evaluating random lentiviral insertion have developed alongside gene and cell therapy applications. The earliest methods for identifying viral insertion sites were Southern hybridization and proviral PCR, both of which lacked sufficient sensitivity and specificity. The introduction of linear amplification-mediated PCR (LAM-PCR) was therefore a significant advance in insertion site identification, and existing viral insertion site (VIS) analysis is based on this approach. LAM-PCR can detect rare insertion sites in complex samples such as peripheral blood, but the method must account for the biased cutting frequency of restriction enzyme recognition sites, which carries a risk of technical error. With the development of sequencing technology, LAM-PCR combined with high-throughput sequencing has become an effective way to identify insertion sites: the known region is labeled with biotin so that insertion sites can be enriched and integration safety and preference analyzed. However, LAM-PCR relies on linear amplification, which takes a long time, and the downstream biotin capture procedure is complicated, so the experimental cycle is long and operating errors are likely. In addition, biotin-labeled primers increase synthesis cost and working time.
Disclosure of Invention
The invention aims to establish a highly sensitive method for detecting lentivirus insertion sites that is simple to perform, has a short experimental cycle, and is low in cost compared with other library construction methods.
The invention adopts the following technical scheme:
A method for detecting lentivirus integration sites, comprising the following steps:
S100, extracting host genomic DNA and fragmenting it into fragments of 200-400 bp;
S200, performing end repair and A-tailing on the fragmented host genomic DNA, and ligating an asymmetric adapter;
S300, using the product obtained in step S200 as a template, amplifying with a first round of primers in the presence of a blocker, thereby enriching the integrated DNA for the first time;
S400, using the PCR product obtained in step S300 as a template, amplifying with a second round of primers in the presence of a blocker, thereby enriching the integrated DNA for the second time, while introducing the sequencing primer binding sequence of the sequencing platform;
S500, sequencing the PCR product of step S400;
S600, filtering the sequencing data;
S700, identifying lentivirus integration events from the filtered data.
The invention is based on linker-mediated PCR (LM-PCR) and enriches insertion sites through two-step PCR; compared with other library construction methods, it is simple to perform, has a short experimental cycle, and is low in cost. In addition, introducing a specific probe (blocker) that targets the provirus blocks amplification of uninformative 5' LTR integration-site reads and improves detection sensitivity.
Preferably, the adapter primers are: an upstream adapter primer whose nucleotide sequence is shown in SEQ ID NO: 1, and a downstream adapter primer whose nucleotide sequence is shown in SEQ ID NO: 2.
Preferably, the nucleotide sequence of the blocker is shown in SEQ ID NO: 3.
Preferably, the first round of primers in step S300 and the second round of primers in step S400 are nested primers.
Preferably, the first round of primers are: LTRF1, whose nucleotide sequence is shown in SEQ ID NO: 4; and ADR1, whose nucleotide sequence is shown in SEQ ID NO: 5.
Preferably, for the second round of primers, the core sequence of the upstream primer is 5'-GTAACTAGAGATCCCTCAGACCCTTTTA-3', as shown in SEQ ID NO: 6, and the core sequence of the downstream primer is 5'-TACCGGACCGAAGGAGCTAA-3', as shown in SEQ ID NO: 7.
Preferably, in the second round of primers, the upstream primer carries a P5 primer sequence, a P5-end index sequence, and a sequencing primer sequence added to the 5' end of its core sequence, and the downstream primer carries a P7 primer sequence, a P7-end index sequence, and a sequencing primer sequence added to the 5' end of its core sequence.
Further preferably, in the second round of primers, the upstream primer is any one of LTRFneststepi501, LTRFneststepi502, LTRFneststepi503, LTRFneststepi504, LTRFneststepi505, LTRFneststepi506, LTRFneststepi507, and LTRFneststepi508, whose nucleotide sequences are shown in SEQ ID NOs: 8-15; the downstream primer is any one of ADR2-2step(p51), ADR2-2step(p52), ADR2-2step(p53), ADR2-2step(p54), ADR2-2step(p55), ADR2-2step(p56), ADR2-2step(p57), ADR2-2step(p58), ADR2-2step(p59), ADR2-2step(p60), ADR2-2step(p61), and ADR2-2step(p62), whose nucleotide sequences are shown in SEQ ID NOs: 16-27. The upstream and downstream primers each contain an 8-base index sequence. Different combinations of upstream and downstream primers (up to 96) label the samples with different indexes, so up to 96 samples can be built into libraries and sequenced in the same run.
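The figure of 96 follows from pairing each of the 8 upstream primers with each of the 12 downstream primers. A minimal sketch of that arithmetic is given below; the primer names follow the spelling used in Table 7, and the helper name `index_pairs` is hypothetical and not part of the disclosed method.

```python
from itertools import product

# Primer names as spelled in Table 7; the lists are generated for illustration only.
UPSTREAM = [f"LTRFneststepi{n}" for n in range(501, 509)]   # 8 upstream primers
DOWNSTREAM = [f"ADR2-2step(p{n})" for n in range(51, 63)]   # 12 downstream primers


def index_pairs():
    """Return every upstream/downstream primer pairing (hypothetical helper)."""
    return list(product(UPSTREAM, DOWNSTREAM))


pairs = index_pairs()
print(len(pairs))  # 8 * 12 = 96
```

Each pairing corresponds to one distinguishable sample label in a pooled run.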
Preferably, the design principles of the blocker are as follows:
the length is about 15-35 nt;
it is 3-10 bp longer than the first-round primer and the second-round primer.
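The two design rules can be expressed as a simple length check. The sketch below is illustrative only: the function name `blocker_length_ok` is hypothetical, the rule "3-10 longer" is assumed to apply to every primer passed in, and the lengths in the usage lines are made-up values rather than the lengths of the sequences in the sequence listing.

```python
def blocker_length_ok(blocker_len, primer_lens, min_len=15, max_len=35, extra=(3, 10)):
    """Check the two stated design rules for a candidate blocker length.

    blocker_len : length of the candidate blocker in nt
    primer_lens : lengths of the primers the blocker is compared with
    Assumption: the blocker must be 3-10 nt longer than each listed primer.
    """
    if not (min_len <= blocker_len <= max_len):
        return False
    low, high = extra
    return all(low <= blocker_len - p <= high for p in primer_lens)


# Illustrative lengths only (not taken from the sequence listing).
print(blocker_length_ok(32, [27, 28]))  # within 15-35 nt and 4-5 nt longer
print(blocker_length_ok(40, [27, 28]))  # exceeds the 35 nt upper bound
```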
Preferably, step S600 includes filtering low-quality reads and filtering on lentiviral signature sequences.
Preferably, in step S600, the data are filtered as follows:
a. trimming adapter sequences from the sequencing reads;
b. trimming bases with a quality score below 20 from both ends of the reads;
c. deleting N bases at the tail end of the reads;
d. removing paired-end reads in which either read is shorter than 20 bp after trimming;
e. selecting read1 sequences that contain the LTR sequence "ATCTCTAGCA", with mismatches ≤ 3 bp, as the positioning rule;
f. selecting read2 sequences that contain the asymmetric adapter sequence "GCTCTTCCGATC", with mismatches ≤ 3 bp.
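Rules b to f can be sketched in a few functions. This is a minimal illustration, not the tool used in the patent: the function names are hypothetical, mismatches are assumed to be substitutions only (Hamming distance), and adapter trimming (rule a) is assumed to have been done beforehand by a dedicated trimming tool.

```python
LTR_MOTIF = "ATCTCTAGCA"        # LTR sequence from rule e
ADAPTER_MOTIF = "GCTCTTCCGATC"  # asymmetric adapter (linker) sequence from rule f


def trim_low_quality(seq, quals, min_q=20):
    """Rule b: trim bases with quality below min_q from both ends."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


def strip_tail_n(seq):
    """Rule c: remove N bases at the tail end of a read."""
    return seq.rstrip("N")


def contains_motif(seq, motif, max_mismatch=3):
    """True if any window of seq matches motif with <= max_mismatch substitutions."""
    m = len(motif)
    for i in range(len(seq) - m + 1):
        mismatches = sum(a != b for a, b in zip(seq[i:i + m], motif))
        if mismatches <= max_mismatch:
            return True
    return False


def keep_pair(read1, read2, min_len=20):
    """Rules d-f applied to an already trimmed read pair."""
    if len(read1) < min_len or len(read2) < min_len:
        return False
    return contains_motif(read1, LTR_MOTIF) and contains_motif(read2, ADAPTER_MOTIF)
```

A pair is kept only when both reads survive the length cut-off and each carries its expected anchor sequence.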
Preferably, an integration event is defined as one that satisfies the following conditions:
a. read1 aligns to the reference genome over a length of ≥ 55 bp;
b. read2 aligns to the reference genome over a length of ≥ 55 bp;
c. the two reads map to opposite strands;
d. integration events within a 300 bp range are merged into a single integration event;
e. the number of supporting reads is ≥ 10.
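Conditions d and e amount to clustering nearby candidate sites and applying a read-count threshold. The sketch below assumes the candidate sites have already passed conditions a to c; the function names are hypothetical, "within 300 bp" is interpreted as within 300 bp of the previous site in the cluster, and reporting the best-supported position of each cluster is a choice made here, not something the text specifies.

```python
from collections import defaultdict


def _summarise(chrom, cluster):
    """Collapse a cluster of (pos, reads) into one event at its best-supported position."""
    total = sum(reads for _, reads in cluster)
    best_pos = max(cluster, key=lambda item: item[1])[0]
    return (chrom, best_pos, total)


def call_integration_events(sites, window=300, min_reads=10):
    """Merge candidate sites within `window` bp and keep events with >= min_reads reads.

    sites: iterable of (chrom, pos, reads) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, reads in sites:
        by_chrom[chrom].append((pos, reads))

    events = []
    for chrom in sorted(by_chrom):
        cluster = []
        for pos, reads in sorted(by_chrom[chrom]):
            # Start a new cluster when the gap to the previous site exceeds the window.
            if cluster and pos - cluster[-1][0] > window:
                events.append(_summarise(chrom, cluster))
                cluster = []
            cluster.append((pos, reads))
        if cluster:
            events.append(_summarise(chrom, cluster))
    return [event for event in events if event[2] >= min_reads]
```

With made-up input such as two sites 100 bp apart carrying 6 and 5 reads, the two are merged into one event with 11 reads, which then passes the threshold of 10.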
In another aspect, the invention provides an application of the above method for detecting lentivirus integration sites: the method is used to evaluate the safety of lentiviral vector-mediated gene transfer.
According to the above technical scheme, the detection method introduces a specific probe (blocker) that targets the proviral sequence (provirus) to block amplification of uninformative 5' LTR integration-site reads, suppressing useless integration-site information and improving sensitivity more than 2-fold for the same amount of sequencing data. By introducing unique molecular tags, the method not only detects the genomic position of each integration site but can also accurately quantify the copy number of the integrated gene; in addition, the unique molecular tag uses a three-base random design, which also allows sequencing errors to be corrected.
Drawings
FIG. 1 is a schematic diagram of a method for detecting lentivirus integration sites.
In the figure, host DNA: genomic DNA; 5' LTR and 3' LTR: lentiviral signature sequences, recognized by integrase; provirus: the foreign sequence carried by the virus; blocker: an oligonucleotide that inhibits amplification of the foreign sequence.
FIG. 2 shows the sensitivity experiment: the correlation between the theoretical integration ratio of the cell line and the measured ratio. R² = 0.9986.
FIG. 3 shows the results of Sanger verification of the detection method of the invention.
Detailed Description
The present invention will be further described with reference to specific examples, which are not intended to be limiting, so that those skilled in the art will better understand the present invention and practice it.
Example 1
Sensitivity experiments on cell lines
A cell line (110621P 3 MX-6) with known integration site information and human genomic DNA were used as sensitivity test samples. Specifically, the cell line DNA was spiked into human genomic DNA at 5%, 1%, 0.5%, 0.1% and 0.05%; the dilution scheme is given in Table 1 below.
TABLE 1 Dilution of target cell line DNA in human genomic DNA
| Gradient | Solution 1 | Solution 2 |
| 50% | 50 μL cell line DNA | 50 μL human genomic DNA |
| 5% | 10 μL of 50% | 90 μL human genomic DNA |
| 1% | 20 μL of 5% | 80 μL human genomic DNA |
| 0.50% | 50 μL of 1% | 50 μL human genomic DNA |
| 0.10% | 20 μL of 0.5% | 80 μL human genomic DNA |
| 0.05% | 50 μL of 0.1% | 50 μL human genomic DNA |
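The serial dilution in Table 1 can be checked with a short calculation. The sketch below assumes the cell line DNA and the human genomic DNA are at the same concentration, so that volume fractions equal DNA fractions; the names `dilute` and `dilution_series` are hypothetical.

```python
# (label, volume of previous solution in uL, volume of human genomic DNA in uL), from Table 1
STEPS = [
    ("50%", 50, 50),
    ("5%", 10, 90),
    ("1%", 20, 80),
    ("0.50%", 50, 50),
    ("0.10%", 20, 80),
    ("0.05%", 50, 50),
]


def dilute(stock_fraction, stock_ul, diluent_ul):
    """Fraction of cell line DNA after mixing stock with diluent (equal concentrations assumed)."""
    return stock_fraction * stock_ul / (stock_ul + diluent_ul)


def dilution_series(start_fraction=1.0):
    """Walk through Table 1, each step diluting the product of the previous step."""
    fraction = start_fraction
    series = []
    for label, stock_ul, diluent_ul in STEPS:
        fraction = dilute(fraction, stock_ul, diluent_ul)
        series.append((label, fraction))
    return series


for label, fraction in dilution_series():
    print(f"{label:>6}  computed fraction = {fraction:.4%}")
```

Each computed fraction matches the gradient label in the first column of the table.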
Library construction, sequencing, and bioinformatic analysis were performed as follows:
1) Extraction and fragmentation of genomic DNA from lentivirus-transfected cells
The diluted sample DNA (5%, 1%, 0.50%, 0.10%, 0.05%) and human genomic DNA were fragmented, with the fragment length set to 200-400 bp. Two replicates were set up for each sample, giving a total of 12 samples for library construction.
2) End repair and A-tailing of the fragmented genomic DNA, and ligation of asymmetric adapters at both ends
End repair, A-tailing, and adapter ligation of the fragmented DNA were performed with the Universal DNA Library Prep Kit for Illumina V3. For end repair, End Prep Mix 4 was thawed and mixed by inversion, and the reaction system shown in Table 2 below was prepared in a sterile PCR tube.
TABLE 2 End repair reaction system
| Component | Volume/μL |
| Input DNA | 50 |
| End Prep Mix 4 | 15 |
Reaction program: 20 °C for 15 min, then 65 °C for 15 min.
For adapter ligation, the reaction system shown in Table 3 below was prepared in the PCR tube from the End Preparation step.
TABLE 3 Adapter ligation reaction system
| Component | Volume/μL |
| End Preparation product | 65 |
| Rapid Ligation buffer | 25 |
| Rapid DNA ligase | 5 |
| VIS-adapter mixture | 5 |
The sequences of the adapter primer pair adaptor F and adaptor R are as follows:
adaptor F sequence (5'-3'): GATCGGAAGAGCHHHHHHHHHHHHHHHTTAGCTCCTTCGGTCCTCC,
adaptor R sequence (5'-3'): GCTCTTCCGATCT.
Reaction program: 20 °C for 15 min, then hold at 4 °C.
The adapter ligation product was purified with magnetic beads before proceeding to the next step.
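In adaptor F, H is the IUPAC code for A, C, or T. The sketch below assumes that this run of H bases is the unique molecular tag mentioned in the disclosure and shows one way to pull it out of a sequence that contains the adaptor. The function names are hypothetical, and which strand of the adaptor appears in a given read depends on the library orientation, which is not specified here.

```python
import re

ADAPTOR_F = "GATCGGAAGAGCHHHHHHHHHHHHHHHTTAGCTCCTTCGGTCCTCC"


def adaptor_regex(adaptor):
    """Build a regex in which the degenerate H run (A, C, or T) is a capture group."""
    run = re.search(r"H+", adaptor)
    left, right = adaptor[:run.start()], adaptor[run.end():]
    tag_len = run.end() - run.start()
    return re.compile(f"{left}([ACT]{{{tag_len}}}){right}")


def extract_tag(seq, pattern=adaptor_regex(ADAPTOR_F)):
    """Return the tag between the fixed adaptor flanks, or None if the adaptor is absent."""
    hit = pattern.search(seq)
    return hit.group(1) if hit else None
```

Reads sharing the same integration position and the same tag could then be counted once, which is the usual purpose of such tags.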
3) Primers were designed against the asymmetric adapter and the fixed lentiviral sequence for amplification, enriching the integrated DNA for the first time
VAHTS HiFi Amplification Mix was thawed and mixed by inversion, and the reaction system shown in Table 4 below was prepared in a sterile PCR tube.
TABLE 4 First-round PCR amplification system
| Component | Volume/μL |
| Purified adapter ligation product | 23 |
| VAHTS HiFi Amplification Mix | 25 |
| LTRF1(10μM) | 1 |
| ADR1(10μM) | 1 |
The sequences of the first-round amplification primers LTRF1 and ADR1 are as follows:
LTRF1 sequence (5'-3'): TGTGACTCTGGTAACTAGAGATCCCTC,
ADR1 sequence (5'-3'): ACCGCTTGGCCTCCGACTT.
The reaction program is shown in Table 5:
TABLE 5 First-round PCR amplification reaction program
After library amplification, the PCR product was purified with magnetic beads.
4) Semi-nested primers were designed against the asymmetric adapter and the internal lentiviral sequence to enrich the integrated DNA a second time. At the same time, the sequencing primer binding sequence of the sequencing platform was introduced as a 5' tag on the primers.
VAHTS HiFi Amplification Mix was thawed and mixed by inversion, and the reaction system shown in Table 6 below was prepared in a sterile PCR tube.
TABLE 6 Second-round PCR amplification reaction system
| Component | Volume/μL |
| Purified adapter ligation product | 4 |
| VAHTS HiFi Amplification Mix | 25 |
| LTRFnest(10μM) | 1 |
| ADR2-2(10μM) | 1 |
| Water | 19 |
The correspondence between the samples of example 2 and the second-round PCR amplification primers is shown in Table 7.
TABLE 7 second round PCR amplification primers
| Sample | LTRFnest primer | ADR2-2 primer |
| 5% | LTRFneststepi506 | ADR2-2step(p59) |
| 5% | LTRFneststepi506 | ADR2-2step(p60) |
| 1% | LTRFneststepi506 | ADR2-2step(p61) |
| 1% | LTRFneststepi506 | ADR2-2step(p62) |
| 0.50% | LTRFneststepi504 | ADR2-2step(p51) |
| 0.50% | LTRFneststepi504 | ADR2-2step(p52) |
| 0.10% | LTRFneststepi507 | ADR2-2step(p51) |
| 0.10% | LTRFneststepi507 | ADR2-2step(p52) |
| 0.05% | LTRFneststepi507 | ADR2-2step(p53) |
| 0.05% | LTRFneststepi507 | ADR2-2step(p54) |
| 10 ng/μL HGD | LTRFneststepi504 | ADR2-2step(p54) |
| 10 ng/μL HGD | LTRFneststepi505 | ADR2-2step(p55) |
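For pooled sequencing, each library needs its own combination of upstream and downstream primers. The sketch below transcribes Table 7 and checks that no combination is used twice; the function name `duplicate_pairs` is hypothetical.

```python
from collections import Counter

# (sample, LTRFnest primer, ADR2-2 primer), transcribed from Table 7
ASSIGNMENTS = [
    ("5%", "LTRFneststepi506", "ADR2-2step(p59)"),
    ("5%", "LTRFneststepi506", "ADR2-2step(p60)"),
    ("1%", "LTRFneststepi506", "ADR2-2step(p61)"),
    ("1%", "LTRFneststepi506", "ADR2-2step(p62)"),
    ("0.50%", "LTRFneststepi504", "ADR2-2step(p51)"),
    ("0.50%", "LTRFneststepi504", "ADR2-2step(p52)"),
    ("0.10%", "LTRFneststepi507", "ADR2-2step(p51)"),
    ("0.10%", "LTRFneststepi507", "ADR2-2step(p52)"),
    ("0.05%", "LTRFneststepi507", "ADR2-2step(p53)"),
    ("0.05%", "LTRFneststepi507", "ADR2-2step(p54)"),
    ("HGD", "LTRFneststepi504", "ADR2-2step(p54)"),
    ("HGD", "LTRFneststepi505", "ADR2-2step(p55)"),
]


def duplicate_pairs(assignments):
    """Return primer combinations assigned to more than one library."""
    counts = Counter((up, down) for _, up, down in assignments)
    return [pair for pair, n in counts.items() if n > 1]


print(duplicate_pairs(ASSIGNMENTS))  # [] means every library has its own combination
```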
The reaction procedure is shown in Table 8:
TABLE 8 second round PCR amplification reaction procedure
The PCR products were purified by magnetic beads after library amplification.
5) Sequencing on the instrument
6) Filtering of low-quality reads and of lentiviral signature sequences
Illumina sequencing adapter sequences were removed from the sequencing data with a bioinformatics tool, and low-quality reads were filtered out. Rules were set to select read1 sequences containing the LTR and read2 sequences containing the asymmetric adapter.
The data were filtered as follows:
a. trimming adapter sequences from the sequencing reads;
b. trimming bases with a quality score below 20 from both ends of the reads;
c. deleting N bases at the tail end of the reads;
d. removing paired-end reads in which either read is shorter than 20 bp after trimming;
e. selecting read1 sequences that contain the LTR sequence "ATCTCTAGCA", with mismatches ≤ 3 bp, as the positioning rule;
f. selecting read2 sequences that contain the asymmetric adapter sequence "GCTCTTCCGATC", with mismatches ≤ 3 bp.
7) Discrimination of integration events
The sequencing data were aligned to the human reference genome with Bowtie2, integration event rules were applied, and the chromosomal positions of the integration sites were counted. Integration site information was annotated and summarized using databases such as Ensembl and UCSC. An integration event is defined as one that satisfies the following conditions:
a. read1 aligns to the reference genome over a length of ≥ 55 bp;
b. read2 aligns to the reference genome over a length of ≥ 55 bp;
c. the two reads map to opposite strands;
d. integration events within a 300 bp range are merged into a single integration event;
e. the number of supporting reads is ≥ 10.
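Conditions a to c can be evaluated per read pair from the alignment records. The sketch below assumes SAM-format output, takes the aligned length to be the sum of the M, = and X operations in the CIGAR string (an interpretation, since the text does not define how the length is measured), and reads the strand from SAM flag bit 0x10. The function names are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_read_length(cigar):
    """Number of read bases aligned to the reference (M, = and X operations)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "M=X")


def is_reverse(flag):
    """SAM flag bit 0x10 marks a read aligned to the reverse strand."""
    return bool(flag & 0x10)


def pair_supports_event(cigar1, flag1, cigar2, flag2, min_len=55):
    """Conditions a-c: both reads align over >= min_len bp and lie on opposite strands."""
    long_enough = (aligned_read_length(cigar1) >= min_len
                   and aligned_read_length(cigar2) >= min_len)
    return long_enough and is_reverse(flag1) != is_reverse(flag2)
```

Pairs passing this check would then feed into the merging and read-count conditions d and e.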
The experimental results show that the two integration sites chr17:79555395 and chr17:38033997 (hg19) were detected in all positive samples at the different ratios (0.05%-5%) (Table 9), and the expected insertion ratio was positively correlated with the measured ratio, R² ≥ 0.99 (FIG. 2). No integration event was detected in the human genomic DNA, so the detection sensitivity of the method of the invention is 0.05%.
TABLE 9 integration site detection results for different ratio samples
Example 2
Accuracy test of cell lines
The integration sites detected by the method of the invention were verified by Sanger sequencing to assess the accuracy of the method. The verification results are shown in FIG. 3. The results obtained for the test cell line with the method of the invention were consistent with the Sanger results.
Claims (10)
1. A method for detecting a lentivirus integration site, comprising the steps of:
S100, extracting host genomic DNA and fragmenting it into fragments of 200-400 bp;
S200, performing end repair and A-tailing on the fragmented host genomic DNA, and ligating asymmetric adapters;
S300, using the product obtained in step S200 as a template, amplifying with first-round primers in the presence of a blocker, thereby enriching the integrated DNA a first time;
S400, using the PCR product of step S300 as a template, amplifying with second-round primers in the presence of a blocker, thereby enriching the integrated DNA a second time, while introducing the sequencing primer binding sequence of the sequencing platform;
S500, sequencing the PCR product of step S400 on a sequencer;
S600, filtering the sequencing data;
S700, determining the lentivirus integration sites from the filtered data.
2. The method for detecting lentiviral integration sites according to claim 1, wherein the adapter primers are: Adaptor F, having the nucleotide sequence shown in SEQ ID NO: 1; and Adaptor, having the nucleotide sequence shown in SEQ ID NO: 2.
3. The method for detecting lentiviral integration sites according to claim 1, wherein the nucleotide sequence of the blocker is as shown in SEQ ID NO: 3.
4. The method according to claim 1, wherein the first-round primers in step S300 and the second-round primers in step S400 are nested primers.
5. The method for detecting lentiviral integration sites according to claim 1, wherein the first-round primers are: LTRF1, having the nucleotide sequence shown in SEQ ID NO: 4; and ADR1, having the nucleotide sequence shown in SEQ ID NO: 5.
6. The method of claim 4, wherein the second-round primers are: LTRF2, whose core sequence has the nucleotide sequence shown in SEQ ID NO: 6; and ADR2, whose core sequence has the nucleotide sequence shown in SEQ ID NO: 7.
7. The method for detecting lentiviral integration sites according to claim 1, wherein the blocker is designed according to the following principles:
its length is about 15-35 nt;
its length is 3-10 bp longer than that of the first-round primer and the second-round primer.
8. The method according to claim 1, wherein the filtering in step S600 includes filtering out low-quality reads and filtering for characteristic sequences of the lentivirus.
9. The method according to claim 8, wherein in step S600 the data are filtered as follows:
a. trimming the adapter sequence from the sequencing reads;
b. trimming bases with a quality score lower than 20 from both ends of the reads;
c. deleting N bases at the ends of the reads;
d. removing paired-end reads in which either read is shorter than 20 bp after trimming;
e. retaining reads that contain the LTR sequence "ATCTCTAGCA" with ≤ 3 bp mismatches, according to the positioning rule;
f. retaining read2 sequences that contain the asymmetric adapter sequence "GCTCTTCCGATC" with ≤ 3 bp mismatches.
10. The method according to claim 1, wherein in step S700 an integration event is defined as one that meets the following conditions:
a. the length of read1 aligned to the human reference genome is ≥ 55 bp;
b. the length of read2 aligned to the human reference genome is ≥ 55 bp;
c. the two reads map to opposite strands;
d. integration events within a 300 bp range are merged into a single integration event;
e. the number of supporting reads is ≥ 10.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310063306.0A CN116814752A (en) | 2023-01-17 | 2023-01-17 | Method for detecting slow virus integration site and application thereof |
| PCT/CN2023/096254 WO2024152493A1 (en) | 2023-01-17 | 2023-05-25 | Method for detecting lentivirus integration site and use thereof |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310063306.0A CN116814752A (en) | 2023-01-17 | 2023-01-17 | Method for detecting slow virus integration site and application thereof |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116814752A (en) | 2023-09-29 |
Family
ID=88139918
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310063306.0A Pending CN116814752A (en) | 2023-01-17 | 2023-01-17 | Method for detecting slow virus integration site and application thereof |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN116814752A (en) |
| WO (1) | WO2024152493A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118703607A (en) * | 2024-07-10 | 2024-09-27 | 上海唯可生物科技有限公司 | A high-throughput single-cell exogenous vector integration site detection method based on microfluidics technology |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12188099B2 (en) * | 2018-12-12 | 2025-01-07 | University Of Pittsburgh-Of The Commonwealth System Of Higher Education | Integrated proviral sequencing assay |
| CN109554447A (en) * | 2018-12-19 | 2019-04-02 | 武汉波睿达生物科技有限公司 | Integration site analysis method and primer of the slow virus carrier in CAR-T cell |
| GB201905244D0 (en) * | 2019-04-12 | 2019-05-29 | Ospedale San Raffaele | Method for analysisng insertion sites |
| CN113046835A (en) * | 2019-12-27 | 2021-06-29 | 深圳华大生命科学研究院 | Sequencing library construction method for detecting lentivirus insertion site and lentivirus insertion site detection method |
-
2023
- 2023-01-17 CN CN202310063306.0A patent/CN116814752A/en active Pending
- 2023-05-25 WO PCT/CN2023/096254 patent/WO2024152493A1/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024152493A1 (en) | 2024-07-25 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12209281B2 (en) | Safe sequencing system | |
| CN110997937B (en) | Universal short adaptors with variable length non-random unique molecular identifiers | |
| Serrao et al. | Amplification, next-generation sequencing, and genomic DNA mapping of retroviral integration sites | |
| CN106048009B (en) | Label joint for ultralow frequency gene mutation detection and application thereof | |
| CN111808854B (en) | Equilibrium linker with molecular barcode and method for rapid construction of transcriptome library | |
| US20230002821A1 (en) | High-throughput detection method for rare mutation of gene | |
| CN105331606A (en) | Nucleic acid molecule quantification method applied to high-throughput sequencing | |
| CN113046835A (en) | Sequencing library construction method for detecting lentivirus insertion site and lentivirus insertion site detection method | |
| CN110219054B (en) | A nucleic acid sequencing library and its construction method | |
| WO2017204572A1 (en) | Method for preparing library for highly parallel sequencing by using molecular barcoding, and use thereof | |
| CN116814752A (en) | Method for detecting slow virus integration site and application thereof | |
| CN104834833B (en) | The detection method and device of SNP | |
| WO2008062385A2 (en) | Antiretroviral drug resistance testing | |
| CN114790579A (en) | Method for constructing new coronavirus sequencing library, method for determining new coronavirus nucleic acid sequence, sequencing library and kit | |
| CN118186064A (en) | A DNA damage-differentiating sequencing method based on different base transition patterns | |
| US11959131B2 (en) | Method for measuring mutation rate | |
| US20240209349A1 (en) | Umi and application thereof, molecular identifier group, adapter, adapter ligation reagent, kits, method for constructing dna library and method for sequencing gene | |
| US20200208140A1 (en) | Methods of making and using tandem, twin barcode molecules | |
| CN109609694B (en) | Kit and method for detecting hepatitis B typing and multidrug resistance sites based on Illumina sequencing technology | |
| CN117625788B (en) | Construction method of multiplex PCR (polymerase chain reaction) combined molecular tag sequencing library | |
| CN117568450B (en) | Improved construction method and application of amplicon library carrying specificity molecular tag | |
| CN119876496B (en) | Detection method, kit and application of HIV-1 drug resistance gene based on NGS | |
| WO2017179946A1 (en) | Error confirmation method and device for massive parallel sequencing | |
| HK40069955A (en) | Method for constructing novel coronavirus sequencing library, method for determining novel coronavirus nucleic acid sequence, sequencing library and kit | |
| CN119177318A (en) | Primer group, kit for amplifying HIV-1 genome, and method and application thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||