CN116814752A

CN116814752A - Method for detecting lentivirus integration sites and application thereof

Info

Publication number: CN116814752A
Application number: CN202310063306.0A
Authority: CN
Inventors: 黄启宽; 彭祥翔; 陈苗苗; 刘海迪; 李晓那
Original assignee: Shanghai Jinghan Biotechnology Co ltd; Ningbo Xining Testing Technology Co ltd
Current assignee: Shanghai Jinghan Biotechnology Co ltd; Ningbo Xining Testing Technology Co ltd
Priority date: 2023-01-17
Filing date: 2023-01-17
Publication date: 2023-09-29
Also published as: WO2024152493A1

Abstract

The invention discloses a method for detecting lentivirus integration sites and an application thereof. The detection method comprises the following steps: S100, extracting DNA and fragmenting it; S200, performing end repair and A-tailing on the fragmented DNA, amplifying with adapter primers, and ligating an asymmetric adapter; S300, using the PCR product of step S200 as the template and amplifying with first-round primers in the presence of a blocker; S400, using the PCR product of step S300 as the template and amplifying with second-round primers in the presence of a blocker, while introducing the sequencing primer binding sequence of the sequencing platform; S500, sequencing the PCR product of step S400; S600, filtering the sequencing data; S700, identifying lentivirus integration events from the filtered data. The invention is based on linker-mediated PCR and enriches insertion sites through two-step PCR; compared with other library construction methods, it is simple to operate, has a short experimental cycle, and is low in cost. In addition, introducing a blocker that targets the proviral sequence blocks amplification of uninformative 5' LTR reads, which improves detection sensitivity.

Description

Method for detecting lentivirus integration sites and application thereof

Technical Field

The invention relates to the field of gene therapy and cell therapy drug development, and in particular to a method for detecting lentivirus integration sites and an application thereof.

Background

Lentiviruses are a genus of retroviruses, and existing lentiviral vectors are derived from several species, including human immunodeficiency virus (HIV) and simian immunodeficiency virus (SIV). A lentiviral vector is constructed from the lentivirus genome by removing genes that are not essential for expression and replacing them with a therapeutic gene. Lentiviral vectors integrate into the host cell chromosome and are stably expressed over a long period, which distinguishes them from non-integrating vectors such as adenoviruses; unlike many other retroviruses, they can also infect non-dividing as well as dividing cells. Because of these advantages, lentiviral vectors have been widely used as foreign gene transfer vectors in clinical gene therapy.

The clear advantages of using a lentivirus as a vector are its high gene transfer efficiency and an integration rate higher than that of other gene transfer methods. However, the integration site is one of the factors that determine the expression level of the foreign gene, and a lentiviral vector cannot insert the target gene at a specific site; it integrates randomly. One risk of lentiviral vector-mediated gene transduction is therefore integration into a proto-oncogene or a tumor suppressor gene of the host cell, causing insertional mutagenesis that activates the oncogene or inactivates the tumor suppressor gene and thereby induces cancer. These potential side effects have attracted great attention, so the safety of lentiviral vector-mediated gene transfer needs to be evaluated.

Methods for evaluating random lentiviral insertion have evolved along with gene and cell therapy applications. Viral insertion sites were first identified by Southern hybridization and proviral PCR, which lacked sensitivity and specificity. The development of linear amplification-mediated PCR (LAM-PCR) was a significant advance in insertion site detection, and existing viral insertion site (VIS) analysis is based on this approach. LAM-PCR can detect rare insertion sites in complex samples such as peripheral blood, but the method has to take into account the uneven frequency of restriction enzyme recognition sites, which introduces bias and a risk of technical error. With the development of sequencing technology, LAM-PCR has been combined with high-throughput sequencing and has become an effective method for identifying insertion sites: the known region is labeled with biotin, the insertion sites are enriched, and integration safety and site preference are analyzed. However, LAM-PCR relies on linear amplification, which takes a long time, and the downstream biotin capture procedure is complicated, so the experimental cycle is long and operating errors are likely. Biotin-labeled primers also increase synthesis cost and working time.

Disclosure of Invention

The invention aims to establish a high-sensitivity lentivirus insertion site detection method, which has the advantages of simple operation, short experimental period and low cost compared with other library building methods.

The invention adopts the following technical scheme:

a method for detecting a lentivirus integration site, comprising the steps of:

S100, extracting host genomic DNA and fragmenting it into fragments of 200-400 bp;

S200, performing end repair and A-tailing on the fragmented host genomic DNA, and ligating an asymmetric adapter;

S300, using the PCR product of step S200 as the template and amplifying with first-round primers in the presence of a blocker, enriching integrated DNA for the first time;

S400, using the PCR product of step S300 as the template and amplifying with second-round primers in the presence of a blocker, enriching integrated DNA for the second time, while introducing the sequencing primer binding sequence of the sequencing platform;

S500, sequencing the PCR product of step S400;

S600, filtering the sequencing data;

S700, identifying lentivirus integration events from the filtered data.

The invention is based on linker-mediated PCR (LM-PCR) and enriches insertion sites through two-step PCR; compared with other library construction methods, it is simple to operate, has a short experimental cycle, and is low in cost. In addition, introducing a specific probe (blocker) that targets the provirus blocks amplification of uninformative 5' LTR reads, which improves detection sensitivity.

Preferably, the adapter primers are: an upstream adapter whose nucleotide sequence is shown in SEQ ID NO: 1, and a downstream adapter whose nucleotide sequence is shown in SEQ ID NO: 2.

Preferably, the nucleotide sequence of the blocker is shown in SEQ ID NO: 3.

Preferably, the first-round primers in step S300 and the second-round primers in step S400 are nested primers.

Preferably, the first-round primers are: LTRF1, whose nucleotide sequence is shown in SEQ ID NO: 4, and ADR1, whose nucleotide sequence is shown in SEQ ID NO: 5.

Preferably, the second-round primers are: an upstream primer whose core sequence is "5'-GTAACTAGAGATCCCTCAGACCCTTTTA-3'", as shown in SEQ ID NO: 6, and a downstream primer whose core sequence is "5'-TACCGGACCGAAGGAGCTAA-3'", as shown in SEQ ID NO: 7.

Preferably, in the second-round primers, the upstream primer carries a P5 primer sequence, a P5-end index sequence and a sequencing primer sequence at the 5' end of its core sequence, and the downstream primer carries a P7 primer sequence, a P7-end index sequence and a sequencing primer sequence at the 5' end of its core sequence.

Further preferably, for the second-round primers, the upstream primer is any one of LTRFneststepi501, LTRFneststepi502, LTRFneststepi503, LTRFneststepi504, LTRFneststepi505, LTRFneststepi506, LTRFneststepi507 and LTRFneststepi508, whose nucleotide sequences are shown in SEQ ID NO: 8-15; the downstream primer is any one of ADR2-2step(p51), ADR2-2step(p52), ADR2-2step(p53), ADR2-2step(p54), ADR2-2step(p55), ADR2-2step(p56), ADR2-2step(p57), ADR2-2step(p58), ADR2-2step(p59), ADR2-2step(p60), ADR2-2step(p61) and ADR2-2step(p62), whose nucleotide sequences are shown in SEQ ID NO: 16-27. The upstream and downstream primers each contain an 8-base index sequence; different combinations of upstream and downstream primers (up to 96) label samples with different indexes, so up to 96 samples can be pooled into one library and sequenced together.
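The figure of 96 follows from pairing the 8 upstream index primers with the 12 downstream index primers. A minimal sketch of that count is given below; the primer names follow the spelling used in Table 7, the variable names are illustrative only, and the primer sequences themselves (SEQ ID NO: 8-27) are not reproduced.

```python
from itertools import product

# Upstream primers ...501 to ...508 and downstream primers (p51) to (p62),
# spelled as in Table 7. Only the names are used here, not the sequences.
upstream = [f"LTRFneststepi{n}" for n in range(501, 509)]
downstream = [f"ADR2-2step(p{n})" for n in range(51, 63)]

# Each upstream/downstream pairing gives one dual-index label for one sample.
index_pairs = list(product(upstream, downstream))
print(len(upstream), len(downstream), len(index_pairs))
```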

Preferably, the design principles of the blocker are as follows:

the length is about 15-35 nt;

it is 3-10 bp longer than the first-round and second-round primers.

Preferably, step S600 includes filtering out low-quality reads and filtering on lentiviral signature sequences.

Preferably, in step S600, the data are filtered as follows:

a. trimming the adapter sequence from the sequencing reads;

b. trimming bases with quality below 20 from both ends of the reads;

c. deleting N bases at the tail end of the reads;

d. removing paired-end reads in which either read is shorter than 20 bp after trimming;

e. retaining read1 sequences that contain the LTR sequence ATCTCTAGCA, with ≤ 3 bp mismatches, according to the positioning rule;

f. retaining read2 sequences that contain the asymmetric adapter sequence "GCTCTTCCGATC", with ≤ 3 bp mismatches.
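The filtering rules above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the tool used by the inventors: qualities are assumed to be Phred+33 encoded, adapter trimming (step a) is assumed to have been done beforehand, the motif is allowed to occur anywhere in the read because the positioning rule is not specified, and all function names are hypothetical.

```python
LTR_MOTIF = "ATCTCTAGCA"        # LTR sequence named in step e
ADAPTER_MOTIF = "GCTCTTCCGATC"  # asymmetric adapter sequence screened in read2
MIN_QUAL = 20                   # step b threshold
MIN_LEN = 20                    # step d threshold
MAX_MISMATCH = 3                # mismatch tolerance for both motifs


def trim_read(seq, qual, min_qual=MIN_QUAL, offset=33):
    """Steps b and c: trim low-quality bases from both ends, then trailing N."""
    scores = [ord(c) - offset for c in qual]  # assumes Phred+33 encoding
    start, end = 0, len(seq)
    while start < end and scores[start] < min_qual:
        start += 1
    while end > start and scores[end - 1] < min_qual:
        end -= 1
    seq, qual = seq[start:end], qual[start:end]
    kept = len(seq.rstrip("N"))
    return seq[:kept], qual[:kept]


def contains_motif(seq, motif, max_mismatch=MAX_MISMATCH):
    """True if any window of seq matches motif with at most max_mismatch mismatches."""
    k = len(motif)
    for i in range(len(seq) - k + 1):
        mismatches = sum(1 for a, b in zip(seq[i:i + k], motif) if a != b)
        if mismatches <= max_mismatch:
            return True
    return False


def keep_pair(read1, read2):
    """Apply the rules to one pair of (seq, qual) tuples; return the trimmed pair or None."""
    s1, q1 = trim_read(*read1)
    s2, q2 = trim_read(*read2)
    if len(s1) < MIN_LEN or len(s2) < MIN_LEN:   # step d: either mate too short
        return None
    if not contains_motif(s1, LTR_MOTIF):        # read1 must carry the LTR motif
        return None
    if not contains_motif(s2, ADAPTER_MOTIF):    # read2 must carry the adapter motif
        return None
    return (s1, q1), (s2, q2)
```

A pair is kept only if both mates survive trimming and each carries its expected motif, so reads that do not span a vector-genome junction are discarded before alignment.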

Preferably, a result satisfying the following conditions is defined as an integration event:

a. the length of read1 aligned to the human reference genome is ≥ 55 bp;

b. the length of read2 aligned to the human reference genome is ≥ 55 bp;

c. the two reads align to opposite strands;

d. integration events within a 300 bp range are merged into a single integration event;

e. the number of reads is ≥ 10.
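The criteria above can be sketched as a small post-alignment filter. The sketch assumes the alignments have already been parsed into simple records; the `Pair` record, its field names, the chaining of neighboring sites, the choice of the most frequent position as the reported site, and applying the read-count threshold after merging are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record for one aligned read pair; field names are illustrative.
Pair = namedtuple("Pair", "chrom pos r1_len r2_len r1_strand r2_strand")

MIN_ALIGNED = 55    # criteria a and b
MERGE_WINDOW = 300  # criterion d
MIN_READS = 10      # criterion e


def call_integration_events(pairs):
    """Return a list of (chrom, position, supporting_reads) for merged events."""
    # Criteria a-c: both mates align over >= 55 bp, on opposite strands.
    passing = [p for p in pairs
               if p.r1_len >= MIN_ALIGNED and p.r2_len >= MIN_ALIGNED
               and p.r1_strand != p.r2_strand]
    passing.sort(key=lambda p: (p.chrom, p.pos))

    # Criterion d: chain sites on the same chromosome that lie within 300 bp
    # of the previous site into one cluster.
    clusters = []
    for p in passing:
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == p.chrom and p.pos - last["end"] <= MERGE_WINDOW:
            last["end"] = p.pos
            last["positions"].append(p.pos)
        else:
            clusters.append({"chrom": p.chrom, "end": p.pos, "positions": [p.pos]})

    # Criterion e: keep clusters with >= 10 reads and report the most frequent
    # position (smallest on ties) as the site; this choice is an assumption.
    events = []
    for c in clusters:
        if len(c["positions"]) >= MIN_READS:
            site = max(sorted(set(c["positions"])), key=c["positions"].count)
            events.append((c["chrom"], site, len(c["positions"])))
    return events
```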

In another aspect, the invention also provides an application of the above method for detecting lentivirus integration sites: the detection method is used to evaluate the safety of lentiviral vector-mediated gene transfer.

According to the above technical scheme, the detection method of the invention introduces a specific probe (blocker) that targets the proviral sequence (provirus), which blocks amplification of uninformative 5' LTR reads and suppresses useless integration site information, improving sensitivity more than 2-fold at the same sequencing data volume. By introducing unique molecular tags, the method not only detects the genomic position of an integration site but can also accurately quantify the copy number of the integrated gene; in addition, the unique molecular tag uses a design based on three random bases, so sequencing errors can be corrected as well.

Drawings

FIG. 1 is a schematic diagram of a method for detecting lentivirus integration sites.

In the figure, host DNA: genomic DNA; 5' LTR and 3' LTR: lentiviral signature sequences, recognized by the integrase; provirus: the foreign sequence carried by the virus; blocker: an oligonucleotide that inhibits amplification of the foreign sequence.

FIG. 2 shows the sensitivity experiment: correlation between the theoretical integration ratio of the cell line and the measured ratio. R² = 0.9986.

FIG. 3 shows the results of verifying the detection method of the invention by Sanger sequencing.

Detailed Description

The present invention will be further described with reference to specific examples, which are not intended to be limiting, so that those skilled in the art will better understand the present invention and practice it.

Example 1

Sensitivity experiments on cell lines

A cell line (110621P 3 MX-6) with known integration site information and human genomic DNA were used as sensitivity test samples. Specifically, the cell line DNA was spiked into human genomic DNA at 5%, 1%, 0.5%, 0.1% and 0.05%; the dilution scheme is given in Table 1 below.

TABLE 1 Dilution scheme of the target cell line DNA in human genomic DNA

Gradient	Solution 1	Solution 2
50%	50 μL cell line DNA	50 μL human genome
5%	10 μL 50%	90 μL human genome
1%	20 μL 5%	80 μL human genome
0.50%	50 μL 1%	50 μL human genome
0.10%	20 μL 0.5%	80 μL human genome
0.05%	50 μL 0.1%	50 μL human genome
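The gradient in Table 1 follows from serial dilution: each row mixes a volume of the previous solution with a volume of human genomic DNA. The sketch below recomputes the resulting fractions, assuming both DNA stocks are at the same concentration (an assumption not stated in the table); the variable names are illustrative.

```python
from fractions import Fraction

# (volume of previous solution in uL, volume of human genomic DNA in uL),
# one tuple per row of Table 1. The first row starts from undiluted cell line DNA.
steps = [(50, 50), (10, 90), (20, 80), (50, 50), (20, 80), (50, 50)]

fraction = Fraction(1)  # undiluted cell line DNA
gradient = []
for stock_ul, diluent_ul in steps:
    # Assumes both stocks have the same DNA concentration.
    fraction = fraction * stock_ul / (stock_ul + diluent_ul)
    gradient.append(fraction)

for f in gradient:
    print(f"{float(f) * 100:g}%")
```

Under that assumption the computed fractions are 50%, 5%, 1%, 0.5%, 0.1% and 0.05%, matching the first column of the table.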

Library construction, sequencing and bioinformatics analysis were performed as follows:

1) Extraction and fragmentation of genomic DNA from lentivirus-transduced cells

The diluted samples (5%, 1%, 0.50%, 0.10%, 0.05%) and human genomic DNA were fragmented, with the fragment length set to 200-400 bp. Each sample was run in duplicate, so a total of 12 samples were used for library construction.

2) End repair and A-tailing of the fragmented genomic DNA, and ligation of asymmetric adapters at both ends

End repair, A-tailing and adapter ligation of the fragmented DNA were performed with the Universal DNA Library Prep Kit for Illumina V3. For end repair, End Prep Mix 4 was thawed and mixed by inversion, and the reaction system shown in Table 2 below was prepared in a sterile PCR tube.

TABLE 2 End repair reaction system

Component	Volume/μL
Input DNA	50
End Prep Mix 4	15

Reaction program: 20 °C for 15 min, then 65 °C for 15 min.

For adapter ligation, the reaction system shown in Table 3 below was prepared in the PCR tube from the End Preparation step.

TABLE 3 Adapter ligation reaction system

Component	Volume/μL
End Preparation product	65
Rapid Ligation buffer	25
Rapid DNA ligase	5
VIS-adapter mixture	5

The sequences of the adapter pair adaptor F and adaptor R are as follows:

adaptor F sequence (5'-3'): GATCGGAAGAGCHHHHHHHHHHHHHHHTTAGCTCCTTCGGTCCTCC

adaptor R sequence (5'-3'): GCTCTTCCGATCT
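The relationship between the two adapter strands can be checked with a short sketch. It shows that the bases of adaptor F before the run of H are the reverse complement of the start of adaptor R (the same 12-base sequence that is screened for in read2), and counts how many distinct sequences the H run can encode, given that the IUPAC code H stands for A, C or T. Reading the H run as the unique molecular tag is an inference from the description, and the helper names are illustrative.

```python
ADAPTOR_F = "GATCGGAAGAGCHHHHHHHHHHHHHHHTTAGCTCCTTCGGTCCTCC"
ADAPTOR_R = "GCTCTTCCGATCT"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


# Bases of adaptor F that precede the degenerate run.
duplex_region = ADAPTOR_F[:ADAPTOR_F.index("H")]

# IUPAC code H means A, C or T, so each H position has 3 possibilities.
n_degenerate = ADAPTOR_F.count("H")
n_tags = 3 ** n_degenerate

print(duplex_region, reverse_complement(duplex_region))
print(ADAPTOR_R.startswith(reverse_complement(duplex_region)))
print(n_degenerate, n_tags)
```

The single base of adaptor R left over after the complementary region is a T, which would be consistent with pairing to the A added in the A-tailing step, although the source does not state this explicitly.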

Reaction program: 20 °C for 15 min, then hold at 4 °C.

The adapter ligation product was purified with magnetic beads before proceeding to the next step.

3) Primers were designed against the asymmetric adapter and the fixed lentiviral sequence for amplification, enriching integrated DNA for the first time

VAHTS HiFi Amplification Mix was thawed and mixed by inversion, and the reaction system shown in Table 4 below was prepared in a sterile PCR tube.

TABLE 4 First-round PCR amplification system

Component	Volume/μL
Purified Adapter Ligation product	23
VAHTS HiFi Amplification Mix	25
LTRF1 (10 μM)	1
ADR1 (10 μM)	1

The sequences of the first-round amplification primers LTRF1 and ADR1 are as follows:

LTRF1 sequence (5'-3'): TGTGACTCTGGTAACTAGAGATCCCTC

ADR1 sequence (5'-3'): ACCGCTTGGCCTCCGACTT

The reaction procedure is as in table 5:

TABLE 5 first round PCR amplification reaction procedure

After library amplification, the PCR products were purified with magnetic beads.

4) Semi-nested primers were designed against the asymmetric adapter and the internal lentiviral sequence to enrich integrated DNA a second time. At the same time, the sequencing primer binding sequence of the sequencing platform was introduced as a 5' tag on the primers.

VAHTS HiFi Amplification Mix was thawed and mixed by inversion, and the reaction system shown in Table 6 below was prepared in a sterile PCR tube.

TABLE 6 Second-round PCR amplification system

Component	Volume/μL
Purified Adapter Ligation product	4
VAHTS HiFi Amplification Mix	25
LTRFnest (10 μM)	1
ADR2-2 (10 μM)	1
Water	19

The correspondence of the samples of example 2 to the second round PCR amplification primers is shown in Table 7.

TABLE 7 Second-round PCR amplification primers

Sample	LTRFnest primer	ADR2-2 primer
5%	LTRFneststepi506	ADR2-2step(p59)
5%	LTRFneststepi506	ADR2-2step(p60)
1%	LTRFneststepi506	ADR2-2step(p61)
1%	LTRFneststepi506	ADR2-2step(p62)
0.50%	LTRFneststepi504	ADR2-2step(p51)
0.50%	LTRFneststepi504	ADR2-2step(p52)
0.10%	LTRFneststepi507	ADR2-2step(p51)
0.10%	LTRFneststepi507	ADR2-2step(p52)
0.05%	LTRFneststepi507	ADR2-2step(p53)
0.05%	LTRFneststepi507	ADR2-2step(p54)
10 ng/μL HGD	LTRFneststepi504	ADR2-2step(p54)
10 ng/μL HGD	LTRFneststepi505	ADR2-2step(p55)
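For pooled sequencing, each library needs a distinct pair of index primers so that reads can be assigned back to their sample. The following sketch transcribes the rows of Table 7 and checks that no primer pair is used twice; the variable names are illustrative.

```python
# (sample, upstream primer, downstream primer) rows transcribed from Table 7.
table7 = [
    ("5%", "LTRFneststepi506", "ADR2-2step(p59)"),
    ("5%", "LTRFneststepi506", "ADR2-2step(p60)"),
    ("1%", "LTRFneststepi506", "ADR2-2step(p61)"),
    ("1%", "LTRFneststepi506", "ADR2-2step(p62)"),
    ("0.50%", "LTRFneststepi504", "ADR2-2step(p51)"),
    ("0.50%", "LTRFneststepi504", "ADR2-2step(p52)"),
    ("0.10%", "LTRFneststepi507", "ADR2-2step(p51)"),
    ("0.10%", "LTRFneststepi507", "ADR2-2step(p52)"),
    ("0.05%", "LTRFneststepi507", "ADR2-2step(p53)"),
    ("0.05%", "LTRFneststepi507", "ADR2-2step(p54)"),
    ("HGD", "LTRFneststepi504", "ADR2-2step(p54)"),
    ("HGD", "LTRFneststepi505", "ADR2-2step(p55)"),
]

pairs = [(up, down) for _, up, down in table7]
# Any pair listed more than once would make two libraries indistinguishable.
duplicates = {p for p in pairs if pairs.count(p) > 1}
print(len(pairs), len(set(pairs)), duplicates)
```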

The reaction procedure is shown in Table 8:

TABLE 8 second round PCR amplification reaction procedure

After library amplification, the PCR products were purified with magnetic beads.

5) Sequencing

6) Filtering of low-quality reads and filtering on lentiviral signature sequences

The Illumina sequencing adapter sequences were removed from the sequencing data with bioinformatics tools, and low-quality reads were filtered out. Rules were set to retain read1 sequences containing the LTR and read2 sequences containing the asymmetric adapter.

The data were filtered as follows:

a. trimming the adapter sequence from the sequencing reads;

b. trimming bases with quality below 20 from both ends of the reads;

c. deleting N bases at the tail end of the reads;

7) Discrimination of integration events

The sequencing data were aligned to the human reference genome with Bowtie2, rules for integration events were set, and the chromosomal positions of the integration sites were counted. Integration site information was annotated and summarized using databases such as Ensembl and UCSC. A result satisfying the following conditions is defined as an integration event:

b. the length of read2 aligned to the human reference genome is ≥ 55 bp;

c. the two reads align to opposite strands;

d. integration events within a 300 bp range are merged into a single integration event;

e. the number of reads is ≥ 10.

The experimental results show that the two integration sites chr17:79555395 and chr17:38033997 (hg19) were detected in the positive samples at all ratios (0.05%-5%) (Table 9), and that the expected insertion ratio was positively correlated with the measured ratio, with R² ≥ 0.99 (FIG. 2). No integration event was detected in the human genomic DNA, so the detection sensitivity of the method of the invention is 0.05%.

TABLE 9 integration site detection results for different ratio samples

Example 2

Accuracy test on cell lines

The integration sites detected by the invention were verified by Sanger sequencing to assess the accuracy of the method. The verification results are shown in FIG. 3. The integration sites detected in the cell line by the method of the invention were consistent with the Sanger sequencing results.

Claims

1. A method for detecting a lentivirus integration site, comprising the steps of:

S200, performing end repair and A-tailing on the fragmented host genomic DNA, and ligating an asymmetric adapter;

S300, using the PCR product of step S200 as the template and amplifying with first-round primers in the presence of a blocker, enriching integrated DNA for the first time;

S400, using the PCR product of step S300 as the template and amplifying with second-round primers in the presence of a blocker, enriching integrated DNA for the second time, while introducing the sequencing primer binding sequence of the sequencing platform;

S500, sequencing the PCR product of step S400;

S600, filtering the sequencing data;

S700, identifying the lentivirus integration sites from the filtered data.

2. The method for detecting lentivirus integration sites according to claim 1, wherein the adapter primers are: adaptor F, whose nucleotide sequence is shown in SEQ ID NO: 1; and adaptor R, whose nucleotide sequence is shown in SEQ ID NO: 2.

3. The method for detecting lentivirus integration sites according to claim 1, wherein the nucleotide sequence of the blocker is shown in SEQ ID NO: 3.

4. The method according to claim 1, wherein the first-round primers in step S300 and the second-round primers in step S400 are nested primers.

5. The method for detecting lentivirus integration sites according to claim 1, wherein the first-round primers are: LTRF1, whose nucleotide sequence is shown in SEQ ID NO: 4; and ADR1, whose nucleotide sequence is shown in SEQ ID NO: 5.

6. The method of claim 4, wherein the second-round primers are: LTRF2, the nucleotide sequence of whose core sequence is shown in SEQ ID NO: 6; and ADR2, the nucleotide sequence of whose core sequence is shown in SEQ ID NO: 7.

7. The method for detecting lentivirus integration sites according to claim 1, wherein the design principles of the blocker are as follows:

the length is about 15-35 nt;

it is 3-10 bp longer than the first-round and second-round primers.

8. The method according to claim 1, wherein the filtering in step S600 includes filtering out low-quality reads and filtering on lentiviral signature sequences.

9. The method according to claim 8, wherein in step S600 the data are filtered as follows:

a. trimming the adapter sequence from the sequencing reads;

b. trimming bases with quality below 20 from both ends of the reads;

c. deleting N bases at the tail end of the reads;

10. The method according to claim 1, wherein in step S700 a result satisfying the following conditions is defined as an integration event:

c. the two reads align to opposite strands;

e. the number of reads is ≥ 10.