CN116694603A - Novel Cas protein, crispr-Cas system and use thereof in the field of gene editing - Google Patents

Info

Publication number
CN116694603A
Authority
CN
China
Prior art keywords
protein
cas
sequence
seq
crispr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310742030.9A
Other languages
Chinese (zh)
Inventor
江媛
王丹
章登位
戴雪辰
汪晓珏
纪泽阳
王�琦
赵静
李卓坤
顾颖
欧阳文杰
沈玥
陈奥
章文蔚
肖亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Publication of CN116694603A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Zoology (AREA)
  • Biomedical Technology (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Mycology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention relates to the field of gene editing, and in particular to a novel Cas protein, a Crispr-Cas system, and their use in gene editing. The novel Cas protein is selected from at least one of the following: a protein of any one of SEQ ID NO: 1 to SEQ ID NO: 4; and a protein whose sequence similarity to any one of SEQ ID NO: 1 to SEQ ID NO: 4 is 85% or more, preferably 90% or more. The novel Cas protein can be used in a Crispr-Cas system for gene editing. It can edit more target sites, is easier to deliver into cells for editing, and is less prone to off-target effects.

Description

Novel Cas proteins, Crispr-Cas systems and their uses in the field of gene editing

Technical Field

The present invention relates to the field of gene editing, and in particular to a novel Cas protein, a Crispr-Cas system, and their uses in the field of gene editing.

Background Art

CRISPR (clustered regularly interspaced short palindromic repeats) is a natural immune mechanism found in many bacteria and most archaea, and it has been adapted as a gene-editing tool. Analysis of the sequences flanking CRISPR clusters revealed a polymorphic gene family located nearby that functions together with the CRISPR region; these genes were therefore named CRISPR-associated genes, abbreviated Cas. Most CRISPR-Cas systems contain the Cas1 protein, which is one of the more conserved proteins in the Cas family. Based on the structure of the effector module, the CRISPR-Cas systems discovered so far fall into two main classes: Class 1 systems use multiple Cas proteins that act together as a multi-subunit effector and mainly include Type I, Type III and Type IV; Class 2 systems use a single large effector protein and include Type II, Type V and Type VI. At present, the Class 2 systems widely used in gene editing include the Cas9 system (Type II) and the Cpf1 system (Type V).

However, the Crispr-Cas system still has a number of shortcomings: off-target editing can occur and its range of application is limited, so further improvement is needed.

Summary of the Invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art. To this end, one object of the present invention is to provide a novel Cas protein, a Crispr-Cas system, and their uses in the field of gene editing.

The CRISPR/Cas system is a commonly used gene-editing system that has been successfully applied to the precise editing of animal and plant genomes. In this system, an RNA guides recognition of a specific site in double-stranded DNA, and a nuclease cuts at that site; the most widely used nucleases are Cas9 and Cpf1. The cut creates a DNA double-strand break, which the cell repairs through the NHEJ (non-homologous end joining) or HR (homologous recombination) pathway, resulting in site-specific modification of the target gene. A Cas9 nuclease in wide commercial use is SpCas9, which recognizes the PAM sequence NGG located at the 3' end of the target sequence and cuts 3 bp from the PAM to produce blunt ends. LbCpf1 is a Cpf1 nuclease in wide commercial use; it recognizes the PAM sequence TTTN located at the 5' end of the target sequence and cuts distal to the PAM to produce sticky ends.
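For illustration only, the SpCas9 targeting rule described above (a 20-nt target followed by an NGG PAM at its 3' end, with a blunt cut 3 bp from the PAM) can be sketched in Python. The function name `find_spcas9_sites` and the example sequence are hypothetical and are not taken from the application; only the forward strand is scanned, and this is a minimal sketch rather than a guide-design tool.

```python
import re

def find_spcas9_sites(dna, protospacer_len=20):
    """Return (protospacer, pam, cut_index) for forward-strand SpCas9 sites.

    cut_index is the 0-based index of the first base after the blunt cut,
    which lies 3 bp upstream of the PAM.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for match in pattern.finditer(dna):
        pam_start = match.start() + protospacer_len
        sites.append((match.group(1), match.group(2), pam_start - 3))
    return sites

# Hypothetical example: one 20-nt target followed by the PAM "TGG".
example = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for protospacer, pam, cut in find_spcas9_sites(example):
    print(protospacer, pam, cut)
```

A real design workflow would also scan the reverse strand and score candidate guides; both are omitted here.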

During our research we found that both SpCas9 and LbCpf1 have relatively strict PAM requirements, which restricts the design of target sites. In addition, SpCas9 and LbCpf1 consist of 1368 and 1228 amino acids, respectively, and are too large to be packaged and delivered by AAV, which limits their application in animal cells to some extent. Furthermore, the targeting sequence of SpCas9 is 20 bp long, and similar sequences readily occur elsewhere in the genome, leading to off-target effects.

It is therefore important to find new usable Cas proteins that are shorter, so that they can be packaged and delivered more conveniently and applied more broadly in animal cells, and that make the Crispr-Cas system less prone to off-target effects.

To this end, through our research we identified several novel Cas proteins of shorter length. When used in a Crispr-Cas system, they can be delivered into cells for editing more easily and are less prone to off-target effects. Take as an example the BES1 protein obtained from the human gut bacterium Veillonella sp. AF13-2 (hereinafter AF13-2). When BES1 is used in a Crispr-Cas system, the PAM sequence it recognizes is less restrictive than those of the commercial SpCas9 and LbCpf1, so this Cas protein can potentially edit more target sites. Moreover, BES1 consists of only 1064 amino acids and is therefore easier to deliver into cells for editing. The targeting sequence of SpCas9 is 20 bp, whereas that of BES1 is 23 bp, so BES1 is potentially less likely than SpCas9 to cause off-target effects.
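The effect of targeting-sequence length can be illustrated with a back-of-envelope calculation. The sketch below assumes a genome of uniformly random bases and counts only exact matches, which real genomes and real off-target events do not follow; the genome size `GENOME_BP` is an assumed round figure for the human haploid genome, and the function name is hypothetical.

```python
def expected_random_hits(target_len, genome_bp):
    """Expected exact matches of one target in a random genome, both strands."""
    positions = 2 * (genome_bp - target_len + 1)
    return positions / 4 ** target_len

GENOME_BP = 3_100_000_000  # assumption: approximate human haploid genome size

for length in (20, 23):
    print(length, 4 ** length, expected_random_hits(length, GENOME_BP))

# Three extra bases multiply the sequence space by 4**3 = 64.
print(4 ** 23 // 4 ** 20)
```

Under these assumptions a 23-bp target has 64 times more possible sequences than a 20-bp target, so chance exact matches are correspondingly rarer. Mismatch-tolerant binding, which drives most real off-target activity, is not modeled.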

Specifically, the present invention provides the following technical solutions:

According to a first aspect of the present invention, a Cas protein is provided, selected from at least one of the following: a protein of any one of SEQ ID NO: 1 to SEQ ID NO: 4; and a protein whose sequence similarity to any one of SEQ ID NO: 1 to SEQ ID NO: 4 is 85% or more, preferably 90% or more. The novel Cas proteins of SEQ ID NO: 1 to SEQ ID NO: 4 were obtained by bioinformatic screening and verified by molecular biology techniques. Any one of these Cas proteins is easily delivered into cells for gene editing. In addition, the PAM sequences they recognize are suitably specific, so more target sites can be edited, and their targeting sequences are of suitable length, so off-target effects are less likely. Proteins with a sequence similarity of 85% or more (for example 86%, 87%, 88% or 89% or more), preferably 90% or more (for example 91%, 92%, 93% or 94% or more), to any one of SEQ ID NO: 1 to SEQ ID NO: 4 have the same or similar activity and function as the Cas proteins of SEQ ID NO: 1 to SEQ ID NO: 4. They are likewise easily delivered into cells for gene editing, can edit more target sites, have targeting sequences of more suitable length, and are less likely to cause off-target effects.
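The text does not state how sequence similarity is calculated. As one possible reading, the sketch below computes percent identity over two sequences that have already been aligned (gaps written as "-") and compares it with the 85% threshold. The use of identity as a stand-in for similarity, the function names, and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over a pre-computed pairwise alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)

def meets_threshold(aligned_a, aligned_b, threshold=85.0):
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold

# Hypothetical 10-residue toy alignment with one substitution.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQA"))
print(meets_threshold("MKTAYIAKQR", "MKTAYIAKQA"))
```

In practice the alignment itself would come from a dedicated alignment tool, and a similarity score based on a substitution matrix could give a different value from identity.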

According to embodiments of the present invention, the Cas protein described above may further include the following technical features:

In some embodiments of the present invention, the sequence similarity to any one of SEQ ID NO: 1 to SEQ ID NO: 4 is 95% or more, preferably 96% or more, more preferably 97% or more, still more preferably 98% or more, and most preferably 99% or more. Proteins with a sequence similarity of 95% or more, preferably 96%, 97%, 98%, 99% or 99.5% or more, to any one of SEQ ID NO: 1 to SEQ ID NO: 4 have the same or similar activity as the above Cas proteins, are easily delivered into cells for gene editing, can edit more target sites, have targeting sequences of more suitable length, and are not prone to off-target effects.

In some embodiments of the present invention, the Cas protein is a Cas protein that has nuclease activity and differs from any one of SEQ ID NO: 1 to SEQ ID NO: 4 by the substitution, deletion or addition of one or several amino acids. Such proteins have nuclease activity identical or similar to that of the proteins of SEQ ID NO: 1 to SEQ ID NO: 4, are easily delivered into cells for gene editing, have many target sites, and are not prone to off-target effects.

In some embodiments of the present invention, the Cas protein is a Cas protein that has nuclease activity and differs from any one of SEQ ID NO: 1 to SEQ ID NO: 4 by the substitution, deletion or addition of at most 8 amino acids. Such proteins have nuclease activity identical or similar to that of the proteins of SEQ ID NO: 1 to SEQ ID NO: 4, are easily delivered into cells for gene editing, have many target sites, and are not prone to off-target effects.

In some embodiments of the present invention, the Cas protein is a Cas protein that has nuclease activity and differs from any one of SEQ ID NO: 1 to SEQ ID NO: 4 by the substitution, deletion or addition of at most 6 amino acids. Such proteins have nuclease activity identical or similar to that of the proteins of SEQ ID NO: 1 to SEQ ID NO: 4, are easily delivered into cells for gene editing, have many target sites, and are not prone to off-target effects.

In some embodiments of the present invention, the Cas protein is a Cas protein that has nuclease activity and differs from any one of SEQ ID NO: 1 to SEQ ID NO: 4 by the substitution, deletion or addition of at most 5 amino acids. Such proteins have nuclease activity identical or similar to that of the proteins of SEQ ID NO: 1 to SEQ ID NO: 4, are easily delivered into cells for gene editing, have many target sites, and are not prone to off-target effects.

In some embodiments of the present invention, the Cas protein is a Cas protein that has nuclease activity and differs from any one of SEQ ID NO: 1 to SEQ ID NO: 4 by the substitution, deletion or addition of at most 4 amino acids. Such proteins have nuclease activity identical or similar to that of the proteins of SEQ ID NO: 1 to SEQ ID NO: 4, are easily delivered into cells for gene editing, have many target sites, and are not prone to off-target effects.

In some embodiments of the present invention, the Cas protein is a Cas protein that has nuclease activity and differs from any one of SEQ ID NO: 1 to SEQ ID NO: 4 by the substitution, deletion or addition of at most 3 amino acids. Such proteins have nuclease activity identical or similar to that of the proteins of SEQ ID NO: 1 to SEQ ID NO: 4, are easily delivered into cells for gene editing, have many target sites, and are not prone to off-target effects.

In some embodiments of the present invention, the Cas protein is a Cas protein that has nuclease activity and differs from any one of SEQ ID NO: 1 to SEQ ID NO: 4 by the substitution, deletion or addition of at most 2 amino acids. Such proteins have nuclease activity identical or similar to that of the proteins of SEQ ID NO: 1 to SEQ ID NO: 4, are easily delivered into cells for gene editing, have many target sites, and are not prone to off-target effects.

In some embodiments of the present invention, the Cas protein is a Cas protein that has nuclease activity and differs from any one of SEQ ID NO: 1 to SEQ ID NO: 4 by the substitution, deletion or addition of 1 amino acid. Such proteins have nuclease activity identical or similar to that of the proteins of SEQ ID NO: 1 to SEQ ID NO: 4, are easily delivered into cells for gene editing, have many target sites, and are not prone to off-target effects.

In some embodiments of the present invention, the Cas protein is as shown in SEQ ID NO: 1. This Cas protein consists of 1064 amino acids, a relatively small number, which makes it easier to deliver into cells for editing. The PAM sequence it recognizes is NNNV (where V represents the base A, G or C), so more target sites can be edited, and its targeting sequence is 23 bp long, which makes off-target effects less likely. This Cas protein cleaves double-stranded DNA in vitro; no editing activity in human cells was detected.

In some embodiments of the present invention, the Cas protein is as shown in SEQ ID NO: 2. This Cas protein consists of 1368 amino acids, a relatively small number, which makes it easier to deliver into cells for editing, and the PAM sequence it recognizes is NNMTA. This Cas protein cleaves double-stranded DNA in vitro; no editing activity in human cells was detected.

In some embodiments of the present invention, the Cas protein is as shown in SEQ ID NO: 3. This Cas protein consists of 1245 amino acids, a relatively small number, which makes it easier to deliver into cells for editing, and the PAM sequence it recognizes is TTTN. This Cas protein cleaves double-stranded DNA in vitro and has editing activity in human cells.

In some embodiments of the present invention, the Cas protein is as shown in SEQ ID NO: 4. This Cas protein consists of 1306 amino acids, a relatively small number, which makes it easier to deliver into cells for editing. The PAM sequence it recognizes is YYN, which greatly relieves the limitation that LbCpf1 recognizes only TTTN. This Cas protein cleaves double-stranded DNA in vitro and has editing activity in human cells.
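How restrictive each PAM is can be compared by counting the fraction of all sequences of the PAM's length that it matches. The sketch below uses the standard IUPAC nucleotide codes and the PAMs stated above; the function name `pam_fraction` is hypothetical, and the calculation assumes each base is equally likely at every position.

```python
from fractions import Fraction

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def pam_fraction(pam):
    """Fraction of all sequences of len(pam) that satisfy the PAM pattern."""
    fraction = Fraction(1)
    for code in pam.upper():
        fraction *= Fraction(len(IUPAC[code]), 4)
    return fraction

pams = [
    ("SpCas9", "NGG"),
    ("LbCpf1", "TTTN"),
    ("SEQ ID NO: 1", "NNNV"),
    ("SEQ ID NO: 2", "NNMTA"),
    ("SEQ ID NO: 3", "TTTN"),
    ("SEQ ID NO: 4", "YYN"),
]
for name, pam in pams:
    print(name, pam, pam_fraction(pam))
```

On this simple measure NNNV matches 3/4 of candidate positions and YYN matches 1/4, compared with 1/16 for NGG and 1/64 for TTTN.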

According to a second aspect of the present invention, a nucleic acid sequence is provided, selected from at least one of the following: a nucleic acid sequence encoding the Cas protein of any embodiment of the first aspect of the present invention; and a nucleic acid sequence that is the reverse complement of a nucleic acid sequence encoding the Cas protein of any embodiment of the first aspect of the present invention.
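As a brief illustration of the reverse-complement relationship mentioned above, the following Python sketch reverses a sequence and complements each base. The function name and the example sequences are hypothetical; only the unambiguous bases A, C, G and T (or U for RNA) are handled.

```python
def reverse_complement(seq, rna=False):
    """Return the reverse complement of a DNA (or, with rna=True, RNA) sequence."""
    if rna:
        table = str.maketrans("ACGU", "UGCA")
    else:
        table = str.maketrans("ACGT", "TGCA")
    # Complement every base, then reverse the order.
    return seq.upper().translate(table)[::-1]

print(reverse_complement("ATGGCC"))
print(reverse_complement("AUGGCC", rna=True))
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation.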

In some embodiments of the present invention, the nucleic acid sequence is DNA or RNA.

According to a third aspect of the present invention, an expression vector is provided, comprising the nucleic acid sequence of the second aspect of the present invention. The above nucleic acid sequence is cloned into a vector to obtain an expression vector, which can express the corresponding Cas protein in a target cell so that the corresponding gene editing can be carried out in that cell. Commonly used vectors include plasmids, lentiviruses and the like, for example the pET 28a vector or the pMD19 vector.

According to a fourth aspect of the present invention, a recombinant cell is provided, containing the expression vector of the third aspect of the present invention. The expression vector is introduced into a cell to form a recombinant cell, and expression of the corresponding Cas protein from the vector enables gene editing of the recombinant cell. The recombinant cells may be eukaryotic cells, such as plant cells or animal cells. In particular, compared with the commonly used SpCas9 and LbCpf1 proteins, the Cas proteins provided herein have fewer amino acids and are easier to deliver into cells for editing. For use in animal cells, they are more conveniently packaged and delivered by viral vectors, which broadens their application in animal cells.

According to a fifth aspect of the present invention, a Crispr-Cas system is provided, comprising the Cas protein of the first aspect of the present invention. The Cas protein provided by the present invention can be used in a Crispr-Cas system for gene editing; it expands the editable range and is less prone to off-target effects, which improves editing accuracy. The system can be used in many fields, such as basic biological science, medicine and agriculture.

According to embodiments of the present invention, the Crispr-Cas system described above may further include the following technical features:

In some embodiments of the present invention, the Crispr-Cas system further includes at least one of the following: crRNA, tracrRNA, or a chimeric RNA formed from crRNA and tracrRNA. These RNAs help the Crispr-Cas system perform its gene-editing function. In addition, the Crispr-Cas system may further include a Crispr_repeat sequence as needed; the Crispr_repeat sequence corresponding to each Cas protein is shown in Appendix I and Appendix II.

In some embodiments of the present invention, the crRNA and tracrRNA are as shown in Appendix I and Appendix II, which list the crRNA and tracrRNA sequences used by the Cas proteins for gene editing. These sequences help the Cas protein locate the target sequence precisely and achieve accurate gene editing.

According to a sixth aspect of the present invention, use of a Cas protein, nucleic acid sequence, expression vector, recombinant cell or Crispr-Cas system in the field of gene editing is provided, wherein the Cas protein is the Cas protein of the first aspect of the present invention, the nucleic acid sequence is the nucleic acid sequence of the second aspect, the expression vector is the expression vector of the third aspect, the recombinant cell is the recombinant cell of the fourth aspect, and the Crispr-Cas system is the Crispr-Cas system of the fifth aspect.

Brief Description of the Drawings

FIG. 1 is a PAM preference diagram of BES1 according to an embodiment of the present invention.

FIG. 2 shows the purification results for BES1 according to an embodiment of the present invention.

FIG. 3 shows the base sequences and structures of the crRNA+tracrRNA-L, sgRNA-1, sgRNA-2 and sgRNA-3 of BES1 according to an embodiment of the present invention.

FIG. 4 is a PAM preference diagram of BES1, detected by chip, using crRNA+tracrRNA-L, sgRNA-1 and sgRNA-3, respectively, according to an embodiment of the present invention.

FIG. 5 shows the spacer sequence according to an embodiment of the present invention.

FIG. 6 shows the sequence of the constructed PAM library according to an embodiment of the present invention.

FIG. 7 is a schematic diagram of the cleavage substrate sequence according to an embodiment of the present invention.

FIG. 8 shows the bands of in vitro cleavage products of BES1 with crRNA+tracrRNA-L, sgRNA-1, sgRNA-2 and sgRNA-3 at 20°C, 25°C and 37°C according to an embodiment of the present invention.

FIG. 9 is a schematic flow chart of the process for obtaining the novel Cas proteins according to an embodiment of the present invention.

FIG. 10 is a PAM preference diagram of the BES2, BES4 and BES6 systems, detected by chip, according to an embodiment of the present invention.

FIG. 11 shows in vitro cleavage experiments with the BES2, BES4 and BES6 systems according to an embodiment of the present invention.

FIG. 12 is an electrophoresis image from the detection of human-cell editing activity of the BES6 system according to an embodiment of the present invention.

FIG. 13 is an electrophoresis image from the detection of human-cell editing activity of the BES4 system according to an embodiment of the present invention.

Detailed Description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present invention; they should not be construed as limiting it. To help those skilled in the art better understand the present invention, certain terms and expressions used herein are explained below. These explanations are provided only to facilitate understanding and should not be regarded as limiting the scope of protection of the present invention.

As used herein, the terms "Crispr", "crispr" and "CRISPR" all refer to clustered regularly interspaced short palindromic repeats, of which they are the acronym. Whether written in uppercase, in lowercase, or with only the first letter capitalized, each form is a common expression in the art. Accordingly, "Crispr-Cas system" may also appear with different capitalization. In addition, when denoting bases, unless otherwise specified, the letters N and V have their usual meanings in the art: N represents any base A, T, C or G, and V represents any base A, C or G.
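The meanings of N and V given above translate directly into a pattern that can be checked against a candidate sequence. The sketch below covers only the codes defined in this paragraph plus the four plain bases; the function name `pam_to_regex` is hypothetical.

```python
import re

# Character classes for the base codes defined in the text above.
BASE_CLASSES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",  # any base
    "V": "[ACG]",   # any base except T
}

def pam_to_regex(pam):
    """Compile a PAM written with A/C/G/T/N/V into a regular expression."""
    return re.compile("".join(BASE_CLASSES[code] for code in pam.upper()))

pattern = pam_to_regex("NNNV")
print(bool(pattern.fullmatch("ACGA")))
print(bool(pattern.fullmatch("ACGT")))
```

The first check succeeds because the final base A falls within V, while the second fails because T is excluded from V.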

The Cas9 enzyme cleaves target DNA at a target site, which is usually determined as follows. An RNA molecule called Crispr RNA (crRNA) base-pairs, through part of its sequence, with an RNA molecule called tracrRNA to form a chimeric RNA (tracrRNA/crRNA). Another part of the crRNA sequence then base-pairs with the target DNA site, so that the chimeric RNA guides the Cas protein to bind to and cleave this target site. This chimeric RNA is also called a guide RNA. Unlike the Crispr-Cas9 system, the Cpf1 enzyme can process the crRNA precursor on its own and then use the resulting crRNA to specifically target and cleave DNA, without requiring a host-cell ribonuclease or a tracrRNA.

The targeting specificity of Crispr is determined by two factors. One is the base pairing between the RNA chimera and the target DNA. The other depends on the Cas protein and a short DNA sequence located at the 3' end of the target DNA, called the PAM (protospacer adjacent motif).

If the PAM requirement is strict (for example, a few specific bases), the Cas protein can edit fewer target sites, which limits the application of the Crispr-Cas system. Both SpCas9 and LbCpf1 have relatively strict PAM sequences, which constrains target site design. For example, the PAM recognized by the SpCas9 nuclease is NGG, located at the 3' end of the target sequence; cleavage occurs 3 bp from the PAM and produces blunt ends. Because its PAM is restricted to NGG, the application of this editing system is limited.
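As an illustration of how the NGG requirement restricts target selection, the following minimal Python sketch lists candidate SpCas9 sites on one strand of a sequence. The function name, the 20-nt protospacer length and the restriction to the given strand are assumptions made for illustration; the cut position is placed 3 bp upstream of the PAM, as stated above.

```python
import re

def find_ngg_sites(seq: str, protospacer_len: int = 20):
    """List (protospacer_start, pam_start, cut_index) for each NGG PAM on the given strand.

    Indices are 0-based. cut_index is the index of the first base after the
    blunt cut, placed 3 bp upstream of the PAM.
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g. within "GGGG") are all reported.
    for m in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = m.start()
        start = pam_start - protospacer_len
        if start >= 0:  # skip PAMs without room for a full protospacer
            sites.append((start, pam_start, pam_start - 3))
    return sites
```

A complete search would also scan the reverse complement; this is omitted here to keep the sketch short.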

Using bioinformatics and molecular experimental techniques, we identified in the human gut microbiota a variety of novel Cas9 systems and Cpf1 systems with gene editing potential, as shown in Appendix I and Appendix II. In the Cpf1 system, the Cpf1 enzyme, also known as the Cas12a protein, performs gene editing in a manner different from that of the Cas9 protein. The Cpf1 enzyme is smaller than the SpCas9 protein and is easier to deliver into cells and tissues. Moreover, the Crispr-Cpf1 system requires only a single crRNA and allows multiple sites to be edited simultaneously. The Cas proteins provided in this application include both Cas9 proteins and Cpf1 proteins. That is, the present invention provides a Cas protein that is at least one of SEQ ID NO:1 to SEQ ID NO:4. These Cas proteins have nuclease activity and can be used to cleave target nucleic acids, so they can be applied in Crispr-Cas systems to achieve effective gene editing, with more editable target sites and a wider range of applications.

The novel Cas9 and Cpf1 systems provided recognize PAMs with lower specificity, which broadens the application of gene editing systems. Take as an example the BES1 protein that we obtained from the human gut bacterium Veillonella sp. AF13-2 (hereinafter AF13-2): compared with the existing commercial SpCas9 and LbCpf1, it has lower PAM specificity and is a smaller protein. Because BES1 has fewer amino acids, it is easier to deliver into cells to perform gene editing. The PAM sequence preference of BES1 is shown in Figure 1, in which the abscissa represents the 7 positions immediately adjacent to the 3' end of the target sequence and the ordinate represents the proportion of each base among all positive (cleaved) sequences. In Figure 1, at the first position adjacent to the 3' end of the target sequence, bases A, C, T and G all occur with high probability, so this position can be written as N; the remaining positions are read in the same way. As can be seen from Figure 1, the probability of cleavage is extremely low (less than 0.05) only when the fourth position is T. The PAM sequence of BES1 is therefore NNNV (where V denotes base A, G or C).
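The way a PAM such as NNNV is read off from per-position base proportions can be sketched as follows. This is a minimal illustration and not the analysis software actually used; the function name, the restricted code table and the example proportions are hypothetical, and only the 0.05 cutoff is taken from the text.

```python
# Map each set of allowed bases to a degenerate code (only the codes needed here;
# other combinations are written out in brackets).
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "ACG": "V", "ACGT": "N"}

def consensus(freqs, threshold=0.05):
    """freqs: list of {base: proportion} dicts, one per PAM position."""
    out = []
    for column in freqs:
        allowed = "".join(sorted(b for b, p in column.items() if p >= threshold))
        out.append(CODES.get(allowed, "[" + allowed + "]"))
    return "".join(out)

# Hypothetical proportions for the first four positions (not measured data).
example = [
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.30, "C": 0.20, "G": 0.25, "T": 0.25},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.40, "C": 0.30, "G": 0.28, "T": 0.02},  # T falls below the 0.05 cutoff
]
print(consensus(example))
```

With these illustrative values, the fourth position excludes T and is therefore written as V.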

The table of novel Cas9 systems lists, for each Cas protein, the name of the strain in which it is found, the Genome ID (NCBI database), the TaxID (NCBI database), species, genus, phylum, crRNA, tracrRNA, crispr repeat sequence, effector protein length, effector amino acid sequence, and so on. See Appendix I for details.

The table of novel Cpf1 systems lists, for each Cas protein, the name of the strain in which it is found, the Genome ID (NCBI database), the TaxID (NCBI database), species, genus, phylum, crRNA, tracrRNA, crispr repeat sequence, effector protein length, effector amino acid sequence, and so on. See Appendix II for details.

The solutions of the present invention are explained below with reference to the examples. Those skilled in the art will understand that the following examples are intended only to illustrate the present invention and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated in the examples, the techniques or conditions described in the literature in the art, or those given in the product instructions, are followed. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.

Example 1

Based on a microbial genome database, the microorganisms of the human gut microbiota were analyzed, Cas protein sequences and Crispr sequences were predicted, and all protein sequences within 20 kb upstream and downstream of each Crispr were identified. These were then aligned against the NCBI protein database to obtain proteins homologous to known Type II or Type V proteins. The homologous proteins were analyzed to determine the conserved sites of their key domains and the completeness of each protein, yielding the Cas protein sequences and nearby Crispr sequences listed in Appendix I and Appendix II. The analysis workflow is shown in Figure 9. These novel Crispr-Cas systems belong to the Type II and Type V Crispr-Cas systems and have gene editing capabilities different from those of the existing SpCas9 protein. They enrich the existing Crispr-Cas systems and can be used as needed in different cells, such as animal cells and plant cells, to perform gene editing.
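The step of collecting proteins within 20 kb upstream and downstream of a Crispr array can be sketched as a simple coordinate filter. The data structure, the function name and the choice to count a protein that merely overlaps the window are assumptions made for illustration; the original workflow (Figure 9) may differ in these details.

```python
from typing import List, NamedTuple

class Feature(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

def proteins_near_array(array: Feature, proteins: List[Feature],
                        flank: int = 20_000) -> List[str]:
    """Names of proteins overlapping [array.start - flank, array.end + flank]."""
    lo, hi = array.start - flank, array.end + flank
    return [p.name for p in proteins if p.end >= lo and p.start <= hi]

# Hypothetical coordinates on a single contig.
crispr = Feature("crispr_1", 50_000, 51_000)
genes = [
    Feature("far_upstream", 10_000, 12_000),
    Feature("edge_upstream", 29_000, 31_000),
    Feature("cas_candidate", 52_000, 56_000),
    Feature("far_downstream", 71_001, 73_000),
]
print(proteins_near_array(crispr, genes))
```

The proteins retained by this filter would then be aligned against a protein database to look for Type II or Type V homologs, as described above.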

Take BES1, obtained from the human gut bacterium Veillonella sp. AF13-2 (AF13-2 for short), as an example: compared with the existing commercial SpCas9 and LbCpf1, it has lower PAM specificity and is a smaller protein. The PAM sequence preference of BES1 is shown in Figure 1. The probability of cleavage is extremely low (less than 0.05) only when the fourth position is T, so the PAM sequence of BES1 is NNNV (where V denotes base A, G or C).
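The practical effect of a less specific PAM can be estimated with a short calculation. The sketch below assumes, purely for illustration, that the four bases are independent and equally likely at every position; real genomes deviate from this, so the numbers are only a rough guide. The function name is hypothetical.

```python
from fractions import Fraction

# Number of bases permitted by each code used in this document.
ALLOWED = {"A": 1, "C": 1, "G": 1, "T": 1, "N": 4, "V": 3}

def pam_probability(pattern: str) -> Fraction:
    """Chance that a random site matches the PAM under the uniform-base assumption."""
    p = Fraction(1)
    for code in pattern:
        p *= Fraction(ALLOWED[code], 4)
    return p

print(pam_probability("NGG"), pam_probability("NNNV"))
```

Under this assumption an NGG PAM is satisfied at 1 in 16 positions, whereas an NNNV PAM is satisfied at 3 in 4 positions, a twelve-fold difference.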

Appendix I is the table of novel Cas9 systems. It lists the name of the strain in which each Crispr-Cas system or Cas protein is found, the Genome ID (NCBI database), the TaxID (NCBI database), and the species, genus and phylum, together with the crRNA, tracrRNA, crispr repeat sequence, effector protein length and effector amino acid sequence. For the Cas proteins listed in Appendix I, the corresponding crRNA, tracrRNA and/or crispr repeat sequence is given, and those skilled in the art can use the listed sequences directly.

Appendix II is the table of novel Cpf1 systems. It lists the name of the strain in which each Crispr-Cas system or Cas protein is found, the Genome ID (NCBI database), the TaxID (NCBI database), and the species, genus and phylum, together with the crRNA, tracrRNA, crispr repeat sequence, effector protein length and effector amino acid sequence. For the Cas proteins listed in Appendix II, the corresponding crRNA, tracrRNA and/or crispr repeat sequence is not given; a crRNA, tracrRNA and/or crispr repeat sequence that enables these Cas proteins to perform their editing function can be identified from the information on the corresponding Cas protein.

Example 2: Expression and purification of the BES1 protein

1. Construction of the BES1 expression vector

The expression vector was constructed by the In-Fusion method. The pET28a vector was digested at the NdeI and EcoRI sites, and the gene sequence encoding BES1 was inserted into the cloning region of pET28a. The six His residues at the N-terminus of the recombinant BES1 amino acid sequence served as the purification tag, and kanamycin was used as the selection marker. The constructed vector was named pET28a-BES1.

2. Culture and induction of the BES1 strain

LB liquid medium: tryptone 10 g/L, yeast extract 5 g/L, NaCl 10 g/L.

The recombinant expression vector pET28a-BES1 was transformed into the Escherichia coli expression strain E. coli BL21(DE3). The transformed cells were spread evenly on an LB agar plate containing 50 μg/mL kanamycin and cultured overnight at 37°C. A single colony was picked and cultured overnight in 5 mL LB medium (containing 50 μg/mL kanamycin) at 37°C and 200 rpm. The resulting culture was inoculated at 1:100 into 50 mL LB medium (containing 50 μg/mL kanamycin) and cultured at 37°C and 200 rpm for 4 h. The expanded culture was then inoculated at 1:100 into 2 L LB liquid medium (containing 50 μg/mL kanamycin) and cultured at 37°C and 200 rpm. When the OD600 reached about 0.6-0.8, IPTG was added to a final concentration of 0.4 mM, and the culture was continued overnight (about 16-18 h) at 16°C and 200 rpm. After induction, the cells were harvested by centrifugation at 10000 g and stored frozen at -20°C until use.
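The volumes implied by the 1:100 inoculations and by the final concentrations of IPTG and kanamycin can be worked out with the relation C1 × V1 = C2 × V2. The sketch below is illustrative only; the function names are hypothetical, and the stock concentrations (1 M IPTG, 50 mg/mL kanamycin) are assumptions that are not stated in the original text.

```python
def stock_volume_ml(final_conc: float, stock_conc: float, culture_ml: float) -> float:
    """Volume of stock (mL) needed to reach final_conc in culture_ml (C1*V1 = C2*V2).

    final_conc and stock_conc must share the same unit; the small volume
    contributed by the stock itself is ignored.
    """
    return final_conc * culture_ml / stock_conc

def inoculum_ml(culture_ml: float, ratio: int = 100) -> float:
    """Inoculum volume for a 1:ratio inoculation."""
    return culture_ml / ratio

# Culture volumes are from the protocol; stock concentrations are assumptions.
print(inoculum_ml(50))                     # overnight culture into 50 mL
print(inoculum_ml(2000))                   # expanded culture into 2 L
print(stock_volume_ml(0.4, 1000, 2000))    # IPTG, assumed 1 M (1000 mM) stock
print(stock_volume_ml(50, 50_000, 2000))   # kanamycin, assumed 50 mg/mL (50000 ug/mL) stock
```

Under these assumptions, the 2 L culture would receive 20 mL of expanded culture, 0.8 mL of IPTG stock and 2 mL of kanamycin stock.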

3. Extraction and purification of the BES1 protein

Preparation of purification buffers:

(1) Ni column affinity chromatography

Buffer A (equilibration buffer): 50 mM Tris-HCl + 500 mM NaCl + 20 mM imidazole, pH 7.5.

Buffer B (elution buffer): 50 mM Tris-HCl + 500 mM NaCl + 500 mM imidazole, pH 7.5.

(2) Ion exchange chromatography

Buffer C (equilibration buffer): 50 mM Tris-HCl + 100 mM NaCl, pH 7.0.

Buffer D (elution buffer): 50 mM Tris-HCl + 1 M NaCl, pH 7.0.

(3) Protein sample diluent

Buffer E (diluent): 50 mM Tris-HCl, pH 7.0.

(4) 2× protein storage buffer

Buffer F (2× storage buffer): 50 mM Tris-HCl + 300 mM NaCl, pH 7.0.
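The amounts of solid reagents needed for the buffers listed above follow directly from the stated concentrations. The following minimal sketch computes them for Buffer A; the 1 L batch size, the function name and the assumption that Tris-HCl is prepared from Tris base and adjusted to pH with HCl are illustrative and are not taken from the original text. The molar masses are standard values.

```python
# Molar masses in g/mol (standard values).
MOLAR_MASS = {"NaCl": 58.44, "imidazole": 68.08, "Tris base": 121.14}

def grams_needed(component: str, millimolar: float, litres: float) -> float:
    """Mass of a solid component for the given concentration and volume."""
    return millimolar / 1000 * litres * MOLAR_MASS[component]

# Buffer A as specified in the text, for an assumed 1 L batch.
buffer_a = {"Tris base": 50, "NaCl": 500, "imidazole": 20}
for name, mm in buffer_a.items():
    print(f"{name}: {grams_needed(name, mm, 1.0):.2f} g")
```

The same function applies to Buffers B to F by changing the concentrations.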

The cells were resuspended at a ratio of 15 mL Buffer A per 1 g of cells, PMSF was added to a final concentration of 1 mM, and the cells were disrupted by sonication until the suspension became clear. The lysate was centrifuged at 12000 rpm and 4°C for 30 min, and the supernatant was collected, passed through a 0.22 μm filter, and stored at 4°C.

The Ni affinity column was washed with 5 CV of water and 5 CV of Buffer B, equilibrated with 10 CV of Buffer A, and then loaded with the sample. After loading, the column was re-equilibrated with 15 CV, contaminating proteins were washed off with 15% Buffer B, and the protein was eluted with a linear gradient (15-100% Buffer B over 10 CV). Protein was collected when the UV reading exceeded 100 mAU.
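Because Buffer A and Buffer B differ only in imidazole (20 mM and 500 mM), the imidazole concentration at any point of the wash and gradient can be calculated by linear mixing. The following sketch is illustrative; the function names are hypothetical, and ideal mixing without system dead volume is assumed.

```python
def imidazole_mM(percent_b: float, a_mM: float = 20.0, b_mM: float = 500.0) -> float:
    """Imidazole concentration when Buffer A and Buffer B are blended at percent_b % B."""
    frac = percent_b / 100
    return a_mM * (1 - frac) + b_mM * frac

def gradient_percent_b(cv: float, start: float = 15.0, end: float = 100.0,
                       length_cv: float = 10.0) -> float:
    """%B at a given column volume into the linear gradient (clamped to its range)."""
    cv = min(max(cv, 0.0), length_cv)
    return start + (end - start) * cv / length_cv

print(imidazole_mM(15))                       # 15% B wash step
print(imidazole_mM(gradient_percent_b(5)))    # halfway through the gradient
```

Under these assumptions the 15% Buffer B wash corresponds to 92 mM imidazole.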

The protein collected from the Ni column was diluted 5-fold with Buffer E. The Q anion exchange column was washed with 5 CV of water and equilibrated with 5 CV of Buffer C, the protein sample was loaded, and collection of the flow-through began when the UV reading started to rise. The SP cation exchange column was equilibrated with 5 CV of Buffer C and loaded with the protein sample from the previous step. After loading, the column was re-equilibrated with 15 CV of Buffer C and eluted with a linear gradient of the elution buffer, Buffer D (0-100% Buffer D over 10 CV), and the protein was collected. The collected protein was dialyzed overnight against the 2× storage buffer. The final protein concentration was 1 mg/mL and the glycerol concentration was 50%. As shown in Figure 2, SDS-PAGE showed that the fusion protein was purified well and that its purity was acceptable.

Examples 3 and 4 below take the Cas9 protein BES1 (SEQ ID NO:1), found in the human gut bacterium Veillonella sp. AF13-2, as an example and investigate the PAM sequence recognized by this protein and its ability to cleave a target substrate in vitro.

Example 3: Determination of the BES1 PAM sequence

1. Preparation of guide RNA

First, based on the predicted crRNA and tracrRNA sequences of BES1 in strain AF13-2 (see Appendix I), we designed double-stranded DNA transcription templates for the crRNA and tracrRNA-L (see Table 1 below). On this basis, we also shortened the sequence of the region where the crRNA and tracrRNA-L pair and joined the two with a GAAA linker sequence so that they form a single strand, sgRNA-1; the transcription template sequence of sgRNA-1 is given in Table 1 below. In addition, to retain the activity of the original RNAs as far as possible, sgRNA-3 was designed; its transcription template sequence is also given in Table 1. All deoxynucleotide sequences in Table 1 were synthesized at the Shenzhen National Gene Bank Synthesis and Editing Platform. The sequences shown in Table 1 are all DNA template sequences used for transcribing the respective RNAs. The sequences and secondary structures of crRNA+tracrRNA-L, sgRNA-1 and sgRNA-3 are shown in Figure 3.
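The idea of fusing a crRNA and a tracrRNA through a GAAA linker can be sketched as simple string assembly. The sketch below is illustrative only: the function names are hypothetical, the 5' to 3' order (spacer, crRNA repeat portion, linker, tracrRNA) follows the usual convention for Cas9 single guide RNAs and is an assumption for BES1, and the sequences in the usage line are placeholders rather than the sequences of Table 1.

```python
LINKER = "GAAA"  # linker named in the text

def build_sgrna(spacer: str, crrna_repeat: str, tracrrna: str,
                linker: str = LINKER) -> str:
    """Join spacer + crRNA repeat portion + linker + tracrRNA into one RNA (5'->3')."""
    parts = [spacer, crrna_repeat, linker, tracrrna]
    return "".join(p.upper().replace("T", "U") for p in parts)

def dna_template(rna: str) -> str:
    """Coding-strand DNA sequence corresponding to an RNA (U -> T)."""
    return rna.upper().replace("U", "T")

# Placeholder sequences, for illustration only.
sg = build_sgrna("ACGT", "GUUU", "AAAC")
print(sg, dna_template(sg))
```

A transcription template for in vitro transcription would additionally carry a promoter upstream of this sequence; that part is omitted here.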

Table 1. RNA transcription template sequences used in the BES1 on-chip cleavage experiments

The above double-stranded DNA templates were prepared by polymerase chain reaction using KAPA HiFi™ HotStart ReadyMix (Roche). After the reaction, the double-stranded DNA templates were purified with a phenol-chloroform-isoamyl alcohol mixture (Aladdin). The purity of the purified templates was measured with a NanoDrop™ 2000 spectrophotometer (Thermo Fisher Scientific), and their concentration was measured with the Qubit™ double-stranded DNA high-sensitivity quantification kit (Thermo Fisher Scientific) on a Qubit™ 3.0 fluorometer.

The double-stranded DNA templates were then used for transcription. Following the instructions of the MEGAscript™ T7 Transcription Kit, 2 pmol of double-stranded DNA template was used, and the reaction was incubated at 37°C for 12 hours in a Bio-Rad S1000™ PCR instrument. The RNA was purified with a phenol-chloroform-isoamyl alcohol mixture (Aladdin), and the purity and concentration of the purified RNA were measured with a NanoDrop™ 2000 spectrophotometer (Thermo Fisher Scientific).
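Because template concentration is measured in ng/μl while the input is specified in picomoles, a conversion is needed when setting up the reaction. The sketch below uses the common approximation of about 660 g/mol per base pair of double-stranded DNA; the function names, the 120 bp template length and the 50 ng/μl concentration are hypothetical values chosen for illustration.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair, a common approximation for dsDNA

def dsdna_ng(pmol: float, length_bp: int) -> float:
    """Mass in ng of `pmol` picomoles of double-stranded DNA of the given length."""
    # pmol * (g/mol) gives pg; divide by 1000 to obtain ng.
    return pmol * length_bp * AVG_BP_MASS / 1000

def volume_ul(pmol: float, length_bp: int, conc_ng_per_ul: float) -> float:
    """Volume of a template stock of known concentration that contains `pmol`."""
    return dsdna_ng(pmol, length_bp) / conc_ng_per_ul

# 2 pmol is from the text; template length and concentration are hypothetical.
print(dsdna_ng(2, 120), volume_ul(2, 120, 50.0))
```

For these illustrative values, 2 pmol corresponds to about 158 ng of template.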

2. Preparation of the single-stranded circular cleavage substrate

A cleavage substrate for the BES1 protein described above was prepared. The deoxynucleotide sequences used for the cleavage substrate are listed in Table 2 below; all of them were synthesized at the Shenzhen National Gene Bank Synthesis and Editing Platform.

The double-stranded substrate to be cleaved was prepared by polymerase chain reaction using HotStart ReadyMix (Roche). The two oligonucleotides PAM_AF13-2_2/1 and PAM_AF13-2_2/2 in Table 2 were denatured at 95°C and re-annealed to serve as the template, and the two oligonucleotides PAM_AF13-2_1 and PAM_AF13-2_3 were used as primers for PCR amplification to obtain the double-stranded substrate.

The PCR product was recovered with an E.Z.N.A.™ gel extraction kit. The purity of the recovered product was measured with a NanoDrop™ 2000 spectrophotometer (Thermo Fisher Scientific), and its concentration was measured with the Qubit™ double-stranded DNA high-sensitivity quantification kit (Thermo Fisher Scientific) on a Qubit™ 3.0 fluorometer.

Table 2. Deoxynucleotide sequences used to prepare the cleavage substrate

The double-stranded substrate obtained above was then subjected to single-strand circularization to obtain the single-stranded circular product, as follows:

A 60 μl reaction containing 1 pmol of the double-stranded DNA substrate prepared above, 1× TA buffer (Epicentre), 120 U T4 DNA ligase (Epicentre) and 10 mM ATP (NEB) (final concentrations) was incubated at 37°C for 1 hour in a Bio-Rad S1000™ PCR instrument.

Uncircularized PCR product was then digested with EXO III (10 U/μl; purchased from BGI) and EXO I (3 U/μl; purchased from BGI) by incubation at 37°C for 30 minutes in a Bio-Rad S1000™ PCR instrument. The product was purified with 2.5 volumes of AMPure XP (Beckman™), and its concentration was measured with the Qubit™ single-stranded DNA high-sensitivity quantification kit (Thermo Fisher Scientific) on a Qubit™ 3.0 fluorometer (Thermo Fisher Scientific).

3. SE51 sequencing

(1) DNA nanoballs (DnBs) for loading onto the sequencer were prepared from the single-stranded circles described above. 6 ng of the single-stranded circular product was brought to 20 μl with nuclease-free water (Ambion™), 20 μl of Make DnB buffer (BGI) was added, and the mixture was mixed and centrifuged. It was then incubated in a Bio-Rad S1000™ PCR instrument at 95°C for 1 minute, 65°C for 1 minute and 40°C for 1 minute, and then held at 4°C.

To the reaction product, 40 μl of make DnB enzyme mix V2.0 (BGI) and 2 μl of make DnB enzyme mix II V2.0 (BGI) were added and mixed, and the mixture was incubated at 30°C for 20 minutes in a Bio-Rad S1000™ PCR instrument. After the reaction, DnB stop buffer (BGI) was added and mixed by pipetting with a wide-bore tip (Axygen). Then 30 μl of load DnB buffer (BGI) was added and mixed by pipetting with a wide-bore tip (Axygen). The library was immobilized on a BGISEQ-500 V3.1 chip (BGI) using the BGISEQ-500 DnB loader (BGI), yielding the chip to be sequenced.

(2) The chip was sequenced in SE51 mode on a BGISEQ-500 sequencer (BGI) using the BGISEQ-500 SE100 sequencing cartridge kit (BGI), yielding the sequence information and ID number of each nucleic acid sequence.

4. BES1-PAM native strand sequencing

Because the DNA obtained from the above sequencing is single-stranded, this single-stranded DNA was used to synthesize its complementary strand (the native strand), and the resulting double-stranded DNA was used for the protein cleavage experiment. The procedure was as follows:

(1) After on-chip sequencing was complete, the new strand generated during the first round of sequencing was eluted with 100% formamide (Sigma) on the BGISEQ-500 DnB loader (BGI).

(2) After elution, native strand synthesis was carried out on the BGISEQ-500 sequencer (BGI) using dNTP mix 2 (BGI) to obtain double-stranded DNA. The synthesized length was 50 nucleotides, and the 51st base was synthesized using dNTP mix 1 (BGI); this step adds a fluorescently labeled dNTP to the end of the synthesized strand.

(3) After the above steps, the chip was imaged on the BGISEQ-500 sequencer (BGI), and the image was saved on the sequencer as raw image 1.

(4) On-chip BES1 cleavage reaction. The double-stranded DNA obtained in step (2) was subjected to cleavage reactions with the different RNAs. The reaction buffer was 1× SpCas9 reaction buffer (NEB). 30 μg of the RNA prepared in section 1 above (crRNA+tracrRNA-L, sgRNA-1 or sgRNA-3) was used, the final concentration of BES1 protein was 0.1 μM, RNase inhibitor (Epicentre) was included, and the final reaction volume was 300 μl. The mixture was pumped onto the chip using the BGISEQ-500 DnB loader (BGI) and incubated at 37°C for 5 hours.

(5) The chip was washed 3 times with 300 μl of wash buffer 2 (BGI).

(6) After the above steps, the chip was imaged on the BGISEQ-500 sequencer (BGI), and the image was saved on the sequencer as raw image 2.

(7) On the BGISEQ-500 sequencer (BGI), the saved raw images 1 and 2 were base-called manually with the basecall software (BGI), and the fluorescence signals before and after cleavage were compared. The PAM sequence of BES1 was analyzed, with SpCas9 as a control. The results are shown in Figure 4.

In Figure 4, the abscissa shows the 7 positions immediately adjacent to the 3' end of the target sequence, and the ordinate shows the proportion of each base among all positive (cleaved) sequences. That is, taking the number of cleaved sequences as the denominator, the base present at each position of each cleaved sequence is determined, and the proportion of each of the four bases at each position is calculated. As Figure 4 shows, with SpCas9 as the reference, the PAM preference of BES1 does not differ much among guide RNAs of slightly different structure.
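The calculation described for the ordinate can be sketched in Python as follows. This is a minimal illustration and not the basecall software used in the experiment. The record layout, the function name and in particular the rule for calling a sequence cleaved (a drop of the fluorescence signal below an assumed fraction of its value before cleavage) are assumptions, and the example records are invented.

```python
from collections import Counter

def pam_proportions(records, drop_ratio=0.5, pam_len=7):
    """records: iterable of (pam, signal_before, signal_after).

    A record counts as cleaved when signal_after / signal_before is below
    drop_ratio (an assumed threshold). Returns one {base: proportion} dict per
    PAM position, with the number of cleaved sequences as the denominator.
    """
    cut = [pam for pam, before, after in records
           if before > 0 and after / before < drop_ratio]
    if not cut:
        return []
    table = []
    for i in range(pam_len):
        counts = Counter(pam[i] for pam in cut)
        table.append({b: counts.get(b, 0) / len(cut) for b in "ACGT"})
    return table

# Invented records for illustration: (7-nt PAM, signal before, signal after).
demo = [
    ("AAAAAAA", 100, 10),
    ("CCCCCCC", 100, 20),
    ("AAATAAA", 100, 95),   # signal retained, so not counted as cleaved
    ("GGGGAAA", 100, 30),
]
table = pam_proportions(demo)
print(table[3])
```

In this invented data set no cleaved sequence carries T at the fourth position, so the proportion of T there is zero.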

Example 4: In vitro cleavage experiment with BES1

1. Preparation of guide RNA

Following the method of Example 3, the crRNA transcription template, the double-stranded DNA transcription template of tracrRNA-L, and the double-stranded DNA transcription templates of sgRNA-1 and sgRNA-3 were obtained. A shorter tracrRNA, tracrRNA-S, was also designed, and sgRNA-2 was designed from the complete crRNA and tracrRNA-S; its transcription template sequence is given in Table 3 below. All transcription template DNAs were synthesized at the Shenzhen National Gene Bank Synthesis and Editing Platform.

Table 3. Double-stranded DNA transcription template of sgRNA-2

The above DNA templates can be transcribed into the functional RNAs shown in Figure 4, namely crRNA+tracrRNA-L, sgRNA-1, sgRNA-2 and sgRNA-3 (the target sequence is represented by N in Figure 4).

Specifically, following the method of Example 3:

The double-stranded DNA templates were prepared by polymerase chain reaction using KAPA HiFi™ HotStart ReadyMix (Roche). After the reaction, the double-stranded DNA templates were purified with a phenol-chloroform-isoamyl alcohol mixture (Aladdin). The purity of the purified templates was measured with a NanoDrop™ 2000 spectrophotometer (Thermo Fisher Scientific), and their concentration was measured with the Qubit™ double-stranded DNA high-sensitivity quantification kit (Thermo Fisher Scientific) on a Qubit™ 3.0 fluorometer.

The double-stranded DNA templates were then used for transcription. Following the instructions of the MEGAscript™ T7 Transcription Kit, 2 pmol of double-stranded DNA template was used, and the reaction was incubated at 37°C for 12 hours in a Bio-Rad S1000™ PCR instrument. The RNA was purified with a phenol-chloroform-isoamyl alcohol mixture (Aladdin), and the purity and concentration of the purified RNA were measured with a NanoDrop™ 2000 spectrophotometer (Thermo Fisher Scientific).

2. Preparation of the cleavage substrate

Target site design: a Crispr sequence usually consists of a leader, multiple repeats and multiple spacers. The leader can usually act as the promoter of the Crispr sequence, the repeats can form hairpin structures, and the spacers usually consist of captured foreign DNA. Therefore, the original pro-spacer sequence in the genome sequence of the Veillonella sp. AF13-2 strain (NCBI genome ID: QTMT00000000) (the selected-spacer in Figure 5) was used as the target site sequence.
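The leader, repeat and spacer organization described above means that spacers can be recovered by splitting an array on its repeat. The following minimal sketch assumes that all repeats are identical and exactly matched, which real arrays do not always satisfy; the function name and the short sequences are hypothetical.

```python
def extract_spacers(array: str, repeat: str):
    """Split a Crispr array on its repeat and return the spacers between repeats."""
    pieces = array.upper().split(repeat.upper())
    # pieces[0] is the leader-side flank and pieces[-1] the trailing flank;
    # everything in between lies between two repeats, i.e. is a spacer.
    return [p for p in pieces[1:-1] if p]

# Hypothetical repeat and array, for illustration only.
REPEAT = "GTTTTAGA"
ARRAY = "ACAC" + REPEAT + "AAAACCCC" + REPEAT + "CCCCAAAA" + REPEAT + "TGTG"
print(extract_spacers(ARRAY, REPEAT))
```

One of the spacers recovered in this way would then serve as the target site, as described in the text.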

PAM sequence design: a PAM library of 7 random bases (7 N; see the spacer and PAM sequences in Figure 6) was constructed for cleavage by the BES1 protein.

Cleavage substrate design: the synthesized PAM library sequence was cloned into the pMD19 vector to form the pMD19-AF13-2-3'PAM library. From this library we amplified an 842 bp cleavage substrate sequence (see Figure 7; the cleavage substrate sequence is shown in SEQ ID NO:23). The target site is at positions 402-431 (see Figure 7) and the PAM is at positions 432-438 (see Figure 7; that is, the 7 random bases at positions 432 to 438 of SEQ ID NO:23, underlined), so both cleavage products are about 400 bp long. The substrate was designed this way so that, when the resolution of gel electrophoresis is limited, the cleavage products form a single broader band, which makes it easy to detect whether cleavage has occurred.

842bp的切割底物序列如下(N代表任意碱基)(SEQ ID NO:23):The 842 bp cleavage substrate sequence is as follows (N represents any base) (SEQ ID NO: 23):

CTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGCCAAGTTTGCACGCCTGCCGTTCGACGATTGTAGTAGCTCAAAAGGGAACTGCTACCGAANNNNNNNAATCTCTGGAAGATCCGCGCGTACCGAGTTCTAATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGG (SEQ ID NO:23).
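The coordinates given for the 842 bp substrate can be checked with a small script. This is a hedged sketch: the mock substrate below is a placeholder with the same layout (401 bases, a 30-base target, 7 N, 404 bases), not the real sequence, and the cut position used in the example (3 bp upstream of the PAM) is purely an assumption, since this section does not state where BES1 cuts.

```python
import re


def find_pam_window(seq: str) -> tuple[int, int]:
    """Return 1-based inclusive (start, end) of the first run of N in `seq`."""
    match = re.search(r"N+", seq.upper())
    if match is None:
        raise ValueError("no run of N found in sequence")
    return match.start() + 1, match.end()


def fragment_sizes(total_len: int, cut_after: int) -> tuple[int, int]:
    """Return the two product lengths for a blunt cut after 1-based position `cut_after`."""
    if not 0 < cut_after < total_len:
        raise ValueError("cut_after must lie strictly inside the substrate")
    return cut_after, total_len - cut_after


if __name__ == "__main__":
    # Placeholder substrate mirroring the layout described in the text.
    mock = "A" * 401 + "C" * 30 + "N" * 7 + "G" * 404
    start, end = find_pam_window(mock)
    print(len(mock), start, end)
    # Assumed cut site: 3 bp upstream of the PAM (hypothetical).
    print(fragment_sizes(len(mock), start - 4))
```

Under that assumed cut site the two products differ by only 14 bp, which is consistent with the statement that they appear as a single band of about 400 bp.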

3. Cleavage experiment and results

The cleavage reaction contained functional RNA (the four RNAs shown in Figure 4), cleavage substrate, and BES1, each at a final concentration of 100 nM. Reactions were incubated at 20°C, 25°C, or 37°C for 1 hour, and the cleavage products were analyzed on a 2% agarose gel. The cleavage results are shown in Figure 8.
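Setting each component to a final concentration of 100 nM is a simple dilution calculation (C1·V1 = C2·V2). The sketch below is illustrative only: the 20 µL reaction volume and the 1000 nM stock concentrations are hypothetical values, since the document gives only the final concentrations.

```python
def stock_volume_ul(final_nm: float, reaction_ul: float, stock_nm: float) -> float:
    """Volume of stock (in µL) needed to reach `final_nm` in a `reaction_ul` reaction."""
    if stock_nm <= 0 or reaction_ul <= 0 or final_nm < 0:
        raise ValueError("concentrations and volumes must be positive")
    if stock_nm < final_nm:
        raise ValueError("stock is more dilute than the requested final concentration")
    # C1 * V1 = C2 * V2, solved for V1.
    return final_nm * reaction_ul / stock_nm


if __name__ == "__main__":
    # Hypothetical stocks (nM) for the three components named in the text.
    stocks = {"functional RNA": 1000.0, "cleavage substrate": 1000.0, "BES1": 1000.0}
    for name, stock in stocks.items():
        print(name, stock_volume_ul(100.0, 20.0, stock), "µL")
```

The remaining volume would be made up with reaction buffer and water, whose composition is not specified in this section.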

As Figure 8 shows, BES1 combined with each of the four functional RNAs in Figure 4 cleaved the target substrate at 20°C, 25°C, and 37°C.

Example 5: PAM preference identification of the BES2, BES4, and BES6 systems

The PAM identification methods and steps for the BES2, BES4, and BES6 systems were the same as in the examples above. The main steps were as follows:

(1) Preparation of guide RNA

The tracrRNA and crRNA sequences of the BES2 system in the strain Collinsella sp. Marseille-P2666 were obtained by bioinformatic prediction (see Appendix Table I), and a double-stranded DNA transcription template was designed for an sgRNA in which the crRNA and tracrRNA are joined. The deoxynucleotide sequences are listed in Table 4 below. BES4 and BES6 are Cpf1 homologs. Such systems need only a crRNA to guide the effector protein to cleave its genomic target, with no tracrRNA involved. The crRNA sequences of the two proteins were predicted bioinformatically, and their double-stranded DNA transcription templates were designed and synthesized. The deoxynucleotide sequences are listed in Table 4 below. All deoxynucleotide sequences in Table 4 were synthesized at the synthesis and editing platform of the Shenzhen National Gene Bank.

Table 4:

Guide RNAs were expressed and prepared from the double-stranded DNA transcription templates shown in Table 4 (BES2 system gRNA; BES4 and BES6 system crRNAs) by the same steps as in Example 3.

(2) PAM identification

The PAM sequences of the BES2, BES4, and BES6 systems were rapidly determined on the DNB chip as in Example 3. The PAM preferences of the three systems are shown in Figure 10.

Example 6: Identification of in vitro cleavage activity of the BES2, BES4, and BES6 systems

First, the guide RNAs of the BES2, BES4, and BES6 systems were produced by in vitro transcription as described in Example 3. Second, the effector proteins of the three systems were expressed and purified by the method of Example 2. Finally, substrate preparation and in vitro cleavage were carried out by the method of Example 4. As shown in Figure 11, all three systems cleave double-stranded DNA in vitro.

Example 7: Identification of the editing activity of the BES6 system in human cells

(1) Human cell culture

The inventors chose human HEK293T cells for testing in vivo editing activity. HEK293T cells were cultured in DMEM medium supplemented with fetal bovine serum (FBS).

(2) RNP preparation

For editing in HEK293T cells, we chose the endogenous gene AAVS1 to verify targeted cleavage.

The nucleotide sequence of the targeted region of AAVS1 is as follows:

CCCTTGCTCTCTGCTGTGTTGCTGCCCAAGGATGCTCTTTCCGGAGCACTTCCTTC
TCGGCGCTGCACCACGTGATGTCCTCTGAGCGGATCCTCCCCGTGTCTGGGTCCTCTC
CGGGCATCTCTCCTCCCTCACCCAACCCCATGCCGTcTTCACTCGCTGGGTTCCCTTTT
CCTTCTCCTTCTGGGGCCTGTGCCATCTCTCGTTTCTTAGGATGGCCTTCTCCGACGGA
TGTCTCCCTTGCGTCCCGCCTCCCCTTCTTGTAGGCCTGCATCATCACCGTTTTTCTGG
ACAACCCCAAAGTACCCCGTCTCCCTGGCTTtAGcCACCTCTCCATCCTCTTGCTTTCTT
TGCCTGGACACCCCGTTCTCCTGTGGATTCGGGTCACCTCTCACTCCTTTCATTTGGGC
AGCTCCCCTACCCCCCTTACCTCTCTAGTCTGTGCTAGCTCTTCCAGCCCCCTGTCATG
GCATCTTCCAGGGGTCCGAGAGCTCAGCTAGTCTTCTTCCTCCAACCCGGGCCCcTAT
GTCCACTTCAGGACAGCATGTTTGCTGCCTCCAGGGATCCTGTGTCCCCGAGCTGGGA
CCACCTTATATTCCCAGGGCCGGTTAATGTGGCTCTGGTTCTGGGTACTTTTATCTGTCC
CCTCCACCCCACAGTGGGGCCACTAGGGACAGGATTGGTGACAGAAAAGCCCCATCC
TTAGGCCTCCTCCTTCCTAGTCTCCTGATATTGGGTCTAACCCCCACCTCCTGTTAGGC
AGATTCCTTATCTGGTGACACACCCCCATTTCCTGGAGCCATCTCTCTCCTTGCCAGAA
CCTCTAAGGTTTGCTTACGATGGAGCCAGAGAGGATCCTGGGAGGGAGAGCTTGGCA
GGGGGTGGGAGGGAAGGGGGGGATGCGTGACCTGCCCGGTTCTCAGTGGCCACCCT
GCGCTACCCTCTCCCAGAACCTGAGCTGCTCTGACGCGGCTGTCTGGTGCGTTTCACT
GATCCTGGTGCTGCAGCTTCCTTACACTTCCCAAGAGGAGAAGCAGTTTGGAAAAAC
AAAATCAGAATAAGTTGGTCCTGAGTTCTAACTTTGGCTCTTCACCTTTCTAGTCCCCA
ATTTATATTGTTCCTCCGTGCGTCAGTTTTACCTGTGAGATAAGGCCAGTAGCCACCCC
CGTCCTGGCAGGGCTGTGGTGAGGAGGGGGGTGTCCGTGTGGAAAACTCCCTTTGTG
AGAATGGTGCGTCCTAGGTGTTCACCAGGTCGTGGCCGCCTCTACTCCCTTTCTCTTTC
TCCATCCTTCTTTCCTTAAAGAGCCCCCAGTGCTATCTGGACATATTCCTCCGCCCAGA
GCAGGGTCCGCTTCCCTAAGGCCCTGCTCTGGGCTTCTGGGTTTGAGTCCTTGCAAGC
CCAGGAGAGCGCTAGCTTCCCTGTCCCCCTTCCTCGTCCACCATCTCATGCCCTGGCT
CTCCTGCCCCTTCCTACA (SEQ ID NO:27).

For this gene, one target site was designed, and its double-stranded DNA transcription template was designed and synthesized. The deoxynucleotide sequences are listed in Table 5 below. All deoxynucleotide sequences in Table 5 were synthesized at the synthesis and editing platform of the Shenzhen National Gene Bank.

Table 5:

Guide RNAs for the BES4 and BES6 AAVS1 target-site sequences shown in Table 5 were generated by in vitro transcription from ordered oligonucleotides with the MEGAshortscript™ T7 Transcription Kit (Invitrogen), following the manufacturer's recommended method. The BES4 and BES6 effector proteins were expressed as in Example 2.

(3) Delivery of RNP into human cells

In a twelve-well plate, 10 pmol of purified effector protein and 0.5 µL of gRNA were added to each well. The RNP was assembled and transfected into HEK293T cells with the Neon™ Transfection System kit and electroporation device (Invitrogen), following the manufacturer's protocol.

(4) Identification of editing activity

Cells were harvested 2-3 days after RNP transfection, and activity was assessed with a T7E1 digestion assay as follows:

(a) Cell collection: 200 µL of 0.5 M EDTA (pH 8.0) was added to each well of the 12-well plate to resuspend the cells.

(b) Genomic DNA extraction: Genomic DNA was extracted with a genomic DNA extraction kit (Tiangen), and the gDNA concentration was measured with a NanoDrop.

(c) Target region PCR: The target site region was amplified from gDNA with GXL Prime. The amplification primers are listed in Table 6 below, and all deoxynucleotide sequences used were synthesized at the synthesis and editing platform of the Shenzhen National Gene Bank. The product was purified with a PCR purification and gel extraction kit (MN). The purity of the PCR product was checked by agarose gel electrophoresis, and its concentration was measured with a NanoDrop.

(d) Denaturation and annealing: The purified product from step (c) was denatured and re-annealed in a Bio-Rad PCR instrument. Equal amounts of substrate DNA (about 200-300 ng per reaction, in a 10 µL reaction) were used for the T7E1 digestion.

(e) T7E1 digestion: 0.2 µL of T7E1 nuclease was added to the 10 µL sample from step (d), and the digestion was run at 37°C for 20 minutes.

(f) Activity detection: After the T7E1 digestion was complete, loading buffer was added and the bands were analyzed on an agarose gel.

Table 6: List of PCR amplification primers

As shown in Figure 12, BES6 has editing activity in human cells.
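The document reports the T7E1 result qualitatively. If band intensities from such a gel were quantified, a commonly used estimate of the indel frequency is 100 × (1 − √(1 − f_cut)), where f_cut is the cleaved fraction of the PCR product. The sketch below applies that general formula; it is not a calculation described in this document, and the intensity values are hypothetical.

```python
import math


def t7e1_indel_percent(uncut: float, cut1: float, cut2: float) -> float:
    """Estimate indel frequency (%) from T7E1 gel band intensities.

    `uncut` is the intensity of the full-length band; `cut1` and `cut2`
    are the intensities of the two cleavage products.
    """
    if min(uncut, cut1, cut2) < 0:
        raise ValueError("band intensities must be non-negative")
    total = uncut + cut1 + cut2
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    f_cut = (cut1 + cut2) / total
    # Heteroduplexes form from random re-annealing, hence the square root.
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))


if __name__ == "__main__":
    # Hypothetical densitometry values in arbitrary units.
    print(f"{t7e1_indel_percent(50.0, 25.0, 25.0):.1f} %")
```

The square-root term reflects that cleavable heteroduplexes arise only when a mutant strand pairs with a different strand during re-annealing, so the cleaved fraction is not itself the mutation rate.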

Example 8: Identification of the editing activity of the BES4 system in human cells

(1) Human cell culture

The inventors chose human HEK293T cells for testing in vivo editing activity. HEK293T cells were cultured in DMEM medium supplemented with fetal bovine serum (FBS).

(2) Plasmid preparation

For editing in HEK293T cells, we chose the endogenous gene HBG to verify targeted cleavage.

The nucleotide sequence of the targeted region of HBG is as follows:

CCCTGCTGTGCTCAGATCAATACTCCGTTGTCTAAGTTGCCTCGAGACTAAAGGC
AACAGGGCTGAAACATCTCCTGGACTCACCTTGAAGTTCTCAGGATCCACATGCAGCT
TGTCACAGTGCAGTTCACTCAGCTGGGCAAAGGTGCCCTTGAGATCATCCAGGTGCTT
TGTGGCATCTCCCAAGGAAGTCAGCACCTTCTTGCCATGTGCCTTGACTTTGGGGTTG
CCCATGATGGCAGAGGCAGAGGACAGGTTGCCAAAGCTGTCAAAGAACCTCTGGGTC
CATGGGTAGACAACCAGGAGCCTGTGAGATTGACAAGAACAGTTTGACAGTCAGAAG
GTGCCACAAATCCTGAGAAGCGACCTGGACTTTTGCCAGGCACAGGGTCCTTCCTTC
CCTCCCTTGTCCTGGTCACCAGAGCCTACCTTCCCAGGGTTTCTCCTCCAGCATCTTCC
ACATTCACCTTGCCCCACAGGCTTGTGATAGTAGCCTTGTCCTCCTCTGTGAAATGACC
CATGGCGTCTGGACTAGGAGCTTATTGATAACCTCAGACGTTCCAGAAGCGAGTGTGT
GGAACTGCTGAAGGGTGCTTCCTTTTATTCTTCATCCCTAGCCAGCCGCCGGCCCCTG
GCCTCACTGGATACTCTAAGACTATTGGTCAAGTTTGCCTTGTCAAGGCTATTGGTCAA
GGCAAGGCTGGCCAACCCATGGGTGGAGTTTAGCCAGGGACCGTTTCAGACAGATAT
TTGCATTGAGATAGTGTGGGGAAGGGGCCCCCAAGAGGATACTGCTAATTTTTTTTATA
GCCTTTGCCTTGTTCCGATTCAGTCATTCCAGTTTTTCTCTAATTTATTCTTCCCTTTAGC
TAGTTTCCTTCTCCCATCATAGAGGATACCAGGACTTCTTTTGTCAGCCGTTTTTTACCT
TCTTGTCTCTAGCTCCAGTGAGGCCTGTAGTTTAAAGCTAAAGCATGTACCAATTTTTG
AAAAGTTCAGGGATTGTGAAATGTGTTTTAGGCATAGGTCCAGGATTTTTGACGGGAC
AAATCTTAGTCTCTTTCAGTTAGCAGTGGTTTCTAAGGA (SEQ ID NO:32).

For this region, the inventors designed three target sites and synthesized the corresponding plasmid sequences:

BES4-HBG-sg01:

GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGA

GAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACG

TAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTA

TCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAA

AGGACGAAACACCGAATTTCTACTATTGTAGATGCCAGCCTTGCCTTGACCAATAGTTTAGGACGAAACACCGAATTTCTACTATTGTAGATGCCAGCCTTGCCTTGACCAATAGTTT

TTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTTTAGCGCGTGCTTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTTTAGCGCGTGC

GCCAATTCTGCAGACAAATGGCTCTAGAGGTACCCGTTACATAACTTACGGTAAATGGGCCAATTCTGCAGACAAATGGCTCTAGAGGTACCCGTTACATAACTTACGGTAAATGG

CCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAGTAACGCCAATACCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAGTAACGCCAATA

GGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAG

TACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGG

CCCGCCTGGCATTGTGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCCCGCCTGGCATTGTGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACAT

CTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTC

TCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTCCCCATCTCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTG

TGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGG

GGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCG

GCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAA

AGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCT

CCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGCCGCCGCCGCCTCGCGCCGCCCGCCCCGCCTCTGACTGACCGCGTTACTCCCACAGG

TGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAAGTGAGCGGGCGGGACGGCCCTTTCTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAAG

GGTTTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCCTGGTTTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCCT

GAAATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACGGGAAATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACGG

AGACTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAAGAGACTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAAG

AAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCATGCAGGAGAGAAAGAAAAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCATGCAGGAGAGAAAGAA

GATCAGCCACCTGACCCACAGAAACAGCGTGAAGAAAACCATCAGAATGCAGCTGAGATCAGCCACCTGACCCACAGAAACAGCGTGAAGAAAACCATCAGAATGCAGCTGA

ACCCCGTGGGAAAGACCATGGACTACTTCCAGGCCAAGCAGATCCTGGAGAACGACGACCCCTGTGGGAAAGACCATGGACTACTTCCAGGCCAAGCAGATCCTGGAGAACGACG

AGAAGCTGAAGGAGGACTACCAGAAGATCAAGGAGATCGCCGACAGATTCTACAGAAGAAGCTGAAGGAGGACTACCAGAAGATCAAGGAGATCGCCGACAGATTCTACAGA

AACCTGAACGAGGACGTGCTGAGCAAAACCGGACTGGACAAGCTGAAGGACTACGCAACCTGAACGAGGACGTGCTGAGCAAAACCGGACTGGACAAGCTGAAGGACTACGC

CGAGATCTACTACCATTGCAACACCGACGCCGACAGAAAGAGACTGAACGAGTGCGCCGAGATCTACTACCATTGCAACACCGACGCCGACAGAAAGAGACTGAACGAGTGCGC

CAGCGAGCTGAGAAAGGAGATCGTGAAGAACTTCAAGAACAGAGATGAGTACAACACAGCGAGCTGAGAAAGGAGATCGTGAAGAACTTCAAGAACAGAGATGAGTACAACA

AGCTGTTCAACAAGAAGATGATCGAGATCGTGCTGCCCAAGCACCTGAAGAACGAGGAGCTGTTCAACAAGAAGATGATCGAGATCGTGCTGCCCAAGCACCTGAAGAACGAGG

ACGAGAAGGAAGTGGTGGCCAGCTTCAAGAACTTCACCACCTACTTCACCGGCTTCTACGAGAAGGAAGGTGGTGGCCAGCTTCAAGAACTTCACCACCTACTTCACCGGCTTCT

TCACCAACAGAAAGAACATGTACAGCGACGGCGAAGAGTCTACCGCTATTGCCTACATCACCAACAGAAAGAACATGTACAGCGACGGCGAAGAGTCTACCGCTATTGCCTACA

GATGCATCAACGAGAACCTGCCCAAGCACCTGGACAACGTGAAGGTGTTCGAGAAGGATGCATCAACGAGAACCTGCCCAAGCACCTGGACAACGTGAAGGTGTTCGAGAAG

GCCATCAGCAAGCTGAGCAAGAACGCCATCGACGACCTGGATGCCACATATTCTGGCCGCCATCAGCAAGCTGAGCAAGAACGCCATCGACGACCTGGATGCCACATATTCTGGCC

TGTGCGGCACAAATCTGTACGACGTGTTCACCGTGGACTACTTCAACTTCCTGCTGCCTGTGCGGCACAAATCTGTACGACGTGTTCACCGTGGACTACTTCAACTTCCTGCTGCC

CCAAAGCGGAATCACCGAGTACAACAAGATCATCGGCGGCTACACAACAAGCGACGGCCAAAGCGGAATCACCGAGTACAACAAGATCATCGGCGGCTACACAACAAGCGACGG

CACCAAAGTGAAGGGCATCAACGAGTACATCAACCTGTACAACCAGCAGGTGAGCAACACCAAAGTGAAGGGCATCAACGAGTACATCAACCTGTACAACCAGCAGGTGAGCAA

GAGAGACAAGATCCCCAACCTGAAGATCCTGTACAAGCAGATCCTGAGCGAGAGCGAGAGAGACAAGATCCCCAACCTGAAGATCCTGTACAAGCAGATCCTGAGCGAGAGCGA

GAAGGTGTCTTTCATCCCCCCCAAGTTCGAGGACGACAACGAACTGCTGTCTGCCGTGAAGGTGTCTTTCATCCCCCCCAAGTTCGAGGACGACAACGAACTGCTGTCTGCCGT

GAGCGAGTTCTATGCCAACGACGAGACATTTGATGGCATGCCCCTGAAGAAAGCCATCGAGCGAGTTCTATGCCAACGACGAGACATTTGATGGCATGCCCTGAAGAAAGCCATC

GACGAAACCAAACTGCTGTTCGGCAACCTGGACAACAGCAGCCTGAACGGCATCTACGACGAAACCAAACTGCTGTTCGGCAACCTGGACAACAGCAGCCTGAACGGCATCTAC

ATCCAGAACGACAGAAGCGTGACCAACCTGAGCAACAGCATGTTCGGCAGCTGGAGATCCAGAACGACAGAAGCGTGACCAACCTGAGCAACAGCATGTTCGGCAGCTGGAG

CGTGATTGAGGACCTGTGGAACAAGAACTACGACAGCGTGAACAGCAACAGCAGAACGTGATTGAGGACCTGTGGAACAAGAACTACGACAGCGTGAACAGCAACAGCAGAA

TCAAGGACATCCAGAAGAGAGAGGACAAGAGAAAGAAGGCCTACAAGGCCGAGAATCAAGGACATCCAGAAGAGAGAGGACAAGAGAAAGAAGGCCTACAAGGCCGAGAA

GAAGCTGAGCCTGAGCTTCCTGCAGGTGCTGATCAGCAACAGCGAGAACGACGAGATGAAGCTGAGCCTGAGCTTCCTGCAGGTGCTGATCAGCAACAGCGAGAACGACGAGAT

CAGAAAGAAGAGCATCGTGGACTACTACAAGACCAGCCTGATGCAGCTGACCGACAACAGAAAGAAGAGCATCGTGGACTACTACAAGACCAGCCTGATGCAGCTGACCGACAA

CCTGAGCGACAAGTACAAAGAAGCCGCCCCCCTGTTTTCTGAGAACTACGACAACGACCTGAGCGACAAGTACAAAGAAGCCGCCCCCCTGTTTTCTGAGAACTACGACAACGA

GAAGGGCCTGAAGAACGACGACAAGAGCATCAGCCTGATCAAGAACTTCCTGGACGGAAGGGCCTGAAGAACGACGACAAGAGCATCAGCCTGATCAAGAACTTCCTGGACG

CCATCAAGGAGATCGAGAAGTTCATCAAGCCCCTGAGCGAGACAAATATCACCGGCGCCATCAAGGAGATCGAGAAGTTCATCAAGCCCCTGAGCGAGACAAATATCACCGGCG

AGAAGAACGACCTGTTCTACAGCCAGTTCACCCCCCTGCTGGACAACATCAGCAGAAAGAAGAACGACCTGTTCTACAGCCAGTTCACCCCCCTGCTGGACAACATCAGCAGAA

TCGACAGACTGTACGACAAGGTGAGAAACTACGTGACCCAGAAGCCCTTCAGCACCGTCGACAGACTGTACGACAAGGTGAGAAAACTACGTGACCCAGAAGCCCTTCAGCACCG

ACAAGATCAAGCTGAACTTCGGCAACAGCCAGCTTCTGAACGGCTGGGACAGAAACACAAGATCAAGCTGAACTTCGGCAACAGCCAGCTTCTGAACGGCTGGGACAGAAAC

AAGGAGAAGGACTGTGGCGCTGTGCTGCTGTGTAAGGACGAGAAGTACTACCTGGCCAAGGAGAAGGACTGTGGCGCTGTGCTGCTGTAAGGACGAGAAGTACTACCTGGCC

ATCATCGACAAGAGCAACAACAGCATCCTGGAGAACATCGACTTCCAGGACTGCAACATCATCGACAAGAGCAACAACAGCATCCTGGAGAACATCGACTTCCAGGACTGCAAC

GAGAGCGACTACTACGAGAAGATCGTGTACAAGCTGCTGACCAAGATCTCTGGCAACGAGAGCGACTACTACGAGAAGATCGTGTACAAGCTGCTGACCAAGATCTCTGGCAAC

CTGCCCAGAGTGTTCTTCAGCGAGAAGCACAAGAAGCTGCTGAGCCCCAGCGATGAGCTGCCCAGAGGTGTTCTTCAGCGAGAAGCACAAGAAGCTGCTGAGCCCCAGCGATGAG

ATCCTGAAGATCTACAAGAGCGGCACCTTCAAGAAGGGCGACAAGTTCAGCCTTGACATCCTGAAGATCTACAAGAGCGGCACCTTCAAGAAGGGCGACAAGTTCAGCCTTGAC

GACTGCCACAAGCTGATCGACTTCTACAAGGAGAGCTTCAAGAAGTACCCCAAGTGGGACTGCCACAAGCTGATCGACTTCTACAAGGAGAGCTTCAAGAAGTACCCCAAGTGG

CTGATCTACAACTTCAAGTTCAAGAACACCAACGAGTACAACGACATCAGCGAGTTCTCTGATCTACAACTTCAAGTTCAAGAACACCAACGAGTACAACGACATCAGCGAGTTCT

ACAACGACGTGGCCAGCCAGGGATACAACATCAGCAAGATGAAGATCCCCACCAGCTACAACGACGTGGCCAGCCAGGGATACAACATCAGCAAGATGAAGATCCCCACCAGCT

TCATCGACAAGCTGGTGGACGAGGGCAAGATCTACCTGTTCCAGCTGTACAACAAGGTCATCGACAAGCTGGTGGACGAGGGCAAGATCTACCTGTTCCAGCTGTACAACAAGG

ACTTCAGCCCCCACAGCAAGGGAACACCTAACCTGCACACCCTGTACTTCAAGATGCACTTCAGCCCCCACAGCAAGGGAACACCTAACCTGCACAACCCTGTACTTCAAGATGC

TGTTCGACGAGAGAAACCTGGAGGACGTGGTGTACAAGCTGAATGGCGAGGCCGAGTGTTCGACGAGAGAAACCTGGAGGACGTGGTGTACAAGCTGAATGGCGAGGCCGAG

ATGTTTTACAGACCCGCCAGCATCAAGTATGACAAGCCCACCCACCCTAAGAACACCCATGTTTTACAGACCCGCCAGCATCAAGTATGACAAGCCCACCCACCCTAAGAACACCC

CCATCAAGAACAAGAACACCCTGAACGACAAGAAGGCCAGCACCTTCCCCTACGACCCCATCAAGAACAAGAACACCCTGAACGACAAGAAGGCCAGCACCTTCCCTTACGACC

TGATCAAGGACAAGAGATACACCAAGTGGCAGTTCAGCCTGCACTTCCCCATCACCATTGATCAAGGACAAGAGATACACCAAGTGGCAGTCAGCCTGCACTTCCCCATCACCAT

GAACTTCAAGGCCCCCGACAGAGCCATGATCAACGACGACGTGAGAAACCTGCTGAGAACTTCAAGGCCCCCGACAGAGCCATGATCAACGACGACGTGAGAAACCTGCTGA

AGAGCTGCAACAACAACTTCATCATCGGCATCGACAGAGGCGAGAGAAACCTGCTGTAGAGCTGCAACAACAACTTCATCATCGGCATCGACAGAGGCGAGAGAAACCTGCTGT

ACGTGAGCGTGATCGATAGCAACGGCGCCATCATCTACCAGCACAGCCTGAACATCATACGTGAGCGTGATCGATAGCAACGGCGCCATCATCTACCAGCACAGCCTGAACATCAT

CGGCAACAAGTTCAAGGGCAAGACCTACGAAACCAACTACAGAGAGAAGCTGGCCACGGCAACAAGTTCAAGGGCAAGACCTACGAAAACCAACTACAGAGAGAAGCTGGCCA

CCAGAGAGAAGGAGAGAACCGAGCAGAGAAGAAACTGGAAGGCCATCGAGAGCATCCAGAGAGAAGGAGAGAACCGAGCAGAGAAGAAACTGGAAGGCCATCGAGAGCAT

CAAGGAGCTGAAGGAGGGCTACATCAGCCAAACCGTGCACGTGATTTGCCAGCTGGTCAAGGAGCTGAAGGAGGGCTACATCAGCCAAACCGTGCACGTGATTTGCCAGCTGGT

GGTGAAGTACGACGCCATCATCGTGATGGAGAAGCTGACCGACGGCTTCAAGAGAGGGGTGAAGTACGACGCCATCATCGTGATGGAGAAGCTGACCGACGGCTTCAAGAGAGG

CAGAACCAAGTTCGAGAAGCAGGTGTACCAGAAGTTCGAGAAGATGCTGATCGACACAGAACCAAGTTCGAGAAGCAGGTGTACCAGAAGTTCGAGAAGATGCTGATCGACA

AGCTGAACTACTACGTGGACAAGAAGCTGGACCCCAATGAGGAAGGCGGACTGCTGAGCTGAACTACTACGTGGACAAGAAGCTGGACCCCAATGAGGAAGGCGGACTGCTG

CATGCTTATCAGCTGACCAACAAGCTGGACAGCTTCGACAAGCTGGGAATGCAGAGCCATGCTTATCAGCTGACCAACAAGCTGGACAGCTTCGACAAGCTGGGAATGCAGAGC

GGCTTCATCTTCTACGTCAGACCCGACTTCACCAGCAAAATCGACCCCGTGACCGGATGGCTTCATCTTCTACGTCAGACCCGACTTCACCAGCAAAATCGACCCCGTGACCGGAT

TTGTGAACCTGCTGTACCCCAGATACGAGAACATCGACAAGGCCAAGGACATGATCATTGTGAACCTGCTGTACCCCAGATACGAGAACATCGACAAGGCCAAGGACATGATCA

GCAGATTCGACGACATCAGATACAACGCCGGCGAGGACTTCTTCGAGTTCGACATCGGCAGATTCGACGACATCAGATACAACGCCGGCGAGGACTTCTTCGAGTTCGACATCG

ACTACGACAAGTTCCCCAAGACCGCCAGCGACTACAGAAAGAAGTGGACCATCTGCAACTACGACAAGTTCCCCAAGACCGCCAGCGACTACAGAAAGAAGTGGACCATCTGCA

CCAACGGCGAGAGAATCGAGGCCTTCAGAAACCCCGCCAACAACAACGAGTGGAGCCCAACGGCGAGAGAATCGAGGCCTTCAGAAACCCCGCCAACAACAACGAGTGGAGC

TACAGAACCATCATCCTGGCCGAGAAGTTCAAGGAGCTGTTCGACAACAACAGCATCTACAGAACCATCATCCTGGCCGAGAAGTTCAAGGAGCTGTTCGACAACAACAGCATC

AACTACAGAGACAGCGACGACCTGAAAGCCGAGATCCTGAGCCAAACCAAGGGCAAAACTACAGAGACAGCGACGACCTGAAAGCCGAGATCCTGAGCCAAACCAAGGGCAA

GTTCTTCGAGGACTTCTTCAAGCTGCTGAGACTGACCCTGCAGATGAGAAACAGCAAGTTCTTCGAGGACTTCTTCAAGCTGCTGAGACTGACCCTGCAGATGAGAAACAGCAA

CCCCGAAACCGGAGAGGACAGGATTCTGAGCCCCGTGAAGGACAAGAACGGCAACTCCCCGAAACCGGAGAGGACAGGATTCTGAGCCCCGTGAAGGACAAGAACGGCAACT

TCTACGACAGCAGCAAGTACGACGAGAAGAGCAAGCTGCCCTGTGACGCTGATGCTATCTACGACAGCAGCAAGTACGACGAGAAGAGCAAGCTGCCCTGTGACGCTGATGCTA

ACGGCGCTTACAACATCGCCAGAAAGGGCCTGTGGATCGTGGAGCAGTTCAAGAAGGACGGCGCTTACAACATCGCCAGAAAGGGCCTGTGGATCGTGGAGCAGTTCAAGAAGG

CCGACAACGTGTCTGCTGTGGAACCCGTGATCCACAACGACAAGTGGCTGAAGTTCGCCGACAACGTGTCTGCTGTGGAACCCGTGATCCACAACGACAAGTGGCTGAAGTTCG

TGCAGGAGAACGACATGGCCAACAACAAAAGGCCGGCGGCCACGAAAAAGGCCGGTGCAGGAGAACGACATGGCCAACAACAAAAAGGCCGGCGGCCACGAAAAAGGCCGG

CCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGCAGAGGAAGTCTGCTAACCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGCAGAGGAAGTCTGCTAA

CATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAGGGCGAGGAGCTGTTCCATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAGGGCGAGGAGCTGTTC

ACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTC

AGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTC

ATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCT

ACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAACGGCGTGCAGTGCTTCAGCCGCTACCCGACCACATGAAGCAGCACGACTTCTTCA

AGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACG

GCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGC

ATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTG

GAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGC

ATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCC

GACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACGACCACTACCAGCAGAACACCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAAC

CACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCAC

ATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGT

ACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAG

CCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCA

CTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTCTGTCCTTTCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCT

ATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGAGAATAGATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGAGAATAG

CAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTCCAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTC

TCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGGTCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGG

CTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCTCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCT

GATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGTCAAAGCAGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGTCAAAGCA

ACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGC

AGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCAGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCTCCTTTTCGCTTTCTTCCCTTC

CTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTACTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGGCTCCCTTTA

GGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGATGGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGATG

GTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTC

CACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACTCTATCTCGCACGTTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACTCTATCTCG

GGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGTCTATTGGTTAAAAAATGAGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGTCTATTGGTTAAAAAATGAG

CTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTTATGGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTTATGG

TGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGC

CAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACCAACACCCGCTGACGCGCCCTGACGGCTTGTCTGCTCCCGGCATCCGCTTACAGAC

AAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGA

AACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATA

ATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTA

TTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATTTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGAT

AAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCC

CTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGT

GAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGAGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGA

TCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGTCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATG

AGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGA

GCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTC

ACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAAACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAA

CCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGCCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGG

AGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGA

ACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGC

AATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGG

CAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGG

CCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGAAGCCG

CGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACA

CGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTG

CCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTG

ATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCA

TGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAA

GATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAA

AAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTT

TCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAG

CCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGC

TAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGA

CTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTG

CACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGA

GCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAG

CGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGT

ATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGC

TCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGT (SEQ ID NO: 33).

BES4-HBG-sg02:

GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGAATTTCTACTATTGTAGATACCAATAGCCTTGACAAGGCAAATT

TTTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTTTAGCGCGTG

CGCCAATTCTGCAGACAAATGGCTCTAGAGGTACCCGTTACATAACTTACGGTAAATG

GCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAGTAACGCCAAT

AGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCA

GTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATG

GCCCGCCTGGCATTGTGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACA

TCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACT

CTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTT

GTGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCG

GGGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGC

GGCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAA

AAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCG

CTCCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAG

GTGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAA

GGGTTTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCC

TGAAATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACG

GAGACTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAA

GAAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCATGCAGGAGAGAAAGA

AGATCAGCCACCTGACCCACAGAAACAGCGTGAAGAAAACCATCAGAATGCAGCTG

AACCCCGTGGGAAAGACCATGGACTACTTCCAGGCCAAGCAGATCCTGGAGAACGAC

GAGAAGCTGAAGGAGGACTACCAGAAGATCAAGGAGATCGCCGACAGATTCTACAG

AAACCTGAACGAGGACGTGCTGAGCAAAACCGGACTGGACAAGCTGAAGGACTACG

CCGAGATCTACTACCATTGCAACACCGACGCCGACAGAAAGAGACTGAACGAGTGCG

CCAGCGAGCTGAGAAAGGAGATCGTGAAGAACTTCAAGAACAGAGATGAGTACAAC

AAGCTGTTCAACAAGAAGATGATCGAGATCGTGCTGCCCAAGCACCTGAAGAACGAG

GACGAGAAGGAAGTGGTGGCCAGCTTCAAGAACTTCACCACCTACTTCACCGGCTTC

TTCACCAACAGAAAGAACATGTACAGCGACGGCGAAGAGTCTACCGCTATTGCCTAC

AGATGCATCAACGAGAACCTGCCCAAGCACCTGGACAACGTGAAGGTGTTCGAGAA

GGCCATCAGCAAGCTGAGCAAGAACGCCATCGACGACCTGGATGCCACATATTCTGG

CCTGTGCGGCACAAATCTGTACGACGTGTTCACCGTGGACTACTTCAACTTCCTGCTG

CCCCAAAGCGGAATCACCGAGTACAACAAGATCATCGGCGGCTACACAACAAGCGAC

GGCACCAAAGTGAAGGGCATCAACGAGTACATCAACCTGTACAACCAGCAGGTGAGC

AAGAGAGACAAGATCCCCAACCTGAAGATCCTGTACAAGCAGATCCTGAGCGAGAGC

GAGAAGGTGTCTTTCATCCCCCCCAAGTTCGAGGACGACAACGAACTGCTGTCTGCC

GTGAGCGAGTTCTATGCCAACGACGAGACATTTGATGGCATGCCCCTGAAGAAAGCC

ATCGACGAAACCAAACTGCTGTTCGGCAACCTGGACAACAGCAGCCTGAACGGCATC

TACATCCAGAACGACAGAAGCGTGACCAACCTGAGCAACAGCATGTTCGGCAGCTGG

AGCGTGATTGAGGACCTGTGGAACAAGAACTACGACAGCGTGAACAGCAACAGCAG

AATCAAGGACATCCAGAAGAGAGAGGACAAGAGAAAGAAGGCCTACAAGGCCGAG

AAGAAGCTGAGCCTGAGCTTCCTGCAGGTGCTGATCAGCAACAGCGAGAACGACGA

GATCAGAAAGAAGAGCATCGTGGACTACTACAAGACCAGCCTGATGCAGCTGACCGA

CAACCTGAGCGACAAGTACAAAGAAGCCGCCCCCCTGTTTTCTGAGAACTACGACAA

CGAGAAGGGCCTGAAGAACGACGACAAGAGCATCAGCCTGATCAAGAACTTCCTGG

ACGCCATCAAGGAGATCGAGAAGTTCATCAAGCCCCTGAGCGAGACAAATATCACCG

GCGAGAAGAACGACCTGTTCTACAGCCAGTTCACCCCCCTGCTGGACAACATCAGCA

GAATCGACAGACTGTACGACAAGGTGAGAAACTACGTGACCCAGAAGCCCTTCAGCA

CCGACAAGATCAAGCTGAACTTCGGCAACAGCCAGCTTCTGAACGGCTGGGACAGA

AACAAGGAGAAGGACTGTGGCGCTGTGCTGCTGTGTAAGGACGAGAAGTACTACCTG

GCCATCATCGACAAGAGCAACAACAGCATCCTGGAGAACATCGACTTCCAGGACTGC

AACGAGAGCGACTACTACGAGAAGATCGTGTACAAGCTGCTGACCAAGATCTCTGGC

AACCTGCCCAGAGTGTTCTTCAGCGAGAAGCACAAGAAGCTGCTGAGCCCCAGCGAT

GAGATCCTGAAGATCTACAAGAGCGGCACCTTCAAGAAGGGCGACAAGTTCAGCCTT

GACGACTGCCACAAGCTGATCGACTTCTACAAGGAGAGCTTCAAGAAGTACCCCAAG

TGGCTGATCTACAACTTCAAGTTCAAGAACACCAACGAGTACAACGACATCAGCGAG

TTCTACAACGACGTGGCCAGCCAGGGATACAACATCAGCAAGATGAAGATCCCCACC

AGCTTCATCGACAAGCTGGTGGACGAGGGCAAGATCTACCTGTTCCAGCTGTACAAC

AAGGACTTCAGCCCCCACAGCAAGGGAACACCTAACCTGCACACCCTGTACTTCAAG

ATGCTGTTCGACGAGAGAAACCTGGAGGACGTGGTGTACAAGCTGAATGGCGAGGCC

GAGATGTTTTACAGACCCGCCAGCATCAAGTATGACAAGCCCACCCACCCTAAGAAC

ACCCCCATCAAGAACAAGAACACCCTGAACGACAAGAAGGCCAGCACCTTCCCCTAC

GACCTGATCAAGGACAAGAGATACACCAAGTGGCAGTTCAGCCTGCACTTCCCCATC

ACCATGAACTTCAAGGCCCCCGACAGAGCCATGATCAACGACGACGTGAGAAACCTG

CTGAAGAGCTGCAACAACAACTTCATCATCGGCATCGACAGAGGCGAGAGAAACCTG

CTGTACGTGAGCGTGATCGATAGCAACGGCGCCATCATCTACCAGCACAGCCTGAACA

TCATCGGCAACAAGTTCAAGGGCAAGACCTACGAAACCAACTACAGAGAGAAGCTG

GCCACCAGAGAGAAGGAGAGAACCGAGCAGAGAAGAAACTGGAAGGCCATCGAGA

GCATCAAGGAGCTGAAGGAGGGCTACATCAGCCAAACCGTGCACGTGATTTGCCAGC

TGGTGGTGAAGTACGACGCCATCATCGTGATGGAGAAGCTGACCGACGGCTTCAAGA

GAGGCAGAACCAAGTTCGAGAAGCAGGTGTACCAGAAGTTCGAGAAGATGCTGATC

GACAAGCTGAACTACTACGTGGACAAGAAGCTGGACCCCAATGAGGAAGGCGGACT

GCTGCATGCTTATCAGCTGACCAACAAGCTGGACAGCTTCGACAAGCTGGGAATGCA

GAGCGGCTTCATCTTCTACGTCAGACCCGACTTCACCAGCAAAATCGACCCCGTGACC

GGATTTGTGAACCTGCTGTACCCCAGATACGAGAACATCGACAAGGCCAAGGACATG

ATCAGCAGATTCGACGACATCAGATACAACGCCGGCGAGGACTTCTTCGAGTTCGACA

TCGACTACGACAAGTTCCCCAAGACCGCCAGCGACTACAGAAAGAAGTGGACCATCT

GCACCAACGGCGAGAGAATCGAGGCCTTCAGAAACCCCGCCAACAACAACGAGTGG

AGCTACAGAACCATCATCCTGGCCGAGAAGTTCAAGGAGCTGTTCGACAACAACAGC

ATCAACTACAGAGACAGCGACGACCTGAAAGCCGAGATCCTGAGCCAAACCAAGGG

CAAGTTCTTCGAGGACTTCTTCAAGCTGCTGAGACTGACCCTGCAGATGAGAAACAG

CAACCCCGAAACCGGAGAGGACAGGATTCTGAGCCCCGTGAAGGACAAGAACGGCA

ACTTCTACGACAGCAGCAAGTACGACGAGAAGAGCAAGCTGCCCTGTGACGCTGATG

CTAACGGCGCTTACAACATCGCCAGAAAGGGCCTGTGGATCGTGGAGCAGTTCAAGA

AGGCCGACAACGTGTCTGCTGTGGAACCCGTGATCCACAACGACAAGTGGCTGAAGT

TCGTGCAGGAGAACGACATGGCCAACAACAAAAGGCCGGCGGCCACGAAAAAGGCC

GGCCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGCAGAGGAAGTCTGCT

AACATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAGGGCGAGGAGCTGT

TCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGT

TCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGT

TCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGA

CCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTT

CAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGA

CGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACC

GCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAG

CTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAAC

GGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTC

GCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGAC

AACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGAT

CACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAG

CTGTACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTG

CCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACT

CCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCA

TTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGAGA

ATAGCAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTC

CCTCTCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCC

CGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGG

CGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGTCA

AAGCAACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTT

ACGCGCAGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCTCCTTTCGCTTTCT

TCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTC

CCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTTGG

GTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTT

GGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACTCT

ATCTCGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGTCTATTGGTTAAAA

AATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAAT

TTTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGAC

ACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTA

CAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATC

ACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTC

ATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAA

CCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACC

CTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTG

TCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACG

CTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAA

CTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAA

TGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGG

CAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCAC

CAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGC

CATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCG

AAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTT

GGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTG

TAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTC

CCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCG

CTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGA

AGCCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTAT

CTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGAT

AGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTT

AGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATA

ATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGT

AGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGC

AAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAA

CTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCT

AGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTC

GCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCG

GGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGG

GTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTAC

AGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTAT

CCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAA

ACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTT

TTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGT (SEQ ID NO: 34).

BES4-HBG-SG03:

GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGAATTTCTACTATTGTAGATCCTTGTCAAGGCTATTGGTCAAGTTTTTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTTTAGCGCGTGCGCCAATTCTGCAGACAAATGGCTCTAGAGGTACCCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAG

TACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGG

CCCGCCTGGCATTGTGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACAT

CTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTC

TCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTG

TGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGG

GGCGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCG

GCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAA

AGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCT

CCGCCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGG

TGAGCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAAG

GGTTTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCCT

GAAATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACGG

AGACTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAAG

AAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCATGCAGGAGAGAAAGAA

GATCAGCCACCTGACCCACAGAAACAGCGTGAAGAAAACCATCAGAATGCAGCTGA

ACCCCGTGGGAAAGACCATGGACTACTTCCAGGCCAAGCAGATCCTGGAGAACGACG

AGAAGCTGAAGGAGGACTACCAGAAGATCAAGGAGATCGCCGACAGATTCTACAGA

AACCTGAACGAGGACGTGCTGAGCAAAACCGGACTGGACAAGCTGAAGGACTACGC

CGAGATCTACTACCATTGCAACACCGACGCCGACAGAAAGAGACTGAACGAGTGCGC

CAGCGAGCTGAGAAAGGAGATCGTGAAGAACTTCAAGAACAGAGATGAGTACAACA

AGCTGTTCAACAAGAAGATGATCGAGATCGTGCTGCCCAAGCACCTGAAGAACGAGG

ACGAGAAGGAAGTGGTGGCCAGCTTCAAGAACTTCACCACCTACTTCACCGGCTTCT

TCACCAACAGAAAGAACATGTACAGCGACGGCGAAGAGTCTACCGCTATTGCCTACA

GATGCATCAACGAGAACCTGCCCAAGCACCTGGACAACGTGAAGGTGTTCGAGAAG

GCCATCAGCAAGCTGAGCAAGAACGCCATCGACGACCTGGATGCCACATATTCTGGCC

TGTGCGGCACAAATCTGTACGACGTGTTCACCGTGGACTACTTCAACTTCCTGCTGCC

CCAAAGCGGAATCACCGAGTACAACAAGATCATCGGCGGCTACACAACAAGCGACGG

CACCAAAGTGAAGGGCATCAACGAGTACATCAACCTGTACAACCAGCAGGTGAGCAA

GAGAGACAAGATCCCCAACCTGAAGATCCTGTACAAGCAGATCCTGAGCGAGAGCGA

GAAGGTGTCTTTCATCCCCCCCAAGTTCGAGGACGACAACGAACTGCTGTCTGCCGT

GAGCGAGTTCTATGCCAACGACGAGACATTTGATGGCATGCCCCTGAAGAAAGCCATC

GACGAAACCAAACTGCTGTTCGGCAACCTGGACAACAGCAGCCTGAACGGCATCTAC

ATCCAGAACGACAGAAGCGTGACCAACCTGAGCAACAGCATGTTCGGCAGCTGGAG

CGTGATTGAGGACCTGTGGAACAAGAACTACGACAGCGTGAACAGCAACAGCAGAA

TCAAGGACATCCAGAAGAGAGAGGACAAGAGAAAGAAGGCCTACAAGGCCGAGAA

GAAGCTGAGCCTGAGCTTCCTGCAGGTGCTGATCAGCAACAGCGAGAACGACGAGAT

CAGAAAGAAGAGCATCGTGGACTACTACAAGACCAGCCTGATGCAGCTGACCGACAA

CCTGAGCGACAAGTACAAAGAAGCCGCCCCCCTGTTTTCTGAGAACTACGACAACGA

GAAGGGCCTGAAGAACGACGACAAGAGCATCAGCCTGATCAAGAACTTCCTGGACG

CCATCAAGGAGATCGAGAAGTTCATCAAGCCCCTGAGCGAGACAAATATCACCGGCG

AGAAGAACGACCTGTTCTACAGCCAGTTCACCCCCCTGCTGGACAACATCAGCAGAA

TCGACAGACTGTACGACAAGGTGAGAAACTACGTGACCCAGAAGCCCTTCAGCACCG

ACAAGATCAAGCTGAACTTCGGCAACAGCCAGCTTCTGAACGGCTGGGACAGAAAC

AAGGAGAAGGACTGTGGCGCTGTGCTGCTGTGTAAGGACGAGAAGTACTACCTGGCC

ATCATCGACAAGAGCAACAACAGCATCCTGGAGAACATCGACTTCCAGGACTGCAAC

GAGAGCGACTACTACGAGAAGATCGTGTACAAGCTGCTGACCAAGATCTCTGGCAAC

CTGCCCAGAGTGTTCTTCAGCGAGAAGCACAAGAAGCTGCTGAGCCCCAGCGATGAG

ATCCTGAAGATCTACAAGAGCGGCACCTTCAAGAAGGGCGACAAGTTCAGCCTTGAC

GACTGCCACAAGCTGATCGACTTCTACAAGGAGAGCTTCAAGAAGTACCCCAAGTGG

CTGATCTACAACTTCAAGTTCAAGAACACCAACGAGTACAACGACATCAGCGAGTTCT

ACAACGACGTGGCCAGCCAGGGATACAACATCAGCAAGATGAAGATCCCCACCAGCT

TCATCGACAAGCTGGTGGACGAGGGCAAGATCTACCTGTTCCAGCTGTACAACAAGG

ACTTCAGCCCCCACAGCAAGGGAACACCTAACCTGCACACCCTGTACTTCAAGATGC

TGTTCGACGAGAGAAACCTGGAGGACGTGGTGTACAAGCTGAATGGCGAGGCCGAG

ATGTTTTACAGACCCGCCAGCATCAAGTATGACAAGCCCACCCACCCTAAGAACACCC

CCATCAAGAACAAGAACACCCTGAACGACAAGAAGGCCAGCACCTTCCCCTACGACC

TGATCAAGGACAAGAGATACACCAAGTGGCAGTTCAGCCTGCACTTCCCCATCACCAT

GAACTTCAAGGCCCCCGACAGAGCCATGATCAACGACGACGTGAGAAACCTGCTGA

AGAGCTGCAACAACAACTTCATCATCGGCATCGACAGAGGCGAGAGAAACCTGCTGT

ACGTGAGCGTGATCGATAGCAACGGCGCCATCATCTACCAGCACAGCCTGAACATCAT

CGGCAACAAGTTCAAGGGCAAGACCTACGAAACCAACTACAGAGAGAAGCTGGCCA

CCAGAGAGAAGGAGAGAACCGAGCAGAGAAGAAACTGGAAGGCCATCGAGAGCAT

CAAGGAGCTGAAGGAGGGCTACATCAGCCAAACCGTGCACGTGATTTGCCAGCTGGT

GGTGAAGTACGACGCCATCATCGTGATGGAGAAGCTGACCGACGGCTTCAAGAGAGG

CAGAACCAAGTTCGAGAAGCAGGTGTACCAGAAGTTCGAGAAGATGCTGATCGACA

AGCTGAACTACTACGTGGACAAGAAGCTGGACCCCAATGAGGAAGGCGGACTGCTG

CATGCTTATCAGCTGACCAACAAGCTGGACAGCTTCGACAAGCTGGGAATGCAGAGC

GGCTTCATCTTCTACGTCAGACCCGACTTCACCAGCAAAATCGACCCCGTGACCGGAT

TTGTGAACCTGCTGTACCCCAGATACGAGAACATCGACAAGGCCAAGGACATGATCA

GCAGATTCGACGACATCAGATACAACGCCGGCGAGGACTTCTTCGAGTTCGACATCG

ACTACGACAAGTTCCCCAAGACCGCCAGCGACTACAGAAAGAAGTGGACCATCTGCA

CCAACGGCGAGAGAATCGAGGCCTTCAGAAACCCCGCCAACAACAACGAGTGGAGC

TACAGAACCATCATCCTGGCCGAGAAGTTCAAGGAGCTGTTCGACAACAACAGCATC

AACTACAGAGACAGCGACGACCTGAAAGCCGAGATCCTGAGCCAAACCAAGGGCAA

GTTCTTCGAGGACTTCTTCAAGCTGCTGAGACTGACCCTGCAGATGAGAAACAGCAA

CCCCGAAACCGGAGAGGACAGGATTCTGAGCCCCGTGAAGGACAAGAACGGCAACT

TCTACGACAGCAGCAAGTACGACGAGAAGAGCAAGCTGCCCTGTGACGCTGATGCTA

ACGGCGCTTACAACATCGCCAGAAAGGGCCTGTGGATCGTGGAGCAGTTCAAGAAGG

CCGACAACGTGTCTGCTGTGGAACCCGTGATCCACAACGACAAGTGGCTGAAGTTCG

TGCAGGAGAACGACATGGCCAACAACAAAAGGCCGGCGGCCACGAAAAAGGCCGG

CCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGCAGAGGAAGTCTGCTAA

CATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAGGGCGAGGAGCTGTTC

ACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTC

AGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTC

ATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCT

ACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCA

AGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACG

GCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGC

ATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTG

GAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGC

ATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCC

GACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAAC

CACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCAC

ATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGT

ACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAG

CCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCA

CTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCT

ATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGAGAATAG

CAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATGGAGTTGGCCACTCCCTC

TCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGTCGCCCGACGCCCGGG

CTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCTCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGCTGCCTGCAGGGGCGCCT

GATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGTCAAAGCAGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATACGTCAAAGCA

ACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGC

AGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCAGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCTCCTTTTCGCTTTCTTCCCTTC

CTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTACTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGGCTCCCTTTA

GGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGATGGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGATG

GTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTC

CACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACTCTATCTCGCACGTTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACTCTATCTCG

GGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGTCTATTGGTTAAAAAATGAGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGTCTATTGGTTAAAAAATGAG

CTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTTATGGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTTATGG

TGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGCTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACCCGC

CAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACCAACACCCGCTGACGCGCCCTGACGGCTTGTCTGCTCCCGGCATCCGCTTACAGAC

AAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGA

AACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATA

ATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTA

TTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATTTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGAT

AAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCC

CTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGT

GAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGAGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGA

TCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGTCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATG

AGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGA

GCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTC

ACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAAACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAA

CCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGCCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGG

AGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGA

ACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGC

AATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGG

CAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGG

CCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGAAGCCGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGAAGCCG

CGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACA

CGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTG

CCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTG

ATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCAATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCA

TGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCGTAGAAAA

GATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAA

AAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTT

TCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAGTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAG

CCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGC

TAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGATAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGA

CTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTG

CACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGACACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGA

GCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGCAGGTATCCGGTAAG

CGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGT

ATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGC

TCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGT (SEQ ID NO: 35).

PX458-HBG-SG01:

GAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTGGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCCTTGTCAAGGCTATTGGTCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTTTAGCGCGTGCGCCAATTCTGCAGACAAATGGCTCTAGAGGTACCCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTGTGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCC

CCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTGTGC

AGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGCAGCGATGGGGGCGGGGGGGGGGGGGGGGCGCGCCAGGCGGGGCGGGGCGGGGC

GAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGGAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCGGCG

CGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGC

GAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCGGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGCCTTCGCCCCGTGCCCCGCTCCG

CCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGACCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGA

GCGGGCGGGACGGCCCTTCTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAAGGGTGCGGGCGGGACGGCCCTTTCCTCCGGGCTGTAATTAGCTGAGCAAGAGGTAAGGGT

TTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCCTGAATTAAGGGATGGTTGGTTGGTGGGGTATTAATGTTTAATTACCTGGAGCACCTGCCTGAA

ATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACGGAGAATCACTTTTTTTCAGGTTGGACCGGTGCCACCATGGACTATAAGGACCACGACGGAGA

CTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAAGAAGCTACAAGGATCATGATATTGATTACAAAGACGATGACGATAAGATGGCCCCAAAGAAG

AAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCGACAAGAAGTACAGCATCGGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCGACAAGAAGTACAGCATCGG

CCTGGACATCGGCACCAACTCTGTGGGCTGGGCCGTGATCACCGACGAGTACAAGGTCCTGGACATCGGCACCAACTCTGTGGGCTGGGCCGTGATCACCGACGAGTACAAGGT

GCCCAGCAAGAAATTCAAGGTGCTGGGCAACACCGACCGGCACAGCATCAAGAAGAGCCCAGCAAGAAATTCAAGGTGCTGGGCAACACCGACCGGCACAGCATCAAGAAGA

ACCTGATCGGAGCCCTGCTGTTCGACAGCGGCGAAACAGCCGAGGCCACCCGGCTGAACCTGATCGGAGCCCTGCTGTTCGACAGCGGCGAAACAGCCGAGGCCACCCGGCTGA

AGAGAACCGCCAGAAGAAGATACACCAGACGGAAGAACCGGATCTGCTATCTGCAAAGAGAACCGCCAGAAGAAGATACACCAGACGGAAGAACCGGATCTGCTATCTGCAA

GAGATCTTCAGCAACGAGATGGCCAAGGTGGACGACAGCTTCTTCCACAGACTGGAAGAGATCTTCAGCAACGAGATGGCCAAGGTGGACGACAGCTTTCTTCCACAGACTGGAA

GAGTCCTTCCTGGTGGAAGAGGATAAGAAGCACGAGCGGCACCCCATCTTCGGCAACGAGTCCTTCCTGGTGGAAGAGGATAAGAAGCACGAGCGGCACCCCATCTTCGGCAAC

ATCGTGGACGAGGTGGCCTACCACGAGAAGTACCCCACCATCTACCACCTGAGAAAGATCGTGGACGAGGTGGCCTACCACGAGAAGTAACCCACCATCTACCACCTGAGAAAG

AAACTGGTGGACAGCACCGACAAGGCCGACCTGCGGCTGATCTATCTGGCCCTGGCCAAACTGGTGGACAGCACCGACAAGGCCGACCTGCGGCTGATCTATCTGGCCCTGGCC

CACATGATCAAGTTCCGGGGCCACTTCCTGATCGAGGGCGACCTGAACCCCGACAACCACATGATCAAGTTCCGGGGCCACTTCCTGATCGAGGGCGACCTGAACCCCGACAAC

AGCGACGTGGACAAGCTGTTCATCCAGCTGGTGCAGACCTACAACCAGCTGTTCGAGAGCGACGTGGACAAGCTGTTCATCCAGCTGGTGCAGACCTACAACCAGCTGTTCGAG

GAAAACCCCATCAACGCCAGCGGCGTGGACGCCAAGGCCATCCTGTCTGCCAGACTGGAAAACCCCATCAACGCCAGCGGCGTGGACGCCAAGGCCATCCTGTCTGCCAGACTG

AGCAAGAGCAGACGGCTGGAAAATCTGATCGCCCAGCTGCCCGGCGAGAAGAAGAAAGCAAGAGCAGACGGCTGGAAAATCTGATCGCCCAGCTGCCCGGCGAGAAGAAGAA

TGGCCTGTTCGGAAACCTGATTGCCCTGAGCCTGGGCCTGACCCCCAACTTCAAGAGTGGCCTGTTCGGAAACCTGATTGCCCTGAGCCTGGGCCTGACCCCCAACTTCAAGAG

CAACTTCGACCTGGCCGAGGATGCCAAACTGCAGCTGAGCAAGGACACCTACGACGACAACTTCGACCTGGCCGAGGATGCCAAACTGCAGCTGAGCAAGGACACCTACGACGA

CGACCTGGACAACCTGCTGGCCCAGATCGGCGACCAGTACGCCGACCTGTTTCTGGCCGACCTGGACAACCTGCTGGCCCAGATCGGCGACCAGTACGCCGACCTGTTTCTGGC

CGCCAAGAACCTGTCCGACGCCATCCTGCTGAGCGACATCCTGAGAGTGAACACCGACGCCAAGAACCTGTCCGACGCCATCCTGCTGAGCGACATCCTGAGAGTGAACACCGA

GATCACCAAGGCCCCCCTGAGCGCCTCTATGATCAAGAGATACGACGAGCACCACCAGATCACCAAGGCCCCCCTGAGCGCCTCTATGATCAAGAGATACGACGAGCACCACCA

GGACCTGACCCTGCTGAAAGCTCTCGTGCGGCAGCAGCTGCCTGAGAAGTACAAAGAGGACCTGACCCTGCTGAAAGCTCTCGTGCGGCAGCAGCTGCCTGAGAAGTACAAAGA

GATTTTCTTCGACCAGAGCAAGAACGGCTACGCCGGCTACATTGACGGCGGAGCCAGGATTTTCTTCGACCAGAGCAAGAACGGGCTACGCCGGCTACATTGACGGCGGAGCCAG

CCAGGAAGAGTTCTACAAGTTCATCAAGCCCATCCTGGAAAAGATGGACGGCACCGACCAGGAAGAGTTCTACAAGTTCATCAAGCCCATCCTGGAAAAGATGGACGGCACCGA

GGAACTGCTCGTGAAGCTGAACAGAGAGGACCTGCTGCGGAAGCAGCGGACCTTCGGGAACTGCTCGTGAAGCTGAACAGAGAGGACCTGCTGCGGAAGCAGCGGACCTTCG

ACAACGGCAGCATCCCCCACCAGATCCACCTGGGAGAGCTGCACGCCATTCTGCGGCACAACGGCAGCATCCCCCACCAGATCCACCTGGGAGAGCTGCACGCCATTCTGCGGC

GGCAGGAAGATTTTTACCCATTCCTGAAGGACAACCGGGAAAAGATCGAGAAGATCCGGCAGGAAGATTTTTACCCATTCCTGAAGGACAACCGGGAAAAGATCGAGAAGATCC

TGACCTTCCGCATCCCCTACTACGTGGGCCCTCTGGCCAGGGGAAACAGCAGATTCGCTGACCTTCCGCATCCCCTACTACGTGGGCCCTCTGGCCAGGGGAAACAGCAGATTCGC

CTGGATGACCAGAAAGAGCGAGGAAACCATCACCCCCTGGAACTTCGAGGAAGTGGCTGGATGACCAGAAAGAGCGAGGAAACCATCACCCCCTGGAACTTCGAGGAAGTGG

TGGACAAGGGCGCTTCCGCCCAGAGCTTCATCGAGCGGATGACCAACTTCGATAAGATGGACAAGGGCGCTTCCGCCCAGAGCTTCATCGAGCGGATGACCAACTTCGATAAGA

ACCTGCCCAACGAGAAGGTGCTGCCCAAGCACAGCCTGCTGTACGAGTACTTCACCGACCTGCCCCAACGAGAAGGTGCTGCCCAAGCACAGCCTGCTGTACGAGTACTTCACCG

TGTATAACGAGCTGACCAAAGTGAAATACGTGACCGAGGGAATGAGAAAGCCCGCCTTGTATAACGAGCTGACCAAAGTGAAATACGTGACCGAGGGAATGAGAAAGCCCGCCT

TCCTGAGCGGCGAGCAGAAAAAGGCCATCGTGGACCTGCTGTTCAAGACCAACCGGTCCTGAGCGGCGAGCAGAAAAAGGCCATCGTGGACCTGCTGTTCAAGACCAACCGG

AAAGTGACCGTGAAGCAGCTGAAAGAGGACTACTTCAAGAAAATCGAGTGCTTCGAAAAGTGACCGTGAAGCAGCTGAAAGAGGACTACTTCAAGAAAATCGAGTGCTTCGA

CTCCGTGGAAATCTCCGGCGTGGAAGATCGGTTCAACGCCTCCCTGGGCACATACCACCTCCGTGGAAATCTCCGGCGTGGAAGATCGGTTCAACGCCTCCCTGGGCACATACCAC

GATCTGCTGAAAATTATCAAGGACAAGGACTTCCTGGACAATGAGGAAAACGAGGACGATCTGCTGAAAATTATCAAGGACAAGGACTTCCTGGACAATGAGGAAAACGAGGAC

ATTCTGGAAGATATCGTGCTGACCCTGACACTGTTTGAGGACAGAGAGATGATCGAGGATTCTGGAAGATATCGTGCTGACCCTGACACTGTTTGAGGACAGAGAGATGATCGAGG

AACGGCTGAAAACCTATGCCCACCTGTTCGACGACAAAGTGATGAAGCAGCTGAAGCAACGGCTGAAAACCTATGCCCACCTGTTCGACGACAAAGTGATGAAGCAGCTGAAGC

GGCGGAGATACACCGGCTGGGGCAGGCTGAGCCGGAAGCTGATCAACGGCATCCGGGGCGGAGATACACCGGCTGGGGCAGGCTGAGCCGGAAGCTGATCAACGGCATCCGG

GACAAGCAGTCCGGCAAGACAATCCTGGATTTCCTGAAGTCCGACGGCTTCGCCAACGACAAGCAGTCCGGCAAGACAATCCTGGATTTCCTGAAGTCCGACGGCTTCGCCAAC

AGAAACTTCATGCAGCTGATCCACGACGACAGCCTGACCTTTAAAGAGGACATCCAGAGAAACTTCATGCAGCTGATCCACGACGACAGCCTGACCTTTAAAGAGGACATCCAG

AAAGCCCAGGTGTCCGGCCAGGGCGATAGCCTGCACGAGCACATTGCCAATCTGGCCAAAGCCCAGGTGTCCGGCCAGGGCGATAGCCTGCACGAGCACATTGCCAATCTGGCC

GGCAGCCCCGCCATTAAGAAGGGCATCCTGCAGACAGTGAAGGTGGTGGACGAGCTCGGCAGCCCCGCCATTAAGAAGGGCATCCTGCAGACAGTGAAGGTGGTGGACGAGCTC

GTGAAAGTGATGGGCCGGCACAAGCCCGAGAACATCGTGATCGAAATGGCCAGAGAGTGAAAGTGATGGGCCGGCACAAGCCCGAGAACATCGTGATCGAAATGGCCAGAGA

GAACCAGACCACCCAGAAGGGACAGAAGAACAGCCGCGAGAGAATGAAGCGGATCGAACCAGACCACCCAGAAGGGACAGAAGAACAGCCGCGAGAGAATGAAGCGGATC

GAAGAGGGCATCAAAGAGCTGGGCAGCCAGATCCTGAAAGAACACCCCGTGGAAAAGAAGAGGGCATCAAAGAGCTGGGCAGCCAGATCCTGAAAGAACACCCCGTGGAAAA

CACCCAGCTGCAGAACGAGAAGCTGTACCTGTACTACCTGCAGAATGGGCGGGATATCACCCAGCTGCAGAACGAGAAGCTGTACCTGTACTACCTGCAGAATGGGCGGGATAT

GTACGTGGACCAGGAACTGGACATCAACCGGCTGTCCGACTACGATGTGGACCATATCGTACGTGGACCAGGAACTGGACATCAACCGGCTGTCCGACTACGATGTGGACCATATC

GTGCCTCAGAGCTTTCTGAAGGACGACTCCATCGACAACAAGGTGCTGACCAGAAGCGTGCCTCAGAGCTTTCTGAAGGACGACTCCATCGACAACAAGGTGCTGACCAGAAGC

GACAAGAACCGGGGCAAGAGCGACAACGTGCCCTCCGAAGAGGTCGTGAAGAAGATGACAAGAACCGGGGCAAGAGCGACAACGTGCCCTCCGAAGAGGTCGTGAAGAAGAT

GAAGAACTACTGGCGGCAGCTGCTGAACGCCAAGCTGATTACCCAGAGAAAGTTCGAGAAGAACTACTGGCGGCAGCTGCTGAACGCCAAGCTGATTACCCAGAGAAAGTTCGA

CAATCTGACCAAGGCCGAGAGAGGCGGCCTGAGCGAACTGGATAAGGCCGGCTTCATCAATCTGACCAAGGCCGAGAGAGGCGGCCTGAGCGAACTGGATAAGGCCGGCTTCAT

CAAGAGACAGCTGGTGGAAACCCGGCAGATCACAAAGCACGTGGCACAGATCCTGGCAAGAGACAGCTGGTGGAAACCCGGCAGATCACAAAGCACGTGGCACAGATCCTGG

ACTCCCGGATGAACACTAAGTACGACGAGAATGACAAGCTGATCCGGGAAGTGAAAGACTCCCGGATGAACACTAAGTACGACGAGAATGACAAGCTGATCCGGGAAGTGAAAG

TGATCACCCTGAAGTCCAAGCTGGTGTCCGATTTCCGGAAGGATTTCCAGTTTTACAATGATCACCCTGAAGTCCAAGCTGGTGTCCGATTTCCGGAAGGATTTCCAGTTTTACAA

AGTGCGCGAGATCAACAACTACCACCACGCCCACGACGCCTACCTGAACGCCGTCGTAGTGCGCGAGATCAACAACTACCACCACGCCCACGACGCCTACCTGAACGCCGTCGT

GGGAACCGCCCTGATCAAAAAGTACCCTAAGCTGGAAAGCGAGTTCGTGTACGGCGAGGGAACCGCCCTGATCAAAAAGTACCCTAAGCTGGAAAGCGAGTTCGTGTACGGCGA

CTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCA

AGGCTACCGCCAAGTACTTCTTCTACAGCAACATCATGAACTTTTTCAAGACCGAGATAGGCTACCGCCAAGTACTTCTTCTACAGCAACATCATGAACTTTTTCAAGACCGAGAT

TACCCTGGCCAACGGCGAGATCCGGAAGCGGCCTCTGATCGAGACAAACGGCGAAACTACCCTGGCCAACGGCGAGATCCGGAAGCGGCCTCTGATCGAGACAAACGGCGAAAC

CGGGGAGATCGTGTGGGATAAGGGCCGGGATTTTGCCACCGTGCGGAAAGTGCTGAGCGGGGAGATCGTGTGGGATAAGGGCCGGGATTTTGCCACCGTGCGGAAAGTGCTGAG

CATGCCCCAAGTGAATATCGTGAAAAAGACCGAGGTGCAGACAGGCGGCTTCAGCAACATGCCCCAAGTGAATATCGTGAAAAAGACCGAGGTGCAGACAGGCGGCTTCAGCAA

AGAGTCTATCCTGCCCAAGAGGAACAGCGATAAGCTGATCGCCAGAAAGAAGGACTGAGAGTCTATCCTGCCCAAGAGGAACAGCGATAAGCTGATCGCCAGAAAGAAGGACTG

GGACCCTAAGAAGTACGGCGGCTTCGACAGCCCCACCGTGGCCTATTCTGTGCTGGTGGGACCCTAAGAAGTACGGCGGCTTCGACAGCCCCACCGTGGCCTATTCTGTGCTGGTG

GTGGCCAAAGTGGAAAAGGGCAAGTCCAAGAAACTGAAGAGTGTGAAAGAGCTGCTGTGGCCAAAGTGGAAAAGGGCAAGTCCAAGAAACTGAAGAGTGTGAAAGAGCTGCT

GGGGATCACCATCATGGAAAGAAGCAGCTTCGAGAAGAATCCCATCGACTTTCTGGAGGGGATCACCATCATGGAAAGAAGCAGCTTCGAGAAGAATCCCATCGACTTTCTGGA

AGCCAAGGGCTACAAAGAAGTGAAAAAGGACCTGATCATCAAGCTGCCTAAGTACTCAGCCAAGGGCTACAAAGAAGTGAAAAAGGACCTGATCATCAAGCTGCCTAAGTACTC

CCTGTTCGAGCTGGAAAACGGCCGGAAGAGAATGCTGGCCTCTGCCGGCGAACTGCACCTGTTCGAGCTGGAAAACGGCCGGAAGAGAATGCTGGCCTCTGCCGGCGAACTGCA

GAAGGGAAACGAACTGGCCCTGCCCTCCAAATATGTGAACTTCCTGTACCTGGCCAGGAAGGGAAACGAACTGGCCCTGCCCTCCAAATATGTGAACTTCCTGTACCTGGCCAG

CCACTATGAGAAGCTGAAGGGCTCCCCCGAGGATAATGAGCAGAAACAGCTGTTTGTCCACTATGAGAAGCTGAAGGGCTCCCCCGAGGATAATGAGCAGAAACAGCTGTTTGT

GGAACAGCACAAGCACTACCTGGACGAGATCATCGAGCAGATCAGCGAGTTCTCCAAGGAACAGCACAAGCACTACCTGGACGAGATCATCGAGCAGATCAGCGAGTTCTCCAA

GAGAGTGATCCTGGCCGACGCTAATCTGGACAAAGTGCTGTCCGCCTACAACAAGCAGAGAGTGATCCTGGCCGACGCTAATCTGGACAAAGTGCTGTCCGCCTACAACAAGCA

CCGGGATAAGCCCATCAGAGAGCAGGCCGAGAATATCATCCACCTGTTTACCCTGACCCCGGGATAAGCCCATCAGAGAGCAGGCCGAGAATATCATCCACCTGTTTACCCTGACC

AATCTGGGAGCCCCTGCCGCCTTCAAGTACTTTGACACCACCATCGACCGGAAGAGGAATCTGGGAGCCCCTGCCGCCTTCAAGTACTTTGACACCACCATCGACCGGAAGAGG

TACACCAGCACCAAAGAGGTGCTGGACGCCACCCTGATCCACCAGAGCATCACCGGCTACACCAGCACCAAAGAGGTGCTGGACGCCACCCTGATCCACCAGAGCATCACCGGC

CTGTACGAGACACGGATCGACCTGTCTCAGCTGGGAGGCGACAAAAGGCCGGCGGCCTGTACGAGACACGGATCGACCTGTCTCAGCTGGGAGGCGACAAAAGGCCGGCGGC

CACGAAAAAGGCCGGCCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGCCACGAAAAAGGCCGGCCAGGCAAAAAAGAAAAAGGAATTCGGCAGTGGAGAGGGC

AGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAGAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAGTGAGCAAG

GGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAGGCGAGGAGCTGTTCACCGGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTA

AACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAA

GCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCT

CGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAACGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAA

GCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCAT

CTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCG

ACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACAACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACA

TCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGA

CAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGCAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACG

GCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCG

TGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCATGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCA

ACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTC

TCGGCATGGACGAGCTGTACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTTCGGCATGGACGAGCTGTACAAGGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACT

GTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCT

GGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGT

CTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAG

GATTGGGAAGAGAATAGCAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATGGATTGGGAAGAGAATAGCAGGCATGCTGGGGAGCGGCCGCAGGAACCCCTAGTGATG

GAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAGGAGTTGGCCACTCCCTCTCTGCGCGCTCGCTCGCTCACTGAGGCCGGGCGACCAAAG

GTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAGGTCGCCCGACGCCCGGGCTTTGCCCGGGCGGCCTCAGTGAGCGAGCGAGCGCGCAG

CTGCCTGCAGGGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACCTGCCTGCAGGGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCAC

ACCGCATACGTCAAAGCAACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCACCGCATACGTCAAAGCAACCATAGTACGCGCCCTGTAGCGGCGCATTAAGCGCGGC

GGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCTTAGCGCCCGC

TCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTCCTTTCGCTTTCTTCCCTTCCTTTCTCCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTC

TAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAATAAATCGGGGCTCCCTTTAGGGTCCGATTTAGTGCTTTACGGCACCTCGACCCCAA

AAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTT

CGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAA

CAACACTCAACTCTATCTCGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCAACACTCAACTCTATCTCGGGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGG

TCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATA

TTAACGTTTACAATTTTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTATTAACGTTTACAATTTTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTA

AGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCAGCCAGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTC

CCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGG

TTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTTTTCACCGTCATCACCGAAACGCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTT

TATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGATATAGGTTAATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGA

AATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTC

ATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTAT

TCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGGCATTTTGCCTTCCTGTTTTTGC

TCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTTCACCCAAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGT

GGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAA

GAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCG

TATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTG

GTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAAGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAA

TTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAA

CGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAA

CTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGCTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTG

ACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACT

ACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCA

GGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTTATTGCTGATAAATCTGGAG

CCGGTGAGCGTGGAAGCCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCGGTGAGCGTGGAAGCCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCT

CCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGCCCGTATCGTAGTTATCTACACGACGGGGGAGTCAGGCAACTATGGATGAACGAAATAG

ACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTT

TACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTG

AAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTG

AGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGC

GTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCG

GATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATAC

CAAATACTGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCCAAATACTGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGC

ACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATA

AGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTC

GGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGGGGCTGAACGGGGGGTTTCGTGCACAGCCCAGCTTGGAGCGAACGACCTACACCG

AACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAA

AGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAG

CTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTCTTCCAGGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACT

TGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGT (SEQ ID NO: 36).
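As an illustration of how the guide sequence can be located inside a plasmid such as PX458-HBG-SG01, the sketch below pulls the spacer out of a short excerpt of the sequence above. It assumes the spacer lies between the 3' end of the U6 promoter (`GAAACACC`) and the start of the sgRNA scaffold (`GTTTTAGAGCTAGAAATAGC`). Both flank constants and the function name `extract_spacer` are illustrative choices and are not part of the original disclosure.

```python
# Assumed flanking motifs (illustrative, not stated in the patent text).
U6_END = "GAAACACC"                      # 3' end of the U6 promoter
SCAFFOLD_START = "GTTTTAGAGCTAGAAATAGC"  # 5' end of the sgRNA scaffold

# Short excerpt copied from the PX458-HBG-SG01 sequence listed above.
EXCERPT = (
    "CTTGTGGAAAGGACGAAACACC"
    "CTTGTCAAGGCTATTGGTCA"
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCG"
)


def extract_spacer(plasmid: str) -> str:
    """Return the bases between the U6 promoter end and the scaffold start."""
    seq = plasmid.upper().replace(" ", "")
    u6_pos = seq.find(U6_END)
    if u6_pos == -1:
        raise ValueError("U6 promoter end not found")
    start = u6_pos + len(U6_END)
    end = seq.find(SCAFFOLD_START, start)
    if end == -1:
        raise ValueError("sgRNA scaffold start not found")
    return seq[start:end]


spacer = extract_spacer(EXCERPT)
print(spacer, len(spacer))  # CTTGTCAAGGCTATTGGTCA 20
```

Under these assumptions the excerpt yields a 20-nt spacer. Running the same function on the full plasmid sequence should return the same result as long as the first `GAAACACC` match is the U6 promoter end.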

After obtaining the commercially synthesized plasmids and strains carrying the plasmid sequences required for the BES4 activity assay, amplify them by direct inoculation:

(a) Take 15 mL of antibiotic-free LB liquid medium and add 15 μL of 1000X Amp antibiotic. Use a white pipette tip to pick the strain carrying the target plasmid, place it in the medium, and culture overnight at 37°C and 200 rpm;

(b) Centrifuge the overnight culture at 8000 rpm for 3 min to pellet the bacteria, and discard the medium;

(c) Extract the plasmid using the Tiangen plasmid mini kit or the Tiangen endotoxin-free plasmid mini/midi-scale kit;

(d) After extraction, quantify the plasmid concentration using a Nanodrop and store the plasmid at -20°C.
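The antibiotic volume in step (a) follows from a simple dilution: a 1000X stock is diluted to 1X in the medium. A minimal sketch of that arithmetic is shown below. The function name `stock_volume_ul` is hypothetical, and the calculation neglects the small volume contributed by the stock itself.

```python
def stock_volume_ul(medium_ml: float, stock_fold: float = 1000.0,
                    final_fold: float = 1.0) -> float:
    """Volume of concentrated stock (in µL) to add to `medium_ml` of medium.

    Approximation: the added stock volume is ignored when computing the
    final concentration, which is reasonable for a 1000-fold dilution.
    """
    if medium_ml <= 0 or stock_fold <= 0 or final_fold <= 0:
        raise ValueError("all arguments must be positive")
    return medium_ml * 1000.0 * final_fold / stock_fold


print(stock_volume_ul(15))  # 15.0, matching the 15 µL used in step (a)
```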

(3) Transfer of plasmids into human cells

(a) Plasmid transfection was performed using the Lipo3000 kit (1.5 μg of plasmid per well);

(b) After transfection, culture the cells for 2-3 days to allow gene editing to proceed fully, then harvest the cells;

(c) After the cell culture is complete, aspirate the culture medium with a pipette tip and add 200 μL of 0.5 M EDTA solution to each well of the 12-well plate. Let the plate stand for ten minutes, resuspend the cells by pipetting, transfer them to an EP tube, and centrifuge at 12000 rpm for 1 min. Remove the supernatant and recover the cells;
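Because the plasmid input is specified by mass (1.5 μg per well) while the Nanodrop reports a concentration, the volume to pipette has to be computed for each preparation. The sketch below shows that conversion. The function name `plasmid_volume_ul` and the example concentration of 500 ng/μL are hypothetical values chosen for illustration only.

```python
def plasmid_volume_ul(mass_ug: float, conc_ng_per_ul: float) -> float:
    """Volume (µL) of plasmid stock that contains `mass_ug` micrograms of DNA."""
    if mass_ug <= 0 or conc_ng_per_ul <= 0:
        raise ValueError("mass and concentration must be positive")
    return mass_ug * 1000.0 / conc_ng_per_ul  # 1 µg = 1000 ng


# Hypothetical Nanodrop reading of 500 ng/µL for a 1.5 µg input per well.
print(plasmid_volume_ul(1.5, 500.0))  # 3.0
```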

(4) Identification of editing activity

After harvesting the cells, extract genomic DNA and perform a T7E1 digestion assay to detect activity. The steps are as follows:

(a) Genomic DNA extraction: Genomic DNA was extracted using a genomic DNA extraction kit (Tiangen), and the gDNA concentration was measured using a Nanodrop;

(b) Target region PCR: GXL Prime was used to amplify the target site region from gDNA. The amplification primers are shown in Table 7 below. All deoxynucleotide sequences used were synthesized at the Shenzhen National Gene Bank Synthesis and Editing Platform. The PCR products were purified using a PCR purification and gel extraction kit (MN). The purity of the PCR products was checked by agarose gel electrophoresis, and the concentration was measured using a Nanodrop.

(c) Denaturation and annealing: Use a Bio-rad PCR instrument to denature and anneal the purified product from step (b). Each T7E1 digestion reaction uses an equal amount of substrate DNA (about 200-300 ng per reaction, in a 10 μL reaction volume).

(d) T7E1 digestion: Add 0.2 μL of T7E1 nuclease to the 10 μL sample from step (c) and digest at 37°C for 20 minutes.

(e) Activity detection: After the T7E1 cleavage reaction is complete, add loading buffer and detect the bands by agarose gel electrophoresis.
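The protocol above describes only a qualitative gel readout. If a numerical estimate is wanted, a commonly used approximation for mismatch-cleavage assays converts band intensities into an indel percentage as $100 \times (1 - \sqrt{1 - f_{cut}})$, where $f_{cut}$ is the cleaved fraction of the total signal. The sketch below applies that approximation. It is not a method stated in this document, and the function name and example intensities are hypothetical.

```python
import math


def t7e1_indel_percent(uncut: float, cut_bands: list[float]) -> float:
    """Estimate indel percentage from gel band intensities.

    uncut     -- integrated intensity of the undigested PCR product band
    cut_bands -- integrated intensities of the cleavage product bands
    """
    cut = sum(cut_bands)
    total = uncut + cut
    if total <= 0 or uncut < 0 or any(b < 0 for b in cut_bands):
        raise ValueError("band intensities must be non-negative with a positive total")
    fraction_cleaved = cut / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical intensities: 75 (uncut) and 15 + 10 (two cleavage bands).
print(round(t7e1_indel_percent(75, [15, 10]), 1))  # 13.4
```

The square root accounts for the fact that cleavable heteroduplexes form only when a mutant strand re-anneals with a non-identical strand, so the cleaved fraction overstates the fraction of edited alleles.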

Table 7: List of PCR amplification primers

As shown in FIG. 13, the sg03 plasmid of BES4 has editing activity in human cells.

In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Therefore, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, such as two or three, unless otherwise clearly and specifically defined.

In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that the specific features, structures, materials or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.

Although embodiments of the present invention have been shown and described above, it is to be understood that these embodiments are exemplary and are not to be construed as limiting the present invention. A person of ordinary skill in the art may change, modify, replace and vary the above embodiments within the scope of the present invention.

Appendix I: Novel Type II Crispr-Cas System

Appendix II: Novel Cpf1 (Type V) System

Claims (10)

1. A Cas protein, comprising:
the amino acid sequence shown in SEQ ID NO. 4.
2. A nucleic acid sequence encoding the Cas protein of claim 1.
3. The nucleic acid sequence of claim 2, wherein the nucleic acid sequence is DNA or RNA.
4. An expression vector comprising the nucleic acid sequence of claim 2 or 3.
5. A recombinant cell comprising the expression vector of claim 4, wherein the recombinant cell is a non-plant cell.
6. The recombinant cell of claim 5, wherein the recombinant cell is a eukaryotic cell.
7. The recombinant cell of claim 6, wherein the recombinant cell is an animal cell.
8. A Crispr-Cas system comprising the Cas protein of claim 1.
9. The system of claim 8, further comprising at least one of: crRNA, tracrRNA, or a chimeric RNA formed from crRNA and tracrRNA.
10. Use of the Cas protein of claim 1, the nucleic acid sequence of claim 2 or 3, the expression vector of claim 4, the recombinant cell of any one of claims 5-7, or the Crispr-Cas system of claim 8 or 9 in the field of gene editing for non-disease diagnosis or treatment.
CN202310742030.9A 2019-05-14 2020-05-13 Novel Cas protein, crispr-Cas system and use thereof in the field of gene editing Pending CN116694603A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201910399082 2019-05-14
CN2019103990824 2019-05-14
CN202010401622.0A CN112301018B (en) 2019-05-14 2020-05-13 Novel Cas protein, crispr-Cas system and use thereof in the field of gene editing

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202010401622.0A Division CN112301018B (en) 2019-05-14 2020-05-13 Novel Cas protein, crispr-Cas system and use thereof in the field of gene editing

Publications (1)

Publication Number Publication Date
CN116694603A (en) 2023-09-05

Family

ID=74336498

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202310742030.9A Pending CN116694603A (en) 2019-05-14 2020-05-13 Novel Cas protein, crispr-Cas system and use thereof in the field of gene editing
CN202010401622.0A Active CN112301018B (en) 2019-05-14 2020-05-13 Novel Cas protein, crispr-Cas system and use thereof in the field of gene editing

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202010401622.0A Active CN112301018B (en) 2019-05-14 2020-05-13 Novel Cas protein, crispr-Cas system and use thereof in the field of gene editing

Country Status (1)

Country Link
CN (2) CN116694603A (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114921439B (en) * 2022-06-16 2024-04-26 尧唐(上海)生物科技有限公司 CRISPR-Cas effector protein, gene editing system and application thereof
CN119563022A (en) * 2022-11-11 2025-03-04 深圳华大生命科学研究院 Protein mutants and their application in treating diseases related to HBB gene mutation
TW202440913A (en) * 2022-12-08 2024-10-16 香港商正基基因科技有限公司 Cas12 protein, crispr-cas system and uses thereof
CN116410955B (en) * 2023-03-10 2023-12-19 华中农业大学 Two new endonucleases and their applications in nucleic acid detection
CN120519429B (en) * 2025-07-25 2025-10-10 珠海舒桐医疗科技有限公司 Efficient miniature CRISPR/Cas9 gene editing system and application

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784200B (en) * 2016-08-26 2020-11-06 深圳华大生命科学研究院 Method and device for screening novel CRISPR-Cas system
CA3046824A1 (en) * 2016-12-14 2018-06-21 Wageningen Universiteit Thermostable cas9 nucleases
CN108690845B (en) * 2017-04-10 2021-04-27 中国科学院动物研究所 Genome editing systems and methods

Also Published As

Publication number Publication date
CN112301018B (en) 2023-07-25
CN112301018A (en) 2021-02-02

Similar Documents

Publication Publication Date Title
US12123014B2 (en) Class II, type V CRISPR systems
CN116694603A (en) Novel Cas protein, crispr-Cas system and use thereof in the field of gene editing
AU2016274452B2 (en) Thermostable Cas9 nucleases
CN106544351B (en) CRISPR-Cas9 knock out in vitro drug resistant gene mcr-1 method and its dedicated cell-penetrating peptides
KR20240055073A (en) Class II, type V CRISPR systems
CN103911376A (en) CRISPR-Cas9 targeted knockout hepatitis b virus cccDNA and specific sgRNA thereof
CN105492608A (en) Method using CRISPR-Cas9 to specifically knock off pig PDX1 gene and sgRNA of PDX1 gene for specific targeting
CN113234702A (en) Lt1Cas13d protein and gene editing system
EP4159853A1 (en) Genome editing system and method
CN112430586B (en) VI-B type CRISPR/Cas13 gene editing system and application thereof
KR20240049306A (en) Enzymes with RUVC domains
WO2023206872A1 (en) Engineering-optimized nuclease, guide rna, editing system, and use
KR20240051994A (en) Systems, compositions, and methods comprising retrotransposons and functional fragments thereof
JP2024509047A (en) CRISPR-related transposon system and its usage
CN118006584A (en) Programmable nucleases completely lacking Cas1, Cas2 and Cas4 in CRISPR loci and their applications
WO2024089629A1 (en) Cas12 protein, crispr-cas system and uses thereof
CN115074361B (en) Strong promoter derived from fungi and its application
CN118019843A (en) Class II, Type V CRISPR systems
WO2022197727A1 (en) Generation of novel crispr genome editing agents using combinatorial chemistry
Jiang et al. Restriction site-dependent PCR: an efficient technique for fast cloning of new genes of microorganisms
RU2788197C1 (en) DNA-CUTTING AGENT BASED ON Cas9 PROTEIN FROM THE BACTERIUM STREPTOCOCCUS UBERIS NCTC3858
US20250059568A1 (en) Class ii, type v crispr systems
CN117821423A (en) RalCas13d protein and editing system thereof
RU2778156C1 (en) DNA-CUTTING AGENT BASED ON THE Cas9 PROTEIN FROM THE BACTERIUM CAPNOCYTOPHAGA OCHRACEA
JP7708752B2 (en) Use of the Cas9 protein from the bacterium Pasteurella pneumotropica

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination