RU2771626C1

RU2771626C1 - Tool for cutting double-stranded DNA using Cas12d protein from Katanobacteria and hybrid RNA produced by fusion of guide CRISPR RNA and scout RNA

Info

Publication number: RU2771626C1
Application number: RU2020142806A
Authority: RU
Inventors: Константин Викторович Северинов; Александра Андреевна Васильева; Полина Анатольевна Селькова; Анатолий Николаевич Арсениев; Яна Витальевна Федорова; Михаил Алексеевич Ходорковский
Filing date: 2020-12-24
Publication date: 2022-05-11

Abstract

FIELD: biotechnology.

SUBSTANCE: invention relates to the field of biotechnology, in particular, to hybrid RNA (sgRNA) used as guide RNA in the Cas12d system for editing genomic DNA, as well as to a DNA cassette for expression thereof.

EFFECT: invention is effective for changing the sequence of genomic DNA of a unicellular or multicellular organism.

6 cl, 7 dwg, 2 tbl, 3 ex

Description

Technical field

The invention relates to the field of molecular biology and microbiology; in particular, it describes new bacterial nucleases of the CRISPR-Cas system and new hybrid forms of guide CRISPR RNAs. The invention can be used as a tool for highly specific DNA modification in various organisms.

The invention was created with the financial support of the Ministry of Science and Higher Education of the Russian Federation under Agreement No. 075-15-2019-1661 dated 31 October 2019.

State of the art

Changing DNA sequences is one of the pressing tasks of biotechnology today. Editing and modifying the genomes of eukaryotic and prokaryotic organisms, as well as manipulating DNA in vitro, requires the targeted introduction of double-strand breaks into the DNA sequence. The following methods are currently used for this purpose: artificial nuclease systems containing zinc finger domains, TALEN systems, and bacterial CRISPR-Cas systems. The first two methods require labor-intensive optimization of the nuclease amino acid sequence to recognize a particular DNA sequence. In CRISPR-Cas systems, by contrast, the structures that recognize the target DNA are not proteins but short guide RNAs. Cutting a specific target DNA does not require de novo synthesis of the nuclease or its gene; it is achieved by using guide RNAs complementary to the target sequence. This makes CRISPR-Cas systems convenient and efficient tools for cutting various DNA sequences. The technique allows DNA to be cut at several sites simultaneously by using guide RNAs with different sequences. This approach is used, among other things, to change several genes at once in eukaryotic organisms.

By their nature, CRISPR-Cas systems are immune systems of prokaryotes, capable of introducing highly specific breaks into the genetic material of viruses (Mojica et al., 2005). The abbreviation CRISPR-Cas stands for "Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR associated genes" (Jansen et al., 2002). All CRISPR-Cas systems consist of CRISPR cassettes and genes encoding various Cas proteins (Jansen, 2002). CRISPR cassettes consist of spacers, each with a unique nucleotide sequence, and recurring palindromic repeats (Jansen, 2002). Transcription of CRISPR cassettes and subsequent processing of the transcripts yield guide crRNAs, which together with Cas proteins form an effector complex (Brouns et al., 2008). Through complementary pairing of the crRNA with a target DNA region, called a protospacer, the Cas nuclease recognizes the target DNA and introduces a break in it in a highly specific manner.

CRISPR-Cas systems represented by a single effector protein are divided into six different types (I to VI) depending on the Cas proteins that make up the systems.

The CRISPR-Cas12d system belongs to type V and is characterized by a simple composition and mechanism of operation: its functioning requires the formation of an effector complex consisting of a single Cas12d protein and two short RNAs, a crRNA and a scout RNA (scoutRNA). The scout RNA pairs complementarily with the region of the crRNA derived from the CRISPR repeat, forming a secondary structure necessary for binding of the guide RNAs to the Cas effector. Despite some similarities, the scout RNA and the mechanism by which it binds the crRNA differ considerably from the tracrRNA characteristic of Cas9 proteins, in particular SpCas9 from Streptococcus pyogenes, a nuclease widely used in biotechnology.

The effector protein Cas12d is an RNA-dependent DNA endonuclease that forms double-strand breaks in DNA (Harrington et al., 2020).

The use of CRISPR-Cas nucleases to modify the genomes of various organisms requires reconstituting their activity in vitro and creating a hybrid form of guide RNA (sgRNA), in which the crRNA and a second, additional RNA are fused into a single molecule. Such molecules have been created for SpCas9; they make the system more convenient to work with by reducing the number of its components and the number of promoters required for gene expression in various organisms. For CRISPR-Cas12d systems, however, no such hybrid RNAs have been created previously.

To date, several CRISPR-Cas nucleases are known whose activity has been reconstituted in vitro and which are capable of introducing targeted, specific double-strand breaks in DNA; the most popular of them is SpCas9 from Streptococcus pyogenes. One of the main characteristics limiting the use of CRISPR-Cas systems is the PAM sequence flanking the target DNA at the 3' end, whose presence is necessary for correct recognition of the DNA by the Cas nuclease. Different CRISPR-Cas proteins have different PAM sequences, which restrict the DNA regions on which the nucleases can be used. For example, SpCas9 requires a 5'-NGG-3' PAM.

CRISPR-Cas proteins with new and diverse PAM sequences are needed to make it possible to change any DNA region, both in vitro and in the genomes of living organisms. Altering eukaryotic genomes also requires small nucleases so that CRISPR-Cas systems can be delivered into cells by AAV viruses. Cas12d nucleases, which are small in size, can solve the problem of delivering CRISPR-Cas systems in viral capsids of limited capacity.

Although a number of methods for cutting DNA and changing the sequence of genomic DNA are known, there remains a need for new effective tools for modifying DNA in various organisms and at strictly defined positions in the DNA sequence.

This invention has a number of properties necessary to solve this problem.

Summary of the invention

The objective of the present invention is to create new tools, based on CRISPR-Cas12d systems, for changing the sequence of genomic DNA of unicellular or multicellular organisms.

Currently existing CRISPR-Cas systems are of limited use because of the specific PAM sequence that must be present at the 3' end of the DNA region being modified. The search for new Cas enzymes with other PAM sequences will expand the arsenal of available tools for forming a double-strand break at the required, strictly defined positions in the DNA molecules of different organisms. The search for new small Cas enzymes will allow their delivery in AAV vectors.

Cas12d proteins are small and have short PAM sequences that occur frequently in genomes, which broadens their application in biotechnology. To date, however, the activity of only one Cas12d nuclease, from termite symbiont bacteria, has been reconstituted in vitro. In addition, no hybrid guide RNA (sgRNA) has been created for any member of the Cas12d family that, like the sgRNA used for Cas9 proteins, would reduce the number of system components and facilitate their expression in the cells of organisms.

The aim of the work is to create a DNA cutting tool based on Cas12d proteins and a hybrid RNA obtained by fusion of the guide CRISPR RNA and the scout RNA.

To solve this problem, the authors reconstituted in vitro the nuclease activity of the CRISPR-Cas12d system from Katanobacteria (KbCas12d), whose activity had previously been shown only in bacterial cells (US2020255858A1), and created, for the first time for Cas12d-type systems, a new kind of hybrid RNA that is a fusion of the guide CRISPR RNA and the scout RNA.

CRISPR-KbCas12d with a hybrid guide RNA can be used to introduce targeted changes into the genome of this and other organisms. The essential features that distinguish the present invention are: (a) the possibility of using one guide RNA (sgRNA) instead of two (crRNA and scoutRNA); (b) the small size of the characterized KbCas12d protein, 1125 amino acid residues (a.a.), which is 243 a.a. fewer than the known Cas9 enzyme from Streptococcus pyogenes (SpCas9); (c) the wide operating temperature range of the KbCas12d nuclease, which is active at temperatures from 25°C to 45°C with an optimum at 35°C, allowing its use in organisms with different body temperatures.

This problem is solved by using a protein having a sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1 to form a double-strand break in a DNA molecule, located immediately after the nucleotide sequence 5'-TA-3' in said DNA molecule. In some embodiments of the invention, this use is characterized in that the double-strand break in the DNA molecule is formed at a temperature from 25°C to 45°C.
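The requirement that the break lie immediately after a 5'-TA-3' sequence can be illustrated with a short Python sketch that lists candidate target sites on one DNA strand. This is only an illustration under stated assumptions: the function name `find_ta_pam_sites` and the default protospacer length of 20 nucleotides are hypothetical and are not taken from the patent, and only the strand given as input is scanned.

```python
def find_ta_pam_sites(dna: str, spacer_len: int = 20):
    """List (pam_index, protospacer) pairs for every 5'-TA-3' on the given strand.

    pam_index is the 0-based position of the T of the PAM; the protospacer is
    the spacer_len nucleotides directly downstream (3') of the PAM.
    spacer_len = 20 is an illustrative assumption, not a value from the patent.
    """
    dna = dna.upper()
    sites = []
    # Stop early enough that a full-length protospacer still fits after the PAM.
    for i in range(len(dna) - 2 - spacer_len + 1):
        if dna[i:i + 2] == "TA":
            sites.append((i, dna[i + 2:i + 2 + spacer_len]))
    return sites


# Toy example with a short protospacer length so the result is easy to read.
print(find_ta_pam_sites("GGTAACGTACGTTT", spacer_len=4))
```

To cover both strands, the same scan could be repeated on the reverse complement of the input.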

This problem is also solved by creating a method for changing the genomic DNA sequence of a unicellular or multicellular organism, comprising introducing into at least one cell of this organism an effective amount of: a) either a protein having a sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1, or a nucleic acid encoding a protein having a sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1, and b) a hybrid guide RNA containing a sequence that forms a duplex with the nucleotide sequence of a region of the organism's genomic DNA immediately adjacent to the nucleotide sequence 5'-TA-3', and that interacts with said protein after duplex formation, wherein the interaction of said protein with the guide RNA and the nucleotide sequence 5'-TA-3' leads to the formation of a double-strand break in the genomic DNA sequence immediately adjacent to the 5'-TA-3' sequence. In some embodiments of the invention, this method is characterized in that it further comprises introducing an exogenous DNA sequence simultaneously with the guide RNA.

A mixture of crRNA and scout RNA (scoutRNA), capable of forming a complex with the target DNA region and the KbCas12d protein, can be used as the guide RNA. In preferred embodiments of the invention, the guide RNA can be a hybrid RNA developed by the authors and constructed from the crRNA and the scout RNA by fusing their nucleotide sequences, with the general formula A-B-C-D, where A is the sequence of the KbCas12d scout RNA, B is the linker sequence, C is the sequence of the direct repeat (DR) of the KbCas12d crRNA, and D is the sequence complementary to the target DNA (spacer segment); the sequences of the linker and of the spacer segment can be arbitrary.
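The general formula A-B-C-D can be sketched as a simple concatenation of four RNA segments. The Python example below is a minimal illustration, not the authors' design procedure: the class name `HybridGuide`, its field names, and the short sequences in the usage line are hypothetical placeholders, and the actual scout RNA and direct repeat sequences (SEQ ID NO: 6 and SEQ ID NO: 3) are not reproduced here.

```python
from dataclasses import dataclass

RNA_ALPHABET = set("ACGU")


@dataclass(frozen=True)
class HybridGuide:
    """Hybrid guide RNA laid out 5'->3' as A-B-C-D (field names are illustrative)."""
    scout: str          # A: scout RNA sequence
    linker: str         # B: linker sequence (may be arbitrary)
    direct_repeat: str  # C: direct repeat (DR) portion of the crRNA
    spacer: str         # D: segment complementary to the target DNA (may be arbitrary)

    def sequence(self) -> str:
        parts = (self.scout, self.linker, self.direct_repeat, self.spacer)
        for part in parts:
            # Reject anything that is not plain RNA, e.g. a DNA sequence containing T.
            if not set(part.upper()) <= RNA_ALPHABET:
                raise ValueError(f"not an RNA sequence: {part!r}")
        return "".join(part.upper() for part in parts)


# Placeholder segments only; these are NOT sequences from the patent.
guide = HybridGuide(scout="AAGG", linker="GAAA", direct_repeat="CCUU", spacer="ACGU")
print(guide.sequence())
```

Keeping the four segments as separate fields makes it easy to swap the linker or the spacer, the two parts the description allows to vary, while leaving the scout RNA and direct repeat untouched.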

In some embodiments of the invention, the KbCas12d scout RNA sequence is synthesized based on the sequence of SEQ ID NO: 6. In some embodiments of the invention, the direct repeat (DR) sequence of the KbCas12d crRNA is the invariant part of the sequence of SEQ ID NO: 3.

The hybrid RNA, in turn, can be produced using a hybrid RNA expression cassette, which consists of a U6 promoter sequence, the hybrid RNA sequence according to claim 1, and a DNA sequence flanked at its 5' end by the PAM sequence 5'-TA-3'.

The invention can be used both for cutting target DNA in vitro and for modifying the genome of a living organism. The genome can be modified directly, by cutting it at the appropriate site, or by inserting an exogenous DNA sequence through homologous repair.

Any region of double-stranded or single-stranded DNA from the genome of an organism other than the organism into which it is introduced (or a mixture of such regions with one another and with other DNA fragments) can be used as the exogenous DNA sequence; this region (or mixture of regions) is intended for integration at the site of the double-strand break formed in the target DNA by the KbCas12d nuclease. In some embodiments of the invention, the exogenous DNA sequence can be a region of double-stranded DNA from the genome of the organism into which the KbCas12d protein is introduced, but altered by mutations (nucleotide substitutions) or by insertions or deletions of one or more nucleotides.

The technical result of the present invention is an increase in the versatility of the available CRISPR-Cas12d systems, allowing the Cas12d nuclease to be used for cutting genomic or plasmid DNA at a larger number of specific sites and over wider temperature ranges. Another technical result of the present invention is the simplification of genome editing by systems of the Cas12d family through the use of a hybrid guide RNA.

Brief description of the drawings

Fig. 1. Scheme of the CRISPR-KbCas12d locus.

Fig. 2. KbCas12d in complex with scout RNA and crRNA cuts target DNA in vitro.

Fig. 3. Temperature dependence of the DNA cutting activity of the KbCas12d protein in vitro.

Fig. 4. DNA cutting by the KbCas12d protein in complex with crRNA and the different forms of scout RNA shown at the top of the figure.

Fig. 5. Creation of a hybrid RNA for the KbCas12d nuclease. (A) Design scheme of the hybrid sgRNA. (B) Result of DNA cutting with sgRNA variants whose linker sequences, connecting the crRNA and the scout RNA, differ in length.

Fig. 6. Cutting of various DNA targets by KbCas12d in vitro.

Fig. 7. Alignment of the amino acid sequences of KbCas12d and Cas12d15 from termite symbiont bacteria using the NCBI BLASTp program (default parameters).

Detailed disclosure of the invention

In the description of the present invention, the terms "comprises" and "comprising" are interpreted to mean "includes, among other things". These terms are not intended to be construed as "consists only of". Unless otherwise defined, technical and scientific terms in this application have the standard meanings generally accepted in the scientific and technical literature.

As used herein, the term "percent homology of two sequences" is equivalent to the term "percent identity of two sequences". Sequence identity is determined relative to a reference sequence. Algorithms for sequence analysis are known in the art, such as BLAST, described by Altschul et al. in 1990. For the purposes of the present invention, the level of identity and similarity between nucleotide sequences and between amino acid sequences can be determined by comparing them with the BLAST software package provided by the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/blast), using a gapped alignment with default parameters. The percent identity of two sequences is determined by the number of positions with identical amino acids in the two sequences, taking into account the number of gaps, and the length of each gap, that must be introduced to match the two sequences optimally by alignment. The percent identity equals the number of identical amino acids at corresponding positions in the alignment, divided by the total number of positions and multiplied by 100.
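The percent identity definition above can be written as a small Python function operating on two sequences that have already been aligned (for example, by BLAST). This is a sketch under stated assumptions: the function name `percent_identity` and the use of "-" as the gap character are illustrative, and "total number of positions" is interpreted here as the number of alignment columns, which is one possible reading of the definition.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Identical, non-gap residues are counted and divided by the number of
    alignment columns (an assumed reading of "total number of positions").
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Toy alignment: 4 identical residues across 6 columns, including one gap column.
print(round(percent_identity("MKT-LV", "MRTALV"), 2))
```

Under this reading, a candidate protein would satisfy the "at least 95% identical" condition if the function returned 95.0 or more for its alignment against SEQ ID NO: 1.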

The term "specifically hybridizes" refers to an association between two single-stranded nucleic acid molecules, or sufficiently complementary sequences, that permits such hybridization under predetermined conditions commonly used in the art.

The phrase "double-strand break located immediately after the PAM nucleotide sequence" means that the double-strand break in the target DNA sequence will be made at a distance of 0 to 25 nucleotides after the PAM nucleotide sequence.

An exogenous DNA sequence introduced simultaneously with the guide RNA is to be understood as a DNA sequence prepared specifically for the specific modification of the double-stranded target DNA at the break site determined by the specificity of the guide RNA. Such a modification may be, for example, the insertion or deletion of particular nucleotides at the break site in the target DNA. The exogenous DNA can be either a DNA region from another organism or a DNA region from the same organism as the target DNA.

A protein containing a specified amino acid sequence is to be understood as a protein whose amino acid sequence is composed of the specified sequence and, possibly, other sequences joined to it by peptide bonds. Examples of such other sequences are a nuclear localization signal (NLS) or other sequences that provide increased functionality for the specified amino acid sequence.

An effective amount of protein and RNA introduced into a cell is to be understood as an amount that, once inside the cell, is able to form a functional complex, that is, a complex that specifically binds the target DNA and produces a double-strand break in it at the location determined by the guide RNA and the PAM sequence on the DNA. The efficiency of this process can be assessed by analyzing the target DNA isolated from the cell using standard methods known to those skilled in the art.

Delivery of the protein and RNA into the cell can be carried out in various ways. For example, the protein can be delivered as a DNA plasmid that encodes the gene for the protein, as an mRNA for translation of the protein in the cytoplasm of the cell, or as a ribonucleoprotein complex that includes the protein and a guide RNA. Delivery can be accomplished by various methods known to those skilled in the art.

The nucleic acid encoding the components of the system can be introduced into the cell directly or indirectly: by transfection or transformation of cells by methods known to those skilled in the art, by the use of a recombinant virus, or by manipulation of the cell, such as DNA microinjection and the like.

Delivery of the ribonucleoprotein complex consisting of the nuclease and the guide RNAs, together with exogenous DNA if necessary, can be carried out by transfection of the complexes into the cell or by mechanical introduction of the complex into the cell, for example, by microinjection.

The nucleic acid molecule encoding the protein to be introduced into the cell may be integrated into a chromosome or may be extrachromosomally replicating DNA. In some embodiments, to ensure efficient expression of the protein gene from DNA introduced into a cell, the sequence of that DNA must be changed to suit the cell type (codon optimization), because synonymous codons occur at uneven frequencies in the coding regions of the genomes of different organisms. Codon optimization is needed to increase expression in animal, plant, fungal, or microbial cells.

For a protein having a sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1 to function in a eukaryotic cell, the protein must reach the nucleus of that cell. Therefore, in some embodiments of the invention, double-strand breaks in the target DNA are formed using a protein that has a sequence at least 95% identical to the amino acid sequence of SEQ ID NO: 1 and that is further modified at one or both ends by the addition of one or more nuclear localization signals. For example, the nuclear localization signal from the SV40 virus can be used. For efficient delivery to the nucleus, the nuclear localization signal can be separated from the main protein sequence by a spacer sequence, such as that described by Shen et al. in 2013. In other embodiments, a different nuclear localization signal, or an alternative method of delivering the protein to the cell nucleus, can be used.

The present invention encompasses the use of a protein from Katanobacteria, homologous to the Cas12d15 protein from termite symbiont bacteria that was previously characterized in vitro, to introduce double-strand breaks into DNA molecules at strictly defined positions.

The use of CRISPR nucleases to introduce targeted changes into a genome has a number of advantages. First, the specificity of the system is determined by the crRNA sequence, which makes it possible to use one type of nuclease for all target loci. Second, the technique allows several guide RNAs, complementary to different target genes, to be delivered into the cell at once, which makes it possible to change several genes simultaneously.

KbCas12d is a Cas nuclease found in Katanobacteria. The Katanobacteria CRISPR-Cas12d1 system (hereinafter CRISPR-KbCas12d) belongs to the type V-D CRISPR-Cas systems and consists of a CRISPR cassette carrying direct repeats (DR) with the sequence 5'-actccgaaagtatcggggataaaggc-3', separated by unique spacer sequences. The activity of the CRISPR-KbCas12d system as a defense system was previously shown in bacteria (Harrington et al., 2020), and the PAM requirements were determined in the same experiments; however, the operation of the system had not previously been reconstituted in vitro. Adjacent to the CRISPR cassette are the gene for the effector Cas12d protein KbCas12d and the gene for the Cas1 protein, which is involved in adaptation, that is, the incorporation of new spacers. Next to the Cas genes lies a sequence that is partially complementary to the direct repeats and folds into a characteristic secondary structure: the putative scout RNA (scoutRNA) (Fig. 1). However, the functioning of the scout RNA in KbCas12d had not been shown experimentally.

Reconstitution of the KbCas12d DNA-cutting complex in vitro

Previous bioinformatic analysis (Harrington et al., 2020) predicted putative crRNA and scout RNA sequences for the CRISPR-KbCas12d system (Table 1).

Table 1. Guide RNA sequences of the CRISPR-KbCas12d system used for the initial tests of KbCas12d nuclease activity in vitro. Bold indicates the direct repeat (DR) sequence.

Name                  Sequence
KbCas12d scout RNA    5'-aaguaucaaaauaaaaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuuc-3' (SEQ ID NO: 2)
KbCas12d crRNA        5'-acuccgaaaguaucggggauaaaggcnnnnnnnnnnnnnnnnnn-3' (SEQ ID NO: 3)

To test the activity of the KbCas12d nuclease, experiments were carried out to reconstitute the DNA cutting reaction in vitro. This required obtaining all components of the KbCas12d effector complex: the guide RNAs and the nuclease in recombinant form. Determining the sequences of the guide RNAs made it possible to synthesize the crRNA and scout RNA molecules in vitro. Synthesis was performed using the NEB HiScribe T7 RNA synthesis kit.

A linear DNA fragment 916 bp long (SEQ ID NO: 4) was used as the target DNA. The target DNA carried the target site 5'-GTCATTGGCAGCTACAGG-3', flanked at its 5' end by the PAM sequence 5'-TA-3'.

To cut this target, guide RNAs of the following sequences were used: scout RNA (SEQ ID NO: 2; 5'-aaguaucaaaauaaaaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuuc-3') and crRNA (SEQ ID NO: 5; 5'-cuccgaaaguaucggggauaaaggcGUCAUUGGCAGCUACAGG-3'). The crRNA sequence complementary to the protospacer (the target DNA sequence) is highlighted in bold.
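The relationship between the target site and the crRNA given above can be sketched in Python. This is a minimal illustration rather than the authors' procedure: the function name `make_crrna` is hypothetical, the repeat-derived portion is copied exactly as it is printed in SEQ ID NO: 5, and the spacer is assumed to be the listed target site written as RNA (U in place of T), which is what the printed sequences show.

```python
# Repeat-derived 5' portion of the crRNA, copied as printed in SEQ ID NO: 5.
REPEAT = "cuccgaaaguaucggggauaaaggc"


def make_crrna(protospacer_dna: str, repeat: str = REPEAT) -> str:
    """Return repeat + spacer, with the spacer written as RNA (T -> U).

    `make_crrna` is a hypothetical helper used only for illustration.
    """
    spacer = protospacer_dna.upper().replace("T", "U")
    if set(spacer) - set("ACGU"):
        raise ValueError("protospacer must contain only A, C, G and T")
    return repeat + spacer


# Target site from the 916 bp fragment (SEQ ID NO: 4).
crrna = make_crrna("GTCATTGGCAGCTACAGG")
print(crrna)
```

Retargeting the nuclease then amounts to passing a different protospacer to the same function while the repeat-derived portion stays fixed.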

To obtain the recombinant KbCas12d protein, its gene was cloned into the pET21a plasmid together with a sequence encoding MBP (Maltose Binding Protein). The DNA encoding the gene had a sequence matching that of the corresponding gene in the genome of the host bacterium Katanobacteria. Escherichia coli Rosetta cells were transformed with the resulting plasmid pET21a-MBP-KbCas12d-6xHis.

5 ml of overnight culture was diluted into 500 ml of LB medium, and the cells were grown at 37°C to an optical density of 0.6 relative units. Synthesis of the target protein was induced by adding IPTG to a concentration of 1 mM, after which the cells were incubated at 16°C for 16 hours. The cells were then centrifuged at 5000 g for 30 minutes, and the resulting cell pellets were frozen at -20°C.

The pellets were thawed on ice for 30 minutes, resuspended in 15 ml of lysis buffer (50 mM HEPES pH 7.5 (24°C), 10% glycerol, 500 mM NaCl, 10 mM imidazole, 1 mM beta-mercaptoethanol) supplemented with 15 mg of lysozyme, and incubated on ice for a further 30 minutes. The cells were then disrupted by sonication for 30 minutes and centrifuged for 40 minutes at 16000 g. The resulting supernatant was passed through a 0.2 µm filter and applied to a HisTrap HP 1 mL column (GE Healthcare) at a rate of 0.4 ml/min.

Chromatography was performed on an AKTA FPLC chromatograph (GE Healthcare) at a rate of 0.4 ml/min. The column with the bound protein was washed with 20 ml of lysis buffer supplemented with 30 mM imidazole, after which the protein was eluted with lysis buffer supplemented with 300 mM imidazole.

Next, the protein was concentrated using an Amicon concentrator (with a 30 kDa filter) to a volume of 500 µl in a buffer containing no imidazole. TEV protease (30 µl of TEV from a 2 mg/ml stock) was then added to the protein to cleave off the MBP tag, and the mixture was incubated overnight at +4°C.

The protein fraction obtained by affinity chromatography was then passed through a Superdex 200 Increase 10/300 GL gel filtration column (GE Healthcare) equilibrated with a buffer containing 50 mM HEPES-HCl pH 7.5, 500 mM NaCl, 1 mM DTT and 10% glycerol. Using an Amicon concentrator (with a 30 kDa filter), the fractions corresponding to the monomeric form of the KbCas12d protein were concentrated to 1 mg/ml, after which the purified protein was stored at -80°C in a buffer containing 10% glycerol.

The in vitro cutting reaction on linear DNA was carried out in a volume of 20 µl under the following conditions. The reaction mixture consisted of 1X CutSmart buffer (NEB), 20 nM DNA, 4 µM scout RNA/crRNA, and 400 nM KbCas12d protein. Samples containing no scout RNA were prepared in the same way as controls. The samples were incubated at 37°C and analyzed by gel electrophoresis in a 1.5% agarose gel. If the KbCas12d protein recognizes and cuts the DNA correctly and specifically, two DNA fragments of about 670 and 246 base pairs should be formed (see Fig. 2).
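As a quick consistency check on these reaction conditions, the amounts of each component and the expected fragment sizes can be worked out with a few lines of Python. This is only an illustrative sketch; the helper name `amount_fmol` is hypothetical, and the numbers are those stated in the paragraph above.

```python
REACTION_VOLUME_UL = 20
TARGET_LENGTH_BP = 916                # linear target, SEQ ID NO: 4
EXPECTED_FRAGMENTS_BP = (670, 246)    # approximate sizes stated in the text

# Final concentrations in the reaction, in nM.
CONCENTRATIONS_NM = {"DNA": 20, "guide RNA": 4000, "KbCas12d": 400}


def amount_fmol(conc_nm: int, volume_ul: int) -> int:
    """nmol/L x uL = 1e-9 mol/L x 1e-6 L = 1e-15 mol, i.e. femtomoles."""
    return conc_nm * volume_ul


amounts = {name: amount_fmol(c, REACTION_VOLUME_UL)
           for name, c in CONCENTRATIONS_NM.items()}
excess_over_dna = {name: c // CONCENTRATIONS_NM["DNA"]
                   for name, c in CONCENTRATIONS_NM.items()}

# The two expected products should add up to the full-length target.
fragments_consistent = sum(EXPECTED_FRAGMENTS_BP) == TARGET_LENGTH_BP

print(amounts)
print(excess_over_dna)
print(fragments_consistent)
```

Under these conditions the guide RNA is in 200-fold and the protein in 20-fold molar excess over the DNA target.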

The results showed that the reconstituted KbCas12d-crRNA-scoutRNA complex is active in vitro and cuts DNA carrying the target protospacer sequence.

A temperature gradient (Fig. 3) showed that the protein is active in the temperature range from 25°C to 45°C. In subsequent work, 37°C was used as the working temperature.

Design of a hybrid RNA obtained by fusion of guide CRISPR RNA and scout RNA

Next, the optimal scout RNA sequence was selected. For this purpose, a series of scout RNA forms differing in length and secondary structure were synthesized in vitro (Fig. 4):

Scout RNA 1 aaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuu (SEQ ID NO: 6);

Scout RNA 2 caaaauaaaaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuuccugauuuug (SEQ ID NO: 7);

Scout RNA 3 aaguaucaaaauaaaaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuuccugauuuuguugauaau (SEQ ID NO: 8);

Scout RNA 4 aaguaucaaaauaaaaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuuc (SEQ ID NO: 9);

Scout RNA 5 auuaucaacaaaaucaggaaaggguggaaggcuaaaguuuaguuaaaaacuggaaacccuuu (SEQ ID NO: 10).

With the synthesized scout RNAs, the crRNA of SEQ ID NO: 5 and the recombinant KbCas12d protein, the cutting reaction on the target DNA (SEQ ID NO: 4) was carried out in a volume of 20 µl at 37°C for 30 minutes (1X CutSmart buffer (NEB), 20 nM DNA, 4 µM scout RNA/crRNA, 400 nM KbCas12d protein) (Fig. 4).

The results of the experiment showed that scout RNA 1 (aaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuu, SEQ ID NO: 6), the shortest form of the molecule, is optimal for DNA cutting. In complex with it and the crRNA, the KbCas12d protein actively cuts a double-stranded DNA fragment carrying a protospacer sequence flanked by the PAM 5'-TA-3'.

This molecule was used as the basis for designing a hybrid RNA, a fusion of the guide crRNA and the scout RNA (sgRNA, single guide RNA) (Fig. 5). To create the sgRNA, the 3' end of the scout RNA was joined to the crRNA sequence through linkers with the sequence "GAAA". sgRNAs with different linker lengths were tested (Fig. 5):

sgRNA 1 (SEQ ID NO: 11) (aaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuugaaacuccgaaaguaucggggauaaaggcAUCAAUACCAAACUCUGG)

sgRNA 2 (SEQ ID NO: 12) (aaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuugaaagaaacuccgaaaguaucggggauaaaggcAUCAAUACCAAACUCUGG)

sgRNA 3 (SEQ ID NO: 13) (aaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuugaaagaaagaaacuccgaaaguaucggggauaaaggcAUCAAUACCAAACUCUGG)

The sgRNA sequence that recognizes the protospacer is shown in bold; the linker sequence is shown in bold italics.
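The way the three sgRNA variants are put together can be sketched as a simple concatenation. The Python below is a hedged illustration, not the authors' design tool: `build_sgrna` is a hypothetical helper, and the repeat-derived segment is taken exactly as it appears inside SEQ ID NO: 11 to 13 as printed above.

```python
SCOUT_RNA_1 = "aaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuu"  # SEQ ID NO: 6
LINKER_UNIT = "gaaa"
# Repeat-derived crRNA segment as it appears in SEQ ID NO: 11 to 13.
REPEAT = "cuccgaaaguaucggggauaaaggc"
SPACER = "AUCAAUACCAAACUCUGG"  # protospacer-recognizing part of sgRNA 1 to 3


def build_sgrna(spacer: str, linker_units: int = 1) -> str:
    """Join scout RNA 3' end -> linker -> repeat-derived segment -> spacer.

    `build_sgrna` is a hypothetical helper used only for illustration.
    """
    if linker_units < 1:
        raise ValueError("at least one GAAA linker unit is required")
    return SCOUT_RNA_1 + LINKER_UNIT * linker_units + REPEAT + spacer


# One, two and three linker units correspond to sgRNA 1, 2 and 3.
sgrnas = {n: build_sgrna(SPACER, linker_units=n) for n in (1, 2, 3)}
for n, seq in sgrnas.items():
    print(f"sgRNA {n}: {len(seq)} nt")
```

Only the `spacer` argument needs to change to direct the same scaffold to a different target.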

The KbCas12d protein in complex with sgRNA 1, sgRNA 2 or sgRNA 3 was incubated with the DNA target used in the experiments described above for 30 minutes at 37°C. The reaction products were analyzed by agarose gel electrophoresis to assess the efficiency of cutting of the DNA fragment. A reaction mixture in which a complex of crRNA and scout RNA was used instead of the sgRNA served as the positive control.

The experiment showed that the use of sgRNA 1 or sgRNA 2 leads to efficient cutting of the target DNA, on a par with the set of two guide RNAs, the scout RNA and the crRNA. It was thus possible to find a form of hybrid RNA (sgRNA) for efficient DNA cutting by the KbCas12d protein. These hybrid RNA variants can be used to cut any other target DNA by changing the sequence that pairs directly with the target DNA.

A similar scheme for joining the scout RNA and the crRNA can be used for any member of the Cas12d family.

The following examples are given to disclose the characteristics of the present invention and should not be construed as limiting the scope of the invention in any way.

Example 1. Testing the activity of the KbCas12d protein in cutting various DNA targets.

To test the ability of KbCas12d to recognize different DNA sequences flanked by the PAM sequence, in vitro cutting experiments were performed on DNA targets from the sequence of the human grin2b gene (see Table 2). The target DNA was a PCR product matching the sequence of a fragment of the human grin2b gene: SEQ ID NO: 14.

As a negative control, cutting was attempted on a DNA target flanked at its 5' end by nucleotides other than the PAM "TA".

In the cutting reaction, a PCR fragment of the grin2b gene carrying recognition sites (Table 2) presumably recognized by KbCas12d was used as the target. crRNAs directing KbCas12d to these sites were synthesized to recognize these sequences.

Table 2. DNA targets in the human grin2b gene.

Target                    5' flanking region (PAM in bold)    DNA target
Target 1                  ATCTCTGGTA                          TTTGCTCTGCAGAATGAG
Target 2                  AAGGACCTTA                          TCTCCTTTCATTGAGCAC
Target 3                  GAGCATGTTA                          AAATAGGATCTACATCAC
Target 4                  ACCCGGGGTA                          CCACGGAGAGATGGTGGA
Target 5                  TATTGCTATA                          GTCATTGGCAGCTACAGG
Target 6                  TTGGCAGCTA                          CAGGCAGAGACAAAGGAG
Target 7 (without PAM)    GCCATCCTAT                          AGTCGTGACTTCCCTAAA

The cutting reactions were carried out under the conditions selected for KbCas12d; the result is shown in Fig. 6. Fig. 6 shows that the KbCas12d enzyme successfully cut all targets with a suitable PAM and introduced no breaks into target 7, which was used as the negative control.
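The PAM rule behind this result can be expressed as a short check over the Table 2 entries. The following Python sketch is illustrative only: `has_pam` is a hypothetical helper, and it assumes that the PAM is the last two nucleotides of the 5' flanking region, immediately before the target sequence, as the table layout suggests. It predicts which targets carry the PAM and makes no claim about cutting efficiency.

```python
PAM = "TA"

# (5' flanking region, DNA target) pairs copied from Table 2.
TARGETS = {
    "Target 1": ("ATCTCTGGTA", "TTTGCTCTGCAGAATGAG"),
    "Target 2": ("AAGGACCTTA", "TCTCCTTTCATTGAGCAC"),
    "Target 3": ("GAGCATGTTA", "AAATAGGATCTACATCAC"),
    "Target 4": ("ACCCGGGGTA", "CCACGGAGAGATGGTGGA"),
    "Target 5": ("TATTGCTATA", "GTCATTGGCAGCTACAGG"),
    "Target 6": ("TTGGCAGCTA", "CAGGCAGAGACAAAGGAG"),
    "Target 7": ("GCCATCCTAT", "AGTCGTGACTTCCCTAAA"),
}


def has_pam(flank_5prime: str, pam: str = PAM) -> bool:
    """True if the 5' flanking region ends with the PAM (hypothetical helper)."""
    return flank_5prime.upper().endswith(pam)


pam_present = {name: has_pam(flank) for name, (flank, _) in TARGETS.items()}
for name, present in pam_present.items():
    print(name, "PAM present" if present else "no PAM")
```

The prediction agrees with the reported outcome: targets 1 to 6 carry the 5'-TA-3' PAM, while target 7 does not.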

Thus, the studies carried out made it possible to reconstitute the activity of the CRISPR-KbCas12d system from Katanobacteria in vitro and to create on its basis a DNA-cutting tool consisting of the KbCas12d nuclease and a hybrid RNA obtained by fusion of guide CRISPR RNA and scout RNA.

Example 2. Cas12d proteins from closely related organisms belonging to Katanobacteria.

To date, among the systems of the CRISPR-Cas12d family, only the activity of the CRISPR-Cas12d15 system from termite symbiont bacteria has been reconstituted in vitro (Harrington et al., 2020). Cas12d15 is comparable in size to KbCas12d and is 27% identical to it (Fig. 7; the degree of identity was calculated with the BLASTp program using default parameters).

Thus, the KbCas12d protein differs substantially in amino acid sequence from the other Cas12d proteins whose activity has been reconstituted in vitro. KbCas12d is the first Cas12d protein for which the possibility of creating a hybrid sgRNA has been shown.

It is obvious to a person skilled in genetic engineering that the KbCas12d protein sequence variant obtained and characterized in this Description can be altered without changing the function of the protein itself, for example by directed mutagenesis of amino acid residues that do not directly affect functional activity (Sambrook et al., 1989). In particular, the skilled person knows that non-conserved amino acid residues can be changed without affecting the residues that determine the functionality of the protein (that is, its function or structure). Examples of such changes include replacing non-conserved amino acid residues with homologous ones. In some embodiments of the invention, a protein may be used that comprises an amino acid sequence at least 95% identical to the amino acid sequence of SEQ ID NO: 1 and differing from SEQ ID NO: 1 only in non-conserved amino acid residues, to form a double-strand break in a DNA molecule located immediately after the nucleotide sequence 5'-TA-3' in said DNA molecule. Homologous proteins can be obtained by mutagenesis (for example, site-directed or PCR-mediated mutagenesis) of the corresponding nucleic acid molecules, followed by testing the encoded modified Cas12d protein for retention of its functions using the functional assays described herein.
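The 95% identity criterion can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length (gaps written as `-`) and defines identity as matching columns divided by aligned columns. That definition, the function names, and the single substitution in the toy variant are assumptions made for illustration; the Description does not prescribe a particular identity calculation.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns (columns that are gaps in both are skipped)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# First 20 residues of SEQ ID NO: 1 in one-letter code.
reference = "MRKKLFKGYILHNKRLVYTG"
# Hypothetical variant with one substitution (T19S), for illustration only.
variant = "MRKKLFKGYILHNKRLVYSG"

print(percent_identity(reference, variant))
print(meets_identity_threshold(reference, variant))
```

In practice the comparison would be run over the full 1125-residue sequence after alignment, and the identity check alone does not establish that the differences fall in non-conserved residues.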

Example 3. The KbCas12d system described in the present invention, in complex with guide RNAs or with the hybrid RNA developed in the present invention, can be used to change the genomic DNA sequence of a multicellular organism, including a eukaryotic one. Various approaches known to those skilled in the art can be used to introduce the KbCas12d system in complex with guide RNA/sgRNA into the cells of the organism (all cells or a subset of them). For example, methods for delivering CRISPR-Cas systems into cells are described in Liu et al., 2017 and Lino et al., 2018, and in the references cited therein.

For efficient expression of the KbCas12d nuclease in eukaryotic cells, it is desirable to codon-optimize the sequence encoding the KbCas12d protein using methods known to those skilled in the art (for example, the IDT codon optimization tool).

For the KbCas12d nuclease to work efficiently in eukaryotic cells, the protein must be imported into the cell nucleus. This can be achieved with the nuclear localization signal from the SV40 T-antigen (Lanford et al., Cell, 1986, 46: 575-582), joined to the KbCas12d sequence with or without the spacer sequence described in Shen et al., 2013. The complete amino acid sequence of the nuclease transported into the nucleus of a eukaryotic cell is therefore: MAPKKKRKVGIHGVPAA-KbCas12d-KRPAATKKAGQAKKKK (SEQ ID NO: 15 - KbCas12d - SEQ ID NO: 16; hereinafter KbCas12d NLS). At least two approaches can be used to deliver a protein with this amino acid sequence.

Delivery as a gene is carried out by constructing a plasmid that carries the KbCas12d NLS gene under the control of a promoter (for example, the CMV promoter) and a sequence encoding the guide RNAs under the control of the U6 promoter. DNA sequences flanked by 5'-TA-3' are used as DNA targets, for example, the following sequence of the human grin2b gene:

5'-AAATAGGATCTACATCAC-3'
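Candidate targets of this kind can be enumerated with a short Python sketch. It scans one strand for 5'-TA-3' and reports the 18 nucleotides that follow it. The function name, the 18-nt protospacer length (taken from the example target above), and the single-strand scan are assumptions for illustration. The test fragment is the first 60 nt of SEQ ID NO: 4 from the sequence listing, which contains the target shown above.

```python
def find_ta_targets(dna: str, protospacer_len: int = 18):
    """Return (pam_start, protospacer) for every 5'-TA-3' on the given strand
    that is followed by a full-length protospacer."""
    seq = dna.upper()
    hits = []
    # Stop early enough that PAM (2 nt) + protospacer still fit in the sequence.
    for i in range(len(seq) - 2 - protospacer_len + 1):
        if seq[i:i + 2] == "TA":
            hits.append((i, seq[i + 2:i + 2 + protospacer_len]))
    return hits


# First 60 nt of SEQ ID NO: 4 (linear DNA fragment).
fragment = "ccctgcaaacacaaagaaagagcatgttaaaataggatctacatcacgtaacctgtctta"

for pam_start, protospacer in find_ta_targets(fragment):
    print(pam_start, protospacer)
```

A complete search would also scan the reverse complement and filter candidates for off-target matches elsewhere in the genome; neither step is shown here.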

The sgRNA expression cassette (SEQ ID NO: 17) is as follows:

gagggcctatttcccatgattccttcatatttgcatatacgatacaaggctgttagagagataattggaattaatttgactgtaaacacaaagatattagtacaaaatacgtgacgtagaaagtaataatttcttgggtagtttgcagttttaaaattatgttttaaaatggactatcatatgcttaccgtaacttgaaagtatttcgatttcttggctttatatatcttgtggaaaggacgaaacaccgaaagggtttccagtttttaactaaactttagccttccaccctttgaaagaaactccgaaagtatcggggataaaggcATCAATACCAAACTCTGG

The U6 promoter sequence is shown in bold, the sequence that forms the sgRNA structure is in lowercase, and the sequence required for recognition of the target DNA is in uppercase.
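The layout of the sgRNA-encoding part of the cassette can be reproduced with a small Python sketch. The segment boundaries are read off the sequence listing: the scout-derived part as in SEQ ID NO: 6, the 8-nt linker and 18-nt spacer at the positions annotated for SEQ ID NO: 12, and the crRNA repeat part as it appears in SEQ ID NO: 5. The function names and this way of splitting the scaffold are assumptions for illustration, not the authors' construction procedure.

```python
# Segments of the 95-nt sgRNA (SEQ ID NO: 12), written 5'->3' as RNA.
SCOUT_PART = "aaaggguuuccaguuuuuaacuaaacuuuagccuuccacccuuu"  # positions 1-44
DEFAULT_LINKER = "gaaagaaa"                                  # positions 45-52
REPEAT_PART = "cuccgaaaguaucggggauaaaggc"                    # positions 53-77


def build_sgrna(spacer: str, linker: str = DEFAULT_LINKER) -> str:
    """Join scout part, linker, crRNA repeat part and spacer into one sgRNA."""
    rna_spacer = spacer.lower().replace("t", "u")
    if not rna_spacer or set(rna_spacer) - set("acgu"):
        raise ValueError("spacer must contain only A, C, G and T/U")
    return SCOUT_PART + linker + REPEAT_PART + rna_spacer


def to_cassette_dna(sgrna: str, spacer_len: int) -> str:
    """DNA form of the sgRNA: scaffold in lowercase, spacer in uppercase."""
    if spacer_len <= 0 or spacer_len >= len(sgrna):
        raise ValueError("spacer_len must be between 1 and len(sgrna) - 1")
    dna = sgrna.replace("u", "t")
    return dna[:-spacer_len] + dna[-spacer_len:].upper()


sgrna = build_sgrna("ATCAATACCAAACTCTGG")
insert = to_cassette_dna(sgrna, spacer_len=18)
print(len(sgrna))
print(insert)
```

The printed DNA string corresponds to the part of the cassette that follows the U6 promoter; retargeting would change only the uppercase spacer.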

The plasmid DNA is purified and transfected into human HEK293 cells using Lipofectamine 2000 reagent (Thermo Fisher Scientific). The cells are incubated for 72 hours, after which genomic DNA is isolated from them using genomic DNA purification columns (Thermo Fisher Scientific). The target DNA site is analyzed by sequencing on the Illumina platform to determine the number of insertions and deletions (indels) that arise at the target site through the directed double-strand break and its subsequent repair.

The target fragments are amplified with primers flanking the expected site of the break.

After amplification, the samples are prepared for high-throughput sequencing according to the protocol of the Ultra II DNA Library Prep Kit for Illumina (NEB). Sequencing is then performed on the Illumina platform (300 cycles, forward read). The sequencing results are analyzed by bioinformatic methods. An insertion or deletion of several nucleotides in the target DNA sequence is taken as evidence of cleavage.
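The indel criterion can be sketched in Python as a crude length comparison. The sketch assumes every read has been trimmed to span the whole amplicon from primer to primer, so that any length difference from the reference reflects a net insertion or deletion. The function names and the toy reads are hypothetical. Substitutions and indels that cancel in length are not detected, so this is an illustration of the criterion rather than the bioinformatic pipeline used by the authors.

```python
def classify_read(reference: str, read: str) -> str:
    """Label a full-amplicon read by its net length change versus the reference."""
    diff = len(read) - len(reference)
    if diff > 0:
        return "insertion"
    if diff < 0:
        return "deletion"
    return "unchanged_length"


def indel_fraction(reference: str, reads: list) -> float:
    """Fraction of reads carrying a net insertion or deletion."""
    if not reads:
        raise ValueError("no reads supplied")
    edited = sum(1 for r in reads
                 if classify_read(reference, r) != "unchanged_length")
    return edited / len(reads)


# Toy data for illustration only (not experimental results).
toy_reference = "AAATAGGATCTACATCACGTAACC"
toy_reads = [
    toy_reference,                                   # unedited
    toy_reference[:10] + toy_reference[13:],         # 3-nt deletion
    toy_reference[:10] + "GG" + toy_reference[10:],  # 2-nt insertion
    toy_reference,                                   # unedited
]

print(indel_fraction(toy_reference, toy_reads))
```

A production analysis would align each read to the reference and restrict the call to a window around the expected cut site.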

Delivery as a ribonucleoprotein complex is carried out by incubating the recombinant form of KbCas12d NLS with guide RNAs in CutSmart buffer (NEB). The recombinant protein is obtained from bacterial producer cells and purified by affinity chromatography (Ni-NTA, Qiagen) and size-exclusion chromatography (Superdex 200).

The protein is mixed with the RNA at a ratio of 1:2 (KbCas12d NLS : sgRNA) and incubated for 10 minutes at room temperature, after which the mixture is transfected into cells.
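The 1:2 mixing step can be worked through with a small Python helper. It assumes the ratio is molar, which the text does not state explicitly, and the stock concentrations and protein amount in the example are hypothetical values chosen only to show the arithmetic.

```python
def rnp_mix(protein_pmol: float, protein_stock_uM: float,
            sgrna_stock_uM: float, sgrna_per_protein: float = 2.0) -> dict:
    """Volumes needed to combine protein and sgRNA at a given molar ratio.

    1 uM equals 1 pmol per microliter, so pmol / uM gives microliters.
    """
    if min(protein_pmol, protein_stock_uM, sgrna_stock_uM, sgrna_per_protein) <= 0:
        raise ValueError("all inputs must be positive")
    sgrna_pmol = protein_pmol * sgrna_per_protein
    return {
        "protein_uL": protein_pmol / protein_stock_uM,
        "sgrna_pmol": sgrna_pmol,
        "sgrna_uL": sgrna_pmol / sgrna_stock_uM,
    }


# Hypothetical example: 50 pmol protein from a 10 uM stock, sgRNA stock at 20 uM.
print(rnp_mix(protein_pmol=50, protein_stock_uM=10, sgrna_stock_uM=20))
```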

DNA extracted from the transfected cells is then analyzed for indels at the target DNA site, as described above.

The KbCas12d nuclease from Katanobacteria characterized in the present invention has a number of advantages over previously characterized Cas proteins.

To date, KbCas12d is the second Cas12d protein whose activity has been reconstituted in vitro and the first Cas12d protein for which a hybrid sgRNA has been created. Using an sgRNA instead of a complex of crRNA and scout RNA simplifies the system and is an advantage over Cas12d15 from termite symbiont bacteria.

KbCas12d is a small protein (1125 aa), which is a further advantage over other Cas nucleases. Efficient DNA cleavage over a wide temperature range makes KbCas12d applicable in fish and plant cells as well as in the cells of warm-blooded animals. KbCas12d in complex with the hybrid RNA created here can therefore become the basis of a new genome editing tool.

Although the invention has been described with reference to the disclosed embodiments, it should be apparent to those skilled in the art that the specific cases described in detail are given only to illustrate the present invention and should not be construed as limiting the scope of the invention in any way. It should be understood that various modifications are possible without departing from the essence of the present invention.

Bibliography

Altschul S. F. et al., Basic local alignment search tool // J. Mol. Biol., 1990, Oct, 215 (3): 403-410.

Brouns S. J. J. et al., Small CRISPR RNAs guide antiviral defense in prokaryotes // Science, 2008, 321 (5891): 960-964.

Harrington L. B. et al., A scoutRNA is required for some type V CRISPR-Cas systems // Mol. Cell, 2020, Aug, 79 (3): 416-424.

Jansen R. et al., Identification of genes that are associated with DNA repeats in prokaryotes // Molecular Microbiology, 2002, Mar, 43 (6): 1565-1575.

Lino C. A. et al., Delivering CRISPR: a review of the challenges and approaches // Drug Deliv., 2018, Nov, 25 (1): 1234-1257.

Liu C. et al., Delivery strategies of the CRISPR-Cas9 gene-editing system for therapeutic applications // J. Control. Release, 2017, Nov 28, 266: 17-26.

Mojica F. J. M. et al., Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements // Journal of Molecular Evolution, 2005, 60 (2): 174-182.

Sambrook J. et al., Molecular Cloning: A Laboratory Manual, 1989, CSH Press, pp. 15.3-15.108.

Shen B. et al., Generation of gene-modified mice via Cas9/RNA-mediated gene targeting // Cell Res., 2013, May, 23 (5): 720-723.


<110> Federal State Budgetary Institution of Science Institute of Gene Biology, Russian Academy of Sciences

<120> Tool for cutting double-stranded DNA using Cas12d protein from Katanobacteria and hybrid RNA produced by fusion of guide CRISPR RNA and scout RNA

<160> 17

<210> 1

<211> 1125

<212> PRT

<213> Katanobacteria

<400> 1

Met Arg Lys Lys Leu Phe Lys Gly Tyr Ile Leu His Asn Lys Arg Leu 16

5 10 15

Val Tyr Thr Gly Lys Ala Ala Ile Arg Ser Ile Lys Tyr Pro Leu Val 32

20 25 30

Ala Pro Asn Lys Thr Ala Leu Asn Asn Leu Ser Glu Lys Ile Ile Tyr 48

35 40 45

Asp Tyr Glu His Leu Phe Gly Pro Leu Asn Val Ala Ser Tyr Ala Arg 64

50 55 60

Asn Ser Asn Arg Tyr Ser Leu Val Asp Phe Trp Ile Asp Ser Leu Arg 80

65 70 75 80

Ala Gly Val Ile Trp Gln Ser Lys Ser Thr Ser Leu Ile Asp Leu Ile 96

85 90 95

Ser Lys Leu Glu Gly Ser Lys Ser Pro Ser Glu Lys Ile Phe Glu Gln 112

100 105 110

Ile Asp Phe Glu Leu Lys Asn Lys Leu Asp Lys Glu Gln Phe Lys Asp 128

115 120 125

Ile Ile Leu Leu Asn Thr Gly Ile Arg Ser Ser Ser Asn Val Arg Ser 144

130 135 140

Leu Arg Gly Arg Phe Leu Lys Cys Phe Lys Glu Glu Phe Arg Asp Thr 160

145 150 155 160

Glu Glu Val Ile Ala Cys Val Asp Lys Trp Ser Lys Asp Leu Ile Val 176

165 170 175

Glu Gly Lys Ser Ile Leu Val Ser Lys Gln Phe Leu Tyr Trp Glu Glu 192

180 185 190

Glu Phe Gly Ile Lys Ile Phe Pro His Phe Lys Asp Asn His Asp Leu 208

195 200 205

Pro Lys Leu Thr Phe Phe Val Glu Pro Ser Leu Glu Phe Ser Pro His 224

210 215 220

Leu Pro Leu Ala Asn Cys Leu Glu Arg Leu Lys Lys Phe Asp Ile Ser 240

225 230 235 240

Arg Glu Ser Leu Leu Gly Leu Asp Asn Asn Phe Ser Ala Phe Ser Asn 256

245 250 255

Tyr Phe Asn Glu Leu Phe Asn Leu Leu Ser Arg Gly Glu Ile Lys Lys 272

260 265 270

Ile Val Thr Ala Val Leu Ala Val Ser Lys Ser Trp Glu Asn Glu Pro 288

275 280 285

Glu Leu Glu Lys Arg Leu His Phe Leu Ser Glu Lys Ala Lys Leu Leu 304

290 295 300

Gly Tyr Pro Lys Leu Thr Ser Ser Trp Ala Asp Tyr Arg Met Ile Ile 320

305 310 315 320

Gly Gly Lys Ile Lys Ser Trp His Ser Asn Tyr Thr Glu Gln Leu Ile 336

325 330 335

Lys Val Arg Glu Asp Leu Lys Lys His Gln Ile Ala Leu Asp Lys Leu 352

340 345 350

Gln Glu Asp Leu Lys Lys Val Val Asp Ser Ser Leu Arg Glu Gln Ile 368

355 360 365

Glu Ala Gln Arg Glu Ala Leu Leu Pro Leu Leu Asp Thr Met Leu Lys 384

370 375 380

Glu Lys Asp Phe Ser Asp Asp Leu Glu Leu Tyr Arg Phe Ile Leu Ser 400

385 390 395 400

Asp Phe Lys Ser Leu Leu Asn Gly Ser Tyr Gln Arg Tyr Ile Gln Thr 416

405 410 415

Glu Glu Glu Arg Lys Glu Asp Arg Asp Val Thr Lys Lys Tyr Lys Asp 432

420 425 430

Leu Tyr Ser Asn Leu Arg Asn Ile Pro Arg Phe Phe Gly Glu Ser Lys 448

435 440 445

Lys Glu Gln Phe Asn Lys Phe Ile Asn Lys Ser Leu Pro Thr Ile Asp 464

450 455 460

Val Gly Leu Lys Ile Leu Glu Asp Ile Arg Asn Ala Leu Glu Thr Val 480

465 470 475 480

Ser Val Arg Lys Pro Pro Ser Ile Thr Glu Glu Tyr Val Thr Lys Gln 496

485 490 495

Leu Glu Lys Leu Ser Arg Lys Tyr Lys Ile Asn Ala Phe Asn Ser Asn 512

500 505 510

Arg Phe Lys Gln Ile Thr Glu Gln Val Leu Arg Lys Tyr Asn Asn Gly 528

515 520 525

Glu Leu Pro Lys Ile Ser Glu Val Phe Tyr Arg Tyr Pro Arg Glu Ser 544

530 535 540

His Val Ala Ile Arg Ile Leu Pro Val Lys Ile Ser Asn Pro Arg Lys 560

545 550 555 560

Asp Ile Ser Tyr Leu Leu Asp Lys Tyr Gln Ile Ser Pro Asp Trp Lys 576

565 570 575

Asn Ser Asn Pro Gly Glu Val Val Asp Leu Ile Glu Ile Tyr Lys Leu 592

580 585 590

Thr Leu Gly Trp Leu Leu Ser Cys Asn Lys Asp Phe Ser Met Asp Phe 608

595 600 605

Ser Ser Tyr Asp Leu Lys Leu Phe Pro Glu Ala Ala Ser Leu Ile Lys 624

610 615 620

Asn Phe Gly Ser Cys Leu Ser Gly Tyr Tyr Leu Ser Lys Met Ile Phe 640

625 630 635 640

Asn Cys Ile Thr Ser Glu Ile Lys Gly Met Ile Thr Leu Tyr Thr Arg 656

645 650 655

Asp Lys Phe Val Val Arg Tyr Val Thr Gln Met Ile Gly Ser Asn Gln 672

660 665 670

Lys Phe Pro Leu Leu Cys Leu Val Gly Glu Lys Gln Thr Lys Asn Phe 688

675 680 685

Ser Arg Asn Trp Gly Val Leu Ile Glu Glu Lys Gly Asp Leu Gly Glu 704

690 695 700

Glu Lys Asn Gln Glu Lys Cys Leu Ile Phe Lys Asp Lys Thr Asp Phe 720

705 710 715 720

Ala Lys Ala Lys Glu Val Glu Ile Phe Lys Asn Asn Ile Trp Arg Ile 736

725 730 735

Arg Thr Ser Lys Tyr Gln Ile Gln Phe Leu Asn Arg Leu Phe Lys Lys 752

740 745 750

Thr Lys Glu Trp Asp Leu Met Asn Leu Val Leu Ser Glu Pro Ser Leu 768

755 760 765

Val Leu Glu Glu Glu Trp Gly Val Ser Trp Asp Lys Asp Lys Leu Leu 784

770 775 780

Pro Leu Leu Lys Lys Glu Lys Ser Cys Glu Glu Arg Leu Tyr Tyr Ser 800

785 790 795 800

Leu Pro Leu Asn Leu Val Pro Ala Thr Asp Tyr Lys Glu Gln Ser Ala 816

805 810 815

Glu Ile Glu Gln Arg Asn Thr Tyr Leu Gly Leu Asp Val Gly Glu Phe 832

820 825 830

Gly Val Ala Tyr Ala Val Val Arg Ile Val Arg Asp Arg Ile Glu Leu 848

835 840 845

Leu Ser Trp Gly Phe Leu Lys Asp Pro Ala Leu Arg Lys Ile Arg Glu 864

850 855 860

Arg Val Gln Asp Met Lys Lys Lys Gln Val Met Ala Val Phe Ser Ser 880

865 870 875 880

Ser Ser Thr Ala Val Ala Arg Val Arg Glu Met Ala Ile His Ser Leu 896

885 890 895

Arg Asn Gln Ile His Ser Ile Ala Leu Ala Tyr Lys Ala Lys Ile Ile 912

900 905 910

Tyr Glu Ile Ser Ile Ser Asn Phe Glu Thr Gly Gly Asn Arg Met Ala 928

915 920 925

Lys Ile Tyr Arg Ser Ile Lys Val Ser Asp Val Tyr Arg Glu Ser Gly 944

930 935 940

Ala Asp Thr Leu Val Ser Glu Met Ile Trp Gly Lys Lys Asn Lys Gln 960

945 950 955 960

Met Gly Asn His Ile Ser Ser Tyr Ala Thr Ser Tyr Thr Cys Cys Asn 976

965 970 975

Cys Ala Arg Thr Pro Phe Glu Leu Val Ile Asp Asn Asp Lys Glu Tyr 992

980 985 990

Glu Lys Gly Gly Asp Glu Phe Ile Phe Asn Val Gly Asp Glu Lys Lys 1008

995 1000 1005

Val Arg Gly Phe Leu Gln Lys Ser Leu Leu Gly Lys Thr Ile Lys Gly 1024

1010 1015 1020

Lys Glu Val Leu Lys Ser Ile Lys Glu Tyr Ala Arg Pro Pro Ile Arg 1040

1025 1030 1035 1040

Glu Val Leu Leu Glu Gly Glu Asp Val Glu Gln Leu Leu Lys Arg Arg 1056

1045 1050 1055

Gly Asn Ser Tyr Ile Tyr Arg Cys Pro Phe Cys Gly Tyr Lys Thr Asp 1072

1060 1065 1070

Ala Asp Ile Gln Ala Ala Leu Asn Ile Ala Cys Arg Gly Tyr Ile Ser 1088

1075 1080 1085

Asp Asn Ala Lys Asp Ala Val Lys Glu Gly Glu Arg Lys Leu Asp Tyr 1104

1090 1095 1100

Ile Leu Glu Val Arg Lys Leu Trp Glu Lys Asn Gly Ala Val Leu Arg 1120

1105 1110 1115 1120

Ser Ala Lys Phe Leu 1125

1125

<210> 2

<211> 59

<212> RNA

<213> artificial sequence

<220>

<223> KbCas12d scout RNA

<400> 2

aaguaucaaa auaaaaaggg uuuccaguuu uuaacuaaac uuuagccuuc cacccuuuc 59

<210> 3

<211> 44

<212> RNA

<213> artificial sequence

<220>

<223> KbCas12d crRNA, where n is any nucleotide

<400> 3

acuccgaaag uaucggggau aaaggcnnnn nnnnnnnnnn nnnn 44

<210> 4

<211> 916

<212> DNA

<213> artificial sequence

<220>

<223> linear DNA fragment

<400> 4

ccctgcaaac acaaagaaag agcatgttaa aataggatct acatcacgta acctgtctta 60

gaagaggcta gatactgcaa ttcaaggacc ttatctcctt tcattgagca ccaaacccaa 120

ctccatctac cagcctactc tcttatctct ggtatttgct ctgcagaatg agagaaaatg 180

aaactttcaa aagcctcaga aatccttgaa caaggcaata aaaggtgcta ttgctatagt 240

cattggcagc tacaggcaga gacaaaggag gaaaagaggt tgtgagtggt ccaggtagcc 300

atgcgagtat gcatacacaa atctcctggc cctcctgtta cagcccaccc ttgtactgtt 360

cttgggctga aggaaagcaa ggccagacaa agtgagcaga aaaacgtgct cagcagaggt 420

gagcaacaga acatgggctg gataaactgg atgtgggggg ctataagtac acaagccctg 480

cattcttgct gccttcacct tatgtttgcc tcaatgagga caacagccag aaaattcttt 540

agtaaccttg ttagtatctg gctcttaata ttaaactaca agaacaactg atacatgact 600

agtagtttta aaacattgcc tcaattgatc cttacaatga cccagtaggg aataaaataa 660

gaaaaacatt attatcacca tttttataaa tgttgacgcc aaggctcaga gaagctaagt 720

gttctaagac catgaaccaa tcaattaagt gaaacaaacc tgaacccaga tcttctgact 780

tttgtattcc aatatgtgtt ctattacact acgtggaact gcctctcata taaccaatta 840

ttataaggaa cacatattac tccaatctat ttatacacca aaatacagaa acttaaacaa 900

ataccattag cagctg 916

<210> 5

<211> 43

<212> RNA

<213> artificial sequence

<220>

<223> crRNA

<400> 5

cuccgaaagu aucggggaua aaggcgucau uggcagcuac agg 43

<210> 6

<211> 44

<212> RNA

<213> artificial sequence

<220>

<223> scout RNA

<400> 6

aaaggguuuc caguuuuuaa cuaaacuuua gccuuccacc cuuu 44

<210> 7

<211> 62

<212> RNA

<213> artificial sequence

<220>

<223> scout RNA

<400> 7

caaaauaaaa aggguuucca guuuuuaacu aaacuuuagc cuuccacccu uuccugauuu 60

ug 62

<210> 8

<211> 76

<212> RNA

<213> artificial sequence

<220>

<223> scout RNA

<400> 8

aaguaucaaa auaaaaaggg uuuccaguuu uuaacuaaac uuuagccuuc cacccuuucc 60

ugauuuuguu gauaau 76

<210> 9

<211> 59

<212> RNA

<213> artificial sequence

<220>

<223> scout RNA

<400> 9<400> 9

<210> 10<210> 10

<211> 62<211> 62

<212> РНК<212> RNA

<213> artificial sequence<213> artificial sequence

<220><220>

<223> scout РНК<223> scout RNA

<400> 10<400> 10

auuaucaaca aaaucaggaa aggguggaag gcuaaaguuu aguuaaaaac uggaaacccu 60auuaucaaca aaaucaggaa aggguggaag gcuaaaguuu aguuaaaaac uggaaacccu 60

uu 62uu 62

<210> 11<210> 11

<211> 91<211> 91

<212> РНК<212> RNA

<213> artificial sequence<213> artificial sequence

<220><220>

<223> sgРНК<223> sgRNA

<400> 11<400> 11

aaaggguuuc caguuuuuaa cuaaacuuua gccuuccacc cuuugaaacu ccgaaaguau 60aaaggguuuc caguuuuuaa cuaaacuuua gccuuccacc cuuugaaacu ccgaaaguau 60

cggggauaaa ggcaucaaua ccaaacucug g 91cggggauaaa ggcaucaua ccaaacucug g 91

<210> 12

<211> 95

<212> RNA

<213> artificial sequence

<220>

<223> sgRNA; the spacer region (nt 78 to 95,

sequence aucaauaccaaacucugg) and the linker region (nt 45 to 52,

sequence gaaagaaa) may be variable

<400> 12

aaaggguuuc caguuuuuaa cuaaacuuua gccuuccacc cuuugaaaga aacuccgaaa 60

guaucgggga uaaaggcauc aauaccaaac ucugg 95

<210> 13

<211> 99

<212> RNA

<213> artificial sequence

<220>

<223> sgRNA

<400> 13

aaaggguuuc caguuuuuaa cuaaacuuua gccuuccacc cuuugaaaga aagaaacucc 60

gaaaguaucg gggauaaagg caucaauacc aaacucugg 99

<210> 14

<211> 1612

<212> DNA

<213> H. sapiens

<223> grin2b gene fragment

<400> 14

gagagagatg gccaaggctt atattctata gagcattatg tccttagttt gatgcataga 60

ataagattta gggtcatatg tggaagtaaa aaggaaggag ttctttgtag gtaaaaggtg 120

gcaaattata tgaaaatacg gtatcagtca ttttagggaa gtcacgacta taggatggca 180

tcagaccttt tattgccttg ttcagaaaaa aaaaggaaca tttttcaaat gtggctctaa 240

cattacttca gctgctaatg gtatttgttt aagtttctgt attttggtgt ataaatagat 300

tggagtaata tgtgttcctt ataataattg gttatatgag aggcagttcc acgtagtgta 360

atagaacaca tattggaata caaaagtcag aagatctggg ttcaggtttg tttcacttaa 420

ttgattggtt catggtctta gaacacttag cttctctgag ccttggcgtc aacatttata 480

aaaatggtga taataatgtt tttcttattt tattccctac tgggtcattg taaggatcaa 540

ttgaggcaat gttttaaaac tactagtcat gtatcagttg ttcttgtagt ttaatattaa 600

gagccagata ctaacaaggt tactaaagaa ttttctggct gttgtcctca ttgaggcaaa 660

cataaggtga aggcagcaag aatgcagggc ttgtgtactt atagcccccc acatccagtt 720

tatccagccc atgttctgtt gctcacctct gctgagcacg tttttctgct cactttgtct 780

ggccttgctt tccttcagcc caagaacagt acaagggtgg gctgtaacag gagggccagg 840

agatttgtgt atgcatactc gcatggctac ctggaccact cacaacctct tttcctcctt 900

tgtctctgcc tgtagctgcc aatgactata gcaatagcac cttttattgc cttgttcaag 960

gatttctgag gcttttgaaa gtttcatttt ctctcattct gcagagcaaa taccagagat 1020

aagagagtag gctggtagat ggagttgggt ttggtgctca atgaaaggag ataaggtcct 1080

tgaattgcag tatctagcct cttctaagac aggttacgtg atgtagatcc tattttaaca 1140

tgctctttct ttgtgtttgc agggagtcga cgagttgaag atgaagccca gagcggagtg 1200

ctgttctccc aagttctggt tggtgttggc cgtcctggcc gtgtcaggca gcagagctcg 1260

ttctcagaag agccccccca gcattggcat tgctgtcatc ctcgtgggca cttccgacga 1320

ggtggccatc aaggatgccc acgagaaaga tgatttccac catctctccg tggtaccccg 1380

ggtggaactg gtagccatga atgagaccga cccaaagagc atcatcaccc gcatctgtga 1440

tctcatgtct gaccggaaga tccagggggt ggtgtttgct gatgacacag accaggaagc 1500

catcgcccag atcctcgatt tcatttcagc acagactctc acccccatcc tgggcatcca 1560

cgggggctcc tctatgataa tggcagataa ggtaaaaagg ggctgcaggg ag 1612

<210> 15

<211> 17

<212> PRT

<213> artificial sequence

<220>

<223> nuclease part

<400> 15

Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala 16

1 5 10 15

Ala 17

<210> 16

<211> 16

<212> PRT

<213> artificial sequence

<220>

<223> nuclease part

<400> 16

Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys 16

<210> 17

<211>

<212> DNA

<213> artificial sequence

<220>

<223> sgRNA expression cassette

<400> 17

gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60

ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120

aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180

atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240

cgaaacaccg aaagggtttc cagtttttaa ctaaacttta gccttccacc ctttgaaaga 300

aactccgaaa gtatcgggga taaaggcaaa taggatctac atcac 345

<---

Claims

1. A hybrid RNA (sgRNA) for use as a guide RNA in the Cas12d system for editing genomic DNA, which is a nucleic acid molecule obtained by fusing the nucleotide sequences of a scout RNA and a crRNA and has the general formula A-B-C-D, where A is the scout RNA sequence of KbCas12d (the CRISPR-Cas12d system from Katanobacteria); B is a linker sequence; C is the direct repeat (DR) sequence of the KbCas12d crRNA; and D is a sequence complementary to the target DNA (the spacer segment); wherein the KbCas12d scout RNA sequence is SEQ ID NO: 6, and the linker and spacer segment sequences may be any sequences.

2. The hybrid RNA according to claim 1, wherein the direct repeat (DR) sequence of the KbCas12d crRNA is the sequence of SEQ ID NO: 3.

3. The hybrid RNA according to claim 1, which has the sequence of SEQ ID NO: 11, SEQ ID NO: 12 or SEQ ID NO: 13, without being limited to these.

4. A DNA cassette for expression of the hybrid RNA according to claim 1, consisting of the nucleotide sequence of the U6 promoter, the nucleotide sequence encoding the hybrid RNA according to claim 1, and a nucleotide sequence that is required for recognition of the target DNA and is flanked at the 5' end by the PAM sequence 5'-TA-3'.

5. A method for altering the genomic DNA sequence of a unicellular or multicellular organism, comprising introducing into at least one cell of this organism an effective amount of:

a) a protein having a sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1, or a nucleic acid encoding a protein having a sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1;

b) the hybrid RNA according to claim 1,

wherein the interaction of said protein with the hybrid RNA and the 5'-TA-3' nucleotide sequence results in a double-strand break in the genomic DNA sequence immediately adjacent to the 5'-TA-3' sequence.

6. The method of claim 5, further comprising introducing an exogenous DNA sequence simultaneously with the hybrid RNA.
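
For illustration only, and not as part of the claims, the general formula A-B-C-D of claim 1 can be sketched as a simple string concatenation. The function name `assemble_sgrna` is hypothetical. The scout RNA is copied from SEQ ID NO: 6. The direct repeat is taken here from nt 53 to 77 of SEQ ID NO: 12 as listed above, on the assumption that this segment is the direct repeat referred to in claim 2 (SEQ ID NO: 3 itself is not reproduced in this part of the listing). The default spacer is the one given in the annotation of SEQ ID NO: 12.

```python
# Illustrative sketch only: assembles a hybrid RNA (sgRNA) as A-B-C-D.
# Sequences are copied from the listing; spaces mirror its 10-nt blocks.

# Part A: scout RNA, SEQ ID NO: 6 (44 nt).
SCOUT = "aaaggguuuc caguuuuuaa cuaaacuuua gccuuccacc cuuu".replace(" ", "")

# Part C: nt 53-77 of SEQ ID NO: 12; ASSUMED to be the crRNA direct repeat.
DIRECT_REPEAT = "cuccgaaagu aucggggaua aaggc".replace(" ", "")

# Part D: example spacer, nt 78-95 of SEQ ID NO: 12.
SPACER = "aucaauaccaaacucugg"

# Reference sequence used to check the assembly (SEQ ID NO: 12, 95 nt).
SEQ_ID_12 = (
    "aaaggguuuc caguuuuuaa cuaaacuuua gccuuccacc cuuugaaaga aacuccgaaa "
    "guaucgggga uaaaggcauc aauaccaaac ucugg"
).replace(" ", "")


def assemble_sgrna(linker: str, spacer: str = SPACER) -> str:
    """Return scout + linker + direct repeat + spacer (hypothetical helper)."""
    for name, part in (("linker", linker), ("spacer", spacer)):
        unexpected = set(part) - set("acgu")
        if unexpected:
            raise ValueError(f"{name} contains non-RNA characters: {sorted(unexpected)}")
    return SCOUT + linker + DIRECT_REPEAT + spacer


if __name__ == "__main__":
    # Linkers of 4, 8 and 12 nt, as read off SEQ ID NO: 11, 12 and 13.
    for linker in ("gaaa", "gaaagaaa", "gaaagaaagaaa"):
        print(len(linker), len(assemble_sgrna(linker)))
    print(assemble_sgrna("gaaagaaa") == SEQ_ID_12)
```

With the three linkers shown, the assembled lengths are 91, 95 and 99 nt, which agree with the lengths declared for SEQ ID NO: 11, 12 and 13; the 8-nt linker reproduces SEQ ID NO: 12 exactly.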