WO2025065873A1 - Artificial gene drive system - Google Patents

Artificial gene drive system

Info

Publication number
WO2025065873A1
Authority
WO
WIPO (PCT)
Prior art keywords
gene
plant
nucleic acid
drive system
plants
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2023/136757
Other languages
English (en)
Chinese (zh)
Inventor
刘洋
焦丙可
钱文峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Genetics and Developmental Biology of CAS
Original Assignee
Institute of Genetics and Developmental Biology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Genetics and Developmental Biology of CAS

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8261 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • C12N15/8271 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
    • C12N15/8274 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for herbicide resistance

Definitions

  • the present invention belongs to the field of biotechnology. Specifically, the present invention relates to an artificial gene drive system. More specifically, the present invention relates to an artificial gene drive system based on a poison-antidote mechanism that can be applied to plants.
  • If the gene drive system carries a cargo [3] internally, population modification can be achieved as the gene drive spreads; if the gene drive system itself is located within an essential fertility-related gene [4,5] and continues to cleave its homologous allele, population suppression can be achieved.
  • If HDR repair is not achieved, the DSB is repaired by non-homologous end joining (NHEJ), which introduces indels and produces resistance alleles that can no longer be targeted and cut by the gRNA.
  • NHEJ repair is particularly common in plants, so sustaining the spread of gene drives in plant populations is a major challenge. Therefore, in the pursuit of more efficient artificial gene drive systems, mechanisms that do not rely on the HDR repair pathway are crucial.
  • The poison-antidote (TA) mechanism, exemplified by the naturally occurring t-haplotype in mice, offers a promising insight.
  • the poison is usually expressed before meiosis and is therefore present in the four gametes formed, interfering with normal gamete formation, while the antidote is activated at the stage after meiosis and can mitigate and neutralize the damage caused by the poison, thereby providing an evolutionary advantage to its carrier.
  • Although these natural poison-antidote systems cannot be directly replicated across species, the emergence of CRISPR-Cas9 provides a more general way to mimic the natural poison-antidote strategy.
  • An essential gene is cut by CRISPR/Cas9 and repaired through the NHEJ pathway so that it becomes a loss-of-function (LOF) allele, which acts as the poison; a recoded sequence of the essential gene that cannot be targeted by the gRNA serves as the antidote and rescues the effect of the poison.
  • The Toxin-Antidote Recessive Embryo (TARE) drive system [10], also known as Cleave and Rescue (ClvR) [9], targets genes essential for zygotic development.
  • This system has achieved a gene drive transmission rate of 88-95% in female heterozygotes of Drosophila melanogaster [10].
  • The efficiency of this system depends on Cas9 activity carried over from the egg cell to the zygote (to cut the WT allele contributed by the father), or requires the target gene to be located on sex chromosomes, which to some extent hinders its application to a wide range of species.
  • TADS Toxin-Antidote Dominant Sperm
  • The TADS drive [12] aims to interfere with genes essential for spermatogenesis.
  • Its drive efficiency is higher, and it circumvents the above-mentioned TARE limitation by destroying only one copy of the target gene, rather than both alleles as required by the TARE system.
  • CAIN CRISPR-Assisted Inheritance utilizing NPG1
  • NPG1 No Pollen Germination 1
  • CAIN heralds the potential for application in various plant species and provides solutions to important challenges - slowing the spread of invasive species by affecting the genetic proportion of sterility genes, and managing weed populations by spreading genes sensitive to certain herbicides, thus leading to a new era of ecological management and sustainable agriculture.
  • Embodiment 1 An artificial gene drive system for a plant, comprising:
  • a first nucleic acid comprising a coding sequence of a gene editing system component, wherein the gene editing system can target a gene, such as the coding sequence of a protein essential for pollen tube development in the plant, and cause the protein essential for pollen tube development to lose function, wherein the coding sequence of the gene editing system component is operably linked to a promoter that mediates specific expression during pollen formation;
  • a second nucleic acid comprising a recoded coding sequence of the pollen tube development essential protein, wherein the recoded sequence encodes the wild type of the pollen tube development essential protein and cannot be targeted by the gene editing system, and is operably linked to the natural promoter of the pollen tube development essential gene;
  • a third nucleic acid comprising a coding sequence for a cargo, e.g., a cargo to be propagated among a population of the plant.
  • Embodiment 2 The artificial gene drive system of embodiment 1, wherein the first nucleic acid, the second nucleic acid and the third nucleic acid are located on the same expression construct.
  • Embodiment 3 An artificial gene drive system of embodiment 1 or 2, wherein the protein essential for pollen tube development is No Pollen Germination 1 (NPG1).
  • Embodiment 4 An artificial gene drive system of embodiment 3, wherein the NPG1 comprises an amino acid sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO:1.
  • Embodiment 5 The artificial gene drive system of embodiment 3, wherein the coding sequence of endogenous NPG1 in the plant comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO:2.
  • Embodiment 6 The artificial gene drive system of embodiment 3, wherein the coding sequence of the recoded NPG1 comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO: 3, and the coding sequence of the recoded NPG1 cannot be targeted by the gene editing system, so that its function is not lost due to the expression of the gene editing system.
  • Embodiment 7 The artificial gene drive system of embodiment 3, wherein the natural promoter of NPG1 comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO:4.
  • Embodiment 8 An artificial gene drive system according to any one of embodiments 1-7, wherein the promoter that mediates specific expression during pollen formation is the promoter of the DMC1 (Disruption of Meiotic Control 1) gene.
  • Embodiment 9 The artificial gene drive system of embodiment 8, wherein the DMC1 promoter comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO:5.
  • Embodiment 10 An artificial gene drive system according to any one of embodiments 1-7, wherein the promoter that mediates specific expression during pollen formation is the promoter of the TPD1 (Tapetum Determinant 1) gene.
  • Embodiment 11 The artificial gene drive system of embodiment 10, wherein the TPD1 promoter comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO:6.
  • Embodiment 12 An artificial gene drive system according to any one of embodiments 1 to 11, wherein the gene editing system is selected from a gene editing system based on CRISPR, ZFN or TALEN, and preferably, the gene editing system is a gene editing system based on CRISPR.
  • Embodiment 13 The artificial gene drive system of embodiment 12, wherein the CRISPR gene editing system comprises a CRISPR nuclease and at least one guide RNA, preferably, the CRISPR nuclease is a Cas9 nuclease.
  • Embodiment 14 The artificial gene drive system of embodiment 13, wherein the coding sequence of the CRISPR nuclease is operably linked to the promoter that mediates specific expression during pollen formation, preferably, is operably linked to the TPD1 promoter.
  • Embodiment 15 The artificial gene drive system of embodiment 13, wherein the gene editing system comprises a Cas9 nuclease and at least one gRNA targeting endogenous NPG1.
  • Embodiment 16 An artificial gene drive system of Embodiment 15, wherein at least one gRNA targeting endogenous NPG1 targets a nucleotide sequence selected from any one of SEQ ID NO:7-10.
  • Embodiment 17 An artificial gene drive system according to any one of embodiments 1 to 16, wherein the expression of the cargo is harmful or beneficial to the plant when the plant is exposed to specific compounds or conditions, for example, the cargo is a herbicide-sensitive gene, a gene that destroys herbicide resistance, a gene that improves environmental adaptability, or a gene that improves disease resistance.
  • Embodiment 18 A method for producing a modified plant for gene drive modification of a plant population, the method comprising introducing the artificial gene drive system of any one of embodiments 1-17 into at least one plant, thereby obtaining at least one modified plant, whose genome is integrated with the first nucleic acid, the second nucleic acid and the third nucleic acid.
  • Embodiment 19 The method of embodiment 18, wherein the first nucleic acid, the second nucleic acid, and the third nucleic acid integrated into the genome of the modified plant are closely linked, such as located at the same locus.
  • Embodiment 20 Use of a modified plant for gene drive modification of a plant population, wherein the modified plant is obtained by the method of embodiment 18 or 19 or the modified plant is introduced with the artificial gene drive system for a plant according to any one of embodiments 1 to 17, so that the first nucleic acid, the second nucleic acid and the third nucleic acid are integrated into its genome.
  • Embodiment 21 A modified plant for gene drive modification of a plant population, wherein the modified plant is obtained by the method of embodiment 18 or 19 or the modified plant is introduced with the artificial gene drive system for plants according to any one of embodiments 1-17, so that the first nucleic acid, the second nucleic acid and the third nucleic acid are integrated into its genome.
  • Embodiment 22 A method for modifying a plant population by gene drive, the method comprising placing at least one modified plant of embodiment 21 into the plant population, and allowing the at least one modified plant to hybridize with other plants in the plant population.
  • Embodiment 23 The method of embodiment 22, wherein the method allows the offspring of the cross between the at least one modified plant and other plants in the plant population to cross with other plants and/or offspring in the population.
  • Fig. 2 Transmission rate of CAIN gene drive from T1 to F1 generation in testcross.
  • a Transformation to obtain T1 plants and subsequent hybridization test steps.
  • Transgenic T1 was obtained by transformation of plants with Agrobacterium carrying a control vector (FAST only) or one of the gene drive vectors (DMC-CAIN and TPD-CAIN).
  • F1 generation was obtained with T1 plants with single-site insertion as male parent and wild-type Col-0 as female parent.
  • b Transmission efficiency of CAIN gene drive is the proportion of FAST+F1 seeds to all F1 seeds. Each red dot represents the transmission efficiency in a single silique.
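The transmission efficiency defined in panel b is a simple proportion, and it can be computed per silique or pooled over all siliques. The sketch below is only an illustration: the function name and the per-silique counts are hypothetical and are not data from the application.

```python
def transmission_efficiency(fast_positive: int, total: int) -> float:
    """Proportion of FAST+ (red-fluorescent) F1 seeds among all F1 seeds."""
    if total <= 0:
        raise ValueError("total seed count must be positive")
    if not 0 <= fast_positive <= total:
        raise ValueError("fast_positive must lie between 0 and total")
    return fast_positive / total


# Hypothetical per-silique counts: (FAST+ seeds, all seeds).
siliques = [(45, 48), (50, 52), (38, 41)]

# One value per silique (one dot per silique in a plot such as Fig. 2b).
per_silique = [transmission_efficiency(p, n) for p, n in siliques]

# Pooled estimate over all seeds counted.
pooled = transmission_efficiency(
    sum(p for p, _ in siliques),
    sum(n for _, n in siliques),
)

if __name__ == "__main__":
    for (p, n), eff in zip(siliques, per_silique):
        print(f"{p}/{n} FAST+ seeds: {eff:.1%}")
    print(f"pooled: {pooled:.1%}")
```

Reporting both the per-silique values and the pooled value makes it easier to see whether a high average hides a few siliques with low transmission.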
  • Fig. 3 Genotypes of FAST+F1 plants at the NPG1 locus in the TPD-CAIN experiment.
  • a Schematic diagram of genotype identification of F1 somatic tissues (rosette leaves and inflorescences).
  • b Summary of genotype results of 16 FAST+F1 plants at four gRNA target sites.
  • the symbols "+" and "-" before the numerical value and base indicate insertion and deletion events, respectively.
  • for indels longer than 2 nucleotides, the numerical value after the symbol indicates the number of nucleotides in the indel.
  • the symbol "A>C" represents a base substitution from adenine (A) to cytosine (C).
  • c Genotype results at the gRNA11 target site based on Illumina sequencing.
  • Fig. 4 Transmission efficiency of TPD-CAIN from F1 to F2 generation in backcrossing. Average transmission efficiency of TPD-CAIN in the progeny when FAST+F1 plants were used as male (a) or female (b) parents.
  • FIG. 1 Transmission rate of DMC-CAIN from F1 generation to F2 generation.
  • a Genotype identification of the inflorescence of F1 plants.
  • b Summary of genotypes of 12 FAST+F1 plants at four gRNA target sites.
  • c Average transmission efficiency of DMC-CAIN in F2 seeds produced by FAST+F1 as the male parent.
  • Genotypes of FAST-F1 and FAST+/-F2 plants at the NPG1 locus a. Genotype identification of the leaf part of FAST-F1 plants produced by T1 plants carrying TPD-CAIN as the male parent. b. Summary of the genotypes of 11 FAST-F1 plants at four gRNA target sites. The mechanism corresponding to each F1 plant (incomplete cutting efficiency or incomplete penetrance) is marked below by a check mark. c. Summary of the genotypes of FAST+ and FAST-F2 plants at four gRNA target sites based on Sanger sequencing. The F2 plant is produced by reciprocal crosses between F1 carrying TPD-CAIN (TPD-CAIN/+) and the wild type (+/+).
  • FIG. 7 Dynamic simulation of the spread of modification-type and suppression-type CAIN gene drives.
  • a Computational simulations showing the effects of male germ cell cleavage efficiency (empirical value: 98.4%; artificial settings: 50.0% and 100.0%) and penetrance (empirical value: 96.0%; artificial settings: 50.0% and 100.0%) on CAIN propagation dynamics.
  • b Computational simulation of the spread dynamics of homing, TARE and CAIN drives, with an initial release ratio of 1%. The cutting efficiency of homing and TARE was set to the maximum value. For CAIN, the penetrance was set to the empirical value of 96.0%.
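To give a feel for how cleavage efficiency and penetrance could enter such a simulation, here is a minimal sketch of the expected drive transmission through the pollen of a heterozygous plant. It is not the simulation used in the application. The model is an assumption made for illustration: half of the pollen carries the drive and is always rescued, while pollen without the drive fails only if its NPG1 copy was cut (probability `cut`) and the resulting defect is penetrant (probability `penetrance`). The function and parameter names are hypothetical.

```python
def male_transmission(cut: float, penetrance: float) -> float:
    """Expected fraction of successful pollen that carries the drive.

    Assumed toy model (not taken from the application):
    - 50% of pollen from a drive/+ plant carries the drive and is rescued;
    - the other 50% is eliminated with probability cut * penetrance.
    """
    for name, value in (("cut", cut), ("penetrance", penetrance)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    drive_pollen = 0.5
    surviving_non_drive_pollen = 0.5 * (1.0 - cut * penetrance)
    return drive_pollen / (drive_pollen + surviving_non_drive_pollen)


if __name__ == "__main__":
    # Parameter values quoted in the figure caption.
    for cut in (0.5, 0.984, 1.0):
        for penetrance in (0.5, 0.96, 1.0):
            rate = male_transmission(cut, penetrance)
            print(f"cut={cut:.3f} penetrance={penetrance:.2f} -> {rate:.3f}")
```

Under this assumption the transmission rate is Mendelian (0.5) when nothing is cut and reaches 1.0 only when both parameters equal 1, which is why both appear as separate settings in panel a. A full population simulation would also have to track disrupted NPG1 alleles inherited through the female side, which this sketch omits.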
  • Figure 8 Four gRNAs involved in CAIN gene drive.
  • a The CAIN vector contains four tandem gRNAs. Using synonymous codons, the NPG1 sequence was changed without changing the encoded amino acids to give Recoded NPG1; the mutated nucleotides are marked with red boxes.
  • b The positions of the four gRNA target sequences on the genomic sequence are displayed.
  • the primers used for genotyping have been marked. First, the primer pairs NPG-gDNA-F1 and NPG-gDNA-R1_2 were used to amplify the genomic region covering the four target sites, and the PCR products were Sanger sequenced.
  • the primers used for sequencing are: NPG-gDNA-F1, NPG-gDNA-F2, NPG-gDNA-R1_1 and NPG-gDNA-R2.
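The recoding principle described for Figure 8 can be checked mechanically: a recoded sequence must translate to the same protein while differing from the native sequence at the nucleotide level. The sketch below is illustrative only. It uses the standard genetic code, and the short sequences are made-up examples, not NPG1 or any SEQ ID NO from the application.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_synonymous_recoding(native: str, recoded: str) -> bool:
    """True if both sequences have the same length and encode the same protein."""
    return len(native) == len(recoded) and translate(native) == translate(recoded)


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


# Made-up example: both sequences encode M-A-L-S-G.
native = "ATGGCTCTTAGCGGA"
recoded = "ATGGCACTGTCTGGT"

if __name__ == "__main__":
    print(translate(native), translate(recoded))
    print(is_synonymous_recoding(native, recoded), mismatches(native, recoded))
```

How many mismatches within a gRNA target (and whether the PAM is affected) are needed to prevent cutting is an experimental question that this check does not answer.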
  • FIG. 9 CAIN gene drive vector map.
  • the vector maps driven by the control vector FAST only (a), DMC-CAIN (b) and TPD-CAIN (c) are shown, with the total sequence length and main features marked.
  • Step 1 Transform Arabidopsis plants of the Col-0 background with the control vector (FAST only) or with the DMC-CAIN or TPD-CAIN gene drive vector. Select successfully transformed seeds directly by the FAST phenotype (red fluorescence). Screen for T1 plants with a single-site insertion by TAIL-PCR and whole genome sequencing.
  • Step 2 Cross the above T1 plants as the male parent with Col-0 as the female parent. The percentage of FAST+ seeds among the F1 seeds is used as the transmission rate (drive%) of the CAIN gene drive.
  • Step 3 Plant F1 seeds to obtain F1 plants. Genotype each F1 plant as described in the method section.
  • Step 4 Cross F1 plants of known genotype, as male or female parent, with Col-0 plants. Count the drive transmission rate in the F2 seeds in the same way.
  • Step 5 Plant F2 seeds to obtain F2 plants. Genotype the F2 plants in the same way.
  • Figure 11 Types of mutations detected in FAST+F1 plants.
  • the figure illustrates the insertions, deletions, and single nucleotide polymorphisms generated at the target sites of (a) gRNA2 and (b) gRNA11 and their locations. * indicates possible alignments, as the underlined bases can be located on either side of the deletion.
  • Reversible TPD-CAIN gene drive. In the new version of the gene drive, TPD-CAIN n+1, gRNAs n+1 are designed as a new poison targeting a new site in NPG1, while recoded n+1, which is resistant to both gRNAs n and gRNAs n+1 (cannot be targeted for cutting), is used as the antidote.
  • the new poison destroys both NPG1 and recoded n in the genome, so it can only be rescued by recoded n+1.
  • When the new version TPD-CAIN n+1 is at a position homologous to that of the old version TPD-CAIN n, the new version will eliminate and replace the old version.
  • the new cargo n+1 linked to it spreads as well.
  • Figure 13 Effects of male germ cell cutting efficiency, incomplete penetrance, and female germ cell cutting efficiency on the TPD-CAIN system.
  • a Estimation of male germ cell cutting efficiency and penetrance.
  • In the F1 progeny, 94.3% (526/558) of the seeds were FAST+ (TPD-CAIN/+), and all were NPG1 - genotypes at the gRNA11 target site.
  • 2.6% (5.7% × 5/11) were genotype +/+; NPG1 +/- , and 3.1% (5.7% × 6/11) were genotype +/+; NPG1 +/+ plants.
  • the cleavage efficiency r was calculated based on the NPG1 genotype at its gRNA11 target site. Since only one of the 34 FAST-F2 plants was a +/+; NPG1 +/+ genotype, r was estimated to be 94.1%.
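The F1 proportions quoted above follow from the seed counts (526 FAST+ of 558 seeds) and from the 5:6 split among the 11 genotyped FAST- F1 plants. The short check below reproduces that arithmetic with exact fractions; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Seed counts quoted in the text.
fast_positive = Fraction(526, 558)   # FAST+ (TPD-CAIN/+) seeds
fast_negative = 1 - fast_positive    # remaining FAST- seeds

# Split of the FAST- class according to the 11 genotyped FAST- F1 plants.
npg1_het = fast_negative * Fraction(5, 11)   # +/+; NPG1 +/-
npg1_wt = fast_negative * Fraction(6, 11)    # +/+; NPG1 +/+

if __name__ == "__main__":
    for label, value in (
        ("FAST+", fast_positive),
        ("FAST-", fast_negative),
        ("+/+; NPG1 +/-", npg1_het),
        ("+/+; NPG1 +/+", npg1_wt),
    ):
        print(f"{label}: {float(value):.1%}")
```

Using `Fraction` keeps the three classes summing to exactly 1, so any rounding appears only when the percentages are printed.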
  • TPD-CAIN has two potential applications. a. Improving plant adaptability: By spreading genes that promote the adaptability of specific endangered species to their environment, TPD-CAIN can achieve rapid genetic rescue and make the target species more adaptable to their living environment. b. Weed management: By spreading genes that confer herbicide sensitivity on target weeds, combined with subsequent local application of herbicides, TPD-CAIN can achieve efficient area-wide weed management.
  • the term “and/or” encompasses all combinations of items connected by the term, and each combination should be considered to have been listed separately herein.
  • “A and/or B” encompasses “A,” “A and B,” and “B.”
  • “A, B, and/or C” encompasses “A,” “B,” “C,” “A and B,” “A and C,” “B and C,” and “A and B and C.”
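Read this way, "and/or" denotes every non-empty combination of the listed items. A small sketch that enumerates these combinations (the function name is hypothetical):

```python
from itertools import combinations


def and_or(items):
    """Return every non-empty combination of the items, smallest first."""
    items = list(items)
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


if __name__ == "__main__":
    for combo in and_or(["A", "B", "C"]):
        print(" and ".join(combo))
```

For n items this yields 2^n - 1 combinations, which matches the three cases listed for two items and the seven cases listed for three.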
  • the protein or nucleic acid may consist of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still have the activity described in the present invention.
  • the methionine encoded by the start codon at the N-terminus of the polypeptide may be retained in certain practical situations (for example, when expressed in a specific expression system), but it does not substantially affect the function of the polypeptide.
  • Exogenous with respect to a sequence refers to a sequence that is from a foreign species, or, if from the same species, a sequence that has been significantly altered in composition and/or locus from its native form through deliberate human intervention.
  • "Nucleic acid sequence" and related terms are used interchangeably and refer to single-stranded or double-stranded RNA or DNA polymers that optionally may contain synthetic, non-natural, or altered nucleotide bases.
  • Nucleotides are referred to by their single-letter names as follows: “A” is adenosine or deoxyadenosine (RNA or DNA, respectively), “C” is cytidine or deoxycytidine, “G” is guanosine or deoxyguanosine, “U” is uridine, “T” is deoxythymidine, “R” is a purine (A or G), “Y” is a pyrimidine (C or T), “K” is G or T, “H” is A or C or T, “D” is A, T or G, “I” is inosine, and “N” is any nucleotide.
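The ambiguity codes in this definition can be represented as a lookup table and used to expand a degenerate DNA sequence into the concrete sequences it stands for. The sketch below includes only the DNA codes listed in the text; "U" and "I" (inosine) are left out because they do not denote a set of DNA bases. The names are chosen for illustration.

```python
from itertools import product

# Ambiguity codes as listed in the definition above (DNA letters only).
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "D": "AGT",
    "N": "ACGT",  # any nucleotide
}


def expand(degenerate: str) -> list[str]:
    """List every concrete DNA sequence matched by a degenerate sequence."""
    try:
        choices = [AMBIGUITY[base] for base in degenerate.upper()]
    except KeyError as err:
        raise ValueError(f"unsupported nucleotide code: {err.args[0]}") from None
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    print(expand("RY"))
    print(len(expand("NH")))
```

The number of expansions grows multiplicatively with each ambiguous position, so this is practical only for short sequences such as primer or target-site motifs.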
  • Codon optimization refers to a method of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon of the native sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with codons that are more frequently or most frequently used in the genes of the host cell while maintaining the native amino acid sequence.
  • codon bias differences in codon usage between organisms
  • mRNA messenger RNA
  • tRNA transfer RNA
  • Codon usage tables are readily available, for example, in the Codon Usage Database available at www.kazusa.or.jp/codon/, and these tables can be adapted for use in different ways. See Nakamura, Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res., 28:292 (2000).
  • "Polypeptide", "peptide", and "protein" are used interchangeably herein to refer to polymers of amino acid residues.
  • the term applies to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers.
  • the terms “polypeptide”, “peptide”, “amino acid sequence” and “protein” may also include modified forms, including but not limited to glycosylation, lipid attachment, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
  • Sequence "identity" has an art-recognized meaning, and the percentage of sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the entire length of a polynucleotide or polypeptide or along a region of the molecule.
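As a concrete illustration of a percent-identity calculation, the sketch below counts identical columns in two sequences that have already been aligned to the same length, with "-" marking gaps. This is one common convention and an assumption made here, not necessarily the one intended in the application; published tools differ in how they treat gaps and which length they use as the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns with identical residues.

    Assumptions of this sketch: the inputs are pre-aligned and of equal
    length, columns where both sequences have a gap are ignored, and a
    residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
    print(percent_identity("AC-T", "ACGT"))
```

A value computed this way can be compared directly against thresholds such as "at least 90%", provided the same gap and length conventions are used throughout.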
  • Suitable conservative amino acid substitutions are known to those skilled in the art and can generally be made without changing the biological activity of the resulting molecule.
  • those skilled in the art recognize that single amino acid substitutions in non-essential regions of a polypeptide do not substantially change the biological activity (see, e.g., Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub.co., p. 224).
  • "Expression construct" refers to a vector such as a recombinant vector suitable for expressing a nucleotide sequence of interest in an organism. "Expression" refers to the production of a functional product.
  • expression of a nucleotide sequence can refer to the transcription of the nucleotide sequence (such as transcription to generate mRNA or functional RNA) and/or the translation of RNA into a precursor or mature protein.
  • the "expression construct" of the present invention can be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, can be a translatable RNA (such as mRNA), for example, RNA generated by in vitro transcription.
  • An "expression construct" of the present invention may comprise regulatory sequences and a nucleotide sequence of interest from different sources, or regulatory sequences and a nucleotide sequence of interest from the same source but arranged in a manner different from that normally found in nature.
  • "Regulatory sequence" and "regulatory element" are used interchangeably and refer to nucleotide sequences located upstream (5' non-coding sequence), in the middle or downstream (3' non-coding sequence) of a coding sequence and affecting the transcription, RNA processing or stability or translation of the relevant coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns and polyadenylation recognition sequences.
  • "Promoter" refers to a nucleic acid fragment that can control the transcription of another nucleic acid fragment.
  • a promoter can control the transcription of a gene in a cell, whether or not it is derived from that cell.
  • a promoter can be a constitutive promoter or a tissue-specific promoter or a developmentally regulated promoter or an inducible promoter.
  • "Operably linked" refers to the connection of a regulatory element (e.g., but not limited to, a promoter sequence, a transcription termination sequence, etc.) to a nucleic acid sequence (e.g., a coding sequence or an open reading frame) such that transcription of the nucleotide sequence is controlled and regulated by the transcription regulatory element.
  • "Introducing" a nucleic acid molecule (e.g., a plasmid, a linear nucleic acid fragment, RNA, etc.) or a protein into an organism means transforming the cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
  • Transformation includes stable transformation and transient transformation.
  • Stable transformation refers to the introduction of an exogenous nucleotide sequence into a genome, resulting in stable inheritance of the exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
  • Transient transformation refers to the introduction of a nucleic acid molecule or protein into a cell to perform its function without the foreign gene being stably inherited. In transient transformation, the foreign nucleic acid sequence is not integrated into the genome.
  • Plant “progeny” include any subsequent generations of the plant.
  • the present invention provides an artificial gene drive system for a plant, comprising:
  • a first nucleic acid comprising a coding sequence of a gene editing system component, wherein the gene editing system can target a gene encoding a protein essential for pollen tube development in the plant and cause the loss of function of the protein essential for pollen tube development, wherein the coding sequence of the gene editing system component is operably linked to a promoter that mediates specific expression during pollen formation;
  • a second nucleic acid comprising a recoded coding sequence of the pollen tube development essential protein, wherein the recoded sequence encodes a functional pollen tube development essential protein and cannot be targeted by the gene editing system, and is operably linked to a native promoter of a gene encoding the pollen tube development essential protein;
  • a third nucleic acid comprising a coding sequence for a cargo to be propagated among a population of said plant.
  • the first nucleic acid, the second nucleic acid, and the third nucleic acid are located on the same expression construct.
  • the gene encoding a protein essential for pollen tube development is an endogenous gene of the plant. In some embodiments, the gene encoding a protein essential for pollen tube development is an exogenous gene that has been introduced into the plant.
  • the essential protein for pollen tube development is No Pollen Germination 1 (NPG1).
  • NPG1 is associated with male gametophyte development but does not affect female gametophyte development and is required for the later stages of pollen germination. NPG1 is highly conserved in different plants.
  • NPG1 comprises an amino acid sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO:1.
  • the coding sequence of endogenous NPG1 in the plant comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO:2.
  • the coding sequence of the re-encoded NPG1 comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity to SEQ ID NO: 3.
  • the coding sequence of the re-encoded NPG1 cannot be targeted by the gene editing system, and thus will not lose function due to expression of the gene editing system.
  • the promoter of a gene refers to a sequence of about 100 bp to about 5 kb, such as about 500 bp to about 3 kb, such as about 2 kb, upstream of the translation start site or transcription start site of the coding sequence of the gene on the genome.
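The upstream-window definition above can be illustrated with a minimal sketch. The function name `upstream_promoter`, the 0-based coordinate convention and the toy sequences are assumptions made for this example only; they are not part of the disclosure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_promoter(chrom: str, start_site: int, strand: str, length: int = 2000) -> str:
    """Return up to `length` bp located immediately upstream of a start site.

    `start_site` is the 0-based forward-strand index of the first base of the
    start site (translation or transcription start). For a minus-strand gene
    the upstream region lies at higher forward-strand coordinates and is
    returned as read on the minus strand.
    """
    if strand == "+":
        return chrom[max(0, start_site - length):start_site]
    if strand == "-":
        return reverse_complement(chrom[start_site + 1:start_site + 1 + length])
    raise ValueError("strand must be '+' or '-'")
```

For example, `upstream_promoter(chrom, atg_index, "+", 2000)` would return the roughly 2 kb window mentioned above, truncated if the gene lies closer than 2 kb to the start of the sequence.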
  • the natural promoter of NPG1 comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO:4.
  • the promoter that mediates specific expression during pollen formation is the promoter of the DMC1 (Disruption of Meiotic Control 1) gene.
  • the DMC1 promoter comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity to SEQ ID NO: 5.
  • the DMC1 promoter is capable of driving the expression of a nucleotide sequence operably linked thereto in pollen mother cells.
  • the promoter that mediates specific expression during pollen formation is the promoter of the TPD1 (Tapetum Determinant 1) gene.
  • the TPD1 promoter comprises a nucleotide sequence having at least 70%, at least 75%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or even 100% sequence identity with SEQ ID NO: 6.
  • the TPD1 promoter is capable of driving continuous expression of a nucleotide sequence operably linked thereto during the process in which the progenitor of the pollen mother cell, i.e., the archesporium, develops into the pollen mother cell.
  • the gene editing system available in the present invention can be various gene editing systems known in the art, as long as it can perform targeted genome editing in plants.
  • the gene editing system can be a gene editing system based on CRISPR, ZFN or TALEN.
  • the gene editing system is a gene editing system based on CRISPR.
  • the CRISPR gene editing system can include a CRISPR nuclease and at least one guide RNA.
  • the CRISPR nuclease and the guide RNA can form a complex to target and/or cleave a genomic target sequence based on the complementarity of the guide RNA to the genomic target sequence.
  • CRISPR nuclease can be derived from Cas9 nuclease, including Cas9 nuclease or its functional variant.
  • the Cas9 nuclease can be a Cas9 nuclease from different species, such as SpCas9 from Streptococcus pyogenes (S. pyogenes) or SaCas9 from Staphylococcus aureus (S. aureus).
  • Cas9 nuclease and “Cas9” are used interchangeably herein and refer to RNA-guided nucleases including Cas9 proteins or fragments thereof (e.g., proteins comprising active DNA cleavage domains of Cas9 and/or gRNA binding domains of Cas9).
  • Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and related systems) genome editing system, which can target and cut DNA target sequences to form DNA double-strand breaks (DSBs) under the guidance of guide RNA.
  • CRISPR nuclease can also be derived from Cpf1 nuclease, including Cpf1 nuclease or its functional variant.
  • the Cpf1 nuclease can be Cpf1 nuclease from different species, such as Cpf1 nuclease from Francisella novicida U112, Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006.
  • CRISPR nucleases can also be derived from Cas3, Cas8a, Cas5, Cas8b, Cas8c, Cas10d, Cse1, Cse2, Csy1, Csy2, Csy3, GSU0054, Cas10, Csm2, Cmr5, Cas10, Csx11, Csx10, Csf1, Csn2, Cas4, C2c1, C2c3 or C2c2 nucleases, for example, including these nucleases or functional variants thereof.
  • the coding sequence of the CRISPR nuclease is operably linked to the promoter that mediates specific expression during pollen formation, preferably, is operably linked to the TPD1 promoter.
  • guide RNA and “gRNA” are used interchangeably and refer to an RNA molecule that is capable of forming a complex with a CRISPR effector protein and targeting the complex to a target sequence due to a certain degree of identity with the target sequence.
  • the target sequence is targeted by base pairing with the complementary strand of the target sequence.
  • the gRNA used by the Cas9 nuclease or its functional variant is generally composed of crRNA and tracrRNA molecules that are partially complementary to each other and form a complex, wherein the crRNA comprises a guide sequence (also referred to as a seed sequence) that has sufficient identity with the target sequence to hybridize with the complementary strand of the target sequence and direct the CRISPR complex (Cas9+crRNA+tracrRNA) to bind specifically to the target sequence.
  • a single guide RNA (sgRNA)
  • the gRNA used by the Cpf1 nuclease or its functional variant is generally composed of only mature crRNA molecules, which may also be referred to as sgRNA. Designing suitable gRNA based on the CRISPR nuclease used and the target sequence to be edited is within the capabilities of those skilled in the art.
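As a rough illustration of the first step of gRNA design for SpCas9, the sketch below lists every 20-nt protospacer that is followed by an NGG PAM on either strand. It is only a toy enumeration: the function name `find_cas9_sites` is hypothetical, and real design tools also score efficiency and off-target risk, which this sketch does not attempt.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cas9_sites(seq: str):
    """Return (strand, start, protospacer, pam) for each 20-nt site followed by NGG.

    `start` is the 0-based forward-strand index of the leftmost base of the
    23-nt site (protospacer plus PAM).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            pos = match.start()
            start = pos if strand == "+" else len(seq) - (pos + 23)
            sites.append((strand, start, match.group(1), match.group(2)))
    return sites
```

Candidates found this way would still need to be filtered with dedicated prediction tools and tested experimentally.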
  • transcription of the guide RNA can be driven by a constitutive promoter.
  • transcription of the guide RNA can be driven by a U6 or U3 promoter.
  • the gene editing system can target any region of the endogenous gene encoding the essential protein for pollen tube development, as long as it can cause the loss of function of the endogenous essential protein for pollen tube development.
  • the gene editing system can target the endogenous coding sequence of the essential protein for pollen tube development, resulting in mutation or incomplete translation of the protein.
  • the gene editing system can target the endogenous regulatory sequence of the essential protein for pollen tube development, resulting in the non-expression of the protein.
  • the nucleotide sequence can be changed by codon degeneracy to remove the target sequence of the gene editing system but without changing the encoded protein sequence.
  • the coding sequence of the protein essential for pollen tube development contained in the second nucleic acid can also be the same as the wild-type coding sequence, which can also be referred to as recoded herein because it cannot be targeted by the gene editing system either.
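Recoding by codon degeneracy can be sketched as follows. This is a minimal illustration that assumes an in-frame, upper-case coding sequence and the standard genetic code; the helper `recode_window` and its rule of picking the synonym that differs at the most positions are assumptions for the example, not the procedure used to obtain SEQ ID NO: 3.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence (stop codons shown as '*')."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - len(cds) % 3, 3))


def recode_window(cds: str, start: int, end: int) -> str:
    """Swap each codon overlapping cds[start:end] for a synonymous codon.

    Codons without a synonym (ATG, TGG) are left unchanged.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for idx in range(start // 3, (end + 2) // 3):
        codon = codons[idx]
        synonyms = [c for c, aa in CODON_TABLE.items()
                    if aa == CODON_TABLE[codon] and c != codon]
        if synonyms:
            # Prefer the synonym with the most mismatches to the original codon.
            codons[idx] = max(synonyms, key=lambda c: sum(a != b for a, b in zip(c, codon)))
    return "".join(codons)
```

A recoded sequence produced this way encodes the same protein while no longer matching the original gRNA target, which is the property required of the second nucleic acid.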
  • the gene editing system comprises a Cas9 nuclease and at least one gRNA targeting endogenous NPG1.
  • the at least one gRNA targeting endogenous NPG1 targets a nucleotide sequence selected from any one of SEQ ID NO: 7-10.
  • the "cargo to be propagated among the population of the plant" described herein can be any sequence that is intended to spread in a population of the plant, such as a wild population.
  • the expression of the cargo is harmful to the plant when the plant is exposed to specific compounds or conditions.
  • the cargo can be a herbicide-sensitivity gene, or a gene that can destroy existing herbicide resistance. Combined with subsequent application of a particular herbicide or specific compound, this allows effective local weed management within a controllable range.
  • the cargo can also be a gene that affects the development of megaspore mother cells or embryos, thereby achieving control of population size.
  • the cargo can also be a gene that improves environmental adaptability, disease resistance, etc., thereby improving the adaptability of endangered plants to their natural environment.
  • the present invention provides a method for producing modified plants for gene drive modification of plant populations, the method comprising introducing the artificial gene drive system for plants of the present invention into at least one plant, thereby obtaining at least one modified plant whose genome is integrated with the first nucleic acid, the second nucleic acid and the third nucleic acid.
  • the first nucleic acid, the second nucleic acid and the third nucleic acid integrated into the genome of the modified plant are tightly linked, for example, located at the same locus.
  • the first nucleic acid, the second nucleic acid and/or the third nucleic acid exist in the genome of the plant with a single copy.
  • the artificial gene drive system can be introduced into plants by various methods well known to those skilled in the art.
  • Methods that can be used to introduce the artificial gene drive system of the present invention into plants include, but are not limited to, gene gun method, PEG-mediated protoplast transformation, Agrobacterium-mediated transformation, plant virus-mediated transformation, pollen tube channel method, and ovary injection method.
  • the present invention provides a modified plant for gene drive modification of a plant population, which is prepared by the method of the present invention.
  • the present invention provides a modified plant for gene drive modification of a plant population, into which the artificial gene drive system for plants of the present invention is introduced, so that the first nucleic acid, the second nucleic acid and the third nucleic acid are integrated into its genome.
  • the first nucleic acid, the second nucleic acid and the third nucleic acid integrated into the genome of the modified plant are tightly linked, for example, located at the same locus.
  • the first nucleic acid, the second nucleic acid and/or the third nucleic acid exist in the genome of the plant with a single copy.
  • the present invention provides a method for modifying a plant population by gene drive, the method comprising placing at least one modified plant of the present invention in a population of plants, and allowing the at least one modified plant to hybridize with other plants in the plant population.
  • the method allows the offspring of the hybridization of the at least one modified plant with other plants in the plant population to hybridize with other plants and/or offspring in the population.
  • the method results in the modified plant population containing an increased proportion of plants carrying the cargo compared to the unmodified plant population.
  • the modified plant population contains at least 1%-100%, such as at least 1%, at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or even 100% of the plants carrying the cargo.
  • the present invention provides a method for modifying a plant population by gene drive, the method comprising:
  • step ii) placing the at least one modified plant obtained in step i) into said plant population and allowing said at least one modified plant to crossbreed with other plants in said plant population.
  • the first nucleic acid, the second nucleic acid and the third nucleic acid integrated into the genome of the modified plant are closely linked, for example, located at the same locus. In some embodiments, the first nucleic acid, the second nucleic acid and/or the third nucleic acid are present in the genome of the plant in a single copy. In some embodiments, the method allows the offspring of the hybridization of the at least one modified plant with other plants in the plant population to hybridize with other plants in the population. In some embodiments, the method allows the offspring of the hybridization of the at least one modified plant with other plants in the plant population to hybridize with other plants and/or offspring in the population.
  • the method results in an increased proportion of plants carrying the cargo in the modified plant population compared to the unmodified plant population.
  • the modified plant population contains at least 1%-100%, for example, at least 1%, at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or even 100% of the plants carrying the cargo.
  • the plants in various aspects of the present invention can be monocotyledonous or dicotyledonous plants, preferably mainly outcrossing plants.
  • the plants can be Arabidopsis, corn, rapeseed, tobacco, grass weeds, etc.
  • the FAST marker sequence 17 (pOLE1: OLE1-TagRFP-Nos terminator) was inserted into the XF675 binary vector by Gibson ligation 26.
  • the promoter and partial coding sequence of the OLE1 (AT4G25140) gene were amplified from genomic DNA (gDNA), and the sequences of TagRFP and Nos terminator were synthesized according to reference 17.
  • the XF675 plasmid digested with HindIII and the above two fragments were assembled together by Gibson ligation.
  • the sgRNA cassette (pU6-SmR-gRNA scaffold-U6 terminator) was amplified from the template pHEE401E (Addgene #71287) and inserted into XF675 after double restriction digestion (EcoRI and HindIII) using Gibson ligation. Then, four gRNAs were introduced using the BsaI cutting and ligation method 27. The above plasmid was digested with HindIII again, and the FAST marker sequence was cloned in through Gibson ligation, and the HindIII restriction recognition site was filled in.
  • the NPG1 promoter (about 2kb) was amplified from gDNA, and the Recoded NPG1 sequence was obtained by mutation PCR, and then inserted into the HindIII-digested plasmid in the previous step through Gibson ligation. Finally, the plasmid from the previous step was digested with KpnI and the following fragments were ligated via Gibson ligation: the promoter sequences of DMC1 (AT3G22880) and TPD1 (AT4G24972) amplified from gDNA, the plant codon-optimized SpCas9 sequence with two nuclear localization signals (NLS) amplified from the template pHEE401E, and the Nos terminator amplified from the template XF675.
  • Candidate gRNAs for the NPG1 (AT2G43040) coding sequence were predicted based on CRISPR-P 2.0 24 (http://crispr.hzau.edu.cn/CRISPR2/) and CRISPR-GE 28 (http://skl.scau.edu.cn/home/), and 12 were screened to detect cutting efficiency in Arabidopsis protoplasts.
  • Each 20 nt gRNA sequence was inserted into the pAtU6-sgRNA vector (Addgene plasmid #119775) by BsaI digestion and Gibson ligation.
  • Arabidopsis protoplasts were prepared according to references 29 , 30 . After co-transformation with pAtU6-sgRNA and p2X35S-Cas9, protoplasts were cultured at room temperature for 48 h and then harvested. In parallel, one tube of protoplasts was transformed with p2X35S-GFP alone and used to estimate transformation efficiency approximately 16 h after transformation. The transformation efficiencies of the two biological replicates were found to be 41% and 45%, respectively.
  • the genomic DNA of each tube of protoplasts was extracted using a DNA extraction kit (Plant Genomic DNA Kit, DP305, TIANGEN).
  • the genomic region of 180-220bp around each target region was PCR amplified and purified.
  • the 6nt barcode ('ATGCAG') was introduced by primers to distinguish the PCR products produced by two biological replicates of the same target site. All 24 purified PCR product samples were quantified by Nanodrop 2000 and mixed in equal amounts and sent to Novogene for library construction and Illumina PE150 sequencing.
  • the gRNA editing efficiency is calculated by dividing the number of edited reads by the total number of reads that can be mapped to the site. Mismatched reads within 10 bp upstream of PAM (NGG) are considered edited reads. Since the main editing types of Cas9 are insertions and deletions, single base substitutions (SNPs) are excluded when calculating the editing types.
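The calculation above can be sketched in a few lines. The read representation used here (one list of event labels per mapped read, restricted to the 10-bp window upstream of the PAM) is a hypothetical simplification, and counting only indel-bearing reads as edited is one reading of the rule that excludes single-base substitutions.

```python
from typing import Iterable, Sequence


def is_edited(events: Sequence[str]) -> bool:
    """A read counts as edited if it carries an insertion or deletion in the window.

    Reads whose only differences are single-base substitutions ("sub") are not counted.
    """
    return any(event in ("ins", "del") for event in events)


def editing_efficiency(reads: Iterable[Sequence[str]]) -> float:
    """Edited reads divided by all reads mapped to the target site."""
    total = 0
    edited = 0
    for events in reads:
        total += 1
        edited += is_edited(events)
    if total == 0:
        raise ValueError("no reads mapped to the target site")
    return edited / total
```

For instance, four mapped reads of which one carries a deletion and one carries an insertion plus a substitution give an efficiency of 0.5.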
  • T1 was directly selected from the harvested dry seeds based on the presence or absence of red fluorescence under a handheld fluorescence detector (LUYOR 3415RG).
  • Agrobacterium-mediated transformation sometimes introduces exogenous DNA sequences into multiple sites of the plant genome.
  • 48-50 plants were randomly selected from the T1 obtained from DMC-CAIN and TPD-CAIN, and the number of T-DNA insertion sites in each plant was checked by thermal asymmetric interlaced polymerase chain reaction (TAIL-PCR) 32.
  • the T1 plants with single-site insertion were further confirmed by whole genome sequencing (Novogene, PE150), using the software TDNAscan 33 and the semi-automatic pipeline developed by the inventors for data analysis, and 3 T1s from DMC-CAIN, 5 T1s from TPD-CAIN, and 2 T1s from FAST only were obtained.
  • the genotype identification of the target gene NPG1 was completed by Sanger sequencing after amplification of a genomic fragment of about 2kb including four target sites ( Figure 8b).
  • the reverse amplification primer was located in the intron region to avoid interference with the Recoded NPG1 sequence in the driver element.
  • Genomic DNA (gDNA) extracted from rosette leaves and inflorescences was used as a template for polymerase chain reaction (PCR).
  • the PCR products were directly subjected to Sanger sequencing. Consistent Sanger sequencing results can usually be obtained from leaf and inflorescence samples of the same plant. If the sequencing results of the leaf sample have multiple peaks (indicating possible mosaicism or heterogeneity) and the genotype cannot be determined, the inflorescence sample from the same plant is used to determine the genotype.
  • FAST-F1 and F2 plants were genotyped using only leaf samples.
  • the gRNA11 target site genotype was determined using Illumina sequencing for leaf and inflorescence samples. PCR products from different tissues, each with a unique barcode sequence introduced during PCR, were mixed in equal amounts and sequenced using Illumina PE150. Clean reads separated by barcode sequence were aligned back to the genomic sequence around the gRNA11 target site using BWA. The read depth covering the gRNA11 region ranged from 107,207 to 150,422. Reads with mismatches within the 23-nt target site region were considered as edited types, and the editing efficiency of each sample was determined as the ratio of reads containing mismatches to the total reads mapped to the target site. Single-base substitutions with a frequency higher than 0.5% and all indels were considered as edited types.
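The substitution threshold described above can be sketched as follows, assuming reads have already been separated by barcode and collapsed into allele counts. The dictionary layout and the function name are hypothetical, and treating the 0.5% cut-off as a per-allele frequency relative to all mapped reads is an assumption.

```python
def editing_efficiency_with_threshold(allele_counts: dict, min_sub_freq: float = 0.005) -> float:
    """Fraction of mapped reads carrying an edited allele.

    `allele_counts` maps (kind, label) -> read count, where kind is "wt",
    "indel" or "sub". All indel alleles count as edited; a single-base
    substitution allele counts only if its frequency exceeds `min_sub_freq`,
    so that rare substitutions (likely sequencing errors) are ignored.
    """
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no reads mapped to the target site")
    edited = 0
    for (kind, _label), count in allele_counts.items():
        if kind == "indel":
            edited += count
        elif kind == "sub" and count / total > min_sub_freq:
            edited += count
    return edited / total
```

With 10,000 mapped reads, a substitution allele seen 60 times (0.6%) would be counted, whereas one seen 40 times (0.4%) would be treated as background.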
  • the population dynamics of CAIN were computationally simulated using an individual-based stochastic model based on the Wright-Fisher model, which assumes a constant population size and non-overlapping generations.
  • the genotypes of two unlinked loci, CAIN and NPG1, were considered, and the initial population had 9900 wild-type individuals and 100 heterozygotes carrying CAIN (TPD-CAIN/+; NPG1 +/+).
  • a density adjustment strategy was adopted.
  • the population size (X) of each generation was calculated according to the binomial distribution X ~ B(50, 0.02 × S). For each generation, pairs of individuals were randomly selected to produce offspring, and the sex of each offspring was randomly assigned. This process was repeated X times, and the offspring produced replaced the parental generation.
  • in a male parent carrying CAIN, CRISPR-mediated cleavage occurs at the NPG1 locus, and the cleavage efficiency is set to an empirical value (98.4%, Figure 13a) or a fixed value (50% or 100%).
  • the cleavage results in the loss of function of the NPG1 gene.
  • the phenotypic penetrance of the inability of pollen to germinate due to the loss of male gamete function is based on an empirical value (96.0%, Figure 13a) or an artificial setting (100%).
  • the cleavage efficiency of NPG1 by TPD-CAIN/+ female germ cells was also considered and set to an empirical value (94.1%, Figure 13b) or an artificial setting (0%, 50%, 100%).
  • the driving allele in the heterozygous state can convert the wild-type allele into the driving allele with 100% efficiency.
  • the target gene is a haplosufficient gene essential for embryonic development, and Cas9 will cut the wild-type target gene in germ cells with a probability of 100%. After fertilization, the Cas9 carried by the egg cell will further cut the wild-type target gene of paternal origin, and the cutting efficiency is also 100%. Embryos with two destroyed target gene alleles but no inherited TARE driving elements will not be able to continue to develop.
  • CAIN is located inside a haplosufficient male fertility gene, which is therefore non-functional.
  • the settings are the same as for the modified CAIN driver, except that the initial population size and K are both set to 100,000. Males homozygous for CAIN cannot produce viable pollen.
  • the simulation was implemented using a custom Python script available at https://github.com/QianLabWebsite/GeneDrive.
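The script in the repository cited above is the authoritative implementation. Purely as an illustration of the kind of individual-based, non-overlapping-generation model described, a much-simplified sketch follows. It assumes hermaphroditic individuals, a fixed population size instead of density regulation, no cleavage after fertilization, and hypothetical function names; none of these choices is taken from the inventors' code.

```python
import random


def make_gamete(individual, cut, rng):
    """Return (cain_allele, npg1_allele); 1 means CAIN present / endogenous NPG1 functional."""
    cain, npg = individual
    if cain > 0:
        # Germline cleavage: each functional NPG1 copy is disrupted with probability `cut`.
        npg = sum(rng.random() >= cut for _ in range(npg))
    cain_allele = int(rng.random() < cain / 2)
    npg_allele = int(rng.random() < npg / 2)
    return cain_allele, npg_allele


def viable_pollen(father, cut, penetrance, rng, max_tries=1000):
    """Draw pollen grains until one can germinate; return None if none is found."""
    for _ in range(max_tries):
        cain_allele, npg_allele = make_gamete(father, cut, rng)
        rescued = cain_allele == 1 or npg_allele == 1  # antidote or intact NPG1
        if rescued or rng.random() >= penetrance:
            return cain_allele, npg_allele
    return None


def simulate(n=200, carriers=20, generations=15,
             male_cut=0.984, female_cut=0.941, penetrance=0.96, seed=0):
    """Return the frequency of CAIN carriers in each generation (index 0 = release)."""
    rng = random.Random(seed)
    # Individuals are (CAIN copies, functional endogenous NPG1 copies).
    population = [(1, 2)] * carriers + [(0, 2)] * (n - carriers)
    freqs = [carriers / n]
    for _ in range(generations):
        offspring = []
        while len(offspring) < n:
            mother, father = rng.choice(population), rng.choice(population)
            pollen = viable_pollen(father, male_cut, penetrance, rng)
            if pollen is None:
                continue  # this pair leaves no offspring; draw another pair
            egg = make_gamete(mother, female_cut, rng)
            offspring.append((pollen[0] + egg[0], pollen[1] + egg[1]))
        population = offspring  # offspring replace the parental generation
        freqs.append(sum(cain > 0 for cain, _ in population) / n)
    return freqs
```

Calling `simulate()` with the default arguments uses the empirical cleavage and penetrance values quoted above; setting `male_cut=0.0` and `female_cut=0.0` gives a drift-only control for comparison.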
  • Example 1 Design of CAIN, a CRISPR-based poison-antidote gene drive system targeting pollen germination
  • CAIN consists of three tightly linked parts: poison, antidote and cargo.
  • the poison is a gRNA-Cas9 complex that can introduce an inactivating mutation in an essential gene associated with pollen germination by inducing DSBs and subsequent repair via NHEJ.
  • the antidote is a recoded version of this essential gene, expressed with its native promoter ( Figure 1a).
  • the poison can disrupt both alleles of the essential gene before meiosis, thereby affecting the germination of all four pollen grains, but the antidote can only neutralize this effect if it is present in the pollen grains.
  • This process can be propagated in successive generations, spreading CAIN throughout the population through a continuous hybridization process.
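A back-of-the-envelope estimate of the transmission rate through a CAIN/+ male parent follows from this mechanism. The sketch assumes that CAIN and NPG1 are unlinked, that each endogenous NPG1 allele is disrupted before meiosis with probability `cut`, that pollen lacking both CAIN and a functional NPG1 fails to germinate with probability `penetrance`, and that all germinating pollen compete equally. Under these assumptions the expected transmission is 0.5 / (0.5 + 0.5 × (1 − cut × penetrance)) = 1 / (2 − cut × penetrance). The function name is hypothetical.

```python
def expected_transmission(cut: float, penetrance: float) -> float:
    """Expected fraction of offspring inheriting CAIN from a CAIN/+ male parent.

    Half of the pollen carries CAIN and always germinates; the other half
    germinates unless its NPG1 allele was disrupted (probability `cut`) and
    the disruption is penetrant (probability `penetrance`).
    """
    if not (0.0 <= cut <= 1.0 and 0.0 <= penetrance <= 1.0):
        raise ValueError("cut and penetrance must lie between 0 and 1")
    return 1.0 / (2.0 - cut * penetrance)
```

With no cleavage the estimate reduces to Mendelian 50%, and with complete cleavage and penetrance it reaches 100%. Using the empirical values adopted in the simulations (98.4% cleavage, 96.0% penetrance) gives roughly 95%.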
  • Arabidopsis thaliana was chosen as the experimental subject because it is a model plant that is mainly self-pollinating. Artificial hybridization is performed throughout the experiment, and strict experimental procedures and management are used to further enhance the ecological safety of the gene drive system before its official release.
  • CAIN requires the selection of a gene that is essential for pollen germination.
  • a list of genes that are related to male gametophyte development but do not affect female gametophyte development was retrieved from a previous study 13.
  • No Pollen Germination 1 (NPG1), a gene required for the later stages of pollen germination 14, was selected as the target gene. Because NPG1 acts late, the time window available for Cas9 cleavage is relatively long: cleavage only needs to be completed before NPG1 is required to function (Figure 1a).
  • the antidote is a recoded version of the NPG1 gene sequence, driven by its original promoter to restore the gene function.
  • the recoding was introduced at the target sites of the gRNAs using synonymous codons, so it does not affect the amino acid sequence (Figure 8a).
  • the intron region of the gene was deleted in this version, making the structure more streamlined and not interfering with the subsequent genotyping of NPG1 in the genome ( Figure 8b).
  • a red fluorescent protein marker expressed at the dry-seed stage, named FAST 17, was selected to facilitate observation of the propagation of the drive.
  • insertions and deletions whose lengths are multiples of three can produce CRISPR-resistant alleles without disrupting the downstream reading frame, but the rarity of their simultaneous occurrence at two gRNA sites (i.e., 0%) indicates that the CAIN design has a low rate of formation of resistance alleles with normal gene function, especially when multiple gRNAs are used in tandem.
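The reading-frame argument can be made concrete with a small helper that classifies a length change at a cut site. The function name and return labels are hypothetical, and the check considers only the net length change, not whether the altered residues are tolerated by the protein.

```python
def indel_effect(ref_len: int, alt_len: int) -> str:
    """Classify a repair outcome by its net length change in a coding sequence."""
    delta = alt_len - ref_len
    if delta == 0:
        return "no length change"
    # A net change that is a multiple of three keeps the downstream reading frame.
    return "in-frame" if delta % 3 == 0 else "frameshift"
```

A 3-bp deletion or a 6-bp insertion is classified as in-frame and could therefore yield a functional resistance allele, whereas a 1-bp insertion shifts the frame.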
  • The NPG1- allele is present in almost all F1 plants, indicating that CRISPR-based NPG1 knockout is highly efficient, which in theory further impairs pollen germination and leads to segregation distortion.
  • Compared with DMC-CAIN, TPD-CAIN has a higher transmission rate (Figure 2), presumably because the TPD1 promoter produces more active Cas9, resulting in a higher cutting efficiency at NPG1.
  • DMC-CAIN had a low cutting efficiency
  • DMC-CAIN/+;NPG1 -/- F1 plants were used to pollinate wild-type female parents, and the proportion of red fluorescence in their F2 seeds was counted ( Figures 5b and 5c), and it was found that the DMC-CAIN transmission rate reached 95.9% and 99.5%, respectively.
  • the drive% in the F2 progeny produced by DMC-CAIN/+;NPG1 +/- F1 plants as male parents was 63.7% ( Figures 5b and 5c).
  • Example 4 Insufficient DNA cleavage and incomplete penetrance may result in TPD-CAIN transmission rate not reaching 100%
  • When T1 plants with gene drive elements were used as male parents, the transmission rate of TPD-CAIN was observed to be between 89.6% and 96.9% (Figure 2b), that is, a portion (3.1%-10.4%) still did not inherit TPD-CAIN (i.e., FAST-). Similarly, in the F2 generation produced by F1 plants carrying TPD-CAIN, a portion (1.0%-12.2%) did not inherit TPD-CAIN (Figure 4a). The mechanism of the generation of these FAST- offspring was further explored.
  • the Cas9 activity responsible for generating these DSBs could originate from one of two possible scenarios: Cas9 expressed from the embryonic genome after fertilization (i.e., zygote), or paternal inheritance of the Cas9 protein. The latter scenario seems unlikely due to the limited protein/RNA contained in sperm cells.
  • the NPG1 genotype of the F2 plants previously produced by F1 reciprocal crosses was examined using Sanger sequencing ( Figure 6c). The NPG1 + genotype did not appear in the FAST + F2 progeny, regardless of whether TPD-CAIN was inherited from the father or the mother.
  • cleavage of NPG1 in female germ cells and its subsequent repair have the potential to enhance the spread of CAIN by complementing the NPG1 knockout achieved in male germ cells.
  • the genotypes of FAST-F2 plants were analyzed, since no additional NPG1 cleavage occurs after F2 zygote formation, providing information on the original genotype inherited from their parents.
  • CAIN/TADS also spread significantly faster than TARE at high initial release ratios18 , demonstrating its potential as a population suppression tool when integrated into haplosufficient male fertility genes25.
  • the population dynamics of CAIN were simulated at an initial release ratio of 1% ( Figure 7c). These simulations showed a rapid increase in the number and frequency of CAIN carriers (mostly heterozygous in the early stage and mostly homozygous in the later stage). At the same time, the total population size gradually decreased until the population became extinct at generation 26 ( Figure 7c), which may be due to a surge in the number of male-sterile CAIN homozygotes. Overall, the results of the computer simulations suggest that the rapid spread of CAIN gives it the potential to achieve population suppression.
  • CAIN, a CRISPR-based poison-antidote gene drive system that targets a key gene essential for male gametophyte function in Arabidopsis thaliana, achieves super-Mendelian inheritance and produces very few resistant alleles compared to homing-based drives.
  • the inventors not only provide key insights into the design and application of artificial TA gene drive systems in plants, but also propose an innovative solution to address pressing ecological and agricultural challenges.
  • The design of TARE 10 (also known as ClvR 9) avoids this by targeting essential genes in individual developmental stages, although it relies on the carryover of Cas9 activity in the egg cell to the zygote. Therefore, the designs of Medea and TARE/ClvR focus on female germ cells and may impair fertility 22. In contrast, the inventors' design aims to affect male germ cells. Given that pollen grains are far more numerous than ovules, this makes it possible to minimize the fitness cost.
  • the inventors’ design focuses on the key genes for the normal functioning of the male gametophyte.
  • the inventors’ unique strategy takes advantage of the common male gametophyte formation process in plants: after meiosis, two rounds of mitosis eventually form mature pollen grains 23. The pollen grains then germinate and their pollen tubes elongate, delivering the two sperm cells through the stigma to achieve double fertilization, so that their genetic material is actually passed on to the offspring.
  • the design and efficacy of the inventors’ gene drive system depends on the selection of a target gene that is critical for pollen grain germination (NPG1 in this case), coupled with the selection of an efficient promoter for high activity of Cas9 in the reproductive system. This method tailored for plants may also be applied in animals if the key essential genes for sperm formation can be identified.
  • CAIN can achieve efficient segregation distortion in plants. The cargo can also be replaced and used to solve various ecological problems, depending on the specific situation and goals (Figure 14).
  • the CAIN system can be used to control invasive plants by selecting specific genes, such as those that affect megaspore mother cells or embryonic development, to achieve population control.
  • CAIN can be used to spread beneficial traits, such as drought or disease resistance genes, thereby enhancing the survival rate of endangered species in the wild.
  • specific herbicide susceptibility genes can be introduced to manage weeds more effectively, etc. If this strategy is widely adopted, it could herald a new era of ecological management and sustainable agriculture.
  • CAIN was designed to include a degree of specificity: by screening, gRNAs can be chosen that specifically target certain genotypes or ecotypes.
  • CAIN is a zero-threshold drive, which means that, in theory, the drive element can gradually spread through a population even after the release of a single individual carrying it.
  • its spread rate can be controlled by using a weaker gRNA, or a weaker promoter to control the expression of Cas9, because the effectiveness of the drive is closely related to the cutting efficiency. This additional flexibility adds a layer of control over the spread of gene drives and enhances their safe application in different ecological environments.
  • gRNAs may have certain off-target effects, which is a problem worthy of attention. Although off-target phenomena are unlikely to hinder the spread of CAIN, they may introduce unexpected genomic mutations, thereby increasing the genetic load. It is worth mentioning that, although not emphasized in the results section, the inventors conducted a small-scale test and predicted 16 potential off-target sites of four gRNAs by CRISPR-P 2.0 24 , and determined the genotypes of these sites by Sanger sequencing in 16 F1 plants generated by four T1 lines. These potential off-target sites were not edited in the 16 plants tested, which also supported that the four gRNAs selected in CAIN specifically targeted NPG1.
  • the new CAIN driver, CAIN n+1 must be integrated into the same genomic location as CAIN n .
  • homology-directed repair (HDR)
  • the homologous location can force CAIN n+1 and CAIN n to compete directly, ensuring that only one drive element remains.
  • CAIN n+1 needs to use different gRNA targets to destroy essential genes, thus acting as a new poison. This change makes both the recoded NPG1 in CAIN n and the NPG1 in the genome the targets of cleavage.
  • the new Recoded NPG1 which is not targeted by the gRNA in CAIN n and CAIN n+1, serves as a new antidote.
  • the original CAIN n can be replaced by CAIN n+1 , thereby achieving the removal or modification of the current cargo.
  • This approach does not require the selection of new target genes and preserves the overall size of the gene drive element, thereby improving compatibility and integration efficiency.
  • CAIN: a CRISPR-based toxin-antidote (TA) gene drive system specifically tailored for plants.
  • the inventors successfully demonstrated the efficacy of CAIN in Arabidopsis, setting a benchmark for its application in other species.
  • NPG1: because the key component, the NPG1 gene, is conserved in sequence across diverse plants, extending CAIN to other species is promising. Looking ahead, this gene drive system needs further improvement, including study of its reversibility, its adaptability, and the mechanisms that make it controllable.
  • CAIN and similar gene drive systems are expected to reshape ecological management, agriculture, and species conservation in a substantial and transformative way.
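
The zero-threshold and cutting-efficiency points in the list above can be illustrated with a minimal deterministic sketch. It is not the inventors' model. It assumes random mating, pollen produced in excess (so a drive-heterozygous father sires as many seeds as any other plant), Mendelian transmission through ovules, and it ignores disrupted NPG1 alleles inherited by plants that lack the drive. The function names and the parameter `c` (the fraction of non-drive pollen from a heterozygote that is inactivated) are hypothetical.

```python
def step(p, q, c):
    """Advance one generation of a simplified pollen-acting toxin-antidote drive.

    p, q: drive frequency among the pollen and ovules that formed the
          current generation (assumption: random union of gametes).
    c:    cutting efficiency, 0 <= c <= 1 (assumption: in a drive
          heterozygote, a fraction c of non-drive pollen fails to germinate).
    """
    f_dd = p * q                          # drive homozygotes
    f_dw = p * (1 - q) + q * (1 - p)      # drive heterozygotes
    # Ovules: ordinary Mendelian segregation.
    q_next = f_dd + 0.5 * f_dw
    # Pollen: among a heterozygote's viable pollen, the drive share is
    # 0.5 / (0.5 + 0.5 * (1 - c)) = 1 / (2 - c).
    p_next = f_dd + f_dw / (2 - c)
    return p_next, q_next


def simulate(x0, c, generations):
    """Return the drive allele frequency among zygotes, generation by generation."""
    p = q = x0
    freqs = [x0]
    for _ in range(generations):
        p, q = step(p, q, c)
        freqs.append(0.5 * (p + q))
    return freqs


if __name__ == "__main__":
    for c in (0.0, 0.3, 1.0):
        final = simulate(0.01, c, 20)[-1]
        print(f"c={c:.1f}: frequency after 20 generations = {final:.4f}")
```

Under these assumptions the frequency stays constant when `c` is 0 (plain Mendelian inheritance), rises from any nonzero starting value when `c` is positive, and rises more slowly for smaller `c`, which is the control lever the text describes.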

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Molecular Biology (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biochemistry (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Physics & Mathematics (AREA)
  • Microbiology (AREA)
  • Plant Pathology (AREA)
  • Cell Biology (AREA)
  • Botany (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Medicinal Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)

Abstract

The present invention belongs to the technical field of biology. In particular, it relates to an artificial gene drive system. More particularly, it relates to an artificial gene drive system that uses a toxin-antidote mechanism and can be used in plants, in which a gene editing system capable of abolishing the function of a protein essential for pollen tube development serves as the toxin, and a recoded sequence encoding that protein serves as the antidote; the recoded sequence encodes the wild-type protein essential for pollen tube development and cannot be targeted by the gene editing system. The artificial gene drive system of the present invention can be used to spread traits beneficial to humans into wild populations.
PCT/CN2023/136757 2023-09-25 2023-12-06 Artificial gene drive system Pending WO2025065873A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202311247476.0A CN116987715B (zh) 2023-09-25 2023-09-25 Artificial gene drive system
CN202311247476.0 2023-09-25

Publications (1)

Publication Number Publication Date
WO2025065873A1 true WO2025065873A1 (fr) 2025-04-03

Family

ID=88525161

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/136757 Pending WO2025065873A1 (fr) 2023-09-25 2023-12-06 Artificial gene drive system

Country Status (2)

Country Link
CN (1) CN116987715B (fr)
WO (1) WO2025065873A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116987715B (zh) * 2023-09-25 2024-01-30 Institute of Genetics and Developmental Biology of CAS Artificial gene drive system

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030061635A1 (en) * 2001-06-20 2003-03-27 Reddy Anireddy S.N. Pollen-specific novel calmodulin-binding protein, NPG1 (No Pollen Germination1), promoter, coding sequences and methods for using the same
US20130180006A1 (en) * 2012-01-06 2013-07-11 Pioneer Hi Bred International Inc Pollen Preferred Promoters and Methods of Use
CN104039967A (zh) * 2012-01-06 2014-09-10 Pioneer Hi Bred International Inc Methods for screening plants for genetic elements that induce parthenogenesis in plants
US20180320164A1 (en) * 2017-05-05 2018-11-08 California Institute Of Technology Dna sequence modification-based gene drive
US20190241879A1 (en) * 2016-09-09 2019-08-08 Massachusetts Institute Of Technology Methods and compounds for gene insertion into repeated chromosome regions for multi-locus assortment and daisyfield drives
US20200140885A1 (en) * 2018-11-05 2020-05-07 California Institute Of Technology Dna sequence modification-based gene drive
US20210105962A1 (en) * 2018-02-22 2021-04-15 Elsoms Developments Ltd Methods and compositions relating to maintainer lines
WO2023009993A1 (fr) * 2021-07-26 2023-02-02 Elsoms Developments Limited Methods and compositions relating to maintainer lines for male sterility
CN116987715A (zh) * 2023-09-25 2023-11-03 Institute of Genetics and Developmental Biology of CAS Artificial gene drive system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017196858A1 (fr) * 2016-05-09 2017-11-16 Massachusetts Institute Of Technology Methods for designing and using gene drive systems
WO2020018528A2 (fr) * 2018-07-16 2020-01-23 Board Of Trustees Of Michigan State University Control of self-incompatibility in diploid plants for breeding and hybrid production
CN117187220A (zh) * 2022-03-08 2023-12-08 Institute of Genetics and Developmental Biology of CAS Adenine deaminase and its use in base editing

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030061635A1 (en) * 2001-06-20 2003-03-27 Reddy Anireddy S.N. Pollen-specific novel calmodulin-binding protein, NPG1 (No Pollen Germination1), promoter, coding sequences and methods for using the same
US20130180006A1 (en) * 2012-01-06 2013-07-11 Pioneer Hi Bred International Inc Pollen Preferred Promoters and Methods of Use
CN104039967A (zh) * 2012-01-06 2014-09-10 Pioneer Hi Bred International Inc Methods for screening plants for genetic elements that induce parthenogenesis in plants
US20190241879A1 (en) * 2016-09-09 2019-08-08 Massachusetts Institute Of Technology Methods and compounds for gene insertion into repeated chromosome regions for multi-locus assortment and daisyfield drives
US20180320164A1 (en) * 2017-05-05 2018-11-08 California Institute Of Technology Dna sequence modification-based gene drive
US20210105962A1 (en) * 2018-02-22 2021-04-15 Elsoms Developments Ltd Methods and compositions relating to maintainer lines
US20200140885A1 (en) * 2018-11-05 2020-05-07 California Institute Of Technology Dna sequence modification-based gene drive
WO2023009993A1 (fr) * 2021-07-26 2023-02-02 Elsoms Developments Limited Methods and compositions relating to maintainer lines for male sterility
CN116987715A (zh) * 2023-09-25 2023-11-03 Institute of Genetics and Developmental Biology of CAS Artificial gene drive system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DATABASE PROTEIN 20 October 2022 (2022-10-20), XP093298382, Database accession no. NP_001324372.1 *

Also Published As

Publication number Publication date
CN116987715A (zh) 2023-11-03
CN116987715B (zh) 2024-01-30

Similar Documents

Publication Publication Date Title
US12195741B2 (en) Creation of herbicide resistant gene and use thereof
Massel et al. Hotter, drier, CRISPR: the latest edit on climate change
US11820990B2 (en) Method for base editing in plants
US10487336B2 (en) Methods for selecting plants after genome editing
US12241074B2 (en) Genome editing-based crop engineering and production of brachytic plants
Ishizaki CRISPR/Cas9 in rice can induce new mutations in later generations, leading to chimerism and unpredicted segregation of the targeted mutation
CN110621154A (zh) Methods and compositions for herbicide tolerance in plants
Liu et al. Overriding Mendelian inheritance in Arabidopsis with a CRISPR toxin–antidote gene drive that impairs pollen germination
Oberhofer et al. Cleave and Rescue gamete killers create conditions for gene drive in plants
US20250057098A1 (en) Methods and compositions to increase yield through modifications of fea3 genomic locus and associated ligands
CN107090447A (zh) Rice ALS mutant protein conferring herbicide resistance on plants, gene, and use thereof
CN116987715B (zh) Artificial gene drive system
US20220411809A1 (en) Gene mutations in tomato to yield compact and early yielding forms suitable for urban agriculture
CN118667837A (zh) Plant grain size regulatory gene and use thereof
WO2021193865A1 (fr) Method for producing a temperature-sensitive male-sterile plant
US20250171798A1 (en) Genome editing-based crop engineering and production of brachytic plants
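CN116121298B (zh) Use of inhibiting HSRP1 gene expression in improving plant heat tolerance
JP2024113751A (ja) Method for producing crops having overdominant traits, and crops having overdominant traits
CN113999871B (zh) Method for creating rice germplasm with a dwarf, erect plant type and use thereof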
Nagaraj Kumar et al. Principles and applications of RNA-based genome editing for crop improvement
CN108668884A (zh) Method for creating ornamental rice germplasm using two transcription factor genes
CN118308418A (zh) Maize gene DWF4, functional sites, and uses thereof
Rai et al. 8 CRISPR/Cas9 for Enhancing
WO2024228946A2 (fr) Modifications of transgenic maize event zm_bcs216090 and related methods
US20240309394A1 (en) Herbicide resistant cannabis plant

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23954006

Country of ref document: EP

Kind code of ref document: A1