CN113066527A - Target prediction method and system for siRNA knockdown of mRNA
- Publication number
- CN113066527A (application CN202110397409.1A)
- Authority
- CN
- China
- Prior art keywords
- mrna
- target
- sirna
- model
- prediction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/30—Drug targeting using structural data; Docking or binding prediction
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
Abstract
The invention discloses a target prediction method for siRNA knockdown of mRNA, comprising the following steps: first, the base sequence features of the target binding site on the mRNA are extracted; next, the RNA secondary structure features at that binding site are extracted; finally, targets for siRNA knockdown of the mRNA are predicted by a target prediction model. The invention also discloses a target prediction system for siRNA knockdown of mRNA, comprising an mRNA sequence input module, a sequence feature extraction module, a secondary structure feature extraction module and a prediction model screening module. Because the invention considers not only the base sequence features of the mRNA target binding site and its corresponding siRNA but also the RNA secondary structure features at the binding site, it effectively improves target prediction for siRNA knockdown of mRNA.
Description
Technical Field
The invention belongs to the field of bioinformatics and relates in particular to a target prediction method for siRNA knockdown of mRNA. It also relates to a target prediction system for siRNA knockdown of mRNA.
Background
Numerous biological experiments show that siRNAs binding to different target sites on the same mRNA differ in knockdown efficiency. Searching for a suitable siRNA binding site on an mRNA by experiment alone is inefficient, costly, slow and subject to many confounding factors, so predicting suitable binding sites computationally is of considerable value. Early target prediction for siRNA knockdown of mRNA relied mainly on researchers observing the frequencies of the various bases in samples of siRNA-bound mRNA target sites; this was inefficient and rarely produced optimal results. As the number of such samples grew and machine learning methods emerged, prediction efficiency and accuracy improved greatly: base sequence features of siRNA-bound mRNA target sites were extracted and prediction models were trained on large data sets. However, existing prediction models consider only the base sequence features of the siRNA-bound target site and ignore the RNA secondary structure at the binding site, so their predictions remain unsatisfactory.
The invention therefore provides a new target prediction method for siRNA knockdown of mRNA. The method considers both the base sequence features of the mRNA target binding site and its corresponding siRNA and the RNA secondary structure features at the binding site, which effectively improves target prediction for siRNA knockdown of mRNA.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a target prediction method for siRNA knockdown of mRNA that effectively improves prediction accuracy and gives a reliable basis for selecting target sites for siRNA knockdown of mRNA. The invention also provides a target prediction system for siRNA knockdown of mRNA.
To solve these technical problems, the invention adopts the following technical solutions:
In one aspect of the invention, a target prediction method for siRNA knockdown of mRNA that takes RNA secondary structure features into account is provided. The method first extracts the base sequence features of the target binding site on the mRNA, then extracts the RNA secondary structure features at that binding site, and finally predicts targets for siRNA knockdown of the mRNA through a target prediction model.
The method comprises the following steps:
Step 1: input the base sequence of the mRNA to be knocked down and, following the base complementary pairing principle, obtain the siRNA sequences corresponding to all candidate target sites on the mRNA;
Step 2: from the base sequence of each target binding site on the mRNA, extract base sequence features, namely the base type at each position of the target site and of its corresponding siRNA and the occurrence frequency of each base type;
Step 3: extract the secondary structure features at the mRNA target binding site and the secondary structure features of the corresponding siRNA antisense strand;
Step 4: input all extracted features into the prediction model, which outputs the binding probability of the mRNA target binding site and its corresponding siRNA antisense strand;
Step 5: screen out suitable target sites for siRNA knockdown of the mRNA according to the probability values output by the model.
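Step 1 can be sketched as a sliding window over the mRNA, with the antisense strand taken as the reverse complement of each window. This is only an illustration: the 19-base window follows the siRNA length used in the experiment reported later in this document, and the function names and the toy sequence are hypothetical.

```python
# Sketch of step 1: enumerate candidate target sites and their antisense siRNAs.
# SITE_LENGTH = 19 is taken from the experiment described later in the document;
# all names here are illustrative, not part of the patent.
from typing import Iterator, Tuple

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
SITE_LENGTH = 19


def antisense_strand(target_site: str) -> str:
    """Return the strand (written 5'->3') that base-pairs with the target site."""
    return "".join(COMPLEMENT[base] for base in reversed(target_site))


def candidate_sites(mrna: str, site_length: int = SITE_LENGTH) -> Iterator[Tuple[int, str, str]]:
    """Yield (start index, target site, antisense strand) for every window."""
    mrna = mrna.upper().replace("T", "U")  # tolerate DNA-style input
    for start in range(len(mrna) - site_length + 1):
        site = mrna[start:start + site_length]
        yield start, site, antisense_strand(site)


if __name__ == "__main__":
    toy_mrna = "AUGGCUAGCUAGGAUCCGAUCGAUCGGAU"  # made-up sequence, not PD-1
    for start, site, guide in list(candidate_sites(toy_mrna))[:3]:
        print(start, site, guide)
```

An mRNA of m bases gives m - 18 candidate sites under this assumption, each of which is then passed to the feature extraction of steps 2 and 3.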
As a preferred technical solution of the present invention, step 3 specifically includes the following steps:
Step A: as shown in formula (1), for each base i of the single-stranded mRNA within the target binding site, calculate the sum $S_i$ of the probabilities $P_{ij}$ that base i pairs with each other base j on the whole single-stranded mRNA; m is the number of bases of the mRNA.
In formula (1), k is any secondary structure, among the many that the single-stranded mRNA may form, in which base i is paired with base j; S is any one of all the secondary structures that the single-stranded mRNA may form; $\Delta G_k$ and $\Delta G_S$ are the free energies of the secondary structures k and S; T is the absolute temperature; and R is the gas constant, 8.314 J/(mol·K).
Step B: for the probability sums $S_i$ of the bases in the target binding site obtained in step A, compute the weighted sum $F_{sum}$ and the maximum value $F_{max}$. The weighted sum, shown in formula (2), takes into account the number of hydrogen bonds formed by base pairing: the weight $W_i$ is 2 if the base is A or U and 3 if the base is C or G. The maximum value is computed as shown in formula (3); n is the number of bases in the target binding site.
$F_{max} = \max_{1 \le i \le n} S_i$ (3)
Step C: extract the same features for the siRNA antisense strand corresponding to the mRNA target binding site, following steps A and B; in this case m = n.
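Formulas (1) and (2) are not reproduced in this text. From the definitions given in steps A and B, they presumably take the Boltzmann-weighted form below; this is a reconstruction under that assumption and not the patent's verbatim notation. In particular, the exclusion of j = i in the sum is an assumption drawn from the phrase "other bases j".

```latex
% (1) pairing probability of bases i and j, and the per-base sum
P_{ij} = \frac{\sum_{k} e^{-\Delta G_k / (RT)}}{\sum_{S} e^{-\Delta G_S / (RT)}},
\qquad
S_i = \sum_{j = 1,\; j \neq i}^{m} P_{ij}

% (2) hydrogen-bond-weighted sum over the n bases of the binding site
F_{sum} = \sum_{i=1}^{n} W_i \, S_i,
\qquad
W_i =
\begin{cases}
2, & \text{base } i \in \{\mathrm{A}, \mathrm{U}\} \\
3, & \text{base } i \in \{\mathrm{C}, \mathrm{G}\}
\end{cases}
```

Read this way, $S_i$ is the total probability that base i is paired at all, so a low $S_i$ across the binding site indicates a region that is mostly unpaired.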
The features of the region where the mRNA binds the siRNA antisense strand comprise the probability that the base at each position pairs with any other base (n features in total), the maximum of these n values, and their weighted sum, giving n + 2 features. Extracting features in the same way for the n bases of the siRNA antisense strand yields another n + 2 features, so step 3 extracts 2n + 4 features in total that reflect the RNA secondary structure at the mRNA target binding site.
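The feature count can be checked with a short sketch. It assumes the per-base sums $S_i$ have already been obtained from a partition-function folding tool (that step is not shown), and the function names and the ordering of features within the vector are hypothetical.

```python
# Sketch of steps B and C: build the n + 2 features of one strand from its
# per-base pairing-probability sums S_i, then concatenate both strands.
# How S_i is computed (step A) is outside this sketch.
from typing import List, Sequence

HYDROGEN_BOND_WEIGHT = {"A": 2, "U": 2, "C": 3, "G": 3}


def strand_features(bases: str, pairing_sums: Sequence[float]) -> List[float]:
    """Return [S_1, ..., S_n, F_max, F_sum] for one strand."""
    if len(bases) != len(pairing_sums):
        raise ValueError("need exactly one S_i per base")
    f_sum = sum(HYDROGEN_BOND_WEIGHT[b] * s for b, s in zip(bases, pairing_sums))
    f_max = max(pairing_sums)
    return list(pairing_sums) + [f_max, f_sum]


def structure_features(site: str, site_sums: Sequence[float],
                       antisense: str, antisense_sums: Sequence[float]) -> List[float]:
    """Concatenate mRNA-site and siRNA-antisense features: 2n + 4 values."""
    return strand_features(site, site_sums) + strand_features(antisense, antisense_sums)
```

For a 19-base site this gives 2 × 19 + 4 = 42 structural features per candidate.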
As a preferred technical solution of the invention, the target prediction model for siRNA knockdown of mRNA consists of three LightGBM regression models with different parameters, and the predictions of the three models are averaged to give the final prediction. The LightGBM regression model has the following structure:
In formula (4), $f_t(x)$ is the output value of the t-th decision tree and T is the number of decision trees.
As a preferred technical solution of the invention, when the target prediction model is trained, the first tree is built from the training set according to the predefined parameters and decision tree splitting rules, and one tree is then added at a time. The training target of the t-th tree is to fit, for each sample, the difference between the output of the first t-1 trees and the true value. This process is repeated until the model output no longer changes as trees are added or t equals the preset hyperparameter num_iterations. The model then consists of t trees, and its output value is the sum of the output values of the t trees.
When each new tree is trained, the loss function L for model training is:
where $y_i$ is the true value of sample i, $\hat{y}_i$ is the model's predicted value, and n is the number of samples. After a regularization term is added to L and the expression is simplified, the objective function J of each tree is:
where $G_j$ is the sum, over leaf node j of the current tree, of the first derivatives of the loss (taken on the difference between predicted and true sample values), $H_j$ is the corresponding sum of second derivatives, T is the number of leaves, and λ and γ are regularization parameters.
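Formula (4), the loss L and the objective J are not reproduced in this text. The block below gives the forms these symbols usually take in second-order gradient boosting. It is offered only as a plausible reading: the squared-error loss in particular is an assumption, since the text names only $y_i$, $\hat{y}_i$ and n. Note that T denotes the number of trees in formula (4) and the number of leaves in J, as in the surrounding text.

```latex
% (4) additive tree model
\hat{y}(x) = \sum_{t=1}^{T} f_t(x)

% loss over the n training samples (squared error assumed)
L = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2

% per-tree objective after adding regularization and simplifying
J = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^{2}}{H_j + \lambda} + \gamma T
```

Under this reading, λ shrinks the contribution of each leaf and γ charges a fixed cost per leaf, so larger values of either favor simpler trees.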
As a preferred technical solution of the invention, the hyperparameters of the three LightGBM regression models are set as follows:
Model 1: num_iterations: 79, learning_rate: 0.1, max_depth: 11, bagging_fraction: 0.93, bagging_freq: 1, feature_fraction: 0.147, γ: 0.28, λ: 1.9.
Model 2: num_iterations: 78, learning_rate: 0.1, max_depth: 11, bagging_fraction: 0.93, bagging_freq: 1, feature_fraction: 0.147, γ: 0.07, λ: 2.15.
Model 3: num_iterations: 83, learning_rate: 0.1, max_depth: 11, bagging_fraction: 0.93, bagging_freq: 1, feature_fraction: 0.147, γ: 0.18, λ: 1.05.
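One way to write these three settings down is as LightGBM-style parameter dictionaries together with a small averaging helper. The mapping of γ to `min_gain_to_split` and of λ to `lambda_l2` is an assumption, as is the `"regression"` objective; the text gives only the Greek symbols. The helper does not depend on LightGBM and works with any fitted object that exposes a `predict` method.

```python
# Hypothetical mapping of the three listed settings onto LightGBM parameter names.
# gamma -> min_gain_to_split and lambda -> lambda_l2 are assumptions.
SHARED = {
    "objective": "regression",  # assumed; the text says only "regression model"
    "learning_rate": 0.1,
    "max_depth": 11,
    "bagging_fraction": 0.93,
    "bagging_freq": 1,
    "feature_fraction": 0.147,
}

MODEL_PARAMS = [
    {**SHARED, "num_iterations": 79, "min_gain_to_split": 0.28, "lambda_l2": 1.9},
    {**SHARED, "num_iterations": 78, "min_gain_to_split": 0.07, "lambda_l2": 2.15},
    {**SHARED, "num_iterations": 83, "min_gain_to_split": 0.18, "lambda_l2": 1.05},
]


def ensemble_predict(models, rows):
    """Average the predictions of several fitted models, row by row."""
    per_model = [list(model.predict(rows)) for model in models]
    return [sum(values) / len(values) for values in zip(*per_model)]
```

Each dictionary would be used to train one model on the same feature matrix, and `ensemble_predict` then returns the mean of the three outputs as the final score for each candidate site.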
In a second aspect of the invention, a target prediction system for siRNA knockdown of mRNA is provided, comprising:
an mRNA sequence input module, used to input the base sequence of the mRNA to be knocked down and to obtain all candidate target sites on the mRNA and their corresponding siRNA sequences according to the base complementary pairing principle;
a sequence feature extraction module, used to extract the base sequence features of the mRNA target binding site;
a secondary structure feature extraction module, used to extract the RNA secondary structure features at the target binding site;
and a prediction model screening module, used to input all extracted features into the prediction model, which outputs a predicted value, and to screen out suitable target sites for siRNA knockdown of the mRNA according to the predicted value.
As a preferred technical solution of the invention, the sequence feature extraction module extracts, from the base sequence of the mRNA target binding site, base sequence features such as the base type at each position of the target site and of its corresponding siRNA and the occurrence frequency of each base type.
As a preferred technical solution of the invention, the secondary structure feature extraction module extracts the secondary structure features at the mRNA target binding site and the secondary structure features of the corresponding siRNA antisense strand.
As a preferred technical solution of the invention, the prediction model screening module inputs all extracted features into the prediction model, which outputs the binding probability of the mRNA target binding site and its corresponding siRNA antisense strand; suitable target sites for siRNA knockdown of the mRNA are then screened out according to the probability values output by the model.
The principle by which the invention predicts targets for siRNA knockdown of mRNA is as follows: the binding of an siRNA to an mRNA target site depends on the base sequences of the mRNA and the siRNA at the target site, on the RNA secondary structure at the target site, and on other factors. These factors are quantified as specific features, and a machine learning model then uses the features to predict targets for siRNA knockdown of the mRNA.
Compared with the prior art, the invention has the following beneficial effects. It considers not only the base sequence features of the mRNA target binding site and its corresponding siRNA but also the RNA secondary structure features at the binding site, which effectively improves target prediction for siRNA knockdown of mRNA. In comparison tests, the Spearman correlation obtained by the prediction method was far higher than that of four prediction methods in current use (Biopredsi, i-score, DSIR and ThermoComposition), so the accuracy of target prediction for siRNA knockdown of mRNA is greatly improved, a technical effect that would not be expected from the existing methods, and a sound basis is provided for siRNA screening. Using the prediction system of the invention, researchers can obtain predicted knockdown efficiencies for all siRNAs targeting any mRNA sequence without biological experiments, and can quickly identify the most effective siRNA sequence by selecting siRNAs in descending order of predicted efficiency for experimental verification. Because the prediction system is highly accurate, an effective siRNA sequence can normally be found by testing the 10 siRNAs with the highest predicted values. By contrast, less accurate approaches, such as screening by the Tuschl rules, may return a large number of siRNAs that satisfy the rules, so many siRNAs may have to be verified experimentally to identify a highly efficient one.
Drawings
FIG. 1 is a flow chart of the prediction method of the invention.
Detailed Description
The implementation of the method of the invention is described below.
As shown in FIG. 1, the invention relates to a target prediction method for siRNA knockdown of mRNA that takes RNA secondary structure features into account, comprising the following steps:
1. Input the base sequence of the mRNA to be knocked down and, following the base complementary pairing principle, obtain the siRNA sequences corresponding to all candidate target sites on the mRNA;
2. From the base sequence of each target binding site on the mRNA, extract base sequence features such as the base type at each position of the target site and of its corresponding siRNA and the occurrence frequency of each base type;
3. Extract the secondary structure features at the mRNA target binding site and the secondary structure features of the corresponding siRNA antisense strand;
4. Input all extracted features into the prediction model, which outputs the binding probability of the mRNA target binding site and its corresponding siRNA antisense strand;
5. Screen out suitable target sites for siRNA knockdown of the mRNA according to the probability values output by the model.
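The base sequence features of step 2 can be illustrated with a minimal encoder. The text does not specify how base types are encoded, so the one-hot layout, the base order and the function name below are all assumptions made for illustration.

```python
# Sketch of step 2: per-position base type (one-hot, assumed encoding)
# followed by the occurrence frequency of each base type.
from typing import List

BASES = "AUGC"  # arbitrary but fixed column order


def sequence_features(sequence: str) -> List[float]:
    """Return 4 one-hot values per position, then 4 base frequencies."""
    one_hot = [1.0 if base == candidate else 0.0
               for base in sequence
               for candidate in BASES]
    frequencies = [sequence.count(candidate) / len(sequence) for candidate in BASES]
    return one_hot + frequencies
```

Applied to both the 19-base target site and its siRNA, this encoding would give 2 × (19 × 4 + 4) = 160 sequence features, which are then joined with the secondary structure features of step 3 before being passed to the model.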
Example 1 Experimental validation
In the experiment, a LightGBM regression model was selected as the target prediction model for siRNA knockdown of mRNA.
The LightGBM regression model has the following structure:
In formula (4), $f_t(x)$ is the output value of the t-th decision tree and T is the number of decision trees.
During model training, the first tree is built from the training set according to the predefined parameters and decision tree splitting rules, and one tree is then added at a time. The training target of the t-th tree is to fit, for each sample, the difference between the output of the first t-1 trees and the true value. This process is repeated until the model output no longer changes as trees are added or t equals the preset hyperparameter num_iterations. The model then consists of t trees, and its output value is the sum of the output values of the t trees.
When each new tree is trained, the loss function L for model training is:
where $y_i$ is the true value of sample i, $\hat{y}_i$ is the model's predicted value, and n is the number of samples. After a regularization term is added to L and the expression is simplified, the objective function J of each tree is:
where $G_j$ is the sum, over leaf node j of the current tree, of the first derivatives of the loss (taken on the difference between predicted and true sample values), $H_j$ is the corresponding sum of second derivatives, T is the number of leaves, and λ and γ are regularization parameters.
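The additive training scheme described above, in which each new tree fits what the earlier trees left unexplained, can be shown with a toy example. The sketch uses depth-1 trees on a single feature with a squared-error criterion and a shrinkage factor; it does not reproduce LightGBM's tree growth, regularization or stopping rule, and it assumes the feature takes at least two distinct values.

```python
# Toy residual boosting with depth-1 trees (stumps) on one feature.
# Illustrates only the "add one tree, fit the residual" scheme.
# Assumes xs contains at least two distinct values.

def fit_stump(xs, residuals):
    """Pick the threshold split with the lowest squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the maximum so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, num_iterations=20, learning_rate=0.1):
    """Add one stump per round; each fits the residual of the earlier stumps."""
    trees = []
    predictions = [0.0] * len(xs)
    for _ in range(num_iterations):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(x) for p, x in zip(predictions, xs)]
    # The model output is the (shrunken) sum of all tree outputs.
    return lambda x: sum(learning_rate * tree(x) for tree in trees)
```

With a learning rate of 0.1, each round removes roughly a tenth of the remaining residual on data that a single split can separate, which is why several dozen iterations are needed before the sum of tree outputs approaches the true values.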
In the experiment, the target prediction model for siRNA knockdown of mRNA consisted of three LightGBM regression models with different parameters, and the predictions of the three models were averaged to give the final prediction. The hyperparameters of the three LightGBM regression models were set as follows:
Model 1: num_iterations: 79, learning_rate: 0.1, max_depth: 11, bagging_fraction: 0.93, bagging_freq: 1, feature_fraction: 0.147, γ: 0.28, λ: 1.9.
Model 2: num_iterations: 78, learning_rate: 0.1, max_depth: 11, bagging_fraction: 0.93, bagging_freq: 1, feature_fraction: 0.147, γ: 0.07, λ: 2.15.
Model 3: num_iterations: 83, learning_rate: 0.1, max_depth: 11, bagging_fraction: 0.93, bagging_freq: 1, feature_fraction: 0.147, γ: 0.18, λ: 1.05.
To test our prediction method, we performed biological experiments and compared our predictions with those of four prediction algorithms in current use (Biopredsi, i-score, DSIR and ThermoComposition).
The experiments were as follows:
We selected the mRNA of the PD-1 gene (GenBank accession number NM_005018.3) as the knockdown target. This mRNA was not present in the training data of the model of the invention or of the other methods. The following steps were performed according to the method of the invention to obtain the optimal siRNA for knocking down this mRNA:
1. Obtain the siRNA sequences corresponding to all candidate target sites on the mRNA according to the base complementary pairing principle;
2. From the base sequence of each target binding site on the mRNA, extract base sequence features such as the base type at each position of the target site and of its corresponding siRNA and the occurrence frequency of each base type;
3. Extract the secondary structure features at the mRNA target binding site and the secondary structure features of the corresponding siRNA antisense strand;
4. Input all extracted features into the prediction model, which outputs the binding probability of the mRNA target binding site and its corresponding siRNA antisense strand;
5. Rank the siRNAs by the probability values output by the model, select the ten with the highest predicted efficiency for experimental measurement, and choose the optimal siRNA according to the experimental values.
Ten target sites and their corresponding siRNAs, each 19 bases long, were randomly selected from the candidate target sites. The probability that each of these 10 siRNAs knocks down the PD-1 gene was predicted by each of the five prediction methods. The siRNAs were then synthesized and introduced into HeLa cells, and the reduction in PD-1 protein after siRNA knockdown was measured in the cells to calculate the real knockdown efficiency of each siRNA.
We performed two experiments in total, using different measurement methods to determine the real knockdown efficiency of the 10 siRNAs. In each experiment, every siRNA was tested in three replicates, and the mean of the three was taken as the final measurement. Table 1 shows, for the prediction method of the invention and the four other commonly used prediction methods, the predicted knockdown probabilities of the 10 siRNAs together with their experimentally measured real knockdown efficiencies; a lower experimental value indicates a higher knockdown efficiency.
From the data in Table 1, the Spearman correlation between the knockdown probability predicted by each of the five methods and the experimentally measured real knockdown efficiency can be calculated; the results are shown in Table 2.
Table 2 shows the Spearman correlation coefficients between the predicted knockdown probabilities of the five prediction methods and the experimentally measured real knockdown efficiencies. The results in Table 2 show that the prediction method of the invention performed best in both experiments, and that its Spearman correlation was much higher than that of the control group, namely the four commonly used prediction methods (Biopredsi, i-score, DSIR and ThermoComposition). The method therefore greatly improves the accuracy of target prediction for siRNA knockdown of mRNA, a technical effect that would not be expected from the existing methods.
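The Spearman coefficient used for this comparison is the Pearson correlation of the two rank vectors. A dependency-free sketch is given below; the function names are illustrative, and no values from Tables 1 and 2 are reproduced here.

```python
# Spearman rank correlation without external libraries.
# Tied values receive the mean of the ranks they occupy.

def ranks(values):
    """Return 1-based average ranks for a list of numbers."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Pearson correlation of the rank vectors of xs and ys."""
    rx, ry = ranks(xs), ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Because a lower experimental value indicates a higher knockdown efficiency, the sign of the coefficient depends on how the measured value is expressed; the measured values may need to be negated or converted to an efficiency before the comparison so that a good predictor yields a positive correlation.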
TABLE 1
TABLE 2
Claims (11)
1. A target prediction method for siRNA knockdown of mRNA, characterized by the following steps: first, the base sequence features of the target binding site on the mRNA are extracted; next, the RNA secondary structure features at that binding site are extracted; finally, targets for siRNA knockdown of the mRNA are predicted by a target prediction model for siRNA knockdown of mRNA.
2. The method according to claim 1, characterized in that the method comprises the following steps:
Step 1: input the base sequence of the mRNA to be knocked down and, following the base complementary pairing principle, obtain the siRNA sequences corresponding to all candidate target sites on the mRNA;
Step 2: from the base sequence of each target binding site on the mRNA, extract base sequence features, namely the base type at each position of the target site and of its corresponding siRNA and the occurrence frequency of each base type;
Step 3: extract the secondary structure features at the mRNA target binding site and the secondary structure features of the corresponding siRNA antisense strand;
Step 4: input all extracted features into the prediction model, which outputs the binding probability of the mRNA target binding site and its corresponding siRNA antisense strand;
Step 5: screen out suitable target sites for siRNA knockdown of the mRNA according to the probability values output by the model.
3. The method according to claim 2, wherein step 3 specifically comprises the following steps:
Step A: as shown in formula (1), for each base i of the single-stranded mRNA within the target binding site, calculate the sum $S_i$ of the probabilities $P_{ij}$ that base i pairs with each other base j on the whole single-stranded mRNA; m is the number of bases of the mRNA;
in formula (1), k is any secondary structure, among the many that the single-stranded mRNA may form, in which base i is paired with base j; S is any one of all the secondary structures that the single-stranded mRNA may form; $\Delta G_k$ and $\Delta G_S$ are the free energies of the secondary structures k and S; T is the absolute temperature; and R is the gas constant, 8.314 J/(mol·K);
Step B: for the probability sums $S_i$ of the bases in the target binding site obtained in step A, compute the weighted sum $F_{sum}$ and the maximum value $F_{max}$; the weighted sum, shown in formula (2), takes into account the number of hydrogen bonds formed by base pairing: the weight $W_i$ is 2 if the base is A or U and 3 if the base is C or G; the maximum value is computed as shown in formula (3), where n is the number of bases in the target binding site:
$F_{max} = \max_{1 \le i \le n} S_i$ (3)
Step C: extract the same features for the siRNA antisense strand corresponding to the mRNA target binding site, following steps A and B; in this case m = n.
4. The method of claim 3, wherein in step 3 the features of the region where the mRNA binds the siRNA antisense strand comprise the probability that the base at each position pairs with any other base (n features in total), the maximum of these n values, and their weighted sum, giving n + 2 features; extracting features in the same way for the n bases of the siRNA antisense strand yields another n + 2 features; in total, 2n + 4 features are extracted that reflect the RNA secondary structure at the mRNA target binding site.
5. The method of claim 1, wherein the target prediction model for siRNA knockdown of mRNA consists of three LightGBM regression models with different parameters, and the predictions of the three models are averaged to give the final prediction; the LightGBM regression model has the following structure:
in formula (4), $f_t(x)$ is the output value of the t-th decision tree and T is the number of decision trees.
6. The method of claim 5, wherein, when the target prediction model for siRNA knockdown of mRNA is trained, the first tree is built from the training set according to the predefined parameters and decision tree splitting rules, and one tree is then added at a time; the training target of the t-th tree is to fit, for each sample, the difference between the output of the first t-1 trees and the true value; this process is repeated until the model output no longer changes as trees are added or t equals the preset hyperparameter num_iterations; the model then consists of t trees, and its output value is the sum of the output values of the t trees;
when each new tree is trained, the loss function L for model training is:
where $y_i$ is the true value of sample i, $\hat{y}_i$ is the model's predicted value, and n is the number of samples; after a regularization term is added to L and the expression is simplified, the objective function J of each tree is:
where $G_j$ is the sum, over leaf node j of the current tree, of the first derivatives of the loss (taken on the difference between predicted and true sample values), $H_j$ is the corresponding sum of second derivatives, T is the number of leaves, and λ and γ are regularization parameters.
7. The method of claim 5 or 6, wherein the hyperparameters of the three LightGBM regression models are set as follows:
Model 1: num_iterations: 79, learning_rate: 0.1, max_depth: 11, bagging_fraction: 0.93, bagging_freq: 1, feature_fraction: 0.147, γ: 0.28, λ: 1.9;
Model 2: num_iterations: 78, learning_rate: 0.1, max_depth: 11, bagging_fraction: 0.93, bagging_freq: 1, feature_fraction: 0.147, γ: 0.07, λ: 2.15;
Model 3: num_iterations: 83, learning_rate: 0.1, max_depth: 11, bagging_fraction: 0.93, bagging_freq: 1, feature_fraction: 0.147, γ: 0.18, λ: 1.05.
8. A target prediction system for siRNA knockdown of mRNA, comprising:
an mRNA sequence input module, used to input the base sequence of the mRNA to be knocked down and to obtain all candidate target sites on the mRNA and their corresponding siRNA sequences according to the base complementary pairing principle;
a sequence feature extraction module, used to extract the base sequence features of the mRNA target binding site;
a secondary structure feature extraction module, used to extract the RNA secondary structure features at the target binding site;
and a prediction model screening module, used to input all extracted features into the prediction model, which outputs a predicted value, and to screen out suitable target sites for siRNA knockdown of the mRNA according to the predicted value.
9. The system of claim 8, wherein the sequence feature extraction module extracts, from the base sequence of the mRNA target binding site, base sequence features comprising the base type at each position of the target site and of its corresponding siRNA and the occurrence frequency of each base type.
10. The system of claim 8, wherein the secondary structure feature extraction module extracts the secondary structure features at the mRNA target binding site and the secondary structure features of the corresponding siRNA antisense strand.
11. The system of claim 8, wherein the prediction model screening module inputs all extracted features into the prediction model, which outputs the binding probability of the mRNA target binding site and its corresponding siRNA antisense strand; suitable target sites for siRNA knockdown of the mRNA are then screened out according to the probability values output by the model.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110397409.1A CN113066527B (en) | 2021-04-14 | 2021-04-14 | Target prediction method and system for siRNA knockdown mRNA |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110397409.1A CN113066527B (en) | 2021-04-14 | 2021-04-14 | Target prediction method and system for siRNA knockdown mRNA |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN113066527A (en) | 2021-07-02 |
| CN113066527B (en) | 2024-02-09 |
Family ID: 76566882
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110397409.1A Active CN113066527B (en) | 2021-04-14 | 2021-04-14 | Target prediction method and system for siRNA knockdown mRNA |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN113066527B (en) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114708909A (en) * | 2022-03-21 | 2022-07-05 | 深圳市新合生物医疗科技有限公司 | method, device and equipment for optimizing mRNA sequence and storage medium |
| CN116798513A (en) * | 2023-02-21 | 2023-09-22 | 苏州赛赋新药技术服务有限责任公司 | Method and system for screening siRNA sequence to reduce off-target effect |
| CN116825199A (en) * | 2023-02-21 | 2023-09-29 | 王全军 | Method and system for screening siRNA sequence to reduce off-target effect |
| CN119724349A (en) * | 2025-02-28 | 2025-03-28 | 电子科技大学长三角研究院(衢州) | A RNA G-quadruplex prediction method and system based on pre-trained model and RNA secondary structure |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20040002083A1 (en) * | 2002-01-29 | 2004-01-01 | Ye Ding | Statistical algorithms for folding and target accessibility prediction and design of nucleic acids |
| US20100063745A1 (en) * | 2006-12-14 | 2010-03-11 | Takeda Pharmaceutical Company Limited | Method of estimating secondary structure in rna and program and apparatus therefor |
| CN103390119A (en) * | 2013-07-03 | 2013-11-13 | 哈尔滨工程大学 | Method for recognizing transcription factor binding site |
| CN109215740A (en) * | 2018-11-06 | 2019-01-15 | 中山大学 | Full-length genome RNA secondary structure prediction method based on Xgboost |
| US20190032048A1 (en) * | 2016-02-09 | 2019-01-31 | Brookhaven Science Associates, Llc | Improved cloning and expression vectors and systems |
| CN110010194A (en) * | 2019-04-10 | 2019-07-12 | 浙江科技学院 | A Prediction Method of RNA Secondary Structure |
| CN111261223A (en) * | 2020-01-12 | 2020-06-09 | 湖南大学 | CRISPR off-target effect prediction method based on deep learning |
| CN111354420A (en) * | 2020-03-08 | 2020-06-30 | 吉林大学 | siRNA research and development method for COVID-19 virus drug therapy |
| CN111798921A (en) * | 2020-06-22 | 2020-10-20 | 武汉大学 | RNA binding protein prediction method and device based on multi-scale attention convolution neural network |
- 2021-04-14: CN application CN202110397409.1A, patent CN113066527B, status Active
Patent Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20040002083A1 (en) * | 2002-01-29 | 2004-01-01 | Ye Ding | Statistical algorithms for folding and target accessibility prediction and design of nucleic acids |
| US20100063745A1 (en) * | 2006-12-14 | 2010-03-11 | Takeda Pharmaceutical Company Limited | Method of estimating secondary structure in rna and program and apparatus therefor |
| CN103390119A (en) * | 2013-07-03 | 2013-11-13 | 哈尔滨工程大学 | Method for recognizing transcription factor binding site |
| US20190032048A1 (en) * | 2016-02-09 | 2019-01-31 | Brookhaven Science Associates, Llc | Improved cloning and expression vectors and systems |
| CN109215740A (en) * | 2018-11-06 | 2019-01-15 | 中山大学 | Full-length genome RNA secondary structure prediction method based on Xgboost |
| CN110010194A (en) * | 2019-04-10 | 2019-07-12 | 浙江科技学院 | A Prediction Method of RNA Secondary Structure |
| CN111261223A (en) * | 2020-01-12 | 2020-06-09 | 湖南大学 | CRISPR off-target effect prediction method based on deep learning |
| CN111354420A (en) * | 2020-03-08 | 2020-06-30 | 吉林大学 | siRNA research and development method for COVID-19 virus drug therapy |
| CN111798921A (en) * | 2020-06-22 | 2020-10-20 | 武汉大学 | RNA binding protein prediction method and device based on multi-scale attention convolution neural network |
Non-Patent Citations (4)
| Title |
|---|
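| Liu Yuanning; Xu Baolin; Zhang Hao; Chen Jingbo; Han Ye; Yu Jianlong: "Efficient siRNA screening based on thermodynamic features of siRNA-mRNA binding", Journal of Jilin University (Engineering and Technology Edition), no. 01, pages 191 - 195 * |
| Wu Hongjie et al.: "A deep prediction method for RNA secondary structure with adaptive sequence length", Journal of Chinese Computer Systems, vol. 40, no. 8, pages 1799 - 1803 * |
| Liang Cheng et al.: "A new characteristic sequence representation of RNA secondary structure and similarity analysis", Application Research of Computers, vol. 28, no. 3, pages 969 - 971 * |
| Zhao Yingjie; Wang Zhengzhi: "RNA multiple sequence alignment based on structural information", Chinese Journal of Bioinformatics, no. 02, pages 128 - 132 * |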
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114708909A (en) * | 2022-03-21 | 2022-07-05 | 深圳市新合生物医疗科技有限公司 | method, device and equipment for optimizing mRNA sequence and storage medium |
| WO2023179273A1 (en) * | 2022-03-21 | 2023-09-28 | 深圳市新合生物医疗科技有限公司 | Mrna sequence optimization method and apparatus, device, and storage medium |
| CN114708909B (en) * | 2022-03-21 | 2023-10-20 | 深圳市新合生物医疗科技有限公司 | mRNA sequence optimization method and device, equipment and storage medium |
| CN116798513A (en) * | 2023-02-21 | 2023-09-22 | 苏州赛赋新药技术服务有限责任公司 | Method and system for screening siRNA sequence to reduce off-target effect |
| CN116825199A (en) * | 2023-02-21 | 2023-09-29 | 王全军 | Method and system for screening siRNA sequence to reduce off-target effect |
| CN116798513B (en) * | 2023-02-21 | 2023-12-15 | 苏州赛赋新药技术服务有限责任公司 | Method and system for screening siRNA sequence to reduce off-target effect |
| CN119724349A (en) * | 2025-02-28 | 2025-03-28 | 电子科技大学长三角研究院(衢州) | A RNA G-quadruplex prediction method and system based on pre-trained model and RNA secondary structure |
| CN119724349B (en) * | 2025-02-28 | 2025-05-16 | 电子科技大学长三角研究院(衢州) | RNA G-quadruplex prediction method and system based on pre-training model and RNA secondary structure |
Also Published As
| Publication number | Publication date |
|---|---|
| CN113066527B (en) | 2024-02-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN113066527B (en) | | Target prediction method and system for siRNA knockdown mRNA |
| CN108595913B (en) | | A supervised learning method for discriminating mRNA and lncRNA |
| CN106599615B (en) | | A Sequence Characteristic Analysis Method for Predicting miRNA Target Genes |
| Mostavi et al. | | Deep-2'-O-me: predicting 2'-O-methylation sites by convolutional neural networks |
| CN112599194B (en) | | Method and device for processing methylation sequencing data |
| CN109215740A (en) | | Full-length genome RNA secondary structure prediction method based on Xgboost |
| CN107918725B (en) | | A DNA methylation prediction method for selecting optimal features based on machine learning |
| CN108090327B (en) | | Prediction method for exogenous miRNA (micro ribonucleic acid) regulation and control target gene containing three-dimensional free energy |
| CN110021361B (en) | | miRNA target gene prediction method based on convolutional neural network |
| CN113407185A (en) | | Compiler optimization option recommendation method based on Bayesian optimization |
| Yao et al. | | plantMirP: an efficient computational program for the prediction of plant pre-miRNA by incorporating knowledge-based energy features |
| CN116798513B (en) | | Method and system for screening siRNA sequence to reduce off-target effect |
| KR101840028B1 (en) | | Method and apparatus for integrated analysis of expression data of miRNA and mRNA |
| CN116825199A (en) | | Method and system for screening siRNA sequence to reduce off-target effect |
| JP2008146538A (en) | | MicroRNA detection apparatus, method and program |
| CN110059228B (en) | | DNA data set implantation motif searching method and device and storage medium thereof |
| Lokuge et al. | | miRNAFinder: A comprehensive web resource for plant Pre-microRNA classification |
| CN119358386A (en) | | A large prefabricated T-beam strength prediction model, method, system and readable medium |
| CN116805512A (en) | | Method for simulating high-depth sequencing TSS (TSS) characteristics |
| Bu et al. | | An efficient deep learning based predictor for identifying miRNA-triggered phasiRNA loci in plant |
| CN116364191B (en) | | A method for predicting single-cell m6A methylation profiles |
| NL2013120B1 (en) | | A method for finding associated positions of bases of a read on a reference genome. |
| CN115394350A (en) | | False junction RNA structure prediction method based on simulated annealing algorithm |
| JP4843778B2 (en) | | siRNA RNAi effect calculation device |
| CN118899029B (en) | | Optimization method of sequence design |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |