WO2021075575A1 - Method for predicting nuclear magnetic resonance chemical shift values, prediction apparatus, and prediction program - Google Patents
Method for predicting nuclear magnetic resonance chemical shift values, prediction apparatus, and prediction program
- Publication number
- WO2021075575A1 (application PCT/JP2020/039187)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dihedral angle
- chemical shift
- amino acid
- value
- distribution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N23/00—Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00
- G01N23/20—Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by using diffraction of the radiation by the materials, e.g. for investigating crystal structure; by using scattering of the radiation by the materials, e.g. for investigating non-crystalline materials; by using reflection of the radiation by the materials
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N24/00—Investigating or analyzing materials by the use of nuclear magnetic resonance, electron paramagnetic resonance or other spin effects
Definitions
- the present invention relates to a method, a prediction device, and a prediction program for predicting a chemical shift value in nuclear magnetic resonance analysis.
- IDPs intrinsically disordered proteins
- IDRs intrinsically disordered regions
- the nuclear magnetic resonance (NMR) method is a useful tool for studying IDPs (Non-Patent Document 1).
- the chemical shift (δ / ppm) is the most basic and important observable (parameter) and is used to predict the three-dimensional structure of peptides, represented by the three-dimensional coordinates of their atoms (Non-Patent Document 2).
- attempts have also been made to identify the structures of IDPs by predicting chemical shifts in NMR analysis.
- in Non-Patent Document 3, the structural analysis of Tau protein, which is a kind of IDP, and of α-synuclein determines the structure of each peptide contained in the target peptide population by combining small-angle X-ray scattering (SAXS) analysis with NMR analysis, so many experiments must be performed and analyzed.
- SAXS small-angle X-ray scattering
- Non-Patent Document 2 is useful in that a chemical shift can be predicted from the three-dimensional structure of the target peptide.
- the chemical shift is predicted from the dihedral angle values (φ and ψ values) of the main chain of the amino acids constituting the peptide and the dihedral angle value (χ1 value) of the side chains of the peptide, and can be compared with the experimental values.
- this method requires an enormous amount of time for analysis when there is diversity in the structure of peptides such as IDPs and IDRs.
- An object of the present invention is to predict the three-dimensional structure of a target peptide in a shorter time.
- Item 1 A method for predicting nuclear magnetic resonance chemical shift values of peptides.
- Item 2. The prediction method according to Item 1, further including, prior to clustering the frequency distribution of the main chain dihedral angles of each amino acid residue, pre-clustering based on the side chain dihedral angle (represented by the value of χ1). Item 3. The prediction method according to Item 1 or 2, which is for predicting the nuclear magnetic resonance chemical shift value of the unchained protein. Item 4.
- the prediction device (10, 20) includes a processing unit (101, 201).
- the processing unit (101, 201) clusters the appearance frequency distribution of the dihedral angle values (represented by the φ and ψ values) of the main chain of each amino acid residue constituting the target peptide, and obtains the distributions of a plurality of clusters.
- the prediction device (10, 20).
- Item 8 A method for predicting nuclear magnetic resonance chemical shift values of peptides.
- the dihedral angle value of the main chain represented by the values of φ and ψ
- the dihedral angle value of the side chain represented by the value of χ1
- a prediction method including a step of clustering the frequency distribution in three dimensions and acquiring the distributions of a plurality of clusters; a step of calculating, using the score function, the similarity between the acquired distribution of each cluster and the reference main chain dihedral angle values registered in the amino acid 3-residue database; and a step of predicting the nuclear magnetic resonance chemical shift value of the peptide of interest from the calculated similarity.
- the NMR chemical shift value of the target peptide can be predicted in a shorter time.
- FIG. 1 (a) shows a dihedral angle appearance frequency distribution map of the target peptide.
- FIG. 1 (b) shows a part of the molecular model of the target peptide.
- FIG. 1 (c) shows a cluster distribution in which the dihedral angle appearance frequency distribution map is clustered with a mixed Gaussian distribution.
- FIG. 1 (d) shows a combination of clusters of the dihedral angle appearance frequencies of the (i-1)-th, i-th, and (i+1)-th amino acid residues of the mixed Gaussian distribution.
- FIG. 2 shows the hardware configuration of the prediction systems 1000 and 2000.
- FIG. 3 shows the hardware configuration of the prediction device 10.
- FIG. 4 shows the functional blocks of the prediction device 10.
- FIG. 5 shows the processing flow of the prediction program 1042.
- FIG. 6 shows the flowchart of the chemical shift value prediction processing of step S5 shown in FIG. 5 and step S26 shown in FIG. 9.
- FIG. 7 shows the hardware configuration of the prediction device 20.
- FIG. 8 shows the functional blocks of the prediction device 20.
- FIG. 9 shows the processing flow of the prediction program 2042.
- the comparison between the nuclear magnetic resonance chemical shift values predicted by the method of the present invention and those predicted by the SPARCA program for unfolded apomyoglobin is shown.
- C, CA, CB, N, NH, and HA indicate ¹³C', ¹³Cα, ¹³Cβ, ¹⁵N, ¹HN, and ¹Hα, respectively.
- three clustering results obtained by pre-clustering with the value of χ1 of the glutamine residue are shown.
- the left column shows "Gauche+", the middle column shows "Trans", and the right column shows "Gauche-".
- the upper row shows the clustering result, and the lower row shows the predicted main chain dihedral angle data.
- the reproducibility of the chemical shift experimental data of urea-denatured apomyoglobin is shown.
- C, CA, CB, and HA indicate ¹³C', ¹³Cα, ¹³Cβ, and ¹Hα, respectively.
- the "peptide of interest" is intended to be a peptide for which a chemical shift value is to be predicted, and may be simply referred to as the "target peptide".
- a "residue" of an amino acid is a constituent unit of the amino acids making up a peptide, and represents the group obtained by removing a hydrogen atom from the main chain amino group of the amino acid and/or -OH from its main chain carboxyl group.
- the amino acid is not particularly limited, but is preferably a natural amino acid. More preferably, it is selected from valine, isoleucine, leucine, methionine, lysine, phenylalanine, tryptophan, threonine, histidine, arginine, glycine, alanine, serine, tyrosine, cysteine, asparagine, glutamine, proline, aspartic acid, and glutamic acid.
- "each amino acid residue constituting the target peptide" refers to every individual amino acid residue in the target peptide.
- a main chain dihedral angle φ and a main chain dihedral angle ψ exist for each of the 100 amino acid residues.
- a side chain dihedral angle χ1 is present.
- the proline residue has substantially only one side chain dihedral angle χ1.
- the "chemical shift value” is intended as a nuclear magnetic resonance chemical shift value.
- Non-Patent Document 2 describes a method that uses the three-dimensional structure of a peptide (the three-dimensional coordinates of its constituent atoms) to predict, with the SPARCA program, the chemical shift values of the main chain atoms of the peptide constituting a protein.
- i indicates the residue number of the amino acid constituting the target peptide
- j indicates the residue number in the 3-residue database.
- r represents the nuclide of the atom (¹⁵N, ¹HN, ¹Hα, ¹³Cα, ¹³Cβ, ¹³C').
- the sum on the right side indicates that the scores of each of the three consecutive residues are added together.
- the terms on the right side show, from the left, the similarity of the amino acid residue species (ResType), the main chain dihedral angle φ, the main chain dihedral angle ψ, and the side chain dihedral angle χ1, respectively.
- kn,r in the formula are parameters, and the values optimized for the chemical shift prediction of natural globular proteins are shown in Non-Patent Document 2. 24,166 entries are registered in the 3-residue database; the score function is calculated for all of them, and the weighted average of the chemical shift values of the 20 entries with the lowest score values S is used as the estimated value.
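The scoring and averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPARCA implementation: the function names, the tuple layout of a residue, the single coefficient per term (instead of the position- and nuclide-dependent kn,r), the 0/1 residue-type mismatch penalty, and the inverse-score weighting are all hypothetical choices made for the example.

```python
import math

def wrap(d):
    """Wrap an angle difference (radians) into [-pi, pi)."""
    return (d + math.pi) % (2.0 * math.pi) - math.pi

def triplet_score(query, entry, k_res=1.0, k_phi=1.0, k_psi=1.0, k_chi1=1.0):
    """Sum residue-type, phi, psi and chi1 dissimilarities over three residues.

    query/entry: lists of three (res_type, phi, psi, chi1) tuples, angles in
    radians, chi1 = None where no chi1 is considered. The coefficients are
    placeholders (assumption), not the optimized values of the source.
    """
    s = 0.0
    for (q_res, q_phi, q_psi, q_chi), (e_res, e_phi, e_psi, e_chi) in zip(query, entry):
        s += k_res * (0.0 if q_res == e_res else 1.0)  # assumed 0/1 type penalty
        s += k_phi * wrap(q_phi - e_phi) ** 2
        s += k_psi * wrap(q_psi - e_psi) ** 2
        if q_chi is not None and e_chi is not None:
            s += k_chi1 * wrap(q_chi - e_chi) ** 2
    return s

def predict_shift(query, database, n_best=20, eps=1e-6):
    """Weighted average of the shifts of the n_best lowest-scoring entries.

    database: list of (triplet, shift) pairs. Weights 1 / (S + eps) are an
    assumption; the source only states that a weighted average is used.
    """
    scored = sorted((triplet_score(query, t), shift) for t, shift in database)
    best = scored[:n_best]
    weights = [1.0 / (s + eps) for s, _ in best]
    return sum(w * shift for w, (_, shift) in zip(weights, best)) / sum(weights)
```

With a real 3-residue database, `database` would hold all 24,166 entries and `n_best=20` would select the 20 lowest-scoring ones, as in the text.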
- globular protein is intended as a peptide that forms a unique three-dimensional structure.
- the unchained protein (also called an "untied" or unfolded protein) is intended to be a peptide in a denatured state in which the chain is unfolded.
- the unchained protein may include a denatured protein and the like.
- the globular protein is preferably a natural globular protein.
- the denatured protein is preferably an intrinsically disordered protein.
- "natural" is intended not to be a protein or peptide artificially manipulated with a denaturing agent or the like.
- the main chain dihedral angle (φ, ψ) of the peptide is intended to be the angle of rotation around a covalent bond. Therefore, its value is defined in the range of -π to π and has a periodicity with a period of 2π.
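Because of this periodicity, angle differences have to be taken on the circle rather than on the real line. A minimal sketch (the helper names are hypothetical, chosen for this example):

```python
import math

def wrap_angle(theta):
    """Map any angle in radians onto the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi

def angle_difference(a, b):
    """Smallest signed difference a - b, respecting the 2*pi periodicity."""
    return wrap_angle(a - b)
```

For example, +179° and -179° differ by 2° on the circle, not by 358°, which is what `angle_difference` returns (in radians).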
- the appearance frequency distribution of the main chain dihedral angle values (hereinafter, "main chain dihedral frequency distribution") is shown.
- the main chain dihedral frequency distribution is widely used in the local structure classification of natural globular proteins and in the quality control of three-dimensional structures.
- this main chain dihedral frequency distribution can also be used in the structural analysis of unchained proteins. As shown in FIG. 1 (a), the main chain dihedral frequency distribution has some regions of high frequency, while almost no structures are found in other regions.
- the dihedral angle of the main chain or side chain of the peptide can be obtained by measurement by X-ray diffraction method, neutron diffraction method, etc., or by computer simulation such as quantum chemistry calculation.
- first, about 10^4 to 10^6 dihedral angle values of the target peptide are sampled from the distribution of the dihedral angle values of the main chain structure of the peptide of interest shown in FIG. 1 (a) (hereinafter also referred to as "target peptide dihedral angle values").
- the dihedral angle value of the main chain structure of the target peptide includes values obtained from a plurality of target peptides.
- TPDB tripeptide database
- the degree of similarity with the dihedral angle values (hereinafter also referred to as "reference dihedral angle values") of each structure of 24,166 peptides composed of three residues is determined.
- the degree of similarity can be determined by a score function. Subsequently, based on the similarity obtained from the score function, the chemical shift values of the atoms ¹³C', ¹³Cα, ¹³Cβ, ¹⁵N, ¹HN, or ¹Hα of a similar structure, as shown in FIG. 1 (b), are extracted from the database in which they are recorded and used as the predicted chemical shift values of the target peptide.
- the three-dimensional structure is registered in a protein structure database such as RCSB Protein Data Bank (RCSB PDB: https://www.rcsb.org/).
- RCSB PDB Protein Data Bank
- BMRB BioMagResBank
- a database of the chemical shift values of each of the nuclides (¹³C', ¹³Cα, ¹³Cβ, ¹⁵N, ¹HN, or ¹Hα) together with the dihedral angle values of the main chain and side chains of amino acid 3 residues, that is, an amino acid residue plus the one amino acid residue before and after it, can be mentioned.
- the database can include the dihedral angle values of the main chain and side chains of the amino acid 3 residues, the names of the amino acid residues constituting the 3 residues, the chemical shift values of the nuclides of each atom constituting each amino acid residue, and the like.
- the amino acid 3 residue database described in Non-Patent Document 2 is constructed from amino acid 3 residues contained in 200 proteins satisfying the above requirements.
- One embodiment of the present invention relates to a method of predicting a chemical shift value of nuclear magnetic resonance of a peptide.
- the method may include (1) a step of clustering the appearance frequency distribution of the dihedral angle values (represented by the values of φ and ψ) of the main chain of each amino acid residue constituting the target peptide and acquiring the distributions of a plurality of clusters, (2) a step of calculating the degree of similarity between the acquired distribution of each cluster and the reference main chain dihedral angle values registered in the amino acid 3-residue database, and (3) a step of predicting the nuclear magnetic resonance chemical shift value of the peptide of interest from the calculated similarity.
- the appearance frequency distribution of the dihedral angle values of the main chain of the target peptide is clustered, and based on the distribution of the clusters, a similar structure is extracted from the reference dihedral angle values recorded in the 3-residue database described in 1-1. above.
- the number of clusters in one dihedral frequency distribution map can be plural.
- the number of clusters can be on the order of 4 to 6.
- the dihedral angle value of the main chain may be abbreviated as "main chain dihedral angle value”.
- the appearance frequency distribution of the dihedral angle value of the main chain may be abbreviated as "main chain dihedral angle distribution”.
- pre-clustering may be performed in advance to cluster the values of the dihedral angles (χ1) of the side chains of each amino acid residue constituting the target peptide.
- the dihedral angle of the side chain may be abbreviated as "side chain dihedral angle".
- the value of the side chain dihedral angle (χ1) may also be referred to as the "side chain dihedral angle value" or the "value of χ1".
- pre-clustering is intended to obtain the frequency distribution at the three peaks (angles) of χ1 for amino acid residues other than glycine, alanine, and proline residues among the amino acid residues constituting the target peptide. Specifically, χ1 has three peaks: "Gauche+" near +60°, "Gauche-" near -60°, and "Trans" near +180°. Here, "+" means "plus" and "-" means "minus". When the side chain dihedral angle (χ1) values indicated by these three χ1 peaks are taken into account in obtaining the main chain dihedral angle values of the target peptide for each amino acid residue, three different main chain dihedral angle distributions can be obtained for one amino acid residue according to χ1.
- the main chain dihedral angle distribution corresponding to the value of χ1, acquired for at least one peak of χ1, may be used for clustering.
- a main chain dihedral angle distribution corresponding to the value of χ1 is obtained for at least two peaks of χ1 and used for clustering.
- the main chain dihedral angle distribution corresponding to the value of χ1, acquired for the three peaks of χ1, is used for clustering.
- a group of main chain dihedral angle distributions can be obtained for each amino acid according to the side chain dihedral angle (χ1) of the amino acid residues constituting the target peptide, and clustering can be performed on each group of main chain dihedral angle distributions.
- since χ1 cannot be taken into consideration for the glycine residue, alanine residue, and proline residue, for these amino acids it is preferable to define the parameter kn,r of the χ1 term as "0 (zero)" in the formula (2) described later.
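The pre-clustering by χ1 peak can be sketched as a nearest-peak assignment on the circle. This is an illustrative sketch only: the function names and the rule "assign each sample to the closest of the three peak centers" are assumptions, since the text does not specify how samples are assigned to peaks.

```python
# Peak centers in degrees, taken from the text: +60, -60 and +180.
CHI1_PEAKS = {"gauche+": 60.0, "gauche-": -60.0, "trans": 180.0}

def circular_distance_deg(a, b):
    """Absolute angular distance in degrees, respecting the 360-degree period."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def chi1_rotamer(chi1_deg):
    """Name of the chi1 peak closest to the given angle (assumed assignment rule)."""
    return min(CHI1_PEAKS, key=lambda name: circular_distance_deg(chi1_deg, CHI1_PEAKS[name]))

def precluster_by_chi1(samples):
    """Split (phi, psi, chi1) samples into three (phi, psi) groups, one per chi1 peak."""
    groups = {name: [] for name in CHI1_PEAKS}
    for phi, psi, chi1 in samples:
        groups[chi1_rotamer(chi1)].append((phi, psi))
    return groups
```

Each of the three resulting groups is a separate main chain dihedral angle distribution that can then be clustered on its own.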
- the clustering method is not limited as long as the distribution of each dihedral angle value in one dihedral angle frequency distribution can be clustered.
- clustering can be performed by a mixed Gaussian distribution model, a von Mises distribution model, or the like. From the viewpoint of computational cost, it is preferable to use a mixed Gaussian distribution model.
- parameters can be estimated by EM algorithm, maximum likelihood estimation, MAP estimation, Bayesian estimation, or the like.
- the clustering of the distribution of the main chain dihedral angle values is performed in the two dimensions of the value of φ and the value of ψ. Further, the pre-clustering of the distribution of the side chain dihedral angle values is performed in one dimension, only for the value of χ1.
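As an illustration of two-dimensional clustering with a mixed Gaussian distribution fitted by the EM algorithm, the following is a minimal pure-Python sketch. Several simplifications are assumptions of this example and not taken from the source: covariances are diagonal, initial means are supplied by the caller, angles are treated as ordinary real numbers (periodicity is ignored), and the function name `fit_gmm_2d` is hypothetical.

```python
import math

def fit_gmm_2d(points, init_means, n_iter=50, min_var=1e-4):
    """EM for a 2-D Gaussian mixture with diagonal covariances (assumption).

    points: list of (phi, psi); init_means: one (phi, psi) guess per cluster.
    Returns (weights, means, variances).
    """
    k, n = len(init_means), len(points)
    means = [tuple(m) for m in init_means]
    variances = [(1.0, 1.0)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each cluster for each point.
        resp = []
        for x, y in points:
            dens = []
            for w, (mx, my), (vx, vy) in zip(weights, means, variances):
                expo = -0.5 * ((x - mx) ** 2 / vx + (y - my) ** 2 / vy)
                dens.append(w * math.exp(expo) / (2.0 * math.pi * math.sqrt(vx * vy)))
            total = sum(dens) or 1e-300  # guard against total underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each cluster.
        for c in range(k):
            nc = sum(r[c] for r in resp) or 1e-300
            mx = sum(r[c] * p[0] for r, p in zip(resp, points)) / nc
            my = sum(r[c] * p[1] for r, p in zip(resp, points)) / nc
            vx = max(sum(r[c] * (p[0] - mx) ** 2 for r, p in zip(resp, points)) / nc, min_var)
            vy = max(sum(r[c] * (p[1] - my) ** 2 for r, p in zip(resp, points)) / nc, min_var)
            means[c], variances[c], weights[c] = (mx, my), (vx, vy), nc / n
    return weights, means, variances
```

In practice a library implementation with full covariances would normally be preferred; the sketch only shows the alternation of E-step and M-step that the EM algorithm performs on the (φ, ψ) samples.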
- the number of clusters in one dihedral frequency distribution map can be plural.
- the number of clusters can be on the order of 12 to 18.
- the present embodiment will be described below by taking a case where a mixed Gaussian distribution is used as an example.
- the term relating to the main chain dihedral angle of the score function shown in equation (1) is extended to a score function for one Gaussian distribution (one region) constituting the mixed Gaussian distribution.
- the extended score function integrates over one Gaussian distribution (one region) and compares it, by the following formula, with the dihedral angle values of each structure of the peptides consisting of three amino acid residues recorded in the 3-residue database.
- unlike in equation (1), i represents a region (Gaussian distribution) of the main chain dihedral angle distribution.
- Other symbols are the same as in equation (1).
- θ represents a dihedral angle (φ, ψ) of the main chain.
- FWN represents the following wrapped normal distribution function.
- the FWN parameters, the mean μθ and the variance σθ², are given for each region of the main chain dihedral angle distribution, and the integration of equation (3) is performed.
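The patent's own expression for FWN is not reproduced in this text. For orientation, the standard textbook wrapped normal density sums shifted copies of a normal density over all multiples of 2π; the sketch below truncates that sum to a few terms. The function name and the truncation parameter `n_wraps` are assumptions of this example.

```python
import math

def wrapped_normal_pdf(theta, mu, sigma, n_wraps=3):
    """Wrapped normal density at theta (radians), truncated to |k| <= n_wraps.

    Standard definition: sum over integers k of N(theta + 2*pi*k; mu, sigma^2).
    For sigma well below 2*pi a few terms are sufficient.
    """
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-((theta - mu + 2.0 * math.pi * k) ** 2) / (2.0 * sigma ** 2))
        for k in range(-n_wraps, n_wraps + 1)
    )
```

The density is 2π-periodic in theta and integrates to approximately 1 over one period, which is what makes it suitable for dihedral angles.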
- the calculation time for predicting one peptide using the SPARCA program is equivalent to the prediction time for an unchained protein population with one main chain dihedral angle distribution pattern.
- when the SPARCA program is used to predict the structures of a population containing multiple unchained proteins with varying patterns of unchaining, the computation takes several times longer than the prediction of one peptide.
- the structure can be predicted in a short time even in a population containing a plurality of unchained proteins, and the time is equivalent to that required to predict the structure of one peptide using the SPARCA program.
- Step 2: Perform the pre-calculation. The calculation is performed and recorded only as many times as there are regions of the main chain dihedral frequency distribution (for example, 4 to 6 regions). This formula is independent of the amino acid 3-residue database.
- Step 3: Obtain the dihedral angle value (θj) measured from the center of the distribution and calculate the score function sθ(j). The obtained value is squared and the value of Step 2 is added to calculate sθ(j). This calculation is performed for all entries of the amino acid 3-residue database.
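One way to read Steps 2 and 3 is through the identity that the expected squared deviation of a Gaussian variable from a fixed value equals the squared offset of the mean plus the variance. The sketch below applies that reading; it is an interpretation, not the patent's formula. It ignores any wrap-around correction terms, and the function names and the `(mu, sigma)` region layout are hypothetical.

```python
import math

def wrap(d):
    """Wrap an angle difference (radians) into [-pi, pi)."""
    return (d + math.pi) % (2.0 * math.pi) - math.pi

def precompute_region_terms(regions):
    """Step 2 (assumed reading): one database-independent term per region, its variance."""
    return [sigma ** 2 for _mu, sigma in regions]

def region_scores(regions, db_angles):
    """Step 3 (assumed reading): squared offset from the region center plus the Step 2 term.

    regions: list of (mu, sigma) per cluster region; db_angles: reference angles.
    Returns one list of scores per region, in database order.
    """
    pre = precompute_region_terms(regions)
    return [
        [wrap(mu - theta_j) ** 2 + var for theta_j in db_angles]
        for (mu, _sigma), var in zip(regions, pre)
    ]
```

The benefit of this split is that the per-region term is computed once (4 to 6 times in total), while only a subtraction and a square are needed per database entry.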
- the first term is the same as the calculation of the quadratic moment of the Gaussian distribution because the integration range is divided and integrated, and the sum is added later.
- the second term then performs the integration normally.
- k 0 is 0, so if you write down the one with the same absolute value of k, Will be.
- FIG. 2 shows the hardware configuration of a system for predicting the nuclear magnetic resonance chemical shift value of a peptide (hereinafter also simply referred to as "prediction system 1000"), provided with a prediction device 10 (hereinafter also simply referred to as "prediction device 10") that predicts the nuclear magnetic resonance chemical shift value of the peptide according to the first embodiment.
- the prediction device 10 is communicably connected to the amino acid 3-residue database 30 via a wired or wireless network.
- FIG. 3 shows the hardware configuration of the prediction device 10.
- the prediction device 10 may be connected to the input device 111, the output device 112, and the storage medium 113.
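- the CPU 101, the memory 102, the ROM 103, the recording device 104, the communication interface (I / F) 105, the input interface (I / F) 106, the output interface (I / F) 107, and the media interface (I / F) 108 are connected to each other by a bus 109 so as to be capable of data communication.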
- the memory 102 and the recording device 104 may be collectively referred to as a storage unit.
- the recording device 104 records, in a non-volatile manner, an operating system (OS) 1041 that provides a graphical user interface environment such as Windows (registered trademark) manufactured and sold by Microsoft Corporation in the United States, a prediction program 1042 that is the application software of the present invention, and a target peptide dihedral angle value database 1043 that stores the target peptide dihedral angle values.
- the CPU 101 is a processing unit of the prediction device 10.
- the computer functions as the prediction device 10 when the CPU 101 executes the OS 1041 stored in the recording device 104 or the ROM 103 in cooperation with the prediction program 1042 and performs the processes of steps S1 to S5 shown in FIG. 5.
- the ROM 103 is composed of a mask ROM, a PROM, an EPROM, an EEPROM, and the like, and records a computer program executed by the CPU 101 and data used for the program.
- the CPU 101 may be the MPU 101.
- the ROM 103 stores the boot program executed by the CPU 101 when the prediction device 10 is started, as well as programs and settings related to the operation of the hardware of the prediction device 10.
- the memory 102 is composed of a RAM (Random access memory) such as SRAM or DRAM.
- the memory 102 is used to read a computer program recorded in the ROM 103 and the recording device 104. Further, the memory 102 is used as a work area when the CPU 101 executes these computer programs.
- the communication I / F 105 is composed of a serial interface such as USB, IEEE1394, or RS-232C, a parallel interface such as SCSI, IDE, or IEEE1284, an analog interface including a D / A converter and an A / D converter, a network interface controller (NIC), and the like.
- NIC Network interface controller
- the communication I / F 105 receives data from the measuring unit 30 or another external device, and transmits or displays the information stored or generated by the prediction device 10 to the outside as needed.
- the communication I / F 105 receives data from an external database via a network.
- the input I / F 106 is composed of, for example, a serial interface such as USB, IEEE1394, RS-232C, a parallel interface such as SCSI, IDE, IEEE1284, and an analog interface including a D / A converter and an A / D converter.
- the input I / F 106 accepts character input, click, voice input, and the like from the input device 111.
- the received input contents are stored in the memory 102 or the recording device 104.
- the input device 111 is composed of a touch panel, a keyboard, a mouse, a pen tablet, a microphone, and the like, and inputs characters or voices to the prediction device 10.
- the input device 111 may be connected from the outside of the prediction device 10 or may be integrated with the prediction device 10.
- the output I / F 107 is composed of an interface similar to that of the input I / F 106, for example.
- the output I / F 107 outputs the information generated by the CPU 101 to the output device 112.
- the output I / F 107 outputs the information generated by the CPU 101 and stored in the recording device 104 to the output device 112.
- the output device 112 is composed of, for example, a display, a printer, or the like, and displays measurement results transmitted from the measurement unit 30, various operation windows in the prediction device 10, analysis results, and the like.
- the media I / F 108 reads, for example, application software stored in the storage medium 113.
- the read application software and the like are stored in the memory 102 or the recording device 104. Further, the media I / F 108 writes the information generated by the CPU 101 into the storage medium 113. The media I / F 108 writes the information generated by the CPU 101 and stored in the recording device 104 to the storage medium 113.
- the storage medium 113 is composed of a flexible disk, a CD-ROM, a DVD-ROM, or the like.
- the storage medium 113 is connected to the media I / F 108 by a flexible disk drive, a CD-ROM drive, a DVD-ROM drive, or the like.
- the storage medium 113 may store an application program or the like for the computer to execute an operation.
- the CPU 101 may acquire the application software and various settings necessary for controlling the prediction device 10 via the network instead of reading from the ROM 103 or the recording device 104.
- the application program may be stored in the auxiliary storage unit of a server computer on the network, and the prediction device 10 can access the server computer, download the computer program, and store it in the ROM 103 or the recording device 104.
- the ROM 103 or the recording device 104 is installed with an operating system that provides a graphical user interface environment such as Windows (registered trademark) manufactured and sold by Microsoft Corporation in the United States.
- the application program according to the second embodiment shall run on the operating system. That is, the prediction device 10 can be a personal computer or the like.
- FIG. 4 shows the functional configuration of the prediction device 10.
- the CPU 101 of the prediction device 10 functions as a sample dihedral angle value acquisition unit 1, a reference dihedral angle value acquisition unit 3, a clustering unit 5, a cluster distribution comparison unit 7, and a chemical shift prediction unit 9.
- the sample dihedral angle value acquisition unit 1 corresponds to step S1 in FIG. 5 described later
- the reference dihedral angle value acquisition unit 3 corresponds to step S2 in FIG. 5 described later
- the clustering unit 5 corresponds to step S3 in FIG. 5 described later.
- the cluster distribution comparison unit 7 corresponds to step S4 in FIG. 5 described later
- the chemical shift prediction unit 9 corresponds to step S5 in FIG. 5 described later.
- FIGS. 5 and 6 show a flow of processing executed by the prediction program 1042.
- in step S1 of FIG. 5, the CPU 101 acquires the main chain dihedral angle values of the target peptide from the sample dihedral angle value database 1043 stored in the recording device 104. This process is started when the user requests the acquisition of the main chain dihedral angle values of the target peptide from the input device 111.
- in step S2 of FIG. 5, the CPU 101 acquires the reference dihedral angle values stored in the amino acid 3-residue database 30 shown in FIG. 2 via the communication I / F 105. This process is started when the user requests the acquisition of the reference dihedral angle values from the input device 111.
- the order of steps S1 and S2 in FIG. 5 may be reversed.
- the CPU 101 may acquire the reference dihedral angle values stored in the amino acid 3-residue database 30 shown in FIG. 2 in advance and record them in the recording device 104.
- in step S3 of FIG. 5, the CPU 101 clusters the main chain dihedral angle values of the target peptide acquired in step S1 and acquires the cluster distribution.
- the clustering method is as explained in 1-2. (1) above.
- in step S4 of FIG. 5, the CPU 101 compares the cluster distribution acquired by the clustering in step S3 with the reference dihedral angle values acquired in step S2, and calculates the similarity.
- the calculation method of the similarity is as explained in 1-2. (2) above.
- in step S5 of FIG. 5, based on the similarity calculated in step S4, the CPU 101 obtains the chemical shift value of the nuclide of each atom corresponding to the reference dihedral angle value similar to the cluster distribution acquired in step S3, as the predicted chemical shift value of the nuclide of each atom contained in each amino acid of the target peptide.
- the process of step S5 shown in FIG. 5 performed by the CPU 101 will be described in more detail with reference to FIG. 6.
- in step S51 shown in FIG. 6, the CPU 101 extracts, from the amino acid 3-residue database 30, a structure having a reference dihedral angle value with a high degree of similarity in step S4 shown in FIG. 5.
- in step S52 shown in FIG. 6, the CPU 101 acquires the chemical shift value corresponding to the structure extracted in step S51 from the amino acid 3-residue database 30.
- in step S53 shown in FIG. 6, the CPU 101 determines whether or not the chemical shift values have been acquired for all the nuclides. When not all the chemical shift values have been acquired (in the case of "No"), the CPU 101 returns to step S52 and acquires the chemical shift values for the nuclides for which they have not yet been acquired.
- when the CPU 101 has acquired the chemical shift values for all the nuclides in step S53 shown in FIG. 6 (in the case of "Yes"), the CPU 101 proceeds to step S54 and records the chemical shift values acquired for the nuclides on the recording device 104 as the predicted chemical shift values of the target peptide.
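The loop over steps S51 to S54 can be sketched as follows. The dictionary layout of the database, the function name, and the convention that a lower score means higher similarity are assumptions of this example; the nuclide labels follow the C, CA, CB, N, NH, HA abbreviations used for the figures.

```python
NUCLIDES = ("C", "CA", "CB", "N", "NH", "HA")

def predict_shifts_for_cluster(similarities, database, nuclides=NUCLIDES):
    """Sketch of steps S51 to S54 for one cluster.

    similarities: dict entry_id -> score, lower meaning more similar (assumption).
    database: dict entry_id -> dict nuclide -> chemical shift (hypothetical layout).
    """
    # S51: extract the database structure with the highest similarity.
    best_id = min(similarities, key=similarities.get)
    predicted = {}
    pending = list(nuclides)
    # S52/S53: keep acquiring shifts until every nuclide has been handled.
    while pending:
        nuclide = pending.pop(0)
        predicted[nuclide] = database[best_id].get(nuclide)  # None if not recorded
    # S54: the caller records the result as the predicted shifts.
    return best_id, predicted
```

In the device described here, the returned dictionary would be what the CPU 101 writes to the recording device 104 in step S54.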
- steps S51 to S54 shown in FIG. 6 are as described in 1-2.
- the user may input an extraction instruction from the input device 111 after step S4 shown in FIG. 5, and the CPU 101 may accept the extraction instruction.
- alternatively, the CPU 101 may automatically extract a structure having a highly similar reference dihedral angle value from the amino acid 3-residue database 30.
- in step S52, the user may input an acquisition instruction from the input device 111 after step S51, and the CPU 101 may accept the acquisition instruction. Alternatively, the end of step S51 may be used as a trigger for the CPU 101 to automatically acquire the chemical shift values from the amino acid 3-residue database 30 based on the extracted structure.
- prediction system 2000 FIG. 2 shows the nuclear magnetic resonance chemical shift value of a peptide provided with a prediction
- FIG. 7 shows the hardware configuration of the prediction device 20.
- the prediction device 20 may be connected to the input device 211, the output device 212, and the storage medium 213.
- the / F) 207 and the media interface (I / F) 208 are connected to each other by a bus 209 so as to be capable of data communication.
- the memory 202 and the recording device 204 may be collectively referred to as a storage unit.
- The recording device 204 stores, in a non-volatile manner, an operating system (OS) 2041 that provides a graphical user interface environment, such as Windows (registered trademark) manufactured and sold by Microsoft Corporation in the United States; a prediction program 2042, which is the application software of the present invention; and a target peptide dihedral angle value database 2043 for storing the dihedral angle values of the target peptide.
- The hardware components of the prediction device 20, the operating system (OS) 2041, and the target peptide dihedral angle value database 2043 correspond respectively to the CPU 101, the memory 102, the ROM 103, the recording device 104, the communication I/F 105, the input I/F 106, the output I/F 107, the media I/F 108, the bus 109, the operating system (OS) 1041, and the target peptide dihedral angle value database 1043 of the prediction device 10.
- FIG. 8 shows the functional configuration of the prediction device 20.
- The CPU 201 of the prediction device 20 functions as a sample dihedral angle value acquisition unit 21, a reference dihedral angle value acquisition unit 23, a side chain dihedral angle value pre-clustering unit 24, a main chain dihedral angle value clustering unit 25, a cluster distribution comparison unit 27, and a chemical shift prediction unit 29.
- The sample dihedral angle value acquisition unit 21 corresponds to step S21 in FIG. 9, which will be described later.
- The reference dihedral angle value acquisition unit 23 corresponds to step S22 in FIG. 9, which will be described later.
- The side chain dihedral angle value pre-clustering unit 24 corresponds to step S23 in FIG. 9, which will be described later.
- The main chain dihedral angle value clustering unit 25 corresponds to step S24 in FIG. 9, which will be described later.
- The cluster distribution comparison unit 27 corresponds to step S25 in FIG. 9, which will be described later.
- The chemical shift prediction unit 29 corresponds to step S26 in FIG. 9, which will be described later.
- FIGS. 9 and 6 show the flow of processing executed by the prediction program 2042.
- In step S21 of FIG. 9, the CPU 201 acquires the main chain dihedral angle values and the side chain dihedral angle values of the target peptide from the sample dihedral angle value database 2043 stored in the recording device 204. This process is started when the user requests, from the input device 211, the acquisition of the main chain dihedral angle values and the side chain dihedral angle values of the target peptide.
- In step S22 of FIG. 9, the CPU 201 acquires the reference dihedral angle values stored in the amino acid 3-residue database 30 shown in FIG. 2 via the communication I/F 207. This process is started when the user requests the acquisition of the reference dihedral angle values from the input device 211.
- The order of step S21 and step S22 in FIG. 9 may be reversed.
- The CPU 201 may also acquire the reference dihedral angle values stored in the amino acid 3-residue database 30 shown in FIG. 2 in advance and record them in the recording device 204.
- In step S23 of FIG. 9, the CPU 201 executes pre-clustering using the side chain dihedral angle values.
- The pre-clustering method is as described in 1-2.(1).
- In step S24 of FIG. 9, the CPU 201 clusters the main chain dihedral angle values of the target peptide for each χ1, according to the side chain χ1 acquired in step S23, and acquires a cluster distribution.
- The clustering method is as described in 1-2.(1).
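The idea of pre-clustering by the side chain χ1 before clustering the main chain dihedral angles can be sketched as follows. This is only an illustration under stated assumptions: the three rotamer labels follow FIG. 11, but the bin centres (+60°, 180°, −60°), the mapping of the labels onto those centres (the sign convention for gauche+ and gauche− differs between sources), and all function names are hypothetical. The actual clustering method of section 1-2.(1) is not reproduced here; the sketch only groups (φ, ψ) pairs by χ1 rotamer so that each group could then be clustered separately.

```python
from collections import defaultdict

# Assumed rotamer centres in degrees; the label-to-sign convention varies between sources.
ROTAMER_CENTRES = {"gauche+": 60.0, "trans": 180.0, "gauche-": -60.0}


def angular_distance(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def chi1_rotamer(chi1):
    """Assign a chi1 angle to the nearest assumed rotamer centre."""
    return min(
        ROTAMER_CENTRES,
        key=lambda name: angular_distance(chi1, ROTAMER_CENTRES[name]),
    )


def group_backbone_by_chi1(samples):
    """Group (phi, psi) pairs by chi1 rotamer.

    `samples` is an iterable of (phi, psi, chi1) tuples in degrees.
    """
    groups = defaultdict(list)
    for phi, psi, chi1 in samples:
        groups[chi1_rotamer(chi1)].append((phi, psi))
    return dict(groups)


# Placeholder angles for illustration only.
samples = [(-60.0, -45.0, 65.0), (-120.0, 130.0, -170.0), (-65.0, -40.0, -55.0)]
groups = group_backbone_by_chi1(samples)
```

Using a circular distance matters here because χ1 values near +180° and −180° describe the same trans conformation.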
- In step S25 of FIG. 9, the CPU 201 compares the cluster distribution acquired by the clustering in step S24 with the reference dihedral angle values acquired in step S22 to calculate the similarity.
- The method of calculating the similarity is as described in 1-2.(2).
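As a rough illustration of what comparing a cluster distribution with a reference dihedral angle value can look like, the sketch below scores a reference (φ, ψ) pair against a set of Gaussian clusters. The scoring function itself is an assumption made for this example (a weighted sum of isotropic Gaussians with periodic angle differences); the scoring function actually used is the one defined in section 1-2.(2), which is not reproduced here. The names `wrap` and `similarity` and the cluster parameters are hypothetical.

```python
import math


def wrap(delta):
    """Wrap an angle difference in degrees into the interval [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0


def similarity(clusters, ref_phi, ref_psi):
    """Assumed scoring function: weighted sum of isotropic Gaussians.

    `clusters` is a list of (weight, phi_centre, psi_centre, sigma) tuples in degrees.
    """
    score = 0.0
    for weight, phi0, psi0, sigma in clusters:
        dphi = wrap(ref_phi - phi0)
        dpsi = wrap(ref_psi - psi0)
        score += weight * math.exp(-(dphi ** 2 + dpsi ** 2) / (2.0 * sigma ** 2))
    return score


# Placeholder cluster parameters for illustration only.
clusters = [(0.7, -60.0, -45.0, 15.0), (0.3, -120.0, 130.0, 20.0)]
on_centre = similarity(clusters, -60.0, -45.0)
far_away = similarity(clusters, 60.0, 45.0)
```

A reference value lying on a heavily weighted cluster centre scores close to that cluster's weight, while a reference value far from every cluster scores close to zero.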
- In step S26 of FIG. 9, based on the similarity acquired in step S25, the CPU 201 obtains the chemical shift value of the nuclide of each atom corresponding to the reference dihedral angle value that is similar to the cluster distribution obtained in step S24, as the predicted chemical shift value of the nuclide of each atom contained in each amino acid of the target peptide.
- The processing of step S26 shown in FIG. 9 by the CPU 201 will be described in more detail with reference to FIG. 6.
- In step S51 shown in FIG. 6, the CPU 201 extracts, from the amino acid 3-residue database 30, a structure having a reference dihedral angle value for which a high degree of similarity was obtained in step S25 shown in FIG. 9.
- In step S52 shown in FIG. 6, the CPU 201 acquires the chemical shift value corresponding to the structure extracted in step S51 from the amino acid 3-residue database 30.
- In step S53 shown in FIG. 6, the CPU 201 determines whether chemical shift values have been acquired for all the nuclides. If they have not (in the case of “No”), the process returns to step S52, and chemical shift values are acquired for the nuclides for which they have not yet been acquired.
- When the CPU 201 has acquired the chemical shift values for all the nuclides in step S53 shown in FIG. 6 (in the case of “Yes”), the CPU 201 proceeds to step S54 and records the chemical shift values acquired for the nuclides on the recording device 204 as the predicted chemical shift values of the target peptide.
- steps S51 to S54 shown in FIG. 6 are described in 1-2.
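The extraction in step S51 amounts to selecting reference structures by their similarity. The text here does not state how "high" similarity is decided, so the sketch below offers two assumed selection rules, a score threshold and a top-k cut. The function name `extract_similar`, both parameters, and the identifiers `ref_a` to `ref_c` are hypothetical, and the scores are placeholders.

```python
def extract_similar(scores, threshold=None, top_k=1):
    """Return structure ids ranked by descending similarity.

    `scores` maps structure id -> similarity. If `threshold` is given, keep the
    ids whose score is at least the threshold; otherwise keep the `top_k` best.
    Both selection rules are assumptions made for this sketch.
    """
    ranked = sorted(scores, key=lambda sid: scores[sid], reverse=True)
    if threshold is not None:
        return [sid for sid in ranked if scores[sid] >= threshold]
    return ranked[:top_k]


# Placeholder similarity scores for illustration only.
scores = {"ref_a": 0.12, "ref_b": 0.70, "ref_c": 0.45}
best_two = extract_similar(scores, top_k=2)
above_half = extract_similar(scores, threshold=0.5)
```

Whichever rule is used, the identifiers returned here would be the structures whose chemical shift values are then looked up in step S52.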
- For step S51, the user may input an extraction instruction from the input device 211 after step S25 shown in FIG. 9, and the CPU 201 may accept the extraction instruction.
- Alternatively, the CPU 201 may automatically extract a structure having a reference dihedral angle value with a high degree of similarity from the amino acid 3-residue database 30.
- For step S52, the user may input an acquisition instruction from the input device 211 after step S51, and the CPU 201 may accept the acquisition instruction.
- Alternatively, the end of step S51 may be used as a trigger for the CPU 201 to automatically acquire the chemical shift values from the amino acid 3-residue database 30 based on the extracted structure.
- the prediction program 1042 and the prediction program 2042 can be executed on a computer as a computer program for predicting the nuclear magnetic resonance chemical shift value of the peptide according to the present invention.
- the computer program can be provided as a program product such as a storage medium.
- the computer program is stored in a storage medium such as a hard disk, a semiconductor memory element such as a flash memory, or an optical disk.
- the storage format of the program in the storage medium is not limited as long as the control unit can read the program.
- the storage in the storage medium is preferably non-volatile.
- FIG. 10 shows a comparison of the patterns of the nuclear magnetic resonance chemical shift values predicted by the prediction method according to the first embodiment and the nuclear magnetic resonance chemical shift values predicted by the SPARCA program for unfolded apomyoglobin.
- C, CA, CB, N, NH, and HA indicate ¹³C′, ¹³Cα, ¹³Cβ, ¹⁵N, ¹Hᴺ, and ¹Hα, respectively.
- the red line (gray line in gray scale) shows the predicted value of the present invention, and the black line shows the predicted value by the SPARCA program.
- the vertical axis shows the chemical shift value (ppm), and the horizontal axis shows the residue number.
- the residue numbers are counted with the N-terminal residue as residue number 1.
- the chemical shift value obtained by the prediction method according to the first embodiment was almost the same as the chemical shift value predicted by the SPARCA program.
- the prediction method of the present invention has a prediction function comparable to that of the SPARCA program.
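One way to put a number on how closely two per-residue prediction profiles agree is a root-mean-square deviation (RMSD) per nuclide. The source does not state which measure, if any, was used for FIG. 10, so the metric, the function name `rmsd`, and the placeholder profiles below are assumptions made for illustration.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length shift profiles (ppm)."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Placeholder per-residue values for one nuclide (not data from FIG. 10).
profile_method_a = [56.0, 57.0, 55.0]
profile_method_b = [56.0, 56.0, 56.0]
deviation = rmsd(profile_method_a, profile_method_b)
```

Computing the deviation separately for each nuclide keeps carbon, nitrogen, and proton shifts, which span very different ppm ranges, from being mixed in a single figure.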
- FIG. 11 shows three clustering results obtained by pre-clustering with the value of χ1 of the glutamine residue.
- the left column shows "Gauche+", the middle column shows "Trans", and the right column shows "Gauche-".
- the upper row shows the clustering result, and the lower row shows the predicted main chain dihedral angle data.
- FIG. 12 shows the reproducibility of the chemical shift experimental data of urea-denatured apomyoglobin.
- The calculation was performed for four nuclides, C, CA, CB, and HA (indicating ¹³C′, ¹³Cα, ¹³Cβ, and ¹Hα, respectively).
- White circles indicate predicted values calculated without considering the value of χ1.
- Black circles indicate predicted values calculated in consideration of the value of χ1.
- The predicted values calculated in consideration of the value of χ1 were closer to the experimental values than the predicted values calculated without considering it. Therefore, by correcting the weight of each Gaussian distribution of the main chain dihedral angle distribution of each amino acid residue with the value of χ1, the predicted values can be brought closer to the experimental values.
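The effect of correcting the Gaussian weights with χ1 can be illustrated with a small sketch. The correction rule used here is an assumption made for this example, not the formula of the invention: each χ1 rotamer is given its own set of Gaussian weights, these are mixed according to the rotamer populations, and the predicted shift is the weighted average of the shift associated with each Gaussian component. All names and numbers are hypothetical placeholders.

```python
def corrected_weights(rotamer_populations, weights_by_rotamer):
    """Mix per-rotamer Gaussian weights by the chi1 rotamer populations.

    rotamer_populations: {rotamer: population}, expected to sum to 1.
    weights_by_rotamer: {rotamer: [weight of each Gaussian component]}.
    Returns the mixed weights, normalised to sum to 1.
    """
    n = len(next(iter(weights_by_rotamer.values())))
    mixed = [0.0] * n
    for rotamer, population in rotamer_populations.items():
        for k, w in enumerate(weights_by_rotamer[rotamer]):
            mixed[k] += population * w
    total = sum(mixed)
    return [w / total for w in mixed]


def predicted_shift(weights, component_shifts):
    """Weighted average of the shift associated with each Gaussian component."""
    return sum(w * s for w, s in zip(weights, component_shifts))


# Placeholder inputs for illustration only.
populations = {"gauche+": 0.5, "trans": 0.5}
per_rotamer = {"gauche+": [1.0, 0.0], "trans": [0.0, 1.0]}
weights = corrected_weights(populations, per_rotamer)
shift = predicted_shift(weights, [58.0, 54.0])
```

In this toy case the two rotamers favour different backbone clusters, so ignoring χ1 and taking a single fixed set of weights would generally give a different average shift.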
Landscapes
- Physics & Mathematics (AREA)
- Chemical & Material Sciences (AREA)
- Biochemistry (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Analytical Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Immunology (AREA)
- Pathology (AREA)
- Crystallography & Structural Chemistry (AREA)
- High Energy & Nuclear Physics (AREA)
- Analysing Materials By The Use Of Radiation (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention addresses the problem of predicting the three-dimensional structure of a target peptide more rapidly. The problem is solved by a method for predicting nuclear magnetic resonance chemical shift values of a peptide, the method comprising: a step of clustering the occurrence frequency distribution of the main chain dihedral angle values (represented by φ and ψ) of the individual amino acid residues constituting the target peptide to acquire a plurality of cluster distributions; a step of calculating, using a scoring function, the degree of similarity between each acquired cluster distribution and the reference main chain dihedral angle values recorded in an amino acid 3-residue database; and a step of predicting, from the calculated degree of similarity, the nuclear magnetic resonance chemical shift values of the target peptide.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021552484A JP7583450B2 (ja) | 2019-10-17 | 2020-10-16 | Method for predicting nuclear magnetic resonance chemical shift values, prediction apparatus, and prediction program |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2019-189855 | 2019-10-17 | | |
| JP2019189855 | 2019-10-17 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021075575A1 (fr) | 2021-04-22 |
Family
ID=75538292
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2020/039187 Ceased WO2021075575A1 (fr) | 2019-10-17 | 2020-10-16 | Method for predicting nuclear magnetic resonance chemical shift values, prediction apparatus, and prediction program |
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JP7583450B2 (fr) |
| WO (1) | WO2021075575A1 (fr) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
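| CN115825135A (zh) * | 2022-11-04 | 2023-03-21 | 国家毒品实验室浙江分中心(浙江省毒品技术中心) | Method for comparing the similarity of compound molecular structural formulas based on proton NMR spectrum data |
| JP2023047726A (ja) * | 2021-09-27 | 2023-04-06 | 富士通株式会社 | Initial structure generation device, initial structure generation method, and initial structure generation program |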
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2005181104A (ja) * | 2003-12-19 | 2005-07-07 | Hitachi Ltd | High-accuracy docking scoring method |
| US20120143580A1 (en) * | 2004-11-12 | 2012-06-07 | Bristol-Myers Squibb Company | Protein-ligand noe matching for high-throughput structure determination |
| US20150300968A1 (en) * | 2012-12-05 | 2015-10-22 | Nymirum, Inc. | Device and Methods for Analysis of Biomolecule Structure, Dynamics and Activity |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100143580A1 (en) | 2008-05-28 | 2010-06-10 | American Air Liquide, Inc. | Stabilization of Bicycloheptadiene |
-
2020
- 2020-10-16 WO PCT/JP2020/039187 patent/WO2021075575A1/fr not_active Ceased
- 2020-10-16 JP JP2021552484A patent/JP7583450B2/ja active Active
Also Published As
| Publication number | Publication date |
|---|---|
| JP7583450B2 (ja) | 2024-11-14 |
| JPWO2021075575A1 (fr) | 2021-04-22 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20877041; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2021552484; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 20877041; Country of ref document: EP; Kind code of ref document: A1 |