CN117457200A - Hypertension risk prediction method, device, system and storage medium - Google Patents
Hypertension risk prediction method, device, system and storage medium
- Publication number
- CN117457200A (application number CN202311393849.5A)
- Authority
- CN
- China
- Prior art keywords
- prevotella
- data
- hypertension
- risk prediction
- abundance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
Landscapes
- Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Public Health (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Epidemiology (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Biophysics (AREA)
- Pathology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- Primary Health Care (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention provides a method, device, system and storage medium for predicting hypertension risk, relating to the technical field of biomarker-based risk prediction. The method comprises: obtaining sequence data obtained after sequencing of a target sample; performing bioinformatics analysis on the sequence data to determine abundance information corresponding to Prevotella in the sequence data; and, taking Prevotella as a biomarker, inputting the abundance information into a trained risk prediction model to obtain a prediction result representing the hypertension risk. Taking Prevotella in the human intestinal microbiota as the research object, the invention provides a method, based on the intestinal microbiome and bioinformatics, for predicting hypertension risk from Prevotella in a sample; the method is characterized by low cost, non-invasive sampling, high stability and high added value.
Description
Technical Field
The invention relates to the technical field of risk prediction based on biomarkers, in particular to a method, a device, a system and a storage medium for predicting hypertension risk.
Background
Recent studies have shown an association between intestinal microorganisms and hypertension. Intestinal microorganisms form a microbial community in the human intestinal tract that includes bacteria, fungi, viruses and the like. These microorganisms play an important role in human health, including digestion, immune system regulation and metabolism. Although this association has been confirmed, more research is needed to understand the exact mechanisms by which gut microorganisms affect hypertension. Selecting an appropriate method for predicting hypertension risk is therefore of great importance.
Emerging methods, such as risk prediction based on intestinal microbiome, provide a new approach for personalized management of hypertension. By combining with advanced technologies such as metagenome and bioinformatics, it is expected to identify microbial features associated with hypertension risk, thereby providing a more accurate strategy for early intervention and treatment. The method utilizes the data of intestinal microbiome, and can find out the microbial characteristics related to hypertension through advanced calculation tools and algorithm analysis, thereby providing guidance for individualized hypertension management.
However, existing methods for predicting hypertension risk have several drawbacks. First, the sequencing cost of predictive analysis based on viral sequence data is high, which limits the range of application. Second, hypertension is affected by individual differences and by multiple factors such as heredity, diet, weight and exercise habits, so risk prediction that relies only on the virome has low stability and accuracy. Furthermore, owing to the complexity of the intestinal microbiota, current techniques do not yet fully explain its mechanisms and interactions.
In summary, although gut microbiome-based approaches to predicting hypertension risk have potential, they still face challenges and limitations, and further research and innovation are needed to address them: for example, developing more economical and efficient sequencing technology, improving the accuracy and stability of prediction models, and studying the relationship between intestinal microorganisms and hypertension in depth to find more accurate predictive indicators and intervention strategies. Accordingly, the prior art needs to be refined and improved to better achieve personalized management and early intervention of hypertension.
Disclosure of Invention
In view of the above, the present invention provides a method for predicting risk of hypertension, comprising:
obtaining sequence data obtained after sequencing of a target sample;
performing bioinformatics analysis on the sequence data, and determining abundance information corresponding to Prevotella in the sequence data;
and inputting the abundance information into a trained risk prediction model by taking Prevotella as a biomarker to obtain a prediction result for representing the hypertension risk.
Preferably, obtaining the sequence data obtained after sequencing of the target sample comprises:
sequencing a sample to be tested based on a high-throughput sequencing technology to obtain sequencing data; wherein the high-throughput sequencing technology comprises any one of Illumina, 454 and Ion Torrent;
and performing data normalization on the sequencing data to obtain the sequence data.
Preferably, the data normalization comprises: removing low-quality sequences, adaptor sequences and primer sequences from the sequencing data.
Preferably, performing bioinformatics analysis on the sequence data and determining the abundance information corresponding to Prevotella in the sequence data comprises:
performing quality control on the sequence data to obtain data to be processed;
analyzing the microorganism species present in the data to be processed and the relative abundance corresponding to each microorganism species, and generating a species abundance table from the microorganism species and the relative abundances;
judging whether the microorganism species in the species abundance table include Prevotella;
if so, taking the abundance corresponding to Prevotella in the species abundance table as the abundance information.
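The table lookup described in the last two steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `prevotella_abundance`, the `Genus_species` naming convention and the abundance values are all assumptions made for the example.

```python
from typing import Dict, Optional

GENUS_PREFIX = "Prevotella"  # assumed naming convention: "Genus_species"


def prevotella_abundance(species_table: Dict[str, float]) -> Optional[Dict[str, float]]:
    """Return the Prevotella rows of a species abundance table, or None if absent."""
    hits = {
        name: abundance
        for name, abundance in species_table.items()
        if name.startswith(GENUS_PREFIX + "_")
    }
    return hits or None


# Hypothetical species abundance table (relative abundances as fractions).
table = {
    "Prevotella_copri": 0.12,
    "Bacteroides_vulgatus": 0.55,
    "Prevotella_stercorea": 0.03,
    "Escherichia_coli": 0.30,
}

abundance_info = prevotella_abundance(table)
```

Returning `None` when no Prevotella entry is found mirrors the "judging whether" step, so the caller can decide how to handle samples without the biomarker.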
Preferably, the biomarker comprises one or a combination of more than one of the following species of the genus Prevotella:
Prevotella_amnii, Prevotella_baroniae, Prevotella_bivia, Prevotella_buccae, Prevotella_buccalis, Prevotella_colorans, Prevotella_copri, Prevotella_corporis, Prevotella_dentalis, Prevotella_denticola, Prevotella_disiens, Prevotella_histicola, Prevotella_intermedia, Prevotella_multisaccharivorax, Prevotella_nigrescens, Prevotella_oralis, Prevotella_oris, Prevotella_sp_885, Prevotella_sp_AM42_24, Prevotella_sp_CAG_1031, Prevotella_sp_CAG_1058, Prevotella_sp_CAG_1092, Prevotella_sp_CAG_1124, Prevotella_sp_CAG_1185, Prevotella_sp_CAG_1320, Prevotella_sp_CAG_279, Prevotella_sp_CAG_485, Prevotella_sp_CAG_520, Prevotella_sp_CAG_5226, Prevotella_sp_CAG_617, Prevotella_sp_CAG_755, Prevotella_sp_CAG_873, Prevotella_sp_CAG_891, Prevotella_s7_1_8, Prevotella_stercorea and Prevotella_timensis.
Preferably, the hypertension risk prediction method further comprises:
establishing a training data set and a testing data set based on the collected stool samples of the normal population and the hypertensive population;
constructing the risk prediction model;
training the risk prediction model with the training data set, and obtaining, for the test data set, a corresponding hypertension risk prediction result for each sample;
and evaluating and correcting the risk prediction model based on the prediction results and the test data set to obtain the trained risk prediction model.
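One way to compare the prediction results with the test data set is to compute standard classification metrics. The sketch below is an assumption about how that comparison could be done; the source does not name the metrics used, and the function name `evaluate` and the label lists are invented for illustration.

```python
from typing import Dict, Sequence


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compare predicted labels with known labels (1 = hypertensive, 0 = normal)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction lists must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical test-set labels and model predictions.
metrics = evaluate([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

If the metrics fall short of what is required, the model can be retrained with adjusted settings and evaluated again, which is one reading of the "correcting" step.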
Preferably, the establishing a training data set and a testing data set based on the collected stool samples of the normal population and the hypertensive population comprises:
performing metagenome sequencing on fecal samples of the normal population and the hypertensive population to obtain test sequence data;
performing quality control on the test sequence data to obtain test screening data;
performing bioinformatics analysis on the test screening data, analyzing the microorganism species present in the test screening data and the relative abundance corresponding to each microorganism species, and generating a test species abundance table from the microorganism species and the relative abundances; wherein the test species abundance table comprises Prevotella and the abundance information corresponding to Prevotella;
and establishing the training data set and the test data set according to the test species abundance table.
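Establishing the two data sets from per-sample abundance tables can be sketched as below. The helper names `build_dataset` and `split_dataset`, the 70/30 split and the abundance values are assumptions for illustration; the source does not specify a split ratio.

```python
import random
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label: 1 = hypertensive, 0 = normal)


def build_dataset(
    tables: Sequence[Dict[str, float]],
    labels: Sequence[int],
    feature_names: Sequence[str],
) -> List[Sample]:
    """Turn per-sample species abundance tables into fixed-order feature vectors."""
    return [
        ([table.get(name, 0.0) for name in feature_names], label)
        for table, label in zip(tables, labels)
    ]


def split_dataset(
    samples: Sequence[Sample], test_fraction: float = 0.3, seed: int = 0
) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle reproducibly and split into (training set, test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical input: ten samples with invented abundances, five per group.
features = ["Prevotella_copri", "Prevotella_stercorea"]
tables = [{"Prevotella_copri": 0.01 * i} for i in range(10)]
labels = [1] * 5 + [0] * 5

dataset = build_dataset(tables, labels, features)
train_set, test_set = split_dataset(dataset)
```

Species missing from a sample are filled with 0.0 so that every feature vector has the same length. A stratified split that preserves the ratio of the two populations may be preferable for small cohorts.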
Preferably, said constructing said risk prediction model comprises:
constructing a machine learning model using a random forest class in Python;
and learning and establishing a mapping relation from the input abundance information characteristics to the hypertension risk prediction result based on the machine learning model to obtain the risk prediction model.
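As a hedged sketch of this construction, the example below uses `RandomForestClassifier` from scikit-learn. The source says only "a random forest class function of python", so the choice of scikit-learn, the hyperparameters, the two feature names and every data value are assumptions; the numbers are synthetic and carry no biological meaning.

```python
from sklearn.ensemble import RandomForestClassifier

# Assumed feature order; any subset of the listed Prevotella species could be used.
FEATURES = ["Prevotella_copri", "Prevotella_stercorea"]

# Synthetic training data: one row of relative abundances per sample.
X_train = [
    [0.30, 0.05], [0.28, 0.04], [0.35, 0.06], [0.32, 0.07],
    [0.02, 0.00], [0.01, 0.01], [0.03, 0.00], [0.00, 0.02],
]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = hypertensive, 0 = normal

# Learn the mapping from abundance features to the risk label.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Predicted probability of the positive class for a new, invented sample.
new_sample = [[0.31, 0.05]]
risk_probability = model.predict_proba(new_sample)[0][1]

# Per-feature importance, the kind of statistic a feature-importance chart would show.
importances = dict(zip(FEATURES, model.feature_importances_))
```

Fixing `random_state` makes the run reproducible, and `predict_proba` returns a graded risk score instead of a hard label.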
In addition, in order to solve the above problems, the present invention also provides a hypertension risk prediction apparatus, including:
the acquisition module is used for acquiring sequence data obtained after sequencing of the target sample;
the determining module is used for performing bioinformatics analysis on the sequence data and determining abundance information corresponding to Prevotella in the sequence data;
the prediction module is used for inputting the abundance information into a trained risk prediction model by taking Prevotella as a biomarker to obtain a prediction result representing the risk of hypertension.
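The three modules can be pictured as a simple pipeline. The sketch below is illustrative only: the class name `RiskPredictionDevice`, the callable signatures and the stub implementations are hypothetical stand-ins for the real acquisition, analysis and model components.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RiskPredictionDevice:
    acquire: Callable[[str], List[str]]                 # acquisition module
    determine: Callable[[List[str]], Dict[str, float]]  # determining module
    predict: Callable[[Dict[str, float]], float]        # prediction module

    def run(self, sample_id: str) -> float:
        """Chain the modules: sequence data -> abundance info -> risk score."""
        sequence_data = self.acquire(sample_id)
        abundance_info = self.determine(sequence_data)
        return self.predict(abundance_info)


# Stub modules with invented return values, used only to show the data flow.
device = RiskPredictionDevice(
    acquire=lambda sample_id: ["read-1", "read-2"],
    determine=lambda reads: {"Prevotella_copri": 0.1},
    predict=lambda abundance: 0.5,
)
result = device.run("sample-001")
```

Keeping each module behind a plain callable lets the sequencing source, the bioinformatics workflow and the trained model be replaced independently.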
In addition, in order to solve the above-mentioned problems, the present invention further provides a hypertension risk prediction system, which includes a memory and a processor, wherein the memory stores a hypertension risk prediction program, and the processor runs the hypertension risk prediction program to make the hypertension risk prediction system execute the hypertension risk prediction method as described above.
In addition, in order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored thereon a hypertension risk prediction program that, when executed by a processor, implements the hypertension risk prediction method as described above.
The invention provides a hypertension risk prediction method comprising: obtaining sequence data obtained after sequencing of a target sample; performing bioinformatics analysis on the sequence data to determine abundance information corresponding to Prevotella in the sequence data; and, taking Prevotella as a biomarker, inputting the abundance information into a trained risk prediction model to obtain a prediction result representing the hypertension risk. Through screening, the invention identifies the microorganism Prevotella as associated with hypertension and uses it as a microbial marker. The marker can be applied in the preparation of hypertension risk prediction products, such as a reagent, kit, chip, prediction model, evaluation system or high-throughput sequencing platform for revealing hypertension risk. The invention clarifies the microbial differences between hypertensive patients and healthy people; by identifying hypertension risk in advance, an individual can adopt appropriate lifestyle adjustments and treatment, thereby reducing the risk of disease.
In short, taking Prevotella in the human intestinal microbiota as the research object, the invention provides a method, based on the intestinal microbiome and bioinformatics, for predicting hypertension risk from Prevotella in a sample; the method is characterized by low cost, non-invasive sampling, high stability and high added value.
Drawings
Fig. 1 is a schematic structural diagram of a hardware operating environment related to an embodiment of a hypertension risk prediction method according to the present invention;
Fig. 2 is a flowchart of embodiment 1 of the hypertension risk prediction method of the present invention;
Fig. 3 is a schematic flowchart of the refinement of step S100 in embodiment 2 of the hypertension risk prediction method of the present invention;
Fig. 4 is a schematic flowchart of the refinement of step S200 in embodiment 3 of the hypertension risk prediction method of the present invention;
Fig. 5 is a flowchart of a supplementary technical scheme including steps S400-S700 in embodiment 4 of the hypertension risk prediction method of the present invention;
Fig. 6 is a flowchart of the refinement of step S400 in embodiment 4 of the hypertension risk prediction method of the present invention;
Fig. 7 is a flowchart of the refinement of step S500 in embodiment 4 of the hypertension risk prediction method of the present invention;
Fig. 8 is a schematic overall flowchart of embodiment 5 of the hypertension risk prediction method of the present invention;
Fig. 9 is a statistical chart of the feature importance of characteristic Prevotella strains in embodiment 5 of the hypertension risk prediction method of the present invention;
Fig. 10 is a schematic diagram of the module connections of the hypertension risk prediction device of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
Embodiments of the present invention are described in detail below, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," "secured," and the like are to be construed broadly. For example, a connection may be fixed, detachable or integral; mechanical or electrical; direct, or indirect through an intermediate medium; and may be an internal communication between two elements or an interaction between two elements. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a schematic structural diagram of a hardware operating environment of a terminal according to an embodiment of the present invention.
The hypertension risk prediction system of the embodiment of the invention may be a PC, or a mobile terminal device such as a smart phone, a tablet computer or a portable computer. The hypertension risk prediction system may include: a processor 1001 (e.g., a CPU), a network interface 1004, a user interface 1003, a memory 1005 and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen and an input unit such as a keyboard or a remote control, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory such as a disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001. Optionally, the hypertension risk prediction system may further include an RF (Radio Frequency) circuit, an audio circuit, a Wi-Fi module, and the like. In addition, the hypertension risk prediction system may be further configured with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer and an infrared sensor, which are not described here.
Those skilled in the art will appreciate that the structure shown in Fig. 1 does not limit the hypertension risk prediction system, which may include more or fewer components than shown, combine certain components, or arrange the components differently. As shown in Fig. 1, the memory 1005, as a computer-readable storage medium, may include an operating system, a data interface control program, a network connection program, and a hypertension risk prediction program.
Example 1:
Referring to Fig. 2, an embodiment of the present invention provides a method for predicting risk of hypertension, including:
Step S100, obtaining sequence data obtained after sequencing of a target sample;
The target sample is a stool sample of the subject to be tested.
The sequence data are obtained by extracting genomic DNA from the sample and performing metagenomic sequencing, which yields genomic information for further analysis and calculation.
In this step, DNA/RNA is extracted from the sample and subjected to high-throughput sequencing, producing a large amount of sequence data. These data contain the gene sequences of the intestinal flora and can be used for further analysis. The step can be carried out with laboratory techniques (e.g., PCR, high-throughput sequencing, etc.).
Step S200, performing bioinformatics analysis on the sequence data, and determining abundance information corresponding to Prevotella in the sequence data;
In this step, the sequence data are aligned to obtain the taxonomic classification of the microorganisms, and the relative abundance of each bacterial group, including Prevotella, is then calculated from that classification.
Data preprocessing such as denoising, filtering and removal of low-quality sequences can be performed first; sequence alignment and taxonomic annotation are then performed with bioinformatics tools; finally, the relative abundance is calculated with a statistical method.
This step extracts the microbial community information and helps reveal the relationship between the intestinal flora and disease. It can be implemented with existing bioinformatics tools and software.
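The final calculation, turning per-read taxonomic assignments into relative abundances, can be sketched as follows. This is a simplified assumption: it counts reads per taxon and divides by the total, whereas real profiling tools may also correct for genome or marker-gene length. The function name and the read assignments are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def relative_abundance(read_assignments: Iterable[str]) -> Dict[str, float]:
    """Convert per-read taxon labels into fractions of all classified reads."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical classification result: one taxon label per classified read.
assignments = (
    ["Prevotella_copri"] * 3
    + ["Bacteroides_vulgatus"] * 5
    + ["Escherichia_coli"] * 2
)
abundance_table = relative_abundance(assignments)
```

The resulting dictionary has the same shape as a one-sample species abundance table, with values that sum to 1.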
Step S300, taking Prevotella as a biomarker, and inputting the abundance information into a trained risk prediction model to obtain a prediction result representing the risk of hypertension.
In this step, based on the microbial abundance information obtained in step S200, Prevotella is selected as the biomarker and its abundance is input into a pre-constructed hypertension risk prediction model to obtain the prediction result.
Prevotella is a genus of bacteria existing in the environment such as the intestinal tract and the oral cavity of the human body, and belongs to gram-negative bacteria. Prevotella is widely studied and occupies a relatively high abundance in the human microbial community. It is an anaerobic bacterium, which can decompose polysaccharide substances and produce metabolites such as short chain fatty acids.
The relationship between Prevotella and human health and disease is complex. On the one hand, it plays a certain probiotic role in the normal intestinal microbiota, taking part in food digestion, maintaining intestinal health and regulating immunity. On the other hand, studies have also found that in some cases an abnormal increase in the relative abundance of Prevotella may be associated with the occurrence and development of diseases such as intestinal inflammatory diseases, metabolic diseases and cardiovascular diseases.
In this step, the hypertension risk prediction model can be constructed with methods such as machine learning, providing an important reference for clinical treatment and prevention. The model can be implemented with open-source machine learning tools and platforms.
In summary, the hypertension risk prediction method is a multi-step analysis process, and needs to be completed by applying knowledge and technology in multiple fields such as bioinformatics, statistics, machine learning and the like. The method has the advantages that the association between the diseases and intestinal flora can be revealed at the microorganism level, and new ideas and directions are provided for disease treatment and prevention.
Further, the biomarker comprises one or a combination of more than one of the following species of the genus Prevotella:
Table 1. Prevotella strains employed in this example
In summary, in this embodiment, the microorganism Prevotella, which is associated with hypertension, is identified by screening and used as a microbial marker. The marker can be applied in the preparation of hypertension risk prediction products, such as a reagent, kit, chip, prediction model, evaluation system or high-throughput sequencing platform for revealing hypertension risk. The invention clarifies the microbial differences between hypertensive patients and healthy people; by identifying hypertension risk in advance, an individual can adopt appropriate lifestyle adjustments and treatment, thereby reducing the risk of disease. Taking Prevotella in the human intestinal microbiota as the research object, the embodiment provides a method, based on the intestinal microbiome and bioinformatics, for predicting hypertension risk from Prevotella in a sample; the method is characterized by low cost, non-invasive sampling, high stability and high added value.
Example 2:
Referring to Fig. 3, embodiment 2 of the present invention provides a method for predicting risk of hypertension. Based on embodiment 1 above, step S100 of obtaining sequence data obtained after sequencing of a target sample includes:
Step S110, sequencing a sample to be tested based on a high-throughput sequencing technology to obtain sequencing data; wherein the high-throughput sequencing technology comprises any one of Illumina, 454 and Ion Torrent;
In step S100, DNA or RNA of the target sample is acquired and sequenced using a high-throughput sequencing technology (e.g., Illumina, 454, Ion Torrent, etc.). This yields a large amount of sequence data for subsequent analysis.
The steps described above use high throughput sequencing techniques to sequence samples to be tested, for example using sequencing platforms such as Illumina, 454 or Ion Torrent. These techniques allow rapid and accurate sequencing of DNA or RNA, allowing the generation of large amounts of sequencing data.
Illumina, 454 and Ion Torrent are three different high-throughput sequencing technology platforms for rapid and accurate sequencing of DNA or RNA samples. The platforms all adopt different sequencing technologies, but can generate a large amount of sequence data, and are widely used in research fields such as genome sequencing, transcriptome sequencing, exon sequencing and the like.
The Illumina sequencing platform uses bridge amplification to generate large amounts of short-fragment sequence data on a flow cell. It offers high accuracy and high throughput and is widely applied in genome, transcriptome and methylation studies.
The 454 sequencing platform uses pyrosequencing, in which the pyrophosphate released on nucleotide incorporation is converted into a light signal that is detected. It can output long fragment sequences and is thus applicable to sequencing samples rich in GC or AT regions.
The Ion Torrent sequencing platform uses a sequencing method based on semiconductor chip technology; its main principle is to identify the DNA sequence by detecting the protons released during nucleotide incorporation. It has a short running time and a convenient workflow, and is suitable for small-scale sequencing and rapid screening.
These sequencing platform techniques each have advantages and disadvantages and may also differ in terms of sample processing, data analysis, etc. Therefore, it is necessary to consider the characteristics of the sample and the purpose of the study before sequencing, and to select an appropriate sequencing platform and corresponding analysis method.
Step S120, performing data normalization on the sequencing data to obtain the sequence data.
In this step, the sequencing data are normalized to prepare them for subsequent analysis. The main operations are removing low-quality sequences, removing adaptor sequences, removing primer sequences, and the like.
Further, in step S120, the data normalization includes: removing low-quality sequences, adaptor sequences and primer sequences from the sequencing data.
Removing low-quality sequences: during sequencing, some sequences may be of low quality, for example with base errors caused by sequencing errors or other noise. By setting a threshold, sequences of lower quality can be removed, improving the accuracy of subsequent analysis.
Removing adaptor sequences: before sequencing, adaptor sequences are usually added to both ends of the DNA or RNA for ligation of sequencing primers. These adaptor sequences do not belong to the gene sequence of the sample to be tested and may introduce bias into data analysis. They therefore need to be removed so that only the core sequence of the sample to be tested is retained.
Removing primer sequences: in amplification processes such as PCR, primers are typically used to selectively amplify a target sequence. The primer sequences likewise do not belong to the gene sequence of the sample to be tested and need to be removed during data analysis to ensure the accuracy of the results.
These data normalization steps improve the quality and accuracy of the sequencing data, remove noise and interference from adaptor and primer sequences, and provide a reliable data basis for subsequent analysis. They can be implemented with bioinformatics tools and software such as Trim Galore and Cutadapt.
Data normalization thus ensures the accuracy and reliability of subsequent analysis. Removing low-quality sequences, adaptor sequences and primer sequences reduces the influence of noise and bias and improves the quality and interpretability of the sequencing data. It is therefore necessary to normalize the data before analysis in order to obtain high-quality, reliable sequence data.
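Primer and adaptor removal can be illustrated with the minimal sketch below. It assumes exact string matches and a single read, whereas tools such as Cutadapt tolerate mismatches and partial matches; the function name `trim_read`, the primer, the adaptor string and the minimum length are all illustrative assumptions.

```python
from typing import Optional


def trim_read(seq: str, primer: str, adapter: str, min_length: int = 20) -> Optional[str]:
    """Strip a 5' primer and a 3' adaptor; return None if too little sequence remains."""
    # Remove the primer if the read starts with it.
    if primer and seq.startswith(primer):
        seq = seq[len(primer):]
    # Cut the read at the first occurrence of the adaptor, dropping everything after it.
    if adapter:
        pos = seq.find(adapter)
        if pos != -1:
            seq = seq[:pos]
    return seq if len(seq) >= min_length else None


# Hypothetical read: primer + insert + adaptor + trailing bases.
PRIMER = "ACGT"
ADAPTER = "AGATCGGAAGAGC"
insert = "TTGACCA" * 4
read = PRIMER + insert + ADAPTER + "GGG"

trimmed = trim_read(read, PRIMER, ADAPTER)
```

Reads that become shorter than `min_length` after trimming are discarded, since very short fragments are hard to classify reliably.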
Example 3:
referring to fig. 4, embodiment 3 of the present invention provides a method for predicting risk of hypertension, based on embodiment 1 above, the step S200 of performing bioinformatic analysis on the sequence data to determine abundance information corresponding to prasugrel bacteria in the sequence data, including:
step S210, quality control is carried out on the sequence data to obtain data to be processed;
in this step, quality control processing is performed on the sequence data obtained in the previous step to obtain data to be processed with higher quality. The quality control mainly comprises the following aspects:
(1) Removing low-quality sequences: sequences of lower quality are removed according to the quality values of the sequencing data. Quality is typically expressed as a Phred quality value, with a larger value representing higher quality.
(2) Removing the adapter and primer sequences: adapter and primer sequences are often introduced during the sequencing process, and these sequences do not belong to the gene sequence of the sample to be tested. These non-target sequences can be removed by alignment against reference sequences or by using specific algorithms.
(3) Removing repeated sequences: PCR amplification and similar steps may cause some sequences to occur repeatedly. Removing the repeated sequences avoids their influence on subsequent analysis and reduces the consumption of computing resources.
Quality control processing improves the accuracy and reliability of the data, removes noise and bias, and provides a high-quality data basis for subsequent analysis. Commonly used quality control tools and software include FastQC, Trimmomatic, etc.
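The three quality control operations above can be illustrated with a minimal sketch. It is not the implementation used by FastQC or Trimmomatic; the Phred+33 encoding, the mean-quality threshold of 20, the adapter string and the function names are all assumptions made for illustration.

```python
from statistics import mean


def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 is assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def quality_control(reads, min_mean_quality=20, adapter=""):
    """Filter (sequence, quality) pairs following operations (1) to (3)."""
    seen = set()
    kept = []
    for seq, qual in reads:
        # (2) Trim a known adapter and everything after it, if present.
        if adapter:
            pos = seq.find(adapter)
            if pos != -1:
                seq, qual = seq[:pos], qual[:pos]
        if not seq:
            continue
        # (1) Drop reads whose mean Phred score is below the threshold.
        if mean(phred_scores(qual)) < min_mean_quality:
            continue
        # (3) Drop exact duplicates of a sequence that was already kept.
        if seq in seen:
            continue
        seen.add(seq)
        kept.append((seq, qual))
    return kept


# Hypothetical reads; "I" encodes Phred 40 and "!" encodes Phred 0.
reads = [
    ("ACGTACGT", "IIIIIIII"),          # high quality
    ("ACGTACGT", "IIIIIIII"),          # exact duplicate of the first read
    ("TTTTGGGG", "!!!!!!!!"),          # low quality
    ("GGCCAGATCGGA", "IIIIIIIIIIII"),  # ends with the assumed adapter
]
clean = quality_control(reads, min_mean_quality=20, adapter="AGATCGGA")
print(clean)
```

In practice a dedicated tool would be preferred, since real adapter trimming tolerates mismatches and partial adapters, which this sketch does not.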
Step S220, analyzing the microorganism species present in the data to be processed and the relative abundance corresponding to the microorganism species, and generating a species abundance table according to the microorganism species and the relative abundance;
in this step, microbiological analysis is performed on the data to be processed obtained from the previous step to identify the presence of the microorganism species and their relative abundance. For example, the analysis may be performed based on sequence alignment, OTU clustering, or metagenome assembly, among other methods.
Sequence alignment: the data to be processed is compared with a database of reference sequences of known microorganism species, and the microorganism species in the data to be processed are determined from the comparison results.
OTU clustering: by grouping similar sequences into one operational taxonomic unit (OTU), the data to be processed can be subjected to cluster analysis to identify the microorganism species present.
Metagenome assembly: for metagenome research, the data to be processed can be assembled to obtain complete microbial genome sequences, from which species and abundance information can be determined.
The above analysis methods can be selected according to the required accuracy and application scenario. By analyzing the microorganism species and relative abundance, the microorganism composition and relative abundance distribution existing in the sample can be known, and basic data is provided for subsequent microbiological study.
Step S230, judging whether the microorganism species in the species abundance table contains Prevotella or not;
in this step, the resulting species abundance table is examined to determine whether it contains Prevotella. The determination and screening may be based on the species names in the species abundance table or on specific classification information.
Step S240, if so, taking the abundance corresponding to the Prevotella in the species abundance table as the abundance information.
If Prevotella is present in the species abundance table, its corresponding abundance information can be extracted and used as the desired result. Thus, the relative abundance of Prevotella in the sample can be obtained, and the importance and distribution of Prevotella in the microbial community can be further known.
In addition, if not, the method may return to step S210 to re-perform quality control on other data to obtain data to be processed, or return to step S230 to continue re-judgment, or stop analysis, and generate corresponding prompt information.
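Steps S220 to S240 can be sketched as follows. This is only an illustration under stated assumptions: the species abundance table is represented as a dictionary keyed by names of the form Genus_species, and the function names and read counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert per-species read counts into relative abundances summing to 1."""
    total = sum(counts.values())
    if total == 0:
        return {species: 0.0 for species in counts}
    return {species: n / total for species, n in counts.items()}


def prevotella_abundance(abundance_table, genus="Prevotella"):
    """Return the rows of the given genus, or None when the genus is absent."""
    hits = {
        species: abundance
        for species, abundance in abundance_table.items()
        if species.split("_")[0] == genus
    }
    return hits or None


# Hypothetical read counts assigned to each species (step S220).
counts = {"Prevotella_copri": 30, "Prevotella_bivia": 10, "Bacteroides_fragilis": 60}
table = relative_abundance(counts)

# Steps S230 and S240: check for Prevotella and extract its abundance.
info = prevotella_abundance(table)
if info is None:
    print("Prevotella not detected; no abundance information available")
else:
    print(info)
```

Returning None for the absent case leaves the caller free to choose among the fallbacks described above, such as repeating quality control or stopping with a prompt.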
The above steps may be implemented using bioinformatics tools and software such as QIIME, MEGAN, Kraken and the like. These tools provide a variety of analysis methods and algorithms, from which a processing and analysis approach suited to the needs of the study can be chosen.
By performing these steps, quality control processed high quality data can be obtained from the raw sequence data, and microbiological analysis methods can be used to identify the microorganism species and relative abundance and extract abundance information for a particular microorganism species. These analytical results can help to gain insight into the composition and function of complex microbial communities, providing an important data basis for related research.
Example 4:
referring to fig. 5, embodiment 4 of the present invention provides a method for predicting risk of hypertension, based on embodiment 1 above, the method further includes:
step S400, a training data set and a testing data set are established based on the acquired stool samples of the normal crowd and the hypertension crowd;
this step is to build a training dataset and a test dataset for subsequent model construction and training. The specific process may include:
collecting fecal samples of normal people and hypertensive people: fecal samples were collected from normal and hypertensive populations to obtain raw data.
Establishing a training data set and a test data set by grouping: the collected samples are grouped according to a certain proportion. One part of the samples is used as the training data set for model training and parameter adjustment; the other part is used as the test data set for evaluating the predictive ability of the model and determining the final prediction model.
The training data set and the test data set are built for the purpose of selecting the most effective prediction model by comparing and evaluating the prediction results on the different data sets. Meanwhile, the two data sets are used, so that errors caused by randomness of data can be reduced, and the stability of the model is improved.
Step S500, constructing the risk prediction model;
this step is to construct a hypertension risk prediction model to classify and predict the sample. May include:
feature selection: discriminative features, such as the relative abundance of certain microorganisms, are selected from the raw data as model input variables.
Model selection: a suitable machine learning model is selected to construct the hypertension risk prediction model according to research requirements and data characteristics. Common models include support vector machines (SVMs), neural networks (NNs), random forests (RFs), and the like.
Model training: the selected model is trained using the established training dataset to obtain the weights and parameters of the model and to optimize the performance of the model.
By constructing a hypertension risk prediction model, the characteristic information of the sample can be used for classification and prediction, so that whether the hypertension risk exists in the sample can be known. Different model selections and parameter adjustments may help us obtain more accurate predictions.
Step S600, training the risk prediction model by using the training data set to obtain the test data and the corresponding prediction result of the hypertension risk in each sample;
This step is to train the selected model with the training dataset to obtain a trained model and a predicted result. May include:
data preprocessing: the training data set is preprocessed, e.g., normalized, missing value processed, etc.
Model training: the established risk prediction model is trained on the training data set to obtain a hypertension risk prediction result for each individual in the sample.
By training the model by using the training data set, the model parameters and performance can be optimized, and the accuracy and stability of the model can be improved. Meanwhile, the model obtained through training can be used for predicting the follow-up unknown sample.
Step S700, determining and correcting the risk prediction model through the prediction result and the test data set, and obtaining the trained risk prediction model.
This step determines the final hypertension risk prediction model, including corrections and adjustments to the model parameters. It may include:
performance evaluation: the trained model is evaluated using the test data set, for example by calculating indexes such as accuracy, recall and F1 score.
Model correction: the model parameters and structure are adjusted according to the performance evaluation result to achieve the best prediction effect.
By using the test data set to evaluate and correct the performance of the model, the predictive power and stability of the model can be further improved. The final hypertension risk prediction model can be used for predicting and classifying new samples, and helps to diagnose and treat hypertension.
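The performance indexes named in step S700 can be computed as in the following minimal sketch. The labels and predictions are hypothetical, label 1 is assumed to denote hypertension, and the function name is chosen for illustration only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test-set labels (1 = hypertension, 0 = normal) and predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting recall alongside accuracy is useful here because a missed hypertensive sample and a false alarm usually carry different costs.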
Further, referring to fig. 6, the step S400 of establishing a training data set and a test data set based on the collected stool samples of the normal population and the hypertensive population includes:
step S410, performing metagenome sequencing on fecal samples of the normal population and the hypertensive population to obtain test sequence data;
the above steps are for obtaining DNA sequence data of stool samples for subsequent microbiological analysis and hypertension risk prediction. May include:
extracting DNA: total DNA was extracted from fecal samples and purified.
Metagenome sequencing: using high throughput sequencing techniques, DNA from fecal samples was subjected to metagenomic sequencing to obtain millions of short reads.
By means of metagenomic sequencing, a large amount of fecal sample DNA sequence information can be obtained, and the information comprises genetic information and abundance information of microbial communities, which is the basis for subsequent microbiological analysis and hypertension risk prediction.
Step S420, quality control is carried out on the test sequence data to obtain test screening data;
the above steps are to ensure that subsequent bioinformatic analysis and hypertension risk prediction can be performed based on accurate and reliable sequence data. May include:
read length filtering: the short reads are screened by length, and sequences that are too short or too long are discarded.
Quality control: the quality score of each sequence is assessed, and low-quality sequences and sequences with too many erroneous bases are removed.
By performing quality control, low-quality unreliable sequences can be filtered, the accuracy and reliability of sequence data are improved, and a better data basis is provided for subsequent analysis.
Step S430, performing bioinformatic analysis on the test screening data, analyzing the microorganism species existing in the test screening data and the relative abundance corresponding to the microorganism species, and generating a test species abundance table according to the microorganism species and the relative abundance; wherein, the abundance table of the test species comprises Prevotella and the abundance information corresponding to the Prevotella;
the steps are to analyze the structure and function of microbiome, determine the composition and relative abundance of different microbial communities, and provide feature variables for subsequent hypertension risk prediction. The specific process comprises the following steps:
Tag sequence assignment: the screened sequence data is assigned to the different samples according to the tag sequences.
Bioinformatics analysis: sequence data are analyzed by various bioinformatics tools and are compared with a known microorganism database to analyze the abundance and community structure of microorganisms.
Species abundance table generation: based on the analysis results, relative abundance and abundance tables of the various microorganism species in each sample are generated.
By performing bioinformatics analysis, the composition and relative abundance of microbial communities in different samples can be determined, providing important feature variables for subsequent hypertension risk prediction. Meanwhile, bioinformatics analysis can also explore functions and interaction relations of microbial communities, and provides a new idea for understanding pathological mechanisms and treatment of hypertension.
Step S440, the training data set and the test data set are established according to the test species abundance table.
The steps are to build a training set and a testing set of the hypertension risk prediction model so as to carry out model construction and training subsequently. May include:
establishing the data sets by grouping: based on the test species abundance table, the samples of the normal population and the hypertensive population are divided into a training data set and a test data set according to a certain proportion.
Optimizing the data sets: the established data sets are adjusted and optimized according to the experimental requirements and the accuracy of the existing data.
In this embodiment, a training data set for training a model and a test data set for evaluating the effect of the model are constructed.
The training data set and the test data set can be constructed as follows: first, stool samples of healthy people and hypertensive people are sequenced, quality control is carried out, and species abundance is analyzed; then, according to a certain proportion (such as 7:3), the abundance information of the larger share of samples is randomly assigned to the training set, and the abundance information of the remaining samples is used as the test set.
With a ratio of 7:3, the training data set includes 70% of the healthy samples and 70% of the hypertensive samples, and the test data set includes the remaining 30% of the healthy samples and 30% of the hypertensive samples. By building training data sets and test data sets, the predictive performance of the model on different data sets can be evaluated and the parameters and structure of the model optimized. Careful design and optimization of the data sets also improves the accuracy and stability of the prediction model and provides a reliable data source for subsequent hypertension risk prediction.
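A 7:3 split applied separately to each group, as described above, can be sketched as follows. The sample identifiers, the random seed and the function name are hypothetical; only the 7:3 proportion comes from the description.

```python
import random


def stratified_split(samples, labels, train_fraction=0.7, seed=0):
    """Split samples so that each label contributes the same train fraction."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    train, test = [], []
    for label, group in by_label.items():
        group = group[:]          # copy so the caller's list is not shuffled
        rng.shuffle(group)        # random assignment within the group
        cut = round(len(group) * train_fraction)
        train += [(s, label) for s in group[:cut]]
        test += [(s, label) for s in group[cut:]]
    return train, test


# Hypothetical cohort: 10 healthy and 10 hypertensive samples.
samples = [f"H{i}" for i in range(10)] + [f"P{i}" for i in range(10)]
labels = ["healthy"] * 10 + ["hypertension"] * 10
train, test = stratified_split(samples, labels)
print(len(train), len(test))
```

Splitting within each group keeps the proportion of healthy to hypertensive samples the same in both data sets, which a purely random split of the pooled samples does not guarantee.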
Further, referring to fig. 7, in the step S500, the constructing the risk prediction model includes:
step S510, constructing a machine learning model according to a random forest class function of python;
the above steps are for constructing a machine learning model using a random forest classifier (Random Forest Classifier) for prediction of hypertension risk. It may include:
importing the library: the random forest class function is imported from a relevant Python library, such as scikit-learn.
Preparing data: the obtained test species abundance table is taken as input data, and the sample labels of the normal population and the hypertensive population are taken as output data.
Constructing the model: a machine learning model is constructed with the RandomForestClassifier function, setting related parameters such as the number of decision trees and feature selection.
By constructing a machine learning model with a random forest classifier, prediction of hypertension risk can be achieved. The random forest is an ensemble learning method based on decision trees; it can process a large number of input features and samples, and has good prediction accuracy and robustness.
Step S520, based on the machine learning model learning, establishing a mapping relation from the input abundance information characteristics to the hypertension risk prediction result, and obtaining the risk prediction model.
The method comprises the steps of training a machine learning model, learning and establishing a mapping relation between input microorganism abundance information characteristics and a hypertension risk prediction result, so as to obtain a final hypertension risk prediction model. It comprises the following steps:
training the model: the training data set is input into the machine learning model constructed in the previous step for training, and the mapping relation between the input features and the risk prediction result is learned by optimizing the model parameters and structure.
Model evaluation: the trained model is evaluated using the test data set, the prediction accuracy of the model on new samples is calculated, and the model is further optimized according to the evaluation result.
By training and learning based on a machine learning model, a mapping relationship between the input microorganism abundance information characteristics and the hypertension risk prediction result can be established. Such a predictive model may be used to predict the risk of hypertension for an unknown sample with high prediction accuracy and interpretability.
This may be accomplished in a Python programming language, using a random forest class function provided in a corresponding machine learning library (e.g., scikit-learn) for model construction and training. Python has a rich machine learning ecosystem, provides easy-to-use and flexible tools, and can efficiently build and train machine learning models.
Steps S510 and S520 are performed in this manner because the random forest classifier is a commonly used machine learning algorithm, is adapted to handle complex classification problems, and has good performance in feature selection and processing high-dimensional data. By constructing and training a random forest classifier, the risk of hypertension can be predicted by utilizing the microorganism abundance information, so that the establishment of a risk prediction model is realized.
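Steps S510 and S520 can be illustrated with scikit-learn's `RandomForestClassifier`. The sketch below runs on synthetic abundance profiles generated only for demonstration; the sample sizes, number of species, random seeds and `n_estimators=100` are assumptions and do not reproduce the data or results of this embodiment.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic relative-abundance profiles (each row sums to 1); illustrative only.
rng = np.random.default_rng(0)
n_per_group, n_species = 60, 5
healthy = rng.dirichlet(np.ones(n_species), size=n_per_group)
hypertensive = rng.dirichlet(np.array([4.0, 1.0, 1.0, 1.0, 1.0]), size=n_per_group)

X = np.vstack([healthy, hypertensive])           # input: abundance features
y = np.array([0] * n_per_group + [1] * n_per_group)  # output: 0 normal, 1 hypertension

# 7:3 split, stratified so both groups keep the same proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step S510: construct the model; step S520: learn the feature-to-risk mapping.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)            # fraction of correct predictions
risk_scores = model.predict_proba(X_test)[:, 1]   # estimated probability of class 1
print(f"test accuracy: {accuracy:.3f}")
```

The column of `predict_proba` that corresponds to the hypertension label can serve as a risk score, while `predict` returns the class label directly.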
Example 5:
Referring to FIG. 8, in order to better explain the method of predicting risk of hypertension, in this embodiment, based on embodiments 1 to 4, the method of predicting risk of hypertension is specifically implemented as follows:
Step S1, sample collection: stool samples are collected from healthy people and hypertensive people. The subjects are divided into two groups, a healthy group and a hypertensive group. Fresh fecal samples are taken from the subjects using a disposable sampling tool. Mixing the sample with urine, water or other substances that may affect sample quality should be avoided. A sufficient amount of sample must be collected for analysis, since an insufficient amount can result in library construction failure. The sample is stored in a low-temperature environment, typically between 2 °C and 8 °C, and is kept away from high temperatures to prevent bacterial proliferation.
Step S2, extracting microbiome DNA: the sample collected in step 1 is pretreated by centrifugation, filter paper filtration or other sample preparation method to remove solid particles and other impurities. Cell disruption is achieved by mechanical means (e.g., shaker), chemical means (e.g., protease treatment), or thermal treatment. Microbial DNA is extracted using a suitable DNA extraction kit or method. Typically, this involves mixing the sample with an extraction reagent and then separating the DNA from other cellular components by centrifugation or the like. The extracted DNA typically requires purification and washing to remove potential contaminants and reagent residues. The extracted DNA is then concentrated to a suitable concentration. This is typically accomplished by spin concentration or other concentration methods. The quality of the extracted DNA is detected, usually using an ultraviolet spectrophotometer (UV-Vis spectrophotometer) or a fluorescence analyzer. This helps determine the concentration and purity of the DNA.
Step S3, metagenome sequencing: the DNA is PCR amplified to increase the amount of DNA available for sequencing. The extracted DNA fragments are constructed into a DNA library, which involves ligating the adapters and primers required by the sequencing platform to the DNA fragments. The DNA library is sequenced using a high-throughput sequencing technology such as Illumina, 454, Ion Torrent, etc. These platforms can sequence large numbers of DNA fragments simultaneously, generating millions to billions of short sequences (short reads).
Finally, quality control is performed on the sequencing data, including removing low-quality sequences, adapter sequences, primer sequences and the like, so as to improve the accuracy of subsequent analysis.
Step S4, data analysis: after the sequence file in FASTQ format is obtained in step S3, quality control is performed using KneadData to remove low-quality sequences, contaminating sequences and repeated sequences. The output of KneadData is sequencing data that has undergone the quality control process.
The sequencing data processed by KneadData is then input into MetaPhlAn. MetaPhlAn analyzes the microbial species present in the sample and estimates their relative abundance, generating a species abundance profile. The profile contains the relative abundance of each detected microorganism in each sample at the taxonomic levels of domain, phylum, class, order, family, genus and species, including the abundance of Prevotella.
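Extracting the Prevotella rows from such a profile can be sketched as follows. The sketch assumes a simplified two-column, tab-separated layout (lineage string, relative abundance) with rank prefixes such as `g__` and `s__` in the style of MetaPhlAn output; an actual profile may carry additional columns, and the abundance values shown are hypothetical.

```python
def prevotella_species(profile_text):
    """Collect species-level Prevotella rows from a lineage-style profile."""
    result = {}
    for line in profile_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        clade, abundance = fields[0], float(fields[1])
        ranks = clade.split("|")
        leaf = ranks[-1]
        # Keep only species-level rows (s__ prefix) under the genus Prevotella.
        if leaf.startswith("s__") and "g__Prevotella" in ranks:
            result[leaf[3:]] = abundance
    return result


# Hypothetical profile in the assumed simplified layout.
PREFIX = "k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales"
profile = "\n".join([
    "#clade_name\trelative_abundance",
    "k__Bacteria\t100.0",
    PREFIX + "|f__Prevotellaceae|g__Prevotella\t12.5",
    PREFIX + "|f__Prevotellaceae|g__Prevotella|s__Prevotella_copri\t10.0",
    PREFIX + "|f__Prevotellaceae|g__Prevotella|s__Prevotella_stercorea\t2.5",
    PREFIX + "|f__Bacteroidaceae|g__Bacteroides|s__Bacteroides_fragilis\t87.5",
])
features = prevotella_species(profile)
print(features)
```

Matching on the exact `g__Prevotella` rank rather than on a substring avoids picking up other genera whose names merely begin with the same letters.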
Step S5, feature selection and modeling: a Wilcoxon rank-sum test is performed using the scipy package of Python, and features with significant differences are selected. The pre-screened hypertension biomarkers are the strains listed in Table 1.
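To show what the Wilcoxon rank-sum test computes in step S5, the sketch below implements it with the Python standard library using the normal approximation, without tie or continuity correction. In practice a library routine such as SciPy's `scipy.stats.ranksums` would normally be used. The abundance values and the 0.05 significance threshold are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, midranks for ties)."""
    combined = sorted((v, g) for g, values in enumerate((x, y)) for v in values)
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1

    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)  # rank sum of x
    expected = n1 * (n1 + n2 + 1) / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (r1 - expected) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical abundances per species: (healthy group, hypertensive group).
abundances = {
    "Prevotella_copri": ([0.01, 0.02, 0.03, 0.04, 0.05], [0.10, 0.12, 0.15, 0.20, 0.25]),
    "Prevotella_bivia": ([0.01, 0.03, 0.05, 0.07, 0.09], [0.02, 0.04, 0.06, 0.08, 0.10]),
}
ALPHA = 0.05  # assumed significance threshold
selected = [name for name, (h, p_group) in abundances.items()
            if rank_sum_test(h, p_group)[1] < ALPHA]
print(selected)
```

When many species are tested at once, a multiple-testing correction is commonly applied to the p-values before selection; the description does not state whether one was used.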
Step S6, constructing a hypertension risk prediction model: first, the data set is divided into two parts, typically a training set and a test set. The training set is used to build the model and the test set is used to evaluate the performance of the model. Typically, most of the data is used for training, leaving a portion for testing. Typical segmentation ratios may be 70% training data and 30% test data, or 80% training data and 20% test data may be selected.
Next, a machine learning model is built using the RandomForestClassifier function in the sklearn package of Python. The model learns the mapping from the input features to the target variable (i.e. the risk of hypertension). The trained model is then used to predict the samples in the test set, giving a hypertension risk prediction for each test sample based on the input feature values.
Step S7, verification and evaluation: the performance index of the model is calculated by comparing the predicted results of the model with the actual labels (the actual values of the hypertension risk) in the test set.
Specifically, accuracy, i.e. the ratio of the number of correctly predicted samples to the total number of samples, is selected as the performance index of the model. Grid search determines the optimal value of the hyperparameter n_estimators to be 120; with this setting the model reaches its best accuracy, 0.756, indicating a good hypertension risk prediction effect.
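A grid search over n_estimators with accuracy as the selection criterion can be sketched with scikit-learn's `GridSearchCV`. The synthetic data, the candidate values and the 5-fold cross-validation are assumptions for illustration; the sketch is not expected to reproduce the value 120 or the accuracy 0.756 reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic features and labels; the first feature weakly determines the label.
rng = np.random.default_rng(1)
X = rng.random((80, 6))
y = (X[:, 0] + 0.1 * rng.standard_normal(80) > 0.5).astype(int)

# Try each candidate n_estimators and keep the one with the best mean accuracy.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [40, 80, 120]},  # assumed candidate grid
    scoring="accuracy",
    cv=5,  # assumed 5-fold cross-validation
)
search.fit(X, y)

best_n = search.best_params_["n_estimators"]
best_accuracy = search.best_score_  # mean cross-validated accuracy of best_n
print(best_n, round(best_accuracy, 3))
```

Selecting the hyperparameter by cross-validation on the training data keeps the test set untouched for the final accuracy estimate.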
As shown in FIG. 9, the feature importances of the Prevotella species are ranked from high to low, with feature importance on the horizontal axis and the Prevotella species on the vertical axis; the species at the top of the ranking contributes most to the prediction.
It should be noted that the subject, as the person to be tested, needs to provide a fecal sample, which is usually collected by a medical professional or by the subject with the aid of a specific sampling tool.
Sample collection should follow a standardized procedure to ensure quality and consistency of the samples.
After DNA extraction, sequencing and analysis of the collected fecal sample, the presence and relative abundance of Prevotella and other microorganisms are obtained.
The abundance data is then provided as input to the hypertension risk prediction model. The model construction method is the same as that listed in the previous embodiment. The hypertension risk prediction model will generate a prediction of hypertension risk from the entered microbiome data and other features. This result may be the probability or risk score of the subject suffering from hypertension for health assessment and personalized medical decision.
In addition, referring to fig. 10, the present embodiment further provides a hypertension risk prediction apparatus, including:
an acquisition module 10, configured to acquire sequence data obtained after sequencing of a target sample;
a determining module 20, configured to perform bioinformatic analysis on the sequence data, and determine abundance information corresponding to Prevotella in the sequence data;
the prediction module 30 is configured to input the abundance information into a trained risk prediction model by using Prevotella as a biomarker, so as to obtain a prediction result representing the risk of hypertension.
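The cooperation of the three modules can be sketched as a simple pipeline. This is only a structural illustration: the class name, the callables standing in for each module and the returned values are hypothetical placeholders, not the sequencing, analysis or trained model of the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class HypertensionRiskDevice:
    """Chains the acquisition, determining and prediction modules."""
    acquire: Callable[[str], List[str]]                  # acquisition module 10
    determine: Callable[[List[str]], Dict[str, float]]   # determining module 20
    predict: Callable[[Dict[str, float]], float]         # prediction module 30

    def run(self, sample_id: str) -> float:
        sequence_data = self.acquire(sample_id)      # sequence data of the sample
        abundance = self.determine(sequence_data)    # Prevotella abundance info
        return self.predict(abundance)               # hypertension risk result


# Placeholder stand-ins for each module, for demonstration only.
device = HypertensionRiskDevice(
    acquire=lambda sample_id: ["ACGT", "TTGA"],
    determine=lambda reads: {"Prevotella_copri": 0.25},
    predict=lambda abundance: min(1.0, sum(abundance.values())),
)
risk = device.run("sample-001")
print(risk)
```

Passing each module in as a callable keeps the modules independently replaceable, for example swapping the placeholder predictor for a trained classifier.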
In addition, the embodiment also provides a hypertension risk prediction system, which comprises a memory and a processor, wherein the memory stores a hypertension risk prediction program, and the processor runs the hypertension risk prediction program to enable the hypertension risk prediction system to execute the hypertension risk prediction method.
In addition, the present embodiment also provides a computer-readable storage medium, on which a hypertension risk prediction program is stored, which when executed by a processor, implements the hypertension risk prediction method as described above.
In summary, the invention takes Prevotella in the human intestinal microbiota as its research object and, based on the intestinal microbiome and bioinformatics, provides a method for predicting hypertension risk using Prevotella in a sample, which has the characteristics of low cost, non-invasive sampling, high stability, high added value and the like.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention. The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.
Claims (10)
1. A method for predicting risk of hypertension, comprising:
obtaining sequence data obtained after sequencing of a target sample;
performing bioinformatics analysis on the sequence data, and determining abundance information corresponding to Prevotella in the sequence data;
and inputting the abundance information into a trained risk prediction model by taking Prevotella as a biomarker to obtain a prediction result for representing the hypertension risk.
2. The method of claim 1, wherein said obtaining sequence data obtained after sequencing of a target sample comprises:
sequencing a sample to be tested based on a high-throughput sequencing technology to obtain sequencing data; wherein the high throughput sequencing technique comprises any one of Illumina, 454 and Ion Torrent;
carrying out data standardization processing on the sequencing data to obtain the sequence data;
preferably, the data normalization process includes: the sequencing data is subjected to a process of removing low-quality sequences, removing adaptor sequences and removing primer sequences.
3. The method of claim 1, wherein said bioinformatic analysis of said sequence data to determine abundance information corresponding to Prevotella in said sequence data comprises:
Performing quality control on the sequence data to obtain data to be processed;
analyzing the microorganism species present in the data to be processed and the relative abundance corresponding to the microorganism species, and generating a species abundance table from the microorganism species and the relative abundance;
judging whether the microorganism species in the species abundance table contains Prevotella;
if yes, the abundance corresponding to the Prevotella in the species abundance list is used as the abundance information.
4. The method of claim 1, wherein the biomarker comprises one or a combination of more of the following species belonging to the genus Prevotella:
Prevotella_amnii, Prevotella_baroniae, Prevotella_bivia, Prevotella_buccae, Prevotella_buccalis, Prevotella_colorans, Prevotella_copri, Prevotella_corporis, Prevotella_dentalis, Prevotella_denticola, Prevotella_disiens, Prevotella_histicola, Prevotella_intermedia, Prevotella_multisaccharivorax, Prevotella_nigrescens, Prevotella_oralis, Prevotella_oris, Prevotella_sp_885, Prevotella_sp_AM42_24, Prevotella_sp_CAG_1031, Prevotella_sp_CAG_1058, Prevotella_sp_CAG_1092, Prevotella_sp_CAG_1124, Prevotella_sp_CAG_1185, Prevotella_sp_CAG_1320, Prevotella_sp_CAG_279, Prevotella_sp_CAG_485, Prevotella_sp_CAG_520, Prevotella_sp_CAG_5226, Prevotella_sp_CAG_617, Prevotella_sp_CAG_755, Prevotella_sp_CAG_873, Prevotella_sp_CAG_891, Prevotella_s7_1_8, Prevotella_stercorea and Prevotella_timensis.
5. The method of claim 1, further comprising:
establishing a training data set and a testing data set based on the collected stool samples of the normal population and the hypertensive population;
constructing the risk prediction model;
training the risk prediction model by using the training data set to obtain the test data and a corresponding prediction result of hypertension risk in each sample;
and determining and correcting the risk prediction model through the prediction result and the test data set to obtain the trained risk prediction model.
6. The method of claim 5, wherein the establishing training data sets and testing data sets based on the collected stool samples of the normal population and the hypertensive population comprises:
performing metagenome sequencing on fecal samples of the normal population and the hypertensive population to obtain test sequence data;
performing quality control on the test sequence data to obtain test screening data;
performing bioinformatic analysis on the test screening data, analyzing the microorganism species present in the test screening data and the relative abundance corresponding to the microorganism species, and generating a test species abundance table according to the microorganism species and the relative abundance; wherein, the abundance table of the test species comprises Prevotella and the abundance information corresponding to the Prevotella;
And establishing the training data set and the test data set according to the test species abundance table.
7. The method of claim 6, wherein said constructing said risk prediction model comprises:
constructing a machine learning model according to a random forest class function of python;
and learning and establishing a mapping relation from the input abundance information characteristics to the hypertension risk prediction result based on the machine learning model to obtain the risk prediction model.
8. A hypertension risk prediction apparatus, comprising:
the acquisition module is used for acquiring sequence data obtained after sequencing of the target sample;
the determining module is used for performing bioinformatics analysis on the sequence data and determining abundance information corresponding to Prevotella in the sequence data;
the prediction module is used for inputting the abundance information into a trained risk prediction model by taking Prevotella as a biomarker to obtain a prediction result representing the risk of hypertension.
9. A hypertension risk prediction system comprising a memory and a processor, wherein the memory stores a hypertension risk prediction program, and the processor runs the hypertension risk prediction program to cause the hypertension risk prediction system to perform the hypertension risk prediction method according to any one of claims 1 to 7.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a hypertension risk prediction program, which when executed by a processor, implements the hypertension risk prediction method according to any one of claims 1-7.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311393849.5A CN117457200A (en) | 2023-10-25 | 2023-10-25 | Hypertension risk prediction method, device, system and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN117457200A (en) | 2024-01-26 |
Family
ID=89590343
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202311393849.5A Pending CN117457200A (en) | 2023-10-25 | 2023-10-25 | Hypertension risk prediction method, device, system and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117457200A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118806799A (en) * | 2024-06-11 | 2024-10-22 | 中南大学 | Application of Prevotella in preparing medicine for treating or assisting in treating hypertension |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Kim et al. | | Unraveling metagenomics through long-read sequencing: a comprehensive review |
| CN108804875B (en) | | Method for analyzing microbial population function by using metagenome data |
| Tanca et al. | | The impact of sequence database choice on metaproteomic results in gut microbiota studies |
| CN109273053B (en) | | A high-throughput sequencing method for microbial data processing |
| Robinson et al. | | Intricacies of assessing the human microbiome in epidemiologic studies |
| CN115064220A (en) | | A single-cell method for cross-species cell type identification |
| WO2014019180A1 (en) | | Method and system for determining biomarker in abnormal state |
| CN113593630A (en) | | Family coronary heart disease risk assessment and risk factor identification system |
| CN113270145B (en) | | Method for judging background introduction microorganism sequence and application thereof |
| CN111243676B (en) | | High-throughput sequencing data-based wilt disease onset prediction model and application |
| CN118737282A (en) | | A bacterial selenoprotein gene identification method and terminal based on deep learning |
| CN117457200A (en) | | Hypertension risk prediction method, device, system and storage medium |
| CN109215736B (en) | | A high-throughput detection method and application of enterovirome |
| WO2024187890A1 (en) | | Snp data-based prediction method, apparatus and device and readable storage medium |
| Hollister et al. | | Bioinformation and 'omic approaches for characterization of environmental microorganisms |
| EP3975190A1 (en) | | Method for discovering marker for predicting risk of depression or suicide using multi-omics analysis, marker for predicting risk of depression or suicide, and method for predicting risk of depression or suicide using multi-omics analysis |
| CN118841180B (en) | | A method and system for constructing a prognostic model for acute myeloid leukemia |
| CN112599190B (en) | | Method for identifying deafness-related genes based on mixed classifier |
| CN118782149B (en) | | A Hi-C-based microbial metagenomic sequencing analysis method and system |
| CN114464255A (en) | | Methylation age assessment method based on DNA methylation level data |
| TWI582631B (en) | | Dna sequence analyzing system for analyzing bacterial species and method thereof |
| CN119252349A (en) | | Methods for single-cell transcriptome data-assisted AD analysis and classification |
| CN117487937B (en) | | Application of miRNA marker combinations in the preparation of age prediction products |
| EP4451285B1 (en) | | Method and system for identifying and utilizing frugal markers for classification of biological sample |
| CN116168761B (en) | | Method and device for determining characteristic region of nucleic acid sequence, electronic equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |