WO2023035623A1 - Artificial intelligence-based response corpus generation method and related device
- Publication number
- WO2023035623A1 (PCT/CN2022/088893)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- corpus
- question
- answer
- participle
- professional
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G06F16/3344: Query execution using natural language analysis
- G06F16/3329: Natural language query formulation
- G06F16/35: Clustering; Classification
- G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/30: Semantic analysis
- G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for computer-aided diagnosis, e.g. based on medical expert systems
- Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Definitions
- The present application relates to the field of artificial intelligence, and in particular to an artificial intelligence-based response corpus generation method and related equipment.
- The inventor has realized that existing medical platforms offer product recommendation during consultation, but this generally relies on the doctor making a verbal recommendation; the product or service is pushed only after the patient confirms an intention to purchase.
- Although this manual promotion method converts well, it cannot be applied on a large scale.
- Accurate intelligent recommendation based on big data analysis has not yet been fully applied.
- The main purpose of this application is to solve the technical problem of low accuracy in existing intelligent product recommendation during the consultation stage.
- A first aspect of the present application provides an artificial intelligence-based response corpus generation method, including: obtaining an inquiry corpus and a response corpus to be pushed that corresponds to the inquiry corpus, and, based on a preset linear-chain conditional random field, performing word segmentation on the inquiry corpus and the response corpus to be pushed respectively, to obtain a plurality of inquiry word segments and a plurality of response word segments; performing professional-word semantic matching on the inquiry word segments and the response word segments respectively, to obtain the professional inquiry word segments corresponding to the inquiry word segments and the professional response word segments corresponding to the response word segments; performing cross question-and-answer matching on each professional inquiry word segment and each professional response word segment in turn, and combining the professional inquiry word segments and the professional response word segments according to the matching result to obtain a diagnostic statement; using the preset prior medical knowledge base, matching the treatment
- A second aspect of the present application provides an artificial intelligence-based response corpus generation device, including: a word segmentation module, configured to obtain an inquiry corpus and a response corpus to be pushed that corresponds to the inquiry corpus, and, based on a preset linear-chain conditional random field, perform word segmentation on the inquiry corpus and the response corpus to be pushed respectively, to obtain a plurality of inquiry word segments and a plurality of response word segments;
- a semantic matching module, configured to perform professional-word semantic matching on the inquiry word segments and the response word segments respectively, to obtain the professional inquiry word segments corresponding to the inquiry word segments and the professional response word segments corresponding to the response word segments;
- a combination module, configured to use a preset prior medical knowledge base to match the treatment product information
- A third aspect of the present application provides an artificial intelligence-based response corpus generation device, including: a memory and at least one processor, with instructions stored in the memory; the at least one processor invokes the instructions in the memory so that the device executes the following artificial intelligence-based response corpus generation method: obtaining an inquiry corpus and a response corpus to be pushed that corresponds to the inquiry corpus, and, based on a preset linear-chain conditional random field, performing word segmentation on the inquiry corpus and the response corpus to be pushed respectively, to obtain a plurality of inquiry word segments and a plurality of response word segments; performing professional-word semantic matching on the inquiry word segments and the response word segments respectively, to obtain the professional inquiry word segments corresponding to the inquiry word segments and the professional response word segments corresponding to the response word segments; performing cross question-and-answer matching on each professional inquiry word segment and each professional response word segment in turn, and, according to the result of the cross question-and-answer matching, combining the professional inquiry word segments
- A fourth aspect of the present application provides a computer-readable storage medium in which instructions are stored; when run on a computer, the instructions cause the computer to perform the following artificial intelligence-based response corpus generation method: obtaining an inquiry corpus and a response corpus to be pushed that corresponds to the inquiry corpus, and, based on a preset linear-chain conditional random field, performing word segmentation on the inquiry corpus and the response corpus to be pushed respectively, to obtain a plurality of inquiry word segments and a plurality of response word segments; performing professional-word semantic matching on the inquiry word segments and the response word segments respectively, to obtain the professional inquiry word segments corresponding to the inquiry word segments and the professional response word segments corresponding to the response word segments; performing cross question-and-answer matching on each professional inquiry word segment and each professional response word segment in turn, and, according to the result of the cross question-and-answer matching, combining the professional inquiry word segments and the professional word segment
- In the technical solution provided by this application, the inquiry corpus and the response corpus to be pushed are converted, through professional-word semantic matching, into a professional inquiry corpus and a professional response corpus; the professional inquiry and response word segments are then matched to determine the patient's condition and the treatment plan recommended by the doctor, which yields a diagnostic statement; treatment product information is matched according to the diagnostic statement and sent to the patient together with the response corpus to be pushed.
- The recommended products and services are targeted at the current chat scene, give users a quick ordering function for specific products, and achieve accurate product recommendation during the consultation process.
- FIG. 1 is a schematic diagram of the first embodiment of the artificial intelligence-based response corpus generation method of the present application.
- FIG. 2 is a schematic diagram of the second embodiment of the artificial intelligence-based response corpus generation method of the present application.
- FIG. 3 is a schematic diagram of the third embodiment of the artificial intelligence-based response corpus generation method of the present application.
- FIG. 4 is a schematic diagram of an embodiment of the artificial intelligence-based response corpus generation device of the present application.
- FIG. 5 is a schematic diagram of another embodiment of the artificial intelligence-based response corpus generation device of the present application.
- FIG. 6 is a schematic diagram of an embodiment of the artificial intelligence-based response corpus generation equipment of the present application.
- The embodiments of the present application provide an artificial intelligence-based response corpus generation method and related equipment. The method obtains an inquiry corpus and a response corpus to be pushed, and performs word segmentation based on a preset linear-chain conditional random field to obtain inquiry word segments and response word segments; performs professional-word semantic matching on the inquiry and response word segments to obtain professional inquiry word segments and professional response word segments; performs cross question-and-answer matching on the professional inquiry word segments and professional response word segments, and combines them according to the matching result to obtain a diagnostic statement; and uses a preset prior medical knowledge base to match the treatment product information corresponding to the diagnostic statement, combines the treatment product information with the response corpus to be pushed, and obtains and pushes a new response corpus to be pushed.
- This application enables treatment products to be recommended during online consultation and makes online consultation more intelligent.
- The first embodiment of the artificial intelligence-based response corpus generation method in the embodiments of the present application includes:
- The execution subject of the present application may be an artificial intelligence-based response corpus generation device, a terminal, or a server, which is not specifically limited here.
- The embodiments of the present application are described taking a server as the execution subject.
- The inquiry corpus is entered and sent by the patient through the inquiry chat interface.
- The response corpus to be pushed is entered and sent through the chat interface by the attending doctor after receiving the patient's inquiry corpus. For example, the patient sends "I have a skin allergy on my arm. What should I do, doctor?", and the doctor replies to the inquiry corpus with "Have you taken any anti-allergy medicine at home?"
- The doctor's reply is the response corpus to be pushed. It is sent to the back end rather than forwarded directly to the patient: product recommendation information must first be embedded in it by the method of this application before it is sent.
- Product recommendation information is obtained from the semantic matching of the inquiry corpus and the response corpus to be pushed, so semantic recognition must be performed on both, and the corresponding product recommendation information is matched based on the semantic recognition results.
- A preset linear-chain conditional random field can be used to perform word segmentation on the inquiry corpus and the response corpus to be pushed, to obtain the corresponding inquiry word segments and response word segments.
- Word segmentation is performed on the inquiry corpus and the response corpus through the "generative-discriminative pair" relationship.
- Parts of speech are first identified for the inquiry word segments and response word segments, and each segment is tagged with its part of speech, so as to find the noun inquiry word segments and noun response word segments.
- The response word segment is excluded directly. Exclusion can use a custom or existing rule dictionary, whose rules an operator can complete, or can be delegated to an AI system.
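The part-of-speech filtering and rule-dictionary exclusion described above can be sketched as follows. This is a minimal illustration, not the method of the application: the tag "n" for nouns, the function name `select_noun_segments`, and the sample rule dictionary are all assumptions, and the part-of-speech tags are supplied by hand rather than produced by a tagger.

```python
# Hypothetical noun tag; the application does not specify a tag set.
NOUN_TAG = "n"


def select_noun_segments(tagged_segments, exclusion_rules):
    """Keep noun word segments that are not listed in the rule dictionary.

    tagged_segments: list of (word, part_of_speech) pairs.
    exclusion_rules: set of words to exclude (the "rule dictionary").
    """
    return [
        word
        for word, pos in tagged_segments
        if pos == NOUN_TAG and word not in exclusion_rules
    ]


# Hand-tagged segments loosely based on the example inquiry in the text.
inquiry_segments = [
    ("arm", "n"),
    ("skin allergy", "n"),
    ("what", "r"),
    ("do", "v"),
    ("doctor", "n"),
]
# Hypothetical rule dictionary: forms of address carry no diagnostic content.
rules = {"doctor"}

noun_segments = select_noun_segments(inquiry_segments, rules)
print(noun_segments)
```

Keeping the rule dictionary as a plain set makes it easy for an operator to extend; an AI-based exclusion step could replace the membership test without changing the interface.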
- The doctor responds to the inquiry corpus entered by the patient.
- The professional inquiry word segments and professional response word segments obtained are the key inquiry words in the inquiry corpus and the key response words in the response corpus.
- An inquiry may contain multiple symptoms or questions during the consultation, and the doctor responds to each symptom or question accordingly. Cross question-and-answer matching must therefore be performed on the professional inquiry word segments and the professional response word segments.
- A mapping table between regular expressions of diagnostic statements and diagnosis results is configured in the prior medical knowledge base. The regular expression corresponding to a diagnostic statement is found first, and the mapping table is then traversed with that regular expression to determine the diagnosis result mapped to the diagnostic statement. A mapping table between diagnosis results and treatment product identification information is also configured in the prior medical knowledge base.
- In the embodiments of the present application, the inquiry corpus and the response corpus to be pushed are converted, through professional-word semantic matching, into a professional inquiry corpus and a professional response corpus; the professional inquiry and response word segments are then matched to determine the patient's condition and the treatment plan recommended by the doctor, which yields a diagnostic statement; treatment product information is matched according to the diagnostic statement and pushed to the patient together with the response corpus to be pushed.
- The recommended products and services are targeted at the current chat scene, give users a quick ordering function for specific products, and achieve accurate product recommendation during the consultation process.
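The two-stage lookup (diagnostic statement to diagnosis result via regular expressions, then diagnosis result to treatment product information) might look like the sketch below. Every table entry, identifier, and URL here is a placeholder invented for illustration; the actual contents of the prior medical knowledge base are not given in the text.

```python
import re

# Hypothetical mapping table: regular expression of a diagnostic statement
# -> diagnosis result.
DIAGNOSIS_PATTERNS = [
    (re.compile(r"skin allergy.*anti-allergy"), "skin allergy"),
]

# Hypothetical mapping table: diagnosis result -> product identifiers.
PRODUCT_IDS_BY_DIAGNOSIS = {"skin allergy": ["PRODUCT-001"]}

# Hypothetical product records holding a summary and a referral link.
PRODUCT_INFO = {
    "PRODUCT-001": {
        "summary": "placeholder product summary",
        "link": "https://example.com/products/PRODUCT-001",
    },
}


def match_products(diagnostic_statement):
    """Return (diagnosis result, product info list) for a diagnostic statement."""
    for pattern, diagnosis in DIAGNOSIS_PATTERNS:
        if pattern.search(diagnostic_statement):
            ids = PRODUCT_IDS_BY_DIAGNOSIS.get(diagnosis, [])
            return diagnosis, [PRODUCT_INFO[i] for i in ids if i in PRODUCT_INFO]
    # No regular expression matched: nothing to recommend.
    return None, []


diagnosis, products = match_products("arm skin allergy; anti-allergy medicine")
print(diagnosis, products)
```

Returning an empty product list when nothing matches lets the caller push the doctor's response unchanged.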
- The second embodiment of the artificial intelligence-based response corpus generation method in the embodiments of the present application includes:
- Each character in the inquiry corpus is split out and sequentially encoded to obtain a character table.
- The character feature vector of each character is trained with a neural network such as Word2vec, and the character feature vector contains the context information of the inquiry corpus.
- Each character feature vector represents one character.
- The dimension of each character feature vector can be adjusted according to the size of the corpus; typical choices are 50, 100, or 200.
- c_l is the vector corresponding to the l-th letter of the pinyin of the character; L is the maximum pinyin length, which is a fixed value by default.
- The maximum length of the pinyin of a character is 6, so L can be set to 6. If the length L' of the pinyin of a character is less than L, the elements in rows L'+1 to L of the corresponding pinyin vector matrix are set to zero; for example, for the pinyin "shi", whose length is 3, rows 4 to 6 of the corresponding pinyin vector matrix are all set to zero.
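The zero-padding of the pinyin vector matrix can be illustrated with a short sketch. It assumes one-hot letter vectors over the 26 lowercase letters purely for demonstration; the application does not state how the letter vectors are obtained, and the names `letter_vector` and `pinyin_matrix` are hypothetical.

```python
import string

MAX_PINYIN_LEN = 6                    # L in the text
ALPHABET = string.ascii_lowercase     # assumed letter inventory


def letter_vector(letter):
    """One-hot vector for a pinyin letter (an assumption for illustration)."""
    vec = [0.0] * len(ALPHABET)
    vec[ALPHABET.index(letter)] = 1.0
    return vec


def pinyin_matrix(pinyin, max_len=MAX_PINYIN_LEN):
    """Build an L x 26 matrix; rows L'+1 to L are zero when len(pinyin) = L' < L."""
    if len(pinyin) > max_len:
        raise ValueError("pinyin longer than the maximum length L")
    rows = [letter_vector(letter) for letter in pinyin]
    padding = [[0.0] * len(ALPHABET) for _ in range(max_len - len(pinyin))]
    return rows + padding


matrix = pinyin_matrix("shi")
print(len(matrix), [sum(row) for row in matrix])
```

For "shi" the first three rows each contain a single 1.0 and rows 4 to 6 are all zero, so every character yields a matrix of the same shape for the subsequent convolutional encoder.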
- Each pinyin feature vector matrix is encoded in turn by a convolutional neural network (CNN) to obtain a pinyin feature vector of fixed size.
- The character feature vectors and the pinyin feature vectors are spliced one-to-one, following the order of the characters in the inquiry corpus, to obtain the context information vector. The context information vector is then input into a bidirectional LSTM neural network for semantic analysis. The bidirectional LSTM neural network includes a forward LSTM neural network and a backward LSTM neural network, and uses its forgetting and saving mechanisms together with backpropagation to learn the semantic features of the context information vector.
- P(y|z) is the probability of the label sequence y given the value z;
- S(z) is a normalization factor, which normalizes the output to a value between 0 and 1.
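To make the role of the normalization factor S(z) concrete, the sketch below computes P(y|z) for a tiny linear-chain conditional random field by enumerating every label sequence. The B/M/E/S tag set, the score tables, and the additive score form (emission scores plus transition scores) are standard textbook assumptions, not details taken from the application; real implementations compute S(z) with the forward algorithm rather than by enumeration.

```python
import itertools
import math

TAGS = ("B", "M", "E", "S")  # assumed segmentation tag set


def sequence_score(tags, emissions, transitions):
    """Unnormalized score: per-position emission scores plus transition scores."""
    score = sum(emissions[i].get(tag, 0.0) for i, tag in enumerate(tags))
    score += sum(transitions.get(pair, 0.0) for pair in zip(tags, tags[1:]))
    return score


def sequence_probability(tags, emissions, transitions):
    """P(y|z) = exp(score(y, z)) / S(z), with S(z) summed over all sequences."""
    s_z = sum(
        math.exp(sequence_score(candidate, emissions, transitions))
        for candidate in itertools.product(TAGS, repeat=len(emissions))
    )
    return math.exp(sequence_score(tags, emissions, transitions)) / s_z


# Hypothetical scores for a two-character input.
emissions = [{"B": 1.0, "S": 0.2}, {"E": 1.0, "S": 0.2}]
transitions = {("B", "E"): 1.5, ("S", "S"): 0.5}

p_be = sequence_probability(("B", "E"), emissions, transitions)
print(round(p_be, 4))
```

Because S(z) sums the exponentiated scores of all candidate sequences, the probabilities of all sequences add up to 1 and each lies between 0 and 1.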
- Each character in the question-and-answer word segments and each professional word in the professional-word dictionary has its own particular combination of pronunciation and glyph shape.
- The initial, the final, the supplementary final code, and the tone of each character are digitally coded to obtain the four-digit code of its pronunciation;
- the character structure, the five four-corner codes, and the stroke count of each character are coded to obtain the 7-digit code of its glyph shape;
- combining the two forms the unique 11-digit phonetic-graph code of each character; the phonetic-graph codes comprise the first phonetic-graph code and the second phonetic-graph code.
- F0 to F9, G0 to G9, H0 to H9, J0 to J9, and K0 to K9 are the coding fields for the upper-left corner, the upper-right corner, the lower-left corner, the lower-right corner, and the attached number respectively, each taking one of the ten stroke classes;
- Li denotes the stroke count, where i is the number of strokes and is a positive integer.
- For example, the glyph coding information of the character "花" ("flower") is E2F4G4H2J1K4L7, so the complete coding information of this commonly used character is A11B13C13D1E2F4G4H2J1K4L7.
- The phonetic-graph code contains eleven types of coding fields. For each type of field that differs between the first phonetic-graph code and the second phonetic-graph code, the edit distance increases by 1; otherwise it is unchanged. If all field types agree, the two words have the highest similarity and their edit distance is 0. If all field types differ, the two words have the lowest similarity and their edit distance is 11. The edit distance between a word to be replaced and a commonly used word therefore lies between 0 and 11.
- The edit distance quantifies the similarity between each question-and-answer word segment and each word in the professional-word dictionary: the smaller the edit distance, the higher the similarity. The user can therefore set a preset edit distance threshold to filter the professional words used for cross combination.
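The field-wise edit distance and the threshold filter can be sketched as follows. The parsing assumes codes written like the example in the text, with the eleven field letters A to H and J to L each followed by digits; the candidate words, their codes, and the threshold value are hypothetical.

```python
import re

FIELD_NAMES = "ABCDEFGHJKL"  # eleven coding fields; the letter I is not used
FIELD_PATTERN = re.compile(r"([A-HJ-L])(\d+)")


def parse_code(code):
    """Split a phonetic-graph code such as 'A11B13...L7' into its eleven fields."""
    fields = dict(FIELD_PATTERN.findall(code))
    if set(fields) != set(FIELD_NAMES):
        raise ValueError(f"expected eleven fields in {code!r}")
    return fields


def edit_distance(first_code, second_code):
    """Count the field types whose values differ (0 = identical, 11 = all differ)."""
    first, second = parse_code(first_code), parse_code(second_code)
    return sum(1 for name in FIELD_NAMES if first[name] != second[name])


def filter_candidates(segment_code, professional_codes, threshold):
    """Keep professional words whose distance to the segment is below the threshold."""
    return [
        word
        for word, code in professional_codes.items()
        if edit_distance(segment_code, code) < threshold
    ]


segment_code = "A11B13C13D1E2F4G4H2J1K4L7"  # example code from the text
# Hypothetical professional-word codes for illustration only.
professional_codes = {
    "term_close": "A11B13C13D2E2F4G4H2J1K4L8",  # differs in fields D and L
    "term_far": "A2B5C7D3E1F1G1H1J2K1L3",       # differs in all eleven fields
}

print(edit_distance(segment_code, professional_codes["term_close"]))
print(filter_candidates(segment_code, professional_codes, threshold=3))
```

Comparing whole fields rather than individual characters keeps the distance within the 0 to 11 range stated in the text, regardless of how many digits a field value has.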
- A conventional semantic recognition model is used to perform semantic analysis on the question-and-answer phrase and on the professional phrases, yielding the first semantic analysis result and the second semantic analysis results respectively. If comparison shows only a small semantic deviation between the two, the professional word substituted in the corresponding professional phrase is determined to be a synonym of the corresponding question-and-answer word segment, and is used as the professional question-and-answer word segment corresponding to it.
- The phonetic-graph codes of the question-and-answer word segments and of the professional words are constructed from the preset common-word dictionary and professional-word dictionary; synonyms of each question-and-answer word segment are determined by matching the phonetic-graph codes and substituted in, yielding the corresponding professional inquiry word segments and professional response word segments, which makes subsequent product matching more accurate.
- The third embodiment of the artificial intelligence-based response corpus generation method in the embodiments of the present application includes:
- The modular encrypted corpus includes the first modular encrypted corpus corresponding to the inquiry corpus and the second encrypted corpus corresponding to the response corpus to be pushed;
- the type of the plaintext m of the inquiry corpus and the response corpus to be pushed is T;
- the set of T is {integer, real number, character, date, Boolean, etc.}
- a medical inquiry corpus that is, binary, decimal, hexadeci
- c represents the ciphertext
- m represents the consultation
- s represents the base used in encryption
- r represents a random number
- p is an encryption key
- x0 is an intermediate variable, which is equal to the encryption key p and another encryption key
- The ciphertext original code, ciphertext inverse code, and ciphertext complement code can be calculated from the encrypted corpus, and these three codes are used for computation on the ciphertext. For the addition of encrypted corpora, the ciphertext components are summed directly in place, without using the ciphertext original code, ciphertext inverse code, or ciphertext complement code.
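As background for the terms original code, inverse code, and complement code, the sketch below computes the three representations for ordinary fixed-width signed integers (sign-magnitude, ones' complement, and two's complement). It operates on plaintext integers and is only an analogy: how the application derives these codes from ciphertext is not shown in the text, and the 8-bit width and function names are assumptions.

```python
def original_code(value, width=8):
    """Sign-magnitude form: top bit is the sign, remaining bits the magnitude."""
    magnitude = abs(value)
    if magnitude >= 1 << (width - 1):
        raise ValueError("magnitude does not fit in the given width")
    sign = 1 if value < 0 else 0
    return (sign << (width - 1)) | magnitude


def inverse_code(value, width=8):
    """Ones' complement: for negatives, flip every bit except the sign bit."""
    code = original_code(value, width)
    if value >= 0:
        return code
    magnitude_mask = (1 << (width - 1)) - 1
    return code ^ magnitude_mask


def complement_code(value, width=8):
    """Two's complement: for negatives, the inverse code plus one."""
    if value >= 0:
        return original_code(value, width)
    return (inverse_code(value, width) + 1) & ((1 << width) - 1)


for name, func in (("original", original_code),
                   ("inverse", inverse_code),
                   ("complement", complement_code)):
    print(name, format(func(-5), "08b"))

# Subtraction as addition of a complement code: 7 - 5 computed as 7 + (-5).
difference = (complement_code(7) + complement_code(-5)) & 0xFF
print(difference)
```

The last lines show why a complement code is useful: subtraction reduces to addition followed by truncation to the word width.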
- The total length of the storage format is 32, 64, or 80 bits and includes sign bits, integer bits, and fractional bits, and the binary plaintext is expanded according to this storage format; the encryption operation is performed on the expanded binary plaintext, and the results are combined to obtain the corresponding ciphertexts, which serve as the dividend and the divisor respectively; the initial value of the fractional-bit counter count is set to the storage format length minus L, where L is the length of the integer bits in the storage format; it is judged whether the ciphertext of the dividend is greater than the ciphertext of the divisor; if so, the ciphertext of the dividend is added to the complement code of the encrypted corpus to obtain the remainder as the new dividend, and the ciphertext of 1 is added at the integer position to obtain the ciphertext quotient; otherwise, it is judged whether the ciphertext of the remainder is all zero or whether the counter count is greater than the total length of the storage format,
- The patient's personal privacy information can thus be protected and the patient's consultation experience improved.
- The artificial intelligence-based response corpus generation method in the embodiments of the present application has been described above; the following describes the artificial intelligence-based response corpus generation device in the embodiments of the present application.
- One embodiment of the artificial intelligence-based response corpus generation device in the embodiments of the present application includes:
- a word segmentation module 401, configured to obtain an inquiry corpus and a response corpus to be pushed that corresponds to the inquiry corpus, and, based on a preset linear-chain conditional random field, perform word segmentation on the inquiry corpus and the response corpus to be pushed respectively, to obtain a plurality of inquiry word segments and a plurality of response word segments;
- a semantic matching module 402, configured to perform professional-word semantic matching on the inquiry word segments and the response word segments respectively, to obtain the professional inquiry word segments corresponding to the inquiry word segments and the professional response word segments corresponding to the response word segments;
- a question-and-answer matching module 403, configured to perform cross question-and-answer matching on each professional inquiry word segment and each professional response word segment in turn, and to combine the professional inquiry word segments and the professional response word segments according to the matching result to obtain a diagnostic statement;
- a combination module 404, configured to use the preset prior medical knowledge base to match the treatment product information corresponding to the diagnostic statement, combine the treatment product information with the response corpus to be pushed, and obtain and push a new response corpus to be pushed.
- In the embodiments of the present application, the inquiry corpus and the response corpus to be pushed are converted, through professional-word semantic matching, into a professional inquiry corpus and a professional response corpus; the professional inquiry and response word segments are then matched to determine the patient's condition and the treatment plan recommended by the doctor, which yields a diagnostic statement; treatment product information is matched according to the diagnostic statement and pushed to the patient together with the response corpus to be pushed.
- The recommended products and services are targeted at the current chat scene, give users a quick ordering function for specific products, and achieve accurate product recommendation during the consultation process.
- Referring to FIG. 5, another embodiment of the artificial intelligence-based response corpus generation device in the embodiments of the present application includes:
- a word segmentation module 401, configured to obtain an inquiry corpus and a response corpus to be pushed that corresponds to the inquiry corpus, and, based on a preset linear-chain conditional random field, perform word segmentation on the inquiry corpus and the response corpus to be pushed respectively, to obtain a plurality of inquiry word segments and a plurality of response word segments;
- a semantic matching module 402, configured to perform professional-word semantic matching on the inquiry word segments and the response word segments respectively, to obtain the professional inquiry word segments corresponding to the inquiry word segments and the professional response word segments corresponding to the response word segments;
- a question-and-answer matching module 403, configured to perform cross question-and-answer matching on each professional inquiry word segment and each professional response word segment in turn, and to combine the professional inquiry word segments and the professional response word segments according to the matching result to obtain a diagnostic statement;
- a combination module 404, configured to use the preset prior medical knowledge base to match the treatment product information corresponding to the diagnostic statement, combine the treatment product information with the response corpus to be pushed, and obtain and push a new response corpus to be pushed.
- The word segmentation module 401 includes:
- an extraction unit 4011, configured to extract the character feature vectors and the corresponding pinyin feature vectors of the question-and-answer corpus, wherein the question-and-answer corpus includes the inquiry corpus and the response corpus to be pushed;
- a splicing unit 4012, configured to splice the character feature vectors and the corresponding pinyin feature vectors to obtain context information vectors, and to perform semantic analysis on the context information vectors to obtain semantic features;
- a decoding unit 4013, configured to label the semantic features with the preset linear-chain conditional random field to obtain a word segmentation tag sequence, and to decode the word segmentation tag sequence to obtain a plurality of question-and-answer word segments, wherein the question-and-answer word segments include inquiry word segments and response word segments.
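Decoding a word segmentation tag sequence into word segments can be sketched as below. The B/M/E/S scheme (begin, middle, end, single-character word) is a common convention assumed here for illustration; the text does not name the tag set used by the decoding unit, and `decode_tags` is a hypothetical name.

```python
def decode_tags(chars, tags):
    """Group characters into word segments according to B/M/E/S tags."""
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words, current = [], ""
    for char, tag in zip(chars, tags):
        # A new word starts at B or S; flush any unfinished word first.
        if tag in ("B", "S") and current:
            words.append(current)
            current = ""
        current += char
        # A word ends at E or S.
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # tolerate a sequence that ends without E or S
        words.append(current)
    return words


segments = decode_tags("皮肤过敏", ["B", "E", "B", "E"])
print(segments)
```

Flushing on B or S as well as on E makes the decoder tolerant of imperfect tag sequences, which a learned tagger can occasionally produce.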
- The semantic matching module 402 includes:
- a construction unit 4021, configured to construct the first phonetic-graph code of each question-and-answer word segment from the preset common-word dictionary, construct the second phonetic-graph code of each professional word in the preset professional-word dictionary, and calculate the edit distance between the first phonetic-graph code and the second phonetic-graph code;
- a combination unit 4022, configured to combine the question-and-answer word segments corresponding to the first phonetic-graph codes whose edit distance is less than the preset edit distance threshold to obtain a question-and-answer phrase, and to select the professional words corresponding to the second phonetic-graph codes whose edit distance is less than the edit distance threshold;
- a replacement unit 4023, configured to replace the corresponding question-and-answer word segments in the question-and-answer phrase with the selected professional words in turn, to obtain a plurality of professional phrases corresponding to the question-and-answer phrase;
- a semantic analysis unit 4024, configured to perform semantic analysis on the question-and-answer phrase to obtain a first semantic analysis result, and to perform semantic analysis on each of the professional phrases to obtain a plurality of second semantic analysis results;
- a comparison unit 4025, configured to compare the first semantic analysis result with each of the second semantic analysis results, select from the plurality of professional phrases, according to the comparison result, a synonym for each question-and-answer word segment in the question-and-answer phrase, and use the selected synonym as the professional question-and-answer word segment corresponding to that question-and-answer word segment, wherein the professional question-and-answer word segments include professional inquiry word segments and professional response word segments.
- comparison unit 4025 is also used for:
- the combining module 404 includes:
- the traversal unit 4041 is configured to use the diagnostic statement to perform a hierarchical traversal of the preset prior medical knowledge base, and to determine the diagnosis result corresponding to the diagnostic statement according to the result of the hierarchical traversal;
- the screening unit 4042 is configured to select the therapeutic product identification information matching the diagnosis result from the prior medical knowledge base, and to obtain the therapeutic product information mapped to the therapeutic product identification information, wherein the therapeutic product information includes referral links and summary information for therapeutic products.
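The traversal and screening steps can be illustrated with a minimal Python sketch. The text does not specify how the prior medical knowledge base is organized, so the tree of keyword nodes, the substring matching rule, the `KnowledgeNode` class, the two lookup dictionaries, and all sample values below are assumptions made only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class KnowledgeNode:
    """Hypothetical knowledge-base node: a keyword, an optional diagnosis, and children."""
    keyword: str
    diagnosis: Optional[str] = None
    children: List["KnowledgeNode"] = field(default_factory=list)


def find_diagnosis(root: KnowledgeNode, statement: str) -> Optional[str]:
    """Level-order (breadth-first) traversal; only matched nodes are expanded.

    The diagnosis of the last matched node is kept, so a match on a deeper
    level overrides a match on a shallower level.
    """
    result = None
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node.keyword not in statement:
            continue  # prune this subtree: its keyword does not occur in the statement
        if node.diagnosis is not None:
            result = node.diagnosis
        queue.extend(node.children)
    return result


def lookup_product(diagnosis: str,
                   product_ids: Dict[str, str],
                   products: Dict[str, Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Map diagnosis -> product identifier -> product information (link and summary)."""
    product_id = product_ids.get(diagnosis)
    return products.get(product_id) if product_id is not None else None


# Hypothetical sample data; the root keyword "" matches every statement.
kb_root = KnowledgeNode("", children=[
    KnowledgeNode("headache", diagnosis="diagnosis-A", children=[
        KnowledgeNode("nausea", diagnosis="diagnosis-B"),
    ]),
])
product_ids = {"diagnosis-B": "P001"}
products = {"P001": {"link": "https://example.com/products/P001",
                     "summary": "placeholder summary"}}

diagnosis = find_diagnosis(kb_root, "headache with nausea")
product_info = lookup_product(diagnosis, product_ids, products)
```

Keeping the identifier lookup separate from the product information lookup mirrors the two-stage mapping described for the screening unit 4042.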
- the artificial intelligence-based answer corpus generation device also includes an encryption module 405, which is used for:
- the inverse code of the ciphertext, and the complementary code of the ciphertext perform a modular operation on the encrypted corpus to obtain a modular encrypted corpus, wherein the modular encrypted corpus includes the first modular encrypted corpus corresponding to the inquiry corpus and the second modular encrypted corpus corresponding to the response corpus to be pushed;
- the first encrypted corpus is used as a new interrogation corpus, and the second encrypted corpus is used as a new response corpus to be pushed.
- the phonetic-phonetic codes of the question-and-answer participles and of the professional words are constructed through the preset common word dictionary and professional word dictionary, the synonyms of each question-and-answer participle are determined by matching the phonetic-phonetic codes, and the participles are replaced with those synonyms to obtain the corresponding inquiry professional participles and response professional participles, which makes subsequent product matching more accurate; by further encrypting the inquiry corpus and the response corpus to be pushed, and by performing product recommendation and other data processing on the ciphertext, the patient's personal privacy information can be better protected and the patient's consultation experience improved.
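The code matching summarized above can be sketched in Python as follows. The text does not define how a phonetic-phonetic code is built, so the codes here are treated as opaque strings; the sample codes, the term names, and the `select_candidates` helper are hypothetical, and Levenshtein distance is assumed as the edit distance.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two code strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def select_candidates(word_code: str,
                      professional_codes: Dict[str, str],
                      threshold: int) -> List[str]:
    """Return the professional words whose code is within the threshold of word_code."""
    return [term for term, code in professional_codes.items()
            if edit_distance(word_code, code) < threshold]


# Hypothetical codes: one differs from the query code by a single character.
professional_codes = {
    "professional_term_a": "TOUTONG1",
    "professional_term_b": "WEITONG1",
}
candidates = select_candidates("TOUTENG1", professional_codes, threshold=2)
```

The selected candidates would then be substituted into the phrase one at a time and compared semantically, as described for the replacement, semantic analysis, and comparison units.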
- Figure 4 and Figure 5 above describe in detail the artificial intelligence-based response corpus generation device in the embodiment of the present application from the perspective of modular functional entities; the following describes the artificial intelligence-based response corpus generation device in the embodiment of the present application in detail from the perspective of hardware processing.
- FIG. 6 is a schematic structural diagram of an artificial intelligence-based response corpus generation device provided by an embodiment of the present application.
- the artificial intelligence-based response corpus generation device 600 may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 610, a memory 620, and one or more storage media 630 (for example, one or more mass storage devices) for storing application programs 633 or data 632.
- the memory 620 and the storage medium 630 may be temporary storage or persistent storage.
- the program stored in the storage medium 630 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the artificial intelligence-based response corpus generation device 600.
- the processor 610 may be configured to communicate with the storage medium 630, and to execute a series of instruction operations in the storage medium 630 on the artificial intelligence-based response corpus generation device 600.
- the artificial intelligence-based response corpus generation device 600 may also include one or more power sources 640, one or more wired or wireless network interfaces 650, one or more input and output interfaces 660, and/or one or more operating systems 631, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
- the present application also provides an artificial intelligence-based response corpus generation device, which is a computer device including a memory and a processor; computer-readable instructions are stored in the memory, and when the computer-readable instructions are executed by the processor, the processor is caused to perform the above-mentioned tasks.
- the present application also provides a computer-readable storage medium.
- the computer-readable storage medium may be a non-volatile computer-readable storage medium.
- the computer-readable storage medium may also be a volatile computer-readable storage medium.
- instructions are stored in the computer-readable storage medium, and when the instructions are run on a computer, the computer is made to execute the following steps of the artificial intelligence-based response corpus generation method: obtaining an inquiry corpus and the response corpus to be pushed corresponding to the inquiry corpus, and, based on a preset linear-chain conditional random field, performing word segmentation on the inquiry corpus and on the response corpus to be pushed respectively, to correspondingly obtain a plurality of inquiry participles and a plurality of response participles; performing professional-word semantic matching on the inquiry participles and the response participles, to correspondingly obtain the inquiry professional participles corresponding to the inquiry participles and the response professional participles corresponding to the response participles; According to the results of the cross-ques
- if the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
- the aforementioned storage medium includes: U disk, mobile hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Mathematical Physics (AREA)
- Biomedical Technology (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Public Health (AREA)
- Human Computer Interaction (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Machine Translation (AREA)
Abstract
The present application relates to the field of artificial intelligence. Disclosed are an artificial intelligence-based response corpus generation method and a related device. The method comprises: acquiring an inquiry corpus and a response corpus to be pushed, and performing word segmentation based on a preset linear-chain conditional random field, to correspondingly obtain inquiry participles and response participles (101); performing professional-word semantic matching on the inquiry participles and the response participles, to correspondingly obtain inquiry professional participles and response professional participles (102); performing question-and-answer cross-matching on the inquiry professional participles and the response professional participles, and combining them according to the question-and-answer cross-matching result, to obtain a diagnostic statement (103); and matching, by means of a preset prior medical knowledge base, therapeutic product information corresponding to the diagnostic statement, combining the therapeutic product information with the response corpus to be pushed to obtain a new response corpus to be pushed, and pushing the new response corpus (104). By means of the present application, therapeutic products are recommended during an online inquiry, which improves the degree of intelligence of online inquiry.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111055021.XA CN113742454B (zh) | 2021-09-09 | 2021-09-09 | 基于人工智能的应答语料生成方法及相关设备 |
| CN202111055021.X | 2021-09-09 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023035623A1 true WO2023035623A1 (fr) | 2023-03-16 |
Family
ID=78737446
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2022/088893 Ceased WO2023035623A1 (fr) | 2021-09-09 | 2022-04-25 | Procédé de génération de corpus de réponses basé sur une intelligence artificielle et dispositif associé |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN113742454B (fr) |
| WO (1) | WO2023035623A1 (fr) |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116313162A (zh) * | 2023-05-12 | 2023-06-23 | 北京梆梆安全科技有限公司 | 一种基于ai模型的医疗问诊系统 |
| CN116992011A (zh) * | 2023-08-15 | 2023-11-03 | 浙商证券股份有限公司 | 一种业务数据匹配查询的方法、系统及装置 |
| CN118278406A (zh) * | 2024-04-29 | 2024-07-02 | 上海信产管理咨询有限公司 | 通信工程记录文件信息处理方法、装置及存储介质 |
| CN118690000A (zh) * | 2024-08-26 | 2024-09-24 | 吉林大学第一医院 | 一种基于知识图谱的急诊分诊问答系统 |
| CN119670869A (zh) * | 2025-02-21 | 2025-03-21 | 北京融威众邦科技股份有限公司 | 一种医疗问答大模型的语料库构建方法及系统 |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113742454B (zh) * | 2021-09-09 | 2023-07-21 | 平安科技(深圳)有限公司 | 基于人工智能的应答语料生成方法及相关设备 |
| CN114297693B (zh) * | 2021-12-30 | 2022-11-18 | 北京海泰方圆科技股份有限公司 | 一种模型预训练方法、装置、电子设备及存储介质 |
| CN114861080B (zh) * | 2022-05-12 | 2025-05-02 | 平安科技(深圳)有限公司 | 一种问答语料推荐方法、装置、计算机设备及存储介质 |
| CN116775833A (zh) * | 2023-06-20 | 2023-09-19 | 平安科技(深圳)有限公司 | 适用于问诊的信息补全方法、装置、设备及介质 |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR101178197B1 (ko) * | 2011-03-17 | 2012-08-29 | 김지만 | 의약품 광고 시스템 |
| US20170116384A1 (en) * | 2015-10-21 | 2017-04-27 | Jamal Ghani | Systems and methods for computerized patient access and care management |
| CN109817351A (zh) * | 2019-01-31 | 2019-05-28 | 百度在线网络技术(北京)有限公司 | 一种信息推荐方法、装置、设备及存储介质 |
| CN112509682A (zh) * | 2020-12-15 | 2021-03-16 | 康键信息技术(深圳)有限公司 | 基于文本识别的问诊方法、装置、设备及存储介质 |
| CN113742454A (zh) * | 2021-09-09 | 2021-12-03 | 平安科技(深圳)有限公司 | 基于人工智能的应答语料生成方法及相关设备 |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6026388A (en) * | 1995-08-16 | 2000-02-15 | Textwise, Llc | User interface and other enhancements for natural language information retrieval system and method |
| CN110781677B (zh) * | 2019-10-12 | 2023-02-07 | 深圳平安医疗健康科技服务有限公司 | 药品信息匹配处理方法、装置、计算机设备和存储介质 |
| CN111695343A (zh) * | 2020-06-23 | 2020-09-22 | 深圳壹账通智能科技有限公司 | 错词纠正方法、装置、设备及存储介质 |
| CN112287080B (zh) * | 2020-10-23 | 2023-10-03 | 平安科技(深圳)有限公司 | 问题语句的改写方法、装置、计算机设备和存储介质 |
- 2021-09-09: CN application CN202111055021.XA, publication CN113742454B (zh), status: Active
- 2022-04-25: WO application PCT/CN2022/088893, publication WO2023035623A1 (fr), status: Ceased
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116313162A (zh) * | 2023-05-12 | 2023-06-23 | 北京梆梆安全科技有限公司 | 一种基于ai模型的医疗问诊系统 |
| CN116313162B (zh) * | 2023-05-12 | 2023-08-18 | 北京梆梆安全科技有限公司 | 一种基于ai模型的医疗问诊系统 |
| CN116992011A (zh) * | 2023-08-15 | 2023-11-03 | 浙商证券股份有限公司 | 一种业务数据匹配查询的方法、系统及装置 |
| CN116992011B (zh) * | 2023-08-15 | 2024-09-13 | 浙商证券股份有限公司 | 一种业务数据匹配查询的方法、系统及装置 |
| CN118278406A (zh) * | 2024-04-29 | 2024-07-02 | 上海信产管理咨询有限公司 | 通信工程记录文件信息处理方法、装置及存储介质 |
| CN118690000A (zh) * | 2024-08-26 | 2024-09-24 | 吉林大学第一医院 | 一种基于知识图谱的急诊分诊问答系统 |
| CN119670869A (zh) * | 2025-02-21 | 2025-03-21 | 北京融威众邦科技股份有限公司 | 一种医疗问答大模型的语料库构建方法及系统 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN113742454B (zh) | 2023-07-21 |
| CN113742454A (zh) | 2021-12-03 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2023035623A1 (fr) | | Procédé de génération de corpus de réponses basé sur une intelligence artificielle et dispositif associé |
| CN113707300B (zh) | | 基于人工智能的搜索意图识别方法、装置、设备及介质 |
| CN110032648B (zh) | | 一种基于医学领域实体的病历结构化解析方法 |
| WO2021139424A1 (fr) | | Procédé, appareil et dispositif d'évaluation de la qualité d'un contenu textuel et support de stockage |
| CN117407541A (zh) | | 一种基于知识增强的知识图谱问答方法 |
| CN112447300A (zh) | | 基于图神经网络的医疗查询方法、装置、计算机设备及存储介质 |
| CN112035627B (zh) | | 自动问答方法、装置、设备及存储介质 |
| CN110569343B (zh) | | 一种基于问答的临床文本结构化方法 |
| CN111048167A (zh) | | 一种层级式病例结构化方法及系统 |
| CN113094478B (zh) | | 表情回复方法、装置、设备及存储介质 |
| CN110598786A (zh) | | 神经网络的训练方法、语义分类方法、语义分类装置 |
| CN113724830A (zh) | | 基于人工智能的用药风险检测方法及相关设备 |
| CN110322959A (zh) | | 一种基于知识的深度医疗问题路由方法及系统 |
| CN116882496A (zh) | | 一种多级逻辑推理的医学知识库构建方法 |
| WO2021174923A1 (fr) | | Support d'enregistrement, dispositif informatique, appareil et procédé de génération de séquence de mots conceptuels |
| CN111680501A (zh) | | 基于深度学习的问询信息识别方法、装置及存储介质 |
| CN116522944A (zh) | | 基于多头注意力的图片生成方法、装置、设备及介质 |
| CN108664464A (zh) | | 一种语义相关度的确定方法及确定装置 |
| CN116975212A (zh) | | 问题文本的答案查找方法、装置、计算机设备和存储介质 |
| CN113704481B (zh) | | 一种文本处理方法、装置、设备及存储介质 |
| CN111680515B (zh) | | 基于ai识别的答案确定方法、装置、电子设备及介质 |
| CN118503411B (zh) | | 提纲生成方法、模型训练方法、设备及介质 |
| CN109284491A (zh) | | 医学文本识别方法、语句识别模型训练方法 |
| CN115132372A (zh) | | 术语处理方法、装置、电子设备、存储介质及程序产品 |
| EP3901875A1 (fr) | | Modélisation de sujet de courtes enquêtes médicales |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22866101 Country of ref document: EP Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 22866101 Country of ref document: EP Kind code of ref document: A1 |