
CN118942736A - Intelligent pre-consultation method and apparatus, intelligent device, and storage medium

Info

Publication number
CN118942736A
Authority
CN
China
Prior art keywords
patient
interaction
basic information
intelligent
consultation
Legal status
Pending
Application number
CN202411030663.8A
Other languages
Chinese (zh)
Inventor
马飞
卓一瑶
施斯
董淳光
Current Assignee
Guangdong Provincial Laboratory Of Artificial Intelligence And Digital Economy Shenzhen
Original Assignee
Guangdong Provincial Laboratory Of Artificial Intelligence And Digital Economy Shenzhen
Application filed by Guangdong Provincial Laboratory Of Artificial Intelligence And Digital Economy Shenzhen filed Critical Guangdong Provincial Laboratory Of Artificial Intelligence And Digital Economy Shenzhen
Priority to CN202411030663.8A
Publication of CN118942736A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H80/00 ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The application is applicable to the technical field of artificial intelligence and provides an intelligent pre-consultation method and apparatus, an intelligent device, and a storage medium. The method comprises the following steps: acquiring basic information of a patient; generating a target digital person according to the basic information, wherein the target digital person is used for interacting with the patient according to an interaction strategy and guiding the patient to provide a condition description; carrying out semantic understanding and reasoning on the basic information and the condition description by using a first large language model to generate an initial pre-consultation suggestion; and combining the basic information and the initial pre-consultation suggestion with hospital administrative information by using a second large language model to generate a pre-consultation result for the patient. The application can provide personalized and professional pre-consultation services for patients, improve the efficiency and effectiveness of pre-consultation, and enhance the patient experience.

Description

Intelligent pre-consultation method and apparatus, intelligent device, and storage medium
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to an intelligent pre-consultation method and apparatus, an intelligent device, and a storage medium.
Background
As the population grows, medical resources are becoming increasingly scarce. In a conventional consultation, a doctor asks the patient about symptoms face to face, or diagnoses the patient through a remote video consultation. In either case the inquiry takes considerable effort, the process is long, and details are easily missed. The pre-consultation model was created to address this. In a modern consultation workflow, a pre-consultation conducted before the formal visit helps the doctor quickly understand the patient's condition in advance and make the relevant preparations and arrangements, improving the doctor's diagnosis and treatment efficiency as well as overall working efficiency.
Because patients vary widely as individuals, a one-size-fits-all pre-consultation mode is not very effective. In view of this, how to provide personalized, professional pre-consultation services for patients and improve both the efficiency and the effectiveness of pre-consultation is a problem that currently needs to be considered.
Disclosure of Invention
The embodiments of the present application provide an intelligent pre-consultation method and apparatus, an intelligent device, and a storage medium, which can provide personalized and professional pre-consultation services for patients, improve the efficiency and effectiveness of pre-consultation, and enhance the patient experience.
In a first aspect, an embodiment of the present application provides an intelligent pre-consultation method, where the intelligent pre-consultation method includes:
Acquiring basic information of a patient;
generating a target digital person according to the basic information, wherein the target digital person is used for interacting with the patient according to an interaction strategy and guiding the patient to provide a condition description;
carrying out semantic understanding and reasoning on the basic information and the condition description by using a first large language model to generate an initial pre-consultation suggestion; and
combining the basic information and the initial pre-consultation suggestion with hospital administrative information by using a second large language model to generate a pre-consultation result for the patient.
In a possible implementation manner of the first aspect, the basic information includes age and gender, and the step of acquiring the basic information of the patient includes:
acquiring a facial image of the patient;
inputting the facial image into a pre-trained multi-task convolutional neural network model, and identifying the age and gender of the patient by using the multi-task convolutional neural network model.
In a possible implementation manner of the first aspect, the step of carrying out semantic understanding and reasoning on the basic information and the condition description by using the first large language model to generate the initial pre-consultation suggestion includes:
filling the basic information and the condition description into a preset prompt template to obtain a prompt;
inputting the prompt into the first large language model, guiding the large language model to perform semantic understanding and reasoning on the basic information and the condition description, and generating the initial pre-consultation suggestion.
In a possible implementation manner of the first aspect, the step of generating the target digital person according to the basic information includes:
selecting, according to the basic information of the patient, an initial digital person model corresponding to the basic information from a predefined digital person model library;
determining an interaction style corresponding to the patient; and
generating a target digital person based on the interaction style and the initial digital person model.
In a possible implementation manner of the first aspect, the method further includes:
identifying an emotional state of the patient during the patient's interaction with the target digital person; and
adjusting, according to the emotional state, an interaction strategy by which the target digital person interacts with the patient.
In a possible implementation manner of the first aspect, the step of identifying an emotional state in the process of the patient interacting with the target digital person includes:
acquiring facial images of the patient and voice information of the patient during the interaction;
inputting the facial images of the patient during the interaction into a facial expression recognition model, and recognizing the facial emotion of the patient;
inputting the voice information of the patient during the interaction into a speech emotion recognition model, and recognizing the speech emotional state of the patient; and
fusing the facial emotion with the speech emotional state, and determining the emotional state of the patient.
In a possible implementation manner of the first aspect, the step of adjusting, according to the emotional state, the interaction strategy by which the target digital person interacts with the patient includes:
adjusting, according to the emotional state, at least one of the interaction style, interaction guidance sentences, interaction rhythm, and interaction animations of the target digital person.
In a second aspect, an embodiment of the present application provides an intelligent pre-consultation apparatus, including:
an information acquisition unit for acquiring basic information of a patient;
The intelligent interaction unit is used for generating a target digital person according to the basic information, wherein the target digital person is used for interacting with the patient according to an interaction strategy and guiding the patient to provide a condition description;
The initial suggestion generation unit is used for carrying out semantic understanding and reasoning on the basic information and the condition description by using a first large language model to generate an initial pre-consultation suggestion;
The pre-consultation result generation unit is used for combining the basic information and the initial pre-consultation suggestion with hospital administrative information by using a second large language model to generate a pre-consultation result for the patient.
In a third aspect, an embodiment of the present application provides an intelligent device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the intelligent pre-consultation method according to the first aspect when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the intelligent pre-consultation method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product which, when run on an intelligent device, causes the intelligent device to perform the intelligent pre-consultation method according to the first aspect.
According to the embodiments of the present application, the intelligent device generates a target digital person from the patient's basic information and uses it for personalized interaction that guides the patient to provide a condition description, which helps the patient relax and improves the accuracy and effectiveness of the description. The first large language model then performs semantic understanding and reasoning on the basic information and the condition description to quickly generate a professional initial pre-consultation suggestion, improving the accuracy of the pre-consultation. Finally, the second large language model combines the basic information and the initial pre-consultation suggestion with hospital administrative information to generate a more professional and comprehensive pre-consultation result for the patient, improving the professionalism and effectiveness of the pre-consultation and enhancing the patient experience.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a pre-consultation system employing the intelligent pre-consultation method provided by an embodiment of the present application;
FIG. 2 is a flowchart of an implementation of the intelligent pre-consultation method provided by an embodiment of the present application;
FIG. 3 is a flowchart of a specific implementation of step S201 of the intelligent pre-consultation method provided by an embodiment of the present application;
FIG. 4 is a flowchart of a specific implementation of generating a target digital person in the intelligent pre-consultation method according to an embodiment of the present application;
FIG. 5 is a flowchart of a specific implementation of identifying the emotional state of a patient in the intelligent pre-consultation method according to an embodiment of the present application;
FIG. 6 is a block diagram of the intelligent pre-consultation apparatus according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an intelligent device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in the present specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when" or "upon" or "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as meaning "upon determining" or "in response to determining" or "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
It should be understood that the intelligent pre-consultation method provided by the embodiment of the application is applicable to various intelligent devices and servers which need to execute intelligent pre-consultation, and the intelligent devices can specifically include mobile phones, tablet computers, wearable devices, notebook computers, desktop computers and the like. The embodiment of the application does not limit the specific types of intelligent equipment and servers.
The following specifically describes an application scenario of the intelligent pre-inquiry method provided by the embodiment of the present application, which is specifically described as follows:
the intelligent pre-consultation method provided by the embodiment of the application is applied to a pre-consultation system.
Fig. 1 shows a schematic diagram of a pre-consultation system according to an embodiment of the present application. Referring to fig. 1, the pre-consultation system at least includes an intelligent device 10 and a remote server 11, wherein the intelligent device 10 includes a display device 101 and an acquisition device 102, and a processing control module 103 is further built in the intelligent device 10. The display device 101 is configured to display a target digital person, the acquisition device 102 is configured to acquire image information and voice information of a patient, the processing control module 103 may be communicatively connected to the remote server 11, and configured to execute the intelligent pre-consultation method provided by the embodiment of the present application, where the number of the acquisition devices 102 may be one or more. If the number of the collection devices 102 is one, the collection devices 102 may also be disposed on a central line of a plane where the display device 101 of the smart device 10 is located; if the number of the collection devices 102 is two or more, the collection devices 102 may be symmetrically distributed on the central line, and the specific placement mode of the collection devices 102 may be determined according to the actual situation, which is not limited herein. The remote server 11 may be a server of a hospital.
In some implementations, the acquisition device 102 may be a built-in device of the smart device 10, or may be an independent external device, such as a smart camera, connected to the smart device 10 through an external interface, and sends acquired information to the processing control module 103 in the smart device 10.
Fig. 2 shows an implementation flow of the intelligent pre-consultation method according to the embodiment of the present application, where the method flow includes steps S201 to S204. The specific implementation principle of each step is as follows:
Step S201: basic information of a patient is acquired.
The basic information is the patient's basic personal details, such as age and gender. In this embodiment, the basic information may be used to determine the personalized target digital person corresponding to the patient.
In one possible implementation, the intelligent device can acquire basic information actively provided by the patient by voice or text; because the information is provided by the patient personally, its accuracy and reliability can be ensured.
As a possible implementation manner of the present application, fig. 3 shows a specific implementation flow of step S201 of the intelligent pre-inquiry method provided by the embodiment of the present application, which is described in detail below:
a1: an image of a face of a patient is acquired. The camera of the smart device may be utilized to capture an image of the patient's face.
A2: inputting the facial image into a pre-trained multitasking convolutional neural network model, and identifying the age and sex of the patient by using the multitasking convolutional neural network model.
The intelligent device can acquire facial images of a patient through a camera, and after the facial images are acquired, gender identification and age prediction are simultaneously carried out by using a pretrained multitask convolutional neural network model. The multi-task convolutional neural network model includes a first task sub-model for gender identification and a second task sub-model for age prediction. Wherein gender identification is a two-classification problem, the first task sub-model may be trained using a cross entropy loss function; the second task sub-model may be combined with a deep learning (Deep Label Distribution Learning, DLDL) method to perform age prediction, improving prediction accuracy.
In this embodiment, the intelligent device performs recognition processing on the facial image by acquiring the facial image of the patient and using the multitask convolutional neural network model after pre-training, so as to determine the age and sex of the patient, without the active provision of the patient, so that the situation that the patient cannot or is inconvenient to actively provide can be avoided, and the age and sex of the patient can be automatically recognized, so that the pre-consultation service is more humanized and intelligent.
Step S202: and generating a target digital person according to the basic information, wherein the target digital person is used for interacting with the patient according to an interaction strategy and guiding the patient to provide a disease description.
The target digital person is a personalized digital person corresponding to basic information of a patient, and the interaction strategy is a personalized interaction strategy for the patient. In some embodiments, the interaction policy has a correspondence with the digital person, and the corresponding interaction policy can be determined according to the determined digital person. According to the embodiment, the target digital person corresponding to the patient is interacted with the patient based on the interaction strategy, so that the patient can be effectively guided to provide the illness state description.
In this embodiment, the target digital person may guide the patient to provide the patient's condition description by text, picture, voice, and video.
As a possible implementation manner of the present application, fig. 4 shows a specific implementation flow of generating a target digital person according to the basic information in the intelligent pre-consultation method provided by the embodiment of the present application, which is described in detail below:
B1: and selecting an initial digital human model corresponding to the basic information from a predefined digital human model library according to the basic information of the patient. The predefined digital person model library comprises predefined various initial digital person models, and the digital person images of the predefined various initial digital person models are different. Digital figures include cartoon, real, etc. For example, for pediatric patients, cartoon-like digital humans may be employed to increase affinity; for adult patients, digital people with real person simulation images can be adopted, so that the sense of expertise is enhanced.
B2: and determining the corresponding interaction style of the patient. The interaction style includes, but is not limited to, interaction speed, interaction phrase, and tone. In a possible implementation, the interaction style further comprises a dialect type.
In some embodiments, the corresponding interaction style of the patient may be determined directly from the age of the patient. The patient ages may be in different intervals and their corresponding interaction styles may be different. For example, for patients with age in childhood, a first interaction style may be determined in which the interaction language is slow, the expression is soft, the cartoon tone; for patients with age in young age, a second interaction style with faster interaction speed, serious expression and male tone can be determined; for patients in the elderly region, a third interaction style may be determined in which the interaction speech rate is slow, the wording is soft, and the female tone is soft.
In this embodiment, after detecting the first sentence of voice information of the patient, the intelligent device performs dialect type recognition analysis on the first sentence of voice information, determines the dialect type of the first sentence of voice information, and uses the dialect type as the dialect type of the interaction between the target digital person and the patient.
In yet other embodiments, the smart device display may provide a plurality of interaction style options, with the patient selecting the interaction style at his or her discretion, and determining the interaction style of the target digital person based on the patient's selection.
B3: and generating a target digital person based on the interaction style and the initial digital person model.
In this embodiment, an initial digital person model is determined according to basic information of a patient, that is, an image of a target digital person is determined, and then, according to an interaction style selected by the patient, an interaction speech speed, an interaction phrase, a tone or even a dialect type and the like of the target digital person in an interaction process with the patient are determined. The gender and age of the patient are different, the corresponding image of the target digital person and the interaction style in the interaction process are different, the patient is guided to interact by the personalized digital person special for the patient, the patient is guided to provide the illness state description, the interaction efficiency is improved, and the effectiveness of the illness state description is improved.
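As a hedged sketch of this age-to-style mapping, the snippet below hard-codes illustrative thresholds and style fields; none of these values come from this application.

```python
from dataclasses import dataclass

@dataclass
class InteractionStyle:
    speech_rate: str   # e.g. "slow" / "fast"
    wording: str       # e.g. "soft" / "serious"
    timbre: str        # e.g. "cartoon" / "male" / "female"

def select_interaction_style(age: int) -> InteractionStyle:
    """Map an age interval to an interaction style (assumed intervals)."""
    if age < 12:   # childhood: slow speech, soft wording, cartoon timbre
        return InteractionStyle("slow", "soft", "cartoon")
    if age < 60:   # young/middle-aged: faster speech, serious manner, male timbre
        return InteractionStyle("fast", "serious", "male")
    return InteractionStyle("slow", "soft", "female")  # elderly
```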
In a possible implementation, the intelligent device further identifies the patient's emotional state during the interaction with the target digital person and adjusts, according to that emotional state, the interaction strategy by which the target digital person interacts with the patient. The interaction strategy includes interaction guidance sentences, an interaction rhythm, and interaction animations.
According to the emotional state, the intelligent device adjusts at least one of the interaction style, interaction guidance sentences, interaction rhythm, and interaction animations of the target digital person.
For example, when the patient shows negative emotions such as anxiety or tension, the target digital person can adjust the interaction strategy in time, changing at least one of the interaction style, interaction guidance sentences, interaction rhythm, and interaction animations used with the patient. The target digital person may switch to a gentler, more empathetic interaction style to create a safe, comfortable consultation atmosphere. It may slow the interaction rhythm, giving the patient sufficient time and space to express feelings and needs, and preventing emotional factors from degrading the quality of the information collected. It may also provide interaction guidance sentences containing targeted emotion-management advice, guiding the patient to regulate emotions through, for example, deep breathing or relaxation exercises, and helping the patient regain a calm state. The target digital person may further help relieve the patient's mood by playing interaction animations matched to the patient's emotional state. Through these adjustments of the interaction strategy, the target digital person not only supports and encourages the patient emotionally but also keeps the pre-consultation process running smoothly.
Conversely, when the patient's emotional state is stable and positive, the target digital person can adjust its interaction style to match the patient's current mood, using more relaxed and friendly language to create a better pre-consultation atmosphere and strengthen trust and communication with the patient. In some embodiments, provided the patient's comfort is ensured, the target digital person may also appropriately quicken the consultation rhythm to improve efficiency, making full use of the patient's active cooperation to collect the key condition description efficiently and accurately.
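The adjustment logic above can be sketched as a simple rule table; the emotion labels and strategy fields below are assumptions for illustration only.

```python
def adjust_strategy(strategy: dict, emotion: str) -> dict:
    """Adjust interaction style, guidance, rhythm and animation by emotion (illustrative)."""
    if emotion in {"anxious", "tense", "sad"}:
        # Negative emotions: soften the style, slow the rhythm, offer
        # emotion-management guidance and a calming animation.
        strategy.update(style="gentle", rhythm="slow",
                        guidance="suggest deep breathing / relaxation",
                        animation="calming")
    elif emotion in {"calm", "positive"}:
        # Stable, positive emotions: friendlier style, moderately faster rhythm.
        strategy.update(style="relaxed and friendly", rhythm="moderately faster")
    return strategy
```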
As a possible implementation manner of the present application, fig. 5 shows a specific implementation procedure of identifying an emotional state in a process of interaction between the patient and the target digital person in the intelligent pre-consultation method provided by the embodiment of the present application, which is described in detail below:
C1: and acquiring facial images of the patient and voice information of the patient in the interaction process.
C2: and inputting the facial image of the patient in the interaction process into a facial expression recognition model, and recognizing the facial emotion of the patient. Facial emotion may be determined from emotion category and emotion intensity.
And C3: and inputting the voice information of the patient in the interaction process to a voice emotion recognition model, and recognizing the voice emotion state of the patient.
And C4: and fusing the facial expression with the voice emotion state, and determining the emotion state of the patient.
In this embodiment, the emotional state is determined according to facial emotion and speech emotion. The facial emotion of the patient is identified by utilizing the facial expression identification model, the voice emotion state of the patient is identified by utilizing the voice emotion identification model, and then the facial emotion and the voice emotion state are fused, so that the emotion state of the patient can be accurately determined.
Wherein facial expression recognition models such as FER, emotionNet can be used to recognize the emotion type (such as happiness, surprise, sadness, aversion, etc.) and emotion intensity of the patient by analyzing facial muscle movements; the emotion recognition model can adopt EmoAudioNet, speech Emotion Recognition and the like, and judges the emotion state of the patient by analyzing the characteristics of rhythm, intensity, tone and the like of the voice. And fusing the facial expression and the voice analysis result to obtain the emotional state of the patient.
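A minimal late-fusion sketch of combining the two recognizers above follows; the emotion label set and the 0.6/0.4 fusion weights are assumptions, not values from this application.

```python
import numpy as np

EMOTIONS = ["happiness", "surprise", "sadness", "aversion", "anxiety", "neutral"]

def fuse_emotions(face_probs: np.ndarray, speech_probs: np.ndarray,
                  w_face: float = 0.6, w_speech: float = 0.4) -> str:
    """Weighted fusion of facial and speech emotion probabilities (illustrative)."""
    fused = w_face * face_probs + w_speech * speech_probs
    return EMOTIONS[int(np.argmax(fused))]
```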
In one possible embodiment, facial images and voice information of a patient are input into a multimodal emotion recognition model, and the emotional state of the patient is recognized using the multimodal emotion recognition model.
In this embodiment, the above multi-modal emotion recognition model is constructed and trained. The specific implementation process comprises the following steps:
1. Data preparation: collect and label an emotion dataset containing multimodal data such as facial expressions, speech, and text, ensuring that the dataset covers the different emotion categories with sufficient volume and diversity.
2. Modal feature extraction: design a separate feature extractor for each modality, where:
Facial expressions: facial expression features are extracted using a convolutional neural network (CNN) with an attention mechanism. Building on a conventional CNN, the attention mechanism lets the model adaptively focus on key facial regions and extract more discriminative features;
Speech: a deep convolutional neural network (DCNN) extracts acoustic and prosodic features of the speech, with residual connections and pooling layers added to the DCNN to strengthen its ability to extract speech emotion features;
Text: a pre-trained language model such as BERT extracts semantic and emotional features of the text, and an auxiliary emotion-feature loss is introduced in the fine-tuning stage to push the model toward more accurate emotion representations.
3. Modality-adaptive alignment: to mitigate the effects of distribution differences between modalities, a modality-adaptive alignment module is designed. An alignment method based on adversarial learning trains a modality discriminator against the feature extractors, aligning the modalities in a shared space and narrowing the distribution gap between them. A multiple-kernel-learning alignment strategy is also designed: a kernel function is learned to map the data of different modalities into a high-dimensional space, and the distances between modalities in that kernel space are minimized to achieve modality-adaptive alignment.
4. Cross-modal fusion: an innovative cross-modal fusion mechanism is designed to make full use of the complementary information of the modalities. This embodiment proposes a dynamic fusion strategy based on an attention mechanism: an attention network dynamically assigns a weight to each modality, so that the model adaptively adjusts each modality's contribution according to the input emotional state. A tensor fusion method is also introduced: the features of the different modalities are assembled into a higher-order tensor, and higher-order associations between modalities are mined through tensor decomposition with low-rank constraints, yielding a comprehensive yet compact emotion representation.
5. Modality Drop mechanism: a modality Drop mechanism is introduced to further improve the robustness and generalization ability of the model. During training, the features of some modalities are randomly discarded, forcing the model to learn to perform emotion inference from the information of the remaining modalities. This mechanism mitigates overfitting and improves performance when modalities are missing.
In this embodiment, the multimodal emotion recognition model combines multimodal domain adaptation with cross-modal fusion. In the modality-separated representation learning stage, the model analyzes the modality-invariant and modality-specific subspace features of the different modalities and explores the correlations between them. The model must also address the distribution gaps and information redundancy between modalities: domain-adaptation methods such as adversarial learning and multiple-kernel learning mitigate the effects of cross-modal data distribution differences and improve the model's generalization ability. In the cross-modal fusion stage, effective fusion mechanisms such as attention and tensor fusion are designed so that the emotion features extracted from each modality are fully used, information redundancy is eliminated as far as possible, and a comprehensive, accurate emotion representation is formed.
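As an illustration of the attention-based dynamic fusion and the modality Drop mechanism described above, the following PyTorch sketch operates on pre-extracted per-modality features; the dimensions and drop probability are assumptions.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Dynamic attention weighting over modality features, with modality Drop."""
    def __init__(self, dim: int = 128, p_drop: float = 0.2):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one attention score per modality
        self.p_drop = p_drop

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, n_modalities, dim), e.g. face, speech, text features.
        if self.training:
            # Modality Drop: randomly zero out whole modalities so the model
            # learns to infer emotion from the remaining ones.
            keep = (torch.rand(feats.size(0), feats.size(1), 1,
                               device=feats.device) > self.p_drop).float()
            feats = feats * keep
        # Attention network assigns a dynamic weight to each modality.
        weights = torch.softmax(self.score(feats), dim=1)
        return (weights * feats).sum(dim=1)  # fused emotion representation
```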
Step S203: and carrying out semantic understanding and reasoning on the basic information and the disease description by using a first large language model to generate an initial pre-consultation suggestion.
The first large language model is used for generating initial pre-consultation advice for a patient, is a large language model (LLM, large Language Model) used in the medical field, and the LLM is a language model based on a deep learning technology, and can process a large amount of text data and learn grammar, semantics and context information of a language from the language model. The first large language model may be a medical GPT.
As a possible implementation mode of the application, the basic information and the illness state description are filled into a preset prompt template to obtain a prompt, the prompt is input into the first large language model, the large language model is guided to perform semantic understanding and reasoning on the basic information and the illness state description, and an initial pre-inquiry suggestion is generated.
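A minimal sketch of this prompt-filling step follows; the template wording is an assumption and not the application's actual template.

```python
PROMPT_TEMPLATE = (
    "Patient basic information: {basic_info}\n"
    "Condition description: {condition}\n"
    "As a medical assistant, analyse the information above and give an "
    "initial pre-consultation suggestion."
)

def build_prompt(basic_info: str, condition: str) -> str:
    """Fill the preset prompt template with the patient's data."""
    return PROMPT_TEMPLATE.format(basic_info=basic_info, condition=condition)
```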
The first large language model may also solve medical health related problems for the patient based on the medical knowledge base.
Illustratively, the text corpus of the entire knowledge base is partitioned into blocks; each block is converted into a vector using an embedding model; all vectors are stored in a vector database; the patient's question/query is embedded into a vector using the same embedding model that embedded the knowledge base; a query is run against the vector database index with that vector, returning the top-k most similar vectors in the given embedding/latent space; the question/query and the retrieved text blocks are assembled into a prompt according to a template; and the prompt is passed to the large language model (LLM) with instructions to answer the given question using only the provided context. The LLM generates an answer based on the prompt and returns it to the patient.
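That retrieval flow can be sketched as below, with plain NumPy standing in for the vector database; embed and llm are assumed interfaces to an embedding model and a large language model, not real library calls.

```python
import numpy as np

def build_index(blocks: list[str], embed) -> np.ndarray:
    """Embed every knowledge-base block into one matrix (the 'vector database')."""
    return np.stack([embed(b) for b in blocks])

def answer(query: str, blocks: list[str], index: np.ndarray,
           embed, llm, k: int = 3) -> str:
    q = embed(query)  # same embedding model as used for the knowledge base
    # Cosine similarity between the query and every stored block vector.
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    top_k = np.argsort(-sims)[:k]  # indices of the k most similar blocks
    context = "\n".join(blocks[i] for i in top_k)
    prompt = (f"Context:\n{context}\n\nQuestion: {query}\n"
              "Answer using only the provided context.")
    return llm(prompt)
```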
In this embodiment, the model learns the language expression and knowledge representation of the medical field by pre-training on large-scale, multi-source heterogeneous data such as medical literature and electronic medical records, and is then fine-tuned for specific tasks such as disease diagnosis and medical question answering. The large language model can thus infer the diseases a patient may have from the condition description the patient provides and, through knowledge reasoning and semantic understanding, give a disease classification, symptom identification, and the like, generating an initial pre-consultation suggestion. This improves not only the efficiency of the pre-consultation but also its effectiveness.
Step S204: and combining the basic information and the initial pre-consultation advice with hospital government information by using a second large language model to generate a pre-consultation result of the patient.
The second large language model is used to generate pre-consultation results for the patient. The pre-consultation results include advice of medical visits and pre-consultation reports. The second large language model differs from the first large language model described above in that the second large language model is a large language model focusing on knowledge learning and application in hospital management and medical procedures. In this embodiment, a proprietary data security customized differentiated model may be used to construct an internal knowledge base of each consulting room and doctor in the hospital using the RAG technology. Through the pre-training of the hospital administrative files, the medical guide and other data, the second large language model can master the relevant knowledge of the operation of the hospital, in the fine tuning process of the medical advice task, the second large language model takes the basic information of a patient, the preliminary pre-consultation result and the like as input, and combines the factors of department setting, doctor expertise and the like of the hospital to generate the pre-consultation result of the patient, wherein the pre-consultation result comprises personalized medical advice, such as recommended medical departments, specialists, examination/treatment schemes and the like, so as to intelligently guide the follow-up medical action of the patient, the pre-consultation result also comprises a pre-consultation report, and the pre-consultation report can be generated according to the medical advice, thereby facilitating the main doctor of the patient to know the actual condition of the patient in advance before treating the patient. The second largest language model may be a hospital government GPT.
In this embodiment, after basic information of a patient and descriptions of various types of conditions such as text, pictures, voice and video provided by the patient are comprehensively analyzed, an initial pre-consultation suggestion is given, and then the basic information of the patient and the initial pre-consultation suggestion are combined to generate a doctor-seeing suggestion, and a pre-consultation report is generated for a main doctor of the patient. The pre-diagnosis report integrates analysis results of the medical GPT and the hospital government GPT, automatically generates a structured and visual pre-diagnosis report, comprehensively presents basic information, complaints, related medical history and preliminary pre-consultation results of patients, provides multidimensional correlation analysis of diseases, assists doctors in quickly knowing the conditions of the patients, perfects diagnosis ideas and formulates personalized diagnosis and treatment schemes, thereby improving diagnosis efficiency and accuracy.
According to the embodiments of the present application, the intelligent device generates a target digital person from the patient's basic information and uses it for personalized interaction that guides the patient to provide a condition description, which helps the patient relax and improves the accuracy and effectiveness of the description. The first large language model then performs semantic understanding and reasoning on the basic information and the condition description to quickly generate a professional initial pre-consultation suggestion, improving the accuracy of the pre-consultation. Finally, the second large language model combines the basic information and the initial pre-consultation suggestion with hospital administrative information to generate a more professional and comprehensive pre-consultation result for the patient, improving the professionalism and effectiveness of the pre-consultation and enhancing the patient experience.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
Corresponding to the intelligent pre-consultation method described in the above embodiments, fig. 6 shows a block diagram of the intelligent pre-consultation device provided in the embodiment of the present application, and for convenience of explanation, only the parts related to the embodiment of the present application are shown.
Referring to fig. 6, the intelligent pre-consultation apparatus includes: an information acquisition unit 61, an intelligent interaction unit 62, an initial advice generation unit 63, a pre-consultation result generation unit 64, wherein:
An information acquisition unit 61 for acquiring basic information of a patient;
an intelligent interaction unit 62, configured to generate a target digital person according to the basic information, where the target digital person is used to interact with the patient according to an interaction strategy and guide the patient to provide a condition description;
an initial suggestion generation unit 63, configured to perform semantic understanding and reasoning on the basic information and the condition description by using a first large language model, to generate an initial pre-consultation suggestion; and
a pre-consultation result generation unit 64, configured to combine the basic information and the initial pre-consultation suggestion with hospital administrative information by using a second large language model, to generate a pre-consultation result for the patient.
As a possible embodiment of the present application, the above-described information acquisition unit 61 includes:
the image acquisition module is used for acquiring a facial image of a patient;
And the age and gender identification module is used for inputting the facial image into a pre-trained multi-task convolutional neural network model, and identifying the age and gender of the patient by using the multi-task convolutional neural network model.
As a possible embodiment of the present application, the initial suggestion generation unit 63 described above includes:
The prompt generation module is used for filling the basic information and the condition description into a preset prompt template to obtain a prompt;
the initial suggestion generation module is used for inputting the prompt into the first large language model, guiding the large language model to perform semantic understanding and reasoning on the basic information and the condition description, and generating an initial pre-consultation suggestion.
As a possible embodiment of the present application, the intelligent interaction unit 62 includes:
The initial model selection module is used for selecting an initial digital person model corresponding to the basic information from a predefined digital person model library according to the basic information of the patient;
the style acquisition module is used for determining the interaction style corresponding to the patient;
and the digital person generation module is used for generating a target digital person based on the interaction style and the initial digital person model.
As a possible implementation manner of the present application, the intelligent pre-consultation apparatus further includes:
an emotional state identification unit for identifying an emotional state of the patient in the process of interacting with the target digital person;
and the interaction strategy adjustment unit is used for adjusting, according to the emotional state, the interaction strategy by which the target digital person interacts with the patient.
As a possible embodiment of the present application, the above-mentioned emotional state recognition unit includes:
The interaction information acquisition module is used for acquiring the facial images of the patient and the voice information of the patient during the interaction;
the expression recognition module is used for inputting the facial images of the patient during the interaction into a facial expression recognition model and recognizing the facial emotion of the patient;
the emotion recognition module is used for inputting the voice information of the patient during the interaction into the speech emotion recognition model and recognizing the speech emotional state of the patient;
and the emotional state determination module is used for fusing the facial emotion with the speech emotional state and determining the emotional state of the patient.
As a possible implementation manner of the present application, the interaction policy adjustment unit is specifically configured to:
adjusting, according to the emotional state, at least one of the interaction style, interaction guidance sentences, interaction rhythm, and interaction animations of the target digital person.
According to the embodiments of the present application, the intelligent device generates a target digital person from the patient's basic information and uses it for personalized interaction that guides the patient to provide a condition description, which helps the patient relax and improves the accuracy and effectiveness of the description. The first large language model then performs semantic understanding and reasoning on the basic information and the condition description to quickly generate a professional initial pre-consultation suggestion, improving the accuracy of the pre-consultation. Finally, the second large language model combines the basic information and the initial pre-consultation suggestion with hospital administrative information to generate a more professional and comprehensive pre-consultation result for the patient, improving the professionalism and effectiveness of the pre-consultation and enhancing the patient experience.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the intelligent pre-consultation methods shown in fig. 2 to 5.
The embodiments of the present application also provide an intelligent device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of any of the intelligent pre-consultation methods shown in fig. 2 to 5 when executing the computer program.
Embodiments of the present application also provide a computer program product which, when run on a server, causes the server to perform the steps of any of the intelligent pre-consultation methods shown in fig. 2 to 5.
Fig. 7 is a schematic diagram of an intelligent device according to an embodiment of the present application. As shown in fig. 7, the smart device 7 of this embodiment includes: a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and executable on the processor 70. The processor 70, when executing the computer program 72, implements the steps of the intelligent pre-consultation method embodiments described above, such as steps S201 to S204 of fig. 2. Alternatively, the processor 70, when executing the computer program 72, performs the functions of the modules/units of the apparatus embodiments described above, such as the functions of the units 61 to 64 shown in fig. 6.
By way of example, the computer program 72 may be partitioned into one or more modules/units, which are stored in the memory 71 and executed by the processor 70 to complete the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program 72 in the smart device 7.
The smart device 7 may include, but is not limited to, a processor 70, a memory 71. It will be appreciated by those skilled in the art that fig. 7 is merely an example of the smart device 7 and is not meant to be limiting of the smart device 7, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., the smart device 7 may also include input-output devices, network access devices, buses, etc.
The processor 70 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 71 may be an internal storage unit of the smart device 7, such as a hard disk or memory of the smart device 7. The memory 71 may also be an external storage device of the smart device 7, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the smart device 7. Further, the memory 71 may include both an internal storage unit and an external storage device of the smart device 7. The memory 71 is used to store the computer program and other programs and data required by the smart device, and may also be used to temporarily store data that has been output or is to be output.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the present application may implement all or part of the flow of the methods of the above embodiments by instructing the relevant hardware through a computer program; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to an apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In some jurisdictions, in accordance with legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunication signals.
Each of the foregoing embodiments is described with its own emphasis; for parts that are not detailed or recorded in a particular embodiment, reference may be made to the related descriptions of other embodiments.
The above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application and are intended to be included in the scope of the present application.

Claims (10)

1. An intelligent pre-diagnosis method, characterized in that the method comprises: obtaining basic information of a patient; generating a target digital human according to the basic information, the target digital human being used to interact with the patient according to an interaction strategy and guide the patient to provide a condition description; performing semantic understanding and reasoning on the basic information and the condition description by using a first large language model to generate an initial pre-diagnosis suggestion; and combining the basic information and the initial pre-diagnosis suggestion with hospital administrative information by using a second large language model to generate a pre-diagnosis result for the patient.

2. The intelligent pre-diagnosis method according to claim 1, characterized in that the basic information includes age and gender, and the step of obtaining the basic information of the patient comprises: obtaining a facial image of the patient; and inputting the facial image into a pre-trained multi-task convolutional neural network model, and identifying the age and gender of the patient by using the multi-task convolutional neural network model.

3. The intelligent pre-diagnosis method according to claim 1, characterized in that the step of performing semantic understanding and reasoning on the basic information and the condition description by using the first large language model to generate the initial pre-diagnosis suggestion comprises: filling the basic information and the condition description into a preset prompt template to obtain a prompt; and inputting the prompt into the first large language model to guide the large language model to perform semantic understanding and reasoning on the basic information and the condition description and generate the initial pre-diagnosis suggestion.

4. The intelligent pre-diagnosis method according to any one of claims 1 to 3, characterized in that the step of generating the target digital human according to the basic information comprises: selecting, from a predefined digital human model library according to the basic information of the patient, an initial digital human model corresponding to the basic information; determining an interaction style corresponding to the patient; and generating the target digital human based on the interaction style and the initial digital human model.

5. The intelligent pre-diagnosis method according to claim 1, characterized in that the method further comprises: identifying an emotional state of the patient during interaction with the target digital human; and adjusting, according to the emotional state, the interaction strategy by which the target digital human interacts with the patient.

6. The intelligent pre-diagnosis method according to claim 5, characterized in that the step of identifying the emotional state of the patient during interaction with the target digital human comprises: obtaining a facial image of the patient and voice information of the patient during the interaction; inputting the facial image of the patient during the interaction into a facial expression recognition model to recognize a facial emotion of the patient; inputting the voice information of the patient during the interaction into a speech emotion recognition model to recognize a speech emotional state of the patient; and fusing the facial emotion with the speech emotional state to determine the emotional state of the patient.

7. The intelligent pre-diagnosis method according to claim 5, characterized in that the step of adjusting, according to the emotional state, the interaction strategy by which the target digital human interacts with the patient comprises: adjusting, according to the emotional state, at least one of an interaction style, interaction guidance sentences, an interaction rhythm, and an interaction animation of the target digital human.

8. An intelligent pre-diagnosis apparatus, characterized in that the apparatus comprises: an information obtaining unit, configured to obtain basic information of a patient; an intelligent interaction unit, configured to generate a target digital human according to the basic information, the target digital human being used to interact with the patient according to an interaction strategy and guide the patient to provide a condition description; an initial suggestion generation unit, configured to perform semantic understanding and reasoning on the basic information and the condition description by using a first large language model to generate an initial pre-diagnosis suggestion; and a pre-diagnosis result generation unit, configured to combine the basic information and the initial pre-diagnosis suggestion with hospital administrative information by using a second large language model to generate a pre-diagnosis result for the patient.

9. A smart device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the intelligent pre-diagnosis method according to any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the intelligent pre-diagnosis method according to any one of claims 1 to 7.
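Claims 1 and 3 together describe a two-stage large-language-model chain: the basic information and condition description are filled into a preset prompt template for the first model, and the first model's initial suggestion is then combined with hospital administrative information by the second model. The following is a minimal Python sketch of that flow, not the patent's implementation; the template wording, the `pre_diagnose` helper, and the stub model are all illustrative assumptions, and `first_llm`/`second_llm` stand for any callables wrapping real chat endpoints.

```python
# Illustrative sketch of the two-stage pipeline in claims 1 and 3.
# Assumption: first_llm and second_llm are callables mapping a prompt
# string to a completion string (any chat API could back them).

PROMPT_TEMPLATE = (
    "Patient basic information: {basic_info}\n"
    "Condition description: {description}\n"
    "Perform semantic understanding and reasoning on the above and "
    "give an initial pre-diagnosis suggestion."
)

def pre_diagnose(basic_info: str, description: str, hospital_info: str,
                 first_llm, second_llm) -> str:
    # Claim 3: fill the preset prompt template to obtain the prompt.
    prompt = PROMPT_TEMPLATE.format(basic_info=basic_info,
                                    description=description)
    # Claim 1: the first large language model generates the initial
    # pre-diagnosis suggestion.
    initial_suggestion = first_llm(prompt)
    # Claim 1: the second large language model combines the basic
    # information and the initial suggestion with hospital
    # administrative information to produce the pre-diagnosis result.
    final_prompt = (
        f"Basic information: {basic_info}\n"
        f"Initial suggestion: {initial_suggestion}\n"
        f"Hospital administrative information: {hospital_info}\n"
        "Combine the above into a pre-diagnosis result telling the "
        "patient which department to register with and how."
    )
    return second_llm(final_prompt)

# Demo with a stub standing in for real model endpoints:
echo = lambda p: "[model output for]\n" + p
print(pre_diagnose("female, 34", "persistent cough for two weeks",
                   "respiratory clinic, floor 3, register at window 2",
                   first_llm=echo, second_llm=echo))
```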
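Claim 2 obtains age and gender by passing the patient's facial image through a pre-trained multi-task convolutional neural network. The patent does not disclose the network architecture; the toy PyTorch sketch below only illustrates the multi-task idea, a shared convolutional backbone with one regression head for age and one classification head for gender, and the layer sizes and 64x64 input are assumptions.

```python
import torch
import torch.nn as nn

class AgeGenderNet(nn.Module):
    """Toy multi-task CNN: shared backbone, age and gender heads."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.age_head = nn.Linear(64, 1)     # age as a regression target
        self.gender_head = nn.Linear(64, 2)  # logits for two classes

    def forward(self, x):
        features = self.backbone(x)
        return self.age_head(features), self.gender_head(features)

# One 64x64 RGB face crop in, (age, gender logits) out:
model = AgeGenderNet()
age, gender_logits = model(torch.randn(1, 3, 64, 64))
```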
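Claims 6 and 7 first fuse the facial emotion with the speech emotional state and then adjust the interaction strategy accordingly. The patent does not specify the fusion rule; a common realization is late fusion of the two recognizers' probability distributions, sketched below under the assumptions of a four-label emotion set, a 0.6/0.4 weighting, and a made-up strategy table.

```python
import numpy as np

# Hypothetical label set; the patent does not enumerate the emotions.
EMOTIONS = ["neutral", "anxious", "sad", "angry"]

def fuse_emotions(face_probs, speech_probs, face_weight=0.6):
    """Claim 6: fuse the facial emotion with the speech emotional state,
    here as a weighted average of the two models' probability
    distributions (the 0.6/0.4 weighting is an assumption)."""
    fused = (face_weight * np.asarray(face_probs)
             + (1.0 - face_weight) * np.asarray(speech_probs))
    return EMOTIONS[int(np.argmax(fused))]

# Claim 7: adjust at least one of style, guidance sentences, rhythm,
# and animation according to the emotional state (values are made up).
STRATEGY_BY_EMOTION = {
    "neutral": {"style": "professional", "rhythm": "normal"},
    "anxious": {"style": "reassuring", "rhythm": "slow",
                "guidance": "short, calming questions"},
    "sad":     {"style": "empathetic", "rhythm": "slow"},
    "angry":   {"style": "de-escalating", "rhythm": "slow",
                "animation": "calm, neutral gestures"},
}

# Face model says mostly anxious, speech model mostly neutral:
state = fuse_emotions([0.1, 0.7, 0.1, 0.1], [0.6, 0.2, 0.1, 0.1])
print(state, STRATEGY_BY_EMOTION[state])  # -> anxious {...}
```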
CN202411030663.8A 2024-07-29 2024-07-29 Intelligent pre-diagnosis method, device, intelligent device and storage medium Pending CN118942736A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411030663.8A CN118942736A (en) 2024-07-29 2024-07-29 Intelligent pre-diagnosis method, device, intelligent device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411030663.8A CN118942736A (en) 2024-07-29 2024-07-29 Intelligent pre-diagnosis method, device, intelligent device and storage medium

Publications (1)

Publication Number Publication Date
CN118942736A true CN118942736A (en) 2024-11-12

Family

ID=93347494

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411030663.8A Pending CN118942736A (en) 2024-07-29 2024-07-29 Intelligent pre-diagnosis method, device, intelligent device and storage medium

Country Status (1)

Country Link
CN (1) CN118942736A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119167308A (en) * 2024-11-18 2024-12-20 中南大学湘雅二医院 Psychological consultation robot based on multimodal data and vertical large model
CN119167308B (en) * 2024-11-18 2025-03-14 中南大学湘雅二医院 Psychological consultation robot based on multimodal data and vertical large model
CN119785996A (en) * 2025-03-10 2025-04-08 洛阳瀚纳生物科技有限公司 An artificial intelligence-based emergency patient triage system
CN120067279A (en) * 2025-04-30 2025-05-30 依脉人工智能医疗科技(天津)有限公司 Intelligent traditional Chinese medicine inquiry method and system based on AI large language model
CN120067279B (en) * 2025-04-30 2025-06-27 依脉人工智能医疗科技(天津)有限公司 Intelligent traditional Chinese medicine inquiry method and system based on AI large language model
CN120544955A (en) * 2025-07-28 2025-08-26 厦门理工学院 Doctor-patient negotiation method, device, equipment and medium integrating LLM and multi-agent architecture
CN120544955B (en) * 2025-07-28 2025-09-30 厦门理工学院 Doctor-patient negotiation method, device, equipment and medium integrating LLM and multi-agent architecture

Similar Documents

Publication Publication Date Title
US11942194B2 (en) Systems and methods for mental health assessment
JP7608171B2 (en) Systems and methods for mental health assessment
JP7199451B2 (en) Emotional interaction system, device and method based on emotional computing user interface
Sharma et al. A survey on automatic multimodal emotion recognition in the wild
CN117056536B (en) Knowledge graph driving-based virtual doctor system and operation method thereof
CN118942736A (en) Intelligent pre-diagnosis method, device, intelligent device and storage medium
US20240428052A1 (en) Methods and systems for training medical machine-learning models
Alhussein et al. Novel speech-based emotion climate recognition in peers’ conversations incorporating affect dynamics and temporal convolutional neural networks
Rai et al. Multimodal mental state analysis
CN118737429A (en) Alzheimer's disease early screening method, device, computer equipment and storage medium
Al-Azani et al. A review and critical analysis of multimodal datasets for emotional AI
CN120032791A (en) A method and device for generating an emotional interaction regulation strategy for rehabilitation training
CN119418855A (en) A generative artificial intelligence-assisted method and system for low back pain rehabilitation
González-González et al. BAH Dataset for Ambivalence/Hesitancy Recognition in Videos for Behavioural Change
WO2024211550A1 (en) Medical condition visual search
Mustafa Development of Emotion Stress Relief Recommender Tool Using Machine Learning
Pu et al. Design of smart elderly care emotion recognition and elderly care system based on transfer learning
US20250226066A1 (en) Systems and methods for mental health assessment
Mahanty et al. Innovative Approaches in Early Detection of Depression: Leveraging Facial Image Analysis and Real-Time Chabot Interventions
Li et al. Unsupervised speech representation learning for behavior modeling using triplet enhanced contextualized networks
Monga et al. Sentiment Analysis and Well-Being for Assisted Living in Healthcare 5.0
Deschamps-Berger Social Emotion Recognition with multimodal deep learning architecture in emergency call centers
Jungum et al. Emotions in pervasive computing environments
CN120977481A (en) Child rare disease management method and system based on artificial intelligence and electronic equipment
Jobayer et al. Real-time facial expression recognition with Bengali audio feedback: bridging communication gaps

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination