
WO2024187925A1 - Question answering system model training method, and sample generation method - Google Patents

Question answering system model training method, and sample generation method

Info

Publication number
WO2024187925A1
WO2024187925A1 (PCT/CN2024/070737, CN2024070737W)
Authority
WO
WIPO (PCT)
Prior art keywords
question
entity
sample
target
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2024/070737
Other languages
French (fr)
Chinese (zh)
Inventor
夏志超
马超
肖冰
夏粉
蒋宁
吴海英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Consumer Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202310247092.2A external-priority patent/CN116306974B/en
Priority claimed from CN202310247102.2A external-priority patent/CN118674059A/en
Application filed by Mashang Consumer Finance Co Ltd filed Critical Mashang Consumer Finance Co Ltd
Publication of WO2024187925A1 publication Critical patent/WO2024187925A1/en
Anticipated expiration legal-status Critical
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • the present application relates to the field of knowledge graph technology, and in particular to a model training method, sample generation method, device, electronic device and storage medium for a question-answering system.
  • With the development of Internet technology, the application of question-answering systems has become more and more popular. Automatically answering questions raised by users through question-answering systems can improve answering efficiency and save human resources.
  • the questions raised by users may be colloquial, making the sentence structure rich and diverse, which increases the difficulty of intent recognition in question-answering systems.
  • question-answering systems involving knowledge graphs are often highly customized, and the amount of sample data required for model training of such question-answering systems is very large. If training samples are manually configured, the efficiency is low and it is difficult to meet the model training requirements.
  • the present application provides a model training method, sample generation method, device, electronic device and storage medium for a question-answering system to improve the intent recognition capability and sample generation efficiency of the question-answering system.
  • the present application provides a model training method for a question-answering system, wherein the question-answering system includes an initial question parsing model; the method includes: obtaining a question text sample; inputting the question text sample into the initial question parsing model for iterative training to obtain a question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each of the initial intention vectors according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.
  • the present application provides an answering method, including: obtaining a target question to be answered; inputting the target question into a question parsing model for parsing and processing to obtain a corresponding target fragment; the question parsing model is obtained by training with the model training method of the question-answering system as described above; and determining the answer to the target question based on the target fragment.
  • the present application provides a model training device for a question-answering system, the question-answering system including an initial question parsing model; the device including: a first acquisition unit, used to acquire a question text sample; a first training unit, used to input the question text sample into the initial question parsing model for iterative training to obtain a question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each of the initial intention vectors according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.
  • the present application provides an answering device, including: a second acquisition unit, used to acquire a target question to be answered; a first parsing unit, used to input the target question into a question parsing model for parsing and processing to obtain a corresponding target fragment; the question parsing model is obtained by training with the model training method of the above-mentioned question-answering system; a first determination unit, used to determine the answer to the target question based on the target fragment.
  • the present application provides a sample generation method for a question-answering system, the method comprising: obtaining a pre-configured synonym set, a similar question set, a comparison word set, a problem domain set, and a standard entity library;
  • the synonym set comprises a plurality of standard words and at least one synonym corresponding to each of the standard words;
  • the similar question set comprises a plurality of first attributes and at least one similar question sentence corresponding to each of the first attributes;
  • the comparison word set comprises comparison word information;
  • the problem domain set comprises in-domain question text and out-of-domain question text; the standard entity library comprises a plurality of standard entities; generating a question parsing sample according to at least one of the synonym set, the similar question set and the comparative word set; generating a question classification sample according to the question parsing sample and the problem domain set; generating the entity linking sample according to the question parsing sample and the standard entity library; and constructing a training data set for the question answering system according to the question classification sample, the question parsing sample and the entity linking sample.
  • the present application provides a model training method for a question-answering system, comprising: generating a training data set by a sample generation method for the question-answering system as described above; the training data set comprises the question classification samples, the question parsing samples and the entity linking samples; inputting the question classification samples into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; inputting the question parsing samples into an initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; inputting the entity linking samples into an initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.
  • the present application provides an answering method, including: obtaining a target question to be answered; inputting the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training; the training data set is generated by the sample generation method of the above-mentioned question-answering system; when the classification result is used to characterize that the target question belongs to a first preset classification, inputting the target question into a question parsing model for parsing processing to obtain a corresponding target fragment; the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; inputting the target fragment into an entity linking model for prediction processing to obtain a corresponding target entity; the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; and determining the answer to the target question based on the target entity.
  • the present application provides a sample generation device for a question-answering system, the device comprising: a third acquisition unit, used to acquire a pre-configured synonym set, a similar question set, a comparative word set, a problem domain set and a standard entity library;
  • the synonym set comprises a plurality of standard words and at least one synonym corresponding to each of the standard words;
  • the similar question set comprises a plurality of first attributes and at least one similar question sentence corresponding to each of the first attributes;
  • the comparative word set comprises comparative word information;
  • the problem domain set comprises in-domain question text and out-of-domain question text;
  • the standard entity library comprises a plurality of standard entities;
  • a first generation unit, used to generate a question parsing sample according to at least one of the synonym set, the similar question set and the comparative word set; a second generation unit, used to generate a question classification sample according to the question parsing sample and the problem domain set, and to generate the entity linking sample according to the question parsing sample and the standard entity library; a construction unit, used to construct a training data set for the question answering system according to the question classification sample, the question parsing sample and the entity linking sample.
  • the present application provides a model training device for a question-answering system, comprising: a third generation unit, used to generate a training data set through a sample generation method for the question-answering system as described above; the training data set includes the question classification samples, the question parsing samples and the entity linking samples; a second training unit, used to input the question classification samples into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; input the question parsing samples into the initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; input the entity linking samples into the initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.
  • the present application provides an answering device, including: a fourth acquisition unit, used to acquire a target question to be answered; a classification unit, used to input the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training; the training data set is generated by the sample generation method of the above-mentioned question-answering system; a second parsing unit, used to input the target question into a question parsing model for parsing processing to obtain a corresponding target fragment when the classification result is used to characterize that the target question belongs to a first preset classification; the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; a prediction unit, used to input the target fragment into an entity linking model for prediction processing to obtain a corresponding target entity; the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; and a second determination unit, used to determine the answer to the target question based on the target entity.
  • the present application provides an electronic device, comprising: a processor; and a memory configured to store computer-executable instructions, which, when executed, cause the processor to execute a model training method for the question-answering system as described above, or, a response method as described above, or, a sample generation method for the question-answering system as described above.
  • the present application provides a computer-readable storage medium for storing computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, implement the model training method for the question-answering system as described above, or, the response method as described above, or, the sample generation method for the question-answering system as described above.
  • FIG1 is a processing flow chart of a model training method for a question-answering system provided in an embodiment of the present application
  • FIG2 is a data flow diagram of a question parsing model in a model training method for a question-answering system provided in an embodiment of the present application;
  • FIG3 is a data flow diagram of an entity linking model in a model training method for a question-answering system provided in an embodiment of the present application
  • FIG4 is a processing flow chart of a response method provided in an embodiment of the present application.
  • FIG5 is a flow chart of session management in a response method provided in an embodiment of the present application.
  • FIG6 is a schematic diagram of a model training device for a question-answering system provided in an embodiment of the present application.
  • FIG7 is a schematic diagram of a response device provided in an embodiment of the present application.
  • FIG8 is a processing flow chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • FIG9 is a schematic diagram of a framework of a question-answering system provided in an embodiment of the present application.
  • FIG10 is a first processing flow sub-chart of a sample generation method of a question-answering system provided in an embodiment of the present application
  • FIG11 is a second processing flow sub-chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • FIG12 is a third processing flow sub-chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • FIG13 is a fourth processing flow sub-chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • FIG14 is a fifth processing flow sub-chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • FIG15 is a sixth processing flow sub-chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • FIG16 is a seventh processing flow sub-chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • FIG17 is an eighth processing flow sub-chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • FIG18 is a processing flow chart of another model training method for a question-answering system provided in an embodiment of the present application.
  • FIG19 is a processing flow chart of another response method provided in an embodiment of the present application.
  • FIG20 is a working principle diagram of a question-answering system provided in an embodiment of the present application.
  • FIG21 is a schematic diagram of a sample generation device of a question-answering system provided in an embodiment of the present application.
  • FIG22 is a schematic diagram of another model training device for a question-answering system provided in an embodiment of the present application.
  • FIG23 is a schematic diagram of another response device provided in an embodiment of the present application.
  • FIG24 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • the questions asked by users may be colloquial, resulting in rich and diverse sentence structures, which increases the difficulty for the question-answering system in identifying the intention; on the other hand, the information that users provide when asking questions may be one-sided, and answering based only on such insufficient information may fail to produce an answer that satisfies the user.
  • the embodiment of the present application provides a model training method for a question-answering system.
  • FIG1 is a processing flow chart of a model training method for a question-answering system provided in an embodiment of the present application.
  • the model training method for the question-answering system in FIG1 can be executed by an electronic device, which can be a terminal device, such as a mobile phone, a laptop computer, an intelligent interactive device, etc.; or, the electronic device can also be a server, such as an independent physical server, a server cluster, or a cloud server capable of providing cloud computing services.
  • the model training method of the question answering system provided in this embodiment specifically includes steps S102 to S104.
  • the question answering system may include an initial question parsing model, an initial question classification model, and an initial entity linking model.
  • the initial question classification model can be an untrained OOD (Out-Of-Domain) model.
  • the question domain can be pre-configured. Questions within the question domain are called in-domain questions, and questions outside the question domain are called out-of-domain questions.
  • the question-answering system has the ability to answer in-domain questions, but does not have the ability to answer out-of-domain questions.
  • the OOD model can be used to classify the received questions, and the classification results include at least: in-domain questions and out-of-domain questions.
  • the initial question parsing model may be an untrained relation extraction model.
  • the relation extraction model may be used to perform slot recognition processing to obtain slot recognition results, which include but are not limited to: entities, attributes, relations, constraints, and the like.
  • Entities can be names of people, places, organizations, pre-set proper nouns, etc.
  • Concepts, also known as classes, are abstract descriptions of a collection of objects with the same characteristics. Concepts can be used to reflect categories. A concept can correspond to multiple entities.
  • for example, the concept "plant" can correspond to the following entities: "willow", "cactus", "cherry blossom", and so on.
  • Attributes can be used to reflect the characteristics of an entity.
  • a concept can correspond to multiple attributes.
  • for example, the concept "plant" can correspond to the following attributes: "name", "type", "shape", "growing environment", "distribution range", "propagation method", and so on.
  • attribute types include but are not limited to: text, number, picture, rich text, and JSON (JavaScript Object Notation).
  • Relationships are used to describe the connection between concepts, which can be divided into classification relationships and non-classification relationships. In practical applications, corresponding relationships can be customized according to specific fields and specific applications, such as "cause and effect".
  • a constraint can be a restriction condition. For example, in the question “What are the services with an annual interest rate less than x%?”, the slot recognition result of “annual interest rate less than x%” is a constraint.
  • the initial entity linking model can be an untrained Bert (Bidirectional Encoder Representations from Transformers) classification model.
  • the Bert classification model can be used to perform classification prediction processing, linking the entity fragment of the input model to an entity node in a pre-configured knowledge graph.
  • the knowledge graph includes multiple entity nodes, each of which corresponds to an entity.
  • the knowledge graph can be constructed by extracting knowledge through ontology and corpus to form triples, and turning business data into knowledge.
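  • As a rough illustration of the triple-based construction just described, the following sketch shows one way business data could be stored as entity nodes and triples and then queried; the class names, fields, and example facts are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass(frozen=True)
class Triple:
    head: str       # entity node, e.g. "willow"
    relation: str   # attribute or relationship, e.g. "propagation method"
    tail: str       # entity node or literal value

@dataclass
class KnowledgeGraph:
    triples: Set[Triple] = field(default_factory=set)

    def add(self, head: str, relation: str, tail: str) -> None:
        self.triples.add(Triple(head, relation, tail))

    def query(self, head: str, relation: str) -> List[str]:
        """Return all tail values matching a (head, relation) pattern."""
        return [t.tail for t in self.triples
                if t.head == head and t.relation == relation]

# Business data turned into knowledge, echoing the plant example above.
kg = KnowledgeGraph()
kg.add("willow", "type", "plant")
kg.add("willow", "propagation method", "cutting")
print(kg.query("willow", "propagation method"))  # ['cutting']
```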
  • Step S102: obtain a question text sample.
  • the question text sample may be a pre-generated question parsing sample.
  • Each question text sample may be a word, a word combination, an incomplete sentence or a complete sentence, and the like.
  • the synonym set includes multiple standard words and at least one synonym corresponding to each standard word
  • the similar question set includes multiple first attributes and at least one similar question sentence corresponding to each first attribute
  • the comparative word set includes comparative word information; generate a question parsing sample based on at least one of the synonym set, similar question set, and comparative word set.
  • the specific implementation method of generating question analysis samples in this embodiment involves the sample generation method of the question and answer system, and can refer to the specific scheme in the following method embodiment.
  • Step S104: input the question text sample into the initial question parsing model for iterative training to obtain the question parsing model;
  • the initial question parsing model includes a first encoding layer and a conversion layer;
  • the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector;
  • the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each initial intention vector according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment;
  • the text fragment is used to query the answer to the question text sample in the question-answering system.
  • the question parsing model obtained after iterative training also has a first encoding layer and a conversion layer.
  • the first encoding layer can be a Bert encoder or other components that can be used to convert text into sentence vectors.
  • the conversion layer can be composed of a spatial conversion layer and multiple binary classification fully connected layers.
  • the output of the first coding layer can be the input of the transformation layer.
  • the initial intention vector can be a vector with a fixed number of elements but each element is unknown. Each initial intention vector can be used to represent a corresponding intention space. After the filling process, some elements are known, and the remaining unknown elements can be filled by specifying numerical values to obtain the target intention vector corresponding to the initial intention vector.
  • a preset number of initial intention vectors are generated, and the preset number of initial intention vectors include the same number of elements.
  • the number of elements included in each initial intention vector may be a pre-configured fixed value. Specifically, when the first sentence vector is received, a preset number of initial intention vectors are generated; and the number of elements included in the initial intention vector is a preset value.
  • the number of elements included in each initial intention vector can also be determined based on the number of characters corresponding to the first sentence vector. Specifically, when the first sentence vector is received, a preset number of initial intention vectors are generated according to the first sentence vector; the number of elements included in the initial intention vector is determined based on the number of characters corresponding to the first sentence vector.
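  • A minimal sketch of the two generation strategies described above (a pre-configured fixed element count versus a count derived from the number of characters) is given below; the function name and the use of None for unknown elements are illustrative assumptions, not the application's implementation.

```python
from typing import List, Optional

def generate_initial_intention_vectors(num_intent_spaces: int,
                                        num_characters: int,
                                        fixed_length: Optional[int] = None) -> List[List[Optional[object]]]:
    """Generate a preset number of initial intention vectors.

    Every element starts out unknown (None here).  The element count is either
    a pre-configured fixed value, or derived from the number of characters in
    the question text sample: one position for the intent classification flag
    plus one position per character sub-vector.
    """
    length = fixed_length if fixed_length is not None else 1 + num_characters
    return [[None] * length for _ in range(num_intent_spaces)]

# e.g. 4 intent spaces for a 20-character question, length derived from characters
vectors = generate_initial_intention_vectors(num_intent_spaces=4, num_characters=20)
```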
  • the first sentence vector includes a semantic feature sub-vector and multiple character sub-vectors; each initial intention vector is filled according to the first sentence vector to obtain the corresponding target intention vector.
  • the specific implementation method is: intent classification is performed on the semantic feature sub-vector to obtain the corresponding intention classification result; according to the intention classification result, each initial intention vector is filled to obtain the corresponding intermediate intention vector; according to each character sub-vector, each intermediate intention vector is filled to obtain the corresponding target intention vector.
  • the semantic feature sub-vector can be represented by [cls].
  • [cls] is not a vector used to represent a certain character in the text, but a semantic feature vector used to represent the entire text, which can be directly used for classification.
  • the question text sample may include multiple characters.
  • the first sentence vector may include a character sub-vector corresponding to each character in the question text sample.
  • the intent classification result can be used to indicate, for each initial intent vector, whether the first sentence vector has the intent corresponding to the initial intent vector. For example, for initial intent vector 1, if the intent classification result is the first classification result, it means that the first sentence vector does not have the intent corresponding to the initial intent vector 1; if the intent classification result is the second classification result, it means that the first sentence vector has the intent corresponding to the initial intent vector 1.
  • the semantic feature sub-vector is processed for intent classification to obtain the corresponding intent classification result.
  • it can be processed based on [cls] through a binary classification fully connected layer, that is, a linear layer (linear), to obtain the intent classification result. That is, the input value of linear is [cls], and the output value includes multiple intents.
  • Among the intent classification results, each intent classification result corresponds to one initial intent vector, and each intent classification result can be one of the first classification result and the second classification result.
  • each initial intent vector is filled to obtain the corresponding intermediate intent vector, which can be automatically filled with the value used to represent the intent classification result of the initial intent vector in each initial intent vector to obtain the intermediate intent vector. For example, for initial intent vector 1, if the intent classification result is the first classification result, the value "0" used to represent the first classification result is filled in the specified position of the initial intent vector 1; if the intent classification result is the second classification result, the value "1" used to represent the second classification result is filled in the specified position of the initial intent vector 1.
  • filling the specified position of the initial intention vector 1 with the value "0" for indicating the first classification result may be performed by replacing the unknown element at the specified position with "0".
  • the intent classification result corresponding to the initial intention vector is the first classification result, it means that the first sentence vector does not have the corresponding intention. Therefore, the character recognition results obtained by entity recognition and constraint recognition in subsequent steps are irrelevant to the initial intention vector. Therefore, after filling "0" in the initial intention vector to obtain the intermediate intention vector, the intermediate intention vector can be determined as the corresponding target intention vector.
  • each intermediate intention vector is filled according to each character sub-vector to obtain the corresponding target intention vector.
  • the specific implementation method is: according to each character sub-vector, entity recognition processing and constraint recognition processing are performed to obtain the corresponding character recognition result; according to the character recognition result, each intermediate intention vector is filled to obtain the corresponding target intention vector.
  • a binary fully connected layer linear can be used to perform entity recognition and constraint recognition processing according to each character sub-vector to obtain a corresponding character recognition result, which can be used to represent the entity type and constraint type corresponding to the character sub-vector.
  • the entity type includes non-entity and multiple preset entity types
  • the constraint type includes non-constraint and multiple preset constraint types.
  • each intermediate intention vector is filled according to the character recognition result to obtain the corresponding target intention vector.
  • the specific implementation method is: if the character recognition result is used to characterize that the character sub-vector belongs to an entity category or a constraint category, then according to the character recognition result, the corresponding intermediate intention vector is determined and filled.
  • if the character recognition result is used to represent that the character sub-vector belongs to an entity category, entity extraction processing is performed based on the character recognition result, and intent prediction processing is performed on the extracted entity to determine the intermediate intent vector corresponding to the entity, and the entity is automatically filled into the corresponding intermediate intent vector.
  • the filling position of the entity in the intermediate intent vector can be determined based on the character position corresponding to the entity in the question text sample.
  • constraint extraction processing can be performed based on the character recognition result, and intention prediction processing can be performed on the extracted constraints to determine the intermediate intention vector corresponding to the constraint, and the constraint can be automatically filled into the corresponding intermediate intention vector.
  • the filling position of the constraint in the intermediate intention vector can be determined based on the character position corresponding to the constraint in the question text sample.
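  • Putting the preceding steps together, the following sketch walks through the filling procedure: the intent classification result derived from [cls] is written into the specified (here, first) position of each vector, and recognized entities or constraints are then written into the intermediate vector of the intent space they were predicted to belong to, at the position of the corresponding character. The names, encodings and positions are assumptions made only for illustration.

```python
from typing import Dict, List, Optional

FIRST_CLASS, SECOND_CLASS = 0, 1  # "does not have" / "has" the corresponding intent

def fill_intention_vectors(intent_results: List[int],
                           char_results: List[Dict],
                           initial_vectors: List[List[Optional[object]]]) -> List[List[Optional[object]]]:
    """intent_results[i] -- intent classification result for intent space i.
    char_results[k]      -- per-character recognition result, e.g.
                            {"position": 3, "category": "entity", "intent_space": 2},
                            where category may be "entity", "constraint" or None and
                            intent_space comes from the intent prediction step."""
    vectors = [v[:] for v in initial_vectors]

    # Step 1: fill the intent classification result into the specified position.
    for i, result in enumerate(intent_results):
        vectors[i][0] = result

    # Step 2: fill entities and constraints into the intermediate vectors of the
    # intent spaces they belong to; spaces with the first classification result
    # keep their intermediate vector unchanged as the target intention vector.
    for rec in char_results:
        if rec["category"] in ("entity", "constraint"):
            space = rec["intent_space"]
            if vectors[space][0] == SECOND_CLASS:
                vectors[space][1 + rec["position"]] = rec["category"]
    return vectors
```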
  • the question text sample includes at least one of an entity element, an attribute element, a relationship element, and a constraint element; a preset number of initial intention vectors include a first intention vector and multiple second intention vectors; the first intention vector corresponds to no intent, and each second intention vector corresponds to an attribute element or a relationship element.
  • the question text sample includes at least one of an entity element, an attribute element, a relationship element, and a constraint element.
  • the slot recognition result of the entity element may be "entity”
  • the slot recognition result of the attribute element may be “attribute”
  • the slot recognition result of the relationship element may be “relationship”
  • the slot recognition result of the constraint element may be "constraint”, and so on.
  • the question parsing model can convert question text samples into different intent spaces, each of which corresponds to an initial intent vector.
  • the attribute element can be used to determine a target initial intention vector corresponding to the attribute element from a preset number of initial intention vectors.
  • the relation element can be used to determine a target initial intention vector corresponding to the relation element from a preset number of initial intention vectors.
  • Filling each initial intention vector according to the first sentence vector may be performed by labeling entities and constraints under the initial intention vector corresponding to each intention space.
  • Labeling entities and constraints in the target initial intention vector corresponding to the attribute element may enable the generated target intention vector to reflect at least one of the corresponding relationship between the attribute and the entity and the corresponding relationship between the attribute and the constraint.
  • Labeling entities and constraints in the target initial intention vector corresponding to the relationship element may enable the generated target intention vector to reflect at least one of the corresponding relationship between the relationship and the entity and the corresponding relationship between the relationship and the constraint.
  • slot elements such as entity elements, attribute elements, relationship elements, constraint elements, etc. may have certain corresponding relationships with each other.
  • a sample question text is: What is the price of product A, and when will product B be available? Among them, “product A” and “product B” are entities, “price” and “availability time” are attributes, and “product A” corresponds to “price” and “product B” corresponds to "availability time”. If slot recognition is performed for each element separately, the user's intention corresponding to the sample question text may be misunderstood, and the user may be fed back the prices of both products A and B, as well as the availability time of both products A and B.
  • slot recognition can be performed simultaneously in different intent spaces, not only identifying each slot element in the question text sample, but also identifying the correspondence between attributes and entities, the correspondence between attributes and constraints, the correspondence between relationships and entities, and the correspondence between relationships and constraints, etc.
  • the question text sample may be composed of entity elements and attribute elements.
  • the question text sample is: How is the cost of product A calculated? Among them, the entity element is "product A”, and the attribute element is "how is the cost calculated", which corresponds to the attribute "price”.
  • the question text sample may be composed of an entity element and a constraint element.
  • the question text sample is: What restaurant activities can be participated in before this Sunday? Among them, the entity element is "restaurant activities" and the constraint element is "before this Sunday”.
  • the question text sample can be composed of entity elements and relationship elements, for example, what are the handling channels for C activity? Among them, the entity element is "C activity” and the relationship element is "handling channel”.
  • the question text sample can be composed of entity elements, for example, D service.
  • the entity element is "D service”.
  • when the question text sample only includes entity elements, the question is usually incomplete, and a rhetorical (follow-up) question needs to be asked to guide the user asking the question to supplement it.
  • in this case, the intent classification result of the first intent vector can be the second classification result, and the intent classification result of each second intent vector can be the first classification result, that is, the first sentence vector does not have any intent.
  • Figure 2 is a data flow diagram of a question parsing model in a model training method for a question-answering system provided in an embodiment of the present application.
  • the question text sample 202 is “How long does it take for liposuction to heal? How much does it cost to slim the face?”
  • [sep] is used to represent the separator between two question text samples.
  • [cls] stands for classification and can be understood as serving the downstream classification task.
  • the BERT model inserts a [cls] symbol before the text and uses the output vector corresponding to the symbol as the semantic representation of the entire text for text classification.
  • the first encoding layer may be a Bert encoder 204
  • the transformation layer may include a spatial transformation layer 208 , a binary classification fully connected layer 210 , and a binary classification fully connected layer 212 .
  • the Bert encoder 204 is used to convert the question text sample 202 into a corresponding sentence vector 206 and send it to the spatial conversion layer 208; the spatial conversion layer 208 is used to generate a preset number of initial intention vectors when receiving the sentence vector 206, wherein the initial intention vector 1 corresponds to the intention space Null (null value) 220, the initial intention vector 2 corresponds to the intention space "introduction” 218, the initial intention vector 3 corresponds to the intention space "price” 216, and the initial intention vector 4 corresponds to the intention space "recovery period" 214.
  • the binary classification fully connected layer 210 can be used to perform intent classification processing according to the semantic feature sub-vector H[cls] 2062, and obtain the intent classification results corresponding to each intent space and fill them in: the intent space "recovery period" 214 corresponds to the second classification result, and "1" is filled in the first position; the intent space "price" 216 corresponds to the second classification result, and "1" is filled in the first position; the intent space "introduction" 218 corresponds to the first classification result, and "0" is filled in the first position; the intent space Null 220 corresponds to the first classification result, and "0" is filled in the first position.
  • the binary classification fully connected layer 212 can be used to perform entity recognition processing and constraint recognition processing according to each character sub-vector 2064 to obtain the corresponding character recognition result, and perform entity extraction processing and constraint extraction processing based on the character recognition result to obtain the entities "liposuction" and "face thinning" corresponding to the first sentence vector.
  • the entity "liposuction” is processed for intention prediction, and the intention space corresponding to the entity “liposuction” is determined to be the intention space "recovery period” 214 and filled, and the target intention vector corresponding to the intention space “recovery period” is obtained.
  • the entity "face thinning” is processed for intention prediction, and the intention space corresponding to the entity “face thinning” is determined to be the intention space "price” 216 and filled, and the target intention vector corresponding to the intention space "price” is obtained.
  • the target intent vector can generate the corresponding text segment. For other intent spaces whose intent classification results are the first classification results, the corresponding target intent vector can be discarded.
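  • The worked example of Figure 2 can be summarized with the following sketch, which keeps only the intent spaces whose classification result is the second classification result and renders each retained target intention vector as a text fragment; the dictionary layout is a simplification invented for illustration.

```python
def to_text_fragments(target_vectors):
    """Discard intent spaces flagged with the first classification result (0)
    and render the rest as '<intent space>: <entity/constraint spans>'."""
    fragments = []
    for space_name, vec in target_vectors.items():
        if vec["flag"] != 1:
            continue
        fragments.append(f"{space_name}: {', '.join(vec['spans'])}")
    return fragments

# Mirrors the sample question "How long does it take for liposuction to heal?
# How much does it cost to slim the face?" from Figure 2.
vectors = {
    "recovery period": {"flag": 1, "spans": ["liposuction"]},
    "price":           {"flag": 1, "spans": ["face thinning"]},
    "introduction":    {"flag": 0, "spans": []},
    "Null":            {"flag": 0, "spans": []},
}
print(to_text_fragments(vectors))
# ['recovery period: liposuction', 'price: face thinning']
```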
  • the question-answering system also includes an initial entity linking model;
  • the model training method of the question-answering system also includes: obtaining entity linking samples; inputting the entity linking samples into the initial entity linking model for iterative training to obtain an entity linking model;
  • the entity linking model includes a second encoding layer and a prediction layer;
  • the second encoding layer is used to encode the entity linking samples to obtain a corresponding second sentence vector;
  • the prediction layer is used to perform prediction processing based on the second sentence vector to determine the corresponding target entity.
  • Entity link samples can be obtained in the following ways: obtaining a standard entity library; the standard entity library includes multiple standard entities; and generating entity link samples based on the question parsing samples and the standard entity library.
  • the specific implementation method of generating entity link samples based on question parsing samples and standard entity libraries involves a sample generation method of a question-answering system, and the specific scheme in the following method embodiment can be referred to.
  • the structure of the initial entity link model is exactly the same as that of the entity link model obtained after iterative training, but the model parameters involved in the training are different.
  • the entity link model obtained after iterative training also has a second encoding layer and a prediction layer.
  • the second encoding layer can be a Bert encoder or other components that can be used to convert text into sentence vectors.
  • the prediction layer can be a binary fully connected layer linear or other components that can be used to map semantic feature sub-vectors to corresponding entities.
  • the prediction layer is used to perform prediction processing according to the second sentence vector to determine the corresponding target entity.
  • the prediction layer can be used to perform prediction processing according to the semantic feature subvector in the second sentence vector to determine the corresponding target entity.
  • the generated entity linking dataset can be used to train the BERT classification model.
  • In the prediction phase, after entity recognition is completed through the question parsing model, an entity mention fragment is obtained, and entities with higher similarity are recalled from the knowledge graph. Similar to the construction of the training data, the question containing the mention fragment and each recalled entity are spliced separately. When the entity linking model is used for prediction, the mention is linked to the unique entity in the graph.
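  • A sketch of the splicing and linking flow just described follows; the marker format mirrors the sample in Figure 3, while score_fn stands in for the trained entity linking model and is otherwise an assumption.

```python
from typing import Callable, List

def build_linking_inputs(question: str, mention: str, candidates: List[str]) -> List[str]:
    """Splice the question (with the mention fragment marked by $...$) with each
    recalled candidate entity, producing one spliced text per candidate."""
    marked = question.replace(mention, f"${mention}$")
    return [f"[sep]{marked}[sep]{candidate}" for candidate in candidates]

def link_entity(question: str, mention: str, candidates: List[str],
                score_fn: Callable[[str], float]) -> str:
    """Score every spliced pair with the entity linking model and link the
    mention to the highest-scoring (unique) entity in the graph."""
    inputs = build_linking_inputs(question, mention, candidates)
    scores = [score_fn(text) for text in inputs]
    return candidates[max(range(len(candidates)), key=scores.__getitem__)]
```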
  • Figure 3 is a data flow diagram of the entity linking model in a model training method for a question-answering system provided in an embodiment of the present application.
  • the entity link sample 302 is "[sep] How much does it cost to use $radio frequency face slimming$ [sep] radio frequency lipolysis face slimming", and the entity link sample 302 is encoded by the Bert encoder to obtain the corresponding second sentence vector, which includes the semantic feature sub-vector H[cls] 306.
  • the semantic feature sub-vector H[cls] 306 is predicted by the binary classification fully connected layer 308 to determine the corresponding entity prediction result 310.
  • the question-answering system also includes an initial question classification model; the model training method of the question-answering system also includes: obtaining question classification samples; inputting the question classification samples into the initial question classification model for iterative training to obtain a question classification model.
  • Question classification samples can be obtained in the following ways: obtaining a question domain set; the question domain set includes in-domain question texts and out-of-domain question texts; generating question classification samples based on the question parsing samples and the question domain set.
  • the specific implementation method of obtaining the question domain set and the specific description of the question domain set involve the sample generation method of the question and answer system, and can refer to the specific scheme in the following method embodiment.
  • the question classification samples can constitute an OOD data set, which is used to construct the training data required by the OOD model.
  • the question classification samples can include positive and negative samples: a positive sample represents an out-of-domain question and is marked with label "1", and a negative sample represents an in-domain question and is marked with label "0".
  • a positive sample may be {"text": "Hello", "label": 1}.
  • Negative samples can be {"text": "What is the transaction time for unit notice deposit", "label": 0}, or {"text": "How long can the parking space loan last", "label": 0}.
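  • The positive/negative labeling above can be reproduced with a few lines; the helper name is hypothetical, and the records simply reuse the example texts.

```python
import json
from typing import Dict, List

def build_ood_samples(in_domain: List[str], out_of_domain: List[str]) -> List[Dict]:
    """Out-of-domain questions become positive samples (label 1),
    in-domain questions become negative samples (label 0)."""
    samples = [{"text": q, "label": 1} for q in out_of_domain]
    samples += [{"text": q, "label": 0} for q in in_domain]
    return samples

for record in build_ood_samples(
        in_domain=["What is the transaction time for unit notice deposit"],
        out_of_domain=["Hello"]):
    print(json.dumps(record, ensure_ascii=False))
# {"text": "Hello", "label": 1}
# {"text": "What is the transaction time for unit notice deposit", "label": 0}
```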
  • the specific implementation method of generating question classification samples based on question analysis samples and question domain sets involves a sample generation method of a question-answering system, and reference may be made to the specific scheme in the following method embodiment.
  • a question text sample is obtained; then, the question text sample is input into the initial question parsing model for iterative training to obtain a question parsing model;
  • the initial question parsing model includes a first encoding layer and a conversion layer;
  • the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector;
  • the conversion layer is used to generate a preset number of initial intention vectors when the first sentence vector is received, fill each initial intention vector according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment;
  • the text fragment is used to query the answer to the question text sample in the question-answering system.
  • the initial question parsing model is iteratively trained by the obtained question text sample.
  • the question text sample can be encoded to obtain the corresponding first sentence vector through the first encoding layer, and a preset number of initial intention vectors can be generated through the conversion layer, and then each initial intention vector can be filled based on the first sentence vector to obtain a corresponding target intention vector, and the target intention vector can be converted into a corresponding text fragment.
  • the text fragment is used to query the answer to the question text sample in the question-answering system.
  • the initial intent vector is filled, so that the question text sample can be parsed at one time to obtain a text fragment corresponding to at least one target intent vector.
  • the question intent reflected by each text fragment can be determined by the corresponding initial intent vector, and the question text sample can be converted into different intent spaces through the question parsing model, and entities and constraints can be annotated under the intent vectors corresponding to each intent space, which is conducive to improving slot recognition efficiency and reducing the number of models required for the slot recognition processing flow, thereby reducing latency.
  • slot recognition is performed simultaneously in each intent space, which can increase concurrency and improve the intent recognition capability and efficiency of the question-answering system.
  • Figure 4 is a processing flow chart of a response method provided in an embodiment of the present application. Referring to Figure 4, the processing flow of the response method specifically includes steps S402 to S406.
  • the question parsing model is obtained by training using the model training method of the question-answering system.
  • the model training method for the question-answering system in this step may be a model training method for the question-answering system provided by the aforementioned various method embodiments.
  • the target question is input into a question parsing model for parsing processing to obtain a corresponding target fragment, including: inputting the target question into a question classification model for classification processing to obtain a classification result; when the classification result is used to characterize that the target question belongs to a first preset classification, inputting the target question into the question parsing model for parsing processing to obtain the corresponding target fragment.
  • Step S406: determine the answer to the target question based on the target fragment.
  • the answer to the target question is determined based on the target fragment, including: inputting the target fragment into the entity linking model for prediction processing to obtain the corresponding target entity; performing slot filling processing based on the target entity to obtain the slot filling result of the target question; based on the slot filling result, querying the corresponding answer in the pre-configured knowledge graph to obtain the answer to the target question.
  • slot filling processing is performed on the target fragment obtained in step S404: first, at least one slot template that the target fragment may correspond to is determined; then, for each slot template, slot filling processing is performed on the slot template through the target fragment to obtain the filled slot template, and the filled slot template is determined as the slot filling result corresponding to the target question.
  • Each slot template may include one or more slots to be filled.
  • slot template 1 is: What is the (attribute slot) of (entity slot)?
  • Slot template 2 is: (entity slot)(relationship slot)(entity slot)?
  • Slot template 3 is: (Constraint slot) (Entity slot) What are they?
  • the target entity can be used not only to fill the "entity slot”, but also to fill multiple slots such as “attribute slot”, “relationship slot” and “constraint slot”.
  • the slot filling result of the target question includes the filled slot template 1: What is the (attribute slot) of (product A)?
  • the filled slot template 1 cannot be used to query in the knowledge graph to obtain a unique corresponding answer. It can be determined that the slot missing information corresponding to the filled slot template 1 is attribute missing. According to the slot missing information "attribute missing", a corresponding rhetorical question "What information do you want to inquire about product A?” is generated. The user input "price" is received in response to the rhetorical question.
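  • The slot templates and the missing-slot handling just described can be sketched as follows; the template layout, the rhetorical-question wording and the query callback are illustrative assumptions rather than the application's exact mechanism.

```python
SLOT_TEMPLATES = {
    "template 1": ["entity", "attribute"],              # What is the (attribute) of (entity)?
    "template 2": ["entity", "relationship", "entity"],
    "template 3": ["constraint", "entity"],
}

RHETORICAL_QUESTIONS = {
    "attribute": "What information do you want to inquire about {entity}?",
}

def fill_slots(template_slots, extracted):
    """Fill each slot from the parsed target fragments; unfilled slots stay None."""
    return {slot: extracted.get(slot) for slot in template_slots}

def answer_or_ask_back(filled, kg_query):
    """Query the knowledge graph when the template is complete; otherwise
    generate a rhetorical question from the slot missing information."""
    missing = [slot for slot, value in filled.items() if value is None]
    if not missing:
        return kg_query(filled)
    return RHETORICAL_QUESTIONS[missing[0]].format(entity=filled.get("entity", ""))

filled = fill_slots(SLOT_TEMPLATES["template 1"], {"entity": "product A"})
print(answer_or_ask_back(filled, kg_query=lambda f: "answer from knowledge graph"))
# What information do you want to inquire about product A?
```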
  • Figure 5 is a flow chart of session management in a response method provided in an embodiment of the present application.
  • the question-answering system can perform algorithmic analysis on the question to obtain an analysis result, which includes at least one slot identification result.
  • a parsing result check is performed: if the parsing result indicates OOD or no result, a response text is generated according to the pre-configured response words corresponding to "don't know/can't answer/flag bit"; the response text is used to inform the user that the question-answering system cannot answer the question.
  • a session stack detection is performed: for the first session, the slot management process is entered; for multiple rounds of dialogue, the session management process is entered.
  • the slot management process is entered, and the slot filling process is performed through the slot management process.
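  • The branching of Figure 5 can be summarized by the control-flow sketch below; the dictionary fields and return strings are paraphrases chosen for illustration, not the application's data structures.

```python
def manage_turn(parsing_result: dict, session_stack: list, response_words: dict) -> str:
    """Rough control flow of the session management shown in Figure 5."""
    # Parsing result check: OOD or no result -> pre-configured response words,
    # informing the user that the question cannot be answered.
    if parsing_result.get("ood") or not parsing_result.get("slots"):
        return response_words["cannot_answer"]

    # Session stack detection: first session -> slot management process,
    # multiple rounds of dialogue -> session management process.
    if not session_stack:
        return "slot management: perform slot filling on this turn's slot results"
    return "session management: merge this turn with slots from earlier turns"
```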
  • Corresponding to the model training method for a question-answering system provided above, an embodiment of the present application also provides a model training device for a question-answering system, which is described below in conjunction with the accompanying drawings.
  • FIG6 is a schematic diagram of a model training device for a question-answering system provided in an embodiment of the present application.
  • the present embodiment provides a model training device 600 for a question-answering system, wherein the question-answering system includes an initial question parsing model; the device includes: a first acquisition unit 602, used to acquire a question text sample; a first training unit 604, used to input the question text sample into the initial question parsing model for iterative training to obtain a question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each of the initial intention vectors according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.
  • the intention classification result includes a first classification result and a second classification result
  • the specific implementation method of filling each of the intermediate intention vectors according to each of the character sub-vectors to obtain the corresponding target intention vector is: performing entity recognition processing and constraint recognition processing according to each of the character sub-vectors to obtain the corresponding character recognition result; filling each of the intermediate intention vectors according to the character recognition result to obtain the corresponding target intention vector.
  • the specific implementation method of filling each of the intermediate intention vectors according to the character recognition result to obtain the corresponding target intention vector is: if the character recognition result is used to characterize that the character sub-vector belongs to an entity category or a constraint category, then according to the character recognition result, the corresponding intermediate intention vector is determined and filled.
  • the question text sample includes at least one of entity elements, attribute elements, relationship elements, and constraint elements; the preset number of initial intention vectors includes a first intention vector and multiple second intention vectors; the first intention vector corresponds to no intent, and each of the second intention vectors corresponds to one attribute element or one relationship element.
  • the question-answering system also includes an initial entity linking model; the first acquisition unit 602 is also used to obtain entity linking samples; the first training unit 604 is also used to input the entity linking samples into the initial entity linking model for iterative training to obtain an entity linking model; the entity linking model includes a second encoding layer and a prediction layer; the second encoding layer is used to encode the entity linking samples to obtain a corresponding second sentence vector; the prediction layer is used to perform prediction processing based on the second sentence vector to determine the corresponding target entity.
  • the question-answering system further includes an initial question classification model; the first acquisition unit 602 is further used to acquire question classification samples; the first training unit 604 is further used to input the question classification samples into the initial question classification model for iterative training to obtain a question classification model.
  • the model training device of the question-answering system includes a first acquisition unit and a first training unit, wherein: the first acquisition unit is used to acquire a question text sample; the first training unit is used to input the question text sample into the initial question parsing model for iterative training to obtain the question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each initial intention vector according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.
  • the initial question parsing model is iteratively trained through the obtained question text samples.
  • the question text sample can be encoded through the first encoding layer to obtain the corresponding first sentence vector, and a preset number of initial intention vectors can be generated through the conversion layer; each initial intention vector is then filled based on the first sentence vector, so that the question text sample is parsed in a single pass into text fragments corresponding to at least one target intention vector, and the question intent reflected by each text fragment is determined by its corresponding initial intention vector.
  • In this way, the question parsing model maps the question text sample into different intent spaces and annotates entities and constraints under the intention vector corresponding to each intent space, which helps improve slot recognition efficiency and reduces the number of models required in the slot recognition processing flow, thereby reducing latency.
  • Performing slot recognition simultaneously in each intent space also increases concurrency and improves the intent recognition capability and efficiency of the question-answering system.
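  • As a purely illustrative sketch (not the patented implementation), the first encoding layer and the conversion layer described above could be organized roughly as follows; the encoder type, dimensions, intent count, and tag set are all assumptions made for the example:

    import torch
    import torch.nn as nn

    class QuestionParsingModel(nn.Module):
        def __init__(self, vocab_size=5000, hidden=128, num_intents=8, num_tags=5):
            super().__init__()
            # First encoding layer: embeds characters and encodes the question text.
            self.embed = nn.Embedding(vocab_size, hidden)
            self.encoder = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
            # Conversion layer: one learnable initial intention vector per intent space.
            self.init_intents = nn.Parameter(torch.randn(num_intents, 2 * hidden))
            self.fill = nn.MultiheadAttention(2 * hidden, num_heads=4, batch_first=True)
            # Per-intent tagger that labels each character (entity / constraint / none).
            self.tagger = nn.Linear(4 * hidden, num_tags)

        def forward(self, token_ids):
            # token_ids: (batch, seq_len) character ids of the question text.
            char_vecs, _ = self.encoder(self.embed(token_ids))            # (B, L, 2H)
            intents = self.init_intents.unsqueeze(0).expand(token_ids.size(0), -1, -1)
            # "Fill" each initial intention vector from the sentence representation.
            target_intents, _ = self.fill(intents, char_vecs, char_vecs)  # (B, K, 2H)
            # Tag every character under every intent space (slots recognized in parallel).
            pair = torch.cat(
                [char_vecs.unsqueeze(1).expand(-1, intents.size(1), -1, -1),
                 target_intents.unsqueeze(2).expand(-1, -1, char_vecs.size(1), -1)],
                dim=-1)
            return self.tagger(pair)  # (B, K, L, num_tags), decoded into text fragments

    model = QuestionParsingModel()
    logits = model(torch.randint(0, 5000, (2, 16)))
    print(logits.shape)  # torch.Size([2, 8, 16, 5])
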
  • FIG. 7 is a schematic diagram of a response device provided in an embodiment of the present application.
  • This embodiment provides an answering device 700, including: a second acquisition unit 702, used to acquire a target question to be answered; a first parsing unit 704, used to input the target question into a question parsing model for parsing and processing to obtain a corresponding target segment; the question parsing model is obtained by training using a model training method of a question-answering system; a first determination unit 706, used to determine an answer to the target question based on the target segment.
  • the first parsing unit 704 is specifically used to: input the target problem into the problem classification model for classification processing to obtain a classification result; when the classification result is used to characterize that the target problem belongs to a first preset classification, input the target problem into the problem parsing model for parsing processing to obtain a corresponding target fragment.
  • the first determination unit 706 is specifically used to: input the target fragment into the entity linking model for prediction processing to obtain a corresponding target entity; perform slot filling processing based on the target entity to obtain a slot filling result for the target question; based on the slot filling result, query the corresponding answer in a pre-configured knowledge graph to obtain an answer to the target question.
  • the answering device includes: a second acquisition unit, which is used to acquire a target question to be answered; a first parsing unit, which is used to input the target question into a question parsing model for parsing and processing to obtain a corresponding target segment; the question parsing model is obtained by training through a model training method of a question-answering system; and a first determination unit, which is used to determine the answer to the target question based on the target segment.
  • the question parsing model can encode the input question through the first encoding layer to obtain the corresponding first sentence vector, generate a preset number of initial intention vectors through the conversion layer, and fill each initial intention vector based on the first sentence vector, so that the question is parsed in a single pass into text fragments corresponding to at least one target intention vector, with the question intent reflected by each text fragment determined by its corresponding initial intention vector.
  • Converting the question into different intention spaces and annotating entities and constraints under the intention vector corresponding to each intention space is conducive to improving the efficiency of slot recognition and reduces the number of models required for the slot recognition processing flow, thereby reducing latency; performing slot recognition simultaneously in each intent space increases concurrency and improves the intent recognition capability and efficiency of the question-answering system.
  • On this basis, the target fragments are generated through the question parsing model, and the answer to the target question is determined based on the target fragments.
  • an embodiment of the present application provides a sample generation method for a question-answering system.
  • Figure 8 is a processing flow chart of a sample generation method of a question-and-answer system provided in an embodiment of the present application.
  • the sample generation method of the question-and-answer system of Figure 8 can be executed by an electronic device, which can be a terminal device, such as a mobile phone, a laptop computer, an intelligent interactive device, etc.; or, the electronic device can also be a server, such as an independent physical server, a server cluster, or a cloud server capable of cloud computing.
  • the sample generation method of the question-and-answer system provided in this embodiment specifically includes steps S802 to S808.
  • the question answering system may include an initial question classification model, an initial question parsing model, and an initial entity linking model.
  • the specific description of the question-answering system, the initial question classification model, the initial question parsing model and the initial entity linking model in the question-answering system in this embodiment involves the model training method of the question-answering system, and reference can be made to the specific scheme in the above method embodiment.
  • FIG9 is a schematic diagram of a framework of a question-answering system provided in an embodiment of the present application.
  • the training data set to be generated includes: a question parsing data set 908 for iterative training of the initial question parsing model, an entity linking data set 910 for iterative training of the initial entity linking model, and an OOD data set 912 for iterative training of the initial OOD model.
  • Step S802 obtaining a pre-configured synonym set, similar question set, comparison word set, problem domain set, and standard entity library;
  • the synonym set includes a plurality of standard words and at least one synonym corresponding to each standard word;
  • the similar question set includes a plurality of first attributes and at least one similar question sentence corresponding to each first attribute;
  • the comparison word set includes comparison word information;
  • the question domain set includes in-domain question text and out-of-domain question text;
  • the standard entity library includes multiple standard entities.
  • the synonym set may be obtained from a pre-configured synonym template.
  • the following fields can be displayed: standard word, type, synonym, update time, operation, etc.
  • (a1) Standard word: automatically generated by the system; the values are derived from the entities and attributes defined in the corresponding graph.
  • (a2) Type: automatically generated by the system; the value comes from the type corresponding to the entities and attributes defined in the corresponding graph.
  • the synonym set includes multiple standard words and at least one synonym corresponding to each standard word.
  • the synonym set includes: standard word x1, synonyms x2, synonyms x3 and synonyms x4 corresponding to the standard word x1; standard word y1, synonym y2 corresponding to the standard word y1; standard word z1, synonyms z2 and synonym z3 corresponding to the standard word z1, and so on.
  • the similar question set can be obtained from a pre-configured similar question template.
  • In the similar question template, the entity type column corresponds to the concept, the attribute column corresponds to the attribute, and similar questions are constructed for the attributes.
  • the following fields can be displayed: concept, attribute name, attribute type, similarity question, update time, operation, etc.
  • (b1) Concept: corresponds to the concept defined in the graph; it can be searched but cannot be added or edited.
  • (b2) Attribute type: corresponds to the attribute defined in the graph and cannot be edited.
  • Attribute types include but are not limited to: text, number, image, rich text, map, etc.
  • the similar question set includes multiple first attributes and at least one similar question corresponding to each first attribute.
  • the similar question set includes: first attribute A, and similar question A1 and similar question A2 corresponding to the first attribute A; first attribute B, and similar question B1 corresponding to the first attribute B, and so on.
  • the comparison word set may be obtained from a pre-configured comparison word template.
  • the comparison word template can be used to generate a question parsing dataset.
  • The maintainable candidate words are mainly attribute words whose concept attribute type is "number"; their purpose is to ensure that training data of the relevant comparison types is generated in the parsing dataset.
  • the comparative word information may include a comparative word and a unit.
  • Comparative words include but are not limited to: lowest, smallest, least, shortest, cheapest, low, small, few, short, cheap, highest, largest, most, longest, most expensive, high, big, many, long, expensive, etc.
  • Units include but are not limited to: yuan, thousand, thousand yuan, ten thousand, ten thousand yuan, hundred million yuan, day, day, month, year, %, percent, etc.
  • the problem domain set may be obtained from a pre-configured problem domain template.
  • the question domain template is used to generate OOD model data sets and maintain some questions and small talk that the question-answering system cannot answer.
  • the backend will sample part of the question parsing data set to form binary positive and negative samples, so that in-domain question-answering data is reasonably represented. Because the generated data has certain limitations, the OOD template supports editing and importing both in-domain and out-of-domain questions to ensure classification accuracy and data maintainability.
  • the question domain set includes in-domain question texts and out-of-domain question texts.
  • the question domain set can be generated based on the pre-configured question domain.
  • the question answering system has the ability to answer questions within the domain, but does not have the ability to answer questions outside the domain.
  • An in-domain problem may be a problem within the problem domain, and the text used to describe the in-domain problem may be the in-domain problem text.
  • An out-of-domain problem may be a problem outside the problem domain, and the text used to describe the out-of-domain problem may be the out-of-domain problem text.
  • In-domain question text such as "When is the deadline for activity A?”
  • out-of-domain question text such as "Should I go out today?”
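  • Purely for illustration, the pre-configured sets described above might be represented in memory as simple structures like the following; all field names and values are assumed examples rather than the patent's data format:

    synonym_set = {
        "x1": {"type": "entity", "synonyms": ["x2", "x3", "x4"]},
        "price": {"type": "json key", "synonyms": ["cost", "fee"]},
    }
    similar_question_set = {
        "handling process": ["How do I handle [e]?", "What is the handling process?"],
        "amount": ["How much is the credit line for [e]?"],
    }
    comparison_word_set = [
        {"attribute": "term", "comparison_word": "at most", "unit": "years"},
        {"attribute": "interest rate", "comparison_word": "greater than", "unit": "%"},
    ]
    question_domain_set = {
        "in_domain": ["When is the deadline for activity A?"],
        "out_of_domain": ["Should I go out today?"],
    }
    standard_entity_library = ["Unit Notice Deposit", "Unit Regular Deposit Money", "Business A"]
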
  • Step S804 generating question parsing samples based on at least one of the synonym set, the similar question set, and the comparison word set.
  • the question parsing samples include but are not limited to: single-slot samples, single-attribute samples, single-entity single-attribute samples, single-entity dual-attribute samples, dual-entity single-attribute samples, dual-entity dual-attribute samples, composite attribute constraint samples, comparison type samples, etc.
  • the generated question parsing samples can be stored in a structured format that records the generated text, a graph attribute, and span information for the head entity (and, where present, the tail entity); the graph attribute is an attribute of the generated data in the graph and is used to indicate the storage intention of the generated data in the graph. One possible form of this format is sketched below.
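  • A minimal sketch of such a storage record, assuming a JSON-like structure whose key names are chosen here only for illustration:

    sample = {
        "text": "Where to support the processing of business A",   # generated question text
        "attribute": "Processing channel",                          # graph attribute / storage intention
        "entities": [
            {"start": 9, "end": 12, "category": "Business", "value": "Business A"},
        ],
    }
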
  • generating a question analysis sample according to at least one of a synonym set, a similar question set, and a comparative word set includes: filtering the synonym set according to a first preset filtering condition to obtain a candidate word list; the candidate word list includes candidate standard words and candidate synonyms corresponding to the candidate standard words; performing a first sampling process on the candidate word list to obtain a single-slot sample, and determining the single-slot sample as a question parsing sample; or performing a second sampling process on multiple first attributes in a similar question set to obtain a target first attribute; performing a third sampling process on at least one similar question sentence corresponding to the target first attribute to obtain an initial similar question sentence; determining the corresponding single-attribute sample based on the initial similar question sentence, and determining the single-attribute sample as a question parsing sample.
  • Question parsing samples may include single-slot samples or single-attribute samples.
  • In the case where the question parsing sample includes a single-slot sample, the question parsing sample may be generated according to at least one of the synonym set, the similar question set, and the comparison word set as follows.
  • the synonym set may be filtered according to a first preset filtering condition to obtain a candidate word list; the candidate word list includes candidate standard words and candidate synonyms corresponding to the candidate standard words; and the candidate word list is subjected to a first sampling process to obtain a single-slot sample.
  • a single-slot sample may be a text sample that corresponds to only one preset slot type among multiple preset slot types such as entity, attribute, relationship, constraint, etc.
  • Each preset slot type corresponds to one of the aforementioned slot identification results.
  • the question “What is the price of product A?” is processed by slot recognition through the relationship extraction model, and the obtained slot recognition results include: the slot recognition result of "product A” is “entity”, the slot recognition result of "price” is “attribute”, and so on.
  • the slot recognition result "entity" corresponds to the preset slot type "entity”.
  • the single slot samples belonging to the preset slot type "entity” can be "A product", "B activity”, “C product and D product”, and so on.
  • the slot recognition result "attribute” corresponds to the preset slot type "attribute”.
  • Single slot samples belonging to the preset slot type "attribute” may be "price”, "activity deadline”, “interest rate”, “discount”, and so on.
  • the generation of single-slot samples is to simulate questions asked when the user's intention is unclear, and can be used in guided rhetorical questions and continuous rhetorical questions scenarios.
  • a single-slot sample may be a single-entity sample, a double-entity sample, or a single-constraint sample.
  • Although the “dual-entity sample” includes two entities, both entities correspond to the same preset slot type, namely the preset slot type "entity", so the “dual-entity sample” is also a single-slot sample.
  • FIG. 10 is a first processing flow sub-diagram of a sample generation method of a question-answering system provided in an embodiment of the present application. FIG. 10 exemplarily shows a method for generating a single-slot sample, where the single-slot sample includes but is not limited to: single-entity data 1008, double-entity data 1010, and single-constraint data 1012.
  • the first preset filtering condition may include the filtering condition "entity" and the filtering condition "json key" (a JSON, i.e., JavaScript Object Notation, keyword). Based on the filtering condition "entity", the synonym set 1002 is filtered to obtain a candidate word list 1004, and the candidate word list 1004 is subjected to a first sampling process to generate single entity data 1008 and/or double entity data 1010.
  • the sampling method of the first sampling process may be random sampling or other sampling methods.
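  • For illustration, the filtering and first sampling process just described might be sketched as follows; the function name, fields, and sample inputs are assumptions made for the example:

    import random

    def build_single_slot_samples(synonym_set, n=3, condition="entity"):
        # First preset filtering condition: keep only entries of the requested type.
        candidates = [(word, info) for word, info in synonym_set.items()
                      if info["type"] == condition]
        samples = []
        for _ in range(n):
            word, info = random.choice(candidates)                 # first sampling process
            surface = random.choice([word] + info["synonyms"])     # standard word or a synonym
            samples.append({
                "text": surface,
                "attribute": None,                                  # storage intention empty ("NULL")
                "entities": [{"start": 0, "end": len(surface),
                              "category": condition, "value": word}],
            })
        return samples

    print(build_single_slot_samples(
        {"Bill Business": {"type": "entity", "synonyms": ["bill service"]}}))
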
  • Single entity data 1008 for example,
  • "Bill Business” located in the "text” row may be the generated data, that is, the text corresponding to the single entity data 1008.
  • "NULL” may be a graph attribute, used to indicate that the storage intention of the single entity data 1008 is empty. "0" may be the start position of the header entity, "4" may be the end position of the header entity, "Business” may be the header entity category, and "Bill Business” located in the "value” row may be the standard name of the header entity.
  • the synonym set 1002 is filtered to obtain a candidate word list 1006.
  • the candidate word list 1006 is subjected to a first sampling process to generate single constraint data 1012.
  • Single constraint data 1012 for example,
  • "X Branch” in the “text” row may be the generated data, that is, the text corresponding to the single constraint data 1012.
  • "Process flow” may be a graph attribute, used to indicate that the storage intention of the single constraint data 1012 is the process flow. "0" may be the starting position of the header entity, "4" may be the ending position of the header entity, "constraint” may be the header entity category, and "X Branch” in the "value” row may be the standard name of the header entity.
  • In the case where the question parsing sample includes a single-attribute sample, the question parsing sample may be generated according to at least one of the synonym set, the similar question set, and the comparison word set as follows.
  • the second sampling process may be performed on multiple first attributes in the similar question set to obtain a target first attribute; a third sampling process may be performed on at least one similar question sentence corresponding to the target first attribute to obtain an initial similar question sentence; and a corresponding single-attribute sample is determined according to the initial similar question sentence, and the single-attribute sample is determined as the question parsing sample.
  • the corresponding single-attribute sample is determined, and the single-attribute sample is determined as the question parsing sample. It can be: if the initial similar question carries a mask, the mask is deleted in the initial similar question to obtain the single-attribute sample; if the initial similar question does not carry a mask, the initial similar question is determined as a single-attribute sample.
  • a single attribute sample can be a text sample that only corresponds to the preset slot type "attribute”.
  • a single attribute sample can be a special single slot sample. Since the purpose of a single attribute sample is different from the "single entity sample”, “double entity sample” and “single constraint sample” listed above, it is explained separately.
  • the generation of single-attribute samples is to simulate questions asked by users when the subject is unknown, and is used in guided rhetorical questions and continuous rhetorical questions scenarios.
  • Fig. 11 is a second processing flow chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • Fig. 11 exemplarily shows a generation method of a single attribute sample, and the single attribute sample can be single attribute data 1108 .
  • a second sampling process is performed on multiple first attributes in the similar question set 1102 to obtain a target first attribute; a third sampling process is performed on at least one similar question sentence corresponding to the target first attribute to obtain an initial similar question sentence. If the initial similar question sentence carries a mask, that is, the initial similar question sentence is a similar question sentence 1104 with [e], then [e] is deleted to obtain single attribute data 1108; if the initial similar question sentence does not carry a mask, that is, the initial similar question sentence is a similar question sentence 1106 without [e], then the similar question sentence 1106 is determined as single attribute data 1108.
  • the processing methods of the second sampling process and the third sampling process may be random sampling or other sampling methods.
  • "What is the handling process?" in the "text" row may be generated data, that is, the text corresponding to the single attribute data 1108.
  • "Handling process” may be a graph attribute, used to indicate that the storage intention of the single attribute data 1108 is the handling process.
  • the single attribute data 1108 has neither a head entity nor a tail entity.
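  • A minimal sketch of this single-attribute generation step, assuming the mask is written as "[e]" and that the field names are illustrative:

    import random

    def build_single_attribute_sample(similar_question_set):
        attribute = random.choice(list(similar_question_set))          # second sampling process
        question = random.choice(similar_question_set[attribute])      # third sampling process
        text = " ".join(question.replace("[e]", "").split())           # drop the mask if present
        return {"text": text, "attribute": attribute, "entities": []}

    print(build_single_attribute_sample(
        {"handling process": ["What is the handling process of [e]?",
                              "What is the handling process?"]}))
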
  • a question parsing sample is generated according to at least one of a synonym set, a similar question set and a comparative word set, including: filtering the synonym set according to a second preset filtering condition to obtain a candidate entity word list; the candidate entity word list includes multiple candidate entity words; screening the similar question set to obtain an intermediate similar question set; the intermediate similar question set includes multiple first attributes and at least one candidate similar question sentence carrying a mask corresponding to each first attribute; according to the candidate word list and the intermediate similar question set, target candidate entity words and target candidate similar question sentences corresponding to the same attribute category are determined; the mask in the target candidate similar question sentence is replaced by the target candidate entity word to obtain a composite sample, and the composite sample is determined as a question parsing sample; or, filtering the synonym set according to a third preset filtering condition to obtain attribute synonyms and entity synonyms; generating random numbers; and selecting a composite sample according to comparative word information, attribute synonyms, entity synonyms, The random numbers and the preset description words are concaten
  • Question parsing samples can include composite samples as well as comparison type samples.
  • In the case where the question parsing sample includes a composite sample, the question parsing sample may be generated according to at least one of the synonym set, the similar question set, and the comparison word set as follows.
  • the method may be to filter the synonym set according to a second preset filtering condition to obtain a candidate entity word list; the candidate entity word list includes a plurality of candidate entity words; the similar question set is screened to obtain an intermediate similar question set; the intermediate similar question set includes a plurality of first attributes and at least one candidate similar question sentence carrying a mask corresponding to each first attribute; according to the candidate word list and the intermediate similar question set, the target candidate entity words and the target candidate similar question sentence corresponding to the same attribute category are determined; and the mask in the target candidate similar question sentence is replaced by the target candidate entity word to obtain a composite sample.
  • a composite sample may be a text sample corresponding to multiple preset slot types among multiple preset slot types such as entity, attribute, relationship, constraint, etc.
  • Composite samples include but are not limited to: single-entity single-attribute samples, single-entity dual-attribute samples, dual-entity single-attribute samples, and dual-entity dual-attribute samples, etc.
  • the generation of single-entity single-attribute samples is to simulate the questions when users normally ask about the attributes of a thing, and is used in single-entity single-attribute scenarios.
  • the generation of single-entity dual-attribute samples is to simulate the questions when users ask about two or more attributes of a thing at the same time, and is used in single-entity multi-attribute scenarios.
  • the generation of dual-entity single-attribute samples is to simulate the questions when users ask about the same attribute of multiple things at the same time, and is used in multi-entity single-attribute scenarios.
  • the generation of dual-entity dual-attribute samples is to simulate the questions when users ask about multiple attributes of multiple things at the same time, and is used in multi-entity multi-attribute scenarios.
  • FIG12 is a third processing flow sub-diagram of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • FIG12 exemplarily shows a generation method of a composite sample, and the composite sample can be composite data 1212.
  • Composite data 1212 includes but is not limited to: single-entity single-attribute data, single-entity dual-attribute data, dual-entity single-attribute data, and dual-entity dual-attribute data.
  • the second preset filtering condition may include the filtering condition “entity”.
  • the synonym set 1202 is filtered to obtain a candidate entity word list; the candidate entity word list includes multiple candidate entity words; the similar question set 1204 is screened to obtain an intermediate similar question set; the intermediate similar question set includes multiple first attributes and at least one candidate similar question sentence 1206 with a mask corresponding to each first attribute; according to the candidate entity word list and the intermediate similar question set, the candidate entity words and candidate similar questions 1208 corresponding to each attribute category are determined; the candidate entity words and candidate similar questions 1208 corresponding to each attribute category are randomly sampled to obtain target candidate entity words and target candidate similar questions 1210 corresponding to the same attribute category; and the mask [e] in the target candidate similar questions is replaced with the target candidate entity words to obtain composite data 1212.
  • "Where to support the processing of business A" can be the generated data, that is, the text corresponding to the single-entity single-attribute data; "Processing channel" can be a graph attribute, used to indicate that the storage intention of the single-entity single-attribute data is the processing channel; "9" can be the starting position of the header entity, "12" the ending position of the header entity, "Business" the header entity category, and "Business A" the standard name of the header entity.
  • "What information do I need to submit to handle business X? What are the application requirements?" can be the generated data, that is, the text corresponding to the single-entity dual-attribute data; "Handling information" and "Handling conditions" can be two different graph attributes, used to indicate that the storage intention of the single-entity dual-attribute data includes handling information and handling conditions; "4" can be the starting position of the header entity, "11" the ending position of the header entity, "Business" the header entity category, "Export Letter of Credit" the standard name of the first header entity, and "X" the standard name of the second header entity.
  • the first header entity and the second header entity are the same entity, so the starting position of the first header entity is the same as the starting position of the second header entity, and the ending position of the first header entity is the same as the ending position of the second header entity.
  • a preset splicing method uses one of “and”, “and”, “,”, “,” to splice two words together.
  • Dual-entity single-attribute data for example,
  • “How much is the credit line for business A and business B” can be the generated data, that is, the text corresponding to the double-entity single-attribute data.
  • "Amount" can be a graph attribute, used to indicate that the storage intention of the dual-entity single-attribute data is the amount. "0" can be the start position of the first header entity, "6" can be the end position of the first header entity, "7" can be the start position of the second header entity, "12" can be the end position of the second header entity, "Business" can be the header entity category, "business A" can be the standard name of the first header entity, and "business B" can be the standard name of the second header entity.
  • the first head entity and the second head entity are two different entities, but the graph attributes of the two entities are the same.
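  • A hedged sketch of the mask-replacement step that yields composite data such as single-entity single-attribute samples; the function and field names are assumptions made for the example:

    import random

    def build_composite_sample(entity_words, masked_questions, attribute, category="Business"):
        entity = random.choice(entity_words)
        template = random.choice(masked_questions)
        start = template.index("[e]")                       # where the entity will be inserted
        text = template.replace("[e]", entity, 1)
        return {"text": text, "attribute": attribute,
                "entities": [{"start": start, "end": start + len(entity),
                              "category": category, "value": entity}]}

    print(build_composite_sample(["business A"],
                                 ["Where to support the processing of [e]"],
                                 "Processing channel"))
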
  • Fig. 13 is a fourth processing flow chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • Fig. 13 exemplarily shows a generation method of a dual-entity dual-attribute sample, where the dual-entity dual-attribute sample can be dual-entity dual-attribute data 1306 .
  • multiple data can be randomly selected from the multiple generated single-entity single-attribute data, for example, single-entity single-attribute data 1302 and single-entity single-attribute data 1304, and single-entity single-attribute data 1302 and single-entity single-attribute data 1304 can be spliced together through randomly selected connectors to form dual-entity dual-attribute data 1306.
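  • The splicing of two single-entity single-attribute samples into dual-entity dual-attribute data might look roughly like this; the connectors and field names are illustrative assumptions:

    import random

    def splice_samples(a, b, connectors=(", ", " and ")):
        sep = random.choice(connectors)                       # randomly selected connector
        offset = len(a["text"]) + len(sep)
        shifted = [{**e, "start": e["start"] + offset, "end": e["end"] + offset}
                   for e in b["entities"]]                    # shift spans of the second sample
        return {"text": a["text"] + sep + b["text"],
                "attributes": [a["attribute"], b["attribute"]],
                "entities": a["entities"] + shifted}

    a = {"text": "What is the term of product M", "attribute": "Term",
         "entities": [{"start": 20, "end": 29, "category": "Business", "value": "product M"}]}
    print(splice_samples(a, a))
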
  • Fig. 14 is a fifth processing flow chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • Fig. 14 exemplarily shows a generation method of a composite attribute constraint sample.
  • the composite attribute constraint sample can be composite attribute constraint data 1416.
  • compound attribute constraint samples are to simulate the questions asked by users when they ask about a certain attribute of a thing under certain limited conditions. They are used in compound attribute constraint scenarios and continuous rhetorical question scenarios.
  • the synonym set 1402 is filtered based on the filter condition “json key” to obtain json attribute synonyms 1406 , and the synonym set 1402 is filtered based on the filter condition “entity” to obtain entity synonyms 1408 .
  • the similar question set 1404 is screened to obtain similar questions with masks, that is, similar questions 1410 with [e].
  • By randomly sampling the entity synonyms, json attribute synonyms, and similar questions 1412 corresponding to each attribute category, the target entity synonyms, target json attribute synonyms, and target similar questions 1414 can be obtained.
  • the [e] in the target similar questions is replaced by the target entity synonyms to obtain the intermediate similar questions, and then the splicing method is randomly determined.
  • the target json attribute synonyms and the intermediate similar questions are spliced according to the determined splicing method to obtain the composite attribute constraint data 1416.
  • Composite attribute constraint data 1416 for example,
  • "Can you tell me about the process of handling business B mobile banking?" can be the generated data, that is, the text corresponding to the composite attribute constraint data 1416.
  • "Process" can be a graph attribute, used to indicate that the storage intention of the composite attribute constraint data 1416 is the process.
  • In this example, "3" can be the start position of the head entity, "6" the end position of the head entity, "11" the start position of the tail entity, "15" the end position of the tail entity, "business" the head entity category, "constraint" the tail entity category, "business B" the standard name of the head entity, and "mobile banking" the standard name of the tail entity.
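  • A minimal sketch of the composite attribute constraint generation described above, assuming the mask is written as "[e]" and the field names are illustrative:

    import random

    def build_constraint_sample(entity_syn, json_attr_syn, masked_question, attribute):
        body = masked_question.replace("[e]", entity_syn, 1)
        # Randomly determine the splicing position of the constraint (json attribute synonym).
        text = json_attr_syn + " " + body if random.random() < 0.5 else body + " " + json_attr_syn
        return {"text": text, "attribute": attribute,
                "head_entity": entity_syn, "tail_entity": json_attr_syn,
                "tail_category": "constraint"}

    print(build_constraint_sample("business B", "mobile banking",
                                  "Can you tell me about the process of handling [e]?",
                                  "Process"))
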
  • In the case where the question parsing sample includes a comparison type sample, the question parsing sample may be generated according to at least one of the synonym set, the similar question set, and the comparison word set as follows.
  • the synonym set may be filtered according to a third preset filtering condition to obtain attribute synonyms and entity synonyms; a random number is generated; and a splicing process is performed based on comparison word information, attribute synonyms, entity synonyms, random numbers and preset description words to obtain a comparison type sample.
  • The generation of comparison type samples is to simulate users asking questions that require numerical comparison, such as the maximum, minimum, greater than, less than, and between, of a certain attribute under a certain category of things, and is used in comparison sentence scenarios.
  • Fig. 15 is a sixth processing flow chart of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • Fig. 15 exemplarily shows a generation method of a comparison type sample, and the comparison type sample can be comparison type data 1512.
  • the synonym set 1504 is filtered according to the third preset filtering condition to obtain attribute synonyms 1506 and entity synonyms 1508 .
  • The comparison word information 1502 in the comparison word set, the attribute synonyms 1506, the entity synonyms 1508, the random numbers 1510, and the preset description words are concatenated to obtain the comparison type data 1512.
  • Comparison type data 1512 for example,
  • “How many years can the M product last at most” can be the generated data, that is, the text corresponding to the comparison type data 1512.
  • “Term” can be a graph attribute, used to indicate that the storage intention of the comparison type data 1512 is the term. "0" can be the starting position of the header entity, "4" can be the ending position of the header entity, "Business” can be the category of the header entity, and "Asset Business” can be the standard name of the header entity.
  • the comparison word information can be synchronized from the comparison word template, such as attributes, comparison words, units, etc.; the attribute synonyms of digital type in the synonym template, such as amount, quota, term, etc., are taken, and the entities of the same type of non-leaf nodes are obtained from the entity synonyms; when generating maximum, minimum, greater than, and less than type data, the attribute synonyms, comparison words, generated random numbers, and units are first spliced, and then the spliced string is randomly selected at a position with the entity synonyms and preset description words for splicing.
  • Numerical single-constraint data can also be composed of a comparison word, a random number, and a unit, for example, "greater than 4.35%" and "less than 5 years".
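  • A hedged sketch of the comparison type generation: the attribute synonym, comparison word, random number, and unit are spliced first and then combined with an entity synonym and a preset description word; all names and the description word used here are assumptions:

    import random

    def build_comparison_sample(entity_syn, attr_syn, comparison_word, unit,
                                description="which one has a"):
        number = random.randint(1, 10)
        # Splice attribute synonym + comparison word + random number + unit first,
        # then combine the spliced string with the entity synonym and description word.
        constraint = f"{attr_syn} {comparison_word} {number} {unit}"
        text = f"{entity_syn} {description} {constraint}"
        return {"text": text, "attribute": attr_syn, "entity": entity_syn}

    print(build_comparison_sample("product M", "term", "less than", "years"))
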
  • Step S806 generating a question classification sample based on the question parsing sample and the question domain set; and generating an entity linking sample based on the question parsing sample and the standard entity library.
  • question classification samples and entity linking samples may use a portion of the question parsing samples generated in step S804.
  • the specific description of the question classification samples in this embodiment can refer to the corresponding content in the above method embodiment.
  • a question classification sample is generated based on the question parsing sample and the problem domain set, including: performing a fourth sampling process on the question parsing sample to obtain a first in-domain sample; generating a corresponding second in-domain sample based on the in-domain question text; generating a corresponding out-of-domain sample based on the out-of-domain question text; and generating the question classification sample based on the first in-domain sample, the second in-domain sample, and the out-of-domain sample.
  • the sampling method of the fourth sampling process may be random sampling or other sampling methods.
  • Figure 16 is a seventh processing flow sub-diagram of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • Figure 16 exemplarily shows a method for generating a question classification sample, and its specific processing flow includes steps S1602 to S1606:
  • Step S1602 extracting first in-domain samples from the question parsing samples, and obtaining second in-domain samples and out-of-domain samples from the problem domain set.
  • Step S1604 adding positive labels to the first in-domain samples and the second in-domain samples, and adding negative labels to the out-of-domain samples.
  • Step S1606 merge all samples and save them into a json file.
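  • The flow of steps S1602 to S1606 might be sketched as follows; the function name, sampling size, and file name are illustrative assumptions:

    import json
    import random

    def build_classification_samples(parsing_samples, in_domain_texts, out_domain_texts, k=2):
        first_in = [s["text"] for s in random.sample(parsing_samples,
                                                     min(k, len(parsing_samples)))]
        rows = [{"text": t, "label": 1} for t in first_in + in_domain_texts]   # positive labels
        rows += [{"text": t, "label": 0} for t in out_domain_texts]            # negative labels
        with open("question_classification.json", "w", encoding="utf-8") as f:
            json.dump(rows, f, ensure_ascii=False, indent=2)                   # merge into a json file
        return rows

    print(build_classification_samples(
        [{"text": "What is the price of product A?"}],
        ["When is the deadline for activity A?"],
        ["Should I go out today?"]))
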
  • an entity linking sample is generated based on the question parsing sample and the standard entity library, including: determining a target parsing sample carrying a non-standard entity based on the question parsing sample; calculating the similarity between the non-standard entity and each standard entity in the standard entity library and sorting them; determining a preset number of target standard entities corresponding to the non-standard entity based on the sorting result; and constructing positive and negative samples corresponding to the non-standard entity based on the non-standard entity and the preset number of target standard entities, and determining the positive and negative samples as entity linking samples.
  • Determining a preset number of target standard entities corresponding to the non-standard entity according to the sorting result may be determining the preset number of standard entities with the greatest similarity as the target standard entities.
  • positive and negative samples corresponding to the non-standard entity are constructed.
  • the positive sample can be constructed based on the non-standard entity and the target standard entity with the greatest similarity
  • the negative sample can be constructed based on the non-standard entity and the target standard entity other than the target standard entity with the greatest similarity.
  • the constructed positive sample is {"text":"What is the transaction time of 'Unit Notice Deposit' [sep] Unit Notice Deposit","label":1}.
  • the constructed negative samples include: {"text":"What is the transaction time of 'Unit Notice Deposit' [sep] Unit Regular Deposit Money","label":0}; {"text":"What is the transaction time of 'Unit Notice Deposit' [sep] Unit Current Deposit Money","label":0}; {"text":"What is the transaction time of 'Unit Notice Deposit' [sep] Unit Agreement Deposit","label":0}; {"text":"What is the transaction time of 'Unit Notice Deposit' [sep] Unit Regular Deposit Book","label":0}.
  • Fig. 17 is an eighth processing flow sub-diagram of a sample generation method of a question-answering system provided in an embodiment of the present application.
  • Fig. 17 exemplarily shows a generation method of entity link samples, and its specific processing flow includes steps S1702 to S1708.
  • Step S1702 randomly select a certain number of target parsing samples carrying non-standard entities from the question parsing samples.
  • Step S1704 Take out the non-standard entity and recall the top 5 most similar standard entities in the standard entity library.
  • Step S1706 constructing positive and negative samples based on the non-standard entities and the recalled standard entities.
  • Step S1708 merge all samples into a json file.
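  • A minimal sketch of steps S1702 to S1708, using plain string similarity purely as a stand-in for whatever similarity measure the system actually uses; all names are assumptions:

    import difflib
    import json

    def build_entity_linking_samples(question, mention, standard_entities, top_k=5):
        # Recall the standard entities most similar to the non-standard mention.
        ranked = sorted(standard_entities,
                        key=lambda e: difflib.SequenceMatcher(None, mention, e).ratio(),
                        reverse=True)[:top_k]
        rows = [{"text": f"{question}[sep]{ranked[0]}", "label": 1}]               # positive pair
        rows += [{"text": f"{question}[sep]{e}", "label": 0} for e in ranked[1:]]  # negative pairs
        return rows

    print(json.dumps(build_entity_linking_samples(
        "What is the transaction time of Unit Notice Deposit",
        "Unit Notice Deposit",
        ["Unit Notice Deposit", "Unit Regular Deposit Money",
         "Unit Current Deposit Money", "Unit Agreement Deposit"]),
        ensure_ascii=False, indent=2))
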
  • Step S808 constructing a training data set for the question answering system based on the question classification samples, question parsing samples, and entity linking samples.
  • This training dataset is used to train the initial question classification model, initial question parsing model, and initial entity linking model in the question answering system.
  • In summary, a pre-configured synonym set, similar question set, comparison word set, problem domain set, and standard entity library are first obtained; the synonym set includes multiple standard words and at least one synonym corresponding to each standard word; the similar question set includes multiple first attributes and at least one similar question sentence corresponding to each first attribute; the comparison word set includes comparison word information; the problem domain set includes in-domain question texts and out-of-domain question texts; and the standard entity library includes multiple standard entities. Then, a question parsing sample is generated according to at least one of the synonym set, the similar question set, and the comparison word set; next, a question classification sample is generated according to the question parsing sample and the problem domain set, and an entity linking sample is generated according to the question parsing sample and the standard entity library; finally, a training data set for the question-answering system is constructed according to the question classification sample, the question parsing sample, and the entity linking sample.
  • question parsing samples, question classification samples and entity linking samples are generated according to pre-configured synonym sets, similar question sets, comparative word sets, problem domain sets and standard entity libraries, so that a large number of training samples can be generated using a small amount of pre-configured data, thereby improving sample generation efficiency and reducing manual workload; on the other hand, question parsing samples can not only be used to train the initial question parsing model, but also can be used to generate question classification samples and entity linking samples, thereby improving data utilization.
  • FIG18 is a processing flow chart of another model training method for a question-answering system provided by the embodiment of the present application. Referring to FIG18 , the processing flow of the model training method for a question-answering system specifically includes steps S1802 to S1804.
  • Step S1802 generating a training data set by using a sample generation method of a question-answering system; the training data set includes question classification samples, question parsing samples, and entity linking samples.
  • the sample generation method of the question-answering system may be the sample generation method of the question-answering system provided by the aforementioned sample generation method embodiment of the question-answering system.
  • Step S1804 input the question classification samples into the initial question classification model in the question-answering system for iterative training to obtain a question classification model; input the question parsing samples into the initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; input the entity linking samples into the initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.
  • FIG. 19 is a processing flow chart of another response method provided in an embodiment of the present application.
  • FIG. 20 is a working principle diagram of a question-answering system provided in an embodiment of the present application. Referring to FIG. 19, the processing flow of the response method specifically includes steps S1902 to S1910.
  • the answering method shown in the embodiment of Figure 19 can be applied to a question-answering system, which may include a question classification model, a question parsing model, and an entity linking model connected in sequence.
  • the output of the question classification model may be the input of the question parsing model, and the output of the question parsing model may be the input of the entity linking model.
  • Step S1902 obtaining the target question to be answered.
  • Step S1904 input the target question into the question classification model for classification processing to obtain the classification result; the question classification model is obtained by inputting the question classification samples in the training data set into the initial question classification model for training; the training data set is generated by the sample generation method of the question answering system.
  • the sample generation method of the question-answering system may be the sample generation method of the question-answering system provided by the aforementioned sample generation method embodiment of the question-answering system.
  • Step S1906 when the classification result is used to characterize that the target question belongs to the first preset classification, the target question is input into the question parsing model for parsing processing to obtain the corresponding target segment; the question parsing model is obtained by inputting the question parsing samples in the training data set into the initial question parsing model for training.
  • Step S1908 inputting the target segment into the entity linking model for prediction processing to obtain the corresponding target entity;
  • the entity linking model is obtained by inputting the entity linking samples in the training data set into the initial entity linking model for training;
  • Step S1910 determining the answer to the target question based on the target entity.
  • the answer to the target question is determined according to the target entity, including: performing slot filling processing according to the target entity to obtain a slot filling result for the target question; and querying a corresponding answer in a pre-configured knowledge graph according to the slot filling result to obtain the answer to the target question.
  • In an example of the slot filling processing, based on the target entity obtained in step S1908, at least one slot template that may correspond to the target entity is first determined; then, for each slot template, slot filling processing is performed on the slot template through the target entity to obtain a filled slot template, and the filled slot template is determined as the slot filling result corresponding to the target question.
  • the slot template in this embodiment is the same as the slot template in the aforementioned method embodiment, and the specific scheme in the aforementioned method embodiment can be referred to.
  • the method comprises: if the slot filling result cannot be used to query the knowledge graph to obtain the unique corresponding answer, determining the slot missing information corresponding to the slot filling result; generating a rhetorical question according to the slot missing information; receiving user input in response to the rhetorical question; performing slot filling processing according to the user input; and querying the corresponding answer in a pre-configured knowledge graph according to the slot filling result after the slot filling processing to obtain the answer to the target question.
  • the slot filling result of the target question includes the filled slot template 1: What is the (attribute slot) of (product A)?
  • the filled slot template 1 cannot be used to query in the knowledge graph to obtain a unique corresponding answer. It can be determined that the slot missing information corresponding to the filled slot template 1 is attribute missing. According to the slot missing information "attribute missing", a corresponding rhetorical question "What information do you want to inquire about product A?” is generated. The user input "price" is received in response to the rhetorical question.
  • the slot filling process is performed according to the user input, that is, the attribute slot in "What is the (attribute slot) of (product A)” is filled to obtain the slot template 1 after secondary filling: What is the (price) of (product A)?
  • the answer corresponding to "What is the price of product A” is queried in the pre-configured knowledge graph to obtain the answer to the target question.
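  • A hedged sketch of this multi-round slot filling strategy, with the knowledge graph mocked as a dictionary and all names chosen only for illustration:

    # The knowledge graph is mocked as a dict keyed by (entity, attribute).
    KG = {("product A", "price"): "9.9 yuan"}

    def answer(entity=None, attribute=None, ask=input):
        while True:
            if entity is None:
                entity = ask("What product do you want to consult? ")        # entity slot missing
            elif attribute is None:
                attribute = ask(f"What information do you want to inquire about {entity}? ")
            else:
                result = KG.get((entity, attribute))
                if result is not None:
                    return result                                            # unique answer found
                attribute = None                                             # keep asking back

    print(answer(entity="product A", ask=lambda prompt: "price"))
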
  • the OOD model is first used to determine whether the user's question is an out-of-domain question or an in-domain question. If it is an in-domain question, the process continues; if it is an out-of-domain question, the system will uniformly reply with the pre-configured out-of-domain question response script.
  • the question parsing model is used to identify slots in the domain question, including but not limited to: entities, attributes, relations, constraints, etc.
  • entity recognition is completed through the question parsing model, one or more entity fragments will be obtained.
  • the entity node corresponding to the currently recognized entity fragment can be recalled, and the entity fragment can be linked to the unique entity in the knowledge graph using the entity linking model.
  • DM stands for Dialog Management.
  • Users can personalize the relevant counter-question scripts in the counter-question template.
  • For example, if the entity is missing, DM can ask: "What product do you want to consult?" If the user replies with, for example, "parking space loan" and the attribute is still missing, DM can ask: "What specific questions do you want to consult about the parking space loan?"
  • the slot filling process can be performed through multiple rounds of this strategy until the knowledge graph can be queried to return the unique answer.
  • the OOD model can be obtained by iteratively training the initial OOD model based on the OOD training set;
  • the question parsing model can be obtained by iteratively training the initial question parsing model based on the question parsing training set;
  • the entity linking model can be obtained by iteratively training the initial entity linking model based on the entity linking training set.
  • the question parsing training set can be generated based on synonym templates, similar question templates, comparative word templates and rhetorical question templates;
  • the entity linking training set can be generated based on the standard entity library and the question parsing training set;
  • the OOD training set can be generated based on the OOD template and the question parsing training set.
  • the synonym template, similar question template, comparative word template, OOD template and rhetorical question template can be generated based on the customer knowledge base.
  • a sample generation method of a question-answering system is provided.
  • an embodiment of the present application also provides a sample generation device of a question-answering system, which is described below in conjunction with the accompanying drawings.
  • FIG21 is a schematic diagram of a sample generation device of a question-answering system provided in an embodiment of the present application.
  • the present embodiment provides a sample generation device 2100 for a question-answering system, comprising: a third acquisition unit 2102, used to acquire a pre-configured synonym set, a similar question set, a comparison word set, a problem domain set, and a standard entity library; the synonym set includes a plurality of standard words and at least one synonym corresponding to each of the standard words; the similar question set includes a plurality of first attributes and at least one similar question sentence corresponding to each of the first attributes; the comparison word set includes comparison word information; the problem domain set includes an in-domain question text and an out-of-domain question text; the standard entity library includes a plurality of standard entities; a first generation unit 2104, used to generate a question parsing sample according to at least one of the synonym set, the similar question set, and the comparison word set; a second generation unit 2106, used to generate a question classification sample according to the question parsing sample and the problem domain set, and to generate an entity linking sample according to the question parsing sample and the standard entity library.
  • the first generation unit 2104 is specifically configured to: filter the synonym set according to the first preset filtering condition to obtain a candidate word list; the candidate word list includes candidate standard words and candidate synonyms corresponding to the candidate standard words; perform a first sampling process on the candidate word list to obtain a single-slot sample, and determine the single-slot sample as the question parsing sample; or, perform a second sampling process on multiple first attributes in the similar question set to obtain a target first attribute; perform a third sampling process on at least one similar question sentence corresponding to the target first attribute to obtain an initial similar question sentence; and, based on the initial similar question sentence, determine a corresponding single-attribute sample, and determine the single-attribute sample as the question parsing sample.
  • the first generating unit 2104 is specifically used to: filter the synonym set according to the second preset filtering condition to obtain a candidate entity word list; the candidate entity word list includes multiple candidate entity words; screen the similar question set to obtain an intermediate similar question set; the intermediate similar question set includes multiple first attributes and at least one candidate similar question sentence carrying a mask corresponding to each of the first attributes; determine the target candidate entity words and target candidate similar questions corresponding to the same attribute category according to the candidate word list and the intermediate similar question set; replace the mask in the target candidate similar question sentence with the target candidate entity word to obtain a composite sample, and determine the composite sample as the question parsing sample; or, filter the synonym set according to the third preset filtering condition to obtain attribute synonyms and entity synonyms; generate random numbers; splice the comparison word information, the attribute synonyms, the entity synonyms, the random numbers and the preset description words to obtain a comparison type sample, and determine the comparison type sample as the question parsing sample.
  • the second generation unit 2106 is specifically used to: perform a fourth sampling process on the question parsing sample to obtain a first in-domain sample; generate a corresponding second in-domain sample based on the in-domain question text; generate a corresponding out-of-domain sample based on the out-of-domain question text; generate the question classification sample based on the first in-domain sample, the second in-domain sample and the out-of-domain sample.
  • the second generation unit 2106 is specifically used to: determine a target parsing sample carrying a non-standard entity based on the problem parsing sample; calculate the similarity between the non-standard entity and each of the standard entities in the standard entity library and sort them; determine a preset number of target standard entities corresponding to the non-standard entity based on the sorting result; construct positive and negative samples corresponding to the non-standard entity based on the non-standard entity and the preset number of target standard entities, and determine the positive and negative samples as the entity link samples.
  • the sample generation device of the question-answering system includes a third acquisition unit for acquiring a pre-configured synonym set, a similar question set, a comparison word set, a problem domain set, and a standard entity library;
  • the synonym set includes a plurality of standard words and at least one synonym corresponding to each standard word;
  • the similar question set includes a plurality of first attributes and at least one similar question sentence corresponding to each first attribute.
  • the question parsing sample can not only be used for the training of the initial question parsing model, but can also be used to generate a question classification sample and an entity linking sample, thereby improving the data utilization rate.
  • a model training method for a question-answering system is provided.
  • an embodiment of the present application also provides a model training device for a question-answering system, which is described below in conjunction with the accompanying drawings.
  • Figure 22 is a schematic diagram of a model training device for another question-answering system provided in an embodiment of the present application.
  • This embodiment provides a model training device 2200 for a question-answering system, comprising: a third generation unit 2202, used to generate a training data set by a sample generation method of the question-answering system; the training data set includes the question classification sample, the question parsing sample and the entity linking sample; a second training unit 2204, used to input the question classification sample into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; input the question parsing sample into the initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; input the entity linking sample into the initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.
  • the model training device of the question-answering system includes a third generation unit, which is used to generate a training data set through a sample generation method of the question-answering system; the training data set includes question classification samples, question parsing samples and entity linking samples; the second training unit is used to input the question classification samples into the initial question classification model in the question-answering system for iterative training to obtain a question classification model; input the question parsing samples into the initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; input the entity linking samples into the initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.
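Since the three models are trained in the same iterative fashion, a single generic PyTorch-style loop suffices to illustrate the training step; this is a sketch under the assumption that each model is a torch module with a compatible dataloader and loss function:

```python
import torch

def train_model(model, dataloader, loss_fn, epochs=3, lr=2e-5):
    """Generic iterative training loop applied in turn to the initial question
    classification, question parsing and entity linking models."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for inputs, labels in dataloader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), labels)
            loss.backward()
            optimizer.step()
    return model
```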
  • question parsing samples, question classification samples and entity linking samples are generated according to the pre-configured synonym sets, similar question sets, comparative word sets, problem domain sets and standard entity libraries.
  • On the one hand, a large number of training samples can be generated using a small amount of pre-configured data, which improves sample generation efficiency and reduces manual workload; on the other hand, question parsing samples can not only be used to train the initial question parsing model, but also to generate question classification samples and entity linking samples, which improves data utilization. On this basis, a large number of training samples can be efficiently generated for model training, thereby improving the response accuracy of the question-answering system.
  • FIG. 23 is a schematic diagram of another response device provided in an embodiment of the present application.
  • This embodiment provides an answering device 2300, including: a fourth acquisition unit 2302, used to acquire a target question to be answered; a classification unit 2304, used to input the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training; the training data set is generated by a sample generation method of a question-answering system; a second parsing unit 2306, used to input the target question into a question parsing model for parsing processing to obtain a corresponding target segment when the classification result is used to characterize that the target question belongs to a first preset classification; the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; a prediction unit 2308, used to input the target segment into an entity linking model for prediction processing to obtain a corresponding target entity; the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; and a second determination unit 2310, used to determine the answer to the target question according to the target entity.
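The cooperation of these units can be sketched as the following pipeline; the classification label string and the fallback behaviour for questions outside the first preset classification are assumptions made for illustration:

```python
def answer_question(target_question, classify, parse, link, query_kg, fallback):
    """End-to-end answering flow: classify the question, and only for the first
    preset classification run parsing, entity linking and knowledge-graph lookup."""
    if classify(target_question) != "first_preset_classification":
        return fallback(target_question)       # other classifications handled elsewhere
    target_segment = parse(target_question)    # question parsing model
    target_entity = link(target_segment)       # entity linking model
    return query_kg(target_entity)             # answer determined from the target entity
```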
  • the second determination unit 2310 is specifically used to: perform slot filling processing according to the target entity to obtain a slot filling result for the target question; and query a corresponding answer in a pre-configured knowledge graph according to the slot filling result to obtain an answer to the target question.
  • the second determination unit 2310 is also specifically used to: if the slot filling result cannot be used to query the knowledge graph to obtain a unique corresponding answer, then determine the slot missing information corresponding to the slot filling result; generate a rhetorical question based on the slot missing information; receive user input in response to the rhetorical question; perform slot filling processing based on the user input; and query the corresponding answer in a pre-configured knowledge graph based on the slot filling result after the slot filling processing to obtain the answer to the target question.
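A hedged sketch of the slot-filling loop with rhetorical (follow-up) questions; the required slot list, the wording of the follow-up question and the assumption that the knowledge-graph query returns a list of candidate answers are illustrative:

```python
def fill_slots_and_query(slots, required_slots, query_kg, ask_user):
    """If the current slot filling result does not yield a unique answer, derive the
    missing-slot information, ask a follow-up question, merge the user's reply into
    the slots, and query the knowledge graph again."""
    answers = query_kg(**slots)
    while len(answers) != 1:
        missing = [s for s in required_slots if s not in slots]
        if not missing:
            break  # cannot narrow the query down any further
        slot = missing[0]
        slots[slot] = ask_user(f"Could you specify the {slot} you are asking about?")
        answers = query_kg(**slots)
    return answers[0] if answers else None
```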
  • the answering device includes: a fourth acquisition unit, used to acquire a target question to be answered; a classification unit, used to input the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting the question classification samples in the training data set into the initial question classification model for training; the training data set is generated by the sample generation method of the question-answering system; the second parsing unit is used to input the target question into the question parsing model for parsing processing when the classification result is used to characterize that the target question belongs to the first preset classification, so as to obtain the corresponding target segment; the question parsing model is obtained by inputting the question parsing samples in the training data set into the initial question parsing model for training; the prediction unit is used to input the target segment into the entity linking model for prediction processing to obtain the corresponding target entity; the entity linking model is obtained by inputting the entity linking samples in the training data set into the initial entity linking model for training; the second determination unit is used to determine the answer to the target question according to the target entity.
  • question parsing samples, question classification samples and entity linking samples are generated according to the pre-configured synonym sets, similar question sets, comparative word sets, problem domain sets and standard entity libraries, so that a large number of training samples can be generated using a small amount of pre-configured data, thereby improving the efficiency of sample generation and reducing manual workload;
  • question parsing samples can not only be used to train the initial question parsing model, but also can be used to generate question classification samples and entity linking samples, thereby improving data utilization.
  • a large number of training samples can be efficiently generated for model training, thereby improving the response accuracy of the question and answer system.
  • an embodiment of the present application also provides an electronic device, which is used to execute the model training method of the question-answering system provided above, or, the answering method provided above.
  • Figure 24 is a structural schematic diagram of an electronic device provided in an embodiment of the present application.
  • the electronic device may differ considerably depending on its configuration or performance, and may include one or more processors 2401 and a memory 2402; the memory 2402 may store one or more applications or data.
  • the memory 2402 may be a short-term storage or a persistent storage.
  • the application stored in the memory 2402 may include one or more modules (not shown in the figure), and each module may include a series of computer executable instructions in the electronic device.
  • the processor 2401 may be configured to communicate with the memory 2402 to execute a series of computer executable instructions in the memory 2402 on the electronic device.
  • the electronic device may also include one or more power supplies 2403, one or more wired or wireless network interfaces 2404, one or more input/output interfaces 2405, one or more keyboards 2406, and the like.
  • the electronic device includes a memory and one or more programs, wherein the one or more programs are stored in the memory; each program may include one or more modules, and each module may include a series of computer executable instructions for the electronic device and is configured to be executed by one or more processors.
  • the one or more programs include the following computer executable instructions: obtaining a question text sample; inputting the question text sample into the initial question parsing model for iterative training to obtain a question parsing model;
  • the initial question parsing model includes a first encoding layer and a conversion layer;
  • the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector;
  • the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each of the initial intention vectors according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment;
  • the text fragment is used to query the answer to the question text sample in the question and answer system.
  • the electronic device includes a memory and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer executable instructions for the electronic device, and the one or more programs are configured to be executed by one or more processors, and include the following computer executable instructions: obtaining a preconfigured synonym set, a similar question set, a comparison word set, a problem domain set, and a standard entity library; the synonym set includes a plurality of standard words and at least one synonym corresponding to each of the standard words; the similar question set includes multiple first attributes and at least one similar question sentence corresponding to each of the first attributes; the comparison word set includes comparison word information; the problem domain set includes in-domain question text and out-of-domain question text; the standard entity library includes multiple standard entities; a question parsing sample is generated based on at least one of the synonym set, the similar question set and the comparison word set; a question classification sample is generated based on the question parsing sample and the problem domain set; the entity linking sample is generated based on the question parsing sample and the standard entity library; and a training data set for the question-answering system is constructed based on the question classification sample, the question parsing sample and the entity linking sample.
  • the electronic device includes a memory and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer executable instructions for the electronic device, and the one or more programs are configured to be executed by one or more processors, and include the following computer executable instructions: obtaining a target question to be answered; inputting the target question into a question parsing model for parsing processing to obtain a corresponding target segment; the question parsing model is obtained by training through the model training method of the question-answering system; and the answer to the target question is determined according to the target segment.
  • the electronic device includes a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer executable instructions in the electronic device, and the one or more programs are configured to be executed by one or more processors, including computer executable instructions for performing the following: generating a training data set by a sample generation method of a question-answering system; the training data set includes the question classification sample, the question parsing sample and the entity linking sample; inputting the question classification sample into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; inputting the question parsing sample into an initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; inputting the entity linking sample into an initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.
  • the electronic device includes a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer executable instructions in the electronic device, and the one or more programs are configured to be executed by one or more processors to include the following computer executable instructions: obtaining a target question to be answered; inputting the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training; the training data set is generated by a sample generation method of a question-answering system; when the classification result is used to characterize that the target question belongs to a first preset classification, inputting the target question into a question parsing model for parsing processing to obtain a corresponding target segment; the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; inputting the target segment into an entity linking model for prediction processing to obtain a corresponding target entity; the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; and determining the answer to the target question according to the target entity.
  • an embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium is used to store computer-executable instructions.
  • when the computer-executable instructions are executed by a processor, the following process is implemented: obtaining a question text sample; inputting the question text sample into the initial question parsing model for iterative training to obtain a question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each of the initial intention vectors according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.
  • an embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium provided in this embodiment is used to store computer-executable instructions.
  • when the computer-executable instructions are executed by a processor, the following process is implemented: obtaining a target question to be answered; inputting the target question into a question parsing model for parsing processing to obtain a corresponding target segment; the question parsing model is obtained by training using a model training method of a question-answering system; and determining an answer to the target question based on the target segment.
  • an embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium is used to store computer-executable instructions.
  • when the computer-executable instructions are executed by a processor, the following process is implemented: obtaining a pre-configured synonym set, a similar question set, a comparison word set, a problem domain set, and a standard entity library;
  • the synonym set includes a plurality of standard words and at least one synonym corresponding to each of the standard words;
  • the similar question set includes a plurality of first attributes and at least one similar question sentence corresponding to each of the first attributes;
  • the comparison word set includes comparison word information;
  • the problem domain set includes in-domain question text and out-of-domain question text;
  • the standard entity library includes a plurality of standard entities; a question parsing sample is generated based on at least one of the synonym set, the similar question set and the comparison word set; a question classification sample is generated based on the question parsing sample and the problem domain set; the entity linking sample is generated based on the question parsing sample and the standard entity library; and a training data set for the question-answering system is constructed based on the question classification sample, the question parsing sample and the entity linking sample.
  • an embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium provided in this embodiment is used to store computer-executable instructions.
  • when the computer-executable instructions are executed by a processor, the following process is implemented: a training data set is generated by a sample generation method of a question-answering system; the training data set includes the question classification samples, the question parsing samples and the entity linking samples; the question classification samples are input into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; the question parsing samples are input into an initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; the entity linking samples are input into an initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.
  • an embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium provided in this embodiment is used to store computer-executable instructions, and the computer-executable instructions implement the following process when executed by the processor: obtaining a target question to be answered; inputting the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training; the training data set is generated by a sample generation method of a question-answering system; when the classification result is used to characterize that the target question belongs to a first preset classification, inputting the target question into a question parsing model for parsing processing to obtain a corresponding target segment; the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training;
  • the target segment is input into the entity linking model for prediction processing to obtain the corresponding target entity; the entity linking model is obtained by inputting the entity linking samples in the training data set into the initial entity linking model for training; and the answer to the target question is determined according to the target entity.
  • the embodiments of the present application may be provided as methods, systems or computer program products. Therefore, the embodiments of the present application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this specification may adopt the form of a computer program product implemented on one or more computer-readable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
  • These computer program instructions can also be loaded into a computer or other programmable device so that a series of operation steps are executed on the computer or other programmable device to produce a computer-implemented process, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • a computing device includes one or more processors (CPU), input/output interfaces, network interfaces, and memory.
  • Memory may include non-permanent storage in a computer-readable medium, in the form of random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer readable media include permanent and non-permanent, removable and non-removable media that can be implemented by any method or technology to store information.
  • Information can be computer readable instructions, data structures, program modules or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic cassettes, disk storage or other magnetic storage devices or any other non-transmission media that can be used to store information that can be accessed by a computing device.
  • as defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
  • Embodiments of the present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • One or more embodiments of the present specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected via a communications network.
  • program modules may be located in local and remote computer storage media, including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the present disclosure provide a question answering system model training method, a sample generation method and apparatus, an electronic device, and a storage medium, wherein the question answering system model training method comprises: obtaining a question text sample; inputting the question text sample into an initial question analysis model, performing iterative training, and obtaining a question analysis model; the initial question analysis model comprising a first encoding layer and a conversion layer; the first encoding layer being used for performing encoding in accordance with the question text sample, so as to obtain a corresponding first sentence vector; the conversion layer being used for generating a preset number of initial intention vectors when the first sentence vector is received, filling each of the initial intention vectors in accordance with the first sentence vector, obtaining a corresponding target intention vector, and converting the target intention vector into a corresponding text segment; and the text segment being used for querying the question answering system for an answer to the question text sample.

Description

Model training method and sample generation method for question answering system

Cross-reference

This application claims priority to the Chinese patent application filed with the Chinese Patent Office on March 14, 2023, with application number 202310247102.2, entitled "Sample generation method, device, electronic device and storage medium for question and answer system", and to the Chinese patent application filed with the Chinese Patent Office on March 14, 2023, with application number 202310247092.2, entitled "Model training method, device, electronic device and storage medium for question and answer system", the entire contents of which are incorporated by reference into this application.

Technical Field

The present application relates to the field of knowledge graph technology, and in particular to a model training method, sample generation method, device, electronic device and storage medium for a question-answering system.

Background Art

With the development of Internet technology, the application of question-answering systems has become more and more popular. Automatically answering questions raised by users through question-answering systems can improve answering efficiency and save human resources. However, in actual applications, the questions raised by users may be colloquial, making the sentence structure rich and diverse, which increases the difficulty of intent recognition in question-answering systems. In addition, in actual applications, question-answering systems involving knowledge graphs are often highly customized, and the amount of sample data required for model training of such question-answering systems is very large. If training samples are manually configured, the efficiency is low and it is difficult to meet the model training requirements.

Summary of the Invention

The present application provides a model training method, sample generation method, device, electronic device and storage medium for a question-answering system to improve the intent recognition capability and sample generation efficiency of the question-answering system.

In one aspect, the present application provides a model training method for a question-answering system, wherein the question-answering system includes an initial question parsing model; the method includes: obtaining a question text sample; and inputting the question text sample into the initial question parsing model for iterative training to obtain a question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each of the initial intention vectors according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.

In one aspect, the present application provides an answering method, including: obtaining a target question to be answered; inputting the target question into a question parsing model for parsing processing to obtain a corresponding target segment, the question parsing model being obtained by training through the model training method for the question-answering system described above; and determining the answer to the target question according to the target segment.

In one aspect, the present application provides a model training device for a question-answering system, wherein the question-answering system includes an initial question parsing model; the device includes: a first acquisition unit, used to obtain a question text sample; and a first training unit, used to input the question text sample into the initial question parsing model for iterative training to obtain a question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each of the initial intention vectors according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.

In one aspect, the present application provides an answering device, including: a second acquisition unit, used to obtain a target question to be answered; a first parsing unit, used to input the target question into a question parsing model for parsing processing to obtain a corresponding target segment, the question parsing model being obtained by training through the model training method for the question-answering system described above; and a first determination unit, used to determine the answer to the target question according to the target segment.

In one aspect, the present application provides a sample generation method for a question-answering system, the method including: obtaining a pre-configured synonym set, a similar question set, a comparison word set, a problem domain set, and a standard entity library; the synonym set includes a plurality of standard words and at least one synonym corresponding to each of the standard words; the similar question set includes a plurality of first attributes and at least one similar question sentence corresponding to each of the first attributes; the comparison word set includes comparison word information; the problem domain set includes in-domain question text and out-of-domain question text; the standard entity library includes a plurality of standard entities; generating a question parsing sample according to at least one of the synonym set, the similar question set and the comparison word set; generating a question classification sample according to the question parsing sample and the problem domain set; generating the entity linking sample according to the question parsing sample and the standard entity library; and constructing a training data set for the question-answering system according to the question classification sample, the question parsing sample and the entity linking sample.

In one aspect, the present application provides a model training method for a question-answering system, including: generating a training data set through the sample generation method for the question-answering system described above; the training data set includes the question classification samples, the question parsing samples and the entity linking samples; inputting the question classification samples into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; inputting the question parsing samples into an initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; and inputting the entity linking samples into an initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.

In one aspect, the present application provides an answering method, including: obtaining a target question to be answered; inputting the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training; the training data set is generated by the sample generation method for the question-answering system described above; when the classification result is used to characterize that the target question belongs to a first preset classification, inputting the target question into a question parsing model for parsing processing to obtain a corresponding target segment; the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; inputting the target segment into an entity linking model for prediction processing to obtain a corresponding target entity; the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; and determining the answer to the target question according to the target entity.

In one aspect, the present application provides a sample generation device for a question-answering system, the device including: a third acquisition unit, used to obtain a pre-configured synonym set, a similar question set, a comparison word set, a problem domain set, and a standard entity library; the synonym set includes a plurality of standard words and at least one synonym corresponding to each of the standard words; the similar question set includes a plurality of first attributes and at least one similar question sentence corresponding to each of the first attributes; the comparison word set includes comparison word information; the problem domain set includes in-domain question text and out-of-domain question text; the standard entity library includes a plurality of standard entities; a first generation unit, used to generate a question parsing sample according to at least one of the synonym set, the similar question set and the comparison word set; a second generation unit, used to generate a question classification sample according to the question parsing sample and the problem domain set, and to generate the entity linking sample according to the question parsing sample and the standard entity library; and a construction unit, used to construct a training data set for the question-answering system according to the question classification sample, the question parsing sample and the entity linking sample.

In one aspect, the present application provides a model training device for a question-answering system, including: a third generation unit, used to generate a training data set through the sample generation method for the question-answering system described above; the training data set includes the question classification samples, the question parsing samples and the entity linking samples; and a second training unit, used to input the question classification samples into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; input the question parsing samples into the initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; and input the entity linking samples into the initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.

In one aspect, the present application provides an answering device, including: a fourth acquisition unit, used to obtain a target question to be answered; a classification unit, used to input the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training; the training data set is generated by the sample generation method for the question-answering system described above; a second parsing unit, used to input the target question into a question parsing model for parsing processing to obtain a corresponding target segment when the classification result is used to characterize that the target question belongs to a first preset classification; the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; a prediction unit, used to input the target segment into an entity linking model for prediction processing to obtain a corresponding target entity; the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; and a second determination unit, used to determine the answer to the target question according to the target entity.

In one aspect, the present application provides an electronic device, including: a processor; and a memory configured to store computer-executable instructions which, when executed, cause the processor to execute the model training method for the question-answering system described above, or the answering method described above, or the sample generation method for the question-answering system described above.

In one aspect, the present application provides a computer-readable storage medium for storing computer-executable instructions which, when executed by a processor, implement the model training method for the question-answering system described above, or the answering method described above, or the sample generation method for the question-answering system described above.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to more clearly illustrate the technical solutions in the embodiments of the present application or in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in this specification, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative labor.

FIG. 1 is a processing flow chart of a model training method for a question-answering system provided in an embodiment of the present application;

FIG. 2 is a data flow diagram of a question parsing model in a model training method for a question-answering system provided in an embodiment of the present application;

FIG. 3 is a data flow diagram of an entity linking model in a model training method for a question-answering system provided in an embodiment of the present application;

FIG. 4 is a processing flow chart of an answering method provided in an embodiment of the present application;

FIG. 5 is a flow chart of session management in an answering method provided in an embodiment of the present application;

FIG. 6 is a schematic diagram of a model training device for a question-answering system provided in an embodiment of the present application;

FIG. 7 is a schematic diagram of an answering device provided in an embodiment of the present application;

FIG. 8 is a processing flow chart of a sample generation method for a question-answering system provided in an embodiment of the present application;

FIG. 9 is a schematic diagram of the framework of a question-answering system provided in an embodiment of the present application;

FIG. 10 is a first processing flow sub-chart of a sample generation method for a question-answering system provided in an embodiment of the present application;

FIG. 11 is a second processing flow sub-chart of a sample generation method for a question-answering system provided in an embodiment of the present application;

FIG. 12 is a third processing flow sub-chart of a sample generation method for a question-answering system provided in an embodiment of the present application;

FIG. 13 is a fourth processing flow sub-chart of a sample generation method for a question-answering system provided in an embodiment of the present application;

FIG. 14 is a fifth processing flow sub-chart of a sample generation method for a question-answering system provided in an embodiment of the present application;

FIG. 15 is a sixth processing flow sub-chart of a sample generation method for a question-answering system provided in an embodiment of the present application;

FIG. 16 is a seventh processing flow sub-chart of a sample generation method for a question-answering system provided in an embodiment of the present application;

FIG. 17 is an eighth processing flow sub-chart of a sample generation method for a question-answering system provided in an embodiment of the present application;

FIG. 18 is a processing flow chart of another model training method for a question-answering system provided in an embodiment of the present application;

FIG. 19 is a processing flow chart of another answering method provided in an embodiment of the present application;

FIG. 20 is a working principle diagram of a question-answering system provided in an embodiment of the present application;

FIG. 21 is a schematic diagram of a sample generation device for a question-answering system provided in an embodiment of the present application;

FIG. 22 is a schematic diagram of another model training device for a question-answering system provided in an embodiment of the present application;

FIG. 23 is a schematic diagram of another answering device provided in an embodiment of the present application;

FIG. 24 is a structural schematic diagram of an electronic device provided in an embodiment of the present application.

DETAILED DESCRIPTION

In order to enable those skilled in the art to better understand the technical solutions in the embodiments of the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of this specification, not all of them. Based on the embodiments of the present application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the scope of protection of the present application.

In the actual application of a question-answering system, on the one hand, the questions raised by users may be colloquial or one-sided, resulting in rich and diverse sentence structures, which increases the difficulty of intent recognition for the question-answering system; on the other hand, the information provided by users when asking questions may be incomplete, and if the system answers only on the basis of a question with insufficient information, it is very likely that no satisfactory answer can be found for the user. In order to solve the above problems, an embodiment of the present application provides a model training method for a question-answering system.

FIG. 1 is a processing flow chart of a model training method for a question-answering system provided in an embodiment of the present application. The model training method of FIG. 1 can be executed by an electronic device, which can be a terminal device, such as a mobile phone, a laptop computer, an intelligent interactive device, etc.; or, the electronic device can also be a server, such as an independent physical server, a server cluster, or a cloud server capable of cloud computing. Referring to FIG. 1, the model training method for the question-answering system provided in this embodiment specifically includes steps S102 to S104.

The question-answering system may include an initial question parsing model, and may also include an initial question classification model and an initial entity linking model.

The initial question classification model may be an untrained OOD (out of the design scope) model.

In specific implementation, a problem domain can be configured in advance; questions within the problem domain are in-domain questions, and questions outside the problem domain are out-of-domain questions. The question-answering system has the ability to answer in-domain questions, but does not have the ability to answer out-of-domain questions. The OOD model can be used to classify received questions, and the classification results include at least: in-domain questions and out-of-domain questions.
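The embodiment trains a dedicated OOD classification model; the sketch below only illustrates the in-domain/out-of-domain decision with an embedding-similarity threshold, which is a stand-in for that classifier, not the claimed method; the encoder, the in-domain centroid vector and the threshold are assumptions:

```python
import numpy as np

def classify_question(question, encode, in_domain_centroid, threshold=0.5):
    """Embed the question with any sentence encoder and compare it with a
    representative in-domain vector; below the threshold the question is
    treated as out-of-domain."""
    v = np.asarray(encode(question), dtype=float)
    c = np.asarray(in_domain_centroid, dtype=float)
    score = float(v @ c / (np.linalg.norm(v) * np.linalg.norm(c) + 1e-9))
    return "in_domain" if score >= threshold else "out_of_domain"
```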

The initial question parsing model may be an untrained relation extraction model. The relation extraction model can be used to perform slot recognition processing to obtain slot recognition results, which include but are not limited to: entities, attributes, relations, constraints, and so on.

An entity can be a person's name, a place name, an organization name, a preset proper noun, and so on. A concept, also known as a class, is an abstract description of a collection of objects with the same characteristics. Concepts can be used to reflect categories. One concept can correspond to multiple entities.

For example, the concept "plant" can correspond to the following entities: "willow", "cactus", "cherry blossom", and so on.

Attributes can be used to reflect the characteristics of an entity. One concept can correspond to multiple attributes.

For example, the concept "plant" can correspond to the following attributes: "name", "type", "shape", "growing environment", "distribution range", "propagation method", and so on.

For each attribute, the attribute name, attribute type and attribute description can be configured in advance.

Attribute types include but are not limited to: text, number, picture, rich text, and JavaScript Object Notation (json).
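The pre-configured ontology information (concepts, and attributes with a name, type and description) can be represented, for illustration only, with simple data classes; the field and class names are assumptions:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Attribute:
    name: str             # attribute name
    type: str             # text, number, picture, rich text, json, ...
    description: str = "" # attribute description

@dataclass
class Concept:
    name: str
    attributes: List[Attribute] = field(default_factory=list)

plant = Concept(
    name="plant",
    attributes=[
        Attribute("name", "text", "common name of the plant"),
        Attribute("growing environment", "text"),
        Attribute("distribution range", "text"),
    ],
)
print(plant)
```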

A relation is used to describe the connection between concepts, and is further divided into classification relations and non-classification relations. In practical applications, corresponding relations can be customized according to the specific field and specific application, for example a "causal relation".

A constraint can be a restriction condition. For example, in the question "What are the services with an annual interest rate less than x%?", the slot recognition result of "annual interest rate less than x%" is a constraint.

The initial entity linking model may be an untrained Bert (Bidirectional Encoder Representation from Transformers) classification model. The Bert classification model can be used to perform classification prediction processing, linking an entity fragment input into the model to an entity node in a pre-configured knowledge graph.

The knowledge graph includes multiple entity nodes, and each entity node corresponds to an entity. The knowledge graph can be constructed by extracting knowledge from an ontology and a corpus to form triples, turning business data into knowledge.
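A toy sketch of such a triple-based knowledge graph, for illustration only; the storage structure and query interface are assumptions rather than the actual graph database used:

```python
class KnowledgeGraph:
    """Minimal triple store: (head entity, relation/attribute, tail) triples,
    queried by entity and attribute."""
    def __init__(self):
        self.triples = set()

    def add(self, head, relation, tail):
        self.triples.add((head, relation, tail))

    def query(self, entity, attribute):
        return [t for h, r, t in self.triples if h == entity and r == attribute]

kg = KnowledgeGraph()
kg.add("cherry blossom", "growing environment", "temperate regions")
print(kg.query("cherry blossom", "growing environment"))
```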

Step S102: obtaining a question text sample.

The question text sample may be a pre-generated question parsing sample. Each question text sample may be a word, a word combination, an incomplete sentence, a complete sentence, and so on.

The question parsing sample can be generated in the following way:

obtaining a pre-configured synonym set, a similar question set and a comparison word set; the synonym set includes multiple standard words and at least one synonym corresponding to each standard word; the similar question set includes multiple first attributes and at least one similar question sentence corresponding to each first attribute; the comparison word set includes comparison word information; and generating a question parsing sample according to at least one of the synonym set, the similar question set and the comparison word set.

The specific implementation of generating question parsing samples in this embodiment involves the sample generation method of the question-answering system, and reference may be made to the specific solutions in the following method embodiments.

Step S104: inputting the question text sample into the initial question parsing model for iterative training to obtain a question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each initial intention vector according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.

It should be noted that the structure of the initial question parsing model is exactly the same as that of the question parsing model obtained after iterative training; only the model parameters involved in training differ. The question parsing model obtained after iterative training likewise has a first encoding layer and a conversion layer.

第一编码层可以是Bert编码器,也可以是其他可以用于将文本转换为句式向量的组件。转换层可以由空间转换层和多个二分类全连接层构成。The first encoding layer can be a Bert encoder or other components that can be used to convert text into sentence vectors. The conversion layer can be composed of a spatial conversion layer and multiple binary classification fully connected layers.

第一编码层的输出可以是转换层的输入。The output of the first coding layer can be the input of the transformation layer.
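The following is a minimal sketch, assuming a PyTorch/transformers setup, of a parsing model with such a first encoding layer and conversion layer; the layer sizes, the number of intention spaces and the tag label count are illustrative assumptions rather than values specified by this application:

```python
# Minimal sketch of the parsing model structure described above (Bert encoder + spatial transform
# + binary classification fully connected layers). Sizes and counts are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import BertModel

class QuestionParsingModel(nn.Module):
    def __init__(self, num_intent_spaces: int = 4, num_tag_labels: int = 5,
                 bert_name: str = "bert-base-chinese"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(bert_name)            # first encoding layer (downloads weights)
        hidden = self.encoder.config.hidden_size
        # "Conversion layer": a spatial transform plus two classification heads.
        self.space_transform = nn.Linear(hidden, hidden * num_intent_spaces)
        self.intent_head = nn.Linear(hidden, num_intent_spaces)        # per-space intent yes/no from [cls]
        self.tag_head = nn.Linear(hidden, num_tag_labels)              # per-character entity/constraint tags
        self.num_intent_spaces = num_intent_spaces

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        cls_vec, char_vecs = out[:, 0], out[:, 1:]                     # [cls] vector and character vectors
        spaces = self.space_transform(cls_vec)                         # one sub-space per intention space
        spaces = spaces.view(-1, self.num_intent_spaces, out.size(-1))
        intent_logits = self.intent_head(cls_vec)                      # which intention spaces are active
        tag_logits = self.tag_head(char_vecs)                          # entity/constraint tag per character
        return spaces, intent_logits, tag_logits
```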

An initial intention vector may be a vector with a fixed number of elements whose values are initially unknown. Each initial intention vector can be used to represent a corresponding intention space. After the filling process, some elements are known, and the remaining unknown elements can be padded with a specified value, yielding the target intention vector corresponding to that initial intention vector.

在接收到第一句式向量的情况下生成预设数量个初始意图向量,该预设数量个初始意图向量所包括的元素数量均相同。When the first sentence vector is received, a preset number of initial intention vectors are generated, and the preset number of initial intention vectors include the same number of elements.

每个初始意图向量所包括的元素数量可以是预先配置的固定数值。具体地,在接收到第一句式向量的情况下,生成预设数量个初始意图向量;初始意图向量所包括的元素数量为预设数值。The number of elements included in each initial intention vector may be a pre-configured fixed value. Specifically, when the first sentence vector is received, a preset number of initial intention vectors are generated; and the number of elements included in the initial intention vector is a preset value.

每个初始意图向量所包括的元素数量也可以基于第一句式向量对应的字符数量确定。具体地,在接收到第一句式向量的情况下,根据第一句式向量,生成预设数量个初始意图向量;初始意图向量所包括的元素数量基于第一句式向量所对应的字符数量确定。The number of elements included in each initial intention vector can also be determined based on the number of characters corresponding to the first sentence vector. Specifically, when the first sentence vector is received, a preset number of initial intention vectors are generated according to the first sentence vector; the number of elements included in the initial intention vector is determined based on the number of characters corresponding to the first sentence vector.

在一种具体的实现方式中,第一句式向量包括语义特征子向量和多个字符子向量;根据第一句式向量对每个初始意图向量进行填充处理,得到对应的目标意图向量的具体实现方式有:对语义特征子向量进行意图分类处理,得到对应的意图分类结果;根据意图分类结果,对每个初始意图向量进行填充处理,得到对应的中间意图向量;根据每个字符子向量,对每个中间意图向量进行填充处理,得到对应的目标意图向量。In a specific implementation method, the first sentence vector includes a semantic feature sub-vector and multiple character sub-vectors; each initial intention vector is filled according to the first sentence vector to obtain the corresponding target intention vector. The specific implementation method is: intent classification is performed on the semantic feature sub-vector to obtain the corresponding intention classification result; according to the intention classification result, each initial intention vector is filled to obtain the corresponding intermediate intention vector; according to each character sub-vector, each intermediate intention vector is filled to obtain the corresponding target intention vector.

示例性地,语义特征子向量可以用[cls]表示。[cls]不是用于表示文本中某个字符的向量,是一个用于代表整个文本的语义特征向量,取出来就可以直接用于分类。For example, the semantic feature sub-vector can be represented by [cls]. [cls] is not a vector used to represent a certain character in the text, but a semantic feature vector used to represent the entire text, which can be directly used for classification.

问题文本样本可以包括多个字符,在通过第一编码层对问题文本样本进行编码得到对应的第一句式向量之后,该第一句式向量可以包括问题文本样本中的每个字符对应的字符子向量。The question text sample may include multiple characters. After the question text sample is encoded by the first encoding layer to obtain the corresponding first sentence vector, the first sentence vector may include a character sub-vector corresponding to each character in the question text sample.

意图分类结果可以用于表示,针对每个初始意图向量,第一句式向量是否具有该初始意图向量对应的意图。例如,针对初始意图向量1,若意图分类结果为第一分类结果,则说明第一句式向量不具有该初始意图向量1对应的意图;若意图分类结果为第二分类结果,则说明第一句式向量具有该初始意图向量1对应的意图。The intent classification result can be used to indicate, for each initial intent vector, whether the first sentence vector has the intent corresponding to the initial intent vector. For example, for initial intent vector 1, if the intent classification result is the first classification result, it means that the first sentence vector does not have the intent corresponding to the initial intent vector 1; if the intent classification result is the second classification result, it means that the first sentence vector has the intent corresponding to the initial intent vector 1.

In a specific implementation, performing intent classification on the semantic feature sub-vector to obtain the corresponding intent classification result can be done through a binary classification fully connected layer, i.e. a linear layer, which performs intent classification based on [cls]. That is, the input of the linear layer is [cls], and its output includes multiple intent classification results; each intent classification result corresponds to one initial intention vector and is one of the first classification result and the second classification result.

根据意图分类结果,对每个初始意图向量进行填充处理,得到对应的中间意图向量,可以是在每个初始意图向量中自动填充用于表示该初始意图向量的意图分类结果的数值,得到中间意图向量。例如,针对初始意图向量1,若意图分类结果为第一分类结果,则在该初始意图向量1的指定位置填入用于表示第一分类结果的数值“0”;若意图分类结果为第二分类结果,则在该初始意图向量1的指定位置填入用于表示第二分类结果的数值“1”。According to the intent classification result, each initial intent vector is filled to obtain the corresponding intermediate intent vector, which can be automatically filled with the value used to represent the intent classification result of the initial intent vector in each initial intent vector to obtain the intermediate intent vector. For example, for initial intent vector 1, if the intent classification result is the first classification result, the value "0" used to represent the first classification result is filled in the specified position of the initial intent vector 1; if the intent classification result is the second classification result, the value "1" used to represent the second classification result is filled in the specified position of the initial intent vector 1.

示例性地,在该初始意图向量1的指定位置填入用于表示第一分类结果的数值“0”,可以是将指定位置的未知元素替换为“0”。Exemplarily, filling the specified position of the initial intention vector 1 with the value "0" for indicating the first classification result may be performed by replacing the unknown element at the specified position with "0".

需要注意的是,若初始意图向量对应的意图分类结果为第一分类结果,说明第一句式向量不具有对应的意图,从而,在后续步骤中进行实体识别和约束识别得到的字符识别结果与该初始意图向量无关,故在该初始意图向量中填充“0”得到中间意图向量之后,可以将该中间意图向量确定为对应的目标意图向量。It should be noted that if the intent classification result corresponding to the initial intention vector is the first classification result, it means that the first sentence vector does not have the corresponding intention. Therefore, the character recognition results obtained by entity recognition and constraint recognition in subsequent steps are irrelevant to the initial intention vector. Therefore, after filling "0" in the initial intention vector to obtain the intermediate intention vector, the intermediate intention vector can be determined as the corresponding target intention vector.

在一种具体的实现方式中,根据每个字符子向量,对每个中间意图向量进行填充处理,得到对应的目标意图向量的具体实现方式有:根据每个字符子向量,进行实体识别处理和约束识别处理,得到对应的字符识别结果;根据字符识别结果,对每个中间意图向量进行填充处理,得到对应的目标意图向量。In a specific implementation method, each intermediate intention vector is filled according to each character sub-vector to obtain the corresponding target intention vector. The specific implementation method is: according to each character sub-vector, entity recognition processing and constraint recognition processing are performed to obtain the corresponding character recognition result; according to the character recognition result, each intermediate intention vector is filled to obtain the corresponding target intention vector.

具体实施时,可以通过二分类全连接层linear,根据每个字符子向量,进行实体识别处理和约束识别处理,得到对应的字符识别结果,该字符识别结果可以用于表示字符子向量对应的实体类型和约束类型。其中,实体类型包括非实体和多种预设实体类型,约束类型包括非约束和多种预设约束类型。In specific implementation, a binary fully connected layer linear can be used to perform entity recognition and constraint recognition processing according to each character sub-vector to obtain a corresponding character recognition result, which can be used to represent the entity type and constraint type corresponding to the character sub-vector. Among them, the entity type includes non-entity and multiple preset entity types, and the constraint type includes non-constraint and multiple preset constraint types.

通过生成字符识别结果,可以确定每个字符子向量是否属于实体或约束,进而实现实体抽取或约束抽取。By generating character recognition results, it is possible to determine whether each character sub-vector belongs to an entity or a constraint, thereby achieving entity extraction or constraint extraction.

在一种具体的实现方式中,根据字符识别结果,对每个中间意图向量进行填充处理,得到对应的目标意图向量的具体实现方式有:若字符识别结果用于表征字符子向量属于实体类别或约束类别,则根据字符识别结果,确定对应的中间意图向量并填充。In a specific implementation method, each intermediate intention vector is filled according to the character recognition result to obtain the corresponding target intention vector. The specific implementation method is: if the character recognition result is used to characterize that the character sub-vector belongs to an entity category or a constraint category, then according to the character recognition result, the corresponding intermediate intention vector is determined and filled.

If the character recognition result indicates that the character sub-vector belongs to an entity category, entity extraction can be performed according to the character recognition result, and intent prediction can then be performed on the extracted entity to determine the intermediate intention vector corresponding to that entity, after which the entity is automatically filled into that intermediate intention vector.

该实体在中间意图向量中的填充位置可以基于该实体在问题文本样本中对应的字符位置确定。The filling position of the entity in the intermediate intent vector can be determined based on the character position corresponding to the entity in the question text sample.

若字符识别结果用于表征字符子向量属于约束类别,则可以根据字符识别结果进行约束抽取处理,以及,对抽取得到的约束进行意图预测处理,确定该约束对应的中间意图向量,将该约束自动填入对应的中间意图向量中。If the character recognition result is used to characterize that the character sub-vector belongs to the constraint category, constraint extraction processing can be performed based on the character recognition result, and intention prediction processing can be performed on the extracted constraints to determine the intermediate intention vector corresponding to the constraint, and the constraint can be automatically filled into the corresponding intermediate intention vector.

该约束在中间意图向量中的填充位置可以基于该约束在问题文本样本中对应的字符位置确定。The filling position of the constraint in the intermediate intention vector can be determined based on the character position corresponding to the constraint in the question text sample.
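The filling procedure described above can be sketched with toy data in place of real model outputs; the intention space names, span list and padding values below are assumptions for illustration only:

```python
# Minimal sketch of the filling procedure: per-space intent flags from [cls], then entity/constraint
# spans filled into the vectors of their predicted intention spaces at their character positions.
question = "How long is the recovery after liposuction? How is the fee for face slimming calculated?"
intent_spaces = ["null", "introduction", "price", "recovery_period"]

# Step 1: per-space intent classification result (1 = space is active, 0 = inactive).
intent_results = {"null": 0, "introduction": 0, "price": 1, "recovery_period": 1}

# Step 2: character-level recognition results, here already grouped into spans:
# (span text, "entity" or "constraint", predicted intention space).
spans = [("liposuction", "entity", "recovery_period"),
         ("face slimming", "entity", "price")]

vector_len = len(question)
target_vectors = {}
for space in intent_spaces:
    vec = [0] * (1 + vector_len)          # element 0 stores the intent classification result
    vec[0] = intent_results[space]
    if vec[0] == 1:                       # only active spaces are filled with spans
        for text, kind, predicted_space in spans:
            if predicted_space == space:
                start = question.index(text)          # fill at the span's character positions
                for offset in range(len(text)):
                    vec[1 + start + offset] = 1
    target_vectors[space] = vec           # inactive spaces keep their zero padding

print({space: vec[:5] for space, vec in target_vectors.items()})
```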

在一种具体的实现方式中,问题文本样本包括实体元素、属性元素、关系元素、约束元素中的至少一者;预设数量个初始意图向量包括第一意图向量和多个第二意图向量;第一意图向量对应于无意图,每个第二意图向量对应于一个属性元素或一个关系元素。In a specific implementation, the question text sample includes at least one of an entity element, an attribute element, a relationship element, and a constraint element; a preset number of initial intention vectors include a first intention vector and multiple second intention vectors; the first intention vector corresponds to no intent, and each second intention vector corresponds to an attribute element or a relationship element.

问题文本样本包括实体元素、属性元素、关系元素、约束元素中的至少一者。实体元素的槽位识别结果可以是“实体”,属性元素的槽位识别结果可以是“属性”,关系元素的槽位识别结果可以是“关系”,约束元素的槽位识别结果可以是“约束”,等等。The question text sample includes at least one of an entity element, an attribute element, a relationship element, and a constraint element. The slot recognition result of the entity element may be "entity", the slot recognition result of the attribute element may be "attribute", the slot recognition result of the relationship element may be "relationship", the slot recognition result of the constraint element may be "constraint", and so on.

下面通过几个例子示例性说明问题文本样本。The following are several examples to illustrate sample question texts.

通过问题解析模型可以将问题文本样本转换到不同的意图空间,每个意图空间对应于一个初始意图向量。The question parsing model can convert question text samples into different intent spaces, each of which corresponds to an initial intent vector.

若问题文本样本包括属性元素,则该属性元素可以用于在预设数量个初始意图向量中确定对应于该属性元素的目标初始意图向量。If the question text sample includes an attribute element, the attribute element can be used to determine a target initial intention vector corresponding to the attribute element from a preset number of initial intention vectors.

若问题文本样本包括关系元素,则该关系元素可以用于在预设数量个初始意图向量中确定对应于该关系元素的目标初始意图向量。If the question text sample includes a relation element, the relation element can be used to determine a target initial intention vector corresponding to the relation element from a preset number of initial intention vectors.

根据第一句式向量对每个初始意图向量进行填充处理,可以是在各意图空间对应的初始意图向量下,进行实体与约束的标注。在属性元素对应的目标初始意图向量中进行实体与约束的标注,可以使得生成的目标意图向量能够反映属性与实体之间的对应关系,以及,属性与约束之间的对应关系中的至少一者。在关系元素对应的目标初始意图向量中进行实体与约束的标注,可以使得生成的目标意图向量能够反映关系与实体之间的对应关系,以及,关系与约束之间的对应关系中的至少一者。 Filling each initial intention vector according to the first sentence vector may be performed by labeling entities and constraints under the initial intention vector corresponding to each intention space. Labeling entities and constraints in the target initial intention vector corresponding to the attribute element may enable the generated target intention vector to reflect at least one of the corresponding relationship between the attribute and the entity and the corresponding relationship between the attribute and the constraint. Labeling entities and constraints in the target initial intention vector corresponding to the relationship element may enable the generated target intention vector to reflect at least one of the corresponding relationship between the relationship and the entity and the corresponding relationship between the relationship and the constraint.

在槽位识别的处理过程中,问题文本样本中,实体元素、属性元素、关系元素、约束元素等多种槽位元素相互之间可能存在一定的对应关系。In the process of slot identification, in the question text sample, various slot elements such as entity elements, attribute elements, relationship elements, constraint elements, etc. may have certain corresponding relationships with each other.

例如,问题文本样本为:A产品的价格是多少,B产品的上架时间是几点?其中,“A产品”和“B产品”为实体,“价格”和“上架时间”为属性,其中,“A产品”对应于“价格”,“B产品”对应于“上架时间”。如果分别针对每一种元素进行槽位识别处理,可能会对问题文本样本对应的用户意图产生误解,从而向用户反馈A产品和B产品二者的价格,以及,A产品和B产品二者的上架时间。For example, a sample question text is: What is the price of product A, and when will product B be available? Among them, "product A" and "product B" are entities, "price" and "availability time" are attributes, and "product A" corresponds to "price" and "product B" corresponds to "availability time". If slot recognition is performed for each element separately, the user's intention corresponding to the sample question text may be misunderstood, and the user may be fed back the prices of both products A and B, as well as the availability time of both products A and B.

在本实施例中,在每个初始意图向量对应于一个属性元素或一个关系元素的情况下,通过在各意图空间对应的初始意图向量下进行实体与约束的标注,可以分别在不同的意图空间同时执行槽位识别,不仅识别出问题文本样本中各个槽位元素,还可以识别得到属性与实体之间的对应关系、属性与约束之间的对应关系、关系与实体之间的对应关系,以及,关系与约束之间的对应关系,等等。以此,无需分别针对每一种槽位元素采用对应的模型进行槽位识别的模型训练,减少了槽位识别处理流程所需的模型数量,降低了槽位识别处理流程中的时间延迟,且多个意图空间内同时执行实体与约束的标注,可以一次性解析问题文本样本得到其中各个槽位元素的槽位识别结果,提高了槽位识别效率。In this embodiment, when each initial intent vector corresponds to an attribute element or a relationship element, by annotating entities and constraints under the initial intent vectors corresponding to each intent space, slot recognition can be performed simultaneously in different intent spaces, not only identifying each slot element in the question text sample, but also identifying the correspondence between attributes and entities, the correspondence between attributes and constraints, the correspondence between relationships and entities, and the correspondence between relationships and constraints, etc. In this way, there is no need to use a corresponding model for each slot element to perform model training for slot recognition, which reduces the number of models required for the slot recognition processing flow, reduces the time delay in the slot recognition processing flow, and simultaneously performs entity and constraint annotation in multiple intent spaces, which can parse the question text sample at one time to obtain the slot recognition results of each slot element therein, thereby improving the efficiency of slot recognition.

问题文本样本可以由实体元素和属性元素构成,例如,问题文本样本为:产品A的费用怎么算?其中,实体元素为“产品A”,属性元素为“费用怎么算”,该属性元素对应于属性“价格”。The question text sample may be composed of entity elements and attribute elements. For example, the question text sample is: How is the cost of product A calculated? Among them, the entity element is "product A", and the attribute element is "how is the cost calculated", which corresponds to the attribute "price".

问题文本样本可以由实体元素和约束元素构成,例如,问题文本样本为:本周日之前可以参加的餐饮店活动有哪些?其中,实体元素为“餐饮店活动”,约束元素为“本周日之前”。The question text sample may be composed of an entity element and a constraint element. For example, the question text sample is: What restaurant activities can be participated in before this Sunday? Among them, the entity element is "restaurant activities" and the constraint element is "before this Sunday".

问题文本样本可以由实体元素和关系元素构成,例如,C活动的办理渠道有哪些?其中,实体元素为“C活动”,关系元素为“办理渠道”。The question text sample can be composed of entity elements and relationship elements, for example, what are the handling channels for C activity? Among them, the entity element is "C activity" and the relationship element is "handling channel".

A question text sample may consist only of an entity element, for example, "D service", where the entity element is "D service". When a question text sample includes only an entity element, the question is usually incomplete, and a follow-up question needs to be asked to guide the user to complete it. In this case, the intent classification result of the first intention vector may be the second classification result, and the intent classification result of each second intention vector may be the first classification result, i.e. the first sentence vector does not carry any intent.

下面可以参照图2示例性说明问题解析模型的结构与问题解析模型内部的数据处理方式。图2为本申请实施例提供的一种问答系统的模型训练方法中问题解析模型的数据流向图。The structure of the question parsing model and the data processing method within the question parsing model can be exemplarily described below with reference to Figure 2. Figure 2 is a data flow diagram of a question parsing model in a model training method for a question-answering system provided in an embodiment of the present application.

如图2所示,问题文本样本202为“吸脂多久可以好?瘦脸费用怎么算”。[sep]用于表示两个问题文本样本之间的分隔符。[cls]就是classification(分类)的意思,可以理解为用于下游的分类任务。对于文本分类任务,BERT模型在文本前插入一个[cls]符号,并将该符号对应的输出向量作为整篇文本的语义表示,用于文本分类。As shown in FIG2 , the question text sample 202 is “How long does it take for liposuction to heal? How much does it cost to slim the face?” [sep] is used to represent the separator between two question text samples. [cls] means classification, which can be understood as a classification task for downstream. For text classification tasks, the BERT model inserts a [cls] symbol before the text and uses the output vector corresponding to the symbol as the semantic representation of the entire text for text classification.

第一编码层可以是Bert编码器204,转换层可以包括空间转换层208、二分类全连接层210以及二分类全连接层212。The first encoding layer may be a Bert encoder 204 , and the transformation layer may include a spatial transformation layer 208 , a binary classification fully connected layer 210 , and a binary classification fully connected layer 212 .

Bert编码器204用于将问题文本样本202转换为对应的句式向量206并发送至空间转换层208;空间转换层208用于在接收到句式向量206的情况下生成预设数量个初始意图向量,其中,初始意图向量1对应于意图空间Null(空值)220,初始意图向量2对应于意图空间“介绍”218,初始意图向量3对应于意图空间“价格”216,初始意图向量4对应于意图空间“恢复期”214。The Bert encoder 204 is used to convert the question text sample 202 into a corresponding sentence vector 206 and send it to the spatial conversion layer 208; the spatial conversion layer 208 is used to generate a preset number of initial intention vectors when receiving the sentence vector 206, wherein the initial intention vector 1 corresponds to the intention space Null (null value) 220, the initial intention vector 2 corresponds to the intention space "introduction" 218, the initial intention vector 3 corresponds to the intention space "price" 216, and the initial intention vector 4 corresponds to the intention space "recovery period" 214.

二分类全连接层210可以用于根据语义特征子向量H[cls]2062进行意图分类处理,得到每个意图空间对应的意图分类结果并填充:意图空间“恢复期”214对应于第二分类结果,在首位填入“1”;意图空间“价格”216对应于第二分类结果,在首位填入“1”;意图空间“介绍”218对应于第一分类结果,在首位填入“0”;意图空间Null220对应于第一分类结果,在首位填入“0”。The binary classification fully connected layer 210 can be used to perform intent classification processing according to the semantic feature sub-vector H[cls]2062, and obtain the intent classification results corresponding to each intent space and fill them in: the intent space "recovery period" 214 corresponds to the second classification result, and "1" is filled in the first position; the intent space "price" 216 corresponds to the second classification result, and "1" is filled in the first position; the intent space "introduction" 218 corresponds to the first classification result, and "0" is filled in the first position; the intent space Null220 corresponds to the first classification result, and "0" is filled in the first position.

二分类全连接层212可以用于根据每个字符子向量2064进行实体识别处理和约束识别处理,得到对应的字符识别结果,基于字符识别结果进行实体抽取处理和约束抽取处理,得到第一句式向量对应的实体“吸脂”和“瘦脸”。The binary classification fully connected layer 212 can be used to perform entity recognition processing and constraint recognition processing according to each character sub-vector 2064 to obtain the corresponding character recognition result, and perform entity extraction processing and constraint extraction processing based on the character recognition result to obtain the entities "liposuction" and "face thinning" corresponding to the first sentence vector.

对实体“吸脂”进行意图预测处理,确定该实体“吸脂”对应的意图空间为意图空间“恢复期”214并填充,得到意图空间“恢复期”对应的目标意图向量。对实体“瘦脸”进行意图预测处理,确定该实体“瘦脸”对应的意图空间为意图空间“价格”216并填充,得到意图空间“价格”对应的目标意图向量。The entity "liposuction" is processed for intention prediction, and the intention space corresponding to the entity "liposuction" is determined to be the intention space "recovery period" 214 and filled, and the target intention vector corresponding to the intention space "recovery period" is obtained. The entity "face thinning" is processed for intention prediction, and the intention space corresponding to the entity "face thinning" is determined to be the intention space "price" 216 and filled, and the target intention vector corresponding to the intention space "price" is obtained.

根据意图空间“恢复期”对应的目标意图向量和意图空间“价格”对应 的目标意图向量,可以生成对应的文本片段。对于其他意图分类结果为第一分类结果的意图空间,可以舍弃对应的目标意图向量。According to the target intention vector corresponding to the intention space "recovery period" and the intention space "price" corresponding The target intent vector can generate the corresponding text segment. For other intent spaces whose intent classification results are the first classification results, the corresponding target intent vector can be discarded.

在一种具体的实现方式中,问答系统还包括初始实体链接模型;问答系统的模型训练方法还包括:获取实体链接样本;将实体链接样本输入初始实体链接模型进行迭代训练,得到实体链接模型;实体链接模型包括第二编码层和预测层;第二编码层用于对实体链接样本进行编码处理,得到对应的第二句式向量;预测层用于根据第二句式向量进行预测处理,确定对应的目标实体。In a specific implementation, the question-answering system also includes an initial entity linking model; the model training method of the question-answering system also includes: obtaining entity linking samples; inputting the entity linking samples into the initial entity linking model for iterative training to obtain an entity linking model; the entity linking model includes a second encoding layer and a prediction layer; the second encoding layer is used to encode the entity linking samples to obtain a corresponding second sentence vector; the prediction layer is used to perform prediction processing based on the second sentence vector to determine the corresponding target entity.

获取实体链接样本,具体可以通过如下方式:获取标准实体库;标准实体库包括多个标准实体;根据问题解析样本和标准实体库,生成实体链接样本。Entity link samples can be obtained in the following ways: obtaining a standard entity library; the standard entity library includes multiple standard entities; and generating entity link samples based on the question parsing samples and the standard entity library.

其中,本实施例中,根据问题解析样本和标准实体库,生成实体链接样本的具体实现方式涉及问答系统的样本生成方法,可参照下述方法实施例中的具体方案。Among them, in this embodiment, the specific implementation method of generating entity link samples based on question parsing samples and standard entity libraries involves a sample generation method of a question-answering system, and the specific scheme in the following method embodiment can be referred to.

初始实体链接模型与经过迭代训练之后得到的实体链接模型的结构完全一致,参与训练的模型参数不同。经过迭代训练得到的实体链接模型同样具有第二编码层和预测层。The structure of the initial entity link model is exactly the same as that of the entity link model obtained after iterative training, but the model parameters involved in the training are different. The entity link model obtained after iterative training also has a second encoding layer and a prediction layer.

第二编码层可以是Bert转换器,也可以是其他可以用于将文本转换为句式向量的组件。预测层可以是二分类全连接层linear,也可以是其他可以用于将语义特征子向量映射到对应实体的组件。The second encoding layer can be a Bert converter or other components that can be used to convert text into sentence vectors. The prediction layer can be a binary fully connected layer linear or other components that can be used to map semantic feature sub-vectors to corresponding entities.

预测层用于根据第二句式向量进行预测处理,确定对应的目标实体,具体地,预测层可以用于根据第二句式向量中的语义特征子向量进行预测处理,确定对应的目标实体。The prediction layer is used to perform prediction processing according to the second sentence vector to determine the corresponding target entity. Specifically, the prediction layer can be used to perform prediction processing according to the semantic feature subvector in the second sentence vector to determine the corresponding target entity.

In the training phase, the generated entity linking dataset can be used to train the Bert classification model. In the prediction phase, after entity recognition is completed by the question parsing model, an entity mention fragment is obtained, and entities with high similarity are recalled from the knowledge graph. Similar to the way the training data is constructed, the question containing the mention fragment is spliced with each recalled entity. Prediction with the entity linking model then links the mention to a unique entity in the graph.
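A minimal sketch of this prediction stage is given below; build_link_input mirrors the splicing format shown in Figure 3 (mention wrapped in "$", candidate appended after "[sep]"), while the scoring function is only a stand-in for the trained Bert classification model:

```python
# Minimal sketch: splice the question-with-mention with each recalled candidate and keep the
# candidate with the highest classification score. score() is a toy stand-in, not the Bert model.
def build_link_input(question: str, mention: str, candidate: str) -> str:
    marked = question.replace(mention, f"${mention}$")
    return f"[sep]{marked}[sep]{candidate}"

def score(link_input: str) -> float:
    # Stand-in for the classifier's "is this the right entity" probability:
    # a trivial character-overlap heuristic, for illustration only.
    question_part, candidate = link_input.rsplit("[sep]", 1)
    mention = question_part.split("$")[1]
    overlap = len(set(mention) & set(candidate))
    return overlap / max(len(set(mention) | set(candidate)), 1)

def link(question: str, mention: str, candidates: list) -> str:
    scored = [(candidate, score(build_link_input(question, mention, candidate)))
              for candidate in candidates]
    return max(scored, key=lambda pair: pair[1])[0]

print(link("How much does radiofrequency face slimming cost",
           "radiofrequency face slimming",
           ["radiofrequency lipolysis face slimming", "double eyelid surgery"]))
```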

下面可以结合图3示例性说明实体链接模型的结构和实体链接模型内部的数据处理流程。图3为本申请实施例提供的一种问答系统的模型训练方法中实体链接模型的数据流向图。 The structure of the entity linking model and the data processing flow within the entity linking model can be exemplarily described below in conjunction with Figure 3. Figure 3 is a data flow diagram of the entity linking model in a model training method for a question-answering system provided in an embodiment of the present application.

As shown in Figure 3, the entity linking sample 302 is "[sep]使用$射频瘦脸$多少钱[sep]射频溶脂瘦脸" ("[sep] how much does $radiofrequency face slimming$ cost [sep] radiofrequency lipolysis face slimming"). The entity linking sample 302 is encoded by the Bert encoder 304 to obtain the corresponding second sentence vector, which includes the semantic feature sub-vector H[cls] 306. The semantic feature sub-vector H[cls] 306 is processed by the binary classification fully connected layer 308 for prediction, and the corresponding entity prediction result 310 is determined.

在一种具体的实现方式中,问答系统还包括初始问题分类模型;问答系统的模型训练方法还包括:获取问题分类样本;将问题分类样本输入初始问题分类模型进行迭代训练,得到问题分类模型。In a specific implementation, the question-answering system also includes an initial question classification model; the model training method of the question-answering system also includes: obtaining question classification samples; inputting the question classification samples into the initial question classification model for iterative training to obtain a question classification model.

问题分类样本可以通过如下方式获取:获取问题域集合;问题域集合包括域内问题文本和域外问题文本;根据问题解析样本和问题域集合,生成问题分类样本。Question classification samples can be obtained in the following ways: obtaining a question domain set; the question domain set includes in-domain question texts and out-of-domain question texts; generating question classification samples based on the question parsing samples and the question domain set.

其中,获取问题域集合的具体实现方式以及问题域集合的具体说明涉及问答系统的样本生成方法,可参照下述方法实施例中的具体方案。Among them, the specific implementation method of obtaining the question domain set and the specific description of the question domain set involve the sample generation method of the question and answer system, and can refer to the specific scheme in the following method embodiment.

问题分类样本可以是OOD数据集,用于构造OOD模型所需训练数据。问题分类样本可以包括正负样本,正样本用于表征域外问题,用label(标签)“1”标识,负样本用于表征域内问题,用label“0”标识。The problem classification sample can be an OOD data set, which is used to construct the training data required for the OOD model. The problem classification sample can include positive and negative samples. The positive sample is used to represent out-of-domain problems and is marked with label "1", and the negative sample is used to represent in-domain problems and is marked with label "0".

示例性地,正样本可以是{"text":"你好","label":1}。For example, a positive sample may be {"text":"Hello","label":1}.

负样本可以是{"text":"单位通知存款的交易时间是几点","label":0},或者,{"text":"车位贷能贷多久","label":0}。Negative samples can be {"text":"What is the transaction time for unit notice deposit","label":0}, or {"text":"How long can the parking space loan last","label":0}.

其中,根据问题解析样本和问题域集合,生成问题分类样本的具体实现方式涉及问答系统的样本生成方法,可参照下述方法实施例中的具体方案。Among them, the specific implementation method of generating question classification samples based on question analysis samples and question domain sets involves a sample generation method of a question-answering system, and reference may be made to the specific scheme in the following method embodiment.

In the embodiment shown in FIG. 1, first, a question text sample is obtained; then, the question text sample is input into the initial question parsing model for iterative training to obtain the question parsing model. The initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to encode the question text sample to obtain the corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors upon receiving the first sentence vector, fill each initial intention vector according to the first sentence vector to obtain the corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system. In this way, the initial question parsing model is iteratively trained with the obtained question text samples. During training, the first encoding layer encodes a question text sample into the corresponding first sentence vector, and the conversion layer generates a preset number of initial intention vectors and fills each of them based on the first sentence vector, so that the question text sample is parsed in a single pass into the text fragments corresponding to at least one target intention vector, where the question intent reflected by each text fragment is determined by the corresponding initial intention vector. Moreover, the question parsing model converts the question text sample into different intention spaces and labels entities and constraints under the intention vector corresponding to each intention space, which helps improve slot recognition efficiency and reduces the number of models required in the slot recognition pipeline, thereby reducing latency; and since slot recognition is performed in all intention spaces simultaneously, concurrency is increased, improving the intent recognition capability and recognition efficiency of the question-answering system.

出于与前述的方法实施例相同的技术构思,本申请实施例还提供了一种应答方法的实施例。图4为本申请实施例提供的一种应答方法的处理流程图。参见图4应答方法的处理流程具体包括步骤S402至步骤S406。Based on the same technical concept as the aforementioned method embodiment, the present application embodiment also provides an embodiment of a response method. Figure 4 is a processing flow chart of a response method provided in the present application embodiment. Referring to Figure 4, the processing flow of the response method specifically includes steps S402 to S406.

S402,获取待应答的目标问题。S402, obtaining a target question to be answered.

S404,将目标问题输入问题解析模型进行解析处理,得到对应的目标片段;问题解析模型是通过问答系统的模型训练方法进行训练所得到的。S404, inputting the target question into the question parsing model for parsing and processing to obtain the corresponding target segment; the question parsing model is obtained by training using the model training method of the question-answering system.

本步骤中的问答系统的模型训练方法可以是前述的各个方法实施例提供的一种问答系统的模型训练方法。The model training method for the question-answering system in this step may be a model training method for the question-answering system provided by the aforementioned various method embodiments.

在一种具体的实现方式中,将目标问题输入问题解析模型进行解析处理,得到对应的目标片段,包括:将目标问题输入问题分类模型进行分类处理,得到分类结果;在分类结果用于表征目标问题属于第一预设分类的情况下,将目标问题输入问题解析模型进行解析处理,得到对应的目标片段。In a specific implementation method, the target problem is input into a problem analysis model for analysis processing to obtain a corresponding target fragment, including: inputting the target problem into a problem classification model for classification processing to obtain a classification result; when the classification result is used to characterize that the target problem belongs to a first preset classification, inputting the target problem into the problem analysis model for analysis processing to obtain a corresponding target fragment.
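A minimal sketch of this gating step is shown below; classify() and parse() are placeholders standing in for the trained question classification model and question parsing model, and the class encoding is an assumption:

```python
# Minimal sketch: only questions classified into the first preset class are sent to the parsing model.
IN_DOMAIN = 0          # assumed encoding of the "first preset classification"

def classify(question: str) -> int:
    # Toy stand-in for the question classification (OOD) model.
    return IN_DOMAIN if "product" in question.lower() else 1

def parse(question: str) -> list:
    # Toy stand-in for the question parsing model's target fragments.
    return [f"{question} | parsed fragment"]

def get_target_fragments(question: str):
    if classify(question) == IN_DOMAIN:
        return parse(question)     # in-domain: run the question parsing model
    return None                    # out-of-domain: a fallback reply is generated later

print(get_target_fragments("What is the price of Product A?"))
print(get_target_fragments("Hello"))
```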

步骤S406,根据目标片段,确定目标问题的答案。Step S406: Determine the answer to the target question based on the target segment.

在一种具体的实现方式中,根据目标片段,确定目标问题的答案,包括:将目标片段输入实体链接模型进行预测处理,得到对应的目标实体;根据目标实体,进行槽位填充处理,得到目标问题的槽位填充结果;根据槽位填充结果,在预先配置的知识图谱中查询对应的答案,得到目标问题的答案。In a specific implementation, the answer to the target question is determined based on the target fragment, including: inputting the target fragment into the entity linking model for prediction processing to obtain the corresponding target entity; performing slot filling processing based on the target entity to obtain the slot filling result of the target question; based on the slot filling result, querying the corresponding answer in the pre-configured knowledge graph to obtain the answer to the target question.

根据目标实体,进行槽位填充处理,可以是针对步骤S404中得到的目标片段,首先,确定该目标片段可能对应的至少一种槽位模板,然后,针对每种槽位模板,通过目标片段对该种槽位模板进行槽位填充处理,得到填充后的槽位模板,将该填充后的槽位模板确定为目标问题对应的槽位填充结果。According to the target entity, slot filling processing is performed. For the target fragment obtained in step S404, first, at least one slot template that the target fragment may correspond to is determined, and then, for each slot template, slot filling processing is performed on the slot template through the target fragment to obtain the filled slot template, and the filled slot template is determined as the slot filling result corresponding to the target problem.

每种槽位模板可以包括一个或多个待填充的槽位。Each slot template may include one or more slots to be filled.

例如,槽位模板1为:(实体槽位)的(属性槽位)是什么?For example, slot template 1 is: What is the (attribute slot) of (entity slot)?

槽位模板2为:(实体槽位)(关系槽位)(实体槽位)? Slot template 2 is: (entity slot)(relationship slot)(entity slot)?

槽位模板3为:(约束槽位)(实体槽位)有哪些?Slot template 3 is: (Constraint slot) (Entity slot) What are they?

需要强调的是,目标实体不仅可以用于“实体槽位”的填充,还可以用于“属性槽位”、“关系槽位”以及“约束槽位”等多种槽位的填充。It should be emphasized that the target entity can be used not only to fill the "entity slot", but also to fill multiple slots such as "attribute slot", "relationship slot" and "constraint slot".

根据槽位填充结果,在预先配置的知识图谱中查询对应的答案,可以是,若槽位填充结果无法用于在知识图谱中查询得到唯一对应的答案,则确定槽位填充结果对应的槽位缺失信息;根据槽位缺失信息生成反问句;接收响应反问句的用户输入;根据用户输入进行槽位填充处理;根据槽位填充处理后的槽位填充结果,在预先配置的知识图谱中查询对应的答案,得到目标问题的答案。According to the slot filling result, the corresponding answer is queried in the pre-configured knowledge graph. If the slot filling result cannot be used to query in the knowledge graph to obtain the unique corresponding answer, the slot missing information corresponding to the slot filling result is determined; a rhetorical question is generated according to the slot missing information; a user input in response to the rhetorical question is received; a slot filling process is performed according to the user input; and according to the slot filling result after the slot filling process, the corresponding answer is queried in the pre-configured knowledge graph to obtain the answer to the target question.

例如,目标问题的槽位填充结果包括填充后的槽位模板1:(产品A)的(属性槽位)是什么?该填充后的槽位模板1无法用于在知识图谱中查询得到唯一对应的答案,可以确定该填充处理后的槽位模板1对应的槽位缺失信息为属性缺失。根据槽位缺失信息“属性缺失”生成对应的反问句“请问您想咨询A产品的什么信息”。接收响应反问句的用户输入“价格”。根据用户输入进行槽位填充处理,即,对“(产品A)的(属性槽位)是什么”中的属性槽位进行填充处理,得到二次填充后的槽位模板1:(产品A)的(价格)是什么?根据槽位填充处理后的槽位填充结果,在预先配置的知识图谱中查询“产品A的价格是什么”对应的答案,得到目标问题的答案。For example, the slot filling result of the target question includes the filled slot template 1: What is the (attribute slot) of (product A)? The filled slot template 1 cannot be used to query in the knowledge graph to obtain a unique corresponding answer. It can be determined that the slot missing information corresponding to the filled slot template 1 is attribute missing. According to the slot missing information "attribute missing", a corresponding rhetorical question "What information do you want to inquire about product A?" is generated. The user input "price" is received in response to the rhetorical question. The slot filling process is performed according to the user input, that is, the attribute slot in "What is the (attribute slot) of (product A)" is filled to obtain the slot template 1 after secondary filling: What is the (price) of (product A)? According to the slot filling result after the slot filling process, the answer corresponding to "What is the price of product A" is queried in the pre-configured knowledge graph to obtain the answer to the target question.
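A minimal sketch of this slot filling and follow-up flow is given below, with hypothetical templates, slot names and reply wording, and a dictionary standing in for the knowledge graph query:

```python
# Minimal sketch: fill the "(entity slot)'s (attribute slot)" template, ask a follow-up question
# for any slot that is still missing, and query a toy knowledge-graph stand-in once complete.
from typing import Optional

knowledge = {("Product A", "price"): "x yuan per year"}   # stand-in for the knowledge graph query

def answer(entity: Optional[str], attribute: Optional[str]) -> str:
    if entity is None:
        return "Which product are you asking about?"                 # follow-up for the entity slot
    if attribute is None:
        return f"What would you like to know about {entity}?"        # follow-up for the attribute slot
    result = knowledge.get((entity, attribute))                      # slots complete: query the graph
    return result if result is not None else "Sorry, I could not find an answer."

print(answer("Product A", None))       # -> follow-up question asking for the attribute
print(answer("Product A", "price"))    # -> "x yuan per year"
```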

下面,可以结合图5示例性说明如何根据槽位填充结果在知识图谱中确定对应的答案。图5为本申请实施例提供的一种应答方法中会话管理流程图。Next, how to determine the corresponding answer in the knowledge graph according to the slot filling result can be exemplified in conjunction with Figure 5. Figure 5 is a flow chart of session management in a response method provided in an embodiment of the present application.

如图5所示,在用户输入的问题进入会话栈之后,问答系统可以对该问题进行算法解析,得到解析结果。该解析结果包括至少一个槽位识别结果。As shown in Figure 5, after the question input by the user enters the conversation stack, the question-answering system can perform algorithmic analysis on the question to obtain an analysis result, which includes at least one slot identification result.

进行解析结果检测,若解析结果用于表示OOD或无结果,则按照“不知道/不能回答/标志位”对应的预先配置的应答话术生成应答文本,该应答文本用于通知用户问答系统无法回答该问题。Perform a parsing result check. If the parsing result is used to indicate OOD or no result, generate a response text according to the pre-configured response words corresponding to "don't know/can't answer/flag bit". The response text is used to inform the user that the question-and-answer system cannot answer the question.

若解析结果用于表示单结果,则进行会话栈检测:对于首次会话,进入槽位管理流程;对于多轮对话,则进入会话管理流程。If the parsing result is used to represent a single result, a session stack detection is performed: for the first session, the slot management process is entered; for multiple rounds of dialogue, the session management process is entered.

若解析结果用于表示多结果,则进入槽位管理流程,通过该槽位管理流程进行槽位填充处理。If the parsing result is used to represent multiple results, the slot management process is entered, and the slot filling process is performed through the slot management process.

In the slot management flow, the rhetorical question template or the response template to be generated can be determined according to whether a composite constraint slot, an entity slot, an intent slot or an ordinary constraint slot is present, whether the node is a leaf node, and whether the intent slot is an ordinary intent or a multi-value intent.

在会话管理流程中,可以通过复合约束检测、普通约束检测、实体检测、意图检测、节点判断、实体继承、约束继承以及意图继承等方式确定如何在多轮会话中进行逻辑继承。In the conversation management process, how to perform logical inheritance in multiple rounds of conversations can be determined through composite constraint detection, common constraint detection, entity detection, intent detection, node judgment, entity inheritance, constraint inheritance, and intent inheritance.
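A simplified sketch of slot inheritance across turns is given below, standing in for the inheritance steps above; the merge rule shown (inherit any slot the current turn leaves empty) is an assumption, not the full detection logic of this application:

```python
# Minimal sketch: in a multi-turn conversation, slots missing from the current turn are inherited
# from the previous turn; slots present in the current turn override the inherited ones.
def inherit(previous: dict, current: dict) -> dict:
    merged = dict(previous)       # start from the last turn's entity/intent/constraint slots
    merged.update({key: value for key, value in current.items() if value is not None})
    return merged

turn1 = {"entity": "Product A", "intent": "price", "constraint": None}
turn2 = {"entity": None, "intent": "application_channel", "constraint": None}  # e.g. "And how do I apply?"

print(inherit(turn1, turn2))   # entity "Product A" is inherited, the intent is replaced
```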

由于技术构思相同,本实施例中描述得比较简单,相关的部分请参见上述提供的方法实施例的对应说明即可。Since the technical concept is the same, the description in this embodiment is relatively simple, and the relevant parts may refer to the corresponding description of the method embodiment provided above.

在上述的实施例中,提供了一种问答系统的模型训练方法,与之相对应的,基于相同的技术构思,本申请实施例还提供了一种问答系统的模型训练装置,下面结合附图进行说明。In the above-mentioned embodiment, a model training method for a question-answering system is provided. Correspondingly, based on the same technical concept, an embodiment of the present application also provides a model training device for a question-answering system, which is described below in conjunction with the accompanying drawings.

图6为本申请实施例提供的一种问答系统的模型训练装置示意图。FIG6 is a schematic diagram of a model training device for a question-answering system provided in an embodiment of the present application.

本实施例提供一种问答系统的模型训练装置600,问答系统包括初始问题解析模型;所述装置包括:第一获取单元602,用于获取问题文本样本;第一训练单元604,用于将所述问题文本样本输入所述初始问题解析模型进行迭代训练,得到问题解析模型;所述初始问题解析模型包括第一编码层和转换层;所述第一编码层用于根据所述问题文本样本进行编码处理,得到对应的第一句式向量;所述转换层用于在接收到所述第一句式向量的情况下生成预设数量个初始意图向量,根据所述第一句式向量对每个所述初始意图向量进行填充处理,得到对应的目标意图向量,将所述目标意图向量转换为对应的文本片段;所述文本片段用于在所述问答系统中查询所述问题文本样本的答案。The present embodiment provides a model training device 600 for a question-answering system, wherein the question-answering system includes an initial question parsing model; the device includes: a first acquisition unit 602, used to acquire a question text sample; a first training unit 604, used to input the question text sample into the initial question parsing model for iterative training to obtain a question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each of the initial intention vectors according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.

可选地,所述第一句式向量包括语义特征子向量和多个字符子向量;所述根据所述第一句式向量对每个所述初始意图向量进行填充处理,得到对应的目标意图向量的具体实现方式有:对所述语义特征子向量进行意图分类处理,得到对应的意图分类结果;根据所述意图分类结果,对每个所述初始意图向量进行填充处理,得到对应的中间意图向量;根据每个所述字符子向量,对每个所述中间意图向量进行填充处理,得到对应的目标意图向量。Optionally, the first sentence vector includes a semantic feature sub-vector and multiple character sub-vectors; the specific implementation method of filling each of the initial intention vectors according to the first sentence vector to obtain the corresponding target intention vector is: performing intent classification processing on the semantic feature sub-vector to obtain the corresponding intention classification result; filling each of the initial intention vectors according to the intention classification result to obtain the corresponding intermediate intention vector; filling each of the intermediate intention vectors according to each of the character sub-vectors to obtain the corresponding target intention vector.

可选地,所述意图分类结果包括第一分类结果和第二分类结果;所述根据每个所述字符子向量,对每个所述中间意图向量进行填充处理,得到对应的目标意图向量的具体实现方式有:根据每个所述字符子向量,进行实体识别处理和约束识别处理,得到对应的字符识别结果;根据所述字符识别结果,对每个所述中间意图向量进行填充处理,得到对应的目标意图向量。 Optionally, the intention classification result includes a first classification result and a second classification result; the specific implementation method of filling each of the intermediate intention vectors according to each of the character sub-vectors to obtain the corresponding target intention vector is: performing entity recognition processing and constraint recognition processing according to each of the character sub-vectors to obtain the corresponding character recognition result; filling each of the intermediate intention vectors according to the character recognition result to obtain the corresponding target intention vector.

可选地,所述根据所述字符识别结果,对每个所述中间意图向量进行填充处理,得到对应的目标意图向量的具体实现方式有:若所述字符识别结果用于表征所述字符子向量属于实体类别或约束类别,则根据所述字符识别结果,确定对应的中间意图向量并填充。Optionally, the specific implementation method of filling each of the intermediate intention vectors according to the character recognition result to obtain the corresponding target intention vector is: if the character recognition result is used to characterize that the character sub-vector belongs to an entity category or a constraint category, then according to the character recognition result, the corresponding intermediate intention vector is determined and filled.

可选地,所述问题文本样本包括实体元素、属性元素、关系元素、约束元素中的至少一者;预设数量个所述初始意图向量包括第一意图向量和多个第二意图向量;所述第一意图向量对应于无意图,每个所述第二意图向量对应于一个所述属性元素或一个所述关系元素。Optionally, the question text sample includes at least one of entity elements, attribute elements, relationship elements, and constraint elements; the preset number of initial intention vectors includes a first intention vector and multiple second intention vectors; the first intention vector corresponds to no intent, and each of the second intention vectors corresponds to one attribute element or one relationship element.

可选地,所述问答系统还包括初始实体链接模型;第一获取单元602,还用于获取实体链接样本;第一训练单元604,还用于将所述实体链接样本输入所述初始实体链接模型进行迭代训练,得到实体链接模型;所述实体链接模型包括第二编码层和预测层;所述第二编码层用于对所述实体链接样本进行编码处理,得到对应的第二句式向量;所述预测层用于根据所述第二句式向量进行预测处理,确定对应的目标实体。Optionally, the question-answering system also includes an initial entity linking model; the first acquisition unit 602 is also used to obtain entity linking samples; the first training unit 604 is also used to input the entity linking samples into the initial entity linking model for iterative training to obtain an entity linking model; the entity linking model includes a second encoding layer and a prediction layer; the second encoding layer is used to encode the entity linking samples to obtain a corresponding second sentence vector; the prediction layer is used to perform prediction processing based on the second sentence vector to determine the corresponding target entity.

可选地,所述问答系统还包括初始问题分类模型;第一获取单元602,还用于获取问题分类样本;第一训练单元604,还用于将所述问题分类样本输入所述初始问题分类模型进行迭代训练,得到问题分类模型。Optionally, the question-answering system further includes an initial question classification model; the first acquisition unit 602 is further used to acquire question classification samples; the first training unit 604 is further used to input the question classification samples into the initial question classification model for iterative training to obtain a question classification model.

The model training apparatus for a question-answering system provided in the embodiments of the present application includes a first acquisition unit and a first training unit, wherein: the first acquisition unit is used to acquire question text samples; the first training unit is used to input the question text samples into the initial question parsing model for iterative training to obtain the question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to encode a question text sample to obtain the corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors upon receiving the first sentence vector, fill each initial intention vector according to the first sentence vector to obtain the corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system. In this way, the initial question parsing model is iteratively trained with the obtained question text samples. During training, the first encoding layer encodes a question text sample into the corresponding first sentence vector, the conversion layer generates a preset number of initial intention vectors and fills each of them based on the first sentence vector, so that the question text sample is parsed in a single pass into the text fragments corresponding to at least one target intention vector, where the question intent reflected by each text fragment is determined by the corresponding initial intention vector. Moreover, the question parsing model converts the question text sample into different intention spaces and labels entities and constraints under the intention vector corresponding to each intention space, which helps improve slot recognition efficiency and reduces the number of models required in the slot recognition pipeline, thereby reducing latency; and since slot recognition is performed in all intention spaces simultaneously, concurrency is increased, improving the intent recognition capability and recognition efficiency of the question-answering system.

在上述的实施例中,提供了一种应答方法,与之相对应的,基于相同的技术构思,本申请实施例还提供了一种应答装置,下面结合附图进行说明。In the above-mentioned embodiment, a response method is provided. Correspondingly, based on the same technical concept, an embodiment of the present application also provides a response device, which will be described below in conjunction with the accompanying drawings.

图7为本申请实施例提供的一种应答装置示意图。FIG. 7 is a schematic diagram of a response device provided in an embodiment of the present application.

本实施例提供一种应答装置700,包括:第二获取单元702,用于获取待应答的目标问题;第一解析单元704,用于将所述目标问题输入问题解析模型进行解析处理,得到对应的目标片段;所述问题解析模型是通过问答系统的模型训练方法进行训练所得到的;第一确定单元706,用于根据所述目标片段,确定所述目标问题的答案。This embodiment provides an answering device 700, including: a second acquisition unit 702, used to acquire a target question to be answered; a first parsing unit 704, used to input the target question into a question parsing model for parsing and processing to obtain a corresponding target segment; the question parsing model is obtained by training using a model training method of a question-answering system; a first determination unit 706, used to determine an answer to the target question based on the target segment.

可选地,第一解析单元704,具体用于:将所述目标问题输入问题分类模型进行分类处理,得到分类结果;在所述分类结果用于表征所述目标问题属于第一预设分类的情况下,将所述目标问题输入所述问题解析模型进行解析处理,得到对应的目标片段。Optionally, the first parsing unit 704 is specifically used to: input the target problem into the problem classification model for classification processing to obtain a classification result; when the classification result is used to characterize that the target problem belongs to a first preset classification, input the target problem into the problem parsing model for parsing processing to obtain a corresponding target fragment.

可选地,第一确定单元706,具体用于:将所述目标片段输入实体链接模型进行预测处理,得到对应的目标实体;根据所述目标实体,进行槽位填充处理,得到所述目标问题的槽位填充结果;根据所述槽位填充结果,在预先配置的知识图谱中查询对应的答案,得到所述目标问题的答案。Optionally, the first determination unit 706 is specifically used to: input the target fragment into the entity linking model for prediction processing to obtain a corresponding target entity; perform slot filling processing based on the target entity to obtain a slot filling result for the target question; based on the slot filling result, query the corresponding answer in a pre-configured knowledge graph to obtain an answer to the target question.

The answering apparatus provided in the embodiments of the present application includes: a second acquisition unit, used to acquire a target question to be answered; a first parsing unit, used to input the target question into the question parsing model for parsing to obtain the corresponding target fragment, the question parsing model being obtained by training with the model training method for a question-answering system; and a first determination unit, used to determine the answer to the target question according to the target fragment. In this way, during training, the question parsing model encodes a question text sample through the first encoding layer into the corresponding first sentence vector, generates a preset number of initial intention vectors through the conversion layer, and fills each initial intention vector based on the first sentence vector, so that the question text sample is parsed in a single pass into the text fragments corresponding to at least one target intention vector, where the question intent reflected by each text fragment is determined by the corresponding initial intention vector. The question parsing model converts the question text sample into different intention spaces and labels entities and constraints under the intention vector corresponding to each intention space, which helps improve slot recognition efficiency and reduces the number of models required in the slot recognition pipeline, thereby reducing latency; and since slot recognition is performed in all intention spaces simultaneously, concurrency is increased, improving the intent recognition capability and recognition efficiency of the question-answering system. Further, generating the target fragment with the question parsing model and determining the answer to the target question from the target fragment improves the accuracy and efficiency of intent recognition in the question-answering system.

涉及知识图谱的问答系统通常是高度定制化的,需要海量的训练样本参与模型训练,才能取得较好的模型效果,进而实现高效准确的自动应答。然而,若通过人工配置训练样本,效率非常低下,难以满足模型训练需求。为了解决上述问题,本申请实施例提供了一种问答系统的样本生成方法。Question-answering systems involving knowledge graphs are usually highly customized and require a large number of training samples to participate in model training in order to achieve good model effects and thus achieve efficient and accurate automatic responses. However, if training samples are manually configured, the efficiency is very low and it is difficult to meet the model training requirements. In order to solve the above problems, an embodiment of the present application provides a sample generation method for a question-answering system.

图8为本申请实施例提供的一种问答系统的样本生成方法的处理流程图。图8的问答系统的样本生成方法可由电子设备执行,该电子设备可以是终端设备,比如手机、笔记本电脑、智能交互设备等等;或者,该电子设备还可以是服务器,比如独立的物理服务器、服务器集群或者是能够进行云计算的云服务器。参照图8,本实施例提供的问答系统的样本生成方法,具体包括步骤S802至步骤S808。Figure 8 is a processing flow chart of a sample generation method of a question-and-answer system provided in an embodiment of the present application. The sample generation method of the question-and-answer system of Figure 8 can be executed by an electronic device, which can be a terminal device, such as a mobile phone, a laptop computer, an intelligent interactive device, etc.; or, the electronic device can also be a server, such as an independent physical server, a server cluster, or a cloud server capable of cloud computing. Referring to Figure 8, the sample generation method of the question-and-answer system provided in this embodiment specifically includes steps S802 to S808.

问答系统可以包括初始问题分类模型、初始问题解析模型以及初始实体链接模型。The question answering system may include an initial question classification model, an initial question parsing model, and an initial entity linking model.

其中,本实施例中的问答系统、问答系统中的初始问题分类模型、初始问题解析模型以及初始实体链接模型的具体说明涉及问答系统的模型训练方法,可参照上述方法实施例中的具体方案。Among them, the specific description of the question-answering system, the initial question classification model, the initial question parsing model and the initial entity linking model in the question-answering system in this embodiment involves the model training method of the question-answering system, and reference can be made to the specific scheme in the above method embodiment.

图9为本申请实施例提供的一种问答系统的框架示意图。FIG9 is a schematic diagram of a framework of a question-answering system provided in an embodiment of the present application.

如图9所示,在问答系统投入使用之前,需要先通过模型训练部分进行模型训练,具体地,需要对初始问句解析模型进行迭代训练,得到问句解析模型902;需要对初始实体链接模型进行迭代训练,得到实体链接模型904;需要对初始OOD模型进行迭代训练,得到OOD模型906。为此,在模型训练之前,需要进行数据准备,具体地,待生成的训练数据集包括:用于对初始问句解析模型进行迭代训练的问句解析数据集908,用于对初始实体链接模型进行迭代训练的实体链接数据集910,以及,用于对初始OOD模型进行迭代训练的OOD数据集912。As shown in FIG9 , before the question answering system is put into use, it is necessary to first perform model training through the model training part. Specifically, it is necessary to iteratively train the initial question parsing model to obtain the question parsing model 902; it is necessary to iteratively train the initial entity linking model to obtain the entity linking model 904; it is necessary to iteratively train the initial OOD model to obtain the OOD model 906. To this end, data preparation is required before model training. Specifically, the training data set to be generated includes: a question parsing data set 908 for iterative training of the initial question parsing model, an entity linking data set 910 for iterative training of the initial entity linking model, and an OOD data set 912 for iterative training of the initial OOD model.

步骤S802,获取预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库;同义词集合包括多个标准词以及每个标准词对应的至少一个同义词;相似问集合包括多个第一属性以及每个第一属性对应的至少一个相似问句;比较词集合包括比较词信息;问题域集合包括域内问题文本和域外问题文本;标准实体库包括多个标准实体。Step S802, obtaining a pre-configured synonym set, similar question set, comparison word set, problem domain set, and standard entity library; the synonym set includes a plurality of standard words and at least one synonym corresponding to each standard word; the similar question set includes a plurality of first attributes and at least one similar question corresponding to each first attribute; the comparison word set includes comparison word information; the question domain set includes in-domain question text and out-of-domain question text; the standard entity library includes multiple standard entities.

具体实施时,同义词集合可以从预先配置的同义词模板中获取。During specific implementation, the synonym set may be obtained from a pre-configured synonym template.

在同义词模板中,可以显示如下字段:标准词、类型、同义词、更新时间、操作,等等。In the synonym template, the following fields can be displayed: standard word, type, synonym, update time, operation, etc.

具体地:(a1)标准词:系统自动生成,值来源于对应图谱中已定义的实体和属性。Specifically: (a1) Standard words: automatically generated by the system, and the values are derived from the entities and attributes defined in the corresponding graph.

(a2)类型:系统自动生成,值来源于对应图谱中已定义的实体和属性所对应的类型。(a2) Type: Automatically generated by the system, the value comes from the type corresponding to the entity and attribute defined in the corresponding graph.

(a3)同义词:通过“编辑”手动录入,或通过“上传同义词”批量录入。(a3) Synonyms: Manually enter via "Edit" or batch enter via "Upload Synonyms".

(a4)更新时间:同义词最新修改时间。(a4) Update time: the time when the synonym was last modified.

同义词集合包括多个标准词以及每个标准词对应的至少一个同义词,例如,同义词集合包括:标准词x1,该标准词x1对应的同义词x2、同义词x3以及同义词x4;标准词y1,该标准词y1对应的同义词y2;标准词z1,该标准词z1对应的同义词z2以及同义词z3,等等。The synonym set includes multiple standard words and at least one synonym corresponding to each standard word. For example, the synonym set includes: standard word x1, synonyms x2, synonyms x3 and synonyms x4 corresponding to the standard word x1; standard word y1, synonym y2 corresponding to the standard word y1; standard word z1, synonyms z2 and synonym z3 corresponding to the standard word z1, and so on.

具体实施时,相似问集合可以从预先配置的相似问模板中获取。During specific implementation, the similar question set can be obtained from a pre-configured similar question template.

在相似问模板中,实体类型列对应于概念,属性列对应于属性,针对属性构造相似问,在维护相似问模板时,注意配置掩码[e],即留出填充同义词的位置。In the similarity question template, the entity type column corresponds to the concept, the attribute column corresponds to the attribute, and similar questions are constructed for the attributes. When maintaining the similarity question template, pay attention to configuring the mask [e], that is, leaving space for filling in synonyms.

例如,“这个车位贷怎么办理呢?”在相似问模板中应维护为“这个[e]怎么办理呢?”For example, "How do I apply for this parking space loan?" should be maintained as "How do I apply for this [e]?" in the similar question template.

相似问模板可以用于问句解析数据集,与同义词、比较词共同组成完整句子,模拟真实场景下用户问题。Similar question templates can be used in question parsing datasets to form complete sentences together with synonyms and comparative words, simulating user questions in real scenarios.

在相似问模板中,可以显示如下字段:概念、属性名称、属性类型、相似问、更新时间、操作,等等。In the similarity question template, the following fields can be displayed: concept, attribute name, attribute type, similarity question, update time, operation, etc.

具体地:(b1)概念:对应图谱中定义好的概念,可搜索,不能新增、编辑。Specifically: (b1) Concept: corresponds to the concept defined in the graph, which can be searched but cannot be added or edited.

(b2)属性类型:对应图谱中定义好的属性,不能编辑。(b2) Attribute type: corresponds to the attribute defined in the graph and cannot be edited.

属性类型包括且不限于:文本、数字、图片、富文本、图(Map),等等。Attribute types include but are not limited to: text, number, image, rich text, map, etc.

(b3)相似问:某一具体属性的相似问,可以单条维护也可以批量导入, 相似问可以是一个,也可以是多个。(b3) Similar questions: Similar questions of a specific attribute can be maintained individually or imported in batches. The number of similar questions can be one or more.

(b4)更新时间:相似问最近修改时间。(b4) Update time: The time when the similar question was last modified.

(b5)操作:编辑、删除。(b5) Operation: Edit, Delete.

相似问集合包括多个第一属性以及每个第一属性对应的至少一个相似问句,例如,相似问集合包括:第一属性甲,以及该第一属性甲对应的相似问句甲1和相似问句甲2;第一属性乙,以及该第一属性乙对应的相似问句乙1,等等。The similar question set includes multiple first attributes and at least one similar question corresponding to each first attribute. For example, the similar question set includes: first attribute A, and similar question A1 and similar question A2 corresponding to the first attribute A; first attribute B, and similar question B1 corresponding to the first attribute B, and so on.

具体实施时,比较词集合可以从预先配置的比较词模板中获取。During specific implementation, the comparison word set may be obtained from a pre-configured comparison word template.

比较词模板,可以用于生成问句解析数据集,可维护的候选词主要为概念的属性类型为“数字”的属性词,其目的在于保证解析数据集中生成相关比较类型的训练数据。The comparison word template can be used to generate the question parsing dataset. The maintainable candidate words are mainly attribute words of concepts whose attribute type is "number"; the purpose is to ensure that training data of the relevant comparison types is generated in the parsing dataset.

比较词信息可以包括比较词和单位。The comparative word information may include a comparative word and a unit.

比较词包括且不限于:最低、最小、最少、最短、最便宜、低、小、少、短、便宜、最高、最大、最多、最长、最贵、高、大、多、长、贵,等等。Comparative words include but are not limited to: lowest, smallest, least, shortest, cheapest, low, small, few, short, cheap, highest, largest, most, longest, most expensive, high, big, many, long, expensive, etc.

单位包括且不限于:元、千、千元、万、万元、亿元、天、日、月、年、%、百分之,等等。Units include but are not limited to: yuan, thousand, thousand yuan, ten thousand, ten thousand yuan, hundred million yuan, day, day, month, year, %, percent, etc.

具体实施时,问题域集合可以从预先配置的问题域模板中获取。During specific implementation, the problem domain set may be obtained from a pre-configured problem domain template.

问题域模板用于生成OOD模型数据集,维护一些问答系统回答不了的问题、闲聊等。后台会抽样部分问句解析数据集组成二分类正负样本,确保问答数据的合理漏出。由于生成数据存在一定局限性,因此OOD模板提供域内、域外两种问题的编辑和导入保证分类准确性和数据可维护性。The question domain template is used to generate the OOD model dataset and to maintain questions that the question-answering system cannot answer, small talk, and the like. The backend samples part of the question parsing dataset to form positive and negative samples for binary classification, ensuring that answerable questions are passed through appropriately. Since the generated data has certain limitations, the OOD template supports editing and importing both in-domain and out-of-domain questions to ensure classification accuracy and data maintainability.

问题域集合包括域内问题文本和域外问题文本。The question domain set includes in-domain question texts and out-of-domain question texts.

问题域集合可以基于预先配置的问题域生成。问答系统具有应答域内问题的能力,且不具有应答域外问题的能力。The question domain set can be generated based on the pre-configured question domain. The question answering system has the ability to answer questions within the domain, but does not have the ability to answer questions outside the domain.

域内问题可以是位于该问题域之内的问题,用于描述该域内问题的文本可以是域内问题文本。域外问题可以是位于该问题域之外的问题,用于描述该域外问题的文本可以是域外问题文本。An in-domain problem may be a problem within the problem domain, and the text used to describe the in-domain problem may be the in-domain problem text. An out-of-domain problem may be a problem outside the problem domain, and the text used to describe the out-of-domain problem may be the out-of-domain problem text.

域内问题文本,例如“A活动的截止时间是哪天?”;域外问题文本,例如,“我今天要不要出门呢?”In-domain question text, such as "When is the deadline for activity A?"; out-of-domain question text, such as "Should I go out today?"

步骤S804,根据同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本。Step S804: generating question parsing samples based on at least one of the synonym set, the similar question set, and the comparison word set.

问题解析样本包括且不限于:单槽位样本、单属性样本、单实体单属性样本、单实体双属性样本、双实体单属性样本、双实体双属性样本、复合属性约束样本、比较类型样本,等等。示例性地,生成的问题解析样本可以参照如下格式存储:
The problem analysis samples include but are not limited to: single slot samples, single attribute samples, single entity single attribute samples, single entity dual attribute samples, dual entity single attribute samples, dual entity dual attribute samples, composite attribute constraint samples, comparison type samples, etc. Exemplarily, the generated problem analysis samples can be stored in the following format:
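As a purely illustrative, hedged sketch, one stored question parsing sample might take a JSON form such as the following, where the key names "text", "attribute", "h", "t", "value", "type" and "pos" are assumed names chosen to mirror the field descriptions given with the examples below, not the exact keys used by the system:
{"text": "问题文本", "attribute": "图谱属性", "h": {"value": "头实体标准名称", "type": "头实体类别", "pos": [0, 4]}, "t": {"value": "尾实体标准名称", "type": "尾实体类别", "pos": [5, 9]}}
Here "attribute" would hold the graph attribute (the storage intent), "h" the head entity and "t" the optional tail entity, with "pos" giving the character start and end positions.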

上述存储格式仅仅是示例性地,对本实施例不构成特殊限制。The above storage format is merely exemplary and does not constitute any special limitation to this embodiment.

图谱属性可以是生成后的数据在图谱中的属性,用于表示生成后的数据在图谱中的存放意图。The graph attribute may be an attribute of the generated data in the graph, and is used to indicate the storage intention of the generated data in the graph.

不同类型的问题解析样本的生成方式不同,下文对其中几种进行具体说明。Different types of question parsing samples are generated in different ways, and several of them are described in detail below.

在一种具体的实现方式中,根据同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本,包括:按照第一预设过滤条件,对同义词集合进行过滤处理,得到候选词列表;候选词列表包括候选标准词以及 候选标准词对应的候选同义词;对候选词列表进行第一采样处理,得到单槽位样本,将单槽位样本确定为问题解析样本;或者,对相似问集合中的多个第一属性进行第二采样处理,得到目标第一属性;对目标第一属性对应的至少一个相似问句进行第三采样处理,得到初始相似问句;根据初始相似问句,确定对应的单属性样本,将单属性样本确定为问题解析样本。In a specific implementation, generating a question analysis sample according to at least one of a synonym set, a similar question set, and a comparative word set includes: filtering the synonym set according to a first preset filtering condition to obtain a candidate word list; the candidate word list includes candidate standard words and candidate synonyms corresponding to the candidate standard words; performing a first sampling process on the candidate word list to obtain a single-slot sample, and determining the single-slot sample as a question parsing sample; or performing a second sampling process on multiple first attributes in a similar question set to obtain a target first attribute; performing a third sampling process on at least one similar question sentence corresponding to the target first attribute to obtain an initial similar question sentence; determining the corresponding single-attribute sample based on the initial similar question sentence, and determining the single-attribute sample as a question parsing sample.

问题解析样本可以包括单槽位样本,也可以包括单属性样本。Problem analysis samples may include single-slot samples or single-attribute samples.

具体实施时,在问题解析样本包括单槽位样本的情况下,根据同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本,可以是按照第一预设过滤条件,对同义词集合进行过滤处理,得到候选词列表;候选词列表包括候选标准词以及候选标准词对应的候选同义词;对候选词列表进行第一采样处理,得到单槽位样本。In specific implementation, when the question analysis sample includes a single-slot sample, the question analysis sample is generated according to at least one of a synonym set, a similar question set and a comparative word set. The synonym set may be filtered according to a first preset filtering condition to obtain a candidate word list; the candidate word list includes candidate standard words and candidate synonyms corresponding to the candidate standard words; and the candidate word list is subjected to a first sampling process to obtain a single-slot sample.

单槽位样本可以是在实体、属性、关系、约束等多种预设槽位类型中仅对应于其中一种预设槽位类型的文本样本。A single-slot sample may be a text sample that corresponds to only one preset slot type among multiple preset slot types such as entity, attribute, relationship, constraint, etc.

每种预设槽位类型对应于前述的槽位识别结果中的一种。Each preset slot type corresponds to one of the aforementioned slot identification results.

例如,通过关系抽取模型对问句“A产品的价格是多少”进行槽位识别处理,得到的槽位识别结果包括:“A产品”的槽位识别结果为“实体”,“价格”的槽位识别结果为“属性”,等等。For example, the question "What is the price of product A?" is processed by slot recognition through the relationship extraction model, and the obtained slot recognition results include: the slot recognition result of "product A" is "entity", the slot recognition result of "price" is "attribute", and so on.

槽位识别结果“实体”对应于预设槽位类型“实体”。属于预设槽位类型“实体”的单槽位样本,可以是“A产品”、“B活动”、“C商品和D商品”,等等。The slot recognition result "entity" corresponds to the preset slot type "entity". The single slot samples belonging to the preset slot type "entity" can be "A product", "B activity", "C product and D product", and so on.

槽位识别结果“属性”对应于预设槽位类型“属性”。属于预设槽位类型“属性”的单槽位样本可以是“价格”、“活动截止日期”、“利率”、“折扣”,等等。The slot recognition result "attribute" corresponds to the preset slot type "attribute". Single slot samples belonging to the preset slot type "attribute" may be "price", "activity deadline", "interest rate", "discount", and so on.

单槽位样本的生成是为了模拟用户意图不明时的问句,可以在引导反问、连续反问场景下使用。The generation of single-slot samples is to simulate questions asked when the user's intention is unclear, and can be used in guided rhetorical questions and continuous rhetorical questions scenarios.

示例性地,单槽位样本可以是单实体样本,也可以是双实体样本,还可以是单约束样本。Exemplarily, a single-slot sample may be a single-entity sample, a double-entity sample, or a single-constraint sample.

需要注意的是,“双实体样本”尽管包括两个实体,但该两个实体均对应于一种预设槽位类型,即预设槽位类型“实体”,故“双实体样本”也是一种单槽位样本。It should be noted that although the "dual-entity sample" includes two entities, the two entities both correspond to a preset slot type, namely the preset slot type "entity", so the "dual-entity sample" is also a single-slot sample.

图10为本申请实施例提供的一种问答系统的样本生成方法的第1种处理流程子图。图10示例性示出了单槽位样本的一种生成方式,单槽位样本包括且不限于:单实体数据1008、双实体数据1010以及单约束数据1012。FIG. 10 is a first processing flow sub-diagram of a sample generation method of a question-answering system provided in an embodiment of the present application. FIG. 10 exemplarily shows a method for generating single-slot samples, where the single-slot samples include but are not limited to: single-entity data 1008, double-entity data 1010, and single-constraint data 1012.

如图10所示,第一预设过滤条件可以包括过滤条件“实体”和过滤条件“json key(JS对象简谱关键字)”。基于过滤条件“实体”,对同义词集合1002进行过滤处理,得到候选词列表1004,对候选词列表1004进行第一采样处理,可以生成单实体数据1008和/或双实体数据1010。第一采样处理的采样方式可以是随机采样,也可以是其他采样方式。As shown in FIG10 , the first preset filtering condition may include the filtering condition “entity” and the filtering condition “json key (JS object notation keyword)”. Based on the filtering condition “entity”, the synonym set 1002 is filtered to obtain a candidate word list 1004, and the candidate word list 1004 is subjected to a first sampling process to generate single entity data 1008 and/or double entity data 1010. The sampling method of the first sampling process may be random sampling or other sampling methods.

单实体数据1008,例如,
Single entity data 1008, for example,
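A hedged sketch of this single entity data, using the hypothetical field layout sketched earlier and the values given in the description below:
{"text": "票据业务", "attribute": "NULL", "h": {"value": "票据业务", "type": "业务", "pos": [0, 4]}}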

其中,位于"text(文本)"行的“票据业务”可以是生成后的数据,即单实体数据1008对应的文本。“NULL”可以是图谱属性,用于表示该单实体数据1008的存放意图为空。“0”可以是头实体开始位置,“4”可以是头实体结束位置,“业务”可以是头实体类别,位于"value(值)"行的“票据业务”可以是头实体标准名称。Among them, "Bill Business" located in the "text" row may be the generated data, that is, the text corresponding to the single entity data 1008. "NULL" may be a graph attribute, used to indicate that the storage intention of the single entity data 1008 is empty. "0" may be the start position of the header entity, "4" may be the end position of the header entity, "Business" may be the header entity category, and "Bill Business" located in the "value" row may be the standard name of the header entity.

基于过滤条件“json key”,对同义词集合1002进行过滤处理,得到候选词列表1006,对候选词列表1006进行第一采样处理,可以生成单约束数据1012。Based on the filtering condition "json key", the synonym set 1002 is filtered to obtain a candidate word list 1006. The candidate word list 1006 is subjected to a first sampling process to generate single constraint data 1012.

单约束数据1012,例如,

Single constraint data 1012, for example,
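A hedged sketch of this single constraint data, again with hypothetical key names and values taken from the description below:
{"text": "X支行", "attribute": "办理流程", "h": {"value": "X支行", "type": "constraint", "pos": [0, 4]}}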

其中,位于"text"行的“X支行”可以是生成后的数据,即单约束数据1012对应的文本。“办理流程”可以是图谱属性,用于表示该单约束数据1012的存放意图为办理流程。“0”可以是头实体开始位置,“4”可以是头实体结束位置,“constraint(约束)”可以是头实体类别,位于"value"行的“X支行”可以是头实体标准名称。Among them, "X Branch" in the "text" row may be the generated data, that is, the text corresponding to the single constraint data 1012. "Process flow" may be a graph attribute, used to indicate that the storage intention of the single constraint data 1012 is the process flow. "0" may be the starting position of the header entity, "4" may be the ending position of the header entity, "constraint" may be the header entity category, and "X Branch" in the "value" row may be the standard name of the header entity.

具体实施时,在问题解析样本包括单属性样本的情况下,根据同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本,可以是对相似问集合中的多个第一属性进行第二采样处理,得到目标第一属性;对目标第一属性对应的至少一个相似问句进行第三采样处理,得到初始相似问句;根据初始相似问句,确定对应的单属性样本,将单属性样本确定为问题解析样本。In a specific implementation, when the question parsing sample includes a single-attribute sample, the question parsing sample is generated according to at least one of a synonym set, a similar question set, and a comparative word set. The second sampling process may be performed on multiple first attributes in the similar question set to obtain a target first attribute; a third sampling process may be performed on at least one similar question sentence corresponding to the target first attribute to obtain an initial similar question sentence; and a corresponding single-attribute sample is determined according to the initial similar question sentence, and the single-attribute sample is determined as the question parsing sample.

具体地,根据初始相似问句,确定对应的单属性样本,将单属性样本确定为问题解析样本,可以是:若初始相似问句携带有掩码,则在初始相似问句中对掩码进行删除处理,得到单属性样本;若初始相似问句未携带掩码,则将初始相似问句确定为单属性样本。Specifically, according to the initial similar question, the corresponding single-attribute sample is determined, and the single-attribute sample is determined as the question parsing sample. It can be: if the initial similar question carries a mask, the mask is deleted in the initial similar question to obtain the single-attribute sample; if the initial similar question does not carry a mask, the initial similar question is determined as a single-attribute sample.

单属性样本可以是仅对应于预设槽位类型“属性”的文本样本。单属性样本可以是一种特殊的单槽位样本,由于单属性样本的用途与上述列举的“单实体样本”、“双实体样本”以及“单约束样本”不同,故对其进行单独说明。A single attribute sample can be a text sample that only corresponds to the preset slot type "attribute". A single attribute sample can be a special single slot sample. Since the purpose of a single attribute sample is different from the "single entity sample", "double entity sample" and "single constraint sample" listed above, it is explained separately.

单属性样本的生成是为了模拟用户询问主体不明时的问句,在引导反问、连续反问场景下使用。The generation of single-attribute samples is to simulate questions asked by users when the subject is unknown, and is used in guided rhetorical questions and continuous rhetorical questions scenarios.

图11为本申请实施例提供的一种问答系统的样本生成方法的第2种处理流程子图。图11示例性示出了单属性样本的一种生成方式,单属性样本可以是单属性数据1108。 Fig. 11 is a second processing flow chart of a sample generation method of a question-answering system provided in an embodiment of the present application. Fig. 11 exemplarily shows a generation method of a single attribute sample, and the single attribute sample can be single attribute data 1108 .

如图11所示,对相似问集合1102中的多个第一属性进行第二采样处理,得到目标第一属性;对目标第一属性对应的至少一个相似问句进行第三采样处理,得到初始相似问句。若初始相似问句携带有掩码,即初始相似问句为带[e]的相似问句1104,则删除[e],得到单属性数据1108;若初始相似问句未携带掩码,即初始相似问句为不带[e]的相似问句1106,则将相似问句1106确定为单属性数据1108。As shown in FIG11 , a second sampling process is performed on multiple first attributes in the similar question set 1102 to obtain a target first attribute; a third sampling process is performed on at least one similar question sentence corresponding to the target first attribute to obtain an initial similar question sentence. If the initial similar question sentence carries a mask, that is, the initial similar question sentence is a similar question sentence 1104 with [e], then [e] is deleted to obtain single attribute data 1108; if the initial similar question sentence does not carry a mask, that is, the initial similar question sentence is a similar question sentence 1106 without [e], then the similar question sentence 1106 is determined as single attribute data 1108.

第二采样处理和第三采样处理的处理方式可以是随机采样,也可以是其他采样方式。The processing methods of the second sampling process and the third sampling process may be random sampling or other sampling methods.

单属性数据1108,例如,
Single attribute data 1108, for example,
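A hedged sketch of this single attribute data (hypothetical key names; no head or tail entity is recorded, as described below):
{"text": "办理流程是什么啊", "attribute": "办理流程"}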

其中,位于"text"行的“办理流程是什么啊”可以是生成后的数据,即单属性数据1108对应的文本。“办理流程”可以是图谱属性,用于表示该单属性数据1108的存放意图为办理流程。该单属性数据1108不具有头实体,也不具有尾实体。Among them, "What is the handling process?" in the "text" row may be generated data, that is, the text corresponding to the single attribute data 1108. "Handling process" may be a graph attribute, used to indicate that the storage intention of the single attribute data 1108 is the handling process. The single attribute data 1108 has neither a head entity nor a tail entity.

在一种具体的实现方式中,根据同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本,包括:按照第二预设过滤条件,对同义词集合进行过滤处理,得到候选实体词表;候选实体词表包括多个候选实体词;对相似问集合进行筛选处理,得到中间相似问集合;中间相似问集合包括多个第一属性以及每个第一属性对应的至少一个携带有掩码的候选相似问句;根据候选词列表和中间相似问集合,确定对应于同一属性类别的目标候选实体词和目标候选相似问句;通过目标候选实体词,对目标候选相似问句中的掩码进行替换处理,得到复合样本,将复合样本确定为问题解析样本;或者,按照第三预设过滤条件对同义词集合进行过滤处理,得到属性同义词和实体同义词;生成随机数;根据比较词信息、属性同义词、实体同义词、随机数以及预设描述词进行拼接处理,得到比较类型样本,将比较类型样本确定为问题解析样本。In a specific implementation, generating a question parsing sample according to at least one of the synonym set, the similar question set, and the comparison word set includes: filtering the synonym set according to a second preset filtering condition to obtain a candidate entity word list, where the candidate entity word list includes multiple candidate entity words; screening the similar question set to obtain an intermediate similar question set, where the intermediate similar question set includes multiple first attributes and, for each first attribute, at least one candidate similar question carrying a mask; determining, according to the candidate word list and the intermediate similar question set, a target candidate entity word and a target candidate similar question corresponding to the same attribute category; replacing the mask in the target candidate similar question with the target candidate entity word to obtain a composite sample, and determining the composite sample as a question parsing sample; or, filtering the synonym set according to a third preset filtering condition to obtain attribute synonyms and entity synonyms; generating a random number; and concatenating the comparison word information, the attribute synonyms, the entity synonyms, the random number, and preset description words to obtain a comparison type sample, and determining the comparison type sample as a question parsing sample.

问题解析样本可以包括复合样本,也可以包括比较类型样本。Problem analysis samples can include composite samples as well as comparative samples.

具体实施时,在问题解析样本包括复合样本的情况下,根据同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本,可以是按照第二预设过滤条件,对同义词集合进行过滤处理,得到候选实体词表;候选实体词表包括多个候选实体词;对相似问集合进行筛选处理,得到中间相似问集合;中间相似问集合包括多个第一属性以及每个第一属性对应的至少一个携带有掩码的候选相似问句;根据候选词列表和中间相似问集合,确定对应于同一属性类别的目标候选实体词和目标候选相似问句;通过目标候选实体词,对目标候选相似问句中的掩码进行替换处理,得到复合样本。In a specific implementation, when the question parsing sample includes a composite sample, the question parsing sample is generated according to at least one of a synonym set, a similar question set and a comparative word set. The method may be to filter the synonym set according to a second preset filtering condition to obtain a candidate entity word list; the candidate entity word list includes a plurality of candidate entity words; the similar question set is screened to obtain an intermediate similar question set; the intermediate similar question set includes a plurality of first attributes and at least one candidate similar question sentence carrying a mask corresponding to each first attribute; according to the candidate word list and the intermediate similar question set, the target candidate entity words and the target candidate similar question sentence corresponding to the same attribute category are determined; and the mask in the target candidate similar question sentence is replaced by the target candidate entity word to obtain a composite sample.

复合样本可以是在实体、属性、关系、约束等多种预设槽位类型中,对应于其中多种预设槽位类型的文本样本。A composite sample may be a text sample corresponding to multiple preset slot types among multiple preset slot types such as entity, attribute, relationship, constraint, etc.

复合样本包括且不限于:单实体单属性样本、单实体双属性样本、双实体单属性样本以及双实体双属性样本,等等。Composite samples include but are not limited to: single-entity single-attribute samples, single-entity dual-attribute samples, dual-entity single-attribute samples, and dual-entity dual-attribute samples, etc.

其中,单实体单属性样本的生成是为了模拟用户正常询问某个事物的属性时的问句,在单实体单属性场景下使用单实体双属性样本的生成是为了模拟用户同时询问某个事物的两个或多个属性时的问句,在单实体多属性场景下使用。双实体单属性样本的生成是为了模拟用户同时询问多个事物的同一属性时的问句,在多实体单属性场景下使用。双实体双属性样本的生成是为了模拟用户同时询问多个事物的多个属性时的问句,在多实体多属性场景下使用。Among them, the generation of single-entity single-attribute samples is to simulate the questions when users normally ask about the attributes of a thing, and is used in single-entity single-attribute scenarios. The generation of single-entity dual-attribute samples is to simulate the questions when users ask about two or more attributes of a thing at the same time, and is used in single-entity multi-attribute scenarios. The generation of dual-entity single-attribute samples is to simulate the questions when users ask about the same attribute of multiple things at the same time, and is used in multi-entity single-attribute scenarios. The generation of dual-entity dual-attribute samples is to simulate the questions when users ask about multiple attributes of multiple things at the same time, and is used in multi-entity multi-attribute scenarios.

图12为本申请实施例提供的一种问答系统的样本生成方法的第3种处理流程子图。图12示例性示出了复合样本的一种生成方式,复合样本可以是复合数据1212。复合数据1212包括且不限于:单实体单属性数据、单实体双属性数据、双实体单属性数据以及双实体双属性数据。FIG12 is a third processing flow sub-diagram of a sample generation method of a question-answering system provided in an embodiment of the present application. FIG12 exemplarily shows a generation method of a composite sample, and the composite sample can be composite data 1212. Composite data 1212 includes but is not limited to: single-entity single-attribute data, single-entity dual-attribute data, dual-entity single-attribute data, and dual-entity dual-attribute data.

如图12所示,第二预设过滤条件可以包括过滤条件“实体”。基于过滤条件“实体”对同义词集合1202进行过滤处理,得到候选实体词表;候选实体词表包括多个候选实体词;对相似问集合1204进行筛选处理,得到中间相似问集合;中间相似问集合包括多个第一属性以及每个第一属性对应的至少一个携带有掩码的候选相似问句1206;根据候选词列表和中间相似问集合,确定每个属性类别对应的候选实体词以及候选相似问句1208;对每个属性类别对应的候选实体词以及候选相似问句1208进行随机采样,可以得到对应于同一个属性类别的目标候选实体词和目标候选相似问句1210;通过目标候选实体词,对目标候选相似问句中的掩码[e]进行替换处理,得到复合数据1212。As shown in FIG. 12, the second preset filtering condition may include the filtering condition "entity". Based on the filtering condition "entity", the synonym set 1202 is filtered to obtain a candidate entity word list; the candidate entity word list includes multiple candidate entity words; the similar question set 1204 is screened to obtain an intermediate similar question set; the intermediate similar question set includes multiple first attributes and at least one candidate similar question 1206 with a mask corresponding to each first attribute; according to the candidate word list and the intermediate similar question set, the candidate entity words and candidate similar questions 1208 corresponding to each attribute category are determined; the candidate entity words and candidate similar questions 1208 corresponding to each attribute category are randomly sampled to obtain a target candidate entity word and a target candidate similar question 1210 corresponding to the same attribute category; and the mask [e] in the target candidate similar question is replaced with the target candidate entity word to obtain composite data 1212.

单实体单属性数据,例如,
Single-entity single-attribute data, for example,
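A hedged sketch of this single-entity single-attribute data, with hypothetical key names and values from the description below:
{"text": "在什么地方支持办理业务甲", "attribute": "办理渠道", "h": {"value": "业务甲", "type": "业务", "pos": [9, 12]}}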

其中,“在什么地方支持办理业务甲”可以是生成后的数据,即单实体单属性数据对应的文本。“办理渠道”可以是图谱属性,用于表示该单实体单属性数据的存放意图为办理渠道。“9”可以是头实体开始位置,“12”可以是头实体结束位置,“业务”可以是头实体类别,“业务甲”可以是头实体标准名称。Among them, "Where to support the processing of business A" can be the generated data, that is, the text corresponding to the single-entity single-attribute data. "Processing channel" can be a graph attribute, which is used to indicate that the storage intention of the single-entity single-attribute data is the processing channel. "9" can be the starting position of the header entity, "12" can be the ending position of the header entity, "Business" can be the header entity category, and "Business A" can be the standard name of the header entity.

确定同一个属性类别的目标候选实体词和目标候选相似问句1210;通过目标候选实体词,对目标候选相似问句中的掩码[e]进行替换处理,得到复合数据1212,也可以是,随机选取同一个属性类别下的一个目标候选实体词和两个目标候选相似问句,通过目标候选实体词,对第一个目标候选相似问句中的掩码[e]进行替换处理,得到属性问1,且将第二个目标候选相似问句中的[e]删除,得到属性问2,将属性问1与属性问2进行拼接处理,得到单实体双属性数据。Determining the target candidate entity word and target candidate similar question 1210 of the same attribute category, and replacing the mask [e] in the target candidate similar question with the target candidate entity word to obtain composite data 1212, may also be: randomly selecting one target candidate entity word and two target candidate similar questions under the same attribute category, replacing the mask [e] in the first target candidate similar question with the target candidate entity word to obtain attribute question 1, deleting the [e] in the second target candidate similar question to obtain attribute question 2, and concatenating attribute question 1 and attribute question 2 to obtain single-entity dual-attribute data.

单实体双属性数据,例如,
Single entity dual attribute data, for example,
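A hedged sketch of this single-entity dual-attribute data; the list layout for the two attributes and the two (identical-span) head entities is an assumption, and the values simply follow the description below:
{"text": "请问办理业务X需要提交什么资料吗,请问申请要什么要求", "attribute": ["办理资料", "办理条件"], "h": [{"value": "出口信用证", "type": "业务", "pos": [4, 11]}, {"value": "X", "type": "业务", "pos": [4, 11]}]}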

其中,"请问办理业务X需要提交什么资料吗,请问申请要什么要求"可以是生成后的数据,即单实体双属性数据对应的文本。“办理资料”和“办理条件”可以是两个不同的图谱属性,用于表示该单实体双属性数据的存放意图包括办理资料和办理条件。“4”可以是头实体开始位置,“11”可以是头实体结束位置,“业务”可以是头实体类别,"出口信用证"可以是第一个头实体标准名称,"X"可以是第二个头实体标准名称。Among them, "What information do I need to submit to handle business X? What are the application requirements?" can be the generated data, that is, the text corresponding to the single-entity dual-attribute data. "Handling information" and "Handling conditions" can be two different graph attributes, used to indicate that the storage intention of the single-entity dual-attribute data includes handling information and handling conditions. "4" can be the starting position of the header entity, "11" can be the ending position of the header entity, "Business" can be the header entity category, "Export Letter of Credit" can be the first header entity standard name, and "X" can be the second header entity standard name.

需要注意的是,该单实体双属性数据中,第一个头实体和第二个头实体为同一个实体,故第一个头实体开始位置与第二个头实体开始位置相同,第一个头实体结束位置与第二个头实体结束位置相同。 It should be noted that, in the single-entity dual-attribute data, the first header entity and the second header entity are the same entity, so the starting position of the first header entity is the same as the starting position of the second header entity, and the ending position of the first header entity is the same as the ending position of the second header entity.

确定同一个属性类别的目标候选实体词和目标候选相似问句1210;通过目标候选实体词,对目标候选相似问句中的掩码[e]进行替换处理,得到复合数据1212,还可以是,随机选取同一个属性类别下的两个目标候选实体词和一个目标候选相似问句,在多种预设拼接方式中随机选取一种作为目标拼接方式,通过目标拼接方式对两个目标候选实体词进行拼接处理,得到目标拼接词,并通过目标拼接词,对目标候选相似问句中的掩码[e]进行替换处理,得到双实体单属性数据。Determining the target candidate entity word and target candidate similar question 1210 of the same attribute category, and replacing the mask [e] in the target candidate similar question with the target candidate entity word to obtain composite data 1212, may also be: randomly selecting two target candidate entity words and one target candidate similar question under the same attribute category, randomly selecting one of multiple preset splicing methods as the target splicing method, splicing the two target candidate entity words according to the target splicing method to obtain a target spliced word, and replacing the mask [e] in the target candidate similar question with the target spliced word to obtain dual-entity single-attribute data.

预设拼接方式,例如,采用“和”、“与”、“,”、“、”中的一者将两个词语拼接到一起。A preset splicing method, for example, uses one of “and”, “and”, “,”, “,” to splice two words together.

双实体单属性数据,例如,
Dual-entity single-attribute data, for example,
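A hedged sketch of this dual-entity single-attribute data, with hypothetical key names and the values from the description below:
{"text": "A业务和B业务的授信额度有多少钱", "attribute": "额度", "h": [{"value": "A业务", "type": "业务", "pos": [0, 6]}, {"value": "B业务", "type": "业务", "pos": [7, 12]}]}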

其中,"A业务和B业务的授信额度有多少钱"可以是生成后的数据,即双实体单属性数据对应的文本。“额度”可以是图谱属性,用于表示该双实体单属性数据的存放意图为额度。“0”可以是第一个头实体开始位置,“6”可以是第一个头实体结束位置,“7”可以是第二个头实体开始位置,“12”可以是第二个头实体结束位置,“业务”可以是头实体类别,"A业务"可以是第一个头实体标准名称,"B业务"可以是第二个头实体标准名称。Among them, "How much is the credit line for business A and business B" can be the generated data, that is, the text corresponding to the dual-entity single-attribute data. "Amount" can be a graph attribute, used to indicate that the storage intention of this dual-entity single-attribute data is the amount. "0" can be the start position of the first header entity, "6" can be the end position of the first header entity, "7" can be the start position of the second header entity, "12" can be the end position of the second header entity, "Business" can be the header entity category, "A Business" can be the standard name of the first header entity, and "B Business" can be the standard name of the second header entity.

图13为本申请实施例提供的一种问答系统的样本生成方法的第4种处理流程子图。图13示例性示出了双实体双属性样本的一种生成方式,双实体双属性样本可以是双实体双属性数据1306。Fig. 13 is a fourth processing flow chart of a sample generation method of a question-answering system provided in an embodiment of the present application. Fig. 13 exemplarily shows a generation method of a dual-entity dual-attribute sample, where the dual-entity dual-attribute sample can be dual-entity dual-attribute data 1306 .

具体实施时,可以从已生成的多个单实体单属性数据中,随机抽取多条数据,例如,单实体单属性数据1302和单实体单属性数据1304,通过随机选取的连接符对单实体单属性数据1302和单实体单属性数据1304进行拼接处理,组成双实体双属性数据1306。In specific implementation, multiple data can be randomly selected from the multiple generated single-entity single-attribute data, for example, single-entity single-attribute data 1302 and single-entity single-attribute data 1304, and single-entity single-attribute data 1302 and single-entity single-attribute data 1304 can be spliced together through randomly selected connectors to form dual-entity dual-attribute data 1306.

双实体双属性数据1306,例如,

Dual-entity dual-attribute data 1306, for example,
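A hedged sketch of this dual-entity dual-attribute data, again with hypothetical key names and list layout, and values taken from the description below:
{"text": "Y产品的额度使用方式具体有哪些,Z业务一般能使用什么卡种", "attribute": ["额度使用方式", "种类"], "h": [{"value": "Y产品", "type": "业务", "pos": [0, 3]}, {"value": "Z业务", "type": "业务", "pos": [16, 24]}]}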

其中,"Y产品的额度使用方式具体有哪些,Z业务一般能使用什么卡种"可以是生成后的数据,即双实体双属性数据对应的文本。“额度使用方式”和"种类"可以是两种不同的图谱属性,用于表示该双实体双属性数据的存放意图包括额度使用方式和种类。“0”可以是第一个头实体开始位置,“3”可以是第一个头实体结束位置,“16”可以是第二个头实体开始位置,“24”可以是第二个头实体结束位置,“业务”可以是头实体类别,"Y产品"可以是第一个头实体标准名称,"Z业务"可以是第二个头实体标准名称。Among them, "What are the specific ways to use the credit limit of product Y, and what kind of cards can generally be used for business Z" can be the generated data, that is, the text corresponding to the dual-entity and dual-attribute data. "Use of credit limit" and "Type" can be two different graph attributes, used to indicate that the storage intention of the dual-entity and dual-attribute data includes the use of credit limit and type. "0" can be the starting position of the first header entity, "3" can be the end position of the first header entity, "16" can be the starting position of the second header entity, "24" can be the end position of the second header entity, "Business" can be the header entity category, "Y Product" can be the standard name of the first header entity, and "Z Business" can be the standard name of the second header entity.

另外,图14为本申请实施例提供的一种问答系统的样本生成方法的第5种处理流程子图。图14示例性示出了复合属性约束样本的一种生成方式。复合属性约束样本可以是复合属性约束数据1416。In addition, Fig. 14 is a fifth processing flow chart of a sample generation method of a question-answering system provided in an embodiment of the present application. Fig. 14 exemplarily shows a generation method of a composite attribute constraint sample. The composite attribute constraint sample can be composite attribute constraint data 1416.

复合属性约束样本的生成是为了模拟用户询问某个限定条件下某个事物的某个属性时的问句,在复合属性约束场景以及连续反问场景下使用。The generation of compound attribute constraint samples is to simulate the questions asked by users when they ask about a certain attribute of a thing under certain limited conditions. They are used in compound attribute constraint scenarios and continuous rhetorical question scenarios.

如图14所示,基于过滤条件“json key”对同义词集合1402进行过滤处理,得到json属性同义词1406,以及,基于过滤条件“实体”,对同义词集合1402进行过滤处理,得到实体同义词1408。As shown in FIG. 14 , the synonym set 1402 is filtered based on the filter condition “json key” to obtain json attribute synonyms 1406 , and the synonym set 1402 is filtered based on the filter condition “entity” to obtain entity synonyms 1408 .

对相似问集合1404进行筛选处理,得到携带有掩码的相似问句,即带[e]的相似问句1410。The similar question set 1404 is screened to obtain similar questions with masks, that is, similar questions 1410 with [e].

根据json属性同义词1406、实体同义词1408以及带[e]的相似问句1410,可以确定每个属性类别对应的实体同义词、json属性同义词以及相似问句1412。According to the json attribute synonyms 1406, entity synonyms 1408 and similar questions 1410 with [e], the entity synonyms, json attribute synonyms and similar questions 1412 corresponding to each attribute category can be determined.

对每个属性类别对应的实体同义词、json属性同义词以及相似问句1412进行随机采样,可以得到目标实体同义词、目标json属性同义词以及目标相似问句1414。通过目标实体同义词,对目标相似问句中的[e]进行替换处理,得到中间相似问句,再随机确定拼接方式,按照确定的拼接方式将目标json属性同义词与中间相似问句进行拼接处理,得到复合属性约束数据1416。By randomly sampling the entity synonyms, json attribute synonyms, and similar questions 1412 corresponding to each attribute category, target entity synonyms, target json attribute synonyms, and target similar questions 1414 can be obtained. The [e] in the target similar questions is replaced by the target entity synonyms to obtain the intermediate similar questions, and then the splicing method is randomly determined. The target json attribute synonyms and the intermediate similar questions are spliced according to the determined splicing method to obtain the composite attribute constraint data 1416.

复合属性约束数据1416,例如,

Composite attribute constraint data 1416, for example,
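A hedged sketch of this composite attribute constraint data, with hypothetical key names; here the constraint appears as the tail entity, as described below:
{"text": "能说下业务乙办理流程吗手机银行", "attribute": "办理流程", "h": {"value": "业务乙", "type": "业务", "pos": [3, 6]}, "t": {"value": "手机银行", "type": "constraint", "pos": [11, 15]}}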

其中,"能说下业务乙办理流程吗手机银行"可以是生成后的数据,即复合属性约束数据1416对应的文本。“办理流程”可以是图谱属性,用于表示该复合属性约束数据1416的存放意图为办理流程。“3”可以是头实体开始位置,“6”可以是头实体结束位置,“11”可以是尾实体开始位置,“15”可以是尾实体结束位置,“业务”可以是头实体类别,“constraint”可以是尾实体类别,"业务乙"可以是头实体标准名称,"手机银行"可以是尾实体标准名称。Among them, "Can you tell me about the process of handling business B mobile banking?" can be the generated data, that is, the text corresponding to the composite attribute constraint data 1416. "Process" can be a graph attribute, used to indicate that the storage intention of the composite attribute constraint data 1416 is the process. "3" can be the start position of the head entity, "6" can be the end position of the head entity, "11" can be the start position of the tail entity, "15" can be the end position of the tail entity, "business" can be the head entity category, "constraint" can be the tail entity category, "business B" can be the head entity standard name, and "mobile banking" can be the tail entity standard name.

具体实施时,在问题解析样本包括比较类型样本的情况下,根据同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本,可以是按照第三预设过滤条件对同义词集合进行过滤处理,得到属性同义词和实体同义词;生成随机数;根据比较词信息、属性同义词、实体同义词、随机数以及预设描述词进行拼接处理,得到比较类型样本。In a specific implementation, when the question analysis sample includes a comparison type sample, the question analysis sample is generated according to at least one of a synonym set, a similar question set and a comparison word set. The synonym set may be filtered according to a third preset filtering condition to obtain attribute synonyms and entity synonyms; a random number is generated; and a splicing process is performed based on comparison word information, attribute synonyms, entity synonyms, random numbers and preset description words to obtain a comparison type sample.

需要注意的是,在第一预设过滤条件、第二预设过滤条件以及第三预设过滤条件中,“第一”、“第二”和“第三”仅仅是为了便于区分不同的过滤条件,不具有实际含义。It should be noted that, in the first preset filtering condition, the second preset filtering condition and the third preset filtering condition, “first”, “second” and “third” are only used to distinguish different filtering conditions and have no actual meaning.

比较类型样本的生成是为了模拟用户询问某类事物下的某个属性中最大、最小、大于、小于、之间等需要数值比较的问句,在比较句场景下使用。The generation of comparison type samples is to simulate users asking questions that require numerical comparison, such as the maximum, minimum, greater than, less than, and between, of a certain attribute under a certain category of things, and is used in comparison sentence scenarios.

图15为本申请实施例提供的一种问答系统的样本生成方法的第6种处理流程子图。图15示例性示出了比较类型样本的一种生成方式,比较类型样本可以是比较类型数据1512。 Fig. 15 is a sixth processing flow chart of a sample generation method of a question-answering system provided in an embodiment of the present application. Fig. 15 exemplarily shows a generation method of a comparison type sample, and the comparison type sample can be comparison type data 1512.

如图15所示,按照第三预设过滤条件对同义词集合1504进行过滤处理,得到属性同义词1506和实体同义词1508。As shown in FIG. 15 , the synonym set 1504 is filtered according to the third preset filtering condition to obtain attribute synonyms 1506 and entity synonyms 1508 .

生成随机数1510。Generate a random number 1510.

根据比较词集合中的比较词信息1502、属性同义词1506、实体同义词1508、随机数1510以及预设描述词进行拼接处理,得到比较类型数据1512。Comparison word information 1502 , attribute synonyms 1506 , entity synonyms 1508 , random numbers 1510 , and preset description words in the comparison word set are concatenated to obtain comparison type data 1512 .

比较类型数据1512,例如,
Comparison type data 1512, for example,
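A hedged sketch of this comparison type data, with hypothetical key names and the values from the description below:
{"text": "M产品最长可以多少年", "attribute": "期限", "h": {"value": "资产业务", "type": "业务", "pos": [0, 4]}}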

其中,“M产品最长可以多少年”可以是生成后的数据,即比较类型数据1512对应的文本。“期限”可以是图谱属性,用于表示该比较类型数据1512的存放意图为期限。“0”可以是头实体开始位置,“4”可以是头实体结束位置,“业务”可以是头实体类别,“资产业务”可以是头实体标准名称。Among them, "How many years can the M product last at most" can be the generated data, that is, the text corresponding to the comparison type data 1512. "Term" can be a graph attribute, used to indicate that the storage intention of the comparison type data 1512 is the term. "0" can be the starting position of the header entity, "4" can be the ending position of the header entity, "Business" can be the category of the header entity, and "Asset Business" can be the standard name of the header entity.

预设描述词,例如,有哪些、有什么、哪些、都有啥、有啥、说下呢等。Preset descriptive words, for example, what are there, what are there, which ones, what are all, what are there, let me tell you about it, etc.

具体实施时,可以从比较词模板中同步比较词信息,例如,属性,比较词,单位等;取同义词模板中类型为数字的属性同义词,例如,金额,额度,期限等,从实体同义词中获取非叶子节点的同一类型下的实体;生成最大、最小、大于、小于类型的数据时,先将属性同义词,比较词,生成的随机数,单位进行拼接,然后将拼接好的字符串与实体同义词、预设描述词随机选取一种位置进行拼接。In specific implementation, the comparison word information, such as the attribute, the comparison word, and the unit, can be synchronized from the comparison word template; attribute synonyms whose type is number, such as amount, quota, and term, are taken from the synonym template, and entities of the same type under non-leaf nodes are obtained from the entity synonyms; when generating data of the maximum, minimum, greater-than, and less-than types, the attribute synonym, the comparison word, the generated random number, and the unit are first spliced together, and then the spliced string, the entity synonym, and a preset description word are spliced in a randomly selected order.

假设上述内容分别为ABC,则拼接可能有:ABC、ACB、BAC、BCA、CBA、CAB,等等,组成属性问。Assuming the above three parts are A, B, and C respectively, the possible splicing orders include: ABC, ACB, BAC, BCA, CBA, CAB, and so on, which constitute the attribute question.

具体实施时,还可以由比较词、随机数、单位组成数值型单约束的数据, 例如,大于4.35%,小于5年。In specific implementation, the data of numerical single constraint can also be composed of comparison words, random numbers and units. For example, greater than 4.35% and less than 5 years.

具体实施时,还可以取知识图谱中同一类型下为叶子节点的实体及同义词,随机选取一种拼接方式拼接两种实体同义词,再通过连接词拼接属性词与比较词组成属性问。例如,A和B谁的利率高;A和C谁的期限长?In specific implementation, we can also select entities and synonyms of the same type as leaf nodes in the knowledge graph, randomly select a splicing method to splice two entity synonyms, and then use the conjunction to splice the attribute words and comparison words to form an attribute question. For example, which one has a higher interest rate, A or B; which one has a longer term, A or C?

步骤S806,根据问题解析样本和问题域集合,生成问题分类样本;以及,根据问题解析样本和标准实体库,生成实体链接样本。Step S806, generating a question classification sample based on the question parsing sample and the question domain set; and generating an entity linking sample based on the question parsing sample and the standard entity library.

需要注意的是,问题分类样本的生成和实体链接样本的生成可以使用一部分在步骤S804中生成的问题解析样本。It should be noted that the generation of question classification samples and entity linking samples may use a portion of the question parsing samples generated in step S804.

其中,本实施例中的问题分类样本的具体说明可参照上述方法实施例中的对应内容。Among them, the specific description of the question classification samples in this embodiment can refer to the corresponding content in the above method embodiment.

在一种具体的实现方式中,根据问题解析样本和问题域集合,生成问题分类样本,包括:对问题解析样本进行第四采样处理,得到第一域内样本;根据域内问题文本,生成对应的第二域内样本;根据域外问题文本,生成对应的域外样本;根据第一域内样本、第二域内样本以及域外样本,生成问题分类样本。In a specific implementation, a problem classification sample is generated based on a problem analysis sample and a problem domain set, including: performing a fourth sampling process on the problem analysis sample to obtain a first domain sample; generating a corresponding second domain sample based on the domain problem text; generating a corresponding out-of-domain sample based on the out-of-domain problem text; and generating a problem classification sample based on the first domain sample, the second domain sample and the out-of-domain sample.

第四采样处理的采样方式可以是随机采样,也可以是其他采样方式。The sampling method of the fourth sampling process may be random sampling or other sampling methods.

需要注意的是,在第一采样处理、第二采样处理、第三采样处理以及第四采样处理中,“第一”、“第二”、“第三”以及“第四”是为了便于区分在不同的步骤中所执行的采样操作,不具有实际含义。It should be noted that in the first sampling process, the second sampling process, the third sampling process and the fourth sampling process, "first", "second", "third" and "fourth" are for the convenience of distinguishing the sampling operations performed in different steps and have no actual meaning.

图16为本申请实施例提供的一种问答系统的样本生成方法的第7种处理流程子图。图16示例性示出了问题分类样本的一种生成方式,其具体处理流程包括步骤S1602-步骤S1606:Figure 16 is a seventh processing flow sub-diagram of a sample generation method of a question-answering system provided in an embodiment of the present application. Figure 16 exemplarily shows a method for generating a question classification sample, and its specific processing flow includes steps S1602 to S1606:

步骤S1602,从问题解析样本中抽取第一域内样本,从问题域集合中获取第二域内样本和域外样本。Step S1602: extract a first in-domain sample from the problem analysis sample, and obtain a second in-domain sample and an out-of-domain sample from the problem domain set.

步骤S1604,对第一域内样本和第二域内样本添加正标签,对域外样本添加负标签。Step S1604: adding positive labels to the first domain samples and the second domain samples, and adding negative labels to the out-of-domain samples.

步骤S1606,将所有样本合并存入json文件。Step S1606: merge all samples and save them into a json file.
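As a hedged illustration only, assuming the classification samples use the same simple {"text", "label"} layout as the entity linking samples shown below, two merged entries might look like the following (1 marking an in-domain question, 0 an out-of-domain question):
{"text": "A活动的截止时间是哪天?", "label": 1}
{"text": "我今天要不要出门呢?", "label": 0}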

在一种具体的实现方式中,根据问题解析样本和标准实体库,生成实体链接样本,包括:根据问题解析样本,确定携带有非标准实体的目标解析样本;在标准实体库中计算非标准实体与每个标准实体的相似度并进行排序;根据排序结果,确定非标准实体对应的预设数量个目标标准实体;根据非标准实体和预设数量个目标标准实体,构建非标准实体对应的正负样本,将正负样本确定为实体链接样本。In a specific implementation, generating entity linking samples according to the question parsing samples and the standard entity library includes: determining, according to the question parsing samples, target parsing samples carrying non-standard entities; calculating the similarity between a non-standard entity and each standard entity in the standard entity library and sorting the results; determining, according to the sorting result, a preset number of target standard entities corresponding to the non-standard entity; and constructing positive and negative samples corresponding to the non-standard entity according to the non-standard entity and the preset number of target standard entities, and determining the positive and negative samples as entity linking samples.

例如,非标准实体“单位通知存钱”,在标准实体库中根据相似度召回最相似的top5实体,分别是“单位通知存款”、“单位定期存钱”、“单位活期存钱”、“单位协定存款”、“单位定期一本通”。For example, for the non-standard entity “Unit Notice Deposit”, the top 5 most similar entities are recalled in the standard entity library based on similarity, which are “Unit Notice Deposit”, “Unit Regular Deposit”, “Unit Current Deposit”, “Unit Agreement Deposit”, and “Unit Regular Deposit Passbook”.

根据非标准实体和预设数量个目标标准实体,构建非标准实体对应的正负样本,可以根据非标准实体与相似度最大的目标标准实体构建正样本,根据非标准实体与该相似度最大的目标标准实体之外的目标标准实体构建负样本。According to the non-standard entity and a preset number of target standard entities, positive and negative samples corresponding to the non-standard entity are constructed. The positive sample can be constructed based on the non-standard entity and the target standard entity with the greatest similarity, and the negative sample can be constructed based on the non-standard entity and the target standard entity other than the target standard entity with the greatest similarity.

例如,构建得到的正样本为{"text":"《单位通知存钱》的交易时间是几点[sep]单位通知存款","label":1}。For example, the constructed positive sample is {"text":"What is the transaction time of "Unit Notice Deposit" [sep] Unit Notice Deposit","label":1}.

构建得到的负样本包括:{"text":"《单位通知存钱》的交易时间是几点[sep]单位定期存钱","label":0};{"text":"《单位通知存钱》的交易时间是几点[sep]单位活期存钱","label":0};{"text":"《单位通知存钱》的交易时间是几点[sep]单位协定存款","label":0};{"text":"《单位通知存钱》的交易时间是几点[sep]单位定期一本通","label":0}。The constructed negative samples include: {"text":"What is the transaction time of "Unit Notice to Deposit Money"[sep]Unit Regular Deposit Money","label":0}; {"text":"What is the transaction time of "Unit Notice to Deposit Money"[sep]Unit Current Deposit Money","label":0}; {"text":"What is the transaction time of "Unit Notice to Deposit Money"[sep]Unit Agreement Deposit","label":0}; {"text":"What is the transaction time of "Unit Notice to Deposit Money"[sep]Unit Regular Deposit Book","label":0}.

图17为本申请实施例提供的一种问答系统的样本生成方法的第8种处理流程子图。图17示例性示出了实体链接样本的一种生成方式,其具体处理流程包括步骤S1702-步骤S1708。Fig. 17 is an eighth processing flow sub-diagram of a sample generation method of a question-answering system provided in an embodiment of the present application. Fig. 17 exemplarily shows a generation method of entity link samples, and its specific processing flow includes steps S1702 to S1708.

步骤S1702,从问题解析样本中随机选取一定数量的携带有非标准实体的目标解析样本。Step S1702: randomly select a certain number of target parsing samples carrying non-standard entities from the question parsing samples.

步骤S1704,取出非标准实体在标准实体库中召回最相似top5标准实体。Step S1704: Take out the non-standard entity and recall the top 5 most similar standard entities in the standard entity library.

步骤S1706,根据非标准实体和召回的标准实体构造正负样本。Step S1706, constructing positive and negative samples based on the non-standard entities and the recalled standard entities.

步骤S1708,将所有样本合并存入json文件。Step S1708: merge all samples into a json file.

步骤S808,根据问题分类样本、问题解析样本以及实体链接样本,构建问答系统的训练数据集。Step S808, constructing a training data set for the question answering system based on the question classification samples, question parsing samples, and entity linking samples.

该训练数据集用于对问答系统中的初始问题分类模型、初始问题解析模型以及初始实体链接模型进行模型训练。This training dataset is used to train the initial question classification model, initial question parsing model, and initial entity linking model in the question answering system.

在如图8所示的实施例中,首先,获取预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库;同义词集合包括多个标准词以及每个标准词对应的至少一个同义词;相似问集合包括多个第一属性以及每个第一属性对应的至少一个相似问句;比较词集合包括比较词信息;问题域集合包括域内问题文本和域外问题文本;标准实体库包括多个标准实体;接着,根据同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本;然后,根据问题解析样本和问题域集合,生成问题分类样本;以及,根据问题解析样本和标准实体库,生成实体链接样本;最后,根据问题分类样本、问题解析样本以及实体链接样本,构建问答系统的训练数据集。以此,一方面,根据预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库生成问题解析样本、问题分类样本以及实体链接样本,能够利用少量预先配置的数据生成大量的训练样本,提高样本生成效率,减少人工工作量;另一方面,问题解析样本不仅可以用于初始问题解析模型的训练,还可以用于生成问题分类样本以及实体链接样本,提高了数据利用率。In the embodiment shown in FIG. 8, first, a pre-configured synonym set, similar question set, comparison word set, question domain set, and standard entity library are obtained; the synonym set includes multiple standard words and at least one synonym corresponding to each standard word; the similar question set includes multiple first attributes and at least one similar question corresponding to each first attribute; the comparison word set includes comparison word information; the question domain set includes in-domain question texts and out-of-domain question texts; and the standard entity library includes multiple standard entities; then, question parsing samples are generated according to at least one of the synonym set, the similar question set, and the comparison word set; next, question classification samples are generated according to the question parsing samples and the question domain set, and entity linking samples are generated according to the question parsing samples and the standard entity library; finally, a training data set for the question answering system is constructed according to the question classification samples, the question parsing samples, and the entity linking samples. In this way, on the one hand, question parsing samples, question classification samples and entity linking samples are generated according to the pre-configured synonym set, similar question set, comparison word set, question domain set and standard entity library, so that a large number of training samples can be generated using a small amount of pre-configured data, thereby improving sample generation efficiency and reducing manual workload; on the other hand, question parsing samples can not only be used to train the initial question parsing model, but can also be used to generate question classification samples and entity linking samples, thereby improving data utilization.

出于与前述的方法实施例相同的技术构思,本申请实施例还提供了一种问答系统的模型训练方法的实施例。图18为本申请实施例提供的另一种问答系统的模型训练方法的处理流程图。参见图18,问答系统的模型训练方法的处理流程具体包括步骤S1802至步骤S1804。Based on the same technical concept as the aforementioned method embodiment, the embodiment of the present application also provides an embodiment of a model training method for a question-answering system. FIG18 is a processing flow chart of another model training method for a question-answering system provided by the embodiment of the present application. Referring to FIG18 , the processing flow of the model training method for a question-answering system specifically includes steps S1802 to S1804.

步骤S1802,通过问答系统的样本生成方法生成训练数据集;训练数据集包括问题分类样本、问题解析样本以及实体链接样本。Step S1802, generating a training data set by using a sample generation method of a question-answering system; the training data set includes question classification samples, question parsing samples, and entity linking samples.

问答系统的样本生成方法可以是前述的问答系统的样本生成方法实施例所提供的问答系统的样本生成方法。The sample generation method of the question-answering system may be the sample generation method of the question-answering system provided by the aforementioned sample generation method embodiment of the question-answering system.

步骤S1804,将问题分类样本输入问答系统中的初始问题分类模型进行迭代训练,得到问题分类模型;将问题解析样本输入问答系统中的初始问题解析模型进行迭代训练,得到问题解析模型;将实体链接样本输入问答系统中的初始实体链接模型进行迭代训练,得到实体链接模型。Step S1804, input the question classification samples into the initial question classification model in the question-answering system for iterative training to obtain a question classification model; input the question parsing samples into the initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; input the entity linking samples into the initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.
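
As a minimal illustration of the control flow of step S1804, the sketch below routes the three kinds of samples to three independent training routines. The patent does not fix the model architectures, so the "model" here is only a label-conditioned token counter that stands in for a real classifier, sequence labeler, or ranker; the dataset key names are assumptions for this sketch.

```python
from collections import Counter

def train_model(samples, epochs=3):
    """Stand-in for iterative training: repeatedly pass over (text, label) pairs."""
    counts = Counter()
    for _ in range(epochs):
        for text, label in samples:
            counts.update((label, token) for token in text.split())
    return counts

def train_question_answering_system(dataset):
    # dataset is assumed to hold three sample lists keyed by sub-task,
    # e.g. as produced by the sample generation method described above.
    question_classification_model = train_model(dataset["classification"])
    question_parsing_model = train_model(dataset["parsing"])
    entity_linking_model = train_model(dataset["entity_linking"])
    return question_classification_model, question_parsing_model, entity_linking_model
```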

由于技术构思相同,本实施例中描述得比较简单,相关的部分请参见上述提供的方法实施例的对应说明即可。Since the technical concept is the same, the description in this embodiment is relatively simple, and the relevant parts may refer to the corresponding description of the method embodiment provided above.

出于与前述的方法实施例相同的技术构思,本申请实施例还提供了一种应答方法的实施例。图19为本申请实施例提供的另一种应答方法的处理流程图。图20为本申请实施例提供的一种问答系统的工作原理图。参见图19,应答方法的处理流程具体包括步骤S1902至步骤S1910。Based on the same technical concept as the aforementioned method embodiments, the embodiments of the present application further provide an embodiment of a response method. FIG. 19 is a processing flow chart of another response method provided by an embodiment of the present application. FIG. 20 is a working principle diagram of a question-answering system provided by an embodiment of the present application. Referring to FIG. 19, the processing flow of the response method specifically includes steps S1902 to S1910.

如图19实施例所示的应答方法可以应用于问答系统,该问答系统可以包括依次连接的问题分类模型、问题解析模型以及实体链接模型。其中,问题分类模型的输出可以是问题解析模型的输入,问题解析模型的输出可以是实体链接模型的输入。The answering method shown in the embodiment of Figure 19 can be applied to a question-answering system, which may include a question classification model, a question parsing model, and an entity linking model connected in sequence. The output of the question classification model may be the input of the question parsing model, and the output of the question parsing model may be the input of the entity linking model.

步骤S1902,获取待应答的目标问题。Step S1902, obtaining the target question to be answered.

步骤S1904,将目标问题输入问题分类模型进行分类处理,得到分类结果;问题分类模型是通过将训练数据集中的问题分类样本输入初始问题分类模型进行训练所得到的;训练数据集是通过问答系统的样本生成方法所生成的。Step S1904, input the target question into the question classification model for classification processing to obtain the classification result; the question classification model is obtained by inputting the question classification samples in the training data set into the initial question classification model for training; the training data set is generated by the sample generation method of the question answering system.

问答系统的样本生成方法可以是前述的问答系统的样本生成方法实施例所提供的问答系统的样本生成方法。The sample generation method of the question-answering system may be the sample generation method of the question-answering system provided by the aforementioned sample generation method embodiment of the question-answering system.

步骤S1906,在分类结果用于表征目标问题属于第一预设分类的情况下,将目标问题输入问题解析模型进行解析处理,得到对应的目标片段;问题解析模型是通过将训练数据集中的问题解析样本输入初始问题解析模型进行训练所得到的。Step S1906, when the classification result is used to characterize that the target question belongs to the first preset classification, the target question is input into the question parsing model for parsing processing to obtain the corresponding target segment; the question parsing model is obtained by inputting the question parsing samples in the training data set into the initial question parsing model for training.

步骤S1908,将目标片段输入实体链接模型进行预测处理,得到对应的目标实体;实体链接模型是通过将训练数据集中的实体链接样本输入初始实体链接模型进行训练所得到的;Step S1908, inputting the target segment into the entity linking model for prediction processing to obtain the corresponding target entity; the entity linking model is obtained by inputting the entity linking samples in the training data set into the initial entity linking model for training;

步骤S1910,根据目标实体,确定目标问题的答案。Step S1910, determining the answer to the target question based on the target entity.
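
For illustration, the sketch below captures only the control flow of steps S1902 to S1910. The three callables classify, parse, and link stand in for the trained question classification, question parsing, and entity linking models; the out-of-domain reply, the "in_domain" label, and the dict-based knowledge graph lookup are assumptions made for this sketch.

```python
OUT_OF_DOMAIN_REPLY = "抱歉，该问题超出了我能回答的范围。"

def answer(question, classify, parse, link, knowledge_graph):
    if classify(question) != "in_domain":          # step S1904: check the first preset classification
        return OUT_OF_DOMAIN_REPLY
    fragments = parse(question)                     # step S1906: target segments extracted from the question
    entities = tuple(link(f) for f in fragments)    # step S1908: link each segment to a unique entity
    return knowledge_graph.get(entities, OUT_OF_DOMAIN_REPLY)  # step S1910: look up the answer

# Usage with trivial stand-ins for the three trained models:
print(answer("车位贷",
             classify=lambda q: "in_domain",
             parse=lambda q: [q],
             link=lambda fragment: fragment,
             knowledge_graph={("车位贷",): "车位贷的相关介绍（示例答案）"}))
```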

在一种具体的实现方式中,根据目标实体,确定目标问题的答案,包括:根据目标实体,进行槽位填充处理,得到目标问题的槽位填充结果;根据槽位填充结果,在预先配置的知识图谱中查询对应的答案,得到目标问题的答案。In a specific implementation, the answer to the target question is determined according to the target entity, including: performing slot filling processing according to the target entity to obtain a slot filling result for the target question; and querying a corresponding answer in a pre-configured knowledge graph according to the slot filling result to obtain the answer to the target question.

根据目标实体,进行槽位填充处理,可以是针对步骤S1908中得到的目标实体,首先,确定该目标实体可能对应的至少一种槽位模板,然后,针对每种槽位模板,通过目标实体对该种槽位模板进行槽位填充处理,得到填充后的槽位模板,将该填充后的槽位模板确定为目标问题对应的槽位填充结果。According to the target entity, slot filling processing is performed. For the target entity obtained in step S1908, first, at least one slot template that may correspond to the target entity is determined, and then, for each slot template, slot filling processing is performed on the slot template through the target entity to obtain the filled slot template, and the filled slot template is determined as the slot filling result corresponding to the target problem.

其中,本实施例中的槽位模板与前述方法实施例中的槽位模板相同,可参照上述方法实施例中的具体方案。Among them, the slot template in this embodiment is the same as the slot template in the aforementioned method embodiment, and the specific scheme in the aforementioned method embodiment can be referred to.

在一种具体的实现方式中,根据槽位填充结果,在预先配置的知识图谱中查询对应的答案,包括:若槽位填充结果无法用于在知识图谱中查询得到唯一对应的答案,则确定槽位填充结果对应的槽位缺失信息;根据槽位缺失信息生成反问句;接收响应反问句的用户输入;根据用户输入进行槽位填充处理;根据槽位填充处理后的槽位填充结果,在预先配置的知识图谱中查询对应的答案,得到目标问题的答案。In a specific implementation, querying the corresponding answer in the pre-configured knowledge graph according to the slot filling result includes: if the slot filling result cannot be used to query the knowledge graph for a unique corresponding answer, determining the slot missing information corresponding to the slot filling result; generating a rhetorical question according to the slot missing information; receiving user input in response to the rhetorical question; performing slot filling processing according to the user input; and querying the corresponding answer in the pre-configured knowledge graph according to the slot filling result after the slot filling processing, to obtain the answer to the target question.

例如,目标问题的槽位填充结果包括填充后的槽位模板1:(产品A)的(属性槽位)是什么?该填充后的槽位模板1无法用于在知识图谱中查询得到唯一对应的答案,可以确定该填充处理后的槽位模板1对应的槽位缺失信息为属性缺失。根据槽位缺失信息“属性缺失”生成对应的反问句“请问您想咨询A产品的什么信息”。接收响应反问句的用户输入“价格”。根据用户输入进行槽位填充处理,即,对“(产品A)的(属性槽位)是什么”中的属性槽位进行填充处理,得到二次填充后的槽位模板1:(产品A)的(价格)是什么?根据槽位填充处理后的槽位填充结果,在预先配置的知识图谱中查询“产品A的价格是什么”对应的答案,得到目标问题的答案。For example, the slot filling result of the target question includes the filled slot template 1: What is the (attribute slot) of (product A)? The filled slot template 1 cannot be used to query in the knowledge graph to obtain a unique corresponding answer. It can be determined that the slot missing information corresponding to the filled slot template 1 is attribute missing. According to the slot missing information "attribute missing", a corresponding rhetorical question "What information do you want to inquire about product A?" is generated. The user input "price" is received in response to the rhetorical question. The slot filling process is performed according to the user input, that is, the attribute slot in "What is the (attribute slot) of (product A)" is filled to obtain the slot template 1 after secondary filling: What is the (price) of (product A)? According to the slot filling result after the slot filling process, the answer corresponding to "What is the price of product A" is queried in the pre-configured knowledge graph to obtain the answer to the target question.
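
A toy version of the multi-turn slot filling described in this example is sketched below. The knowledge graph is modelled as a dict keyed by (entity, attribute) pairs, ask_user() stands in for the dialogue front end that shows the rhetorical question and collects the reply, and the stored answer string is a placeholder; all of these are assumptions for illustration.

```python
knowledge_graph = {("产品A", "价格"): "产品A的价格信息（示例答案）",
                   ("产品A", "期限"): "产品A的期限信息（示例答案）"}

def fill_and_query(entity, attribute, ask_user):
    if entity is None:                                        # entity slot missing
        entity = ask_user("请问您要咨询什么产品呢？")
    if attribute is None:                                     # attribute slot missing
        attribute = ask_user("请问您想咨询" + entity + "的什么信息？")
    return knowledge_graph.get((entity, attribute), "未在知识图谱中找到对应答案")

# The user only provided the entity "产品A"; the system asks back for the attribute,
# the (simulated) user answers "价格", and a unique answer can then be returned.
print(fill_and_query("产品A", None, ask_user=lambda prompt: "价格"))
```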

下面,结合图20所示的问答系统的工作原理图,可以示例性说明图19所示的应答方法的处理流程:Next, in conjunction with the working principle diagram of the question-answering system shown in FIG. 20 , the processing flow of the answering method shown in FIG. 19 can be exemplified:

在实际使用过程中,当用户的问题进入问答系统之后,首先利用OOD模型判断用户问题是域外问题还是域内问题,如果是域内问题则流程继续,如果是域外问题则统一回复预先配置的域外问题应答话术。In actual use, when a user's question enters the question-and-answer system, the OOD model is first used to determine whether the user's question is an out-of-domain question or an in-domain question. If it is an in-domain question, the process continues; if it is an out-of-domain question, the system will uniformly reply with the pre-configured out-of-domain question response script.

接着,通过问句解析模型对域内问题进行槽位识别,槽位包括且不限于:实体、属性、关系、约束,等等。在通过问题解析模型完成实体识别之后,将会得到一个或多个实体片段。Next, slot recognition is performed on the in-domain question by the question parsing model; slots include but are not limited to: entities, attributes, relations, constraints, and so on. After entity recognition is completed by the question parsing model, one or more entity fragments will be obtained.

在预先配置的知识图谱中可以召回与当前识别得到的实体片段对应的实体节点,利用实体链接模型将实体片段链接到知识图谱中的唯一实体。In the pre-configured knowledge graph, the entity node corresponding to the currently recognized entity fragment can be recalled, and the entity fragment can be linked to the unique entity in the knowledge graph using the entity linking model.
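
The recall-then-link flow can be illustrated as follows. In the real system the score would come from the trained entity linking model; here difflib string similarity is used as a stand-in so that the sketch is self-contained and runnable, and the candidate list and threshold are illustrative assumptions.

```python
import difflib

standard_entities = ["车位贷", "装修贷", "消费贷"]   # recalled candidate entity nodes (example data)

def link_fragment(fragment, candidates=standard_entities):
    """Link an entity fragment to the single most similar standard entity."""
    scored = [(difflib.SequenceMatcher(None, fragment, entity).ratio(), entity)
              for entity in candidates]
    score, best = max(scored)
    return best if score > 0.5 else None   # threshold is an illustrative choice

print(link_fragment("车位贷款"))   # -> 车位贷
```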

在问句解析模型解析后,DM(Dialog Management,对话管理)可以根据槽位缺失情况进行反问,相关反问话术用户可以在反问模板中进行个性化编辑。After the question parsing model is used for analysis, DM (Dialog Management) can ask counter-questions based on the missing slots. Users can personalize the relevant counter-question scripts in the counter-question template.

如果缺实体,比如:如何办理?DM可以反问:请问您要咨询什么产品呢?如果缺属性,比如:车位贷。DM可以反问:请问您是想咨询车位贷具体什么问题呢?If the entity is missing, such as: How to apply? DM can ask: What product do you want to consult? If the attribute is missing, such as: Parking space loan. DM can ask: What specific questions do you want to consult about parking space loan?

通过多轮策略可以进行槽位填充处理,直到通过知识图谱可以查询并返回唯一的答案为止。Slot filling can be performed through a multi-turn strategy until a unique answer can be queried and returned from the knowledge graph.

在图20所示的问答系统中,OOD模型可以基于OOD训练集对初始OOD模型进行迭代训练所得到;问句解析模型可以基于问句解析训练集对初始问句解析模型进行迭代训练所得到;实体链接模型可以基于实体链接训练集对初始实体链接模型进行迭代训练所得到。In the question-answering system shown in FIG20 , the OOD model can be obtained by iteratively training the initial OOD model based on the OOD training set; the question parsing model can be obtained by iteratively training the initial question parsing model based on the question parsing training set; and the entity linking model can be obtained by iteratively training the initial entity linking model based on the entity linking training set.

其中,问句解析训练集,可以基于同义词模板、相似问模板、比较词模板以及反问模板生成;实体链接训练集,可以基于标准实体库和问句解析训练集生成;OOD训练集,可以基于OOD模板和问句解析训练集生成。同义词模板、相似问模板、比较词模板、OOD模板以及反问模板可以基于客户知识库生成。Among them, the question parsing training set can be generated based on synonym templates, similar question templates, comparative word templates and rhetorical question templates; the entity linking training set can be generated based on the standard entity library and the question parsing training set; the OOD training set can be generated based on the OOD template and the question parsing training set. The synonym template, similar question template, comparative word template, OOD template and rhetorical question template can be generated based on the customer knowledge base.

由于技术构思相同,本实施例中描述得比较简单,相关的部分请参见上述提供的方法实施例的对应说明即可。Since the technical concept is the same, the description in this embodiment is relatively simple, and the relevant parts may refer to the corresponding description of the method embodiment provided above.

在上述的实施例中,提供了一种问答系统的样本生成方法,与之相对应的,基于相同的技术构思,本申请实施例还提供了一种问答系统的样本生成装置,下面结合附图进行说明。In the above-mentioned embodiment, a sample generation method of a question-answering system is provided. Correspondingly, based on the same technical concept, an embodiment of the present application also provides a sample generation device of a question-answering system, which is described below in conjunction with the accompanying drawings.

图21为本申请实施例提供的一种问答系统的样本生成装置示意图。FIG21 is a schematic diagram of a sample generation device of a question-answering system provided in an embodiment of the present application.

本实施例提供一种问答系统的样本生成装置2100,包括:第三获取单元2102,用于获取预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库;所述同义词集合包括多个标准词以及每个所述标准词对应的至少一个同义词;所述相似问集合包括多个第一属性以及每个所述第一属性对应的至少一个相似问句;所述比较词集合包括比较词信息;所述问题域集合包括域内问题文本和域外问题文本;所述标准实体库包括多个标准实体;第一生成单元2104,用于根据所述同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本;第二生成单元2106,用于根据所述问题解析样本和所述问题域集合,生成问题分类样本;以及,根据所述问题解析样本和所述标准实体库,生成所述实体链接样本;构建单元2108,用于根据所述问题分类样本、所述问题解析样本以及所述实体链接样本,构建所述问答系统的训练数据集。The present embodiment provides a sample generation device 2100 for a question-answering system, comprising: a third acquisition unit 2102, used to acquire a pre-configured synonym set, a similar question set, a comparison word set, a problem domain set, and a standard entity library; the synonym set includes a plurality of standard words and at least one synonym corresponding to each of the standard words; the similar question set includes a plurality of first attributes and at least one similar question sentence corresponding to each of the first attributes; the comparison word set includes comparison word information; the problem domain set includes an in-domain question text and an out-of-domain question text; the standard entity library includes a plurality of standard entities; a first generation unit 2104, used to generate a question parsing sample according to at least one of the synonym set, the similar question set, and the comparison word set; a second generation unit 2106, used to generate a question classification sample according to the question parsing sample and the problem domain set; and, based on the question parsing sample and the standard entity library, to generate the entity link sample; a construction unit 2108, used to construct a training data set for the question-answering system according to the question classification sample, the question parsing sample, and the entity link sample.

可选地,第一生成单元2104具体用于:按照第一预设过滤条件,对所述同义词集合进行过滤处理,得到候选词列表;所述候选词列表包括候选标准词以及所述候选标准词对应的候选同义词;对所述候选词列表进行第一采样处理,得到单槽位样本,将所述单槽位样本确定为所述问题解析样本;或者,对所述相似问集合中的多个所述第一属性进行第二采样处理,得到目标第一属性;对所述目标第一属性对应的至少一个相似问句进行第三采样处理,得到初始相似问句;根据所述初始相似问句,确定对应的单属性样本,将所述单属性样本确定为所述问题解析样本。Optionally, the first generation unit 2104 is specifically configured to: filter the synonym set according to a first preset filtering condition to obtain a candidate word list, where the candidate word list includes candidate standard words and candidate synonyms corresponding to the candidate standard words; perform first sampling processing on the candidate word list to obtain a single-slot sample, and determine the single-slot sample as the question parsing sample; or perform second sampling processing on the multiple first attributes in the similar question set to obtain a target first attribute; perform third sampling processing on the at least one similar question corresponding to the target first attribute to obtain an initial similar question; and determine a corresponding single-attribute sample according to the initial similar question, and determine the single-attribute sample as the question parsing sample.
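
An illustrative version of these two generation paths is sketched below: single-slot samples sampled from a filtered synonym list, and single-attribute samples sampled from the similar-question set. The concrete filter condition (word length), the example vocabularies, and the (text, slots) output format are assumptions for this sketch, not the claimed filtering or sampling rules.

```python
import random

synonym_set = {"车位贷": ["车位贷款", "停车位贷款"], "利率": ["利息", "年化"]}        # example data
similar_questions = {"利率": ["利率是多少", "利息怎么算"], "办理": ["怎么办理", "如何申请"]}

def make_single_slot_sample():
    candidates = {w: s for w, s in synonym_set.items() if len(w) >= 2}   # first preset filter (assumed)
    standard_word = random.choice(list(candidates))                       # first sampling
    mention = random.choice([standard_word] + candidates[standard_word])
    return {"text": mention, "slots": [("entity", mention, standard_word)]}

def make_single_attribute_sample():
    attribute = random.choice(list(similar_questions))                    # second sampling
    question = random.choice(similar_questions[attribute])                # third sampling
    return {"text": question, "slots": [("attribute", question, attribute)]}
```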

可选地,第一生成单元2104具体用于:按照第二预设过滤条件,对所述同义词集合进行过滤处理,得到候选实体词表;所述候选实体词表包括多个候选实体词;对所述相似问集合进行筛选处理,得到中间相似问集合;所述中间相似问集合包括多个第一属性以及每个所述第一属性对应的至少一个携带有掩码的候选相似问句;根据所述候选词列表和所述中间相似问集合,确定对应于同一属性类别的目标候选实体词和目标候选相似问句;通过所述目标候选实体词,对所述目标候选相似问句中的掩码进行替换处理,得到复合样本,将所述复合样本确定为所述问题解析样本;或者,按照第三预设过滤条件对所述同义词集合进行过滤处理,得到属性同义词和实体同义词;生成随机数;根据所述比较词信息、所述属性同义词、所述实体同义词、所述随机数以及预设描述词进行拼接处理,得到比较类型样本,将所述比较类型样本确定为所述问题解析样本。Optionally, the first generating unit 2104 is specifically used to: filter the synonym set according to the second preset filtering condition to obtain a candidate entity word list; the candidate entity word list includes multiple candidate entity words; screen the similar question set to obtain an intermediate similar question set; the intermediate similar question set includes multiple first attributes and at least one candidate similar question sentence carrying a mask corresponding to each of the first attributes; determine the target candidate entity words and target candidate similar questions corresponding to the same attribute category according to the candidate word list and the intermediate similar question set; replace the mask in the target candidate similar question sentence with the target candidate entity word to obtain a composite sample, and determine the composite sample as the question parsing sample; or, filter the synonym set according to the third preset filtering condition to obtain attribute synonyms and entity synonyms; generate random numbers; splice the comparison word information, the attribute synonyms, the entity synonyms, the random numbers and the preset description words to obtain a comparison type sample, and determine the comparison type sample as the question parsing sample.
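
The other two generation paths can be sketched in the same spirit: composite samples built by replacing the mask in a similar question with a candidate entity word of the same attribute category, and comparison-type samples spliced from a comparison word, synonyms, a random number, and a preset description word. All concrete strings, the "[MASK]" token, and the splicing order below are illustrative assumptions.

```python
import random

masked_questions = {"利率": ["[MASK]的利率是多少", "[MASK]的利息怎么算"]}   # example data
entity_words = {"利率": ["车位贷", "装修贷"]}       # candidate entity words grouped by attribute category

def make_composite_sample(category="利率"):
    question = random.choice(masked_questions[category])
    entity = random.choice(entity_words[category])
    return question.replace("[MASK]", entity)        # e.g. "车位贷的利率是多少"

def make_comparison_sample(comparison_word="高于", attribute="利率",
                           entity="车位贷", description_word="的产品有哪些"):
    number = random.randint(1, 20)                    # the generated random number
    return entity + attribute + comparison_word + str(number) + "%" + description_word
```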

可选地,第二生成单元2106具体用于:对所述问题解析样本进行第四采样处理,得到第一域内样本;根据所述域内问题文本,生成对应的第二域内样本;根据所述域外问题文本,生成对应的域外样本;根据所述第一域内样本、所述第二域内样本以及所述域外样本,生成所述问题分类样本。Optionally, the second generation unit 2106 is specifically used to: perform a fourth sampling process on the question parsing sample to obtain a first in-domain sample; generate a corresponding second in-domain sample based on the in-domain question text; generate a corresponding out-of-domain sample based on the out-of-domain question text; generate the question classification sample based on the first in-domain sample, the second in-domain sample and the out-of-domain sample.

可选地,第二生成单元2106具体用于:根据所述问题解析样本,确定携带有非标准实体的目标解析样本;在所述标准实体库中计算所述非标准实体与每个所述标准实体的相似度并进行排序;根据排序结果,确定所述非标准实体对应的预设数量个目标标准实体;根据所述非标准实体和所述预设数量个目标标准实体,构建所述非标准实体对应的正负样本,将所述正负样本确定为所述实体链接样本。Optionally, the second generation unit 2106 is specifically used to: determine a target parsing sample carrying a non-standard entity based on the problem parsing sample; calculate the similarity between the non-standard entity and each of the standard entities in the standard entity library and sort them; determine a preset number of target standard entities corresponding to the non-standard entity based on the sorting result; construct positive and negative samples corresponding to the non-standard entity based on the non-standard entity and the preset number of target standard entities, and determine the positive and negative samples as the entity link samples.
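
For illustration, the construction of positive and negative entity-linking pairs can be sketched as follows: rank the standard entities by similarity to the non-standard mention, keep the top-N, and pair the mention with each of them. difflib similarity stands in for whatever similarity measure the system actually uses, and the assumption that the annotated gold entity is known marks which pair is positive.

```python
import difflib

def build_entity_link_samples(mention, gold_entity, standard_entities, top_n=3):
    ranked = sorted(standard_entities,
                    key=lambda e: difflib.SequenceMatcher(None, mention, e).ratio(),
                    reverse=True)[:top_n]                     # preset number of target standard entities
    samples = []
    for candidate in ranked:
        label = 1 if candidate == gold_entity else 0          # positive vs. negative pair
        samples.append({"mention": mention, "candidate": candidate, "label": label})
    return samples

print(build_entity_link_samples("车位贷款", "车位贷", ["车位贷", "装修贷", "消费贷", "房贷"]))
```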

本申请实施例所提供的问答系统的样本生成装置包括第三获取单元,用于获取预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库;同义词集合包括多个标准词以及每个标准词对应的至少一个同义词;相似问集合包括多个第一属性以及每个第一属性对应的至少一个相似问句;比较词集合包括比较词信息;问题域集合包括域内问题文本和域外问题文本;标准实体库包括多个标准实体;第一生成单元,用于根据同义词集合、相似问集合以及比较词集合中的至少一者,生成问题解析样本;第二生成单元,用于根据问题解析样本和问题域集合,生成问题分类样本;以及,根据问题解析样本和标准实体库,生成实体链接样本;构建单元,用于根据问题分类样本、问题解析样本以及实体链接样本,构建问答系统的训练数据集。一方面,根据预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库生成问题解析样本、问题分类样本以及实体链接样本,能够利用少量预先配置的数据生成大量的训练样本,提高样本生成效率,减少人工工作量;另一方面,问题解析样本不仅可以用于初始问题解析模型的训练,还可以用于生成问题分类样本以及实体链接样本,提高了数据利用率。The sample generation apparatus for a question-answering system provided in the embodiments of the present application includes: a third acquisition unit, configured to acquire a pre-configured synonym set, similar question set, comparison word set, problem domain set, and standard entity library, where the synonym set includes multiple standard words and at least one synonym corresponding to each standard word, the similar question set includes multiple first attributes and at least one similar question corresponding to each first attribute, the comparison word set includes comparison word information, the problem domain set includes in-domain question text and out-of-domain question text, and the standard entity library includes multiple standard entities; a first generation unit, configured to generate a question parsing sample according to at least one of the synonym set, the similar question set, and the comparison word set; a second generation unit, configured to generate a question classification sample according to the question parsing sample and the problem domain set, and generate an entity linking sample according to the question parsing sample and the standard entity library; and a construction unit, configured to construct a training data set for the question-answering system according to the question classification sample, the question parsing sample, and the entity linking sample. On the one hand, generating question parsing samples, question classification samples, and entity linking samples from the pre-configured synonym set, similar question set, comparison word set, problem domain set, and standard entity library makes it possible to generate a large number of training samples from a small amount of pre-configured data, improving sample generation efficiency and reducing manual workload; on the other hand, the question parsing samples can be used not only to train the initial question parsing model but also to generate the question classification samples and entity linking samples, improving data utilization.

在上述的实施例中,提供了一种问答系统的模型训练方法,与之相对应的,基于相同的技术构思,本申请实施例还提供了一种问答系统的模型训练装置,下面结合附图进行说明。In the above-mentioned embodiment, a model training method for a question-answering system is provided. Correspondingly, based on the same technical concept, an embodiment of the present application also provides a model training device for a question-answering system, which is described below in conjunction with the accompanying drawings.

图22为本申请实施例提供的另一种问答系统的模型训练装置示意图。Figure 22 is a schematic diagram of a model training device for another question-answering system provided in an embodiment of the present application.

本实施例提供一种问答系统的模型训练装置2200,包括:第三生成单元2202,用于通过问答系统的样本生成方法生成训练数据集;所述训练数据集包括所述问题分类样本、所述问题解析样本以及所述实体链接样本;第二训练单元2204,用于将所述问题分类样本输入所述问答系统中的初始问题分类模型进行迭代训练,得到问题分类模型;将所述问题解析样本输入所述问答系统中的初始问题解析模型进行迭代训练,得到问题解析模型;将所述实体链接样本输入所述问答系统中的初始实体链接模型进行迭代训练,得到实体链接模型。This embodiment provides a model training device 2200 for a question-answering system, comprising: a third generation unit 2202, used to generate a training data set by a sample generation method of the question-answering system; the training data set includes the question classification sample, the question parsing sample and the entity linking sample; a second training unit 2204, used to input the question classification sample into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; input the question parsing sample into the initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; input the entity linking sample into the initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.

本申请实施例所提供的问答系统的模型训练装置包括第三生成单元,用于通过问答系统的样本生成方法生成训练数据集;训练数据集包括问题分类样本、问题解析样本以及实体链接样本;第二训练单元,用于将问题分类样本输入问答系统中的初始问题分类模型进行迭代训练,得到问题分类模型;将问题解析样本输入问答系统中的初始问题解析模型进行迭代训练,得到问题解析模型;将实体链接样本输入问答系统中的初始实体链接模型进行迭代训练,得到实体链接模型。以此,一方面,根据预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库生成问题解析样本、问题分类样本以及实体链接样本,能够利用少量预先配置的数据生成大量的训练样本,提高样本生成效率,减少人工工作量;另一方面,问题解析样本不仅可以用于初始问题解析模型的训练,还可以用于生成问题分类样本以及实体链接样本,提高了数据利用率,在此基础上,可以高效生成大量的训练样本以进行模型训练,从而提高问答系统的应答准确性。The model training apparatus for a question-answering system provided in the embodiments of the present application includes: a third generation unit, configured to generate a training data set by using the sample generation method of the question-answering system, where the training data set includes question classification samples, question parsing samples, and entity linking samples; and a second training unit, configured to input the question classification samples into an initial question classification model in the question-answering system for iterative training to obtain a question classification model, input the question parsing samples into an initial question parsing model in the question-answering system for iterative training to obtain a question parsing model, and input the entity linking samples into an initial entity linking model in the question-answering system for iterative training to obtain an entity linking model. In this way, on the one hand, question parsing samples, question classification samples, and entity linking samples are generated from the pre-configured synonym set, similar question set, comparison word set, problem domain set, and standard entity library, so a large number of training samples can be generated from a small amount of pre-configured data, improving sample generation efficiency and reducing manual workload; on the other hand, the question parsing samples can be used not only to train the initial question parsing model but also to generate the question classification samples and entity linking samples, improving data utilization. On this basis, a large number of training samples can be generated efficiently for model training, thereby improving the answer accuracy of the question-answering system.

在上述的实施例中,提供了一种应答方法,与之相对应的,基于相同的技术构思,本申请实施例还提供了一种应答装置,下面结合附图进行说明。In the above-mentioned embodiment, a response method is provided. Correspondingly, based on the same technical concept, an embodiment of the present application also provides a response device, which will be described below in conjunction with the accompanying drawings.

图23为本申请实施例提供的另一种应答装置示意图。FIG. 23 is a schematic diagram of another response device provided in an embodiment of the present application.

本实施例提供一种应答装置2300,包括:第四获取单元2302,用于获取待应答的目标问题;分类单元2304,用于将所述目标问题输入问题分类模型进行分类处理,得到分类结果;所述问题分类模型是通过将训练数据集中的问题分类样本输入初始问题分类模型进行训练所得到的;所述训练数据集是通过问答系统的样本生成方法所生成的;第二解析单元2306,用于在所述分类结果用于表征所述目标问题属于第一预设分类的情况下,将所述目标问题输入问题解析模型进行解析处理,得到对应的目标片段;所述问题解析模型是通过将所述训练数据集中的问题解析样本输入初始问题解析模型进行训练所得到的;预测单元2308,用于将所述目标片段输入实体链接模型进行预测处理,得到对应的目标实体;所述实体链接模型是通过将所述训练数据集中的实体链接样本输入初始实体链接模型进行训练所得到的;第二确定单元2310,用于根据所述目标实体,确定所述目标问题的答案。This embodiment provides an answering device 2300, including: a fourth acquisition unit 2302, used to acquire a target question to be answered; a classification unit 2304, used to input the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training; the training data set is generated by a sample generation method of a question-answering system; a second parsing unit 2306, used to input the target question into a question parsing model for parsing processing to obtain a corresponding target segment when the classification result is used to characterize that the target question belongs to a first preset classification; the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; a prediction unit 2308, used to input the target segment into an entity linking model for prediction processing to obtain a corresponding target entity; the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; a second determination unit 2310, used to determine the answer to the target question according to the target entity.

可选地,第二确定单元2310具体用于:根据所述目标实体,进行槽位填充处理,得到所述目标问题的槽位填充结果;根据所述槽位填充结果,在预先配置的知识图谱中查询对应的答案,得到所述目标问题的答案。Optionally, the second determination unit 2310 is specifically used to: perform slot filling processing according to the target entity to obtain a slot filling result for the target question; and query a corresponding answer in a pre-configured knowledge graph according to the slot filling result to obtain an answer to the target question.

可选地,第二确定单元2310还具体用于:若所述槽位填充结果无法用于在所述知识图谱中查询得到唯一对应的答案,则确定所述槽位填充结果对应的槽位缺失信息;根据所述槽位缺失信息生成反问句;接收响应所述反问句的用户输入;根据所述用户输入进行槽位填充处理;根据槽位填充处理后的槽位填充结果,在预先配置的知识图谱中查询对应的答案,得到所述目标问题的答案。Optionally, the second determination unit 2310 is also specifically used to: if the slot filling result cannot be used to query the knowledge graph to obtain a unique corresponding answer, then determine the slot missing information corresponding to the slot filling result; generate a rhetorical question based on the slot missing information; receive user input in response to the rhetorical question; perform slot filling processing based on the user input; and query the corresponding answer in a pre-configured knowledge graph based on the slot filling result after the slot filling processing to obtain the answer to the target question.

本申请实施例所提供的应答装置包括:第四获取单元,用于获取待应答的目标问题;分类单元,用于将目标问题输入问题分类模型进行分类处理,得到分类结果;问题分类模型是通过将训练数据集中的问题分类样本输入初始问题分类模型进行训练所得到的;训练数据集是通过问答系统的样本生成方法所生成的;第二解析单元,用于在分类结果用于表征目标问题属于第一预设分类的情况下,将目标问题输入问题解析模型进行解析处理,得到对应的目标片段;问题解析模型是通过将训练数据集中的问题解析样本输入初始问题解析模型进行训练所得到的;预测单元,用于将目标片段输入实体链接模型进行预测处理,得到对应的目标实体;实体链接模型是通过将训练数据集中的实体链接样本输入初始实体链接模型进行训练所得到的;第二确定单元,用于根据目标实体,确定目标问题的答案。以此,一方面,根据预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库生成问题解析样本、问题分类样本以及实体链接样本,能够利用少量预先配置的数据生成大量的训练样本,提高样本生成效率,减少人工工作量;另一方面,问题解析样本不仅可以用于初始问题解析模型的训练,还可以用于生成问题分类样本以及实体链接样本,提高了数据利用率,在此基础上,可以高效生成大量的训练样本以进行模型训练,从而提高问答系统的应答准确性。The response apparatus provided in the embodiments of the present application includes: a fourth acquisition unit, configured to acquire a target question to be answered; a classification unit, configured to input the target question into a question classification model for classification processing to obtain a classification result, where the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training, and the training data set is generated by the sample generation method of the question-answering system; a second parsing unit, configured to, when the classification result indicates that the target question belongs to a first preset classification, input the target question into a question parsing model for parsing processing to obtain a corresponding target segment, where the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; a prediction unit, configured to input the target segment into an entity linking model for prediction processing to obtain a corresponding target entity, where the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; and a second determination unit, configured to determine the answer to the target question according to the target entity. In this way, on the one hand, question parsing samples, question classification samples, and entity linking samples are generated from the pre-configured synonym set, similar question set, comparison word set, problem domain set, and standard entity library, so a large number of training samples can be generated from a small amount of pre-configured data, improving sample generation efficiency and reducing manual workload; on the other hand, the question parsing samples can be used not only to train the initial question parsing model but also to generate the question classification samples and entity linking samples, improving data utilization. On this basis, a large number of training samples can be generated efficiently for model training, thereby improving the answer accuracy of the question-answering system.

对应上述描述的问答系统的模型训练方法,或者,对应上述描述的应答方法,或者,对应上述描述的问答系统的样本生成方法,基于相同的技术构思,本申请实施例还提供一种电子设备,该电子设备用于执行上述提供的问答系统的模型训练方法,或者,上述提供的应答方法,图24为本申请实施例提供的一种电子设备的结构示意图。Corresponding to the model training method of the question-answering system described above, or, corresponding to the answering method described above, or, corresponding to the sample generation method of the question-answering system described above, based on the same technical concept, an embodiment of the present application also provides an electronic device, which is used to execute the model training method of the question-answering system provided above, or, the answering method provided above. Figure 24 is a structural schematic diagram of an electronic device provided in an embodiment of the present application.

如图24所示,电子设备可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上的处理器2401和存储器2402,存储器2402中可以存储有一个或一个以上存储应用程序或数据。其中,存储器2402可以是短暂存储或持久存储。存储在存储器2402的应用程序可以包括一个或一个以上模块(图示未示出),每个模块可以包括电子设备中的一系列计算机可执行指令。更进一步地,处理器2401可以设置为与存储器2402通信,在电子设备上执行存储器2402中的一系列计算机可执行指令。电子设备还可以包括一个或一个以上电源2403,一个或一个以上有线或无线网络接口2404,一个或一个以上输入/输出接口2405,一个或一个以上键盘2406等。As shown in FIG. 24 , electronic devices may have relatively large differences due to different configurations or performances, and may include one or more processors 2401 and memory 2402, and the memory 2402 may store one or more storage applications or data. Among them, the memory 2402 may be a short-term storage or a persistent storage. The application stored in the memory 2402 may include one or more modules (not shown in the figure), and each module may include a series of computer executable instructions in the electronic device. Furthermore, the processor 2401 may be configured to communicate with the memory 2402 to execute a series of computer executable instructions in the memory 2402 on the electronic device. The electronic device may also include one or more power supplies 2403, one or more wired or wireless network interfaces 2404, one or more input/output interfaces 2405, one or more keyboards 2406, and the like.

在一个具体的实施例中,电子设备包括有存储器,以及一个或一个以上的程序,其中一个或者一个以上程序存储于存储器中,且一个或者一个以上程序可以包括一个或一个以上模块,且每个模块可以包括对电子设备中的一系列计算机可执行指令,且经配置以由一个或者一个以上处理器执行该一个或者一个以上程序包含用于进行以下计算机可执行指令:获取问题文本样本;将所述问题文本样本输入所述初始问题解析模型进行迭代训练,得到问题解析模型;所述初始问题解析模型包括第一编码层和转换层;所述第一编码层用于根据所述问题文本样本进行编码处理,得到对应的第一句式向量;所述转换层用于在接收到所述第一句式向量的情况下生成预设数量个初始意图向量,根据所述第一句式向量对每个所述初始意图向量进行填充处理,得到对应的目标意图向量,将所述目标意图向量转换为对应的文本片段;所述文本片段用于在所述问答系统中查询所述问题文本样本的答案。In a specific embodiment, the electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory, the one or more programs may include one or more modules, each module may include a series of computer-executable instructions for the electronic device, and the one or more programs are configured to be executed by one or more processors and include computer-executable instructions for: obtaining a question text sample; and inputting the question text sample into the initial question parsing model for iterative training to obtain a question parsing model, where the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing on the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intent vectors upon receiving the first sentence vector, fill each initial intent vector according to the first sentence vector to obtain a corresponding target intent vector, and convert the target intent vector into a corresponding text fragment; and the text fragment is used to query the answer to the question text sample in the question-answering system.
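
The actual encoding layer and conversion layer would be learned neural modules; purely to visualize the described data flow (question text to first sentence vector, to a preset number of filled intent vectors, to text fragments), the toy numpy sketch below uses a hashing character encoder, random fill transforms, and nearest-span selection, all of which are assumptions for illustration rather than the claimed model.

```python
import numpy as np

DIM, NUM_INTENTS = 64, 2    # NUM_INTENTS plays the role of the preset number of intent vectors

def encode(text):
    # stand-in for the first encoding layer: hash characters into a fixed-size, normalized vector
    v = np.zeros(DIM)
    for ch in text:
        v[hash(ch) % DIM] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

rng = np.random.default_rng(0)
fill_transforms = rng.standard_normal((NUM_INTENTS, DIM, DIM))   # one "filling" transform per intent slot

def convert(text):
    sentence_vec = encode(text)                                   # first sentence vector
    intent_vecs = [t @ sentence_vec for t in fill_transforms]     # filled target intent vectors
    # convert each target intent vector into a text fragment: pick the character span of the
    # question whose encoding is most similar to that intent vector
    spans = [text[i:j] for i in range(len(text)) for j in range(i + 1, len(text) + 1)]
    return [max(spans, key=lambda s: float(encode(s) @ v)) for v in intent_vecs]

# Prints two character spans; with learned transforms these would be the intended fragments.
print(convert("车位贷的利率是多少"))
```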

在又一个具体的实施例中,电子设备包括有存储器,以及一个或一个以上的程序,其中一个或者一个以上程序存储于存储器中,且一个或者一个以上程序可以包括一个或一个以上模块,且每个模块可以包括对电子设备中的一系列计算机可执行指令,且经配置以由一个或者一个以上处理器执行该一个或者一个以上程序包含用于进行以下计算机可执行指令:获取预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库;所述同义词集合包括多个标准词以及每个所述标准词对应的至少一个同义词;所述相似问集合包括多个第一属性以及每个所述第一属性对应的至少一个相似问句;所述比较词集合包括比较词信息;所述问题域集合包括域内问题文本和域外问题文本;所述标准实体库包括多个标准实体;根据所述同义词集合、所述相似问集合以及所述比较词集合中的至少一者,生成问题解析样本;根据所述问题解析样本和所述问题域集合,生成问题分类样本;以及,根据所述问题解析样本和所述标准实体库,生成所述实体链接样本;根据所述问题分类样本、所述问题解析样本以及所述实体链接样本,构建所述问答系统的训练数据集。In yet another specific embodiment, the electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory, the one or more programs may include one or more modules, each module may include a series of computer-executable instructions for the electronic device, and the one or more programs are configured to be executed by one or more processors and include computer-executable instructions for: obtaining a pre-configured synonym set, similar question set, comparison word set, problem domain set, and standard entity library, where the synonym set includes multiple standard words and at least one synonym corresponding to each of the standard words, the similar question set includes multiple first attributes and at least one similar question corresponding to each of the first attributes, the comparison word set includes comparison word information, the problem domain set includes in-domain question text and out-of-domain question text, and the standard entity library includes multiple standard entities; generating a question parsing sample according to at least one of the synonym set, the similar question set, and the comparison word set; generating a question classification sample according to the question parsing sample and the problem domain set, and generating the entity linking sample according to the question parsing sample and the standard entity library; and constructing a training data set for the question-answering system according to the question classification sample, the question parsing sample, and the entity linking sample.

在另一个具体的实施例中,电子设备包括有存储器,以及一个或一个以上的程序,其中一个或者一个以上程序存储于存储器中,且一个或者一个以上程序可以包括一个或一个以上模块,且每个模块可以包括对电子设备中的一系列计算机可执行指令,且经配置以由一个或者一个以上处理器执行该一个或者一个以上程序包含用于进行以下计算机可执行指令:获取待应答的目标问题;将所述目标问题输入问题解析模型进行解析处理,得到对应的目标片段;所述问题解析模型是通过问答系统的模型训练方法进行训练所得到的;根据所述目标片段,确定所述目标问题的答案。In another specific embodiment, the electronic device includes a memory and one or more programs, where the one or more programs are stored in the memory, the one or more programs may include one or more modules, each module may include a series of computer-executable instructions for the electronic device, and the one or more programs are configured to be executed by one or more processors and include computer-executable instructions for: obtaining a target question to be answered; inputting the target question into a question parsing model for parsing processing to obtain a corresponding target segment, where the question parsing model is obtained by training with the model training method of the question-answering system; and determining the answer to the target question according to the target segment.

在又一个具体的实施例中,电子设备包括有存储器,以及一个或一个以上的程序,其中一个或者一个以上程序存储于存储器中,且一个或者一个以上程序可以包括一个或一个以上模块,且每个模块可以包括对电子设备中的一系列计算机可执行指令,且经配置以由一个或者一个以上处理器执行该一个或者一个以上程序包含用于进行以下计算机可执行指令:通过问答系统的样本生成方法生成训练数据集;所述训练数据集包括所述问题分类样本、所述问题解析样本以及所述实体链接样本;将所述问题分类样本输入所述问答系统中的初始问题分类模型进行迭代训练,得到问题分类模型;将所述问题解析样本输入所述问答系统中的初始问题解析模型进行迭代训练,得到问题解析模型;将所述实体链接样本输入所述问答系统中的初始实体链接模型进行迭代训练,得到实体链接模型。In another specific embodiment, the electronic device includes a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer executable instructions in the electronic device, and the one or more programs are configured to be executed by one or more processors, including computer executable instructions for performing the following: generating a training data set by a sample generation method of a question-answering system; the training data set includes the question classification sample, the question parsing sample and the entity linking sample; inputting the question classification sample into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; inputting the question parsing sample into an initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; inputting the entity linking sample into an initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.

在又一个具体的实施例中,电子设备包括有存储器,以及一个或一个以上的程序,其中一个或者一个以上程序存储于存储器中,且一个或者一个以上程序可以包括一个或一个以上模块,且每个模块可以包括对电子设备中的一系列计算机可执行指令,且经配置以由一个或者一个以上处理器执行该一个或者一个以上程序包含用于进行以下计算机可执行指令:获取待应答的目标问题;将所述目标问题输入问题分类模型进行分类处理,得到分类结果;所述问题分类模型是通过将训练数据集中的问题分类样本输入初始问题分类模型进行训练所得到的;所述训练数据集是通过问答系统的样本生成方法所生成的;在所述分类结果用于表征所述目标问题属于第一预设分类的情况下,将所述目标问题输入问题解析模型进行解析处理,得到对应的目标片段;所述问题解析模型是通过将所述训练数据集中的问题解析样本输入初始问题解析模型进行训练所得到的;将所述目标片段输入实体链接模型进行预测处理,得到对应的目标实体;所述实体链接模型是通过将所述训练数据集中的实体链接样本输入初始实体链接模型进行训练所得到的;根据所述目标实体,确定所述目标问题的答案。In another specific embodiment, the electronic device includes a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer executable instructions in the electronic device, and the one or more programs are configured to be executed by one or more processors to include the following computer executable instructions: obtaining a target question to be answered; inputting the target question into a question classification model for classification processing to obtain a classification result; the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training; the training data set is generated by a sample generation method of a question-answering system; when the classification result is used to characterize that the target question belongs to a first preset classification, inputting the target question into a question parsing model for parsing processing to obtain a corresponding target segment; the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; inputting the target segment into an entity linking model for prediction processing to obtain a corresponding target entity; the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; and determining the answer to the target question based on the target entity.

本说明书提供的一种计算机可读存储介质实施例如下:An embodiment of a computer-readable storage medium provided in this specification is as follows:

对应上述描述的一种问答系统的模型训练方法,基于相同的技术构思,本申请实施例还提供一种计算机可读存储介质。 Corresponding to the model training method of a question-answering system described above, based on the same technical concept, an embodiment of the present application also provides a computer-readable storage medium.

本实施例提供的计算机可读存储介质,用于存储计算机可执行指令,计算机可执行指令在被处理器执行时实现以下流程:获取问题文本样本;将所述问题文本样本输入所述初始问题解析模型进行迭代训练,得到问题解析模型;所述初始问题解析模型包括第一编码层和转换层;所述第一编码层用于根据所述问题文本样本进行编码处理,得到对应的第一句式向量;所述转换层用于在接收到所述第一句式向量的情况下生成预设数量个初始意图向量,根据所述第一句式向量对每个所述初始意图向量进行填充处理,得到对应的目标意图向量,将所述目标意图向量转换为对应的文本片段;所述文本片段用于在所述问答系统中查询所述问题文本样本的答案。The computer-readable storage medium provided in this embodiment is used to store computer-executable instructions. When the computer-executable instructions are executed by a processor, the following process is implemented: obtaining a question text sample; inputting the question text sample into the initial question parsing model for iterative training to obtain a question parsing model; the initial question parsing model includes a first encoding layer and a conversion layer; the first encoding layer is used to perform encoding processing according to the question text sample to obtain a corresponding first sentence vector; the conversion layer is used to generate a preset number of initial intention vectors when receiving the first sentence vector, fill each of the initial intention vectors according to the first sentence vector to obtain a corresponding target intention vector, and convert the target intention vector into a corresponding text fragment; the text fragment is used to query the answer to the question text sample in the question-answering system.

需要说明的是,本说明书中关于计算机可读存储介质的实施例与本说明书中关于问答系统的模型训练方法的实施例基于同一构思,因此该实施例的具体实施可以参见前述对应方法的实施,重复之处不再赘述。It should be noted that the embodiment of the computer-readable storage medium in this specification and the embodiment of the model training method for the question-answering system in this specification are based on the same concept. Therefore, the specific implementation of this embodiment can refer to the implementation of the corresponding method mentioned above, and the repeated parts will not be repeated.

对应上述描述的一种应答方法,基于相同的技术构思,本申请实施例还提供一种计算机可读存储介质。Corresponding to the response method described above, based on the same technical concept, an embodiment of the present application also provides a computer-readable storage medium.

本实施例提供的计算机可读存储介质,用于存储计算机可执行指令,计算机可执行指令在被处理器执行时实现以下流程:获取待应答的目标问题;将所述目标问题输入问题解析模型进行解析处理,得到对应的目标片段;所述问题解析模型是通过问答系统的模型训练方法进行训练所得到的;根据所述目标片段,确定所述目标问题的答案。The computer-readable storage medium provided in this embodiment is used to store computer-executable instructions. When the computer-executable instructions are executed by a processor, the following process is implemented: obtaining a target question to be answered; inputting the target question into a question parsing model for parsing and processing to obtain a corresponding target segment; the question parsing model is obtained by training using a model training method of a question-answering system; and determining an answer to the target question based on the target segment.

需要说明的是,本说明书中关于计算机可读存储介质的实施例与本说明书中关于应答方法的实施例基于同一构思,因此该实施例的具体实施可以参见前述对应方法的实施,重复之处不再赘述。It should be noted that the embodiment of the computer-readable storage medium in this specification and the embodiment of the response method in this specification are based on the same concept, so the specific implementation of this embodiment can refer to the implementation of the corresponding method mentioned above, and the repeated parts will not be repeated.

对应上述描述的一种问答系统的样本生成方法,基于相同的技术构思,本申请实施例还提供一种计算机可读存储介质。Corresponding to the sample generation method of a question-answering system described above, based on the same technical concept, an embodiment of the present application also provides a computer-readable storage medium.

本实施例提供的计算机可读存储介质,用于存储计算机可执行指令,计算机可执行指令在被处理器执行时实现以下流程:获取预先配置的同义词集合、相似问集合、比较词集合、问题域集合以及标准实体库;所述同义词集合包括多个标准词以及每个所述标准词对应的至少一个同义词;所述相似问集合包括多个第一属性以及每个所述第一属性对应的至少一个相似问句;所述比较词集合包括比较词信息;所述问题域集合包括域内问题文本和域外问题文本;所述标准实体库包括多个标准实体;根据所述同义词集合、所述相似问集合以及所述比较词集合中的至少一者,生成问题解析样本;根据所述问题解析样本和所述问题域集合,生成问题分类样本;以及,根据所述问题解析样本和所述标准实体库,生成所述实体链接样本;根据所述问题分类样本、所述问题解析样本以及所述实体链接样本,构建所述问答系统的训练数据集。The computer-readable storage medium provided in this embodiment is used to store computer-executable instructions, and when the computer-executable instructions are executed by a processor, the following process is implemented: obtaining a pre-configured synonym set, similar question set, comparison word set, problem domain set, and standard entity library, where the synonym set includes multiple standard words and at least one synonym corresponding to each of the standard words, the similar question set includes multiple first attributes and at least one similar question corresponding to each of the first attributes, the comparison word set includes comparison word information, the problem domain set includes in-domain question text and out-of-domain question text, and the standard entity library includes multiple standard entities; generating a question parsing sample according to at least one of the synonym set, the similar question set, and the comparison word set; generating a question classification sample according to the question parsing sample and the problem domain set, and generating the entity linking sample according to the question parsing sample and the standard entity library; and constructing a training data set for the question-answering system according to the question classification sample, the question parsing sample, and the entity linking sample.

需要说明的是,本说明书中关于计算机可读存储介质的实施例与本说明书中关于问答系统的样本生成方法的实施例基于同一构思,因此该实施例的具体实施可以参见前述对应方法的实施,重复之处不再赘述。It should be noted that the embodiment of the computer-readable storage medium in this specification and the embodiment of the sample generation method for the question-and-answer system in this specification are based on the same concept, so the specific implementation of this embodiment can refer to the implementation of the aforementioned corresponding method, and the repeated parts will not be repeated.

对应上述描述的一种问答系统的模型训练方法,基于相同的技术构思,本申请实施例还提供一种计算机可读存储介质。Corresponding to the model training method of a question-answering system described above, based on the same technical concept, an embodiment of the present application also provides a computer-readable storage medium.

本实施例提供的计算机可读存储介质,用于存储计算机可执行指令,计算机可执行指令在被处理器执行时实现以下流程:通过问答系统的样本生成方法生成训练数据集;所述训练数据集包括所述问题分类样本、所述问题解析样本以及所述实体链接样本;将所述问题分类样本输入所述问答系统中的初始问题分类模型进行迭代训练,得到问题分类模型;将所述问题解析样本输入所述问答系统中的初始问题解析模型进行迭代训练,得到问题解析模型;将所述实体链接样本输入所述问答系统中的初始实体链接模型进行迭代训练,得到实体链接模型。The computer-readable storage medium provided in this embodiment is used to store computer-executable instructions. When the computer-executable instructions are executed by a processor, the following process is implemented: a training data set is generated by a sample generation method of a question-answering system; the training data set includes the question classification samples, the question parsing samples and the entity linking samples; the question classification samples are input into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; the question parsing samples are input into an initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; the entity linking samples are input into an initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.

需要说明的是,本说明书中关于计算机可读存储介质的实施例与本说明书中关于问答系统的模型训练方法的实施例基于同一构思,因此该实施例的具体实施可以参见前述对应方法的实施,重复之处不再赘述。It should be noted that the embodiment of the computer-readable storage medium in this specification and the embodiment of the model training method for the question-answering system in this specification are based on the same concept. Therefore, the specific implementation of this embodiment can refer to the implementation of the corresponding method mentioned above, and the repeated parts will not be repeated.

对应上述描述的一种应答方法,基于相同的技术构思,本申请实施例还提供一种计算机可读存储介质。Corresponding to the response method described above, based on the same technical concept, an embodiment of the present application also provides a computer-readable storage medium.

本实施例提供的计算机可读存储介质,用于存储计算机可执行指令,计算机可执行指令在被处理器执行时实现以下流程:获取待应答的目标问题;将所述目标问题输入问题分类模型进行分类处理,得到分类结果;所述问题分类模型是通过将训练数据集中的问题分类样本输入初始问题分类模型进行训练所得到的;所述训练数据集是通过问答系统的样本生成方法所生成的;在所述分类结果用于表征所述目标问题属于第一预设分类的情况下,将所述目标问题输入问题解析模型进行解析处理,得到对应的目标片段;所述问题解析模型是通过将所述训练数据集中的问题解析样本输入初始问题解析模型进行训练所得到的;将所述目标片段输入实体链接模型进行预测处理,得到对应的目标实体;所述实体链接模型是通过将所述训练数据集中的实体链接样本输入初始实体链接模型进行训练所得到的;根据所述目标实体,确定所述目标问题的答案。The computer-readable storage medium provided in this embodiment is used to store computer-executable instructions, and when the computer-executable instructions are executed by a processor, the following process is implemented: obtaining a target question to be answered; inputting the target question into a question classification model for classification processing to obtain a classification result, where the question classification model is obtained by inputting question classification samples in a training data set into an initial question classification model for training, and the training data set is generated by the sample generation method of the question-answering system; when the classification result indicates that the target question belongs to a first preset classification, inputting the target question into a question parsing model for parsing processing to obtain a corresponding target segment, where the question parsing model is obtained by inputting question parsing samples in the training data set into an initial question parsing model for training; inputting the target segment into an entity linking model for prediction processing to obtain a corresponding target entity, where the entity linking model is obtained by inputting entity linking samples in the training data set into an initial entity linking model for training; and determining the answer to the target question according to the target entity.

需要说明的是,本说明书中关于计算机可读存储介质的实施例与本说明书中关于应答方法的实施例基于同一构思,因此该实施例的具体实施可以参见前述对应方法的实施,重复之处不再赘述。It should be noted that the embodiment of the computer-readable storage medium in this specification and the embodiment of the response method in this specification are based on the same concept, so the specific implementation of this embodiment can refer to the implementation of the corresponding method mentioned above, and the repeated parts will not be repeated.

上述对本说明书特定实施例进行了描述。其它实施例在所附权利要求书的范围内。在一些情况下,在权利要求书中记载的动作或步骤可以按照不同于实施例中的顺序来执行并且仍然可以实现期望的结果。另外,在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中,多任务处理和并行处理也是可以的或者可能是有利的。The above is a description of a specific embodiment of the specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recorded in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the specific order or continuous order shown to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

本领域内的技术人员应明白,本申请实施例可提供为方法、系统或计算机程序产品。因此,本申请实施例可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本说明书可采用在一个或多个其中包含有计算机可用程序代码的计算机可读存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。Those skilled in the art will appreciate that the embodiments of the present application may be provided as methods, systems or computer program products. Therefore, the embodiments of the present application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this specification may adopt the form of a computer program product implemented on one or more computer-readable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

本说明书是参照根据本说明书实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程设备的处理器以产生一个机器,使得通过计算机或其他可编程设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。This specification is described with reference to the flowchart and/or block diagram of the method, device (system), and computer program product according to the embodiment of this specification. It should be understood that each process and/or box in the flowchart and/or block diagram, as well as the combination of the process and/or box in the flowchart and/or block diagram can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable device to produce a machine, so that the instructions executed by the processor of the computer or other programmable device produce a device for implementing the functions specified in one process or multiple processes in the flowchart and/or one box or multiple boxes in the block diagram.

这些计算机程序指令也可存储在能引导计算机或其他可编程设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.

这些计算机程序指令也可装载到计算机或其他可编程设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions may also be loaded onto a computer or other programmable device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

在一个典型的配置中,计算设备包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。In a typical configuration, a computing device includes one or more processors (CPU), input/output interfaces, network interfaces, and memory.

The memory may include forms of computer-readable media such as non-persistent memory, random access memory (RAM), and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include persistent and non-persistent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.

It should also be noted that the terms "comprise", "include", or any other variants thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The embodiments of the present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments in which tasks are performed by remote processing devices connected through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

The embodiments in this specification are described in a progressive manner; for identical or similar parts between the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and for relevant details reference may be made to the description of the method embodiment.

The foregoing is merely an embodiment of this document and is not intended to limit this document. Various modifications and variations of this document will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this document shall fall within the scope of the claims of this document.

Claims (24)

1. A model training method for a question-answering system, the question-answering system comprising an initial question parsing model, the method comprising:
obtaining a question text sample;
inputting the question text sample into the initial question parsing model for iterative training to obtain a question parsing model, wherein the initial question parsing model comprises a first encoding layer and a conversion layer;
the first encoding layer is configured to encode the question text sample to obtain a corresponding first sentence vector;
the conversion layer is configured to, upon receiving the first sentence vector, generate a preset number of initial intent vectors, fill each initial intent vector according to the first sentence vector to obtain a corresponding target intent vector, and convert the target intent vector into a corresponding text fragment, wherein the text fragment is used to query, in the question-answering system, the answer to the question text sample.

2. The method according to claim 1, wherein the first sentence vector comprises a semantic feature sub-vector and a plurality of character sub-vectors, and filling each initial intent vector according to the first sentence vector to obtain the corresponding target intent vector comprises:
performing intent classification on the semantic feature sub-vector to obtain a corresponding intent classification result;
filling each initial intent vector according to the intent classification result to obtain a corresponding intermediate intent vector;
filling each intermediate intent vector according to each character sub-vector to obtain the corresponding target intent vector.

3. The method according to claim 2, wherein filling each intermediate intent vector according to each character sub-vector to obtain the corresponding target intent vector comprises:
performing entity recognition and constraint recognition according to each character sub-vector to obtain a corresponding character recognition result;
filling each intermediate intent vector according to the character recognition result to obtain the corresponding target intent vector.
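
For orientation only, the following is a minimal sketch of the fill-and-convert flow recited in claims 1-3: a sentence representation (one semantic feature sub-vector plus one sub-vector per character) is used to populate a preset number of intent slots, which are then turned into text fragments. The shapes, the toy classifiers, and the slot representation below are assumptions made for illustration and are not taken from the application.

```python
# Illustrative sketch only (assumed shapes and toy classifiers), not the claimed model.
import numpy as np

NUM_INTENTS = 4          # assumed preset number of initial intent vectors
HIDDEN = 8               # assumed embedding width

def encode(question: str):
    """Stand-in for the first encoding layer: returns a semantic feature
    sub-vector plus one sub-vector per character."""
    rng = np.random.default_rng(abs(hash(question)) % (2 ** 32))
    semantic = rng.normal(size=HIDDEN)
    chars = [rng.normal(size=HIDDEN) for _ in question]
    return semantic, chars

def classify_intent(semantic_vec) -> int:
    """Toy intent classifier: picks which intent slot the question activates."""
    return int(np.argmax(semantic_vec[:NUM_INTENTS]))

def tag_character(char_vec):
    """Toy entity/constraint recognizer: returns 'entity', 'constraint' or None."""
    score = float(char_vec.sum())
    if score > 1.0:
        return "entity"
    if score < -1.0:
        return "constraint"
    return None

def parse(question: str):
    semantic, chars = encode(question)
    # Step 1: a preset number of (empty) initial intent slots.
    intents = [{"intent_id": None, "entities": [], "constraints": []}
               for _ in range(NUM_INTENTS)]
    # Step 2: fill with the intent classification result -> intermediate slots.
    active = classify_intent(semantic)
    intents[active]["intent_id"] = active
    # Step 3: fill with per-character entity / constraint recognition results.
    for ch, vec in zip(question, chars):
        label = tag_character(vec)
        if label == "entity":
            intents[active]["entities"].append(ch)
        elif label == "constraint":
            intents[active]["constraints"].append(ch)
    # Step 4: convert filled slots into text fragments for downstream lookup.
    return ["".join(slot["entities"]) for slot in intents if slot["intent_id"] is not None]

print(parse("信用卡的年费是多少"))
```
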
4. The method according to claim 3, wherein filling each intermediate intent vector according to the character recognition result to obtain the corresponding target intent vector comprises:
if the character recognition result indicates that the character sub-vector belongs to an entity category or a constraint category, determining and filling the corresponding intermediate intent vector according to the character recognition result.

5. The method according to any one of claims 1-4, wherein the question text sample comprises at least one of an entity element, an attribute element, a relation element and a constraint element; the preset number of initial intent vectors comprises a first intent vector and a plurality of second intent vectors; the first intent vector corresponds to no intent, and each second intent vector corresponds to one attribute element or one relation element.

6. The method according to claim 1, wherein the question-answering system further comprises an initial entity linking model, and the method further comprises:
obtaining an entity linking sample;
inputting the entity linking sample into the initial entity linking model for iterative training to obtain an entity linking model, wherein the entity linking model comprises a second encoding layer and a prediction layer;
the second encoding layer is configured to encode the entity linking sample to obtain a corresponding second sentence vector;
the prediction layer is configured to perform prediction according to the second sentence vector to determine a corresponding target entity.

7. The method according to claim 1, wherein the question-answering system further comprises an initial question classification model, and the method further comprises:
obtaining a question classification sample;
inputting the question classification sample into the initial question classification model for iterative training to obtain a question classification model.

8. An answering method, comprising:
obtaining a target question to be answered;
inputting the target question into a question parsing model for parsing to obtain a corresponding target segment, wherein the question parsing model is trained by the model training method for a question-answering system according to any one of claims 1-7;
determining an answer to the target question according to the target segment.
9. The method according to claim 8, wherein inputting the target question into the question parsing model for parsing to obtain the corresponding target segment comprises:
inputting the target question into a question classification model for classification to obtain a classification result;
in a case where the classification result indicates that the target question belongs to a first preset category, inputting the target question into the question parsing model for parsing to obtain the corresponding target segment.

10. The method according to claim 8, wherein determining the answer to the target question according to the target segment comprises:
inputting the target segment into an entity linking model for prediction to obtain a corresponding target entity;
performing slot filling according to the target entity to obtain a slot filling result of the target question;
querying a pre-configured knowledge graph for a corresponding answer according to the slot filling result to obtain the answer to the target question.

11. An answering apparatus, comprising:
a second obtaining unit, configured to obtain a target question to be answered;
a first parsing unit, configured to input the target question into a question parsing model for parsing to obtain a corresponding target segment, wherein the question parsing model is trained by the model training method for a question-answering system according to any one of claims 1-7;
a first determining unit, configured to determine an answer to the target question according to the target segment.
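
As a hedged illustration of how the classify-parse-link-query pipeline of claims 9 and 10 could be wired together, the sketch below passes a question through four pluggable components and ends with a knowledge-graph lookup. Every component, the segment layout, and the sample data are placeholders assumed for this example, not the application's actual modules.

```python
# Assumed end-to-end answering pipeline; every component here is a placeholder.
from typing import Optional

def answer(question: str,
           classifier,          # assumed: returns "in_domain" or "out_of_domain"
           parser,              # assumed: returns a target segment (mention + attribute)
           entity_linker,       # assumed: maps a mention to a standard entity
           knowledge_graph: dict) -> Optional[str]:
    # 1. Question classification; only first-preset-category questions go to parsing.
    if classifier(question) != "in_domain":
        return None
    # 2. Question parsing: extract the target segment from the question.
    segment = parser(question)           # e.g. {"mention": "信用卡", "attribute": "年费"}
    # 3. Entity linking: normalize the mention to a standard entity.
    entity = entity_linker(segment["mention"])
    # 4. Slot filling with the linked entity, then knowledge-graph lookup.
    slots = {"entity": entity, "attribute": segment["attribute"]}
    return knowledge_graph.get((slots["entity"], slots["attribute"]))

# Minimal usage example with stubbed components and a tiny graph.
kg = {("信用卡", "年费"): "首年免年费"}
print(answer("信用卡的年费是多少",
             classifier=lambda q: "in_domain",
             parser=lambda q: {"mention": "信用卡", "attribute": "年费"},
             entity_linker=lambda m: m,
             knowledge_graph=kg))
```
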
12. A sample generation method for a question-answering system, the method comprising:
obtaining a pre-configured synonym set, similar question set, comparison word set, question domain set and standard entity library, wherein the synonym set comprises a plurality of standard words and at least one synonym corresponding to each standard word; the similar question set comprises a plurality of first attributes and at least one similar question corresponding to each first attribute; the comparison word set comprises comparison word information; the question domain set comprises in-domain question text and out-of-domain question text; and the standard entity library comprises a plurality of standard entities;
generating a question parsing sample according to at least one of the synonym set, the similar question set and the comparison word set;
generating a question classification sample according to the question parsing sample and the question domain set, and generating an entity linking sample according to the question parsing sample and the standard entity library;
constructing a training data set for the question-answering system according to the question classification sample, the question parsing sample and the entity linking sample.

13. The method according to claim 12, wherein generating the question parsing sample according to at least one of the synonym set, the similar question set and the comparison word set comprises:
filtering the synonym set according to a first preset filtering condition to obtain a candidate word list, wherein the candidate word list comprises candidate standard words and candidate synonyms corresponding to the candidate standard words;
performing first sampling on the candidate word list to obtain a single-slot sample, and determining the single-slot sample as the question parsing sample;
or,
performing second sampling on the plurality of first attributes in the similar question set to obtain a target first attribute;
performing third sampling on the at least one similar question corresponding to the target first attribute to obtain an initial similar question;
determining a corresponding single-attribute sample according to the initial similar question, and determining the single-attribute sample as the question parsing sample.
14. The method according to claim 12, wherein generating the question parsing sample according to at least one of the synonym set, the similar question set and the comparison word set comprises:
filtering the synonym set according to a second preset filtering condition to obtain a candidate entity word list, wherein the candidate entity word list comprises a plurality of candidate entity words;
screening the similar question set to obtain an intermediate similar question set, wherein the intermediate similar question set comprises a plurality of first attributes and at least one mask-carrying candidate similar question corresponding to each first attribute;
determining, according to the candidate word list and the intermediate similar question set, a target candidate entity word and a target candidate similar question corresponding to the same attribute category;
replacing the mask in the target candidate similar question with the target candidate entity word to obtain a composite sample, and determining the composite sample as the question parsing sample;
or,
filtering the synonym set according to a third preset filtering condition to obtain attribute synonyms and entity synonyms;
generating a random number;
performing concatenation according to the comparison word information, the attribute synonyms, the entity synonyms, the random number and a preset descriptor to obtain a comparison-type sample, and determining the comparison-type sample as the question parsing sample.

15. The method according to claim 12, wherein generating the question classification sample according to the question parsing sample and the question domain set comprises:
performing fourth sampling on the question parsing sample to obtain a first in-domain sample;
generating a corresponding second in-domain sample according to the in-domain question text;
generating a corresponding out-of-domain sample according to the out-of-domain question text;
generating the question classification sample according to the first in-domain sample, the second in-domain sample and the out-of-domain sample.
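
To make the mask-replacement branch of claim 14 concrete, here is a small assumed sketch that pairs masked question templates with entity words of the same attribute category and substitutes the entity word for the mask. The attribute categories, the example phrases and the [MASK] token are invented for illustration only.

```python
# Assumed sketch of composite-sample generation by mask replacement.
import random

# Candidate entity words grouped by attribute category (illustrative data).
candidate_entities = {
    "费用": ["年费", "手续费"],
    "额度": ["信用额度", "取现额度"],
}
# Mask-carrying candidate similar questions, also keyed by attribute category.
masked_questions = {
    "费用": ["[MASK]是多少钱", "办卡要收[MASK]吗"],
    "额度": ["[MASK]最高能到多少"],
}

def make_composite_samples(n: int, seed: int = 0):
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        category = rng.choice(list(masked_questions))        # same attribute category
        template = rng.choice(masked_questions[category])    # target candidate similar question
        entity = rng.choice(candidate_entities[category])    # target candidate entity word
        samples.append({"text": template.replace("[MASK]", entity),
                        "attribute": category,
                        "entity": entity})
    return samples

for sample in make_composite_samples(3):
    print(sample)
```

Keeping the entity words and templates keyed by the same attribute category is what keeps the generated questions semantically coherent; mixing categories would produce nonsense pairs.
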
16. The method according to claim 12, wherein generating the entity linking sample according to the question parsing sample and the standard entity library comprises:
determining, according to the question parsing sample, a target parsing sample carrying a non-standard entity;
computing and ranking the similarity between the non-standard entity and each standard entity in the standard entity library;
determining, according to the ranking result, a preset number of target standard entities corresponding to the non-standard entity;
constructing positive and negative samples corresponding to the non-standard entity according to the non-standard entity and the preset number of target standard entities, and determining the positive and negative samples as the entity linking sample.

17. A model training method for a question-answering system, comprising:
generating a training data set by the sample generation method for a question-answering system according to any one of claims 12-16, wherein the training data set comprises the question classification sample, the question parsing sample and the entity linking sample;
inputting the question classification sample into an initial question classification model in the question-answering system for iterative training to obtain a question classification model; inputting the question parsing sample into an initial question parsing model in the question-answering system for iterative training to obtain a question parsing model; and inputting the entity linking sample into an initial entity linking model in the question-answering system for iterative training to obtain an entity linking model.
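
A minimal sketch of the similarity-ranked positive/negative pair construction described in claim 16 might look like the following. The character-overlap (Dice) similarity is only a stand-in for whatever scoring function is actually used, and the mention, gold entity and standard entity library are invented example data.

```python
# Assumed sketch: build entity-linking training pairs from a non-standard mention.
def similarity(a: str, b: str) -> float:
    """Stand-in similarity: character-overlap Dice coefficient."""
    sa, sb = set(a), set(b)
    return 2 * len(sa & sb) / (len(sa) + len(sb)) if (sa or sb) else 0.0

def build_link_samples(mention: str, gold_entity: str,
                       standard_entities: list, top_k: int = 3):
    # Rank every standard entity by its similarity to the non-standard mention.
    ranked = sorted(standard_entities, key=lambda e: similarity(mention, e), reverse=True)
    pairs = []
    for candidate in ranked[:top_k]:
        # The gold entity yields a positive pair; the other top-ranked candidates
        # become hard negatives.
        pairs.append({"mention": mention,
                      "candidate": candidate,
                      "label": 1 if candidate == gold_entity else 0})
    return pairs

standard_library = ["信用卡年费", "借记卡年费", "信用卡额度", "取现手续费"]
print(build_link_samples("信用卡的年费", "信用卡年费", standard_library))
```
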
18. An answering method, comprising:
obtaining a target question to be answered;
inputting the target question into a question classification model for classification to obtain a classification result, wherein the question classification model is trained by inputting question classification samples in a training data set into an initial question classification model, and the training data set is generated by the sample generation method for a question-answering system according to any one of claims 12-16;
in a case where the classification result indicates that the target question belongs to a first preset category, inputting the target question into a question parsing model for parsing to obtain a corresponding target segment, wherein the question parsing model is trained by inputting question parsing samples in the training data set into an initial question parsing model;
inputting the target segment into an entity linking model for prediction to obtain a corresponding target entity, wherein the entity linking model is trained by inputting entity linking samples in the training data set into an initial entity linking model;
determining an answer to the target question according to the target entity.

19. The method according to claim 18, wherein determining the answer to the target question according to the target entity comprises:
performing slot filling according to the target entity to obtain a slot filling result of the target question;
querying a pre-configured knowledge graph for a corresponding answer according to the slot filling result to obtain the answer to the target question.

20. The method according to claim 19, wherein querying the pre-configured knowledge graph for the corresponding answer according to the slot filling result comprises:
if the slot filling result cannot be used to query the knowledge graph for a unique corresponding answer, determining slot-missing information corresponding to the slot filling result;
generating a follow-up question according to the slot-missing information;
receiving a user input in response to the follow-up question;
performing slot filling according to the user input;
querying the pre-configured knowledge graph for the corresponding answer according to the updated slot filling result to obtain the answer to the target question.
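
As a hedged illustration of the re-ask loop in claim 20, the sketch below asks a follow-up question whenever the filled slots do not pin down a unique knowledge-graph answer. The slot names, the (entity, attribute) graph layout and the sample data are assumptions, not the application's schema.

```python
# Assumed sketch of slot filling with a clarifying follow-up question.
# The knowledge graph is modeled as {(entity, attribute): answer}.
KG = {
    ("金卡", "年费"): "300元/年",
    ("白金卡", "年费"): "800元/年",
}
REQUIRED_SLOTS = ("entity", "attribute")

def query(slots: dict):
    """Return every answer compatible with the currently filled slots."""
    return [ans for (ent, attr), ans in KG.items()
            if slots.get("entity") in (None, ent)
            and slots.get("attribute") in (None, attr)]

def answer_with_followup(slots: dict, ask_user):
    matches = query(slots)
    # Re-ask until the slots identify exactly one answer (or no slot is left to fill).
    while len(matches) > 1:
        missing = next((s for s in REQUIRED_SLOTS if not slots.get(s)), None)
        if missing is None:
            break
        slots[missing] = ask_user(f"请问您要查询的{missing}是什么？")
        matches = query(slots)
    return matches[0] if matches else None

# Example: the user only mentioned the attribute, so the entity slot is re-asked.
print(answer_with_followup({"attribute": "年费"}, ask_user=lambda q: "金卡"))
```
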
21. A sample generation apparatus for a question-answering system, the apparatus comprising:
a third obtaining unit, configured to obtain a pre-configured synonym set, similar question set, comparison word set, question domain set and standard entity library, wherein the synonym set comprises a plurality of standard words and at least one synonym corresponding to each standard word; the similar question set comprises a plurality of first attributes and at least one similar question corresponding to each first attribute; the comparison word set comprises comparison word information; the question domain set comprises in-domain question text and out-of-domain question text; and the standard entity library comprises a plurality of standard entities;
a first generating unit, configured to generate a question parsing sample according to at least one of the synonym set, the similar question set and the comparison word set;
a second generating unit, configured to generate a question classification sample according to the question parsing sample and the question domain set, and generate an entity linking sample according to the question parsing sample and the standard entity library;
a constructing unit, configured to construct a training data set for the question-answering system according to the question classification sample, the question parsing sample and the entity linking sample.

22. An answering apparatus, comprising:
a fourth obtaining unit, configured to obtain a target question to be answered;
a classification unit, configured to input the target question into a question classification model for classification to obtain a classification result, wherein the question classification model is trained by inputting question classification samples in a training data set into an initial question classification model, and the training data set is generated by the sample generation method for a question-answering system according to any one of claims 12-16;
a second parsing unit, configured to, in a case where the classification result indicates that the target question belongs to a first preset category, input the target question into a question parsing model for parsing to obtain a corresponding target segment, wherein the question parsing model is trained by inputting question parsing samples in the training data set into an initial question parsing model;
a prediction unit, configured to input the target segment into an entity linking model for prediction to obtain a corresponding target entity, wherein the entity linking model is trained by inputting entity linking samples in the training data set into an initial entity linking model;
a second determining unit, configured to determine an answer to the target question according to the target entity.
23. An electronic device, comprising:
a processor; and a memory configured to store computer-executable instructions which, when executed, cause the processor to perform the model training method for a question-answering system according to any one of claims 1-7, or the answering method according to any one of claims 8-10, or the sample generation method for a question-answering system according to any one of claims 12-16, or the model training method for a question-answering system according to claim 17, or the answering method according to any one of claims 18-20.

24. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the model training method for a question-answering system according to any one of claims 1-7, or the answering method according to any one of claims 8-10, or the sample generation method for a question-answering system according to any one of claims 12-16, or the model training method for a question-answering system according to claim 17, or the answering method according to any one of claims 18-20.
PCT/CN2024/070737 2023-03-14 2024-01-05 Question answering system model training method, and sample generation method Pending WO2024187925A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202310247102.2 2023-03-14
CN202310247092.2 2023-03-14
CN202310247092.2A CN116306974B (en) 2023-03-14 2023-03-14 Model training methods, devices, electronic equipment, and storage media for question-answering systems
CN202310247102.2A CN118674059A (en) 2023-03-14 2023-03-14 Sample generation method, device, electronic device and storage medium for question-answering system

Publications (1)

Publication Number Publication Date
WO2024187925A1 true WO2024187925A1 (en) 2024-09-19

Family

ID=92754291

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2024/070737 Pending WO2024187925A1 (en) 2023-03-14 2024-01-05 Question answering system model training method, and sample generation method

Country Status (1)

Country Link
WO (1) WO2024187925A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118939790A (en) * 2024-10-12 2024-11-12 杭州亚信软件有限公司 A resource processing method and device based on knowledge graph and large model
CN119149712A (en) * 2024-11-18 2024-12-17 阿里巴巴(中国)有限公司 Data construction, code question-answering method, task platform and code question-answering system
CN119647425A (en) * 2024-11-27 2025-03-18 中国农业银行股份有限公司 A smart filling device and smart filling method for a contract
CN120850987A (en) * 2025-09-22 2025-10-28 杭州汉资信息科技有限公司 An intelligent legal document automatic generation system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522393A (en) * 2018-10-11 2019-03-26 平安科技(深圳)有限公司 Intelligent answer method, apparatus, computer equipment and storage medium
CN111506719A (en) * 2020-04-20 2020-08-07 深圳追一科技有限公司 Associated question recommending method, device and equipment and readable storage medium
CN112163067A (en) * 2020-09-24 2021-01-01 平安直通咨询有限公司上海分公司 Sentence reply method, sentence reply device and electronic device
CN113377936A (en) * 2021-05-25 2021-09-10 杭州搜车数据科技有限公司 Intelligent question and answer method, device and equipment
US20230069935A1 (en) * 2019-11-20 2023-03-09 Korea Advanced Institute Of Science And Technology Dialog system answering method based on sentence paraphrase recognition
CN116306974A (en) * 2023-03-14 2023-06-23 马上消费金融股份有限公司 Model training method, device, electronic device and storage medium for question answering system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24769623

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE