WO2019153613A1 - Chat response method, electronic device and storage medium - Google Patents
- Publication number
- WO2019153613A1 (PCT/CN2018/090643; CN2018090643W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- answer
- question
- candidate
- conversation
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/01—Customer relationship services
- G06Q30/015—Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
- G06Q30/016—After-sales
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
Definitions
- the present application relates to the field of computer technologies, and in particular, to a chat response method, an electronic device, and a storage medium.
- AI (Artificial Intelligence) is gradually changing the way we live, and intelligent question answering (smart Q&A) is one such application.
- When a customer consults online via text or voice, an online intelligent customer-service agent can answer the customer automatically.
- Intelligent Q&A can effectively reduce customer waiting times and improve service quality, so it has very broad prospects.
- Even so, the online consultation process will contain some pure small-talk content.
- If the system cannot respond quickly, accurately, and adaptively to the chat content entered by the customer, the service quality of the intelligent customer service is reduced, and the customer does not receive a humanized, high-quality experience.
- The present application provides a chat response method, the method comprising a pre-processing step: acquiring a session question input by a client, pre-processing the session question, and obtaining text feature information of the session question, where the text feature information includes the part of speech, position, and term-category attribution of each term in the session question, the term-category attribution indicating whether a term belongs to a keyword or to a named entity.
- A first calculation step: constructing an inverted index for a question-and-answer (Q&A) knowledge base, the knowledge base including a plurality of pre-arranged questions and one or more answers associated with each question; querying, according to the text feature information, a candidate question set related to the session question from the Q&A knowledge base by means of an inverted index query; and separately calculating the text similarity between the session question and each candidate question in the candidate question set.
- A question retrieval step: determining, according to a preset rule and the text similarity, whether an approximate question of the session question exists in the candidate question set.
- The present application further provides an electronic device including a memory and a processor, where the memory stores a chat response program that, when executed by the processor, implements the following steps. A pre-processing step: obtaining a session question input by the client, pre-processing the session question, and obtaining text feature information of the session question, where the text feature information includes the part of speech, position, and term-category attribution of each term in the session question.
- The term-category attribution indicates whether a term belongs to a keyword or to a named entity.
- A first calculation step: constructing an inverted index for the Q&A knowledge base, which includes a plurality of pre-arranged questions and one or more answers associated with each question; determining, according to the text feature information, a candidate question set related to the session question from the Q&A knowledge base by means of an inverted index query; and separately calculating the text similarity between the session question and each candidate question in the candidate question set. A question retrieval step is then performed according to a preset rule and the text similarity.
- The present application further provides a computer-readable storage medium including a chat response program that, when executed by a processor, implements any step of the chat response method described above.
- According to the chat response method, electronic device, and storage medium proposed by the present application, after the session question is acquired and pre-processed, a candidate question set related to the session question is queried from the Q&A knowledge base by means of an inverted index query, the text similarity between the session question and each candidate question in the candidate question set is calculated separately, and it is determined whether an approximate question of the session question exists in the candidate question set; if so, the answer associated with the approximate question is looked up in the Q&A knowledge base and output as the target answer of the session question.
- If no approximate question of the session question exists in the candidate question set, a candidate answer set related to the session question is queried from the Q&A knowledge base by means of an inverted index query, the topic similarity between the session question and each candidate answer in the candidate answer set is calculated separately, and it is determined whether an approximate answer to the session question exists in the candidate answer set; if so, the approximate answer is output as the target answer of the session question.
- If no approximate answer to the session question exists in the candidate answer set, each question and answer in the Q&A knowledge base is iteratively trained for encoding and decoding through a seq2seq model to construct a sequence prediction model, the session question is input into the sequence prediction model to generate an adaptive answer, and the adaptive answer is output as the target answer of the session question. The method can therefore provide accurate and adaptive feedback to the customer for the session question, thereby improving service quality.
- FIG. 1 is a schematic diagram of an operating environment of a preferred embodiment of an electronic device of the present application
- FIG. 2 is a schematic diagram of interaction between an electronic device and a client according to a preferred embodiment of the present application
- FIG. 3 is a flow chart of a preferred embodiment of a chat response method of the present application.
- FIG. 4 is a program block diagram of the chat response program of FIG. 1.
- embodiments of the present application can be implemented as a method, apparatus, device, system, or computer program product. Accordingly, the application can be embodied in a complete hardware, complete software (including firmware, resident software, microcode, etc.), or a combination of hardware and software.
- According to an embodiment of the present application, a chat response method, an electronic device, and a storage medium are proposed.
- FIG. 1 is a schematic diagram of an operating environment of a preferred embodiment of an electronic device of the present application.
- the electronic device 1 may be a terminal device having a storage and computing function such as a server, a portable computer, or a desktop computer.
- the electronic device 1 includes a memory 11, a processor 12, a network interface 13, and a communication bus 14.
- the network interface 13 can optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
- the communication bus 14 is used to implement connection communication between the above components.
- the memory 11 includes at least one type of readable storage medium.
- the at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card type memory, or the like.
- the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1.
- the readable storage medium may also be an external memory of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card equipped on the electronic device 1.
- SMC smart memory card
- SD Secure Digital
- the readable storage medium of the memory 11 is generally used to store the chat response program 10, the Q&A knowledge base 4, and the like installed in the electronic device 1.
- the memory 11 can also be used to temporarily store data that has been output or is about to be output.
- the processor 12 may, in some embodiments, be a Central Processing Unit (CPU), microprocessor, or other data processing chip, used to run program code or process data stored in the memory 11, for example to execute the chat response program 10.
- CPU Central Processing Unit
- FIG. 1 shows only the electronic device 1 having the components 11-14 and the chat response program 10, but it should be understood that not all illustrated components may be implemented, and more or fewer components may be implemented instead.
- the electronic device 1 may further include a user interface
- the user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with a voice recognition function, and a voice output device such as a speaker or headphones.
- the user interface may also include a standard wired interface and a wireless interface.
- the electronic device 1 may further include a display, which may also be referred to as a display screen or a display unit.
- In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, or an Organic Light-Emitting Diode (OLED) display.
- the display is used to display information processed in the electronic device 1 and a user interface for displaying visualizations.
- the electronic device 1 further comprises a touch sensor.
- the area provided by the touch sensor for the user to perform a touch operation is referred to as a touch area.
- the touch sensor described herein may be a resistive touch sensor, a capacitive touch sensor, or the like.
- the touch sensor includes not only a contact type touch sensor but also a proximity type touch sensor or the like.
- the touch sensor may be a single sensor or a plurality of sensors arranged, for example, in an array. The user can initiate the chat response program 10 by touching the touch area.
- the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor.
- optionally, the display is stacked with the touch sensor to form a touch display screen, and the device detects user-triggered touch operations based on the touch display screen.
- the electronic device 1 may further include a radio frequency (RF) circuit, a sensor, an audio circuit, and the like, and details are not described herein.
- RF radio frequency
- Referring to FIG. 2, a schematic diagram of interaction between the electronic device 1 and the client 2 according to a preferred embodiment of the present application is shown.
- the chat response program 10 runs in the electronic device 1.
- the preferred embodiment of the electronic device 1 is a server.
- the electronic device 1 is communicatively coupled to the client 2 via a network 3.
- the client 2 can run in various types of terminal devices, such as smart phones, portable computers, and the like.
- after logging in to the electronic device 1 through the client 2, the user can input a session question to the chat response program 10; the session question may be a question about a specific domain or plain chat content.
- the chat response program 10 can use the chat response method to determine appropriate response content according to the session question and feed the response content back to the client 2.
- Referring to FIG. 3, a flowchart of a preferred embodiment of the chat response method of the present application is shown. When the processor 12 of the electronic device 1 executes the chat response program 10 stored in the memory 11, the following steps of the chat response method are implemented:
- Step S1: obtain a session question input by the client, pre-process the session question, and obtain text feature information of the session question, where the text feature information includes the part of speech, position, and term-category attribution of each term in the session question.
- The term-category attribution indicates whether a term belongs to a keyword or to a named entity.
- The session question can be, for example, a question about a particular domain, such as "How long is the warranty period?", or chat content, such as "The weather is very good today."
- in this embodiment, step S1 may first perform some pre-processing on the session question.
- the pre-processing performed in step S1 may include the following:
- Word segmentation is performed on the session question. For example, if the session question is "How long is the warranty period", the terms obtained after word segmentation are "warranty period", "is", "how", and "long".
- The methods of word segmentation include performing forward maximum matching against a preset dictionary and/or performing reverse maximum matching against a preset dictionary (a sketch of forward maximum matching is given after this list);
- Part-of-speech analysis is performed on each term obtained by word segmentation, and the part of speech of each term is tagged.
- For the above example, the result of part-of-speech tagging according to the preset rules is "warranty period / noun", "is / verb", "how / adverb", "long / adjective"; the part-of-speech analysis is implemented by a part-of-speech tagging model trained in advance on a large-scale corpus;
- Named entity recognition is performed on the session question to identify named entities with specific meanings; the named entities include person names, place names, organizations, and proper nouns, and the methods for recognizing named entities include dictionary-based and rule-based methods as well as methods based on statistical learning;
- the preset dictionary includes a business scenario-specific dictionary.
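- The specification names dictionary-based forward and reverse maximum matching but gives no implementation; the following is a minimal sketch of forward maximum matching, assuming a plain Python set as the preset dictionary (the toy dictionary contents and the `max_word_len` parameter are illustrative, not taken from the patent).

```python
def forward_maximum_match(text, dictionary, max_word_len=6):
    """Greedy forward maximum matching: at each position, take the longest
    dictionary word that starts there; fall back to a single character."""
    terms = []
    i = 0
    while i < len(text):
        matched = None
        # Try the longest candidate first, then shrink the window.
        for length in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                matched = candidate
                break
        if matched is None:
            matched = text[i]  # an unknown character becomes its own term
        terms.append(matched)
        i += len(matched)
    return terms


# Illustrative usage with a toy, business-specific dictionary.
dictionary = {"保修期", "是", "多", "长"}
print(forward_maximum_match("保修期是多长", dictionary))
# -> ['保修期', '是', '多', '长']
```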
- Step S2: construct an inverted index for the Q&A knowledge base 4, which includes a plurality of pre-arranged questions and one or more answers associated with each question.
- According to the text feature information, a candidate question set related to the session question is queried from the Q&A knowledge base 4 by means of an inverted index query, and the text similarity between the session question and each candidate question in the candidate question set is calculated separately.
- Constructing the inverted index for the Q&A knowledge base 4 includes the following (a sketch is given below):
- performing word segmentation, part-of-speech tagging, keyword extraction, and keyword-position recording on each question and answer in the Q&A knowledge base 4, assigning an ID number to each question and answer, and assigning an ID number to each term obtained after segmenting each question and answer;
- sorting the questions and answers in the Q&A knowledge base 4 according to their ID numbers, sorting the terms obtained after segmentation according to their ID numbers, and placing, for each term, all the question IDs and answer IDs that contain that term into the inverted record table corresponding to that term;
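- As a hedged illustration of the index construction above, the sketch below assigns an ID to each question and answer, tokenizes them (a naive whitespace tokenizer stands in for the dictionary-based segmenter and the part-of-speech/keyword steps), and builds an inverted record table mapping each term to the IDs of the questions and answers that contain it; all names are illustrative.

```python
from collections import defaultdict

def build_inverted_index(qa_pairs, tokenize):
    """qa_pairs: list of (question_text, answer_text).
    Returns (documents, inverted_index), where inverted_index maps a term to
    the set of document IDs (question or answer) that contain it."""
    documents = {}                      # doc_id -> text
    inverted_index = defaultdict(set)   # term -> {doc_id, ...}
    for i, (question, answer) in enumerate(qa_pairs):
        for kind, text in (("Q", question), ("A", answer)):
            doc_id = f"{kind}{i}"       # e.g. "Q0", "A0"
            documents[doc_id] = text
            for term in tokenize(text):
                inverted_index[term].add(doc_id)
    return documents, dict(inverted_index)


# Illustrative usage with a whitespace tokenizer.
docs, index = build_inverted_index(
    [("how long is the warranty period", "the warranty period is two years")],
    tokenize=str.split,
)
print(index["warranty"])   # e.g. {'Q0', 'A0'}
```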
- At least one candidate question is included in the candidate question set, and because an inverted index query is used, each candidate question has a certain degree of association with the session question.
- The association between each candidate question and the session question may be reflected by the text similarity: the higher the text similarity between the session question and a candidate question, the more similar the session question is considered to be to that candidate question.
- The text similarity between the session question and each candidate question in the candidate question set may be calculated in step S2, for example, as sketched below.
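- This excerpt does not enumerate the specific text-similarity computation; a common choice is cosine similarity over bag-of-words term vectors, sketched below as an assumption (function names are illustrative).

```python
import math
from collections import Counter

def cosine_text_similarity(terms_a, terms_b):
    """Cosine similarity between two term lists (bag-of-words vectors)."""
    vec_a, vec_b = Counter(terms_a), Counter(terms_b)
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[t] * vec_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Illustrative usage: session question terms vs. candidate question terms.
print(cosine_text_similarity(
    ["warranty", "period", "how", "long"],
    ["warranty", "period", "length"],
))  # ~0.577
```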
- Step S3: determine, according to a preset rule and the text similarity, whether an approximate question of the session question exists in the candidate question set; if an approximate question of the session question exists in the candidate question set, look up the answer associated with that approximate question in the Q&A knowledge base and output the associated answer as the target answer of the session question.
- The preset rule may include: determining whether there is a candidate question whose text similarity to the session question is greater than a second preset threshold; if such a candidate question exists, an approximate question of the session question exists in the candidate question set; if no candidate question has a text similarity to the session question greater than the second preset threshold, it is determined that no approximate question of the session question exists in the candidate question set.
- In this embodiment, step S3 selects, from the candidate questions whose text similarity to the session question is greater than the second preset threshold, the candidate question with the maximum text similarity as the approximate question, looks up the answer associated with the approximate question in the Q&A knowledge base 4, and outputs the associated answer as the target answer of the session question.
- The approximate question may have more than one associated answer in the Q&A knowledge base 4.
- In that case, step S3 may output, among the plurality of associated answers, the one output with the highest frequency in a preset time period (for example, the most recent week) as the target answer of the session question (a sketch of this selection rule is given below).
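- The selection rule of step S3 (threshold on text similarity, maximum similarity, then highest recent output frequency among multiple associated answers) could be expressed roughly as follows; the function name, data shapes, and threshold handling are illustrative assumptions.

```python
def select_target_answer(candidates, similarity, threshold, answer_frequency):
    """candidates: {question_id: [answer_id, ...]}.
    similarity:    {question_id: text similarity to the session question}.
    answer_frequency: {answer_id: times output in the recent window}.
    Returns the target answer ID, or None if no candidate passes the threshold."""
    passing = [qid for qid in candidates if similarity[qid] > threshold]
    if not passing:
        return None                                  # fall through to step S4
    best_question = max(passing, key=lambda qid: similarity[qid])
    answers = candidates[best_question]
    # If the approximate question has several associated answers, output the
    # one returned most frequently in the preset time period.
    return max(answers, key=lambda aid: answer_frequency.get(aid, 0))


# Illustrative usage.
print(select_target_answer(
    candidates={"q1": ["a1", "a2"], "q2": ["a3"]},
    similarity={"q1": 0.82, "q2": 0.35},
    threshold=0.5,
    answer_frequency={"a1": 3, "a2": 9},
))  # -> 'a2'
```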
- Step S4: if no approximate question of the session question exists in the candidate question set, query, according to the text feature information, a candidate answer set related to the session question from the Q&A knowledge base 4 by means of an inverted index query, and separately calculate the topic similarity between the session question and each candidate answer in the candidate answer set.
- At least one candidate answer is included in the candidate answer set, and because an inverted index query is used, each candidate answer has a certain degree of association with the session question.
- The association of each candidate answer with the session question may be reflected by the topic similarity: the higher the topic similarity between the session question and a candidate answer, the more similar their topics are considered to be, and the more likely that candidate answer is to be the answer to the session question.
- The topic similarity between the session question and each candidate answer in the candidate answer set may be calculated in step S4, for example, as sketched below.
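- This excerpt likewise does not specify how the topic similarity is computed; the sketch below assumes each text has already been mapped to a topic-probability vector (for example by a topic model such as LDA trained on the Q&A knowledge base, which is an assumption) and compares the vectors with cosine similarity.

```python
import math

def topic_similarity(topic_dist_a, topic_dist_b):
    """Cosine similarity between two topic-probability vectors of equal length.
    How the distributions are produced is an assumption, not specified here."""
    dot = sum(a * b for a, b in zip(topic_dist_a, topic_dist_b))
    norm_a = math.sqrt(sum(a * a for a in topic_dist_a))
    norm_b = math.sqrt(sum(b * b for b in topic_dist_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Illustrative usage: session question vs. two candidate answers over 4 topics.
question_topics = [0.70, 0.10, 0.10, 0.10]
print(topic_similarity(question_topics, [0.60, 0.20, 0.10, 0.10]))  # high, ~0.98
print(topic_similarity(question_topics, [0.05, 0.05, 0.05, 0.85]))  # low, ~0.21
```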
- Step S5: determine, according to a preset rule and the topic similarity, whether an approximate answer to the session question exists in the candidate answer set; if an approximate answer to the session question exists in the candidate answer set, output the approximate answer as the target answer of the session question.
- The preset rule may include: determining whether there is a candidate answer whose topic similarity to the session question is greater than a third preset threshold; if such a candidate answer exists, an approximate answer to the session question exists in the candidate answer set; if no candidate answer has a topic similarity to the session question greater than the third preset threshold, it is determined that no approximate answer to the session question exists in the candidate answer set.
- If a candidate answer whose topic similarity to the session question is greater than the third preset threshold exists, that candidate answer is taken as the approximate answer to the session question, and step S5 outputs the approximate answer as the target answer of the session question.
- There may be more than one candidate answer in the Q&A knowledge base 4 whose topic similarity to the session question is greater than the third preset threshold; in that case, step S5 may take, as the approximate answer, the candidate answer output with the highest frequency in a preset time period (for example, the most recent week).
- Step S6: if no approximate answer to the session question exists in the candidate answer set, perform iterative encoding-and-decoding training on each question and answer in the Q&A knowledge base 4 through a seq2seq model to construct a sequence prediction model, input the session question into the sequence prediction model to generate an adaptive answer, and output the adaptive answer as the target answer of the session question.
- The seq2seq model consists of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encoding-and-decoding training, and an attention mechanism used to calculate hidden-layer information weights for each encoding and decoding step (a minimal sketch is given below).
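- The following is a minimal PyTorch sketch of the kind of seq2seq architecture described here: an LSTM encoder, an LSTM decoder, and a dot-product attention mechanism that weights the encoder hidden states at each decoding step. For brevity it uses a single unidirectional encoder rather than separate forward and backward LSTMs, and the vocabulary size, dimensions, and teacher forcing are illustrative assumptions, not parameters from the patent.

```python
import torch
import torch.nn as nn

class Seq2SeqWithAttention(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Combine the decoder state with the attention context before
        # projecting onto the vocabulary.
        self.output = nn.Linear(hidden_dim * 2, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Encode the session question.
        enc_out, (h, c) = self.encoder(self.embedding(src_ids))
        # Decode the answer with teacher forcing, reusing the encoder state.
        dec_out, _ = self.decoder(self.embedding(tgt_ids), (h, c))
        # Dot-product attention: weights over encoder positions per decoder step.
        scores = torch.bmm(dec_out, enc_out.transpose(1, 2))
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights, enc_out)
        return self.output(torch.cat([dec_out, context], dim=-1))

# Illustrative usage: one question/answer pair of token IDs.
model = Seq2SeqWithAttention(vocab_size=1000)
question = torch.randint(0, 1000, (1, 12))   # batch of 1, 12 tokens
answer = torch.randint(0, 1000, (1, 9))      # teacher-forced decoder input
logits = model(question, answer)
print(logits.shape)                           # torch.Size([1, 9, 1000])
```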
- In the above embodiment, after the session question is acquired and pre-processed, a candidate question set related to the session question is queried from the Q&A knowledge base 4 by means of an inverted index query, the text similarity between the session question and each candidate question in the candidate question set is calculated, and it is determined whether an approximate question of the session question exists in the candidate question set; if so, the answer associated with the approximate question is looked up in the Q&A knowledge base 4 and output as the target answer of the session question.
- Referring to FIG. 4, a program module diagram of the chat response program 10 in FIG. 1 is shown.
- the chat response program 10 is divided into a plurality of modules that are stored in the memory 11 and executed by the processor 12 to complete the present application.
- a module as referred to in this application refers to a series of computer program instructions that are capable of performing a particular function.
- the chat response program 10 can be divided into: a pre-processing module 110, a first calculation module 120, a question retrieval module 130, a second calculation module 140, an answer retrieval module 150, and an answer prediction module 160.
- The pre-processing module 110 is configured to obtain a session question input by the client and pre-process the session question to obtain text feature information of the session question, where the text feature information includes the part of speech, position, and term-category attribution of each term in the session question; the term-category attribution indicates whether a term belongs to a keyword or to a named entity.
- The pre-processing module 110 is configured to perform the following pre-processing on the session question:
- word segmentation, where the methods of word segmentation include performing forward maximum matching against a preset dictionary and/or performing reverse maximum matching against a preset dictionary;
- part-of-speech analysis on each term obtained by word segmentation, tagging the part of speech of each term, where the part-of-speech analysis is implemented by a part-of-speech tagging model trained in advance on a large-scale corpus;
- named entity recognition on the session question, identifying named entities with specific meanings, where the named entities include person names, place names, organizations, and proper nouns, and the methods for recognizing named entities include dictionary-based and rule-based methods as well as methods based on statistical learning;
- the preset dictionary includes a business scenario-specific dictionary.
- The first calculation module 120 is configured to construct an inverted index for the Q&A knowledge base 4, which includes a plurality of pre-arranged questions and one or more answers associated with each question.
- According to the text feature information, the first calculation module 120 queries a candidate question set related to the session question from the Q&A knowledge base 4 by means of an inverted index query and separately calculates the text similarity between the session question and each candidate question in the candidate question set.
- The first calculation module 120 is configured to construct the inverted index for the Q&A knowledge base 4 by:
- performing word segmentation, part-of-speech tagging, keyword extraction, and keyword-position recording on each question and answer in the Q&A knowledge base 4, assigning an ID number to each question and answer, and assigning an ID number to each term obtained after segmenting each question and answer;
- sorting the questions and answers in the Q&A knowledge base 4 according to their ID numbers, sorting the terms obtained after segmentation according to their ID numbers, and placing, for each term, all the question IDs and answer IDs that contain that term into the inverted record table corresponding to that term;
- The first calculation module 120 calculates the text similarity between the session question and each candidate question in the candidate question set in the same manner as described above for step S2.
- The question retrieval module 130 is configured to determine, according to the preset rule and the text similarity, whether an approximate question of the session question exists in the candidate question set, and if so, to look up the answer associated with the approximate question in the Q&A knowledge base and output the associated answer as the target answer of the session question.
- The question retrieval module 130 determines whether there is a candidate question whose text similarity to the session question is greater than the second preset threshold; if so, it selects, from the candidate questions whose text similarity to the session question is greater than the second preset threshold, the candidate question with the maximum text similarity as the approximate question; if not, it determines that no approximate question of the session question exists in the candidate question set.
- The second calculation module 140 is configured to, if no approximate question of the session question exists in the candidate question set, query, according to the text feature information, a candidate answer set related to the session question from the Q&A knowledge base 4 by means of an inverted index query, and separately calculate the topic similarity between the session question and each candidate answer in the candidate answer set.
- The second calculation module 140 calculates the topic similarity between the session question and each candidate answer in the candidate answer set in the same manner as described above for step S4.
- The answer retrieval module 150 is configured to determine, according to the preset rule and the topic similarity, whether an approximate answer to the session question exists in the candidate answer set, and if so, to output the approximate answer as the target answer of the session question.
- The answer retrieval module 150 determines whether there is a candidate answer whose topic similarity to the session question is greater than the third preset threshold; if so, it selects, from the candidate answers whose topic similarity to the session question is greater than the third preset threshold, the candidate answer with the maximum topic similarity as the approximate answer; if not, it determines that no approximate answer to the session question exists in the candidate answer set.
- The answer prediction module 160 is configured to, if no approximate answer to the session question exists in the candidate answer set, perform iterative encoding-and-decoding training on each question and answer in the Q&A knowledge base 4 through the seq2seq model to construct a sequence prediction model, input the session question into the sequence prediction model to generate an adaptive answer, and output the adaptive answer as the target answer of the session question.
- In the answer prediction module 160, the seq2seq model consists of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encoding-and-decoding training, and an attention mechanism used to calculate hidden-layer information weights for each encoding and decoding step.
- the memory 11 including the readable storage medium may include an operating system, a chat response program 10, and a question and answer knowledge base 4.
- When the processor 12 executes the chat response program 10 stored in the memory 11, the following steps are implemented:
- a pre-processing step: obtaining a session question input by the client, pre-processing the session question, and obtaining text feature information of the session question, where the text feature information includes the part of speech, position, and term-category attribution of each term in the session question, the term-category attribution indicating whether a term belongs to a keyword or to a named entity;
- a first calculation step: constructing an inverted index for the Q&A knowledge base, which includes a plurality of pre-arranged questions and one or more answers associated with each question, querying a candidate question set related to the session question from the Q&A knowledge base by means of an inverted index query according to the text feature information, and separately calculating the text similarity between the session question and each candidate question in the candidate question set;
- a question retrieval step: determining, according to a preset rule and the text similarity, whether an approximate question of the session question exists in the candidate question set, and if so, looking up the answer associated with the approximate question in the Q&A knowledge base and outputting the associated answer as the target answer of the session question;
- a second calculation step: if no approximate question of the session question exists in the candidate question set, querying, according to the text feature information, a candidate answer set related to the session question from the Q&A knowledge base by means of an inverted index query, and separately calculating the topic similarity between the session question and each candidate answer in the candidate answer set;
- an answer retrieval step: determining, according to a preset rule and the topic similarity, whether an approximate answer to the session question exists in the candidate answer set, and if so, outputting the approximate answer as the target answer of the session question;
- an answer prediction step: if no approximate answer to the session question exists in the candidate answer set, performing iterative encoding-and-decoding training on each question and answer in the Q&A knowledge base through a seq2seq model to construct a sequence prediction model, inputting the session question into the sequence prediction model to generate an adaptive answer, and outputting the adaptive answer as the target answer of the session question.
- The pre-processing of the session question includes:
- word segmentation, where the methods of word segmentation include performing forward maximum matching against a preset dictionary and/or performing reverse maximum matching against a preset dictionary;
- part-of-speech analysis on each term obtained by word segmentation, tagging the part of speech of each term, where the part-of-speech analysis is implemented by a part-of-speech tagging model trained in advance on a large-scale corpus;
- named entity recognition on the session question, identifying named entities with specific meanings, where the named entities include person names, place names, organizations, and proper nouns, and the methods for recognizing named entities include dictionary-based and rule-based methods as well as methods based on statistical learning;
- the preset dictionary includes a business scenario-specific dictionary.
- the topic similarity between the session question and each candidate answer in the candidate answer set is calculated separately;
- whether an approximate question of the session question exists in the candidate question set is determined according to the preset rule and the text similarity;
- whether an approximate answer exists in the candidate answer set is determined according to the preset rule and the topic similarity;
- Constructing the inverted index for the Q&A knowledge base includes:
- performing word segmentation, part-of-speech tagging, keyword extraction, and keyword-position recording on each question and answer in the Q&A knowledge base, assigning an ID number to each question and answer, and assigning an ID number to each term obtained after segmenting each question and answer;
- sorting the questions and answers in the Q&A knowledge base according to their ID numbers, sorting the terms obtained after segmentation according to their ID numbers, and placing, for each term, all the question IDs and answer IDs that contain that term into the inverted record table corresponding to that term;
- The seq2seq model consists of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encoding-and-decoding training, and an attention mechanism used to calculate hidden-layer information weights for each encoding and decoding step.
- The embodiment of the present application further provides a computer-readable storage medium, which may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, and the like.
- the computer readable storage medium includes a Q&A knowledge base 4, a chat response program 10, and the like. When the chat response program 10 is executed by the processor 12, the following operations are implemented:
- a pre-processing step: obtaining a session question input by the client, pre-processing the session question, and obtaining text feature information of the session question, where the text feature information includes the part of speech, position, and term-category attribution of each term in the session question, the term-category attribution indicating whether a term belongs to a keyword or to a named entity;
- a first calculation step: constructing an inverted index for the Q&A knowledge base, which includes a plurality of pre-arranged questions and one or more answers associated with each question, querying a candidate question set related to the session question from the Q&A knowledge base by means of an inverted index query according to the text feature information, and separately calculating the text similarity between the session question and each candidate question in the candidate question set;
- a question retrieval step: determining, according to a preset rule and the text similarity, whether an approximate question of the session question exists in the candidate question set, and if so, looking up the answer associated with the approximate question in the Q&A knowledge base and outputting the associated answer as the target answer of the session question;
- a second calculation step: if no approximate question of the session question exists in the candidate question set, querying, according to the text feature information, a candidate answer set related to the session question from the Q&A knowledge base by means of an inverted index query, and separately calculating the topic similarity between the session question and each candidate answer in the candidate answer set;
- an answer retrieval step: determining, according to a preset rule and the topic similarity, whether an approximate answer to the session question exists in the candidate answer set, and if so, outputting the approximate answer as the target answer of the session question;
- an answer prediction step: if no approximate answer to the session question exists in the candidate answer set, performing iterative encoding-and-decoding training on each question and answer in the Q&A knowledge base through a seq2seq model to construct a sequence prediction model, inputting the session question into the sequence prediction model to generate an adaptive answer, and outputting the adaptive answer as the target answer of the session question.
- The pre-processing of the session question includes:
- word segmentation, where the methods of word segmentation include performing forward maximum matching against a preset dictionary and/or performing reverse maximum matching against a preset dictionary;
- part-of-speech analysis on each term obtained by word segmentation, tagging the part of speech of each term, where the part-of-speech analysis is implemented by a part-of-speech tagging model trained in advance on a large-scale corpus;
- named entity recognition on the session question, identifying named entities with specific meanings, where the named entities include person names, place names, organizations, and proper nouns, and the methods for recognizing named entities include dictionary-based and rule-based methods as well as methods based on statistical learning;
- the preset dictionary includes a business scenario-specific dictionary.
- the topic similarity between the session question and each candidate answer in the candidate answer set is calculated separately;
- whether an approximate question of the session question exists in the candidate question set is determined according to the preset rule and the text similarity;
- whether an approximate answer exists in the candidate answer set is determined according to the preset rule and the topic similarity;
- Constructing the inverted index for the Q&A knowledge base includes:
- performing word segmentation, part-of-speech tagging, keyword extraction, and keyword-position recording on each question and answer in the Q&A knowledge base, assigning an ID number to each question and answer, and assigning an ID number to each term obtained after segmenting each question and answer;
- sorting the questions and answers in the Q&A knowledge base according to their ID numbers, sorting the terms obtained after segmentation according to their ID numbers, and placing, for each term, all the question IDs and answer IDs that contain that term into the inverted record table corresponding to that term;
- The seq2seq model consists of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encoding-and-decoding training, and an attention mechanism used to calculate hidden-layer information weights for each encoding and decoding step.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Finance (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
Description
This application claims priority to Chinese Patent Application No. 201810135747.6, entitled "Chat Response Method, Electronic Device, and Storage Medium", filed with the Chinese Patent Office on February 9, 2018, the entire contents of which are incorporated herein by reference.
The present application relates to the field of computer technologies, and in particular, to a chat response method, an electronic device, and a storage medium.
With the development of technology, AI (Artificial Intelligence) is gradually changing the way we live; intelligent question answering is one such application. When a customer consults online via text or voice, an online intelligent customer-service agent can answer the customer automatically. Intelligent Q&A can effectively reduce customer waiting times and improve service quality, so it has very broad prospects.
However, even in specific service domains such as finance, banking, securities, and insurance, the online consultation process will contain some pure small-talk content. If the system cannot respond quickly, accurately, and adaptively to the chat content entered by the customer, the service quality of the intelligent customer service is reduced, and the customer does not receive a humanized, high-quality experience.
Summary of the invention
In view of the above, it is necessary to provide a chat response method, an electronic device, and a storage medium that can give accurate and adaptive feedback to customers for session questions, thereby improving service quality.
To achieve the above objective, the present application provides a chat response method, comprising: a pre-processing step: acquiring a session question input by a client, pre-processing the session question, and obtaining text feature information of the session question, where the text feature information includes the part of speech, position, and term-category attribution of each term in the session question, the term-category attribution indicating whether a term belongs to a keyword or to a named entity; a first calculation step: constructing an inverted index for a Q&A knowledge base, the Q&A knowledge base including a plurality of pre-arranged questions and one or more answers associated with each question, querying a candidate question set related to the session question from the Q&A knowledge base by means of an inverted index query according to the text feature information, and separately calculating the text similarity between the session question and each candidate question in the candidate question set; a question retrieval step: determining, according to a preset rule and the text similarity, whether an approximate question of the session question exists in the candidate question set, and if so, looking up the answer associated with the approximate question in the Q&A knowledge base and outputting the associated answer as the target answer of the session question; a second calculation step: if no approximate question of the session question exists in the candidate question set, querying, according to the text feature information, a candidate answer set related to the session question from the Q&A knowledge base by means of an inverted index query, and separately calculating the topic similarity between the session question and each candidate answer in the candidate answer set; an answer retrieval step: determining, according to a preset rule and the topic similarity, whether an approximate answer to the session question exists in the candidate answer set, and if so, outputting the approximate answer as the target answer of the session question; an answer prediction step: if no approximate answer to the session question exists in the candidate answer set, performing iterative encoding-and-decoding training on each question and answer in the Q&A knowledge base through a seq2seq model to construct a sequence prediction model, inputting the session question into the sequence prediction model to generate an adaptive answer, and outputting the adaptive answer as the target answer of the session question.
To achieve the above objective, the present application further provides an electronic device including a memory and a processor, where the memory stores a chat response program that, when executed by the processor, implements the following steps: a pre-processing step: obtaining a session question input by the client, pre-processing the session question, and obtaining text feature information of the session question, where the text feature information includes the part of speech, position, and term-category attribution of each term in the session question, the term-category attribution indicating whether a term belongs to a keyword or to a named entity; a first calculation step: constructing an inverted index for the Q&A knowledge base, the Q&A knowledge base including a plurality of pre-arranged questions and one or more answers associated with each question, querying a candidate question set related to the session question from the Q&A knowledge base by means of an inverted index query according to the text feature information, and separately calculating the text similarity between the session question and each candidate question in the candidate question set; a question retrieval step: determining, according to a preset rule and the text similarity, whether an approximate question of the session question exists in the candidate question set, and if so, looking up the answer associated with the approximate question in the Q&A knowledge base and outputting the associated answer as the target answer of the session question; a second calculation step: if no approximate question of the session question exists in the candidate question set, querying, according to the text feature information, a candidate answer set related to the session question from the Q&A knowledge base by means of an inverted index query, and separately calculating the topic similarity between the session question and each candidate answer in the candidate answer set; an answer retrieval step: determining, according to a preset rule and the topic similarity, whether an approximate answer to the session question exists in the candidate answer set, and if so, outputting the approximate answer as the target answer of the session question; an answer prediction step: if no approximate answer to the session question exists in the candidate answer set, performing iterative encoding-and-decoding training on each question and answer in the Q&A knowledge base through a seq2seq model to construct a sequence prediction model, inputting the session question into the sequence prediction model to generate an adaptive answer, and outputting the adaptive answer as the target answer of the session question.
In addition, to achieve the above objective, the present application further provides a computer-readable storage medium including a chat response program that, when executed by a processor, implements any step of the chat response method described above.
According to the chat response method, electronic device, and storage medium proposed by the present application, after the session question is acquired and pre-processed, a candidate question set related to the session question is queried from the Q&A knowledge base by means of an inverted index query, the text similarity between the session question and each candidate question in the candidate question set is calculated separately, and it is determined whether an approximate question of the session question exists in the candidate question set; if so, the answer associated with the approximate question is looked up in the Q&A knowledge base and output as the target answer of the session question. If no approximate question of the session question exists in the candidate question set, a candidate answer set related to the session question is queried from the Q&A knowledge base by means of an inverted index query, the topic similarity between the session question and each candidate answer in the candidate answer set is calculated separately, and it is determined whether an approximate answer to the session question exists in the candidate answer set; if so, the approximate answer is output as the target answer of the session question. If no approximate answer to the session question exists in the candidate answer set, each question and answer in the Q&A knowledge base is iteratively trained for encoding and decoding through a seq2seq model to construct a sequence prediction model, the session question is input into the sequence prediction model to generate an adaptive answer, and the adaptive answer is output as the target answer of the session question. The method can therefore provide accurate and adaptive feedback to the customer for the session question, thereby improving service quality, as summarized in the sketch below.
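The three-tier fallback summarized above (approximate question retrieval, then approximate answer retrieval, then seq2seq generation) can be sketched as follows; the function and threshold names are illustrative, and the retrieval, similarity, and generation callables are assumed to be supplied elsewhere (for example, along the lines of the sketches earlier in this document).

```python
def answer_session_question(question, retrieve_questions, retrieve_answers,
                            text_sim, topic_sim, generate, t2, t3):
    """Three-tier fallback: approximate question -> approximate answer ->
    seq2seq-generated adaptive answer. t2 and t3 are the second and third
    preset thresholds."""
    # Tier 1: look for an approximate question in the knowledge base.
    candidates = retrieve_questions(question)            # {qid: answer}
    scored = {qid: text_sim(question, qid) for qid in candidates}
    passing = [qid for qid, s in scored.items() if s > t2]
    if passing:
        best = max(passing, key=scored.get)
        return candidates[best]                          # associated answer
    # Tier 2: look for an approximate answer directly.
    answers = retrieve_answers(question)                 # [answer_text, ...]
    scored_a = {a: topic_sim(question, a) for a in answers}
    passing_a = [a for a, s in scored_a.items() if s > t3]
    if passing_a:
        return max(passing_a, key=scored_a.get)          # approximate answer
    # Tier 3: generate an adaptive answer with the sequence prediction model.
    return generate(question)
```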
FIG. 1 is a schematic diagram of the operating environment of a preferred embodiment of the electronic device of the present application;
FIG. 2 is a schematic diagram of the interaction between the electronic device and a client according to a preferred embodiment of the present application;
FIG. 3 is a flowchart of a preferred embodiment of the chat response method of the present application;
FIG. 4 is a program module diagram of the chat response program in FIG. 1.
The implementation, functional features and advantages of the present application will be further described with reference to the embodiments and the accompanying drawings.
The principles and spirit of the present application are described below with reference to several specific embodiments. It should be understood that the specific embodiments described herein are merely intended to explain the present application and are not intended to limit it.
Those skilled in the art will appreciate that the embodiments of the present application may be implemented as a method, an apparatus, a device, a system or a computer program product. Accordingly, the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, microcode, etc.), or in a combination of hardware and software.
According to the embodiments of the present application, a chat response method, an electronic device and a storage medium are proposed.
Referring to FIG. 1, it is a schematic diagram of the operating environment of a preferred embodiment of the electronic device of the present application.
The electronic device 1 may be a terminal device with storage and computing capabilities, such as a server, a portable computer or a desktop computer.
The electronic device 1 includes a memory 11, a processor 12, a network interface 13 and a communication bus 14. The network interface 13 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The communication bus 14 is used to implement connection and communication between these components.
The memory 11 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card or a card-type memory. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, for example a hard disk of the electronic device 1. In other embodiments, the readable storage medium may also be an external memory of the electronic device 1, for example a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card equipped on the electronic device 1.
In this embodiment, the readable storage medium of the memory 11 is generally used to store the chat response program 10 and the Q&A knowledge base 4 installed in the electronic device 1, among others. The memory 11 may also be used to temporarily store data that has been output or is to be output.
In some embodiments, the processor 12 may be a central processing unit (CPU), a microprocessor or another data processing chip, and is used to run the program code stored in the memory 11 or to process data, for example to execute the chat response program 10.
FIG. 1 shows only the electronic device 1 with the components 11-14 and the chat response program 10; it should be understood, however, that not all of the illustrated components are required, and more or fewer components may be implemented instead.
Optionally, the electronic device 1 may further include a user interface. The user interface may include an input unit such as a keyboard, a voice input device with speech recognition capability such as a microphone, and a voice output device such as a loudspeaker or headphones. Optionally, the user interface may also include a standard wired interface and a wireless interface.
Optionally, the electronic device 1 may further include a display, which may also be called a display screen or a display unit. In some embodiments the display may be an LED display, a liquid crystal display, a touch-controlled liquid crystal display, an organic light-emitting diode (OLED) display or the like. The display is used to show the information processed in the electronic device 1 and to present a visual user interface.
Optionally, the electronic device 1 further includes a touch sensor. The area provided by the touch sensor for the user's touch operations is called the touch area. The touch sensor described here may be a resistive touch sensor, a capacitive touch sensor or the like, and includes not only contact-type but also proximity-type touch sensors. Moreover, the touch sensor may be a single sensor or a plurality of sensors arranged, for example, in an array. The user can start the chat response program 10 by touching the touch area.
In addition, the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor. Optionally, the display and the touch sensor are stacked to form a touch display screen, and the device detects user-triggered touch operations through the touch display screen.
The electronic device 1 may further include a radio frequency (RF) circuit, sensors, an audio circuit and the like, which are not described in detail here.
Referring to FIG. 2, it is a schematic diagram of the interaction between the electronic device 1 and a client 2 according to a preferred embodiment of the present application. The chat response program 10 runs in the electronic device 1; in FIG. 2 the preferred embodiment of the electronic device 1 is a server. The electronic device 1 is communicatively connected to the client 2 through a network 3. The client 2 may run on various terminal devices, such as smartphones and portable computers. After logging in to the electronic device 1 through the client 2, a user can input a conversation question to the chat response program 10; the conversation question may be a question about a specific domain or ordinary chat content. Using the chat response method, the chat response program 10 determines an appropriate response according to the conversation question and feeds the response back to the client 2.
Referring to FIG. 3, it is a flowchart of a preferred embodiment of the chat response method of the present application. When the processor 12 of the electronic device 1 executes the chat response program 10 stored in the memory 11, the following steps of the chat response method are implemented.
Step S1: obtain a conversation question input by a customer and pre-process it to obtain text feature information of the conversation question, the text feature information including the part of speech, position and word-class attribution of each term in the conversation question, where the word-class attribution indicates whether a term is a keyword or a named entity. The conversation question may be a question about a specific domain, for example "How long is the warranty period?" (保修期是多久), or ordinary chat content, for example "The weather is very nice today." To facilitate subsequent processing, step S1 may first perform some pre-processing on the conversation question.
Specifically, the pre-processing performed in step S1 may include the following operations:
performing word segmentation on the conversation question to split it into terms; for example, for the conversation question "保修期是多久" the terms obtained after segmentation are "保修期" (warranty period), "是" (is), "多" (how) and "久" (long); the segmentation methods include dictionary-based forward maximum matching and/or dictionary-based backward maximum matching;
performing part-of-speech analysis on the terms obtained by segmentation and tagging the part of speech of each term; for the example above, tagging according to preset rules yields "保修期/noun", "是/verb", "多/adverb" and "久/adjective"; the part-of-speech analysis is implemented by a part-of-speech tagging model trained on a preset large-scale corpus;
performing named entity recognition on the conversation question to identify named entities with specific meanings, the named entities including person names, place names, organizations and proper nouns; the named entity recognition methods include dictionary-and-rule-based methods and statistical-learning-based methods;
extracting keywords from the conversation question according to the terms and the named entities, a keyword being a phrase whose number of characters exceeds a first preset threshold, or a named entity present in a preset dictionary, the preset dictionary including a business-scenario-specific dictionary.
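As a concrete illustration of the segmentation and keyword rules above, the following is a minimal Python sketch; the toy dictionaries, the character-count threshold and the function names are assumptions introduced for exposition and are not prescribed by the application.

```python
# Minimal sketch of the pre-processing step (assumed toy dictionaries, assumed threshold).
def forward_max_match(text, vocab, max_len=4):
    """Dictionary-based forward maximum matching segmentation."""
    terms, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:   # fall back to a single character
                terms.append(piece)
                i += size
                break
    return terms

def extract_keywords(terms, entities, domain_dict, min_chars=2):
    """Keyword = term longer than the threshold, or a named entity found in the domain dictionary."""
    keywords = [t for t in terms if len(t) >= min_chars]
    keywords += [e for e in entities if e in domain_dict]
    return list(dict.fromkeys(keywords))      # de-duplicate, keep order

vocab = {"保修期"}
terms = forward_max_match("保修期是多久", vocab)          # ['保修期', '是', '多', '久']
print(extract_keywords(terms, entities=["保修期"], domain_dict={"保修期"}))  # ['保修期']
```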
Step S2: build an inverted index for the Q&A knowledge base 4, where the Q&A knowledge base 4 contains a plurality of pre-compiled questions and one or more answers associated with each question; according to the text feature information, retrieve from the Q&A knowledge base 4, through an inverted-index query, a set of candidate questions related to the conversation question, and calculate the text similarity between the conversation question and each candidate question in the set.
In one embodiment, building the inverted index for the Q&A knowledge base 4 includes:
performing word segmentation, part-of-speech tagging, keyword extraction, recording of keyword positions and ID assignment for each question and answer in the Q&A knowledge base 4, and assigning an ID to each term obtained by segmenting each question and answer;
sorting the questions and answers in the Q&A knowledge base 4 by their IDs, sorting the terms obtained from the segmentation by their IDs, and placing all question IDs and answer IDs that share the same term ID into the posting list corresponding to that term;
merging all posting lists into the final inverted index.
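To make the index construction tangible, here is a minimal sketch; the data layout (a mapping from term ID to a sorted posting list of question and answer IDs) and the helper names are illustrative assumptions rather than structures prescribed by the application.

```python
from collections import defaultdict

def build_inverted_index(kb_entries, segment):
    """kb_entries: list of (doc_id, text) for every question and answer in the knowledge base.
    segment: a word-segmentation function, e.g. forward_max_match from the earlier sketch."""
    term_ids = {}                        # term -> term ID
    postings = defaultdict(set)          # term ID -> set of document (question/answer) IDs
    for doc_id, text in kb_entries:
        for position, term in enumerate(segment(text)):
            tid = term_ids.setdefault(term, len(term_ids))
            postings[tid].add(doc_id)    # positions could also be stored for keyword-location records
    # merge into the final index: term ID -> sorted posting list
    return term_ids, {tid: sorted(ids) for tid, ids in postings.items()}

def query_candidates(question_terms, term_ids, index):
    """Union of the posting lists of the query terms = candidate set related to the question."""
    candidates = set()
    for term in question_terms:
        candidates.update(index.get(term_ids.get(term, -1), []))
    return candidates
```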
The candidate question set contains at least one candidate question, and because an inverted-index query is used, every candidate question is related to the conversation question to some degree. This relation can be reflected by the text similarity: the higher the text similarity between the conversation question and a candidate question, the more similar the two are considered to be.
Specifically, the method used in step S2 to calculate the text similarity between the conversation question and each candidate question in the candidate question set may include:
building a convolutional neural network and training it on all question sentences in the Q&A knowledge base 4, to obtain a convolutional neural network model corresponding to the question sentences in the Q&A knowledge base 4;
inputting the conversation question and each candidate question in the candidate question set into the convolutional neural network model, and obtaining, through convolution with the model's convolution kernels, the feature vector corresponding to the conversation question and to each candidate question;
calculating the cosine distance between the feature vector of the conversation question and the feature vector of each candidate question, thereby obtaining the text similarity between the conversation question and each candidate question in the candidate question set.
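The encode-then-compare idea can be sketched as follows, assuming PyTorch; the application does not fix a network layout, so the vocabulary size, embedding width, filter count and kernel size below are placeholders, and training on the knowledge-base questions is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNEncoder(nn.Module):
    """Toy CNN sentence encoder: embedding -> 1D convolution -> max pooling -> feature vector."""
    def __init__(self, vocab_size=5000, embed_dim=64, num_filters=128, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size, padding=1)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        x = F.relu(self.conv(x))                   # (batch, num_filters, seq_len)
        return x.max(dim=2).values                 # (batch, num_filters)

def text_similarity(encoder, question_ids, candidate_ids):
    """Cosine similarity between the conversation question and each candidate question."""
    with torch.no_grad():
        q = encoder(question_ids)                  # (1, num_filters)
        c = encoder(candidate_ids)                 # (n_candidates, num_filters)
    return F.cosine_similarity(q, c, dim=1)        # one score per candidate
```

In use, the question and each candidate would be converted to padded token-ID tensors with the same vocabulary used at training time, and the scores returned by text_similarity would feed the threshold rule of step S3.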
Step S3: according to preset rules and the text similarities, judge whether the candidate question set contains an approximate question of the conversation question; if it does, look up the answer associated with that approximate question in the Q&A knowledge base and output the associated answer as the target answer to the conversation question.
Specifically, the preset rules may include: judging whether there is a candidate question whose text similarity to the conversation question is greater than a second preset threshold; if such a candidate question exists, it is determined that the candidate question set contains an approximate question of the conversation question; if no such candidate question exists, it is determined that the candidate question set contains no approximate question of the conversation question.
If there are candidate questions whose text similarity to the conversation question is greater than the second preset threshold, step S3 selects, from among them, the candidate question with the highest text similarity as the approximate question, looks up the answer associated with that approximate question in the Q&A knowledge base 4 and outputs the associated answer as the target answer to the conversation question. It is worth noting that the approximate question may have more than one associated answer in the Q&A knowledge base 4; when it does, step S3 may take, among the multiple associated answers, the one output most frequently within a preset period (for example, the most recent week) as the target answer to the conversation question.
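The threshold-and-argmax selection used in step S3 (and, symmetrically, in step S5) reduces to a few lines. The sketch below assumes the similarity scores are already computed and that output frequencies over the preset period are available as a simple mapping; both are plain assumptions for illustration.

```python
def pick_approximate(candidates, scores, threshold):
    """Return the candidate with the highest similarity above the threshold, or None."""
    best = max(zip(candidates, scores), key=lambda p: p[1], default=(None, float("-inf")))
    return best[0] if best[1] > threshold else None

def pick_answer(associated_answers, recent_output_counts):
    """When the approximate question has several associated answers, prefer the one
    output most frequently within the preset period (e.g. the most recent week)."""
    return max(associated_answers, key=lambda a: recent_output_counts.get(a, 0))
```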
Step S4: if the candidate question set contains no approximate question of the conversation question, retrieve from the Q&A knowledge base 4, according to the text feature information and through an inverted-index query, a set of candidate answers related to the conversation question, and calculate the topic similarity between the conversation question and each candidate answer in the set.
The candidate answer set contains at least one candidate answer, and because an inverted-index query is used, every candidate answer is related to the conversation question to some degree. This relation can be reflected by the topic similarity: the higher the topic similarity between the conversation question and a candidate answer, the more similar their topics are considered to be, and therefore the more likely that candidate answer is the answer to the conversation question.
Specifically, calculating the topic similarity between the conversation question and each candidate answer in the candidate answer set in step S4 may include:
extracting the topic vector of the conversation question and of each candidate answer in the candidate answer set using a linear discriminant analysis (LDA) model;
calculating the cosine distance between the topic vector of the conversation question and the topic vector of each candidate answer, thereby obtaining the topic similarity between the conversation question and each candidate answer in the candidate answer set.
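The text abbreviates the topic model as "LDA"; for extracting topic vectors this abbreviation is commonly read as Latent Dirichlet Allocation, and the sketch below follows that reading using gensim. That reading and the library choice are my assumptions, as are the corpus, the number of topics and the helper names.

```python
from gensim import corpora, models
from numpy import dot
from numpy.linalg import norm

def topic_vectors(tokenized_docs, num_topics=20):
    """Train a topic model on segmented documents and return a function giving dense topic vectors."""
    dictionary = corpora.Dictionary(tokenized_docs)
    bows = [dictionary.doc2bow(doc) for doc in tokenized_docs]
    lda = models.LdaModel(bows, num_topics=num_topics, id2word=dictionary)

    def vec(doc_tokens):
        bow = dictionary.doc2bow(doc_tokens)
        dense = [0.0] * num_topics
        for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
            dense[topic_id] = prob
        return dense
    return vec

def cosine(u, v):
    return dot(u, v) / (norm(u) * norm(v) + 1e-12)

# vec = topic_vectors(segmented_questions_and_answers)
# similarity = cosine(vec(question_tokens), vec(candidate_answer_tokens))
```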
Step S5: according to preset rules and the topic similarities, judge whether the candidate answer set contains an approximate answer to the conversation question; if it does, output the approximate answer as the target answer to the conversation question.
Specifically, the preset rules may include: judging whether there is a candidate answer whose topic similarity to the conversation question is greater than a third preset threshold; if such a candidate answer exists, it is determined that the candidate answer set contains an approximate answer to the conversation question; if no such candidate answer exists, it is determined that the candidate answer set contains no approximate answer to the conversation question.
If there are candidate answers whose topic similarity to the conversation question is greater than the third preset threshold, such a candidate answer is taken as the approximate answer to the conversation question, and step S5 outputs it as the target answer. It is worth noting that there may be more than one candidate answer in the Q&A knowledge base 4 whose topic similarity to the conversation question is greater than the third preset threshold; when there is, step S5 may take, among those candidate answers, the one output most frequently within a preset period (for example, the most recent week) as the approximate answer to the conversation question.
Step S6: if the candidate answer set contains no approximate answer to the conversation question, iteratively train a seq2seq model to encode and decode the questions and answers in the Q&A knowledge base 4 so as to build a sequence prediction model, feed the conversation question into the sequence prediction model to generate an adaptive answer, and output the adaptive answer as the target answer to the conversation question. The seq2seq model is composed of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encoding-and-decoding training, together with an attention mechanism used to compute the weights of the hidden-layer information in each encoding and decoding pass.
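As a rough picture of the generative fallback, here is a deliberately small PyTorch encoder-decoder: an LSTM encoder, an LSTM decoder and dot-product attention over the encoder states, decoded greedily. The application's bidirectional (forward plus backward LSTM) arrangement and its training loop are not reproduced; the layer sizes, special token IDs and class name are assumptions.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal LSTM encoder-decoder with dot-product attention (greedy decoding sketch)."""
    def __init__(self, vocab_size=5000, embed_dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.encoder = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden * 2, vocab_size)   # [decoder state; attention context]

    def forward(self, src_ids, bos_id=1, eos_id=2, max_len=30):
        enc_out, state = self.encoder(self.embed(src_ids))          # enc_out: (B, S, H)
        token = torch.full((src_ids.size(0), 1), bos_id, dtype=torch.long)
        generated = []
        for _ in range(max_len):
            dec_out, state = self.decoder(self.embed(token), state)            # (B, 1, H)
            attn = torch.softmax(dec_out @ enc_out.transpose(1, 2), dim=-1)    # (B, 1, S)
            context = attn @ enc_out                                           # (B, 1, H)
            logits = self.out(torch.cat([dec_out, context], dim=-1))           # (B, 1, V)
            token = logits.argmax(dim=-1)                                      # greedy next token
            generated.append(token)
            if (token == eos_id).all():
                break
        return torch.cat(generated, dim=1)   # generated token IDs = the adaptive answer
```

A production version would train such a network with teacher forcing on the question-answer pairs of the knowledge base and could replace greedy decoding with beam search; the sketch only shows where the encoded question, the attention weights and the generated adaptive answer come from.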
According to the chat response method provided by this embodiment, after a conversation question is acquired and pre-processed, a set of candidate questions related to the conversation question is retrieved from the Q&A knowledge base 4 through an inverted-index query and the text similarity between the conversation question and each candidate question is calculated; whether the candidate question set contains an approximate question of the conversation question is judged, and if so, the answer associated with that approximate question is looked up in the Q&A knowledge base 4 and output as the target answer. If the candidate question set contains no approximate question, a set of candidate answers related to the conversation question is retrieved from the Q&A knowledge base 4 through an inverted-index query, the topic similarity between the conversation question and each candidate answer is calculated, and whether the candidate answer set contains an approximate answer is judged; if so, the approximate answer is output as the target answer. If the candidate answer set contains no approximate answer, a seq2seq model is iteratively trained to encode and decode the questions and answers in the Q&A knowledge base so as to build a sequence prediction model, the conversation question is fed into the sequence prediction model to generate an adaptive answer, and the adaptive answer is output as the target answer. The chat response method provided by this embodiment can therefore give customers accurate and adaptive feedback on their conversation questions, improving service quality.
Referring to FIG. 4, it is a program module diagram of the chat response program 10 in FIG. 1. In this embodiment, the chat response program 10 is divided into a plurality of modules, which are stored in the memory 11 and executed by the processor 12 to complete the present application. The term "module" as used in this application refers to a series of computer program instruction segments capable of performing a specific function.
The chat response program 10 may be divided into a pre-processing module 110, a first calculation module 120, a question retrieval module 130, a second calculation module 140, an answer retrieval module 150 and an answer prediction module 160.
The pre-processing module 110 is configured to obtain a conversation question input by a customer and pre-process it to obtain text feature information of the conversation question, the text feature information including the part of speech, position and word-class attribution of each term in the conversation question, where the word-class attribution indicates whether a term is a keyword or a named entity.
Specifically, the pre-processing module 110 is configured to perform the following pre-processing on the conversation question:
performing word segmentation on the conversation question to split it into terms, the segmentation methods including dictionary-based forward maximum matching and/or dictionary-based backward maximum matching;
performing part-of-speech analysis on the terms obtained by segmentation and tagging the part of speech of each term, the part-of-speech analysis being implemented by a part-of-speech tagging model trained on a preset large-scale corpus;
performing named entity recognition on the conversation question to identify named entities with specific meanings, the named entities including person names, place names, organizations and proper nouns, the named entity recognition methods including dictionary-and-rule-based methods and statistical-learning-based methods;
extracting keywords from the conversation question according to the terms and the named entities, a keyword being a phrase whose number of characters exceeds a first preset threshold, or a named entity present in a preset dictionary, the preset dictionary including a business-scenario-specific dictionary.
The first calculation module 120 is configured to build an inverted index for the Q&A knowledge base 4, where the Q&A knowledge base contains a plurality of pre-compiled questions and one or more answers associated with each question, to retrieve from the Q&A knowledge base 4, according to the text feature information and through an inverted-index query, a set of candidate questions related to the conversation question, and to calculate the text similarity between the conversation question and each candidate question in the set.
Specifically, the first calculation module 120 is configured to build the inverted index for the Q&A knowledge base 4 in the following way:
performing word segmentation, part-of-speech tagging, keyword extraction, recording of keyword positions and ID assignment for each question and answer in the Q&A knowledge base 4, and assigning an ID to each term obtained by segmenting each question and answer;
sorting the questions and answers in the Q&A knowledge base 4 by their IDs, sorting the terms obtained from the segmentation by their IDs, and placing all question IDs and answer IDs that share the same term ID into the posting list corresponding to that term;
merging all posting lists into the final inverted index.
The first calculation module 120 calculates the text similarity between the conversation question and each candidate question in the candidate question set by:
building a convolutional neural network and training it on all question sentences in the Q&A knowledge base 4, to obtain a convolutional neural network model corresponding to the question sentences in the Q&A knowledge base 4;
inputting the conversation question and each candidate question in the candidate question set into the convolutional neural network model, and obtaining, through convolution with the model's convolution kernels, the feature vector corresponding to the conversation question and to each candidate question;
calculating the cosine distance between the feature vector of the conversation question and the feature vector of each candidate question, thereby obtaining the text similarity between the conversation question and each candidate question in the candidate question set.
The question retrieval module 130 is configured to judge, according to preset rules and the text similarities, whether the candidate question set contains an approximate question of the conversation question, and, if it does, to look up the answer associated with that approximate question in the Q&A knowledge base and output the associated answer as the target answer to the conversation question.
Specifically, the question retrieval module 130 judges whether there is a candidate question whose text similarity to the conversation question is greater than the second preset threshold; if so, it selects, from those candidate questions, the one with the highest text similarity as the approximate question; if no such candidate question exists, it determines that the candidate question set contains no approximate question of the conversation question.
The second calculation module 140 is configured to retrieve from the Q&A knowledge base 4, when the candidate question set contains no approximate question of the conversation question, a set of candidate answers related to the conversation question according to the text feature information and through an inverted-index query, and to calculate the topic similarity between the conversation question and each candidate answer in the set.
The second calculation module 140 calculates the topic similarity between the conversation question and each candidate answer in the candidate answer set by:
extracting the topic vector of the conversation question and of each candidate answer in the candidate answer set using a linear discriminant analysis model;
calculating the cosine distance between the topic vector of the conversation question and the topic vector of each candidate answer, thereby obtaining the topic similarity between the conversation question and each candidate answer in the candidate answer set.
The answer retrieval module 150 is configured to judge, according to preset rules and the topic similarities, whether the candidate answer set contains an approximate answer to the conversation question, and, if it does, to output the approximate answer as the target answer to the conversation question.
Specifically, the answer retrieval module 150 judges whether there is a candidate answer whose topic similarity to the conversation question is greater than the third preset threshold; if so, it selects, from those candidate answers, the one with the highest topic similarity as the approximate answer; if no such candidate answer exists, it determines that the candidate answer set contains no approximate answer to the conversation question.
The answer prediction module 160 is configured to iteratively train, when the candidate answer set contains no approximate answer to the conversation question, a seq2seq model to encode and decode the questions and answers in the Q&A knowledge base 4 so as to build a sequence prediction model, to feed the conversation question into the sequence prediction model to generate an adaptive answer, and to output the adaptive answer as the target answer to the conversation question. The seq2seq model of the answer prediction module 160 is composed of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encoding-and-decoding training, together with an attention mechanism used to compute the weights of the hidden-layer information in each encoding and decoding pass.
In the operating environment shown in FIG. 1 for the preferred embodiment of the electronic device 1, the memory 11 containing the readable storage medium may include an operating system, the chat response program 10 and the Q&A knowledge base 4. When the processor 12 executes the chat response program 10 stored in the memory 11, the following steps are implemented:
a pre-processing step: obtaining a conversation question input by a customer and pre-processing it to obtain text feature information of the conversation question, the text feature information including the part of speech, position and word-class attribution of each term in the conversation question, where the word-class attribution indicates whether a term is a keyword or a named entity;
a first calculation step: building an inverted index for the Q&A knowledge base, which contains a plurality of pre-compiled questions and one or more answers associated with each question, retrieving from the Q&A knowledge base, according to the text feature information and through an inverted-index query, a set of candidate questions related to the conversation question, and calculating the text similarity between the conversation question and each candidate question in the set;
a question retrieval step: judging, according to preset rules and the text similarities, whether the candidate question set contains an approximate question of the conversation question, and, if it does, looking up the answer associated with that approximate question in the Q&A knowledge base and outputting the associated answer as the target answer to the conversation question;
a second calculation step: if the candidate question set contains no approximate question of the conversation question, retrieving from the Q&A knowledge base, according to the text feature information and through an inverted-index query, a set of candidate answers related to the conversation question, and calculating the topic similarity between the conversation question and each candidate answer in the set;
an answer retrieval step: judging, according to preset rules and the topic similarities, whether the candidate answer set contains an approximate answer to the conversation question, and, if it does, outputting the approximate answer as the target answer to the conversation question;
an answer prediction step: if the candidate answer set contains no approximate answer to the conversation question, iteratively training a seq2seq model to encode and decode the questions and answers in the Q&A knowledge base so as to build a sequence prediction model, feeding the conversation question into the sequence prediction model to generate an adaptive answer, and outputting the adaptive answer as the target answer to the conversation question.
The pre-processing of the conversation question includes:
performing word segmentation on the conversation question to split it into terms, the segmentation methods including dictionary-based forward maximum matching and/or dictionary-based backward maximum matching;
performing part-of-speech analysis on the terms obtained by segmentation and tagging the part of speech of each term, the part-of-speech analysis being implemented by a part-of-speech tagging model trained on a preset large-scale corpus;
performing named entity recognition on the conversation question to identify named entities with specific meanings, the named entities including person names, place names, organizations and proper nouns, the named entity recognition methods including dictionary-and-rule-based methods and statistical-learning-based methods;
extracting keywords from the conversation question according to the terms and the named entities, a keyword being a phrase whose number of characters exceeds a first preset threshold, or a named entity present in a preset dictionary, the preset dictionary including a business-scenario-specific dictionary.
Calculating the text similarity between the conversation question and each candidate question in the candidate question set includes:
building a convolutional neural network and training it on all question sentences in the Q&A knowledge base, to obtain a convolutional neural network model corresponding to the question sentences in the Q&A knowledge base;
inputting the conversation question and each candidate question in the candidate question set into the convolutional neural network model, and obtaining, through convolution with the model's convolution kernels, the feature vector corresponding to the conversation question and to each candidate question;
calculating the cosine distance between the feature vector of the conversation question and the feature vector of each candidate question, thereby obtaining the text similarity between the conversation question and each candidate question in the candidate question set.
Calculating the topic similarity between the conversation question and each candidate answer in the candidate answer set includes:
extracting the topic vector of the conversation question and of each candidate answer in the candidate answer set using a linear discriminant analysis model;
calculating the cosine distance between the topic vector of the conversation question and the topic vector of each candidate answer, thereby obtaining the topic similarity between the conversation question and each candidate answer in the candidate answer set.
Judging, according to the preset rules and the text similarities, whether the candidate question set contains an approximate question of the conversation question includes:
judging whether there is a candidate question whose text similarity to the conversation question is greater than a second preset threshold, and, if so, selecting, from those candidate questions, the one with the highest text similarity as the approximate question;
if no candidate question has a text similarity to the conversation question greater than the second preset threshold, determining that the candidate question set contains no approximate question of the conversation question.
Judging, according to the preset rules and the topic similarities, whether the candidate answer set contains an approximate answer to the conversation question includes:
judging whether there is a candidate answer whose topic similarity to the conversation question is greater than a third preset threshold, and, if so, selecting, from those candidate answers, the one with the highest topic similarity as the approximate answer;
if no candidate answer has a topic similarity to the conversation question greater than the third preset threshold, determining that the candidate answer set contains no approximate answer to the conversation question.
Building the inverted index for the Q&A knowledge base includes:
performing word segmentation, part-of-speech tagging, keyword extraction, recording of keyword positions and ID assignment for each question and answer in the Q&A knowledge base, and assigning an ID to each term obtained by segmenting each question and answer;
sorting the questions and answers in the Q&A knowledge base by their IDs, sorting the terms obtained from the segmentation by their IDs, and placing all question IDs and answer IDs that share the same term ID into the posting list corresponding to that term;
merging all posting lists into the final inverted index.
The seq2seq model is composed of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encoding-and-decoding training, together with an attention mechanism used to compute the weights of the hidden-layer information in each encoding and decoding pass.
For the specific principles, refer to the description of the program module diagram of the chat response program 10 in FIG. 4 and of the flowchart of the preferred embodiment of the chat response method in FIG. 3 above.
In addition, an embodiment of the present application further provides a computer-readable storage medium, which may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory and the like. The computer-readable storage medium stores the Q&A knowledge base 4 and the chat response program 10, among others; when the chat response program 10 is executed by the processor 12, the following operations are implemented:
a pre-processing step: obtaining a conversation question input by a customer and pre-processing it to obtain text feature information of the conversation question, the text feature information including the part of speech, position and word-class attribution of each term in the conversation question, where the word-class attribution indicates whether a term is a keyword or a named entity;
a first calculation step: building an inverted index for the Q&A knowledge base, which contains a plurality of pre-compiled questions and one or more answers associated with each question, retrieving from the Q&A knowledge base, according to the text feature information and through an inverted-index query, a set of candidate questions related to the conversation question, and calculating the text similarity between the conversation question and each candidate question in the set;
a question retrieval step: judging, according to preset rules and the text similarities, whether the candidate question set contains an approximate question of the conversation question, and, if it does, looking up the answer associated with that approximate question in the Q&A knowledge base and outputting the associated answer as the target answer to the conversation question;
a second calculation step: if the candidate question set contains no approximate question of the conversation question, retrieving from the Q&A knowledge base, according to the text feature information and through an inverted-index query, a set of candidate answers related to the conversation question, and calculating the topic similarity between the conversation question and each candidate answer in the set;
an answer retrieval step: judging, according to preset rules and the topic similarities, whether the candidate answer set contains an approximate answer to the conversation question, and, if it does, outputting the approximate answer as the target answer to the conversation question;
an answer prediction step: if the candidate answer set contains no approximate answer to the conversation question, iteratively training a seq2seq model to encode and decode the questions and answers in the Q&A knowledge base so as to build a sequence prediction model, feeding the conversation question into the sequence prediction model to generate an adaptive answer, and outputting the adaptive answer as the target answer to the conversation question.
The pre-processing of the conversation question includes:
performing word segmentation on the conversation question to split it into terms, the segmentation methods including dictionary-based forward maximum matching and/or dictionary-based backward maximum matching;
performing part-of-speech analysis on the terms obtained by segmentation and tagging the part of speech of each term, the part-of-speech analysis being implemented by a part-of-speech tagging model trained on a preset large-scale corpus;
performing named entity recognition on the conversation question to identify named entities with specific meanings, the named entities including person names, place names, organizations and proper nouns, the named entity recognition methods including dictionary-and-rule-based methods and statistical-learning-based methods;
extracting keywords from the conversation question according to the terms and the named entities, a keyword being a phrase whose number of characters exceeds a first preset threshold, or a named entity present in a preset dictionary, the preset dictionary including a business-scenario-specific dictionary.
Calculating the text similarity between the conversation question and each candidate question in the candidate question set includes:
building a convolutional neural network and training it on all question sentences in the Q&A knowledge base, to obtain a convolutional neural network model corresponding to the question sentences in the Q&A knowledge base;
inputting the conversation question and each candidate question in the candidate question set into the convolutional neural network model, and obtaining, through convolution with the model's convolution kernels, the feature vector corresponding to the conversation question and to each candidate question;
calculating the cosine distance between the feature vector of the conversation question and the feature vector of each candidate question, thereby obtaining the text similarity between the conversation question and each candidate question in the candidate question set.
Calculating the topic similarity between the conversation question and each candidate answer in the candidate answer set includes:
extracting the topic vector of the conversation question and of each candidate answer in the candidate answer set using a linear discriminant analysis model;
calculating the cosine distance between the topic vector of the conversation question and the topic vector of each candidate answer, thereby obtaining the topic similarity between the conversation question and each candidate answer in the candidate answer set.
Judging, according to the preset rules and the text similarities, whether the candidate question set contains an approximate question of the conversation question includes:
judging whether there is a candidate question whose text similarity to the conversation question is greater than a second preset threshold, and, if so, selecting, from those candidate questions, the one with the highest text similarity as the approximate question;
if no candidate question has a text similarity to the conversation question greater than the second preset threshold, determining that the candidate question set contains no approximate question of the conversation question.
Judging, according to the preset rules and the topic similarities, whether the candidate answer set contains an approximate answer to the conversation question includes:
judging whether there is a candidate answer whose topic similarity to the conversation question is greater than a third preset threshold, and, if so, selecting, from those candidate answers, the one with the highest topic similarity as the approximate answer;
if no candidate answer has a topic similarity to the conversation question greater than the third preset threshold, determining that the candidate answer set contains no approximate answer to the conversation question.
Building the inverted index for the Q&A knowledge base includes:
performing word segmentation, part-of-speech tagging, keyword extraction, recording of keyword positions and ID assignment for each question and answer in the Q&A knowledge base, and assigning an ID to each term obtained by segmenting each question and answer;
sorting the questions and answers in the Q&A knowledge base by their IDs, sorting the terms obtained from the segmentation by their IDs, and placing all question IDs and answer IDs that share the same term ID into the posting list corresponding to that term;
merging all posting lists into the final inverted index.
The seq2seq model is composed of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encoding-and-decoding training, together with an attention mechanism used to compute the weights of the hidden-layer information in each encoding and decoding pass.
The specific implementation of the computer-readable storage medium of the present application is substantially the same as the specific implementations of the chat response method and the electronic device 1 described above, and is not repeated here.
It should be noted that, in this document, the terms "comprise", "include" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, apparatus, article or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, apparatus, article or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, apparatus, article or method that includes that element.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the part of the technical solution of the present application that in essence contributes to the prior art can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to perform the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and are not intended to limit the scope of its patent. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present application.
Claims (20)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810135747.6A CN108491433B (en) | 2018-02-09 | 2018-02-09 | Chat answering method, electronic device and storage medium |
| CN201810135747.6 | 2018-02-09 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019153613A1 true WO2019153613A1 (en) | 2019-08-15 |
Family
ID=63340316
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2018/090643 Ceased WO2019153613A1 (en) | 2018-02-09 | 2018-06-11 | Chat response method, electronic device and storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN108491433B (en) |
| WO (1) | WO2019153613A1 (en) |
Cited By (42)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110502752A (en) * | 2019-08-21 | 2019-11-26 | 北京一链数云科技有限公司 | A kind of text handling method, device, equipment and computer storage medium |
| CN111090721A (en) * | 2019-11-25 | 2020-05-01 | 出门问问(苏州)信息科技有限公司 | Question answering method and device and electronic equipment |
| CN111177339A (en) * | 2019-12-06 | 2020-05-19 | 百度在线网络技术(北京)有限公司 | Dialog generation method and device, electronic equipment and storage medium |
| CN111177336A (en) * | 2019-11-30 | 2020-05-19 | 西安华为技术有限公司 | Method and device for determining response information |
| CN111291170A (en) * | 2020-01-20 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Session recommendation method based on intelligent customer service and related device |
| CN111428019A (en) * | 2020-04-02 | 2020-07-17 | 出门问问信息科技有限公司 | Data processing method and equipment for knowledge base question answering |
| CN111538803A (en) * | 2020-04-20 | 2020-08-14 | 京东方科技集团股份有限公司 | Method, device, equipment and medium for acquiring candidate question text to be matched |
| CN111597321A (en) * | 2020-07-08 | 2020-08-28 | 腾讯科技(深圳)有限公司 | Question answer prediction method and device, storage medium and electronic equipment |
| CN111625635A (en) * | 2020-05-27 | 2020-09-04 | 北京百度网讯科技有限公司 | Question-answer processing method, language model training method, device, equipment and storage medium |
| CN111737401A (en) * | 2020-06-22 | 2020-10-02 | 首都师范大学 | A Keyword Group Prediction Method Based on Seq2set2seq Framework |
| CN111753062A (en) * | 2019-11-06 | 2020-10-09 | 北京京东尚科信息技术有限公司 | A method, apparatus, device and medium for determining a session response scheme |
| CN112184021A (en) * | 2020-09-28 | 2021-01-05 | 中国人民解放军国防科技大学 | Answer quality evaluation method based on similar support set |
| CN112232053A (en) * | 2020-09-16 | 2021-01-15 | 西北大学 | A text similarity calculation system, method, and storage medium based on multi-keyword pair matching |
| CN112330387A (en) * | 2020-09-29 | 2021-02-05 | 重庆锐云科技有限公司 | Virtual broker applied to house-watching software |
| CN112749260A (en) * | 2019-10-31 | 2021-05-04 | 阿里巴巴集团控股有限公司 | Information interaction method, device, equipment and medium |
| CN113076409A (en) * | 2021-04-20 | 2021-07-06 | 上海景吾智能科技有限公司 | Dialogue system and method applied to robot, robot and readable medium |
| CN113127613A (en) * | 2020-01-10 | 2021-07-16 | 北京搜狗科技发展有限公司 | Chat information processing method and device |
| CN113743124A (en) * | 2021-08-25 | 2021-12-03 | 南京星云数字技术有限公司 | Intelligent question-answer exception processing method and device and electronic equipment |
| CN113761986A (en) * | 2020-06-05 | 2021-12-07 | 阿里巴巴集团控股有限公司 | Text acquisition, live broadcast method, device and storage medium |
| CN114328796A (en) * | 2021-08-19 | 2022-04-12 | 腾讯科技(深圳)有限公司 | Question and answer index generation method, question and answer model processing method, device and storage medium |
| CN114443818A (en) * | 2022-01-30 | 2022-05-06 | 天津大学 | Dialogue type knowledge base question-answer implementation method |
| CN114491046A (en) * | 2022-02-14 | 2022-05-13 | 中国工商银行股份有限公司 | Information interaction method based on language model, device and electronic device thereof |
| CN114490957A (en) * | 2020-11-12 | 2022-05-13 | 中移物联网有限公司 | Question answering method, apparatus and computer readable storage medium |
| CN114579729A (en) * | 2022-05-09 | 2022-06-03 | 南京云问网络技术有限公司 | FAQ question-answer matching method and system fusing multi-algorithm model |
| CN114638236A (en) * | 2022-03-30 | 2022-06-17 | 政采云有限公司 | Intelligent question answering method, device, equipment and computer readable storage medium |
| CN114661883A (en) * | 2022-03-31 | 2022-06-24 | 北京金山数字娱乐科技有限公司 | Intelligent question and answer method and device and electronic equipment |
| CN114860898A (en) * | 2022-03-25 | 2022-08-05 | 成都淞幸科技有限责任公司 | A software development knowledge base construction and application method |
| CN115080720A (en) * | 2022-06-29 | 2022-09-20 | 壹沓科技(上海)有限公司 | Text processing method, device, equipment and medium based on RPA and AI |
| CN115129820A (en) * | 2022-07-22 | 2022-09-30 | 宁波牛信网络科技有限公司 | Similarity-based text feedback method and device |
| CN115221316A (en) * | 2022-06-14 | 2022-10-21 | 科大讯飞华南人工智能研究院(广州)有限公司 | Knowledge base processing, model training method, computer equipment and storage medium |
| CN116049376A (en) * | 2023-03-31 | 2023-05-02 | 北京太极信息系统技术有限公司 | Method, device and system for retrieving and replying information and creating knowledge |
| CN116226329A (en) * | 2023-01-04 | 2023-06-06 | 国网河北省电力有限公司信息通信分公司 | Intelligent retrieval method, device and terminal equipment for problems in the power grid field |
| CN116303981A (en) * | 2023-05-23 | 2023-06-23 | 山东森普信息技术有限公司 | Agricultural community knowledge question-answering method, device and storage medium |
| CN116795953A (en) * | 2022-03-08 | 2023-09-22 | 腾讯科技(深圳)有限公司 | Question-answer matching method and device, computer readable storage medium and computer equipment |
| CN116886656A (en) * | 2023-09-06 | 2023-10-13 | 北京小糖科技有限责任公司 | Chat room-oriented dance knowledge pushing method and device |
| CN117332789A (en) * | 2023-12-01 | 2024-01-02 | 诺比侃人工智能科技(成都)股份有限公司 | Semantic analysis method and system for dialogue scene |
| CN118350468A (en) * | 2024-06-14 | 2024-07-16 | 杭州字节方舟科技有限公司 | An AI dialogue method based on natural language processing |
| CN118606574A (en) * | 2024-08-12 | 2024-09-06 | 杭州领信数科信息技术有限公司 | Knowledge answering method, system, electronic device and storage medium based on large model |
| CN119294521A (en) * | 2024-10-14 | 2025-01-10 | 四川开物信息技术有限公司 | Intelligent question-answering system and question-answering method |
| CN119441431A (en) * | 2024-10-25 | 2025-02-14 | 北京房多多信息技术有限公司 | Data processing method, device, electronic device and storage medium |
| CN119621889A (en) * | 2024-11-21 | 2025-03-14 | 之江实验室 | A vertical knowledge question-answering method and device based on a large model |
| CN119719276A (en) * | 2024-11-26 | 2025-03-28 | 陕西优百信息技术有限公司 | Question answering method, device, storage medium and electronic device based on model knowledge base |
Families Citing this family (57)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109299250A (en) * | 2018-09-14 | 2019-02-01 | 广州神马移动信息科技有限公司 | Answer display method, device, storage medium and electronic equipment |
| CN110908663B (en) * | 2018-09-18 | 2024-08-16 | 北京京东尚科信息技术有限公司 | Positioning method and positioning device for business problem |
| US11514915B2 (en) * | 2018-09-27 | 2022-11-29 | Salesforce.Com, Inc. | Global-to-local memory pointer networks for task-oriented dialogue |
| CN109344242B (en) * | 2018-09-28 | 2021-10-01 | 广东工业大学 | A dialogue question answering method, device, equipment and storage medium |
| CN109359182B (en) * | 2018-10-08 | 2020-11-27 | 网宿科技股份有限公司 | A method and device for answering |
| CN109543005A (en) * | 2018-10-12 | 2019-03-29 | 平安科技(深圳)有限公司 | Dialogue state recognition method and device for customer service robot, equipment, and storage medium |
| CN109299242A (en) * | 2018-10-19 | 2019-02-01 | 武汉斗鱼网络科技有限公司 | Session generation method, device, terminal device and storage medium |
| CN111125320A (en) * | 2018-10-31 | 2020-05-08 | 重庆小雨点小额贷款有限公司 | Data processing method, device, server and computer readable storage medium |
| KR102201074B1 (en) * | 2018-10-31 | 2021-01-08 | 서울대학교산학협력단 | Method and system of goal-oriented dialog based on information theory |
| CN111159363A (en) * | 2018-11-06 | 2020-05-15 | 航天信息股份有限公司 | Knowledge base-based question answer determination method and device |
| CN109446314A (en) * | 2018-11-14 | 2019-03-08 | 沈文策 | Customer service question processing method and device |
| CN109492085B (en) * | 2018-11-15 | 2024-05-14 | 平安科技(深圳)有限公司 | Answer determination method, device, terminal and storage medium based on data processing |
| CN109543017B (en) * | 2018-11-21 | 2022-12-13 | 广州语义科技有限公司 | Legal question keyword generation method and system |
| CN109492086B (en) * | 2018-11-26 | 2022-01-21 | 出门问问创新科技有限公司 | Answer output method and device, electronic equipment and storage medium |
| CN109726265A (en) * | 2018-12-13 | 2019-05-07 | 深圳壹账通智能科技有限公司 | Information processing method, device and computer-readable storage medium for assisting chat |
| CN109685462A (en) * | 2018-12-21 | 2019-04-26 | 义橙网络科技(上海)有限公司 | Personnel-post matching method, apparatus, system, equipment and medium |
| CN109766421A (en) * | 2018-12-28 | 2019-05-17 | 上海汇付数据服务有限公司 | Intelligent answering system and method |
| CN109829478B (en) * | 2018-12-29 | 2024-05-07 | 平安科技(深圳)有限公司 | Question classification method and device based on variational autoencoder |
| CN109918560B (en) * | 2019-01-09 | 2024-03-12 | 平安科技(深圳)有限公司 | Question and answer method and device based on search engine |
| CN109885810A (en) * | 2019-01-17 | 2019-06-14 | 平安城市建设科技(深圳)有限公司 | Man-machine question-answering method, apparatus, equipment and storage medium based on semantic parsing |
| CN109829046A (en) * | 2019-01-18 | 2019-05-31 | 青牛智胜(深圳)科技有限公司 | Intelligent agent seat system and method |
| CN111611354B (en) * | 2019-02-26 | 2023-09-29 | 北京嘀嘀无限科技发展有限公司 | Man-machine conversation control method and device, server and readable storage medium |
| US11600389B2 (en) | 2019-03-19 | 2023-03-07 | Boe Technology Group Co., Ltd. | Question generating method and apparatus, inquiring diagnosis system, and computer readable storage medium |
| CN111858859B (en) * | 2019-04-01 | 2024-07-26 | 北京百度网讯科技有限公司 | Automatic question-answering processing method, device, computer equipment and storage medium |
| CN111831132B (en) * | 2019-04-19 | 2024-12-27 | 北京搜狗科技发展有限公司 | Information recommendation method, device and electronic device |
| CN111858863B (en) * | 2019-04-29 | 2023-07-14 | 深圳市优必选科技有限公司 | Reply recommendation method, reply recommendation device and electronic equipment |
| CN110795542B (en) * | 2019-08-28 | 2024-03-15 | 腾讯科技(深圳)有限公司 | Dialogue method, related device and equipment |
| CN110765244B (en) * | 2019-09-18 | 2023-06-06 | 平安科技(深圳)有限公司 | Method, device, computer equipment and storage medium for obtaining answering operation |
| CN110781275B (en) * | 2019-09-18 | 2022-05-10 | 中国电子科技集团公司第二十八研究所 | Multi-feature-based question answerability discrimination method and computer storage medium |
| CN110781284B (en) * | 2019-09-18 | 2024-05-28 | 平安科技(深圳)有限公司 | Knowledge graph-based question and answer method, device and storage medium |
| CN110619038A (en) * | 2019-09-20 | 2019-12-27 | 上海氦豚机器人科技有限公司 | Method, system and electronic equipment for vertically guiding professional consultation |
| CN110737763A (en) * | 2019-10-18 | 2020-01-31 | 成都华律网络服务有限公司 | Chinese intelligent question-answering system and method integrating knowledge graph and deep learning |
| CN111159331B (en) * | 2019-11-14 | 2021-11-23 | 中国科学院深圳先进技术研究院 | Text query method, text query device and computer storage medium |
| CN111339274B (en) * | 2020-02-25 | 2024-01-26 | 网易(杭州)网络有限公司 | Dialogue generation model training method, dialogue generation method and device |
| CN111400413B (en) * | 2020-03-10 | 2023-06-30 | 支付宝(杭州)信息技术有限公司 | Method and system for determining category of knowledge points in knowledge base |
| CN111475628B (en) * | 2020-03-30 | 2023-07-14 | 珠海格力电器股份有限公司 | Session data processing method, apparatus, computer device and storage medium |
| CN111651560B (en) * | 2020-05-29 | 2023-08-29 | 北京百度网讯科技有限公司 | Method and apparatus for configuring problems, electronic device, computer readable medium |
| CN111753052A (en) * | 2020-06-19 | 2020-10-09 | 微软技术许可有限责任公司 | Providing knowledgeable answers to knowledge-intent questions |
| CN111814466B (en) * | 2020-06-24 | 2024-09-13 | 平安科技(深圳)有限公司 | Information extraction method based on machine reading comprehension and related equipment thereof |
| CN111782785B (en) * | 2020-06-30 | 2024-04-19 | 北京百度网讯科技有限公司 | Automatic question and answer method, device, equipment and storage medium |
| CN111858856A (en) * | 2020-07-23 | 2020-10-30 | 海信电子科技(武汉)有限公司 | Multi-turn retrieval-based chat method and display device |
| CN111949787B (en) * | 2020-08-21 | 2023-04-28 | 平安国际智慧城市科技股份有限公司 | Automatic question-answering method, device, equipment and storage medium based on knowledge graph |
| CN112307164A (en) * | 2020-10-15 | 2021-02-02 | 江苏常熟农村商业银行股份有限公司 | Information recommendation method and device, computer equipment and storage medium |
| CN112527985A (en) * | 2020-12-04 | 2021-03-19 | 杭州远传新业科技有限公司 | Unknown problem processing method, device, equipment and medium |
| CN112507078B (en) * | 2020-12-15 | 2022-05-10 | 浙江诺诺网络科技有限公司 | Semantic question and answer method and device, electronic equipment and storage medium |
| CN112559707A (en) * | 2020-12-16 | 2021-03-26 | 四川智仟科技有限公司 | Knowledge-driven customer service question and answer method |
| CN112597291B (en) * | 2020-12-26 | 2024-09-17 | 中国农业银行股份有限公司 | Intelligent question-answering implementation method, device and equipment |
| CN112860863A (en) * | 2021-01-30 | 2021-05-28 | 云知声智能科技股份有限公司 | Machine reading comprehension method and device |
| CN115238046A (en) * | 2021-04-25 | 2022-10-25 | 平安普惠企业管理有限公司 | User intention identification method and device, electronic equipment and storage medium |
| WO2022226879A1 (en) * | 2021-04-29 | 2022-11-03 | 京东方科技集团股份有限公司 | Question and answer processing method and apparatus, electronic device, and computer-readable storage medium |
| CN114328841A (en) * | 2021-07-13 | 2022-04-12 | 北京金山数字娱乐科技有限公司 | Question-answer model training method and device, question-answer method and device |
| CN114416962B (en) * | 2022-01-11 | 2024-10-18 | 平安科技(深圳)有限公司 | Prediction method, prediction device, electronic equipment and storage medium for answers to questions |
| CN114398909B (en) * | 2022-01-18 | 2025-09-05 | 中国平安人寿保险股份有限公司 | Question generation method, device, equipment and storage medium for dialogue training |
| CN116414959A (en) * | 2023-02-23 | 2023-07-11 | 厦门黑镜科技有限公司 | Digital human interaction control method, device, electronic device and storage medium |
| CN116955579B (en) * | 2023-09-21 | 2023-12-29 | 武汉轻度科技有限公司 | Chat reply generation method and device based on keyword knowledge retrieval |
| CN116992005B (en) * | 2023-09-25 | 2023-12-01 | 语仓科技(北京)有限公司 | Intelligent dialogue method, system and equipment based on large model and local knowledge base |
| CN119988552A (en) * | 2025-01-21 | 2025-05-13 | 青岛市市场监管发展服务中心(青岛市市场监管应急处置中心、青岛市消费者权益保护中心) | A market supervision public consultation method and system based on pre-trained large language model |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20160371276A1 (en) * | 2015-06-19 | 2016-12-22 | Microsoft Technology Licensing, Llc | Answer scheme for information request |
2018
- 2018-02-09 CN CN201810135747.6A patent/CN108491433B/en active Active
- 2018-06-11 WO PCT/CN2018/090643 patent/WO2019153613A1/en not_active Ceased
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102866990A (en) * | 2012-08-20 | 2013-01-09 | 北京搜狗信息服务有限公司 | Thematic conversation method and device |
| CN105630917A (en) * | 2015-12-22 | 2016-06-01 | 成都小多科技有限公司 | Intelligent answering method and intelligent answering device |
| CN107463699A (en) * | 2017-08-15 | 2017-12-12 | 济南浪潮高新科技投资发展有限公司 | Method for implementing a question-answering robot based on the seq2seq model |
| CN107609101A (en) * | 2017-09-11 | 2018-01-19 | 远光软件股份有限公司 | Intelligent interactive method, equipment and storage medium |
Cited By (54)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110502752A (en) * | 2019-08-21 | 2019-11-26 | 北京一链数云科技有限公司 | Text processing method, device, equipment and computer storage medium |
| CN112749260A (en) * | 2019-10-31 | 2021-05-04 | 阿里巴巴集团控股有限公司 | Information interaction method, device, equipment and medium |
| CN111753062A (en) * | 2019-11-06 | 2020-10-09 | 北京京东尚科信息技术有限公司 | A method, apparatus, device and medium for determining a session response scheme |
| CN111090721B (en) * | 2019-11-25 | 2023-09-12 | 出门问问(苏州)信息科技有限公司 | Question answering method and device and electronic equipment |
| CN111090721A (en) * | 2019-11-25 | 2020-05-01 | 出门问问(苏州)信息科技有限公司 | Question answering method and device and electronic equipment |
| CN111177336A (en) * | 2019-11-30 | 2020-05-19 | 西安华为技术有限公司 | Method and device for determining response information |
| CN111177336B (en) * | 2019-11-30 | 2023-11-10 | 西安华为技术有限公司 | Method and device for determining response information |
| CN111177339A (en) * | 2019-12-06 | 2020-05-19 | 百度在线网络技术(北京)有限公司 | Dialog generation method and device, electronic equipment and storage medium |
| CN113127613B (en) * | 2020-01-10 | 2024-01-09 | 北京搜狗科技发展有限公司 | Chat information processing method and device |
| CN113127613A (en) * | 2020-01-10 | 2021-07-16 | 北京搜狗科技发展有限公司 | Chat information processing method and device |
| CN111291170B (en) * | 2020-01-20 | 2023-09-19 | 腾讯科技(深圳)有限公司 | Session recommendation method and related device based on intelligent customer service |
| CN111291170A (en) * | 2020-01-20 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Session recommendation method based on intelligent customer service and related device |
| CN111428019A (en) * | 2020-04-02 | 2020-07-17 | 出门问问信息科技有限公司 | Data processing method and equipment for knowledge base question answering |
| CN111538803A (en) * | 2020-04-20 | 2020-08-14 | 京东方科技集团股份有限公司 | Method, device, equipment and medium for acquiring candidate question text to be matched |
| CN111625635A (en) * | 2020-05-27 | 2020-09-04 | 北京百度网讯科技有限公司 | Question-answer processing method, language model training method, device, equipment and storage medium |
| CN111625635B (en) * | 2020-05-27 | 2023-09-29 | 北京百度网讯科技有限公司 | Question-answering processing method, device, equipment and storage medium |
| CN113761986A (en) * | 2020-06-05 | 2021-12-07 | 阿里巴巴集团控股有限公司 | Text acquisition, live broadcast method, device and storage medium |
| CN111737401A (en) * | 2020-06-22 | 2020-10-02 | 首都师范大学 | A Keyword Group Prediction Method Based on Seq2set2seq Framework |
| CN111597321B (en) * | 2020-07-08 | 2024-06-11 | 腾讯科技(深圳)有限公司 | Prediction method and device of answers to questions, storage medium and electronic equipment |
| CN111597321A (en) * | 2020-07-08 | 2020-08-28 | 腾讯科技(深圳)有限公司 | Question answer prediction method and device, storage medium and electronic equipment |
| CN112232053A (en) * | 2020-09-16 | 2021-01-15 | 西北大学 | A text similarity calculation system, method, and storage medium based on multi-keyword pair matching |
| CN112184021B (en) * | 2020-09-28 | 2023-09-05 | 中国人民解放军国防科技大学 | Answer quality assessment method based on similar support set |
| CN112184021A (en) * | 2020-09-28 | 2021-01-05 | 中国人民解放军国防科技大学 | Answer quality evaluation method based on similar support set |
| CN112330387A (en) * | 2020-09-29 | 2021-02-05 | 重庆锐云科技有限公司 | Virtual broker applied to house-viewing software |
| CN112330387B (en) * | 2020-09-29 | 2023-07-18 | 重庆锐云科技有限公司 | Virtual broker applied to house-viewing software |
| CN114490957A (en) * | 2020-11-12 | 2022-05-13 | 中移物联网有限公司 | Question answering method, apparatus and computer readable storage medium |
| CN113076409A (en) * | 2021-04-20 | 2021-07-06 | 上海景吾智能科技有限公司 | Dialogue system and method applied to robot, robot and readable medium |
| CN114328796A (en) * | 2021-08-19 | 2022-04-12 | 腾讯科技(深圳)有限公司 | Question and answer index generation method, question and answer model processing method, device and storage medium |
| CN113743124B (en) * | 2021-08-25 | 2024-03-29 | 南京星云数字技术有限公司 | Intelligent question-answering exception processing method and device and electronic equipment |
| CN113743124A (en) * | 2021-08-25 | 2021-12-03 | 南京星云数字技术有限公司 | Intelligent question-answer exception processing method and device and electronic equipment |
| CN114443818A (en) * | 2022-01-30 | 2022-05-06 | 天津大学 | Dialogue type knowledge base question-answer implementation method |
| CN114491046A (en) * | 2022-02-14 | 2022-05-13 | 中国工商银行股份有限公司 | Language model-based information interaction method, device, and electronic device |
| CN116795953A (en) * | 2022-03-08 | 2023-09-22 | 腾讯科技(深圳)有限公司 | Question-answer matching method and device, computer readable storage medium and computer equipment |
| CN114860898A (en) * | 2022-03-25 | 2022-08-05 | 成都淞幸科技有限责任公司 | A software development knowledge base construction and application method |
| CN114638236A (en) * | 2022-03-30 | 2022-06-17 | 政采云有限公司 | Intelligent question answering method, device, equipment and computer readable storage medium |
| CN114661883A (en) * | 2022-03-31 | 2022-06-24 | 北京金山数字娱乐科技有限公司 | Intelligent question and answer method and device and electronic equipment |
| CN114579729A (en) * | 2022-05-09 | 2022-06-03 | 南京云问网络技术有限公司 | FAQ question-answer matching method and system fusing multi-algorithm model |
| CN115221316A (en) * | 2022-06-14 | 2022-10-21 | 科大讯飞华南人工智能研究院(广州)有限公司 | Knowledge base processing, model training method, computer equipment and storage medium |
| CN115080720A (en) * | 2022-06-29 | 2022-09-20 | 壹沓科技(上海)有限公司 | Text processing method, device, equipment and medium based on RPA and AI |
| CN115129820A (en) * | 2022-07-22 | 2022-09-30 | 宁波牛信网络科技有限公司 | Similarity-based text feedback method and device |
| CN116226329A (en) * | 2023-01-04 | 2023-06-06 | 国网河北省电力有限公司信息通信分公司 | Intelligent retrieval method, device and terminal equipment for problems in the power grid field |
| CN116049376A (en) * | 2023-03-31 | 2023-05-02 | 北京太极信息系统技术有限公司 | Method, device and system for retrieving and replying information and creating knowledge |
| CN116303981A (en) * | 2023-05-23 | 2023-06-23 | 山东森普信息技术有限公司 | Agricultural community knowledge question-answering method, device and storage medium |
| CN116303981B (en) * | 2023-05-23 | 2023-08-01 | 山东森普信息技术有限公司 | Agricultural community knowledge question-answering method, device and storage medium |
| CN116886656B (en) * | 2023-09-06 | 2023-12-08 | 北京小糖科技有限责任公司 | Chat room-oriented dance knowledge pushing method and device |
| CN116886656A (en) * | 2023-09-06 | 2023-10-13 | 北京小糖科技有限责任公司 | Chat room-oriented dance knowledge pushing method and device |
| CN117332789A (en) * | 2023-12-01 | 2024-01-02 | 诺比侃人工智能科技(成都)股份有限公司 | Semantic analysis method and system for dialogue scene |
| CN118350468A (en) * | 2024-06-14 | 2024-07-16 | 杭州字节方舟科技有限公司 | An AI dialogue method based on natural language processing |
| CN118606574A (en) * | 2024-08-12 | 2024-09-06 | 杭州领信数科信息技术有限公司 | Knowledge answering method, system, electronic device and storage medium based on large model |
| CN119294521A (en) * | 2024-10-14 | 2025-01-10 | 四川开物信息技术有限公司 | Intelligent question-answering system and question-answering method |
| CN119441431A (en) * | 2024-10-25 | 2025-02-14 | 北京房多多信息技术有限公司 | Data processing method, device, electronic device and storage medium |
| CN119621889A (en) * | 2024-11-21 | 2025-03-14 | 之江实验室 | A vertical knowledge question-answering method and device based on a large model |
| CN119719276A (en) * | 2024-11-26 | 2025-03-28 | 陕西优百信息技术有限公司 | Question answering method, device, storage medium and electronic device based on model knowledge base |
| CN119719276B (en) * | 2024-11-26 | 2025-09-30 | 陕西优百信息技术有限公司 | Question and answer method and device based on model knowledge base, storage medium and electronic equipment |
Also Published As
| Publication number | Publication date |
|---|---|
| CN108491433B (en) | 2022-05-03 |
| CN108491433A (en) | 2018-09-04 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN108491433B (en) | Chat answering method, electronic device and storage medium | |
| US12470503B2 (en) | Customized message suggestion with user embedding vectors | |
| US11334635B2 (en) | Domain specific natural language understanding of customer intent in self-help | |
| CN109871446B (en) | Rejection method, electronic device and storage medium in intent recognition | |
| US10657332B2 (en) | Language-agnostic understanding | |
| US8073877B2 (en) | Scalable semi-structured named entity detection | |
| WO2019153607A1 (en) | Intelligent response method, electronic device and storage medium | |
| US12131827B2 (en) | Knowledge graph-based question answering method, computer device, and medium | |
| US10192545B2 (en) | Language modeling based on spoken and unspeakable corpuses | |
| US20200019609A1 (en) | Suggesting a response to a message by selecting a template using a neural network | |
| CN111428010B (en) | Man-machine intelligent question-answering method and device | |
| US10289957B2 (en) | Method and system for entity linking | |
| WO2019153612A1 (en) | Question and answer data processing method, electronic device and storage medium | |
| US11030394B1 (en) | Neural models for keyphrase extraction | |
| WO2020233131A1 (en) | Question-and-answer processing method and apparatus, computer device and storage medium | |
| CN104471568A (en) | Learning-Based Processing of Natural Language Problems | |
| CN109299235B (en) | Knowledge base searching method, device and computer readable storage medium | |
| CN113505293B (en) | Information pushing method and device, electronic equipment and storage medium | |
| CN107885717B (en) | Keyword extraction method and device | |
| CN112287069A (en) | Information retrieval method and device based on voice semantics and computer equipment | |
| CN110134777B (en) | Question duplication eliminating method and device, electronic equipment and computer readable storage medium | |
| CN111783424A (en) | Text clause dividing method and device | |
| CN108268450B (en) | Method and apparatus for generating information | |
| US20200272696A1 (en) | Finding of asymmetric relation between words | |
| CN113127621A (en) | Dialogue module pushing method, device, equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 30.09.2020) |
|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18905843 Country of ref document: EP Kind code of ref document: A1 |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 18905843 Country of ref document: EP Kind code of ref document: A1 |