
WO2019153613A1 - Chat response method, electronic device and storage medium - Google Patents

Chat response method, electronic device and storage medium

Info

Publication number
WO2019153613A1
Authority
WO
WIPO (PCT)
Prior art keywords
answer
question
candidate
conversation
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2018/090643
Other languages
French (fr)
Chinese (zh)
Inventor
于凤英 (Yu Fengying)
王健宗 (Wang Jianzong)
肖京 (Xiao Jing)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Publication of WO2019153613A1

Classifications

    • G06Q30/016 After-sales (Commerce; Customer relationship services; Providing customer assistance, e.g. via helpdesk)
    • G06F16/901 Indexing; Data structures therefor; Storage structures (Information retrieval; Database structures therefor)
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/04 Inference or reasoning models

Definitions

  • the present application relates to the field of computer technologies, and in particular, to a chat response method, an electronic device, and a storage medium.
  • AI (Artificial Intelligence) is gradually changing our way of life; intelligent question answering is one example.
  • When a customer consults online via text or voice, an online intelligent customer service agent can answer the customer automatically.
  • Intelligent question answering can effectively reduce customer waiting and improve service quality, and therefore has very broad prospects.
  • However, even in a specific service domain, the online consultation process will contain some pure chit-chat content.
  • If the system cannot respond quickly, accurately, and adaptively to the chat content input by the customer, the service quality of the intelligent customer service is reduced and the customer is not given a humanized, high-quality experience.
  • The present application provides a chat response method, the method comprising: a pre-processing step: acquiring a session question input by a client, pre-processing the session question, and obtaining text feature information of the session question, where the text feature information includes the part of speech, the position, and the word-class attribution of each term in the session question, the word-class attribution indicating whether the term belongs to a keyword or a named entity; a first calculation step: constructing an inverted index for a question-and-answer (Q&A) knowledge base that includes a plurality of pre-arranged questions and one or more answers associated with each question, querying, according to the text feature information and by means of the inverted index, a candidate question set related to the session question from the Q&A knowledge base, and calculating the text similarity between the session question and each candidate question in the candidate question set; and a question retrieval step: determining, according to a preset rule and the text similarities, whether an approximate question of the session question exists in the candidate question set, and if so, looking up the answer associated with the approximate question in the Q&A knowledge base and outputting the associated answer as the target answer of the session question.
  • The present application further provides an electronic device including a memory and a processor, where the memory stores a chat response program that, when executed by the processor, implements the same steps: the pre-processing step of obtaining the session question input by the client and extracting its text feature information (the part of speech, position, and word-class attribution of each term, the attribution being to a keyword or a named entity); the first calculation step of constructing the inverted index over the Q&A knowledge base, retrieving the candidate question set, and computing the text similarity between the session question and each candidate question; and the question retrieval step of deciding, according to a preset rule and the text similarities, whether an approximate question exists and, if so, outputting its associated answer as the target answer.
  • The present application further provides a computer-readable storage medium that includes a chat response program; when the chat response program is executed by a processor, it implements any of the steps of the chat response method described above.
  • With the chat response method, electronic device, and storage medium proposed by the present application, after the session question is acquired and pre-processed, a candidate question set related to the session question is queried from the Q&A knowledge base by means of the inverted index, the text similarity between the session question and each candidate question is calculated, and it is determined whether an approximate question of the session question exists in the candidate question set; if so, the answer associated with the approximate question is looked up in the Q&A knowledge base and output as the target answer of the session question.
  • If no approximate question exists in the candidate question set, a candidate answer set related to the session question is queried from the Q&A knowledge base by means of the inverted index, the topic similarity between the session question and each candidate answer is calculated, and it is determined whether an approximate answer to the session question exists in the candidate answer set; if so, the approximate answer is output as the target answer.
  • If no approximate answer exists in the candidate answer set either, each question and answer in the Q&A knowledge base is used for iterative encode-decode training of a seq2seq model, thereby constructing a sequence prediction model; the session question is input to the sequence prediction model to generate an adaptive answer, which is output as the target answer. In this way, accurate and adaptive feedback on the session question can be provided to the customer, improving service quality.
  • FIG. 1 is a schematic diagram of an operating environment of a preferred embodiment of an electronic device of the present application
  • FIG. 2 is a schematic diagram of interaction between an electronic device and a client according to a preferred embodiment of the present application
  • FIG. 3 is a flow chart of a preferred embodiment of a chat response method of the present application.
  • FIG. 4 is a program block diagram of the chat response program of FIG. 1.
  • embodiments of the present application can be implemented as a method, apparatus, device, system, or computer program product. Accordingly, the application can be embodied in a complete hardware, complete software (including firmware, resident software, microcode, etc.), or a combination of hardware and software.
  • In view of this, a chat response method, an electronic device, and a storage medium are proposed.
  • FIG. 1 is a schematic diagram of an operating environment of a preferred embodiment of an electronic device of the present application.
  • the electronic device 1 may be a terminal device having a storage and computing function such as a server, a portable computer, or a desktop computer.
  • the electronic device 1 includes a memory 11, a processor 12, a network interface 13, and a communication bus 14.
  • the network interface 13 can optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the communication bus 14 is used to implement connection communication between the above components.
  • the memory 11 includes at least one type of readable storage medium.
  • the at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card type memory, or the like.
  • the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1.
  • The readable storage medium may also be an external memory of the electronic device 1, such as a plug-in hard disk, a smart memory card (SMC), a Secure Digital (SD) card, or a flash card equipped on the electronic device 1.
  • the readable storage medium of the memory 11 is generally used to store the chat response program 10, the Q&A knowledge base 4, and the like installed in the electronic device 1.
  • the memory 11 can also be used to temporarily store data that has been output or is about to be output.
  • The processor 12, in some embodiments, may be a Central Processing Unit (CPU), a microprocessor, or another data processing chip for running program code or processing data stored in the memory 11, for example executing the chat response program 10.
  • FIG. 1 shows only the electronic device 1 having the components 11-14 and the chat response program 10, but it should be understood that not all illustrated components may be implemented, and more or fewer components may be implemented instead.
  • the electronic device 1 may further include a user interface
  • The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with a voice recognition function, and a voice output device such as a speaker or headphones.
  • the user interface may also include a standard wired interface and a wireless interface.
  • Optionally, the electronic device 1 may further include a display, which may also be referred to as a display screen or a display unit.
  • The display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, or an Organic Light-Emitting Diode (OLED) display.
  • the display is used to display information processed in the electronic device 1 and a user interface for displaying visualizations.
  • the electronic device 1 further comprises a touch sensor.
  • the area provided by the touch sensor for the user to perform a touch operation is referred to as a touch area.
  • the touch sensor described herein may be a resistive touch sensor, a capacitive touch sensor, or the like.
  • the touch sensor includes not only a contact type touch sensor but also a proximity type touch sensor or the like.
  • the touch sensor may be a single sensor or a plurality of sensors arranged, for example, in an array. The user can initiate the chat response program 10 by touching the touch area.
  • the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor.
  • In some embodiments, the display is stacked with the touch sensor to form a touch display; the device detects user-triggered touch operations through the touch display.
  • the electronic device 1 may further include a radio frequency (RF) circuit, a sensor, an audio circuit, and the like, and details are not described herein.
  • Referring to FIG. 2, there is shown a schematic diagram of the interaction between the electronic device 1 and the client 2 according to a preferred embodiment of the present application.
  • the chat response program 10 runs in the electronic device 1.
  • the preferred embodiment of the electronic device 1 is a server.
  • the electronic device 1 is communicatively coupled to the client 2 via a network 3.
  • the client 2 can run in various types of terminal devices, such as smart phones, portable computers, and the like.
  • The client 2 can input a session question to the chat response program 10; the session question can be a question about a specific domain or pure chat content.
  • The chat response program 10 applies the chat response method, determines appropriate response content according to the session question, and feeds the response content back to the client 2.
  • Referring to FIG. 3, it is a flowchart of a preferred embodiment of the chat response method of the present application. When the processor 12 of the electronic device 1 executes the chat response program 10 stored in the memory 11, the following steps of the chat response method are implemented:
  • Step S1: obtain a session question input by the client, pre-process the session question, and obtain text feature information of the session question, where the text feature information includes the part of speech, the position, and the word-class attribution of each term in the session question.
  • The word-class attribution indicates whether the term belongs to a keyword or a named entity.
  • The session question can be, for example, a question about a specific domain, such as "how long is the warranty period", or it can be chat content, such as "the weather is very good today".
  • In some embodiments, step S1 may first perform some pre-processing on the session question.
  • the pre-processing performed in step S1 may include the following processing:
  • For example, the session question is "how long is the warranty period", and the terms obtained after word segmentation are "warranty period", "is", "how", and "long".
  • the method of word segmentation includes performing a forward maximum match based on a dictionary and/or performing a reverse maximum match based on a dictionary;
  • Part-of-speech analysis is performed on each term obtained by word segmentation, and the part of speech of each term is tagged.
  • For example, the result of part-of-speech tagging according to the preset rules is "warranty period/noun", "is/verb", "how/adverb", "long/adjective"; the part-of-speech analysis is realized by a part-of-speech tagging model trained on a preset large-scale corpus;
  • Named entity recognition is performed on the session question to identify named entities with specific meanings; named entities include person names, place names, organizations, and proper nouns, and the methods for recognizing named entities include dictionary-based and rule-based methods and methods based on statistical learning;
  • the preset dictionary includes a business scenario-specific dictionary.
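  • As an illustration of the dictionary-based segmentation just described, the following is a minimal Python sketch of forward maximum matching (reverse maximum matching runs the same loop from the end of the string); the dictionary contents and the maximum word length are assumptions for the example, not details taken from the patent.

```python
# Minimal sketch of dictionary-based forward maximum matching (illustrative
# assumption of how the segmentation above could be realized).
def forward_max_match(text, dictionary, max_len=4):
    """Greedily match the longest dictionary entry starting at each position."""
    terms = []
    i = 0
    while i < len(text):
        matched = text[i]  # fallback: a single character
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                matched = candidate
                break
        terms.append(matched)
        i += len(matched)
    return terms

# Illustrative use on the example question from the text.
print(forward_max_match("保修期是多长", {"保修期", "是", "多", "长"}, max_len=3))
# -> ['保修期', '是', '多', '长']
```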
  • Step S2: construct an inverted index for the Q&A knowledge base 4, which includes a plurality of pre-arranged questions and one or more answers associated with each question; according to the text feature information, query a candidate question set related to the session question from the Q&A knowledge base 4 by means of the inverted index, and calculate the text similarity between the session question and each candidate question in the candidate question set.
  • In some embodiments, constructing the inverted index for the Q&A knowledge base 4 includes:
  • Each question and answer in the Q&A knowledge base 4 is subjected to word segmentation, part-of-speech tagging, keyword extraction, and recording of the positions where keywords occur; an ID number is assigned to each question and answer, and an ID number is assigned to each term obtained by segmenting each question and answer;
  • Each question and answer in the Q&A knowledge base 4 is sorted according to its ID number, each term obtained after segmentation is likewise sorted according to its ID number, and all question IDs and answer IDs that contain the same term are placed in the inverted record table corresponding to that term (see the sketch below);
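  • A minimal sketch of the inverted-index construction just described: every question and answer gets an ID, is segmented into terms, and each term maps to the set of question/answer IDs that contain it. The helper names and the toy knowledge base are assumptions for illustration only.

```python
from collections import defaultdict

def build_inverted_index(qa_knowledge_base, segment):
    """qa_knowledge_base: list of (question, [answers]) pairs.
    segment: any word-segmentation callable (e.g. the sketch above)."""
    inverted = defaultdict(set)   # term -> inverted record table of entry IDs
    entries = {}                  # entry ID -> original question/answer text
    next_id = 0
    for question, answers in qa_knowledge_base:
        for text in [question, *answers]:
            entry_id, next_id = next_id, next_id + 1
            entries[entry_id] = text
            for term in segment(text):
                # Keyword positions could be stored alongside the IDs here,
                # as the keyword-occurrence-location step above suggests.
                inverted[term].add(entry_id)
    return inverted, entries

# Illustrative usage with a whitespace segmenter and a one-entry knowledge base.
kb = [("how long is the warranty period", ["the warranty period is one year"])]
index, entries = build_inverted_index(kb, segment=str.split)
candidate_ids = index["warranty"]   # entries that mention "warranty"
```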
  • The candidate question set includes at least one candidate question, and because the inverted index is used for the query, each candidate question has a certain degree of association with the session question.
  • The association between each candidate question and the session question may be reflected by the text similarity: the higher the text similarity between the session question and a candidate question, the more similar the session question is considered to be to that candidate question.
  • In some embodiments, the method for calculating the text similarity between the session question and each candidate question in the candidate question set in step S2 may include:
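  • The concrete similarity formula is not reproduced in this excerpt. Purely as an assumed illustration (not the patent's stated method), a keyword-overlap measure such as Jaccard similarity over the segmented terms could serve as the text similarity:

```python
# Illustrative text-similarity placeholder only; the excerpt above does not
# specify the actual formula used in step S2.
def text_similarity(question_terms, candidate_terms):
    """Jaccard overlap between the term sets of two sentences."""
    q, c = set(question_terms), set(candidate_terms)
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)

print(text_similarity(["warranty", "period", "how", "long"],
                      ["warranty", "period", "one", "year"]))  # ~0.33
```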
  • Step S3: determine, according to a preset rule and the text similarities, whether an approximate question of the session question exists in the candidate question set; if an approximate question exists, look up the answer associated with the approximate question in the Q&A knowledge base and output the associated answer as the target answer of the session question.
  • The preset rule may include: determining whether any candidate question has a text similarity with the session question greater than a second preset threshold; if such a candidate question exists, an approximate question of the session question exists in the candidate question set; if no candidate question exceeds the second preset threshold, it is determined that no approximate question exists in the candidate question set.
  • If there are candidate questions whose text similarity with the session question is greater than the second preset threshold, step S3 selects the candidate question with the maximum text similarity as the approximate question, looks up the answer associated with the approximate question in the Q&A knowledge base 4, and outputs the associated answer as the target answer of the session question.
  • The approximate question may also have more than one associated answer in the Q&A knowledge base 4.
  • In that case, step S3 may select, from the plurality of associated answers, the one output with the highest frequency within a preset time period (for example, the most recent week) and output it as the target answer of the session question (see the sketch below).
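  • A compact sketch of the step-S3 retrieval rule described above; the threshold value, the dictionary shapes, and the answer-frequency lookup are assumed names for illustration, not part of the patent.

```python
# Assumed data shapes: `candidates` maps candidate question -> text similarity,
# `kb_answers` maps question -> list of associated answers, and
# `answer_frequency` counts how often an answer was output in the preset
# window (e.g. the most recent week). The threshold value is illustrative.
def retrieve_by_question(candidates, kb_answers, answer_frequency,
                         second_threshold=0.8):
    """Return the target answer if an approximate question exists, else None."""
    above = {q: s for q, s in candidates.items() if s > second_threshold}
    if not above:
        return None                                  # fall through to step S4
    approx_question = max(above, key=above.get)      # highest text similarity
    answers = kb_answers[approx_question]
    # If several answers are associated, prefer the most frequently used one.
    return max(answers, key=lambda a: answer_frequency.get(a, 0))
```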
  • Step S4: if no approximate question of the session question exists in the candidate question set, query, according to the text feature information and by means of the inverted index, a candidate answer set related to the session question from the Q&A knowledge base 4, and calculate the topic similarity between the session question and each candidate answer in the candidate answer set.
  • The candidate answer set includes at least one candidate answer, and because the inverted index is used for the query, each candidate answer has a certain degree of association with the session question.
  • The association between each candidate answer and the session question may be reflected by the topic similarity: the higher the topic similarity between the session question and a candidate answer, the more similar their topics are considered to be, and the more likely the candidate answer is to be an answer to the session question.
  • In some embodiments, the calculation in step S4 of the topic similarity between the session question and each candidate answer in the candidate answer set may include:
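  • The topic-similarity computation itself is likewise not reproduced in this excerpt. As an assumption only, one could represent the session question and each candidate answer as topic-probability vectors (for example from an LDA-style topic model over the knowledge base) and compare them with cosine similarity:

```python
# Illustrative topic-similarity placeholder; the excerpt above does not give
# the actual computation used in step S4. Topic vectors are assumed to come
# from some topic model trained on the Q&A knowledge base.
import math

def topic_similarity(question_topics, answer_topics):
    """Cosine similarity between two topic-probability vectors."""
    dot = sum(a * b for a, b in zip(question_topics, answer_topics))
    norm_q = math.sqrt(sum(a * a for a in question_topics))
    norm_a = math.sqrt(sum(b * b for b in answer_topics))
    if norm_q == 0.0 or norm_a == 0.0:
        return 0.0
    return dot / (norm_q * norm_a)

print(topic_similarity([0.7, 0.2, 0.1], [0.6, 0.3, 0.1]))  # ~0.98
```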
  • Step S5: determine, according to a preset rule and the topic similarities, whether an approximate answer to the session question exists in the candidate answer set; if an approximate answer exists, output the approximate answer as the target answer of the session question.
  • The preset rule may include: determining whether any candidate answer has a topic similarity with the session question greater than a third preset threshold; if such a candidate answer exists, an approximate answer to the session question exists in the candidate answer set; if no candidate answer exceeds the third preset threshold, it is determined that no approximate answer exists in the candidate answer set.
  • If a candidate answer whose topic similarity with the session question is greater than the third preset threshold exists, that candidate answer is taken as the approximate answer, and step S5 outputs the approximate answer as the target answer of the session question.
  • There may also be more than one candidate answer whose topic similarity with the session question is greater than the third preset threshold.
  • In that case, step S5 may take, as the approximate answer, the candidate answer that was output most frequently within a preset time period (for example, the most recent week).
  • Step S6: if no approximate answer to the session question exists in the candidate answer set, perform iterative encode-decode training on each question and answer in the Q&A knowledge base 4 using a seq2seq model, thereby constructing a sequence prediction model.
  • The session question is input to the sequence prediction model to generate an adaptive answer, and the adaptive answer is output as the target answer of the session question.
  • The seq2seq model consists of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encode-decode training, together with an attention mechanism for weighting the hidden-layer information at each encoding and decoding step (a sketch follows).
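  • A condensed PyTorch sketch of the kind of encoder-decoder described here: an LSTM encoder over the question, an LSTM decoder over the answer, and an attention weighting over the encoder's hidden states at every decoding step. All layer sizes, the vocabulary size, and the dummy batch are assumptions, the forward/backward LSTM pair is expressed via `bidirectional=True`, and this is a sketch rather than the patent's exact model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Seq2SeqWithAttention(nn.Module):
    """Illustrative seq2seq model: LSTM encoder + LSTM decoder + dot-product
    attention over encoder hidden states. Sizes are assumptions."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Forward + backward LSTM over the input question.
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        self.decoder = nn.LSTM(embed_dim, 2 * hidden_dim, batch_first=True)
        self.out = nn.Linear(4 * hidden_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        enc_out, (h, c) = self.encoder(self.embed(src_ids))      # (B, S, 2H)
        # Merge the forward/backward final states to seed the decoder.
        h0 = torch.cat([h[0], h[1]], dim=-1).unsqueeze(0)         # (1, B, 2H)
        c0 = torch.cat([c[0], c[1]], dim=-1).unsqueeze(0)
        dec_out, _ = self.decoder(self.embed(tgt_ids), (h0, c0))  # (B, T, 2H)
        # Dot-product attention: weights over encoder states per decoder step.
        scores = torch.bmm(dec_out, enc_out.transpose(1, 2))      # (B, T, S)
        weights = F.softmax(scores, dim=-1)
        context = torch.bmm(weights, enc_out)                     # (B, T, 2H)
        return self.out(torch.cat([dec_out, context], dim=-1))    # (B, T, V)

# Illustrative training step over (question, answer) ID sequences; the
# vocabulary and data pipeline are assumed.
model = Seq2SeqWithAttention(vocab_size=5000)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

src = torch.randint(0, 5000, (2, 7))     # dummy question IDs
tgt = torch.randint(0, 5000, (2, 9))     # dummy answer IDs
optimizer.zero_grad()
logits = model(src, tgt[:, :-1])         # teacher forcing: predict next token
loss = criterion(logits.reshape(-1, 5000), tgt[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
```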
  • In summary, after the session question is acquired and pre-processed, the candidate question set related to the session question is queried from the Q&A knowledge base 4 by means of the inverted index, the text similarity between the session question and each candidate question in the candidate question set is calculated, and it is determined whether an approximate question of the session question exists in the candidate question set; if so, the answer associated with the approximate question is looked up in the Q&A knowledge base 4 and output as the target answer of the session question.
  • Referring to FIG. 4, it is a program module diagram of the chat response program 10 in FIG. 1.
  • The chat response program 10 is divided into a plurality of modules that are stored in the memory 11 and executed by the processor 12 to implement the present application.
  • a module as referred to in this application refers to a series of computer program instructions that are capable of performing a particular function.
  • the chat response program 10 can be divided into: a pre-processing module 110, a first calculation module 120, a question retrieval module 130, a second calculation module 140, an answer retrieval module 150, and an answer prediction module 160.
  • The pre-processing module 110 is configured to obtain a session question input by the client and pre-process the session question to obtain its text feature information, where the text feature information includes the part of speech, the position, and the word-class attribution of each term in the session question.
  • The word-class attribution indicates whether the term belongs to a keyword or a named entity.
  • The pre-processing module 110 is configured to perform the following pre-processing on the session question:
  • Word segmentation, where the method includes performing forward maximum matching based on a dictionary and/or performing reverse maximum matching based on a dictionary;
  • Part-of-speech analysis of each term obtained by word segmentation and tagging of the part of speech of each term, where the part-of-speech analysis is realized by a part-of-speech tagging model trained on a preset large-scale corpus;
  • Named entity recognition of the session question to identify named entities with specific meanings, where named entities include person names, place names, organizations, and proper nouns, and the methods for recognizing named entities include dictionary-based and rule-based methods and methods based on statistical learning;
  • The preset dictionary includes a business scenario-specific dictionary.
  • The first calculation module 120 is configured to construct an inverted index for the Q&A knowledge base 4, which includes a plurality of pre-arranged questions and one or more answers associated with each question, to query, according to the text feature information and by means of the inverted index, a candidate question set related to the session question from the Q&A knowledge base 4, and to calculate the text similarity between the session question and each candidate question in the candidate question set.
  • The first calculation module 120 is configured to construct the inverted index for the Q&A knowledge base 4 by:
  • Each question and answer in the Q&A knowledge base 4 is subjected to word segmentation, part-of-speech tagging, keyword extraction, and recording of the positions where keywords occur; an ID number is assigned to each question and answer, and an ID number is assigned to each term obtained by segmenting each question and answer;
  • Each question and answer in the Q&A knowledge base 4 is sorted according to its ID number, each term obtained after segmentation is likewise sorted according to its ID number, and all question IDs and answer IDs that contain the same term are placed in the inverted record table corresponding to that term;
  • The manner in which the first calculation module 120 calculates the text similarity between the session question and each candidate question in the candidate question set includes:
  • The question retrieval module 130 is configured to determine, according to a preset rule and the text similarities, whether an approximate question of the session question exists in the candidate question set; if an approximate question exists, it looks up the answer associated with the approximate question in the Q&A knowledge base and outputs the associated answer as the target answer of the session question.
  • Specifically, the question retrieval module 130 determines whether any candidate question has a text similarity with the session question greater than the second preset threshold; if so, it selects, from those candidate questions, the one with the maximum text similarity as the approximate question; if no candidate question exceeds the second preset threshold, it determines that no approximate question of the session question exists in the candidate question set.
  • The second calculation module 140 is configured to, if no approximate question of the session question exists in the candidate question set, query, according to the text feature information and by means of the inverted index, a candidate answer set related to the session question from the Q&A knowledge base 4, and calculate the topic similarity between the session question and each candidate answer in the candidate answer set.
  • The manner in which the second calculation module 140 calculates the topic similarity between the session question and each candidate answer in the candidate answer set includes:
  • The answer retrieval module 150 is configured to determine, according to a preset rule and the topic similarities, whether an approximate answer to the session question exists in the candidate answer set; if an approximate answer exists, it outputs the approximate answer as the target answer of the session question.
  • Specifically, the answer retrieval module 150 determines whether any candidate answer has a topic similarity with the session question greater than the third preset threshold; if so, it selects, from those candidate answers, the one with the maximum topic similarity as the approximate answer; if no candidate answer exceeds the third preset threshold, it determines that no approximate answer to the session question exists in the candidate answer set.
  • The answer prediction module 160 is configured to, if no approximate answer to the session question exists in the candidate answer set, perform iterative encode-decode training on each question and answer in the Q&A knowledge base 4 using the seq2seq model, thereby constructing a sequence prediction model; it inputs the session question into the sequence prediction model to generate an adaptive answer and outputs the adaptive answer as the target answer of the session question.
  • In the answer prediction module 160, the seq2seq model consists of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encode-decode training, together with an attention mechanism for weighting the hidden-layer information at each encoding and decoding step. A sketch of how the modules fit together follows.
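  • Putting the modules together, the response flow amounts to a three-stage cascade (retrieve by question, retrieve by answer, then generate). The sketch below uses hypothetical module interfaces purely to show the control flow; only the cascade order is taken from the text.

```python
# Hypothetical module interfaces; only the cascade order comes from the text.
def respond(session_question, modules):
    features = modules.preprocess(session_question)                       # module 110
    question_candidates = modules.retrieve_candidate_questions(features)  # module 120
    answer = modules.match_question(question_candidates)                  # module 130
    if answer is not None:
        return answer                         # an approximate question was found
    answer_candidates = modules.retrieve_candidate_answers(features)      # module 140
    answer = modules.match_answer(answer_candidates)                      # module 150
    if answer is not None:
        return answer                         # an approximate answer was found
    return modules.predict_answer(session_question)                       # module 160 (seq2seq)
```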
  • The memory 11, which includes the readable storage medium, may store an operating system, the chat response program 10, and the Q&A knowledge base 4.
  • When the processor 12 executes the chat response program 10 stored in the memory 11, the following steps are implemented:
  • A pre-processing step: obtain a session question input by the client, pre-process the session question, and obtain text feature information of the session question, where the text feature information includes the part of speech, the position, and the word-class attribution of each term in the session question, the word-class attribution indicating whether the term belongs to a keyword or a named entity;
  • A first calculation step: construct an inverted index for the Q&A knowledge base, which includes a plurality of pre-arranged questions and one or more answers associated with each question; according to the text feature information, query a candidate question set related to the session question from the Q&A knowledge base by means of the inverted index, and calculate the text similarity between the session question and each candidate question in the candidate question set;
  • A question retrieval step: determine, according to a preset rule and the text similarities, whether an approximate question of the session question exists in the candidate question set; if an approximate question exists, look up the answer associated with the approximate question in the Q&A knowledge base and output the associated answer as the target answer of the session question;
  • A second calculation step: if no approximate question of the session question exists in the candidate question set, query, according to the text feature information and by means of the inverted index, a candidate answer set related to the session question from the Q&A knowledge base, and calculate the topic similarity between the session question and each candidate answer in the candidate answer set;
  • An answer retrieval step: determine, according to a preset rule and the topic similarities, whether an approximate answer to the session question exists in the candidate answer set; if an approximate answer exists, output the approximate answer as the target answer of the session question;
  • An answer prediction step: if no approximate answer to the session question exists in the candidate answer set, perform iterative encode-decode training on each question and answer in the Q&A knowledge base using the seq2seq model, thereby constructing a sequence prediction model; input the session question into the sequence prediction model to generate an adaptive answer, and output the adaptive answer as the target answer of the session question.
  • The pre-processing of the session question includes:
  • Word segmentation, where the method includes performing forward maximum matching based on a dictionary and/or performing reverse maximum matching based on a dictionary;
  • Part-of-speech analysis of each term obtained by word segmentation and tagging of the part of speech of each term, where the part-of-speech analysis is realized by a part-of-speech tagging model trained on a preset large-scale corpus;
  • Named entity recognition of the session question to identify named entities with specific meanings, where named entities include person names, place names, organizations, and proper nouns, and the methods for recognizing named entities include dictionary-based and rule-based methods and methods based on statistical learning;
  • The preset dictionary includes a business scenario-specific dictionary.
  • The calculating of the topic similarity between the session question and each candidate answer in the candidate answer set includes:
  • The determining, according to the preset rule and the text similarities, of whether an approximate question of the session question exists in the candidate question set includes:
  • The determining, according to the preset rule and the topic similarities, of whether an approximate answer exists in the candidate answer set includes:
  • The construction of the inverted index for the Q&A knowledge base includes:
  • Each question and answer in the Q&A knowledge base is subjected to word segmentation, part-of-speech tagging, keyword extraction, and recording of the positions where keywords occur; an ID number is assigned to each question and answer, and an ID number is assigned to each term obtained by segmenting each question and answer;
  • Each question and answer in the Q&A knowledge base is sorted according to its ID number, each term obtained after segmentation is likewise sorted according to its ID number, and all question IDs and answer IDs that contain the same term are placed in the inverted record table corresponding to that term;
  • The seq2seq model consists of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encode-decode training, together with an attention mechanism for weighting the hidden-layer information at each encoding and decoding step.
  • The embodiment of the present application further provides a computer-readable storage medium, which may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, and the like.
  • the computer readable storage medium includes a Q&A knowledge base 4, a chat response program 10, and the like. When the chat response program 10 is executed by the processor 12, the following operations are implemented:
  • A pre-processing step: obtain a session question input by the client, pre-process the session question, and obtain text feature information of the session question, where the text feature information includes the part of speech, the position, and the word-class attribution of each term in the session question, the word-class attribution indicating whether the term belongs to a keyword or a named entity;
  • A first calculation step: construct an inverted index for the Q&A knowledge base, which includes a plurality of pre-arranged questions and one or more answers associated with each question; according to the text feature information, query a candidate question set related to the session question from the Q&A knowledge base by means of the inverted index, and calculate the text similarity between the session question and each candidate question in the candidate question set;
  • A question retrieval step: determine, according to a preset rule and the text similarities, whether an approximate question of the session question exists in the candidate question set; if an approximate question exists, look up the answer associated with the approximate question in the Q&A knowledge base and output the associated answer as the target answer of the session question;
  • A second calculation step: if no approximate question of the session question exists in the candidate question set, query, according to the text feature information and by means of the inverted index, a candidate answer set related to the session question from the Q&A knowledge base, and calculate the topic similarity between the session question and each candidate answer in the candidate answer set;
  • An answer retrieval step: determine, according to a preset rule and the topic similarities, whether an approximate answer to the session question exists in the candidate answer set; if an approximate answer exists, output the approximate answer as the target answer of the session question;
  • An answer prediction step: if no approximate answer to the session question exists in the candidate answer set, perform iterative encode-decode training on each question and answer in the Q&A knowledge base using the seq2seq model, thereby constructing a sequence prediction model; input the session question into the sequence prediction model to generate an adaptive answer, and output the adaptive answer as the target answer of the session question.
  • The pre-processing of the session question includes:
  • Word segmentation, where the method includes performing forward maximum matching based on a dictionary and/or performing reverse maximum matching based on a dictionary;
  • Part-of-speech analysis of each term obtained by word segmentation and tagging of the part of speech of each term, where the part-of-speech analysis is realized by a part-of-speech tagging model trained on a preset large-scale corpus;
  • Named entity recognition of the session question to identify named entities with specific meanings, where named entities include person names, place names, organizations, and proper nouns, and the methods for recognizing named entities include dictionary-based and rule-based methods and methods based on statistical learning;
  • The preset dictionary includes a business scenario-specific dictionary.
  • The calculating of the topic similarity between the session question and each candidate answer in the candidate answer set includes:
  • The determining, according to the preset rule and the text similarities, of whether an approximate question of the session question exists in the candidate question set includes:
  • The determining, according to the preset rule and the topic similarities, of whether an approximate answer exists in the candidate answer set includes:
  • The construction of the inverted index for the Q&A knowledge base includes:
  • Each question and answer in the Q&A knowledge base is subjected to word segmentation, part-of-speech tagging, keyword extraction, and recording of the positions where keywords occur; an ID number is assigned to each question and answer, and an ID number is assigned to each term obtained by segmenting each question and answer;
  • Each question and answer in the Q&A knowledge base is sorted according to its ID number, each term obtained after segmentation is likewise sorted according to its ID number, and all question IDs and answer IDs that contain the same term are placed in the inverted record table corresponding to that term;
  • The seq2seq model consists of a forward long short-term memory (LSTM) network and a backward LSTM network used for the iterative encode-decode training, together with an attention mechanism for weighting the hidden-layer information at each encoding and decoding step.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

Provided is a chat response method, comprising: acquiring a session question; querying a candidate question set related to the session question in a question-answer knowledge base; calculating the text similarity between the session question and each candidate question; determining whether a question similar to the session question exists; if so, searching for and outputting an answer associated with the similar question; if not, querying a candidate answer set related to the session question in the question-answer knowledge base; calculating the topic similarity between the session question and each candidate answer; determining whether an answer similar to the session question exists; if so, outputting the similar answer; if not, establishing a sequence prediction model, inputting the session question into the sequence prediction model to generate an adaptive answer, and outputting the adaptive answer as a target answer. The described method may provide customers with accurate and responsive feedback regarding the session question, thus improving the quality of service.

Description

Chat response method, electronic device and storage medium

This application claims priority to Chinese Patent Application No. 201810135747.6, entitled "Chat response method, electronic device and storage medium", filed with the Chinese Patent Office on February 9, 2018, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of computer technologies, and in particular, to a chat response method, an electronic device, and a storage medium.

Background

With the development of technology, AI (Artificial Intelligence) is gradually changing our way of life; intelligent question answering is one example. When a customer consults online via text or voice, an online intelligent customer service agent can answer the customer automatically. Intelligent question answering can effectively reduce customer waiting and improve service quality, and therefore has very broad prospects.

However, even in specific service areas, such as the vertical fields of finance, banking, securities, and insurance, the online consultation process will contain some pure chit-chat content. If the system cannot respond quickly, accurately, and adaptively to the chat content input by the customer, the service quality of the intelligent customer service is reduced and the customer is not given a humanized, high-quality experience.

Summary of the invention

In view of the above, it is necessary to provide a chat response method, an electronic device, and a storage medium that can give customers accurate and adaptive feedback on session questions, thereby improving service quality.

To achieve the above objective, the present application provides a chat response method, the method comprising: a pre-processing step: acquiring a session question input by a client, pre-processing the session question, and obtaining text feature information of the session question, where the text feature information includes the part of speech, the position, and the word-class attribution of each term in the session question, the word-class attribution indicating whether the term belongs to a keyword or a named entity; a first calculation step: constructing an inverted index for a question-and-answer knowledge base, the knowledge base including a plurality of pre-arranged questions and one or more answers associated with each question, querying, according to the text feature information and by means of the inverted index, a candidate question set related to the session question from the knowledge base, and calculating the text similarity between the session question and each candidate question in the candidate question set; a question retrieval step: determining, according to a preset rule and the text similarities, whether an approximate question of the session question exists in the candidate question set, and if so, looking up the answer associated with the approximate question in the knowledge base and outputting the associated answer as the target answer of the session question; a second calculation step: if no approximate question of the session question exists in the candidate question set, querying, according to the text feature information and by means of the inverted index, a candidate answer set related to the session question from the knowledge base, and calculating the topic similarity between the session question and each candidate answer in the candidate answer set; an answer retrieval step: determining, according to a preset rule and the topic similarities, whether an approximate answer to the session question exists in the candidate answer set, and if so, outputting the approximate answer as the target answer of the session question; and an answer prediction step: if no approximate answer to the session question exists in the candidate answer set, performing iterative encode-decode training on each question and answer in the knowledge base using a seq2seq model, thereby constructing a sequence prediction model, inputting the session question into the sequence prediction model to generate an adaptive answer, and outputting the adaptive answer as the target answer of the session question.

To achieve the above objective, the present application further provides an electronic device including a memory and a processor, where the memory stores a chat response program that, when executed by the processor, implements the steps of the chat response method described above: the pre-processing step, the first calculation step, the question retrieval step, the second calculation step, the answer retrieval step, and the answer prediction step.

此外,为实现上述目的,本申请还提供一种计算机可读存储介质,所述计算机可读存储介质中包括聊天应答程序,该聊天应答程序被处理器执行时,实现如上所述的聊天应答方法的任意步骤。In addition, in order to achieve the above object, the present application further provides a computer readable storage medium including a chat response program, when the chat response program is executed by a processor, implementing the chat response method as described above Any step.

本申请提出的聊天应答方法、电子装置及存储介质,在获取会话问题并进行预处理后,通过倒排索引查询的方式从问答知识库中查询与所述会话问题相关的候选问题集合,并分别计算所述会话问题与所述候选问题集合中每个候选问题的文本相似度,判断候选问题集合中是否存在所述会话问题的近似问题,若是,则在问答知识库中查找该近似问题的关联答案,将所述关联答案作为所述会话问题的目标答案输出,若所述候选问题集合中不存在所述 会话问题的近似问题,则通过倒排索引查询的方式从问答知识库中查询与所述会话问题相关的候选答案集合,并分别计算所述会话问题与所述候选答案集合中每个候选答案的主题相似度,判断候选答案集合中是否存在所述会话问题的近似答案,若是,则将所述近似答案作为所述会话问题的目标答案输出,若候选答案集合中不存在所述会话问题的近似答案,则通过seq2seq模型对所述问答知识库中的各个问题和答案进行编码和解码的迭代训练,从而构建序列预测模型,将所述会话问题输入所述序列预测模型生成应变答案,将所述应变答案作为所述会话问题的目标答案输出,可以针对会话问题为客户做出准确和应变的反馈,从而提高服务质量。The chat response method, the electronic device and the storage medium proposed by the present application, after acquiring the session problem and performing pre-processing, query the candidate question set related to the conversation problem from the question-and-answer knowledge base by means of the inverted index query, and respectively Calculating a text similarity between the conversation problem and each candidate question in the candidate question set, determining whether there is an approximation problem of the conversation problem in the candidate question set, and if so, searching for an association of the approximation problem in the Q&A knowledge base The answer is that the associated answer is output as the target answer of the conversation question. If there is no approximation problem of the conversation problem in the candidate question set, the query and the query are obtained from the question and answer knowledge base by means of an inverted index query. Determining a set of candidate answers related to the conversation problem, and separately calculating a topic similarity of the conversation question and each candidate answer in the candidate answer set, and determining whether an approximate answer of the conversation question exists in the candidate answer set, and if so, Outputting the approximate answer as a target answer to the conversation question, if the candidate If there is no approximate answer to the conversation problem in the answer set, the iterative training of encoding and decoding each question and answer in the question and answer knowledge base is performed by the seq2seq model, thereby constructing a sequence prediction model, and inputting the conversation problem The sequence prediction model generates a strain answer, and outputs the strain answer as a target answer of the conversation question, and can provide accurate and responsive feedback to the client for the conversation problem, thereby improving service quality.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of the operating environment of a preferred embodiment of the electronic device of the present application;

FIG. 2 is a schematic diagram of the interaction between the electronic device and a client in a preferred embodiment of the present application;

FIG. 3 is a flowchart of a preferred embodiment of the chat response method of the present application;

FIG. 4 is a block diagram of the program modules of the chat response program in FIG. 1.

The implementation, functional features and advantages of the objects of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.

Detailed Description of the Embodiments

The principles and spirit of the present application are described below with reference to several specific embodiments. It should be understood that the specific embodiments described here are intended only to explain the present application and are not intended to limit it.

Those skilled in the art will appreciate that the embodiments of the present application can be implemented as a method, an apparatus, a device, a system or a computer program product. Accordingly, the present application can be embodied entirely in hardware, entirely in software (including firmware, resident software, microcode and the like), or in a combination of hardware and software.

According to the embodiments of the present application, a chat response method, an electronic device and a storage medium are provided.

FIG. 1 shows the operating environment of a preferred embodiment of the electronic device of the present application.

The electronic device 1 may be a terminal device with storage and computing capabilities, such as a server, a portable computer or a desktop computer.

The electronic device 1 includes a memory 11, a processor 12, a network interface 13 and a communication bus 14. The network interface 13 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The communication bus 14 is used to implement connection and communication between these components.

The memory 11 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card or a card-type memory. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, for example a hard disk of the electronic device 1. In other embodiments, the readable storage medium may also be an external memory 11 of the electronic device 1, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card equipped on the electronic device 1.

In this embodiment, the readable storage medium of the memory 11 is generally used to store the chat response program 10, the question-and-answer knowledge base 4 and the like installed in the electronic device 1. The memory 11 may also be used to temporarily store data that has been output or is to be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), a microprocessor or another data processing chip, and is used to run the program code stored in the memory 11 or to process data, for example to execute the chat response program 10.

FIG. 1 shows only the electronic device 1 with the components 11-14 and the chat response program 10, but it should be understood that not all of the illustrated components are required to be implemented, and more or fewer components may be implemented instead.

Optionally, the electronic device 1 may further include a user interface. The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with a voice recognition function, and a voice output device such as a loudspeaker or an earphone. Optionally, the user interface may also include a standard wired interface and a wireless interface.

Optionally, the electronic device 1 may further include a display, which may also be called a display screen or a display unit. In some embodiments the display may be an LED display, a liquid crystal display, a touch liquid crystal display, an organic light-emitting diode (OLED) display or the like. The display is used to show the information processed in the electronic device 1 and to present a visual user interface.

Optionally, the electronic device 1 further includes a touch sensor. The area provided by the touch sensor for the user to perform touch operations is called a touch area. The touch sensor described here may be a resistive touch sensor, a capacitive touch sensor or the like. Moreover, the touch sensor includes not only contact-type touch sensors but may also include proximity-type touch sensors and the like. In addition, the touch sensor may be a single sensor or a plurality of sensors arranged, for example, in an array. The user can start the chat response program 10 by touching the touch area.

In addition, the area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor. Optionally, the display is stacked with the touch sensor to form a touch display screen, and the device detects touch operations triggered by the user on the basis of the touch display screen.

The electronic device 1 may further include a radio frequency (RF) circuit, sensors, an audio circuit and the like, which are not described in detail here.

FIG. 2 is a schematic diagram of the interaction between the electronic device 1 and a client 2 in a preferred embodiment of the present application. The chat response program 10 runs in the electronic device 1; in FIG. 2 the preferred embodiment of the electronic device 1 is a server. The electronic device 1 is communicatively connected to the client 2 through a network 3. The client 2 can run on various types of terminal devices, for example smart phones and portable computers. After logging in to the electronic device 1 through the client 2, a user can input a conversation question to the chat response program 10. The conversation question may be a question about a specific domain or may be chit-chat content. The chat response program 10 can use the chat response method to determine suitable response content according to the conversation question and feed the response content back to the client 2.

FIG. 3 is a flowchart of a preferred embodiment of the chat response method of the present application. When the processor 12 of the electronic device 1 executes the chat response program 10 stored in the memory 11, the following steps of the chat response method are implemented.

Step S1: obtain a conversation question input by a client, and preprocess the conversation question to obtain text feature information of the conversation question, where the text feature information includes the part of speech, the position and the term class attribution of each term in the conversation question, and the term class attribution includes attribution to a keyword or to a named entity. The conversation question may, for example, be a question about a specific domain, such as "How long is the warranty period?" ("保修期是多久"), or may be chit-chat content, such as "The weather is very nice today" ("今天天气很不错"). To facilitate subsequent processing of the conversation question, step S1 may first perform some preprocessing on it.

Specifically, the preprocessing performed in step S1 may include the following operations (a code sketch is given after the list):

Word segmentation is performed on the conversation question so as to split out the terms of the conversation question. For example, if the conversation question is "保修期是多久" ("How long is the warranty period?"), the terms obtained after segmentation are "保修期" (warranty period), "是" (is), "多" (how) and "久" (long). The word segmentation method includes dictionary-based forward maximum matching and/or dictionary-based backward maximum matching.

Part-of-speech parsing is performed on the terms obtained by the word segmentation, and the part of speech of each term is tagged. For the above example, the result of part-of-speech tagging according to the preset rule is "保修期/noun", "是/verb", "多/adverb", "久/adjective". The part-of-speech parsing is implemented by a part-of-speech tagging model trained on a preset large-scale corpus.

Named entity recognition is performed on the conversation question so as to identify named entities with specific meanings, where the named entities include person names, place names, organizations and proper nouns. The named entity recognition methods include dictionary-and-rule-based methods and statistical-learning-based methods.

Keywords are extracted from the conversation question according to the terms and the named entities. A keyword is a phrase whose number of characters is greater than a first preset threshold, or a named entity present in a preset dictionary, where the preset dictionary includes a business-scenario-specific dictionary.
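A minimal, illustrative sketch of such a preprocessing pipeline is given below. It is not the embodiment itself: the jieba segmentation and tagging package, the dictionaries, the character threshold and the helper names are assumptions used only to make the per-term feature structure concrete.

```python
# Illustrative preprocessing sketch (assumed tooling, not the embodiment's implementation).
import jieba.posseg as pseg   # assumed Chinese segmentation + POS tagging package

DOMAIN_DICT = {"保修期"}        # hypothetical business-scenario-specific dictionary
ENTITY_DICT = {"深圳": "place_name"}   # hypothetical dictionary for rule-based NER
CHAR_THRESHOLD = 2              # stands in for the "first preset threshold"

def preprocess(question: str):
    """Return one (term, part_of_speech, position, term_class) tuple per term."""
    features = []
    for position, pair in enumerate(pseg.cut(question)):   # segmentation + POS tagging
        term, pos = pair.word, pair.flag
        if term in ENTITY_DICT:                             # dictionary/rule-based NER
            term_class = "named_entity"
        elif len(term) > CHAR_THRESHOLD or term in DOMAIN_DICT:
            term_class = "keyword"                          # keyword extraction rule
        else:
            term_class = "other"
        features.append((term, pos, position, term_class))
    return features

# preprocess("保修期是多久") might yield tuples such as
# ("保修期", "n", 0, "keyword"), ("是", "v", 1, "other"), ...
```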

Step S2: build an inverted index for the question-and-answer knowledge base 4, where the knowledge base 4 includes a plurality of pre-compiled questions and one or more answers associated with each question; query, according to the text feature information and by way of an inverted-index query, a candidate question set related to the conversation question from the knowledge base 4; and separately calculate the text similarity between the conversation question and each candidate question in the candidate question set.

In one embodiment, building the inverted index for the question-and-answer knowledge base 4 includes the following operations (sketched in code below):

performing, on each question and each answer in the question-and-answer knowledge base 4, word segmentation, part-of-speech tagging, keyword extraction, recording of keyword occurrence positions and assignment of an ID number, and assigning an ID number to each term obtained after segmenting each question and answer;

sorting the questions and answers in the question-and-answer knowledge base 4 according to their ID numbers, sorting the terms obtained after segmenting each question and answer according to their ID numbers, and putting all question IDs and answer IDs that contain the same term ID into the inverted record table corresponding to that term;

merging all inverted record tables into the final inverted index.
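The construction can be pictured with the following minimal sketch, which assumes every question and answer has already been segmented into terms and assigned an entry ID; the function and variable names are hypothetical.

```python
# Illustrative inverted-index construction (a sketch under the stated assumptions).
from collections import defaultdict

def build_inverted_index(qa_entries):
    """qa_entries: iterable of (entry_id, terms) covering every question and answer,
    where entry_id is the assigned question/answer ID and terms is its segmented text.
    Returns {term_id: sorted list of entry IDs}; occurrence positions are omitted here."""
    term_ids = {}                      # assign an ID number to every distinct term
    postings = defaultdict(list)       # one inverted record table per term
    for entry_id, terms in sorted(qa_entries):     # entries processed in ID order
        for term in terms:
            tid = term_ids.setdefault(term, len(term_ids))
            if not postings[tid] or postings[tid][-1] != entry_id:
                postings[tid].append(entry_id)
    return dict(postings)              # all record tables merged into one index

# Query time: the candidate set for a conversation question is the union of the
# posting lists of the question's terms (typically its keywords).
```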

The candidate question set includes at least one candidate question, and since an inverted-index query is used, every candidate question is related to the conversation question to some degree. The relation between each candidate question and the conversation question can be reflected by the text similarity: the higher the text similarity between the conversation question and a candidate question, the more similar the conversation question is considered to be to that candidate question.

Specifically, the method by which step S2 separately calculates the text similarity between the conversation question and each candidate question in the candidate question set may include the following (an illustrative sketch follows the list):

building a convolutional neural network, and performing sample training on all question sentences in the question-and-answer knowledge base 4 through the convolutional neural network to obtain a convolutional neural network model corresponding to the question sentences in the knowledge base 4;

inputting the conversation question and each candidate question in the candidate question set into the convolutional neural network model, and obtaining, by convolution with the convolution kernels of the model, the feature vectors corresponding to the conversation question and to each candidate question in the candidate question set;

calculating the cosine distance between the feature vector of the conversation question and the feature vector of each candidate question in the candidate question set, thereby obtaining the text similarity between the conversation question and each candidate question.
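As an illustration only, the following PyTorch sketch shows one way such a convolutional encoder and the cosine comparison could be wired together; the network sizes, the training procedure on the knowledge-base question sentences, and all identifiers are assumptions rather than the embodiment's actual model.

```python
# Illustrative CNN text encoder + cosine similarity (assumed sizes and names).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNEncoder(nn.Module):
    """Maps a sequence of term IDs to a fixed-length feature vector."""
    def __init__(self, vocab_size=10000, embed_dim=128, num_filters=100, kernel=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size=kernel, padding=1)

    def forward(self, term_ids):                    # term_ids: (batch, seq_len)
        x = self.embed(term_ids).transpose(1, 2)    # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x))                 # convolution with the kernels
        return x.max(dim=2).values                   # max-over-time feature vector

def text_similarity(encoder, question_ids, candidate_ids):
    """Cosine similarity between the feature vectors of the conversation question
    and one candidate question, both given as 1-D term-ID tensors."""
    with torch.no_grad():
        vq = encoder(question_ids.unsqueeze(0))
        vc = encoder(candidate_ids.unsqueeze(0))
    return F.cosine_similarity(vq, vc).item()
```

In the embodiment the encoder would be trained on the knowledge-base question sentences before being used for scoring; the sketch above only fixes the shapes involved.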

Step S3: judge, according to a preset rule and the text similarities, whether an approximate question of the conversation question exists in the candidate question set; if an approximate question of the conversation question exists in the candidate question set, look up the associated answer of the approximate question in the question-and-answer knowledge base and output the associated answer as the target answer of the conversation question.

Specifically, the preset rule may include: judging whether there is a candidate question whose text similarity with the conversation question is greater than a second preset threshold. If such a candidate question exists, it is determined that an approximate question of the conversation question exists in the candidate question set; if no candidate question has a text similarity with the conversation question greater than the second preset threshold, it is determined that no approximate question of the conversation question exists in the candidate question set.

If there are candidate questions whose text similarity with the conversation question is greater than the second preset threshold, step S3 selects, from these candidate questions, the one with the maximum text similarity as the approximate question, looks up the associated answer of the approximate question in the question-and-answer knowledge base 4, and outputs the associated answer as the target answer of the conversation question. It is worth noting that the approximate question may have more than one associated answer in the knowledge base 4. When the approximate question has multiple associated answers, step S3 may take, among the multiple associated answers, the one output most frequently within a preset time period (for example, the most recent week) and output it as the target answer of the conversation question.
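The selection rule of step S3 can be summarized in a few lines of illustrative code; the threshold value, the data structures and the recent-output log below are hypothetical stand-ins for the embodiment's preset rule.

```python
# Illustrative retrieval rule for step S3 (hypothetical threshold and structures).
from collections import Counter

SECOND_THRESHOLD = 0.8   # hypothetical value of the "second preset threshold"

def retrieve_answer(scored_candidates, associated_answers, recent_output_log):
    """scored_candidates: [(candidate_question, text_similarity)],
    associated_answers: {candidate_question: [answers]},
    recent_output_log: answers output within the preset time period."""
    above = [c for c in scored_candidates if c[1] > SECOND_THRESHOLD]
    if not above:
        return None                                   # no approximate question found
    approx_question = max(above, key=lambda c: c[1])[0]
    answers = associated_answers[approx_question]
    if len(answers) == 1:
        return answers[0]
    recent_counts = Counter(a for a in recent_output_log if a in answers)
    return recent_counts.most_common(1)[0][0] if recent_counts else answers[0]
```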

Step S4: if no approximate question of the conversation question exists in the candidate question set, query, according to the text feature information and by way of an inverted-index query, a candidate answer set related to the conversation question from the question-and-answer knowledge base 4, and separately calculate the topic similarity between the conversation question and each candidate answer in the candidate answer set.

The candidate answer set includes at least one candidate answer, and since an inverted-index query is used, every candidate answer is related to the conversation question to some degree. The relation between each candidate answer and the conversation question can be reflected by the topic similarity: the higher the topic similarity between the conversation question and a candidate answer, the more similar the topics of the conversation question and of that candidate answer are considered to be, and the more likely that candidate answer is considered to be the answer to the conversation question.

Specifically, the method by which step S4 separately calculates the topic similarity between the conversation question and each candidate answer in the candidate answer set may include the following (a sketch follows the list):

extracting the topic vectors of the conversation question and of each candidate answer in the candidate answer set with a linear discriminant analysis (LDA) model;

calculating the cosine distance between the topic vector of the conversation question and the topic vector of each candidate answer in the candidate answer set, thereby obtaining the topic similarity between the conversation question and each candidate answer.
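The topic-similarity computation can be sketched as follows. The embodiment refers to an "LDA" model that extracts topic vectors; the sketch below assumes a Latent Dirichlet Allocation topic model from scikit-learn as a stand-in for the topic-vector extraction, fitted on the knowledge-base texts, and the vectorizer, topic count and other choices are assumptions.

```python
# Illustrative topic-vector extraction and cosine comparison (assumed stand-in model).
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def fit_topic_model(kb_texts, n_topics=20):
    """kb_texts: knowledge-base texts, pre-segmented and whitespace-joined."""
    vectorizer = CountVectorizer(tokenizer=str.split)
    counts = vectorizer.fit_transform(kb_texts)
    lda = LatentDirichletAllocation(n_components=n_topics).fit(counts)
    return vectorizer, lda

def topic_similarity(vectorizer, lda, question, candidate_answer):
    """Cosine similarity between the topic vectors of the question and one answer."""
    vecs = lda.transform(vectorizer.transform([question, candidate_answer]))
    a, b = vecs[0], vecs[1]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```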

Step S5: judge, according to a preset rule and the topic similarities, whether an approximate answer of the conversation question exists in the candidate answer set; if an approximate answer of the conversation question exists in the candidate answer set, output the approximate answer as the target answer of the conversation question.

Specifically, the preset rule may include: judging whether there is a candidate answer whose topic similarity with the conversation question is greater than a third preset threshold. If such a candidate answer exists, it is determined that an approximate answer of the conversation question exists in the candidate answer set; if no candidate answer has a topic similarity with the conversation question greater than the third preset threshold, it is determined that no approximate answer of the conversation question exists in the candidate answer set.

If there is a candidate answer whose topic similarity with the conversation question is greater than the third preset threshold, the candidate answer is taken as the approximate answer of the conversation question, and step S5 outputs the approximate answer as the target answer of the conversation question. It is worth noting that there may be more than one candidate answer in the question-and-answer knowledge base 4 whose topic similarity with the conversation question is greater than the third preset threshold. When there are multiple such candidate answers, step S5 may take, among them, the one output most frequently within a preset time period (for example, the most recent week) as the approximate answer of the conversation question.

Step S6: if no approximate answer of the conversation question exists in the candidate answer set, perform iterative encoding-and-decoding training on the questions and answers in the question-and-answer knowledge base 4 through a seq2seq model to build a sequence prediction model, input the conversation question into the sequence prediction model to generate a strain answer, and output the strain answer as the target answer of the conversation question. The seq2seq model is composed of a forward long short-term memory (LSTM) network model and a backward LSTM model used for the iterative encoding-and-decoding training, and an attention mechanism used to calculate the weight of the hidden-layer information at each encoding and decoding step.
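A minimal sketch of such a sequence prediction model is shown below: a bidirectional (forward and backward) LSTM encoder, an LSTM decoder, and a dot-product attention mechanism that weights the encoder hidden states at every decoding step. The vocabulary, layer sizes and the training loop over the knowledge-base question-answer pairs are all assumptions; the embodiment's actual seq2seq configuration may differ.

```python
# Illustrative seq2seq model with bidirectional LSTM encoder and attention (assumed sizes).
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.decoder = nn.LSTM(embed_dim, 2 * hidden, batch_first=True)
        self.out = nn.Linear(4 * hidden, vocab_size)

    def forward(self, src_ids, tgt_ids):
        enc_out, _ = self.encoder(self.embed(src_ids))        # (B, S, 2H) encoder states
        dec_out, _ = self.decoder(self.embed(tgt_ids))        # (B, T, 2H) decoder states
        # Attention: weight the encoder hidden states for every decoding step.
        scores = torch.bmm(dec_out, enc_out.transpose(1, 2))  # (B, T, S)
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights, enc_out)                 # (B, T, 2H)
        return self.out(torch.cat([dec_out, context], dim=-1))  # next-token logits

# Training would iterate encoding and decoding over the knowledge-base Q-A pairs with a
# cross-entropy loss; at serving time the conversation question is encoded and the decoder
# generates the strain (adaptively generated) answer token by token.
```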

According to the chat response method provided in this embodiment, after a conversation question is obtained and preprocessed, a candidate question set related to the conversation question is queried from the question-and-answer knowledge base 4 by way of an inverted-index query, and the text similarity between the conversation question and each candidate question is calculated to judge whether an approximate question of the conversation question exists in the candidate question set. If so, the associated answer of the approximate question is looked up in the knowledge base 4 and output as the target answer of the conversation question. If no approximate question exists in the candidate question set, a candidate answer set related to the conversation question is queried from the knowledge base 4 by way of an inverted-index query, and the topic similarity between the conversation question and each candidate answer is calculated to judge whether an approximate answer of the conversation question exists in the candidate answer set; if so, the approximate answer is output as the target answer of the conversation question. If no approximate answer exists in the candidate answer set, iterative encoding-and-decoding training is performed on the questions and answers in the knowledge base through the seq2seq model to build a sequence prediction model, the conversation question is input into the sequence prediction model to generate a strain answer, and the strain answer is output as the target answer of the conversation question. The chat response method provided in this embodiment can therefore give the client accurate and adaptive feedback on the conversation question, thereby improving service quality.

FIG. 4 is a block diagram of the program modules of the chat response program 10 in FIG. 1. In this embodiment, the chat response program 10 is divided into a plurality of modules, which are stored in the memory 11 and executed by the processor 12 to complete the present application. A module referred to in this application is a series of computer program instruction segments capable of completing a specific function.

The chat response program 10 can be divided into a preprocessing module 110, a first calculation module 120, a question retrieval module 130, a second calculation module 140, an answer retrieval module 150 and an answer prediction module 160.

The preprocessing module 110 is configured to obtain a conversation question input by a client, and to preprocess the conversation question to obtain text feature information of the conversation question, where the text feature information includes the part of speech, the position and the term class attribution of each term in the conversation question, and the term class attribution includes attribution to a keyword or to a named entity.

Specifically, the preprocessing module 110 is configured to perform the following preprocessing on the conversation question:

performing word segmentation on the conversation question so as to split out the terms of the conversation question, where the word segmentation method includes dictionary-based forward maximum matching and/or dictionary-based backward maximum matching;

performing part-of-speech parsing on the terms obtained by the word segmentation and tagging the part of speech of each term, where the part-of-speech parsing is implemented by a part-of-speech tagging model trained on a preset large-scale corpus;

performing named entity recognition on the conversation question so as to identify named entities with specific meanings, where the named entities include person names, place names, organizations and proper nouns, and the named entity recognition methods include dictionary-and-rule-based methods and statistical-learning-based methods;

extracting keywords from the conversation question according to the terms and the named entities, where a keyword is a phrase whose number of characters is greater than a first preset threshold, or a named entity present in a preset dictionary, and the preset dictionary includes a business-scenario-specific dictionary.

The first calculation module 120 is configured to build an inverted index for the question-and-answer knowledge base 4, where the knowledge base includes a plurality of pre-compiled questions and one or more answers associated with each question, to query, according to the text feature information and by way of an inverted-index query, a candidate question set related to the conversation question from the knowledge base 4, and to separately calculate the text similarity between the conversation question and each candidate question in the candidate question set.

Specifically, the first calculation module 120 is configured to build the inverted index for the question-and-answer knowledge base 4 in the following manner:

performing, on each question and each answer in the question-and-answer knowledge base 4, word segmentation, part-of-speech tagging, keyword extraction, recording of keyword occurrence positions and assignment of an ID number, and assigning an ID number to each term obtained after segmenting each question and answer;

sorting the questions and answers in the question-and-answer knowledge base 4 according to their ID numbers, sorting the terms obtained after segmenting each question and answer according to their ID numbers, and putting all question IDs and answer IDs that contain the same term ID into the inverted record table corresponding to that term;

merging all inverted record tables into the final inverted index.

Calculating, by the first calculation module 120, the text similarity between the conversation question and each candidate question in the candidate question set includes:

building a convolutional neural network, and performing sample training on all question sentences in the question-and-answer knowledge base 4 through the convolutional neural network to obtain a convolutional neural network model corresponding to the question sentences in the knowledge base 4;

inputting the conversation question and each candidate question in the candidate question set into the convolutional neural network model, and obtaining, by convolution with the convolution kernels of the model, the feature vectors corresponding to the conversation question and to each candidate question in the candidate question set;

calculating the cosine distance between the feature vector of the conversation question and the feature vector of each candidate question in the candidate question set, thereby obtaining the text similarity between the conversation question and each candidate question.

The question retrieval module 130 is configured to judge, according to a preset rule and the text similarities, whether an approximate question of the conversation question exists in the candidate question set, and, if an approximate question of the conversation question exists in the candidate question set, to look up the associated answer of the approximate question in the question-and-answer knowledge base and output the associated answer as the target answer of the conversation question.

Specifically, the question retrieval module 130 judges whether there is a candidate question whose text similarity with the conversation question is greater than a second preset threshold. If so, it selects, from the candidate questions whose text similarity with the conversation question is greater than the second preset threshold, the one with the maximum text similarity as the approximate question; if no candidate question has a text similarity with the conversation question greater than the second preset threshold, it determines that no approximate question of the conversation question exists in the candidate question set.

The second calculation module 140 is configured to, if no approximate question of the conversation question exists in the candidate question set, query, according to the text feature information and by way of an inverted-index query, a candidate answer set related to the conversation question from the question-and-answer knowledge base 4, and to separately calculate the topic similarity between the conversation question and each candidate answer in the candidate answer set.

Calculating, by the second calculation module 140, the topic similarity between the conversation question and each candidate answer in the candidate answer set includes:

extracting the topic vectors of the conversation question and of each candidate answer in the candidate answer set with a linear discriminant analysis model;

calculating the cosine distance between the topic vector of the conversation question and the topic vector of each candidate answer in the candidate answer set, thereby obtaining the topic similarity between the conversation question and each candidate answer.

The answer retrieval module 150 is configured to judge, according to a preset rule and the topic similarities, whether an approximate answer of the conversation question exists in the candidate answer set, and, if an approximate answer of the conversation question exists in the candidate answer set, to output the approximate answer as the target answer of the conversation question.

Specifically, the answer retrieval module 150 judges whether there is a candidate answer whose topic similarity with the conversation question is greater than a third preset threshold. If so, it selects, from the candidate answers whose topic similarity with the conversation question is greater than the third preset threshold, the one with the maximum topic similarity as the approximate answer; if no candidate answer has a topic similarity with the conversation question greater than the third preset threshold, it determines that no approximate answer of the conversation question exists in the candidate answer set.

The answer prediction module 160 is configured to, if no approximate answer of the conversation question exists in the candidate answer set, perform iterative encoding-and-decoding training on the questions and answers in the question-and-answer knowledge base 4 through a seq2seq model to build a sequence prediction model, to input the conversation question into the sequence prediction model to generate a strain answer, and to output the strain answer as the target answer of the conversation question. The seq2seq model used by the answer prediction module 160 is composed of a forward long short-term memory (LSTM) network model and a backward LSTM model used for the iterative encoding-and-decoding training, and an attention mechanism used to calculate the weight of the hidden-layer information at each encoding and decoding step.

In the operating environment shown in FIG. 1 for the preferred embodiment of the electronic device 1, the memory 11 containing the readable storage medium may include an operating system, the chat response program 10 and the question-and-answer knowledge base 4. When the processor 12 executes the chat response program 10 stored in the memory 11, the following steps are implemented:

a preprocessing step: obtaining a conversation question input by a client, and preprocessing the conversation question to obtain text feature information of the conversation question, where the text feature information includes the part of speech, the position and the term class attribution of each term in the conversation question, and the term class attribution includes attribution to a keyword or to a named entity;

a first calculation step: building an inverted index for the question-and-answer knowledge base, where the knowledge base includes a plurality of pre-compiled questions and one or more answers associated with each question, querying, according to the text feature information and by way of an inverted-index query, a candidate question set related to the conversation question from the knowledge base, and separately calculating the text similarity between the conversation question and each candidate question in the candidate question set;

a question retrieval step: judging, according to a preset rule and the text similarities, whether an approximate question of the conversation question exists in the candidate question set, and, if an approximate question of the conversation question exists in the candidate question set, looking up the associated answer of the approximate question in the question-and-answer knowledge base and outputting the associated answer as the target answer of the conversation question;

a second calculation step: if no approximate question of the conversation question exists in the candidate question set, querying, according to the text feature information and by way of an inverted-index query, a candidate answer set related to the conversation question from the question-and-answer knowledge base, and separately calculating the topic similarity between the conversation question and each candidate answer in the candidate answer set;

an answer retrieval step: judging, according to a preset rule and the topic similarities, whether an approximate answer of the conversation question exists in the candidate answer set, and, if an approximate answer of the conversation question exists in the candidate answer set, outputting the approximate answer as the target answer of the conversation question;

an answer prediction step: if no approximate answer of the conversation question exists in the candidate answer set, performing iterative encoding-and-decoding training on the questions and answers in the question-and-answer knowledge base through a seq2seq model to build a sequence prediction model, inputting the conversation question into the sequence prediction model to generate a strain answer, and outputting the strain answer as the target answer of the conversation question.

The preprocessing of the conversation question includes:

performing word segmentation on the conversation question so as to split out the terms of the conversation question, where the word segmentation method includes dictionary-based forward maximum matching and/or dictionary-based backward maximum matching;

performing part-of-speech parsing on the terms obtained by the word segmentation and tagging the part of speech of each term, where the part-of-speech parsing is implemented by a part-of-speech tagging model trained on a preset large-scale corpus;

performing named entity recognition on the conversation question so as to identify named entities with specific meanings, where the named entities include person names, place names, organizations and proper nouns, and the named entity recognition methods include dictionary-and-rule-based methods and statistical-learning-based methods;

extracting keywords from the conversation question according to the terms and the named entities, where a keyword is a phrase whose number of characters is greater than a first preset threshold, or a named entity present in a preset dictionary, and the preset dictionary includes a business-scenario-specific dictionary.

Separately calculating the text similarity between the conversation question and each candidate question in the candidate question set includes:

building a convolutional neural network, and performing sample training on all question sentences in the question-and-answer knowledge base through the convolutional neural network to obtain a convolutional neural network model corresponding to the question sentences in the knowledge base;

inputting the conversation question and each candidate question in the candidate question set into the convolutional neural network model, and obtaining, by convolution with the convolution kernels of the model, the feature vectors corresponding to the conversation question and to each candidate question in the candidate question set;

calculating the cosine distance between the feature vector of the conversation question and the feature vector of each candidate question in the candidate question set, thereby obtaining the text similarity between the conversation question and each candidate question.

Separately calculating the topic similarity between the conversation question and each candidate answer in the candidate answer set includes:

extracting the topic vectors of the conversation question and of each candidate answer in the candidate answer set with a linear discriminant analysis model;

calculating the cosine distance between the topic vector of the conversation question and the topic vector of each candidate answer in the candidate answer set, thereby obtaining the topic similarity between the conversation question and each candidate answer.

Judging, according to the preset rule and the text similarities, whether an approximate question of the conversation question exists in the candidate question set includes:

judging whether there is a candidate question whose text similarity with the conversation question is greater than a second preset threshold, and, if so, selecting, from the candidate questions whose text similarity with the conversation question is greater than the second preset threshold, the one with the maximum text similarity as the approximate question;

if no candidate question has a text similarity with the conversation question greater than the second preset threshold, determining that no approximate question of the conversation question exists in the candidate question set.

Judging, according to the preset rule and the topic similarities, whether an approximate answer of the conversation question exists in the candidate answer set includes:

judging whether there is a candidate answer whose topic similarity with the conversation question is greater than a third preset threshold, and, if so, selecting, from the candidate answers whose topic similarity with the conversation question is greater than the third preset threshold, the one with the maximum topic similarity as the approximate answer;

if no candidate answer has a topic similarity with the conversation question greater than the third preset threshold, determining that no approximate answer of the conversation question exists in the candidate answer set.

Building the inverted index for the question-and-answer knowledge base includes:

performing, on each question and each answer in the question-and-answer knowledge base, word segmentation, part-of-speech tagging, keyword extraction, recording of keyword occurrence positions and assignment of an ID number, and assigning an ID number to each term obtained after segmenting each question and answer;

sorting the questions and answers in the question-and-answer knowledge base according to their ID numbers, sorting the terms obtained after segmenting each question and answer according to their ID numbers, and putting all question IDs and answer IDs that contain the same term ID into the inverted record table corresponding to that term;

merging all inverted record tables into the final inverted index.

The seq2seq model is composed of a forward long short-term memory (LSTM) network model and a backward LSTM model used for the iterative encoding-and-decoding training, and an attention mechanism used to calculate the weight of the hidden-layer information at each encoding and decoding step.

For the specific principles, reference is made to the above description of the program module diagram of the chat response program 10 in FIG. 4 and of the flowchart of the preferred embodiment of the chat response method in FIG. 3.

In addition, an embodiment of the present application further provides a computer-readable storage medium. The computer-readable storage medium may be any one, or any combination, of a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory and the like. The computer-readable storage medium stores the question-and-answer knowledge base 4, the chat response program 10 and the like. When the chat response program 10 is executed by the processor 12, the following operations are implemented:

a preprocessing step: obtaining a conversation question input by a client, and preprocessing the conversation question to obtain text feature information of the conversation question, where the text feature information includes the part of speech, the position and the term class attribution of each term in the conversation question, and the term class attribution includes attribution to a keyword or to a named entity;

a first calculation step: building an inverted index for the question-and-answer knowledge base, where the knowledge base includes a plurality of pre-compiled questions and one or more answers associated with each question, querying, according to the text feature information and by way of an inverted-index query, a candidate question set related to the conversation question from the knowledge base, and separately calculating the text similarity between the conversation question and each candidate question in the candidate question set;

a question retrieval step: judging, according to a preset rule and the text similarities, whether an approximate question of the conversation question exists in the candidate question set, and, if an approximate question of the conversation question exists in the candidate question set, looking up the associated answer of the approximate question in the question-and-answer knowledge base and outputting the associated answer as the target answer of the conversation question;

a second calculation step: if no approximate question of the conversation question exists in the candidate question set, querying, according to the text feature information and by way of an inverted-index query, a candidate answer set related to the conversation question from the question-and-answer knowledge base, and separately calculating the topic similarity between the conversation question and each candidate answer in the candidate answer set;

an answer retrieval step: judging, according to a preset rule and the topic similarities, whether an approximate answer of the conversation question exists in the candidate answer set, and, if an approximate answer of the conversation question exists in the candidate answer set, outputting the approximate answer as the target answer of the conversation question;

an answer prediction step: if no approximate answer of the conversation question exists in the candidate answer set, performing iterative encoding-and-decoding training on the questions and answers in the question-and-answer knowledge base through a seq2seq model to build a sequence prediction model, inputting the conversation question into the sequence prediction model to generate a strain answer, and outputting the strain answer as the target answer of the conversation question.

其中,所述对所述会话问题进行预处理包括:The pre-processing of the session problem includes:

对所述会话问题进行分词处理,从而切分出会话问题的各词条,所述分词处理的方法包括基于词典进行正向最大匹配和/或基于词典进行逆向最大匹配;Performing word segmentation on the conversation problem, thereby segmenting the terms of the conversation problem, the method of word segmentation includes performing forward maximum matching based on the dictionary and/or performing reverse maximum matching based on the dictionary;

对经所述分词处理得到的各词条进行词性解析,并对各词条的词性进行标注,所述词性解析通过经预设大规模语料库训练得到的词性标注模型实现;Performing part-of-speech analysis on each term obtained by the word segmentation process, and labeling the part-of-speech of each term, and the part-of-speech analysis is realized by a part-of-speech tagging model obtained through a preset large-scale corpus training;

Perform named entity recognition on the conversation question to identify entities with specific meanings, including person names, place names, organizations and proper nouns; recognition may rely on dictionary- and rule-based methods as well as statistical learning methods.

Extract keywords from the conversation question based on the terms and the recognized named entities; a keyword is either a term whose character count exceeds a first preset threshold, or a named entity found in a preset dictionary, where the preset dictionary includes a business-scenario-specific dictionary.
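As an illustration of the preprocessing just described (segmentation, part-of-speech tagging, coarse named-entity flags and keyword extraction), the sketch below uses the jieba tokenizer as a stand-in for the dictionary-based segmenter and tagger; the domain dictionary, the length threshold and the use of POS flags as a proxy for named-entity recognition are assumptions for illustration, not choices fixed by this document.

```python
# A minimal preprocessing sketch, assuming jieba as a stand-in for the dictionary-based
# segmentation and POS tagging described above. DOMAIN_DICT and FIRST_THRESHOLD are
# illustrative placeholders, not values defined by this document.
import jieba.posseg as pseg

DOMAIN_DICT = {"理赔", "保单", "核保"}          # hypothetical business-scenario dictionary
FIRST_THRESHOLD = 2                              # minimum character count for a keyword
NE_FLAGS = {"nr", "ns", "nt", "nz"}              # POS flags roughly covering person/place/org/proper noun

def preprocess(question: str) -> dict:
    terms = []
    for position, pair in enumerate(pseg.cut(question)):   # segmentation + POS tagging in one pass
        terms.append({"word": pair.word, "flag": pair.flag, "position": position})

    named_entities = [t["word"] for t in terms if t["flag"] in NE_FLAGS]
    keywords = [
        t["word"] for t in terms
        if len(t["word"]) > FIRST_THRESHOLD
        or (t["word"] in named_entities and t["word"] in DOMAIN_DICT)
    ]
    return {"terms": terms, "named_entities": named_entities, "keywords": keywords}
```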

Computing the text similarity between the conversation question and each candidate question in the candidate question set comprises the following (a sketch follows these steps):

Build a convolutional neural network and train it on all question sentences in the Q&A knowledge base, obtaining a convolutional neural network model for the question sentences in the knowledge base.

Feed the conversation question and each candidate question in the candidate question set into the convolutional neural network model, and obtain, through the model's convolution kernels, a feature vector for the conversation question and for each candidate question.

Compute the cosine distance between the feature vector of the conversation question and the feature vector of each candidate question, thereby obtaining the text similarity between the conversation question and each candidate question in the set.
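A minimal sketch of this text-similarity computation is given below, assuming a single-layer PyTorch Conv1d encoder in place of the trained convolutional model described above; the embedding size, number of filters and kernel width are illustrative values only. In use, the knowledge-base questions could be encoded once and cached, so each incoming conversation question needs only one forward pass plus the cosine comparison.

```python
# A minimal sketch, assuming a simple Conv1d sentence encoder; sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CnnSentenceEncoder(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 128, num_filters: int = 256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, num_filters, kernel_size=3, padding=1)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> sentence feature vector: (batch, num_filters)
        x = self.embedding(token_ids).permute(0, 2, 1)    # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x))                      # convolve over the token sequence
        return x.max(dim=2).values                        # max-pool over time

def text_similarity(encoder: CnnSentenceEncoder,
                    question_ids: torch.Tensor,
                    candidate_ids: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between the conversation question and each candidate question."""
    q_vec = encoder(question_ids)                         # (1, num_filters)
    c_vec = encoder(candidate_ids)                        # (num_candidates, num_filters)
    return F.cosine_similarity(q_vec.expand_as(c_vec), c_vec, dim=1)
```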

Computing the topic similarity between the conversation question and each candidate answer in the candidate answer set comprises the following (a sketch follows):

Use a linear discriminant analysis model to extract the topic vector of the conversation question and of each candidate answer in the candidate answer set.

Compute the cosine distance between the topic vector of the conversation question and the topic vector of each candidate answer, thereby obtaining the topic similarity between the conversation question and each candidate answer in the set.
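The topic-similarity computation can be sketched as below. The description names a linear discriminant analysis model for extracting topic vectors; in retrieval settings such topic vectors are often produced by Latent Dirichlet Allocation instead, so this sketch assumes gensim's LdaModel purely for illustration, and the corpus and topic count are placeholders rather than values given in this document.

```python
# A minimal sketch of topic-vector extraction and cosine comparison, assuming
# Latent Dirichlet Allocation via gensim as the topic model (an assumption).
import numpy as np
from gensim import corpora, models

def build_topic_model(tokenized_docs, num_topics: int = 50):
    dictionary = corpora.Dictionary(tokenized_docs)
    corpus = [dictionary.doc2bow(doc) for doc in tokenized_docs]
    lda = models.LdaModel(corpus, num_topics=num_topics, id2word=dictionary)
    return lda, dictionary

def topic_vector(lda, dictionary, tokens) -> np.ndarray:
    """Dense topic distribution for one segmented question or answer."""
    bow = dictionary.doc2bow(tokens)
    dense = np.zeros(lda.num_topics)
    for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
        dense[topic_id] = prob
    return dense

def topic_similarity(vec_q: np.ndarray, vec_a: np.ndarray) -> float:
    """Cosine similarity between the question's and a candidate answer's topic vectors."""
    denom = np.linalg.norm(vec_q) * np.linalg.norm(vec_a)
    return float(vec_q @ vec_a / denom) if denom else 0.0
```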

Determining, based on the preset rule and the text similarities, whether the candidate question set contains an approximate match for the conversation question comprises:

Check whether any candidate question has a text similarity with the conversation question greater than a second preset threshold; if so, select, among those candidates, the one with the highest text similarity as the approximate question.

If no candidate question has a text similarity with the conversation question greater than the second preset threshold, determine that the candidate question set contains no approximate match for the conversation question.

Determining, based on the preset rule and the topic similarities, whether the candidate answer set contains an approximate answer to the conversation question comprises the following (a generic sketch of both threshold rules follows these steps):

Check whether any candidate answer has a topic similarity with the conversation question greater than a third preset threshold; if so, select, among those candidates, the one with the highest topic similarity as the approximate answer.

If no candidate answer has a topic similarity with the conversation question greater than the third preset threshold, determine that the candidate answer set contains no approximate answer to the conversation question.
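Both threshold rules above follow the same pattern: discard candidates at or below the preset threshold and keep the best of the rest. A minimal, generic sketch (threshold value illustrative) could look like this; called with the candidate questions and their text similarities it implements the second-threshold rule, and called with the candidate answers and their topic similarities it implements the third.

```python
# A minimal sketch of the threshold rule described above; 0.8 is an illustrative value,
# not the preset threshold defined by this document.

def best_match(candidates, similarities, threshold: float = 0.8):
    """Return the candidate with the highest similarity above `threshold`, or None."""
    above = [(sim, cand) for sim, cand in zip(similarities, candidates) if sim > threshold]
    if not above:
        return None                      # no approximate question/answer exists
    return max(above, key=lambda pair: pair[0])[1]
```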

Building the inverted index for the Q&A knowledge base comprises the following (a sketch follows the three steps below):

For each question and answer in the Q&A knowledge base, perform word segmentation, part-of-speech tagging, keyword extraction and recording of keyword positions, assign an ID number to the question or answer, and assign an ID number to each term obtained from its segmentation.

Sort the questions and answers in the Q&A knowledge base by their ID numbers, sort the terms obtained from segmentation by their ID numbers, and place all question IDs and answer IDs that share the same term ID into the posting list (inverted record table) of that term.

Merge all posting lists into the final inverted index.
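A minimal sketch of this index construction is shown below, assuming a plain in-memory dictionary and a hypothetical `tokenize` helper standing in for the segmentation and keyword pipeline described earlier; `recall_candidates` shows how the index is then used to pull the candidate question or answer set for a conversation question's keywords.

```python
# A minimal sketch of inverted-index construction; `tokenize` is a hypothetical helper.
from collections import defaultdict

def build_inverted_index(entries):
    """
    entries: iterable of (doc_id, text) pairs covering every question and answer
             in the knowledge base, each already assigned a sequential ID number.
    Returns: term -> list of (doc_id, position) postings, ordered by ID.
    """
    postings = defaultdict(list)
    for doc_id, text in sorted(entries):                  # sort documents by ID
        for position, term in enumerate(tokenize(text)):  # record where each term occurs
            postings[term].append((doc_id, position))
    # Merge per-term posting lists into the final index, keeping postings ordered by ID.
    return {term: sorted(plist) for term, plist in postings.items()}

def recall_candidates(index, keywords):
    """Union of documents containing any query keyword: the candidate question/answer set."""
    return {doc_id for kw in keywords for doc_id, _ in index.get(kw, [])}
```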

The seq2seq model consists of a forward long short-term memory (LSTM) model and a backward LSTM model, which perform the iterative encoding-and-decoding training, together with an attention mechanism that computes the weights of the hidden-layer information at each encoding and decoding step.
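A compact sketch of such an architecture is shown below, assuming PyTorch: a bidirectional LSTM encoder (the forward and backward LSTMs), an LSTMCell decoder, and a dot-product-style attention that re-weights the encoder's hidden states at every decoding step. Layer sizes, the attention form and the teacher-forcing loop are illustrative assumptions, not details given in this document. Training would minimize cross-entropy between these logits and the reference answer tokens over the question-answer pairs in the knowledge base; at inference, the conversation question alone is encoded and tokens are decoded step by step to produce the adaptive answer.

```python
# A minimal sketch of a seq2seq model with a bidirectional LSTM encoder and attention,
# assuming PyTorch; all sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Seq2SeqWithAttention(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 256, hidden: int = 256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Bidirectional encoder = forward LSTM + backward LSTM over the question.
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.decoder = nn.LSTMCell(emb_dim + 2 * hidden, hidden)
        self.attn = nn.Linear(hidden, 2 * hidden)          # scores decoder state vs encoder states
        self.out = nn.Linear(hidden + 2 * hidden, vocab_size)

    def forward(self, src_ids: torch.Tensor, tgt_ids: torch.Tensor) -> torch.Tensor:
        enc_out, _ = self.encoder(self.embedding(src_ids))          # (B, S, 2H)
        batch = src_ids.size(0)
        h = enc_out.new_zeros(batch, self.decoder.hidden_size)
        c = enc_out.new_zeros(batch, self.decoder.hidden_size)
        context = enc_out.new_zeros(batch, enc_out.size(-1))
        logits = []
        for t in range(tgt_ids.size(1)):                             # teacher forcing over the answer
            emb = self.embedding(tgt_ids[:, t])                      # (B, E)
            h, c = self.decoder(torch.cat([emb, context], dim=-1), (h, c))
            # Attention: weight every encoder hidden state for this decoding step.
            scores = torch.bmm(enc_out, self.attn(h).unsqueeze(-1)).squeeze(-1)   # (B, S)
            weights = F.softmax(scores, dim=-1)
            context = torch.bmm(weights.unsqueeze(1), enc_out).squeeze(1)         # (B, 2H)
            logits.append(self.out(torch.cat([h, context], dim=-1)))
        return torch.stack(logits, dim=1)                            # (B, T, vocab)
```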

The specific implementation of the computer-readable storage medium of the present application is substantially the same as that of the chat response method and of the electronic device 1 described above, and is not repeated here.

It should be noted that, in this document, the terms "comprise", "include" and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, apparatus, article or method that comprises a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, apparatus, article or method. Absent further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, apparatus, article or method that comprises it.

From the description of the embodiments above, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the part of the technical solution of the present application that in essence contributes over the prior art may be embodied as a software product stored in a storage medium as described above and including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application.

The foregoing are only preferred embodiments of the present application and do not limit its patent scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present application.

Claims (20)

一种聊天应答方法,其特征在于,该方法包括:A chat response method, the method comprising: 预处理步骤:获取客户输入的会话问题,对所述会话问题进行预处理,得到会话问题的文本特征信息,所述文本特征信息包括各词条在所述会话问题中的词性、位置和词类归属信息,所述词类归属包括归属于关键词或命名实体;a pre-processing step: obtaining a session question input by the client, pre-processing the session problem, and obtaining text feature information of the session problem, where the text feature information includes part of speech, location, and part-of-speech attribution of each term in the conversation problem Information, the word class attribution includes belonging to a keyword or a named entity; 第一计算步骤:为问答知识库构建倒排索引,所述问答知识库包括预先整理的多个问题以及每个问题关联的一个或多个答案,根据所述文本特征信息,通过倒排索引查询的方式从问答知识库中查询与所述会话问题相关的候选问题集合,并分别计算所述会话问题与所述候选问题集合中每个候选问题的文本相似度;a first calculating step: constructing an inverted index for the question and answer knowledge base, the question and answer knowledge base including a plurality of questions arranged in advance and one or more answers associated with each question, and querying through the inverted index according to the text feature information Means querying a candidate question set related to the conversation problem from a question and answer knowledge base, and separately calculating text similarity between the conversation problem and each candidate question in the candidate question set; 问题检索步骤:根据预设规则及所述文本相似度,判断候选问题集合中是否存在所述会话问题的近似问题,若所述候选问题集合中存在所述会话问题的近似问题,则在问答知识库中查找该近似问题的关联答案,将所述关联答案作为所述会话问题的目标答案输出;a problem retrieval step: determining, according to a preset rule and the text similarity, whether there is an approximation problem of the conversation problem in the candidate question set, and if there is an approximation problem of the conversation problem in the candidate question set, then the question and answer knowledge Finding an associated answer of the approximate problem in the library, and outputting the associated answer as a target answer of the conversation question; 第二计算步骤:若所述候选问题集合中不存在所述会话问题的近似问题,则根据所述文本特征信息,通过倒排索引查询的方式从问答知识库中查询与所述会话问题相关的候选答案集合,并分别计算所述会话问题与所述候选答案集合中每个候选答案的主题相似度;a second calculating step: if there is no approximation problem of the session problem in the candidate question set, querying, according to the text feature information, a query related to the session problem from the Q&A knowledge base by means of an inverted index query a set of candidate answers, and separately calculating a topic similarity of the conversation question and each candidate answer in the candidate answer set; 答案检索步骤:根据预设规则及所述主题相似度,判断候选答案集合中是否存在所述会话问题的近似答案,若所述候选答案集合中存在所述会话问题的近似答案,则将所述近似答案作为所述会话问题的目标答案输出;An answer retrieval step: determining, according to a preset rule and the topic similarity, whether an approximate answer of the conversation question exists in the candidate answer set, and if an approximate answer of the conversation problem exists in the candidate answer set, The approximate answer is output as the target answer to the conversation question; 答案预测步骤:若候选答案集合中不存在所述会话问题的近似答案,则通过seq2seq模型对所述问答知识库中的各个问题和答案进行编码和解码的迭代训练,从而构建序列预测模型,将所述会话问题输入所述序列预测模型生成应变答案,将所述应变答案作为所述会话问题的目标答案输出。An answer prediction step: if an approximate answer of the conversation problem does not exist in the candidate answer set, iteratively trains and decodes each question and answer in the question and answer knowledge base through the seq2seq model, thereby constructing a sequence prediction model, The conversation question is input to the sequence prediction model to generate a strain answer, and the strain answer is output as a target answer of the conversation question. 
如权利要求1所述的聊天应答方法,其特征在于,所述对所述会话问题进行预处理包括:The chat response method according to claim 1, wherein the preprocessing the session problem comprises: 对所述会话问题进行分词处理,从而切分出会话问题的各词条,所述分词处理的方法包括基于词典进行正向最大匹配和/或基于词典进行逆向最大匹配;Performing word segmentation on the conversation problem, thereby segmenting the terms of the conversation problem, the method of word segmentation includes performing forward maximum matching based on the dictionary and/or performing reverse maximum matching based on the dictionary; 对经所述分词处理得到的各词条进行词性解析,并对各词条的词性进行标注,所述词性解析通过经预设大规模语料库训练得到的词性标注模型实现;Performing part-of-speech analysis on each term obtained by the word segmentation process, and labeling the part-of-speech of each term, and the part-of-speech analysis is realized by a part-of-speech tagging model obtained through a preset large-scale corpus training; 对所述会话问题进行命名实体识别,从而识别出具有特定意义的命名实体,所述命名实体包括人名、地名、组织机构、专有名词,所述命名实体识别的方法包括基于词典和规则的方法,以及基于统计学习的方法;Named entity identification is performed on the session problem, thereby identifying a named entity having a specific meaning, the named entity includes a person name, a place name, an organization, a proper noun, and the method for identifying the named entity includes a dictionary-based and rule-based method And methods based on statistical learning; 根据所述各词条以及所述命名实体,从所述会话问题中提取关键词,所述关键词为字符数量多于第一预设阈值的词组,或者为存在于预设词典中的 命名实体,所述预设词典包括业务场景专有词典。Extracting a keyword from the conversation question according to the term and the named entity, the keyword being a phrase whose number of characters is greater than a first preset threshold, or a named entity existing in a preset dictionary The preset dictionary includes a business scenario-specific dictionary. 如权利要求1所述的聊天应答方法,其特征在于,所述分别计算所述会话问题与所述候选问题集合中每个候选问题的文本相似度包括:The chat response method according to claim 1, wherein the calculating the text similarity of each of the session problem and each candidate question in the candidate question set separately comprises: 构建卷积神经网络,通过所述卷积神经网络对所述问答知识库中的所有问题语句进行样本训练,得到所述问答知识库中问题语句对应的卷积神经网络模型;Constructing a convolutional neural network, and performing sample training on all problem sentences in the question and answer knowledge base through the convolutional neural network, and obtaining a convolutional neural network model corresponding to the problem statement in the question and answer knowledge base; 将所述会话问题和所述候选问题集合中的每个候选问题分别输入所述卷积神经网络模型,通过所述卷积神经网络模型的卷积核卷积得到所述会话问题和所述候选问题集合中的每个候选问题各自对应的特征向量;Entering each of the conversation problem and the candidate question set into the convolutional neural network model, respectively, and obtaining the conversation problem and the candidate by convolution kernel convolution of the convolutional neural network model a feature vector corresponding to each candidate question in the question set; 分别计算所述会话问题对应的特征向量与所述候选问题集合中的每个候选问题对应的特征向量之间的余弦距离,从而得到所述会话问题与所述候选问题集合中每个候选问题的文本相似度;Calculating a cosine distance between the feature vector corresponding to the session problem and the feature vector corresponding to each candidate question in the candidate question set, respectively, to obtain the session problem and each candidate problem in the candidate question set Text similarity 所述分别计算所述会话问题与所述候选答案集合中每个候选答案的主题相似度包括:The calculating the similarity degree between the conversation problem and each candidate answer in the candidate answer set respectively includes: 采用线性判别分析模型分别提取所述会话问题和所述候选答案集合中每个候选答案的主题向量;Extracting the conversation problem and the topic vector of each candidate answer in the candidate answer set by using a linear discriminant analysis model; 
分别计算所述会话问题的主题向量与所述候选答案集合中每个候选答案的主题向量之间的余弦距离,从而得到所述会话问题与所述候选答案集合中每个候选答案的主题相似度。Calculating a cosine distance between a topic vector of the conversation question and a topic vector of each candidate answer in the candidate answer set, respectively, to obtain a topic similarity between the conversation question and each candidate answer in the candidate answer set . 如权利要求1所述的聊天应答方法,其特征在于,所述根据预设规则及所述问题相似度,判断候选问题集合中是否存在所述会话问题的近似问题包括:The chat response method according to claim 1, wherein the approximating the problem of whether the session problem exists in the candidate question set according to the preset rule and the problem similarity includes: 判断是否存在与会话问题的文本相似度大于第二预设阈值的候选问题,若是,则从所述与会话问题的文本相似度大于第二预设阈值的候选问题中选择最大文本相似度对应的候选问题作为所述近似问题;Determining whether there is a candidate problem that the text similarity with the conversation problem is greater than the second preset threshold, and if yes, selecting the maximum text similarity corresponding to the candidate problem that the text similarity with the conversation problem is greater than the second preset threshold Candidate questions as the approximation problem; 若不存在与会话问题的文本相似度大于第二预设阈值的候选问题,则判定所述候选问题集合中不存在所述会话问题的近似问题;If there is no candidate problem that the text similarity with the conversation problem is greater than the second preset threshold, determining that there is no approximation problem of the conversation problem in the candidate problem set; 所述根据预设规则及所述主题相似度,判断候选答案集合中是否存在所述会话问题的近似答案包括:The determining, according to the preset rule and the topic similarity, determining whether the candidate answer exists in the candidate answer set includes: 判断是否存在与会话问题的主题相似度大于第三预设阈值的候选答案,若是,则从所述与会话问题的主题相似度大于第三预设阈值的候选答案中选择最大主题相似度对应的候选答案作为所述近似答案;Determining whether there is a candidate answer whose topic similarity to the conversation problem is greater than a third preset threshold, and if yes, selecting a maximum topic similarity corresponding to the candidate answers having the topic similarity of the conversation problem being greater than the third preset threshold a candidate answer as the approximate answer; 若不存在与会话问题的主题相似度大于第三预设阈值的候选答案,则判定所述候选答案集合中不存在所述会话问题的近似答案。If there is no candidate answer with the topic similarity of the conversation problem being greater than the third preset threshold, it is determined that the approximate answer of the conversation question does not exist in the candidate answer set. 如权利要求1所述的聊天应答方法,其特征在于,所述为问答知识库构建倒排索引包括:The chat response method according to claim 1, wherein the constructing the inverted index for the question and answer knowledge base comprises: 对问答知识库中的每个问题和答案分别进行分词、词性标注、关键词提取、关键词出现位置记录、分配ID号的操作,以及为每个问题和答案分词后得到的各词条分配ID号;Each question and answer in the Q&A knowledge base is divided into word segmentation, part-of-speech tagging, keyword extraction, keyword location location record, assignment ID number, and ID assigned to each term after each word and answer segmentation. number; 对问答知识库中每个问题和答案根据相应的ID号进行排序,对所述每个问题和答案分词后得到的各词条根据相应的ID号进行排序,并将具有同一词条ID的所有问题ID和答案ID放到该词条对应的倒排记录表中;Each question and answer in the Q&A knowledge base is sorted according to the corresponding ID number, and each term obtained after each word and answer word segmentation is sorted according to the corresponding ID number, and all the items with the same item ID are The question ID and the answer ID are placed in the inverted record table corresponding to the entry; 将所有倒排记录表合并为最终的倒排索引。Combine all inverted record tables into the final inverted index. 
如权利要求1所述的聊天应答方法,其特征在于,所述seq2seq模型由用于进行所述编码和解码迭代训练的前向长短记忆网络LSTM模型和后向LSTM模型,以及用于计算每次编码和解码的隐藏层信息权重的注意力机制构成。The chat response method according to claim 1, wherein said seq2seq model is used by a forward-long memory network LSTM model and a backward LSTM model for performing said encoding and decoding iterative training, and for calculating each time The attention and mechanism of the hidden layer information weighting of encoding and decoding. 如权利要求2-5任一项所述的聊天应答方法,其特征在于,所述seq2seq模型由用于进行所述编码和解码迭代训练的前向长短记忆网络LSTM模型和后向LSTM模型,以及用于计算每次编码和解码的隐藏层信息权重的注意力机制构成。A chat response method according to any one of claims 2 to 5, wherein said seq2seq model is composed of a forward long and short memory network LSTM model and a backward LSTM model for performing said coding and decoding iterative training, and A attention mechanism for calculating the weight of hidden layer information for each encoding and decoding. 一种电子装置,包括存储器和处理器,其特征在于,所述存储器中包括聊天应答程序,该聊天应答程序被所述处理器执行时实现如下步骤:An electronic device includes a memory and a processor, wherein the memory includes a chat response program, and the chat response program is executed by the processor to implement the following steps: 预处理步骤:获取客户输入的会话问题,对所述会话问题进行预处理,得到会话问题的文本特征信息,所述文本特征信息包括各词条在所述会话问题中的词性、位置和词类归属信息,所述词类归属包括归属于关键词或命名实体;a pre-processing step: obtaining a session question input by the client, pre-processing the session problem, and obtaining text feature information of the session problem, where the text feature information includes part of speech, location, and part-of-speech attribution of each term in the conversation problem Information, the word class attribution includes belonging to a keyword or a named entity; 第一计算步骤:为问答知识库构建倒排索引,所述问答知识库包括预先整理的多个问题以及每个问题关联的一个或多个答案,根据所述文本特征信息,通过倒排索引查询的方式从问答知识库中查询与所述会话问题相关的候选问题集合,并分别计算所述会话问题与所述候选问题集合中每个候选问题的文本相似度;a first calculating step: constructing an inverted index for the question and answer knowledge base, the question and answer knowledge base including a plurality of questions arranged in advance and one or more answers associated with each question, and querying through the inverted index according to the text feature information Means querying a candidate question set related to the conversation problem from a question and answer knowledge base, and separately calculating text similarity between the conversation problem and each candidate question in the candidate question set; 问题检索步骤:根据预设规则及所述文本相似度,判断候选问题集合中是否存在所述会话问题的近似问题,若所述候选问题集合中存在所述会话问题的近似问题,则在问答知识库中查找该近似问题的关联答案,将所述关联答案作为所述会话问题的目标答案输出;a problem retrieval step: determining, according to a preset rule and the text similarity, whether there is an approximation problem of the conversation problem in the candidate question set, and if there is an approximation problem of the conversation problem in the candidate question set, then the question and answer knowledge Finding an associated answer of the approximate problem in the library, and outputting the associated answer as a target answer of the conversation question; 第二计算步骤:若所述候选问题集合中不存在所述会话问题的近似问题,则根据所述文本特征信息,通过倒排索引查询的方式从问答知识库中查询与所述会话问题相关的候选答案集合,并分别计算所述会话问题与所述候选答案集合中每个候选答案的主题相似度;a second calculating step: if there is no approximation problem of the session problem in the candidate question set, querying, according to the text feature information, a query related to the session problem from the Q&A knowledge base by means of an inverted index query a set of candidate answers, and separately calculating a topic similarity of the conversation question and each candidate answer in the candidate answer set; 
答案检索步骤:根据预设规则及所述主题相似度,判断候选答案集合中是否存在所述会话问题的近似答案,若所述候选答案集合中存在所述会话问题的近似答案,则将所述近似答案作为所述会话问题的目标答案输出;An answer retrieval step: determining, according to a preset rule and the topic similarity, whether an approximate answer of the conversation question exists in the candidate answer set, and if an approximate answer of the conversation problem exists in the candidate answer set, The approximate answer is output as the target answer to the conversation question; 答案预测步骤:若候选答案集合中不存在所述会话问题的近似答案,则通过seq2seq模型对所述问答知识库中的各个问题和答案进行编码和解码的迭代训练,从而构建序列预测模型,将所述会话问题输入所述序列预测模型生成应变答案,将所述应变答案作为所述会话问题的目标答案输出。An answer prediction step: if an approximate answer of the conversation problem does not exist in the candidate answer set, iteratively trains and decodes each question and answer in the question and answer knowledge base through the seq2seq model, thereby constructing a sequence prediction model, The conversation question is input to the sequence prediction model to generate a strain answer, and the strain answer is output as a target answer of the conversation question. 如权利要求8所述的电子装置,其特征在于,所述对所述会话问题进 行预处理包括:The electronic device of claim 8 wherein said pre-processing said session problem comprises: 对所述会话问题进行分词处理,从而切分出会话问题的各词条,所述分词处理的方法包括基于词典进行正向最大匹配和/或基于词典进行逆向最大匹配;Performing word segmentation on the conversation problem, thereby segmenting the terms of the conversation problem, the method of word segmentation includes performing forward maximum matching based on the dictionary and/or performing reverse maximum matching based on the dictionary; 对经所述分词处理得到的各词条进行词性解析,并对各词条的词性进行标注,所述词性解析通过经预设大规模语料库训练得到的词性标注模型实现;Performing part-of-speech analysis on each term obtained by the word segmentation process, and labeling the part-of-speech of each term, and the part-of-speech analysis is realized by a part-of-speech tagging model obtained through a preset large-scale corpus training; 对所述会话问题进行命名实体识别,从而识别出具有特定意义的命名实体,所述命名实体包括人名、地名、组织机构、专有名词,所述命名实体识别的方法包括基于词典和规则的方法,以及基于统计学习的方法;Named entity identification is performed on the session problem, thereby identifying a named entity having a specific meaning, the named entity includes a person name, a place name, an organization, a proper noun, and the method for identifying the named entity includes a dictionary-based and rule-based method And methods based on statistical learning; 根据所述各词条以及所述命名实体,从所述会话问题中提取关键词,所述关键词为字符数量多于第一预设阈值的词组,或者为存在于预设词典中的命名实体,所述预设词典包括业务场景专有词典。Extracting a keyword from the conversation question according to the term and the named entity, the keyword being a phrase whose number of characters is greater than a first preset threshold, or a named entity existing in a preset dictionary The preset dictionary includes a business scenario-specific dictionary. 
如权利要求9所述的电子装置,其特征在于,所述分别计算所述会话问题与所述候选问题集合中每个候选问题的文本相似度包括:The electronic device according to claim 9, wherein the calculating the text similarity of each of the conversation problem and each candidate question in the candidate question set respectively comprises: 构建卷积神经网络,通过所述卷积神经网络对所述问答知识库中的所有问题语句进行样本训练,得到所述问答知识库中问题语句对应的卷积神经网络模型;Constructing a convolutional neural network, and performing sample training on all problem sentences in the question and answer knowledge base through the convolutional neural network, and obtaining a convolutional neural network model corresponding to the problem statement in the question and answer knowledge base; 将所述会话问题和所述候选问题集合中的每个候选问题分别输入所述卷积神经网络模型,通过所述卷积神经网络模型的卷积核卷积得到所述会话问题和所述候选问题集合中的每个候选问题各自对应的特征向量;Entering each of the conversation problem and the candidate question set into the convolutional neural network model, respectively, and obtaining the conversation problem and the candidate by convolution kernel convolution of the convolutional neural network model a feature vector corresponding to each candidate question in the question set; 分别计算所述会话问题对应的特征向量与所述候选问题集合中的每个候选问题对应的特征向量之间的余弦距离,从而得到所述会话问题与所述候选问题集合中每个候选问题的文本相似度;Calculating a cosine distance between the feature vector corresponding to the session problem and the feature vector corresponding to each candidate question in the candidate question set, respectively, to obtain the session problem and each candidate problem in the candidate question set Text similarity 所述分别计算所述会话问题与所述候选答案集合中每个候选答案的主题相似度包括:The calculating the similarity degree between the conversation problem and each candidate answer in the candidate answer set respectively includes: 采用线性判别分析模型分别提取所述会话问题和所述候选答案集合中每个候选答案的主题向量;Extracting the conversation problem and the topic vector of each candidate answer in the candidate answer set by using a linear discriminant analysis model; 分别计算所述会话问题的主题向量与所述候选答案集合中每个候选答案的主题向量之间的余弦距离,从而得到所述会话问题与所述候选答案集合中每个候选答案的主题相似度。Calculating a cosine distance between a topic vector of the conversation question and a topic vector of each candidate answer in the candidate answer set, respectively, to obtain a topic similarity between the conversation question and each candidate answer in the candidate answer set . 
如权利要求8所述的电子装置,其特征在于,所述根据预设规则及所述问题相似度,判断候选问题集合中是否存在所述会话问题的近似问题包括:The electronic device according to claim 8, wherein the approximating the problem of whether the session problem exists in the candidate question set according to the preset rule and the problem similarity includes: 判断是否存在与会话问题的文本相似度大于第二预设阈值的候选问题,若是,则从所述与会话问题的文本相似度大于第二预设阈值的候选问题中选择最大文本相似度对应的候选问题作为所述近似问题;Determining whether there is a candidate problem that the text similarity with the conversation problem is greater than the second preset threshold, and if yes, selecting the maximum text similarity corresponding to the candidate problem that the text similarity with the conversation problem is greater than the second preset threshold Candidate questions as the approximation problem; 若不存在与会话问题的文本相似度大于第二预设阈值的候选问题,则判定所述候选问题集合中不存在所述会话问题的近似问题;If there is no candidate problem that the text similarity with the conversation problem is greater than the second preset threshold, determining that there is no approximation problem of the conversation problem in the candidate problem set; 所述根据预设规则及所述主题相似度,判断候选答案集合中是否存在所 述会话问题的近似答案包括:According to the preset rule and the topic similarity, determining whether the candidate answer has an approximate answer of the conversation problem includes: 判断是否存在与会话问题的主题相似度大于第三预设阈值的候选答案,若是,则从所述与会话问题的主题相似度大于第三预设阈值的候选答案中选择最大主题相似度对应的候选答案作为所述近似答案;Determining whether there is a candidate answer whose topic similarity to the conversation problem is greater than a third preset threshold, and if yes, selecting a maximum topic similarity corresponding to the candidate answers having the topic similarity of the conversation problem being greater than the third preset threshold a candidate answer as the approximate answer; 若不存在与会话问题的主题相似度大于第三预设阈值的候选答案,则判定所述候选答案集合中不存在所述会话问题的近似答案。If there is no candidate answer with the topic similarity of the conversation problem being greater than the third preset threshold, it is determined that the approximate answer of the conversation question does not exist in the candidate answer set. 如权利要求8所述的电子装置,其特征在于,所述为问答知识库构建倒排索引包括:The electronic device according to claim 8, wherein the constructing the inverted index for the Q&A knowledge base comprises: 对问答知识库中的每个问题和答案分别进行分词、词性标注、关键词提取、关键词出现位置记录、分配ID号的操作,以及为每个问题和答案分词后得到的各词条分配ID号;Each question and answer in the Q&A knowledge base is divided into word segmentation, part-of-speech tagging, keyword extraction, keyword location location record, assignment ID number, and ID assigned to each term after each word and answer segmentation. number; 对问答知识库中每个问题和答案根据相应的ID号进行排序,对所述每个问题和答案分词后得到的各词条根据相应的ID号进行排序,并将具有同一词条ID的所有问题ID和答案ID放到该词条对应的倒排记录表中;Each question and answer in the Q&A knowledge base is sorted according to the corresponding ID number, and each term obtained after each word and answer word segmentation is sorted according to the corresponding ID number, and all the items with the same item ID are The question ID and the answer ID are placed in the inverted record table corresponding to the entry; 将所有倒排记录表合并为最终的倒排索引。Combine all inverted record tables into the final inverted index. 如权利要求8所述的电子装置,其特征在于,所述seq2seq模型由用于进行所述编码和解码迭代训练的前向长短记忆网络LSTM模型和后向LSTM模型,以及用于计算每次编码和解码的隐藏层信息权重的注意力机制构成。The electronic device of claim 8 wherein said seq2seq model is comprised of a forward long and short memory network LSTM model and a backward LSTM model for performing said encoding and decoding iterative training, and for calculating each encoding And the attention mechanism of the decoded hidden layer information weights. 
如权利要求9-12任一项所述的电子装置,其特征在于,所述seq2seq模型由用于进行所述编码和解码迭代训练的前向长短记忆网络LSTM模型和后向LSTM模型,以及用于计算每次编码和解码的隐藏层信息权重的注意力机制构成。The electronic device according to any one of claims 9 to 12, wherein the seq2seq model is used by a forward-long memory network LSTM model and a backward LSTM model for performing the encoding and decoding iterative training, and The attention mechanism is constructed to calculate the hidden layer information weights for each encoding and decoding. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中包括聊天应答程序,所述聊天应答程序被处理器执行时,该聊天应答程序被所述处理器执行时实现如下步骤:A computer readable storage medium, comprising: a chat response program, wherein when the chat response program is executed by a processor, the chat response program is executed by the processor to implement the following steps : 预处理步骤:获取客户输入的会话问题,对所述会话问题进行预处理,得到会话问题的文本特征信息,所述文本特征信息包括各词条在所述会话问题中的词性、位置和词类归属信息,所述词类归属包括归属于关键词或命名实体;a pre-processing step: obtaining a session question input by the client, pre-processing the session problem, and obtaining text feature information of the session problem, where the text feature information includes part of speech, location, and part-of-speech attribution of each term in the conversation problem Information, the word class attribution includes belonging to a keyword or a named entity; 第一计算步骤:为问答知识库构建倒排索引,所述问答知识库包括预先整理的多个问题以及每个问题关联的一个或多个答案,根据所述文本特征信息,通过倒排索引查询的方式从问答知识库中查询与所述会话问题相关的候选问题集合,并分别计算所述会话问题与所述候选问题集合中每个候选问题的文本相似度;a first calculating step: constructing an inverted index for the question and answer knowledge base, the question and answer knowledge base including a plurality of questions arranged in advance and one or more answers associated with each question, and querying through the inverted index according to the text feature information Means querying a candidate question set related to the conversation problem from a question and answer knowledge base, and separately calculating text similarity between the conversation problem and each candidate question in the candidate question set; 问题检索步骤:根据预设规则及所述文本相似度,判断候选问题集合中是否存在所述会话问题的近似问题,若所述候选问题集合中存在所述会话问题的近似问题,则在问答知识库中查找该近似问题的关联答案,将所述关联答案作为所述会话问题的目标答案输出;a problem retrieval step: determining, according to a preset rule and the text similarity, whether there is an approximation problem of the conversation problem in the candidate question set, and if there is an approximation problem of the conversation problem in the candidate question set, then the question and answer knowledge Finding an associated answer of the approximate problem in the library, and outputting the associated answer as a target answer of the conversation question; 第二计算步骤:若所述候选问题集合中不存在所述会话问题的近似问题, 则根据所述文本特征信息,通过倒排索引查询的方式从问答知识库中查询与所述会话问题相关的候选答案集合,并分别计算所述会话问题与所述候选答案集合中每个候选答案的主题相似度;a second calculating step: if there is no approximation problem of the session problem in the candidate question set, querying, according to the text feature information, a query related to the session problem from the Q&A knowledge base by means of an inverted index query a set of candidate answers, and separately calculating a topic similarity of the conversation question and each candidate answer in the candidate answer set; 答案检索步骤:根据预设规则及所述主题相似度,判断候选答案集合中是否存在所述会话问题的近似答案,若所述候选答案集合中存在所述会话问题的近似答案,则将所述近似答案作为所述会话问题的目标答案输出;An answer retrieval step: determining, according to a preset rule and the topic similarity, whether an approximate answer of the conversation question exists in the candidate answer set, and if an approximate answer of the conversation problem exists in the candidate answer set, The approximate answer is 
output as the target answer to the conversation question; 答案预测步骤:若候选答案集合中不存在所述会话问题的近似答案,则通过seq2seq模型对所述问答知识库中的各个问题和答案进行编码和解码的迭代训练,从而构建序列预测模型,将所述会话问题输入所述序列预测模型生成应变答案,将所述应变答案作为所述会话问题的目标答案输出。An answer prediction step: if an approximate answer of the conversation problem does not exist in the candidate answer set, iteratively trains and decodes each question and answer in the question and answer knowledge base through the seq2seq model, thereby constructing a sequence prediction model, The conversation question is input to the sequence prediction model to generate a strain answer, and the strain answer is output as a target answer of the conversation question. 如权利要求15所述的计算机可读存储介质,其特征在于,所述对所述会话问题进行预处理包括:The computer readable storage medium of claim 15 wherein said preprocessing said session problem comprises: 对所述会话问题进行分词处理,从而切分出会话问题的各词条,所述分词处理的方法包括基于词典进行正向最大匹配和/或基于词典进行逆向最大匹配;Performing word segmentation on the conversation problem, thereby segmenting the terms of the conversation problem, the method of word segmentation includes performing forward maximum matching based on the dictionary and/or performing reverse maximum matching based on the dictionary; 对经所述分词处理得到的各词条进行词性解析,并对各词条的词性进行标注,所述词性解析通过经预设大规模语料库训练得到的词性标注模型实现;Performing part-of-speech analysis on each term obtained by the word segmentation process, and labeling the part-of-speech of each term, and the part-of-speech analysis is realized by a part-of-speech tagging model obtained through a preset large-scale corpus training; 对所述会话问题进行命名实体识别,从而识别出具有特定意义的命名实体,所述命名实体包括人名、地名、组织机构、专有名词,所述命名实体识别的方法包括基于词典和规则的方法,以及基于统计学习的方法;Named entity identification is performed on the session problem, thereby identifying a named entity having a specific meaning, the named entity includes a person name, a place name, an organization, a proper noun, and the method for identifying the named entity includes a dictionary-based and rule-based method And methods based on statistical learning; 根据所述各词条以及所述命名实体,从所述会话问题中提取关键词,所述关键词为字符数量多于第一预设阈值的词组,或者为存在于预设词典中的命名实体,所述预设词典包括业务场景专有词典。Extracting a keyword from the conversation question according to the term and the named entity, the keyword being a phrase whose number of characters is greater than a first preset threshold, or a named entity existing in a preset dictionary The preset dictionary includes a business scenario-specific dictionary. 
如权利要求16所述的计算机可读存储介质,其特征在于,所述分别计算所述会话问题与所述候选问题集合中每个候选问题的文本相似度包括:The computer readable storage medium of claim 16, wherein the calculating the text similarity of the session question and each candidate question in the candidate question set respectively comprises: 构建卷积神经网络,通过所述卷积神经网络对所述问答知识库中的所有问题语句进行样本训练,得到所述问答知识库中问题语句对应的卷积神经网络模型;Constructing a convolutional neural network, and performing sample training on all problem sentences in the question and answer knowledge base through the convolutional neural network, and obtaining a convolutional neural network model corresponding to the problem statement in the question and answer knowledge base; 将所述会话问题和所述候选问题集合中的每个候选问题分别输入所述卷积神经网络模型,通过所述卷积神经网络模型的卷积核卷积得到所述会话问题和所述候选问题集合中的每个候选问题各自对应的特征向量;Entering each of the conversation problem and the candidate question set into the convolutional neural network model, respectively, and obtaining the conversation problem and the candidate by convolution kernel convolution of the convolutional neural network model a feature vector corresponding to each candidate question in the question set; 分别计算所述会话问题对应的特征向量与所述候选问题集合中的每个候选问题对应的特征向量之间的余弦距离,从而得到所述会话问题与所述候选问题集合中每个候选问题的文本相似度;Calculating a cosine distance between the feature vector corresponding to the session problem and the feature vector corresponding to each candidate question in the candidate question set, respectively, to obtain the session problem and each candidate problem in the candidate question set Text similarity 所述分别计算所述会话问题与所述候选答案集合中每个候选答案的主题相似度包括:The calculating the similarity degree between the conversation problem and each candidate answer in the candidate answer set respectively includes: 采用线性判别分析模型分别提取所述会话问题和所述候选答案集合中每个候选答案的主题向量;Extracting the conversation problem and the topic vector of each candidate answer in the candidate answer set by using a linear discriminant analysis model; 分别计算所述会话问题的主题向量与所述候选答案集合中每个候选答案 的主题向量之间的余弦距离,从而得到所述会话问题与所述候选答案集合中每个候选答案的主题相似度。Calculating a cosine distance between a topic vector of the conversation question and a topic vector of each candidate answer in the candidate answer set, respectively, to obtain a topic similarity between the conversation question and each candidate answer in the candidate answer set . 
如权利要求15所述的计算机可读存储介质,其特征在于,所述根据预设规则及所述问题相似度,判断候选问题集合中是否存在所述会话问题的近似问题包括:The computer readable storage medium according to claim 15, wherein the approximating the problem of whether the session problem exists in the candidate question set according to the preset rule and the problem similarity includes: 判断是否存在与会话问题的文本相似度大于第二预设阈值的候选问题,若是,则从所述与会话问题的文本相似度大于第二预设阈值的候选问题中选择最大文本相似度对应的候选问题作为所述近似问题;Determining whether there is a candidate problem that the text similarity with the conversation problem is greater than the second preset threshold, and if yes, selecting the maximum text similarity corresponding to the candidate problem that the text similarity with the conversation problem is greater than the second preset threshold Candidate questions as the approximation problem; 若不存在与会话问题的文本相似度大于第二预设阈值的候选问题,则判定所述候选问题集合中不存在所述会话问题的近似问题;If there is no candidate problem that the text similarity with the conversation problem is greater than the second preset threshold, determining that there is no approximation problem of the conversation problem in the candidate problem set; 所述根据预设规则及所述主题相似度,判断候选答案集合中是否存在所述会话问题的近似答案包括:The determining, according to the preset rule and the topic similarity, determining whether the candidate answer exists in the candidate answer set includes: 判断是否存在与会话问题的主题相似度大于第三预设阈值的候选答案,若是,则从所述与会话问题的主题相似度大于第三预设阈值的候选答案中选择最大主题相似度对应的候选答案作为所述近似答案;Determining whether there is a candidate answer whose topic similarity to the conversation problem is greater than a third preset threshold, and if yes, selecting a maximum topic similarity corresponding to the candidate answers having the topic similarity of the conversation problem being greater than the third preset threshold a candidate answer as the approximate answer; 若不存在与会话问题的主题相似度大于第三预设阈值的候选答案,则判定所述候选答案集合中不存在所述会话问题的近似答案。If there is no candidate answer with the topic similarity of the conversation problem being greater than the third preset threshold, it is determined that the approximate answer of the conversation question does not exist in the candidate answer set. 如权利要求15所述的计算机可读存储介质,其特征在于,所述为问答知识库构建倒排索引包括:The computer readable storage medium of claim 15, wherein the constructing the inverted index for the question and answer knowledge base comprises: 对问答知识库中的每个问题和答案分别进行分词、词性标注、关键词提取、关键词出现位置记录、分配ID号的操作,以及为每个问题和答案分词后得到的各词条分配ID号;Each question and answer in the Q&A knowledge base is divided into word segmentation, part-of-speech tagging, keyword extraction, keyword location location record, assignment ID number, and ID assigned to each term after each word and answer segmentation. number; 对问答知识库中每个问题和答案根据相应的ID号进行排序,对所述每个问题和答案分词后得到的各词条根据相应的ID号进行排序,并将具有同一词条ID的所有问题ID和答案ID放到该词条对应的倒排记录表中;Each question and answer in the Q&A knowledge base is sorted according to the corresponding ID number, and each term obtained after each word and answer word segmentation is sorted according to the corresponding ID number, and all the items with the same item ID are The question ID and the answer ID are placed in the inverted record table corresponding to the entry; 将所有倒排记录表合并为最终的倒排索引。Combine all inverted record tables into the final inverted index. 
如权利要求15所述的计算机可读存储介质,其特征在于,所述seq2seq模型由用于进行所述编码和解码迭代训练的前向长短记忆网络LSTM模型和后向LSTM模型,以及用于计算每次编码和解码的隐藏层信息权重的注意力机制构成。The computer readable storage medium of claim 15 wherein said seq2seq model is comprised of a forward long and short memory network LSTM model and a backward LSTM model for performing said encoding and decoding iterative training, and for computing Each time the encoding and decoding of the hidden layer information weights attention mechanism is constructed.
PCT/CN2018/090643 2018-02-09 2018-06-11 Chat response method, electronic device and storage medium Ceased WO2019153613A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810135747.6A CN108491433B (en) 2018-02-09 2018-02-09 Chat answering method, electronic device and storage medium
CN201810135747.6 2018-02-09

Publications (1)

Publication Number Publication Date
WO2019153613A1 true WO2019153613A1 (en) 2019-08-15

Family

ID=63340316

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/090643 Ceased WO2019153613A1 (en) 2018-02-09 2018-06-11 Chat response method, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN108491433B (en)
WO (1) WO2019153613A1 (en)

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502752A (en) * 2019-08-21 2019-11-26 北京一链数云科技有限公司 A kind of text handling method, device, equipment and computer storage medium
CN111090721A (en) * 2019-11-25 2020-05-01 出门问问(苏州)信息科技有限公司 Question answering method and device and electronic equipment
CN111177339A (en) * 2019-12-06 2020-05-19 百度在线网络技术(北京)有限公司 Dialog generation method and device, electronic equipment and storage medium
CN111177336A (en) * 2019-11-30 2020-05-19 西安华为技术有限公司 Method and device for determining response information
CN111291170A (en) * 2020-01-20 2020-06-16 腾讯科技(深圳)有限公司 Session recommendation method based on intelligent customer service and related device
CN111428019A (en) * 2020-04-02 2020-07-17 出门问问信息科技有限公司 Data processing method and equipment for knowledge base question answering
CN111538803A (en) * 2020-04-20 2020-08-14 京东方科技集团股份有限公司 Method, device, equipment and medium for acquiring candidate question text to be matched
CN111597321A (en) * 2020-07-08 2020-08-28 腾讯科技(深圳)有限公司 Question answer prediction method and device, storage medium and electronic equipment
CN111625635A (en) * 2020-05-27 2020-09-04 北京百度网讯科技有限公司 Question-answer processing method, language model training method, device, equipment and storage medium
CN111737401A (en) * 2020-06-22 2020-10-02 首都师范大学 A Keyword Group Prediction Method Based on Seq2set2seq Framework
CN111753062A (en) * 2019-11-06 2020-10-09 北京京东尚科信息技术有限公司 A method, apparatus, device and medium for determining a session response scheme
CN112184021A (en) * 2020-09-28 2021-01-05 中国人民解放军国防科技大学 Answer quality evaluation method based on similar support set
CN112232053A (en) * 2020-09-16 2021-01-15 西北大学 A text similarity calculation system, method, and storage medium based on multi-keyword pair matching
CN112330387A (en) * 2020-09-29 2021-02-05 重庆锐云科技有限公司 Virtual broker applied to house-watching software
CN112749260A (en) * 2019-10-31 2021-05-04 阿里巴巴集团控股有限公司 Information interaction method, device, equipment and medium
CN113076409A (en) * 2021-04-20 2021-07-06 上海景吾智能科技有限公司 Dialogue system and method applied to robot, robot and readable medium
CN113127613A (en) * 2020-01-10 2021-07-16 北京搜狗科技发展有限公司 Chat information processing method and device
CN113743124A (en) * 2021-08-25 2021-12-03 南京星云数字技术有限公司 Intelligent question-answer exception processing method and device and electronic equipment
CN113761986A (en) * 2020-06-05 2021-12-07 阿里巴巴集团控股有限公司 Text acquisition, live broadcast method, device and storage medium
CN114328796A (en) * 2021-08-19 2022-04-12 腾讯科技(深圳)有限公司 Question and answer index generation method, question and answer model processing method, device and storage medium
CN114443818A (en) * 2022-01-30 2022-05-06 天津大学 Dialogue type knowledge base question-answer implementation method
CN114491046A (en) * 2022-02-14 2022-05-13 中国工商银行股份有限公司 Information interaction method based on language model, device and electronic device thereof
CN114490957A (en) * 2020-11-12 2022-05-13 中移物联网有限公司 Question answering method, apparatus and computer readable storage medium
CN114579729A (en) * 2022-05-09 2022-06-03 南京云问网络技术有限公司 FAQ question-answer matching method and system fusing multi-algorithm model
CN114638236A (en) * 2022-03-30 2022-06-17 政采云有限公司 Intelligent question answering method, device, equipment and computer readable storage medium
CN114661883A (en) * 2022-03-31 2022-06-24 北京金山数字娱乐科技有限公司 Intelligent question and answer method and device and electronic equipment
CN114860898A (en) * 2022-03-25 2022-08-05 成都淞幸科技有限责任公司 A software development knowledge base construction and application method
CN115080720A (en) * 2022-06-29 2022-09-20 壹沓科技(上海)有限公司 Text processing method, device, equipment and medium based on RPA and AI
CN115129820A (en) * 2022-07-22 2022-09-30 宁波牛信网络科技有限公司 Similarity-based text feedback method and device
CN115221316A (en) * 2022-06-14 2022-10-21 科大讯飞华南人工智能研究院(广州)有限公司 Knowledge base processing, model training method, computer equipment and storage medium
CN116049376A (en) * 2023-03-31 2023-05-02 北京太极信息系统技术有限公司 Method, device and system for retrieving and replying information and creating knowledge
CN116226329A (en) * 2023-01-04 2023-06-06 国网河北省电力有限公司信息通信分公司 Intelligent retrieval method, device and terminal equipment for problems in the power grid field
CN116303981A (en) * 2023-05-23 2023-06-23 山东森普信息技术有限公司 Agricultural community knowledge question-answering method, device and storage medium
CN116795953A (en) * 2022-03-08 2023-09-22 腾讯科技(深圳)有限公司 Question-answer matching method and device, computer readable storage medium and computer equipment
CN116886656A (en) * 2023-09-06 2023-10-13 北京小糖科技有限责任公司 Chat room-oriented dance knowledge pushing method and device
CN117332789A (en) * 2023-12-01 2024-01-02 诺比侃人工智能科技(成都)股份有限公司 Semantic analysis method and system for dialogue scene
CN118350468A (en) * 2024-06-14 2024-07-16 杭州字节方舟科技有限公司 An AI dialogue method based on natural language processing
CN118606574A (en) * 2024-08-12 2024-09-06 杭州领信数科信息技术有限公司 Knowledge answering method, system, electronic device and storage medium based on large model
CN119294521A (en) * 2024-10-14 2025-01-10 四川开物信息技术有限公司 Intelligent question-answering system and question-answering method
CN119441431A (en) * 2024-10-25 2025-02-14 北京房多多信息技术有限公司 Data processing method, device, electronic device and storage medium
CN119621889A (en) * 2024-11-21 2025-03-14 之江实验室 A vertical knowledge question-answering method and device based on a large model
CN119719276A (en) * 2024-11-26 2025-03-28 陕西优百信息技术有限公司 Question answering method, device, storage medium and electronic device based on model knowledge base

Families Citing this family (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299250A (en) * 2018-09-14 2019-02-01 广州神马移动信息科技有限公司 Methods of exhibiting, device, storage medium and the electronic equipment of answer
CN110908663B (en) * 2018-09-18 2024-08-16 北京京东尚科信息技术有限公司 Positioning method and positioning device for business problem
US11514915B2 (en) * 2018-09-27 2022-11-29 Salesforce.Com, Inc. Global-to-local memory pointer networks for task-oriented dialogue
CN109344242B (en) * 2018-09-28 2021-10-01 广东工业大学 A dialogue question answering method, device, equipment and storage medium
CN109359182B (en) * 2018-10-08 2020-11-27 网宿科技股份有限公司 A method and device for answering
CN109543005A (en) * 2018-10-12 2019-03-29 平安科技(深圳)有限公司 The dialogue state recognition methods of customer service robot and device, equipment, storage medium
CN109299242A (en) * 2018-10-19 2019-02-01 武汉斗鱼网络科技有限公司 A kind of session generation method, device, terminal device and storage medium
CN111125320A (en) * 2018-10-31 2020-05-08 重庆小雨点小额贷款有限公司 Data processing method, device, server and computer readable storage medium
KR102201074B1 (en) * 2018-10-31 2021-01-08 서울대학교산학협력단 Method and system of goal-oriented dialog based on information theory
CN111159363A (en) * 2018-11-06 2020-05-15 航天信息股份有限公司 Knowledge base-based question answer determination method and device
CN109446314A (en) * 2018-11-14 2019-03-08 沈文策 A kind of customer service question processing method and device
CN109492085B (en) * 2018-11-15 2024-05-14 平安科技(深圳)有限公司 Answer determination method, device, terminal and storage medium based on data processing
CN109543017B (en) * 2018-11-21 2022-12-13 广州语义科技有限公司 Legal question keyword generation method and system
CN109492086B (en) * 2018-11-26 2022-01-21 出门问问创新科技有限公司 Answer output method and device, electronic equipment and storage medium
CN109726265A (en) * 2018-12-13 2019-05-07 深圳壹账通智能科技有限公司 Information processing method, device and computer-readable storage medium for assisting chat
CN109685462A (en) * 2018-12-21 2019-04-26 义橙网络科技(上海)有限公司 A kind of personnel and post matching method, apparatus, system, equipment and medium
CN109766421A (en) * 2018-12-28 2019-05-17 上海汇付数据服务有限公司 Intelligent Answer System and method
CN109829478B (en) * 2018-12-29 2024-05-07 平安科技(深圳)有限公司 Problem classification method and device based on variation self-encoder
CN109918560B (en) * 2019-01-09 2024-03-12 平安科技(深圳)有限公司 Question and answer method and device based on search engine
CN109885810A (en) * 2019-01-17 2019-06-14 平安城市建设科技(深圳)有限公司 Nan-machine interrogation's method, apparatus, equipment and storage medium based on semanteme parsing
CN109829046A (en) * 2019-01-18 2019-05-31 青牛智胜(深圳)科技有限公司 A kind of intelligence seat system and method
CN111611354B (en) * 2019-02-26 2023-09-29 北京嘀嘀无限科技发展有限公司 Man-machine conversation control method and device, server and readable storage medium
US11600389B2 (en) 2019-03-19 2023-03-07 Boe Technology Group Co., Ltd. Question generating method and apparatus, inquiring diagnosis system, and computer readable storage medium
CN111858859B (en) * 2019-04-01 2024-07-26 北京百度网讯科技有限公司 Automatic question-answering processing method, device, computer equipment and storage medium
CN111831132B (en) * 2019-04-19 2024-12-27 北京搜狗科技发展有限公司 Information recommendation method, device and electronic device
CN111858863B (en) * 2019-04-29 2023-07-14 深圳市优必选科技有限公司 Reply recommendation method, reply recommendation device and electronic equipment
CN110795542B (en) * 2019-08-28 2024-03-15 腾讯科技(深圳)有限公司 Dialogue method, related device and equipment
CN110765244B (en) * 2019-09-18 2023-06-06 平安科技(深圳)有限公司 Method, device, computer equipment and storage medium for obtaining answering operation
CN110781275B (en) * 2019-09-18 2022-05-10 中国电子科技集团公司第二十八研究所 Multi-feature-based question answerability discrimination method and computer storage medium
CN110781284B (en) * 2019-09-18 2024-05-28 平安科技(深圳)有限公司 Knowledge graph-based question and answer method, device and storage medium
CN110619038A (en) * 2019-09-20 2019-12-27 上海氦豚机器人科技有限公司 Method, system and electronic equipment for vertically guiding professional consultation
CN110737763A (en) * 2019-10-18 2020-01-31 成都华律网络服务有限公司 Chinese intelligent question-answering system and method integrating knowledge map and deep learning
CN111159331B (en) * 2019-11-14 2021-11-23 中国科学院深圳先进技术研究院 Text query method, text query device and computer storage medium
CN111339274B (en) * 2020-02-25 2024-01-26 网易(杭州)网络有限公司 Dialogue generation model training method, dialogue generation method and device
CN111400413B (en) * 2020-03-10 2023-06-30 支付宝(杭州)信息技术有限公司 Method and system for determining category of knowledge points in knowledge base
CN111475628B (en) * 2020-03-30 2023-07-14 珠海格力电器股份有限公司 Session data processing method, apparatus, computer device and storage medium
CN111651560B (en) * 2020-05-29 2023-08-29 北京百度网讯科技有限公司 Method and apparatus for configuring problems, electronic device, computer readable medium
CN111753052A (en) * 2020-06-19 2020-10-09 微软技术许可有限责任公司 Provide knowledgeable answers to knowledge intent questions
CN111814466B (en) * 2020-06-24 2024-09-13 平安科技(深圳)有限公司 Information extraction method based on machine reading understanding and related equipment thereof
CN111782785B (en) * 2020-06-30 2024-04-19 北京百度网讯科技有限公司 Automatic question and answer method, device, equipment and storage medium
CN111858856A (en) * 2020-07-23 2020-10-30 海信电子科技(武汉)有限公司 Multi-round search type chatting method and display equipment
CN111949787B (en) * 2020-08-21 2023-04-28 平安国际智慧城市科技股份有限公司 Automatic question-answering method, device, equipment and storage medium based on knowledge graph
CN112307164A (en) * 2020-10-15 2021-02-02 江苏常熟农村商业银行股份有限公司 Information recommendation method and device, computer equipment and storage medium
CN112527985A (en) * 2020-12-04 2021-03-19 杭州远传新业科技有限公司 Unknown problem processing method, device, equipment and medium
CN112507078B (en) * 2020-12-15 2022-05-10 浙江诺诺网络科技有限公司 Semantic question and answer method and device, electronic equipment and storage medium
CN112559707A (en) * 2020-12-16 2021-03-26 四川智仟科技有限公司 Knowledge-driven customer service question and answer method
CN112597291B (en) * 2020-12-26 2024-09-17 中国农业银行股份有限公司 Intelligent question-answering implementation method, device and equipment
CN112860863A (en) * 2021-01-30 2021-05-28 云知声智能科技股份有限公司 Machine reading understanding method and device
CN115238046A (en) * 2021-04-25 2022-10-25 平安普惠企业管理有限公司 User intention identification method and device, electronic equipment and storage medium
WO2022226879A1 (en) * 2021-04-29 2022-11-03 京东方科技集团股份有限公司 Question and answer processing method and apparatus, electronic device, and computer-readable storage medium
CN114328841A (en) * 2021-07-13 2022-04-12 北京金山数字娱乐科技有限公司 Question-answer model training method and device, question-answer method and device
CN114416962B (en) * 2022-01-11 2024-10-18 平安科技(深圳)有限公司 Prediction method, prediction device, electronic equipment and storage medium for answers to questions
CN114398909B (en) * 2022-01-18 2025-09-05 中国平安人寿保险股份有限公司 Question generation method, device, equipment and storage medium for dialogue training
CN116414959A (en) * 2023-02-23 2023-07-11 厦门黑镜科技有限公司 Digital human interaction control method, device, electronic device and storage medium
CN116955579B (en) * 2023-09-21 2023-12-29 武汉轻度科技有限公司 Chat reply generation method and device based on keyword knowledge retrieval
CN116992005B (en) * 2023-09-25 2023-12-01 语仓科技(北京)有限公司 Intelligent dialogue method, system and equipment based on large model and local knowledge base
CN119988552A (en) * 2025-01-21 2025-05-13 青岛市市场监管发展服务中心(青岛市市场监管应急处置中心、青岛市消费者权益保护中心) A market supervision public consultation method and system based on pre-trained large language model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160371276A1 (en) * 2015-06-19 2016-12-22 Microsoft Technology Licensing, LLC Answer scheme for information request

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102866990A (en) * 2012-08-20 2013-01-09 北京搜狗信息服务有限公司 Thematic conversation method and device
CN105630917A (en) * 2015-12-22 2016-06-01 成都小多科技有限公司 Intelligent answering method and intelligent answering device
CN107463699A (en) * 2017-08-15 2017-12-12 济南浪潮高新科技投资发展有限公司 Method for implementing a question-and-answer robot based on the seq2seq model
CN107609101A (en) * 2017-09-11 2018-01-19 远光软件股份有限公司 Intelligent interactive method, equipment and storage medium

Cited By (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502752A (en) * 2019-08-21 2019-11-26 北京一链数云科技有限公司 Text processing method, device, equipment and computer storage medium
CN112749260A (en) * 2019-10-31 2021-05-04 阿里巴巴集团控股有限公司 Information interaction method, device, equipment and medium
CN111753062A (en) * 2019-11-06 2020-10-09 北京京东尚科信息技术有限公司 A method, apparatus, device and medium for determining a session response scheme
CN111090721B (en) * 2019-11-25 2023-09-12 出门问问(苏州)信息科技有限公司 Question answering method and device and electronic equipment
CN111090721A (en) * 2019-11-25 2020-05-01 出门问问(苏州)信息科技有限公司 Question answering method and device and electronic equipment
CN111177336A (en) * 2019-11-30 2020-05-19 西安华为技术有限公司 Method and device for determining response information
CN111177336B (en) * 2019-11-30 2023-11-10 西安华为技术有限公司 Method and device for determining response information
CN111177339A (en) * 2019-12-06 2020-05-19 百度在线网络技术(北京)有限公司 Dialog generation method and device, electronic equipment and storage medium
CN113127613B (en) * 2020-01-10 2024-01-09 北京搜狗科技发展有限公司 Chat information processing method and device
CN113127613A (en) * 2020-01-10 2021-07-16 北京搜狗科技发展有限公司 Chat information processing method and device
CN111291170B (en) * 2020-01-20 2023-09-19 腾讯科技(深圳)有限公司 Session recommendation method and related device based on intelligent customer service
CN111291170A (en) * 2020-01-20 2020-06-16 腾讯科技(深圳)有限公司 Session recommendation method based on intelligent customer service and related device
CN111428019A (en) * 2020-04-02 2020-07-17 出门问问信息科技有限公司 Data processing method and equipment for knowledge base question answering
CN111538803A (en) * 2020-04-20 2020-08-14 京东方科技集团股份有限公司 Method, device, equipment and medium for acquiring candidate question text to be matched
CN111625635A (en) * 2020-05-27 2020-09-04 北京百度网讯科技有限公司 Question-answer processing method, language model training method, device, equipment and storage medium
CN111625635B (en) * 2020-05-27 2023-09-29 北京百度网讯科技有限公司 Question-answering processing method, device, equipment and storage medium
CN113761986A (en) * 2020-06-05 2021-12-07 阿里巴巴集团控股有限公司 Text acquisition, live broadcast method, device and storage medium
CN111737401A (en) * 2020-06-22 2020-10-02 首都师范大学 A Keyword Group Prediction Method Based on Seq2set2seq Framework
CN111597321B (en) * 2020-07-08 2024-06-11 腾讯科技(深圳)有限公司 Prediction method and device of answers to questions, storage medium and electronic equipment
CN111597321A (en) * 2020-07-08 2020-08-28 腾讯科技(深圳)有限公司 Question answer prediction method and device, storage medium and electronic equipment
CN112232053A (en) * 2020-09-16 2021-01-15 西北大学 A text similarity calculation system, method, and storage medium based on multi-keyword pair matching
CN112184021B (en) * 2020-09-28 2023-09-05 中国人民解放军国防科技大学 Answer quality assessment method based on similar support set
CN112184021A (en) * 2020-09-28 2021-01-05 中国人民解放军国防科技大学 Answer quality evaluation method based on similar support set
CN112330387A (en) * 2020-09-29 2021-02-05 重庆锐云科技有限公司 Virtual broker applied to house-viewing software
CN112330387B (en) * 2020-09-29 2023-07-18 重庆锐云科技有限公司 Virtual broker applied to house-viewing software
CN114490957A (en) * 2020-11-12 2022-05-13 中移物联网有限公司 Question answering method, apparatus and computer readable storage medium
CN113076409A (en) * 2021-04-20 2021-07-06 上海景吾智能科技有限公司 Dialogue system and method applied to robot, robot and readable medium
CN114328796A (en) * 2021-08-19 2022-04-12 腾讯科技(深圳)有限公司 Question and answer index generation method, question and answer model processing method, device and storage medium
CN113743124B (en) * 2021-08-25 2024-03-29 南京星云数字技术有限公司 Intelligent question-answering exception processing method and device and electronic equipment
CN113743124A (en) * 2021-08-25 2021-12-03 南京星云数字技术有限公司 Intelligent question-answer exception processing method and device and electronic equipment
CN114443818A (en) * 2022-01-30 2022-05-06 天津大学 Dialogue type knowledge base question-answer implementation method
CN114491046A (en) * 2022-02-14 2022-05-13 中国工商银行股份有限公司 Information interaction method based on language model, device and electronic device thereof
CN116795953A (en) * 2022-03-08 2023-09-22 腾讯科技(深圳)有限公司 Question-answer matching method and device, computer readable storage medium and computer equipment
CN114860898A (en) * 2022-03-25 2022-08-05 成都淞幸科技有限责任公司 A software development knowledge base construction and application method
CN114638236A (en) * 2022-03-30 2022-06-17 政采云有限公司 Intelligent question answering method, device, equipment and computer readable storage medium
CN114661883A (en) * 2022-03-31 2022-06-24 北京金山数字娱乐科技有限公司 Intelligent question and answer method and device and electronic equipment
CN114579729A (en) * 2022-05-09 2022-06-03 南京云问网络技术有限公司 FAQ question-answer matching method and system fusing multi-algorithm model
CN115221316A (en) * 2022-06-14 2022-10-21 科大讯飞华南人工智能研究院(广州)有限公司 Knowledge base processing, model training method, computer equipment and storage medium
CN115080720A (en) * 2022-06-29 2022-09-20 壹沓科技(上海)有限公司 Text processing method, device, equipment and medium based on RPA and AI
CN115129820A (en) * 2022-07-22 2022-09-30 宁波牛信网络科技有限公司 Similarity-based text feedback method and device
CN116226329A (en) * 2023-01-04 2023-06-06 国网河北省电力有限公司信息通信分公司 Intelligent retrieval method, device and terminal equipment for problems in the power grid field
CN116049376A (en) * 2023-03-31 2023-05-02 北京太极信息系统技术有限公司 Method, device and system for retrieving and replying information and creating knowledge
CN116303981A (en) * 2023-05-23 2023-06-23 山东森普信息技术有限公司 Agricultural community knowledge question-answering method, device and storage medium
CN116303981B (en) * 2023-05-23 2023-08-01 山东森普信息技术有限公司 Agricultural community knowledge question-answering method, device and storage medium
CN116886656B (en) * 2023-09-06 2023-12-08 北京小糖科技有限责任公司 Chat room-oriented dance knowledge pushing method and device
CN116886656A (en) * 2023-09-06 2023-10-13 北京小糖科技有限责任公司 Chat room-oriented dance knowledge pushing method and device
CN117332789A (en) * 2023-12-01 2024-01-02 诺比侃人工智能科技(成都)股份有限公司 Semantic analysis method and system for dialogue scene
CN118350468A (en) * 2024-06-14 2024-07-16 杭州字节方舟科技有限公司 An AI dialogue method based on natural language processing
CN118606574A (en) * 2024-08-12 2024-09-06 杭州领信数科信息技术有限公司 Knowledge answering method, system, electronic device and storage medium based on large model
CN119294521A (en) * 2024-10-14 2025-01-10 四川开物信息技术有限公司 Intelligent question-answering system and question-answering method
CN119441431A (en) * 2024-10-25 2025-02-14 北京房多多信息技术有限公司 Data processing method, device, electronic device and storage medium
CN119621889A (en) * 2024-11-21 2025-03-14 之江实验室 A vertical knowledge question-answering method and device based on a large model
CN119719276A (en) * 2024-11-26 2025-03-28 陕西优百信息技术有限公司 Question answering method, device, storage medium and electronic device based on model knowledge base
CN119719276B (en) * 2024-11-26 2025-09-30 陕西优百信息技术有限公司 Question and answer method and device based on model knowledge base, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN108491433B (en) 2022-05-03
CN108491433A (en) 2018-09-04

Similar Documents

Publication Title
CN108491433B (en) Chat answering method, electronic device and storage medium
US12470503B2 (en) Customized message suggestion with user embedding vectors
US11334635B2 (en) Domain specific natural language understanding of customer intent in self-help
CN109871446B (en) Rejection method, electronic device and storage medium in intent recognition
US10657332B2 (en) Language-agnostic understanding
US8073877B2 (en) Scalable semi-structured named entity detection
WO2019153607A1 (en) Intelligent response method, electronic device and storage medium
US12131827B2 (en) Knowledge graph-based question answering method, computer device, and medium
US10192545B2 (en) Language modeling based on spoken and unspeakable corpuses
US20200019609A1 (en) Suggesting a response to a message by selecting a template using a neural network
CN111428010B (en) Man-machine intelligent question-answering method and device
US10289957B2 (en) Method and system for entity linking
WO2019153612A1 (en) Question and answer data processing method, electronic device and storage medium
US11030394B1 (en) Neural models for keyphrase extraction
WO2020233131A1 (en) Question-and-answer processing method and apparatus, computer device and storage medium
CN104471568A (en) Learning-Based Processing of Natural Language Problems
CN109299235B (en) Knowledge base searching method, device and computer readable storage medium
CN113505293B (en) Information pushing method and device, electronic equipment and storage medium
CN107885717B (en) Keyword extraction method and device
CN112287069A (en) Information retrieval method and device based on voice semantics and computer equipment
CN110134777B (en) Question duplication eliminating method and device, electronic equipment and computer readable storage medium
CN111783424A (en) Text clause dividing method and device
CN108268450B (en) Method and apparatus for generating information
US20200272696A1 (en) Finding of asymmetric relation between words
CN113127621A (en) Dialogue module pushing method, device, equipment and storage medium

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 30.09.2020)

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18905843

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 18905843

Country of ref document: EP

Kind code of ref document: A1