WO2021051558A1 - Knowledge graph-based question and answer method and apparatus, and storage medium - Google Patents
- Publication number
- WO2021051558A1 (PCT/CN2019/117583)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- recognition model
- label
- entity
- artificial
- entity element
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Definitions
- This application relates to the field of information processing technology, and in particular to a question and answer method, device and storage medium based on a knowledge graph.
- A question answering system is an advanced form of information retrieval system: it can answer users' natural-language questions in accurate and concise natural language.
- The development and improvement of question answering systems is a research direction that attracts much attention and has broad development prospects.
- A traditional question answering system trains a model on a certain question-and-answer corpus; the user's natural-language input is processed and fed into the trained model, and the result is obtained by querying the model for similar corpus entries.
- The accuracy of such a question answering system depends on the coverage of the training corpus.
- When the question input by the user is complex, the question answering result output by the traditional question answering system is inaccurate.
- The main purpose of this application is to provide a question and answer method, apparatus, and storage medium based on a knowledge graph, aiming to solve the technical problem that the question answering results output by a traditional question answering system are inaccurate.
- This application provides a question and answer method based on a knowledge graph, which includes the following steps:
- The entity element, the label, and the question and answer sentence are input into a Bayesian classifier; the matching degree between each preset template in the Bayesian classifier and the question and answer sentence is calculated, and the preset template with the highest matching degree is determined as the query template.
- The entity element and the label are input into the query template to obtain the corresponding query sentence, and the query sentence is input into the knowledge graph for querying to obtain the corresponding question answering result.
- The present application also provides a device that includes a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor; when the computer-readable instructions are executed by the processor, the steps of the question answering method based on the knowledge graph described above are realized.
- The present application also provides a non-volatile computer-readable storage medium on which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the steps of the question answering method based on the knowledge graph described above are realized.
- This application discloses a question and answer method, apparatus, and storage medium based on a knowledge graph.
- The method first obtains the question and answer sentence input by the user, performs word segmentation on it, and obtains the entity elements in the sentence and the label corresponding to each entity element through a trained entity element recognition model and a trained label recognition model; inputs the entity elements, the labels, and the question and answer sentence into a Bayesian classifier, calculates the matching degree between each preset template in the Bayesian classifier and the question and answer sentence, and determines the preset template with the highest matching degree as the query template; inputs the entity elements and the labels into the query template to obtain the corresponding query sentence; and inputs the query sentence into the knowledge graph for querying to obtain the corresponding question answering result.
- By analyzing the user's question and answer sentence with the trained entity element recognition model and label recognition model, this application obtains the entity elements and labels of the sentence, performs deep mining on the sentence to determine the most suitable query template, generates the corresponding query sentence, and obtains the corresponding result from the knowledge graph. The whole process reduces the dependence on the training corpus and avoids missed detections and false detections by the question answering system, thereby improving the accuracy of the question answering results it outputs.
- FIG. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solution of the embodiment of the present application;
- FIG. 2 is a schematic flowchart of an embodiment of a question answering method based on a knowledge graph of this application;
- FIG. 3 is a schematic flowchart of another embodiment of the question answering method based on the knowledge graph of this application;
- FIG. 4 is a schematic flowchart of another embodiment of the question answering method based on the knowledge graph of this application.
- FIG. 1 is a schematic diagram of a terminal structure of a hardware operating environment involved in a solution of an embodiment of the present application.
- The terminal of this application is a device, and the device may be a terminal device with a storage function such as a mobile phone, a computer, or a portable computer.
- The terminal may include: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
- The communication bus 1002 is used to implement connection and communication between these components.
- The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
- The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface).
- The memory 1005 may be a high-speed RAM memory or a stable memory (non-volatile memory), such as disk storage.
- The memory 1005 may also be a storage device independent of the aforementioned processor 1001.
- Optionally, the terminal may also include a camera, a Wi-Fi module, etc., which will not be repeated here.
- The terminal structure shown in FIG. 1 does not constitute a limitation on the terminal; the terminal may include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
- The network interface 1004 is mainly used to connect to a back-end server and communicate with the back-end server;
- The user interface 1003 mainly includes an input unit such as a keyboard.
- The keyboard may be a wireless keyboard or a wired keyboard, used to connect to a client and perform data communication with the client; the processor 1001 can be used to call the computer-readable instructions stored in the memory 1005 and execute the steps of the question and answer method based on the knowledge graph.
- the optional embodiments of the device are basically the same as the following embodiments of the question and answer method based on the knowledge graph, and will not be repeated here.
- Referring to FIG. 2, the question answering method based on the knowledge graph provided in this embodiment includes the following steps:
- Step S10: Obtain the question and answer sentence input by the user, perform word segmentation on it, and obtain the entity elements in the sentence and the label corresponding to each entity element through the trained entity element recognition model and the trained label recognition model;
- In this embodiment, the question and answer sentence expressed by the user can be obtained by means of voice recognition; it can also be obtained in other ways, which this embodiment does not limit.
- The obtained question and answer sentence is segmented; the segmented sentence is passed through the entity element recognition model to extract its entity elements, and through the label recognition model to extract its labels.
- As an example of entity elements and labels, take the question and answer sentence "What is the relationship between Huang Xiaoming and Changsheng Medicine": "Huang Xiaoming" is an entity element whose corresponding label is "person", and "Changsheng Medicine" is an entity element whose corresponding label is "company".
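The recognition step above can be sketched as follows. The patent does not disclose the internals of the trained models, so this toy version stands in for them with a simple dictionary lookup over the segmented words; the gazetteer entries are invented for illustration:

```python
# Illustrative stand-in for the trained entity element and label recognition
# models: a hypothetical gazetteer mapping entity elements to their labels.
ENTITY_LABELS = {
    "Huang Xiaoming": "person",
    "Changsheng Medicine": "company",
}

def recognize(segmented_words):
    """Return (entity_element, label) pairs found in a segmented sentence."""
    return [(w, ENTITY_LABELS[w]) for w in segmented_words if w in ENTITY_LABELS]

# Segmentation result for "What is the relationship between
# Huang Xiaoming and Changsheng Medicine".
words = ["What", "is", "the", "relationship", "between",
         "Huang Xiaoming", "and", "Changsheng Medicine"]
print(recognize(words))
# → [('Huang Xiaoming', 'person'), ('Changsheng Medicine', 'company')]
```

A real system would replace the dictionary with the trained recognition models described in steps S60 and S70.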
- Step S20: Input the entity elements, the labels, and the question and answer sentence into a Bayesian classifier, calculate the matching degree between each preset template in the Bayesian classifier and the question and answer sentence, and determine the preset template with the highest matching degree as the query template;
- The Bayesian classifier uses the prior probability of an object and the Bayes formula to calculate the posterior probability that the object belongs to each class, and selects the class with the largest posterior probability as the class to which the object belongs.
- In this embodiment, there are three preset templates in the Bayesian classifier, namely a relation query template, an entity query template, and an attribute query template. By calculating the matching degree of each preset template with the entity elements, the labels, and the question and answer sentence, combined with the contextual semantics of the sentence, the template with the highest matching degree is determined as the query template.
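As a sketch of the template-selection step, the naive Bayes decision rule described above (largest posterior wins) can be written out as follows. The priors, likelihoods, and feature set are invented placeholders, since the patent does not disclose the classifier's training data or features:

```python
import math

# Hypothetical priors and per-feature likelihoods P(feature | template);
# in practice these would be estimated from labeled question/template pairs.
TEMPLATES = ["relation_query", "entity_query", "attribute_query"]
PRIOR = {t: 1 / 3 for t in TEMPLATES}
LIKELIHOOD = {
    "relation_query":  {"relationship": 0.6, "person": 0.5, "company": 0.5},
    "entity_query":    {"relationship": 0.1, "person": 0.4, "company": 0.2},
    "attribute_query": {"relationship": 0.1, "person": 0.2, "company": 0.3},
}
SMOOTH = 0.01  # fallback probability for unseen features

def best_template(features):
    """Pick the template with the largest posterior (computed in log space)."""
    def log_posterior(t):
        return math.log(PRIOR[t]) + sum(
            math.log(LIKELIHOOD[t].get(f, SMOOTH)) for f in features)
    return max(TEMPLATES, key=log_posterior)

# Features drawn from the labels and keywords of the example sentence.
print(best_template(["relationship", "person", "company"]))
# → relation_query
```

The naive-independence assumption lets the likelihoods multiply; selecting the maximum posterior corresponds to determining the query template with the highest matching degree.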
- Step S30: Input the entity elements and the labels into the query template to obtain the corresponding query sentence, and input the query sentence into the knowledge graph for querying to obtain the corresponding question answering result.
- In this embodiment, the entity elements and labels corresponding to the question and answer sentence are input into the query template to obtain the corresponding query sentence.
- A knowledge graph is also preset in this embodiment; the query sentence is input into the knowledge graph, and the corresponding question answering result is obtained by means of vector feature matching.
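The query-generation step can be illustrated as follows. The patent does not specify the knowledge graph's query language, so a Cypher-like syntax is assumed purely for illustration:

```python
# Hypothetical relation query template with slots for the two entity
# elements and their labels; the query syntax is an assumption.
RELATION_TEMPLATE = (
    "MATCH (a:{label_a} {{name: '{entity_a}'}})-[r]-"
    "(b:{label_b} {{name: '{entity_b}'}}) RETURN type(r)"
)

def build_query(entity_a, label_a, entity_b, label_b):
    """Fill the chosen query template with entity elements and labels."""
    return RELATION_TEMPLATE.format(
        entity_a=entity_a, label_a=label_a,
        entity_b=entity_b, label_b=label_b)

q = build_query("Huang Xiaoming", "person", "Changsheng Medicine", "company")
print(q)
```

Running the resulting query sentence against the graph database would return the relation between the two entities, which is then formatted as the question answering result.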
- This embodiment first obtains the question and answer sentence input by the user, performs word segmentation on it, and obtains the entity elements in the sentence and the labels corresponding to the entity elements through the trained entity element recognition model and label recognition model; inputs the entity elements, the labels, and the question and answer sentence into the Bayesian classifier, calculates the matching degree between each preset template in the Bayesian classifier and the question and answer sentence, and determines the preset template with the highest matching degree as the query template; then inputs the entity elements and the labels into the query template to obtain the corresponding query sentence, and inputs the query sentence into the knowledge graph for querying to obtain the corresponding question answering result.
- The entity elements and labels of the question and answer sentence are obtained through the trained entity element recognition model and label recognition model, and the user's sentence is deeply analyzed and mined to determine the most suitable query template, generate the corresponding query sentence, and obtain the corresponding result from the knowledge graph.
- The whole process reduces the dependence on the training corpus and avoids missed detections and false detections by the question answering system, thereby improving the accuracy of the question answering results output by the system.
- Before step S10 of obtaining the question and answer sentence input by the user and performing word segmentation on it, the method further includes:
- Step S40: Obtain training corpus through web crawler technology, and perform word segmentation on the training corpus;
- In this embodiment, web crawler technology is used to obtain a large amount of information from existing databases as training corpus, and the obtained training corpus is segmented.
- A web crawler is a program or script that automatically captures information on the World Wide Web according to certain rules and can automatically update the stored content; it is a method of information acquisition and retrieval. For example, for the question and answer sentence "What is the relationship between Huang Xiaoming and Changsheng Medicine", the crawler can crawl related databases such as the national enterprise information publicity system, news databases, and enterprise credit databases to obtain relevant training corpus.
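A minimal sketch of one building block of the crawling step: extracting the links to follow from a fetched page. Fetching, scheduling, deduplication, and politeness are omitted, and the sample page content is invented:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag seen in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href")

# Invented page snippet standing in for a fetched news-database page.
page = '<a href="/news/1">Changsheng Medicine report</a><a href="/news/2">More</a>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # → ['/news/1', '/news/2']
```

A production crawler would fetch each discovered link in turn and store the page text as raw training corpus for segmentation.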
- Step S50: Receive the artificial entity elements and artificial labels corresponding to the words obtained after word segmentation;
- In this embodiment, the obtained words are manually annotated with entity elements and labels.
- An entity element may consist of multiple Chinese characters, in which case the label should be manually annotated at the preset position of each Chinese character.
- The type of the label may be determined according to the type of the preset query template.
- In this embodiment, the preset query templates include relation query templates, entity query templates, and attribute query templates.
- For example, when the entity elements are "Huang Xiaoming" and "Changsheng Medicine", the result of manual labeling is that each character of "Huang Xiaoming" carries the tag "person" and each character of "Changsheng Medicine" carries the tag "company" (e.g. "Huang/person Xiao/person Ming/person" and "Chang/company Sheng/company Yi/company Yao/company").
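The character-level labeling format described above can be sketched as follows; the romanized syllables stand in for the individual Chinese characters of each entity element:

```python
def label_entity(chars, label):
    """Pair each character of an entity element with the entity's label."""
    return [(c, label) for c in chars]

# "Huang Xiaoming" -> person; "Changsheng Medicine" -> company.
print(label_entity(["Huang", "Xiao", "Ming"], "person"))
print(label_entity(["Chang", "Sheng", "Yi", "Yao"], "company"))
```

These per-character (word, tag) pairs are the training targets fed to the recognition models in steps S60 and S70.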
- Step S60: Input the artificial entity elements and the training corpus into a preset entity element recognition model to train the entity element recognition model;
- In this embodiment, the artificial entity elements and the training corpus are input into the preset entity element recognition model, and the model is trained.
- Step S70: Input the artificial labels and the training corpus into a preset label recognition model to train the label recognition model.
- In this embodiment, the artificial labels and the training corpus are input into the preset label recognition model, and the model is trained.
- This embodiment uses crawler technology to obtain a large amount of training corpus so as to provide the models with the data required for training, and uses manual annotation to label entity elements and labels in the corpus to train the entity element recognition model and the label recognition model, thereby ensuring the accuracy with which the entity element recognition model extracts entity elements and the label recognition model extracts labels.
- The step of training the entity element recognition model includes:
- Step S61: Extract from the training corpus through the entity element recognition model to obtain the corresponding extracted entity elements;
- In this embodiment, the training corpus is input into the entity element recognition model, extraction is performed through the model, and the entity elements it extracts are used as the extracted entity elements. It should be understood that, since the accuracy of the model has not yet been judged, it may extract wrong entity elements, that is, entity elements that do not belong to the training corpus.
- Step S62: Determine the entity elements that coincide with the extracted entity elements among the artificial entity elements as an entity element set;
- Since the training corpus has been manually annotated with entity elements, the artificial entity elements are compared with the extracted entity elements, and those artificial entity elements that coincide with the extracted entity elements are taken as the entity element set. It is easy to understand that the entity elements in this set must be correct, that is, they must belong to the input training corpus.
- Step S63: Calculate the accuracy of the entity element recognition model according to the entity element set, the artificial entity elements, and the extracted entity elements;
- In this embodiment, a preset formula is applied to the entity element set, the artificial entity elements, and the extracted entity elements to obtain the accuracy of the trained entity element recognition model.
- Step S64: Use the entity element recognition model whose accuracy exceeds a preset first accuracy as the trained entity element recognition model.
- If the accuracy of the entity element recognition model does not exceed the preset first accuracy, the model is not yet accurate enough at extracting entity elements and will extract wrong ones; the training corpus then continues to be used as input to train the model until its accuracy exceeds the preset first accuracy.
- This embodiment trains the entity element recognition model with the training corpus and the artificial entity elements, and calculates the model's accuracy to ensure that the trained model meets the preset accuracy, thereby ensuring that it can accurately extract the entity elements in question and answer sentences.
- The step of calculating the accuracy of the entity element recognition model based on the entity element set, the artificial entity elements, and the extracted entity elements includes:
- Step S631: Divide the entity element set by the extracted entity elements to obtain the precision rate of the entity element recognition model;
- The number of entity elements in the entity element set is divided by the number of entity elements extracted by the model, and the result is used as the precision rate of the entity element recognition model.
- Step S632: Divide the entity element set by the artificial entity elements to obtain the recall rate of the entity element recognition model;
- The number of entity elements in the entity element set is divided by the number of manually annotated entity elements, and the result is used as the recall rate of the entity element recognition model.
- Step S633: Calculate the product and the sum of the precision rate and the recall rate, divide the product by the sum, and multiply the result by a preset value to obtain the first F value of the entity element recognition model;
- After the precision rate and recall rate of the entity element recognition model are obtained, the product of the precision rate and the recall rate is divided by their sum, and the result is multiplied by a preset value (2 in the standard F1 measure) to obtain the first F value of the model.
- Step S634: Calculate the accuracy of the entity element recognition model according to its precision rate, recall rate, and first F value.
- In this embodiment, the accuracy of the entity element recognition model can be obtained from these three quantities; the weight ratio corresponding to each value can be drawn up by the developer.
- The precision rate, recall rate, and first F value of the entity element recognition model are calculated from the entity element set, the artificial entity elements, and the extracted entity elements, and the accuracy of the model is then calculated from the precision rate, the recall rate, and the first F value.
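Steps S631 through S634 can be written out directly. The counts and the equal weights in S634 are assumptions for illustration (the patent leaves the weight ratio to the developer); the preset value 2 yields the standard F1 measure:

```python
def model_accuracy(n_overlap, n_extracted, n_manual,
                   weights=(1/3, 1/3, 1/3), preset=2):
    """Return (precision, recall, F value, weighted accuracy).

    n_overlap   -- size of the entity element set (correct extractions)
    n_extracted -- number of entity elements extracted by the model
    n_manual    -- number of manually annotated entity elements
    """
    precision = n_overlap / n_extracted                             # S631
    recall = n_overlap / n_manual                                   # S632
    f_value = preset * (precision * recall) / (precision + recall)  # S633
    w_p, w_r, w_f = weights                                         # S634
    accuracy = w_p * precision + w_r * recall + w_f * f_value
    return precision, recall, f_value, accuracy

# Example: 8 correct out of 10 extracted, against 9 manual entity elements.
p, r, f, acc = model_accuracy(8, 10, 9)
print(round(p, 3), round(r, 3), round(f, 3), round(acc, 3))
# → 0.8 0.889 0.842 0.844
```

The same computation applies verbatim to the label recognition model's precision rate, recall rate, and second F value in steps S731 through S734.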
- The step of training the label recognition model includes:
- Step S71: Extract from the training corpus through the label recognition model to obtain the corresponding extracted labels;
- In this embodiment, the training corpus is input into the label recognition model, extraction is performed through the model, and the labels it extracts are used as the extracted labels corresponding to the training corpus. It should be understood that, since the accuracy of the model has not yet been judged, it may extract wrong labels, that is, labels that do not belong to the training corpus.
- Step S72: Determine the labels that coincide with the extracted labels among the artificial labels as a label set;
- Since the training corpus has been manually annotated in the above steps, the artificial labels are compared with the extracted labels, and those artificial labels that coincide with the extracted labels are used as the label set. It is easy to understand that the labels in this set must be correct, that is, they must belong to the input training corpus.
- Step S73: Calculate the accuracy of the label recognition model according to the label set, the artificial labels, and the extracted labels;
- In this embodiment, a preset formula is applied to the label set, the artificial labels, and the extracted labels to obtain the accuracy of the trained label recognition model.
- Step S74: Use the label recognition model whose accuracy exceeds a preset second accuracy as the trained label recognition model.
- If the accuracy of the label recognition model does not exceed the preset second accuracy, the model is not yet accurate enough at label extraction and will extract wrong labels; the training corpus then continues to be used as input to train the label recognition model until its accuracy exceeds the preset second accuracy.
- This embodiment trains the label recognition model with the training corpus and the artificial labels, and calculates the model's accuracy to ensure that the trained model meets the preset accuracy, thereby ensuring that it can accurately extract the labels in question and answer sentences.
- The step of calculating the accuracy of the label recognition model according to the label set, the artificial labels, and the extracted labels includes:
- Step S731: Divide the label set by the extracted labels to obtain the precision rate of the label recognition model;
- The labels in the label set must be correct, while the labels extracted by the label recognition model may not belong to the training corpus; therefore, the number of labels in the label set is divided by the number of labels extracted by the model, and the result is used as the precision rate of the label recognition model.
- Step S732: Divide the label set by the artificial labels to obtain the recall rate of the label recognition model;
- The number of labels in the label set is divided by the number of manually annotated labels, and the result is used as the recall rate of the label recognition model.
- Step S733: Calculate the product and the sum of the precision rate and the recall rate, divide the product by the sum, and multiply the result by a preset value to obtain the second F value of the label recognition model;
- The product of the precision rate and the recall rate is divided by their sum, and the result is multiplied by a preset value to obtain the second F value.
- Step S734: Calculate the accuracy of the label recognition model according to its precision rate, recall rate, and second F value.
- In this embodiment, the accuracy of the label recognition model can be obtained from these three quantities by setting different weights for the precision rate, the recall rate, and the second F value; it is easy to understand that the weight ratio corresponding to each value can be drawn up by the developer.
- The precision rate, recall rate, and second F value of the label recognition model are calculated from the label set, the artificial labels, and the extracted labels, and the accuracy of the model is then calculated from the precision rate, the recall rate, and the second F value.
- FIG. 4 is a schematic flowchart of another embodiment of the question answering method based on the knowledge graph of this application.
- Step S80: Use the artificial entity elements and artificial labels as inputs to a preset TransE algorithm, so that they are embedded into a low-dimensional vector space to generate corresponding vector templates;
- A TransE algorithm is also preset in this embodiment; TransE learns distributed vector representations of entities and relations.
- In this embodiment, the artificial entity elements and artificial labels are input into the preset TransE algorithm and embedded into a low-dimensional vector space through the TransE algorithm to generate the corresponding vector templates.
- Step S90: Store the vector templates in a graph database to construct the corresponding knowledge graph.
- In this embodiment, the vector templates are stored in the graph database, and the corresponding knowledge graph is constructed from the multiple vector templates stored there.
- The detailed construction method of the knowledge graph is not described in this embodiment.
- This embodiment uses the preset TransE algorithm to obtain vector templates from the artificial entity elements and artificial labels, and constructs the corresponding knowledge graph from these templates, so as to ensure the comprehensiveness and accuracy of the knowledge graph's data.
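A minimal sketch of the TransE idea assumed here: each triple (head entity, relation, tail entity) is embedded so that head + relation ≈ tail, and the distance ||h + r − t|| scores how plausible the triple is (smaller is better). The vectors below are invented toy values, not learned embeddings:

```python
def transe_score(h, r, t):
    """Euclidean distance ||h + r - t|| for a (head, relation, tail) triple."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

h = [0.1, 0.2]   # e.g. embedding of "Huang Xiaoming"
r = [0.3, 0.1]   # e.g. a hypothetical relation such as "shareholder_of"
t = [0.4, 0.3]   # e.g. embedding of "Changsheng Medicine"
print(transe_score(h, r, t))  # near zero: a plausible triple
```

Training TransE adjusts the embeddings so that true triples from the annotated data score low and corrupted triples score high; the resulting vectors form the vector templates stored in the graph database.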
- The step of inputting the query sentence into the knowledge graph for querying and obtaining the corresponding question answering result includes:
- Step S31: Vectorize the query sentence to generate a corresponding vector set;
- In this embodiment, the query sentence is vectorized by an NLP algorithm to generate the corresponding vector set. It should be understood that the vectorization of the query sentence can also be realized in other ways, which this embodiment does not limit.
- Step S32: Match the vector set against the vector templates in the knowledge graph to obtain the corresponding question answering result.
- In this embodiment, the vector set is matched against the vector templates in the knowledge graph.
- The vector template in the knowledge graph that best matches the vector set is determined by calculating the matching degree between the vectors, and that vector template is then parsed to obtain the corresponding question answering result.
- In this embodiment, the query sentence is vectorized, and the question answering result in the knowledge graph is determined more accurately through the matching of the vector set and the vector templates, thereby ensuring the accuracy of the result.
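The matching in step S32 can be sketched with cosine similarity as the matching degree; the patent does not name the similarity measure, and the vector templates below are invented for illustration:

```python
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

def best_match(query_vec, templates):
    """templates: {answer: vector}; return the answer whose vector is closest."""
    return max(templates, key=lambda k: cosine(query_vec, templates[k]))

# Invented vector templates standing in for those stored in the graph database.
templates = {
    "legal representative": [0.9, 0.1, 0.0],
    "shareholder":          [0.1, 0.8, 0.2],
}
print(best_match([0.2, 0.9, 0.1], templates))  # → shareholder
```

The best-matching vector template is then parsed back into natural language to produce the final question answering result.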
- An embodiment of the present application also proposes a non-volatile computer-readable storage medium storing computer-readable instructions; when the computer-readable instructions are executed by a processor, the operations of the question answering method based on the knowledge graph described above are realized.
- The optional embodiments of the non-volatile computer-readable storage medium of the present application are basically the same as the above embodiments of the question answering method based on the knowledge graph, and will not be repeated here.
- The methods of the embodiments can be realized by means of software plus the necessary general-purpose hardware platform; they can, of course, also be realized through hardware, but in many cases the former is the better implementation.
- The technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product.
- The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions to enable a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the method described in each embodiment of the present application.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
This application claims priority to a Chinese patent application filed with the Chinese Patent Office on September 18, 2019, with application number 201910885936.X and invention title "Knowledge graph-based question and answer method, apparatus, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of information processing technology, and in particular to a question and answer method, apparatus, and storage medium based on a knowledge graph.
背景技术Background technique
随着互联网技术的发展,大规模网络数据资源的出现,人们希望从海量的互联网数据中准确、快速地获取有价值的信息,这推动了检索式的问答系统被广泛的应用。问答系统是信息检索系统的一种高级形式。它能用准确、简洁的自然语言回答用户用自然语言提出的问题。在目前的人工智能和自然语言处理领域中,对于问答系统的开发和完善,也是一个倍受关注并具有广泛发展前景的研究方向。With the development of Internet technology and the emergence of large-scale network data resources, people hope to obtain valuable information accurately and quickly from massive amounts of Internet data, which promotes the widespread application of search-type question and answer systems. Question answering system is an advanced form of information retrieval system. It can use accurate and concise natural language to answer users' questions in natural language. In the current field of artificial intelligence and natural language processing, the development and improvement of question answering systems is also a research direction that has attracted much attention and has broad development prospects.
目前,传统的问答系统是基于一定的问答语料训练模型,将用户的自然语言经过处理后输入至训练完成的模型中,通过查询模型中相似的语料得到结果。但这种问答系统的准确性取决于训练语料的覆盖性,当用户输入的问题较为复杂时,传统的问答系统输出的问答结果不准确。At present, the traditional question answering system is based on a certain question and answer corpus training model, the user's natural language is processed and input into the trained model, and the result is obtained by querying similar corpus in the model. However, the accuracy of this question answering system depends on the coverage of the training corpus. When the question input by the user is more complicated, the question answering result output by the traditional question answering system is not accurate.
Summary of the Invention
The main purpose of this application is to provide a knowledge graph-based question and answer method, apparatus, and storage medium, aiming to solve the technical problem that the question and answer results output by traditional question answering systems are inaccurate.
To achieve the above purpose, this application provides a knowledge graph-based question and answer method, which includes the following steps:
obtaining a training corpus through web crawler technology, and performing word segmentation on the training corpus;
receiving artificial entity elements and artificial labels, corresponding to the words obtained after word segmentation, produced by manually annotating those words with entity elements and labels;
inputting the artificial entity elements and the training corpus into a preset entity element recognition model to train the entity element recognition model;
inputting the artificial labels and the training corpus into a preset label recognition model to train the label recognition model;
obtaining a question sentence input by a user, performing word segmentation on the question sentence, and obtaining the entity elements in the question sentence and the labels corresponding to the entity elements through the trained entity element recognition model and label recognition model, respectively;
inputting the entity elements, the labels, and the question sentence into a Bayesian classifier, calculating the degree of matching between each preset template in the Bayesian classifier and the question sentence, and determining the preset template with the highest matching degree as the query template;
inputting the entity elements and the labels into the query template to obtain a corresponding query statement, and inputting the query statement into a knowledge graph for querying to obtain the corresponding question and answer result.
This application also provides an apparatus, which includes: a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when executed by the processor, the computer-readable instructions implement the steps of the knowledge graph-based question and answer method described above.
This application also provides a non-volatile computer-readable storage medium on which computer-readable instructions are stored; when executed by a processor, the computer-readable instructions implement the steps of the knowledge graph-based question and answer method described above.
This application discloses a knowledge graph-based question and answer method, apparatus, and storage medium. The method first obtains the question sentence input by the user, performs word segmentation on it, and obtains the entity elements in the question sentence and the labels corresponding to those entity elements through the trained entity element recognition model and label recognition model, respectively; inputs the entity elements, the labels, and the question sentence into a Bayesian classifier, calculates the degree of matching between each preset template in the classifier and the question sentence, and determines the preset template with the highest matching degree as the query template; inputs the entity elements and the labels into the query template to obtain a corresponding query statement; and inputs the query statement into the knowledge graph for querying to obtain the corresponding question and answer result. By analyzing the user's question sentence with the trained entity element recognition model and label recognition model, this application obtains the entity elements and labels of the question sentence and mines it at a deeper level; it then determines the best-fitting query template, generates the corresponding query statement, and obtains the corresponding answer according to the knowledge graph. The whole process reduces the dependence on the training corpus and avoids missed and erroneous detections by the question answering system, thereby improving the accuracy of the results the system outputs.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the device in the hardware operating environment involved in the solution of the embodiments of the present application;
FIG. 2 is a schematic flowchart of an embodiment of the knowledge graph-based question and answer method of this application;
FIG. 3 is a schematic flowchart of another embodiment of the knowledge graph-based question and answer method of this application;
FIG. 4 is a schematic flowchart of yet another embodiment of the knowledge graph-based question and answer method of this application.
The realization of the purpose, functional characteristics, and advantages of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed Description
To make the purpose, technical solutions, and advantages of this application clearer, this application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the optional embodiments described here are only used to explain the present application and are not intended to limit it.
As shown in FIG. 1, FIG. 1 is a schematic diagram of the terminal structure of the hardware operating environment involved in the solution of the embodiments of the present application.
The terminal of this application is a device, which may be a terminal device with a storage function, such as a mobile phone, a computer, or a portable computer.
As shown in FIG. 1, the terminal may include: a processor 1001 such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a stable non-volatile memory, such as disk storage. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Optionally, the terminal may also include a camera, a Wi-Fi module, and the like, which will not be described in detail here.
Those skilled in the art can understand that the terminal structure shown in FIG. 1 does not constitute a limitation on the terminal; the terminal may include more or fewer components than shown in the figure, combine certain components, or arrange the components differently.
In the terminal shown in FIG. 1, the network interface 1004 is mainly used to connect to a back-end server and perform data communication with it; the user interface 1003 mainly includes an input unit such as a keyboard (wireless or wired), used to connect to a client and perform data communication with the client; and the processor 1001 can be used to call the computer-readable instructions stored in the memory 1005 and execute the steps of the knowledge graph-based question and answer method.
The optional embodiments of the apparatus are basically the same as the following embodiments of the knowledge graph-based question and answer method, and will not be repeated here.
Please refer to FIG. 1, which is a schematic structural diagram of the device in the hardware operating environment involved in the solution of the embodiments of the present application. The knowledge graph-based question and answer method provided in this embodiment includes the following steps:
Step S10: obtain the question sentence input by the user, perform word segmentation on it, and obtain the entity elements in the question sentence and the labels corresponding to those entity elements through the trained entity element recognition model and label recognition model, respectively;
The question sentence input by the user is obtained. It should be understood that the sentence may be obtained through speech recognition or in other ways, which this embodiment does not restrict. The obtained sentence is segmented into words based on NLP technology; the segmented sentence is passed through the entity element recognition model to extract its entity elements, and through the label recognition model to extract its labels. To illustrate the relationship between question sentences, entity elements, and labels, take the question "黄晓明和长生医药有什么关系" ("What is the relationship between Huang Xiaoming and Changsheng Medicine") as an example: "黄晓明" (Huang Xiaoming) is an entity element whose corresponding label is person, and "长生医药" (Changsheng Medicine) is an entity element whose corresponding label is company.
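As a rough illustration of this step only (not the patent's actual models), the sketch below segments the example question with a toy longest-match lexicon and tags entity elements via a stand-in lookup table; in practice the two trained recognition models would supply these outputs, and the lexicon and label table here are invented:

```python
# Toy lexicon standing in for a real word-segmentation model.
LEXICON = {"黄晓明", "和", "长生医药", "有", "什么", "关系"}

# Stand-in for the trained entity-element / label recognition models.
ENTITY_LABELS = {"黄晓明": "person", "长生医药": "company"}

def segment(sentence, lexicon=LEXICON, max_len=4):
    """Greedy forward longest-match segmentation over the lexicon."""
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            if sentence[i:i + size] in lexicon:
                words.append(sentence[i:i + size])
                i += size
                break
        else:  # unknown character: emit it as a single-character word
            words.append(sentence[i])
            i += 1
    return words

def extract_entities(words):
    """Return (entity element, label) pairs found among the segmented words."""
    return [(w, ENTITY_LABELS[w]) for w in words if w in ENTITY_LABELS]

words = segment("黄晓明和长生医药有什么关系")
entities = extract_entities(words)
```

On the example question this yields the two entity elements and their person/company labels described above.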
Step S20: input the entity elements, the labels, and the question sentence into a Bayesian classifier, calculate the degree of matching between each preset template in the Bayesian classifier and the question sentence, and determine the preset template with the highest matching degree as the query template;
The question sentence input by the user, together with its corresponding entity elements and labels, is input into the preset Bayesian classifier. Starting from an object's prior probability, the Bayesian classifier uses Bayes' formula to calculate the probability that the object belongs to each class, and selects the class with the largest posterior probability as the class to which the object belongs. Optionally, three templates are preset in the Bayesian classifier, namely a relation query template, an entity query template, and an attribute query template. By calculating the degree of matching between each preset template and the entity elements, labels, and question sentence, combined with the contextual semantics of the question sentence, the template with the highest matching degree is determined as the query template.
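A minimal naive Bayes sketch of this template selection follows; the feature lists and training samples are invented for illustration (the source does not specify the classifier's features), but the prior/posterior computation is the standard one described above:

```python
import math
from collections import Counter, defaultdict

# Invented toy training data: feature lists (labels/keywords) -> template class.
TRAIN = [
    (["person", "company", "关系"], "relation"),
    (["person", "谁"], "entity"),
    (["company", "股价"], "attribute"),
    (["person", "person", "关系"], "relation"),
]

class NaiveBayesTemplates:
    def __init__(self, samples):
        self.class_counts = Counter(cls for _, cls in samples)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, cls in samples:
            self.feat_counts[cls].update(feats)
            self.vocab.update(feats)
        self.total = sum(self.class_counts.values())

    def log_posterior(self, feats, cls):
        # log P(cls) + sum of log P(feat | cls), with add-one smoothing
        logp = math.log(self.class_counts[cls] / self.total)
        denom = sum(self.feat_counts[cls].values()) + len(self.vocab)
        for f in feats:
            logp += math.log((self.feat_counts[cls][f] + 1) / denom)
        return logp

    def classify(self, feats):
        # the template with the largest posterior becomes the query template
        return max(self.class_counts, key=lambda c: self.log_posterior(feats, c))

nb = NaiveBayesTemplates(TRAIN)
template = nb.classify(["person", "company", "关系"])
```

For the example question's features (a person label, a company label, and the keyword "关系"), the relation query template receives the largest posterior.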
Step S30: input the entity elements and the labels into the query template to obtain the corresponding query statement, and input the query statement into the knowledge graph for querying to obtain the corresponding question and answer result.
After the query template is obtained, the entity elements and labels corresponding to the question sentence are input into it to obtain the corresponding query statement. In addition, a knowledge graph is preset in this embodiment; the query statement is input into the knowledge graph, and the corresponding question and answer result is obtained by means of vector feature matching.
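A sketch of this template-filling and lookup step is shown below. The Cypher-like template string and the toy triple store are both invented stand-ins: the source specifies neither a query language nor the graph's storage format, and a real system would run the generated statement against the actual knowledge graph:

```python
# Invented query-template string; the patent does not specify a query language.
TEMPLATES = {
    "relation": "MATCH ({e0}:{l0})-[r]-({e1}:{l1}) RETURN r",
}

# Toy knowledge graph: (entity, relation, entity) triples standing in for the real graph.
KG = {("黄晓明", "股东", "长生医药")}

def build_query(template_name, entities):
    """Fill the chosen template's slots with entity elements and their labels."""
    slots = {}
    for i, (ent, label) in enumerate(entities):
        slots[f"e{i}"] = ent
        slots[f"l{i}"] = label
    return TEMPLATES[template_name].format(**slots)

def query_kg(entities):
    """Return relations linking the given entity elements in the toy graph."""
    names = {ent for ent, _ in entities}
    return [r for (h, r, t) in KG if h in names and t in names]

pairs = [("黄晓明", "person"), ("长生医药", "company")]
query = build_query("relation", pairs)
answer = query_kg(pairs)
```

The filled statement and the returned relation together form the question and answer result for the example question.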
In this embodiment, the question sentence input by the user is first obtained and segmented, and the entity elements in it and their corresponding labels are obtained through the trained entity element recognition model and label recognition model, respectively; the entity elements, labels, and question sentence are input into the Bayesian classifier, the degree of matching between each preset template in the classifier and the question sentence is calculated, and the preset template with the highest matching degree is determined as the query template; the entity elements and labels are input into the query template to obtain the corresponding query statement, and the query statement is input into the knowledge graph for querying to obtain the corresponding question and answer result. By analyzing the user's question sentence with the trained entity element recognition model and label recognition model, the entity elements and labels of the question sentence are obtained and the sentence is mined at a deeper level; the best-fitting query template is then determined, the corresponding query statement is generated, and the corresponding answer is obtained according to the knowledge graph. The whole process reduces the dependence on the training corpus and avoids missed and erroneous detections by the question answering system, thereby improving the accuracy of the results the system outputs.
Further, please refer to FIG. 3, which is a schematic flowchart of another embodiment of the knowledge graph-based question and answer method of this application. Before step S10 of obtaining the question sentence input by the user and performing word segmentation on it, the method further includes:
Step S40: obtain a training corpus through web crawler technology, and perform word segmentation on the training corpus;
Before all the above steps, the entity element recognition model and the label recognition model need to be trained with existing data. First, web crawler technology is used to obtain a large amount of information from existing databases as the training corpus, and the obtained corpus is segmented into words. Here, a web crawler is a program or script that automatically captures World Wide Web information according to certain rules; it is a means of information acquisition and retrieval that can automatically update the stored content. For example, for the question "黄晓明和长生医药有什么关系", related databases such as the national enterprise information publicity system, news databases, and enterprise credit databases can be crawled to obtain the relevant training corpus.
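A minimal sketch of turning one crawled page into corpus text is shown below; the sample HTML is invented, and fetching is stubbed out (a real crawler would download pages, e.g. with urllib, follow links according to its rules, and respect robots.txt):

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collect visible text from a crawled page, skipping script/style content."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def page_to_corpus(html):
    """Turn one fetched page into training-corpus text fragments."""
    collector = TextCollector()
    collector.feed(html)
    return collector.chunks

# In a real crawler this HTML would come from the network, e.g. urllib.request.
sample_page = "<html><body><script>var x=1;</script><p>黄晓明是演员</p></body></html>"
corpus = page_to_corpus(sample_page)
```

The collected fragments would then be word-segmented and handed to the annotation step that follows.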
Step S50: receive the artificial entity elements and artificial labels obtained by manually annotating entity elements and labels for the words obtained after word segmentation;
After the training corpus is segmented, the resulting words are manually annotated with entity elements and labels. Note in particular that if an entity element consists of multiple Chinese characters, the label should be manually annotated at the preset position of each character; and, as one implementation, the label types may be determined according to the types of the preset query templates.
For example, if the preset query templates include a relation query template, an entity query template, and an attribute query template, and the entity elements are "黄晓明" and "长生医药", the result of manual labeling is "黄person晓person明person" and "长company生company医company药company".
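The per-character annotation format of the example can be reproduced with a small helper; this is only a sketch of the labeling convention (the actual annotation is produced by human annotators, not generated):

```python
def annotate_per_char(entity, label):
    """Attach the label after every character of a multi-character entity element."""
    return "".join(ch + label for ch in entity)

tagged_person = annotate_per_char("黄晓明", "person")
tagged_company = annotate_per_char("长生医药", "company")
```

This reproduces exactly the "黄person晓person明person" / "长company生company医company药company" form shown above.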
Step S60: input the artificial entity elements and the training corpus into a preset entity element recognition model to train the entity element recognition model;
For the training of the entity element recognition model, after the manual annotation of entity elements is completed, the artificial entity elements and the training corpus are input into the preset entity element recognition model, and the model is trained.
Step S70: input the artificial labels and the training corpus into a preset label recognition model to train the label recognition model.
For the training of the label recognition model, after the manual labeling is completed, the artificial labels and the training corpus are input into the preset label recognition model, and the model is trained.
This embodiment uses crawler technology to obtain a large training corpus, thereby providing the models with the data required for training, and annotates the corpus with entity elements and labels by manual annotation to train the entity element recognition model and the label recognition model, thereby ensuring the accuracy with which the entity element recognition model extracts entity elements and the label recognition model extracts labels.
Further, the step of training the entity element recognition model includes:
Step S61: extract from the training corpus through the entity element recognition model to obtain the corresponding extracted entity elements;
In this embodiment, the training corpus is input into the entity element recognition model, the model extracts from the corpus, and the entity elements it extracts are taken as the extracted entity elements. It should be understood that, since the accuracy of the model has not yet been judged, it may extract wrong entity elements, that is, entity elements that do not belong to the training corpus.
Step S62: determine the entity elements among the artificial entity elements that coincide with the extracted entity elements as the entity element set;
In the above steps, the training corpus was manually annotated with entity elements. The artificial entity elements are compared with the extracted entity elements, and those artificial entity elements that coincide with extracted entity elements form the entity element set. It is easy to understand that the entity elements in this set must be correct entity elements, that is, entity elements that do belong to the input training corpus.
Step S63: calculate the accuracy of the entity element recognition model according to the entity element set, the artificial entity elements, and the extracted entity elements;
After the entity element set is obtained, a preset formula is used to calculate, from the entity element set, the artificial entity elements, and the extracted entity elements, the accuracy of the trained entity element recognition model.
Step S64: take the entity element recognition model whose accuracy exceeds a preset first accuracy as the trained entity element recognition model.
In this embodiment, if the accuracy of the obtained entity element recognition model does not exceed the preset first accuracy, the model's extraction of entity elements is not yet precise enough and wrong entities will be extracted; the training corpus then continues to be used as input to train the entity element recognition model until its accuracy exceeds the preset first accuracy.
This embodiment uses the training corpus and artificial entity elements to train the entity element recognition model, and ensures, by calculating the model's accuracy, that the trained model meets the preset accuracy, thereby guaranteeing that it can accurately extract the entity elements in question sentences.
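The retrain-until-threshold loop of step S64 can be sketched as follows. The per-epoch update is a fake stand-in (a real pass would retrain the model on the corpus and re-evaluate it via steps S61-S63), and accuracies are written as integer percentages purely to keep the sketch exact:

```python
def train_until_accurate(initial_pct, first_pct, step=5, max_epochs=100):
    """Keep training until accuracy exceeds the preset first accuracy.

    Accuracies are integer percentages; `step` fakes the gain from one
    training pass, standing in for retraining plus re-evaluation."""
    accuracy, epochs = initial_pct, 0
    while accuracy <= first_pct and epochs < max_epochs:
        accuracy = min(100, accuracy + step)  # stand-in for one training pass
        epochs += 1
    return accuracy, epochs

final_acc, epochs = train_until_accurate(initial_pct=70, first_pct=90)
```

Note the loop continues while accuracy is at or below the threshold, matching the "exceeds the preset first accuracy" stopping condition; `max_epochs` merely guards the sketch against running forever.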
Further, the step of calculating the accuracy of the entity element recognition model according to the entity element set, the artificial entity elements, and the extracted entity elements includes:
Step S631: divide the size of the entity element set by the number of extracted entity elements to obtain the precision of the entity element recognition model;
In this embodiment, since the entity elements in the entity element set must be correct while the entity elements extracted by the model may not belong to the training corpus, the number of entity elements in the entity element set is divided by the number of entity elements extracted by the model, and the result is taken as the precision of the entity element recognition model.
Step S632: divide the size of the entity element set by the number of artificial entity elements to obtain the recall of the entity element recognition model;
In this embodiment, the number of entity elements in the entity element set is divided by the number of manually annotated entity elements, and the result is taken as the recall of the entity element recognition model.
Step S633: calculate the product of the precision and the recall and the sum of the precision and the recall, divide the product by the sum, and multiply the result by a preset value to obtain the first F value of the entity element recognition model;
After the recall and precision of the entity element recognition model are obtained, the precision is multiplied by the recall, the precision is added to the recall, and the product is divided by the sum. The result is multiplied by a preset value to obtain the first F value. Optionally, the preset value is 2, that is, first F value = 2 × (precision × recall) / (precision + recall).
Step S634: calculate the accuracy of the entity element recognition model according to its precision, recall, and first F value.
After the precision, recall, and first F value are obtained, the accuracy of the entity element recognition model can be derived from these three quantities by assigning each a different weight; it is easy to understand that the weight ratio corresponding to each value can be designed by the developer.
This embodiment calculates the precision, recall, and first F value of the entity element recognition model from the entity element set, the artificial entity elements, and the extracted entity elements, and accurately computes the model's accuracy from them.
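Steps S631-S634 amount to the standard precision/recall/F computation over the annotation overlap, followed by a weighted combination. A sketch is below; the example annotations and the 0.3/0.3/0.4 weight split are invented, since the source leaves the weighting to the developer:

```python
def model_accuracy(artificial, extracted, weights=(0.3, 0.3, 0.4)):
    """Precision, recall, F value, and a weighted accuracy from annotation overlap.

    `weights` is an invented example split over (precision, recall, F value);
    the source leaves the weight ratio to the developer."""
    overlap = set(artificial) & set(extracted)  # the entity element set
    precision = len(overlap) / len(extracted)   # step S631
    recall = len(overlap) / len(artificial)     # step S632
    f_value = 2 * precision * recall / (precision + recall)  # step S633
    w_p, w_r, w_f = weights
    accuracy = w_p * precision + w_r * recall + w_f * f_value  # step S634
    return precision, recall, f_value, accuracy

p, r, f, acc = model_accuracy(
    artificial=["黄晓明", "长生医药", "演员", "股东"],  # 4 manually annotated elements
    extracted=["黄晓明", "长生医药", "医药"],           # 3 model-extracted elements
)
```

Here the overlap has 2 elements, so precision is 2/3, recall is 2/4, and the F value is 2 × (2/3 × 1/2) / (2/3 + 1/2) = 4/7; the same computation applies verbatim to the label recognition model in the steps that follow.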
Further, the step of training the label recognition model includes:
Step S71: extract from the training corpus through the label recognition model to obtain the corresponding extracted labels;
In this embodiment, the training corpus is input into the label recognition model, the model extracts from the corpus, and the labels it extracts are taken as the extracted labels corresponding to the training corpus. It should be understood that, since the accuracy of the label recognition model has not yet been judged, it may extract wrong labels, that is, labels that do not belong to the training corpus.
Step S72: determine the labels among the artificial labels that coincide with the extracted labels as the label set;
Since the training corpus was manually labeled in the above steps, the artificial labels are compared with the extracted labels, and those artificial labels that coincide with extracted labels form the label set. It is easy to understand that the labels in this set must be correct labels, that is, labels that do belong to the input training corpus.
Step S73: calculating the accuracy of the label recognition model according to the label set, the artificial labels, and the extracted labels.
After the label set is obtained, a preset formula is applied to the label set, the artificial labels, and the extracted labels to calculate the accuracy of the trained label recognition model.
Step S74: taking a label recognition model whose accuracy exceeds a preset second accuracy as the trained label recognition model.
In this embodiment, if the accuracy of the label recognition model does not exceed the preset second accuracy, the model is not yet precise enough at label extraction and may extract incorrect labels; the training corpus therefore continues to be used as input for training the label recognition model until its accuracy exceeds the preset second accuracy.
This embodiment trains the label recognition model with the training corpus and the artificial labels, and by calculating the model's accuracy ensures that the trained model meets the preset accuracy, thereby guaranteeing that the label recognition model can accurately extract the labels in question-and-answer sentences.
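Steps S71 to S74 amount to an evaluate-and-retrain loop. A minimal sketch under stated assumptions: the `ToyLabelModel`, its `extract`/`train` methods, and the equal-weight accuracy measure are all hypothetical stand-ins for the model and preset formula of the disclosure.

```python
def accuracy(extracted, manual):
    """Toy accuracy: equal-weight mean of precision, recall, and F (illustrative)."""
    overlap = set(extracted) & set(manual)        # the label set of step S72
    p = len(overlap) / len(extracted) if extracted else 0.0
    r = len(overlap) / len(manual) if manual else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return (p + r + f) / 3

class ToyLabelModel:
    """Stand-in model that 'learns' one more correct label per training round."""
    def __init__(self):
        self.known = []

    def extract(self, corpus):
        return list(self.known)

    def train(self, corpus, manual_labels):
        for label in manual_labels:
            if label not in self.known:
                self.known.append(label)
                break

def train_until_accurate(model, corpus, manual_labels, second_accuracy, max_rounds=100):
    """Step S74: keep training on the corpus until the model's accuracy
    exceeds the preset second accuracy."""
    for _ in range(max_rounds):
        extracted = model.extract(corpus)                          # step S71
        if accuracy(extracted, manual_labels) > second_accuracy:   # steps S72-S73
            return model
        model.train(corpus, manual_labels)
    raise RuntimeError("model never reached the preset second accuracy")
```

The loop stops as soon as the computed accuracy clears the threshold, mirroring the retry-until-accurate behavior described above.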
Further, the step of calculating the accuracy of the label recognition model according to the label set, the artificial labels, and the extracted labels includes:
Step S731: dividing the number of labels in the label set by the number of extracted labels to obtain the precision of the label recognition model.
In this embodiment, because the labels in the label set are necessarily correct, whereas the labels extracted by the label recognition model may not belong to the training corpus, the number of labels in the label set is divided by the number of labels extracted by the model, and the result is taken as the precision of the label recognition model.
Step S732: dividing the number of labels in the label set by the number of artificial labels to obtain the recall of the label recognition model.
In this embodiment, the number of labels in the label set is divided by the number of manually annotated labels, and the result is taken as the recall of the label recognition model.
Step S733: calculating the product and the sum of the precision and the recall, dividing the product by the sum, and multiplying the result by a preset value to obtain the second F value of the label recognition model.
After the recall and precision of the label recognition model are obtained, the precision is multiplied by the recall, the precision is added to the recall, and the product is divided by the sum. The result is multiplied by a preset value to obtain the second F value of the label recognition model. Optionally, the preset value is 2, so that the second F value = 2 × (precision × recall) / (precision + recall).
Step S734: calculating the accuracy of the label recognition model according to its precision, recall, and second F value.
After the precision, recall, and second F value are obtained, the accuracy of the label recognition model can be computed from these three factors by assigning a different weight to each of them and combining the weighted values. It should be understood that the weight ratio assigned to each value may be determined by the developer.
In this embodiment, the precision, recall, and second F value of the label recognition model are calculated from the label set and the extracted labels, and the accuracy of the label recognition model is then calculated precisely from the precision, recall, and second F value.
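The arithmetic of steps S731 to S733 is the standard precision/recall/F computation over label counts. A minimal sketch, assuming the label collections can be treated as Python sets:

```python
def label_metrics(manual_labels, extracted_labels, preset_value=2):
    """Steps S731-S733: compute precision, recall, and the second F value
    from the overlap (the label set) of manual and extracted labels."""
    label_set = set(manual_labels) & set(extracted_labels)              # step S72
    precision = len(label_set) / len(extracted_labels)                  # step S731
    recall = len(label_set) / len(manual_labels)                        # step S732
    if precision + recall == 0:
        return precision, recall, 0.0
    f_value = preset_value * precision * recall / (precision + recall)  # step S733
    return precision, recall, f_value
```

For example, if the model extracts four labels of which three coincide with five manual labels, precision is 3/4 = 0.75, recall is 3/5 = 0.6, and the second F value is 2 × 0.75 × 0.6 / 1.35 = 2/3.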
Further, referring to FIG. 4, FIG. 4 is a schematic flowchart of another embodiment of the knowledge-graph-based question answering method of this application. After step S50 obtains the artificial entity elements and the artificial labels corresponding to the words, the method further includes:
Step S80: using the artificial entity elements and the artificial labels as input to a preset TransE algorithm, so that the artificial entity elements and the artificial labels are embedded into a low-dimensional vector space to generate corresponding vector templates.
A TransE algorithm is also preset in this embodiment; TransE is a distributed vector representation based on entities and relations. The artificial entity elements and artificial labels are input into the preset TransE algorithm, which embeds them into a low-dimensional vector space to generate the corresponding vector templates.
Step S90: storing the vector templates in a graph database to construct the corresponding knowledge graph.
After the vector templates are obtained, they are stored in the graph database, and the corresponding knowledge graph is constructed from the multiple vector templates stored there; the manner of constructing the knowledge graph is not elaborated in this embodiment.
This embodiment uses the preset TransE algorithm to obtain vector templates from the artificial entity elements and artificial labels, and constructs the corresponding knowledge graph from those templates, thereby ensuring the comprehensiveness and accuracy of the knowledge graph's data.
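TransE models each relation as a translation in the embedding space, scoring a triple (head, relation, tail) by the distance ‖h + r − t‖. A minimal sketch of the scoring function; the example embeddings are invented for illustration and are not taken from the disclosure:

```python
import math

def transe_score(head, relation, tail):
    """TransE plausibility score: the Euclidean distance ||h + r - t||.
    A low score means the triple fits the learned embedding."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

# Invented 2-d embeddings: an entity element as the head, a label acting
# as the relation, and the entity the triple should point to as the tail.
entity = [1.0, 0.0]
label = [0.0, 1.0]
target = [1.0, 1.0]
score = transe_score(entity, label, target)   # 0.0 -- a perfect translation
```

In training, embeddings are adjusted so that correct (entity, label, entity) triples receive low scores, which is what lets the stored vector templates represent the graph's facts.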
Further, the step of inputting the query sentence into the knowledge graph for querying to obtain the corresponding question-answering result includes:
Step S31: vectorizing the query sentence to generate a corresponding vector set.
In this embodiment, after the query sentence is obtained, it is vectorized by an NLP algorithm to generate the corresponding vector set. It should be understood that the vectorization of the query sentence may also be implemented in other ways, which this embodiment does not limit.
Step S32: matching the vector set against the vector templates in the knowledge graph to obtain the corresponding question-answering result.
After the vector set is obtained, it is matched against the vector templates in the knowledge graph. Optionally, the vector template in the knowledge graph that best matches the vector set is determined by computing the degree of matching between vectors; the selected vector template is then parsed to obtain the corresponding question-answering result.
This embodiment vectorizes the query sentence and, by matching the vector set against the vector templates, determines the question-answering result in the knowledge graph more precisely, thereby ensuring the accuracy of the result.
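The matching of step S32 can be sketched with cosine similarity as the matching degree; the template names and the choice of cosine are assumptions, since the disclosure does not fix a particular similarity measure:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (one possible matching degree)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def best_template(query_vector, templates):
    """Return the name of the stored vector template that best matches
    the query vector (hypothetical template names)."""
    return max(templates, key=lambda name: cosine(query_vector, templates[name]))
```

The best-matching template would then be parsed into the final answer, as described in the step above.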
In addition, an embodiment of this application further provides a non-volatile computer-readable storage medium on which computer-readable instructions are stored; when the computer-readable instructions are executed by a processor, the operations of the knowledge-graph-based question answering method described above are implemented.
The optional embodiments of the non-volatile computer-readable storage medium of this application are substantially the same as the embodiments of the knowledge-graph-based question answering method described above and are not repeated here.
It should be noted that, as used herein, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or system. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or system that includes it.
The serial numbers of the foregoing embodiments of this application are for description only and do not indicate the relative merit of the embodiments.
From the description of the foregoing embodiments, those skilled in the art can clearly understand that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product; the computer software product is stored on a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.
The above are merely optional embodiments of this application and do not thereby limit its patent scope; any equivalent structure or equivalent process transformation made using the content of the specification and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of this application.
Claims (20)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910885936.X | 2019-09-18 | ||
| CN201910885936.XA CN110781284B (en) | 2019-09-18 | 2019-09-18 | Knowledge graph-based question and answer method, device and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021051558A1 true WO2021051558A1 (en) | 2021-03-25 |
Family
ID=69383813
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2019/117583 Ceased WO2021051558A1 (en) | 2019-09-18 | 2019-11-12 | Knowledge graph-based question and answer method and apparatus, and storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN110781284B (en) |
| WO (1) | WO2021051558A1 (en) |
Cited By (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113221573A (en) * | 2021-05-31 | 2021-08-06 | 平安科技(深圳)有限公司 | Entity classification method and device, computing equipment and storage medium |
| CN113535917A (en) * | 2021-06-30 | 2021-10-22 | 山东师范大学 | Intelligent question-answering method and system based on travel knowledge map |
| CN114357194A (en) * | 2022-01-11 | 2022-04-15 | 平安科技(深圳)有限公司 | Seed data expansion method and device, computer equipment and storage medium |
| CN114860934A (en) * | 2022-05-09 | 2022-08-05 | 青岛日日顺乐信云科技有限公司 | A Smart Question Answering Method Based on NLP Technology |
| CN114860954A (en) * | 2022-04-28 | 2022-08-05 | 北京明略昭辉科技有限公司 | Entity linking method, device, equipment and medium |
| CN115309982A (en) * | 2022-07-19 | 2022-11-08 | 解放号网络科技有限公司 | Knowledge graph combined user portrait construction method |
| CN115964500A (en) * | 2021-09-08 | 2023-04-14 | 零洞科技有限公司 | Dialogue processing method, device, computer equipment and storage medium |
| CN116795954A (en) * | 2022-03-09 | 2023-09-22 | 北京有竹居网络技术有限公司 | Problem-solving method, device, storage medium and electronic device |
| CN119251515A (en) * | 2024-11-18 | 2025-01-03 | 安徽大学 | A visual and knowledge multimodal large model algorithm for pest identification |
| CN120179809A (en) * | 2025-04-08 | 2025-06-20 | 中国标准化研究院 | A knowledge reorganization retrieval method based on unstructured documents |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111597321B (en) * | 2020-07-08 | 2024-06-11 | 腾讯科技(深圳)有限公司 | Prediction method and device of answers to questions, storage medium and electronic equipment |
| CN111914074B (en) * | 2020-07-16 | 2023-06-20 | 华中师范大学 | Method and system for dialog generation in limited domain based on deep learning and knowledge graph |
| CN112182178A (en) * | 2020-09-25 | 2021-01-05 | 北京字节跳动网络技术有限公司 | Intelligent question answering method, device, equipment and readable storage medium |
| CN112507135B (en) * | 2020-12-17 | 2021-11-16 | 深圳市一号互联科技有限公司 | Knowledge graph query template construction method, device, system and storage medium |
| CN113254635B (en) * | 2021-04-14 | 2021-11-05 | 腾讯科技(深圳)有限公司 | Data processing method, device and storage medium |
| CN115794857A (en) * | 2022-01-19 | 2023-03-14 | 支付宝(杭州)信息技术有限公司 | Query request processing method and device |
| CN115186780B (en) * | 2022-09-14 | 2022-12-06 | 江西风向标智能科技有限公司 | Discipline knowledge point classification model training method, system, storage medium and equipment |
| CN116578933B (en) * | 2023-05-15 | 2025-04-01 | 深圳市鹏丰人力资源管理有限公司 | A talent resume matching method and system based on artificial intelligence big data |
| CN116975657B (en) * | 2023-09-25 | 2023-11-28 | 中国人民解放军军事科学院国防科技创新研究院 | Instant advantage window mining method and device based on manual experience |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140363082A1 (en) * | 2013-06-09 | 2014-12-11 | Apple Inc. | Integrating stroke-distribution information into spatial feature extraction for automatic handwriting recognition |
| CN107066541A (en) * | 2017-03-13 | 2017-08-18 | 平安科技(深圳)有限公司 | The processing method and system of customer service question and answer data |
| CN107766483A (en) * | 2017-10-13 | 2018-03-06 | 华中科技大学 | The interactive answering method and system of a kind of knowledge based collection of illustrative plates |
| CN108491433A (en) * | 2018-02-09 | 2018-09-04 | 平安科技(深圳)有限公司 | Chat answer method, electronic device and storage medium |
| CN108959366A (en) * | 2018-05-21 | 2018-12-07 | 宁波薄言信息技术有限公司 | A kind of method of opening question and answer |
| CN109815318A (en) * | 2018-12-24 | 2019-05-28 | 平安科技(深圳)有限公司 | The problems in question answering system answer querying method, system and computer equipment |
| CN110032632A (en) * | 2019-04-04 | 2019-07-19 | 平安科技(深圳)有限公司 | Intelligent customer service answering method, device and storage medium based on text similarity |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN105068661B (en) * | 2015-09-07 | 2018-09-07 | 百度在线网络技术(北京)有限公司 | Man-machine interaction method based on artificial intelligence and system |
| CN105868313B (en) * | 2016-03-25 | 2019-02-12 | 浙江大学 | A knowledge graph question answering system and method based on template matching technology |
| CN107992585B (en) * | 2017-12-08 | 2020-09-18 | 北京百度网讯科技有限公司 | Universal label mining method, device, server and medium |
| US11030226B2 (en) * | 2018-01-19 | 2021-06-08 | International Business Machines Corporation | Facilitating answering questions involving reasoning over quantitative information |
| CN109033374B (en) * | 2018-07-27 | 2022-03-15 | 四川长虹电器股份有限公司 | Knowledge graph retrieval method based on Bayesian classifier |
| CN109492077B (en) * | 2018-09-29 | 2020-09-29 | 北京智通云联科技有限公司 | Knowledge graph-based petrochemical field question-answering method and system |
| CN109308321A (en) * | 2018-11-27 | 2019-02-05 | 烟台中科网络技术研究所 | A knowledge question answering method, knowledge question answering system and computer readable storage medium |
2019
- 2019-09-18 CN CN201910885936.XA patent/CN110781284B/en active Active
- 2019-11-12 WO PCT/CN2019/117583 patent/WO2021051558A1/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN110781284A (en) | 2020-02-11 |
| CN110781284B (en) | 2024-05-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2021051558A1 (en) | Knowledge graph-based question and answer method and apparatus, and storage medium | |
| WO2021132927A1 (en) | Computing device and method of classifying category of data | |
| WO2020164267A1 (en) | Text classification model construction method and apparatus, and terminal and storage medium | |
| WO2020107765A1 (en) | Statement analysis processing method, apparatus and device, and computer-readable storage medium | |
| WO2020143322A1 (en) | User request detection method and apparatus, computer device, and storage medium | |
| WO2020015067A1 (en) | Data acquisition method, device, equipment and storage medium | |
| WO2020215681A1 (en) | Indication information generation method and apparatus, terminal, and storage medium | |
| WO2020107761A1 (en) | Advertising copy processing method, apparatus and device, and computer-readable storage medium | |
| WO2019037197A1 (en) | Method and device for training topic classifier, and computer-readable storage medium | |
| WO2016112558A1 (en) | Question matching method and system in intelligent interaction system | |
| WO2021003956A1 (en) | Product information management method, apparatus and device, and storage medium | |
| WO2020253115A1 (en) | Voice recognition-based product recommendation method, apparatus and device, and storage medium | |
| WO2020082562A1 (en) | Symbol identification method, apparatus, device, and storage medium | |
| WO2021027143A1 (en) | Information pushing method, apparatus and device, and computer-readable storage medium | |
| WO2020258657A1 (en) | Abnormality detection method and apparatus, computer device and storage medium | |
| WO2015020354A1 (en) | Apparatus, server, and method for providing conversation topic | |
| WO2021215620A1 (en) | Device and method for automatically generating domain-specific image caption by using semantic ontology | |
| WO2020082766A1 (en) | Association method and apparatus for input method, device and readable storage medium | |
| WO2018205373A1 (en) | Method and apparatus for estimating injury claims settlement and loss adjustment expense, server and medium | |
| WO2020258656A1 (en) | Code segment generation method and apparatus, storage medium and computer device | |
| WO2020087704A1 (en) | Credit information management method, apparatus, and device, and storage medium | |
| WO2020107762A1 (en) | Ctr estimation method and device, and computer readable storage medium | |
| WO2020087981A1 (en) | Method and apparatus for generating risk control audit model, device and readable storage medium | |
| WO2019024485A1 (en) | Data sharing method and device and computer readable storage medium | |
| WO2020233089A1 (en) | Test case generating method and apparatus, terminal, and computer readable storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19945754; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 19945754; Country of ref document: EP; Kind code of ref document: A1 |