
CN110866042B - Intelligent query method and device for table and computer readable storage medium

Info

Publication number
CN110866042B
Authority
CN
China
Prior art keywords
query
intelligent
word vector
training
keyword
Prior art date
Legal status
Active
Application number
CN201910975458.1A
Other languages
Chinese (zh)
Other versions
CN110866042A (en)
Inventor
王建华
马琳
张晓东
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910975458.1A priority Critical patent/CN110866042B/en
Publication of CN110866042A publication Critical patent/CN110866042A/en
Priority to PCT/CN2020/098951 priority patent/WO2021068565A1/en
Application granted granted Critical
Publication of CN110866042B publication Critical patent/CN110866042B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/24: Querying
    • G06F 16/245: Query processing
    • G06F 16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to artificial intelligence technology and discloses an intelligent table query method, which comprises the following steps: receiving an original table set and a label set, and splitting the original table set to obtain a standard table set; performing part-of-speech coding on the label set to obtain a word vector set; inputting the standard table set and the word vector set into an intelligent query model for training to obtain a training value, the intelligent query model finishing training when the training value is smaller than a preset threshold; receiving query content from a user, extracting keywords from the query content based on a keyword extraction technique to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting the table set required by the query content. The invention also provides an intelligent table query device and a computer readable storage medium. The invention can realize an accurate and efficient intelligent table query function.

Description

Intelligent query method and device for table and computer readable storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and apparatus for intelligently querying a table, and a computer readable storage medium.
Background
With the rapid development of the internet, the scale of data has expanded rapidly, and the demand for fast data queries keeps growing. Most data are stored in tables, such as a company's daily operating revenue data or a property company's property registration information. At present, most table-based queries rely on table traversal or on user keyword search; although these methods can meet query needs to a certain extent, when a table is large both of them search slowly and consume a large amount of computation memory.
Disclosure of Invention
The invention provides a table intelligent query method, a table intelligent query device and a computer readable storage medium, with the main aim of intelligently querying tables according to a user's query requirements.
In order to achieve the above object, the present invention provides a method for intelligently querying a table, including:
receiving an original table set and a label set, and splitting the original table set to obtain a standard table set;
performing part-of-speech coding on the tag set to obtain a word vector set;
inputting the standard table set and the word vector set into an intelligent query model for training to obtain a training value, and comparing the training value with a preset threshold: if the training value is larger than the preset threshold, the intelligent query model continues training; if the training value is smaller than the preset threshold, the intelligent query model finishes training;
receiving query content from a user, extracting keywords from the query content based on a keyword extraction technique to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting the table set required by the query content.
Optionally, the splitting process splits the original table set into a user layer, a calculation layer and a data layer, and forms the user layer, the calculation layer and the data layer into the standard table set;
wherein:
the user layer consists of the table title and the table header of each table in the original table set;
the data layer consists of the table body of each table in the original table set;
the calculation layer provides the function of querying between the user layer and the data layer.
Optionally, the part-of-speech encoding the tag set to obtain a word vector set includes:
performing one-hot coding on the tag set to obtain a primary word vector set;
and dimension reduction is carried out on the primary word vector set to obtain the word vector set.
Optionally, the dimension reduction includes:
establishing a forward probability model and a backward probability model;
and optimizing the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
Optionally, the standard table set and the word vector set are input into an intelligent query model to be trained to obtain training values, including:
the user layer information in the standard table set is subjected to part-of-speech coding to obtain a user layer information vector set;
inputting the user layer information vector set into the intelligent query model, and sequentially performing convolution operation, pooling operation and activation operation on the intelligent query model to obtain a predicted value set;
and carrying out loss calculation on the predicted value set and the word vector set to obtain the training value.
In addition, in order to achieve the above object, the present invention also provides an intelligent table query device, which includes a memory and a processor, wherein an intelligent table query program capable of running on the processor is stored in the memory, and the intelligent table query program, when executed by the processor, implements the following steps:
receiving an original table set and a label set, and splitting the original table set to obtain a standard table set;
performing part-of-speech coding on the tag set to obtain a word vector set;
inputting the standard table set and the word vector set into an intelligent query model for training to obtain a training value, and comparing the training value with a preset threshold: if the training value is larger than the preset threshold, the intelligent query model continues training; if the training value is smaller than the preset threshold, the intelligent query model finishes training;
receiving query content from a user, extracting keywords from the query content based on a keyword extraction technique to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting the table set required by the query content.
Optionally, the splitting process splits the original table set into a user layer, a calculation layer and a data layer, and forms the user layer, the calculation layer and the data layer into the standard table set;
wherein:
the user layer consists of the table title and the table header of each table in the original table set;
the data layer consists of the table body of each table in the original table set;
the calculation layer provides the function of querying between the user layer and the data layer.
Optionally, the part-of-speech encoding the tag set to obtain a word vector set includes:
performing one-hot coding on the tag set to obtain a primary word vector set;
and dimension reduction is carried out on the primary word vector set to obtain the word vector set.
Optionally, the dimension reduction includes:
establishing a forward probability model and a backward probability model;
and optimizing the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon an intelligent table query program executable by one or more processors to implement the steps of the intelligent table query method described above.
According to the invention, the standard table set is obtained by splitting the original table set, so the splitting process breaks a huge, monolithic original table into small tables that support quick queries; the intelligent query model is trained on the standard table set so that it provides an accurate and efficient query function; and keyword extraction is performed on the user's query content, so the content the user wants is identified precisely and queried quickly by the intelligent query model. Therefore, the intelligent table query method, the intelligent table query device and the computer readable storage medium of the invention can realize an accurate and efficient table query function.
Drawings
FIG. 1 is a flowchart of a method for intelligently querying a table according to an embodiment of the present invention;
fig. 2 is a schematic diagram of an internal structure of a table intelligent query device according to an embodiment of the invention;
fig. 3 is a schematic diagram of a table intelligent query program in a table intelligent query device according to an embodiment of the invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The invention provides a table intelligent query method. Referring to fig. 1, a flow chart of the table intelligent query method according to an embodiment of the invention is shown. The method may be performed by an apparatus, and the apparatus may be implemented by software and/or hardware.
In this embodiment, the intelligent query method for the table includes:
s1, receiving an original table set and a label set, and splitting the original table set to obtain a standard table set.
In the preferred embodiment of the invention, the original table set consists of tables generated automatically under different business scenarios; for example, a real estate agency collects multiple house listings and summarizes them with EXCEL software to obtain a property information table set, and a telecommunications company generates a user consumption table set according to each user's call and data usage.
The label set describes each table in the original table set. Preferably, each description is a combination of keywords; for the property information table set, for example, the label set combines four keywords: community name and unit number, floor area, market price, and house layout, so a record in the label set for one house might read: Community A Unit 3A + 89 square meters + 1.2 million + three bedrooms, one living room, two bathrooms.
Preferably, the splitting process splits the original table set into a user layer, a calculation layer and a data layer to obtain the standard table set. A table consists of a table title, a table header and a table body, where the title and header are expressed as text; for a product sales statistics table, for example, the title is "product sales statistics table" and the header contains the product number, product name, specification, packaging mode and so on. The titles and headers of the original table set are therefore extracted to form the user layer. The table body carries the data of the whole table, so the table bodies of the original table set are extracted to form the data layer. Once a table has been split into the user layer and the data layer, a query relation must be established between the two layers; this is the function of the calculation layer. The calculation layer may adopt a multi-element linear index query of the form:
y = a_1*x_1 + a_2*x_2 + … + a_n*x_n
where y is a datum in the data layer, x_1, x_2, …, x_n are the text information of the user layer, and a_1, a_2, …, a_n are index coefficients taking the value 0 or 1. For example, to query the property information priced at 1.3 million, set y = 130 and solve backwards for all user layer text information that satisfies the price of 1.3 million, i.e. the titles and headers in the original table set; if the titles are Community A, Community B and Community E, the result indicates which of them contains property information priced at 1.3 million, for instance a record priced at 1.3 million under the header of Community A.
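As an illustrative sketch (not the patent's reference implementation), the following Python snippet shows how one original table could be split into a user layer (title and header) and a data layer (body), with a minimal calculation layer that selects data-layer rows through 0/1 index coefficients as in the linear index query above; the table structure and field names are assumed for the example.

```python
# Minimal sketch: split a table into user layer / data layer and query via the calculation layer.
# The table structure and field names below are assumptions for illustration only.

def split_table(table):
    """Split one original table into user layer (title + header) and data layer (body)."""
    user_layer = {"title": table["title"], "header": table["header"]}
    data_layer = table["body"]                      # list of rows, one row per record
    return user_layer, data_layer

def calculation_layer(user_layer, data_layer, target_value, column):
    """Linear index query y = a1*x1 + ... + an*xn with 0/1 coefficients:
    the coefficients select the rows whose value in `column` equals target_value."""
    col = user_layer["header"].index(column)
    coefficients = [1 if row[col] == target_value else 0 for row in data_layer]
    return [row for a, row in zip(coefficients, data_layer) if a == 1]

# Usage example (values assumed):
table = {
    "title": "Property information table",
    "header": ["community", "area_m2", "price_wan", "layout"],
    "body": [
        ["Community A Unit 3A", 89, 120, "3 bedrooms, 1 living room, 2 bathrooms"],
        ["Community B Unit 1C", 102, 130, "3 bedrooms, 2 living rooms, 2 bathrooms"],
    ],
}
user, data = split_table(table)
print(calculation_layer(user, data, 130, "price_wan"))   # rows priced at 1.3 million
```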
S2, performing part-of-speech coding on the tag set to obtain a word vector set.
Because the intelligent query model cannot identify text information directly (effective identification requires extracting features from the text and then making a discriminative judgment), the part-of-speech coding preferably represents every keyword of every tag in the tag set as an N-dimensional matrix vector, i.e. it converts text information into numerical information that the subsequent model can recognize and train on, where N is the number of keywords in the tag set.
Further, the part-of-speech coding first performs a one-hot encoding operation on the tag set to obtain a primary word vector set, and then performs dimension reduction on the primary word vector set to obtain the word vector set.
The one-hot operation is as follows:
v_i = (v_1, v_2, …, v_N), where v_j = 1 if j = i and v_j = 0 otherwise,
where i represents the keyword number, v_i is the N-dimensional matrix vector representing keyword i, all the v_i together form the primary word vector set (assuming s keywords in total), and v_j is the j-th element of the N-dimensional matrix vector.
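A minimal sketch of this one-hot coding step, assuming the tag set has already been split into its keywords (the keyword values below are invented for illustration):

```python
# Minimal sketch of one-hot (part-of-speech) coding of tag keywords.
# The keyword vocabulary and example tags are assumptions for illustration.

def build_vocabulary(tags):
    """Collect the distinct keywords of all tags; N is the vocabulary size."""
    vocab = sorted({kw for tag in tags for kw in tag})
    return {kw: idx for idx, kw in enumerate(vocab)}

def one_hot(keyword, vocab):
    """Return the N-dimensional one-hot vector v_i for one keyword."""
    vec = [0] * len(vocab)
    vec[vocab[keyword]] = 1
    return vec

tags = [
    ["Community A Unit 3A", "89 square meters", "1.2 million", "3br 1lr 2ba"],
    ["Community B Unit 1C", "102 square meters", "1.3 million", "3br 2lr 2ba"],
]
vocab = build_vocabulary(tags)
primary_word_vectors = [[one_hot(kw, vocab) for kw in tag] for tag in tags]
print(len(vocab), primary_word_vectors[0][0])
```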
Further, the dimension reduction reduces the generated N-dimensional matrix vectors to lower-dimensional data that is easier to compute with during subsequent model training, i.e. it ultimately converts the primary word vector set into the word vector set.
Preferably, the dimension reduction establishes a forward probability model and a backward probability model, and then optimizes the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
Further, the forward probability model and the backward probability model are respectively:
[forward probability model formula, given as an image in the original publication]
[backward probability model formula, given as an image in the original publication]
optimizing the forward probability model and the backward probability model:
[optimization formula, given as an image in the original publication]
where max denotes the optimization, ∂ denotes the partial derivative, v_i denotes the N-dimensional matrix vector of keyword i, and the tag set contains s keywords in total. After the forward probability model and the backward probability model have been optimized, the dimensionality of the N-dimensional matrix vectors is reduced, and the word vector set is obtained once the dimension reduction is complete.
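Because the forward and backward probability models appear only as formula images in this text, the following is purely an illustrative sketch of the end effect described here: mapping the N-dimensional one-hot vectors down to low-dimensional word vectors through a learned projection matrix. The projection approach, sizes and random initialization are assumptions, not the patent's stated models.

```python
# Illustrative sketch only: reduce N-dimensional one-hot vectors to d-dimensional word vectors
# via a projection (embedding) matrix. The projection approach and sizes are assumptions;
# the patent's forward/backward probability models are given as formula images in the original.
import numpy as np

rng = np.random.default_rng(0)
N, d = 1000, 50                                   # vocabulary size and target dimension (assumed)
projection = rng.normal(scale=0.1, size=(N, d))   # would be learned by optimizing the models

def reduce_dimension(one_hot_vector):
    """Map one N-dimensional one-hot vector to a d-dimensional word vector."""
    return np.asarray(one_hot_vector) @ projection

one_hot = np.zeros(N)
one_hot[7] = 1                                    # keyword number i = 7 (example)
word_vector = reduce_dimension(one_hot)
print(word_vector.shape)                          # (50,)
```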
S3, inputting the standard table set and the word vector set into an intelligent query model for training to obtain a training value, and comparing the training value with a preset threshold: if the training value is larger than the preset threshold, the intelligent query model continues training; if the training value is smaller than the preset threshold, the intelligent query model finishes training.
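A minimal sketch of this train-until-threshold loop; the model object, its train_one_round method, the inputs and the threshold value are placeholders for illustration (how the training value itself is computed is detailed below):

```python
# Minimal sketch of the S3 training loop: keep training the intelligent query model
# until the training value (loss) drops below the preset threshold.
# `model.train_one_round` and its inputs are placeholders for illustration.

def train_intelligent_query_model(model, standard_tables, word_vectors,
                                  threshold=0.01, max_rounds=10000):
    for _ in range(max_rounds):
        training_value = model.train_one_round(standard_tables, word_vectors)
        if training_value < threshold:   # training value below preset threshold: done
            return model
    return model                         # safety cap so the sketch always terminates
```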
Preferably, obtaining the training value includes: performing part-of-speech coding on the user layer information in the standard table set to obtain a user layer information vector set; inputting the user layer information vector set into the intelligent query model, which sequentially performs a convolution operation, a pooling operation and an activation operation to obtain a predicted value set; and performing loss calculation on the predicted value set and the word vector set to obtain the training value.
Further, the convolution operation and the pooling operation comprise: constructing a convolution template in advance, determining a convolution step length, and convolving the convolution template with the user layer information vector set according to the convolution step length to obtain the convolution matrix set, which completes the convolution operation; the pooling operation is then completed by replacing each matrix in the convolution matrix set with its maximum value or average value.
Further, the pre-constructed convolution template may be a standard 3×3 matrix. The matrix resulting from the convolution operation is computed from left to right with a convolution step of 1: the pre-constructed 3×3 convolution template is first multiplied element by element with the top-left 3×3 sub-matrix of a 9×9 feature candidate matrix from the feature candidate set (for example 1*0, 0*3, 1*1 and so on), and the products are summed to give one element of the result matrix; the template is then moved one step to the right according to the convolution step of 1 and the same computation is repeated until the whole matrix has been traversed, which completes the convolution operation. (The specific template, feature matrix and result values are shown as images in the original publication.) The convolution operation therefore produces a large number of small matrices, so the pooling operation reduces their dimensionality; preferably the maximization principle is used, replacing each small matrix with its largest value, for example 3 and 7 in the illustrated matrices, which completes the pooling operation.
Preferably, the convolution and pooling operations are repeated; for example, 16 rounds of convolution and pooling can be used to obtain the final feature matrix set.
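A minimal sketch of one convolution-plus-max-pooling round as described above, using an assumed 3×3 template, a stride of 1 and an assumed 9×9 input matrix:

```python
# Minimal sketch: 3x3 convolution with stride 1 over a 9x9 matrix, followed by max pooling.
# Template and input values are assumptions for illustration.
import numpy as np

def convolve(matrix, template, stride=1):
    """Slide the template over the matrix left-to-right, top-to-bottom with the given stride."""
    k = template.shape[0]
    out_size = (matrix.shape[0] - k) // stride + 1
    out = np.zeros((out_size, out_size))
    for r in range(out_size):
        for c in range(out_size):
            window = matrix[r*stride:r*stride+k, c*stride:c*stride+k]
            out[r, c] = np.sum(window * template)   # element-wise products, then summed
    return out

def max_pool(matrix):
    """Pooling by the maximization principle: replace the matrix with its largest value."""
    return matrix.max()

template = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 1]])   # assumed 3x3 template
feature = np.arange(81).reshape(9, 9) % 10               # assumed 9x9 feature matrix
conv = convolve(feature, template, stride=1)             # 7x7 convolution result
print(conv.shape, max_pool(conv))
```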
Preferably, the activation operation estimates the probability of each matrix in the feature matrix set through a softmax function and selects the prediction result with the highest probability as the final prediction output. The softmax function is:
p(matrix_i) = e^(matrix_i) / Σ_{j=1..k} e^(matrix_j)
where p(matrix_i) represents the output probability of matrix i in the feature matrix set, k represents the amount of data in the feature matrix set, e is the base of the natural logarithm (an infinite non-repeating decimal), and j ranges over the predicted value set. For example, if the probability calculated for one feature matrix is 0.21 and the probability calculated for another is 0.64, the matrix with probability 0.64 is taken to represent the feature matrix set (the specific matrices appear as images in the original publication).
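A minimal sketch of this softmax activation; the scores below are invented, and in practice they would come from the pooled feature matrix set:

```python
# Minimal sketch of the softmax activation: turn feature-matrix scores into probabilities
# and pick the highest-probability prediction. The scores are assumptions for illustration.
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

scores = [3.0, 7.0, 5.0]                 # e.g. pooled maxima of the feature matrices (assumed)
probs = softmax(scores)
best = max(range(len(probs)), key=probs.__getitem__)
print(probs, "-> prediction index:", best)
```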
The loss calculation is:
[loss function formula, given as an image in the original publication]
where t is the size of the word vector set, y_i is the word vector set, and y_i' is the predicted value set.
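The loss formula itself is given as an image in the original, so the following sketch simply assumes a mean squared error between the word vector set and the predicted value set:

```python
# Hedged sketch of the loss calculation: the patent's exact formula is an image in the
# original, so mean squared error between word vectors and predictions is assumed here.
import numpy as np

def training_value(word_vectors, predicted_values):
    y = np.asarray(word_vectors, dtype=float)
    y_pred = np.asarray(predicted_values, dtype=float)
    t = len(y)                                   # size of the word vector set
    return float(np.sum((y - y_pred) ** 2) / t)

print(training_value([[1.0, 0.0], [0.5, 0.5]], [[0.9, 0.1], [0.4, 0.6]]))
```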
S4, receiving query content from a user, extracting keywords from the query content based on a keyword extraction algorithm to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the intelligent query model that has completed training to obtain the table set required by the query content, and outputting the table set.
Since the query content input by the user is usually not in the prescribed keyword combination form and tends to be colloquial, such as "I want to find the telephone bill of September", keyword extraction must be performed on the query content input by the user, extracting the keywords "September" and "telephone bill" from "I want to find the telephone bill of September".
Preferably, the keyword extraction may be performed by using a traversal method, for example, splitting and de-duplicating all keywords of the tag set to construct a keyword table, and sequentially comparing the query content with the keyword table to complete the keyword extraction.
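A minimal sketch of this traversal-based keyword extraction; the tag contents and query below are invented for illustration:

```python
# Minimal sketch of traversal-based keyword extraction: build a de-duplicated keyword table
# from the tag set, then scan the user's query for those keywords. Values are assumed.

def build_keyword_table(tag_set):
    """Split every tag into keywords and de-duplicate them."""
    return sorted({kw.strip() for tag in tag_set for kw in tag.split("+")})

def extract_keywords(query, keyword_table):
    """Sequentially compare the query content against the keyword table."""
    return [kw for kw in keyword_table if kw in query]

tags = ["September + telephone bill", "October + data usage"]
keyword_table = build_keyword_table(tags)
print(extract_keywords("I want to find the telephone bill of September", keyword_table))
# ['September', 'telephone bill']
```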
Further, the intelligent query model obtains the table set required by the query content: the query content corresponds to user layer information, so the function provided by the calculation layer for querying the data layer from the user layer is called to obtain the data layer, and the data layer and the user layer are then recombined to obtain the table set.
The invention also provides an intelligent query device for the table. Referring to fig. 2, an internal structure diagram of a table intelligent query device according to an embodiment of the invention is shown.
In this embodiment, the table intelligent query device 1 may be a PC (Personal Computer), a terminal device such as a smart phone, a tablet computer or a portable computer, or a server. The table intelligent query device 1 comprises at least a memory 11, a processor 12, a communication bus 13, and a network interface 14.
The memory 11 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments the memory 11 may be an internal storage unit of the table intelligent query device 1, such as a hard disk of the table intelligent query device 1. In other embodiments the memory 11 may also be an external storage device of the table intelligent query device 1, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the table intelligent query device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the table intelligent query device 1. The memory 11 may be used not only to store application software installed in the table intelligent query device 1 and various types of data, such as the code of the table intelligent query program 01, but also to temporarily store data that has been output or is to be output.
The processor 12 may in some embodiments be a Central Processing Unit (CPU), controller, microcontroller, microprocessor or other data processing chip, used to run the program code or process the data stored in the memory 11, for example to execute the table intelligent query program 01.
The communication bus 13 is used to enable connection communication between these components.
The network interface 14 may optionally comprise a standard wired interface, a wireless interface (e.g. WI-FI interface), typically used to establish a communication connection between the apparatus 1 and other electronic devices.
Optionally, the device 1 may further comprise a user interface, which may comprise a Display (Display), an input unit such as a Keyboard (Keyboard), and a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or a display unit, as appropriate, for displaying information processed in the form intelligent querying device 1 and for displaying a visual user interface.
Fig. 2 shows only the table intelligent query device 1 with components 11-14 and the table intelligent query program 01; those skilled in the art will understand that the structure shown in fig. 2 does not constitute a limitation of the table intelligent query device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
In the embodiment of the apparatus 1 shown in fig. 2, the table intelligent query program 01 is stored in the memory 11; the processor 12 implements the following steps when executing the table intelligent query program 01 stored in the memory 11:
step one, receiving an original table set and a label set, and splitting the original table set to obtain a standard table set.
In the preferred embodiment of the invention, the original table set consists of tables generated automatically under different business scenarios; for example, a real estate agency collects multiple house listings and summarizes them with EXCEL software to obtain a property information table set, and a telecommunications company generates a user consumption table set according to each user's call and data usage.
The label set describes each table in the original table set. Preferably, each description is a combination of keywords; for the property information table set, for example, the label set combines four keywords: community name and unit number, floor area, market price, and house layout, so a record in the label set for one house might read: Community A Unit 3A + 89 square meters + 1.2 million + three bedrooms, one living room, two bathrooms.
Preferably, the splitting process splits the original table set into a user layer, a calculation layer and a data layer to obtain the standard table set. A table consists of a table title, a table header and a table body, where the title and header are expressed as text; for a product sales statistics table, for example, the title is "product sales statistics table" and the header contains the product number, product name, specification, packaging mode and so on. The titles and headers of the original table set are therefore extracted to form the user layer. The table body carries the data of the whole table, so the table bodies of the original table set are extracted to form the data layer. Once a table has been split into the user layer and the data layer, a query relation must be established between the two layers; this is the function of the calculation layer. The calculation layer may adopt a multi-element linear index query of the form:
y = a_1*x_1 + a_2*x_2 + … + a_n*x_n
where y is a datum in the data layer, x_1, x_2, …, x_n are the text information of the user layer, and a_1, a_2, …, a_n are index coefficients taking the value 0 or 1. For example, to query the property information priced at 1.3 million, set y = 130 and solve backwards for all user layer text information that satisfies the price of 1.3 million, i.e. the titles and headers in the original table set; if the titles are Community A, Community B and Community E, the result indicates which of them contains property information priced at 1.3 million, for instance a record priced at 1.3 million under the header of Community A.
Step two, part-of-speech coding is carried out on the tag set to obtain a word vector set.
Because the intelligent query model cannot identify text information directly (effective identification requires extracting features from the text and then making a discriminative judgment), the part-of-speech coding preferably represents every keyword of every tag in the tag set as an N-dimensional matrix vector, i.e. it converts text information into numerical information that the subsequent model can recognize and train on, where N is the number of keywords in the tag set.
Further, the part-of-speech coding first performs a one-hot encoding operation on the tag set to obtain a primary word vector set, and then performs dimension reduction on the primary word vector set to obtain the word vector set.
The one-hot operation is as follows:
v_i = (v_1, v_2, …, v_N), where v_j = 1 if j = i and v_j = 0 otherwise,
where i represents the keyword number, v_i is the N-dimensional matrix vector representing keyword i, all the v_i together form the primary word vector set (assuming s keywords in total), and v_j is the j-th element of the N-dimensional matrix vector.
Further, the dimension reduction reduces the generated N-dimensional matrix vectors to lower-dimensional data that is easier to compute with during subsequent model training, i.e. it ultimately converts the primary word vector set into the word vector set.
Preferably, the dimension reduction establishes a forward probability model and a backward probability model, and then optimizes the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
Further, the forward probability model and the backward probability model are respectively:
[forward probability model formula, given as an image in the original publication]
[backward probability model formula, given as an image in the original publication]
optimizing the forward probability model and the backward probability model:
[optimization formula, given as an image in the original publication]
where max denotes the optimization, ∂ denotes the partial derivative, v_i denotes the N-dimensional matrix vector of keyword i, and the tag set contains s keywords in total. After the forward probability model and the backward probability model have been optimized, the dimensionality of the N-dimensional matrix vectors is reduced, and the word vector set is obtained once the dimension reduction is complete.
Step three, inputting the standard table set and the word vector set into an intelligent query model for training to obtain a training value, and comparing the training value with a preset threshold: if the training value is larger than the preset threshold, the intelligent query model continues training; if the training value is smaller than the preset threshold, the intelligent query model finishes training.
Preferably, obtaining the training value includes: performing part-of-speech coding on the user layer information in the standard table set to obtain a user layer information vector set; inputting the user layer information vector set into the intelligent query model, which sequentially performs a convolution operation, a pooling operation and an activation operation to obtain a predicted value set; and performing loss calculation on the predicted value set and the word vector set to obtain the training value.
Further, the convolution operation and the pooling operation comprise: constructing a convolution template in advance, determining a convolution step length, and convolving the convolution template with the user layer information vector set according to the convolution step length to obtain the convolution matrix set, which completes the convolution operation; the pooling operation is then completed by replacing each matrix in the convolution matrix set with its maximum value or average value.
Further, the pre-constructed convolution template may be a standard 3×3 matrix. The matrix resulting from the convolution operation is computed from left to right with a convolution step of 1: the pre-constructed 3×3 convolution template is first multiplied element by element with the top-left 3×3 sub-matrix of a 9×9 feature candidate matrix from the feature candidate set (for example 1*0, 0*3, 1*1 and so on), and the products are summed to give one element of the result matrix; the template is then moved one step to the right according to the convolution step of 1 and the same computation is repeated until the whole matrix has been traversed, which completes the convolution operation. (The specific template, feature matrix and result values are shown as images in the original publication.) The convolution operation therefore produces a large number of small matrices, so the pooling operation reduces their dimensionality; preferably the maximization principle is used, replacing each small matrix with its largest value, for example 3 and 7 in the illustrated matrices, which completes the pooling operation.
Preferably, the convolution and pooling operations are repeated; for example, 16 rounds of convolution and pooling can be used to obtain the final feature matrix set.
Preferably, the activation operation estimates the probability of each matrix in the feature matrix set through a softmax function and selects the prediction result with the highest probability as the final prediction output. The softmax function is:
p(matrix_i) = e^(matrix_i) / Σ_{j=1..k} e^(matrix_j)
where p(matrix_i) represents the output probability of matrix i in the feature matrix set, k represents the amount of data in the feature matrix set, e is the base of the natural logarithm (an infinite non-repeating decimal), and j ranges over the predicted value set. For example, if the probability calculated for one feature matrix is 0.21 and the probability calculated for another is 0.64, the matrix with probability 0.64 is taken to represent the feature matrix set (the specific matrices appear as images in the original publication).
The loss calculation is:
[loss function formula, given as an image in the original publication]
where t is the size of the word vector set, y_i is the word vector set, and y_i' is the predicted value set.
Step four, receiving query content from a user, extracting keywords from the query content based on a keyword extraction algorithm to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the intelligent query model that has completed training to obtain the table set required by the query content, and outputting the table set.
Since the query content input by the user is usually not in the prescribed keyword combination form and tends to be colloquial, such as "I want to find the telephone bill of September", keyword extraction must be performed on the query content input by the user, extracting the keywords "September" and "telephone bill" from "I want to find the telephone bill of September".
Preferably, the keyword extraction may be performed by using a traversal method, for example, splitting and de-duplicating all keywords of the tag set to construct a keyword table, and sequentially comparing the query content with the keyword table to complete the keyword extraction.
Further, the intelligent query model obtains the table set required by the query content: the query content corresponds to user layer information, so the function provided by the calculation layer for querying the data layer from the user layer is called to obtain the data layer, and the data layer and the user layer are then recombined to obtain the table set.
Optionally, in other embodiments, the table intelligent query program may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the invention; the modules referred to here are series of computer program instruction segments capable of performing specific functions, used to describe the execution of the table intelligent query program in the table intelligent query device.
For example, referring to fig. 3, a schematic program module of a table intelligent query program in an embodiment of a table intelligent query apparatus according to the present invention is shown, where the table intelligent query program may be divided into a data receiving and processing module 10, a part-of-speech encoding module 20, an intelligent query model training module 30, and a table query and output module 40, which are exemplary:
the data receiving and processing module 10 is configured to: and receiving an original table set and a label set, and splitting the original table set to obtain a standard table set.
The part-of-speech encoding module 20 is configured to: and performing part-of-speech coding on the tag set to obtain a word vector set.
The intelligent query model training 30 is used to: and inputting the standard table set and the word vector set into an intelligent query model to train to obtain a training value, judging the magnitude relation between the training value and a preset threshold, if the training value is larger than the preset threshold, continuing training by the intelligent query model, and if the training value is smaller than the preset threshold, finishing training by the intelligent query model.
The table queries and outputs 40 for: receiving query content of a user, extracting the query content based on a keyword extraction technology to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the intelligent query model which is trained, obtaining and outputting a form set required by the query content.
The functions or operation steps implemented when the program modules, such as the data receiving and processing module 10, the part-of-speech encoding module 20, the intelligent query model training module 30, the table query and output module 40, etc., are executed are substantially the same as those of the foregoing embodiments, and will not be described herein.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where a table intelligent query program is stored, where the table intelligent query program can be executed by one or more processors to implement the following operations:
and receiving an original table set and a label set, and splitting the original table set to obtain a standard table set.
And performing part-of-speech coding on the tag set to obtain a word vector set.
And inputting the standard table set and the word vector set into an intelligent query model to train to obtain a training value, judging the magnitude relation between the training value and a preset threshold, if the training value is larger than the preset threshold, continuing training by the intelligent query model, and if the training value is smaller than the preset threshold, finishing training by the intelligent query model.
Receiving query content of a user, extracting the query content based on a keyword extraction technology to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the intelligent query model which is trained, obtaining and outputting a form set required by the query content.
It should be noted that, the foregoing reference numerals of the embodiments of the present invention are merely for describing the embodiments, and do not represent the advantages and disadvantages of the embodiments. And the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, apparatus, article or method that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (10)

1. An intelligent table query method, characterized by comprising the following steps:
receiving an original table set and a label set, and splitting the original table set to obtain a standard table set;
performing part-of-speech coding on the tag set to obtain a word vector set;
inputting the standard table set and the word vector set into an intelligent query model for training to obtain a training value, and comparing the training value with a preset threshold: if the training value is larger than the preset threshold, the intelligent query model continues training; if the training value is smaller than the preset threshold, the intelligent query model finishes training;
receiving query content from a user, extracting keywords from the query content based on a keyword extraction technique to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting the table set required by the query content.
2. The intelligent query method of claim 1, wherein the splitting process splits the original table set into a user layer, a calculation layer and a data layer, and composes the user layer, the calculation layer and the data layer into the standard table set;
wherein:
the user layer consists of the table title and the table header of each table in the original table set;
the data layer consists of the table body of each table in the original table set;
the calculation layer provides the function of querying between the user layer and the data layer.
3. The method for intelligently querying a table according to claim 1 or 2, wherein the step of performing part-of-speech encoding on the tag set to obtain a word vector set includes:
performing one-hot coding on the tag set to obtain a primary word vector set;
and dimension reduction is carried out on the primary word vector set to obtain the word vector set.
4. The intelligent table query method of claim 3, wherein the dimension reduction comprises:
establishing a forward probability model and a backward probability model;
and optimizing the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
5. The method of claim 2, wherein inputting the standard table set and the word vector set into an intelligent query model for training to obtain training values comprises:
the user layer information in the standard table set is subjected to part-of-speech coding to obtain a user layer information vector set;
inputting the user layer information vector set into the intelligent query model, and sequentially performing convolution operation, pooling operation and activation operation on the intelligent query model to obtain a predicted value set;
and carrying out loss calculation on the predicted value set and the word vector set to obtain the training value.
6. An intelligent table query device, comprising a memory and a processor, wherein the memory stores an intelligent table query program operable on the processor, and the intelligent table query program, when executed by the processor, implements the following steps:
receiving an original table set and a label set, and splitting the original table set to obtain a standard table set;
performing part-of-speech coding on the tag set to obtain a word vector set;
inputting the standard table set and the word vector set into an intelligent query model for training to obtain a training value, and comparing the training value with a preset threshold: if the training value is larger than the preset threshold, the intelligent query model continues training; if the training value is smaller than the preset threshold, the intelligent query model finishes training;
receiving query content from a user, extracting keywords from the query content based on a keyword extraction technique to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting the table set required by the query content.
7. The intelligent table query device according to claim 6, wherein the splitting process splits the original table set into a user layer, a calculation layer and a data layer, and composes the user layer, the calculation layer and the data layer into the standard table set;
wherein:
the user layer consists of the table title and the table header of each table in the original table set;
the data layer consists of the table body of each table in the original table set;
the calculation layer provides the function of querying between the user layer and the data layer.
8. The intelligent query device according to claim 6 or 7, wherein said part-of-speech encoding said tag set to obtain a word vector set, comprises:
performing one-hot coding on the tag set to obtain a primary word vector set;
and dimension reduction is carried out on the primary word vector set to obtain the word vector set.
9. The intelligent table query device of claim 8, wherein the dimension reduction comprises:
establishing a forward probability model and a backward probability model;
and optimizing the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
10. A computer-readable storage medium having stored thereon an intelligent table query program executable by one or more processors to implement the steps of the intelligent table query method of any one of claims 1 to 5.
CN201910975458.1A 2019-10-11 2019-10-11 Intelligent query method and device for table and computer readable storage medium Active CN110866042B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910975458.1A CN110866042B (en) 2019-10-11 2019-10-11 Intelligent query method and device for table and computer readable storage medium
PCT/CN2020/098951 WO2021068565A1 (en) 2019-10-11 2020-06-29 Table intelligent query method and apparatus, electronic device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910975458.1A CN110866042B (en) 2019-10-11 2019-10-11 Intelligent query method and device for table and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110866042A CN110866042A (en) 2020-03-06
CN110866042B (en) 2023-05-12

Family

ID=69652834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910975458.1A Active CN110866042B (en) 2019-10-11 2019-10-11 Intelligent query method and device for table and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN110866042B (en)
WO (1) WO2021068565A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866042B (en) * 2019-10-11 2023-05-12 平安科技(深圳)有限公司 Intelligent query method and device for table and computer readable storage medium
CN112597171B (en) * 2020-12-31 2024-08-27 平安银行股份有限公司 Table relation visualization method and device, electronic equipment and storage medium
CN113111864A (en) * 2021-05-13 2021-07-13 上海巽联信息科技有限公司 Intelligent table extraction algorithm based on multiple modes
CN116049354B (en) * 2023-01-28 2023-06-20 北京原子回声智能科技有限公司 Multi-table retrieval method and device based on natural language

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250381A (en) * 2015-06-04 2016-12-21 微软技术许可有限责任公司 The row sequence optimized for input/output in list data
CN106874411A (en) * 2017-01-22 2017-06-20 网易(杭州)网络有限公司 The searching method and search platform of a kind of form
JP2017224240A (en) * 2016-06-17 2017-12-21 富士通株式会社 TABLE DATA SEARCH DEVICE, TABLE DATA SEARCH METHOD, AND TABLE DATA SEARCH PROGRAM
CN110222160A (en) * 2019-05-06 2019-09-10 平安科技(深圳)有限公司 Intelligent semantic document recommendation method, device and computer readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101615193A (en) * 2009-07-07 2009-12-30 北京大学 A query system based on encyclopedia data extraction and integration
US8819055B2 (en) * 2010-05-14 2014-08-26 Oracle International Corporation System and method for logical people groups
US20140025626A1 (en) * 2012-04-19 2014-01-23 Avalon Consulting, LLC Method of using search engine facet indexes to enable search-enhanced business intelligence analysis
US10311374B2 (en) * 2015-09-11 2019-06-04 Adobe Inc. Categorization of forms to aid in form search
CN110866042B (en) * 2019-10-11 2023-05-12 平安科技(深圳)有限公司 Intelligent query method and device for table and computer readable storage medium

Also Published As

Publication number Publication date
CN110866042A (en) 2020-03-06
WO2021068565A1 (en) 2021-04-15

Similar Documents

Publication Publication Date Title
CN110334272B (en) Intelligent question-answering method and device based on knowledge graph and computer storage medium
CN110866042B (en) Intelligent query method and device for table and computer readable storage medium
CN110442857B (en) Emotion intelligent judging method and device and computer readable storage medium
CN110765765B (en) Contract key term extraction method, device and storage medium based on artificial intelligence
CN110413773B (en) Intelligent text classification method, device and computer readable storage medium
CN110795527B (en) Candidate entity ordering method, training method and related device
CN110427480B (en) Intelligent personalized text recommendation method and device and computer readable storage medium
CN110795548A (en) Intelligent question answering method, device and computer readable storage medium
CN111475617A (en) Event body extraction method and device and storage medium
CN108038208B (en) Training method and device of context information recognition model and storage medium
CN115062134A (en) Knowledge question-answering model training and knowledge question-answering method, device and computer equipment
CN111159485A (en) Tail entity linking method, device, server and storage medium
WO2020248366A1 (en) Text intention intelligent classification method and device, and computer-readable storage medium
CN113190702A (en) Method and apparatus for generating information
CN112784011B (en) Emotion problem processing method, device and medium based on CNN and LSTM
CN113627797A (en) Incoming employee portrait generation method, device, computer equipment and storage medium
WO2021139076A1 (en) Intelligent text dialogue generation method and apparatus, and computer-readable storage medium
CN113947456B (en) Online store matching method and its device, equipment, medium, and product
WO2025026402A1 (en) Question answering method and apparatus, electronic device and storage medium
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
CN115546869A (en) Facial expression recognition method and system based on multiple features
CN112988993B (en) Question and answer method and computing device
CN110263134B (en) Intelligent emotion question-answering method and device and computer readable storage medium
CN113434657B (en) E-commerce customer service response method and corresponding device, equipment and medium thereof
CN113961701B (en) Message text clustering method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant