CN110866042B - Intelligent query method and device for table and computer readable storage medium - Google Patents
- Publication number
- CN110866042B (application CN201910975458.1A)
- Authority
- CN
- China
- Prior art keywords
- query
- intelligent
- word vector
- training
- keyword
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Fuzzy Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to artificial intelligence technology and discloses an intelligent table query method, comprising: receiving an original table set and a tag set, and splitting the original table set to obtain a standard table set; performing part-of-speech coding on the tag set to obtain a word vector set; inputting the standard table set and the word vector set into an intelligent query model for training to obtain a training value, the intelligent query model completing training once the training value is smaller than a preset threshold; receiving query content from a user, extracting keywords from the query content based on a keyword extraction technology to obtain a keyword set, and performing part-of-speech coding on the keyword set to obtain a keyword vector set; and inputting the keyword vector set into the trained intelligent query model to obtain and output the table set required by the query content. The invention also provides an intelligent table query device and a computer-readable storage medium. The invention can realize an accurate and efficient intelligent table query function.
Description
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and apparatus for intelligently querying a table, and a computer readable storage medium.
Background
With the rapid development of the internet, data volumes are growing rapidly, and the demand for fast data queries keeps rising. Most data is stored in tables, such as a company's daily business revenue data or a property company's property registration information. At present, most queries over such data rely on table traversal or on user keyword search. Although these methods meet query needs to some extent, when the tables are large they are slow and consume a large amount of computing memory.
Disclosure of Invention
The invention provides an intelligent table query method, a device and a computer-readable storage medium, whose main purpose is to query tables intelligently according to the user's query requirements.
In order to achieve the above object, the present invention provides a method for intelligently querying a table, including:
receiving an original table set and a tag set, and splitting the original table set to obtain a standard table set;
performing part-of-speech coding on the tag set to obtain a word vector set;
inputting the standard table set and the word vector set into an intelligent query model to train to obtain a training value, judging the magnitude relation between the training value and a preset threshold, if the training value is larger than the preset threshold, continuing training by the intelligent query model, and if the training value is smaller than the preset threshold, completing training by the intelligent query model;
receiving query content of a user, extracting the query content based on a keyword extraction technology to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting the table set required by the query content.
Optionally, the splitting process splits the original table set into a user layer, a calculation layer and a data layer, and forms the user layer, the calculation layer and the data layer into the standard table set;
wherein:
the user layer consists of the table title and the table headers of each table in the original table set;
the data layer consists of the table body of each table in the original table set;
the calculation layer provides the mutual query function between the user layer and the data layer.
Optionally, the part-of-speech encoding the tag set to obtain a word vector set includes:
performing one-hot encoding on the tag set to obtain a primary word vector set;
and performing dimension reduction on the primary word vector set to obtain the word vector set.
Optionally, the dimension reduction includes:
establishing a forward probability model and a backward probability model;
and optimizing the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
Optionally, the standard table set and the word vector set are input into an intelligent query model to be trained to obtain training values, including:
the user layer information in the standard table set is subjected to part-of-speech coding to obtain a user layer information vector set;
inputting the user layer information vector set into the intelligent query model, and sequentially performing convolution operation, pooling operation and activation operation on the intelligent query model to obtain a predicted value set;
and carrying out loss calculation on the predicted value set and the word vector set to obtain the training value.
In addition, in order to achieve the above object, the present invention also provides a table intelligent query device, which includes a memory and a processor, wherein a table intelligent query program capable of running on the processor is stored in the memory, and the table intelligent query program, when executed by the processor, implements the following steps:
receiving an original table set and a tag set, and splitting the original table set to obtain a standard table set;
performing part-of-speech coding on the tag set to obtain a word vector set;
inputting the standard table set and the word vector set into an intelligent query model to train to obtain a training value, judging the magnitude relation between the training value and a preset threshold, if the training value is larger than the preset threshold, continuing training by the intelligent query model, and if the training value is smaller than the preset threshold, completing training by the intelligent query model;
receiving query content of a user, extracting the query content based on a keyword extraction technology to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting the table set required by the query content.
Optionally, the splitting process splits the original table set into a user layer, a calculation layer and a data layer, and forms the user layer, the calculation layer and the data layer into the standard table set;
wherein:
the user layer consists of the table title and the table headers of each table in the original table set;
the data layer consists of the table body of each table in the original table set;
the calculation layer provides the mutual query function between the user layer and the data layer.
Optionally, the part-of-speech encoding the tag set to obtain a word vector set includes:
performing one-hot encoding on the tag set to obtain a primary word vector set;
and performing dimension reduction on the primary word vector set to obtain the word vector set.
Optionally, the dimension reduction includes:
establishing a forward probability model and a backward probability model;
and optimizing the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a table smart query program executable by one or more processors to implement the steps of the table smart query method as described above.
According to the invention, the original table set is split to obtain the standard table set, so that a large, monolithic original table is divided into smaller tables, which facilitates fast querying. The intelligent query model is trained on the standard table set so that it can query efficiently. In addition, keywords are extracted from the user's query content so that what the user wants is identified accurately, and the query is then carried out quickly by the intelligent query model. The intelligent table query method, device and computer-readable storage medium of the invention can therefore realize an accurate and efficient table query function.
Drawings
FIG. 1 is a flowchart of a method for intelligently querying a table according to an embodiment of the present invention;
fig. 2 is a schematic diagram of an internal structure of a table intelligent query device according to an embodiment of the invention;
fig. 3 is a schematic diagram of a table intelligent query program in a table intelligent query device according to an embodiment of the invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The invention provides an intelligent table query method. Referring to fig. 1, a flowchart of a table intelligent query method according to an embodiment of the invention is shown. The method may be performed by an apparatus, which may be implemented in software and/or hardware.
In this embodiment, the intelligent query method for the table includes:
S1, receiving an original table set and a tag set, and splitting the original table set to obtain a standard table set.
The original table set in the preferred embodiment of the invention consists of tables generated automatically in different business settings. For example, a real estate agency collects a number of property listings and compiles them in EXCEL software into a property information table set; a communications service company generates a user consumption bill set from different users' calls, data traffic and the like.
The tag set is a description of each table in the original table set. Preferably, the description takes the form of a keyword combination. For the property information table set, for example, each tag combines 4 keywords: community name and unit number, property area, market price, and property layout. A tag recording one property in the property information table set might read: community A, unit 3A + 89 square meters + 1.2 million + three bedrooms, one living room, two bathrooms.
Preferably, the splitting process splits the original table set into a user layer, a calculation layer and a data layer to obtain the standard table set. A table comprises a table title, table headers and a table body. The table title and table headers are expressed as text: for example, the title of a product sales statistics table is "product sales statistics table", and its headers include the product number, product name, specification, packaging mode and the like. The table titles and table headers of the original table set are therefore extracted to form the user layer. The table body holds the data of the whole table, so the table bodies of the original table set are extracted to form the data layer. Once a table has been split into the user layer and the data layer, a query relation must be established between the two, which is the function of the calculation layer. The calculation layer may adopt a multivariate linear index query of the form:
$y = a_1 x_1 + a_2 x_2 + \dots + a_n x_n$
where $y$ is the data in the data layer, $x_1, x_2, \dots, x_n$ are the text information of the user layer, and $a_1, a_2, \dots, a_n$ are index coefficients, each equal to 0 or 1. For example, to query property information with a price of 1.3 million (130 in units of ten thousand), set $y = 130$ and back-calculate all user layer text information that satisfies this price, that is, the table titles and table headers in the original table set. For instance, the tables titled community A, community B and community E may contain properties priced at 1.3 million, and under the title community A the matching header is the property price, with the value 1.3 million.
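The layered split and the reverse lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `StandardTable`, `split_table` and `reverse_lookup`, the rule that the first row holds the headers, and the sample data are all hypothetical. A match corresponds to an index coefficient of 1 for that user layer entry, and a non-match to 0.

```python
from dataclasses import dataclass


@dataclass
class StandardTable:
    title: str      # user layer: table title
    headers: list   # user layer: table headers
    body: list      # data layer: rows of values


def split_table(title, rows):
    """Split a raw table into user layer and data layer.

    Assumption: the first row of `rows` holds the headers.
    """
    headers, body = rows[0], rows[1:]
    return StandardTable(title, list(headers), [list(r) for r in body])


def reverse_lookup(tables, y):
    """Calculation layer sketch: return (title, header) pairs whose column contains y."""
    hits = []
    for table in tables:
        for col, header in enumerate(table.headers):
            if any(row[col] == y for row in table.body):
                hits.append((table.title, header))
    return hits


# Hypothetical sample data; prices are in units of ten thousand.
raw = {
    "community A": [["unit", "price"], ["3A", 130], ["3B", 120]],
    "community B": [["unit", "price"], ["1C", 125]],
}
tables = [split_table(title, rows) for title, rows in raw.items()]
print(reverse_lookup(tables, 130))
```

Keeping titles and headers apart from the body means a lookup only has to touch the text of the user layer to decide which data layer columns are relevant.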
S2, performing part-of-speech coding on the tag set to obtain a word vector set.
The intelligent query model cannot effectively recognize text information directly; effective recognition requires extracting features from the text and then discriminating between them. Preferably, therefore, the part-of-speech coding represents all keywords of each tag in the tag set as N-dimensional matrix vectors, that is, converts the text information into numeric information for recognition and training by the subsequent model, where N is the number of keywords in the tag set.
Further, the part-of-speech coding firstly carries out one-hot coding (one-hot) operation on the tag set to obtain a primary word vector set, and then carries out dimension reduction on the primary word vector set to obtain the word vector set.
The one-hot operation is as follows:
where $i$ is the keyword index, $v_i$ is the N-dimensional matrix vector representing keyword $i$, all $v_i$ together form the primary word vector set, $s$ is the assumed total number of keywords, and $v_j$ is the $j$-th element of the N-dimensional matrix vector.
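As an illustration of the one-hot step, the sketch below assigns each distinct keyword an N-dimensional vector with a single 1, where N is the number of distinct keywords. The function name `one_hot_encode`, the alphabetical ordering of the vocabulary and the sample keywords are assumptions made for this example only.

```python
def one_hot_encode(keywords):
    """Map each distinct keyword to an N-dimensional one-hot vector.

    Assumption: positions follow the alphabetical order of the keywords.
    """
    vocab = sorted(set(keywords))
    n = len(vocab)
    vectors = {}
    for i, word in enumerate(vocab):
        vec = [0] * n
        vec[i] = 1  # only element i is set for keyword i
        vectors[word] = vec
    return vectors


# Hypothetical keywords taken from the property example.
tags = ["community name", "property area", "market price", "layout"]
primary = one_hot_encode(tags)
for word, vec in primary.items():
    print(word, vec)
```

Because N grows with the number of keywords, these vectors become large and sparse, which is why a dimension reduction step follows.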
Further, the dimension reduction is to reduce the generated N-dimensional matrix vector to smaller dimension data that is easier to compute for subsequent model training, i.e., to ultimately convert the primary word vector set into the word vector set.
Preferably, the dimension reduction establishes a forward probability model and a backward probability model, and then optimizes the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
Further, the forward probability model and the backward probability model are respectively:
optimizing the forward probability model and the backward probability model:
where $\max$ denotes the optimization, the differential operator denotes the partial derivative, and $v_i$ denotes the N-dimensional matrix vector of keyword $i$; the tag set contains $s$ keywords. After the forward probability model and the backward probability model are optimized, the N-dimensional matrix vectors are reduced to a smaller dimension, and the word vector set is obtained once the dimension reduction is complete.
S3, inputting the standard table set and the word vector set into an intelligent query model to train to obtain a training value, judging the magnitude relation between the training value and a preset threshold, if the training value is larger than the preset threshold, continuing training by the intelligent query model, and if the training value is smaller than the preset threshold, completing training by the intelligent query model.
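The stopping rule in S3 can be sketched as a loop that keeps training while the training value is not below the preset threshold. This is a toy illustration only: `train_until_threshold`, the `max_rounds` guard and the one-parameter stand-in model are assumptions, not part of the described method. The text does not say what happens when the training value equals the threshold; this sketch treats that case as "continue".

```python
def train_until_threshold(step, threshold, max_rounds=10_000):
    """Call step() until the training value it returns is below threshold."""
    for rounds in range(1, max_rounds + 1):
        value = step()
        if value < threshold:
            return rounds, value  # training is complete
    raise RuntimeError("training value never fell below the threshold")


# Toy stand-in for the model: fit w so that w * 2.0 approximates 6.0.
state = {"w": 0.0}


def step():
    x, target, lr = 2.0, 6.0, 0.1
    error = state["w"] * x - target
    state["w"] -= lr * 2 * error * x          # gradient step on squared error
    return (state["w"] * x - target) ** 2     # training value after the update


rounds, value = train_until_threshold(step, threshold=1e-6)
print(rounds, value, state["w"])
```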
Preferably, the training to obtain the training value includes: performing part-of-speech coding on the user layer information in the standard table set to obtain a user layer information vector set; inputting the user layer information vector set into the intelligent query model, which sequentially performs a convolution operation, a pooling operation and an activation operation to obtain a predicted value set; and performing loss calculation on the predicted value set and the word vector set to obtain the training value.
Further, the convolution operation and the pooling operation proceed as follows. A convolution template is constructed in advance and a convolution stride is determined; the convolution template is then applied to the user layer information vector set according to the stride to obtain a convolution matrix set, which completes the convolution operation. Each matrix in the convolution matrix set is then replaced by its maximum value or its average value, which completes the pooling operation.
Further, the pre-constructed convolution template may be a standard 3×3 matrix. The convolution is computed from left to right with a stride of 1. Suppose a 9×9 feature candidate region matrix is taken from the feature candidate region set. The convolution template is first aligned with the first 3×3 block of that matrix and the corresponding elements are multiplied, for example 1×0, 0×3, 1×1, and so on, giving a 3×3 result matrix. The template then moves one step to the right, according to the stride of 1, and the same operation is repeated, and so on across the matrix. When the convolution operation is complete, a large number of small-dimension matrices have therefore been generated. The pooling operation reduces the dimension of these small matrices, preferably by the maximization principle: each small matrix is replaced by its largest value, for example 3 and 7 for two of the result matrices, which completes the pooling operation.
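The sliding-template step and the max pooling step can be sketched as below. The sketch follows the operation as the text describes it, producing one element-wise product matrix per window and then replacing each by its largest value; conventional convolution would additionally sum each window. The matrices in the source were images and are not available, so the template and the 4×4 input here are hypothetical; only the first row echoes the products 1×0, 0×3, 1×1 mentioned in the text.

```python
def window_products(matrix, template, stride=1):
    """Slide the template left to right, top to bottom; return one product block per window."""
    k = len(template)
    rows, cols = len(matrix), len(matrix[0])
    blocks = []
    for r in range(0, rows - k + 1, stride):
        for c in range(0, cols - k + 1, stride):
            block = [
                [template[i][j] * matrix[r + i][c + j] for j in range(k)]
                for i in range(k)
            ]
            blocks.append(block)
    return blocks


def max_pool(blocks):
    """Replace each small matrix by its largest value (maximization principle)."""
    return [max(max(row) for row in block) for block in blocks]


# Hypothetical template and input.
template = [[1, 0, 1],
            [0, 1, 0],
            [1, 0, 1]]
matrix = [[0, 3, 1, 2],
          [4, 0, 5, 1],
          [2, 7, 1, 0],
          [1, 2, 3, 4]]

blocks = window_products(matrix, template, stride=1)
pooled = max_pool(blocks)
print(pooled)
```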
Preferably, the convolution operation and the pooling operation are repeated; for example, 16 rounds of convolution and pooling may be used to obtain a final feature matrix set.
Preferably, the activating operation is to estimate the probability of the feature matrix set through a softmax function, and select the prediction result with the highest probability as the final prediction and output the final prediction result. The softmax function is:
where $p(\text{matrix})$ is the output probability for the feature matrix set, $k$ is the data amount of the feature matrix set, $e$ is an infinite non-repeating decimal (the base of the natural logarithm), and $j$ indexes the selectable range of the predicted value set. For example, if the probability calculated for one candidate is 0.21 and that for another is 0.64, the candidate with probability 0.64 is taken as the prediction for the feature matrix.
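A softmax activation followed by selection of the most probable candidate can be sketched as follows. The input scores are hypothetical, and subtracting the maximum score before exponentiating is a common numerical-stability measure added here, not something stated in the text.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)                              # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 2.1, 0.5]                          # hypothetical feature scores
probs = softmax(scores)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the highest probability
print(probs, best)
```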
The loss calculation includes:
where $t$ is the number of word vector sets, $y_i$ is the word vector set, and $y_i'$ is the predicted value set.
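The loss formula itself is not reproduced in this text. As a stand-in, and purely as an assumption, the sketch below uses the mean squared error between the targets $y_i$ and the predictions $y_i'$ averaged over $t$ items; the actual loss used by the method may differ.

```python
def mean_squared_error(targets, predictions):
    """Assumed loss: average of (y_i - y_i')^2 over t items."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    t = len(targets)
    return sum((y - yp) ** 2 for y, yp in zip(targets, predictions)) / t


# Hypothetical values: one-hot target against a uniform prediction.
training_value = mean_squared_error([1.0, 0.0], [0.5, 0.5])
print(training_value)
```

The resulting training value is what would be compared against the preset threshold in S3.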
S4, receiving query content of a user, extracting the query content based on a keyword extraction algorithm to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model to obtain the table set required by the query content, and outputting the table set.
Query content entered by a user often does not follow the prescribed keyword combination form and tends to be colloquial, for example "I want to find the telephone bill for September". Keyword extraction is therefore performed on the query content; in this example the keywords "September" and "telephone bill" are extracted.
Preferably, the keyword extraction may be performed by using a traversal method, for example, splitting and de-duplicating all keywords of the tag set to construct a keyword table, and sequentially comparing the query content with the keyword table to complete the keyword extraction.
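The traversal approach can be sketched as below: the tags are split and de-duplicated into a keyword table, and the query is then compared against each entry in turn. The `+` delimiter follows the tag example given earlier; the function names, the case-insensitive substring match and the sample tags are assumptions for illustration.

```python
def build_keyword_table(tag_set):
    """Split each tag on '+' and de-duplicate, keeping first-seen order."""
    table = []
    for tag in tag_set:
        for keyword in tag.split("+"):
            keyword = keyword.strip()
            if keyword and keyword not in table:
                table.append(keyword)
    return table


def extract_keywords(query, keyword_table):
    """Compare the query with each table entry in turn; keep the entries it contains."""
    lowered = query.lower()
    return [kw for kw in keyword_table if kw.lower() in lowered]


# Hypothetical tag set for a communications service company.
tags = [
    "September + telephone bill",
    "October + telephone bill",
    "September + data usage",
]
keyword_table = build_keyword_table(tags)
found = extract_keywords("I want to find the telephone bill for September", keyword_table)
print(found)
```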
Further, the intelligent query model obtains the table set required by the query content as follows: the query content is user layer information, so the function provided by the calculation layer for querying the data layer from the user layer is called to obtain the data layer, and the data layer and the user layer are then recombined to obtain the table set.
The invention also provides an intelligent query device for the table. Referring to fig. 2, an internal structure diagram of a table intelligent query device according to an embodiment of the invention is shown.
In this embodiment, the table intelligent query device 1 may be a PC (Personal Computer), a terminal device such as a smart phone, a tablet computer or a portable computer, or a server. The table intelligent query device 1 comprises at least a memory 11, a processor 12, a communication bus 13, and a network interface 14.
The memory 11 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments the memory 11 may be an internal storage unit of the table intelligent query device 1, such as a hard disk of the table intelligent query device 1. In other embodiments the memory 11 may be an external storage device of the table intelligent query device 1, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card) or the like provided on the table intelligent query device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the table intelligent query device 1. The memory 11 may be used not only to store application software installed in the table intelligent query device 1 and various types of data, such as the code of the table intelligent query program 01, but also to temporarily store data that has been output or is to be output.
The communication bus 13 is used to enable connection communication between these components.
The network interface 14 may optionally comprise a standard wired interface, a wireless interface (e.g. WI-FI interface), typically used to establish a communication connection between the apparatus 1 and other electronic devices.
Optionally, the device 1 may further comprise a user interface, which may comprise a display (Display), an input unit such as a keyboard (Keyboard), and a standard wired interface or wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or a display unit, and is used for displaying information processed in the table intelligent query device 1 and for displaying a visual user interface.
Fig. 2 shows only the table intelligent query device 1 with components 11-14 and the table intelligent query program 01. Those skilled in the art will understand that the structure shown in fig. 2 does not limit the table intelligent query device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
In the embodiment of the apparatus 1 shown in fig. 2, a table intelligent inquiry program 01 is stored in the memory 11; the following steps are implemented when the processor 12 executes the table intelligent inquiry program 01 stored in the memory 11:
Step one, receiving an original table set and a tag set, and splitting the original table set to obtain a standard table set.
The original table set in the preferred embodiment of the invention consists of tables generated automatically in different business settings. For example, a real estate agency collects a number of property listings and compiles them in EXCEL software into a property information table set; a communications service company generates a user consumption bill set from different users' calls, data traffic and the like.
The tag set is a description of each table in the original table set. Preferably, the description takes the form of a keyword combination. For the property information table set, for example, each tag combines 4 keywords: community name and unit number, property area, market price, and property layout. A tag recording one property in the property information table set might read: community A, unit 3A + 89 square meters + 1.2 million + three bedrooms, one living room, two bathrooms.
Preferably, the splitting process splits the original table set into a user layer, a calculation layer and a data layer to obtain the standard table set. A table comprises a table title, table headers and a table body. The table title and table headers are expressed as text: for example, the title of a product sales statistics table is "product sales statistics table", and its headers include the product number, product name, specification, packaging mode and the like. The table titles and table headers of the original table set are therefore extracted to form the user layer. The table body holds the data of the whole table, so the table bodies of the original table set are extracted to form the data layer. Once a table has been split into the user layer and the data layer, a query relation must be established between the two, which is the function of the calculation layer. The calculation layer may adopt a multivariate linear index query of the form:
$y = a_1 x_1 + a_2 x_2 + \dots + a_n x_n$
where $y$ is the data in the data layer, $x_1, x_2, \dots, x_n$ are the text information of the user layer, and $a_1, a_2, \dots, a_n$ are index coefficients, each equal to 0 or 1. For example, to query property information with a price of 1.3 million (130 in units of ten thousand), set $y = 130$ and back-calculate all user layer text information that satisfies this price, that is, the table titles and table headers in the original table set. For instance, the tables titled community A, community B and community E may contain properties priced at 1.3 million, and under the title community A the matching header is the property price, with the value 1.3 million.
Step two, part-of-speech coding is carried out on the tag set to obtain a word vector set.
The intelligent query model cannot effectively recognize text information directly; effective recognition requires extracting features from the text and then discriminating between them. Preferably, therefore, the part-of-speech coding represents all keywords of each tag in the tag set as N-dimensional matrix vectors, that is, converts the text information into numeric information for recognition and training by the subsequent model, where N is the number of keywords in the tag set.
Further, the part-of-speech coding firstly carries out one-hot coding (one-hot) operation on the tag set to obtain a primary word vector set, and then carries out dimension reduction on the primary word vector set to obtain the word vector set.
The one-hot operation is as follows:
where $i$ is the keyword index, $v_i$ is the N-dimensional matrix vector representing keyword $i$, all $v_i$ together form the primary word vector set, $s$ is the assumed total number of keywords, and $v_j$ is the $j$-th element of the N-dimensional matrix vector.
Further, the dimension reduction is to reduce the generated N-dimensional matrix vector to smaller dimension data that is easier to compute for subsequent model training, i.e., to ultimately convert the primary word vector set into the word vector set.
Preferably, the dimension reduction establishes a forward probability model and a backward probability model, and then optimizes the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
Further, the forward probability model and the backward probability model are respectively:
optimizing the forward probability model and the backward probability model:
where $\max$ denotes the optimization, the differential operator denotes the partial derivative, and $v_i$ denotes the N-dimensional matrix vector of keyword $i$; the tag set contains $s$ keywords. After the forward probability model and the backward probability model are optimized, the N-dimensional matrix vectors are reduced to a smaller dimension, and the word vector set is obtained once the dimension reduction is complete.
Step three, inputting the standard table set and the word vector set into an intelligent query model to train to obtain a training value, judging the magnitude relation between the training value and a preset threshold, if the training value is larger than the preset threshold, continuing training by the intelligent query model, and if the training value is smaller than the preset threshold, completing training by the intelligent query model.
Preferably, the training to obtain the training value includes: performing part-of-speech coding on the user layer information in the standard table set to obtain a user layer information vector set; inputting the user layer information vector set into the intelligent query model, which sequentially performs a convolution operation, a pooling operation and an activation operation to obtain a predicted value set; and performing loss calculation on the predicted value set and the word vector set to obtain the training value.
Further, the convolution operation and the pooling operation comprise the steps of constructing a convolution template in advance, determining a convolution step length, and calculating the convolution template and a user layer information vector set according to the convolution step length to obtain a convolution matrix set after the convolution operation, so that the convolution operation is completed. And selecting the maximum value or the average value of the matrixes in the convolution matrix set to replace the convolution matrix set, and completing the pooling operation.
Further, the pre-constructed convolution template may be a standard 3*3 matrix. The matrix after the convolution operation is computed as follows: the template moves from left to right with a convolution stride of 1. If a feature candidate area matrix in the feature candidate area set is a 9*9 matrix, the pre-constructed convolution template is first computed against the first 3*3 sub-matrix by multiplying corresponding elements (1*0, 0*3, 1*1, and so on) to give the result for that position. The template then moves one step to the right according to the convolution stride of 1, the same operation is performed, and so on until the matrix has been traversed. A large number of small-dimension matrices are therefore generated when the convolution operation completes, and the pooling operation reduces the dimensions of these matrices, preferably using the maximization principle: for example, two such matrices whose largest values are 3 and 7 are replaced by 3 and 7, which completes the pooling operation.
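The sliding-template computation and the maximum-value pooling described above can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes that each template position yields a small matrix of element-wise products and that pooling replaces each such matrix by its largest entry. The function names and the 4*4 input are hypothetical; only the first products (1*0, 0*3, 1*1) echo the fragment given in the text.

```python
def window_products(matrix, template, stride=1):
    """Slide a square template over the matrix, left to right then top to bottom.

    For every position, return the small matrix of element-wise products
    between the template and the covered sub-matrix.
    """
    t = len(template)
    rows, cols = len(matrix), len(matrix[0])
    results = []
    for r in range(0, rows - t + 1, stride):
        for c in range(0, cols - t + 1, stride):
            results.append([
                [template[i][j] * matrix[r + i][c + j] for j in range(t)]
                for i in range(t)
            ])
    return results


def max_pool(matrices):
    """Replace each small matrix by its largest value (maximization principle)."""
    return [max(max(row) for row in m) for m in matrices]


# Hypothetical 3*3 template and 4*4 input, chosen only for illustration.
template = [[1, 0, 1],
            [0, 1, 0],
            [1, 0, 1]]
matrix = [[0, 3, 1, 2],
          [2, 1, 0, 1],
          [1, 0, 2, 3],
          [3, 1, 1, 0]]

products = window_products(matrix, template, stride=1)
pooled = max_pool(products)
print(len(products), pooled)
```

Replacing `max` with an average in `max_pool` would give the average-value variant of the pooling operation.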
Preferably, the convolution operation and the pooling operation are repeated, for example 16 times, to obtain the final feature matrix set.
Preferably, the activation operation estimates the probability of the feature matrix set through a softmax function, selects the prediction with the highest probability as the final prediction, and outputs it. The softmax function is:
where p(matrix) denotes the output probability of the feature matrix set matrix, k denotes the data amount of the feature matrix set, e is the base of the natural logarithm (an irrational number), and j ranges over the candidates in the predicted value set. For example, if the probability computed for one candidate is 0.21 and the probability computed for another candidate is 0.64, the feature matrix is represented by the candidate with probability 0.64.
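The activation step can be illustrated with the standard softmax function followed by selection of the most probable candidate. This is a sketch under assumptions: the textbook form of softmax is used because the formula itself is not reproduced here, and the names `softmax` and `predict` and the input scores are hypothetical.

```python
import math


def softmax(scores):
    """Standard softmax: exponentiate each score and normalise so the results sum to 1."""
    # Subtracting the maximum is a common numerical-stability trick; it does
    # not change the resulting probabilities.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the index of the most probable candidate together with all probabilities."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Hypothetical scores for three candidate predictions.
best, probs = predict([1.0, 2.0, 0.5])
print(best, probs)
```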
The loss calculation includes:
where t is the number of word vectors in the word vector set, y_i is the word vector set, and y_i' is the predicted value set.
Step four: receive the query content of a user, extract keywords from the query content based on a keyword extraction algorithm to obtain a keyword set, perform part-of-speech coding on the keyword set to obtain a keyword vector set, input the keyword vector set into the trained intelligent query model to obtain the table set required by the query content, and output the table set.
The query content input by a user is often not in the prescribed keyword-combination form and is frequently highly colloquial, such as "I want to find the telephone bill of September". Keyword extraction is therefore required for the query content input by the user; in this example, the keywords "September" and "telephone bill" are extracted from "I want to find the telephone bill of September".
Preferably, the keyword extraction may be performed by a traversal method: for example, all keywords of the tag set are split and de-duplicated to construct a keyword table, and the query content is then compared with the entries of the keyword table in turn to complete the keyword extraction.
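The traversal method can be sketched as below. Several details are assumptions made only for illustration: tags are taken to be comma-separated keyword strings, matching is case-insensitive substring matching, and the function names and sample tags are hypothetical.

```python
def build_keyword_table(tag_set):
    """Split every tag into keywords and de-duplicate them, keeping first-seen order."""
    table = []
    seen = set()
    for tag in tag_set:
        # Assumption: keywords inside a tag are separated by commas.
        for keyword in tag.split(","):
            keyword = keyword.strip().lower()
            if keyword and keyword not in seen:
                seen.add(keyword)
                table.append(keyword)
    return table


def extract_keywords(query, keyword_table):
    """Compare the query with each table entry in turn and keep the entries it contains."""
    text = query.lower()
    return [keyword for keyword in keyword_table if keyword in text]


# Hypothetical tag set; only the query sentence comes from the description.
tag_set = [
    "September, telephone bill",
    "October, telephone bill",
    "September, water bill",
]
keyword_table = build_keyword_table(tag_set)
keywords = extract_keywords("I want to find the telephone bill of September", keyword_table)
print(keyword_table)
print(keywords)
```

Plain substring matching keeps the sketch short; a production system would likely need word segmentation, particularly for Chinese-language queries.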
Further, the intelligent query model obtains the table set required by the query content as follows: because the query content is user layer information, the function that the calculation layer provides for querying the data layer from the user layer is called to obtain the data layer, and the data layer and the user layer are then recombined to obtain the table set.
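The lookup and recombination described above can be illustrated with a small sketch. It assumes, following the claims, that the user layer holds each table's title and header and the data layer holds the table body; the calculation layer is modelled as a simple dictionary lookup. All class names, method names and sample data are hypothetical, and only the user-layer-to-data-layer direction of the mutual query is shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class UserLayer:
    """User layer of one table: its title and its header."""
    title: str
    header: Tuple[str, ...]


class StandardTableSet:
    """Holds split tables; the dictionary plays the role of the calculation layer."""

    def __init__(self) -> None:
        self._data_by_user: Dict[UserLayer, List[list]] = {}

    def add_table(self, title: str, header: List[str], body: List[list]) -> UserLayer:
        """Split one original table into a user layer and a data layer."""
        user = UserLayer(title, tuple(header))
        self._data_by_user[user] = [list(row) for row in body]
        return user

    def data_for(self, user: UserLayer) -> List[list]:
        """Calculation-layer function: query the data layer from the user layer."""
        return self._data_by_user[user]

    def recombine(self, user: UserLayer) -> dict:
        """Recombine the user layer and the data layer into an output table."""
        return {
            "title": user.title,
            "header": list(user.header),
            "body": self.data_for(user),
        }


tables = StandardTableSet()
september = tables.add_table(
    "September telephone bill",
    ["Date", "Amount"],
    [["2019-09-01", 12.5], ["2019-09-15", 8.0]],
)
result = tables.recombine(september)
print(result)
```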
Optionally, in other embodiments, the table intelligent query program may be further divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to implement the present invention. A module, as referred to herein, is a series of computer program instruction segments capable of performing a specific function, and is used to describe the execution of the table intelligent query program in the table intelligent query device.
For example, referring to fig. 3, which is a schematic diagram of the program modules of the table intelligent query program in an embodiment of the table intelligent query device of the present invention, the table intelligent query program may be divided into a data receiving and processing module 10, a part-of-speech encoding module 20, an intelligent query model training module 30, and a table query and output module 40. Illustratively:
The data receiving and processing module 10 is configured to: receive an original table set and a tag set, and split the original table set to obtain a standard table set.
The part-of-speech encoding module 20 is configured to: perform part-of-speech coding on the tag set to obtain a word vector set.
The intelligent query model training module 30 is configured to: input the standard table set and the word vector set into an intelligent query model for training to obtain a training value, and compare the training value with a preset threshold; if the training value is larger than the preset threshold, the intelligent query model continues training, and if the training value is smaller than the preset threshold, the intelligent query model finishes training.
The table query and output module 40 is configured to: receive the query content of a user, extract keywords from the query content based on a keyword extraction technology to obtain a keyword set, perform part-of-speech coding on the keyword set to obtain a keyword vector set, input the keyword vector set into the trained intelligent query model, and obtain and output the table set required by the query content.
The functions or operation steps implemented when the program modules, such as the data receiving and processing module 10, the part-of-speech encoding module 20, the intelligent query model training module 30, the table query and output module 40, etc., are executed are substantially the same as those of the foregoing embodiments, and will not be described herein.
In addition, an embodiment of the present invention further provides a computer-readable storage medium on which a table intelligent query program is stored, where the table intelligent query program can be executed by one or more processors to implement the following operations:
Receiving an original table set and a tag set, and splitting the original table set to obtain a standard table set.
Performing part-of-speech coding on the tag set to obtain a word vector set.
Inputting the standard table set and the word vector set into an intelligent query model for training to obtain a training value, and comparing the training value with a preset threshold; if the training value is larger than the preset threshold, the intelligent query model continues training, and if the training value is smaller than the preset threshold, the intelligent query model finishes training.
Receiving the query content of a user, extracting keywords from the query content based on a keyword extraction technology to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting the table set required by the query content.
It should be noted that the serial numbers of the foregoing embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, apparatus, article, or method that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing describes only preferred embodiments of the present invention and does not limit the scope of the invention. Any equivalent structure or equivalent process derived from the description and drawings of the present invention, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present invention.
Claims (10)
1. A table intelligent query method, characterized in that the method comprises the following steps:
receiving an original table set and a tag set, and splitting the original table set to obtain a standard table set;
performing part-of-speech coding on the tag set to obtain a word vector set;
inputting the standard table set and the word vector set into an intelligent query model to train to obtain a training value, judging the magnitude relation between the training value and a preset threshold, if the training value is larger than the preset threshold, continuing training by the intelligent query model, and if the training value is smaller than the preset threshold, completing training by the intelligent query model;
receiving query content of a user, extracting the query content based on a keyword extraction technology to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting a table set required by the query content.
2. The table intelligent query method of claim 1, wherein the splitting process splits the original table set into a user layer, a calculation layer and a data layer, and composes the user layer, the calculation layer and the data layer into the standard table set;
wherein:
the user layer consists of the table title and the table header of each table in the original table set;
the data layer is composed of a table body of each table in the original table set;
the computing layer provides mutual query functions of the user layer and the data layer.
3. The table intelligent query method according to claim 1 or 2, wherein performing part-of-speech coding on the tag set to obtain a word vector set comprises:
performing one-hot encoding on the tag set to obtain a primary word vector set;
and performing dimension reduction on the primary word vector set to obtain the word vector set.
4. The table intelligent query method of claim 3, wherein the dimension reduction comprises:
establishing a forward probability model and a backward probability model;
and optimizing the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
5. The method of claim 2, wherein inputting the standard table set and the word vector set into an intelligent query model for training to obtain training values comprises:
performing part-of-speech coding on the user layer information in the standard table set to obtain a user layer information vector set;
inputting the user layer information vector set into the intelligent query model, and sequentially performing convolution operation, pooling operation and activation operation on the intelligent query model to obtain a predicted value set;
and carrying out loss calculation on the predicted value set and the word vector set to obtain the training value.
6. A table intelligent query device, comprising a memory and a processor, the memory storing a table intelligent query program executable on the processor, wherein the table intelligent query program, when executed by the processor, implements the following steps:
receiving an original table set and a tag set, and splitting the original table set to obtain a standard table set;
performing part-of-speech coding on the tag set to obtain a word vector set;
inputting the standard table set and the word vector set into an intelligent query model to train to obtain a training value, judging the magnitude relation between the training value and a preset threshold, if the training value is larger than the preset threshold, continuing training by the intelligent query model, and if the training value is smaller than the preset threshold, completing training by the intelligent query model;
receiving query content of a user, extracting the query content based on a keyword extraction technology to obtain a keyword set, performing part-of-speech coding on the keyword set to obtain a keyword vector set, inputting the keyword vector set into the trained intelligent query model, and obtaining and outputting a table set required by the query content.
7. The table intelligent query device according to claim 6, wherein the splitting process splits the original table set into a user layer, a calculation layer and a data layer, and composes the user layer, the calculation layer and the data layer into the standard table set;
wherein:
the user layer consists of the table title and the table header of each table in the original table set;
the data layer is composed of a table body of each table in the original table set;
the computing layer provides mutual query functions of the user layer and the data layer.
8. The table intelligent query device according to claim 6 or 7, wherein performing part-of-speech coding on the tag set to obtain a word vector set comprises:
performing one-hot encoding on the tag set to obtain a primary word vector set;
and performing dimension reduction on the primary word vector set to obtain the word vector set.
9. The table intelligent query device of claim 8, wherein the dimension reduction comprises:
establishing a forward probability model and a backward probability model;
and optimizing the forward probability model and the backward probability model to obtain an optimized solution, wherein the optimized solution is the word vector set.
10. A computer-readable storage medium having stored thereon a table intelligent query program executable by one or more processors to implement the steps of the table intelligent query method of any one of claims 1 to 5.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910975458.1A CN110866042B (en) | 2019-10-11 | 2019-10-11 | Intelligent query method and device for table and computer readable storage medium |
| PCT/CN2020/098951 WO2021068565A1 (en) | 2019-10-11 | 2020-06-29 | Table intelligent query method and apparatus, electronic device and computer readable storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910975458.1A CN110866042B (en) | 2019-10-11 | 2019-10-11 | Intelligent query method and device for table and computer readable storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN110866042A (en) | 2020-03-06 |
| CN110866042B (en) | 2023-05-12 |
Family
ID=69652834
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910975458.1A Active CN110866042B (en) | 2019-10-11 | 2019-10-11 | Intelligent query method and device for table and computer readable storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN110866042B (en) |
| WO (1) | WO2021068565A1 (en) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110866042B (en) * | 2019-10-11 | 2023-05-12 | 平安科技(深圳)有限公司 | Intelligent query method and device for table and computer readable storage medium |
| CN112597171B (en) * | 2020-12-31 | 2024-08-27 | 平安银行股份有限公司 | Table relation visualization method and device, electronic equipment and storage medium |
| CN113111864A (en) * | 2021-05-13 | 2021-07-13 | 上海巽联信息科技有限公司 | Intelligent table extraction algorithm based on multiple modes |
| CN116049354B (en) * | 2023-01-28 | 2023-06-20 | 北京原子回声智能科技有限公司 | Multi-table retrieval method and device based on natural language |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106250381A (en) * | 2015-06-04 | 2016-12-21 | 微软技术许可有限责任公司 | The row sequence optimized for input/output in list data |
| CN106874411A (en) * | 2017-01-22 | 2017-06-20 | 网易(杭州)网络有限公司 | The searching method and search platform of a kind of form |
| JP2017224240A (en) * | 2016-06-17 | 2017-12-21 | 富士通株式会社 | TABLE DATA SEARCH DEVICE, TABLE DATA SEARCH METHOD, AND TABLE DATA SEARCH PROGRAM |
| CN110222160A (en) * | 2019-05-06 | 2019-09-10 | 平安科技(深圳)有限公司 | Intelligent semantic document recommendation method, device and computer readable storage medium |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101615193A (en) * | 2009-07-07 | 2009-12-30 | 北京大学 | A query system based on encyclopedia data extraction and integration |
| US8819055B2 (en) * | 2010-05-14 | 2014-08-26 | Oracle International Corporation | System and method for logical people groups |
| US20140025626A1 (en) * | 2012-04-19 | 2014-01-23 | Avalon Consulting, LLC | Method of using search engine facet indexes to enable search-enhanced business intelligence analysis |
| US10311374B2 (en) * | 2015-09-11 | 2019-06-04 | Adobe Inc. | Categorization of forms to aid in form search |
| CN110866042B (en) * | 2019-10-11 | 2023-05-12 | 平安科技(深圳)有限公司 | Intelligent query method and device for table and computer readable storage medium |
- 2019-10-11: CN application CN201910975458.1A, patent CN110866042B (en), status Active
- 2020-06-29: WO application PCT/CN2020/098951, patent WO2021068565A1 (en), status Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN110866042A (en) | 2020-03-06 |
| WO2021068565A1 (en) | 2021-04-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110334272B (en) | Intelligent question-answering method and device based on knowledge graph and computer storage medium | |
| CN110866042B (en) | Intelligent query method and device for table and computer readable storage medium | |
| CN110442857B (en) | Emotion intelligent judging method and device and computer readable storage medium | |
| CN110765765B (en) | Contract key term extraction method, device and storage medium based on artificial intelligence | |
| CN110413773B (en) | Intelligent text classification method, device and computer readable storage medium | |
| CN110795527B (en) | Candidate entity ordering method, training method and related device | |
| CN110427480B (en) | Intelligent personalized text recommendation method and device and computer readable storage medium | |
| CN110795548A (en) | Intelligent question answering method, device and computer readable storage medium | |
| CN111475617A (en) | Event body extraction method and device and storage medium | |
| CN108038208B (en) | Training method and device of context information recognition model and storage medium | |
| CN115062134A (en) | Knowledge question-answering model training and knowledge question-answering method, device and computer equipment | |
| CN111159485A (en) | Tail entity linking method, device, server and storage medium | |
| WO2020248366A1 (en) | Text intention intelligent classification method and device, and computer-readable storage medium | |
| CN113190702A (en) | Method and apparatus for generating information | |
| CN112784011B (en) | Emotion problem processing method, device and medium based on CNN and LSTM | |
| CN113627797A (en) | Incoming employee portrait generation method, device, computer equipment and storage medium | |
| WO2021139076A1 (en) | Intelligent text dialogue generation method and apparatus, and computer-readable storage medium | |
| CN113947456B (en) | Online store matching method and its device, equipment, medium, and product | |
| WO2025026402A1 (en) | Question answering method and apparatus, electronic device and storage medium | |
| CN115730597A (en) | Multi-level semantic intention recognition method and related equipment thereof | |
| CN115546869A (en) | Facial expression recognition method and system based on multiple features | |
| CN112988993B (en) | Question and answer method and computing device | |
| CN110263134B (en) | Intelligent emotion question-answering method and device and computer readable storage medium | |
| CN113434657B (en) | E-commerce customer service response method and corresponding device, equipment and medium thereof | |
| CN113961701B (en) | Message text clustering method and device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |