
CN107799116B - Multi-round interactive parallel semantic understanding method and device - Google Patents

Multi-round interactive parallel semantic understanding method and device

Info

Publication number
CN107799116B
CN107799116B
Authority
CN
China
Prior art keywords
semantic, semantic understanding, understanding result, text data, result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610793380.8A
Other languages
Chinese (zh)
Other versions
CN107799116A (en)
Inventor
黄鑫
陈志刚
王智国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN201610793380.8A priority Critical patent/CN107799116B/en
Publication of CN107799116A publication Critical patent/CN107799116A/en
Application granted granted Critical
Publication of CN107799116B publication Critical patent/CN107799116B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/225 - Feedback of the input speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a multi-round interactive parallel semantic understanding method and device. The method comprises the following steps: receiving voice data; segmenting the voice data to obtain voice data segments, and performing voice recognition on the voice data segments to obtain current segmented text data; performing independent semantic understanding on the current segmented text data to obtain a first semantic understanding result, and performing semantic understanding on the current segmented text data according to historical data of user interaction to obtain a second semantic understanding result; and selecting the first semantic understanding result or the second semantic understanding result as the final semantic understanding result. The method improves the accuracy of the semantic understanding result, so that a more accurate interaction result can be fed back to the user and the user experience is improved.

Description

Multi-round interactive parallel semantic understanding method and device
Technical Field
The application relates to the technical field of natural language understanding, in particular to a multi-round interactive parallel semantic understanding method and device.
Background
With the rapid development of intelligent terminals and network technologies, people increasingly use intelligent terminals to satisfy various demands. For example, an intelligent speaker can serve as an entry point for human-machine interaction: the user interacts with the speaker by voice to meet different needs such as checking the weather, checking stocks, or listening to music. Likewise, while driving, the user can use an intelligent in-vehicle unit as the interaction entry point and complete tasks such as navigation, radio search, or music search by voice. When an intelligent terminal is used to satisfy such demands, voice interaction is generally adopted: the terminal receives voice data containing the user's demand, performs voice recognition on the voice data to obtain the corresponding recognized text, performs semantic understanding on the recognized text, and then feeds the interaction result corresponding to the semantic understanding result back to the user, thereby completing one interaction. When the user has one or more demands, the user often carries out multiple rounds of interaction with the intelligent terminal, and the system continuously feeds interaction results back to the user according to each demand, so as to provide a natural and smooth interaction experience. For example, suppose the user's demand is to navigate to a university campus; several rounds of interaction with the system may be needed before the demand is met. In a specific interaction, the user first says to the system: "Go to the university"; the system asks "Do you want the south district, north district, middle district, or east district?"; after the user replies, the system plans candidate routes and feeds them back to the user, and once the user selects a route, the system starts navigation.
In the related art, when a user performs multiple rounds of voice interaction with an intelligent terminal, the system generally performs semantic understanding only on the currently segmented text. However, if the user pauses or drawls within a single request, or if interfering speech from other people is present, the system often segments the voice data incorrectly; the semantic understanding is then also incorrect, a wrong interaction result is fed back to the user, and the user experience is seriously degraded.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the related art to some extent.
Therefore, an object of the present application is to provide a multi-round interactive parallel semantic understanding method, which can improve the accuracy of the semantic understanding result, so as to feed back a more accurate interactive result to a user, and improve the user experience.
Another object of the application is to propose a multi-round interactive parallel semantic understanding device.
In order to achieve the above object, a multi-round interactive parallel semantic understanding method provided by an embodiment of the first aspect of the present application includes: receiving voice data; segmenting the voice data to obtain voice data segments, and performing voice recognition on the voice data segments to obtain current segmented text data; performing independent semantic understanding on the current segmented text data to obtain a first semantic understanding result, and performing semantic understanding on the current segmented text data according to historical data of user interaction to obtain a second semantic understanding result; and selecting the first semantic understanding result or the second semantic understanding result as the final semantic understanding result.
According to the multi-round interactive parallel semantic understanding method of the embodiment of the first aspect of the application, by performing semantic understanding on the segmented text data according to the historical data, more information can be combined during semantic understanding and a more accurate semantic understanding result can be obtained. In addition, by selecting one of the two semantic understanding results, the more appropriate result can be chosen according to the actual situation, so that a more accurate interaction result is fed back to the user and the user experience is improved.
To achieve the above object, a multi-round interactive parallel semantic understanding device according to an embodiment of the second aspect of the present application includes: a receiving module configured to receive voice data; a voice recognition module configured to segment the voice data into voice data segments and perform voice recognition on the segments to obtain current segmented text data; a semantic understanding module configured to perform independent semantic understanding on the current segmented text data to obtain a first semantic understanding result, and to perform semantic understanding on the current segmented text data according to historical data of user interaction to obtain a second semantic understanding result; and a selection module configured to select the first semantic understanding result or the second semantic understanding result as the final semantic understanding result.
According to the multi-round interactive parallel semantic understanding device provided by the embodiment of the second aspect of the application, through semantic understanding of the segmented text data according to the historical data, more information can be combined during semantic understanding, and more accurate semantic understanding results can be obtained.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow diagram of a multi-round interactive parallel semantic understanding method according to one embodiment of the present application;
FIG. 2 is a flow chart of a method for ranking modification of initial semantic understanding results in an embodiment of the present application;
FIG. 3 is a flow chart of a method of selecting a first semantic understanding result or a second semantic understanding result according to an embodiment of the present application;
FIG. 4 is a flow chart of a method for semantically understanding cut text data from historical data in an embodiment of the present application;
FIG. 5 is a flow chart of a method for ranking and correcting secondary semantic understanding results in an embodiment of the present application;
FIG. 6 is a schematic diagram of a multi-round interactive parallel semantic understanding device according to one embodiment of the present application;
Fig. 7 is a schematic structural diagram of a multi-round interactive parallel semantic understanding device according to another embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar modules or modules having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the application. On the contrary, the embodiments of the application include all alternatives, modifications and equivalents as may be included within the spirit and scope of the appended claims.
For a better understanding of the present application, the problems in the related art will be further described with a specific example.
For example, the user inputs the voice data "I want to listen to Fan Xiaoxuan's … Snowman", where the ellipsis (…) represents a pause or drawl caused by the user's thinking or speaking habit. The system's semantic understanding module then receives two interaction requests, namely "I want to listen to Fan Xiaoxuan" and "Snowman". For the first request, after semantic understanding the system directly plays a Fan Xiaoxuan song (but not necessarily "Snowman"); for the second request, after semantic understanding the system plays a song named "Snowman" (which may be sung by Fan Xiaoxuan or by another singer with a song of the same name), or may even give chit-chat feedback such as "Winter has arrived, I like snowmen". None of these feedbacks satisfies the user's real demand, so the interaction feels unsmooth and the user experience is poor.
Analysis of the above example shows that an important cause of the inaccurate interaction result is that semantic understanding in the related art is performed only on the current segmentation result: "I want to listen to Fan Xiaoxuan" and "Snowman" are understood separately, without combined recognition using historical data, i.e. without recognizing "Snowman" in combination with the earlier segment "I want to listen to Fan Xiaoxuan".
In order to solve the problems of inaccurate semantic recognition results, unsmooth interaction, poor user experience and the like in the related technology, the application provides the following embodiment.
FIG. 1 is a flow diagram of a multi-round interactive parallel semantic understanding method according to one embodiment of the present application.
As shown in fig. 1, the method of the present embodiment includes:
S11: Voice data is received.
When the user needs to perform voice interaction with the intelligent terminal, the intelligent terminal can receive voice data uttered by the user.
S12: and segmenting the voice data to obtain voice data fragments, and performing voice recognition on the voice data fragments to obtain current segmented text data.
In the full-duplex interaction mode, the user can interact with the system continuously and naturally over multiple rounds. After the system receives the voice data input by the user, it first segments the received voice data; each voice data segment obtained by segmentation is treated as the voice data of the current interaction, to which the system needs to respond. The specific segmentation method is not limited and may be implemented with existing or future technologies. For example, acoustic features of the voice data, such as Mel Frequency Cepstral Coefficients (MFCC) or filterbank features, are extracted, and a pre-trained segmentation model is used to segment the received voice data according to the acoustic features; the segmentation model is a classification model commonly used in pattern recognition, such as a deep neural network model or a support vector machine model.
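The segmentation step can be illustrated with a short sketch. This is illustrative only: the text requires acoustic features (e.g. MFCC or filterbank) plus a pre-trained classification model, but it does not prescribe librosa, scikit-learn, or the frame-level boundary labeling assumed below.

```python
# Minimal sketch of frame-level voice-data segmentation, under the assumptions above.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def extract_frame_features(signal: np.ndarray, sr: int) -> np.ndarray:
    """Return one MFCC feature vector per frame (frames as rows)."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
    return mfcc.T

def segment_speech(signal: np.ndarray, sr: int, model: MLPClassifier) -> list:
    """Split the utterance wherever the pre-trained classifier predicts a boundary frame."""
    feats = extract_frame_features(signal, sr)
    is_boundary = model.predict(feats)          # 1 = segment boundary, 0 = inside a segment
    segments, start = [], 0
    for i, flag in enumerate(is_boundary):
        if flag == 1 and i > start:
            segments.append((start, i))         # frame-index range of one voice data segment
            start = i
    segments.append((start, len(is_boundary)))
    return segments
```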
After each voice data segment is obtained by segmentation, voice recognition is performed on each segment to obtain the corresponding text data; the text data corresponding to each voice data segment may be called segmented text data. The specific speech recognition algorithm is not limited and may be implemented using existing or future technologies.
S13: independent semantic understanding is carried out on the current segmentation text data to obtain a first semantic understanding result, and semantic understanding is carried out on the current segmentation text data according to historical data interacted by a user to obtain a second semantic understanding result.
Here, independent semantic understanding means performing semantic understanding directly on the current segmented text data without referring to historical data; correspondingly, performing semantic understanding on the current segmented text data according to historical data means referring to the historical data when understanding the current segmented text data.
Specific flow for semantic understanding based on historical data can be found in the following embodiments.
It will be appreciated that the timing of the independent semantic understanding and the semantic understanding from historical data is not limited. Further, to increase processing efficiency, the two semantic understanding processes may be processed in parallel.
The flow of independent semantic understanding of the current cut text data may be as follows:
Independent semantic understanding performs semantic understanding directly on the current segmented text data, without combining it with the historical data of user interaction. Specifically, initial semantic understanding is performed on the current segmented text data using the semantic scenes preset by the system, yielding the initial semantic understanding results.
The semantic scenes are preset according to the application requirements of the system, for example music, films, ticket booking, catering, and the like.
When performing initial semantic understanding on the current segmented text data, the initial semantic understanding result in each semantic scene can be obtained using syntax-based semantic analysis, semantic analysis based on a grammar rule network, or similar methods.
Taking semantic analysis based on a grammar rule network as an example for the initial semantic understanding of the current segmented text data: Augmented Backus-Naur Form (ABNF) grammar rules for each semantic scene can be written in advance; a Weighted Finite State Transducer (WFST) network for each semantic scene is then compiled from the grammar rules of that scene, and semantic understanding is performed on the current segmented text data in each semantic scene according to the corresponding WFST network, thereby obtaining the initial semantic understanding results in each semantic scene. It will be appreciated that the details of the semantic understanding process in each semantic scene can be found in various existing or future technologies and are not described in detail here.
For each semantic scene, there may be zero, one, or more initial semantic understanding results. For example, a confidence threshold (statically set or dynamically determined) can be predetermined for each semantic scene, and the initial semantic understanding results whose path probability values are larger than the threshold are taken as candidate results in that scene. After the candidates are obtained, either all candidates are taken as the initial semantic understanding results of the corresponding scene according to a predetermined rule, or all candidates are sorted by the probability values of their paths and a preset number of them are selected in order as the initial semantic understanding results of the scene, i.e. the n-best results, where n can be determined according to application requirements.
The information contained in each initial semantic understanding result includes the semantic scene name, the semantic scene value, the semantic slots, and the semantic slot values, and can be expressed as {semantic scene name: semantic scene value, semantic slot 1: value of semantic slot 1, semantic slot 2: value of semantic slot 2, …, semantic slot n: value of semantic slot n}, where the semantic slots are preset according to the semantic scenes and the application requirements.
For example, the 1-best semantic understanding result when the current segmented text data is "concert version" is {"service": "music", "version": "concert version"}, where "service" represents the semantic scene name, "music" represents the semantic scene value (music scene), "version" represents a semantic slot (version), and "concert version" represents the semantic slot value.
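For illustration, the result structure above could be held in Python as a plain dictionary; the "score" field is an assumption added here to carry the path probability used later for n-best ranking, and is not part of the format defined in the text.

```python
# Illustrative only: one way to hold an initial semantic understanding result.
semantic_result = {
    "service": "music",                          # semantic scene name -> semantic scene value
    "slots": {"version": "concert version"},     # semantic slot -> semantic slot value
    "score": 0.87,                               # assumed: probability of the matched WFST path
}
```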
After the initial semantic understanding results in each semantic scene are obtained for the current segmented text data, the first semantic understanding result is obtained from these initial semantic understanding results.
In some embodiments, the initial semantic understanding results in each semantic scene may be ranked in descending order of the probability values of their corresponding paths, and a preset number of results selected in order from the ranked list as the final first semantic understanding results. Alternatively:
In some embodiments, after obtaining the initial semantic understanding results in each semantic scene, the initial semantic understanding results may be ranked and corrected, and among the ranked and corrected initial semantic understanding results, a preset number of initial semantic understanding results are sequentially selected as final first semantic understanding results. The specific content may be as follows.
S14: and selecting the first semantic understanding result or the second semantic understanding result as a final semantic understanding result.
After the first semantic understanding result and the second semantic understanding result are obtained, one of the two semantic understanding results can be selected as a final semantic understanding result, namely the first semantic understanding result is selected as the final semantic understanding result, or the second semantic understanding result is selected as the final semantic understanding result.
Further, as shown in fig. 2, the flow of the method for sorting and correcting the initial semantic understanding result includes:
S21: And acquiring related data of the initial semantic understanding result.
In each semantic scene, a vertical search engine is used to search for data highly correlated with the initial semantic understanding results.

In a specific search, each initial semantic understanding result is used as the search string of the search engine; the engine automatically converts the search string into internal search conditions, retrieves the documents related to that result, and returns them ranked. For each initial semantic understanding result, the top N related documents can be selected as its related data. The specific process is the same as in the prior art and is not described in detail here.
S22: and extracting relevance characteristics according to the initial semantic understanding result, the current segmentation text data and the relevant data.
Wherein, the correlation characteristic under the corresponding semantic scene can be extracted corresponding to each semantic scene.
The relevance features are used to describe the relevance between an initial semantic understanding result and the current segmented text data, and thus the accuracy of the ranking of the initial semantic understanding results.
The relevance feature includes at least one of:
the number of words in the initial semantic understanding result that are the same as words in the current segmented text data; the number of key semantic slots contained in the initial semantic understanding result; the popularity of the key semantic slot values in the initial semantic understanding result that are the same as words in the current segmented text data; and the number of keywords related to the semantic scene contained in the current segmented text data.
The specific extraction method is as follows:
(1) The number of words in the initial semantic understanding result that are the same as words in the current segmented text data

This feature counts the words of the current segmented text data that are covered by the initial semantic understanding result. Specifically, each semantic slot value contained in the initial semantic understanding result is matched in turn against the words obtained by segmenting the current segmented text data, and the number of words in the semantic slot values that are the same as words in the current segmented text data is determined.
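A minimal sketch of this count, assuming the slot values and the current segmented text have already been tokenized (the tokenizer itself, e.g. a Chinese word segmenter, is not specified by the text):

```python
# Sketch of relevance feature (1): shared-word count between slot values and the text.
def shared_word_count(slot_value_words: list[list[str]], segmented_text_words: list[str]) -> int:
    """Count words that appear both in some semantic slot value and in the current text."""
    text_words = set(segmented_text_words)
    shared = 0
    for value_words in slot_value_words:        # one tokenized word list per slot value
        for word in value_words:
            if word in text_words:
                shared += 1
    return shared
```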
(2) The number of key semantic slots contained in the initial semantic understanding result

For each initial semantic understanding result, the number of key semantic slots in the semantic scene to which it belongs is counted. The more key semantic slots a result contains, the higher its correlation with its semantic scene, so within each scene the initial semantic understanding results with higher correlation are ranked earlier; across different semantic scenes, the more scene key semantic slots an initial semantic understanding result contains, the earlier it is ranked.
(3) The popularity of key semantic slot values in the initial semantic understanding result that are the same as words in the current segmented text data

The key semantic slots in each initial semantic understanding result are found, and the value of each key semantic slot is matched in turn against the words of the current segmented text data; if they are the same, the popularity of the current key semantic slot value is calculated. The popularity of a key semantic slot value can be computed from the related data retrieved for the initial semantic understanding result in its semantic scene, such as the data contained in the vertical pages of the search result (for a music scene, the corresponding vertical pages are music services such as Kugou Music, Kuwo Music, and Tencent Music). The popularity of the current key semantic slot value is obtained by jointly analyzing information such as the click count, play count, and search count of the slot value on the vertical pages and the distribution of the slot value over multiple vertical pages. The specific analysis and calculation method is the same as in the prior art and is not described in detail here.
(4) The number of keywords related to the semantic scene contained in the current segmented text data

The keywords of each semantic scene can be obtained in advance by collecting a large amount of text data in each scene, performing word segmentation, counting the frequency of each word, and taking the words whose frequency exceeds a threshold as the keywords of the scene; keywords of a music scene are, for example, "listen", "song", and "play". The more semantic scene keywords the current segmented text data contains, the earlier the semantic understanding results in that scene are ranked.
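A sketch of how the scene keyword lists and feature (4) could be computed; the frequency threshold and the tokenized-corpus input format are illustrative assumptions:

```python
# Building per-scene keyword sets by word frequency, then counting keyword hits.
from collections import Counter

def scene_keywords(tokenized_corpus: list[list[str]], min_freq: int = 50) -> set[str]:
    """Words whose corpus frequency exceeds the threshold, for one semantic scene."""
    counts = Counter(word for sentence in tokenized_corpus for word in sentence)
    return {word for word, freq in counts.items() if freq > min_freq}

def keyword_feature(segmented_text_words: list[str], keywords: set[str]) -> int:
    """Feature (4): number of scene keywords appearing in the current segmented text."""
    return sum(1 for word in segmented_text_words if word in keywords)
```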
S23: and according to the relevance characteristics and a pre-constructed sorting model, sorting and correcting the initial semantic understanding result.
The ranking model can be generated by training on training data in a training phase. For example, a large amount of text data is collected and semantically understood to obtain the initial semantic understanding results in each semantic scene; the relevance features are extracted, and the ranking of the initial semantic understanding results in each scene is manually annotated; the model is then trained on the relevance features and the annotated rankings. The input of the ranking model is the relevance features, and the output is the ranking-corrected initial semantic understanding results in each semantic scene.
Therefore, after the correlation characteristics are extracted, the initial semantic understanding result after the sorting correction can be obtained according to the sorting model.
After the initial semantic understanding result after the sorting correction is obtained, topN may be selected as a final first semantic understanding result, where N may be determined according to an application requirement, for example, N takes a value of 10.
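Putting the ranking correction together, a rough sketch might look as follows; the regression model family is an assumption (the text only requires a trained ranking model), and scikit-learn is used purely for illustration:

```python
# Sketch of ranking correction: score each candidate from its relevance features,
# re-sort by that score, and keep the top-N (N = 10 in the example above).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def rank_correct(candidates: list[dict], features: np.ndarray,
                 model: GradientBoostingRegressor, top_n: int = 10) -> list[dict]:
    """Re-rank initial semantic understanding results and keep the top-N."""
    scores = model.predict(features)           # one score per candidate result
    order = np.argsort(-scores)                # descending by predicted score
    return [candidates[i] for i in order[:top_n]]
```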
In this embodiment, by performing semantic understanding on the current cut text data according to the history data, more information can be combined during semantic understanding, so that a more accurate semantic understanding result can be obtained.
Fig. 3 is a flow chart of a method for selecting a first semantic understanding result or a second semantic understanding result according to an embodiment of the present application.
In this embodiment, when one semantic understanding result is selected from the two semantic understanding results, the selection feature is extracted first, and then the selection is selected according to the selection feature and the semantic result selection model.
The selection features describe the first semantic understanding result, the second semantic understanding result, and the relation between the current segmented text data and the user interaction history from different angles, and are used to select the semantic understanding result finally used by the system. In general, if the correlation between the current segmented text data and the historical data of user interaction is small, the first semantic understanding result is used as the final semantic understanding result; if the correlation is large, the second semantic understanding result is used. The specific selection process is as follows.
As shown in fig. 3, the method includes:
S31: and extracting selection features according to the current segmentation text data, the initial semantic understanding result of the current segmentation text data and the historical data of user interaction.
Wherein the history data includes at least one of:
feedback text data of the previous interaction, historical semantic understanding results of the previous interaction and historical segmentation text data of the previous interaction.
The selection feature may include at least one of:
the length of the current segmented text data; the semantic matching degree before and after the referent in the feedback text data of the previous round of interaction is replaced by the current segmented text data; the semantic integrity of the current segmented text data; the coincidence degree between the semantic slot set contained in the initial semantic understanding result of the current segmented text data and the semantic slot set contained in the historical semantic understanding result of the previous round of interaction; the matching degree between the semantic slot values contained in the initial semantic understanding result of the current segmented text data and the semantic slot values contained in the historical semantic understanding result of the previous round of interaction; the association degree between the semantic slots contained in the initial semantic understanding result of the current segmented text data and the semantic slots contained in the historical semantic understanding result of the previous round of interaction; and the semantic integrity after the current segmented text data is concatenated with the historical segmented text data of the previous round of interaction.
The specific extraction content of each selection feature is as follows:
(1) Length of the current segmented text data

The length of the current segmented text data is typically represented by the number of words it contains and is obtained by counting those words. Generally, the longer the current segmented text data, the more likely it is to express an independent user demand, and the first semantic understanding result can be used directly as the final semantic understanding result; the shorter the current segmented text data, the more likely its content is related to the historical data of user interaction, and the second semantic understanding result is the more appropriate final semantic understanding result.
(2) Semantic matching degree before and after the referent in the feedback text data of the previous interaction is replaced by the current segmented text data

This feature means that the current segmented text data replaces the referent in the system's feedback text data of the previous interaction, generating new feedback text data, and the semantic matching degree between the new feedback text data and the feedback text data before replacement is calculated.

The referent is generally obtained from the semantic slots contained in the feedback text data. When the system generates feedback text data containing a referent, two sentence types often occur, namely a general question and an alternative question. A general question, for example the feedback text "When do you want to leave?", is generated by the interactive system from the interaction data and the corresponding template "Do you want to leave $when?", where "$when" refers to the referent "when". An alternative question, for example the feedback text "Do you want a first-class seat, a second-class seat, or a sleeper?", is generated from the template "Do you want $seat[0], $seat[1], or $seat[2]?", where "$seat[0], $seat[1], $seat[2]" refer to the referents "first-class seat", "second-class seat", and "sleeper" and represent three types of train seats; specifically, they are obtained dynamically by clustering the values retrieved for the semantic slot "seat" when the system generated the feedback text data of the previous interaction.
When extracting this feature, semantic understanding is performed on the feedback text data of the previous interaction after its referent has been replaced; the specific semantic analysis method is the same as the independent semantic understanding method, and yields the semantic understanding result after referent replacement. The matching degree between the semantic understanding results of the feedback text before and after the referent replacement is then calculated; the specific calculation is the same as in the prior art and is not described in detail here.

If no referent exists in the feedback text data of the previous interaction, the feature value is 0.

For example, the segmented text data of the previous round of interaction is "Help me find the best way to get to Beijing", and the feedback text data given by the system is "Do you want to travel by plane or by train?". The current segmented text data then replaces the referent in this feedback text data; finally, when the feature is extracted, semantic understanding is performed on the feedback text data after the referent replacement to obtain the semantic understanding result after replacement, and the matching degree between the semantic understanding results of the feedback text data before and after the referent replacement is calculated.
(3) Semantic integrity of the current segmented text data

The semantic integrity of the current segmented text data describes whether the semantics of the current segmented text data are complete; for example, a semantically complete request has integrity 1.0, the fragment "Liu Dehua" alone has integrity 0.3, and a single particle has integrity 0.0. The semantic integrity can be obtained by extracting semantic features of the current segmented text data and predicting it with a pre-trained text semantic integrity prediction model. The semantic features can be obtained from the independent semantic understanding result of the current segmented text data, such as whether it contains semantic slots, the total number of semantic slots, the length of each semantic slot value (represented by the number of words it contains), and whether the semantic understanding result contains semantic scene keywords. The semantic integrity prediction model is a regression model commonly used in pattern recognition, such as a logistic regression model; it can be obtained by collecting a large amount of text data in advance, performing independent semantic understanding and extracting the semantic features, annotating the semantic integrity of the text data, and training on the semantic features and the annotated integrity.
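A hedged sketch of such an integrity predictor, using logistic regression as the example model named in the text; the four features below mirror the ones listed above and are illustrative rather than an exhaustive specification:

```python
# Sketch of a semantic-integrity predictor over features of the independent result.
import numpy as np
from sklearn.linear_model import LogisticRegression

def integrity_features(result: dict, scene_keywords: set[str]) -> np.ndarray:
    slots = result.get("slots", {})
    return np.array([
        1.0 if slots else 0.0,                                   # has any semantic slot
        float(len(slots)),                                       # total number of slots
        float(sum(len(v.split()) for v in slots.values())),      # total slot-value length (rough)
        1.0 if any(v in scene_keywords for v in slots.values()) else 0.0,  # scene keyword present
    ])

# Training (offline): X = rows of integrity_features, y = annotated 0/1 integrity labels.
# model = LogisticRegression().fit(X, y)
# Prediction: model.predict_proba(integrity_features(result, kw).reshape(1, -1))[0, 1]
```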
(4) The coincidence degree of the semantic slot set contained in the initial semantic understanding result of the current segmentation text data and the semantic slot set contained in the historical semantic understanding result of the previous interaction
The historical semantic understanding result of the previous interaction refers to the final semantic understanding result of the segmentation text data of the previous interaction.
And the coincidence degree of the semantic slot set contained in the initial semantic understanding result (called initial semantic understanding result for short) of the current segmentation text data and the semantic slot set contained in the historical semantic understanding result of the previous round of interaction describes the correlation degree of the current segmentation text data and the historical data of the previous round of interaction from the semantic slot angle.
During specific extraction, a semantic slot set contained in an initial semantic understanding result and a semantic slot set contained in a historical semantic understanding result are respectively obtained, wherein the semantic slot set contained in the initial semantic understanding result specifically refers to a set composed of semantic slots contained in the initial semantic understanding result; the semantic slot set contained in the historical semantic understanding result specifically refers to a set composed of semantic slots contained in the historical semantic understanding result. Further, the above-mentioned history semantic understanding result is specifically a history semantic understanding result of the previous interaction.
In general, the higher the coincidence degree, the more likely the current segmented text data is a correction of the historical data of the previous round of interaction; in this case the first semantic understanding result should be used, so the probability that the system uses the first semantic understanding result is higher. Conversely, the lower the coincidence degree, the more likely it is that one sentence of interaction voice data was erroneously cut into two by the system; in this case the second semantic understanding result should be used, so the probability that the system finally uses the second semantic understanding result is higher.
(5) Matching degree of semantic slot values contained in initial semantic understanding results of current segmentation text data and semantic slot values contained in historical semantic understanding results of previous interaction
The historical semantic understanding result of the previous interaction refers to the final semantic understanding result of the segmentation text data of the previous interaction.
During specific extraction, the semantic slot values contained in the initial semantic understanding result and those contained in the historical semantic understanding result are extracted respectively; each group of semantic slot values can be concatenated into a character string, and the matching degree feature can specifically be the edit distance between the two strings. The edit distance, also known as the Levenshtein distance, is the minimum number of edit operations required to transform one string into the other; the permitted operations are replacing one character with another, inserting a character, and deleting a character. In general, the smaller the edit distance, the greater the similarity of the two strings. The edit distance can be computed with various existing or future technologies, for example based on a pronunciation phoneme transition cost matrix.
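A plain dynamic-programming edit distance matching the description above; the phoneme-cost variant mentioned at the end would replace the unit costs with values from a pronunciation cost matrix:

```python
# Levenshtein distance with unit insertion/deletion/substitution costs.
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions to turn a into b."""
    dp = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, len(b) + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                      # deletion
                        dp[j - 1] + 1,                  # insertion
                        prev + (a[i - 1] != b[j - 1]))  # substitution (0 if characters equal)
            prev = cur
    return dp[len(b)]
```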
(6) The association degree of the semantic slots contained in the initial semantic understanding result of the current segmentation text data and the semantic slots contained in the historical semantic understanding result of the previous interaction
The association degree of the semantic slots is calculated from a large amount of interaction data collected in advance; the interaction data can be obtained from user interaction history data. Specifically, according to the collected interaction data, the ratio is calculated between the number of times the semantic slots contained in the historical semantic understanding result of the previous round of interaction and the semantic slots contained in the initial semantic understanding result of the current segmented text data co-occur in the interaction data, and the number of times the semantic slots contained in the historical semantic understanding result of the previous round of interaction occur in the interaction data. Co-occurrence here means that, in two adjacent rounds of interaction within the interaction data, the order of the earlier and later rounds of user interaction data is the same as the order of the previous round of interaction data and the current segmented text data. The calculation is shown in formula (1):

T_c = count(slot_c | slot_hc) / count(slot_hc)    (1)

where T_c represents the association degree between the semantic slots contained in the initial semantic understanding result of the current segmented text data and the semantic slots contained in the historical semantic understanding result of the previous round of interaction; count(slot_c | slot_hc) represents the number of times that the semantic slots contained in the historical semantic understanding result of the previous round of interaction and the semantic slots contained in the initial semantic understanding result of the current segmented text data co-occur in the collected interaction data; and count(slot_hc) represents the number of times that the semantic slots contained in the historical semantic understanding result of the previous round of interaction occur in the interaction data.
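Formula (1) could be computed from logged interaction pairs roughly as follows; the log format (pairs of slot-name sets for adjacent rounds) is an assumption for illustration:

```python
# Illustrative computation of formula (1): T_c = count(slot_c | slot_hc) / count(slot_hc).
def association_degree(interaction_log: list[tuple[set[str], set[str]]],
                       hist_slots: set[str], current_slots: set[str]) -> float:
    """Association degree between the current result's slots and the previous round's slots."""
    co_occurrences = 0
    hist_occurrences = 0
    for prev_slots, next_slots in interaction_log:
        if hist_slots <= prev_slots:                 # previous round contains slot_hc
            hist_occurrences += 1
            if current_slots <= next_slots:          # following round contains slot_c
                co_occurrences += 1
    return co_occurrences / hist_occurrences if hist_occurrences else 0.0
```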
(7) Semantic integrity after connection of current cut text data and history cut text data of previous round of interaction
The connection of the current cut text data and the history cut text data of the previous round of interaction means that the current cut text data is directly placed behind the history cut text data of the previous round of interaction, so that new text data is formed; the method for calculating the semantic integrity of the new text data is the same as the method for calculating the semantic integrity of the current segmented text data, and will not be described in detail here.
S32: and selecting a first semantic understanding result or a second semantic understanding result according to the selection characteristics and a pre-constructed semantic result selection model.
The selection features are used as the input of the pre-trained semantic result selection model, which selects between the parallel semantic understanding results: it outputs, for each of the first and second semantic understanding results, the probability that it is the correct semantic understanding result, and the result with the higher probability is selected as the final semantic understanding result.
The semantic result selection model is built by collecting a large amount of user interaction data in advance, performing parallel semantic understanding, extracting the selection features, and annotating which of the parallel semantic understanding results is correct. The model is then trained using the selection features and the annotations: the selection features are the input of the model, the probability of each of the parallel semantic understanding results is the output, and the model parameters are updated according to the annotations; the semantic result selection model is obtained when the parameter updating ends. The specific process is the same as in the prior art and is not described in detail here. The semantic result selection model is a classification model commonly used in pattern recognition, such as a decision tree model, a support vector machine model, or a deep neural network model.
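A minimal sketch of the selection step; the decision tree is only one of the model families the text names, and the convention that class 1 denotes "the second (context-dependent) result is correct" is an assumption:

```python
# Sketch of choosing between the two parallel semantic understanding results.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def choose_result(selection_features: np.ndarray, first_result: dict, second_result: dict,
                  model: DecisionTreeClassifier) -> dict:
    """Return the semantic understanding result with the higher predicted probability."""
    p_second = model.predict_proba(selection_features.reshape(1, -1))[0, 1]
    return second_result if p_second >= 0.5 else first_result
```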
In this embodiment, a more appropriate semantic understanding result can be selected by selecting a model based on the selection features and the semantic result.
The foregoing relates to semantic understanding of the current cut text data according to the history data to obtain the content of the second semantic understanding result, and the following describes in detail the process of semantic understanding of the current cut text data according to the history data.
Fig. 4 is a flowchart of a method for semantically understanding cut text data according to history data in an embodiment of the present application.
As shown in fig. 4, the method of the present embodiment includes:
S41: and carrying out independent semantic understanding on the current segmentation text data to obtain an initial semantic understanding result.
When independent semantic understanding is performed, each semantic scene can be corresponding, and under each semantic scene, independent semantic understanding is performed on the current segmentation text data, so that an initial semantic understanding result under each semantic scene is obtained.
The semantic scene can be preset according to the application requirement of the system, and the semantic scene can be music, films, ticket booking, catering and the like.
When independent semantic understanding is carried out on the current segmentation text data, initial semantic understanding results under various semantic scenes can be obtained by using semantic analysis based on syntax, semantic analysis based on grammar rule networks and the like.
Taking semantic analysis based on a grammar rule network as an example for the independent semantic understanding of the current segmented text data: Augmented Backus-Naur Form (ABNF) grammar rules for each semantic scene can be written in advance; a Weighted Finite State Transducer (WFST) network for each semantic scene is then compiled from the grammar rules of that scene, and semantic understanding is performed on the current segmented text data in each semantic scene according to the corresponding WFST network, thereby obtaining the initial semantic understanding result in each semantic scene. It will be appreciated that the details of the semantic understanding process in each semantic scene can be found in various existing or future technologies and are not described in detail here.

For each semantic scene, there may be zero, one, or more initial semantic understanding results. For example, a confidence threshold (statically set or dynamically determined) can be predetermined for each semantic scene, and the initial semantic understanding results whose path probability values are larger than the threshold are taken as candidate results in that scene. After the candidates are obtained, either all candidates are taken as the initial semantic understanding results of the corresponding scene according to a predetermined rule, or all candidates are sorted by the probability values of their paths and a preset number of them are selected in order as the initial semantic understanding results of the scene, i.e. the n-best results, where n can be determined according to application requirements.
The information contained in each initial semantic understanding result includes the semantic scene name, the semantic scene value, the semantic slots, and the semantic slot values, and can be expressed as {semantic scene name: semantic scene value, semantic slot 1: value of semantic slot 1, semantic slot 2: value of semantic slot 2, …, semantic slot n: value of semantic slot n}, where the semantic slots are preset according to the semantic scenes and the application requirements.

For example, the 1-best semantic understanding result when the current segmented text data is "concert version" is {"service": "music", "version": "concert version"}, where "service" represents the semantic scene name, "music" represents the semantic scene value (music scene), "version" represents a semantic slot (version), and "concert version" represents the semantic slot value.
S42: and extracting semantic features of the current segmentation text data according to the initial semantic understanding result and the historical data of the user interaction.
Wherein the history data includes: historical semantic understanding results, and/or feedback text data semantic understanding results.
The semantic features are used for carrying out secondary semantic understanding on the current segmentation text data. The semantic features include one or more of the following:
the initial semantic understanding result features; the historical semantic understanding result features; the coincidence degree between the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the historical semantic understanding result; the matching degree between the key semantic slot values contained in the initial semantic understanding result and the key semantic slot values contained in the historical semantic understanding result; and the coincidence degree between the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the semantic understanding result of the feedback text data of the previous round.
The specific extraction flow of each semantic feature is as follows:
(1) Initial semantic understanding of result features
The initial semantic understanding result features are obtained according to the initial semantic understanding result.
In the specific extraction, the feature names and the feature values can be extracted according to the initial semantic understanding result, and the information obtained by combining the feature names and the feature values is used as the initial semantic understanding result feature.
The feature names and the determination modes of the feature values can be set. For example, the semantic scene name and the semantic scene value are connected by a connector and then used as a feature name, and the feature value corresponding to the feature name indicates whether the feature name appears in the initial semantic understanding result; and taking the semantic slot as another feature name, wherein the feature value corresponding to the feature name indicates whether the semantic slot has a value or not in the initial semantic understanding result.
The connectors described above are configurable, for example using "-" or other symbols as connectors. The feature value may be 0 or 1. For example, corresponding to the previous feature name, 1 indicates that the semantic scene name and the semantic scene value corresponding to the feature name appear in the semantic understanding result, otherwise, the feature value is 0. Or the value of the semantic slot in the initial semantic understanding result is represented by 1 corresponding to the latter feature name, otherwise, the value of the feature is 0.
For example, if the initial semantic understanding result is {"service": "music", "version": "concert version"}, the extracted initial semantic understanding result features are (service.music:1, version:1), where "service.music" is the feature name obtained by connecting the semantic scene with the semantic scene value and the number after the colon is the feature value; since "version" has a corresponding value in the semantic understanding result, the value of the extracted feature "version" is 1.
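A small sketch of this feature extraction, reusing the illustrative dictionary layout from earlier; the layout itself is an assumption, not a format fixed by the text:

```python
# Turn one semantic understanding result into the sparse feature form "scene.value:1, slot:1".
def result_features(result: dict, connector: str = ".") -> dict:
    feats = {}
    feats[f"service{connector}{result['service']}"] = 1      # scene name joined with scene value
    for slot, value in result.get("slots", {}).items():
        feats[slot] = 1 if value else 0                       # slot feature: has a value or not
    return feats

# result_features({"service": "music", "slots": {"version": "concert version"}})
# -> {"service.music": 1, "version": 1}
```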
(2) Historical semantic understanding of outcome features
The historical semantic understanding result features can be obtained according to the historical semantic understanding result. The method can be particularly obtained by referring to the mode of obtaining the initial semantic understanding result characteristics according to the initial semantic understanding result.
The history semantic understanding result refers to a final semantic understanding result of the history data, and further, the history data may specifically refer to history segmentation text data, that is, text data corresponding to voice data input by a user.
Further, the historical semantic understanding result may be a semantic understanding result of multiple rounds of historical data, at this time, the extracted features of the historical semantic understanding result are multiple, so as to form a feature sequence of the semantic understanding result, and specifically considering several rounds of historical data may be determined according to application requirements, for example, considering 5 rounds of historical data, the feature sequence of the historical semantic understanding result includes 5 rounds of historical semantic understanding result.
After each round of interaction is finished, the system stores text data of each round of user interaction and final semantic understanding results thereof, wherein the final semantic understanding results are the same as the expression forms of the initial semantic understanding results, namely, are composed of semantic scenes, semantic scene values, semantic slots and semantic slot values, so that the feature extraction method is the same as the extraction method of the initial semantic understanding results, and the feature extraction method is not described in detail herein.
For example, the current segmented text data is "concert version", and the historical data of user interaction is assumed to include "turn off the television" and "Zhang Xinzhe's 'Faith'"; the user interaction history thus contains two rounds in total, and the extracted historical semantic understanding feature sequence contains two features, written in brackets as [(service.smartHome:1, object.tv:1, action:1), (service.music:1, artist:1, song:1)]. The information in each bracket is one historical semantic understanding result feature: in the first feature, "service.smartHome" and "object.tv" are feature names obtained by connecting semantic scene names with semantic scene values, and "action" is a semantic slot feature name; in the second feature, "service.music" is the feature name obtained by connecting the semantic scene with the semantic scene value, and "artist" and "song" are semantic slot feature names. The value of each feature is the number after the colon.
(3) The coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the historical semantic understanding result
The coincidence degree describes, from the perspective of semantic slots, how closely the current segmented text data is related to the user's interaction history.
During extraction, the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the historical semantic understanding result are obtained respectively; the former is the set of semantic slots contained in the initial semantic understanding result, and the latter is the set of semantic slots contained in the historical semantic understanding result. Further, when there are multiple historical semantic understanding results, one of them can be selected, typically the historical semantic understanding result of the previous round of interaction, and its corresponding semantic slot set determined.
The coincidence degree is calculated as follows: the number of identical semantic slot names in the two semantic slot sets is divided by the total number of elements in the semantic slot set contained in the initial semantic understanding result.
For example, if the segmented text data is "concert version", the initial semantic understanding result is { "service": "music", "version": "concert version" }, and the semantic slot set it contains has a single element, the semantic slot "version". Assume the previous round of history data is "play Zhang Xinzhe's Faith", whose semantic understanding result covers the various versions of "Faith" sung by Zhang Xinzhe; the semantic slot set contained in the historical semantic understanding result then consists of the slots "song name", "singer", "version" and "album". The only semantic slot shared by the two sets is "version", so the number of identical slots is 1; dividing 1 by the single element in the initial result's slot set gives a coincidence degree of 1.
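The coincidence degree from this example can be reproduced with a small sketch; the slot names are taken from the example above, and representing the slot sets as Python sets is an assumption.

```python
# Coincidence degree: shared slot names divided by the number of slots
# in the initial semantic understanding result.
def slot_overlap(initial_slots, history_slots):
    initial_slots, history_slots = set(initial_slots), set(history_slots)
    if not initial_slots:
        return 0.0
    return len(initial_slots & history_slots) / len(initial_slots)

initial = {"version"}                                  # slots of the initial result
history = {"song name", "singer", "version", "album"}  # slots of the previous round's result
print(slot_overlap(initial, history))                  # -> 1.0
```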
(4) The matching degree of the key semantic slot values contained in the initial semantic understanding result and the key semantic slot values contained in the historical semantic understanding result
The key semantic slots are semantic slots preset for each semantic scene according to application requirements. Taking a music scene as an example, the preset key semantic slots may include song name, singer, album, version, source (such as a film or TV soundtrack or a variety show), language and region.
After the key semantic slots are determined, the key semantic slots and their values can be extracted from the initial semantic understanding result and from the historical semantic understanding result respectively, giving the key semantic slot values contained in each. Further, when there are multiple historical semantic understanding results, which one or more of them to use can be determined according to application requirements.
After the key semantic slot values contained in the initial semantic understanding result and those contained in the historical semantic understanding result are obtained, each group of key semantic slot values can be concatenated into a character string, and the matching degree feature can be defined as the edit distance between the two strings. The edit distance, also known as the Levenshtein distance, is the minimum number of edit operations required to transform one string into the other; the permitted operations are replacing one character with another, inserting a character, and deleting a character. In general, the smaller the edit distance, the more similar the two strings. The edit distance can be computed with various existing or future techniques, for example based on a pronunciation-phoneme transition cost matrix.
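For illustration, a standard character-level Levenshtein implementation is sketched below; the phoneme-based variant mentioned above would substitute a pronunciation cost matrix for the constant substitution cost used here.

```python
# Minimal character-level Levenshtein distance (insert / delete / substitute,
# all with cost 1), computed with a rolling dynamic-programming row.
def edit_distance(a: str, b: str) -> int:
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # delete from a
                          curr[j - 1] + 1,     # insert into a
                          prev[j - 1] + cost)  # substitute
        prev = curr
    return prev[n]

print(edit_distance("kitten", "sitting"))  # -> 3
```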
(5) The coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the semantic understanding result of the last round of feedback text data.
The feedback text data of the previous round refers to the system response to the user's request in the most recent interaction before the current segmented text data. For example, if the user's last request was "Zhang Xinzhe's Faith" and the system's feedback to the user was "please enjoy Zhang Xinzhe's Faith", then the feedback text data of the previous round is "please enjoy Zhang Xinzhe's Faith". Semantic understanding is then performed on the feedback text data; the specific flow can refer to the flow of performing semantic understanding on the segmented text data to obtain the initial semantic understanding result, and yields the semantic understanding result of the feedback text data. Further, similarly to the processing of the initial semantic understanding result of the segmented text data, the semantic slot set contained in the semantic understanding result of the feedback text data can be obtained.
After the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the semantic understanding result of the feedback text data are obtained, the coincidence degree between them can be calculated in the same way as the coincidence degree between the semantic slot set of the initial semantic understanding result and that of the historical semantic understanding result.
S43: Perform secondary semantic understanding on the current segmented text data according to the semantic features and a pre-constructed semantic understanding result ordering model, to obtain secondary semantic understanding results.
The semantic understanding result ordering model is generated by training on collected training data during the training stage. Its input is the semantic features of the text data, and its output is ordering information for the multiple initial semantic understanding results corresponding to that text data, such as a probability value for each initial semantic understanding result, or an ordering that places the correct initial semantic understanding result first. Therefore, after the semantic features are extracted, they are fed into the semantic understanding result ordering model, and the ordering of the initial semantic understanding results is obtained from its output.
Specifically, the process of training the semantic understanding result ordering model may include: collecting a large amount of user text data and system feedback text data; performing initial semantic understanding on the data to obtain the initial semantic understanding results corresponding to the user text data and to the feedback text data; extracting the semantic features of the text data according to the feature extraction flow above; manually annotating the ordering of the corresponding initial semantic understanding results, or directly marking the correct initial semantic understanding result, placing it first and the other initial semantic understanding results after it; and training on the extracted semantic features and the manually annotated ordering to generate the semantic understanding result ordering model. The ordering model is a ranking model commonly used in pattern recognition, such as Ranking SVM or Ranking CNN.
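As a rough illustration of the pairwise Ranking SVM idea named above, the following sketch trains a linear SVM on feature-difference vectors. The feature vectors, relevance labels, grouping, and the use of scikit-learn are assumptions for this example, not the patent's implementation.

```python
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X, relevance, groups):
    """Build difference vectors with +/-1 labels, pairing only candidates
    that belong to the same query (group)."""
    diffs, labels = [], []
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        for i in idx:
            for j in idx:
                if relevance[i] > relevance[j]:
                    diffs.append(X[i] - X[j]); labels.append(1)
                    diffs.append(X[j] - X[i]); labels.append(-1)
    return np.array(diffs), np.array(labels)

X = np.random.rand(6, 4)                    # 6 candidate results, 4 semantic features each
relevance = np.array([2, 1, 0, 1, 0, 2])    # annotated preference per candidate
groups = np.array([0, 0, 0, 1, 1, 1])       # which query each candidate belongs to

Xp, yp = pairwise_transform(X, relevance, groups)
ranker = LinearSVC().fit(Xp, yp)
scores = ranker.decision_function(X)        # higher score = ranked earlier
print(np.argsort(-scores))
```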
Furthermore, when semantic understanding is performed on text data per semantic scene, a semantic understanding result ordering model can be generated for each semantic scene during training; the training data used to generate the model of a given semantic scene are the user text data and system feedback text data of that scene.
S44: Obtain a second semantic understanding result according to the secondary semantic understanding results.
In some embodiments, the second semantic understanding result may be derived directly from the secondary semantic understanding results.
For example, the top-ranked N (TopN) secondary semantic understanding results are selected as the second semantic understanding result. N is determined according to application requirements, e.g. the top 10 results. The order of the secondary semantic understanding results can be determined from the probability values of the corresponding paths, for example from high probability to low.
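A trivial sketch of this TopN selection by path probability; the candidate results and probability values are placeholders.

```python
# Hypothetical (result, path probability) pairs; keep the 10 most probable.
candidates = [
    ({"service": "music", "version": "concert version"}, 0.62),
    ({"service": "music", "song": "Faith"}, 0.21),
    ({"service": "video", "name": "concert"}, 0.09),
]
top_n = [r for r, _ in sorted(candidates, key=lambda x: x[1], reverse=True)][:10]
print(top_n)
```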
In order to further improve accuracy, in some embodiments sorting correction may be performed on the secondary semantic understanding results, and the sorting-corrected secondary semantic understanding results are used as the final second semantic understanding result.
As shown in fig. 5, the flow of the method for sorting and correcting the secondary semantic understanding result includes:
S51: Acquire related data of the secondary semantic understanding results.
Under each semantic scene, a vertical search engine is used to search for data that is highly correlated with the secondary semantic understanding results.
During the search, each secondary semantic understanding result is input as the search string of the search engine; the engine converts it into internal search conditions, retrieves documents related to that result, and returns the ranked related documents. For each secondary semantic understanding result, the TopN related documents can be selected as its related data. The specific process is the same as the prior art and is not described in detail here.
S52: Extract relevance features according to the current segmented text data, the secondary semantic understanding results and the related data.
Relevance features are extracted for each semantic scene according to the secondary semantic understanding results and their related data.
The relevance features describe the relevance between each secondary semantic understanding result and the current segmented text data, and the accuracy of the ordering of the secondary semantic understanding results.
The relevance feature includes at least one of:
The number of words in the secondary semantic understanding result that are identical to words in the current segmented text data, the number of key semantic slots contained in the secondary semantic understanding result, the heat of key semantic slot values in the secondary semantic understanding result that are identical to words in the current segmented text data, the number of semantic-scene-related keywords contained in the current segmented text data, and the coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the related data.
The specific extraction method is as follows:
(1) The number of words in the secondary semantic understanding result that are identical to words in the current segmented text data
This feature counts how many words of the current segmented text data are covered by the secondary semantic understanding result: the semantic slot values contained in the secondary semantic understanding result are segmented into words and matched in turn against the words of the current segmented text data, and the number of slot-value words that are identical to words of the current segmented text data is taken as the feature value.
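A minimal sketch of this count, assuming the texts are segmented with a simple whitespace split standing in for a real Chinese word segmenter.

```python
# Count words shared between the slot values of a secondary result and the
# current segmented text data (both assumed pre-tokenizable here).
def same_word_count(result_slot_values, current_words):
    result_words = set()
    for value in result_slot_values:
        result_words.update(value.split())   # stand-in for a real word segmenter
    return len(result_words & set(current_words))

print(same_word_count(["concert version"], ["concert", "version"]))  # -> 2
```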
(2) The number of key semantic slots contained in the secondary semantic understanding result
The number of key semantic slots of the semantic scene to which each secondary semantic understanding result belongs is counted. The more key semantic slots a result contains, the higher its correlation with its semantic scene, so within each semantic scene the secondary semantic understanding results with higher correlation to that scene are ranked earlier; across different semantic scenes, the more scene key semantic slots a secondary semantic understanding result contains, the earlier it is ordered.
(3) The heat of key semantic slot values in the secondary semantic understanding result that are identical to words in the current segmented text data
The key semantic slots in each secondary semantic understanding result are located, and each key semantic slot value is matched in turn against the words in the current segmented text data; if a value matches, the heat of that key semantic slot value is calculated. The heat can be computed from the related data retrieved for the secondary semantic understanding result in its semantic scene, for example the data contained in the vertical pages of the search results (in a music scene, vertical pages such as Kugou Music, Kuwo Music and Tencent Music). The heat of the current key semantic slot value is obtained by jointly analysing information such as the user click volume, play volume and search volume of the slot value on the vertical pages and the distribution of the slot value across multiple vertical pages; the specific analysis and calculation method is the same as the prior art and is not described in detail here.
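One possible shape of such a heat score is sketched below; the statistics fields and the weights are purely illustrative assumptions, since the text leaves the exact aggregation to existing methods.

```python
# Hypothetical weighted aggregation of usage statistics into a heat score.
def slot_value_heat(stats, weights=(0.4, 0.4, 0.2)):
    w_click, w_play, w_search = weights
    return (w_click * stats.get("clicks", 0)
            + w_play * stats.get("plays", 0)
            + w_search * stats.get("searches", 0))

print(slot_value_heat({"clicks": 1200, "plays": 3400, "searches": 500}))
```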
(4) The number of semantic-scene-related keywords contained in the current segmented text data
The keywords of each semantic scene can be obtained in advance by collecting a large amount of text data under that scene, segmenting it into words, counting the frequency of each word, and taking the words whose frequency exceeds a threshold as the scene's keywords. Keywords in a music scene include "listen", "song", "play", etc. The more semantic scene keywords the current segmented text data contains, the earlier the semantic understanding results of that scene are ordered.
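A minimal sketch of this keyword mining step, assuming the scene corpus is already segmented into word lists; the corpus and threshold are illustrative.

```python
from collections import Counter

# Keep words whose frequency in the scene corpus reaches the threshold.
def scene_keywords(segmented_texts, threshold=2):
    counts = Counter(w for text in segmented_texts for w in text)
    return {w for w, c in counts.items() if c >= threshold}

music_corpus = [["play", "a", "song"], ["listen", "to", "a", "song"], ["play", "music"]]
print(scene_keywords(music_corpus))   # e.g. {'play', 'a', 'song'}
```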
(5) Coincidence degree of semantic slot set contained in initial semantic understanding result and semantic slot set contained in related data
The semantic slots contained in the related data are obtained from the data retrieved for the secondary semantic understanding results: the retrieved related data is structured to obtain the semantic slot set it contains (the specific structuring method is the same as the prior art and is not described in detail here). The semantic slots in the two extracted sets are then compared, the number of slots with identical names is determined, and this number is divided by the number of elements in the semantic slot set contained in the initial semantic understanding result to obtain the coincidence degree of the two sets.
S53: Perform sorting correction on the secondary semantic understanding results according to the relevance features and a pre-constructed sorting model.
The ranking model is generated by training on training data during the training phase. For example, a large amount of text data is collected; semantic understanding is performed on it to obtain the secondary semantic understanding results under each semantic scene; relevance features are extracted; the ordering of the secondary semantic understanding results under each scene is manually annotated; and the model is trained on the relevance features and the manually annotated ordering. The input of the ranking model is the relevance features, and its output is the order-corrected secondary semantic understanding results under each semantic scene.
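To illustrate how a trained model of this kind is applied, the sketch below re-orders candidates by a model score; the toy linear scorer stands in for the trained ranking model and is an assumption of this example.

```python
import numpy as np

def rerank(results, feature_rows, score_fn):
    """Re-order candidate results by a model score (highest first)."""
    scores = np.asarray([score_fn(row) for row in feature_rows])
    order = np.argsort(-scores)
    return [results[i] for i in order]

# Toy linear scorer standing in for the trained ranking model.
weights = np.array([0.5, 1.0, -0.2])
score_fn = lambda row: float(np.dot(weights, row))

results = ["candidate A", "candidate B"]
rows = [[1, 0, 1], [0, 1, 0]]
print(rerank(results, rows, score_fn))   # -> ['candidate B', 'candidate A']
```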
Therefore, after the correlation characteristics are extracted, a secondary semantic understanding result after sorting correction can be obtained according to the sorting model.
After the secondary semantic understanding result after the sorting correction is obtained, topN may be selected as a final second semantic understanding result, where N may be determined according to an application requirement, for example, N takes a value of 10.
In this embodiment, performing sorting correction on the secondary semantic understanding results further improves the accuracy of the semantic understanding results and thus of the feedback results, improving the user experience.
FIG. 6 is a schematic structural diagram of a multi-round interactive parallel semantic understanding device according to one embodiment of the present application.
As shown in fig. 6, the apparatus 60 of the present embodiment includes: a receiving module 61, a speech recognition module 62, a semantic understanding module 63 and a selection module 64.
A receiving module 61 for receiving voice data;
the voice recognition module 62 is configured to segment the voice data to obtain a voice data segment, and perform voice recognition on the voice data segment to obtain current segmented text data;
The semantic understanding module 63 is configured to perform independent semantic understanding on the current segmented text data to obtain a first semantic understanding result, and perform semantic understanding on the current segmented text data according to the history data of the user interaction to obtain a second semantic understanding result;
A selection module 64, configured to select the first semantic understanding result or the second semantic understanding result as a final semantic understanding result.
In some embodiments, referring to fig. 7, the selection module 64 includes:
a selection feature extraction sub-module 641 for extracting selection features according to the current cut text data, the initial semantic understanding result of the current cut text data, and the history data of user interaction;
A selecting sub-module 642, configured to select the first semantic understanding result or the second semantic understanding result as a final semantic understanding result according to the selection feature and a pre-constructed semantic result selection model.
In some embodiments, the historical data includes at least one of:
feedback text data of the previous interaction, historical semantic understanding results of the previous interaction and historical segmentation text data of the previous interaction.
In some embodiments, the selection feature comprises at least one of:
The semantic matching degree before and after the referent in the feedback text data of the previous round of interaction is replaced by the current segmented text data; the semantic integrity of the current segmented text data; the coincidence degree of the semantic slot set contained in the initial semantic understanding result of the current segmented text data and the semantic slot set contained in the historical semantic understanding result of the previous round of interaction; the matching degree of the semantic slot values contained in the initial semantic understanding result of the current segmented text data and the semantic slot values contained in the historical semantic understanding result of the previous round of interaction; the association degree of the semantic slots contained in the initial semantic understanding result of the current segmented text data and the semantic slots contained in the historical semantic understanding result of the previous round of interaction; and the semantic integrity after the current segmented text data is concatenated with the historical segmented text data of the previous round of interaction.
In some embodiments, referring to fig. 7, the semantic understanding module 63 comprises:
The independent semantic understanding sub-module 631 is configured to: perform independent initial semantic understanding on the current segmented text data to obtain initial semantic understanding results, and directly select a preset number of them in sequence as the first semantic understanding result; or perform independent initial semantic understanding on the current segmented text data to obtain initial semantic understanding results, perform sorting correction on them, and select a preset number of the sorting-corrected initial semantic understanding results in sequence as the first semantic understanding result.
In some embodiments, the independent semantic understanding submodule is configured to perform sorting correction on an initial semantic understanding result, and includes:
Acquiring related data of an initial semantic understanding result;
Extracting relevance characteristics according to the current segmentation text data, an initial semantic understanding result and the relevant data;
and according to the relevance characteristics and a pre-constructed sorting model, sorting and correcting the initial semantic understanding result.
In some embodiments, the relevance features include at least one of:
The number of words in the initial semantic understanding result that are identical to words in the current segmented text data, the number of key semantic slots contained in the initial semantic understanding result, the heat of key semantic slot values in the initial semantic understanding result that are identical to words in the current segmented text data, and the number of semantic-scene-related keywords contained in the current segmented text data.
In some embodiments, referring to fig. 7, the semantic understanding module 63 further comprises:
an initial semantic understanding sub-module 632, configured to perform initial semantic understanding on the segmented text data, to obtain an initial semantic understanding result;
an extraction sub-module 633, configured to extract semantic features of the cut text data according to the initial semantic understanding result and the history data of the user interaction;
The secondary semantic understanding sub-module 634 is configured to perform secondary semantic understanding on the segmented text data according to the semantic features and a pre-constructed semantic understanding result ordering model, so as to obtain a secondary semantic understanding result;
and an obtaining sub-module 635, configured to obtain a second semantic understanding result according to the secondary semantic understanding results.
In some embodiments, the obtaining submodule is specifically configured to:
directly selecting a preset number of secondary semantic understanding results in sequence to serve as second semantic understanding results; or alternatively
And carrying out sorting correction on the secondary semantic understanding results, and sequentially selecting a preset number of sorting corrected secondary semantic understanding results to serve as second semantic understanding results.
In some embodiments, the obtaining submodule is configured to perform sorting correction on the secondary semantic understanding result, and includes:
acquiring related data of a secondary semantic understanding result;
Extracting relevance characteristics according to the current segmentation text data, the secondary semantic understanding result and the relevant data;
And carrying out sorting correction on the secondary semantic understanding result according to the correlation characteristics and a pre-established sorting model.
In some embodiments, the historical data includes at least one of:
historical semantic understanding results, and semantic understanding results of the feedback text data of the previous interaction.
In some embodiments, the semantic features include at least one of:
The initial semantic understanding result features, the historical semantic understanding result features, the coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the historical semantic understanding result, the matching degree of the key semantic slot values contained in the initial semantic understanding result and the key semantic slot values contained in the historical semantic understanding result, and the coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the semantic understanding result of the feedback text data of the previous round of interaction.
In some embodiments, the relevance features include at least one of:
The number of words in the secondary semantic understanding result that are identical to words in the current segmented text data, the number of key semantic slots contained in the secondary semantic understanding result, the heat of key semantic slot values in the secondary semantic understanding result that are identical to words in the current segmented text data, the number of semantic-scene-related keywords contained in the current segmented text data, and the coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the related data.
It will be appreciated that the apparatus of this embodiment corresponds to the method embodiment described above, and specific details may be found in the related description of the method embodiment, which is not described in detail herein.
In this embodiment, by performing semantic understanding on the current cut text data according to the history data, more information can be combined during semantic understanding, so that a more accurate semantic understanding result can be obtained.
It is to be understood that the same or similar parts in the above embodiments may be referred to each other, and that in some embodiments, the same or similar parts in other embodiments may be referred to.
It should be noted that in the description of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, the meaning of "plurality" means at least two.
Any process or method description in the flowcharts, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. Further implementations are included within the scope of the preferred embodiments of the present application in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGA), field programmable gate arrays (FPGA), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (23)

1. A multi-round interactive parallel semantic understanding method, comprising:
Receiving voice data;
Segmenting the voice data to obtain voice data fragments, and performing voice recognition on the voice data fragments to obtain current segmented text data;
Independent semantic understanding is carried out on the current segmentation text data to obtain a first semantic understanding result, and semantic understanding is carried out on the current segmentation text data according to historical data interacted by a user to obtain a second semantic understanding result;
selecting the first semantic understanding result or the second semantic understanding result as a final semantic understanding result; the selecting the first semantic understanding result or the second semantic understanding result as a final semantic understanding result includes:
extracting selection features according to the current segmentation text data, the initial semantic understanding result of the current segmentation text data and the historical data of user interaction;
selecting the first semantic understanding result or the second semantic understanding result as a final semantic understanding result according to the selection feature and a pre-constructed semantic result selection model;
the history data includes at least one of:
feedback text data of the previous interaction, historical semantic understanding results of the previous interaction and historical segmentation text data of the previous interaction.
2. The method of claim 1, wherein the selection feature comprises at least one of:
The semantic matching degree before and after the referent in the feedback text data of the previous round of interaction is replaced by the current segmented text data; the semantic integrity of the current segmented text data; the coincidence degree of the semantic slot set contained in the initial semantic understanding result of the current segmented text data and the semantic slot set contained in the historical semantic understanding result of the previous round of interaction; the matching degree of the semantic slot values contained in the initial semantic understanding result of the current segmented text data and the semantic slot values contained in the historical semantic understanding result of the previous round of interaction; the association degree of the semantic slots contained in the initial semantic understanding result of the current segmented text data and the semantic slots contained in the historical semantic understanding result of the previous round of interaction; and the semantic integrity after the current segmented text data is concatenated with the historical segmented text data of the previous round of interaction.
3. The method of claim 1, wherein said performing independent semantic understanding on the current cut text data to obtain a first semantic understanding result comprises:
independent initial semantic understanding is carried out on the current segmentation text data, and an initial semantic understanding result is obtained;
and directly selecting a preset number of initial semantic understanding results in sequence to serve as a first semantic understanding result.
4. The method of claim 1, wherein said performing independent semantic understanding on the current cut text data to obtain a first semantic understanding result comprises:
independent initial semantic understanding is carried out on the current segmentation text data, and an initial semantic understanding result is obtained;
sorting and correcting the initial semantic understanding result;
and sequentially selecting a preset number of initial semantic understanding results after sorting and correcting as a first semantic understanding result.
5. The method of claim 4, wherein ranking the initial semantic understanding results comprises:
Acquiring related data of an initial semantic understanding result;
Extracting relevance characteristics according to the current segmentation text data, an initial semantic understanding result and the relevant data;
and according to the relevance characteristics and a pre-constructed sorting model, sorting and correcting the initial semantic understanding result.
6. The method of claim 5, wherein the relevance features include at least one of:
The number of words in the initial semantic understanding result that are identical to words in the current segmented text data, the number of key semantic slots contained in the initial semantic understanding result, the heat of key semantic slot values in the initial semantic understanding result that are identical to words in the current segmented text data, and the number of semantic-scene-related keywords contained in the current segmented text data.
7. The method according to claim 1, wherein the performing semantic understanding on the current cut text data according to the history data of the user interaction to obtain a second semantic understanding result includes:
independent semantic understanding is carried out on the current segmentation text data, and an initial semantic understanding result is obtained;
extracting semantic features of the current segmentation text data according to the initial semantic understanding result and the historical data of user interaction;
performing secondary semantic understanding on the current segmentation text data according to the semantic features and a pre-constructed semantic understanding result ordering model to obtain a secondary semantic understanding result;
and obtaining a second semantic understanding result according to the secondary semantic understanding results.
8. The method of claim 7, wherein the obtaining a second semantic understanding result according to the secondary semantic understanding results comprises:
directly selecting a preset number of secondary semantic understanding results in sequence to serve as second semantic understanding results; or alternatively
And carrying out sorting correction on the secondary semantic understanding results, and sequentially selecting a preset number of sorting corrected secondary semantic understanding results to serve as second semantic understanding results.
9. The method of claim 8, wherein ranking the secondary semantic understanding results comprises:
acquiring related data of a secondary semantic understanding result;
Extracting relevance characteristics according to the current segmentation text data, the secondary semantic understanding result and the relevant data;
And carrying out sorting correction on the secondary semantic understanding result according to the correlation characteristics and a pre-established sorting model.
10. The method of claim 7, wherein the historical data comprises at least one of:
historical semantic understanding results, and semantic understanding results of the feedback text data of the previous interaction.
11. The method of claim 10, wherein the semantic features comprise at least one of:
The initial semantic understanding result features, the historical semantic understanding result features, the coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the historical semantic understanding result, the matching degree of the key semantic slot values contained in the initial semantic understanding result and the key semantic slot values contained in the historical semantic understanding result, and the coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the semantic understanding result of the feedback text data of the previous round of interaction.
12. The method of claim 9, wherein the relevance features include at least one of:
The number of words in the secondary semantic understanding result that are identical to words in the current segmented text data, the number of key semantic slots contained in the secondary semantic understanding result, the heat of key semantic slot values in the secondary semantic understanding result that are identical to words in the current segmented text data, the number of semantic-scene-related keywords contained in the current segmented text data, and the coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the related data.
13. A multi-round interactive parallel semantic understanding device, comprising:
The receiving module is used for receiving voice data;
The voice recognition module is used for segmenting the voice data to obtain voice data fragments, and carrying out voice recognition on the voice data fragments to obtain current segmented text data;
the semantic understanding module is used for carrying out independent semantic understanding on the current segmentation text data to obtain a first semantic understanding result, and carrying out semantic understanding on the current segmentation text data according to the historical data of user interaction to obtain a second semantic understanding result;
The selection module is used for selecting the first semantic understanding result or the second semantic understanding result as a final semantic understanding result; the selection module comprises:
The selection feature extraction sub-module is used for extracting selection features according to the current segmentation text data, the initial semantic understanding result of the current segmentation text data and the historical data of user interaction;
the selection sub-module is used for selecting the first semantic understanding result or the second semantic understanding result as a final semantic understanding result according to the selection characteristics and a pre-constructed semantic result selection model;
the history data includes at least one of:
feedback text data of the previous interaction, historical semantic understanding results of the previous interaction and historical segmentation text data of the previous interaction.
14. The apparatus of claim 13, wherein the selection feature comprises at least one of:
The semantic matching degree before and after the referent in the feedback text data of the previous round of interaction is replaced by the current segmented text data; the semantic integrity of the current segmented text data; the coincidence degree of the semantic slot set contained in the initial semantic understanding result of the current segmented text data and the semantic slot set contained in the historical semantic understanding result of the previous round of interaction; the matching degree of the semantic slot values contained in the initial semantic understanding result of the current segmented text data and the semantic slot values contained in the historical semantic understanding result of the previous round of interaction; the association degree of the semantic slots contained in the initial semantic understanding result of the current segmented text data and the semantic slots contained in the historical semantic understanding result of the previous round of interaction; and the semantic integrity after the current segmented text data is concatenated with the historical segmented text data of the previous round of interaction.
15. The apparatus of claim 13, wherein the semantic understanding module comprises: an independent semantic understanding sub-module for:
Independent initial semantic understanding is carried out on the current segmentation text data, and an initial semantic understanding result is obtained; directly selecting a preset number of initial semantic understanding results in sequence to serve as a first semantic understanding result;
Or alternatively
Independent initial semantic understanding is carried out on the current segmentation text data, and an initial semantic understanding result is obtained; sorting and correcting the initial semantic understanding result; and sequentially selecting the initial semantic understanding results after the sorting correction of the preset number as a first semantic understanding result.
16. The apparatus of claim 15, wherein the independent semantic understanding submodule is configured to perform sorting correction on an initial semantic understanding result, and includes:
Acquiring related data of an initial semantic understanding result;
Extracting relevance characteristics according to the current segmentation text data, an initial semantic understanding result and the relevant data;
and according to the relevance characteristics and a pre-constructed sorting model, sorting and correcting the initial semantic understanding result.
17. The apparatus of claim 16, wherein the relevance feature comprises at least one of:
The number of words in the initial semantic understanding result that are identical to words in the current segmented text data, the number of key semantic slots contained in the initial semantic understanding result, the heat of key semantic slot values in the initial semantic understanding result that are identical to words in the current segmented text data, and the number of semantic-scene-related keywords contained in the current segmented text data.
18. The apparatus of claim 13, wherein the semantic understanding module comprises:
the initial semantic understanding sub-module is used for carrying out initial semantic understanding on the segmented text data to obtain an initial semantic understanding result;
the extraction sub-module is used for extracting semantic features of the segmentation text data according to the initial semantic understanding result and the historical data of user interaction;
the secondary semantic understanding sub-module is used for carrying out secondary semantic understanding on the segmented text data according to the semantic features and a pre-constructed semantic understanding result ordering model to obtain a secondary semantic understanding result;
and the obtaining sub-module is used for obtaining a second semantic understanding result according to the secondary semantic understanding results.
19. The apparatus of claim 18, wherein the acquisition sub-module is specifically configured to:
directly selecting a preset number of secondary semantic understanding results in sequence to serve as second semantic understanding results; or alternatively
And carrying out sorting correction on the secondary semantic understanding results, and sequentially selecting a preset number of sorting corrected secondary semantic understanding results to serve as second semantic understanding results.
20. The apparatus of claim 19, wherein the obtaining sub-module is configured to perform sorting correction on the secondary semantic understanding results, and includes:
acquiring related data of a secondary semantic understanding result;
Extracting relevance characteristics according to the current segmentation text data, the secondary semantic understanding result and the relevant data;
And carrying out sorting correction on the secondary semantic understanding result according to the correlation characteristics and a pre-established sorting model.
21. The apparatus of claim 18, wherein the historical data comprises at least one of: historical semantic understanding results, and semantic understanding results of the feedback text data of the previous interaction.
22. The apparatus of claim 21, wherein the semantic features comprise at least one of:
The method comprises the steps of initial semantic understanding result characteristics, historical semantic understanding result characteristics, the coincidence degree of a semantic slot set contained in the initial semantic understanding result and a semantic slot set contained in the historical semantic understanding result, the matching degree of key semantic slot values contained in the initial semantic understanding result and key semantic slot values contained in the historical semantic understanding result, and the coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the semantic understanding result of the feedback text data of the previous round.
23. The apparatus of claim 20, wherein the relevance feature comprises at least one of:
The number of words in the secondary semantic understanding result that are identical to words in the current segmented text data, the number of key semantic slots contained in the secondary semantic understanding result, the heat of key semantic slot values in the secondary semantic understanding result that are identical to words in the current segmented text data, the number of semantic-scene-related keywords contained in the current segmented text data, and the coincidence degree of the semantic slot set contained in the initial semantic understanding result and the semantic slot set contained in the related data.
CN201610793380.8A 2016-08-31 2016-08-31 Multi-round interactive parallel semantic understanding method and device Active CN107799116B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610793380.8A CN107799116B (en) 2016-08-31 2016-08-31 Multi-round interactive parallel semantic understanding method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610793380.8A CN107799116B (en) 2016-08-31 2016-08-31 Multi-round interactive parallel semantic understanding method and device

Publications (2)

Publication Number Publication Date
CN107799116A CN107799116A (en) 2018-03-13
CN107799116B true CN107799116B (en) 2024-08-02

Family

ID=61529561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610793380.8A Active CN107799116B (en) 2016-08-31 2016-08-31 Multi-round interactive parallel semantic understanding method and device

Country Status (1)

Country Link
CN (1) CN107799116B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108959412B (en) * 2018-06-07 2021-09-14 出门问问信息科技有限公司 Method, device and equipment for generating labeled data and storage medium
CN108922531B (en) * 2018-07-26 2020-10-27 腾讯科技(北京)有限公司 Slot position identification method and device, electronic equipment and storage medium
CN111210824B (en) * 2018-11-21 2023-04-07 深圳绿米联创科技有限公司 Voice information processing method and device, electronic equipment and storage medium
CN111429895B (en) * 2018-12-21 2023-05-05 广东美的白色家电技术创新中心有限公司 Semantic understanding method and device for multi-round interaction and computer storage medium
CN109815310B (en) * 2018-12-26 2020-11-17 珠海格力电器股份有限公司 Voice processing method, device, storage medium and terminal
CN111611358A (en) * 2019-02-25 2020-09-01 北京嘀嘀无限科技发展有限公司 Information interaction method and device, electronic equipment and storage medium
CN109918479B (en) * 2019-02-28 2021-07-20 百度在线网络技术(北京)有限公司 Method and device for processing information
CN109918682B (en) * 2019-03-29 2023-06-02 科大讯飞股份有限公司 Text labeling method and device
CN110992947B (en) * 2019-11-12 2022-04-22 北京字节跳动网络技术有限公司 Voice-based interaction method, device, medium and electronic equipment
CN110910866B (en) * 2019-11-28 2022-07-29 出门问问创新科技有限公司 Conversation processing method, equipment and computer readable storage medium
CN112908319B (en) * 2019-12-04 2022-10-25 海信视像科技股份有限公司 Method and equipment for processing information interaction
CN111428512B (en) * 2020-03-27 2023-12-12 大众问问(北京)信息科技有限公司 Semantic recognition method, device and equipment
CN111583923B (en) * 2020-04-28 2023-11-14 北京小米松果电子有限公司 Information control method and device and storage medium
CN113705681A (en) * 2021-08-28 2021-11-26 北京工业大学 Lipstick number identification method based on machine learning
US20230367966A1 (en) * 2022-05-11 2023-11-16 Robert Bosch Gmbh Development platform for facilitating the optimization of natural-language-understanding systems

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104199810A (en) * 2014-08-29 2014-12-10 科大讯飞股份有限公司 Intelligent service method and system based on natural language interaction
CN105469789A (en) * 2014-08-15 2016-04-06 中兴通讯股份有限公司 Voice information processing method and voice information processing terminal
CN107785018A (en) * 2016-08-31 2018-03-09 科大讯飞股份有限公司 More wheel interaction semantics understanding methods and device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003108189A (en) * 2001-09-28 2003-04-11 Matsushita Electric Ind Co Ltd Voice recognition device and voice recognition method
JP4734155B2 (en) * 2006-03-24 2011-07-27 株式会社東芝 Speech recognition apparatus, speech recognition method, and speech recognition program
US9262397B2 (en) * 2010-10-08 2016-02-16 Microsoft Technology Licensing, Llc General purpose correction of grammatical and word usage errors
US9031839B2 (en) * 2010-12-01 2015-05-12 Cisco Technology, Inc. Conference transcription based on conference data
GB2501633A (en) * 2011-01-05 2013-10-30 Health Fidelity Inc A voice based system and method for data input
CN104166462B (en) * 2013-05-17 2017-07-21 北京搜狗科技发展有限公司 The input method and system of a kind of word
US10127224B2 (en) * 2013-08-30 2018-11-13 Intel Corporation Extensible context-aware natural language interactions for virtual personal assistants
CN104679472A (en) * 2015-02-13 2015-06-03 百度在线网络技术(北京)有限公司 Man-machine voice interactive method and device
CN105426357A (en) * 2015-11-06 2016-03-23 武汉卡比特信息有限公司 Fast voice selection method
CN105590626B (en) * 2015-12-29 2020-03-03 百度在线网络技术(北京)有限公司 Continuous voice man-machine interaction method and system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105469789A (en) * 2014-08-15 2016-04-06 中兴通讯股份有限公司 Voice information processing method and voice information processing terminal
CN104199810A (en) * 2014-08-29 2014-12-10 科大讯飞股份有限公司 Intelligent service method and system based on natural language interaction
CN107785018A (en) * 2016-08-31 2018-03-09 科大讯飞股份有限公司 More wheel interaction semantics understanding methods and device

Also Published As

Publication number Publication date
CN107799116A (en) 2018-03-13

Similar Documents

Publication Publication Date Title
CN107799116B (en) Multi-round interactive parallel semantic understanding method and device
CN107785018B (en) Multi-round interaction semantic understanding method and device
US9646606B2 (en) Speech recognition using domain knowledge
US10170114B2 (en) Systems and methods for adaptive proper name entity recognition and understanding
CN108304375B (en) Information identification method and equipment, storage medium and terminal thereof
CN106570180B (en) Voice search method and device based on artificial intelligence
Mairesse et al. Stochastic language generation in dialogue using factored language models
CN106528845B (en) Retrieval and error correction method and device based on artificial intelligence
US9454957B1 (en) Named entity resolution in spoken language processing
CN111324728A (en) Text event abstract generation method and device, electronic equipment and storage medium
CN107741928A (en) A kind of method to text error correction after speech recognition based on field identification
US20110054901A1 (en) Method and apparatus for aligning texts
Alberti et al. An audio indexing system for election video material
US8731930B2 (en) Contextual voice query dilation to improve spoken web searching
CN109545185B (en) Interactive system evaluation method, evaluation system, server, and computer-readable medium
CN110415679A (en) Speech error correction method, device, equipment and storage medium
CN111090771B (en) Song searching method, device and computer storage medium
CN106446109A (en) Acquiring method and device for audio file abstract
KR102851303B1 (en) Systems and methods for adaptive proper name entity recognition and understanding
TW201822190A (en) Speech recognition system and method thereof, vocabulary establishing method and computer program product
CN114550718A (en) Hot word speech recognition method, device, equipment and computer readable storage medium
Moyal et al. Phonetic search methods for large speech databases
Mdhaffar et al. Qualitative Evaluation of ASR Adaptation in a Lecture Context: Application to the PASTEL Corpus.
CN113903342B (en) Voice recognition error correction method and device
CN115545013A (en) Sound-like error correction method and device for conversation scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant