WO2018066489A1 - Non-factoid question answering system and method, and computer program therefor - Google Patents
Non-factoid question answering system and method, and computer program therefor
- Publication number
- WO2018066489A1 (PCT/JP2017/035765)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- answer
- question
- expression
- word
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
Definitions
- the present invention relates to a question answering system, and more particularly to an improvement of a question answering system for a non-factoid type question such as a reason, a method and a definition, not a question about a fact that can be answered with simple words.
- Non-Patent Document 1 recognizes a causal relation in an answer passage using a clue word such as "because" or a specific causal relation pattern such as "B occurs because of A", and uses the recognized causal relation as a clue for answer selection or answer ranking. Examples of such processing include classification of answer passages and ranking of answer passages according to their degree of correctness.
- the underlined part CE1 represents a causal relation, but no clue word is used. Since expressions such as CE1 contain no clue word, it is presumed that the conventional technique cannot recognize this causal relation and therefore cannot find an answer such as answer 1 to the above question.
- an object of the present invention is to provide a non-factoid question answering system, and a computer program therefor, that give an accurate answer to a non-factoid question by using answer patterns containing expressions of a semantic relation, such as a causal relation, that do not explicitly include a clue word.
- the non-factoid question answering system generates a response to a non-factoid question by paying attention to an expression representing the first semantic relationship appearing in a sentence.
- the non-factoid question answering system includes: first expression storage means for storing a plurality of expressions representing a first semantic relation; question/answer receiving means for receiving a question and a plurality of answer passages including answer candidates for the question; first expression extracting means for extracting, from each of the plurality of answer passages, a semantic relation expression representing the first semantic relation; related expression selecting means for selecting, for each combination of the question and one of the plurality of answer passages, the related expression, the expression most related to that combination, from among the plurality of expressions stored in the first expression storage means; and answer selection means, trained in advance by machine learning, that receives as input each combination of the question, an answer passage, the semantic relation expression for that answer passage, and the related expression for the combination of the question and that answer passage, and selects an answer to the question from among the plurality of answer passages.
- the non-factoid question answering system further includes first semantic correlation amount calculating means for calculating, for each combination of the question and one of the plurality of answer passages, a first semantic correlation amount between each word appearing in the question and each word appearing in the answer passage, based on the plurality of expressions stored in the first expression storage means.
- the answer selection means includes evaluation means, trained in advance by machine learning, that receives as input a combination of the question, an answer passage, the semantic relation expression for that answer passage, and the related expression for the combination of the question and that answer passage, and calculates and outputs an evaluation value representing the degree to which the answer passage answers the question, using the first semantic correlation amount of each input word as a weight; and selection means for selecting one of the plurality of answer passages as the answer to the question using the evaluation values output by the evaluation means for the respective answer passages.
- the non-factoid question answering system further includes first semantic relation expression extracting means for extracting expressions representing the first semantic relation from a document archive and storing them in the first expression storage means.
- the first semantic correlation amount calculating means includes: first semantic correlation amount storage means for calculating and storing, for each word pair contained in the plurality of expressions representing the first semantic relation stored in the first expression storage means, the first semantic correlation amount of that word pair; first matrix generating means for generating, for each combination of the question and one of the plurality of answer passages, a matrix in which the words of the question are arranged along one axis and the words of the answer passage along the other axis, and in which the cell at each position where one axis intersects the other holds the first semantic correlation amount, read from the first semantic correlation amount storage means, of the corresponding pair of a word in the question and a word in the answer passage; and second matrix generating means for generating two second matrices derived from that matrix.
- the non-factoid question answering system further includes means for weighting each word appearing in the question given to the answer selection means using the corresponding first semantic correlation amount of the first word-sentence matrix, and each word appearing in the answer passage using the corresponding first semantic correlation amount of the second word-sentence matrix.
- each of the first semantic correlation amounts stored in the two second matrices is normalized in a predetermined interval.
- the first semantic relationship is a causal relationship.
- each expression representing a causal relationship includes a cause part and a result part.
- the related expression selecting means includes: first word extracting means for extracting nouns, verbs and adjectives from the question; first expression selecting means for selecting, from the expressions stored in the first expression storage means, a predetermined number of expressions whose result part contains all the nouns extracted by the first word extracting means; second expression selecting means for selecting, from the expressions stored in the first expression storage means, a predetermined number of expressions that contain all the nouns extracted by the first word extracting means and at least one verb or adjective extracted by the first word extracting means; and related causal relation expression selecting means for selecting, for each of the plurality of answer passages, from among the expressions selected by the first and second expression selecting means, the expression judged most relevant to the answer passage according to a score calculated by weighting the words its result part shares with the answer passage.
- the non-factoid question answering system generates a response to a non-factoid question by paying attention both to expressions representing the first semantic relation and to expressions representing a second semantic relation appearing in sentences.
- the non-factoid question answering system further includes second expression storage means for storing a plurality of expressions representing the second semantic relation, and second semantic correlation amount calculating means for calculating, for each combination of the question and one of the plurality of answer passages, a second semantic correlation amount representing the correlation, in the plurality of expressions stored in the second expression storage means, between each word appearing in the question and each word appearing in the answer passage.
- the evaluation means includes a neural network, trained in advance by machine learning, that receives as input a combination of the question, an answer passage, the semantic relation expression extracted by the first expression extracting means for that answer passage, and the related expression for the question and that answer passage, and outputs an evaluation value.
- the second semantic relation is a general semantic relation that is not limited to any specific semantic relation.
- the second expression storage means stores randomly collected expressions.
- the computer program according to the second aspect of the present invention causes a computer to function as each unit of any of the above-described devices.
- the method for responding to a non-factoid question generates a response to the question by paying attention to expressions representing a predetermined first semantic relation appearing in sentences.
- the computer-implemented method of generating the response includes: a step in which the computer communicatively connects to a first storage device storing a plurality of expressions representing the first semantic relation; a step in which the computer receives, via an input device, a question and a plurality of answer passages including answer candidates for the question; a step in which the computer extracts an expression representing the first semantic relation from each of the plurality of answer passages; a step in which the computer selects, for each combination of the question and one of the plurality of answer passages, the expression most relevant to that combination from among the expressions stored in the first storage device; and a step in which the computer generates an answer to the question by giving these as input to answer selection means trained in advance by machine learning to select an answer to the question from among the plurality of answer passages.
- the method further includes a step in which the computer calculates, for each combination of the question and one of the plurality of answer passages, the first semantic correlation amount between each word appearing in the question and each word appearing in the answer passage, based on the plurality of expressions stored in the first expression storage means.
- the generating step includes a step of giving, as input to evaluation means trained in advance by machine learning to calculate and output an evaluation value representing the degree to which an answer passage answers the question, each combination of the question, an answer passage, the expression extracted from that answer passage in the extracting step, and the expression selected for the question and that answer passage in the selecting step.
- the evaluation means uses the first semantic correlation amount of each input word as a weight in calculating the evaluation value, and the method further includes a step in which the computer selects one of the plurality of answer passages as the answer to the question using the evaluation values output by the evaluation means for the respective answer passages.
- a non-factoid question answering system includes question/answer receiving means for receiving a question sentence and a plurality of answer passages for the question sentence; causal relation expression extracting means for extracting an answer causal relation expression from each of the plurality of answer passages; and archive causal relation expression storage means for storing a plurality of archive causal relation expressions extracted from a document archive consisting of a large number of documents.
- Each of the answer causal relationship expression and the archive causal relationship expression includes a cause part and a result part.
- the non-factoid question answering system further includes ranking means for ranking the plurality of archive causal relation expressions stored in the archive causal relation expression storage means based on their degree of relevance to the question and each of the answer passages, and selecting the higher-ranked archive causal relation expressions; and classifying means, trained in advance by machine learning, that receives the question, the plurality of answer passages, the plurality of answer causal relation expressions, and the archive causal relation expressions selected by the ranking means, and selects one of the plurality of answer passages as the answer to the question.
- the non-factoid question answering system further includes correlation amount storage means for storing a correlation amount, a measure representing the correlation within a word pair, for the word pairs formed from the words used in the question and in each answer passage.
- the weighting means reads, for each combination of a word extracted from the question and a word extracted from the answer passage, the correlation amount from the correlation amount storage means, and attaches to each word of the question and the answer passage given to the classifying means a weight according to that correlation amount.
- the weight assigning means includes: first matrix generating means for generating, for each combination of a word extracted from the question and a word extracted from the answer passage, a first matrix, a word-word correlation amount matrix in which the words extracted from the question are arranged along one axis and the words extracted from the answer passage along the other axis, and in which the cell at each position where one axis intersects the other holds the correlation amount of the corresponding word pair; and second matrix generating means for generating two second matrices: a first word-sentence matrix storing, for each word arranged along one axis of the correlation amount matrix, the maximum correlation amount arranged along the other axis, and a second word-sentence matrix storing, for each word arranged along the other axis, the maximum correlation amount arranged along the first axis.
- the correlation amounts stored in the first matrix and the two second matrices are normalized between 0 and 1.
- the ranking means may include: first word extracting means for extracting nouns, verbs and adjectives from the question; first archive causal relation expression selecting means for selecting from the archive causal relation expressions a predetermined number of expressions containing all the nouns extracted by the first word extracting means; second archive causal relation expression selecting means for selecting a predetermined number of expressions containing all the extracted nouns and at least one extracted verb or adjective; and related causal relation expression selecting means for selecting, for each answer passage, from among the archive causal relation expressions selected by the first and second archive causal relation expression selecting means, the expression judged most relevant to the answer passage according to a score calculated by weighting the words its result part shares with the answer passage.
- FIG. 1 is a block diagram showing a schematic configuration of a non-factoid question answering system according to a first embodiment of the present invention. A block diagram shows the schematic configuration of the question-related archive causal relation expression selection unit shown in FIG. 1. A schematic drawing shows the structure of the word-to-word mutual information matrix. A drawing shows the schematic configuration of the multi-column convolutional neural network used in the first embodiment of the present invention. A schematic diagram illustrates the structure within a convolutional neural network. A schematic diagram illustrates the learning process of the non-factoid question answering system according to the first embodiment of the present invention.
- FIG. 13 is a schematic diagram illustrating a schematic configuration within the convolutional neural network illustrated in FIG. 12. A flowchart shows the control structure of the program for implementation.
- causal relation expressions related to both the input question and the answer passages, including explicit clue words and the like, are selected from a large text archive.
- An answer passage is a fragment of a document extracted from an existing document as having the possibility of being an answer to a question.
- the selected causal relation expressions are given to the convolutional neural network together with the question and the answer passages; each answer passage is given a score representing the probability that it is correct as an answer to the question, and the passage that seems most likely to be correct is selected as the answer to the question.
- the causal relation expression extracted from the text archive is called an archive causal relation expression
- the causal relation expression extracted from the answer passage is called an answer internal causal relation expression.
- archive causal relation expressions deeply related to both the question and the answer passage are extracted and used, and these causal relation expressions are referred to as related causal relation expressions.
- the concept of complementing the implicit causal relation expression using the archive causal relation expression is also adopted.
- the answer passage listed above and the causal relation expression CE2, which includes explicit clue words, have common words ("sea" and "water"); we considered that such common words should be used as clues for discovering good answers even when an implicit causal relation expression cannot be identified.
- when an explicit clue word is included as a paraphrase, the corresponding archive causal relation expression can be estimated, and as a result the accuracy of question answering can be improved.
- causality attention (hereinafter "CA").
- CA words (Causality-Attention words).
- the classifier selects an answer by paying attention to such a CA word when finding a cause or reason for a given question.
- a multi-column neural network (MCNN) containing a plurality of convolutional neural networks is used as the classifier, as described later. This MCNN is called CA-MCNN in the sense that it focuses on CA words.
- a non-factoid question answering system 30 includes a question accepting unit 50 that receives a question 32, and an existing question answering system 34 that accepts a question accepted by the question accepting unit 50.
- a response receiving unit 52 that receives from the question answering system 34, in some form, a predetermined number of answer passages for the question 32; a web archive storage unit 56 that stores a web archive of a large number of documents; and a causal relationship attention processing unit 40 that calculates a causal relationship matrix, described later, using the web archive stored in the web archive storage unit 56, the question 130 received by the question receiving unit 50, and the answer passages received by the response receiving unit 52 from the question answering system 34.
- the causal relationship attention processing unit 40 includes: a causal relationship expression extracting unit 58 that extracts causal relation expressions from the web archive storage unit 56 using clue words and existing techniques; an archive causal relationship expression storage unit 60 that stores the causal relation expressions (archive causal relation expressions) extracted by the causal relationship expression extracting unit 58; a mutual information amount calculation unit 62 that extracts the words contained in the archive causal relation expressions stored in the archive causal relationship expression storage unit 60 and calculates the mutual information amount, normalized to [-1,1], one measure representing the correlation between words; a mutual information matrix storage unit 64 that stores the mutual information matrix, in which the words are arranged on both axes and the mutual information amount of each word pair is placed at the intersection of the word on one axis with the word on the other axis; and a causal relationship attention matrix generation unit 90 that, using the mutual information matrix received from the mutual information matrix storage unit 64, the question 130 received by the question receiving unit 50, and the answer passages, generates the matrices used when calculating a score, the evaluation value of each answer passage for the question 130.
- the configuration of the causal relationship attention matrix generation unit 90 will be described later.
- the mutual information amount obtained from the causal relation expressions, a measure representing the correlation between words, is used as the causality attention, but other measures representing the correlation can also be used.
- measures representing the correlation, such as the co-occurrence frequency between words in the set of causal relation expressions, the Dice coefficient, the Jaccard coefficient, and the like, may be used.
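As an illustration of these alternatives, the Dice and Jaccard coefficients can be computed from simple co-occurrence counts over a set of causal relation expressions. The following is a minimal sketch under assumptions of our own: the tiny archive and its word sets are hypothetical, and a real system would count over a large expression collection.

```python
# Sketch: alternative word-correlation measures over causal relation pairs.
# The tiny "archive" below is hypothetical; each entry is a
# (cause-part words, result-part words) pair.

def dice(x, y, pairs):
    """Dice coefficient between word x (in cause parts) and y (in result parts)."""
    cx = sum(1 for c, r in pairs if x in c)              # expressions with x in cause
    cy = sum(1 for c, r in pairs if y in r)              # expressions with y in result
    cxy = sum(1 for c, r in pairs if x in c and y in r)  # co-occurrences
    return 2 * cxy / (cx + cy) if cx + cy else 0.0

def jaccard(x, y, pairs):
    """Jaccard coefficient: co-occurrences over the union of occurrences."""
    cx = sum(1 for c, r in pairs if x in c)
    cy = sum(1 for c, r in pairs if y in r)
    cxy = sum(1 for c, r in pairs if x in c and y in r)
    union = cx + cy - cxy
    return cxy / union if union else 0.0

archive = [
    ({"typhoon"}, {"flood"}),
    ({"typhoon", "rain"}, {"flood", "damage"}),
    ({"drought"}, {"famine"}),
]
```

With this toy archive, `dice("rain", "flood", archive)` is 2/3 and `jaccard("rain", "flood", archive)` is 0.5, while word pairs that never co-occur score 0.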
- the non-factoid question answering system 30 further includes the classification unit 54, which receives the answer passages received by the response receiving unit 52, the question 130 received by the question receiving unit 50, the archive causal relation expressions stored in the archive causal relationship expression storage unit 60, and the causal relationship attention, and calculates and outputs the score of each answer passage for the question 32; the answer candidate storage unit 66, which stores the answer passages as answer candidates for the question 32 in association with the scores output by the classification unit 54; and the answer candidate ranking unit 68, which sorts the answer candidates stored in the answer candidate storage unit 66 in descending order of score and outputs the candidate with the highest score as the answer 36.
- the classification unit 54 includes an answer passage storage unit 80 for storing the answer passages received by the response receiving unit 52, a causal relationship expression extraction unit 82 that extracts the causal relation expressions included in the answer passages stored in the answer passage storage unit 80, and an answer-internal causal relationship expression storage unit 84 for storing the expressions extracted from the answer passages by the causal relationship expression extraction unit 82.
- a causal relation expression extracted from an answer passage in this way is referred to as an answer-internal causal relation expression.
- the classification unit 54 also includes a related causal relationship expression extracting unit 86 that, using the archive causal relation expressions stored in the archive causal relationship storage unit 60, extracts for each combination of the question 130 received by the question receiving unit 50 and an answer passage stored in the answer passage storage unit 80 the most relevant archive causal relation expressions, and a related causal relationship expression storage unit 88 that stores the causal relation expressions extracted by the related causal relationship expression extracting unit 86.
- the archive causal relationship expression extracted by the related causal relationship expression extracting unit 86 is considered to be a paraphrase of the causal relationship expression in the answer passage.
- the classification unit 54 further includes a neural network 92, trained in advance, that receives the question 130 received by the question receiving unit 50, the answer-internal causal relation expressions stored in the answer-internal causal relationship expression storage unit 84, the related causal relation expressions stored in the related causal relationship storage unit 88, and the causal relationship attention matrix generated by the causal relationship attention matrix generation unit 90, and outputs a score indicating the probability that each answer passage stored in the answer passage storage unit 80 is correct as an answer to the question 130.
- the neural network 92 is a multi-column convolutional neural network as will be described later.
- the neural network 92 uses the causality attention generated by the causal relationship attention matrix generation unit 90 to pay particular attention, in calculating the score, to those words in the answer passages stored in the answer passage storage unit 80 that are considered related to the words included in the question 130.
- words that seem related to the words included in the question 130 appear to be selected based on the common-sense knowledge of causal relations that people share.
- evaluating the answer passage by focusing on the words in the answer passage based on the mutual information amount is called causal relationship attention as described above.
- a multi-column neural network 92 that scores an answer passage using such a causal relationship attention is referred to as CA-MCNN.
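How a network can "pay particular attention" to causally related words can be pictured as scaling word representations by their attention values before scoring. This is only an illustrative sketch, not the patent's actual CA-MCNN architecture: the `(1 + weight)` offset and the toy vectors are assumptions of our own.

```python
# Hedged sketch: weight each word's embedding by its causality-attention
# value before feeding the sequence to a scoring network. The offset and
# data below are hypothetical, not the patent's actual formulation.

def apply_causal_attention(embeddings, ca_weights):
    """Scale each word vector by (1 + its attention weight) so that words
    with high causal relevance contribute more; the 1+ offset (an assumed
    design choice) keeps zero-attention words from vanishing entirely."""
    return [[e * (1.0 + w) for e in vec]
            for vec, w in zip(embeddings, ca_weights)]

vecs = [[0.5, -1.0], [2.0, 0.0]]   # toy word embeddings
weights = [0.8, 0.0]               # e.g. values taken from a word-sentence matrix
scaled = apply_causal_attention(vecs, weights)
```

Here the first word, with attention 0.8, has its vector scaled by 1.8, while the zero-attention word passes through unchanged.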
- the related causal relationship expression extracting unit 86 includes a question-related archive causal relation expression selection unit 110 that extracts the content words from the question 130 received by the question receiving unit 50 and selects, from among the archive causal relation expressions stored in the archive causal relationship storage unit 60, those whose result part contains the words extracted from the question 130; a question-related causal relationship expression storage unit 112 that stores the archive causal relation expressions selected by the question-related archive causal relation expression selection unit 110; and a ranking unit 114 that ranks the question-related causal relation expressions stored in the question-related causal relationship expression storage unit 112 according to a predetermined formula indicating how many words they share with the answer passage, and selects and outputs the top-ranked question-related causal relation expressions as the related causal relation expressions for the question and the answer passage.
- the predetermined formula used by the ranking unit 114 for ranking is the weighted word appearance count wgt-wc(x, y) shown below.
- in addition to the weighted word appearance count wgt-wc(x, y), three evaluation values wc(x, y), ratio(x, y), and wgt-ratio(x, y) are defined. All of these are input to the neural network 92.
- MW(x, y) represents the set of content words that appear in expression x and also in expression y
- Word(x) represents the set of content words in expression x
- idf(x) represents the inverse document frequency, the reciprocal of the document frequency, of the word x.
- x represents the cause part of a question-related causal relation expression, and y represents an answer passage.
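The formulas themselves did not survive extraction here; the sketch below is a plausible reading of the four evaluation values based only on the definitions of MW, Word, and idf given above. The exact weighting in the patent may differ, so treat this as an assumption-laden illustration rather than the patented formulas.

```python
# Hedged sketch of the four ranking scores described above, assuming
# wc counts shared content words, wgt-wc weights them by idf, and the
# ratio variants normalize by the content words of the cause part x.

def wc(x_words, y_words):
    """Number of content words shared by x and y (|MW(x, y)|)."""
    return len(x_words & y_words)

def wgt_wc(x_words, y_words, idf):
    """idf-weighted count of shared content words."""
    return sum(idf.get(w, 0.0) for w in x_words & y_words)

def ratio(x_words, y_words):
    """Share of x's content words that also appear in y."""
    return wc(x_words, y_words) / len(x_words) if x_words else 0.0

def wgt_ratio(x_words, y_words, idf):
    """idf-weighted share of x's content words that also appear in y."""
    total = sum(idf.get(w, 0.0) for w in x_words)
    return wgt_wc(x_words, y_words, idf) / total if total else 0.0
```

Following the definitions above, `x_words` would be the content words of the cause part of a question-related causal relation expression and `y_words` those of an answer passage.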
- FIG. 2 shows a schematic configuration of the question related archive causal relation expression selecting unit 110 in the related causal relation expression extracting unit 86.
- the question-related archive causal relationship expression selection unit 110 includes: a noun extraction unit 150 that receives the question 130 from the question reception unit 50 and extracts the nouns included in the question 130; a verb/adjective extraction unit 152 that extracts the verbs and adjectives included in the question 130; a first search unit 154 that searches the archive causal relationship expression storage unit 60, extracts the archive causal relation expressions whose result part includes all the nouns extracted by the noun extraction unit 150, and stores them in the question-related causal relationship expression storage unit 112; and a second search unit 156 that searches the archive causal relationship expression storage unit 60, extracts the archive causal relation expressions whose result part includes all the nouns extracted by the noun extraction unit 150 and at least one of the verbs and adjectives extracted by the verb/adjective extraction unit 152, and stores them in the question-related causal relationship expression storage unit 112.
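A minimal sketch of the two search stages just described, assuming each archive causal relation expression is stored with a set of result-part words; this data layout and the toy archive are hypothetical, not the patent's actual storage format.

```python
# First search: all question nouns must appear in the result part.
# Second search: all question nouns plus at least one question verb/adjective.
# The archive entries below are hypothetical stand-ins for the storage unit 60.

def first_search(nouns, archive):
    """Expressions whose result part contains every extracted noun."""
    return [e for e in archive if nouns <= e["result_words"]]

def second_search(nouns, verbs_adjs, archive):
    """Expressions containing every noun and at least one verb/adjective."""
    return [e for e in archive
            if nouns <= e["result_words"] and verbs_adjs & e["result_words"]]

archive = [
    {"id": 1, "result_words": {"sea", "water", "rise"}},
    {"id": 2, "result_words": {"sea", "salt"}},
    {"id": 3, "result_words": {"river", "water"}},
]

hits1 = first_search({"sea"}, archive)             # expressions 1 and 2
hits2 = second_search({"sea"}, {"rise"}, archive)  # expression 1 only
```

The second search is strictly narrower than the first, which matches the two-stage retrieval described above: a broad noun filter followed by a stricter noun-plus-predicate filter.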
- based on the concept of causality attention, a CA word included in both the question and an answer passage is given a higher weight than other words when the neural network 92 scores the answer passage.
- a mutual information matrix is used for this purpose.
- the weight here indicates how strong a causal relation holds between a CA word included in the question and a CA word included in the answer passage; in the present embodiment, the mutual information amount between the words is used as this value.
- P (x, y) indicates the probability that the word x and the word y exist in the cause part and the result part of the same archive causal relationship expression, respectively.
- This probability is statistically obtained from all archive causal relationship expressions stored in the archive causal relationship expression storage unit 60 shown in FIG.
- P (x, *) and P (*, y) denote the probabilities that the word x and the word y appear in the cause part and the result part, respectively, in the entire archive causal expression.
- the strength of the causal relation between the words x and y is expressed by the normalized pointwise mutual information (npmi), normalized to the range [-1,1], as follows.
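The formula itself did not survive extraction in this copy. The standard normalized pointwise mutual information, which matches the description here (range [-1, 1], built from P(x, y), P(x, *) and P(*, y)), is sketched below; treat it as a reconstruction consistent with the surrounding text, not a quotation of the patent's equation.

```python
import math

def npmi(p_xy, p_x, p_y):
    """Normalized pointwise mutual information:
    npmi(x, y) = log(P(x,y) / (P(x,*) * P(*,y))) / (-log P(x,y)),
    which lies in the range [-1, 1]."""
    if p_xy == 0.0:
        return -1.0  # words that never co-occur get the minimum value
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / (-math.log(p_xy))
```

With these probabilities estimated over all archive causal relation expressions, npmi is 1 when x and y always co-occur, 0 when they co-occur exactly at chance level, and negative below chance; negative values are clipped to 0 later, as described for the matrix A.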
- the first is the word-word matrix A
- the second is the word-sentence matrix ~A, of which there are two types: the matrix ~A_q, which consists of, for each word in the question, the maximum mutual information amount with the words in the answer passage, and the matrix ~A_p, which consists of, for each word in the answer passage, the maximum mutual information amount with the words in the question.
- in Equation (3), when npmi(·) ≤ 0 it is better to substitute 0, so in this embodiment the constraint of Equation (3) is imposed on A[i, j].
- the causal relationship attention matrix generation unit 90 of the causal relationship attention processing unit 40 processes each combination of the question 130 from the question receiving unit 50 and an answer passage stored in the answer passage storage unit 80.
- it includes a first matrix calculation unit 122 that reads, for each crossing position of the two axes, the mutual information of the two corresponding words from the mutual information matrix storage unit 64, replaces negative values with 0, and arranges the results into a first mutual information matrix,
- and a second matrix calculation unit 124 that calculates two second mutual information matrices from the first matrix by the method described below. Since negative mutual information values are replaced with 0, the values in the first mutual information matrix are normalized to the range [0, 1].
- the words extracted from the question are arranged on the horizontal axis, and the words extracted from the answer passage being processed are arranged on the vertical axis.
- at each position where they intersect, the mutual information between the corresponding words is read from the mutual information matrix storage unit 64, with negative values replaced by 0.
- the second matrix actually comprises two matrices, namely the matrix ~A_q 180 and the matrix ~A_p 182.
- the matrix ~A_q 180 is obtained by taking the maximum value of the mutual information stored in each column of the mutual information matrix A 170, each column corresponding to a word included in the question.
- the matrix ~A_p 182 is obtained by taking the maximum value of the mutual information stored in each row of the mutual information matrix A 170, each row corresponding to a word included in the answer passage.
- mutual information values are normalized to the range [0,1].
- the feature amount of the causal attention of a word in the question (a "question word") is represented by the npmi value that is the highest among all possible combinations, in the matrix A, of that question word and the words in the answer passage ("answer words"). Similarly, the feature amount of the causal attention of an answer word is represented by the npmi value that is the highest among all possible combinations of that answer word and all the question words in the matrix A. In other words, the causal-attention feature of a word in the matrices ~A is the most important causal-attention feature extracted from the matrix A.
- a matrix ~A_q 180 is obtained by taking the maximum value of each column. For example, looking horizontally along the row 174 (corresponding to "earthquake"), the maximum value is the "0.65" of "tsunami". That is, the question word that has the strongest causal relationship with the answer word "earthquake" is "tsunami".
- the matrix ~A_p 182 is obtained by finding this maximum for each row. In practice, as can be seen from FIG. 3, the matrix ~A_q 180 is a horizontal vector of one row, and the matrix ~A_p 182 is a vertical vector of one column.
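The construction of the first matrix A and the two second matrices ~A_q and ~A_p described above can be sketched as follows (a pure-Python sketch; `npmi_lookup` stands in for the mutual information matrix storage unit 64, and the example values are hypothetical):

```python
def causal_attention_matrices(npmi_lookup, q_words, a_words):
    """Rows of A correspond to answer-passage words, columns to question
    words; negative npmi values are clipped to 0, per Equation (3)."""
    A = [[max(npmi_lookup(q, a), 0.0) for q in q_words] for a in a_words]
    # ~A_q: for each question word, the maximum npmi over all answer words.
    A_q = [max(row[j] for row in A) for j in range(len(q_words))]
    # ~A_p: for each answer word, the maximum npmi over all question words.
    A_p = [max(row) for row in A]
    return A, A_q, A_p

# Hypothetical npmi values, echoing the "tsunami"/"earthquake" example.
table = {("tsunami", "earthquake"): 0.65, ("tsunami", "wave"): 0.30,
         ("occur", "earthquake"): -0.10, ("occur", "wave"): 0.05}
A, A_q, A_p = causal_attention_matrices(
    lambda q, a: table.get((q, a), 0.0),
    q_words=["tsunami", "occur"], a_words=["earthquake", "wave"])
```

As in the text, ~A_q reduces to a single row (one value per question word) and ~A_p to a single column (one value per answer word).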
- the neural network 92 shown in FIG. 1 receives a question, an answer passage, an answer causal relation expression (answer CE), and a related causal relation expression (related CE), as will be described later.
- an input layer 200 that generates word vectors weighted by the causal relationship attention,
- a convolution/pooling layer 202 that receives the output of the input layer 200 and outputs feature vectors,
- and an output layer 204 that receives the output of the convolution/pooling layer 202 and outputs the probability that the input answer is a correct answer to the input question.
- This neural network 92 has four columns C1 to C4.
- the input layer 200 includes a first column C1 into which the question is input, a second column C2 into which the answer passage is input, a third column C3 into which the answer causal relationship expression (answer CE) is input, and a fourth column C4 into which the related causal relationship expression (related CE) is input.
- the first column C1 and the second column C2 each receive the word strings constituting the question and the answer passage, convert them into word vectors, and have the function 210 of weighting each word vector by the causal relationship attention described above.
- the third column C3 and the fourth column C4 do not have the weighting function 210 by the causal relationship attention, but have the function of converting the word strings included in the answer causal relationship expression and the related causal relationship expression, respectively, into word embedding vectors.
- the word string is represented by a word embedding vector sequence X of dimension d × t, where t represents the length of the word string.
- the vector sequence X is expressed by the following equation (9).
- here x_{i:i+j} is the concatenation of the embedding vectors of x_i, ..., x_{i+j}; for i < 1 or i > t, a zero vector is used as padding.
- a d × t-dimensional attention vector sequence X′ for the word sequence is calculated based on the CA words.
- the CA word directly or indirectly represents a causal relationship between a question and a possible answer, and is automatically extracted from the archive causal relationship expression.
- a weighted word embedding vector sequence ~X is obtained by adding the word embedding vector sequence X and the attention vector sequence X′ component by component.
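One plausible reading of this weighting step can be sketched as follows (the patent text does not spell out here how X′ is formed from the attention values, so the per-word scaling below is an assumption):

```python
def weighted_embeddings(X, attn):
    """X: word embedding vectors, one per word; attn: one attention value
    per word, taken from ~A_q (question) or ~A_p (answer passage).
    Assumed form: X'[i] = attn[i] * X[i], and ~X = X + X' component-wise."""
    X_prime = [[a * v for v in vec] for vec, a in zip(X, attn)]
    return [[u + w for u, w in zip(vec, vp)] for vec, vp in zip(X, X_prime)]

# A word with attention 0 keeps its original embedding;
# a word with attention 1 has its embedding doubled.
weighted = weighted_embeddings([[1.0, 2.0], [3.0, 4.0]], [0.0, 1.0])
```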
- the convolution/pooling layer 202 includes four convolutional neural networks provided corresponding to the four columns C1 to C4, and four pooling layers that receive the outputs of these convolutional neural networks and output the results of max pooling.
- a certain column 390 in the convolution / pooling layer 202 includes an input layer 400, a convolution layer 402, and a pooling layer 404.
- the convolution / pooling layer 202 is not limited to such a configuration, and may include a plurality of these three layers.
- the input layer 400 receives the word vector sequence x_1, ..., x_t, which is arranged into the matrix T = [x_1, x_2, ..., x_t].
- M feature maps f_1 to f_M are applied to the matrix T by the next convolution layer 402.
- each feature map is a vector; the vector constituting each feature map is calculated by applying the filter denoted by w to an n-gram 410 of consecutive word vectors while moving the n-gram 410 along the sequence.
- here n is an arbitrary natural number. If the output of the feature map f is O, the i-th element O_i of O is expressed by the following equation (10).
- n may be the same over all feature maps, or different values may be mixed; n is typically around 2, 3, 4, or 5.
- the filter weight matrix is the same in all convolutional neural networks. They could be made different from each other, but in fact making them equal gives higher accuracy than learning each weight matrix independently.
- for each feature map, the next pooling layer 404 performs so-called max pooling. That is, the pooling layer 404 selects, for example, the largest element 420 among the elements of the feature map f_M and extracts it as the element 430. By performing this for each feature map, the elements 430, ..., 432 are taken out, concatenated in the order f_1 to f_M, and output as a vector 440 to the output layer 204 shown in FIG. Each pooling layer outputs the vector 440 thus obtained to the output layer 204.
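The convolution over n-grams and the max pooling described above can be sketched as follows (the tanh activation and the bias term are assumptions, since Equation (10) itself is not reproduced in this text):

```python
import math

def feature_map(T, w, b, n):
    """Slide one filter w over consecutive n-grams of the word-vector
    sequence T (t vectors, each of dimension d); w has length n * d."""
    outputs = []
    for i in range(len(T) - n + 1):
        ngram = [v for vec in T[i:i + n] for v in vec]  # concatenated x_{i:i+n-1}
        outputs.append(math.tanh(sum(wi * xi for wi, xi in zip(w, ngram)) + b))
    return outputs

def max_pool(feature_map_output):
    """Max pooling: keep only the largest element of each feature map."""
    return max(feature_map_output)

# Bigram filter (n = 2) over three 2-dimensional word vectors.
T = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = max_pool(feature_map(T, w=[1.0, 1.0, 1.0, 1.0], b=0.0, n=2))
```

With M filters, the M pooled values are concatenated in the order f_1 to f_M to form the vector 440.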
- in the output layer 204, the similarity calculation unit 212 calculates the similarities between these feature vectors and supplies the calculated similarities to the Softmax layer 216.
- word matching 208 is performed between the word strings given to the four columns C1 to C4, and the common word number counting unit 214 calculates the four values shown in Expression (1), which are indices of the number of common words, and gives them to the Softmax layer 216.
- the Softmax layer 216 applies a linear softmax function to the input and outputs the probability that the answer passage is the correct answer to the question.
- the similarity between two feature vectors is calculated as follows.
- other vector similarities such as cosine similarity can be applied as similarities.
- in the present embodiment, the similarities of the feature vectors as described above are input to the output layer 204, but the information to be input is not limited to this.
- a feature vector itself may be used, or a combination of feature vectors and their similarities may be used.
- FIG. 7 is a flowchart showing a control structure of a computer program for realizing the non-factoid question answering system 30 by a computer. Since the description of the configuration of the computer program shown in FIG. 7 overlaps with the description of the operation of the non-factoid question answering system 30, it will be made together with the description of the operation.
- the operation of the non-factoid question answering system 30 includes a learning phase and an operation phase that outputs a response to an actual question.
- in the learning phase, the causal relationship expression extraction unit 58 extracts archive causal relationship expressions from the web archive storage unit 56, and the mutual information calculation unit 62 calculates the mutual information matrix and stores it in the mutual information matrix storage unit 64.
- learning is performed using learning data consisting of learning questions, answer passages for those questions, and manual labels indicating whether or not each answer is a correct answer to its question. As with an ordinary neural network, the neural network 92 is trained in advance by the error backpropagation method using this learning data, so as to reduce the error in judging whether a combination of a question and an answer passage is a correct combination.
- a process 460 for automatically recognizing causal relationship expressions from a large web text archive is performed, and a large number of archive causal relationship expressions 462 are collected. From these, word pairs having a strong causal relationship are selected based on co-occurrence frequency, in a process 464 that extracts causality-related words 466. From these related words 466, the information expressing the causal relationship attention 468 is obtained. With this causal relationship attention 468, words in a question and an answer passage that deserve particular attention with respect to causality are given a greater weight than others.
- the question 470, the answer passage 472, the causal relationship expression included in the answer passage, the causal relationship attention 468, and the paraphrase expression (related causal relationship expression) 476 corresponding to the answer passage are all given to the neural network 92.
- the probability that the answer passage 472 is a correct answer to the question 470 is calculated by the neural network 92. This probability is calculated for each answer passage, and the answer passage having the highest probability of being a correct answer is selected as an answer to the question 470.
- the causal relationship expression extraction unit 58 extracts the archive causal relationship expressions from the web archive and stores them in the archive causal relationship expression storage unit 60. Further, the mutual information calculation unit 62 calculates the mutual information between words from the causal relationship expressions stored in the archive causal relationship expression storage unit 60, and stores it in the mutual information matrix storage unit 64 as a mutual information matrix.
- the question receiving unit 50 gives this question to the response receiving unit 52.
- the response receiving unit 52 transmits this question to the question answering system 34 (step 480 in FIG. 7).
- the question receiving unit 50 also gives the question 32 as the question 130 to the related causal relationship expression extracting unit 86, the word extracting unit 120 of the causal relationship attention matrix generating unit 90, and the neural network 92.
- the response receiving unit 52 receives a predetermined number (for example, 20) of answer passages for the question 32 from the question answering system 34.
- the response receiving unit 52 stores these response passages in the response passage storage unit 80 of the classification unit 54 (step 482 in FIG. 7).
- the noun extraction unit 150 of the question-related archive causal relationship expression selection unit 110 receives the question 130 from the question receiving unit 50, extracts the nouns included in the question 130, and gives them to the first search unit 154 and the second search unit 156.
- the verb / adjective extraction unit 152 extracts the verbs and adjectives included in the question 130 and gives them to the second search unit 156 (step 484 in FIG. 7).
- the first search unit 154 searches the archive causal relationship expression storage unit 60, extracts the archive causal relationship expressions that include all the nouns extracted by the noun extraction unit 150 in their result part, and stores the extracted expressions in the question-related causal relationship expression storage unit 112 (step 486 in FIG. 7).
- the second search unit 156 searches the archive causal relationship expression storage unit 60, extracts the archive causal relationship expressions whose result part includes all of the nouns extracted by the noun extraction unit 150 and at least one of the verbs and adjectives extracted by the verb/adjective extraction unit 152, and stores them in the question-related causal relationship expression storage unit 112 (step 490 in FIG. 7).
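Steps 486 and 490 above can be sketched as follows (a simplified reading in which each archive causal relationship expression is a dict with a `result` word set; the field names are hypothetical):

```python
def select_question_related(expressions, q_nouns, q_verbs_adjs):
    """First search (step 486): the result part contains all question nouns.
    Second search (step 490): the result part contains all question nouns
    and at least one question verb or adjective."""
    first = [e for e in expressions
             if set(q_nouns) <= e["result"]]
    second = [e for e in expressions
              if set(q_nouns) <= e["result"]
              and set(q_verbs_adjs) & e["result"]]
    return first, second

exprs = [{"cause": {"earthquake"}, "result": {"tsunami", "occur"}},
         {"cause": {"rain"}, "result": {"flood"}}]
first, second = select_question_related(exprs, ["tsunami"], ["occur"])
```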
- the causal relationship expression extraction unit 82 extracts the in-answer causal relationship expression from the answer passage being processed using an existing causal relationship expression extraction algorithm, and stores it in the in-answer causal relationship expression storage unit 84 (step 500 in FIG. 7).
- the ranking unit 114 calculates the weighted word appearance count wgt-wc(x, y) for the answer passage being processed (step 502 in FIG. 7), and ranks the question-related causal relationship expressions stored in the question-related causal relationship expression storage unit 112 using this weighted word appearance count.
- the ranking unit 114 further selects and outputs the highest-ranked question-related causal relationship expression as the causal relationship expression related to the pair of the question and the answer passage being processed (step 504 in FIG. 7).
- the related causal relationship expression storage unit 88 stores the related causal relationship expression output by the related causal relationship expression extraction unit 86 for each answer passage.
- the word extraction unit 120 extracts all the words appearing in the question received by the question receiving unit 50 and in the answer passage being processed, and gives them to the first matrix calculation unit 122.
- the first matrix calculation unit 122 declares a two-dimensional array in order to generate a matrix with the words of the question sentence on the horizontal axis and the words of the answer passage being processed on the vertical axis (step 508 in FIG. 7).
- the first matrix calculation unit 122 further reads, for the cell at each intersection of these words, the mutual information between the corresponding words from the mutual information matrix storage unit 64, replaces negative values with 0, and arranges the results,
- thereby generating the mutual information matrix A 170 (first matrix 170) between the words (step 510 in FIG. 7).
- the second matrix calculation unit 124 calculates, from the first mutual information matrix calculated by the first matrix calculation unit 122 and by the method described above, the two second mutual information matrices ~A_q 180 (second matrix 180) and ~A_p 182 (second matrix 182) (step 512 in FIG. 7).
- in the input layer 200 of the neural network 92, the words given to each column are all converted into word embedding vectors.
- the word embedding vectors of the words constituting the question in the first column and the answer passage in the second column are multiplied by the weights obtained from the mutual information matrices ~A_q and ~A_p.
- in the output layer 204 of the neural network 92, first, the four similarity scores sv_1(n) to sv_4(n) of these feature vectors are calculated and output to the Softmax layer 216.
- the feature vector itself or a combination of the feature vector and the similarity may be input to the Softmax layer 216.
- word matching is performed on the word strings given to the first column C1 to the fourth column C4 as described above, and the four values shown in Expression (1), which are indices of the number of common words, are given to the output layer 204.
- based on the output from the output layer 204, the Softmax layer 216 outputs the probability that the input answer passage is a correct answer to the question. This value is stored together with each answer candidate in the answer candidate storage unit 66 shown in FIG. 1 (step 516 in FIG. 7).
- after all the above processes are completed for the answer candidates, the answer candidate ranking unit 68 sorts the answer candidates stored in the answer candidate storage unit 66 in descending order of score, and outputs the answer candidate with the highest score, or the N (N > 1) answer candidates with the highest scores, as the answer 36.
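The final ranking step can be sketched as:

```python
def rank_answer_candidates(candidates, n=1):
    """candidates: (score, passage) pairs, where the score is the
    probability output by the Softmax layer 216. Returns the top n
    passages in descending order of score."""
    ranked = sorted(candidates, key=lambda sc: sc[0], reverse=True)
    return [passage for _, passage in ranked[:n]]

top = rank_answer_candidates([(0.2, "A"), (0.9, "B"), (0.5, "C")], n=2)
```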
- three, four, or five consecutive numbers in {2, 3, 4, 5, 6} are used to construct filters of different window sizes, and the number of filters for each combination of window sizes is chosen from {25, 50, 75, 100}.
- for learning, the mini-batch stochastic gradient descent method is used, and the weights of the filters W and of the causal attention are randomly initialized in the range (-0.01, 0.01).
- Q indicates a set of questions in the test data
- Answer q is a set of correct answers to the question q ⁇ Q.
- Prec(k) is the precision at cutoff k in the top n answer passages, and
- rel(k) is an indicator that is 1 if the item at rank k is a correct answer in the set Answer_q and 0 otherwise.
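From these definitions, precision at a cutoff and mean average precision (MAP) can be computed as follows (standard formulations; the exact evaluation equation is not reproduced in this text):

```python
def precision_at_k(rel, k):
    """rel: list of 0/1 relevance flags for the ranked answer passages."""
    return sum(rel[:k]) / k

def average_precision(rel, n_correct):
    """Sum of Prec(k) * rel(k) over ranks, divided by |Answer_q|."""
    return sum(precision_at_k(rel, k) * r
               for k, r in enumerate(rel, start=1)) / n_correct

def mean_average_precision(per_question):
    """per_question: one (rel, n_correct) pair per test question q in Q."""
    return sum(average_precision(r, n) for r, n in per_question) / len(per_question)
```

P@1 as reported in FIG. 8 is simply `precision_at_k(rel, 1)` averaged over the questions.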
- FIG. 8 shows the results of the above embodiment and the results of the control example in a table format. Symbols of the control example shown in this figure are shown below.
- Non-Patent Document 1: a system based on SVM that uses word n-grams, word classes, and causal relationships within answers as features.
- Baseline: an MCNN system that inputs only the question, the answer passage, the in-answer causal relationship expression, and the related common word counts. Unlike the above embodiment, neither the causal relationship attention nor the related causal relationship expression is used.
- the system according to the embodiment of the present application showed stable, high performance compared with the prior art. More specifically, P@1 is improved by 4 to 6% by paraphrasing the causal relationship using the related causal relationship expression (reference numerals 520→524, 522→526 in FIG. 8). It can also be seen that P@1 is improved by 6% by using the causal relationship attention (reference numerals 520→522, 524→526).
- R (P @ 1) reaches 81.8% (54/66, reference numeral 526 and reference numeral 528) in the present invention.
- the quality of the top answers by OH13, OH16, and Proposed was examined. For each system, only the top answer to each question in the test data was selected, and all the top answers were ranked according to the score given by each system. Then the precision of the top answers at each rank was calculated. The results are shown in FIG.
- the x-axis indicates the cumulative ratio of top answers to all the top answers in the ranked list
- the y-axis indicates the matching ratio at each point on the x-axis.
- the non-factoid question answering system 30 can be realized by computer hardware and a computer program executed on the computer hardware.
- FIG. 10 shows the external appearance of this computer system 630
- FIG. 11 shows the internal configuration of the computer system 630.
- this computer system 630 includes a computer 640 having a memory port 652 and a DVD (Digital Versatile Disk) drive 650, a keyboard 646, a mouse 648, and a monitor 642.
- the computer 640 includes a CPU (Central Processing Unit) 656, a bus 666 connected to the CPU 656, the memory port 652, and the DVD drive 650, a read only memory (ROM) 658 connected to the bus 666 for storing a boot program and the like, a random access memory (RAM) 660 connected to the bus 666 for storing program instructions, system programs, work data, and the like, and a hard disk 654.
- Computer system 630 further includes a network interface (I / F) 644 that provides a connection to network 668 that allows communication with other terminals.
- a computer program for causing the computer system 630 to function as each functional unit of the non-factoid question answering system 30 according to the above-described embodiment is stored in a DVD 662 inserted in the DVD drive 650 or in a removable memory 664 attached to the memory port 652, and is further transferred to the hard disk 654. Alternatively, the program may be transmitted to the computer 640 through the network 668 and stored in the hard disk 654. The program is loaded into the RAM 660 when executed. The program may also be loaded into the RAM 660 directly from the DVD 662, from the removable memory 664, or via the network 668.
- This program includes an instruction sequence including a plurality of instructions for causing the computer 640 to function as each functional unit of the non-factoid question answering system 30 according to the above embodiment.
- some of the basic functions necessary to cause the computer 640 to perform this operation are provided by an operating system or third-party programs running on the computer 640, or by various dynamically linkable programming toolkits or program libraries installed on the computer 640. Therefore, this program itself does not necessarily include all the functions necessary for realizing the system, apparatus, and method of this embodiment.
- the program need only include instructions that realize the functions of the system, apparatus, or method described above by dynamically calling, in a controlled manner, the appropriate functions or the appropriate programs in a programming toolkit or program library at run time so as to obtain the desired result. Of course, all the necessary functions may be provided by the program alone.
- the reason or cause of 2) and the causal relationship of 3) are considered, but the conformity with respect to the topic of 1) is not explicitly considered.
- an attention regarding the suitability of the question to the topic is used, and this is used simultaneously with the causal relationship attention to obtain an answer to the question. That is, not only attention from a single point of view but attentions from different points of view are used to obtain the answer. Therefore, in the second embodiment below, for each word of the question and the answer candidate, the meaning of the word in context is evaluated from different viewpoints, and these evaluations are used as attentions (weights) at the time of input to the neural network.
- the meaning of a word in a general text context is used as the viewpoint regarding suitability with respect to a topic. That is, the semantic relationship between words in a general context is used, instead of a specific semantic relationship between words such as a causal relationship or a material relationship.
- the relevance of a topic can be determined based on whether or not words appearing in a question and words appearing in answer candidates are semantically similar. Such similar words often appear in similar contexts. Therefore, the similarity of the word embedding vector (referred to as “general word embedding vector”) learned from the general context is used as the relevance regarding the topic.
- FIG. 12 shows a block diagram of a non-factoid question answering system 730 according to the second embodiment.
- this non-factoid question answering system 730 differs from the non-factoid question answering system 30 shown in FIG. 1 in that it includes a similarity attention processing unit 740, and in that it includes a classification unit 754 having the function of calculating the scores of the answer candidates using the similarity attention generated by the similarity attention processing unit 740 simultaneously with the causal relationship attention.
- the classification unit 754 differs from the classification unit 54 only in that, instead of the neural network 92 of the classification unit 54, it includes a neural network 792 having the function of calculating the score of each answer passage by using the similarity attention and the causal relationship attention simultaneously.
- the similarity attention processing unit 740 includes a semantic vector calculation unit 758 that calculates a semantic vector for each word appearing in the text stored in the web archive storage unit 56.
- a general word embedded vector is used as the semantic vector.
- the similarity attention processing unit 740 further includes a similarity calculation unit 762 that calculates the similarity of the semantic vectors for every combination of two of these words, thereby obtaining the similarity of the two words, and a similarity matrix storage unit 764 that stores the similarities calculated by the similarity calculation unit 762 for each combination of two words as a matrix in which the words are arranged along the horizontal axis and the vertical axis.
- the matrix stored in the similarity matrix storage unit 764 arranges all the words appearing in the non-factoid question answering system 730 along the horizontal axis and the vertical axis, and stores, at the intersection of a word on the horizontal axis and a word on the vertical axis, the degree of similarity between those words.
- the similarity attention processing unit 740 further includes a similarity attention matrix generation unit 790 that generates a similarity attention matrix from the words appearing in the question 130 from the question receiving unit 50 and in the answer passages read from the answer passage storage unit 80, and from the similarity matrix stored in the similarity matrix storage unit 764.
- the neural network 792 uses the similarity attention matrix calculated by the similarity attention matrix generation unit 790 between the question 130 and the answer passage. The configuration of the neural network 792 will be described later with reference to FIG.
- FIG. 13 shows the configuration of the similarity attention matrix generation unit 790 in the form of a block diagram. As can be seen by comparing FIG. 13 with FIG. 1, the similarity attention matrix generation unit 790 and the causal relationship attention matrix generation unit 90 shown in FIG. 1 have a parallel structure.
- the similarity attention matrix generation unit 790 includes a word extraction unit 820 that extracts all the content words included in each combination of the question 130 from the question receiving unit 50 and an answer passage stored in the answer passage storage unit 80; a third matrix calculation unit 822 that calculates a similarity matrix in which the question words extracted by the word extraction unit 820 are arranged on the horizontal axis, the words of the answer passage are arranged on the vertical axis, and the similarity of the two words corresponding to each position is read from the similarity matrix storage unit 764 and arranged; and a fourth matrix calculation unit 824 that calculates two fourth similarity matrices from the similarity matrix calculated by the third matrix calculation unit 822 by the method described below.
- the similarity values in all similarity matrices are normalized in the range [0, 1].
- the method of generating the two fourth similarity matrices by the fourth matrix calculation unit 824 is the same as the method of generating the second matrices 180 and 182 shown in FIG. 3; therefore, the details are not repeated here.
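The semantic-vector similarity underlying these matrices can be sketched as follows (cosine similarity of general word embedding vectors, rescaled into [0, 1]; the rescaling step is an assumption made to satisfy the normalization stated above, and the example vectors are hypothetical):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def similarity_matrix(embeddings):
    """embeddings: word -> general word embedding vector. Returns the
    pairwise similarities, mapped from [-1, 1] into [0, 1]."""
    words = list(embeddings)
    return {(x, y): (cosine(embeddings[x], embeddings[y]) + 1.0) / 2.0
            for x in words for y in words}

vecs = {"earthquake": [0.9, 0.1], "quake": [0.8, 0.2], "recipe": [-0.1, 0.9]}
S = similarity_matrix(vecs)
```

Words that appear in similar contexts (here "earthquake" and "quake") receive a higher similarity than topically unrelated words.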
- FIG. 14 shows a schematic configuration of the neural network 792.
- the configuration of the neural network 792 shown in FIG. 14 is almost the same as the configuration of the neural network 92 shown in FIG.
- the neural network 792 is different from the neural network 92 in that it has an input layer 900 instead of the input layer 200 of FIG.
- the third column and the fourth column of the input layer 900 are the same as those of the input layer 200.
- the first column C1 and the second column C2 of the input layer 900 differ from the input layer 200 in that each receives the word strings constituting the question and the answer passage, converts them into word vectors, and has a function 910 of weighting each word vector by the value obtained by adding the above-described causal relationship attention and similarity attention element by element.
- weights are applied to the corresponding elements of the causal relationship attention and the similarity attention, and the two are then added. These weights form part of the learning parameters of the neural network 792.
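The weighted combination performed by function 910 can be sketched as follows (the parameter names `alpha` and `beta` are hypothetical; in the network they are learned along with the other parameters):

```python
def combined_attention(causal, similarity, alpha, beta):
    """Element-wise weighted sum of the causal relationship attention and
    the similarity attention for a word sequence (one value per word)."""
    return [alpha * c + beta * s for c, s in zip(causal, similarity)]

# With alpha = beta = 0.5, both attentions contribute equally.
w = combined_attention([0.65, 0.0], [0.9, 0.4], 0.5, 0.5)
```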
- the neural network 792 has the same configuration as the neural network 92 shown in FIG. Therefore, description of these common parts will not be repeated here.
- the non-factoid question answering system 730 operates as follows.
- the operation of the non-factoid question answering system 730 during learning is similar to that of the non-factoid question answering system 30. One difference is that, prior to learning, the semantic vector calculation unit 758 and the similarity calculation unit 762 calculate a similarity matrix from the text stored in the web archive storage unit 56 and store it in the similarity matrix storage unit 764.
- a further difference from the learning of the non-factoid question answering system 30 is that, for each combination of a learning-data question and an answer passage, the similarity attention and the causal relationship attention are calculated based on the similarity matrix and the mutual information matrix computed from the text stored in the web archive storage unit 56, and both are used simultaneously for learning the neural network 792.
- the end timing of learning is not limited to that described above. For example, learning may be terminated after it has been performed a predetermined number of times using the same learning data.
- the operation of the non-factoid question answering system 730 during operation is the same as that of the non-factoid question answering system 30 of the first embodiment except that similarity attention is used. More specifically, the question receiving unit 50, the response receiving unit 52, the answer passage storage unit 80, the causal relationship expression extracting unit 82, the answer-causal relationship expression storing unit 84, the related causal relationship expression extracting unit 86 shown in FIG. The related causal relationship expression storage unit 88 and the causal relationship attention processing unit 40 operate in the same manner as in the first embodiment.
- the semantic vector calculation unit 758 and the similarity calculation unit 762 generate a similarity matrix in advance and store it in the similarity matrix storage unit 764.
- Answer passages for the question are collected from the question answering system 34, answer causal relation expressions are extracted from them, and the extracted expressions are stored in the answer causal relationship expression storage unit 84.
- Archive causal relationship expressions are extracted from the web archive storage unit 56, and a related causal relationship expression is extracted from the archive causal relationship expressions on the basis of the answer passage and the question 130 and stored in the related causal relationship expression storage unit 88.
- The causal relationship attention matrix is generated by the causal relationship attention matrix generation unit 90 from the words obtained from the question 130 and the answer passage.
- the similarity attention matrix is generated by the similarity attention matrix generation unit 790.
- These attentions are given to the neural network 792.
- The neural network 792 receives each of the words constituting the question and the answer passage as input, weights them with the causal relationship attention and the similarity attention, and inputs the weighted words to its hidden layer. As a result, a score for the pair is output from the neural network 792.
- A score is calculated in this way for each pair of an answer passage and the question, and the scored answer passages are stored in the answer candidate storage unit 66. Thereafter, the answer candidate ranking unit 68 ranks the answer candidates, and the highest-ranked answer candidate is output as the answer 36.
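The score-and-rank flow just described can be sketched as follows. The network is treated here as a black-box scoring function; `score_pair` and the toy word-overlap scorer are hypothetical stand-ins for the neural network 792:

```python
def rank_answer_candidates(question, passages, score_pair):
    # Score every (question, answer passage) pair, then sort best first,
    # mirroring the flow from candidate storage to the ranking unit.
    scored = [(score_pair(question, p), p) for p in passages]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored]

# Toy scorer: count words shared with the question (a stand-in for the
# attention-weighted neural network score).
def toy_score(question, passage):
    return len(set(question.split()) & set(passage.split()))

ranked = rank_answer_candidates(
    "why does rain fall",
    ["rain falls because vapor condenses", "the sky is blue"],
    toy_score)
# ranked[0] would be output as the answer, like answer 36.
```

The real system differs in that the score comes from the attention-weighted neural network rather than lexical overlap, but the surrounding control flow is the same.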
- FIG. 15 is a flowchart showing the control structure of a computer program for realizing the non-factoid question answering system 730 according to the second embodiment by a computer.
- The program shown in FIG. 15 differs from that of the first embodiment shown in FIG. 7 in that it includes, in place of the process 494 in FIG. 7, a process 950 that includes a step for calculating an attention according to the general context.
- The process 950 differs from the process 494 in that it includes a step 952 for preparing two two-dimensional arrays in place of the step 508 of the process 494, a step 954 for calculating a third matrix, and, in place of step 514 in FIG. 7, a step 956 for calculating two fourth matrices based on the third matrix calculated in step 954 by the same method as that shown in FIG. 3.
- the question accepted by the question accepting unit 50 is given to the first column of the neural network 792.
- To the next column, the answer passage being processed is given.
- To the next column, all the answer-causal relation expressions extracted from the answer passage being processed, which are stored in the answer-causal relation expression storage unit 84, are concatenated with a predetermined delimiter and given.
- To the next column, the causal relationship expression related to the answer passage being processed, which is stored in the related causal relationship expression storage unit 88, is given.
- The weights obtained from the mutual information matrices A_q and A_p and the weights obtained from the third and fourth matrices are added element by element.
- OH13, obtained by the same method as in the earlier experiment, serves as the baseline of this experiment. As can be seen from FIG. 16, and as in FIG. 8, the first embodiment provides a significant performance improvement over the baseline approach; the second embodiment, moreover, attains higher accuracy than the first embodiment.
- The experiment shown in FIG. 16 reports the results of 10-fold cross validation (training: 15,000 cases; development: 1,000 cases; evaluation: 1,000 cases) on 17,000 question/answer pairs (20 answer candidates for each of 850 questions).
- P@1 indicates the accuracy of the top-ranked result, and MAP indicates the quality of the entire top 20 results. This experiment differs from that of FIG. 8 in that 10-fold cross validation is performed.
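For reference, the two figures of merit can be computed as below; this is the standard formulation of P@1 and mean average precision, not code from the disclosure. Each question contributes a list of 0/1 relevance labels ordered by the system's ranking:

```python
def precision_at_1(rankings):
    # Fraction of questions whose top-ranked candidate is relevant.
    return sum(r[0] for r in rankings) / len(rankings)

def mean_average_precision(rankings):
    # Mean over questions of the average precision of the ranked list.
    aps = []
    for labels in rankings:
        hits, precisions = 0, []
        for i, rel in enumerate(labels, start=1):
            if rel:
                hits += 1
                precisions.append(hits / i)  # precision at each hit
        aps.append(sum(precisions) / max(hits, 1))
    return sum(aps) / len(aps)

# Two toy questions, four ranked candidates each (1 = correct answer).
rankings = [[1, 0, 0, 0], [0, 1, 0, 1]]
p1 = precision_at_1(rankings)        # first question's top answer is correct
m = mean_average_precision(rankings)
```

In the experiment of FIG. 16 each ranking list would hold the 20 answer candidates per question.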
- According to the first and second embodiments of the present invention, it is possible to obtain answers to non-factoid questions with very high accuracy compared with conventional methods. For example, by constructing these question answering systems using learning data on problems that occurred in factory production lines, in final products, in software testing, in experimental processes, and the like, useful answers to various practical questions can be obtained. As a result, higher production efficiency in factories, more efficient design of industrial products and software, more efficient experiment planning, and the like can be achieved, which can greatly contribute to the development of industry. Furthermore, the application is not limited to the manufacturing industry; the invention can also be applied to services such as the education field, customer service, automatic response in government offices, and explanations of software operation.
- In the embodiments above, two types of attention, i.e., a causal relationship attention and a similarity attention, are used.
- However, the present invention is not limited to such embodiments.
- Other types of attention may be used depending on the application. For example, attention using the following relationships disclosed in JP-A-2015-121896 can be employed. Further, in place of either one or both of the causal relationship attention and the similarity attention, attention based on any one, or two or more, of the following relationships can be used:
- Material relation (example: <produce B with A>, (corn, biofuel), etc.)
- Necessity relation (example: <A is essential for B>, (sunlight, photosynthesis), etc.)
- Use relation (example: <use A for B>, (iPS cells, regenerative medicine), etc.)
- Prevention relation (example: <prevent B with A>, (vaccine, influenza), etc.)
- With such attention, questions such as "Why can vaccines be used against influenza?", "Why are iPS cells attracting attention?", and "Why do plants need sunlight?" can be answered with high accuracy (corresponding to the prevention, use, and necessity relations, respectively).
- Attention based on such a relationship can be obtained in the same manner as the causal relationship attention.
- For this purpose, the method described in the aforementioned Japanese Patent Application Laid-Open No. 2015-121896 can be used. That is, a group of specific patterns (referred to as seed patterns) and semantic class information of words, which serve as the basis for extracting semantic relation patterns, are stored in a database.
- Next, a database of semantic relation patterns is constructed by extracting, from the web archive storage unit 56, patterns similar to the seed patterns stored in these databases. Expressions matching these semantic patterns are then collected from the web archive, the mutual information of the words in the collected expression set is calculated, and an attention matrix relating to the relationship is generated.
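The mutual information step just described can be sketched with pointwise mutual information (PMI) over word co-occurrence counts in the collected expression set; the disclosure does not fix the exact formula, so this is one common choice with hypothetical toy data:

```python
import math
from collections import Counter
from itertools import combinations

def pmi_matrix(expressions):
    # expressions: list of word lists, one per collected expression.
    # Returns {(w1, w2): PMI} for word pairs co-occurring in an expression.
    word_counts = Counter()
    pair_counts = Counter()
    for words in expressions:
        uniq = sorted(set(words))       # count each word once per expression
        word_counts.update(uniq)
        pair_counts.update(combinations(uniq, 2))
    n = len(expressions)
    pmi = {}
    for (w1, w2), c in pair_counts.items():
        p_xy = c / n
        p_x = word_counts[w1] / n
        p_y = word_counts[w2] / n
        pmi[(w1, w2)] = math.log(p_xy / (p_x * p_y))
    return pmi

exprs = [["vaccine", "prevents", "influenza"],
         ["vaccine", "influenza"],
         ["corn", "biofuel"]]
m = pmi_matrix(exprs)
# ("influenza", "vaccine") co-occur in 2 of 3 expressions.
```

The resulting scores would then be arranged into the attention matrix indexed by question words and answer words.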
- Words are similarly extracted from the question and from each answer passage, two matrices are generated from the previously created attention matrix by the same method as that shown in FIG. 3, and these are used as the weights for the words input to the neural network.
- A classification unit such as the classification unit 754 shown in FIG. 12 may be prepared for every relationship, and the number of columns of the neural network 792 may be increased accordingly.
- Alternatively, the classification unit 754 may be prepared only for a specific semantic relationship, and only the attentions may be calculated for the other semantic relationships.
- In that case, a value obtained by adding these attentions element by element may be used as the weight for each word.
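The element-wise addition mentioned here is straightforward: given several attention matrices of identical shape (one per semantic relation), their element-wise sum gives one combined weight per word position. A minimal sketch with hypothetical toy matrices:

```python
def combine_attentions(matrices):
    # Element-wise sum of same-shaped attention matrices: one combined
    # weight per (question word, answer word) position.
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) for j in range(cols)]
            for i in range(rows)]

causal = [[0.2, 0.0], [0.1, 0.3]]      # causal relationship attention
similarity = [[0.1, 0.4], [0.0, 0.2]]  # similarity attention
weights = combine_attentions([causal, similarity])
# weights[i][j] sums the attentions for the (i, j) word pair.
```

Any number of relation-specific attentions (material, necessity, use, prevention, etc.) could be combined the same way.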
- The present invention can give answers to various problems that humans encounter in their lives, and is therefore available to the industry that manufactures devices providing such functions and to the industry that provides such functions to people over a network.
- Because the present invention can give answers to the various problems faced by actors in industrial activities and research activities regardless of the field, it can be used in all industries and in all fields of research. By doing so, industrial activities and research activities can be carried out more smoothly and promptly.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
[Problem] To provide a question answering system capable of using an answer pattern that includes an expression not explicitly containing clue words, and of giving an appropriate answer to a non-factoid question. [Solution] A question answering system 30 includes: a storage unit 60 that stores expressions representing causalities; an answer receiving unit 52 that receives a question and receives answer passages including candidate answers to the question; a causality expression extracting unit 82 that extracts a causality expression from each of the answer passages; a related causality expression extracting unit 86 that selects, for combinations of the question and the answer passages, related expressions most strongly associated with those combinations from within the storage unit 60; and a neural network 92 that receives combinations formed from the question, the answer passages, semantic relation expressions for the answer passages, and one of the related expressions for the combination of the question and the answer passage, and selects the answer to the question from among the answer passages.
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/338,465 US20200034722A1 (en) | 2016-10-07 | 2017-10-02 | Non-factoid question-answering system and method and computer program therefor |
| CN201780061910.2A CN109863487B (zh) | 2016-10-07 | 2017-10-02 | 非事实型问答系统和方法及存储介质 |
| KR1020197008669A KR102408083B1 (ko) | 2016-10-07 | 2017-10-02 | 논팩토이드형 질의응답 시스템 및 방법 그리고 그것을 위한 컴퓨터 프로그램 |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2016198929 | 2016-10-07 | ||
| JP2016-198929 | 2016-10-07 | ||
| JP2017-131291 | 2017-07-04 | ||
| JP2017131291A JP6929539B2 (ja) | 2016-10-07 | 2017-07-04 | ノン・ファクトイド型質問応答システム及び方法並びにそのためのコンピュータプログラム |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018066489A1 true WO2018066489A1 (fr) | 2018-04-12 |
Family
ID=61830983
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2017/035765 Ceased WO2018066489A1 (fr) | 2016-10-07 | 2017-10-02 | Système et procédé de questions-réponses sans faits allégués, et programme informatique associé |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2018066489A1 (fr) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109829040A (zh) * | 2018-12-21 | 2019-05-31 | 深圳市元征科技股份有限公司 | 一种智能对话方法及装置 |
| WO2020004136A1 (fr) * | 2018-06-27 | 2020-01-02 | 国立研究開発法人情報通信研究機構 | Dispositif de réponse à des questions et programme d'ordinateur |
| CN111460303A (zh) * | 2020-03-31 | 2020-07-28 | 拉扎斯网络科技(上海)有限公司 | 数据处理方法、装置、电子设备及计算机可读存储介质 |
| CN111506712A (zh) * | 2019-01-30 | 2020-08-07 | 株式会社东芝 | 对话系统、对话方法、程序以及存储介质 |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2008282366A (ja) * | 2007-05-14 | 2008-11-20 | Nippon Telegr & Teleph Corp <Ntt> | 質問応答装置、質問応答方法、質問応答プログラム並びにそのプログラムを記録した記録媒体 |
| JP2009157791A (ja) * | 2007-12-27 | 2009-07-16 | Nippon Telegr & Teleph Corp <Ntt> | 質問応答方法、装置、プログラム並びにそのプログラムを記録した記録媒体 |
| JP2013171550A (ja) * | 2012-02-23 | 2013-09-02 | National Institute Of Information & Communication Technology | ノン・ファクトイド型質問応答システム及びコンピュータプログラム |
| JP2015011426A (ja) * | 2013-06-27 | 2015-01-19 | 独立行政法人情報通信研究機構 | ノン・ファクトイド型質問応答システム及びコンピュータプログラム |
| JP2016045652A (ja) * | 2014-08-21 | 2016-04-04 | 国立研究開発法人情報通信研究機構 | 質問文生成装置及びコンピュータプログラム |
-
2017
- 2017-10-02 WO PCT/JP2017/035765 patent/WO2018066489A1/fr not_active Ceased
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2008282366A (ja) * | 2007-05-14 | 2008-11-20 | Nippon Telegr & Teleph Corp <Ntt> | 質問応答装置、質問応答方法、質問応答プログラム並びにそのプログラムを記録した記録媒体 |
| JP2009157791A (ja) * | 2007-12-27 | 2009-07-16 | Nippon Telegr & Teleph Corp <Ntt> | 質問応答方法、装置、プログラム並びにそのプログラムを記録した記録媒体 |
| JP2013171550A (ja) * | 2012-02-23 | 2013-09-02 | National Institute Of Information & Communication Technology | ノン・ファクトイド型質問応答システム及びコンピュータプログラム |
| JP2015011426A (ja) * | 2013-06-27 | 2015-01-19 | 独立行政法人情報通信研究機構 | ノン・ファクトイド型質問応答システム及びコンピュータプログラム |
| JP2016045652A (ja) * | 2014-08-21 | 2016-04-04 | 国立研究開発法人情報通信研究機構 | 質問文生成装置及びコンピュータプログラム |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2020004136A1 (fr) * | 2018-06-27 | 2020-01-02 | 国立研究開発法人情報通信研究機構 | Dispositif de réponse à des questions et programme d'ordinateur |
| JP2020004045A (ja) * | 2018-06-27 | 2020-01-09 | 国立研究開発法人情報通信研究機構 | 質問応答装置及びコンピュータプログラム |
| JP7084617B2 (ja) | 2018-06-27 | 2022-06-15 | 国立研究開発法人情報通信研究機構 | 質問応答装置及びコンピュータプログラム |
| CN109829040A (zh) * | 2018-12-21 | 2019-05-31 | 深圳市元征科技股份有限公司 | 一种智能对话方法及装置 |
| CN109829040B (zh) * | 2018-12-21 | 2023-04-07 | 深圳市元征科技股份有限公司 | 一种智能对话方法及装置 |
| CN111506712A (zh) * | 2019-01-30 | 2020-08-07 | 株式会社东芝 | 对话系统、对话方法、程序以及存储介质 |
| CN111506712B (zh) * | 2019-01-30 | 2023-11-07 | 株式会社东芝 | 对话系统、对话方法、程序以及存储介质 |
| CN111460303A (zh) * | 2020-03-31 | 2020-07-28 | 拉扎斯网络科技(上海)有限公司 | 数据处理方法、装置、电子设备及计算机可读存储介质 |
| CN111460303B (zh) * | 2020-03-31 | 2023-07-04 | 拉扎斯网络科技(上海)有限公司 | 数据处理方法、装置、电子设备及计算机可读存储介质 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP6929539B2 (ja) | ノン・ファクトイド型質問応答システム及び方法並びにそのためのコンピュータプログラム | |
| Kim et al. | Transparency and accountability in AI decision support: Explaining and visualizing convolutional neural networks for text information | |
| CN111753060B (zh) | 信息检索方法、装置、设备及计算机可读存储介质 | |
| US20200134263A1 (en) | Non-factoid question-answering device | |
| CN108959246B (zh) | 基于改进的注意力机制的答案选择方法、装置和电子设备 | |
| Oh et al. | Multi-column convolutional neural networks with causality-attention for why-question answering | |
| WO2020192401A1 (fr) | Système et procédé pour générer une réponse sur la base d'un regroupement et d'une similarité de phrases | |
| US20180276525A1 (en) | Method and neural network system for human-computer interaction, and user equipment | |
| US11481560B2 (en) | Information processing device, information processing method, and program | |
| KR20180048624A (ko) | 질의 응답 시스템의 훈련 장치 및 그것을 위한 컴퓨터 프로그램 | |
| CN113297410A (zh) | 一种图像检索方法、装置、计算机设备及存储介质 | |
| Korade et al. | Strengthening Sentence Similarity Identification Through OpenAI Embeddings and Deep Learning. | |
| WO2018066489A1 (fr) | Système et procédé de questions-réponses sans faits allégués, et programme informatique associé | |
| Collarana et al. | A question answering system on regulatory documents | |
| Zhang et al. | Double retrieval and ranking for accurate question answering | |
| Tohidi et al. | Optimizing the performance of Persian multi-objective question answering system | |
| CN115238034A (zh) | 数据搜索方法、装置、计算机设备和存储介质 | |
| Das et al. | An improvement of Bengali factoid question answering system using unsupervised statistical methods | |
| Valla et al. | Anomaly component analysis | |
| Jeon et al. | Random forest algorithm for linked data using a parallel processing environment | |
| Sinhababu et al. | Medical information retrieval and interpretation: a question-answer based interaction Model | |
| Rosso-Mateus et al. | A two-step neural network approach to passage retrieval for open domain question answering | |
| CN111159366A (zh) | 一种基于正交主题表示的问答优化方法 | |
| AU2021105030A4 (en) | A method for developing a hybrid deep learning approach for text summarization using latent semantic analysis | |
| Ahmed | Incorporating Sentiment Aware Features in a Neural Network Model for Stance Detection on Microblog |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17858327 Country of ref document: EP Kind code of ref document: A1 |
|
| ENP | Entry into the national phase |
Ref document number: 20197008669 Country of ref document: KR Kind code of ref document: A |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 17858327 Country of ref document: EP Kind code of ref document: A1 |