
CN119443267A - Method and system for solving mathematical problems by chain-of-thought reasoning based on a feature classifier - Google Patents

Method and system for solving mathematical problems by chain-of-thought reasoning based on a feature classifier

Info

Publication number
CN119443267A
Authority
CN
China
Prior art keywords
reasoning
node
path
training
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202411479942.2A
Other languages
Chinese (zh)
Other versions
CN119443267B (en)
Inventor
李嘉明
谢亮
王闻箫
林彬彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202411479942.2A
Publication of CN119443267A
Application granted
Publication of CN119443267B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/04 Inference or reasoning models
    • G06N 5/041 Abduction
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Algebra (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a method and system for solving mathematical problems by chain-of-thought reasoning based on a feature classifier, comprising the steps of: (1) obtaining a mathematical problem set and selecting forward and reverse chain-of-thought examples to splice into a new set; (2) inputting the elements of the new set into a model for inference and constructing word-level reasoning path spanning trees, storing as node features the attention weight matrices processed by pooling and interpolation; (3) traversing all generated reasoning path spanning trees and screening the qualifying nodes to construct a feature classifier training set; (4) training the feature classifier with a support vector machine algorithm; and (5) letting the trained feature classifier take part in path selection during inference of the pre-trained language model to obtain a more accurate reasoning process and answer. With this method, finer-grained regulation and control of the reasoning path of the pre-trained language model can be realized, which facilitates accurate reasoning on and solving of mathematical problems and improves the generalization level of the pre-trained language model.

Description

Chain-of-thought reasoning mathematical problem solving method and system based on a feature classifier
Technical Field
The invention relates to the technical field of computer applications, in particular to a chain-of-thought reasoning mathematical problem solving method and system based on a feature classifier.
Background
In recent years, deep learning based artificial intelligence has gone through several research paradigm shifts: from early task-specific models built by supervised learning on annotated data, to pre-trained models built by pre-training on unannotated data plus fine-tuning on annotated data, and on to large models built by large-scale pre-training on unannotated data plus instruction fine-tuning and human alignment, gradually entering the age of large models. Meanwhile, as model parameter scale and pre-training data scale have kept growing into the billions, large language models have developed a strong in-context learning capability, that is, learning from the examples in the model's input context, which lets them execute a range of complex tasks and makes them a powerful tool for helping humans handle automated complex tasks. As a specific application of in-context learning, the chain-of-thought technique prompts a large language model to decompose a complex problem into sub-problems step by step and solve them in sequence, completing the mapping from input to chain of thought to output, which markedly improves the performance of large models on complex reasoning tasks.
Mathematical reasoning is crucial to artificial intelligence and drives continuous exploration of autonomously solving mathematical problems. This requires stronger model reasoning capabilities and deep research into text understanding, image recognition, table analysis, symbolic operations, logical operations, and rich world knowledge. Comprehensively improving the understanding capability of large language models across the various mathematical fields not only demonstrates technical strength but is also an important step toward general artificial intelligence. Existing studies show that large language models are effective tools for helping to solve mathematical problems; their language capabilities motivate exploring how to use them for mathematical reasoning, revealing new insights into the synergy between language and logic.
For example, Chinese patent document CN116595159A discloses a training method and apparatus for a mathematical problem solving model: a pre-trained language model with frozen parameters first generates solutions whose reasoning processes lead to correct answers, which serve as chain-of-thought learning examples; the mathematical problem solving model then generates multiple reasoning processes and answers for the target mathematical problem to be solved, and a self-consistency majority vote takes the most frequent reasoning process and answer as the final result for the target problem.
Chinese patent document CN118113444A discloses a task processing method, apparatus, electronic device, and storage medium. In this method, a pre-trained language model decomposes the target task into steps according to its input content and output description, generating a corresponding tree of thoughts. The nodes of the tree represent the steps obtained by the decomposition, giving the model the ability to search among multiple reasoning chains. An optimal step path for executing the target task is searched in the tree with a preset search algorithm, the target task is then executed on the input content along the optimal path, and the execution result is output.
However, follow-up work on chain-of-thought reasoning with pre-trained language models, including self-consistency, tree-of-thought, and other methods for promoting reasoning on complex mathematical problems, has all focused on sequence-level reasoning paths, which has limited effect on improving the accuracy of reasoning results and is constrained by the type of mathematical problem. In addition, self-consistency may generate incorrect or nonsensical reasoning paths, the reasoning chain search of tree-of-thought requires more computing resources than sampling, and similar works are limited by computing cost and performance.
Disclosure of Invention
The invention provides a chain-of-thought reasoning mathematical problem solving method and system based on a feature classifier, which can provide a problem solving process and answer of higher quality and accuracy.
A chain-of-thought reasoning mathematical problem solving method based on a feature classifier comprises the following steps:
(1) Obtain a mathematical problem set $Q = \{q_1, q_2, \ldots, q_n\}$; for each element $q_i$, use the chain-of-thought prompt generation method to find two corresponding examples $d_i^+$ and $d_i^-$ such that $d_i^+$ yields a correct answer and $d_i^-$ yields a wrong answer, forming a new set $Q' = \{q'_1, q'_2, \ldots, q'_n\}$, where $q'_i = (q_i, d_i^+, d_i^-)$;
(2) Based on the new set $Q'$, input the pre-trained language model using the chain-of-thought prompting method and construct word-level reasoning path spanning trees, each node $n_i$ storing the attention weight matrix $A_i$ after average pooling and interpolation operations;
(3) Traverse all reasoning path spanning trees generated in step (2) and screen the qualifying nodes to construct the feature classifier training set; with the attention weight matrix $A_i$ of a node as the feature and the reasoning path result as the label, construct the training data sample pair set $D$;
(4) Train the feature classifier $C$ with a support vector machine algorithm on the training set generated in step (3);
(5) In the inference stage of the pre-trained language model, for the target mathematical problem $q$ to be solved, randomly select two examples $d_a$ and $d_b$ from the data set and reason under both simultaneously; if the reasoning paths are the same, continue predicting and generating until the end; if the reasoning paths differ, let the feature classifier $C$ trained in step (4) select the word with the higher probability of being correct and continue reasoning; finally obtain the complete reasoning process and extract the corresponding answer.
Based on a mathematical problem data set, the invention first selects some problems and constructs word-granularity reasoning path spanning trees with the chain-of-thought method, then screens training sample data to train a feature classifier. When the pre-trained language model performs inference, the chain-of-thought prompting method processes the target mathematical problem to be solved, and during inference the feature classifier intervenes in the selection and generation of the reasoning path, yielding a problem solving process and answer of higher quality and accuracy.
In step (1), the two corresponding examples $d_i^+$ and $d_i^-$ are found using the chain-of-thought prompt generation method as follows:
Let $p_\theta$ denote the probability distribution fitted by a pre-trained large language model with parameters $\theta$, and let $\{d_1, \ldots, d_{N_d}\}$ denote $N_d$ chain-of-thought prompts, where $d_i = (q_i, r_i, a_i)$ is the $i$-th example and $q_i, r_i, a_i$ denote the question, the reasoning steps, and the answer, respectively. If $q$ denotes the question to be reasoned about, the few-shot chain of thought FS-CoT is defined as:
$$y \sim p_\theta(y \mid d_1, \ldots, d_{N_d}, q)$$
where $y = \{r, a\}$ is the generated reasoning step $r$ and reasoning result $a$. Only the case where the value of $N_d$ is 1 is considered. For each problem $q_i$, the mathematical problem set is traversed to find the corresponding examples $d_i^+$ and $d_i^-$ such that the maximum-probability answer $a_i^+$ obtained under $d_i^+$ is correct while the maximum-probability answer $a_i^-$ obtained under $d_i^-$ is wrong; if no corresponding example is found, the problem is eliminated. Finally, combining each problem with its two corresponding examples yields the new set $Q' = \{q'_1, q'_2, \ldots, q'_m\}$, $m \le n$, where $q'_i = (q_i, d_i^+, d_i^-)$.
The specific process of step (2) is as follows:
For all elements in the new set $Q'$, the input text of the pre-trained language model is obtained by concatenating $(d_i^+, q_i)$ and $(d_i^-, q_i)$; all parameters of the model are frozen, and the text is used as input of the pre-trained language model for forward inference.
During inference, each $q'_i$ generates a reasoning path tree $G_i$; each node $n_j$ contains three classes of attributes $(f(A_j, l), x_j, r_j)$, where $x_j$ is the text of the current node, $r_j$ is the accuracy of all reasoning paths through node $n_j$, and $A_j$ is the attention weight matrix of the model when predicting $x_j$.
The reasoning path tree is constructed as follows: an empty node $n_{root}$ is constructed as the root node of the tree and marked as a non-leaf node; among the end nodes of each path of the tree, a non-leaf node is found, denoted $n_i$; the path from root node $n_{root}$ to node $n_i$ represents a reasoning path $S = (x_1, \ldots, x_i)$. The child nodes of $n_i$ are then determined. Specifically, $(d^+, q, x_1, \ldots, x_i)$ and $(d^-, q, x_1, \ldots, x_i)$ are input into the pre-trained language model separately; when the first $i$ tokens are input, the language model predicts according to the following probability:
$$p_\theta(x_{i+1}) = p_\theta(x_{i+1} \mid x_1, \ldots, x_i)$$
For $x_{i+1}$ there are two cases:
First, the texts generated in this round under the two prompts are the same, i.e. $x^+_{i+1} = x^-_{i+1}$; node $n_i$ then adds one child node $n_{i+1}$ whose text attribute is set to $x^+_{i+1}$; if $x^+_{i+1}$ is a sentence terminator, the child node is marked as a leaf node.
Second, the texts generated in this round under the two prompts differ, i.e. $x^+_{i+1} \neq x^-_{i+1}$; left and right child nodes $n^+_{i+1}$ and $n^-_{i+1}$ are then inserted for node $n_i$, with text attributes set to $x^+_{i+1}$ and $x^-_{i+1}$ respectively; if $x^+_{i+1}$ is a sentence terminator, child node $n^+_{i+1}$ is marked as a leaf node, and if $x^-_{i+1}$ is a sentence terminator, child node $n^-_{i+1}$ is marked as a leaf node.
This continues until the end node of every path of the tree is a leaf node. Finally, all elements in the new set $Q'$ generate their corresponding reasoning path trees, yielding the tree set $G = \{G_1, G_2, \ldots, G_m\}$.
In step (3), the qualifying nodes are screened to construct the feature classifier training set as follows:
First, the tree set $G$ generated in step (2) contains $m$ trees, one reasoning path spanning tree for each problem in the mathematical problem set $Q'$, and each node in a tree carries three attributes, $n_i = (f(A_i, l), x_i, r_i)$: the feature matrix, the text, and the accuracy of all reasoning paths through node $n_i$. Before screening training set samples, the $r_i$ attribute of each node must be computed. For a reasoning path spanning tree, each path from the root node to a leaf node constitutes one reasoning path, and each reasoning path carries a unique attribute $a$ indicating the correctness of its result, where 1 indicates correct and 0 indicates wrong. This yields the following formula:
$$r_i = \frac{\sum_j \beta(n_i, S_j)\, a_j}{\sum_j \beta(n_i, S_j)}$$
where $\beta(n_i, S_j)$ indicates whether node $n_i$ belongs to reasoning path $S_j$, taking the value 1 if it does and 0 otherwise. For example, if three of the four root-to-leaf paths through $n_i$ yield correct answers, then $r_i = 3/4$. The reasoning path accuracy of all nodes of all trees in the tree set $G$ is computed according to this formula, and nodes satisfying the following conditions are then screened out as the training data set for the feature classifier:
① The node is neither the root node nor a leaf node;
② The node's $r$ attribute is 0% or 100%, i.e. the answers of all reasoning paths through the node are all wrong or all correct.
All qualifying nodes form the node set $N = \{n_1, n_2, \ldots\}$.
The specific process of step (4) is as follows:
First, the features of each element of the training data set $D$ are extracted as input with the reasoning path result as label, and the data are standardized. Then the data are mapped to a high-dimensional feature space with a radial basis function kernel, and a support vector machine classification model is trained with the maximized Lagrangian dual problem as the objective function; model performance is evaluated with cross-validation and the hyperparameters are tuned as needed. Finally, the trained support vector machine model is saved for subsequent inference. The resulting feature classifier $C$ is:
$$C(x) = \operatorname{sign}\left(\sum_i \alpha_i y_i K(x_i, x) + b\right)$$
where $x$ is a new input sample; the kernel function $K(x_i, x)$ computes the similarity of the sample to every training sample $x_i$, and these similarities are weighted and summed, with $\alpha_i y_i$ coming from the Lagrange multipliers learned during training and the labels of the training samples; finally the bias term $b$ is added, and the classification result is obtained through the sign function.
In step (5), if the reasoning paths differ, the feature classifier $C$ trained in step (4) selects the word with the higher probability of being correct to continue the reasoning, specifically:
The attention weight matrices used to generate the current text are computed for both examples, and layer-dimension average pooling and sequence-length linear interpolation are applied, giving the feature matrix $A_a$ for example $d_a$ and the feature matrix $A_b$ for example $d_b$. The feature matrices are classified by the feature classifier $C(x)$; the classification result falls into one of the following three cases, handled respectively as follows:
① $C(A_a) = C(A_b) = 1$: for the current text, both play a positive role in the correctness of the generated reasoning path; in this case the text whose value inside the sign function of classifier $C$ is larger is selected as the current text prediction;
② $C(A_a) = -1, C(A_b) = 1$ or $C(A_a) = 1, C(A_b) = -1$: for the current text, the two examples play a positive and a negative role respectively in the correctness of the generated reasoning path; in this case the text with $C(x) = 1$ is selected as the current text prediction;
③ $C(A_a) = C(A_b) = -1$: for the current text, both play a negative role in the correctness of the generated reasoning path; in this case the text whose value inside the sign function of classifier $C$ is smaller is selected as the current text prediction.
Based on the same inventive principle, the invention also provides a chain-of-thought reasoning mathematical problem solving system based on a feature classifier, comprising:
a chain-of-thought prompt generation module, used for acquiring the mathematical reasoning problem to be processed and selecting qualifying examples and instructions according to the problem to form the complete input of the pre-trained language model;
a reasoning path tree generation and feature screening module, used for generating word-level reasoning path spanning trees and screening out the qualifying feature training data set;
a feature classifier training module, used for training the feature classifier: based on the feature training data set, a support vector machine algorithm is introduced, a radial basis function is used as the kernel function to map the data to a high-dimensional feature space, and the sequential minimal optimization algorithm improves the training efficiency of the support vector machine;
a feature-classifier-guided reasoning module, used for guiding the pre-trained language model, through the intervention of the feature classifier, to select a more accurate reasoning path, yielding the final mathematical problem reasoning process and answer.
Based on the same inventive principle, the invention also provides a chain-of-thought reasoning mathematical problem solving system based on a feature classifier, comprising a memory and one or more processors, wherein the memory stores executable code, and the one or more processors, when executing the executable code, implement the above chain-of-thought reasoning mathematical problem solving method.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention focuses on prediction and generation at the text (token) level. The processing granularity at this level is significantly finer than the currently prevailing sequence-level techniques. This improves prediction accuracy and enables more accurate and efficient control over the generation of reasoning results. In practical applications, fine-grained text-level processing adapts better to complex and changing demand scenarios and provides users with more targeted and reliable results.
2. The invention innovatively uses a self-trained feature classifier. The classifier exhibits excellent transfer ability and broad generalization performance across various types of mathematical reasoning tasks. This means it can quickly adapt and function in different tasks and scenarios without extensive retraining and adjustment, greatly improving the working efficiency and versatility of the model.
3. Compared with common methods such as self-consistency and tree-of-thought reasoning chain search, the invention has clear advantages in computing resource demand. Its resource requirements are relatively small, which significantly saves computation cost. This not only lowers the hardware threshold for running the system, but also opens greater possibilities for large-scale application and deployment, so that efficient operation can proceed smoothly in resource-constrained environments.
4. The invention also has research significance for the interpretability of pre-trained language model reasoning under chain-of-thought prompting. It provides a new idea and method for research and practice in this field, helps deepen the understanding of the language model reasoning process, and promotes the development of related technology in a more transparent, interpretable, and reliable direction.
Drawings
FIG. 1 is a flow chart of the feature-classifier-based chain-of-thought reasoning mathematical problem solving method;
FIG. 2 is a schematic diagram of building a reasoning path spanning tree from forward and reverse examples;
FIG. 3 is a schematic diagram of screening reasoning path tree nodes;
FIG. 4 is a flow chart of training the support vector machine feature classifier.
Detailed Description
The invention will be described in further detail with reference to the drawings and embodiments; it should be noted that the embodiments described below are intended to facilitate understanding of the invention and are not intended to limit it in any way.
As shown in FIG. 1, a chain-of-thought reasoning mathematical problem solving method based on a feature classifier comprises the following steps:
(1) Obtain a mathematical problem set $Q = \{q_1, q_2, \ldots, q_n\}$ and, for each element $q_i$, find the corresponding chain-of-thought examples $d_i^+$ and $d_i^-$ such that $d_i^+$ yields a correct answer and $d_i^-$ yields a wrong answer, forming a new set $Q' = \{q'_1, q'_2, \ldots, q'_n\}$, where $q'_i = (q_i, d_i^+, d_i^-)$.
(2) Based on the composed set $Q'$, input the pre-trained language model using the chain-of-thought prompting method and construct word-level reasoning path spanning trees, each node $n_i$ storing the attention weight matrix $A_i$ after average pooling and interpolation operations.
(3) Traverse all reasoning path spanning trees generated in step (2) and screen the qualifying nodes to construct the feature classifier training set, the requirement being that all reasoning path results through a node are the same, i.e. all correct or all wrong. With the attention weight matrix $A_i$ of the node as the feature and the reasoning path result as the label, construct the training data sample pair set $D$.
(4) Train the feature classifier $C$ with a Support Vector Machine (SVM) algorithm on the training set generated in step (3). First, extract the features of each element of the training data set $D$ as input, take the reasoning path result as the label, and standardize the data. Then map the data to a high-dimensional feature space using a radial basis function kernel, train the SVM model, evaluate model performance with cross-validation, and tune the hyperparameters as needed. Finally, save the trained model for subsequent inference.
(5) In the inference stage of the pre-trained language model, for the target mathematical problem $q$ to be solved, randomly select two examples $d_a$ and $d_b$ from the data set and reason under both simultaneously; if the reasoning paths are the same, continue predicting and generating until the end; if the reasoning paths differ, let the feature classifier $C$ trained in step (4) select the word with the higher probability of being correct and continue reasoning. Finally, obtain the complete reasoning process and extract the corresponding answer.
In step (1), the original mathematical problem set to be solved is obtained, and forward and reverse chain-of-thought examples are selected and spliced to form the new model input set. The mathematical problem set to be solved is defined as $Q = \{q_1, q_2, \ldots, q_n\}$, where each element $q_i$ expresses one specific problem in the set. For each question $q_i$, the chain-of-thought prompt generation method is used to find the corresponding forward example $d_i^+$ and reverse example $d_i^-$. The chain-of-thought prompting method is as follows:
Let $p_\theta$ denote the probability distribution fitted by a pre-trained large language model with parameters $\theta$, and let $X = \{x_1, x_2, \ldots, x_N\}$ be a text sequence of input length $N$. When the first $i$ tokens are input, the language model predicts according to the probability $p_\theta(x_{i+1}) = p_\theta(x_{i+1} \mid x_1, \ldots, x_i)$.
Let $\{d_1, \ldots, d_{N_d}\}$ denote $N_d$ chain-of-thought prompts, where $d_i = (q_i, r_i, a_i)$ is the $i$-th example and $q_i, r_i, a_i$ denote the question, the reasoning steps, and the answer, respectively. If $q$ denotes the question to be reasoned about, the few-shot chain of thought can be defined as:
$$y \sim p_\theta(y \mid d_1, \ldots, d_{N_d}, q)$$
where $y = \{r, a\}$ is the generated reasoning step $r$ and reasoning result $a$. Only the case where $N_d$ is 1 is considered in this step. For each problem $q_i$, the mathematical problem set is traversed to find corresponding examples $d_i^+$ and $d_i^-$ such that the maximum-probability answer $a_i^+$ obtained under $d_i^+$ is correct while the maximum-probability answer $a_i^-$ obtained under $d_i^-$ is wrong; if no corresponding example is found, the problem is excluded. Finally, combining each problem of the original data set with its corresponding forward and reverse examples yields the new set $Q' = \{q'_1, q'_2, \ldots, q'_m\}$, $m \le n$, where $q'_i = (q_i, d_i^+, d_i^-)$.
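By way of illustration, the example-pair search of step (1) can be sketched in Python as follows. This is a minimal sketch, not the disclosed implementation: generate_answer is a hypothetical helper standing in for greedy (maximum-probability) FS-CoT decoding with a single in-context example, and the quadratic scan is the simplest reading of "traversing the mathematical problem set".

    def build_example_pairs(dataset, model, generate_answer):
        """dataset: list of (question, reasoning, answer) triples.
        Returns (q_i, d_i_plus, d_i_minus) triples: d_i_plus is an example
        under which greedy FS-CoT decoding yields the correct answer for
        q_i, d_i_minus one under which it yields a wrong answer."""
        new_set = []
        for i, (q, _, gold) in enumerate(dataset):
            d_plus = d_minus = None
            for j, example in enumerate(dataset):
                if j == i:
                    continue
                pred = generate_answer(model, example, q)  # N_d = 1 prompt
                if d_plus is None and pred == gold:
                    d_plus = example
                elif d_minus is None and pred != gold:
                    d_minus = example
                if d_plus is not None and d_minus is not None:
                    break
            if d_plus is not None and d_minus is not None:  # else: problem is excluded
                new_set.append((q, d_plus, d_minus))
        return new_set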
In step (2), the elements of the new set are input into the model for inference, a text-level reasoning path spanning tree is constructed, and the attention weight matrix after pooling and interpolation processing is stored as the node feature.
Traversing the new set $Q'$ produced by the combination in step (1), for each $q'_i = (q_i, d_i^+, d_i^-)$, the input text of the model is obtained by simple concatenation of $(d_i^+, q_i)$ and $(d_i^-, q_i)$; all parameters of the model are frozen, and the text is used as input of the pre-trained language model for forward inference. During inference, $q'_i$ generates a corresponding reasoning path tree $G_i = \{V, E\}$, where $V$ is the vertex set and $E$ the edge set. Each node $n_j$ in the vertex set contains three classes of attributes $(f(A_j, l), x_j, r_j)$, where $x_j$ is the text of the current node, $r_j$ is the accuracy of all reasoning paths through node $n_j$, and $A_j$ is the attention weight matrix of the model when predicting $x_j$. Since the pre-trained language model uses multi-head masked self-attention and the input sequence lengths differ, the function $f(\cdot)$ is used to transform and process the dimensions of the multi-head attention weight matrix.
The $f(A_j, l)$ function applies two main steps to the multi-head attention weight matrix. The first step uses linear interpolation to compress or expand the matrix $A_j$ from dimension $j \times j$ to a fixed size $l \times l$; in the implementation, $l = 30$ is set from experimental experience. The second step applies average pooling over the layer dimension of the multi-head self-attention matrix, finally yielding a fixed-size feature matrix that is stored as the node's feature.
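As an illustration of $f(A_j, l)$, the following NumPy sketch assumes attention weights of shape (layers, heads, j, j) and pools heads together with layers (an assumption; the text only names the layer dimension). Since averaging and linear resizing are both linear operations, applying them in either order gives the same result.

    import numpy as np

    def resize_linear(mat, l):
        """Linearly interpolate a (j, j) matrix to (l, l), rows then columns."""
        j = mat.shape[0]
        src = np.linspace(0.0, j - 1.0, num=l)
        rows = np.stack([np.interp(src, np.arange(j), mat[r]) for r in range(j)])          # (j, l)
        return np.stack([np.interp(src, np.arange(j), rows[:, c]) for c in range(l)], axis=1)  # (l, l)

    def node_feature(attn, l=30):
        """attn: attention weights saved when the node's token was generated.
        Average-pool over the layer (and, by assumption, head) dimensions,
        then resize the (j, j) map to the fixed size l x l (l = 30 in the text)."""
        pooled = attn.mean(axis=(0, 1))   # (j, j)
        return resize_linear(pooled, l)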
Having described the tree and the node attributes, the process of building the reasoning path tree is described in detail below. First, an empty node $n_{root}$ is constructed as the root node of the reasoning path tree and marked as a non-leaf node. While building the tree, among the end nodes of each path of the tree, one non-leaf node is found (chosen at random if there are several); let this node be $n_i$. The path from root node $n_{root}$ to node $n_i$ represents a reasoning path $S = (x_1, \ldots, x_i)$. The child nodes of $n_i$ are then determined. Specifically, $(d^+, q, x_1, \ldots, x_i)$ and $(d^-, q, x_1, \ldots, x_i)$ are input into the pre-trained language model separately; as shown in FIG. 2, when the first $i$ tokens are input, the language model predicts according to:
$$p_\theta(x_{i+1}) = p_\theta(x_{i+1} \mid x_1, \ldots, x_i)$$
Then $x_{i+1}$ falls into one of two cases:
First, the texts generated this round under the two prompts are the same, i.e. $x^+_{i+1} = x^-_{i+1}$; node $n_i$ then adds one child node $n_{i+1}$ whose text attribute is set to $x^+_{i+1}$; if $x^+_{i+1}$ is a sentence terminator, the child node is marked as a leaf node.
Second, the texts generated this round under the two prompts differ, i.e. $x^+_{i+1} \neq x^-_{i+1}$; left and right child nodes $n^+_{i+1}$ and $n^-_{i+1}$ are then inserted for node $n_i$, with text attributes set to $x^+_{i+1}$ and $x^-_{i+1}$ respectively; if $x^+_{i+1}$ is a sentence terminator, child node $n^+_{i+1}$ is marked as a leaf node, and if $x^-_{i+1}$ is a sentence terminator, child node $n^-_{i+1}$ is marked as a leaf node.
And so on, until the end nodes of every path of the tree are leaf nodes. Finally, all elements in the new set $Q'$ generate their corresponding reasoning path trees, giving the tree set $G = \{G_1, G_2, \ldots, G_m\}$.
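For concreteness, the branching rule above can be sketched as follows. This is a minimal sketch, not the disclosed implementation: next_token(model, example, prefix) is a hypothetical helper that returns the greedy next token together with its $f(A, l)$ feature for the prompt built from the given example and the path prefix.

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        text: str                    # token x_j generated at this node ('' for the root)
        feature: object = None       # f(A_j, l) feature matrix for the generation step
        children: list = field(default_factory=list)
        is_leaf: bool = False
        correct: int = 0             # path statistics filled in later to compute r_j
        total: int = 0

    def expand(node, prefix, model, d_plus, d_minus, next_token, eos):
        """One expansion of a non-leaf end node n_i: run both prompts for
        one step on the same path prefix and add one or two children."""
        t_plus, f_plus = next_token(model, d_plus, prefix)
        t_minus, f_minus = next_token(model, d_minus, prefix)
        if t_plus == t_minus:                                   # case 1: same token
            node.children.append(Node(t_plus, f_plus, is_leaf=(t_plus == eos)))
        else:                                                   # case 2: branch left/right
            node.children.append(Node(t_plus, f_plus, is_leaf=(t_plus == eos)))
            node.children.append(Node(t_minus, f_minus, is_leaf=(t_minus == eos)))

The tree is grown by repeatedly selecting a non-leaf end node, rebuilding its path prefix from the root, and calling expand, until every path ends in a leaf node.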
In step (3), the reasoning path spanning trees from step (2) are traversed, the qualifying nodes are screened, and, labeled together with their feature attributes, they form the training data sample set. The specific implementation is as follows.
First, the tree set $G$ generated in step (2) contains $m$ trees, one reasoning path spanning tree for each problem in the mathematical problem set $Q'$, and each node in a tree contains three attributes, $n_i = (f(A_i, l), x_i, r_i)$: the feature matrix, the text, and the accuracy of all reasoning paths through node $n_i$. For a reasoning path spanning tree, each path from the root node to a leaf node constitutes one reasoning path, and each reasoning path carries a unique attribute $a$ indicating the correctness of its result, where 1 indicates correct and 0 indicates wrong. This yields the following formula:
$$r_i = \frac{\sum_j \beta(n_i, S_j)\, a_j}{\sum_j \beta(n_i, S_j)}$$
where $\beta(n_i, S_j)$ indicates whether node $n_i$ belongs to reasoning path $S_j$, taking the value 1 if it does and 0 otherwise. The reasoning path accuracy of all nodes of all trees in the tree set $G$ is computed according to this formula.
Nodes satisfying the following conditions are then screened out as the training data set for the feature classifier:
1) The node is neither the root node nor a leaf node;
2) The node's $r$ attribute is 0% or 100%, i.e. the answers of all reasoning paths through the node are all wrong or all correct.
Finally, all qualifying nodes form the node set $N = \{n_1, n_2, \ldots\}$. As shown in FIG. 3, a reasoning path spanning tree contains multiple reasoning paths with correct or wrong results, and under the above screening conditions the bold nodes can be selected into the training sample set.
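The $r$ computation and the two screening conditions can be sketched as follows, reusing the Node class from the earlier sketch; the assumption here is that each leaf carries an answer_correct flag in {0, 1}, set when the complete path's extracted answer is graded against the gold answer.

    def compute_r(root):
        """Accumulate, for every node, how many root-to-leaf reasoning paths
        pass through it (total) and how many of those end in a correct
        answer (correct); r_i = correct / total is the formula's ratio."""
        def walk(node, path):
            path.append(node)
            if node.is_leaf:
                for n in path:                      # beta(n, S_j) = 1 for nodes on the path
                    n.total += 1
                    n.correct += node.answer_correct
            for child in node.children:
                walk(child, path)
            path.pop()
        walk(root, [])

    def screen(root):
        """Keep interior nodes whose r is exactly 0 or 1 as (feature, label) samples."""
        samples = []
        def walk(node, depth):
            if depth > 0 and not node.is_leaf and node.total > 0:
                r = node.correct / node.total
                if r in (0.0, 1.0):
                    samples.append((node.feature, 1 if r == 1.0 else -1))
            for child in node.children:
                walk(child, depth + 1)
        walk(root, 0)
        return samples

The mapping r = 1 to label +1 and r = 0 to label -1 matches the label set $y_n \in \{1, -1\}$ used for the classifier in step (4).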
In step (4), the support vector machine feature classifier $C(x)$ is trained. As shown in FIG. 4, a training data set is first prepared: the qualifying node set screened in step (3) is used to construct the training set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where $x_n$ is the feature of the node's attention weight matrix after average pooling over the layer dimension and linear interpolation over the sequence-length dimension, and $y_n \in \{1, -1\}$ is the label indicating a correct or wrong reasoning result. Next, the kernel function is selected and defined: a radial basis function kernel maps the data to a high-dimensional feature space, and the optimal parameter values are determined by cross-validation. The support vector machine takes the maximized Lagrangian dual problem as its objective function, and the quadratic programming problem is decomposed into a series of solvable sub-problems by the sequential minimal optimization algorithm. This reduces computational complexity and improves the training efficiency of the support vector machine on large-scale data sets, yielding the Lagrange multipliers.
The resulting final classification decision function is:
$$C(x) = \operatorname{sign}\left(\sum_i \alpha_i y_i K(x_i, x) + b\right)$$
Given a new input sample $x$, the similarity of this sample to every training sample $x_i$ is computed by the kernel function $K(x_i, x)$, and these similarities are weighted and summed, where $\alpha_i y_i$ comes from the Lagrange multipliers learned during training and the labels of the training samples. Finally the bias term $b$ is added, and the classification result is obtained through the sign function. If the result is positive, the sample is classified as the positive class (+1), corresponding to a node that acts positively on the reasoning result; if negative, as the negative class (-1), corresponding to a node that acts negatively on the reasoning result.
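A minimal training sketch using scikit-learn is given below; the patent does not mandate a particular library, and scikit-learn's SVC is chosen here because its libsvm backend solves the dual problem with an SMO-style algorithm, matching the training procedure described above. The grid values are illustrative assumptions.

    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def train_feature_classifier(samples):
        """samples: list of (feature_matrix, label) pairs from screen().
        Flatten the (l, l) feature matrices, standardize, and fit an
        RBF-kernel SVM with cross-validated hyperparameter selection."""
        X = np.stack([f.ravel() for f, _ in samples])
        y = np.array([label for _, label in samples])
        pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        grid = GridSearchCV(                      # cross-validation over C and gamma
            pipe,
            {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.001]},
            cv=5,
        )
        grid.fit(X, y)
        return grid.best_estimator_               # C(x) = sign(sum_i a_i y_i K(x_i, x) + b)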
In step (5), the solution stage is performed on the test set of the mathematical problem set and mainly comprises four parts: example selection, inference by the pre-trained language model, intervention of the feature classifier trained in step (4) in the inference process, and finally extraction of the generated answer. Specifically:
First, example selection: for the problem $q_i$ of the mathematical problem set to be reasoned about, two problem examples $d_a$ and $d_b$ are randomly selected as the context of the chain-of-thought prompt.
Second, inference by the pre-trained language model: using the inference generation logic of step (2), the two examples are each combined with the problem as input to the pre-trained language model, and the two reasoning paths are compared. If the texts currently generated under the prompts of examples $d_a$ and $d_b$ are the same, the reasoning continues; if they differ, the feature classifier $C$ is introduced to judge and select. The attention weight matrices used to generate the current text are computed for both, and layer-dimension average pooling and sequence-length linear interpolation are applied, giving the feature matrix $A_a$ under example $d_a$ and $A_b$ under example $d_b$. The feature matrices are classified by the feature classifier $C(x)$ (+1: positive class, -1: negative class); the classification result falls into one of the following three cases, handled respectively as follows:
1. $C(A_a) = C(A_b) = 1$: for the current text, both play a positive role in the correctness of the generated reasoning path; in this case the text whose value inside the sign function of classifier $C$ is larger is selected as the current text prediction.
2. $C(A_a) = -1, C(A_b) = 1$ or $C(A_a) = 1, C(A_b) = -1$: for the current text, the two examples play a positive and a negative role respectively in the correctness of the generated reasoning path; in this case the text with $C(x) = 1$ is selected as the current text prediction.
3. $C(A_a) = C(A_b) = -1$: for the current text, both play a negative role in the correctness of the generated reasoning path; in this case the text whose value inside the sign function of classifier $C$ is smaller is selected as the current text prediction.
The reasoning process and the answer to the mathematical problem are finally obtained according to the above rules.
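The three-case selection rule can be sketched as follows; the signed value of the SVM decision function plays the role of "the value inside the sign function", and reading case 3 as taking the smaller decision value follows the text literally (an alternative reading would take the value closer to the decision boundary).

    def choose_token(clf, token_a, feat_a, token_b, feat_b):
        """Pick the next token when the two example prompts disagree.
        clf: trained classifier from train_feature_classifier();
        feat_a, feat_b: f(A, l) feature matrices for the two candidates."""
        s_a = clf.decision_function([feat_a.ravel()])[0]   # value inside sign() for d_a
        s_b = clf.decision_function([feat_b.ravel()])[0]
        c_a = 1 if s_a >= 0 else -1
        c_b = 1 if s_b >= 0 else -1
        if c_a == 1 and c_b == 1:        # case 1: both positive, take the larger value
            return token_a if s_a >= s_b else token_b
        if c_a != c_b:                   # case 2: take the positively classified side
            return token_a if c_a == 1 else token_b
        return token_a if s_a <= s_b else token_b  # case 3: both negative, take the smaller value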
The invention also includes a chain-of-thought reasoning mathematical problem solving system based on a feature classifier, used for implementing the above method and comprising:
a chain-of-thought prompt generation module, used for acquiring the mathematical reasoning problem to be processed and selecting qualifying examples and instructions according to the problem to form the complete input of the pre-trained language model;
a reasoning path tree generation and feature screening module, used for generating word-level reasoning path spanning trees and screening out the qualifying feature training data set;
a feature classifier training module, used for training the feature classifier: based on the feature training data set, a support vector machine algorithm is introduced, a radial basis function is used as the kernel function to map the data to a high-dimensional feature space, and the sequential minimal optimization algorithm improves SVM training efficiency;
a feature-classifier-guided reasoning module, used for solving the mathematical reasoning problem to be processed: through the intervention of the feature classifier, the pre-trained language model is guided to select a more accurate reasoning path, yielding the final mathematical problem reasoning process and answer.
The foregoing embodiments describe the technical solution and advantages of the invention in detail. It should be understood that the foregoing are merely specific embodiments of the invention and are not intended to limit it; any modifications, additions, and equivalent substitutions made within the scope of the principles of the invention shall be included in the protection scope of the invention.

Claims (8)

1. A chain-of-thought reasoning mathematical problem solving method based on a feature classifier, characterized by comprising the following steps:
(1) Obtain a mathematical problem set $Q = \{q_1, q_2, \ldots, q_n\}$; for each element $q_i$, use the chain-of-thought prompt generation method to find two corresponding examples $d_i^+$ and $d_i^-$ such that $d_i^+$ yields a correct answer and $d_i^-$ yields a wrong answer, forming a new set $Q' = \{q'_1, q'_2, \ldots, q'_n\}$, where $q'_i = (q_i, d_i^+, d_i^-)$;
(2) Based on the composed new set $Q'$, input the pre-trained language model using the chain-of-thought prompting method and construct word-level reasoning path spanning trees, each node $n_i$ storing the attention weight matrix $A_i$ after average pooling and interpolation operations;
(3) Traverse all reasoning path spanning trees generated in step (2) and screen the qualifying nodes to construct the feature classifier training set; with the attention weight matrix $A_i$ of a node as the feature and the reasoning path result as the label, construct the training data sample pair set $D$;
(4) Train the feature classifier $C$ with a support vector machine algorithm on the training set generated in step (3);
(5) In the inference stage of the pre-trained language model, for the target mathematical problem $q$ to be solved, randomly select two examples $d_a$ and $d_b$ from the data set and reason under both simultaneously; if the reasoning paths are the same, continue predicting and generating until the end; if the reasoning paths differ, let the feature classifier $C$ trained in step (4) select the word with the higher probability of being correct and continue reasoning; finally obtain the complete reasoning process and extract the corresponding answer.
2. The chain-of-thought reasoning mathematical problem solving method based on a feature classifier according to claim 1, characterized in that in step (1) the two corresponding examples $d_i^+$ and $d_i^-$ are found using the chain-of-thought prompt generation method as follows:
Let $p_\theta$ denote the probability distribution fitted by a pre-trained large language model with parameters $\theta$, and let $\{d_1, \ldots, d_{N_d}\}$ denote $N_d$ chain-of-thought prompts, where $d_i = (q_i, r_i, a_i)$ is the $i$-th example and $q_i, r_i, a_i$ denote the question, the reasoning steps, and the answer, respectively; if $q$ denotes the question to be reasoned about, the few-shot chain of thought FS-CoT is defined as:
$$y \sim p_\theta(y \mid d_1, \ldots, d_{N_d}, q)$$
where $y = \{r, a\}$ is the generated reasoning step $r$ and reasoning result $a$; only the case where the value of $N_d$ is 1 is considered; for each problem $q_i$, the mathematical problem set is traversed to find corresponding examples $d_i^+$ and $d_i^-$ such that the maximum-probability answer $a_i^+$ under $d_i^+$ is correct while the maximum-probability answer $a_i^-$ under $d_i^-$ is wrong; if no corresponding example is found, the problem is excluded; finally, combining each problem with its two corresponding examples yields the new set $Q' = \{q'_1, q'_2, \ldots, q'_m\}$, $m \le n$, where $q'_i = (q_i, d_i^+, d_i^-)$.
3. The chain-of-thought reasoning mathematical problem solving method based on a feature classifier according to claim 1, characterized in that the specific process of step (2) is:
For all elements in the new set $Q'$, obtain the input text of the pre-trained language model by concatenating $(d_i^+, q_i)$ and $(d_i^-, q_i)$; freeze all parameters of the model and use the text as input of the pre-trained language model for forward inference;
During inference, $q'_i$ generates a corresponding reasoning path tree $G_i$; each node $n_j$ contains three classes of attributes $(f(A_j, l), x_j, r_j)$, where $x_j$ is the text of the current node, $r_j$ is the accuracy of all reasoning paths through node $n_j$, and $A_j$ is the attention weight matrix of the model when predicting $x_j$; since the pre-trained language model uses multi-head masked self-attention and the input sequence lengths differ, the function $f(\cdot)$ is used to transform and process the dimensions of the multi-head attention weight matrix;
The reasoning path tree is constructed as follows: first construct an empty node $n_{root}$ as the root node of the reasoning path tree, marked as a non-leaf node; while building the tree, among the end nodes of each path of the tree, find one non-leaf node, denoted $n_i$; the path from root node $n_{root}$ to node $n_i$ represents a reasoning path $S = (x_1, \ldots, x_i)$; then determine the child nodes of $n_i$; specifically, input $(d^+, q, x_1, \ldots, x_i)$ and $(d^-, q, x_1, \ldots, x_i)$ into the pre-trained language model separately; when the first $i$ tokens are input, the language model predicts according to the following probability:
$$p_\theta(x_{i+1}) = p_\theta(x_{i+1} \mid x_1, \ldots, x_i)$$
There are two cases for $x_{i+1}$:
First, the texts generated in this round under the two prompts are the same, i.e. $x^+_{i+1} = x^-_{i+1}$; then node $n_i$ adds one child node $n_{i+1}$ with the text attribute set to $x^+_{i+1}$; if $x^+_{i+1}$ is a sentence terminator, mark the child node as a leaf node;
Second, the texts generated in this round under the two prompts differ, i.e. $x^+_{i+1} \neq x^-_{i+1}$; then insert left and right child nodes $n^+_{i+1}$ and $n^-_{i+1}$ for node $n_i$, with text attributes set to $x^+_{i+1}$ and $x^-_{i+1}$ respectively; if $x^+_{i+1}$ is a sentence terminator, mark child node $n^+_{i+1}$ as a leaf node; if $x^-_{i+1}$ is a sentence terminator, mark child node $n^-_{i+1}$ as a leaf node;
And so on, until the end node of every path of the tree is a leaf node; finally, all elements in the new set $Q'$ generate their corresponding reasoning path trees, giving the tree set $G = \{G_1, G_2, \ldots, G_m\}$.
4. The chain-of-thought reasoning mathematical problem solving method based on a feature classifier according to claim 1, characterized in that in step (3) the qualifying nodes are screened to construct the feature classifier training set as follows:
First, the tree set $G$ generated in step (2) contains $m$ trees, one reasoning path spanning tree for each problem in the mathematical problem set $Q'$, and each node in a tree contains three attributes, $n_i = (f(A_i, l), x_i, r_i)$: the feature matrix, the text, and the accuracy of all reasoning paths through node $n_i$; before screening training set samples, the $r_i$ attribute of each node must be computed; for a reasoning path spanning tree, each path from the root node to a leaf node constitutes one reasoning path, and each reasoning path has a unique attribute $a$ indicating the correctness of its result, where 1 indicates correct and 0 indicates wrong; this yields the following formula:
$$r_i = \frac{\sum_j \beta(n_i, S_j)\, a_j}{\sum_j \beta(n_i, S_j)}$$
where $\beta(n_i, S_j)$ indicates whether node $n_i$ belongs to reasoning path $S_j$, taking the value 1 if it does and 0 otherwise; the reasoning path accuracy of all nodes of all trees in the tree set $G$ is computed according to the above formula; nodes satisfying the following conditions are then screened out as the training data set for the feature classifier:
① the node is neither the root node nor a leaf node;
② the node's $r$ attribute is 0% or 100%, i.e. the answers of all reasoning paths through the node are all wrong or all correct;
all qualifying nodes form the node set $N = \{n_1, n_2, \ldots\}$.
5. The chain-of-thought reasoning mathematical problem solving method based on a feature classifier according to claim 1, characterized in that the specific process of step (4) is:
First, extract the features of each element of the training data set $D$ as input, take the reasoning path result as the label, and standardize the data; then map the data to a high-dimensional feature space using a radial basis kernel function, train the support vector machine classification model with the maximized Lagrangian dual problem as the objective function, evaluate model performance with cross-validation, and tune the hyperparameters as needed; finally, save the trained support vector machine model for subsequent inference; the resulting feature classifier $C$ is:
$$C(x) = \operatorname{sign}\left(\sum_i \alpha_i y_i K(x_i, x) + b\right)$$
where $x$ is a new input sample; the kernel function $K(x_i, x)$ computes the similarity of the sample to every training sample $x_i$, and these similarities are weighted and summed, with $\alpha_i y_i$ coming from the Lagrange multipliers learned during training and the labels of the training samples; finally the bias term $b$ is added, and the classification result is obtained through the sign function.
6. The chain-of-thought reasoning mathematical problem solving method based on a feature classifier according to claim 1, characterized in that in step (5), if the reasoning paths differ, the feature classifier $C$ trained in step (4) selects the word with the higher probability of being correct to continue the reasoning, specifically comprising:
Compute the attention weight matrices used by the two to generate the current text, and apply layer-dimension average pooling and sequence-length linear interpolation to obtain the feature matrix $A_a$ under example $d_a$ and the feature matrix $A_b$ under example $d_b$; the feature matrices obtain classification results through the feature classifier $C(x)$; the classification result falls into one of the following three cases, handled respectively as follows:
① $C(A_a) = C(A_b) = 1$, i.e. for the current text both play a positive role in the correctness of reasoning path generation; in this case the text whose value inside the sign function of feature classifier $C$ is larger is selected as the current text prediction;
② $C(A_a) = -1, C(A_b) = 1$ or $C(A_a) = 1, C(A_b) = -1$, i.e. for the current text the two examples play a positive and a negative role respectively in the correctness of the generated reasoning path; in this case the text with $C(x) = 1$ is selected as the current text prediction;
③ $C(A_a) = C(A_b) = -1$, i.e. for the current text both play a negative role in the correctness of reasoning path generation; in this case the text whose value inside the sign function of feature classifier $C$ is smaller is selected as the current text prediction.
7. A chain-of-thought reasoning mathematical problem solving system based on a feature classifier, characterized by comprising:
a chain-of-thought prompt generation module, used for acquiring the mathematical reasoning problem to be processed and selecting qualifying examples and instructions according to the problem to form the complete input of the pre-trained language model;
a reasoning path tree generation and feature screening module, used for generating word-level reasoning path spanning trees and screening out the qualifying feature training data set;
a feature classifier training module, used for training the feature classifier: based on the feature training data set, a support vector machine algorithm is introduced, a radial basis function is used as the kernel function to map the data to a high-dimensional feature space, and the sequential minimal optimization algorithm improves the training efficiency of the support vector machine;
a feature-classifier-guided reasoning module, used for solving the mathematical reasoning problem to be processed: through the intervention of the feature classifier, the pre-trained language model is guided to select a more accurate reasoning path, yielding the final mathematical problem reasoning process and answer.
8. A chain-of-thought reasoning mathematical problem solving system based on a feature classifier, characterized by comprising a memory and one or more processors, wherein the memory stores executable code, and the one or more processors, when executing the executable code, implement the chain-of-thought reasoning mathematical problem solving method according to any one of claims 1 to 6.
CN202411479942.2A 2024-10-23 2024-10-23 A method and system for solving mathematical problems based on feature classifier-based thought chain reasoning Active CN119443267B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411479942.2A CN119443267B (en) 2024-10-23 2024-10-23 A method and system for solving mathematical problems based on feature classifier-based thought chain reasoning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411479942.2A CN119443267B (en) 2024-10-23 2024-10-23 A method and system for solving mathematical problems based on feature classifier-based thought chain reasoning

Publications (2)

Publication Number Publication Date
CN119443267A true CN119443267A (en) 2025-02-14
CN119443267B CN119443267B (en) 2025-11-14

Family

ID=94529754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411479942.2A Active CN119443267B (en) 2024-10-23 2024-10-23 A method and system for solving mathematical problems based on feature classifier-based thought chain reasoning

Country Status (1)

Country Link
CN (1) CN119443267B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230244938A1 (en) * 2022-02-02 2023-08-03 Google Llc Using Chains of Thought to Prompt Machine-Learned Models Pre-Trained on Diversified Objectives
CN116595159A (en) * 2023-07-18 2023-08-15 深圳须弥云图空间科技有限公司 Mathematical question answering model training method and device
CN117669536A (en) * 2023-10-19 2024-03-08 西华大学 A deep fine-tuning method for generating logical reasoning thinking chains in Chinese text
CN117892818A (en) * 2024-03-18 2024-04-16 浙江大学 Large language model rational content generation method based on implicit thinking chain
CN118333153A (en) * 2024-04-23 2024-07-12 之江实验室 A flexible thinking chain learning method, device and medium based on large language model
CN118467737A (en) * 2024-06-05 2024-08-09 北京邮电大学 Event extraction method based on large language model to generate thought chain explanation for training
CN118747530A (en) * 2024-06-07 2024-10-08 天津汲智科技有限公司 A dual-model reflective learning method based on thinking chain correction and adaptive screening
CN118709786A (en) * 2024-08-06 2024-09-27 西安家育宝智能科技有限公司 Reasoning method and device of large language model based on thinking tree
CN118690877A (en) * 2024-08-23 2024-09-24 阿里巴巴(中国)有限公司 Natural language model generation method and data processing method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
THIEN-LOC HA: "Auto Graph of Thoughts: A Hands-free and Cost Effective Method for using Graph of Thoughts", Proceedings of the 2024 10th International Conference on Computer Technology Applications, 26 August 2024 (2024-08-26) *
LUO HUANKUN: "Research Progress of Large Language Models in Mathematical Reasoning" (大语言模型在数学推理中的研究进展), Computer Engineering (计算机工程), vol. 50, no. 9, 30 September 2024 (2024-09-30) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119621974A (en) * 2025-02-17 2025-03-14 科大讯飞股份有限公司 A test question label prediction method, device, storage medium and equipment
CN119621974B (en) * 2025-02-17 2025-04-11 科大讯飞股份有限公司 Test question label prediction method, device, storage medium and equipment
CN119808966A (en) * 2025-03-13 2025-04-11 浙江吉利控股集团有限公司 Intelligent decision-making method, computer device and computer-readable storage medium
CN119808967A (en) * 2025-03-14 2025-04-11 北京博大网信股份有限公司 Visual thinking operating system and mathematical reasoning method for mathematical reasoning
CN120336491A (en) * 2025-06-17 2025-07-18 南京信息工程大学 An automatic thought chain prompt generation method based on black box optimization and vulnerability quantification

Also Published As

Publication number Publication date
CN119443267B (en) 2025-11-14

Similar Documents

Publication Publication Date Title
Cheng et al. A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
CN119443267B (en) A method and system for solving mathematical problems by feature-classifier-based thought-chain reasoning
CN114969278B (en) A text question answering model based on knowledge-enhanced graph neural network
CN112132179A (en) Incremental learning method and system based on a small number of labeled samples
CN111985245A (en) Relation extraction method and system based on attention loop-gated graph convolutional network
CN112069310A (en) Text classification method and system based on active learning strategy
CN118132674A (en) Text information extraction method based on large language model and high-efficiency parameter fine adjustment
CN118041744A (en) Fault diagnosis method for power backbone communication network based on knowledge graph reasoning
Zhou et al. On the opportunities of green computing: A survey
CN117407532A (en) Method for enhancing data by using large model and collaborative training
Gong et al. Margin based PU learning
Zheng et al. Named entity recognition: A comparative study of advanced pre-trained model
Pereira et al. Neural architecture search with interpretable meta-features and fast predictors
Vento et al. Traps, pitfalls and misconceptions of machine learning applied to scientific disciplines
CN118195032B (en) Large model automatic evolution system and method with active learning capability
Lee et al. Can we utilize pre-trained language models within causal discovery algorithms?
CN118606469A (en) Multi-classification prediction method for intangible cultural heritage text based on multi-head attention and semantic features
CN119106118A (en) A text question answering method and system in the field of urban scene safety
Lekan et al. Comparison of Neural Networks with Traditional Machine Learning Models (e.g., XGBoost, Random Forest)
Akinci et al. Machine Learning Methods from Shallow Learning to Deep Learning
Anireh et al. HTM-MAT: An online prediction software toolbox based on cortical machine learning algorithm
Katib et al. Question Answering System Based on Bidirectional Long Short-Term Memory (BiLSTM)
Zheng et al. Thoughtful and cautious reasoning: A fine-tuned knowledge graph-based multi-hop question answering framework
CN118210907B (en) A method for constructing intelligent question answering model based on vectorized representation of knowledge graph
Argyrou Clustering hierarchical data using self-organizing map: a graph-theoretical approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant