CN108008999B

CN108008999B - Index evaluation method and device

Info

Publication number: CN108008999B
Application number: CN201610944177.6A
Authority: CN
Inventors: 程捷; 刘旭东; 项定义
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2016-11-02
Filing date: 2016-11-02
Publication date: 2021-07-20
Anticipated expiration: 2036-11-02
Also published as: CN108008999A

Abstract

The embodiment of the present application provides an indicator evaluation method and device, the method includes: acquiring an instruction stream of a test program and dividing it into N instruction segments of equal length; determining a feature matrix A and a mean column vector corresponding to the N instruction segments B and weight vector W; determine the target optimization model of the test program according to the feature matrix A, the column vector B, the weight vector W and multiple optimization objectives, and solve the optimization column vector of the target optimization model; determine the optimization column The number vector and coefficient vector of the instruction segment corresponding to the vector; the instruction segment corresponding to the number vector is simulated in the simulation device respectively, and the feature index vector of the instruction segment corresponding to the number vector is obtained; the feature index vector and the coefficient vector are determined. The inner product of is the characteristic index for evaluating the test program.

Description

Index evaluation method and device

Technical Field

The embodiment of the application relates to the field of computers, in particular to an index evaluation method and device.

Background

Personnel involved in the design and development of processor architectures often need to run test programs in an emulator of a certain architecture and then collect performance-related data indicators, such as Instruction Per Clock (IPC), hit rate of a second level Cache (L2 Cache), energy consumption, etc., to find the bottleneck of the current processor architecture. After the system architecture is improved, the design is redeployed in the simulator, the simulator is used again to run the test program, data are collected, the performance difference of the same test program which runs under the new and old system architectures is compared, then the bottleneck is found again. It can be seen that a great deal of design test work is done with a software simulator before deployment in hardware.

However, one of the major disadvantages of the software simulation platform is: the same test program is run at a much longer runtime than the hardware platform. Especially when running large, comprehensive test suite programs, such as SPEC CPU 2006, it often takes weeks or even months to get the data needed by itself. And after the system architecture is changed every time, the test program needs to be operated again to collect data under the new architecture. Therefore, such repeated operations and waits will seriously affect the development efficiency.

Research shows that the running process of the application program has obvious stage characteristics. The developer needs to shorten the time required by simulation from the perspective of simplifying test programs, and the core thought of the method is as follows: a user can obtain a group of artificial programs only by running the original test program once, each artificial program is very small and can be simulated quickly, and the time for simulating the group of artificial programs is far shorter than that for simulating the original test program; and (3) multiplying the data obtained after each artificial program simulation by a specific weight by a user, wherein the data obtained after weighted averaging is very close to the data obtained after the original artificial program runs.

In the existing Simpoint technology, the steps are as follows: grabbing an instruction stream of a test program component, cutting the instruction stream into segments (intervals) with equal length, and calculating a Basic Block Vector (BBV) of each Interval; performing K-Means clustering on all BBVs, and selecting an Interval closest to the center of each type as a representative Interval of the type; and simulating and testing the indexes of the selected representative intervals, and multiplying the indexes measured by the representative intervals of each category by the proportion of the Interval of the category to the total Interval to calculate a weighted average to represent the indexes of the original test program components. The simulation test in the prior art needs too many selected fragments, long simulation time and uncontrollable error.

How to reduce the instruction fragments selected in the simulation test to shorten the simulation time and/or reduce the error of the simulation index is a technical problem to be solved by the embodiment of the present application.

Disclosure of Invention

The embodiment of the application provides an index evaluation method and device, which can reduce instruction segments selected in simulation test to shorten simulation time and reduce simulation index errors, so that the simulation efficiency or precision of a processor can be improved.

In a first aspect, an index evaluation method is provided, including:

acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;

determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;

determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction fragments for simulation and/or reducing an error between a feature index for evaluating the test program and the feature index of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;

determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used for indicating the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for indicating the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;

simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;

and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.

In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.

With reference to the first aspect, in a first possible implementation manner, the optimization objective is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimization column vector is the value of X meeting the optimization target.

The method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.

With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following goal optimization formula:

With reference to the first aspect, in a third possible implementation manner of the first aspect, the optimization objective is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.

The method of the embodiment of the application can reduce the simulation index error on the premise of controlling the number of the instruction segments required by the simulation work of the test program.

With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, the determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following goal optimization formula:

With reference to the first aspect, in a fifth possible implementation manner of the first aspect, the optimization objective is: the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a first norm of X

Is the product of a third coefficient and a second norm of

A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,

a column vector representing the average of the rows of the low rank matrix L,

and the column vector is a column vector formed by the average values of all rows of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.

The method of the embodiment of the application can achieve certain balance between reduction of the number of the instruction fragments required by simulation work of the test program and reduction of simulation index errors.

With reference to the first aspect, in a sixth possible implementation manner of the first aspect, the optimization objective is: in that

And/or the eighth norm of X is less than or equal to the second threshold, the sum of the first product, the second product, the third product, the fourth product and the fifth product is minimized, wherein the first product is the product of the first coefficient and the first norm of X, and the second product is the product of the second coefficient and the first norm of X

Is the product of a third coefficient and a second norm of

a column vector representing the average of the rows of the low rank matrix L,

and a column vector formed by the average values of the rows of the sparse matrix S, wherein the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero.

The method of the embodiment of the application can balance reduction of the number of the instruction segments required by simulation work of the test program and reduction of simulation index errors on the premise of ensuring precision of the simulation index errors and/or controlling the number of the instruction segments required by the simulation work of the test program.

With reference to the first aspect and the foregoing implementation manner, in a seventh possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal is specifically implemented as:

step 1: establishing a target optimization model of the optimization target according to the characteristic matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;

step 2: setting X, Y, Z and initial values of μ, and setting the optimization target as a convergence condition for the optimization completion;

and step 3: calculating an optimal value of Y by using a least square method;

and 4, step 4: calculating the optimal value of X by using a soft threshold method;

and 5: when the value of X does not satisfy the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X satisfies the constraint condition of the target optimization model;

step 6: updating a Lagrange multiplier Z according to a residual error of an equation X ═ Y, and increasing a penalty coefficient mu by a fixed multiple rho, wherein rho is greater than 1;

and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the step 3-6.

With reference to the first aspect and the foregoing implementation manner of the first aspect, in an eighth possible implementation manner of the first aspect, reducing one non-zero element in X and replacing X is specifically implemented as: and on the premise that X is reduced by one non-zero element, when X formed by reducing the first non-zero element in X is used for minimizing the function value of the target optimization model, reducing the first non-zero element and replacing X.

With reference to the first aspect and the foregoing implementation manner of the first aspect, in a ninth possible implementation manner of the first aspect, the characteristic index includes at least one of the following: IPC (instruction per clock), success rate of predicted branch, failure rate of predicted branch, hit rate of L2 Cache and energy consumption.

In a second aspect, an index evaluation apparatus is provided, including:

the device comprises an acquisition unit, a processing unit and a control unit, wherein the acquisition unit is used for acquiring an instruction stream of a test program and dividing the instruction stream into N instruction fragments with equal length;

a first determining unit, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB of each row of the feature matrix a in a test program;

a second determining unit, configured to determine an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization target, where the optimization target includes reducing the number of instruction fragments used for simulation and/or reducing an error between a feature index of the test program and a feature index of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;

a third determining unit, configured to determine a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction segment corresponding to the non-zero position in the optimized column vector;

the simulation unit is used for simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain the characteristic index vectors of the instruction segments corresponding to the number vectors, and the characteristic index vectors are used for expressing the characteristic indexes of the instruction segments corresponding to the number vectors;

and the fourth determination unit is used for determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.

With reference to the second aspect, in a first possible implementation manner, the optimization goal is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimization column vector is the value of X meeting the optimization target.

With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector by using the following target optimization formula:

With reference to the second aspect, in a third possible implementation manner of the second aspect, the optimization goal is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.

With reference to the third possible implementation manner of the second aspect, in a fourth possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector through the following target optimization formula:

wherein d represents the second threshold, | X | | non-volatile memory_t1Represents a first norm of X, | | W.AX-W.B | | tormentum_t2Represents a second norm W.AX-W.B,||X||_t3represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, λ ≠ 0.

With reference to the second aspect, in a fifth possible implementation manner of the second aspect, the optimization goal is: the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a first norm of X

Is the product of a third coefficient and a second norm of

a column vector representing the average of the rows of the low rank matrix L,

With reference to the second aspect, in a sixth possible implementation manner of the second aspect, the optimization goal is: a sixth norm at W.AX-W.B or a seventh norm at AX-B is less than or equal to a first threshold, and/or an eighth norm of X is less than or equal to a second normWhen the threshold value is used, the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the sum of the first product, the second product, the third product, the fourth product and the fifth product is minimized, the first product is the product of a first coefficient and a first norm of X, and the second product is the product of a second coefficient and a first norm of X

Is the product of a third coefficient and a second norm of

a column vector representing the average of the rows of the low rank matrix L,

With reference to the second aspect and the foregoing implementation manner of the second aspect, in a seventh possible implementation manner of the second aspect, the second determining unit is specifically implemented as:

and step 3: calculating an optimal value of Y by using a least square method;

With reference to the second aspect and the foregoing implementation manner of the second aspect, in an eighth possible implementation manner of the second aspect, reducing one non-zero element in X and replacing X is specifically implemented as: and on the premise that X is reduced by one non-zero element, when X formed by reducing the first non-zero element in X is used for minimizing the function value of the target optimization model, reducing the first non-zero element and replacing X.

With reference to the second aspect and the foregoing implementation manner of the second aspect, in a ninth possible implementation manner of the second aspect, the characteristic index includes at least one of the following: IPC (instruction per clock), success rate of predicted branch, failure rate of predicted branch, hit rate of L2 Cache and energy consumption.

In a third aspect, there is provided another index evaluation apparatus, including a processor and a channel interface, where the processor is configured to execute the method in the first aspect or any possible implementation manner of the first aspect through the channel interface.

In a fourth aspect, a computer-readable storage medium is presented for storing a computer program comprising instructions for performing the method of the first aspect or any possible implementation manner of the first aspect.

Based on the above technical solutions, the index evaluation method and apparatus in the embodiments of the present application construct a target optimization model of a test program based on a feature matrix, a column vector, and a weight vector of an instruction segment of the test program, determine an optimized column vector based on the target optimization model and an optimization target, determine an instruction segment and a coefficient vector corresponding to a non-zero position according to the optimized column vector, simulate the instruction segment corresponding to the optimized column vector in the simulation apparatus to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and obtain a feature index for evaluating the test program according to the feature index vector and the coefficient vector, thereby reducing the number of instruction segments required for simulation work of the test program or reducing a simulation index error of the test program.

Drawings

FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application.

FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application.

FIG. 3 is an interaction flow diagram of an index evaluation method according to an embodiment of the present application.

FIG. 4 is a schematic diagram of a target optimization model of an embodiment of the present application.

FIG. 5 is a flow diagram of a method of determining an optimized column vector according to one embodiment of the present application.

Fig. 6 is a schematic structural diagram of an index evaluation device according to an embodiment of the present application.

Fig. 7 is a schematic structural diagram of an index evaluation device according to another embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.

To facilitate understanding of the embodiments of the present application, several elements that will be introduced in the description of the embodiments of the present application are first introduced herein.

An instruction stream file: the file for recording the instruction stream information is called an instruction stream file, and each line of the instruction stream file represents information related to an executed instruction and conforms to a uniform format. Typically, the instruction stream file size is fixed. For example, an instruction stream file generally contains information such as 1 hundred million instructions, and a complete test program can be regarded as an instruction stream composed of a plurality of 1 hundred million instructions, each 1 hundred million instruction being referred to as an Interval (Interval), in other words, a complete test program is composed of a plurality of segments. The full set of test programs is all fragments and the subset of test programs is part of fragments. The purpose of simplifying the test program is to select representative partial segments from the corpus, and it is required that the number of the selected segments is as small as possible, and the operation results of the segments are as high as possible in similarity with the operation results of the original test program. An instruction may include the following information: program pointer, assembly instruction, operation type, and memory address. The memory address is a selectable item.

And (3) program pointer: the program pointer for each line of instructions is the address in memory of the line assembler instruction, which is a hexadecimal number beginning with "0 x".

Assembling instructions: the binary instruction code of the instruction needs to meet assembly syntax requirements.

The operation type is as follows: all assembly instructions can be divided into three categories: the arithmetic logic unit operates, reads the memory, writes the memory and controls the instruction.

Memory address: if the instruction is an arithmetic logic unit operation, no memory address information is needed; if the instruction is a memory read-write operation, a memory address is required.

Basic Block (BB): a piece of sequentially executed instructions. In general, an instruction stream may be divided into a plurality of BBs with a control instruction as a boundary. Each segment of the test program is composed of BB.

Basic Block feature indicator Vector (BBV): according to different control instructions, the execution times of BB of different types in each segment are counted, and a vector constructed based on the types and the execution times of the BB is called BBV. FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application. As shown in the instruction code fragment of FIG. 1, if the type ID of BB is {1,2,3,4,5}, and the corresponding execution times is {1,20,0,5,0}, then BBV of the fragment can be recorded as {1:1,2:20,3:0,4:5,5:0 }.

FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application. The method of FIG. 2 is performed by an index evaluation device of a processor. Specifically, the index evaluation device may be a host apparatus or the like.

An instruction stream of a test program is obtained and divided into N instruction fragments of equal length 201.

202, determining a feature matrix a, a mean column vector B, and a weight vector W corresponding to the N instruction fragments.

The feature matrix a is a matrix with M rows and N columns for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature index vectors of each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indexes of the test program by types of the basic blocks BB corresponding to each row of the feature matrix a.

Specifically, in the embodiment of the present application, the characteristic index may be any one of the following: instruction Per Clock (IPC), predicted branch success/failure, L2 Cache hit, energy consumption, etc.

203, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W and an optimization goal.

The optimization target comprises reducing the number of instruction segments used for simulation and/or reducing errors between characteristic indexes of the evaluation test program and characteristic indexes of the test program, and the optimization column vector is used for representing instruction segments used for simulation in the N instruction segments and weights corresponding to the instruction segments used for simulation.

Specifically, for example, an objective optimization model may be established according to the feature matrix a, the column vector B, the weight vector W and an optimization objective, the objective optimization model is used to represent the optimization objective through a relationship between A, B, W and an optimization variable X of the objective optimization model, and the optimization column vector is a value of the optimization variable X satisfying the constraint relationship.

And 204, determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector.

The number vector is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector.

205, the instruction segments corresponding to the number vectors are simulated in the simulation device to obtain the feature index vectors of the instruction segments corresponding to the number vectors.

The feature indicator vector is used to indicate the feature indicator of the instruction segment corresponding to the number vector.

And 206, determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.

Optionally, as an embodiment, the optimization objective is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimization column vector is the value of X meeting the optimization target.

It should be understood that in the embodiments of the present application, the firstThe norm, the second norm, the third norm, etc. may be any one of the following norms: 0 norm, 1 norm,

Norm, p-norm or nuclear norm, etc., where p is a positive integer. It should be understood that the first to eighth norms, etc. mentioned elsewhere in this application may also be any of the following norms: 0 norm, 1 norm,

Norm of

P norm | X | Y | of X non-woven grid_pAnd p is a positive integer.

Specifically, step 203 may determine the optimized column vector by equation 1:

wherein | | W.AX-W.B | | non-woven hair_t1Representing a first norm of W.AX-W.B, | | AX-B | | non-combustible cells_t2Represents a second norm of AX-B, | X | | non-volatile memory_t3C represents the first threshold value, to represent a third norm of X. It is understood that the operator symbol, represents an inner product, e.g., [ x ]₁,x₂,x₃]·[y₁,y₂,y₃]＝x₁ y₁+x₂ y₂+x₃ y₃。

Optionally, as another embodiment, the optimization objective is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.

Specifically, step 203 may determine the optimized column vector by equation 3:

It should be understood that, in the embodiment of the present application, the first norm of X and the third norm of X may be any one of the following norms: 0 norm of X | | non-woven phosphor ₀1 norm of X (| X | | non-woven phosphor) of X₁X of

Norm of

P norm | X | Y | of X non-woven grid_pAnd p is a positive integer. The first norm of X and the third norm of X may be the same or different。

It should be understood that in the embodiment of the present application, the second norm of W · AX-W · B may be any one of the following norms: W.AX-W.B has a 0 norm | | | W.AX-W.B | | luminance ₀1 norm of W.AX-W.B

||W·AX-W·B||₁W.AX-W.B

Norm of

It should be understood that, in the first coefficient and the second coefficient of the embodiments of the present application, at least 1 term is not 0. For example, λ ≠ 0.

it should be understood that, in the embodiment of the present application, the values of the first coefficient and the second coefficient are determined by the optimization objective. The values of the first coefficient and the second coefficient determine whether the optimization target is smaller instruction fragments for simulation or smaller simulation errors.

Optionally, as another embodiment, the optimization objective is: so that the sum of the first product, the second product, the third product, the fourth product and the fifth product is minimum, wherein the first product is a product of a first coefficient and a first norm of X, the second product is a product of a second coefficient and a second norm of LX-L, the third product is a product of a third coefficient and a third norm of SX-S, the fourth product is a product of a fourth coefficient and a fourth norm of L, the fifth product is a product of a fifth coefficient and a fifth norm of S, A is L + S, L is a low-rank matrix, S is a sparse matrix, L represents a column vector formed by an average value of each row of the low-rank matrix L, and S represents a column vector formed by an average value of each row of the sparse matrix S, wherein, X is an optimization variable used for representing instruction segments used for simulation among the N instruction segments and a weight corresponding to each instruction segment, and the optimized column vector is a value of X that satisfies the optimization objective.

Specifically, step 203 may specifically determine the optimized column vector by equation 5:

wherein | X | Y luminance_t1A first norm representing X is given,

to represent

Is measured in a first direction of the first norm,

to represent

Is a third norm, | L | | non-woven phosphor_t4Represents the fourth norm of L, | S | | non-woven phosphor_t5Denotes a fifth norm of S, λ denotes a first coefficient, δ denotes a second coefficient, α denotes a third coefficient, β denotes a fourth coefficient, and γ denotes a fifth coefficient.

It should be understood that at least one of the first coefficient, the second coefficient, the third coefficient, the fourth coefficient, and the fifth coefficient in the embodiments of the present application is not 0.

Specifically, when the first norm of X is 1 norm, the second norm of LX-L is 2 norms, the third norm of SX-S is 2 norms, the fourth norm of L is 1 norm, and the fifth norm of S is 1 norm, equation 5 can be expressed as shown in equation 6:

optionally, as an embodiment, the optimization objective is: when the sixth norm of W.AX-W.B or the seventh norm of AX-B is less than or equal to a first threshold value and/or the eighth norm of X is less than or equal to a second threshold value, the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and the first norm of X, and the second product is the product of a second coefficient and the first norm of X

Is the product of a third coefficient and a second norm of

a column vector representing the average of the rows of the low rank matrix L,

and representing a column vector formed by the average values of all rows of the sparse matrix S, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.

When the constraint of the optimization objective is that the sixth norm of W · AX-W · B or the seventh norm of AX-B is less than or equal to the first threshold, step 203 may determine the optimized column vector by equation 7:

wherein c represents a first threshold, | W.AX-W.B | | calcei_t6Represents the sixth norm of W.AX-W.B, | | AX-B | | non-combustible cells_t7Represents the seventh norm of AX-B, and the remaining individual parameters have the same meanings as the corresponding parameters in equation 5.

When the constraint condition of the optimization objective is that the sixth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 8:

wherein d represents a second threshold, | X | | non-volatile memory_t6Denotes the sixth norm of X, and the remaining respective parameters have the same meanings as the corresponding parameters in equation 5.

When the constraint of the optimization objective is that the sixth norm of W · AX-W · B or the seventh norm of AX-B is less than or equal to the first threshold and the eighth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 9:

wherein c represents a first threshold, d represents a second threshold, and | X | | magnetism is not present when a sixth norm of W.AX-W.B or a seventh norm of AX-B is less than or equal to the first threshold_t8Represents the eighth norm of X, and the remaining parameters have the same meanings as the corresponding parameters in equation 5.

Specifically, the detailed process of step 203 may be as follows:

and step 3: calculating an optimal value of Y by using a least square method;

and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the steps 3-7.

In a specific implementation manner, the reduction of one non-zero element in X and the replacement of X in step 5 can be specifically implemented as: the lagrange multiplier Z is updated according to the residual of equation X ═ Y and the penalty factor μ is increased by a fixed multiple ρ, where ρ > 1. Of course, it should be understood that there may be other specific implementation manners of step 5, for example, decreasing the non-zero elements in X and replacing X one by one in the order of decreasing the non-zero elements from small to large or from large to small until X satisfies the constraint condition of the target optimization model, and so on.

In order to facilitate understanding of the method of the embodiments of the present application, the method of the embodiments of the present application will be further described with reference to specific embodiments.

FIG. 3 is a flowchart illustrating evaluation of metrics by a processor according to an embodiment of the present disclosure. It should be understood that in the embodiment of the present application, BBV is used as the feature index vector of the test program. The method of the embodiment of the present application is executed by an index evaluation apparatus of a processor disposed on a host device, where the apparatus may be an analog system or an execution chip, and the like, and the embodiment of the present application is not limited herein.

301, an instruction stream is fetched.

In one particular example, the host device may use an instruction stream fetching software tool to fetch an instruction stream of the test program, the instruction stream being composed of instructions. The host device can segment the instruction stream into BBs according to the jump instruction, and thus the instruction stream can be seen as being composed of BB sequences.

302, the instruction stream is partitioned into instruction fragments, and the BBV of each instruction fragment is calculated.

The host device may divide the instruction stream acquired in step 301 into a plurality of instruction fragments (intervals). For example, the instruction stream may be divided into instruction fragments of equal length, and the number and entry address of each instruction fragment may be recorded. Specifically, for example, each instruction fragment contains 1 billion BB, and so on. Of course, the division may be performed according to other instruction lengths, for example, 2 hundred million BBs, 10 hundred million BBs, etc., and the embodiment of the present application is not limited herein.

It is not assumed that the number of divided instruction fragments is N.

303, construct a matrix A, B, W.

The host device may construct the feature matrix a according to the BBV of each instruction segment after the instruction stream is split. Where each row represents the BBV of one instruction fragment.

The host device may calculate a column vector B based on the average value of each row in the feature matrix a. Specifically, the value of each row in B can be expressed by the following equation 10:

wherein, i is more than or equal to 1 and less than or equal to N, and M represents the number of BB types in the instruction segment.

In addition, a weight vector W may be set according to the difference in weight of the type of BB in the characteristic index of the test program. W is the column vector. Wherein each value of W represents a weight of the corresponding type of BB in the characteristic index of the test program.

FIG. 4 is a schematic diagram of a target optimization model according to an embodiment of the present application. As shown in FIG. 4, the rows of the feature matrix A correspond to the types of BB, the columns correspond to the divided instruction fragments, the feature matrix A is a matrix of M rows × N columns, and the element A in A_i,jIndicates the j-th instructionThe characteristic index value of the ith BB in the segment. The sparse solution matrix X is an optimization variable X in the embodiment shown in fig. 2, a position number of a non-zero coefficient in X is used to indicate a number of a selected instruction segment, a value of the non-zero coefficient indicates a weight of a finally calculated characteristic index of a characteristic index of the instruction segment corresponding to the non-zero coefficient, and the sparse solution matrix X is an N row × 1 column matrix. The column mean matrix B of a is a column vector formed by the column means of each column in a, and the matrix B is a matrix of M rows × 1 column, and its calculation method is shown in equation 10. The weight matrix W is a matrix of M rows by 1 column, and the ith element W in the weight matrix_iAnd (3) representing the weight of the feature vector of the ith BB in the finally calculated feature index. As shown in fig. 4, X is a value satisfying AX · W — B · W. It is understood that AX.W and W.AX are equivalent; B.W and W.B are equivalent.

An optimized column vector is determined 304 based on A, B, W and the optimization objective.

Specifically, an objective optimization model may be constructed from A, B, W and the optimization objectives, and the optimization column vectors of the objective optimization model may be solved.

An optimization variable X can be introduced, wherein a non-zero element in X represents a selected instruction segment, and the value of the non-zero element in X represents the value of the weight corresponding to the instruction segment. That is, X is used to represent the instruction segment for simulation among the N instruction segments of the test program and the weight corresponding to each instruction segment.

It should be understood that in practical applications, W · AX and W · B will have a certain error, and the smaller the error, the smaller the simulation error; further, the smaller the number of non-zero coefficients in X, the fewer instruction fragments to be selected for the simulation test, and the shorter the time required for the simulation test. Thus, an objective optimization model may be constructed based on A, B, W according to the optimization objectives.

It should be appreciated that if the cost of the number of instruction fragments is high, then a higher weight may be placed on the optimization objective that affects the number of X non-zero coefficients. At this point, the optimization objective may include limiting the number of non-zero coefficients in X.

Specifically, for example, the number of non-zero coefficients in X is minimizedAnd (4) transforming. At this time, it can be achieved by minimizing the norm of X. The X norm may include the 0 norm | | | X | | luminance of X ₀1 norm of X (| X | | non-woven phosphor) of X₁X of

Or, for example, such that the 1 norm of X is less than d. At this time, | | X | | non-woven phosphor can be restricted₁≤d,d∈R⁺. Reducing non-zero numbers in X is achieved by constraining the 1-norm reduction of X to be less than a value that results in a reduction of non-zero numbers in X.

Of course, other constraint formulas may exist, and the embodiments of the present application are not listed here.

It should be appreciated that if certain simulation error accuracy is required, higher weights may be placed on the optimization objectives affecting W.AX-W.B.

Specifically, for example, the value of W · AX-W · B is minimized, or W · AX-W · B is smaller than d, or the norm of W · AX-W · B is smaller than d, or the like.

Based on the difference of the optimization objectives, different objective optimization models can be set.

For example, assume that the optimization goal of the target optimization model is to minimize the third norm of X when either the first norm of W.AX-W.B or the second norm of AX-B is less than or equal to the first threshold. At this time, the objective optimization model may be expressed by equation 1 above.

For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product and a second product when a first norm of X is smaller than or equal to a second threshold, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number. At this time, the target optimization model may be expressed by equation 3 above.

As another example, assume that the optimization objective of the objective optimization model is such that the first product,The sum of a second product, a third product, a fourth product and a fifth product is minimum, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a fifth product

Is the product of a third coefficient and a second norm of

A fourth product is a product of a fourth coefficient and a fourth norm of L, a fifth product is a product of a fifth coefficient and a fifth norm of S, a is L + S, L is a low-rank matrix, S is a sparse matrix, L represents a column vector formed by an average value of each row of the low-rank matrix L, and S represents a column vector formed by an average value of each row of the sparse matrix S. At this time, the target optimization model can be expressed by the above equation 5.

For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product, a second product, a third product, a fourth product and a fifth product when the sixth norm of W.AX-W.B or the seventh norm of AX-B is less than or equal to a first threshold and/or the eighth norm of X is less than or equal to a second threshold, wherein the first product is the product of a first coefficient and the first norm of X, and the second product is the product of a second coefficient and the first norm of X

Is the product of a third coefficient and a second norm of

a column vector representing the average of the rows of the low rank matrix L,

a column vector representing an average value of each row of the sparse matrix S. At this time, the target optimization model can be expressed by the above equations 7, 8, 9, and the like.

Without assuming that the formula of the objective optimization model of the embodiment of the present application is shown in formula 4, a method for establishing the objective optimization model and solving the optimized column vectors of the embodiment of the present application is shown in fig. 5. Fig. 5 is a flowchart of a method for determining an optimized column vector according to an embodiment of the present application.

S501, establishing a target optimization model, and determining a Lagrangian function f (X, Y, Z and mu) according to the target optimization model.

Specifically, an objective optimization model may be established based on A, B, W and an optimization objective, where the objective optimization model is used to represent a relationship between A, B, W and an optimization variable X under the optimization objective, and a value of X that satisfies the objective optimization model is an optimization column vector.

To solve for the optimized column vector, the lagrangian function f (X, Y, Z, μ) may be determined. Specifically, a relaxation variable Y can be introduced, a lagrangian multiplier method is used, a lagrangian multiplier Z is introduced, μ is a penalty parameter, and the value is a positive real number. The variable X of the Lagrange function is the optimized variable X of the target optimization model.

S502, setting initial values of variables X, Y, Z and mu in the Lagrangian function, and setting a convergence condition of the optimization completion.

For example, a convergence condition for completion of optimization when the profit of optimization is less than a predetermined threshold may be set. Specifically, taking the target optimization model shown in formula 4 as an example, if the obtained | | W · AX-W · B | | survival rate₂+||X||₁The value of (A) is less than the (| W.AX-W.B |) of the last circulation₂+||X||₁The magnitude of the decrease in the value of (b) is less than a predetermined threshold V, the optimization is deemed complete.

For another example, a maximum number of loop executions may be set, and when the loop reaches the maximum number, a convergence condition for completion of optimization may be set.

Of course, it should be understood that a plurality of determination conditions may be set, and any one of them may be satisfied. For example, if | | W · AX-W · B | | Wy calculation is obtained₂+||X||₁The value of (A) is less than the (| W.AX-W.B |) of the last circulation₂+||X||₁The amplitude of the decrease of the value of (b) is less than a predetermined threshold V, or the number of cycles reaches the maximum number, which is a convergence condition for completing optimization, and so on.

S503, an optimal value of Y is calculated using the least square method.

For a specific implementation of calculating Y by using the least square method, reference may be made to the prior art, and details of the embodiment of the present application are not repeated herein.

S504, the optimal value of X is calculated by using a soft threshold method.

For a specific implementation of calculating X by using the soft threshold method, reference may be made to the prior art, and details of the embodiment of the present application are not described herein again.

And S505, when the value of the X does not meet the constraint condition of the target optimization model, reducing a non-zero element in the X and replacing the X until the X meets the constraint condition of the target optimization model.

It should be understood that step S505 has different implementations according to different target optimization models.

For example, assuming that the target optimization model is the expression shown in formula 4, in step S505, the non-zero elements of X may be reduced to not more than d.

There are many ways to reduce non-zero elements in X.

Preferably, on the premise that X is reduced by one non-zero element, when X formed by reducing a first non-zero element in X minimizes the function value of the target optimization model, the first non-zero element is reduced and replaces X.

For example, in the target optimization model assumed in equation 4, in order to reduce the non-zero elements of X to no more than d, in order to make | | W · AX-W · B | | white magnetism₂+||X||₁The value of (A) is taken as the minimum value as much as possible, and when one nonzero element is reduced each time, the (| | W.AX-W.B | |) of each nonzero element can be reduced relatively₂+||X||₁Selecting | | | W.AX-W.B | | non-woven cells₂+||X||₁The minimum value of (a) is used as a principle of reducing the non-zero elements until the number of the non-zero elements does not exceed d.

Of course, the non-zero elements may be reduced according to the principle that the value of the non-zero elements in X is from large to small or from small to large. The method has the advantages of simple calculation, and smaller function value of the target optimization model due to the fact that multiple comparisons for reducing non-zero elements are not needed.

S506, the lagrange multiplier is updated according to the residual error of the equation X ═ Y, and the penalty coefficient μ is increased by a fixed multiple ρ, where ρ > 1.

S507, judging whether the convergence condition of the Lagrangian function f (X, Y, Z and mu) is met.

If the convergence condition of the Lagrangian function f (X, Y, Z, mu) is met, outputting the current value of X, and ending the execution;

if the convergence condition of the Lagrangian function f (X, Y, Z, μ) is not satisfied, step 503 is performed.

Steps S503 to S507 are executed in a loop until the convergence condition of the lagrangian function f (X, Y, Z, μ) is satisfied.

Of course, it should be understood that there may be other methods of solving the objective optimization model, and the embodiments of the present application are not limited thereto.

The number vector and coefficient vector of the instruction fragment to which the optimized column vector corresponds are determined 305.

The optimized column vector determined in step 304 is not recorded as X₀。

It is understood that X₀The numbering of the non-zero position-corresponding segments is recorded in a numbering vector D, X₀The non-zero coefficients are recorded in a coefficient vector G. The number vector D is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector G is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector.

And 306, simulating the instruction segments corresponding to the number vector D in the simulation device respectively to obtain the characteristic index vector of the instruction segment corresponding to the number vector D.

And the simulation test system simulates the instruction segment corresponding to the number vector D by using a simulation tool according to the number vector D to obtain the characteristic indexes of each instruction segment, and the characteristic indexes of each instruction segment corresponding to the number vector D form a characteristic index vector C.

The simulation tool may be a software or hardware emulator, such as LiveSP, Checkpoint, or the like. The characteristic indexes of the instruction segment obtained by the simulation tool can be IPC, predicted branch success/failure rate, energy consumption and the like.

307, determining the inner product of the feature index vector and the coefficient vector as the feature index for evaluating the test program.

From the coefficient vector D of step 305 and the characteristic index vector C of step 306, a characteristic index C · D of the evaluation test program can be obtained.

And finishing the index evaluation of the test program.

Fig. 6 is a schematic structural diagram of an index evaluation device 600 according to an embodiment of the present application. As shown in fig. 6, the index evaluation device 600 may include:

an obtaining unit 601, configured to obtain an instruction stream of a test program and divide the instruction stream into N instruction fragments with equal lengths;

a first determining unit 602, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB of each row in the feature matrix a in a test program;

a second determining unit 603, configured to determine an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal, where the optimization goal includes reducing the number of instruction fragments used for simulation and/or reducing an error between a feature index of the test program and a feature index of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;

a third determining unit 604, configured to determine a number vector and a coefficient vector of the instruction fragment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction fragment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction fragment corresponding to the non-zero position in the optimized column vector;

a simulation unit 605, configured to simulate the instruction segments corresponding to the number vector in a simulation device, respectively, to obtain a feature indicator vector of the instruction segment corresponding to the number vector, where the feature indicator vector is used to indicate a feature indicator of the instruction segment corresponding to the number vector;

a fourth determining unit 606, configured to determine an inner product of the feature indicator vector and the coefficient vector as a feature indicator for evaluating the test program.

In the embodiment of the application, a target optimization model of a test program is constructed based on a feature matrix, a column vector and a weight vector of an instruction segment of the test program, an optimized column vector is determined based on the target optimization model and an optimized target, an instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in a simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.

The index evaluation device 600 may also execute the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this embodiment of the present application.

Fig. 7 is a schematic structural diagram of an index evaluation apparatus 700 according to an embodiment of the present application. The metric evaluation device 700 may include a channel interface 701, a processor 702, and a memory 703.

The channel interface 701, the processor 702 and the memory 703 are interconnected by a bus 704 system. Bus 704 may be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 7, but this does not indicate only one bus or one type of bus.

The memory 703 is used for storing programs. In particular, the program may include program code comprising computer operating instructions. Memory 703 may include both read-only memory and random-access memory, and provides instructions and data to processor 702. The memory 703 may comprise high-speed RAM memory, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.

The processor 702 executes the program stored in the memory 703, and is specifically configured to perform the following operations:

The method performed by the index estimation apparatus disclosed in any one of fig. 2, fig. 3, and fig. 5 may be applied to the processor 702, or implemented by the processor 702. The processor 702 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 702. The Processor 702 may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in the memory 703, and the processor 702 reads the information in the memory 703 and performs the steps of the above method in combination with the hardware thereof.

The index evaluation device 700 may also perform the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this application.

The embodiment of the present application further provides a computer-readable storage medium for storing a computer program, where the computer program includes instructions for executing the method of the embodiment shown in fig. 2.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. an index evaluation method, is characterized in that, comprises:

Obtain the instruction stream of the test program and divide it into N instruction segments of equal length;

Determine the characteristic matrix A, the mean column vector B and the weight vector W corresponding to the N instruction fragments, wherein the characteristic matrix A is a matrix with M rows and N columns, and is used to describe the characteristic index vectors of the N instruction fragments , M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean of the feature index vectors of each row of the feature matrix A, and the weight vector W is the A column vector composed of the weights of the basic block type BB corresponding to each row in the feature matrix A in the feature index of the test program;

An optimization column vector is determined according to the feature matrix A, the mean column vector B, the weight vector W, and an optimization objective including reducing the number of instruction fragments used for simulation and/or reducing the evaluation of the test program The error between the characteristic index of the test program and the characteristic index of the test program, the optimized column vector is used to represent the instruction segment used for simulation in the N instruction segments and the corresponding weight of each instruction segment;

Determine the number vector and coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used to represent the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used to represent the The weight of the feature index of the instruction fragment corresponding to the non-zero position in the optimized column vector;

The instruction fragments corresponding to the numbered vectors are simulated in the simulation device respectively, to obtain a feature index vector of the instruction segment corresponding to the numbered vector, and the feature index vector is used to represent the feature index of the instruction segment corresponding to the numbered vector ;

The inner product of the feature index vector and the coefficient vector is determined as the feature index for evaluating the test program.

2. The method of claim 1, wherein the optimization objective is: when the first norm of W·AX-W·B or the second norm of AX-B is less than or equal to the first threshold , so that the third norm of X is the smallest, where X is an optimization variable, which is used to represent the instruction segment used for simulation in the N instruction segments and the corresponding weight of each instruction segment, and the optimized column vector satisfies the The value of X for the optimization objective.

3. The method of claim 2, wherein the determining the optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimization objective comprises:

The optimized column vector is determined by the following objective optimization formula:

Among them, ||W·AX-W·B|| _t1 represents the first norm of W·AX-W·B, ||AX-B|| _t2 represents the second norm of AX-B, ||X| | _t3 represents the third norm of X, and c represents the first threshold.

4. The method of claim 1, wherein the optimization objective is:

When the first norm of X is less than or equal to the second threshold, the sum of the first product and the second product is minimized, where the first product is the second norm of W·AX-W·B and the first coefficient Product, the second product is the product of the third norm of X and the second coefficient, and the value of the first coefficient is a non-zero real number, where X is an optimization variable, used to represent the N instruction fragments used in the For the instruction segments to be simulated and the weights corresponding to each instruction segment, the optimization column vector is the value of X that satisfies the optimization objective.

5. The method of claim 4, wherein

The determining the optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and the optimization objective includes:

min λ||W·AX-W·B|| _t2 +δ||X|| _t3

st||X|| _t1 ≤d

Wherein, d represents the second threshold, ||X|| _t1 represents the first norm of X, ||W·AX-W·B|| _t2 represents the second norm of W·AX-W·B, ||X|| _t3 represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ≠0.

6. The method of claim 1, wherein the optimization objective is:

Minimize the sum of the first product, the second product, the third product, the fourth product, and the fifth product, where the first product is the product of the first coefficient and the first norm of X, and the second product is the first Two coefficients and

The product of the second norm of , the third product is the third coefficient and

The product of the third norm of , the fourth product is the product of the fourth coefficient and the fourth norm of L, the fifth product is the product of the fifth coefficient and the fifth norm of S, A=L+S, L is low-rank matrix, S is a sparse matrix,

is a column vector representing the mean value of each row of the low-rank matrix L,

represents a column vector composed of the average value of each row of the sparse matrix S, the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, where X is an optimization variable, used to represent the N The instruction segments used for simulation among the instruction segments and the weights corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective.

7. The method of claim 1, wherein the optimization objective is:

When the sixth norm of W·AX-W·B or the seventh norm of AX-B is less than or equal to the first threshold, and/or the eighth norm of X is less than or equal to the second threshold, such that the first product , the sum of the second product, the third product, the fourth product, and the fifth product is the smallest, wherein the first product is the product of the first coefficient and the first norm of X, and the second product is the second coefficient and the

represents a column vector composed of the average value of each row of the sparse matrix S, the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein, X is an optimization variable, used to represent the N The instruction segments used for simulation among the instruction segments and the weights corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective.

8. The method according to any one of claims 1-7, wherein the determining the optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimization objective comprises: :

Step 1: According to the feature matrix A, the mean column vector B, the weight vector W and the optimization objective, establish the objective optimization model of the optimization objective, and establish the Lagrangian function f of the objective optimization model ( X, Y, Z, μ), where X is an optimization variable, which is used to represent the instruction segment used for simulation in the N instruction segments and the corresponding weight of each instruction segment, μ is the penalty parameter, and Z is the pull factor Grange multiplier, Y is a slack variable;

Step 2: Set the initial values of X, Y, Z and μ, and set the optimization objective as the convergence condition for the optimization to be completed;

Step 3: Use the least squares method to calculate the optimal value of Y;

Step 4: Calculate the optimal value of X using the soft threshold method;

Step 5: when the value of X does not meet the constraints of the target optimization model, reduce a non-zero element in X and replace X, until X meets the constraints of the target optimization model;

Step 6: Update the Lagrange multiplier Z according to the residual of the equation X=Y, and increase the penalty coefficient μ by a fixed multiple ρ, where ρ>1;

Step 7: Determine whether the convergence condition is satisfied. If it is satisfied, X is the solution to be solved. If it is not satisfied, repeat steps 3-6.

9. The method of claim 8, wherein said reducing a non-zero element in X and replacing X comprises:

On the premise that X is reduced by one non-zero element, when X formed by reducing the first non-zero element in X minimizes the function value of the objective optimization model, the first non-zero element is reduced and X is replaced.

10. The method of claim 1, wherein the characteristic index comprises at least one of the following:

Instructions per clock IPC, predicted branch success rate, predicted branch failure rate, L2 Cache hit rate, and energy consumption.

11. A device for evaluating indicators of a processor, comprising:

The acquisition unit is used to acquire the instruction stream of the test program and divide it into N instruction segments of equal length;

A first determination unit, configured to determine a feature matrix A, a mean column vector B, and a weight vector W corresponding to the N instruction fragments, wherein the feature matrix A is a matrix with M rows and N columns, and is used to describe the N The feature index vectors of each instruction segment, M is the number of types of basic blocks BB in the N instruction segments, and the mean column vector B is a column vector formed by the mean of the feature index vectors of each row of the feature matrix A, so Described weight vector W is the column vector that the basic block BB type corresponding to each row in the described feature matrix A is constituted by the weight of the feature index of the test program;

A second determination unit, configured to determine an optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimization objective, where the optimization objective includes reducing the number of instruction fragments used for simulation and/or Or reduce the error between evaluating the feature index of the test program and the feature index of the test program, the optimized column vector is used to represent the instruction segment and each instruction segment used for simulation among the N instruction segments corresponding weight;

The third determination unit is configured to determine the number vector and coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used to represent the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the The coefficient vector is used to represent the weight of the feature index of the instruction fragment corresponding to the non-zero position in the optimized column vector;

The simulation unit is used for simulating the instruction fragments corresponding to the numbered vectors in the simulation device, respectively, to obtain a feature index vector of the instruction fragments corresponding to the numbered vector, and the feature index vector is used to represent the corresponding numbered vector. The feature index of the instruction fragment;

The fourth determining unit is configured to determine the inner product of the feature index vector and the coefficient vector as the feature index for evaluating the test program.

12. The apparatus according to claim 11, wherein the optimization objective is: when the first norm of W·AX-W·B or the second norm of AX-B is less than or equal to a first threshold , so that the third norm of X is the smallest, where X is an optimization variable, which is used to represent the instruction segment used for simulation in the N instruction segments and the corresponding weight of each instruction segment, and the optimized column vector satisfies the The value of X for the optimization objective.

13. The apparatus of claim 12, wherein the second determining unit is configured to determine the optimized column vector by the following objective optimization formula:

14. The apparatus of claim 11, wherein the optimization objective is:

When the first norm of X is less than or equal to the second threshold, the sum of the first product and the second product is minimized, where the first product is the second norm of W·AX-W·B and the first coefficient Product, the second product is the product of the third norm of X and the second coefficient, and the first coefficient is a non-zero real number, where X is an optimization variable, used to indicate that the N instruction fragments are used for The instruction segments to be simulated and the weights corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective.

15. The apparatus of claim 14, wherein the second determining unit is configured to determine the optimized column vector by the following objective optimization formula:

16. The apparatus of claim 11, wherein the optimization objective is:

17. The apparatus of claim 11, wherein the optimization objective is:

18. The apparatus according to any one of claims 11-17, wherein the second determination unit determines the feature matrix A, the mean column vector B, the weight vector W and an optimization objective according to the feature matrix A The optimized column vector is specifically implemented as:

Step 3: Use the least squares method to calculate the optimal value of Y;

Step 4: Calculate the optimal value of X using the soft threshold method;

19. The apparatus of claim 18, wherein the second determining unit reduces a non-zero element in X and replaces X and is specifically implemented as:

20. The apparatus of claim 11, wherein the characteristic index comprises at least one of the following: