[go: up one dir, main page]

CN108008999B - Index evaluation method and device - Google Patents

Index evaluation method and device Download PDF

Info

Publication number
CN108008999B
CN108008999B CN201610944177.6A CN201610944177A CN108008999B CN 108008999 B CN108008999 B CN 108008999B CN 201610944177 A CN201610944177 A CN 201610944177A CN 108008999 B CN108008999 B CN 108008999B
Authority
CN
China
Prior art keywords
product
coefficient
norm
vector
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610944177.6A
Other languages
Chinese (zh)
Other versions
CN108008999A (en
Inventor
程捷
刘旭东
项定义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201610944177.6A priority Critical patent/CN108008999B/en
Publication of CN108008999A publication Critical patent/CN108008999A/en
Application granted granted Critical
Publication of CN108008999B publication Critical patent/CN108008999B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G06F9/45508Runtime interpretation or emulation, e g. emulator loops, bytecode interpretation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Prevention of errors by analysis, debugging or testing of software
    • G06F11/3668Testing of software
    • G06F11/3672Test management
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Advance Control (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

本申请实施例提供了一种指标评估方法及装置,该方法包括:获取测试程序的指令流并分割成长度相等的N个指令片段;确定该N个指令片段对应的特征矩阵A、均值列向量B和权重向量W;根据该特征矩阵A、该列向量B、该权重向量W和多个优化目标确定该测试程序的目标优化模型,并求解该目标优化模型的优化列向量;确定该优化列向量对应的指令片段的编号向量和系数向量;将该编号向量对应的指令片段分别在模拟装置中进行模拟,得到该编号向量对应的指令片段的特征指标向量;确定该特征指标向量和该系数向量的内积为评估该测试程序的特征指标。

Figure 201610944177

The embodiment of the present application provides an indicator evaluation method and device, the method includes: acquiring an instruction stream of a test program and dividing it into N instruction segments of equal length; determining a feature matrix A and a mean column vector corresponding to the N instruction segments B and weight vector W; determine the target optimization model of the test program according to the feature matrix A, the column vector B, the weight vector W and multiple optimization objectives, and solve the optimization column vector of the target optimization model; determine the optimization column The number vector and coefficient vector of the instruction segment corresponding to the vector; the instruction segment corresponding to the number vector is simulated in the simulation device respectively, and the feature index vector of the instruction segment corresponding to the number vector is obtained; the feature index vector and the coefficient vector are determined. The inner product of is the characteristic index for evaluating the test program.

Figure 201610944177

Description

Index evaluation method and device
Technical Field
The embodiment of the application relates to the field of computers, in particular to an index evaluation method and device.
Background
Personnel involved in the design and development of processor architectures often need to run test programs in an emulator of a certain architecture and then collect performance-related data indicators, such as Instruction Per Clock (IPC), hit rate of a second level Cache (L2 Cache), energy consumption, etc., to find the bottleneck of the current processor architecture. After the system architecture is improved, the design is redeployed in the simulator, the simulator is used again to run the test program, data are collected, the performance difference of the same test program which runs under the new and old system architectures is compared, then the bottleneck is found again. It can be seen that a great deal of design test work is done with a software simulator before deployment in hardware.
However, one of the major disadvantages of the software simulation platform is: the same test program is run at a much longer runtime than the hardware platform. Especially when running large, comprehensive test suite programs, such as SPEC CPU 2006, it often takes weeks or even months to get the data needed by itself. And after the system architecture is changed every time, the test program needs to be operated again to collect data under the new architecture. Therefore, such repeated operations and waits will seriously affect the development efficiency.
Research shows that the running process of the application program has obvious stage characteristics. The developer needs to shorten the time required by simulation from the perspective of simplifying test programs, and the core thought of the method is as follows: a user can obtain a group of artificial programs only by running the original test program once, each artificial program is very small and can be simulated quickly, and the time for simulating the group of artificial programs is far shorter than that for simulating the original test program; and (3) multiplying the data obtained after each artificial program simulation by a specific weight by a user, wherein the data obtained after weighted averaging is very close to the data obtained after the original artificial program runs.
In the existing Simpoint technology, the steps are as follows: grabbing an instruction stream of a test program component, cutting the instruction stream into segments (intervals) with equal length, and calculating a Basic Block Vector (BBV) of each Interval; performing K-Means clustering on all BBVs, and selecting an Interval closest to the center of each type as a representative Interval of the type; and simulating and testing the indexes of the selected representative intervals, and multiplying the indexes measured by the representative intervals of each category by the proportion of the Interval of the category to the total Interval to calculate a weighted average to represent the indexes of the original test program components. The simulation test in the prior art needs too many selected fragments, long simulation time and uncontrollable error.
How to reduce the instruction fragments selected in the simulation test to shorten the simulation time and/or reduce the error of the simulation index is a technical problem to be solved by the embodiment of the present application.
Disclosure of Invention
The embodiment of the application provides an index evaluation method and device, which can reduce instruction segments selected in simulation test to shorten simulation time and reduce simulation index errors, so that the simulation efficiency or precision of a processor can be improved.
In a first aspect, an index evaluation method is provided, including:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction fragments for simulation and/or reducing an error between a feature index for evaluating the test program and the feature index of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;
determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used for indicating the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for indicating the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
With reference to the first aspect, in a first possible implementation manner, the optimization objective is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimization column vector is the value of X meeting the optimization target.
The method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following goal optimization formula:
Figure GDA0003030818070000031
wherein | | W.AX-W.B | | non-woven hairt1Representing a first norm of W.AX-W.B, | | AX-B | | non-combustible cellst2Represents a second norm of AX-B, | X | | non-volatile memoryt3C represents the first threshold value, to represent a third norm of X.
With reference to the first aspect, in a third possible implementation manner of the first aspect, the optimization objective is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
The method of the embodiment of the application can reduce the simulation index error on the premise of controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, the determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal is specifically implemented by determining the optimized column vector through the following goal optimization formula:
Figure GDA0003030818070000041
wherein d represents the second threshold, | X | | non-volatile memoryt1Represents a first norm of X, | | W.AX-W.B | | tormentumt2Represents a second norm of W.AX-W.B, | X | | non-magnetic cellst3Represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, λ ≠ 0.
With reference to the first aspect, in a fifth possible implementation manner of the first aspect, the optimization objective is: the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a first norm of X
Figure GDA0003030818070000042
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000043
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000044
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000045
and the column vector is a column vector formed by the average values of all rows of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
The method of the embodiment of the application can achieve certain balance between reduction of the number of the instruction fragments required by simulation work of the test program and reduction of simulation index errors.
With reference to the first aspect, in a sixth possible implementation manner of the first aspect, the optimization objective is: in that
Figure GDA0003030818070000046
And/or the eighth norm of X is less than or equal to the second threshold, the sum of the first product, the second product, the third product, the fourth product and the fifth product is minimized, wherein the first product is the product of the first coefficient and the first norm of X, and the second product is the product of the second coefficient and the first norm of X
Figure GDA0003030818070000047
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000048
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000049
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA00030308180700000410
and a column vector formed by the average values of the rows of the sparse matrix S, wherein the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero.
The method of the embodiment of the application can balance reduction of the number of the instruction segments required by simulation work of the test program and reduction of simulation index errors on the premise of ensuring precision of the simulation index errors and/or controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the first aspect and the foregoing implementation manner, in a seventh possible implementation manner of the first aspect, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal is specifically implemented as:
step 1: establishing a target optimization model of the optimization target according to the characteristic matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting X, Y, Z and initial values of μ, and setting the optimization target as a convergence condition for the optimization completion;
and step 3: calculating an optimal value of Y by using a least square method;
and 4, step 4: calculating the optimal value of X by using a soft threshold method;
and 5: when the value of X does not satisfy the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X satisfies the constraint condition of the target optimization model;
step 6: updating a Lagrange multiplier Z according to a residual error of an equation X ═ Y, and increasing a penalty coefficient mu by a fixed multiple rho, wherein rho is greater than 1;
and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the step 3-6.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in an eighth possible implementation manner of the first aspect, reducing one non-zero element in X and replacing X is specifically implemented as: and on the premise that X is reduced by one non-zero element, when X formed by reducing the first non-zero element in X is used for minimizing the function value of the target optimization model, reducing the first non-zero element and replacing X.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in a ninth possible implementation manner of the first aspect, the characteristic index includes at least one of the following: IPC (instruction per clock), success rate of predicted branch, failure rate of predicted branch, hit rate of L2 Cache and energy consumption.
In a second aspect, an index evaluation apparatus is provided, including:
the device comprises an acquisition unit, a processing unit and a control unit, wherein the acquisition unit is used for acquiring an instruction stream of a test program and dividing the instruction stream into N instruction fragments with equal length;
a first determining unit, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB of each row of the feature matrix a in a test program;
a second determining unit, configured to determine an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization target, where the optimization target includes reducing the number of instruction fragments used for simulation and/or reducing an error between a feature index of the test program and a feature index of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit, configured to determine a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction segment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction segment corresponding to the non-zero position in the optimized column vector;
the simulation unit is used for simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain the characteristic index vectors of the instruction segments corresponding to the number vectors, and the characteristic index vectors are used for expressing the characteristic indexes of the instruction segments corresponding to the number vectors;
and the fourth determination unit is used for determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
With reference to the second aspect, in a first possible implementation manner, the optimization goal is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimization column vector is the value of X meeting the optimization target.
The method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.
With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector by using the following target optimization formula:
Figure GDA0003030818070000071
wherein | | W.AX-W.B | | non-woven hairt1Representing a first norm of W.AX-W.B, | | AX-B | | non-combustible cellst2Represents a second norm of AX-B, | X | | non-volatile memoryt3C represents the first threshold value, to represent a third norm of X.
With reference to the second aspect, in a third possible implementation manner of the second aspect, the optimization goal is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
The method of the embodiment of the application can reduce the simulation index error on the premise of controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the third possible implementation manner of the second aspect, in a fourth possible implementation manner of the second aspect, the second determining unit specifically determines the optimized column vector through the following target optimization formula:
Figure GDA0003030818070000072
wherein d represents the second threshold, | X | | non-volatile memoryt1Represents a first norm of X, | | W.AX-W.B | | tormentumt2Represents a second norm W.AX-W.B,||X||t3represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, λ ≠ 0.
With reference to the second aspect, in a fifth possible implementation manner of the second aspect, the optimization goal is: the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a first norm of X
Figure GDA0003030818070000073
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000074
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000075
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000081
and the column vector is a column vector formed by the average values of all rows of the sparse matrix S, and the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
The method of the embodiment of the application can achieve certain balance between reduction of the number of the instruction fragments required by simulation work of the test program and reduction of simulation index errors.
With reference to the second aspect, in a sixth possible implementation manner of the second aspect, the optimization goal is: a sixth norm at W.AX-W.B or a seventh norm at AX-B is less than or equal to a first threshold, and/or an eighth norm of X is less than or equal to a second normWhen the threshold value is used, the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the sum of the first product, the second product, the third product, the fourth product and the fifth product is minimized, the first product is the product of a first coefficient and a first norm of X, and the second product is the product of a second coefficient and a first norm of X
Figure GDA0003030818070000082
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000083
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000084
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000085
and a column vector formed by the average values of the rows of the sparse matrix S, wherein the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero.
The method of the embodiment of the application can balance reduction of the number of the instruction segments required by simulation work of the test program and reduction of simulation index errors on the premise of ensuring precision of the simulation index errors and/or controlling the number of the instruction segments required by the simulation work of the test program.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in a seventh possible implementation manner of the second aspect, the second determining unit is specifically implemented as:
step 1: establishing a target optimization model of the optimization target according to the characteristic matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting X, Y, Z and initial values of μ, and setting the optimization target as a convergence condition for the optimization completion;
and step 3: calculating an optimal value of Y by using a least square method;
and 4, step 4: calculating the optimal value of X by using a soft threshold method;
and 5: when the value of X does not satisfy the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X satisfies the constraint condition of the target optimization model;
step 6: updating a Lagrange multiplier Z according to a residual error of an equation X ═ Y, and increasing a penalty coefficient mu by a fixed multiple rho, wherein rho is greater than 1;
and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the step 3-6.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in an eighth possible implementation manner of the second aspect, reducing one non-zero element in X and replacing X is specifically implemented as: and on the premise that X is reduced by one non-zero element, when X formed by reducing the first non-zero element in X is used for minimizing the function value of the target optimization model, reducing the first non-zero element and replacing X.
With reference to the second aspect and the foregoing implementation manner of the second aspect, in a ninth possible implementation manner of the second aspect, the characteristic index includes at least one of the following: IPC (instruction per clock), success rate of predicted branch, failure rate of predicted branch, hit rate of L2 Cache and energy consumption.
In a third aspect, there is provided another index evaluation apparatus, including a processor and a channel interface, where the processor is configured to execute the method in the first aspect or any possible implementation manner of the first aspect through the channel interface.
In a fourth aspect, a computer-readable storage medium is presented for storing a computer program comprising instructions for performing the method of the first aspect or any possible implementation manner of the first aspect.
Based on the above technical solutions, the index evaluation method and apparatus in the embodiments of the present application construct a target optimization model of a test program based on a feature matrix, a column vector, and a weight vector of an instruction segment of the test program, determine an optimized column vector based on the target optimization model and an optimization target, determine an instruction segment and a coefficient vector corresponding to a non-zero position according to the optimized column vector, simulate the instruction segment corresponding to the optimized column vector in the simulation apparatus to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and obtain a feature index for evaluating the test program according to the feature index vector and the coefficient vector, thereby reducing the number of instruction segments required for simulation work of the test program or reducing a simulation index error of the test program.
Drawings
FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application.
FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application.
FIG. 3 is an interaction flow diagram of an index evaluation method according to an embodiment of the present application.
FIG. 4 is a schematic diagram of a target optimization model of an embodiment of the present application.
FIG. 5 is a flow diagram of a method of determining an optimized column vector according to one embodiment of the present application.
Fig. 6 is a schematic structural diagram of an index evaluation device according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of an index evaluation device according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
To facilitate understanding of the embodiments of the present application, several elements that will be introduced in the description of the embodiments of the present application are first introduced herein.
An instruction stream file: the file for recording the instruction stream information is called an instruction stream file, and each line of the instruction stream file represents information related to an executed instruction and conforms to a uniform format. Typically, the instruction stream file size is fixed. For example, an instruction stream file generally contains information such as 1 hundred million instructions, and a complete test program can be regarded as an instruction stream composed of a plurality of 1 hundred million instructions, each 1 hundred million instruction being referred to as an Interval (Interval), in other words, a complete test program is composed of a plurality of segments. The full set of test programs is all fragments and the subset of test programs is part of fragments. The purpose of simplifying the test program is to select representative partial segments from the corpus, and it is required that the number of the selected segments is as small as possible, and the operation results of the segments are as high as possible in similarity with the operation results of the original test program. An instruction may include the following information: program pointer, assembly instruction, operation type, and memory address. The memory address is a selectable item.
And (3) program pointer: the program pointer for each line of instructions is the address in memory of the line assembler instruction, which is a hexadecimal number beginning with "0 x".
Assembling instructions: the binary instruction code of the instruction needs to meet assembly syntax requirements.
The operation type is as follows: all assembly instructions can be divided into three categories: the arithmetic logic unit operates, reads the memory, writes the memory and controls the instruction.
Memory address: if the instruction is an arithmetic logic unit operation, no memory address information is needed; if the instruction is a memory read-write operation, a memory address is required.
Basic Block (BB): a piece of sequentially executed instructions. In general, an instruction stream may be divided into a plurality of BBs with a control instruction as a boundary. Each segment of the test program is composed of BB.
Basic Block feature indicator Vector (BBV): according to different control instructions, the execution times of BB of different types in each segment are counted, and a vector constructed based on the types and the execution times of the BB is called BBV. FIG. 1 is a schematic diagram of instruction code according to an embodiment of the present application. As shown in the instruction code fragment of FIG. 1, if the type ID of BB is {1,2,3,4,5}, and the corresponding execution times is {1,20,0,5,0}, then BBV of the fragment can be recorded as {1:1,2:20,3:0,4:5,5:0 }.
FIG. 2 is a flowchart of an index evaluation method according to an embodiment of the present application. The method of FIG. 2 is performed by an index evaluation device of a processor. Specifically, the index evaluation device may be a host apparatus or the like.
An instruction stream of a test program is obtained and divided into N instruction fragments of equal length 201.
202, determining a feature matrix a, a mean column vector B, and a weight vector W corresponding to the N instruction fragments.
The feature matrix a is a matrix with M rows and N columns for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature index vectors of each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indexes of the test program by types of the basic blocks BB corresponding to each row of the feature matrix a.
Specifically, in the embodiment of the present application, the characteristic index may be any one of the following: instruction Per Clock (IPC), predicted branch success/failure, L2 Cache hit, energy consumption, etc.
203, determining an optimized column vector according to the feature matrix a, the column vector B, the weight vector W and an optimization goal.
The optimization target comprises reducing the number of instruction segments used for simulation and/or reducing errors between characteristic indexes of the evaluation test program and characteristic indexes of the test program, and the optimization column vector is used for representing instruction segments used for simulation in the N instruction segments and weights corresponding to the instruction segments used for simulation.
Specifically, for example, an objective optimization model may be established according to the feature matrix a, the column vector B, the weight vector W and an optimization objective, the objective optimization model is used to represent the optimization objective through a relationship between A, B, W and an optimization variable X of the objective optimization model, and the optimization column vector is a value of the optimization variable X satisfying the constraint relationship.
And 204, determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector.
The number vector is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector.
205, the instruction segments corresponding to the number vectors are simulated in the simulation device to obtain the feature index vectors of the instruction segments corresponding to the number vectors.
The feature indicator vector is used to indicate the feature indicator of the instruction segment corresponding to the number vector.
And 206, determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
In the embodiment of the application, an optimized column vector is determined based on a feature matrix, a column vector, a weight vector and an optimized target of an instruction segment of a test program, the instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in the simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
Optionally, as an embodiment, the optimization objective is: and when the first norm of W.AX-W.B or the second norm of AX-B is smaller than or equal to a first threshold value, minimizing a third norm of X, wherein X is an optimization variable used for representing instruction fragments used for simulation in the N instruction fragments and the weight corresponding to each instruction fragment, and the optimization column vector is the value of X meeting the optimization target.
It should be understood that in the embodiments of the present application, the firstThe norm, the second norm, the third norm, etc. may be any one of the following norms: 0 norm, 1 norm,
Figure GDA0003030818070000121
Norm, p-norm or nuclear norm, etc., where p is a positive integer. It should be understood that the first to eighth norms, etc. mentioned elsewhere in this application may also be any of the following norms: 0 norm, 1 norm,
Figure GDA0003030818070000122
Norm, p-norm, or nuclear norm, etc. Taking X as an example, 0 norm | X | of X is non-woven01 norm of X (| X | | non-woven phosphor) of X1The kernel norm X of X is not counting*X of
Figure GDA0003030818070000123
Norm of
Figure GDA0003030818070000124
P norm | X | Y | of X non-woven gridpAnd p is a positive integer.
Specifically, step 203 may determine the optimized column vector by equation 1:
Figure GDA0003030818070000125
wherein | | W.AX-W.B | | non-woven hairt1Representing a first norm of W.AX-W.B, | | AX-B | | non-combustible cellst2Represents a second norm of AX-B, | X | | non-volatile memoryt3C represents the first threshold value, to represent a third norm of X. It is understood that the operator symbol, represents an inner product, e.g., [ x ]1,x2,x3]·[y1,y2,y3]=x1 y1+x2 y2+x3 y3
It is not assumed that the first norm of W.AX-W.B in equation 1 is | | | W.AX-W.B | | survival0The second norm of X | | non-conducting phosphor1A 0 norm | | X | | non-woven phosphor of X0Then equation 1 can now be as shown in equation 2:
Figure GDA0003030818070000131
the method of the embodiment of the application can reduce the number of instruction fragments required by simulation work of the test program on the premise of ensuring the error precision of the simulation index.
Optionally, as another embodiment, the optimization objective is: when a first norm of X is less than or equal to a second threshold, minimizing a sum of a first product and a second product, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number, where X is an optimization variable used to represent instruction fragments for performing simulation among the N instruction fragments and a weight corresponding to each instruction fragment, and the optimization column vector is a value of X that satisfies the optimization goal.
Specifically, step 203 may determine the optimized column vector by equation 3:
Figure GDA0003030818070000132
wherein d represents the second threshold, | X | | ceilingt1Represents a first norm of X, | | W.AX-W.B | | tormentumt2Represents a second norm of W.AX-W.B, | X | | non-magnetic cellst3Represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, λ ≠ 0.
It should be understood that, in the embodiment of the present application, the first norm of X and the third norm of X may be any one of the following norms: 0 norm of X | | non-woven phosphor 01 norm of X (| X | | non-woven phosphor) of X1X of
Figure GDA0003030818070000133
Norm of
Figure GDA0003030818070000134
P norm | X | Y | of X non-woven gridpAnd p is a positive integer. The first norm of X and the third norm of X may be the same or different。
It should be understood that in the embodiment of the present application, the second norm of W · AX-W · B may be any one of the following norms: W.AX-W.B has a 0 norm | | | W.AX-W.B | | luminance 01 norm of W.AX-W.B
||W·AX-W·B||1W.AX-W.B
Figure GDA0003030818070000137
Norm of
Figure GDA0003030818070000135
P norm of W.AX-W.B | | | W.AX-W.B | | luminancepAnd p is a positive integer.
It should be understood that, in the first coefficient and the second coefficient of the embodiments of the present application, at least 1 term is not 0. For example, λ ≠ 0.
It is not assumed that the first norm of X in equation 3 | | | X | | luminancetA 0 norm | | X | | non-woven phosphor of X0The third norm of X | | non-conducting phosphortA 1-norm | | X | | non-woven phosphor of X1The second norm of W.AX-W.B is 2 norms of W.AX-W.B | | tormenting2If the values of the first coefficient and the second coefficient are both 1, formula 3 can be as shown in formula 4:
Figure GDA0003030818070000136
it should be understood that, in the embodiment of the present application, the values of the first coefficient and the second coefficient are determined by the optimization objective. The values of the first coefficient and the second coefficient determine whether the optimization target is smaller instruction fragments for simulation or smaller simulation errors.
Optionally, as another embodiment, the optimization objective is: so that the sum of the first product, the second product, the third product, the fourth product and the fifth product is minimum, wherein the first product is a product of a first coefficient and a first norm of X, the second product is a product of a second coefficient and a second norm of LX-L, the third product is a product of a third coefficient and a third norm of SX-S, the fourth product is a product of a fourth coefficient and a fourth norm of L, the fifth product is a product of a fifth coefficient and a fifth norm of S, A is L + S, L is a low-rank matrix, S is a sparse matrix, L represents a column vector formed by an average value of each row of the low-rank matrix L, and S represents a column vector formed by an average value of each row of the sparse matrix S, wherein, X is an optimization variable used for representing instruction segments used for simulation among the N instruction segments and a weight corresponding to each instruction segment, and the optimized column vector is a value of X that satisfies the optimization objective.
Specifically, step 203 may specifically determine the optimized column vector by equation 5:
Figure GDA0003030818070000141
wherein | X | Y luminancet1A first norm representing X is given,
Figure GDA0003030818070000142
to represent
Figure GDA0003030818070000143
Is measured in a first direction of the first norm,
Figure GDA0003030818070000144
to represent
Figure GDA0003030818070000145
Is a third norm, | L | | non-woven phosphort4Represents the fourth norm of L, | S | | non-woven phosphort5Denotes a fifth norm of S, λ denotes a first coefficient, δ denotes a second coefficient, α denotes a third coefficient, β denotes a fourth coefficient, and γ denotes a fifth coefficient.
It should be understood that at least one of the first coefficient, the second coefficient, the third coefficient, the fourth coefficient, and the fifth coefficient in the embodiments of the present application is not 0.
Specifically, when the first norm of X is 1 norm, the second norm of LX-L is 2 norms, the third norm of SX-S is 2 norms, the fourth norm of L is 1 norm, and the fifth norm of S is 1 norm, equation 5 can be expressed as shown in equation 6:
Figure GDA0003030818070000146
optionally, as an embodiment, the optimization objective is: when the sixth norm of W.AX-W.B or the seventh norm of AX-B is less than or equal to a first threshold value and/or the eighth norm of X is less than or equal to a second threshold value, the sum of a first product, a second product, a third product, a fourth product and a fifth product is minimized, wherein the first product is the product of a first coefficient and the first norm of X, and the second product is the product of a second coefficient and the first norm of X
Figure GDA0003030818070000147
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000148
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000149
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000151
and representing a column vector formed by the average values of all rows of the sparse matrix S, wherein X is an optimization variable and is used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, and the optimization column vector is a value of X meeting the optimization target.
When the constraint of the optimization objective is that the sixth norm of W · AX-W · B or the seventh norm of AX-B is less than or equal to the first threshold, step 203 may determine the optimized column vector by equation 7:
Figure GDA0003030818070000152
wherein c represents a first threshold, | W.AX-W.B | | calceit6Represents the sixth norm of W.AX-W.B, | | AX-B | | non-combustible cellst7Represents the seventh norm of AX-B, and the remaining individual parameters have the same meanings as the corresponding parameters in equation 5.
When the constraint condition of the optimization objective is that the sixth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 8:
Figure GDA0003030818070000153
wherein d represents a second threshold, | X | | non-volatile memoryt6Denotes the sixth norm of X, and the remaining respective parameters have the same meanings as the corresponding parameters in equation 5.
When the constraint of the optimization objective is that the sixth norm of W · AX-W · B or the seventh norm of AX-B is less than or equal to the first threshold and the eighth norm of X is less than or equal to the second threshold, step 203 may determine the optimized column vector by equation 9:
Figure GDA0003030818070000154
wherein c represents a first threshold, d represents a second threshold, and | X | | magnetism is not present when a sixth norm of W.AX-W.B or a seventh norm of AX-B is less than or equal to the first thresholdt8Represents the eighth norm of X, and the remaining parameters have the same meanings as the corresponding parameters in equation 5.
Specifically, the detailed process of step 203 may be as follows:
step 1: establishing a target optimization model of the optimization target according to the characteristic matrix A, the column vector B, the weight vector W and the optimization target, and establishing a Lagrangian function f (X, Y, Z, mu) of the target optimization model, wherein X is an optimization variable used for representing instruction segments used for simulation in the N instruction segments and the weight corresponding to each instruction segment, mu is a penalty parameter, Z is a Lagrangian multiplier, and Y is a relaxation variable;
step 2: setting X, Y, Z and initial values of μ, and setting the optimization target as a convergence condition for the optimization completion;
and step 3: calculating an optimal value of Y by using a least square method;
and 4, step 4: calculating the optimal value of X by using a soft threshold method;
and 5: when the value of X does not satisfy the constraint condition of the target optimization model, reducing a non-zero element in X and replacing X until X satisfies the constraint condition of the target optimization model;
step 6: updating a Lagrange multiplier Z according to a residual error of an equation X ═ Y, and increasing a penalty coefficient mu by a fixed multiple rho, wherein rho is greater than 1;
and 7: and judging whether the convergence condition is met, if so, determining that X is solved, and if not, repeatedly executing the steps 3-7.
In a specific implementation manner, the reduction of one non-zero element in X and the replacement of X in step 5 can be specifically implemented as: the lagrange multiplier Z is updated according to the residual of equation X ═ Y and the penalty factor μ is increased by a fixed multiple ρ, where ρ > 1. Of course, it should be understood that there may be other specific implementation manners of step 5, for example, decreasing the non-zero elements in X and replacing X one by one in the order of decreasing the non-zero elements from small to large or from large to small until X satisfies the constraint condition of the target optimization model, and so on.
In order to facilitate understanding of the method of the embodiments of the present application, the method of the embodiments of the present application will be further described with reference to specific embodiments.
FIG. 3 is a flowchart illustrating evaluation of metrics by a processor according to an embodiment of the present disclosure. It should be understood that in the embodiment of the present application, BBV is used as the feature index vector of the test program. The method of the embodiment of the present application is executed by an index evaluation apparatus of a processor disposed on a host device, where the apparatus may be an analog system or an execution chip, and the like, and the embodiment of the present application is not limited herein.
301, an instruction stream is fetched.
In one particular example, the host device may use an instruction stream fetching software tool to fetch an instruction stream of the test program, the instruction stream being composed of instructions. The host device can segment the instruction stream into BBs according to the jump instruction, and thus the instruction stream can be seen as being composed of BB sequences.
302, the instruction stream is partitioned into instruction fragments, and the BBV of each instruction fragment is calculated.
The host device may divide the instruction stream acquired in step 301 into a plurality of instruction fragments (intervals). For example, the instruction stream may be divided into instruction fragments of equal length, and the number and entry address of each instruction fragment may be recorded. Specifically, for example, each instruction fragment contains 1 billion BB, and so on. Of course, the division may be performed according to other instruction lengths, for example, 2 hundred million BBs, 10 hundred million BBs, etc., and the embodiment of the present application is not limited herein.
It is not assumed that the number of divided instruction fragments is N.
303, construct a matrix A, B, W.
The host device may construct the feature matrix a according to the BBV of each instruction segment after the instruction stream is split. Where each row represents the BBV of one instruction fragment.
The host device may calculate a column vector B based on the average value of each row in the feature matrix a. Specifically, the value of each row in B can be expressed by the following equation 10:
Figure GDA0003030818070000171
wherein, i is more than or equal to 1 and less than or equal to N, and M represents the number of BB types in the instruction segment.
In addition, a weight vector W may be set according to the difference in weight of the type of BB in the characteristic index of the test program. W is the column vector. Wherein each value of W represents a weight of the corresponding type of BB in the characteristic index of the test program.
FIG. 4 is a schematic diagram of a target optimization model according to an embodiment of the present application. As shown in FIG. 4, the rows of the feature matrix A correspond to the types of BB, the columns correspond to the divided instruction fragments, the feature matrix A is a matrix of M rows × N columns, and the element A in Ai,jIndicates the j-th instructionThe characteristic index value of the ith BB in the segment. The sparse solution matrix X is an optimization variable X in the embodiment shown in fig. 2, a position number of a non-zero coefficient in X is used to indicate a number of a selected instruction segment, a value of the non-zero coefficient indicates a weight of a finally calculated characteristic index of a characteristic index of the instruction segment corresponding to the non-zero coefficient, and the sparse solution matrix X is an N row × 1 column matrix. The column mean matrix B of a is a column vector formed by the column means of each column in a, and the matrix B is a matrix of M rows × 1 column, and its calculation method is shown in equation 10. The weight matrix W is a matrix of M rows by 1 column, and the ith element W in the weight matrixiAnd (3) representing the weight of the feature vector of the ith BB in the finally calculated feature index. As shown in fig. 4, X is a value satisfying AX · W — B · W. It is understood that AX.W and W.AX are equivalent; B.W and W.B are equivalent.
An optimized column vector is determined 304 based on A, B, W and the optimization objective.
Specifically, an objective optimization model may be constructed from A, B, W and the optimization objectives, and the optimization column vectors of the objective optimization model may be solved.
An optimization variable X can be introduced, wherein a non-zero element in X represents a selected instruction segment, and the value of the non-zero element in X represents the value of the weight corresponding to the instruction segment. That is, X is used to represent the instruction segment for simulation among the N instruction segments of the test program and the weight corresponding to each instruction segment.
It should be understood that in practical applications, W · AX and W · B will have a certain error, and the smaller the error, the smaller the simulation error; further, the smaller the number of non-zero coefficients in X, the fewer instruction fragments to be selected for the simulation test, and the shorter the time required for the simulation test. Thus, an objective optimization model may be constructed based on A, B, W according to the optimization objectives.
It should be appreciated that if the cost of the number of instruction fragments is high, then a higher weight may be placed on the optimization objective that affects the number of X non-zero coefficients. At this point, the optimization objective may include limiting the number of non-zero coefficients in X.
Specifically, for example, the number of non-zero coefficients in X is minimizedAnd (4) transforming. At this time, it can be achieved by minimizing the norm of X. The X norm may include the 0 norm | | | X | | luminance of X 01 norm of X (| X | | non-woven phosphor) of X1X of
Figure GDA0003030818070000181
Figure GDA0003030818070000182
Or, for example, such that the 1 norm of X is less than d. At this time, | | X | | non-woven phosphor can be restricted1≤d,d∈R+. Reducing non-zero numbers in X is achieved by constraining the 1-norm reduction of X to be less than a value that results in a reduction of non-zero numbers in X.
Of course, other constraint formulas may exist, and the embodiments of the present application are not listed here.
It should be appreciated that if certain simulation error accuracy is required, higher weights may be placed on the optimization objectives affecting W.AX-W.B.
Specifically, for example, the value of W · AX-W · B is minimized, or W · AX-W · B is smaller than d, or the norm of W · AX-W · B is smaller than d, or the like.
Based on the difference of the optimization objectives, different objective optimization models can be set.
For example, assume that the optimization goal of the target optimization model is to minimize the third norm of X when either the first norm of W.AX-W.B or the second norm of AX-B is less than or equal to the first threshold. At this time, the objective optimization model may be expressed by equation 1 above.
For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product and a second product when a first norm of X is smaller than or equal to a second threshold, where the first product is a product of a second norm of W · AX-W · B and a first coefficient, the second product is a product of a third norm of X and a second coefficient, and a value of the first coefficient is a non-zero real number. At this time, the target optimization model may be expressed by equation 3 above.
As another example, assume that the optimization objective of the objective optimization model is such that the first product,The sum of a second product, a third product, a fourth product and a fifth product is minimum, wherein the first product is the product of a first coefficient and a first norm of X, and the second product is the sum of a second coefficient and a fifth product
Figure GDA0003030818070000191
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000192
A fourth product is a product of a fourth coefficient and a fourth norm of L, a fifth product is a product of a fifth coefficient and a fifth norm of S, a is L + S, L is a low-rank matrix, S is a sparse matrix, L represents a column vector formed by an average value of each row of the low-rank matrix L, and S represents a column vector formed by an average value of each row of the sparse matrix S. At this time, the target optimization model can be expressed by the above equation 5.
For another example, assume that the optimization goal of the objective optimization model is to minimize the sum of a first product, a second product, a third product, a fourth product and a fifth product when the sixth norm of W.AX-W.B or the seventh norm of AX-B is less than or equal to a first threshold and/or the eighth norm of X is less than or equal to a second threshold, wherein the first product is the product of a first coefficient and the first norm of X, and the second product is the product of a second coefficient and the first norm of X
Figure GDA0003030818070000193
Is the product of a third coefficient and a second norm of
Figure GDA0003030818070000194
A fourth product of a fourth coefficient and a fourth norm of L, a fifth product of a fifth coefficient and a fifth norm of S, a ═ L + S, L is a low rank matrix, S is a sparse matrix,
Figure GDA0003030818070000195
a column vector representing the average of the rows of the low rank matrix L,
Figure GDA0003030818070000196
a column vector representing an average value of each row of the sparse matrix S. At this time, the target optimization model can be expressed by the above equations 7, 8, 9, and the like.
Without assuming that the formula of the objective optimization model of the embodiment of the present application is shown in formula 4, a method for establishing the objective optimization model and solving the optimized column vectors of the embodiment of the present application is shown in fig. 5. Fig. 5 is a flowchart of a method for determining an optimized column vector according to an embodiment of the present application.
S501, establishing a target optimization model, and determining a Lagrangian function f (X, Y, Z and mu) according to the target optimization model.
Specifically, an objective optimization model may be established based on A, B, W and an optimization objective, where the objective optimization model is used to represent a relationship between A, B, W and an optimization variable X under the optimization objective, and a value of X that satisfies the objective optimization model is an optimization column vector.
To solve for the optimized column vector, the lagrangian function f (X, Y, Z, μ) may be determined. Specifically, a relaxation variable Y can be introduced, a lagrangian multiplier method is used, a lagrangian multiplier Z is introduced, μ is a penalty parameter, and the value is a positive real number. The variable X of the Lagrange function is the optimized variable X of the target optimization model.
S502, setting initial values of variables X, Y, Z and mu in the Lagrangian function, and setting a convergence condition of the optimization completion.
For example, a convergence condition for completion of optimization when the profit of optimization is less than a predetermined threshold may be set. Specifically, taking the target optimization model shown in formula 4 as an example, if the obtained | | W · AX-W · B | | survival rate2+||X||1The value of (A) is less than the (| W.AX-W.B |) of the last circulation2+||X||1The magnitude of the decrease in the value of (b) is less than a predetermined threshold V, the optimization is deemed complete.
For another example, a maximum number of loop executions may be set, and when the loop reaches the maximum number, a convergence condition for completion of optimization may be set.
Of course, it should be understood that a plurality of determination conditions may be set, and any one of them may be satisfied. For example, if | | W · AX-W · B | | Wy calculation is obtained2+||X||1The value of (A) is less than the (| W.AX-W.B |) of the last circulation2+||X||1The amplitude of the decrease of the value of (b) is less than a predetermined threshold V, or the number of cycles reaches the maximum number, which is a convergence condition for completing optimization, and so on.
S503, an optimal value of Y is calculated using the least square method.
For a specific implementation of calculating Y by using the least square method, reference may be made to the prior art, and details of the embodiment of the present application are not repeated herein.
S504, the optimal value of X is calculated by using a soft threshold method.
For a specific implementation of calculating X by using the soft threshold method, reference may be made to the prior art, and details of the embodiment of the present application are not described herein again.
And S505, when the value of the X does not meet the constraint condition of the target optimization model, reducing a non-zero element in the X and replacing the X until the X meets the constraint condition of the target optimization model.
It should be understood that step S505 has different implementations according to different target optimization models.
For example, assuming that the target optimization model is the expression shown in formula 4, in step S505, the non-zero elements of X may be reduced to not more than d.
There are many ways to reduce non-zero elements in X.
Preferably, on the premise that X is reduced by one non-zero element, when X formed by reducing a first non-zero element in X minimizes the function value of the target optimization model, the first non-zero element is reduced and replaces X.
For example, in the target optimization model assumed in equation 4, in order to reduce the non-zero elements of X to no more than d, in order to make | | W · AX-W · B | | white magnetism2+||X||1The value of (A) is taken as the minimum value as much as possible, and when one nonzero element is reduced each time, the (| | W.AX-W.B | |) of each nonzero element can be reduced relatively2+||X||1Selecting | | | W.AX-W.B | | non-woven cells2+||X||1The minimum value of (a) is used as a principle of reducing the non-zero elements until the number of the non-zero elements does not exceed d.
Of course, the non-zero elements may be reduced according to the principle that the value of the non-zero elements in X is from large to small or from small to large. The method has the advantages of simple calculation, and smaller function value of the target optimization model due to the fact that multiple comparisons for reducing non-zero elements are not needed.
S506, the lagrange multiplier is updated according to the residual error of the equation X ═ Y, and the penalty coefficient μ is increased by a fixed multiple ρ, where ρ > 1.
S507, judging whether the convergence condition of the Lagrangian function f (X, Y, Z and mu) is met.
If the convergence condition of the Lagrangian function f (X, Y, Z, mu) is met, outputting the current value of X, and ending the execution;
if the convergence condition of the Lagrangian function f (X, Y, Z, μ) is not satisfied, step 503 is performed.
Steps S503 to S507 are executed in a loop until the convergence condition of the lagrangian function f (X, Y, Z, μ) is satisfied.
Of course, it should be understood that there may be other methods of solving the objective optimization model, and the embodiments of the present application are not limited thereto.
The number vector and coefficient vector of the instruction fragment to which the optimized column vector corresponds are determined 305.
The optimized column vector determined in step 304 is not recorded as X0
It is understood that X0The numbering of the non-zero position-corresponding segments is recorded in a numbering vector D, X0The non-zero coefficients are recorded in a coefficient vector G. The number vector D is used for representing the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector G is used for representing the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector.
And 306, simulating the instruction segments corresponding to the number vector D in the simulation device respectively to obtain the characteristic index vector of the instruction segment corresponding to the number vector D.
And the simulation test system simulates the instruction segment corresponding to the number vector D by using a simulation tool according to the number vector D to obtain the characteristic indexes of each instruction segment, and the characteristic indexes of each instruction segment corresponding to the number vector D form a characteristic index vector C.
The simulation tool may be a software or hardware emulator, such as LiveSP, Checkpoint, or the like. The characteristic indexes of the instruction segment obtained by the simulation tool can be IPC, predicted branch success/failure rate, energy consumption and the like.
307, determining the inner product of the feature index vector and the coefficient vector as the feature index for evaluating the test program.
From the coefficient vector D of step 305 and the characteristic index vector C of step 306, a characteristic index C · D of the evaluation test program can be obtained.
And finishing the index evaluation of the test program.
Fig. 6 is a schematic structural diagram of an index evaluation device 600 according to an embodiment of the present application. As shown in fig. 6, the index evaluation device 600 may include:
an obtaining unit 601, configured to obtain an instruction stream of a test program and divide the instruction stream into N instruction fragments with equal lengths;
a first determining unit 602, configured to determine a feature matrix a, a mean column vector B, and a weight vector W, where the feature matrix a is a matrix with M rows and N columns, and is used to describe feature indicator vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by a mean value of the feature indicator vectors in each row of the feature matrix a, and the weight vector W is a column vector formed by weights occupied by feature indicators of basic blocks BB of each row in the feature matrix a in a test program;
a second determining unit 603, configured to determine an optimized column vector according to the feature matrix a, the column vector B, the weight vector W, and an optimization goal, where the optimization goal includes reducing the number of instruction fragments used for simulation and/or reducing an error between a feature index of the test program and a feature index of the test program, and the optimized column vector is used to represent instruction fragments used for simulation in the N instruction fragments and a weight corresponding to each instruction fragment;
a third determining unit 604, configured to determine a number vector and a coefficient vector of the instruction fragment corresponding to the optimized column vector, where the number vector is used to indicate a number of the instruction fragment corresponding to a non-zero position in the optimized column vector, and the coefficient vector is used to indicate a weight of a feature indicator of the instruction fragment corresponding to the non-zero position in the optimized column vector;
a simulation unit 605, configured to simulate the instruction segments corresponding to the number vector in a simulation device, respectively, to obtain a feature indicator vector of the instruction segment corresponding to the number vector, where the feature indicator vector is used to indicate a feature indicator of the instruction segment corresponding to the number vector;
a fourth determining unit 606, configured to determine an inner product of the feature indicator vector and the coefficient vector as a feature indicator for evaluating the test program.
In the embodiment of the application, a target optimization model of a test program is constructed based on a feature matrix, a column vector and a weight vector of an instruction segment of the test program, an optimized column vector is determined based on the target optimization model and an optimized target, an instruction segment and a coefficient vector corresponding to a non-zero position are determined according to the optimized column vector, the instruction segment corresponding to the optimized column vector is simulated in a simulation device to obtain a feature index vector of the instruction segment corresponding to the optimized column vector, and a feature index for evaluating the test program is obtained according to the feature index vector and the coefficient vector, so that the number of the instruction segments required by simulation work of the test program can be reduced, or simulation index errors of the test program are reduced.
The index evaluation device 600 may also execute the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this embodiment of the present application.
Fig. 7 is a schematic structural diagram of an index evaluation apparatus 700 according to an embodiment of the present application. The metric evaluation device 700 may include a channel interface 701, a processor 702, and a memory 703.
The channel interface 701, the processor 702 and the memory 703 are interconnected by a bus 704 system. Bus 704 may be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 7, but this does not indicate only one bus or one type of bus.
The memory 703 is used for storing programs. In particular, the program may include program code comprising computer operating instructions. Memory 703 may include both read-only memory and random-access memory, and provides instructions and data to processor 702. The memory 703 may comprise high-speed RAM memory, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The processor 702 executes the program stored in the memory 703, and is specifically configured to perform the following operations:
acquiring an instruction stream of a test program and dividing the instruction stream into N instruction segments with equal length;
determining a feature matrix A, a mean column vector B and a weight vector W corresponding to the N instruction segments, wherein the feature matrix A is a matrix with M rows and N columns and is used for describing feature index vectors of the N instruction segments, M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean value of the feature index vectors of each row of the feature matrix A, and the weight vector W is a column vector formed by the weight occupied by the feature index of the basic block BB type corresponding to each row in the feature matrix A in a test program;
determining an optimized column vector according to the feature matrix A, the column vector B, the weight vector W and an optimization target, wherein the optimization target comprises reducing the number of instruction fragments for simulation and/or reducing an error between a feature index for evaluating the test program and the feature index of the test program, and the optimized column vector is used for representing the instruction fragments for simulation in the N instruction fragments and the weight corresponding to each instruction fragment;
determining a number vector and a coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used for indicating the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used for indicating the weight of the characteristic index of the instruction segment corresponding to the non-zero position in the optimized column vector;
simulating the instruction segments corresponding to the number vectors in a simulation device respectively to obtain characteristic index vectors of the instruction segments corresponding to the number vectors, wherein the characteristic index vectors are used for expressing characteristic indexes of the instruction segments corresponding to the number vectors;
and determining the inner product of the characteristic index vector and the coefficient vector as the characteristic index for evaluating the test program.
The method performed by the index estimation apparatus disclosed in any one of fig. 2, fig. 3, and fig. 5 may be applied to the processor 702, or implemented by the processor 702. The processor 702 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 702. The Processor 702 may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in the memory 703, and the processor 702 reads the information in the memory 703 and performs the steps of the above method in combination with the hardware thereof.
The index evaluation device 700 may also perform the method of fig. 2 and implement the functions of the index evaluation device in the embodiments shown in fig. 2, fig. 3, and fig. 5, which are not described herein again in this application.
The embodiment of the present application further provides a computer-readable storage medium for storing a computer program, where the computer program includes instructions for executing the method of the embodiment shown in fig. 2.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

1.一种指标评估方法,其特征在于,包括:1. an index evaluation method, is characterized in that, comprises: 获取测试程序的指令流并分割成长度相等的N个指令片段;Obtain the instruction stream of the test program and divide it into N instruction segments of equal length; 确定所述N个指令片段对应的特征矩阵A、均值列向量B和权重向量W,其中,所述特征矩阵A为M行N列的矩阵,用于描述所述N个指令片段的特征指标向量,M为所述N个指令片段中基本块BB的种类数,所述均值列向量B为所述特征矩阵A每行的特征指标向量的均值构成的列向量,所述权重向量W为所述特征矩阵A中每行对应的基本块BB类型在测试程序的特征指标所占的权重构成的列向量;Determine the characteristic matrix A, the mean column vector B and the weight vector W corresponding to the N instruction fragments, wherein the characteristic matrix A is a matrix with M rows and N columns, and is used to describe the characteristic index vectors of the N instruction fragments , M is the number of types of basic blocks BB in the N instruction segments, the mean column vector B is a column vector formed by the mean of the feature index vectors of each row of the feature matrix A, and the weight vector W is the A column vector composed of the weights of the basic block type BB corresponding to each row in the feature matrix A in the feature index of the test program; 根据所述特征矩阵A、所述均值列向量B、所述权重向量W和优化目标确定优化列向量,所述优化目标包括减少用于仿真的指令片段个数和/或减少评估所述测试程序的特征指标与所述测试程序的特征指标之间的误差,所述优化列向量用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重;An optimization column vector is determined according to the feature matrix A, the mean column vector B, the weight vector W, and an optimization objective including reducing the number of instruction fragments used for simulation and/or reducing the evaluation of the test program The error between the characteristic index of the test program and the characteristic index of the test program, the optimized column vector is used to represent the instruction segment used for simulation in the N instruction segments and the corresponding weight of each instruction segment; 确定所述优化列向量对应的指令片段的编号向量和系数向量,其中,所述编号向量用于表示所述优化列向量中非零位置对应的指令片段的编号,所述系数向量用于表示所述优化列向量中非零位置对应的指令片段的特征指标的权重;Determine the number vector and coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used to represent the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the coefficient vector is used to represent the The weight of the feature index of the instruction fragment corresponding to the non-zero position in the optimized column vector; 将所述编号向量对应的指令片段分别在模拟装置中进行模拟,得到所述编号向量对应的指令片段的特征指标向量,所述特征指标向量用于表示所述编号向量对应的指令片段的特征指标;The instruction fragments corresponding to the numbered vectors are simulated in the simulation device respectively, to obtain a feature index vector of the instruction segment corresponding to the numbered vector, and the feature index vector is used to represent the feature index of the instruction segment corresponding to the numbered vector ; 确定所述特征指标向量和所述系数向量的内积为评估所述测试程序的特征指标。The inner product of the feature index vector and the coefficient vector is determined as the feature index for evaluating the test program. 2.如权利要求1所述的方法,其特征在于,所述优化目标为:在W·AX-W·B的第一范数或AX-B的第二范数小于或等于第一阈值时,使得X的第三范数最小,其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,所述优化列向量为满足所述优化目标的X的取值。2. The method of claim 1, wherein the optimization objective is: when the first norm of W·AX-W·B or the second norm of AX-B is less than or equal to the first threshold , so that the third norm of X is the smallest, where X is an optimization variable, which is used to represent the instruction segment used for simulation in the N instruction segments and the corresponding weight of each instruction segment, and the optimized column vector satisfies the The value of X for the optimization objective. 3.如权利要求2所述的方法,其特征在于,所述根据所述特征矩阵A、所述均值列向量B、所述权重向量W和优化目标确定优化列向量包括:3. The method of claim 2, wherein the determining the optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimization objective comprises: 通过以下目标优化公式确定所述优化列向量:The optimized column vector is determined by the following objective optimization formula:
Figure FDA0003030818060000021
Figure FDA0003030818060000021
其中,||W·AX-W·B||t1表示W·AX-W·B的第一范数,||AX-B||t2表示AX-B的第二范数,||X||t3为表示X的第三范数,c表示所述第一阈值。Among them, ||W·AX-W·B|| t1 represents the first norm of W·AX-W·B, ||AX-B|| t2 represents the second norm of AX-B, ||X| | t3 represents the third norm of X, and c represents the first threshold.
4.如权利要求1所述的方法,其特征在于,所述优化目标为:4. The method of claim 1, wherein the optimization objective is: 在X的第一范数小于或等于第二阈值时,使得第一乘积与第二乘积之和最小,所述第一乘积为W·AX-W·B的第二范数与第一系数的乘积,所述第二乘积为X的第三范数与第二系数的乘积,第一系数的取值为非零实数,其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,所述优化列向量为满足所述优化目标的X的取值。When the first norm of X is less than or equal to the second threshold, the sum of the first product and the second product is minimized, where the first product is the second norm of W·AX-W·B and the first coefficient Product, the second product is the product of the third norm of X and the second coefficient, and the value of the first coefficient is a non-zero real number, where X is an optimization variable, used to represent the N instruction fragments used in the For the instruction segments to be simulated and the weights corresponding to each instruction segment, the optimization column vector is the value of X that satisfies the optimization objective. 5.如权利要求4所述的方法,其特征在于,5. The method of claim 4, wherein 所述根据所述特征矩阵A、所述均值列向量B、所述权重向量W和优化目标确定优化列向量包括:The determining the optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and the optimization objective includes: 通过以下目标优化公式确定所述优化列向量:The optimized column vector is determined by the following objective optimization formula: min λ||W·AX-W·B||t2+δ||X||t3 min λ||W·AX-W·B|| t2 +δ||X|| t3 s.t.||X||t1≤dst||X|| t1 ≤d 其中,d表示所述第二阈值,||X||t1表示X的第一范数,||W·AX-W·B||t2表示W·AX-W·B的第二范数,||X||t3表示X的第三范数,λ表示所述第一系数,δ表示所述第二系数,λ≠0。Wherein, d represents the second threshold, ||X|| t1 represents the first norm of X, ||W·AX-W·B|| t2 represents the second norm of W·AX-W·B, ||X|| t3 represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ≠0. 6.如权利要求1所述的方法,其特征在于,所述优化目标为:6. The method of claim 1, wherein the optimization objective is: 使得第一乘积、第二乘积、第三乘积、第四乘积、第五乘积之和最小,其中,所述第一乘积为第一系数与X的第一范数的乘积,第二乘积为第二系数与
Figure FDA0003030818060000022
的第二范数的乘积,第三乘积为第三系数与
Figure FDA0003030818060000023
的第三范数的乘积,第四乘积为第四系数与L的第四范数的乘积,第五乘积为第五系数与S的第五范数的乘积,A=L+S,L为低秩矩阵,S为稀疏矩阵,
Figure FDA0003030818060000024
表示低秩矩阵L的各行的平均值构成的列向量,
Figure FDA0003030818060000025
表示稀疏矩阵S的各行的平均值构成的列向量,第一系数、第二系数、第三系数、第四系数和第五系数不全为零,其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,所述优化列向量为满足所述优化目标的X的取值。
Minimize the sum of the first product, the second product, the third product, the fourth product, and the fifth product, where the first product is the product of the first coefficient and the first norm of X, and the second product is the first Two coefficients and
Figure FDA0003030818060000022
The product of the second norm of , the third product is the third coefficient and
Figure FDA0003030818060000023
The product of the third norm of , the fourth product is the product of the fourth coefficient and the fourth norm of L, the fifth product is the product of the fifth coefficient and the fifth norm of S, A=L+S, L is low-rank matrix, S is a sparse matrix,
Figure FDA0003030818060000024
is a column vector representing the mean value of each row of the low-rank matrix L,
Figure FDA0003030818060000025
represents a column vector composed of the average value of each row of the sparse matrix S, the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, where X is an optimization variable, used to represent the N The instruction segments used for simulation among the instruction segments and the weights corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective.
7.如权利要求1所述的方法,其特征在于,所述优化目标为:7. The method of claim 1, wherein the optimization objective is: 在W·AX-W·B的第六范数或AX-B的第七范数小于或等于第一阈值,和/或X的第八范数小于或等于第二阈值时,使得第一乘积、第二乘积、第三乘积、第四乘积、第五乘积之和最小,其中,所述第一乘积为第一系数与X的第一范数的乘积,第二乘积为第二系数与
Figure FDA0003030818060000031
的第二范数的乘积,第三乘积为第三系数与
Figure FDA0003030818060000032
的第三范数的乘积,第四乘积为第四系数与L的第四范数的乘积,第五乘积为第五系数与S的第五范数的乘积,A=L+S,L为低秩矩阵,S为稀疏矩阵,
Figure FDA0003030818060000033
表示低秩矩阵L的各行的平均值构成的列向量,
Figure FDA0003030818060000034
表示稀疏矩阵S的各行的平均值构成的列向量,第一系数、第二系数、第三系数、第四系数和第五系数不全为零,其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,所述优化列向量为满足所述优化目标的X的取值。
When the sixth norm of W·AX-W·B or the seventh norm of AX-B is less than or equal to the first threshold, and/or the eighth norm of X is less than or equal to the second threshold, such that the first product , the sum of the second product, the third product, the fourth product, and the fifth product is the smallest, wherein the first product is the product of the first coefficient and the first norm of X, and the second product is the second coefficient and the
Figure FDA0003030818060000031
The product of the second norm of , the third product is the third coefficient and
Figure FDA0003030818060000032
The product of the third norm of , the fourth product is the product of the fourth coefficient and the fourth norm of L, the fifth product is the product of the fifth coefficient and the fifth norm of S, A=L+S, L is low-rank matrix, S is a sparse matrix,
Figure FDA0003030818060000033
is a column vector representing the mean value of each row of the low-rank matrix L,
Figure FDA0003030818060000034
represents a column vector composed of the average value of each row of the sparse matrix S, the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein, X is an optimization variable, used to represent the N The instruction segments used for simulation among the instruction segments and the weights corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective.
8.如权利要求1-7中任一项所述的方法,其特征在于,所述根据所述特征矩阵A、所述均值列向量B、所述权重向量W和优化目标确定优化列向量包括:8. The method according to any one of claims 1-7, wherein the determining the optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimization objective comprises: : 步骤1:根据所述特征矩阵A、所述均值列向量B、所述权重向量W和优化目标建立所述优化目标的目标优化模型,并建立所述目标优化模型的拉格朗日函数f(X,Y,Z,μ),其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,μ为惩罚参数,Z为拉格朗日乘子,Y为松弛变量;Step 1: According to the feature matrix A, the mean column vector B, the weight vector W and the optimization objective, establish the objective optimization model of the optimization objective, and establish the Lagrangian function f of the objective optimization model ( X, Y, Z, μ), where X is an optimization variable, which is used to represent the instruction segment used for simulation in the N instruction segments and the corresponding weight of each instruction segment, μ is the penalty parameter, and Z is the pull factor Grange multiplier, Y is a slack variable; 步骤2:设置X、Y、Z和μ的初始值,并设置所述优化目标为优化完成的收敛条件;Step 2: Set the initial values of X, Y, Z and μ, and set the optimization objective as the convergence condition for the optimization to be completed; 步骤3:使用最小二乘法计算Y的最优值;Step 3: Use the least squares method to calculate the optimal value of Y; 步骤4:使用软阈值法计算X的最优值;Step 4: Calculate the optimal value of X using the soft threshold method; 步骤5:当X的取值不满足所述目标优化模型的约束条件时,减少X中的一个非零元素并替代X,直至X满足所述目标优化模型的约束条件;Step 5: when the value of X does not meet the constraints of the target optimization model, reduce a non-zero element in X and replace X, until X meets the constraints of the target optimization model; 步骤6:根据等式X=Y的残差更新拉格朗日乘子Z,并以固定的倍数ρ增大惩罚系数μ,其中ρ>1;Step 6: Update the Lagrange multiplier Z according to the residual of the equation X=Y, and increase the penalty coefficient μ by a fixed multiple ρ, where ρ>1; 步骤7:判断是否满足收敛条件,如果满足,则X为所求解,如果不满足,则重复执行步骤3-6。Step 7: Determine whether the convergence condition is satisfied. If it is satisfied, X is the solution to be solved. If it is not satisfied, repeat steps 3-6. 9.如权利要求8所述的方法,其特征在于,所述减少X中的一个非零元素并替代X包括:9. The method of claim 8, wherein said reducing a non-zero element in X and replacing X comprises: 在X减少一个非零元素前提下,当减少X中的第一非零元素形成的X使得所述目标优化模型的函数值最小,减少所述第一非零元素并替代X。On the premise that X is reduced by one non-zero element, when X formed by reducing the first non-zero element in X minimizes the function value of the objective optimization model, the first non-zero element is reduced and X is replaced. 10.如权利要求1所述的方法,其特征在于,所述特征指标包括如下的至少一种:10. The method of claim 1, wherein the characteristic index comprises at least one of the following: 每时钟指令数IPC、预测分支成功率、预测分支失败率、L2 Cache命中率、能耗。Instructions per clock IPC, predicted branch success rate, predicted branch failure rate, L2 Cache hit rate, and energy consumption. 11.一种处理器的指标评估装置,其特征在于,包括:11. A device for evaluating indicators of a processor, comprising: 获取单元,用于获取测试程序的指令流并分割成长度相等的N个指令片段;The acquisition unit is used to acquire the instruction stream of the test program and divide it into N instruction segments of equal length; 第一确定单元,用于确定所述N个指令片段对应的特征矩阵A、均值列向量B和权重向量W,其中,所述特征矩阵A为M行N列的矩阵,用于描述所述N个指令片段的特征指标向量,M为所述N个指令片段中基本块BB的种类数,所述均值列向量B为所述特征矩阵A每行的特征指标向量的均值构成的列向量,所述权重向量W为所述特征矩阵A中每行对应的基本块BB类型在测试程序的特征指标所占的权重构成的列向量;A first determination unit, configured to determine a feature matrix A, a mean column vector B, and a weight vector W corresponding to the N instruction fragments, wherein the feature matrix A is a matrix with M rows and N columns, and is used to describe the N The feature index vectors of each instruction segment, M is the number of types of basic blocks BB in the N instruction segments, and the mean column vector B is a column vector formed by the mean of the feature index vectors of each row of the feature matrix A, so Described weight vector W is the column vector that the basic block BB type corresponding to each row in the described feature matrix A is constituted by the weight of the feature index of the test program; 第二确定单元,用于根据所述特征矩阵A、所述均值列向量B、所述权重向量W和优化目标确定优化列向量,所述优化目标包括减少用于仿真的指令片段个数和/或减少评估所述测试程序的特征指标与所述测试程序的特征指标之间的误差,所述优化列向量用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重;A second determination unit, configured to determine an optimized column vector according to the feature matrix A, the mean column vector B, the weight vector W and an optimization objective, where the optimization objective includes reducing the number of instruction fragments used for simulation and/or Or reduce the error between evaluating the feature index of the test program and the feature index of the test program, the optimized column vector is used to represent the instruction segment and each instruction segment used for simulation among the N instruction segments corresponding weight; 第三确定单元,用于确定所述优化列向量对应的指令片段的编号向量和系数向量,其中,所述编号向量用于表示所述优化列向量中非零位置对应的指令片段的编号,所述系数向量用于表示所述优化列向量中非零位置对应的指令片段的特征指标的权重;The third determination unit is configured to determine the number vector and coefficient vector of the instruction segment corresponding to the optimized column vector, wherein the number vector is used to represent the number of the instruction segment corresponding to the non-zero position in the optimized column vector, and the The coefficient vector is used to represent the weight of the feature index of the instruction fragment corresponding to the non-zero position in the optimized column vector; 模拟单元,用于将所述编号向量对应的指令片段分别在模拟装置中进行模拟,得到所述编号向量对应的指令片段的特征指标向量,所述特征指标向量用于表示所述编号向量对应的指令片段的特征指标;The simulation unit is used for simulating the instruction fragments corresponding to the numbered vectors in the simulation device, respectively, to obtain a feature index vector of the instruction fragments corresponding to the numbered vector, and the feature index vector is used to represent the corresponding numbered vector. The feature index of the instruction fragment; 第四确定单元,用于确定所述特征指标向量和所述系数向量的内积为评估所述测试程序的特征指标。The fourth determining unit is configured to determine the inner product of the feature index vector and the coefficient vector as the feature index for evaluating the test program. 12.如权利要求11所述的装置,其特征在于,所述优化目标为:在W·AX-W·B的第一范数或AX-B的第二范数小于或等于第一阈值时,使得X的第三范数最小,其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,所述优化列向量为满足所述优化目标的X的取值。12. The apparatus according to claim 11, wherein the optimization objective is: when the first norm of W·AX-W·B or the second norm of AX-B is less than or equal to a first threshold , so that the third norm of X is the smallest, where X is an optimization variable, which is used to represent the instruction segment used for simulation in the N instruction segments and the corresponding weight of each instruction segment, and the optimized column vector satisfies the The value of X for the optimization objective. 13.如权利要求12所述的装置,其特征在于,所述第二确定单元用于通过以下目标优化公式确定所述优化列向量:13. The apparatus of claim 12, wherein the second determining unit is configured to determine the optimized column vector by the following objective optimization formula:
Figure FDA0003030818060000051
Figure FDA0003030818060000051
其中,||W·AX-W·B||t1表示W·AX-W·B的第一范数,||AX-B||t2表示AX-B的第二范数,||X||t3为表示X的第三范数,c表示所述第一阈值。Among them, ||W·AX-W·B|| t1 represents the first norm of W·AX-W·B, ||AX-B|| t2 represents the second norm of AX-B, ||X| | t3 represents the third norm of X, and c represents the first threshold.
14.如权利要求11所述的装置,其特征在于,所述优化目标为:14. The apparatus of claim 11, wherein the optimization objective is: 在X的第一范数小于或等于第二阈值时,使得第一乘积与第二乘积之和最小,所述第一乘积为W·AX-W·B的第二范数与第一系数的乘积,所述第二乘积为X的第三范数与第二系数的乘积,第一系数取值为非零实数,其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,所述优化列向量为满足所述优化目标的X的取值。When the first norm of X is less than or equal to the second threshold, the sum of the first product and the second product is minimized, where the first product is the second norm of W·AX-W·B and the first coefficient Product, the second product is the product of the third norm of X and the second coefficient, and the first coefficient is a non-zero real number, where X is an optimization variable, used to indicate that the N instruction fragments are used for The instruction segments to be simulated and the weights corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective. 15.如权利要求14所述的装置,其特征在于,所述第二确定单元用于通过以下目标优化公式确定所述优化列向量:15. The apparatus of claim 14, wherein the second determining unit is configured to determine the optimized column vector by the following objective optimization formula:
Figure FDA0003030818060000052
Figure FDA0003030818060000052
其中,d表示所述第二阈值,||X||t1表示X的第一范数,||W·AX-W·B||t2表示W·AX-W·B的第二范数,||X||t3表示X的第三范数,λ表示所述第一系数,δ表示所述第二系数,λ≠0。Wherein, d represents the second threshold, ||X|| t1 represents the first norm of X, ||W·AX-W·B|| t2 represents the second norm of W·AX-W·B, ||X|| t3 represents the third norm of X, λ represents the first coefficient, δ represents the second coefficient, and λ≠0.
16.如权利要求11所述的装置,其特征在于,所述优化目标为:16. The apparatus of claim 11, wherein the optimization objective is: 使得第一乘积、第二乘积、第三乘积、第四乘积、第五乘积之和最小,其中,所述第一乘积为第一系数与X的第一范数的乘积,第二乘积为第二系数与
Figure FDA0003030818060000053
的第二范数的乘积,第三乘积为第三系数与
Figure FDA0003030818060000054
的第三范数的乘积,第四乘积为第四系数与L的第四范数的乘积,第五乘积为第五系数与S的第五范数的乘积,A=L+S,L为低秩矩阵,S为稀疏矩阵,
Figure FDA0003030818060000055
表示低秩矩阵L的各行的平均值构成的列向量,
Figure FDA0003030818060000056
表示稀疏矩阵S的各行的平均值构成的列向量,第一系数、第二系数、第三系数、第四系数和第五系数不全为零,其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,所述优化列向量为满足所述优化目标的X的取值。
Minimize the sum of the first product, the second product, the third product, the fourth product, and the fifth product, where the first product is the product of the first coefficient and the first norm of X, and the second product is the first Two coefficients and
Figure FDA0003030818060000053
The product of the second norm of , the third product is the third coefficient and
Figure FDA0003030818060000054
The product of the third norm of , the fourth product is the product of the fourth coefficient and the fourth norm of L, the fifth product is the product of the fifth coefficient and the fifth norm of S, A=L+S, L is low-rank matrix, S is a sparse matrix,
Figure FDA0003030818060000055
is a column vector representing the mean value of each row of the low-rank matrix L,
Figure FDA0003030818060000056
represents a column vector composed of the average value of each row of the sparse matrix S, the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein, X is an optimization variable, used to represent the N The instruction segments used for simulation among the instruction segments and the weights corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective.
17.如权利要求11所述的装置,其特征在于,所述优化目标为:17. The apparatus of claim 11, wherein the optimization objective is: 在W·AX-W·B的第六范数或AX-B的第七范数小于或等于第一阈值,和/或X的第八范数小于或等于第二阈值时,使得第一乘积、第二乘积、第三乘积、第四乘积、第五乘积之和最小,其中,所述第一乘积为第一系数与X的第一范数的乘积,第二乘积为第二系数与
Figure FDA0003030818060000061
的第二范数的乘积,第三乘积为第三系数与
Figure FDA0003030818060000062
的第三范数的乘积,第四乘积为第四系数与L的第四范数的乘积,第五乘积为第五系数与S的第五范数的乘积,A=L+S,L为低秩矩阵,S为稀疏矩阵,
Figure FDA0003030818060000063
表示低秩矩阵L的各行的平均值构成的列向量,
Figure FDA0003030818060000064
表示稀疏矩阵S的各行的平均值构成的列向量,第一系数、第二系数、第三系数、第四系数和第五系数不全为零,其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,所述优化列向量为满足所述优化目标的X的取值。
When the sixth norm of W·AX-W·B or the seventh norm of AX-B is less than or equal to the first threshold, and/or the eighth norm of X is less than or equal to the second threshold, such that the first product , the sum of the second product, the third product, the fourth product, and the fifth product is the smallest, wherein the first product is the product of the first coefficient and the first norm of X, and the second product is the second coefficient and the
Figure FDA0003030818060000061
The product of the second norm of , the third product is the third coefficient and
Figure FDA0003030818060000062
The product of the third norm of , the fourth product is the product of the fourth coefficient and the fourth norm of L, the fifth product is the product of the fifth coefficient and the fifth norm of S, A=L+S, L is low-rank matrix, S is a sparse matrix,
Figure FDA0003030818060000063
is a column vector representing the mean value of each row of the low-rank matrix L,
Figure FDA0003030818060000064
represents a column vector composed of the average value of each row of the sparse matrix S, the first coefficient, the second coefficient, the third coefficient, the fourth coefficient and the fifth coefficient are not all zero, wherein, X is an optimization variable, used to represent the N The instruction segments used for simulation among the instruction segments and the weights corresponding to each instruction segment, and the optimized column vector is the value of X that satisfies the optimization objective.
18.如权利要求11-17中任一项所述的装置,其特征在于,所述第二确定单元根据所述特征矩阵A、所述均值列向量B、所述权重向量W和优化目标确定优化列向量具体实现为:18. The apparatus according to any one of claims 11-17, wherein the second determination unit determines the feature matrix A, the mean column vector B, the weight vector W and an optimization objective according to the feature matrix A The optimized column vector is specifically implemented as: 步骤1:根据所述特征矩阵A、所述均值列向量B、所述权重向量W和优化目标建立所述优化目标的目标优化模型,并建立所述目标优化模型的拉格朗日函数f(X,Y,Z,μ),其中,X为优化变量,用于表示所述N个指令片段中用于进行模拟的指令片段及每个指令片段对应的权重,μ为惩罚参数,Z为拉格朗日乘子,Y为松弛变量;Step 1: According to the feature matrix A, the mean column vector B, the weight vector W and the optimization objective, establish the objective optimization model of the optimization objective, and establish the Lagrangian function f of the objective optimization model ( X, Y, Z, μ), where X is an optimization variable, which is used to represent the instruction segment used for simulation in the N instruction segments and the corresponding weight of each instruction segment, μ is the penalty parameter, and Z is the pull factor Grange multiplier, Y is a slack variable; 步骤2:设置X、Y、Z和μ的初始值,并设置所述优化目标为优化完成的收敛条件;Step 2: Set the initial values of X, Y, Z and μ, and set the optimization objective as the convergence condition for the optimization to be completed; 步骤3:使用最小二乘法计算Y的最优值;Step 3: Use the least squares method to calculate the optimal value of Y; 步骤4:使用软阈值法计算X的最优值;Step 4: Calculate the optimal value of X using the soft threshold method; 步骤5:当X的取值不满足所述目标优化模型的约束条件时,减少X中的一个非零元素并替代X,直至X满足所述目标优化模型的约束条件;Step 5: when the value of X does not meet the constraints of the target optimization model, reduce a non-zero element in X and replace X, until X meets the constraints of the target optimization model; 步骤6:根据等式X=Y的残差更新拉格朗日乘子Z,并以固定的倍数ρ增大惩罚系数μ,其中ρ>1;Step 6: Update the Lagrange multiplier Z according to the residual of the equation X=Y, and increase the penalty coefficient μ by a fixed multiple ρ, where ρ>1; 步骤7:判断是否满足收敛条件,如果满足,则X为所求解,如果不满足,则重复执行步骤3-6。Step 7: Determine whether the convergence condition is satisfied. If it is satisfied, X is the solution to be solved. If it is not satisfied, repeat steps 3-6. 19.如权利要求18所述的装置,其特征在于,所述第二确定单元减少X中的一个非零元素并替代X具体实现为:19. The apparatus of claim 18, wherein the second determining unit reduces a non-zero element in X and replaces X and is specifically implemented as: 在X减少一个非零元素前提下,当减少X中的第一非零元素形成的X使得所述目标优化模型的函数值最小,减少所述第一非零元素并替代X。On the premise that X is reduced by one non-zero element, when X formed by reducing the first non-zero element in X minimizes the function value of the objective optimization model, the first non-zero element is reduced and X is replaced. 20.如权利要求11所述的装置,其特征在于,所述特征指标包括如下的至少一种:20. The apparatus of claim 11, wherein the characteristic index comprises at least one of the following: 每时钟指令数IPC、预测分支成功率、预测分支失败率、L2 Cache命中率、能耗。Instructions per clock IPC, predicted branch success rate, predicted branch failure rate, L2 Cache hit rate, and energy consumption.
CN201610944177.6A 2016-11-02 2016-11-02 Index evaluation method and device Expired - Fee Related CN108008999B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610944177.6A CN108008999B (en) 2016-11-02 2016-11-02 Index evaluation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610944177.6A CN108008999B (en) 2016-11-02 2016-11-02 Index evaluation method and device

Publications (2)

Publication Number Publication Date
CN108008999A CN108008999A (en) 2018-05-08
CN108008999B true CN108008999B (en) 2021-07-20

Family

ID=62048196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610944177.6A Expired - Fee Related CN108008999B (en) 2016-11-02 2016-11-02 Index evaluation method and device

Country Status (1)

Country Link
CN (1) CN108008999B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114691457B (en) * 2020-12-28 2025-08-05 安徽寒武纪信息科技有限公司 Method, device, storage medium and electronic device for determining hardware performance
WO2023015560A1 (en) * 2021-08-13 2023-02-16 Huawei Technologies Co.,Ltd. Systems and methods for sparsity-aware vector processing in general purpose cpus
CN115543719B (en) * 2022-11-24 2023-04-07 飞腾信息技术有限公司 Component optimization method and device based on chip design, computer equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6751792B1 (en) * 2000-10-04 2004-06-15 Sun Microsystems, Inc. Using value-expression graphs for data-flow optimizations
CN103049310A (en) * 2012-12-29 2013-04-17 中国科学院深圳先进技术研究院 Multi-core simulation parallel accelerating method based on sampling
CN104268085A (en) * 2014-10-24 2015-01-07 重庆邮电大学 Software vulnerability discovery system and method based on attribute extraction
CN105654120A (en) * 2015-12-25 2016-06-08 东南大学—无锡集成电路技术研究所 Two-step cluster software load feature extraction method based on SOM and K-means

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9098352B2 (en) * 2013-07-17 2015-08-04 Deja Vu Security, Llc Metaphor based language fuzzing of computer code

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6751792B1 (en) * 2000-10-04 2004-06-15 Sun Microsystems, Inc. Using value-expression graphs for data-flow optimizations
CN103049310A (en) * 2012-12-29 2013-04-17 中国科学院深圳先进技术研究院 Multi-core simulation parallel accelerating method based on sampling
CN104268085A (en) * 2014-10-24 2015-01-07 重庆邮电大学 Software vulnerability discovery system and method based on attribute extraction
CN105654120A (en) * 2015-12-25 2016-06-08 东南大学—无锡集成电路技术研究所 Two-step cluster software load feature extraction method based on SOM and K-means

Also Published As

Publication number Publication date
CN108008999A (en) 2018-05-08

Similar Documents

Publication Publication Date Title
CN105589993B (en) Microprocessor function verification apparatus and microprocessor function verification method
EP1835426A1 (en) Estimating software power consumption
CN112990426A (en) Cooperative use of genetic algorithms and optimization trainers for automated encoder generation
US10102323B2 (en) Micro-benchmark analysis optimization for microprocessor designs
JP2011134329A (en) Method and apparatus to efficiently generate processor architecture model
CN110178123B (en) Performance index evaluation method and device
CN116149917A (en) Method and apparatus for evaluating processor performance, computing device, and readable storage medium
CN108008999B (en) Index evaluation method and device
CN112668706A (en) Performing genetic algorithms on a low power controller
CN117933073A (en) Method and device for exploring design space of CPU micro-architecture
CN117272896A (en) Machine learning techniques for circuit design verification
KR20200072391A (en) Method and apparatus for predicting game indicator information
JP5153724B2 (en) Processing time estimation device and processing time estimation program
CN114818600A (en) Chip verification method and device, electronic equipment and storage medium
CN108664368B (en) Processor performance index evaluation method and device
CN115543719B (en) Component optimization method and device based on chip design, computer equipment and medium
Röhl et al. Validation of hardware events for successful performance pattern identification in high performance computing
CN117973622A (en) A short-term load forecasting method considering meteorological influence
CN112602059B (en) Generate vector predicate summary
Ali et al. RV-ProViler: Evaluating RISC-V ISA for application-specific requirements
Madougou et al. Using colored petri nets for GPGPU performance modeling
CN110750856B (en) Effective instruction window size assessment method based on machine learning
CN107678734B (en) The Construction Method of CPU Benchmark Test Assembly Based on Genetic Algorithm
CN108564135B (en) A method for building skeleton programs and realizing high performance computing program runtime prediction
EP3518153A1 (en) Information processing method and information processing system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210720