Disclosure of Invention
In order to solve the above technical problems, the invention provides a built-in self-test scheme for an in-memory computing macro unit that is efficient, low in cost and capable of achieving high fault coverage. The scheme includes a built-in self-test method for the in-memory computing macro unit, characterized by comprising executing a first test mode that includes the following steps:
Inserting a compute enable vector and a compute enable instruction in an MBIST test algorithm to conduct a compute test after a read operation to detect a coupling failure between a memory array and its bit multiplication logic, and
Judging whether the stored reading result and the calculated result are correct, if so, entering a second test mode or judging that the self-test is correct and finishing the detection, otherwise, diagnosing the fault position based on the detection result.
According to the testing method provided by the application, the special fault of data coupling between the storage unit and the computing unit can be further and effectively detected on the premise of covering the traditional memory fault type by inserting the computation enabling vector and the computation activating vector in the traditional MBIST algorithm and verifying the additionally executed multiply-accumulate computation operation after the read operation, so that the fault coverage rate is improved.
Optionally, the built-in self-test method for the in-memory computing macro unit provided by the application further includes executing a second test mode:
Generating a set of automatic test vectors by ATPG, traversing the set of automatic test vectors and corresponding activation vectors for performing multiply-accumulate operations for testing calculation logic, respectively, and
Judging whether the calculation result is correct, if not, diagnosing the fault position, if so, judging that the self-test is correct and finishing the detection.
The second test mode supplements the first test mode to form a multi-mode combined test scheme, which improves the integrity, robustness and reliability of the overall test of the in-memory computing unit. The second test mode uses ATPG to automatically generate input patterns with high fault coverage, which improves both test efficiency and coverage.
Optionally, the built-in self-test method for the in-memory computing macro unit provided by the application further comprises the following steps:
a mode selection step of selecting a corresponding test mode based on a test requirement, wherein the test mode comprises a first test mode, a second test mode and a sequential test mode;
If only the storage test is needed, executing a first test mode;
executing the second test mode if only the calculation test is needed, and
If the storage test and the calculation test are needed to be carried out at the same time, the sequential test mode is executed, namely the first test mode is executed first, and then the second test mode is executed.
By supporting storage, computation and combined test modes, the method adapts flexibly to different test requirements and enhances the generality and configurability of the test scheme. The configuration logic of the mode selection mechanism raises the degree of intelligence and automation of the test flow and simplifies test control. Finally, invoking tests on demand avoids redundant testing, which improves test efficiency, shortens test time and reduces resource consumption.
Optionally, the built-in self-test method for the in-memory computing macro unit provided by the application, the first test mode includes:
A weight writing step: ascending addresses are sequentially output to the test address channel, test data is written into each row of the memory array through the test weight channel in the cycle corresponding to each address, and the write operation is synchronously controlled by the CIM write enable signal, so that all physical addresses of the memory array are loaded with uniform known data, providing a known weight reference for subsequent tests;
a computation activation step: the address range of the weight writing step is traversed again, the test input channel is loaded with an activation vector, and the CIM computation enable signal and the CIM read enable signal are pulled high to trigger the memory array read and the multiply-accumulate computation path to start together; in each cycle, the memory array reads the weights of the corresponding row and performs a parallel bit multiplication with the input activation values, after which accumulation is completed by an adder tree and the computation result is output; and
a result judging and error detecting step: if the computed result is consistent with the expected value, the next test mode is entered or BIST correct is output to finish the test; otherwise, the system drives the BIST correct signal low and enters the diagnosis flow.
First, the uniform known-weight writing mechanism makes the subsequent computation output predictable, provides an ideal reference value for fault detection, and improves test controllability and accuracy.
Second, the synchronous control mechanism of the CIM write enable signal and the computation/read enable signals allows the write and computation processes to be precisely controlled, avoids timing misalignment or test distortion, and enhances test stability.
Optionally, the built-in self-test method for the in-memory computing macro unit provided by the application, the second test mode includes:
A test vector weight loading step: the automatic test vectors generated by ATPG are written into the corresponding memory array address rows, and the write timing is controlled by the CIM write enable signal to ensure that the automatic test vector data is correctly configured;
an activation vector loading and computation triggering step: after the automatic test vectors have been written, the system synchronously loads the corresponding activation vectors onto the computation path, pulls the CIM computation enable signal high and starts the multiply-accumulate operation; and
a result judging step: the test results are self-compared between adjacent columns; if all automatic test vector tests pass, the BIST test pass signal is pulled high and a BIST end instruction is issued; if any automatic test vector test fails, the error location information is recorded and output.
The main technical effects of this design include:
Improved test precision of the computation path: the diverse test vectors generated by ATPG can accurately cover corner-case errors and interconnection anomalies of complex computation paths.
Reduced test dependence: the self-comparison mechanism eliminates the need for reference golden values, reducing test storage cost and external control complexity.
Improved stability of timing control: fine-grained control signals such as CIM write enable and CIM computation enable ensure the synchronism and consistency of the test process.
Automatic diagnosis and feedback: once a fault is found, the system automatically records and outputs the fault location, improving fault localization and debugging efficiency.
Strong extensibility: the scheme can be extended to multiple groups of vector tests or different activation combination tests, improving the flexibility of the overall BIST.
In order to achieve the above object, the present application provides a built-in self-test architecture for an in-memory computing macro unit, comprising:
An MBIST module configured to generate a test address and a test weight for testing a memory array of the in-memory computing macro-cell in a first test mode;
An LBIST module configured to generate test inputs for coupling fault detection between the memory array and its bit multiplication logic in a first test mode, and to provide an automatic test vector for testing the computation logic of the in-memory computation macro-cell in a second test mode;
the BIST controller is connected with the MBIST module and the LBIST module and used for controlling the operation of the MBIST module and the LBIST module;
A multiplexer network connected among the BIST controller, the MBIST module, the LBIST module and the in-memory computing macro unit for switching inputs corresponding to modes between a functional mode and a test mode, and
And the diagnosis unit is connected with the BIST controller and used for analyzing and outputting the fault position according to the test feedback information.
Optionally, the MBIST module includes:
A memory decoder for generating test weights, and
And the storage counter is connected with the storage decoder and is used for generating a test address.
Optionally, the LBIST module includes:
A logic decoder for generating test inputs and automatic test vectors, and
And the logic counter is connected with the logic decoder and is used for controlling the time sequence of the automatic test vector.
In order to achieve the above object, the present application provides a built-in self-test system for an in-memory computing macro unit, including any one of the built-in self-test architecture for an in-memory computing macro unit described above, further including:
a memory computing macro unit connected with the built-in self-test architecture, and
The comparison unit is connected with the in-memory calculation macro unit and the diagnosis unit, and can compare the stored test result and/or the calculated test result of the in-memory calculation macro unit with a preset value and send the stored error information and/or the logic error information to the diagnosis unit.
Optionally, the comparing unit includes a logic comparator and a storage comparator, which are respectively used for calculating the comparison of the test results and storing the comparison of the test results, and the logic comparator determines whether the calculation results are correct based on the consistency of the calculation results of two adjacent columns.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
FIG. 1 illustrates an in-memory computing array and two typical fault models thereof. The in-memory computing array is used to perform the following operation:
vector-matrix multiplication (VMM) or matrix-matrix multiplication (MMM), i.e., Y = W × A,
where A is the input vector or input matrix, input through address lines A0, A1, ..., AM; W is the weight matrix, stored in the memory cells; and Y is the output vector or matrix, representing the result of the weighted computation.
With continued reference to FIG. 1, the weight storage unit W{i,j} stores the weight value of the i-th row and j-th column, and is typically implemented using SRAM, RRAM, FeRAM or other technologies.
The bit multiplication unit, i.e. the multiplier ×, receives the corresponding input Ai for each row and computes W{i,j} · Ai for every column in parallel.
The adder array accumulates the multiplication results of all rows in the same column to realize the vector dot product.
The multi-column computation structure selects the results of different columns and outputs them as Y{0,0} to Y{0,N}.
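The data flow described above can be illustrated with a small behavioural model. The following Python sketch is only an illustration under simplifying assumptions that are not stated in the source: weights and inputs are single bits, each bit multiplication is modelled as a logical AND, and all function names are hypothetical.

```python
from typing import List


def bit_multiply(weight_bit: int, input_bit: int) -> int:
    """Model one bit multiplication unit as a logical AND (assumption)."""
    return weight_bit & input_bit


def adder_tree(values: List[int]) -> int:
    """Sum the values pairwise, level by level, as an adder tree would."""
    level = list(values)
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd levels with a zero operand
        level = [level[k] + level[k + 1] for k in range(0, len(level), 2)]
    return level[0] if level else 0


def column_outputs(weights: List[List[int]], inputs: List[int]) -> List[int]:
    """Return one accumulated result per column: Y[j] = sum_i W[i][j] * A[i]."""
    num_cols = len(weights[0])
    return [
        adder_tree([bit_multiply(row[j], a) for row, a in zip(weights, inputs)])
        for j in range(num_cols)
    ]


W = [[1, 0],
     [1, 1],
     [0, 1]]
A = [1, 1, 0]
print(column_outputs(W, A))  # [2, 1]
```

Each row contributes one product per column, and each column is reduced independently, which is why every column can be evaluated in the same cycle.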
Of the two typical fault models, the first is a weight-input coupling fault, located between an input signal and a weight storage unit. Electrical coupling between the input and the weight may introduce a voltage disturbance, so that the stored weight becomes unstable when read. In other words, the input signal may affect the stability of the storage unit and cause the weight to be read incorrectly, so that the weight value used in the multiplication is wrong and the result is erroneous even if the input is correct.
The second is a coupling fault between the weight and the bit multiplication result, located between the weight storage unit and the multiplication result output. The stored weight value affects the multiplier result output line during output, so that the output multiplication result may be corrupted by weight changes, or the original result may be disturbed by bypass signals at the output stage. Consequently, even if the input and the weight are both correct, the final result may be wrong because of the influence of the weight on the output path.
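To see why a read-only memory test can miss the first fault model, consider the following toy model. It is a sketch under stated assumptions and does not describe the behaviour of any particular circuit: the class `CimRow`, its `coupled_col` parameter and the rule that an active input inverts the weight seen by the multiplier are all hypothetical.

```python
class CimRow:
    """One row of 1-bit weights with an optional weight-input coupling defect."""

    def __init__(self, weights, coupled_col=None):
        self.weights = list(weights)
        self.coupled_col = coupled_col  # hypothetical defective column index

    def read(self):
        """Plain memory read: the input line is idle, so no disturbance occurs."""
        return list(self.weights)

    def compute(self, activation):
        """Bit products of this row for a 1-bit activation."""
        products = []
        for col, weight in enumerate(self.weights):
            if col == self.coupled_col and activation == 1:
                weight ^= 1  # assumed effect: active input disturbs the weight
            products.append(weight & activation)
        return products


faulty = CimRow([1, 1, 1, 1], coupled_col=2)
print(faulty.read())      # [1, 1, 1, 1]  -> a read-only test sees no fault
print(faulty.compute(1))  # [1, 1, 0, 1]  -> the compute test exposes column 2
```

In this model the defect is invisible to the read operation and only shows up once the input is driven, which is the situation the compute test after the read operation is meant to catch.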
Since DCIM couples the SRAM memory and the computation logic together, a circuit can fail not only because of faults in the SRAM or in the computation logic themselves, but also because of coupling faults between the two, which are faults specific to DCIM. Having identified these faults specific to in-memory computing, the applicant proposes a built-in self-test architecture 100 for an in-memory computing macro unit that is efficient, low-cost and capable of achieving high fault coverage. As shown in FIG. 2, the architecture comprises:
An MBIST module 10 for generating test stimuli for the memory path of the in-memory computing macro unit 200. Specifically, the MBIST module 10 is configured to generate test addresses and test weights for testing the memory array of the in-memory computing macro unit in the first test mode. The MBIST module 10 includes a memory decoder 11 for generating test weights and decoding into control signals, and a memory counter 12 connected to the memory decoder 11 for generating test addresses;
An LBIST module 20 for generating test stimuli for the computation path of the in-memory computing macro unit 200. Specifically, it generates test inputs for detecting coupling faults between the memory array and its bit multiplication logic in the first test mode, and provides automatic test vectors for testing the computation logic of the in-memory computing macro unit 200 in the second test mode. The LBIST module 20 includes a logic decoder 21 for generating the test inputs of the logic function unit and the automatic test vectors, and a logic counter 22 connected to the logic decoder 21 for controlling the timing of the automatic test vectors;
BIST controller 30, connected to MBIST module 10 and LBIST module 20, for controlling operation of MBIST module 10 and LBIST module 20;
A multiplexer network 40 connected between the BIST controller 30, the MBIST module 10, and the LBIST module 20 and the in-memory computation macro cell 200, capable of switching between a functional mode and a test mode and gating inputs corresponding to the modes, and
A diagnostic unit 50, coupled to the BIST controller 30, for analyzing and outputting the fault location and/or fault type based on the test feedback information.
Specifically, the multiplexer network 40 includes a weight selector 41 for selecting between test weights and functional weights, an input selector 42 for selecting between test inputs and functional inputs (matrices or vectors), an address selector 43 for selecting between test addresses and the output of the logic counter, and a control signal selector 44 for selecting between test control signals and functional control signals. The input chosen by each selector is determined by the output instruction of the BIST controller 30; in other words, each selector operates in the test mode or the functional mode under the control of the BIST controller 30.
Specifically, the logic decoder 21 integrates ATPG for generating test vectors. ATPG is an automated tool that generates test vectors for digital circuits with the aim of detecting specific types of faults (such as opens, shorts, coupling and delay faults), thereby verifying whether the chip has manufacturing defects or functional failures. In some embodiments, ATPG may be provided separately from the logic decoder 21 and connected thereto.
With continued reference to fig. 2, the present embodiment further provides a built-in self-test system of the in-memory computing macro unit, which includes the built-in self-test architecture 100 of the in-memory computing macro unit, the in-memory computing macro unit 200, and the comparing unit 300.
Optionally, the comparing unit 300 includes a logic comparator 310 and a storage comparator 320, which are used to compare the computation test results and the storage test results, respectively. More specifically, the logic comparator 310 judges whether a computation result is correct based on the consistency of the computation results of two adjacent columns, as shown in FIG. 3. Because this embodiment adopts a self-comparison method, the same activation and weights can be input into every column without generating an ideal reference result, and correctness is judged by checking that the results of adjacent columns agree, which greatly reduces the additional hardware cost.
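The adjacent-column self-comparison can be sketched as follows. This is an illustrative Python model only; the function name and the representation of the column outputs as a list of integers are assumptions introduced for the example.

```python
from typing import List, Tuple


def adjacent_column_mismatches(column_results: List[int]) -> List[Tuple[int, int]]:
    """Return every pair of neighbouring columns whose results disagree.

    All columns receive the same weights and activation, so a fault-free
    macro is expected to produce identical results in every column.
    """
    return [
        (col, col + 1)
        for col in range(len(column_results) - 1)
        if column_results[col] != column_results[col + 1]
    ]


print(adjacent_column_mismatches([5, 5, 5, 5]))  # []
print(adjacent_column_mismatches([5, 5, 4, 5]))  # [(1, 2), (2, 3)]
```

No golden value appears anywhere in the check. One consequence of this simplified model is that a defect corrupting every column in exactly the same way would produce no mismatch.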
The in-memory computing macro unit 200 is connected to the built-in self-test architecture 100, and the comparing unit 300 is connected to the in-memory computing macro unit 200 and the diagnosing unit 50, and is capable of comparing the stored test result and/or the computed test result of the in-memory computing macro unit 200 with a preset value, and transmitting the stored error information and/or the logic error information to the diagnosing unit 50.
This embodiment also provides a built-in self-test method for the in-memory computing macro unit, which comprises a first test mode and a second test mode. FIG. 4 is a flow chart of the first test mode. After the first test mode is started, a basic storage test algorithm is first selected and executed, with a computation enable vector inserted so that a computation with the activation vector is performed after each read operation. It is then judged whether the storage read result and the computation result are correct. If not, the fault location is diagnosed. If so, it is further judged whether the current operation is the last operation of the algorithm: if yes, BIST correct is output; if not, execution of the test algorithm continues, again with the computation enable vector inserted so that the computation test is performed after the read operation.
FIG. 5 is a flow chart of the second test mode. After the second test mode is started, the ATPG test vectors are traversed and it is judged whether the computation result is correct. If not, the error location is diagnosed. If so, it is further judged whether the current vector is the last test vector: if yes, BIST correct is output; if not, the flow returns to traversing the ATPG test vectors.
The operation of the built-in self-test architecture 100 of the in-memory computing macro unit shown in FIG. 2 is described below with reference to FIG. 6. Upon receiving the BIST enable signal, the BIST controller 30 switches the multiplexer network 40 to gate the test control signals, test weights, test inputs and test addresses. If the test instruction requires both the memory function and the computation function to be tested, the flow is as follows:
First, the first test mode is executed, i.e. the MBIST module 10 runs a test based on a conventional MBIST algorithm such as the March C- algorithm. An address traversal and weight writing step is then performed: the BIST controller 30 sequentially outputs ascending addresses (0-511) to the test address channel and, in the cycle corresponding to each address, writes all-1 test data into each row of the SRAM through the test weight channel. The write operation is synchronously controlled by the CIM write enable signal, so that all physical addresses of the SRAM are loaded with uniform known data, providing a known weight reference for subsequent tests.
After the write is completed, a computation activation step is performed: the BIST controller 30 traverses the address range (0-511) again, loads a 32-bit all-1 activation vector (i.e., 32'hFFFFFFFF) onto the test input (also referred to as input activation) channel, and pulls the "CIM computation enable" signal and the "CIM read enable" signal high to trigger the SRAM read and the multiply-accumulate computation path to start together. In each cycle, the SRAM reads the all-1 weights of the corresponding row, performs a parallel bit multiplication with the input activation values, completes the accumulation through an adder tree, and outputs the computation result.
After the computation is completed, a result judging and error detecting step is performed. The expected output of the system is the multiply-accumulate result of the 32-bit all-1 weights and the 32-bit all-1 activation, which is a fixed value. The computed output is sent to the logic comparator for judgment: if the computed value is consistent with the expected value, the next test mode is entered; otherwise, the system drives BIST correct low and enters the diagnosis flow.
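The three steps above (weight writing, computation activation, result judging) can be modelled in a few lines of Python. This is a minimal sketch rather than the actual hardware behaviour. It assumes that each weight bit is ANDed with the matching activation bit and that the products are summed, in which case the fixed expected value would be 32; the source does not state this number. The `faults` argument is a hypothetical fault-injection hook used only for illustration.

```python
ROWS = 512                    # addresses 0-511, as in the embodiment
WIDTH = 32                    # 32-bit all-1 weight and activation words
ALL_ONES = (1 << WIDTH) - 1   # 0xFFFFFFFF


def mac(weight_word: int, activation_word: int) -> int:
    """AND the two words bit by bit and sum the products (assumed MAC model)."""
    return bin(weight_word & activation_word).count("1")


def run_first_mode(faults=None, rows=ROWS):
    """Write all-1 weights, compute on every row, compare with the expected value.

    `faults` maps an address to a mask of stored bits to flip (hypothetical).
    Returns (bist_correct, list_of_failing_addresses).
    """
    faults = faults or {}
    memory = [0] * rows
    for addr in range(rows):                      # weight writing step
        memory[addr] = ALL_ONES ^ faults.get(addr, 0)
    expected = mac(ALL_ONES, ALL_ONES)            # fixed expected value
    failing = [addr for addr in range(rows)       # computation activation step
               if mac(memory[addr], ALL_ONES) != expected]
    bist_correct = 0 if failing else 1            # result judging step
    return bist_correct, failing


print(mac(ALL_ONES, ALL_ONES))    # 32
print(run_first_mode())           # (1, [])
print(run_first_mode({7: 0b1}))   # (0, [7])
```

The list of failing addresses stands in for the information that would be handed to the diagnosis flow.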
In this test stage, the logic comparator checks whether the final accumulated result equals the expected value. Because every intermediate path (read, compute and store) takes part in the data transfer, SRAM bitcell faults, read path errors, local faults of the multipliers/adders and coupling defects in the data path are all effectively covered.
It should be noted that all-1 test inputs are used only as an example; for different basic MBIST algorithms, and for different steps of the same MBIST algorithm, different test inputs may be selected to match the corresponding algorithm.
According to the testing method provided by the embodiment, the computation enabling vector and the computation activating vector are inserted in the traditional MBIST algorithm, after the reading operation, the special fault of data coupling between the storage unit and the computing unit can be further and effectively detected on the premise of covering the traditional memory fault type through verification of additionally executed multiply-accumulate computation operation, and the fault coverage rate is improved.
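One possible way to express this insertion is to treat the MBIST algorithm as a list of march elements and add a compute operation after every read. The sketch below uses the standard March C- sequence as the base algorithm; the `c0`/`c1` operation names (compute against data background 0 or 1) and the helper functions are hypothetical notation introduced for illustration.

```python
# March C-: {any(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); any(r0)}
MARCH_C_MINUS = [
    ("any", ["w0"]),
    ("up", ["r0", "w1"]),
    ("up", ["r1", "w0"]),
    ("down", ["r0", "w1"]),
    ("down", ["r1", "w0"]),
    ("any", ["r0"]),
]


def insert_compute(elements):
    """Return a copy of the march elements with a compute op after each read."""
    result = []
    for direction, ops in elements:
        new_ops = []
        for op in ops:
            new_ops.append(op)
            if op.startswith("r"):
                new_ops.append("c" + op[1:])  # compute on the value just read
        result.append((direction, new_ops))
    return result


def schedule(elements, num_rows):
    """Expand march elements into a flat sequence of (address, operation)."""
    for direction, ops in elements:
        if direction == "down":
            addresses = range(num_rows - 1, -1, -1)
        else:  # "up" or "any" are both walked in ascending order here
            addresses = range(num_rows)
        for addr in addresses:
            for op in ops:
                yield addr, op


extended = insert_compute(MARCH_C_MINUS)
print(extended[1])                        # ('up', ['r0', 'c0', 'w1'])
print(len(list(schedule(extended, 4))))   # 60
```

March C- performs 10 operations per address, five of which are reads, so the extended sequence in this sketch costs 15 operations per address.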
If the computed value obtained in the first test mode is consistent with the expected value, the second test mode is executed (also referred to as entering the second test stage). The second test mode mainly performs a coverage test on complex combinational logic structures in the CIM structure, such as the multipliers and adder trees, using stimulus and expected-response pairs generated in advance by an automatic test pattern generation (ATPG) tool.
The second test mode includes the steps of:
A test vector weight loading step: the BIST controller 30 writes the weight vectors (W0, W1, ..., W31) generated by ATPG (Automatic Test Pattern Generation) into the corresponding SRAM address rows in ascending address order. The weight data width is 256 bits, and each group of weights corresponds to a specific activation vector and is used for the combined test of the computation path. During loading, the CIM write enable signal controls the write timing to ensure that the weight data is correctly configured.
An activation vector loading and computation triggering step: after the weight writing is completed, the system synchronously loads the corresponding activation vectors (A0, A1, ..., Ai) onto the computation path, pulls CIM computation enable high, and starts the multiply-accumulate operation. In this process, the output results of the multipliers are compared and judged by the self-comparison module to verify the functional correctness of the multipliers.
A result judging step: after all ATPG vector tests pass, the BIST controller 30 pulls the BIST pass signal high and issues a BIST end instruction to complete the whole BIST flow. If any test fails, the BIST controller 30 records the error location information and outputs it through the diagnostic unit 50.
It should be noted that the BIST test is deemed to have passed only when BIST correct and BIST end are both 1.
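The second test mode can be summarised by the following Python sketch. It is illustrative only and rests on several assumptions: the ATPG patterns are supplied as a ready-made list of (weight, activation) pairs, the macro is represented by a callable returning one result per column, the flow continues through all patterns after a failure, and the names `run_second_mode`, `healthy_macro` and `faulty_macro` are hypothetical.

```python
def run_second_mode(patterns, macro_compute):
    """Apply each (weight, activation) pair and self-compare adjacent columns."""
    errors = []
    for index, (weight, activation) in enumerate(patterns):
        columns = macro_compute(weight, activation)  # same data in every column
        for col in range(len(columns) - 1):
            if columns[col] != columns[col + 1]:
                errors.append((index, col, col + 1))  # pattern and column pair
    bist_correct = 0 if errors else 1
    bist_end = 1  # raised once every pattern has been applied
    return {
        "bist_correct": bist_correct,
        "bist_end": bist_end,
        "passed": bist_correct == 1 and bist_end == 1,
        "errors": errors,
    }


def healthy_macro(weight, activation, num_cols=4):
    """Fault-free model: every column yields the same AND-and-sum result."""
    value = bin(weight & activation).count("1")
    return [value] * num_cols


def faulty_macro(weight, activation):
    """Hypothetical defect: the least significant bit of column 2 is inverted."""
    columns = healthy_macro(weight, activation)
    columns[2] ^= 1
    return columns


patterns = [(0b1011, 0b1110), (0b0110, 0b0101)]  # stand-ins for ATPG vectors
print(run_second_mode(patterns, healthy_macro)["passed"])  # True
print(run_second_mode(patterns, faulty_macro)["errors"])
```

The `passed` flag mirrors the rule that a pass requires both BIST correct and BIST end to be 1, and the recorded tuples play the role of the error location information.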
Alternatively, if only the memory function is to be tested, only the relevant steps of the first test mode may be performed, and similarly, if only the logic function is to be tested, only the relevant steps of the second test mode may be performed. Specifically, as shown in fig. 4, the built-in self-test method of the in-memory computing macro unit further includes a mode selection step, if only a memory test is needed, entering a first test mode, if only a computation/logic test is needed, entering a second test mode, and if both a memory test and a computation test are needed, executing a sequential test mode, that is, executing a related step of the first test mode first and then executing a related step of the second test mode.
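The mode selection step amounts to a small dispatch on the test requirement. A minimal Python sketch follows; the enum, the function names and the use of callables returning `True` on a pass are assumptions made for the example.

```python
from enum import Enum, auto
from typing import Callable


class TestMode(Enum):
    FIRST = auto()       # storage test with inserted compute checks
    SECOND = auto()      # ATPG-based computation/logic test
    SEQUENTIAL = auto()  # first test mode, then second test mode


def select_mode(need_storage: bool, need_compute: bool) -> TestMode:
    """Map the test requirement onto one of the three test modes."""
    if need_storage and need_compute:
        return TestMode.SEQUENTIAL
    if need_storage:
        return TestMode.FIRST
    if need_compute:
        return TestMode.SECOND
    raise ValueError("no test requested")


def run_bist(mode: TestMode,
             first_mode: Callable[[], bool],
             second_mode: Callable[[], bool]) -> bool:
    """Run the selected mode(s); each callable returns True when it passes."""
    if mode is TestMode.FIRST:
        return first_mode()
    if mode is TestMode.SECOND:
        return second_mode()
    # Sequential mode: the second mode is entered only if the first one passes.
    return first_mode() and second_mode()


mode = select_mode(need_storage=True, need_compute=True)
print(mode.name)                                   # SEQUENTIAL
print(run_bist(mode, lambda: True, lambda: True))  # True
```

Because only the requested modes are invoked, a storage-only or computation-only request never pays for the other test.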
Optionally, as shown in fig. 7, the present embodiment provides a built-in self-test method for an in-memory computing macro unit, including the steps of:
Inserting a compute enable vector and a compute enable instruction in an MBIST test algorithm to conduct a compute test after a read operation to detect a coupling failure between a memory array and its bit multiplication logic, and
Judging whether the stored reading result and the calculated result are correct, if so, entering a second test mode or judging that the self-test is correct and finishing the detection, otherwise, diagnosing the fault position based on the detection result.
Optionally, as shown in fig. 8, the method for built-in self-testing of an in-memory computing macro unit according to the present embodiment further includes executing a second test mode:
Generating a set of automatic test vectors by ATPG, traversing the set of automatic test vectors and corresponding activation vectors for performing multiply-accumulate operations for testing calculation logic, respectively, and
Judging whether the stored reading result and the calculated result are correct, if not, diagnosing the fault position, if so, judging that the self-test is correct and finishing the detection.
Optionally, as shown in fig. 9, the built-in self-test method for an in-memory computing macro unit provided in this embodiment further includes the steps of:
A mode selection step of selecting a corresponding test mode based on a test requirement, wherein the test mode comprises a first test mode, a second test mode and a sequential test mode;
If only the storage test is needed, executing a first test mode;
executing the second test mode if only the calculation test is needed, and
If the storage test and the calculation test are needed to be carried out at the same time, the sequential test mode is executed, namely the first test mode is executed first, and then the second test mode is executed.
Optionally, the first test mode includes:
A weight writing step: ascending addresses are sequentially output to the test address channel, test data is written into each row of the memory array through the test weight channel in the cycle corresponding to each address, and the write operation is synchronously controlled by the CIM write enable signal, so that all physical addresses of the memory array are loaded with uniform known data, providing a known weight reference for subsequent tests;
A computation activation step: the address range of the weight writing step is traversed again, the test input channel is loaded with an activation vector, and the CIM computation enable signal and the CIM read enable signal are pulled high to trigger the memory array read and the multiply-accumulate computation path to start together; in each cycle, the memory array reads the weights of the corresponding row, performs a parallel bit multiplication with the input activation values, completes the accumulation through an adder tree, and outputs the computation result; and
A result judging and error detecting step: if the computed result is consistent with the expected value, the next test mode is entered or BIST correct is output to finish the test; otherwise, the system drives BIST correct low and error diagnosis is carried out.
Optionally, the second test mode includes:
A test vector weight loading step: the automatic test vectors generated by ATPG are written into the corresponding memory array address rows, and the write timing is controlled by the CIM write enable signal to ensure that the automatic test vector data is correctly configured;
An activation vector loading and computation triggering step: after the automatic test vectors have been written, the system synchronously loads the corresponding activation vectors onto the computation path, pulls CIM computation enable high, and starts the multiply-accumulate operation; and
A result judging step: the test results are self-compared between adjacent columns; if all automatic test vectors pass the test, the BIST correct signal is pulled high and a BIST end instruction is issued; if any automatic test vector fails the test, the error location information is recorded and output.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.