
NL2036520B1 - CIM allowing self-compensation via complementary data storage

CIM allowing self-compensation via complementary data storage

Info

Publication number
NL2036520B1
NL2036520B1
Authority
NL
Netherlands
Prior art keywords
bit
bit storage
current
lines
cim
Prior art date
Application number
NL2036520A
Other languages
Dutch (nl)
Inventor
Singh Abhairaj
Hamdioui Said
Kumar Bishnoi Rajendra
Original Assignee
Univ Delft Tech
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Delft Tech filed Critical Univ Delft Tech
Priority to NL2036520A priority Critical patent/NL2036520B1/en
Application granted granted Critical
Publication of NL2036520B1 publication Critical patent/NL2036520B1/en

Classifications

    • G PHYSICS
      • G06 COMPUTING OR CALCULATING; COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F 7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
            • G06F 7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
              • G06F 7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
                • G06F 7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using non-contact-making devices, e.g. tube, solid state device; using unspecified devices, for evaluating functions by calculation
                  • G06F 7/5443 Sum of products
          • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
            • G06F 17/10 Complex mathematical operations
              • G06F 17/15 Correlation function computation including computation of convolution operations
              • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
        • G06G ANALOGUE COMPUTERS
          • G06G 7/00 Devices in which the computing operation is performed by varying electric or magnetic quantities
            • G06G 7/12 Arrangements for performing computing operations, e.g. operational amplifiers
              • G06G 7/16 Arrangements for performing computing operations, e.g. operational amplifiers, for multiplication or division
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
                • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
      • G11 INFORMATION STORAGE
        • G11C STATIC STORES
          • G11C 7/00 Arrangements for writing information into, or reading information out from, a digital store
            • G11C 7/10 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
              • G11C 7/1006 Data managing, e.g. manipulating data before writing or reading out, data bus switches or control circuits therefor
          • G11C 11/00 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
            • G11C 11/54 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor, using elements simulating biological cells, e.g. neuron
          • G11C 13/00 Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00
            • G11C 13/0002 Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00, using resistive RAM [RRAM] elements
              • G11C 13/0021 Auxiliary circuits
                • G11C 13/0023 Address circuits or decoders
                  • G11C 13/0026 Bit-line or column circuits
                  • G11C 13/0028 Word-line or row circuits
                • G11C 13/003 Cell access
                • G11C 13/0033 Disturbance prevention or evaluation; Refreshing of disturbed memory data
                • G11C 13/004 Reading or sensing circuits or methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Algebra (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Neurology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Power Engineering (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Static Random-Access Memory (AREA)

Abstract

CIM device comprising: a plurality of word lines; a plurality of bit lines; a plurality of select lines; and a plurality of bit storage units; wherein each bit storage unit of the plurality of bit storage units comprises: a first bit storage element connecting a bit line of the plurality of bit lines with a select line of the plurality of select lines, addressable via a word line of the plurality of word lines, wherein the first bit storage element is configured to store a respective bit of the binary data; and a second bit storage element connecting another bit line of the plurality of bit lines with said select line, addressable via said word line, wherein the second bit storage element is configured to store a binary complement of said respective bit of the binary data.

Description

CIM allowing self-compensation via complementary data storage
TECHNICAL FIELD
The present disclosure generally relates to CIM. Particular embodiments relate to a
CIM device for storing and processing binary data and to a method of operating a CIM device.
BACKGROUND
Computing-in-memory (CIM) — also denoted as computation in memory, in-memory computation, processing-in-memory, or in-memory computing — is the technique of running computer calculations entirely in computer memory (e.g. in RAM). This term typically implies large-scale, complex calculations which require specialized systems software to run the calculations on computers working together in a cluster. CIM architecture has demonstrated great potential to effectively compute large-scale matrix-vector multiplication, which is useful for many applications, ranging from economics and business activities to public administration, and from national security to many scientific research areas, due to the need for data storage and analysis, as well as for other AI-oriented applications, such as speech sequencing and face recognition, which heavily rely on large matrix operations.
However, the intensive multiply and accumulation (MAC) operations executed in CIM devices remain bottlenecks for further improvement of energy efficiency and throughput. To reduce computational costs and storage, model compression is a widely studied method to shrink the model size. In some attempts, a quantization algorithm was used, wherein the input and weight bit width is limited to reduce the computational complexity by using different types of quantizers.
SUMMARY
Therefore, it is an aim of at least some embodiments to improve CIM.
Particular embodiments moreover aim to provide one or more of the following advantages: simpler CIM, higher sensing margins, improved robustness, improved speed, and improved energy efficiency.
Accordingly, there is provided in a first aspect according to the present disclosure a computation in memory, CIM, device according to claim 1, for storing and processing binary data; the CIM device comprising: - a plurality of word lines; - a plurality of bit lines; - a plurality of select lines; and - a plurality of bit storage units; wherein each bit storage unit of the plurality of bit storage units comprises: - a first bit storage element connecting a bit line of the plurality of bit lines with a select line of the plurality of select lines, addressable via a word line of the plurality of word lines, wherein the first bit storage element is configured to store a respective bit of the binary data; and - a second bit storage element connecting another bit line of the plurality of bit lines with said select line, addressable via said word line, wherein the second bit storage element is configured to store a binary complement of said respective bit of the binary data.
Advantageously, various embodiments of the CIM device according to the present disclosure allow that, if the word line of an individual bit storage unit of the plurality of bit storage units is activated, a summed current stemming from the first bit storage element and the second bit storage element can be determined, and is expected to be equal to or higher than a fixed current value.
In a way, this is a form of self-compensation, because each individual bit storage unit (containing two complementary bit storage elements) is arranged to deliver (when addressed) a current value which can be calculated, regardless of whether it stores a 1 or 0 (e.g. high or low) data bit.
This is because both the first bit storage element and the second bit storage element contribute to the current output, and because the first bit storage element and the second bit storage element store the same information, but in a binary complementary manner compared to each other.
Therefore, when the first bit storage element is, say, high, the second bit storage element of the same bit storage unit must be, say, low, such that, if this bit storage unit is addressed, its output current is predictable (i.e. it can be evaluated or computed based on observations and logic) because it always depends on one high and one low bit storage element.
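To make this self-compensation property concrete, the following minimal Python sketch (not the patented circuit itself; the conductance and read-voltage values are assumed purely for illustration) models one bit storage unit as a complementary pair of conductances and checks that the summed read-out current is the same whether the unit stores a 0 or a 1.

```python
# Minimal behavioural sketch of one complementary bit storage unit.
# G_ON, G_OFF and V_READ are assumed illustrative values, not taken from the disclosure.
G_ON = 100e-6   # conductance of an element storing a 1 (high state), in siemens
G_OFF = 1e-6    # conductance of an element storing a 0 (low state), in siemens
V_READ = 0.2    # voltage difference between select line and bit line, in volts

def unit_current(stored_bit: int) -> float:
    """Summed current of an addressed bit storage unit (true element + complement element)."""
    g_first = G_ON if stored_bit == 1 else G_OFF   # first bit storage element
    g_second = G_OFF if stored_bit == 1 else G_ON  # second element stores the complement
    return (g_first + g_second) * V_READ

# The unit delivers the same fixed current regardless of the stored data bit.
assert abs(unit_current(0) - unit_current(1)) < 1e-18
```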
In some embodiments, the CIM device comprises a compensation means configured for, if the word line of an individual bit storage unit of the plurality of bit storage units is not activated, compensating a lacking current contribution of said individual bit storage unit to the select line corresponding with said individual bit storage unit, based on the fixed current value.
Of course, the compensation of the lacking current contribution is to said select line.
In some embodiments, the compensation means is configured for: - determining a maximum current of a select line of the plurality of select lines; - determining an actual current of the select line; - determining a difference between the maximum current and the actual current; and - compensating the difference.
In some embodiments, the compensation means is configured for: - receiving a value representing a number of activated word lines of the plurality of word lines; - determining a compensation current based on the received value and the fixed current value; and - compensating the compensation current.
In some embodiments, the compensation means comprises:
- a bit adder configured to sum signals on the plurality of word lines in order to determine the number of activated word lines; - a digital-to-analog converter, DAC, configured for converting the determined number of activated word lines into a bias current; and - a plurality of compensation blocks corresponding with the plurality of select lines, wherein each compensation block is configured for receiving the bias current from the
DAC and for providing the compensation current for its corresponding select line.
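A behavioural sketch of this compensation chain is given below, assuming illustrative current values and a 256-row array; the names and constants are placeholders rather than figures from the disclosure. A bit-adder step counts the activated word lines, and a per-column compensation step tops the select-line current up to its maximum value.

```python
# Behavioural sketch of the count-based compensation means (illustrative values only).
N_ROWS = 256                  # total number of word lines (assumed array size)
I_ON, I_OFF = 20e-6, 0.2e-6   # assumed per-element currents for high/low states
I_UNIT = I_ON + I_OFF         # fixed current of one addressed complementary pair

def count_activated(wl_enable: list[int]) -> int:
    """Bit adder: sum the word-line enable signals to get the number of activated rows."""
    return sum(wl_enable)

def compensation_current(k_activated: int) -> float:
    """Per-column compensation block: current needed to reach the maximum total current."""
    return (N_ROWS - k_activated) * I_UNIT

wl_enable = [1] * 50 + [0] * (N_ROWS - 50)     # example pattern: 50 word lines ON
k = count_activated(wl_enable)
i_column = k * I_UNIT                          # contribution of the activated units
i_total = i_column + compensation_current(k)   # current seen on the select line
assert abs(i_total - N_ROWS * I_UNIT) < 1e-12  # select-line current held constant
```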
In some embodiments, the plurality of bit storage units defines a bitcell array.
In some embodiments, the CIM device comprises at least one of a Static Random
Access Memory, SRAM, Resistive Random Access Memory, RRAM, Magnetoresistive
Random Access Memory, MRAM, Ferroelectric Field-Effect Transistors, FeFET, and
Phase Change Memory, PCM.
Additionally, there is provided in a second aspect according to the present disclosure a method according to claim 8, which is a method of operating a computation in memory, CIM, device according to any of the above-described embodiments; the method comprising: - obtaining binary data; - storing the binary data in respective first bit storage elements of the plurality of bit storage units; and - storing a binary complement of the binary data in respective second bit storage elements of the plurality of bit storage units.
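As a small illustration of the two storing steps, the Python sketch below uses plain lists as stand-ins for the first and second bit storage elements; the data structures are purely illustrative and do not model the physical memory.

```python
# Illustrative sketch of the storing steps of the method (lists stand in for the arrays).
def store(binary_data: list[int]) -> tuple[list[int], list[int]]:
    first_elements = list(binary_data)              # store each respective bit
    second_elements = [1 - b for b in binary_data]  # store the binary complement of each bit
    return first_elements, second_elements

true_bits, complement_bits = store([1, 0, 1, 1, 0])
assert all(t + c == 1 for t, c in zip(true_bits, complement_bits))
```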
The skilled person will appreciate that the considerations and advantages applying to embodiments of the CIM device according to the present disclosure may apply as well to embodiments of the method according to the present disclosure, mutatis mutandis, and vice versa.
In some embodiments, the method comprises, if the word line of an individual bit storage unit of the plurality of bit storage units is not activated, compensating a lacking current contribution of said individual bit storage unit to the select line corresponding with said individual bit storage unit, based on the fixed current value.
In some embodiments, the method comprises: 5 - determining a maximum current of a select line of the plurality of select lines; - determining an actual current of the select line; - determining a difference between the maximum current and the actual current; and - compensating the difference.
In some embodiments, the method comprises: - receiving a value representing a number of activated word lines of the plurality of word lines; - determining a compensation current based on the received value and the fixed current value; and - compensating the compensation current.
In some embodiments, the method comprises operating a compensation means by: - using a bit adder to sum signals on the plurality of word lines in order to determine the number of activated word lines; - using a digital-to-analog converter, DAC, for converting the determined number of activated word lines into a bias current; and - using a plurality of compensation blocks corresponding with the plurality of select lines, wherein each compensation block receives the bias current from the DAC and provides the compensation current for its corresponding select line.
In some embodiments, the plurality of bit storage units defines a bitcell array.
In some embodiments, the CIM device comprises at least one of a Static Random
Access Memory, SRAM, Resistive Random Access Memory, RRAM, Magnetoresistive
Random Access Memory, MRAM, Ferroelectric Field-Effect Transistors, FeFET, and
Phase Change Memory, PCM.
The embodiments described herein are provided for illustrative purposes and should not be construed as limiting the scope of the invention. It is to be understood that the invention encompasses other embodiments and variations that are within the scope of the appended claims. The invention is not restricted to the specific configurations, arrangements, and features described herein. The invention has wide applicability and should not be limited to the specific examples provided. The embodiments disclosed are merely exemplary, and the skilled person will appreciate that various modifications and alternative designs can be made without departing from the scope of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
In the following description, a number of exemplary embodiments will be described in more detail, to help understanding, with reference to the appended drawings, in which:
Fig. 1 illustrates schematically an embodiment of a CIM device according to the first aspect of the present disclosure;
Fig. 2 illustrates schematically another, further developed embodiment of a CIM device according to the first aspect of the present disclosure;
Fig. 3 illustrates schematically yet another embodiment of a CIM device according to the first aspect of the present disclosure;
Fig. 4 illustrates schematically the voltage difference between the select line and the bit line of an embodiment of a CIM device according to the first aspect of the present disclosure;
Fig. 5 illustrates schematically an embodiment of a method according to the second aspect of the present disclosure of operating a computation in memory, CIM, device according to the first aspect of the present disclosure; and
Fig. 6 illustrates schematically another embodiment of a CIM device according to the first aspect of the present disclosure.
DETAILED DESCRIPTION
Figure 1 illustrates schematically an embodiment of a CIM device 1 according to the first aspect of the present disclosure. The CIM device 1 is suitable for storing and processing binary data. The figure schematically shows that the CIM device 1 comprises: a plurality of word lines WL; a plurality of bit lines BL, NBL; a plurality of select lines CSL; and a plurality of bit storage units 3. In this figure, for the sake of clarity, only one word line WL, one select line CSL, and one bit storage unit 3 are shown.
However, the skilled person will appreciate how pluralities of these elements (and a plurality of sets of bit lines BL, NBL) may be implemented when contrasting with
Figure 2.
The figure shows that each bit storage unit 3 of the plurality of bit storage units 3 comprises: - a first bit storage element 5 connecting a bit line BL of the plurality of bit lines BL, NBL with a select line CSL of the plurality of select lines CSL, addressable via a word line WL of the plurality of word lines WL, wherein the first bit storage element 5 is configured to store a respective bit of the binary data; and - a second bit storage element 7 connecting another bit line NBL of the plurality of bit lines BL, NBL with said select line CSL, addressable via said word line WL, wherein the second bit storage element 7 is configured to store a binary complement of said respective bit of the binary data.
The figure also shows, as an example, that the first bit storage element 5 stores a certain bit of some binary data, said certain bit having a value represented with top-left to bottom-right hatching. This value may be 1 or 0 — for this example, let us assume it is a value of 0, so the first bit storage element 5 stores a 0 as said certain bit. The second bit storage element is shown to store another value, represented with bottom-left to top-right hatching. It is intended that this hatching is opposite from the hatching of the first bit storage element 5. In our exemplary assumption, this hatching would represent a value of 1, so the second bit storage element 7 stores a 1, which is the binary complement of the 0 stored in the corresponding first bit storage element 5.
In other words, by arranging the first bit storage element 5 and the second bit storage element 7 in this manner, both the binary true or intended (in this case 0) and the binary false or opposite (in this case 1) value can be stored simultaneously for a given bit (assumed to be 0) of some binary data. Because both the first bit storage element 5 and the second bit storage element 7 are connected to the same select line CSL,
the current on the select line CSL can be predicted to be based on both the first bit storage element 5 and the second bit storage element 7, because the same word line WL addresses them both. This means that the current on the select line CSL, represented in the figure as I_CSL(i), is based on a combination of the current of one bit storage element storing a 1 (represented in the figure as I_ON) and of another bit storage element storing a 0 (represented in the figure as I_OFF). This would be the same irrespective of whether the first 5 or the second 7 bit storage element stores the given bit of the binary data.
This means that, regardless of the actually intended data bit of some binary data that is stored in a given bit storage unit 3, the outcome on the select line CSL will be predictable. This forms a sort of self-compensation mechanism, which can advantageously be put to further use.
An example of such a further use is to stabilize the CIM device, e.g. by virtually fixing the common select line CSL (which may be a virtual ground in case of SRAM), even without the use of an amplifier. If all word lines WL are ON (i.e., if all bit storage units 3 are selected), the total current I_CSL is constant, hence keeping the select line CSL itself constant. When not all word lines WL are ON, information on the number of ON word lines WL may be used to compensate for the missing contributions.
This may for example be achieved in further developed embodiments through the use of a compensation means configured for, if the word line of an individual bit storage unit of the plurality of bit storage units is not activated, compensating a lacking current contribution of said individual bit storage unit to the select line corresponding with said individual bit storage unit, based on the fixed current value.
A particular implementation of such compensation means may for example be configured for: - determining a maximum current of a select line SL of the plurality of select lines SL; - determining an actual current I_CSL of the select line; - determining a difference between the maximum current and the actual current; and - compensating the difference.
Alternatively or additionally, such compensation means may for example be configured for: - receiving a value representing a number of activated (ON) word lines WL of the plurality of word lines WL; - determining a compensation current based on the received value and the fixed current value; and - compensating the compensation current.
To read the values stored in the first storage element 5 and the second storage element 7 of a bit storage unit 3 of the plurality of bit storage units 3, the transistors
M1 and M2 can be turned on. When the transistors M1 and M2 receive a voltage at their gates from the word line WL, the transistors M1 and M2 become conductive, and so the binary value stored in the first storage element 5 and the complementary value stored in the second storage element 7 can be transmitted to the bit line BL and to the complementary or negative bit line NBL, respectively.
It is noted that the implementation shown in this figure, which is a two-transistor, two-resistor or 2T2R structure, is just intended as an example, and that various other implementations may be designed by the skilled person within the scope of the present disclosure.
Furthermore, a complementary bitcell architecture, like the one illustrated here, may store both true and complementary values for binary storage, because if the (true) data is 1 (which may be mapped to high conductance states, e.g. HCS in RRAM devices), the complementary data would be 0 (which may be mapped to low conductance states, e.g. LCS in RRAM devices), and vice versa. This can advantageously be implemented in 2D memory or separate 2D tiers in a 3D memory, wherein just two bitcells would suffice for storing 1-bit data, in order to store true and complementary information.
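As a small illustration of this mapping (with assumed conductance values; the HCS/LCS labels follow the RRAM example above), a complementary pair of bitcells could be programmed as follows:

```python
# Illustrative mapping of a data bit onto a complementary pair of RRAM conductance states.
HCS = 100e-6  # assumed high conductance state, in siemens
LCS = 1e-6    # assumed low conductance state, in siemens

def program_pair(bit: int) -> tuple[float, float]:
    """Return (true cell conductance, complementary cell conductance) for one data bit."""
    return (HCS, LCS) if bit == 1 else (LCS, HCS)

print(program_pair(1))  # data 1 -> (HCS, LCS)
print(program_pair(0))  # data 0 -> (LCS, HCS)
```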
Figure 2 illustrates schematically another, further developed embodiment of a CIM device 1 according to the first aspect of the present disclosure. This embodiment may be based on the embodiment of Figure 1, or may be designed independently. It will be clear to the skilled person how the summed contributions from multiple bit storage units 3 together ensure a constant current on the common select line CSL (activated via transistor M3). In this figure, it is made more insightful how the respective first and second bit storage elements (in pairs: {51, 71}, {52, 72}, {53, 73} and {54, 74}) of each bit storage unit 3 can be summed. The result on the bit line BL and the complementary or negative bit line NBL may be output to an analog-to-digital converter or ADC 9.
Figure 3 illustrates schematically yet another embodiment of a CIM device according to the first aspect of the present disclosure. This embodiment may be based on the embodiment of Figure 1 or Figure 2, or may be designed independently. The figure shows how multiple bit storage units 3 of the CIM device 1 feature a bit line BL1, BL2,
BLn and a corresponding select line SL1, SL2, SLn, respectively. The figure also shows how each bit storage unit 3 stores both the intended binary bits A1, A2, An and their complements (represented with an overbar symbol), respectively.
An advantage of the embodiments illustrated in the above figures is that the voltage difference ΔV = V_SL - V_BL between the select line SL and the bit line BL (or equivalently between the select line SL and the complementary or negative bit line NBL) can be considered to be a fixed value. This is illustrated schematically in Figure 4, which shows the voltage difference ΔV between the select line SL and the bit line BL of an embodiment of a CIM device according to the first aspect of the present disclosure. This embodiment may be based on the embodiment of any of Figures 1-3, or may be designed independently. It can be seen from the figure that the sum of all voltages V_i multiplied by their respective conductances G_i is constant.
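Written out under the assumption of ideal access devices, the constant-sum property follows directly from the complementary storage: each addressed pair contributes one high-conductance and one low-conductance element, so for k activated rows sharing a select line:

```latex
I_{\mathrm{CSL}} \;=\; \sum_{i=1}^{k} G_i \,\Delta V
  \;=\; \sum_{i=1}^{k} \left(G_{\mathrm{ON}} + G_{\mathrm{OFF}}\right)\left(V_{\mathrm{SL}} - V_{\mathrm{BL}}\right)
  \;=\; k \left(G_{\mathrm{ON}} + G_{\mathrm{OFF}}\right)\Delta V .
```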
Figure 5 illustrates schematically an embodiment of a method 100 according to the second aspect of the present disclosure of operating a computation in memory, CIM, device according to the first aspect of the present disclosure. This embodiment may be based on the embodiment of any of the other figures, or may be designed independently.
The method 100 may comprise: - obtaining 101 binary data;
- storing 103 the binary data in respective first bit storage elements of the plurality of bit storage units; - storing 105 a binary complement of the binary data in respective second bit storage elements of the plurality of bit storage units.
The skilled person will understand how to expand the flowchart shown in Figure 5 in order to accommodate various other steps as described elsewhere in the present disclosure.
For example, not only the storage but also the processing of the data may be considered. Processing of data in the CIM device according to the present disclosure may involve performing in-situ operations on the data stored at the same physical location. In CIM, both logic and arithmetic operations can be performed. Various embodiments according to the present disclosure are particularly useful for arithmetic operations such as multiply-and-accumulate (MAC), n-bit addition, etc. In neuromorphic applications, with INPUTS (activation signals applied as word line WL voltages V) and synaptic weights W (data stored as conductances G in the CIM), a MAC can be performed by applying the INPUTS as word line WL voltages, which get multiplied by the stored weights W, i.e. V×G.
The MAC output in each column can be expressed in terms of the aggregated current of each bitcell onto the common bit line BL as the BL current I_BL (Σ V×G), or in terms of the voltage ΔV_BL developed on the BL by the aggregated BL current. In the exemplary embodiments illustrated in the present disclosure, ΔV_BL has been illustrated as the MAC output, but it will be appreciated that the other option would also work within the scope of the present disclosure.
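The aggregation described above can be sketched as a simple dot product: inputs as word-line voltages, weights as stored conductances, and the column MAC output as the summed bit-line current. The values below are illustrative only; this models the arithmetic, not the disclosed circuit.

```python
# Arithmetic sketch of one MAC column: aggregated bit-line current I_BL = sum(V x G).
inputs_v = [0.2, 0.0, 0.2, 0.2]           # word-line (activation) voltages, assumed values
weights_g = [50e-6, 80e-6, 10e-6, 30e-6]  # stored conductances of one column, assumed values

i_bl = sum(v * g for v, g in zip(inputs_v, weights_g))  # aggregated current on the bit line
print(f"column MAC output current: {i_bl:.3e} A")
```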
Figure 6 illustrates schematically another embodiment of a CIM device according to the first aspect of the present disclosure. In order to ensure that the voltage V_CSL at the select line remains constant, the total current I_CSL flowing into the NMOS footer device 608 must remain constant, regardless of the number of WL activations and of the data stored in the bit cells. The column current I_COL, which depends on the WL activations, is always less than or equal to the maximum possible total current. The compensation can be deployed using a programmable current source (per ADC) which can feed a complementary current (maximum total current minus column current) into the footer NMOS device. This block can use the information on the number of WL activations and can create a compensation current accordingly. Since this information is the same for all columns, a single biasing signal 604 can advantageously be used for all these compensating blocks 606, 607.
Now, a specific implementation using a numerical example will be described in more detail, to aid understanding. The notation #(WL)=k implies that k word lines (or WLs) are activated or are ON. For the purposes of this numerical example, assume that the total number of WLs is 256. Of course, this is only an example, and the principle is applicable to various other values as well. This implies that I_COL = k×(I_ON + I_OFF) and that the compensating current should be (256-k)×(I_ON + I_OFF) for the total current to always be 256×(I_ON + I_OFF). Assuming in this numerical example that #(WL)=50, the sum signal Σ 603, i.e. the sum of all the WL_EN signals, which is 50, can be produced using e.g. a bit adder 601 (built from 1-bit full adders). Also shown are word line drivers 602.
This information, so in this example Σ=50, can be used by a digital-to-analog converter, DAC, 605 to generate the bias voltage signal B 604, which can be supplied to the various compensation blocks 606 (only one is shown for clarity, but ideally there may for example be one compensation block per select line / column). The compensation block 606 can generate the compensation current I_COMP = 206×(I_ON + I_OFF) and can provide it into the footer NMOS device 608 (and of course likewise for the not-shown ones 607).
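The arithmetic of this numerical example can be checked with a few lines of Python; the absolute current values are again assumed, and only the ratios matter here.

```python
# Check of the worked example: 50 of 256 word lines activated.
I_ON, I_OFF = 20e-6, 0.2e-6       # assumed per-element currents
I_UNIT = I_ON + I_OFF
k, n_total = 50, 256

i_col = k * I_UNIT                # current contributed by the activated bit storage units
i_comp = (n_total - k) * I_UNIT   # compensation current fed into the footer NMOS device
assert abs(i_col + i_comp - n_total * I_UNIT) < 1e-12
print(i_comp / I_UNIT)            # 206.0 -> I_COMP = 206 x (I_ON + I_OFF)
```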
In general, and as indicated above, in some embodiments, the CIM device may comprise at least one of SRAM, RRAM, MRAM, FeFET, and PCM. An explanation of
FeFET technology can be found i.a. in Chen, An. "A review of emerging non-volatile memory (NVM) technologies and applications." Solid-State Electronics 125 (2016): 25-38.
As used in this application and in the claims, the singular forms “a,” “an,” and “the” include the plural forms unless the context clearly dictates otherwise. The systems, apparatus, and methods described herein should not be construed as limiting in any way. Instead, the present disclosure is directed toward all novel and non-obvious features and aspects of the various disclosed embodiments, alone and in various combinations and sub-combinations with one another. The disclosed systems, methods, and apparatus are not limited to any specific aspect or feature or combinations thereof, nor do the disclosed systems, methods, and apparatus require that any one or more specific advantages be present or problems be solved. Any theories of operation are to facilitate explanation, but the disclosed systems, methods, and apparatus are not limited to such theories of operation.
Although the operations of some of the disclosed methods are described in a particular, sequential order for convenient presentation, it should be understood that this manner of description encompasses rearrangement, unless a particular ordering is required by specific language set forth below. For example, operations described sequentially may in some cases be rearranged or performed concurrently. Moreover, for the sake of simplicity, the attached figures may not show the various ways in which the disclosed systems, methods, and apparatus can be used in conjunction with other systems, methods, and apparatus. Additionally, the description sometimes uses terms like “obtaining” and “outputting” to describe the disclosed methods. These terms are high-level abstractions of the actual operations that are performed. The actual operations that correspond to these terms will vary depending on the particular implementation and are readily discernible by the skilled person.
It will be appreciated that for simplicity and clarity of illustration, where appropriate, reference numerals may have been repeated among the different figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the examples described herein.
However, it will be understood by the skilled person that the examples described herein can be practiced without these specific details. In other instances, methods, procedures and components have not been described in detail so as not to obscure the related relevant feature being described. The drawings are not necessarily to scale and the proportions of certain parts may be exaggerated to better illustrate details and features. The description is not to be considered as limiting the scope of the examples described herein.

Claims (14)

CLAIMS (translated from the Dutch)

1. Computation in memory, CIM, device for storing and processing binary data; the CIM device comprising: - a plurality of word lines; - a plurality of bit lines; - a plurality of select lines; and - a plurality of bit storage units; wherein each bit storage unit of the plurality of bit storage units comprises: - a first bit storage element connecting a bit line of the plurality of bit lines with a select line of the plurality of select lines, addressable via a word line of the plurality of word lines, wherein the first bit storage element is configured to store a respective bit of the binary data; and - a second bit storage element connecting another bit line of the plurality of bit lines with said select line, addressable via said word line, wherein the second bit storage element is configured to store a binary complement of said respective bit of the binary data.

2. CIM device according to claim 1, comprising a compensation means configured for, if the word line of an individual bit storage unit of the plurality of bit storage units is not activated, compensating a lacking current contribution of said individual bit storage unit to the select line corresponding with said individual bit storage unit, based on the fixed current value.

3. CIM device according to claim 2, wherein the compensation means is configured for: - determining a maximum current of a select line of the plurality of select lines; - determining an actual current of the select line; - determining a difference between the maximum current and the actual current; and - compensating the difference.

4. CIM device according to claim 2 or 3, wherein the compensation means is configured for: - receiving a value representing a number of activated word lines of the plurality of word lines; - determining a compensation current based on the received value and the fixed current value; and - compensating the compensation current.

5. CIM device according to claim 4, wherein the compensation means comprises: - a bit adder configured to sum signals on the plurality of word lines in order to determine the number of activated word lines; - a digital-to-analog converter, DAC, configured for converting the determined number of activated word lines into a bias current; and - a plurality of compensation blocks corresponding with the plurality of select lines, wherein each compensation block is configured for receiving the bias current from the DAC and for providing the compensation current for its corresponding select line.

6. CIM device according to any of the preceding claims, wherein the plurality of bit storage units defines a bitcell array.

7. CIM device according to any of the preceding claims, wherein the CIM device comprises at least one of a Static Random Access Memory, SRAM, Resistive Random Access Memory, RRAM, Magnetoresistive Random Access Memory, MRAM, Ferroelectric Field-Effect Transistors, FeFET, and Phase Change Memory, PCM.

8. Method of operating a computation in memory, CIM, device according to any of the preceding claims; the method comprising: - obtaining binary data; - storing the binary data in respective first bit storage elements of the plurality of bit storage units; and - storing a binary complement of the binary data in respective second bit storage elements of the plurality of bit storage units.

9. Method according to claim 8, comprising, if the word line of an individual bit storage unit of the plurality of bit storage units is not activated, compensating a lacking current contribution of said individual bit storage unit to the select line corresponding with said individual bit storage unit, based on the fixed current value.

10. Method according to claim 9, comprising: - determining a maximum current of a select line of the plurality of select lines; - determining an actual current of the select line; - determining a difference between the maximum current and the actual current; and - compensating the difference.

11. Method according to claim 9 or 10, comprising: - receiving a value representing a number of activated word lines of the plurality of word lines; - determining a compensation current based on the received value and the fixed current value; and - compensating the compensation current.

12. Method according to claim 11, comprising operating a compensation means by: - using a bit adder to sum signals on the plurality of word lines in order to determine the number of activated word lines; - using a digital-to-analog converter, DAC, for converting the determined number of activated word lines into a bias current; and - using a plurality of compensation blocks corresponding with the plurality of select lines, wherein each compensation block receives the bias current from the DAC and provides the compensation current for its corresponding select line.

13. Method according to any of claims 8-12, wherein the plurality of bit storage units defines a bitcell array.

14. Method according to any of claims 8-13, wherein the CIM device comprises at least one of a Static Random Access Memory, SRAM, Resistive Random Access Memory, RRAM, Magnetoresistive Random Access Memory, MRAM, Ferroelectric Field-Effect Transistors, FeFET, and Phase Change Memory, PCM.
NL2036520A 2023-12-14 2023-12-14 CIM allowing self-compensation via complementary data storage NL2036520B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
NL2036520A NL2036520B1 (en) 2023-12-14 2023-12-14 CIM allowing self-compensation via complementary data storage


Publications (1)

Publication Number Publication Date
NL2036520B1 true NL2036520B1 (en) 2025-06-27

Family

ID=90123137

Family Applications (1)

Application Number Title Priority Date Filing Date
NL2036520A NL2036520B1 (en) 2023-12-14 2023-12-14 CIM allowing self-compensation via complementary data storage

Country Status (1)

Country Link
NL (1) NL2036520B1 (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100091549A1 (en) * 2008-10-09 2010-04-15 Seagate Technology Llc Non-Volatile Memory Cell with Complementary Resistive Memory Elements
US11335401B1 (en) * 2021-01-28 2022-05-17 National Tsing Hua University Memory unit with multiple word lines for nonvolatile computing-in-memory applications and current calibrating method thereof

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHEN, AN: "A review of emerging non-volatile memory (NVM) technologies and applications", SOLID-STATE ELECTRONICS, vol. 125, 2016, pages 25 - 38, XP009501360, DOI: 10.1016/j.sse.2016.07.006
WANG LINFANG ET AL: "Efficient and Robust Nonvolatile Computing-In-Memory Based on Voltage Division in 2T2R RRAM With Input-Dependent Sensing Control", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, IEEE, USA, vol. 68, no. 5, 19 March 2021 (2021-03-19), pages 1640 - 1644, XP011851879, ISSN: 1549-7747, [retrieved on 20210429], DOI: 10.1109/TCSII.2021.3067385 *
WANG LINFANG ET AL: "RRAM Computing-in-Memory Using Transient Charge Transferring for Low-Power and Small-Latency AI Edge Inference", 2022 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS), IEEE, 11 November 2022 (2022-11-11), pages 497 - 500, XP034327501, DOI: 10.1109/APCCAS55924.2022.10090254 *
YUAN XIE: "Unlocking wordline-level parallelism for fast inference on RRAM-based DNN accelerator", PROCEEDINGS OF THE 39TH INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN, 2 November 2020 (2020-11-02), New York, NY, USA, pages 1 - 9, XP093168853, ISBN: 978-1-4503-8026-3 *
