
CN116521126A - Integrated circuit device and method for multiplying in memory therein - Google Patents

Integrated circuit device and method for multiplying in memory therein

Info

Publication number
CN116521126A
Authority
CN
China
Prior art keywords
gate
word line
signal
read
circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310200519.3A
Other languages
Chinese (zh)
Inventor
藤原英弘
森阳纪
赵威丞
李嘉富
奈尔·艾特金·肯·阿卡雅
马合木提·斯楠吉尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taiwan Semiconductor Manufacturing Co TSMC Ltd
Original Assignee
Taiwan Semiconductor Manufacturing Co TSMC Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/855,089 (US20230315389A1)
Application filed by Taiwan Semiconductor Manufacturing Co TSMC Ltd
Publication of CN116521126A

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443Sum of products
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11CSTATIC STORES
    • G11C7/00Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Static Random-Access Memory (AREA)

Abstract

An embodiment of the present invention provides an integrated circuit device including a first memory cell, a second memory cell, a first logic element, a second logic element, and a third logic element. The first memory cell is configured to store a first bit at a first node and the second memory cell is configured to store a second bit at a second node. The first logic element includes a first node input connected to the first node, the second logic element includes a second node input connected to the second node, and the third logic element includes a first input connected to a first output of the first logic element and a second input connected to a second output of the second logic element. Embodiments of the present invention also provide a method of performing multiplication in an integrated circuit memory.

Description

Integrated circuit device and method for multiplying in memory therein
Technical Field
Embodiments of the present invention relate generally to the field of electronic circuits and, more particularly, to integrated circuit devices and methods of performing multiplication in memories therein.
Background
The present disclosure relates generally to compute-in-memory (CIM) systems, and also to memory cells and memory arrays used in data processing, such as in multiply-accumulate (MAC) operations. A CIM system stores information in the memory of a computer, such as in the random access memory (RAM) of the computer, and performs calculations at the memory cell level rather than moving large amounts of data between the memory and the processor for each calculation step. Because the data is accessed and processed in the same memory, operation is faster, enabling faster reporting and decision making in business and machine learning (ML) applications. Efforts are underway to improve the performance of CIM systems.
Disclosure of Invention
One aspect of the present invention provides an integrated circuit device comprising: a first storage unit configured to store a first bit at a first node; a second storage unit configured to store a second bit at a second node; a first logic element including a first node input connected to the first node; a second logic element including a second node input connected to the second node; and a third logic element including a first input connected to the first output of the first logic element and a second input connected to the second output of the second logic element.
Another aspect of the invention provides an integrated circuit device comprising: a selection circuit configured to receive a read select signal and an input signal and provide a read word line output signal based on the read select signal and the input signal; a memory circuit, comprising: a first storage unit configured to store a first bit at a first node; and a second storage unit configured to store a second bit at a second node; and a multiplication circuit configured to receive the read word line output signal, the first bit, and the second bit and provide a multiplication result.
Yet another aspect of the invention provides a method of performing a multiplication in an integrated circuit memory, comprising: storing a first bit at a first node of a first memory cell; storing a second bit at a second node of a second memory cell; receiving a read select signal and an input signal at a select circuit; outputting, by the selection circuit, a read word line output signal based on the read selection signal and the input signal; receiving the read word line output signal, the first bit, and the second bit at a multiplication circuit; and outputting a multiplication result through the multiplication circuit.
Drawings
The various aspects of the invention are best understood from the following detailed description when read in connection with the accompanying drawings. It should be noted that, in accordance with standard practice in the industry, the various features are not drawn to scale. In fact, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion. Further, the drawings are illustrative of examples of embodiments of the present disclosure and are not to be construed as limiting.
Fig. 1 is a schematic diagram illustrating a CIM device according to some embodiments.
Fig. 2 is a schematic diagram illustrating row selection and multiplication circuitry configured to multiply input data XIN with data from a memory cell, according to some embodiments.
Fig. 3 is a schematic diagram illustrating an SRAM cell, according to some embodiments.
Fig. 4 is a schematic diagram illustrating a row select and multiply circuit including a word line driver, two 6T SRAM cells, and a multiply circuit, according to some embodiments.
Fig. 5 is a schematic diagram illustrating a truth table for the row select and multiply circuit of Fig. 4, according to some embodiments.
Fig. 6 is a schematic diagram illustrating a MOSFET multiplication circuit that provides the functionality of the multiplication circuit shown in Fig. 4, according to some embodiments.
Fig. 7 is a schematic diagram illustrating a transistor layout of the memory cell shown in Fig. 4 and the multiplication circuit of Fig. 6, according to some embodiments.
Fig. 8 is a schematic diagram illustrating a row selection and multiplication circuit that changes signal polarity by multiplying an input signal XIN and a data signal D to provide an inverted output signal OUTB, according to some embodiments.
Fig. 9 is a schematic diagram illustrating a MOSFET multiplication circuit that provides the functionality of the multiplication circuit shown in Fig. 8, according to some embodiments.
Fig. 10 is a schematic diagram illustrating transistor layouts of the memory cell shown in Fig. 8 and the multiplication circuit of Fig. 9, according to some embodiments.
Fig. 11 is a schematic diagram illustrating a three-row multiplication circuit, according to some embodiments.
Fig. 12 is a schematic diagram illustrating a MOSFET multiplication circuit that provides the functionality of the multiplication circuit of Fig. 11, according to some embodiments.
Fig. 13 is a schematic diagram illustrating a four-row multiplication circuit, according to some embodiments.
Fig. 14 is a schematic diagram showing a table of the number of read word lines (RWL) and the number of transistors (Tr) in a conventional read port (Conv) and the new multiplication circuit (New) of the present disclosure, according to some embodiments.
Fig. 15 is a schematic diagram illustrating a latch SRAM cell, according to some embodiments.
Fig. 16 is a schematic diagram illustrating a row select and multiply circuit including a word line driver (not shown), two 8T SRAM cells, and a multiply circuit, according to some embodiments.
Fig. 17 is a schematic diagram illustrating a transistor layout of the memory cell and the multiplication circuit shown in Fig. 16, according to some embodiments.
Fig. 18 is a schematic diagram illustrating a row select and multiply circuit including a word line driver (not shown), two 1T1C memory cells, and a multiply circuit, according to some embodiments.
Fig. 19 is a schematic diagram illustrating a multiplication method in an integrated circuit memory, according to some embodiments.
Detailed Description
The invention provides many different embodiments, or examples, for implementing different features of the disclosure. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to limit the invention. For example, in the following description, forming a first component over or on a second component may include embodiments in which the first component and the second component are formed in direct contact, and may also include embodiments in which additional components are formed between the first component and the second component, such that the first component and the second component are not in direct contact. Furthermore, the present invention may repeat reference numerals and/or characters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
Moreover, spatially relative terms such as "beneath," "below," "lower," "above," "upper," and the like may be used herein for ease of description to describe the relationship of one element or component to another element(s) or component(s) as illustrated. Spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein may likewise be interpreted accordingly.
Artificial Intelligence (AI) uses deep learning techniques in which a computer system may be organized as a neural network having a plurality of interconnected processing nodes that are capable of analyzing data. The neural network includes multiple layers of computing nodes, wherein the deeper layers perform computations based on the results of the computations performed by the higher layers. Furthermore, in some neural networks, weights are calculated and used to perform calculations on the input data.
Artificial intelligence systems include ML systems in which computer algorithms are automatically improved by experience and data. The ML algorithm builds a model based on sample data (called training data) to make predictions or decisions without explicit programming to do so. In these systems, the input data is compared to training data, i.e., computational analysis is performed on the attributes of known data (e.g., training data). Example systems may be found in the field of object recognition, where the system analyzes attributes of many known images, such as a thousand or more images, to determine patterns that can be used to perform statistical analysis to recognize an input image/object. In some embodiments, the AI system is referred to as a Convolutional Neural Network (CNN).
ML is computationally intensive, with the ML neural network computing weights that are used to perform computations on input data. ML involves computing dot products and absolute differences of vectors, which can be calculated using MAC operations performed on data such as input data and weights. The computation of large and deep neural networks involves many data elements, so it is impractical to store the data in processor caches, which would be too costly at the required memory size. In addition, transferring data between other memory resources (e.g., RAM) and the processor is time consuming and is a bottleneck for ML systems. Furthermore, as the size of the data set increases, the time and energy/power consumed to move the data ultimately become several times the time and energy/power used by the processor to perform the calculations.
Therefore, CIM circuitry has been developed to perform neural network calculations. The CIM circuitry performs operations locally in memory without sending data to the host processor. This reduces the amount of data transferred between the memory and the host processor, thereby achieving higher throughput and performance. In addition, the reduction in transmitted data reduces the energy/power consumed by the system.
In some CIM systems, the memory array includes memory cells that store weight data, and the input driver provides input data. The memory cells may be arranged in rows and columns, and the weight data may be stored in any suitable type of memory cell, such as data latches, flip-flops, and/or other memory cells, such as flash memory, magnetic random access memory (MRAM), resistive random access memory (RRAM), static random access memory (SRAM), and dynamic random access memory (DRAM), such as a one-transistor one-capacitor (1T1C) memory cell.
In some CIM neural network applications, the MAC operation calculates the product of two numbers and adds that product to an accumulated sum. The memory cell storing the weight data is connected to a logic circuit (such as a multiplication circuit) that provides output data based on the weight data and the input data. The outputs of the logic circuits are accumulated or added using adder circuits to obtain output values. In these systems, if the number of rows of memory cells involved in a CIM memory read operation is less than or equal to four, the space on the chip occupied by the memory cells and conventional static read ports is greater than necessary.
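As a purely illustrative aid, the MAC operation described above can be sketched in software. The function name `mac` and the sample values are hypothetical and do not appear in the disclosure; the sketch only shows the arithmetic of multiplying and accumulating, not the circuit that performs it.

```python
def mac(inputs, weights):
    """Multiply-accumulate: add each input*weight product to a running sum.

    Hypothetical helper for illustration; not part of the disclosure.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    acc = 0  # accumulated sum
    for x, w in zip(inputs, weights):
        acc += x * w  # one multiply-accumulate step
    return acc


# Single-bit inputs multiplied by stored weights: 3 + 0 + 2 + 7
print(mac([1, 0, 1, 1], [3, 5, 2, 7]))  # prints 12
```

With single-bit inputs, each product reduces to either the weight or zero, which is why a bit-wise multiplication circuit placed next to the memory cells may suffice for the multiply step.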
The disclosed embodiments include a CIM device including a memory cell electrically connected to a multiplication logic circuit that provides a bit-wise multiplication calculation, wherein data from the memory cell is multiplied with input data. In some embodiments, multiplication logic provides bit-wise multiplication for two rows of memory cells. In some embodiments, multiplication logic provides bit-wise multiplication for three rows of memory cells. In some embodiments, multiplication logic provides bit-wise multiplication for four rows of memory cells. In some embodiments, the data from the storage unit is a weight used in a neural network such as a CNN. In other embodiments, the multiplication logic circuit may be configured to provide a bitwise multiplication for more than four rows of memory cells, such that a bitwise multiplication for more than four rows of memory cells is within the scope of the present invention.
The disclosed embodiments include six-transistor and eight-transistor SRAM cells connected to select logic and multiplication logic circuits. In some embodiments, the selection logic includes a NAND gate. In some embodiments, the selection logic includes an AND gate. In some embodiments, the multiplication logic circuit includes an OR gate and a NAND gate. In some embodiments, the multiplication logic circuit includes an AND gate and a NOR gate. In other embodiments, the memory includes different memory cells, such as other data latches, flip-flops, and/or memory cells including flash memory, MRAM, RRAM, SRAM, and DRAM cells. In some embodiments, the memory includes 1T1C memory cells.
In the disclosed embodiments, the number of transistors and read word lines used in the multiplication logic circuit is reduced as compared to previous read port configurations. In some embodiments, the transistors and read word lines used in the multiplication logic circuit are reduced to eight transistors and two read word lines as compared to twelve transistors and five read word lines in other read port configurations.
Advantages of the disclosed embodiments include a CIM memory cell and logic circuit arrangement that reduces the amount of space occupied on the chip, provides in-memory multiplication that improves performance such as speed, and reduces energy/power requirements. Thus, power, performance, and area (PPA) are improved.
Fig. 1 is a schematic diagram of a CIM device 20 shown according to some embodiments. CIM device 20 includes a CIM memory array 22 that includes a memory cell block 24 and a multiplication circuit 26. Each of the memory cell blocks 24 includes memory cells 28, the memory cells 28 being configured to store data bits and arranged in two memory cell rows 30 and 32. Row 30 and row 32 are electrically connected to a corresponding one of multiplication circuits 26 to provide stored data bits to multiplication circuits 26. In some embodiments, the storage unit 28 is configured to store weight data, such as weights for CNNs. In other embodiments, the memory cells 28 are arranged in more than two rows 30 and rows 32, such as three or four rows, and the memory cells 28 in these rows are electrically connected to a corresponding one of the multiplication circuits 26. Additionally, in other embodiments, the memory cells 28 may be arranged in more than four rows of memory cells, and the memory cells 28 from more than four rows are electrically connected to a corresponding one or more of the multiplication circuits 26.
In some embodiments, the memory cell 28 comprises an SRAM cell. In an SRAM cell, data is written to and read from the SRAM cell by one or more bit lines, such as a Bit Line (BL) and a complementary bit line called an inverted Bit Line (BLB). After one or more access transistors in the SRAM cell are activated by a Word Line (WL) signal, data is written to and read from the SRAM cell. In other embodiments, memory cells 28 include different memory cells, such as data latches, flip-flops, and/or other memory cells including flash memory, MRAM, RRAM, SRAM, and DRAM cells. In some embodiments, memory cells 28 comprise 1T1C memory cells.
CIM device 20 also includes an input driver 34 and a word line (WL) driver 36. Input driver 34 is configured to receive input signal XIN and drive it to word line driver 36. Word line driver 36 is configured to receive input signal XIN from input driver 34 and read select signal RSEL, and to provide a read word line signal to multiplication circuit 26 to activate row 30 and row 32 of memory array 22.
Memory controller 38 receives a control signal CNTRL for controlling the operation of CIM device 20. For example, the memory controller 38 provides control signals to read/write circuits 40 electrically connected to the bit lines of the memory array 22 to select the bit lines, i.e., columns, of the memory array 22. Read/write circuit 40 receives and provides input/output (I/O) data. In some embodiments, the stored data bits include 4-bit weights such that four columns of memory cells 28 and multiplication circuits 26 are used to store a 4-bit weight value. Furthermore, in some embodiments, a weight value having w bits uses w columns of memory cells 28 and corresponding multiplication circuits 26.
The output signals OUT from the multiplication circuits 26 are supplied to an adder circuit 42, and the adder circuit 42 adds the output signals OUT of the respective multiplication circuits 26. The accumulator circuit 44 is electrically connected to the adder circuit 42 and is configured to provide a MAC output MACOUT.
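The column arrangement and the adder/accumulator path described above can be sketched as follows. This is a minimal sketch under stated assumptions: the disclosure says only that adder circuit 42 adds the outputs OUT of the multiplication circuits, so weighting column i by 2**i, the unsigned weight format, and all function names here are hypothetical.

```python
def weight_to_columns(weight, w=4):
    """Split an unsigned w-bit weight into bits, LSB first (one bit per column)."""
    if not 0 <= weight < (1 << w):
        raise ValueError("weight does not fit in w bits")
    return [(weight >> i) & 1 for i in range(w)]


def column_outputs(xin, weight_bits):
    """Each column's multiplication circuit yields OUT = XIN AND weight bit."""
    return [xin & b for b in weight_bits]


def add_columns(outs):
    """Adder stage. ASSUMPTION: column i carries binary significance 2**i."""
    return sum(o << i for i, o in enumerate(outs))


def mac_out(xin_bits, weights, w=4):
    """Accumulator stage: sum the adder result over successive inputs."""
    acc = 0
    for xin, weight in zip(xin_bits, weights):
        acc += add_columns(column_outputs(xin, weight_to_columns(weight, w)))
    return acc


print(weight_to_columns(5))            # prints [1, 0, 1, 0]
print(mac_out([1, 0, 1], [5, 9, 3]))   # prints 8 (5 + 0 + 3)
```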
Fig. 2 is a schematic diagram illustrating a row selection and multiplication circuit 50 configured to multiply input data XIN with data from memory cells 52 and 54, according to some embodiments. The row select and multiply circuit 50 includes memory cells 52 and 54, a word line driver 56, and a multiply circuit 58. In some embodiments, word line driver 56 is similar to word line driver 36 (shown in FIG. 1). In some embodiments, multiplication circuit 58 is similar to multiplication circuit 26 (shown in FIG. 1). In some embodiments, memory cells 52 and 54 are similar to memory cell 28 (shown in FIG. 1), and each of memory cells 52 and 54 is from a different one of rows 30 and 32 of memory cell block 24. In other embodiments, row select and multiply circuit 50 is configured to multiply input data XIN with data from more than two rows of memory cells, such as from three or four rows of memory cells.
Word line driver 56 includes a NAND gate 60 and a NAND gate 62 electrically connected to multiplication circuit 58. Each of NAND gate 60 and NAND gate 62 is configured to receive the input signal XIN and a respective one of two read select signals RSEL. In addition, each of NAND gate 60 and NAND gate 62 is configured to provide a respective one of two read word line signals RWLB to multiplication circuit 58 to activate a selected row of memory cells 52 and 54. NAND gate 60 receives input signal XIN and its respective read select signal RSEL and provides its respective read word line signal RWLB to multiplication circuit 58. NAND gate 62 likewise receives input signal XIN and its respective read select signal RSEL and provides its respective read word line signal RWLB to multiplication circuit 58.
The memory cells 52 and 54 are electrically connected to the multiplication circuit 58, each providing its stored data bit in the form of a respective data signal DB to the multiplication circuit 58. In some embodiments, memory cells 52 and 54 are SRAM cells. In other embodiments, memory cells 52 and 54 are data latches, flip-flops, and/or other memory cells, such as flash memory, MRAM, RRAM, SRAM, and DRAM cells. In some embodiments, memory cells 52 and 54 are 1T1C memory cells. In some embodiments, memory cells 52 and 54 are configured to store weight data, such as weights for CNNs.
Multiplication circuit 58 includes logic gates for multiplying input data signal XIN received from word line driver 56 with the data bits from memory cells 52 and 54. In this example, multiplication circuit 58 includes a first OR gate 64, a second OR gate 66, and a NAND gate 68. In other embodiments, multiplication circuit 58 includes different logic gates.
The first OR gate 64 is configured to receive the read word line signal RWLB from the word line driver 56 and the data signal DB from the memory cell 52. The second OR gate 66 is configured to receive the read word line signal RWLB from the word line driver 56 and the data signal DB from the memory cell 54. The NAND gate 68 receives the output from each of the first OR gate 64 and the second OR gate 66 and provides the multiplication result at the output OUT.
In operation, to select one of the memory cells 52 and 54, one of NAND gate 60 and NAND gate 62 in word line driver 56 receives a logic high (1) read select signal RSEL, and the other of NAND gate 60 and NAND gate 62 receives a logic low (0) read select signal RSEL. The NAND gate 60 or 62 receiving the logic low (0) read select signal RSEL is not selected and provides a logic high (1) to one of OR gates 64 and 66, which in turn passes the logic high (1) to one input of output NAND gate 68. The NAND gate 60 or 62 receiving the logic high (1) read select signal RSEL is selected; it inverts input signal XIN and passes the inverted input signal XINB to the other of OR gates 64 and 66.
The OR gate 64 or 66 receiving the inverted input signal XINB also receives the data signal DB from the connected memory cell 52 or 54 and provides an output signal to the output NAND gate 68. This multiplies the inverted input signal XINB with the data received from the connected memory cell 52 or 54. Because the unselected OR gate holds the other input of NAND gate 68 at logic high (1), NAND gate 68 inverts the output of the selected OR gate, so that the output OUT is logic high (1) only when input signal XIN is logic high (1) and the data signal DB is logic low (0). NAND gate 68 provides the multiplication result at output OUT.
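The gate-level behavior of Fig. 2 described above can be checked with a small software model. This is a minimal sketch, not the disclosed circuit: the function and parameter names are hypothetical, and the pairing of NAND gate 60 with OR gate 64 (memory cell 52) and of NAND gate 62 with OR gate 66 (memory cell 54) is an assumption made for illustration.

```python
def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 1 - (a & b)


def row_select_multiply(xin, rsel_60, rsel_62, d_52, d_54):
    """Behavioral model of word line driver 56 and multiplication circuit 58.

    d_52 and d_54 are the stored (non-inverted) data bits; the cells are
    modeled as supplying the inverted data signal DB.
    ASSUMPTION: NAND 60 feeds OR 64 (cell 52) and NAND 62 feeds OR 66 (cell 54).
    """
    rwlb_60 = nand(xin, rsel_60)   # read word line signal from NAND gate 60
    rwlb_62 = nand(xin, rsel_62)   # read word line signal from NAND gate 62
    db_52 = 1 - d_52               # inverted data from memory cell 52
    db_54 = 1 - d_54               # inverted data from memory cell 54
    or_64 = rwlb_60 | db_52        # first OR gate 64
    or_66 = rwlb_62 | db_54        # second OR gate 66
    return nand(or_64, or_66)      # output NAND gate 68 drives OUT


# Select the NAND 60 path: OUT follows XIN AND (bit stored in cell 52).
for xin in (0, 1):
    for d in (0, 1):
        print(xin, d, row_select_multiply(xin, 1, 0, d, 0))
```

In this model the unselected path always contributes a logic high to the output NAND gate, so the unselected cell's stored bit has no effect on OUT.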
Advantages of having a word line driver 56 and in-memory multiplication circuitry 58 include reduced space on-chip, improved speed performance, and reduced energy/power requirements.
Fig. 3 is a schematic diagram of an SRAM cell 100 shown according to some embodiments. SRAM cell 100 is a six transistor (6T) SRAM cell. In some embodiments, SRAM cell 100 is similar to one or more memory cells 28 (shown in fig. 1). In some embodiments, SRAM cell 100 is similar to one or more of memory cells 52 and 54 (shown in fig. 2). In some embodiments, SRAM cell 100 is used in CIM device 20 of fig. 1. In some embodiments, SRAM cell 100 is used in row select and multiply circuit 50 of fig. 2. In other embodiments, the SRAM cell 100 may include more or less than six transistors, such as four, eight, or ten transistors.
SRAM cell 100 includes two cross-coupled inverters 102 and 104. The first inverter 102 includes a first PMOS/NMOS transistor pair 106 and 108 and the second inverter 104 includes a second PMOS/NMOS transistor pair 110 and 112. SRAM cell 100 also includes a left pass gate (PGL) transistor 114 and a right pass gate (PGR) transistor 116.
Each of the inverters 102 and 104 is powered with a first terminal of each of a left pull-up (PUL) transistor 106 and a right pull-up (PUR) transistor 110 electrically connected to a power supply VDD and a first terminal of each of a left pull-down (PDL) transistor 108 and a right pull-down (PDR) transistor 112 electrically connected to a reference voltage VSS, such as ground. The data bit is stored in the SRAM cell 100 as a voltage at node Q and can be read via the bit line BL through the right pass gate transistor 116, with access to node Q being controlled by the right pass gate transistor 116. The inverting node (QB) of node Q stores the complement of the value at node Q, so if Q is high, QB is low and vice versa. Node QB may be read through left pass gate transistor 114 via bit line BLB, wherein access to node QB is controlled by left pass gate transistor 114.
The gate of the left pass gate transistor 114 is connected to the word line WL. The first source/drain (S/D) terminal of left pass gate transistor 114 is connected to the inverse bit line BLB, the second S/D terminal of left pass gate transistor 114 is connected to the second terminals of left pull-up transistor 106 and left pull-down transistor 108 at node QB, and to the gates of right pull-up transistor 110 and right pull-down transistor 112 to provide the inverse data output signal DB.
In addition, the gate of the right pass gate transistor 116 is connected to the word line WL. The first S/D terminal of right pass gate transistor 116 is connected to bit line BL, the second S/D terminal of right pass gate transistor 116 is connected to the second terminals of right pull-up transistor 110 and right pull-down transistor 112 at node Q, and to the gates of left pull-up transistor 106 and left pull-down transistor 108.
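The storage and access behavior of the 6T cell described above can be approximated by a purely behavioral model. The class name `Sram6T` and its methods are hypothetical, and the sketch ignores all analog effects (precharge, sensing, transistor sizing, read disturb); it only captures that Q and QB are complementary and that access requires an active word line.

```python
class Sram6T:
    """Behavioral stand-in for a 6T SRAM cell (hypothetical, illustrative only)."""

    def __init__(self, q=0):
        self.q = q  # value held at node Q by the cross-coupled inverters

    @property
    def qb(self):
        """Node QB always holds the complement of node Q."""
        return 1 - self.q

    def write(self, wl, bl, blb):
        """Drive BL/BLB with complementary values; latch only when WL is high."""
        if wl and bl != blb:
            self.q = bl

    def read(self, wl):
        """Return (BL, BLB) values when WL is high, else None (cell isolated)."""
        return (self.q, self.qb) if wl else None


cell = Sram6T()
cell.write(wl=1, bl=1, blb=0)   # store a 1
cell.write(wl=0, bl=0, blb=1)   # ignored: word line is low
print(cell.read(wl=1))          # prints (1, 0)
```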
Fig. 4 is a schematic diagram of a row select and multiply circuit 200 including a word line driver 202, two 6T SRAM memory cells 204 and 206, and a multiply circuit 208, shown according to some embodiments. The row select and multiply circuit 200 is configured to multiply the inverted input data XINB with the inverted data DB from the memory cells 204 and 206. In other embodiments, row select and multiply circuit 200 is configured to multiply input data with data from more than two rows of memory cells, such as from three or four rows of memory cells.
Word line driver 202 includes NAND gate 210 and NAND gate 212 electrically connected to multiplication circuit 208. NAND gate 210 receives input signal XIN and a first read select signal RSEL and provides a first read word line signal RWLB to multiplication circuit 208 through first read word line 214, and NAND gate 212 receives input signal XIN and a second read select signal RSEL and provides a second read word line signal RWLB to multiplication circuit 208 through second read word line 216. In some embodiments, word line driver 202 is similar to word line driver 36 (shown in FIG. 1). In some embodiments, word line driver 202 is similar to word line driver 56 (shown in FIG. 2).
The memory cells 204 and 206 are electrically connected to the multiplication circuit 208, each providing its stored data bit in the form of a respective data signal DB to the multiplication circuit 208. Memory cell 204 is configured to provide its data signal DB to multiplication circuit 208 via data line 218, and memory cell 206 provides its data signal DB to multiplication circuit 208 via data line 220. The memory cells 204 and 206 are 6T SRAM cells similar to the 6T SRAM cell 100 of fig. 3, and thus a description of the 6T SRAM cell will not be repeated here. Further, in some embodiments, each of memory cells 204 and 206 is similar to one of memory cells 28 (as shown in FIG. 1), where memory cells 204 and 206 are from different ones of rows 30 and 32, respectively, of memory cell block 24. In some embodiments, memory cells 204 and 206 are configured to store weight data, such as weights for CNNs.
The multiplication circuit 208 includes a first OR gate 222, a second OR gate 224, and a NAND gate 226. The first OR gate 222 is configured to receive the read word line signal RWLB from the word line driver 202 and the data signal DB from the memory cell 204. The second OR gate 224 is configured to receive the read word line signal RWLB from the word line driver 202 and the data signal DB from the memory cell 206. NAND gate 226 receives the outputs from first OR gate 222 and second OR gate 224, respectively, and provides the multiplication result at output OUT. In some embodiments, multiplication circuit 208 is similar to multiplication circuit 26 (shown in FIG. 1). In some embodiments, multiplication circuit 208 is similar to multiplication circuit 58 (shown in FIG. 2).
In operation, to select one of the memory cells 204 and 206, one of the NAND gate 210 and the NAND gate 212 in the word line driver 202 receives a logic high (1) read select signal RESL or RSEL, and the other of the NAND gate 210 and the NAND gate 212 receives a logic low (0) read select signal RESL or RSEL. NAND gate 210 OR NAND gate 212 receiving a logic low (0) read select signal RESL OR RSEL is not selected and provides a logic high (1) to one of OR gate 222 and OR gate 224, which in turn, transfers a logic high (1) to one input of output NAND gate 226. NAND gate 210 OR NAND gate 212 receiving a logic high (1) read select signal RESL OR RSEL is selected to invert the input signal XIN and to transfer the inverted input signal XINB to the other of OR gates 222 and 224. This OR gate 222 OR the OR gate 224 receives the inverted input signal XINB and one of the data signals DB and provides an output signal to the other input of the NAND gate 226. This multiplies the inverted input signal XINB with data received from one of the data signals DB and DB. NAND gate 226 provides the multiplication result at output OUT.
FIG. 5 is a schematic diagram of a truth table 230 for the row select and multiply circuit 200 of FIG. 4, according to some embodiments. Truth table 230 includes the two read select signals RSEL at columns 232 and 234, the input signal XIN at column 236, the two read word line signals RWLB at columns 238 and 240, the two data signals DB at columns 242 and 244, and the output OUT at column 246.
In row 248, signal RSEL[1] is at logic high (1) and RSEL[0] is at logic low (0), which selects NAND gate 212. XIN is at logic high (1), so RWLB[1] is at logic low (0). Further, since RSEL[0] is at logic low (0), RWLB[0] is at logic high (1) and the output of OR gate 222 is at logic high (1). If DB[1] is at logic low (0), the output of OR gate 224 is logic low (0) and NAND gate 226 provides logic high (1) at output OUT. If DB[1] is at logic high (1), the output of OR gate 224 is logic high (1) and NAND gate 226 provides logic low (0) at output OUT.
In row 250, signal RSEL[1] is at logic high (1) and RSEL[0] is at logic low (0), which selects NAND gate 212. XIN is at logic low (0), so RWLB[1] is at logic high (1) and the output of OR gate 224 is at logic high (1). Further, since RSEL[0] is at logic low (0), RWLB[0] is at logic high (1) and the output of OR gate 222 is at logic high (1). Thus, NAND gate 226 provides a logic low (0) at output OUT.
In row 252, signal RSEL[0] is at logic high (1) and RSEL[1] is at logic low (0), which selects NAND gate 210. XIN is at logic high (1), so RWLB[0] is at logic low (0). Further, since RSEL[1] is at logic low (0), RWLB[1] is at logic high (1) and the output of OR gate 224 is at logic high (1). If DB[0] is at logic low (0), the output of OR gate 222 is logic low (0) and NAND gate 226 provides a logic high (1) at output OUT. If DB[0] is at logic high (1), the output of OR gate 222 is logic high (1) and NAND gate 226 provides a logic low (0) at output OUT.
In row 254, signal RSEL[0] is at logic high (1) and RSEL[1] is at logic low (0), which selects NAND gate 210. XIN is at logic low (0), so RWLB[0] is at logic high (1) and the output of OR gate 222 is at logic high (1). Further, since RSEL[1] is at logic low (0), RWLB[1] is at logic high (1) and the output of OR gate 224 is at logic high (1). Thus, NAND gate 226 provides a logic low (0) at output OUT.
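The four rows described above can be regenerated by enumerating the inputs. The following Python sketch is illustrative only: the column order, the helper name `out_for`, and the 0/1 index labels (index 1 for NAND gate 212 and OR gate 224, index 0 for NAND gate 210 and OR gate 222) are assumptions and are not taken from FIG. 5.

```python
from itertools import product


def out_for(rsel1, rsel0, xin, db1, db0):
    """Return (RWLB[1], RWLB[0], OUT) for one input combination."""
    rwlb1 = int(not (xin and rsel1))   # NAND gate 212
    rwlb0 = int(not (xin and rsel0))   # NAND gate 210
    # OR gate 224, OR gate 222, then NAND gate 226.
    out = int(not ((rwlb1 or db1) and (rwlb0 or db0)))
    return rwlb1, rwlb0, out


rows = []
for rsel1, rsel0 in ((1, 0), (0, 1)):      # exactly one row selected
    for xin in (1, 0):
        for db1, db0 in product((0, 1), repeat=2):
            rwlb1, rwlb0, out = out_for(rsel1, rsel0, xin, db1, db0)
            rows.append((rsel1, rsel0, xin, rwlb1, rwlb0, db1, db0, out))

print("RSEL1 RSEL0 XIN RWLB1 RWLB0 DB1 DB0 OUT")
for row in rows:
    print("     ".join(str(v) for v in row))
```

The enumeration expands each "don't care" data value explicitly, so it lists sixteen combinations rather than the four summarized rows.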
Fig. 6 is a schematic diagram of a MOSFET multiplication circuit 260 that is shown providing the functionality of multiplication circuit 208 (shown in fig. 4), according to some embodiments. The multiplication circuit 260 includes eight transistors, namely, four PMOS transistors 262, 264, 266, and 268, and four NMOS transistors 270, 272, 274, and 276.
The first S/D of PMOS transistor 262 is electrically connected to power supply VDD, and the second S/D of PMOS transistor 262 is electrically connected to the first S/D of PMOS transistor 264. Further, a first S/D of PMOS transistor 266 is electrically connected to power supply VDD and a second S/D of PMOS transistor 266 is electrically connected to a first S/D of PMOS transistor 268. The second S/D of PMOS transistor 264 is electrically connected to the second S/D of PMOS transistor 268 and the first S/D of each of NMOS transistors 270 and 274. The second S/D of NMOS transistor 270 is electrically connected to the second S/D of NMOS transistor 274 and the first S/D of each of NMOS transistors 272 and 276. The second S/D of each of the NMOS transistors 272 and 276 is electrically connected to a reference VSS, such as ground.
The gates of PMOS transistor 262 and NMOS transistor 270 are electrically connected together to receive data signal DB[0], and the gates of PMOS transistor 268 and NMOS transistor 276 are electrically connected together to receive data signal DB[1]. In addition, the gates of PMOS transistor 264 and NMOS transistor 274 are electrically connected together to receive read word line signal RWLB[0], and the gates of PMOS transistor 266 and NMOS transistor 272 are electrically connected together to receive read word line signal RWLB[1].
In operation, if read word line signal RWLB[1] is at logic low (0), PMOS transistor 266 is biased on and NMOS transistor 272 is biased off. Further, if read word line signal RWLB[0] is at logic high (1), PMOS transistor 264 is biased off and NMOS transistor 274 is biased on. Thus, if the data signal DB[1] is at logic low (0), PMOS transistor 268 is biased on and NMOS transistor 276 is biased off so that the output OUT is at logic high (1), and if the data signal DB[1] is at logic high (1), PMOS transistor 268 is biased off and NMOS transistor 276 is biased on so that the output OUT is at logic low (0).
Conversely, if read word line signal RWLB[0] is at logic low (0), PMOS transistor 264 is biased on and NMOS transistor 274 is biased off, and if read word line signal RWLB[1] is at logic high (1), PMOS transistor 266 is biased off and NMOS transistor 272 is biased on. Thus, if the data signal DB[0] is at logic low (0), PMOS transistor 262 is biased on and NMOS transistor 270 is biased off so that the output OUT is at logic high (1), and if the data signal DB[0] is at logic high (1), PMOS transistor 262 is biased off and NMOS transistor 270 is biased on so that the output OUT is at logic low (0).
If both read word line signals RWLB[0] and RWLB[1] are at logic high (1), PMOS transistors 264 and 266 are biased off and NMOS transistors 272 and 274 are biased on, so that the output OUT is at logic low (0).
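The transistor-level behavior above can be cross-checked against the OR/NAND description with a simple switch-level model. This Python sketch is illustrative only. It assumes ideal switches, and it assumes the pairing of signals implied by the connectivity (one DB signal and one RWLB signal on series PMOS transistors 262 and 264, the other pair on 266 and 268); the labels 0 and 1 for the two pairs and the function names are hypothetical.

```python
from itertools import product


def pmos_on(gate):
    return gate == 0   # a PMOS transistor conducts when its gate is low


def nmos_on(gate):
    return gate == 1   # an NMOS transistor conducts when its gate is high


def mosfet_circuit_260(rwlb0, rwlb1, db0, db1):
    # Pull-up: series stack 262-264 in parallel with series stack 266-268.
    pull_up = (pmos_on(db0) and pmos_on(rwlb0)) or \
              (pmos_on(rwlb1) and pmos_on(db1))
    # Pull-down: (270 parallel 274) in series with (272 parallel 276).
    pull_down = (nmos_on(db0) or nmos_on(rwlb0)) and \
                (nmos_on(rwlb1) or nmos_on(db1))
    # Exactly one network conducts, so OUT is never floating or shorted.
    assert pull_up != pull_down
    return int(pull_up)


def gate_level_208(rwlb0, rwlb1, db0, db1):
    # NAND of the two OR-gate outputs.
    return int(not ((rwlb0 or db0) and (rwlb1 or db1)))


for bits in product((0, 1), repeat=4):
    assert mosfet_circuit_260(*bits) == gate_level_208(*bits)
print("switch-level model matches the OR/NAND description")
```

Under these assumptions the eight-transistor network is a single complex gate that evaluates the two OR functions and the NAND function together.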
FIG. 7 is a schematic diagram of a transistor layout 280 of memory cells 204 and 206 (shown in FIG. 4) and multiplication circuit 260 of FIG. 6, according to some embodiments. Layout 280 includes twenty transistors: each of memory cells 204 and 206 is a 6T SRAM cell, so the two memory cells together include twelve transistors, and multiplication circuit 260 includes eight transistors P0-P3 and N0-N3. The layout includes six active regions 282a-282f and six gate structures 284a-284f. Gate structures 284a and 284f at the top and bottom of layout 280 are both dummy gate structures.
Memory cell 206, which provides the data signal DB[1], has a right pull-up transistor PUR1 and a left pull-up transistor PUL1 in the first active region 282a at gate structures 284c and 284d, respectively. The right and left pass gate transistors PGR1 and PGL1 are in the second active region 282b at gate structures 284b and 284e, respectively, and the right and left pull-down transistors PDR1 and PDL1 are in the second active region 282b at gate structures 284c and 284d, respectively.
Memory cell 204, which provides the data signal DB[0], has a right pass gate transistor PGR0 and a left pass gate transistor PGL0 in the third active region 282c at gate structures 284b and 284e, respectively, and a right pull-down transistor PDR0 and a left pull-down transistor PDL0 in the third active region 282c at gate structures 284c and 284d, respectively. The right pull-up transistor PUR0 and the left pull-up transistor PUL0 are in the fourth active region 282d at gate structures 284c and 284d, respectively.
The multiplication circuit 260 has four PMOS transistors P0-P3 arranged in the fifth active region 282e and four NMOS transistors N0-N3 arranged in the sixth active region 282f. Transistors P1 and N1 are at gate structure 284b, transistors P0 and N0 are at gate structure 284c, transistors P2 and N2 are at gate structure 284d, and transistors P3 and N3 are at gate structure 284e.
As described above, layout 280 includes six active regions 282a-282f and six gate structures 284a-284f. Layout 280 also includes metal over diffusion (MD) layers, such as MD layer 286, that are configured to electrically connect to the active regions 282a-282f. Layout 280 also includes cut MD (CMD) layers, such as CMD layer 288, configured to separate or cut MD layer 286. In some embodiments, layout 280 also includes metal layers, such as metal layer 290, which are either backside metal layers or front side metal layers. In some embodiments, metal layer 290 is part of a power distribution network (PDN) in layout 280.
FIG. 8 is a schematic diagram of a row select and multiply circuit 300 that changes signal polarity: it multiplies an input signal XIN and a data signal D and provides an inverted output signal OUTB, according to some embodiments. The row select and multiply circuit 300 includes a word line driver 302, two 6T SRAM memory cells 304 and 306, and a multiplication circuit 308. The row select and multiply circuit 300 is configured to multiply input data XIN with data D from memory cells 304 and 306. In other embodiments, row select and multiply circuit 300 is configured to multiply input data with data from more than two rows of memory cells (e.g., from three or four rows of memory cells).
Word line driver 302 includes AND gates 310 and 312 electrically connected to multiplication circuit 308. AND gate 310 receives input signal XIN and read select signal RSEL[0] and provides read word line signal RWL[0] to multiplication circuit 308 via first read word line 314, and AND gate 312 receives input signal XIN and read select signal RSEL[1] and provides read word line signal RWL[1] to multiplication circuit 308 via second read word line 316. In some embodiments, word line driver 302 is similar to word line driver 36 (shown in FIG. 1). In some embodiments, word line driver 302 is similar to word line driver 56 (shown in FIG. 2).
Memory cells 304 and 306 are electrically coupled to multiplication circuit 308 and provide their stored data bits to it as data signals D[0] and D[1]. Memory cell 304 provides data signal D[0] to multiplication circuit 308 via data line 318, and memory cell 306 provides data signal D[1] to multiplication circuit 308 via data line 320. Memory cells 304 and 306 are 6T SRAM cells similar to 6T SRAM cell 100 of FIG. 3, except that outputs Q and QB have been interchanged and bit lines BL and BLB have been interchanged; the description of the 6T SRAM cell is therefore not repeated here. In some embodiments, each of memory cells 304 and 306 is similar to one of memory cells 28 (shown in FIG. 1), where memory cells 304 and 306 are from different ones of rows 30 and 32, respectively, of memory cell block 24. In some embodiments, memory cells 304 and 306 are configured to store weight data, such as weights for a CNN.
Multiplication circuit 308 includes a first AND gate 322, a second AND gate 324, and a NOR gate 326. The first AND gate 322 is configured to receive the read word line signal RWL[0] from the word line driver 302 and the data signal D[0] from the memory cell 304. The second AND gate 324 is configured to receive the read word line signal RWL[1] from the word line driver 302 and the data signal D[1] from the memory cell 306. NOR gate 326 receives the output of each of the first AND gate 322 and the second AND gate 324 and provides the multiplication result at the output OUTB. In some embodiments, multiplication circuit 308 is similar to multiplication circuit 26 (shown in FIG. 1). In some embodiments, multiplication circuit 308 is similar to multiplication circuit 58 (shown in FIG. 2).
In operation, to select one of the memory cells 304 and 306, one of the AND gates 310 and 312 in the word line driver 302 receives a logic high (1) read select signal RSEL[0] or RSEL[1], and the other of the AND gates 310 and 312 receives a logic low (0) read select signal. The AND gate 310 or 312 that receives the logic low (0) read select signal is not selected and provides a logic low (0) to one of AND gates 322 and 324, which in turn passes the logic low (0) to one input of NOR gate 326.
The AND gate 310 or 312 that receives the logic high (1) read select signal is selected and passes the input signal XIN to the other of AND gates 322 and 324. That AND gate receives the input signal XIN and one of the data signals D[0] and D[1] and provides an output signal to the other input of NOR gate 326. This multiplies the input signal XIN with the data received on one of the data signals D[0] and D[1]. The NOR gate 326 provides the multiplication result at the inverted output OUTB.
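The AND/NOR variant described above can be sketched in the same way. This Python model is illustrative only; the function names are hypothetical, and the indices 0 and 1 are an assumed labeling (row 0 for AND gates 310 and 322 and memory cell 304, row 1 for AND gates 312 and 324 and memory cell 306).

```python
def word_line_driver_302(xin, rsel0, rsel1):
    # AND gates 310 and 312: an unselected row (RSEL = 0) drives its read
    # word line low; the selected row passes XIN through.
    return xin & rsel0, xin & rsel1


def multiplication_circuit_308(rwl0, rwl1, d0, d1):
    # AND gates 322 and 324 feed NOR gate 326.
    return int(not ((rwl0 & d0) | (rwl1 & d1)))


def row_select_and_multiply_300(xin, rsel0, rsel1, d0, d1):
    rwl0, rwl1 = word_line_driver_302(xin, rsel0, rsel1)
    return multiplication_circuit_308(rwl0, rwl1, d0, d1)


# With row 1 selected, OUTB is the complement of XIN AND D[1].
for xin in (0, 1):
    for d1 in (0, 1):
        outb = row_select_and_multiply_300(xin, 0, 1, d0=1, d1=d1)
        print(f"XIN={xin} D[1]={d1} -> OUTB={outb}")
```

In this model the unselected row contributes a 0 to the NOR gate, so only the selected row's product reaches OUTB, in inverted form.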
Fig. 9 is a schematic diagram of a MOSFET multiplication circuit 340 that is shown providing the functionality of multiplication circuit 308 (shown in fig. 8), according to some embodiments. The multiplication circuit 340 includes eight transistors, i.e., four PMOS transistors 342, 344, 346, and 348 and four NMOS transistors 350, 352, 354, and 356.
The first S/D of PMOS transistor 342 is electrically connected to power supply VDD, and the second S/D of PMOS transistor 342 is electrically connected to the first S/D of PMOS transistor 344. Also, a first S/D of PMOS transistor 346 is electrically connected to power supply VDD, and a second S/D of PMOS transistor 346 is electrically connected to a first S/D of PMOS transistor 348. In addition, the second S/D of PMOS transistor 342 is electrically connected to the second S/D of PMOS transistor 346. At the output OUTB, the second S/D of the PMOS transistor 344 is electrically connected to the second S/D of the PMOS transistor 348, and to the first S/D of each of the NMOS transistors 350 and 354. The second S/D of NMOS transistor 350 is electrically connected to the first S/D of NMOS transistor 352 and the second S/D of NMOS transistor 354 is electrically connected to the first S/D of NMOS transistor 356. The second S/D of each of NMOS transistors 352 and 356 is electrically connected to a reference VSS, e.g., ground.
The gates of PMOS transistor 344 and NMOS transistor 352 are electrically connected together to receive data signal D[0], and the gates of PMOS transistor 346 and NMOS transistor 354 are electrically connected together to receive data signal D[1]. In addition, the gates of PMOS transistor 348 and NMOS transistor 350 are electrically connected together to receive read word line signal RWL[0], and the gates of PMOS transistor 342 and NMOS transistor 356 are electrically connected together to receive read word line signal RWL[1].
In operation, if read word line signal RWL[1] is at logic low (0), PMOS transistor 342 is biased on and NMOS transistor 356 is biased off. Further, if read word line signal RWL[0] is at logic high (1), PMOS transistor 348 is biased off and NMOS transistor 350 is biased on. Thus, if the data signal D[0] is at logic low (0), PMOS transistor 344 is biased on and NMOS transistor 352 is biased off so that the output OUTB is at logic high (1), and if the data signal D[0] is at logic high (1), PMOS transistor 344 is biased off and NMOS transistor 352 is biased on so that the output OUTB is at logic low (0).
Conversely, if read word line signal RWL[0] is at logic low (0), PMOS transistor 348 is biased on and NMOS transistor 350 is biased off, and if read word line signal RWL[1] is at logic high (1), PMOS transistor 342 is biased off and NMOS transistor 356 is biased on. Thus, if the data signal D[1] is at logic low (0), PMOS transistor 346 is biased on and NMOS transistor 354 is biased off so that the output OUTB is at logic high (1), and if the data signal D[1] is at logic high (1), PMOS transistor 346 is biased off and NMOS transistor 354 is biased on so that the output OUTB is at logic low (0).
If both read word line signals RWL[0] and RWL[1] are at logic low (0), PMOS transistors 342 and 348 are biased on and NMOS transistors 350 and 356 are biased off, so that the output OUTB is at logic high (1).
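As a cross-check of the connectivity described for multiplication circuit 340, the following Python sketch models each transistor as an ideal switch and compares the result with the AND/NOR description. It is illustrative only: the pairing of signals follows the series NMOS stacks 350-352 and 354-356, while the labels 0 and 1 for the two pairs and the function names are assumptions.

```python
from itertools import product


def mosfet_circuit_340(rwl0, rwl1, d0, d1):
    # PMOS transistors conduct on a low gate, NMOS transistors on a high gate.
    # Pull-up: (342 parallel 346) in series with (344 parallel 348).
    pull_up = (rwl1 == 0 or d1 == 0) and (d0 == 0 or rwl0 == 0)
    # Pull-down: series stack 350-352 in parallel with series stack 354-356.
    pull_down = (rwl0 == 1 and d0 == 1) or (d1 == 1 and rwl1 == 1)
    # Exactly one network conducts for every input combination.
    assert pull_up != pull_down
    return int(pull_up)


def gate_level_308(rwl0, rwl1, d0, d1):
    # NOR of the two AND-gate outputs.
    return int(not ((rwl0 and d0) or (rwl1 and d1)))


for bits in product((0, 1), repeat=4):
    assert mosfet_circuit_340(*bits) == gate_level_308(*bits)
print("switch-level model matches the AND/NOR description")
```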
Fig. 10 is a schematic diagram of a transistor layout 360 of memory cells 304 and 306 (shown in fig. 8) and multiplication circuit 340 of fig. 9, shown according to some embodiments. Layout 360 includes twenty transistors, where each of memory cells 304 and 306 is a 6T SRAM cell, such that two memory cells 304 and 306 include twelve transistors, and multiplication circuit 340 includes eight transistors P0-P3 and N0-N3. The layout includes six active regions 362a-362f and six gate structures 364a-364f. Gate structures 364a and 364f located at the top and bottom of layout 360 are dummy gate structures.
Memory cell 306, which provides the data signal D[1], has a right pass gate transistor PGR1 and a left pass gate transistor PGL1 in the first active region 362a at gate structures 364b and 364e, respectively, and a right pull-down transistor PDR1 and a left pull-down transistor PDL1 in the first active region 362a at gate structures 364c and 364d, respectively. The right pull-up transistor PUR1 and the left pull-up transistor PUL1 are in the second active region 362b at gate structures 364c and 364d, respectively.
Memory cell 304, which provides the data signal D[0], has a right pull-up transistor PUR0 and a left pull-up transistor PUL0 in the third active region 362c at gate structures 364c and 364d, respectively. The right and left pass gate transistors PGR0 and PGL0 are in the fourth active region 362d at gate structures 364b and 364e, respectively, and the right and left pull-down transistors PDR0 and PDL0 are in the fourth active region 362d at gate structures 364c and 364d, respectively.
The multiplication circuit 340 has four NMOS transistors N0-N3 arranged in the fifth active region 362e and four PMOS transistors P0-P3 arranged in the sixth active region 362f. Transistors P1 and N1 are at gate structure 364b, transistors P0 and N0 are at gate structure 364c, transistors P2 and N2 are at gate structure 364d, and transistors P3 and N3 are at gate structure 364e.
As described above, layout 360 includes six active regions 362a-362f and six gate structures 364a-364f. Layout 360 also includes an MD layer, such as MD layer 366, configured to electrically connect to active regions 362a-362f. Layout 360 also includes CMD layers, such as CMD layer 368, configured to separate or cut MD layers 366. In some embodiments, layout 360 also includes a metal layer, such as metal layer 370, which is a back side metal layer or a front side metal layer. In some embodiments, metal layer 370 is part of a Power Distribution Network (PDN) in layout 360.
FIG. 11 is a schematic diagram of a three-row multiplication circuit 400, according to some embodiments. The three-row multiplication circuit 400 is configured to multiply the inverted input XINB with data from each of three memory cells (not shown) and provide a multiplication result. The three-row multiplication circuit 400 includes a first OR gate 402, a second OR gate 404, a third OR gate 406, and a NAND gate 408. The outputs of the first OR gate 402, the second OR gate 404, and the third OR gate 406 are each electrically connected to an input of NAND gate 408. In some embodiments, the three-row multiplication circuit 400 is similar to multiplication circuit 26 (shown in FIG. 1). In some embodiments, the three-row multiplication circuit 400 is similar to multiplication circuit 58 (shown in FIG. 2). In some embodiments, the three-row multiplication circuit 400 is substantially similar to the multiplication circuit 208 (shown in FIG. 4).
The first OR gate 402, the second OR gate 404, and the third OR gate 406 are configured to receive read word line signals from a word line driver (not shown). In some embodiments, the word line drivers are similar to word line drivers 36 (shown in FIG. 1). In some embodiments, the word line drivers are similar to word line drivers 56 (shown in FIG. 2). In some embodiments, the word line driver is similar to word line driver 202 (shown in FIG. 4).
In some embodiments, the first OR gate 402 is configured to receive the read word line signal RWLB[0] from the word line driver and the data signal DB[0] from the first memory cell, the second OR gate 404 is configured to receive the read word line signal RWLB[1] from the word line driver and the data signal DB[1] from the second memory cell, and the third OR gate 406 is configured to receive the read word line signal RWLB[2] from the word line driver and the data signal DB[2] from the third memory cell. The NAND gate 408 receives the output of each of the first OR gate 402, the second OR gate 404, and the third OR gate 406 and provides a multiplication result at the output OUT of the NAND gate 408.
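The three-row arrangement extends the two-row logic by one OR gate and a three-input NAND gate. The following Python sketch is illustrative only; the function names and the 0 to 2 row indices are hypothetical, and the word line driver is assumed to behave like word line driver 202 (logic high on unselected rows, XINB on the selected row).

```python
def read_word_lines(xin, selected, rows=3):
    """Assumed driver behavior: XINB on the selected row, 1 elsewhere."""
    return [int(not xin) if i == selected else 1 for i in range(rows)]


def multiplication_circuit_400(rwlb, db):
    """OR gates 402, 404, 406 feeding three-input NAND gate 408."""
    if len(rwlb) != 3 or len(db) != 3:
        raise ValueError("expected three read word lines and three data signals")
    or_outputs = [r | d for r, d in zip(rwlb, db)]
    return int(not all(or_outputs))


# Row 2 selected: OUT depends only on XIN and DB[2].
db = [1, 0, 0]
print(multiplication_circuit_400(read_word_lines(1, selected=2), db))  # prints 1
```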
Fig. 12 is a schematic diagram illustrating a MOSFET multiplication circuit 420 that provides the functionality of the multiplication circuit 400 of fig. 11, according to some embodiments. Multiplication circuit 420 includes twelve transistors, namely six PMOS transistors 422, 424, 426, 428, 430, and 432, and six NMOS transistors 434, 436, 438, 440, 442, and 444.
The first S/D of PMOS transistor 422 is electrically connected to power supply VDD, and the second S/D of PMOS transistor 422 is electrically connected to the first S/D of PMOS transistor 424. In addition, the second S/D of PMOS transistor 424 is electrically connected to the first S/D of PMOS transistor 426. The first S/D of PMOS transistor 428 is electrically connected to power supply VDD, and the second S/D of PMOS transistor 428 is electrically connected to the first S/D of PMOS transistor 430. The second S/D of PMOS transistor 430 is electrically connected to the first S/D of PMOS transistor 432. At the output OUT, the second S/D of PMOS transistor 426 is electrically connected to the second S/D of PMOS transistor 432 and the first S/D of each of NMOS transistors 434 and 440. The second S/D of NMOS transistor 434 is electrically connected to the second S/D of NMOS transistor 440 and the first S/D of each of NMOS transistors 436 and 442. The second S/D of NMOS transistor 436 is electrically connected to the second S/D of NMOS transistor 442 and the first S/D of each of NMOS transistors 438 and 444. The second S/D of each of NMOS transistors 438 and 444 is electrically connected to a reference VSS, such as ground.
The gates of the six PMOS transistors 422, 424, 426, 428, 430, and 432 and the six NMOS transistors 434, 436, 438, 440, 442, and 444 are connected to one another and to the read word line signals RWLB[0] through RWLB[2] and the data signals DB[0] through DB[2] to perform the functions of the multiplication circuit 400 of FIG. 11.
FIG. 13 is a schematic diagram of a four-row multiplication circuit 450, according to some embodiments. The four-row multiplication circuit 450 is configured to multiply the inverted input XINB with data from each of four memory cells (not shown) and provide a multiplication result. In some embodiments, the four-row multiplication circuit 450 is similar to multiplication circuit 26 (shown in FIG. 1). In some embodiments, the four-row multiplication circuit 450 is similar to multiplication circuit 58 (shown in FIG. 2). In some embodiments, the four-row multiplication circuit 450 is substantially similar to multiplication circuit 208 (shown in FIG. 4).
The four-row multiplication circuit 450 includes a first OR gate 452, a second OR gate 454, a third OR gate 456, and a fourth OR gate 458. The four-row multiplication circuit 450 also includes a first NAND gate 460, a second NAND gate 462, and a NOR gate 464. Each output of the first OR gate 452, the second OR gate 454, the third OR gate 456, and the fourth OR gate 458 is electrically connected to an input of one of the NAND gates. The outputs of the first OR gate 452 and the second OR gate 454 are electrically connected to the input of the first NAND gate 460, and the outputs of the third OR gate 456 and the fourth OR gate 458 are electrically connected to the input of the second NAND gate 462. The outputs of the first NAND gate 460 and the second NAND gate 462 are electrically connected to the input of the NOR gate 464.
The first OR gate 452, the second OR gate 454, the third OR gate 456, and the fourth OR gate 458 are configured to receive read word line signals from a word line driver (not shown). In some embodiments, the word line drivers are similar to word line drivers 36 (shown in FIG. 1). In some embodiments, the word line drivers are similar to word line drivers 56 (shown in FIG. 2). In some embodiments, the word line driver is similar to word line driver 202 (shown in FIG. 4).
In some embodiments, the first OR gate 452 is configured to receive the read word line signal RWLB[0] from the word line driver and the data signal DB[0] from the first memory cell, the second OR gate 454 is configured to receive the read word line signal RWLB[1] from the word line driver and the data signal DB[1] from the second memory cell, the third OR gate 456 is configured to receive the read word line signal RWLB[2] from the word line driver and the data signal DB[2] from the third memory cell, and the fourth OR gate 458 is configured to receive the read word line signal RWLB[3] from the word line driver and the data signal DB[3] from the fourth memory cell. NAND gates 460 and 462 receive the outputs of the first OR gate 452, second OR gate 454, third OR gate 456, and fourth OR gate 458 and provide their outputs to NOR gate 464, which provides a multiplication result at output OUT.
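The two-level NAND/NOR structure of the four-row circuit can be modeled literally as described. This Python sketch is illustrative only; the function names and the 0 to 3 row indices are hypothetical, and the driver is assumed to place XINB on the selected row and logic high on the others. Under this literal reading, NOR gate 464 outputs the logical AND of the four OR-gate outputs, so the model's output has the opposite polarity to the NAND-terminated two-row circuit 208; whether a further inversion is intended is not stated in the text.

```python
def read_word_lines(xin, selected, rows=4):
    """Assumed driver behavior: XINB on the selected row, 1 elsewhere."""
    return [int(not xin) if i == selected else 1 for i in range(rows)]


def multiplication_circuit_450(rwlb, db):
    """OR gates 452-458, NAND gates 460 and 462, and NOR gate 464."""
    if len(rwlb) != 4 or len(db) != 4:
        raise ValueError("expected four read word lines and four data signals")
    or_out = [r | d for r, d in zip(rwlb, db)]
    nand_460 = int(not (or_out[0] and or_out[1]))
    nand_462 = int(not (or_out[2] and or_out[3]))
    return int(not (nand_460 or nand_462))


# Row 3 selected with XIN = 1 and DB[3] = 0: the selected OR gate outputs 0.
print(multiplication_circuit_450(read_word_lines(1, selected=3), [1, 1, 1, 0]))  # prints 0
```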
Fig. 14 is a schematic diagram showing a table 470 illustrating the number of Read Word Lines (RWLs) and the number of transistors (Tr) in a conventional read port (Conv) and the New multiplication circuit (New) of the present disclosure, according to some embodiments. Row 472 indicates the number of Read Word Lines (RWL) and row 474 indicates the number of transistors (Tr) in the conventional read port and the new multiplication circuit.
As shown in column 476, for two rows of memory cells, the conventional read port includes five RWLs and twelve transistors, whereas the new multiplication circuit, such as the multiplication circuit 208 (shown in FIG. 4) implemented by the MOSFET multiplication circuit 260 of FIG. 6, or the multiplication circuit 308 (shown in FIG. 8) implemented by the MOSFET multiplication circuit 340 of FIG. 9, includes only two RWLs and eight transistors. This eliminates three RWLs and four transistors, thereby reducing the area used in the integrated circuit.
As shown in column 478, for three rows of memory cells, the conventional read port includes seven RWLs and sixteen transistors, whereas the new multiplication circuit, such as the multiplication circuit 400 of FIG. 11 implemented by the MOSFET multiplication circuit 420 of FIG. 12, includes only three RWLs and twelve transistors. This eliminates four RWLs and four transistors, thereby reducing the area used in the integrated circuit.
As shown in column 480, for four rows of memory cells, the conventional read port includes nine RWLs and twenty transistors, whereas the new multiplication circuit includes only four RWLs and twenty transistors. This eliminates five RWLs, thereby reducing the area and/or routing used in the integrated circuit.
As shown in column 482, for five rows of memory cells, the conventional read port includes eleven RWLs and twenty-four transistors, whereas the new multiplication circuit includes five RWLs and thirty transistors. This eliminates six RWLs but adds six transistors, so it does not reduce the area used in the integrated circuit.
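The comparison in table 470 reduces to simple differences between the stated counts. The following Python sketch tabulates only the numbers given in the text and computes the differences; the dictionary layout and function name are hypothetical.

```python
# Counts as stated for table 470, keyed by the number of memory cell rows.
TABLE_470 = {
    2: {"conv_rwl": 5, "conv_tr": 12, "new_rwl": 2, "new_tr": 8},
    3: {"conv_rwl": 7, "conv_tr": 16, "new_rwl": 3, "new_tr": 12},
    4: {"conv_rwl": 9, "conv_tr": 20, "new_rwl": 4, "new_tr": 20},
    5: {"conv_rwl": 11, "conv_tr": 24, "new_rwl": 5, "new_tr": 30},
}


def savings(rows):
    """Return (RWLs saved, transistors saved); negative means an increase."""
    entry = TABLE_470[rows]
    return (entry["conv_rwl"] - entry["new_rwl"],
            entry["conv_tr"] - entry["new_tr"])


for rows in sorted(TABLE_470):
    rwl_saved, tr_saved = savings(rows)
    print(f"{rows} rows: {rwl_saved} fewer RWLs, {tr_saved:+d} transistors saved")
```

On these figures the transistor saving falls to zero at four rows and becomes an increase at five rows, while the read word line saving keeps growing by one per added row.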
FIG. 15 is a schematic diagram of a latched SRAM cell 500 shown in accordance with some embodiments. SRAM cell 500 is an eight transistor (8T) SRAM cell. In some embodiments, SRAM cell 500 is similar to one or more memory cells 28 (shown in fig. 1). In some embodiments, SRAM cell 500 is similar to one or more of memory cells 52 and 54 (as shown in fig. 2). In some embodiments, SRAM cell 500 is used in CIM device 20 of fig. 1. In some embodiments, SRAM cell 500 is used in row select and multiply circuit 50 of fig. 2. In other embodiments, SRAM cell 500 may include more or less than eight transistors.
SRAM cell 500 includes two cross-coupled inverters 502 and 504. The first inverter 502 includes a first PMOS/NMOS transistor pair 506 and 508, and the second inverter 504 includes a second PMOS/NMOS transistor pair 510 and 512. SRAM cell 500 further includes a latch circuit that includes a PMOS latch gate transistor 514, an NMOS latch gate transistor 516, and a pass gate (transmission gate) 518 that includes an NMOS transistor 520 and a PMOS transistor 522.
The first S/D of the PMOS latch gate transistor 514 is electrically connected to the power supply VDD, and the second S/D of the PMOS latch gate transistor 514 is electrically connected to the first S/D of the left pull-up transistor 506. The first S/D of the NMOS latch gate transistor 516 is electrically connected to a reference voltage VSS, such as ground, and the second S/D of the NMOS latch gate transistor 516 is electrically connected to the first S/D of the left pull-down transistor 508. In addition, the first S/D of the right pull-up transistor 510 is electrically connected to the power supply VDD, and the first S/D of the right pull-down transistor 512 is electrically connected to the reference voltage VSS.
The second S/D of the left pull-up transistor 506 is electrically connected to the second S/D of the left pull-down transistor 508 and the gates of the right pull-up transistor 510 and the right pull-down transistor 512, and to the first S/D of each of the NMOS transistor 520 and the PMOS transistor 522. The second S/D of each of the NMOS transistor 520 and the PMOS transistor 522 is electrically connected to the bit line BL. Further, the second S/D of the right pull-up transistor 510 is electrically connected to the second S/D of the right pull-down transistor 512 and to the gate of the left pull-up transistor 506 and the gate of the left pull-down transistor 508.
The data bit is stored in SRAM cell 500 as a voltage at node Q and may be read through bit line BL via pass gate 518, which controls access to node Q. The complementary node QB stores the complement of the value at node Q, so that if Q is high, QB is low, and vice versa. The gates of PMOS latch gate transistor 514 and NMOS transistor 520 are controlled by latch signal L, and the gates of NMOS latch gate transistor 516 and PMOS transistor 522 are controlled by complementary latch signal LB.
In operation, to write the SRAM cell 500, the latch signal L is set to a high voltage (1) and the complementary latch signal LB is set to a low voltage (0). This bias turns on the pass gate 518, which includes the NMOS transistor 520 and the PMOS transistor 522, and turns off the PMOS latch gate transistor 514 and the NMOS latch gate transistor 516. The data voltage on bit line BL is transferred to node Q and the gates of right pull-up transistor 510 and right pull-down transistor 512, which provides a complementary data voltage at node QB and to the gates of left pull-up transistor 506 and left pull-down transistor 508. Next, the latch signal L is switched to a low voltage (0), and the complementary latch signal LB is switched to a high voltage (1). This latches the voltages at node Q and node QB. To read the voltage at node Q, latch signal L is set to a high voltage (1) and complementary latch signal LB is set to a low voltage (0) to bias on pass gate 518 and bias off PMOS latch gate transistor 514 and NMOS latch gate transistor 516.
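The write, latch, and read sequence above can be summarized with a purely behavioral model. This Python sketch is illustrative only and ignores all analog behavior; the class and method names are hypothetical, and the complementary latch signal LB is assumed to always be the inverse of L.

```python
from dataclasses import dataclass


@dataclass
class LatchedSramCell500:
    q: int = 0  # logic value held at node Q

    @property
    def qb(self):
        # Node QB holds the complement of node Q.
        return 1 - self.q

    def apply(self, latch, bl=None):
        """Apply latch signal L and, optionally, a value driven onto BL.

        Returns the value seen on BL when pass gate 518 is on, else None.
        """
        # L = 1 (LB = 0) turns on NMOS 520 and PMOS 522 and turns off
        # latch gate transistors 514 and 516.
        pass_gate_on = latch == 1
        if pass_gate_on and bl is not None:
            self.q = bl  # write: the bit line value is transferred to node Q
        return self.q if pass_gate_on else None


cell = LatchedSramCell500()
cell.apply(latch=1, bl=1)    # write a 1
cell.apply(latch=0, bl=0)    # latched: bit line activity is ignored
print(cell.apply(latch=1))   # read back the stored bit; prints 1
```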
FIG. 16 is a schematic diagram of a row select and multiply circuit 530 including a word line driver (not shown), two 8T SRAM cells 534 and 536, and a multiplication circuit 538, shown in accordance with some embodiments. The row select and multiply circuit 530 is configured to multiply the inverted input data XINB with the inverted data DB from a selected one of the memory cells 534 and 536. In other embodiments, row select and multiply circuit 530 is configured to multiply input data with data from more than two rows of memory cells, such as from three or four rows of memory cells.
The word line driver is similar to word line driver 202 (shown in FIG. 4) and will not be described in detail herein. The word line driver provides a first read word line signal RWLB to multiplication circuit 538 via a first read word line 544 and a second read word line signal RWLB to multiplication circuit 538 via a second read word line 546. In some embodiments, the word line driver is similar to word line drivers 36 (shown in FIG. 1). In some embodiments, the word line driver is similar to word line drivers 56 (shown in FIG. 2).
The memory cells 534 and 536 are electrically connected to the multiplication circuit 538 to provide stored data bits, in the form of a respective data signal DB from each cell, to the multiplication circuit 538. Memory cell 534 is configured to provide its data signal DB to multiplication circuit 538 via data line 548, and memory cell 536 is configured to provide its data signal DB to multiplication circuit 538 via data line 550. Each of the memory cells 534 and 536 is similar to the SRAM cell 500 of FIG. 15 and will not be described again here. Further, in some embodiments, each of memory cells 534 and 536 is similar to one of memory cells 28 (shown in FIG. 1), where memory cells 534 and 536 are from different ones of rows 30 and 32, respectively, of memory cell block 24. In some embodiments, memory cells 534 and 536 are configured to store weight data, such as weights for CNNs.
Multiplication circuit 538 includes a first OR gate 552, a second OR gate 554, and a NAND gate 556. The first OR gate 552 is configured to receive a read word line signal RWLB from the word line driver and a data signal DB from the memory cell 534. The second OR gate 554 is configured to receive a read word line signal RWLB from the word line driver and a data signal DB from the memory cell 536. The NAND gate 556 receives the output from each of the first OR gate 552 and the second OR gate 554 and provides the multiplication result at the output OUT. In some embodiments, multiplication circuit 538 is similar to multiplication circuit 26 (shown in FIG. 1). In some embodiments, multiplication circuit 538 is similar to multiplication circuit 58 (shown in FIG. 2).
In operation, the read word driver deselects one of the memory cells 534 and 536 by transmitting a logic high (1) to one of the OR gates 552 and 554 (that OR gate 552 or 554 in turn transmits a logic high (1) to one input of the NAND gate 556). The read word driver selects the other of the memory cells 534 and 536 by transmitting an inverted input signal XINB to the other of the OR gates 552 and 554. This selected OR gate 552 or 554 receives the inverted input signal XINB and the data signal DB from the selected one of memory cells 534 and 536 and provides an output signal to the other input of NAND gate 556. NAND gate 556 provides the multiplication result at output OUT.
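The selection and multiplication behavior described above can be checked with a short gate-level sketch. The function names are hypothetical, and signals are modeled as the integers 0 and 1. With one OR gate forced high, the NAND gate reduces to NOT(XINB OR DB), which equals XIN AND D for the selected cell.

```python
def or_gate(a, b):
    return a | b


def nand_gate(a, b):
    return 1 - (a & b)


def multiplication_circuit(rwlb_first, db_first, rwlb_second, db_second):
    """OR gate 552, OR gate 554, and NAND gate 556 (gate-level sketch)."""
    return nand_gate(or_gate(rwlb_first, db_first), or_gate(rwlb_second, db_second))


def select_and_multiply(xin, d_first, d_second, selected):
    """Drive XINB on the selected read word line and logic high on the other."""
    xinb = 1 - xin
    db = (1 - d_first, 1 - d_second)   # cells provide inverted data DB
    rwlb = [1, 1]                      # logic high deselects a cell
    rwlb[selected] = xinb              # selected cell receives XINB
    return multiplication_circuit(rwlb[0], db[0], rwlb[1], db[1])


# Exhaustive check: OUT equals the input bit times the selected stored bit.
for xin in (0, 1):
    for d_first in (0, 1):
        for d_second in (0, 1):
            assert select_and_multiply(xin, d_first, d_second, 0) == xin & d_first
            assert select_and_multiply(xin, d_first, d_second, 1) == xin & d_second
```

The deselected cell's data has no effect on OUT because its OR gate output is already logic high.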
FIG. 17 is a schematic diagram of a transistor layout 560 of memory cells 534 and 536 (shown in FIG. 16) and multiplication circuit 538 (shown in FIG. 16) shown in accordance with some embodiments. Multiplication circuit 538 is similar to multiplication circuit 208 (shown in FIG. 4) and is arranged similar to MOSFET multiplication circuit 260 of FIG. 6, with four PMOS transistors P0-P3 and four NMOS transistors N0-N3. In addition, each of memory cells 534 and 536 is similar to the SRAM cell 500 of FIG. 15, and therefore the reference numerals of FIG. 15 are used in the description of the transistor layout 560.
Layout 560 includes twenty-four transistors: each of memory cells 534 and 536 is an 8T SRAM cell, so that the two memory cells 534 and 536 together include sixteen transistors, and multiplication circuit 538 includes eight transistors P0-P3 and N0-N3. The layout includes six active regions 562a-562f and six gate structures 564a-564f. Gate structures 564a and 564f at the top and bottom of layout 560 are dummy gate structures.
The memory cell 536 providing the data signal DB is arranged with the NMOS transistor 520 (N11), the left pull-down transistor 508 (N10), the NMOS latch gate transistor 516 (N9), and the right pull-down transistor 512 (N8) at the first active region 562a and the gate structures 564b, 564c, 564d, and 564e, respectively. The memory cell 536 is also arranged with a PMOS transistor 522 (P11), a left pull-up transistor 506 (P10), a PMOS latch-gate transistor 514 (P9), and a right pull-up transistor 510 (P8) at the second active region 562b and the gate structures 564b, 564c, 564d, and 564e, respectively.
The memory cell 534 providing the data signal DB is arranged with the NMOS transistor 520 (N7), the left pull-down transistor 508 (N6), the NMOS latch gate transistor 516 (N5), and the right pull-down transistor 512 (N4) at the third active region 562c and the gate structures 564b, 564c, 564d, and 564e, respectively. The memory cell 534 is also arranged with a PMOS transistor 522 (P7), a left pull-up transistor 506 (P6), a PMOS latch gate transistor 514 (P5), and a right pull-up transistor 510 (P4) located at the fourth active region 562d and gate structures 564b, 564c, 564d, and 564e, respectively.
Multiplication circuit 538 has four PMOS transistors P0-P3 arranged in fifth active region 562e and four NMOS transistors N0-N3 arranged in sixth active region 562f. Transistors P1 and N1 are at gate structure 564b, transistors P0 and N0 are at gate structure 564c, transistors P2 and N2 are at gate structure 564d, and transistors P3 and N3 are at gate structure 564e.
As described above, layout 560 includes six active regions 562a-562f and six gate structures 564a-564f, similar to layout 280 of FIG. 7. Layout 560 also includes an MD layer, such as MD layer 566, which is configured to electrically connect to active regions 562a-562f. Layout 560 also includes a CMD layer, such as CMD layer 568, configured to separate or cut MD layer 566. In some embodiments, layout 560 further includes a metal layer, such as metal layer 570, which is a back side metal layer or a front side metal layer. In some embodiments, metal layer 570 is part of a Power Distribution Network (PDN) in layout 560.
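The transistor placement described for layout 560 can be tabulated as a small lookup structure, which makes the twenty-four transistor count easy to confirm. The dictionary layout is a hypothetical bookkeeping aid only. The letters "b" through "e" stand for the four non-dummy gate structures in the order given in the text, and no geometry is modeled.

```python
# The four non-dummy gate structures, in the order transistors are listed in the text.
GATES = ("b", "c", "d", "e")

# Active region -> transistors placed at gate structures b, c, d, e respectively.
ROWS = {
    "562a": ("N11", "N10", "N9", "N8"),  # memory cell 536, NMOS
    "562b": ("P11", "P10", "P9", "P8"),  # memory cell 536, PMOS
    "562c": ("N7", "N6", "N5", "N4"),    # memory cell 534, NMOS
    "562d": ("P7", "P6", "P5", "P4"),    # memory cell 534, PMOS
    "562e": ("P1", "P0", "P2", "P3"),    # multiplication circuit 538, PMOS
    "562f": ("N1", "N0", "N2", "N3"),    # multiplication circuit 538, NMOS
}

# Transistor name -> (active region, gate structure letter).
placement = {
    name: (region, gate)
    for region, names in ROWS.items()
    for name, gate in zip(names, GATES)
}

pmos = [name for name in placement if name.startswith("P")]
nmos = [name for name in placement if name.startswith("N")]
print(len(placement), len(pmos), len(nmos))
```

Each gate structure letter appears once per active region, so every non-dummy gate structure carries six transistors in this tabulation.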
FIG. 18 is a schematic diagram of a row select and multiply circuit 600 including a word line driver (not shown), two 1T1C memory cells 602 and 604, and a multiplication circuit 606, according to some embodiments. The row select and multiply circuit 600 is configured to multiply the inverted input data XINB with the inverted data signal DB from a selected one of the memory cells 602 and 604. In other embodiments, row select and multiply circuit 600 is configured to multiply input data with data from more than two rows of memory cells, such as from three or four rows of memory cells.
The word line driver (not shown) is similar to word line driver 202 (shown in FIG. 4) and will not be described again here. The word line driver provides a first read word line signal RWLB to the multiplication circuit 606 through a first read word line 608 and a second read word line signal RWLB to the multiplication circuit 606 through a second read word line 610. In some embodiments, the word line driver is similar to word line drivers 36 (shown in FIG. 1). In some embodiments, the word line driver is similar to word line drivers 56 (shown in FIG. 2).
The memory cells 602 and 604 are electrically connected to the multiplication circuit 606 to provide stored data bits, in the form of a respective data signal DB from each cell, to the multiplication circuit 606. The memory cell 602 is configured to provide its data signal DB to the multiplication circuit 606 via a data line 612, and the memory cell 604 is configured to provide its data signal DB to the multiplication circuit 606 via a data line 614. Further, in some embodiments, each of memory cells 602 and 604 is similar to one of memory cells 28 (shown in FIG. 1), where memory cells 602 and 604 are from different ones of rows 30 and 32, respectively, of memory cell block 24. In some embodiments, the memory cells 602 and 604 are configured to store weight data, such as weights for CNNs.
The memory cell 602 includes a first transistor 616 and a first capacitor 618. One S/D of the first transistor 616 is electrically connected to the bit bar line BLB, and the other S/D of the first transistor 616 is electrically connected to one side of the first capacitor 618. The other side of the first capacitor 618 is electrically connected to a reference VSS, such as ground. The gate of the first transistor 616 is electrically connected to the word line WL for reading data from the first capacitor 618 and writing data to the first capacitor 618. The one side of the first capacitor 618 is also electrically connected to provide the data signal DB to the multiplication circuit 606 via the data line 612.
The memory cell 604 includes a second transistor 620 and a second capacitor 622. One S/D of the second transistor 620 is electrically connected to the bit bar line BLB and the other S/D of the second transistor 620 is electrically connected to one side of the second capacitor 622. The other side of the second capacitor 622 is electrically connected to a reference VSS, such as ground. The gate of the second transistor 620 is electrically connected to the word line WL to read data from the second capacitor 622 and write data to the second capacitor 622. This side of the second capacitor 622 is electrically connected to provide the data signal DB to the multiplication circuit 606 via the data line 614.
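The 1T1C cells described above can be sketched as a behavioral model in which the access transistor connects the bit bar line BLB to the storage node only while the word line WL is high. The class name `OneT1CCell` is hypothetical, levels are reduced to 0 and 1, and charge leakage, refresh, and destructive-read effects are deliberately ignored.

```python
class OneT1CCell:
    """Behavioral sketch of a 1T1C cell such as memory cell 602 or 604 (hypothetical)."""

    def __init__(self):
        # Logic level on the capacitor side that connects to the access transistor.
        # The other capacitor side is tied to the reference VSS and is not modeled.
        self.node = 0

    def write(self, wl, blb):
        # The access transistor conducts only while the word line WL is high.
        if wl == 1:
            self.node = blb

    @property
    def db(self):
        # The storage node drives the data line toward the multiplication circuit.
        return self.node


def store_weight(cell, weight):
    """Store a weight bit by driving its complement on BLB (assumed polarity)."""
    cell.write(wl=1, blb=1 - weight)


cell = OneT1CCell()
store_weight(cell, 1)
cell.write(wl=0, blb=1)   # ignored: word line is low
print(cell.db)            # inverted data signal DB for weight 1
```

Because the cell is written from BLB, the level held on the capacitor in this sketch is the inverted data DB, which is the polarity the multiplication circuit 606 consumes.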
Multiplication circuit 606 includes a first OR gate 624, a second OR gate 626, and a NAND gate 628. The first OR gate 624 is configured to receive the read word line signal RWLB from the word line driver and the data signal DB from the memory cell 602. The second OR gate 626 is configured to receive a read word line signal RWLB from the word line driver and a data signal DB from the memory cell 604. The NAND gate 628 receives the output from each of the first OR gate 624 and the second OR gate 626 and provides the multiplication result at the output OUT. In some embodiments, multiplication circuit 606 is similar to multiplication circuit 26 (shown in FIG. 1). In some embodiments, multiplication circuit 606 is similar to multiplication circuit 58 (shown in FIG. 2).
In operation, the read word driver deselects one of the memory cells 602 and 604 by transmitting a logic high (1) to one of the OR gates 624 and 626 (that OR gate 624 or 626 in turn transmits a logic high (1) to one input of NAND gate 628). The read word driver selects the other of the memory cells 602 and 604 by transmitting an inverted input signal XINB to the other of the OR gates 624 and 626. This selected OR gate 624 or 626 receives the inverted input signal XINB and the data signal DB from the selected one of memory cells 602 and 604 and provides an output signal to the other input of NAND gate 628. NAND gate 628 provides the multiplication result at output OUT.
FIG. 19 is a schematic diagram illustrating a method of performing multiplication in integrated circuit memory, according to some embodiments. In some embodiments, the method is performed in a CIM device and a CNN application.
At step 700, the method includes storing a first bit at a first node of a first memory cell, and at step 702, the method includes storing a second bit at a second node of a second memory cell. In some embodiments, the first memory cell is one of memory cells 28, 52, 204, 304, 534, and 602. In some embodiments, the second memory cell is one of memory cells 28, 54, 206, 306, 536, and 604. In some embodiments, each of the first and second memory cells is one of the memory cells 28, 52, 54, 204, 206, 304, 306, 534, 536, 602, and 604.
At step 704, the method includes receiving a read select signal and an input signal at a select circuit. In some embodiments, the select circuit is similar to one of the word line driver circuits 36, 56, 202, and 302. In some embodiments, receiving the read select signal and the input signal at the select circuit includes receiving one of the read select signals and the input signal at a first select logic element and receiving the other of the read select signals and the input signal at a second select logic element. In some embodiments, one or more of the first and second select logic elements are NAND gates. In some embodiments, one or more of the first and second select logic elements are AND gates.
At step 706, the method includes outputting, by the selection circuit, a read word line output signal based on the read selection signal and the input signal. In some embodiments, outputting, by the selection circuit, the read wordline output signal based on the read select signal and the input signal includes outputting, by the first selection logic element, one of the read wordline output signals and outputting, by the second selection logic element, the other of the read wordline output signals.
At step 708, the method includes receiving, at a multiplication circuit, a read word line output signal, a first bit, and a second bit, and at step 710, outputting, by the multiplication circuit, a multiplication result. In some embodiments, the multiplication circuit is similar to one of the multiplication circuits 26, 58, 208, 308, 538, and 606.
In some embodiments, receiving the read word line output signal, the first bit, and the second bit at the multiplication circuit includes receiving one of the read word line output signals and the first bit at a first logic element and receiving the other of the read word line output signals and the second bit at a second logic element. In some embodiments, the method includes receiving, at a third logic element, a first output from the first logic element based on the one of the read word line output signals and the first bit, receiving, at the third logic element, a second output from the second logic element based on the other of the read word line output signals and the second bit, and outputting, from the third logic element, the multiplication result based on the first output and the second output. In some embodiments, the first logic element is one of an OR gate and an AND gate. In some embodiments, the second logic element is one of an OR gate and an AND gate. In some embodiments, the third logic element is one of a NAND gate and a NOR gate.
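Steps 700 through 710 can be strung together in a short end-to-end sketch. It assumes one particular combination of the options listed above: NAND gates as the select logic elements, OR gates as the first and second logic elements, a NAND gate as the third logic element, and stored bits that are the complements of the weights. All function names are hypothetical.

```python
def nand(a, b):
    return 1 - (a & b)


def selection_circuit(read_select, xin):
    """Steps 704-706: one NAND select logic element per row (assumed gate type).

    A read select bit of 0 yields logic high (row deselected);
    a read select bit of 1 yields the inverted input XINB (row selected).
    """
    return tuple(nand(sel, xin) for sel in read_select)


def multiplication_circuit(rwl_out, first_bit, second_bit):
    """Steps 708-710: two OR gates feeding a NAND gate (assumed gate types)."""
    first_output = rwl_out[0] | first_bit
    second_output = rwl_out[1] | second_bit
    return nand(first_output, second_output)


def in_memory_multiply(weight_first, weight_second, read_select, xin):
    first_bit = 1 - weight_first     # step 700: store inverted weight (assumed polarity)
    second_bit = 1 - weight_second   # step 702
    rwl_out = selection_circuit(read_select, xin)
    return multiplication_circuit(rwl_out, first_bit, second_bit)


# Selecting the first row multiplies the input by the first weight, and so on.
for xin in (0, 1):
    for w_first in (0, 1):
        for w_second in (0, 1):
            assert in_memory_multiply(w_first, w_second, (1, 0), xin) == xin & w_first
            assert in_memory_multiply(w_first, w_second, (0, 1), xin) == xin & w_second
            assert in_memory_multiply(w_first, w_second, (0, 0), xin) == 0
```

With both rows deselected, both OR gates output logic high and the result is 0 regardless of the stored bits.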
Accordingly, the disclosed embodiments provide a CIM device that includes a read word line driver circuit and memory cells electrically connected to a multiplication circuit. The read word line driver circuit receives the input data and the read select signal and supplies a read word line signal to the multiplication circuit. The read word line signal selects one of the memory cells, and the multiplication circuit multiplies the input signal (e.g., inverted input signal XINB) with the data signal (e.g., inverted data signal DB) from the selected memory cell. This provides a multiplication result in which the data from the memory cell is multiplied with the input data. In some embodiments, the multiplication circuit provides multiplication for two rows of memory cells. In some embodiments, the multiplication circuit provides multiplication for three rows of memory cells. In some embodiments, the multiplication circuit provides multiplication for four rows of memory cells. In some embodiments, the data from the memory cell is a weight used in a neural network, such as a CNN.
The disclosed embodiments also include a read word line driver circuit and a 6T or 8T SRAM cell connected to the logic gates in the multiplication circuit. In some embodiments, logic gates in the read word line driver circuit include NAND gates and/or AND gates. In some embodiments, the logic gates in the multiplication circuit include OR gates and NAND gates, and/or AND gates and NOR gates. In other embodiments, the memory cells may be different memory cells, such as other data latches, flip-flops, and/or memory cells including flash memory, MRAM, RRAM, SRAM, and DRAM cells. In some embodiments, the memory cells comprise 1T1C memory cells.
Furthermore, in the disclosed embodiments, the number of transistors and read word lines used in the multiplication circuit is reduced as compared to previous read port configurations. In some embodiments, the number of transistors and read word lines used in the multiplication circuit is reduced to 8 transistors and 2 read word lines compared to 12 transistors and 5 read word lines in the previous read port configuration.
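The eight-transistor figure is consistent with implementing two OR gates and a NAND gate as a single static CMOS OR-AND-INVERT compound gate, which uses four NMOS and four PMOS transistors. The switch-level sketch below assumes that textbook topology; the actual wiring of the multiplication circuit is not reproduced in this section, so the network shapes and function names here are assumptions for illustration.

```python
from itertools import product


def pull_down_conducts(a, b, c, d):
    # Four NMOS transistors: (a parallel b) in series with (c parallel d).
    return bool((a or b) and (c or d))


def pull_up_conducts(a, b, c, d):
    # Four PMOS transistors (conducting on a low gate):
    # (a series b) in parallel with (c series d).
    return bool((not a and not b) or (not c and not d))


def compound_gate(a, b, c, d):
    """Output of the assumed OR-AND-INVERT gate; a, b feed one OR, c, d the other."""
    up = pull_up_conducts(a, b, c, d)
    down = pull_down_conducts(a, b, c, d)
    if up == down:
        raise ValueError("pull-up and pull-down networks must be complementary")
    return 1 if up else 0


NMOS_COUNT = 4
PMOS_COUNT = 4

# The compound gate matches NAND(OR(a, b), OR(c, d)) for every input combination.
for a, b, c, d in product((0, 1), repeat=4):
    assert compound_gate(a, b, c, d) == 1 - ((a | b) & (c | d))

print(NMOS_COUNT + PMOS_COUNT)
```

Here `a` and `b` stand for one read word line signal and its cell's data signal, and `c` and `d` for the other pair.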
Advantages of the disclosed embodiments include CIM cell and logic circuit arrangements that reduce the amount of space occupied on a chip, provide in-memory multiplication operations that improve performance, such as speed performance, and reduce energy/power requirements. Thus, power, performance, and area (PPA) are improved.
According to some embodiments, a device includes a first memory cell, a second memory cell, a first logic element, a second logic element, and a third logic element. The first memory cell is configured to store a first bit at a first node and the second memory cell is configured to store a second bit at a second node. The first logic element includes a first node input connected to the first node, the second logic element includes a second node input connected to the second node, and the third logic element includes a first input connected to the first output of the first logic element and a second input connected to the second output of the second logic element.
In some embodiments, the first memory cell includes a latch.
In some embodiments, the first memory cell comprises a capacitor.
In some embodiments, each of the first logic element and the second logic element is an OR gate, and the third logic element is a NAND gate.
In some embodiments, each of the first logic element and the second logic element is an AND gate, and the third logic element is a NOR gate.
In some embodiments, the first logic element, the second logic element, and the third logic element are comprised of eight metal-oxide semiconductor field effect transistors.
In some embodiments, the first logic element includes a first read word line input connected to a first read word line, and the second logic element includes a second read word line input connected to a second read word line.
In some embodiments, the device comprises: a third storage unit configured to store a third bit at a third node; and a fourth logic element comprising a third node input connected to the third node, wherein the third logic element comprises a third input connected to a third output of the fourth logic element.
In some embodiments, the first logic element, the second logic element, the third logic element, and the fourth logic element are comprised of twelve metal-oxide semiconductor field effect transistors.
In some embodiments, the device comprises: a third storage unit configured to store a third bit at a third node; a fourth storage unit configured to store a fourth bit at a fourth node; a fourth logic element comprising a third node input connected to the third node; a fifth logic element comprising a fourth node input connected to the fourth node; a sixth logic element comprising a third input connected to the third output of the fourth logic element and a fourth input connected to the fourth output of the fifth logic element; and a seventh logic element including a first logic input connected to the first logic output of the third logic element and a second logic input connected to the second logic output of the sixth logic element.
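The three-cell arrangement, in which the third logic element gains a third input, can be sketched by generalizing the two-row circuit to one OR gate per row feeding a single multi-input NAND gate. This sketch assumes OR and NAND gate types and a single compound-gate implementation with one NMOS and one PMOS transistor per gate input, which reproduces the stated counts of eight and twelve transistors. The two-level four-cell arrangement with sixth and seventh logic elements is not modeled, and the function names are hypothetical.

```python
def multiply_rows(rwlb, db):
    """One OR gate per row feeding one NAND gate with as many inputs as rows."""
    if len(rwlb) != len(db):
        raise ValueError("one read word line signal is needed per data signal")
    or_outputs = [r | d for r, d in zip(rwlb, db)]
    return 0 if all(or_outputs) else 1


def transistor_count(rows):
    # Assumption: two gate inputs per row, one NMOS and one PMOS per input.
    return 2 * 2 * rows


def select_and_multiply(xin, weights, selected):
    db = [1 - w for w in weights]   # cells provide inverted data
    rwlb = [1] * len(weights)       # logic high deselects every row
    rwlb[selected] = 1 - xin        # selected row receives XINB
    return multiply_rows(rwlb, db)


print(transistor_count(2), transistor_count(3))
print(select_and_multiply(1, [0, 1, 1], selected=2))
```

Only the selected row's OR gate can output logic low, so the NAND output still equals the input bit multiplied by the selected stored bit.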
According to a further embodiment, a device includes a selection circuit, a memory circuit, and a multiplication circuit. The selection circuit is configured to receive the read select signal and the input signal and provide a read word line output signal based on the read select signal and the input signal. The memory circuit includes a first memory cell configured to store a first bit at a first node and a second memory cell configured to store a second bit at a second node. The multiplication circuit is configured to receive the read word line output signal, the first bit, and the second bit and provide a multiplication result.
In some embodiments, the selection circuit comprises: a first select logic element configured to receive one of the read select signals and the input signal and provide one of the read word line output signals; and a second select logic element configured to receive the other of the read select signals and the input signal and provide the other of the read word line output signals.
In some embodiments, the multiplication circuit includes: a first logic element including a first node input connected to the first node; a second logic element including a second node input connected to the second node; and a third logic element including a first input connected to the first output of the first logic element and a second input connected to the second output of the second logic element.
In some embodiments, the first logic element, the second logic element, and the third logic element are comprised of eight metal-oxide semiconductor field effect transistors.
In some embodiments, the first memory cell includes a latch including six or more metal-oxide semiconductor field effect transistors.
According to a further disclosed aspect, a method of performing multiplication in an integrated circuit memory includes: storing a first bit at a first node in a first memory cell; storing a second bit at a second node in a second memory cell; receiving a read select signal and an input signal at a select circuit; outputting, by the select circuit, a read word line output signal based on the read select signal and the input signal; receiving the read word line output signal, the first bit, and the second bit at a multiplication circuit; and outputting, by the multiplication circuit, a multiplication result.
In some embodiments, receiving the read select signal and the input signal at the select circuit comprises: receiving one of the read select signals and the input signal at a first select logic element; and receiving at a second select logic element the input signal and another of the read select signals.
In some embodiments, outputting, by the selection circuit, the read word line output signal based on the read selection signal and the input signal comprises: outputting one of the read word line output signals through the first select logic element; and outputting another one of the read word line output signals through the second select logic element.
In some embodiments, receiving the read word line output signal, the first bit, and the second bit at the multiplication circuit comprises: receiving one of the read wordline output signals and the first bit at a first logic element; and receiving at a second logic element another one of the read word line output signals and the second bit.
In some embodiments, the method comprises: receiving, at a third logic element, a first output based on the one of the read word line output signals from the first logic element and the first bit; receiving a second output at the third logic element based on the other of the read word line output signals from the second logic element and the second bit; and outputting the multiplication result based on the first output and the second output from the third logic element.
The present disclosure outlines features of several embodiments so that those skilled in the art may better understand the aspects of the disclosure. Those skilled in the art should appreciate that they may readily use the present disclosure as a basis for designing or modifying other processes and structures for carrying out the same purposes and/or achieving the same advantages of the embodiments introduced herein. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the present disclosure, and that they may make various changes, substitutions, and alterations herein without departing from the spirit and scope of the present disclosure.

Claims (10)

1. An integrated circuit device, comprising:
a first storage unit configured to store a first bit at a first node;
a second storage unit configured to store a second bit at a second node;
a first logic element including a first node input connected to the first node;
a second logic element including a second node input connected to the second node; and
a third logic element comprising a first input connected to the first output of the first logic element and a second input connected to the second output of the second logic element.
2. The device of claim 1, wherein the first storage unit comprises a latch.
3. The device of claim 1, wherein the first storage unit comprises a capacitor.
4. The device of claim 1, wherein each of the first and second logic elements is an OR gate and the third logic element is a NAND gate.
5. The device of claim 1, wherein each of the first and second logic elements is an AND gate and the third logic element is a NOR gate.
6. An integrated circuit device, comprising:
a selection circuit configured to receive a read select signal and an input signal and provide a read word line output signal based on the read select signal and the input signal;
a memory circuit, comprising:
a first storage unit configured to store a first bit at a first node; and
a second storage unit configured to store a second bit at a second node; and
a multiplication circuit configured to receive the read word line output signal, the first bit, and the second bit and provide a multiplication result.
7. The device of claim 6, wherein the selection circuit comprises:
a first select logic element configured to receive one of the read select signals and the input signal and provide one of the read word line output signals; and
a second select logic element configured to receive the other of the read select signals and the input signal and provide the other of the read word line output signals.
8. The device of claim 6, wherein the multiplication circuit comprises:
a first logic element including a first node input connected to the first node;
a second logic element including a second node input connected to the second node; and
a third logic element comprising a first input connected to the first output of the first logic element and a second input connected to the second output of the second logic element.
9. A method of performing multiplication in an integrated circuit memory, comprising:
storing a first bit at a first node of a first memory cell;
storing a second bit at a second node of a second memory cell;
receiving a read select signal and an input signal at a select circuit;
outputting, by the selection circuit, a read word line output signal based on the read selection signal and the input signal;
receiving the read word line output signal, the first bit, and the second bit at a multiplication circuit; and
outputting a multiplication result through the multiplication circuit.
10. The method of claim 9, wherein receiving the read select signal and the input signal at the select circuit comprises:
receiving one of the read select signals and the input signal at a first select logic element; and
receiving the other of the read select signals and the input signal at a second select logic element.
CN202310200519.3A 2022-04-04 2023-03-03 Integrated circuit device and method for multiplying in memory therein Pending CN116521126A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US63/327,164 2022-04-04
US17/855,089 2022-06-30
US17/855,089 US20230315389A1 (en) 2022-04-04 2022-06-30 Compute-in-memory cell

Publications (1)

Publication Number Publication Date
CN116521126A true CN116521126A (en) 2023-08-01

Family

ID=87401843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310200519.3A Pending CN116521126A (en) 2022-04-04 2023-03-03 Integrated circuit device and method for multiplying in memory therein

Country Status (1)

Country Link
CN (1) CN116521126A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination