WO2025085569A1 - Programmable multiplication circuit and systems - Google Patents
Programmable multiplication circuit and systems
- Publication number
- WO2025085569A1 PCT/US2024/051661
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- coupled
- circuit
- multiplication
- integrated circuit
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
- G06F2207/48—Indexing scheme relating to groups G06F7/48 - G06F7/575
- G06F2207/4802—Special implementations
- G06F2207/4814—Non-logic devices, e.g. operational amplifiers
Definitions
- Machine learning processing typically includes processing a multi-dimensional input vector through several computational layers, with each computational layer multiplying neuron values by respective connection weights.
- the connection weights are pre-programmed with values during training that appropriately transform training inputs into desired outputs.
- multiplication at each layer of a machine learning algorithm is designed to cumulatively reach a classification or regression result from the input that is consistent with the training the algorithm received.
- CPUs central processing units
- GPUs graphics processing units
- CPUs are designed to process information serially and are less efficient for highly parallel computation, such as machine learning processing, which typically involves parallel processing of large, multidimensional inputs.
- While GPUs are designed for highly parallel computation, they are expensive, consume a large amount of power, and are not optimized for the kind of processing machine learning often entails.
- Alternative solutions to general-purpose CPUs and GPUs are therefore desirable to improve speed, power efficiency, and associated costs of machine learning processing.
- the present disclosure describes circuits and systems for performing multiplication that, in accordance with certain embodiments, are programmable, fast, reliable, and highly scalable.
- Some embodiments of the present disclosure provide integrated circuit structures with parallel-coupled switchable impedances that are programmable to perform multiplication.
- Some embodiments of the present disclosure provide scalar multiplication circuits that are controllable using selectable paths through the circuit. Embodiments of the present disclosure may be produced at high scale, efficiently, and reliably by leveraging existing integrated circuit and memory technologies.
- the integrated circuit comprises an input terminal, a first conductive portion that is coupled to the input terminal, an output terminal, a second conductive portion that is coupled to the output terminal, and a programmable multiplication circuit comprising a plurality of impedances coupled in parallel between the first conductive portion and the second conductive portion and having fixed impedance values, and a plurality of switches coupled in series with respective ones of the plurality of impedances between the first conductive portion and the second conductive portion.
- the impedances comprise resistors having fixed resistance values.
- the integrated circuit comprises a plurality of switch-control conductive portions that are electrically isolated from one another, and the plurality of switches comprises a plurality of transistors each having a first channel terminal coupled to the first conductive portion, a second channel terminal coupled to the second conductive portion via the programmable multiplication circuit, and a gate terminal coupled to a respective one of the switch-control conductive portions.
- the integrated circuit further comprises a third conductive portion coupled to the input terminal or to a second input terminal, a fourth conductive portion coupled to the output terminal or to a second output terminal, and a second programmable impedance circuit, comprising a second plurality of impedances coupled in parallel between the third conductive portion and the fourth conductive portion and having fixed impedance values, and a second plurality of switches coupled in series with respective ones of the second plurality of impedances between the third conductive portion and the fourth conductive portion.
- the fourth conductive portion is coupled to the second output terminal.
- the third conductive portion is coupled to the second input terminal.
- the first and second conductive portions are disposed on different respective layers of the integrated circuit.
- the integrated circuit further comprises a memory comprising a plurality of memory cells coupled to respective switches of the plurality of switches.
- the plurality of impedances and the plurality of switches comprise a plurality of impedance-switch pairs, with ones of the plurality of memory cells co-located with respective ones of the plurality of impedance-switch pairs.
- the memory is configured as random-access memory (RAM).
- the plurality of impedances and the plurality of switches are arranged in an array of unit cell circuits, each unit cell circuit of at least a subarray of the array of unit cell circuits comprises a respective one of the plurality of impedances coupled in series with a respective one of the plurality of switches, and the first and second conductive portions at least partially overlap, in a direction normal to the array, with at least one of the unit cell circuits of the subarray.
- the multiplication circuit comprises an input terminal configured to receive an input signal, an output terminal configured to provide an output signal, and a plurality of selectable paths coupled in parallel between the input terminal and the output terminal, the plurality of selectable paths configured to generate the output signal as a multiplication of the input signal by a scalar value, wherein the scalar value is controllable by selecting a subset of the plurality of selectable paths.
- each of the plurality of selectable paths is configured to, when selected, propagate at least a portion of the input signal from the input terminal to the output terminal.
- each of the plurality of selectable paths comprises a switch coupled in series with a fixed impedance, the switch being controllable to select the respective selectable path.
- each switch is configured to receive a respective select signal from a memory.
- a system comprises the multiplication circuit and a memory, and the multiplication circuit is configured to select the subset of the plurality of selectable paths based on values stored in the memory.
- the system further comprises a processor coupled to the multiplication circuit and configured to multiply a neuron value by a connection weight at least in part by providing, to the multiplication circuit, the input signal as a version of the neuron value, and receiving, via the multiplication circuit, a version of the output signal, and the scalar value is based on the connection weight.
- the processor is further configured to set the values stored in the memory based on the connection weight.
- the system further comprises an integrated circuit having the memory and the multiplication circuit included therein.
- each of the plurality of selectable paths is co-located with a respective memory cell of the memory.
- the memory is configured as random-access memory (RAM).
- FIG. 1A is a block diagram of an example system including an integrated circuit configured for multiplication, according to some embodiments
- FIG. 1B is a block diagram of an alternative example system including an integrated circuit configured for multiplication that further includes a co-located memory, according to some embodiments;
- FIG. 2 is a circuit diagram of an example multiplication circuit that may be included in the integrated circuit of FIG. 1A, according to some embodiments;
- FIG. 3 is a circuit diagram of an example array of multiplication circuits that may be included in the integrated circuit of FIG. 1A, according to some embodiments;
- FIG. 4 is a top view of a portion of an example integrated circuit including an array of multiplication circuits that may be included in the system of FIG. 1A, according to some embodiments;
- FIG. 5 is a perspective view of a portion of an example integrated circuit including an array of multiplication circuits that may be included in the system of FIG. 1A, according to some embodiments;
- FIG. 6A is a top view of a portion of an example integrated circuit including an array of resistor-based multiplication circuits that may be included in the system of FIG. 1A, according to some embodiments;
- FIG. 6B is a top view of the portion of the example integrated circuit of FIG. 6A with the first and second conductive portions hidden from view, according to some embodiments;
- FIG. 7A is a perspective view of a resistor that may be included in the resistor-based multiplication circuits of FIGs. 6A and 6B, according to some embodiments;
- FIG. 7B is a top view of the conductive trace and electrodes of the resistor of FIG. 7A, according to some embodiments.
- FIG. 7C is a top view of an alternative example of a resistor-switch configuration that may be included in the integrated circuit of FIGs. 6A-6B, according to some embodiments;
- FIG. 8 is a block diagram of an example unit cell circuit with a co-located selectable path and memory cell that may be included in an array of multiplication circuits within the system of FIG. 1B, according to some embodiments;
- FIG. 9 is a block diagram of an example integrated circuit with an array of unit cell circuits of FIG. 8, which may be included within the system of FIG. 1B, according to some embodiments.
- FIG. 10 is a block diagram of the integrated circuit of FIG. 9 illustrating multiple example configurations of the unit cell circuits, according to some embodiments.
- the present disclosure describes circuits and systems for performing multiplication that are fast, reliable, and highly scalable. Some embodiments of the present disclosure provide integrated circuit structures with parallel-coupled switchable impedances configured to perform multiplication. Some embodiments of the present disclosure provide scalar multiplication circuits that are controllable using selectable paths through the circuit. Embodiments of the present disclosure may be produced at high scale both efficiently and reliably by leveraging existing integrated circuit and memory technologies.
- Memristors are circuit elements that may be programmed with a particular input-to-output transfer function, which may be used to perform a multiplication.
- a large array of memristors may be programmed to apply respective scalar multiplication to a large number of input signals and thereby perform massively parallel multiplication at high speed.
- Memristors are particularly attractive for this purpose because they may be easily reprogrammed on-the-fly (e.g., without any changes to their fabricated structure) to obtain different transfer functions, permitting the array to be trained or retrained.
- changing the transfer functions of the memristors may update the connection weights used for memristive neuron multiplication.
- the same memristive structure could be reused repeatedly and retrained and/or adapted for completely different machine learning tasks.
- memristive structures do not presently achieve reliable and accurate processing. For instance, while memristive structures may be advantageously reprogrammed with different transfer functions, memristive structures may not hold their programmed transfer functions with the degree of accuracy needed for consistent and low error computation. Thus, memristive technologies may require a tradeoff between performance over time and re-programmability.
- circuitry with selectable impedance paths may achieve better performance than memristive structures with a similar programmability advantage.
- a multiplication may be achieved by processing an input signal through one or more selectable impedance paths to obtain an output representing a scalar multiple of the input.
- the scalar multiple achieved using the circuit may be programmed by selecting a particular subset of available impedance paths to couple in parallel.
- multiplication structures may be implemented using existing integrated circuit technologies and reprogrammable scalar multiple configurations may be implemented using existing memory technologies, making the resulting circuits efficient and reliable to produce at scale. Such circuits may be especially useful when configured as an array for implementing a machine learning inference machine programmed (and reprogrammable) to perform multiplication on a large number of inputs.
- FIG. 1A is a block diagram of an example system 100a including an integrated circuit (IC) 102a configured for multiplication, according to some embodiments.
- IC integrated circuit
- system 100a may be configured for multiplication, such as may be used to perform a machine learning inference.
- system 100a includes a processor 110, an analog/digital (A/D) interface 130, an integrated circuit 102a having one or more multiplication (Mult.) circuits 120, and a memory 140.
- A/D analog/digital
- Mult. multiplication circuits
- processor 110 may be configured to provide an input signal to the multiplication circuit(s) 120 that is indicative of a neuron value N that is to be multiplied by a connection weight C to produce an output signal O indicative of the multiplication result N × C (e.g., either before or after the activation function is applied, depending on the embodiment).
- the memory 140 may be programmed with the connection weight C (e.g., at the time of execution and/or beforehand) so as to generate path select signals that control the multiplication circuit(s) 120 to apply the connection weight C.
- an artificial intelligence (AI) inference operation may include a multidimensional vector of neuron values N that are to be multiplied by a multidimensional vector of connection weights C programmed in the memory 140 to obtain a multidimensional vector output O when each vector component is processed by a respective multiplication circuit 120.
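The component-wise multiplication described above can be illustrated numerically. The following is a minimal Python sketch that assumes idealized multiplication circuits, one per vector component; the function name `multiply_vector` and the sample values are hypothetical and do not come from the disclosure.

```python
def multiply_vector(neuron_values, connection_weights):
    """Return O where O[i] = N[i] * C[i].

    Each product stands in for one multiplication circuit 120 applying
    its programmed connection weight to one neuron value.
    """
    if len(neuron_values) != len(connection_weights):
        raise ValueError("N and C must have the same number of components")
    return [n * c for n, c in zip(neuron_values, connection_weights)]


# Hypothetical sample data: three neuron values and three programmed weights.
N = [1.0, 1.5, 2.0]
C = [2.0, 1.0, 0.5]
O = multiply_vector(N, C)
print(O)  # [2.0, 1.5, 1.0]
```

In hardware, all components would be processed at once, one per circuit; the list comprehension only models the arithmetic, not the parallelism.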
- the system 100a may be further configured to apply an activation (and/or deactivation) function to the multiplication result.
- Typical activation functions include sigmoid ($\frac{1}{1+e^{-x}}$), hyperbolic tangent ($\tanh(x)$), rectified linear unit (ReLU) ($\max(0, x)$), Leaky ReLU ($\max(0.1x, x)$), Maxout ($\max(w_1^T x + b_1, w_2^T x + b_2)$), and exponential linear unit (ELU) ($x$ for $x > 0$; $\alpha(e^{x} - 1)$ for $x \le 0$).
- a deactivation function is typically the derivative of the activation function.
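Several of the activation functions listed above, and one derivative of the kind the text calls a deactivation function, can be written out directly. This is a minimal software sketch for scalar inputs only. The function names, the default `slope` of 0.1, and the default `alpha` of 1.0 are illustrative assumptions, and the sketch says nothing about how an analog activation circuit would be built.

```python
import math


def sigmoid(x):
    """Sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def leaky_relu(x, slope=0.1):
    """Leaky ReLU: max(slope * x, x), with an assumed slope of 0.1."""
    return max(slope * x, x)


def elu(x, alpha=1.0):
    """Exponential linear unit; alpha is an assumed hyperparameter."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def sigmoid_derivative(x):
    """Derivative of the sigmoid, i.e. its 'deactivation' counterpart."""
    s = sigmoid(x)
    return s * (1.0 - s)


print(sigmoid(0.0), math.tanh(0.0), relu(-1.0), sigmoid_derivative(0.0))
```

The hyperbolic tangent is available directly as `math.tanh`. Maxout is omitted because it operates on weight vectors rather than a single scalar.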
- an activation and/or deactivation function may be applied by a component on the integrated circuit 102a, such as an activation and/or deactivation function circuit (e.g., analog circuitry providing and/or approximating the response of an activation and/or deactivation function).
- an analog activation and/or deactivation circuit may be included for each multiplication circuit, such as co-located with the multiplication circuit and/or elsewhere in the integrated circuit 102a.
- an activation and/or deactivation function may be applied by a digital processing circuit, such as by digital processing components (e.g., logic gates of an FPGA and/or ASIC) within or separate from the integrated circuit 102a, and/or by the processor 110.
- system 100a may be formed using a number of integrated circuits on one or more circuit boards, such as a computer motherboard. In other embodiments, at least some components may be in communication with one another over a communication network, such as in a distributed cloud computing environment.
- an integrated circuit may include one or more semiconductor dies having circuitry fabricated thereon and encapsulated in a package. Multiple dies within an integrated circuit package may be stacked and/or wire bonded together within the package.
- processor 110 may be a general-purpose central processing unit (CPU).
- the processor 110 may be included within a general-purpose computer in which the integrated circuit 102a has been incorporated, such as by mounting into a slot on the motherboard in a modular configuration and/or through permanent affixation to the circuit board as part of its manufacturing. While a single processor 110 is shown in FIG. 1A, it should be appreciated that multiple processors 110 may be used, whether in the same local computer system or in a distributed cloud computing environment. Further, while processor 110 is shown separate from integrated circuit 102a in FIG.
- a single integrated circuit may have both the processor 110 and multiplication circuit(s) 120 therein, such as on a single semiconductor die and/or on multiple dies stacked and/or wire bonded together within a single package.
- while the processor 110 may be implemented as or include a CPU, in other embodiments the processor 110 may be implemented as or include a graphics processing unit (GPU), a field programmable gate array (FPGA), and/or an application-specific integrated circuit (ASIC), as embodiments described herein are not so limited.
- GPU graphics processing unit
- FPGA field programmable gate array
- ASIC application-specific integrated circuit
- the A/D interface 130 may be configured to facilitate bidirectional communication between the processor 110 and analog circuitry (e.g., the multiplication circuit(s) 120) of the integrated circuit 102a.
- analog circuitry e.g., the multiplication circuit(s) 120
- the A/D interface 130 includes a digital-to-analog converter (DAC) 132 and an analog-to-digital converter (ADC) 134 coupled between the processor 110 and the multiplication circuit(s) 120 of the integrated circuit 102a.
- DAC digital-to-analog converter
- ADC analog-to-digital converter
- the input signal from the processor 110 may have a version (e.g., digital value representative) of the neuron value N, which the A/D interface 130 may convert to a version (e.g., analog value, such as voltage and/or current amplitude, representative) of the neuron value N.
- the output signal O may have a version (e.g., analog value representative) of the multiplication result, which the A/D interface may convert to a version (e.g., digital value representative) of the multiplication result.
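The two conversions performed by the A/D interface can be modeled as ideal, uniform quantizers. The sketch below is an assumption-laden illustration: the 8-bit resolution, the 0 V to 1 V range, and the function names `dac` and `adc` are hypothetical, and the source does not specify any converter architecture.

```python
def dac(code, n_bits=8, v_min=0.0, v_max=1.0):
    """Ideal DAC: map an integer code to an analog level in [v_min, v_max]."""
    levels = (1 << n_bits) - 1
    if not 0 <= code <= levels:
        raise ValueError("code out of range for the given resolution")
    return v_min + (v_max - v_min) * code / levels


def adc(value, n_bits=8, v_min=0.0, v_max=1.0):
    """Ideal ADC: clamp an analog level to the range and round to a code."""
    levels = (1 << n_bits) - 1
    clamped = min(max(value, v_min), v_max)
    return round((clamped - v_min) / (v_max - v_min) * levels)


# Round trip: a digital neuron value goes out as an analog level and back.
analog = dac(128)
print(analog, adc(analog))
```

With these ideal models, every code survives a round trip unchanged; a real converter pair would add noise, offset, and nonlinearity.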
- some embodiments may omit one or each of DAC 132 and ADC 134 and/or all of the A/D interface 130, such as where communication between the processor 110 and multiplication circuit(s) 120 is unidirectional, where the multiplication circuit(s) 120 are configured to process digital signals without an intervening DAC, where the multiplication circuit(s) 120 are configured to output digital signals without an intervening ADC, and/or where the processor 110 has at least some analog processing components for interfacing with the integrated circuit 102a.
- the multiplication circuit(s) 120 may be configured to multiply values represented in input signals received from processor 110 to provide output signals to processor 110 representing the result of the multiplication.
- the multiplication circuit(s) 120 may each include a plurality of selectable paths controllable to set the value by which the multiplication circuit(s) multiply values represented in input signals received from the processor 110. For instance, a subset of the paths may be selected using path select signals received from the memory 140.
- the multiplication circuit(s) 120 may include parallel-coupled impedances having fixed impedance values and switches (e.g., transistors) coupled in series with respective impedances, such that states of the switches may be controlled to set the desired overall impedance of a multiplication circuit 120 to obtain a desired value by which to multiply a value represented in an input signal from the processor 110.
- the memory 140 may be configured to provide path select values to multiplication circuit(s) 120 for controlling multiplication of values represented in input signals from the processor 110.
- the memory 140 may be configured to store at least one path select value for each multiplication circuit 120 of the integrated circuit 102a.
- each path select value may include a bit for each selectable path and/or switch of the respective multiplication circuit 120.
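A path select value with one bit per selectable path can be pictured as a small bit field. The following sketch assumes a hypothetical bit ordering, with path index 0 in the least significant bit, and hypothetical helper names; the source does not define an encoding.

```python
def pack_path_select(selected):
    """Pack per-path booleans into an integer, path 0 in the lowest bit."""
    value = 0
    for index, is_selected in enumerate(selected):
        if is_selected:
            value |= 1 << index
    return value


def unpack_path_select(value, n_paths):
    """Recover the per-path booleans from a packed path select value."""
    return [bool((value >> index) & 1) for index in range(n_paths)]


# Select the first and third of three paths.
word = pack_path_select([True, False, True])
print(word, unpack_path_select(word, 3))  # 5 [True, False, True]
```

Storing one such value per multiplication circuit matches the description of memory 140 holding at least one path select value for each circuit.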
- the memory 140 may include random-access memory (RAM), such as static random-access memory (SRAM) and/or dynamic random-access memory (DRAM), though other forms of memory such as resistive RAM and/or flash memory may be used, depending on the preferred and/or available memory for the application.
- RAM random-access memory
- SRAM static random-access memory
- DRAM dynamic random-access memory
- RAM may be preferred for some applications to permit re-programming of path select values to change values by which the multiplication circuit(s) 120 are configured to multiply input values, which may be useful for re-training in the case of some AI inference machine implementations.
- permanent memory may be preferred for other applications, such as where path select values and resulting multiplication are predetermined and not to be changed (e.g., for a pre-trained permanently configured AI inference machine).
- the memory 140 may be at least a portion of a shared computer system memory where at least a portion is in communication with the integrated circuit 102a. In other embodiments, the memory 140 may be partially or entirely dedicated to storing configuration information for multiplication circuit 120. To set the path select values, in some embodiments, the memory 140 may be configured to receive the memory write and enable values from the processor 110 that control the path select values and/or individual bits within a path select value stored in the memory 140.
- the integrated circuit 102a may be configured to perform activation between neural network layers implemented by multiplication circuits 120 of the integrated circuit 102a.
- an analog input may be provided and an analog output obtained from the integrated circuit 102a, and/or a digital input may be provided and a digital output obtained from the integrated circuit 102a, facilitating omission of the A/D interface 130.
- outputs from one group (e.g., layer) of multiplication circuits 120 may be provided (e.g., after activation) as inputs to another group (e.g., layer) of multiplication circuits 120, such as may be used to implement some or all neural network processing within the integrated circuit 102a.
- FIG. 1B is a block diagram of an alternative example system 100b including an integrated circuit 102b configured for multiplication that further includes a co-located memory 140, according to some embodiments.
- system 100b may be configured as described herein for the system 100a, such as including a processor 110, A/D interface 130, multiplication circuit(s) 120, and memory 140.
- system 100b in FIG. 1B is shown including multiplication circuit(s) 120 and memory 140 co-located in the integrated circuit 102b.
- memory cells of the memory 140 may be co-located with ones of the multiplication circuit(s) 120.
- each multiplication circuit 120 may have co-located therewith a memory cell of the memory 140 storing a path select value for controlling the multiplication circuit 120.
- the co-located memory cells may be bitcells (e.g., holding a single bit) and/or may be groups of bitcells (e.g., holding multiple bits).
- the memory 140 may be configured as described in connection with FIG. 1A, with path select values stored in the co-located bitcell(s) programmed by processor 110 and provided to components of the multiplication circuit(s) 120.
- the multiplication circuit(s) 120 and memory 140 within the integrated circuit 102b may be formed on the same semiconductor die or on separate dies stacked and/or wire bonded and packaged together.
- the inventors have recognized that, regardless of whether memory cells of the memory 140 are co-located with respective ones of the multiplication circuit(s) 120, packaging the multiplication circuit(s) 120 and memory 140 may improve system efficiency in terms of device footprint and/or the number of interfaces through which the processor 110 is coupled.
- a single bus may be used to communicate between the processor 110 and both the multiplication circuit(s) 120 and memory 140.
- FIG. 2 is a circuit diagram of an example multiplication circuit 200 that may be included in the integrated circuit 102a of FIG. 1A, according to some embodiments.
- the multiplication circuit 200 may be configured to receive an input signal 202 representing a first value and produce an output signal 204 representative of a second value that is a scalar multiplication of the first value.
- one of the input signal 202 and the output signal 204 may be a voltage signal and the other of the input signal 202 and the output signal 204 may be a current signal.
- an input signal 202 may be a voltage signal having a voltage of 1 volt (V) and an output signal 204 may be a current signal having a current of 1 milliampere (mA).
- the multiplication circuit 200 may have applied an impedance of 1 kilohm (kΩ) to the input signal 202 to produce the output signal 204.
- the range of input voltages of the input signal 202 may be from 1 V to 2 V, representing a range of neuron values from 1 to 2
- the range of output currents of the output signal 204 may be from 0.5 mA to 2 mA, representing a range of multiplication results from 1 to 4.
- the input signal 202 may represent a neuron value of 1 and the output signal 204 may represent a multiplication result of 2, indicating that the 1 kΩ impedance of the multiplication circuit 200 applied a connection weight multiplier of scalar value 2 to the input signal 202 to produce the output signal 204. While this example uses only nonzero voltages and currents as indicating values, other examples may use a zero voltage and/or zero current as indicating a value.
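The arithmetic of this example can be checked with Ohm's law. The sketch below assumes linear signal encodings consistent with the stated ranges (1 V per unit of neuron value, 0.5 mA per unit of multiplication result); those scale factors and all function names are assumptions drawn from the example's endpoints rather than stated definitions.

```python
VOLTS_PER_NEURON_UNIT = 1.0  # assumed: 1 V to 2 V encodes neuron values 1 to 2
MA_PER_RESULT_UNIT = 0.5     # assumed: 0.5 mA to 2 mA encodes results 1 to 4


def output_current_ma(v_in_volts, impedance_kohm):
    """Ohm's law: volts divided by kilohms gives milliamperes."""
    return v_in_volts / impedance_kohm


def represented_neuron(v_in_volts):
    """Neuron value encoded by an input voltage (assumed linear encoding)."""
    return v_in_volts / VOLTS_PER_NEURON_UNIT


def represented_result(i_out_ma):
    """Multiplication result encoded by an output current (assumed linear)."""
    return i_out_ma / MA_PER_RESULT_UNIT


def effective_weight(impedance_kohm, v_in_volts=1.0):
    """Connection weight implied by a given overall impedance."""
    i_out = output_current_ma(v_in_volts, impedance_kohm)
    return represented_result(i_out) / represented_neuron(v_in_volts)


print(output_current_ma(1.0, 1.0))  # 1.0 (mA)
print(effective_weight(1.0))        # 2.0
```

Under these assumed encodings, the weight scales inversely with the overall impedance, so halving the impedance doubles the applied connection weight.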
- the multiplication circuit 200 may be controllable to set an impedance between the input and the output of the circuit 200.
- multiplication circuit 200 includes a plurality of selectable paths 206 coupled in parallel between the input and the output.
- each path 206 includes a switch 220.
- each switch 220 may be controllable to select its path 206 to be included in an overall parallel path from the input to the output of the circuit 200, with at least a portion of the input signal 202 propagating through each selected path 206 within the parallel path.
- each path 206 includes an impedance 210 coupled in series with the switch 220. Also shown in the example of FIG.
- each path 206 includes an impedance-switch pair that includes an impedance 210 coupled in series with the switch 220.
- by controlling the switch(es) 220 of one or more paths 206 to include the respective impedance(s) 210 in parallel between the input and the output, the overall impedance of the overall path from the input to the output may be controlled.
- impedances 210 with fixed impedance values may be used in the multiplication circuit 200 while achieving control over scalar multiplication to be obtained using the circuit.
- the three paths 206 shown in FIG. 2 may have impedances 210 of 1 kΩ, 2 kΩ, and 4 kΩ.
- impedances 210 of 1 kΩ, 2 kΩ, and 4 kΩ.
- While this example selects one path 206 to apply an impedance to an input to achieve a desired output, other examples may select multiple paths to provide an overall (e.g., parallel) impedance, as this may provide more resolution (e.g., possible discrete impedance values) in setting the overall impedance.
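The gain in resolution from selecting multiple paths can be seen by enumerating every non-empty subset of the three example impedances and computing the resulting parallel impedance. This is a minimal sketch assuming ideal switches (zero on-resistance, infinite off-resistance) and purely resistive paths; the function names are hypothetical.

```python
from itertools import combinations


def parallel_impedance(impedances):
    """Overall impedance of the given impedances connected in parallel."""
    if not impedances:
        return float("inf")  # no path selected: open circuit
    return 1.0 / sum(1.0 / z for z in impedances)


def achievable_impedances(path_impedances):
    """Map each non-empty subset of paths to its overall parallel impedance."""
    results = {}
    for size in range(1, len(path_impedances) + 1):
        for subset in combinations(path_impedances, size):
            results[subset] = parallel_impedance(subset)
    return results


# Example values from the text, in kilohms.
for subset, z in achievable_impedances((1.0, 2.0, 4.0)).items():
    print(subset, round(z, 4))
```

Three paths yield seven distinct non-empty selections, compared with three when only one path may be selected at a time.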
- multiplication circuit 200 may be repeatedly programmable to select different subsets of paths 206 to obtain appropriate scalar multiple values (e.g., connection weights). For example, where fixed impedances 210 are used, scalar multiple values may be programmed in the circuit 200 by adjusting path select signals applied to the switches 220 to change which paths 206 are selected for inclusion in the overall path from the input to the output. It should be appreciated, however, that some applications may not take advantage of the reprogrammable nature of such a circuit 200, such as by fixedly coupling the circuit 200 to read-only memory (ROM) that provides the path select signals.
- ROM read-only memory
- an impedance and switch may be in series when substantially all current flowing in one of the impedance and switch flows through the other of the impedance and switch (e.g., though some insignificant amount of current may flow elsewhere due to leakage).
- a plurality of paths e.g., each including an impedance in series with a switch
- a circuit 200 may be configured with sub-paths within some or all paths 206, each sub-path including a switch and an impedance. For instance, such a configuration may provide further ways of fine-tuning the desired overall impedance from the input to the output of the circuit 200.
- an activation function may be applied to output signals, such as by analog processing components of or coupled to the multiplication circuit 200 (e.g., within the same integrated circuit), and/or by digital processing components (e.g., processor 110) coupled to the multiplication circuit 200.
- FIG. 3 is a circuit diagram of an example array 300 of multiplication circuits 306a, 306b, 306c, 306d that may be included in the integrated circuit 102a of FIG. 1A, according to some embodiments.
- each multiplication circuit 306a, 306b, 306c, and 306d may be configured in the manner described herein for circuit 200.
- each multiplication circuit 306a, 306b, 306c, and 306d includes a plurality of paths, each path including a switch 320 and an impedance 310.
- multiplication circuits of the array 300 may be configured to perform multiply-accumulate (MAC) operations.
- multiplication circuits of the array 300 may have inputs and/or outputs coupled together.
- the inputs of multiplication circuits 306a and 306b are coupled together to receive an input signal 302a and the inputs of multiplication circuits 306c and 306d are coupled together to receive an input signal 302b.
- the outputs of multiplication circuits 306a and 306c are coupled together to produce an output signal 304a and the outputs of multiplication circuits 306b and 306d are coupled together to produce an output signal 304b.
- the output signal 304a may represent a sum of a first product obtained from the output of the multiplication circuit 306a together with a second product obtained from the output of the multiplication circuit 306c.
- the output signals 304a and 304b may each represent a respective sum given by:
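One way to write these sums, consistent with the FIG. 3 wiring described above, is sketched below. The symbols are assumptions introduced for illustration: $V_{302a}$ and $V_{302b}$ are taken to be input voltages, $I_{304a}$ and $I_{304b}$ output currents, and $G_{306x}$ the conductance currently selected in multiplication circuit 306x, with each output node held at a fixed reference potential.

```latex
\begin{aligned}
I_{304a} &= G_{306a}\,V_{302a} + G_{306c}\,V_{302b},\\
I_{304b} &= G_{306b}\,V_{302a} + G_{306d}\,V_{302b}.
\end{aligned}
```

Each term is a product of an input and a programmed scalar, and each output accumulates the products from the two circuits whose outputs are coupled together.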
- switches 320 of the multiplication circuits 306a, 306b, 306c, and 306d may be configured to receive path select signals that select a subset of one or more paths through the circuit.
- multiplication circuit 306a is shown configured to receive path select signals 1X, 1Y, and 1Z
- multiplication circuit 306b is shown configured to receive path select signals 2X, 2Y, and 2Z
- multiplication circuit 306c is shown configured to receive path select signals 3X, 3Y, and 3Z
- multiplication circuit 306d is shown configured to receive path select signals 4X, 4Y, and 4Z for controlling respective switches 320.
- the path select signals may be provided by a memory, such as with each path select signal (e.g., 1X) being stored in a memory cell of the memory.
- impedances 310 within a multiplication circuit may be weighted with respect to one another.
- FIG. 3 shows a binary-weighted configuration in which the three impedances of the circuit 306a are doubled with respect to one another.
- a binary-weighted configuration of impedances of 1 kΩ, 2 kΩ, and 4 kΩ may be used to achieve the above-described multiplication by selecting the 1 kΩ path to obtain an overall impedance of 1 kΩ.
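As a worked illustration of the binary weighting, assume the X, Y, and Z paths carry the 1 kΩ, 2 kΩ, and 4 kΩ impedances respectively (this mapping is an assumption) and let $b_X, b_Y, b_Z \in \{0, 1\}$ be the path select bits. With ideal switches, the selected paths combine in parallel, so the overall conductance is:

```latex
G = \frac{b_X}{1\,\mathrm{k\Omega}} + \frac{b_Y}{2\,\mathrm{k\Omega}} + \frac{b_Z}{4\,\mathrm{k\Omega}}
  = \left(4\,b_X + 2\,b_Y + b_Z\right) \times 0.25\,\mathrm{mS}
```

Under these assumptions the three bits act as a 3-bit binary code that steps the conductance from 0 to 1.75 mS in uniform 0.25 mS increments. Selecting only the 1 kΩ path ($b_X = 1$, $b_Y = b_Z = 0$) gives 1 mS, that is, an overall impedance of 1 kΩ.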
- although multiplication circuits 306a, 306b, 306c, and 306d are shown in the array 300 of FIG. 3 by way of illustration, any number of multiplication circuits may be included in an array.
- although all four multiplication circuits 306a, 306b, 306c, and 306d are shown in FIG. 3 interconnected either at the input or output, it should be appreciated that in some implementations, multiplication circuits within an array may be interconnected to some other multiplication circuits in the array and separate from other multiplication circuits in the array.
- the four multiplication circuits shown in FIG. 3 may comprise a subarray within a larger array where each subarray is programmed to perform a MAC operation. Further alternatively or additionally, operations other than MAC operations may be performed using multiplication circuits described herein, such as multiply-add (MAD) operations and/or solely multiplication operations.
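The multiply-accumulate behavior of the array 300 can be sketched in a few lines. This is a minimal model, assuming voltage inputs, current outputs, ideal switches, and binary-weighted path impedances of 1 kΩ, 2 kΩ, and 4 kΩ for paths X, Y, and Z; the function names and the example select bits are hypothetical.

```python
# Assumed binary-weighted path impedances in ohms for paths X, Y, Z.
PATH_OHMS = (1000.0, 2000.0, 4000.0)


def conductance(select_bits):
    """Conductance of one multiplication circuit for select bits (X, Y, Z)."""
    return sum(bit / ohms for bit, ohms in zip(select_bits, PATH_OHMS))


def mac_2x2(v_302a, v_302b, sel):
    """Return (i_304a, i_304b) for the FIG. 3 wiring.

    sel maps '306a'..'306d' to (X, Y, Z) select bits. Circuits 306a and 306b
    share input 302a, circuits 306c and 306d share input 302b. Outputs of
    306a and 306c sum into 304a; outputs of 306b and 306d sum into 304b.
    """
    g = {name: conductance(bits) for name, bits in sel.items()}
    i_304a = g["306a"] * v_302a + g["306c"] * v_302b
    i_304b = g["306b"] * v_302a + g["306d"] * v_302b
    return i_304a, i_304b


select = {
    "306a": (1, 0, 0),  # 1X selected
    "306b": (0, 1, 0),  # 2Y selected
    "306c": (0, 0, 1),  # 3Z selected
    "306d": (1, 1, 0),  # 4X and 4Y selected
}
print(mac_2x2(1.0, 2.0, select))
```

Rewriting the entries of `select` corresponds to reprogramming the stored path select signals; the wiring of inputs and outputs stays fixed.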
- MAD multiply-add
- FIG. 4 is a top view of a portion of an example integrated circuit 400 including an array 402 of multiplication circuits that may be included in the system 100a of FIG. 1A, according to some embodiments.
- the array 402 of multiplication circuits may be configured in the manner described herein for array 300 of FIG. 3.
- each multiplication circuit of the array may include a plurality of paths, each path including a switch (e.g., 320) and an impedance (e.g., 310).
- the array 402 includes four multiplication circuits, with one multiplication circuit 406a indicated by a dashed box.
- conductive portions of the integrated circuit 400 may be coupled to inputs and outputs of the integrated circuit 400 and to the multiplication circuits.
- the integrated circuit 400 has two first conductive portions 412 respectively coupled to inputs 410a and 410b and six second conductive portions 422, three coupled to output 420a and three coupled to output 420b.
- each first conductive portion 412 may be coupled to a respective row of multiplication circuits of the array 402 and each second conductive portion may be coupled to a respective column of multiplication circuits of the array 402.
- multiplication circuit 406a is coupled to input 410a and output 420a, with output 420a configured as the sum of outputs of the left two multiplication circuits of the array 402 and output 420b configured as the sum of outputs of the right two multiplication circuits.
- each multiplication circuit may be configured to receive one or more path select signals for selecting a respective one or more switches of the circuit.
- the integrated circuit includes switch-control conductive portions 430 coupled to respective paths of the multiplication circuits.
- the multiplication circuit includes three paths, each coupled to input 410a by a first conductive portion 412 and to output 420a via respective second conductive portions 422, and each coupled to a respective switch-control conductive portion 430.
- the switch-control conductive portions 430 may be electrically isolated from one another (e.g., with little to no communicative coupling therebetween) such that each switch may receive its own individual path select signal.
- at least some of the first conductive portions 412 and/or second conductive portions 422 may be coupled to one another, such as shown in FIG. 4 where the left three second conductive portions 422 are coupled to one another to combine current signals into an output 420a.
- the multiplication circuits of the array 402 may have a scalable structure that is reprogrammable when coupled to a memory.
- the multiplication circuits may be formed as a subarray of unit cell circuits of an array of unit cell circuits, each unit cell circuit having an impedance and a switch, with the impedance being individually configured for that unit cell circuit during manufacture.
- interconnections between the unit cell circuits, such as first conductive portions 412 and second conductive portions 422, may be configured during manufacture to divide the unit cell circuits among multiplication circuits, such as with impedances within each circuit being different from one another.
- the scalar multiple value provided by each multiplication circuit may be reprogrammed by changing the values provided to each switch via the switch-control conductive portions 430.
- the integrated circuit 400 may have a first layer including the first conductive portions 412 and a second layer including the second conductive portions 422.
- the first conductive portions 412 may be coupled to respective input terminals of the integrated circuit (not shown) and the second conductive portions 422 may be coupled to one or more output terminals of the integrated circuit (not shown), such as depending on whether and/or how many of the second conductive portions 422 are connected together to sum the current signals therein.
- the array 402 of multiplication circuits may form at least a portion of a programmable impedance circuit.
- at least one of the multiplication circuits may have impedances (e.g., 310) coupled in parallel between an input (e.g., 410a via one of the first conductive portions 412) and an output (e.g., 420a via one of the second conductive portions 422).
- switches e.g., 320
- switches may be coupled in series with respective ones of the impedances between the input and the output (e.g., first and second conductive portions 412 and 422).
- a layer may have multiple first conductive portions 412, such as configured to provide respective inputs 410a and 410b.
- each first conductive portion 412 may be coupled to a respective input terminal.
- another layer may have multiple second conductive portions 422, such as coupled together to provide output 420a to a same output terminal, or electrically isolated and configured to provide respective outputs 420a and 420b to respective output terminals.
- a first programmable impedance circuit (e.g., multiplication circuit 406a) may be coupled between one of the first conductive portions 412 (e.g., to receive input 410a) and at least one of the second conductive portions 422 (e.g., the left three second conductive portions 422 to provide output 420a), and a second programmable impedance circuit (e.g., including the top-right and/or bottom right three unit cell circuits) may be coupled between one of the first conductive portions 412 (e.g., to receive input 410a and/or 410b) and at least one of the second conductive portions 422 (e.g., the right three second conductive portions 422 to provide output 420b).
- FIG. 4 shows each row of the array coupled to the same input and each column of the array coupled to the same output
- rows and/or columns may have multiplication circuits respectively coupled to multiple inputs and/or outputs.
- a row of an array may have a first conductive portion extending from an end of the row to the middle of the row and another first conductive portion extending from the opposite end of the row to the middle of the row.
- a column of an array may have a second conductive portion extending from an end of the column to the middle of the column and another second conductive portion extending from the opposite end of the column to the middle of the column.
- FIG. 4 shows each multiplication circuit having three unit cell circuits of the array 402
- a multiplication circuit may have any number of unit cell circuits, and arrays may include multiplication circuits having different numbers of unit cell circuits.
- FIG. 5 is a top view of a portion of an example integrated circuit 500 including an array 502 of multiplication circuits that may be included in the system 100a of FIG. 1A, according to some embodiments.
- the integrated circuit 500 may be configured in the manner described herein for the integrated circuit 400.
- the integrated circuit 500 includes an array 502 of multiplication circuits that are configured to receive inputs and provide outputs.
- the integrated circuit 500 has three first conductive portions 510 configured to provide three respective voltage signals V1, V2, and V3 as inputs to the array 502, and three second conductive portions 520 configured to receive three respective current signals I1, I2, and I3 as outputs of the array 502.
- each multiplication circuit in the array may have multiple paths configured as described below in connection with FIG. 7C.
- the multiplication circuits and the conductive portions 510 and 520 may be on different layers of the integrated circuit 500.
- the first conductive portions 510 may be on a first layer
- the second conductive portions may be on a second layer
- the multiplication circuits may be disposed on yet another layer.
- the multiplication circuits may be disposed on a layer that is between the first and second layers having the first conductive portions 510 and second conductive portions 520, respectively. It should be appreciated, however, that the layers may be configured differently.
- first conductive portions 510 may be on a layer that is between the layer having the multiplication circuits and the layer having the second conductive portions 520, and/or the second conductive portions 520 may be on a layer that is between the layer having the multiplication circuits and the layer having the first conductive portions 510. It should also be appreciated that not all first conductive portions 510, second conductive portions 520, and/or multiplication circuits need be confined to a single layer each.
- FIG. 6A is a top view of a portion of an example integrated circuit 600 including an array 602 of resistor-based multiplication circuits that may be included in the system 100a of FIG. 1A, according to some embodiments.
- FIG. 6B is a top view of the portion of the example integrated circuit 600 of FIG. 6A with the first and second conductive portions 612, 622 hidden from view, according to some embodiments.
- the integrated circuit 600 may be configured as described herein for the integrated circuit 400.
- the integrated circuit 600 is configured to provide two inputs 610a, 610b via respective first conductive portions 612 to an array 602 of multiplication circuits (FIG. 6B) to obtain an output 620 via second conductive portions 622.
- the array 602 includes two multiplication circuits, each with a row of three unit cell circuits 630, and each unit cell circuit 630 including a resistor 632 and a transistor 634 that is coupled to a respective switch-control conductive portion 604.
- the transistor 634 of each unit cell circuit has a first channel terminal (e.g., drain) coupled to a first conductive portion 612, a second channel terminal (e.g., source) coupled to a first end of the resistor 632, and a control terminal (e.g., gate) coupled to the switch-control conductive portion 604, and a second end of the resistor is coupled to a second conductive portion 622.
- the transistors 634 may be metal-oxide-semiconductor field effect transistors (MOSFETs), such as thin-film transistors (TFTs), although other transistors may be used, such as bipolar junction transistors (BJTs).
- MOSFETs metal-oxide-semiconductor field effect transistors
- TFTs thin-film transistors
- BJTs bipolar junction transistors
- the resistors 632 may include meandering conductive traces that provide a fixed resistance value, although other types of resistors and/or impedances may be used.
- impedances may have non-negligible inductance and/or capacitance values, whereas in other cases, impedances may be substantially entirely resistive, resulting in resistor-based multiplication circuits.
- first conductive portions 612 and/or second conductive portions 622 may at least partially overlap, in a direction normal to the array 602, with at least one unit cell circuit 630.
- the array 602 has row and column dimensions that may form the plane of the array 602 (e.g., parallel to a plane of a wafer on which the array may be formed), and the first conductive portion 612 configured to provide the input 610a to the identified unit cell circuit 630 overlaps with (e.g., overlies or underlies) a portion of the unit cell circuit 630 in a direction normal to the plane of the array 602.
- the second conductive portion 622 configured to obtain at least a portion of the output 620 from the identified unit cell circuit 630 overlaps with (e.g., overlies or underlies) a portion of the unit cell circuit 630 in a direction normal to the plane of the array 602.
- the first conductive portion 612 and/or the second conductive portion 622 configured to provide the input 610a to and/or obtain at least a portion of the output 620 from the identified unit cell circuit 630 may be connected to the unit cell circuit 630 by a respective via elongated parallel to a direction normal to the plane of the array 602.
- a first conductive portion may at least partially overlap with more than one unit cell circuit 630, such as the first conductive portions 612 in FIGs. 6A-6B each overlapping with a row of unit cell circuits 630 and the second conductive portions 622 in FIGs. 6A-6B each overlapping with a column of unit cell circuits 630.
- although FIGs. 6A-6B show only partial overlap of unit cell circuits 630 with first conductive portions 612 and second conductive portions 622, more or less overlap, such as total overlap, may be implemented.
- the first conductive portions 510 and/or the second conductive portions 520 may be configured to fully or at least substantially overlap with unit cell circuits of the array 502, such as at a point of overlap, in a direction normal to the array 502, of the first conductive portion 510, the second conductive portion 520, and electrodes of the unit cell circuit connected to the respective conductive portions.
- although FIGs. 6A-6B show a single transistor per unit cell circuit, it should be appreciated that alternative or additional transistors may be included per unit cell circuit.
- at least some unit cell circuits may alternatively or additionally include a transistor configured as a row and/or column select transistor, such as with the channel coupled between the resistor and a conductive portion (e.g., in series with the illustrated transistor) and the control terminal coupled to a metal portion to receive a row and/or column select signal.
- a row and/or column select transistor may permit inputs to the integrated circuit (e.g., from a processor 110) to include input signals and row and/or column select signals as an alternative or in addition to inputs including respective input signals for each multiplication circuit and/or unit cell circuit, permitting flexibility of operation and in implementation of the interface with the integrated circuit.
- including such transistors may advantageously improve computing accuracy, such as by preventing current from leaking through unselected unit cell circuits.
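The effect of a series row and/or column select transistor can be sketched with an idealized model. The sketch below assumes ideal switches (a real transistor in its off state still passes a small leakage current), a voltage input, and a current output; all names and values are hypothetical.

```python
def cell_current(v_in, r_ohm, path_select, row_select=1):
    """Current through one unit cell with two series transistors.

    The resistor conducts only when both the path select bit and the row
    select bit are 1, because both channels are in series with the resistor.
    """
    conducting = bool(path_select) and bool(row_select)
    return v_in / r_ohm if conducting else 0.0


def column_current(row_voltages, r_ohm, path_selects, row_selects):
    """Sum the currents that cells in one column deliver to a shared output."""
    return sum(
        cell_current(v, r_ohm, p, s)
        for v, p, s in zip(row_voltages, path_selects, row_selects)
    )


# Both cells have their path selected, but only row 0 is enabled, so the
# voltage present on row 1 contributes nothing to the column output.
print(column_current([1.0, 5.0], 1000.0, [1, 1], [1, 0]))
```

In this model the stored path select bits can remain unchanged while the row select signals decide which rows participate in a given operation.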
- FIG. 7A is a perspective view of a resistor 700 that may be included in the resistor-based multiplication circuits of FIGs. 6A-6B, according to some embodiments.
- FIG. 7B is a top view of the conductive trace 702 and electrodes 704a and 704b of the resistor 700 of FIG. 7A, according to some embodiments.
- the resistor 700 may be used in unit cell circuits 630 of the integrated circuit 600. As shown in FIGs. 7A-7B, the resistor 700 has a conductive trace 702 that meanders from a first electrode 704a to a second electrode 704b to provide resistance between a first conductive portion 710a coupled to the first electrode 704a and a second conductive portion 710b coupled to the second electrode 704b.
- the first conductive portion 710a may be configured to provide an input signal and the second conductive portion 710b may be configured to obtain an output signal, although in some embodiments the first and/or second electrode 704a, 704b may be coupled to a transistor that is, in turn, coupled to a conductive portion.
- the meandering conductive trace 702 may be fixed in position when manufactured to provide a fixed resistance in a unit cell circuit of a multiplication circuit.
- the meandering conductive trace 702 may be formed using gold, copper, or other suitable metal, whereas in further embodiments, the trace 702 (or like resistance-fixing component coupled between a pair of electrodes) may be formed using semiconductor material.
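To illustrate how a meandering geometry can fix a resistance value, the following sketch applies the standard uniform-conductor relation R = ρL/(wt). It is an approximation under stated assumptions: corner effects are ignored, the bulk resistivity of copper is used (thin films typically differ), and every dimension shown is hypothetical rather than taken from the disclosure.

```python
RHO_COPPER = 1.68e-8  # ohm*m, approximate bulk resistivity at room temperature


def meander_length(n_runs, run_length_m, pitch_m):
    """Approximate centerline length: n straight runs plus the links between them."""
    return n_runs * run_length_m + (n_runs - 1) * pitch_m


def trace_resistance(resistivity_ohm_m, length_m, width_m, thickness_m):
    """Resistance of a uniform trace: R = rho * L / (w * t)."""
    return resistivity_ohm_m * length_m / (width_m * thickness_m)


# Hypothetical geometry: 10 runs of 100 um at a 2 um pitch,
# trace 1 um wide and 100 nm thick.
length = meander_length(10, 100e-6, 2e-6)
print(trace_resistance(RHO_COPPER, length, 1e-6, 100e-9))
```

Because resistance scales linearly with length in this model, doubling the number of runs roughly doubles the resistance, which is one way different unit cell circuits could be given different fixed values at manufacture.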
- FIG. 7C is a top view of an alternative example resistive structure that may be included in the resistor-based multiplication circuits of FIGs. 6A-6B, according to some embodiments.
- the resistive structure shown in FIG. 7C may be used in unit cell circuits 630 of the integrated circuit 600.
- the unit cell circuit includes a plurality of resistors in series with respective transistors and sharing an electrode, according to some embodiments.
- the resistors shown in FIG. 7C differ from the resistor 700 in that they are coupled between a plurality of switches, respectively, and a common electrode.
- the unit cell circuit includes four fixed impedances in series with respective switches.
- the unit cell circuit of FIG. 7C includes four conductive traces, Trace 1, Trace 2, Trace 3, and Trace 4, each coupled in series with a respective one of Switch 1, Switch 2, Switch 3, and Switch 4.
- each of Trace 1, Trace 2, Trace 3, and Trace 4 terminates at one end at a respective electrode that is coupled to the respective switch of Switch 1, Switch 2, Switch 3, and Switch 4.
- each of Trace 1, Trace 2, Trace 3, and Trace 4 terminates at another end at a common electrode.
- each of Trace 1, Trace 2, Trace 3, and Trace 4 may have a different fixed impedance value, such as due to different meandering trace geometries. It should be appreciated that, in some embodiments, at least some traces within a unit cell circuit may have the same fixed impedance value. Moreover, in some embodiments, when disposed in a row and/or column, unit cell circuits with the resistive structure of FIG. 7C may have different fixed impedances (e.g., combinations of fixed impedances) along at least a part of each row and/or column, depending on the application.
- the resistive structure of FIG. 7C, when configured within a unit cell circuit, may be coupled to and between one or more first conductive portions and a second conductive portion.
- channel terminals of Switch 1, Switch 2, Switch 3, and Switch 4 may be coupled to one first conductive portion and the common electrode may be coupled to the second conductive portion, such that the series pairs of fixed impedances and switches are coupled in parallel with one another.
- control terminals of Switch 1, Switch 2, Switch 3, and Switch 4 may be configured to receive respective signals selecting the respective one of Trace 1, Trace 2, Trace 3, and Trace 4, which may set the parallel impedance the unit cell circuit provides between the first conductive portion and the second conductive portion.
- At least some channel terminals of Switch 1, Switch 2, Switch 3, and Switch 4 may be coupled to respective first conductive portions.
- control terminals of Switch 1, Switch 2, Switch 3, and Switch 4 may be configured to receive respective signals selecting the respective one of Trace 1, Trace 2, Trace 3, and Trace 4, which may select which first conductive portion(s) to couple to.
- one or more input signals may be received at Switch 1, Switch 2, Switch 3, and Switch 4, and/or at the common electrode, and/or one or more output signals may be provided at Switch 1, Switch 2, Switch 3, and Switch 4, and/or at the common electrode, depending on the application.
- an integrated circuit may be reconfigurable (e.g., based on interaction with input and output terminals of the integrated circuit) to use the same unit cell circuit in either configuration.
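The two ways of using the FIG. 7C structure can be captured by one small model. This sketch assumes ideal switches, resistive traces, voltage inputs at the switch side, and a common electrode held at 0 V that collects the output current; the function name and all values are hypothetical.

```python
def common_electrode_current(inputs_v, trace_ohms, select_bits):
    """Current collected at the common electrode.

    inputs_v[i] is the voltage on the conductive portion behind Switch i+1,
    trace_ohms[i] is the fixed resistance of Trace i+1, and select_bits[i]
    is 1 when that switch is closed.
    """
    return sum(
        bit * v / r for v, r, bit in zip(inputs_v, trace_ohms, select_bits)
    )


traces = [1000.0, 2000.0, 4000.0, 8000.0]  # assumed values for Trace 1..4

# Shared input: all four switches see the same voltage, so the select bits
# set the parallel impedance between the input and the common electrode.
print(common_electrode_current([2.0] * 4, traces, [1, 1, 0, 0]))

# Separate inputs: each switch sits on its own conductive portion, so the
# select bits choose which inputs are coupled through to the output.
print(common_electrode_current([1.0, 2.0, 3.0, 4.0], traces, [1, 0, 1, 0]))
```

The same function covers both cases because the only difference is whether the input voltages are tied together, which mirrors the idea of reusing one unit cell circuit in either configuration.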
- FIG. 8 is a block diagram of an example unit cell circuit 800 with a selectable path 806 and co-located memory cell 830 that may be included in an array of multiplication circuits within the system 100b of FIG. 1B, according to some embodiments.
- the unit cell circuit 800 may be configured in the manner described herein for the unit cell circuit 630 shown in FIG. 6B.
- the unit cell circuit 800 is shown in FIG. 8 including a selectable path 806 configured to receive an input 802 and provide an output 804, with the path including a switch 820 and an impedance 810.
- the unit cell circuit 800 may include a memory cell 830 co-located with the path 806.
- the memory cell 830 may include a single bitcell for the single path 806.
- the memory cell 830 is configured to provide a path select signal to the switch 820 in the path 806 and to receive a write enable signal 808a and a write value 808b, both of which may be received from a processor to set the value(s) for path selection in the unit cell circuit 800.
- although unit cell circuit 800 is shown in FIG. 8 including only a single path 806, it should be appreciated that any number of paths may be included, such as multiple paths, which may be configured as shown for the unit cell circuit of FIG. 7C.
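The interaction between the memory cell 830 and the selectable path 806 can be sketched as a small behavioral model. It assumes a single bitcell, an ideal switch, a resistive impedance, a voltage input, and a current output; the class and method names are hypothetical.

```python
class UnitCell:
    """Behavioral model of one selectable path with a co-located bitcell."""

    def __init__(self, impedance_ohm):
        self.impedance_ohm = impedance_ohm
        self.bit = 0  # stored path select value; 0 leaves the path open

    def write(self, write_enable, write_value):
        """Store write_value only while write_enable is asserted."""
        if write_enable:
            self.bit = 1 if write_value else 0

    def output(self, v_in):
        """Current through the path: v_in / Z when selected, otherwise zero."""
        return v_in / self.impedance_ohm if self.bit else 0.0


cell = UnitCell(2000.0)
cell.write(write_enable=0, write_value=1)  # ignored: write enable is low
print(cell.output(1.0))
cell.write(write_enable=1, write_value=1)  # path is now selected
print(cell.output(1.0))
```

Once written, the stored bit keeps driving the switch, so the processor only needs to access the cell again when the path selection should change.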
- FIG. 9 is a block diagram of an example integrated circuit 900 including an array 902 of unit cell circuits with selectable paths 806 and co-located memory cells 830, which may be included within the system 100b of FIG. 1B, according to some embodiments.
- the integrated circuit 900 may be configured as described herein for the integrated circuit 400.
- the integrated circuit 900 is configured to provide three inputs 910a, 910b, and 910c to the array 902 of multiplication circuits and to receive three outputs 920a, 920b, and 920c from the array 902.
- three first conductive portions may be coupled to the array 902 to provide the inputs 910a, 910b, and 910c and three second conductive portions (not shown, e.g., 422) may be coupled to the array 902 to receive the outputs 920a, 920b, 920c.
- three conductive portions may be coupled to the array 902 to provide respective write values 904a, 904b, and 904c for writing the memory cells 830 of each unit cell circuit 800, and three conductive portions may be coupled to the array 902 to provide write enable signals 902a, 902b, and 902c to the memory cells 830.
- the array 902 of multiplication circuits may be configured to operate as described herein for the array 402 of the integrated circuit 400.
- the unit cell circuits 800 of the array 902 may be interconnected to form multiplication circuits, such as with the top row of three unit cell circuits 800 in FIG. 9 forming a multiplication circuit receiving an input 910a and providing outputs 920a, 920b, and 920c that may be combined.
- the impedances 810 may be configured as resistors (e.g., 632) and the switches 820 may be configured as transistors (e.g., 634) in a manner similar to the unit cell circuits 630.
- switch-control conductive portions may be located entirely within the unit cell circuit 800 to connect the memory cell 830 directly to the switch 820.
- the memory cells 830 of the array may be formed on one or more layers different from the switches 820 and/or impedances 810 and connected to the switches 820 through vias.
- although memory cells 830 are shown co-located with unit cell circuits 800 in FIGs. 8-9, there may be applications in which a memory is built into the same integrated circuit as the array of multiplication circuits but memory cells are not co-located with the unit cell circuits.
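Writing the memory cells of the array 902 through shared write value and write enable lines can be sketched as follows. The wiring is an assumption made for illustration, since the text does not state it: one write enable line per row and one write value line per column. The function names are hypothetical.

```python
def write_rows(memory, row_enables, column_values):
    """Update a 2-D list of stored bits in place and return it.

    Assumed wiring: row_enables[r] gates every cell in row r, and
    column_values[c] is presented to every cell in column c.
    """
    for r, enabled in enumerate(row_enables):
        if enabled:
            memory[r] = [1 if v else 0 for v in column_values]
    return memory


def program(target):
    """Write a full pattern one row at a time using one-hot row enables."""
    rows, cols = len(target), len(target[0])
    memory = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        one_hot = [1 if i == r else 0 for i in range(rows)]
        write_rows(memory, one_hot, target[r])
    return memory


print(program([[1, 0, 1], [0, 1, 0], [1, 1, 0]]))
```

Under this assumed wiring, a three-by-three array needs three write cycles to load an arbitrary pattern, and rows whose enable stays low keep their previous contents.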
- FIG. 10 is a block diagram of the integrated circuit 900 of FIG. 9 illustrating multiple example configurations 1040a, 1040b, and 1040c of the unit cell circuits 800, according to some embodiments.
- FIG. 10 illustrates multiple programmable options for grouping the unit cell circuits 800.
- the top row of three unit cell circuits 800 may form a multiplication circuit with each unit cell circuit configured to receive the same input signal 1010a and provide (at least a portion of) output signals 1020a, 1020b, and 1020c combinable to obtain an output from the multiplication circuit.
- the left two unit cell circuits 800 of the first two rows may form a multiplication circuit with the top two unit cell circuits 800 configured to receive a first input signal 1010a and the bottom two unit cell circuits 800 configured to receive a second input signal 1010b, and the output signal 1020a includes a sum of outputs from the left two unit cell circuits 800 and the output signal 1020b includes a sum of outputs from the right two unit cell circuits 800.
- the left column of three unit cell circuits 800 may form a multiplication circuit with each unit cell circuit 800 configured to receive a respective one of input signals 1010a, 1010b, and 1010c and the output signal 1020a includes a sum of outputs from all three unit cell circuits 800.
- all three options for grouping described herein for FIG. 10 may be programmed and reprogrammed into the same integrated circuit.
- memory cells may be programmed to select impedances of unit cell circuits to be used and not to select impedances of unit cell circuits not to be used. It should be appreciated that this reprogrammability is not limited to embodiments having memory in the integrated circuit with the multiplication circuits, as embodiments that may receive memory values from an external memory may be similarly reprogrammed.
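The three groupings described for FIG. 10 can be expressed as different select patterns written into the same three-by-three array. This sketch assumes ideal switches, resistive impedances, voltage inputs per row, and current outputs summed per column; the grouping names, the equal 1 kΩ values, and the function name are hypothetical.

```python
# 1 = memory cell programmed to select the unit cell, 0 = not selected.
GROUPINGS = {
    "top_row": [[1, 1, 1], [0, 0, 0], [0, 0, 0]],
    "top_left_2x2": [[1, 1, 0], [1, 1, 0], [0, 0, 0]],
    "left_column": [[1, 0, 0], [1, 0, 0], [1, 0, 0]],
}


def column_outputs(inputs_v, ohms, mask):
    """Column currents: each selected cell adds inputs_v[row] / ohms[row][col]."""
    rows, cols = len(mask), len(mask[0])
    return [
        sum(mask[r][c] * inputs_v[r] / ohms[r][c] for r in range(rows))
        for c in range(cols)
    ]


ohms = [[1000.0] * 3 for _ in range(3)]  # assumed equal impedances
inputs = [1.0, 2.0, 3.0]  # stand-ins for input signals 1010a, 1010b, 1010c
for name, mask in GROUPINGS.items():
    print(name, column_outputs(inputs, ohms, mask))
```

Only the stored bits differ between the three cases, which is why all of the groupings can be programmed and reprogrammed into the same integrated circuit.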
- One or more aspects and embodiments of the present disclosure involving the performance of processes or methods may utilize program instructions executable by a device (e.g., a computer, a processor, or other device) to perform, or control performance of, the processes or methods.
- inventive concepts may be embodied as a computer readable storage medium (or multiple computer readable storage media) (e.g., a computer memory, one or more floppy discs, compact discs, optical discs, magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, or other tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement one or more of the various embodiments described above.
- the computer readable medium or media can be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various ones of the aspects described above.
- computer readable media may be non-transitory media.
- “program” or “software” are used herein in a generic sense to refer to any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects as described above. Additionally, it should be appreciated that according to one aspect, one or more computer programs that when executed perform methods of the present disclosure need not reside on a single computer or processor, but may be distributed in a modular fashion among a number of different computers or processors to implement various aspects of the present disclosure.
- Computer-executable instructions may be in many forms, such as program modules, executed by one or more computers or other devices.
- program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
- functionality of the program modules may be combined or distributed as desired in various embodiments.
- data structures may be stored in computer-readable media in any suitable form.
- data structures may be shown to have fields that are related through location in the data structure. Such relationships may likewise be achieved by assigning storage for the fields with locations in a computer-readable medium that convey relationship between the fields.
- any suitable mechanism may be used to establish a relationship between information in fields of a data structure, including through the use of pointers, tags or other mechanisms that establish relationship between data elements.
- the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers.
- a computer may be embodied in any of a number of forms, such as a rack-mounted computer, a desktop computer, a laptop computer, or a tablet computer, as non-limiting examples. Additionally, a computer may be embedded in a device not generally regarded as a computer but with suitable processing capabilities, including a Personal Digital Assistant (PDA), a smartphone or any other suitable portable or fixed electronic device.
- PDA Personal Digital Assistant
- a computer may have one or more input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards, and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in other audible formats.
- Such computers may be interconnected by one or more networks in any suitable form, including a local area network or a wide area network, such as an enterprise network, an intelligent network (IN), or the Internet.
- networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.
- some aspects may be embodied as one or more methods.
- the acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.
- a reference to “A and/or B,” when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A without B (optionally including elements other than B); in another embodiment, to B without A (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
- At least one of A and B can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Semiconductor Integrated Circuits (AREA)
Abstract
The present disclosure describes circuits and systems for performing multiplication that, in accordance with certain embodiments, are programmable, fast, reliable, and highly-scalable. Some embodiments of the present disclosure provide integrated circuit structures with parallel-coupled switchable impedances configured to perform multiplication. Some embodiments of the present disclosure provide scalar multiplication circuits that are controllable using selectable paths through the circuit. Embodiments of the present disclosure may be produced at high scale both efficiently and reliably by leveraging existing integrated circuit and memory technologies.
Description
PROGRAMMABLE MULTIPLICATION CIRCUIT AND SYSTEMS
RELATED APPLICATIONS
[0001] This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application No. 63/591,032, filed October 17, 2023, and entitled “PROGRAMMABLE MULTIPLICATION CIRCUIT AND SYSTEMS,” which is incorporated herein by reference in its entirety for all purposes.
TECHNICAL FIELD
[0002] Circuits and systems for multiplication, including integrated circuits configured to perform machine learning inference calculations, are generally described.
BACKGROUND
[0003] Machine learning processing typically includes processing a multi-dimensional input vector through several computational layers, with each computational layer multiplying neuron values by respective connection weights. In a supervised learning algorithm, the connection weights are pre-programmed with values during training that appropriately transform training inputs into desired outputs. As a result, multiplication at each layer of a machine learning algorithm is designed to cumulatively reach a classification or regression result from the input that is consistent with the training the algorithm received.
[0004] Conventionally, machine learning processing is performed by general-purpose central processing units (CPUs) and sometimes graphics processing units (GPUs). CPUs are designed to process information serially and are less efficient for highly parallel computation, such as machine learning processing, which typically involves parallel processing of large, multidimensional inputs. While GPUs are designed for highly parallel computation, they are expensive, consume a large amount of power, and are not optimized for the kind of processing machine learning often entails. Alternative solutions to general-purpose CPUs and GPUs are therefore desirable to improve speed, power efficiency, and associated costs of machine learning processing.
SUMMARY OF THE DISCLOSURE
[0005] The present disclosure describes circuits and systems for performing multiplication that, in accordance with certain embodiments, are programmable, fast, reliable, and highly scalable. Some embodiments of the present disclosure provide integrated circuit structures with parallel-coupled switchable impedances that are programmable to perform multiplication. Some embodiments of the present disclosure provide scalar multiplication circuits that are controllable using selectable paths through the circuit. Embodiments of the present disclosure may be produced at high scale, efficiently, and reliably by leveraging existing integrated circuit and memory technologies.
[0006] Certain aspects relate to an integrated circuit. In some embodiments, the integrated circuit comprises an input terminal, a first conductive portion that is coupled to the input terminal, an output terminal, a second conductive portion that is coupled to the output terminal, and a programmable multiplication circuit comprising a plurality of impedances coupled in parallel between the first conductive portion and the second conductive portion and having fixed impedance values, and a plurality of switches coupled in series with respective ones of the plurality of impedances between the first conductive portion and the second conductive portion. In some embodiments, the impedances comprise resistors having fixed resistance values.
[0007] In some embodiments, the integrated circuit comprises a plurality of switch-control conductive portions that are electrically isolated from one another, and the plurality of switches comprises a plurality of transistors each having a first channel terminal coupled to the first conductive portion, a second channel terminal coupled to the second conductive portion via the programmable multiplication circuit, and a gate terminal coupled to a respective one of the switch-control conductive portions.
[0008] In some embodiments, the integrated circuit further comprises a third conductive portion coupled to the input terminal or to a second input terminal, a fourth conductive portion coupled to the output terminal or to a second output terminal, and a second programmable impedance circuit, comprising a second plurality of impedances coupled in parallel between the third conductive portion and the fourth conductive portion and having fixed impedance values, and a second plurality of switches coupled in series with respective ones of the second plurality of impedances between the third conductive portion and the fourth conductive portion. In some
embodiments, the fourth conductive portion is coupled to the second output terminal. In some embodiments, the third conductive portion is coupled to the second input terminal.
[0009] In some embodiments, the first and second conductive portions are disposed on different respective layers of the integrated circuit.
[0010] In some embodiments, the integrated circuit further comprises a memory comprising a plurality of memory cells coupled to respective switches of the plurality of switches. In some embodiments, the plurality of impedances and the plurality of switches comprise a plurality of impedance-switch pairs, with ones of the plurality of memory cells co-located with respective ones of the plurality of impedance-switch pairs. In some embodiments, the memory is configured as random-access memory (RAM).
[0011] In some embodiments, the plurality of impedances and the plurality of switches are arranged in an array of unit cell circuits, each unit cell circuit of at least a subarray of the array of unit cell circuits comprises a respective one of the plurality of impedances coupled in series with a respective one of the plurality of switches, and the first and second conductive portions at least partially overlap, in a direction normal to the array, with at least one of the unit cell circuits of the subarray.
[0012] Certain aspects relate to a multiplication circuit. In some embodiments, the multiplication circuit comprises an input terminal configured to receive an input signal, an output terminal configured to provide an output signal, and a plurality of selectable paths coupled in parallel between the input terminal and the output terminal, the plurality of selectable paths configured to generate the output signal as a multiplication of the input signal by a scalar value, wherein the scalar value is controllable by selecting a subset of the plurality of selectable paths.
[0013] In some embodiments, each of the plurality of selectable paths is configured to, when selected, propagate at least a portion of the input signal from the input terminal to the output terminal. In some embodiments, each of the plurality of selectable paths comprises a switch coupled in series with a fixed impedance, the switch being controllable to select the respective selectable path. In some embodiments, each switch is configured to receive a respective select signal from a memory.
[0014] In some embodiments, one of the input signal and the output signal is a voltage signal and the other of the input signal and the output signal is a current signal.
[0015] In some embodiments, a system comprises the multiplication circuit and a memory, and the multiplication circuit is configured to select the subset of the plurality of selectable paths based on values stored in the memory. In some embodiments, the system further comprises a processor coupled to the multiplication circuit and configured to multiply a neuron value by a connection weight at least in part by providing, to the multiplication circuit, the input signal as a version of the neuron value, and receiving, via the multiplication circuit, a version of the output signal, and the scalar value is based on the connection weight. In some embodiments, the processor is further configured to set the values stored in the memory based on the connection weight.
[0016] In some embodiments, the system further comprises an integrated circuit having the memory and the multiplication circuit included therein. In some embodiments, each of the plurality of selectable paths is co-located with a respective memory cell of the memory. In some embodiments, the memory is configured as random-access memory (RAM).
[0017] Other advantages and novel features of the present disclosure will become apparent when considered in conjunction with the accompanying figures. In cases where the present specification and a document incorporated by reference include conflicting and/or inconsistent disclosure, the present specification shall control.
BRIEF DESCRIPTION OF THE DRAWINGS
[0018] Non-limiting embodiments of the present disclosure will be described by way of example with reference to the accompanying figures, which are schematic and are not intended to be drawn to scale. In the figures, each identical or nearly identical component illustrated is typically represented by a single numeral. For purposes of clarity, not every component is labeled in every figure, nor is every component of each embodiment shown where illustration is not necessary to allow those of ordinary skill in the art to understand the disclosure. In the figures:
[0019] FIG. 1A is a block diagram of an example system including an integrated circuit configured for multiplication, according to some embodiments;
[0020] FIG. 1B is a block diagram of an alternative example system including an integrated circuit configured for multiplication that further includes a co-located memory, according to some embodiments;
[0021] FIG. 2 is a circuit diagram of an example multiplication circuit that may be included in the integrated circuit of FIG. 1A, according to some embodiments;
[0022] FIG. 3 is a circuit diagram of an example array of multiplication circuits that may be included in the integrated circuit of FIG. 1A, according to some embodiments;
[0023] FIG. 4 is a top view of a portion of an example integrated circuit including an array of multiplication circuits that may be included in the system of FIG. 1A, according to some embodiments;
[0024] FIG. 5 is a perspective view of a portion of an example integrated circuit including an array of multiplication circuits that may be included in the system of FIG. 1A, according to some embodiments;
[0025] FIG. 6A is a top view of a portion of an example integrated circuit including an array of resistor-based multiplication circuits that may be included in the system of FIG. 1A, according to some embodiments;
[0026] FIG. 6B is a top view of the portion of the example integrated circuit of FIG. 6A with the first and second conductive portions hidden from view, according to some embodiments;
[0027] FIG. 7A is a perspective view of a resistor that may be included in the resistor-based multiplication circuits of FIGs. 6A and 6B, according to some embodiments;
[0028] FIG. 7B is a top view of the conductive trace and electrodes of the resistor of FIG. 7A, according to some embodiments;
[0029] FIG. 7C is a top view of an alternative example of a resistor-switch configuration that may be included in the integrated circuit of FIGs. 6A-6B, according to some embodiments;
[0030] FIG. 8 is a block diagram of an example unit cell circuit with a co-located selectable path and memory cell that may be included in an array of multiplication circuits within the system of FIG. 1B, according to some embodiments;
[0031] FIG. 9 is a block diagram of an example integrated circuit with an array of unit cell circuits of FIG. 8, which may be included within the system of FIG. 1B, according to some embodiments; and
[0032] FIG. 10 is a block diagram of the integrated circuit of FIG. 9 illustrating multiple example configurations of the unit cell circuits, according to some embodiments.
DETAILED DESCRIPTION
[0033] The present disclosure describes circuits and systems for performing multiplication that are fast, reliable, and highly scalable. Some embodiments of the present disclosure provide integrated circuit structures with parallel-coupled switchable impedances configured to perform multiplication. Some embodiments of the present disclosure provide scalar multiplication circuits that are controllable using selectable paths through the circuit. Embodiments of the present disclosure may be produced at high scale both efficiently and reliably by leveraging existing integrated circuit and memory technologies.
[0034] The inventors have recognized that memristive technologies could make machine learning computations faster and more efficient, but with drawbacks that make memristive technologies unsuitable for some applications. Memristors are circuit elements that may be programmed with a particular input-to-output transfer function, which may be used to perform a multiplication. On one hand, a large array of memristors may be programmed to apply respective scalar multiplication to a large number of input signals and thereby perform massively parallel multiplication at high speed. Memristors are particularly attractive for this purpose because they may be easily reprogrammed on-the-fly (e.g., without any changes to their fabricated structure) to obtain different transfer functions, permitting the array to be trained or retrained. For instance, changing the transfer functions of the memristors may update the connection weights used for memristive neuron multiplication. As a result, the same memristive structure could be reused repeatedly and retrained and/or adapted for completely different machine learning tasks.
[0035] On the other hand, however, the inventors recognized that memristive structures do not presently achieve reliable and accurate processing. For instance, while memristive structures may be advantageously reprogrammed with different transfer functions, memristive structures may not hold their programmed transfer functions with the degree of accuracy needed for consistent and low error computation. Thus, memristive technologies may require a tradeoff between performance over time and re-programmability.
[0036] Instead, the inventors recognized that, in accordance with some embodiments, circuitry with selectable impedance paths may achieve better performance than memristive structures with a similar programmability advantage. For example, a multiplication may be achieved by processing an input signal through one or more selectable impedance paths to obtain an output representing a scalar multiple of the input. For instance, the scalar multiple achieved using the
circuit may be programmed by selecting a particular subset of available impedance paths to couple in parallel. Moreover, in some embodiments, multiplication structures may be implemented using existing integrated circuit technologies and reprogrammable scalar multiple configurations may be implemented using existing memory technologies, making the resulting circuits efficient and reliable to produce at scale. Such circuits may be especially useful when configured as an array for implementing a machine learning inference machine programmed (and reprogrammable) to perform multiplication on a large number of inputs.
[0037] FIG. 1A is a block diagram of an example system 100a including an integrated circuit (IC) 102a configured for multiplication, according to some embodiments.
[0038] In some embodiments, system 100a may be configured for multiplication, such as may be used to perform a machine learning inference. For example, as shown in FIG. 1A, system 100a includes a processor 110, an analog/digital (A/D) interface 130, an integrated circuit 102a having one or more multiplication (Mult.) circuits 120, and a memory 140. In the illustrated example, processor 110 may be configured to provide an input signal to the multiplication circuit(s) 120 that is indicative of a neuron value N that is to be multiplied by a connection weight C to produce an output signal O indicative of the multiplication result N × C (e.g., either before or after the activation function is applied, depending on the embodiment). Also in the illustrated example, the memory 140 may be programmed with the connection weight C (e.g., at the time of execution and/or beforehand) so as to generate path select signals that control the multiplication circuit(s) 120 to apply the connection weight C. In some embodiments, an artificial intelligence (AI) inference operation may include a multidimensional vector of neuron values N that are to be multiplied by a multidimensional vector of connection weights C programmed in the memory 140 to obtain a multidimensional vector output O when each vector component is processed by a respective multiplication circuit 120.
[0039] In some embodiments, the system 100a may be further configured to apply an activation (and/or deactivation) function to the multiplication result. Typical activation functions include sigmoid ($1/(1+e^{-x})$), hyperbolic tangent ($\tanh(x)$), rectified linear unit (ReLU) ($\max(0, x)$), Leaky ReLU ($\max(0.1x, x)$), Maxout ($\max(w_1^T x + b_1,\ w_2^T x + b_2)$), and exponential linear unit (ELU) ($x$ for $x > 0$; $\alpha(e^{x} - 1)$ for $x < 0$). A deactivation function is typically the derivative of the activation function. In some embodiments, an activation and/or deactivation function may be applied by a component on the integrated circuit 102a, such as an activation and/or deactivation function
circuit (e.g., analog circuitry providing and/or approximating the response of an activation and/or deactivation function). For instance, an analog activation and/or deactivation circuit may be included for each multiplication circuit, such as co-located with the multiplication circuit and/or elsewhere in the integrated circuit 102a. In some embodiments, an activation and/or deactivation function may be applied by a digital processing circuit, such as by digital processing components (e.g., logic gates of an FPGA and/or ASIC) within or separate from the integrated circuit 102a, and/or by the processor 110.
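As a purely illustrative aid, the activation functions listed above can be sketched in software as follows. This is a minimal sketch, not part of the disclosed circuitry: the function names, the scalar form of Maxout (the weights `w1`, `w2` and biases `b1`, `b2` are treated as plain numbers rather than vectors), and the default `alpha` of 1.0 are assumptions made here for illustration.

```python
import math


def sigmoid(x):
    # 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    # Hyperbolic tangent
    return math.tanh(x)


def relu(x):
    # max(0, x)
    return max(0.0, x)


def leaky_relu(x, slope=0.1):
    # max(0.1x, x), with the 0.1 slope taken from the text
    return max(slope * x, x)


def maxout(x, w1, b1, w2, b2):
    # Scalar simplification (assumption) of max(w1^T x + b1, w2^T x + b2)
    return max(w1 * x + b1, w2 * x + b2)


def elu(x, alpha=1.0):
    # x for x > 0, alpha * (e^x - 1) otherwise; alpha = 1.0 is an assumed default
    return x if x > 0 else alpha * (math.exp(x) - 1.0)
```

A digital processing circuit or the processor 110 could apply such a function to the digitized multiplication result, whereas an analog activation circuit would approximate the same response in hardware.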
[0040] In some embodiments, system 100a may be formed using a number of integrated circuits on one or more circuit boards, such as a computer motherboard. In other embodiments, at least some components may be in communication with one another over a communication network, such as in a distributed cloud computing environment. As described herein, an integrated circuit may include one or more semiconductor dies having circuitry fabricated thereon and encapsulated in a package. Multiple dies within an integrated circuit package may be stacked and/or wire bonded together within the package.
[0041] In some embodiments, processor 110 may be a general-purpose central processing unit (CPU). For example, the processor 110 may be included within a general-purpose computer in which the integrated circuit 102a has been incorporated, such as by mounting into a slot on the motherboard in a modular configuration and/or through permanent affixation to the circuit board as part of its manufacturing. While a single processor 110 is shown in FIG. 1A, it should be appreciated that multiple processors 110 may be used, whether in the same local computer system or in a distributed cloud computing environment. Further, while processor 110 is shown separate from integrated circuit 102a in FIG. 1A, it should be appreciated that a single integrated circuit (e.g., mixed analog/digital integrated circuit) may have both the processor 110 and multiplication circuit(s) 120 therein, such as on a single semiconductor die and/or on multiple dies stacked and/or wire bonded together within a single package. Further, while the processor 110 may be implemented as or including a CPU, in other embodiments the processor 110 may be implemented as or include a graphics processing unit (GPU), a field programmable gate array (FPGA), and/or an application-specific integrated circuit (ASIC), as embodiments described herein are not so limited.
[0042] In some embodiments, the A/D interface 130 may be configured to facilitate bidirectional communication between the processor 110 and analog circuitry (e.g., the
multiplication circuit(s) 120) of the integrated circuit 102a. For example, as shown in FIG. 1A, the A/D interface 130 includes a digital-to-analog converter (DAC) 132 and an analog-to-digital converter (ADC) 134 coupled between the processor 110 and the multiplication circuit(s) 120 of the integrated circuit 102a. In some embodiments, the input signal from the processor 110 may have a version (e.g., digital value representative) of the neuron value N, which the A/D interface 130 may convert to a version (e.g., analog value, such as voltage and/or current amplitude, representative) of the neuron value N. Likewise, the output signal O may have a version (e.g., analog value representative) of the multiplication result, which the A/D interface may convert to a version (e.g., digital value representative) of the multiplication result. It should be appreciated that some embodiments may omit one or each of DAC 132 and ADC 134 and/or all of the A/D interface 130, such as where communication between the processor 110 and multiplication circuit(s) 120 is unidirectional, where the multiplication circuit(s) 120 are configured to process digital signals without an intervening DAC, where the multiplication circuit(s) 120 are configured to output digital signals without an intervening ADC, and/or where the processor 110 has at least some analog processing components for interfacing with the integrated circuit 102a.
[0043] In some embodiments, the multiplication circuit(s) 120 may be configured to multiply values represented in input signals received from processor 110 to provide output signals to processor 110 representing the result of the multiplication. For example, in some embodiments, the multiplication circuit(s) 120 may each include a plurality of selectable paths controllable to set the value by which the multiplication circuit(s) multiply values represented in input signals received from the processor 110. For instance, a subset of the paths may be selected using path select signals received from the memory 140. In some embodiments, the multiplication circuit(s) 120 may include parallel-coupled impedances having fixed impedance values and switches (e.g., transistors) coupled in series with respective impedances, such that states of the switches may be controlled to set the desired overall impedance of a multiplication circuit 120 to obtain a desired value by which to multiply a value represented in an input signal from the processor 110.
[0044] In some embodiments, the memory 140 may be configured to provide path select values to multiplication circuit(s) 120 for controlling multiplication of values represented in input signals from the processor 110. For example, the memory 140 may be configured to store at least one path select value for each multiplication circuit 120 of the integrated circuit 102a. For instance, each path select value may include a bit for each selectable path and/or switch of the
respective multiplication circuit 120. In some embodiments, the memory 140 may include random-access memory (RAM), such as static random-access memory (SRAM) and/or dynamic random-access memory (DRAM), though other forms of memory such as resistive RAM and/or flash memory may be used, depending on the preferred and/or available memory for the application. Using RAM may be preferred for some applications to permit re-programming of path select values to change values by which the multiplication circuit(s) 120 are configured to multiply input values, which may be useful for re-training in the case of some AI inference machine implementations. However, permanent memory may be preferred for other applications, such as where path select values and resulting multiplication are predetermined and not to be changed (e.g., for a pre-trained permanently configured AI inference machine).
[0045] In some embodiments, the memory 140 may be at least a portion of a shared computer system memory where at least a portion is in communication with the integrated circuit 102a. In other embodiments, the memory 140 may be partially or entirely dedicated to storing configuration information for multiplication circuit 120. To set the path select values, in some embodiments, the memory 140 may be configured to receive the memory write and enable values from the processor 110 that control the path select values and/or individual bits within a path select value stored in the memory 140.
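The path select storage described above, with one bit per selectable path and writes that may target a whole path select value or an individual bit, can be modeled in software as a small sketch. The class `PathSelectMemory`, its method names, and the bit ordering (bit 0 controlling path 0) are hypothetical names and conventions chosen here for illustration; the disclosure does not specify a memory interface.

```python
class PathSelectMemory:
    """Hypothetical model: one path select word per multiplication circuit."""

    def __init__(self, num_circuits, paths_per_circuit):
        self.paths_per_circuit = paths_per_circuit
        # Every word starts at 0, i.e. all switches off (assumed reset state).
        self.words = [0] * num_circuits

    def write_word(self, circuit, value):
        # Overwrite the whole path select value, keeping only the valid bits.
        mask = (1 << self.paths_per_circuit) - 1
        self.words[circuit] = value & mask

    def write_bit(self, circuit, path, enabled):
        # Set or clear the single bit that controls one path's switch.
        if not 0 <= path < self.paths_per_circuit:
            raise IndexError("path index out of range")
        if enabled:
            self.words[circuit] |= 1 << path
        else:
            self.words[circuit] &= ~(1 << path)

    def select_signals(self, circuit):
        # One boolean per path: True means the path's switch is turned on.
        word = self.words[circuit]
        return [bool((word >> p) & 1) for p in range(self.paths_per_circuit)]


mem = PathSelectMemory(num_circuits=2, paths_per_circuit=3)
mem.write_word(0, 0b101)      # select paths 0 and 2 of circuit 0
mem.write_bit(0, 1, True)     # additionally select path 1
signals = mem.select_signals(0)
```

In this model, reprogramming a connection weight amounts to rewriting the corresponding word, which mirrors the re-training use case mentioned for RAM-based embodiments.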
[0046] In some embodiments, the integrated circuit 102a may be configured to perform activation between neural network layers implemented by multiplication circuits 120 of the integrated circuit 102a. For some applications, an analog input may be provided and an analog output obtained from the integrated circuit 102a, and/or a digital input may be provided and a digital output obtained from the integrated circuit 102a, facilitating omission of the A/D interface 130. For instance, outputs from one group (e.g., layer) of multiplication circuits 120 may be provided (e.g., after activation) as inputs to another group (e.g., layer) of multiplication circuits 120, such as may be used to implement some or all neural network processing within the integrated circuit 102a.
[0047] FIG. 1B is a block diagram of an alternative example system 100b including an integrated circuit 102b configured for multiplication that further includes a co-located memory 140, according to some embodiments.
[0048] In some embodiments, the system 100b may be configured as described herein for the system 100a, such as including a processor 110, A/D interface 130, multiplication circuit(s) 120,
and memory 140. In contrast to the embodiment of system 100a shown in FIG. 1A, however, system 100b in FIG. 1B is shown including multiplication circuit(s) 120 and memory 140 co-located in the integrated circuit 102b.
[0049] In some embodiments, memory cells of the memory 140 may be co-located with ones of the multiplication circuit(s) 120. For example, each multiplication circuit 120 may have co-located therewith a memory cell of the memory 140 storing a path select value for controlling the multiplication circuit 120. For example, the co-located memory cells may be bitcells (e.g., holding a single bit) and/or may be groups of bitcells (e.g., holding multiple bits). In this example, the memory 140 may be configured as described in connection with FIG. 1A, with path select values stored in the co-located bitcell(s) programmed by processor 110 and provided to components of the multiplication circuit(s) 120.
[0050] In some embodiments, the multiplication circuit(s) 120 and memory 140 within the integrated circuit 102b may be formed on the same semiconductor die or on separate dies stacked and/or wire bonded and packaged together. The inventors have recognized that, regardless of whether memory cells of the memory 140 are co-located with respective ones of the multiplication circuit(s) 120, packaging the multiplication circuit(s) 120 and memory 140 together may improve system efficiency in terms of device footprint and/or the number of interfaces through which the processor 110 is coupled. In some embodiments, a single bus may be used to communicate between the processor 110 and both the multiplication circuit(s) 120 and memory 140.
[0051] FIG. 2 is a circuit diagram of an example multiplication circuit 200 that may be included in the integrated circuit 102a of FIG. 1A, according to some embodiments.
[0052] In some embodiments, the multiplication circuit 200 may be configured to receive an input signal 202 representing a first value and produce an output signal 204 representative of a second value that is a scalar multiplication of the first value. For example, one of the input signal 202 and the output signal 204 may be a voltage signal and the other of the input signal 202 and the output signal 204 may be a current signal. For instance, where the input signal 202 is a voltage signal (e.g., output by a DAC) and the output signal 204 is a current signal (e.g., output by applying the voltage signal to an impedance), the output signal 204 may have a current amplitude that represents an output value that is a scalar multiple of an input value represented in a voltage amplitude of the input signal 202.
[0053] In one illustrative example of representative value multiplication, an input signal 202 may be a voltage signal having a voltage of 1 volt (V) and an output signal 204 may be a current signal having a current of 1 milliampere (mA). For instance, the multiplication circuit 200 may have applied an impedance of 1 kilohm (kΩ) to the input signal 202 to produce the output signal 204. In this example, the range of input voltages of the input signal 202 may be from 1 V to 2 V, representing a range of neuron values from 1 to 2, and the range of output currents of the output signal 204 may be from 0.5 mA to 2 mA, representing a range of multiplication results from 1 to 4. Thus, the input signal 202 may represent a neuron value of 1 and the output signal 204 may represent a multiplication result of 2, indicating that the 1 kΩ impedance of the multiplication circuit 200 applied a connection weight multiplier of scalar value 2 to the input signal 202 to produce the output signal 204. While this example uses only nonzero voltages and currents as indicating values, other examples may use a zero voltage and/or zero current as indicating a value.
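The arithmetic of this illustrative example can be traced with a short sketch. It assumes linear mappings inferred from the stated ranges (neuron value equal to the input voltage in volts, and multiplication result equal to twice the output current in milliamperes) and an ideal Ohm's-law relationship; the function names are hypothetical and are not taken from the disclosure.

```python
def neuron_to_volts(neuron_value):
    # Assumed linear mapping: neuron values 1..2 correspond to 1 V..2 V.
    return float(neuron_value)


def milliamps_to_result(current_ma):
    # Assumed linear mapping: 0.5 mA..2 mA correspond to results 1..4.
    return 2.0 * current_ma


def multiply(neuron_value, impedance_kohm):
    # Ohm's law with volts and kilohms yields milliamperes.
    volts = neuron_to_volts(neuron_value)
    current_ma = volts / impedance_kohm
    return milliamps_to_result(current_ma)


result_1k = multiply(1, 1.0)   # 1 V across 1 kOhm gives 1 mA, i.e. result 2
result_2k = multiply(1, 2.0)   # 1 V across 2 kOhm gives 0.5 mA, i.e. result 1
```

Under these assumed mappings, the effective connection weight is 2 divided by the impedance in kilohms, so the 1 kΩ impedance corresponds to the scalar value 2 described in the example.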
[0054] In some embodiments, the multiplication circuit 200 may be controllable to set an impedance between the input and the output of the circuit 200. For example, as shown in FIG. 2, multiplication circuit 200 includes a plurality of selectable paths 206 coupled in parallel between the input and the output. In the illustrated example, each path 206 includes a switch 220. For instance, each switch 220 may be controllable to select its path 206 to be included in an overall parallel path from the input to the output of the circuit 200, with at least a portion of the input signal 202 propagating through each selected path 206 within the parallel path. As shown in the example of FIG. 2, each path 206 includes an impedance 210 coupled in series with the switch 220. Also shown in the example of FIG. 2, each path 206 includes an impedance-switch pair that includes an impedance 210 coupled in series with the switch 220. For example, by controlling the switch(es) 220 of one or more paths 206 to include the respective impedance(s) 210 in parallel between the input and the output, the overall impedance of the overall path from the input to the output may be controlled. Moreover, by using switches 220 to control selection of paths 206, impedances 210 with fixed impedance values may be used in the multiplication circuit 200 while achieving control over scalar multiplication to be obtained using the circuit.
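The relationship between the selected paths and the resulting scalar multiple can be summarized with a short worked equation. This is a sketch under simplifying assumptions that the text does not require: ideal switches with negligible on-resistance, purely resistive impedances, a voltage input, and a current output. The symbols $s_i$ (the state of the switch 220 in path $i$, 1 when selected and 0 otherwise) and $Z_i$ (the fixed impedance 210 in path $i$) are introduced here for illustration only.

```latex
I_{\mathrm{out}} = V_{\mathrm{in}} \sum_{i} \frac{s_i}{Z_i},
\qquad
Z_{\mathrm{eq}} = \left( \sum_{i} \frac{s_i}{Z_i} \right)^{-1}
\quad \text{(at least one } s_i = 1\text{)}
```

Under these assumptions, the output current is proportional to the input voltage, and the constant of proportionality, which plays the role of the scalar multiple, is set entirely by which switches are turned on.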
[0055] Using the illustrative example above, the three paths 206 shown in FIG. 2 may have impedances 210 of 1 kΩ, 2 kΩ, and 4 kΩ. To achieve the above-described multiplication of a neuron value of 1 represented in an input signal 202 having a voltage of 1 V by a scalar value connection weight of 2 to obtain a resulting output signal 204 representing a multiplication result having a current of 1 mA, only the path 206 including the impedance 210 of 1 kΩ may be selected by turning on the switch 220 connected in series with that impedance 210 to achieve the desired output. While this example selects one path 206 to apply an impedance to an input to achieve a desired output, other examples may select multiple paths to provide an overall (e.g., parallel) impedance, as this may provide more resolution (e.g., possible discrete impedance values) in setting the overall impedance.
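To illustrate the added resolution from selecting multiple paths, the following sketch enumerates every non-empty subset of the three example impedances and computes the resulting parallel impedance. It is a minimal illustration assuming ideal switches and purely resistive impedances; the function names are hypothetical.

```python
from itertools import combinations

IMPEDANCES_KOHM = (1.0, 2.0, 4.0)  # fixed impedances 210 from the example

def parallel_impedance_kohm(selected):
    """Overall impedance of the selected paths in parallel.

    Returns None when no path is selected (open circuit, no current).
    """
    if not selected:
        return None
    return 1.0 / sum(1.0 / z for z in selected)

def all_settings(impedances=IMPEDANCES_KOHM):
    """Map every non-empty subset of paths to its overall impedance."""
    settings = {}
    for count in range(1, len(impedances) + 1):
        for subset in combinations(impedances, count):
            settings[subset] = parallel_impedance_kohm(subset)
    return settings

# Print the settings from lowest to highest overall impedance.
for subset, z in sorted(all_settings().items(), key=lambda kv: kv[1]):
    print(subset, round(z, 4))
```

With three paths, seven distinct nonzero settings are available, compared with three if only one path could be selected at a time.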
[0056] In some embodiments, multiplication circuit 200 may be repeatedly programmable to select different subsets of paths 206 to obtain appropriate scalar multiple values (e.g., connection weights). For example, where fixed impedances 210 are used, scalar multiple values may be programmed in the circuit 200 by adjusting path select signals applied to the switches 220 to change which paths 206 are selected for inclusion in the overall path from the input to the output. It should be appreciated, however, that some applications may not take advantage of the reprogrammable nature of such a circuit 200, such as by fixedly coupling the circuit 200 to read-only memory (ROM) that provides the path select signals.
[0057] In some embodiments, an impedance and switch may be in series when substantially all current flowing in one of the impedance and switch flows through the other of the impedance and switch (e.g., though some insignificant amount of current may flow elsewhere due to leakage). In some embodiments, a plurality of paths (e.g., each including an impedance in series with a switch) may be coupled in parallel between an input and an output when each path necessarily has a same voltage across the path (e.g., the voltage difference between the input and output), and/or when substantially all current flowing from the input to the output is divided among the paths (e.g., though some insignificant amount of current may flow elsewhere due to leakage).
[0058] It should be appreciated that, while three paths 206 each including one switch 220 and one impedance 210 are shown in FIG. 2, any number of paths 206 may be included, and any number of switches 220 and/or impedances 210 may be included per path 206. Alternatively or additionally, a circuit 200 may be configured with sub-paths within some or all paths 206, each sub-path including a switch and an impedance. For instance, such a configuration may provide further ways of fine-tuning the desired overall impedance from the input to the output of the circuit 200.
[0059] While not shown in FIG. 2, it should be appreciated that an activation function may be applied to output signals, such as by analog processing components of or coupled to the multiplication circuit 200 (e.g., within the same integrated circuit), and/or by digital processing components (e.g., processor 110) coupled to the multiplication circuit 200.
[0060] FIG. 3 is a circuit diagram of an example array 300 of multiplication circuits 306a, 306b, 306c, 306d that may be included in the integrated circuit 102a of FIG. 1A, according to some embodiments.
[0061] In some embodiments, each multiplication circuit 306a, 306b, 306c, and 306d may be configured in the manner described herein for circuit 200. For example, in FIG. 3, each multiplication circuit 306a, 306b, 306c, and 306d includes a plurality of paths, each path including a switch 320 and an impedance 310.
[0062] In some embodiments, multiplication circuits of the array 300 may be configured to perform multiply-accumulate (MAC) operations. For example, multiplication circuits of the array 300 may have inputs and/or outputs coupled together. In the example illustrated in FIG. 3, the inputs of multiplication circuits 306a and 306b are coupled together to receive an input signal 302a and the inputs of multiplication circuits 306c and 306d are coupled together to receive an input signal 302b. Also shown in FIG. 3, the outputs of multiplication circuits 306a and 306c are coupled together to produce an output signal 304a and the outputs of multiplication circuits 306b and 306d are coupled together to produce an output signal 304b. In the illustrated example, the output signal 304a may represent a sum of a first product obtained from the output of the multiplication circuit 306a together with a second product obtained from the output of the multiplication circuit 306c. For instance, where the multiplication circuits 306a and 306c output current signals, coupling the outputs of the circuits 306a and 306c may serve to combine the current signals, which may represent summing the values represented in the current signals. Moreover, in the illustrated example, each output signal 304a and 304b may represent a respective sum given by:
$$O_{a,b} = X_{a,b} \cdot I_a + Y_{c,d} \cdot I_b \qquad (1)$$
where $O_{a,b}$ denotes the sum represented in the respective output signal 304a or 304b, $I_a$ is the neuron value represented in the input signal 302a, $I_b$ is the neuron value represented in the input signal 302b, $X_{a,b}$ is the scalar multiplier programmed into the respective multiplication circuit 306a or 306b, and $Y_{c,d}$ is the scalar multiplier programmed into the respective multiplication circuit 306c or 306d.
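As a numerical illustration of the sums described for the array 300, the following sketch computes one sum per output from a set of neuron values and scalar multipliers. The function name and the numeric values are hypothetical and chosen only for illustration; the sketch models the represented values, not the underlying voltages and currents.

```python
def mac_outputs(inputs, weights):
    """Return one multiply-accumulate sum per output column.

    inputs  -- neuron values, one per input line
    weights -- weights[r][c] is the scalar multiplier of the circuit
               that couples input r to output c
    """
    n_out = len(weights[0])
    return [sum(inputs[r] * weights[r][c] for r in range(len(inputs)))
            for c in range(n_out)]

# Hypothetical values. Rows correspond to input signals 302a and 302b;
# columns correspond to output signals 304a and 304b. Row 0 holds the
# multipliers of circuits 306a and 306b, row 1 those of 306c and 306d.
weights = [[2.0, 1.0],
           [0.5, 4.0]]
inputs = [1.0, 2.0]

print(mac_outputs(inputs, weights))  # [3.0, 9.0]
```

Here the first output is 1 × 2 + 2 × 0.5 = 3 and the second is 1 × 1 + 2 × 4 = 9, matching the form of the sum in which each output combines one product from each input.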
[0063] In some embodiments, switches 320 of the multiplication circuits 306a, 306b, 306c, and 306d may be configured to receive path select signals that select a subset of one or more paths through the circuit. For example, multiplication circuit 306a is shown configured to receive path select signals 1X, 1Y, and 1Z, multiplication circuit 306b is shown configured to receive path select signals 2X, 2Y, and 2Z, multiplication circuit 306c is shown configured to receive path select signals 3X, 3Y, and 3Z, and multiplication circuit 306d is shown configured to receive path select signals 4X, 4Y, and 4Z for controlling respective switches 320. In some embodiments, the path select signals may be provided by a memory, such as with each path select signal (e.g., 1X) being stored in a memory cell of the memory.
[0064] In some embodiments, impedances 310 within a multiplication circuit, such as circuit 306a, may be weighted with respect to one another. For example, FIG. 3 shows a binary-weighted configuration in which each of the three impedances of the circuit 306a is double the next smaller impedance. For instance, a binary-weighted configuration of impedances of 1 kΩ, 2 kΩ, and 4 kΩ may be used to achieve the above-described multiplication by selecting the 1 kΩ path to obtain an overall impedance of 1 kΩ.
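One consequence of binary weighting can be seen in a short sketch: because conductances of parallel paths add, a three-bit select code yields evenly spaced conductance steps. The code below is illustrative only; the assignment of the most significant bit to the 1 kΩ path and the function names are assumptions, and ideal switches are assumed.

```python
IMPEDANCES_KOHM = (1.0, 2.0, 4.0)  # binary-weighted, as in the example

def conductance_ms(select_bits):
    """Total conductance in millisiemens for one select bit per path."""
    return sum(bit / z for bit, z in zip(select_bits, IMPEDANCES_KOHM))

def bits_from_code(code):
    """Split a 3-bit code (0..7) into select bits, most significant first.

    The most significant bit is assumed to drive the 1 kOhm path.
    """
    return ((code >> 2) & 1, (code >> 1) & 1, code & 1)

for code in range(8):
    bits = bits_from_code(code)
    print(code, bits, conductance_ms(bits))
```

Each increment of the code adds 0.25 mS, so the output current for a given input voltage scales linearly with the stored code under these assumptions.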
[0065] While four multiplication circuits 306a, 306b, 306c, and 306d are shown in the array 300 of FIG. 3 by way of illustration, any number of multiplication circuits may be included in an array. Moreover, while all four multiplication circuits 306a, 306b, 306c, and 306d are shown in FIG. 3 interconnected either at the input or output, it should be appreciated that in some implementations, multiplication circuits within an array may be interconnected to some other multiplication circuits in the array and separate from other multiplication circuits in the array. For example, in some embodiments, the four multiplication circuits shown in FIG. 3 may comprise a subarray within a larger array where each subarray is programmed to perform a MAC operation. Further alternatively or additionally, operations other than MAC operations may be performed using multiplication circuits described herein, such as multiply-add (MAD) operations and/or solely multiplication operations.
[0066] FIG. 4 is a top view of a portion of an example integrated circuit 400 including an array 402 of multiplication circuits that may be included in the system 100a of FIG. 1A, according to some embodiments.
[0067] In some embodiments, the array 402 of multiplication circuits may be configured in the manner described herein for array 300 of FIG. 3. For example, each multiplication circuit of the array may include a plurality of paths, each path including a switch (e.g., 320) and an impedance (e.g., 310). In the illustrated embodiment, the array 402 includes four multiplication circuits, with one multiplication circuit 406a indicated by a dashed box.
[0068] In some embodiments, conductive portions of the integrated circuit 400 may be coupled to inputs and outputs of the integrated circuit 400 and to the multiplication circuits. For example, in FIG. 4, the integrated circuit 400 has two first conductive portions 412 respectively coupled to inputs 410a and 410b and six second conductive portions 422, three coupled to output 420a and three coupled to output 420b. For instance, each first conductive portion 412 may be coupled to a respective row of multiplication circuits of the array 402 and each second conductive portion 422 may be coupled to a respective column of multiplication circuits of the array 402. In the illustrated embodiment, multiplication circuit 406a is coupled to input 410a and output 420a, with output 420a configured as the sum of outputs of the left two multiplication circuits of the array 402 and output 420b configured as the sum of outputs of the right two multiplication circuits.
[0069] In some embodiments, each multiplication circuit may be configured to receive one or more path select signals for selecting a respective one or more switches of the circuit. For example, as shown in FIG. 4, the integrated circuit includes switch-control conductive portions 430 coupled to respective paths of the multiplication circuits. In the illustrated embodiment, the multiplication circuit 406a includes three paths, each coupled to input 410a by a first conductive portion 412 and to output 420a via respective second conductive portions 422, and each coupled to a respective switch-control conductive portion 430. In some embodiments, the switch-control conductive portions 430 may be electrically isolated from one another (e.g., with little to no communicative coupling therebetween) such that each switch may receive its own individual path select signal. In contrast, in some embodiments, at least some of the first conductive portions 412 and/or second conductive portions 422 may be coupled to one another, such as shown in FIG. 4 where the left three second conductive portions 422 are coupled to one another to combine current signals into an output 420a.
[0070] In some embodiments, the multiplication circuits of the array 402 may have a scalable structure that is reprogrammable when coupled to a memory. For example, the multiplication circuits may be formed as a subarray of unit cell circuits of an array of unit cell circuits, each unit cell circuit having an impedance and a switch, with the impedance being individually configured for that unit cell circuit during manufacture. In this example, interconnections between the unit cell circuits, such as first conductive portions 412 and second conductive portions 422, may be configured during manufacture to divide the unit cell circuits among multiplication circuits, such as with impedances within each circuit being different from one another. At the same time, in this example, the scalar multiple value provided by each multiplication circuit may be reprogrammed by changing the values provided to each switch via the switch-control conductive portions 430.
[0071] In some embodiments, the integrated circuit 400 may have a first layer including the first conductive portions 412 and a second layer including the second conductive portions 422. For example, the first conductive portions 412 may be coupled to respective input terminals of the integrated circuit (not shown) and the second conductive portions 422 may be coupled to one or more output terminals of the integrated circuit (not shown), such as depending on whether and/or how many of the second conductive portions 422 are connected together to sum the current signals therein. In some embodiments, the array 402 of multiplication circuits (and/or a portion thereof) may form at least a portion of a programmable impedance circuit. For example, at least one of the multiplication circuits may have impedances (e.g., 310) coupled in parallel between an input (e.g., 410a via one of the first conductive portions 412) and an output (e.g., 420a via one of the second conductive portions 422). In some embodiments, switches (e.g., 320) may be coupled in series with respective ones of the impedances between the input and the output (e.g., first and second conductive portions 412 and 422).
[0072] In some embodiments, a layer may have multiple first conductive portions 412, such as configured to provide respective inputs 410a and 410b. For example, each first conductive portion 412 may be coupled to a respective input terminal. In some embodiments, another layer may have multiple second conductive portions 422, such as coupled together to provide output 420a to a same output terminal, or electrically isolated and configured to provide respective outputs 420a and 420b to respective output terminals. In some embodiments, where multiple second conductive portions 422 are configured to provide respective outputs 420a and 420b to respective output terminals, a first programmable impedance circuit (e.g., multiplication circuit 406a) may be coupled between one of the first conductive portions 412 (e.g., to receive input 410a) and at least one of the second conductive portions 422 (e.g., the left three second conductive portions 422 to provide output 420a), and a second programmable impedance circuit (e.g., including the top-right and/or bottom-right three unit cell circuits) may be coupled between one of the first conductive portions 412 (e.g., to receive input 410a and/or 410b) and at least one of the second conductive portions 422 (e.g., the right three second conductive portions 422 to provide output 420b).
[0073] While FIG. 4 shows each row of the array coupled to the same input and each column of the array coupled to the same output, it should be appreciated that rows and/or columns may have multiplication circuits respectively coupled to multiple inputs and/or outputs. As one example, a row of an array may have a first conductive portion extending from an end of the row to the middle of the row and another first conductive portion extending from the opposite end of the row to the middle of the row. Likewise, a column of an array may have a second conductive portion extending from an end of the column to the middle of the column and another second conductive portion extending from the opposite end of the column to the middle of the column.
[0074] In addition, while FIG. 4 shows each multiplication circuit having three unit cell circuits of the array 402, a multiplication circuit may have any number of unit cell circuits, and arrays may include multiplication circuits having different numbers of unit cell circuits.
[0075] FIG. 5 is a top view of a portion of an example integrated circuit 500 including an array 502 of multiplication circuits that may be included in the system 100a of FIG. 1A, according to some embodiments.
[0076] In some embodiments, the integrated circuit 500 may be configured in the manner described herein for the integrated circuit 400. For example, as shown in FIG. 5, the integrated circuit 500 includes an array 502 of multiplication circuits that are configured to receive inputs and provide outputs. In the illustrated embodiment, the integrated circuit 500 has three first conductive portions 510 configured to provide three respective voltage signals V1, V2, and V3 as inputs to the array 502, and three second conductive portions 520 configured to receive three respective current signals I1, I2, and I3 as outputs of the array 502. For instance, each multiplication circuit of the array 502 as shown in FIG. 5 may have a single path with a switch and an impedance, such that the available subsets of paths for selection via the switch-control conductors (not shown) include the single path or no path. Alternatively, the array 502 shown in FIG. 5 may be configured to include three multiplication circuits, each configured to receive one of the voltage signals V1, V2, and V3 and output a current signal that is combined by summing the three current signals I1, I2, and I3 from the second conductive portions 520, in like fashion to multiplication circuits 306a and 306c. Further alternatively, each multiplication circuit in the array may have multiple paths configured as described below in connection with FIG. 7C.
[0077] In some embodiments, the multiplication circuits and the conductive portions 510 and 520 may be on different layers of the integrated circuit 500. For example, as shown in FIG. 5, the first conductive portions 510 may be on a first layer, the second conductive portions may be on a second layer, and the multiplication circuits may be disposed on yet another layer. In the illustrated embodiment, the multiplication circuits may be disposed on a layer that is between the first and second layers having the first conductive portions 510 and second conductive portions 520, respectively. It should be appreciated, however, that the layers may be configured differently. For instance, the first conductive portions 510 may be on a layer that is between the layer having the multiplication circuits and the layer having the second conductive portions 520, and/or the second conductive portions 520 may be on a layer that is between the layer having the multiplication circuits and the layer having the first conductive portions 510. It should also be appreciated that not all first conductive portions 510, second conductive portions 520, and/or multiplication circuits need be confined to a single layer each.
[0078] FIG. 6A is a top view of a portion of an example integrated circuit 600 including an array 602 of resistor-based multiplication circuits that may be included in the system 100a of FIG. 1A, according to some embodiments. FIG. 6B is a top view of the portion of the example integrated circuit 600 of FIG. 6A with the first and second conductive portions 612, 622 hidden from view, according to some embodiments.
[0079] In some embodiments, the integrated circuit 600 may be configured as described herein for the integrated circuit 400. For example, as shown in FIG. 6A, the integrated circuit 600 is configured to provide two inputs 610a, 610b via respective first conductive portions 612 to an array 602 of multiplication circuits (FIG. 6B) to obtain an output 620 via second conductive portions 622. As shown in FIG. 6B, the array 602 includes two multiplication circuits, each with a row of three unit cell circuits 630, and each unit cell circuit 630 including a resistor 632 and a transistor 634 that is coupled to a respective switch-control conductive portion 604. In the illustrated embodiment, the transistor 634 of each unit cell circuit has a first channel terminal (e.g., drain) coupled to a first conductive portion 612, a second channel terminal (e.g., source) coupled to a first end of the resistor 632, and a control terminal (e.g., gate) coupled to the switch-control conductive portion 604, and a second end of the resistor 632 is coupled to a second conductive portion 622. In some embodiments, the transistors 634 may be metal-oxide-semiconductor field effect transistors (MOSFETs), such as thin-film transistors (TFTs), although other transistors may be used, such as bipolar junction transistors (BJTs). In some embodiments, the resistors 632 may include meandering conductive traces that provide a fixed resistance value, although other types of resistors and/or impedances may be used. In some cases, impedances may have non-negligible inductance and/or capacitance values, whereas in other cases, impedances may be substantially entirely resistive, resulting in resistor-based multiplication circuits.
[0080] In some embodiments, first conductive portions 612 and/or second conductive portions 622 may at least partially overlap, in a direction normal to the array 602, with at least one unit cell circuit 630. For example, as shown in FIGs. 6A-6B, the array 602 has row and column dimensions that may form the plane of the array 602 (e.g., parallel to a plane of a wafer on which the array may be formed), and the first conductive portion 612 configured to provide the input 610a to the identified unit cell circuit 630 overlaps with (e.g., overlies or underlies) a portion of the unit cell circuit 630 in a direction normal to the plane of the array 602. Likewise, as shown in FIGs. 6A-6B, the second conductive portion 622 configured to obtain at least a portion of the output 620 from the identified unit cell circuit 630 overlaps with (e.g., overlies or underlies) a portion of the unit cell circuit 630 in a direction normal to the plane of the array 602. For instance, in FIGs. 6A-6B, the first conductive portion 612 and/or the second conductive portion 622 configured to provide the input 610a to and/or obtain at least a portion of the output 620 from the identified unit cell circuit 630 may be connected to the unit cell circuit 630 by a respective via elongated parallel to a direction normal to the plane of the array 602. Moreover, as shown in FIGs. 6A-6B, a first conductive portion may at least partially overlap with more than one unit cell circuit 630, such as the first conductive portions 612 in FIGs. 6A-6B each overlapping with a row of unit cell circuits 630 and the second conductive portions 622 in FIGs. 6A-6B each overlapping with a column of unit cell circuits 630.
[0081] While FIGs. 6A-6B show only partial overlap of unit cell circuits 630 with first conductive portions 612 and second conductive portions 622, more or less overlap, such as total overlap, may be implemented. For instance, in the example array of FIG. 5, the first conductive portions 510 and/or the second conductive portions 520 may be configured to fully or at least substantially overlap with unit cell circuits of the array 502, such as at a point of overlap, in a direction normal to the array 502, of the first conductive portion 510, the second conductive portion 520, and electrodes of the unit cell circuit connected to the respective conductive portions.
[0082] While FIGs. 6A-6B show a single transistor per unit cell circuit, it should be appreciated that alternative or additional transistors may be included per unit cell circuit. For example, at least some unit cell circuits may alternatively or additionally include a transistor configured as a row and/or column select transistor, such as with the channel coupled between the resistor and a conductive portion (e.g., in series with the illustrated transistor) and the control terminal coupled to a metal portion to receive a row and/or column select signal. For instance, a row and/or column select transistor may permit inputs to the integrated circuit (e.g., from a processor 110) to include input signals and row and/or column select signals as an alternative or in addition to inputs including respective input signals for each multiplication circuit and/or unit cell circuit, permitting flexibility of operation and in implementation of the interface with the integrated circuit. Moreover, including such transistors may advantageously improve computing accuracy, such as by preventing current from leaking through unselected unit cell circuits.
[0083] FIG. 7A is a perspective view of a resistor 700 that may be included in the resistor-based multiplication circuits of FIGs. 6A-6B, according to some embodiments. FIG. 7B is a top view of the conductive trace 702 and electrodes 704a and 704b of the resistor 700 of FIG. 7A, according to some embodiments.
[0084] In some embodiments, the resistor 700 may be used in unit cell circuits 630 of the integrated circuit 600. As shown in FIGs. 7A-7B, the resistor 700 has a conductive trace 702 that meanders from a first electrode 704a to a second electrode 704b to provide resistance between a first conductive portion 710a coupled to the first electrode 704a and a second conductive portion 710b coupled to the second electrode 704b. For instance, the first conductive portion 710a may be configured to provide an input signal and the second conductive portion 710b may be configured to obtain an output signal, although in some embodiments the first and/or second electrode 704a, 704b may be coupled to a transistor that is, in turn, coupled to a conductive portion. In the illustrated embodiment, the meandering conductive trace 702 may be fixed in position when manufactured to provide a fixed resistance in a unit cell circuit of a multiplication circuit. According to various embodiments, the meandering conductive trace 702 may be formed using gold, copper, or other suitable metal, whereas in further embodiments, the trace 702 (or like resistance-fixing component coupled between a pair of electrodes) may be formed using semiconductor material.
[0085] FIG. 7C is a top view of an alternative example resistive structure that may be included in the resistor-based multiplication circuits of FIGs. 6A-6B, according to some embodiments.
[0086] In some embodiments, the resistive structure shown in FIG. 7C may be used in unit cell circuits 630 of the integrated circuit 600. As shown in FIG. 7C, the unit cell circuit includes a plurality of resistors in series with respective transistors and sharing an electrode, according to some embodiments. The resistors shown in FIG. 7C differ from the resistor 700 in that they are coupled between a plurality of switches, respectively, and a common electrode.
[0087] As shown in FIG. 7C, the unit cell circuit includes four fixed impedances in series with respective switches. For example, the unit cell circuit of FIG. 7C includes four conductive traces, Trace 1, Trace 2, Trace 3, and Trace 4, each coupled in series with a respective one of Switch 1, Switch 2, Switch 3, and Switch 4. In the illustrated example, each of Trace 1, Trace 2, Trace 3, and Trace 4 terminates at one end at a respective electrode that is coupled to the respective switch of Switch 1, Switch 2, Switch 3, and Switch 4. Also shown in FIG. 7C, each of Trace 1, Trace 2, Trace 3, and Trace 4 terminates at another end at a common electrode. For example, each of Trace 1, Trace 2, Trace 3, and Trace 4 may have a different fixed impedance value, such as due to different meandering trace geometries. It should be appreciated that, in some embodiments, at least some traces within a unit cell circuit may have the same fixed impedance value. Moreover, in some embodiments, when disposed in a row and/or column, unit cell circuits with the resistive structure of FIG. 7C may have different fixed impedances (e.g., combinations of fixed impedances) along at least a part of each row and/or column, depending on the application.
[0088] In some embodiments, the resistive structure of FIG. 7C, when configured within a unit cell circuit, may be coupled to and between one or more first conductive portions and a second conductive portion. For example, channel terminals of Switch 1, Switch 2, Switch 3, and Switch 4 may be coupled to one first conductive portion and the common electrode may be coupled to the second conductive portion, such that the series pairs of fixed impedances and switches are coupled in parallel with one another. For instance, control terminals of Switch 1, Switch 2, Switch 3, and Switch 4 may be configured to receive respective signals selecting the respective one of Trace 1, Trace 2, Trace 3, and Trace 4, which may set the parallel impedance the unit cell circuit provides between the first conductive portion and the second conductive portion. Alternatively or additionally, at least some channel terminals of Switch 1, Switch 2, Switch 3, and Switch 4 may be coupled to respective first conductive portions. For instance, control terminals of Switch 1, Switch 2, Switch 3, and Switch 4 may be configured to receive respective signals selecting the respective one of Trace 1, Trace 2, Trace 3, and Trace 4, which may select which first conductive portion(s) to couple to. It should be appreciated that, according to various embodiments, one or more input signals may be received at Switch 1, Switch 2, Switch 3, and Switch 4, and/or at the common electrode, and/or one or more output signals may be provided at Switch 1, Switch 2, Switch 3, and Switch 4, and/or at the common electrode, depending on the application. In some cases, an integrated circuit may be reconfigurable (e.g., based on interaction with input and output terminals of the integrated circuit) to use the same unit cell circuit in either configuration.
[0089] FIG. 8 is a block diagram of an example unit cell circuit 800 with a selectable path 806 and co-located memory cell 830 that may be included in an array of multiplication circuits within the system 100b of FIG. 1B, according to some embodiments.
[0090] In some embodiments, the unit cell circuit 800 may be configured in the manner described herein for the unit cell circuit 630 shown in FIG. 6B. For example, the unit cell circuit 800 is shown in FIG. 8 including a selectable path 806 configured to receive an input 802 and provide an output 804, with the path including a switch 820 and an impedance 810. In addition, the unit cell circuit 800 may include a memory cell 830 co-located with the path 806. For example, the memory cell 830 may include a single bitcell for the single path 806. In the illustrated embodiment, the memory cell 830 is configured to provide a path select signal to the switch 820 in the path 806 and to receive a write enable signal 808a and a write value 808b, both of which may be received from a processor to set the value(s) for path selection in the unit cell circuit 800.
[0091] While the unit cell circuit 800 is shown in FIG. 8 including only a single path 806, it should be appreciated that any number of paths may be included, such as multiple paths, which may be configured as shown for the unit cell circuit of FIG. 7C.
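The interaction between the memory cell 830, the switch 820, and the impedance 810 can be sketched behaviorally as follows. This is a hypothetical software model for illustration only, assuming an ideal switch and a single-bit memory cell; the class and method names are not taken from the disclosure.

```python
class UnitCell:
    """Hypothetical behavioral model of one selectable path with a
    co-located one-bit memory cell."""

    def __init__(self, impedance_kohm):
        self.impedance_kohm = impedance_kohm  # fixed impedance 810
        self.bit = 0                          # contents of memory cell 830

    def write(self, write_enable, write_value):
        """Store write_value only while write_enable is asserted."""
        if write_enable:
            self.bit = 1 if write_value else 0

    def output_ma(self, input_volts):
        """Current through the path in mA; zero when the switch is off
        (an ideal switch with no leakage is assumed)."""
        if not self.bit:
            return 0.0
        return input_volts / self.impedance_kohm

cell = UnitCell(impedance_kohm=2.0)
cell.write(write_enable=False, write_value=1)  # ignored: enable is low
off_current = cell.output_ma(1.0)
cell.write(write_enable=True, write_value=1)   # path is now selected
on_current = cell.output_ma(1.0)
print(off_current, on_current)  # 0.0 0.5
```

A multi-path unit cell could be modeled the same way with one stored bit per path, with the path currents summed at the output.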
[0092] FIG. 9 is a block diagram of an example integrated circuit 900 including an array 902 of unit cell circuits with selectable paths 806 and co-located memory cells 830, which may be included within the system 100b of FIG. 1B, according to some embodiments.
[0093] In some embodiments, the integrated circuit 900 may be configured as described herein for the integrated circuit 400. For example, as shown in FIG. 9, the integrated circuit 900 is configured to provide three inputs 910a, 910b, and 910c to the array 902 of multiplication circuits and to receive three outputs 920a, 920b, and 920c from the array 902. In the illustrated embodiment, three first conductive portions (not shown, e.g., 412) may be coupled to the array 902 to provide the inputs 910a, 910b, and 910c and three second conductive portions (not shown, e.g., 422) may be coupled to the array 902 to receive the outputs 920a, 920b, 920c. Also in the illustrated embodiment, three conductive portions may be coupled to the array 902 to provide respective write values 904a, 904b, and 904c for writing the memory cells 830 of each unit cell circuit 800, and three conductive portions may be coupled to the array 902 to provide write enable signals 902a, 902b, and 902c to the memory cells 830.
[0094] In some embodiments, the array 902 of multiplication circuits may be configured to operate as described herein for the array 402 of the integrated circuit 400. For example, as shown in FIG. 9, the unit cell circuits 800 of the array 902 may be interconnected to form multiplication circuits, such as with the top row of three unit cell circuits 800 in FIG. 9 forming a multiplication circuit receiving an input 910a and providing outputs 920a, 920b, and 920c that may be combined. For instance, the impedances 810 may be configured as resistors (e.g., 632) and the switches 820 may be configured as transistors (e.g., 634) in a manner similar to the unit cell circuits 630. In contrast to the array 602, however, switch-control conductive portions (not shown) may be located entirely within the unit cell circuit 800 to connect the memory cell 830 directly to the switch 820. In some embodiments, the memory cells 830 of the array may be formed on one or more layers different from the switches 820 and/or impedances 810 and connected to the switches 820 through vias.
[0095] It should be appreciated that, while memory cells 830 are shown co-located with unit cell circuits 800 in FIGs. 8-9, there may be applications in which a memory is built into the same integrated circuit as the array of multiplication circuits but memory cells are not co-located with the unit cell circuits.
[0096] FIG. 10 is a block diagram of the integrated circuit 900 of FIG. 9 illustrating multiple example configurations 1040a, 1040b, and 1040c of the unit cell circuits 800, according to some embodiments.
[0097] FIG. 10 illustrates multiple programmable options for grouping the unit cell circuits 800. In a first configuration 1040a, the top row of three unit cell circuits 800 may form a multiplication circuit with each unit cell circuit configured to receive the same input signal 1010a and provide (at least a portion of) output signals 1020a, 1020b, and 1020c combinable to obtain an output from the multiplication circuit. In a second configuration 1040b, the left two unit cell circuits 800 of the first two rows may form a multiplication circuit with the top two unit cell circuits 800 configured to receive a first input signal 1010a and the bottom two unit cell circuits 800 configured to receive a second input signal 1010b, and the output signal 1020a includes a sum of outputs from the left two unit cell circuits 800 and the output signal 1020b includes a sum of outputs from the right two unit cell circuits 800. In yet a third configuration 1040c, the left column of three unit cell circuits 800 may form a multiplication circuit with each unit cell circuit 800 configured to receive a respective one of input signals 1010a, 1010b, and 1010c and the output signal 1020a includes a sum of outputs from all three unit cell circuits 800.
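The three groupings of paragraph [0097] can be pictured as select-bit masks over a 3x3 array. The sketch below assumes that row i receives input i and that column j sums into output j; the conductance values, the input values, and the function name `column_outputs` are hypothetical.

```python
def column_outputs(conductances, mask, inputs):
    """Output j is the sum over rows i of mask[i][j] * G[i][j] * inputs[i]."""
    n_rows, n_cols = len(conductances), len(conductances[0])
    return [
        sum(mask[i][j] * conductances[i][j] * inputs[i] for i in range(n_rows))
        for j in range(n_cols)
    ]


# Fixed conductances (arbitrary units); these never change between configurations.
G = [
    [1.0, 2.0, 4.0],
    [1.0, 2.0, 4.0],
    [1.0, 2.0, 4.0],
]
inputs = [1.0, 2.0, 3.0]  # stand-ins for input signals 1010a, 1010b, 1010c

config_a = [[1, 1, 1], [0, 0, 0], [0, 0, 0]]  # top row of three cells
config_b = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]  # 2x2 block: left two cells of first two rows
config_c = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]  # left column of three cells

out_a = column_outputs(G, config_a, inputs)
out_b = column_outputs(G, config_b, inputs)
out_c = column_outputs(G, config_c, inputs)
```

In this model only the mask differs between the three calls, which mirrors the idea that the grouping is selected by stored values rather than by changes to the impedances.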
[0098] In some embodiments, all three options for grouping described herein for FIG. 10 may be programmed and reprogrammed into the same integrated circuit. For example, without changing any impedance values or locations of conductive portions of the integrated circuit, memory cells may be programmed to select impedances of unit cell circuits to be used and not to select impedances of unit cell circuits not to be used. It should be appreciated that this reprogrammability is not limited to embodiments having memory in the integrated circuit with the multiplication circuits, as embodiments that may receive memory values from an external memory may be similarly reprogrammed.
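Reprogramming as described in paragraph [0098] can be sketched as rewriting memory bits while the conductances stay fixed. The write scheme below is an assumption made for illustration only: each write-enable signal is taken to gate one row, and the write values are taken to supply one bit per column. The disclosure does not specify this mapping, and the class `ProgrammableArray` is hypothetical.

```python
class ProgrammableArray:
    def __init__(self, conductances):
        # Fixed impedances: stored as tuples so later writes cannot alter them.
        self._g = tuple(tuple(row) for row in conductances)
        self._bits = [[0] * len(row) for row in self._g]

    def write(self, write_values, write_enables):
        """Assumed scheme: enable i gates row i; write_values gives one bit per column."""
        for i, enabled in enumerate(write_enables):
            if enabled:
                self._bits[i] = [1 if v else 0 for v in write_values]

    def outputs(self, inputs):
        """Output j sums the selected cells of column j, each scaled by its row input."""
        n_rows, n_cols = len(self._g), len(self._g[0])
        return [
            sum(self._bits[i][j] * self._g[i][j] * inputs[i] for i in range(n_rows))
            for j in range(n_cols)
        ]


array = ProgrammableArray([[1.0, 2.0, 4.0], [1.0, 2.0, 4.0], [1.0, 2.0, 4.0]])
inputs = [1.0, 2.0, 3.0]

# Program the top row only, then read the outputs.
array.write(write_values=[1, 1, 1], write_enables=[1, 0, 0])
first = array.outputs(inputs)

# Reprogram every row so that only the left column is selected.
array.write(write_values=[1, 0, 0], write_enables=[1, 1, 1])
second = array.outputs(inputs)
```

The same object yields different outputs before and after the second write, with no change to the stored conductances. An external memory supplying the same bits would behave identically in this model.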
[0099] The above-described embodiments can be implemented in any of numerous ways. One or more aspects and embodiments of the present disclosure involving the performance of processes or methods may utilize program instructions executable by a device (e.g., a computer, a processor, or other device) to perform, or control performance of, the processes or methods. In this respect, various inventive concepts may be embodied as a computer readable storage medium (or multiple computer readable storage media) (e.g., a computer memory, one or more floppy discs, compact discs, optical discs, magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, or other tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement one or more of the various embodiments described above. The computer readable medium or media can be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various ones of the aspects described above. In some embodiments, computer readable media may be non-transitory media.
[0100] The terms “program” or “software” are used herein in a generic sense to refer to any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects as described above. Additionally, it should be appreciated that according to one aspect, one or more computer programs that when executed perform methods of the present disclosure need not reside on a single computer or processor, but may be distributed in a modular fashion among a number of different computers or processors to implement various aspects of the present disclosure.
[0101] Computer-executable instructions may be in many forms, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically the functionality of the program modules may be combined or distributed as desired in various embodiments.
[0102] Also, data structures may be stored in computer-readable media in any suitable form. For simplicity of illustration, data structures may be shown to have fields that are related through location in the data structure. Such relationships may likewise be achieved by assigning storage for the fields with locations in a computer-readable medium that convey relationship between the fields. However, any suitable mechanism may be used to establish a relationship between information in fields of a data structure, including through the use of pointers, tags or other mechanisms that establish relationship between data elements.
[0103] When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers.
[0104] Further, it should be appreciated that a computer may be embodied in any of a number of forms, such as a rack-mounted computer, a desktop computer, a laptop computer, or a tablet computer, as non-limiting examples. Additionally, a computer may be embedded in a device not generally regarded as a computer but with suitable processing capabilities, including a Personal Digital Assistant (PDA), a smartphone or any other suitable portable or fixed electronic device.
[0105] Also, a computer may have one or more input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards, and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in other audible formats.
[0106] Such computers may be interconnected by one or more networks in any suitable form, including a local area network or a wide area network, such as an enterprise network, an intelligent network (IN), or the Internet. Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.
[0107] Also, as described, some aspects may be embodied as one or more methods. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.
[0108] While several embodiments of the present disclosure have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the present disclosure. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings of the present disclosure is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the disclosure described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, the disclosure may be practiced otherwise than as specifically described and claimed. The present disclosure is directed to each individual feature, system, article, material, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, and/or methods, if such features, systems, articles, materials, and/or methods are not mutually inconsistent, is included within the scope of the present disclosure.
[0109] The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”
[0110] The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Other elements may optionally be present other than the elements specifically identified by the “and/or” clause, whether related or unrelated to those elements specifically identified unless clearly indicated to the contrary. Thus, as a non-limiting example, a reference to “A and/or B,” when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A without B (optionally including elements other than B); in another embodiment, to B without A (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
[0111] As used herein in the specification and in the claims, “or” should be understood to have the same meaning as “and/or” as defined above. For example, when separating items in a list, “or” or “and/or” shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as “only one of” or “exactly one of,” or, when used in the claims, “consisting of,” will refer to the inclusion of exactly one element of a number or list of elements. In general, the term “or” as used herein shall only be interpreted as indicating exclusive alternatives (i.e. “one or the other but not both”) when preceded by terms of exclusivity, such as “either,” “one of,” “only one of,” or “exactly one of.” “Consisting essentially of,” when used in the claims, shall have its ordinary meaning as used in the field of patent law.
[0112] As used herein in the specification and in the claims, the phrase “at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase “at least one” refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, “at least one of A and B” (or, equivalently, “at least one of A or B,” or, equivalently “at least one of A and/or B”) can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
[0113] In the claims, as well as in the specification above, all transitional phrases such as “comprising,” “including,” “carrying,” “having,” “containing,” “involving,” “holding,” and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases “consisting of” and “consisting essentially of” shall be closed or semi-closed transitional phrases, respectively, as set forth in the United States Patent Office Manual of Patent Examining Procedures, Section 2111.03.
Claims
1. An integrated circuit, comprising: an input terminal; a first conductive portion that is coupled to the input terminal; an output terminal; a second conductive portion that is coupled to the output terminal; and a programmable multiplication circuit comprising: a plurality of impedances coupled in parallel between the first conductive portion and the second conductive portion and having fixed impedance values; and a plurality of switches coupled in series with respective ones of the plurality of impedances between the first conductive portion and the second conductive portion.
2. The integrated circuit of claim 1, wherein the plurality of impedances comprise resistors having fixed resistance values.
3. The integrated circuit of claim 1 or 2, further comprising: a plurality of switch-control conductive portions that are electrically isolated from one another, wherein the plurality of switches comprises a plurality of transistors each having: a first channel terminal coupled to the first conductive portion; a second channel terminal coupled to the second conductive portion via the programmable multiplication circuit; and a gate terminal coupled to a respective one of the plurality of switch-control conductive portions.
4. The integrated circuit of any one of claims 1 to 3, further comprising: a third conductive portion coupled to the input terminal or to a second input terminal; a fourth conductive portion coupled to the output terminal or to a second output terminal; and a second programmable impedance circuit, comprising: a second plurality of impedances coupled in parallel between the third conductive portion and the fourth conductive portion and having fixed impedance values; and a second plurality of switches coupled in series with respective ones of the second plurality of impedances between the third conductive portion and the fourth conductive portion.
5. The integrated circuit of claim 4, wherein the fourth conductive portion is coupled to the second output terminal.
6. The integrated circuit of claim 4 or 5, wherein the third conductive portion is coupled to the second input terminal.
7. The integrated circuit of any one of claims 1-5, wherein the first and second conductive portions are disposed on different respective layers of the integrated circuit.
8. The integrated circuit of any one of claims 1 to 7, further comprising: a memory comprising a plurality of memory cells coupled to respective switches of the plurality of switches.
9. The integrated circuit of claim 8, wherein the plurality of impedances and the plurality of switches comprise a plurality of impedance-switch pairs, with ones of the plurality of memory cells co-located with respective ones of the plurality of impedance-switch pairs.
10. The integrated circuit of claim 8 or 9, wherein the memory is configured as random-access memory (RAM).
11. The integrated circuit of claim 1, wherein: the plurality of impedances and the plurality of switches are arranged in an array of unit cell circuits, each unit cell circuit of at least a subarray of the array of unit cell circuits comprises a respective one of the plurality of impedances coupled in series with a respective one of the plurality of switches, and the first and second conductive portions at least partially overlap, in a direction normal to the array, with at least one of the unit cell circuits of the subarray.
12. A multiplication circuit, comprising: an input terminal configured to receive an input signal; an output terminal configured to provide an output signal; and a plurality of selectable paths coupled in parallel between the input terminal and the output terminal, the plurality of selectable paths configured to generate the output signal as a multiplication of the input signal by a scalar value, wherein the scalar value is controllable by selecting a subset of the plurality of selectable paths.
13. The multiplication circuit of claim 12, wherein each of the plurality of selectable paths is configured to, when selected, propagate at least a portion of the input signal from the input terminal to the output terminal.
14. The multiplication circuit of claim 13, wherein each of the plurality of selectable paths comprises a switch coupled in series with a fixed impedance, the switch being controllable to select the respective selectable path.
15. The multiplication circuit of claim 14, wherein each switch is configured to receive a respective select signal from a memory.
16. The multiplication circuit of any of claims 12 to 15, wherein one of the input signal and the output signal is a voltage signal and the other of the input signal and the output signal is a current signal.
17. A system comprising: the multiplication circuit of any of claims 12 to 16; and a memory, wherein the multiplication circuit is configured to select the subset of the plurality of selectable paths based on values stored in the memory.
18. The system of claim 17, further comprising: a processor coupled to the multiplication circuit and configured to multiply a neuron value by a connection weight at least in part by: providing, to the multiplication circuit, the input signal as a version of the neuron value; and receiving, via the multiplication circuit, a version of the output signal, wherein the scalar value is based on the connection weight.
19. The system of claim 18, wherein the processor is further configured to set the values stored in the memory based on the connection weight.
20. The system of claim 17, further comprising: an integrated circuit having the memory and the multiplication circuit included therein.
21. The system of claim 20, wherein each of the plurality of selectable paths is co-located with a respective memory cell of the memory.
22. The system of any of claims 17 to 21, wherein the memory is configured as random-access memory (RAM).
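As an illustration of how the path selection of claims 12 and 17 to 19 might be used to apply a connection weight, the following sketch assumes binary-weighted path conductances (path i contributes `unit_conductance * 2**i`) and a non-negative weight. The claims do not require this weighting; the function names and numeric values are hypothetical.

```python
def select_paths(weight, unit_conductance, n_paths):
    """Return one select bit per path (least significant first) approximating `weight`."""
    max_code = (1 << n_paths) - 1
    code = round(weight / unit_conductance)
    code = max(0, min(max_code, code))  # clamp to the representable range
    return [(code >> i) & 1 for i in range(n_paths)]


def scalar_value(bits, unit_conductance):
    """Effective multiplier: the sum of the conductances of the selected paths."""
    return sum(bit * unit_conductance * (1 << i) for i, bit in enumerate(bits))


def multiply(neuron_value, bits, unit_conductance):
    """Model of the circuit output: input signal times the selected scalar value."""
    return neuron_value * scalar_value(bits, unit_conductance)


# A weight of 2.6 with a step of 0.25 and four paths quantizes to code 10 (binary 1010).
bits = select_paths(weight=2.6, unit_conductance=0.25, n_paths=4)
scalar = scalar_value(bits, 0.25)
product = multiply(0.8, bits, 0.25)
```

In this model the stored bits play the role of the values held in the memory, and the quantization step limits how closely the scalar value can track the requested weight (2.5 rather than 2.6 in the example).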
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US 63/591,032 (US202363591032P) | 2023-10-17 | 2023-10-17 | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025085569A1 (en) | 2025-04-24 |
Family
ID=95449327
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2024/051661 Pending WO2025085569A1 (en) | 2023-10-17 | 2024-10-16 | Programmable multiplication circuit and systems |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025085569A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190213234A1 (en) * | 2018-01-11 | 2019-07-11 | Mentium Technologies Inc. | Vector-by-matrix multiplier modules based on non-volatile 2d and 3d memory arrays |
| US10522223B1 (en) * | 2018-07-04 | 2019-12-31 | International Business Machines Corporation | Resistive memory device for matrix-vector multiplications |
| US10594334B1 (en) * | 2018-04-17 | 2020-03-17 | Ali Tasdighi Far | Mixed-mode multipliers for artificial intelligence |
| US20200410334A1 (en) * | 2019-06-25 | 2020-12-31 | Sandisk Technologies Llc | Binary weighted voltage encoding scheme for supporting multi-bit input precision |
| US20220028444A1 (en) * | 2020-07-27 | 2022-01-27 | Robert Bosch Gmbh | Read only memory architecture for analog matrix operations |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24880534; Country of ref document: EP; Kind code of ref document: A1 |