KR102365470B1

KR102365470B1 - Capacitance-based sequential matrix multiplication neural network by controlling weights with transistor-capacitor pair

Info

Publication number: KR102365470B1
Application number: KR1020190113317A
Authority: KR
Inventors: 유인경; 황현상
Original assignee: 포항공과대학교 산학협력단
Priority date: 2019-09-16
Filing date: 2019-09-16
Publication date: 2022-02-18
Anticipated expiration: 2039-09-16
Also published as: KR20210032074A

Abstract

A capacitance-based sequential matrix multiplication neural network capable of adjusting weights with a transistor-capacitor pair according to an embodiment of the present invention includes a transistor connected to a word line, a bit line, and a capacitor, and the capacitor connected to the transistor, wherein the input of the bit line is connected to a pulse generator and the output is connected to a sense amplifier.

Description

CAPACITANCE-BASED SEQUENTIAL MATRIX MULTIPLICATION NEURAL NETWORK BY CONTROLLING WEIGHTS WITH TRANSISTOR-CAPACITOR PAIR

The present invention relates to a neural network and, more particularly, to a capacitance-based sequential matrix multiplication neural network whose weights can be adjusted with a transistor-capacitor pair.

A mobile neural processor relies on a server or computer for training and stores the training results locally to perform inference. The values stored as weights in the neural processor would ideally be multi-level, but the number of achievable levels is limited. After training, the values therefore go through pruning, data compression, and similar steps to reduce them to a small bit-width before being stored as the weights of the mobile neural processor. These weights may be stored in non-volatile or volatile memory.

For servers, there is Google's TPU (Tensor Processing Unit), which stores weight values in DRAM, fetches them, and sends them to the matrix multiply unit (MMU). The computed output is fed back to the input of the matrix multiply unit together with new weight values stored in DRAM, and this cycle repeats until the final output is obtained.

Storing the weights in non-volatile memory has the advantage of fast inference, but every hidden layer must be fabricated, which increases circuit overhead. In a design such as Google's TPU, the weight information is stored outside the neural network and the same network is reused for sequential computation, so inference is slower but circuit overhead is reduced.

Capacitance-based matrix multiplication uses capacitance as the weight. The weight can be set by grouping capacitors or by changing the capacitor size.

The above is provided only to aid understanding of the background of the technical ideas of the present invention, and it should therefore not be construed as prior art known to those skilled in the art to which the present invention pertains.

The present invention concerns the configuration and operating principle of a neural network, relating to a weight adjustment method for carrying out the results of artificial intelligence training.

Because fabricating multi-level weight elements in hardware is subject to physical limits, hardware cannot match the weight bit-width used in software. For example, a resistive memory material with a 16-bit-wide multi-level range, that is, 65,536 resistance levels, is difficult to realize at present. A structure and operating method are therefore needed for a matrix multiplication unit that can perform matrix multiplication while accepting weight values as flexibly as software does.

Changing the capacitor size means fabricating capacitors of several sizes and selecting the required one, while grouping capacitors means operating several capacitors at the same time. Either approach complicates the circuit and is particularly susceptible to parasitic capacitance such as bit line capacitance. Fabricating as many hidden layers as needed is a further problem because it constrains chip size.

To achieve the above object, a capacitance-based sequential matrix multiplication neural network capable of adjusting weights with a transistor-capacitor pair according to an embodiment of the present invention includes: a transistor connected to a word line, a bit line, and a capacitor; and the capacitor connected to the transistor, wherein the input of the bit line is connected to a pulse generator and the output is connected to a sense amplifier.

A selection transistor disposed between the sense amplifier and the bit line may be further included.

A diode disposed between the sense amplifier and the selection transistor may be further included.

An accumulator disposed between the sense amplifier and the diode may be further included.

A selection transistor for discrimination disposed between the sense amplifier and the accumulator may be further included.

A discriminator may be further included between the sense amplifier and the selection transistor for discrimination.

A diode located between the bit line and the pulse generator may be further included.

A wiring that connects the P-well, which corresponds to the ground of the selection transistor, to a node between the diode and the pulse generator may be further included.

The voltage applied to the word line may be a constant voltage, and the voltage applied to the bit line may correspond to an input signal.

The pulse voltage applied to the bit line may be at least twice the constant voltage applied to the capacitor plate.

A capacitance-based sequential matrix multiplication neural network capable of adjusting weights with a transistor-capacitor pair according to an embodiment of the present invention includes weight cells, each including: a transistor connected to a word line, a bit line, and a capacitor; the capacitor connected to the transistor; and a plate connected to the capacitor, wherein the input of the bit line is connected to a pulse generator and the output is connected to a sense amplifier.

Matrix multiplication may be performed sequentially by applying a voltage to the word lines of the weight cells in sequence.

Matrix multiplication may be performed using the output information of the matrix multiplication as the input information of the next hidden layer.

A storage medium external to the neural network may be further included to store iteration count information, weight information, input information, output information, and hidden layer information for the results of repeatedly performing the matrix multiplication.

Matrix multiplication may be performed sequentially by selecting two or more word lines of the weight cells and applying the gate voltage to them simultaneously.

A capacitance-based sequential matrix multiplication neural network capable of adjusting weights with a transistor-capacitor pair according to an embodiment of the present invention includes: first weight cells, each including a transistor connected to a word line, a bit line, and a capacitor, the capacitor connected to the transistor, and a plate connected to the capacitor, wherein the input of the bit line is connected to a pulse generator and the output is connected to a sense amplifier; and second weight cells, each including a transistor connected to a word line, a bit line, and a capacitor, the capacitor connected to the transistor, and a plate connected to the capacitor, wherein the bit line is connected to the output of the first weight cells and to a sense amplifier.

A diode disposed between the output of the first weight cells and the input of the second weight cells may be further included.

Matrix multiplication may be performed sequentially by applying a voltage to the word lines of the first weight cells in sequence.

The output information of the matrix multiplication may be stored in the second weight cells and used as the input information of the next hidden layer to perform matrix multiplication.

The second weight cells may collect the charges discharged from the first weight cells.

The capacitance-based sequential matrix multiplication neural network capable of adjusting weights with a transistor-capacitor pair according to the embodiments of the present invention can flexibly adjust the weights and the number of hidden layers, can reduce the circuit load, and can also minimize the matrix multiplication unit chip size.

FIG. 1 is a circuit diagram schematically illustrating a neural network according to an embodiment of the present invention.
FIG. 2 is a circuit diagram schematically illustrating a weight cell of a neural network according to an embodiment of the present invention.
FIG. 3 is a circuit diagram schematically illustrating the operating principle of a weight cell of a neural network according to an embodiment of the present invention.
FIG. 4 is a circuit diagram schematically illustrating the configuration of a neural network according to an embodiment of the present invention.
FIGS. 5, 6, and 7 are circuit diagrams for explaining the operation of a neural network according to an embodiment of the present invention.

The content described in the background section above is provided only to aid understanding of the background of the technical idea of the present invention, and it should therefore not be construed as prior art known to those skilled in the art to which the present invention pertains.

In the following description, for purposes of explanation, numerous specific details are set forth to aid understanding of the various embodiments. It is evident, however, that the various embodiments may be practiced without these specific details or in one or more equivalent ways. In other instances, well-known structures and devices are shown in block diagram form to avoid unnecessarily obscuring the various embodiments.

In the drawings, the size or relative size of layers, films, panels, regions, and the like may be exaggerated for clarity. Like reference numbers denote like elements.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "indirectly connected" with another element interposed between them. However, if a part is described as being "directly connected" to another part, this means that there is no other element between the two parts. "At least one of X, Y, and Z" and "at least one selected from the group consisting of X, Y, and Z" are to be understood as X alone, Y alone, Z alone, or any combination of two or more of X, Y, and Z (for example, XYZ, XYY, YZ, ZZ). Here, "and/or" includes any and all combinations of one or more of the listed items.

Although terms such as first and second may be used herein to describe various devices, elements, regions, layers, and/or sections, these devices, elements, regions, layers, and/or sections are not limited by these terms. The terms are used to distinguish one device, element, region, layer, and/or section from another. Accordingly, a first device, element, region, layer, and/or section in one embodiment may be referred to as a second device, element, region, layer, and/or section in another embodiment.

Spatially relative terms such as "below" and "above" may be used for descriptive purposes, to describe the relationship of one element or feature to another element or feature as shown in the drawings. They indicate only the relationship of one component to another in the drawing and do not imply an absolute position. For example, if the device shown in the drawings is turned over, elements depicted as being "below" other elements or features are then positioned "above" those elements or features. Thus, in an embodiment, the term "below" may encompass both the upward and downward directions. In addition, the device may be oriented otherwise (for example, rotated 90 degrees or in other orientations), and the spatially relative terms used herein are to be interpreted accordingly.

The terminology used herein is for the purpose of describing particular embodiments and is not intended to be limiting. Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. Unless otherwise defined, the terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention belongs.

FIG. 1 is a circuit diagram schematically illustrating a neural network according to an embodiment of the present invention.

Referring to FIG. 1, a neural network according to an embodiment of the present invention includes input neurons 10, output neurons 20, and weight cells 30. The weight cells 30 (synapse elements) may be disposed at the intersections of row lines R extending horizontally from the input neurons 10 and column lines C extending vertically from the output neurons 20. For convenience of explanation, FIG. 1 shows four input neurons 10 and four output neurons 20 by way of example, but the present invention is not limited thereto.

In learning mode, reset mode, or correction or reading mode, the input neurons 10 may transmit electrical pulses to the weight cells 30 through the row lines R.

In learning mode, reset mode, or correction or reading mode, the output neurons 20 may receive electrical pulses from the weight cells 30 through the column lines C.

FIG. 2 is a circuit diagram schematically illustrating a weight cell of a neural network according to an embodiment of the present invention. FIG. 3 is a circuit diagram schematically illustrating the operating principle of a weight cell of a neural network according to an embodiment of the present invention.

To apply the weight bit-width flexibly according to an embodiment of the present invention, the weight is newly defined. Until now, capacitance-based weights have applied Q = CV (charge = capacitance × voltage) and made C multi-level. A linear multi-level can be expressed with a multiple n as C = nC_O. Here the base (ground) capacitance C_O is the resolution of the weight, and C_O·V is the resolution of the charge. Because n can be reinterpreted from a multiple to a count, a scheme that applies C_O n times can be introduced. The weight can therefore be defined as n.
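The equivalence behind this redefinition can be checked numerically. The following is a minimal Python sketch, not part of the patent: the function names and the values of C_O, V, and n are illustrative assumptions. It compares the charge of a single capacitor of size nC_O with the charge delivered by applying the base capacitor C_O n times.

```python
def charge_multilevel(n: int, c_o: float, v: float) -> float:
    """Charge of one capacitor whose capacitance is n * C_O (Q = C * V)."""
    return (n * c_o) * v


def charge_repeated(n: int, c_o: float, v: float) -> float:
    """Charge delivered by applying the base capacitor C_O n times."""
    total = 0.0
    for _ in range(n):
        total += c_o * v  # each application contributes one charge quantum C_O * V
    return total


# Illustrative values only (assumptions, not taken from the source).
C_O = 1e-15   # base capacitance, 1 fF
V = 0.5       # applied voltage in volts
N = 1000      # weight expressed as a count

q_big = charge_multilevel(N, C_O, V)
q_rep = charge_repeated(N, C_O, V)
print(q_big, q_rep)
```

Under these assumptions both routes give the same total charge up to floating-point rounding, which is why the count n can stand in for the capacitance multiple.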

Referring to FIGS. 2 and 3, a weight cell of the neural network according to an embodiment of the present invention includes a transistor connected to a word line (WL), a bit line (BL), and a capacitor, and the capacitor connected to the transistor, with the word line arranged orthogonally to the bit line. In an embodiment, the input of the bit line may be connected to a pulse generator and the output to a sense amplifier.

According to an embodiment of the present invention, a selection transistor (ST) is disposed between the bit line and the sense amplifier. In an embodiment, applying a gate voltage to the gate of the selection transistor for a predetermined time ([…]) controls how long the pulse train is applied to the weight cell of the neural network, that is, the number of applied pulses (n_j), where j denotes the column number in the weight cell array.

According to an embodiment of the present invention, a weight cell consisting of a transistor and a capacitor is used to charge and discharge the capacitor repeatedly, and the amount of charge discharged over a given time is taken as the output value. In an embodiment, the voltage applied to the word line of the weight cell is a constant voltage, and the voltage applied to the bit line is a pulse voltage corresponding to the input signal. In an embodiment, the voltage applied to the bit line may be at least twice the constant voltage applied to the capacitor plate.

FIG. 4 is a circuit diagram schematically illustrating the configuration of a neural network according to an embodiment of the present invention.

Referring to FIG. 4, a weight cell of the neural network according to an embodiment of the present invention includes a transistor connected to a word line (WL), a bit line (BL), and a capacitor; the capacitor connected to the transistor; and a plate connected to the capacitor. The input of the bit line is connected to a pulse generator and the output is connected to a sense amplifier.

Referring to FIGS. 3 and 4, a pulse train of input voltage Vpj corresponding to the input signal is applied to the bit line of the transistor, and a voltage Vg is applied to the word lines sequentially, row by row. In an embodiment, two, three, or more rows may be operated simultaneously to increase the discharge amount linearly.

According to an embodiment of the present invention, a self voltage divider is disposed between the common pulse generator and each bit line to adjust the pulse train voltage input to each bit line. In an embodiment, the pulse voltage (Vpj) input to the bit line may be set by adjusting the gate voltage (Vj) of the self voltage divider.

A selection transistor is disposed at the output of each bit line, and applying a gate voltage to the gate of each selection transistor for a predetermined time ([…]) controls how long the pulse train voltage is applied.

The amount of charge released from each weight cell of the neural network according to an embodiment of the present invention is Q = n([…])·C·(Vpj − Vcc/2). Here n([…]) is the number (count) of input pulses, which is determined linearly by the pulse width […]. This relation makes it possible to multiply the input voltage by n, which corresponds to the weight. If the minimum base unit of […] is […], then n([…])·C·(Vpj − Vcc/2) evaluated at that unit is the resolution of Q. When rows are grouped and operated together, the linearity of the weight can be extended in proportion to the number of rows.
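The charge relation Q = n·C·(Vpj − Vcc/2) can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the pulse count is taken as the number of whole pulse periods that fit into the gate-open window, times are given as integer nanoseconds, and all names and numeric values are hypothetical rather than taken from the patent.

```python
def pulse_count(gate_width_ns: int, pulse_period_ns: int) -> int:
    """Whole pulses that fit into the gate-open window (assumed linear model)."""
    return gate_width_ns // pulse_period_ns


def cell_charge(n: int, c: float, v_pj: float, v_cc: float) -> float:
    """Charge released by one weight cell: Q = n * C * (Vpj - Vcc/2)."""
    return n * c * (v_pj - v_cc / 2)


# Illustrative values only (assumptions, not from the source).
C = 1e-15            # cell capacitance, 1 fF
V_CC = 1.0           # supply voltage in volts
V_PJ = 1.5           # input pulse voltage on bit line j in volts
PULSE_PERIOD_NS = 10

n_j = pulse_count(gate_width_ns=80, pulse_period_ns=PULSE_PERIOD_NS)
q_j = cell_charge(n_j, C, V_PJ, V_CC)

# Charge resolution: the charge released by a single pulse (n = 1).
q_step = cell_charge(1, C, V_PJ, V_CC)
print(n_j, q_j, q_step)
```

In this model the gate window sets the weight n and the bit line voltage carries the input, so their product appears directly as charge.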

In an embodiment, a diode may be disposed between the bit line and the self voltage divider to prevent charge discharged from the bit line during the delay between input pulses of the pulse generator from flowing back into the pulse generator.

In an embodiment, to prevent a pulse from travelling along the bit line directly to the discriminator while the pulse voltage of the pulse generator is applied, a wiring may be formed that connects the P-well, which corresponds to the ground of the selection transistor, to a node between the diode and the self voltage divider. When the pulse voltage is applied, the voltage is also applied to the P-well of the selection transistor; this cancels or exceeds the gate voltage applied to the selection transistor and turns the selection transistor off, which prevents the pulse from travelling directly to the discriminator. Accordingly, the input pulse voltage Vpj may be greater than or equal to the gate voltage Vg applied to the selection transistor (Vpj ≥ Vg).

The output bit lines are tied together and connected to another array of 1T-1C transistor-capacitor pairs, which collects the output charges and accumulates them. A diode may be added between the selection transistors and the accumulator to prevent charge from flowing back due to a reverse voltage that may arise in the accumulator.

FIGS. 5, 6, and 7 are circuit diagrams for explaining the operation of a neural network according to an embodiment of the present invention.

Referring to FIGS. 5, 6, and 7, the neural network according to an embodiment of the present invention arranges the weight cells in an array and fabricates it as a network layer. It then operates the weight cell array sequentially row by row while using the output values as the input information of the next hidden layer, that is, it applies a recurrent or iterative scheme.
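The reuse of one physical array across hidden layers can be sketched in Python as a loop that loads each layer's weights in turn and feeds the outputs back as inputs. This is a behavioural sketch only: the helper names, the integer weights, the binary threshold, and the list standing in for the external storage medium are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Matrix = Sequence[Sequence[int]]


def matvec_threshold(weights: Matrix, x: Sequence[int], threshold: int = 2) -> List[int]:
    """One layer: row-by-row weighted sums, then a binary threshold (assumed activation)."""
    sums = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return [1 if s > threshold else 0 for s in sums]


def run_layers(
    layer_weights: Sequence[Matrix],
    x: Sequence[int],
    layer_fn: Callable[[Matrix, Sequence[int]], List[int]] = matvec_threshold,
) -> List[int]:
    """Reuse the same array: each layer's output becomes the next layer's input."""
    current = list(x)
    for weights in layer_weights:  # stands in for fetching weights from external storage
        current = layer_fn(weights, current)
    return current


# Hypothetical two-layer example.
stored_layers = [
    [[1, 2], [0, 1]],  # hidden layer 1 weights (pulse counts)
    [[3, 1]],          # hidden layer 2 weights
]
result = run_layers(stored_layers, [1, 1])
print(result)
```

Only one weight array is exercised per pass, which mirrors the trade-off described in the text: fewer fabricated layers in exchange for sequential passes.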

Referring to FIGS. 5 and 6, the time for which the input voltage is applied to each weight cell in a given row of the neural network according to an embodiment of the present invention is controlled by the width of the gate voltage pulse applied to the gate of the selection transistor connected to each column. The voltage dynamic range of the self voltage divider is designed to match the dynamic range required of the voltage input to the bit line. In an embodiment, matrix multiplication is performed row by row by applying a voltage to the word lines in sequence, and the weight information and input information for each hidden layer may be stored in a storage medium external to the neural network. The per-layer weight information and input information kept in the external storage medium are used for the cyclic computation during inference.
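The row-by-row operation can be sketched as a matrix-vector product in which each active word line yields one summed charge. The Python sketch below assumes that the charges from all columns simply add on the tied output line and that every cell has the same capacitance; the function names and the normalized values are hypothetical.

```python
from typing import List, Sequence


def row_output_charge(
    n_row: Sequence[int], v_in: Sequence[float], c: float, v_cc: float
) -> float:
    """Charge collected while one word line is active: sum over columns j of n_j * C * (Vpj - Vcc/2)."""
    return sum(n * c * (v - v_cc / 2) for n, v in zip(n_row, v_in))


def sequential_matmul(
    n_matrix: Sequence[Sequence[int]], v_in: Sequence[float], c: float, v_cc: float
) -> List[float]:
    """Activate the word lines one row at a time and record each row's collected charge."""
    return [row_output_charge(row, v_in, c, v_cc) for row in n_matrix]


# Normalized illustrative values (assumptions): C = 1, Vcc = 1.
pulse_counts = [
    [1, 2],  # row 0: pulse counts per column
    [3, 0],  # row 1
]
bit_line_voltages = [1.5, 1.0]  # Vpj for columns 0 and 1

charges = sequential_matmul(pulse_counts, bit_line_voltages, c=1.0, v_cc=1.0)
print(charges)
```

Each entry of the result corresponds to one word line activation, so an array with i rows needs i sequential steps per layer in this model.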

FIGS. 5 and 6 show an embodiment in which the second of i rows is selected, but the present invention is not limited thereto; several rows may be selected simultaneously to increase the amount of charge that is charged and discharged.

축적된 전하들의 양이 일정 값을 넘으면 활성화하게 되는데 이 전하들을 축적하는 동안 어큐뮬레이터에는 전압이 증가하게 된다. 이 역전압이 제1 가중치 셀 어레이에서 방전(discharging)되는 전하들을 막을 수 있으므로 어큐뮬레이터를 구성하는 1T-1C 셀 어레이를 충분히 크게 제작한다. 또한 축적된 전하들을 일정시간 동안 방전시켜서 어큐뮬레이터에 전하들이 남아 있을 경우에 차기 은닉층의 입력 정보로 사용한다. 어큐뮬레이터에 전하들이 남아있지 않는다면 차기 은닉층의 입력이 없게 된다. 잔류 전하 여부를 판단하는 것은 센스 앰프(sense amplifier)로 수행하고 일정 시간 어큐뮬레이터의 전하들을 소멸시키는 것은 NMOS-PMOS 트랜지스터 쌍으로 수행한다.It is activated when the amount of accumulated charges exceeds a certain value, and the voltage in the accumulator increases while these charges are accumulated. Since this reverse voltage can prevent charges discharged from the first weight cell array, the 1T-1C cell array constituting the accumulator is manufactured to be sufficiently large. In addition, the accumulated charges are discharged for a certain period of time, and when charges remain in the accumulator, they are used as input information for the next hidden layer. If there are no charges left in the accumulator, there is no input to the next hidden layer. Determination of residual charge is performed by a sense amplifier, and dissipation of charges in the accumulator for a predetermined time is performed by a pair of NMOS-PMOS transistors.

FIG. 7 shows the method of discrimination and residual charge detection. To send the charge accumulated in the accumulator to the inverter, a voltage of Vcc/2 is applied to the plate of the accumulator, a voltage Vg' is applied to the inverter, and then a voltage Vg is applied to the gate of the discrimination selection transistor (DT) in front of the inverter. When the voltage Vg' is applied to the inverter, the NMOS transistor turns on and the accumulated charge is discharged through the NMOS transistor and dissipated. When the inverter voltage is removed after a certain period, the PMOS transistor turns on and any residual charge flows toward the sense amplifier, so that the presence of residual charge can be detected.
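The order of the control signals described with reference to FIG. 7 can be written out as a small timeline sketch. It only tracks which path the accumulator charge takes after each step; the step labels, the `charge_path` function, and the "blocked" state assumed before DT is turned on are illustrative and are not part of the embodiment.

```python
def charge_path(dt_on, vg_prime_applied):
    """Path taken by the accumulator charge for a given control state."""
    if not dt_on:
        # Assumption: no charge passes while DT is off.
        return "blocked"
    if vg_prime_applied:
        return "discharge through NMOS"
    return "to sense amplifier through PMOS"


# Ordered control steps and the state changes each one causes.
STEPS = [
    ("apply Vcc/2 to the accumulator plate", {}),
    ("apply Vg' to the inverter", {"vg_prime": True}),
    ("apply Vg to the gate of DT", {"dt_on": True}),
    ("remove Vg' after the discharge period", {"vg_prime": False}),
]


def discrimination_timeline():
    """Return (step label, charge path) pairs in the order of application."""
    state = {"vg_prime": False, "dt_on": False}
    timeline = []
    for label, change in STEPS:
        state.update(change)
        timeline.append((label, charge_path(state["dt_on"], state["vg_prime"])))
    return timeline
```

Calling `discrimination_timeline()` lists the four steps in order, with the charge first discharging through the NMOS transistor and then, once Vg' is removed, flowing toward the sense amplifier through the PMOS transistor.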

According to the embodiments of the present invention described above, a weight cell to which the weight bit width can be flexibly applied allows the weights and the number of hidden layers to be flexibly adjusted, reduces circuit overhead, and minimizes the matrix multiplication unit chip size.

Although the present invention has been described above with reference to specific matters such as particular components, and to limited embodiments and drawings, these are provided only to aid a more general understanding of the present invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the art to which the present invention pertains can make various modifications and variations from this description.

Therefore, the spirit of the present invention should not be limited to the described embodiments, and not only the claims set forth below but also everything that is equal or equivalent to those claims falls within the scope of the spirit of the present invention.

10: input neuron 20: output neuron
30: weight cell

Claims

a transistor connected to a word line, a bit line, and a capacitor; and
the capacitor connected to the transistor,
wherein an input of the bit line is connected to a pulse generator and an output of the bit line is connected to a sense amplifier,
a selection transistor disposed between the sense amplifier and the bit line;
a diode disposed between the sense amplifier and the selection transistor;
an accumulator disposed between the sense amplifier and the diode;
a discrimination selection transistor disposed between the sense amplifier and the accumulator; and
a discriminator disposed between the sense amplifier and the discrimination selection transistor,
being further comprised,
A capacitance-based sequential matrix multiplication neural network capable of adjusting weights with a transistor-capacitor pair.

delete

The capacitance-based sequential matrix multiplication neural network of claim 1,
further comprising a diode disposed between the bit line and the pulse generator.

The capacitance-based sequential matrix multiplication neural network of claim 7,
further comprising a wiring that connects a P-well, corresponding to the ground of the selection transistor, to a point between the diode and the pulse generator.

The capacitance-based sequential matrix multiplication neural network of claim 1,
wherein a voltage applied to the word line is a constant voltage, and a voltage applied to the bit line is a pulse voltage corresponding to an input signal.

The capacitance-based sequential matrix multiplication neural network of claim 1,
wherein the pulse voltage applied to the bit line is at least twice the normal voltage applied to a plate connected to the capacitor.

a transistor connected to a word line, a bit line, and a capacitor;
the capacitor connected to the transistor; and
a plate connected to the capacitor,
comprising weight cells in each of which an input of the bit line is connected to a pulse generator and an output of the bit line is connected to a sense amplifier,
wherein matrix multiplication is performed sequentially by applying a voltage to the word lines of the weight cells in sequence,
A capacitance-based sequential matrix multiplication neural network capable of adjusting weights with a transistor-capacitor pair.

delete

The capacitance-based sequential matrix multiplication neural network of claim 11,
wherein matrix multiplication is performed by using output information of the matrix multiplication as input information for a next hidden layer.

The capacitance-based sequential matrix multiplication neural network of claim 13,
further comprising a storage medium external to the neural network that stores iteration count information, weight information, input information, output information, and hidden layer information for the results of repeatedly performing the matrix multiplication.

The capacitance-based sequential matrix multiplication neural network of claim 11,
wherein matrix multiplication is performed sequentially by selecting two or more word lines of the weight cells and applying gate voltages to them simultaneously.

first weight cells each including a transistor connected to a word line, a bit line, and a capacitor; the capacitor connected to the transistor; and a plate connected to the capacitor, wherein an input of the bit line is connected to a pulse generator and an output of the bit line is connected to a sense amplifier; and
second weight cells each including a transistor connected to a word line, a bit line, and a capacitor; the capacitor connected to the transistor; and a plate connected to the capacitor, wherein the bit line is connected to an output of the first weight cells and to a sense amplifier,
A capacitance-based sequential matrix multiplication neural network capable of adjusting weights with a transistor-capacitor pair.

The capacitance-based sequential matrix multiplication neural network of claim 16,
further comprising a diode disposed between an output of the first weight cells and an input of the second weight cells.

The capacitance-based sequential matrix multiplication neural network of claim 16,
wherein matrix multiplication is performed sequentially by applying voltages to the word lines of the first weight cells in sequence.

The capacitance-based sequential matrix multiplication neural network of claim 18,
wherein matrix multiplication is performed by storing output information of the matrix multiplication in the second weight cells and using it as input information for a next hidden layer.

The capacitance-based sequential matrix multiplication neural network of claim 19,
wherein the second weight cells collect the charge discharged from the first weight cells.