WO2023240841A1

WO2023240841A1 - Chip-based image affine transformation method and chip

Info

Publication number: WO2023240841A1
Application number: PCT/CN2022/122362
Authority: WO
Inventors: 肖晗; 袁峰
Original assignee: Orbbec Inc
Current assignee: Orbbec Inc
Priority date: 2022-06-15
Filing date: 2022-09-29
Publication date: 2023-12-21
Anticipated expiration: 2024-12-15
Also published as: CN115187448A

Abstract

A chip-based image affine transformation method and a chip. The method comprises: triggering the current input cache of a plurality of input caches and an intermediate cache to read, from an off-chip memory, current partial image data of an input image and parameter data required for calculation, and respectively writing said data into the current input cache and the intermediate cache; when reading the current partial data of the input image, actuating a calculation unit to perform, via the parameter data, affine transformation and interpolation calculation on historical partial image data of the input image acquired by a historical input cache of the plurality of input caches, so as to obtain a historical partial processing result of a target output image; and writing the historical partial processing result into an output cache, and performing affine transformation and interpolation calculation on the current partial data at the same time, so as to obtain a complete processing result of the target output image.

Description

A chip-based image affine transformation method and chip

This application claims priority to Chinese patent application No. 202210684991.4, filed with the China Patent Office on June 15, 2022 and entitled "A chip-based image affine transformation method and chip", the entire content of which is incorporated herein by reference.

Technical field

The present application belongs to the technical field of image processing and chips, and in particular relates to a chip-based image affine transformation method and a chip.

Background technique

In image recognition, the viewing angle of an image often needs to be transformed to improve recognition accuracy. For example, images of a painting on a wall differ when the camera captures them from different distances and angles. If these images are uniformly projected, through a transformation matrix, into the image that the camera would capture at a fixed distance directly in front of the painting, recognition accuracy can be improved. This transformation is called a projective transformation. In practice, however, because parallel straight lines may become non-parallel under a projective transformation, an affine transformation is generally used to approximate the projective transformation, for example in preprocessing for face recognition.

In the prior art, affine transformation is often implemented on a computer system consisting of a central processing unit (CPU) and double data rate synchronous dynamic random access memory (DDR SDRAM). In such a system, however, performing the affine transformation directly through the transformation matrix is slow and time-consuming: DDR is a dynamic memory, and read and write accesses to random addresses in DDR take far longer than accesses to consecutive addresses.

Technical solutions

In view of this, embodiments of the present application provide a chip-based image affine transformation method and a chip, which can solve one or more technical problems in the related art.

In a first aspect, an embodiment of the present application provides a chip-based image affine transformation method. The digital chip includes multiple groups of input caches, an intermediate cache, a calculation unit and an output cache. The image affine transformation method includes: triggering the current group of input caches among the multiple groups of input caches, and the intermediate cache, to read from an off-chip memory the current partial image data of an input image and the parameter data required for calculation, and writing them into the current group of input caches and the intermediate cache respectively, where the parameter data includes the target output image of the required viewing angle and its corresponding resolution; while the current partial data of the input image is being read, actuating the calculation unit to use the parameter data to perform affine transformation and interpolation calculation on the historical partial image data of the input image acquired by a historical group of input caches among the multiple groups of input caches, to obtain a historical partial processing result of the target output image; and writing the historical partial processing result into the output cache while performing affine transformation and interpolation calculation on the current partial data, so as to obtain the complete processing result of the target output image.

In a second aspect, an embodiment of the present application provides a chip, including: multiple groups of input caches, configured to read partial image data of an input image from an off-chip memory in rotation; an intermediate cache, configured to read partial parameter data of the input image from the off-chip memory according to the timing at which the multiple groups of input caches read data, where the parameter data includes the start address of the input image data in the off-chip memory, the resolution of the target output image, and the designated storage address of the output image; a calculation module, configured to, while the current group of input caches among the multiple groups of input caches acquires the current partial data of the input image from the off-chip memory, use the parameter data to perform interpolation calculation on the historical partial data of the input image acquired from the off-chip memory by a historical group of input caches among the multiple groups of input caches, to obtain a partial processing result of the output image; and an output cache, configured to store the partial processing result.

In a third aspect, an embodiment of the present application provides a computer storage medium. The computer storage medium stores a computer program which, when executed by a processor, implements the method described in the embodiment of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer program product which, when run on an electronic device, causes the electronic device to implement the method described in the embodiment of the first aspect.

Beneficial effects

In the embodiments of the present application, several input caches inside the chip cache local image data (also called partial image data) of the input image stored in the off-chip memory, and the affine-transformed pixel values are computed by interpolation, so that the interpolation calculation and the image caching can proceed in parallel. This not only keeps cost under control but also greatly increases the speed of image affine transformation.

Description of the drawings

To describe the technical solutions in the embodiments of the present application more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic architectural diagram of a chip provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of a chip-based image affine transformation method provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of the mapping relationship, under affine transformation, between an input image and a target output image of a required field of view, provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of step S120 of a chip-based image affine transformation method provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of the positions of a vertex and a primary edge provided by an embodiment of the present application;

FIG. 6 is a schematic diagram of the mapped image of a target output image in an input image, provided by an embodiment of the present application;

FIG. 7 is a schematic architectural diagram of a chip provided by an embodiment of the present application.

Embodiments of the invention

In the following description, specific details such as particular system structures and technologies are set forth for the purpose of explanation rather than limitation, so as to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may also be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits and methods are omitted so that unnecessary detail does not obscure the description of the present application.

As used in this specification and the appended claims, the term "and/or" refers to and includes any and all possible combinations of one or more of the associated listed items.

Reference in this specification to "one embodiment" or "some embodiments" means that a particular feature, structure or characteristic described in connection with that embodiment is included in one or more embodiments of the present application. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments" and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "including", "comprising", "having" and their variants all mean "including but not limited to", unless specifically emphasized otherwise.

In addition, in the description of the present application, "multiple" means two or more. The terms "first", "second" and the like are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.

It should also be understood that, unless otherwise expressly specified or limited, the term "connection" is to be understood broadly. For example, it may be a fixed connection, a detachable connection, or an integral connection; it may be a direct connection or an indirect connection through an intermediate medium; and it may be internal communication between two elements or an interaction between two elements. A person of ordinary skill in the art can understand the specific meaning of the above terms in the present application according to the specific circumstances.

At present, image affine transformation is usually performed on computer systems built around a CPU and DDR memory, which implement the affine transformation directly through the transformation matrix. This approach is slow and time-consuming.

In view of this, embodiments of the present application provide a chip-based image affine transformation method and a chip. Only a few small-capacity SRAMs are used to cache local image data (or partial image data) stored in DDR, and when the affine-transformed pixel values are computed by interpolation, the interpolation calculation proceeds in parallel with the image caching. This not only keeps cost under control but also greatly increases the speed of image affine transformation.

To better illustrate the technical solution of the present application, the following embodiments describe it in detail with reference to some specific parameters. It should be understood that these parameters are merely preferred values chosen for ease of explanation and are not to be construed as specific limitations of the present application.

FIG. 1 is a schematic architectural diagram of a chip provided by an embodiment of the present application. To better illustrate the chip of this embodiment, FIG. 1 also shows an off-chip memory 23A coupled to the chip 23. More specifically, the chip 23 includes multiple groups of input caches 231, a calculation module 232, an output cache 233 and an intermediate cache 234, where:

The multiple groups of input caches 231 are configured to acquire partial data of the input image from the off-chip memory 23A in rotation, and to stop once all data of the input image has been acquired.

The intermediate cache 234 is configured to cache the parameters required by the calculation module 232 for calculation, where the parameters include the target output image of the required field of view and its corresponding resolution.

The calculation module 232 is configured to, while the current group of input caches among the multiple groups of input caches 231 acquires the current partial data of the input image from the off-chip memory 23A, use the parameters cached in the intermediate cache 234 to perform interpolation calculation on the historical partial data of the input image that a historical group of input caches among the multiple groups of input caches has acquired from the off-chip memory 23A, to obtain a partial processing result of the output image.

The output cache 233 is configured to store the partial processing results and, after the complete processing result of the output image has been written into the output cache, to write the complete processing result of the output image to the corresponding designated address in the off-chip memory 23A.

In one embodiment, the input image data is stored in the off-chip memory 23A. Preferably, the off-chip memory 23A may be a DDR memory. In some other embodiments, the DDR memory may be replaced by, for example, synchronous dynamic random access memory (SDRAM), dynamic random access memory (DRAM), or pseudo static random access memory (PSRAM), which is not limited here.

FIG. 2 is a schematic flowchart of a digital-chip-based image affine transformation method provided by an embodiment of the present application. The image affine transformation method may include steps S110 to S130.

S110: trigger the current group of input caches among the multiple groups of input caches to acquire the current partial data of the input image from the off-chip memory, and write it into the current group of input caches and the intermediate cache.

In one embodiment, the acquired data of the input image includes image data and parameter data, where the image data is written into the input cache and the parameter data is written into the intermediate cache. Preferably, the image data includes the pixel values and pixel coordinates of the input image, and the parameter data includes the target output image of the required viewing angle and its corresponding resolution, the designated address in the off-chip memory of the output image corresponding to the input image, and the like.

Further, the multiple groups of input caches take turns acting as the current group of input caches and acquiring partial data of the input image from the off-chip memory, stopping once all data of the input image has been acquired. Preferably, the multiple groups of input caches include at least two groups of input caches, which acquire partial data of the input image from the off-chip memory by ping-pong operation. That is, the current group of input caches is the input cache currently performing the caching operation, while the historical group of input caches mentioned below is an input cache that has already performed its caching operation, meaning that it read its input image data before the current group of input caches did.

S120: while the current group of input caches acquires the current partial data of the input image from the off-chip memory, use the data in the intermediate cache to perform affine transformation and interpolation calculation on the historical partial data of the input image that the historical group of input caches among the multiple groups of input caches has acquired from the off-chip memory, to obtain a partial processing result of the target output image of the required viewing angle.

More specifically, as the multiple groups of input caches take turns acting as the current group of input caches and acquiring partial data of the input image from the off-chip memory, affine transformation and interpolation calculation are performed on the partial data of the input image that the historical group of input caches has acquired from the off-chip memory, yielding partial processing results of the target output image of the required viewing angle, until the complete processing result of the target output image is obtained. Because the interpolation calculation proceeds in parallel with the image caching, the speed of image affine transformation is greatly increased.
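The overlap between caching and calculation described above can be sketched as a simple schedule. This is a minimal illustration, not the chip's actual control logic: the function name `pingpong_schedule`, the chunk indexing, and the assumption that one load and one computation each take exactly one time step are all hypothetical.

```python
def pingpong_schedule(num_chunks, num_buffers=2):
    """Return one (load, compute) pair per time step.

    load    = (buffer index, chunk index) being filled from off-chip memory
    compute = (buffer index, chunk index) being processed by the calculation unit
    Either entry is None when that side is idle (first and last step).
    """
    steps = []
    for t in range(num_chunks + 1):
        # The "current group" cache loads chunk t ...
        load = (t % num_buffers, t) if t < num_chunks else None
        # ... while the "historical group" cache, filled one step earlier,
        # is consumed by the calculation unit.
        compute = ((t - 1) % num_buffers, t - 1) if t >= 1 else None
        steps.append((load, compute))
    return steps


schedule = pingpong_schedule(3)
for t, (load, compute) in enumerate(schedule):
    print(t, "load:", load, "compute:", compute)
```

With two buffers, the buffer being loaded and the buffer being computed on are never the same in any step, which is what allows the two activities to run concurrently.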

FIG. 3 is a schematic diagram of the mapping relationship, under affine transformation, between the input image and the target output image of the required field of view. As shown in FIG. 3, if the target output image is rectangular, then, because parallel lines remain parallel under affine transformation, the target output image maps onto the input image as a parallelogram-shaped mapped image. In some embodiments, the coordinates in the input image of the four vertices of the target output image are calculated based on the affine transformation principle, and the mapped image of the target output image in the input image is obtained from these vertex coordinates. In some other embodiments, the sub-pixel-precision coordinates in the input image of the four vertices of the target output image may be calculated from the resolution of the target output image through the affine transformation principle, after which all pixels of the output image can be obtained by interpolation. It should be noted that the present application preferably uses sub-pixel-precision coordinates, so as to obtain a high-resolution output image.

In view of this, in one embodiment, step S120 is shown in more detail in FIG. 4 and may include steps S121 to S123.

S121: based on the target output image of the required viewing angle and the affine transformation principle, obtain the mapped image of the target output image in the input image.

In one embodiment, affine transformation is a special case of projective transformation, and its formula is:

$$
\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}
=
\begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \end{bmatrix}
\begin{bmatrix} x_0 \\ y_0 \\ 1 \end{bmatrix}
$$

where (x₀, y₀) and (x₁, y₁) are the coordinates of the same point in images captured from two different viewing angles, and A₁₁ to A₂₃ are the parameters of the transformation matrix, which characterize the transformation between the input image and the target output image. Based on this formula, the target output image can be transformed into the same coordinate system as the input image, giving the mapped image of the target output image in the input image.
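As an illustration of how the mapped image could be obtained, the sketch below applies the standard 2×3 affine form x₁ = A₁₁x₀ + A₁₂y₀ + A₁₃, y₁ = A₂₁x₀ + A₂₂y₀ + A₂₃ to the four corners of a rectangular output image. It assumes that (x₀, y₀) are output-image pixel coordinates and (x₁, y₁) are sub-pixel input-image coordinates; the function names and the example matrix are hypothetical and not taken from the application.

```python
def affine_map(A, x0, y0):
    """Map (x0, y0) through a 2x3 affine matrix A = ((A11, A12, A13), (A21, A22, A23))."""
    (a11, a12, a13), (a21, a22, a23) = A
    return (a11 * x0 + a12 * y0 + a13, a21 * x0 + a22 * y0 + a23)


def mapped_corners(A, width, height):
    """Sub-pixel input-image coordinates of the four corners of a width x height output image."""
    corners = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
    return [affine_map(A, x, y) for x, y in corners]


# Example matrix (an assumption): rotation by 45 degrees with scale, plus translation.
A = ((0.5, -0.5, 10.0), (0.5, 0.5, 2.0))
corners = mapped_corners(A, width=5, height=3)
print(corners)  # [(10.0, 2.0), (12.0, 4.0), (9.0, 3.0), (11.0, 5.0)]
```

The four mapped corners form a parallelogram: the sums of opposite corners coincide, which reflects the property that parallel lines stay parallel under an affine transformation.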

S122: of the two edges of the mapped image that meet at any one vertex, define one as the primary edge and the other as the secondary edge, and calculate the number of points on the primary edge and the secondary edge of the mapped image, and the coordinates of each point, according to the resolution of the target output image.

In one embodiment, assume that the first vertex is any vertex of the mapped image. As shown in FIG. 5, the first vertex is the main vertex of the mapped image, the primary edge is the edge on the left of the main vertex, and the edge on the right of the main vertex is the secondary edge. It should be noted that the main vertex is preferably the highest point of the mapped image, to simplify subsequent calculation.

Further, in one embodiment, according to the resolution of the target output image and the sub-pixel-precision coordinates in the input image of the vertices of the mapped image, the number of points and their coordinates on the primary edge and on the secondary edge are calculated by advancing from the main vertex along the primary edge and the secondary edge toward the other two vertices of the mapped image. These point counts and coordinates are then used to determine the advance step used during interpolation, where the position reached by each step corresponds to one pixel of the output image. Advancing along the primary edge and along the secondary edge therefore each has its own step: the step on the primary edge is called the primary-edge step, and the step on the secondary edge is called the secondary-edge step. Each step has a direction, for example toward the lower left or toward the lower right, as shown by the arrow paths in FIG. 5.
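A minimal sketch of the step calculation follows. It assumes that the number of points on the primary edge equals the number of output rows, that the number of points on the secondary edge equals the number of output columns, and that the points are evenly spaced between the two end vertices; these assumptions, the function names, and the example coordinates are illustrative only.

```python
def edge_step(start, end, num_points):
    """Directed step (dx, dy) between consecutive points on the edge start -> end."""
    if num_points < 2:
        return (0.0, 0.0)
    return ((end[0] - start[0]) / (num_points - 1),
            (end[1] - start[1]) / (num_points - 1))


def edge_points(start, step, num_points):
    """Sub-pixel coordinates of all points on an edge, beginning at `start`."""
    return [(start[0] + i * step[0], start[1] + i * step[1])
            for i in range(num_points)]


# Main vertex C (highest point), end of primary edge A, end of secondary edge D.
C, A_end, D_end = (10.0, 2.0), (9.0, 3.0), (12.0, 4.0)
out_rows, out_cols = 3, 5  # output resolution (assumed mapping to edges)

primary_step = edge_step(C, A_end, out_rows)      # heads toward the lower left
secondary_step = edge_step(C, D_end, out_cols)    # heads toward the lower right
primary_points = edge_points(C, primary_step, out_rows)
print(primary_step, secondary_step, primary_points)
```

Each point on the primary edge then serves as the start of one output row, and each secondary-edge step from that point reaches the next output pixel of the row.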

S123: perform interpolation calculation on the historical partial data according to the coordinates of the points on the primary edge and the secondary edge, to generate a partial processing result of the target output image.

In some embodiments, step S123 may include: taking the points spaced one primary-edge step apart along the primary edge as starting points, advancing from each of them in the direction parallel to the secondary edge of the mapped image by the secondary-edge step, and performing interpolation calculation on each point reached that belongs to the historical partial data to obtain its pixel value, thereby generating a partial processing result of the target output image. For each point reached that does not belong to the historical partial data, the relevant information of the point is written into the intermediate cache. Once the current partial data acquired by the current group of input caches includes these points, the data can be read directly from the intermediate cache and interpolation performed on these points to generate their pixel values, yielding a partial processing result of the output image.

More specifically, as shown in FIG. 6, the parallelogram ABDC is the mapped image of the target output image in the input image, point C is the main vertex of the mapped image, CA is the primary edge of the mapped image, and CD is the secondary edge of the mapped image. The number of points and their coordinates on the primary edge CA and the secondary edge CD are calculated from the resolution of the target output image to obtain the primary-edge step and the secondary-edge step. Then, starting from each point on the primary edge, the calculation advances along the secondary-edge direction by the secondary-edge step (as shown by the arrow paths in FIG. 6) and interpolates the pixel value of each point. When the multiple groups of input caches read partial data of the input image in turn by ping-pong operation, assume that the historical partial data, namely the top part of the input image covered by the mapped image, is stored in the historical group of input caches, and that the current partial data, namely the middle part of the input image covered by the mapped image, is stored in the current group of input caches.

While the current group of input caches reads the middle part of the input image covered by the mapped image, the calculation unit processes the mapped image data read by the historical group of input caches. With point C as the main vertex and the points spaced one primary-edge step apart along the primary edge CA as starting points, it advances along the direction of the secondary edge CD (the arrow direction shown in FIG. 6) by the secondary-edge step. For each point reached that belongs to the historical partial data (the solid portions of the arrows in FIG. 6), interpolation is performed to obtain the pixel value of the point, generating a partial processing result of the output image. For each point reached that does not belong to the historical partial data (the dashed portions of the arrows in FIG. 6), the relevant information of the point is written into the intermediate cache. Once the current partial data acquired by the current group of input caches includes these points, the data can be read directly from the intermediate cache and interpolation performed on these points to generate their pixel values, yielding a partial processing result of the output image.
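The traversal above can be sketched as follows. The application does not name the interpolation kernel, so bilinear interpolation is used here purely as an assumption; likewise the representation of the historical cache as a list of consecutive image rows, the `deferred` list standing in for the intermediate cache, and all function names are hypothetical.

```python
import math


def bilinear(rows, row_lo, x, y):
    """Bilinear sample at sub-pixel (x, y); rows[0] holds image row `row_lo`."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    x1 = min(x0 + 1, len(rows[0]) - 1)  # clamp at the right image border
    top = (1 - fx) * rows[y0 - row_lo][x0] + fx * rows[y0 - row_lo][x1]
    bot = (1 - fx) * rows[y0 + 1 - row_lo][x0] + fx * rows[y0 + 1 - row_lo][x1]
    return (1 - fy) * top + fy * bot


def process_strip(rows, row_lo, start, primary_step, secondary_step,
                  n_primary, n_secondary):
    """Interpolate every output pixel whose neighbourhood lies in the cached rows.

    Returns (done, deferred): `done` maps output (row, col) to a pixel value;
    `deferred` lists (row, col, x, y) for points outside the cached rows,
    to be finished once a later cache load covers them.
    """
    row_hi = row_lo + len(rows)  # first image row NOT in the cache
    done, deferred = {}, []
    for i in range(n_primary):          # one start point per primary-edge step
        for j in range(n_secondary):    # advance along the secondary direction
            x = start[0] + i * primary_step[0] + j * secondary_step[0]
            y = start[1] + i * primary_step[1] + j * secondary_step[1]
            y0 = math.floor(y)
            if row_lo <= y0 and y0 + 1 < row_hi:
                done[(i, j)] = bilinear(rows, row_lo, x, y)
            else:
                deferred.append((i, j, x, y))
    return done, deferred


# Synthetic input image with value x + 10*y; only rows 0..3 are cached so far.
cached_rows = [[x + 10 * y for x in range(16)] for y in range(4)]
done, deferred = process_strip(cached_rows, 0, (10.0, 2.0),
                               (-0.5, 0.5), (0.5, 0.5), 3, 5)
print(done)           # {(0, 0): 30.0, (0, 1): 35.5, (1, 0): 34.5}
print(len(deferred))  # 12
```

In this example only three of the fifteen output pixels fall inside the cached rows; the remaining twelve are recorded for later, mirroring the solid and dashed arrow portions described for FIG. 6.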

In another embodiment, because the resolution of the target output image of the required field of view is limited by the output cache, the resolution actually required for the target output image may be greater than the standard output image resolution preset for the output cache. Therefore, before step S121 is performed, the method further includes step S1201: determining the relationship between the resolution of the target output image and the standard output image resolution of the output cache, and adaptively choosing, according to the result, whether to replace the target output image with the standard output image in subsequent calculation. Preferably, three cases are distinguished:

(1) First case: if the resolution of the target output image is equal to the standard output image resolution of the output cache, steps S121, S122 and S123 described above are performed directly.

(2) Case 2: if the resolution of the target output image is smaller than the standard output image resolution of the output cache, the calculation is still performed at the standard output image resolution; the region occupied by the target output image is designated within the standard output image stored in the output cache, and the standard output image is used in place of the target output image to carry out steps S121, S122 and S123 described above.

In one embodiment, the upper-left portion of the standard output image is designated as the target output image. After the chip has computed the final processing result, that is, the complete processing result of the standard output image, only the upper-left portion is read, according to the designated region, from the complete processing result stored in the output cache, and the target output image is thereby written to the corresponding address designated in an off-chip memory such as DDR.

(3) Case 3: if the resolution of the target output image is greater than the standard output image resolution of the output cache, step S120 may include step S221 and step S222.

S221: upon determining that the resolution of the target output image is greater than the standard output image resolution, divide the target output image into several output image blocks, where the resolution of each output image block is less than or equal to the standard output image resolution.

In one embodiment, step S222 may include one or both of the following two sub-cases.

(1) Sub-case 1: if the resolution of the output image block is equal to the standard output image resolution, the output image block is used in place of the target output image in steps S121, S122 and S123 described above, and those steps are carried out as before; the details are not repeated here.

(2) Sub-case 2: if the resolution of the output image block is smaller than the standard output image resolution, the calculation is still performed at the standard output image resolution; the region occupied by the output image block is designated within the standard output image held in the output cache, and the standard output image is used in place of the target output image to carry out steps S121, S122 and S123 described above. It should be noted that sub-case 2 is analogous to Case 2 above.
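The three cases can be expressed with one piece of tiling logic. The sketch below is only one possible reading: the application does not state how the target output image is divided, so the row-major grid split and the function name `plan_output_blocks` are assumptions made for illustration.

```python
def plan_output_blocks(target_w, target_h, std_w, std_h):
    """Split a target output image into blocks no larger than the standard size.

    Returns a list of (x, y, w, h) tuples. Each block is computed at the
    standard resolution; (w, h) is the designated region that is actually
    read back from the upper-left corner of the output cache.
    """
    if min(target_w, target_h, std_w, std_h) <= 0:
        raise ValueError("all dimensions must be positive")
    blocks = []
    for y in range(0, target_h, std_h):
        for x in range(0, target_w, std_w):
            w = min(std_w, target_w - x)   # full block, or a narrower edge block
            h = min(std_h, target_h - y)   # full block, or a shorter edge block
            blocks.append((x, y, w, h))
    return blocks
```

With a 640x480 standard output image, a 640x480 target gives a single full block (Case 1), a 320x240 target gives a single block whose designated region is smaller than the standard image (Case 2), and a 1000x480 target gives one full block plus one narrower block (Case 3, sub-cases 1 and 2).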

S130: write the historical partial processing result into the output cache while performing affine transformation and interpolation calculation on the current partial data, to obtain the complete processing result of the target output image.

Once the complete processing result of the target output image has been written into the output cache in its entirety, the complete processing result of the output image is read from the output cache and written to the designated address in the off-chip memory.

As described above, the embodiments of the present application provide a digital-chip-based image affine transformation method that caches local image content stored in DDR in several dedicated small-capacity SRAMs and computes the affine-transformed pixel values by interpolation. The interpolation calculation proceeds in parallel with the image caching, which greatly increases the calculation speed of the image affine transformation.

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.

FIG. 7 is a schematic architecture diagram of a further chip provided on the basis of FIG. 1 of the present application. For details not described in this chip embodiment, refer to the foregoing content.

In some embodiments, the multiple groups of input caches 231 include at least a first group input cache and a second group input cache. The chip further includes a scheduling module 235 configured to schedule the first group input cache and the second group input cache to take turns serving as the current group input cache that reads the image data of the input image, and to schedule the intermediate cache to read the parameter data of the input image. That is, while the current group input cache and the intermediate cache are reading the current partial data of the input image, the historical group input cache and the intermediate cache have already finished reading the historical partial data of the input image and are ready to transmit that data to the calculation module 232. The calculation module 232 is configured to use the parameter data in the intermediate cache to perform interpolation calculation on the historical partial data of the input image acquired from the off-chip memory 23A by the historical group input cache among the multiple groups of input caches 231, obtaining a partial processing result of the output image. The scheduling module 235 and the calculation module 232 operate in parallel.
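The alternation between the current and historical group input caches resembles a ping-pong (double-buffering) schedule. The following sketch models only the timeline, under the assumption that loading one part of the image and computing on one part each take one time slot; `pingpong_schedule` is a hypothetical name.

```python
def pingpong_schedule(n_chunks, n_groups=2):
    """Return, per time slot, (load, compute) pairs of (chunk index, cache group).

    In slot t the scheduler loads chunk t into group t % n_groups while the
    calculation module processes chunk t - 1 from the group loaded previously.
    """
    timeline = []
    for t in range(n_chunks + 1):
        load = (t, t % n_groups) if t < n_chunks else None
        compute = (t - 1, (t - 1) % n_groups) if t >= 1 else None
        timeline.append((load, compute))
    return timeline
```

Under this model, `n_chunks` parts take `n_chunks + 1` slots instead of the `2 * n_chunks` slots needed when loading and computing strictly alternate, and in every slot the two activities touch different cache groups.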

In some embodiments, continuing to refer to FIG. 6, the scheduling module 235 includes a read data request sending sub-module 2351 and a data receiving sub-module 2352.

In one embodiment, the read data request sending sub-module 2351 is configured to determine the amount of remaining data in the input image that has not yet been read in. If the remaining amount is 0, then after the current group input cache has finished reading, all of the multiple groups of input caches 231 are marked as unavailable. If the remaining amount is not 0, the available space of the current group input cache is determined: if the available space can hold a preset-length amount of data, a read data request is sent; if the available space cannot hold the preset-length amount of data, the current group input cache is switched.

The data receiving sub-module 2352 is configured to receive the current partial data of the input image sent by the off-chip memory in response to the read data request, and to write the image data and the parameter data of the current partial data into the current group input cache and the intermediate cache 234 respectively.
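The decision made by the read data request sending sub-module can be sketched as a small state function plus a driver loop. The names `next_action` and `simulate` are hypothetical, data amounts are in arbitrary units, and the driver assumes that a group being switched to has already been drained by the calculation module, which the text does not state.

```python
def next_action(remaining, free_space, burst_len):
    """Pick the next step from the remaining image data and current free space."""
    if remaining == 0:
        return "mark_all_unavailable"
    if free_space >= burst_len:
        return "send_read_request"
    return "switch_group"

def simulate(total, capacity, burst_len, n_groups=2):
    """Drive next_action until the whole input image has been requested."""
    if capacity < burst_len:
        raise ValueError("a cache group must hold at least one preset-length read")
    remaining, current, used, log = total, 0, 0, []
    while True:
        action = next_action(remaining, capacity - used, burst_len)
        if action == "mark_all_unavailable":
            log.append(("done", current))
            return log
        if action == "send_read_request":
            n = min(burst_len, remaining)      # the final read may be shorter
            used += n
            remaining -= n
            log.append(("read", current, n))
        else:
            current = (current + 1) % n_groups
            used = 0                           # assumption: this group is already drained
            log.append(("switch", current))
```

For example, `simulate(10, 4, 2)` issues five reads of two units each, switching groups after every second read.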

In some embodiments, the calculation module 232 includes a primary edge acquisition sub-module 2321 and an interpolation and writing sub-module 2322, which execute in series.

The primary edge acquisition sub-module 2321 is configured to obtain, according to the target output image and the principle of affine transformation, the mapped image of the target output image in the input image; to define one of the edges meeting at any vertex of the mapped image as the primary edge and the other as the secondary edge; to calculate, using the resolution of the target output image, the number of points and the coordinates of each point on the primary edge and the secondary edge of the mapped image; and to write those point counts and coordinates into the intermediate cache 234.

The interpolation and writing sub-module 2322 is configured to read from the intermediate cache 234 the number of points and the coordinates of each point on the primary edge and the secondary edge of the mapped image, to perform interpolation calculation on the historical partial data so as to generate a partial processing result of the output image, and to write the partial processing result into the output cache 233.
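The work of the primary edge acquisition sub-module can be illustrated as follows. The sketch assumes the affine transformation is given as a 2x3 matrix mapping output pixel coordinates to input image coordinates, that the main vertex is the image of output pixel (0, 0), and that the primary edge follows the output rows while the secondary edge follows the output columns. None of these conventions is fixed by the text, and the names `apply_affine` and `edge_points` are hypothetical.

```python
def apply_affine(m, x, y):
    """Map an output pixel (x, y) into the input image with a 2x3 affine matrix."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

def edge_points(m, out_w, out_h):
    """Return point coordinates and step vectors on the primary and secondary edges."""
    if out_w < 2 or out_h < 2:
        raise ValueError("output image must be at least 2x2")
    c = apply_affine(m, 0, 0)              # main vertex
    a = apply_affine(m, 0, out_h - 1)      # far end of the primary edge
    d = apply_affine(m, out_w - 1, 0)      # far end of the secondary edge
    primary_step = ((a[0] - c[0]) / (out_h - 1), (a[1] - c[1]) / (out_h - 1))
    secondary_step = ((d[0] - c[0]) / (out_w - 1), (d[1] - c[1]) / (out_w - 1))
    primary = [(c[0] + i * primary_step[0], c[1] + i * primary_step[1])
               for i in range(out_h)]      # one point per output row
    secondary = [(c[0] + j * secondary_step[0], c[1] + j * secondary_step[1])
                 for j in range(out_w)]    # one point per output column
    return primary, secondary, primary_step, secondary_step
```

Because the transformation is affine, every scan line parallel to the secondary edge shares the same step vector, so only the edge points and the two steps need to be held in the intermediate cache under this reading.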

An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the foregoing method embodiments.

An embodiment of the present application provides a computer program product which, when run on an electronic device, causes the electronic device to implement the steps of the foregoing method embodiments.

In the above embodiments, each embodiment is described with its own emphasis. For parts not detailed or recorded in a given embodiment, refer to the relevant descriptions of the other embodiments.

The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a RAM, an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content covered by the computer-readable medium may be expanded or reduced as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.

The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the scope of protection of the present application.

Claims

A chip-based image affine transformation method, characterized in that the chip comprises multiple groups of input caches, an intermediate cache, a computing unit and an output cache, and the image affine transformation method comprises:

triggering the current group input cache among the multiple groups of input caches and the intermediate cache to read, from an off-chip memory, the current partial image data of an input image and the parameter data required for calculation, and writing them into the current group input cache and the intermediate cache respectively, wherein the parameter data includes the target output image of the required field of view and the corresponding resolution;

while the current partial image data of the input image is being read, actuating the computing unit to perform, using the parameter data, affine transformation and interpolation calculation on the historical partial image data of the input image acquired by a historical group input cache among the multiple groups of input caches, to obtain a historical partial processing result of the target output image, wherein the historical group input cache reads its data before the current group input cache does;

writing the historical partial processing result into the output cache while performing affine transformation and interpolation calculation on the current partial data, to obtain the complete processing result of the target output image.

The method of claim 1, wherein actuating the computing unit to perform, using the parameter data, affine transformation and interpolation calculation on the historical partial image data of the input image acquired by the historical group input cache among the multiple groups of input caches, to obtain the historical partial processing result of the target output image, comprises:

obtaining, based on the target output image of the required field of view and the principle of affine transformation, the mapped image of the target output image in the input image;

defining the edges meeting at any vertex of the mapped image as a primary edge and a secondary edge respectively, and calculating the number of points and the coordinates of each point on the primary edge and the secondary edge of the mapped image according to the resolution of the target output image;

performing interpolation calculation on the historical partial image data based on the coordinates of each point on the primary edge and the secondary edge, to obtain the historical partial processing result of the target output image.

The method according to claim 2, characterized in that obtaining, based on the target output image of the required field of view and the principle of affine transformation, the mapped image of the target output image in the input image comprises:

calculating, based on the principle of affine transformation, the coordinates to which the four vertices of the target output image are mapped in the input image, and obtaining the mapped image of the target output image in the input image from those vertex coordinates.

The method according to any one of claims 1 to 3, characterized in that actuating the computing unit to perform, using the parameter data, affine transformation and interpolation calculation on the historical partial image data of the input image acquired by the historical group input cache among the multiple groups of input caches, to obtain the historical partial processing result of the target output image, comprises:

determining the relationship between the resolution of the target output image and the standard output image resolution of the output cache, and adaptively selecting, according to the result, whether to use the standard output image in place of the target output image for the subsequent calculation.

The method according to claim 2 or 3, characterized in that, after the number of points and the coordinates of each point on the primary edge and the secondary edge of the mapped image are obtained, the method further comprises:

determining, from the number of points and the coordinates on the primary edge and the secondary edge, the step by which to advance during the interpolation calculation, where the position reached by each step corresponds to one pixel of the target output image, the step along the primary edge being the primary-edge step and the step along the secondary edge being the secondary-edge step.

The method of claim 5, wherein performing interpolation calculation on the historical partial image data based on the coordinates of each point on the primary edge and the secondary edge, to obtain the historical partial processing result of the target output image, comprises:

taking, according to their coordinates, each of the points spaced by the primary-edge step on the primary edge as a starting point, advancing by the secondary-edge step in the direction parallel to the secondary edge of the mapped image, and performing the interpolation calculation on each point reached that belongs to the historical partial data to obtain the pixel value of that point, generating a partial processing result of the target output image;

for each point reached that does not belong to the historical partial data, writing the relevant information of that point into the intermediate cache, so that when the current partial data acquired by the current group input cache includes that point, the information can be read from the intermediate cache and interpolation calculation performed to obtain the pixel value of that point, yielding the current partial processing result of the target output image.

A chip, characterized by comprising:

multiple groups of input caches, configured to take turns reading partial image data of an input image from an off-chip memory;

an intermediate cache, configured to read partial parameter data of the input image from the off-chip memory in step with the timing at which the multiple groups of input caches read data, wherein the parameter data includes the target output image of the required field of view and the corresponding resolution;

a calculation module, configured to, while the current group input cache among the multiple groups of input caches acquires the current partial data of the input image from the off-chip memory, use the parameter data to perform interpolation calculation on the historical partial data of the input image acquired from the off-chip memory by a historical group input cache, to obtain a partial processing result of the output image, wherein the historical group input cache reads its data before the current group input cache does;

an output cache, configured to store the partial processing result of the output image.

The chip of claim 7, further comprising a scheduling module configured to schedule each group input cache among the multiple groups of input caches to take turns serving as the current group input cache that reads the image data of the input image, and to schedule the intermediate cache to read the parameter data of the input image, wherein the scheduling module and the calculation module operate in parallel.

The chip according to claim 8, characterized in that the scheduling module includes a read data request sending sub-module and a data receiving sub-module;

the read data request sending sub-module is configured to determine the amount of remaining data in the input image that has not yet been read in; if the remaining amount is 0, to mark all of the multiple groups of input caches as unavailable after the current group input cache has finished reading; if the remaining amount is not 0, to determine the available space of the current group input cache, and to send a read data request if the available space can hold a preset-length amount of data, or to switch the current group input cache if the available space cannot hold the preset-length amount of data;

the data receiving sub-module is configured to receive the current partial data of the input image sent by the off-chip memory in response to the read data request, and to write the current partial data into the current group input cache.

The chip according to any one of claims 7 to 9, characterized in that the calculation module includes a primary edge acquisition sub-module and an interpolation and writing sub-module, which operate in series;

the primary edge acquisition sub-module is configured to obtain, based on the target output image and the principle of affine transformation, the mapped image of the target output image in the input image; to define one of the edges meeting at any vertex of the mapped image as the primary edge and the other as the secondary edge; to calculate, using the resolution of the target output image, the number of points and the coordinates of each point on the primary edge and the secondary edge of the mapped image; and to write those point counts and coordinates into the intermediate cache;

the interpolation and writing sub-module is configured to read from the intermediate cache the number of points and the coordinates of each point on the primary edge and the secondary edge of the mapped image, to perform interpolation calculation on the historical partial data so as to generate a partial processing result of the output image, and to write the partial processing result into the output cache.

A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the image affine transformation method of any one of claims 1 to 6.