CN103718244A - Acquisition method and device for media processing accelerator - Google Patents
Acquisition method and device for media processing accelerator
- Publication number
- CN103718244A (application number CN201280036339.6A)
- Authority
- CN
- China
- Prior art keywords
- register
- row
- pixel values
- tetris
- aligned
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/39—Control of the bit-mapped memory
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/60—Memory management
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2360/00—Aspects of the architecture of display systems
- G09G2360/12—Frame memory handling
- G09G2360/121—Frame memory handling using a cache memory
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2360/00—Aspects of the architecture of display systems
- G09G2360/12—Frame memory handling
- G09G2360/122—Tiling
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/363—Graphics controllers
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Image Processing (AREA)
Abstract
Description
Background
Video planes are typically stored in memory in a tiled format to improve memory controller efficiency. Video processing algorithms frequently need to access a 2D region of interest (ROI) of arbitrary rectangular size at an arbitrary location within these video planes. Such locations may not be cache-aligned and may span several non-contiguous cache lines and/or tiles. To gather pixels from such a location, conventional approaches may over-fetch several cache lines of pixel data from memory and then perform swizzle, mask, and reduction operations, which makes the gather process challenging.
Power-efficient media processing is typically performed by programmable vector or scalar architectures, or by fixed-function logic. In a conventional vector implementation, the pixel values of an ROI may be gathered using vector gather instructions, which typically involves: collecting some of the values in a row of pixel values from one cache line, masking out any invalid values, storing the values in a buffer or memory, collecting additional pixel values of that row from the next cache line, and repeating this process until a complete horizontal row of pixel values has been gathered. As a result, to accommodate tiled formats, a typical vector gather process often needs to reissue the same cache line multiple times using different masks.
Brief Description of the Drawings
The material described herein is illustrated by way of example and not by way of limitation in the accompanying figures. For simplicity and clarity of illustration, elements illustrated in the figures are not necessarily drawn to scale. For example, the dimensions of some elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference labels have been repeated among the figures to indicate corresponding or analogous elements. In the figures:
FIG. 1 is a schematic diagram of an example system;
FIG. 2 illustrates an example process;
FIG. 3 illustrates an example tiled memory format;
FIG. 4 illustrates an example tiled memory format;
FIGS. 5, 6, and 7 illustrate the example system of FIG. 1 in various contexts;
FIG. 8 illustrates additional portions of the example process of FIG. 2;
FIG. 9 illustrates the example system of FIG. 1 under overflow conditions; and
FIG. 10 is a schematic diagram of an example system, all arranged in accordance with at least some implementations of the present disclosure.
Detailed Description
One or more embodiments are now described with reference to the accompanying figures. While specific configurations and arrangements are discussed, it should be understood that this is done for illustrative purposes only. Persons skilled in the relevant art will recognize that other configurations and arrangements may be employed without departing from the spirit and scope of the description. It will be apparent to those skilled in the art that the techniques and/or arrangements described herein may also be employed in a variety of systems and applications other than those described herein.
While the following description sets forth various implementations that may be manifested in architectures such as system-on-a-chip (SoC) architectures, implementation of the techniques and/or arrangements described herein is not restricted to particular architectures and/or computing systems, and they may be implemented by any architecture and/or computing system for similar purposes. For instance, various architectures employing, for example, multiple integrated circuit (IC) chips and/or packages, and/or various computing devices and/or consumer electronics (CE) devices such as set-top boxes and smartphones, may implement the techniques and/or arrangements described herein. Further, while the following description may set forth numerous specific details, such as logic implementations, types and interrelationships of system components, and logic partitioning/integration choices, claimed subject matter may be practiced without such specific details. In other instances, some material, such as control structures and full software instruction sequences, may not be shown in detail in order not to obscure the material disclosed herein.
The material disclosed herein may be implemented in hardware, firmware, software, or any combination thereof. It may also be implemented as instructions stored on a machine-readable medium, which may be read and executed by one or more processors. A machine-readable medium may include any medium and/or mechanism for storing or transmitting information in a form readable by a machine (e.g., a computing device). For example, a machine-readable medium may include read-only memory (ROM); random-access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical, or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); and others.
References in the specification to "one embodiment," "an embodiment," "an example embodiment," etc., indicate that the implementation described may include a particular feature, structure, or characteristic, but every implementation need not include that particular feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same implementation. Further, when a particular feature, structure, or characteristic is described in connection with an implementation, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other implementations, whether or not explicitly described herein.
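FIG. 1 illustrates an example implementation of a gather engine 100 in accordance with the present disclosure. In various implementations, gather engine 100 may form at least a portion of a media processing accelerator. Gather engine 100 includes a register array 102, a barrel shifter 104, two gather register buffers (GRBs) 106 and 108, and a multiplexer (MUX) 110. Register array 102 includes multiple tetris registers 112, 114, 116, 118, and 120, each having multiple register storage locations or portions 122. In various implementations, a tetris register in accordance with the present disclosure may be any temporary storage logic, such as processor register logic configured to be byte-marked or byte-enabled.
In accordance with the present disclosure, gather engine 100 may be used to gather video data from a region of interest (ROI) of a video plane stored in memory such as a cache (e.g., an L1 cache). In various implementations, the ROI may include any type of video data, such as pixel intensity values. In various implementations, engine 100 may be configured to store the contents of multiple cache lines (CLs) received from a cache (not shown) so that each cache line (e.g., CL1, CL2, etc.) is stored across the portions 122 of a corresponding one of tetris registers 112-120 of array 102. In various implementations, the first portions of the tetris registers may form a first row 124 of array 102, the second portions of the tetris registers may form a second row 126 of the array, and so on.
In accordance with the present disclosure, cache line contents may be stored in array 102 so that different portions of each CL's contents are stored in different portions of the corresponding tetris register. For example, in various implementations, the most significant portion of CL1 may be stored in a first portion 128 of tetris register 112, the most significant portion of CL2 may be stored in a first portion 130 of tetris register 114, and so on. The next most significant portion of CL1 may be stored in a second portion 132 of tetris register 112, the next most significant portion of CL2 may be stored in a second portion 134 of tetris register 114, and so on.
In accordance with the present disclosure, the number of rows of array 102 may match the number of oct-words (OWs) in the cache lines to be processed, while the number of columns of array 102 (and hence the number of tetris registers employed) may match the number of cache line OWs plus one. In the example of FIG. 1, engine 100 may be configured to gather 64-byte cache lines, so that each tetris register includes four portions 122 to store the four 16-byte OW portions of the corresponding cache line, and array 102 therefore includes four rows. For example, the most significant OW of CL1 may be stored in portion 128 of tetris register 112, the next most significant OW of CL1 may be stored in portion 132 of register 112, and so on. As explained in greater detail below, to accommodate and process misaligned and/or overflowing cache line contents, a gather engine in accordance with the present disclosure may include at least one more tetris register than the number needed to store the cache line OWs. For example, to process 64-byte cache lines having four OWs, array 102 includes five tetris registers 112-120, so that each row of array 102 spans a total of 80 bytes in width.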
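The sizing rule in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration: the function name `array_shape` and its parameters are hypothetical and do not come from the disclosure.

```python
def array_shape(cache_line_bytes=64, ow_bytes=16):
    """Return (rows, columns, row_width_bytes) for the tetris register array.

    Illustrative sketch of the sizing rule only; all names are hypothetical.
    """
    if cache_line_bytes % ow_bytes:
        raise ValueError("cache line must be a whole number of oct-words")
    ows_per_line = cache_line_bytes // ow_bytes
    rows = ows_per_line          # one array row per OW of a cache line
    columns = ows_per_line + 1   # one extra tetris register for misaligned or overflow data
    return rows, columns, columns * ow_bytes


print(array_shape())  # (4, 5, 80) for 64-byte cache lines and 16-byte OWs
```

With the default arguments this reproduces the four rows, five tetris registers, and 80-byte row width of the FIG. 1 example.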
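Barrel shifter 104 may receive the contents of any row of register array 102. For example, barrel shifter 104 may be a 64-byte barrel shifter configured to receive the contents of row 124, which correspond to the most significant portions of the five cache lines stored in array 102. In various implementations, as explained in greater detail below, barrel shifter 104 may align the contents of register portions 122 by, for example, left-shifting them, and may then provide the aligned contents to GRB 106 or GRB 108. For example, barrel shifter 104 may receive the contents of the portions 122 of row 124 in successive iterations, align those contents, and provide the aligned contents to GRB 106. Thus, barrel shifter 104 may receive the contents of register portion 128, align those contents, and provide the aligned data to GRB 106. Barrel shifter 104 may then receive the contents of register portion 130, align those contents, and provide the aligned data to GRB 106 for temporary storage adjacent to the aligned data corresponding to register portion 128, and so on, until the contents of row 124 are aligned and stored in GRB 106 to form an aligned row of pixel data.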
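A behavioral sketch of this shift-and-pack step is given below, assuming that alignment means discarding a fixed number of invalid leading bytes from the row. The function `align_row` and the parameter `offset` are hypothetical names, and the byte-level model is a simplification rather than the hardware design.

```python
def align_row(portions, offset, grb_bytes=64):
    """Left-shift each register portion by `offset` bytes and pack it into a GRB.

    portions -- list of equal-length bytes objects, one per tetris register
    offset   -- number of invalid leading bytes in the first portion (hypothetical name)
    """
    ow_bytes = len(portions[0])
    grb = bytearray(grb_bytes)
    for index, portion in enumerate(portions):   # one iteration per register portion
        start = index * ow_bytes - offset        # where this portion's byte 0 lands after the shift
        for j, value in enumerate(portion):
            pos = start + j
            if 0 <= pos < grb_bytes:             # drop bytes shifted out of, or beyond, the GRB
                grb[pos] = value
    return bytes(grb)


row = bytes(range(80))                            # 80-byte array row: 5 portions of 16 bytes
portions = [row[i:i + 16] for i in range(0, 80, 16)]
aligned = align_row(portions, offset=5)
print(aligned[0], aligned[-1])  # 5 68
```

Each portion is placed next to the previously aligned data, so after the last iteration the buffer holds one contiguous 64-byte row.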
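While engine 100 processes the contents of row 124 as just described, it may also process the contents of row 126 in a similar manner until the contents of row 126 are aligned and stored in GRB 108 to form a second aligned row of pixel values. In various implementations, as explained in greater detail below, GRB 106 and GRB 108 may provide aligned rows of pixel data to a 2D register file (RF) (not shown) in a ping-pong manner, using MUX 110 to supply the contents of GRB 106 and GRB 108 to the RF alternately.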
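The ping-pong use of the two buffers can be pictured with a small scheduling sketch. It assumes, purely for illustration, that one row is filled and one row is drained per step; the disclosure does not specify this timing, and `ping_pong_schedule` is a hypothetical helper.

```python
def ping_pong_schedule(num_rows, buffers=("GRB106", "GRB108")):
    """Return a list of (fill, drain) pairs, one per step.

    fill  -- (buffer, row) being filled by the barrel shifter, or None
    drain -- (buffer, row) being written to the register file via the MUX, or None
    Assumes one fill and one drain per step (an illustrative simplification).
    """
    schedule = []
    for step in range(num_rows + 1):
        fill = (buffers[step % 2], step) if step < num_rows else None
        drain = (buffers[(step - 1) % 2], step - 1) if step > 0 else None
        schedule.append((fill, drain))
    return schedule


for fill, drain in ping_pong_schedule(4):
    print("fill:", fill, "drain:", drain)
```

In this model the buffer being drained is never the one being filled, which is the point of alternating between two buffers.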
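In various implementations, gather engine 100 may be implemented in one or more integrated circuits (ICs), such as a system-on-a-chip (SoC) and additional ICs of a consumer electronics (CE) media processing system. For example, engine 100 may be implemented by any device configured to process video data, such as, but not limited to, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), and so on. As noted above, although engine 100 includes five tetris registers 112-120 suitable for processing 64-byte cache lines, a gather engine in accordance with the present disclosure may include any number of tetris registers, depending on the size of the cache lines and/or the ROI being processed.
FIG. 2 illustrates a flow diagram of an example process 200 for implementing gather operations in accordance with various implementations of the present disclosure. Process 200 may include one or more operations, functions, or actions as illustrated by one or more of blocks 201, 202, 204, 206, 208, 210, and 212 of FIG. 2. By way of non-limiting example, process 200 is described herein with reference to example gather engine 100 of FIG. 1. Process 200 may begin at block 201, where gather processing of an ROI of a video plane starts. For example, process 200 may begin at block 201 with the start of gather processing for a 64x64 ROI (i.e., an ROI spanning 64 rows, each having 64 bytes of pixel values).
At block 202, a first cache line (CL) may be received, where the CL corresponds to the first CL of data contained in the ROI. At block 204, the CL may be divided into a most significant portion, a next most significant portion, and so on. For example, if a 64-byte CL is received at block 202, the CL may be divided into four 16-byte OW portions. The CL portions may then be loaded into a register array (block 206) so that the most significant portion is stored in the first position of the first row of the array, the next most significant portion is stored in the first position of the second row of the array, and so on. For example, a 64-byte CL (CL1) received by array 102 may be divided into four OWs and loaded into the register portions 122 of the first tetris register 112, so that the most significant OW is stored in portion 128, the next most significant OW is stored in portion 132, and so on.
At block 208, a determination is made as to whether additional cache lines of data are to be obtained for the ROI. If additional CLs are to be obtained, process 200 may loop back and undertake blocks 202-206 for the next CL of the ROI. For example, the next 64-byte CL (CL2) may be received by array 102, divided into four OWs, and loaded into the register portions 122 of the second tetris register 114, so that the most significant OW is stored in portion 130, the next most significant OW is stored in portion 134, and so on. In this manner, process 200 may continue to loop through successive iterations of blocks 202-206 until one or more additional CLs of the ROI have been loaded into array 102. Continuing the example above, three more CLs of the ROI (e.g., CL3, CL4, and CL5) may be received by array 102, divided into four OWs in a similar manner, and loaded into the register portions 122 of the remaining tetris registers 116, 118, and 120.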
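The load loop of blocks 202-208 might be modeled as follows. This is a minimal sketch that assumes the "most significant" OW is the first 16 bytes of the cache line; the byte ordering and the names `split_into_ows` and `load_array` are assumptions, not details taken from the disclosure.

```python
OW_BYTES = 16  # one oct-word


def split_into_ows(cache_line, ow_bytes=OW_BYTES):
    """Block 204: divide a cache line into OW-sized portions."""
    return [cache_line[i:i + ow_bytes] for i in range(0, len(cache_line), ow_bytes)]


def load_array(cache_lines, ow_bytes=OW_BYTES):
    """Blocks 202-208: load successive cache lines into successive tetris registers.

    Returns array[row][column], where column c holds the portions of cache line c.
    Assumes all cache lines have the same length.
    """
    array = []
    for column, cache_line in enumerate(cache_lines):    # block 208: loop while CLs remain
        portions = split_into_ows(cache_line, ow_bytes)  # block 204
        if column == 0:
            array = [[] for _ in portions]
        for row, portion in enumerate(portions):         # block 206
            array[row].append(portion)
    return array


cache_lines = [bytes([c]) * 64 for c in range(1, 6)]      # stand-ins for CL1..CL5
array = load_array(cache_lines)
print(len(array), len(array[0]))  # 4 5
```

Each array row then holds one OW from every loaded cache line, which is the unit the barrel shifter consumes.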
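FIGS. 3 and 4 illustrate an example tile-y format for storing video planes in tiled memory in accordance with various implementations of the present disclosure. In FIG. 3, a 4 KB tile 300 of memory may include eight (8) columns by thirty-two (32) rows of 16-byte-wide storage locations. In tile-y format, tile 300 may store the four OWs of a 64-byte CL 302 as the first portion of a column of tile 300. In this manner, tile 300 may store sixty-four (64) cache lines of data. FIG. 4 shows tile 300 spanning a portion of a region 400 of memory such as a cache. With reference to process 200 and engine 100, the successive iterations of blocks 202-206 used to load the CLs of the ROI may include successively loading cache lines 402-410 of tile 300 into array 102.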
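To make the tile geometry concrete, the following sketch maps a byte coordinate inside one tile to a byte offset and a cache line index. It assumes that OWs are laid out down each column first and then column by column, which is consistent with the description of CL 302 but is not spelled out as a formula in the text; all function names are hypothetical.

```python
OW_BYTES = 16       # width of one storage location
TILE_ROWS = 32
TILE_COLUMNS = 8
CL_BYTES = 64


def tile_y_offset(x, y):
    """Byte offset inside one 4 KB tile of the byte at horizontal position x, row y.

    Assumes column-major ordering of OWs within the tile (an assumption).
    """
    if not (0 <= x < TILE_COLUMNS * OW_BYTES and 0 <= y < TILE_ROWS):
        raise ValueError("coordinate outside the tile")
    column, byte_in_ow = divmod(x, OW_BYTES)
    return column * TILE_ROWS * OW_BYTES + y * OW_BYTES + byte_in_ow


def cache_line_index(x, y):
    """Index (0..63) of the cache line within the tile that holds byte (x, y)."""
    return tile_y_offset(x, y) // CL_BYTES


def cache_lines_for_row(x0, y, width):
    """Distinct cache lines touched by `width` bytes of row y starting at x0."""
    seen = []
    for x in range(x0, x0 + width):
        index = cache_line_index(x, y)
        if index not in seen:
            seen.append(index)
    return seen


print(cache_lines_for_row(0, 0, 64))  # [0, 8, 16, 24]
print(cache_lines_for_row(5, 0, 64))  # [0, 8, 16, 24, 32]
```

Under this layout a single 64-byte pixel row touches four non-adjacent cache lines when aligned and five when misaligned, which matches the five tetris registers of array 102.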
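Returning to the discussion of FIG. 2, once one or more CLs of the ROI have been loaded into the register array, process 200 may continue at block 210, where each successive portion of the first row of the array is loaded into the barrel shifter and, if necessary, the contents of that portion are aligned. For example, block 210 may include loading the contents of the first portion 128 of row 124 into shifter 104 and then left-shifting the data to align it with GRB 106. In some implementations, if the cache lines were already aligned when they were loaded into the array at blocks 202-206, block 210 may not include aligning the contents. At block 212, the aligned first row of pixel values may be provided to a first gather buffer. For example, the aligned pixel value contents of row 124 may be provided from barrel shifter 104 to GRB 106.
For example, FIG. 5 illustrates engine 100 in a context 500 in which blocks 210 and 212 of process 200 are undertaken for the first register portion, in accordance with various implementations of the present disclosure. In context 500, as shown, five CLs of the ROI have been loaded into array 102, and the contents of the ROI (shown by dashed-line markings) are not aligned with respect to array 102. In this example, the first CL of the ROI (e.g., CL1) is loaded into the first tetris register 112 such that each portion 122 of tetris register 112 includes an invalid part 502. In accordance with the present disclosure, when block 210 is undertaken for the first register portion 128 of row 124, the contents of portion 128 are loaded into shifter 104 and left-shifted so that, when the contents are provided to GRB 106 at block 212, the data is aligned with GRB 106 as shown.
Continuing the example, FIG. 6 illustrates engine 100 in a context 600 in which blocks 210 and 212 of process 200 are undertaken for the next register portion, in accordance with various implementations of the present disclosure. In context 600, blocks 210 and 212 are undertaken for the next portion 130 of row 124 by loading the contents of portion 130 of tetris register 114 into shifter 104, left-shifting the data, and then providing the aligned data to GRB 106 so that it is stored adjacent to the aligned data from portion 128, as shown. In this manner, at the conclusion of blocks 210 and 212, the fully aligned contents of row 124 may be stored in GRB 106, as shown in FIG. 7, which illustrates engine 100 in a context 700 in which blocks 210 and 212 of process 200 have been completed for the first register row 124, in accordance with various implementations of the present disclosure.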
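Returning to the discussion of FIG. 2, once the aligned contents of the first row have been loaded into the first gather buffer at block 212, process 200 may continue with the processing of any additional rows of the register array. FIG. 8 illustrates a flow diagram of additional portions of example process 200 for implementing gather operations in accordance with various implementations of the present disclosure. The additional portions of process 200 may include one or more operations, functions, or actions as illustrated by one or more of blocks 214, 215, 216, 218, 220, and 222 of FIG. 8. By way of non-limiting example, the additional blocks of process 200 are also described herein with reference to example gather engine 100 of FIG. 1. Process 200 may continue at block 214 of FIG. 8.
At block 214, the contents of the portions of the second row of the array may be successively loaded into the barrel shifter and, if necessary, aligned. At block 215, the aligned contents of the register portions may be merged into a second gather buffer. For example, blocks 214 and 215 may include loading the contents of the first portion 132 of second row 126 into shifter 104, left-shifting the data, loading the aligned data into GRB 108, loading the contents of the second portion 134 of second row 126 into shifter 104, left-shifting the data, loading the aligned data into GRB 108 adjacent to the aligned data from portion 132, and so on, until all portions of the second row have been processed. Thus, in this example, at the conclusion of blocks 214 and 215, the aligned contents of the second row 126 of register array 102 may have been loaded into GRB 108.
While blocks 214 and/or 215 are being undertaken, the aligned contents of the first row may be provided from the first gather register buffer to the 2D register file at block 216. For example, block 216 may include using MUX 110 to provide the aligned first row of data stored in GRB 106 to the RF, where it may be stored as the first row of data. At block 218, the aligned contents of the second row may be provided from the second gather register buffer to the RF. For example, block 218 may include using MUX 110 to provide the aligned second row of data stored in GRB 108 to the RF, where it may be stored as the second row of data.
Process 200 may continue at block 220, where additional rows of the register array are processed in a manner similar to that described above for the first two rows. Thus, for example, block 220 may result in the aligned contents of the two remaining rows of four-row array 102 being stored in the RF as the next two rows of data, and may complete the processing of those rows of the array. At block 222, a determination may be made as to whether more cache lines should be gathered for the ROI. For example, if the first iteration of process 200 resulted in four rows of a 64x64 ROI being gathered, gather operations may continue for the next four rows of the ROI. If gather operations are to continue for the ROI, process 200 may return to FIG. 2 and start a second pass of process 200 at block 201 for one or more additional cache lines of the ROI. Otherwise, if gather operations are not to continue, process 200 may end.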
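Putting the steps of FIGS. 2 and 8 together, one pass of the process and the outer loop of block 222 could be sketched as below. This is a behavioral model under stated assumptions, not the hardware design: alignment is modeled as a byte slice, the first 16 bytes of a cache line are treated as its most significant OW, and `gather_pass`, `gather_roi`, and `fake_fetch` are hypothetical names.

```python
OW_BYTES = 16


def gather_pass(cache_lines, offset, roi_width=64):
    """One pass of the process: return the aligned rows written to the register file."""
    rows = len(cache_lines[0]) // OW_BYTES
    # Blocks 202-208: array[r][c] holds OW r of cache line c.
    array = [[cl[r * OW_BYTES:(r + 1) * OW_BYTES] for cl in cache_lines]
             for r in range(rows)]
    grbs = [b"", b""]
    register_file = []
    for r in range(rows):
        row_bytes = b"".join(array[r])
        aligned = row_bytes[offset:offset + roi_width]  # net effect of shifting and packing
        grbs[r % 2] = aligned                            # alternate between the two gather buffers
        register_file.append(grbs[r % 2])                # MUX selects the buffer just filled
    return register_file


def gather_roi(fetch_cache_lines, offset, roi_rows=64, roi_width=64):
    """Block 222 loop: repeat passes until all ROI rows have been gathered."""
    register_file = []
    first_row = 0
    while first_row < roi_rows:
        rows = gather_pass(fetch_cache_lines(first_row), offset, roi_width)
        register_file.extend(rows)
        first_row += len(rows)
    return register_file[:roi_rows]


def fake_fetch(first_row):
    """Hypothetical cache model: the byte at (row, x) has value (row + x) % 256."""
    return [bytes((first_row + r + c * OW_BYTES + j) % 256
                  for r in range(4) for j in range(OW_BYTES))
            for c in range(5)]


rf = gather_roi(fake_fetch, offset=5)
print(len(rf), rf[0][0], rf[63][0])  # 64 5 68
```

For a 64x64 ROI this model performs sixteen passes of four rows each, mirroring the "next four rows" example in the text.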
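While implementation of example process 200, as illustrated in FIGS. 2 and 8, may include undertaking all of the blocks shown in the order illustrated, the present disclosure is not limited in this regard and, in various examples, implementation of process 200 may include undertaking only a subset of the blocks shown and/or undertaking them in a different order than illustrated. For example, in various implementations, block 216 of FIG. 8 may be undertaken before, during, and/or after either or both of blocks 214 and 215. In addition, gather processing in accordance with the present disclosure may be undertaken at different stages of filling of the register array, so that if, at any time, one or more rows of the register array are empty, those rows may be loaded with ROI pixel values from the cache while the array rows holding ROI pixel values are processed as described herein.
In addition, any one or more of the processes and/or blocks of FIGS. 2 and 8 may be undertaken in response to instructions provided by one or more computer program products. Such program products may include signal-bearing media providing instructions that, when executed by, for example, one or more processor cores, may provide the functionality described herein. The computer program products may be provided in any form of computer-readable medium. Thus, for example, a processor including one or more processor cores may undertake one or more of the blocks shown in FIGS. 2 and 8 in response to instructions conveyed to the processor by a computer-readable medium.
Further, while process 200 has been described herein in the context of example gather engine 100 gathering 64-byte cache lines for a 64x64 ROI of a video plane stored in a cache in tile-y format, the present disclosure is not limited to particular cache line sizes, ROI sizes or shapes, and/or particular tiled memory formats. For example, to implement gather processing for an ROI wider than 64 bytes, one or more additional tetris registers may be added to the register array. In addition, for an ROI of smaller width, such as a 32x64 ROI, the first two rows of the array may be collected in a gather buffer before being written out to the RF. Further, other tiled memory formats, such as tile-x, may be subjected to gather processing in accordance with the present disclosure.
In various implementations, one or more processor cores may use engine 100 to undertake process 200 for an ROI of any size and/or shape and for any alignment of the ROI data with respect to engine 100. In doing so, processor throughput may depend on the size, shape, and/or alignment of the ROI. For instance, in a non-limiting example, if the ROI to be gathered extends in the X direction (e.g., as one row of pixel values in tile-y format) and is fully aligned, one cache line may be processed in two cycles. In such a context, throughput may be limited by the cache width. On the other hand, if the ROI extends in the Y direction (e.g., as one column of pixel values in tile-y format) and is fully aligned, one cache line may be processed in 64 cycles. In another non-limiting example, for a completely misaligned 17x17 ROI, one cache line may be processed in 12 cycles. In a final non-limiting example, the pixel values of an aligned 24x24 ROI may be gathered in 50 cycles, whereas if the 24x24 ROI is completely misaligned, 81 cycles may be needed to gather all the pixel values.
In various implementations, gather processes in accordance with the present disclosure may be undertaken under overflow conditions. For example, referring to example gather engine 100, in some implementations the ROI may exceed the width of barrel shifter 104 and of GRB 106 and GRB 108. FIG. 9 illustrates engine 100 in a context 900 in which process 200 is undertaken under overflow conditions, in accordance with various implementations of the present disclosure. As shown in FIG. 9, after GRB 106 has been filled with the majority of the first row, the overflow data 902 remaining from the first row may be placed into GRB 108. Processing of the remaining rows may continue in a similar manner.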
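The overflow case can be illustrated with a short sketch that splits an aligned row between the two gather buffers. The 72-byte row width is an arbitrary example value, and `pack_with_overflow` is a hypothetical helper rather than anything named in the disclosure.

```python
def pack_with_overflow(aligned_row, grb_bytes=64):
    """Split an aligned row between two gather buffers.

    Returns (primary, overflow): the first `grb_bytes` bytes, then whatever remains.
    """
    primary = aligned_row[:grb_bytes]
    overflow = aligned_row[grb_bytes:]
    if len(overflow) > grb_bytes:
        raise ValueError("row does not fit in two gather buffers")
    return primary, overflow


row = bytes(range(72))                    # a 72-byte-wide ROI row, wider than one GRB
grb106, grb108 = pack_with_overflow(row)
print(len(grb106), len(grb108))  # 64 8
```

In this model the second buffer holds only the tail of the row, so it is not available for the next row until both parts have been written out.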
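FIG. 10 illustrates an example system 1000 in accordance with the present disclosure. System 1000 may be used to perform some or all of the various functions discussed herein and may include any device or collection of devices capable of undertaking gather processing in accordance with various implementations of the present disclosure. For example, system 1000 may include selected components of a computing platform or device such as a desktop, mobile, or tablet computer, a smartphone, a set-top box, and so on, although the present disclosure is not limited in this regard. In some implementations, system 1000 may be a computing platform or SoC based on Intel architecture (IA) for CE devices. Those skilled in the art will readily appreciate that the implementations described herein may be used with alternative processing systems without departing from the scope of the present disclosure.
System 1000 includes a processor 1002 having one or more processor cores 1004. Processor cores 1004 may be any type of processor logic capable, at least in part, of executing software and/or processing data signals. In various examples, processor cores 1004 may include CISC processor cores, RISC microprocessor cores, VLIW microprocessor cores, and/or any number of processor cores implementing any combination of instruction sets, or any other processor devices, such as digital signal processors or microcontrollers. In various implementations, one or more processor cores 1004 may implement a gather engine and/or undertake gather processing in accordance with the present disclosure.
Processor 1002 also includes a decoder 1006 that may be used to decode instructions received by, for example, a display processor 1008 and/or a graphics processor 1010 into control signals and/or microcode entry points. While illustrated in system 1000 as components distinct from cores 1004, those skilled in the art will recognize that one or more of cores 1004 may implement decoder 1006, display processor 1008, and/or graphics processor 1010. In response to the control signals and/or microcode entry points, display processor 1008 and/or graphics processor 1010 may perform corresponding operations.
Processor cores 1004, decoder 1006, display processor 1008, and/or graphics processor 1010 may be communicatively and/or operably coupled through a system interconnect 1016 with each other and/or with various other system devices, which may include, but are not limited to, a memory controller 1014, an audio controller 1018, and/or peripherals 1020. Peripherals 1020 may include, for example, a universal serial bus (USB) host port, a Peripheral Component Interconnect (PCI) Express port, a Serial Peripheral Interface (SPI), an expansion bus, and/or other peripherals. While FIG. 10 illustrates memory controller 1014 as being coupled to decoder 1006 and processors 1008 and 1010 by interconnect 1016, in various implementations memory controller 1014 may be directly coupled to decoder 1006, display processor 1008, and/or graphics processor 1010.
In some implementations, system 1000 may communicate with various I/O devices not shown in FIG. 10 via an I/O bus (also not shown). Such I/O devices may include, but are not limited to, a universal asynchronous receiver/transmitter (UART) device, a USB device, an I/O expansion interface, or other I/O devices. In various implementations, system 1000 may represent at least part of a system for undertaking mobile, network, and/or wireless communications.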
System 1000 may further include memory 1012. Memory 1012 may be one or more discrete memory components, such as a dynamic random-access memory (DRAM) device, a static random-access memory (SRAM) device, a flash memory device, or other memory devices. Memory 1012 may store instructions and/or data, represented by data signals, that may be executed by processor 1002. In some implementations, memory 1012 may include a system memory portion and a display memory portion. In various implementations, memory 1012 may store video data, such as frames of video data including pixel values that may, at various junctures, be stored as cache lines gathered by engine 100 and/or processed by process 200.
While FIG. 10 illustrates memory 1012 external to processor 1002, in various implementations processor 1002 includes one or more instances of internal cache 1024, such as an L1 cache. In accordance with the present disclosure, cache 1024 may store video data, such as pixel values, in the form of cache lines arranged in tile-y format. Processor cores 1004 may access the data stored in cache 1024 to implement the gather functions described herein. Further, cache 1024 may provide a 2D register file that stores the aligned data output of engine 100 and process 200. In various implementations, cache 1024 may receive video data, such as pixel values, from memory 1012.
The systems described above, and the processing performed by them as described herein, may be implemented in hardware, firmware, or software, or any combination thereof. In addition, any one or more features disclosed herein may be implemented in hardware, software, firmware, and combinations thereof, including discrete and integrated circuit logic, application-specific integrated circuit (ASIC) logic, and microcontrollers, and may be implemented as part of a domain-specific integrated circuit package or a combination of integrated circuit packages. The term software, as used herein, refers to a computer program product including a computer-readable medium having computer program logic stored therein to cause a computer system to perform one or more features and/or combinations of features disclosed herein.
While certain features set forth herein have been described with reference to various implementations, this description is not intended to be construed in a limiting sense. Hence, various modifications of the implementations described herein, as well as other implementations, which are apparent to persons skilled in the art to which the present disclosure pertains, are deemed to lie within the spirit and scope of the present disclosure.
Claims (19)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/189,663 | 2011-07-25 | ||
| US13/189,663 US20130027416A1 (en) | 2011-07-25 | 2011-07-25 | Gather method and apparatus for media processing accelerators |
| PCT/US2012/047879 WO2013016295A1 (en) | 2011-07-25 | 2012-07-23 | Gather method and apparatus for media processing accelerators |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN103718244A | 2014-04-09 |
| CN103718244B | 2016-06-01 |
Family
ID=47596853
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201280036339.6A Expired - Fee Related CN103718244B (en) | 2011-07-25 | 2012-07-23 | Acquisition method and device for media processing accelerator |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20130027416A1 (en) |
| KR (1) | KR101625418B1 (en) |
| CN (1) | CN103718244B (en) |
| WO (1) | WO2013016295A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107430760A (en) * | 2015-04-23 | 2017-12-01 | 谷歌公司 | Two-dimensional shift array for image processor |
Families Citing this family (21)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP5692780B2 (en) * | 2010-10-05 | 2015-04-01 | 日本電気株式会社 | Multi-core type error correction processing system and error correction processing device |
| US8707123B2 (en) * | 2011-12-30 | 2014-04-22 | Lsi Corporation | Variable barrel shifter |
| CN104205042B (en) | 2012-03-30 | 2019-01-08 | 英特尔公司 | Context switching mechanism for processing cores with general purpose CPU cores and tightly coupled accelerators |
| US20150228106A1 (en) * | 2014-02-13 | 2015-08-13 | Vixs Systems Inc. | Low latency video texture mapping via tight integration of codec engine with 3d graphics engine |
| US9749548B2 (en) | 2015-01-22 | 2017-08-29 | Google Inc. | Virtual linebuffers for image signal processors |
| US10298713B2 (en) * | 2015-03-30 | 2019-05-21 | Huawei Technologies Co., Ltd. | Distributed content discovery for in-network caching |
| US9772852B2 (en) | 2015-04-23 | 2017-09-26 | Google Inc. | Energy efficient processor core architecture for image processor |
| US10095479B2 (en) | 2015-04-23 | 2018-10-09 | Google Llc | Virtual image processor instruction set architecture (ISA) and memory model and exemplary target hardware having a two-dimensional shift array structure |
| US9965824B2 (en) | 2015-04-23 | 2018-05-08 | Google Llc | Architecture for high performance, power efficient, programmable image processing |
| US9756268B2 (en) | 2015-04-23 | 2017-09-05 | Google Inc. | Line buffer unit for image processor |
| US9785423B2 (en) | 2015-04-23 | 2017-10-10 | Google Inc. | Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure |
| US10291813B2 (en) | 2015-04-23 | 2019-05-14 | Google Llc | Sheet generator for image processor |
| US9830150B2 (en) | 2015-12-04 | 2017-11-28 | Google Llc | Multi-functional execution lane for image processor |
| US10313641B2 (en) | 2015-12-04 | 2019-06-04 | Google Llc | Shift register with reduced wiring complexity |
| US10387988B2 (en) | 2016-02-26 | 2019-08-20 | Google Llc | Compiler techniques for mapping program code to a high performance, power efficient, programmable image processing hardware platform |
| US10204396B2 (en) | 2016-02-26 | 2019-02-12 | Google Llc | Compiler managed memory for image processor |
| US10380969B2 (en) | 2016-02-28 | 2019-08-13 | Google Llc | Macro I/O unit for image processor |
| US20180005059A1 (en) | 2016-07-01 | 2018-01-04 | Google Inc. | Statistics Operations On Two Dimensional Image Processor |
| US20180007302A1 (en) | 2016-07-01 | 2018-01-04 | Google Inc. | Block Operations For An Image Processor Having A Two-Dimensional Execution Lane Array and A Two-Dimensional Shift Register |
| US20180005346A1 (en) | 2016-07-01 | 2018-01-04 | Google Inc. | Core Processes For Block Operations On An Image Processor Having A Two-Dimensional Execution Lane Array and A Two-Dimensional Shift Register |
| US10546211B2 (en) | 2016-07-01 | 2020-01-28 | Google Llc | Convolutional neural network on programmable two dimensional image processor |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4797852A (en) * | 1986-02-03 | 1989-01-10 | Intel Corporation | Block shifter for graphics processor |
| US5875470A (en) * | 1995-09-28 | 1999-02-23 | International Business Machines Corporation | Multi-port multiple-simultaneous-access DRAM chip |
| US6061779A (en) * | 1998-01-16 | 2000-05-09 | Analog Devices, Inc. | Digital signal processor having data alignment buffer for performing unaligned data accesses |
| US6144356A (en) * | 1997-11-14 | 2000-11-07 | Aurora Systems, Inc. | System and method for data planarization |
Family Cites Families (134)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US3893088A (en) * | 1971-07-19 | 1975-07-01 | Texas Instruments Inc | Random access memory shift register system |
| JPS5019312A (en) * | 1973-06-21 | 1975-02-28 | ||
| US3944990A (en) * | 1974-12-06 | 1976-03-16 | Intel Corporation | Semiconductor memory employing charge-coupled shift registers with multiplexed refresh amplifiers |
| US3967251A (en) * | 1975-04-17 | 1976-06-29 | Xerox Corporation | User variable computer memory module |
| US4574345A (en) * | 1981-04-01 | 1986-03-04 | Advanced Parallel Systems, Inc. | Multiprocessor computer system utilizing a tapped delay line instruction bus |
| US4435792A (en) * | 1982-06-30 | 1984-03-06 | Sun Microsystems, Inc. | Raster memory manipulation apparatus |
| US4516238A (en) * | 1983-03-28 | 1985-05-07 | At&T Bell Laboratories | Self-routing switching network |
| US4720831A (en) * | 1985-12-02 | 1988-01-19 | Advanced Micro Devices, Inc. | CRC calculation machine with concurrent preset and CRC calculation function |
| DE3804938C2 (en) * | 1987-02-18 | 1994-07-28 | Canon Kk | Image processing device |
| US4829585A (en) * | 1987-05-04 | 1989-05-09 | Polaroid Corporation | Electronic image processing circuit |
| US5029105A (en) * | 1987-08-18 | 1991-07-02 | Hewlett-Packard | Programmable pipeline for formatting RGB pixel data into fields of selected size |
| US4958302A (en) * | 1987-08-18 | 1990-09-18 | Hewlett-Packard Company | Graphics frame buffer with pixel serializing group rotator |
| US5146592A (en) * | 1987-09-14 | 1992-09-08 | Visual Information Technologies, Inc. | High speed image processing computer with overlapping windows-div |
| US5270963A (en) * | 1988-08-10 | 1993-12-14 | Synaptics, Incorporated | Method and apparatus for performing neighborhood operations on a processing plane |
| JP2700903B2 (en) * | 1988-09-30 | 1998-01-21 | シャープ株式会社 | Liquid crystal display |
| JP2666411B2 (en) * | 1988-10-04 | 1997-10-22 | 三菱電機株式会社 | Integrated circuit device for orthogonal transformation of two-dimensional discrete data |
| GB2223918B (en) * | 1988-10-14 | 1993-05-19 | Sun Microsystems Inc | Method and apparatus for optimizing selected raster operations |
| US4958146A (en) * | 1988-10-14 | 1990-09-18 | Sun Microsystems, Inc. | Multiplexor implementation for raster operations including foreground and background colors |
| US5313613A (en) * | 1988-12-30 | 1994-05-17 | International Business Machines Corporation | Execution of storage-immediate and storage-storage instructions within cache buffer storage |
| US5416496A (en) * | 1989-08-22 | 1995-05-16 | Wood; Lawson A. | Ferroelectric liquid crystal display apparatus and method |
| US5056044A (en) * | 1989-12-21 | 1991-10-08 | Hewlett-Packard Company | Graphics frame buffer with programmable tile size |
| US5313624A (en) * | 1991-05-14 | 1994-05-17 | Next Computer, Inc. | DRAM multiplexer |
| US5254991A (en) * | 1991-07-30 | 1993-10-19 | Lsi Logic Corporation | Method and apparatus for decoding Huffman codes |
| DE4227733A1 (en) * | 1991-08-30 | 1993-03-04 | Allen Bradley Co | Configurable cache memory for data processing of video information - receives data sub-divided into groups controlled in selection process |
| US5392391A (en) * | 1991-10-18 | 1995-02-21 | Lsi Logic Corporation | High performance graphics applications controller |
| JP2757671B2 (en) * | 1992-04-13 | 1998-05-25 | 日本電気株式会社 | Priority encoder and floating point adder / subtracter |
| US5491702A (en) * | 1992-07-22 | 1996-02-13 | Silicon Graphics, Inc. | Apparatus for detecting any single bit error, detecting any two bit error, and detecting any three or four bit error in a group of four bits for a 25- or 64-bit data word |
| US5574672A (en) * | 1992-09-25 | 1996-11-12 | Cyrix Corporation | Combination multiplier/shifter |
| US5572655A (en) * | 1993-01-12 | 1996-11-05 | Lsi Logic Corporation | High-performance integrated bit-mapped graphics controller |
| US5821918A (en) * | 1993-07-29 | 1998-10-13 | S3 Incorporated | Video processing apparatus, systems and methods |
| EP0644553B1 (en) * | 1993-09-20 | 2000-07-12 | Codex Corporation | Circuit and method of interconnecting content addressable memory |
| US5509129A (en) * | 1993-11-30 | 1996-04-16 | Guttag; Karl M. | Long instruction word controlling plural independent processor operations |
| US5487022A (en) * | 1994-03-08 | 1996-01-23 | Texas Instruments Incorporated | Normalization method for floating point numbers |
| US5574880A (en) * | 1994-03-11 | 1996-11-12 | Intel Corporation | Mechanism for performing wrap-around reads during split-wordline reads |
| TW304254B (en) * | 1994-07-08 | 1997-05-01 | Hitachi Ltd | |
| DE69635066T2 (en) * | 1995-06-06 | 2006-07-20 | Hewlett-Packard Development Co., L.P., Houston | Interrupt scheme for updating a local store |
| JPH0916470A (en) * | 1995-07-03 | 1997-01-17 | Mitsubishi Electric Corp | Semiconductor storage device |
| US7301541B2 (en) * | 1995-08-16 | 2007-11-27 | Microunity Systems Engineering, Inc. | Programmable processor and method with wide operations |
| US6023441A (en) * | 1995-08-30 | 2000-02-08 | Intel Corporation | Method and apparatus for selectively enabling individual sets of registers in a row of a register array |
| TW389909B (en) * | 1995-09-13 | 2000-05-11 | Toshiba Corp | Nonvolatile semiconductor memory device and its usage |
| US5954811A (en) * | 1996-01-25 | 1999-09-21 | Analog Devices, Inc. | Digital signal processor architecture |
| US5941980A (en) * | 1996-08-05 | 1999-08-24 | Industrial Technology Research Institute | Apparatus and method for parallel decoding of variable-length instructions in a superscalar pipelined data processing system |
| IT1284976B1 (en) * | 1996-10-17 | 1998-05-28 | Sgs Thomson Microelectronics | METHOD FOR THE IDENTIFICATION OF SIGN STRIPES OF ROAD LANES |
| US5931940A (en) * | 1997-01-23 | 1999-08-03 | Unisys Corporation | Testing and string instructions for data stored on memory byte boundaries in a word oriented machine |
| US6246396B1 (en) * | 1997-04-30 | 2001-06-12 | Canon Kabushiki Kaisha | Cached color conversion method and apparatus |
| US6108101A (en) * | 1997-05-15 | 2000-08-22 | Canon Kabushiki Kaisha | Technique for printing with different printer heads |
| US5930167A (en) * | 1997-07-30 | 1999-07-27 | Sandisk Corporation | Multi-state non-volatile flash memory capable of being its own two state write cache |
| US6157210A (en) * | 1997-10-16 | 2000-12-05 | Altera Corporation | Programmable logic device with circuitry for observing programmable logic circuit signals and for preloading programmable logic circuits |
| US6208772B1 (en) * | 1997-10-17 | 2001-03-27 | Acuity Imaging, Llc | Data processing system for logically adjacent data samples such as image data in a machine vision system |
| KR100253366B1 (en) * | 1997-12-03 | 2000-04-15 | 김영환 | Variable Length Code Decoder for MPEG |
| US6020934A (en) * | 1998-03-23 | 2000-02-01 | International Business Machines Corporation | Motion estimation architecture for area and power reduction |
| US6173393B1 (en) * | 1998-03-31 | 2001-01-09 | Intel Corporation | System for writing select non-contiguous bytes of data with single instruction having operand identifying byte mask corresponding to respective blocks of packed data |
| US6476807B1 (en) * | 1998-08-20 | 2002-11-05 | Apple Computer, Inc. | Method and apparatus for performing conservative hidden surface removal in a graphics processor with deferred shading |
| JP2000182390A (en) * | 1998-12-11 | 2000-06-30 | Mitsubishi Electric Corp | Semiconductor storage device |
| US6452603B1 (en) * | 1998-12-23 | 2002-09-17 | Nvidia Us Investment Company | Circuit and method for trilinear filtering using texels from only one level of detail |
| JP3307360B2 (en) * | 1999-03-10 | 2002-07-24 | 日本電気株式会社 | Semiconductor integrated circuit device |
| JP4489305B2 (en) * | 1999-03-16 | 2010-06-23 | 浜松ホトニクス株式会社 | High-speed visual sensor device |
| US6694423B1 (en) * | 1999-05-26 | 2004-02-17 | Infineon Technologies North America Corp. | Prefetch streaming buffer |
| US6552710B1 (en) * | 1999-05-26 | 2003-04-22 | Nec Electronics Corporation | Driver unit for driving an active matrix LCD device in a dot reversible driving scheme |
| TW523730B (en) * | 1999-07-12 | 2003-03-11 | Semiconductor Energy Lab | Digital driver and display device |
| US6425044B1 (en) * | 1999-07-13 | 2002-07-23 | Micron Technology, Inc. | Apparatus for providing fast memory decode using a bank conflict table |
| KR100357126B1 (en) * | 1999-07-30 | 2002-10-18 | 엘지전자 주식회사 | Generation Apparatus for memory address and Wireless telephone using the same |
| KR100563826B1 (en) * | 1999-08-21 | 2006-04-17 | 엘지.필립스 엘시디 주식회사 | Data driving circuit of liquid crystal display |
| US6477635B1 (en) * | 1999-11-08 | 2002-11-05 | International Business Machines Corporation | Data processing system including load/store unit having a real address tag array and method for correcting effective address aliasing |
| US6654872B1 (en) * | 2000-01-27 | 2003-11-25 | Ati International Srl | Variable length instruction alignment device and method |
| US6578153B1 (en) * | 2000-03-16 | 2003-06-10 | Fujitsu Network Communications, Inc. | System and method for communications link calibration using a training packet |
| US7088322B2 (en) * | 2000-05-12 | 2006-08-08 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor device |
| US6778548B1 (en) * | 2000-06-26 | 2004-08-17 | Intel Corporation | Device to receive, buffer, and transmit packets of data in a packet switching network |
| KR100467990B1 (en) * | 2000-09-05 | 2005-01-24 | 가부시끼가이샤 도시바 | Display device |
| AU2002218489A1 (en) * | 2000-11-29 | 2002-06-11 | Nikon Corporation | Image processing method, image processing device, detection method, detection device, exposure method and exposure system |
| US20020105522A1 (en) * | 2000-12-12 | 2002-08-08 | Kolluru Mahadev S. | Embedded memory architecture for video applications |
| US6502170B2 (en) * | 2000-12-15 | 2002-12-31 | Intel Corporation | Memory-to-memory compare/exchange instructions to support non-blocking synchronization schemes |
| US20050280623A1 (en) * | 2000-12-18 | 2005-12-22 | Renesas Technology Corp. | Display control device and mobile electronic apparatus |
| US6928516B2 (en) * | 2000-12-22 | 2005-08-09 | Texas Instruments Incorporated | Image data processing system and method with image data organization into tile cache memory |
| US7757066B2 (en) * | 2000-12-29 | 2010-07-13 | Stmicroelectronics, Inc. | System and method for executing variable latency load operations in a date processor |
| US7051153B1 (en) * | 2001-05-06 | 2006-05-23 | Altera Corporation | Memory array operating as a shift register |
| US20020173860A1 (en) * | 2001-05-15 | 2002-11-21 | Bruce Charles W. | Integrated control system |
| US6778179B2 (en) * | 2001-05-18 | 2004-08-17 | Sun Microsystems, Inc. | External dirty tag bits for 3D-RAM SRAM |
| US6603683B2 (en) * | 2001-06-25 | 2003-08-05 | International Business Machines Corporation | Decoding scheme for a stacked bank architecture |
| JP4074502B2 (en) * | 2001-12-12 | 2008-04-09 | セイコーエプソン株式会社 | Power supply circuit for display device, display device and electronic device |
| US7114058B1 (en) * | 2001-12-31 | 2006-09-26 | Apple Computer, Inc. | Method and apparatus for forming and dispatching instruction groups based on priority comparisons |
| US6664807B1 (en) * | 2002-01-22 | 2003-12-16 | Xilinx, Inc. | Repeater for buffering a signal on a long data line of a programmable logic device |
| JP4024557B2 (en) * | 2002-02-28 | 2007-12-19 | 株式会社半導体エネルギー研究所 | Light emitting device, electronic equipment |
| JP2004177433A (en) * | 2002-11-22 | 2004-06-24 | Sharp Corp | Shift register block, data signal line driving circuit and display device including the same |
| US7093084B1 (en) * | 2002-12-03 | 2006-08-15 | Altera Corporation | Memory implementations of shift registers |
| US7162684B2 (en) * | 2003-01-27 | 2007-01-09 | Texas Instruments Incorporated | Efficient encoder for low-density-parity-check codes |
| US7571287B2 (en) * | 2003-03-13 | 2009-08-04 | Marvell World Trade Ltd. | Multiport memory architecture, devices and systems including the same, and methods of using the same |
| US7275147B2 (en) * | 2003-03-31 | 2007-09-25 | Hitachi, Ltd. | Method and apparatus for data alignment and parsing in SIMD computer architecture |
| US7071908B2 (en) * | 2003-05-20 | 2006-07-04 | Kagutech, Ltd. | Digital backplane |
| US7243172B2 (en) * | 2003-10-14 | 2007-07-10 | Broadcom Corporation | Fragment storage for data alignment and merger |
| GB2411975B (en) * | 2003-12-09 | 2006-10-04 | Advanced Risc Mach Ltd | Data processing apparatus and method for performing arithmetic operations in SIMD data processing |
| US7543142B2 (en) * | 2003-12-19 | 2009-06-02 | Intel Corporation | Method and apparatus for performing an authentication after cipher operation in a network processor |
| EP1555828A1 (en) * | 2004-01-14 | 2005-07-20 | Sony International (Europe) GmbH | Method for pre-processing block based digital data |
| US20050226337A1 (en) * | 2004-03-31 | 2005-10-13 | Mikhail Dorojevets | 2D block processing architecture |
| US7196708B2 (en) * | 2004-03-31 | 2007-03-27 | Sony Corporation | Parallel vector processing |
| JP3706383B1 (en) * | 2004-04-15 | 2005-10-12 | 株式会社ソニー・コンピュータエンタテインメント | Drawing processing apparatus and drawing processing method, information processing apparatus and information processing method |
| US7079156B1 (en) * | 2004-05-14 | 2006-07-18 | Nvidia Corporation | Method and system for implementing multiple high precision and low precision interpolators for a graphics pipeline |
| JP2006127460A (en) * | 2004-06-09 | 2006-05-18 | Renesas Technology Corp | Semiconductor device, semiconductor signal processing apparatus and crossbar switch |
| KR20050123487A (en) * | 2004-06-25 | 2005-12-29 | 엘지.필립스 엘시디 주식회사 | The liquid crystal display device and the method for driving the same |
| US9557994B2 (en) * | 2004-07-13 | 2017-01-31 | Arm Limited | Data processing apparatus and method for performing N-way interleaving and de-interleaving operations where N is an odd plural number |
| US7986733B2 (en) * | 2004-07-30 | 2011-07-26 | Broadcom Corporation | Tertiary content addressable memory based motion estimator |
| US7546328B2 (en) * | 2004-08-31 | 2009-06-09 | Wisconsin Alumni Research Foundation | Decimal floating-point adder |
| US7394636B2 (en) * | 2005-05-25 | 2008-07-01 | International Business Machines Corporation | Slave mode thermal control with throttling and shutdown |
| US8032688B2 (en) * | 2005-06-30 | 2011-10-04 | Intel Corporation | Micro-tile memory interfaces |
| US8253751B2 (en) * | 2005-06-30 | 2012-08-28 | Intel Corporation | Memory controller interface for micro-tiled memory access |
| US7375550B1 (en) * | 2005-07-15 | 2008-05-20 | Tabula, Inc. | Configurable IC with packet switch configuration network |
| US7827345B2 (en) * | 2005-08-04 | 2010-11-02 | Joel Henry Hinrichs | Serially interfaced random access memory |
| JP4652409B2 (en) * | 2005-08-25 | 2011-03-16 | スパンション エルエルシー | Storage device and storage device control method |
| US7565027B2 (en) * | 2005-10-07 | 2009-07-21 | Xerox Corporation | Countdown stamp error diffusion |
| US8593474B2 (en) * | 2005-12-30 | 2013-11-26 | Intel Corporation | Method and system for symmetric allocation for a shared L2 mapping cache |
| CN101449256B (en) * | 2006-04-12 | 2013-12-25 | 索夫特机械公司 | Apparatus and method for processing instruction matrix specifying parallel and dependent operations |
| JP2008047273A (en) * | 2006-07-20 | 2008-02-28 | Toshiba Corp | Semiconductor memory device and control method thereof |
| US7574562B2 (en) * | 2006-07-21 | 2009-08-11 | International Business Machines Corporation | Latency-aware thread scheduling in non-uniform cache architecture systems |
| KR100817056B1 (en) * | 2006-08-25 | 2008-03-26 | 삼성전자주식회사 | Branch history length indicator, branch prediction system and branch prediction method |
| US20080151670A1 (en) * | 2006-12-22 | 2008-06-26 | Tomohiro Kawakubo | Memory device, memory controller and memory system |
| US8878860B2 (en) * | 2006-12-28 | 2014-11-04 | Intel Corporation | Accessing memory using multi-tiling |
| US7783860B2 (en) * | 2007-07-31 | 2010-08-24 | International Business Machines Corporation | Load misaligned vector with permute and mask insert |
| US20090172348A1 (en) * | 2007-12-26 | 2009-07-02 | Robert Cavin | Methods, apparatus, and instructions for processing vector data |
| US8295367B2 (en) * | 2008-01-11 | 2012-10-23 | Csr Technology Inc. | Method and apparatus for video signal processing |
| JP4868607B2 (en) * | 2008-01-22 | 2012-02-01 | 株式会社リコー | SIMD type microprocessor |
| US9268746B2 (en) * | 2008-03-07 | 2016-02-23 | St Ericsson Sa | Architecture for vector memory array transposition using a block transposition accelerator |
| WO2009147535A1 (en) * | 2008-06-06 | 2009-12-10 | Tessera Technologies Hungary Kft. | Techniques for reducing noise while preserving contrast in an image |
| US8213735B2 (en) * | 2008-10-10 | 2012-07-03 | Accusoft Corporation | Methods and apparatus for performing image binarization |
| US20100149215A1 (en) * | 2008-12-15 | 2010-06-17 | Personal Web Systems, Inc. | Media Action Script Acceleration Apparatus, System and Method |
| US9189670B2 (en) * | 2009-02-11 | 2015-11-17 | Cognex Corporation | System and method for capturing and detecting symbology features and parameters |
| US8645589B2 (en) * | 2009-08-03 | 2014-02-04 | National Instruments Corporation | Methods for data acquisition systems in real time applications |
| CN101996550A (en) * | 2009-08-06 | 2011-03-30 | 株式会社东芝 | Semiconductor integrated circuit for displaying image |
| JP2011043766A (en) * | 2009-08-24 | 2011-03-03 | Seiko Epson Corp | Conversion circuit, display drive circuit, electro-optical device, and electronic equipment |
| US8832336B2 (en) * | 2010-01-30 | 2014-09-09 | Mosys, Inc. | Reducing latency in serializer-deserializer links |
| US8458405B2 (en) * | 2010-06-23 | 2013-06-04 | International Business Machines Corporation | Cache bank modeling with variable access and busy times |
| US20110320699A1 (en) * | 2010-06-24 | 2011-12-29 | International Business Machines Corporation | System Refresh in Cache Memory |
| US8331163B2 (en) * | 2010-09-07 | 2012-12-11 | Infineon Technologies Ag | Latch based memory device |
| US8717274B2 (en) * | 2010-10-07 | 2014-05-06 | Au Optronics Corporation | Driving circuit and method for driving a display |
| US20120254589A1 (en) * | 2011-04-01 | 2012-10-04 | Jesus Corbal San Adrian | System, apparatus, and method for aligning registers |

- 2011-07-25: US, application US13/189,663, publication US20130027416A1, status: Abandoned (not active)
- 2012-07-23: KR, application KR1020147002300A, publication KR101625418B1, status: Expired - Fee Related (not active)
- 2012-07-23: CN, application CN201280036339.6A, publication CN103718244B, status: Expired - Fee Related (not active)
- 2012-07-23: WO, application PCT/US2012/047879, publication WO2013016295A1, status: Application Filing (active)
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4797852A (en) * | 1986-02-03 | 1989-01-10 | Intel Corporation | Block shifter for graphics processor |
| US5875470A (en) * | 1995-09-28 | 1999-02-23 | International Business Machines Corporation | Multi-port multiple-simultaneous-access DRAM chip |
| US6144356A (en) * | 1997-11-14 | 2000-11-07 | Aurora Systems, Inc. | System and method for data planarization |
| CN1285944A (en) * | 1997-11-14 | 2001-02-28 | 奥罗拉系统公司 | System and method for data planarization |
| US6061779A (en) * | 1998-01-16 | 2000-05-09 | Analog Devices, Inc. | Digital signal processor having data alignment buffer for performing unaligned data accesses |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107430760A (en) * | 2015-04-23 | 2017-12-01 | 谷歌公司 | Two-dimensional shift array for image processor |
| US11153464B2 (en) | 2015-04-23 | 2021-10-19 | Google Llc | Two dimensional shift array for image processor |
Also Published As
| Publication number | Publication date |
|---|---|
| KR101625418B1 (en) | 2016-05-30 |
| CN103718244B (en) | 2016-06-01 |
| KR20140043455A (en) | 2014-04-09 |
| WO2013016295A1 (en) | 2013-01-31 |
| US20130027416A1 (en) | 2013-01-31 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN103718244B (en) | | Acquisition method and device for media processing accelerator |
| CN109388595B (en) | | High-bandwidth memory systems and logic dies |
| EP3286724B1 (en) | | Two dimensional shift array for image processor |
| CN107438860A (en) | | Architecture for high-performance power-efficient programmable image processing |
| CN102648450A (en) | | Hardware for parallel command list generation |
| CN108388537A (en) | | A kind of convolutional neural networks accelerator and method |
| CN110574067B (en) | | Image Processor I/O Unit |
| CN107750366A (en) | | Hardware accelerator for histogram of gradients |
| CN107438861A (en) | | Data slice generator for image generators |
| CN111353575A (en) | | Tiled format for convolutional neural networks |
| KR20130060114A (en) | | Inline image rotation |
| CN110192220B (en) | | Program code transformation to improve image processor runtime efficiency |
| US20150026442A1 (en) | | System, method, and computer program product for managing out-of-order execution of program instructions |
| WO2022001550A1 (en) | | Address generation method, related device and storage medium |
| EP3485384A1 (en) | | Memory request arbitration |
| US20170251184A1 (en) | | Shift register with reduced wiring complexity |
| CN111291240B (en) | | Method for processing data and data processing device |
| CN105243399A (en) | | Method of realizing image convolution and device, and method of realizing caching and device |
| CN102438149A (en) | | Realization method of AVS (Audio Video Standard) inverse transformation based on reconfiguration technology |
| US20140372703A1 (en) | | System, method, and computer program product for warming a cache for a task launch |
| CN109074289B (en) | | Data sharing between subgroups |
| US20120327260A1 (en) | | Parallel operation histogramming device and microcomputer |
| TWI508023B (en) | | Parallel and vectored gilbert-johnson-keerthi graphics processing |
| EP3474224B1 (en) | | Graphics processing method and device |
| CN107305486A (en) | | A neural network maxout layer computing device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160601; Termination date: 20190723 |
| | CF01 | Termination of patent right due to non-payment of annual fee | |