WO2002067137A1 - Vector and scalar signal processing - Google Patents

Publication number
WO2002067137A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
signal processing
processor
signal processor
controller
Prior art date
Application number
PCT/US2000/035385
Other languages
French (fr)
Inventor
Edward R. Prado
Edward E. Ille
Dean W. Brenner
John P. Prewitt
Original Assignee
Honeywell International Inc.
Priority date
Filing date
Publication date
Application filed by Honeywell International Inc.
Priority to PCT/US2000/035385
Publication of WO2002067137A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/76: Architectures of general purpose stored program computers
    • G06F15/80: Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8053: Vector processors
    • G06F15/8061: Details on data memory access

Landscapes

  • Engineering & Computer Science
  • Computer Hardware Design
  • Theoretical Computer Science
  • Computing Systems
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Complex Calculations

Abstract

Data, such as an image, is supplied to a signal processor that contains a vector signal processor, a scalar processor, and a controller that controls the flow of results between them. Vector processing scripts are stored in a memory unit, such as SRAM, associated with the vector processor, and programs used by the scalar processor are stored in its own associated memory unit. The vector processor performs vector operations, and the controller sends the results to the scalar processor. While the scalar processor operates on those results, the vector processor performs operations on new data.

Description

VECTOR AND SCALAR SIGNAL PROCESSING
Field of the Invention
This invention relates to complex signal processing that uses algorithms combining vector and scalar constructs.
Background
Vector operations, such as Fast Fourier Transforms (FFT) and pixel element manipulations, are array oriented: each data set, i.e., vector, is processed as a vector array. Scalar operations, on the other hand, are non-vector operations, which comprise flow control or decision-making operations. Signal processing algorithms can be partitioned into a series of both vector and scalar operations. Algorithm implementation using hardware optimized for each type of operation is the most efficient method of execution. A vector processor can be eight to thirty-seven times more efficient in vector operations than a scalar processor. Vector processors could be implemented using common devices such as Field Programmable Gate Arrays (FPGA) or Application Specific Integrated Circuits (ASIC). Scalar operations can be handled using general purpose digital signal processors (GP-DSP). GP-DSPs can be used to perform both vector and scalar operations. Due to the GP-DSP's inefficiency at vector operations, it has sometimes been necessary to use multiple GP-DSPs to obtain the required performance. Because DSPs are flow control intensive, multiple DSPs require complex software overhead to manage the operation of each of the DSPs. On the other hand, pure use of a vector processor for these hybrid applications is not necessarily any more practicable.
Summary of the Invention
An object of the present invention is to provide a more efficient way to perform signal processing by providing an integrated hybrid vector/scalar processor.
According to the present invention, a signal processor comprises a vector processor and a scalar processor whose operation is managed by a controller. The controller uses stored program scripts (algorithms) to schedule the operations of the vector processor and scalar processor. The vector and scalar functions (algorithm components) are individually stored programs associated with each processor.
According to the invention, each processor performs its respective operations as commanded by the controller. The order in which these operations are performed is dependent upon the algorithm being implemented. The controller is responsible for ensuring correct algorithm execution and data flow between the two processors.
Among the features of the present invention is that it can accommodate many DSPs by redefining the PROM-based algorithms contained on the controller and scalar processor. This is especially useful for so-called "space applications", which often require on-orbit reconfiguration of mission algorithms.
Another feature of the present invention is that it is especially useful for signal processing in such applications as scatterometer/radar, image compression and hyperspectral imaging.
Another feature of the present invention is that it can be constructed using currently available vector and scalar processors.
Other objects, benefits and features of the invention will be apparent from the following discussion of one or more embodiments.
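The script-driven scheduling summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the program names (`scale2`, `square`, `clip_at_10`, `drop_zeros`), the `prom` dictionary and the `run_script` function are all hypothetical stand-ins for the stored programs, the controller PROM and the controller itself.

```python
from typing import Callable, Dict, List, Tuple

Program = Callable[[List[float]], List[float]]

# Hypothetical stored programs associated with each processor.
VECTOR_PROGRAMS: Dict[str, Program] = {
    "scale2": lambda v: [2.0 * x for x in v],
    "square": lambda v: [x * x for x in v],
}
SCALAR_PROGRAMS: Dict[str, Program] = {
    "clip_at_10": lambda v: [x if x <= 10.0 else 10.0 for x in v],
    "drop_zeros": lambda v: [x for x in v if x != 0.0],
}

# A script is an ordered list of (processor, program name) steps.
Script = List[Tuple[str, str]]


def run_script(script: Script, data: List[float]) -> List[float]:
    """Controller stand-in: run each scripted step on the named processor."""
    for processor, program in script:
        if processor == "vector":
            data = VECTOR_PROGRAMS[program](data)
        elif processor == "scalar":
            data = SCALAR_PROGRAMS[program](data)
        else:
            raise ValueError(f"unknown processor: {processor}")
    return data


# Stand-in for the controller PROM contents (assumed layout).
prom: Dict[str, Script] = {
    "algo": [("vector", "scale2"), ("scalar", "clip_at_10")],
}
result_a = run_script(prom["algo"], [1.0, 4.0, 7.0])

# Reconfiguration: only the stored script changes, not the processors.
prom["algo"] = [("vector", "square"), ("scalar", "drop_zeros")]
result_b = run_script(prom["algo"], [0.0, 2.0, 3.0])
```

The point of the sketch is that the order of operations lives in the stored script, so a different algorithm is obtained by rewriting the script rather than the processors.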
Brief Description of the Drawing
Fig. 1 is a block diagram of a system embodying the present invention.
Fig. 2 illustrates a JPEG vector/scalar data compression process that can be performed with the present invention.
Figs. 3A-3B are a flow chart showing the operations of the two processors in performing the JPEG data compression shown in Fig. 2.
Detailed Description
The system shown in Fig. 1 contains a vector processor 10 and a GP-DSP (general purpose digital signal processor) 12, both controlled by a programmed controller 14, a dedicated processor, such as an ASIC (application specific integrated circuit) or FPGA (field programmable gate array). A PROM 24 contains the instructions for the controller 14 in the form of controller instructions generated from scripts. The vector processor 10 also has its own memory devices 18, in this case SRAMs, as does the DSP 12. In this particular example, an image source 20 provides data to a buffer 22 with an output connected to the input of the controller 14. The PROM 24, as will be explained, is also used to schedule the operations of the vector processor 10 and the scalar processor 12. The result of the vector and scalar processing is an output from the controller 14 supplied to an output buffer 26. In this specific example the output is a "compressed image".
The Vector Processor
Vector processors are optimized to implement a set of high-level instructions that support pipeline-oriented operations, such as the compression shown in Fig. 2, the input in Fig. 1, or FFT functions, as discussed previously. The Sharp brand 9124 processor, manufactured by Sharp Electronics, and the DSP24 brand processor, manufactured by DSP Architectures, are commercially available examples of these devices and can be used in this invention. These particular vector processing devices support several signal processing functions: time domain processing that includes Fast Fourier Transforms (FFT), Finite Impulse Response (FIR) filtering, vector operations, logical array processing (real and complex) and such tasks as convolution and digital modulation/demodulation. Vector processors typically have multiple complex bidirectional data ports that are highly flexible (any port to any port routing) and capable of the simultaneous reading and writing of data. Vector processors can typically be cascaded together to support complex vector operations and significantly increase performance. For example, a single vector processor operating at 50 MHz can perform an FFT operation in 42 μs. If two identical VPs are cascaded together, then this same operation can be performed in only 21 μs. A vector processor is "pass-based": a single instruction is implemented as one digital signal processing function on the entire vector array, which is passed through the chip from one port to another, instead of a new instruction being read for each cycle. A GP-DSP, on the other hand, is required to fetch and decode instructions for every data entry within the vector array that is inputted. Vector operations use the same instruction for every piece of data, so fetching it repeatedly presents significant processing overhead if implemented in a GP-DSP, as compared to performing the same operation in a vector processor.
For example, to do the vector addition [a1, a2, a3, a4] + [b1, b2, b3, b4] = [a1+b1, a2+b2, a3+b3, a4+b4], a scalar processor would use at least four instructions. This would require both instruction and data fetches, using several clock cycles per addition. A vector processor will use one instruction, in this case "add", and perform this operation every clock cycle while vector data is read in. This operation continues until the vector processor is scheduled to stop by the controller.
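The contrast in instruction fetches for the four-element addition can be made concrete with a small Python model. It is a counting sketch only: the functions `scalar_add` and `vector_add` and the fetch counters are hypothetical, and the model counts one fetch per element for the scalar case, which is the lower bound ("at least four instructions") stated in the text.

```python
from typing import List, Tuple


def scalar_add(a: List[float], b: List[float]) -> Tuple[List[float], int]:
    """Element-by-element add; one instruction fetch is counted per element."""
    out: List[float] = []
    fetches = 0
    for x, y in zip(a, b):
        fetches += 1  # fetch and decode an "add" for this element
        out.append(x + y)
    return out, fetches


def vector_add(a: List[float], b: List[float]) -> Tuple[List[float], int]:
    """Pass-based add; the "add" instruction is fetched once for the whole array."""
    fetches = 1
    out = [x + y for x, y in zip(a, b)]
    return out, fetches


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
scalar_result, scalar_fetches = scalar_add(a, b)
vector_result, vector_fetches = vector_add(a, b)
```

Both functions produce the same sums; only the fetch count differs, and the gap grows linearly with the vector length in this model.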
The GP-DSP
The GP-DSP 12 processor provides data flow control to the output buffer and handles the scalar portion of the signal processing. In that respect, it should be noted in Fig. 2 that block 26 illustrates vector computations performed on the inputted pixel block 27. After the computation at block 26 is performed, the scalar process is carried out at block 28 by the scalar processor 12. This JPEG compression algorithm is well-known by those skilled in the art to which this invention relates and is illustrative of the type of complex, hybrid processing sometimes needed. Briefly, the steps in Fig. 2 involve determining the frequency components of an 8 by 8 pixel block by performing a 2-D Discrete Cosine Transform at step 26a to produce output data that is quantized (or decimated) in step 26b, which compresses the data according to a desired compression ratio parameter. The output from the decimation step 26b is subjected to binary encoding at step 28a, producing a binary output stream, and is pre-formatted to an Environmental Data Record (EDR) per CCSDS at step 28b, producing a serial data output that is routed via bus 12 from the scalar processor through the controller 14 to the output buffer 26 as the compressed pixel data. Focusing on the controller 14, the device provides the interface between the vector processor 10 and the scalar processor 12, allowing both of the processors 10, 12 to operate independently. This approach allows the scalar processor 12 to operate on the first data set, inputted at time (t1), while the vector processor operates on the next data set, inputted at time (t2).
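The split between the vector steps (26a, 26b) and the scalar step (28a) can be sketched in Python as follows. Several simplifications are assumptions and not taken from the source: the DCT is the direct, unoptimized textbook DCT-II; a single uniform step size stands in for the compression ratio parameter; and a plain zero-run-length code over the row-major coefficients stands in for the binary encoding. The function names are hypothetical, and the EDR/CCSDS formatting of step 28b is not modeled.

```python
import math
from typing import List, Tuple

N = 8  # block size used in the text (8 by 8 pixels)


def dct_2d(block: List[List[float]]) -> List[List[float]]:
    """Vector stage, step 26a: direct 2-D DCT-II of an N x N block."""
    def scale(k: int) -> float:
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2.0 / N) * scale(u) * scale(v) * total
    return out


def quantize(coeffs: List[List[float]], step: float) -> List[List[int]]:
    """Vector stage, step 26b: the same divide-and-round applied to every element."""
    return [[round(c / step) for c in row] for row in coeffs]


def run_length_encode(quantized: List[List[int]]) -> List[Tuple[int, int]]:
    """Scalar stage, step 28a stand-in: data-dependent branching per coefficient."""
    pairs: List[Tuple[int, int]] = []
    run = 0
    for row in quantized:
        for q in row:
            if q == 0:
                run += 1          # extend the current run of zeros
            else:
                pairs.append((run, q))
                run = 0
    pairs.append((0, 0))          # assumed end-of-block marker
    return pairs


flat_block = [[8.0] * N for _ in range(N)]       # constant test block
quantized = quantize(dct_2d(flat_block), 16.0)   # assumed step size of 16
encoded = run_length_encode(quantized)
```

The first two functions apply one fixed operation across the whole array, which is the kind of work the text assigns to the vector processor; the encoder branches on each value, which is the decision-based work assigned to the GP-DSP.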
One skilled in this art will understand that high-performance signal processing can be achieved by defining algorithms such as the compression algorithm shown in Fig. 2 into efficiently mapped vector and scalar constructs. This will be illustrated in the context of Figs. 3A, 3B, which illustrate the image compression algorithm in Fig. 2 to show that once an algorithm has been defined, vector and scalar operations can be partitioned between the vector and scalar processors 10, 12. A closely coupled integrated development standard for this allocation of processes should allow the GP-DSP to perform scalar processing while the vector processor is operating on the next data set. In contrast, those familiar with this technology will appreciate that past approaches have required hand-coded vector constructs which had to be manually synchronized with scalar processing constructs. Such an approach mandates multiple development environments, resulting in reduced processing performance for each particular processor architecture. The system shown in Fig. 1 allows for scripts to be executed from SRAMs 18 as well as the PROM 24 to simultaneously carry out vector and scalar processing constructs with a controller scheduling all processing events of each processing element. Referring to Figs. 3A, 3B, as noted before, these illustrate the application of the invention on one example of a known process requiring vector and scalar signal processing, JPEG image compression. In the first step S1.1, the first pixel block is applied to the buffer 22. Once the buffer is full, as determined in step S1.2 by the controller 14, at step S1.3, the controller initializes the vector processor 10 and its MMUs 10a to run the first vector computation, block 26, beginning at step S1.4. To run this process, the vector processor 10 is set up by the controller 14 using instructions which have been generated using scripts and stored in PROM 24.
Once setup is complete, the controller 14 cues the vector processor to start execution. At step S1.5, the controller 14 queries the vector processor 10 to find out if step S1.4 is done for the entire image block of "n" bits. If so, the result from block 26a is stored in one of the vector processor 10 SRAMs 18, in step S1.6. At step S1.7, the controller 14 directs the vector processor 10 to perform, at step S1.8, the process of block 26b. The process is performed on the data previously stored in step S1.6. In step S1.9, the controller determines if the decimation step 26b has been performed. An affirmative answer moves the process to step S1.10 in the controller 14, storing the decimation results in one of the vector processor 10 SRAMs. The vector processing steps having been completed and the results stored, at step S1.11, the controller 14 initializes or cues the scalar processor for the blocks 28a, 28b. The scalar operation begins with step S2.1, where the controller 14 ensures that the scalar processor 12 is initialized. At step S2.2, the scalar processor waits until the controller 14 determines that the decimation data is present in a vector processor SRAM. That data is read in step S2.3 by the controller 14 to the scalar processor 12 over a bus 12a, whereupon the scalar processor 12 begins the binary encoding of block 28a by using a decision-based processing scheme. Once the encoding for block 28a is completed, as determined by the controller 14 in step S2.5, the binary stream 28c is stored in the scalar processor's 12 SRAM 18a in step S2.6. The scalar processor 12 notifies the controller at step S2.7 that the encoding is complete for the first pixel block, and at that point, the controller commands the MMU 10a at step S2.8 to empty the SRAM holding the data stored at step S1.10.
At the same time, step S1.1 is repeated, which begins the process to receive the next image block and perform the vector processing while the scalar processor is completing its operations on the earlier image block. Following step S2.8, the scalar processor 12 completes the formatting of block 28b in step S2.9, and in step S2.10, the controller 14 retrieves the formatted data from the scalar processor 12, outputting it to the buffer 26, where the buffer output 26a is the compressed image of the first pixel block 27.
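The hand-off between the two flow charts can be sketched as a loop in which each round pairs the vector stage on a new block with the scalar stage on the previous block's result. This is a simplified model under stated assumptions: Python executes the two stages one after the other, so a "round" only represents work that the hardware would perform concurrently; the `sram` variable stands in for the vector processor SRAM written at step S1.10 and emptied at step S2.8; and `run_pipeline` and its arguments are hypothetical names.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def run_pipeline(
    blocks: Sequence[list],
    vector_stage: Callable[[list], list],
    scalar_stage: Callable[[list], object],
) -> Tuple[List[object], List[Tuple[Optional[int], Optional[int]]]]:
    """Return the outputs plus, per round, (block in vector stage, block in scalar stage)."""
    sram: Optional[list] = None           # result stored at step S1.10
    stored_index: Optional[int] = None    # which block the stored result belongs to
    outputs: List[object] = []
    rounds: List[Tuple[Optional[int], Optional[int]]] = []

    # One extra round drains the last stored result through the scalar stage.
    for index, block in enumerate(list(blocks) + [None]):
        handed_off, handed_index = sram, stored_index   # S2.3: read stored result
        sram, stored_index = None, None                 # S2.8: SRAM emptied

        vector_index: Optional[int] = None
        if block is not None:
            sram = vector_stage(block)                  # S1.4 to S1.10 on the new block
            stored_index = index
            vector_index = index

        if handed_index is not None:
            outputs.append(scalar_stage(handed_off))    # S2.4 to S2.10 on the earlier block

        rounds.append((vector_index, handed_index))
    return outputs, rounds


# Stand-in stages: doubling for the vector work, a sum for the scalar work.
outputs, rounds = run_pipeline(
    [[1, 2], [3, 4], [5, 6]],
    vector_stage=lambda b: [2 * x for x in b],
    scalar_stage=sum,
)
```

In the resulting `rounds` list, every round except the first and last shows both processors occupied, each on a different block.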
It should be apparent from the above that while in this particular example, JPEG image compression, the scalar processor waits for the vector processor, with other signal processes they can operate simultaneously. The partition is a function of the type of algorithm. The controller schedules the start of both vector and scalar processes based on when the next algorithm process can begin and when completion indicators are received from each type of processor.
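The benefit of overlapping the two processors can be estimated with a simple timing model. The sketch below is an idealization and not taken from the source: it assumes every block needs the same vector time and the same scalar time, ignores controller and transfer overhead, and uses the standard two-stage pipeline result that the slower stage sets the pace once the pipeline is full. The function names and the sample timings are hypothetical.

```python
def serial_time(num_blocks: int, t_vector: float, t_scalar: float) -> float:
    """Total time if each block finishes both stages before the next block starts."""
    return num_blocks * (t_vector + t_scalar)


def pipelined_time(num_blocks: int, t_vector: float, t_scalar: float) -> float:
    """Total time if the vector stage works on block k+1 while the scalar stage works on block k."""
    if num_blocks == 0:
        return 0.0
    # Fill the pipeline, then the slower stage paces the remaining blocks.
    return t_vector + (num_blocks - 1) * max(t_vector, t_scalar) + t_scalar


# Assumed timings in arbitrary units: 4 blocks, vector stage 3, scalar stage 2.
t_serial = serial_time(4, 3.0, 2.0)
t_pipelined = pipelined_time(4, 3.0, 2.0)
```

With these assumed numbers the serial schedule takes 20 units and the overlapped schedule takes 14; for a single block the two are equal, since there is nothing to overlap.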
With the benefit of the previous discussion of the invention, one of ordinary skill in the art may be able to modify the invention and its components and functions in whole or in part without departing from the true scope and spirit of the invention.

Claims

We claim:
1. An integrated circuit signal processing system, characterized by: a first signal processor for performing vector signal processing solutions; a first memory dedicated to said first signal processor for storing specific script operations to carry out a plurality of said vector processing solutions; a second signal processor for performing decisional signal processing routines; a second memory dedicated with said second signal processor for storing said decisional signal processing routines; a controller that receives and transmits instructions to said first and second signal processors according to criteria based on data received by said controller, said criteria including causing the first signal processor to perform vector signal processing operations according to said scripts on a first unit of said data to produce a result and transferring to said second signal processor said result and upon said transfer causing said second signal processor to perform said decisional signal processing operations according on said results and to produce an output of said results for said data; and a programmable memory dedicated to the controller for storing said criteria.
2. The integrated signal processing system described in claim 1, further characterized in that said controller causes the first signal processor to perform said vector signal processing on a second unit of data upon said transfer.
3. The integrated signal processing system described in claim 1, further characterized in that said scripts define vector solutions for each of a plurality of data requiring vector and scalar operations that are supplied to the integrated signal processing system.
4. An integrated circuit signal processing system, characterized by: a first signal processor for performing vector signal processing solutions; a first memory dedicated to said first signal processor for storing specific script operations to carry out a plurality of said vector processing solutions; a second signal processor for performing decisional signal processing routines; a second memory dedicated with said second signal processor for storing said decisional signal processing routines; a controller that receives and transmits instructions to said first and second signal processors according to criteria based on data received by said controller, said criteria causing the first signal processor to perform vector signal processing operations according to said scripts on a first unit of said data to produce a result and causing said second signal processor to perform said decisional signal processing operations on a second unit of said data, the first signal processor and the second signal processor producing an output for said first unit and said second unit of data; and a programmable memory dedicated to the controller for storing said criteria.
Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2000/035385 WO2002067137A1 (en) 2001-02-01 2001-02-01 Vector and scalar signal processing

Publications (1)

Publication Number Publication Date
WO2002067137A1 (en) 2002-08-29

Family

ID=21742925

Country Status (1)

Country Link
WO (1) WO2002067137A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5978838A (en) * 1996-08-19 1999-11-02 Samsung Electronics Co., Ltd. Coordination and synchronization of an asymmetric, single-chip, dual multiprocessor

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BHARDWAJ M. ET AL: "The Renaissance-a residue number system based vector co-processor for DSP dominated embedded ASICs", SIGNALS, SYSTEMS & COMPUTERS, 1998. CONFERENCE RECORD OF THE THIRTY-SECOND ASILOMAR CONFERENCE ON PACIFIC GROVE, CA, USA 1-4 NOV. 1998, PISCATAWAY, NJ, USA,IEEE, US, 1 November 1998 (1998-11-01), pages 202 - 207, XP010324340, ISBN: 0-7803-5148-7 *
BRANSTETTER R. ET AL: "ULTRA-RELIABLE DIGITAL AVIONICS (URDA) PROCESSOR ARCHITECTURE", PROCEEDINGS OF THE NATIONAL AEROSPACE AND ELECTRONICS CONFERENCE. (NAECON). DAYTON, MAY 23 - 27, 1994, NEW YORK, IEEE, US, vol. 1, 23 May 1994 (1994-05-23), pages 274 - 280, XP000510775, ISBN: 0-7803-1894-3 *
PRADO E.R. ET AL: "A high performance COTS based vector processor for space", MAPLDCON 99, 28 September 1999 (1999-09-28) - 30 September 1999 (1999-09-30), LAUREL, MARYLAND, USA, pages 1 - 6, XP002207357 *

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CA

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase