
WO2016125202A1 - Data transfer apparatus - Google Patents


Info

Publication number
WO2016125202A1
WO2016125202A1 PCT/JP2015/000482 JP2015000482W WO2016125202A1 WO 2016125202 A1 WO2016125202 A1 WO 2016125202A1 JP 2015000482 W JP2015000482 W JP 2015000482W WO 2016125202 A1 WO2016125202 A1 WO 2016125202A1
Authority
WO
WIPO (PCT)
Prior art keywords
offset
address
memory
memory bank
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2015/000482
Other languages
French (fr)
Inventor
Hanno Lieske
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renesas Electronics Corp
Original Assignee
Renesas Electronics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renesas Electronics Corp
Priority to PCT/JP2015/000482
Publication of WO2016125202A1
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1668 Details of memory controller

Definitions

  • the present invention relates to a data transfer apparatus, and more particularly, to a data transfer apparatus which reads data from a multi-bank memory.
  • the Viola-Jones object detection algorithm and SURF (Speeded Up Robust Features) local feature detection algorithm are popular algorithms. They are based on a calculation of digital image features such as a Haar-like feature or a determinant of a Hessian blob detector, which can be calculated extremely fast utilizing an integral image.
  • the integral image is a data structure for efficiently generating a sum of values in a rectangle subset of a grid.
  • for the Haar-like feature or Hessian blob detector, which consist of rectangles, only accesses to the integral image at the four corners of a rectangle are required to calculate the sum of all values inside the rectangle. Therefore, it is possible to achieve a fast and constant processing time for one rectangle, independent of the rectangle size or geometry.
  • the Haar-like feature or Hessian blob detector can be easily calculated using calculated rectangle values.
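  • The four-corner lookup described above can be illustrated with a minimal sketch. The function names `integral_image` and `rect_sum`, the zero-padded layout, and the sample grid are assumptions made for this illustration and are not taken from the disclosure.

```python
def integral_image(grid):
    """Return an integral image padded with one leading row and column of zeros.

    ii[y][x] holds the sum of grid[0..y-1][0..x-1] (assumed layout).
    """
    height = len(grid)
    width = len(grid[0]) if height else 0
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += grid[y][x]
            # Value above plus the running sum of the current row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of grid[top..bottom-1][left..right-1] using four corner reads."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(grid)
print(rect_sum(ii, 1, 1, 3, 3))  # sum of 5, 6, 8 and 9
```

  • Whatever the rectangle size, `rect_sum` performs exactly four reads, which is why the processing time per rectangle is constant.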
  • PTL1 discloses a concept of accessing multiple memory banks in a multibank memory (e.g., SDRAM: Synchronous Dynamic Random Access Memory) at the same time for increasing a test speed per memory unit.
  • generated lower address bits are sent to both of two memory banks in the SDRAM at the same time in a test mode.
  • an upper address bit indicates which memory bank (a memory bank 1 or a memory bank 2) is to be accessed.
  • both memory banks are accessed at the same time, and the lower address bits indicate the address to be accessed.
  • An alternative to PTL1, in which the upper address bits are used for bank selection, is to use the lower address bits for bank selection.
  • the lower address bits are used for memory bank selection and the upper address bits indicate an address in a selected memory bank.
  • neighbouring data can be accessed at the same time from the multiple banks, so that the width of a memory data transfer bus can be increased to a multiple of the memory bank data width.
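  • A minimal sketch of bank selection by the lower address bits follows. The bank count of 16, the word-addressed memory, and the names `NUM_BANKS`, `BANK_BITS` and `split_address` are assumptions chosen for illustration only.

```python
NUM_BANKS = 16                          # assumed; must be a power of two here
BANK_BITS = NUM_BANKS.bit_length() - 1  # 4 lower bits select the bank


def split_address(addr):
    """Split an address into (bank index, address inside the bank)."""
    bank = addr & (NUM_BANKS - 1)  # lower bits: bank selection
    row = addr >> BANK_BITS        # upper bits: address in the selected bank
    return bank, row


# Neighbouring addresses fall into neighbouring banks with the same in-bank address.
for addr in range(0x20, 0x24):
    print(hex(addr), split_address(addr))
```

  • Under this mapping, consecutive addresses land in different banks, so they can be read at the same time with a single in-bank address.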
  • a burst transfer in memories of a DDR (Double Data Rate) family is performed in the same manner as described above.
  • a transfer frequency is increased instead of the bus width, so that a system with a higher system frequency than a memory bank frequency can access the DDR memory in a burst transfer mode.
  • NPTL1 discloses that an integral image defined by four corners of a Haar-like rectangle is accessed to calculate the Haar-like feature.
  • the accessed data are not continuously stored in the multiple memories.
  • only one address is provided to a memory controller and the memory controller can access only one bank at a time. Therefore, it is necessary to regularly access the memory by accessing only one bank at a time.
  • NPTL 1: Paul Viola and Michael Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1-511 to 1-518, December 2001
  • the system disclosed in PTL1 can only be used to access contiguous data that can be reached by specifying one address. Therefore, the system cannot access multiple banks in parallel using the same address when the data is not stored contiguously.
  • An aspect of any one of the embodiments is a data transfer apparatus including: a processing unit that outputs a plurality of offset values and a base address; a memory bank array that includes a plurality of memory banks; and a memory controller that offsets the base address by the offset values to generate offset addresses, and reads data from the memory bank array using the offset addresses, the memory controller receiving the offset values and the base address from the processing unit.
  • Fig. 1 is a block diagram schematically showing a basic configuration of a data transfer apparatus according to a first embodiment.
  • Fig. 2 is a block diagram showing a specific configuration of the data transfer apparatus according to the first embodiment.
  • Fig. 3 is a timing chart showing a reading operation of the data transfer apparatus according to the first embodiment.
  • Fig. 4 is a block diagram showing a first operation of the data transfer apparatus 100 in a first clock cycle.
  • Fig. 5 is a block diagram showing a second operation of the data transfer apparatus 100 in the first clock cycle.
  • Fig. 6 is a block diagram showing an operation of the data transfer apparatus 100 in a second clock cycle.
  • Fig. 7 is a block diagram showing a first operation of the data transfer apparatus 100 in a third clock cycle.
  • Fig. 8 is a block diagram showing a second operation of the data transfer apparatus 100 in the third clock cycle.
  • Fig. 9 is a block diagram showing a first operation of the data transfer apparatus 100 in a fourth clock cycle.
  • Fig. 10 is a block diagram showing a second operation of the data transfer apparatus 100 in the fourth clock cycle.
  • Fig. 11 is a block diagram schematically showing a configuration of a data transfer apparatus according to a second embodiment.
  • Fig. 12 is a timing chart showing a reading operation of the data transfer apparatus according to the second embodiment.
  • Fig. 13 is a flow chart showing a conflict judgement operation of the address conflict solving unit according to the second embodiment (Steps S201 to S210).
  • Fig. 14 is a flow chart showing the conflict judgement operation of the address conflict solving unit according to the second embodiment (Steps S211 to S216).
  • Fig. 15 is a flow chart showing the conflict judgement operation of the address conflict solving unit according to the second embodiment (Steps S217 to S227).
  • Fig. 16 is a block diagram schematically showing a configuration of a microcomputer according to a third embodiment.
  • the data transfer apparatus 100 can read data from a plurality of addresses in a multi-bank memory.
  • the plurality of addresses includes one base address and offset addresses.
  • data transfer apparatus 100 generates each of the offset addresses by offsetting the base address. Offset values between the base address and offset addresses are predetermined. Therefore, when the data transfer apparatus 100 continuously receives a plurality of the base addresses, the data transfer apparatus 100 executes the same offset operation on each base address in consecutive clock cycles.
  • the data transfer apparatus 100 generates four addresses (also referred to as four offset addresses) and reads data from the four offset addresses.
  • one offset address corresponds to the base address
  • the other three offset addresses correspond to the addresses generated by offsetting the base address.
  • the base address is also referred to as a first offset address
  • the other three offset addresses are also referred to as second to fourth offset addresses.
  • data read from the four offset addresses correspond to the values of the four corners of a Haar-like rectangle, for example.
  • Fig. 1 is a block diagram schematically showing a basic configuration of the data transfer apparatus 100 according to the first embodiment.
  • the data transfer apparatus 100 includes a processing unit 101, a memory controller 3, and a memory bank array 4.
  • the memory controller 3 and the memory bank array 4 are configured to transfer data between each other.
  • Fig. 2 is a block diagram showing a specific configuration of the data transfer apparatus 100 according to the first embodiment.
  • the processing unit 101 includes a processing memory 1 and a processing controller 2.
  • the processing memory 1 and the processing controller 2 are configured to transfer data between each other.
  • the memory controller 3 and the memory bank array 4 are disposed in a memory unit 102.
  • the processing controller 2 and the memory controller 3 can transfer data between each other via a bus 5.
  • the bus 5 is configured as a 128-bit bus. However, the configuration of the bus 5 is not limited to this configuration.
  • the processing controller 2 controls an operation of the memory controller 3.
  • the memory controller 3 reads data from the memory bank array 4 according to an instruction from the processing controller 2.
  • Data and addresses handled in the data transfer apparatus 100 are two-digit hexadecimal values. However, the data and addresses handled in the data transfer apparatus 100 are not limited to this example.
  • the processing memory 1 includes an address memory 11 and a data memory 12.
  • the address memory 11 and the data memory 12 can hold 16 data values.
  • the address memory 11 holds base addresses therein.
  • the data memory 12 stores the data output from the processing controller 2 therein.
  • the processing controller 2 includes an offset memory 21, a temporary base address memory 22, and a temporary data memory 23.
  • the processing controller 2 holds three offset values in the offset memory 21, and outputs the three offset values to the memory controller 3 as appropriate.
  • the processing controller 2 reads the base address from the address memory 11 and stores the read base address into the temporary base address memory 22, and outputs the read base address to the memory controller 3 as appropriate. Further, the processing controller 2 stores data received from the memory controller 3 into the temporary data memory 23, and outputs the received data to the data memory 12.
  • the memory controller 3 includes offset registers 31A to 31C, adders 32A to 32C, and a calculation unit 33. Each of the offset registers 31A to 31C stores one of the three offset values from the offset memory 21 therein.
  • the adders 32A to 32C read the offset values from the offset registers 31A to 31C, respectively.
  • Each of the adders 32A to 32C adds the read offset value to the base address to generate the second to fourth offset addresses.
  • the memory controller 3 accesses the four offset addresses in the memory bank array 4, and reads the data from the four offset addresses.
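  • The address generation performed by the adders 32A to 32C can be sketched as follows. The function name `generate_offset_addresses` and the concrete base address and offset values are hypothetical; overflow handling for two-digit hexadecimal addresses is not specified in this description and is therefore left out.

```python
def generate_offset_addresses(base_address, offsets):
    """Return the first to fourth offset addresses for one base address.

    The first offset address is the base address itself; each remaining
    address is base + offset, mirroring one adder per offset register.
    """
    return [base_address] + [base_address + f for f in offsets]


offsets = (0x03, 0x31, 0x34)  # hypothetical values for F1 to F3
addresses = generate_offset_addresses(0x12, offsets)
print([f"{a:02X}" for a in addresses])
```

  • Because the offsets stay in the registers, only a new base address has to be transferred to generate the next group of four addresses.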
  • the memory bank array 4 includes 16 memory banks MB0 to MB15.
  • Each of the memory banks MB0 to MB15 includes a plurality of memory elements (not shown in figure).
  • the first to fourth offset addresses are different from each other. Further, the first to fourth offset addresses are included in different memory banks, respectively. In other words, each of the memory banks MB0 to MB15 includes only one of the first to fourth offset addresses.
  • the number of the memory banks in the memory bank array 4 is merely an example. Thus, the number of the memory banks in the memory bank array 4 may be a plural number other than 16.
  • the address memory 11 holds base addresses B1 to B4 (also referred to as first to fourth base addresses). As an explanation of the other addresses in the address memory 11 is not needed to understand the reading operation of the data transfer apparatus 100, a description thereof will be omitted.
  • the offset memory 21 holds three offset values F1 to F3 (also referred to as first to third offset values).
  • FIG. 3 is a timing chart showing the reading operation of the data transfer apparatus according to the first embodiment.
  • FIG. 4 is a block diagram showing a first operation of the data transfer apparatus 100 in a first clock cycle.
  • the processing controller 2 outputs the first to third offset values F1 to F3 to the memory controller 3 (T1 in Fig. 3).
  • the first to third offset values F1 to F3 in the offset memory 21 are output to the offset registers 31A to 31C, and the offset registers 31A to 31C store the first to third offset values F1 to F3 therein, respectively.
  • Fig. 5 is a block diagram showing a second operation of the data transfer apparatus 100 in the first clock cycle.
  • the processing controller 2 reads the base address from the address memory 11.
  • the processing controller 2 reads the four base addresses B1 to B4 from the address memory 11 in a clock cycle.
  • Fig. 6 is a block diagram showing an operation of the data transfer apparatus 100 in a second clock cycle.
  • the processing controller 2 outputs the stored base addresses one by one.
  • the processing controller 2 outputs the first base address B1 to the adders 32A to 32C.
  • the memory controller 3 sets the first base address B1 as a first offset address A1_1.
  • the adder 32A adds the first offset value F1 to the first base address B1, and sets the calculated value as a second offset address A2_1.
  • the adder 32B adds the second offset value F2 to the first base address B1, and sets the calculated value as a third offset address A3_1.
  • the adder 32C adds the third offset value F3 to the first base address B1, and sets the calculated value as a fourth offset address A4_1. Then, the memory controller 3 accesses the memory bank array 4 using the first to fourth offset addresses A1_1 to A4_1.
  • FIG. 7 is a block diagram showing a first operation of the data transfer apparatus 100 in a third clock cycle.
  • the memory controller 3 receives first to fourth data values D1_1 to D4_1 from the different memory banks in the memory bank array 4 corresponding to the first to fourth offset addresses A1_1 to A4_1.
  • Fig. 8 is a block diagram showing a second operation of the data transfer apparatus 100 in the third clock cycle. Further, in this clock cycle, the processing controller 2 outputs the second base address B2 to the adders 32A to 32C. Then, the memory controller 3 sets the second base address B2 as a first offset address A1_2.
  • the adder 32A adds the first offset value F1 to the second base address B2, and sets the calculated value as a second offset address A2_2.
  • the adder 32B adds the second offset value F2 to the second base address B2, and sets the calculated value as a third offset address A3_2.
  • the adder 32C adds the third offset value F3 to the second base address B2, and sets the calculated value as a fourth offset address A4_2.
  • the memory controller 3 accesses the memory bank array 4 using the first to fourth offset addresses A1_2 to A4_2. In sum, the memory controller 3 can receive the next base address from the processing controller 2 and generate the next first to fourth offset addresses, because it has already received all of the first to fourth data values from the offset addresses generated in the preceding clock cycle.
  • FIG. 9 is a block diagram showing a first operation of the data transfer apparatus 100 in a fourth clock cycle.
  • the calculation unit 33 in the memory controller 3 calculates a first calculated value CV1 based on the first to fourth data values and outputs the first calculated value CV1 to the temporary data memory 23.
  • the data values D1_1 to D4_1 from the first to fourth offset addresses are converted to one calculated value CV1.
  • the first calculated value CV1 is transferred to the data memory 12.
  • the calculation unit 33 in the memory controller 3 receives first to fourth data values D1_2 to D4_2 from the different memory banks in the memory bank array 4 corresponding to the first to fourth offset addresses A1_2 to A4_2.
  • FIG. 10 is a block diagram showing a second operation of the data transfer apparatus 100 in the fourth clock cycle.
  • the processing controller 2 outputs the third base address B3 to the adders 32A to 32C.
  • the memory controller 3 sets the third base address B3 as a first offset address A1_3.
  • the adder 32A adds the first offset value F1 to the third base address B3, and sets the calculated value as a second offset address A2_3.
  • the adder 32B adds the second offset value F2 to the third base address B3, and sets the calculated value as a third offset address A3_3.
  • the adder 32C adds the third offset value F3 to the third base address B3, and sets the calculated value as a fourth offset address A4_3.
  • the memory controller 3 accesses the memory bank array 4 using the first to fourth offset addresses A1_3 to A4_3.
  • the base address transfer and the memory access using the offset addresses (e.g., the transfer of the base address B1 and the memory access using the first to fourth offset addresses A1_1 to A4_1 shown in Fig. 6), the data reading (e.g., reading the first to fourth data values D1_1 to D4_1 shown in Fig. 7), and the calculated value transfer (e.g., the transfer of the first calculated value CV1 shown in Fig. 9) are performed for each of the base addresses transferred in series from the processing controller 2 to the memory controller 3.
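  • The overlap of these three stages across clock cycles can be sketched as a simple schedule. The function name `pipeline_schedule` and the stage labels are hypothetical; the stage timing follows the first to fourth clock cycles described above, and continuing the same pattern for later base addresses is an assumption.

```python
def pipeline_schedule(num_base_addresses):
    """Map each clock cycle to the operations assumed to happen in it."""
    # Cycle 1: offsets are loaded and the base addresses are fetched.
    schedule = {1: ["load offsets F1-F3", "fetch base addresses"]}
    for k in range(1, num_base_addresses + 1):
        # Base address Bk is issued one cycle after the previous one.
        schedule.setdefault(k + 1, []).append(f"issue B{k}, access A1_{k}-A4_{k}")
        schedule.setdefault(k + 2, []).append(f"read D1_{k}-D4_{k}")
        schedule.setdefault(k + 3, []).append(f"output CV{k}")
    return schedule


for cycle, operations in sorted(pipeline_schedule(4).items()):
    print(cycle, "; ".join(operations))
```

  • In this sketch, from the fourth clock cycle onward one calculated value is produced per cycle as long as base addresses keep arriving.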
  • Haar-like features are occasionally used.
  • for a Haar-like rectangle, accesses to an Integral Image stored in the memory bank array are performed.
  • the shape of the Haar-like rectangle is predetermined according to its intended use. For example, an upper-left corner of the rectangle corresponds to the base address. Then, an upper-right corner, a lower-left corner and a lower-right corner correspond to the second to fourth offset addresses, respectively. Therefore, it is possible to read the Integral Image data of the four corners of the Haar-like rectangle by using the one base address in the data transfer apparatus 100.
  • a rectangle value can be calculated by the calculation unit 33.
  • the calculation unit 33 assigns the read data from the four offset addresses to the expression 1 described below and calculates the rectangle value RV.
  • UL means a value of the upper-left corner
  • UR means a value of the upper-right corner
  • LL means a value of the lower-left corner
  • LR means a value of the lower-right corner.
  • the calculation unit 33 includes three subtractors to calculate the rectangle value RV.
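  • Expression 1 is not reproduced in this text. The sketch below therefore assumes the standard integral-image identity RV = LR - UR - LL + UL and shows one possible way to evaluate it with exactly three subtractions; the function name `rectangle_value` and the grouping of the terms are assumptions, not the disclosed circuit.

```python
def rectangle_value(ul, ur, ll, lr):
    """Rectangle value from four corner values using three subtractions.

    Assumes RV = LR - UR - LL + UL, regrouped as (LR - LL) - (UR - UL).
    """
    upper = ur - ul       # first subtractor: upper edge difference
    lower = lr - ll       # second subtractor: lower edge difference
    return lower - upper  # third subtractor: final rectangle value


# Corner values of a small integral image (hypothetical sample data).
print(rectangle_value(ul=1, ur=6, ll=12, lr=45))
```

  • Regrouping the expression this way avoids an adder, which is consistent with a calculation unit built from three subtractors.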
  • the present data transfer apparatus 100 can read the data from a plurality of non-consecutive addresses in the multi-bank memory based on one base address.
  • one base address is sent to the memory unit from the processing unit to read a plurality of data in a clock cycle.
  • Fig. 11 is a block diagram schematically showing a configuration of the data transfer apparatus 200 according to the second embodiment.
  • the data transfer apparatus 200 has a configuration in which an address conflict solving unit 6 is added to the data transfer apparatus 100.
  • the address conflict solving unit 6 is inserted between the memory controller 3 and the memory bank array 4.
  • As other configurations of the data transfer apparatus 200 are similar to those of the data transfer apparatus 100, a description of those will be omitted.
  • the first to fourth offset addresses are addresses in different memory banks, respectively.
  • each of the memory banks MB0 to MB15 can include two or more of the first to fourth offset addresses in the present embodiment.
  • each of the memory banks MB0 to MB15 can fulfill only one access in a clock cycle and can output data only from one random address.
  • two or more clock cycles are needed to output all data from the first to fourth offset addresses, when any of the memory banks MB0 to MB15 receives two or more of the first to fourth offset addresses.
  • Fig. 12 is a timing chart showing the reading operation of the data transfer apparatus according to the second embodiment.
  • Case 1: In case 1, the first to fourth offset addresses are included in different memory banks, respectively. Thus, the address conflict solving unit 6 does not find a conflict and the first to fourth offset addresses can be accessed in parallel in the same clock cycle. That is, the address conflict solving unit 6 needs one clock cycle to receive all the data from the memory bank array 4 (Fig. 12, referred to as "C1").
  • Case 2: In case 2, two of the first to fourth offset addresses belong to the same memory bank.
  • the first and second offset addresses belong to a first memory bank
  • the third offset address belongs to a second memory bank
  • the fourth offset address belongs to the third memory bank.
  • the address conflict solving unit 6 accesses the first and second offset addresses in series in different clock cycles, and accesses the third and fourth offset addresses in the clock cycle when the first or second offset address is accessed.
  • alternatively, in case 2, the first and second offset addresses conflict with each other and the third and fourth offset addresses conflict with each other.
  • the address conflict solving unit 6 accesses the first and second offset addresses in series in different clock cycles and accesses the third and fourth offset addresses in series in different clock cycles.
  • the address conflict solving unit 6 accesses one of the first and second offset addresses and one of the third and fourth offset addresses in parallel in the same clock cycle, and the other of the first and second offset addresses and the other of the third and fourth offset addresses in parallel in the same clock cycle. That is, the address conflict solving unit 6 needs two clock cycles to receive all the data from memory bank array 4 (Fig. 12, referred to as "C2").
  • Case 3: In case 3, three of the first to fourth offset addresses belong to the same memory bank.
  • the first to third offset addresses belong to one memory bank and the fourth offset address belongs to another memory bank.
  • the address conflict solving unit 6 accesses the first to third offset addresses in series in different clock cycles, respectively.
  • the address conflict solving unit 6 accesses the fourth offset address in the clock cycle when any one of the first to third offset addresses is accessed. That is, the address conflict solving unit 6 needs three clock cycles to receive all the data from memory bank array 4 (Fig. 12, referred to as "C3").
  • Case 4: In case 4, all of the first to fourth offset addresses belong to the same memory bank. Thus, the first to fourth offset addresses conflict with each other.
  • the address conflict solving unit 6 accesses the first to fourth offset addresses in series in different clock cycles, respectively. That is, the address conflict solving unit 6 needs four clock cycles to receive all the data from memory bank array 4 (Fig. 12, referred to as "C4").
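  • The number of clock cycles in cases 1 to 4 equals the largest number of offset addresses that fall into a single memory bank. A minimal sketch, assuming the bank index of each offset address is already known (the function name `cycles_needed` and the sample bank indices are hypothetical):

```python
from collections import Counter


def cycles_needed(banks):
    """Clock cycles needed when each bank serves one access per cycle.

    `banks` lists the bank index of each of the four offset addresses.
    """
    return max(Counter(banks).values())


print(cycles_needed([0, 1, 2, 3]))  # case 1 -> 1
print(cycles_needed([0, 0, 2, 3]))  # case 2, one conflicting pair -> 2
print(cycles_needed([0, 0, 2, 2]))  # case 2, two conflicting pairs -> 2
print(cycles_needed([0, 0, 0, 3]))  # case 3 -> 3
print(cycles_needed([0, 0, 0, 0]))  # case 4 -> 4
```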
  • a conflict judgement operation of the address conflict solving unit 6 will be described.
  • a number of access time slots NOA and access delays AD1 to AD4 are set to "0".
  • the number of access time slots NOA means the number of clock cycles needed for reading all the data from the memory bank array 4.
  • the access delays AD1 to AD4 mean the clock cycles when the first to fourth offset addresses in the memory bank array 4 are accessed, respectively.
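  • As a compact illustration of these two quantities, the sketch below assigns each offset address a delay equal to the number of earlier offset addresses that target the same memory bank. This is an assumed shortcut and not the flow chart of Figs. 13 to 15 itself; the function name `judge_conflicts` is hypothetical, and the value of NOA for the conflict-free case is inferred from case 1 needing one clock cycle.

```python
def judge_conflicts(banks):
    """Return (NOA, [AD1, AD2, AD3, AD4]) for the given bank indices.

    Each address waits one cycle for every earlier address in its bank.
    """
    accesses_per_bank = {}
    delays = []
    for bank in banks:
        delay = accesses_per_bank.get(bank, 0)
        delays.append(delay)
        accesses_per_bank[bank] = delay + 1
    noa = max(delays) + 1  # cycles needed to read all data
    return noa, delays


print(judge_conflicts([5, 5, 5, 5]))  # all four in one bank
print(judge_conflicts([5, 5, 7, 5]))  # first, second and fourth share a bank
print(judge_conflicts([1, 2, 1, 2]))  # two conflicting pairs
```

  • For the bank patterns checked here, the sketch yields the same values as the corresponding set steps of the conflict judgement operation, for example NOA = 3 with delays 0, 1, 0, 2 when the first, second and fourth offset addresses share a bank.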
  • Figs. 13 to 15 are flow charts showing the conflict judgement operation of the address conflict solving unit 6 according to the second embodiment.
  • the conflict judgement operation of the address conflict solving unit 6 includes steps S201 to S227.
  • Fig. 13 shows the steps S201 to S210
  • Fig. 14 shows the steps S211 to S216
  • Fig. 15 shows the steps S217 to S227.
  • Step S201: The address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M2 of the second offset address A2.
  • Step S202: When the memory bank M1 of the first offset address A1 is the same as the memory bank M2 of the second offset address A2 in step S201, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M3 of the third offset address A3.
  • Step S203: When the memory bank M1 of the first offset address A1 is the same as the memory bank M3 of the third offset address A3 in step S202, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S204: When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in step S203, the address conflict solving unit 6 sets the number of access time slots NOA to "4", the access delay AD1 to "0", the access delay AD2 to "1", the access delay AD3 to "2", and the access delay AD4 to "3". Then, the conflict judgement operation is finished. This case corresponds to case 4 described above.
  • Step S205: When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in step S203, the address conflict solving unit 6 sets the number of access time slots NOA to "3", the access delay AD1 to "0", the access delay AD2 to "1", the access delay AD3 to "2", and the access delay AD4 to "0". Then, the conflict judgement operation is finished. This case corresponds to case 3 described above.
  • Step S206: When the memory bank M1 of the first offset address A1 is different from the memory bank M3 of the third offset address A3 in step S202, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S207: When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in step S206, the address conflict solving unit 6 sets the number of access time slots NOA to "3", the access delay AD1 to "0", the access delay AD2 to "1", the access delay AD3 to "0", and the access delay AD4 to "2". Then, the conflict judgement operation is finished. This case corresponds to case 3 described above.
  • Step S208: When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in step S206, the address conflict solving unit 6 judges whether the memory bank M3 of the third offset address A3 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S209: When the memory bank M3 of the third offset address A3 is the same as the memory bank M4 of the fourth offset address A4 in step S208, the address conflict solving unit 6 sets the number of access time slots NOA to "2", the access delay AD1 to "0", the access delay AD2 to "1", the access delay AD3 to "0", and the access delay AD4 to "1". Then, the conflict judgement operation is finished. This case corresponds to case 2 described above.
  • Step S210: When the memory bank M3 of the third offset address A3 is different from the memory bank M4 of the fourth offset address A4 in step S208, the address conflict solving unit 6 sets the number of access time slots NOA to "2", the access delay AD1 to "0", the access delay AD2 to "1", the access delay AD3 to "0", and the access delay AD4 to "0". Then, the conflict judgement operation is finished. This case corresponds to case 2 described above.
  • Step S211: When the memory bank M1 of the first offset address A1 is different from the memory bank M2 of the second offset address A2 in step S201, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M3 of the third offset address A3.
  • Step S212: When the memory bank M1 of the first offset address A1 is the same as the memory bank M3 of the third offset address A3 in step S211, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S213: When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in step S212, the address conflict solving unit 6 sets the number of access time slots NOA to "3", the access delay AD1 to "0", the access delay AD2 to "0", the access delay AD3 to "1", and the access delay AD4 to "2". Then, the conflict judgement operation is finished. This case corresponds to case 3 described above.
  • Step S214: When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in step S212, the address conflict solving unit 6 judges whether the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S215: When the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4 in step S214, the address conflict solving unit 6 sets the number of access time slots NOA to "2", the access delay AD1 to "0", the access delay AD2 to "0", the access delay AD3 to "1", and the access delay AD4 to "1". Then, the conflict judgement operation is finished. This case corresponds to case 2 described above.
  • Step S216: When the memory bank M2 of the second offset address A2 is different from the memory bank M4 of the fourth offset address A4 in step S214, the address conflict solving unit 6 sets the number of access time slots NOA to "2", the access delay AD1 to "0", the access delay AD2 to "0", the access delay AD3 to "1", and the access delay AD4 to "0". Then, the conflict judgement operation is finished. This case corresponds to case 2 described above.
  • Step S217 When the memory bank M1 of the first offset address A1 is different from the memory bank M3 of the third offset address A3 in the step S211, the address conflict solving unit 6 judges whether the memory bank M2 of the second offset address A2 is the same as the memory bank M3 of the third offset address A3.
  • Step S218 When the memory bank M2 of the second offset address A2 is the same as the memory bank M3 of the third offset address A3 in the step S217, the address conflict solving unit 6 judges whether the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S219 When the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4 in the step S218, the address conflict solving unit 6 sets the access time slot NOA to "3", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "1", and the access timing AD4 to "2". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 3 described above.
  • Step S220 When the memory bank M2 of the second offset address A2 is different from the memory bank M4 of the fourth offset address A4 in the step S218, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S221 When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in the step S220, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "1", and the access timing AD4 to "1". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
  • Step S222 When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in the step S220, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "1", and the access timing AD4 to "0". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
  • Step S223 When the memory bank M2 of the second offset address A2 is different from the memory bank M3 of the third offset address A3 in the step S217, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S224 When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in the step S223, the address conflict solving unit 6 judges whether the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S225 When the memory bank M2 of the second offset address A2 is different from the memory bank M4 of the fourth offset address A4 in the step S224, the address conflict solving unit 6 judges whether the memory bank M3 of the third offset address A3 is the same as the memory bank M4 of the fourth offset address A4.
  • Step S226 When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in the step S223, the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4 in the step S224, or the memory bank M3 of the third offset address A3 is the same as the memory bank M4 of the fourth offset address A4 in the step S225, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "0", and the access timing AD4 to "1". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
  • Step S227 When the memory bank M3 of the third offset address A3 is different from the memory bank M4 of the fourth offset address A4 in the step S225, the address conflict solving unit 6 sets the access time slot NOA to "1", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "0", and the access timing AD4 to "0". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 1 described above.
  • the address conflict solving unit 6 which consists of the steps S201 to S227 shown in Figs. 13 to 15, it can be understood that the address conflict solving unit 6 can read the data from the memory bank array 4 as appropriate when an address conflict occurs.
  • the address conflict solving unit 6 accesses the address in the memory bank array 4 based on the access timings AD1 to AD4. Specifically, when the access timings AD1 to AD4 are "0", the address conflict solving unit 6 accesses the first to fourth offset addresses in the memory bank array 4 in a clock cycle CLK1, and reads the data from the first to fourth offset addresses in the memory bank array 4 in a clock cycle CLK2 following the clock cycle CLK1.
  • the address conflict solving unit 6 accesses the second to fourth offset addresses in the memory bank array 4 in the clock cycle CLK2, and reads the data from the second to fourth offset addresses in the memory bank array 4 in a clock cycle CLK3 following the clock cycle CLK2.
  • the address conflict solving unit 6 accesses the third and fourth offset addresses in the memory bank array 4 in the clock cycle CLK3, and reads the data from the third and fourth offset addresses in the memory bank array 4 in a clock cycle CLK4 following the clock cycle CLK3.
  • the address conflict solving unit 6 accesses the fourth offset address in the memory bank array 4 in the clock cycle CLK4, and reads the data from the fourth offset address in the memory bank array 4 in a clock cycle CLK5 following the clock cycle CLK4.
  • the data transfer apparatus 200 can read the data from the four offset addresses in the memory bank array 4, even if all four offset addresses belong to the same memory bank.
  • a microcomputer 300 is an example of a microcomputer in which the data transfer apparatus 100 is incorporated.
  • Fig. 16 is a block diagram schematically showing a configuration of the microcomputer 300 according to the third embodiment.
  • the microcomputer 300 includes a SIMD (Single Instruction Multiple Data) processor 301, a central processing unit (CPU) 302, a single memory wrapper 303, and the bus 5. Data can be transferred among the SIMD processor 301, the central processing unit (CPU) 302, and the single memory wrapper 303 via the bus 5.
  • the single memory wrapper 303 corresponds to the memory unit 102.
  • the data transfer apparatus can be applied to a microcomputer. Therefore, according to the configuration, it is possible to access and read a plurality of data by sending only one base address in a clock cycle in the microcomputer including the data transfer apparatus 100.
  • the data transfer apparatus 200 can be applied to a microcomputer.
  • the first offset address is the base address.
  • another address may be used as the first offset address.
  • the first offset address may be an address offset by another offset value (e.g., a fourth offset value) than the first to third offset values.
  • four offset values are sent to the memory controller 3 from the processing controller 2 via the bus 5.
  • the memory controller 3 further includes another offset register for holding the fourth offset value and another adder for generating the first offset address.
  • the four offset values are generated.
  • the number of the offset values may be an arbitrary number greater than two.
  • the processing memory 1 includes an address memory 11 and a data memory 12.
  • two different areas in one memory unit may be used.

Abstract

A data transfer apparatus (100) includes a processing unit (101), a memory bank array (4), and a memory controller (3). The processing unit (101) outputs a plurality of offset values and a base address. The memory bank array includes a plurality of memory banks (4). The memory controller (3) offsets the base address by the offset values to generate offset addresses, and reads data from the memory bank array (4) using the offset addresses. The memory controller receives the offset values and base address from the processing unit (101).

Description

DATA TRANSFER APPARATUS
The present invention relates to a data transfer apparatus, and more particularly, to a data transfer apparatus which reads data from a multi-bank memory.
In the field of object detection, the Viola-Jones object detection algorithm and the SURF (Speeded Up Robust Features) local feature detection algorithm are popular algorithms. They are based on a calculation of digital image features such as a Haar-like feature or a determinant of a Hessian blob detector, which can be calculated extremely fast utilizing an integral image.
The integral image is a data structure for efficiently computing the sum of values in a rectangular subset of a grid. To calculate the Haar-like feature or the Hessian blob detector, both of which consist of rectangles, only the accesses to the integral image at the four corners of a rectangle are required to obtain the sum of all values inside the rectangle. Therefore, one rectangle can be processed in a short, constant time regardless of its size or geometry. Afterwards, the Haar-like feature or the Hessian blob detector can easily be calculated from the computed rectangle values.
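As an illustration of the data structure described above, the following minimal sketch builds an integral image from a small grid. The function name `integral_image` and the sample values are assumptions chosen for this example; they do not appear in the cited algorithms or in the present apparatus.

```python
def integral_image(img):
    """Return ii, where ii[y][x] is the sum of img[0..y][0..x] (inclusive)."""
    height, width = len(img), len(img[0])
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(width):
            row_sum += img[y][x]
            above = ii[y - 1][x] if y > 0 else 0
            ii[y][x] = row_sum + above
    return ii


# Hypothetical 3x3 grid of pixel values.
sample = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
sample_ii = integral_image(sample)
```

Each entry is computed once, so later rectangle sums need only lookups at the rectangle corners rather than a pass over every enclosed value.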
PTL1 discloses a concept of accessing multiple memory banks in a multibank memory (e.g., SDRAM: Synchronous Dynamic Random Access Memory) at the same time for increasing a test speed per memory unit. For the case of a two-bank memory, e.g., generated lower address bits are sent to both of two memory banks in the SDRAM at the same time in a test mode. In a normal mode, an upper address bit indicates which memory bank (a memory bank 1 or a memory bank 2) is to be accessed. On the other hand, in the test mode, both memory banks are accessed at the same time, and the lower address bits indicate the address to be accessed.
An alternative to PTL1, in which the upper address bits are used for bank selection, is the usage of the lower address bits for bank selection. In this case, the lower address bits are used for memory bank selection and the upper address bits indicate an address in a selected memory bank. With such access, neighbouring data can be accessed at the same time from the multiple banks, so that the width of a memory data transfer bus can be increased to a multiple of the memory bank data width.
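The lower-bit bank selection described above can be sketched as follows. This is only an illustration under assumptions: 16 banks (a power of two) are assumed, and the names `NUM_BANKS`, `BANK_BITS`, and `split_address` are hypothetical.

```python
NUM_BANKS = 16                          # assumed; must be a power of two here
BANK_BITS = NUM_BANKS.bit_length() - 1  # 4 bits select one of 16 banks


def split_address(addr):
    """Split an address into (bank, address inside the bank)."""
    bank = addr & (NUM_BANKS - 1)   # lower bits select the memory bank
    in_bank = addr >> BANK_BITS     # upper bits address the selected bank
    return bank, in_bank


# Neighbouring addresses fall into neighbouring banks at the same in-bank address,
# which is why they can be read from several banks at the same time.
neighbours = [split_address(a) for a in range(0x20, 0x24)]
```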
A burst transfer in memories of a DDR (Double Data Rate) family is performed in the same manner as described above. In this example, a transfer frequency is increased instead of the bus width, so that a system with a higher system frequency than a memory bank frequency can access the DDR memory in a burst transfer mode.
NPTL1 discloses that an integral image defined by four corners of a Haar-like rectangle is accessed to calculate the Haar-like feature. In this case, the accessed data are not continuously stored in the multiple memories. Further, only one address is provided to a memory controller and the memory controller can access only one bank at a time. Therefore, it is necessary to regularly access the memory by accessing only one bank at a time.
PTL 1: US Patent Publication No. 1997/5671392
NPTL 1: Paul Viola and Michael Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1-511 to 1-518, December, 2001
However, the inventor has found the following problems in the above-mentioned technologies. The system disclosed in PTL1 can only be used to access continuous data, which can be accessed by specifying one address. Therefore, the system cannot access multiple banks in parallel using the same address when the data is not continuously stored.
In the case of NPTL1, where the data is not continuously accessed, a parallel access of multiple data from the multiple bank memory is not possible, because only one address is transferred to the memory controller of the memory bank array in each clock cycle.
An aspect of any one of embodiments is a data transfer apparatus including: a processing unit that outputs a plurality of offset values and a base address; a memory bank array that includes a plurality of memory banks; and a memory controller that offsets the base address by the offset values to generate offset addresses, and reads data from the memory bank array using the offset addresses, the memory controller receiving the offset values and the base address from the processing unit.
According to an aspect of any one of embodiments, it is possible to access and read a plurality of data words at non-consecutive addresses by receiving only one base address in a clock cycle.
Fig. 1 is a block diagram schematically showing a basic configuration of a data transfer apparatus according to a first embodiment.
Fig. 2 is a block diagram showing a specific configuration of the data transfer apparatus according to the first embodiment.
Fig. 3 is a timing chart showing a reading operation of the data transfer apparatus according to the first embodiment.
Fig. 4 is a block diagram showing a first operation of the data transfer apparatus 100 in a first clock cycle.
Fig. 5 is a block diagram showing a second operation of the data transfer apparatus 100 in the first clock cycle.
Fig. 6 is a block diagram showing an operation of the data transfer apparatus 100 in a second clock cycle.
Fig. 7 is a block diagram showing a first operation of the data transfer apparatus 100 in a third clock cycle.
Fig. 8 is a block diagram showing a second operation of the data transfer apparatus 100 in the third clock cycle.
Fig. 9 is a block diagram showing a first operation of the data transfer apparatus 100 in a fourth clock cycle.
Fig. 10 is a block diagram showing a second operation of the data transfer apparatus 100 in the fourth clock cycle.
Fig. 11 is a block diagram schematically showing a configuration of a data transfer apparatus according to a second embodiment.
Fig. 12 is a timing chart showing a reading operation of the data transfer apparatus according to the second embodiment.
Fig. 13 is a flow chart showing a conflict judgement operation of the address conflict solving unit according to the second embodiment (Steps S201 to S210).
Fig. 14 is a flow chart showing the conflict judgement operation of the address conflict solving unit according to the second embodiment (Steps S211 to S216).
Fig. 15 is a flow chart showing the conflict judgement operation of the address conflict solving unit according to the second embodiment (Steps S217 to S227).
Fig. 16 is a block diagram schematically showing a configuration of a microcomputer according to a third embodiment.
Hereinafter, embodiments shall be explained with reference to the drawings. The same components are denoted by the same reference numerals throughout the drawings, and repeated explanations shall be omitted as necessary.
First Embodiment
A data transfer apparatus 100 according to a first embodiment shall be explained. The data transfer apparatus 100 can read data from a plurality of addresses in a multi-bank memory. The plurality of addresses includes one base address and offset addresses. In this embodiment, data transfer apparatus 100 generates each of the offset addresses by offsetting the base address. Offset values between the base address and offset addresses are predetermined. Therefore, when the data transfer apparatus 100 continuously receives a plurality of the base addresses, the data transfer apparatus 100 executes the same offset operation on each base address in consecutive clock cycles.
Hereinafter, an example in which the data transfer apparatus 100 generates four addresses (also referred to as four offset addresses) and reads data from the four offset addresses will be described. In this embodiment, one offset address corresponds to the base address, and the other three offset addresses correspond to the addresses generated by offsetting the base address. Specifically, the base address is also referred to as a first offset address, and the other three offset addresses are also referred to as second to fourth offset addresses. Further, data read from the four offset addresses correspond to the values of the four corners of a Haar-like rectangle, for example.
Fig. 1 is a block diagram schematically showing a basic configuration of the data transfer apparatus 100 according to the first embodiment. As shown in Fig. 1, the data transfer apparatus 100 includes a processing unit 101, a memory controller 3, and a memory bank array 4. The memory controller 3 and the memory bank array 4 are configured to transfer data between each other.
Fig. 2 is a block diagram showing a specific configuration of the data transfer apparatus 100 according to the first embodiment. As shown in Fig. 2, the processing unit 101 includes a processing memory 1 and a processing controller 2. The processing memory 1 and the processing controller 2 are configured to transfer data between each other. The memory controller 3 and the memory bank array 4 are disposed in a memory unit 102.
The processing controller 2 and the memory controller 3 can transfer data between each other via a bus 5. The bus 5 is configured as a 128-bit bus. However, the configuration of the bus 5 is not limited to this configuration.
The processing controller 2 controls an operation of the memory controller 3. The memory controller 3 reads data from the memory bank array 4 according to an instruction from the processing controller 2.
The data and addresses handled in the data transfer apparatus 100 are two-digit hexadecimal values. However, the data and addresses handled in the data transfer apparatus 100 are not limited to this example.
The processing memory 1 includes an address memory 11 and a data memory 12. The address memory 11 and the data memory 12 can hold 16 data values. The address memory 11 holds base addresses therein. The data memory 12 stores the data output from the processing controller 2 therein.
The processing controller 2 includes an offset memory 21, a temporary base address memory 22, and a temporary data memory 23. The processing controller 2 holds three offset values in the offset memory 21, and outputs the three offset values to the memory controller 3 as appropriate. The processing controller 2 reads the base address from the address memory 11 and stores the read base address into the temporary base address memory 22, and outputs the read base address to the memory controller 3 as appropriate. Further, the processing controller 2 stores data received from the memory controller 3 into the temporary data memory 23, and outputs the received data to the data memory 12.
The memory controller 3 includes offset registers 31A to 31C, adders 32A to 32C, and a calculation unit 33. Each of the offset registers 31A to 31C stores one of the three offset values from the offset memory 21 therein. The adders 32A to 32C read the offset values from the offset registers 31A to 31C, respectively. Each of the adders 32A to 32C adds the read offset value to the base address to generate the second to fourth offset addresses. The memory controller 3 accesses the four offset addresses in the memory bank array 4, and reads the data from the four offset addresses.
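The address generation performed by the offset registers and adders can be sketched as below. This is a minimal software analogy, not the hardware itself: the function name `offset_addresses` and the sample base address and offset values are assumptions, and address wrap-around at the address width is not modelled.

```python
def offset_addresses(base, offsets):
    """Return the first to fourth offset addresses for one base address.

    The first offset address is the base address itself; each further
    address is the base address plus one stored offset value, mirroring
    the adders 32A to 32C.
    """
    return [base] + [base + f for f in offsets]


# Hypothetical offset values F1 to F3 held in the offset registers.
F = (0x04, 0x40, 0x44)
addresses = offset_addresses(0x12, F)   # [0x12, 0x16, 0x52, 0x56]
```

Because the offset values stay in the registers, only a new base address has to be transferred for each subsequent set of four addresses.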
In this example, the memory bank array 4 includes 16 memory banks MB0 to MB15. Each of the memory banks MB0 to MB15 includes a plurality of memory elements (not shown in figure). In this embodiment, the first to fourth offset addresses are different from each other. Further, the first to fourth offset addresses are included in different memory banks, respectively. In other words, each of the memory banks MB0 to MB15 includes at most one of the first to fourth offset addresses. Note that the number of the memory banks in the memory bank array 4 is merely an example. Thus, the number of the memory banks in the memory bank array 4 may be a plural number other than 16.
Next, a reading operation of the data transfer apparatus 100 shall be described. An initial state of the data transfer apparatus 100 is shown in Fig. 2. In this example, the address memory 11 holds base addresses B1 to B4 (also referred to as first to fourth base addresses). As an explanation of the other addresses in the address memory 11 is not needed to understand the reading operation of the data transfer apparatus 100, a description thereof will be omitted. The offset memory 21 holds three offset values F1 to F3 (also referred to as first to third offset values).
Hereinafter, the reading operation of the data transfer apparatus 100 in each clock cycle will be described. Fig. 3 is a timing chart showing the reading operation of the data transfer apparatus according to the first embodiment.
First clock cycle
Fig. 4 is a block diagram showing a first operation of the data transfer apparatus 100 in a first clock cycle. In the first operation in the first clock cycle, the processing controller 2 outputs the first to third offset values F1 to F3 to the memory controller 3 (T1 in Fig. 3). Specifically, the first to third offset values F1 to F3 in the offset memory 21 are output to the offset registers 31A to 31C, and the offset registers 31A to 31C store the first to third offset values F1 to F3 therein, respectively.
Fig. 5 is a block diagram showing a second operation of the data transfer apparatus 100 in the first clock cycle. In the second operation in the first clock cycle, the processing controller 2 reads the base address from the address memory 11. In this example, the processing controller 2 reads the four base addresses B1 to B4 from the address memory 11 in a clock cycle.
Second clock cycle
Fig. 6 is a block diagram showing an operation of the data transfer apparatus 100 in a second clock cycle. The processing controller 2 outputs the stored base addresses one by one. In this clock cycle (T2 in Fig. 3), the processing controller 2 outputs the first base address B1 to the adders 32A to 32C. Then, the memory controller 3 sets the first base address B1 as a first offset address A1_1. The adder 32A adds the first offset value F1 to the first base address B1, and sets the calculated value as a second offset address A2_1. The adder 32B adds the second offset value F2 to the first base address B1, and sets the calculated value as a third offset address A3_1. The adder 32C adds the third offset value F3 to the first base address B1, and sets the calculated value as a fourth offset address A4_1. Then, the memory controller 3 accesses the memory bank array 4 using the first to fourth offset addresses A1_1 to A4_1.
Third clock cycle
Fig. 7 is a block diagram showing a first operation of the data transfer apparatus 100 in a third clock cycle. In this clock cycle (T3 in Fig. 3), the memory controller 3 receives first to fourth data values D1_1 to D4_1 from the first to fourth offset addresses A1_1 to A4_1 in different memory banks in the memory bank array 4 corresponding to the first to fourth offset addresses A1_1 to A4_1.
Fig. 8 is a block diagram showing a second operation of the data transfer apparatus 100 in the third clock cycle. Further, in this clock cycle, the processing controller 2 outputs the second base address B2 to the adders 32A to 32C. Then, the memory controller 3 sets the second base address B2 as a first offset address A1_2. The adder 32A adds the first offset value F1 to the second base address B2, and sets the calculated value as a second offset address A2_2. The adder 32B adds the second offset value F2 to the second base address B2, and sets the calculated value as a third offset address A3_2. The adder 32C adds the third offset value F3 to the second base address B2, and sets the calculated value as a fourth offset address A4_2. The memory controller 3 accesses the memory bank array 4 using the first to fourth offset addresses A1_2 to A4_2.
In sum, the memory controller 3 can receive the next base address from the processing controller 2 and generate the next first to fourth offset addresses, because it has already received all of the first to fourth data values from the first to fourth offset addresses generated in the preceding clock cycle.
Fourth clock cycle
Fig. 9 is a block diagram showing a first operation of the data transfer apparatus 100 in a fourth clock cycle. In this clock cycle (T4 in Fig. 3), the calculation unit 33 in the memory controller 3 calculates a first calculated value CV1 based on the first to fourth data values and outputs the first calculated value CV1 to the temporary data memory 23. Thus, the data values D1_1 to D4_1 from the first to fourth offset addresses are converted to one calculated value CV1. Then, the first calculated value CV1 is transferred to the data memory 12. Further, the calculation unit 33 in the memory controller 3 receives first to fourth data values D1_2 to D4_2 from the different memory banks in the memory bank array 4 corresponding to the first to fourth offset addresses A1_2 to A4_2.
Fig. 10 is a block diagram showing a second operation of the data transfer apparatus 100 in the fourth clock cycle. Further, in this clock cycle, the processing controller 2 outputs the third base address B3 to the adders 32A to 32C. Then, the memory controller 3 sets the third base address B3 as a first offset address A1_3. The adder 32A adds the first offset value F1 to the third base address B3, and sets the calculated value as a second offset address A2_3. The adder 32B adds the second offset value F2 to the third base address B3, and sets the calculated value as a third offset address A3_3. The adder 32C adds the third offset value F3 to the third base address B3, and sets the calculated value as a fourth offset address A4_3. Then, the memory controller 3 accesses the memory bank array 4 using the first to fourth offset addresses A1_3 to A4_3.
In this embodiment, the base address transfer and the memory access using the offset addresses (e.g., the transfer of the base address B1 and the memory access using the first to fourth offset addresses A1_1 to A4_1 shown in Fig. 6), the data reading (e.g., reading the first to fourth data values D1_1 to D4_1 shown in Fig. 7), and the calculated value transfer (e.g., transfer of the first calculated value CV1 shown in Fig. 9) are performed for each of the base addresses transferred in series from the processing controller 2 to the memory controller 3.
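The overlap of these three stages across consecutive clock cycles can be sketched as a simple schedule. This is only an illustration derived from the first to fourth clock cycles described above; the function name `pipeline_schedule` and the event strings are assumptions introduced for this example.

```python
def pipeline_schedule(num_bases):
    """Map each clock cycle to the operations performed in it.

    Assumed timing, following the description: the offset values are sent
    in cycle 1, base address Bi is sent in cycle i + 1, its data are read
    in cycle i + 2, and its calculated value CVi is output in cycle i + 3.
    """
    events = {1: ["send offset values F1-F3", "read base addresses"]}
    for i in range(1, num_bases + 1):
        events.setdefault(i + 1, []).append(f"send B{i}, access A1_{i}-A4_{i}")
        events.setdefault(i + 2, []).append(f"read D1_{i}-D4_{i}")
        events.setdefault(i + 3, []).append(f"output CV{i}")
    return events


schedule = pipeline_schedule(4)
for cycle in sorted(schedule):
    print(cycle, schedule[cycle])
```

Under these assumptions, the fourth clock cycle carries three operations at once: the output of CV1, the data reading for B2, and the memory access for B3.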
Next, an example of the four offset addresses will be described. In object recognition technology, Haar-like features are occasionally used. In this case, accesses to an integral image stored in the memory bank array 4 are performed for a Haar-like rectangle. The shape of the Haar-like rectangle is predetermined according to its intended use. For example, an upper-left corner of the rectangle corresponds to the base address. Then, an upper-right corner, a lower-left corner and a lower-right corner correspond to the second to fourth offset addresses, respectively. Therefore, it is possible to read the integral image data of the four corners of the Haar-like rectangle by using the one base address in the data transfer apparatus 100.
In this case, a rectangle value can be calculated by the calculation unit 33. The calculation unit 33 assigns the read data from the four offset addresses to the expression 1 described below and calculates the rectangle value RV. In the expression 1, UL means a value of the upper-left corner, UR means a value of the upper-right corner, LL means a value of the lower-left corner, and LR means a value of the lower-right corner.

<Expression 1>
RV = (LR - LL) - (UR - UL)
As shown in the expression 1, a calculation of the rectangle value RV requires three subtraction operations. Therefore, in this case, the calculation unit 33 includes three subtractors to calculate the rectangle value RV.
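A worked instance of the expression 1 may help here. The sketch below is an illustration only: the function name `rectangle_value` and the sample integral image are assumptions, and the corner values are assumed to be the integral image entries that bound the rectangle on its upper and left sides.

```python
def rectangle_value(ul, ur, ll, lr):
    """Evaluate RV = (LR - LL) - (UR - UL) with three subtractions."""
    lower = lr - ll        # first subtractor
    upper = ur - ul        # second subtractor
    return lower - upper   # third subtractor


# Hypothetical integral image of the grid [[1, 2, 3], [4, 5, 6], [7, 8, 9]].
ii = [[1, 3, 6],
      [5, 12, 21],
      [12, 27, 45]]

# Corners bounding the values 5, 6, 8 and 9 of the original grid.
rv = rectangle_value(ul=ii[0][0], ur=ii[0][2], ll=ii[2][0], lr=ii[2][2])
```

With these values, RV = (45 - 12) - (6 - 1) = 28, which equals 5 + 6 + 8 + 9.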
As described above, the present data transfer apparatus 100 can read the data from a plurality of non-consecutive addresses in the multi-bank memory based on one base address. Thus, one base address is sent to the memory unit from the processing unit to read a plurality of data in a clock cycle. In other words, it is possible to access and read in parallel a plurality of data from non-consecutive addresses by receiving one base address in a clock cycle.
Second Embodiment
Next, a data transfer apparatus 200 according to a second embodiment shall be explained. Fig. 11 is a block diagram schematically showing a configuration of the data transfer apparatus 200 according to the second embodiment. The data transfer apparatus 200 has a configuration in which an address conflict solving unit 6 is added to the data transfer apparatus 100. The address conflict solving unit 6 is inserted between the memory controller 3 and the memory bank array 4. As other configurations of the data transfer apparatus 200 are similar to those of the data transfer apparatus 100, a description of those will be omitted.
In the first embodiment, the first to fourth offset addresses are addresses in different memory banks, respectively. In contrast, each of the memory banks MB0 to MB15 can include two or more of the first to fourth offset addresses in the present embodiment. However, each of the memory banks MB0 to MB15 can fulfill only one access in a clock cycle and can output data only from one random address. Thus, two or more clock cycles are needed to output all data from the first to fourth offset addresses, when any of the memory banks MB0 to MB15 receives two or more of the first to fourth offset addresses. Fig. 12 is a timing chart showing the reading operation of the data transfer apparatus according to the second embodiment.
Case 1
In the case 1, the first to fourth offset addresses are included in different memory banks, respectively. Thus, the address conflict solving unit 6 does not find a conflict and the first to fourth offset addresses can be accessed in parallel in the same clock cycle. That is, the address conflict solving unit 6 needs one clock cycle to receive all the data from the memory bank array 4 (Fig. 12, referred to as "C1").
Case 2
In the case 2, two of the first to fourth offset addresses belong to the same memory bank. For example, the first and second offset addresses belong to a first memory bank, the third offset address belongs to a second memory bank, and the fourth offset address belongs to the third memory bank. Thus, the first and second offset addresses conflict with each other. The address conflict solving unit 6 accesses the first and second offset addresses in series in different clock cycles, and accesses the third and fourth offset addresses in the clock cycle when the first or second offset address is accessed.
Further, there can be another situation in the case 2. That is, two of the first to fourth offset addresses belong to one memory bank, and the other two of the first to fourth offset addresses belong to another memory bank. For example, when the first and second offset addresses conflict with each other and the third and fourth offset addresses conflict with each other, the address conflict solving unit 6 accesses the first and second offset addresses in series in different clock cycles and accesses the third and fourth offset addresses in series in different clock cycles. The address conflict solving unit 6 accesses one of the first and second offset addresses and one of the third and fourth offset addresses in parallel in the same clock cycle, and the other of the first and second offset addresses and the other of the third and fourth offset addresses in parallel in the same clock cycle.
That is, the address conflict solving unit 6 needs two clock cycles to receive all the data from memory bank array 4 (Fig. 12, referred to as "C2").
Case 3
In the case 3, three of the first to fourth offset addresses belong to the same memory bank. For example, the first to third offset addresses belong to one memory bank and the fourth offset address belongs to another memory bank. Thus, the first to third offset addresses conflict with each other. The address conflict solving unit 6 accesses the first to third offset addresses in series in different clock cycles, respectively. The address conflict solving unit 6 accesses the fourth offset address in the clock cycle when any one of the first to third offset addresses is accessed. That is, the address conflict solving unit 6 needs three clock cycles to receive all the data from the memory bank array 4 (Fig. 12, referred to as "C3").
Case 4
In the case 4, all of the first to fourth offset addresses belong to the same memory bank. Thus, the first to fourth offset addresses conflict with each other. The address conflict solving unit 6 accesses the first to fourth offset addresses in series in different clock cycles, respectively. That is, the address conflict solving unit 6 needs four clock cycles to receive all the data from the memory bank array 4 (Fig. 12, referred to as "C4").
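The four cases share one rule: because each memory bank fulfills only one access per clock cycle, the number of clock cycles needed equals the largest number of offset addresses that fall into any single memory bank. The following is a minimal sketch of that rule, not the circuit itself. The function name `cycles_needed` and the bank indices are assumptions chosen for illustration.

```python
from collections import Counter


def cycles_needed(banks):
    """Return the number of clock cycles needed to read all offset addresses.

    `banks` lists the memory bank index of each offset address. Each bank
    serves one access per clock cycle, so the busiest bank sets the total.
    """
    return max(Counter(banks).values())


# One illustrative bank assignment per case (bank indices are hypothetical).
print(cycles_needed([0, 1, 2, 3]))  # case 1 -> 1
print(cycles_needed([5, 5, 2, 3]))  # case 2, one conflicting pair -> 2
print(cycles_needed([5, 5, 7, 7]))  # case 2, two conflicting pairs -> 2
print(cycles_needed([5, 5, 5, 3]))  # case 3 -> 3
print(cycles_needed([5, 5, 5, 5]))  # case 4 -> 4
```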
Next, a conflict judgement operation of the address conflict solving unit 6 will be described. At an initial state, the number of access time slots NOA and the access timings AD1 to AD4 are set to "0". Note that the number of access time slots NOA means the number of clock cycles needed for reading all the data from the memory bank array 4. The access timings AD1 to AD4 are access delays, counted in clock cycles from the first access cycle, that indicate when the first to fourth offset addresses in the memory bank array 4 are accessed, respectively.
Figs. 13 to 15 are flow charts showing the conflict judgement operation of the address conflict solving unit 6 according to the second embodiment. The conflict judgement operation of the address conflict solving unit 6 includes steps S201 to S227. Fig. 13 shows the steps S201 to S210, Fig. 14 shows the steps S211 to S216, and Fig. 15 shows the steps S217 to S227.
Step S201
At first, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M2 of the second offset address A2.
Step S202
When the memory bank M1 of the first offset address A1 is the same as the memory bank M2 of the second offset address A2 in the step S201, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M3 of the third offset address A3.
Step S203
When the memory bank M1 of the first offset address A1 is the same as the memory bank M3 of the third offset address A3 in the step S202, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
Step S204
When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in the step S203, the address conflict solving unit 6 sets the access time slot NOA to "4", the access timing AD1 to "0", the access timing AD2 to "1", the access timing AD3 to "2", and the access timing AD4 to "3". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 4 described above.
Step S205
When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in the step S203, the address conflict solving unit 6 sets the access time slot NOA to "3", the access timing AD1 to "0", the access timing AD2 to "1", the access timing AD3 to "2", and the access timing AD4 to "0". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 3 described above.
Step S206
When the memory bank M1 of the first offset address A1 is different from the memory bank M3 of the third offset address A3 in the step S202, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
Step S207
When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in the step S206, the address conflict solving unit 6 sets the access time slot NOA to "3", the access timing AD1 to "0", the access timing AD2 to "1", the access timing AD3 to "0", and the access timing AD4 to "2". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 3 described above.
Step S208
When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in the step S206, the address conflict solving unit 6 judges whether the memory bank M3 of the third offset address A3 is the same as the memory bank M4 of the fourth offset address A4.
Step S209
When the memory bank M3 of the third offset address A3 is the same as the memory bank M4 of the fourth offset address A4 in the step S208, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "1", the access timing AD3 to "0", and the access timing AD4 to "1". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
Step S210
When the memory bank M3 of the third offset address A3 is different from the memory bank M4 of the fourth offset address A4 in the step S208, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "1", the access timing AD3 to "0", and the access timing AD4 to "0". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
Step S211
When the memory bank M1 of the first offset address A1 is different from the memory bank M2 of the second offset address A2 in the step S201, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M3 of the third offset address A3.
Step S212
When the memory bank M1 of the first offset address A1 is the same as the memory bank M3 of the third offset address A3 in the step S211, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
Step S213
When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in the step S212, the address conflict solving unit 6 sets the access time slot NOA to "3", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "1", and the access timing AD4 to "2". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 3 described above.
Step S214
When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in the step S212, the address conflict solving unit 6 judges whether the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4.
Step S215
When the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4 in the step S214, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "1", and the access timing AD4 to "1". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
Step S216
When the memory bank M2 of the second offset address A2 is different from the memory bank M4 of the fourth offset address A4 in the step S214, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "1", and the access timing AD4 to "0". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
Step S217
When the memory bank M1 of the first offset address A1 is different from the memory bank M3 of the third offset address A3 in the step S211, the address conflict solving unit 6 judges whether the memory bank M2 of the second offset address A2 is the same as the memory bank M3 of the third offset address A3.
Step S218
When the memory bank M2 of the second offset address A2 is the same as the memory bank M3 of the third offset address A3 in the step S217, the address conflict solving unit 6 judges whether the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4.
Step S219
When the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4 in the step S218, the address conflict solving unit 6 sets the access time slot NOA to "3", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "1", and the access timing AD4 to "2". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 3 described above.
Step S220
When the memory bank M2 of the second offset address A2 is different from the memory bank M4 of the fourth offset address A4 in the step S218, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
Step S221
When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in the step S220, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "1", and the access timing AD4 to "1". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
Step S222
When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in the step S220, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "1", and the access timing AD4 to "0". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
Step S223
When the memory bank M2 of the second offset address A2 is different from the memory bank M3 of the third offset address A3 in the step S217, the address conflict solving unit 6 judges whether the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4.
Step S224
When the memory bank M1 of the first offset address A1 is different from the memory bank M4 of the fourth offset address A4 in the step S223, the address conflict solving unit 6 judges whether the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4.
Step S225
When the memory bank M2 of the second offset address A2 is different from the memory bank M4 of the fourth offset address A4 in the step S224, the address conflict solving unit 6 judges whether the memory bank M3 of the third offset address A3 is the same as the memory bank M4 of the fourth offset address A4.
Step S226
When the memory bank M1 of the first offset address A1 is the same as the memory bank M4 of the fourth offset address A4 in the step S223, the memory bank M2 of the second offset address A2 is the same as the memory bank M4 of the fourth offset address A4 in the step S224, or the memory bank M3 of the third offset address A3 is the same as the memory bank M4 of the fourth offset address A4 in the step S225, the address conflict solving unit 6 sets the access time slot NOA to "2", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "0", and the access timing AD4 to "1". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 2 described above.
Step S227
When the memory bank M3 of the third offset address A3 is different from the memory bank M4 of the fourth offset address A4 in the step S225, the address conflict solving unit 6 sets the access time slot NOA to "1", the access timing AD1 to "0", the access timing AD2 to "0", the access timing AD3 to "0", and the access timing AD4 to "0". Then, the conflict judgement operation is finished. As a result, this case corresponds to the case 1 described above.
According to the conflict judgement operation of the address conflict solving unit 6, which consists of the steps S201 to S227 shown in Figs. 13 to 15, the address conflict solving unit 6 can read the data from the memory bank array 4 appropriately even when an address conflict occurs.
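The steps S201 to S227 form a decision tree over the memory banks M1 to M4 of the four offset addresses. The sketch below transcribes that tree into software for illustration only; the function name `judge_conflicts` and the tuple return value are assumptions, and the actual unit is hardware. Each leaf returns the number of access time slots NOA and the access timings AD1 to AD4 exactly as stated in the corresponding step.

```python
def judge_conflicts(m1, m2, m3, m4):
    """Follow steps S201 to S227; return (NOA, (AD1, AD2, AD3, AD4))."""
    if m1 == m2:                                # S201
        if m1 == m3:                            # S202
            if m1 == m4:                        # S203
                return 4, (0, 1, 2, 3)          # S204, case 4
            return 3, (0, 1, 2, 0)              # S205, case 3
        if m1 == m4:                            # S206
            return 3, (0, 1, 0, 2)              # S207, case 3
        if m3 == m4:                            # S208
            return 2, (0, 1, 0, 1)              # S209, case 2
        return 2, (0, 1, 0, 0)                  # S210, case 2
    if m1 == m3:                                # S211
        if m1 == m4:                            # S212
            return 3, (0, 0, 1, 2)              # S213, case 3
        if m2 == m4:                            # S214
            return 2, (0, 0, 1, 1)              # S215, case 2
        return 2, (0, 0, 1, 0)                  # S216, case 2
    if m2 == m3:                                # S217
        if m2 == m4:                            # S218
            return 3, (0, 0, 1, 2)              # S219, case 3
        if m1 == m4:                            # S220
            return 2, (0, 0, 1, 1)              # S221, case 2
        return 2, (0, 0, 1, 0)                  # S222, case 2
    if m1 == m4 or m2 == m4 or m3 == m4:        # S223, S224, S225
        return 2, (0, 0, 0, 1)                  # S226, case 2
    return 1, (0, 0, 0, 0)                      # S227, case 1


# Hypothetical bank indices: A1 and A3 share bank 2, A2 and A4 share bank 9.
noa, timings = judge_conflicts(2, 9, 2, 9)
print(noa, timings)  # 2 (0, 0, 1, 1), which is step S215
```

In every leaf, two offset addresses that share a memory bank receive different access timings, and NOA equals the largest access timing plus one.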
Next, an access operation of the address conflict solving unit 6 will be described. The address conflict solving unit 6 accesses the addresses in the memory bank array 4 based on the access timings AD1 to AD4. Specifically, the address conflict solving unit 6 accesses each of the first to fourth offset addresses whose access timing is "0" in a clock cycle CLK1, and reads the data from those offset addresses in a clock cycle CLK2 following the clock cycle CLK1. The address conflict solving unit 6 accesses each of the second to fourth offset addresses whose access timing is "1" in the clock cycle CLK2, and reads the data from those offset addresses in a clock cycle CLK3 following the clock cycle CLK2. The address conflict solving unit 6 accesses each of the third and fourth offset addresses whose access timing is "2" in the clock cycle CLK3, and reads the data from those offset addresses in a clock cycle CLK4 following the clock cycle CLK3. When the access timing AD4 is "3", the address conflict solving unit 6 accesses the fourth offset address in the memory bank array 4 in the clock cycle CLK4, and reads the data from the fourth offset address in the memory bank array 4 in a clock cycle CLK5 following the clock cycle CLK4.
As described above, the data transfer apparatus 200 can read the data from the four offset addresses in the memory bank array 4, even if all four offset addresses belong to the same memory bank.
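The relation between an access timing and the clock cycles can be sketched as follows: an offset address with access timing t is accessed in clock cycle CLK(t + 1) and its data is read in clock cycle CLK(t + 2). The function name `access_schedule`, the labels A1 to A4 as strings, and the dictionary layout are assumptions made for this illustration.

```python
def access_schedule(ad):
    """Map each clock cycle to the offset addresses accessed and read in it.

    `ad` holds the access timings (AD1, AD2, AD3, AD4). An address with
    timing t is accessed in CLK(t + 1) and its data is read in CLK(t + 2).
    """
    schedule = {}
    for index, timing in enumerate(ad, start=1):
        name = f"A{index}"
        access_slot = schedule.setdefault(timing + 1, {"access": [], "read": []})
        access_slot["access"].append(name)
        read_slot = schedule.setdefault(timing + 2, {"access": [], "read": []})
        read_slot["read"].append(name)
    return dict(sorted(schedule.items()))


# Case 4 timings: one address is accessed per clock cycle, CLK1 to CLK4,
# and the last data is read in CLK5.
for clk, slot in access_schedule((0, 1, 2, 3)).items():
    print(f"CLK{clk}: access {slot['access']}, read {slot['read']}")
```

With all access timings at "0" (case 1), the same function places all four accesses in CLK1 and all four reads in CLK2.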
Third Embodiment
Next, a microcomputer 300 according to a third embodiment shall be explained. The microcomputer 300 is an example of a microcomputer in which the data transfer apparatus 100 is incorporated. Fig. 16 is a block diagram schematically showing a configuration of the microcomputer 300 according to the third embodiment. As shown in Fig. 16, the microcomputer 300 includes a SIMD (Single Instruction Multiple Data) processor 301, a central processing unit (CPU) 302, a single memory wrapper 303, and the bus 5. Data can be transferred among the SIMD processor 301, the central processing unit (CPU) 302, and the single memory wrapper 303 via the bus 5. In this case, one or both of the SIMD processor 301 and the central processing unit (CPU) 302 correspond to the processing unit 101. The single memory wrapper 303 corresponds to the memory unit 102.
As described above, the data transfer apparatus can be applied to a microcomputer. Therefore, according to the configuration, it is possible to access and read a plurality of data by sending only one base address in a clock cycle in the microcomputer including the data transfer apparatus 100.
In this embodiment, the configuration in which the data transfer apparatus 100 is applied to a microcomputer is described, but this is merely an example. The data transfer apparatus 200 can also be applied to a microcomputer.
Other embodiments
The present invention is not limited to the embodiments described above, and can be modified as appropriate without departing from the scope of the invention. For example, in the above embodiments, the first offset address is the base address. However, another address may be used as the first offset address. The first offset address may be an address offset by an offset value other than the first to third offset values (e.g., a fourth offset value). In this case, four offset values are sent to the memory controller 3 from the processing controller 2 via the bus 5. The memory controller 3 further includes another offset register for holding the fourth offset value and another adder for generating the first offset address.
In the above embodiments, the four offset values are generated. However, this is merely an example. The number of the offset values may be an arbitrary number more than two.
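The address generation with and without the fourth offset value can be sketched as below. This is an illustration under stated assumptions, not the adder circuit: the function name `offset_addresses`, the keyword `first_offset`, and the numeric addresses are hypothetical, and each offset address is formed by adding an offset value to the base address.

```python
def offset_addresses(base, offsets, first_offset=None):
    """Return the four offset addresses generated from one base address.

    `offsets` holds the first to third offset values. When `first_offset`
    (the optional fourth offset value) is None, the first offset address is
    the base address itself, as in the embodiments above.
    """
    first = base if first_offset is None else base + first_offset
    return [first] + [base + value for value in offsets]


# Hypothetical values: base address 100 with offset values 3, 40 and 43.
print(offset_addresses(100, (3, 40, 43)))                  # [100, 103, 140, 143]
# Modification: a fourth offset value of 1 shifts the first offset address.
print(offset_addresses(100, (3, 40, 43), first_offset=1))  # [101, 103, 140, 143]
```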
In the above embodiments, the processing memory 1 includes an address memory 11 and a data memory 12. However, this is merely an example. For example, two different areas in one memory unit may be used. Further, it may be possible to use one area in a memory unit and replace the base address information with the received data information.
1 PROCESSING MEMORY
2 PROCESSING CONTROLLER
3 MEMORY CONTROLLER
4 MEMORY BANK ARRAY
5 BUS
6 ADDRESS CONFLICT SOLVING UNIT
11 ADDRESS MEMORY
12 DATA MEMORY
21 OFFSET MEMORY
22 TEMPORARY BASE ADDRESS MEMORY
23 TEMPORARY DATA MEMORY
31A, 31B, 31C OFFSET REGISTERS
32A, 32B, 32C ADDERS
33 CALCULATION UNIT
100, 200 DATA TRANSFER APPARATUSES
101 PROCESSING UNIT
102 MEMORY UNIT
300 MICROCOMPUTER
301 SIMD (SINGLE INSTRUCTION MULTIPLE DATA) PROCESSOR
302 CENTRAL PROCESSING UNIT (CPU)
303 SINGLE MEMORY WRAPPER
MB0 TO MB15 MEMORY BANKS

Claims (7)

  1. A data transfer apparatus, comprising:
    a processing unit that outputs a plurality of offset values and a base address;
    a memory bank array that includes a plurality of memory banks; and
    a memory controller that offsets the base address by the offset values to generate offset addresses, and reads data from the memory bank array using the offset addresses, the memory controller receiving the offset values and the base address from the processing unit.
  2. The data transfer apparatus according to Claim 1, wherein
    the offset addresses belong to different memory banks, respectively, and
    the memory controller reads the data from the different memory banks in parallel in one clock cycle.
  3. The data transfer apparatus according to Claim 1, further comprising an address conflict solving unit that reads the data from the memory bank array using the offset addresses received from the memory controller and sends the read data to the memory controller, wherein
    when the offset addresses belong to different memory banks, respectively, the address conflict solving unit reads the data from the different memory banks in parallel in one clock cycle, and
    when two or more of the offset addresses belong to the same memory bank, the address conflict solving unit reads the data from the same memory bank including two or more of the offset addresses in series in a plurality of clock cycles.
  4. The data transfer apparatus according to Claim 1, wherein the memory controller adds the offset value to the base address to generate the offset address.
  5. The data transfer apparatus according to Claim 1, wherein
    the memory controller offsets the base address by three offset values to generate three offset addresses, and
    the base address and the three offset addresses correspond to the four corner addresses of a Haar-like rectangle.
  6. The data transfer apparatus according to Claim 1, wherein the memory controller offsets each base address in series when the memory controller receives a plurality of base addresses in series.
  7. The data transfer apparatus according to Claim 1, wherein the memory controller converts the read data to one data value and sends the converted value to the processing unit as a response for the corresponding base address.
PCT/JP2015/000482 2015-02-04 2015-02-04 Data transfer apparatus Ceased WO2016125202A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/000482 WO2016125202A1 (en) 2015-02-04 2015-02-04 Data transfer apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/000482 WO2016125202A1 (en) 2015-02-04 2015-02-04 Data transfer apparatus

Publications (1)

Publication Number Publication Date
WO2016125202A1 true WO2016125202A1 (en) 2016-08-11

Family

ID=52595393

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/000482 Ceased WO2016125202A1 (en) 2015-02-04 2015-02-04 Data transfer apparatus

Country Status (1)

Country Link
WO (1) WO2016125202A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5671392A (en) 1995-04-11 1997-09-23 United Memories, Inc. Memory device circuit and method for concurrently addressing columns of multiple banks of multi-bank memory array
WO2006039711A1 (en) * 2004-10-01 2006-04-13 Lockheed Martin Corporation Service layer architecture for memory access system and method
US8108625B1 (en) * 2006-10-30 2012-01-31 Nvidia Corporation Shared memory with parallel access and access conflict resolution mechanism
US20130219131A1 (en) * 2012-02-20 2013-08-22 Nimrod Alexandron Low access time indirect memory accesses

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PAUL VIOLA; MICHAEL JONES: "Rapid Object Detection using a Boosted Cascade of Simple Features", IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, December 2001 (2001-12-01), pages I-511 to I-518

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15706947

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15706947

Country of ref document: EP

Kind code of ref document: A1