Summary of the invention
An object of the present invention is at least to overcome the above-mentioned problems of the prior art by providing an FPGA-based, mapping-oriented network-on-chip (NoC) verification method and system that enable high-speed simulation and evaluation of a network-on-chip under application mapping.
To achieve the above object, the technical solution adopted by the present invention includes the following aspects.
A mapping-oriented network-on-chip verification method, comprising:
S101: obtaining the application input by the user and the network-on-chip topology set by the user, selecting a mapping algorithm, and using the mapping algorithm to map the nodes of the application onto the corresponding nodes of the user-set network-on-chip, so as to obtain and output a mapping result;
S102: setting the routing table of each node of a simulated network-on-chip according to the user-set network-on-chip topology, and configuring the topology of the simulated network-on-chip based on those routing tables; and, according to the mapping result, configuring the application data stream corresponding to each node and sending the application data stream to the simulated network-on-chip;
S103: receiving the application data stream and simulating the exchange and transmission of the application data stream in the simulated network-on-chip, so as to obtain the operation result of the mapping result in the simulated network-on-chip, the operation result including delay and power consumption information of the simulated network-on-chip;
S104: judging whether the delay and power consumption information are within a preset range, and terminating operation if they are.
Preferably, in the network-on-chip verification method, selecting the mapping algorithm specifically comprises: performing a complexity analysis on the application input by the user, and selecting the mapping algorithm based on the result of the complexity analysis.
Preferably, in the network-on-chip verification method, step S104 further comprises: if the delay and power consumption information are not within the preset range, returning to step S101 and performing the complexity analysis on the user-input application again, so as to update the mapping algorithm.
A mapping-oriented network-on-chip verification system, comprising:
an algorithm platform, configured to obtain the application input by the user and the network-on-chip topology set by the user, determine a mapping algorithm, use the mapping algorithm to map the nodes of the application onto the corresponding nodes of the user-set network-on-chip to obtain a mapping result, and output the mapping result to a processing platform;
the processing platform, configured to set the routing table of each node of a simulated network-on-chip in an emulation platform according to the user-set network-on-chip topology, so that the topology of the simulated network-on-chip is configured based on those routing tables; and, according to the mapping result, to configure the application data stream corresponding to each node and send the application data stream to the emulation platform;
the emulation platform, configured to receive the application data stream and simulate the exchange and transmission of the application data stream in the simulated network-on-chip, so as to obtain the operation result of the mapping result, the operation result including delay and power consumption information of the simulated network-on-chip.
Preferably, in the network-on-chip verification system, the algorithm platform is configured to update the mapping algorithm when the delay and power consumption information of the operation result are not within the preset range.
Preferably, in the network-on-chip verification system, the emulation platform is configured using partitions, buses, and a mesh-like structure between partitions.
In summary, by adopting the above technical solution, the present invention has at least the following beneficial effects:
1. An FPGA-based, mapping-oriented network-on-chip verification system is proposed. The system uses a mapping-oriented processing platform (which includes three mapping-oriented soft cores) to simulate application tasks and generate their data characteristics. A simulated network-on-chip emulation platform, MAENoC, is further designed. By writing the mapped application data to each node of the simulated network-on-chip and monitoring online the operation result of the mapping algorithm on the network-on-chip, high-speed simulation and evaluation of the network-on-chip under application mapping can be realized;
2. The MAENoC emulation platform is organized as interconnected partitions, and the nodes within a partition are connected by a bus. Using a configuration-based method, different NoC topologies can be simulated. Each node supports multiple pipeline mechanisms and dynamic reconfiguration of virtual channels, and virtualization technology is added, so that NoCs with a very large number of nodes can be verified.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments, so that its objects, technical solutions and advantages become clearer. It should be understood that the specific embodiments described herein are intended only to explain the present invention and not to limit it.
Fig. 1 shows the mapping-oriented network-on-chip verification method according to an exemplary embodiment of the present invention. The method of this embodiment mainly comprises:
S101: obtaining the application input by the user and the network-on-chip topology set by the user, selecting a mapping algorithm, and using the mapping algorithm to map the nodes of the application onto the corresponding nodes of the user-set network-on-chip, so as to obtain and output a mapping result;
S102: setting the routing table of each node of a simulated network-on-chip according to the user-set network-on-chip topology, and configuring the topology of the simulated network-on-chip based on those routing tables; and, according to the mapping result, configuring the application data stream corresponding to each node and sending the application data stream to the simulated network-on-chip;
S103: receiving the application data stream and simulating the exchange and transmission of the application data stream in the simulated network-on-chip, so as to obtain the operation result of the mapping result in the simulated network-on-chip, the operation result including delay and power consumption information of the simulated network-on-chip;
S104: judging whether the delay and power consumption information are within a preset range, and terminating operation if they are.
Specifically, the mapping-oriented network-on-chip verification method is implemented on the mapping-oriented network-on-chip verification system designed by the present invention. The network-on-chip verification system mainly comprises:
an algorithm platform, configured to obtain the application input by the user and the network-on-chip topology set by the user, determine a mapping algorithm, use the mapping algorithm to map the nodes of the application onto the corresponding nodes of the user-set network-on-chip to obtain a mapping result, and output the mapping result to a processing platform;
the processing platform, configured to set the routing table of each node of a simulated network-on-chip in an emulation platform according to the user-set network-on-chip topology, so that the topology of the simulated network-on-chip is configured based on those routing tables; and, according to the mapping result, to configure the application data stream corresponding to each node and send the application data stream to the emulation platform;
the emulation platform, configured to receive the application data stream and simulate the exchange and transmission of the application data stream in the simulated network-on-chip, so as to obtain the operation result of the mapping result, the operation result including delay and power consumption information of the simulated network-on-chip.
The algorithm platform obtains the application input by the user and the network-on-chip topology set by the user, and can select a mapping algorithm according to a complexity analysis of the application. It uses the mapping algorithm to map the nodes of the application onto the corresponding nodes of the user-set network-on-chip to obtain a mapping result, and outputs the mapping result to the back-end processing platform. Further, the algorithm platform can also refine the mapping algorithm according to the simulation results of the processing platform and the emulation platform. For example, in the first verification pass the simplest mapping algorithm is selected by complexity analysis; when the delay or power consumption in the operation result of the emulation platform is not within the preset range, the complexity analysis is performed on the application again and a new mapping algorithm is selected, whose complexity is generally higher than that of the previously selected one. The algorithm platform is implemented in software (for example, in MATLAB) on a hardware processor such as a computer or PC.
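As a non-limiting illustration, the feedback loop between the algorithm platform and the emulation platform can be sketched as follows. This is a minimal sketch under stated assumptions: the names `verify`, `toy_simulate`, `spread_map` and `neighbor_map` are hypothetical, the candidate algorithms are assumed to be pre-sorted from simplest to most complex, and the toy simulator merely stands in for the emulation platform.

```python
from typing import Callable, Dict, List, Optional, Tuple

Mapping = Dict[str, int]        # application node -> NoC node id
Result = Tuple[float, float]    # (delay, power), units are illustrative


def verify(algorithms: List[Tuple[str, Callable[[], Mapping]]],
           simulate: Callable[[Mapping], Result],
           max_delay: float,
           max_power: float) -> Optional[Tuple[str, Mapping, Result]]:
    """Try mapping algorithms from simplest to most complex (S101),
    simulate each mapping (S102-S103) and stop at the first one whose
    delay and power fall within the preset range (S104)."""
    for name, algorithm in algorithms:
        mapping = algorithm()
        delay, power = simulate(mapping)
        if delay <= max_delay and power <= max_power:
            return name, mapping, (delay, power)
    return None  # no candidate met the preset range


# Hypothetical stand-ins for two mapping algorithms and the emulator.
def spread_map() -> Mapping:
    return {"t0": 0, "t1": 3}


def neighbor_map() -> Mapping:
    return {"t0": 0, "t1": 1}


def toy_simulate(mapping: Mapping) -> Result:
    hops = abs(mapping["t0"] - mapping["t1"])
    return 10.0 * hops, 1.0 * hops


chosen = verify([("simple", spread_map), ("refined", neighbor_map)],
                toy_simulate, max_delay=15.0, max_power=2.0)
```

In this toy run the first mapping exceeds the delay limit, so the loop falls through to the second, more refined algorithm.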
As shown in Fig. 2, the mapping result can be passed to the processing platform as a configuration file. In the processing platform we introduce mapping-oriented SCPUs (simple processing cores based on MicroBlaze) that generate the data characteristics of the application. The processing platform includes a mapping result processing core SCPU3, a data transceiver core SCPU1, and an information configuration core SCPU2. The processing platform outputs the execution status of the mapping result on the network-on-chip (NoC). The inputs of the processing platform include the pre-generated NoC topology size (the network-on-chip topology customized by the user) and the mapping result. Using the PCIe interface of the FPGA, the mapping and scheduling result (the mapping result) is written to the mapping result processing core. The mapping result processing core SCPU3 is responsible for generating, based on the mapping result, the task description information of the application. The task description information configures the application data stream corresponding to each node (that is, the data stream of each node in the application task).
The information configuration core SCPU2 sets the routing table information of each node of the simulated network-on-chip in the emulation platform based on the pre-generated NoC topology size, and the emulation platform can configure the simulated network topology based on this routing table information. Each node in the network-on-chip is a router; by looking up its own routing table it knows to which node a packet can be sent or forwarded, and the topology of the network follows from these tables. The data transceiver core SCPU1 sends the generated task data stream of the application to MAENoC (the emulation platform designed by the present invention) and, while the emulation platform simulates the data exchange of the network-on-chip, acts as the receiving module of MAENoC, receiving the routed packets and collecting their statistics.
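As a non-limiting illustration of how per-node routing tables define a topology, the sketch below builds next-hop tables for a 2D mesh. It assumes dimension-ordered (XY) routing, which is an illustrative choice only; the text above leaves the routing mechanism configurable. The function names are hypothetical.

```python
def xy_routing_table(node, width, height):
    """Return {destination: next_hop} for one node of a width x height mesh.
    Nodes are numbered row by row; X is corrected first, then Y."""
    x, y = node % width, node // width
    table = {}
    for dst in range(width * height):
        dx, dy = dst % width, dst // width
        if dst == node:
            table[dst] = node                      # deliver locally
        elif dx != x:
            step = 1 if dx > x else -1
            table[dst] = y * width + (x + step)    # move along X
        else:
            step = 1 if dy > y else -1
            table[dst] = (y + step) * width + x    # move along Y
    return table


def build_tables(width, height):
    """Routing tables for every node, as a configuration core might emit them."""
    return {n: xy_routing_table(n, width, height)
            for n in range(width * height)}


def route(tables, src, dst):
    """Follow the tables hop by hop from src to dst."""
    path = [src]
    while path[-1] != dst:
        path.append(tables[path[-1]][dst])
    return path


tables = build_tables(3, 3)
path = route(tables, 0, 8)   # corner to opposite corner of a 3x3 mesh
```

Changing the topology then amounts to regenerating and reloading `tables`, without touching the node logic.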
The network-on-chip verification system can change the topology size of the network. Existing ways of changing the NoC topology under verification include re-synthesizing the entire project, using reconfiguration, or taking an existing large-scale NoC and realizing different NoC topologies by updating the routing tables. The present invention adopts the routing-table update approach to realize different network-on-chip topologies. Re-synthesizing the entire project takes a great deal of time, and when the network-on-chip contains a large number of nodes the synthesis time becomes prohibitive. Existing reconfiguration techniques consume a large amount of FPGA resources. We therefore designed the FPGA-based MAENoC network-on-chip emulation platform, in which partitions are interconnected into a network, and we change the network-on-chip topology by updating the global routing information table in Fig. 1. If processing elements (PEs) matching the application were added to the node design of the emulated NoC, the PEs would occupy a large amount of hardware resources when many nodes are verified, and distributed traffic generation would bring additional resource overhead as well as synchronization and design complexity. Since the verification platform does not need to consider real PE nodes, and in order to minimize resource consumption when generating the traffic characteristics of the application task while facilitating real-time monitoring, we moved the data sending and receiving functions to the data transceiver core SCPU1. The information configuration core SCPU2 is responsible for the centralized configuration of NoC node parameters. To make it easier to associate the mapped application task information with the network nodes, we added the mapping result processing core SCPU3.
The host PC passes the mapping result information to SCPU3 through the high-speed PCIe interface using the AXI4 protocol. SCPU3 then generates task description frames according to the mapping result and sends them to SCPU2, so that once the TR (Traffic Receiver) end has received all the task packets sent by all parent nodes of a task, the TG (Traffic Generator) is triggered to generate the corresponding data packets and send them to that task node. The task description frame, shown in Fig. 3, contains all the key information describing an application task: the source and destination of the data packets that the TG needs to generate, indicating the node of the simulated network-on-chip in the emulation platform to which each packet is to be sent; the dependencies of the task node for generating packets; the execution time of the task that generates the packets; and the size of the corresponding data packets.
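As a non-limiting illustration, the information carried by a task description frame and the parent-dependency trigger can be sketched as follows. The field names and the `firing_order` helper are hypothetical; the actual frame layout is the one shown in Fig. 3 and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class TaskFrame:
    task_id: int
    node: int             # simulated NoC node the task is mapped to
    parents: List[int]    # tasks whose packets must all arrive first
    children: List[int]   # tasks that receive this task's packets
    exec_time: int        # execution time of the task, in NoC cycles
    packet_size: int      # size of each generated packet, in flits


def firing_order(frames: Dict[int, TaskFrame]) -> List[int]:
    """Order in which tasks trigger packet generation: a task fires only
    after packets from all of its parent tasks have been received."""
    received: Dict[int, Set[int]] = {t: set() for t in frames}
    ready = sorted(t for t, f in frames.items() if not f.parents)
    fired: List[int] = []
    while ready:
        task = ready.pop(0)
        fired.append(task)
        for child in frames[task].children:
            received[child].add(task)           # receiver sees the packet
            if received[child] == set(frames[child].parents):
                ready.append(child)             # all parents done: trigger
    return fired


# A diamond-shaped task graph: 0 -> {1, 2} -> 3
frames = {
    0: TaskFrame(0, node=0, parents=[], children=[1, 2], exec_time=5, packet_size=4),
    1: TaskFrame(1, node=1, parents=[0], children=[3], exec_time=3, packet_size=2),
    2: TaskFrame(2, node=2, parents=[0], children=[3], exec_time=4, packet_size=2),
    3: TaskFrame(3, node=3, parents=[1, 2], children=[], exec_time=2, packet_size=1),
}
order = firing_order(frames)
```

Task 3 is held back until packets from both task 1 and task 2 have arrived, which mirrors the trigger condition described above.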
The task description frames are stored in BRAM and are read by SCPU1 according to the packet information received by the TR and the execution order of the tasks. Based on the execution time of the task, the system clock, and the other information in the task description frame, the TG randomly generates the packet content and assembles complete task data packets as they would appear under real conditions. The data packets, which include the source and destination addresses, are sent to the packet buffer space of each node of the simulated network-on-chip. Because the SCPU sends packets faster than the NoC system clock, there is no need to worry about packet buffers being read empty or about network congestion at the task simulation time. In addition, because packets are sent dynamically, the buffer space of a node does not need to hold all the data packets, which saves storage resources on the FPGA.
At the TR end, the data packets sent by each node are received over a shared bus, and the delay of each packet in the network is obtained from its sending timestamp and receiving timestamp. Receipt of the last flit sent by the last task means that the application has executed once, so the time required for the application to run once on the network-on-chip, including its data exchange, can be calculated.
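As a non-limiting illustration, the delay statistics described above reduce to simple timestamp arithmetic. The sketch below assumes timestamps expressed in NoC cycles and keyed by a packet identifier; both the function names and the sample values are hypothetical.

```python
def packet_delays(sent, received):
    """Per-packet network delay: receive timestamp minus send timestamp."""
    return {pid: received[pid] - sent[pid] for pid in received}


def application_time(sent, received):
    """Time for one execution of the application: from the first packet
    sent to the last packet (last flit of the last task) received."""
    return max(received.values()) - min(sent.values())


# Illustrative timestamps only, in NoC cycles.
sent = {0: 0, 1: 5}
received = {0: 12, 1: 30}

delays = packet_delays(sent, received)
average_delay = sum(delays.values()) / len(delays)
total_time = application_time(sent, received)
```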
SCPU2 receives the network topology parameters defined on the host PC, including the routing mechanism and the network topology size, generates the routing table information used to configure the NoC, and notifies the nodes of the NoC under verification to update their routing tables at the appropriate time, thereby realizing parameterized configuration of the network topology.
Finally, the emulation platform receives the application data stream and simulates the exchange and transmission of the application data stream in the simulated network-on-chip, so as to obtain the operation result of the mapping result, the operation result including delay and power consumption information of the simulated network-on-chip.
AdapNoC (a prior-art FPGA-based network-on-chip emulation platform) uses a small mesh structure within each partition, and DART (another prior-art FPGA-based network-on-chip emulation platform) fully connects the partitions with one another. Both approaches occupy a large amount of interconnect resources in the FPGA. We therefore propose a hierarchical method for designing the emulation platform MAENoC. The MAENoC designed by the present invention includes multiple partitions, and the number of NoC nodes in a partition is 2k. The nodes within a partition are connected by a bus, and the partitions are connected in a mesh-like manner. The nodes in a partition are selected for simulation in a round-robin manner in order to send data to nodes of the same partition or of other partitions, as shown in Fig. 4. In this way, when the number of nodes in a partition increases, the interconnect resource overhead both within and between partitions is greatly reduced. Parallel reuse is also simple: nodes are attached to a partition in a mount-like manner, so when nodes need to be added to a partition, it is only necessary to add the corresponding instances at the top level of the partition.
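As a non-limiting illustration of the hierarchical organization, the sketch below maps a flat node id to a partition and a local index, produces the round-robin order in which the nodes of a partition are simulated, and compares the number of inter-partition links of a full connection with that of a mesh. The flat numbering scheme and all function names are assumptions made for illustration.

```python
from itertools import cycle


def locate(node_id, nodes_per_partition):
    """Split a flat node id into (partition index, local index)."""
    return divmod(node_id, nodes_per_partition)


def round_robin(nodes_per_partition, slots):
    """Local node simulated in each of the first `slots` time slots."""
    order = cycle(range(nodes_per_partition))
    return [next(order) for _ in range(slots)]


def full_links(partitions):
    """Inter-partition links when every pair of partitions is connected."""
    return partitions * (partitions - 1) // 2


def mesh_links(width, height):
    """Inter-partition links when partitions form a width x height mesh."""
    return (width - 1) * height + width * (height - 1)


position = locate(10, 4)         # node 10 with 4 nodes per partition
schedule = round_robin(4, 6)
full = full_links(16)
mesh = mesh_links(4, 4)
```

For 16 partitions, the full connection needs 120 links while a 4 x 4 mesh needs 24, which gives a rough sense of the interconnect saving claimed above.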
Each partition contains the global routing table information (the routing table information of the whole network) and the routing table information of its local nodes. The local routing table information is used to configure the local routers, and the global routing table information provides the information needed to select the output link of a packet. Each partition also contains status registers, which serve the proposed virtualization.
The local input port of a router may become blocked, so the packet buffer queue on the local side of each router in a partition is placed in off-chip RAM. Packets sent from the off-chip RAM are split by the flit generation module (Flit Generator, FG) of the routing node into a head flit, body flits, and a tail flit. In the router, the head flit goes through route computation according to the routing table, virtual channel allocation, switch allocation, and so on. MAENoC includes a control module, which allocates a certain amount of simulation time to the routing nodes in a partition according to the timestamps and outputs a selection signal to the selector. The control module can also generate the output link control signal (credit) according to the next-hop address to which the router forwards, and pre-allocates a link for the forwarded packet: an intra-partition link (the next-hop router address is in the same partition) or an inter-partition link (the next-hop router address is in another partition). The control module allocates links to forwarded packets using a lookup-table technique. One cycle after the head flit enters the router, the control module receives the notification signal (inform). The time the control module needs to complete link allocation is shorter than the time the router needs to complete its routing function, so link allocation is finished before the router outputs the packet.
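As a non-limiting illustration, the sketch below splits a packet into head, body and tail flits and decides whether the next hop requires an intra-partition or an inter-partition link. The tuple-based flit representation, the 4-byte flit payload, and the flat node numbering are assumptions for illustration and do not describe the actual FG or control module hardware.

```python
def make_flits(src, dst, payload, flit_bytes=4):
    """Split a packet into a head flit, body flits and a tail flit.
    Head and tail carry the addresses; body flits carry the payload."""
    chunks = [payload[i:i + flit_bytes]
              for i in range(0, len(payload), flit_bytes)]
    flits = [("HEAD", src, dst)]
    flits += [("BODY", chunk) for chunk in chunks]
    flits.append(("TAIL", src, dst))
    return flits


def link_type(current, next_hop, nodes_per_partition):
    """Pre-allocate an intra-partition link if the next hop lies in the
    same partition as the current node, otherwise an inter-partition link."""
    same = (current // nodes_per_partition) == (next_hop // nodes_per_partition)
    return "intra" if same else "inter"


flits = make_flits(0, 5, b"abcdefgh")
kinds = [flit[0] for flit in flits]
local_link = link_type(1, 2, nodes_per_partition=4)
remote_link = link_type(3, 4, nodes_per_partition=4)
```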
The nodes in a partition of the simulated network-on-chip are selected for emulation and input in a round-robin manner. Because waiting for the emulation of other nodes takes time, a timestamp update (Time Stamp Update, TSU) module is added to the emulation platform: when a data packet leaves a router, its timestamp is updated to offset the time spent waiting for the other nodes in the partition to be emulated. Whether a data packet is sent to the local partition or to another partition, the nodes are served in round-robin order, so a partition cannot deliver a received data packet to the corresponding node immediately. Each partition therefore has a local input buffer FIFO, in which packets wait for the emulation slot of the corresponding node. The timestamp is also updated when a packet leaves this buffer. When a packet reaches the destination router and the address in the received head and tail flits matches the local address, the data receiving module is notified that the data packet has been received.
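As a non-limiting illustration of why a timestamp correction is needed, the sketch below computes how many round-robin slots a packet waits in the partition input FIFO before its destination node is emulated, and removes that wait from a raw latency figure. The slot model (one node per slot, node `s % N` served in slot `s`) and the subtraction in `compensate` are assumptions; the actual behaviour of the TSU module is not specified at this level of detail.

```python
def wait_slots(arrival_slot, dest_local, nodes_per_partition):
    """Slots a packet waits in the input FIFO until the round-robin
    selector reaches its destination node (node s % N is served in slot s)."""
    return (dest_local - arrival_slot) % nodes_per_partition


def compensate(raw_latency, waited):
    """Assumed correction: discount the emulation-induced waiting time
    so that it does not count as network delay."""
    return raw_latency - waited


wait = wait_slots(arrival_slot=5, dest_local=2, nodes_per_partition=4)
no_wait = wait_slots(arrival_slot=6, dest_local=2, nodes_per_partition=4)
latency = compensate(raw_latency=9, waited=wait)
```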
Most existing hardware NoC emulation platforms use single-stage (One Stage) routers and cannot accurately verify pipelined routers, although a multi-stage pipeline can effectively reduce the delay of packets in the network. The pipeline characteristics of the router should be included in the design of an NoC emulation platform, so that the effect of different pipelines on NoC delay can be verified and the routing behaviour of the NoC can be simulated accurately. The pipeline is therefore introduced into the router design. With reference to the router pipelines proposed in the prior art, the common four-stage pipeline consists of route computation, virtual channel allocation, switch allocation, and transmission. A three-stage pipeline speculatively performs virtual channel allocation and switch allocation in a single stage, and a five-stage pipeline splits virtual channel allocation into two stages. The benefit of adding pipeline stages is that the routing time of a packet can be effectively reduced, but the registers, control logic and so on bring additional resource overhead. DART and AdapNoC support changing the number of virtual channels through configuration parameters, adding extra virtual channel resources so that they can be switched in when needed, but this brings redundant hardware resource overhead. We propose reconfiguring the pipeline mechanism and the virtual channels of the router by means of partial reconfiguration. This increases the flexibility of the design, reduces the redundancy of the router, and lowers the hardware resource overhead.
Virtual channel flow control is the scheme most widely used in current routers. It arranges multiple virtual paths on each physical channel, each with its own buffer queue, and improves throughput by 40% compared with wormhole routing. The number of virtual channels also affects the performance of the NoC router. Unlike the parameterized switching used by AdapNoC and DART, the present invention pre-generates reconfigurable files for different numbers of virtual channels and switches between them when a different number of virtual channels needs to be verified.
Fig. 5 shows the router reconfiguration mechanism of the emulation platform of the present invention. One reconfigurable region can contain multiple NoC routing nodes. According to the Xilinx reconfiguration strategy, the resource size of each reconfigurable region must be set by the designer. We set every reconfigurable region to the same size, each corresponding to all the NoC routers in one partition. If the number of NoC nodes in each partition increases, the size of the reconfigurable regions must be re-partitioned, and the platform provides different reconfigurable files. In Fig. 5, XP_YVc_par.bit denotes the partial reconfiguration file for an X-stage pipeline and Y virtual channels. For example, 4P_2Vc_par.bit in the figure denotes the partial reconfiguration file for a 4-stage pipeline and 2 virtual channels.
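As a non-limiting illustration of the file naming convention shown in Fig. 5, the helper below builds and parses names of the form XP_YVc_par.bit. The helper names are hypothetical, and the restriction to 3-, 4- and 5-stage pipelines is an assumption based on the pipeline depths discussed above; loading the file into the FPGA is outside the scope of this sketch.

```python
import re

SUPPORTED_STAGES = (3, 4, 5)   # assumed from the pipelines discussed above


def bitstream_name(stages, virtual_channels):
    """Name of the partial reconfiguration file for a given router setup."""
    if stages not in SUPPORTED_STAGES:
        raise ValueError(f"unsupported pipeline depth: {stages}")
    if virtual_channels < 1:
        raise ValueError("at least one virtual channel is required")
    return f"{stages}P_{virtual_channels}Vc_par.bit"


def parse_bitstream_name(name):
    """Recover (stages, virtual_channels) from a file name, or None."""
    match = re.fullmatch(r"(\d+)P_(\d+)Vc_par\.bit", name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


name = bitstream_name(4, 2)
parsed = parse_bitstream_name(name)
```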
Compared with re-synthesizing the entire project, the time needed to write the corresponding reconfigurable file into a reconfigurable region is negligible. Different configuration files correspond to different numbers of virtual channels and pipeline stages. A user who wishes to modify the function of the router can design the HDL of the NoC router and generate the corresponding partial reconfiguration file.
On a software platform, memory can in theory be expanded without limit, so an NoC with a very large number of nodes can be simulated. A hardware emulation platform, by contrast, is constrained by its limited logic resources and has difficulty verifying an NoC that contains a large number of nodes. Many hardware emulation platforms therefore introduce virtualization: the NoC system clock is logically separated from the FPGA clock, node states are saved in status registers, and the physical nodes are time-division multiplexed (Time Division Multiplexing, TDM) to convert between physical nodes and virtual nodes, trading simulation speed for the ability to emulate large-scale NoCs.
Fig. 6 illustrates our method of simulating virtual nodes by time-multiplexing physical nodes. The figure contains two physical partitions. Each partition has corresponding status registers that store the information of the modules inside the partition, such as the routers and the management module, after a certain number of FPGA clock cycles have been executed. The input buffer holds the packet content generated by the TG and stored in BRAM as well as the received packet content, and the output buffer holds the packet content that the partition will send to other partitions. Fig. 7 shows the relationship between the physical FPGA clock and the emulated NoC clock. Physical partition 1 reads the status register content of logical partition 1 at the rising edge of the FPGA clock at time (1), completes the functional simulation of logical partition 1 at (2), and at the rising clock edge at (3) saves the corresponding information and updates the output buffer content. It then reads and loads the information of logical partition 2 and performs the functional simulation of partition 2, followed by the functional simulation of logical partitions 3 and 4, and so on in a loop. In this way, 2 physical partitions can virtualize 8 logical partitions.
Because the system clock and the FPGA clock are separated, an unlimited number of virtual partitions can in theory be virtualized. A physical partition repeatedly reads the status register information of the virtual partitions on the FPGA clock and completes the node emulation of every virtual partition of the NoC within one system clock cycle. Because the virtualization scheme described above requires frequent reads of storage resources, we place the registers holding the state information in a large, high-speed off-chip SRAM. In addition, virtualization reduces the simulation speed: in theory, doubling the number of virtualized nodes halves the simulation speed.
The above is only a detailed description of specific embodiments of the present invention and does not limit the present invention. Any replacement, modification or improvement made by those skilled in the relevant art without departing from the principle and scope of the present invention shall fall within the protection scope of the present invention.